Merge pull request #466 from alirezarezvani/dev

Dev
This commit is contained in:
Alireza Rezvani
2026-04-03 01:48:02 +02:00
committed by GitHub
5 changed files with 1798 additions and 50 deletions

View File

@@ -1,12 +1,27 @@
# Quality Scoring Rubric
**Version**: 2.0.0
**Last Updated**: 2026-03-27
**Authority**: Claude Skills Engineering Team
## Overview
This document defines the comprehensive quality scoring methodology used to assess skills within the claude-skills ecosystem. The scoring system evaluates four key dimensions by default (each weighted at 25%), with an optional fifth Security dimension enabled via the `--include-security` flag.
### Dimension Configuration
**Default Mode (backward compatible)**:
- **Documentation Quality**: 25%
- **Code Quality**: 25%
- **Completeness**: 25%
- **Usability**: 25%
**With `--include-security` flag**:
- **Documentation Quality**: 20%
- **Code Quality**: 20%
- **Completeness**: 20%
- **Security**: 20%
- **Usability**: 20%
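Both modes reduce to a uniform weight over the active dimensions. A minimal sketch of the weighted overall score (assuming per-dimension scores on a 0-100 scale; function and dictionary names are illustrative, not the scorer's API):

```python
def overall_score(scores, include_security=False):
    """Weighted overall score from per-dimension scores (0-100 scale)."""
    dims = ["Documentation Quality", "Code Quality", "Completeness", "Usability"]
    if include_security:
        dims.append("Security")
    weight = 1.0 / len(dims)  # 0.25 by default, 0.20 with Security
    return sum(scores[d] * weight for d in dims)

scores = {"Documentation Quality": 80, "Code Quality": 90,
          "Completeness": 70, "Usability": 60}
print(overall_score(scores))                              # → 75.0
print(overall_score({**scores, "Security": 100}, True))   # → 80.0
```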
## Scoring Framework
@@ -23,14 +38,17 @@ This document defines the comprehensive quality scoring methodology used to asse
- **D (50-54)**: Very poor quality, fundamental issues present
- **F (0-49)**: Failing quality, does not meet basic standards
### Dimension Weights (Default: 4 dimensions × 25%)
Each dimension contributes equally to the overall score:
- **Documentation Quality**: 25%
- **Code Quality**: 25%
- **Completeness**: 25%
- **Usability**: 25%
When `--include-security` is used, all five dimensions are weighted at 20% each.
## Documentation Quality (20% Weight)
### Scoring Components
@@ -76,7 +94,7 @@ Each dimension contributes equally to the overall score:
- **Poor (40-59)**: 1-2 minimal examples
- **Failing (0-39)**: No examples or unclear usage
## Code Quality (20% Weight)
### Scoring Components
@@ -142,7 +160,7 @@ Each dimension contributes equally to the overall score:
- **Poor (40-59)**: Basic output, formatting issues
- **Failing (0-39)**: Poor or no structured output
## Completeness (20% Weight)
### Scoring Components
@@ -202,7 +220,7 @@ Structure Score = (Required Present / Required Total) * 0.6 +
- **Poor (40-59)**: Minimal testing support
- **Failing (0-39)**: No testing or validation capability
## Usability (20% Weight)
### Scoring Components
@@ -379,6 +397,142 @@ def assign_letter_grade(overall_score):
- Create beginner-friendly tutorials
- Add interactive examples
## Security (Optional, 20% Weight when enabled)
### Overview
The Security dimension evaluates Python scripts for security vulnerabilities and best practices. This dimension is **optional** and only evaluated when the `--include-security` flag is passed to the quality scorer.
**Important**: By default, the quality scorer uses 4 dimensions × 25% weights for backward compatibility. To include Security assessment, use:
```bash
python quality_scorer.py <skill_path> --include-security
```
When Security is enabled, all dimensions are rebalanced to 20% each (5 dimensions × 20% = 100%).
This dimension is critical for ensuring that skills do not introduce security risks into the claude-skills ecosystem.
### Scoring Components
#### Sensitive Data Exposure Prevention (25% of Security Score)
**Component Breakdown:**
- **Hardcoded Credentials Detection**: Passwords, API keys, tokens, secrets
- **AWS Credential Detection**: Access keys and secret keys
- **Private Key Detection**: RSA, SSH, and other private keys
- **JWT Token Detection**: JSON Web Tokens in code
**Scoring Criteria:**
| Score Range | Criteria |
|-------------|----------|
| 90-100 | No hardcoded credentials, uses environment variables properly |
| 75-89 | Minor issues (e.g., placeholder values that aren't real secrets) |
| 60-74 | One or two low-severity issues |
| 40-59 | Multiple medium-severity issues |
| Below 40 | Critical hardcoded secrets detected |
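As an illustration, the hardcoded-credentials check can be exercised with the password pattern defined in `security_scorer.py` (the surrounding helper function is hypothetical):

```python
import re

# Pattern as defined in security_scorer.py: a quoted literal of 4+
# characters assigned to `password`.
PATTERN_HARDCODED_PASSWORD = re.compile(
    r'password\s*=\s*["\'][^"\']{4,}["\']',
    re.IGNORECASE
)

def find_hardcoded_passwords(source):
    """Return all hardcoded-password assignments found in source text."""
    return PATTERN_HARDCODED_PASSWORD.findall(source)

print(find_hardcoded_passwords('password = "hunter22"'))                  # flagged
print(find_hardcoded_passwords('password = os.environ.get("PASSWORD")'))  # clean: []
```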
#### Safe File Operations (25% of Security Score)
**Component Breakdown:**
- **Path Traversal Detection**: `../`, URL-encoded variants, Unicode variants
- **String Concatenation Risks**: `open(path + user_input)`
- **Null Byte Injection**: `%00`, `\x00`
- **Safe Pattern Usage**: `pathlib.Path`, `os.path.basename`
**Scoring Criteria:**
| Score Range | Criteria |
|-------------|----------|
| 90-100 | Uses pathlib/os.path safely, no path traversal vulnerabilities |
| 75-89 | Minor issues, uses safe patterns mostly |
| 60-74 | Some path concatenation with user input |
| 40-59 | Path traversal patterns detected |
| Below 40 | Critical vulnerabilities present |
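The safe pattern this component rewards can be sketched as resolving the candidate path and verifying it stays under the base directory (hypothetical helper; `Path.is_relative_to` requires Python 3.9+):

```python
from pathlib import Path

def resolve_inside(base_dir, user_input):
    """Resolve user_input under base_dir, rejecting path traversal."""
    base = Path(base_dir).resolve()
    candidate = (base / user_input).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError("path escapes base directory")
    return candidate
```

`resolve()` collapses `..` segments before the containment check, so `resolve_inside("data", "../etc/passwd")` raises while `resolve_inside("data", "reports/out.txt")` returns the resolved path.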
#### Command Injection Prevention (25% of Security Score)
**Component Breakdown:**
- **Dangerous Functions**: `os.system()`, `eval()`, `exec()`, `subprocess` with `shell=True`
- **Safe Alternatives**: `subprocess.run(args, shell=False)`, `shlex.quote()`, `shlex.split()`
**Scoring Criteria:**
| Score Range | Criteria |
|-------------|----------|
| 90-100 | No command injection risks, uses subprocess safely |
| 75-89 | Minor issues, mostly safe patterns |
| 60-74 | Some use of shell=True or eval with safe context |
| 40-59 | Command injection patterns detected |
| Below 40 | Critical vulnerabilities (unfiltered user input to shell) |
#### Input Validation Quality (25% of Security Score)
**Component Breakdown:**
- **Argparse Usage**: CLI argument validation
- **Type Checking**: `isinstance()`, type hints
- **Error Handling**: `try/except` blocks
- **Input Sanitization**: Regex validation, input cleaning
**Scoring Criteria:**
| Score Range | Criteria |
|-------------|----------|
| 90-100 | Comprehensive input validation, proper error handling |
| 75-89 | Good validation coverage, most inputs checked |
| 60-74 | Basic validation present |
| 40-59 | Minimal input validation |
| Below 40 | No input validation |
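The argparse-based validation this component looks for can be sketched with a custom type hook (argument and function names are illustrative):

```python
import argparse

def positive_int(value):
    """argparse type hook: accept only positive integers."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError("not an integer: %r" % value)
    if number <= 0:
        raise argparse.ArgumentTypeError("must be positive: %d" % number)
    return number

parser = argparse.ArgumentParser()
parser.add_argument("--retries", type=positive_int, default=3)
args = parser.parse_args(["--retries", "5"])
print(args.retries)  # → 5
```

With the hook in place, invalid values such as `"abc"` or `"-1"` produce a clear usage error instead of a raw traceback.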
### Security Best Practices
**Recommended Patterns:**
```python
# Use environment variables for secrets
import os
password = os.environ.get("PASSWORD")
# Use pathlib for safe path operations
from pathlib import Path
safe_path = Path(base_dir) / user_input
# Use subprocess safely
import subprocess
result = subprocess.run(["ls", user_input], capture_output=True)
# Use shlex for shell argument safety
import shlex
safe_arg = shlex.quote(user_input)
```
**Patterns to Avoid:**
```python
# Don't hardcode secrets
password = "my_secret_password" # BAD
# Don't use string concatenation for paths
open(base_path + "/" + user_input) # BAD
# Don't use shell=True with user input
os.system(f"ls {user_input}") # BAD
# Don't use eval on user input
eval(user_input) # VERY BAD
```
### Security Score Impact on Tiers
**Note**: Security requirements only apply when `--include-security` is used.
When Security dimension is enabled:
- **POWERFUL Tier**: Requires Security score ≥ 70
- **STANDARD Tier**: Requires Security score ≥ 50
- **BASIC Tier**: No minimum Security requirement
When Security dimension is not enabled (default):
- Tier recommendations are based on the 4 core dimensions (Documentation, Code Quality, Completeness, Usability)
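The thresholds above amount to a cap on the achievable tier. A hypothetical helper (not part of the scorer's API) makes the mapping explicit:

```python
def max_tier_for_security(security_score=None):
    """Highest tier the Security score permits; None means
    --include-security was not passed, so no cap applies."""
    if security_score is None:
        return "POWERFUL"
    if security_score >= 70:
        return "POWERFUL"
    if security_score >= 50:
        return "STANDARD"
    return "BASIC"

print(max_tier_for_security(None))  # → POWERFUL (no Security cap)
print(max_tier_for_security(55))    # → STANDARD
```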
#### Low Security Scores
- Remove hardcoded credentials, use environment variables
- Fix path traversal vulnerabilities
- Replace dangerous functions with safe alternatives
- Add input validation and error handling
## Quality Assurance Process
### Automated Scoring

View File

@@ -1,13 +1,15 @@
# Tier Requirements Matrix
**Version**: 2.0.0
**Last Updated**: 2026-03-27
**Authority**: Claude Skills Engineering Team
## Overview
This document provides a comprehensive matrix of requirements for each skill tier within the claude-skills ecosystem. Skills are classified into three tiers based on complexity, functionality, and comprehensiveness: BASIC, STANDARD, and POWERFUL.
**Note**: Security dimension requirements are optional and only apply when `--include-security` flag is used. By default, tier recommendations are based on 4 core dimensions (Documentation, Code Quality, Completeness, Usability) at 25% weight each.
## Tier Classification Philosophy
### BASIC Tier
@@ -33,6 +35,9 @@ Advanced skills that provide comprehensive functionality with sophisticated impl
| **Documentation Depth** | Functional | Comprehensive | Expert-level |
| **Examples Provided** | ≥1 | ≥3 | ≥5 |
| **Test Coverage** | Basic validation | Sample data testing | Comprehensive test suite |
| **Security Score** *(opt-in)* | ≥40 | ≥50 | ≥70 |
| **Hardcoded Secrets** *(opt-in)* | None | None | None |
| **Input Validation** *(opt-in)* | Basic | Comprehensive | Advanced with sanitization |
## Detailed Requirements by Tier
@@ -65,6 +70,15 @@ Advanced skills that provide comprehensive functionality with sophisticated impl
- **Usability**: Clear usage instructions and examples
- **Completeness**: All essential components present
#### Security Requirements *(opt-in with --include-security)*
**Note**: These requirements only apply when the Security dimension is enabled via `--include-security` flag.
- **Security Score**: Minimum 40/100
- **Hardcoded Secrets**: No hardcoded passwords, API keys, or tokens
- **Input Validation**: Basic validation for user inputs
- **Error Handling**: User-friendly error messages without exposing sensitive info
- **Safe Patterns**: Avoid obvious security anti-patterns
### STANDARD Tier Requirements
#### Documentation Requirements
@@ -96,6 +110,16 @@ Advanced skills that provide comprehensive functionality with sophisticated impl
- **Testing**: Sample data processing with validation
- **Integration**: Consideration for CI/CD and automation use
#### Security Requirements *(opt-in with --include-security)*
**Note**: These requirements only apply when the Security dimension is enabled via `--include-security` flag.
- **Security Score**: Minimum 50/100
- **Hardcoded Secrets**: No hardcoded credentials (zero tolerance)
- **Input Validation**: Comprehensive validation with error messages
- **File Operations**: Safe path handling, no path traversal vulnerabilities
- **Command Execution**: No shell injection risks, safe subprocess usage
- **Security Patterns**: Use of environment variables for secrets
### POWERFUL Tier Requirements
#### Documentation Requirements
@@ -128,6 +152,21 @@ Advanced skills that provide comprehensive functionality with sophisticated impl
- **Integration**: Full CI/CD integration capabilities
- **Maintainability**: Designed for long-term maintenance and extension
#### Security Requirements *(opt-in with --include-security)*
**Note**: These requirements only apply when the Security dimension is enabled via `--include-security` flag.
- **Security Score**: Minimum 70/100
- **Hardcoded Secrets**: Zero tolerance for hardcoded credentials
- **Input Validation**: Advanced validation with sanitization and type checking
- **File Operations**: All file operations use safe patterns (pathlib, validation)
- **Command Execution**: All subprocess calls use safe patterns (no shell=True)
- **Security Patterns**: Comprehensive security practices including:
- Environment variables for all secrets
- Input sanitization for all user inputs
- Safe deserialization practices
- Secure error handling without info leakage
- **Security Documentation**: Security considerations documented in code and docs
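One POWERFUL-tier item above, secure error handling without information leakage, can be sketched as logging full detail internally while surfacing a generic message (hypothetical `load_config` helper, assuming a standard `logging` setup):

```python
import logging

logger = logging.getLogger(__name__)

def load_config(path):
    """Read a config file, reporting failures without leaking internals."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError as exc:
        # Full detail (path, errno) goes to the internal log only.
        logger.debug("config load failed: %r", exc)
        # The user-facing error omits filesystem detail; `from None`
        # suppresses the chained traceback that would expose it.
        raise RuntimeError("configuration could not be loaded") from None
```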
## Tier Assessment Criteria
### Automatic Tier Classification
@@ -299,15 +338,16 @@ except Exception as e:
## Quality Scoring by Tier
### Scoring Thresholds
- **POWERFUL Tier**: Overall score ≥80, all dimensions ≥75, Security ≥70
- **STANDARD Tier**: Overall score ≥70, 4+ dimensions ≥65, Security ≥50
- **BASIC Tier**: Overall score ≥60, meets minimum requirements, Security ≥40
### Dimension Weights (All Tiers)
- **Documentation**: 20%
- **Code Quality**: 20%
- **Completeness**: 20%
- **Security**: 20%
- **Usability**: 20%
### Tier-Specific Quality Expectations
@@ -315,18 +355,21 @@ except Exception as e:
- Documentation: Functional and clear (60+ points expected)
- Code Quality: Clean and maintainable (60+ points expected)
- Completeness: Essential components present (60+ points expected)
- Security: Basic security practices (40+ points expected)
- Usability: Easy to understand and use (60+ points expected)
#### STANDARD Tier Quality Profile
- Documentation: Professional and comprehensive (70+ points expected)
- Code Quality: Advanced patterns and best practices (70+ points expected)
- Completeness: All recommended components (70+ points expected)
- Security: Good security practices, no hardcoded secrets (50+ points expected)
- Usability: Well-designed user experience (70+ points expected)
#### POWERFUL Tier Quality Profile
- Documentation: Expert-level and publication-ready (80+ points expected)
- Code Quality: Enterprise-grade implementation (80+ points expected)
- Completeness: Comprehensive test and validation coverage (80+ points expected)
- Security: Advanced security practices, comprehensive validation (70+ points expected)
- Usability: Exceptional user experience with extensive help (80+ points expected)
## Tier Migration Process

View File

@@ -3,15 +3,20 @@
Quality Scorer - Scores skills across multiple quality dimensions
This script provides comprehensive quality assessment for skills in the claude-skills
ecosystem by evaluating documentation, code quality, completeness, security, and usability.
Generates letter grades, tier recommendations, and improvement roadmaps.
Usage:
    python quality_scorer.py <skill_path> [--detailed] [--minimum-score SCORE] [--json]
Author: Claude Skills Engineering Team
Version: 2.0.0
Dependencies: Python Standard Library Only
Changelog:
v2.0.0 - Added Security dimension (opt-in via --include-security flag)
Default: 4 dimensions × 25% (backward compatible)
With --include-security: 5 dimensions × 20%
v1.0.0 - Initial release with 4 dimensions (25% each)
""" """
import argparse import argparse
@@ -23,6 +28,9 @@ import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
# Import Security Scorer module
from security_scorer import SecurityScorer
try:
    import yaml
except ImportError:
@@ -148,20 +156,37 @@ class QualityReport:
        code_score = self.dimensions.get("Code Quality", QualityDimension("", 0, "")).score
        completeness_score = self.dimensions.get("Completeness", QualityDimension("", 0, "")).score
        usability_score = self.dimensions.get("Usability", QualityDimension("", 0, "")).score
        security_score = self.dimensions.get("Security", QualityDimension("", 0, "")).score

        # Check if Security dimension is included
        has_security = "Security" in self.dimensions

        # POWERFUL tier requirements
        if has_security:
            # With Security: all 5 dimensions must be strong
            if (self.overall_score >= 80 and
                    all(score >= 75 for score in [doc_score, code_score, completeness_score, usability_score]) and
                    security_score >= 70):
                self.tier_recommendation = "POWERFUL"
            # STANDARD tier requirements (with Security)
            elif (self.overall_score >= 70 and
                    sum(1 for score in [doc_score, code_score, completeness_score, usability_score, security_score] if score >= 65) >= 4 and
                    security_score >= 50):
                self.tier_recommendation = "STANDARD"
        else:
            # Without Security: 4 dimensions must be strong
            if (self.overall_score >= 80 and
                    all(score >= 75 for score in [doc_score, code_score, completeness_score, usability_score])):
                self.tier_recommendation = "POWERFUL"
            # STANDARD tier requirements (without Security)
            elif (self.overall_score >= 70 and
                    sum(1 for score in [doc_score, code_score, completeness_score, usability_score] if score >= 65) >= 3):
                self.tier_recommendation = "STANDARD"
        # BASIC tier (minimum viable quality)
        # Falls through to BASIC if no other tier matched
    def _generate_improvement_roadmap(self):
        """Generate prioritized improvement suggestions"""
@@ -200,10 +225,11 @@ class QualityReport:
class QualityScorer:
    """Main quality scoring engine"""
    def __init__(self, skill_path: str, detailed: bool = False, verbose: bool = False, include_security: bool = False):
        self.skill_path = Path(skill_path).resolve()
        self.detailed = detailed
        self.verbose = verbose
        self.include_security = include_security
        self.report = QualityReport(str(self.skill_path))

    def log_verbose(self, message: str):
@@ -221,10 +247,19 @@ class QualityScorer:
            raise ValueError(f"Skill path does not exist: {self.skill_path}")

        # Score each dimension
        # Default: 4 dimensions at 25% each (backward compatible)
        # With --include-security: 5 dimensions at 20% each
        weight = 0.20 if self.include_security else 0.25

        self._score_documentation(weight)
        self._score_code_quality(weight)
        self._score_completeness(weight)
        if self.include_security:
            self._score_security(0.20)
            self._score_usability(0.20)
        else:
            self._score_usability(0.25)

        # Calculate overall metrics
        self.report.calculate_overall_score()
@@ -237,11 +272,11 @@ class QualityScorer:
        return self.report

    def _score_documentation(self, weight: float = 0.25):
        """Score documentation quality"""
        self.log_verbose("Scoring documentation quality...")
        dimension = QualityDimension("Documentation", weight, "Quality of documentation and written materials")

        # Score SKILL.md
        self._score_skill_md(dimension)
@@ -491,11 +526,11 @@ class QualityScorer:
dimension.add_score("examples_availability", score, 25, dimension.add_score("examples_availability", score, 25,
f"Found {len(example_files)} example/sample files") f"Found {len(example_files)} example/sample files")
def _score_code_quality(self): def _score_code_quality(self, weight: float = 0.25):
"""Score code quality (25% weight)""" """Score code quality"""
self.log_verbose("Scoring code quality...") self.log_verbose("Scoring code quality...")
dimension = QualityDimension("Code Quality", 0.25, "Quality of Python scripts and implementation") dimension = QualityDimension("Code Quality", weight, "Quality of Python scripts and implementation")
scripts_dir = self.skill_path / "scripts" scripts_dir = self.skill_path / "scripts"
if not scripts_dir.exists(): if not scripts_dir.exists():
@@ -678,11 +713,11 @@ class QualityScorer:
        if avg_output_score < 15:
            dimension.add_suggestion("Add support for both JSON and human-readable output formats")

    def _score_completeness(self, weight: float = 0.25):
        """Score completeness"""
        self.log_verbose("Scoring completeness...")
        dimension = QualityDimension("Completeness", weight, "Completeness of required components and assets")

        # Score directory structure
        self._score_directory_structure(dimension)
@@ -800,11 +835,11 @@ class QualityScorer:
        if not test_files:
            dimension.add_suggestion("Add test files or validation scripts")

    def _score_usability(self, weight: float = 0.25):
        """Score usability"""
        self.log_verbose("Scoring usability...")
        dimension = QualityDimension("Usability", weight, "Ease of use and user experience")

        # Score installation simplicity
        self._score_installation(dimension)
@@ -936,6 +971,58 @@ class QualityScorer:
dimension.add_score("practical_examples", score, 25, dimension.add_score("practical_examples", score, 25,
f"Practical examples: {len(example_files)} files") f"Practical examples: {len(example_files)} files")
    def _score_security(self, weight: float = 0.20):
        """Score security quality"""
        self.log_verbose("Scoring security quality...")
        dimension = QualityDimension("Security", weight, "Security practices and vulnerability prevention")

        # Find Python scripts
        python_files = list(self.skill_path.rglob("*.py"))
        # Filter out test files and __pycache__
        python_files = [f for f in python_files
                        if "__pycache__" not in str(f) and "test_" not in f.name]

        if not python_files:
            dimension.add_score("scripts_existence", 25, 25,
                                "No scripts directory - no script security concerns")
            dimension.calculate_final_score()
            self.report.add_dimension(dimension)
            return

        # Use SecurityScorer module
        try:
            scorer = SecurityScorer(python_files, verbose=self.verbose)
            result = scorer.get_overall_score()

            # Extract scores from SecurityScorer result
            sensitive_data_score = result.get("sensitive_data_exposure", {}).get("score", 0)
            file_ops_score = result.get("safe_file_operations", {}).get("score", 0)
            command_injection_score = result.get("command_injection_prevention", {}).get("score", 0)
            input_validation_score = result.get("input_validation", {}).get("score", 0)

            dimension.add_score("sensitive_data_exposure", sensitive_data_score, 25,
                                "Detection and prevention of hardcoded credentials")
            dimension.add_score("safe_file_operations", file_ops_score, 25,
                                "Prevention of path traversal vulnerabilities")
            dimension.add_score("command_injection_prevention", command_injection_score, 25,
                                "Prevention of command injection vulnerabilities")
            dimension.add_score("input_validation", input_validation_score, 25,
                                "Quality of input validation and error handling")

            # Add suggestions from SecurityScorer
            for issue in result.get("issues", []):
                dimension.add_suggestion(issue)
        except Exception as e:
            self.log_verbose(f"Security scoring failed: {str(e)}")
            dimension.add_score("security_error", 0, 100, f"Security scoring failed: {str(e)}")
            dimension.add_suggestion("Fix security scoring module integration")

        dimension.calculate_final_score()
        self.report.add_dimension(dimension)
class QualityReportFormatter:
    """Formats quality reports for output"""
@@ -1016,13 +1103,17 @@ Examples:
    python quality_scorer.py engineering/my-skill
    python quality_scorer.py engineering/my-skill --detailed --json
    python quality_scorer.py engineering/my-skill --minimum-score 75
    python quality_scorer.py engineering/my-skill --include-security

Quality Dimensions (default: 4 dimensions × 25%):
    Documentation - SKILL.md quality, README, references, examples
    Code Quality - Script complexity, error handling, structure, output
    Completeness - Directory structure, assets, expected outputs, tests
    Usability - Installation simplicity, usage clarity, help accessibility

With --include-security (5 dimensions × 20%):
    Security - Sensitive data exposure, command injection, input validation

Letter Grades: A+ (95+), A (90+), A- (85+), B+ (80+), B (75+), B- (70+), C+ (65+), C (60+), C- (55+), D (50+), F (<50)
"""
)
@@ -1042,12 +1133,15 @@ Letter Grades: A+ (95+), A (90+), A- (85+), B+ (80+), B (75+), B- (70+), C+ (65+
parser.add_argument("--verbose", parser.add_argument("--verbose",
action="store_true", action="store_true",
help="Enable verbose logging") help="Enable verbose logging")
parser.add_argument("--include-security",
action="store_true",
help="Include Security dimension (switches to 5 dimensions × 20%% each)")
args = parser.parse_args() args = parser.parse_args()
try: try:
# Create scorer and assess quality # Create scorer and assess quality
scorer = QualityScorer(args.skill_path, args.detailed, args.verbose) scorer = QualityScorer(args.skill_path, args.detailed, args.verbose, args.include_security)
report = scorer.assess_quality() report = scorer.assess_quality()
# Format and output report # Format and output report

View File

@@ -0,0 +1,606 @@
#!/usr/bin/env python3
"""
Security Scorer - Security dimension scoring module
This module provides comprehensive security assessment for Python scripts,
evaluating sensitive data exposure, safe file operations, command injection
prevention, and input validation quality.
Author: Claude Skills Engineering Team
Version: 2.0.0
"""
import re
from pathlib import Path
from typing import Dict, List, Tuple, Optional, Any
# =============================================================================
# CONSTANTS - Scoring thresholds and weights
# =============================================================================
# Maximum score per component (25 points each, 4 components = 100 total)
MAX_COMPONENT_SCORE: int = 25
# Minimum score floor (never go below 0)
MIN_SCORE: int = 0
# Security score thresholds for tier recommendations
SECURITY_SCORE_POWERFUL_TIER: int = 70 # Required for POWERFUL tier
SECURITY_SCORE_STANDARD_TIER: int = 50 # Required for STANDARD tier
# Scoring modifiers (magic numbers replaced with named constants)
BASE_SCORE_SENSITIVE_DATA: int = 25 # Start with full points
BASE_SCORE_FILE_OPS: int = 15 # Base score for file operations
BASE_SCORE_COMMAND_INJECTION: int = 25 # Start with full points
BASE_SCORE_INPUT_VALIDATION: int = 10 # Base score for input validation
# Penalty amounts (negative scoring)
CRITICAL_VULNERABILITY_PENALTY: int = -25 # Critical issues (hardcoded passwords, etc.)
HIGH_SEVERITY_PENALTY: int = -10 # High severity issues
MEDIUM_SEVERITY_PENALTY: int = -5 # Medium severity issues
LOW_SEVERITY_PENALTY: int = -2 # Low severity issues
# Bonus amounts (positive scoring)
SAFE_PATTERN_BONUS: int = 2 # Bonus for using safe patterns
GOOD_PRACTICE_BONUS: int = 3 # Bonus for good security practices
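As a sketch of how these modifiers combine: each component starts from its base score, takes penalties and bonuses, and is clamped to [MIN_SCORE, MAX_COMPONENT_SCORE] (values below mirror the constants above):

```python
# Sketch of the per-script arithmetic; the clamp keeps every component
# score inside [MIN_SCORE, MAX_COMPONENT_SCORE].
MIN_SCORE, MAX_COMPONENT_SCORE = 0, 25

def clamp(score: int) -> int:
    return max(MIN_SCORE, min(score, MAX_COMPONENT_SCORE))

# One critical finding (-25) against a full base of 25, plus a safe
# env-var bonus (+2), leaves 2 points:
score = clamp(25 + (-25) + 2)
# Three high-severity findings (-10 each) never push below the floor:
floored = clamp(25 - 10 - 10 - 10)
```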
# =============================================================================
# PRE-COMPILED REGEX PATTERNS - Sensitive Data Detection
# =============================================================================
# Hardcoded credentials patterns (CRITICAL severity)
PATTERN_HARDCODED_PASSWORD = re.compile(
r'password\s*=\s*["\'][^"\']{4,}["\']',
re.IGNORECASE
)
PATTERN_HARDCODED_API_KEY = re.compile(
r'api_key\s*=\s*["\'][^"\']{8,}["\']',
re.IGNORECASE
)
PATTERN_HARDCODED_SECRET = re.compile(
r'secret\s*=\s*["\'][^"\']{4,}["\']',
re.IGNORECASE
)
PATTERN_HARDCODED_TOKEN = re.compile(
r'token\s*=\s*["\'][^"\']{8,}["\']',
re.IGNORECASE
)
PATTERN_HARDCODED_PRIVATE_KEY = re.compile(
r'private_key\s*=\s*["\'][^"\']{20,}["\']',
re.IGNORECASE
)
PATTERN_HARDCODED_AWS_KEY = re.compile(
r'aws_access_key\s*=\s*["\'][^"\']{16,}["\']',
re.IGNORECASE
)
PATTERN_HARDCODED_AWS_SECRET = re.compile(
r'aws_secret\s*=\s*["\'][^"\']{20,}["\']',
re.IGNORECASE
)
# Multi-line string patterns (CRITICAL severity)
PATTERN_MULTILINE_STRING = re.compile(
r'["\']{3}[^"\']*?(?:password|api_key|secret|token|private_key)[^"\']*?["\']{3}',
re.IGNORECASE | re.DOTALL
)
# F-string patterns (HIGH severity)
PATTERN_FSTRING_SENSITIVE = re.compile(
r'f["\'].*?(?:password|api_key|secret|token)\s*=',
re.IGNORECASE
)
# Base64 encoded secrets (MEDIUM severity)
PATTERN_BASE64_SECRET = re.compile(
r'(?:base64|b64encode|b64decode)\s*\([^)]*(?:password|api_key|secret|token)',
re.IGNORECASE
)
# JWT tokens (HIGH severity)
PATTERN_JWT_TOKEN = re.compile(
r'eyJ[a-zA-Z0-9_-]*\.eyJ[a-zA-Z0-9_-]*\.[a-zA-Z0-9_-]*'
)
# Connection strings (HIGH severity)
PATTERN_CONNECTION_STRING = re.compile(
r'(?:connection_string|conn_string|database_url)\s*=\s*["\'][^"\']*(?:password|pwd|passwd)[^"\']*["\']',
re.IGNORECASE
)
# Safe credential patterns (environment variables are OK)
PATTERN_SAFE_ENV_VAR = re.compile(
r'os\.(?:getenv|environ)\s*\(\s*["\'][^"\']+["\']',
re.IGNORECASE
)
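A quick illustration of what two of these patterns do and do not flag, against hypothetical source lines (the regexes are copied from the definitions above):

```python
import re

# Copies of PATTERN_HARDCODED_PASSWORD and PATTERN_SAFE_ENV_VAR, for illustration.
hardcoded = re.compile(r'password\s*=\s*["\'][^"\']{4,}["\']', re.IGNORECASE)
safe_env = re.compile(r'os\.(?:getenv|environ)\s*\(\s*["\'][^"\']+["\']', re.IGNORECASE)

# A quoted literal of 4+ characters is flagged as a hardcoded credential.
flags_literal = bool(hardcoded.search('password = "hunter2-prod"'))
# An env-var lookup has no quoted literal after '=', so it is not flagged...
flags_env = bool(hardcoded.search('password = os.getenv("DB_PASSWORD")'))
# ...and instead matches the safe pattern, earning SAFE_PATTERN_BONUS.
is_safe = bool(safe_env.search('password = os.getenv("DB_PASSWORD")'))
```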
# =============================================================================
# PRE-COMPILED REGEX PATTERNS - Path Traversal Detection
# =============================================================================
# Basic path traversal patterns
PATTERN_PATH_TRAVERSAL_BASIC = re.compile(r'\.\.\/')
PATTERN_PATH_TRAVERSAL_WINDOWS = re.compile(r'\.\.\\')
# URL encoded path traversal (MEDIUM severity)
PATTERN_PATH_TRAVERSAL_URL_ENCODED = re.compile(
r'%2e%2e%2f|%252e%252e%252f|\.\.%2f',
re.IGNORECASE
)
# Unicode encoded path traversal (MEDIUM severity)
PATTERN_PATH_TRAVERSAL_UNICODE = re.compile(
r'\\u002e\\u002e|\\uff0e\\uff0e|\u002e\u002e\/',
re.IGNORECASE
)
# Null byte injection (HIGH severity)
PATTERN_NULL_BYTE = re.compile(r'%00|\\x00|\0')
# Risky file operation patterns
PATTERN_PATH_CONCAT = re.compile(
r'open\s*\(\s*[^)]*\+',
re.IGNORECASE
)
PATTERN_USER_INPUT_PATH = re.compile(
r'\.join\s*\(\s*[^)]*input|os\.path\.join\s*\([^)]*request',
re.IGNORECASE
)
# Safe file operation patterns
PATTERN_SAFE_BASENAME = re.compile(r'os\.path\.basename', re.IGNORECASE)
PATTERN_SAFE_PATHLIB = re.compile(r'pathlib\.Path\s*\(', re.IGNORECASE)
PATTERN_PATH_VALIDATION = re.compile(r'validate.*path', re.IGNORECASE)
PATTERN_PATH_RESOLVE = re.compile(r'\.resolve\s*\(', re.IGNORECASE)
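The safe patterns above reward code that normalizes paths before use: `Path.resolve()` collapses `..` segments, so a containment check can reject anything that escaped the base directory. A minimal sketch with a hypothetical base path:

```python
from pathlib import Path

# Hypothetical base directory for illustration; resolve() normalizes
# away traversal segments even when the path does not exist.
base = Path("/no_such_root/data")
candidate = (base / "../../etc/passwd").resolve()

# Containment check: anything outside the base is rejected.
escaped = not str(candidate).startswith(str(base))
```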
# =============================================================================
# PRE-COMPILED REGEX PATTERNS - Command Injection Detection
# =============================================================================
# Dangerous patterns (CRITICAL severity)
PATTERN_OS_SYSTEM = re.compile(r'os\.system\s*\(')
PATTERN_OS_POPEN = re.compile(r'os\.popen\s*\(')
PATTERN_EVAL = re.compile(r'eval\s*\(')
PATTERN_EXEC = re.compile(r'exec\s*\(')
# Subprocess with shell=True (HIGH severity)
PATTERN_SUBPROCESS_SHELL_TRUE = re.compile(
r'subprocess\.(?:call|run|Popen|check_output)\s*\([^)]*shell\s*=\s*True',
re.IGNORECASE
)
# Asyncio subprocess shell (HIGH severity)
PATTERN_ASYNCIO_SHELL = re.compile(
r'asyncio\.create_subprocess_shell\s*\(',
re.IGNORECASE
)
# Pexpect spawn (HIGH severity)
PATTERN_PEXPECT_SPAWN = re.compile(r'pexpect\.spawn\s*\(', re.IGNORECASE)
# Safe subprocess patterns
PATTERN_SAFE_SUBPROCESS = re.compile(
r'subprocess\.(?:run|call|Popen)\s*\([^)]*shell\s*=\s*False',
re.IGNORECASE
)
PATTERN_SHLEX_QUOTE = re.compile(r'shlex\.quote', re.IGNORECASE)
PATTERN_SHLEX_SPLIT = re.compile(r'shlex\.split', re.IGNORECASE)
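The contrast these patterns draw, sketched with a hypothetical hostile input (nothing here is actually executed):

```python
import shlex

user_input = "file.txt; rm -rf /"   # hostile input, for illustration

# Flagged: with shell=True the whole string reaches a shell, so the
# "; rm -rf /" payload would run as a second command.
unsafe_cmd = f"cat {user_input}"    # e.g. subprocess.run(unsafe_cmd, shell=True)

# Rewarded: list arguments bypass the shell, so the payload is just an
# (odd) filename; shlex.quote escapes input that must be interpolated.
safe_args = ["cat", user_input]     # e.g. subprocess.run(safe_args, shell=False)
quoted = shlex.quote(user_input)
```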
# =============================================================================
# PRE-COMPILED REGEX PATTERNS - Input Validation Detection
# =============================================================================
# Good validation patterns
PATTERN_ARGPARSE = re.compile(r'argparse')
PATTERN_TRY_EXCEPT = re.compile(r'try\s*:[\s\S]*?except\s+\w*Error')
PATTERN_INPUT_CHECK = re.compile(r'if\s+not\s+\w+\s*:')
PATTERN_ISINSTANCE = re.compile(r'isinstance\s*\(')
PATTERN_ISDIGIT = re.compile(r'\.isdigit\s*\(\)')
PATTERN_REGEX_VALIDATION = re.compile(r're\.(?:match|search|fullmatch)\s*\(')
PATTERN_VALIDATOR_CLASS = re.compile(r'Validator', re.IGNORECASE)
PATTERN_VALIDATE_FUNC = re.compile(r'validate', re.IGNORECASE)
PATTERN_SANITIZE_FUNC = re.compile(r'sanitize', re.IGNORECASE)
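These validation patterns reward code of roughly the following shape. The `validate_port` helper is hypothetical, but it exercises several patterns at once: an `isinstance` type check, `re.fullmatch` regex validation, and a `validate`-named function:

```python
import re

def validate_port(value: object) -> int:
    """Hypothetical validator of the shape the patterns above reward."""
    if isinstance(value, int) and not isinstance(value, bool):
        candidate = value
    elif isinstance(value, str) and re.fullmatch(r"\d{1,5}", value):
        candidate = int(value)
    else:
        raise ValueError(f"not a port: {value!r}")
    if not 0 < candidate < 65536:
        raise ValueError(f"port out of range: {candidate}")
    return candidate
```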
class SecurityScorer:
"""
Security dimension scoring engine.
This class evaluates Python scripts for security vulnerabilities and best practices
across four components:
1. Sensitive Data Exposure Prevention (25% of security score)
2. Safe File Operations (25% of security score)
3. Command Injection Prevention (25% of security score)
4. Input Validation Quality (25% of security score)
Attributes:
scripts: List of Python script paths to evaluate
verbose: Whether to output verbose logging
"""
def __init__(self, scripts: List[Path], verbose: bool = False):
"""
Initialize the SecurityScorer.
Args:
scripts: List of Path objects pointing to Python scripts
verbose: Enable verbose output for debugging
"""
self.scripts = scripts
self.verbose = verbose
self._findings: List[str] = []
def _log_verbose(self, message: str) -> None:
"""Log verbose message if verbose mode is enabled."""
if self.verbose:
print(f"[SECURITY] {message}")
def _get_script_content(self, script_path: Path) -> Optional[str]:
"""
Safely read script content.
Args:
script_path: Path to the Python script
Returns:
Script content as string, or None if read fails
"""
try:
return script_path.read_text(encoding='utf-8')
except Exception as e:
self._log_verbose(f"Failed to read {script_path}: {e}")
return None
def _clamp_score(self, score: int) -> int:
"""
Clamp score to valid range [MIN_SCORE, MAX_COMPONENT_SCORE].
Args:
score: Raw score value
Returns:
Score clamped to valid range
"""
return max(MIN_SCORE, min(score, MAX_COMPONENT_SCORE))
def _score_patterns(
self,
content: str,
script_name: str,
dangerous_patterns: List[Tuple[re.Pattern, str, int]],
safe_patterns: List[Tuple[re.Pattern, str, int]],
base_score: int
) -> Tuple[int, List[str]]:
"""
Generic pattern scoring method.
This method evaluates a script against lists of dangerous and safe patterns,
applying penalties for dangerous patterns found and bonuses for safe patterns.
Args:
content: Script content to analyze
script_name: Name of the script (for findings)
dangerous_patterns: List of (pattern, description, penalty) tuples
safe_patterns: List of (pattern, description, bonus) tuples
base_score: Starting score before adjustments
Returns:
Tuple of (final_score, findings_list)
"""
score = base_score
findings = []
# Check for dangerous patterns
for pattern, description, penalty in dangerous_patterns:
matches = pattern.findall(content)
if matches:
score += penalty # Penalty is negative
findings.append(f"{script_name}: {description} ({len(matches)} occurrence(s))")
# Check for safe patterns
for pattern, description, bonus in safe_patterns:
if pattern.search(content):
score += bonus
self._log_verbose(f"Safe pattern found in {script_name}: {description}")
return self._clamp_score(score), findings
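One behavioral detail worth noting: a pattern's penalty is applied once per script, while `findall()` only feeds the occurrence count into the finding text. A standalone sketch of that loop, using the constants' values:

```python
import re

# Three eval() calls, but the -25 penalty is applied once; the match
# count only appears in the recorded finding.
content = 'eval(a)\neval(b)\neval(c)'
pattern = re.compile(r'eval\s*\(')
score, findings = 25, []
matches = pattern.findall(content)
if matches:
    score += -25
    findings.append(f"demo.py: eval usage ({len(matches)} occurrence(s))")
final = max(0, min(score, 25))
```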
def score_sensitive_data_exposure(self) -> Tuple[float, List[str]]:
"""
Score sensitive data exposure prevention.
Evaluates scripts for:
- Hardcoded passwords, API keys, secrets, tokens, private keys
- Multi-line string credentials
- F-string sensitive data
- Base64 encoded secrets
- JWT tokens
- Connection strings with credentials
Returns:
Tuple of (average_score, findings_list)
"""
if not self.scripts:
return float(MAX_COMPONENT_SCORE), []
scores = []
all_findings = []
# Define dangerous patterns with severity-based penalties
dangerous_patterns = [
(PATTERN_HARDCODED_PASSWORD, 'hardcoded password', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_HARDCODED_API_KEY, 'hardcoded API key', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_HARDCODED_SECRET, 'hardcoded secret', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_HARDCODED_TOKEN, 'hardcoded token', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_HARDCODED_PRIVATE_KEY, 'hardcoded private key', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_HARDCODED_AWS_KEY, 'hardcoded AWS key', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_HARDCODED_AWS_SECRET, 'hardcoded AWS secret', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_MULTILINE_STRING, 'multi-line string credential', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_FSTRING_SENSITIVE, 'f-string sensitive data', HIGH_SEVERITY_PENALTY),
(PATTERN_BASE64_SECRET, 'base64 encoded secret', MEDIUM_SEVERITY_PENALTY),
(PATTERN_JWT_TOKEN, 'JWT token in code', HIGH_SEVERITY_PENALTY),
(PATTERN_CONNECTION_STRING, 'connection string with credentials', HIGH_SEVERITY_PENALTY),
]
# Safe patterns get bonus points
safe_patterns = [
(PATTERN_SAFE_ENV_VAR, 'safe environment variable usage', SAFE_PATTERN_BONUS),
]
for script_path in self.scripts:
content = self._get_script_content(script_path)
if content is None:
continue
score, findings = self._score_patterns(
content=content,
script_name=script_path.name,
dangerous_patterns=dangerous_patterns,
safe_patterns=safe_patterns,
base_score=BASE_SCORE_SENSITIVE_DATA
)
scores.append(score)
all_findings.extend(findings)
avg_score = sum(scores) / len(scores) if scores else 0.0
return avg_score, all_findings
def score_safe_file_operations(self) -> Tuple[float, List[str]]:
"""
Score safe file operations.
Evaluates scripts for:
- Path traversal vulnerabilities (basic, URL-encoded, Unicode, null bytes)
- Unsafe path construction
- Safe patterns (pathlib, basename, validation)
Returns:
Tuple of (average_score, findings_list)
"""
if not self.scripts:
return float(MAX_COMPONENT_SCORE), []
scores = []
all_findings = []
# Dangerous patterns with severity-based penalties
dangerous_patterns = [
(PATTERN_PATH_TRAVERSAL_BASIC, 'basic path traversal', HIGH_SEVERITY_PENALTY),
(PATTERN_PATH_TRAVERSAL_WINDOWS, 'Windows-style path traversal', HIGH_SEVERITY_PENALTY),
(PATTERN_PATH_TRAVERSAL_URL_ENCODED, 'URL-encoded path traversal', HIGH_SEVERITY_PENALTY),
(PATTERN_PATH_TRAVERSAL_UNICODE, 'Unicode-encoded path traversal', HIGH_SEVERITY_PENALTY),
(PATTERN_NULL_BYTE, 'null byte injection', HIGH_SEVERITY_PENALTY),
(PATTERN_PATH_CONCAT, 'potential path injection via concatenation', MEDIUM_SEVERITY_PENALTY),
(PATTERN_USER_INPUT_PATH, 'user input in path construction', MEDIUM_SEVERITY_PENALTY),
]
# Safe patterns get bonus points
safe_patterns = [
(PATTERN_SAFE_BASENAME, 'uses basename for safety', SAFE_PATTERN_BONUS),
(PATTERN_SAFE_PATHLIB, 'uses pathlib', SAFE_PATTERN_BONUS),
(PATTERN_PATH_VALIDATION, 'path validation', SAFE_PATTERN_BONUS),
(PATTERN_PATH_RESOLVE, 'path resolution', SAFE_PATTERN_BONUS),
]
for script_path in self.scripts:
content = self._get_script_content(script_path)
if content is None:
continue
score, findings = self._score_patterns(
content=content,
script_name=script_path.name,
dangerous_patterns=dangerous_patterns,
safe_patterns=safe_patterns,
base_score=BASE_SCORE_FILE_OPS
)
scores.append(score)
all_findings.extend(findings)
avg_score = sum(scores) / len(scores) if scores else 0.0
return avg_score, all_findings
def score_command_injection_prevention(self) -> Tuple[float, List[str]]:
"""
Score command injection prevention.
Evaluates scripts for:
- os.system(), os.popen() usage
- subprocess with shell=True
- eval(), exec() usage
- asyncio.create_subprocess_shell()
- pexpect.spawn()
- Safe patterns (shlex.quote, shell=False)
Returns:
Tuple of (average_score, findings_list)
"""
if not self.scripts:
return float(MAX_COMPONENT_SCORE), []
scores = []
all_findings = []
# Dangerous patterns with severity-based penalties
dangerous_patterns = [
(PATTERN_OS_SYSTEM, 'os.system usage - potential command injection', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_OS_POPEN, 'os.popen usage', HIGH_SEVERITY_PENALTY),
(PATTERN_EVAL, 'eval usage - code injection risk', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_EXEC, 'exec usage - code injection risk', CRITICAL_VULNERABILITY_PENALTY),
(PATTERN_SUBPROCESS_SHELL_TRUE, 'subprocess with shell=True', HIGH_SEVERITY_PENALTY),
(PATTERN_ASYNCIO_SHELL, 'asyncio.create_subprocess_shell()', HIGH_SEVERITY_PENALTY),
(PATTERN_PEXPECT_SPAWN, 'pexpect.spawn()', MEDIUM_SEVERITY_PENALTY),
]
# Safe patterns get bonus points
safe_patterns = [
(PATTERN_SAFE_SUBPROCESS, 'safe subprocess usage (shell=False)', GOOD_PRACTICE_BONUS),
(PATTERN_SHLEX_QUOTE, 'shell escaping with shlex.quote', GOOD_PRACTICE_BONUS),
(PATTERN_SHLEX_SPLIT, 'safe argument splitting with shlex.split', GOOD_PRACTICE_BONUS),
]
for script_path in self.scripts:
content = self._get_script_content(script_path)
if content is None:
continue
score, findings = self._score_patterns(
content=content,
script_name=script_path.name,
dangerous_patterns=dangerous_patterns,
safe_patterns=safe_patterns,
base_score=BASE_SCORE_COMMAND_INJECTION
)
scores.append(score)
all_findings.extend(findings)
avg_score = sum(scores) / len(scores) if scores else 0.0
return avg_score, all_findings
def score_input_validation(self) -> Tuple[float, List[str]]:
"""
Score input validation quality.
Evaluates scripts for:
- argparse usage for CLI validation
- Error handling patterns
- Type checking (isinstance)
- Regex validation
- Validation/sanitization functions
Returns:
Tuple of (average_score, suggestions_list)
"""
if not self.scripts:
return float(MAX_COMPONENT_SCORE), []
scores = []
suggestions = []
# Good validation patterns (each gives bonus points)
validation_patterns = [
(PATTERN_ARGPARSE, GOOD_PRACTICE_BONUS),
(PATTERN_TRY_EXCEPT, SAFE_PATTERN_BONUS),
(PATTERN_INPUT_CHECK, SAFE_PATTERN_BONUS),
(PATTERN_ISINSTANCE, SAFE_PATTERN_BONUS),
(PATTERN_ISDIGIT, SAFE_PATTERN_BONUS),
(PATTERN_REGEX_VALIDATION, SAFE_PATTERN_BONUS),
(PATTERN_VALIDATOR_CLASS, GOOD_PRACTICE_BONUS),
(PATTERN_VALIDATE_FUNC, SAFE_PATTERN_BONUS),
(PATTERN_SANITIZE_FUNC, SAFE_PATTERN_BONUS),
]
for script_path in self.scripts:
content = self._get_script_content(script_path)
if content is None:
continue
score = BASE_SCORE_INPUT_VALIDATION
# Check for validation patterns
for pattern, bonus in validation_patterns:
if pattern.search(content):
score += bonus
scores.append(self._clamp_score(score))
avg_score = sum(scores) / len(scores) if scores else 0.0
# 15 is 60% of MAX_COMPONENT_SCORE; below that, validation coverage is thin
if avg_score < 15:
suggestions.append("Add input validation with argparse, type checking, and error handling")
return avg_score, suggestions
def get_overall_score(self) -> Dict[str, Any]:
"""
Calculate overall security score and return detailed results.
Returns:
Dictionary containing:
- overall_score: Weighted average of all components
- components: Individual component scores
- findings: List of security issues found
- suggestions: Improvement suggestions
"""
# Score each component
sensitive_score, sensitive_findings = self.score_sensitive_data_exposure()
file_ops_score, file_ops_findings = self.score_safe_file_operations()
command_injection_score, command_findings = self.score_command_injection_prevention()
input_validation_score, input_suggestions = self.score_input_validation()
# Combine components (each out of 25 points, equal weight) onto the
# 0-100 scale. Weighting each by 0.25 would cap the total at 25,
# making the 70/50 tier thresholds and the 30-point critical cap
# unreachable, so the components are summed.
overall_score = (
sensitive_score +
file_ops_score +
command_injection_score +
input_validation_score
)
# Collect all findings
all_findings = sensitive_findings + file_ops_findings + command_findings
# Generate suggestions based on findings
suggestions = input_suggestions.copy()
if sensitive_findings:
suggestions.append("Remove hardcoded credentials and use environment variables or secure config")
if file_ops_findings:
suggestions.append("Validate and sanitize file paths, use pathlib for safe path handling")
if command_findings:
suggestions.append("Avoid shell=True in subprocess, use shlex.quote for shell arguments")
# Critical vulnerability check - if any critical issues, cap the score
critical_patterns = [
PATTERN_HARDCODED_PASSWORD, PATTERN_HARDCODED_API_KEY,
PATTERN_HARDCODED_PRIVATE_KEY, PATTERN_OS_SYSTEM,
PATTERN_EVAL, PATTERN_EXEC
]
has_critical = False
for script_path in self.scripts:
content = self._get_script_content(script_path)
if content is None:
continue
for pattern in critical_patterns:
if pattern.search(content):
has_critical = True
break
if has_critical:
break
if has_critical:
overall_score = min(overall_score, 30) # Cap at 30 if critical vulnerabilities exist
return {
'overall_score': round(overall_score, 1),
'components': {
'sensitive_data_exposure': round(sensitive_score, 1),
'safe_file_operations': round(file_ops_score, 1),
'command_injection_prevention': round(command_injection_score, 1),
'input_validation': round(input_validation_score, 1),
},
'findings': all_findings,
'suggestions': suggestions,
'has_critical_vulnerabilities': has_critical,
}
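Taken together, the combination and the critical-vulnerability cap reduce to this arithmetic (component values are hypothetical; each is out of 25 points on the documented 0-100 scale):

```python
# Hypothetical per-component scores, each out of 25 points (100 total).
components = {
    "sensitive_data_exposure": 24.0,
    "safe_file_operations": 22.0,
    "command_injection_prevention": 25.0,
    "input_validation": 23.0,
}
uncapped = sum(components.values())  # 0-100 scale

# A single critical finding (hardcoded password, eval, os.system, ...)
# caps the overall score at 30, well below the 70-point POWERFUL tier.
has_critical = True
overall = min(uncapped, 30) if has_critical else uncapped
```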


@@ -0,0 +1,851 @@
#!/usr/bin/env python3
"""
Tests for security_scorer.py - Security Dimension Scoring Tests
This test module validates the security_scorer.py module's ability to:
- Detect hardcoded sensitive data (passwords, API keys, tokens, private keys)
- Detect path traversal vulnerabilities
- Detect command injection risks
- Score input validation quality
- Handle edge cases (empty files, environment variables, etc.)
Run with: python -m unittest test_security_scorer
"""
import tempfile
import unittest
from pathlib import Path
# Add the scripts directory to the path
import sys
SCRIPTS_DIR = Path(__file__).parent.parent / "scripts"
sys.path.insert(0, str(SCRIPTS_DIR))
from security_scorer import (
SecurityScorer,
# Constants
MAX_COMPONENT_SCORE,
MIN_SCORE,
BASE_SCORE_SENSITIVE_DATA,
BASE_SCORE_FILE_OPS,
BASE_SCORE_COMMAND_INJECTION,
BASE_SCORE_INPUT_VALIDATION,
CRITICAL_VULNERABILITY_PENALTY,
HIGH_SEVERITY_PENALTY,
MEDIUM_SEVERITY_PENALTY,
LOW_SEVERITY_PENALTY,
SAFE_PATTERN_BONUS,
GOOD_PRACTICE_BONUS,
# Pre-compiled patterns
PATTERN_HARDCODED_PASSWORD,
PATTERN_HARDCODED_API_KEY,
PATTERN_HARDCODED_TOKEN,
PATTERN_HARDCODED_PRIVATE_KEY,
PATTERN_OS_SYSTEM,
PATTERN_EVAL,
PATTERN_EXEC,
PATTERN_SUBPROCESS_SHELL_TRUE,
PATTERN_SHLEX_QUOTE,
PATTERN_SAFE_ENV_VAR,
)
class TestSecurityScorerConstants(unittest.TestCase):
"""Tests for security scorer constants."""
def test_max_component_score_value(self):
"""Test that MAX_COMPONENT_SCORE is 25."""
self.assertEqual(MAX_COMPONENT_SCORE, 25)
def test_min_score_value(self):
"""Test that MIN_SCORE is 0."""
self.assertEqual(MIN_SCORE, 0)
def test_base_scores_are_reasonable(self):
"""Test that base scores are within valid range."""
self.assertGreaterEqual(BASE_SCORE_SENSITIVE_DATA, MIN_SCORE)
self.assertLessEqual(BASE_SCORE_SENSITIVE_DATA, MAX_COMPONENT_SCORE)
self.assertGreaterEqual(BASE_SCORE_FILE_OPS, MIN_SCORE)
self.assertLessEqual(BASE_SCORE_FILE_OPS, MAX_COMPONENT_SCORE)
self.assertGreaterEqual(BASE_SCORE_COMMAND_INJECTION, MIN_SCORE)
self.assertLessEqual(BASE_SCORE_COMMAND_INJECTION, MAX_COMPONENT_SCORE)
def test_penalty_values_are_negative(self):
"""Test that penalty values are negative."""
self.assertLess(CRITICAL_VULNERABILITY_PENALTY, 0)
self.assertLess(HIGH_SEVERITY_PENALTY, 0)
self.assertLess(MEDIUM_SEVERITY_PENALTY, 0)
self.assertLess(LOW_SEVERITY_PENALTY, 0)
def test_bonus_values_are_positive(self):
"""Test that bonus values are positive."""
self.assertGreater(SAFE_PATTERN_BONUS, 0)
self.assertGreater(GOOD_PRACTICE_BONUS, 0)
def test_severity_ordering(self):
"""Test that severity penalties are ordered correctly."""
# Critical should be most severe (most negative)
self.assertLess(CRITICAL_VULNERABILITY_PENALTY, HIGH_SEVERITY_PENALTY)
self.assertLess(HIGH_SEVERITY_PENALTY, MEDIUM_SEVERITY_PENALTY)
self.assertLess(MEDIUM_SEVERITY_PENALTY, LOW_SEVERITY_PENALTY)
class TestPrecompiledPatterns(unittest.TestCase):
"""Tests for pre-compiled regex patterns."""
def test_password_pattern_detects_hardcoded(self):
"""Test that password pattern detects hardcoded passwords."""
code = 'password = "my_secret_password_123"'
self.assertTrue(PATTERN_HARDCODED_PASSWORD.search(code))
def test_password_pattern_ignores_short_values(self):
"""Test that password pattern ignores very short values."""
code = 'password = "x"' # Too short
self.assertFalse(PATTERN_HARDCODED_PASSWORD.search(code))
def test_api_key_pattern_detects_hardcoded(self):
"""Test that API key pattern detects hardcoded keys."""
code = 'api_key = "sk-1234567890abcdef"'
self.assertTrue(PATTERN_HARDCODED_API_KEY.search(code))
def test_token_pattern_detects_hardcoded(self):
"""Test that token pattern detects hardcoded tokens."""
code = 'token = "ghp_1234567890abcdef"'
self.assertTrue(PATTERN_HARDCODED_TOKEN.search(code))
def test_private_key_pattern_detects_hardcoded(self):
"""Test that private key pattern detects hardcoded keys."""
code = 'private_key = "-----BEGIN RSA PRIVATE KEY-----"'
self.assertTrue(PATTERN_HARDCODED_PRIVATE_KEY.search(code))
def test_os_system_pattern_detects(self):
"""Test that os.system pattern is detected."""
code = 'os.system("ls -la")'
self.assertTrue(PATTERN_OS_SYSTEM.search(code))
def test_eval_pattern_detects(self):
"""Test that eval pattern is detected."""
code = 'result = eval(user_input)'
self.assertTrue(PATTERN_EVAL.search(code))
def test_exec_pattern_detects(self):
"""Test that exec pattern is detected."""
code = 'exec(user_code)'
self.assertTrue(PATTERN_EXEC.search(code))
def test_subprocess_shell_true_pattern_detects(self):
"""Test that subprocess shell=True pattern is detected."""
code = 'subprocess.run(cmd, shell=True)'
self.assertTrue(PATTERN_SUBPROCESS_SHELL_TRUE.search(code))
def test_shlex_quote_pattern_detects(self):
"""Test that shlex.quote pattern is detected."""
code = 'safe_cmd = shlex.quote(user_input)'
self.assertTrue(PATTERN_SHLEX_QUOTE.search(code))
def test_safe_env_var_pattern_detects(self):
"""Test that safe environment variable pattern is detected."""
code = 'password = os.getenv("DB_PASSWORD")'
self.assertTrue(PATTERN_SAFE_ENV_VAR.search(code))
class TestSecurityScorerInit(unittest.TestCase):
"""Tests for SecurityScorer initialization."""
def test_init_with_empty_list(self):
"""Test initialization with empty script list."""
scorer = SecurityScorer([])
self.assertEqual(scorer.scripts, [])
self.assertFalse(scorer.verbose)
def test_init_with_scripts(self):
"""Test initialization with script list."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "test.py"
script_path.write_text("# test")
scorer = SecurityScorer([script_path])
self.assertEqual(len(scorer.scripts), 1)
def test_init_with_verbose(self):
"""Test initialization with verbose mode."""
scorer = SecurityScorer([], verbose=True)
self.assertTrue(scorer.verbose)
class TestSensitiveDataExposure(unittest.TestCase):
"""Tests for sensitive data exposure scoring."""
def test_no_scripts_returns_max_score(self):
"""Test that empty script list returns max score."""
scorer = SecurityScorer([])
score, findings = scorer.score_sensitive_data_exposure()
self.assertEqual(score, float(MAX_COMPONENT_SCORE))
self.assertEqual(findings, [])
def test_clean_script_scores_high(self):
"""Test that clean script without sensitive data scores high."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "clean.py"
script_path.write_text("""
import os
def get_password():
return os.getenv("DB_PASSWORD")
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
self.assertGreaterEqual(score, 20)
self.assertEqual(len(findings), 0)
def test_hardcoded_password_detected(self):
"""Test that hardcoded password is detected and penalized."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "insecure.py"
script_path.write_text("""
password = "super_secret_password_123"
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
self.assertLess(score, MAX_COMPONENT_SCORE)
self.assertTrue(any('password' in f.lower() for f in findings))
def test_hardcoded_api_key_detected(self):
"""Test that hardcoded API key is detected and penalized."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "insecure.py"
script_path.write_text("""
api_key = "sk-1234567890abcdef123456"
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
self.assertLess(score, MAX_COMPONENT_SCORE)
# Check for 'api' or 'hardcoded' in findings (description is 'hardcoded API key')
self.assertTrue(any('api' in f.lower() or 'hardcoded' in f.lower() for f in findings))
def test_hardcoded_token_detected(self):
"""Test that hardcoded token is detected and penalized."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "insecure.py"
script_path.write_text("""
token = "ghp_1234567890abcdef"
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
self.assertLess(score, MAX_COMPONENT_SCORE)
def test_hardcoded_private_key_detected(self):
"""Test that hardcoded private key is detected and penalized."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "insecure.py"
script_path.write_text("""
private_key = "-----BEGIN RSA PRIVATE KEY-----MIIEowIBAAJCA..."
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
self.assertLess(score, MAX_COMPONENT_SCORE)
def test_environment_variable_not_flagged(self):
"""Test that environment variable usage is not flagged as sensitive."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "secure.py"
script_path.write_text("""
import os
def get_credentials():
password = os.getenv("DB_PASSWORD")
api_key = os.environ.get("API_KEY")
return password, api_key
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
# Should score well because using environment variables
self.assertGreaterEqual(score, 20)
def test_empty_file_handled(self):
"""Test that empty file is handled gracefully."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "empty.py"
script_path.write_text("")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
# Should return max score for empty file (no sensitive data)
self.assertGreaterEqual(score, 20)
def test_jwt_token_detected(self):
"""Test that JWT token in code is detected."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "jwt.py"
script_path.write_text("""
# JWT token for testing
token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U"
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_sensitive_data_exposure()
# Should be penalized for JWT token
self.assertLess(score, MAX_COMPONENT_SCORE)
class TestSafeFileOperations(unittest.TestCase):
"""Tests for safe file operations scoring."""
def test_no_scripts_returns_max_score(self):
"""Test that empty script list returns max score."""
scorer = SecurityScorer([])
score, findings = scorer.score_safe_file_operations()
self.assertEqual(score, float(MAX_COMPONENT_SCORE))
def test_safe_pathlib_usage_scores_high(self):
"""Test that safe pathlib usage scores high."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "safe.py"
script_path.write_text("""
from pathlib import Path
def read_file(filename):
path = Path(filename).resolve()
return path.read_text()
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_safe_file_operations()
self.assertGreaterEqual(score, 15)
def test_path_traversal_detected(self):
"""Test that path traversal pattern is detected."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "risky.py"
script_path.write_text("""
def read_file(base_path, filename):
# Potential path traversal vulnerability
path = base_path + "/../../../etc/passwd"
return open(path).read()
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_safe_file_operations()
self.assertLess(score, MAX_COMPONENT_SCORE)
def test_basename_usage_scores_bonus(self):
"""Test that basename usage gets bonus points."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "safe.py"
script_path.write_text("""
import os
def safe_filename(user_input):
return os.path.basename(user_input)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_safe_file_operations()
self.assertGreaterEqual(score, 15)
class TestCommandInjectionPrevention(unittest.TestCase):
"""Tests for command injection prevention scoring."""
def test_no_scripts_returns_max_score(self):
"""Test that empty script list returns max score."""
scorer = SecurityScorer([])
score, findings = scorer.score_command_injection_prevention()
self.assertEqual(score, float(MAX_COMPONENT_SCORE))
def test_os_system_detected(self):
"""Test that os.system usage is detected."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "risky.py"
script_path.write_text("""
import os
def run_command(user_input):
os.system("echo " + user_input)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_command_injection_prevention()
self.assertLess(score, MAX_COMPONENT_SCORE)
self.assertTrue(any('os.system' in f.lower() for f in findings))
def test_subprocess_shell_true_detected(self):
"""Test that subprocess with shell=True is detected."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "risky.py"
script_path.write_text("""
import subprocess
def run_command(cmd):
subprocess.run(cmd, shell=True)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_command_injection_prevention()
self.assertLess(score, MAX_COMPONENT_SCORE)
# Check for 'shell' or 'subprocess' in findings
self.assertTrue(any('shell' in f.lower() or 'subprocess' in f.lower() for f in findings))
def test_eval_detected(self):
"""Test that eval usage is detected."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "risky.py"
script_path.write_text("""
def evaluate(user_input):
return eval(user_input)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_command_injection_prevention()
self.assertLess(score, MAX_COMPONENT_SCORE)
self.assertTrue(any('eval' in f.lower() for f in findings))
def test_exec_detected(self):
"""Test that exec usage is detected."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "risky.py"
script_path.write_text("""
def execute(user_code):
exec(user_code)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_command_injection_prevention()
self.assertLess(score, MAX_COMPONENT_SCORE)
self.assertTrue(any('exec' in f.lower() for f in findings))
def test_shlex_quote_gets_bonus(self):
"""Test that shlex.quote usage gets bonus points."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "safe.py"
script_path.write_text("""
import shlex
import subprocess
def run_command(user_input):
safe_cmd = shlex.quote(user_input)
subprocess.run(["echo", safe_cmd], shell=False)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_command_injection_prevention()
self.assertGreaterEqual(score, 20)
def test_safe_subprocess_scores_high(self):
"""Test that safe subprocess usage (shell=False) scores high."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "safe.py"
script_path.write_text("""
import subprocess
def run_command(cmd_parts):
subprocess.run(cmd_parts, shell=False)
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, findings = scorer.score_command_injection_prevention()
self.assertGreaterEqual(score, 20)
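

# --- Illustrative companion test (an addition, not part of the original
# suite): the tests above assert that shlex.quote earns a bonus, so this
# small self-contained check documents the actual quoting behavior being
# rewarded, using only the standard library and no SecurityScorer.
class TestShlexQuoteBehavior(unittest.TestCase):
    """Standalone sketch of the shlex.quote behavior rewarded above."""
    def test_quote_neutralizes_shell_metacharacters(self):
        import shlex
        # Unsafe input is wrapped in single quotes so the shell treats
        # it as one literal argument instead of interpreting the ';'.
        self.assertEqual(shlex.quote("foo; rm -rf /"), "'foo; rm -rf /'")
    def test_safe_token_passes_through_unchanged(self):
        import shlex
        # Tokens containing only safe characters are returned as-is.
        self.assertEqual(shlex.quote("safe_token"), "safe_token")
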
class TestInputValidation(unittest.TestCase):
"""Tests for input validation scoring."""
def test_no_scripts_returns_max_score(self):
"""Test that empty script list returns max score."""
scorer = SecurityScorer([])
score, suggestions = scorer.score_input_validation()
self.assertEqual(score, float(MAX_COMPONENT_SCORE))
def test_argparse_usage_scores_bonus(self):
"""Test that argparse usage gets bonus points."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "good.py"
script_path.write_text("""
import argparse
def main():
parser = argparse.ArgumentParser()
parser.add_argument("input")
args = parser.parse_args()
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, suggestions = scorer.score_input_validation()
        # argparse should earn a bonus on top of the base score while
        # keeping the total within the component maximum
        self.assertGreater(score, BASE_SCORE_INPUT_VALIDATION)
        self.assertLessEqual(score, MAX_COMPONENT_SCORE)
def test_isinstance_usage_scores_bonus(self):
"""Test that isinstance usage gets bonus points."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "good.py"
script_path.write_text("""
def process(value):
if isinstance(value, str):
return value.upper()
return value
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, suggestions = scorer.score_input_validation()
self.assertGreater(score, BASE_SCORE_INPUT_VALIDATION)
def test_try_except_scores_bonus(self):
"""Test that try/except usage gets bonus points."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "good.py"
script_path.write_text("""
def process(value):
try:
return int(value)
except ValueError:
return 0
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, suggestions = scorer.score_input_validation()
self.assertGreater(score, BASE_SCORE_INPUT_VALIDATION)
def test_minimal_validation_gets_suggestion(self):
"""Test that minimal validation triggers suggestion."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "minimal.py"
script_path.write_text("""
def main():
print("Hello")
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
score, suggestions = scorer.score_input_validation()
self.assertLess(score, 15)
self.assertTrue(len(suggestions) > 0)
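

# --- Illustrative companion test (an addition, not part of the original
# suite): the argparse bonus tested above rewards a real validation
# mechanism, since argparse's type conversion rejects malformed input
# before it reaches the program. This standalone, stdlib-only check
# documents that behavior without touching SecurityScorer.
class TestArgparseValidationBehavior(unittest.TestCase):
    """Standalone sketch of the argparse validation rewarded above."""
    def test_type_conversion_rejects_non_integer(self):
        import argparse
        import contextlib
        import io
        parser = argparse.ArgumentParser()
        parser.add_argument("--count", type=int)
        # argparse exits on invalid input; suppress its usage message
        # so the test output stays clean.
        with contextlib.redirect_stderr(io.StringIO()):
            with self.assertRaises(SystemExit):
                parser.parse_args(["--count", "not_a_number"])
    def test_type_conversion_accepts_integer(self):
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--count", type=int)
        args = parser.parse_args(["--count", "42"])
        self.assertEqual(args.count, 42)
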
class TestOverallScore(unittest.TestCase):
"""Tests for overall security score calculation."""
def test_overall_score_components_present(self):
"""Test that overall score includes all components."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "test.py"
script_path.write_text("""
import os
import argparse
def main():
parser = argparse.ArgumentParser()
parser.parse_args()
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
results = scorer.get_overall_score()
self.assertIn('overall_score', results)
self.assertIn('components', results)
self.assertIn('findings', results)
self.assertIn('suggestions', results)
components = results['components']
self.assertIn('sensitive_data_exposure', components)
self.assertIn('safe_file_operations', components)
self.assertIn('command_injection_prevention', components)
self.assertIn('input_validation', components)
def test_overall_score_is_weighted_average(self):
"""Test that overall score is calculated as weighted average."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "test.py"
script_path.write_text("""
import argparse
def main():
parser = argparse.ArgumentParser()
parser.parse_args()
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
results = scorer.get_overall_score()
# Calculate expected weighted average
expected = (
results['components']['sensitive_data_exposure'] * 0.25 +
results['components']['safe_file_operations'] * 0.25 +
results['components']['command_injection_prevention'] * 0.25 +
results['components']['input_validation'] * 0.25
)
self.assertAlmostEqual(results['overall_score'], expected, places=0)
def test_critical_vulnerability_caps_score(self):
"""Test that critical vulnerabilities cap the overall score."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "critical.py"
script_path.write_text("""
password = "hardcoded_password_123"
api_key = "sk-1234567890abcdef"
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
results = scorer.get_overall_score()
# Score should be capped at 30 for critical vulnerabilities
self.assertLessEqual(results['overall_score'], 30)
self.assertTrue(results['has_critical_vulnerabilities'])
def test_secure_script_scores_high(self):
"""Test that secure script scores high overall."""
with tempfile.TemporaryDirectory() as tmpdir:
script_path = Path(tmpdir) / "secure.py"
script_path.write_text("""
#!/usr/bin/env python3
import argparse
import os
import shlex
import subprocess
from pathlib import Path
def validate_path(path_str):
path = Path(path_str).resolve()
if not path.exists():
raise FileNotFoundError("Path not found")
return path
def safe_command(args):
return subprocess.run(args, shell=False, capture_output=True)
def main():
parser = argparse.ArgumentParser()
parser.add_argument("input")
args = parser.parse_args()
db_password = os.getenv("DB_PASSWORD")
path = validate_path(args.input)
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([script_path])
results = scorer.get_overall_score()
# Secure script should score reasonably well
# Note: Score may vary based on pattern detection
self.assertGreater(results['overall_score'], 20)
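

# --- Illustrative companion test (an addition, not part of the original
# suite): the secure script above reads its secret via os.getenv rather
# than a hardcoded literal. This standalone, stdlib-only check documents
# the getenv contract that pattern relies on.
class TestEnvVarSecretPattern(unittest.TestCase):
    """Standalone sketch of the os.getenv pattern rewarded above."""
    def test_getenv_handles_unset_variable(self):
        import os
        # Hypothetical variable name chosen to be absent from the env.
        unset = "CLAUDE_SKILLS_TEST_UNSET_VAR"
        os.environ.pop(unset, None)  # ensure the variable is absent
        self.assertIsNone(os.getenv(unset))
        self.assertEqual(os.getenv(unset, "fallback"), "fallback")
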
class TestScoreClamping(unittest.TestCase):
"""Tests for score boundary clamping."""
def test_score_never_below_zero(self):
"""Test that score never goes below 0."""
scorer = SecurityScorer([])
# Test with extreme negative
result = scorer._clamp_score(-100)
self.assertEqual(result, MIN_SCORE)
def test_score_never_above_max(self):
"""Test that score never goes above max."""
scorer = SecurityScorer([])
# Test with extreme positive
result = scorer._clamp_score(1000)
self.assertEqual(result, MAX_COMPONENT_SCORE)
def test_score_unchanged_in_valid_range(self):
"""Test that score is unchanged in valid range."""
scorer = SecurityScorer([])
for test_score in [0, 5, 10, 15, 20, 25]:
result = scorer._clamp_score(test_score)
self.assertEqual(result, test_score)
class TestMultipleScripts(unittest.TestCase):
"""Tests for scoring multiple scripts."""
def test_multiple_scripts_averaged(self):
"""Test that scores are averaged across multiple scripts."""
with tempfile.TemporaryDirectory() as tmpdir:
secure_script = Path(tmpdir) / "secure.py"
secure_script.write_text("""
import os
def main():
password = os.getenv("PASSWORD")
if __name__ == "__main__":
main()
""")
insecure_script = Path(tmpdir) / "insecure.py"
insecure_script.write_text("""
password = "hardcoded_secret_password"
def main():
pass
if __name__ == "__main__":
main()
""")
scorer = SecurityScorer([secure_script, insecure_script])
score, findings = scorer.score_sensitive_data_exposure()
# Score should be between secure and insecure
self.assertGreater(score, 0)
self.assertLess(score, MAX_COMPONENT_SCORE)
if __name__ == "__main__":
unittest.main(verbosity=2)