Upgrade SKILL.md to version 2.0.0 with new guidelines
Updated version from 1.0.0 to 2.0.0 and expanded the audit dimensions with additional checks and guidelines.
@@ -5,7 +5,7 @@ risk: safe
source: original
date_added: "2026-02-28"
metadata:
-  version: 1.0.0
+  version: 2.0.0
---

# Vibe Code Auditor
@@ -39,6 +39,12 @@ Before beginning the audit, confirm the following. If any item is missing, state
- **Scope defined**: Identify whether the input is a snippet, single file, or multi-file system.
- **Context noted**: If no context was provided, state the assumptions made (e.g., "Assuming a web API backend with no specified scale requirements").

**Quick Scan (first 60 seconds):**
- Count files and lines of code
- Identify language(s) and framework(s)
- Spot obvious red flags: hardcoded secrets, bare excepts, TODOs, commented-out code
- Note the entry point(s) and data flow direction
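
The quick scan above can be mechanized as a small script. This is a minimal sketch in Python, assuming a Python codebase; the red-flag regex and the `.py` glob are illustrative choices, not part of the skill:

```python
import re
from pathlib import Path

# Illustrative red flags: hardcoded credentials, bare excepts, TODOs
RED_FLAGS = re.compile(r"(?:password|secret|api_key|token)\s*=\s*['\"]|except\s*:|TODO")

def quick_scan(root: str) -> dict:
    """Count files/lines and flag obvious risks in a source tree."""
    stats = {"files": 0, "lines": 0, "flagged": []}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        stats["files"] += 1
        stats["lines"] += len(text.splitlines())
        for i, line in enumerate(text.splitlines(), 1):
            if RED_FLAGS.search(line):
                stats["flagged"].append((str(path), i, line.strip()))
    return stats
```

The output gives the file/line counts and a flagged-line list that seed the Quick Stats and Executive Summary sections.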

---

## Audit Dimensions
@@ -47,57 +53,128 @@ Evaluate the code across all seven dimensions below. For each finding, record: t

**Do not invent findings. Do not report issues you cannot substantiate from the code provided.**

**Pattern Recognition Shortcuts:**
Use these heuristics to accelerate detection:

| Pattern | Likely Issue | Quick Check |
|---------|-------------|-------------|
| `eval()`, `exec()`, `os.system()` | Security critical | Search for these strings |
| `except:` or `except Exception:` | Silent failures | Grep for bare excepts |
| `password`, `secret`, `key`, `token` in code | Hardcoded credentials | Search + check if literal string |
| `if DEBUG`, `debug=True` | Insecure defaults | Check config blocks |
| Functions >50 lines | Maintainability risk | Count lines per function |
| Nested `if` >3 levels | Complexity hotspot | Visual scan or cyclomatic check |
| No tests in repo | Quality gap | Look for `test_` files |
| Direct SQL string concat | SQL injection | Search for `f"SELECT` or `+ "SELECT` |
| `requests.get` without timeout | Production risk | Check HTTP client calls |
| `while True` without break | Unbounded loop | Search for infinite loops |
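
The shortcut table maps naturally onto a first-pass scanner. A sketch, assuming Python source; each regex is an approximation of the corresponding table row and will produce false positives by design:

```python
import re

# One approximate regex per table row; tune per codebase
PATTERNS = {
    "security-critical call": r"\beval\(|\bexec\(|os\.system\(",
    "silent failure": r"except\s*(Exception\s*)?:",
    "hardcoded credential": r"(password|secret|key|token)\s*=\s*['\"]",
    "insecure default": r"if\s+DEBUG|debug\s*=\s*True",
    "SQL string concat": r"f['\"]SELECT|\+\s*['\"]SELECT",
}

def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, issue) pairs for every heuristic hit."""
    hits = []
    for n, line in enumerate(source.splitlines(), 1):
        for issue, pattern in PATTERNS.items():
            if re.search(pattern, line):
                hits.append((n, issue))
    return hits
```

Every hit still needs manual confirmation; the scanner only prioritizes where to look first.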

### 1. Architecture & Design

**Quick checks:**
- Can you identify the entry point in 10 seconds?
- Are there clear boundaries between layers (API, business logic, data)?
- Does any single file exceed 300 lines?

- Separation of concerns violations (e.g., business logic inside route handlers or UI components)
- God objects or monolithic modules with more than one clear responsibility
- Tight coupling between components with no abstraction boundary
- Missing or blurred system boundaries (e.g., database queries scattered across layers)
- Circular dependencies or import cycles
- No clear data flow or state management strategy
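
For reference, a before/after sketch of the first violation above (a hypothetical order handler; the names and the discount rule are invented for illustration):

```python
# Before (hypothetical): business logic buried in the route handler
def create_order_handler(request: dict) -> dict:
    total = sum(i["price"] * i["qty"] for i in request["items"])
    if total > 1000:
        total *= 0.95  # discount rule lives in the transport layer
    return {"status": 201, "total": total}

# After: the rule moves to a service function the handler merely calls,
# so it can be tested and reused without an HTTP request in sight
def compute_order_total(items: list[dict]) -> float:
    total = sum(i["price"] * i["qty"] for i in items)
    return total * 0.95 if total > 1000 else total

def create_order_handler_v2(request: dict) -> dict:
    return {"status": 201, "total": compute_order_total(request["items"])}
```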

### 2. Consistency & Maintainability

**Quick checks:**
- Are similar operations named consistently? (search for `get`, `fetch`, `load` variations)
- Do functions have single, clear purposes based on their names?
- Is duplicated logic visible? (search for repeated code blocks)

- Naming inconsistencies (e.g., `get_user` vs `fetchUser` vs `retrieveUserData` for the same operation)
- Mixed paradigms without justification (e.g., OOP and procedural code interleaved arbitrarily)
-- Copy-paste logic that should be extracted into a shared function
+- Copy-paste logic that should be extracted into a shared function (3+ repetitions = extract)
- Abstractions that obscure rather than clarify intent
- Inconsistent error handling patterns across modules
- Magic numbers or strings without constants or configuration
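
A minimal illustration of the magic-number finding (the rate-limit function and its threshold are hypothetical):

```python
# Before (hypothetical): an unexplained 100/60 buried in the logic
def is_rate_limited(requests_made: int, window_seconds: int) -> bool:
    return requests_made / window_seconds > 100 / 60

# After: the threshold is named, overridable, and documented in one place
MAX_REQUESTS_PER_MINUTE = 100

def is_rate_limited_v2(requests_made: int, window_seconds: int,
                       limit: int = MAX_REQUESTS_PER_MINUTE) -> bool:
    return requests_made / window_seconds > limit / 60
```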

### 3. Robustness & Error Handling

**Quick checks:**
- Does every external call (API, DB, file) have error handling?
- Are there any bare `except:` blocks?
- What happens if inputs are empty, null, or malformed?

- Missing input validation on entry points (HTTP handlers, CLI args, file reads)
- Bare `except` or catch-all error handlers that swallow failures silently
- Unhandled edge cases (empty collections, null/None returns, zero values)
- Code that assumes external services always succeed without fallback logic
- No retry logic for transient failures (network, rate limits)
- Missing timeouts on blocking operations (HTTP, DB, I/O)
- No validation of data from external sources before use
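
The missing-retry finding can be addressed with a small helper. A sketch using only the standard library; the exception tuple and backoff constants are illustrative defaults, and blocking calls passed in should still carry their own timeouts:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(fn: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.1,
                 retry_on: tuple = (ConnectionError, TimeoutError)) -> T:
    """Call fn, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # exhausted: surface the failure, never swallow it
            time.sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Non-transient errors propagate immediately, which keeps the helper from masking real bugs.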

### 4. Production Risks

**Quick checks:**
- Search for hardcoded URLs, IPs, or paths
- Check for logging statements (or lack thereof)
- Look for database queries in loops

- Hardcoded configuration values (URLs, credentials, timeouts, thresholds)
- Missing structured logging or observability hooks
- Unbounded loops, missing pagination, or N+1 query patterns
- Blocking or synchronous I/O in async or event-driven contexts, or thread-unsafe shared state
- No graceful shutdown or cleanup on process exit
- Missing health checks or readiness endpoints
- No rate limiting or backpressure mechanisms
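
A before/after sketch of the N+1 pattern (the `fetch_user`/`fetch_users_by_ids` callables stand in for real data-access functions and are hypothetical):

```python
# Before (hypothetical): one query per order inside the loop (N+1)
def enrich_orders_n_plus_1(orders, fetch_user):
    return [{**o, "user": fetch_user(o["user_id"])} for o in orders]

# After: collect the ids, fetch once, join in memory
def enrich_orders_batched(orders, fetch_users_by_ids):
    users = fetch_users_by_ids({o["user_id"] for o in orders})  # one round trip
    return [{**o, "user": users[o["user_id"]]} for o in orders]
```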

### 5. Security & Safety

**Quick checks:**
- Search for: `eval`, `exec`, `os.system`, `subprocess`
- Look for: `password`, `secret`, `api_key`, `token` as string literals
- Check for: `SELECT * FROM` + string concatenation
- Verify: input sanitization before DB, shell, or file operations

- Unsanitized user input passed to databases, shells, file paths, or `eval`
- Credentials, API keys, or tokens present in source code or logs
- Insecure defaults (e.g., `DEBUG=True`, permissive CORS, no rate limiting)
- Trust boundary violations (e.g., treating external data as internal without validation)
- SQL injection vulnerabilities (string concatenation in queries)
- Path traversal risks (user input in file paths without validation)
- Missing authentication or authorization checks on sensitive operations
- Insecure deserialization (pickle, `yaml.load` without `SafeLoader`)
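
For the SQL injection finding, the canonical fix is parameter binding. A sketch using the standard library's `sqlite3`; the table and column names are hypothetical:

```python
import sqlite3

def find_user(conn: sqlite3.Connection, username: str):
    # Before (vulnerable): f"SELECT id, name FROM users WHERE name = '{username}'"
    # After: placeholder binding -- the driver escapes the value, so input
    # like "x' OR '1'='1" is treated as data, never as SQL
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchone()
```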

### 6. Dead or Hallucinated Code

**Quick checks:**
- Search for function/class definitions, then check for callers
- Look for imports that seem unused
- Check if referenced libraries match requirements.txt or package.json

- Functions, classes, or modules that are defined but never called
- Imports that do not exist in the declared dependencies
- References to APIs, methods, or fields that do not exist in the used library version
- Type annotations that contradict actual usage
- Comments that describe behavior inconsistent with the code
- Unreachable code blocks (after `return`, `raise`, or `break` in all paths)
- Feature flags or conditionals that are always true/false
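
The defined-but-never-called check can be approximated with the standard library's `ast` module. A single-file heuristic sketch, not a substitute for a real dead-code tool:

```python
import ast

def unused_functions(source: str) -> set[str]:
    """Report functions that are defined but never referenced.

    Coarse single-file heuristic: decorators, re-exports, and dynamic
    dispatch will all produce false positives, so treat hits as leads.
    """
    tree = ast.parse(source)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    used |= {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}
    return defined - used
```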

### 7. Technical Debt Hotspots

**Quick checks:**
- Count function parameters (5+ = refactor candidate)
- Measure nesting depth visually (4+ = refactor candidate)
- Look for boolean flags controlling function behavior

- Logic that is correct today but will break under realistic load or scale
- Deep nesting (more than 3-4 levels) that obscures control flow
- Boolean parameter flags that change function behavior (use separate functions instead)
- Functions with more than 5-6 parameters without a configuration object
- Areas where a future requirement change would require modifying many unrelated files
- Missing type hints in dynamically typed languages for complex functions
- No documentation for public APIs or complex algorithms
- Test coverage gaps for critical paths
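
A before/after sketch of the boolean-flag finding (the `export` function and its formats are invented for illustration):

```python
import json

# Before (hypothetical): a boolean flag makes one function do two jobs
def export(data: list[dict], as_csv: bool = False) -> str:
    if as_csv:
        header = ",".join(data[0])
        rows = [",".join(str(v) for v in row.values()) for row in data]
        return "\n".join([header, *rows])
    return json.dumps(data)

# After: two functions with honest names; call sites state intent directly
def export_json(data: list[dict]) -> str:
    return json.dumps(data)

def export_csv(data: list[dict]) -> str:
    header = ",".join(data[0])
    rows = [",".join(str(v) for v in row.values()) for row in data]
    return "\n".join([header, *rows])
```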

---

@@ -105,12 +182,30 @@ Evaluate the code across all seven dimensions below. For each finding, record: t

Produce the audit report using exactly this structure. Do not omit sections. If a section has no findings, write "None identified."

**Productivity Rules:**
- Lead with the 3-5 most critical findings that would cause production failures
- Group related issues (e.g., "3 locations with hardcoded credentials" instead of listing separately)
- Provide copy-paste-ready fixes where possible (exact code snippets)
- Use severity tags consistently: `[CRITICAL]`, `[HIGH]`, `[MEDIUM]`, `[LOW]`

---

### Audit Report

**Input:** [file name(s) or "code snippet"]
**Assumptions:** [list any assumptions made about context or environment]
**Quick Stats:** [X files, Y lines of code, Z language/framework]

#### Executive Summary (Read This First)

In 3-5 bullets, state the most important findings that determine whether this code can go to production:

```
- [CRITICAL/HIGH] One-line summary of the most severe issue
- [CRITICAL/HIGH] Second most severe issue
- [MEDIUM] Notable pattern that will cause future problems
- Overall: Deployable as-is / Needs fixes / Requires major rework
```

#### Critical Issues (Must Fix Before Production)

@@ -124,6 +219,11 @@ Location: filename.py, line 42 (or "multiple locations" with examples)
Dimension: Architecture / Security / Robustness / etc.
Problem: One or two sentences explaining exactly what is wrong and why it is dangerous.
Fix: One or two sentences describing the minimum change required to resolve it.
Code Fix (if applicable):
```python
# Before: problematic code
# After: corrected version
```
```

#### High-Risk Issues

@@ -152,23 +252,37 @@ Provide a score using the rubric below, then write 2-3 sentences justifying it w
| 71-85 | Production-viable with targeted fixes. Known risks are bounded. |
| 86-100 | Production-ready. Minor improvements only. |

-Score deductions:
+**Scoring Algorithm:**

-- Each Critical issue: -10 to -20 points depending on blast radius
-- Each High issue: -5 to -10 points
-- Pervasive maintainability debt (3+ Medium issues in one dimension): -5 points
+```
+Start at 100 points
+For each CRITICAL issue: -15 points (security: -20)
+For each HIGH issue: -8 points
+For each MEDIUM issue: -3 points
+For pervasive patterns (3+ similar issues): -5 additional points
+Floor: 0, Ceiling: 100
+```
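
The scoring algorithm above translates directly into code. A sketch; representing security-related criticals as a per-finding boolean and grouping pervasive patterns by a `pattern` key are assumptions of this illustration:

```python
def audit_score(findings: list[dict]) -> int:
    """Start at 100 and deduct per finding, per the scoring algorithm."""
    score = 100
    deduction = {"CRITICAL": 15, "HIGH": 8, "MEDIUM": 3, "LOW": 0}
    counts: dict[str, int] = {}
    for f in findings:
        sev = f["severity"]
        if sev == "CRITICAL" and f.get("security"):
            score -= 20  # security criticals cost more
        else:
            score -= deduction[sev]
        key = f.get("pattern", sev)
        counts[key] = counts.get(key, 0) + 1
    # Pervasive patterns (3+ similar issues) cost 5 extra points each
    score -= 5 * sum(1 for c in counts.values() if c >= 3)
    return max(0, min(100, score))
```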

#### Refactoring Priorities

List the top 3-5 changes in order of impact. Each item must reference a specific finding from above.

```
-1. [Priority] Fix title — addresses [CRITICAL/HIGH ref] — estimated effort: S/M/L
-2. ...
+1. [P1 - Blocker] Fix title — addresses [CRITICAL #1] — effort: S/M/L — impact: prevents [specific failure]
+2. [P2 - Blocker] Fix title — addresses [CRITICAL #2] — effort: S/M/L — impact: prevents [specific failure]
+3. [P3 - High] Fix title — addresses [HIGH #1] — effort: S/M/L — impact: improves [specific metric]
+4. [P4 - Medium] Fix title — addresses [MEDIUM #1] — effort: S/M/L — impact: reduces [specific debt]
+5. [P5 - Optional] Fix title — addresses [LOW #1] — effort: S/M/L — impact: nice-to-have
```

Effort scale: S = < 1 day, M = 1-3 days, L = > 3 days.

**Quick Wins (fix in <1 hour):**
List any issues that can be resolved immediately with minimal effort:
```
- [Issue name]: [one-line fix description]
```

---

## Behavior Rules

@@ -180,6 +294,20 @@ Effort scale: S = < 1 day, M = 1-3 days, L = > 3 days.
- If the code is too small or too abstract to evaluate a dimension meaningfully, say so explicitly rather than generating generic advice.
- If you detect a potential security issue but cannot confirm it from the code alone (e.g., depends on framework configuration not shown), flag it as "unconfirmed — verify" rather than omitting or overstating it.

**Efficiency Rules:**
- Scan for critical patterns first (security, data loss, crashes) before deeper analysis
- Group similar issues by pattern rather than listing each occurrence separately
- Provide exact code fixes for critical/high issues when the solution is straightforward
- Skip dimensions that are not applicable to the code size or type (state "Not applicable: [reason]")
- Focus on issues that would cause production incidents, not theoretical concerns

**Calibration:**
- For snippets (<100 lines): Focus on security, robustness, and obvious bugs only
- For single files (100-500 lines): Add architecture and maintainability checks
- For multi-file systems (500+ lines): Full audit across all 7 dimensions
- For production code: Emphasize security, observability, and failure modes
- For prototypes: Emphasize scalability limits and technical debt

---

## Task-Specific Inputs
@@ -189,6 +317,12 @@ Before auditing, if not already provided, ask:
1. **Code or files**: Share the source code to audit. Accepted: single file, multiple files, directory listing, or snippet.
2. **Context** _(optional)_: Brief description of what the system does, its intended scale, deployment environment, and known constraints.
3. **Target environment** _(optional)_: Target runtime (e.g., production web service, CLI tool, data pipeline). Used to calibrate risk severity.
4. **Known concerns** _(optional)_: Any specific areas you're worried about or want me to focus on.

**If context is missing, assume:**
- Language/framework is evident from the code
- Deployment target is production web service (most common)
- Scale expectations are moderate (100-1000 users) unless code suggests otherwise

---

@@ -197,3 +331,5 @@ Before auditing, if not already provided, ask:
- **schema-markup**: For adding structured data after code is production-ready.
- **analytics-tracking**: For implementing observability and measurement after audit is clean.
- **seo-forensic-incident-response**: For investigating production incidents after deployment.
- **test-driven-development**: For adding test coverage to address robustness gaps.
- **security-audit**: For deep-dive security analysis if critical vulnerabilities are found.