Upgrade SKILL.md to version 2.0.0 with new guidelines
Updated version from 1.0.0 to 2.0.0 and expanded the audit dimensions with additional checks and guidelines.
@@ -5,7 +5,7 @@ risk: safe
source: original
date_added: "2026-02-28"
metadata:
-  version: 1.0.0
+  version: 2.0.0
---

# Vibe Code Auditor
@@ -39,6 +39,12 @@ Before beginning the audit, confirm the following. If any item is missing, state
- **Scope defined**: Identify whether the input is a snippet, single file, or multi-file system.
- **Context noted**: If no context was provided, state the assumptions made (e.g., "Assuming a web API backend with no specified scale requirements").

**Quick Scan (first 60 seconds):**
- Count files and lines of code
- Identify language(s) and framework(s)
- Spot obvious red flags: hardcoded secrets, bare excepts, TODOs, commented-out code
- Note the entry point(s) and data flow direction
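
The quick scan above can be mechanized as a small script. This is a minimal sketch in Python, assuming a Python codebase; the red-flag regex and the `.py` glob are illustrative choices, not part of the skill:

```python
import re
from pathlib import Path

# Illustrative red flags: hardcoded credentials, bare excepts, TODOs
RED_FLAGS = re.compile(r"(?:password|secret|api_key|token)\s*=\s*['\"]|except\s*:|TODO")

def quick_scan(root: str) -> dict:
    """Count files/lines and flag obvious risks in a source tree."""
    stats = {"files": 0, "lines": 0, "flagged": []}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        stats["files"] += 1
        stats["lines"] += len(text.splitlines())
        for i, line in enumerate(text.splitlines(), 1):
            if RED_FLAGS.search(line):
                stats["flagged"].append((str(path), i, line.strip()))
    return stats
```

The output gives the file/line counts and a flagged-line list that seed the Quick Stats and Executive Summary sections.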

---

## Audit Dimensions
@@ -47,57 +53,128 @@ Evaluate the code across all seven dimensions below. For each finding, record: t

**Do not invent findings. Do not report issues you cannot substantiate from the code provided.**

**Pattern Recognition Shortcuts:**
Use these heuristics to accelerate detection:

| Pattern | Likely Issue | Quick Check |
|---------|-------------|-------------|
| `eval()`, `exec()`, `os.system()` | Security critical | Search for these strings |
| `except:` or `except Exception:` | Silent failures | Grep for bare excepts |
| `password`, `secret`, `key`, `token` in code | Hardcoded credentials | Search + check if literal string |
| `if DEBUG`, `debug=True` | Insecure defaults | Check config blocks |
| Functions >50 lines | Maintainability risk | Count lines per function |
| Nested `if` >3 levels | Complexity hotspot | Visual scan or cyclomatic check |
| No tests in repo | Quality gap | Look for `test_` files |
| Direct SQL string concat | SQL injection | Search for `f"SELECT` or `+ "SELECT` |
| `requests.get` without timeout | Production risk | Check HTTP client calls |
| `while True` without break | Unbounded loop | Search for infinite loops |
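
The shortcut table maps naturally onto a first-pass scanner. A sketch, assuming Python source; each regex is an approximation of the corresponding table row and will produce false positives by design:

```python
import re

# One approximate regex per table row; tune per codebase
PATTERNS = {
    "security-critical call": r"\beval\(|\bexec\(|os\.system\(",
    "silent failure": r"except\s*(Exception\s*)?:",
    "hardcoded credential": r"(password|secret|key|token)\s*=\s*['\"]",
    "insecure default": r"if\s+DEBUG|debug\s*=\s*True",
    "SQL string concat": r"f['\"]SELECT|\+\s*['\"]SELECT",
}

def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, issue) pairs for every heuristic hit."""
    hits = []
    for n, line in enumerate(source.splitlines(), 1):
        for issue, pattern in PATTERNS.items():
            if re.search(pattern, line):
                hits.append((n, issue))
    return hits
```

Every hit still needs manual confirmation; the scanner only prioritizes where to look first.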

### 1. Architecture & Design

**Quick checks:**
- Can you identify the entry point in 10 seconds?
- Are there clear boundaries between layers (API, business logic, data)?
- Does any single file exceed 300 lines?

- Separation of concerns violations (e.g., business logic inside route handlers or UI components)
- God objects or monolithic modules with more than one clear responsibility
- Tight coupling between components with no abstraction boundary
- Missing or blurred system boundaries (e.g., database queries scattered across layers)
- Circular dependencies or import cycles
- No clear data flow or state management strategy
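
For reference, a before/after sketch of the first violation above (a hypothetical order handler; the names and the discount rule are invented for illustration):

```python
# Before (hypothetical): business logic buried in the route handler
def create_order_handler(request: dict) -> dict:
    total = sum(i["price"] * i["qty"] for i in request["items"])
    if total > 1000:
        total *= 0.95  # discount rule lives in the transport layer
    return {"status": 201, "total": total}

# After: the rule moves to a service function the handler merely calls,
# so it can be tested and reused without an HTTP request in sight
def compute_order_total(items: list[dict]) -> float:
    total = sum(i["price"] * i["qty"] for i in items)
    return total * 0.95 if total > 1000 else total

def create_order_handler_v2(request: dict) -> dict:
    return {"status": 201, "total": compute_order_total(request["items"])}
```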

### 2. Consistency & Maintainability

**Quick checks:**
- Are similar operations named consistently? (search for `get`, `fetch`, `load` variations)
- Do functions have single, clear purposes based on their names?
- Is duplicated logic visible? (search for repeated code blocks)

- Naming inconsistencies (e.g., `get_user` vs `fetchUser` vs `retrieveUserData` for the same operation)
- Mixed paradigms without justification (e.g., OOP and procedural code interleaved arbitrarily)
-- Copy-paste logic that should be extracted into a shared function
+- Copy-paste logic that should be extracted into a shared function (3+ repetitions = extract)
- Abstractions that obscure rather than clarify intent
- Inconsistent error handling patterns across modules
- Magic numbers or strings without constants or configuration
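
A minimal illustration of the magic-number finding (the rate-limit function and its threshold are hypothetical):

```python
# Before (hypothetical): an unexplained 100/60 buried in the logic
def is_rate_limited(requests_made: int, window_seconds: int) -> bool:
    return requests_made / window_seconds > 100 / 60

# After: the threshold is named, overridable, and documented in one place
MAX_REQUESTS_PER_MINUTE = 100

def is_rate_limited_v2(requests_made: int, window_seconds: int,
                       limit: int = MAX_REQUESTS_PER_MINUTE) -> bool:
    return requests_made / window_seconds > limit / 60
```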

### 3. Robustness & Error Handling

**Quick checks:**
- Does every external call (API, DB, file) have error handling?
- Are there any bare `except:` blocks?
- What happens if inputs are empty, null, or malformed?

- Missing input validation on entry points (HTTP handlers, CLI args, file reads)
- Bare `except` or catch-all error handlers that swallow failures silently
- Unhandled edge cases (empty collections, null/None returns, zero values)
- Code that assumes external services always succeed without fallback logic
- No retry logic for transient failures (network, rate limits)
- Missing timeouts on blocking operations (HTTP, DB, I/O)
- No validation of data from external sources before use
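
The missing-retry finding can be addressed with a small helper. A sketch using only the standard library; the exception tuple and backoff constants are illustrative defaults, and blocking calls passed in should still carry their own timeouts:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(fn: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.1,
                 retry_on: tuple = (ConnectionError, TimeoutError)) -> T:
    """Call fn, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # exhausted: surface the failure, never swallow it
            time.sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Non-transient errors propagate immediately, which keeps the helper from masking real bugs.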

### 4. Production Risks

**Quick checks:**
- Search for hardcoded URLs, IPs, or paths
- Check for logging statements (or lack thereof)
- Look for database queries in loops

- Hardcoded configuration values (URLs, credentials, timeouts, thresholds)
- Missing structured logging or observability hooks
- Unbounded loops, missing pagination, or N+1 query patterns
- Blocking or synchronous I/O in async or event-driven contexts, or thread-unsafe shared state
- No graceful shutdown or cleanup on process exit
- Missing health checks or readiness endpoints
- No rate limiting or backpressure mechanisms
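
A before/after sketch of the N+1 pattern (the `fetch_user`/`fetch_users_by_ids` callables stand in for real data-access functions and are hypothetical):

```python
# Before (hypothetical): one query per order inside the loop (N+1)
def enrich_orders_n_plus_1(orders, fetch_user):
    return [{**o, "user": fetch_user(o["user_id"])} for o in orders]

# After: collect the ids, fetch once, join in memory
def enrich_orders_batched(orders, fetch_users_by_ids):
    users = fetch_users_by_ids({o["user_id"] for o in orders})  # one round trip
    return [{**o, "user": users[o["user_id"]]} for o in orders]
```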

### 5. Security & Safety

**Quick checks:**
- Search for: `eval`, `exec`, `os.system`, `subprocess`
- Look for: `password`, `secret`, `api_key`, `token` as string literals
- Check for: `SELECT * FROM` + string concatenation
- Verify: input sanitization before DB, shell, or file operations

- Unsanitized user input passed to databases, shells, file paths, or `eval`
- Credentials, API keys, or tokens present in source code or logs
- Insecure defaults (e.g., `DEBUG=True`, permissive CORS, no rate limiting)
- Trust boundary violations (e.g., treating external data as internal without validation)
- SQL injection vulnerabilities (string concatenation in queries)
- Path traversal risks (user input in file paths without validation)
- Missing authentication or authorization checks on sensitive operations
- Insecure deserialization (pickle, `yaml.load` without `SafeLoader`)
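
For the SQL injection finding, the canonical fix is parameter binding. A sketch using the standard library's `sqlite3`; the table and column names are hypothetical:

```python
import sqlite3

def find_user(conn: sqlite3.Connection, username: str):
    # Before (vulnerable): f"SELECT id, name FROM users WHERE name = '{username}'"
    # After: placeholder binding -- the driver escapes the value, so input
    # like "x' OR '1'='1" is treated as data, never as SQL
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchone()
```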

### 6. Dead or Hallucinated Code

**Quick checks:**
- Search for function/class definitions, then check for callers
- Look for imports that seem unused
- Check if referenced libraries match requirements.txt or package.json

- Functions, classes, or modules that are defined but never called
- Imports that do not exist in the declared dependencies
- References to APIs, methods, or fields that do not exist in the used library version
- Type annotations that contradict actual usage
- Comments that describe behavior inconsistent with the code
- Unreachable code blocks (after `return`, `raise`, or `break` in all paths)
- Feature flags or conditionals that are always true/false
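
The defined-but-never-called check can be approximated with the standard library's `ast` module. A single-file heuristic sketch, not a substitute for a real dead-code tool:

```python
import ast

def unused_functions(source: str) -> set[str]:
    """Report functions that are defined but never referenced.

    Coarse single-file heuristic: decorators, re-exports, and dynamic
    dispatch will all produce false positives, so treat hits as leads.
    """
    tree = ast.parse(source)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    used |= {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}
    return defined - used
```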

### 7. Technical Debt Hotspots

**Quick checks:**
- Count function parameters (5+ = refactor candidate)
- Measure nesting depth visually (4+ = refactor candidate)
- Look for boolean flags controlling function behavior

- Logic that is correct today but will break under realistic load or scale
- Deep nesting (more than 3-4 levels) that obscures control flow
- Boolean parameter flags that change function behavior (use separate functions instead)
- Functions with more than 5-6 parameters without a configuration object
- Areas where a future requirement change would require modifying many unrelated files
- Missing type hints in dynamically typed languages for complex functions
- No documentation for public APIs or complex algorithms
- Test coverage gaps for critical paths
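
A before/after sketch of the boolean-flag finding (the `export` function and its formats are invented for illustration):

```python
import json

# Before (hypothetical): a boolean flag makes one function do two jobs
def export(data: list[dict], as_csv: bool = False) -> str:
    if as_csv:
        header = ",".join(data[0])
        rows = [",".join(str(v) for v in row.values()) for row in data]
        return "\n".join([header, *rows])
    return json.dumps(data)

# After: two functions with honest names; call sites state intent directly
def export_json(data: list[dict]) -> str:
    return json.dumps(data)

def export_csv(data: list[dict]) -> str:
    header = ",".join(data[0])
    rows = [",".join(str(v) for v in row.values()) for row in data]
    return "\n".join([header, *rows])
```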

---

@@ -105,12 +182,30 @@ Evaluate the code across all seven dimensions below. For each finding, record: t

Produce the audit report using exactly this structure. Do not omit sections. If a section has no findings, write "None identified."

**Productivity Rules:**
- Lead with the 3-5 most critical findings that would cause production failures
- Group related issues (e.g., "3 locations with hardcoded credentials" instead of listing separately)
- Provide copy-paste-ready fixes where possible (exact code snippets)
- Use severity tags consistently: `[CRITICAL]`, `[HIGH]`, `[MEDIUM]`, `[LOW]`

---

### Audit Report

**Input:** [file name(s) or "code snippet"]
**Assumptions:** [list any assumptions made about context or environment]
**Quick Stats:** [X files, Y lines of code, Z language/framework]

#### Executive Summary (Read This First)

In 3-5 bullets, state the most important findings that determine whether this code can go to production:

```
- [CRITICAL/HIGH] One-line summary of the most severe issue
- [CRITICAL/HIGH] Second most severe issue
- [MEDIUM] Notable pattern that will cause future problems
- Overall: Deployable as-is / Needs fixes / Requires major rework
```

#### Critical Issues (Must Fix Before Production)

@@ -124,6 +219,11 @@ Location: filename.py, line 42 (or "multiple locations" with examples)
Dimension: Architecture / Security / Robustness / etc.
Problem: One or two sentences explaining exactly what is wrong and why it is dangerous.
Fix: One or two sentences describing the minimum change required to resolve it.
Code Fix (if applicable):
```python
# Before: problematic code
# After: corrected version
```
```

#### High-Risk Issues

@@ -152,23 +252,37 @@ Provide a score using the rubric below, then write 2-3 sentences justifying it w
| 71-85 | Production-viable with targeted fixes. Known risks are bounded. |
| 86-100 | Production-ready. Minor improvements only. |

-Score deductions:
+**Scoring Algorithm:**

-- Each Critical issue: -10 to -20 points depending on blast radius
-- Each High issue: -5 to -10 points
-- Pervasive maintainability debt (3+ Medium issues in one dimension): -5 points
+```
+Start at 100 points
+For each CRITICAL issue: -15 points (security: -20)
+For each HIGH issue: -8 points
+For each MEDIUM issue: -3 points
+For pervasive patterns (3+ similar issues): -5 additional points
+Floor: 0, Ceiling: 100
+```
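
The scoring algorithm above translates directly into code. A sketch; representing security-related criticals as a per-finding boolean and grouping pervasive patterns by a `pattern` key are assumptions of this illustration:

```python
def audit_score(findings: list[dict]) -> int:
    """Start at 100 and deduct per finding, per the scoring algorithm."""
    score = 100
    deduction = {"CRITICAL": 15, "HIGH": 8, "MEDIUM": 3, "LOW": 0}
    counts: dict[str, int] = {}
    for f in findings:
        sev = f["severity"]
        if sev == "CRITICAL" and f.get("security"):
            score -= 20  # security criticals cost more
        else:
            score -= deduction[sev]
        key = f.get("pattern", sev)
        counts[key] = counts.get(key, 0) + 1
    # Pervasive patterns (3+ similar issues) cost 5 extra points each
    score -= 5 * sum(1 for c in counts.values() if c >= 3)
    return max(0, min(100, score))
```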

#### Refactoring Priorities

List the top 3-5 changes in order of impact. Each item must reference a specific finding from above.

```
-1. [Priority] Fix title — addresses [CRITICAL/HIGH ref] — estimated effort: S/M/L
-2. ...
+1. [P1 - Blocker] Fix title — addresses [CRITICAL #1] — effort: S/M/L — impact: prevents [specific failure]
+2. [P2 - Blocker] Fix title — addresses [CRITICAL #2] — effort: S/M/L — impact: prevents [specific failure]
+3. [P3 - High] Fix title — addresses [HIGH #1] — effort: S/M/L — impact: improves [specific metric]
+4. [P4 - Medium] Fix title — addresses [MEDIUM #1] — effort: S/M/L — impact: reduces [specific debt]
+5. [P5 - Optional] Fix title — addresses [LOW #1] — effort: S/M/L — impact: nice-to-have
```

Effort scale: S = < 1 day, M = 1-3 days, L = > 3 days.

**Quick Wins (fix in <1 hour):**
List any issues that can be resolved immediately with minimal effort:
```
- [Issue name]: [one-line fix description]
```

---

## Behavior Rules

@@ -180,6 +294,20 @@ Effort scale: S = < 1 day, M = 1-3 days, L = > 3 days.
- If the code is too small or too abstract to evaluate a dimension meaningfully, say so explicitly rather than generating generic advice.
- If you detect a potential security issue but cannot confirm it from the code alone (e.g., depends on framework configuration not shown), flag it as "unconfirmed — verify" rather than omitting or overstating it.

**Efficiency Rules:**
- Scan for critical patterns first (security, data loss, crashes) before deeper analysis
- Group similar issues by pattern rather than listing each occurrence separately
- Provide exact code fixes for critical/high issues when the solution is straightforward
- Skip dimensions that are not applicable to the code size or type (state "Not applicable: [reason]")
- Focus on issues that would cause production incidents, not theoretical concerns

**Calibration:**
- For snippets (<100 lines): Focus on security, robustness, and obvious bugs only
- For single files (100-500 lines): Add architecture and maintainability checks
- For multi-file systems (500+ lines): Full audit across all 7 dimensions
- For production code: Emphasize security, observability, and failure modes
- For prototypes: Emphasize scalability limits and technical debt

---

## Task-Specific Inputs
@@ -189,6 +317,12 @@ Before auditing, if not already provided, ask:
1. **Code or files**: Share the source code to audit. Accepted: single file, multiple files, directory listing, or snippet.
2. **Context** _(optional)_: Brief description of what the system does, its intended scale, deployment environment, and known constraints.
3. **Target environment** _(optional)_: Target runtime (e.g., production web service, CLI tool, data pipeline). Used to calibrate risk severity.
4. **Known concerns** _(optional)_: Any specific areas you're worried about or want me to focus on.

**If context is missing, assume:**
- Language/framework is evident from the code
- Deployment target is production web service (most common)
- Scale expectations are moderate (100-1000 users) unless code suggests otherwise

---

@@ -197,3 +331,5 @@ Before auditing, if not already provided, ask:
- **schema-markup**: For adding structured data after code is production-ready.
- **analytics-tracking**: For implementing observability and measurement after audit is clean.
- **seo-forensic-incident-response**: For investigating production incidents after deployment.
- **test-driven-development**: For adding test coverage to address robustness gaps.
- **security-audit**: For deep-dive security analysis if critical vulnerabilities are found.