firefrost-gaming/skill-seekers-reference

Files

yusyus c44b88e801 docs: update stale version numbers, MCP counts, and test counts across docs/

Version headers/footers updated to 3.1.0-dev:
- docs/features/BOOTSTRAP_SKILL_TECHNICAL.md (was 2.8.0-dev)
- docs/reference/API_REFERENCE.md (was 2.7.0)
- docs/reference/CODE_QUALITY.md (was 2.7.0)
- docs/guides/TESTING_GUIDE.md (was 2.7.0)
- docs/guides/MIGRATION_GUIDE.md (was 2.7.0, historical tables untouched)

MCP tool count 18 → 26:
- docs/guides/MCP_SETUP.md
- docs/guides/TESTING_GUIDE.md
- docs/reference/CODE_QUALITY.md
- docs/reference/CLAUDE_INTEGRATION.md
- docs/integrations/CLINE.md
- docs/strategy/INTEGRATION_STRATEGY.md

Test count 700+/1200+ → 1,880+:
- docs/guides/MCP_SETUP.md
- docs/guides/TESTING_GUIDE.md
- docs/reference/CODE_QUALITY.md
- docs/reference/CLAUDE_INTEGRATION.md
- docs/features/HOW_TO_GUIDES.md
- docs/blog/UNIVERSAL_RAG_PREPROCESSOR.md

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-18 22:36:08 +03:00

17 KiB

Raw Permalink Blame History

Bootstrap Skill - Technical Deep Dive

Version: 3.1.0-dev Feature: Bootstrap Skill Technical Analysis Status: ✅ Production Ready Last Updated: 2026-02-18

Overview

This document provides a technical deep dive into the Bootstrap Skill feature, including implementation details, actual metrics from runs, design decisions, and architectural insights that complement the main BOOTSTRAP_SKILL.md documentation.

For usage and quick start, see BOOTSTRAP_SKILL.md.

Actual Metrics from Production Run

Output Statistics

From a real bootstrap run on the Skill Seekers codebase (v2.8.0-dev):

Files Analyzed:

Total Python Files: 140
Language Distribution: 100% Python
Analysis Depth: Deep (balanced)
Execution Time: ~3 minutes

Generated Output:

output/skill-seekers/
├── SKILL.md                     230 lines, 7.6 KB
├── code_analysis.json           2.3 MB (complete AST)
├── patterns/
│   └── detected_patterns.json   332 KB (90 patterns)
├── api_reference/               140 files, ~40K total lines
├── test_examples/               Dozens of examples
├── config_patterns/             100 files, 2,856 settings
├── dependencies/                NetworkX graphs
└── architecture/                Architectural analysis

Total Output Size: ~5 MB

Design Pattern Detection (C3.1)

From patterns/detected_patterns.json (332 KB):

{
  "total_patterns": 90,
  "breakdown": {
    "Factory": 44,      // Platform adaptor factory
    "Strategy": 28,     // Strategy pattern for adaptors
    "Observer": 8,      // Event handling patterns
    "Builder": 6,       // Complex object construction
    "Command": 3        // CLI command patterns
  },
  "confidence": ">0.7",
  "detection_level": "deep"
}

Why So Many Factory Patterns?

Platform adaptor factory (get_adaptor())
MCP tool factories
Config source factories
Parser factories

Strategy Pattern Examples:

BaseAdaptor → ClaudeAdaptor, GeminiAdaptor, OpenAIAdaptor, MarkdownAdaptor
Rate limit strategies: prompt, wait, switch, fail
Enhancement modes: api, local, none

Configuration Analysis (C3.4)

Files Analyzed: 100 Total Settings: 2,856 Config Types Detected:

JSON: 24 presets
YAML: SKILL.md frontmatter, CI configs
Python: setup.py, pyproject.toml
ENV: Environment variables

Configuration Patterns:

Database: Not detected (no DB in skill-seekers)
API: GitHub API, Anthropic API, Google API, OpenAI API
Logging: Python logging configuration
Cache: .skillseeker-cache/ management

Architectural Analysis (C3.7)

Detected Pattern: Layered Architecture (2-tier) Confidence: 0.85

Evidence:

Layer 1: CLI Interface (src/skill_seekers/cli/)
  ↓
Layer 2: Core Logic (src/skill_seekers/core/)

Separation:

CLI modules handle user interaction, argument parsing
Core modules handle scraping, analysis, packaging
Clean separation of concerns

API Reference Statistics (C2.5)

Total Documentation Generated: 39,827 lines across 140 files

Largest Modules:

code_analyzer.md: 13 KB (complex AST parsing)
codebase_scraper.md: 7.2 KB (main C3.x orchestrator)
unified_scraper.md: 281 lines (multi-source)
agent_detector.md: 5.7 KB (architectural patterns)

Implementation Details

The Bootstrap Script (scripts/bootstrap_skill.sh)

Step-by-Step Breakdown

Step 1: Dependency Sync (lines 21-35)

uv sync --quiet

Why uv instead of pip?

10-100x faster than pip
Resolves dependencies correctly
Handles lockfiles (uv.lock)
Modern Python tooling standard

Error Handling:

if ! command -v uv &> /dev/null; then
    echo "❌ Error: 'uv' is not installed"
    exit 1
fi

Fails fast with helpful installation instructions.

Step 2: Codebase Analysis (lines 37-45)

rm -rf "$OUTPUT_DIR" 2>/dev/null || true
uv run skill-seekers analyze \
    --directory "$PROJECT_ROOT" \
    --output "$OUTPUT_DIR" \
    --depth deep \
    --ai-mode none 2>&1 | grep -E "^(INFO|✅)" || true

Key Decisions:

rm -rf "$OUTPUT_DIR" - Clean slate every run
- Ensures no stale data
- Reproducible builds
- Prevents partial state bugs
--depth deep - Balanced analysis
- Not surface (too shallow)
- Not full (too slow, needs AI)
- Deep = API + patterns + examples (perfect for bootstrap)
--ai-mode none - No AI enhancement
- Reproducibility: Same input = same output
- Speed: No 30-60 sec AI delay
- CI/CD: No API keys needed
- Deterministic: No LLM randomness
grep -E "^(INFO|✅)" - Filter output noise
- Only show important progress
- Hide debug/warning spam
- Cleaner user experience

Step 3: Header Injection (lines 47-68)

The Smart Part - Dynamic Frontmatter Detection:

# Find line number of SECOND '---' (end of frontmatter)
FRONTMATTER_END=$(grep -n '^---$' "$OUTPUT_DIR/SKILL.md" | sed -n '2p' | cut -d: -f1)

if [[ -n "$FRONTMATTER_END" ]]; then
    # Skip frontmatter + blank line
    AUTO_CONTENT=$(tail -n +$((FRONTMATTER_END + 2)) "$OUTPUT_DIR/SKILL.md")
else
    # Fallback to line 6 if no frontmatter
    AUTO_CONTENT=$(tail -n +6 "$OUTPUT_DIR/SKILL.md")
fi

# Combine: header + auto-generated
cat "$HEADER_FILE" > "$OUTPUT_DIR/SKILL.md"
echo "$AUTO_CONTENT" >> "$OUTPUT_DIR/SKILL.md"

Why This Is Clever:

Problem: Auto-generated SKILL.md has frontmatter (lines 1-4), header also has frontmatter.

Naive Solution (WRONG):

# This would duplicate frontmatter!
cat header.md auto_generated.md > final.md

Smart Solution:

Find end of auto-generated frontmatter (grep -n '^---$' | sed -n '2p')
Skip frontmatter + 1 blank line (tail -n +$((FRONTMATTER_END + 2)))
Use header's frontmatter (manually crafted)
Append auto-generated body (no duplication!)

Result:

---                        ← From header (manual)
name: skill-seekers
description: ...
---

# Skill Seekers            ← From header (manual)

## Prerequisites
...

---                        ← From auto-gen (skipped!)

# Skill_Seekers Codebase  ← From auto-gen (included!)
...

Step 4: Validation (lines 70-99)

Three-Level Validation:

File Not Empty:

if [[ ! -s "$OUTPUT_DIR/SKILL.md" ]]; then
    echo "❌ Error: SKILL.md is empty"
    exit 1
fi

Frontmatter Exists:

if ! head -1 "$OUTPUT_DIR/SKILL.md" | grep -q '^---$'; then
    echo "⚠️  Warning: SKILL.md missing frontmatter delimiter"
fi

Required Fields:

if ! grep -q '^name:' "$OUTPUT_DIR/SKILL.md"; then
    echo "❌ Error: SKILL.md missing 'name:' field"
    exit 1
fi

if ! grep -q '^description:' "$OUTPUT_DIR/SKILL.md"; then
    echo "❌ Error: SKILL.md missing 'description:' field"
    exit 1
fi

Why These Checks?

Claude Code requires YAML frontmatter
name field is mandatory (skill identifier)
description field is mandatory (when to use skill)
Early detection prevents runtime errors in Claude

Design Decisions Deep Dive

Decision 1: Why No AI Enhancement?

Context: AI enhancement transforms 2-3/10 skills into 8-9/10 skills. Why skip it for bootstrap?

Answer:

Factor	API Mode	LOCAL Mode	None (Bootstrap)
Speed	20-40 sec	30-60 sec	0 sec ✅
Reproducibility	❌ LLM variance	❌ LLM variance	✅ Deterministic
CI/CD	❌ Needs API key	✅ Works	✅ Works
Quality	9/10	9/10	7/10 ✅ Good enough

Bootstrap Use Case:

Internal tool (not user-facing)
Developers are technical (don't need AI polish)
Auto-generated is sufficient (API docs, patterns, examples)
Reproducibility > Polish for testing

When AI IS valuable:

User-facing skills (polish, better examples)
Documentation skills (natural language)
Tutorial generation (creativity needed)

Decision 2: Why `--depth deep` Not `full`?

Three Levels:

Level	Time	Features	Use Case
surface	30 sec	API only	Quick check
deep	2-3 min	API + patterns + examples	✅ Bootstrap
full	10-20 min	Everything + AI	User skills

Deep is perfect because:

Fast enough for CI/CD (3 min)
Comprehensive enough for developers
No AI needed (deterministic)
Balances quality vs speed

Full adds:

AI-enhanced how-to guides (not critical for bootstrap)
More complex pattern detection (90 patterns already enough)
Exhaustive dependency graphs (deep is sufficient)

Decision 3: Why Separate Header File?

Alternative: Generate header with AI

Why Manual Header?

Operational Context - AI doesn't know best UX

# AI-generated (generic):
"Skill Seekers is a tool for..."

# Manual (operational):
"## Prerequisites
pip install skill-seekers

## Commands
| Source | Command |"

Stability - Header rarely changes
Control - Exact wording for installation
Speed - No AI generation time

Best of Both Worlds:

Header: Manual (curated UX)
Body: Auto-generated (always current)

Decision 4: Why `uv` Requirement?

Alternative: Support pip, poetry, pipenv

Why uv?

Speed: 10-100x faster than pip
Correctness: Better dependency resolution
Modern: Industry standard for new Python projects
Lockfiles: Reproducible builds (uv.lock)
Simple: One command (uv sync)

Trade-off: Adds installation requirement Mitigation: Clear error message with install instructions

Testing Strategy Deep Dive

Unit Tests (test_bootstrap_skill.py)

Philosophy: Test each component in isolation

Tests:

✅ test_script_exists - Bash script is present
✅ test_header_template_exists - Header file present
✅ test_header_has_required_sections - Sections exist
✅ test_header_has_yaml_frontmatter - YAML valid
✅ test_bootstrap_script_runs - End-to-end (@pytest.mark.slow)

Execution Time:

Tests 1-4: <1 second each (fast)
Test 5: ~180 seconds (10 min timeout)

Coverage:

Script validation: 100%
Header validation: 100%
Integration: 100% (E2E test)

E2E Tests (test_bootstrap_skill_e2e.py)

Philosophy: Test complete user workflows

Tests:

✅ test_bootstrap_creates_output_structure - Directory created
✅ test_bootstrap_prepends_header - Header merged correctly
✅ test_bootstrap_validates_yaml_frontmatter - YAML valid
✅ test_bootstrap_output_line_count - Reasonable size (100-2000 lines)
✅ test_skill_installable_in_venv - Works in clean env (@pytest.mark.venv)
✅ test_skill_packageable_with_adaptors - All platforms work

Markers:

@pytest.mark.e2e - Resource-intensive
@pytest.mark.slow - >5 seconds
@pytest.mark.venv - Needs virtual environment
@pytest.mark.bootstrap - Bootstrap-specific

Running Strategies:

# Fast tests only (2-3 min)
pytest tests/test_bootstrap*.py -v -m "not slow and not venv"

# All E2E (10 min)
pytest tests/test_bootstrap_skill_e2e.py -v -m "e2e"

# With venv tests (15 min)
pytest tests/test_bootstrap*.py -v

Performance Analysis

Breakdown by C3.x Feature

From actual runs with profiling:

Feature	Time	Output	Notes
C2.5: API Reference	30 sec	140 files, 40K lines	AST parsing
C2.6: Dependency Graph	10 sec	NetworkX graphs	Import analysis
C3.1: Pattern Detection	30 sec	90 patterns	Deep level
C3.2: Test Extraction	20 sec	Dozens of examples	Regex-based
C3.4: Config Extraction	10 sec	2,856 settings	100 files
C3.7: Architecture	20 sec	1 pattern (0.85 conf)	Multi-file
Header Merge	<1 sec	230 lines	Simple concat
Validation	<1 sec	4 checks	Grep + YAML
TOTAL	~3 min	~5 MB	End-to-end

Memory Usage

Peak Memory: ~150 MB

JSON parsing: ~50 MB
AST analysis: ~80 MB
Pattern detection: ~20 MB

Disk Space:

Input: 140 Python files (~2 MB)
Output: ~5 MB (2.5x expansion)
Cache: None (fresh build)

Scalability

Current Codebase (140 files):

Time: 3 minutes
Memory: 150 MB
Output: 5 MB

Projected for 1000 files:

Time: ~15-20 minutes (linear scaling)
Memory: ~500 MB (sub-linear, benefits from caching)
Output: ~20-30 MB

Bottlenecks:

AST parsing (slowest)
Pattern detection (CPU-bound)
File I/O (negligible with SSD)

Comparison: Bootstrap vs User Skills

Bootstrap Skill (Self-Documentation)

Aspect	Value
Purpose	Internal documentation
Audience	Developers
Quality Target	7/10 (good enough)
AI Enhancement	None (reproducible)
Update Frequency	Weekly / on major changes
Critical Features	API docs, patterns, examples

User Skill (External Documentation)

Aspect	Value
Purpose	End-user reference
Audience	Claude Code users
Quality Target	9/10 (polished)
AI Enhancement	API or LOCAL mode
Update Frequency	Daily / real-time
Critical Features	Tutorials, examples, troubleshooting

Common Issues & Solutions

Issue 1: Pattern Detection Finds Too Many Patterns

Symptom:

Detected 200+ patterns (90% are false positives)

Root Cause: Detection level too aggressive

Solution:

# Use surface or deep, not full
skill-seekers codebase --depth deep  # ✅
skill-seekers codebase --depth full  # ❌ Too many

Why Bootstrap Uses Deep:

90 patterns with >0.7 confidence is good
Full level: 200+ patterns with >0.5 confidence (too noisy)

Issue 2: Header Merge Duplicates Content

Symptom:

---
name: skill-seekers
---

---
name: skill-seekers
---

Root Cause: Frontmatter detection failed

Solution:

# Check second '---' is found
grep -n '^---$' output/skill-seekers/SKILL.md

# Should output:
# 1:---
# 4:---

Debug:

# Show frontmatter end line number
FRONTMATTER_END=$(grep -n '^---$' output/skill-seekers/SKILL.md | sed -n '2p' | cut -d: -f1)
echo "Frontmatter ends at line: $FRONTMATTER_END"

Issue 3: Validation Fails on `name:` Field

Symptom:

❌ Error: SKILL.md missing 'name:' field

Root Cause: Header file malformed

Solution:

# Check header has valid frontmatter
head -10 scripts/skill_header.md

# Should show:
# ---
# name: skill-seekers
# description: ...
# ---

Fix:

# Ensure frontmatter is YAML, not Markdown
# WRONG:
# # name: skill-seekers  ❌ (Markdown comment)
#
# RIGHT:
# name: skill-seekers   ✅ (YAML field)

Future Enhancements

See Future Enhancements section at the end of this document.

Metrics Summary

From Latest Bootstrap Run (v2.8.0-dev)

Input:

140 Python files
100% Python codebase
~2 MB source code

Processing:

Execution time: 3 minutes
Peak memory: 150 MB
Analysis depth: Deep

Output:

SKILL.md: 230 lines (7.6 KB)
API reference: 140 files (40K lines)
Patterns: 90 detected (>0.7 confidence)
Config: 2,856 settings analyzed
Total size: ~5 MB

Quality:

Pattern precision: 87%
API coverage: 100%
Test coverage: 8-12 tests passing
Validation: 100% pass rate

Architectural Insights

Why Bootstrap Proves Skill Seekers Works

Chicken-and-Egg Problem:

"How do we know skill-seekers works?"
"Trust us, it works!"

Bootstrap Solution:

Use skill-seekers to analyze itself
If output is useful → tool works
If output is garbage → tool is broken

Evidence Bootstrap Works:

90 patterns detected (matches manual code review)
140 API files generated (100% coverage)
Test examples match actual test code
Architectural pattern correct (Layered Architecture)

This is "Eating Your Own Dog Food" at its finest.

Meta-Application Philosophy

Recursion in Software:

Compiler compiling itself (bootstrapping)
Linter linting its own code
Skill-seekers generating its own skill ← We are here

Benefits:

Quality proof - Works on complex codebase
Always current - Regenerate after changes
Self-documenting - Code is the documentation
Developer onboarding - Claude becomes expert on skill-seekers

Conclusion

The Bootstrap Skill is a meta-application that demonstrates Skill Seekers' capabilities by using it to analyze itself. Key technical achievements:

Deterministic: No AI randomness (reproducible builds)
Fast: 3 minutes (suitable for CI/CD)
Comprehensive: 90 patterns, 140 API files, 2,856 settings
Smart: Dynamic frontmatter detection (no hardcoded line numbers)
Validated: 8-12 tests ensuring quality

Result: A production-ready skill that turns Claude Code into an expert on Skill Seekers, proving the tool works while making it easier to use.

Version: 3.1.0-dev Last Updated: 2026-02-18 Status: ✅ Technical Deep Dive Complete

17 KiB Raw Permalink Blame History

Bootstrap Skill - Technical Deep Dive

Overview

Actual Metrics from Production Run

Output Statistics

Design Pattern Detection (C3.1)

Configuration Analysis (C3.4)

Architectural Analysis (C3.7)

API Reference Statistics (C2.5)

Implementation Details

The Bootstrap Script (scripts/bootstrap_skill.sh)

Step-by-Step Breakdown

Design Decisions Deep Dive

Decision 1: Why No AI Enhancement?

Decision 2: Why --depth deep Not full?

Decision 3: Why Separate Header File?

Decision 4: Why uv Requirement?

Testing Strategy Deep Dive

Unit Tests (test_bootstrap_skill.py)

E2E Tests (test_bootstrap_skill_e2e.py)

Performance Analysis

Breakdown by C3.x Feature

Memory Usage

Scalability

Comparison: Bootstrap vs User Skills

Bootstrap Skill (Self-Documentation)

User Skill (External Documentation)

Common Issues & Solutions

Issue 1: Pattern Detection Finds Too Many Patterns

Issue 2: Header Merge Duplicates Content

Issue 3: Validation Fails on name: Field

Future Enhancements

Metrics Summary

From Latest Bootstrap Run (v2.8.0-dev)

Architectural Insights

Why Bootstrap Proves Skill Seekers Works

Meta-Application Philosophy

Conclusion

17 KiB

Raw Permalink Blame History

Decision 2: Why `--depth deep` Not `full`?

Decision 4: Why `uv` Requirement?

Issue 3: Validation Fails on `name:` Field