Merge branch 'development' into feature/router-quality-improvements
Integrated multi-source support from the development branch into the feature
branch's C3.x auto-cloning and cache system. This merge combines TWO major features:
FEATURE BRANCH (C3.x + Cache):
- Automatic GitHub repository cloning for C3.x analysis
- Hidden .skillseeker-cache/ directory for intermediate files
- Cache reuse for faster rebuilds
- AI skill quality improvements
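The clone-then-reuse behaviour above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `ensure_cached_clone`, the `clone` hook, and the per-repo directory naming are all assumptions; only the `.skillseeker-cache/` directory name comes from this message.

```python
import subprocess
from pathlib import Path


def ensure_cached_clone(repo, cache_root=".skillseeker-cache", clone=None):
    """Return (path, cache_hit) for a repo such as "encode/httpx".

    Hypothetical helper: clones only when the cache directory is missing.
    """
    # Assumption: one cache subdirectory per repo, with "/" flattened to "_".
    target = Path(cache_root) / repo.replace("/", "_")
    if target.exists():
        # Cache reuse: skip the clone on rebuilds.
        return target, True

    def git_clone(url, dest):
        # Shallow clone through the git CLI; raises if git exits non-zero.
        subprocess.run(["git", "clone", "--depth", "1", url, str(dest)], check=True)

    target.parent.mkdir(parents=True, exist_ok=True)
    (clone or git_clone)(f"https://github.com/{repo}.git", target)
    return target, False
```

The `clone` parameter exists only so the sketch can be exercised without network access.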
DEVELOPMENT BRANCH (Multi-Source):
- Support multiple sources of the same type (multiple GitHub repos, PDFs)
- List-based data storage with source indexing
- New configs: claude-code.json, medusa-mercurjs.json
- llms.txt downloader/parser enhancements
- New tests: test_markdown_parsing.py, test_multi_source.py
CONFLICT RESOLUTIONS:
1. configs/claude-code.json (COMPROMISE):
- Kept file with _migration_note (preserves PR #244 work)
- Feature branch had deleted it (config migration)
- Development branch enhanced it (47 Claude Code doc URLs)
2. src/skill_seekers/cli/unified_scraper.py (INTEGRATED):
Applied 8 changes for multi-source support, including:
- List-based storage: {'github': [], 'documentation': [], 'pdf': []}
- Source indexing with _source_counters
- Unique naming: {name}_github_{idx}_{repo_id}
- Unique data files: github_data_{idx}_{repo_id}.json
- List append instead of dict assignment
- Updated _clone_github_repo(repo_name, idx=0) signature
- Applied same logic to _scrape_pdf()
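As a rough illustration of the list-based storage and indexing described above, here is a minimal sketch. The storage dict shape, `_source_counters`, and the two naming patterns come from this message; the `ScrapedData` class, the `add_github` method, the entry keys, and the way `repo_id` is derived are assumptions made for the example.

```python
from collections import defaultdict


class ScrapedData:
    """Hypothetical container showing per-type lists plus source indexing."""

    def __init__(self, name):
        self.name = name
        # List-based storage: one list per source type instead of one dict entry.
        self.scraped_data = {"github": [], "documentation": [], "pdf": []}
        # Per-type counters that hand out the next source index.
        self._source_counters = defaultdict(int)

    def add_github(self, repo, data):
        idx = self._source_counters["github"]
        self._source_counters["github"] += 1
        # Assumption: repo_id is the repo path with "/" replaced by "_".
        repo_id = repo.replace("/", "_")
        entry = {
            "idx": idx,
            "repo_id": repo_id,
            "skill_name": f"{self.name}_github_{idx}_{repo_id}",
            "data_file": f"github_data_{idx}_{repo_id}.json",
            "data": data,
        }
        # Append to the list rather than overwriting a single dict key.
        self.scraped_data["github"].append(entry)
        return entry
```

With a single source the first entry gets idx=0, which is why single-source configs keep their previous behaviour.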
3. src/skill_seekers/cli/unified_skill_builder.py (INTEGRATED):
Applied 3 changes for multi-source synthesis:
- _load_source_skill_mds(): Glob pattern for multiple sources
- _generate_references(): Iterate through github_list
- _generate_c3_analysis_references(repo_id): Per-repo C3.x references
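The glob-based loading could look something like the sketch below. The directory layout (`<cache_dir>/<name>_github_<idx>_<repo_id>/SKILL.md`) and the function signature are assumptions for illustration; the actual `_load_source_skill_mds()` may differ.

```python
from pathlib import Path


def load_source_skill_mds(cache_dir, name):
    """Return {source_dir_name: SKILL.md text} for every GitHub source found."""
    skill_mds = {}
    # Assumed layout: one directory per indexed source, each holding a SKILL.md.
    pattern = f"{name}_github_*/SKILL.md"
    # sorted() keeps the result order stable across runs.
    for skill_md in sorted(Path(cache_dir).glob(pattern)):
        skill_mds[skill_md.parent.name] = skill_md.read_text(encoding="utf-8")
    return skill_mds
```

A pattern keyed on the source type picks up any number of repos without the builder needing to know the indices in advance.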
TESTING STRATEGY:
Backward Compatibility:
- Single source configs work exactly as before (idx=0)
New Capabilities:
- Multiple GitHub repos: encode/httpx + facebook/react
- Multiple PDFs with unique indexing
- Mixed sources: docs + multiple GitHub repos
Pipeline Integrity:
- Scraper: Multi-source data collection with indexing
- Builder: Loads all source SKILL.md files
- Synthesis: Merges multiple sources with separators
- C3.x: Independent analysis per repo in unique subdirectories
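The synthesis step (merging multiple sources with separators) might be sketched like this. The function name `merge_skill_mds`, the `## Source:` heading, and the `---` separator are illustrative assumptions; this message does not specify the real separator format.

```python
def merge_skill_mds(skill_mds, separator="\n\n---\n\n"):
    """Join per-source SKILL.md texts into one document.

    skill_mds maps a source name to its SKILL.md text.
    """
    parts = []
    # Sort by source name so the merged output is deterministic.
    for source_name in sorted(skill_mds):
        body = skill_mds[source_name].strip()
        parts.append(f"## Source: {source_name}\n\n{body}")
    return separator.join(parts)
```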
Result: Support MULTIPLE sources per type + C3.x analysis + cache system
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>