Files
skill-seekers-reference/configs
yusyus 9d26ca5d0a Merge branch 'development' into feature/router-quality-improvements
Integrated multi-source support from development branch into feature branch's
C3.x auto-cloning and cache system. This merge combines TWO major features:

FEATURE BRANCH (C3.x + Cache):
- Automatic GitHub repository cloning for C3.x analysis
- Hidden .skillseeker-cache/ directory for intermediate files
- Cache reuse for faster rebuilds
- Enhanced AI skill quality improvements

DEVELOPMENT BRANCH (Multi-Source):
- Support multiple sources of same type (multiple GitHub repos, PDFs)
- List-based data storage with source indexing
- New configs: claude-code.json, medusa-mercurjs.json
- llms.txt downloader/parser enhancements
- New tests: test_markdown_parsing.py, test_multi_source.py

CONFLICT RESOLUTIONS:

1. configs/claude-code.json (COMPROMISE):
   - Kept file with _migration_note (preserves PR #244 work)
   - Feature branch had deleted it (config migration)
   - Development branch enhanced it (47 Claude Code doc URLs)

2. src/skill_seekers/cli/unified_scraper.py (INTEGRATED):
   Applied 8 changes for multi-source support:
   - List-based storage: {'github': [], 'documentation': [], 'pdf': []}
   - Source indexing with _source_counters
   - Unique naming: {name}_github_{idx}_{repo_id}
   - Unique data files: github_data_{idx}_{repo_id}.json
   - List append instead of dict assignment
   - Updated _clone_github_repo(repo_name, idx=0) signature
   - Applied same logic to _scrape_pdf()

3. src/skill_seekers/cli/unified_skill_builder.py (INTEGRATED):
   Applied 3 changes for multi-source synthesis:
   - _load_source_skill_mds(): Glob pattern for multiple sources
   - _generate_references(): Iterate through github_list
   - _generate_c3_analysis_references(repo_id): Per-repo C3.x references

TESTING STRATEGY:

Backward Compatibility:
- Single source configs work exactly as before (idx=0)

New Capabilities:
- Multiple GitHub repos: encode/httpx + facebook/react
- Multiple PDFs with unique indexing
- Mixed sources: docs + multiple GitHub repos

Pipeline Integrity:
- Scraper: Multi-source data collection with indexing
- Builder: Loads all source SKILL.md files
- Synthesis: Merges multiple sources with separators
- C3.x: Independent analysis per repo in unique subdirectories

Result: Support MULTIPLE sources per type + C3.x analysis + cache system

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-12 00:11:31 +03:00
..