Created main orchestrator that coordinates the entire workflow.

Architecture:
- UnifiedScraper class orchestrates all phases
- Routes to the appropriate scraper based on source type
- Supports any combination of sources

4-Phase Workflow:
1. Scrape all sources (docs, GitHub, PDF)
2. Detect conflicts (if multiple API sources)
3. Merge intelligently (rule-based or Claude-enhanced)
4. Build unified skill (placeholder for Phase 7)

Features:
✅ Validates unified config on startup
✅ Backward compatible with legacy configs
✅ Source-specific routing (documentation/github/pdf)
✅ Automatic conflict detection when needed
✅ Merge mode selection (rule-based/claude-enhanced)
✅ Creates organized output structure
✅ Comprehensive logging for each phase
✅ Error handling and graceful failures

CLI Usage:
- python3 cli/unified_scraper.py --config configs/godot_unified.json
- python3 cli/unified_scraper.py -c configs/react_unified.json -m claude-enhanced

Output Structure:
- output/{name}/ - Final skill directory
- output/{name}_unified_data/ - Intermediate data files
  * documentation_data.json
  * github_data.json
  * conflicts.json
  * merged_data.json

Next: Phase 7 - Skill builder to generate final SKILL.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
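The 4-phase flow above can be sketched as follows. Only the `UnifiedScraper` class name, the phase order, and the merge modes come from this commit; every method name, the config shape, and the stub bodies are illustrative assumptions, not the actual implementation.

```python
# Hypothetical sketch of the 4-phase orchestrator; method names and config
# keys are assumptions for illustration only.
from pathlib import Path


class UnifiedScraper:
    def __init__(self, config: dict, merge_mode: str = "rule-based"):
        self.config = config
        self.merge_mode = merge_mode  # "rule-based" or "claude-enhanced"
        # Intermediate data lands in output/{name}_unified_data/
        self.data_dir = Path("output") / f"{config['name']}_unified_data"

    def run(self) -> dict:
        # Phase 1: scrape every configured source (documentation/github/pdf)
        data = {src: self._scrape(src) for src in self.config["sources"]}
        # Phase 2: detect conflicts only when multiple API sources exist
        conflicts = self._detect_conflicts(data) if len(data) > 1 else []
        # Phase 3: merge (rule-based shown; claude-enhanced would branch here)
        merged = self._merge(data, conflicts)
        # Phase 4: skill building is a placeholder until Phase 7
        return merged

    def _scrape(self, source: str) -> dict:
        # Stub: real code routes to a source-specific scraper
        return {"source": source, "items": []}

    def _detect_conflicts(self, data: dict) -> list:
        # Stub: real code compares overlapping API entries across sources
        return []

    def _merge(self, data: dict, conflicts: list) -> dict:
        return {"sources": sorted(data), "conflicts": conflicts}


scraper = UnifiedScraper({"name": "godot", "sources": ["documentation", "github"]})
result = scraper.run()
```

With two sources configured, `run()` exercises conflict detection before merging; with one, Phase 2 is skipped entirely, matching the "automatic conflict detection when needed" behavior.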