Created main orchestrator that coordinates the entire workflow.

Architecture:
- UnifiedScraper class orchestrates all phases
- Routes to the appropriate scraper based on source type
- Supports any combination of sources

4-Phase Workflow:
1. Scrape all sources (docs, GitHub, PDF)
2. Detect conflicts (if multiple API sources)
3. Merge intelligently (rule-based or Claude-enhanced)
4. Build unified skill (placeholder for Phase 7)

Features:
✅ Validates unified config on startup
✅ Backward compatible with legacy configs
✅ Source-specific routing (documentation/github/pdf)
✅ Automatic conflict detection when needed
✅ Merge mode selection (rule-based/claude-enhanced)
✅ Creates organized output structure
✅ Comprehensive logging for each phase
✅ Error handling and graceful failures

CLI Usage:
- python3 cli/unified_scraper.py --config configs/godot_unified.json
- python3 cli/unified_scraper.py -c configs/react_unified.json -m claude-enhanced

Output Structure:
- output/{name}/ - Final skill directory
- output/{name}_unified_data/ - Intermediate data files
  * documentation_data.json
  * github_data.json
  * conflicts.json
  * merged_data.json

Next: Phase 7 - Skill builder to generate final SKILL.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
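The 4-phase flow above can be sketched as follows. Only the `UnifiedScraper` class name, the phase order, and the merge modes come from this commit; every method name, the config shape, and the stub bodies are illustrative assumptions, not the actual implementation.

```python
# Hypothetical sketch of the 4-phase orchestrator; method names and config
# keys are assumptions for illustration only.
from pathlib import Path


class UnifiedScraper:
    def __init__(self, config: dict, merge_mode: str = "rule-based"):
        self.config = config
        self.merge_mode = merge_mode  # "rule-based" or "claude-enhanced"
        # Intermediate data lands in output/{name}_unified_data/
        self.data_dir = Path("output") / f"{config['name']}_unified_data"

    def run(self) -> dict:
        # Phase 1: scrape every configured source (documentation/github/pdf)
        data = {src: self._scrape(src) for src in self.config["sources"]}
        # Phase 2: detect conflicts only when multiple API sources exist
        conflicts = self._detect_conflicts(data) if len(data) > 1 else []
        # Phase 3: merge (rule-based shown; claude-enhanced would branch here)
        merged = self._merge(data, conflicts)
        # Phase 4: skill building is a placeholder until Phase 7
        return merged

    def _scrape(self, source: str) -> dict:
        # Stub: real code routes to a source-specific scraper
        return {"source": source, "items": []}

    def _detect_conflicts(self, data: dict) -> list:
        # Stub: real code compares overlapping API entries across sources
        return []

    def _merge(self, data: dict, conflicts: list) -> dict:
        return {"sources": sorted(data), "conflicts": conflicts}


scraper = UnifiedScraper({"name": "godot", "sources": ["documentation", "github"]})
result = scraper.run()
```

With two sources configured, `run()` exercises conflict detection before merging; with one, Phase 2 is skipped entirely, matching the "automatic conflict detection when needed" behavior.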