yusyus
f03f4cf569
feat: Phase 6 - Unified scraper orchestrator
Created the main orchestrator that coordinates the entire workflow:
Architecture:
- UnifiedScraper class orchestrates all phases
- Routes to appropriate scraper based on source type
- Supports any combination of sources
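The routing described above can be sketched as a dispatch table keyed by source type. This is a minimal illustration only: the function names, the `SCRAPERS` table, and the returned dict shapes are hypothetical stand-ins, not the actual UnifiedScraper code.

```python
from typing import Any, Callable, Dict, List

Source = Dict[str, Any]

# Hypothetical stand-ins for the real scrapers; each returns a small dict.
def scrape_documentation(source: Source) -> Dict[str, Any]:
    return {"type": "documentation", "pages": []}

def scrape_github(source: Source) -> Dict[str, Any]:
    return {"type": "github", "files": []}

def scrape_pdf(source: Source) -> Dict[str, Any]:
    return {"type": "pdf", "pages": []}

# Source type -> scraper callable (the three types named in this commit).
SCRAPERS: Dict[str, Callable[[Source], Dict[str, Any]]] = {
    "documentation": scrape_documentation,
    "github": scrape_github,
    "pdf": scrape_pdf,
}

def route_sources(sources: List[Source]) -> List[Dict[str, Any]]:
    """Run the matching scraper for each source, in config order."""
    results = []
    for source in sources:
        kind = source.get("type")
        scraper = SCRAPERS.get(kind)
        if scraper is None:
            raise ValueError(f"Unknown source type: {kind!r}")
        results.append(scraper(source))
    return results
```

Because the table is the only place that knows about source types, any combination of sources works without extra branching.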
4-Phase Workflow:
1. Scrape all sources (docs, GitHub, PDF)
2. Detect conflicts (if multiple API sources)
3. Merge intelligently (rule-based or Claude-enhanced)
4. Build unified skill (placeholder for Phase 7)
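Phases 2 and 3 of the workflow above can be sketched roughly as follows. The data shape (source name mapped to API name and signature), the "first source wins" merge rule, and all function names are assumptions made for illustration; the real conflict detector and merger may work differently.

```python
from typing import Dict, List, Tuple

# Assumed shape: source name -> {API name: signature string}
ScrapedData = Dict[str, Dict[str, str]]

def detect_conflicts(scraped: ScrapedData) -> List[dict]:
    """Report APIs whose signature differs from the first source that defined them."""
    first_seen: Dict[str, Tuple[str, str]] = {}
    conflicts = []
    for source_name, apis in scraped.items():
        for api_name, signature in apis.items():
            if api_name not in first_seen:
                first_seen[api_name] = (source_name, signature)
            elif first_seen[api_name][1] != signature:
                first_source, first_signature = first_seen[api_name]
                conflicts.append({
                    "api": api_name,
                    first_source: first_signature,
                    source_name: signature,
                })
    return conflicts

def merge_rule_based(scraped: ScrapedData) -> Dict[str, str]:
    """Hypothetical rule: the first source to define an API wins."""
    merged: Dict[str, str] = {}
    for apis in scraped.values():
        for api_name, signature in apis.items():
            merged.setdefault(api_name, signature)
    return merged

def run_phases_2_and_3(scraped: ScrapedData, merge_mode: str = "rule-based") -> dict:
    # Phase 2 only runs when more than one source could disagree.
    conflicts = detect_conflicts(scraped) if len(scraped) > 1 else []
    if merge_mode != "rule-based":
        # The claude-enhanced mode is out of scope for this sketch.
        raise NotImplementedError(f"merge mode not sketched: {merge_mode}")
    return {"conflicts": conflicts, "merged": merge_rule_based(scraped)}
```

Phase 1 (scraping) is taken as input here and Phase 4 is omitted, since this commit leaves it as a placeholder.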
Features:
✅ Validates unified config on startup
✅ Backward compatible with legacy configs
✅ Source-specific routing (documentation/github/pdf)
✅ Automatic conflict detection when needed
✅ Merge mode selection (rule-based/claude-enhanced)
✅ Creates organized output structure
✅ Comprehensive logging for each phase
✅ Error handling and graceful failures
CLI Usage:
- python3 cli/unified_scraper.py --config configs/godot_unified.json
- python3 cli/unified_scraper.py -c configs/react_unified.json -m claude-enhanced
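A parser that accepts the two invocations above could look like this sketch. Only `--config`, `-c`, and `-m` appear in this commit; the long form `--merge-mode` and the `rule-based` default are assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="unified_scraper.py",
        description="Scrape, detect conflicts, and merge multiple sources.",
    )
    parser.add_argument(
        "-c", "--config", required=True,
        help="Path to a unified config JSON file",
    )
    # "--merge-mode" and the default value are assumed, not confirmed.
    parser.add_argument(
        "-m", "--merge-mode",
        choices=["rule-based", "claude-enhanced"],
        default="rule-based",
        help="How conflicting API data is merged",
    )
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"config={args.config} merge_mode={args.merge_mode}")
```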
Output Structure:
- output/{name}/ - Final skill directory
- output/{name}_unified_data/ - Intermediate data files
  * documentation_data.json
  * github_data.json
  * conflicts.json
  * merged_data.json
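The layout above can be derived from the config name with `pathlib`. The helper name `output_paths`, its `root` parameter, and the dict keys are hypothetical; only the directory and file names come from this commit.

```python
from pathlib import Path
from typing import Dict

def output_paths(name: str, root: str = "output") -> Dict[str, Path]:
    """Build the final skill directory and intermediate data file paths."""
    base = Path(root)
    data_dir = base / f"{name}_unified_data"
    return {
        "skill_dir": base / name,
        "data_dir": data_dir,
        "documentation": data_dir / "documentation_data.json",
        "github": data_dir / "github_data.json",
        "conflicts": data_dir / "conflicts.json",
        "merged": data_dir / "merged_data.json",
    }

paths = output_paths("godot")
print(paths["merged"].as_posix())
```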
Next: Phase 7 - Skill builder to generate final SKILL.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 15:32:23 +03:00