yusyus
bddb57f5ef
Add large documentation handling (40K+ pages support)
Implement comprehensive system for handling very large documentation sites
with intelligent splitting strategies and router/hub architecture.
**New CLI Tools:**
- cli/split_config.py: Split large configs into focused sub-skills
* Strategies: auto, category, router, size
* Configurable target pages per skill (default: 5000)
* Dry-run mode for preview
- cli/generate_router.py: Create intelligent router/hub skills
* Auto-generates routing logic based on keywords
* Creates SKILL.md with topic-to-skill mapping
* Infers router name from sub-skills
- cli/package_multi.py: Batch package multiple skills
* Package router + all sub-skills in one command
* Progress tracking for each skill
**MCP Integration:**
- Added split_config tool (8 total MCP tools now)
- Added generate_router tool
- Supports 40K+ page documentation via MCP
**Configuration:**
- New split_strategy parameter in configs
- split_config section for fine-tuned control
- checkpoint section for resume capability (ready for Phase 4)
- Example: configs/godot-large-example.json
**Documentation:**
- docs/LARGE_DOCUMENTATION.md (500+ lines)
* Complete guide for 10K+ page documentation
* All splitting strategies explained
* Detailed workflows with examples
* Best practices and troubleshooting
* Real-world examples (AWS, Microsoft, Godot)
**Features:**
✅ Handle 40K+ page documentation efficiently
✅ Parallel scraping support (5x-10x faster)
✅ Router + sub-skills architecture
✅ Intelligent keyword-based routing
✅ Multiple splitting strategies
✅ Full MCP integration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 20:48:03 +03:00
..
2025-10-19 19:44:27 +03:00
2025-10-19 02:24:56 +03:00
2025-10-19 02:24:56 +03:00
2025-10-19 20:48:03 +03:00
2025-10-19 02:24:56 +03:00
2025-10-19 09:28:44 +02:00
2025-10-19 19:44:27 +03:00
2025-10-19 02:24:56 +03:00
2025-10-17 15:14:44 +00:00
2025-10-19 19:44:27 +03:00
2025-10-19 19:44:27 +03:00
2025-10-19 02:24:56 +03:00