Completes the unified scraping system implementation:

**Phase 7: Unified Skill Builder**
- cli/unified_skill_builder.py: Generates final skill structure
- Inline conflict warnings (⚠️) in API reference
- Side-by-side docs vs code comparison
- Severity-based conflict grouping
- Separate conflicts.md report

**Phase 8: MCP Integration**
- skill_seeker_mcp/server.py: Auto-detects unified vs legacy configs
- Routes to unified_scraper.py or doc_scraper.py automatically
- Supports merge_mode parameter override
- Maintains full backward compatibility

**Phase 9: Example Unified Configs**
- configs/react_unified.json: React docs + GitHub
- configs/django_unified.json: Django docs + GitHub
- configs/fastapi_unified.json: FastAPI docs + GitHub
- configs/fastapi_unified_test.json: Test config with limited pages

**Phase 10: Comprehensive Tests**
- cli/test_unified_simple.py: Integration tests (all passing)
- Tests unified config validation
- Tests backward compatibility
- Tests mixed source types
- Tests error handling

**Phase 11: Documentation**
- docs/UNIFIED_SCRAPING.md: Complete guide (1000+ lines)
- Examples, best practices, troubleshooting
- Architecture diagrams and data flow
- Command reference

**Additional:**
- demo_conflicts.py: Interactive conflict detection demo
- TEST_RESULTS.md: Complete test results and findings
- cli/unified_scraper.py: Fixed doc_scraper integration (subprocess)

**Features:**
✅ Multi-source scraping (docs + GitHub + PDF)
✅ Conflict detection (4 types, 3 severity levels)
✅ Rule-based merging (fast, deterministic)
✅ Claude-enhanced merging (AI-powered)
✅ Transparent conflict reporting
✅ MCP auto-detection
✅ Backward compatibility

**Test Results:**
- 6/6 integration tests passed
- 4 unified configs validated
- 3 legacy configs backward compatible
- 5 conflicts detected in test data
- All documentation complete

🤖 Generated with Claude Code
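The Phase 8 routing could look roughly like the sketch below. It is not the code in skill_seeker_mcp/server.py: the helper names `pick_scraper` and `effective_merge_mode` are hypothetical, and it assumes that a unified config is recognized by a top-level `sources` list while a legacy config describes a single docs site at the top level. The fallback merge mode of `rule-based` is also an assumption.

```python
from typing import Any, Optional


def pick_scraper(config: dict[str, Any]) -> str:
    """Return the script a config would be routed to (hypothetical helper)."""
    # Assumption: only unified configs carry a top-level "sources" list.
    if isinstance(config.get("sources"), list):
        return "unified_scraper.py"
    # Anything else is treated as a legacy single-site config.
    return "doc_scraper.py"


def effective_merge_mode(config: dict[str, Any], override: Optional[str] = None) -> str:
    """Prefer an explicit override, then the config value, then an assumed default."""
    if override is not None:
        return override
    return config.get("merge_mode", "rule-based")


unified = {
    "name": "fastapi",
    "merge_mode": "rule-based",
    "sources": [{"type": "github", "repo": "tiangolo/fastapi"}],
}
legacy = {"name": "fastapi", "base_url": "https://fastapi.tiangolo.com/"}

print(pick_scraper(unified))
print(pick_scraper(legacy))
print(effective_merge_mode(unified, override="claude-enhanced"))
```

Keying the decision on the shape of the config rather than on its filename means existing legacy configs keep working without any changes, which is what the backward compatibility bullet asks for.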
46 lines
1.3 KiB
JSON
{
  "name": "fastapi",
  "description": "Complete FastAPI knowledge combining official documentation and FastAPI codebase. Use when building FastAPI applications, understanding async patterns, or working with Pydantic models.",
  "merge_mode": "rule-based",
  "sources": [
    {
      "type": "documentation",
      "base_url": "https://fastapi.tiangolo.com/",
      "extract_api": true,
      "selectors": {
        "main_content": "article",
        "title": "h1",
        "code_blocks": "pre code"
      },
      "url_patterns": {
        "include": [],
        "exclude": ["/img/", "/js/"]
      },
      "categories": {
        "getting_started": ["tutorial", "first-steps"],
        "path_operations": ["path-params", "query-params", "body"],
        "dependencies": ["dependencies"],
        "security": ["security", "oauth2"],
        "database": ["sql-databases"],
        "advanced": ["advanced", "async", "middleware"],
        "deployment": ["deployment"]
      },
      "rate_limit": 0.5,
      "max_pages": 150
    },
    {
      "type": "github",
      "repo": "tiangolo/fastapi",
      "include_issues": true,
      "max_issues": 100,
      "include_changelog": true,
      "include_releases": true,
      "include_code": true,
      "code_analysis_depth": "surface",
      "file_patterns": [
        "fastapi/**/*.py"
      ]
    }
  ]
}
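A config of this shape can be sanity-checked before scraping. The sketch below is only an illustration and not the project's validator: `validate_unified_config` is a hypothetical name, the `pdf` source type and the `claude-enhanced` merge mode string are guesses based on the feature list in the commit message, and the required keys per source type are inferred from the two sources shown above.

```python
from typing import Any

# Assumption: "pdf" is inferred from the commit message, not from this config.
KNOWN_SOURCE_TYPES = {"documentation", "github", "pdf"}
# Assumption: "claude-enhanced" is a guessed spelling; only "rule-based" appears above.
KNOWN_MERGE_MODES = {"rule-based", "claude-enhanced"}
# Key each source type needs in order to be usable (inferred from the config above).
REQUIRED_KEY_BY_TYPE = {"documentation": "base_url", "github": "repo"}


def validate_unified_config(config: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []
    if not config.get("name"):
        errors.append("missing 'name'")
    mode = config.get("merge_mode", "rule-based")
    if mode not in KNOWN_MERGE_MODES:
        errors.append(f"unknown merge_mode {mode!r}")
    sources = config.get("sources")
    if not isinstance(sources, list) or not sources:
        errors.append("'sources' must be a non-empty list")
        return errors
    for i, source in enumerate(sources):
        source_type = source.get("type")
        if source_type not in KNOWN_SOURCE_TYPES:
            errors.append(f"sources[{i}]: unknown type {source_type!r}")
            continue
        required = REQUIRED_KEY_BY_TYPE.get(source_type)
        if required and not source.get(required):
            errors.append(f"sources[{i}]: missing {required!r}")
    return errors


good = {
    "name": "fastapi",
    "merge_mode": "rule-based",
    "sources": [
        {"type": "documentation", "base_url": "https://fastapi.tiangolo.com/"},
        {"type": "github", "repo": "tiangolo/fastapi"},
    ],
}
bad = {"name": "fastapi", "merge_mode": "fast", "sources": [{"type": "github"}]}

print(validate_unified_config(good))
print(validate_unified_config(bad))
```

Collecting every problem into a list, rather than raising on the first one, lets a caller report all config mistakes in a single pass.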