Major documentation update explaining the new unified scraping system that combines documentation + GitHub + PDF sources in a single skill with automatic conflict detection.
## Changes:
**README.md:**
- Update version badge to v2.0.0
- Add "Unified Multi-Source Scraping" to Key Features section
- Add comprehensive Option 5 section showing:
  - Problem statement (documentation drift)
  - Solution with code example
  - Conflict detection types and severity levels
  - Transparent reporting with side-by-side comparison
  - List of advantages (identifies gaps, catches changes, single source of truth)
  - Available unified configs
  - Link to full guide (docs/UNIFIED_SCRAPING.md)
**CLAUDE.md:**
- Update Current Status to v2.0.0
- Add "Major Release: Unified Multi-Source Scraping" in Recent Updates
- Update configs count from 11/11 to 15/15 (added 4 unified configs)
- Add new "Unified Multi-Source Scraping" section under Core Commands
  - Include command examples and feature highlights
  - Explain what makes unified scraping special
**QUICKSTART.md:**
- Add Option D: Unified Multi-Source to Step 2
- Add unified configs to Available Presets section
- Show react_unified, django_unified, fastapi_unified, godot_unified examples
## Value:
This documentation update explains how unified scraping helps developers:
- Mix documentation + code in one skill
- Automatically detect conflicts (missing_in_docs, missing_in_code, signature_mismatch)
- Get transparent side-by-side comparisons with ⚠️ warnings
- Identify documentation gaps and outdated docs
- Create a single source of truth combining both sources
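The three conflict types listed above can be illustrated with a small classifier. This is only a minimal sketch, assuming each source has already been reduced to a mapping from API name to signature string; the function name `detect_conflicts` and the input shape are hypothetical and are not taken from the unified scraper itself.

```python
from typing import Dict, List


def detect_conflicts(docs: Dict[str, str], code: Dict[str, str]) -> List[dict]:
    """Compare documented signatures against signatures found in code.

    Both arguments map an API name to its signature string. This input
    shape is an assumption made for illustration only.
    """
    conflicts = []
    for name in sorted(set(docs) | set(code)):
        if name not in docs:
            # Present in the code but never documented.
            conflicts.append({"type": "missing_in_docs", "name": name})
        elif name not in code:
            # Documented but absent from the code (possibly removed or renamed).
            conflicts.append({"type": "missing_in_code", "name": name})
        elif docs[name] != code[name]:
            # Both sources know the API but disagree on its signature.
            conflicts.append({
                "type": "signature_mismatch",
                "name": name,
                "docs": docs[name],
                "code": code[name],
            })
    return conflicts
```

Keeping both signatures on a `signature_mismatch` entry is what would make a side-by-side comparison possible in a report.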
Related to: Phase 7-11 unified scraper implementation (commit 5d8c7e3)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Quick Start Guide
🚀 4 Steps to Create a Skill
Step 1: Install Dependencies
pip3 install requests beautifulsoup4
Note: Skill_Seekers automatically checks for an llms.txt file first; when one is available, scraping is 10x faster.
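The llms.txt check can be pictured roughly as follows. This is a sketch under assumptions, not the tool's actual code: it assumes the file sits at the site root, and the names `http_get` and `find_llms_txt` are hypothetical.

```python
from typing import Callable, Optional
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def http_get(url: str) -> Optional[str]:
    """Return the body at `url`, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (URLError, ValueError, OSError):
        return None


def find_llms_txt(
    base_url: str,
    fetch: Callable[[str], Optional[str]] = http_get,
) -> Optional[str]:
    """Return the contents of <site root>/llms.txt, or None if it is missing."""
    # A leading slash makes urljoin resolve against the site root,
    # whatever path the docs URL itself points at.
    candidate = urljoin(base_url, "/llms.txt")
    return fetch(candidate)
```

Passing the fetcher in as a parameter keeps the lookup testable without network access.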
Step 2: Run the Tool
Option A: Use a Preset (Easiest)
python3 cli/doc_scraper.py --config configs/godot.json
Option B: Interactive Mode
python3 cli/doc_scraper.py --interactive
Option C: Quick Command
python3 cli/doc_scraper.py --name react --url https://react.dev/
Option D: Unified Multi-Source (NEW - v2.0.0)
# Combine documentation + GitHub code in one skill
python3 cli/unified_scraper.py --config configs/react_unified.json
Detects conflicts between docs and code automatically!
Step 3: Enhance SKILL.md (Recommended)
# LOCAL enhancement (no API key, uses Claude Code Max)
python3 cli/enhance_skill_local.py output/godot/
This takes 60 seconds and dramatically improves the SKILL.md quality!
Step 4: Package the Skill
python3 cli/package_skill.py output/godot/
Done! You now have godot.zip ready to use.
📋 Available Presets
# Godot Engine
python3 cli/doc_scraper.py --config configs/godot.json
# React
python3 cli/doc_scraper.py --config configs/react.json
# Vue.js
python3 cli/doc_scraper.py --config configs/vue.json
# Django
python3 cli/doc_scraper.py --config configs/django.json
# FastAPI
python3 cli/doc_scraper.py --config configs/fastapi.json
# Unified Multi-Source (NEW!)
python3 cli/unified_scraper.py --config configs/react_unified.json
python3 cli/unified_scraper.py --config configs/django_unified.json
python3 cli/unified_scraper.py --config configs/fastapi_unified.json
python3 cli/unified_scraper.py --config configs/godot_unified.json
⚡ Using Existing Data (Fast!)
If you have already scraped once:
python3 cli/doc_scraper.py --config configs/godot.json
# When prompted:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
# Builds in seconds!
Or use --skip-scrape:
python3 cli/doc_scraper.py --config configs/godot.json --skip-scrape
🎯 Complete Example (Recommended Workflow)
# 1. Install (once)
pip3 install requests beautifulsoup4
# 2. Scrape React docs with LOCAL enhancement
python3 cli/doc_scraper.py --config configs/react.json --enhance-local
# Wait 15-30 minutes (scraping) + 60 seconds (enhancement)
# 3. Package
python3 cli/package_skill.py output/react/
# 4. Use react.zip in Claude!
Alternative: Enhancement after scraping
# 2a. Scrape only (no enhancement)
python3 cli/doc_scraper.py --config configs/react.json
# 2b. Enhance later
python3 cli/enhance_skill_local.py output/react/
# 3. Package
python3 cli/package_skill.py output/react/
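For repeated runs, the scrape, enhance, and package steps above could be chained from one Python script. The sketch below only assembles and runs the commands already shown in this guide; the helper names `workflow_commands` and `run_workflow` are hypothetical, and it assumes the script is started from the repository root.

```python
import subprocess
from typing import List


def workflow_commands(name: str, enhance: bool = True) -> List[List[str]]:
    """Build the scrape -> enhance -> package commands for one preset."""
    commands = [["python3", "cli/doc_scraper.py", "--config", f"configs/{name}.json"]]
    if enhance:
        commands.append(["python3", "cli/enhance_skill_local.py", f"output/{name}/"])
    commands.append(["python3", "cli/package_skill.py", f"output/{name}/"])
    return commands


def run_workflow(name: str, enhance: bool = True) -> None:
    """Run each step in order; check=True stops at the first failing step."""
    for command in workflow_commands(name, enhance):
        subprocess.run(command, check=True)
```

Calling `run_workflow("react")` would run the three steps in sequence, and `run_workflow("react", enhance=False)` would skip the enhancement step.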
💡 Pro Tips
Test with a Small Page Limit First
Edit the config file to cap the crawl at 20 pages (JSON does not allow comments, so the file holds only the setting):
{
  "max_pages": 20
}
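The same change can be scripted so the original preset stays untouched. This is a minimal sketch that assumes the config is plain JSON with a top-level `max_pages` key, as in the snippet above; `make_test_config` and the target file name are hypothetical.

```python
import json
from pathlib import Path


def make_test_config(preset: str, target: str, max_pages: int = 20) -> dict:
    """Copy a preset config to `target` with a reduced `max_pages` value."""
    config = json.loads(Path(preset).read_text(encoding="utf-8"))
    config["max_pages"] = max_pages
    # Write a separate file so the preset itself is not modified.
    Path(target).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `make_test_config("configs/react.json", "configs/react_test.json")` would produce a 20-page variant that can be passed to `--config`.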
Rebuild Instantly
# After first scrape, you can rebuild instantly:
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape
Create Custom Config
# Copy a preset
cp configs/react.json configs/myframework.json
# Edit it
nano configs/myframework.json
# Use it
python3 cli/doc_scraper.py --config configs/myframework.json
📁 What You Get
output/
├── godot_data/ # Raw scraped data (reusable!)
└── godot/ # The skill
├── SKILL.md # With real code examples!
└── references/ # Organized docs
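Before packaging, a quick check that the skill directory matches the layout above can catch an incomplete build. The sketch below checks only the two entries shown in the tree; `missing_skill_parts` is a hypothetical helper, not part of the CLI.

```python
from pathlib import Path
from typing import List


def missing_skill_parts(skill_dir: str) -> List[str]:
    """Return the expected entries that are absent from a built skill directory."""
    root = Path(skill_dir)
    missing = []
    if not (root / "SKILL.md").is_file():
        missing.append("SKILL.md")
    if not (root / "references").is_dir():
        missing.append("references/")
    return missing
```

An empty list from `missing_skill_parts("output/godot")` would mean both entries are in place.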
❓ Need Help?
See README.md for:
- Complete documentation
- Config file structure
- Troubleshooting
- Advanced usage
🎮 Let's Go!
# Godot
python3 cli/doc_scraper.py --config configs/godot.json
# Or interactive
python3 cli/doc_scraper.py --interactive
That's it! 🚀