firefrost-gaming/skill-seekers-reference


yusyus 1e277f80d2 Update documentation for unified multi-source scraping (v2.0.0)

Major documentation update explaining the new unified scraping system that combines documentation + GitHub + PDF sources in a single skill with automatic conflict detection.

## Changes:

**README.md:**
- Update version badge to v2.0.0
- Add "Unified Multi-Source Scraping" to Key Features section
- Add comprehensive Option 5 section showing:
  - Problem statement (documentation drift)
  - Solution with code example
  - Conflict detection types and severity levels
  - Transparent reporting with side-by-side comparison
  - List of advantages (identifies gaps, catches changes, single source of truth)
  - Available unified configs
  - Link to full guide (docs/UNIFIED_SCRAPING.md)

**CLAUDE.md:**
- Update Current Status to v2.0.0
- Add "Major Release: Unified Multi-Source Scraping" in Recent Updates
- Update configs count from 11/11 to 15/15 (added 4 unified configs)
- Add new "Unified Multi-Source Scraping" section under Core Commands
- Include command examples and feature highlights
- Explain what makes unified scraping special

**QUICKSTART.md:**
- Add Option D: Unified Multi-Source to Step 2
- Add unified configs to Available Presets section
- Show react_unified, django_unified, fastapi_unified, godot_unified examples

## Value:
This documentation update explains how unified scraping helps developers:
- Mix documentation + code in one skill
- Automatically detect conflicts (missing_in_docs, missing_in_code, signature_mismatch)
- Get transparent side-by-side comparisons with ⚠️ warnings
- Identify documentation gaps and outdated docs
- Create a single source of truth combining both sources

Related to: Phase 7-11 unified scraper implementation (commit 5d8c7e3)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-26 16:41:58 +03:00


Quick Start Guide

🚀 4 Steps to Create a Skill

Step 1: Install Dependencies

pip3 install requests beautifulsoup4

Note: Skill_Seekers checks for an llms.txt file first; when the site provides one, scraping is about 10x faster.
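A minimal sketch of what such a check could look like, assuming the file is served at the site root as `/llms.txt`. The helper names `llms_txt_url` and `has_llms_txt` are hypothetical and are not taken from the Skill_Seekers source; the status lookup is passed in as a function so the example runs without network access.

```python
from typing import Callable
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(base_url: str) -> str:
    """Return the root-level llms.txt URL for the site hosting base_url."""
    parts = urlsplit(base_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def has_llms_txt(base_url: str, get_status: Callable[[str], int]) -> bool:
    """True when the llms.txt URL answers with HTTP 200."""
    return get_status(llms_txt_url(base_url)) == 200


if __name__ == "__main__":
    # Stand-in for a real HTTP call such as requests.head(url).status_code.
    # The status table below is made up for the demo.
    fake_statuses = {"https://react.dev/llms.txt": 200}
    found = has_llms_txt("https://react.dev/learn", lambda u: fake_statuses.get(u, 404))
    print("llms.txt available:", found)
```

In real use, `get_status` could wrap `requests.head`, which the install step above already provides.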

Step 2: Run the Tool

Option A: Use a Preset (Easiest)

python3 cli/doc_scraper.py --config configs/godot.json

Option B: Interactive Mode

python3 cli/doc_scraper.py --interactive

Option C: Quick Command

python3 cli/doc_scraper.py --name react --url https://react.dev/

Option D: Unified Multi-Source (NEW - v2.0.0)

# Combine documentation + GitHub code in one skill
python3 cli/unified_scraper.py --config configs/react_unified.json

The unified scraper automatically detects conflicts between the docs and the code (missing_in_docs, missing_in_code, signature_mismatch).
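To make the conflict types concrete, here is a small sketch of how documented and implemented APIs could be compared. Only the three type names (`missing_in_docs`, `missing_in_code`, `signature_mismatch`) come from the release notes; the function `detect_conflicts`, the input shape (name mapped to signature string), and the sample signatures are assumptions for illustration, not the scraper's actual implementation.

```python
from __future__ import annotations


def detect_conflicts(docs: dict[str, str], code: dict[str, str]) -> list[dict]:
    """Compare API signatures found in docs against those found in code."""
    conflicts = []
    for name in sorted(set(docs) | set(code)):
        if name not in docs:
            # Implemented but never documented.
            conflicts.append({"name": name, "type": "missing_in_docs"})
        elif name not in code:
            # Documented but not present in the code base.
            conflicts.append({"name": name, "type": "missing_in_code"})
        elif docs[name] != code[name]:
            conflicts.append({
                "name": name,
                "type": "signature_mismatch",
                "docs": docs[name],
                "code": code[name],
            })
    return conflicts


if __name__ == "__main__":
    # Hypothetical signatures, invented for this demo.
    documented = {"load": "load(path)", "save": "save(path)"}
    implemented = {"load": "load(path, strict=False)", "reset": "reset()"}
    for conflict in detect_conflicts(documented, implemented):
        print(conflict["type"], conflict["name"])
```

Comparing raw signature strings is the simplest possible check; a real comparison would likely normalize whitespace and defaults first.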

Step 3: Enhance SKILL.md (Recommended)

# LOCAL enhancement (no API key, uses Claude Code Max)
python3 cli/enhance_skill_local.py output/godot/

This takes about 60 seconds and noticeably improves the quality of SKILL.md.

Step 4: Package the Skill

python3 cli/package_skill.py output/godot/

Done! You now have godot.zip ready to use.
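For readers curious what packaging amounts to, the sketch below zips a skill directory with Python's standard `zipfile` module. It is an illustration only: `package_dir` is a hypothetical helper, writing the archive next to the skill directory is an assumption, and `references/index.md` is a placeholder file name. The behavior of `cli/package_skill.py` itself may differ.

```python
import tempfile
import zipfile
from pathlib import Path


def package_dir(skill_dir: Path) -> Path:
    """Zip every file under skill_dir into <skill_dir>.zip next to it."""
    zip_path = skill_dir.parent / (skill_dir.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(skill_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the skill root, with forward slashes.
                zf.write(path, path.relative_to(skill_dir).as_posix())
    return zip_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny stand-in skill so the demo is self-contained.
        skill = Path(tmp) / "godot"
        (skill / "references").mkdir(parents=True)
        (skill / "SKILL.md").write_text("# Godot\n", encoding="utf-8")
        (skill / "references" / "index.md").write_text("placeholder\n", encoding="utf-8")
        archive = package_dir(skill)
        with zipfile.ZipFile(archive) as zf:
            print(archive.name, sorted(zf.namelist()))
```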

📋 Available Presets

# Godot Engine
python3 cli/doc_scraper.py --config configs/godot.json

# React
python3 cli/doc_scraper.py --config configs/react.json

# Vue.js
python3 cli/doc_scraper.py --config configs/vue.json

# Django
python3 cli/doc_scraper.py --config configs/django.json

# FastAPI
python3 cli/doc_scraper.py --config configs/fastapi.json

# Unified Multi-Source (NEW!)
python3 cli/unified_scraper.py --config configs/react_unified.json
python3 cli/unified_scraper.py --config configs/django_unified.json
python3 cli/unified_scraper.py --config configs/fastapi_unified.json
python3 cli/unified_scraper.py --config configs/godot_unified.json

⚡ Using Existing Data (Fast!)

If you already scraped once:

python3 cli/doc_scraper.py --config configs/godot.json

# When prompted:
✓ Found existing data: 245 pages
Use existing data? (y/n): y

# Builds in seconds!

Or use --skip-scrape:

python3 cli/doc_scraper.py --config configs/godot.json --skip-scrape
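The reuse decision described above can be sketched as follows. This assumes cached pages are stored as one JSON file per page under `output/<name>_data/`, which this guide does not confirm; `count_cached_pages` and `needs_scrape` are hypothetical helpers, not functions from `doc_scraper.py`.

```python
import tempfile
from pathlib import Path


def count_cached_pages(data_dir: Path) -> int:
    """Count cached page files; returns 0 when the directory is missing."""
    if not data_dir.is_dir():
        return 0
    return sum(1 for p in data_dir.rglob("*.json") if p.is_file())


def needs_scrape(data_dir: Path, skip_scrape: bool) -> bool:
    """Scrape unless the caller asked to skip and cached pages exist."""
    return not (skip_scrape and count_cached_pages(data_dir) > 0)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "godot_data"
        print("before cache:", needs_scrape(data, skip_scrape=True))
        data.mkdir()
        (data / "page_0.json").write_text("{}", encoding="utf-8")
        print("after cache:", needs_scrape(data, skip_scrape=True))
```

Falling back to a scrape when the cache is empty is a design choice of this sketch; the actual tool may instead report an error.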

🎯 Complete Example (Recommended Workflow)

# 1. Install (once)
pip3 install requests beautifulsoup4

# 2. Scrape React docs with LOCAL enhancement
python3 cli/doc_scraper.py --config configs/react.json --enhance-local
# Wait 15-30 minutes (scraping) + 60 seconds (enhancement)

# 3. Package
python3 cli/package_skill.py output/react/

# 4. Use react.zip in Claude!

Alternative: Enhancement after scraping

# 2a. Scrape only (no enhancement)
python3 cli/doc_scraper.py --config configs/react.json

# 2b. Enhance later
python3 cli/enhance_skill_local.py output/react/

# 3. Package
python3 cli/package_skill.py output/react/
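If you want to script either workflow, one option is to build the command lines in Python and hand them to `subprocess.run`. The sketch below only constructs and prints the commands; the script paths and flags are the ones shown in this guide, while `workflow_commands` itself is a hypothetical helper.

```python
import shlex


def workflow_commands(name: str, enhance_inline: bool = True) -> list:
    """Return the argv lists for scrape, enhance, and package, in order."""
    scrape = ["python3", "cli/doc_scraper.py", "--config", f"configs/{name}.json"]
    commands = []
    if enhance_inline:
        # Recommended workflow: enhancement runs as part of the scrape.
        commands.append(scrape + ["--enhance-local"])
    else:
        # Alternative workflow: scrape first, enhance as a separate step.
        commands.append(scrape)
        commands.append(["python3", "cli/enhance_skill_local.py", f"output/{name}/"])
    commands.append(["python3", "cli/package_skill.py", f"output/{name}/"])
    return commands


if __name__ == "__main__":
    for argv in workflow_commands("react", enhance_inline=False):
        print(shlex.join(argv))
```

To execute the steps rather than print them, pass each list to `subprocess.run(argv, check=True)` from the repository root.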

💡 Pro Tips

Test with a Small Page Limit First

Edit the config file to cap the crawl at 20 pages. JSON does not allow comments, so set the value on its own:

{
  "max_pages": 20
}
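Rather than editing a preset in place, you could write a capped copy and point the scraper at it. The sketch below assumes only that the config is a JSON object with a top-level `max_pages` key, as shown above; `write_test_config` is a hypothetical helper, and the stub preset with its value of 500 is made up for the demo.

```python
import json
import tempfile
from pathlib import Path


def write_test_config(preset: Path, target: Path, max_pages: int = 20) -> dict:
    """Copy a preset config to target with max_pages overridden."""
    config = json.loads(preset.read_text(encoding="utf-8"))
    config["max_pages"] = max_pages
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stub preset; a real one would live in configs/ and hold more keys.
        preset = Path(tmp) / "react.json"
        preset.write_text(json.dumps({"max_pages": 500}), encoding="utf-8")
        target = Path(tmp) / "react_test.json"
        write_test_config(preset, target)
        print(target.read_text(encoding="utf-8"))
```

All other keys in the preset are carried over unchanged, so the test run differs from the full run only in page count.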

Rebuild Instantly

# After first scrape, you can rebuild instantly:
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape

Create Custom Config

# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 cli/doc_scraper.py --config configs/myframework.json

📁 What You Get

output/
├── godot_data/          # Raw scraped data (reusable!)
└── godot/               # The skill
    ├── SKILL.md        # With real code examples!
    └── references/     # Organized docs
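A quick way to confirm that a run produced this layout is to check for each expected path. The sketch below takes the paths directly from the tree above; `missing_skill_parts` is a hypothetical helper, not part of the tool.

```python
import tempfile
from pathlib import Path


def missing_skill_parts(output_root: Path, name: str) -> list:
    """Return the expected paths that are absent under output_root."""
    expected = {
        f"{name}_data/": (output_root / f"{name}_data").is_dir(),
        f"{name}/SKILL.md": (output_root / name / "SKILL.md").is_file(),
        f"{name}/references/": (output_root / name / "references").is_dir(),
    }
    return [path for path, present in expected.items() if not present]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Simulate a run that scraped data but has not built the skill yet.
        (root / "godot_data").mkdir()
        print("missing:", missing_skill_parts(root, "godot"))
```

Against a real run you would call `missing_skill_parts(Path("output"), "godot")` from the repository root; an empty list means everything is in place.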

❓ Need Help?

See README.md for:

- Complete documentation
- Config file structure
- Troubleshooting
- Advanced usage

🎮 Let's Go!

# Godot
python3 cli/doc_scraper.py --config configs/godot.json

# Or interactive
python3 cli/doc_scraper.py --interactive

That's it! 🚀
