# Skill Seeker

[License: MIT](https://opensource.org/licenses/MIT) | [Python](https://www.python.org/downloads/) | [Model Context Protocol](https://modelcontextprotocol.io) | [Tests](tests/)

**Automatically convert any documentation website into a Claude AI skill in minutes.**

## What is Skill Seeker?

Skill Seeker is an automated tool that transforms any documentation website into a production-ready [Claude AI skill](https://claude.ai). Instead of manually reading and summarizing documentation, Skill Seeker:

1. **Scrapes** documentation websites automatically
2. **Organizes** content into categorized reference files
3. **Enhances** with AI to extract best examples and key concepts
4. **Packages** everything into an uploadable `.zip` file for Claude

**Result:** Get comprehensive Claude skills for any framework, API, or tool in 20-40 minutes instead of hours of manual work.

## Why Use This?

- 🎯 **For Developers**: Quickly create Claude skills for your favorite frameworks (React, Vue, Django, etc.)
- 🎮 **For Game Devs**: Generate skills for game engines (Godot, Unity, etc.)
- 🔧 **For Teams**: Create internal documentation skills for your company's APIs
- 📚 **For Learners**: Build comprehensive reference skills for technologies you're learning

## Key Features

- ✅ **Universal Scraper** - Works with ANY documentation website
- ✅ **AI-Powered Enhancement** - Transforms basic templates into comprehensive guides
- ✅ **MCP Server for Claude Code** - Use directly from Claude Code with natural language
- ✅ **Large Documentation Support** - Handle 10K-40K+ page docs with intelligent splitting
- ✅ **Router/Hub Skills** - Intelligent routing to specialized sub-skills
- ✅ **8 Ready-to-Use Presets** - Godot, React, Vue, Django, FastAPI, and more
- ✅ **Smart Categorization** - Automatically organizes content by topic
- ✅ **Code Language Detection** - Recognizes Python, JavaScript, C++, GDScript, etc.
- ✅ **No API Costs** - FREE local enhancement using Claude Code Max
- ✅ **Checkpoint/Resume** - Never lose progress on long scrapes
- ✅ **Parallel Scraping** - Process multiple skills simultaneously
- ✅ **Caching System** - Scrape once, rebuild instantly
- ✅ **Fully Tested** - 96 tests with 100% pass rate

## Quick Example

### Option 1: Use from Claude Code (Recommended)

```bash
# One-time setup (5 minutes)
./setup_mcp.sh

# Then in Claude Code, just ask:
# "Generate a React skill from https://react.dev/"
```

**Time:** Automated | **Quality:** Production-ready | **Cost:** Free

### Option 2: Use CLI Directly

```bash
# Install dependencies (2 pip packages)
pip3 install requests beautifulsoup4

# Generate a React skill in one command
python3 cli/doc_scraper.py --config configs/react.json --enhance-local

# Upload output/react.zip to Claude - Done!
```

**Time:** ~25 minutes | **Quality:** Production-ready | **Cost:** Free

## How It Works

```mermaid
graph LR
    A[Documentation Website] --> B[Skill Seeker]
    B --> C[Scraper]
    B --> D[AI Enhancement]
    B --> E[Packager]
    C --> F[Organized References]
    D --> F
    F --> E
    E --> G[Claude Skill .zip]
    G --> H[Upload to Claude AI]
```

1. **Scrape**: Extracts all pages from documentation
2. **Categorize**: Organizes content into topics (API, guides, tutorials, etc.)
3. **Enhance**: AI analyzes docs and creates comprehensive SKILL.md with examples
4. **Package**: Bundles everything into a Claude-ready `.zip` file

## 🚀 Quick Start

### Method 1: MCP Server for Claude Code (Easiest)

Use Skill Seeker directly from Claude Code with natural language!

```bash
# One-time setup (5 minutes)
./setup_mcp.sh

# Restart Claude Code, then just ask:
```

**In Claude Code:**
```
List all available configs
Generate config for Tailwind at https://tailwindcss.com/docs
Scrape docs using configs/react.json
Package skill at output/react/
```

**Benefits:**
- ✅ No manual CLI commands
- ✅ Natural language interface
- ✅ Integrated with your workflow
- ✅ 8 tools available instantly (includes large docs support!)
- ✅ **Tested and working** in production

**Full guides:**
- 📘 [MCP Setup Guide](docs/MCP_SETUP.md) - Complete installation instructions
- 🧪 [MCP Testing Guide](docs/TEST_MCP_IN_CLAUDE_CODE.md) - Test all 8 tools
- 📦 [Large Documentation Guide](docs/LARGE_DOCUMENTATION.md) - Handle 10K-40K+ pages

### Method 2: CLI (Traditional)

#### Easiest: Use a Preset

```bash
# Install dependencies (macOS)
pip3 install requests beautifulsoup4

# Optional: Estimate pages first (fast, 1-2 minutes)
python3 cli/estimate_pages.py configs/godot.json

# Use Godot preset
python3 cli/doc_scraper.py --config configs/godot.json

# Use React preset
python3 cli/doc_scraper.py --config configs/react.json

# See all presets
ls configs/
```

#### Interactive Mode

```bash
python3 cli/doc_scraper.py --interactive
```

#### Quick Mode

```bash
python3 cli/doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for UIs"
```

## 📁 Simple Structure

```
doc-to-skill/
├── cli/
│   ├── doc_scraper.py      # Main scraping tool
│   └── enhance_skill.py    # Optional: AI-powered SKILL.md enhancement
├── configs/                # Preset configurations
│   ├── godot.json          # Godot Engine
│   ├── react.json          # React
│   ├── vue.json            # Vue.js
│   ├── django.json         # Django
│   └── fastapi.json        # FastAPI
└── output/                 # All output (auto-created)
    ├── godot_data/         # Scraped data
    └── godot/              # Built skill
```

## ✨ Features

### 1. Fast Page Estimation (NEW!)

```bash
python3 cli/estimate_pages.py configs/react.json

# Output:
📊 ESTIMATION RESULTS
✅ Pages Discovered: 180
📈 Estimated Total: 230
⏱️ Time Elapsed: 1.2 minutes
💡 Recommended max_pages: 280
```

**Benefits:**
- Know page count BEFORE scraping (saves time)
- Validates URL patterns work correctly
- Estimates total scraping time
- Recommends optimal `max_pages` setting
- Fast (1-2 minutes vs 20-40 minutes full scrape)

### 2. Auto-Detect Existing Data

```bash
python3 cli/doc_scraper.py --config configs/godot.json

# If data exists:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
⏭️ Skipping scrape, using existing data
```

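
The check behind that prompt can be pictured with a short sketch. This is not the code in `cli/doc_scraper.py`; it only assumes the `output/<name>_data/pages/` layout shown elsewhere in this README, and `count_cached_pages` is a hypothetical helper name.

```python
from pathlib import Path


def count_cached_pages(name, output_dir="output"):
    """Return how many scraped page files exist for a skill, or 0 if none."""
    # Assumed layout: output/<name>_data/pages/*.json (one JSON file per page)
    pages_dir = Path(output_dir) / f"{name}_data" / "pages"
    if not pages_dir.is_dir():
        return 0
    return sum(1 for _ in pages_dir.glob("*.json"))


if __name__ == "__main__":
    cached = count_cached_pages("godot")
    if cached:
        print(f"Found existing data: {cached} pages")
    else:
        print("No existing data, a full scrape is needed")
```

Counting files rather than trusting a single marker file means a partially deleted cache shows up as a lower page count instead of a false positive.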
### 3. Knowledge Generation

**Automatic pattern extraction:**
- Extracts common code patterns from docs
- Detects programming language
- Creates quick reference with real examples
- Smarter categorization with scoring

**Enhanced SKILL.md:**
- Real code examples from documentation
- Language-annotated code blocks
- Common patterns section
- Quick reference from actual usage examples

### 4. Smart Categorization

Automatically infers categories from:
- URL structure
- Page titles
- Content keywords

Matches are scored to improve accuracy.

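
One way to picture scored categorization is the sketch below. It is an illustration under stated assumptions, not the scraper's actual algorithm: the `WEIGHTS` values, the `"other"` fallback, and the `categorize` helper are all hypothetical. The `categories` mapping follows the shape of the `categories` section in the config file.

```python
from collections import defaultdict

# Hypothetical weights: a URL match counts more than a title or body match.
WEIGHTS = {"url": 3, "title": 2, "content": 1}


def categorize(page, categories):
    """Pick the category whose keywords score highest for this page."""
    scores = defaultdict(int)
    for category, keywords in categories.items():
        for keyword in keywords:
            kw = keyword.lower()
            for field, weight in WEIGHTS.items():
                if kw in page.get(field, "").lower():
                    scores[category] += weight
    if not scores:
        return "other"  # fallback bucket when nothing matches
    return max(scores, key=scores.get)


categories = {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"],
}
page = {
    "url": "https://docs.myframework.com/api/client",
    "title": "Client reference",
    "content": "A quick intro to the client class.",
}
print(categorize(page, categories))  # prints "api"
```

Here `getting_started` scores 1 (one body match) while `api` scores 5 (a URL match plus a title match), so the page lands in `api` even though the body mentions "intro".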
### 5. Code Language Detection

Automatically detects:
- Python (`def`, `import`, `from`)
- JavaScript (`const`, `let`, `=>`)
- GDScript (`func`, `var`, `extends`)
- C++ (`#include`, `int main`)
- And more...

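
A keyword heuristic of this kind might look like the following sketch. The marker lists come from the languages named in this section; the counting rule and the `detect_language` function are assumptions made for illustration, and the real detector may weigh things differently.

```python
# Marker substrings per language, taken from the list in this section.
LANGUAGE_MARKERS = {
    "python": ["def ", "import ", "from "],
    "javascript": ["const ", "let ", "=>"],
    "gdscript": ["func ", "var ", "extends "],
    "cpp": ["#include", "int main"],
}


def detect_language(code):
    """Return the language with the most marker hits, or "unknown"."""
    hits = {
        lang: sum(code.count(marker) for marker in markers)
        for lang, markers in LANGUAGE_MARKERS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "unknown"


print(detect_language("extends Node2D\nfunc _ready():\n    var speed = 10"))  # gdscript
print(detect_language("#include <iostream>\nint main() { return 0; }"))  # cpp
```

Substring counting is deliberately crude: it is cheap enough to run on every code block, and a wrong guess only affects the annotation on a fence, not the scraped content.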
### 5. Skip Scraping

```bash
# Scrape once
python3 cli/doc_scraper.py --config configs/react.json

# Later, just rebuild (instant)
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape
```

### 6. AI-Powered SKILL.md Enhancement

```bash
# Option 1: During scraping (API-based, requires API key)
pip3 install anthropic
export ANTHROPIC_API_KEY=sk-ant-...
python3 cli/doc_scraper.py --config configs/react.json --enhance

# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
python3 cli/doc_scraper.py --config configs/react.json --enhance-local

# Option 3: After scraping (API-based, standalone)
python3 cli/enhance_skill.py output/react/

# Option 4: After scraping (LOCAL, no API key, standalone)
python3 cli/enhance_skill_local.py output/react/
```

**What it does:**
- Reads your reference documentation
- Uses Claude to generate an excellent SKILL.md
- Extracts best code examples (5-10 practical examples)
- Creates comprehensive quick reference
- Adds domain-specific key concepts
- Provides navigation guidance for different skill levels
- Automatically backs up original
- **Quality:** Transforms 75-line templates into 500+ line comprehensive guides

**LOCAL Enhancement (Recommended):**
- Uses your Claude Code Max plan (no API costs)
- Opens new terminal with Claude Code
- Analyzes reference files automatically
- Takes 30-60 seconds
- Quality: 9/10 (comparable to API version)

### 7. Large Documentation Support (10K-40K+ Pages)

**For massive documentation sites like Godot (40K pages), AWS, or Microsoft Docs:**

```bash
# 1. Estimate first (discover page count)
python3 cli/estimate_pages.py configs/godot.json

# 2. Auto-split into focused sub-skills
python3 cli/split_config.py configs/godot.json --strategy router

# Creates:
# - godot-scripting.json (5K pages)
# - godot-2d.json (8K pages)
# - godot-3d.json (10K pages)
# - godot-physics.json (6K pages)
# - godot-shaders.json (11K pages)

# 3. Scrape all in parallel (4-8 hours instead of 20-40!)
for config in configs/godot-*.json; do
  python3 cli/doc_scraper.py --config "$config" &
done
wait

# 4. Generate intelligent router/hub skill
python3 cli/generate_router.py configs/godot-*.json

# 5. Package all skills
python3 cli/package_multi.py output/godot*/

# 6. Upload all .zip files to Claude
# Users just ask questions naturally!
# Router automatically directs to the right sub-skill!
```

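
The shell loop in step 3 starts every scrape at once. If you would rather cap how many run concurrently, a wrapper along these lines is one option. It is a sketch, not part of the project: `build_command`, `scrape_all`, and the default of four workers are hypothetical, and only the `cli/doc_scraper.py --config <file>` invocation comes from this README.

```python
import glob
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(config_path):
    """Build the scraper invocation used throughout this README."""
    return [sys.executable, "cli/doc_scraper.py", "--config", config_path]


def scrape_all(config_paths, max_workers=4, runner=subprocess.run):
    """Run one scraper process per config, at most max_workers at a time.

    Returns a dict mapping each config path to its process exit code.
    """
    def run_one(path):
        return path, runner(build_command(path)).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, config_paths))


if __name__ == "__main__":
    results = scrape_all(sorted(glob.glob("configs/godot-*.json")))
    failed = [path for path, code in results.items() if code != 0]
    print(f"{len(results) - len(failed)} succeeded, {len(failed)} failed")
```

Threads are enough here because each worker only waits on a child process. Collecting exit codes also makes it easy to re-run just the configs that failed.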
**Split Strategies:**
- **auto** - Intelligently detects best strategy based on page count
- **category** - Split by documentation categories (scripting, 2d, 3d, etc.)
- **router** - Create hub skill + specialized sub-skills (RECOMMENDED)
- **size** - Split every N pages (for docs without clear categories)

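
As a rough illustration of the **size** strategy, the sketch below chunks a list of page URLs into groups of at most N. It shows only the chunking idea. `split_by_size` is a hypothetical helper, and `cli/split_config.py` may partition pages differently.

```python
def split_by_size(urls, pages_per_skill):
    """Split a list of page URLs into consecutive chunks of at most N pages."""
    if pages_per_skill <= 0:
        raise ValueError("pages_per_skill must be positive")
    return [
        urls[start:start + pages_per_skill]
        for start in range(0, len(urls), pages_per_skill)
    ]


urls = [f"https://docs.example.com/page/{i}" for i in range(12)]
chunks = split_by_size(urls, pages_per_skill=5)
print([len(chunk) for chunk in chunks])  # prints [5, 5, 2]
```

Every chunk except possibly the last is full, which is why this strategy suits documentation without clear categories: it bounds skill size without needing any knowledge of the content.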
**Benefits:**
- ✅ Faster scraping (parallel execution)
- ✅ More focused skills (better Claude performance)
- ✅ Easier maintenance (update one topic at a time)
- ✅ Natural user experience (router handles routing)
- ✅ Avoids context window limits

**Configuration:**
```json
{
  "name": "godot",
  "max_pages": 40000,
  "split_strategy": "router",
  "split_config": {
    "target_pages_per_skill": 5000,
    "create_router": true,
    "split_by_categories": ["scripting", "2d", "3d", "physics"]
  }
}
```

**Full Guide:** [Large Documentation Guide](docs/LARGE_DOCUMENTATION.md)

### 8. Checkpoint/Resume for Long Scrapes

**Never lose progress on long-running scrapes.** Enable checkpoints in the config (this example saves every 1000 pages):

```json
{
  "checkpoint": {
    "enabled": true,
    "interval": 1000
  }
}
```

```bash
# If the scrape is interrupted (Ctrl+C or crash), resume from the last checkpoint
python3 cli/doc_scraper.py --config configs/godot.json --resume

# Example output:
# ✅ Resuming from checkpoint (12,450 pages scraped)
# ⏭️ Skipping 12,450 already-scraped pages
# 🔄 Continuing from where we left off...

# Start fresh (clear checkpoint)
python3 cli/doc_scraper.py --config configs/godot.json --fresh
```

**Benefits:**
- ✅ Auto-saves every 1000 pages (configurable)
- ✅ Saves on interruption (Ctrl+C)
- ✅ Resume with `--resume` flag
- ✅ Never lose hours of scraping progress

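
To make the checkpoint behavior concrete, here is a minimal sketch of interval saves plus a save on Ctrl+C. It is written under assumptions and is not the scraper's implementation: the JSON file format and the `save_checkpoint`, `load_checkpoint`, and `scrape` helpers are hypothetical, and the crawl is simplified to a fixed URL list.

```python
import json
import os
import tempfile


def save_checkpoint(path, visited, pending):
    """Write progress atomically so a crash mid-write cannot corrupt the file."""
    state = {"visited": sorted(visited), "pending": list(pending)}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def load_checkpoint(path):
    """Return (visited, pending) from a checkpoint, or empty state if none."""
    if not os.path.exists(path):
        return set(), []
    with open(path, encoding="utf-8") as handle:
        state = json.load(handle)
    return set(state["visited"]), state["pending"]


def scrape(start_urls, fetch, checkpoint_path, interval=1000):
    """Visit every URL once, saving every `interval` pages and on Ctrl+C."""
    visited, pending = load_checkpoint(checkpoint_path)
    pending = pending or list(start_urls)
    try:
        while pending:
            url = pending[0]
            if url in visited:
                pending.pop(0)
                continue
            fetch(url)  # if this raises, url is still first in pending
            pending.pop(0)
            visited.add(url)
            if len(visited) % interval == 0:
                save_checkpoint(checkpoint_path, visited, pending)
    except KeyboardInterrupt:
        save_checkpoint(checkpoint_path, visited, pending)
        raise
    return visited
```

A URL is removed from `pending` only after `fetch` returns, so an interrupted page is retried on resume rather than silently skipped.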
## 🎯 Complete Workflows

### First Time (With Scraping + Enhancement)

```bash
# 1. Scrape + Build + AI Enhancement (LOCAL, no API key)
python3 cli/doc_scraper.py --config configs/godot.json --enhance-local

# 2. Wait for new terminal to close (enhancement completes)
# Check the enhanced SKILL.md:
cat output/godot/SKILL.md

# 3. Package
python3 cli/package_skill.py output/godot/

# 4. Done! You have godot.zip with excellent SKILL.md
```

**Time:** 20-40 minutes (scraping) + 60 seconds (enhancement) = ~21-41 minutes

### Using Existing Data (Fast!)

```bash
# 1. Use cached data + Local Enhancement
python3 cli/doc_scraper.py --config configs/godot.json --skip-scrape
python3 cli/enhance_skill_local.py output/godot/

# 2. Package
python3 cli/package_skill.py output/godot/

# 3. Done!
```

**Time:** 1-3 minutes (build) + 60 seconds (enhancement) = ~2-4 minutes total

### Without Enhancement (Basic)

```bash
# 1. Scrape + Build (no enhancement)
python3 cli/doc_scraper.py --config configs/godot.json

# 2. Package
python3 cli/package_skill.py output/godot/

# 3. Done! (SKILL.md will be basic template)
```

**Time:** 20-40 minutes

**Note:** SKILL.md will be generic - enhancement strongly recommended!

## 📋 Available Presets

| Config | Framework | Description |
|--------|-----------|-------------|
| `godot.json` | Godot Engine | Game development |
| `react.json` | React | UI framework |
| `vue.json` | Vue.js | Progressive framework |
| `django.json` | Django | Python web framework |
| `fastapi.json` | FastAPI | Modern Python API |

### Using Presets

```bash
# Godot
python3 cli/doc_scraper.py --config configs/godot.json

# React
python3 cli/doc_scraper.py --config configs/react.json

# Vue
python3 cli/doc_scraper.py --config configs/vue.json

# Django
python3 cli/doc_scraper.py --config configs/django.json

# FastAPI
python3 cli/doc_scraper.py --config configs/fastapi.json
```

## 🎨 Creating Your Own Config

### Option 1: Interactive

```bash
python3 cli/doc_scraper.py --interactive
# Follow prompts, it will create the config for you
```

### Option 2: Copy and Edit

```bash
# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 cli/doc_scraper.py --config configs/myframework.json
```

### Config Structure

```json
{
  "name": "myframework",
  "description": "When to use this skill",
  "base_url": "https://docs.myframework.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/docs", "/guide"],
    "exclude": ["/blog", "/about"]
  },
  "categories": {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
```

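
To see how `base_url` and `url_patterns` could work together, consider the sketch below. The matching rules are assumptions, not documented behavior: it treats patterns as plain substrings of the URL path, lets `exclude` override `include`, and rejects links on other hosts. `should_scrape` is a hypothetical helper.

```python
from urllib.parse import urlparse


def should_scrape(url, base_url, include, exclude):
    """Decide whether a discovered link belongs in the crawl."""
    # Stay on the documentation site itself.
    if urlparse(url).netloc != urlparse(base_url).netloc:
        return False
    path = urlparse(url).path
    # An exclude match always wins over an include match.
    if any(pattern in path for pattern in exclude):
        return False
    # An empty include list means "accept everything not excluded".
    return not include or any(pattern in path for pattern in include)


config = {
    "base_url": "https://docs.myframework.com/",
    "url_patterns": {"include": ["/docs", "/guide"], "exclude": ["/blog", "/about"]},
}
patterns = config["url_patterns"]
for link in [
    "https://docs.myframework.com/docs/install",
    "https://docs.myframework.com/blog/release-notes",
    "https://example.org/docs/install",
]:
    print(link, should_scrape(link, config["base_url"], **patterns))
```

Under these assumptions the first link is accepted, the second is rejected by the `/blog` exclude, and the third is rejected for being on a different host.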
## 📊 What Gets Created

```
output/
├── godot_data/              # Scraped raw data
│   ├── pages/               # JSON files (one per page)
│   └── summary.json         # Overview
│
└── godot/                   # The skill
    ├── SKILL.md             # Enhanced with real examples
    ├── references/          # Categorized docs
    │   ├── index.md
    │   ├── getting_started.md
    │   ├── scripting.md
    │   └── ...
    ├── scripts/             # Empty (add your own)
    └── assets/              # Empty (add your own)
```

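
Packaging the skill directory into a `.zip` can be sketched with the standard library alone. This is an illustration, not the project's packaging script: `package_skill` here is a hypothetical function, and placing `SKILL.md` at the archive root is an assumption about what the upload expects.

```python
import zipfile
from pathlib import Path


def package_skill(skill_dir, zip_path=None):
    """Zip a built skill directory and return the path of the archive."""
    skill_dir = Path(skill_dir)
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md, build the skill first")
    # Default to a sibling archive, e.g. output/godot/ -> output/godot.zip
    zip_path = Path(zip_path) if zip_path else skill_dir.parent / f"{skill_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(skill_dir.rglob("*")):
            if file.is_file():
                # Store paths relative to the skill root, e.g. references/index.md
                archive.write(file, file.relative_to(skill_dir))
    return zip_path
```

Calling `package_skill("output/godot/")` would write `output/godot.zip`. Checking for `SKILL.md` first avoids producing an archive from a directory that was scraped but never built.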
## 🎯 Command Line Options

```bash
# Interactive mode
python3 cli/doc_scraper.py --interactive

# Use config file
python3 cli/doc_scraper.py --config configs/godot.json

# Quick mode
python3 cli/doc_scraper.py --name react --url https://react.dev/

# Skip scraping (use existing data)
python3 cli/doc_scraper.py --config configs/godot.json --skip-scrape

# With description
python3 cli/doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for building UIs"
```

## 💡 Tips

### 1. Test Small First

Edit `max_pages` in the config to test with just 20 pages:
```json
{
  "max_pages": 20
}
```

### 2. Reuse Scraped Data

```bash
# Scrape once
python3 cli/doc_scraper.py --config configs/react.json

# Rebuild multiple times (instant)
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape
```

### 3. Finding Selectors

```python
# Test in Python
from bs4 import BeautifulSoup
import requests

url = "https://docs.example.com/page"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
soup = BeautifulSoup(response.content, 'html.parser')

# Try different selectors; each prints None if nothing matches
print(soup.select_one('article'))
print(soup.select_one('main'))
print(soup.select_one('div[role="main"]'))
```

### 4. Check Output Quality

```bash
# After building, check:
cat output/godot/SKILL.md  # Should have real examples
cat output/godot/references/index.md  # Categories
```

## 🐛 Troubleshooting

### No Content Extracted?
- Check your `main_content` selector
- Try: `article`, `main`, `div[role="main"]`

### Data Exists But Won't Use It?
```bash
# Force re-scrape
rm -rf output/myframework_data/
python3 cli/doc_scraper.py --config configs/myframework.json
```

### Categories Not Good?
Edit the `categories` section of the config with better keywords.

### Want to Update Docs?
```bash
# Delete old data
rm -rf output/godot_data/

# Re-scrape
python3 cli/doc_scraper.py --config configs/godot.json
```

## 📈 Performance

| Task | Time | Notes |
|------|------|-------|
| Scraping | 15-45 min | First time only |
| Building | 1-3 min | Fast! |
| Re-building | <1 min | With `--skip-scrape` |
| Packaging | 5-10 sec | Final zip |

## ✅ Summary

**One tool does everything:**
1. ✅ Scrapes documentation
2. ✅ Auto-detects existing data
3. ✅ Generates better knowledge
4. ✅ Creates enhanced skills
5. ✅ Works with presets or custom configs
6. ✅ Supports skip-scraping for fast iteration

**Simple structure:**
- `cli/doc_scraper.py` - The tool
- `configs/` - Presets
- `output/` - Everything else

**Better output:**
- Real code examples with language detection
- Common patterns extracted from docs
- Smart categorization
- Enhanced SKILL.md with actual examples

## 📚 Documentation

- **[QUICKSTART.md](QUICKSTART.md)** - Get started in 3 steps
- **[docs/LARGE_DOCUMENTATION.md](docs/LARGE_DOCUMENTATION.md)** - Handle 10K-40K+ page docs
- **[docs/ENHANCEMENT.md](docs/ENHANCEMENT.md)** - AI enhancement guide
- **[docs/UPLOAD_GUIDE.md](docs/UPLOAD_GUIDE.md)** - How to upload skills to Claude
- **[docs/MCP_SETUP.md](docs/MCP_SETUP.md)** - MCP integration setup
- **[docs/CLAUDE.md](docs/CLAUDE.md)** - Technical architecture
- **[STRUCTURE.md](STRUCTURE.md)** - Repository structure

## 🎮 Ready?

```bash
# Try Godot
python3 cli/doc_scraper.py --config configs/godot.json

# Try React
python3 cli/doc_scraper.py --config configs/react.json

# Or go interactive
python3 cli/doc_scraper.py --interactive
```

## 📝 License

MIT License - see [LICENSE](LICENSE) file for details

---

Happy skill building! 🚀