firefrost-gaming/skill-seekers-reference

yusyus 66b7f9c4f6 chore: Bump version to v1.3.0

Update version numbers across project for v1.3.0 release:
- CHANGELOG.md: Move [Unreleased] → [1.3.0] - 2025-10-26
- README.md: Update version badge 1.2.0 → 1.3.0
- cli/__init__.py: Update __version__ = "1.3.0"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-26 13:16:54 +03:00

Skill Seeker

Automatically convert any documentation website into a Claude AI skill in minutes.

📋 View Development Roadmap & Tasks - 134 tasks across 10 categories, pick any to contribute!

What is Skill Seeker?

Skill Seeker is an automated tool that transforms any documentation website into a production-ready Claude AI skill. Instead of manually reading and summarizing documentation, Skill Seeker:

Scrapes documentation websites automatically
Organizes content into categorized reference files
Enhances with AI to extract best examples and key concepts
Packages everything into an uploadable .zip file for Claude

Result: Get comprehensive Claude skills for any framework, API, or tool in 20-40 minutes instead of hours of manual work.

Why Use This?

🎯 For Developers: Quickly create Claude skills for your favorite frameworks (React, Vue, Django, etc.)
🎮 For Game Devs: Generate skills for game engines (Godot, Unity documentation, etc.)
🔧 For Teams: Create internal documentation skills for your company's APIs
📚 For Learners: Build comprehensive reference skills for technologies you're learning

Key Features

🌐 Documentation Scraping

✅ llms.txt Support - Automatically detects and uses LLM-ready documentation files (10x faster)
✅ Universal Scraper - Works with ANY documentation website
✅ Smart Categorization - Automatically organizes content by topic
✅ Code Language Detection - Recognizes Python, JavaScript, C++, GDScript, etc.
✅ 8 Ready-to-Use Presets - Godot, React, Vue, Django, FastAPI, and more

📄 PDF Support (v1.2.0)

✅ Basic PDF Extraction - Extract text, code, and images from PDF files
✅ OCR for Scanned PDFs - Extract text from scanned documents
✅ Password-Protected PDFs - Handle encrypted PDFs
✅ Table Extraction - Extract complex tables from PDFs
✅ Parallel Processing - 3x faster for large PDFs
✅ Intelligent Caching - 50% faster on re-runs

🤖 AI & Enhancement

✅ AI-Powered Enhancement - Transforms basic templates into comprehensive guides
✅ No API Costs - FREE local enhancement using Claude Code Max
✅ MCP Server for Claude Code - Use directly from Claude Code with natural language

⚡ Performance & Scale

✅ Async Mode - 2-3x faster scraping with async/await (use --async flag)
✅ Large Documentation Support - Handle 10K-40K+ page docs with intelligent splitting
✅ Router/Hub Skills - Intelligent routing to specialized sub-skills
✅ Parallel Scraping - Process multiple skills simultaneously
✅ Checkpoint/Resume - Never lose progress on long scrapes
✅ Caching System - Scrape once, rebuild instantly

✅ Quality Assurance

✅ Fully Tested - 299 tests with 100% pass rate

Quick Example

Option 1: Use from Claude Code (Recommended)

# One-time setup (5 minutes)
./setup_mcp.sh

# Then in Claude Code, just ask:
"Generate a React skill from https://react.dev/"
"Scrape PDF at docs/manual.pdf and create skill"

Time: Automated | Quality: Production-ready | Cost: Free

Option 2: Use CLI Directly (HTML Docs)

# Install dependencies (2 pip packages)
pip3 install requests beautifulsoup4

# Generate a React skill in one command
python3 cli/doc_scraper.py --config configs/react.json --enhance-local

# Upload output/react.zip to Claude - Done!

Time: ~25 minutes | Quality: Production-ready | Cost: Free

Option 3: Use CLI for PDF Documentation

# Install PDF support
pip3 install PyMuPDF

# Basic PDF extraction
python3 cli/pdf_scraper.py --pdf docs/manual.pdf --name myskill

# Advanced features: table extraction plus parallel processing on 8 CPU cores
# (comments cannot follow a trailing backslash, so they are collected here)
python3 cli/pdf_scraper.py --pdf docs/manual.pdf --name myskill \
    --extract-tables \
    --parallel \
    --workers 8

# Scanned PDFs (requires: pip install pytesseract Pillow)
python3 cli/pdf_scraper.py --pdf docs/scanned.pdf --name myskill --ocr

# Password-protected PDFs
python3 cli/pdf_scraper.py --pdf docs/encrypted.pdf --name myskill --password mypassword

# Upload output/myskill.zip to Claude - Done!

Time: ~5-15 minutes (or 2-5 minutes with parallel) | Quality: Production-ready | Cost: Free

Advanced Features:

✅ OCR for scanned PDFs (requires pytesseract)
✅ Password-protected PDF support
✅ Table extraction
✅ Parallel processing (3x faster)
✅ Intelligent caching

How It Works

graph LR
    A[Documentation Website] --> B[Skill Seeker]
    B --> C[Scraper]
    B --> D[AI Enhancement]
    B --> E[Packager]
    C --> F[Organized References]
    D --> F
    F --> E
    E --> G[Claude Skill .zip]
    G --> H[Upload to Claude AI]

Detect llms.txt: Checks for llms-full.txt, llms.txt, and llms-small.txt first
Scrape: Extracts all pages from the documentation
Categorize: Organizes content into topics (API, guides, tutorials, etc.)
Enhance: AI analyzes the docs and creates a comprehensive SKILL.md with examples
Package: Bundles everything into a Claude-ready .zip file
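
The llms.txt detection step can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names are hypothetical, it assumes the files live directly under the documentation root, and the `exists` probe is injected so the selection order can be tried without network access.

```python
from typing import Callable, Optional
from urllib.parse import urljoin

# Probe order taken from the step list above: most complete variant first.
LLMS_CANDIDATES = ("llms-full.txt", "llms.txt", "llms-small.txt")


def candidate_urls(base_url: str) -> list:
    """Build the URLs to probe, relative to the documentation root (assumed location)."""
    root = base_url if base_url.endswith("/") else base_url + "/"
    return [urljoin(root, name) for name in LLMS_CANDIDATES]


def detect_llms_txt(base_url: str, exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate URL for which `exists` is true, else None.

    `exists` is a hypothetical probe (for example an HTTP HEAD request).
    """
    for url in candidate_urls(base_url):
        if exists(url):
            return url
    return None


# Example with a fake probe: only llms.txt is "published".
published = {"https://docs.example.com/llms.txt"}
found = detect_llms_txt("https://docs.example.com", lambda u: u in published)
print(found)
```

If no candidate is found, the function returns `None`, which is where a regular page-by-page scrape would take over.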

📋 Prerequisites

Before you start, make sure you have:

Python 3.10 or higher - Download | Check: python3 --version
Git - Download | Check: git --version
15-30 minutes for first-time setup

First-time user? → Start Here: Bulletproof Quick Start Guide 🎯

This guide walks you through EVERYTHING step-by-step (Python install, git clone, first skill creation).

🚀 Quick Start

Method 1: MCP Server for Claude Code (Easiest)

Use Skill Seeker directly from Claude Code with natural language!

# Clone repository
git clone https://github.com/yusufkaraaslan/Skill_Seekers.git
cd Skill_Seekers

# One-time setup (5 minutes)
./setup_mcp.sh

# Restart Claude Code, then just ask:

In Claude Code:

List all available configs
Generate config for Tailwind at https://tailwindcss.com/docs
Scrape docs using configs/react.json
Package skill at output/react/

Benefits:

✅ No manual CLI commands
✅ Natural language interface
✅ Integrated with your workflow
✅ 9 tools available instantly (includes automatic upload!)
✅ Tested and working in production

Full guides:

📘 MCP Setup Guide - Complete installation instructions
🧪 MCP Testing Guide - Test all 9 tools
📦 Large Documentation Guide - Handle 10K-40K+ pages
📤 Upload Guide - How to upload skills to Claude

Method 2: CLI (Traditional)

One-Time Setup: Create Virtual Environment

# Clone repository
git clone https://github.com/yusufkaraaslan/Skill_Seekers.git
cd Skill_Seekers

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate  # macOS/Linux
# OR on Windows: venv\Scripts\activate

# Install dependencies
pip install requests beautifulsoup4 pytest

# Save dependencies
pip freeze > requirements.txt

# Optional: Install anthropic for API-based enhancement (not needed for LOCAL enhancement)
# pip install anthropic

Always activate the virtual environment before using Skill Seeker:

source venv/bin/activate  # Run this each time you start a new terminal session

Easiest: Use a Preset

# Make sure venv is activated (you should see (venv) in your prompt)
source venv/bin/activate

# Optional: Estimate pages first (fast, 1-2 minutes)
python3 cli/estimate_pages.py configs/godot.json

# Use Godot preset
python3 cli/doc_scraper.py --config configs/godot.json

# Use React preset
python3 cli/doc_scraper.py --config configs/react.json

# See all presets
ls configs/

Interactive Mode

python3 cli/doc_scraper.py --interactive

Quick Mode

python3 cli/doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for UIs"

📤 Uploading Skills to Claude

Once your skill is packaged, you need to upload it to Claude:

Option 1: Automatic Upload (API-based)

# Set your API key (one-time)
export ANTHROPIC_API_KEY=sk-ant-...

# Package and upload automatically
python3 cli/package_skill.py output/react/ --upload

# OR upload existing .zip
python3 cli/upload_skill.py output/react.zip

Benefits:

✅ Fully automatic
✅ No manual steps
✅ Works from command line

Requirements:

Anthropic API key (get from https://console.anthropic.com/)

Option 2: Manual Upload (No API Key)

# Package skill
python3 cli/package_skill.py output/react/

# This will:
# 1. Create output/react.zip
# 2. Open the output/ folder automatically
# 3. Show upload instructions

# Then manually upload:
# - Go to https://claude.ai/skills
# - Click "Upload Skill"
# - Select output/react.zip
# - Done!

Benefits:

✅ No API key needed
✅ Works for everyone
✅ Folder opens automatically

Option 3: Claude Code (MCP) - Smart & Automatic

In Claude Code, just ask:
"Package and upload the React skill"

# With API key set:
# - Packages the skill
# - Uploads to Claude automatically
# - Done! ✅

# Without API key:
# - Packages the skill
# - Shows where to find the .zip
# - Provides manual upload instructions

Benefits:

✅ Natural language
✅ Smart auto-detection (uploads if API key available)
✅ Works with or without API key
✅ Falls back to manual upload instructions instead of failing when no API key is set

📁 Simple Structure

doc-to-skill/
├── cli/
│   ├── doc_scraper.py      # Main scraping tool
│   ├── package_skill.py    # Package to .zip
│   ├── upload_skill.py     # Auto-upload (API)
│   └── enhance_skill.py    # AI enhancement
├── mcp/                    # MCP server for Claude Code
│   └── server.py           # 9 MCP tools
├── configs/                # Preset configurations
│   ├── godot.json         # Godot Engine
│   ├── react.json         # React
│   ├── vue.json           # Vue.js
│   ├── django.json        # Django
│   └── fastapi.json       # FastAPI
└── output/                 # All output (auto-created)
    ├── godot_data/        # Scraped data
    ├── godot/             # Built skill
    └── godot.zip          # Packaged skill

✨ Features

1. Fast Page Estimation (NEW!)

python3 cli/estimate_pages.py configs/react.json

# Output:
📊 ESTIMATION RESULTS
✅ Pages Discovered: 180
📈 Estimated Total: 230
⏱️  Time Elapsed: 1.2 minutes
💡 Recommended max_pages: 280

Benefits:

Know page count BEFORE scraping (saves time)
Validates URL patterns work correctly
Estimates total scraping time
Recommends optimal max_pages setting
Fast (1-2 minutes vs 20-40 minutes full scrape)

2. Auto-Detect Existing Data

python3 cli/doc_scraper.py --config configs/godot.json

# If data exists:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
⏭️  Skipping scrape, using existing data
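
A check like the one above could be approximated as shown here. This is only a sketch: it assumes the `output/<name>_data/pages/` layout with one JSON file per page described under "What Gets Created", and the helper name and file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def count_existing_pages(output_dir: Path, name: str) -> int:
    """Count scraped page files under <output_dir>/<name>_data/pages/."""
    pages_dir = Path(output_dir) / f"{name}_data" / "pages"
    if not pages_dir.is_dir():
        return 0
    return sum(1 for _ in pages_dir.glob("*.json"))


# Demo in a throwaway directory so no real output is touched.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    before = count_existing_pages(out, "godot")
    pages = out / "godot_data" / "pages"
    pages.mkdir(parents=True)
    for i in range(3):
        (pages / f"page_{i}.json").write_text(json.dumps({"title": f"Page {i}"}))
    after = count_existing_pages(out, "godot")

print(before, after)
```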

3. Knowledge Generation

Automatic pattern extraction:

Extracts common code patterns from docs
Detects programming language
Creates quick reference with real examples
Smarter categorization with scoring

Enhanced SKILL.md:

Real code examples from documentation
Language-annotated code blocks
Common patterns section
Quick reference from actual usage examples

4. Smart Categorization

Automatically infers categories from:

URL structure
Page titles
Content keywords
With scoring for better accuracy
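
One way to picture scored categorization is the sketch below. The weights (URL 3, title 2, content 1), the `min_score` threshold, and the `"other"` fallback are illustrative assumptions, not the tool's documented behavior; the category keywords mirror the `categories` section of a config file.

```python
def score_page(url: str, title: str, content: str, keywords: list) -> int:
    """Score one page against one category's keywords.

    Weights are assumptions: a URL hit counts 3, a title hit 2, a content hit 1.
    """
    url_l, title_l, content_l = url.lower(), title.lower(), content.lower()
    score = 0
    for kw in keywords:
        kw = kw.lower()
        if kw in url_l:
            score += 3
        if kw in title_l:
            score += 2
        if kw in content_l:
            score += 1
    return score


def categorize(url: str, title: str, content: str, categories: dict, min_score: int = 2) -> str:
    """Return the best-scoring category, or "other" when nothing reaches min_score."""
    best_name, best_score = "other", 0
    for name, keywords in categories.items():
        s = score_page(url, title, content, keywords)
        if s > best_score:
            best_name, best_score = name, s
    return best_name if best_score >= min_score else "other"


categories = {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"],
}
print(categorize("https://docs.example.com/api/client", "Client reference", "", categories))
```

Plain substring matching keeps the sketch short; it can misfire on short keywords that appear inside unrelated words.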

5. Code Language Detection

# Automatically detects:
- Python (def, import, from)
- JavaScript (const, let, =>)
- GDScript (func, var, extends)
- C++ (#include, int main)
- And more...
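
A keyword-based detector along these lines might look like the following sketch. The marker patterns come from the list above; the regular expressions, the one-point-per-marker scoring, and the `"unknown"` fallback are assumptions made for illustration.

```python
import re

# Markers taken from the list above; the regexes and scoring are assumptions.
LANGUAGE_MARKERS = {
    "python": [r"^\s*def \w+\(", r"^\s*import \w+", r"^\s*from \w+(\.\w+)* import "],
    "javascript": [r"\bconst \w+", r"\blet \w+", r"=>"],
    "gdscript": [r"^\s*func \w+\(", r"^\s*var \w+", r"^\s*extends \w+"],
    "cpp": [r"^\s*#include\b", r"\bint main\s*\("],
}


def detect_language(code: str) -> str:
    """Return the language whose markers match most often, or "unknown"."""
    scores = {
        lang: sum(1 for pattern in patterns if re.search(pattern, code, re.MULTILINE))
        for lang, patterns in LANGUAGE_MARKERS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


print(detect_language("extends Node2D\n\nfunc _ready():\n    var speed = 10\n"))
```

Heuristics like this are easy to fool (a JavaScript snippet that only uses `var` would look like GDScript here), so a real detector would likely prefer explicit language hints from the page when they exist.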

6. Skip Scraping

# Scrape once
python3 cli/doc_scraper.py --config configs/react.json

# Later, just rebuild (instant)
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape

7. Async Mode for Faster Scraping (2-3x Speed!)

# Enable async mode with 8 workers (recommended for large docs)
python3 cli/doc_scraper.py --config configs/react.json --async --workers 8

# Small docs (~100-500 pages)
python3 cli/doc_scraper.py --config configs/mydocs.json --async --workers 4

# Large docs (2000+ pages) with no rate limiting
python3 cli/doc_scraper.py --config configs/largedocs.json --async --workers 8 --no-rate-limit

Performance Comparison:

Sync mode (threads): ~18 pages/sec, 120 MB memory
Async mode: ~55 pages/sec, 40 MB memory
Result: about 3x faster (55 vs 18 pages/sec) with roughly 67% less memory (40 MB vs 120 MB)

When to use:

✅ Large documentation (500+ pages)
✅ Network latency is high
✅ Memory is constrained
❌ Small docs (< 100 pages) - overhead not worth it
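
To illustrate what a worker limit means in async mode, here is a minimal sketch using `asyncio.Semaphore` to cap concurrent requests. It is not the scraper's implementation: `scrape_all` and `fake_fetch` are hypothetical names, and the fetch is simulated with a short sleep so the example runs offline.

```python
import asyncio


async def scrape_all(urls, fetch, workers: int = 8):
    """Fetch every URL, with at most `workers` requests in flight at once."""
    semaphore = asyncio.Semaphore(workers)

    async def bounded(url):
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


# Simulated fetch that records how many calls overlap.
stats = {"in_flight": 0, "peak": 0}


async def fake_fetch(url):
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # stands in for network latency
    stats["in_flight"] -= 1
    return f"<html>{url}</html>"


urls = [f"https://docs.example.com/page{i}" for i in range(20)]
pages = asyncio.run(scrape_all(urls, fake_fetch, workers=4))
print(len(pages), "pages fetched; peak concurrency:", stats["peak"])
```

The gain comes from overlapping network waits, which is why async mode helps most when latency is high and matters little for very small sites.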

See full guide: ASYNC_SUPPORT.md

8. AI-Powered SKILL.md Enhancement

# Option 1: During scraping (API-based, requires API key)
pip3 install anthropic
export ANTHROPIC_API_KEY=sk-ant-...
python3 cli/doc_scraper.py --config configs/react.json --enhance

# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
python3 cli/doc_scraper.py --config configs/react.json --enhance-local

# Option 3: After scraping (API-based, standalone)
python3 cli/enhance_skill.py output/react/

# Option 4: After scraping (LOCAL, no API key, standalone)
python3 cli/enhance_skill_local.py output/react/

What it does:

Reads your reference documentation
Uses Claude to generate an excellent SKILL.md
Extracts best code examples (5-10 practical examples)
Creates comprehensive quick reference
Adds domain-specific key concepts
Provides navigation guidance for different skill levels
Automatically backs up original
Quality: Transforms 75-line templates into 500+ line comprehensive guides

LOCAL Enhancement (Recommended):

Uses your Claude Code Max plan (no API costs)
Opens new terminal with Claude Code
Analyzes reference files automatically
Takes 30-60 seconds
Quality: 9/10 (comparable to API version)

9. Large Documentation Support (10K-40K+ Pages)

For massive documentation sites like Godot (40K pages), AWS, or Microsoft Docs:

# 1. Estimate first (discover page count)
python3 cli/estimate_pages.py configs/godot.json

# 2. Auto-split into focused sub-skills
python3 cli/split_config.py configs/godot.json --strategy router

# Creates:
# - godot-scripting.json (5K pages)
# - godot-2d.json (8K pages)
# - godot-3d.json (10K pages)
# - godot-physics.json (6K pages)
# - godot-shaders.json (11K pages)

# 3. Scrape all in parallel (4-8 hours instead of 20-40!)
for config in configs/godot-*.json; do
  python3 cli/doc_scraper.py --config "$config" &
done
wait

# 4. Generate intelligent router/hub skill
python3 cli/generate_router.py configs/godot-*.json

# 5. Package all skills
python3 cli/package_multi.py output/godot*/

# 6. Upload all .zip files to Claude
# Users just ask questions naturally!
# Router automatically directs to the right sub-skill!

Split Strategies:

auto - Intelligently detects best strategy based on page count
category - Split by documentation categories (scripting, 2d, 3d, etc.)
router - Create hub skill + specialized sub-skills (RECOMMENDED)
size - Split every N pages (for docs without clear categories)
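
The size strategy is the simplest of the four to picture. The sketch below splits a list of page URLs into consecutive groups of at most N pages; the function name is hypothetical and the real `split_config.py` may group pages differently.

```python
def split_by_size(pages: list, target_pages_per_skill: int) -> list:
    """Split pages into consecutive chunks of at most target_pages_per_skill."""
    if target_pages_per_skill <= 0:
        raise ValueError("target_pages_per_skill must be positive")
    return [
        pages[i:i + target_pages_per_skill]
        for i in range(0, len(pages), target_pages_per_skill)
    ]


pages = [f"https://docs.example.com/page{i}" for i in range(12)]
chunks = split_by_size(pages, 5)
print([len(chunk) for chunk in chunks])
```

Each chunk would then become its own sub-skill config, with the last chunk holding whatever pages remain.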

Benefits:

✅ Faster scraping (parallel execution)
✅ More focused skills (better Claude performance)
✅ Easier maintenance (update one topic at a time)
✅ Natural user experience (router handles routing)
✅ Avoids context window limits

Configuration:

{
  "name": "godot",
  "max_pages": 40000,
  "split_strategy": "router",
  "split_config": {
    "target_pages_per_skill": 5000,
    "create_router": true,
    "split_by_categories": ["scripting", "2d", "3d", "physics"]
  }
}

Full Guide: Large Documentation Guide

10. Checkpoint/Resume for Long Scrapes

Never lose progress on long-running scrapes:

# Enable in config ("interval": save a checkpoint every 1000 pages)
{
  "checkpoint": {
    "enabled": true,
    "interval": 1000
  }
}

# If the scrape is interrupted (Ctrl+C or crash), resume from the last checkpoint
python3 cli/doc_scraper.py --config configs/godot.json --resume

# Output:
✅ Resuming from checkpoint (12,450 pages scraped)
⏭️  Skipping 12,450 already-scraped pages
🔄 Continuing from where we left off...

# Start fresh (clear checkpoint)
python3 cli/doc_scraper.py --config configs/godot.json --fresh

Benefits:

✅ Auto-saves every 1000 pages (configurable)
✅ Saves on interruption (Ctrl+C)
✅ Resume with --resume flag
✅ Never lose hours of scraping progress
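
The checkpoint behavior listed above (periodic saves, a save on interruption, and skipping finished pages on resume) can be sketched as follows. The checkpoint file format, the function names, and the simulated Ctrl+C are all assumptions for illustration, not the tool's actual implementation.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path: Path) -> set:
    """Return the set of already-scraped URLs, or an empty set if none exists."""
    if path.exists():
        return set(json.loads(path.read_text())["visited"])
    return set()


def save_checkpoint(path: Path, visited: set) -> None:
    """Write to a temp file, then rename, so a crash cannot truncate the checkpoint."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"visited": sorted(visited)}))
    tmp.replace(path)


def scrape(urls, path: Path, fetch, interval: int = 1000) -> set:
    visited = load_checkpoint(path)
    try:
        for url in urls:
            if url in visited:
                continue  # already scraped in an earlier run
            fetch(url)
            visited.add(url)
            if len(visited) % interval == 0:
                save_checkpoint(path, visited)
    finally:
        save_checkpoint(path, visited)  # also runs on Ctrl+C
    return visited


# Demo: interrupt after 4 pages, then resume.
urls = [f"https://docs.example.com/page{i}" for i in range(10)]
first_calls, second_calls = [], []


def flaky_fetch(url):
    if len(first_calls) == 4:
        raise KeyboardInterrupt  # simulated Ctrl+C
    first_calls.append(url)


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "checkpoint.json"
    try:
        scrape(urls, ckpt, flaky_fetch, interval=2)
    except KeyboardInterrupt:
        pass
    saved = load_checkpoint(ckpt)
    final = scrape(urls, ckpt, second_calls.append, interval=2)

print(len(saved), len(second_calls), len(final))
```

The second run fetches only the pages the first run never reached, which is the property that makes resuming cheap.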

🎯 Complete Workflows

First Time (With Scraping + Enhancement)

# 1. Scrape + Build + AI Enhancement (LOCAL, no API key)
python3 cli/doc_scraper.py --config configs/godot.json --enhance-local

# 2. Wait for new terminal to close (enhancement completes)
# Check the enhanced SKILL.md:
cat output/godot/SKILL.md

# 3. Package
python3 cli/package_skill.py output/godot/

# 4. Done! You have godot.zip with excellent SKILL.md

Time: 20-40 minutes (scraping) + 60 seconds (enhancement) = ~21-41 minutes

Using Existing Data (Fast!)

# 1. Use cached data + Local Enhancement
python3 cli/doc_scraper.py --config configs/godot.json --skip-scrape
python3 cli/enhance_skill_local.py output/godot/

# 2. Package
python3 cli/package_skill.py output/godot/

# 3. Done!

Time: 1-3 minutes (build) + 60 seconds (enhancement) = ~2-4 minutes total

Without Enhancement (Basic)

# 1. Scrape + Build (no enhancement)
python3 cli/doc_scraper.py --config configs/godot.json

# 2. Package
python3 cli/package_skill.py output/godot/

# 3. Done! (SKILL.md will be basic template)

Time: 20-40 minutes

Note: SKILL.md will be generic - enhancement strongly recommended!

📋 Available Presets

| Config | Framework | Description |
|--------|-----------|-------------|
| `godot.json` | Godot Engine | Game development |
| `react.json` | React | UI framework |
| `vue.json` | Vue.js | Progressive framework |
| `django.json` | Django | Python web framework |
| `fastapi.json` | FastAPI | Modern Python API |
| `ansible-core.json` | Ansible Core 2.19 | Automation & configuration |

Using Presets

# Godot
python3 cli/doc_scraper.py --config configs/godot.json

# React
python3 cli/doc_scraper.py --config configs/react.json

# Vue
python3 cli/doc_scraper.py --config configs/vue.json

# Django
python3 cli/doc_scraper.py --config configs/django.json

# FastAPI
python3 cli/doc_scraper.py --config configs/fastapi.json

# Ansible
python3 cli/doc_scraper.py --config configs/ansible-core.json

🎨 Creating Your Own Config

Option 1: Interactive

python3 cli/doc_scraper.py --interactive
# Follow prompts, it will create the config for you

Option 2: Copy and Edit

# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 cli/doc_scraper.py --config configs/myframework.json

Config Structure

{
  "name": "myframework",
  "description": "When to use this skill",
  "base_url": "https://docs.myframework.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/docs", "/guide"],
    "exclude": ["/blog", "/about"]
  },
  "categories": {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
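
To make the role of `base_url` and `url_patterns` concrete, here is a small sketch of a URL filter. The matching rules (same host as `base_url`, substring match on the URL path, exclude wins over include) are assumptions; the scraper's real rules may differ.

```python
import json
from urllib.parse import urlparse

# A trimmed copy of the config shown above.
CONFIG = json.loads("""
{
  "base_url": "https://docs.myframework.com/",
  "url_patterns": {
    "include": ["/docs", "/guide"],
    "exclude": ["/blog", "/about"]
  }
}
""")


def should_scrape(url: str, config: dict) -> bool:
    """Decide whether a discovered link is in scope (assumed semantics)."""
    base = urlparse(config["base_url"])
    target = urlparse(url)
    if target.netloc != base.netloc:
        return False  # different site
    patterns = config.get("url_patterns", {})
    # Match on the path only, so "/docs" does not match the "docs." hostname.
    if any(p in target.path for p in patterns.get("exclude", [])):
        return False
    include = patterns.get("include", [])
    return not include or any(p in target.path for p in include)


for url in (
    "https://docs.myframework.com/docs/intro",
    "https://docs.myframework.com/blog/post",
    "https://other.example.com/docs/intro",
):
    print(url, should_scrape(url, CONFIG))
```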

📊 What Gets Created

output/
├── godot_data/              # Scraped raw data
│   ├── pages/              # JSON files (one per page)
│   └── summary.json        # Overview
│
└── godot/                   # The skill
    ├── SKILL.md            # Enhanced with real examples
    ├── references/         # Categorized docs
    │   ├── index.md
    │   ├── getting_started.md
    │   ├── scripting.md
    │   └── ...
    ├── scripts/            # Empty (add your own)
    └── assets/             # Empty (add your own)
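
Packaging the skill folder shown above amounts to zipping it. The following sketch uses the standard `zipfile` module; whether the real `package_skill.py` stores files at the archive root (as here) or under a top-level folder is an assumption, as is the `SKILL.md` presence check.

```python
import tempfile
import zipfile
from pathlib import Path


def package_skill(skill_dir: Path) -> Path:
    """Zip <skill_dir> into a sibling <name>.zip and return the archive path."""
    skill_dir = Path(skill_dir)
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_dir} has no SKILL.md")
    zip_path = skill_dir.parent / f"{skill_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(skill_dir.rglob("*")):
            if file.is_file():
                # Store paths relative to the skill folder (assumed layout).
                zf.write(file, file.relative_to(skill_dir).as_posix())
    return zip_path


# Demo with a throwaway skill folder.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "godot"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text("# Godot\n")
    (skill / "references" / "index.md").write_text("# Index\n")
    archive = package_skill(skill)
    archive_name = archive.name
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(archive_name, names)
```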

🎯 Command Line Options

# Interactive mode
python3 cli/doc_scraper.py --interactive

# Use config file
python3 cli/doc_scraper.py --config configs/godot.json

# Quick mode
python3 cli/doc_scraper.py --name react --url https://react.dev/

# Skip scraping (use existing data)
python3 cli/doc_scraper.py --config configs/godot.json --skip-scrape

# With description
python3 cli/doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for building UIs"

💡 Tips

1. Test Small First

Edit max_pages in the config to test with a small number of pages (for example, 20):

{
  "max_pages": 20
}

2. Reuse Scraped Data

# Scrape once
python3 cli/doc_scraper.py --config configs/react.json

# Rebuild multiple times (instant)
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape
python3 cli/doc_scraper.py --config configs/react.json --skip-scrape

3. Finding Selectors

# Test in Python
from bs4 import BeautifulSoup
import requests

url = "https://docs.example.com/page"
response = requests.get(url, timeout=10)
response.raise_for_status()  # Stop early on HTTP errors such as 404
soup = BeautifulSoup(response.content, 'html.parser')

# Try different selectors (a selector that matches nothing prints None)
print(soup.select_one('article'))
print(soup.select_one('main'))
print(soup.select_one('div[role="main"]'))

4. Check Output Quality

# After building, check:
cat output/godot/SKILL.md  # Should have real examples
cat output/godot/references/index.md  # Categories

🐛 Troubleshooting

No Content Extracted?

Check your main_content selector
Try: article, main, div[role="main"]

Data Exists But Won't Use It?

# Force re-scrape
rm -rf output/myframework_data/
python3 cli/doc_scraper.py --config configs/myframework.json

Categories Not Good?

Edit the config categories section with better keywords.

Want to Update Docs?

# Delete old data
rm -rf output/godot_data/

# Re-scrape
python3 cli/doc_scraper.py --config configs/godot.json

📈 Performance

| Task | Time | Notes |
|------|------|-------|
| Scraping (sync) | 15-45 min | First time only, thread-based |
| Scraping (async) | 5-15 min | 2-3x faster with --async flag |
| Building | 1-3 min | Fast! |
| Re-building | <1 min | With --skip-scrape |
| Packaging | 5-10 sec | Final zip |

✅ Summary

One tool does everything:

✅ Scrapes documentation
✅ Auto-detects existing data
✅ Generates better knowledge
✅ Creates enhanced skills
✅ Works with presets or custom configs
✅ Supports skip-scraping for fast iteration

Simple structure:

doc_scraper.py - The tool
configs/ - Presets
output/ - Everything else

Better output:

Real code examples with language detection
Common patterns extracted from docs
Smart categorization
Enhanced SKILL.md with actual examples

📚 Documentation

Getting Started

BULLETPROOF_QUICKSTART.md - 🎯 START HERE if you're new!
QUICKSTART.md - Quick start for experienced users
TROUBLESHOOTING.md - Common issues and solutions

Guides

docs/LARGE_DOCUMENTATION.md - Handle 10K-40K+ page docs
ASYNC_SUPPORT.md - Async mode guide (2-3x faster scraping)
docs/ENHANCEMENT.md - AI enhancement guide
docs/UPLOAD_GUIDE.md - How to upload skills to Claude
docs/MCP_SETUP.md - MCP integration setup

Technical

docs/CLAUDE.md - Technical architecture
STRUCTURE.md - Repository structure

🎮 Ready?

# Try Godot
python3 cli/doc_scraper.py --config configs/godot.json

# Try React
python3 cli/doc_scraper.py --config configs/react.json

# Or go interactive
python3 cli/doc_scraper.py --interactive

📝 License

MIT License - see LICENSE file for details

Happy skill building! 🚀
