Init
2
LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2025 yusyus
Copyright (c) 2025 [Your Name/Username]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
181
QUICKSTART.md
Normal file
@@ -0,0 +1,181 @@
# Quick Start Guide

## 🚀 3 Steps to Create a Skill

### Step 1: Install Dependencies

```bash
pip3 install requests beautifulsoup4
```

### Step 2: Run the Tool

**Option A: Use a Preset (Easiest)**
```bash
python3 doc_scraper.py --config configs/godot.json
```

**Option B: Interactive Mode**
```bash
python3 doc_scraper.py --interactive
```

**Option C: Quick Command**
```bash
python3 doc_scraper.py --name react --url https://react.dev/
```

### Step 3: Enhance SKILL.md (Recommended)

```bash
# LOCAL enhancement (no API key, uses Claude Code Max)
python3 enhance_skill_local.py output/godot/
```

**This takes 60 seconds and dramatically improves the SKILL.md quality!**

### Step 4: Package the Skill

```bash
python3 package_skill.py output/godot/
```

**Done!** You now have `godot.zip` ready to use.

---

## 📋 Available Presets

```bash
# Godot Engine
python3 doc_scraper.py --config configs/godot.json

# React
python3 doc_scraper.py --config configs/react.json

# Vue.js
python3 doc_scraper.py --config configs/vue.json

# Django
python3 doc_scraper.py --config configs/django.json

# FastAPI
python3 doc_scraper.py --config configs/fastapi.json
```

---

## ⚡ Using Existing Data (Fast!)

If you already scraped once:

```bash
python3 doc_scraper.py --config configs/godot.json

# When prompted:
✓ Found existing data: 245 pages
Use existing data? (y/n): y

# Builds in seconds!
```

Or use `--skip-scrape`:
```bash
python3 doc_scraper.py --config configs/godot.json --skip-scrape
```

---

## 🎯 Complete Example (Recommended Workflow)

```bash
# 1. Install (once)
pip3 install requests beautifulsoup4

# 2. Scrape React docs with LOCAL enhancement
python3 doc_scraper.py --config configs/react.json --enhance-local
# Wait 15-30 minutes (scraping) + 60 seconds (enhancement)

# 3. Package
python3 package_skill.py output/react/

# 4. Use react.zip in Claude!
```

**Alternative: Enhancement after scraping**
```bash
# 2a. Scrape only (no enhancement)
python3 doc_scraper.py --config configs/react.json

# 2b. Enhance later
python3 enhance_skill_local.py output/react/

# 3. Package
python3 package_skill.py output/react/
```

---

## 💡 Pro Tips

### Test with a Few Pages First
Edit the config file to cap the crawl at 20 pages while testing:
```json
{
  "max_pages": 20
}
```

### Rebuild Instantly
```bash
# After first scrape, you can rebuild instantly:
python3 doc_scraper.py --config configs/react.json --skip-scrape
```

### Create Custom Config
```bash
# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 doc_scraper.py --config configs/myframework.json
```

---

## 📁 What You Get

```
output/
├── godot_data/          # Raw scraped data (reusable!)
└── godot/               # The skill
    ├── SKILL.md         # With real code examples!
    └── references/      # Organized docs
```

---

## ❓ Need Help?

See **README.md** for:
- Complete documentation
- Config file structure
- Troubleshooting
- Advanced usage

---

## 🎮 Let's Go!

```bash
# Godot
python3 doc_scraper.py --config configs/godot.json

# Or interactive
python3 doc_scraper.py --interactive
```

That's it! 🚀
445
README.md
@@ -1,2 +1,443 @@
# Skill_Seekers
Single powerful tool to convert ANY documentation website into a Claude skill
# Documentation to Claude Skill Converter

[License: MIT](https://opensource.org/licenses/MIT)

**Single powerful tool to convert ANY documentation website into a Claude skill.**

## 🚀 Quick Start

### Easiest: Use a Preset

```bash
# Install dependencies (macOS)
pip3 install requests beautifulsoup4

# Use Godot preset
python3 doc_scraper.py --config configs/godot.json

# Use React preset
python3 doc_scraper.py --config configs/react.json

# See all presets
ls configs/
```

### Interactive Mode

```bash
python3 doc_scraper.py --interactive
```

### Quick Mode

```bash
python3 doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for UIs"
```

## 📁 Simple Structure

```
doc-to-skill/
├── doc_scraper.py          # Main scraping tool
├── enhance_skill.py        # Optional: AI-powered SKILL.md enhancement
├── configs/                # Preset configurations
│   ├── godot.json          # Godot Engine
│   ├── react.json          # React
│   ├── vue.json            # Vue.js
│   ├── django.json         # Django
│   └── fastapi.json        # FastAPI
└── output/                 # All output (auto-created)
    ├── godot_data/         # Scraped data
    └── godot/              # Built skill
```

## ✨ Features

### 1. Auto-Detect Existing Data

```bash
python3 doc_scraper.py --config configs/godot.json

# If data exists:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
⏭️ Skipping scrape, using existing data
```
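
The detection logic itself is not shown in this README. As a rough sketch only, a check like this could count cached pages by looking for JSON files under `output/<name>_data/pages/`, the layout described under "What Gets Created". The function name `count_cached_pages` and the `page_N.json` file names are assumptions made for this illustration.

```python
import tempfile
from pathlib import Path


def count_cached_pages(name, output_root="output"):
    """Count page JSON files under <output_root>/<name>_data/pages (0 if absent)."""
    pages_dir = Path(output_root) / f"{name}_data" / "pages"
    if not pages_dir.is_dir():
        return 0
    return sum(1 for _ in pages_dir.glob("*.json"))


# Demo against a throwaway directory instead of a real scrape.
with tempfile.TemporaryDirectory() as root:
    pages = Path(root) / "godot_data" / "pages"
    pages.mkdir(parents=True)
    for i in range(3):
        # Hypothetical file names; the real scraper may name pages differently.
        (pages / f"page_{i}.json").write_text("{}", encoding="utf-8")
    found = count_cached_pages("godot", output_root=root)
    missing = count_cached_pages("react", output_root=root)

print(f"godot: {found} pages, react: {missing} pages")
```

A count of zero would mean a fresh scrape is needed; any other count is what a prompt such as "Found existing data" could report.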

### 2. Knowledge Generation

**Automatic pattern extraction:**
- Extracts common code patterns from docs
- Detects programming language
- Creates quick reference with real examples
- Smarter categorization with scoring

**Enhanced SKILL.md:**
- Real code examples from documentation
- Language-annotated code blocks
- Common patterns section
- Quick reference from actual usage examples

### 3. Smart Categorization

Automatically infers categories, with scoring for better accuracy, from:
- URL structure
- Page titles
- Content keywords
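
The scoring rule is not spelled out here, so the following is only a minimal sketch of keyword scoring over URL, title, and content. The weights (URL 3, title 2, content 1), the function names, and the `"other"` fallback are all assumptions for illustration, not the tool's actual values.

```python
from collections import defaultdict

# Hypothetical weights: a URL hit counts more than a title hit,
# which counts more than a hit in the body text.
WEIGHTS = {"url": 3, "title": 2, "content": 1}


def score_categories(page, categories):
    """Return {category: score} for a page dict with 'url', 'title', 'content'."""
    scores = defaultdict(int)
    for category, keywords in categories.items():
        for keyword in keywords:
            needle = keyword.lower()
            for field, weight in WEIGHTS.items():
                if needle in page.get(field, "").lower():
                    scores[category] += weight
    return dict(scores)


def pick_category(page, categories, default="other"):
    """Pick the highest-scoring category, or `default` when nothing matches."""
    scores = score_categories(page, categories)
    if not scores:
        return default
    return max(scores, key=scores.get)


categories = {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"],
}
page = {
    "url": "https://docs.myframework.com/api/widgets",
    "title": "Widget API reference",
    "content": "This page lists every widget class.",
}
print(pick_category(page, categories))
```

The `categories` mapping has the same shape as the `categories` section of a config file, so better keywords there translate directly into better scores.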

### 4. Code Language Detection

Automatically detects:
- Python (def, import, from)
- JavaScript (const, let, =>)
- GDScript (func, var, extends)
- C++ (#include, int main)
- And more...
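
These keyword hints correspond to the fallback heuristics in the `detect_language` method of `doc_scraper.py` in this commit. The standalone function below restates those substring checks in the same order, leaving out the CSS-class lookup that runs first. The name `guess_language` is chosen here for illustration.

```python
def guess_language(code):
    """Guess a language from substrings; the order of checks matters."""
    if "import " in code and "from " in code:
        return "python"
    if "const " in code or "let " in code or "=>" in code:
        return "javascript"
    if "func " in code and "var " in code:
        return "gdscript"
    if "def " in code and ":" in code:
        return "python"
    if "#include" in code or "int main" in code:
        return "cpp"
    return "unknown"


samples = [
    "def add(a, b):\n    return a + b",
    "const total = items.length;",
    "func _ready():\n    var speed = 10",
    "#include <stdio.h>\nint main() { return 0; }",
]
for sample in samples:
    print(guess_language(sample))
```

Because these are plain substring checks, a snippet can be mislabeled when it happens to contain another language's keyword, so an explicit `language-...` class on the code block is the more reliable signal when the site provides one.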

### 5. Skip Scraping

```bash
# Scrape once
python3 doc_scraper.py --config configs/react.json

# Later, just rebuild (instant)
python3 doc_scraper.py --config configs/react.json --skip-scrape
```

### 6. AI-Powered SKILL.md Enhancement (NEW!)

```bash
# Option 1: During scraping (API-based, requires API key)
pip3 install anthropic
export ANTHROPIC_API_KEY=sk-ant-...
python3 doc_scraper.py --config configs/react.json --enhance

# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
python3 doc_scraper.py --config configs/react.json --enhance-local

# Option 3: After scraping (API-based, standalone)
python3 enhance_skill.py output/react/

# Option 4: After scraping (LOCAL, no API key, standalone)
python3 enhance_skill_local.py output/react/
```

**What it does:**
- Reads your reference documentation
- Uses Claude to generate an excellent SKILL.md
- Extracts best code examples (5-10 practical examples)
- Creates comprehensive quick reference
- Adds domain-specific key concepts
- Provides navigation guidance for different skill levels
- Automatically backs up original
- **Quality:** Transforms 75-line templates into 500+ line comprehensive guides

**LOCAL Enhancement (Recommended):**
- Uses your Claude Code Max plan (no API costs)
- Opens new terminal with Claude Code
- Analyzes reference files automatically
- Takes 30-60 seconds
- Quality: 9/10 (comparable to API version)

## 🎯 Complete Workflows

### First Time (With Scraping + Enhancement)

```bash
# 1. Scrape + Build + AI Enhancement (LOCAL, no API key)
python3 doc_scraper.py --config configs/godot.json --enhance-local

# 2. Wait for new terminal to close (enhancement completes)
# Check the enhanced SKILL.md:
cat output/godot/SKILL.md

# 3. Package
python3 package_skill.py output/godot/

# 4. Done! You have godot.zip with excellent SKILL.md
```

**Time:** 20-40 minutes (scraping) + 60 seconds (enhancement) = ~21-41 minutes

### Using Existing Data (Fast!)

```bash
# 1. Use cached data + Local Enhancement
python3 doc_scraper.py --config configs/godot.json --skip-scrape
python3 enhance_skill_local.py output/godot/

# 2. Package
python3 package_skill.py output/godot/

# 3. Done!
```

**Time:** 1-3 minutes (build) + 60 seconds (enhancement) = ~2-4 minutes total

### Without Enhancement (Basic)

```bash
# 1. Scrape + Build (no enhancement)
python3 doc_scraper.py --config configs/godot.json

# 2. Package
python3 package_skill.py output/godot/

# 3. Done! (SKILL.md will be basic template)
```

**Time:** 20-40 minutes
**Note:** SKILL.md will be generic - enhancement strongly recommended!

## 📋 Available Presets

| Config | Framework | Description |
|--------|-----------|-------------|
| `godot.json` | Godot Engine | Game development |
| `react.json` | React | UI framework |
| `vue.json` | Vue.js | Progressive framework |
| `django.json` | Django | Python web framework |
| `fastapi.json` | FastAPI | Modern Python API |

### Using Presets

```bash
# Godot
python3 doc_scraper.py --config configs/godot.json

# React
python3 doc_scraper.py --config configs/react.json

# Vue
python3 doc_scraper.py --config configs/vue.json

# Django
python3 doc_scraper.py --config configs/django.json

# FastAPI
python3 doc_scraper.py --config configs/fastapi.json
```

## 🎨 Creating Your Own Config

### Option 1: Interactive

```bash
python3 doc_scraper.py --interactive
# Follow prompts, it will create the config for you
```

### Option 2: Copy and Edit

```bash
# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 doc_scraper.py --config configs/myframework.json
```

### Config Structure

```json
{
  "name": "myframework",
  "description": "When to use this skill",
  "base_url": "https://docs.myframework.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/docs", "/guide"],
    "exclude": ["/blog", "/about"]
  },
  "categories": {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
```
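
To see how `base_url` and `url_patterns` interact, here is a small sketch that loads a config and applies the same three checks as `is_valid_url` in the portion of `doc_scraper.py` included in this commit: base URL prefix, then include patterns, then exclude patterns. The helper names `load_config` and `url_allowed` are made up for this example, and treating only `name` and `base_url` as required is an assumption based on the keys the constructor reads without a default.

```python
import json

# Assumption: only these two keys are read without a fallback value.
REQUIRED_KEYS = ("name", "base_url")


def load_config(text):
    """Parse a config from JSON text and check the required keys."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing required keys: {missing}")
    return config


def url_allowed(url, config):
    """Apply base_url prefix, include, and exclude checks as plain substring tests."""
    if not url.startswith(config["base_url"]):
        return False
    patterns = config.get("url_patterns", {})
    includes = patterns.get("include", [])
    if includes and not any(pattern in url for pattern in includes):
        return False
    return not any(pattern in url for pattern in patterns.get("exclude", []))


config = load_config("""
{
  "name": "myframework",
  "base_url": "https://docs.myframework.com/",
  "url_patterns": {"include": ["/docs", "/guide"], "exclude": ["/blog", "/about"]}
}
""")
print(url_allowed("https://docs.myframework.com/docs/intro", config))
print(url_allowed("https://docs.myframework.com/blog/news", config))
```

Patterns are matched against the whole URL, host included, so a pattern such as `/docs` also matches the `//docs.` part of `https://docs.myframework.com/`. Patterns with a trailing slash, such as `/docs/`, are less likely to match by accident.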

## 📊 What Gets Created

```
output/
├── godot_data/              # Scraped raw data
│   ├── pages/               # JSON files (one per page)
│   └── summary.json         # Overview
│
└── godot/                   # The skill
    ├── SKILL.md             # Enhanced with real examples
    ├── references/          # Categorized docs
    │   ├── index.md
    │   ├── getting_started.md
    │   ├── scripting.md
    │   └── ...
    ├── scripts/             # Empty (add your own)
    └── assets/              # Empty (add your own)
```

## 🎯 Command Line Options

```bash
# Interactive mode
python3 doc_scraper.py --interactive

# Use config file
python3 doc_scraper.py --config configs/godot.json

# Quick mode
python3 doc_scraper.py --name react --url https://react.dev/

# Skip scraping (use existing data)
python3 doc_scraper.py --config configs/godot.json --skip-scrape

# With description
python3 doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for building UIs"
```

## 💡 Tips

### 1. Test Small First

Set `max_pages` in the config to a small value, for example 20 pages, to test:
```json
{
  "max_pages": 20
}
```
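
Rather than editing a preset by hand, a throwaway copy with a smaller `max_pages` can be generated. This is a sketch only: `write_test_config` and the `react-test.json` file name are hypothetical, and the demo writes into a temporary directory rather than `configs/`.

```python
import json
import tempfile
from pathlib import Path


def write_test_config(src, dst, max_pages=20):
    """Copy the config at `src` to `dst` with a lowered max_pages."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    config["max_pages"] = max_pages
    Path(dst).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo on a temporary directory with a cut-down stand-in for a preset.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "react.json"
    src.write_text(
        '{"name": "react", "base_url": "https://react.dev/", "max_pages": 300}',
        encoding="utf-8",
    )
    small = write_test_config(src, Path(tmp) / "react-test.json")
    written = json.loads((Path(tmp) / "react-test.json").read_text(encoding="utf-8"))

print(small["max_pages"])
```

Note that the copy keeps the same `name`, and `doc_scraper.py` derives its output directories from `name`, so a test run would share `output/react_data/` with a full run unless the name is changed as well.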

### 2. Reuse Scraped Data

```bash
# Scrape once
python3 doc_scraper.py --config configs/react.json

# Rebuild multiple times (instant)
python3 doc_scraper.py --config configs/react.json --skip-scrape
python3 doc_scraper.py --config configs/react.json --skip-scrape
```

### 3. Finding Selectors

```python
# Test in Python
from bs4 import BeautifulSoup
import requests

url = "https://docs.example.com/page"
# A timeout keeps the test from hanging on an unresponsive site
soup = BeautifulSoup(requests.get(url, timeout=10).content, 'html.parser')

# Try different selectors; a selector that prints None did not match
print(soup.select_one('article'))
print(soup.select_one('main'))
print(soup.select_one('div[role="main"]'))
```

### 4. Check Output Quality

```bash
# After building, check:
cat output/godot/SKILL.md  # Should have real examples
cat output/godot/references/index.md  # Categories
```

## 🐛 Troubleshooting

### No Content Extracted?
- Check your `main_content` selector
- Try: `article`, `main`, `div[role="main"]`

### Data Exists But Won't Use It?
```bash
# Force re-scrape
rm -rf output/myframework_data/
python3 doc_scraper.py --config configs/myframework.json
```

### Categories Not Good?
Edit the config `categories` section with better keywords.

### Want to Update Docs?
```bash
# Delete old data
rm -rf output/godot_data/

# Re-scrape
python3 doc_scraper.py --config configs/godot.json
```

## 📈 Performance

| Task | Time | Notes |
|------|------|-------|
| Scraping | 15-45 min | First time only |
| Building | 1-3 min | Fast! |
| Re-building | <1 min | With --skip-scrape |
| Packaging | 5-10 sec | Final zip |

## ✅ Summary

**One tool does everything:**
1. ✅ Scrapes documentation
2. ✅ Auto-detects existing data
3. ✅ Generates better knowledge
4. ✅ Creates enhanced skills
5. ✅ Works with presets or custom configs
6. ✅ Supports skip-scraping for fast iteration

**Simple structure:**
- `doc_scraper.py` - The tool
- `configs/` - Presets
- `output/` - Everything else

**Better output:**
- Real code examples with language detection
- Common patterns extracted from docs
- Smart categorization
- Enhanced SKILL.md with actual examples

## 📚 Documentation

- **[QUICKSTART.md](QUICKSTART.md)** - Get started in 3 steps
- **[docs/ENHANCEMENT.md](docs/ENHANCEMENT.md)** - AI enhancement guide
- **[docs/UPLOAD_GUIDE.md](docs/UPLOAD_GUIDE.md)** - How to upload skills to Claude
- **[docs/CLAUDE.md](docs/CLAUDE.md)** - Technical architecture
- **[STRUCTURE.md](STRUCTURE.md)** - Repository structure

## 🎮 Ready?

```bash
# Try Godot
python3 doc_scraper.py --config configs/godot.json

# Try React
python3 doc_scraper.py --config configs/react.json

# Or go interactive
python3 doc_scraper.py --interactive
```

## 📝 License

MIT License - see [LICENSE](LICENSE) file for details

---

Happy skill building! 🚀
55
STRUCTURE.md
Normal file
@@ -0,0 +1,55 @@
# Repository Structure

```
doc-to-skill/
│
├── README.md                       # Main documentation (start here!)
├── QUICKSTART.md                   # 3-step quick start guide
├── LICENSE                         # MIT License
├── .gitignore                      # Git ignore rules
│
├── 🐍 Core Scripts
│   ├── doc_scraper.py              # Main scraping tool
│   ├── enhance_skill.py            # AI enhancement (API-based)
│   ├── enhance_skill_local.py      # AI enhancement (LOCAL, no API)
│   └── package_skill.py            # Skill packaging tool
│
├── 📁 configs/                     # Preset configurations
│   ├── godot.json
│   ├── react.json
│   ├── vue.json
│   ├── django.json
│   ├── fastapi.json
│   ├── steam-inventory.json
│   ├── steam-economy.json
│   └── steam-economy-complete.json
│
├── 📚 docs/                        # Detailed documentation
│   ├── CLAUDE.md                   # Technical architecture
│   ├── ENHANCEMENT.md              # AI enhancement guide
│   ├── UPLOAD_GUIDE.md             # How to upload skills
│   └── READY_TO_SHARE.md           # Sharing checklist
│
└── 📦 output/                      # Generated skills (git-ignored)
    ├── {name}_data/                # Scraped raw data (cached)
    └── {name}/                     # Built skills
        ├── SKILL.md                # Main skill file
        └── references/             # Reference documentation
```

## Key Files

### For Users:
- **README.md** - Start here for overview and installation
- **QUICKSTART.md** - Get started in 3 steps
- **configs/** - 8 ready-to-use presets

### For Developers:
- **doc_scraper.py** - Main tool (787 lines)
- **docs/CLAUDE.md** - Architecture and internals
- **docs/ENHANCEMENT.md** - How enhancement works

### For Contributors:
- **LICENSE** - MIT License
- **.gitignore** - What Git ignores
- **docs/READY_TO_SHARE.md** - Distribution guide
BIN
configs/.DS_Store
vendored
Normal file
Binary file not shown.
25
configs/django.json
Normal file
@@ -0,0 +1,25 @@
{
  "name": "django",
  "description": "Django web framework for Python. Use for Django models, views, templates, ORM, authentication, and web development.",
  "base_url": "https://docs.djangoproject.com/en/stable/",
  "selectors": {
    "main_content": "div.document",
    "title": "h1",
    "code_blocks": "pre"
  },
  "url_patterns": {
    "include": ["/topics/", "/ref/", "/howto/"],
    "exclude": ["/faq/", "/misc/"]
  },
  "categories": {
    "getting_started": ["intro", "tutorial", "install"],
    "models": ["models", "database", "orm", "queries"],
    "views": ["views", "urlconf", "routing"],
    "templates": ["templates", "template"],
    "forms": ["forms", "form"],
    "authentication": ["auth", "authentication", "user"],
    "api": ["ref", "reference"]
  },
  "rate_limit": 0.3,
  "max_pages": 500
}
24
configs/fastapi.json
Normal file
@@ -0,0 +1,24 @@
{
  "name": "fastapi",
  "description": "FastAPI modern Python web framework. Use for building APIs, async endpoints, dependency injection, and Python backend development.",
  "base_url": "https://fastapi.tiangolo.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/tutorial/", "/advanced/", "/reference/"],
    "exclude": ["/help/", "/external-links/"]
  },
  "categories": {
    "getting_started": ["first-steps", "tutorial", "intro"],
    "path_operations": ["path", "operations", "routing"],
    "request_data": ["request", "body", "query", "parameters"],
    "dependencies": ["dependencies", "injection"],
    "security": ["security", "oauth", "authentication"],
    "database": ["database", "sql", "orm"]
  },
  "rate_limit": 0.5,
  "max_pages": 250
}
34
configs/godot.json
Normal file
@@ -0,0 +1,34 @@
{
  "name": "godot",
  "description": "Godot Engine game development. Use for Godot projects, GDScript/C# coding, scene setup, node systems, 2D/3D development, physics, animation, UI, shaders, or any Godot-specific questions.",
  "base_url": "https://docs.godotengine.org/en/stable/",
  "selectors": {
    "main_content": "div[role='main']",
    "title": "title",
    "code_blocks": "pre"
  },
  "url_patterns": {
    "include": [],
    "exclude": [
      "/genindex.html",
      "/search.html",
      "/_static/",
      "/_sources/"
    ]
  },
  "categories": {
    "getting_started": ["introduction", "getting_started", "first", "your_first"],
    "scripting": ["scripting", "gdscript", "c#", "csharp"],
    "2d": ["/2d/", "sprite", "canvas", "tilemap"],
    "3d": ["/3d/", "spatial", "mesh", "3d_"],
    "physics": ["physics", "collision", "rigidbody", "characterbody"],
    "animation": ["animation", "tween", "animationplayer"],
    "ui": ["ui", "control", "gui", "theme"],
    "shaders": ["shader", "material", "visual_shader"],
    "audio": ["audio", "sound"],
    "networking": ["networking", "multiplayer", "rpc"],
    "export": ["export", "platform", "deploy"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
23
configs/react.json
Normal file
@@ -0,0 +1,23 @@
{
  "name": "react",
  "description": "React framework for building user interfaces. Use for React components, hooks, state management, JSX, and modern frontend development.",
  "base_url": "https://react.dev/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/learn", "/reference"],
    "exclude": ["/community", "/blog"]
  },
  "categories": {
    "getting_started": ["quick-start", "installation", "tutorial"],
    "hooks": ["usestate", "useeffect", "usememo", "usecallback", "usecontext", "useref", "hook"],
    "components": ["component", "props", "jsx"],
    "state": ["state", "context", "reducer"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 300
}
108
configs/steam-economy-complete.json
Normal file
@@ -0,0 +1,108 @@
{
  "name": "steam-economy-complete",
  "description": "Complete Steam Economy system including inventory, microtransactions, trading, and monetization. Use for ISteamInventory API, ISteamEconomy API, IInventoryService Web API, Steam Wallet integration, in-app purchases, item definitions, trading, crafting, market integration, and all economy features for game developers.",
  "base_url": "https://partner.steamgames.com/doc/",
  "start_urls": [
    "https://partner.steamgames.com/doc/features/inventory",
    "https://partner.steamgames.com/doc/features/microtransactions",
    "https://partner.steamgames.com/doc/features/microtransactions/implementation",
    "https://partner.steamgames.com/doc/api/ISteamInventory",
    "https://partner.steamgames.com/doc/webapi/ISteamEconomy",
    "https://partner.steamgames.com/doc/webapi/IInventoryService",
    "https://partner.steamgames.com/doc/features/inventory/economy"
  ],
  "selectors": {
    "main_content": "div.documentation_bbcode",
    "title": "div.docPageTitle",
    "code_blocks": "div.bb_code"
  },
  "url_patterns": {
    "include": [
      "/features/inventory",
      "/features/microtransactions",
      "/api/ISteamInventory",
      "/webapi/ISteamEconomy",
      "/webapi/IInventoryService"
    ],
    "exclude": [
      "/home",
      "/sales",
      "/marketing",
      "/legal",
      "/finance",
      "/login",
      "/search",
      "/steamworks/apps",
      "/steamworks/partner"
    ]
  },
  "categories": {
    "getting_started": [
      "overview",
      "getting started",
      "introduction",
      "quickstart",
      "setup"
    ],
    "inventory_system": [
      "inventory",
      "item definition",
      "item schema",
      "item properties",
      "itemdefs",
      "ISteamInventory"
    ],
    "microtransactions": [
      "microtransaction",
      "purchase",
      "payment",
      "checkout",
      "wallet",
      "transaction"
    ],
    "economy_api": [
      "ISteamEconomy",
      "economy",
      "asset",
      "context"
    ],
    "inventory_webapi": [
      "IInventoryService",
      "webapi",
      "web api",
      "http"
    ],
    "trading": [
      "trading",
      "trade",
      "exchange",
      "market"
    ],
    "crafting": [
      "crafting",
      "recipe",
      "combine",
      "exchange"
    ],
    "pricing": [
      "pricing",
      "price",
      "cost",
      "currency"
    ],
    "implementation": [
      "integration",
      "implementation",
      "configure",
      "best practices"
    ],
    "examples": [
      "example",
      "sample",
      "tutorial",
      "walkthrough"
    ]
  },
  "rate_limit": 0.7,
  "max_pages": 1000
}
23
configs/vue.json
Normal file
@@ -0,0 +1,23 @@
{
  "name": "vue",
  "description": "Vue.js progressive JavaScript framework. Use for Vue components, reactivity, composition API, and frontend development.",
  "base_url": "https://vuejs.org/guide/",
  "selectors": {
    "main_content": "main",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/guide/", "/api/", "/examples/"],
    "exclude": ["/about/", "/sponsor/"]
  },
  "categories": {
    "getting_started": ["quick-start", "introduction", "essentials"],
    "components": ["component", "props", "events"],
    "reactivity": ["reactivity", "reactive", "ref", "computed"],
    "composition_api": ["composition", "setup"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 200
}
789
doc_scraper.py
Normal file
@@ -0,0 +1,789 @@
#!/usr/bin/env python3
"""
Documentation to Claude Skill Converter
Single tool to scrape any documentation and create high-quality Claude skills.

Usage:
    python3 doc_scraper.py --interactive
    python3 doc_scraper.py --config configs/godot.json
    python3 doc_scraper.py --url https://react.dev/ --name react
"""

import os
import sys
import json
import time
import re
import argparse
import hashlib
import requests
from pathlib import Path
from urllib.parse import urljoin, urlparse
from bs4 import BeautifulSoup
from collections import deque, defaultdict


class DocToSkillConverter:
    def __init__(self, config):
        self.config = config
        self.name = config['name']
        self.base_url = config['base_url']

        # Paths
        self.data_dir = f"output/{self.name}_data"
        self.skill_dir = f"output/{self.name}"

        # State
        self.visited_urls = set()
        # Support multiple starting URLs
        start_urls = config.get('start_urls', [self.base_url])
        self.pending_urls = deque(start_urls)
        self.pages = []

        # Create directories
        os.makedirs(f"{self.data_dir}/pages", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/references", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/scripts", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/assets", exist_ok=True)

    def is_valid_url(self, url):
        """Check if URL should be scraped"""
        if not url.startswith(self.base_url):
            return False

        # Include patterns
        includes = self.config.get('url_patterns', {}).get('include', [])
        if includes and not any(pattern in url for pattern in includes):
            return False

        # Exclude patterns
        excludes = self.config.get('url_patterns', {}).get('exclude', [])
        if any(pattern in url for pattern in excludes):
            return False

        return True

    def extract_content(self, soup, url):
        """Extract content with improved code and pattern detection"""
        page = {
            'url': url,
            'title': '',
            'content': '',
            'headings': [],
            'code_samples': [],
            'patterns': [],  # NEW: Extract common patterns
            'links': []
        }

        selectors = self.config.get('selectors', {})

        # Extract title
        title_elem = soup.select_one(selectors.get('title', 'title'))
        if title_elem:
            page['title'] = self.clean_text(title_elem.get_text())

        # Find main content
        main_selector = selectors.get('main_content', 'div[role="main"]')
        main = soup.select_one(main_selector)

        if not main:
            print(f"⚠ No content: {url}")
            return page

        # Extract headings with better structure
        for h in main.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6']):
            text = self.clean_text(h.get_text())
            if text:
                page['headings'].append({
                    'level': h.name,
                    'text': text,
                    'id': h.get('id', '')
                })

        # Extract code with language detection
        code_selector = selectors.get('code_blocks', 'pre code')
        for code_elem in main.select(code_selector):
            code = code_elem.get_text()
            if len(code.strip()) > 10:
                # Try to detect language
                lang = self.detect_language(code_elem, code)
                page['code_samples'].append({
                    'code': code.strip(),
                    'language': lang
                })

        # Extract patterns (NEW: common code patterns)
        page['patterns'] = self.extract_patterns(main, page['code_samples'])

        # Extract paragraphs
        paragraphs = []
        for p in main.find_all('p'):
            text = self.clean_text(p.get_text())
            if text and len(text) > 20:  # Skip very short paragraphs
                paragraphs.append(text)

        page['content'] = '\n\n'.join(paragraphs)

        # Extract links
        for link in main.find_all('a', href=True):
            href = urljoin(url, link['href'])
            if self.is_valid_url(href):
                page['links'].append(href)

        return page

    def detect_language(self, elem, code):
        """Detect programming language from code block"""
        # Check class attribute
        classes = elem.get('class', [])
        for cls in classes:
            if 'language-' in cls:
                return cls.replace('language-', '')
            if 'lang-' in cls:
                return cls.replace('lang-', '')

        # Check parent pre element
        parent = elem.parent
        if parent and parent.name == 'pre':
            classes = parent.get('class', [])
            for cls in classes:
                if 'language-' in cls:
                    return cls.replace('language-', '')

        # Heuristic detection
        if 'import ' in code and 'from ' in code:
            return 'python'
        if 'const ' in code or 'let ' in code or '=>' in code:
            return 'javascript'
        if 'func ' in code and 'var ' in code:
            return 'gdscript'
        if 'def ' in code and ':' in code:
            return 'python'
        if '#include' in code or 'int main' in code:
            return 'cpp'

        return 'unknown'

    def extract_patterns(self, main, code_samples):
        """Extract common coding patterns (NEW FEATURE)"""
        patterns = []

        # Look for "Example:" or "Pattern:" sections
        for elem in main.find_all(['p', 'div']):
            text = elem.get_text().lower()
            if any(word in text for word in ['example:', 'pattern:', 'usage:', 'typical use']):
|
||||
# Get the code that follows
|
||||
next_code = elem.find_next(['pre', 'code'])
|
||||
if next_code:
|
||||
patterns.append({
|
||||
'description': self.clean_text(elem.get_text()),
|
||||
'code': next_code.get_text().strip()
|
||||
})
|
||||
|
||||
return patterns[:5] # Keep only the first 5 patterns found on the page
|
||||
|
||||
def clean_text(self, text):
|
||||
"""Clean text content"""
|
||||
text = re.sub(r'\s+', ' ', text)
|
||||
return text.strip()
|
||||
|
||||
def save_page(self, page):
|
||||
"""Save page data"""
|
||||
url_hash = hashlib.md5(page['url'].encode()).hexdigest()[:10]
|
||||
safe_title = re.sub(r'[^\w\s-]', '', page['title'])[:50]
|
||||
safe_title = re.sub(r'[-\s]+', '_', safe_title)
|
||||
|
||||
filename = f"{safe_title}_{url_hash}.json"
|
||||
filepath = os.path.join(self.data_dir, "pages", filename)
|
||||
|
||||
with open(filepath, 'w', encoding='utf-8') as f:
|
||||
json.dump(page, f, indent=2, ensure_ascii=False)
|
||||
|
||||
def scrape_page(self, url):
|
||||
"""Scrape a single page"""
|
||||
try:
|
||||
print(f" {url}")
|
||||
|
||||
headers = {'User-Agent': 'Mozilla/5.0 (Documentation Scraper)'}
|
||||
response = requests.get(url, headers=headers, timeout=30)
|
||||
response.raise_for_status()
|
||||
|
||||
soup = BeautifulSoup(response.content, 'html.parser')
|
||||
page = self.extract_content(soup, url)
|
||||
|
||||
self.save_page(page)
|
||||
self.pages.append(page)
|
||||
|
||||
# Add new URLs
|
||||
for link in page['links']:
|
||||
if link not in self.visited_urls and link not in self.pending_urls:
|
||||
self.pending_urls.append(link)
|
||||
|
||||
# Rate limiting
|
||||
time.sleep(self.config.get('rate_limit', 0.5))
|
||||
|
||||
except Exception as e:
|
||||
print(f" ✗ Error: {e}")
|
||||
|
||||
def scrape_all(self):
|
||||
"""Scrape all pages"""
|
||||
print(f"\n{'='*60}")
|
||||
print(f"SCRAPING: {self.name}")
|
||||
print(f"{'='*60}")
|
||||
print(f"Base URL: {self.base_url}")
|
||||
print(f"Output: {self.data_dir}\n")
|
||||
|
||||
max_pages = self.config.get('max_pages', 500)
|
||||
|
||||
while self.pending_urls and len(self.visited_urls) < max_pages:
|
||||
url = self.pending_urls.popleft()
|
||||
|
||||
if url in self.visited_urls:
|
||||
continue
|
||||
|
||||
self.visited_urls.add(url)
|
||||
self.scrape_page(url)
|
||||
|
||||
if len(self.visited_urls) % 10 == 0:
|
||||
print(f" [{len(self.visited_urls)} pages]")
|
||||
|
||||
print(f"\n✅ Scraped {len(self.visited_urls)} pages")
|
||||
self.save_summary()
|
||||
|
||||
def save_summary(self):
|
||||
"""Save scraping summary"""
|
||||
summary = {
|
||||
'name': self.name,
|
||||
'total_pages': len(self.pages),
|
||||
'base_url': self.base_url,
|
||||
'pages': [{'title': p['title'], 'url': p['url']} for p in self.pages]
|
||||
}
|
||||
|
||||
with open(f"{self.data_dir}/summary.json", 'w', encoding='utf-8') as f:
|
||||
json.dump(summary, f, indent=2, ensure_ascii=False)
|
||||
|
||||
def load_scraped_data(self):
|
||||
"""Load previously scraped data"""
|
||||
pages = []
|
||||
pages_dir = Path(self.data_dir) / "pages"
|
||||
|
||||
if not pages_dir.exists():
|
||||
return []
|
||||
|
||||
for json_file in pages_dir.glob("*.json"):
|
||||
try:
|
||||
with open(json_file, 'r', encoding='utf-8') as f:
|
||||
pages.append(json.load(f))
|
||||
except Exception as e:
|
||||
print(f"⚠ Error loading {json_file}: {e}")
|
||||
|
||||
return pages
|
||||
|
||||
def smart_categorize(self, pages):
|
||||
"""Improved categorization with better pattern matching"""
|
||||
category_defs = self.config.get('categories', {})
|
||||
|
||||
# Default smart categories if none provided
|
||||
if not category_defs:
|
||||
category_defs = self.infer_categories(pages)
|
||||
|
||||
categories = {cat: [] for cat in category_defs.keys()}
|
||||
categories['other'] = []
|
||||
|
||||
for page in pages:
|
||||
url = page['url'].lower()
|
||||
title = page['title'].lower()
|
||||
content = page.get('content', '').lower()[:500] # Check first 500 chars
|
||||
|
||||
categorized = False
|
||||
|
||||
# Match against keywords
|
||||
for cat, keywords in category_defs.items():
|
||||
score = 0
|
||||
for keyword in keywords:
|
||||
keyword = keyword.lower()
|
||||
if keyword in url:
|
||||
score += 3
|
||||
if keyword in title:
|
||||
score += 2
|
||||
if keyword in content:
|
||||
score += 1
|
||||
|
||||
if score >= 2: # Threshold for categorization
|
||||
categories[cat].append(page)
|
||||
categorized = True
|
||||
break
|
||||
|
||||
if not categorized:
|
||||
categories['other'].append(page)
|
||||
|
||||
# Remove empty categories
|
||||
categories = {k: v for k, v in categories.items() if v}
|
||||
|
||||
return categories
|
||||
|
||||
def infer_categories(self, pages):
|
||||
"""Infer categories from URL patterns (IMPROVED)"""
|
||||
url_segments = defaultdict(int)
|
||||
|
||||
for page in pages:
|
||||
path = urlparse(page['url']).path
|
||||
segments = [s for s in path.split('/') if s and s not in ['en', 'stable', 'latest', 'docs']]
|
||||
|
||||
for seg in segments:
|
||||
url_segments[seg] += 1
|
||||
|
||||
# Top segments become categories
|
||||
top_segments = sorted(url_segments.items(), key=lambda x: x[1], reverse=True)[:8]
|
||||
|
||||
categories = {}
|
||||
for seg, count in top_segments:
|
||||
if count >= 3: # At least 3 pages
|
||||
categories[seg] = [seg]
|
||||
|
||||
# Add common defaults
|
||||
if 'tutorial' not in categories and 'tutorials' not in categories and any('tutorial' in p['url'] for p in pages):
|
||||
categories['tutorials'] = ['tutorial', 'guide', 'getting-started']
|
||||
|
||||
if 'api' not in categories and any('api' in p['url'] or 'reference' in p['url'] for p in pages):
|
||||
categories['api'] = ['api', 'reference', 'class']
|
||||
|
||||
return categories
|
||||
|
||||
def generate_quick_reference(self, pages):
|
||||
"""Generate quick reference from common patterns (NEW FEATURE)"""
|
||||
quick_ref = []
|
||||
|
||||
# Collect all patterns
|
||||
all_patterns = []
|
||||
for page in pages:
|
||||
all_patterns.extend(page.get('patterns', []))
|
||||
|
||||
# Keep the first 15 unique patterns shorter than 300 characters
|
||||
seen_codes = set()
|
||||
for pattern in all_patterns:
|
||||
code = pattern['code']
|
||||
if code not in seen_codes and len(code) < 300:
|
||||
quick_ref.append(pattern)
|
||||
seen_codes.add(code)
|
||||
if len(quick_ref) >= 15:
|
||||
break
|
||||
|
||||
return quick_ref
|
||||
|
||||
def create_reference_file(self, category, pages):
|
||||
"""Create enhanced reference file"""
|
||||
if not pages:
|
||||
return
|
||||
|
||||
lines = []
|
||||
lines.append(f"# {self.name.title()} - {category.replace('_', ' ').title()}\n")
|
||||
lines.append(f"**Pages:** {len(pages)}\n")
|
||||
lines.append("---\n")
|
||||
|
||||
for page in pages:
|
||||
lines.append(f"## {page['title']}\n")
|
||||
lines.append(f"**URL:** {page['url']}\n")
|
||||
|
||||
# Table of contents from headings
|
||||
if page.get('headings'):
|
||||
lines.append("**Contents:**")
|
||||
for h in page['headings'][:10]:
|
||||
level = int(h['level'][1]) if len(h['level']) > 1 else 1
|
||||
indent = " " * max(0, level - 2)
|
||||
lines.append(f"{indent}- {h['text']}")
|
||||
lines.append("")
|
||||
|
||||
# Content
|
||||
if page.get('content'):
|
||||
content = page['content'][:2500]
|
||||
if len(page['content']) > 2500:
|
||||
content += "\n\n*[Content truncated]*"
|
||||
lines.append(content)
|
||||
lines.append("")
|
||||
|
||||
# Code examples with language
|
||||
if page.get('code_samples'):
|
||||
lines.append("**Examples:**\n")
|
||||
for i, sample in enumerate(page['code_samples'][:4], 1):
|
||||
lang = sample.get('language', 'unknown') if isinstance(sample, dict) else 'unknown'
|
||||
code = sample.get('code', '') if isinstance(sample, dict) else str(sample)
|
||||
lines.append(f"Example {i} ({lang}):")
|
||||
lines.append(f"```{lang}")
|
||||
lines.append(code[:600])
|
||||
if len(code) > 600:
|
||||
lines.append("...")
|
||||
lines.append("```\n")
|
||||
|
||||
lines.append("---\n")
|
||||
|
||||
filepath = os.path.join(self.skill_dir, "references", f"{category}.md")
|
||||
with open(filepath, 'w', encoding='utf-8') as f:
|
||||
f.write('\n'.join(lines))
|
||||
|
||||
print(f" ✓ {category}.md ({len(pages)} pages)")
|
||||
|
||||
def create_enhanced_skill_md(self, categories, quick_ref):
|
||||
"""Create SKILL.md with actual examples (IMPROVED)"""
|
||||
description = self.config.get('description', f'Comprehensive assistance with {self.name}')
|
||||
|
||||
# Extract actual code examples from docs
|
||||
example_codes = []
|
||||
for pages in categories.values():
|
||||
for page in pages[:3]: # First 3 pages per category
|
||||
for sample in page.get('code_samples', [])[:2]: # First 2 samples per page
|
||||
code = sample.get('code', '') if isinstance(sample, dict) else str(sample)
|
||||
lang = sample.get('language', 'unknown') if isinstance(sample, dict) else 'unknown'
|
||||
if len(code) < 200 and lang != 'unknown':
|
||||
example_codes.append((lang, code))
|
||||
if len(example_codes) >= 10:
|
||||
break
|
||||
if len(example_codes) >= 10:
|
||||
break
|
||||
if len(example_codes) >= 10:
|
||||
break
|
||||
|
||||
content = f"""---
|
||||
name: {self.name}
|
||||
description: {description}
|
||||
---
|
||||
|
||||
# {self.name.title()} Skill
|
||||
|
||||
Comprehensive assistance with {self.name} development, generated from official documentation.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
This skill should be triggered when:
|
||||
- Working with {self.name}
|
||||
- Asking about {self.name} features or APIs
|
||||
- Implementing {self.name} solutions
|
||||
- Debugging {self.name} code
|
||||
- Learning {self.name} best practices
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Common Patterns
|
||||
|
||||
"""
|
||||
|
||||
# Add actual quick reference patterns
|
||||
if quick_ref:
|
||||
for i, pattern in enumerate(quick_ref[:8], 1):
|
||||
content += f"**Pattern {i}:** {pattern.get('description', 'Example pattern')}\n\n"
|
||||
content += "```\n"
|
||||
content += pattern.get('code', '')[:300]
|
||||
content += "\n```\n\n"
|
||||
else:
|
||||
content += "*Quick reference patterns will be added as you use the skill.*\n\n"
|
||||
|
||||
# Add example codes from docs
|
||||
if example_codes:
|
||||
content += "### Example Code Patterns\n\n"
|
||||
for i, (lang, code) in enumerate(example_codes[:5], 1):
|
||||
content += f"**Example {i}** ({lang}):\n```{lang}\n{code}\n```\n\n"
|
||||
|
||||
content += f"""## Reference Files
|
||||
|
||||
This skill includes comprehensive documentation in `references/`:
|
||||
|
||||
"""
|
||||
|
||||
for cat in sorted(categories.keys()):
|
||||
content += f"- **{cat}.md** - {cat.replace('_', ' ').title()} documentation\n"
|
||||
|
||||
content += """
|
||||
Use `view` to read specific reference files when detailed information is needed.
|
||||
|
||||
## Working with This Skill
|
||||
|
||||
### For Beginners
|
||||
Start with the getting_started or tutorials reference files for foundational concepts.
|
||||
|
||||
### For Specific Features
|
||||
Use the appropriate category reference file (api, guides, etc.) for detailed information.
|
||||
|
||||
### For Code Examples
|
||||
The quick reference section above contains common patterns extracted from the official docs.
|
||||
|
||||
## Resources
|
||||
|
||||
### references/
|
||||
Organized documentation extracted from official sources. These files contain:
|
||||
- Detailed explanations
|
||||
- Code examples with language annotations
|
||||
- Links to original documentation
|
||||
- Table of contents for quick navigation
|
||||
|
||||
### scripts/
|
||||
Add helper scripts here for common automation tasks.
|
||||
|
||||
### assets/
|
||||
Add templates, boilerplate, or example projects here.
|
||||
|
||||
## Notes
|
||||
|
||||
- This skill was automatically generated from official documentation
|
||||
- Reference files preserve the structure and examples from source docs
|
||||
- Code examples include language detection for better syntax highlighting
|
||||
- Quick reference patterns are extracted from common usage examples in the docs
|
||||
|
||||
## Updating
|
||||
|
||||
To refresh this skill with updated documentation:
|
||||
1. Re-run the scraper with the same configuration
|
||||
2. The skill will be rebuilt with the latest information
|
||||
"""
|
||||
|
||||
filepath = os.path.join(self.skill_dir, "SKILL.md")
|
||||
with open(filepath, 'w', encoding='utf-8') as f:
|
||||
f.write(content)
|
||||
|
||||
print(f" ✓ SKILL.md (enhanced with {len(example_codes)} examples)")
|
||||
|
||||
def create_index(self, categories):
|
||||
"""Create navigation index"""
|
||||
lines = []
|
||||
lines.append(f"# {self.name.title()} Documentation Index\n")
|
||||
lines.append("## Categories\n")
|
||||
|
||||
for cat, pages in sorted(categories.items()):
|
||||
lines.append(f"### {cat.replace('_', ' ').title()}")
|
||||
lines.append(f"**File:** `{cat}.md`")
|
||||
lines.append(f"**Pages:** {len(pages)}\n")
|
||||
|
||||
filepath = os.path.join(self.skill_dir, "references", "index.md")
|
||||
with open(filepath, 'w', encoding='utf-8') as f:
|
||||
f.write('\n'.join(lines))
|
||||
|
||||
print(" ✓ index.md")
|
||||
|
||||
def build_skill(self):
|
||||
"""Build the skill from scraped data"""
|
||||
print(f"\n{'='*60}")
|
||||
print(f"BUILDING SKILL: {self.name}")
|
||||
print(f"{'='*60}\n")
|
||||
|
||||
# Load data
|
||||
print("Loading scraped data...")
|
||||
pages = self.load_scraped_data()
|
||||
|
||||
if not pages:
|
||||
print("✗ No scraped data found!")
|
||||
return False
|
||||
|
||||
print(f" ✓ Loaded {len(pages)} pages\n")
|
||||
|
||||
# Categorize
|
||||
print("Categorizing pages...")
|
||||
categories = self.smart_categorize(pages)
|
||||
print(f" ✓ Created {len(categories)} categories\n")
|
||||
|
||||
# Generate quick reference
|
||||
print("Generating quick reference...")
|
||||
quick_ref = self.generate_quick_reference(pages)
|
||||
print(f" ✓ Extracted {len(quick_ref)} patterns\n")
|
||||
|
||||
# Create reference files
|
||||
print("Creating reference files...")
|
||||
for cat, cat_pages in categories.items():
|
||||
self.create_reference_file(cat, cat_pages)
|
||||
|
||||
# Create index
|
||||
self.create_index(categories)
|
||||
print()
|
||||
|
||||
# Create enhanced SKILL.md
|
||||
print("Creating SKILL.md...")
|
||||
self.create_enhanced_skill_md(categories, quick_ref)
|
||||
|
||||
print(f"\n✅ Skill built: {self.skill_dir}/")
|
||||
return True
|
||||
|
||||
|
||||
def load_config(config_path):
|
||||
"""Load configuration from file"""
|
||||
with open(config_path, 'r', encoding='utf-8') as f:
|
||||
return json.load(f)
|
||||
|
||||
|
||||
def interactive_config():
|
||||
"""Interactive configuration"""
|
||||
print("\n" + "="*60)
|
||||
print("Documentation to Skill Converter")
|
||||
print("="*60 + "\n")
|
||||
|
||||
config = {}
|
||||
|
||||
# Basic info
|
||||
config['name'] = input("Skill name (e.g., 'react', 'godot'): ").strip()
|
||||
config['description'] = input("Skill description: ").strip()
|
||||
config['base_url'] = input("Base URL (e.g., https://docs.example.com/): ").strip()
|
||||
|
||||
if not config['base_url'].endswith('/'):
|
||||
config['base_url'] += '/'
|
||||
|
||||
# Selectors
|
||||
print("\nCSS Selectors (press Enter for defaults):")
|
||||
selectors = {}
|
||||
selectors['main_content'] = input(" Main content [div[role='main']]: ").strip() or "div[role='main']"
|
||||
selectors['title'] = input(" Title [title]: ").strip() or "title"
|
||||
selectors['code_blocks'] = input(" Code blocks [pre code]: ").strip() or "pre code"
|
||||
config['selectors'] = selectors
|
||||
|
||||
# URL patterns
|
||||
print("\nURL Patterns (comma-separated, optional):")
|
||||
include = input(" Include: ").strip()
|
||||
exclude = input(" Exclude: ").strip()
|
||||
config['url_patterns'] = {
|
||||
'include': [p.strip() for p in include.split(',') if p.strip()],
|
||||
'exclude': [p.strip() for p in exclude.split(',') if p.strip()]
|
||||
}
|
||||
|
||||
# Settings
|
||||
rate = input("\nRate limit (seconds) [0.5]: ").strip()
|
||||
config['rate_limit'] = float(rate) if rate else 0.5
|
||||
|
||||
max_p = input("Max pages [500]: ").strip()
|
||||
config['max_pages'] = int(max_p) if max_p else 500
|
||||
|
||||
return config
|
||||
|
||||
|
||||
def check_existing_data(name):
|
||||
"""Check if scraped data already exists"""
|
||||
data_dir = f"output/{name}_data"
|
||||
if os.path.exists(data_dir) and os.path.exists(f"{data_dir}/summary.json"):
|
||||
with open(f"{data_dir}/summary.json", 'r', encoding='utf-8') as f:
|
||||
summary = json.load(f)
|
||||
return True, summary.get('total_pages', 0)
|
||||
return False, 0
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description='Convert documentation websites to Claude skills',
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter
|
||||
)
|
||||
|
||||
parser.add_argument('--interactive', '-i', action='store_true',
|
||||
help='Interactive configuration mode')
|
||||
parser.add_argument('--config', '-c', type=str,
|
||||
help='Load configuration from file (e.g., configs/godot.json)')
|
||||
parser.add_argument('--name', type=str,
|
||||
help='Skill name')
|
||||
parser.add_argument('--url', type=str,
|
||||
help='Base documentation URL')
|
||||
parser.add_argument('--description', '-d', type=str,
|
||||
help='Skill description')
|
||||
parser.add_argument('--skip-scrape', action='store_true',
|
||||
help='Skip scraping, use existing data')
|
||||
parser.add_argument('--enhance', action='store_true',
|
||||
help='Enhance SKILL.md using Claude API after building (requires API key)')
|
||||
parser.add_argument('--enhance-local', action='store_true',
|
||||
help='Enhance SKILL.md using Claude Code in new terminal (no API key needed)')
|
||||
parser.add_argument('--api-key', type=str,
|
||||
help='Anthropic API key for --enhance (or set ANTHROPIC_API_KEY)')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Get configuration
|
||||
if args.config:
|
||||
config = load_config(args.config)
|
||||
elif args.interactive or not (args.name and args.url):
|
||||
config = interactive_config()
|
||||
else:
|
||||
config = {
|
||||
'name': args.name,
|
||||
'description': args.description or f'Comprehensive assistance with {args.name}',
|
||||
'base_url': args.url,
|
||||
'selectors': {
|
||||
'main_content': "div[role='main']",
|
||||
'title': 'title',
|
||||
'code_blocks': 'pre code'
|
||||
},
|
||||
'url_patterns': {'include': [], 'exclude': []},
|
||||
'rate_limit': 0.5,
|
||||
'max_pages': 500
|
||||
}
|
||||
|
||||
# Check for existing data
|
||||
exists, page_count = check_existing_data(config['name'])
|
||||
|
||||
if exists and not args.skip_scrape:
|
||||
print(f"\n✓ Found existing data: {page_count} pages")
|
||||
response = input("Use existing data? (y/n): ").strip().lower()
|
||||
if response == 'y':
|
||||
args.skip_scrape = True
|
||||
|
||||
# Create converter
|
||||
converter = DocToSkillConverter(config)
|
||||
|
||||
# Scrape or skip
|
||||
if not args.skip_scrape:
|
||||
try:
|
||||
converter.scrape_all()
|
||||
except KeyboardInterrupt:
|
||||
print("\n\nScraping interrupted.")
|
||||
response = input("Continue with skill building? (y/n): ").strip().lower()
|
||||
if response != 'y':
|
||||
return
|
||||
else:
|
||||
print(f"\n⏭️ Skipping scrape, using existing data")
|
||||
|
||||
# Build skill
|
||||
success = converter.build_skill()
|
||||
|
||||
if not success:
|
||||
sys.exit(1)
|
||||
|
||||
# Optional enhancement with Claude API
|
||||
if args.enhance:
|
||||
print(f"\n{'='*60}")
|
||||
print(f"ENHANCING SKILL.MD WITH CLAUDE API")
|
||||
print(f"{'='*60}\n")
|
||||
|
||||
try:
|
||||
import subprocess
|
||||
enhance_cmd = ['python3', 'enhance_skill.py', f'output/{config["name"]}/']
|
||||
if args.api_key:
|
||||
enhance_cmd.extend(['--api-key', args.api_key])
|
||||
|
||||
result = subprocess.run(enhance_cmd, check=True)
|
||||
if result.returncode == 0:
|
||||
print("\n✅ Enhancement complete!")
|
||||
except subprocess.CalledProcessError:
|
||||
print("\n⚠ Enhancement failed, but skill was still built")
|
||||
except FileNotFoundError:
|
||||
print("\n⚠ enhance_skill.py not found. Run manually:")
|
||||
print(f" python3 enhance_skill.py output/{config['name']}/")
|
||||
|
||||
# Optional enhancement with Claude Code (local, no API key)
|
||||
if args.enhance_local:
|
||||
print(f"\n{'='*60}")
|
||||
print(f"ENHANCING SKILL.MD WITH CLAUDE CODE (LOCAL)")
|
||||
print(f"{'='*60}\n")
|
||||
|
||||
try:
|
||||
import subprocess
|
||||
enhance_cmd = ['python3', 'enhance_skill_local.py', f'output/{config["name"]}/']
|
||||
subprocess.run(enhance_cmd, check=True)
|
||||
except subprocess.CalledProcessError:
|
||||
print("\n⚠ Enhancement failed, but skill was still built")
|
||||
except FileNotFoundError:
|
||||
print("\n⚠ enhance_skill_local.py not found. Run manually:")
|
||||
print(f" python3 enhance_skill_local.py output/{config['name']}/")
|
||||
|
||||
print(f"\n📦 Package your skill:")
|
||||
print(f" python3 /mnt/skills/examples/skill-creator/scripts/package_skill.py output/{config['name']}/")
|
||||
|
||||
if not args.enhance and not args.enhance_local:
|
||||
print(f"\n💡 Optional: Enhance SKILL.md with Claude:")
|
||||
print(f" API-based: python3 enhance_skill.py output/{config['name']}/")
|
||||
print(f" or re-run with: --enhance")
|
||||
print(f" Local (no API key): python3 enhance_skill_local.py output/{config['name']}/")
|
||||
print(f" or re-run with: --enhance-local")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
239
docs/CLAUDE.md
Normal file
@@ -0,0 +1,239 @@
|
||||
# CLAUDE.md
|
||||
|
||||
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
||||
|
||||
## Overview
|
||||
|
||||
This is a Python-based documentation scraper that converts ANY documentation website into a Claude skill. It's a single-file tool (`doc_scraper.py`) that scrapes documentation, extracts code patterns, detects programming languages, and generates structured skill files ready for use with Claude.
|
||||
|
||||
## Dependencies
|
||||
|
||||
```bash
|
||||
pip3 install requests beautifulsoup4
|
||||
```
|
||||
|
||||
## Core Commands
|
||||
|
||||
### Run with a preset configuration
|
||||
```bash
|
||||
python3 doc_scraper.py --config configs/godot.json
|
||||
python3 doc_scraper.py --config configs/react.json
|
||||
python3 doc_scraper.py --config configs/vue.json
|
||||
python3 doc_scraper.py --config configs/django.json
|
||||
python3 doc_scraper.py --config configs/fastapi.json
|
||||
```
|
||||
|
||||
### Interactive mode (for new frameworks)
|
||||
```bash
|
||||
python3 doc_scraper.py --interactive
|
||||
```
|
||||
|
||||
### Quick mode (minimal config)
|
||||
```bash
|
||||
python3 doc_scraper.py --name react --url https://react.dev/ --description "React framework"
|
||||
```
|
||||
|
||||
### Skip scraping (use cached data)
|
||||
```bash
|
||||
python3 doc_scraper.py --config configs/godot.json --skip-scrape
|
||||
```
|
||||
|
||||
### AI-powered SKILL.md enhancement
|
||||
```bash
|
||||
# Option 1: During scraping (API-based, requires ANTHROPIC_API_KEY)
|
||||
pip3 install anthropic
|
||||
export ANTHROPIC_API_KEY=sk-ant-...
|
||||
python3 doc_scraper.py --config configs/react.json --enhance
|
||||
|
||||
# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
|
||||
python3 doc_scraper.py --config configs/react.json --enhance-local
|
||||
|
||||
# Option 3: Standalone after scraping (API-based)
|
||||
python3 enhance_skill.py output/react/
|
||||
|
||||
# Option 4: Standalone after scraping (LOCAL, no API key)
|
||||
python3 enhance_skill_local.py output/react/
|
||||
```
|
||||
|
||||
The LOCAL enhancement option (`--enhance-local` or `enhance_skill_local.py`) opens a new terminal with Claude Code, which analyzes the reference files and enhances SKILL.md automatically. This requires a Claude Code Max plan but no API key.
|
||||
|
||||
### Test with limited pages (edit config first)
|
||||
Set `"max_pages": 20` in the config file to test with fewer pages.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Single-File Design
|
||||
The entire tool is contained in `doc_scraper.py` (~737 lines). It follows a class-based architecture with a single `DocToSkillConverter` class that handles:
|
||||
- **Web scraping**: BFS traversal with URL validation
|
||||
- **Content extraction**: CSS selectors for title, content, code blocks
|
||||
- **Language detection**: Heuristic-based detection from code samples (Python, JavaScript, GDScript, C++, etc.)
|
||||
- **Pattern extraction**: Identifies common coding patterns from documentation
|
||||
- **Categorization**: Smart categorization using URL structure, page titles, and content keywords with scoring
|
||||
- **Skill generation**: Creates SKILL.md with real code examples and categorized reference files
|
||||
|
||||
### Data Flow
|
||||
1. **Scrape Phase**:
|
||||
- Input: Config JSON (name, base_url, selectors, url_patterns, categories, rate_limit, max_pages)
|
||||
- Process: BFS traversal starting from base_url, respecting include/exclude patterns
|
||||
- Output: `output/{name}_data/pages/*.json` + `summary.json`
|
||||
|
||||
2. **Build Phase**:
|
||||
- Input: Scraped JSON data from `output/{name}_data/`
|
||||
- Process: Load pages → Smart categorize → Extract patterns → Generate references
|
||||
- Output: `output/{name}/SKILL.md` + `output/{name}/references/*.md`
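The scrape phase described above can be sketched as a breadth-first traversal over a queue of pending URLs. This is a minimal illustration under stated assumptions, not the tool's actual code: `crawl`, the `fetch_links` callback, and the in-memory `site` map are hypothetical stand-ins for the HTTP fetch and link extraction, and the filter mirrors the substring-based include/exclude checks.

```python
from collections import deque


def is_valid_url(url, base_url, include=(), exclude=()):
    """Substring-based URL filter, in the spirit of `url_patterns`."""
    if not url.startswith(base_url):
        return False
    if include and not any(p in url for p in include):
        return False
    return not any(p in url for p in exclude)


def crawl(base_url, fetch_links, include=(), exclude=(), max_pages=500):
    """Breadth-first traversal from `base_url`.

    `fetch_links(url)` is a hypothetical callback returning the absolute
    links found on a page; the real tool fetches and parses HTML instead.
    """
    pending = deque([base_url])
    seen = {base_url}   # everything ever queued, to avoid duplicates
    visited = []        # pages actually processed, in visit order
    while pending and len(visited) < max_pages:
        url = pending.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen and is_valid_url(link, base_url, include, exclude):
                seen.add(link)
                pending.append(link)
    return visited


# Hypothetical three-page site used only for illustration.
site = {
    "https://docs.example.com/": [
        "https://docs.example.com/a",
        "https://other.example.org/",
    ],
    "https://docs.example.com/a": [
        "https://docs.example.com/b",
        "https://docs.example.com/",
    ],
    "https://docs.example.com/b": [],
}
order = crawl("https://docs.example.com/", lambda u: site.get(u, []))
print(order)
```

Pages are visited in the order they were discovered, and links outside `base_url` are never queued.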
|
||||
|
||||
### Directory Structure
|
||||
```
|
||||
doc-to-skill/
|
||||
├── doc_scraper.py # Main scraping & building tool
|
||||
├── enhance_skill.py # AI enhancement (API-based)
|
||||
├── enhance_skill_local.py # AI enhancement (LOCAL, no API)
|
||||
├── configs/ # Preset configurations
|
||||
│ ├── godot.json
|
||||
│ ├── react.json
|
||||
│ ├── steam-inventory.json
|
||||
│ └── ...
|
||||
└── output/
|
||||
├── {name}_data/ # Raw scraped data (cached)
|
||||
│ ├── pages/ # Individual page JSONs
|
||||
│ └── summary.json # Scraping summary
|
||||
└── {name}/ # Generated skill
|
||||
├── SKILL.md # Main skill file with examples
|
||||
├── SKILL.md.backup # Backup (if enhanced)
|
||||
├── references/ # Categorized documentation
|
||||
│ ├── index.md
|
||||
│ ├── getting_started.md
|
||||
│ ├── api.md
|
||||
│ └── ...
|
||||
├── scripts/ # Empty (for user scripts)
|
||||
└── assets/ # Empty (for user assets)
|
||||
```
|
||||
|
||||
### Configuration Format
|
||||
Config files in `configs/*.json` contain:
|
||||
- `name`: Skill identifier (e.g., "godot", "react")
|
||||
- `description`: When to use this skill
|
||||
- `base_url`: Starting URL for scraping
|
||||
- `selectors`: CSS selectors for content extraction
|
||||
- `main_content`: Main documentation content (e.g., "article", "div[role='main']")
|
||||
- `title`: Page title selector
|
||||
- `code_blocks`: Code sample selector (e.g., "pre code", "pre")
|
||||
- `url_patterns`: URL filtering
|
||||
- `include`: Only scrape URLs containing these patterns
|
||||
- `exclude`: Skip URLs containing these patterns
|
||||
- `categories`: Keyword-based categorization mapping
|
||||
- `rate_limit`: Delay between requests (seconds)
|
||||
- `max_pages`: Maximum pages to scrape
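As an illustration of the fields listed above, a config can be assembled as a Python dictionary and serialized to JSON. Every value here (the `myframework` name, the URL, the selectors, the category keywords) is a hypothetical placeholder rather than one of the shipped presets.

```python
import json

# Hypothetical config; adjust every value for the real documentation site.
config = {
    "name": "myframework",
    "description": "Use when working with MyFramework APIs and guides",
    "base_url": "https://docs.example.com/",
    "selectors": {
        "main_content": "article",
        "title": "title",
        "code_blocks": "pre code",
    },
    "url_patterns": {"include": [], "exclude": ["/blog/"]},
    "categories": {
        "getting_started": ["intro", "quickstart"],
        "api": ["api", "reference"],
    },
    "rate_limit": 0.5,
    "max_pages": 20,  # small value for a quick test run
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

Saving the printed JSON under `configs/` and passing it with `--config` should then behave like any other preset.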
|
||||
|
||||
### Key Features
|
||||
|
||||
**Auto-detect existing data**: The tool checks for `output/{name}_data/summary.json` and, if it exists, asks whether to reuse the cached pages instead of re-scraping.
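The cache check can be sketched roughly as follows. The `output_root` parameter is an assumption added so the example can run against a temporary directory; the tool itself looks under `output/`.

```python
import json
import tempfile
from pathlib import Path


def check_existing_data(name, output_root="output"):
    """Return (exists, page_count) for a previous scrape of `name`."""
    summary_path = Path(output_root) / f"{name}_data" / "summary.json"
    if not summary_path.is_file():
        return False, 0
    with open(summary_path, "r", encoding="utf-8") as f:
        summary = json.load(f)
    return True, summary.get("total_pages", 0)


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    missing = check_existing_data("godot", output_root=root)
    data_dir = Path(root) / "godot_data"
    data_dir.mkdir()
    (data_dir / "summary.json").write_text(
        json.dumps({"total_pages": 42}), encoding="utf-8"
    )
    found = check_existing_data("godot", output_root=root)

print(missing, found)
```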
|
||||
|
||||
**Language detection**: Detects code languages from:
|
||||
1. CSS class attributes (`language-*`, `lang-*`) on the code element, or `language-*` on its parent `<pre>`
|
||||
2. Heuristics (keywords like `def`, `const`, `func`, etc.)
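The two-stage detection can be sketched as a standalone function. This is a simplified illustration: it takes a plain list of class names instead of a parsed HTML element, and matching the class by prefix is an assumption of this sketch.

```python
def detect_language(classes, code):
    """Guess a language from CSS class names, then from keyword heuristics."""
    # Stage 1: explicit hints such as class="language-python" or "lang-js".
    for cls in classes:
        for prefix in ("language-", "lang-"):
            if cls.startswith(prefix):
                return cls[len(prefix):]

    # Stage 2: keyword heuristics, checked in a fixed order.
    if "import " in code and "from " in code:
        return "python"
    if "const " in code or "let " in code or "=>" in code:
        return "javascript"
    if "func " in code and "var " in code:
        return "gdscript"
    if "def " in code and ":" in code:
        return "python"
    if "#include" in code or "int main" in code:
        return "cpp"
    return "unknown"


print(detect_language(["language-rust"], "fn main() {}"))
print(detect_language([], "func _ready():\n    var x = 1"))
print(detect_language([], "plain text"))
```

Because the heuristics are order-dependent substring checks, they can misfire on short or mixed-language snippets, so an explicit class hint is the more reliable signal.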
|
||||
|
||||
**Pattern extraction**: Looks for "Example:", "Pattern:", "Usage:", or "typical use" markers in paragraph and `div` text and extracts the code block that follows each marker (up to 5 per page).
|
||||
|
||||
**Smart categorization**:
|
||||
- Scores pages against category keywords (3 points for URL match, 2 for title, 1 for content)
|
||||
- A page is assigned to the first category whose total score reaches 2 or more
|
||||
- Auto-infers categories from URL segments if none provided
|
||||
- Falls back to "other" category
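The scoring rule above can be sketched as a small function. The `categorize` name, the `threshold` parameter, and the sample pages are hypothetical; the weights (3 for URL, 2 for title, 1 for content) and the first-match behaviour follow the description above.

```python
def categorize(page, category_defs, threshold=2):
    """Return the first category whose keyword score reaches the threshold."""
    url = page["url"].lower()
    title = page["title"].lower()
    content = page.get("content", "").lower()[:500]  # only the first 500 chars

    for category, keywords in category_defs.items():
        score = 0
        for keyword in (k.lower() for k in keywords):
            if keyword in url:
                score += 3
            if keyword in title:
                score += 2
            if keyword in content:
                score += 1
        if score >= threshold:
            return category
    return "other"


category_defs = {"api": ["api", "reference"], "tutorials": ["tutorial", "guide"]}
pages = [
    {"url": "https://docs.example.com/api/node", "title": "Node", "content": ""},
    {"url": "https://docs.example.com/x", "title": "Beginner Guide", "content": ""},
    {"url": "https://docs.example.com/y", "title": "Misc", "content": "see the api"},
]
results = [categorize(p, category_defs) for p in pages]
print(results)
```

A keyword that appears only in the page content scores 1, which is below the threshold, so the third page falls through to "other".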
|
||||
|
||||
**Enhanced SKILL.md**: Generated with:
|
||||
- Real code examples from documentation (language-annotated)
|
||||
- Quick reference patterns extracted from docs
|
||||
- Common pattern section
|
||||
- Category file listings
|
||||
|
||||
**AI-Powered Enhancement**: Two scripts to dramatically improve SKILL.md quality:
|
||||
- `enhance_skill.py`: Uses Anthropic API (~$0.15-$0.30 per skill, requires API key)
|
||||
- `enhance_skill_local.py`: Uses Claude Code Max (free, no API key needed)
|
||||
- Transforms generic 75-line templates into comprehensive 500+ line guides
|
||||
- Extracts best examples, explains key concepts, adds navigation guidance
|
||||
- Output quality: rated 9/10 (based on the steam-economy test)
|
||||
|
||||
## Key Code Locations
|
||||
|
||||
- **URL validation**: `is_valid_url()` doc_scraper.py:47-62
|
||||
- **Content extraction**: `extract_content()` doc_scraper.py:64-131
|
||||
- **Language detection**: `detect_language()` doc_scraper.py:133-163
|
||||
- **Pattern extraction**: `extract_patterns()` doc_scraper.py:165-181
|
||||
- **Smart categorization**: `smart_categorize()` doc_scraper.py:280-321
|
||||
- **Category inference**: `infer_categories()` doc_scraper.py:323-349
|
||||
- **Quick reference generation**: `generate_quick_reference()` doc_scraper.py:351-370
|
||||
- **SKILL.md generation**: `create_enhanced_skill_md()` doc_scraper.py:424-540
|
||||
- **Scraping loop**: `scrape_all()` doc_scraper.py:226-249
|
||||
- **Main workflow**: `main()` doc_scraper.py:661-733
|
||||
|
||||
## Workflow Examples
|
||||
|
||||
### First-time run (scrape + build)
|
||||
```bash
|
||||
# 1. Scrape + Build
|
||||
python3 doc_scraper.py --config configs/godot.json
|
||||
# Time: 20-40 minutes
|
||||
|
||||
# 2. Package (assuming skill-creator is available)
|
||||
python3 package_skill.py output/godot/
|
||||
|
||||
# Result: godot.zip
|
||||
```
|
||||
|
||||
### Using cached data (fast iteration)
|
||||
```bash
|
||||
# 1. Use existing data
|
||||
python3 doc_scraper.py --config configs/godot.json --skip-scrape
|
||||
# Time: 1-3 minutes
|
||||
|
||||
# 2. Package
|
||||
python3 package_skill.py output/godot/
|
||||
```
|
||||
|
||||
### Creating a new framework config
|
||||
```bash
|
||||
# Option 1: Interactive
|
||||
python3 doc_scraper.py --interactive
|
||||
|
||||
# Option 2: Copy and modify
|
||||
cp configs/react.json configs/myframework.json
|
||||
# Edit configs/myframework.json
|
||||
python3 doc_scraper.py --config configs/myframework.json
|
||||
```
|
||||
|
||||
## Testing Selectors
|
||||
|
||||
To find the right CSS selectors for a documentation site:
|
||||
|
||||
```python
|
||||
from bs4 import BeautifulSoup
|
||||
import requests
|
||||
|
||||
url = "https://docs.example.com/page"
|
||||
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
|
||||
|
||||
# Try different selectors
|
||||
print(soup.select_one('article'))
|
||||
print(soup.select_one('main'))
|
||||
print(soup.select_one('div[role="main"]'))
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
**No content extracted**: Check `main_content` selector. Common values: `article`, `main`, `div[role="main"]`, `div.content`
|
||||
|
||||
**Poor categorization**: Edit `categories` section in config with better keywords specific to the documentation structure
|
||||
|
||||
**Force re-scrape**: Delete cached data with `rm -rf output/{name}_data/`
|
||||
|
||||
**Rate limiting issues**: Increase `rate_limit` value in config (e.g., from 0.5 to 1.0 seconds)
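
If you would rather script the rate-limit change than edit the file by hand, a sketch along these lines may help. It assumes `rate_limit` is a top-level key in the config JSON holding the delay in seconds; the helper name is hypothetical.

```python
import json
from pathlib import Path


def raise_rate_limit(config_path, new_delay=1.0):
    """Set `rate_limit` (assumed: seconds between requests) to at least `new_delay`."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    current = float(config.get("rate_limit", 0.0))
    # Never lower an existing, more conservative delay.
    config["rate_limit"] = max(current, new_delay)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config["rate_limit"]
```

For example, `raise_rate_limit("configs/godot.json", 1.0)` would leave every other key in the config untouched.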
|
||||
|
||||
## Output Quality Checks
|
||||
|
||||
After building, verify quality:
|
||||
```bash
|
||||
cat output/godot/SKILL.md # Should have real code examples
|
||||
cat output/godot/references/index.md # Should show categories
|
||||
ls output/godot/references/ # Should have category .md files
|
||||
```
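
The same checks can be automated. The sketch below is an assumption about what "good output" means, based only on the three commands above; `check_skill_output` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def check_skill_output(skill_dir):
    """Return a list of human-readable problems found in a built skill directory."""
    skill_dir = Path(skill_dir)
    problems = []

    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        problems.append("SKILL.md is missing")
    else:
        text = skill_md.read_text(encoding="utf-8")
        # A language-annotated fence is three backticks followed by a language name.
        pattern = "^" + re.escape(FENCE) + r"\w+"
        if not re.search(pattern, text, flags=re.MULTILINE):
            problems.append("SKILL.md has no language-annotated code examples")

    refs = skill_dir / "references"
    if not (refs / "index.md").is_file():
        problems.append("references/index.md is missing")
    category_files = []
    if refs.is_dir():
        category_files = [p for p in refs.glob("*.md") if p.name != "index.md"]
    if not category_files:
        problems.append("references/ has no category .md files")

    return problems
```

An empty list means all three checks passed; otherwise each string names one thing to look at.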
|
||||
250
docs/ENHANCEMENT.md
Normal file
@@ -0,0 +1,250 @@
|
||||
# AI-Powered SKILL.md Enhancement
|
||||
|
||||
Two scripts are available to dramatically improve your SKILL.md file:
|
||||
1. **`enhance_skill_local.py`** - Uses Claude Code Max (no API key, **recommended**)
|
||||
2. **`enhance_skill.py`** - Uses Anthropic API (~$0.15-$0.30 per skill)
|
||||
|
||||
Both analyze reference documentation and extract the best examples and guidance.
|
||||
|
||||
## Why Use Enhancement?
|
||||
|
||||
**Problem:** The auto-generated SKILL.md is often too generic:
|
||||
- Empty Quick Reference section
|
||||
- No practical code examples
|
||||
- Generic "When to Use" triggers
|
||||
- Doesn't highlight key features
|
||||
|
||||
**Solution:** Let Claude read your reference docs and create a much better SKILL.md with:
|
||||
- ✅ Best code examples extracted from documentation
|
||||
- ✅ Practical quick reference with real patterns
|
||||
- ✅ Domain-specific guidance
|
||||
- ✅ Clear navigation tips
|
||||
- ✅ Key concepts explained
|
||||
|
||||
## Quick Start (LOCAL - No API Key)
|
||||
|
||||
**Recommended for Claude Code Max users:**
|
||||
|
||||
```bash
|
||||
# Option 1: Standalone enhancement
|
||||
python3 enhance_skill_local.py output/steam-inventory/
|
||||
|
||||
# Option 2: Integrated with scraper
|
||||
python3 doc_scraper.py --config configs/steam-inventory.json --enhance-local
|
||||
```
|
||||
|
||||
**What happens:**
|
||||
1. Opens new terminal window
|
||||
2. Runs Claude Code with enhancement prompt
|
||||
3. Claude analyzes reference files (~15-20K chars)
|
||||
4. Generates enhanced SKILL.md (30-60 seconds)
|
||||
5. Terminal auto-closes when done
|
||||
|
||||
**Requirements:**
|
||||
- Claude Code Max plan (you're already using it!)
|
||||
- macOS for auto-launch; on other operating systems, run the command manually in a terminal
|
||||
|
||||
## API-Based Enhancement (Alternative)
|
||||
|
||||
**If you prefer the API-based approach:**
|
||||
|
||||
### Installation
|
||||
|
||||
```bash
|
||||
pip3 install anthropic
|
||||
```
|
||||
|
||||
### Set Up API Key
|
||||
|
||||
```bash
|
||||
# Option 1: Environment variable (recommended)
|
||||
export ANTHROPIC_API_KEY=sk-ant-...
|
||||
|
||||
# Option 2: Pass directly with --api-key
|
||||
python3 enhance_skill.py output/react/ --api-key sk-ant-...
|
||||
```
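
The two options are resolved in a fixed order: an explicit `--api-key` value wins, otherwise the environment variable is used. A small sketch of that precedence, modelled on the `api_key or os.environ.get('ANTHROPIC_API_KEY')` line in `enhance_skill.py` (the function name and the `env` parameter are hypothetical, added so the logic can be tested in isolation):

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Prefer the --api-key value; otherwise fall back to ANTHROPIC_API_KEY."""
    env = os.environ if env is None else env
    key = cli_value or env.get("ANTHROPIC_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided. Set ANTHROPIC_API_KEY or pass --api-key."
        )
    return key
```

Keeping the key in the environment avoids leaving it in shell history, which is one reason Option 1 is the recommended one.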
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
# Standalone enhancement
|
||||
python3 enhance_skill.py output/steam-inventory/
|
||||
|
||||
# Integrated with scraper
|
||||
python3 doc_scraper.py --config configs/steam-inventory.json --enhance
|
||||
|
||||
# Dry run (see what would be done)
|
||||
python3 enhance_skill.py output/react/ --dry-run
|
||||
```
|
||||
|
||||
## What It Does
|
||||
|
||||
1. **Reads reference files** (api_reference.md, webapi.md, etc.)
|
||||
2. **Sends to Claude** with instructions to:
|
||||
   - Extract 5-10 best code examples
   - Create practical quick reference
   - Write domain-specific "When to Use" triggers
   - Add helpful navigation guidance
|
||||
3. **Backs up original** SKILL.md to SKILL.md.backup
|
||||
4. **Saves enhanced version** as new SKILL.md
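
Steps 3 and 4 amount to a "move aside, then write" pattern. A minimal sketch, assuming the backup sits next to the original as `SKILL.md.backup`; `save_with_backup` is a hypothetical name, and the sketch uses `Path.replace` where the script itself uses `Path.rename`:

```python
from pathlib import Path


def save_with_backup(skill_md_path, new_content):
    """Move an existing file to <name>.md.backup, then write the new content."""
    path = Path(skill_md_path)
    backup = None
    if path.exists():
        backup = path.with_suffix(".md.backup")  # SKILL.md -> SKILL.md.backup
        path.replace(backup)  # overwrites any earlier backup
    path.write_text(new_content, encoding="utf-8")
    return backup
```

Note that in this sketch a second run replaces the earlier backup, so copy the original elsewhere first if you want to keep it across several runs.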
|
||||
|
||||
## Example Enhancement
|
||||
|
||||
### Before (Auto-Generated)
|
||||
```markdown
|
||||
## Quick Reference
|
||||
|
||||
### Common Patterns
|
||||
|
||||
*Quick reference patterns will be added as you use the skill.*
|
||||
```
|
||||
|
||||
### After (AI-Enhanced)
|
||||
````markdown
## Quick Reference

### Common API Patterns

**Granting promotional items:**
```cpp
void CInventory::GrantPromoItems()
{
    SteamItemDef_t newItems[2];
    newItems[0] = 110;
    newItems[1] = 111;
    SteamInventory()->AddPromoItems( &s_GenerateRequestResult, newItems, 2 );
}
```

**Getting all items in player inventory:**
```cpp
SteamInventoryResult_t resultHandle;
bool success = SteamInventory()->GetAllItems( &resultHandle );
```

[... 8 more practical examples ...]
````
|
||||
|
||||
## Cost Estimate
|
||||
|
||||
- **Input**: ~50,000-100,000 tokens (reference docs)
|
||||
- **Output**: ~4,000 tokens (enhanced SKILL.md)
|
||||
- **Model**: claude-sonnet-4-20250514
|
||||
- **Estimated cost**: $0.15-$0.30 per skill
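
The arithmetic behind an estimate like this can be sketched as follows. The per-million-token prices are assumptions used for illustration, not values taken from this project; check current Anthropic pricing before relying on them.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_mtok=3.00, output_price_per_mtok=15.00):
    """Return the estimated USD cost of one call (prices are assumed, per 1M tokens)."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


low = estimate_cost(50_000, 4_000)
high = estimate_cost(100_000, 4_000)
print(f"${low:.2f} - ${high:.2f} per skill")
```

With these assumed prices, the input side alone comes to $0.15-$0.30 for 50,000-100,000 tokens, and 4,000 output tokens add about $0.06.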
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "No API key provided"
|
||||
```bash
|
||||
export ANTHROPIC_API_KEY=sk-ant-...
|
||||
# or
|
||||
python3 enhance_skill.py output/react/ --api-key sk-ant-...
|
||||
```
|
||||
|
||||
### "No reference files found"
|
||||
Make sure you've run the scraper first:
|
||||
```bash
|
||||
python3 doc_scraper.py --config configs/react.json
|
||||
```
|
||||
|
||||
### "anthropic package not installed"
|
||||
```bash
|
||||
pip3 install anthropic
|
||||
```
|
||||
|
||||
### Don't like the result?
|
||||
```bash
|
||||
# Restore original
|
||||
mv output/steam-inventory/SKILL.md.backup output/steam-inventory/SKILL.md
|
||||
|
||||
# Try again (it may generate different content)
|
||||
python3 enhance_skill.py output/steam-inventory/
|
||||
```
|
||||
|
||||
## Tips
|
||||
|
||||
1. **Run after scraping completes** - Enhancement works best with complete reference docs
|
||||
2. **Review the output** - AI is good but not perfect, so check the generated SKILL.md
|
||||
3. **Keep the backup** - Original is saved as SKILL.md.backup
|
||||
4. **Re-run if needed** - Each run may produce slightly different results
|
||||
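5. **No re-scraping needed** - Reference files are local, so re-running enhancement does not fetch the documentation again (the Claude call itself still needs network access)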
|
||||
|
||||
## Real-World Results
|
||||
|
||||
**Test Case: steam-economy skill**
|
||||
- **Before:** 75 lines, generic template, empty Quick Reference
|
||||
- **After:** 570 lines, 10 practical API examples, key concepts explained
|
||||
- **Time:** 60 seconds
|
||||
- **Quality Rating:** 9/10
|
||||
|
||||
The LOCAL enhancement successfully:
|
||||
- Extracted best HTTP/JSON examples from 24 pages of documentation
|
||||
- Explained domain concepts (Asset Classes, Context IDs, Transaction Lifecycle)
|
||||
- Created navigation guidance for beginners through advanced users
|
||||
- Added best practices for security, economy design, and API integration
|
||||
|
||||
## Limitations
|
||||
|
||||
**LOCAL Enhancement (`enhance_skill_local.py`):**
|
||||
- Requires Claude Code Max plan
|
||||
- macOS auto-launch only (manual on other OS)
|
||||
- Opens new terminal window
|
||||
- Takes ~60 seconds
|
||||
|
||||
**API Enhancement (`enhance_skill.py`):**
|
||||
- Requires Anthropic API key (paid)
|
||||
- Cost: ~$0.15-$0.30 per skill
|
||||
- Limited to ~100K characters of reference input (`max_chars=100000` in `read_reference_files()`)
|
||||
|
||||
**Both:**
|
||||
- May occasionally miss the best examples
|
||||
- Can't understand context beyond the reference docs
|
||||
- Doesn't modify reference files (only SKILL.md)
|
||||
|
||||
## Enhancement Options Comparison
|
||||
|
||||
| Aspect | Manual Edit | LOCAL Enhancement | API Enhancement |
|
||||
|--------|-------------|-------------------|-----------------|
|
||||
| Time | 15-30 minutes | 30-60 seconds | 30-60 seconds |
|
||||
| Code examples | You pick | AI picks best | AI picks best |
|
||||
| Quick reference | Write yourself | Auto-generated | Auto-generated |
|
||||
| Domain guidance | Your knowledge | From docs | From docs |
|
||||
| Consistency | Varies | Consistent | Consistent |
|
||||
| Cost | Free (your time) | Free (Max plan) | ~$0.20 per skill |
|
||||
| Setup | None | None | API key needed |
|
||||
| Quality | High (if expert) | 9/10 | 9/10 |
|
||||
| **Recommended?** | For experts only | ✅ **Yes** | If no Max plan |
|
||||
|
||||
## When to Use
|
||||
|
||||
**Use enhancement when:**
|
||||
- You want high-quality SKILL.md quickly
|
||||
- Working with large documentation (50+ pages)
|
||||
- Creating skills for unfamiliar frameworks
|
||||
- Need practical code examples extracted
|
||||
- Want consistent quality across multiple skills
|
||||
|
||||
**Skip enhancement when:**
|
||||
- Budget constrained (use manual editing)
|
||||
- Very small documentation (<10 pages)
|
||||
- You know the framework intimately
|
||||
- Documentation has no code examples
|
||||
|
||||
## Advanced: Customization
|
||||
|
||||
To customize how Claude enhances the SKILL.md, edit `enhance_skill.py` and modify the `_build_enhancement_prompt()` method around line 130.
|
||||
|
||||
Example customization:
|
||||
```python
|
||||
prompt += """
|
||||
ADDITIONAL REQUIREMENTS:
|
||||
- Focus on security best practices
|
||||
- Include performance tips
|
||||
- Add troubleshooting section
|
||||
"""
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [README.md](../README.md) - Main documentation
|
||||
- [CLAUDE.md](CLAUDE.md) - Architecture guide
|
||||
- [doc_scraper.py](../doc_scraper.py) - Main scraping tool
|
||||
252
docs/UPLOAD_GUIDE.md
Normal file
@@ -0,0 +1,252 @@
|
||||
# How to Upload Skills to Claude
|
||||
|
||||
## Quick Answer
|
||||
|
||||
**You upload the `.zip` file created by `package_skill.py`**
|
||||
|
||||
```bash
|
||||
# Create the zip file
|
||||
python3 package_skill.py output/steam-economy/
|
||||
|
||||
# This creates: output/steam-economy.zip
|
||||
# Upload this file to Claude!
|
||||
```
|
||||
|
||||
## What's Inside the Zip?
|
||||
|
||||
The `.zip` file contains:
|
||||
|
||||
```
|
||||
steam-economy.zip
├── SKILL.md                 ← Main skill file (Claude reads this first)
└── references/              ← Reference documentation
    ├── index.md             ← Category index
    ├── api_reference.md     ← API docs
    ├── pricing.md           ← Pricing docs
    ├── trading.md           ← Trading docs
    └── ...                  ← Other categorized docs
|
||||
```
|
||||
|
||||
**Note:** The zip only includes what Claude needs. It excludes:
|
||||
- `.backup` files
|
||||
- Build artifacts
|
||||
- Temporary files
|
||||
|
||||
## What Does package_skill.py Do?
|
||||
|
||||
The package script:
|
||||
|
||||
1. **Finds your skill directory** (e.g., `output/steam-economy/`)
|
||||
2. **Validates SKILL.md exists** (required!)
|
||||
3. **Creates a .zip file** with the same name
|
||||
4. **Includes all files** except backups
|
||||
5. **Saves to** `output/` directory
|
||||
|
||||
**Example:**
|
||||
```bash
|
||||
python3 package_skill.py output/steam-economy/
|
||||
|
||||
📦 Packaging skill: steam-economy
|
||||
Source: output/steam-economy
|
||||
Output: output/steam-economy.zip
|
||||
+ SKILL.md
|
||||
+ references/api_reference.md
|
||||
+ references/pricing.md
|
||||
+ references/trading.md
|
||||
+ ...
|
||||
|
||||
✅ Package created: output/steam-economy.zip
|
||||
Size: 14,290 bytes (14.0 KB)
|
||||
```
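
For readers who want to see the shape of such a step, here is a minimal sketch of the five points above using only the standard library. It is an illustration under stated assumptions, not the contents of `package_skill.py`: the function name is hypothetical, and "backups" is taken to mean files ending in `.backup`.

```python
import zipfile
from pathlib import Path


def package_skill_sketch(skill_dir):
    """Zip a skill directory next to itself, skipping *.backup files."""
    skill_dir = Path(skill_dir).resolve()
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"SKILL.md not found in {skill_dir}")

    # output/steam-economy/ -> output/steam-economy.zip
    zip_path = skill_dir.parent / f"{skill_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in sorted(skill_dir.rglob("*")):
            if file.is_file() and file.suffix != ".backup":
                # Store paths relative to the skill root so SKILL.md sits at the top.
                zf.write(file, file.relative_to(skill_dir).as_posix())
    return zip_path
```

Writing the archive into the parent directory keeps the zip out of the tree being walked, so it never includes itself.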
|
||||
|
||||
## Complete Workflow
|
||||
|
||||
### Step 1: Scrape & Build
|
||||
```bash
|
||||
python3 doc_scraper.py --config configs/steam-economy.json
|
||||
```
|
||||
|
||||
**Output:**
|
||||
- `output/steam-economy_data/` (raw scraped data)
|
||||
- `output/steam-economy/` (skill directory)
|
||||
|
||||
### Step 2: Enhance (Recommended)
|
||||
```bash
|
||||
python3 enhance_skill_local.py output/steam-economy/
|
||||
```
|
||||
|
||||
**What it does:**
|
||||
- Analyzes reference files
|
||||
- Creates comprehensive SKILL.md
|
||||
- Backs up original to SKILL.md.backup
|
||||
|
||||
**Output:**
|
||||
- `output/steam-economy/SKILL.md` (enhanced)
|
||||
- `output/steam-economy/SKILL.md.backup` (original)
|
||||
|
||||
### Step 3: Package
|
||||
```bash
|
||||
python3 package_skill.py output/steam-economy/
|
||||
```
|
||||
|
||||
**Output:**
|
||||
- `output/steam-economy.zip` ← **THIS IS WHAT YOU UPLOAD**
|
||||
|
||||
### Step 4: Upload to Claude
|
||||
1. Go to Claude (claude.ai)
|
||||
2. Click "Add Skill" or the skill upload button
|
||||
3. Select `output/steam-economy.zip`
|
||||
4. Done!
|
||||
|
||||
## What Files Are Required?
|
||||
|
||||
**Minimum required structure:**
|
||||
```
|
||||
your-skill/
|
||||
└── SKILL.md ← Required! Claude reads this first
|
||||
```
|
||||
|
||||
**Recommended structure:**
|
||||
```
|
||||
your-skill/
├── SKILL.md          ← Main skill file (required)
└── references/       ← Reference docs (highly recommended)
    ├── index.md
    └── *.md          ← Category files
|
||||
```
|
||||
|
||||
**Optional (can add manually):**
|
||||
```
|
||||
your-skill/
├── SKILL.md
├── references/
├── scripts/          ← Helper scripts
│   └── *.py
└── assets/           ← Templates, examples
    └── *.txt
|
||||
```
|
||||
|
||||
## File Size Limits
|
||||
|
||||
The package script shows size after packaging:
|
||||
```
|
||||
✅ Package created: output/steam-economy.zip
|
||||
Size: 14,290 bytes (14.0 KB)
|
||||
```
|
||||
|
||||
**Typical sizes:**
|
||||
- Small skill: 5-20 KB
|
||||
- Medium skill: 20-100 KB
|
||||
- Large skill: 100-500 KB
|
||||
|
||||
Claude has generous size limits, so most documentation-based skills fit easily.
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Package a Skill
|
||||
```bash
|
||||
python3 package_skill.py output/steam-economy/
|
||||
```
|
||||
|
||||
### Package Multiple Skills
|
||||
```bash
|
||||
# Package all skills in output/
|
||||
for dir in output/*/; do
    if [ -f "$dir/SKILL.md" ]; then
        python3 package_skill.py "$dir"
    fi
done
|
||||
```
|
||||
|
||||
### Check What's in a Zip
|
||||
```bash
|
||||
unzip -l output/steam-economy.zip
|
||||
```
|
||||
|
||||
### Test a Packaged Skill Locally
|
||||
```bash
|
||||
# Extract to temp directory
|
||||
mkdir temp-test
|
||||
unzip output/steam-economy.zip -d temp-test/
|
||||
cat temp-test/SKILL.md
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "SKILL.md not found"
|
||||
```bash
|
||||
# Make sure you scraped and built first
|
||||
python3 doc_scraper.py --config configs/steam-economy.json
|
||||
|
||||
# Then package
|
||||
python3 package_skill.py output/steam-economy/
|
||||
```
|
||||
|
||||
### "Directory not found"
|
||||
```bash
|
||||
# Check what skills are available
|
||||
ls output/
|
||||
|
||||
# Use correct path
|
||||
python3 package_skill.py output/YOUR-SKILL-NAME/
|
||||
```
|
||||
|
||||
### Zip is Too Large
|
||||
Most skills are small, but if yours is large:
|
||||
```bash
|
||||
# Check size
|
||||
ls -lh output/steam-economy.zip
|
||||
|
||||
# If needed, check what's taking space
|
||||
unzip -l output/steam-economy.zip | sort -k1 -rn | head -20
|
||||
```
|
||||
|
||||
Reference files are usually small. Large sizes often mean:
|
||||
- Many images (skills typically don't need images)
|
||||
- Large code examples (these are fine, just be aware)
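
If `unzip` is not available, the same size check can be done from Python. This is a small sketch with a hypothetical helper name; it reports uncompressed sizes, largest first.

```python
import zipfile


def largest_entries(zip_path, limit=20):
    """Return (uncompressed_size, name) pairs for the biggest files in a zip."""
    with zipfile.ZipFile(zip_path) as zf:
        entries = [
            (info.file_size, info.filename)
            for info in zf.infolist()
            if not info.is_dir()
        ]
    return sorted(entries, reverse=True)[:limit]
```

For example, `for size, name in largest_entries("output/steam-economy.zip"): print(f"{size:>10,}  {name}")` prints a listing comparable to the `unzip -l` pipeline.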
|
||||
|
||||
## What Does Claude Do With the Zip?
|
||||
|
||||
When you upload a skill zip:
|
||||
|
||||
1. **Claude extracts it**
2. **Reads SKILL.md first** - This tells Claude:
   - When to activate this skill
   - What the skill does
   - Quick reference examples
   - How to navigate the references
3. **Indexes reference files** - Claude can search through the `references/*.md` files to find specific APIs, examples, and concepts
4. **Activates automatically** - When you ask about topics matching the skill
|
||||
|
||||
## Example: Using the Packaged Skill
|
||||
|
||||
After uploading `steam-economy.zip`:
|
||||
|
||||
**You ask:** "How do I implement microtransactions in my Steam game?"
|
||||
|
||||
**Claude:**
|
||||
- Recognizes this matches steam-economy skill
|
||||
- Reads SKILL.md for quick reference
|
||||
- Searches references/microtransactions.md
|
||||
- Provides detailed answer with code examples
|
||||
|
||||
## Summary
|
||||
|
||||
**What you need to do:**
|
||||
1. ✅ Scrape: `python3 doc_scraper.py --config configs/YOUR-CONFIG.json`
|
||||
2. ✅ Enhance: `python3 enhance_skill_local.py output/YOUR-SKILL/`
|
||||
3. ✅ Package: `python3 package_skill.py output/YOUR-SKILL/`
|
||||
4. ✅ Upload: Upload the `.zip` file to Claude
|
||||
|
||||
**What you upload:**
|
||||
- The `.zip` file from `output/` directory
|
||||
- Example: `output/steam-economy.zip`
|
||||
|
||||
**What's in the zip:**
|
||||
- `SKILL.md` (required)
|
||||
- `references/*.md` (recommended)
|
||||
- Any scripts/assets you added (optional)
|
||||
|
||||
That's it! 🚀
|
||||
292
enhance_skill.py
Normal file
@@ -0,0 +1,292 @@
|
||||
#!/usr/bin/env python3
"""
SKILL.md Enhancement Script
Uses Claude API to improve SKILL.md by analyzing reference documentation.

Usage:
    python3 enhance_skill.py output/steam-inventory/
    python3 enhance_skill.py output/react/
    python3 enhance_skill.py output/godot/ --api-key YOUR_API_KEY
"""

import os
import sys
import json
import argparse
from pathlib import Path

try:
    import anthropic
except ImportError:
    print("❌ Error: anthropic package not installed")
    print("Install with: pip3 install anthropic")
    sys.exit(1)


class SkillEnhancer:
    def __init__(self, skill_dir, api_key=None):
        self.skill_dir = Path(skill_dir)
        self.references_dir = self.skill_dir / "references"
        self.skill_md_path = self.skill_dir / "SKILL.md"

        # Get API key
        self.api_key = api_key or os.environ.get('ANTHROPIC_API_KEY')
        if not self.api_key:
            raise ValueError(
                "No API key provided. Set ANTHROPIC_API_KEY environment variable "
                "or use --api-key argument"
            )

        self.client = anthropic.Anthropic(api_key=self.api_key)

    def read_reference_files(self, max_chars=100000):
        """Read reference files with size limit"""
        references = {}

        if not self.references_dir.exists():
            print(f"⚠ No references directory found at {self.references_dir}")
            return references

        total_chars = 0
        for ref_file in sorted(self.references_dir.glob("*.md")):
            if ref_file.name == "index.md":
                continue

            content = ref_file.read_text(encoding='utf-8')

            # Limit size per file
            if len(content) > 40000:
                content = content[:40000] + "\n\n[Content truncated...]"

            references[ref_file.name] = content
            total_chars += len(content)

            # Stop if we've read enough
            if total_chars > max_chars:
                print(f" ℹ Limiting input to {max_chars:,} characters")
                break

        return references

    def read_current_skill_md(self):
        """Read existing SKILL.md"""
        if not self.skill_md_path.exists():
            return None
        return self.skill_md_path.read_text(encoding='utf-8')

    def enhance_skill_md(self, references, current_skill_md):
        """Use Claude to enhance SKILL.md"""

        # Build prompt
        prompt = self._build_enhancement_prompt(references, current_skill_md)

        print("\n🤖 Asking Claude to enhance SKILL.md...")
        print(f" Input: {len(prompt):,} characters")

        try:
            message = self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=4096,
                temperature=0.3,
                messages=[{
                    "role": "user",
                    "content": prompt
                }]
            )

            enhanced_content = message.content[0].text
            return enhanced_content

        except Exception as e:
            print(f"❌ Error calling Claude API: {e}")
            return None

    def _build_enhancement_prompt(self, references, current_skill_md):
        """Build the prompt for Claude"""

        # Extract skill name and description
        skill_name = self.skill_dir.name

        prompt = f"""You are enhancing a Claude skill's SKILL.md file. This skill is about: {skill_name}

I've scraped documentation and organized it into reference files. Your job is to create an EXCELLENT SKILL.md that will help Claude use this documentation effectively.

CURRENT SKILL.MD:
{'```markdown' if current_skill_md else '(none - create from scratch)'}
{current_skill_md or 'No existing SKILL.md'}
{'```' if current_skill_md else ''}

REFERENCE DOCUMENTATION:
"""

        for filename, content in references.items():
            prompt += f"\n\n## {filename}\n```markdown\n{content[:30000]}\n```\n"

        prompt += """

YOUR TASK:
Create an enhanced SKILL.md that includes:

1. **Clear "When to Use This Skill" section** - Be specific about trigger conditions
2. **Excellent Quick Reference section** - Extract 5-10 of the BEST, most practical code examples from the reference docs
   - Choose SHORT, clear examples that demonstrate common tasks
   - Include both simple and intermediate examples
   - Annotate examples with clear descriptions
   - Use proper language tags (cpp, python, javascript, json, etc.)
3. **Detailed Reference Files description** - Explain what's in each reference file
4. **Practical "Working with This Skill" section** - Give users clear guidance on how to navigate the skill
5. **Key Concepts section** (if applicable) - Explain core concepts
6. **Keep the frontmatter** (---\nname: ...\n---) intact

IMPORTANT:
- Extract REAL examples from the reference docs, don't make them up
- Prioritize SHORT, clear examples (5-20 lines max)
- Make it actionable and practical
- Don't be too verbose - be concise but useful
- Maintain the markdown structure for Claude skills
- Keep code examples properly formatted with language tags

OUTPUT:
Return ONLY the complete SKILL.md content, starting with the frontmatter (---).
"""

        return prompt

    def save_enhanced_skill_md(self, content):
        """Save the enhanced SKILL.md"""
        # Backup original
        if self.skill_md_path.exists():
            backup_path = self.skill_md_path.with_suffix('.md.backup')
            self.skill_md_path.rename(backup_path)
            print(f" 💾 Backed up original to: {backup_path.name}")

        # Save enhanced version
        self.skill_md_path.write_text(content, encoding='utf-8')
        print(f" ✅ Saved enhanced SKILL.md")

    def run(self):
        """Main enhancement workflow"""
        print(f"\n{'='*60}")
        print(f"ENHANCING SKILL: {self.skill_dir.name}")
        print(f"{'='*60}\n")

        # Read reference files
        print("📖 Reading reference documentation...")
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found to analyze")
            return False

        print(f" ✓ Read {len(references)} reference files")
        total_size = sum(len(c) for c in references.values())
        print(f" ✓ Total size: {total_size:,} characters\n")

        # Read current SKILL.md
        current_skill_md = self.read_current_skill_md()
        if current_skill_md:
            print(f" ℹ Found existing SKILL.md ({len(current_skill_md)} chars)")
        else:
            print(f" ℹ No existing SKILL.md, will create new one")

        # Enhance with Claude
        enhanced = self.enhance_skill_md(references, current_skill_md)

        if not enhanced:
            print("❌ Enhancement failed")
            return False

        print(f" ✓ Generated enhanced SKILL.md ({len(enhanced)} chars)\n")

        # Save
        print("💾 Saving enhanced SKILL.md...")
        self.save_enhanced_skill_md(enhanced)

        print(f"\n✅ Enhancement complete!")
        print(f"\nNext steps:")
        print(f" 1. Review: {self.skill_md_path}")
        print(f" 2. If you don't like it, restore backup: {self.skill_md_path.with_suffix('.md.backup')}")
        print(f" 3. Package your skill:")
        print(f" python3 /mnt/skills/examples/skill-creator/scripts/package_skill.py {self.skill_dir}/")

        return True


def main():
    parser = argparse.ArgumentParser(
        description='Enhance SKILL.md using Claude API',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Using ANTHROPIC_API_KEY environment variable
  export ANTHROPIC_API_KEY=sk-ant-...
  python3 enhance_skill.py output/steam-inventory/

  # Providing API key directly
  python3 enhance_skill.py output/react/ --api-key sk-ant-...

  # Show what would be done (dry run)
  python3 enhance_skill.py output/godot/ --dry-run
"""
    )

    parser.add_argument('skill_dir', type=str,
                        help='Path to skill directory (e.g., output/steam-inventory/)')
    parser.add_argument('--api-key', type=str,
                        help='Anthropic API key (or set ANTHROPIC_API_KEY env var)')
    parser.add_argument('--dry-run', action='store_true',
                        help='Show what would be done without calling API')

    args = parser.parse_args()

    # Validate skill directory
    skill_dir = Path(args.skill_dir)
    if not skill_dir.exists():
        print(f"❌ Error: Directory not found: {skill_dir}")
        sys.exit(1)

    if not skill_dir.is_dir():
        print(f"❌ Error: Not a directory: {skill_dir}")
        sys.exit(1)

    # Dry run mode
    if args.dry_run:
        print(f"🔍 DRY RUN MODE")
        print(f" Would enhance: {skill_dir}")
        print(f" References: {skill_dir / 'references'}")
        print(f" SKILL.md: {skill_dir / 'SKILL.md'}")

        refs_dir = skill_dir / "references"
        if refs_dir.exists():
            ref_files = list(refs_dir.glob("*.md"))
            print(f" Found {len(ref_files)} reference files:")
            for rf in ref_files:
                size = rf.stat().st_size
                print(f" - {rf.name} ({size:,} bytes)")

        print("\nTo actually run enhancement:")
        print(f" python3 enhance_skill.py {skill_dir}")
        return

    # Create enhancer and run
    try:
        enhancer = SkillEnhancer(skill_dir, api_key=args.api_key)
        success = enhancer.run()
        sys.exit(0 if success else 1)

    except ValueError as e:
        print(f"❌ Error: {e}")
        print("\nSet your API key:")
        print(" export ANTHROPIC_API_KEY=sk-ant-...")
        print("Or provide it directly:")
        print(f" python3 enhance_skill.py {skill_dir} --api-key sk-ant-...")
        sys.exit(1)
    except Exception as e:
        print(f"❌ Unexpected error: {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
|
||||
244
enhance_skill_local.py
Normal file
@@ -0,0 +1,244 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
SKILL.md Enhancement Script (Local - Using Claude Code)
|
||||
Opens a new terminal with Claude Code to enhance SKILL.md, then reports back.
|
||||
No API key needed - uses your existing Claude Code Max plan!
|
||||
|
||||
Usage:
|
||||
python3 enhance_skill_local.py output/steam-inventory/
|
||||
python3 enhance_skill_local.py output/react/
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import time
|
||||
import subprocess
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
class LocalSkillEnhancer:
|
||||
def __init__(self, skill_dir):
|
||||
self.skill_dir = Path(skill_dir)
|
||||
self.references_dir = self.skill_dir / "references"
|
||||
self.skill_md_path = self.skill_dir / "SKILL.md"
|
||||
|
||||
def create_enhancement_prompt(self):
|
||||
"""Create the prompt file for Claude Code"""
|
||||
|
||||
# Read reference files
|
||||
references = self.read_reference_files()
|
||||
|
||||
if not references:
|
||||
print("❌ No reference files found")
|
||||
return None
|
||||
|
||||
# Read current SKILL.md
|
||||
current_skill_md = ""
|
||||
if self.skill_md_path.exists():
|
||||
current_skill_md = self.skill_md_path.read_text(encoding='utf-8')
|
||||
|
||||
# Build prompt
|
||||
        prompt = f"""I need you to enhance the SKILL.md file for the {self.skill_dir.name} skill.

CURRENT SKILL.MD:
{'-'*60}
{current_skill_md if current_skill_md else '(No existing SKILL.md - create from scratch)'}
{'-'*60}

REFERENCE DOCUMENTATION:
{'-'*60}
"""

        for filename, content in references.items():
            prompt += f"\n## {filename}\n{content[:15000]}\n"

        prompt += f"""
{'-'*60}

YOUR TASK:
Create an EXCELLENT SKILL.md file that will help Claude use this documentation effectively.

Requirements:
1. **Clear "When to Use This Skill" section**
   - Be SPECIFIC about trigger conditions
   - List concrete use cases

2. **Excellent Quick Reference section**
   - Extract 5-10 of the BEST, most practical code examples from the reference docs
   - Choose SHORT, clear examples (5-20 lines max)
   - Include both simple and intermediate examples
   - Use proper language tags (cpp, python, javascript, json, etc.)
   - Add clear descriptions for each example

3. **Detailed Reference Files description**
   - Explain what's in each reference file
   - Help users navigate the documentation

4. **Practical "Working with This Skill" section**
   - Clear guidance for beginners, intermediate, and advanced users
   - Navigation tips

5. **Key Concepts section** (if applicable)
   - Explain core concepts
   - Define important terminology

IMPORTANT:
- Extract REAL examples from the reference docs above
- Prioritize SHORT, clear examples
- Make it actionable and practical
- Keep the frontmatter (---\\nname: ...\\n---) intact
- Use proper markdown formatting

SAVE THE RESULT:
First, back up the original to: {self.skill_md_path.with_suffix('.md.backup').absolute()}

Then save the complete enhanced SKILL.md to: {self.skill_md_path.absolute()}
"""

        return prompt

    def read_reference_files(self, max_chars=50000):
        """Read reference files with size limit"""
        references = {}

        if not self.references_dir.exists():
            return references

        total_chars = 0
        for ref_file in sorted(self.references_dir.glob("*.md")):
            if ref_file.name == "index.md":
                continue

            content = ref_file.read_text(encoding='utf-8')

            # Limit size per file
            if len(content) > 20000:
                content = content[:20000] + "\n\n[Content truncated...]"

            references[ref_file.name] = content
            total_chars += len(content)

            if total_chars > max_chars:
                break

        return references

    def run(self):
        """Main enhancement workflow"""
        print(f"\n{'='*60}")
        print(f"LOCAL ENHANCEMENT: {self.skill_dir.name}")
        print(f"{'='*60}\n")

        # Validate
        if not self.skill_dir.exists():
            print(f"❌ Directory not found: {self.skill_dir}")
            return False

        # Read reference files
        print("📖 Reading reference documentation...")
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found to analyze")
            return False

        print(f" ✓ Read {len(references)} reference files")
        total_size = sum(len(c) for c in references.values())
        print(f" ✓ Total size: {total_size:,} characters\n")

        # Create prompt
        print("📝 Creating enhancement prompt...")
        prompt = self.create_enhancement_prompt()

        if not prompt:
            return False

        # Save prompt to temp file
        with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False, encoding='utf-8') as f:
            prompt_file = f.name
            f.write(prompt)

        print(f" ✓ Prompt saved ({len(prompt):,} characters)\n")

        # Launch Claude Code in new terminal
        print("🚀 Launching Claude Code in new terminal...")
        print(" This will:")
        print(" 1. Open a new terminal window")
        print(" 2. Run Claude Code with the enhancement task")
        print(" 3. Claude will read the docs and enhance SKILL.md")
        print(" 4. Wait for a key press when done, then delete the prompt file")
        print()

        # Create a shell script to run in the terminal.
        # The prompt file path is quoted so paths containing spaces still work.
        shell_script = f'''#!/bin/bash
claude "{prompt_file}"
echo ""
echo "✅ Enhancement complete!"
echo "Press any key to close..."
read -n 1
rm "{prompt_file}"
'''

        # Save shell script (UTF-8, since the script contains non-ASCII characters)
        with tempfile.NamedTemporaryFile(mode='w', suffix='.sh', delete=False, encoding='utf-8') as f:
            script_file = f.name
            f.write(shell_script)

        os.chmod(script_file, 0o755)

        # Launch in new terminal (macOS specific)
        if sys.platform == 'darwin':
            # macOS Terminal - simple approach
            try:
                subprocess.Popen(['open', '-a', 'Terminal', script_file])
            except Exception as e:
                print(f"⚠️ Error launching terminal: {e}")
                print(f"\nManually run: {script_file}")
                return False
        else:
            print("⚠️ Auto-launch only works on macOS")
            print("\nManually run this command in a new terminal:")
            print(f" claude '{prompt_file}'")
            print("\nThen delete the prompt file:")
            print(f" rm '{prompt_file}'")
            return False

        print("✅ New terminal launched with Claude Code!")
        print()
        print("📊 Status:")
        print(f" - Prompt file: {prompt_file}")
        print(f" - Skill directory: {self.skill_dir.absolute()}")
        print(f" - SKILL.md will be saved to: {self.skill_md_path.absolute()}")
        print(f" - Original backed up to: {self.skill_md_path.with_suffix('.md.backup').absolute()}")
        print()
        print("⏳ Wait for Claude Code to finish in the other terminal...")
        print(" (Usually takes 30-60 seconds)")
        print()
        print("💡 When done:")
        print(f" 1. Check the enhanced SKILL.md: {self.skill_md_path}")
        print(f" 2. If you don't like it, restore: mv {self.skill_md_path.with_suffix('.md.backup')} {self.skill_md_path}")
        print(f" 3. Package: python3 package_skill.py {self.skill_dir}/")

        return True


def main():
    if len(sys.argv) < 2:
        print("Usage: python3 enhance_skill_local.py <skill_directory>")
        print()
        print("Examples:")
        print(" python3 enhance_skill_local.py output/steam-inventory/")
        print(" python3 enhance_skill_local.py output/react/")
        sys.exit(1)

    skill_dir = sys.argv[1]

    enhancer = LocalSkillEnhancer(skill_dir)
    success = enhancer.run()

    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()
BIN
output/.DS_Store
vendored
Normal file
Binary file not shown.
78
package_skill.py
Normal file
@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""
Simple Skill Packager
Packages a skill directory into a .zip file for Claude.

Usage:
    python3 package_skill.py output/steam-inventory/
    python3 package_skill.py output/react/
"""

import os
import sys
import zipfile
from pathlib import Path


def package_skill(skill_dir):
    """Package a skill directory into a .zip file"""
    skill_path = Path(skill_dir)

    if not skill_path.exists():
        print(f"❌ Error: Directory not found: {skill_dir}")
        return False

    if not skill_path.is_dir():
        print(f"❌ Error: Not a directory: {skill_dir}")
        return False

    # Verify SKILL.md exists
    skill_md = skill_path / "SKILL.md"
    if not skill_md.exists():
        print(f"❌ Error: SKILL.md not found in {skill_dir}")
        return False

    # Create zip filename
    skill_name = skill_path.name
    zip_path = skill_path.parent / f"{skill_name}.zip"

    print(f"📦 Packaging skill: {skill_name}")
    print(f" Source: {skill_path}")
    print(f" Output: {zip_path}")

    # Create zip file
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(skill_path):
            # Skip backup files
            files = [f for f in files if not f.endswith('.backup')]

            for file in files:
                file_path = Path(root) / file
                arcname = file_path.relative_to(skill_path)
                zf.write(file_path, arcname)
                print(f" + {arcname}")

    # Get zip size
    zip_size = zip_path.stat().st_size
    print(f"\n✅ Package created: {zip_path}")
    print(f" Size: {zip_size:,} bytes ({zip_size / 1024:.1f} KB)")

    return True


def main():
    if len(sys.argv) < 2:
        print("Usage: python3 package_skill.py <skill_directory>")
        print()
        print("Examples:")
        print(" python3 package_skill.py output/steam-inventory/")
        print(" python3 package_skill.py output/react/")
        sys.exit(1)

    skill_dir = sys.argv[1]
    success = package_skill(skill_dir)
    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()