Init
2
LICENSE
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2025 yusyus
+Copyright (c) 2025 [Your Name/Username]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
181
QUICKSTART.md
Normal file
@@ -0,0 +1,181 @@
# Quick Start Guide

## 🚀 4 Steps to Create a Skill

### Step 1: Install Dependencies

```bash
pip3 install requests beautifulsoup4
```

### Step 2: Run the Tool

**Option A: Use a Preset (Easiest)**
```bash
python3 doc_scraper.py --config configs/godot.json
```

**Option B: Interactive Mode**
```bash
python3 doc_scraper.py --interactive
```

**Option C: Quick Command**
```bash
python3 doc_scraper.py --name react --url https://react.dev/
```

### Step 3: Enhance SKILL.md (Recommended)

```bash
# LOCAL enhancement (no API key, uses Claude Code Max)
python3 enhance_skill_local.py output/godot/
```

**This takes 60 seconds and dramatically improves the SKILL.md quality!**

### Step 4: Package the Skill

```bash
python3 package_skill.py output/godot/
```

**Done!** You now have `godot.zip` ready to use.

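The internals of the packaging step are not shown in this guide. As a rough sketch only, assuming `package_skill.py` does little more than zip the skill directory into `<name>.zip`, the idea could look like the following. The helper name `zip_skill_dir` is hypothetical and not part of the tool.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_skill_dir(skill_dir: str) -> Path:
    """Zip a skill directory into <name>.zip next to it (hypothetical helper)."""
    src = Path(skill_dir).resolve()
    archive = src.parent / f"{src.name}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to the parent so the archive
                # keeps a single top-level folder named after the skill.
                zf.write(path, path.relative_to(src.parent))
    return archive


if __name__ == "__main__":
    # Demo on a throwaway directory rather than a real output/ tree.
    with tempfile.TemporaryDirectory() as tmp:
        skill = Path(tmp) / "godot"
        (skill / "references").mkdir(parents=True)
        (skill / "SKILL.md").write_text("# Godot\n")
        (skill / "references" / "index.md").write_text("# Index\n")
        print(zip_skill_dir(str(skill)).name)
```

Keeping a top-level folder inside the archive is a design choice of this sketch; the real script may lay the archive out differently.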
---

## 📋 Available Presets

```bash
# Godot Engine
python3 doc_scraper.py --config configs/godot.json

# React
python3 doc_scraper.py --config configs/react.json

# Vue.js
python3 doc_scraper.py --config configs/vue.json

# Django
python3 doc_scraper.py --config configs/django.json

# FastAPI
python3 doc_scraper.py --config configs/fastapi.json
```

---

## ⚡ Using Existing Data (Fast!)

If you already scraped once:

```bash
python3 doc_scraper.py --config configs/godot.json

# When prompted:
✓ Found existing data: 245 pages
Use existing data? (y/n): y

# Builds in seconds!
```

Or use `--skip-scrape`:
```bash
python3 doc_scraper.py --config configs/godot.json --skip-scrape
```

---

## 🎯 Complete Example (Recommended Workflow)

```bash
# 1. Install (once)
pip3 install requests beautifulsoup4

# 2. Scrape React docs with LOCAL enhancement
python3 doc_scraper.py --config configs/react.json --enhance-local
# Wait 15-30 minutes (scraping) + 60 seconds (enhancement)

# 3. Package
python3 package_skill.py output/react/

# 4. Use react.zip in Claude!
```

**Alternative: Enhancement after scraping**
```bash
# 2a. Scrape only (no enhancement)
python3 doc_scraper.py --config configs/react.json

# 2b. Enhance later
python3 enhance_skill_local.py output/react/

# 3. Package
python3 package_skill.py output/react/
```

---

## 💡 Pro Tips

### Test with Small Pages First
Edit the config file to cap the crawl at 20 pages for a test run:
```json
{
  "max_pages": 20
}
```

### Rebuild Instantly
```bash
# After first scrape, you can rebuild instantly:
python3 doc_scraper.py --config configs/react.json --skip-scrape
```

### Create Custom Config
```bash
# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 doc_scraper.py --config configs/myframework.json
```

---

## 📁 What You Get

```
output/
├── godot_data/          # Raw scraped data (reusable!)
└── godot/               # The skill
    ├── SKILL.md         # With real code examples!
    └── references/      # Organized docs
```

---

## ❓ Need Help?

See **README.md** for:
- Complete documentation
- Config file structure
- Troubleshooting
- Advanced usage

---

## 🎮 Let's Go!

```bash
# Godot
python3 doc_scraper.py --config configs/godot.json

# Or interactive
python3 doc_scraper.py --interactive
```

That's it! 🚀
445
README.md
@@ -1,2 +1,443 @@
-# Skill_Seekers
-Single powerful tool to convert ANY documentation website into a Claude skill
+# Documentation to Claude Skill Converter

[License: MIT](https://opensource.org/licenses/MIT)

**Single powerful tool to convert ANY documentation website into a Claude skill.**

## 🚀 Quick Start

### Easiest: Use a Preset

```bash
# Install dependencies (macOS)
pip3 install requests beautifulsoup4

# Use Godot preset
python3 doc_scraper.py --config configs/godot.json

# Use React preset
python3 doc_scraper.py --config configs/react.json

# See all presets
ls configs/
```

### Interactive Mode

```bash
python3 doc_scraper.py --interactive
```

### Quick Mode

```bash
python3 doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for UIs"
```

## 📁 Simple Structure

```
doc-to-skill/
├── doc_scraper.py        # Main scraping tool
├── enhance_skill.py      # Optional: AI-powered SKILL.md enhancement
├── configs/              # Preset configurations
│   ├── godot.json        # Godot Engine
│   ├── react.json        # React
│   ├── vue.json          # Vue.js
│   ├── django.json       # Django
│   └── fastapi.json      # FastAPI
└── output/               # All output (auto-created)
    ├── godot_data/       # Scraped data
    └── godot/            # Built skill
```

## ✨ Features

### 1. Auto-Detect Existing Data

```bash
python3 doc_scraper.py --config configs/godot.json

# If data exists:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
⏭️ Skipping scrape, using existing data
```

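How the tool detects cached data is not shown here. A minimal sketch, assuming the `output/{name}_data/pages/` layout (one JSON file per page) listed under "What Gets Created", might count the cached files and ask before reusing them. The function names and the `answer` parameter are hypothetical and stand in for whatever `doc_scraper.py` really does.

```python
import tempfile
from pathlib import Path


def count_existing_pages(name, output_root="output"):
    """Return the number of cached page files for a skill, or 0 if none exist."""
    pages_dir = Path(output_root) / f"{name}_data" / "pages"
    if not pages_dir.is_dir():
        return 0
    return sum(1 for _ in pages_dir.glob("*.json"))


def should_scrape(name, output_root="output", answer=None):
    """Decide whether to scrape; `answer` stands in for the interactive y/n prompt."""
    found = count_existing_pages(name, output_root)
    if found == 0:
        return True  # nothing cached, so a scrape is needed
    print(f"Found existing data: {found} pages")
    if answer is None:
        answer = input("Use existing data? (y/n): ")
    return answer.strip().lower() != "y"


if __name__ == "__main__":
    # Demo against a temporary directory instead of a real output/ tree.
    with tempfile.TemporaryDirectory() as root:
        pages = Path(root) / "godot_data" / "pages"
        pages.mkdir(parents=True)
        (pages / "page_1.json").write_text("{}")
        print(should_scrape("godot", output_root=root, answer="y"))
```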
### 2. Knowledge Generation

**Automatic pattern extraction:**
- Extracts common code patterns from docs
- Detects programming language
- Creates quick reference with real examples
- Smarter categorization with scoring

**Enhanced SKILL.md:**
- Real code examples from documentation
- Language-annotated code blocks
- Common patterns section
- Quick reference from actual usage examples

### 3. Smart Categorization

Automatically infers categories from the following signals, scoring each match for better accuracy:
- URL structure
- Page titles
- Content keywords

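The README does not spell out how the scoring works. One plausible reading, sketched here under assumptions, is to count keyword hits in the URL, title, and content and pick the highest-scoring category. The weights 3/2/1, the `"other"` fallback, and the function name `categorize_page` are illustrative and not taken from `doc_scraper.py`.

```python
def categorize_page(url, title, content, categories):
    """Return the best-matching category name, or "other" if nothing matches.

    `categories` maps a category name to a list of keywords, as in the
    `categories` section of a config file.
    """
    url_l, title_l, content_l = url.lower(), title.lower(), content.lower()
    best_name, best_score = "other", 0
    for name, keywords in categories.items():
        score = 0
        for kw in keywords:
            kw = kw.lower()
            if kw in url_l:
                score += 3  # assumed weight: a URL match is the strongest signal
            if kw in title_l:
                score += 2  # assumed weight: title match
            if kw in content_l:
                score += 1  # assumed weight: body text match
        if score > best_score:
            best_name, best_score = name, score
    return best_name


categories = {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"],
}
print(categorize_page(
    "https://docs.myframework.com/api/widgets",
    "Widget reference",
    "Widgets expose an API.",
    categories,
))
```

Plain substring matching is the simplest option; it can over-match short keywords such as `ui`, which is one reason a scoring step helps.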
### 4. Code Language Detection

```
# Automatically detects:
- Python (def, import, from)
- JavaScript (const, let, =>)
- GDScript (func, var, extends)
- C++ (#include, int main)
- And more...
```

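A heuristic like the one listed above can be approximated by counting marker substrings. The sketch below uses the markers from the list; the counting strategy, the `"unknown"` fallback, and the name `detect_language` are assumptions rather than the tool's actual implementation.

```python
# Marker substrings taken from the list above; counting hits is an assumed strategy.
LANGUAGE_MARKERS = {
    "python": ["def ", "import ", "from "],
    "javascript": ["const ", "let ", "=>"],
    "gdscript": ["func ", "var ", "extends "],
    "cpp": ["#include", "int main"],
}


def detect_language(code):
    """Guess a language label by counting marker hits; "unknown" if none match."""
    scores = {
        lang: sum(marker in code for marker in markers)
        for lang, markers in LANGUAGE_MARKERS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


print(detect_language("extends Node2D\n\nfunc _ready():\n    var speed = 10\n"))
print(detect_language("#include <iostream>\nint main() { return 0; }\n"))
```

Ties go to whichever language appears first in the dictionary, so a production version would likely want more markers or explicit tie-breaking.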
### 5. Skip Scraping

```bash
# Scrape once
python3 doc_scraper.py --config configs/react.json

# Later, just rebuild (instant)
python3 doc_scraper.py --config configs/react.json --skip-scrape
```

### 6. AI-Powered SKILL.md Enhancement (NEW!)

```bash
# Option 1: During scraping (API-based, requires API key)
pip3 install anthropic
export ANTHROPIC_API_KEY=sk-ant-...
python3 doc_scraper.py --config configs/react.json --enhance

# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
python3 doc_scraper.py --config configs/react.json --enhance-local

# Option 3: After scraping (API-based, standalone)
python3 enhance_skill.py output/react/

# Option 4: After scraping (LOCAL, no API key, standalone)
python3 enhance_skill_local.py output/react/
```

**What it does:**
- Reads your reference documentation
- Uses Claude to generate an excellent SKILL.md
- Extracts best code examples (5-10 practical examples)
- Creates comprehensive quick reference
- Adds domain-specific key concepts
- Provides navigation guidance for different skill levels
- Automatically backs up original
- **Quality:** Transforms 75-line templates into 500+ line comprehensive guides

**LOCAL Enhancement (Recommended):**
- Uses your Claude Code Max plan (no API costs)
- Opens new terminal with Claude Code
- Analyzes reference files automatically
- Takes 30-60 seconds
- Quality: 9/10 (comparable to API version)

## 🎯 Complete Workflows

### First Time (With Scraping + Enhancement)

```bash
# 1. Scrape + Build + AI Enhancement (LOCAL, no API key)
python3 doc_scraper.py --config configs/godot.json --enhance-local

# 2. Wait for new terminal to close (enhancement completes)
# Check the enhanced SKILL.md:
cat output/godot/SKILL.md

# 3. Package
python3 package_skill.py output/godot/

# 4. Done! You have godot.zip with excellent SKILL.md
```

**Time:** 20-40 minutes (scraping) + 60 seconds (enhancement) = ~21-41 minutes

### Using Existing Data (Fast!)

```bash
# 1. Use cached data + Local Enhancement
python3 doc_scraper.py --config configs/godot.json --skip-scrape
python3 enhance_skill_local.py output/godot/

# 2. Package
python3 package_skill.py output/godot/

# 3. Done!
```

**Time:** 1-3 minutes (build) + 60 seconds (enhancement) = ~2-4 minutes total

### Without Enhancement (Basic)

```bash
# 1. Scrape + Build (no enhancement)
python3 doc_scraper.py --config configs/godot.json

# 2. Package
python3 package_skill.py output/godot/

# 3. Done! (SKILL.md will be basic template)
```

**Time:** 20-40 minutes
**Note:** SKILL.md will be generic - enhancement strongly recommended!

## 📋 Available Presets

| Config | Framework | Description |
|--------|-----------|-------------|
| `godot.json` | Godot Engine | Game development |
| `react.json` | React | UI framework |
| `vue.json` | Vue.js | Progressive framework |
| `django.json` | Django | Python web framework |
| `fastapi.json` | FastAPI | Modern Python API |

### Using Presets

```bash
# Godot
python3 doc_scraper.py --config configs/godot.json

# React
python3 doc_scraper.py --config configs/react.json

# Vue
python3 doc_scraper.py --config configs/vue.json

# Django
python3 doc_scraper.py --config configs/django.json

# FastAPI
python3 doc_scraper.py --config configs/fastapi.json
```

## 🎨 Creating Your Own Config

### Option 1: Interactive

```bash
python3 doc_scraper.py --interactive
# Follow prompts, it will create the config for you
```

### Option 2: Copy and Edit

```bash
# Copy a preset
cp configs/react.json configs/myframework.json

# Edit it
nano configs/myframework.json

# Use it
python3 doc_scraper.py --config configs/myframework.json
```

### Config Structure

```json
{
  "name": "myframework",
  "description": "When to use this skill",
  "base_url": "https://docs.myframework.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/docs", "/guide"],
    "exclude": ["/blog", "/about"]
  },
  "categories": {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
```

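To make the role of `url_patterns` concrete, here is a hedged sketch of loading a config and filtering candidate URLs. It assumes that patterns are matched as substrings of the URL path, that an empty `include` list means "no restriction" (as in the Godot preset), and that only `name` and `base_url` are strictly required, since the scraper's constructor indexes those two keys directly. The helpers `load_config` and `url_allowed` are hypothetical names.

```python
import json
from urllib.parse import urlparse

REQUIRED_KEYS = ("name", "base_url")  # keys the constructor reads without a default


def load_config(text):
    """Parse config JSON and check for the keys assumed to be required."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing required keys: {missing}")
    return config


def url_allowed(url, config):
    """Apply url_patterns: stay under base_url, avoid excludes, match an include."""
    patterns = config.get("url_patterns", {})
    include = patterns.get("include", [])
    exclude = patterns.get("exclude", [])
    if not url.startswith(config["base_url"]):
        return False
    # Match against the path only, so a host such as docs.example.com
    # does not accidentally satisfy a "/docs" pattern.
    path = urlparse(url).path
    if any(pattern in path for pattern in exclude):
        return False
    # Assumption: an empty include list means "no restriction".
    return not include or any(pattern in path for pattern in include)


config = load_config("""
{
  "name": "myframework",
  "base_url": "https://docs.myframework.com/",
  "url_patterns": {"include": ["/docs", "/guide"], "exclude": ["/blog", "/about"]}
}
""")
print(url_allowed("https://docs.myframework.com/guide/intro", config))
print(url_allowed("https://docs.myframework.com/blog/news", config))
```

Because the config is parsed with a standard JSON parser in this sketch, comments inside the file would raise an error; keep notes outside the JSON.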
## 📊 What Gets Created

```
output/
├── godot_data/              # Scraped raw data
│   ├── pages/               # JSON files (one per page)
│   └── summary.json         # Overview
│
└── godot/                   # The skill
    ├── SKILL.md             # Enhanced with real examples
    ├── references/          # Categorized docs
    │   ├── index.md
    │   ├── getting_started.md
    │   ├── scripting.md
    │   └── ...
    ├── scripts/             # Empty (add your own)
    └── assets/              # Empty (add your own)
```

## 🎯 Command Line Options

```bash
# Interactive mode
python3 doc_scraper.py --interactive

# Use config file
python3 doc_scraper.py --config configs/godot.json

# Quick mode
python3 doc_scraper.py --name react --url https://react.dev/

# Skip scraping (use existing data)
python3 doc_scraper.py --config configs/godot.json --skip-scrape

# With description
python3 doc_scraper.py \
  --name react \
  --url https://react.dev/ \
  --description "React framework for building UIs"
```

## 💡 Tips

### 1. Test Small First

Edit `max_pages` in the config to test with just 20 pages:
```json
{
  "max_pages": 20
}
```

### 2. Reuse Scraped Data

```bash
# Scrape once
python3 doc_scraper.py --config configs/react.json

# Rebuild multiple times (instant)
python3 doc_scraper.py --config configs/react.json --skip-scrape
python3 doc_scraper.py --config configs/react.json --skip-scrape
```

### 3. Finding Selectors

```python
# Test in Python
from bs4 import BeautifulSoup
import requests

url = "https://docs.example.com/page"
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Try different selectors
print(soup.select_one('article'))
print(soup.select_one('main'))
print(soup.select_one('div[role="main"]'))
```

### 4. Check Output Quality

```bash
# After building, check:
cat output/godot/SKILL.md  # Should have real examples
cat output/godot/references/index.md  # Categories
```

## 🐛 Troubleshooting

### No Content Extracted?
- Check your `main_content` selector
- Try: `article`, `main`, `div[role="main"]`

### Data Exists But Won't Use It?
```bash
# Force re-scrape
rm -rf output/myframework_data/
python3 doc_scraper.py --config configs/myframework.json
```

### Categories Not Good?
Edit the config `categories` section with better keywords.

### Want to Update Docs?
```bash
# Delete old data
rm -rf output/godot_data/

# Re-scrape
python3 doc_scraper.py --config configs/godot.json
```

## 📈 Performance

| Task | Time | Notes |
|------|------|-------|
| Scraping | 15-45 min | First time only |
| Building | 1-3 min | Fast! |
| Re-building | <1 min | With `--skip-scrape` |
| Packaging | 5-10 sec | Final zip |

## ✅ Summary

**One tool does everything:**
1. ✅ Scrapes documentation
2. ✅ Auto-detects existing data
3. ✅ Generates better knowledge
4. ✅ Creates enhanced skills
5. ✅ Works with presets or custom configs
6. ✅ Supports skip-scraping for fast iteration

**Simple structure:**
- `doc_scraper.py` - The tool
- `configs/` - Presets
- `output/` - Everything else

**Better output:**
- Real code examples with language detection
- Common patterns extracted from docs
- Smart categorization
- Enhanced SKILL.md with actual examples

## 📚 Documentation

- **[QUICKSTART.md](QUICKSTART.md)** - Get started in 4 steps
- **[docs/ENHANCEMENT.md](docs/ENHANCEMENT.md)** - AI enhancement guide
- **[docs/UPLOAD_GUIDE.md](docs/UPLOAD_GUIDE.md)** - How to upload skills to Claude
- **[docs/CLAUDE.md](docs/CLAUDE.md)** - Technical architecture
- **[STRUCTURE.md](STRUCTURE.md)** - Repository structure

## 🎮 Ready?

```bash
# Try Godot
python3 doc_scraper.py --config configs/godot.json

# Try React
python3 doc_scraper.py --config configs/react.json

# Or go interactive
python3 doc_scraper.py --interactive
```

## 📝 License

MIT License - see [LICENSE](LICENSE) file for details

---

Happy skill building! 🚀
55
STRUCTURE.md
Normal file
@@ -0,0 +1,55 @@
# Repository Structure

```
doc-to-skill/
│
├── README.md                      # Main documentation (start here!)
├── QUICKSTART.md                  # 4-step quick start guide
├── LICENSE                        # MIT License
├── .gitignore                     # Git ignore rules
│
├── 🐍 Core Scripts
│   ├── doc_scraper.py             # Main scraping tool
│   ├── enhance_skill.py           # AI enhancement (API-based)
│   ├── enhance_skill_local.py     # AI enhancement (LOCAL, no API)
│   └── package_skill.py           # Skill packaging tool
│
├── 📁 configs/                    # Preset configurations
│   ├── godot.json
│   ├── react.json
│   ├── vue.json
│   ├── django.json
│   ├── fastapi.json
│   ├── steam-inventory.json
│   ├── steam-economy.json
│   └── steam-economy-complete.json
│
├── 📚 docs/                       # Detailed documentation
│   ├── CLAUDE.md                  # Technical architecture
│   ├── ENHANCEMENT.md             # AI enhancement guide
│   ├── UPLOAD_GUIDE.md            # How to upload skills
│   └── READY_TO_SHARE.md          # Sharing checklist
│
└── 📦 output/                     # Generated skills (git-ignored)
    ├── {name}_data/               # Scraped raw data (cached)
    └── {name}/                    # Built skills
        ├── SKILL.md               # Main skill file
        └── references/            # Reference documentation
```

## Key Files

### For Users:
- **README.md** - Start here for overview and installation
- **QUICKSTART.md** - Get started in 4 steps
- **configs/** - 8 ready-to-use presets

### For Developers:
- **doc_scraper.py** - Main tool (787 lines)
- **docs/CLAUDE.md** - Architecture and internals
- **docs/ENHANCEMENT.md** - How enhancement works

### For Contributors:
- **LICENSE** - MIT License
- **.gitignore** - What Git ignores
- **docs/READY_TO_SHARE.md** - Distribution guide
BIN
configs/.DS_Store
vendored
Normal file
Binary file not shown.
25
configs/django.json
Normal file
@@ -0,0 +1,25 @@
{
  "name": "django",
  "description": "Django web framework for Python. Use for Django models, views, templates, ORM, authentication, and web development.",
  "base_url": "https://docs.djangoproject.com/en/stable/",
  "selectors": {
    "main_content": "div.document",
    "title": "h1",
    "code_blocks": "pre"
  },
  "url_patterns": {
    "include": ["/topics/", "/ref/", "/howto/"],
    "exclude": ["/faq/", "/misc/"]
  },
  "categories": {
    "getting_started": ["intro", "tutorial", "install"],
    "models": ["models", "database", "orm", "queries"],
    "views": ["views", "urlconf", "routing"],
    "templates": ["templates", "template"],
    "forms": ["forms", "form"],
    "authentication": ["auth", "authentication", "user"],
    "api": ["ref", "reference"]
  },
  "rate_limit": 0.3,
  "max_pages": 500
}
24
configs/fastapi.json
Normal file
@@ -0,0 +1,24 @@
{
  "name": "fastapi",
  "description": "FastAPI modern Python web framework. Use for building APIs, async endpoints, dependency injection, and Python backend development.",
  "base_url": "https://fastapi.tiangolo.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/tutorial/", "/advanced/", "/reference/"],
    "exclude": ["/help/", "/external-links/"]
  },
  "categories": {
    "getting_started": ["first-steps", "tutorial", "intro"],
    "path_operations": ["path", "operations", "routing"],
    "request_data": ["request", "body", "query", "parameters"],
    "dependencies": ["dependencies", "injection"],
    "security": ["security", "oauth", "authentication"],
    "database": ["database", "sql", "orm"]
  },
  "rate_limit": 0.5,
  "max_pages": 250
}
34
configs/godot.json
Normal file
@@ -0,0 +1,34 @@
{
  "name": "godot",
  "description": "Godot Engine game development. Use for Godot projects, GDScript/C# coding, scene setup, node systems, 2D/3D development, physics, animation, UI, shaders, or any Godot-specific questions.",
  "base_url": "https://docs.godotengine.org/en/stable/",
  "selectors": {
    "main_content": "div[role='main']",
    "title": "title",
    "code_blocks": "pre"
  },
  "url_patterns": {
    "include": [],
    "exclude": [
      "/genindex.html",
      "/search.html",
      "/_static/",
      "/_sources/"
    ]
  },
  "categories": {
    "getting_started": ["introduction", "getting_started", "first", "your_first"],
    "scripting": ["scripting", "gdscript", "c#", "csharp"],
    "2d": ["/2d/", "sprite", "canvas", "tilemap"],
    "3d": ["/3d/", "spatial", "mesh", "3d_"],
    "physics": ["physics", "collision", "rigidbody", "characterbody"],
    "animation": ["animation", "tween", "animationplayer"],
    "ui": ["ui", "control", "gui", "theme"],
    "shaders": ["shader", "material", "visual_shader"],
    "audio": ["audio", "sound"],
    "networking": ["networking", "multiplayer", "rpc"],
    "export": ["export", "platform", "deploy"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
23
configs/react.json
Normal file
@@ -0,0 +1,23 @@
{
  "name": "react",
  "description": "React framework for building user interfaces. Use for React components, hooks, state management, JSX, and modern frontend development.",
  "base_url": "https://react.dev/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/learn", "/reference"],
    "exclude": ["/community", "/blog"]
  },
  "categories": {
    "getting_started": ["quick-start", "installation", "tutorial"],
    "hooks": ["usestate", "useeffect", "usememo", "usecallback", "usecontext", "useref", "hook"],
    "components": ["component", "props", "jsx"],
    "state": ["state", "context", "reducer"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 300
}
108
configs/steam-economy-complete.json
Normal file
@@ -0,0 +1,108 @@
{
  "name": "steam-economy-complete",
  "description": "Complete Steam Economy system including inventory, microtransactions, trading, and monetization. Use for ISteamInventory API, ISteamEconomy API, IInventoryService Web API, Steam Wallet integration, in-app purchases, item definitions, trading, crafting, market integration, and all economy features for game developers.",
  "base_url": "https://partner.steamgames.com/doc/",
  "start_urls": [
    "https://partner.steamgames.com/doc/features/inventory",
    "https://partner.steamgames.com/doc/features/microtransactions",
    "https://partner.steamgames.com/doc/features/microtransactions/implementation",
    "https://partner.steamgames.com/doc/api/ISteamInventory",
    "https://partner.steamgames.com/doc/webapi/ISteamEconomy",
    "https://partner.steamgames.com/doc/webapi/IInventoryService",
    "https://partner.steamgames.com/doc/features/inventory/economy"
  ],
  "selectors": {
    "main_content": "div.documentation_bbcode",
    "title": "div.docPageTitle",
    "code_blocks": "div.bb_code"
  },
  "url_patterns": {
    "include": [
      "/features/inventory",
      "/features/microtransactions",
      "/api/ISteamInventory",
      "/webapi/ISteamEconomy",
      "/webapi/IInventoryService"
    ],
    "exclude": [
      "/home",
      "/sales",
      "/marketing",
      "/legal",
      "/finance",
      "/login",
      "/search",
      "/steamworks/apps",
      "/steamworks/partner"
    ]
  },
  "categories": {
    "getting_started": [
      "overview",
      "getting started",
      "introduction",
      "quickstart",
      "setup"
    ],
    "inventory_system": [
      "inventory",
      "item definition",
      "item schema",
      "item properties",
      "itemdefs",
      "ISteamInventory"
    ],
    "microtransactions": [
      "microtransaction",
      "purchase",
      "payment",
      "checkout",
      "wallet",
      "transaction"
    ],
    "economy_api": [
      "ISteamEconomy",
      "economy",
      "asset",
      "context"
    ],
    "inventory_webapi": [
      "IInventoryService",
      "webapi",
      "web api",
      "http"
    ],
    "trading": [
      "trading",
      "trade",
      "exchange",
      "market"
    ],
    "crafting": [
      "crafting",
      "recipe",
      "combine",
      "exchange"
    ],
    "pricing": [
      "pricing",
      "price",
      "cost",
      "currency"
    ],
    "implementation": [
      "integration",
      "implementation",
      "configure",
      "best practices"
    ],
    "examples": [
      "example",
      "sample",
      "tutorial",
      "walkthrough"
    ]
  },
  "rate_limit": 0.7,
  "max_pages": 1000
}
23
configs/vue.json
Normal file
@@ -0,0 +1,23 @@
{
  "name": "vue",
  "description": "Vue.js progressive JavaScript framework. Use for Vue components, reactivity, composition API, and frontend development.",
  "base_url": "https://vuejs.org/guide/",
  "selectors": {
    "main_content": "main",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/guide/", "/api/", "/examples/"],
    "exclude": ["/about/", "/sponsor/"]
  },
  "categories": {
    "getting_started": ["quick-start", "introduction", "essentials"],
    "components": ["component", "props", "events"],
    "reactivity": ["reactivity", "reactive", "ref", "computed"],
    "composition_api": ["composition", "setup"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 200
}
789
doc_scraper.py
Normal file
@@ -0,0 +1,789 @@
#!/usr/bin/env python3
"""
Documentation to Claude Skill Converter
Single tool to scrape any documentation and create high-quality Claude skills.

Usage:
    python3 doc_scraper.py --interactive
    python3 doc_scraper.py --config configs/godot.json
    python3 doc_scraper.py --url https://react.dev/ --name react
"""

import os
import sys
import json
import time
import re
import argparse
import hashlib
import requests
from pathlib import Path
from urllib.parse import urljoin, urlparse
from bs4 import BeautifulSoup
from collections import deque, defaultdict


class DocToSkillConverter:
    def __init__(self, config):
        self.config = config
        self.name = config['name']
        self.base_url = config['base_url']

        # Paths
        self.data_dir = f"output/{self.name}_data"
        self.skill_dir = f"output/{self.name}"

        # State
        self.visited_urls = set()
        # Support multiple starting URLs
        start_urls = config.get('start_urls', [self.base_url])
        self.pending_urls = deque(start_urls)
        self.pages = []

        # Create directories
        os.makedirs(f"{self.data_dir}/pages", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/references", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/scripts", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/assets", exist_ok=True)

    def is_valid_url(self, url):
        """Check if URL should be scraped"""
        if not url.startswith(self.base_url):
            return False

        # Include patterns
        includes = self.config.get('url_patterns', {}).get('include', [])
        if includes and not any(pattern in url for pattern in includes):
            return False

        # Exclude patterns
        excludes = self.config.get('url_patterns', {}).get('exclude', [])
        if any(pattern in url for pattern in excludes):
            return False

        return True

    def extract_content(self, soup, url):
        """Extract content with improved code and pattern detection"""
        page = {
            'url': url,
            'title': '',
            'content': '',
            'headings': [],
            'code_samples': [],
            'patterns': [], # NEW: Extract common patterns
            'links': []
        }

        selectors = self.config.get('selectors', {})

        # Extract title
        title_elem = soup.select_one(selectors.get('title', 'title'))
        if title_elem:
            page['title'] = self.clean_text(title_elem.get_text())

        # Find main content
        main_selector = selectors.get('main_content', 'div[role="main"]')
        main = soup.select_one(main_selector)

        if not main:
            print(f"⚠ No content: {url}")
            return page

        # Extract headings with better structure
        for h in main.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6']):
            text = self.clean_text(h.get_text())
            if text:
                page['headings'].append({
                    'level': h.name,
                    'text': text,
                    'id': h.get('id', '')
                })

        # Extract code with language detection
        code_selector = selectors.get('code_blocks', 'pre code')
        for code_elem in main.select(code_selector):
            code = code_elem.get_text()
            if len(code.strip()) > 10:
                # Try to detect language
                lang = self.detect_language(code_elem, code)
                page['code_samples'].append({
                    'code': code.strip(),
                    'language': lang
                })

        # Extract patterns (NEW: common code patterns)
        page['patterns'] = self.extract_patterns(main, page['code_samples'])

        # Extract paragraphs
        paragraphs = []
        for p in main.find_all('p'):
            text = self.clean_text(p.get_text())
            if text and len(text) > 20: # Skip very short paragraphs
                paragraphs.append(text)

        page['content'] = '\n\n'.join(paragraphs)

        # Extract links
        for link in main.find_all('a', href=True):
            href = urljoin(url, link['href'])
            if self.is_valid_url(href):
                page['links'].append(href)

        return page

    def detect_language(self, elem, code):
        """Detect programming language from code block"""
        # Check class attribute
        classes = elem.get('class', [])
        for cls in classes:
            if 'language-' in cls:
                return cls.replace('language-', '')
            if 'lang-' in cls:
                return cls.replace('lang-', '')

        # Check parent pre element
        parent = elem.parent
        if parent and parent.name == 'pre':
            classes = parent.get('class', [])
            for cls in classes:
                if 'language-' in cls:
                    return cls.replace('language-', '')

        # Heuristic detection
        if 'import ' in code and 'from ' in code:
            return 'python'
        if 'const ' in code or 'let ' in code or '=>' in code:
            return 'javascript'
        if 'func ' in code and 'var ' in code:
            return 'gdscript'
        if 'def ' in code and ':' in code:
            return 'python'
        if '#include' in code or 'int main' in code:
            return 'cpp'

        return 'unknown'

    def extract_patterns(self, main, code_samples):
        """Extract common coding patterns (NEW FEATURE)"""
        patterns = []

        # Look for "Example:" or "Pattern:" sections
        for elem in main.find_all(['p', 'div']):
            text = elem.get_text().lower()
            if any(word in text for word in ['example:', 'pattern:', 'usage:', 'typical use']):
                # Get the code that follows
                next_code = elem.find_next(['pre', 'code'])
                if next_code:
                    patterns.append({
                        'description': self.clean_text(elem.get_text()),
                        'code': next_code.get_text().strip()
                    })

        return patterns[:5] # Limit to 5 most relevant patterns

    def clean_text(self, text):
        """Clean text content"""
        text = re.sub(r'\s+', ' ', text)
        return text.strip()

    def save_page(self, page):
        """Save page data"""
        url_hash = hashlib.md5(page['url'].encode()).hexdigest()[:10]
        safe_title = re.sub(r'[^\w\s-]', '', page['title'])[:50]
        safe_title = re.sub(r'[-\s]+', '_', safe_title)

        filename = f"{safe_title}_{url_hash}.json"
        filepath = os.path.join(self.data_dir, "pages", filename)

        with open(filepath, 'w', encoding='utf-8') as f:
            json.dump(page, f, indent=2, ensure_ascii=False)

    def scrape_page(self, url):
        """Scrape a single page"""
        try:
            print(f" {url}")

            headers = {'User-Agent': 'Mozilla/5.0 (Documentation Scraper)'}
            response = requests.get(url, headers=headers, timeout=30)
            response.raise_for_status()

            soup = BeautifulSoup(response.content, 'html.parser')
            page = self.extract_content(soup, url)

            self.save_page(page)
            self.pages.append(page)

            # Add new URLs
            for link in page['links']:
                if link not in self.visited_urls and link not in self.pending_urls:
                    self.pending_urls.append(link)

            # Rate limiting
            time.sleep(self.config.get('rate_limit', 0.5))

        except Exception as e:
            print(f" ✗ Error: {e}")

    def scrape_all(self):
        """Scrape all pages"""
        print(f"\n{'='*60}")
        print(f"SCRAPING: {self.name}")
        print(f"{'='*60}")
        print(f"Base URL: {self.base_url}")
        print(f"Output: {self.data_dir}\n")

        max_pages = self.config.get('max_pages', 500)

        while self.pending_urls and len(self.visited_urls) < max_pages:
            url = self.pending_urls.popleft()

            if url in self.visited_urls:
                continue

            self.visited_urls.add(url)
            self.scrape_page(url)

            if len(self.visited_urls) % 10 == 0:
                print(f" [{len(self.visited_urls)} pages]")

        print(f"\n✅ Scraped {len(self.visited_urls)} pages")
        self.save_summary()

    def save_summary(self):
        """Save scraping summary"""
        summary = {
            'name': self.name,
            'total_pages': len(self.pages),
            'base_url': self.base_url,
            'pages': [{'title': p['title'], 'url': p['url']} for p in self.pages]
        }

        with open(f"{self.data_dir}/summary.json", 'w', encoding='utf-8') as f:
            json.dump(summary, f, indent=2, ensure_ascii=False)

    def load_scraped_data(self):
        """Load previously scraped data"""
        pages = []
        pages_dir = Path(self.data_dir) / "pages"

        if not pages_dir.exists():
            return []

        for json_file in pages_dir.glob("*.json"):
            try:
                with open(json_file, 'r', encoding='utf-8') as f:
                    pages.append(json.load(f))
            except Exception as e:
                print(f"⚠ Error loading {json_file}: {e}")

        return pages

    def smart_categorize(self, pages):
        """Improved categorization with better pattern matching"""
        category_defs = self.config.get('categories', {})

        # Default smart categories if none provided
        if not category_defs:
            category_defs = self.infer_categories(pages)

        categories = {cat: [] for cat in category_defs.keys()}
        categories['other'] = []

        for page in pages:
            url = page['url'].lower()
            title = page['title'].lower()
            content = page.get('content', '').lower()[:500] # Check first 500 chars

            categorized = False

            # Match against keywords
            for cat, keywords in category_defs.items():
                score = 0
                for keyword in keywords:
                    keyword = keyword.lower()
                    if keyword in url:
                        score += 3
                    if keyword in title:
                        score += 2
                    if keyword in content:
                        score += 1

                if score >= 2: # Threshold for categorization
                    categories[cat].append(page)
                    categorized = True
                    break

            if not categorized:
                categories['other'].append(page)

        # Remove empty categories
        categories = {k: v for k, v in categories.items() if v}

        return categories

    def infer_categories(self, pages):
        """Infer categories from URL patterns (IMPROVED)"""
        url_segments = defaultdict(int)

        for page in pages:
            path = urlparse(page['url']).path
            segments = [s for s in path.split('/') if s and s not in ['en', 'stable', 'latest', 'docs']]

            for seg in segments:
                url_segments[seg] += 1

        # Top segments become categories
        top_segments = sorted(url_segments.items(), key=lambda x: x[1], reverse=True)[:8]

        categories = {}
        for seg, count in top_segments:
            if count >= 3: # At least 3 pages
                categories[seg] = [seg]

        # Add common defaults
        if 'tutorial' not in categories and any('tutorial' in url for url in [p['url'] for p in pages]):
            categories['tutorials'] = ['tutorial', 'guide', 'getting-started']

        if 'api' not in categories and any('api' in url or 'reference' in url for url in [p['url'] for p in pages]):
            categories['api'] = ['api', 'reference', 'class']

        return categories

    def generate_quick_reference(self, pages):
        """Generate quick reference from common patterns (NEW FEATURE)"""
        quick_ref = []

        # Collect all patterns
        all_patterns = []
        for page in pages:
            all_patterns.extend(page.get('patterns', []))

        # Get most common code patterns
        seen_codes = set()
        for pattern in all_patterns:
            code = pattern['code']
            if code not in seen_codes and len(code) < 300:
                quick_ref.append(pattern)
                seen_codes.add(code)
                if len(quick_ref) >= 15:
                    break

        return quick_ref

    def create_reference_file(self, category, pages):
        """Create enhanced reference file"""
        if not pages:
            return

        lines = []
        lines.append(f"# {self.name.title()} - {category.replace('_', ' ').title()}\n")
        lines.append(f"**Pages:** {len(pages)}\n")
        lines.append("---\n")

        for page in pages:
            lines.append(f"## {page['title']}\n")
            lines.append(f"**URL:** {page['url']}\n")

            # Table of contents from headings
            if page.get('headings'):
                lines.append("**Contents:**")
                for h in page['headings'][:10]:
                    level = int(h['level'][1]) if len(h['level']) > 1 else 1
                    indent = " " * max(0, level - 2)
                    lines.append(f"{indent}- {h['text']}")
                lines.append("")

            # Content
            if page.get('content'):
                content = page['content'][:2500]
                if len(page['content']) > 2500:
                    content += "\n\n*[Content truncated]*"
                lines.append(content)
                lines.append("")

            # Code examples with language
            if page.get('code_samples'):
                lines.append("**Examples:**\n")
                for i, sample in enumerate(page['code_samples'][:4], 1):
                    lang = sample.get('language', 'unknown')
                    code = sample.get('code', sample if isinstance(sample, str) else '')
                    lines.append(f"Example {i} ({lang}):")
                    lines.append(f"```{lang}")
                    lines.append(code[:600])
                    if len(code) > 600:
                        lines.append("...")
                    lines.append("```\n")

            lines.append("---\n")

        filepath = os.path.join(self.skill_dir, "references", f"{category}.md")
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write('\n'.join(lines))

        print(f" ✓ {category}.md ({len(pages)} pages)")

    def create_enhanced_skill_md(self, categories, quick_ref):
        """Create SKILL.md with actual examples (IMPROVED)"""
        description = self.config.get('description', f'Comprehensive assistance with {self.name}')

        # Extract actual code examples from docs
        example_codes = []
        for pages in categories.values():
            for page in pages[:3]: # First 3 pages per category
                for sample in page.get('code_samples', [])[:2]: # First 2 samples per page
                    code = sample.get('code', sample if isinstance(sample, str) else '')
                    lang = sample.get('language', 'unknown')
                    if len(code) < 200 and lang != 'unknown':
                        example_codes.append((lang, code))
                    if len(example_codes) >= 10:
                        break
                if len(example_codes) >= 10:
                    break
            if len(example_codes) >= 10:
                break

        content = f"""---
name: {self.name}
description: {description}
---

# {self.name.title()} Skill

Comprehensive assistance with {self.name} development, generated from official documentation.

## When to Use This Skill

This skill should be triggered when:
- Working with {self.name}
- Asking about {self.name} features or APIs
- Implementing {self.name} solutions
- Debugging {self.name} code
- Learning {self.name} best practices

## Quick Reference

### Common Patterns

"""

        # Add actual quick reference patterns
        if quick_ref:
            for i, pattern in enumerate(quick_ref[:8], 1):
                content += f"**Pattern {i}:** {pattern.get('description', 'Example pattern')}\n\n"
                content += "```\n"
                content += pattern.get('code', '')[:300]
                content += "\n```\n\n"
        else:
            content += "*Quick reference patterns will be added as you use the skill.*\n\n"

        # Add example codes from docs
        if example_codes:
            content += "### Example Code Patterns\n\n"
            for i, (lang, code) in enumerate(example_codes[:5], 1):
                content += f"**Example {i}** ({lang}):\n```{lang}\n{code}\n```\n\n"

        content += f"""## Reference Files

This skill includes comprehensive documentation in `references/`:

"""

        for cat in sorted(categories.keys()):
            content += f"- **{cat}.md** - {cat.replace('_', ' ').title()} documentation\n"

        content += """
Use `view` to read specific reference files when detailed information is needed.

## Working with This Skill

### For Beginners
Start with the getting_started or tutorials reference files for foundational concepts.

### For Specific Features
Use the appropriate category reference file (api, guides, etc.) for detailed information.

### For Code Examples
The quick reference section above contains common patterns extracted from the official docs.

## Resources

### references/
Organized documentation extracted from official sources. These files contain:
- Detailed explanations
- Code examples with language annotations
- Links to original documentation
- Table of contents for quick navigation

### scripts/
Add helper scripts here for common automation tasks.

### assets/
Add templates, boilerplate, or example projects here.

## Notes

- This skill was automatically generated from official documentation
- Reference files preserve the structure and examples from source docs
- Code examples include language detection for better syntax highlighting
- Quick reference patterns are extracted from common usage examples in the docs

## Updating

To refresh this skill with updated documentation:
1. Re-run the scraper with the same configuration
2. The skill will be rebuilt with the latest information
"""

        filepath = os.path.join(self.skill_dir, "SKILL.md")
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)

        print(f" ✓ SKILL.md (enhanced with {len(example_codes)} examples)")

    def create_index(self, categories):
        """Create navigation index"""
        lines = []
        lines.append(f"# {self.name.title()} Documentation Index\n")
        lines.append("## Categories\n")

        for cat, pages in sorted(categories.items()):
            lines.append(f"### {cat.replace('_', ' ').title()}")
            lines.append(f"**File:** `{cat}.md`")
            lines.append(f"**Pages:** {len(pages)}\n")

        filepath = os.path.join(self.skill_dir, "references", "index.md")
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write('\n'.join(lines))

        print(" ✓ index.md")

    def build_skill(self):
        """Build the skill from scraped data"""
        print(f"\n{'='*60}")
        print(f"BUILDING SKILL: {self.name}")
        print(f"{'='*60}\n")

        # Load data
        print("Loading scraped data...")
        pages = self.load_scraped_data()

        if not pages:
            print("✗ No scraped data found!")
            return False

        print(f" ✓ Loaded {len(pages)} pages\n")

        # Categorize
        print("Categorizing pages...")
        categories = self.smart_categorize(pages)
        print(f" ✓ Created {len(categories)} categories\n")

        # Generate quick reference
        print("Generating quick reference...")
        quick_ref = self.generate_quick_reference(pages)
        print(f" ✓ Extracted {len(quick_ref)} patterns\n")

        # Create reference files
        print("Creating reference files...")
        for cat, cat_pages in categories.items():
            self.create_reference_file(cat, cat_pages)

        # Create index
        self.create_index(categories)
        print()

        # Create enhanced SKILL.md
        print("Creating SKILL.md...")
        self.create_enhanced_skill_md(categories, quick_ref)

        print(f"\n✅ Skill built: {self.skill_dir}/")
        return True


def load_config(config_path):
    """Load configuration from file"""
    with open(config_path, 'r') as f:
        return json.load(f)


def interactive_config():
    """Interactive configuration"""
    print("\n" + "="*60)
    print("Documentation to Skill Converter")
    print("="*60 + "\n")

    config = {}

    # Basic info
    config['name'] = input("Skill name (e.g., 'react', 'godot'): ").strip()
    config['description'] = input("Skill description: ").strip()
    config['base_url'] = input("Base URL (e.g., https://docs.example.com/): ").strip()

    if not config['base_url'].endswith('/'):
        config['base_url'] += '/'

    # Selectors
    print("\nCSS Selectors (press Enter for defaults):")
    selectors = {}
    selectors['main_content'] = input(" Main content [div[role='main']]: ").strip() or "div[role='main']"
    selectors['title'] = input(" Title [title]: ").strip() or "title"
    selectors['code_blocks'] = input(" Code blocks [pre code]: ").strip() or "pre code"
    config['selectors'] = selectors

    # URL patterns
    print("\nURL Patterns (comma-separated, optional):")
    include = input(" Include: ").strip()
    exclude = input(" Exclude: ").strip()
    config['url_patterns'] = {
        'include': [p.strip() for p in include.split(',') if p.strip()],
        'exclude': [p.strip() for p in exclude.split(',') if p.strip()]
    }

    # Settings
    rate = input("\nRate limit (seconds) [0.5]: ").strip()
    config['rate_limit'] = float(rate) if rate else 0.5

    max_p = input("Max pages [500]: ").strip()
    config['max_pages'] = int(max_p) if max_p else 500

    return config


def check_existing_data(name):
    """Check if scraped data already exists"""
    data_dir = f"output/{name}_data"
    if os.path.exists(data_dir) and os.path.exists(f"{data_dir}/summary.json"):
        with open(f"{data_dir}/summary.json", 'r') as f:
            summary = json.load(f)
        return True, summary.get('total_pages', 0)
    return False, 0


def main():
    parser = argparse.ArgumentParser(
        description='Convert documentation websites to Claude skills',
        formatter_class=argparse.RawDescriptionHelpFormatter
    )

    parser.add_argument('--interactive', '-i', action='store_true',
                        help='Interactive configuration mode')
    parser.add_argument('--config', '-c', type=str,
                        help='Load configuration from file (e.g., configs/godot.json)')
    parser.add_argument('--name', type=str,
                        help='Skill name')
    parser.add_argument('--url', type=str,
                        help='Base documentation URL')
    parser.add_argument('--description', '-d', type=str,
                        help='Skill description')
    parser.add_argument('--skip-scrape', action='store_true',
                        help='Skip scraping, use existing data')
    parser.add_argument('--enhance', action='store_true',
                        help='Enhance SKILL.md using Claude API after building (requires API key)')
    parser.add_argument('--enhance-local', action='store_true',
                        help='Enhance SKILL.md using Claude Code in new terminal (no API key needed)')
    parser.add_argument('--api-key', type=str,
                        help='Anthropic API key for --enhance (or set ANTHROPIC_API_KEY)')

    args = parser.parse_args()

    # Get configuration
    if args.config:
        config = load_config(args.config)
    elif args.interactive or not (args.name and args.url):
        config = interactive_config()
    else:
        config = {
            'name': args.name,
            'description': args.description or f'Comprehensive assistance with {args.name}',
            'base_url': args.url,
            'selectors': {
                'main_content': "div[role='main']",
                'title': 'title',
                'code_blocks': 'pre code'
            },
            'url_patterns': {'include': [], 'exclude': []},
            'rate_limit': 0.5,
            'max_pages': 500
        }

    # Check for existing data
    exists, page_count = check_existing_data(config['name'])

    if exists and not args.skip_scrape:
        print(f"\n✓ Found existing data: {page_count} pages")
        response = input("Use existing data? (y/n): ").strip().lower()
        if response == 'y':
            args.skip_scrape = True

    # Create converter
    converter = DocToSkillConverter(config)

    # Scrape or skip
    if not args.skip_scrape:
        try:
            converter.scrape_all()
        except KeyboardInterrupt:
            print("\n\nScraping interrupted.")
            response = input("Continue with skill building? (y/n): ").strip().lower()
            if response != 'y':
                return
    else:
        print(f"\n⏭️ Skipping scrape, using existing data")

    # Build skill
    success = converter.build_skill()

    if not success:
        sys.exit(1)

    # Optional enhancement with Claude API
    if args.enhance:
        print(f"\n{'='*60}")
        print(f"ENHANCING SKILL.MD WITH CLAUDE API")
        print(f"{'='*60}\n")

        try:
            import subprocess
            enhance_cmd = ['python3', 'enhance_skill.py', f'output/{config["name"]}/']
            if args.api_key:
                enhance_cmd.extend(['--api-key', args.api_key])

            result = subprocess.run(enhance_cmd, check=True)
            if result.returncode == 0:
                print("\n✅ Enhancement complete!")
        except subprocess.CalledProcessError:
            print("\n⚠ Enhancement failed, but skill was still built")
        except FileNotFoundError:
            print("\n⚠ enhance_skill.py not found. Run manually:")
            print(f" python3 enhance_skill.py output/{config['name']}/")

    # Optional enhancement with Claude Code (local, no API key)
    if args.enhance_local:
        print(f"\n{'='*60}")
        print(f"ENHANCING SKILL.MD WITH CLAUDE CODE (LOCAL)")
        print(f"{'='*60}\n")

        try:
            import subprocess
            enhance_cmd = ['python3', 'enhance_skill_local.py', f'output/{config["name"]}/']
            subprocess.run(enhance_cmd, check=True)
        except subprocess.CalledProcessError:
            print("\n⚠ Enhancement failed, but skill was still built")
        except FileNotFoundError:
            print("\n⚠ enhance_skill_local.py not found. Run manually:")
            print(f" python3 enhance_skill_local.py output/{config['name']}/")

    print(f"\n📦 Package your skill:")
    print(f" python3 /mnt/skills/examples/skill-creator/scripts/package_skill.py output/{config['name']}/")

    if not args.enhance and not args.enhance_local:
        print(f"\n💡 Optional: Enhance SKILL.md with Claude:")
        print(f" API-based: python3 enhance_skill.py output/{config['name']}/")
        print(f" or re-run with: --enhance")
        print(f" Local (no API key): python3 enhance_skill_local.py output/{config['name']}/")
        print(f" or re-run with: --enhance-local")


if __name__ == "__main__":
    main()
239
docs/CLAUDE.md
Normal file
@@ -0,0 +1,239 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

This is a Python-based documentation scraper that converts ANY documentation website into a Claude skill. It's a single-file tool (`doc_scraper.py`) that scrapes documentation, extracts code patterns, detects programming languages, and generates structured skill files ready for use with Claude.

## Dependencies

```bash
pip3 install requests beautifulsoup4
```

## Core Commands

### Run with a preset configuration
```bash
python3 doc_scraper.py --config configs/godot.json
python3 doc_scraper.py --config configs/react.json
python3 doc_scraper.py --config configs/vue.json
python3 doc_scraper.py --config configs/django.json
python3 doc_scraper.py --config configs/fastapi.json
```

### Interactive mode (for new frameworks)
```bash
python3 doc_scraper.py --interactive
```

### Quick mode (minimal config)
```bash
python3 doc_scraper.py --name react --url https://react.dev/ --description "React framework"
```

### Skip scraping (use cached data)
```bash
python3 doc_scraper.py --config configs/godot.json --skip-scrape
```
|
||||||
|
|
||||||
|
### AI-powered SKILL.md enhancement
|
||||||
|
```bash
|
||||||
|
# Option 1: During scraping (API-based, requires ANTHROPIC_API_KEY)
|
||||||
|
pip3 install anthropic
|
||||||
|
export ANTHROPIC_API_KEY=sk-ant-...
|
||||||
|
python3 doc_scraper.py --config configs/react.json --enhance
|
||||||
|
|
||||||
|
# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
|
||||||
|
python3 doc_scraper.py --config configs/react.json --enhance-local
|
||||||
|
|
||||||
|
# Option 3: Standalone after scraping (API-based)
|
||||||
|
python3 enhance_skill.py output/react/
|
||||||
|
|
||||||
|
# Option 4: Standalone after scraping (LOCAL, no API key)
|
||||||
|
python3 enhance_skill_local.py output/react/
|
||||||
|
```
|
||||||
|
|
||||||
|
The LOCAL enhancement option (`--enhance-local` or `enhance_skill_local.py`) opens a new terminal with Claude Code, which analyzes the reference files and enhances SKILL.md automatically. This requires a Claude Code Max plan but no API key.
|
||||||
|
|
||||||
|
### Test with limited pages (edit config first)
|
||||||
|
Set `"max_pages": 20` in the config file to test with fewer pages.
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
### Single-File Design
|
||||||
|
The entire tool is contained in `doc_scraper.py` (~737 lines). It follows a class-based architecture with a single `DocToSkillConverter` class that handles:
|
||||||
|
- **Web scraping**: BFS traversal with URL validation
|
||||||
|
- **Content extraction**: CSS selectors for title, content, code blocks
|
||||||
|
- **Language detection**: Heuristic-based detection from code samples (Python, JavaScript, GDScript, C++, etc.)
|
||||||
|
- **Pattern extraction**: Identifies common coding patterns from documentation
|
||||||
|
- **Categorization**: Smart categorization using URL structure, page titles, and content keywords with scoring
|
||||||
|
- **Skill generation**: Creates SKILL.md with real code examples and categorized reference files
|
||||||
|
|
||||||
|
### Data Flow
|
||||||
|
1. **Scrape Phase**:
|
||||||
|
- Input: Config JSON (name, base_url, selectors, url_patterns, categories, rate_limit, max_pages)
|
||||||
|
- Process: BFS traversal starting from base_url, respecting include/exclude patterns
|
||||||
|
- Output: `output/{name}_data/pages/*.json` + `summary.json`
|
||||||
|
|
||||||
|
2. **Build Phase**:
|
||||||
|
- Input: Scraped JSON data from `output/{name}_data/`
|
||||||
|
- Process: Load pages → Smart categorize → Extract patterns → Generate references
|
||||||
|
- Output: `output/{name}/SKILL.md` + `output/{name}/references/*.md`
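
The scrape phase described above can be sketched roughly as follows. This is a minimal illustration, not the code in `doc_scraper.py`: the `crawl` function, its `fetch_links` callback, and the in-memory `SITE` map are hypothetical stand-ins so the example runs without network access.

```python
from collections import deque


def crawl(base_url, fetch_links, include=(), exclude=(), max_pages=20):
    """Breadth-first traversal from base_url (illustrative sketch only).

    fetch_links is a hypothetical callback returning the URLs linked from a page.
    """
    def allowed(url):
        if not url.startswith(base_url):
            return False
        if include and not any(p in url for p in include):
            return False
        return not any(p in url for p in exclude)

    visited = []            # pages in the order they were scraped
    seen = {base_url}       # everything ever queued, to avoid duplicates
    queue = deque([base_url])

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen and allowed(link):
                seen.add(link)
                queue.append(link)
    return visited


# Tiny in-memory "site" standing in for real HTTP requests.
SITE = {
    "https://docs.example.com/": [
        "https://docs.example.com/guide/",
        "https://docs.example.com/blog/news",
        "https://other.example.org/",
    ],
    "https://docs.example.com/guide/": ["https://docs.example.com/guide/install"],
}

pages = crawl(
    "https://docs.example.com/",
    fetch_links=lambda url: SITE.get(url, []),
    exclude=["/blog/"],
)
print(pages)
```

A queue (rather than recursion) keeps pages in breadth-first order, so a low `max_pages` value samples the top of the site first, which is useful when testing a new config.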
|
||||||
|
|
||||||
|
### Directory Structure
|
||||||
|
```
|
||||||
|
doc-to-skill/
|
||||||
|
├── doc_scraper.py # Main scraping & building tool
|
||||||
|
├── enhance_skill.py # AI enhancement (API-based)
|
||||||
|
├── enhance_skill_local.py # AI enhancement (LOCAL, no API)
|
||||||
|
├── configs/ # Preset configurations
|
||||||
|
│ ├── godot.json
|
||||||
|
│ ├── react.json
|
||||||
|
│ ├── steam-inventory.json
|
||||||
|
│ └── ...
|
||||||
|
└── output/
|
||||||
|
├── {name}_data/ # Raw scraped data (cached)
|
||||||
|
│ ├── pages/ # Individual page JSONs
|
||||||
|
│ └── summary.json # Scraping summary
|
||||||
|
└── {name}/ # Generated skill
|
||||||
|
├── SKILL.md # Main skill file with examples
|
||||||
|
├── SKILL.md.backup # Backup (if enhanced)
|
||||||
|
├── references/ # Categorized documentation
|
||||||
|
│ ├── index.md
|
||||||
|
│ ├── getting_started.md
|
||||||
|
│ ├── api.md
|
||||||
|
│ └── ...
|
||||||
|
├── scripts/ # Empty (for user scripts)
|
||||||
|
└── assets/ # Empty (for user assets)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Configuration Format
|
||||||
|
Config files in `configs/*.json` contain:
|
||||||
|
- `name`: Skill identifier (e.g., "godot", "react")
|
||||||
|
- `description`: When to use this skill
|
||||||
|
- `base_url`: Starting URL for scraping
|
||||||
|
- `selectors`: CSS selectors for content extraction
|
||||||
|
- `main_content`: Main documentation content (e.g., "article", "div[role='main']")
|
||||||
|
- `title`: Page title selector
|
||||||
|
- `code_blocks`: Code sample selector (e.g., "pre code", "pre")
|
||||||
|
- `url_patterns`: URL filtering
|
||||||
|
- `include`: Only scrape URLs containing these patterns
|
||||||
|
- `exclude`: Skip URLs containing these patterns
|
||||||
|
- `categories`: Keyword-based categorization mapping
|
||||||
|
- `rate_limit`: Delay between requests (seconds)
|
||||||
|
- `max_pages`: Maximum pages to scrape
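
As a rough illustration of the fields listed above, the sketch below builds a config as a Python dict and serializes it to JSON. Every value is hypothetical (an imaginary `myframework` site), the shape of `categories` (category name mapped to a keyword list) is an assumption, and the `REQUIRED` tuple is an assumed minimum rather than something the tool is known to enforce.

```python
import json

# Hypothetical config for an imaginary framework; every value is illustrative.
config = {
    "name": "myframework",
    "description": "Use when working with MyFramework APIs",
    "base_url": "https://docs.example.com/",
    "selectors": {
        "main_content": "article",
        "title": "h1",
        "code_blocks": "pre code",
    },
    "url_patterns": {"include": ["/guide/", "/api/"], "exclude": ["/blog/"]},
    "categories": {
        "getting_started": ["intro", "install", "quickstart"],
        "api": ["api", "reference"],
    },
    "rate_limit": 0.5,
    "max_pages": 20,
}

REQUIRED = ("name", "base_url", "selectors")  # assumed minimum, not confirmed by the tool


def missing_fields(cfg):
    """Return the assumed-required keys that are absent from cfg."""
    return [key for key in REQUIRED if key not in cfg]


print(missing_fields(config))
print(json.dumps(config, indent=2))
```

Writing the printed JSON to `configs/myframework.json` would give a starting point to edit by hand.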
|
||||||
|
|
||||||
|
### Key Features
|
||||||
|
|
||||||
|
**Auto-detect existing data**: The tool checks for `output/{name}_data/` and prompts to reuse it, avoiding re-scraping.
|
||||||
|
|
||||||
|
**Language detection**: Detects code languages from:
|
||||||
|
1. CSS class attributes (`language-*`, `lang-*`)
|
||||||
|
2. Heuristics (keywords like `def`, `const`, `func`, etc.)
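
The two-step detection above might look something like the sketch below. It is not the body of `detect_language()` in `doc_scraper.py`; the function name `guess_language` and the specific keyword patterns are assumptions chosen for illustration.

```python
import re


def guess_language(css_classes, code):
    """Guess a code block's language (illustrative sketch, not the tool's exact rules)."""
    # 1. Prefer an explicit hint such as class="language-python" or "lang-js".
    for cls in css_classes:
        match = re.match(r"(?:language|lang)-(\w+)", cls)
        if match:
            return match.group(1)

    # 2. Fall back to keyword heuristics; these patterns are assumptions.
    heuristics = [
        ("gdscript", r"\bfunc\s+\w+\s*\("),
        ("python", r"\bdef\s+\w+\s*\(|\bimport\s+\w+"),
        ("javascript", r"\b(?:const|let)\s+\w+\s*=|=>"),
        ("cpp", r"#include\s*<|\bstd::"),
    ]
    for language, pattern in heuristics:
        if re.search(pattern, code):
            return language
    return "unknown"


print(guess_language(["highlight", "language-python"], "x = 1"))
print(guess_language([], "func _ready():\n    pass"))
```

Checking the CSS class first means an explicit annotation from the documentation site always wins over a guess.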
|
||||||
|
|
||||||
|
**Pattern extraction**: Looks for "Example:", "Pattern:", "Usage:" markers in content and extracts following code blocks (up to 5 per page).
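
A minimal sketch of that marker-then-code pairing is shown here. The page model (a list of `("text" | "code", content)` tuples) and the name `find_patterns` are hypothetical simplifications; the real tool works on parsed HTML.

```python
MARKERS = ("Example:", "Pattern:", "Usage:")
MAX_PATTERNS_PER_PAGE = 5


def find_patterns(blocks):
    """Pair each marker paragraph with the code block that directly follows it.

    blocks is a hypothetical, simplified page model: a list of
    ("text" | "code", content) tuples in document order.
    """
    patterns = []
    for (kind, content), (next_kind, next_content) in zip(blocks, blocks[1:]):
        if kind == "text" and next_kind == "code" and any(m in content for m in MARKERS):
            patterns.append({"description": content.strip(), "code": next_content})
            if len(patterns) == MAX_PATTERNS_PER_PAGE:
                break
    return patterns


page = [
    ("text", "Intro paragraph."),
    ("text", "Example: moving a node"),
    ("code", "node.position.x += 10"),
    ("text", "Usage: connect a signal"),
    ("code", "button.pressed.connect(on_pressed)"),
    ("code", "print('no marker before this one')"),
]
print(find_patterns(page))
```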
|
||||||
|
|
||||||
|
**Smart categorization**:
|
||||||
|
- Scores pages against category keywords (3 points for URL match, 2 for title, 1 for content)
|
||||||
|
- Threshold of 2+ for categorization
|
||||||
|
- Auto-infers categories from URL segments if none provided
|
||||||
|
- Falls back to "other" category
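
The scoring rule above can be sketched as follows. The weights and threshold come from the list; how points accumulate across keywords, the tie-breaking, and the name `categorize` are assumptions, so treat this as an approximation of `smart_categorize()` rather than its actual logic.

```python
URL_POINTS, TITLE_POINTS, CONTENT_POINTS = 3, 2, 1
THRESHOLD = 2


def categorize(page, categories):
    """Return the best-scoring category for a page, or "other".

    page is a dict with "url", "title", and "content"; categories maps a
    category name to its keywords. Summing points per keyword is an assumption.
    """
    url, title, content = (page[k].lower() for k in ("url", "title", "content"))
    best_name, best_score = "other", 0
    for name, keywords in categories.items():
        score = 0
        for keyword in keywords:
            keyword = keyword.lower()
            score += URL_POINTS * (keyword in url)
            score += TITLE_POINTS * (keyword in title)
            score += CONTENT_POINTS * (keyword in content)
        if score >= THRESHOLD and score > best_score:
            best_name, best_score = name, score
    return best_name


categories = {
    "api": ["api", "reference"],
    "getting_started": ["install", "quickstart"],
}
page = {
    "url": "https://docs.example.com/api/nodes",
    "title": "Node",
    "content": "Methods of the Node class",
}
print(categorize(page, categories))
```

With a threshold of 2, a single keyword hit in the body text (1 point) is not enough to categorize a page, while a single URL or title hit is.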
|
||||||
|
|
||||||
|
**Enhanced SKILL.md**: Generated with:
|
||||||
|
- Real code examples from documentation (language-annotated)
|
||||||
|
- Quick reference patterns extracted from docs
|
||||||
|
- Common pattern section
|
||||||
|
- Category file listings
|
||||||
|
|
||||||
|
**AI-Powered Enhancement**: Two scripts to dramatically improve SKILL.md quality:
|
||||||
|
- `enhance_skill.py`: Uses Anthropic API (~$0.15-$0.30 per skill, requires API key)
|
||||||
|
- `enhance_skill_local.py`: Uses Claude Code Max (free, no API key needed)
|
||||||
|
- Transforms generic 75-line templates into comprehensive 500+ line guides
|
||||||
|
- Extracts best examples, explains key concepts, adds navigation guidance
|
||||||
|
- Quality rating: 9/10 (based on the steam-economy test)
|
||||||
|
|
||||||
|
## Key Code Locations
|
||||||
|
|
||||||
|
- **URL validation**: `is_valid_url()` doc_scraper.py:47-62
|
||||||
|
- **Content extraction**: `extract_content()` doc_scraper.py:64-131
|
||||||
|
- **Language detection**: `detect_language()` doc_scraper.py:133-163
|
||||||
|
- **Pattern extraction**: `extract_patterns()` doc_scraper.py:165-181
|
||||||
|
- **Smart categorization**: `smart_categorize()` doc_scraper.py:280-321
|
||||||
|
- **Category inference**: `infer_categories()` doc_scraper.py:323-349
|
||||||
|
- **Quick reference generation**: `generate_quick_reference()` doc_scraper.py:351-370
|
||||||
|
- **SKILL.md generation**: `create_enhanced_skill_md()` doc_scraper.py:424-540
|
||||||
|
- **Scraping loop**: `scrape_all()` doc_scraper.py:226-249
|
||||||
|
- **Main workflow**: `main()` doc_scraper.py:661-733
|
||||||
|
|
||||||
|
## Workflow Examples
|
||||||
|
|
||||||
|
### First-time run (scrape + build)
|
||||||
|
```bash
|
||||||
|
# 1. Scrape + Build
|
||||||
|
python3 doc_scraper.py --config configs/godot.json
|
||||||
|
# Time: 20-40 minutes
|
||||||
|
|
||||||
|
# 2. Package (assuming skill-creator is available)
|
||||||
|
python3 package_skill.py output/godot/
|
||||||
|
|
||||||
|
# Result: godot.zip
|
||||||
|
```
|
||||||
|
|
||||||
|
### Using cached data (fast iteration)
|
||||||
|
```bash
|
||||||
|
# 1. Use existing data
|
||||||
|
python3 doc_scraper.py --config configs/godot.json --skip-scrape
|
||||||
|
# Time: 1-3 minutes
|
||||||
|
|
||||||
|
# 2. Package
|
||||||
|
python3 package_skill.py output/godot/
|
||||||
|
```
|
||||||
|
|
||||||
|
### Creating a new framework config
|
||||||
|
```bash
|
||||||
|
# Option 1: Interactive
|
||||||
|
python3 doc_scraper.py --interactive
|
||||||
|
|
||||||
|
# Option 2: Copy and modify
|
||||||
|
cp configs/react.json configs/myframework.json
|
||||||
|
# Edit configs/myframework.json
|
||||||
|
python3 doc_scraper.py --config configs/myframework.json
|
||||||
|
```
|
||||||
|
|
||||||
|
## Testing Selectors
|
||||||
|
|
||||||
|
To find the right CSS selectors for a documentation site:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from bs4 import BeautifulSoup
|
||||||
|
import requests
|
||||||
|
|
||||||
|
url = "https://docs.example.com/page"
|
||||||
|
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
|
||||||
|
|
||||||
|
# Try different selectors
|
||||||
|
print(soup.select_one('article'))
|
||||||
|
print(soup.select_one('main'))
|
||||||
|
print(soup.select_one('div[role="main"]'))
|
||||||
|
```
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
**No content extracted**: Check `main_content` selector. Common values: `article`, `main`, `div[role="main"]`, `div.content`
|
||||||
|
|
||||||
|
**Poor categorization**: Edit `categories` section in config with better keywords specific to the documentation structure
|
||||||
|
|
||||||
|
**Force re-scrape**: Delete cached data with `rm -rf output/{name}_data/`
|
||||||
|
|
||||||
|
**Rate limiting issues**: Increase `rate_limit` value in config (e.g., from 0.5 to 1.0 seconds)
|
||||||
|
|
||||||
|
## Output Quality Checks
|
||||||
|
|
||||||
|
After building, verify quality:
|
||||||
|
```bash
|
||||||
|
cat output/godot/SKILL.md # Should have real code examples
|
||||||
|
cat output/godot/references/index.md # Should show categories
|
||||||
|
ls output/godot/references/ # Should have category .md files
|
||||||
|
```
|
||||||
250
docs/ENHANCEMENT.md
Normal file
@@ -0,0 +1,250 @@
|
|||||||
|
# AI-Powered SKILL.md Enhancement
|
||||||
|
|
||||||
|
Two scripts are available to dramatically improve your SKILL.md file:
|
||||||
|
1. **`enhance_skill_local.py`** - Uses Claude Code Max (no API key, **recommended**)
|
||||||
|
2. **`enhance_skill.py`** - Uses Anthropic API (~$0.15-$0.30 per skill)
|
||||||
|
|
||||||
|
Both analyze reference documentation and extract the best examples and guidance.
|
||||||
|
|
||||||
|
## Why Use Enhancement?
|
||||||
|
|
||||||
|
**Problem:** The auto-generated SKILL.md is often too generic:
|
||||||
|
- Empty Quick Reference section
|
||||||
|
- No practical code examples
|
||||||
|
- Generic "When to Use" triggers
|
||||||
|
- Doesn't highlight key features
|
||||||
|
|
||||||
|
**Solution:** Let Claude read your reference docs and create a much better SKILL.md with:
|
||||||
|
- ✅ Best code examples extracted from documentation
|
||||||
|
- ✅ Practical quick reference with real patterns
|
||||||
|
- ✅ Domain-specific guidance
|
||||||
|
- ✅ Clear navigation tips
|
||||||
|
- ✅ Key concepts explained
|
||||||
|
|
||||||
|
## Quick Start (LOCAL - No API Key)
|
||||||
|
|
||||||
|
**Recommended for Claude Code Max users:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Option 1: Standalone enhancement
|
||||||
|
python3 enhance_skill_local.py output/steam-inventory/
|
||||||
|
|
||||||
|
# Option 2: Integrated with scraper
|
||||||
|
python3 doc_scraper.py --config configs/steam-inventory.json --enhance-local
|
||||||
|
```
|
||||||
|
|
||||||
|
**What happens:**
|
||||||
|
1. Opens new terminal window
|
||||||
|
2. Runs Claude Code with enhancement prompt
|
||||||
|
3. Claude analyzes reference files (~15-20K chars)
|
||||||
|
4. Generates enhanced SKILL.md (30-60 seconds)
|
||||||
|
5. Terminal auto-closes when done
|
||||||
|
|
||||||
|
**Requirements:**
|
||||||
|
- Claude Code Max plan (you're already using it!)
|
||||||
|
- macOS for auto-launch; on other operating systems, run the command manually in a terminal
|
||||||
|
|
||||||
|
## API-Based Enhancement (Alternative)
|
||||||
|
|
||||||
|
**If you prefer the API-based approach:**
|
||||||
|
|
||||||
|
### Installation
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pip3 install anthropic
|
||||||
|
```
|
||||||
|
|
||||||
|
### Set Up API Key
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Option 1: Environment variable (recommended)
|
||||||
|
export ANTHROPIC_API_KEY=sk-ant-...
|
||||||
|
|
||||||
|
# Option 2: Pass directly with --api-key
|
||||||
|
python3 enhance_skill.py output/react/ --api-key sk-ant-...
|
||||||
|
```
|
||||||
|
|
||||||
|
### Usage
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Standalone enhancement
|
||||||
|
python3 enhance_skill.py output/steam-inventory/
|
||||||
|
|
||||||
|
# Integrated with scraper
|
||||||
|
python3 doc_scraper.py --config configs/steam-inventory.json --enhance
|
||||||
|
|
||||||
|
# Dry run (see what would be done)
|
||||||
|
python3 enhance_skill.py output/react/ --dry-run
|
||||||
|
```
|
||||||
|
|
||||||
|
## What It Does
|
||||||
|
|
||||||
|
1. **Reads reference files** (api_reference.md, webapi.md, etc.)
|
||||||
|
2. **Sends to Claude** with instructions to:
|
||||||
|
- Extract 5-10 best code examples
|
||||||
|
- Create practical quick reference
|
||||||
|
- Write domain-specific "When to Use" triggers
|
||||||
|
- Add helpful navigation guidance
|
||||||
|
3. **Backs up original** SKILL.md to SKILL.md.backup
|
||||||
|
4. **Saves enhanced version** as new SKILL.md
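
Steps 3 and 4 amount to a backup-then-overwrite. A minimal sketch, assuming a hypothetical helper named `save_enhanced` (the real scripts may structure this differently), could look like:

```python
import shutil
import tempfile
from pathlib import Path


def save_enhanced(skill_dir, enhanced_text):
    """Back up SKILL.md to SKILL.md.backup, then write the enhanced version."""
    skill_md = Path(skill_dir) / "SKILL.md"
    backup = skill_md.with_name("SKILL.md.backup")
    if skill_md.exists():
        shutil.copy2(skill_md, backup)   # keep the original for easy rollback
    skill_md.write_text(enhanced_text, encoding="utf-8")
    return backup


# Demonstration in a throwaway directory, so no real skill is touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "SKILL.md").write_text("original", encoding="utf-8")
    backup_path = save_enhanced(tmp, "enhanced")
    original_kept = backup_path.read_text(encoding="utf-8")
    new_content = (Path(tmp) / "SKILL.md").read_text(encoding="utf-8")

print(original_kept, "->", new_content)
```

Copying before writing means a failed or disappointing enhancement can be undone by moving `SKILL.md.backup` back over `SKILL.md`.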
|
||||||
|
|
||||||
|
## Example Enhancement
|
||||||
|
|
||||||
|
### Before (Auto-Generated)
|
||||||
|
```markdown
|
||||||
|
## Quick Reference
|
||||||
|
|
||||||
|
### Common Patterns
|
||||||
|
|
||||||
|
*Quick reference patterns will be added as you use the skill.*
|
||||||
|
```
|
||||||
|
|
||||||
|
### After (AI-Enhanced)
|
||||||
|
````markdown
|
||||||
|
## Quick Reference
|
||||||
|
|
||||||
|
### Common API Patterns
|
||||||
|
|
||||||
|
**Granting promotional items:**
|
||||||
|
```cpp
|
||||||
|
void CInventory::GrantPromoItems()
|
||||||
|
{
|
||||||
|
SteamItemDef_t newItems[2];
|
||||||
|
newItems[0] = 110;
|
||||||
|
newItems[1] = 111;
|
||||||
|
SteamInventory()->AddPromoItems( &s_GenerateRequestResult, newItems, 2 );
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Getting all items in player inventory:**
|
||||||
|
```cpp
|
||||||
|
SteamInventoryResult_t resultHandle;
|
||||||
|
bool success = SteamInventory()->GetAllItems( &resultHandle );
|
||||||
|
```
|
||||||
|
[... 8 more practical examples ...]
|
||||||
|
````
|
||||||
|
|
||||||
|
## Cost Estimate
|
||||||
|
|
||||||
|
- **Input**: ~50,000-100,000 tokens (reference docs)
|
||||||
|
- **Output**: ~4,000 tokens (enhanced SKILL.md)
|
||||||
|
- **Model**: claude-sonnet-4-20250514
|
||||||
|
- **Estimated cost**: $0.15-$0.30 per skill
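
The arithmetic behind an estimate like this is simple to reproduce. The sketch below assumes illustrative prices of $3 per million input tokens and $15 per million output tokens; these defaults are assumptions, not figures from this guide, so substitute the currently published rates for the model you use. The function name `estimate_cost` is hypothetical.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_mtok=3.00, output_price_per_mtok=15.00):
    """Estimated USD cost of one API call; the default prices are assumptions."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


low = estimate_cost(50_000, 4_000)
high = estimate_cost(100_000, 4_000)
print(f"${low:.2f} to ${high:.2f} per skill")
```

Under those assumed prices the result comes out at about $0.21 to $0.36, close to the range quoted above; treat both as rough figures, since input size dominates the cost and varies with the documentation.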
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### "No API key provided"
|
||||||
|
```bash
|
||||||
|
export ANTHROPIC_API_KEY=sk-ant-...
|
||||||
|
# or
|
||||||
|
python3 enhance_skill.py output/react/ --api-key sk-ant-...
|
||||||
|
```
|
||||||
|
|
||||||
|
### "No reference files found"
|
||||||
|
Make sure you've run the scraper first:
|
||||||
|
```bash
|
||||||
|
python3 doc_scraper.py --config configs/react.json
|
||||||
|
```
|
||||||
|
|
||||||
|
### "anthropic package not installed"
|
||||||
|
```bash
|
||||||
|
pip3 install anthropic
|
||||||
|
```
|
||||||
|
|
||||||
|
### Don't like the result?
|
||||||
|
```bash
|
||||||
|
# Restore original
|
||||||
|
mv output/steam-inventory/SKILL.md.backup output/steam-inventory/SKILL.md
|
||||||
|
|
||||||
|
# Try again (it may generate different content)
|
||||||
|
python3 enhance_skill.py output/steam-inventory/
|
||||||
|
```
|
||||||
|
|
||||||
|
## Tips
|
||||||
|
|
||||||
|
1. **Run after scraping completes** - Enhancement works best with complete reference docs
|
||||||
|
2. **Review the output** - AI is good but not perfect; check the generated SKILL.md
|
||||||
|
3. **Keep the backup** - Original is saved as SKILL.md.backup
|
||||||
|
4. **Re-run if needed** - Each run may produce slightly different results
|
||||||
|
5. **No re-scraping needed** - Reference files are local, so re-running enhancement does not touch the documentation site
|
||||||
|
|
||||||
|
## Real-World Results
|
||||||
|
|
||||||
|
**Test Case: steam-economy skill**
|
||||||
|
- **Before:** 75 lines, generic template, empty Quick Reference
|
||||||
|
- **After:** 570 lines, 10 practical API examples, key concepts explained
|
||||||
|
- **Time:** 60 seconds
|
||||||
|
- **Quality Rating:** 9/10
|
||||||
|
|
||||||
|
The LOCAL enhancement successfully:
|
||||||
|
- Extracted best HTTP/JSON examples from 24 pages of documentation
|
||||||
|
- Explained domain concepts (Asset Classes, Context IDs, Transaction Lifecycle)
|
||||||
|
- Created navigation guidance for beginners through advanced users
|
||||||
|
- Added best practices for security, economy design, and API integration
|
||||||
|
|
||||||
|
## Limitations
|
||||||
|
|
||||||
|
**LOCAL Enhancement (`enhance_skill_local.py`):**
|
||||||
|
- Requires Claude Code Max plan
|
||||||
|
- macOS auto-launch only (manual on other OS)
|
||||||
|
- Opens new terminal window
|
||||||
|
- Takes ~60 seconds
|
||||||
|
|
||||||
|
**API Enhancement (`enhance_skill.py`):**
|
||||||
|
- Requires Anthropic API key (paid)
|
||||||
|
- Cost: ~$0.15-$0.30 per skill
|
||||||
|
- Limited to ~100K characters of reference input (the `max_chars` default in `read_reference_files()`)
|
||||||
|
|
||||||
|
**Both:**
|
||||||
|
- May occasionally miss the best examples
|
||||||
|
- Can't understand context beyond the reference docs
|
||||||
|
- Doesn't modify reference files (only SKILL.md)
|
||||||
|
|
||||||
|
## Enhancement Options Comparison
|
||||||
|
|
||||||
|
| Aspect | Manual Edit | LOCAL Enhancement | API Enhancement |
|
||||||
|
|--------|-------------|-------------------|-----------------|
|
||||||
|
| Time | 15-30 minutes | 30-60 seconds | 30-60 seconds |
|
||||||
|
| Code examples | You pick | AI picks best | AI picks best |
|
||||||
|
| Quick reference | Write yourself | Auto-generated | Auto-generated |
|
||||||
|
| Domain guidance | Your knowledge | From docs | From docs |
|
||||||
|
| Consistency | Varies | Consistent | Consistent |
|
||||||
|
| Cost | Free (your time) | Free (Max plan) | ~$0.20 per skill |
|
||||||
|
| Setup | None | None | API key needed |
|
||||||
|
| Quality | High (if expert) | 9/10 | 9/10 |
|
||||||
|
| **Recommended?** | For experts only | ✅ **Yes** | If no Max plan |
|
||||||
|
|
||||||
|
## When to Use
|
||||||
|
|
||||||
|
**Use enhancement when:**
|
||||||
|
- You want high-quality SKILL.md quickly
|
||||||
|
- Working with large documentation (50+ pages)
|
||||||
|
- Creating skills for unfamiliar frameworks
|
||||||
|
- Need practical code examples extracted
|
||||||
|
- Want consistent quality across multiple skills
|
||||||
|
|
||||||
|
**Skip enhancement when:**
|
||||||
|
- Budget constrained (use manual editing)
|
||||||
|
- Very small documentation (<10 pages)
|
||||||
|
- You know the framework intimately
|
||||||
|
- Documentation has no code examples
|
||||||
|
|
||||||
|
## Advanced: Customization
|
||||||
|
|
||||||
|
To customize how Claude enhances the SKILL.md, edit `enhance_skill.py` and modify the `_build_enhancement_prompt()` method around line 130.
|
||||||
|
|
||||||
|
Example customization:
|
||||||
|
```python
|
||||||
|
prompt += """
|
||||||
|
ADDITIONAL REQUIREMENTS:
|
||||||
|
- Focus on security best practices
|
||||||
|
- Include performance tips
|
||||||
|
- Add troubleshooting section
|
||||||
|
"""
|
||||||
|
```
|
||||||
|
|
||||||
|
## See Also
|
||||||
|
|
||||||
|
- [README.md](../README.md) - Main documentation
|
||||||
|
- [CLAUDE.md](CLAUDE.md) - Architecture guide
|
||||||
|
- [doc_scraper.py](../doc_scraper.py) - Main scraping tool
|
||||||
252
docs/UPLOAD_GUIDE.md
Normal file
@@ -0,0 +1,252 @@
|
|||||||
|
# How to Upload Skills to Claude
|
||||||
|
|
||||||
|
## Quick Answer
|
||||||
|
|
||||||
|
**You upload the `.zip` file created by `package_skill.py`**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Create the zip file
|
||||||
|
python3 package_skill.py output/steam-economy/
|
||||||
|
|
||||||
|
# This creates: output/steam-economy.zip
|
||||||
|
# Upload this file to Claude!
|
||||||
|
```
|
||||||
|
|
||||||
|
## What's Inside the Zip?
|
||||||
|
|
||||||
|
The `.zip` file contains:
|
||||||
|
|
||||||
|
```
|
||||||
|
steam-economy.zip
|
||||||
|
├── SKILL.md ← Main skill file (Claude reads this first)
|
||||||
|
└── references/ ← Reference documentation
|
||||||
|
├── index.md ← Category index
|
||||||
|
├── api_reference.md ← API docs
|
||||||
|
├── pricing.md ← Pricing docs
|
||||||
|
├── trading.md ← Trading docs
|
||||||
|
└── ... ← Other categorized docs
|
||||||
|
```
|
||||||
|
|
||||||
|
**Note:** The zip only includes what Claude needs. It excludes:
|
||||||
|
- `.backup` files
|
||||||
|
- Build artifacts
|
||||||
|
- Temporary files
|
||||||
|
|
||||||
|
## What Does package_skill.py Do?
|
||||||
|
|
||||||
|
The package script:
|
||||||
|
|
||||||
|
1. **Finds your skill directory** (e.g., `output/steam-economy/`)
|
||||||
|
2. **Validates SKILL.md exists** (required!)
|
||||||
|
3. **Creates a .zip file** with the same name
|
||||||
|
4. **Includes all files** except backups
|
||||||
|
5. **Saves to** `output/` directory
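
The steps above can be approximated with the standard `zipfile` module. This is a sketch under stated assumptions, not the source of `package_skill.py`: the helper name, the choice to skip files by the `.backup` suffix, and placing `SKILL.md` at the archive root are all inferred from this guide.

```python
import tempfile
import zipfile
from pathlib import Path


def package_skill(skill_dir):
    """Zip a skill directory next to itself, skipping *.backup files (sketch)."""
    skill_dir = Path(skill_dir)
    if not (skill_dir / "SKILL.md").exists():
        raise FileNotFoundError(f"SKILL.md not found in {skill_dir}")

    zip_path = skill_dir.parent / f"{skill_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(skill_dir.rglob("*")):
            if path.is_file() and path.suffix != ".backup":
                archive.write(path, path.relative_to(skill_dir).as_posix())
    return zip_path


# Demonstration with a throwaway skill directory.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "demo-skill"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text("# Demo", encoding="utf-8")
    (skill / "SKILL.md.backup").write_text("# Old", encoding="utf-8")
    (skill / "references" / "index.md").write_text("# Index", encoding="utf-8")
    with zipfile.ZipFile(package_skill(skill)) as archive:
        names = sorted(archive.namelist())

print(names)
```

Writing the archive to the parent directory keeps the zip out of the tree being walked, so it can never end up inside itself.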
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
```bash
|
||||||
|
python3 package_skill.py output/steam-economy/
|
||||||
|
|
||||||
|
📦 Packaging skill: steam-economy
|
||||||
|
Source: output/steam-economy
|
||||||
|
Output: output/steam-economy.zip
|
||||||
|
+ SKILL.md
|
||||||
|
+ references/api_reference.md
|
||||||
|
+ references/pricing.md
|
||||||
|
+ references/trading.md
|
||||||
|
+ ...
|
||||||
|
|
||||||
|
✅ Package created: output/steam-economy.zip
|
||||||
|
Size: 14,290 bytes (14.0 KB)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Complete Workflow
|
||||||
|
|
||||||
|
### Step 1: Scrape & Build
|
||||||
|
```bash
|
||||||
|
python3 doc_scraper.py --config configs/steam-economy.json
|
||||||
|
```
|
||||||
|
|
||||||
|
**Output:**
|
||||||
|
- `output/steam-economy_data/` (raw scraped data)
|
||||||
|
- `output/steam-economy/` (skill directory)
|
||||||
|
|
||||||
|
### Step 2: Enhance (Recommended)
|
||||||
|
```bash
|
||||||
|
python3 enhance_skill_local.py output/steam-economy/
|
||||||
|
```
|
||||||
|
|
||||||
|
**What it does:**
|
||||||
|
- Analyzes reference files
|
||||||
|
- Creates comprehensive SKILL.md
|
||||||
|
- Backs up original to SKILL.md.backup
|
||||||
|
|
||||||
|
**Output:**
|
||||||
|
- `output/steam-economy/SKILL.md` (enhanced)
|
||||||
|
- `output/steam-economy/SKILL.md.backup` (original)
|
||||||
|
|
||||||
|
### Step 3: Package
|
||||||
|
```bash
|
||||||
|
python3 package_skill.py output/steam-economy/
|
||||||
|
```
|
||||||
|
|
||||||
|
**Output:**
|
||||||
|
- `output/steam-economy.zip` ← **THIS IS WHAT YOU UPLOAD**
|
||||||
|
|
||||||
|
### Step 4: Upload to Claude
|
||||||
|
1. Go to Claude (claude.ai)
|
||||||
|
2. Click "Add Skill" or skill upload button
|
||||||
|
3. Select `output/steam-economy.zip`
|
||||||
|
4. Done!
|
||||||
|
|
||||||
|
## What Files Are Required?
|
||||||
|
|
||||||
|
**Minimum required structure:**
|
||||||
|
```
|
||||||
|
your-skill/
|
||||||
|
└── SKILL.md ← Required! Claude reads this first
|
||||||
|
```
|
||||||
|
|
||||||
|
**Recommended structure:**
|
||||||
|
```
|
||||||
|
your-skill/
|
||||||
|
├── SKILL.md ← Main skill file (required)
|
||||||
|
└── references/ ← Reference docs (highly recommended)
|
||||||
|
├── index.md
|
||||||
|
└── *.md ← Category files
|
||||||
|
```
|
||||||
|
|
||||||
|
**Optional (can add manually):**
|
||||||
|
```
|
||||||
|
your-skill/
|
||||||
|
├── SKILL.md
|
||||||
|
├── references/
|
||||||
|
├── scripts/ ← Helper scripts
|
||||||
|
│ └── *.py
|
||||||
|
└── assets/ ← Templates, examples
|
||||||
|
└── *.txt
|
||||||
|
```
|
||||||
|
|
||||||
|
## File Size Limits
|
||||||
|
|
||||||
|
The package script shows size after packaging:
|
||||||
|
```
|
||||||
|
✅ Package created: output/steam-economy.zip
|
||||||
|
Size: 14,290 bytes (14.0 KB)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Typical sizes:**
|
||||||
|
- Small skill: 5-20 KB
|
||||||
|
- Medium skill: 20-100 KB
|
||||||
|
- Large skill: 100-500 KB
|
||||||
|
|
||||||
|
Claude has generous size limits, so most documentation-based skills fit easily.
|
||||||
|
|
||||||
|
## Quick Reference
|
||||||
|
|
||||||
|
### Package a Skill
|
||||||
|
```bash
|
||||||
|
python3 package_skill.py output/steam-economy/
|
||||||
|
```
|
||||||
|
|
||||||
|
### Package Multiple Skills
|
||||||
|
```bash
|
||||||
|
# Package all skills in output/
|
||||||
|
for dir in output/*/; do
|
||||||
|
if [ -f "$dir/SKILL.md" ]; then
|
||||||
|
python3 package_skill.py "$dir"
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
```
|
||||||
|
|
||||||
|
### Check What's in a Zip
|
||||||
|
```bash
|
||||||
|
unzip -l output/steam-economy.zip
|
||||||
|
```
|
||||||
|
|
||||||
|
### Test a Packaged Skill Locally
|
||||||
|
```bash
|
||||||
|
# Extract to temp directory
|
||||||
|
mkdir -p temp-test
|
||||||
|
unzip output/steam-economy.zip -d temp-test/
|
||||||
|
cat temp-test/SKILL.md
|
||||||
|
```
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### "SKILL.md not found"
|
||||||
|
```bash
|
||||||
|
# Make sure you scraped and built first
|
||||||
|
python3 doc_scraper.py --config configs/steam-economy.json
|
||||||
|
|
||||||
|
# Then package
|
||||||
|
python3 package_skill.py output/steam-economy/
|
||||||
|
```
|
||||||
|
|
||||||
|
### "Directory not found"
|
||||||
|
```bash
|
||||||
|
# Check what skills are available
|
||||||
|
ls output/
|
||||||
|
|
||||||
|
# Use correct path
|
||||||
|
python3 package_skill.py output/YOUR-SKILL-NAME/
|
||||||
|
```
|
||||||
|
|
||||||
|
### Zip is Too Large
|
||||||
|
Most skills are small, but if yours is large:
|
||||||
|
```bash
|
||||||
|
# Check size
|
||||||
|
ls -lh output/steam-economy.zip
|
||||||
|
|
||||||
|
# If needed, check what's taking space
|
||||||
|
unzip -l output/steam-economy.zip | sort -k1 -rn | head -20
|
||||||
|
```
|
||||||
|
|
||||||
|
Reference files are usually small. Large sizes often mean:
|
||||||
|
- Many images (skills typically don't need images)
|
||||||
|
- Large code examples (these are fine, just be aware)
|
||||||
|
|
||||||
|
## What Does Claude Do With the Zip?
|
||||||
|
|
||||||
|
When you upload a skill zip:
|
||||||
|
|
||||||
|
1. **Claude extracts it**
|
||||||
|
2. **Reads SKILL.md first** - This tells Claude:
|
||||||
|
- When to activate this skill
|
||||||
|
- What the skill does
|
||||||
|
- Quick reference examples
|
||||||
|
- How to navigate the references
|
||||||
|
3. **Indexes reference files** - Claude can search through:
|
||||||
|
- `references/*.md` files
|
||||||
|
- Find specific APIs, examples, concepts
|
||||||
|
4. **Activates automatically** - When you ask about topics matching the skill
|
||||||
|
|
||||||
|
## Example: Using the Packaged Skill
|
||||||
|
|
||||||
|
After uploading `steam-economy.zip`:
|
||||||
|
|
||||||
|
**You ask:** "How do I implement microtransactions in my Steam game?"
|
||||||
|
|
||||||
|
**Claude:**
|
||||||
|
- Recognizes this matches steam-economy skill
|
||||||
|
- Reads SKILL.md for quick reference
|
||||||
|
- Searches references/microtransactions.md
|
||||||
|
- Provides detailed answer with code examples
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
**What you need to do:**
|
||||||
|
1. ✅ Scrape: `python3 doc_scraper.py --config configs/YOUR-CONFIG.json`
|
||||||
|
2. ✅ Enhance: `python3 enhance_skill_local.py output/YOUR-SKILL/`
|
||||||
|
3. ✅ Package: `python3 package_skill.py output/YOUR-SKILL/`
|
||||||
|
4. ✅ Upload: Upload the `.zip` file to Claude
|
||||||
|
|
||||||
|
**What you upload:**
|
||||||
|
- The `.zip` file from `output/` directory
|
||||||
|
- Example: `output/steam-economy.zip`
|
||||||
|
|
||||||
|
**What's in the zip:**
|
||||||
|
- `SKILL.md` (required)
|
||||||
|
- `references/*.md` (recommended)
|
||||||
|
- Any scripts/assets you added (optional)
|
||||||
|
|
||||||
|
That's it! 🚀
|
||||||
292
enhance_skill.py
Normal file
@@ -0,0 +1,292 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
SKILL.md Enhancement Script
|
||||||
|
Uses Claude API to improve SKILL.md by analyzing reference documentation.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python3 enhance_skill.py output/steam-inventory/
|
||||||
|
python3 enhance_skill.py output/react/
|
||||||
|
python3 enhance_skill.py output/godot/ --api-key YOUR_API_KEY
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import json
|
||||||
|
import argparse
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
try:
|
||||||
|
import anthropic
|
||||||
|
except ImportError:
|
||||||
|
print("❌ Error: anthropic package not installed")
|
||||||
|
print("Install with: pip3 install anthropic")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
|
||||||
|
class SkillEnhancer:
|
||||||
|
def __init__(self, skill_dir, api_key=None):
|
||||||
|
self.skill_dir = Path(skill_dir)
|
||||||
|
self.references_dir = self.skill_dir / "references"
|
||||||
|
self.skill_md_path = self.skill_dir / "SKILL.md"
|
||||||
|
|
||||||
|
# Get API key
|
||||||
|
self.api_key = api_key or os.environ.get('ANTHROPIC_API_KEY')
|
||||||
|
if not self.api_key:
|
||||||
|
raise ValueError(
|
||||||
|
"No API key provided. Set ANTHROPIC_API_KEY environment variable "
|
||||||
|
"or use --api-key argument"
|
||||||
|
)
|
||||||
|
|
||||||
|
self.client = anthropic.Anthropic(api_key=self.api_key)
|
||||||
|
|
||||||
|
def read_reference_files(self, max_chars=100000):
|
||||||
|
"""Read reference files with size limit"""
|
||||||
|
references = {}
|
||||||
|
|
||||||
|
if not self.references_dir.exists():
|
||||||
|
print(f"⚠ No references directory found at {self.references_dir}")
|
||||||
|
return references
|
||||||
|
|
||||||
|
total_chars = 0
|
||||||
|
for ref_file in sorted(self.references_dir.glob("*.md")):
|
||||||
|
if ref_file.name == "index.md":
|
||||||
|
continue
|
||||||
|
|
||||||
|
content = ref_file.read_text(encoding='utf-8')
|
||||||
|
|
||||||
|
# Limit size per file
|
||||||
|
if len(content) > 40000:
|
||||||
|
content = content[:40000] + "\n\n[Content truncated...]"
|
||||||
|
|
||||||
|
references[ref_file.name] = content
|
||||||
|
total_chars += len(content)
|
||||||
|
|
||||||
|
# Stop if we've read enough
|
||||||
|
if total_chars > max_chars:
|
||||||
|
print(f" ℹ Limiting input to {max_chars:,} characters")
|
||||||
|
break
|
||||||
|
|
||||||
|
return references
|
||||||
|
|
||||||
|
def read_current_skill_md(self):
|
||||||
|
"""Read existing SKILL.md"""
|
||||||
|
if not self.skill_md_path.exists():
|
||||||
|
return None
|
||||||
|
return self.skill_md_path.read_text(encoding='utf-8')
|
||||||
|
|
||||||
|
def enhance_skill_md(self, references, current_skill_md):
|
||||||
|
"""Use Claude to enhance SKILL.md"""
|
||||||
|
|
||||||
|
# Build prompt
|
||||||
|
prompt = self._build_enhancement_prompt(references, current_skill_md)
|
||||||
|
|
||||||
|
print("\n🤖 Asking Claude to enhance SKILL.md...")
|
||||||
|
print(f" Input: {len(prompt):,} characters")
|
||||||
|
|
||||||
|
try:
|
||||||
|
message = self.client.messages.create(
|
||||||
|
model="claude-sonnet-4-20250514",
|
||||||
|
max_tokens=4096,
|
||||||
|
temperature=0.3,
|
||||||
|
messages=[{
|
||||||
|
"role": "user",
|
||||||
|
"content": prompt
|
||||||
|
}]
|
||||||
|
)
|
||||||
|
|
||||||
|
enhanced_content = message.content[0].text
|
||||||
|
return enhanced_content
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
print(f"❌ Error calling Claude API: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
    def _build_enhancement_prompt(self, references, current_skill_md):
        """Build the prompt for Claude"""

        # Extract skill name and description
        skill_name = self.skill_dir.name

        prompt = f"""You are enhancing a Claude skill's SKILL.md file. This skill is about: {skill_name}

I've scraped documentation and organized it into reference files. Your job is to create an EXCELLENT SKILL.md that will help Claude use this documentation effectively.

CURRENT SKILL.MD:
{'```markdown' if current_skill_md else '(none - create from scratch)'}
{current_skill_md or 'No existing SKILL.md'}
{'```' if current_skill_md else ''}

REFERENCE DOCUMENTATION:
"""

        for filename, content in references.items():
            prompt += f"\n\n## {filename}\n```markdown\n{content[:30000]}\n```\n"

        prompt += """

YOUR TASK:
Create an enhanced SKILL.md that includes:

1. **Clear "When to Use This Skill" section** - Be specific about trigger conditions
2. **Excellent Quick Reference section** - Extract 5-10 of the BEST, most practical code examples from the reference docs
   - Choose SHORT, clear examples that demonstrate common tasks
   - Include both simple and intermediate examples
   - Annotate examples with clear descriptions
   - Use proper language tags (cpp, python, javascript, json, etc.)
3. **Detailed Reference Files description** - Explain what's in each reference file
4. **Practical "Working with This Skill" section** - Give users clear guidance on how to navigate the skill
5. **Key Concepts section** (if applicable) - Explain core concepts
6. **Keep the frontmatter** (---\nname: ...\n---) intact

IMPORTANT:
- Extract REAL examples from the reference docs, don't make them up
- Prioritize SHORT, clear examples (5-20 lines max)
- Make it actionable and practical
- Don't be too verbose - be concise but useful
- Maintain the markdown structure for Claude skills
- Keep code examples properly formatted with language tags

OUTPUT:
Return ONLY the complete SKILL.md content, starting with the frontmatter (---).
"""

        return prompt

    def save_enhanced_skill_md(self, content):
        """Save the enhanced SKILL.md"""
        # Backup original
        if self.skill_md_path.exists():
            backup_path = self.skill_md_path.with_suffix('.md.backup')
            # replace() overwrites a stale backup on every platform;
            # rename() raises on Windows when the target already exists
            self.skill_md_path.replace(backup_path)
            print(f" 💾 Backed up original to: {backup_path.name}")

        # Save enhanced version
        self.skill_md_path.write_text(content, encoding='utf-8')
        print(" ✅ Saved enhanced SKILL.md")

    def run(self):
        """Main enhancement workflow"""
        print(f"\n{'='*60}")
        print(f"ENHANCING SKILL: {self.skill_dir.name}")
        print(f"{'='*60}\n")

        # Read reference files
        print("📖 Reading reference documentation...")
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found to analyze")
            return False

        print(f" ✓ Read {len(references)} reference files")
        total_size = sum(len(c) for c in references.values())
        print(f" ✓ Total size: {total_size:,} characters\n")

        # Read current SKILL.md
        current_skill_md = self.read_current_skill_md()
        if current_skill_md:
            print(f" ℹ Found existing SKILL.md ({len(current_skill_md)} chars)")
        else:
            print(f" ℹ No existing SKILL.md, will create new one")

        # Enhance with Claude
        enhanced = self.enhance_skill_md(references, current_skill_md)

        if not enhanced:
            print("❌ Enhancement failed")
            return False

        print(f" ✓ Generated enhanced SKILL.md ({len(enhanced)} chars)\n")

        # Save
        print("💾 Saving enhanced SKILL.md...")
        self.save_enhanced_skill_md(enhanced)

        print(f"\n✅ Enhancement complete!")
        print(f"\nNext steps:")
        print(f" 1. Review: {self.skill_md_path}")
        print(f" 2. If you don't like it, restore backup: {self.skill_md_path.with_suffix('.md.backup')}")
        print(f" 3. Package your skill:")
        print(f" python3 /mnt/skills/examples/skill-creator/scripts/package_skill.py {self.skill_dir}/")

        return True


def main():
    parser = argparse.ArgumentParser(
        description='Enhance SKILL.md using Claude API',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Using ANTHROPIC_API_KEY environment variable
  export ANTHROPIC_API_KEY=sk-ant-...
  python3 enhance_skill.py output/steam-inventory/

  # Providing API key directly
  python3 enhance_skill.py output/react/ --api-key sk-ant-...

  # Show what would be done (dry run)
  python3 enhance_skill.py output/godot/ --dry-run
"""
    )

    parser.add_argument('skill_dir', type=str,
                        help='Path to skill directory (e.g., output/steam-inventory/)')
    parser.add_argument('--api-key', type=str,
                        help='Anthropic API key (or set ANTHROPIC_API_KEY env var)')
    parser.add_argument('--dry-run', action='store_true',
                        help='Show what would be done without calling API')

    args = parser.parse_args()

    # Validate skill directory
    skill_dir = Path(args.skill_dir)
    if not skill_dir.exists():
        print(f"❌ Error: Directory not found: {skill_dir}")
        sys.exit(1)

    if not skill_dir.is_dir():
        print(f"❌ Error: Not a directory: {skill_dir}")
        sys.exit(1)

    # Dry run mode
    if args.dry_run:
        print(f"🔍 DRY RUN MODE")
        print(f" Would enhance: {skill_dir}")
        print(f" References: {skill_dir / 'references'}")
        print(f" SKILL.md: {skill_dir / 'SKILL.md'}")

        refs_dir = skill_dir / "references"
        if refs_dir.exists():
            ref_files = list(refs_dir.glob("*.md"))
            print(f" Found {len(ref_files)} reference files:")
            for rf in ref_files:
                size = rf.stat().st_size
                print(f" - {rf.name} ({size:,} bytes)")

        print("\nTo actually run enhancement:")
        print(f" python3 enhance_skill.py {skill_dir}")
        return

    # Create enhancer and run
    try:
        enhancer = SkillEnhancer(skill_dir, api_key=args.api_key)
        success = enhancer.run()
        sys.exit(0 if success else 1)

    except ValueError as e:
        print(f"❌ Error: {e}")
        print("\nSet your API key:")
        print(" export ANTHROPIC_API_KEY=sk-ant-...")
        print("Or provide it directly:")
        print(f" python3 enhance_skill.py {skill_dir} --api-key sk-ant-...")
        sys.exit(1)
    except Exception as e:
        print(f"❌ Unexpected error: {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
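The reference-loading loop in `enhance_skill.py` truncates each file and stops once a running total passes a budget. A minimal standalone sketch of that budgeting follows; the helper name `load_capped` and the `max_chars` default are assumptions made for illustration and do not appear in the commit.

```python
from pathlib import Path


def load_capped(ref_dir, per_file=40000, max_chars=100000):
    """Read *.md files in name order, truncating each one and stopping
    once the running total exceeds max_chars.

    Hypothetical helper: the max_chars default here is illustrative only.
    """
    loaded = {}
    total = 0
    for path in sorted(Path(ref_dir).glob("*.md")):
        if path.name == "index.md":  # the index file is skipped
            continue
        text = path.read_text(encoding="utf-8")
        if len(text) > per_file:
            text = text[:per_file] + "\n\n[Content truncated...]"
        loaded[path.name] = text
        total += len(text)
        # The budget is checked after the file is added,
        # so the final file can push the total past max_chars.
        if total > max_chars:
            break
    return loaded
```

Because the check runs after each file is stored, the budget behaves as a soft limit: the result can exceed `max_chars` by up to one truncated file.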
244
enhance_skill_local.py
Normal file
@@ -0,0 +1,244 @@
#!/usr/bin/env python3
"""
SKILL.md Enhancement Script (Local - Using Claude Code)
Opens a new terminal with Claude Code to enhance SKILL.md, then reports back.
No API key needed - uses your existing Claude Code Max plan!

Usage:
    python3 enhance_skill_local.py output/steam-inventory/
    python3 enhance_skill_local.py output/react/
"""

import os
import sys
import time
import subprocess
import tempfile
from pathlib import Path


class LocalSkillEnhancer:
    def __init__(self, skill_dir):
        self.skill_dir = Path(skill_dir)
        self.references_dir = self.skill_dir / "references"
        self.skill_md_path = self.skill_dir / "SKILL.md"

    def create_enhancement_prompt(self):
        """Create the prompt file for Claude Code"""

        # Read reference files
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found")
            return None

        # Read current SKILL.md
        current_skill_md = ""
        if self.skill_md_path.exists():
            current_skill_md = self.skill_md_path.read_text(encoding='utf-8')

        # Build prompt
        prompt = f"""I need you to enhance the SKILL.md file for the {self.skill_dir.name} skill.

CURRENT SKILL.MD:
{'-'*60}
{current_skill_md if current_skill_md else '(No existing SKILL.md - create from scratch)'}
{'-'*60}

REFERENCE DOCUMENTATION:
{'-'*60}
"""

        for filename, content in references.items():
            prompt += f"\n## {filename}\n{content[:15000]}\n"

        prompt += f"""
{'-'*60}

YOUR TASK:
Create an EXCELLENT SKILL.md file that will help Claude use this documentation effectively.

Requirements:
1. **Clear "When to Use This Skill" section**
   - Be SPECIFIC about trigger conditions
   - List concrete use cases

2. **Excellent Quick Reference section**
   - Extract 5-10 of the BEST, most practical code examples from the reference docs
   - Choose SHORT, clear examples (5-20 lines max)
   - Include both simple and intermediate examples
   - Use proper language tags (cpp, python, javascript, json, etc.)
   - Add clear descriptions for each example

3. **Detailed Reference Files description**
   - Explain what's in each reference file
   - Help users navigate the documentation

4. **Practical "Working with This Skill" section**
   - Clear guidance for beginners, intermediate, and advanced users
   - Navigation tips

5. **Key Concepts section** (if applicable)
   - Explain core concepts
   - Define important terminology

IMPORTANT:
- Extract REAL examples from the reference docs above
- Prioritize SHORT, clear examples
- Make it actionable and practical
- Keep the frontmatter (---\\nname: ...\\n---) intact
- Use proper markdown formatting

SAVE THE RESULT:
Save the complete enhanced SKILL.md to: {self.skill_md_path.absolute()}

First, backup the original to: {self.skill_md_path.with_suffix('.md.backup').absolute()}
"""

        return prompt

    def read_reference_files(self, max_chars=50000):
        """Read reference files with size limit"""
        references = {}

        if not self.references_dir.exists():
            return references

        total_chars = 0
        for ref_file in sorted(self.references_dir.glob("*.md")):
            if ref_file.name == "index.md":
                continue

            content = ref_file.read_text(encoding='utf-8')

            # Limit size per file
            if len(content) > 20000:
                content = content[:20000] + "\n\n[Content truncated...]"

            references[ref_file.name] = content
            total_chars += len(content)

            if total_chars > max_chars:
                break

        return references

    def run(self):
        """Main enhancement workflow"""
        print(f"\n{'='*60}")
        print(f"LOCAL ENHANCEMENT: {self.skill_dir.name}")
        print(f"{'='*60}\n")

        # Validate
        if not self.skill_dir.exists():
            print(f"❌ Directory not found: {self.skill_dir}")
            return False

        # Read reference files
        print("📖 Reading reference documentation...")
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found to analyze")
            return False

        print(f" ✓ Read {len(references)} reference files")
        total_size = sum(len(c) for c in references.values())
        print(f" ✓ Total size: {total_size:,} characters\n")

        # Create prompt
        print("📝 Creating enhancement prompt...")
        prompt = self.create_enhancement_prompt()

        if not prompt:
            return False

        # Save prompt to temp file
        with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False, encoding='utf-8') as f:
            prompt_file = f.name
            f.write(prompt)

        print(f" ✓ Prompt saved ({len(prompt):,} characters)\n")

        # Launch Claude Code in new terminal
        print("🚀 Launching Claude Code in new terminal...")
        print(" This will:")
        print(" 1. Open a new terminal window")
        print(" 2. Run Claude Code with the enhancement task")
        print(" 3. Claude will read the docs and enhance SKILL.md")
        print(" 4. The terminal will wait for a key press when done")
        print()

        # Create a shell script to run in the terminal.
        # The prompt path is double-quoted so a temp directory containing
        # spaces does not split into several shell arguments.
        shell_script = f'''#!/bin/bash
claude "{prompt_file}"
echo ""
echo "✅ Enhancement complete!"
echo "Press any key to close..."
read -n 1
rm "{prompt_file}"
'''

        # Save shell script
        with tempfile.NamedTemporaryFile(mode='w', suffix='.sh', delete=False) as f:
            script_file = f.name
            f.write(shell_script)

        os.chmod(script_file, 0o755)

        # Launch in new terminal (macOS specific)
        if sys.platform == 'darwin':
            # macOS Terminal - simple approach
            try:
                subprocess.Popen(['open', '-a', 'Terminal', script_file])
            except Exception as e:
                print(f"⚠️ Error launching terminal: {e}")
                print(f"\nManually run: {script_file}")
                return False
        else:
            print("⚠️ Auto-launch only works on macOS")
            print(f"\nManually run this command in a new terminal:")
            print(f" claude '{prompt_file}'")
            print(f"\nThen delete the prompt file:")
            print(f" rm '{prompt_file}'")
            return False

        print("✅ New terminal launched with Claude Code!")
        print()
        print("📊 Status:")
        print(f" - Prompt file: {prompt_file}")
        print(f" - Skill directory: {self.skill_dir.absolute()}")
        print(f" - SKILL.md will be saved to: {self.skill_md_path.absolute()}")
        print(f" - Original backed up to: {self.skill_md_path.with_suffix('.md.backup').absolute()}")
        print()
        print("⏳ Wait for Claude Code to finish in the other terminal...")
        print(" (Usually takes 30-60 seconds)")
        print()
        print("💡 When done:")
        print(f" 1. Check the enhanced SKILL.md: {self.skill_md_path}")
        print(f" 2. If you don't like it, restore: mv {self.skill_md_path.with_suffix('.md.backup')} {self.skill_md_path}")
        print(f" 3. Package: python3 /mnt/skills/examples/skill-creator/scripts/package_skill.py {self.skill_dir}/")

        return True


def main():
    if len(sys.argv) < 2:
        print("Usage: python3 enhance_skill_local.py <skill_directory>")
        print()
        print("Examples:")
        print(" python3 enhance_skill_local.py output/steam-inventory/")
        print(" python3 enhance_skill_local.py output/react/")
        sys.exit(1)

    skill_dir = sys.argv[1]

    enhancer = LocalSkillEnhancer(skill_dir)
    success = enhancer.run()

    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()
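When a file path is interpolated into generated shell source, as in the launcher script in `enhance_skill_local.py`, quoting decides whether a path with spaces survives as a single argument. The sketch below shows one way to build such a script with the standard library's `shlex.quote`. The helper name `build_launcher_script` is hypothetical, and the `claude <file>` invocation simply mirrors the one used in the script rather than documenting the CLI.

```python
import shlex


def build_launcher_script(prompt_file):
    """Return bash source that runs `claude` on prompt_file, then deletes it.

    Hypothetical helper; the `claude <file>` call mirrors the commit's script.
    """
    # shlex.quote wraps the path in single quotes only when it needs them
    quoted = shlex.quote(str(prompt_file))
    return (
        "#!/bin/bash\n"
        f"claude {quoted}\n"
        'echo ""\n'
        'echo "Enhancement complete!"\n'
        f"rm {quoted}\n"
    )
```

A path made only of shell-safe characters is emitted unchanged, so the generated script stays readable in the common case.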
BIN
output/.DS_Store
vendored
Normal file
Binary file not shown.
78
package_skill.py
Normal file
@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""
Simple Skill Packager
Packages a skill directory into a .zip file for Claude.

Usage:
    python3 package_skill.py output/steam-inventory/
    python3 package_skill.py output/react/
"""

import os
import sys
import zipfile
from pathlib import Path


def package_skill(skill_dir):
    """Package a skill directory into a .zip file"""
    skill_path = Path(skill_dir)

    if not skill_path.exists():
        print(f"❌ Error: Directory not found: {skill_dir}")
        return False

    if not skill_path.is_dir():
        print(f"❌ Error: Not a directory: {skill_dir}")
        return False

    # Verify SKILL.md exists
    skill_md = skill_path / "SKILL.md"
    if not skill_md.exists():
        print(f"❌ Error: SKILL.md not found in {skill_dir}")
        return False

    # Create zip filename
    skill_name = skill_path.name
    zip_path = skill_path.parent / f"{skill_name}.zip"

    print(f"📦 Packaging skill: {skill_name}")
    print(f" Source: {skill_path}")
    print(f" Output: {zip_path}")

    # Create zip file
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(skill_path):
            # Skip backup files
            files = [f for f in files if not f.endswith('.backup')]

            for file in files:
                file_path = Path(root) / file
                arcname = file_path.relative_to(skill_path)
                zf.write(file_path, arcname)
                print(f" + {arcname}")

    # Get zip size
    zip_size = zip_path.stat().st_size
    print(f"\n✅ Package created: {zip_path}")
    print(f" Size: {zip_size:,} bytes ({zip_size / 1024:.1f} KB)")

    return True


def main():
    if len(sys.argv) < 2:
        print("Usage: python3 package_skill.py <skill_directory>")
        print()
        print("Examples:")
        print(" python3 package_skill.py output/steam-inventory/")
        print(" python3 package_skill.py output/react/")
        sys.exit(1)

    skill_dir = sys.argv[1]
    success = package_skill(skill_dir)
    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()
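The packager stores every file under a path relative to the skill directory and leaves out `.backup` files. A small sketch of the same idea that also returns the archive listing, so the result can be checked after writing; `zip_dir` is a hypothetical name, and this version assumes the zip file is written outside the directory being packaged.

```python
import zipfile
from pathlib import Path


def zip_dir(src_dir, zip_path, skip_suffix=".backup"):
    """Zip src_dir into zip_path using paths relative to src_dir,
    skipping files whose names end with skip_suffix (hypothetical helper).

    Assumes zip_path lies outside src_dir so the archive never includes itself.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file() and not path.name.endswith(skip_suffix):
                zf.write(path, path.relative_to(src).as_posix())
    # Re-open the archive to report what was actually stored
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()
```

Returning `namelist()` makes it easy to confirm that `SKILL.md` sits at the archive root and that no backup copy slipped in.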