yusyus
2025-10-17 15:14:44 +00:00
parent 397d47fe7c
commit 78b9cae398
19 changed files with 3061 additions and 3 deletions


@@ -1,6 +1,6 @@
 MIT License
-Copyright (c) 2025 yusyus
+Copyright (c) 2025 [Your Name/Username]
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

181
QUICKSTART.md Normal file

@@ -0,0 +1,181 @@
# Quick Start Guide
## 🚀 4 Steps to Create a Skill
### Step 1: Install Dependencies
```bash
pip3 install requests beautifulsoup4
```
### Step 2: Run the Tool
**Option A: Use a Preset (Easiest)**
```bash
python3 doc_scraper.py --config configs/godot.json
```
**Option B: Interactive Mode**
```bash
python3 doc_scraper.py --interactive
```
**Option C: Quick Command**
```bash
python3 doc_scraper.py --name react --url https://react.dev/
```
### Step 3: Enhance SKILL.md (Recommended)
```bash
# LOCAL enhancement (no API key, uses Claude Code Max)
python3 enhance_skill_local.py output/godot/
```
**Enhancement takes about 30-60 seconds and turns the basic template SKILL.md into a far more detailed guide.**
### Step 4: Package the Skill
```bash
python3 package_skill.py output/godot/
```
**Done!** You now have `godot.zip` ready to use.
---
## 📋 Available Presets
```bash
# Godot Engine
python3 doc_scraper.py --config configs/godot.json
# React
python3 doc_scraper.py --config configs/react.json
# Vue.js
python3 doc_scraper.py --config configs/vue.json
# Django
python3 doc_scraper.py --config configs/django.json
# FastAPI
python3 doc_scraper.py --config configs/fastapi.json
```
---
## ⚡ Using Existing Data (Fast!)
If you already scraped once:
```bash
python3 doc_scraper.py --config configs/godot.json
# When prompted:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
# Builds in seconds!
```
Or use `--skip-scrape`:
```bash
python3 doc_scraper.py --config configs/godot.json --skip-scrape
```
---
## 🎯 Complete Example (Recommended Workflow)
```bash
# 1. Install (once)
pip3 install requests beautifulsoup4
# 2. Scrape React docs with LOCAL enhancement
python3 doc_scraper.py --config configs/react.json --enhance-local
# Wait 15-30 minutes (scraping) + 60 seconds (enhancement)
# 3. Package
python3 package_skill.py output/react/
# 4. Use react.zip in Claude!
```
**Alternative: Enhancement after scraping**
```bash
# 2a. Scrape only (no enhancement)
python3 doc_scraper.py --config configs/react.json
# 2b. Enhance later
python3 enhance_skill_local.py output/react/
# 3. Package
python3 package_skill.py output/react/
```
---
## 💡 Pro Tips
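### Test with a Small Page Count First
Set `max_pages` in the config file to a small value, such as 20, for a trial run. Standard JSON does not allow comments, so keep the value bare:
```json
{
  "max_pages": 20
}
```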
### Rebuild Instantly
```bash
# After first scrape, you can rebuild instantly:
python3 doc_scraper.py --config configs/react.json --skip-scrape
```
### Create Custom Config
```bash
# Copy a preset
cp configs/react.json configs/myframework.json
# Edit it
nano configs/myframework.json
# Use it
python3 doc_scraper.py --config configs/myframework.json
```
---
## 📁 What You Get
```
output/
├── godot_data/         # Raw scraped data (reusable!)
└── godot/              # The skill
    ├── SKILL.md        # With real code examples!
    └── references/     # Organized docs
```
---
## ❓ Need Help?
See **README.md** for:
- Complete documentation
- Config file structure
- Troubleshooting
- Advanced usage
---
## 🎮 Let's Go!
```bash
# Godot
python3 doc_scraper.py --config configs/godot.json
# Or interactive
python3 doc_scraper.py --interactive
```
That's it! 🚀

445
README.md

@@ -1,2 +1,443 @@
-# Skill_Seekers
-Single powerful tool to convert ANY documentation website into a Claude skill
+# Documentation to Claude Skill Converter
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
**Single powerful tool to convert ANY documentation website into a Claude skill.**
## 🚀 Quick Start
### Easiest: Use a Preset
```bash
# Install dependencies (macOS)
pip3 install requests beautifulsoup4
# Use Godot preset
python3 doc_scraper.py --config configs/godot.json
# Use React preset
python3 doc_scraper.py --config configs/react.json
# See all presets
ls configs/
```
### Interactive Mode
```bash
python3 doc_scraper.py --interactive
```
### Quick Mode
```bash
python3 doc_scraper.py \
--name react \
--url https://react.dev/ \
--description "React framework for UIs"
```
## 📁 Simple Structure
```
doc-to-skill/
├── doc_scraper.py          # Main scraping tool
├── enhance_skill.py        # Optional: AI-powered SKILL.md enhancement
├── configs/                # Preset configurations
│   ├── godot.json          # Godot Engine
│   ├── react.json          # React
│   ├── vue.json            # Vue.js
│   ├── django.json         # Django
│   └── fastapi.json        # FastAPI
└── output/                 # All output (auto-created)
    ├── godot_data/         # Scraped data
    └── godot/              # Built skill
```
## ✨ Features
### 1. Auto-Detect Existing Data
```bash
python3 doc_scraper.py --config configs/godot.json
# If data exists:
✓ Found existing data: 245 pages
Use existing data? (y/n): y
⏭️ Skipping scrape, using existing data
```
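
A minimal sketch of how such a check could work, assuming the `output/{name}_data/pages/` layout described under "What Gets Created". The helper name `count_cached_pages` is hypothetical and is not part of `doc_scraper.py`:

```python
from pathlib import Path


def count_cached_pages(name, output_root="output"):
    """Return how many scraped page files exist for a skill (0 if none)."""
    pages_dir = Path(output_root) / f"{name}_data" / "pages"
    if not pages_dir.is_dir():
        return 0
    # Each scraped page is stored as one JSON file
    return sum(1 for _ in pages_dir.glob("*.json"))


if __name__ == "__main__":
    found = count_cached_pages("godot")
    if found:
        print(f"Found existing data: {found} pages")
    else:
        print("No cached data; a full scrape is needed")
```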
### 2. Knowledge Generation
**Automatic pattern extraction:**
- Extracts common code patterns from docs
- Detects programming language
- Creates quick reference with real examples
- Smarter categorization with scoring
**Enhanced SKILL.md:**
- Real code examples from documentation
- Language-annotated code blocks
- Common patterns section
- Quick reference from actual usage examples
### 3. Smart Categorization
Automatically infers categories from:
- URL structure
- Page titles
- Content keywords
- With scoring for better accuracy
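
The scoring can be sketched as follows. This standalone function mirrors the weights used by `smart_categorize` in `doc_scraper.py` (URL match +3, title match +2, match in the first 500 characters of content +1, threshold 2, first matching category wins). The function name `categorize` and the sample page are illustrative only:

```python
def categorize(page, category_keywords, threshold=2):
    """Return the first category whose keyword score reaches the threshold."""
    url = page["url"].lower()
    title = page["title"].lower()
    content = page.get("content", "").lower()[:500]  # only the opening text

    for category, keywords in category_keywords.items():
        score = 0
        for keyword in keywords:
            keyword = keyword.lower()
            if keyword in url:
                score += 3
            if keyword in title:
                score += 2
            if keyword in content:
                score += 1
        if score >= threshold:
            return category
    return "other"


page = {
    "url": "https://react.dev/reference/react/useState",
    "title": "useState",
    "content": "useState is a React Hook that lets you add a state variable.",
}
keywords = {
    "hooks": ["usestate", "useeffect", "hook"],
    "state": ["state", "context", "reducer"],
}
print(categorize(page, keywords))  # hooks
```

Because the first category that reaches the threshold wins, the order of entries in the config's `categories` section matters: put the most specific categories first.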
### 4. Code Language Detection
Automatically detects:
- Python (`def`, `import`, `from`)
- JavaScript (`const`, `let`, `=>`)
- GDScript (`func`, `var`, `extends`)
- C++ (`#include`, `int main`)
- Any language declared through a `language-*` or `lang-*` class on the code block
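
A simplified sketch of the detection order, modeled on `detect_language` in `doc_scraper.py`: an explicit CSS class wins, and keyword heuristics are only a fallback. The function name `guess_language` and its signature are assumptions made for this example:

```python
def guess_language(code, css_classes=()):
    """Guess a code block's language from CSS classes, then from keywords."""
    # 1. Explicit class such as "language-python" or "lang-js"
    for cls in css_classes:
        for prefix in ("language-", "lang-"):
            if cls.startswith(prefix):
                return cls[len(prefix):]
    # 2. Keyword heuristics, checked in a fixed order
    if "import " in code and "from " in code:
        return "python"
    if "const " in code or "let " in code or "=>" in code:
        return "javascript"
    if "func " in code and "var " in code:
        return "gdscript"
    if "def " in code and ":" in code:
        return "python"
    if "#include" in code or "int main" in code:
        return "cpp"
    return "unknown"


print(guess_language("x = 1", ["language-python"]))       # python
print(guess_language("const n = items.length;"))           # javascript
print(guess_language("func _ready():\n    var hp = 10"))  # gdscript
print(guess_language("SELECT 1;"))                         # unknown
```

The heuristics are deliberately crude. For example, an ES module line such as `import React from 'react'` contains both `import ` and `from `, so it is labeled `python` unless the block carries a language class.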
### 5. Skip Scraping
```bash
# Scrape once
python3 doc_scraper.py --config configs/react.json
# Later, just rebuild (instant)
python3 doc_scraper.py --config configs/react.json --skip-scrape
```
### 6. AI-Powered SKILL.md Enhancement (NEW!)
```bash
# Option 1: During scraping (API-based, requires API key)
pip3 install anthropic
export ANTHROPIC_API_KEY=sk-ant-...
python3 doc_scraper.py --config configs/react.json --enhance
# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
python3 doc_scraper.py --config configs/react.json --enhance-local
# Option 3: After scraping (API-based, standalone)
python3 enhance_skill.py output/react/
# Option 4: After scraping (LOCAL, no API key, standalone)
python3 enhance_skill_local.py output/react/
```
**What it does:**
- Reads your reference documentation
- Uses Claude to generate an excellent SKILL.md
- Extracts best code examples (5-10 practical examples)
- Creates comprehensive quick reference
- Adds domain-specific key concepts
- Provides navigation guidance for different skill levels
- Automatically backs up original
- **Quality:** Transforms 75-line templates into 500+ line comprehensive guides
**LOCAL Enhancement (Recommended):**
- Uses your Claude Code Max plan (no API costs)
- Opens new terminal with Claude Code
- Analyzes reference files automatically
- Takes 30-60 seconds
- Quality: 9/10 (comparable to API version)
## 🎯 Complete Workflows
### First Time (With Scraping + Enhancement)
```bash
# 1. Scrape + Build + AI Enhancement (LOCAL, no API key)
python3 doc_scraper.py --config configs/godot.json --enhance-local
# 2. Wait for new terminal to close (enhancement completes)
# Check the enhanced SKILL.md:
cat output/godot/SKILL.md
# 3. Package
python3 package_skill.py output/godot/
# 4. Done! You have godot.zip with excellent SKILL.md
```
**Time:** 20-40 minutes (scraping) + 60 seconds (enhancement) = ~21-41 minutes
### Using Existing Data (Fast!)
```bash
# 1. Use cached data + Local Enhancement
python3 doc_scraper.py --config configs/godot.json --skip-scrape
python3 enhance_skill_local.py output/godot/
# 2. Package
python3 package_skill.py output/godot/
# 3. Done!
```
**Time:** 1-3 minutes (build) + 60 seconds (enhancement) = ~2-4 minutes total
### Without Enhancement (Basic)
```bash
# 1. Scrape + Build (no enhancement)
python3 doc_scraper.py --config configs/godot.json
# 2. Package
python3 package_skill.py output/godot/
# 3. Done! (SKILL.md will be basic template)
```
**Time:** 20-40 minutes
**Note:** SKILL.md will be generic - enhancement strongly recommended!
## 📋 Available Presets
| Config | Framework | Description |
|--------|-----------|-------------|
| `godot.json` | Godot Engine | Game development |
| `react.json` | React | UI framework |
| `vue.json` | Vue.js | Progressive framework |
| `django.json` | Django | Python web framework |
| `fastapi.json` | FastAPI | Modern Python API |
### Using Presets
```bash
# Godot
python3 doc_scraper.py --config configs/godot.json
# React
python3 doc_scraper.py --config configs/react.json
# Vue
python3 doc_scraper.py --config configs/vue.json
# Django
python3 doc_scraper.py --config configs/django.json
# FastAPI
python3 doc_scraper.py --config configs/fastapi.json
```
## 🎨 Creating Your Own Config
### Option 1: Interactive
```bash
python3 doc_scraper.py --interactive
# Follow prompts, it will create the config for you
```
### Option 2: Copy and Edit
```bash
# Copy a preset
cp configs/react.json configs/myframework.json
# Edit it
nano configs/myframework.json
# Use it
python3 doc_scraper.py --config configs/myframework.json
```
### Config Structure
```json
{
"name": "myframework",
"description": "When to use this skill",
"base_url": "https://docs.myframework.com/",
"selectors": {
"main_content": "article",
"title": "h1",
"code_blocks": "pre code"
},
"url_patterns": {
"include": ["/docs", "/guide"],
"exclude": ["/blog", "/about"]
},
"categories": {
"getting_started": ["intro", "quickstart"],
"api": ["api", "reference"]
},
"rate_limit": 0.5,
"max_pages": 500
}
```
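
How `base_url` and `url_patterns` interact can be sketched like this. The logic follows `is_valid_url` in `doc_scraper.py`; the standalone function name `should_scrape` is made up for the example:

```python
def should_scrape(url, config):
    """Apply base_url, include, and exclude rules to a candidate URL."""
    if not url.startswith(config["base_url"]):
        return False
    patterns = config.get("url_patterns", {})
    includes = patterns.get("include", [])
    # An empty include list means "include everything under base_url"
    if includes and not any(p in url for p in includes):
        return False
    return not any(p in url for p in patterns.get("exclude", []))


config = {
    "base_url": "https://docs.myframework.com/",
    "url_patterns": {"include": ["/docs", "/guide"], "exclude": ["/blog", "/about"]},
}
print(should_scrape("https://docs.myframework.com/docs/intro", config))  # True
print(should_scrape("https://docs.myframework.com/blog/news", config))   # False
print(should_scrape("https://example.com/docs/intro", config))           # False
```

Patterns are plain substring checks against the full URL, not path matches. With the config above, `/docs` also matches the host name in `https://docs.myframework.com/...`, so the include list filters nothing; a more specific pattern such as `/docs/` avoids that.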
## 📊 What Gets Created
```
output/
├── godot_data/             # Scraped raw data
│   ├── pages/              # JSON files (one per page)
│   └── summary.json        # Overview
└── godot/                  # The skill
    ├── SKILL.md            # Enhanced with real examples
    ├── references/         # Categorized docs
    │   ├── index.md
    │   ├── getting_started.md
    │   ├── scripting.md
    │   └── ...
    ├── scripts/            # Empty (add your own)
    └── assets/             # Empty (add your own)
```
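
To inspect a finished scrape without opening every page file, `summary.json` can be read directly. This sketch assumes the keys written by `save_summary` in `doc_scraper.py` (`name`, `total_pages`, `pages` with `title` and `url`); the helper name `describe_scrape` is hypothetical:

```python
import json
from pathlib import Path


def describe_scrape(data_dir):
    """Summarize a scrape from its summary.json file."""
    summary_path = Path(data_dir) / "summary.json"
    summary = json.loads(summary_path.read_text(encoding="utf-8"))
    # Show at most the first five page titles as a quick sanity check
    titles = [p["title"] for p in summary["pages"][:5]]
    return f"{summary['name']}: {summary['total_pages']} pages, e.g. {titles}"
```

For example, `print(describe_scrape("output/godot_data"))` after a Godot scrape.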
## 🎯 Command Line Options
```bash
# Interactive mode
python3 doc_scraper.py --interactive
# Use config file
python3 doc_scraper.py --config configs/godot.json
# Quick mode
python3 doc_scraper.py --name react --url https://react.dev/
# Skip scraping (use existing data)
python3 doc_scraper.py --config configs/godot.json --skip-scrape
# With description
python3 doc_scraper.py \
--name react \
--url https://react.dev/ \
--description "React framework for building UIs"
```
## 💡 Tips
### 1. Test Small First
Set `max_pages` in the config to a small value, such as 20, before a full run. Standard JSON does not allow comments, so keep the value bare:
```json
{
  "max_pages": 20
}
```
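
If you prefer not to edit a preset in place, a trial copy can be generated instead. This is a sketch under the assumption that configs are plain JSON files like the presets in `configs/`; `write_trial_config` is a hypothetical helper, not part of the tool:

```python
import json
from pathlib import Path


def write_trial_config(src, dest, max_pages=20):
    """Copy a config file, overriding max_pages for a quick test run."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    config["max_pages"] = max_pages
    Path(dest).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

For example, call `write_trial_config("configs/react.json", "configs/react-test.json")` and then run `python3 doc_scraper.py --config configs/react-test.json`. The copy keeps the same `name`, so it writes to the same `output/react_data/` directory as the full preset; change `name` in the copy if you want to keep the two runs separate.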
### 2. Reuse Scraped Data
```bash
# Scrape once
python3 doc_scraper.py --config configs/react.json
# Rebuild multiple times (instant)
python3 doc_scraper.py --config configs/react.json --skip-scrape
python3 doc_scraper.py --config configs/react.json --skip-scrape
```
### 3. Finding Selectors
```python
# Test in Python
from bs4 import BeautifulSoup
import requests
url = "https://docs.example.com/page"
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
# Try different selectors
print(soup.select_one('article'))
print(soup.select_one('main'))
print(soup.select_one('div[role="main"]'))
```
### 4. Check Output Quality
```bash
# After building, check:
cat output/godot/SKILL.md # Should have real examples
cat output/godot/references/index.md # Categories
```
## 🐛 Troubleshooting
### No Content Extracted?
- Check your `main_content` selector
- Try: `article`, `main`, `div[role="main"]`
### Data Exists But Won't Use It?
```bash
# Force re-scrape
rm -rf output/myframework_data/
python3 doc_scraper.py --config configs/myframework.json
```
### Categories Not Good?
Edit the config `categories` section with better keywords.
### Want to Update Docs?
```bash
# Delete old data
rm -rf output/godot_data/
# Re-scrape
python3 doc_scraper.py --config configs/godot.json
```
## 📈 Performance
| Task | Time | Notes |
|------|------|-------|
| Scraping | 15-45 min | First time only |
| Building | 1-3 min | Fast! |
| Re-building | <1 min | With --skip-scrape |
| Packaging | 5-10 sec | Final zip |
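
As a rough sanity check on scraping time, the `rate_limit` pause alone gives a floor. The sketch below assumes one pause of `rate_limit` seconds per page, as `doc_scraper.py` does after each page it scrapes, and ignores network and parsing time, which usually dominate. The page counts and delays are taken from the preset configs:

```python
def min_scrape_seconds(max_pages, rate_limit):
    """Lower bound on scrape time: one pause per page, network time ignored."""
    return max_pages * rate_limit


for name, pages, delay in [("godot", 500, 0.5), ("react", 300, 0.5), ("django", 500, 0.3)]:
    print(f"{name}: at least {min_scrape_seconds(pages, delay) / 60:.1f} min")
# godot: at least 4.2 min
# react: at least 2.5 min
# django: at least 2.5 min
```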
## ✅ Summary
**One tool does everything:**
1. ✅ Scrapes documentation
2. ✅ Auto-detects existing data
3. ✅ Generates better knowledge
4. ✅ Creates enhanced skills
5. ✅ Works with presets or custom configs
6. ✅ Supports skip-scraping for fast iteration
**Simple structure:**
- `doc_scraper.py` - The tool
- `configs/` - Presets
- `output/` - Everything else
**Better output:**
- Real code examples with language detection
- Common patterns extracted from docs
- Smart categorization
- Enhanced SKILL.md with actual examples
## 📚 Documentation
- **[QUICKSTART.md](QUICKSTART.md)** - Get started in 4 steps
- **[docs/ENHANCEMENT.md](docs/ENHANCEMENT.md)** - AI enhancement guide
- **[docs/UPLOAD_GUIDE.md](docs/UPLOAD_GUIDE.md)** - How to upload skills to Claude
- **[docs/CLAUDE.md](docs/CLAUDE.md)** - Technical architecture
- **[STRUCTURE.md](STRUCTURE.md)** - Repository structure
## 🎮 Ready?
```bash
# Try Godot
python3 doc_scraper.py --config configs/godot.json
# Try React
python3 doc_scraper.py --config configs/react.json
# Or go interactive
python3 doc_scraper.py --interactive
```
## 📝 License
MIT License - see [LICENSE](LICENSE) file for details
---
Happy skill building! 🚀

55
STRUCTURE.md Normal file

@@ -0,0 +1,55 @@
# Repository Structure
```
doc-to-skill/
├── README.md                       # Main documentation (start here!)
├── QUICKSTART.md                   # 4-step quick start guide
├── LICENSE                         # MIT License
├── .gitignore                      # Git ignore rules
├── 🐍 Core Scripts
│   ├── doc_scraper.py              # Main scraping tool
│   ├── enhance_skill.py            # AI enhancement (API-based)
│   ├── enhance_skill_local.py      # AI enhancement (LOCAL, no API)
│   └── package_skill.py            # Skill packaging tool
├── 📁 configs/                     # Preset configurations
│   ├── godot.json
│   ├── react.json
│   ├── vue.json
│   ├── django.json
│   ├── fastapi.json
│   ├── steam-inventory.json
│   ├── steam-economy.json
│   └── steam-economy-complete.json
├── 📚 docs/                        # Detailed documentation
│   ├── CLAUDE.md                   # Technical architecture
│   ├── ENHANCEMENT.md              # AI enhancement guide
│   ├── UPLOAD_GUIDE.md             # How to upload skills
│   └── READY_TO_SHARE.md           # Sharing checklist
└── 📦 output/                      # Generated skills (git-ignored)
    ├── {name}_data/                # Scraped raw data (cached)
    └── {name}/                     # Built skills
        ├── SKILL.md                # Main skill file
        └── references/             # Reference documentation
```
## Key Files
### For Users:
- **README.md** - Start here for overview and installation
- **QUICKSTART.md** - Get started in 4 steps
- **configs/** - 8 ready-to-use presets
### For Developers:
- **doc_scraper.py** - Main tool (789 lines)
- **docs/CLAUDE.md** - Architecture and internals
- **docs/ENHANCEMENT.md** - How enhancement works
### For Contributors:
- **LICENSE** - MIT License
- **.gitignore** - What Git ignores
- **docs/READY_TO_SHARE.md** - Distribution guide

BIN
configs/.DS_Store vendored Normal file

Binary file not shown.

25
configs/django.json Normal file

@@ -0,0 +1,25 @@
{
"name": "django",
"description": "Django web framework for Python. Use for Django models, views, templates, ORM, authentication, and web development.",
"base_url": "https://docs.djangoproject.com/en/stable/",
"selectors": {
"main_content": "div.document",
"title": "h1",
"code_blocks": "pre"
},
"url_patterns": {
"include": ["/topics/", "/ref/", "/howto/"],
"exclude": ["/faq/", "/misc/"]
},
"categories": {
"getting_started": ["intro", "tutorial", "install"],
"models": ["models", "database", "orm", "queries"],
"views": ["views", "urlconf", "routing"],
"templates": ["templates", "template"],
"forms": ["forms", "form"],
"authentication": ["auth", "authentication", "user"],
"api": ["ref", "reference"]
},
"rate_limit": 0.3,
"max_pages": 500
}

24
configs/fastapi.json Normal file

@@ -0,0 +1,24 @@
{
"name": "fastapi",
"description": "FastAPI modern Python web framework. Use for building APIs, async endpoints, dependency injection, and Python backend development.",
"base_url": "https://fastapi.tiangolo.com/",
"selectors": {
"main_content": "article",
"title": "h1",
"code_blocks": "pre code"
},
"url_patterns": {
"include": ["/tutorial/", "/advanced/", "/reference/"],
"exclude": ["/help/", "/external-links/"]
},
"categories": {
"getting_started": ["first-steps", "tutorial", "intro"],
"path_operations": ["path", "operations", "routing"],
"request_data": ["request", "body", "query", "parameters"],
"dependencies": ["dependencies", "injection"],
"security": ["security", "oauth", "authentication"],
"database": ["database", "sql", "orm"]
},
"rate_limit": 0.5,
"max_pages": 250
}

34
configs/godot.json Normal file

@@ -0,0 +1,34 @@
{
"name": "godot",
"description": "Godot Engine game development. Use for Godot projects, GDScript/C# coding, scene setup, node systems, 2D/3D development, physics, animation, UI, shaders, or any Godot-specific questions.",
"base_url": "https://docs.godotengine.org/en/stable/",
"selectors": {
"main_content": "div[role='main']",
"title": "title",
"code_blocks": "pre"
},
"url_patterns": {
"include": [],
"exclude": [
"/genindex.html",
"/search.html",
"/_static/",
"/_sources/"
]
},
"categories": {
"getting_started": ["introduction", "getting_started", "first", "your_first"],
"scripting": ["scripting", "gdscript", "c#", "csharp"],
"2d": ["/2d/", "sprite", "canvas", "tilemap"],
"3d": ["/3d/", "spatial", "mesh", "3d_"],
"physics": ["physics", "collision", "rigidbody", "characterbody"],
"animation": ["animation", "tween", "animationplayer"],
"ui": ["ui", "control", "gui", "theme"],
"shaders": ["shader", "material", "visual_shader"],
"audio": ["audio", "sound"],
"networking": ["networking", "multiplayer", "rpc"],
"export": ["export", "platform", "deploy"]
},
"rate_limit": 0.5,
"max_pages": 500
}

23
configs/react.json Normal file

@@ -0,0 +1,23 @@
{
"name": "react",
"description": "React framework for building user interfaces. Use for React components, hooks, state management, JSX, and modern frontend development.",
"base_url": "https://react.dev/",
"selectors": {
"main_content": "article",
"title": "h1",
"code_blocks": "pre code"
},
"url_patterns": {
"include": ["/learn", "/reference"],
"exclude": ["/community", "/blog"]
},
"categories": {
"getting_started": ["quick-start", "installation", "tutorial"],
"hooks": ["usestate", "useeffect", "usememo", "usecallback", "usecontext", "useref", "hook"],
"components": ["component", "props", "jsx"],
"state": ["state", "context", "reducer"],
"api": ["api", "reference"]
},
"rate_limit": 0.5,
"max_pages": 300
}


@@ -0,0 +1,108 @@
{
"name": "steam-economy-complete",
"description": "Complete Steam Economy system including inventory, microtransactions, trading, and monetization. Use for ISteamInventory API, ISteamEconomy API, IInventoryService Web API, Steam Wallet integration, in-app purchases, item definitions, trading, crafting, market integration, and all economy features for game developers.",
"base_url": "https://partner.steamgames.com/doc/",
"start_urls": [
"https://partner.steamgames.com/doc/features/inventory",
"https://partner.steamgames.com/doc/features/microtransactions",
"https://partner.steamgames.com/doc/features/microtransactions/implementation",
"https://partner.steamgames.com/doc/api/ISteamInventory",
"https://partner.steamgames.com/doc/webapi/ISteamEconomy",
"https://partner.steamgames.com/doc/webapi/IInventoryService",
"https://partner.steamgames.com/doc/features/inventory/economy"
],
"selectors": {
"main_content": "div.documentation_bbcode",
"title": "div.docPageTitle",
"code_blocks": "div.bb_code"
},
"url_patterns": {
"include": [
"/features/inventory",
"/features/microtransactions",
"/api/ISteamInventory",
"/webapi/ISteamEconomy",
"/webapi/IInventoryService"
],
"exclude": [
"/home",
"/sales",
"/marketing",
"/legal",
"/finance",
"/login",
"/search",
"/steamworks/apps",
"/steamworks/partner"
]
},
"categories": {
"getting_started": [
"overview",
"getting started",
"introduction",
"quickstart",
"setup"
],
"inventory_system": [
"inventory",
"item definition",
"item schema",
"item properties",
"itemdefs",
"ISteamInventory"
],
"microtransactions": [
"microtransaction",
"purchase",
"payment",
"checkout",
"wallet",
"transaction"
],
"economy_api": [
"ISteamEconomy",
"economy",
"asset",
"context"
],
"inventory_webapi": [
"IInventoryService",
"webapi",
"web api",
"http"
],
"trading": [
"trading",
"trade",
"exchange",
"market"
],
"crafting": [
"crafting",
"recipe",
"combine",
"exchange"
],
"pricing": [
"pricing",
"price",
"cost",
"currency"
],
"implementation": [
"integration",
"implementation",
"configure",
"best practices"
],
"examples": [
"example",
"sample",
"tutorial",
"walkthrough"
]
},
"rate_limit": 0.7,
"max_pages": 1000
}

23
configs/vue.json Normal file

@@ -0,0 +1,23 @@
{
"name": "vue",
"description": "Vue.js progressive JavaScript framework. Use for Vue components, reactivity, composition API, and frontend development.",
"base_url": "https://vuejs.org/guide/",
"selectors": {
"main_content": "main",
"title": "h1",
"code_blocks": "pre code"
},
"url_patterns": {
"include": ["/guide/", "/api/", "/examples/"],
"exclude": ["/about/", "/sponsor/"]
},
"categories": {
"getting_started": ["quick-start", "introduction", "essentials"],
"components": ["component", "props", "events"],
"reactivity": ["reactivity", "reactive", "ref", "computed"],
"composition_api": ["composition", "setup"],
"api": ["api", "reference"]
},
"rate_limit": 0.5,
"max_pages": 200
}

789
doc_scraper.py Normal file

@@ -0,0 +1,789 @@
#!/usr/bin/env python3
"""
Documentation to Claude Skill Converter
Single tool to scrape any documentation and create high-quality Claude skills.
Usage:
python3 doc_scraper.py --interactive
python3 doc_scraper.py --config configs/godot.json
python3 doc_scraper.py --url https://react.dev/ --name react
"""
import os
import sys
import json
import time
import re
import argparse
import hashlib
import requests
from pathlib import Path
from urllib.parse import urljoin, urlparse
from bs4 import BeautifulSoup
from collections import deque, defaultdict
class DocToSkillConverter:
    def __init__(self, config):
        self.config = config
        self.name = config['name']
        self.base_url = config['base_url']
        # Paths
        self.data_dir = f"output/{self.name}_data"
        self.skill_dir = f"output/{self.name}"
        # State
        self.visited_urls = set()
        # Support multiple starting URLs
        start_urls = config.get('start_urls', [self.base_url])
        self.pending_urls = deque(start_urls)
        self.pages = []
        # Create directories
        os.makedirs(f"{self.data_dir}/pages", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/references", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/scripts", exist_ok=True)
        os.makedirs(f"{self.skill_dir}/assets", exist_ok=True)
    def is_valid_url(self, url):
        """Check if URL should be scraped"""
        if not url.startswith(self.base_url):
            return False
        # Include patterns
        includes = self.config.get('url_patterns', {}).get('include', [])
        if includes and not any(pattern in url for pattern in includes):
            return False
        # Exclude patterns
        excludes = self.config.get('url_patterns', {}).get('exclude', [])
        if any(pattern in url for pattern in excludes):
            return False
        return True
    def extract_content(self, soup, url):
        """Extract content with improved code and pattern detection"""
        page = {
            'url': url,
            'title': '',
            'content': '',
            'headings': [],
            'code_samples': [],
            'patterns': [],  # NEW: Extract common patterns
            'links': []
        }
        selectors = self.config.get('selectors', {})
        # Extract title
        title_elem = soup.select_one(selectors.get('title', 'title'))
        if title_elem:
            page['title'] = self.clean_text(title_elem.get_text())
        # Find main content
        main_selector = selectors.get('main_content', 'div[role="main"]')
        main = soup.select_one(main_selector)
        if not main:
            print(f"⚠ No content: {url}")
            return page
        # Extract headings with better structure
        for h in main.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6']):
            text = self.clean_text(h.get_text())
            if text:
                page['headings'].append({
                    'level': h.name,
                    'text': text,
                    'id': h.get('id', '')
                })
        # Extract code with language detection
        code_selector = selectors.get('code_blocks', 'pre code')
        for code_elem in main.select(code_selector):
            code = code_elem.get_text()
            if len(code.strip()) > 10:
                # Try to detect language
                lang = self.detect_language(code_elem, code)
                page['code_samples'].append({
                    'code': code.strip(),
                    'language': lang
                })
        # Extract patterns (NEW: common code patterns)
        page['patterns'] = self.extract_patterns(main, page['code_samples'])
        # Extract paragraphs
        paragraphs = []
        for p in main.find_all('p'):
            text = self.clean_text(p.get_text())
            if text and len(text) > 20:  # Skip very short paragraphs
                paragraphs.append(text)
        page['content'] = '\n\n'.join(paragraphs)
        # Extract links
        for link in main.find_all('a', href=True):
            href = urljoin(url, link['href'])
            if self.is_valid_url(href):
                page['links'].append(href)
        return page
    def detect_language(self, elem, code):
        """Detect programming language from code block"""
        # Check class attribute
        classes = elem.get('class', [])
        for cls in classes:
            if 'language-' in cls:
                return cls.replace('language-', '')
            if 'lang-' in cls:
                return cls.replace('lang-', '')
        # Check parent pre element
        parent = elem.parent
        if parent and parent.name == 'pre':
            classes = parent.get('class', [])
            for cls in classes:
                if 'language-' in cls:
                    return cls.replace('language-', '')
        # Heuristic detection
        if 'import ' in code and 'from ' in code:
            return 'python'
        if 'const ' in code or 'let ' in code or '=>' in code:
            return 'javascript'
        if 'func ' in code and 'var ' in code:
            return 'gdscript'
        if 'def ' in code and ':' in code:
            return 'python'
        if '#include' in code or 'int main' in code:
            return 'cpp'
        return 'unknown'
    def extract_patterns(self, main, code_samples):
        """Extract common coding patterns (NEW FEATURE)"""
        patterns = []
        # Look for "Example:" or "Pattern:" sections
        for elem in main.find_all(['p', 'div']):
            text = elem.get_text().lower()
            if any(word in text for word in ['example:', 'pattern:', 'usage:', 'typical use']):
                # Get the code that follows
                next_code = elem.find_next(['pre', 'code'])
                if next_code:
                    patterns.append({
                        'description': self.clean_text(elem.get_text()),
                        'code': next_code.get_text().strip()
                    })
        return patterns[:5]  # Keep only the first 5 patterns found
    def clean_text(self, text):
        """Clean text content"""
        text = re.sub(r'\s+', ' ', text)
        return text.strip()

    def save_page(self, page):
        """Save page data"""
        url_hash = hashlib.md5(page['url'].encode()).hexdigest()[:10]
        safe_title = re.sub(r'[^\w\s-]', '', page['title'])[:50]
        safe_title = re.sub(r'[-\s]+', '_', safe_title)
        filename = f"{safe_title}_{url_hash}.json"
        filepath = os.path.join(self.data_dir, "pages", filename)
        with open(filepath, 'w', encoding='utf-8') as f:
            json.dump(page, f, indent=2, ensure_ascii=False)
    def scrape_page(self, url):
        """Scrape a single page"""
        try:
            print(f" {url}")
            headers = {'User-Agent': 'Mozilla/5.0 (Documentation Scraper)'}
            response = requests.get(url, headers=headers, timeout=30)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')
            page = self.extract_content(soup, url)
            self.save_page(page)
            self.pages.append(page)
            # Add new URLs
            for link in page['links']:
                if link not in self.visited_urls and link not in self.pending_urls:
                    self.pending_urls.append(link)
        except Exception as e:
            print(f" ✗ Error: {e}")
        finally:
            # Rate limiting (also applied after failures, so a struggling server is not hammered)
            time.sleep(self.config.get('rate_limit', 0.5))
    def scrape_all(self):
        """Scrape all pages"""
        print(f"\n{'='*60}")
        print(f"SCRAPING: {self.name}")
        print(f"{'='*60}")
        print(f"Base URL: {self.base_url}")
        print(f"Output: {self.data_dir}\n")
        max_pages = self.config.get('max_pages', 500)
        while self.pending_urls and len(self.visited_urls) < max_pages:
            url = self.pending_urls.popleft()
            if url in self.visited_urls:
                continue
            self.visited_urls.add(url)
            self.scrape_page(url)
            if len(self.visited_urls) % 10 == 0:
                print(f" [{len(self.visited_urls)} pages]")
        print(f"\n✅ Scraped {len(self.visited_urls)} pages")
        self.save_summary()
    def save_summary(self):
        """Save scraping summary"""
        summary = {
            'name': self.name,
            'total_pages': len(self.pages),
            'base_url': self.base_url,
            'pages': [{'title': p['title'], 'url': p['url']} for p in self.pages]
        }
        with open(f"{self.data_dir}/summary.json", 'w', encoding='utf-8') as f:
            json.dump(summary, f, indent=2, ensure_ascii=False)

    def load_scraped_data(self):
        """Load previously scraped data"""
        pages = []
        pages_dir = Path(self.data_dir) / "pages"
        if not pages_dir.exists():
            return []
        for json_file in pages_dir.glob("*.json"):
            try:
                with open(json_file, 'r', encoding='utf-8') as f:
                    pages.append(json.load(f))
            except Exception as e:
                print(f"⚠ Error loading {json_file}: {e}")
        return pages
    def smart_categorize(self, pages):
        """Improved categorization with better pattern matching"""
        category_defs = self.config.get('categories', {})
        # Default smart categories if none provided
        if not category_defs:
            category_defs = self.infer_categories(pages)
        categories = {cat: [] for cat in category_defs.keys()}
        categories['other'] = []
        for page in pages:
            url = page['url'].lower()
            title = page['title'].lower()
            content = page.get('content', '').lower()[:500]  # Check first 500 chars
            categorized = False
            # Match against keywords
            for cat, keywords in category_defs.items():
                score = 0
                for keyword in keywords:
                    keyword = keyword.lower()
                    if keyword in url:
                        score += 3
                    if keyword in title:
                        score += 2
                    if keyword in content:
                        score += 1
                if score >= 2:  # Threshold for categorization
                    categories[cat].append(page)
                    categorized = True
                    break
            if not categorized:
                categories['other'].append(page)
        # Remove empty categories
        categories = {k: v for k, v in categories.items() if v}
        return categories
    def infer_categories(self, pages):
        """Infer categories from URL patterns (IMPROVED)"""
        url_segments = defaultdict(int)
        for page in pages:
            path = urlparse(page['url']).path
            segments = [s for s in path.split('/') if s and s not in ['en', 'stable', 'latest', 'docs']]
            for seg in segments:
                url_segments[seg] += 1
        # Top segments become categories
        top_segments = sorted(url_segments.items(), key=lambda x: x[1], reverse=True)[:8]
        categories = {}
        for seg, count in top_segments:
            if count >= 3:  # At least 3 pages
                categories[seg] = [seg]
        # Add common defaults
        if 'tutorial' not in categories and any('tutorial' in url for url in [p['url'] for p in pages]):
            categories['tutorials'] = ['tutorial', 'guide', 'getting-started']
        if 'api' not in categories and any('api' in url or 'reference' in url for url in [p['url'] for p in pages]):
            categories['api'] = ['api', 'reference', 'class']
        return categories
    def generate_quick_reference(self, pages):
        """Generate quick reference from common patterns (NEW FEATURE)"""
        quick_ref = []
        # Collect all patterns
        all_patterns = []
        for page in pages:
            all_patterns.extend(page.get('patterns', []))
        # Keep the first 15 unique patterns whose code is under 300 characters
        seen_codes = set()
        for pattern in all_patterns:
            code = pattern['code']
            if code not in seen_codes and len(code) < 300:
                quick_ref.append(pattern)
                seen_codes.add(code)
            if len(quick_ref) >= 15:
                break
        return quick_ref
def create_reference_file(self, category, pages):
"""Create enhanced reference file"""
if not pages:
return
lines = []
lines.append(f"# {self.name.title()} - {category.replace('_', ' ').title()}\n")
lines.append(f"**Pages:** {len(pages)}\n")
lines.append("---\n")
for page in pages:
lines.append(f"## {page['title']}\n")
lines.append(f"**URL:** {page['url']}\n")
# Table of contents from headings
if page.get('headings'):
lines.append("**Contents:**")
for h in page['headings'][:10]:
level = int(h['level'][1]) if len(h['level']) > 1 else 1
indent = " " * max(0, level - 2)
lines.append(f"{indent}- {h['text']}")
lines.append("")
# Content
if page.get('content'):
content = page['content'][:2500]
if len(page['content']) > 2500:
content += "\n\n*[Content truncated]*"
lines.append(content)
lines.append("")
# Code examples with language
if page.get('code_samples'):
lines.append("**Examples:**\n")
for i, sample in enumerate(page['code_samples'][:4], 1):
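# Samples are normally dicts; tolerate legacy bare-string samples without raising AttributeError
lang = sample.get('language', 'unknown') if isinstance(sample, dict) else 'unknown'
code = sample.get('code', '') if isinstance(sample, dict) else str(sample)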
lines.append(f"Example {i} ({lang}):")
lines.append(f"```{lang}")
lines.append(code[:600])
if len(code) > 600:
lines.append("...")
lines.append("```\n")
lines.append("---\n")
filepath = os.path.join(self.skill_dir, "references", f"{category}.md")
with open(filepath, 'w', encoding='utf-8') as f:
f.write('\n'.join(lines))
print(f"{category}.md ({len(pages)} pages)")
def create_enhanced_skill_md(self, categories, quick_ref):
"""Create SKILL.md with actual examples (IMPROVED)"""
description = self.config.get('description', f'Comprehensive assistance with {self.name}')
# Extract actual code examples from docs
example_codes = []
for pages in categories.values():
for page in pages[:3]: # First 3 pages per category
for sample in page.get('code_samples', [])[:2]: # First 2 samples per page
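# Samples are normally dicts; tolerate legacy bare-string samples without raising AttributeError
code = sample.get('code', '') if isinstance(sample, dict) else str(sample)
lang = sample.get('language', 'unknown') if isinstance(sample, dict) else 'unknown'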
if len(code) < 200 and lang != 'unknown':
example_codes.append((lang, code))
if len(example_codes) >= 10:
break
if len(example_codes) >= 10:
break
if len(example_codes) >= 10:
break
content = f"""---
name: {self.name}
description: {description}
---
# {self.name.title()} Skill
Comprehensive assistance with {self.name} development, generated from official documentation.
## When to Use This Skill
This skill should be triggered when:
- Working with {self.name}
- Asking about {self.name} features or APIs
- Implementing {self.name} solutions
- Debugging {self.name} code
- Learning {self.name} best practices
## Quick Reference
### Common Patterns
"""
# Add actual quick reference patterns
if quick_ref:
for i, pattern in enumerate(quick_ref[:8], 1):
content += f"**Pattern {i}:** {pattern.get('description', 'Example pattern')}\n\n"
content += "```\n"
content += pattern.get('code', '')[:300]
content += "\n```\n\n"
else:
content += "*Quick reference patterns will be added as you use the skill.*\n\n"
# Add example codes from docs
if example_codes:
content += "### Example Code Patterns\n\n"
for i, (lang, code) in enumerate(example_codes[:5], 1):
content += f"**Example {i}** ({lang}):\n```{lang}\n{code}\n```\n\n"
content += f"""## Reference Files
This skill includes comprehensive documentation in `references/`:
"""
for cat in sorted(categories.keys()):
content += f"- **{cat}.md** - {cat.replace('_', ' ').title()} documentation\n"
content += """
Use `view` to read specific reference files when detailed information is needed.
## Working with This Skill
### For Beginners
Start with the getting_started or tutorials reference files for foundational concepts.
### For Specific Features
Use the appropriate category reference file (api, guides, etc.) for detailed information.
### For Code Examples
The quick reference section above contains common patterns extracted from the official docs.
## Resources
### references/
Organized documentation extracted from official sources. These files contain:
- Detailed explanations
- Code examples with language annotations
- Links to original documentation
- Table of contents for quick navigation
### scripts/
Add helper scripts here for common automation tasks.
### assets/
Add templates, boilerplate, or example projects here.
## Notes
- This skill was automatically generated from official documentation
- Reference files preserve the structure and examples from source docs
- Code examples include language detection for better syntax highlighting
- Quick reference patterns are extracted from common usage examples in the docs
## Updating
To refresh this skill with updated documentation:
1. Re-run the scraper with the same configuration
2. The skill will be rebuilt with the latest information
"""
filepath = os.path.join(self.skill_dir, "SKILL.md")
with open(filepath, 'w', encoding='utf-8') as f:
f.write(content)
print(f" ✓ SKILL.md (enhanced with {len(example_codes)} examples)")
def create_index(self, categories):
"""Create navigation index"""
lines = []
lines.append(f"# {self.name.title()} Documentation Index\n")
lines.append("## Categories\n")
for cat, pages in sorted(categories.items()):
lines.append(f"### {cat.replace('_', ' ').title()}")
lines.append(f"**File:** `{cat}.md`")
lines.append(f"**Pages:** {len(pages)}\n")
filepath = os.path.join(self.skill_dir, "references", "index.md")
with open(filepath, 'w', encoding='utf-8') as f:
f.write('\n'.join(lines))
print(" ✓ index.md")
def build_skill(self):
"""Build the skill from scraped data"""
print(f"\n{'='*60}")
print(f"BUILDING SKILL: {self.name}")
print(f"{'='*60}\n")
# Load data
print("Loading scraped data...")
pages = self.load_scraped_data()
if not pages:
print("✗ No scraped data found!")
return False
print(f" ✓ Loaded {len(pages)} pages\n")
# Categorize
print("Categorizing pages...")
categories = self.smart_categorize(pages)
print(f" ✓ Created {len(categories)} categories\n")
# Generate quick reference
print("Generating quick reference...")
quick_ref = self.generate_quick_reference(pages)
print(f" ✓ Extracted {len(quick_ref)} patterns\n")
# Create reference files
print("Creating reference files...")
for cat, cat_pages in categories.items():
self.create_reference_file(cat, cat_pages)
# Create index
self.create_index(categories)
print()
# Create enhanced SKILL.md
print("Creating SKILL.md...")
self.create_enhanced_skill_md(categories, quick_ref)
print(f"\n✅ Skill built: {self.skill_dir}/")
return True
def load_config(config_path):
"""Load configuration from file"""
with open(config_path, 'r', encoding='utf-8') as f:
return json.load(f)
def interactive_config():
"""Interactive configuration"""
print("\n" + "="*60)
print("Documentation to Skill Converter")
print("="*60 + "\n")
config = {}
# Basic info
config['name'] = input("Skill name (e.g., 'react', 'godot'): ").strip()
config['description'] = input("Skill description: ").strip()
config['base_url'] = input("Base URL (e.g., https://docs.example.com/): ").strip()
if not config['base_url'].endswith('/'):
config['base_url'] += '/'
# Selectors
print("\nCSS Selectors (press Enter for defaults):")
selectors = {}
selectors['main_content'] = input(" Main content [div[role='main']]: ").strip() or "div[role='main']"
selectors['title'] = input(" Title [title]: ").strip() or "title"
selectors['code_blocks'] = input(" Code blocks [pre code]: ").strip() or "pre code"
config['selectors'] = selectors
# URL patterns
print("\nURL Patterns (comma-separated, optional):")
include = input(" Include: ").strip()
exclude = input(" Exclude: ").strip()
config['url_patterns'] = {
'include': [p.strip() for p in include.split(',') if p.strip()],
'exclude': [p.strip() for p in exclude.split(',') if p.strip()]
}
# Settings
rate = input("\nRate limit (seconds) [0.5]: ").strip()
config['rate_limit'] = float(rate) if rate else 0.5
max_p = input("Max pages [500]: ").strip()
config['max_pages'] = int(max_p) if max_p else 500
return config
def check_existing_data(name):
"""Check if scraped data already exists"""
data_dir = f"output/{name}_data"
if os.path.exists(data_dir) and os.path.exists(f"{data_dir}/summary.json"):
with open(f"{data_dir}/summary.json", 'r', encoding='utf-8') as f:
summary = json.load(f)
return True, summary.get('total_pages', 0)
return False, 0
def main():
parser = argparse.ArgumentParser(
description='Convert documentation websites to Claude skills',
formatter_class=argparse.RawDescriptionHelpFormatter
)
parser.add_argument('--interactive', '-i', action='store_true',
help='Interactive configuration mode')
parser.add_argument('--config', '-c', type=str,
help='Load configuration from file (e.g., configs/godot.json)')
parser.add_argument('--name', type=str,
help='Skill name')
parser.add_argument('--url', type=str,
help='Base documentation URL')
parser.add_argument('--description', '-d', type=str,
help='Skill description')
parser.add_argument('--skip-scrape', action='store_true',
help='Skip scraping, use existing data')
parser.add_argument('--enhance', action='store_true',
help='Enhance SKILL.md using Claude API after building (requires API key)')
parser.add_argument('--enhance-local', action='store_true',
help='Enhance SKILL.md using Claude Code in new terminal (no API key needed)')
parser.add_argument('--api-key', type=str,
help='Anthropic API key for --enhance (or set ANTHROPIC_API_KEY)')
args = parser.parse_args()
# Get configuration
if args.config:
config = load_config(args.config)
elif args.interactive or not (args.name and args.url):
config = interactive_config()
else:
config = {
'name': args.name,
'description': args.description or f'Comprehensive assistance with {args.name}',
'base_url': args.url,
'selectors': {
'main_content': "div[role='main']",
'title': 'title',
'code_blocks': 'pre code'
},
'url_patterns': {'include': [], 'exclude': []},
'rate_limit': 0.5,
'max_pages': 500
}
# Check for existing data
exists, page_count = check_existing_data(config['name'])
if exists and not args.skip_scrape:
print(f"\n✓ Found existing data: {page_count} pages")
response = input("Use existing data? (y/n): ").strip().lower()
if response == 'y':
args.skip_scrape = True
# Create converter
converter = DocToSkillConverter(config)
# Scrape or skip
if not args.skip_scrape:
try:
converter.scrape_all()
except KeyboardInterrupt:
print("\n\nScraping interrupted.")
response = input("Continue with skill building? (y/n): ").strip().lower()
if response != 'y':
return
else:
print(f"\n⏭️ Skipping scrape, using existing data")
# Build skill
success = converter.build_skill()
if not success:
sys.exit(1)
# Optional enhancement with Claude API
if args.enhance:
print(f"\n{'='*60}")
print(f"ENHANCING SKILL.MD WITH CLAUDE API")
print(f"{'='*60}\n")
try:
import subprocess
enhance_cmd = ['python3', 'enhance_skill.py', f'output/{config["name"]}/']
if args.api_key:
enhance_cmd.extend(['--api-key', args.api_key])
result = subprocess.run(enhance_cmd, check=True)
if result.returncode == 0:
print("\n✅ Enhancement complete!")
except subprocess.CalledProcessError:
print("\n⚠ Enhancement failed, but skill was still built")
except FileNotFoundError:
print("\n⚠ enhance_skill.py not found. Run manually:")
print(f" python3 enhance_skill.py output/{config['name']}/")
# Optional enhancement with Claude Code (local, no API key)
if args.enhance_local:
print(f"\n{'='*60}")
print(f"ENHANCING SKILL.MD WITH CLAUDE CODE (LOCAL)")
print(f"{'='*60}\n")
try:
import subprocess
enhance_cmd = ['python3', 'enhance_skill_local.py', f'output/{config["name"]}/']
subprocess.run(enhance_cmd, check=True)
except subprocess.CalledProcessError:
print("\n⚠ Enhancement failed, but skill was still built")
except FileNotFoundError:
print("\n⚠ enhance_skill_local.py not found. Run manually:")
print(f" python3 enhance_skill_local.py output/{config['name']}/")
print(f"\n📦 Package your skill:")
print(f" python3 package_skill.py output/{config['name']}/")
if not args.enhance and not args.enhance_local:
print(f"\n💡 Optional: Enhance SKILL.md with Claude:")
print(f" API-based: python3 enhance_skill.py output/{config['name']}/")
print(f" or re-run with: --enhance")
print(f" Local (no API key): python3 enhance_skill_local.py output/{config['name']}/")
print(f" or re-run with: --enhance-local")
if __name__ == "__main__":
main()

239
docs/CLAUDE.md Normal file

@@ -0,0 +1,239 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview
This is a Python-based documentation scraper that converts ANY documentation website into a Claude skill. It's a single-file tool (`doc_scraper.py`) that scrapes documentation, extracts code patterns, detects programming languages, and generates structured skill files ready for use with Claude.
## Dependencies
```bash
pip3 install requests beautifulsoup4
```
## Core Commands
### Run with a preset configuration
```bash
python3 doc_scraper.py --config configs/godot.json
python3 doc_scraper.py --config configs/react.json
python3 doc_scraper.py --config configs/vue.json
python3 doc_scraper.py --config configs/django.json
python3 doc_scraper.py --config configs/fastapi.json
```
### Interactive mode (for new frameworks)
```bash
python3 doc_scraper.py --interactive
```
### Quick mode (minimal config)
```bash
python3 doc_scraper.py --name react --url https://react.dev/ --description "React framework"
```
### Skip scraping (use cached data)
```bash
python3 doc_scraper.py --config configs/godot.json --skip-scrape
```
### AI-powered SKILL.md enhancement
```bash
# Option 1: During scraping (API-based, requires ANTHROPIC_API_KEY)
pip3 install anthropic
export ANTHROPIC_API_KEY=sk-ant-...
python3 doc_scraper.py --config configs/react.json --enhance
# Option 2: During scraping (LOCAL, no API key - uses Claude Code Max)
python3 doc_scraper.py --config configs/react.json --enhance-local
# Option 3: Standalone after scraping (API-based)
python3 enhance_skill.py output/react/
# Option 4: Standalone after scraping (LOCAL, no API key)
python3 enhance_skill_local.py output/react/
```
The LOCAL enhancement option (`--enhance-local` or `enhance_skill_local.py`) opens a new terminal with Claude Code, which analyzes the reference files and enhances SKILL.md automatically. This requires a Claude Code Max plan but no API key.
### Test with limited pages (edit config first)
Set `"max_pages": 20` in the config file to test with fewer pages.
## Architecture
### Single-File Design
The entire tool is contained in `doc_scraper.py` (~737 lines). It follows a class-based architecture with a single `DocToSkillConverter` class that handles:
- **Web scraping**: BFS traversal with URL validation
- **Content extraction**: CSS selectors for title, content, code blocks
- **Language detection**: Heuristic-based detection from code samples (Python, JavaScript, GDScript, C++, etc.)
- **Pattern extraction**: Identifies common coding patterns from documentation
- **Categorization**: Smart categorization using URL structure, page titles, and content keywords with scoring
- **Skill generation**: Creates SKILL.md with real code examples and categorized reference files
### Data Flow
1. **Scrape Phase**:
- Input: Config JSON (name, base_url, selectors, url_patterns, categories, rate_limit, max_pages)
- Process: BFS traversal starting from base_url, respecting include/exclude patterns
- Output: `output/{name}_data/pages/*.json` + `summary.json`
2. **Build Phase**:
- Input: Scraped JSON data from `output/{name}_data/`
- Process: Load pages → Smart categorize → Extract patterns → Generate references
- Output: `output/{name}/SKILL.md` + `output/{name}/references/*.md`
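The scrape phase can be pictured as a plain breadth-first search. The sketch below is illustrative only and is not the code in `doc_scraper.py`: `crawl_bfs`, `url_allowed`, and the in-memory `LINKS` graph are hypothetical stand-ins so the example runs without network access.

```python
from collections import deque

# Hypothetical link graph standing in for real pages (assumption: no network).
LINKS = {
    "https://docs.example.com/": [
        "https://docs.example.com/guide/intro",
        "https://docs.example.com/blog/news",
        "https://other.example.org/",
    ],
    "https://docs.example.com/guide/intro": [
        "https://docs.example.com/guide/setup",
        "https://docs.example.com/",
    ],
    "https://docs.example.com/guide/setup": [],
    "https://docs.example.com/blog/news": [],
}


def url_allowed(url, base_url, include, exclude):
    """Keep URLs under base_url that match an include pattern (if any) and no exclude pattern."""
    if not url.startswith(base_url):
        return False
    if include and not any(p in url for p in include):
        return False
    return not any(p in url for p in exclude)


def crawl_bfs(base_url, include=(), exclude=(), max_pages=500):
    """Visit pages breadth-first, stopping once max_pages have been visited."""
    visited = []
    seen = {base_url}
    queue = deque([base_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in LINKS.get(url, []):
            if link not in seen and url_allowed(link, base_url, include, exclude):
                seen.add(link)
                queue.append(link)
    return visited


order = crawl_bfs("https://docs.example.com/", exclude=["/blog/"])
print(order)
```

A real crawl would fetch each page over HTTP and pause for `rate_limit` seconds between requests; both are left out here to keep the sketch self-contained.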
### Directory Structure
```
doc-to-skill/
├── doc_scraper.py # Main scraping & building tool
├── enhance_skill.py # AI enhancement (API-based)
├── enhance_skill_local.py # AI enhancement (LOCAL, no API)
├── configs/ # Preset configurations
│ ├── godot.json
│ ├── react.json
│ ├── steam-inventory.json
│ └── ...
└── output/
├── {name}_data/ # Raw scraped data (cached)
│ ├── pages/ # Individual page JSONs
│ └── summary.json # Scraping summary
└── {name}/ # Generated skill
├── SKILL.md # Main skill file with examples
├── SKILL.md.backup # Backup (if enhanced)
├── references/ # Categorized documentation
│ ├── index.md
│ ├── getting_started.md
│ ├── api.md
│ └── ...
├── scripts/ # Empty (for user scripts)
└── assets/ # Empty (for user assets)
```
### Configuration Format
Config files in `configs/*.json` contain:
- `name`: Skill identifier (e.g., "godot", "react")
- `description`: When to use this skill
- `base_url`: Starting URL for scraping
- `selectors`: CSS selectors for content extraction
- `main_content`: Main documentation content (e.g., "article", "div[role='main']")
- `title`: Page title selector
- `code_blocks`: Code sample selector (e.g., "pre code", "pre")
- `url_patterns`: URL filtering
- `include`: Only scrape URLs containing these patterns
- `exclude`: Skip URLs containing these patterns
- `categories`: Keyword-based categorization mapping
- `rate_limit`: Delay between requests (seconds)
- `max_pages`: Maximum pages to scrape
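As a rough illustration of the fields listed above, the following sketch builds a config as a Python dict and serializes it to JSON. All values (the `myframework` name, URL, patterns, and category keywords) are made-up placeholders, and `validate_config` is a hypothetical helper, not part of the tool.

```python
import json

# Placeholder values for a fictional documentation site (assumptions throughout).
config = {
    "name": "myframework",
    "description": "Use when working with MyFramework APIs",
    "base_url": "https://docs.example.com/",
    "selectors": {
        "main_content": "article",
        "title": "title",
        "code_blocks": "pre code",
    },
    "url_patterns": {"include": ["/guide/", "/api/"], "exclude": ["/blog/"]},
    "categories": {
        "getting_started": ["intro", "install", "quickstart"],
        "api": ["api", "reference"],
    },
    "rate_limit": 0.5,
    "max_pages": 20,  # small value for a quick test run
}

REQUIRED_KEYS = ("name", "base_url", "selectors")


def validate_config(cfg):
    """Hypothetical sanity check: return a list of problems, empty if none."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in cfg]
    if "base_url" in cfg and not cfg["base_url"].startswith(("http://", "https://")):
        problems.append("base_url should start with http:// or https://")
    return problems


print(validate_config(config))
print(json.dumps(config, indent=2))
```

Saving the printed JSON as `configs/myframework.json` would give a file in the shape the `--config` flag expects, assuming the placeholder values are replaced with real ones.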
### Key Features
**Auto-detect existing data**: The tool checks for `output/{name}_data/summary.json` and prompts to reuse the cached data, avoiding re-scraping.
**Language detection**: Detects code languages from:
1. CSS class attributes (`language-*`, `lang-*`)
2. Heuristics (keywords like `def`, `const`, `func`, etc.)
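The two-step order above could look roughly like this. It is a simplified sketch, not the actual `detect_language()` implementation: the function name `guess_language`, the keyword table, and the assumption that CSS classes arrive as a list of strings are all illustrative.

```python
# Illustrative keyword hints; the real tool's heuristics may differ.
KEYWORD_HINTS = [
    ("python", ("def ", "import ", "self.")),
    ("javascript", ("const ", "let ", "=> ")),
    ("gdscript", ("func ", "extends ")),
]


def guess_language(css_classes, code):
    """Prefer an explicit language-*/lang-* class, then fall back to keyword hints."""
    for cls in css_classes:
        for prefix in ("language-", "lang-"):
            if cls.startswith(prefix):
                return cls[len(prefix):]
    for language, hints in KEYWORD_HINTS:
        if any(hint in code for hint in hints):
            return language
    return "unknown"


print(guess_language(["highlight", "language-cpp"], "int main() {}"))
print(guess_language([], "def greet(name):\n    return name"))
print(guess_language([], "func _ready():\n    pass"))
print(guess_language([], "SELECT 1"))
```

Checking the class attribute first means an explicit annotation from the documentation site always wins over a keyword guess.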
**Pattern extraction**: Looks for "Example:", "Pattern:", "Usage:" markers in content and extracts following code blocks (up to 5 per page).
**Smart categorization**:
- Scores pages against category keywords (3 points for URL match, 2 for title, 1 for content)
- Threshold of 2+ for categorization
- Auto-infers categories from URL segments if none provided
- Falls back to "other" category
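A minimal sketch of the scoring rule described above, assuming each page is a dict with `url`, `title`, and `content` keys. The helper names `score_page` and `categorize` are hypothetical, and picking the highest-scoring category is an assumption; the real `smart_categorize()` may resolve ties or multiple matches differently.

```python
def score_page(page, keywords):
    """3 points per keyword found in the URL, 2 in the title, 1 in the content."""
    url = page["url"].lower()
    title = page["title"].lower()
    content = page["content"].lower()
    score = 0
    for kw in keywords:
        kw = kw.lower()
        if kw in url:
            score += 3
        if kw in title:
            score += 2
        if kw in content:
            score += 1
    return score


def categorize(page, categories, threshold=2):
    """Return the best-scoring category, or 'other' if nothing reaches the threshold."""
    best_name, best_score = "other", 0
    for name, keywords in categories.items():
        score = score_page(page, keywords)
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name


categories = {"api": ["api", "reference"], "tutorials": ["tutorial", "guide"]}
api_page = {
    "url": "https://docs.example.com/api/node",
    "title": "Node",
    "content": "The Node class reference.",
}
misc_page = {
    "url": "https://docs.example.com/misc",
    "title": "Misc",
    "content": "See the api for details.",
}
print(categorize(api_page, categories))
print(categorize(misc_page, categories))
```

The second page mentions a keyword only in its content, which is worth a single point, so it stays below the threshold and lands in "other".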
**Enhanced SKILL.md**: Generated with:
- Real code examples from documentation (language-annotated)
- Quick reference patterns extracted from docs
- Common pattern section
- Category file listings
**AI-Powered Enhancement**: Two scripts to dramatically improve SKILL.md quality:
- `enhance_skill.py`: Uses Anthropic API (~$0.15-$0.30 per skill, requires API key)
- `enhance_skill_local.py`: Uses Claude Code Max (free, no API key needed)
- Transforms generic 75-line templates into comprehensive 500+ line guides
- Extracts best examples, explains key concepts, adds navigation guidance
- Quality rating: 9/10 in the steam-economy test case
## Key Code Locations
- **URL validation**: `is_valid_url()` doc_scraper.py:47-62
- **Content extraction**: `extract_content()` doc_scraper.py:64-131
- **Language detection**: `detect_language()` doc_scraper.py:133-163
- **Pattern extraction**: `extract_patterns()` doc_scraper.py:165-181
- **Smart categorization**: `smart_categorize()` doc_scraper.py:280-321
- **Category inference**: `infer_categories()` doc_scraper.py:323-349
- **Quick reference generation**: `generate_quick_reference()` doc_scraper.py:351-370
- **SKILL.md generation**: `create_enhanced_skill_md()` doc_scraper.py:424-540
- **Scraping loop**: `scrape_all()` doc_scraper.py:226-249
- **Main workflow**: `main()` doc_scraper.py:661-733
## Workflow Examples
### First-time run (scrape and build)
```bash
# 1. Scrape + Build
python3 doc_scraper.py --config configs/godot.json
# Time: 20-40 minutes
# 2. Package (assuming skill-creator is available)
python3 package_skill.py output/godot/
# Result: godot.zip
```
### Using cached data (fast iteration)
```bash
# 1. Use existing data
python3 doc_scraper.py --config configs/godot.json --skip-scrape
# Time: 1-3 minutes
# 2. Package
python3 package_skill.py output/godot/
```
### Creating a new framework config
```bash
# Option 1: Interactive
python3 doc_scraper.py --interactive
# Option 2: Copy and modify
cp configs/react.json configs/myframework.json
# Edit configs/myframework.json
python3 doc_scraper.py --config configs/myframework.json
```
## Testing Selectors
To find the right CSS selectors for a documentation site:
```python
from bs4 import BeautifulSoup
import requests
url = "https://docs.example.com/page"
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
# Try different selectors
print(soup.select_one('article'))
print(soup.select_one('main'))
print(soup.select_one('div[role="main"]'))
```
## Troubleshooting
**No content extracted**: Check `main_content` selector. Common values: `article`, `main`, `div[role="main"]`, `div.content`
**Poor categorization**: Edit `categories` section in config with better keywords specific to the documentation structure
**Force re-scrape**: Delete cached data with `rm -rf output/{name}_data/`
**Rate limiting issues**: Increase `rate_limit` value in config (e.g., from 0.5 to 1.0 seconds)
## Output Quality Checks
After building, verify quality:
```bash
cat output/godot/SKILL.md # Should have real code examples
cat output/godot/references/index.md # Should show categories
ls output/godot/references/ # Should have category .md files
```
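The same checks can be scripted. This is a small sketch under stated assumptions: `check_skill` is a hypothetical helper (not shipped with the tool), and it treats the presence of any fenced block in SKILL.md as a proxy for "has real code examples".

```python
import tempfile
from pathlib import Path

FENCE = "`" * 3  # built indirectly so this example nests cleanly inside markdown


def check_skill(skill_dir):
    """Return a dict of simple checks mirroring the manual inspection above."""
    skill_dir = Path(skill_dir)
    skill_md = skill_dir / "SKILL.md"
    refs = skill_dir / "references"
    text = skill_md.read_text(encoding="utf-8") if skill_md.is_file() else ""
    category_files = sorted(p.name for p in refs.glob("*.md") if p.name != "index.md")
    return {
        "skill_md_exists": skill_md.is_file(),
        "has_code_examples": FENCE in text,
        "index_exists": (refs / "index.md").is_file(),
        "category_files": category_files,
    }


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "demo"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text(
        f"# Demo\n{FENCE}python\nprint('hi')\n{FENCE}\n", encoding="utf-8"
    )
    (skill / "references" / "index.md").write_text("# Index\n", encoding="utf-8")
    (skill / "references" / "api.md").write_text("# API\n", encoding="utf-8")
    report = check_skill(skill)

print(report)
```

Pointing `check_skill` at a real directory such as `output/godot/` would report the same four facts the shell commands let you inspect by eye.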

250
docs/ENHANCEMENT.md Normal file

@@ -0,0 +1,250 @@
# AI-Powered SKILL.md Enhancement
Two scripts are available to dramatically improve your SKILL.md file:
1. **`enhance_skill_local.py`** - Uses Claude Code Max (no API key, **recommended**)
2. **`enhance_skill.py`** - Uses Anthropic API (~$0.15-$0.30 per skill)
Both analyze reference documentation and extract the best examples and guidance.
## Why Use Enhancement?
**Problem:** The auto-generated SKILL.md is often too generic:
- Empty Quick Reference section
- No practical code examples
- Generic "When to Use" triggers
- Doesn't highlight key features
**Solution:** Let Claude read your reference docs and create a much better SKILL.md with:
- ✅ Best code examples extracted from documentation
- ✅ Practical quick reference with real patterns
- ✅ Domain-specific guidance
- ✅ Clear navigation tips
- ✅ Key concepts explained
## Quick Start (LOCAL - No API Key)
**Recommended for Claude Code Max users:**
```bash
# Option 1: Standalone enhancement
python3 enhance_skill_local.py output/steam-inventory/
# Option 2: Integrated with scraper
python3 doc_scraper.py --config configs/steam-inventory.json --enhance-local
```
**What happens:**
1. Opens new terminal window
2. Runs Claude Code with enhancement prompt
3. Claude analyzes reference files (~15-20K chars)
4. Generates enhanced SKILL.md (30-60 seconds)
5. Terminal auto-closes when done
**Requirements:**
- Claude Code Max plan (you're already using it!)
- macOS for automatic terminal launch; on other operating systems, run the enhancement manually in a terminal
## API-Based Enhancement (Alternative)
**If you prefer API-based approach:**
### Installation
```bash
pip3 install anthropic
```
### Setup API Key
```bash
# Option 1: Environment variable (recommended)
export ANTHROPIC_API_KEY=sk-ant-...
# Option 2: Pass directly with --api-key
python3 enhance_skill.py output/react/ --api-key sk-ant-...
```
### Usage
```bash
# Standalone enhancement
python3 enhance_skill.py output/steam-inventory/
# Integrated with scraper
python3 doc_scraper.py --config configs/steam-inventory.json --enhance
# Dry run (see what would be done)
python3 enhance_skill.py output/react/ --dry-run
```
## What It Does
1. **Reads reference files** (api_reference.md, webapi.md, etc.)
2. **Sends to Claude** with instructions to:
- Extract 5-10 best code examples
- Create practical quick reference
- Write domain-specific "When to Use" triggers
- Add helpful navigation guidance
3. **Backs up original** SKILL.md to SKILL.md.backup
4. **Saves enhanced version** as new SKILL.md
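Steps 3 and 4 amount to a copy followed by an overwrite. The sketch below shows that pattern in isolation; `save_enhanced` is a hypothetical name, and whether the real script replaces an existing `SKILL.md.backup` on repeated runs is an assumption made here for simplicity.

```python
import shutil
import tempfile
from pathlib import Path


def save_enhanced(skill_dir, enhanced_text):
    """Back up SKILL.md to SKILL.md.backup, then write the enhanced text."""
    skill_md = Path(skill_dir) / "SKILL.md"
    backup = skill_md.with_name("SKILL.md.backup")
    if skill_md.exists():
        shutil.copy2(skill_md, backup)  # assumption: an older backup is overwritten
    skill_md.write_text(enhanced_text, encoding="utf-8")
    return backup


# Demo in a temporary directory so no real skill is touched.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp)
    (skill / "SKILL.md").write_text("original", encoding="utf-8")
    backup_path = save_enhanced(skill, "enhanced")
    backup_text = backup_path.read_text(encoding="utf-8")
    current_text = (skill / "SKILL.md").read_text(encoding="utf-8")

print(backup_text, "->", current_text)
```

Copying before writing means the original survives even if the write of the enhanced version fails partway.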
## Example Enhancement
### Before (Auto-Generated)
```markdown
## Quick Reference
### Common Patterns
*Quick reference patterns will be added as you use the skill.*
```
### After (AI-Enhanced)
````markdown
## Quick Reference
### Common API Patterns
**Granting promotional items:**
```cpp
void CInventory::GrantPromoItems()
{
    SteamItemDef_t newItems[2];
    newItems[0] = 110;
    newItems[1] = 111;
    SteamInventory()->AddPromoItems( &s_GenerateRequestResult, newItems, 2 );
}
```
**Getting all items in player inventory:**
```cpp
SteamInventoryResult_t resultHandle;
bool success = SteamInventory()->GetAllItems( &resultHandle );
```
[... 8 more practical examples ...]
````
## Cost Estimate
- **Input**: ~50,000-100,000 tokens (reference docs)
- **Output**: ~4,000 tokens (enhanced SKILL.md)
- **Model**: claude-sonnet-4-20250514
- **Estimated cost**: $0.15-$0.30 per skill
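The arithmetic behind the estimate can be sketched as follows. The per-token prices are assumptions (USD 3 per million input tokens and USD 15 per million output tokens) and should be checked against current pricing; `estimate_cost` is a hypothetical helper. Under these assumed prices the input side alone falls in the quoted $0.15-$0.30 range, with output adding a few cents.

```python
# Assumed list prices in USD per million tokens; verify before relying on them.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(input_tokens, output_tokens):
    """Return (input_cost, output_cost) in USD for one enhancement call."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    return input_cost, output_cost


for tokens_in in (50_000, 100_000):
    cost_in, cost_out = estimate_cost(tokens_in, 4_000)
    print(f"{tokens_in:>7} input tokens: ${cost_in:.2f} input + ${cost_out:.2f} output")
```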
## Troubleshooting
### "No API key provided"
```bash
export ANTHROPIC_API_KEY=sk-ant-...
# or
python3 enhance_skill.py output/react/ --api-key sk-ant-...
```
### "No reference files found"
Make sure you've run the scraper first:
```bash
python3 doc_scraper.py --config configs/react.json
```
### "anthropic package not installed"
```bash
pip3 install anthropic
```
### Don't like the result?
```bash
# Restore original
mv output/steam-inventory/SKILL.md.backup output/steam-inventory/SKILL.md
# Try again (it may generate different content)
python3 enhance_skill.py output/steam-inventory/
```
## Tips
1. **Run after scraping completes** - Enhancement works best with complete reference docs
2. **Review the output** - AI is good but not perfect, check the generated SKILL.md
3. **Keep the backup** - Original is saved as SKILL.md.backup
4. **Re-run if needed** - Each run may produce slightly different results
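5. **No re-scraping needed** - Reference files are local, so enhancement can be re-run without contacting the documentation site (the Claude call itself still needs network access)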
## Real-World Results
**Test Case: steam-economy skill**
- **Before:** 75 lines, generic template, empty Quick Reference
- **After:** 570 lines, 10 practical API examples, key concepts explained
- **Time:** 60 seconds
- **Quality Rating:** 9/10
The LOCAL enhancement successfully:
- Extracted best HTTP/JSON examples from 24 pages of documentation
- Explained domain concepts (Asset Classes, Context IDs, Transaction Lifecycle)
- Created navigation guidance for beginners through advanced users
- Added best practices for security, economy design, and API integration
## Limitations
**LOCAL Enhancement (`enhance_skill_local.py`):**
- Requires Claude Code Max plan
- macOS auto-launch only (manual on other OS)
- Opens new terminal window
- Takes ~60 seconds
**API Enhancement (`enhance_skill.py`):**
- Requires Anthropic API key (paid)
- Cost: ~$0.15-$0.30 per skill
- Limited to ~100K characters of reference input (the `max_chars` default in `read_reference_files()`)
**Both:**
- May occasionally miss the best examples
- Can't understand context beyond the reference docs
- Doesn't modify reference files (only SKILL.md)
## Enhancement Options Comparison
| Aspect | Manual Edit | LOCAL Enhancement | API Enhancement |
|--------|-------------|-------------------|-----------------|
| Time | 15-30 minutes | 30-60 seconds | 30-60 seconds |
| Code examples | You pick | AI picks best | AI picks best |
| Quick reference | Write yourself | Auto-generated | Auto-generated |
| Domain guidance | Your knowledge | From docs | From docs |
| Consistency | Varies | Consistent | Consistent |
| Cost | Free (your time) | Free (Max plan) | ~$0.15-$0.30 per skill |
| Setup | None | None | API key needed |
| Quality | High (if expert) | 9/10 | 9/10 |
| **Recommended?** | For experts only | ✅ **Yes** | If no Max plan |
## When to Use
**Use enhancement when:**
- You want high-quality SKILL.md quickly
- Working with large documentation (50+ pages)
- Creating skills for unfamiliar frameworks
- Need practical code examples extracted
- Want consistent quality across multiple skills
**Skip enhancement when:**
- Budget constrained (use manual editing)
- Very small documentation (<10 pages)
- You know the framework intimately
- Documentation has no code examples
## Advanced: Customization
To customize how Claude enhances the SKILL.md, edit `enhance_skill.py` and modify the `_build_enhancement_prompt()` method around line 130.
Example customization:
```python
prompt += """
ADDITIONAL REQUIREMENTS:
- Focus on security best practices
- Include performance tips
- Add troubleshooting section
"""
```
## See Also
- [README.md](../README.md) - Main documentation
- [CLAUDE.md](CLAUDE.md) - Architecture guide
- [doc_scraper.py](../doc_scraper.py) - Main scraping tool

252
docs/UPLOAD_GUIDE.md Normal file

@@ -0,0 +1,252 @@
# How to Upload Skills to Claude
## Quick Answer
**You upload the `.zip` file created by `package_skill.py`**
```bash
# Create the zip file
python3 package_skill.py output/steam-economy/
# This creates: output/steam-economy.zip
# Upload this file to Claude!
```
## What's Inside the Zip?
The `.zip` file contains:
```
steam-economy.zip
├── SKILL.md ← Main skill file (Claude reads this first)
└── references/ ← Reference documentation
├── index.md ← Category index
├── api_reference.md ← API docs
├── pricing.md ← Pricing docs
├── trading.md ← Trading docs
└── ... ← Other categorized docs
```
**Note:** The zip only includes what Claude needs. It excludes:
- `.backup` files
- Build artifacts
- Temporary files
## What Does package_skill.py Do?
The package script:
1. **Finds your skill directory** (e.g., `output/steam-economy/`)
2. **Validates SKILL.md exists** (required!)
3. **Creates a .zip file** with the same name
4. **Includes all files** except backups
5. **Saves to** `output/` directory
**Example:**
```bash
python3 package_skill.py output/steam-economy/
📦 Packaging skill: steam-economy
Source: output/steam-economy
Output: output/steam-economy.zip
+ SKILL.md
+ references/api_reference.md
+ references/pricing.md
+ references/trading.md
+ ...
✅ Package created: output/steam-economy.zip
Size: 14,290 bytes (14.0 KB)
```
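For readers curious how such a packaging step might work, here is a minimal sketch using the standard `zipfile` module. It is not the actual `package_skill.py`: the function below is a hypothetical reimplementation that assumes files are stored relative to the skill directory (so `SKILL.md` sits at the zip root) and that only `*.backup` files are skipped.

```python
import tempfile
import zipfile
from pathlib import Path


def package_skill(skill_dir):
    """Zip a skill directory next to itself, skipping *.backup files."""
    skill_dir = Path(skill_dir)
    if not (skill_dir / "SKILL.md").is_file():
        raise FileNotFoundError(f"SKILL.md not found in {skill_dir}")
    zip_path = skill_dir.parent / f"{skill_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(skill_dir.rglob("*")):
            if path.is_file() and path.suffix != ".backup":
                # Store paths relative to the skill directory (assumption).
                zf.write(path, path.relative_to(skill_dir).as_posix())
    return zip_path


# Demo in a temporary directory with a tiny fake skill.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "demo-skill"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text("# Demo\n", encoding="utf-8")
    (skill / "SKILL.md.backup").write_text("# Old\n", encoding="utf-8")
    (skill / "references" / "api.md").write_text("# API\n", encoding="utf-8")
    zip_file = package_skill(skill)
    zip_name = zip_file.name
    with zipfile.ZipFile(zip_file) as zf:
        names = sorted(zf.namelist())

print(zip_name, names)
```

Writing the archive into the parent directory keeps it out of the tree being zipped, so the zip never tries to include itself.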
## Complete Workflow
### Step 1: Scrape & Build
```bash
python3 doc_scraper.py --config configs/steam-economy.json
```
**Output:**
- `output/steam-economy_data/` (raw scraped data)
- `output/steam-economy/` (skill directory)
### Step 2: Enhance (Recommended)
```bash
python3 enhance_skill_local.py output/steam-economy/
```
**What it does:**
- Analyzes reference files
- Creates comprehensive SKILL.md
- Backs up original to SKILL.md.backup
**Output:**
- `output/steam-economy/SKILL.md` (enhanced)
- `output/steam-economy/SKILL.md.backup` (original)
### Step 3: Package
```bash
python3 package_skill.py output/steam-economy/
```
**Output:**
- `output/steam-economy.zip` ← **THIS IS WHAT YOU UPLOAD**
### Step 4: Upload to Claude
1. Go to Claude (claude.ai)
2. Click "Add Skill" or skill upload button
3. Select `output/steam-economy.zip`
4. Done!
## What Files Are Required?
**Minimum required structure:**
```
your-skill/
└── SKILL.md ← Required! Claude reads this first
```
**Recommended structure:**
```
your-skill/
├── SKILL.md ← Main skill file (required)
└── references/ ← Reference docs (highly recommended)
├── index.md
└── *.md ← Category files
```
**Optional (can add manually):**
```
your-skill/
├── SKILL.md
├── references/
├── scripts/ ← Helper scripts
│ └── *.py
└── assets/ ← Templates, examples
└── *.txt
```
## File Size Limits
The package script shows size after packaging:
```
✅ Package created: output/steam-economy.zip
Size: 14,290 bytes (14.0 KB)
```
**Typical sizes:**
- Small skill: 5-20 KB
- Medium skill: 20-100 KB
- Large skill: 100-500 KB
Claude has generous size limits, so most documentation-based skills fit easily.
## Quick Reference
### Package a Skill
```bash
python3 package_skill.py output/steam-economy/
```
### Package Multiple Skills
```bash
# Package all skills in output/
for dir in output/*/; do
if [ -f "$dir/SKILL.md" ]; then
python3 package_skill.py "$dir"
fi
done
```
### Check What's in a Zip
```bash
unzip -l output/steam-economy.zip
```
### Test a Packaged Skill Locally
```bash
# Extract to temp directory
mkdir temp-test
unzip output/steam-economy.zip -d temp-test/
cat temp-test/SKILL.md
```
## Troubleshooting
### "SKILL.md not found"
```bash
# Make sure you scraped and built first
python3 doc_scraper.py --config configs/steam-economy.json
# Then package
python3 package_skill.py output/steam-economy/
```
### "Directory not found"
```bash
# Check what skills are available
ls output/
# Use correct path
python3 package_skill.py output/YOUR-SKILL-NAME/
```
### Zip is Too Large
Most skills are small, but if yours is large:
```bash
# Check size
ls -lh output/steam-economy.zip
# If needed, check what's taking space
unzip -l output/steam-economy.zip | sort -k1 -rn | head -20
```
Reference files are usually small. Large sizes often mean:
- Many images (skills typically don't need images)
- Large code examples (these are fine, just be aware)
## What Does Claude Do With the Zip?
When you upload a skill zip:
1. **Claude extracts it**
2. **Reads SKILL.md first** - This tells Claude:
- When to activate this skill
- What the skill does
- Quick reference examples
- How to navigate the references
3. **Indexes reference files** - Claude can search through:
- `references/*.md` files
- Find specific APIs, examples, concepts
4. **Activates automatically** - When you ask about topics matching the skill
## Example: Using the Packaged Skill
After uploading `steam-economy.zip`:

**You ask:** "How do I implement microtransactions in my Steam game?"

**Claude:**
- Recognizes that this matches the `steam-economy` skill
- Reads SKILL.md for quick reference
- Searches `references/microtransactions.md`
- Provides a detailed answer with code examples
## Summary
**What you need to do:**
1. ✅ Scrape: `python3 doc_scraper.py --config configs/YOUR-CONFIG.json`
2. ✅ Enhance: `python3 enhance_skill_local.py output/YOUR-SKILL/`
3. ✅ Package: `python3 package_skill.py output/YOUR-SKILL/`
4. ✅ Upload: Upload the `.zip` file to Claude

**What you upload:**
- The `.zip` file from the `output/` directory
- Example: `output/steam-economy.zip`

**What's in the zip:**
- `SKILL.md` (required)
- `references/*.md` (recommended)
- Any scripts/assets you added (optional)

That's it! 🚀

292
enhance_skill.py Normal file

@@ -0,0 +1,292 @@
#!/usr/bin/env python3
"""
SKILL.md Enhancement Script
Uses Claude API to improve SKILL.md by analyzing reference documentation.

Usage:
    python3 enhance_skill.py output/steam-inventory/
    python3 enhance_skill.py output/react/
    python3 enhance_skill.py output/godot/ --api-key YOUR_API_KEY
"""

import os
import sys
import json
import argparse
from pathlib import Path

try:
    import anthropic
except ImportError:
    print("❌ Error: anthropic package not installed")
    print("Install with: pip3 install anthropic")
    sys.exit(1)


class SkillEnhancer:
    def __init__(self, skill_dir, api_key=None):
        self.skill_dir = Path(skill_dir)
        self.references_dir = self.skill_dir / "references"
        self.skill_md_path = self.skill_dir / "SKILL.md"

        # Get API key
        self.api_key = api_key or os.environ.get('ANTHROPIC_API_KEY')
        if not self.api_key:
            raise ValueError(
                "No API key provided. Set ANTHROPIC_API_KEY environment variable "
                "or use --api-key argument"
            )

        self.client = anthropic.Anthropic(api_key=self.api_key)

    def read_reference_files(self, max_chars=100000):
        """Read reference files with size limit"""
        references = {}

        if not self.references_dir.exists():
            print(f"⚠ No references directory found at {self.references_dir}")
            return references

        total_chars = 0
        for ref_file in sorted(self.references_dir.glob("*.md")):
            if ref_file.name == "index.md":
                continue

            content = ref_file.read_text(encoding='utf-8')

            # Limit size per file
            if len(content) > 40000:
                content = content[:40000] + "\n\n[Content truncated...]"

            references[ref_file.name] = content
            total_chars += len(content)

            # Stop if we've read enough
            if total_chars > max_chars:
                print(f" Limiting input to {max_chars:,} characters")
                break

        return references

    def read_current_skill_md(self):
        """Read existing SKILL.md"""
        if not self.skill_md_path.exists():
            return None
        return self.skill_md_path.read_text(encoding='utf-8')

    def enhance_skill_md(self, references, current_skill_md):
        """Use Claude to enhance SKILL.md"""
        # Build prompt
        prompt = self._build_enhancement_prompt(references, current_skill_md)

        print("\n🤖 Asking Claude to enhance SKILL.md...")
        print(f" Input: {len(prompt):,} characters")

        try:
            message = self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=4096,
                temperature=0.3,
                messages=[{
                    "role": "user",
                    "content": prompt
                }]
            )

            enhanced_content = message.content[0].text
            return enhanced_content

        except Exception as e:
            print(f"❌ Error calling Claude API: {e}")
            return None

    def _build_enhancement_prompt(self, references, current_skill_md):
        """Build the prompt for Claude"""
        # Extract skill name and description
        skill_name = self.skill_dir.name

        prompt = f"""You are enhancing a Claude skill's SKILL.md file. This skill is about: {skill_name}

I've scraped documentation and organized it into reference files. Your job is to create an EXCELLENT SKILL.md that will help Claude use this documentation effectively.

CURRENT SKILL.MD:
{'```markdown' if current_skill_md else '(none - create from scratch)'}
{current_skill_md or 'No existing SKILL.md'}
{'```' if current_skill_md else ''}

REFERENCE DOCUMENTATION:
"""

        for filename, content in references.items():
            prompt += f"\n\n## {filename}\n```markdown\n{content[:30000]}\n```\n"

        # Plain (non-f) string: escape the backslashes so the prompt shows a literal \n
        prompt += """

YOUR TASK:
Create an enhanced SKILL.md that includes:

1. **Clear "When to Use This Skill" section** - Be specific about trigger conditions
2. **Excellent Quick Reference section** - Extract 5-10 of the BEST, most practical code examples from the reference docs
   - Choose SHORT, clear examples that demonstrate common tasks
   - Include both simple and intermediate examples
   - Annotate examples with clear descriptions
   - Use proper language tags (cpp, python, javascript, json, etc.)
3. **Detailed Reference Files description** - Explain what's in each reference file
4. **Practical "Working with This Skill" section** - Give users clear guidance on how to navigate the skill
5. **Key Concepts section** (if applicable) - Explain core concepts
6. **Keep the frontmatter** (---\\nname: ...\\n---) intact

IMPORTANT:
- Extract REAL examples from the reference docs, don't make them up
- Prioritize SHORT, clear examples (5-20 lines max)
- Make it actionable and practical
- Don't be too verbose - be concise but useful
- Maintain the markdown structure for Claude skills
- Keep code examples properly formatted with language tags

OUTPUT:
Return ONLY the complete SKILL.md content, starting with the frontmatter (---).
"""

        return prompt

    def save_enhanced_skill_md(self, content):
        """Save the enhanced SKILL.md"""
        # Backup original
        if self.skill_md_path.exists():
            backup_path = self.skill_md_path.with_suffix('.md.backup')
            self.skill_md_path.rename(backup_path)
            print(f" 💾 Backed up original to: {backup_path.name}")

        # Save enhanced version
        self.skill_md_path.write_text(content, encoding='utf-8')
        print(f" ✅ Saved enhanced SKILL.md")

    def run(self):
        """Main enhancement workflow"""
        print(f"\n{'='*60}")
        print(f"ENHANCING SKILL: {self.skill_dir.name}")
        print(f"{'='*60}\n")

        # Read reference files
        print("📖 Reading reference documentation...")
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found to analyze")
            return False

        print(f" ✓ Read {len(references)} reference files")
        total_size = sum(len(c) for c in references.values())
        print(f" ✓ Total size: {total_size:,} characters\n")

        # Read current SKILL.md
        current_skill_md = self.read_current_skill_md()
        if current_skill_md:
            print(f" Found existing SKILL.md ({len(current_skill_md)} chars)")
        else:
            print(f" No existing SKILL.md, will create new one")

        # Enhance with Claude
        enhanced = self.enhance_skill_md(references, current_skill_md)

        if not enhanced:
            print("❌ Enhancement failed")
            return False

        print(f" ✓ Generated enhanced SKILL.md ({len(enhanced)} chars)\n")

        # Save
        print("💾 Saving enhanced SKILL.md...")
        self.save_enhanced_skill_md(enhanced)

        print(f"\n✅ Enhancement complete!")
        print(f"\nNext steps:")
        print(f" 1. Review: {self.skill_md_path}")
        print(f" 2. If you don't like it, restore backup: {self.skill_md_path.with_suffix('.md.backup')}")
        print(f" 3. Package your skill:")
        # Point at the packager shipped in this repository
        print(f" python3 package_skill.py {self.skill_dir}/")

        return True


def main():
    parser = argparse.ArgumentParser(
        description='Enhance SKILL.md using Claude API',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Using ANTHROPIC_API_KEY environment variable
  export ANTHROPIC_API_KEY=sk-ant-...
  python3 enhance_skill.py output/steam-inventory/

  # Providing API key directly
  python3 enhance_skill.py output/react/ --api-key sk-ant-...

  # Show what would be done (dry run)
  python3 enhance_skill.py output/godot/ --dry-run
"""
    )

    parser.add_argument('skill_dir', type=str,
                        help='Path to skill directory (e.g., output/steam-inventory/)')
    parser.add_argument('--api-key', type=str,
                        help='Anthropic API key (or set ANTHROPIC_API_KEY env var)')
    parser.add_argument('--dry-run', action='store_true',
                        help='Show what would be done without calling API')

    args = parser.parse_args()

    # Validate skill directory
    skill_dir = Path(args.skill_dir)
    if not skill_dir.exists():
        print(f"❌ Error: Directory not found: {skill_dir}")
        sys.exit(1)

    if not skill_dir.is_dir():
        print(f"❌ Error: Not a directory: {skill_dir}")
        sys.exit(1)

    # Dry run mode
    if args.dry_run:
        print(f"🔍 DRY RUN MODE")
        print(f" Would enhance: {skill_dir}")
        print(f" References: {skill_dir / 'references'}")
        print(f" SKILL.md: {skill_dir / 'SKILL.md'}")

        refs_dir = skill_dir / "references"
        if refs_dir.exists():
            ref_files = list(refs_dir.glob("*.md"))
            print(f" Found {len(ref_files)} reference files:")
            for rf in ref_files:
                size = rf.stat().st_size
                print(f" - {rf.name} ({size:,} bytes)")

        print("\nTo actually run enhancement:")
        print(f" python3 enhance_skill.py {skill_dir}")
        return

    # Create enhancer and run
    try:
        enhancer = SkillEnhancer(skill_dir, api_key=args.api_key)
        success = enhancer.run()
        sys.exit(0 if success else 1)

    except ValueError as e:
        print(f"❌ Error: {e}")
        print("\nSet your API key:")
        print(" export ANTHROPIC_API_KEY=sk-ant-...")
        print("Or provide it directly:")
        print(f" python3 enhance_skill.py {skill_dir} --api-key sk-ant-...")
        sys.exit(1)
    except Exception as e:
        print(f"❌ Unexpected error: {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()

244
enhance_skill_local.py Normal file

@@ -0,0 +1,244 @@
#!/usr/bin/env python3
"""
SKILL.md Enhancement Script (Local - Using Claude Code)
Opens a new terminal with Claude Code to enhance SKILL.md, then reports back.
No API key needed - uses your existing Claude Code Max plan!

Usage:
    python3 enhance_skill_local.py output/steam-inventory/
    python3 enhance_skill_local.py output/react/
"""

import os
import sys
import time
import subprocess
import tempfile
from pathlib import Path


class LocalSkillEnhancer:
    def __init__(self, skill_dir):
        self.skill_dir = Path(skill_dir)
        self.references_dir = self.skill_dir / "references"
        self.skill_md_path = self.skill_dir / "SKILL.md"

    def create_enhancement_prompt(self):
        """Create the prompt file for Claude Code"""
        # Read reference files
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found")
            return None

        # Read current SKILL.md
        current_skill_md = ""
        if self.skill_md_path.exists():
            current_skill_md = self.skill_md_path.read_text(encoding='utf-8')

        # Build prompt
        prompt = f"""I need you to enhance the SKILL.md file for the {self.skill_dir.name} skill.

CURRENT SKILL.MD:
{'-'*60}
{current_skill_md if current_skill_md else '(No existing SKILL.md - create from scratch)'}
{'-'*60}

REFERENCE DOCUMENTATION:
{'-'*60}
"""

        for filename, content in references.items():
            prompt += f"\n## {filename}\n{content[:15000]}\n"

        prompt += f"""
{'-'*60}

YOUR TASK:
Create an EXCELLENT SKILL.md file that will help Claude use this documentation effectively.

Requirements:
1. **Clear "When to Use This Skill" section**
   - Be SPECIFIC about trigger conditions
   - List concrete use cases

2. **Excellent Quick Reference section**
   - Extract 5-10 of the BEST, most practical code examples from the reference docs
   - Choose SHORT, clear examples (5-20 lines max)
   - Include both simple and intermediate examples
   - Use proper language tags (cpp, python, javascript, json, etc.)
   - Add clear descriptions for each example

3. **Detailed Reference Files description**
   - Explain what's in each reference file
   - Help users navigate the documentation

4. **Practical "Working with This Skill" section**
   - Clear guidance for beginners, intermediate, and advanced users
   - Navigation tips

5. **Key Concepts section** (if applicable)
   - Explain core concepts
   - Define important terminology

IMPORTANT:
- Extract REAL examples from the reference docs above
- Prioritize SHORT, clear examples
- Make it actionable and practical
- Keep the frontmatter (---\\nname: ...\\n---) intact
- Use proper markdown formatting

SAVE THE RESULT:
Save the complete enhanced SKILL.md to: {self.skill_md_path.absolute()}

First, back up the original to: {self.skill_md_path.with_suffix('.md.backup').absolute()}
"""

        return prompt

    def read_reference_files(self, max_chars=50000):
        """Read reference files with size limit"""
        references = {}

        if not self.references_dir.exists():
            return references

        total_chars = 0
        for ref_file in sorted(self.references_dir.glob("*.md")):
            if ref_file.name == "index.md":
                continue

            content = ref_file.read_text(encoding='utf-8')

            # Limit size per file
            if len(content) > 20000:
                content = content[:20000] + "\n\n[Content truncated...]"

            references[ref_file.name] = content
            total_chars += len(content)

            if total_chars > max_chars:
                break

        return references

    def run(self):
        """Main enhancement workflow"""
        print(f"\n{'='*60}")
        print(f"LOCAL ENHANCEMENT: {self.skill_dir.name}")
        print(f"{'='*60}\n")

        # Validate
        if not self.skill_dir.exists():
            print(f"❌ Directory not found: {self.skill_dir}")
            return False

        # Read reference files
        print("📖 Reading reference documentation...")
        references = self.read_reference_files()

        if not references:
            print("❌ No reference files found to analyze")
            return False

        print(f" ✓ Read {len(references)} reference files")
        total_size = sum(len(c) for c in references.values())
        print(f" ✓ Total size: {total_size:,} characters\n")

        # Create prompt
        print("📝 Creating enhancement prompt...")
        prompt = self.create_enhancement_prompt()

        if not prompt:
            return False

        # Save prompt to temp file
        with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False, encoding='utf-8') as f:
            prompt_file = f.name
            f.write(prompt)

        print(f" ✓ Prompt saved ({len(prompt):,} characters)\n")

        # Launch Claude Code in new terminal
        print("🚀 Launching Claude Code in new terminal...")
        print(" This will:")
        print(" 1. Open a new terminal window")
        print(" 2. Run Claude Code with the enhancement task")
        print(" 3. Claude will read the docs and enhance SKILL.md")
        print(" 4. Terminal will wait for a keypress when done")
        print()

        # Create a shell script to run in the terminal
        # (paths are quoted so a temp directory containing spaces still works)
        shell_script = f'''#!/bin/bash
claude "{prompt_file}"
echo ""
echo "✅ Enhancement complete!"
echo "Press any key to close..."
read -n 1
rm "{prompt_file}"
'''

        # Save shell script
        with tempfile.NamedTemporaryFile(mode='w', suffix='.sh', delete=False, encoding='utf-8') as f:
            script_file = f.name
            f.write(shell_script)

        os.chmod(script_file, 0o755)

        # Launch in new terminal (macOS specific)
        if sys.platform == 'darwin':
            # macOS Terminal - simple approach
            try:
                subprocess.Popen(['open', '-a', 'Terminal', script_file])
            except Exception as e:
                print(f"⚠️ Error launching terminal: {e}")
                print(f"\nManually run: {script_file}")
                return False
        else:
            print("⚠️ Auto-launch only works on macOS")
            print(f"\nManually run this command in a new terminal:")
            print(f" claude '{prompt_file}'")
            print(f"\nThen delete the prompt file:")
            print(f" rm '{prompt_file}'")
            return False

        print("✅ New terminal launched with Claude Code!")
        print()
        print("📊 Status:")
        print(f" - Prompt file: {prompt_file}")
        print(f" - Skill directory: {self.skill_dir.absolute()}")
        print(f" - SKILL.md will be saved to: {self.skill_md_path.absolute()}")
        print(f" - Original backed up to: {self.skill_md_path.with_suffix('.md.backup').absolute()}")
        print()
        print("⏳ Wait for Claude Code to finish in the other terminal...")
        print(" (Usually takes 30-60 seconds)")
        print()
        print("💡 When done:")
        print(f" 1. Check the enhanced SKILL.md: {self.skill_md_path}")
        print(f" 2. If you don't like it, restore: mv {self.skill_md_path.with_suffix('.md.backup')} {self.skill_md_path}")
        # Point at the packager shipped in this repository
        print(f" 3. Package: python3 package_skill.py {self.skill_dir}/")

        return True


def main():
    if len(sys.argv) < 2:
        print("Usage: python3 enhance_skill_local.py <skill_directory>")
        print()
        print("Examples:")
        print(" python3 enhance_skill_local.py output/steam-inventory/")
        print(" python3 enhance_skill_local.py output/react/")
        sys.exit(1)

    skill_dir = sys.argv[1]

    enhancer = LocalSkillEnhancer(skill_dir)
    success = enhancer.run()

    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()

BIN
output/.DS_Store vendored Normal file

Binary file not shown.

78
package_skill.py Normal file

@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""
Simple Skill Packager
Packages a skill directory into a .zip file for Claude.

Usage:
    python3 package_skill.py output/steam-inventory/
    python3 package_skill.py output/react/
"""

import os
import sys
import zipfile
from pathlib import Path


def package_skill(skill_dir):
    """Package a skill directory into a .zip file"""
    skill_path = Path(skill_dir)

    if not skill_path.exists():
        print(f"❌ Error: Directory not found: {skill_dir}")
        return False

    if not skill_path.is_dir():
        print(f"❌ Error: Not a directory: {skill_dir}")
        return False

    # Verify SKILL.md exists
    skill_md = skill_path / "SKILL.md"
    if not skill_md.exists():
        print(f"❌ Error: SKILL.md not found in {skill_dir}")
        return False

    # Create zip filename
    skill_name = skill_path.name
    zip_path = skill_path.parent / f"{skill_name}.zip"

    print(f"📦 Packaging skill: {skill_name}")
    print(f" Source: {skill_path}")
    print(f" Output: {zip_path}")

    # Create zip file
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(skill_path):
            # Skip backup files
            files = [f for f in files if not f.endswith('.backup')]

            for file in files:
                file_path = Path(root) / file
                arcname = file_path.relative_to(skill_path)
                zf.write(file_path, arcname)
                print(f" + {arcname}")

    # Get zip size
    zip_size = zip_path.stat().st_size
    print(f"\n✅ Package created: {zip_path}")
    print(f" Size: {zip_size:,} bytes ({zip_size / 1024:.1f} KB)")

    return True


def main():
    if len(sys.argv) < 2:
        print("Usage: python3 package_skill.py <skill_directory>")
        print()
        print("Examples:")
        print(" python3 package_skill.py output/steam-inventory/")
        print(" python3 package_skill.py output/react/")
        sys.exit(1)

    skill_dir = sys.argv[1]
    success = package_skill(skill_dir)
    sys.exit(0 if success else 1)


if __name__ == "__main__":
    main()