# Frequently Asked Questions (FAQ)

**Version:** 3.2.0
**Last Updated:** 2026-03-15

---

## General Questions

### What is Skill Seekers?

Skill Seekers is a Python tool that converts 17 source types into AI-ready formats for 16+ platforms.

**Source types:** documentation websites, GitHub repos, PDFs, videos, Word docs, EPUB books, Jupyter notebooks, local HTML files, OpenAPI specs, AsciiDoc, PowerPoint, RSS/Atom feeds, man pages, Confluence wikis, Notion pages, Slack/Discord exports, and local codebases.

**Target platforms:** LLM platforms (Claude, Gemini, OpenAI), RAG frameworks (LangChain, LlamaIndex, Haystack), vector databases (ChromaDB, FAISS, Weaviate, Qdrant, Pinecone), and AI coding assistants (Cursor, Windsurf, Cline, Continue.dev).

**Use Cases:**
- Create custom documentation skills for your favorite frameworks
- Analyze GitHub repositories and extract code patterns
- Convert PDF manuals into searchable AI skills
- Import knowledge from Confluence, Notion, or Slack/Discord
- Extract content from videos (YouTube, Vimeo, local files)
- Convert Jupyter notebooks, EPUB books, or PowerPoint slides into skills
- Parse OpenAPI/Swagger specs into API reference skills
- Combine multiple sources (docs + code + PDFs + more) into unified skills

### Which platforms are supported?

**Supported Platforms (16+):**

*LLM Platforms:*
1. **Claude AI** - ZIP format with YAML frontmatter
2. **Google Gemini** - tar.gz format for Grounded Generation
3. **OpenAI ChatGPT** - ZIP format for Vector Stores
4. **Generic Markdown** - ZIP format with markdown files

*RAG Frameworks:*
5. **LangChain** - Document objects for QA chains and agents
6. **LlamaIndex** - TextNodes for query engines
7. **Haystack** - Document objects for enterprise RAG

*Vector Databases:*
8. **ChromaDB** - Direct collection upload
9. **FAISS** - Index files for local similarity search
10. **Weaviate** - Vector objects with schema creation
11. **Qdrant** - Points with payload indexing
12. **Pinecone** - Ready-to-upsert format

*AI Coding Assistants:*
13. **Cursor** - .cursorrules persistent context
14. **Windsurf** - .windsurfrules AI coding rules
15. **Cline** - .clinerules + MCP integration
16. **Continue.dev** - HTTP context server (all IDEs)

Each platform has a dedicated adaptor for optimal formatting and upload.

### Is it free to use?

**Tool:** Yes, Skill Seekers is 100% free and open-source (MIT license).

**API Costs:**
- **Scraping:** Free (just bandwidth)
- **AI Enhancement (API mode):** ~$0.15-0.30 per skill (Claude API)
- **AI Enhancement (LOCAL mode):** No extra cost (uses your existing Claude Code Max plan)
- **Upload:** Free (platform storage limits apply)

**Recommendation:** Use LOCAL mode to avoid API charges, or skip enhancement entirely.
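
As a rough worked example of the API-mode figure above, the sketch below multiplies the quoted per-skill range by a number of skills. The function name `estimate_enhancement_cost` is hypothetical, and the range is only the ~$0.15-0.30 estimate quoted in this FAQ, not a price guarantee.

```python
# Per-skill cost range quoted above for API-mode enhancement (USD).
COST_PER_SKILL_USD = (0.15, 0.30)


def estimate_enhancement_cost(num_skills: int) -> tuple[float, float]:
    """Return (low, high) estimated API cost in USD for num_skills skills."""
    if num_skills < 0:
        raise ValueError("num_skills must be non-negative")
    low, high = COST_PER_SKILL_USD
    return (round(low * num_skills, 2), round(high * num_skills, 2))


if __name__ == "__main__":
    low, high = estimate_enhancement_cost(20)
    print(f"20 skills: roughly ${low:.2f} to ${high:.2f}")
```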

### How do I set up video extraction?

**Quick setup:**
```bash
# 1. Install video support
pip install skill-seekers[video-full]

# 2. Auto-detect GPU and install visual deps
skill-seekers video --setup
```

The `--setup` command auto-detects your GPU vendor (NVIDIA CUDA, AMD ROCm, or CPU-only) and installs the correct PyTorch variant along with easyocr and other visual extraction dependencies. This avoids the ~2GB NVIDIA CUDA download that would happen if easyocr were installed via pip on non-NVIDIA systems.

**What it detects:**
- **NVIDIA:** Uses `nvidia-smi` to find CUDA version → installs matching `cu124`/`cu121`/`cu118` PyTorch
- **AMD:** Uses `rocminfo` to find ROCm version → installs matching ROCm PyTorch
- **CPU-only:** Installs lightweight CPU-only PyTorch
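
The detection order described above can be approximated in a few lines of Python. This is an illustrative sketch, not the tool's actual implementation: it only checks whether the `nvidia-smi` or `rocminfo` executables are on `PATH` and does not parse CUDA or ROCm versions. The function name `detect_gpu_vendor` and its `which` parameter are hypothetical.

```python
import shutil
from typing import Callable, Optional


def detect_gpu_vendor(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Guess the GPU vendor from the vendor tools available on PATH."""
    if which("nvidia-smi"):
        return "nvidia"
    if which("rocminfo"):
        return "amd"
    # Neither vendor tool was found, so fall back to CPU-only.
    return "cpu"


if __name__ == "__main__":
    print(f"Detected backend: {detect_gpu_vendor()}")
```

Running it before `skill-seekers video --setup` can give a quick hint about which PyTorch variant to expect.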

### What source types are supported?

Skill Seekers supports **17 source types**:

| # | Source Type | CLI Command | Auto-Detection |
|---|------------|-------------|----------------|
| 1 | Documentation (web) | `scrape` / `create <url>` | HTTP/HTTPS URLs |
| 2 | GitHub repo | `github` / `create owner/repo` | `owner/repo` or github.com URLs |
| 3 | PDF | `pdf` / `create file.pdf` | `.pdf` extension |
| 4 | Word (.docx) | `word` / `create file.docx` | `.docx` extension |
| 5 | EPUB | `epub` / `create file.epub` | `.epub` extension |
| 6 | Video | `video` / `create <url/file>` | YouTube/Vimeo URLs, video extensions |
| 7 | Local codebase | `analyze` / `create ./path` | Directory paths |
| 8 | Jupyter Notebook | `jupyter` / `create file.ipynb` | `.ipynb` extension |
| 9 | Local HTML | `html` / `create file.html` | `.html`/`.htm` extensions |
| 10 | OpenAPI/Swagger | `openapi` / `create spec.yaml` | `.yaml`/`.yml` with OpenAPI content |
| 11 | AsciiDoc | `asciidoc` / `create file.adoc` | `.adoc`/`.asciidoc` extensions |
| 12 | PowerPoint | `pptx` / `create file.pptx` | `.pptx` extension |
| 13 | RSS/Atom | `rss` / `create feed.rss` | `.rss`/`.atom` extensions |
| 14 | Man pages | `manpage` / `create cmd.1` | `.1`-`.8`/`.man` extensions |
| 15 | Confluence | `confluence` | API or export directory |
| 16 | Notion | `notion` | API or export directory |
| 17 | Slack/Discord | `chat` | Export directory or API |

The `create` command auto-detects the source type from your input, so you often don't need to specify a subcommand.
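
To make the auto-detection column concrete, here is a minimal sketch of how URL and extension rules like those in the table could be applied. It is an assumption-laden illustration, not the logic `create` actually uses: the function `guess_source_type` is hypothetical, and it deliberately skips cases that need more than the input string (bare `owner/repo` names, directories, video file extensions, and YAML files that must be opened to confirm OpenAPI content).

```python
from pathlib import Path

# Extension-to-command mapping copied from the auto-detection column above.
EXTENSION_TYPES = {
    ".pdf": "pdf", ".docx": "word", ".epub": "epub", ".ipynb": "jupyter",
    ".html": "html", ".htm": "html", ".adoc": "asciidoc",
    ".asciidoc": "asciidoc", ".pptx": "pptx", ".rss": "rss", ".atom": "rss",
    ".man": "manpage",
}


def guess_source_type(source: str) -> str:
    """Return a best-guess subcommand name for a URL or file path."""
    if source.startswith(("http://", "https://")):
        if "github.com/" in source:
            return "github"
        if any(host in source for host in ("youtube.com/", "youtu.be/", "vimeo.com/")):
            return "video"
        return "scrape"
    suffix = Path(source).suffix.lower()
    if suffix in EXTENSION_TYPES:
        return EXTENSION_TYPES[suffix]
    # Man page sections .1 through .8
    if len(suffix) == 2 and suffix[1] in "12345678":
        return "manpage"
    return "unknown"
```

When this kind of guess would be ambiguous, naming the subcommand explicitly (for example `skill-seekers pdf manual.pdf`) avoids relying on detection.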

### How long does it take to create a skill?

**Typical Times:**
- Documentation scraping: 5-45 minutes (depends on size)
- GitHub analysis: 1-5 minutes (basic) or 20-60 minutes (C3.x deep analysis)
- PDF extraction: 30 seconds - 5 minutes
- Video extraction: 2-10 minutes (depends on length and visual analysis)
- Word/EPUB/PPTX: 10-60 seconds
- Jupyter notebook: 10-30 seconds
- OpenAPI spec: 5-15 seconds
- Confluence/Notion import: 1-5 minutes (depends on space size)
- AI enhancement: 30-60 seconds (LOCAL or API mode)
- Total workflow: 10-60 minutes

**Speed Tips:**
- Use `--async` for 2-3x faster scraping
- Use `--skip-scrape` to rebuild without re-scraping
- Skip AI enhancement for faster workflow

---

## Installation & Setup

### How do I install Skill Seekers?

```bash
# Basic installation
pip install skill-seekers

# With all platform support
pip install skill-seekers[all-llms]

# Development installation
git clone https://github.com/yusufkaraaslan/Skill_Seekers.git
cd Skill_Seekers
pip install -e ".[all-llms,dev]"
```

### What Python version do I need?

**Required:** Python 3.10 or higher
**Tested on:** Python 3.10, 3.11, 3.12, 3.13
**OS Support:** Linux, macOS, Windows (WSL recommended)

**Check your version:**
```bash
python --version  # Should be 3.10+
```
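
The same check can be done from inside Python, which is handy when several interpreters are installed and it is unclear which one `pip` belongs to. This is a small sketch using only the standard library; the helper name `python_is_supported` is hypothetical.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated above


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if python_is_supported() else "too old for Skill Seekers"
    print(f"Python {current}: {status}")
```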

### Why do I get "No module named 'skill_seekers'" error?

**Common Causes:**
1. Package not installed
2. Wrong Python environment

**Solutions:**
```bash
# Install package
pip install skill-seekers

# Or for development
pip install -e .

# If pip and python belong to different environments,
# install with the interpreter you actually run
python -m pip install skill-seekers

# Verify installation
skill-seekers --version
```

### How do I set up API keys?

```bash
# Claude AI (for enhancement and upload)
export ANTHROPIC_API_KEY=sk-ant-...

# Google Gemini (for upload)
export GOOGLE_API_KEY=AIza...

# OpenAI ChatGPT (for upload)
export OPENAI_API_KEY=sk-...

# GitHub (for higher rate limits)
export GITHUB_TOKEN=ghp_...

# Make permanent (add to ~/.bashrc or ~/.zshrc)
echo 'export ANTHROPIC_API_KEY=sk-ant-...' >> ~/.bashrc
```
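
Before starting a long run, it can help to confirm which of these variables are actually visible to the process. The sketch below uses only the standard library and the variable names from the commands above; the helper `missing_keys` is a hypothetical name, and none of the keys is assumed to be mandatory for every workflow.

```python
import os
from typing import Mapping

# Variable names taken from the export commands above.
KEY_PURPOSES = {
    "ANTHROPIC_API_KEY": "Claude enhancement and upload",
    "GOOGLE_API_KEY": "Gemini upload",
    "OPENAI_API_KEY": "OpenAI upload",
    "GITHUB_TOKEN": "higher GitHub rate limits",
}


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of variables that are unset or blank."""
    return [name for name in KEY_PURPOSES if not env.get(name, "").strip()]


if __name__ == "__main__":
    for name in missing_keys():
        print(f"{name} is not set (needed for: {KEY_PURPOSES[name]})")
```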

---

## Usage Questions

### How do I scrape documentation?

**Using preset config:**
```bash
skill-seekers scrape --config react
```

**Using custom URL:**
```bash
skill-seekers scrape --base-url https://docs.example.com --name my-framework
```

**From custom config file:**
```bash
skill-seekers scrape --config configs/my-framework.json
```

### Can I analyze GitHub repositories?

Yes! Skill Seekers has powerful GitHub analysis:

```bash
# Basic analysis (fast)
skill-seekers github https://github.com/facebook/react

# Deep C3.x analysis (includes patterns, tests, guides)
skill-seekers github https://github.com/vercel/next.js --analysis-depth c3x
```

**C3.x Features:**
- Design pattern detection (10 GoF patterns)
- Test example extraction
- How-to guide generation
- Configuration pattern extraction
- Architectural overview
- API reference generation

### Can I extract content from PDFs?

Yes! PDF extraction with OCR support:

```bash
# Basic PDF extraction
skill-seekers pdf manual.pdf --name product-manual

# With OCR (for scanned PDFs)
skill-seekers pdf scanned.pdf --enable-ocr

# Extract images and tables
skill-seekers pdf document.pdf --extract-images --extract-tables
```

### How do I scrape a Jupyter Notebook?

```bash
# Extract cells, outputs, and markdown from a notebook
skill-seekers jupyter analysis.ipynb --name data-analysis

# Or use auto-detection
skill-seekers create analysis.ipynb
```

Jupyter extraction preserves code cells, markdown cells, and cell outputs. It works with `.ipynb` files from JupyterLab, Google Colab, and other notebook environments.

### How do I import from Confluence or Notion?

**Confluence:**
```bash
# From Confluence Cloud API
export CONFLUENCE_URL=https://yourorg.atlassian.net
export CONFLUENCE_TOKEN=your-api-token
export CONFLUENCE_EMAIL=your-email@example.com
skill-seekers confluence --space MYSPACE --name my-wiki

# From a Confluence HTML/XML export directory
skill-seekers confluence --export-dir ./confluence-export --name my-wiki
```

**Notion:**
```bash
# From Notion API
export NOTION_TOKEN=secret_...
skill-seekers notion --database DATABASE_ID --name my-notes

# From a Notion HTML/Markdown export directory
skill-seekers notion --export-dir ./notion-export --name my-notes
```

### How do I convert Word, EPUB, or PowerPoint files?

```bash
# Word document
skill-seekers word report.docx --name quarterly-report

# EPUB book
skill-seekers epub handbook.epub --name dev-handbook

# PowerPoint presentation
skill-seekers pptx slides.pptx --name training-deck

# Or use auto-detection for any of them
skill-seekers create report.docx
skill-seekers create handbook.epub
skill-seekers create slides.pptx
```

### How do I parse an OpenAPI/Swagger spec?

```bash
# From a local YAML/JSON file
skill-seekers openapi api-spec.yaml --name my-api

# Auto-detection works too
skill-seekers create api-spec.yaml
```

OpenAPI extraction parses endpoints, schemas, parameters, and examples into a structured API reference skill.

### How do I extract content from RSS feeds or man pages?

```bash
# RSS/Atom feed
skill-seekers rss https://blog.example.com/feed.xml --name blog-feed

# Man page
skill-seekers manpage grep.1 --name grep-manual
```

### How do I import from Slack or Discord?

```bash
# From a Slack export directory
skill-seekers chat --platform slack --export-dir ./slack-export --name team-knowledge

# From a Discord export directory
skill-seekers chat --platform discord --export-dir ./discord-export --name server-archive
```

### Can I combine multiple sources?

Yes! Unified multi-source scraping:

**Create unified config** (`configs/unified/my-framework.json`):
```json
{
  "name": "my-framework",
  "sources": {
    "documentation": {
      "type": "docs",
      "base_url": "https://docs.example.com"
    },
    "github": {
      "type": "github",
      "repo_url": "https://github.com/org/repo"
    },
    "pdf": {
      "type": "pdf",
      "pdf_path": "manual.pdf"
    }
  }
}
```

**Run unified scraping:**
```bash
skill-seekers unified --config configs/unified/my-framework.json
```
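
When several frameworks need similar unified configs, generating the JSON from a script avoids copy-paste mistakes. The following is a minimal sketch that reproduces only the keys shown in the example above; the helper names `build_unified_config` and `write_config` are hypothetical, and any further config fields the tool may accept are not covered here.

```python
import json
from pathlib import Path


def build_unified_config(name: str, docs_url: str, repo_url: str, pdf_path: str) -> dict:
    """Assemble a unified config dict with the same keys as the example above."""
    return {
        "name": name,
        "sources": {
            "documentation": {"type": "docs", "base_url": docs_url},
            "github": {"type": "github", "repo_url": repo_url},
            "pdf": {"type": "pdf", "pdf_path": pdf_path},
        },
    }


def write_config(config: dict, directory: str = "configs/unified") -> Path:
    """Write the config as <directory>/<name>.json and return the path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{config['name']}.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    cfg = build_unified_config(
        "my-framework", "https://docs.example.com",
        "https://github.com/org/repo", "manual.pdf",
    )
    print(json.dumps(cfg, indent=2))
```

The path returned by `write_config` can then be passed to `skill-seekers unified --config`.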

### How do I upload skills to platforms?

```bash
# Upload to Claude AI
export ANTHROPIC_API_KEY=sk-ant-...
skill-seekers upload output/react-claude.zip --target claude

# Upload to Google Gemini
export GOOGLE_API_KEY=AIza...
skill-seekers upload output/react-gemini.tar.gz --target gemini

# Upload to OpenAI ChatGPT
export OPENAI_API_KEY=sk-...
skill-seekers upload output/react-openai.zip --target openai
```

**Or use complete workflow:**
```bash
skill-seekers install react --target claude --upload
```

---

## Platform-Specific Questions

### What's the difference between platforms?

| Feature | Claude AI | Google Gemini | OpenAI ChatGPT | Markdown |
|---------|-----------|---------------|----------------|----------|
| Format | ZIP + YAML | tar.gz | ZIP | ZIP |
| Upload API | Projects API | Corpora API | Vector Stores | N/A |
| Model | Sonnet 4.5 | Gemini 2.0 Flash | GPT-4o | N/A |
| Max Size | 32MB | 10MB | 512MB | N/A |
| Use Case | Claude Code | Grounded Gen | ChatGPT Custom | Export |

**Choose based on:**
- Claude AI: Best for Claude Code integration
- Google Gemini: Best for Grounded Generation in Gemini
- OpenAI ChatGPT: Best for ChatGPT Custom GPTs
- Markdown: Generic export for other tools
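
The "Max Size" row of the table above lends itself to a quick pre-upload check. The sketch below is illustrative only: the helper names are hypothetical, the limits are copied from the table, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the table does not say how each platform counts megabytes.

```python
from pathlib import Path

# "Max Size" row of the comparison table above, in megabytes.
MAX_SIZE_MB = {"claude": 32, "gemini": 10, "openai": 512}


def fits_platform(size_bytes: int, target: str) -> bool:
    """Return True if a package of size_bytes is within the target's limit."""
    limit_mb = MAX_SIZE_MB.get(target)
    if limit_mb is None:  # e.g. "markdown": no limit listed in the table
        return True
    return size_bytes <= limit_mb * 1024 * 1024


def check_package(path: str, target: str) -> bool:
    """Check an existing package file on disk against the target's limit."""
    return fits_platform(Path(path).stat().st_size, target)
```

For example, `check_package("output/react-gemini.tar.gz", "gemini")` would report whether that archive stays under the 10MB figure.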

### Can I use multiple platforms at once?

Yes! Package and upload to all platforms:

```bash
# Package for all platforms
for platform in claude gemini openai markdown; do
  skill-seekers package output/react/ --target "$platform"
done

# Upload to all platforms
skill-seekers install react --target claude,gemini,openai --upload
```

### How do I use skills in Claude Code?

1. **Install skill to Claude Code directory:**
   ```bash
   skill-seekers install-agent --skill-dir output/react/ --agent-dir ~/.claude/skills/react
   ```

2. **Use in Claude Code:**
   ```
   Use the react skill to explain React hooks
   ```

3. **Or upload to Claude AI:**
   ```bash
   skill-seekers upload output/react-claude.zip --target claude
   ```

---

## Features & Capabilities

### What is AI enhancement?

AI enhancement transforms basic skills (2-3/10 quality) into production-ready skills (8-9/10 quality) using LLMs.

**Two Modes:**
1. **API Mode:** Direct Claude API calls (fast, costs ~$0.15-0.30 per skill)
2. **LOCAL Mode:** Uses Claude Code CLI (no extra cost with your Max plan)

**What it improves:**
- Better organization and structure
- Clearer explanations
- More examples and use cases
- Better cross-references
- Improved searchability

**Usage:**
```bash
# API mode (if ANTHROPIC_API_KEY is set)
skill-seekers enhance output/react/

# LOCAL mode (no API charges)
skill-seekers enhance output/react/ --mode LOCAL

# Background mode
skill-seekers enhance output/react/ --background
skill-seekers enhance-status output/react/ --watch
```

### What are C3.x features?

C3.x features are advanced codebase analysis capabilities:

- **C3.1:** Design pattern detection (Singleton, Factory, Strategy, etc.)
- **C3.2:** Test example extraction (real usage examples from tests)
- **C3.3:** How-to guide generation (educational guides from test workflows)
- **C3.4:** Configuration pattern extraction (env vars, config files)
- **C3.5:** Architectural overview (system architecture analysis)
- **C3.6:** AI enhancement (Claude API integration for insights)
- **C3.7:** Architectural pattern detection (MVC, MVVM, Repository, etc.)
- **C3.8:** Standalone codebase scraping (300+ line SKILL.md from code alone)

**Enable C3.x:**
```bash
# All C3.x features enabled by default
skill-seekers codebase --directory /path/to/repo

# Skip specific features
skill-seekers codebase --directory . --skip-patterns --skip-how-to-guides
```

### What are router skills?

Router skills help Claude navigate large documentation (>500 pages) by providing a table of contents and keyword index.

**When to use:**
- Documentation with 500+ pages
- Complex multi-section docs
- Large API references

**Generate router:**
```bash
skill-seekers generate-router output/large-docs/
```

### What preset configurations are available?

**24 preset configs:**
- Web: react, vue, angular, svelte, nextjs
- Python: django, flask, fastapi, sqlalchemy, pytest
- Game Dev: godot, pygame, unity
- DevOps: docker, kubernetes, terraform, ansible
- Unified: react-unified, vue-unified, nextjs-unified, etc.

**List all:**
```bash
skill-seekers list-configs
```

---

## Troubleshooting

### Scraping is very slow, how can I speed it up?

**Solutions:**
1. **Use async mode** (2-3x faster):
   ```bash
   skill-seekers scrape --config react --async
   ```

2. **Lower the `rate_limit` delay** (less waiting between requests):
   ```json
   {
     "rate_limit": 0.1  // Shorter delay, but the site may start throttling you
   }
   ```

3. **Limit pages**:
   ```json
   {
     "max_pages": 100  // Stop after 100 pages
   }
   ```
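
To see how much these two settings matter, a back-of-the-envelope estimate is enough. The sketch below assumes that `rate_limit` is a delay in seconds applied once per page and that pages are fetched one after another; both are assumptions for illustration, and `fetch_seconds` (average download time per page) is a hypothetical parameter. Async mode would beat this estimate.

```python
def estimate_scrape_seconds(max_pages: int, rate_limit: float,
                            fetch_seconds: float = 0.0) -> float:
    """Rough time for a sequential crawl: one delay plus one fetch per page."""
    if max_pages < 0 or rate_limit < 0 or fetch_seconds < 0:
        raise ValueError("arguments must be non-negative")
    return max_pages * (rate_limit + fetch_seconds)


if __name__ == "__main__":
    for delay in (0.5, 0.1):
        minutes = estimate_scrape_seconds(500, delay, fetch_seconds=1.0) / 60
        print(f"rate_limit={delay}: about {minutes:.1f} minutes for 500 pages")
```

Under these assumptions, the delay alone costs 500 × 0.5 s = 250 s for a 500-page site, so lowering it helps most when individual pages download quickly.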

### Why are some pages missing?

**Common Causes:**
1. **URL patterns exclude them**
2. **Max pages limit reached**
3. **BFS didn't reach them**

**Solutions:**
```bash
# Check URL patterns in your config file (JSON)
{
  "url_patterns": {
    "include": ["/docs/"],  // Make sure your pages match
    "exclude": []           // Remove overly broad exclusions
  }
}

# Increase max pages in your config file (JSON)
{
  "max_pages": 1000  // Default is 500
}

# Use verbose mode to see what's being scraped
skill-seekers scrape --config react --verbose
```
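
When a page is unexpectedly missing, it can help to test its URL against the patterns by hand. The sketch below assumes the patterns are plain substrings, that an exclude match always wins, and that an empty include list matches everything; the scraper's real matching rules may differ, and `url_is_selected` is a hypothetical helper.

```python
def url_is_selected(url: str, include: list[str], exclude: list[str]) -> bool:
    """Apply include/exclude substring patterns to one URL."""
    if any(pattern in url for pattern in exclude):
        return False
    # An empty include list is treated as "match everything".
    return not include or any(pattern in url for pattern in include)


if __name__ == "__main__":
    include, exclude = ["/docs/", "/api/"], ["/blog/"]
    for url in (
        "https://docs.example.com/docs/intro",
        "https://docs.example.com/blog/release-notes",
        "https://docs.example.com/guide/setup",
    ):
        verdict = "kept" if url_is_selected(url, include, exclude) else "skipped"
        print(f"{verdict}: {url}")
```

A page under `/guide/` is skipped in this sketch because it matches no include pattern, which is the kind of silent omission the first cause above describes.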

### How do I fix "NetworkError: Connection failed"?

**Solutions:**
1. **Check internet connection**
2. **Verify URL is accessible**:
   ```bash
   curl -I https://docs.example.com
   ```

3. **Increase timeout**:
   ```json
   {
     "timeout": 30  // 30 seconds
   }
   ```

4. **Check rate limiting**:
   ```json
   {
     "rate_limit": 1.0  // Slower requests
   }
   ```

### Tests are failing, what should I do?

**Quick fixes:**
```bash
# Ensure package and dev dependencies are installed
pip install -e ".[all-llms,dev]"

# Clear caches (find works in any shell; a bare ** glob needs globstar)
rm -rf .pytest_cache/
find . -type d -name __pycache__ -prune -exec rm -rf {} +

# Run specific failing test
pytest tests/test_file.py::test_name -vv

# Check for broken or conflicting dependencies
pip check
```

**If still failing:**
1. Check [Troubleshooting Guide](../TROUBLESHOOTING.md)
2. Report issue on [GitHub](https://github.com/yusufkaraaslan/Skill_Seekers/issues)

---

## MCP Server Questions

### How do I start the MCP server?

```bash
# stdio mode (Claude Code, VS Code + Cline)
skill-seekers-mcp

# HTTP mode (Cursor, Windsurf, IntelliJ)
skill-seekers-mcp --transport http --port 8765
```

### What MCP tools are available?

**26 MCP tools:**

*Core Tools (9):*
1. `list_configs` - List preset configurations
2. `generate_config` - Generate config from docs URL
3. `validate_config` - Validate config structure
4. `estimate_pages` - Estimate page count
5. `scrape_docs` - Scrape documentation
6. `package_skill` - Package to .zip (supports `--format` and `--target`)
7. `upload_skill` - Upload to platform (supports `--target`)
8. `enhance_skill` - AI enhancement
9. `install_skill` - Complete workflow

*Extended Tools (11):*
10. `scrape_github` - GitHub analysis
11. `scrape_pdf` - PDF extraction
12. `unified_scrape` - Multi-source scraping
13. `merge_sources` - Merge docs + code
14. `detect_conflicts` - Find discrepancies
15. `split_config` - Split large configs
16. `generate_router` - Generate router skills
17. `add_config_source` - Register git repos
18. `fetch_config` - Fetch configs from git
19. `list_config_sources` - List registered sources
20. `remove_config_source` - Remove config source

*Vector DB Tools (4):*
21. `export_to_chroma` - Export to ChromaDB
22. `export_to_weaviate` - Export to Weaviate
23. `export_to_faiss` - Export to FAISS
24. `export_to_qdrant` - Export to Qdrant

*Cloud Tools (2):*
25. `cloud_upload` - Upload to S3/GCS/Azure
26. `cloud_download` - Download from cloud storage

### How do I configure MCP for Claude Code?

**Add to `claude_desktop_config.json`:**
```json
{
  "mcpServers": {
    "skill-seekers": {
      "command": "skill-seekers-mcp"
    }
  }
}
```

**Restart Claude Code**, then use:
```
Use skill-seekers MCP tools to scrape React documentation
```
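
If the config file already lists other MCP servers, editing it by hand risks overwriting them. The sketch below merges the `skill-seekers` entry shown above into an existing JSON file using only the standard library. The helper `add_mcp_server` is hypothetical, and the file location is left as a parameter because it varies by platform and is not stated here.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: str, name: str = "skill-seekers",
                   command: str = "skill-seekers-mcp") -> dict:
    """Add or update one mcpServers entry, keeping any existing entries."""
    path = Path(config_path)
    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Keeping a backup copy of the original file before running something like this is a sensible precaution.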

---

## Advanced Questions

### Can I use Skill Seekers programmatically?

Yes! Full API for Python integration:

```python
from skill_seekers.cli.doc_scraper import scrape_all, build_skill
from skill_seekers.cli.adaptors import get_adaptor

# Scrape documentation
pages = scrape_all(
    base_url='https://docs.example.com',
    selectors={'main_content': 'article'},
    config={'name': 'example'}
)

# Build skill
skill_path = build_skill(
    config_name='example',
    output_dir='output/example'
)

# Package for platform
adaptor = get_adaptor('claude')
package_path = adaptor.package(skill_path, 'output/')
```

**See:** [API Reference](reference/API_REFERENCE.md)

### How do I create custom configurations?

**Create config file** (`configs/my-framework.json`):
```json
{
  "name": "my-framework",
  "description": "My custom framework documentation",
  "base_url": "https://docs.example.com/",
  "selectors": {
    "main_content": "article",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/docs/", "/api/"],
    "exclude": ["/blog/", "/changelog/"]
  },
  "categories": {
    "getting_started": ["intro", "quickstart"],
    "api": ["api", "reference"]
  },
  "rate_limit": 0.5,
  "max_pages": 500
}
```

The values under `selectors` are CSS selectors. Standard JSON does not allow comments, so keep notes like this outside the file.

**Use config:**
```bash
skill-seekers scrape --config configs/my-framework.json
```

### Can I contribute preset configs?

Yes! We welcome config contributions:

1. **Create config** in `configs/` directory
2. **Test it** thoroughly:
   ```bash
   skill-seekers scrape --config configs/your-framework.json
   ```
3. **Submit PR** on [GitHub](https://github.com/yusufkaraaslan/Skill_Seekers)

**Guidelines:**
- Name: `{framework-name}.json`
- Include all required fields
- Add to appropriate category
- Test with real documentation

### How do I debug scraping issues?

```bash
# Verbose output
skill-seekers scrape --config react --verbose

# Dry run (no actual scraping)
skill-seekers scrape --config react --dry-run

# Single page test
skill-seekers scrape --base-url https://docs.example.com/intro --max-pages 1

# Check selectors
skill-seekers validate-config configs/react.json
```

---

## Getting More Help

### Where can I find documentation?

**Main Documentation:**
- [README](../README.md) - Project overview
- [Usage Guide](guides/USAGE.md) - Detailed usage
- [API Reference](reference/API_REFERENCE.md) - Programmatic usage
- [Troubleshooting](../TROUBLESHOOTING.md) - Common issues

**Guides:**
- [MCP Setup](guides/MCP_SETUP.md)
- [Testing Guide](guides/TESTING_GUIDE.md)
- [Migration Guide](guides/MIGRATION_GUIDE.md)
- [Quick Reference](QUICK_REFERENCE.md)

### How do I report bugs?

1. **Check existing issues:** https://github.com/yusufkaraaslan/Skill_Seekers/issues
2. **Create new issue** with:
   - Skill Seekers version (`skill-seekers --version`)
   - Python version (`python --version`)
   - Operating system
   - Config file (if relevant)
   - Error message and stack trace
   - Steps to reproduce

### How do I request features?

1. **Check roadmap:** [ROADMAP.md](../ROADMAP.md)
2. **Create feature request:** https://github.com/yusufkaraaslan/Skill_Seekers/issues
3. **Join discussions:** https://github.com/yusufkaraaslan/Skill_Seekers/discussions

### Is there a community?

Yes!
- **GitHub Discussions:** https://github.com/yusufkaraaslan/Skill_Seekers/discussions
- **Issue Tracker:** https://github.com/yusufkaraaslan/Skill_Seekers/issues
- **Project Board:** https://github.com/users/yusufkaraaslan/projects/2

---

**Version:** 3.2.0
**Last Updated:** 2026-03-15
**Questions? Ask on [GitHub Discussions](https://github.com/yusufkaraaslan/Skill_Seekers/discussions)**