Files
yusyus 37cb307455 docs: update all documentation for 17 source types
Update 32 documentation files across English and Chinese (zh-CN) docs
to reflect the 10 new source types added in the previous commit.

Updated files:
- README.md, README.zh-CN.md — taglines, feature lists, examples, install extras
- docs/reference/ — CLI_REFERENCE, FEATURE_MATRIX, MCP_REFERENCE, CONFIG_FORMAT, API_REFERENCE
- docs/features/ — UNIFIED_SCRAPING with generic merge docs
- docs/advanced/ — multi-source guide, MCP server guide
- docs/getting-started/ — installation extras, quick-start examples
- docs/user-guide/ — core-concepts, scraping, packaging, workflows (complex-merge)
- docs/ — FAQ, TROUBLESHOOTING, BEST_PRACTICES, ARCHITECTURE, UNIFIED_PARSERS, README
- Root — BULLETPROOF_QUICKSTART, CONTRIBUTING, ROADMAP
- docs/zh-CN/ — Chinese translations for all of the above

32 files changed, +3,016 lines, -245 lines
2026-03-15 15:56:04 +03:00

7.3 KiB
Raw Permalink Blame History

MCP Server Setup Guide

Skill Seekers v3.2.0
通过 Model Context Protocol 与 AI 代理集成


What is MCP?

MCP (Model Context Protocol) lets AI agents like Claude Code control Skill Seekers through natural language:

You: "Scrape the React documentation"
Claude: ▶️ scrape_docs({"url": "https://react.dev/"})
        ✅ Done! Created output/react/

Installation

# Install with MCP support
pip install skill-seekers[mcp]

# Verify
skill-seekers-mcp --version

Transport Modes

stdio Mode (Default)

For Claude Code, VS Code + Cline:

skill-seekers-mcp

Use when:

  • Running in Claude Code
  • Direct integration with terminal-based agents
  • Simple local setup

HTTP Mode

For Cursor, Windsurf, HTTP clients:

# Start HTTP server
skill-seekers-mcp --transport http --port 8765

# Custom host
skill-seekers-mcp --transport http --host 0.0.0.0 --port 8765

Use when:

  • IDE integration (Cursor, Windsurf)
  • Remote access needed
  • Multiple clients

Claude Code Integration

Automatic Setup

# In Claude Code, run:
/claude add-mcp-server skill-seekers

Or manually add to ~/.claude/mcp.json:

{
  "mcpServers": {
    "skill-seekers": {
      "command": "skill-seekers-mcp",
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "GITHUB_TOKEN": "ghp_..."
      }
    }
  }
}

Usage

Once connected, ask Claude:

"List available configs"
"Scrape the Django documentation"
"Package output/react for Gemini"
"Enhance output/my-skill with security-focus workflow"

Cursor IDE Integration

Setup

  1. Start MCP server:
skill-seekers-mcp --transport http --port 8765
  1. In Cursor Settings → MCP:
    • Name: skill-seekers
    • URL: http://localhost:8765

Usage

In Cursor chat:

"Create a skill from the current project"
"Analyze this codebase and generate a cursorrules file"

Windsurf Integration

Setup

  1. Start MCP server:
skill-seekers-mcp --transport http --port 8765
  1. In Windsurf Settings:
    • Add MCP server endpoint: http://localhost:8765

可用工具

27 个工具,按类别组织:

核心工具9 个)

  • list_configs - 列出预设
  • generate_config - 从 URL 创建配置
  • validate_config - 检查配置
  • estimate_pages - 页面估算
  • scrape_docs - 抓取文档
  • package_skill - 打包技能
  • upload_skill - 上传到平台
  • enhance_skill - AI 增强
  • install_skill - 完整工作流

扩展工具10 个)

  • scrape_github - GitHub 仓库
  • scrape_pdf - PDF 提取
  • scrape_generic - 10 种新来源类型的通用抓取器(见下文)
  • scrape_codebase - 本地代码
  • unified_scrape - 多源抓取
  • detect_patterns - 模式检测
  • extract_test_examples - 测试示例
  • build_how_to_guides - 操作指南
  • extract_config_patterns - 配置模式
  • detect_conflicts - 文档/代码冲突

配置源5 个)

  • add_config_source - 注册 Git 源
  • list_config_sources - 列出源
  • remove_config_source - 删除源
  • fetch_config - 获取配置
  • submit_config - 提交配置

向量数据库4 个)

  • export_to_weaviate
  • export_to_chroma
  • export_to_faiss
  • export_to_qdrant

scrape_generic 工具

scrape_generic 是 v3.2.0 新增的 10 种来源类型的通用入口。它将请求委托给相应的 CLI 抓取器模块。

支持的来源类型: jupyterJupyter 笔记本)、html(本地 HTMLopenapiOpenAPI/Swagger 规范)、asciidocAsciiDoc 文档)、pptxPowerPoint 演示文稿)、rssRSS/Atom 订阅源)、manpageMan 手册页)、confluenceConfluence 维基)、notionNotion 页面)、chatSlack/Discord 聊天记录)

参数:

名称 类型 必需 描述
source_type string 10 种支持的来源类型之一
name string 输出的技能名称
path string 文件或目录路径(用于基于文件的来源)
url string URL用于 confluence、notion、rss 等基于 URL 的来源)

使用示例:

"抓取 Jupyter 笔记本 analysis.ipynb"
→ scrape_generic(source_type="jupyter", name="analysis", path="analysis.ipynb")

"提取 API 规范内容"
→ scrape_generic(source_type="openapi", name="my-api", path="api-spec.yaml")

"处理 PowerPoint 演示文稿"
→ scrape_generic(source_type="pptx", name="slides", path="presentation.pptx")

"抓取 Confluence 维基"
→ scrape_generic(source_type="confluence", name="wiki", url="https://wiki.example.com")

详见 MCP 参考文档


Common Workflows

Workflow 1: Documentation Skill

User: "Create a skill from React docs"
Claude: ▶️ scrape_docs({"url": "https://react.dev/"})
        ⏳ Scraping...
        ✅ Created output/react/
        
        ▶️ package_skill({"skill_directory": "output/react/", "target": "claude"})
        ✅ Created output/react-claude.zip
        
        Skill ready! Upload to Claude?

Workflow 2: GitHub Analysis

User: "Analyze the facebook/react repo"
Claude: ▶️ scrape_github({"repo": "facebook/react"})
        ⏳ Analyzing...
        ✅ Created output/react/
        
        ▶️ enhance_skill({"skill_directory": "output/react/", "workflow": "architecture-comprehensive"})
        ✅ Enhanced with architecture analysis

Workflow 3: Multi-Platform Export

User: "Create Django skill for all platforms"
Claude: ▶️ scrape_docs({"config": "django"})
        ✅ Created output/django/
        
        ▶️ package_skill({"skill_directory": "output/django/", "target": "claude"})
        ▶️ package_skill({"skill_directory": "output/django/", "target": "gemini"})
        ▶️ package_skill({"skill_directory": "output/django/", "target": "openai"})
        ✅ Created packages for all platforms

Configuration

Environment Variables

Set in ~/.claude/mcp.json or before starting server:

export ANTHROPIC_API_KEY=sk-ant-...
export GOOGLE_API_KEY=AIza...
export OPENAI_API_KEY=sk-...
export GITHUB_TOKEN=ghp_...

Server Options

# Debug mode
skill-seekers-mcp --verbose

# Custom port
skill-seekers-mcp --port 8080

# Allow all origins (CORS)
skill-seekers-mcp --cors

Security

Local Only (stdio)

# Only accessible by local Claude Code
skill-seekers-mcp

HTTP with Auth

# Use reverse proxy with auth
# nginx, traefik, etc.

API Key Protection

# Don't hardcode keys
# Use environment variables
# Or secret management

Troubleshooting

"Server not found"

# Check if running
curl http://localhost:8765/health

# Restart
skill-seekers-mcp --transport http --port 8765

"Tool not available"

# Check version
skill-seekers-mcp --version

# Update
pip install --upgrade skill-seekers[mcp]

"Connection refused"

# Check port
lsof -i :8765

# Use different port
skill-seekers-mcp --port 8766

See Also