docs: complete documentation overhaul with v3.1.0 release notes and zh-CN translations

Documentation restructure:
- New docs/getting-started/ guide (4 files: install, quick-start, first-skill, next-steps)
- New docs/user-guide/ section (6 files: core concepts through troubleshooting)
- New docs/reference/ section (CLI_REFERENCE, CONFIG_FORMAT, ENVIRONMENT_VARIABLES, MCP_REFERENCE)
- New docs/advanced/ section (custom-workflows, mcp-server, multi-source)
- New docs/ARCHITECTURE.md - system architecture overview
- Archived legacy files (QUICKSTART.md, QUICK_REFERENCE.md, docs/guides/USAGE.md) to docs/archive/legacy/

Chinese (zh-CN) translations:
- Full zh-CN mirror of all user-facing docs (getting-started, user-guide, reference, advanced)
- GitHub Actions workflow for translation sync (.github/workflows/translate-docs.yml)
- Translation sync checker script (scripts/check_translation_sync.sh)
- Translation helper script (scripts/translate_doc.py)

Content updates:
- CHANGELOG.md: [Unreleased] → [3.1.0] - 2026-02-22
- README.md: updated with new doc structure links
- AGENTS.md: updated agent documentation
- docs/features/UNIFIED_SCRAPING.md: updated for unified scraper workflow JSON config

Analysis/planning artifacts (kept for reference):
- DOCUMENTATION_OVERHAUL_PLAN.md, DOCUMENTATION_OVERHAUL_SUMMARY.md
- FEATURE_GAP_ANALYSIS.md, IMPLEMENTATION_GAPS_ANALYSIS.md, CREATE_COMMAND_COVERAGE_ANALYSIS.md
- CHINESE_TRANSLATION_IMPLEMENTATION_SUMMARY.md, ISSUE_260_UPDATE.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 01:01:51 +03:00
parent 22bdd4f5f6
commit ba9a8ff8b5
69 changed files with 31304 additions and 246 deletions
--- a/docs/reference/CLI_REFERENCE.md
+++ b/docs/reference/CLI_REFERENCE.md
--- a/docs/reference/CONFIG_FORMAT.md
+++ b/docs/reference/CONFIG_FORMAT.md
@@ -0,0 +1,610 @@
+# Config Format Reference - Skill Seekers
+
+> **Version:** 3.1.0  
+> **Last Updated:** 2026-02-16  
+> **Complete JSON configuration specification**
+
+---
+
+## Table of Contents
+
+- [Overview](#overview)
+- [Single-Source Config](#single-source-config)
+  - [Documentation Source](#documentation-source)
+  - [GitHub Source](#github-source)
+  - [PDF Source](#pdf-source)
+  - [Local Source](#local-source)
+- [Unified (Multi-Source) Config](#unified-multi-source-config)
+- [Common Fields](#common-fields)
+- [Selectors](#selectors)
+- [Categories](#categories)
+- [URL Patterns](#url-patterns)
+- [Examples](#examples)
+
+---
+
+## Overview
+
+Skill Seekers uses JSON configuration files to define scraping targets. There are two types:
+
+| Type | Use Case | File |
+|------|----------|------|
+| **Single-Source** | One source (docs, GitHub, PDF, or local) | `*.json` |
+| **Unified** | Multiple sources combined | `*-unified.json` |
+
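Editorial sketch (not part of the commit): the two config types in the table can be told apart mechanically. This assumes a unified config is recognized by its top-level `sources` array, and that a single-source config with no `type` field describes a documentation site. `detect_config_type` is a hypothetical helper, not a Skill Seekers API.

```python
import json


def detect_config_type(config: dict) -> str:
    """Classify a parsed config as 'unified' or one of the single-source types."""
    if "sources" in config:
        return "unified"
    # Assumption: a single-source config without "type" is a docs config.
    return config.get("type", "docs")


single = json.loads('{"name": "react", "base_url": "https://react.dev/"}')
unified = json.loads(
    '{"name": "react-complete", "sources": [{"type": "github", "repo": "facebook/react"}]}'
)
print(detect_config_type(single), detect_config_type(unified))
```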
+---
+
+## Single-Source Config
+
+### Documentation Source
+
+For scraping documentation websites.
+
+```json
+{
+  "name": "react",
+  "base_url": "https://react.dev/",
+  "description": "React - JavaScript library for building UIs",
+  
+  "start_urls": [
+    "https://react.dev/learn",
+    "https://react.dev/reference/react"
+  ],
+  
+  "selectors": {
+    "main_content": "article",
+    "title": "h1",
+    "code_blocks": "pre code"
+  },
+  
+  "url_patterns": {
+    "include": ["/learn/", "/reference/"],
+    "exclude": ["/blog/", "/community/"]
+  },
+  
+  "categories": {
+    "getting_started": ["learn", "tutorial", "intro"],
+    "api": ["reference", "api", "hooks"]
+  },
+  
+  "rate_limit": 0.5,
+  "max_pages": 300,
+  "merge_mode": "claude-enhanced"
+}
+```
+
+#### Documentation Fields
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `name` | string | Yes | - | Skill name (alphanumeric, dashes, underscores) |
+| `base_url` | string | Yes | - | Base documentation URL |
+| `description` | string | No | "" | Skill description for SKILL.md |
+| `start_urls` | array | No | `[base_url]` | URLs to start crawling from |
+| `selectors` | object | No | see below | CSS selectors for content extraction |
+| `url_patterns` | object | No | `{}` | Include/exclude URL patterns |
+| `categories` | object | No | `{}` | Content categorization rules |
+| `rate_limit` | number | No | 0.5 | Seconds between requests |
+| `max_pages` | number | No | 500 | Maximum pages to scrape |
+| `merge_mode` | string | No | "claude-enhanced" | Merge strategy |
+| `extract_api` | boolean | No | false | Extract API references |
+| `llms_txt_url` | string | No | auto | URL of the site's llms.txt file (auto-detected if omitted) |
+
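Editorial sketch (not part of the commit): one way to read the table above is as a set of defaults merged under the user's config. The helper below is hypothetical and covers only the fields whose defaults the table states concretely. The defaults for `selectors` and `llms_txt_url` are left to the tool.

```python
def apply_doc_defaults(config: dict) -> dict:
    """Return a copy of a documentation config with the table's defaults filled in."""
    for field in ("name", "base_url"):
        if not config.get(field):
            raise ValueError(f"missing required field: {field}")
    defaults = {
        "description": "",
        "start_urls": [config["base_url"]],  # default is [base_url]
        "url_patterns": {},
        "categories": {},
        "rate_limit": 0.5,   # seconds between requests
        "max_pages": 500,
        "merge_mode": "claude-enhanced",
        "extract_api": False,
    }
    # Values supplied by the user win over the defaults.
    return {**defaults, **config}


resolved = apply_doc_defaults(
    {"name": "react", "base_url": "https://react.dev/", "max_pages": 300}
)
print(resolved["start_urls"], resolved["max_pages"], resolved["rate_limit"])
```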
+---
+
+### GitHub Source
+
+For analyzing GitHub repositories.
+
+```json
+{
+  "name": "react-github",
+  "type": "github",
+  "repo": "facebook/react",
+  "description": "React GitHub repository analysis",
+  
+  "enable_codebase_analysis": true,
+  "code_analysis_depth": "deep",
+  
+  "fetch_issues": true,
+  "max_issues": 100,
+  "issue_labels": ["bug", "enhancement"],
+  
+  "fetch_releases": true,
+  "max_releases": 20,
+  
+  "fetch_changelog": true,
+  "analyze_commit_history": true,
+  
+  "file_patterns": ["*.js", "*.ts", "*.tsx"],
+  "exclude_patterns": ["*.test.js", "node_modules/**"],
+  
+  "rate_limit": 1.0
+}
+```
+
+#### GitHub Fields
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `name` | string | Yes | - | Skill name |
+| `type` | string | Yes | - | Must be `"github"` |
+| `repo` | string | Yes | - | Repository in `owner/repo` format |
+| `description` | string | No | "" | Skill description |
+| `enable_codebase_analysis` | boolean | No | true | Analyze source code |
+| `code_analysis_depth` | string | No | "standard" | `surface`, `standard`, `deep` |
+| `fetch_issues` | boolean | No | true | Fetch GitHub issues |
+| `max_issues` | number | No | 100 | Maximum issues to fetch |
+| `issue_labels` | array | No | [] | Filter by labels |
+| `fetch_releases` | boolean | No | true | Fetch releases |
+| `max_releases` | number | No | 20 | Maximum releases |
+| `fetch_changelog` | boolean | No | true | Extract CHANGELOG |
+| `analyze_commit_history` | boolean | No | false | Analyze commits |
+| `file_patterns` | array | No | [] | Include file patterns |
+| `exclude_patterns` | array | No | [] | Exclude file patterns |
+
+---
+
+### PDF Source
+
+For extracting content from PDF files.
+
+```json
+{
+  "name": "product-manual",
+  "type": "pdf",
+  "pdf_path": "docs/manual.pdf",
+  "description": "Product documentation manual",
+  
+  "enable_ocr": false,
+  "password": "",
+  
+  "extract_images": true,
+  "image_output_dir": "output/images/",
+  
+  "extract_tables": true,
+  "table_format": "markdown",
+  
+  "page_range": [1, 100],
+  "split_by_chapters": true,
+  
+  "chunk_size": 1000,
+  "chunk_overlap": 100
+}
+```
+
+#### PDF Fields
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `name` | string | Yes | - | Skill name |
+| `type` | string | Yes | - | Must be `"pdf"` |
+| `pdf_path` | string | Yes | - | Path to PDF file |
+| `description` | string | No | "" | Skill description |
+| `enable_ocr` | boolean | No | false | OCR for scanned PDFs |
+| `password` | string | No | "" | PDF password if encrypted |
+| `extract_images` | boolean | No | false | Extract embedded images |
+| `image_output_dir` | string | No | auto | Directory for images |
+| `extract_tables` | boolean | No | false | Extract tables |
+| `table_format` | string | No | "markdown" | `markdown`, `json`, `csv` |
+| `page_range` | array | No | all | `[start, end]` page range |
+| `split_by_chapters` | boolean | No | false | Split by detected chapters |
+| `chunk_size` | number | No | 1000 | Characters per chunk |
+| `chunk_overlap` | number | No | 100 | Overlap between chunks |
+
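Editorial sketch (not part of the commit): `chunk_size` and `chunk_overlap` suggest a character-based sliding window. The function below illustrates that reading. It assumes the overlap means each chunk repeats the last `chunk_overlap` characters of the previous one, and it is not the tool's actual chunker.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


chunks = chunk_text("x" * 2500)
print([len(chunk) for chunk in chunks])  # [1000, 1000, 700]
```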
+---
+
+### Local Source
+
+For analyzing local codebases.
+
+```json
+{
+  "name": "my-project",
+  "type": "local",
+  "directory": "./my-project",
+  "description": "Local project analysis",
+  
+  "languages": ["Python", "JavaScript"],
+  "file_patterns": ["*.py", "*.js"],
+  "exclude_patterns": ["*.pyc", "node_modules/**", ".git/**"],
+  
+  "analysis_depth": "comprehensive",
+  
+  "extract_api": true,
+  "extract_patterns": true,
+  "extract_test_examples": true,
+  "extract_how_to_guides": true,
+  "extract_config_patterns": true,
+  
+  "include_comments": true,
+  "include_docstrings": true,
+  "include_readme": true
+}
+```
+
+#### Local Fields
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `name` | string | Yes | - | Skill name |
+| `type` | string | Yes | - | Must be `"local"` |
+| `directory` | string | Yes | - | Path to directory |
+| `description` | string | No | "" | Skill description |
+| `languages` | array | No | auto | Languages to analyze |
+| `file_patterns` | array | No | all | Include patterns |
+| `exclude_patterns` | array | No | common | Exclude patterns |
+| `analysis_depth` | string | No | "standard" | `quick`, `standard`, `comprehensive` |
+| `extract_api` | boolean | No | true | Extract API documentation |
+| `extract_patterns` | boolean | No | true | Detect patterns |
+| `extract_test_examples` | boolean | No | true | Extract test examples |
+| `extract_how_to_guides` | boolean | No | true | Generate guides |
+| `extract_config_patterns` | boolean | No | true | Extract config patterns |
+| `include_comments` | boolean | No | true | Include code comments |
+| `include_docstrings` | boolean | No | true | Include docstrings |
+| `include_readme` | boolean | No | true | Include README |
+
+---
+
+## Unified (Multi-Source) Config
+
+Combine multiple sources into one skill with conflict detection.
+
+```json
+{
+  "name": "react-complete",
+  "description": "React docs + GitHub + examples",
+  "merge_mode": "claude-enhanced",
+  
+  "sources": [
+    {
+      "type": "docs",
+      "name": "react-docs",
+      "base_url": "https://react.dev/",
+      "max_pages": 200,
+      "categories": {
+        "getting_started": ["learn"],
+        "api": ["reference"]
+      }
+    },
+    {
+      "type": "github",
+      "name": "react-github",
+      "repo": "facebook/react",
+      "fetch_issues": true,
+      "max_issues": 50
+    },
+    {
+      "type": "pdf",
+      "name": "react-cheatsheet",
+      "pdf_path": "docs/react-cheatsheet.pdf"
+    },
+    {
+      "type": "local",
+      "name": "react-examples",
+      "directory": "./react-examples"
+    }
+  ],
+  
+  "conflict_detection": {
+    "enabled": true,
+    "rules": [
+      {
+        "field": "api_signature",
+        "action": "flag_mismatch"
+      }
+    ]
+  },
+  
+  "output_structure": {
+    "group_by_source": false,
+    "cross_reference": true
+  }
+}
+```
+
+#### Unified Fields
+
+| Field | Type | Required | Default | Description |
+|-------|------|----------|---------|-------------|
+| `name` | string | Yes | - | Combined skill name |
+| `description` | string | No | "" | Skill description |
+| `merge_mode` | string | No | "claude-enhanced" | `rule-based`, `claude-enhanced` |
+| `sources` | array | Yes | - | List of source configs |
+| `conflict_detection` | object | No | `{}` | Conflict detection settings |
+| `output_structure` | object | No | `{}` | Output organization |
+| `workflows` | array | No | `[]` | Workflow presets to apply |
+| `workflow_stages` | array | No | `[]` | Inline enhancement stages |
+| `workflow_vars` | object | No | `{}` | Workflow variable overrides |
+| `workflow_dry_run` | boolean | No | `false` | Preview workflows without executing |
+
+#### Workflow Configuration (Unified)
+
+Unified configs support defining enhancement workflows at the top level:
+
+```json
+{
+  "name": "react-complete",
+  "description": "React docs + GitHub with security enhancement",
+  "merge_mode": "claude-enhanced",
+  
+  "workflows": ["security-focus", "api-documentation"],
+  "workflow_stages": [
+    {
+      "name": "cleanup",
+      "prompt": "Remove boilerplate sections and standardize formatting"
+    }
+  ],
+  "workflow_vars": {
+    "focus_area": "performance",
+    "detail_level": "comprehensive"
+  },
+  
+  "sources": [
+    {"type": "docs", "base_url": "https://react.dev/"},
+    {"type": "github", "repo": "facebook/react"}
+  ]
+}
+```
+
+**Workflow Fields:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `workflows` | array | List of workflow preset names to apply |
+| `workflow_stages` | array | Inline stages with `name` and `prompt` |
+| `workflow_vars` | object | Key-value pairs for workflow variables |
+| `workflow_dry_run` | boolean | Preview workflows without executing |
+
+**Note:** When a workflow option is set both in the config and on the command line, the CLI flag takes precedence.
+
+#### Source Types in Unified Config
+
+Each source in the `sources` array can be:
+
+| Type | Required Fields |
+|------|-----------------|
+| `docs` | `base_url` |
+| `github` | `repo` |
+| `pdf` | `pdf_path` |
+| `local` | `directory` |
+
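Editorial sketch (not part of the commit): the required-field table translates directly into a pre-flight check over the `sources` array. `check_sources` is a hypothetical helper. The real validator may enforce more than this.

```python
# Required field per source type, taken from the table above.
REQUIRED_FIELD = {
    "docs": "base_url",
    "github": "repo",
    "pdf": "pdf_path",
    "local": "directory",
}


def check_sources(sources: list[dict]) -> list[str]:
    """Return a human-readable problem for every malformed source entry."""
    problems = []
    for index, source in enumerate(sources):
        kind = source.get("type")
        if kind not in REQUIRED_FIELD:
            problems.append(f"sources[{index}]: unknown type {kind!r}")
        elif REQUIRED_FIELD[kind] not in source:
            problems.append(f"sources[{index}]: {kind} source needs {REQUIRED_FIELD[kind]!r}")
    return problems


print(check_sources([
    {"type": "docs", "base_url": "https://react.dev/"},
    {"type": "github"},
]))
```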
+---
+
+## Common Fields
+
+Fields available in all config types:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `name` | string | Skill identifier (letters, numbers, dashes, underscores) |
+| `description` | string | Human-readable description |
+| `rate_limit` | number | Delay between requests in seconds |
+| `output_dir` | string | Custom output directory |
+| `skip_scrape` | boolean | Use existing data |
+| `enhance_level` | number | 0=off, 1=SKILL.md, 2=+config, 3=full |
+
+---
+
+## Selectors
+
+CSS selectors for content extraction from HTML:
+
+```json
+{
+  "selectors": {
+    "main_content": "article",
+    "title": "h1",
+    "code_blocks": "pre code",
+    "navigation": "nav.sidebar",
+    "breadcrumbs": "nav[aria-label='breadcrumb']",
+    "next_page": "a[rel='next']",
+    "prev_page": "a[rel='prev']"
+  }
+}
+```
+
+### Default Selectors
+
+If not specified, these defaults are used:
+
+| Element | Default Selector |
+|---------|-----------------|
+| `main_content` | `article, main, .content, #content, [role='main']` |
+| `title` | `h1, .page-title, title` |
+| `code_blocks` | `pre code, code[class*="language-"]` |
+| `navigation` | `nav, .sidebar, .toc` |
+
+---
+
+## Categories
+
+Map URL keywords to content categories:
+
+```json
+{
+  "categories": {
+    "getting_started": [
+      "intro", "tutorial", "quickstart", 
+      "installation", "getting-started"
+    ],
+    "core_concepts": [
+      "concept", "fundamental", "architecture",
+      "principle", "overview"
+    ],
+    "api_reference": [
+      "reference", "api", "method", "function",
+      "class", "interface", "type"
+    ],
+    "guides": [
+      "guide", "how-to", "example", "recipe",
+      "pattern", "best-practice"
+    ],
+    "advanced": [
+      "advanced", "expert", "performance",
+      "optimization", "internals"
+    ]
+  }
+}
+```
+
+Categories appear as sections in the generated SKILL.md.
+
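Editorial sketch (not part of the commit): the reference does not spell out how keywords are matched. The function below assumes a case-insensitive substring match against the page URL, with the first matching category winning. The `"other"` fallback is a hypothetical name.

```python
def categorize(url: str, categories: dict[str, list[str]], default: str = "other") -> str:
    """Return the first category that has a keyword occurring in the URL."""
    lowered = url.lower()
    for category, keywords in categories.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return default


categories = {
    "getting_started": ["intro", "tutorial", "quickstart"],
    "api_reference": ["reference", "api"],
}
print(categorize("https://example.com/docs/tutorial/setup", categories))
```

Because dictionaries keep insertion order, the order of categories in the config decides ties under this assumption.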
+---
+
+## URL Patterns
+
+Control which URLs are included or excluded:
+
+```json
+{
+  "url_patterns": {
+    "include": [
+      "/docs/",
+      "/guide/",
+      "/api/",
+      "/reference/"
+    ],
+    "exclude": [
+      "/blog/",
+      "/news/",
+      "/community/",
+      "/search",
+      "?print=1",
+      "/_static/",
+      "/_images/"
+    ]
+  }
+}
+```
+
+### Pattern Rules
+
+- Patterns are matched against the URL path
+- Use `*` for wildcards: `/api/v*/`
+- Use `**` for recursive: `/docs/**/*.html`
+- Exclude takes precedence over include
+
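Editorial sketch (not part of the commit): the rules above can be approximated with the standard library. This assumes a pattern matches when it occurs anywhere in the URL path and uses `fnmatch` for wildcards. Note that `fnmatch` lets `*` cross `/` and does not distinguish `*` from `**`, so it is looser than the rules describe. Because only the path is inspected, query-string patterns are out of scope here.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def matches(pattern: str, path: str) -> bool:
    """Assumed semantics: the pattern may occur anywhere in the path."""
    return fnmatchcase(path, f"*{pattern}*")


def should_scrape(url: str, url_patterns: dict) -> bool:
    path = urlparse(url).path
    include = url_patterns.get("include", [])
    exclude = url_patterns.get("exclude", [])
    # Exclude takes precedence over include.
    if any(matches(pattern, path) for pattern in exclude):
        return False
    # Assumption: an empty include list means "include everything".
    return not include or any(matches(pattern, path) for pattern in include)


patterns = {"include": ["/docs/", "/api/v*/"], "exclude": ["/docs/internal/"]}
print(should_scrape("https://example.com/docs/internal/notes", patterns))
```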
+---
+
+## Examples
+
+### React Documentation
+
+```json
+{
+  "name": "react",
+  "base_url": "https://react.dev/",
+  "description": "React - JavaScript library for building UIs",
+  "start_urls": [
+    "https://react.dev/learn",
+    "https://react.dev/reference/react",
+    "https://react.dev/reference/react-dom"
+  ],
+  "selectors": {
+    "main_content": "article",
+    "title": "h1",
+    "code_blocks": "pre code"
+  },
+  "url_patterns": {
+    "include": ["/learn/", "/reference/", "/blog/"],
+    "exclude": ["/community/", "/search"]
+  },
+  "categories": {
+    "getting_started": ["learn", "tutorial"],
+    "api": ["reference", "api"],
+    "blog": ["blog"]
+  },
+  "rate_limit": 0.5,
+  "max_pages": 300
+}
+```
+
+### Django GitHub
+
+```json
+{
+  "name": "django-github",
+  "type": "github",
+  "repo": "django/django",
+  "description": "Django web framework source code",
+  "enable_codebase_analysis": true,
+  "code_analysis_depth": "deep",
+  "fetch_issues": true,
+  "max_issues": 100,
+  "fetch_releases": true,
+  "file_patterns": ["*.py"],
+  "exclude_patterns": ["tests/**", "docs/**"]
+}
+```
+
+### Unified Multi-Source
+
+```json
+{
+  "name": "godot-complete",
+  "description": "Godot Engine - docs, source, and manual",
+  "merge_mode": "claude-enhanced",
+  "sources": [
+    {
+      "type": "docs",
+      "name": "godot-docs",
+      "base_url": "https://docs.godotengine.org/en/stable/",
+      "max_pages": 500
+    },
+    {
+      "type": "github",
+      "name": "godot-source",
+      "repo": "godotengine/godot",
+      "fetch_issues": false
+    },
+    {
+      "type": "pdf",
+      "name": "godot-manual",
+      "pdf_path": "docs/godot-manual.pdf"
+    }
+  ]
+}
+```
+
+### Local Project
+
+```json
+{
+  "name": "my-api",
+  "type": "local",
+  "directory": "./my-api-project",
+  "description": "My REST API implementation",
+  "languages": ["Python"],
+  "file_patterns": ["*.py"],
+  "exclude_patterns": ["tests/**", "migrations/**"],
+  "analysis_depth": "comprehensive",
+  "extract_api": true,
+  "extract_test_examples": true
+}
+```
+
+---
+
+## Validation
+
+Validate your config before scraping:
+
+```bash
+# Using CLI
+skill-seekers scrape --config my-config.json --dry-run
+
+# Using MCP tool (invoked from an MCP client, not from the shell)
+validate_config({"config": "my-config.json"})
+```
+
+---
+
+## See Also
+
+- [CLI Reference](CLI_REFERENCE.md) - Command reference
+- [Environment Variables](ENVIRONMENT_VARIABLES.md) - Configuration environment
+
+---
+
+*For more examples, see `configs/` directory in the repository*
--- a/docs/reference/ENVIRONMENT_VARIABLES.md
+++ b/docs/reference/ENVIRONMENT_VARIABLES.md
@@ -0,0 +1,738 @@
+# Environment Variables Reference - Skill Seekers
+
+> **Version:** 3.1.0  
+> **Last Updated:** 2026-02-16  
+> **Complete environment variable reference**
+
+---
+
+## Table of Contents
+
+- [Overview](#overview)
+- [API Keys](#api-keys)
+- [Platform Configuration](#platform-configuration)
+- [Paths and Directories](#paths-and-directories)
+- [Scraping Behavior](#scraping-behavior)
+- [Enhancement Settings](#enhancement-settings)
+- [GitHub Configuration](#github-configuration)
+- [Vector Database Settings](#vector-database-settings)
+- [Debug and Development](#debug-and-development)
+- [MCP Server Settings](#mcp-server-settings)
+- [Examples](#examples)
+
+---
+
+## Overview
+
+Skill Seekers uses environment variables for:
+- API authentication (Claude, Gemini, OpenAI, GitHub)
+- Configuration paths
+- Output directories
+- Behavior customization
+- Debug settings
+
+Variables are read at runtime and override default settings.
+
+---
+
+## API Keys
+
+### ANTHROPIC_API_KEY
+
+**Purpose:** Claude AI API access for enhancement and upload.
+
+**Format:** `sk-ant-api03-...`
+
+**Used by:**
+- `skill-seekers enhance` (API mode)
+- `skill-seekers upload` (Claude target)
+- AI enhancement features
+
+**Example:**
+```bash
+export ANTHROPIC_API_KEY=sk-ant-api03-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+```
+
+**Alternative:** Use `--api-key` flag per command.
+
+---
+
+### GOOGLE_API_KEY
+
+**Purpose:** Google Gemini API access for upload.
+
+**Format:** `AIza...`
+
+**Used by:**
+- `skill-seekers upload` (Gemini target)
+
+**Example:**
+```bash
+export GOOGLE_API_KEY=AIzaSyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+```
+
+---
+
+### OPENAI_API_KEY
+
+**Purpose:** OpenAI API access for upload and embeddings.
+
+**Format:** `sk-...`
+
+**Used by:**
+- `skill-seekers upload` (OpenAI target)
+- Embedding generation for vector DBs
+
+**Example:**
+```bash
+export OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+```
+
+---
+
+### GITHUB_TOKEN
+
+**Purpose:** GitHub API authentication for higher rate limits.
+
+**Format:** `ghp_...` (classic personal access token) or `github_pat_...` (fine-grained personal access token)
+
+**Used by:**
+- `skill-seekers github`
+- `skill-seekers unified` (GitHub sources)
+- `skill-seekers analyze` (GitHub repos)
+
+**Benefits:**
+- 5,000 requests/hour, versus 60 for unauthenticated requests
+- Access to private repositories
+- Higher GraphQL API limits
+
+**Example:**
+```bash
+export GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+```
+
+**Create token:** https://github.com/settings/tokens
+
+---
+
+## Platform Configuration
+
+### ANTHROPIC_BASE_URL
+
+**Purpose:** Custom Claude API endpoint.
+
+**Default:** `https://api.anthropic.com`
+
+**Use case:** Proxy servers, enterprise deployments, regional endpoints.
+
+**Example:**
+```bash
+export ANTHROPIC_BASE_URL=https://custom-api.example.com
+```
+
+---
+
+## Paths and Directories
+
+### SKILL_SEEKERS_HOME
+
+**Purpose:** Base directory for Skill Seekers data.
+
+**Default:**
+- Linux/macOS: `~/.config/skill-seekers/`
+- Windows: `%APPDATA%\skill-seekers\`
+
+**Used for:**
+- Configuration files
+- Workflow presets
+- Cache data
+- Checkpoints
+
+**Example:**
+```bash
+export SKILL_SEEKERS_HOME=/opt/skill-seekers
+```
+
+---
+
+### SKILL_SEEKERS_OUTPUT
+
+**Purpose:** Default output directory for skills.
+
+**Default:** `./output/`
+
+**Used by:**
+- All scraping commands
+- Package output
+- Skill generation
+
+**Example:**
+```bash
+export SKILL_SEEKERS_OUTPUT=/var/skills/output
+```
+
+---
+
+### SKILL_SEEKERS_CONFIG_DIR
+
+**Purpose:** Directory containing preset configs.
+
+**Default:** `configs/` (relative to working directory)
+
+**Example:**
+```bash
+export SKILL_SEEKERS_CONFIG_DIR=/etc/skill-seekers/configs
+```
+
+---
+
+## Scraping Behavior
+
+### SKILL_SEEKERS_RATE_LIMIT
+
+**Purpose:** Default rate limit for HTTP requests.
+
+**Default:** `0.5` (seconds)
+
+**Unit:** Seconds between requests
+
+**Example:**
+```bash
+# More aggressive (faster)
+export SKILL_SEEKERS_RATE_LIMIT=0.2
+
+# More conservative (slower)
+export SKILL_SEEKERS_RATE_LIMIT=1.0
+```
+
+**Override:** Use `--rate-limit` flag per command.
+
+---
+
+### SKILL_SEEKERS_MAX_PAGES
+
+**Purpose:** Default maximum pages to scrape.
+
+**Default:** `500`
+
+**Example:**
+```bash
+export SKILL_SEEKERS_MAX_PAGES=1000
+```
+
+**Override:** Use `--max-pages` flag or config file.
+
+---
+
+### SKILL_SEEKERS_WORKERS
+
+**Purpose:** Default number of parallel workers.
+
+**Default:** `1`
+
+**Maximum:** `10`
+
+**Example:**
+```bash
+export SKILL_SEEKERS_WORKERS=4
+```
+
+**Override:** Use `--workers` flag.
+
+---
+
+### SKILL_SEEKERS_TIMEOUT
+
+**Purpose:** HTTP request timeout.
+
+**Default:** `30` (seconds)
+
+**Example:**
+```bash
+# For slow servers
+export SKILL_SEEKERS_TIMEOUT=60
+```
+
+---
+
+### SKILL_SEEKERS_USER_AGENT
+
+**Purpose:** Custom User-Agent header.
+
+**Default:** `Skill-Seekers/3.1.0`
+
+**Example:**
+```bash
+export SKILL_SEEKERS_USER_AGENT="MyBot/1.0 (contact@example.com)"
+```
+
+---
+
+## Enhancement Settings
+
+### SKILL_SEEKER_AGENT
+
+**Purpose:** Default local coding agent for enhancement.
+
+**Default:** `claude`
+
+**Options:** `claude`, `cursor`, `windsurf`, `cline`, `continue`
+
+**Used by:**
+- `skill-seekers enhance`
+
+**Example:**
+```bash
+export SKILL_SEEKER_AGENT=cursor
+```
+
+---
+
+### SKILL_SEEKERS_ENHANCE_TIMEOUT
+
+**Purpose:** Timeout for AI enhancement operations.
+
+**Default:** `600` (seconds = 10 minutes)
+
+**Example:**
+```bash
+# For large skills
+export SKILL_SEEKERS_ENHANCE_TIMEOUT=1200
+```
+
+**Override:** Use `--timeout` flag.
+
+---
+
+### ANTHROPIC_MODEL
+
+**Purpose:** Claude model for API enhancement.
+
+**Default:** `claude-3-5-sonnet-20241022`
+
+**Options:**
+- `claude-3-5-sonnet-20241022` (recommended)
+- `claude-3-opus-20240229` (highest quality, more expensive)
+- `claude-3-haiku-20240307` (fastest, cheapest)
+
+**Example:**
+```bash
+export ANTHROPIC_MODEL=claude-3-opus-20240229
+```
+
+---
+
+## GitHub Configuration
+
+### GITHUB_API_URL
+
+**Purpose:** Custom GitHub API endpoint.
+
+**Default:** `https://api.github.com`
+
+**Use case:** GitHub Enterprise Server.
+
+**Example:**
+```bash
+export GITHUB_API_URL=https://github.company.com/api/v3
+```
+
+---
+
+### GITHUB_ENTERPRISE_TOKEN
+
+**Purpose:** Separate token for GitHub Enterprise.
+
+**Use case:** Different tokens for github.com vs enterprise.
+
+**Example:**
+```bash
+export GITHUB_TOKEN=ghp_...           # github.com
+export GITHUB_ENTERPRISE_TOKEN=...   # enterprise
+```
+
+---
+
+## Vector Database Settings
+
+### CHROMA_URL
+
+**Purpose:** ChromaDB server URL.
+
+**Default:** `http://localhost:8000`
+
+**Used by:**
+- `skill-seekers upload --target chroma`
+- `export_to_chroma` MCP tool
+
+**Example:**
+```bash
+export CHROMA_URL=http://chroma.example.com:8000
+```
+
+---
+
+### CHROMA_PERSIST_DIRECTORY
+
+**Purpose:** Local directory for ChromaDB persistence.
+
+**Default:** `./chroma_db/`
+
+**Example:**
+```bash
+export CHROMA_PERSIST_DIRECTORY=/var/lib/chroma
+```
+
+---
+
+### WEAVIATE_URL
+
+**Purpose:** Weaviate server URL.
+
+**Default:** `http://localhost:8080`
+
+**Used by:**
+- `skill-seekers upload --target weaviate`
+- `export_to_weaviate` MCP tool
+
+**Example:**
+```bash
+export WEAVIATE_URL=https://weaviate.example.com
+```
+
+---
+
+### WEAVIATE_API_KEY
+
+**Purpose:** Weaviate API key for authentication.
+
+**Used by:**
+- Weaviate Cloud
+- Authenticated Weaviate instances
+
+**Example:**
+```bash
+export WEAVIATE_API_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+```
+
+---
+
+### QDRANT_URL
+
+**Purpose:** Qdrant server URL.
+
+**Default:** `http://localhost:6333`
+
+**Example:**
+```bash
+export QDRANT_URL=http://qdrant.example.com:6333
+```
+
+---
+
+### QDRANT_API_KEY
+
+**Purpose:** Qdrant API key for authentication.
+
+**Example:**
+```bash
+export QDRANT_API_KEY=xxxxxxxxxxxxxxxx
+```
+
+---
+
+## Debug and Development
+
+### SKILL_SEEKERS_DEBUG
+
+**Purpose:** Enable debug logging.
+
+**Values:** `1`, `true`, `yes`
+
+**Equivalent to:** `--verbose` flag
+
+**Example:**
+```bash
+export SKILL_SEEKERS_DEBUG=1
+```
+
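Editorial sketch (not part of the commit): boolean variables such as `SKILL_SEEKERS_DEBUG` and `SKILL_SEEKERS_NO_CACHE` accept `1`, `true`, or `yes`. The helper below assumes the comparison is case-insensitive and that any other value counts as off. Both are assumptions, and `env_flag` is a hypothetical name.

```python
import os

TRUTHY = {"1", "true", "yes"}


def env_flag(name: str, env=os.environ) -> bool:
    """Return True when the variable is set to one of the documented truthy values."""
    return env.get(name, "").strip().lower() in TRUTHY


print(env_flag("SKILL_SEEKERS_DEBUG", {"SKILL_SEEKERS_DEBUG": "yes"}))
```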
+---
+
+### SKILL_SEEKERS_LOG_LEVEL
+
+**Purpose:** Set logging level.
+
+**Default:** `INFO`
+
+**Options:** `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`
+
+**Example:**
+```bash
+export SKILL_SEEKERS_LOG_LEVEL=DEBUG
+```
+
+---
+
+### SKILL_SEEKERS_LOG_FILE
+
+**Purpose:** Log to file instead of stdout.
+
+**Example:**
+```bash
+export SKILL_SEEKERS_LOG_FILE=/var/log/skill-seekers.log
+```
+
+---
+
+### SKILL_SEEKERS_CACHE_DIR
+
+**Purpose:** Custom cache directory.
+
+**Default:** `~/.cache/skill-seekers/`
+
+**Example:**
+```bash
+export SKILL_SEEKERS_CACHE_DIR=/tmp/skill-seekers-cache
+```
+
+---
+
+### SKILL_SEEKERS_NO_CACHE
+
+**Purpose:** Disable caching.
+
+**Values:** `1`, `true`, `yes`
+
+**Example:**
+```bash
+export SKILL_SEEKERS_NO_CACHE=1
+```
+
+---
+
+## MCP Server Settings
+
+### MCP_TRANSPORT
+
+**Purpose:** Default MCP transport mode.
+
+**Default:** `stdio`
+
+**Options:** `stdio`, `http`
+
+**Example:**
+```bash
+export MCP_TRANSPORT=http
+```
+
+**Override:** Use `--transport` flag.
+
+---
+
+### MCP_PORT
+
+**Purpose:** Default MCP HTTP port.
+
+**Default:** `8765`
+
+**Example:**
+```bash
+export MCP_PORT=8080
+```
+
+**Override:** Use `--port` flag.
+
+---
+
+### MCP_HOST
+
+**Purpose:** Default MCP HTTP host.
+
+**Default:** `127.0.0.1`
+
+**Example:**
+```bash
+# Listen on all interfaces (makes the server reachable from the network)
+export MCP_HOST=0.0.0.0
+```
+
+**Override:** Use `--host` flag.
+
+---
+
+## Examples
+
+### Development Environment
+
+```bash
+# Debug mode
+export SKILL_SEEKERS_DEBUG=1
+export SKILL_SEEKERS_LOG_LEVEL=DEBUG
+
+# Custom paths
+export SKILL_SEEKERS_HOME=./.skill-seekers
+export SKILL_SEEKERS_OUTPUT=./output
+
+# Faster scraping for testing
+export SKILL_SEEKERS_RATE_LIMIT=0.1
+export SKILL_SEEKERS_MAX_PAGES=50
+```
+
+### Production Environment
+
+```bash
+# API keys
+export ANTHROPIC_API_KEY=sk-ant-...
+export GITHUB_TOKEN=ghp_...
+
+# Custom output directory
+export SKILL_SEEKERS_OUTPUT=/var/www/skills
+
+# Conservative scraping
+export SKILL_SEEKERS_RATE_LIMIT=1.0
+export SKILL_SEEKERS_WORKERS=2
+
+# Logging
+export SKILL_SEEKERS_LOG_FILE=/var/log/skill-seekers.log
+export SKILL_SEEKERS_LOG_LEVEL=WARNING
+```
+
+### CI/CD Environment
+
+```bash
+# Non-interactive
+export SKILL_SEEKERS_LOG_LEVEL=ERROR
+
+# API keys from secrets
+export ANTHROPIC_API_KEY="${ANTHROPIC_API_KEY_SECRET}"
+export GITHUB_TOKEN="${GITHUB_TOKEN_SECRET}"
+
+# Fresh runs (no cache)
+export SKILL_SEEKERS_NO_CACHE=1
+```
+
+### Multi-Platform Setup
+
+```bash
+# All API keys
+export ANTHROPIC_API_KEY=sk-ant-...
+export GOOGLE_API_KEY=AIza...
+export OPENAI_API_KEY=sk-...
+export GITHUB_TOKEN=ghp_...
+
+# Vector databases
+export CHROMA_URL=http://localhost:8000
+export WEAVIATE_URL=http://localhost:8080
+export WEAVIATE_API_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+```
+
+---
+
+## Configuration File
+
+Environment variables can also be set in a `.env` file:
+
+```bash
+# .env file
+ANTHROPIC_API_KEY=sk-ant-...
+GITHUB_TOKEN=ghp_...
+SKILL_SEEKERS_OUTPUT=./output
+SKILL_SEEKERS_RATE_LIMIT=0.5
+```
+
+Load with:
+```bash
+# Automatically loaded if python-dotenv is installed
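+# Or manually (set -a exports every variable the sourced file defines;
+# unlike `export $(cat .env | xargs)`, this tolerates comment lines):
+set -a; . ./.env; set +a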
+```
+
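Editorial sketch (not part of the commit): for readers without python-dotenv, the `.env` format shown above is simple enough to parse with the standard library. This minimal parser assumes one `KEY=value` pair per line, `#` comment lines, and optional surrounding quotes. It does not handle multi-line values or variable interpolation.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = """# .env file
GITHUB_TOKEN=ghp_...
SKILL_SEEKERS_OUTPUT=./output
SKILL_SEEKERS_RATE_LIMIT=0.5
"""
print(parse_env_file(sample))
```

The resulting dictionary can be merged into `os.environ`, for example with `os.environ.setdefault`, so that variables already set in the shell keep their values.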
+---
+
+## Priority Order
+
+Settings are applied in this order (later overrides earlier):
+
+1. Default values
+2. Environment variables
+3. Configuration file
+4. Command-line flags
+
+Example:
+```bash
+# Default: rate_limit = 0.5
+export SKILL_SEEKERS_RATE_LIMIT=1.0  # Env var overrides default
+# Config file: rate_limit = 0.2      # Config overrides env
+skill-seekers scrape --rate-limit 2.0  # Flag overrides all
+```
+
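Editorial sketch (not part of the commit): the priority order can be expressed as successive overrides. `resolve_rate_limit` is a hypothetical function that mirrors the four layers for the `rate_limit` setting. The environment is passed in as a mapping so the logic is easy to test.

```python
def resolve_rate_limit(env: dict, config: dict, cli_value=None, default: float = 0.5) -> float:
    """Apply default < environment variable < config file < command-line flag."""
    value = default
    if "SKILL_SEEKERS_RATE_LIMIT" in env:
        value = float(env["SKILL_SEEKERS_RATE_LIMIT"])
    if "rate_limit" in config:
        value = float(config["rate_limit"])
    if cli_value is not None:
        value = float(cli_value)
    return value


env = {"SKILL_SEEKERS_RATE_LIMIT": "1.0"}
print(resolve_rate_limit(env, {"rate_limit": 0.2}))        # 0.2
print(resolve_rate_limit(env, {"rate_limit": 0.2}, 2.0))   # 2.0
```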
+---
+
+## Security Best Practices
+
+### Never commit API keys
+
+```bash
+# Add to .gitignore
+echo ".env" >> .gitignore
+echo "*.key" >> .gitignore
+```
+
+### Use secret management
+
+```bash
+# macOS Keychain
+export ANTHROPIC_API_KEY=$(security find-generic-password -s "anthropic-api" -w)
+
+# Linux Secret Service (with secret-tool)
+export ANTHROPIC_API_KEY=$(secret-tool lookup service anthropic)
+
+# 1Password CLI
+export ANTHROPIC_API_KEY=$(op read "op://vault/anthropic/credential")
+```
+
+### File permissions
+
+```bash
+# Restrict .env file
+chmod 600 .env
+```
+
+---
+
+## Troubleshooting
+
+### Variable not recognized
+
+```bash
+# Check if set
+echo $ANTHROPIC_API_KEY
+
+# Check in Python
+python -c "import os; print(os.getenv('ANTHROPIC_API_KEY'))"
+```
+
+### Priority issues
+
+```bash
+# See effective configuration
+skill-seekers config --show
+```
+
+### Path expansion
+
+```bash
+# Use an absolute path or $HOME instead of a tilde
+export SKILL_SEEKERS_HOME="$HOME/.skill-seekers"
+# NOT: ~/.skill-seekers (a tilde is not expanded inside quotes or in every shell)
+```
+
+---
+
+## See Also
+
+- [CLI Reference](CLI_REFERENCE.md) - Command reference
+- [Config Format](CONFIG_FORMAT.md) - JSON configuration
+
+---
+
+*For platform-specific setup, see [Installation Guide](../getting-started/01-installation.md)*
--- a/docs/reference/MCP_REFERENCE.md
+++ b/docs/reference/MCP_REFERENCE.md