docs: complete documentation overhaul with v3.1.0 release notes and zh-CN translations

Documentation restructure:
- New docs/getting-started/ guide (4 files: install, quick-start, first-skill, next-steps)
- New docs/user-guide/ section (6 files: core concepts through troubleshooting)
- New docs/reference/ section (CLI_REFERENCE, CONFIG_FORMAT, ENVIRONMENT_VARIABLES, MCP_REFERENCE)
- New docs/advanced/ section (custom-workflows, mcp-server, multi-source)
- New docs/ARCHITECTURE.md - system architecture overview
- Archived legacy files (QUICKSTART.md, QUICK_REFERENCE.md, docs/guides/USAGE.md) to docs/archive/legacy/

Chinese (zh-CN) translations:
- Full zh-CN mirror of all user-facing docs (getting-started, user-guide, reference, advanced)
- GitHub Actions workflow for translation sync (.github/workflows/translate-docs.yml)
- Translation sync checker script (scripts/check_translation_sync.sh)
- Translation helper script (scripts/translate_doc.py)

Content updates:
- CHANGELOG.md: [Unreleased] → [3.1.0] - 2026-02-22
- README.md: updated with new doc structure links
- AGENTS.md: updated agent documentation
- docs/features/UNIFIED_SCRAPING.md: updated for unified scraper workflow JSON config

Analysis/planning artifacts (kept for reference):
- DOCUMENTATION_OVERHAUL_PLAN.md, DOCUMENTATION_OVERHAUL_SUMMARY.md
- FEATURE_GAP_ANALYSIS.md, IMPLEMENTATION_GAPS_ANALYSIS.md, CREATE_COMMAND_COVERAGE_ANALYSIS.md
- CHINESE_TRANSLATION_IMPLEMENTATION_SUMMARY.md, ISSUE_260_UPDATE.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 01:01:51 +03:00
parent 22bdd4f5f6
commit ba9a8ff8b5
69 changed files with 31304 additions and 246 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -12,7 +12,7 @@ This file provides essential guidance for AI coding agents working with the Skil

 | Attribute | Value |
 |-----------|-------|
-| **Current Version** | 3.1.0-dev |
+| **Current Version** | 3.0.0 |
 | **Python Version** | 3.10+ (tested on 3.10, 3.11, 3.12, 3.13) |
 | **License** | MIT |
 | **Package Name** | `skill-seekers` (PyPI) |
@@ -55,9 +55,9 @@ This file provides essential guidance for AI coding agents working with the Skil
 ```
 /mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers/
 ├── src/skill_seekers/              # Main source code (src/ layout)
-│   ├── cli/                        # CLI tools and commands (~40k lines)
+│   ├── cli/                        # CLI tools and commands (~42k lines)
 │   │   ├── adaptors/               # Platform adaptors (Strategy pattern)
-│   │   │   ├── base.py             # Abstract base class
+│   │   │   ├── base.py             # Abstract base class (SkillAdaptor)
 │   │   │   ├── claude.py           # Claude AI adaptor
 │   │   │   ├── gemini.py           # Google Gemini adaptor
 │   │   │   ├── openai.py           # OpenAI ChatGPT adaptor
@@ -91,9 +91,9 @@ This file provides essential guidance for AI coding agents working with the Skil
 │   │   ├── cloud_storage_cli.py    # Cloud storage CLI
 │   │   ├── benchmark_cli.py        # Benchmarking CLI
 │   │   ├── sync_cli.py             # Sync monitoring CLI
-│   │   └── ...                     # Additional CLI modules
+│   │   └── workflows_command.py    # Workflow management CLI
 │   ├── mcp/                        # MCP server integration
-│   │   ├── server_fastmcp.py       # FastMCP server (main, ~708 lines)
+│   │   ├── server_fastmcp.py       # FastMCP server (~708 lines)
 │   │   ├── server_legacy.py        # Legacy server implementation
 │   │   ├── server.py               # Server entry point
 │   │   ├── agent_detector.py       # AI agent detection
@@ -105,7 +105,8 @@ This file provides essential guidance for AI coding agents working with the Skil
 │   │       ├── packaging_tools.py  # Packaging tools
 │   │       ├── source_tools.py     # Source management tools
 │   │       ├── splitting_tools.py  # Config splitting tools
-│   │       └── vector_db_tools.py  # Vector database tools
+│   │       ├── vector_db_tools.py  # Vector database tools
+│   │       └── workflow_tools.py   # Workflow management tools
 │   ├── sync/                       # Sync monitoring module
 │   │   ├── detector.py             # Change detection
 │   │   ├── models.py               # Data models (Pydantic)
@@ -120,9 +121,10 @@ This file provides essential guidance for AI coding agents working with the Skil
 │   │   ├── generator.py            # Embedding generation
 │   │   ├── cache.py                # Embedding cache
 │   │   └── models.py               # Embedding models
+│   ├── workflows/                  # YAML workflow presets
 │   ├── _version.py                 # Version information (reads from pyproject.toml)
 │   └── __init__.py                 # Package init
-├── tests/                          # Test suite (109 test files)
+├── tests/                          # Test suite (98 test files)
 ├── configs/                        # Preset configuration files
 ├── docs/                           # Documentation (80+ markdown files)
 │   ├── integrations/               # Platform integration guides
@@ -257,8 +259,8 @@ pytest tests/ -v -m "not slow and not integration"

 ### Test Architecture

-- **109 test files** covering all features
-- **~42,000 lines** of test code
+- **98 test files** covering all features
+- **1880+ tests** passing
 - CI Matrix: Ubuntu + macOS, Python 3.10-3.12
 - Test markers defined in `pyproject.toml`:

@@ -407,6 +409,7 @@ The CLI uses subcommands that delegate to existing modules:
 - `quality` - Quality metrics
 - `resume` - Resume interrupted jobs
 - `estimate` - Estimate page counts
+- `workflows` - Workflow management

 ### MCP Server Architecture

@@ -416,11 +419,12 @@ Two implementations:

 Tools are organized by category:
 - Config tools (3 tools): generate_config, list_configs, validate_config
-- Scraping tools (8 tools): estimate_pages, scrape_docs, scrape_github, scrape_pdf, scrape_codebase, detect_patterns, extract_test_examples, build_how_to_guides
+- Scraping tools (9 tools): estimate_pages, scrape_docs, scrape_github, scrape_pdf, scrape_codebase, detect_patterns, extract_test_examples, build_how_to_guides, extract_config_patterns
 - Packaging tools (4 tools): package_skill, upload_skill, enhance_skill, install_skill
 - Source tools (5 tools): fetch_config, submit_config, add_config_source, list_config_sources, remove_config_source
 - Splitting tools (2 tools): split_config, generate_router
 - Vector Database tools (4 tools): export_to_weaviate, export_to_chroma, export_to_faiss, export_to_qdrant
+- Workflow tools (5 tools): list_workflows, get_workflow, create_workflow, update_workflow, delete_workflow

 **Running MCP Server:**
 ```bash
@@ -508,6 +512,7 @@ All workflows are in `.github/workflows/`:

 **`docker-publish.yml`:**
 - Builds and publishes Docker images
+- Multi-architecture support (linux/amd64, linux/arm64)

 **`vector-db-export.yml`:**
 - Tests vector database exports
@@ -608,22 +613,54 @@ export ANTHROPIC_BASE_URL=https://custom-endpoint.com/v1

 ## Documentation

-### Project Documentation
+### Project Documentation (New Structure - v3.1.0+)

-- **README.md** - Main project documentation
-- **README.zh-CN.md** - Chinese translation
-- **CLAUDE.md** - Detailed implementation guidance
-- **QUICKSTART.md** - Quick start guide
-- **CONTRIBUTING.md** - Contribution guidelines
-- **TROUBLESHOOTING.md** - Common issues and solutions
+**Entry Points:**
+- **README.md** - Main project documentation with navigation
+- **docs/README.md** - Documentation hub
 - **AGENTS.md** - This file, for AI coding agents
-- **docs/** - Comprehensive documentation (80+ files)
-  - `docs/integrations/` - Integration guides for each platform
-  - `docs/guides/` - User guides
-  - `docs/reference/` - API reference
-  - `docs/features/` - Feature documentation
-  - `docs/blog/` - Blog posts and articles
-  - `docs/roadmap/` - Roadmap documents
+
+**Getting Started (for new users):**
+- `docs/getting-started/01-installation.md` - Installation guide
+- `docs/getting-started/02-quick-start.md` - 3 commands to first skill
+- `docs/getting-started/03-your-first-skill.md` - Complete walkthrough
+- `docs/getting-started/04-next-steps.md` - Where to go from here
+
+**User Guides (common tasks):**
+- `docs/user-guide/01-core-concepts.md` - How Skill Seekers works
+- `docs/user-guide/02-scraping.md` - All scraping options
+- `docs/user-guide/03-enhancement.md` - AI enhancement explained
+- `docs/user-guide/04-packaging.md` - Export to platforms
+- `docs/user-guide/05-workflows.md` - Enhancement workflows
+- `docs/user-guide/06-troubleshooting.md` - Common issues
+
+**Reference (technical details):**
+- `docs/reference/CLI_REFERENCE.md` - Complete command reference (20 commands)
+- `docs/reference/MCP_REFERENCE.md` - MCP tools reference (26 tools)
+- `docs/reference/CONFIG_FORMAT.md` - JSON configuration specification
+- `docs/reference/ENVIRONMENT_VARIABLES.md` - All environment variables
+
+**Advanced (power user topics):**
+- `docs/advanced/mcp-server.md` - MCP server setup
+- `docs/advanced/mcp-tools.md` - Advanced MCP usage
+- `docs/advanced/custom-workflows.md` - Creating custom workflows
+- `docs/advanced/multi-source.md` - Multi-source scraping
+
+**Legacy (being phased out):**
+- `QUICKSTART.md` - Old quick start (see docs/getting-started/)
+- `docs/guides/USAGE.md` - Old usage guide (see docs/user-guide/)
+- `docs/QUICK_REFERENCE.md` - Old reference (see docs/reference/)
+
+### Configuration Documentation
+
+Preset configs are in `configs/` directory:
+- `godot.json` - Godot Engine
+- `blender.json` / `blender-unified.json` - Blender Engine
+- `claude-code.json` - Claude Code
+- `httpx_comprehensive.json` - HTTPX library
+- `medusa-mercurjs.json` - Medusa/MercurJS
+- `astrovalley_unified.json` - Astrovalley
+- `configs/integrations/` - Integration-specific configs

 ### Configuration Documentation

@@ -662,6 +699,7 @@ Preset configs are in `configs/` directory:
 | `schedule` | >=1.2.0 | Scheduled tasks |
 | `python-dotenv` | >=1.1.1 | Environment variables |
 | `jsonschema` | >=4.25.1 | JSON validation |
+| `PyYAML` | >=6.0 | YAML parsing |

 ### Optional Dependencies

@@ -768,12 +806,47 @@ __version__ = get_version()  # Returns version from pyproject.toml

 ---

-## Code Statistics
+## Configuration File Format

-- **Source Code:** ~40,000 lines (CLI modules)
-- **Test Code:** ~42,000 lines (109 test files)
-- **Documentation:** 80+ markdown files
-- **Examples:** 11 complete integration examples
+Skill Seekers uses JSON configuration files to define scraping targets. Example structure:
+
+```json
+{
+  "name": "godot",
+  "description": "Godot Engine documentation",
+  "merge_mode": "claude-enhanced",
+  "sources": [
+    {
+      "type": "documentation",
+      "base_url": "https://docs.godotengine.org/en/stable/",
+      "extract_api": true,
+      "selectors": {
+        "main_content": "div[role='main']",
+        "title": "title",
+        "code_blocks": "pre"
+      },
+      "url_patterns": {
+        "include": [],
+        "exclude": ["/search.html", "/_static/"]
+      },
+      "categories": {
+        "getting_started": ["introduction", "getting_started"],
+        "scripting": ["scripting", "gdscript"]
+      },
+      "rate_limit": 0.5,
+      "max_pages": 500
+    },
+    {
+      "type": "github",
+      "repo": "godotengine/godot",
+      "enable_codebase_analysis": true,
+      "code_analysis_depth": "deep",
+      "fetch_issues": true,
+      "max_issues": 100
+    }
+  ]
+}
+```

 ---