skill-seekers-reference/DOCUMENTATION_OVERHAUL_PLAN.md
yusyus ba9a8ff8b5 docs: complete documentation overhaul with v3.1.0 release notes and zh-CN translations
Documentation restructure:
- New docs/getting-started/ guide (4 files: install, quick-start, first-skill, next-steps)
- New docs/user-guide/ section (6 files: core concepts through troubleshooting)
- New docs/reference/ section (CLI_REFERENCE, CONFIG_FORMAT, ENVIRONMENT_VARIABLES, MCP_REFERENCE)
- New docs/advanced/ section (custom-workflows, mcp-server, multi-source)
- New docs/ARCHITECTURE.md - system architecture overview
- Archived legacy files (QUICKSTART.md, QUICK_REFERENCE.md, docs/guides/USAGE.md) to docs/archive/legacy/

Chinese (zh-CN) translations:
- Full zh-CN mirror of all user-facing docs (getting-started, user-guide, reference, advanced)
- GitHub Actions workflow for translation sync (.github/workflows/translate-docs.yml)
- Translation sync checker script (scripts/check_translation_sync.sh)
- Translation helper script (scripts/translate_doc.py)

Content updates:
- CHANGELOG.md: [Unreleased] → [3.1.0] - 2026-02-22
- README.md: updated with new doc structure links
- AGENTS.md: updated agent documentation
- docs/features/UNIFIED_SCRAPING.md: updated for unified scraper workflow JSON config

Analysis/planning artifacts (kept for reference):
- DOCUMENTATION_OVERHAUL_PLAN.md, DOCUMENTATION_OVERHAUL_SUMMARY.md
- FEATURE_GAP_ANALYSIS.md, IMPLEMENTATION_GAPS_ANALYSIS.md, CREATE_COMMAND_COVERAGE_ANALYSIS.md
- CHINESE_TRANSLATION_IMPLEMENTATION_SUMMARY.md, ISSUE_260_UPDATE.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 01:01:51 +03:00

Documentation Overhaul Plan - Skill Seekers v3.1.0

Status: Draft - Pending Review
Scope: Complete documentation rewrite
Target: Eliminate user confusion, remove phantom commands, establish single source of truth


Executive Summary

Problem Statement (from Issue #286)

  • Docs reference removed commands (python3 cli/doc_scraper.py pattern)
  • Phantom commands documented that don't exist (merge-sources, generate-router, etc.)
  • 83 markdown files with no clear hierarchy
  • Users cannot find accurate information

Solution

Complete documentation rewrite with:

  1. Single source of truth CLI reference (all 20 commands)
  2. Working quick start guide (3 commands to first skill)
  3. Clear documentation hierarchy (4 categories max)
  4. Deprecation strategy for outdated files

Phase Overview

| Phase | Name | Duration | Output |
|-------|------|----------|--------|
| 1 | Foundation | 3-4 hrs | CLI_REFERENCE.md, MCP reference, new structure |
| 2 | User Guides | 3-4 hrs | Quick start, workflows, troubleshooting |
| 3 | Integration | 2-3 hrs | README rewrite, navigation, redirects |
| 4 | Cleanup | 1-2 hrs | Archive old files, add deprecation notices |
| Total | | 9-13 hrs | Complete documentation overhaul |

Detailed Phase Breakdown


Phase 1: Foundation (CLI Reference & Structure)

1.1 Create Master CLI Reference

File: docs/reference/CLI_REFERENCE.md (NEW)

Sections:

CLI_REFERENCE.md
├── Overview
│   ├── Installation
│   ├── Global Flags
│   └── Environment Variables
│
├── Command Reference (alphabetical)
│   ├── analyze
│   ├── config
│   ├── create
│   ├── enhance
│   ├── enhance-status
│   ├── estimate
│   ├── github
│   ├── install
│   ├── install-agent
│   ├── multilang
│   ├── package
│   ├── pdf
│   ├── quality
│   ├── resume
│   ├── scrape
│   ├── stream
│   ├── unified
│   ├── update
│   ├── upload
│   └── workflows
│
├── MCP Tools Reference
│   ├── Overview (MCP vs CLI)
│   ├── Transport modes (stdio, HTTP)
│   └── Tool listing (34 tools)
│       ├── Core Tools (9)
│       │   ├── list_configs
│       │   ├── generate_config
│       │   ├── validate_config
│       │   ├── estimate_pages
│       │   ├── scrape_docs
│       │   ├── package_skill
│       │   ├── upload_skill
│       │   ├── enhance_skill
│       │   └── install_skill
│       ├── Extended Tools (9)
│       │   ├── scrape_github
│       │   ├── scrape_pdf
│       │   ├── unified_scrape
│       │   ├── scrape_codebase
│       │   ├── detect_patterns
│       │   ├── extract_test_examples
│       │   ├── build_how_to_guides
│       │   ├── extract_config_patterns
│       │   └── detect_conflicts
│       ├── Config Source Tools (5)
│       │   ├── add_config_source
│       │   ├── list_config_sources
│       │   ├── remove_config_source
│       │   ├── fetch_config
│       │   └── submit_config
│       ├── Config Splitting Tools (2)
│       │   ├── split_config
│       │   └── generate_router
│       ├── Vector Database Tools (4)
│       │   ├── export_to_weaviate
│       │   ├── export_to_chroma
│       │   ├── export_to_faiss
│       │   └── export_to_qdrant
│       └── Workflow Tools (5)
│           ├── list_workflows
│           ├── get_workflow
│           ├── create_workflow
│           ├── update_workflow
│           └── delete_workflow
│
└── Common Workflows
    ├── Workflow 1: Documentation → Skill
    ├── Workflow 2: GitHub → Skill
    ├── Workflow 3: PDF → Skill
    ├── Workflow 4: Local Codebase → Skill
    └── Workflow 5: Multi-Source → Skill

Each command section includes:

  • Purpose (1 sentence)
  • Syntax
  • Arguments (table: name, required, description)
  • Flags (table: short, long, default, description)
  • Examples (3-5 real examples)
  • Exit codes
  • Common errors
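One way to keep every command section consistent with the list above is a small lint that reports which required parts are missing. The sketch below assumes each part appears as its own Markdown heading inside the command's section; that layout, and the function name `missing_sections`, are assumptions for illustration rather than something this plan prescribes.

```python
# Required parts of a command section, in the order listed in the plan.
SECTIONS = (
    "Purpose",
    "Syntax",
    "Arguments",
    "Flags",
    "Examples",
    "Exit codes",
    "Common errors",
)


def missing_sections(markdown: str) -> list:
    """Return the required parts that do not appear as a heading in markdown."""
    # Collect heading titles, ignoring the heading level and letter case.
    present = {
        line.lstrip("#").strip().lower()
        for line in markdown.splitlines()
        if line.startswith("#")
    }
    return [name for name in SECTIONS if name.lower() not in present]
```

Running this over each command's section during review would flag, for example, a command that has examples but no exit codes.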

1.2 Create Config Format Reference

File: docs/reference/CONFIG_FORMAT.md (NEW)

Complete JSON schema documentation:

  • Root properties
  • Source types (docs, github, pdf, local)
  • Selectors
  • Categories
  • URL patterns
  • Rate limiting
  • Examples for each source type
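To make the scope of the reference concrete, here is a rough sketch of what a documented config and a basic sanity check might look like. The key names (`name`, `base_url`, `selectors`, `url_patterns`, `categories`, `rate_limit`) are assumptions inferred from the topics listed above, and `check_config` is a hypothetical helper; the authoritative schema belongs in CONFIG_FORMAT.md itself.

```python
# Illustrative config only: key names are assumptions, not the real schema.
EXAMPLE_CONFIG = {
    "name": "example-docs",
    "base_url": "https://example.com/docs/",
    "selectors": {"main_content": "article"},
    "url_patterns": {"include": ["/docs/"], "exclude": ["/blog/"]},
    "categories": {"getting_started": ["intro", "install"]},
    "rate_limit": 0.5,
}

# Hypothetical minimum set of keys for this sketch.
REQUIRED_KEYS = ("name", "base_url")


def check_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if key not in config
    ]
    base_url = config.get("base_url", "")
    if base_url and not base_url.startswith(("http://", "https://")):
        problems.append("base_url must start with http:// or https://")
    rate_limit = config.get("rate_limit", 0)
    if not isinstance(rate_limit, (int, float)) or rate_limit < 0:
        problems.append("rate_limit must be a non-negative number")
    return problems
```

Pairing each documented property with a check like this is one way to keep the reference and the validator from drifting apart.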

1.3 Create Environment Variables Reference

File: docs/reference/ENVIRONMENT_VARIABLES.md (NEW)

Complete env var documentation:

  • API keys (Anthropic, Google, OpenAI, GitHub)
  • Configuration paths
  • Output directories
  • Rate limiting
  • Debug options
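A short helper can show readers how to confirm which credentials are visible to the process. The variable names below are the conventional ones for each provider and are assumed here for illustration; the reference page should list the names the tool actually reads.

```python
import os

# Conventional variable names, assumed for this sketch.
ASSUMED_KEYS = {
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google": "GOOGLE_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
    "GitHub": "GITHUB_TOKEN",
}


def missing_keys(environ=None) -> list:
    """Return the providers whose variable is unset or empty."""
    env = os.environ if environ is None else environ
    return [provider for provider, var in ASSUMED_KEYS.items() if not env.get(var)]


if __name__ == "__main__":
    for provider in missing_keys():
        print(f"No key found for {provider} ({ASSUMED_KEYS[provider]})")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.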

1.4 Establish New Directory Structure

docs/
├── README.md                         # Documentation entry point
├── ARCHITECTURE.md                   # How docs are organized
│
├── getting-started/                  # New users start here
│   ├── 01-installation.md           # pip install, requirements
│   ├── 02-quick-start.md            # 3 commands to first skill
│   ├── 03-your-first-skill.md       # Complete walkthrough
│   └── 04-next-steps.md             # Where to go from here
│
├── user-guide/                       # Common tasks
│   ├── 01-core-concepts.md          # Skills, configs, sources
│   ├── 02-scraping.md               # Docs, GitHub, PDF, local
│   ├── 03-enhancement.md            # AI enhancement options
│   ├── 04-packaging.md              # Target platforms
│   ├── 05-workflows.md              # Using workflow presets
│   └── 06-troubleshooting.md        # Common issues
│
├── reference/                        # Technical reference
│   ├── CLI_REFERENCE.md             # Complete command reference
│   ├── MCP_REFERENCE.md             # MCP tools reference
│   ├── CONFIG_FORMAT.md             # JSON config specification
│   └── ENVIRONMENT_VARIABLES.md     # Environment variables
│
└── advanced/                         # Power user features
    ├── custom-workflows.md          # Creating YAML workflows
    ├── mcp-server.md                # MCP integration
    ├── multi-source.md              # Unified scraping deep dive
    └── api-reference.md             # Python API (for developers)
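Because the tree above is the contract for the new layout, it can be checked mechanically once files start landing. The following is a minimal sketch: the path list is copied from the tree, while the helper name `missing_docs` is hypothetical.

```python
from pathlib import Path

# Paths taken from the planned tree, relative to the docs/ directory.
EXPECTED = [
    "README.md",
    "ARCHITECTURE.md",
    "getting-started/01-installation.md",
    "getting-started/02-quick-start.md",
    "getting-started/03-your-first-skill.md",
    "getting-started/04-next-steps.md",
    "user-guide/01-core-concepts.md",
    "user-guide/02-scraping.md",
    "user-guide/03-enhancement.md",
    "user-guide/04-packaging.md",
    "user-guide/05-workflows.md",
    "user-guide/06-troubleshooting.md",
    "reference/CLI_REFERENCE.md",
    "reference/MCP_REFERENCE.md",
    "reference/CONFIG_FORMAT.md",
    "reference/ENVIRONMENT_VARIABLES.md",
    "advanced/custom-workflows.md",
    "advanced/mcp-server.md",
    "advanced/multi-source.md",
    "advanced/api-reference.md",
]


def missing_docs(docs_root: Path) -> list:
    """Return the planned files that do not yet exist under docs_root."""
    return [rel for rel in EXPECTED if not (docs_root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_docs(Path("docs")):
        print("missing:", rel)
```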

Phase 2: User Guides

2.1 Installation Guide

File: docs/getting-started/01-installation.md

  • System requirements (Python 3.10+)
  • Basic install: pip install skill-seekers
  • With all platforms: pip install "skill-seekers[all-llms]"
  • Development setup: pip install -e ".[all-llms,dev]"
  • Verify installation: skill-seekers --version

2.2 Quick Start Guide

File: docs/getting-started/02-quick-start.md

The "3 commands to first skill":

# 1. Install
pip install skill-seekers

# 2. Create skill (auto-detects source type)
skill-seekers create https://docs.djangoproject.com/

# 3. Package for Claude
skill-seekers package output/django --target claude

Plus variants:

  • GitHub repo: skill-seekers create django/django
  • Local project: skill-seekers create ./my-project
  • PDF file: skill-seekers create manual.pdf
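This plan does not specify how `create` decides on a source type. As a rough illustration only, the four input shapes shown above could be told apart with heuristics like the following. The function name `guess_source_type` and the rules themselves are assumptions for this sketch, not the tool's actual logic.

```python
import re
from urllib.parse import urlparse

# Hypothetical heuristic for illustration; the real command may differ.
_OWNER_REPO = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def guess_source_type(source: str) -> str:
    """Classify a create argument as 'docs', 'github', 'pdf', or 'local'."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # A github.com URL is treated as a repository, anything else as docs.
        is_github = parsed.netloc.lower() in ("github.com", "www.github.com")
        return "github" if is_github else "docs"
    if source.lower().endswith(".pdf"):
        return "pdf"
    if source.startswith(("./", "../", "/", "~")):
        return "local"
    if _OWNER_REPO.match(source):
        # Bare owner/repo shorthand, e.g. django/django.
        return "github"
    return "local"


if __name__ == "__main__":
    for arg in ("django/django", "./my-project", "manual.pdf"):
        print(arg, "->", guess_source_type(arg))
```

Note the ambiguity this exposes: a relative path such as `src/app` has the same shape as `owner/repo`, so the quick start should state how that case is resolved.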

2.3 Your First Skill (Complete Walkthrough)

File: docs/getting-started/03-your-first-skill.md

Step-by-step with screenshots/description:

  1. Choose source (we'll use React docs)
  2. Run create command
  3. Wait for scraping (explain what's happening)
  4. Review output structure
  5. Optional: enhance with AI
  6. Package skill
  7. Upload to Claude (or use locally)

Include actual output examples.

2.4 Core Concepts

File: docs/user-guide/01-core-concepts.md

  • What is a skill? (SKILL.md + references/)
  • What is a config? (JSON file defining source)
  • Source types (docs, github, pdf, local)
  • Enhancement (why and when)
  • Packaging (target platforms)

2.5 Scraping Guide

File: docs/user-guide/02-scraping.md

Four sections:

  1. Documentation Scraping

    • Using presets (--config)
    • Quick mode (--base-url, --name)
    • Dry run (--dry-run)
    • Rate limiting
  2. GitHub Repository Analysis

    • Basic analysis
    • Analysis depth options
    • With GitHub token
  3. PDF Extraction

    • Basic extraction
    • OCR for scanned PDFs
    • Large PDF handling
  4. Local Codebase Analysis

    • Analyzing local projects
    • Language detection
    • Pattern detection

2.6 Enhancement Guide

File: docs/user-guide/03-enhancement.md

  • What is enhancement? (improves SKILL.md quality)
  • API mode vs LOCAL mode
  • Using workflow presets
  • Multi-workflow chaining
  • When to skip enhancement

2.7 Packaging Guide

File: docs/user-guide/04-packaging.md

  • Supported platforms (table)
  • Platform-specific packaging
  • Multi-platform packaging loop
  • Output formats explained
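The multi-platform loop could be illustrated by generating one `package` invocation per target. In this sketch only `--target claude` is confirmed by the quick start in this plan; the other target names are placeholders assumed for illustration, and the commands are printed rather than executed.

```python
import shlex


def package_commands(skill_dir: str, targets: list) -> list:
    """Build one 'skill-seekers package' argument list per target platform."""
    return [
        ["skill-seekers", "package", skill_dir, "--target", target]
        for target in targets
    ]


if __name__ == "__main__":
    # "gemini" and "openai" are assumed target names, not confirmed by this plan.
    for argv in package_commands("output/django", ["claude", "gemini", "openai"]):
        print(shlex.join(argv))
```

Building argument lists instead of shell strings avoids quoting problems if a skill directory contains spaces; each list can be passed directly to `subprocess.run`.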

2.8 Workflows Guide

File: docs/user-guide/05-workflows.md

  • What are workflow presets?
  • Built-in presets (default, minimal, security-focus, architecture-comprehensive, api-documentation)
  • Using presets (--enhance-workflow)
  • Chaining multiple presets
  • Listing available workflows
  • Creating custom workflows

2.9 Troubleshooting Guide

File: docs/user-guide/06-troubleshooting.md

Common issues with solutions:

| Issue | Cause | Solution |
|-------|-------|----------|
| ImportError | Package not installed | pip install -e . |
| Rate limit exceeded | Too fast | Increase rate_limit in config |
| No content extracted | Wrong selectors | Check selectors with browser dev tools |
| Enhancement fails | No API key / Claude Code not running | Set key or install Claude Code |
| Package fails | Missing SKILL.md | Run build first |

Plus:

  • How to get help
  • Debug mode (--verbose)
  • Log files location
  • Creating a minimal reproduction

Phase 3: Integration

3.1 Main README Rewrite

File: README.md (UPDATE)

Structure:

# Skill Seekers

[Badges - keep current]

## 🚀 Quick Start (3 commands)
[The 3-command quick start]

## What is Skill Seekers?
[1-paragraph explanation]

## 📚 Documentation

| I want to... | Read this |
|--------------|-----------|
| Get started quickly | [Quick Start](docs/getting-started/02-quick-start.md) |
| Learn common workflows | [User Guide](docs/user-guide/) |
| Look up a command | [CLI Reference](docs/reference/CLI_REFERENCE.md) |
| Create custom configs | [Config Format](docs/reference/CONFIG_FORMAT.md) |
| Set up MCP | [MCP Guide](docs/advanced/mcp-server.md) |

## Installation
[Basic install instructions]

## Features
[Keep current features table]

## Contributing
[Link to CONTRIBUTING.md]
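Since the navigation table is the main way into the new docs, it is worth verifying that every relative link in the README resolves. The sketch below is one possible check; the helper names are hypothetical, and it handles only inline `[text](target)` links.

```python
import re
from pathlib import Path

# Matches inline Markdown links of the form [text](target).
LINK = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def relative_targets(markdown: str) -> list:
    """Return link targets that are repo-relative paths, not URLs or anchors."""
    external = ("http://", "https://", "#", "mailto:")
    return [t for t in LINK.findall(markdown) if not t.startswith(external)]


def broken_links(readme: Path) -> list:
    """Return relative targets in readme that do not exist on disk."""
    base = readme.parent
    text = readme.read_text(encoding="utf-8")
    # Strip any #fragment before checking the path.
    return [
        t for t in relative_targets(text)
        if not (base / t.split("#", 1)[0]).exists()
    ]
```

Calling `broken_links(Path("README.md"))` from the repository root would list any navigation entries that point at files not yet written.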

3.2 Docs Entry Point

File: docs/README.md (NEW)

Navigation hub:

  • Welcome message
  • "Where should I start?" flowchart
  • Quick links to all sections
  • Version info
  • How to contribute to docs

3.3 Architecture Document

File: docs/ARCHITECTURE.md (NEW)

Explains how documentation is organized:

  • 4 categories explained
  • When to use each
  • File naming conventions
  • How to contribute

Phase 4: Cleanup

4.1 Files to Archive

Move to docs/archive/legacy/:

  • docs/guides/USAGE.md - Uses old CLI pattern
  • docs/QUICK_REFERENCE.md - Has phantom commands
  • QUICKSTART.md (root) - Outdated, redirect to new quick start

4.2 Add Deprecation Notices

For files kept but outdated, add header:

> ⚠️ **DEPRECATED**: This document references older CLI patterns.
> 
> For up-to-date documentation, see:
> - [Quick Start](docs/getting-started/02-quick-start.md)
> - [CLI Reference](docs/reference/CLI_REFERENCE.md)

Files needing deprecation notice:

  • docs/guides/USAGE.md
  • Any other docs using python3 cli/X.py pattern
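Adding the header by hand to many files is error-prone, so it could be scripted. This is a minimal sketch, assuming the notice text shown above is used verbatim; the helper names are hypothetical, and the check for an existing notice makes the operation safe to run twice.

```python
from pathlib import Path

# The marker is what we look for to detect an existing notice.
MARKER = "**DEPRECATED**"
NOTICE = (
    "> ⚠️ **DEPRECATED**: This document references older CLI patterns.\n"
    "> \n"
    "> For up-to-date documentation, see:\n"
    "> - [Quick Start](docs/getting-started/02-quick-start.md)\n"
    "> - [CLI Reference](docs/reference/CLI_REFERENCE.md)\n"
)


def with_deprecation_notice(text: str) -> str:
    """Prepend the notice unless the top of the document already carries one."""
    if MARKER in text[:500]:
        return text
    return NOTICE + "\n" + text


def deprecate_file(path: Path) -> bool:
    """Rewrite path in place; return True if the file was changed."""
    original = path.read_text(encoding="utf-8")
    updated = with_deprecation_notice(original)
    if updated == original:
        return False
    path.write_text(updated, encoding="utf-8")
    return True
```

One caveat: the link targets in the notice are written relative to the repository root, so for files that live under docs/ they would likely need adjusting to resolve correctly.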

4.3 Update AGENTS.md

Update AGENTS.md documentation section to reflect new structure.

4.4 Chinese Documentation Strategy

Goal: Maintain parity between English and Chinese documentation.

Approach:

  • Primary: English docs are source of truth (in docs/)
  • Secondary: Chinese translations in docs.zh-CN/ or docs/locales/zh-CN/

Files to Translate (Priority Order):

  1. docs/getting-started/02-quick-start.md - Most accessed
  2. docs/README.md - Entry point
  3. docs/user-guide/06-troubleshooting.md - Reduces support burden
  4. docs/reference/CLI_REFERENCE.md - Command reference

Translation Workflow:

English doc updated → Mark for translation → Translate → Review → Publish

Options for Implementation:

  • Option A: Separate docs.zh-CN/ directory (mirrors docs/ structure)
  • Option B: Side-by-side files (README.md + README.zh-CN.md in same dir)
  • Option C: Keep existing README.zh-CN.md pattern, translate key docs only

Recommendation: Option C for now - translate only:

  • README.zh-CN.md (update existing)
  • docs/getting-started/02-quick-start.zh-CN.md (new)
  • docs/user-guide/06-troubleshooting.zh-CN.md (new)

Long-term: Consider i18n framework if user base grows.

Chinese README.md Updates Needed:

  • Update installation instructions
  • Update command examples (new CLI pattern)
  • Update navigation links to new docs structure
  • Remove phantom commands

Sync Strategy:

  • English docs: Always current (source of truth)
  • Chinese docs: Best effort, marked with "Last translated: DATE"
  • Community contributions welcome for translations
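The "Last translated: DATE" marker makes staleness checkable by a script. The sketch below assumes the date is written as `YYYY-MM-DD` and that the English file's last-updated date is supplied by the caller (for example from version control). Both are assumptions, and this is not the logic of any existing sync script.

```python
import re
from datetime import date
from typing import Optional

# Assumed marker format: "Last translated: YYYY-MM-DD" anywhere in the file.
LAST_TRANSLATED = re.compile(r"Last translated:\s*(\d{4})-(\d{2})-(\d{2})")


def translation_date(translated_text: str) -> Optional[date]:
    """Extract the marker date, or None if the file has no marker."""
    match = LAST_TRANSLATED.search(translated_text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def is_stale(translated_text: str, english_updated: date) -> bool:
    """A translation is stale if it is unmarked or older than the English source."""
    translated_on = translation_date(translated_text)
    return translated_on is None or translated_on < english_updated
```

Treating an unmarked file as stale is a deliberate choice here: it pushes translators to add the marker instead of letting undated files pass silently.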

Files to Create/Modify

New Files (16)

| File | Phase | Purpose |
|------|-------|---------|
| docs/reference/CLI_REFERENCE.md | 1 | Master command reference |
| docs/reference/MCP_REFERENCE.md | 1 | MCP tools reference |
| docs/reference/CONFIG_FORMAT.md | 1 | JSON config spec |
| docs/reference/ENVIRONMENT_VARIABLES.md | 1 | Env vars reference |
| docs/README.md | 3 | Docs entry point |
| docs/ARCHITECTURE.md | 3 | Documentation organization |
| docs/getting-started/01-installation.md | 2 | Install guide |
| docs/getting-started/02-quick-start.md | 2 | 3-command quick start |
| docs/getting-started/03-your-first-skill.md | 2 | Complete walkthrough |
| docs/getting-started/04-next-steps.md | 2 | Where to go next |
| docs/user-guide/01-core-concepts.md | 2 | Core concepts |
| docs/user-guide/02-scraping.md | 2 | Scraping guide |
| docs/user-guide/03-enhancement.md | 2 | Enhancement guide |
| docs/user-guide/04-packaging.md | 2 | Packaging guide |
| docs/user-guide/05-workflows.md | 2 | Workflows guide |
| docs/user-guide/06-troubleshooting.md | 2 | Troubleshooting |

Modified Files (2)

| File | Phase | Changes |
|------|-------|---------|
| README.md | 3 | New structure, navigation table |
| AGENTS.md | 4 | Update documentation section |

Archived Files (3+)

| File | Destination | Action |
|------|-------------|--------|
| docs/guides/USAGE.md | docs/archive/legacy/ | Move + deprecation notice |
| docs/QUICK_REFERENCE.md | docs/archive/legacy/ | Move + deprecation notice |
| QUICKSTART.md | docs/archive/legacy/ | Move + create redirect |

Success Metrics

After implementation, the documentation should meet these criteria:

  • Zero references to python3 cli/X.py pattern
  • Zero phantom commands documented
  • All 20 CLI commands documented with examples
  • Quick start works with copy-paste
  • Clear navigation from README
  • Troubleshooting covers top 10 issues
  • User can find any command in < 3 clicks
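The first two metrics can be verified automatically. The sketch below scans Markdown files for the legacy `python3 cli/X.py` invocation and for the phantom commands named in the problem statement. It assumes phantom commands would appear as `skill-seekers <command>` and that archived files should be skipped; both are assumptions made for illustration.

```python
import re
from pathlib import Path

# The removed invocation style named in the problem statement.
LEGACY_CALL = re.compile(r"python3?\s+cli/\w+\.py")
# Phantom commands named in this plan; extend as more are discovered.
PHANTOM_COMMANDS = ("merge-sources", "generate-router")


def find_stale_references(text: str) -> list:
    """Return (line_number, reason) pairs for each suspicious line in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if LEGACY_CALL.search(line):
            hits.append((number, "legacy cli/*.py invocation"))
        for command in PHANTOM_COMMANDS:
            if f"skill-seekers {command}" in line:
                hits.append((number, f"phantom command: {command}"))
    return hits


def scan_docs(root: Path) -> dict:
    """Scan every Markdown file under root, skipping anything in an archive directory."""
    report = {}
    for path in sorted(root.rglob("*.md")):
        if "archive" in path.parts:
            continue
        hits = find_stale_references(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

An empty result from `scan_docs(Path("docs"))` would be evidence that both "zero references" metrics hold for the scanned tree.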

Review Checklist

Before implementation, review this plan for:

  • Completeness: All 20 commands covered?
  • Accuracy: No phantom commands?
  • Organization: Clear hierarchy?
  • Scope: Not too much / too little?
  • Priority: Right order of phases?

Next Steps

  1. Review this plan - Comment, modify, approve
  2. Say "good to go" - I'll switch to implementation mode
  3. Implementation - I'll create todos and start writing

Plan Version: 1.1
Created: 2026-02-16
Status: Awaiting Review