docs: update all documentation for 17 source types

Update 32 documentation files across English and Chinese (zh-CN) docs
to reflect the 10 new source types added in the previous commit.

Updated files:
- README.md, README.zh-CN.md — taglines, feature lists, examples, install extras
- docs/reference/ — CLI_REFERENCE, FEATURE_MATRIX, MCP_REFERENCE, CONFIG_FORMAT, API_REFERENCE
- docs/features/ — UNIFIED_SCRAPING with generic merge docs
- docs/advanced/ — multi-source guide, MCP server guide
- docs/getting-started/ — installation extras, quick-start examples
- docs/user-guide/ — core-concepts, scraping, packaging, workflows (complex-merge)
- docs/ — FAQ, TROUBLESHOOTING, BEST_PRACTICES, ARCHITECTURE, UNIFIED_PARSERS, README
- Root — BULLETPROOF_QUICKSTART, CONTRIBUTING, ROADMAP
- docs/zh-CN/ — Chinese translations for all of the above

32 files changed, +3,016 lines, -245 lines
yusyus
2026-03-15 15:56:04 +03:00
parent 53b911b697
commit 37cb307455
32 changed files with 3011 additions and 240 deletions

@@ -441,21 +441,46 @@ def test_config_validation_with_missing_fields():
```
Skill_Seekers/
├── cli/ # CLI tools
│ ├── doc_scraper.py # Main scraper
│ ├── package_skill.py # Packager
│ ├── upload_skill.py # Uploader
│ └── utils.py # Shared utilities
├── mcp/ # MCP server
│ ├── server.py # MCP implementation
│ └── requirements.txt # MCP dependencies
├── configs/ # Framework configs
├── docs/ # Documentation
├── tests/ # Test suite
└── .github/ # GitHub config
    └── workflows/ # CI/CD workflows
├── src/skill_seekers/ # Main package (src/ layout)
│ ├── cli/ # CLI commands and entry points
│ │ ├── main.py # Unified CLI entry (COMMAND_MODULES dict)
│ │ ├── source_detector.py # Auto-detects source type
│ │ ├── create_command.py # Unified `create` command routing
│ │ ├── config_validator.py # VALID_SOURCE_TYPES set
│ │ ├── unified_scraper.py # Multi-source orchestrator
│ │ ├── unified_skill_builder.py # Pairwise synthesis + generic merge
│ │ ├── doc_scraper.py # Documentation (web)
│ │ ├── github_scraper.py # GitHub repos
│ │ ├── pdf_scraper.py # PDF files
│ │ ├── word_scraper.py # Word (.docx)
│ │ ├── epub_scraper.py # EPUB books
│ │ ├── video_scraper.py # Video (YouTube, Vimeo, local)
│ │ ├── codebase_scraper.py # Local codebases
│ │ ├── jupyter_scraper.py # Jupyter Notebooks
│ │ ├── html_scraper.py # Local HTML files
│ │ ├── openapi_scraper.py # OpenAPI/Swagger specs
│ │ ├── asciidoc_scraper.py # AsciiDoc files
│ │ ├── pptx_scraper.py # PowerPoint files
│ │ ├── rss_scraper.py # RSS/Atom feeds
│ │ ├── manpage_scraper.py # Man pages
│ │ ├── confluence_scraper.py # Confluence wikis
│ │ ├── notion_scraper.py # Notion pages
│ │ ├── chat_scraper.py # Slack/Discord exports
│ │ ├── adaptors/ # Platform adaptors (Strategy pattern)
│ │ ├── arguments/ # CLI argument definitions (one per source)
│ │ ├── parsers/ # Subcommand parsers (one per source)
│ │ └── storage/ # Cloud storage adaptors
│ ├── mcp/ # MCP server + tools
│ └── sync/ # Sync monitoring
├── configs/ # Preset JSON scraping configs
├── docs/ # Documentation
├── tests/ # 115+ test files (pytest)
└── .github/ # GitHub config
    └── workflows/ # CI/CD workflows
```
**Scraper pattern (17 source types):** Each source type has `cli/<type>_scraper.py` (with `<Type>ToSkillConverter` class + `main()`), `arguments/<type>.py`, and `parsers/<type>_parser.py`. Register new types in: `parsers/__init__.py` PARSERS list, `main.py` COMMAND_MODULES dict, `config_validator.py` VALID_SOURCE_TYPES set.
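As a rough illustration of the registration steps above, the sketch below checks that a source type appears in all three registries and lists the files the pattern expects. The names `PARSERS`, `COMMAND_MODULES`, and `VALID_SOURCE_TYPES` come from the paragraph above. Their contents and value shapes here are assumptions, and `expected_files`, `missing_registrations`, and the `gopher` type are hypothetical helpers and examples, not part of Skill Seekers.

```python
# Hypothetical stand-ins for the three registries named above.
# The real contents and value shapes in Skill Seekers may differ.
PARSERS = ["pdf_parser", "rss_parser"]   # parsers/__init__.py (assumed: list of parser names)
COMMAND_MODULES = {                      # main.py (assumed: command name -> module path)
    "pdf": "skill_seekers.cli.pdf_scraper",
    "rss": "skill_seekers.cli.rss_scraper",
}
VALID_SOURCE_TYPES = {"pdf", "rss"}      # config_validator.py


def expected_files(source_type: str) -> list[str]:
    """Return the three per-type files the scraper pattern calls for."""
    return [
        f"cli/{source_type}_scraper.py",
        f"arguments/{source_type}.py",
        f"parsers/{source_type}_parser.py",
    ]


def missing_registrations(source_type: str) -> list[str]:
    """List the registries that do not yet include source_type."""
    missing = []
    if f"{source_type}_parser" not in PARSERS:
        missing.append("PARSERS")
    if source_type not in COMMAND_MODULES:
        missing.append("COMMAND_MODULES")
    if source_type not in VALID_SOURCE_TYPES:
        missing.append("VALID_SOURCE_TYPES")
    return missing


if __name__ == "__main__":
    # "gopher" is a made-up source type used only to show an unregistered case.
    for name in ("rss", "gopher"):
        print(name, expected_files(name), missing_registrations(name))
```

A check along these lines could run in a test to catch a scraper that was added without being registered everywhere. Whether the project already has such a test is not stated here.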
---
## Release Process