docs: update all documentation for 17 source types

Update 32 documentation files across English and Chinese (zh-CN) docs
to reflect the 10 new source types added in the previous commit.

Updated files:
- README.md, README.zh-CN.md — taglines, feature lists, examples, install extras
- docs/reference/ — CLI_REFERENCE, FEATURE_MATRIX, MCP_REFERENCE, CONFIG_FORMAT, API_REFERENCE
- docs/features/ — UNIFIED_SCRAPING with generic merge docs
- docs/advanced/ — multi-source guide, MCP server guide
- docs/getting-started/ — installation extras, quick-start examples
- docs/user-guide/ — core-concepts, scraping, packaging, workflows (complex-merge)
- docs/ — FAQ, TROUBLESHOOTING, BEST_PRACTICES, ARCHITECTURE, UNIFIED_PARSERS, README
- Root — BULLETPROOF_QUICKSTART, CONTRIBUTING, ROADMAP
- docs/zh-CN/ — Chinese translations for all of the above

32 files changed, +3,016 lines, -245 lines
yusyus
2026-03-15 15:56:04 +03:00
parent 53b911b697
commit 37cb307455
32 changed files with 3011 additions and 240 deletions

@@ -1,8 +1,8 @@
# Config Format Reference - Skill Seekers
-> **Version:** 3.1.4
-> **Last Updated:** 2026-02-26
-> **Complete JSON configuration specification**
+> **Version:** 3.2.0
+> **Last Updated:** 2026-03-15
+> **Complete JSON configuration specification for 17 source types**
---
@@ -14,6 +14,7 @@
- [GitHub Source](#github-source)
- [PDF Source](#pdf-source)
- [Local Source](#local-source)
- [Additional Source Types](#additional-source-types)
- [Unified (Multi-Source) Config](#unified-multi-source-config)
- [Common Fields](#common-fields)
- [Selectors](#selectors)
@@ -266,6 +267,158 @@ For analyzing local codebases.
---
### Additional Source Types
The following 10 source types were added in v3.2.0. Each can be used as a standalone config or within a unified `sources` array.
#### Jupyter Notebook Source
```json
{
  "name": "ml-tutorial",
  "sources": [{
    "type": "jupyter",
    "notebook_path": "notebooks/tutorial.ipynb"
  }]
}
```
#### Local HTML Source
```json
{
  "name": "offline-docs",
  "sources": [{
    "type": "html",
    "html_path": "./exported-docs/"
  }]
}
```
#### OpenAPI/Swagger Source
```json
{
  "name": "petstore-api",
  "sources": [{
    "type": "openapi",
    "spec_path": "api/openapi.yaml",
    "spec_url": "https://petstore.swagger.io/v2/swagger.json"
  }]
}
```
#### AsciiDoc Source
```json
{
  "name": "project-guide",
  "sources": [{
    "type": "asciidoc",
    "asciidoc_path": "./docs/guide.adoc"
  }]
}
```
#### PowerPoint Source
```json
{
  "name": "training-slides",
  "sources": [{
    "type": "pptx",
    "pptx_path": "presentations/training.pptx"
  }]
}
```
#### RSS/Atom Feed Source
```json
{
  "name": "engineering-blog",
  "sources": [{
    "type": "rss",
    "feed_url": "https://engineering.example.com/feed.xml",
    "follow_links": true,
    "max_articles": 50
  }]
}
```
#### Man Page Source
```json
{
  "name": "unix-tools",
  "sources": [{
    "type": "manpage",
    "man_names": "ls,grep,find,awk,sed",
    "sections": "1,3"
  }]
}
```
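The `man_names` and `sections` values in the example above are comma-separated strings rather than JSON arrays. A minimal sketch of how a consumer might split them, assuming surrounding whitespace should be ignored and empty entries dropped; the helper name `split_csv_field` is hypothetical and is not taken from the Skill Seekers codebase:

```python
def split_csv_field(value: str) -> list[str]:
    """Split a comma-separated config value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Source entry copied from the man page example above.
source = {"type": "manpage", "man_names": "ls,grep,find,awk,sed", "sections": "1,3"}

names = split_csv_field(source["man_names"])
sections = split_csv_field(source["sections"])
print(names, sections)
```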
#### Confluence Source
```json
{
  "name": "team-wiki",
  "sources": [{
    "type": "confluence",
    "base_url": "https://wiki.example.com",
    "space_key": "DEV",
    "username": "user@example.com",
    "max_pages": 500
  }]
}
```
#### Notion Source
```json
{
  "name": "product-docs",
  "sources": [{
    "type": "notion",
    "database_id": "abc123def456",
    "max_pages": 500
  }]
}
```
#### Chat (Slack/Discord) Source
```json
{
  "name": "team-knowledge",
  "sources": [{
    "type": "chat",
    "export_path": "./slack-export/",
    "platform": "slack",
    "channel": "engineering",
    "max_messages": 10000
  }]
}
```
#### Additional Source Fields Reference
| Source Type | Required Fields | Optional Fields |
|-------------|-----------------|-----------------|
| `jupyter` | `notebook_path` | — |
| `html` | `html_path` | — |
| `openapi` | `spec_path` or `spec_url` | — |
| `asciidoc` | `asciidoc_path` | — |
| `pptx` | `pptx_path` | — |
| `rss` | `feed_url` or `feed_path` | `follow_links`, `max_articles` |
| `manpage` | `man_names` or `man_path` | `sections` |
| `confluence` | `base_url` + `space_key` or `export_path` | `username`, `token`, `max_pages` |
| `notion` | `database_id` or `page_id` or `export_path` | `token`, `max_pages` |
| `chat` | `export_path` | `platform`, `token`, `channel`, `max_messages` |
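
The "Required Fields" column above can be read as a set of alternatives per source type. The sketch below transcribes that column into a lookup table and checks a source entry against it. It assumes that `base_url` + `space_key` or `export_path` means "(both `base_url` and `space_key`) or `export_path`", and that an unknown type should fail the check. The names `REQUIRED_FIELDS` and `has_required_fields` are hypothetical, not part of the tool:

```python
# Each entry lists alternative field groups; a source passes if every field
# in at least one group is present. Transcribed from the table above.
REQUIRED_FIELDS = {
    "jupyter": [("notebook_path",)],
    "html": [("html_path",)],
    "openapi": [("spec_path",), ("spec_url",)],
    "asciidoc": [("asciidoc_path",)],
    "pptx": [("pptx_path",)],
    "rss": [("feed_url",), ("feed_path",)],
    "manpage": [("man_names",), ("man_path",)],
    # Assumption: "+" binds tighter than "or" in the table.
    "confluence": [("base_url", "space_key"), ("export_path",)],
    "notion": [("database_id",), ("page_id",), ("export_path",)],
    "chat": [("export_path",)],
}


def has_required_fields(source: dict) -> bool:
    """Return True if the source satisfies one of its required-field groups."""
    groups = REQUIRED_FIELDS.get(source.get("type"), [])
    # any() over an empty list is False, so unknown types are rejected.
    return any(all(field in source for field in group) for group in groups)


wiki = {"type": "confluence", "base_url": "https://wiki.example.com", "space_key": "DEV"}
print(has_required_fields(wiki))
```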
---
## Unified (Multi-Source) Config
Combine multiple sources into one skill with conflict detection.
@@ -380,14 +533,27 @@ Unified configs support defining enhancement workflows at the top level:
#### Source Types in Unified Config
-Each source in the `sources` array can be:
+Each source in the `sources` array can be any of the 17 supported types:
| Type | Required Fields |
|------|-----------------|
-| `docs` | `base_url` |
+| `documentation` / `docs` | `base_url` |
| `github` | `repo` |
| `pdf` | `pdf_path` |
| `word` | `docx_path` |
| `epub` | `epub_path` |
| `video` | `url` or `video_path` |
| `local` | `directory` |
| `jupyter` | `notebook_path` |
| `html` | `html_path` |
| `openapi` | `spec_path` or `spec_url` |
| `asciidoc` | `asciidoc_path` |
| `pptx` | `pptx_path` |
| `rss` | `feed_url` or `feed_path` |
| `manpage` | `man_names` or `man_path` |
| `confluence` | `base_url` + `space_key` or `export_path` |
| `notion` | `database_id` or `page_id` or `export_path` |
| `chat` | `export_path` |
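
As a rough illustration of the table above, the sketch below collects the 17 type names and rejects anything else. The table lists `documentation` and `docs` in the same row; treating `documentation` as an alias that normalizes to `docs` is an assumption made for this example, and `SUPPORTED_TYPES`, `TYPE_ALIASES`, and `canonical_type` are hypothetical names:

```python
# The 17 type names listed in the table above.
SUPPORTED_TYPES = frozenset({
    "docs", "github", "pdf", "word", "epub", "video", "local",
    "jupyter", "html", "openapi", "asciidoc", "pptx", "rss",
    "manpage", "confluence", "notion", "chat",
})

# Assumption: `documentation` is accepted as another spelling of `docs`.
TYPE_ALIASES = {"documentation": "docs"}


def canonical_type(type_name: str) -> str:
    """Map a `type` value to its canonical name, or raise for unknown types."""
    name = TYPE_ALIASES.get(type_name, type_name)
    if name not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported source type: {type_name!r}")
    return name


print(canonical_type("documentation"))
```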
---
@@ -606,6 +772,44 @@ Control which URLs are included or excluded:
}
```
### Unified with New Source Types
```json
{
  "name": "project-complete",
  "description": "Full project knowledge from multiple source types",
  "merge_mode": "claude-enhanced",
  "sources": [
    {
      "type": "docs",
      "name": "project-docs",
      "base_url": "https://docs.example.com/",
      "max_pages": 200
    },
    {
      "type": "github",
      "name": "project-code",
      "repo": "example/project"
    },
    {
      "type": "openapi",
      "name": "project-api",
      "spec_path": "api/openapi.yaml"
    },
    {
      "type": "confluence",
      "name": "project-wiki",
      "export_path": "./confluence-export/"
    },
    {
      "type": "jupyter",
      "name": "project-notebooks",
      "notebook_path": "./notebooks/"
    }
  ]
}
```
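To sanity-check a unified config like the one above before running it, one option is to load it with Python's standard `json` module and group the source names by type. This is only a sketch: `sources_by_type` is a hypothetical helper, and the config text is an abbreviated copy of the example rather than output from the tool:

```python
import json
from collections import defaultdict

# Abbreviated copy of the unified example above.
CONFIG_TEXT = """
{
  "name": "project-complete",
  "merge_mode": "claude-enhanced",
  "sources": [
    {"type": "docs", "name": "project-docs", "base_url": "https://docs.example.com/"},
    {"type": "github", "name": "project-code", "repo": "example/project"},
    {"type": "openapi", "name": "project-api", "spec_path": "api/openapi.yaml"},
    {"type": "confluence", "name": "project-wiki", "export_path": "./confluence-export/"},
    {"type": "jupyter", "name": "project-notebooks", "notebook_path": "./notebooks/"}
  ]
}
"""


def sources_by_type(config: dict) -> dict[str, list[str]]:
    """Group the `name` of each entry in `sources` under its `type`."""
    grouped = defaultdict(list)
    for source in config.get("sources", []):
        grouped[source["type"]].append(source.get("name", "<unnamed>"))
    return dict(grouped)


config = json.loads(CONFIG_TEXT)
for type_name, names in sources_by_type(config).items():
    print(f"{type_name}: {', '.join(names)}")
```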
### Local Project
```json