docs: update all chunk flag names to match renamed CLI flags

Replace all occurrences of old ambiguous flag names with the new explicit ones:
  --chunk-size (tokens)  → --chunk-tokens
  --chunk-overlap        → --chunk-overlap-tokens
  --chunk                → --chunk-for-rag
  --streaming-chunk-size → --streaming-chunk-chars
  --streaming-overlap    → --streaming-overlap-chars
  --chunk-size (pages)   → --pdf-pages-per-chunk

Updated: CLI_REFERENCE (EN+ZH), user-guide (EN+ZH), integrations (Haystack,
Chroma, Weaviate, FAISS, Qdrant), features/PDF_CHUNKING, examples/haystack-pipeline,
strategy docs, archive docs, and CHANGELOG.
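
For anyone applying the same rename to their own notes or scripts, the mapping above can be sketched as a small text rewriter. This is an illustrative sketch, not the script used for this commit: `RENAMES`, `AMBIGUOUS`, and `rename_flags` are hypothetical names, and `--chunk-size` is only reported rather than rewritten because its replacement depends on whether it meant tokens or PDF pages.

```python
import re

# Unambiguous renames copied from the list in this commit message.
RENAMES = {
    "--chunk-overlap": "--chunk-overlap-tokens",
    "--chunk": "--chunk-for-rag",
    "--streaming-chunk-size": "--streaming-chunk-chars",
    "--streaming-overlap": "--streaming-overlap-chars",
}

# Context-dependent: becomes --chunk-tokens or --pdf-pages-per-chunk,
# so this sketch flags it for manual review instead of guessing.
AMBIGUOUS = "--chunk-size"


def _flag_pattern(flag):
    # Match the flag only as a whole token, so "--chunk" does not
    # match inside "--chunk-overlap" or "--chunk-for-rag".
    return re.compile(r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])")


def rename_flags(text):
    """Return (rewritten_text, needs_manual_review)."""
    for old, new in RENAMES.items():
        text = _flag_pattern(old).sub(new, text)
    needs_review = bool(_flag_pattern(AMBIGUOUS).search(text))
    return text, needs_review
```

Because each pattern refuses to match when the flag is followed by another hyphen or word character, running the function twice on the same text leaves it unchanged.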

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
yusyus
2026-02-24 22:15:14 +03:00
parent 7a2ffb286c
commit 73adda0b17
29 changed files with 488 additions and 214 deletions

@@ -6,7 +6,6 @@ applies_to:
  - doc_scraping
  variables:
  depth: comprehensive
- alternatives: []
  stages:
  - name: feature_comparison
  type: custom

@@ -164,5 +164,5 @@ post_process:
  add_metadata:
  enhanced: true
  workflow: data-validation
- domain: ml
+ domain: backend
  has_validation_docs: true

@@ -17,6 +17,46 @@ stages:
  target: examples
  enabled: true
  uses_history: false
+ - name: architecture_overview
+ type: custom
+ target: architecture
+ uses_history: false
+ enabled: true
+ prompt: >
+ Provide a concise architectural overview of this codebase.
+ Cover:
+ 1. Overall architecture style (MVC, microservices, layered, etc.)
+ 2. Key components and their responsibilities
+ 3. Data flow between components
+ 4. External dependencies and integrations
+ 5. Entry points (CLI, API, web, etc.)
+ Output JSON with:
+ - "architecture_style": main architectural pattern
+ - "components": array of {name, responsibility}
+ - "data_flow": how data moves through the system
+ - "external_deps": third-party services and libraries
+ - "entry_points": how users interact with the system
+ - name: skill_polish
+ type: custom
+ target: skill_md
+ uses_history: true
+ enabled: true
+ prompt: >
+ Review the SKILL.md content generated so far and improve it.
+ Fix:
+ 1. Unclear or overly technical descriptions
+ 2. Missing quick-start examples
+ 3. Gaps in the overview section
+ 4. Redundant or duplicate information
+ 5. Formatting inconsistencies
+ Output JSON with:
+ - "improved_overview": rewritten overview section
+ - "quick_start": concise getting-started snippet
+ - "key_concepts": 3-5 essential concepts a developer needs to know
  post_process:
  reorder_sections: []
  add_metadata:
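
The JSON fields requested by the `architecture_overview` and `skill_polish` prompts in this hunk can be checked with a small helper. This is a minimal sketch under stated assumptions: it assumes a stage response arrives as a raw JSON string, and `EXPECTED_KEYS` and `missing_keys` are hypothetical names that do not come from the project.

```python
import json

# Field names copied from the "Output JSON with:" lists in the two prompts.
# The mapping itself is a hypothetical helper, not part of the project.
EXPECTED_KEYS = {
    "architecture_overview": {
        "architecture_style",
        "components",
        "data_flow",
        "external_deps",
        "entry_points",
    },
    "skill_polish": {
        "improved_overview",
        "quick_start",
        "key_concepts",
    },
}


def missing_keys(stage_name, raw_response):
    """Return the sorted list of expected fields absent from a stage response."""
    data = json.loads(raw_response)
    expected = EXPECTED_KEYS[stage_name]
    if not isinstance(data, dict):
        # A non-object response (list, string, number) carries none of the fields.
        return sorted(expected)
    return sorted(expected - set(data))
```

An empty list means every requested field is present; the helper does not validate the values themselves.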

@@ -14,12 +14,17 @@ stages:
  uses_history: false
  enabled: true
  prompt: >
- Review the following SKILL.md content and make minimal improvements:
- - Fix obvious formatting issues
- - Ensure the overview section is clear and concise
- - Remove duplicate or redundant information
+ Review the SKILL.md content and make minimal targeted improvements.
+ Return the improved content as plain text without extra commentary.
+ Fix only:
+ 1. Obvious formatting issues (broken lists, inconsistent headers)
+ 2. Unclear overview section (make it one clear paragraph)
+ 3. Duplicate or redundant information (remove repeats)
+ Output JSON with:
+ - "improved_overview": rewritten overview paragraph (plain markdown)
+ - "removed_sections": list of section names that were removed as duplicates
+ - "formatting_fixes": list of specific formatting issues corrected
  post_process:
  reorder_sections: []
  add_metadata:

@@ -3,9 +3,7 @@ description: "Security-focused review: vulnerabilities, auth, data handling"
  version: "1.0"
  applies_to:
  - codebase_analysis
- - python
- - javascript
- - typescript
+ - github_analysis
  variables:
  depth: comprehensive
  stages: