docs: update all chunk flag names to match renamed CLI flags
Replace all occurrences of the old ambiguous flag names with the new explicit ones:

- `--chunk-size` (tokens) → `--chunk-tokens`
- `--chunk-overlap` → `--chunk-overlap-tokens`
- `--chunk` → `--chunk-for-rag`
- `--streaming-chunk-size` → `--streaming-chunk-chars`
- `--streaming-overlap` → `--streaming-overlap-chars`
- `--chunk-size` (pages) → `--pdf-pages-per-chunk`

Updated: CLI_REFERENCE (EN+ZH), user-guide (EN+ZH), integrations (Haystack, Chroma, Weaviate, FAISS, Qdrant), features/PDF_CHUNKING, examples/haystack-pipeline, strategy docs, archive docs, and CHANGELOG.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
143
.github/workflows/translate-docs.yml
vendored
@@ -1,143 +0,0 @@
name: Translate Documentation to Chinese

on:
  push:
    branches:
      - main
      - development
    paths:
      - 'docs/**/*.md'
      - '!docs/zh-CN/**'
      - '!docs/archive/**'
  workflow_dispatch:
    inputs:
      files:
        description: 'Specific files to translate (comma-separated, or "all")'
        required: false
        default: 'changed'

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      changed-files: ${{ steps.detect.outputs.files }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 2

      - name: Detect changed files
        id: detect
        run: |
          if [ "${{ github.event.inputs.files }}" = "all" ]; then
            # Translate all docs
            FILES=$(find docs -name "*.md" -not -path "docs/zh-CN/*" -not -path "docs/archive/*" | tr '\n' ',')
          elif [ "${{ github.event.inputs.files }}" != "" ] && [ "${{ github.event.inputs.files }}" != "changed" ]; then
            # Use provided files
            FILES="${{ github.event.inputs.files }}"
          else
            # Detect changed files
            FILES=$(git diff --name-only HEAD~1 HEAD | grep "^docs/" | grep -v "^docs/zh-CN/" | grep -v "^docs/archive/" | grep "\.md$" | tr '\n' ',')
          fi

          # Remove trailing comma
          FILES=$(echo "$FILES" | sed 's/,$//')

          echo "files=$FILES" >> $GITHUB_OUTPUT
          echo "Detected files: $FILES"

  translate:
    runs-on: ubuntu-latest
    needs: detect-changes
    if: needs.detect-changes.outputs.changed-files != ''
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install anthropic

      - name: Translate documents
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          IFS=',' read -ra FILES <<< "${{ needs.detect-changes.outputs.changed-files }}"
          for file in "${FILES[@]}"; do
            if [ -f "$file" ]; then
              echo "Translating: $file"
              python scripts/translate_doc.py "$file" --target-lang zh-CN || echo "Failed: $file"
            fi
          done

      - name: Check for changes
        id: git-check
        run: |
          git add docs/zh-CN/
          if git diff --cached --quiet; then
            echo "changed=false" >> $GITHUB_OUTPUT
          else
            echo "changed=true" >> $GITHUB_OUTPUT
          fi

      - name: Create Pull Request
        if: steps.git-check.outputs.changed == 'true'
        uses: peter-evans/create-pull-request@v6
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: "[Auto] Chinese translation update"
          title: "🌐 [Auto] Chinese Documentation Translation Update"
          body: |
            ## 🇨🇳 中文文档翻译更新 / Chinese Documentation Translation Update

            This PR contains automated translations of updated documentation.

            ### 变更内容 / Changes
            ${{ needs.detect-changes.outputs.changed-files }}

            ### 审阅指南 / Review Guide
            - [ ] 技术术语准确 / Technical terms accurate
            - [ ] 链接正确指向中文版本 / Links point to Chinese versions
            - [ ] 代码示例保持原样 / Code examples preserved
            - [ ] 格式正确 / Formatting correct

            ### 如何审阅 / How to Review
            1. 查看文件列表 / Check the file list
            2. 阅读中文翻译 / Read the Chinese translation
            3. 在 PR 中提出修改建议 / Suggest changes in PR
            4. 确认后批准 / Approve when satisfied

            ### 相关 Issue
            - #260 - Chinese Translation

            ---

            *This PR was auto-generated by GitHub Actions*
          branch: auto-translate-zh-cn-${{ github.run_number }}
          delete-branch: true
          labels: translation, zh-CN, needs-review, automated

      - name: Update Issue #260
        if: steps.git-check.outputs.changed == 'true'
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: 260,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `🤖 **自动翻译更新 / Automated Translation Update**

              新的中文翻译已准备就绪,需要社区审阅:
              - PR: #${{ steps.create-pr.outputs.pull-request-number }}
              - 文件: ${{ needs.detect-changes.outputs.changed-files }}

              请志愿者帮忙审阅,谢谢!
              / Community review needed, thanks!`
            })
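The "Detect changed files" filter chain in the workflow above can be exercised locally by feeding it canned paths in place of `git diff` output; the pipeline is copied verbatim from the workflow, only the `printf` input is invented:

```shell
# Same filter chain as the workflow's detect step, with canned input
printf 'docs/a.md\ndocs/zh-CN/b.md\ndocs/archive/c.md\nREADME.md\ndocs/d.md\n' \
  | grep "^docs/" \
  | grep -v "^docs/zh-CN/" \
  | grep -v "^docs/archive/" \
  | grep "\.md$" \
  | tr '\n' ',' \
  | sed 's/,$//'
# → docs/a.md,docs/d.md
```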
18
=0.24.0
Normal file
@@ -0,0 +1,18 @@
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try 'pacman -S
    python-xyz', where xyz is the package you are trying to
    install.

    If you wish to install a non-Arch-packaged Python package,
    create a virtual environment using 'python -m venv path/to/venv'.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip.

    If you wish to install a non-Arch packaged Python application,
    it may be easiest to use 'pipx install xyz', which will manage a
    virtual environment for you. Make sure you have python-pipx
    installed via pacman.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.
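As the error itself advises, a virtual environment sidesteps the externally-managed restriction; a minimal sketch (the venv path is arbitrary):

```shell
# Create an isolated environment instead of touching system site-packages
python3 -m venv /tmp/translate-venv
/tmp/translate-venv/bin/python -c 'import sys; print("venv" if sys.prefix != sys.base_prefix else "system")'
# → venv
```

Installs would then go through `/tmp/translate-venv/bin/pip` (or `pipx install`, as the message suggests, for standalone applications).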
12
CHANGELOG.md
@@ -5,6 +5,18 @@ All notable changes to Skill Seeker will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Changed

- **Explicit chunk flag names** — All `--chunk-*` flags now include unit suffixes to eliminate ambiguity:
  - `--chunk-size` (RAG tokens) → `--chunk-tokens`
  - `--chunk-overlap` (RAG tokens) → `--chunk-overlap-tokens`
  - `--chunk` (enable RAG chunking) → `--chunk-for-rag`
  - `--streaming-chunk-size` (chars) → `--streaming-chunk-chars`
  - `--streaming-overlap` (chars) → `--streaming-overlap-chars`
  - `--chunk-size` in PDF extractor (pages) → `--pdf-pages-per-chunk`
- **`setup_logging()` centralized** — Removed duplicate `logging.basicConfig()` calls in `github_scraper.py`, `codebase_scraper.py`, and `unified_scraper.py`; all now use the shared `setup_logging()` from `utils.py`
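Each rename above is a plain token substitution, so local scripts can be migrated mechanically; a rough sed sketch (not a project-provided tool; the trailing space in each pattern keeps the bare `--chunk` rule from matching the longer flags, and that rule must come last):

```shell
# Demo of the token-level renames; --chunk is rewritten last so the longer
# flags are already out of the way (trailing spaces prevent prefix matches)
echo 'skill-seekers scrape --chunk --chunk-size 512 --chunk-overlap 50' \
  | sed -e 's/--chunk-size /--chunk-tokens /' \
        -e 's/--chunk-overlap /--chunk-overlap-tokens /' \
        -e 's/--chunk /--chunk-for-rag /'
# → skill-seekers scrape --chunk-for-rag --chunk-tokens 512 --chunk-overlap-tokens 50
```

A real migration would also need the streaming and PDF variants, and a flag at end-of-line (no trailing space) would be missed by this sketch.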
## [3.1.2] - 2026-02-24

### 🔧 Fix `create` Command Argument Forwarding, Gemini Model, and Enhance Dispatcher
345
TESTING_GAP_REPORT.md
Normal file
@@ -0,0 +1,345 @@
# Comprehensive Testing Gap Report

**Project:** Skill Seekers v3.1.0
**Date:** 2026-02-22
**Total Test Files:** 113
**Total Test Functions:** ~208+ (collected: 2173 tests)

---

## Executive Summary

### Overall Test Health: 🟡 GOOD with Gaps

| Category | Status | Coverage | Key Gaps |
|----------|--------|----------|----------|
| CLI Arguments | ✅ Good | 85% | Some edge cases |
| Workflow System | ✅ Excellent | 90% | Inline stage parsing edge cases |
| Scrapers | 🟡 Moderate | 70% | Missing real HTTP/PDF tests |
| Enhancement | 🟡 Partial | 60% | Core logic not tested |
| MCP Tools | 🟡 Good | 75% | 9 tools not covered |
| Integration/E2E | 🟡 Moderate | 65% | Heavy mocking |
| Adaptors | ✅ Good | 80% | Good coverage per platform |

---

## Detailed Findings by Category

### 1. CLI Argument Tests ✅ GOOD

**Files Reviewed:**
- `test_analyze_command.py` (269 lines, 26 tests)
- `test_unified.py` - TestUnifiedCLIArguments class (6 tests)
- `test_pdf_scraper.py` - TestPDFCLIArguments class (4 tests)
- `test_create_arguments.py` (399 lines)
- `test_create_integration_basic.py` (310 lines, 23 tests)

**Strengths:**
- All new workflow flags are tested (`--enhance-workflow`, `--enhance-stage`, `--var`, `--workflow-dry-run`)
- Argument parsing thoroughly tested
- Default values verified
- Complex command combinations tested

**Gaps:**
- `test_create_integration_basic.py`: 2 tests skipped (source auto-detection not fully tested)
- No tests for invalid argument combinations beyond basic parsing errors

---

### 2. Workflow Tests ✅ EXCELLENT

**Files Reviewed:**
- `test_workflow_runner.py` (445 lines, 30+ tests)
- `test_workflows_command.py` (571 lines, 40+ tests)
- `test_workflow_tools_mcp.py` (295 lines, 20+ tests)

**Strengths:**
- Comprehensive workflow execution tests
- Variable substitution thoroughly tested
- Dry-run mode tested
- Workflow chaining tested
- All 6 workflow subcommands tested (list, show, copy, add, remove, validate)
- MCP workflow tools tested

**Minor Gaps:**
- No tests for `_build_inline_engine` edge cases
- No tests for malformed stage specs (empty, invalid format)

---

### 3. Scraper Tests 🟡 MODERATE with Significant Gaps

**Files Reviewed:**
- `test_scraper_features.py` (524 lines) - Doc scraper features
- `test_codebase_scraper.py` (478 lines) - Codebase analysis
- `test_pdf_scraper.py` (558 lines) - PDF scraper
- `test_github_scraper.py` (1015 lines) - GitHub scraper
- `test_unified_analyzer.py` (428 lines) - Unified analyzer

**Critical Gaps:**

#### A. Missing Real External Resource Tests
| Resource | Test Type | Status |
|----------|-----------|--------|
| HTTP Requests (docs) | Mocked only | ❌ Gap |
| PDF Extraction | Mocked only | ❌ Gap |
| GitHub API | Mocked only | ❌ Gap (acceptable) |
| Local Files | Real tests | ✅ Good |

#### B. Missing Core Function Tests
| Function | Location | Priority |
|----------|----------|----------|
| `UnifiedScraper.run()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_documentation()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_github()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_pdf()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_local()` | unified_scraper.py | 🟡 Medium |
| `DocToSkillConverter.scrape()` | doc_scraper.py | 🔴 High |
| `PDFToSkillConverter.extract_pdf()` | pdf_scraper.py | 🔴 High |

#### C. PDF Scraper Limited Coverage
- No actual PDF parsing tests (only mocked)
- OCR functionality not tested
- Page range extraction not tested

---

### 4. Enhancement Tests 🟡 PARTIAL - MAJOR GAPS

**Files Reviewed:**
- `test_enhance_command.py` (367 lines, 25+ tests)
- `test_enhance_skill_local.py` (163 lines, 14 tests)

**Critical Gap in `test_enhance_skill_local.py`:**

| Function | Lines | Tested? | Priority |
|----------|-------|---------|----------|
| `summarize_reference()` | ~50 | ❌ No | 🔴 High |
| `create_enhancement_prompt()` | ~200 | ❌ No | 🔴 High |
| `run()` | ~100 | ❌ No | 🔴 High |
| `_run_headless()` | ~130 | ❌ No | 🔴 High |
| `_run_background()` | ~80 | ❌ No | 🟡 Medium |
| `_run_daemon()` | ~60 | ❌ No | 🟡 Medium |
| `write_status()` | ~30 | ❌ No | 🟡 Medium |
| `read_status()` | ~40 | ❌ No | 🟡 Medium |
| `detect_terminal_app()` | ~80 | ❌ No | 🟡 Medium |

**Current Tests Only Cover:**
- Agent presets configuration
- Command building
- Agent name normalization
- Environment variable handling

**Recommendation:** Add comprehensive tests for the core enhancement logic.

---

### 5. MCP Tool Tests 🟡 GOOD with Coverage Gaps

**Files Reviewed:**
- `test_mcp_fastmcp.py` (868 lines)
- `test_mcp_server.py` (715 lines)
- `test_mcp_vector_dbs.py` (259 lines)
- `test_real_world_fastmcp.py` (558 lines)

**Coverage Analysis:**

| Tool Category | Tools | Tested | Coverage |
|---------------|-------|--------|----------|
| Config Tools | 3 | 3 | ✅ 100% |
| Scraping Tools | 8 | 4 | 🟡 50% |
| Packaging Tools | 4 | 4 | ✅ 100% |
| Splitting Tools | 2 | 2 | ✅ 100% |
| Source Tools | 5 | 5 | ✅ 100% |
| Vector DB Tools | 4 | 4 | ✅ 100% |
| Workflow Tools | 5 | 0 | ❌ 0% |
| **Total** | **31** | **22** | **🟡 71%** |

**Untested Tools:**
1. `detect_patterns`
2. `extract_test_examples`
3. `build_how_to_guides`
4. `extract_config_patterns`
5. `list_workflows`
6. `get_workflow`
7. `create_workflow`
8. `update_workflow`
9. `delete_workflow`

**Note:** `test_mcp_server.py` covers the legacy server, while `test_mcp_fastmcp.py` covers the modern server.

---

### 6. Integration/E2E Tests 🟡 MODERATE

**Files Reviewed:**
- `test_create_integration_basic.py` (310 lines)
- `test_e2e_three_stream_pipeline.py` (598 lines)
- `test_analyze_e2e.py` (344 lines)
- `test_install_skill_e2e.py` (533 lines)
- `test_c3_integration.py` (362 lines)

**Issues Found:**

1. **Skipped Tests:**
   - `test_create_detects_web_url` - Source auto-detection incomplete
   - `test_create_invalid_source_shows_error` - Error handling incomplete
   - `test_cli_via_unified_command` - Asyncio issues

2. **Heavy Mocking:**
   - Most GitHub API tests use mocking
   - No real HTTP tests for doc scraping
   - Integration tests don't test actual integration

3. **Limited Scope:**
   - Only the `--quick` preset tested (not `--comprehensive`)
   - C3.x tests use mock data only
   - Most E2E tests are unit tests with mocks

---

### 7. Adaptor Tests ✅ GOOD

**Files Reviewed:**
- `test_adaptors/test_adaptors_e2e.py` (893 lines)
- `test_adaptors/test_claude_adaptor.py` (314 lines)
- `test_adaptors/test_gemini_adaptor.py` (146 lines)
- `test_adaptors/test_openai_adaptor.py` (188 lines)
- Plus 8 more platform adaptors

**Strengths:**
- Each adaptor has dedicated tests
- Package format testing
- Upload success/failure scenarios
- Platform-specific features tested

**Minor Gaps:**
- Some adaptors only test 1-2 scenarios
- Error handling coverage varies by platform

---

### 8. Config/Validation Tests ✅ GOOD

**Files Reviewed:**
- `test_config_validation.py` (270 lines)
- `test_config_extractor.py` (629 lines)
- `test_config_fetcher.py` (340 lines)

**Strengths:**
- Unified vs legacy format detection
- Field validation comprehensive
- Error message quality tested

---

## Summary of Critical Testing Gaps

### 🔴 HIGH PRIORITY (Must Fix)

1. **Enhancement Core Logic**
   - File: `test_enhance_skill_local.py`
   - Missing: 9 major functions
   - Impact: Core feature untested

2. **Unified Scraper Main Flow**
   - File: New tests needed
   - Missing: `_scrape_*()` methods, `run()` orchestration
   - Impact: Multi-source scraping untested

3. **Actual HTTP/PDF/GitHub Integration**
   - Missing: Real external resource tests
   - Impact: Only mock tests exist

### 🟡 MEDIUM PRIORITY (Should Fix)

4. **MCP Workflow Tools**
   - Missing: 5 workflow tools (0% coverage)
   - Impact: MCP workflow features untested

5. **Skipped Integration Tests**
   - 3 tests skipped
   - Impact: Source auto-detection incomplete

6. **PDF Real Extraction**
   - Missing: Actual PDF parsing
   - Impact: PDF feature quality unknown

### 🟢 LOW PRIORITY (Nice to Have)

7. **Additional Scraping Tools**
   - Missing: 4 scraping tool tests
   - Impact: Low (core tools covered)

8. **Edge Case Coverage**
   - Missing: Invalid argument combinations
   - Impact: Low (happy path covered)

---

## Recommendations

### Immediate Actions (Next Sprint)

1. **Add Enhancement Logic Tests** (~400 lines)
   - Test `summarize_reference()`
   - Test `create_enhancement_prompt()`
   - Test `run()` method
   - Test status read/write
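The status read/write item could start from a simple round-trip test; a sketch assuming the helpers serialize a dict to a JSON status file (the stand-ins below encode that assumption and are not the real `enhance_skill_local` signatures):

```python
import json
import tempfile
from pathlib import Path

# Stand-ins encoding the assumed contract; the real helpers live in
# enhance_skill_local and may take different arguments
def write_status(path: Path, status: dict) -> None:
    path.write_text(json.dumps(status))

def read_status(path: Path) -> dict:
    return json.loads(path.read_text())

def test_status_round_trip() -> None:
    with tempfile.TemporaryDirectory() as d:
        status_file = Path(d) / "enhancement_status.json"
        write_status(status_file, {"stage": "enhance", "progress": 0.5, "done": False})
        assert read_status(status_file) == {"stage": "enhance", "progress": 0.5, "done": False}

test_status_round_trip()
```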
2. **Fix Skipped Tests** (~100 lines)
   - Fix asyncio issues in `test_cli_via_unified_command`
   - Complete source auto-detection tests

3. **Add MCP Workflow Tool Tests** (~200 lines)
   - Test all 5 workflow tools

### Short Term (Next Month)

4. **Add Unified Scraper Integration Tests** (~300 lines)
   - Test main orchestration flow
   - Test individual source scraping

5. **Add Real PDF Tests** (~150 lines)
   - Test with actual PDF files
   - Test OCR if available

### Long Term (Next Quarter)

6. **HTTP Integration Tests** (~200 lines)
   - Test with real websites (use test sites)
   - Mock server approach

7. **Complete E2E Pipeline** (~300 lines)
   - Full workflow from scrape to upload
   - Real GitHub repo (fork test repo)

---

## Test Quality Metrics

| Metric | Score | Notes |
|--------|-------|-------|
| Test Count | 🟢 Good | 2173+ tests |
| Coverage | 🟡 Moderate | ~75% estimated |
| Real Tests | 🟡 Moderate | Many mocked |
| Documentation | 🟢 Good | Most tests documented |
| Maintenance | 🟢 Good | Tests recently updated |

---

## Conclusion

The Skill Seekers test suite is **comprehensive in quantity** (2173+ tests) but has **quality gaps** in critical areas:

1. **Core enhancement logic** is largely untested
2. **Multi-source scraping** orchestration lacks integration tests
3. **MCP workflow tools** have zero coverage
4. **Real external resource** testing is minimal

**Priority:** Fix the 🔴 HIGH priority gaps first, as they impact core functionality.

---

*Report generated: 2026-02-22*
*Reviewer: Systematic test review with parallel subagent analysis*
@@ -71,7 +71,7 @@ skill-seekers pdf manual.pdf --name product-manual
skill-seekers pdf scanned.pdf --enable-ocr

# Large PDF (chunked processing)
-skill-seekers pdf large.pdf --chunk-size 50
+skill-seekers pdf large.pdf --pdf-pages-per-chunk 50
```

### Multi-Source Scraping

@@ -122,7 +122,7 @@ python3 cli/pdf_extractor_poc.py documentation.pdf \
  --extract-images \
  --min-image-size 150 \
  --min-quality 6.0 \
-  --chunk-size 20 \
+  --pdf-pages-per-chunk 20 \
  --output documentation.json \
  --verbose \
  --pretty

@@ -477,7 +477,7 @@ python3 cli/pdf_extractor_poc.py manual.pdf \
  --image-dir assets/images/ \
  --min-image-size 200 \
  --min-quality 7.0 \
-  --chunk-size 15 \
+  --pdf-pages-per-chunk 15 \
  --output manual.json \
  --verbose \
  --pretty

@@ -25,10 +25,10 @@ Break large PDFs into smaller, manageable chunks:
python3 cli/pdf_extractor_poc.py input.pdf

# Custom chunk size (20 pages per chunk)
-python3 cli/pdf_extractor_poc.py input.pdf --chunk-size 20
+python3 cli/pdf_extractor_poc.py input.pdf --pdf-pages-per-chunk 20

# Disable chunking (single chunk with all pages)
-python3 cli/pdf_extractor_poc.py input.pdf --chunk-size 0
+python3 cli/pdf_extractor_poc.py input.pdf --pdf-pages-per-chunk 0
```

### ✅ 2. Chapter/Section Detection

@@ -272,7 +272,7 @@ cat manual.json | jq '.total_chunks'

```bash
# Large PDF with bigger chunks (50 pages each)
-python3 cli/pdf_extractor_poc.py large_manual.pdf --chunk-size 50 -o output.json -v
+python3 cli/pdf_extractor_poc.py large_manual.pdf --pdf-pages-per-chunk 50 -o output.json -v

# Verbose output shows:
# 📦 Creating chunks (chunk_size=50)...

@@ -286,7 +286,7 @@ python3 cli/pdf_extractor_poc.py large_manual.pdf --chunk-size 50 -o output.json

```bash
# Process all pages as single chunk
-python3 cli/pdf_extractor_poc.py small_doc.pdf --chunk-size 0 -o output.json
+python3 cli/pdf_extractor_poc.py small_doc.pdf --pdf-pages-per-chunk 0 -o output.json
```

---

@@ -369,7 +369,7 @@ Create a test PDF with chapters:
3. Page 30: "Chapter 3: API Reference"

```bash
-python3 cli/pdf_extractor_poc.py test.pdf -o test.json --chunk-size 20 -v
+python3 cli/pdf_extractor_poc.py test.pdf -o test.json --pdf-pages-per-chunk 20 -v

# Verify chapters detected
cat test.json | jq '.chapters'

@@ -441,7 +441,7 @@ The chunking feature lays groundwork for:
**Example workflow:**
```bash
# Extract large manual with chapters
-python3 cli/pdf_extractor_poc.py large_manual.pdf --chunk-size 25 -o manual.json
+python3 cli/pdf_extractor_poc.py large_manual.pdf --pdf-pages-per-chunk 25 -o manual.json

# Future: Build skill from chunks
python3 cli/build_skill_from_pdf.py manual.json
@@ -223,7 +223,7 @@ skill-seekers package output/codebase --target langchain

**Option D: RAG-Optimized Chunking**
```bash
-skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-size 512
+skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-tokens 512
skill-seekers package output/fastapi --target langchain
```

@@ -968,7 +968,7 @@ collection.add(

2. **Implement Semantic Chunking:**
```bash
-skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-size 512
+skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-tokens 512
```

3. **Set Up Multi-Collection Search:**

@@ -255,7 +255,7 @@ skill-seekers package output/codebase --target langchain

**Option D: RAG-Optimized Chunking**
```bash
-skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-size 512
+skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-tokens 512
skill-seekers package output/fastapi --target langchain
```

@@ -318,8 +318,8 @@ print(response["llm"]["replies"][0])
# Enable semantic chunking (preserves code blocks, respects paragraphs)
skill-seekers scrape --config configs/django.json \
  --chunk-for-rag \
-  --chunk-size 512 \
-  --chunk-overlap 50
+  --chunk-tokens 512 \
+  --chunk-overlap-tokens 50

# Package chunked output
skill-seekers package output/django --target haystack

@@ -439,8 +439,8 @@ python scripts/merge_documents.py \
# Enable chunking for frameworks with long pages
skill-seekers scrape --config configs/django.json \
  --chunk-for-rag \
-  --chunk-size 512 \
-  --chunk-overlap 50
+  --chunk-tokens 512 \
+  --chunk-overlap-tokens 50
```

### 2. Choose Right Document Store

@@ -506,8 +506,8 @@ Complete example of building a FastAPI documentation chatbot:
# Scrape FastAPI docs with chunking
skill-seekers scrape --config configs/fastapi.json \
  --chunk-for-rag \
-  --chunk-size 512 \
-  --chunk-overlap 50 \
+  --chunk-tokens 512 \
+  --chunk-overlap-tokens 50 \
  --max-pages 200

# Package for Haystack

@@ -698,8 +698,8 @@ skill-seekers scrape --config configs/fastapi.json --chunk-for-rag
# 2. Adjust chunk size
skill-seekers scrape --config configs/fastapi.json \
  --chunk-for-rag \
-  --chunk-size 768 \ # Larger chunks for more context
-  --chunk-overlap 100 # More overlap for continuity
+  --chunk-tokens 768 \ # Larger chunks for more context
+  --chunk-overlap-tokens 100 # More overlap for continuity

# 3. Use hybrid search (BM25 + embeddings)
# See Advanced Usage section

@@ -270,7 +270,7 @@ skill-seekers package output/codebase --target langchain

**Option D: RAG-Optimized Chunking**
```bash
-skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-size 512
+skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-tokens 512
skill-seekers package output/fastapi --target langchain
```

@@ -210,7 +210,7 @@ skill-seekers package output/codebase --target langchain

**Option D: RAG-Optimized Chunking**
```bash
-skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-size 512
+skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-tokens 512
skill-seekers package output/fastapi --target langchain
```

@@ -960,7 +960,7 @@ print(schema.get("multiTenancyConfig", {}).get("enabled"))  # Should be True

2. **Implement Semantic Chunking:**
```bash
-skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-size 512
+skill-seekers scrape --config configs/fastapi.json --chunk-for-rag --chunk-tokens 512
```

3. **Set Up Multi-Tenancy:**
@@ -252,8 +252,8 @@ skill-seekers create [source] [options]
| | `--workflow-dry-run` | | Preview workflow without executing |
| | `--dry-run` | | Preview without creating |
| | `--chunk-for-rag` | | Enable RAG chunking |
-| | `--chunk-size` | 512 | Chunk size in tokens |
-| | `--chunk-overlap` | 50 | Chunk overlap in tokens |
+| | `--chunk-tokens` | 512 | Chunk size in tokens |
+| | `--chunk-overlap-tokens` | 50 | Chunk overlap in tokens |
| | `--help-web` | | Show web scraping options |
| | `--help-github` | | Show GitHub options |
| | `--help-local` | | Show local analysis options |

@@ -615,10 +615,10 @@ skill-seekers package SKILL_DIRECTORY [options]
| | `--skip-quality-check` | | Skip quality checks |
| | `--upload` | | Auto-upload after packaging |
| | `--streaming` | | Streaming mode for large docs |
-| | `--chunk-size` | 4000 | Max chars per chunk (streaming) |
-| | `--chunk-overlap` | 200 | Overlap between chunks |
+| | `--streaming-chunk-chars` | 4000 | Max chars per chunk (streaming) |
+| | `--streaming-overlap-chars` | 200 | Overlap between chunks (chars) |
| | `--batch-size` | 100 | Chunks per batch |
-| | `--chunk` | | Enable RAG chunking |
+| | `--chunk-for-rag` | | Enable RAG chunking |
| | `--chunk-tokens` | 512 | Max tokens per chunk |
| | `--no-preserve-code` | | Allow code block splitting |

@@ -877,7 +877,7 @@ skill-seekers stream --config CONFIG [options]
| Short | Long | Description |
|-------|------|-------------|
| `-c` | `--config` | Config JSON file |
-| | `--chunk-size` | Size of each chunk |
+| | `--streaming-chunk-chars` | Maximum characters per chunk (default: 4000) |
| | `--output` | Output directory |

**Examples:**

@@ -887,7 +887,7 @@ skill-seekers stream --config CONFIG [options]
skill-seekers stream --config configs/large-docs.json

# Custom chunk size
-skill-seekers stream --config configs/large-docs.json --chunk-size 1000
+skill-seekers stream --config configs/large-docs.json --streaming-chunk-chars 1000
```

---
@@ -365,8 +365,8 @@ Position Skill Seekers as **the universal documentation preprocessor** for the e
2. **Implement Chunking for RAG** (8-12 hours)
```bash
skill-seekers scrape --chunk-for-rag \
-  --chunk-size 512 \
-  --chunk-overlap 50 \
+  --chunk-tokens 512 \
+  --chunk-overlap-tokens 50 \
  --preserve-code-blocks
```

@@ -139,8 +139,8 @@ skill-seekers scrape --format confluence  # Confluence storage format
```bash
# New flag for embedding-optimized chunking
skill-seekers scrape --chunk-for-rag \
-  --chunk-size 512 \
-  --chunk-overlap 50 \
+  --chunk-tokens 512 \
+  --chunk-overlap-tokens 50 \
  --add-metadata

# Output: chunks with metadata for embedding

@@ -385,7 +385,7 @@ skill-seekers create <url> --max-pages 100
skill-seekers create <url> --streaming

# Or smaller chunks
-skill-seekers create <url> --chunk-size 500
+skill-seekers create <url> --chunk-tokens 500
```

---

@@ -158,8 +158,8 @@ skill-seekers package output/large-skill/ --streaming
# Custom chunk size
skill-seekers package output/large-skill/ \
  --streaming \
-  --chunk-size 2000 \
-  --chunk-overlap 100
+  --streaming-chunk-chars 2000 \
+  --streaming-overlap-chars 100
```

**When to use:**
@@ -177,23 +177,23 @@ Optimize for Retrieval-Augmented Generation:
 # Enable semantic chunking
 skill-seekers package output/my-skill/ \
   --target langchain \
-  --chunk \
+  --chunk-for-rag \
   --chunk-tokens 512
 
 # Custom chunk size
 skill-seekers package output/my-skill/ \
   --target chroma \
   --chunk-tokens 256 \
-  --chunk-overlap 50
+  --chunk-overlap-tokens 50
 ```
 
 **Chunking Options:**
 
 | Option | Default | Description |
 |--------|---------|-------------|
-| `--chunk` | auto | Enable chunking |
+| `--chunk-for-rag` | auto | Enable chunking |
 | `--chunk-tokens` | 512 | Tokens per chunk |
-| `--chunk-overlap` | 50 | Overlap between chunks |
+| `--chunk-overlap-tokens` | 50 | Overlap between chunks (tokens) |
 | `--no-preserve-code` | - | Allow splitting code blocks |
 
 ---
@@ -449,7 +449,7 @@ skill-seekers upload output/my-skill-claude.zip --target claude
 skill-seekers package output/my-skill/ --streaming
 
 # Smaller chunks
-skill-seekers package output/my-skill/ --streaming --chunk-size 1000
+skill-seekers package output/my-skill/ --streaming --streaming-chunk-chars 1000
 ```
 
 ---
@@ -295,7 +295,7 @@ skill-seekers package output/my-skill/ --streaming
 # Reduce chunk size
 skill-seekers package output/my-skill/ \
   --streaming \
-  --chunk-size 1000
+  --streaming-chunk-chars 1000
 ```
 
 ---
@@ -237,8 +237,8 @@ skill-seekers create [source] [options]
 | | `--workflow-dry-run` | | Preview workflow without executing |
 | | `--dry-run` | | Preview without creating |
 | | `--chunk-for-rag` | | Enable RAG chunking |
-| | `--chunk-size` | 512 | Chunk size in tokens |
-| | `--chunk-overlap` | 50 | Chunk overlap in tokens |
+| | `--chunk-tokens` | 512 | Chunk size in tokens |
+| | `--chunk-overlap-tokens` | 50 | Chunk overlap in tokens |
 | | `--help-web` | | Show web scraping options |
 | | `--help-github` | | Show GitHub options |
 | | `--help-local` | | Show local analysis options |
@@ -593,10 +593,10 @@ skill-seekers package SKILL_DIRECTORY [options]
 | | `--skip-quality-check` | | Skip quality checks |
 | | `--upload` | | Auto-upload after packaging |
 | | `--streaming` | | Streaming mode for large docs |
-| | `--chunk-size` | 4000 | Max chars per chunk (streaming) |
-| | `--chunk-overlap` | 200 | Overlap between chunks |
+| | `--streaming-chunk-chars` | 4000 | Max chars per chunk (streaming) |
+| | `--streaming-overlap-chars` | 200 | Overlap between chunks (chars) |
 | | `--batch-size` | 100 | Chunks per batch |
-| | `--chunk` | | Enable RAG chunking |
+| | `--chunk-for-rag` | | Enable RAG chunking |
 | | `--chunk-tokens` | 512 | Max tokens per chunk |
 | | `--no-preserve-code` | | Allow code block splitting |
@@ -847,7 +847,7 @@ skill-seekers stream --config CONFIG [options]
 | Short | Long | Description |
 |-------|------|-------------|
 | `-c` | `--config` | Config JSON file |
-| | `--chunk-size` | Size of each chunk |
+| | `--streaming-chunk-chars` | Maximum characters per chunk (default: 4000) |
 | | `--output` | Output directory |
 
 **Examples:**
@@ -857,7 +857,7 @@ skill-seekers stream --config CONFIG [options]
 skill-seekers stream --config configs/large-docs.json
 
 # Custom chunk size
-skill-seekers stream --config configs/large-docs.json --chunk-size 1000
+skill-seekers stream --config configs/large-docs.json --streaming-chunk-chars 1000
 ```
 
 ---
@@ -385,7 +385,7 @@ skill-seekers create <url> --max-pages 100
 skill-seekers create <url> --streaming
 
 # Or smaller chunks
-skill-seekers create <url> --chunk-size 500
+skill-seekers create <url> --chunk-tokens 500
 ```
 
 ---
@@ -158,8 +158,8 @@ skill-seekers package output/large-skill/ --streaming
 # Custom chunk size
 skill-seekers package output/large-skill/ \
   --streaming \
-  --chunk-size 2000 \
-  --chunk-overlap 100
+  --streaming-chunk-chars 2000 \
+  --streaming-overlap-chars 100
 ```
 
 **When to use:**
@@ -177,23 +177,23 @@ Optimize for Retrieval-Augmented Generation:
 # Enable semantic chunking
 skill-seekers package output/my-skill/ \
   --target langchain \
-  --chunk \
+  --chunk-for-rag \
   --chunk-tokens 512
 
 # Custom chunk size
 skill-seekers package output/my-skill/ \
   --target chroma \
   --chunk-tokens 256 \
-  --chunk-overlap 50
+  --chunk-overlap-tokens 50
 ```
 
 **Chunking Options:**
 
 | Option | Default | Description |
 |--------|---------|-------------|
-| `--chunk` | auto | Enable chunking |
+| `--chunk-for-rag` | auto | Enable chunking |
 | `--chunk-tokens` | 512 | Tokens per chunk |
-| `--chunk-overlap` | 50 | Overlap between chunks |
+| `--chunk-overlap-tokens` | 50 | Overlap between chunks (tokens) |
 | `--no-preserve-code` | - | Allow splitting code blocks |
 
 ---
@@ -449,7 +449,7 @@ skill-seekers upload output/my-skill-claude.zip --target claude
 skill-seekers package output/my-skill/ --streaming
 
 # Smaller chunks
-skill-seekers package output/my-skill/ --streaming --chunk-size 1000
+skill-seekers package output/my-skill/ --streaming --streaming-chunk-chars 1000
 ```
 
 ---
@@ -295,7 +295,7 @@ skill-seekers package output/my-skill/ --streaming
 # Reduce chunk size
 skill-seekers package output/my-skill/ \
   --streaming \
-  --chunk-size 1000
+  --streaming-chunk-chars 1000
 ```
 
 ---
@@ -132,7 +132,7 @@ For better retrieval quality, use semantic chunking:
 
 ```bash
 # Generate with chunking
-skill-seekers scrape --config configs/react.json --max-pages 100 --chunk-for-rag --chunk-size 512 --chunk-overlap 50
+skill-seekers scrape --config configs/react.json --max-pages 100 --chunk-for-rag --chunk-tokens 512 --chunk-overlap-tokens 50
 
 # Use chunked output
 python quickstart.py --chunked
@@ -6,7 +6,6 @@ applies_to:
   - doc_scraping
 variables:
   depth: comprehensive
-  alternatives: []
 stages:
   - name: feature_comparison
     type: custom
@@ -164,5 +164,5 @@ post_process:
   add_metadata:
     enhanced: true
     workflow: data-validation
-    domain: ml
+    domain: backend
     has_validation_docs: true
@@ -17,6 +17,46 @@ stages:
     target: examples
     enabled: true
     uses_history: false
+  - name: architecture_overview
+    type: custom
+    target: architecture
+    uses_history: false
+    enabled: true
+    prompt: >
+      Provide a concise architectural overview of this codebase.
+
+      Cover:
+      1. Overall architecture style (MVC, microservices, layered, etc.)
+      2. Key components and their responsibilities
+      3. Data flow between components
+      4. External dependencies and integrations
+      5. Entry points (CLI, API, web, etc.)
+
+      Output JSON with:
+      - "architecture_style": main architectural pattern
+      - "components": array of {name, responsibility}
+      - "data_flow": how data moves through the system
+      - "external_deps": third-party services and libraries
+      - "entry_points": how users interact with the system
+  - name: skill_polish
+    type: custom
+    target: skill_md
+    uses_history: true
+    enabled: true
+    prompt: >
+      Review the SKILL.md content generated so far and improve it.
+
+      Fix:
+      1. Unclear or overly technical descriptions
+      2. Missing quick-start examples
+      3. Gaps in the overview section
+      4. Redundant or duplicate information
+      5. Formatting inconsistencies
+
+      Output JSON with:
+      - "improved_overview": rewritten overview section
+      - "quick_start": concise getting-started snippet
+      - "key_concepts": 3-5 essential concepts a developer needs to know
 post_process:
   reorder_sections: []
   add_metadata:
@@ -14,12 +14,17 @@ stages:
     uses_history: false
     enabled: true
     prompt: >
-      Review the following SKILL.md content and make minimal improvements:
-      - Fix obvious formatting issues
-      - Ensure the overview section is clear and concise
-      - Remove duplicate or redundant information
-
-      Return the improved content as plain text without extra commentary.
+      Review the SKILL.md content and make minimal targeted improvements.
+
+      Fix only:
+      1. Obvious formatting issues (broken lists, inconsistent headers)
+      2. Unclear overview section (make it one clear paragraph)
+      3. Duplicate or redundant information (remove repeats)
+
+      Output JSON with:
+      - "improved_overview": rewritten overview paragraph (plain markdown)
+      - "removed_sections": list of section names that were removed as duplicates
+      - "formatting_fixes": list of specific formatting issues corrected
 post_process:
   reorder_sections: []
   add_metadata:
@@ -3,9 +3,7 @@ description: "Security-focused review: vulnerabilities, auth, data handling"
 version: "1.0"
 applies_to:
   - codebase_analysis
-  - python
-  - javascript
-  - typescript
+  - github_analysis
 variables:
   depth: comprehensive
 stages:
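The renames in this commit can be applied mechanically to existing scripts. A minimal sketch of such a migration helper (the `migrate` function is hypothetical, not part of skill-seekers; the flag mapping mirrors this commit, and the mode detection via `--streaming` / the `stream` subcommand is a simplifying assumption, since the old `--chunk-size`/`--chunk-overlap` names meant tokens for `scrape`/`create` but characters in streaming mode):

```python
import re

# Old -> new flag names, as renamed in this commit.
# Token-based chunking flags (scrape/create, and non-streaming package).
TOKEN_FLAGS = {
    "--chunk-size": "--chunk-tokens",
    "--chunk-overlap": "--chunk-overlap-tokens",
    "--chunk": "--chunk-for-rag",
}
# Character-based streaming flags (package --streaming, stream).
CHAR_FLAGS = {
    "--chunk-size": "--streaming-chunk-chars",
    "--chunk-overlap": "--streaming-overlap-chars",
    "--streaming-chunk-size": "--streaming-chunk-chars",
    "--streaming-overlap": "--streaming-overlap-chars",
}

def migrate(cmd: str) -> str:
    """Rewrite a skill-seekers command line to use the new flag names.

    Heuristic: a command that uses --streaming or the stream subcommand
    is assumed to use the character-based chunking flags.
    """
    streaming = "--streaming" in cmd or " stream " in cmd
    mapping = CHAR_FLAGS if streaming else TOKEN_FLAGS
    # Replace longest names first; the lookahead keeps --chunk from
    # matching inside longer flags such as --chunk-for-rag.
    for old, new in sorted(mapping.items(), key=lambda kv: -len(kv[0])):
        cmd = re.sub(rf"{re.escape(old)}(?=\s|=|$)", new, cmd)
    return cmd
```

For example, `migrate("skill-seekers scrape --chunk-for-rag --chunk-size 512")` yields the token-based spelling, while the same old flags after `--streaming` map to the character-based names.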