Implementation Gaps Analysis - Current Codebase
Analysis Date: 2026-02-16
Scope: Integration gaps, duplicate code, missing connections in CURRENT implementation
🚨 Critical Integration Gaps
1. Unified Scraper Does NOT Use Workflow Runner
Gap: unified_scraper.py has its own scraping logic instead of using the shared workflow_runner.py
Evidence:
$ grep -n "workflow_runner" src/skill_seekers/cli/unified_scraper.py
# (no results)
Other scrapers DO use workflow_runner:
- ✅ doc_scraper.py - uses run_workflows()
- ✅ github_scraper.py - uses run_workflows()
- ✅ pdf_scraper.py - uses run_workflows()
- ✅ codebase_scraper.py - uses run_workflows()
- ❌ unified_scraper.py - DOES NOT use run_workflows()
Impact:
- Unified scraper cannot use enhancement workflows
- Inconsistent behavior between single-source and multi-source scraping
- Code duplication in enhancement logic
Fix:
# Add to unified_scraper.py
from skill_seekers.cli.workflow_runner import run_workflows
# After scraping all sources
context = run_workflows(
    workflows=args.enhance_workflow,
    inline_stages=args.enhance_stage,
    scraper_context={"name": skill_name, "source_type": "unified"},
    args=args,
)
2. Duplicate Enhancer Classes (Old vs New)
Gap: Both old and new enhancer modules exist and are used simultaneously
Old modules (should be deprecated):
- ai_enhancer.py - Old AIEnhancer class
- config_enhancer.py - Old ConfigEnhancer class
- guide_enhancer.py - Old GuideEnhancer class
New unified module:
- unified_enhancer.py - New UnifiedEnhancer class (replaces all of the above)
Files still importing OLD modules:
architectural_pattern_detector.py → ai_enhancer.AIEnhancer
codebase_scraper.py → ai_enhancer.PatternEnhancer, config_enhancer.ConfigEnhancer
config_extractor.py → config_enhancer.ConfigEnhancer
enhancement_workflow.py → ai_enhancer.PatternEnhancer, TestExampleEnhancer, AIEnhancer
how_to_guide_builder.py → guide_enhancer.GuideEnhancer
pattern_recognizer.py → ai_enhancer.PatternEnhancer
test_example_extractor.py → ai_enhancer.TestExampleEnhancer
New unified_enhancer.py exports:
class UnifiedEnhancer: ...
class PatternEnhancer(UnifiedEnhancer): ...
class TestExampleEnhancer(UnifiedEnhancer): ...
class GuideEnhancer(UnifiedEnhancer): ...
class ConfigEnhancer(UnifiedEnhancer): ...
AIEnhancer = UnifiedEnhancer # Alias for compatibility
Impact:
- Maintenance burden (fix bugs in multiple places)
- Inconsistent behavior
- Confusion about which enhancer to use
- Larger codebase
Fix:
- Migrate all imports from old modules to unified_enhancer.py
- Deprecate old modules with warnings
- Eventually remove old modules
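To track the migration, the old imports can be located programmatically instead of by hand. The following is a minimal sketch, not part of the codebase: `find_legacy_imports` is a hypothetical helper, and the only names taken from this analysis are the three old module names.

```python
import ast

# Old module names, taken from the list of modules to deprecate.
LEGACY_MODULES = {"ai_enhancer", "config_enhancer", "guide_enhancer"}


def find_legacy_imports(source: str) -> list[tuple[int, str, str]]:
    """Return sorted (line, legacy_module, imported_name) tuples for one source file."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            # Compare the last dotted component so that both
            # "ai_enhancer" and "skill_seekers.cli.ai_enhancer" match.
            leaf = (node.module or "").rsplit(".", 1)[-1]
            for alias in node.names:
                if leaf in LEGACY_MODULES:
                    hits.append((node.lineno, leaf, alias.name))
                elif alias.name in LEGACY_MODULES:
                    # Covers "from . import ai_enhancer".
                    hits.append((node.lineno, alias.name, alias.name))
        elif isinstance(node, ast.Import):
            for alias in node.names:
                leaf = alias.name.rsplit(".", 1)[-1]
                if leaf in LEGACY_MODULES:
                    hits.append((node.lineno, leaf, leaf))
    return hits if not hits else sorted(hits)
```

Running this over each file in src/skill_seekers/cli/ and failing when the result is non-empty would give a simple regression guard once the migration is done.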
3. MCP Tools Missing Several CLI Commands
CLI Commands (21):
- ✅ create - Has MCP equivalent
- ✅ config - Has MCP equivalent
- ✅ scrape - Has MCP equivalent
- ✅ github - Has MCP equivalent
- ✅ package - Has MCP equivalent
- ✅ upload - Has MCP equivalent
- ✅ analyze - Has MCP equivalent (scrape_codebase)
- ✅ enhance - Has MCP equivalent
- ❌ enhance-status - NO MCP equivalent
- ✅ pdf - Has MCP equivalent
- ✅ unified - Has MCP equivalent (unified_scrape)
- ✅ estimate - Has MCP equivalent
- ✅ install - Has MCP equivalent
- ❌ install-agent - NO MCP equivalent
- ✅ extract-test-examples - Has MCP equivalent
- ❌ resume - NO MCP equivalent
- ❌ stream - NO MCP equivalent
- ❌ update - NO MCP equivalent
- ❌ multilang - NO MCP equivalent
- ❌ quality - NO MCP equivalent
- ✅ workflows - Has MCP equivalent
Missing in MCP (7 commands):
- enhance-status - Monitor background enhancement
- install-agent - Install to IDE agents (Cursor, etc.)
- resume - Resume interrupted jobs
- stream - Stream large files
- update - Incremental updates
- multilang - Multi-language docs
- quality - Quality scoring
Impact:
- Cannot use full functionality via MCP
- CLI and MCP have different capabilities
- Users restricted when using AI agents
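The parity gap above is a set difference, so it can be kept honest with a small check. This sketch hard-codes both sides from the list in this section. How the real command and tool names would be collected from the CLI and the MCP server is not shown here, and `missing_in_mcp` is a hypothetical helper.

```python
# CLI commands, transcribed in order from the list in this section.
CLI_COMMANDS = [
    "create", "config", "scrape", "github", "package", "upload", "analyze",
    "enhance", "enhance-status", "pdf", "unified", "estimate", "install",
    "install-agent", "extract-test-examples", "resume", "stream", "update",
    "multilang", "quality", "workflows",
]

# Commands marked as having an MCP equivalent in the same list.
MCP_COVERED = {
    "create", "config", "scrape", "github", "package", "upload", "analyze",
    "enhance", "pdf", "unified", "estimate", "install",
    "extract-test-examples", "workflows",
}


def missing_in_mcp(cli_commands: list[str], mcp_covered: set[str]) -> list[str]:
    """Return CLI commands with no MCP equivalent, preserving CLI order."""
    return [name for name in cli_commands if name not in mcp_covered]


if __name__ == "__main__":
    for name in missing_in_mcp(CLI_COMMANDS, MCP_COVERED):
        print(f"no MCP equivalent: {name}")
```

If both sides were derived from the live code instead of literals, the same function could back a test that fails whenever a new CLI command ships without an MCP tool.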
4. Create Command Does Not Use Unified Infrastructure
Gap: create_command.py routes to individual scrapers instead of using unified system
Current flow:
create_command.py → detects source → calls individual scraper
    → doc_scraper.main()
    → github_scraper.main()
    → pdf_scraper.main()
    → codebase_scraper.main()
Gap: Each scraper has its own argument parsing and workflow logic
Impact:
- Inconsistent argument handling
- Duplicated workflow code
- Harder to maintain
Note: This is partially mitigated by workflow_runner usage in individual scrapers
5. Conflict Detector Not Integrated with Unified Scraper
Gap: conflict_detector.py exists but may not be fully utilized
Evidence:
# unified_scraper.py imports it:
from skill_seekers.cli.conflict_detector import ConflictDetector
# But check integration depth...
Need to verify:
- Does unified scraper actually run conflict detection?
- Are conflicts reported to users?
- Can users act on conflict reports?
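The first question can be answered mechanically: an import alone does not mean the class is used. The sketch below checks whether a name is referenced anywhere beyond its import statement. `is_name_used` is a hypothetical helper, and it says nothing about whether conflicts are reported or actionable, which still needs a manual read of unified_scraper.py.

```python
import ast


def is_name_used(source: str, name: str) -> bool:
    """Return True if `name` is read anywhere in `source` (called, instantiated, passed).

    Import statements produce ast.alias nodes, not ast.Name nodes,
    so a module that only imports the name yields False.
    """
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Name)
            and node.id == name
            and isinstance(node.ctx, ast.Load)
        ):
            return True
    return False
```

A typical call would read the file and pass its text, for example `is_name_used(Path("src/skill_seekers/cli/unified_scraper.py").read_text(), "ConflictDetector")`.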
🟠 Medium Priority Gaps
6. Enhancement Workflow Engine vs Old Enhancers
Gap: enhancement_workflow.py (new) may not fully replace old enhancer usage
enhancement_workflow.py:
- Uses UnifiedEnhancer (new)
- Supports YAML workflow presets
- Sequential stage execution
Old enhancers:
- Direct class instantiation
- No workflow support
- Used in codebase_scraper, pattern_recognizer, etc.
Impact: Two enhancement systems running in parallel
7. Resume Command Limited Scope
Gap: resume_command.py only works with specific scrapers
Questions:
- Does resume work with unified scraper?
- Does resume work with PDF scraping?
- Is resume state stored consistently?
8. Argument Parsing Duplication
Gap: Multiple argument parsers for similar functionality
Files:
- parsers/doc_parser.py
- parsers/github_parser.py
- parsers/pdf_parser.py
- parsers/create_parser.py
- arguments/ directory with multiple files
Gap: No unified argument validation across parsers
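One way to share definitions across parsers is argparse's `parents` mechanism. This is a minimal sketch under stated assumptions: the flag spellings `--enhance-workflow` and `--enhance-stage` are inferred from the `args.enhance_workflow` and `args.enhance_stage` attributes used earlier, treating them as repeatable is a guess, and `shared_enhancement_args` and `build_parser` are hypothetical names.

```python
import argparse


def shared_enhancement_args() -> argparse.ArgumentParser:
    """Arguments every scraper parser would inherit (flag names are assumptions)."""
    # add_help=False is required for a parent parser; otherwise the child
    # would receive a second, conflicting -h/--help option.
    shared = argparse.ArgumentParser(add_help=False)
    shared.add_argument("--enhance-workflow", action="append")
    shared.add_argument("--enhance-stage", action="append")
    return shared


def build_parser(prog: str, positional: str) -> argparse.ArgumentParser:
    """Build a scraper-specific parser on top of the shared definitions."""
    parser = argparse.ArgumentParser(prog=prog, parents=[shared_enhancement_args()])
    parser.add_argument(positional)  # hypothetical scraper-specific argument
    return parser


if __name__ == "__main__":
    args = build_parser("pdf", "path").parse_args(
        ["doc.pdf", "--enhance-workflow", "default"]
    )
    print(args.path, args.enhance_workflow, args.enhance_stage)
```

With this layout, validation of the shared flags lives in one function, and each scraper parser only declares what is specific to it.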
🟡 Minor Gaps
9. Storage Adapters Not Used in Core Flow
Gap: Cloud storage adapters exist but may not be integrated
storage/
├── base_storage.py
├── s3_storage.py
├── gcs_storage.py
└── azure_storage.py
Check: Are these actually used in CLI commands or just standalone?
10. Benchmark Framework Underutilized
Gap: benchmark/ module exists but may not be integrated into main flow
Check: Is benchmarking automatically run? Can users easily benchmark their skills?
📊 Gap Summary Matrix
| # | Gap | Severity | Files Affected | Effort to Fix |
|---|---|---|---|---|
| 1 | Unified scraper → workflow_runner | 🔴 Critical | unified_scraper.py | Medium |
| 2 | Duplicate enhancer classes | 🔴 Critical | 7 files import old | High |
| 3 | Missing MCP tools (7) | 🔴 Critical | MCP parity | Medium |
| 4 | Create command routing | 🟠 Medium | create_command.py | Medium |
| 5 | Conflict detector integration | 🟠 Medium | unified_scraper.py | Low |
| 6 | Old vs new enhancer systems | 🟠 Medium | Multiple | High |
| 7 | Resume scope | 🟠 Medium | resume_command.py | Low |
| 8 | Argument parsing duplication | 🟡 Minor | parsers/ | Medium |
| 9 | Storage adapters usage | 🟡 Minor | storage/ | Low |
| 10 | Benchmark integration | 🟡 Minor | benchmark/ | Low |
🎯 Recommended Fixes (Priority Order)
Phase 1: Critical (Immediate)
- Add workflow_runner to unified_scraper.py
      from skill_seekers.cli.workflow_runner import run_workflows
      # In main():
      if args.enhance_workflow or args.enhance_stage:
          context = run_workflows(...)
- Migrate old enhancer imports to unified_enhancer
  - Replace "from ai_enhancer import X" with "from unified_enhancer import X"
  - Test all affected modules
  - Add deprecation warnings to old modules
- Add missing MCP tools
  - resume_tool - Resume interrupted jobs
  - update_tool - Incremental updates
  - quality_tool - Quality scoring
  - stream_tool - Streaming mode
  - multilang_tool - Multi-language support
  - enhance_status_tool - Monitor enhancement
  - install_agent_tool - IDE agent installation
Phase 2: Medium Priority
- Audit conflict_detector usage
  - Verify it's called in unified_scraper
  - Add conflict reporting to output
- Consolidate argument parsing
  - Create shared argument definitions
  - Use composition instead of duplication
Phase 3: Cleanup
- Deprecate old enhancer modules
      # In ai_enhancer.py, config_enhancer.py, guide_enhancer.py
      import warnings
      warnings.warn("This module is deprecated. Use unified_enhancer instead.", DeprecationWarning)
- Remove old modules (after migration complete)
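During the migration window, an old module could shrink to a thin shim that re-exports the new class and warns on use. The sketch below is a variant of the module-level warning suggested above: it warns at instantiation time instead of import time, so the warning points at the calling code. `deprecated_alias` is a hypothetical helper, and the `UnifiedEnhancer` class here is only a stand-in for the real one.

```python
import warnings


class UnifiedEnhancer:
    """Stand-in for the real class in unified_enhancer.py."""


def deprecated_alias(new_cls: type, old_module: str) -> type:
    """Return a subclass of new_cls that emits a DeprecationWarning when instantiated."""

    class _Deprecated(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_module} is deprecated; "
                f"import {new_cls.__name__} from unified_enhancer instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller, not the shim
            )
            super().__init__(*args, **kwargs)

    _Deprecated.__name__ = new_cls.__name__
    return _Deprecated


# What ai_enhancer.py could be reduced to while callers are migrated:
AIEnhancer = deprecated_alias(UnifiedEnhancer, "ai_enhancer")
```

Because the shim subclasses the new class, existing `isinstance` checks against `UnifiedEnhancer` keep working while callers are moved over.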
🔍 Verification Commands
# Check workflow_runner usage
grep -r "from.*workflow_runner" src/skill_seekers/cli/*.py
grep -r "run_workflows" src/skill_seekers/cli/*.py
# Check old enhancer imports
grep -r "from.*ai_enhancer\|from.*config_enhancer\|from.*guide_enhancer" src/skill_seekers/cli/*.py | grep -v "^src/skill_seekers/cli/\(ai_enhancer\|config_enhancer\|guide_enhancer\).py"
# Check MCP tools
grep -n "@mcp.tool\|def.*_tool" src/skill_seekers/mcp/server_fastmcp.py | wc -l
# Compare CLI vs MCP
skill-seekers --help | grep "^ [a-z]" | wc -l # 20 CLI commands
grep -c "@mcp.tool" src/skill_seekers/mcp/server_fastmcp.py # Should match
Conclusion
The biggest gaps are:
- Unified scraper missing workflow support - Critical for feature parity
- Old enhancer code still in use - Technical debt, maintenance burden
- MCP missing 7 CLI commands - Limits AI agent capabilities
These are integration gaps in existing features, not missing features. The functionality exists but isn't properly connected.
Analysis complete. Recommend Phase 1 fixes immediately.