3 sequence diagrams (create command dispatch, GitHub+C3.x pipeline with
all 5 stages, MCP dual-path invocation), 2 activity diagrams (source
detection in correct code order, enhancement level flag mapping), and
1 component diagram with corrected runtime dependency arrows.
All diagrams cross-referenced against source code for accuracy.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move UML/ directory and Architecture.md from Docs/ to docs/.
Rename Architecture.md to UML_ARCHITECTURE.md to avoid collision
with existing docs/ARCHITECTURE.md (docs organization file).
Update all references in README.md, CONTRIBUTING.md, CLAUDE.md,
and the architecture file itself.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The parsers architecture is now fully documented in the StarUML project
(Docs/UML/skill_seekers.mdj) with the Parsers class diagram showing all
28 SubcommandParser subclasses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add MiniMax AI as LLM platform adaptor
Original implementation by octo-patch in PR #318.
This commit includes comprehensive improvements and documentation.
Code Improvements:
- Fix API key validation to properly check JWT format (eyJ prefix)
- Add specific exception handling for timeout and connection errors
- Remove unused variable in upload method
Dependencies:
- Add MiniMax to [all-llms] extra group in pyproject.toml
Tests:
- Remove duplicate setUp method in integration test class
- Add 4 new test methods:
* test_package_excludes_backup_files
* test_upload_success_mocked (with OpenAI mocking)
* test_upload_network_error
* test_upload_connection_error
* test_validate_api_key_jwt_format
- Update test_validate_api_key_valid to use JWT format keys
- Fix test assertions for error message matching
Documentation:
- Create comprehensive MINIMAX_INTEGRATION.md guide (380+ lines)
- Update MULTI_LLM_SUPPORT.md with MiniMax platform entry
- Update 01-installation.md extras table
- Update INTEGRATIONS.md AI platforms table
- Update AGENTS.md adaptor import pattern example
- Fix README.md platform count from 4 to 5
All tests pass (33 passed, 3 skipped)
Lint checks pass
Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
* fix: improve MiniMax adaptor — typed exceptions, key validation, tests, docs
- Remove invalid "minimax" self-reference from all-llms dependency group
- Use typed OpenAI exceptions (APITimeoutError, APIConnectionError)
instead of string-matching on generic Exception
- Replace incorrect JWT assumption in validate_api_key with length check
- Use DEFAULT_API_ENDPOINT constant instead of hardcoded URLs (3 sites)
- Add Path() cast for output_path before .is_dir() call
- Add sys.modules mock to test_enhance_missing_library
- Add mocked test_enhance_success with backup/content verification
- Update test assertions for new exception types and key validation
- Add MiniMax to __init__.py docstrings (module, get_adaptor, list_platforms)
- Add MiniMax sections to MULTI_LLM_SUPPORT.md (install, format, API key,
workflow example, export-to-all)
Follows up on PR #318 by @octo-patch (feat: add MiniMax AI as LLM platform adaptor).
Co-Authored-By: Octopus <octo-patch@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds docs/BEST_PRACTICES.md — a comprehensive guide for creating
high-quality Claude skills. Covers SKILL.md structure, code examples,
prerequisites, troubleshooting, quality targets, and a real-world
before/after example (Grade F to Grade A). Addresses roadmap item I2.2.
Based on PR #206 by @jmagly from the AI Writing Guide project.
Fixes applied: updated outdated CLI command, fixed broken doc links.
Co-authored-by: jmagly <jmagly@users.noreply.github.com>
- Add docs/VIDEO_GUIDE.md (483 lines) — comprehensive guide covering
Quick Start, CLI reference, visual pipeline, AI enhancement, output
structure, time clipping, and troubleshooting
- Update README.md video section with new CLI examples (enhance,
clipping, vision OCR, re-build from JSON) and link to full guide
- Sync README.zh-CN.md with all video feature additions:
- Quick Start section: video commands
- Core Features: new video extraction feature list
- Installation table: video/video-full packages + GPU note
- Usage Examples: full video extraction subsection
- Documentation links: VIDEO_GUIDE.md reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-detects NVIDIA (CUDA), AMD (ROCm), or CPU-only GPU and installs the
correct PyTorch variant + easyocr + all visual extraction dependencies.
Removes easyocr from video-full pip extras to avoid pulling ~2GB of wrong
CUDA packages on non-NVIDIA systems.
New files:
- video_setup.py (835 lines): GPU detection, PyTorch install, ROCm config,
venv checks, system dep validation, module selection, verification
- test_video_setup.py (60 tests): Full coverage of detection, install, verify
Updated docs: CHANGELOG, AGENTS.md, CLAUDE.md, README.md, CLI_REFERENCE,
FAQ, TROUBLESHOOTING, installation guide, video dependency plan
All 2523 tests passing (15 skipped).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sync with latest development changes including ruff formatting,
bug fixes, and pinecone adaptor additions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add FALLBACK_MAIN_SELECTORS constant and _find_main_content() helper to
eliminate 3 duplicated fallback loops in doc_scraper.py
- Move link extraction before early return in extract_content() so links
are always discovered from the full page, not just main content
- Fix single-threaded dry-run to extract links from soup (full page)
instead of main element only — fixes reactflow.dev finding only 1 page
- Add link extraction to async dry-run path (was completely missing)
- Remove main_content from get_configuration() defaults so fallback logic
kicks in instead of a broad CSS comma selector matching body
- Smart create --config routing: peek at JSON to determine unified
(sources array → unified_scraper) vs simple (base_url → doc_scraper)
- Update docs/user-guide/02-scraping.md and docs/reference/CONFIG_FORMAT.md
to use unified config format (legacy format rejected since v2.11.0)
- Fix test_auto_fetch_enabled and test_mcp_validate_legacy_config
Closes#300
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stage 1 quality improvements from the Arbitrary Limits & Dead Code audit:
Reference file truncation removed:
- codebase_scraper.py: remove code[:500] truncation at 5 locations — reference
files now contain complete code blocks for copy-paste usability
- unified_skill_builder.py: remove issues[:20], releases[:10], body[:500],
and code_snippet[:300] caps in reference files — full content preserved
Enhancement summarizer rewrite:
- enhance_skill_local.py: replace arbitrary [:5] code block cap with
character-budget approach using target_ratio * content_chars
- Fix intro boundary bug: track code block state so intro never ends
inside a code block, which was desynchronizing the parser
- Remove dead _target_lines variable (assigned but never used)
- Heading chunks now also respect the character budget
Hardcoded language fixes:
- unified_skill_builder.py: test examples use ex["language"] instead of
always "python" for syntax highlighting
- how_to_guide_builder.py: add language field to HowToGuide dataclass,
set from workflow at creation, used in AI enhancement prompt
Test fixes:
- test_enhance_skill_local.py: rename test to test_code_blocks_not_arbitrarily_capped,
fix assertion to count actual blocks (```count // 2), use target_ratio=0.9
Documentation:
- Add Stage 1 plan, implementation summary, review, and corrected docs
- Update CHANGELOG.md with all changes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All scrapers (scrape, github, analyze, pdf) now share a common argument
contract via add_all_standard_arguments() in arguments/common.py.
Universal flags (--dry-run, --verbose, --quiet, --name, --description,
workflow args) work consistently across all source types.
Previously, `create <url> --dry-run`, `create owner/repo --dry-run`,
and `create ./path --dry-run` would crash because sub-scrapers didn't
accept those flags. Also fixes main.py _handle_analyze_command() not
forwarding --dry-run, --preset, --quiet, --name, --description to
codebase_scraper.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These files were incorrectly deleted — they have distinct content from
the *_DEPLOYMENT.md files (different structure, different focus, different
examples) and are not duplicates.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add YAML-based enhancement workflow presets shipped inside the package
(default, minimal, security-focus, architecture-comprehensive, api-documentation)
- Add `skill-seekers workflows` subcommand: list, show, copy, add, remove, validate
- copy/add/remove all accept multiple names/files in one invocation with partial-failure behaviour
- `add --name` override restricted to single-file operations
- Add 5 MCP tools: list_workflows, get_workflow, create_workflow, update_workflow, delete_workflow
- Fix: create command _add_common_args() now correctly forwards each --enhance-workflow
as a separate flag instead of passing the whole list as a single argument
- Update README: reposition as "data layer for AI systems" with AI Skills front and centre
- Update CHANGELOG, QUICK_REFERENCE, CLAUDE.md with workflow preset details
- 1,880+ tests passing
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Refactored all RAG adaptors (LangChain, LlamaIndex, Haystack, Weaviate, Chroma,
FAISS, Qdrant) to use existing helper methods from base.py, removing ~215 lines
of duplicate code (26% reduction).
Key improvements:
- All adaptors now use _format_output_path() for consistent path handling
- All adaptors now use _iterate_references() for reference file iteration
- Added _generate_deterministic_id() helper with 3 formats (hex, uuid, uuid5)
- 5 adaptors refactored to use unified ID generation
- Removed 6 unused imports (hashlib, uuid)
Benefits:
- DRY principles enforced across all RAG adaptors
- Single source of truth for common logic
- Easier maintenance and testing
- Consistent behavior across platforms
All 159 adaptor tests passing. Zero regressions.
Phase 1 of optional enhancements (Phases 2-5 pending).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Complete summary of all critical and high priority fixes:
- Phase 1 (P0): Test coverage + CLI integration
- Phase 2 (P1): Code quality improvements
- Full verification and validation results
- Release readiness checklist for v2.10.0
Ready for production release.
- Filter out chunks smaller than min_chunk_size (default 100 tokens)
- Exception: Keep all chunks if entire document is smaller than target size
- All 15 tests passing (100% pass rate)
Fixes edge case where very small chunks (e.g., 'Short.' = 6 chars) were
being created despite min_chunk_size=100 setting.
Test: pytest tests/test_rag_chunker.py -v
Add comprehensive integration guides for 4 AI coding assistants:
## New Integration Guides (98KB total)
- docs/integrations/WINDSURF.md (20KB) - Windsurf IDE with .windsurfrules
- docs/integrations/CLINE.md (25KB) - Cline VS Code extension with MCP
- docs/integrations/CONTINUE_DEV.md (28KB) - Continue.dev for any IDE
- docs/integrations/INTEGRATIONS.md (25KB) - Comprehensive hub with decision tree
## Working Examples (3 directories, 11 files)
- examples/windsurf-fastapi-context/ - FastAPI + Windsurf automation
- examples/cline-django-assistant/ - Django + Cline with MCP server
- examples/continue-dev-universal/ - HTTP context server for all IDEs
## README.md Updates
- Updated tagline: Universal preprocessor for 10+ AI systems
- Expanded Supported Integrations table (7 → 10 platforms)
- Added 'AI Coding Assistant Integrations' section (60+ lines)
- Cross-links to all new guides and examples
## Impact
- Week 2 of ACTION_PLAN.md: 4/4 tasks complete (100%) ✅
- Total new documentation: ~3,000 lines
- Total new code: ~1,000 lines (automation scripts, servers)
- Integration coverage: LangChain, LlamaIndex, Pinecone, Cursor, Windsurf,
Cline, Continue.dev, Claude, Gemini, ChatGPT
## Key Features
- All guides follow proven 11-section pattern from CURSOR.md
- Real-world examples with automation scripts
- Multi-IDE consistency (Continue.dev works in VS Code, JetBrains, Vim)
- MCP integration for dynamic documentation access
- Complete troubleshooting sections with solutions
Positions Skill Seekers as universal preprocessor for ANY AI system.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove SPYKE-related client documentation files
- Fix critical ruff linter errors:
- Remove unused 'os' import in test_analyze_e2e.py
- Remove unused 'setups' variable in test_test_example_extractor.py
- Prefix unused output_dir parameter in codebase_scraper.py
- Fix import sorting in test_integration.py
- Update CHANGELOG.md with comprehensive PR #272 feature documentation
These changes were part of PR #272 cleanup but didn't make it into the squash merge.
Complete implementation of C3.9, granular AI enhancement control, performance optimizations, and bug fixes.
Features:
- C3.9 Project Documentation Extraction (markdown files)
- Granular AI enhancement control (--enhance-level 0-3)
- C# test extraction support
- 6-12x faster LOCAL mode with parallel execution
- Auto-enhancement UX improvements
- LOCAL mode fallback for all AI enhancements
Bug Fixes:
- C# language support
- Config type field compatibility
- LocalSkillEnhancer import
Documentation:
- Updated CHANGELOG.md
- Updated CLAUDE.md
- Removed client-specific files
Tests: All 1,257 tests passing
Critical linter errors: Fixed
- Update Chinese README (README.zh-CN.md) with new preset flags
- Update docs/features/*.md (PATTERN_DETECTION, HOW_TO_GUIDES, BOOTSTRAP_SKILL_TECHNICAL)
- Update scripts/bootstrap_skill.sh to use 'skill-seekers analyze'
- Update scripts/skill_header.md command examples
- Update tests/test_bootstrap_skill.py assertions
- Fix CHANGELOG.md historical entry with correct command name
All references to 'skill-seekers-codebase' updated to 'skill-seekers analyze'
except where needed for backward compatibility (pyproject.toml, E2E tests).
Related to Phase 1 implementation from previous commits.
Merging with admin override due to known issues:
✅ **What Works**:
- GLM-4.7 Claude-compatible API support (correctly implemented)
- PDF scraper improvements (content truncation fixed, page traceability added)
- Documentation updates comprehensive
⚠️ **Known Issues (will be fixed in next commit)**:
1. Import bugs in 3 files causing UnboundLocalError (30 tests failing)
2. PDF scraper test expectations need updating for new behavior (5 tests failing)
3. test_godot_config failure (pre-existing, not caused by this PR - 1 test failing)
**Action Plan**:
Fixes for issues #1 and #2 are ready and will be committed immediately after merge.
Issue #3 requires separate investigation as it's a pre-existing problem.
Total: 36 failing tests, 35 will be fixed in next commit.
This patch release fixes the broken Chinese language selector link
on PyPI by using absolute GitHub URLs instead of relative paths.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This PR modernizes the MCP setup with comprehensive improvements:
**Key Improvements:**
✅ Virtual environment auto-detection (venv, .venv, $VIRTUAL_ENV)
✅ Module-based imports (python -m skill_seekers.mcp.server_fastmcp)
✅ Eliminates 'module not found' errors from missing dependencies
✅ No need for --break-system-packages or global installs
✅ Clean project isolation with venv
✅ Prepares for v3.0.0 when server.py will be removed
**Bug Fixes:**
🐛 Fixed 41 instances of server_fastmcp_fastmcp → server_fastmcp typo
🐛 Updated tests to accept -e ".[mcp]" format
🐛 Updated tests for module reference format
**Files Changed:** 13 files (+312/-154 lines)
**Testing:** All 1386 tests passing (verified)
Co-Authored-By: MiaoDX <miaodx@hotmail.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed 41 instances of 'server_fastmcp_fastmcp' to 'server_fastmcp'.
This was a typo in the documentation that would prevent the MCP server
from starting correctly.
All other files in the PR correctly use 'server_fastmcp'.
Co-Authored-By: MiaoDX <miaodx@hotmail.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This PR improves MCP server configuration by updating all documentation
to use the current server_fastmcp module and ensuring setup scripts
automatically use virtual environment Python instead of system Python.
## Changes
### 1. Documentation Updates (server → server_fastmcp)
Updated all references from deprecated `server` module to `server_fastmcp`:
**User-facing documentation:**
- examples/http_transport_examples.sh: All 13 command examples
- README.md: Configuration examples and troubleshooting commands
- docs/guides/MCP_SETUP.md: Enhanced migration guide with stdio/HTTP examples
- docs/guides/TESTING_GUIDE.md: Test import statements
- docs/guides/MULTI_AGENT_SETUP.md: Updated examples
- docs/guides/SETUP_QUICK_REFERENCE.md: Updated paths
- CLAUDE.md: CLI command examples
**MCP module:**
- src/skill_seekers/mcp/README.md: Updated config examples
- src/skill_seekers/mcp/agent_detector.py: Use server_fastmcp module
Note: Historical release notes (CHANGELOG.md) preserved unchanged.
### 2. Venv Python Configuration
**setup_mcp.sh improvements:**
- Added automatic venv detection (checks .venv, venv, and $VIRTUAL_ENV)
- Sets PYTHON_CMD to venv Python path when available
- **CRITICAL FIX**: Now updates PYTHON_CMD after creating/activating venv
- Generates MCP configs with full venv Python path
- Falls back to system python3 if no venv found
- Displays detected Python version and path
**Config examples updated:**
- .claude/mcp_config.example.json: Use venv Python path
- example-mcp-config.json: Use venv Python path
- Added "type": "stdio" for clarity
- Updated to use server_fastmcp module
### 3. Bug Fix: PYTHON_CMD Not Updated After Venv Creation
Previously, when setup_mcp.sh created or activated a venv, it failed to
update PYTHON_CMD, causing generated configs to still use system python3.
**Fixed cases:**
- When $VIRTUAL_ENV is already set → Update PYTHON_CMD to venv Python
- When existing venv is activated → Set PYTHON_CMD="$REPO_PATH/venv/bin/python3"
- When new venv is created → Set PYTHON_CMD="$REPO_PATH/venv/bin/python3"
## Benefits
### For Users:
✅ No deprecation warnings - All docs show current module
✅ Proper Python environment - MCP uses venv with all dependencies
✅ No system Python issues - Avoids "module not found" errors
✅ No global installation needed - No --break-system-packages required
✅ Automatic detection - setup_mcp.sh finds venv automatically
✅ Clean isolation - Projects don't interfere with system Python
### For Maintainers:
✅ Prepared for v3.0.0 - Documentation ready for server.py removal
✅ Reduced support burden - Fewer MCP configuration issues
✅ Consistent examples - All docs use same module/pattern
## Testing
**Verified:**
- ✅ All command examples use server_fastmcp
- ✅ No deprecated module references in user-facing docs (0 results)
- ✅ New module correctly referenced (129 instances)
- ✅ setup_mcp.sh detects venv and generates correct config
- ✅ PYTHON_CMD properly updated after venv creation
- ✅ MCP server starts correctly with venv Python
**Files changed:** 12 files (+262/-107 lines)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>