skill-seekers-reference

firefrost-gaming/skill-seekers-reference

Author	SHA1	Message	Date
yusyus	73adda0b17	docs: update all chunk flag names to match renamed CLI flags Replace all occurrences of old ambiguous flag names with the new explicit ones: --chunk-size (tokens) → --chunk-tokens --chunk-overlap → --chunk-overlap-tokens --chunk → --chunk-for-rag --streaming-chunk-size → --streaming-chunk-chars --streaming-overlap → --streaming-overlap-chars --chunk-size (pages) → --pdf-pages-per-chunk Updated: CLI_REFERENCE (EN+ZH), user-guide (EN+ZH), integrations (Haystack, Chroma, Weaviate, FAISS, Qdrant), features/PDF_CHUNKING, examples/haystack-pipeline, strategy docs, archive docs, and CHANGELOG. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 22:15:14 +03:00
yusyus	53d37e61dd	docs: Add 4 comprehensive vector database examples (Weaviate, Chroma, FAISS, Qdrant) Created complete working examples for all 4 vector databases with RAG adaptors: Weaviate Example: - Comprehensive README with hybrid search guide - 3 Python scripts (generate, upload, query) - Sample outputs and query results - Covers hybrid search, filtering, schema design Chroma Example: - Simple, local-first approach - In-memory and persistent storage options - Semantic search and metadata filtering - Comparison with Weaviate FAISS Example: - Facebook AI Similarity Search integration - OpenAI embeddings generation - Index building and persistence - Performance-focused for scale Qdrant Example: - Advanced filtering capabilities - Production-ready features - Complex query patterns - Rust-based performance Each example includes: - Detailed README with setup and troubleshooting - requirements.txt with dependencies - 3 working Python scripts - Sample outputs directory Total files: 20 (4 examples × 5 files each) Documentation: 4 comprehensive READMEs (~800 lines total) Phase 2 of optional enhancements complete. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-07 22:38:15 +03:00
yusyus	bad84ceac2	feat: Add Cursor React example repo (Task 3.2) Complete working example demonstrating Cursor + Skill Seekers workflow: Main Example (examples/cursor-react-skill/): - README.md (400+ lines) - Comprehensive guide with expected outputs - generate_cursorrules.py - Automation script for complete workflow - .cursorrules.example - Sample generated rules (React 18+ patterns) - requirements.txt - Python dependencies Example Project (example-project/): - package.json - React 18 + TypeScript + Vite - tsconfig.json - Strict TypeScript configuration - src/App.tsx - Sample counter component - src/index.tsx - React entry point - README.md - Testing instructions Workflow Demonstrated: 1. Scrape React docs → skill-seekers scrape 2. Package for Cursor → skill-seekers package --target claude 3. Extract and copy → unzip + cp to .cursorrules 4. Test in Cursor IDE with AI prompts Example Prompts Included: - useState hook patterns - Data fetching with useEffect - Custom hooks for validation - TypeScript typing examples Shows before/after comparison of AI suggestions with and without .cursorrules. Updates: README.md + INTEGRATIONS.md (added Haystack to supported list)	2026-02-07 21:07:11 +03:00
yusyus	1c888e7817	feat: Add Haystack RAG framework adaptor (Task 2.2) Implements complete Haystack 2.x integration for RAG pipelines: Haystack Adaptor (src/skill_seekers/cli/adaptors/haystack.py): - Document format: {content: str, meta: dict} - JSON packaging for Haystack pipelines - Compatible with InMemoryDocumentStore, BM25Retriever - Registered in adaptor factory as 'haystack' Example Pipeline (examples/haystack-pipeline/): - README.md with comprehensive guide and troubleshooting - quickstart.py demonstrating BM25 retrieval - requirements.txt (haystack-ai>=2.0.0) - Shows document loading, indexing, and querying Tests (tests/test_adaptors/test_haystack_adaptor.py): - 11 tests covering all adaptor functionality - Format validation, packaging, upload messages - Edge cases: empty dirs, references-only skills - All 93 adaptor tests passing (100% suite pass rate) Features: - No upload endpoint (local use only like LangChain/LlamaIndex) - No AI enhancement (enhance before packaging) - Same packaging pattern as other RAG frameworks - InMemoryDocumentStore + BM25Retriever example Test: pytest tests/test_adaptors/test_haystack_adaptor.py -v	2026-02-07 21:01:49 +03:00
yusyus	bdd61687c5	feat: Complete Phase 1 - AI Coding Assistant Integrations (v2.10.0) Add comprehensive integration guides for 4 AI coding assistants: ## New Integration Guides (98KB total) - docs/integrations/WINDSURF.md (20KB) - Windsurf IDE with .windsurfrules - docs/integrations/CLINE.md (25KB) - Cline VS Code extension with MCP - docs/integrations/CONTINUE_DEV.md (28KB) - Continue.dev for any IDE - docs/integrations/INTEGRATIONS.md (25KB) - Comprehensive hub with decision tree ## Working Examples (3 directories, 11 files) - examples/windsurf-fastapi-context/ - FastAPI + Windsurf automation - examples/cline-django-assistant/ - Django + Cline with MCP server - examples/continue-dev-universal/ - HTTP context server for all IDEs ## README.md Updates - Updated tagline: Universal preprocessor for 10+ AI systems - Expanded Supported Integrations table (7 → 10 platforms) - Added 'AI Coding Assistant Integrations' section (60+ lines) - Cross-links to all new guides and examples ## Impact - Week 2 of ACTION_PLAN.md: 4/4 tasks complete (100%) ✅ - Total new documentation: ~3,000 lines - Total new code: ~1,000 lines (automation scripts, servers) - Integration coverage: LangChain, LlamaIndex, Pinecone, Cursor, Windsurf, Cline, Continue.dev, Claude, Gemini, ChatGPT ## Key Features - All guides follow proven 11-section pattern from CURSOR.md - Real-world examples with automation scripts - Multi-IDE consistency (Continue.dev works in VS Code, JetBrains, Vim) - MCP integration for dynamic documentation access - Complete troubleshooting sections with solutions Positions Skill Seekers as universal preprocessor for ANY AI system. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-07 20:46:26 +03:00
yusyus	1552e1212d	feat: Week 1 Complete - Universal RAG Preprocessor Foundation Implements Week 1 of the 4-week strategic plan to position Skill Seekers as universal infrastructure for AI systems. Adds RAG ecosystem integrations (LangChain, LlamaIndex, Pinecone, Cursor) with comprehensive documentation. ## Technical Implementation (Tasks #1-2) ### New Platform Adaptors - Add LangChain adaptor (langchain.py) - exports Document format - Add LlamaIndex adaptor (llama_index.py) - exports TextNode format - Implement platform adaptor pattern with clean abstractions - Preserve all metadata (source, category, file, type) - Generate stable unique IDs for LlamaIndex nodes ### CLI Integration - Update main.py with --target argument - Modify package_skill.py for new targets - Register adaptors in factory pattern (__init__.py) ## Documentation (Tasks #3-7) ### Integration Guides Created (2,300+ lines) - docs/integrations/LANGCHAIN.md (400+ lines) * Quick start, setup guide, advanced usage * Real-world examples, troubleshooting - docs/integrations/LLAMA_INDEX.md (400+ lines) * VectorStoreIndex, query/chat engines * Advanced features, best practices - docs/integrations/PINECONE.md (500+ lines) * Production deployment, hybrid search * Namespace management, cost optimization - docs/integrations/CURSOR.md (400+ lines) * .cursorrules generation, multi-framework * Project-specific patterns - docs/integrations/RAG_PIPELINES.md (600+ lines) * Complete RAG architecture * 5 pipeline patterns, 2 deployment examples * Performance benchmarks, 3 real-world use cases ### Working Examples (Tasks #3-5) - examples/langchain-rag-pipeline/ * Complete QA chain with Chroma vector store * Interactive query mode - examples/llama-index-query-engine/ * Query engine with chat memory * Source attribution - examples/pinecone-upsert/ * Batch upsert with progress tracking * Semantic search with filters Each example includes: - quickstart.py (production-ready code) - README.md (usage instructions) - requirements.txt (dependencies) ## Marketing & Positioning (Tasks #8-9) ### Blog Post - docs/blog/UNIVERSAL_RAG_PREPROCESSOR.md (500+ lines) * Problem statement: 70% of RAG time = preprocessing * Solution: Skill Seekers as universal preprocessor * Architecture diagrams and data flow * Real-world impact: 3 case studies with ROI * Platform adaptor pattern explanation * Time/quality/cost comparisons * Getting started paths (quick/custom/full) * Integration code examples * Vision & roadmap (Weeks 2-4) ### README Updates - New tagline: "Universal preprocessing layer for AI systems" - Prominent "Universal RAG Preprocessor" hero section - Integrations table with links to all guides - RAG Quick Start (4-step getting started) - Updated "Why Use This?" - RAG use cases first - New "RAG Framework Integrations" section - Version badge updated to v2.9.0-dev ## Key Features ✅ Platform-agnostic preprocessing ✅ 99% faster than manual preprocessing (days → 15-45 min) ✅ Rich metadata for better retrieval accuracy ✅ Smart chunking preserves code blocks ✅ Multi-source combining (docs + GitHub + PDFs) ✅ Backward compatible (all existing features work) ## Impact Before: Claude-only skill generator After: Universal preprocessing layer for AI systems Integrations: - LangChain Documents ✅ - LlamaIndex TextNodes ✅ - Pinecone (ready for upsert) ✅ - Cursor IDE (.cursorrules) ✅ - Claude AI Skills (existing) ✅ - Gemini (existing) ✅ - OpenAI ChatGPT (existing) ✅ Documentation: 2,300+ lines Examples: 3 complete projects Time: 12 hours (50% faster than estimated 24-30h) ## Breaking Changes None - fully backward compatible ## Testing All existing tests pass Ready for Week 2 implementation Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-05 23:32:58 +03:00
MiaoDX	bd974148a2	feat: Update MCP to use server_fastmcp with venv Python support This PR improves MCP server configuration by updating all documentation to use the current server_fastmcp module and ensuring setup scripts automatically use virtual environment Python instead of system Python. ## Changes ### 1. Documentation Updates (server → server_fastmcp) Updated all references from deprecated `server` module to `server_fastmcp`: User-facing documentation: - examples/http_transport_examples.sh: All 13 command examples - README.md: Configuration examples and troubleshooting commands - docs/guides/MCP_SETUP.md: Enhanced migration guide with stdio/HTTP examples - docs/guides/TESTING_GUIDE.md: Test import statements - docs/guides/MULTI_AGENT_SETUP.md: Updated examples - docs/guides/SETUP_QUICK_REFERENCE.md: Updated paths - CLAUDE.md: CLI command examples MCP module: - src/skill_seekers/mcp/README.md: Updated config examples - src/skill_seekers/mcp/agent_detector.py: Use server_fastmcp module Note: Historical release notes (CHANGELOG.md) preserved unchanged. ### 2. Venv Python Configuration setup_mcp.sh improvements: - Added automatic venv detection (checks .venv, venv, and $VIRTUAL_ENV) - Sets PYTHON_CMD to venv Python path when available - CRITICAL FIX: Now updates PYTHON_CMD after creating/activating venv - Generates MCP configs with full venv Python path - Falls back to system python3 if no venv found - Displays detected Python version and path Config examples updated: - .claude/mcp_config.example.json: Use venv Python path - example-mcp-config.json: Use venv Python path - Added "type": "stdio" for clarity - Updated to use server_fastmcp module ### 3. Bug Fix: PYTHON_CMD Not Updated After Venv Creation Previously, when setup_mcp.sh created or activated a venv, it failed to update PYTHON_CMD, causing generated configs to still use system python3. Fixed cases: - When $VIRTUAL_ENV is already set → Update PYTHON_CMD to venv Python - When existing venv is activated → Set PYTHON_CMD="$REPO_PATH/venv/bin/python3" - When new venv is created → Set PYTHON_CMD="$REPO_PATH/venv/bin/python3" ## Benefits ### For Users: ✅ No deprecation warnings - All docs show current module ✅ Proper Python environment - MCP uses venv with all dependencies ✅ No system Python issues - Avoids "module not found" errors ✅ No global installation needed - No --break-system-packages required ✅ Automatic detection - setup_mcp.sh finds venv automatically ✅ Clean isolation - Projects don't interfere with system Python ### For Maintainers: ✅ Prepared for v3.0.0 - Documentation ready for server.py removal ✅ Reduced support burden - Fewer MCP configuration issues ✅ Consistent examples - All docs use same module/pattern ## Testing Verified: - ✅ All command examples use server_fastmcp - ✅ No deprecated module references in user-facing docs (0 results) - ✅ New module correctly referenced (129 instances) - ✅ setup_mcp.sh detects venv and generates correct config - ✅ PYTHON_CMD properly updated after venv creation - ✅ MCP server starts correctly with venv Python Files changed: 12 files (+262/-107 lines) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-18 15:55:46 +08:00
Pablo Estevez	5ed767ff9a	run ruff	2026-01-17 17:29:21 +00:00
yusyus	9e41094436	feat: v2.4.0 - MCP 2025 upgrade with multi-agent support (#217 ) * feat: v2.4.0 - MCP 2025 upgrade with multi-agent support Major MCP infrastructure upgrade to 2025 specification with HTTP + stdio transport and automatic configuration for 5+ AI coding agents. ### 🚀 What's New MCP 2025 Specification (SDK v1.25.0) - FastMCP framework integration (68% code reduction) - HTTP + stdio dual transport support - Multi-agent auto-configuration - 17 MCP tools (up from 9) - Improved performance and reliability Multi-Agent Support - Auto-detects 5 AI coding agents (Claude Code, Cursor, Windsurf, VS Code, IntelliJ) - Generates correct config for each agent (stdio vs HTTP) - One-command setup via ./setup_mcp.sh - HTTP server for concurrent multi-client support Architecture Improvements - Modular tool organization (tools/ package) - Graceful degradation for testing - Backward compatibility maintained - Comprehensive test coverage (606 tests passing) ### 📦 Changed Files Core MCP Server: - src/skill_seekers/mcp/server_fastmcp.py (NEW - 300 lines, FastMCP-based) - src/skill_seekers/mcp/server.py (UPDATED - compatibility shim) - src/skill_seekers/mcp/agent_detector.py (NEW - multi-agent detection) Tool Modules: - src/skill_seekers/mcp/tools/config_tools.py (NEW) - src/skill_seekers/mcp/tools/scraping_tools.py (NEW) - src/skill_seekers/mcp/tools/packaging_tools.py (NEW) - src/skill_seekers/mcp/tools/splitting_tools.py (NEW) - src/skill_seekers/mcp/tools/source_tools.py (NEW) Version Updates: - pyproject.toml: 2.3.0 → 2.4.0 - src/skill_seekers/cli/main.py: version string updated - src/skill_seekers/mcp/__init__.py: 2.0.0 → 2.4.0 Documentation: - README.md: Added multi-agent support section - docs/MCP_SETUP.md: Complete rewrite for MCP 2025 - docs/HTTP_TRANSPORT.md (NEW) - docs/MULTI_AGENT_SETUP.md (NEW) - CHANGELOG.md: v2.4.0 entry with migration guide Tests: - tests/test_mcp_fastmcp.py (NEW - 57 tests) - tests/test_server_fastmcp_http.py (NEW - HTTP transport tests) - All existing tests updated and passing (606/606) ### ✅ Test Results E2E Testing: - Fresh venv installation: ✅ - stdio transport: ✅ - HTTP transport: ✅ (health check, SSE endpoint) - Agent detection: ✅ (found Claude Code) - Full test suite: ✅ 606 passed, 152 skipped Test Coverage: - Core functionality: 100% passing - Backward compatibility: Verified - No breaking changes: Confirmed ### 🔄 Migration Path Existing Users: - Old `python -m skill_seekers.mcp.server` still works - Existing configs unchanged - All tools function identically - Deprecation warnings added (removal in v3.0.0) New Users: - Use `./setup_mcp.sh` for auto-configuration - Or manually use `python -m skill_seekers.mcp.server_fastmcp` - HTTP mode: `--http --port 8000` ### 📊 Metrics - Lines of code: 2200 → 300 (87% reduction in server.py) - Tools: 9 → 17 (88% increase) - Agents supported: 1 → 5 (400% increase) - Tests: 427 → 606 (42% increase) - All tests passing: ✅ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * fix: Add backward compatibility exports to server.py for tests Re-export tool functions from server.py to maintain backward compatibility with test_mcp_server.py which imports from the legacy server module. This fixes CI test failures where tests expected functions like list_tools() and generate_config_tool() to be importable from skill_seekers.mcp.server. All tool functions are now re-exported for compatibility while maintaining the deprecation warning for direct server execution. * fix: Export run_subprocess_with_streaming and fix tool schemas for backward compatibility - Add run_subprocess_with_streaming export from scraping_tools - Fix tool schemas to include properties field (required by tests) - Resolves 9 failing tests in test_mcp_server.py * fix: Add call_tool router and fix test patches for modular architecture - Add call_tool function to server.py for backward compatibility - Fix test patches to use correct module paths (scraping_tools instead of server) - Update 7 test decorators to patch the correct function locations - Resolves remaining CI test failures --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-26 00:45:48 +03:00

9 Commits