firefrost-gaming/skill-seekers-reference

Author	SHA1	Message	Date
yusyus	7aa5f0d3cb	Merge MCP_refactor: Add auto-upload feature with 9 MCP tools Merges smart auto-upload feature with API key detection. Features: - New upload_skill.py for automatic API-based upload - Enhanced package_skill.py with --upload flag - Smart detection: upload if API key available, helpful message if not - 9 total MCP tools (added upload_skill) - Cross-platform folder opening - Graceful error handling Fixes: - Fix missing import os in mcp/server.py - Fix package_skill.py exit code - Update all documentation to reflect 9 tools Tests: 14/14 passed (100%) - CLI tests: 8/8 passed - MCP tests: 6/6 passed All documentation updated and verified.	2025-10-19 22:22:45 +03:00
yusyus	06dabf639c	Update documentation: correct MCP tool count to 9 tools - Update mcp/README.md: 8 tools → 9 tools, add upload_skill docs - Update docs/MCP_SETUP.md: verify section lists all 9 tools - Update docs/CLAUDE.md: MCP tool references updated - Add upload_skill to tool listings and examples - Update test coverage count: 31 → 34 tests All documentation now accurately reflects the current feature set.	2025-10-19 22:22:03 +03:00
yusyus	d8cc92cd46	Add smart auto-upload feature with API key detection Features: - New upload_skill.py for automatic API-based upload - Smart detection: upload if API key available, helpful message if not - Enhanced package_skill.py with --upload flag - New MCP tool: upload_skill (9 total MCP tools now) - Enhanced MCP tool: package_skill with smart auto-upload - Cross-platform folder opening in utils.py - Graceful error handling throughout Fixes: - Fix missing import os in mcp/server.py - Fix package_skill.py exit code (now 0 when API key missing) - Improve UX with helpful messages instead of errors Tests: 14/14 passed (100%) - CLI tests: 8/8 passed - MCP tests: 6/6 passed Files: +4 new, 5 modified, ~600 lines added	2025-10-19 22:17:23 +03:00
yusyus	6b97a9edc6	Update documentation for large documentation features Comprehensive documentation updates for large docs support: README.md: - Add "Large Documentation Support" to key features - Add "Router/Hub Skills" feature highlight - Add "Checkpoint/Resume" feature highlight - Update MCP tools count: 6 → 8 - Add complete section 7: Large Documentation Support (10K-40K+ Pages) - Split strategies: auto, category, router, size - Parallel scraping workflow - Configuration examples - Benefits and use cases - Add section 8: Checkpoint/Resume for Long Scrapes - Configuration examples - Resume/fresh workflow - Benefits and features - Update documentation links to include LARGE_DOCUMENTATION.md - Update MCP guide links to reflect 8 tools docs/CLAUDE.md: - Add resume/checkpoint commands - Add large documentation commands (split, router, package_multi) - Update MCP integration section (8 tools) - Expand directory structure to show new files - Add split_strategy, split_config, checkpoint config parameters - Add "Large Documentation Support" and "Checkpoint/Resume" features - Add complete large documentation workflow (40K pages example) - Update all command paths to use cli/ prefix mcp/README.md: - Update tool count: 6 → 8 - Add tool 7: split_config with full documentation - Add tool 8: generate_router with full documentation - Add "Large Documentation (40K Pages)" workflow example - Update test coverage: 25 → 31 tests - Update performance table with parallel scraping metrics - Document all split strategies docs/MCP_SETUP.md: - Update verified tools count: 6 → 8 - Update test count: 25 → 31 All documentation now comprehensively covers: - Large documentation handling (10K-40K+ pages) - Router/hub architecture - Config splitting strategies - Checkpoint/resume functionality - Parallel scraping workflows - Complete MCP integration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 20:58:47 +03:00
yusyus	105218f85e	Add checkpoint/resume feature for long scrapes Implement automatic progress saving and resumption for interrupted or very long documentation scrapes (40K+ pages). Features: - Automatic checkpoint saving every N pages (configurable, default: 1000) - Resume from last checkpoint with --resume flag - Fresh start with --fresh flag (clears checkpoint) - Progress state saved: visited URLs, pending URLs, pages scraped - Checkpoint saved on interruption (Ctrl+C) - Checkpoint cleared after successful completion Configuration: ```json { "checkpoint": { "enabled": true, "interval": 1000 } } ``` Usage: ```bash # Start scraping (with checkpoints enabled in config) python3 cli/doc_scraper.py --config configs/large-docs.json # If interrupted (Ctrl+C), resume later: python3 cli/doc_scraper.py --config configs/large-docs.json --resume # Start fresh (clear checkpoint): python3 cli/doc_scraper.py --config configs/large-docs.json --fresh ``` Checkpoint Data: - config: Full configuration - visited_urls: All URLs already scraped - pending_urls: Queue of URLs to scrape - pages_scraped: Count of pages completed - last_updated: Timestamp - checkpoint_interval: Interval setting Benefits: ✅ Never lose progress on long scrapes ✅ Handle interruptions gracefully ✅ Resume multi-hour scrapes easily ✅ Automatic save every 1000 pages ✅ Essential for 40K+ page documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 20:50:24 +03:00
yusyus	bddb57f5ef	Add large documentation handling (40K+ pages support) Implement comprehensive system for handling very large documentation sites with intelligent splitting strategies and router/hub architecture. New CLI Tools: - cli/split_config.py: Split large configs into focused sub-skills * Strategies: auto, category, router, size * Configurable target pages per skill (default: 5000) * Dry-run mode for preview - cli/generate_router.py: Create intelligent router/hub skills * Auto-generates routing logic based on keywords * Creates SKILL.md with topic-to-skill mapping * Infers router name from sub-skills - cli/package_multi.py: Batch package multiple skills * Package router + all sub-skills in one command * Progress tracking for each skill MCP Integration: - Added split_config tool (8 total MCP tools now) - Added generate_router tool - Supports 40K+ page documentation via MCP Configuration: - New split_strategy parameter in configs - split_config section for fine-tuned control - checkpoint section for resume capability (ready for Phase 4) - Example: configs/godot-large-example.json Documentation: - docs/LARGE_DOCUMENTATION.md (500+ lines) * Complete guide for 10K+ page documentation * All splitting strategies explained * Detailed workflows with examples * Best practices and troubleshooting * Real-world examples (AWS, Microsoft, Godot) Features: ✅ Handle 40K+ page documentation efficiently ✅ Parallel scraping support (5x-10x faster) ✅ Router + sub-skills architecture ✅ Intelligent keyword-based routing ✅ Multiple splitting strategies ✅ Full MCP integration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 20:48:03 +03:00
yusyus	f103aa62cb	Clean up tracked files and repository structure Remove unnecessary files: - configs/.DS_Store (macOS system file, should not be tracked) This ensures only relevant project files are version controlled and improves repository hygiene. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 19:45:13 +03:00
yusyus	1c5801d121	Update documentation for MCP integration Comprehensive documentation updates reflecting MCP integration: README.md: - Add MCP Integration and Tests Passing badges - Enhance MCP section with "Tested and Working" status - Add links to both setup and testing guides docs/MCP_SETUP.md: - Update status to reflect production testing - Add integration testing verification notes - Confirm all 6 tools working with natural language CLAUDE.md: - Add prominent MCP Integration section at top - List all 6 available MCP tools with descriptions - Add setup instructions and production status docs/TEST_MCP_IN_CLAUDE_CODE.md (moved from root): - Relocate testing guide to docs/ for better organization - Provides step-by-step MCP integration testing workflow - Documents complete test suite for all 6 tools All documentation now accurately reflects the fully tested and working MCP integration verified in production Claude Code environment. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 19:44:47 +03:00
yusyus	d7e6142ab0	Add test configurations for MCP validation Add 4 test configuration files used for validating MCP functionality: - astro.json: Astro framework documentation (15 pages, production test) - python-tutorial-test.json: Python tutorial (minimal test case) - tailwind.json: Tailwind CSS documentation (test case) - test-manual.json: Manual testing configuration These configs were used to verify: - Config generation via generate_config tool - Config validation via validate_config tool - Page estimation via estimate_pages tool - Full scraping workflow via scrape_docs tool - Skill packaging via package_skill tool All tests passed successfully in production Claude Code environment. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 19:44:27 +03:00
yusyus	35499da922	Add MCP configuration and setup scripts Add complete setup infrastructure for MCP integration: - example-mcp-config.json: Template Claude Code MCP configuration - setup_mcp.sh: Automated one-command setup script - test_mcp_server.py: Comprehensive test suite (25 tests, 100% pass) The setup script automates: - Dependency installation - Configuration file generation with absolute paths - Claude Code config directory creation - Validation and verification Tests cover: - All 6 MCP tool functions - Error handling and edge cases - Config validation - Page estimation - Skill packaging 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 19:43:56 +03:00
yusyus	278b591ed7	Add MCP server implementation with 6 tools Implement complete Model Context Protocol server providing 6 tools for documentation skill generation: - list_configs: List all available preset configurations - generate_config: Create new config files for any documentation site - validate_config: Validate config file structure and parameters - estimate_pages: Fast page count estimation before scraping - scrape_docs: Full documentation scraping and skill building - package_skill: Package skill directory into uploadable .zip Features: - Async/await architecture for efficient I/O operations - Full MCP protocol compliance - Comprehensive error handling and user-friendly messages - Integration with existing CLI tools (doc_scraper.py, etc.) - 25 unit tests with 100% pass rate 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 19:43:25 +03:00
yusyus	36ce32d02e	Add MCP test scripts for easy testing after restart - MCP_TEST_SCRIPT.md: Complete 10-test script with verification - QUICK_MCP_TEST.md: Quick 6-test version for fast testing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 17:29:21 +03:00
yusyus	b69f57b60a	Add comprehensive MCP setup guide and integration test template Documentation Added: - docs/MCP_SETUP.md: Complete 400+ line setup guide - Prerequisites and installation steps - Configuration examples for Claude Code - Verification and troubleshooting - 3 usage examples and advanced configuration - End-to-end workflow and quick reference - tests/mcp_integration_test.md: Comprehensive test template - 10 test cases covering all MCP tools - Performance metrics table - Issue tracking and environment setup - Setup and cleanup scripts - .claude/mcp_config.example.json: Example MCP configuration Documentation Updated: - STRUCTURE.md: Complete monorepo structure documentation - CLAUDE.md: All Python script paths updated to cli/ prefix - docs/USAGE.md: All command examples updated for monorepo - TODO.md: Current sprint status and completed tasks Summary: - Issues #2 and #3 handled (MCP setup guide + integration tests) - All documentation now reflects monorepo structure (cli/ + mcp/) - Tests: 71/71 passing (100%) - Ready for MCP server testing with Claude Code 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 17:01:37 +03:00
yusyus	ba7cacdb4c	Fix all test failures and add upper limit validation (100% pass rate!) Test Fixes: - Fixed 3 failing tests by checking warnings instead of errors - test_missing_recommended_selectors: now checks warnings - test_invalid_rate_limit_too_high: now checks warnings - test_invalid_max_pages_too_high: now checks warnings Validation Improvements: - Added rate_limit upper limit warning (> 10s) - Added max_pages upper limit warning (> 10000) - Helps users avoid extreme values Results: - Before: 68/71 tests passing (95.8%) - After: 71/71 tests passing (100%) ✅ Planning Files Added: - .github/create_issues.sh - Helper for creating issues - .github/SETUP_GUIDE.md - GitHub setup instructions Tests now comprehensively cover all validation scenarios including errors, warnings, and edge cases. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 15:50:25 +03:00
yusyus	23277ded26	Update TODO.md with current sprint tasks and create GitHub issues TODO.md Updates: - Mark current 4 tasks as STARTED - Add "In Progress" and "Completed Today" sections - Document current branch: MCP_refactor - Clear tracking of sprint progress GitHub Issues Created (templates): 1. Fix 3 test failures (warnings vs errors) 2. Create MCP setup guide for Claude Code 3. Test MCP server with actual Claude Code 4. Update documentation for monorepo structure Issue Templates Include: - Detailed problem descriptions - Step-by-step solutions - Acceptance criteria - Files to modify - Test plans Next Steps: User can create issues via: - GitHub web UI (copy from ISSUES_TO_CREATE.md) - GitHub CLI (gh issue create) - Or work directly from TODO.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 15:30:13 +03:00
yusyus	f66718122a	Add project planning and roadmap documentation Planning Structure: - TODO.md - Sprint planning and current tasks - ROADMAP.md - Long-term vision and milestones - .github/ISSUE_TEMPLATE/ - Issue templates for features and MCP tools TODO.md: - 5 phases: Core (done), Enhancement, Advanced, Docs, Integrations - Current sprint tasks clearly defined - Progress tracking with checkboxes ROADMAP.md: - Vision and milestones - v1.0 (done), v1.1 (in progress), v1.2-3.0 (planned) - Feature ideas categorized by priority - Metrics and goals - Release schedule Issue Templates: - feature_request.md - General features - mcp_tool.md - New MCP tools specifically Next: Fix tests, test MCP with Claude Code, document setup 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 15:23:58 +03:00
yusyus	ae924a9d05	Refactor: Convert to monorepo with CLI and MCP server Major restructure to support both CLI usage and MCP integration: Repository Structure: - cli/ - All CLI tools (doc_scraper, estimate_pages, enhance_skill, etc.) - mcp/ - New MCP server for Claude Code integration - configs/ - Shared configuration files - tests/ - Updated to import from cli/ - docs/ - Shared documentation MCP Server (NEW): - mcp/server.py - Full MCP server implementation - 6 tools available: * generate_config - Create config from URL * estimate_pages - Fast page count estimation * scrape_docs - Full documentation scraping * package_skill - Package to .zip * list_configs - Show available presets * validate_config - Validate config files - mcp/README.md - Complete MCP documentation - mcp/requirements.txt - MCP dependencies CLI Tools (Moved to cli/): - All existing functionality preserved - Same commands, same behavior - Tests updated to import from cli.doc_scraper Tests: - 68/71 passing (95.8%) - Updated imports from doc_scraper to cli.doc_scraper - Fixed validate_config() tuple unpacking (errors, warnings) - 3 minor test failures (checking warnings instead of errors) Benefits: - Use as CLI tool: python3 cli/doc_scraper.py - Use via MCP: Integrated with Claude Code - Shared code and configs - Single source of truth 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 15:19:53 +03:00
yusyus	af87572735	Remove unnecessary validation limits from config validator - Remove max_pages upper limit (was 10,000, now unlimited) - Remove rate_limit upper limit (was 10s, now unlimited) - Convert missing selector checks from errors to warnings - Add warnings system (non-blocking) vs errors (blocking) - Allow users to scrape large documentation sites (45k+ pages) - Allow flexible rate limiting for different site requirements All reasonable validations remain (required fields, valid URLs, correct data types, no negative values). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 14:55:56 +03:00
yusyus	be84e5a321	Merge branch 'pr-2'	2025-10-19 13:55:25 +03:00
yusyus	3144d3cf3a	Add comprehensive usage guide for all tools and workflows - Add docs/USAGE.md (~650 lines) - Complete command reference for all tools - Full help output for doc_scraper.py, estimate_pages.py, run_tests.py - Usage examples for enhancement and packaging tools - All 6 preset configs documented with details - 6 common workflows from quick start to advanced - Troubleshooting section with solutions - Advanced usage: custom selectors, URL patterns, categories - Performance tips and best practices - Exit codes, environment variables, file locations Tools covered: - doc_scraper.py (main tool with all options) - estimate_pages.py (page count estimator) - enhance_skill.py (API enhancement) - enhance_skill_local.py (local enhancement) - package_skill.py (skill packager) - run_tests.py (test runner) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 13:34:02 +03:00
jarek	7a4c1d7083	kubernetes config for official docs	2025-10-19 09:28:44 +02:00
yusyus	9c1a133c51	Add page count estimator for fast config validation - Add estimate_pages.py script (~270 lines) - Fast estimation without downloading content (HEAD requests only) - Shows estimated total pages and recommended max_pages - Validates URL patterns work correctly - Estimates scraping time based on rate_limit - Update CLAUDE.md with estimator workflow and commands - Update README.md features section with estimation benefits - Usage: python3 estimate_pages.py configs/react.json - Time: 1-2 minutes vs 20-40 minutes for full scrape 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 02:44:50 +03:00
yusyus	59c2f9126d	Optimize all framework configs with start_urls for better coverage All configs now follow the steam-economy-complete.json pattern with: - Multiple start_urls for comprehensive entry points - Improved include patterns for better targeting - Enhanced exclude patterns to skip irrelevant pages Godot Config: - Added 7 start_urls covering getting started, scripting, 2D, 3D, physics, animation, and classes - Added include patterns: /getting_started/, /tutorials/, /classes/ - More focused scraping of core documentation React Config: - Added 6 start_urls covering learn, quick-start, reference, and hooks - Existing patterns maintained (already well-optimized) Vue Config: - Added 6 start_urls covering introduction, essentials, components, composables, and API - Fixed base_url from https://vuejs.org/guide/ to https://vuejs.org/ - Added /partners/ to exclude list Django Config: - Added 7 start_urls covering intro, models, views, templates, forms, auth, and reference - Added /intro/ to include patterns - Added /releases/ to exclude list (changelog not needed) FastAPI Config: - Added 7 start_urls covering tutorial, first-steps, path-params, body, dependencies, advanced, and reference - Added /deployment/ to exclude list Benefits: - Better initial page discovery - More comprehensive documentation coverage - Faster scraping (direct entry to important sections) - Reduced unnecessary page crawling - Consistent pattern across all configs All configs tested and validated: ✅ 71/71 tests passing ✅ All 6 configs validated successfully 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 02:24:56 +03:00
yusyus	f9c8f1d610	Clean up macOS .DS_Store file from output directory	2025-10-19 02:21:25 +03:00
yusyus	f1fa8354d2	Add comprehensive test system with 71 tests (100% pass rate) Test Framework: - Created tests/ directory structure - Added __init__.py for test package - Implemented 71 comprehensive tests across 3 test suites Test Suites: 1. test_config_validation.py (25 tests) - Valid/invalid config structure - Required fields validation - Name format validation - URL format validation - Selectors validation - URL patterns validation - Categories validation - Rate limit validation (0-10 range) - Max pages validation (1-10000 range) - Start URLs validation 2. test_scraper_features.py (28 tests) - URL validation (include/exclude patterns) - Language detection (Python, JavaScript, GDScript, C++, etc.) - Pattern extraction from documentation - Smart categorization (by URL, title, content) - Text cleaning utilities 3. test_integration.py (18 tests) - Dry-run mode functionality - Config loading and validation - Real config files validation (godot, react, vue, django, fastapi, steam) - URL processing and normalization - Content extraction Test Runner (run_tests.py): - Custom colored test runner with ANSI colors - Detailed test summary with breakdown by category - Success rate calculation - Command-line options: --suite: Run specific test suite --verbose: Show each test name --quiet: Minimal output --failfast: Stop on first failure --list: List all available tests - Execution time: ~1 second for full suite Documentation: - Added comprehensive TESTING.md guide - Test writing templates - Best practices - Coverage information - Troubleshooting guide .gitignore: - Added Python cache files - Added output directory - Added IDE and OS files Test Results: ✅ 71/71 tests passing (100% pass rate) ✅ All existing configs validated ✅ Fast execution (<1 second) ✅ Ready for CI/CD integration 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 02:08:58 +03:00
yusyus	eeef230c7b	Implement high and medium priority improvements High Priority: - Fix hardcoded package_skill.py path (line 778) Changed from: /mnt/skills/examples/skill-creator/scripts/package_skill.py Changed to: package_skill.py (local repository path) Medium Priority: - Add comprehensive config validation * Validates required fields (name, base_url) * Validates name format (alphanumeric, hyphens, underscores) * Validates base_url format (http/https) * Validates selectors structure and recommends standard selectors * Validates url_patterns (include/exclude lists) * Validates categories structure * Validates rate_limit range (0-10 seconds) * Validates max_pages range (1-10000) * Validates start_urls format if present * Provides clear error messages for invalid configs - Add --dry-run flag for preview mode * Previews first 20 URLs without saving data * Shows what would be scraped without creating files * Discovers links to estimate total pages * Displays configuration summary * No directories created in dry-run mode * Useful for testing configs before full scrape All changes tested and working correctly. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 01:57:59 +03:00
yusyus	f8c75a3b2d	Add comprehensive CLAUDE.md for Claude Code integration - Add root-level CLAUDE.md with complete guidance for Claude Code - Include Python 3.7+ requirement - Add first-time user workflow with all commands - Include CSS selector testing with BeautifulSoup examples - Add output quality verification commands - Document force re-scrape instructions - Fix package_skill.py path (remove hardcoded /mnt/skills reference) - Add complete config file structure with real examples - Include testing section for selector validation - Add performance metrics table - Document all key code locations with line numbers - Organize by: quick start → architecture → workflows → troubleshooting - Preserve existing docs/CLAUDE.md as detailed technical reference 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-19 01:43:02 +03:00
yusyus	a9b8591731	Update README.md with detailed project description and features; add initial VSCode settings.	2025-10-17 15:21:39 +00:00
yusyus	78b9cae398	Init	2025-10-17 15:14:44 +00:00
yusyus	397d47fe7c	Initial commit	2025-10-17 17:43:48 +03:00
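
The checkpoint/resume behaviour described in commit 105218f85e can be sketched roughly as follows. This is a minimal illustration, not the code in `cli/doc_scraper.py`: the function names (`save_checkpoint`, `load_checkpoint`, `crawl`), the `fetch_links` callback, and the atomic temp-file write are assumptions made for the example. Only the checkpoint fields, the `checkpoint.enabled` / `checkpoint.interval` config keys, and the resume/fresh semantics come from the commit message.

```python
import json
import os
from datetime import datetime, timezone


def save_checkpoint(path, config, visited, pending, interval):
    """Persist crawl state; write to a temp file first so a crash mid-write
    cannot leave a truncated checkpoint behind (an assumption of this sketch)."""
    state = {
        "config": config,
        "visited_urls": sorted(visited),
        "pending_urls": list(pending),
        "pages_scraped": len(visited),
        "last_updated": datetime.now(timezone.utc).isoformat(),
        "checkpoint_interval": interval,
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None when no checkpoint exists."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def crawl(config, fetch_links, checkpoint_path, resume=False, fresh=False):
    """Breadth-first crawl. fetch_links(url) is a hypothetical callback that
    scrapes one page and returns the URLs it links to."""
    ckpt = config.get("checkpoint", {})
    enabled = ckpt.get("enabled", False)
    interval = ckpt.get("interval", 1000)

    if fresh and os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)  # fresh start: discard old progress

    state = load_checkpoint(checkpoint_path) if resume else None
    visited = set(state["visited_urls"]) if state else set()
    pending = list(state["pending_urls"]) if state else [config["base_url"]]

    try:
        while pending:
            url = pending[0]  # peek, so an interrupted URL stays queued
            if url not in visited:
                links = fetch_links(url)
                visited.add(url)
                pending.extend(link for link in links if link not in visited)
            pending.pop(0)
            if enabled and len(visited) % interval == 0:
                save_checkpoint(checkpoint_path, config, visited, pending, interval)
    except KeyboardInterrupt:
        if enabled:  # save progress on Ctrl+C, then let the interrupt propagate
            save_checkpoint(checkpoint_path, config, visited, pending, interval)
        raise

    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)  # cleared after successful completion
    return visited
```

Peeking at the head of the queue instead of popping it first means a page whose fetch is interrupted is still in `pending_urls` when the checkpoint is written, so a later resume retries it rather than silently skipping it.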

680 Commits
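
Several commits above (eeef230c7b, af87572735, ba7cacdb4c, ae924a9d05) describe a config validator that separates blocking errors from non-blocking warnings and returns them as an `(errors, warnings)` tuple. The sketch below illustrates that split under stated assumptions: the selector names in `RECOMMENDED_SELECTORS` and the message wording are hypothetical, and only the rules themselves (required `name` and `base_url`, name and URL format, no negative values, warnings above 10 s for `rate_limit` and above 10000 for `max_pages`) are taken from the commit messages.

```python
import re

# Hypothetical selector names; the commit log only says "standard selectors".
RECOMMENDED_SELECTORS = ("main_content", "title", "code_blocks")


def _is_number(value):
    """True for int/float but not bool (bool is a subclass of int in Python)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_config(config):
    """Return (errors, warnings). Errors block scraping; warnings do not."""
    errors, warnings = [], []

    # Required fields.
    for field in ("name", "base_url"):
        if field not in config:
            errors.append(f"Missing required field: {field}")

    # Name: alphanumeric, hyphens, underscores.
    name = config.get("name")
    if name is not None and not re.fullmatch(r"[A-Za-z0-9_-]+", str(name)):
        errors.append(f"Invalid name: {name!r}")

    # base_url: http or https only.
    base_url = config.get("base_url")
    if base_url is not None and not str(base_url).startswith(("http://", "https://")):
        errors.append(f"Invalid base_url: {base_url!r}")

    # Selectors: wrong type is an error, a missing recommended key is a warning.
    selectors = config.get("selectors", {})
    if not isinstance(selectors, dict):
        errors.append("selectors must be an object")
    else:
        for key in RECOMMENDED_SELECTORS:
            if key not in selectors:
                warnings.append(f"Missing recommended selector: {key}")

    # rate_limit: negative is an error, very high is only a warning.
    rate_limit = config.get("rate_limit")
    if rate_limit is not None:
        if not _is_number(rate_limit) or rate_limit < 0:
            errors.append("rate_limit must be a non-negative number")
        elif rate_limit > 10:
            warnings.append("rate_limit above 10s will make scraping very slow")

    # max_pages: must be a positive integer, very high is only a warning.
    max_pages = config.get("max_pages")
    if max_pages is not None:
        if not isinstance(max_pages, int) or isinstance(max_pages, bool) or max_pages < 1:
            errors.append("max_pages must be a positive integer")
        elif max_pages > 10000:
            warnings.append("max_pages above 10000 may take a long time")

    return errors, warnings
```

A caller would typically abort when `errors` is non-empty and merely print `warnings`, which is what lets a 45k-page config through while still flagging the extreme value.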