yusyus
70ca1d9ba6
docs(A1.9): Add comprehensive git source documentation and example repository
Phase 4 Complete:
- Updated README.md with git source usage examples and use cases
- Created docs/GIT_CONFIG_SOURCES.md (800+ lines comprehensive guide)
- Updated CHANGELOG.md with v2.2.0 release notes
- Added configs/example-team/ example repository with E2E test
Documentation covers:
- Quick start and architecture
- MCP tools reference (4 tools with examples)
- Authentication for GitHub, GitLab, Bitbucket
- Use cases (small teams, enterprise, open source)
- Best practices, troubleshooting, advanced topics
- Complete API reference
Example repository includes:
- 3 example configs (react-custom, vue-internal, company-api)
- README with usage guide
- E2E test script (7 steps, 100% passing)
🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-21 19:38:26 +03:00
yusyus
119e642ced
fix: Add package installation check and fix test imports (Task 2.1)
Fixes import errors in 7 test files that failed when the package was not installed.
**Changes:**
1. **tests/conftest.py** - Added pytest_configure() hook
   - Checks if skill_seekers package is installed before running tests
   - Shows helpful error message guiding users to run `pip install -e .`
   - Prevents confusing ModuleNotFoundError during test runs
2. **tests/test_constants.py** - Fixed dynamic imports
   - Changed `from cli import` to `from skill_seekers.cli import` (6 locations)
   - Fixes imports in test methods that dynamically import modules
   - All 16 tests now pass ✅
3. **tests/test_llms_txt_detector.py** - Fixed patch decorators
   - Changed `patch('cli.llms_txt_detector.` to `patch('skill_seekers.cli.llms_txt_detector.` (4 locations)
   - All 4 tests now pass ✅
4. **docs/CLAUDE.md** - Added "Running Tests" section
   - Clear instructions on installing package before testing
   - Explanation of why installation is required
   - Common pytest commands and options
   - Test coverage statistics
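The check described in item 1 could look roughly like the sketch below. It is an illustration only, not the code from this commit: the helper name `package_is_installed` and the exact error wording are assumptions, while `pytest_configure` and `pytest.exit` are standard pytest APIs.

```python
import importlib.util

PACKAGE_NAME = "skill_seekers"  # package name taken from the commit message


def package_is_installed(name: str = PACKAGE_NAME) -> bool:
    """Return True if a top-level package can be found by the import system."""
    # Hypothetical helper; find_spec returns None for a missing top-level name.
    return importlib.util.find_spec(name) is not None


def pytest_configure(config):
    """pytest hook: stop early with guidance instead of a ModuleNotFoundError."""
    if not package_is_installed():
        import pytest  # imported lazily so the helper above works without pytest

        pytest.exit(
            f"{PACKAGE_NAME} is not installed. Run `pip install -e .` first.",
            returncode=1,
        )
```

Placed in tests/conftest.py, a hook like this runs once before collection, so the guidance appears before any test module is imported.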
**Testing:**
- ✅ All 101 tests pass across the 7 affected files:
  - test_async_scraping.py (11 tests)
  - test_config_validation.py (26 tests)
  - test_constants.py (16 tests)
  - test_estimate_pages.py (8 tests)
  - test_integration.py (23 tests)
  - test_llms_txt_detector.py (4 tests)
  - test_llms_txt_downloader.py (13 tests)
- ✅ conftest.py check works correctly
- ✅ Helpful error shown when package not installed
**Impact:**
- Developers now get clear guidance when tests fail due to missing installation
- All test import issues resolved
- Better developer experience for contributors
2025-11-29 22:13:13 +03:00
sogoiii
04f97f8c49
✨ feat: add automatic terminal detection for local enhancement
Add smart terminal selection for --enhance-local with cascading priority:
1. SKILL_SEEKER_TERMINAL env var (explicit user preference)
2. TERM_PROGRAM env var (inherit current terminal)
3. Terminal.app (fallback default)
Supports Ghostty, iTerm2, WezTerm, and Terminal.app. Includes comprehensive
test suite (11 tests) and user documentation.
Changes:
- Add detect_terminal_app() function with priority-based selection
- Support for 4 major macOS terminals via TERMINAL_MAP
- Fallback handling for unknown terminals (IDE terminals)
- Add TERMINAL_SELECTION.md with setup examples and troubleshooting
- Update README.md to link to terminal selection guide
- Full test coverage for all detection paths and edge cases
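The cascading priority above can be sketched as follows. This is a hedged illustration, not the committed implementation: the signature and return value of `detect_terminal_app`, and the keys and values in `TERMINAL_MAP`, are assumptions made for the example.

```python
import os

# Assumed mapping from TERM_PROGRAM values to macOS application names.
TERMINAL_MAP = {
    "Apple_Terminal": "Terminal",
    "iTerm.app": "iTerm",
    "ghostty": "Ghostty",
    "WezTerm": "WezTerm",
}


def detect_terminal_app(env=None):
    """Return (app_name, detection_method) using the three-step priority."""
    env = os.environ if env is None else env

    # 1. Explicit user preference always wins.
    explicit = env.get("SKILL_SEEKER_TERMINAL", "").strip()
    if explicit:
        return explicit, "SKILL_SEEKER_TERMINAL"

    # 2. Inherit the current terminal when it is one we know how to launch.
    term_program = env.get("TERM_PROGRAM", "").strip()
    if term_program in TERMINAL_MAP:
        return TERMINAL_MAP[term_program], "TERM_PROGRAM"

    # 3. Unknown or missing (for example an IDE terminal): fall back to Terminal.app.
    return "Terminal", "default"
```

Passing the environment as a parameter keeps each detection path easy to test without modifying the real process environment.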
2025-11-07 00:15:03 +03:00
yusyus
27407a59b9
Clean up unnecessary tracking and snapshot files
Removed 8 redundant files (~60K):
Development tracking (outdated/redundant with GitHub):
- GITHUB_BOARD_SETUP_COMPLETE.md - One-time setup doc
- PROJECT_STATUS.md - Oct 20 snapshot, outdated
- TODO.md - Replaced by FLEXIBLE_ROADMAP.md + GitHub board
- NEXT_TASKS.md - Replaced by FLEXIBLE_ROADMAP.md + GitHub board
Test snapshots (outdated, CI/CD has current status):
- TEST_SUMMARY.md - Oct 26 snapshot
- TEST_RESULTS.md - Oct 26 snapshot
Task summaries (redundant with git history):
- docs/B1_COMPLETE_SUMMARY.md - Completed task summary
Release notes (should be in GitHub Releases):
- RELEASE_NOTES_v1.0.0.md
Kept active documentation:
- FLEXIBLE_ROADMAP.md (master task catalog)
- README.md, CHANGELOG.md, CONTRIBUTING.md
- All quickstart/troubleshooting guides
- All docs/*.md (active documentation)
All tests still passing ✅
2025-10-26 17:40:50 +03:00
yusyus
962b5b9340
Add comprehensive bash script tests and fix old mcp/ path references
- Created tests/test_setup_scripts.py with 19 tests covering:
  * setup_mcp.sh validation (11 tests)
  * General bash script quality (4 tests)
  * MCP path consistency across codebase (4 tests)
- Fixed old 'mcp/' references in documentation:
  * docs/B1_COMPLETE_SUMMARY.md (3 refs)
  * docs/PDF_MCP_TOOL.md (2 refs)
  * docs/MCP_SETUP.md (18 refs)
  * docs/TEST_MCP_IN_CLAUDE_CODE.md (4 refs)
These tests would have caught Issue #157 before it reached users.
Tests verify:
- Bash syntax validity
- No hardcoded paths
- Correct skill_seeker_mcp/ directory references
- Files referenced in scripts actually exist
- No deprecated backticks
- Proper error handling (set -e)
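A few of the checks listed above can be approximated with plain text inspection. The sketch below is an assumption about how such checks might be written, not the content of tests/test_setup_scripts.py; the function name `check_script_text` and the regular expressions are hypothetical.

```python
import re


def check_script_text(text: str) -> list[str]:
    """Return a list of problems found in the text of a bash script."""
    problems = []
    # Ignore comment lines (including the shebang) so they cannot trigger checks.
    code_lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]

    # Error handling: expect a `set -e` style line (also matches `set -euo pipefail`).
    if not any(re.match(r"\s*set\s+-\w*e", ln) for ln in code_lines):
        problems.append("missing 'set -e'")

    # Deprecated command substitution with backticks instead of $(...).
    if any("`" in ln for ln in code_lines):
        problems.append("uses deprecated backticks")

    # Old directory name: a bare `mcp/` that is not part of `skill_seeker_mcp/`.
    if any(re.search(r"(^|[\s\"'])mcp/", ln) for ln in code_lines):
        problems.append("references old 'mcp/' path")

    return problems
```

Syntax validity itself is usually checked by running `bash -n` on the script, which this text-only sketch does not attempt.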
All 19 tests passing ✅
2025-10-26 17:33:39 +03:00
yusyus
5d8c7e39f6
Add unified multi-source scraping feature (Phases 7-11)
Completes the unified scraping system implementation:
**Phase 7: Unified Skill Builder**
- cli/unified_skill_builder.py: Generates final skill structure
- Inline conflict warnings (⚠️) in API reference
- Side-by-side docs vs code comparison
- Severity-based conflict grouping
- Separate conflicts.md report
**Phase 8: MCP Integration**
- skill_seeker_mcp/server.py: Auto-detects unified vs legacy configs
- Routes to unified_scraper.py or doc_scraper.py automatically
- Supports merge_mode parameter override
- Maintains full backward compatibility
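The routing described in Phase 8 might be sketched as below. The commit does not say how unified configs are recognized, so the detection rule here (a top-level `sources` list) is an assumption, as are the function names and the `"rule-based"` default value.

```python
def select_scraper(config: dict) -> str:
    """Pick the scraper script for a config (hypothetical detection rule)."""
    # Assumption: unified configs carry a list of sources; legacy configs do not.
    if isinstance(config.get("sources"), list):
        return "unified_scraper.py"
    return "doc_scraper.py"


def resolve_merge_mode(config: dict, override=None) -> str:
    """Let an explicit merge_mode parameter override the value in the config."""
    # "rule-based" as the fallback is an illustrative assumption.
    return override or config.get("merge_mode", "rule-based")
```

Because configs without a `sources` list still go to doc_scraper.py, existing legacy configs keep working unchanged under this rule.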
**Phase 9: Example Unified Configs**
- configs/react_unified.json: React docs + GitHub
- configs/django_unified.json: Django docs + GitHub
- configs/fastapi_unified.json: FastAPI docs + GitHub
- configs/fastapi_unified_test.json: Test config with limited pages
**Phase 10: Comprehensive Tests**
- cli/test_unified_simple.py: Integration tests (all passing)
- Tests unified config validation
- Tests backward compatibility
- Tests mixed source types
- Tests error handling
**Phase 11: Documentation**
- docs/UNIFIED_SCRAPING.md: Complete guide (1000+ lines)
- Examples, best practices, troubleshooting
- Architecture diagrams and data flow
- Command reference
**Additional:**
- demo_conflicts.py: Interactive conflict detection demo
- TEST_RESULTS.md: Complete test results and findings
- cli/unified_scraper.py: Fixed doc_scraper integration (subprocess)
**Features:**
✅ Multi-source scraping (docs + GitHub + PDF)
✅ Conflict detection (4 types, 3 severity levels)
✅ Rule-based merging (fast, deterministic)
✅ Claude-enhanced merging (AI-powered)
✅ Transparent conflict reporting
✅ MCP auto-detection
✅ Backward compatibility
**Test Results:**
- 6/6 integration tests passed
- 4 unified configs validated
- 3 legacy configs backward compatible
- 5 conflicts detected in test data
- All documentation complete
🤖 Generated with Claude Code
2025-10-26 16:33:41 +03:00
Edgar I.
0e3f0c6375
docs: update status for Phase 1 completion
2025-10-24 18:28:30 +04:00
Edgar I.
38ebc66749
docs: add Phase 1 implementation plan for active skills
2025-10-24 18:27:17 +04:00
Edgar I.
38aa2cecec
docs: add active skills design for demand-driven documentation
2025-10-24 18:27:17 +04:00
Edgar I.
812c0992b3
docs: add comprehensive llms.txt feature documentation
2025-10-24 18:27:17 +04:00
Edgar I.
0b6c2ed593
docs: add llms.txt support documentation
2025-10-24 18:27:17 +04:00
yusyus
394eab218e
Add PDF Advanced Features (v1.2.0)
Priority 2 & 3 Features Implemented:
- OCR support for scanned PDFs (pytesseract + Pillow)
- Password-protected PDF support
- Complex table extraction
- Parallel page processing (3x faster)
- Intelligent caching (50% faster re-runs)
Testing:
- New test file: test_pdf_advanced_features.py (26 tests)
- Updated test_pdf_extractor.py (23 tests)
- Updated test_pdf_scraper.py (18 tests)
- Total: 49/49 PDF tests passing (100%)
- Overall: 142/142 tests passing (100%)
Documentation:
- Added docs/PDF_ADVANCED_FEATURES.md (580 lines)
- Updated CHANGELOG.md with v1.1.0 and v1.2.0
- Updated README.md version badges and features
- Updated docs/TESTING.md with new test counts
Dependencies:
- Added Pillow==11.0.0
- Added pytesseract==0.3.13
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 21:43:05 +03:00
yusyus
6936057820
Add PDF documentation support (Tasks B1.1-B1.8)
Complete PDF extraction and skill conversion functionality:
- pdf_extractor_poc.py (1,004 lines): Extract text, code, images from PDFs
- pdf_scraper.py (353 lines): Convert PDFs to Claude skills
- MCP tool scrape_pdf: PDF scraping via Claude Code
- 7 comprehensive documentation guides (4,705 lines)
- Example PDF config format (configs/example_pdf.json)
Features:
- 3 code detection methods (font, indent, pattern)
- 19+ programming languages detected with confidence scoring
- Syntax validation and quality scoring (0-10 scale)
- Image extraction with size filtering (--extract-images)
- Chapter/section detection and page chunking
- Quality-filtered code examples (--min-quality)
- Three usage modes: config file, direct PDF, from extracted JSON
Technical:
- PyMuPDF (fitz) as primary library (60x faster than alternatives)
- Language detection with confidence scoring
- Code block merging across pages
- Comprehensive metadata and statistics
- Compatible with existing Skill Seeker workflow
MCP Integration:
- New scrape_pdf tool (10th MCP tool total)
- Supports all three usage modes
- 10-minute timeout for large PDFs
- Real-time streaming output
Documentation (4,705 lines):
- B1_COMPLETE_SUMMARY.md: Overview of all 8 tasks
- PDF_PARSING_RESEARCH.md: Library comparison and benchmarks
- PDF_EXTRACTOR_POC.md: POC documentation
- PDF_CHUNKING.md: Page chunking guide
- PDF_SYNTAX_DETECTION.md: Syntax detection guide
- PDF_IMAGE_EXTRACTION.md: Image extraction guide
- PDF_SCRAPER.md: PDF scraper usage guide
- PDF_MCP_TOOL.md: MCP integration guide
Tasks completed: B1.1-B1.8
Addresses Issue #27
See docs/B1_COMPLETE_SUMMARY.md for complete details
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-23 00:23:16 +03:00
yusyus
66719cd53a
Fix CLI path references in documentation
Following PR #145 which fixed README.md, this commit corrects all
remaining documentation files to use the correct cli/ directory prefix
for Python scripts.
Changes:
- QUICKSTART.md: Fixed 21 occurrences (doc_scraper.py, enhance_skill_local.py, package_skill.py)
- docs/UPLOAD_GUIDE.md: Fixed 10 occurrences (doc_scraper.py, enhance_skill_local.py, package_skill.py)
- docs/ENHANCEMENT.md: Fixed 9 occurrences (doc_scraper.py, enhance_skill.py, enhance_skill_local.py)
All commands now correctly reference:
- python3 cli/doc_scraper.py (not python3 doc_scraper.py)
- python3 cli/enhance_skill.py (not python3 enhance_skill.py)
- python3 cli/enhance_skill_local.py (not python3 enhance_skill_local.py)
- python3 cli/package_skill.py (not python3 package_skill.py)
- python3 cli/estimate_pages.py (not python3 estimate_pages.py)
This ensures all documentation examples work correctly when run from
the repository root directory.
Related: PR #145
2025-10-22 21:33:47 +03:00
yusyus
b83f276621
Update Python requirement to 3.10+ for MCP compatibility
The MCP package requires Python 3.10 or higher. Updated:
- GitHub Actions workflow to test Python 3.10, 3.11, 3.12
- README.md badge to Python 3.10+
- CLAUDE.md prerequisites
- CONTRIBUTING.md prerequisites
- docs/MCP_SETUP.md prerequisites
This fixes the MCP installation error in CI:
'ERROR: No matching distribution found for mcp>=1.0.0'
MCP package versions 0.9.1+ all require Python 3.10+.
2025-10-19 22:53:28 +03:00
yusyus
9ce78e9a16
Fix GitHub Actions workflow: Update Python version requirements
- Update CI workflow to Python 3.9-3.12 (from 3.7-3.11)
- Python 3.7 and 3.8 no longer available on ubuntu-latest (Ubuntu 24.04)
- Add fail-fast: false to continue testing on failures
- Update all documentation to reflect Python 3.9+ requirement
Files updated:
- .github/workflows/tests.yml - New Python versions
- README.md - Badge updated to Python 3.9+
- CLAUDE.md - Prerequisites updated
- CONTRIBUTING.md - Prerequisites updated
- docs/MCP_SETUP.md - Prerequisites updated
This fixes the failing GitHub Actions tests.
2025-10-19 22:49:14 +03:00
yusyus
06dabf639c
Update documentation: correct MCP tool count to 9 tools
- Update mcp/README.md: 8 tools → 9 tools, add upload_skill docs
- Update docs/MCP_SETUP.md: verify section lists all 9 tools
- Update docs/CLAUDE.md: MCP tool references updated
- Add upload_skill to tool listings and examples
- Update test coverage count: 31 → 34 tests
All documentation now accurately reflects the current feature set.
2025-10-19 22:22:03 +03:00
yusyus
d8cc92cd46
Add smart auto-upload feature with API key detection
Features:
- New upload_skill.py for automatic API-based upload
- Smart detection: upload if API key available, helpful message if not
- Enhanced package_skill.py with --upload flag
- New MCP tool: upload_skill (9 total MCP tools now)
- Enhanced MCP tool: package_skill with smart auto-upload
- Cross-platform folder opening in utils.py
- Graceful error handling throughout
Fixes:
- Fix missing import os in mcp/server.py
- Fix package_skill.py exit code (now 0 when API key missing)
- Improve UX with helpful messages instead of errors
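The smart detection and the exit-code fix above can be illustrated with a small sketch. It is not the code in upload_skill.py or package_skill.py: the function name `plan_upload`, the returned dictionary, and the `ANTHROPIC_API_KEY` variable name are assumptions for the example.

```python
import os


def plan_upload(zip_path: str, env=None) -> dict:
    """Decide between automatic upload and manual instructions."""
    env = os.environ if env is None else env
    # Assumed environment variable name for the API key.
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()

    if api_key:
        return {
            "action": "upload",
            "exit_code": 0,
            "message": f"API key found, uploading {zip_path}",
        }

    # A missing key is not treated as a failure: exit code stays 0 and the
    # user gets instructions instead of an error.
    return {
        "action": "manual",
        "exit_code": 0,
        "message": f"No API key set. Upload {zip_path} manually.",
    }
```

Returning a plan instead of performing the upload keeps the decision logic testable without network access.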
Tests: 14/14 passed (100%)
- CLI tests: 8/8 passed
- MCP tests: 6/6 passed
Files: +4 new, 5 modified, ~600 lines added
2025-10-19 22:17:23 +03:00
yusyus
6b97a9edc6
Update documentation for large documentation features
Comprehensive documentation updates for large docs support:
README.md:
- Add "Large Documentation Support" to key features
- Add "Router/Hub Skills" feature highlight
- Add "Checkpoint/Resume" feature highlight
- Update MCP tools count: 6 → 8
- Add complete section 7: Large Documentation Support (10K-40K+ Pages)
  - Split strategies: auto, category, router, size
  - Parallel scraping workflow
  - Configuration examples
  - Benefits and use cases
- Add section 8: Checkpoint/Resume for Long Scrapes
  - Configuration examples
  - Resume/fresh workflow
  - Benefits and features
- Update documentation links to include LARGE_DOCUMENTATION.md
- Update MCP guide links to reflect 8 tools
docs/CLAUDE.md:
- Add resume/checkpoint commands
- Add large documentation commands (split, router, package_multi)
- Update MCP integration section (8 tools)
- Expand directory structure to show new files
- Add split_strategy, split_config, checkpoint config parameters
- Add "Large Documentation Support" and "Checkpoint/Resume" features
- Add complete large documentation workflow (40K pages example)
- Update all command paths to use cli/ prefix
mcp/README.md:
- Update tool count: 6 → 8
- Add tool 7: split_config with full documentation
- Add tool 8: generate_router with full documentation
- Add "Large Documentation (40K Pages)" workflow example
- Update test coverage: 25 → 31 tests
- Update performance table with parallel scraping metrics
- Document all split strategies
docs/MCP_SETUP.md:
- Update verified tools count: 6 → 8
- Update test count: 25 → 31
All documentation now comprehensively covers:
- Large documentation handling (10K-40K+ pages)
- Router/hub architecture
- Config splitting strategies
- Checkpoint/resume functionality
- Parallel scraping workflows
- Complete MCP integration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 20:58:47 +03:00
yusyus
bddb57f5ef
Add large documentation handling (40K+ pages support)
Implement comprehensive system for handling very large documentation sites
with intelligent splitting strategies and router/hub architecture.
**New CLI Tools:**
- cli/split_config.py: Split large configs into focused sub-skills
  * Strategies: auto, category, router, size
  * Configurable target pages per skill (default: 5000)
  * Dry-run mode for preview
- cli/generate_router.py: Create intelligent router/hub skills
  * Auto-generates routing logic based on keywords
  * Creates SKILL.md with topic-to-skill mapping
  * Infers router name from sub-skills
- cli/package_multi.py: Batch package multiple skills
  * Package router + all sub-skills in one command
  * Progress tracking for each skill
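As a rough illustration of two ideas in the list above, the sketch below shows a size-based split and a keyword-based router. Both functions are hypothetical simplifications, not the logic in cli/split_config.py or cli/generate_router.py; only the default of 5000 target pages comes from the commit message.

```python
def split_by_size(urls: list[str], target_pages: int = 5000) -> list[list[str]]:
    """Chunk a page list so that each sub-skill holds at most target_pages pages."""
    if target_pages < 1:
        raise ValueError("target_pages must be at least 1")
    return [urls[i:i + target_pages] for i in range(0, len(urls), target_pages)]


def route_query(query: str, keyword_map: dict[str, list[str]]):
    """Return the sub-skill whose keywords appear most often in the query, or None."""
    text = query.lower()
    best_name, best_score = None, 0
    for name, keywords in keyword_map.items():
        score = sum(1 for kw in keywords if kw.lower() in text)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

A router skill built this way only needs the keyword map, so sub-skills can be scraped in parallel and combined afterwards.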
**MCP Integration:**
- Added split_config tool (8 total MCP tools now)
- Added generate_router tool
- Supports 40K+ page documentation via MCP
**Configuration:**
- New split_strategy parameter in configs
- split_config section for fine-tuned control
- checkpoint section for resume capability (ready for Phase 4)
- Example: configs/godot-large-example.json
**Documentation:**
- docs/LARGE_DOCUMENTATION.md (500+ lines)
  * Complete guide for 10K+ page documentation
  * All splitting strategies explained
  * Detailed workflows with examples
  * Best practices and troubleshooting
  * Real-world examples (AWS, Microsoft, Godot)
**Features:**
✅ Handle 40K+ page documentation efficiently
✅ Parallel scraping support (5x-10x faster)
✅ Router + sub-skills architecture
✅ Intelligent keyword-based routing
✅ Multiple splitting strategies
✅ Full MCP integration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 20:48:03 +03:00
yusyus
1c5801d121
Update documentation for MCP integration
Comprehensive documentation updates reflecting MCP integration:
README.md:
- Add MCP Integration and Tests Passing badges
- Enhance MCP section with "Tested and Working" status
- Add links to both setup and testing guides
docs/MCP_SETUP.md:
- Update status to reflect production testing
- Add integration testing verification notes
- Confirm all 6 tools working with natural language
CLAUDE.md:
- Add prominent MCP Integration section at top
- List all 6 available MCP tools with descriptions
- Add setup instructions and production status
docs/TEST_MCP_IN_CLAUDE_CODE.md (moved from root):
- Relocate testing guide to docs/ for better organization
- Provides step-by-step MCP integration testing workflow
- Documents complete test suite for all 6 tools
All documentation now accurately reflects the fully tested and
working MCP integration verified in production Claude Code environment.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 19:44:47 +03:00
yusyus
b69f57b60a
Add comprehensive MCP setup guide and integration test template
**Documentation Added:**
- docs/MCP_SETUP.md: Complete 400+ line setup guide
  - Prerequisites and installation steps
  - Configuration examples for Claude Code
  - Verification and troubleshooting
  - 3 usage examples and advanced configuration
  - End-to-end workflow and quick reference
- tests/mcp_integration_test.md: Comprehensive test template
  - 10 test cases covering all MCP tools
  - Performance metrics table
  - Issue tracking and environment setup
  - Setup and cleanup scripts
- .claude/mcp_config.example.json: Example MCP configuration
**Documentation Updated:**
- STRUCTURE.md: Complete monorepo structure documentation
- CLAUDE.md: All Python script paths updated to cli/ prefix
- docs/USAGE.md: All command examples updated for monorepo
- TODO.md: Current sprint status and completed tasks
**Summary:**
- Issues #2 and #3 handled (MCP setup guide + integration tests)
- All documentation now reflects monorepo structure (cli/ + mcp/)
- Tests: 71/71 passing (100%)
- Ready for MCP server testing with Claude Code
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 17:01:37 +03:00
yusyus
3144d3cf3a
Add comprehensive usage guide for all tools and workflows
- Add docs/USAGE.md (~650 lines)
  - Complete command reference for all tools
  - Full help output for doc_scraper.py, estimate_pages.py, run_tests.py
  - Usage examples for enhancement and packaging tools
  - All 6 preset configs documented with details
  - 6 common workflows from quick start to advanced
  - Troubleshooting section with solutions
  - Advanced usage: custom selectors, URL patterns, categories
  - Performance tips and best practices
  - Exit codes, environment variables, file locations
Tools covered:
- doc_scraper.py (main tool with all options)
- estimate_pages.py (page count estimator)
- enhance_skill.py (API enhancement)
- enhance_skill_local.py (local enhancement)
- package_skill.py (skill packager)
- run_tests.py (test runner)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 13:34:02 +03:00
yusyus
f1fa8354d2
Add comprehensive test system with 71 tests (100% pass rate)
Test Framework:
- Created tests/ directory structure
- Added __init__.py for test package
- Implemented 71 comprehensive tests across 3 test suites
Test Suites:
1. test_config_validation.py (25 tests)
   - Valid/invalid config structure
   - Required fields validation
   - Name format validation
   - URL format validation
   - Selectors validation
   - URL patterns validation
   - Categories validation
   - Rate limit validation (0-10 range)
   - Max pages validation (1-10000 range)
   - Start URLs validation
2. test_scraper_features.py (28 tests)
   - URL validation (include/exclude patterns)
   - Language detection (Python, JavaScript, GDScript, C++, etc.)
   - Pattern extraction from documentation
   - Smart categorization (by URL, title, content)
   - Text cleaning utilities
3. test_integration.py (18 tests)
   - Dry-run mode functionality
   - Config loading and validation
   - Real config files validation (godot, react, vue, django, fastapi, steam)
   - URL processing and normalization
   - Content extraction
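The two range checks named in suite 1 could be expressed roughly as follows. This is a sketch under assumptions: the config keys `rate_limit` and `max_pages` and the function name `validate_limits` are illustrative, while the ranges 0-10 and 1-10000 come from the list above.

```python
def validate_limits(config: dict) -> list[str]:
    """Return validation errors for the rate limit and max pages fields."""
    errors = []

    # Assumed key name; the allowed range is 0-10 inclusive.
    rate = config.get("rate_limit")
    if rate is not None:
        if not isinstance(rate, (int, float)) or not 0 <= rate <= 10:
            errors.append("rate_limit must be a number between 0 and 10")

    # Assumed key name; the allowed range is 1-10000 inclusive.
    max_pages = config.get("max_pages")
    if max_pages is not None:
        if not isinstance(max_pages, int) or not 1 <= max_pages <= 10000:
            errors.append("max_pages must be an integer between 1 and 10000")

    return errors
```

Returning a list of messages instead of raising on the first problem lets a test assert on every invalid field in one pass.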
Test Runner (run_tests.py):
- Custom colored test runner with ANSI colors
- Detailed test summary with breakdown by category
- Success rate calculation
- Command-line options:
  --suite: Run specific test suite
  --verbose: Show each test name
  --quiet: Minimal output
  --failfast: Stop on first failure
  --list: List all available tests
- Execution time: ~1 second for full suite
Documentation:
- Added comprehensive TESTING.md guide
- Test writing templates
- Best practices
- Coverage information
- Troubleshooting guide
.gitignore:
- Added Python cache files
- Added output directory
- Added IDE and OS files
Test Results:
✅ 71/71 tests passing (100% pass rate)
✅ All existing configs validated
✅ Fast execution (<1 second)
✅ Ready for CI/CD integration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 02:08:58 +03:00
yusyus
78b9cae398
Init
2025-10-17 15:14:44 +00:00