Commit Graph

16 Commits

Author SHA1 Message Date
sogoiii
91692db87c 📝 docs: add skip_llms_txt to config parameters documentation 2025-11-20 14:00:55 -08:00
yusyus
5ee07a2181 docs: Update CLAUDE.md for v2.0.0 PyPI release
Major updates for v2.0.0:
- Added PyPI publication status and installation instructions
- Updated to reflect modern Python packaging (src/ layout, pyproject.toml)
- Updated all commands to use 'skill-seekers' CLI instead of python3 cli/*
- Updated file structure section for src/ layout
- Updated key code locations with new paths
- Added FUTURE_RELEASES.md to documentation list
- Updated test count (379 passing, all CI checks green)
- Updated date to November 11, 2025
- Added development workflow section
- Reorganized Additional Documentation into categories

All sections now reflect the post-PyPI publication state of the project.
2025-11-11 23:27:48 +03:00
yusyus
693294be8e docs: Update CLAUDE.md with new unified CLI commands
Updated all command examples to use new entry points:
- skill-seekers scrape (was: python3 cli/doc_scraper.py)
- skill-seekers unified (was: python3 cli/unified_scraper.py)
- skill-seekers estimate (was: python3 cli/estimate_pages.py)
- skill-seekers package (was: python3 cli/package_skill.py)
- skill-seekers enhance (was: python3 cli/enhance_skill_local.py)
- skill-seekers upload (was: python3 cli/upload_skill.py)

All 44+ command examples now use modern entry point syntax.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 01:25:40 +03:00
yusyus
13b19c2b06 Update CLAUDE.md with current project status
- Update date from October 26 to November 6, 2025
- Update test count: 390 tests total, 378 passing, 12 unified tests failing
- Update configs inventory: 24 total configs (14 single-source, 5 unified, 5 test)
- Add priority task: Fix 12 failing unified tests
- Update status: Core functionality stable, unified tests need attention
- Add detailed config breakdown by category
- Update available configs section with complete categorization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-06 23:23:12 +03:00
yusyus
1e277f80d2 Update documentation for unified multi-source scraping (v2.0.0)
Major documentation update explaining the new unified scraping system that combines documentation + GitHub + PDF sources in a single skill with automatic conflict detection.

## Changes:

**README.md:**
- Update version badge to v2.0.0
- Add "Unified Multi-Source Scraping" to Key Features section
- Add comprehensive Option 5 section showing:
  - Problem statement (documentation drift)
  - Solution with code example
  - Conflict detection types and severity levels
  - Transparent reporting with side-by-side comparison
  - List of advantages (identifies gaps, catches changes, single source of truth)
  - Available unified configs
  - Link to full guide (docs/UNIFIED_SCRAPING.md)

**CLAUDE.md:**
- Update Current Status to v2.0.0
- Add "Major Release: Unified Multi-Source Scraping" in Recent Updates
- Update configs count from 11/11 to 15/15 (added 4 unified configs)
- Add new "Unified Multi-Source Scraping" section under Core Commands
- Include command examples and feature highlights
- Explain what makes unified scraping special

**QUICKSTART.md:**
- Add Option D: Unified Multi-Source to Step 2
- Add unified configs to Available Presets section
- Show react_unified, django_unified, fastapi_unified, godot_unified examples

## Value:
This documentation update explains how unified scraping helps developers:
- Mix documentation + code in one skill
- Automatically detect conflicts (missing_in_docs, missing_in_code, signature_mismatch)
- Get transparent side-by-side comparisons with ⚠️ warnings
- Identify documentation gaps and outdated docs
- Create a single source of truth combining both sources

Related to: Phase 7-11 unified scraper implementation (commit 5d8c7e3)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 16:41:58 +03:00
yusyus
319331f5a6 feat: Complete refactoring with async support, type safety, and package structure
This comprehensive refactoring improves code quality, performance, and maintainability
while maintaining 100% backwards compatibility.

## Major Features Added

### 🚀 Async/Await Support (2-3x Performance Boost)
- Added `--async` flag for parallel scraping using asyncio
- Implemented `scrape_page_async()` with httpx.AsyncClient
- Implemented `scrape_all_async()` with asyncio.gather()
- Connection pooling for better resource management
- Performance: 18 pg/s → 55 pg/s (3x faster)
- Memory: 120 MB → 40 MB (66% reduction)
- Full documentation in ASYNC_SUPPORT.md

### 📦 Python Package Structure (Phase 0 Complete)
- Created cli/__init__.py for clean imports
- Created skill_seeker_mcp/__init__.py (renamed from mcp/)
- Created skill_seeker_mcp/tools/__init__.py
- Proper package imports: `from cli import constants`
- Better IDE support and autocomplete

### ⚙️ Centralized Configuration
- Created cli/constants.py with 18 configuration constants
- DEFAULT_ASYNC_MODE, DEFAULT_RATE_LIMIT, DEFAULT_MAX_PAGES
- Enhancement limits, categorization scores, file limits
- All magic numbers now centralized and configurable

### 🔧 Code Quality Improvements
- Converted 71 print() statements to proper logging
- Added type hints to all DocToSkillConverter methods
- Fixed all mypy type checking issues
- Installed types-requests for better type safety
- Code quality: 5.5/10 → 6.5/10

## Testing
- Test count: 207 → 299 tests (92 new tests)
- 11 comprehensive async tests (all passing)
- 16 constants tests (all passing)
- Fixed test isolation issues
- 100% pass rate maintained (299/299 passing)

## Documentation
- Updated README.md with async examples and test count
- Updated CLAUDE.md with async usage guide
- Created ASYNC_SUPPORT.md (292 lines)
- Updated CHANGELOG.md with all changes
- Cleaned up temporary refactoring documents

## Cleanup
- Removed temporary planning/status documents
- Moved test_pr144_concerns.py to tests/ folder
- Updated .gitignore for test artifacts
- Better repository organization

## Breaking Changes
None - all changes are backwards compatible.
Async mode is opt-in via --async flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 13:05:39 +03:00
Preston Brown
de5344caf9 Add virtual environment setup and minimal dependencies (#149)
## Changes
- Add virtual environment setup instructions to all docs
- Create requirements.txt with minimal dependencies (13 packages)
- Make anthropic optional (only needed for API enhancement)
- Clarify path notation (~ = $HOME, /Users/yourname examples)
- Add venv activation reminders throughout documentation

## Files Changed
- README.md: Added venv setup section to CLI method
- BULLETPROOF_QUICKSTART.md: Replaced Step 4 with venv setup
- CLAUDE.md: Updated Prerequisites with venv instructions
- requirements.txt: Created with minimal deps (requests, beautifulsoup4, pytest)

## Why
- Prevents package conflicts and permission issues
- Standard Python development practice
- Enables proper pytest usage without pipx complications
- Makes setup clearer for beginners
2025-10-22 21:54:05 +03:00
yusyus
ff148cf98f Update documentation for new Ansible config
Added ansible-core.json config to available presets list in:
- README.md: Added to preset table and usage examples
- CLAUDE.md: Added to production configs list with details

Changes:
- Total configs: 11 → 12
- New category: DevOps & Automation
- Reorganized config list for better categorization

Related: PR #147
2025-10-22 21:51:45 +03:00
yusyus
831ea67d58 Update task tracking and CLAUDE.md with latest progress
Documentation Updates:
======================

TODO.md:
--------
 Added "Completed This Week" section:
   - H1.1: Issue #8 fixed (bulletproof docs + MCP setup)
   - H1.2: Issue #7 fixed (11/11 configs working)
   - H1.4: Issue #4 linked to roadmap
   - PR #5: Reviewed and approved

 Updated "Immediate Tasks" list:
   - Removed completed tasks
   - Added H1.3 (example project) as next priority

 Updated Progress Tracking:
   - 10 items completed this week
   - Clear visibility of accomplishments
   - Next steps clearly defined

NEXT_TASKS.md:
--------------
 Marked completed tasks in Starter Pack:
   - H1.1 (Issue #8) - DONE
   - H1.2 (Issue #7) - DONE
   - H1.4 (Issue #4) - DONE
   - PR #5 Review - DONE

 Updated Current Sprint (Oct 20-27):
   - Monday/Tuesday: 4/4 tasks completed 
   - Wednesday/Thursday: 3 tasks remaining
   - Progress: 4/10 tasks (40%)

 Added specific accomplishments:
   - Community engaged (3 issues)
   - All configs fixed (11/11)
   - PR security verified
   - Bulletproof documentation

CLAUDE.md:
----------
 Added "Current Status" section at top:
   - Version: v1.0.0
   - Recent updates this week
   - Community response wins
   - Next priorities

 Added configs status:
   - 11/11 verified working (100%)
   - New Laravel config
   - All selectors tested

 Added roadmap reference:
   - 134 tasks in 22 groups
   - Project board link
   - Clear next steps

 Added Laravel to Quick Start examples

 Added "Available Production Configs" section:
   - All 11 configs listed with selectors
   - Content extraction stats
   - Organized by category
   - Verification date

 Updated Additional Documentation:
   - Added BULLETPROOF_QUICKSTART.md
   - Added TROUBLESHOOTING.md
   - Added FLEXIBLE_ROADMAP.md
   - Added NEXT_TASKS.md
   - Added TODO.md

Impact:
-------
- Clear visibility of progress (4 major items this week)
- Updated guidance for Claude Code
- Accurate config information (11 working configs)
- Better onboarding with new docs
- Transparent roadmap tracking

Files modified: TODO.md, NEXT_TASKS.md, CLAUDE.md
2025-10-21 00:42:36 +03:00
yusyus
b83f276621 Update Python requirement to 3.10+ for MCP compatibility
The MCP package requires Python 3.10 or higher. Updated:
- GitHub Actions workflow to test Python 3.10, 3.11, 3.12
- README.md badge to Python 3.10+
- CLAUDE.md prerequisites
- CONTRIBUTING.md prerequisites
- docs/MCP_SETUP.md prerequisites

This fixes the MCP installation error in CI:
'ERROR: No matching distribution found for mcp>=1.0.0'

MCP package versions 0.9.1+ all require Python 3.10+.
2025-10-19 22:53:28 +03:00
yusyus
9ce78e9a16 Fix GitHub Actions workflow: Update Python version requirements
- Update CI workflow to Python 3.9-3.12 (from 3.7-3.11)
- Python 3.7 and 3.8 no longer available on ubuntu-latest (Ubuntu 24.04)
- Add fail-fast: false to continue testing on failures
- Update all documentation to reflect Python 3.9+ requirement

Files updated:
- .github/workflows/tests.yml - New Python versions
- README.md - Badge updated to Python 3.9+
- CLAUDE.md - Prerequisites updated
- CONTRIBUTING.md - Prerequisites updated
- docs/MCP_SETUP.md - Prerequisites updated

This fixes the failing GitHub Actions tests.
2025-10-19 22:49:14 +03:00
yusyus
d8cc92cd46 Add smart auto-upload feature with API key detection
Features:
- New upload_skill.py for automatic API-based upload
- Smart detection: upload if API key available, helpful message if not
- Enhanced package_skill.py with --upload flag
- New MCP tool: upload_skill (9 total MCP tools now)
- Enhanced MCP tool: package_skill with smart auto-upload
- Cross-platform folder opening in utils.py
- Graceful error handling throughout

Fixes:
- Fix missing import os in mcp/server.py
- Fix package_skill.py exit code (now 0 when API key missing)
- Improve UX with helpful messages instead of errors

Tests: 14/14 passed (100%)
- CLI tests: 8/8 passed
- MCP tests: 6/6 passed

Files: +4 new, 5 modified, ~600 lines added
2025-10-19 22:17:23 +03:00
yusyus
1c5801d121 Update documentation for MCP integration
Comprehensive documentation updates reflecting MCP integration:

README.md:
- Add MCP Integration and Tests Passing badges
- Enhance MCP section with "Tested and Working" status
- Add links to both setup and testing guides

docs/MCP_SETUP.md:
- Update status to reflect production testing
- Add integration testing verification notes
- Confirm all 6 tools working with natural language

CLAUDE.md:
- Add prominent MCP Integration section at top
- List all 6 available MCP tools with descriptions
- Add setup instructions and production status

docs/TEST_MCP_IN_CLAUDE_CODE.md (moved from root):
- Relocate testing guide to docs/ for better organization
- Provides step-by-step MCP integration testing workflow
- Documents complete test suite for all 6 tools

All documentation now accurately reflects the fully tested and
working MCP integration verified in production Claude Code environment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 19:44:47 +03:00
yusyus
b69f57b60a Add comprehensive MCP setup guide and integration test template
**Documentation Added:**
- docs/MCP_SETUP.md: Complete 400+ line setup guide
  - Prerequisites and installation steps
  - Configuration examples for Claude Code
  - Verification and troubleshooting
  - 3 usage examples and advanced configuration
  - End-to-end workflow and quick reference

- tests/mcp_integration_test.md: Comprehensive test template
  - 10 test cases covering all MCP tools
  - Performance metrics table
  - Issue tracking and environment setup
  - Setup and cleanup scripts

- .claude/mcp_config.example.json: Example MCP configuration

**Documentation Updated:**
- STRUCTURE.md: Complete monorepo structure documentation
- CLAUDE.md: All Python script paths updated to cli/ prefix
- docs/USAGE.md: All command examples updated for monorepo
- TODO.md: Current sprint status and completed tasks

**Summary:**
- Issues #2 and #3 handled (MCP setup guide + integration tests)
- All documentation now reflects monorepo structure (cli/ + mcp/)
- Tests: 71/71 passing (100%)
- Ready for MCP server testing with Claude Code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 17:01:37 +03:00
yusyus
9c1a133c51 Add page count estimator for fast config validation
- Add estimate_pages.py script (~270 lines)
- Fast estimation without downloading content (HEAD requests only)
- Shows estimated total pages and recommended max_pages
- Validates URL patterns work correctly
- Estimates scraping time based on rate_limit
- Update CLAUDE.md with estimator workflow and commands
- Update README.md features section with estimation benefits
- Usage: python3 estimate_pages.py configs/react.json
- Time: 1-2 minutes vs 20-40 minutes for full scrape

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 02:44:50 +03:00
yusyus
f8c75a3b2d Add comprehensive CLAUDE.md for Claude Code integration
- Add root-level CLAUDE.md with complete guidance for Claude Code
- Include Python 3.7+ requirement
- Add first-time user workflow with all commands
- Include CSS selector testing with BeautifulSoup examples
- Add output quality verification commands
- Document force re-scrape instructions
- Fix package_skill.py path (remove hardcoded /mnt/skills reference)
- Add complete config file structure with real examples
- Include testing section for selector validation
- Add performance metrics table
- Document all key code locations with line numbers
- Organize by: quick start → architecture → workflows → troubleshooting
- Preserve existing docs/CLAUDE.md as detailed technical reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 01:43:02 +03:00