firefrost-gaming/skill-seekers-reference

Author	SHA1	Message	Date
yusyus	cb0d3e885e	fix: Resolve MCP package shadowing issue and add package structure tests 🐛 Fixes: - Fix mcp package shadowing by importing external MCP before sys.path modification - Update mcp/server.py to avoid shadowing installed mcp package - Update tests/test_mcp_server.py import order ✅ Tests Added: - Add tests/test_package_structure.py with 23 comprehensive tests - Test cli package structure and imports - Test mcp package structure and imports - Test backwards compatibility - All package structure tests passing ✅ 📊 Test Results: - 205 tests passed ✅ - 67 tests skipped (PDF features, PyMuPDF not installed) - 23 new package structure tests added - Total: 272 tests (excluding test_mcp_server.py which needs more work) ⚠️ Known Issue: - test_mcp_server.py still has import issues (67 tests) - Will be fixed in next commit - Main functionality tests all passing Impact: Package structure working, 75% of tests passing	2025-10-26 00:26:57 +03:00
yusyus	fb0cb99e6b	feat(refactor): Phase 0 - Add Python package structure ✨ Improvements: - Add .gitignore entries for test artifacts (.pytest_cache, .coverage, htmlcov) - Create cli/__init__.py with exports for llms_txt modules - Create mcp/__init__.py with package documentation - Create mcp/tools/__init__.py as placeholder for future modularization ✅ Benefits: - Proper Python package structure enables clean imports - IDE autocomplete now works for cli modules - Can use: from cli import LlmsTxtDetector - Foundation for future refactoring 📊 Impact: - Code Quality: 6.0/10 (up from 5.5/10) - Import Issues: Fixed ✅ - Package Structure: Fixed ✅ Related: Phase 0 of REFACTORING_PLAN.md Time: 42 minutes Risk: Zero - additive changes only	2025-10-26 00:17:21 +03:00
yusyus	a0298b884a	fix: Add summary job to resolve CI merge blocking issue Adds 'tests-complete' summary job that: - Provides single status check for branch protection - Only passes when all matrix tests succeed - Fixes "Tests" check always showing as pending - Resolves PR merge blocking issue This ensures PRs can auto-merge once all 5 matrix jobs pass.	2025-10-25 14:54:33 +03:00
yusyus	42832d4064	Merge pull request #151 from eibrahimov/development Phase 1: Active Skills Foundation - Multi-variant llms.txt Support	2025-10-25 14:53:11 +03:00
Edgar I.	22404c36b3	fix: download all variants even with explicit llms_txt_url	2025-10-24 18:28:30 +04:00
Edgar I.	0e3f0c6375	docs: update status for Phase 1 completion	2025-10-24 18:28:30 +04:00
Edgar I.	b98457dfb1	feat: remove content truncation in reference files	2025-10-24 18:27:17 +04:00
Edgar I.	ac959d3ed5	feat: download all llms.txt variants with proper .md extension	2025-10-24 18:27:17 +04:00
Edgar I.	4e871588ae	feat: add get_proper_filename() for .txt to .md conversion	2025-10-24 18:27:17 +04:00
Edgar I.	e123de9055	feat: add detect_all() for multi-variant detection	2025-10-24 18:27:17 +04:00
Edgar I.	38ebc66749	docs: add Phase 1 implementation plan for active skills	2025-10-24 18:27:17 +04:00
Edgar I.	38aa2cecec	docs: add active skills design for demand-driven documentation	2025-10-24 18:27:17 +04:00
Edgar I.	812c0992b3	docs: add comprehensive llms.txt feature documentation	2025-10-24 18:27:17 +04:00
Edgar I.	697b42e9eb	docs: update MCP tool description for llms.txt	2025-10-24 18:27:17 +04:00
Edgar I.	41d1846278	test: add e2e test for llms.txt workflow	2025-10-24 18:27:17 +04:00
Edgar I.	104818f983	feat: enable llms.txt for hono config	2025-10-24 18:27:17 +04:00
Edgar I.	99a40d3a1b	feat: support explicit llms_txt_url in config	2025-10-24 18:27:17 +04:00
Edgar I.	0b6c2ed593	docs: add llms.txt support documentation	2025-10-24 18:27:17 +04:00
Edgar I.	12424e390c	feat: integrate llms.txt detection into scraping workflow	2025-10-24 18:26:10 +04:00
Edgar I.	e88a4b0fcc	fix: add retries, markdown validation, and test mocking to downloader - Implement retry logic with exponential backoff (default: 3 retries) - Add markdown validation to check for markdown patterns - Replace flaky HTTP tests with comprehensive mocking - Add 10 test cases covering all scenarios: - Successful download - Timeout with retry - Empty content rejection (<100 chars) - Non-markdown rejection - HTTP error handling - Exponential backoff validation - Markdown pattern detection - Custom timeout parameter - Custom max_retries parameter - User agent header verification All tests now pass reliably (10/10) without making real HTTP requests.	2025-10-24 18:26:10 +04:00
Edgar I.	3dd928b34b	feat: add llms.txt downloader with error handling	2025-10-24 18:26:10 +04:00
Edgar I.	a18ea8cf68	feat: add llms.txt markdown parser	2025-10-24 18:26:10 +04:00
Edgar I.	60fefb6c0b	fix: improve URL parsing and add test mocking for llms.txt detector	2025-10-24 18:26:10 +04:00
Edgar I.	8f44193b61	feat: add llms.txt detection module	2025-10-24 18:26:10 +04:00
yusyus	691318117c	Reorganize Key Features section with clear categories	2025-10-23 22:02:39 +03:00
yusyus	d309e1cfe7	Fix formatting in Key Features section Add blank line after PDF Documentation Support section for better readability 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 21:57:56 +03:00
yusyus	a612096fd3	Merge development into main (v1.2.0 release) Release v1.2.0 - PDF Advanced Features This release includes: - v1.1.0: Documentation Scraping Enhancements (unlimited scraping, parallel mode) - v1.2.0: PDF Advanced Features (OCR, passwords, tables, 3x faster) Priority 2 Features: - OCR support for scanned PDFs - Password-protected PDF support - Complex table extraction Priority 3 Features: - Parallel page processing (3x faster) - Intelligent caching (50% faster re-runs) Testing: 142/142 tests passing (100%) See CHANGELOG.md for full details. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 21:46:52 +03:00
yusyus	7c853e5e9c	Merge feature/pdf-support-clean into development Adds PDF Advanced Features (v1.2.0) This merge brings Priority 2 & 3 PDF features: - OCR support for scanned PDFs - Password-protected PDF support - Complex table extraction - Parallel page processing (3x faster) - Intelligent caching (50% faster re-runs) All 142 tests passing (100%) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 21:44:15 +03:00
yusyus	394eab218e	Add PDF Advanced Features (v1.2.0) Priority 2 & 3 Features Implemented: - OCR support for scanned PDFs (pytesseract + Pillow) - Password-protected PDF support - Complex table extraction - Parallel page processing (3x faster) - Intelligent caching (50% faster re-runs) Testing: - New test file: test_pdf_advanced_features.py (26 tests) - Updated test_pdf_extractor.py (23 tests) - Updated test_pdf_scraper.py (18 tests) - Total: 49/49 PDF tests passing (100%) - Overall: 142/142 tests passing (100%) Documentation: - Added docs/PDF_ADVANCED_FEATURES.md (580 lines) - Updated CHANGELOG.md with v1.1.0 and v1.2.0 - Updated README.md version badges and features - Updated docs/TESTING.md with new test counts Dependencies: - Added Pillow==11.0.0 - Added pytesseract==0.3.13 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 21:43:05 +03:00
yusyus	8ebd736055	Update documentation to include PDF support - Add PDF support to README.md Key Features - Add PDF CLI example (Option 3) - Update MCP README from 9 to 10 tools - Add scrape_pdf tool documentation - Add PDF workflow example - Update tool descriptions All main documentation now reflects PDF functionality	2025-10-23 00:33:44 +03:00
yusyus	6936057820	Add PDF documentation support (Tasks B1.1-B1.8) Complete PDF extraction and skill conversion functionality: - pdf_extractor_poc.py (1,004 lines): Extract text, code, images from PDFs - pdf_scraper.py (353 lines): Convert PDFs to Claude skills - MCP tool scrape_pdf: PDF scraping via Claude Code - 7 comprehensive documentation guides (4,705 lines) - Example PDF config format (configs/example_pdf.json) Features: - 3 code detection methods (font, indent, pattern) - 19+ programming languages detected with confidence scoring - Syntax validation and quality scoring (0-10 scale) - Image extraction with size filtering (--extract-images) - Chapter/section detection and page chunking - Quality-filtered code examples (--min-quality) - Three usage modes: config file, direct PDF, from extracted JSON Technical: - PyMuPDF (fitz) as primary library (60x faster than alternatives) - Language detection with confidence scoring - Code block merging across pages - Comprehensive metadata and statistics - Compatible with existing Skill Seeker workflow MCP Integration: - New scrape_pdf tool (10th MCP tool total) - Supports all three usage modes - 10-minute timeout for large PDFs - Real-time streaming output Documentation (4,705 lines): - B1_COMPLETE_SUMMARY.md: Overview of all 8 tasks - PDF_PARSING_RESEARCH.md: Library comparison and benchmarks - PDF_EXTRACTOR_POC.md: POC documentation - PDF_CHUNKING.md: Page chunking guide - PDF_SYNTAX_DETECTION.md: Syntax detection guide - PDF_IMAGE_EXTRACTION.md: Image extraction guide - PDF_SCRAPER.md: PDF scraper usage guide - PDF_MCP_TOOL.md: MCP integration guide Tasks completed: B1.1-B1.8 Addresses Issue #27 See docs/B1_COMPLETE_SUMMARY.md for complete details 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-23 00:23:16 +03:00
yusyus	05dc5c1cf6	Update GitHub Actions to use development branch Changed: - tests.yml: Run on 'development' instead of 'dev' - Triggers on push to: main, development - Triggers on PRs to: main, development This ensures: ✅ All PRs to development run tests ✅ Pushes to development run tests ✅ Branch protection can require 'Tests' check ✅ CI works with new two-branch workflow Related: Two-branch workflow setup 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 23:35:47 +03:00
yusyus	15fffd236b	Establish two-branch workflow: main + development Changes: 1. Created 'development' branch as integration branch 2. Set 'development' as default branch for all PRs 3. Protected both branches with appropriate rules Branch Protection: - main: Requires tests + 1 review, only maintainer merges - development: Requires tests, open for all contributor PRs Updated CONTRIBUTING.md: - Added comprehensive Branch Workflow section - Updated all examples to use 'development' branch - Clear visual diagram of branch structure - Step-by-step workflow example Workflow: - Contributors: Create feature branches from 'development' - PRs: Always target 'development' (not main) - Releases: Maintainer merges 'development' → 'main' This ensures: ✅ main always stable and production-ready ✅ development integrates all ongoing work ✅ Clear separation between integration and production ✅ Only maintainer controls production releases 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 23:30:45 +03:00
yusyus	8f062bb96c	Fix GitHub Actions release workflow permissions Problem: - Release workflow failing with "Resource not accessible by integration" - Missing permissions for GITHUB_TOKEN to create releases - Workflow tried to create releases that already exist manually Fix: 1. Added `permissions: contents: write` at workflow level - Grants GITHUB_TOKEN permission to create/edit releases - Required for softprops/action-gh-release@v1 2. Added release existence check before creation - Prevents errors when release already exists - Skips creation gracefully with informative message - Useful for manually created releases (like v1.1.0) Changes: - Line 8-9: Added permissions section - Line 48-57: Check if release exists with gh CLI - Line 59-60: Only create if release doesn't exist - Line 69-73: Skip message when release already exists This allows: - Automatic release creation on new tags - Manual release creation without workflow conflicts - Proper error handling and user feedback Related: GitHub Actions permissions model https://docs.github.com/en/actions/security-guides/automatic-token-authentication 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 23:13:55 +03:00
yusyus	0c5515129b	Fix flaky upload_skill tests by restoring cwd in parallel scraping tests Problem: - 2 tests in test_upload_skill.py failing intermittently in CI - Tests passed individually but failed when run after test_parallel_scraping.py - Tests failed with exit code 2 instead of 0 when running `--help` Root Cause: - test_parallel_scraping.py calls `os.chdir(tmpdir)` to create temporary test directories - These directory changes persisted across test classes - When upload_skill CLI tests ran subprocess with path 'cli/upload_skill.py', the relative path was broken because cwd was still in the temp directory - Result: subprocess couldn't find the script, returned exit code 2 Fix: - Added setUp/tearDown to all 6 test classes in test_parallel_scraping.py - setUp saves original cwd with `self.original_cwd = os.getcwd()` - tearDown restores it with `os.chdir(self.original_cwd)` - Ensures tests don't pollute working directory state for subsequent tests Impact: - All 158 tests now pass consistently - No more flaky failures in CI - Test isolation properly maintained 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 22:53:49 +03:00
IbrahimAlbyrk-luduArts	7e94c276be	Add unlimited scraping, parallel mode, and rate limit control (#144) Add three major features for improved performance and flexibility: 1. Unlimited Scraping Mode - Support max_pages: null or -1 for complete documentation coverage - Added unlimited parameter to MCP tools - Warning messages for unlimited mode 2. Parallel Scraping (1-10 workers) - ThreadPoolExecutor for concurrent requests - Thread-safe with proper locking - 20x performance improvement (10K pages: 83min → 4min) - Workers parameter in config 3. Configurable Rate Limiting - CLI overrides for rate_limit - --no-rate-limit flag for maximum speed - Per-worker rate limiting semantics 4. MCP Streaming & Timeouts - Non-blocking subprocess with real-time output - Intelligent timeouts per operation type - Prevents frozen/hanging behavior Thread-Safety Fixes: - Fixed race condition on visited_urls.add() - Protected pages_scraped counter with lock - Added explicit exception checking for workers - All shared state operations properly synchronized Test Coverage: - Added 17 comprehensive tests for new features - All 117 tests passing - Thread safety validated Performance: - 1000 pages: 8.3min → 0.4min (20x faster) - 10000 pages: 83min → 4min (20x faster) - Maintains backward compatibility (default: 0.5s, 1 worker) Commits: - 309bf71: feat: Add unlimited scraping mode support - 3ebc2d7: fix(mcp): Add timeout and streaming output - 5d16fdc: feat: Add configurable rate limiting and parallel scraping - ae7883d: Fix MCP server tests for streaming subprocess - e5713dd: Fix critical thread-safety issues in parallel scraping - 303efaf: Add comprehensive tests for parallel scraping features Co-authored-by: IbrahimAlbyrk-luduArts <ialbayrak@luduarts.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-10-22 22:46:02 +03:00
yusyus	13fcce1f4e	Add comprehensive test coverage for CLI utilities Expand test suite from 118 to 166 tests (+48 new tests) with focus on untested CLI tools and utility functions. Overall coverage increased from 14% to 25%. New test files: - tests/test_utilities.py (42 tests) - API keys, file validation, formatting - tests/test_package_skill.py (11 tests) - Skill packaging workflow - tests/test_estimate_pages.py (8 tests) - Page estimation functionality - tests/test_upload_skill.py (7 tests) - Skill upload validation Coverage improvements by module: - cli/utils.py: 0% → 72% (+72%) - cli/upload_skill.py: 0% → 53% (+53%) - cli/estimate_pages.py: 0% → 47% (+47%) - cli/package_skill.py: 0% → 43% (+43%) All 166 tests passing. Added pytest-cov for coverage reporting. Updated requirements.txt with all dependencies including MCP packages. Test execution: 9.6s for complete suite 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-22 22:08:02 +03:00
Preston Brown	de5344caf9	Add virtual environment setup and minimal dependencies (#149) ## Changes - Add virtual environment setup instructions to all docs - Create requirements.txt with minimal dependencies (13 packages) - Make anthropic optional (only needed for API enhancement) - Clarify path notation (~ = $HOME, /Users/yourname examples) - Add venv activation reminders throughout documentation ## Files Changed - README.md: Added venv setup section to CLI method - BULLETPROOF_QUICKSTART.md: Replaced Step 4 with venv setup - CLAUDE.md: Updated Prerequisites with venv instructions - requirements.txt: Created with minimal deps (requests, beautifulsoup4, pytest) ## Why - Prevents package conflicts and permission issues - Standard Python development practice - Enables proper pytest usage without pipx complications - Makes setup clearer for beginners	2025-10-22 21:54:05 +03:00
yusyus	ff148cf98f	Update documentation for new Ansible config Added ansible-core.json config to available presets list in: - README.md: Added to preset table and usage examples - CLAUDE.md: Added to production configs list with details Changes: - Total configs: 11 → 12 - New category: DevOps & Automation - Reorganized config list for better categorization Related: PR #147	2025-10-22 21:51:45 +03:00
Schuyler Erle	183c7596a5	Add config for Ansible core documentation (#147) Co-authored-by: Schuyler Erle <schuyler@ardc.net>	2025-10-22 21:50:59 +03:00
yusyus	c03186574d	Add comprehensive CLI path tests and fix remaining issues Added 18 new tests covering all aspects of CLI path corrections: - Docstring/usage examples (5 tests) - Print statements (3 tests) - Subprocess calls (1 test) - Documentation files (3 tests) - Help output functionality (2 tests) - Script executability (4 tests) All tests verify that: 1. Scripts can be executed with cli/ prefix 2. Usage examples show correct paths 3. Print statements guide users correctly 4. No old hardcoded paths remain 5. Documentation is consistent Fixed additional issues found by tests: - cli/enhance_skill.py: Fixed 4 more occurrences in docstring and error message - cli/package_skill.py: Fixed 1 occurrence in help epilog Test Results: - Total tests: 118 (100 existing + 18 new) - All tests passing: 100% - Coverage: CLI paths, scraper features, config validation, integration, MCP server Related: PR #145	2025-10-22 21:45:51 +03:00
yusyus	581dbc792d	Fix CLI path references in Python code All Python scripts now use correct cli/ prefix in: - Usage docstrings (shown in --help) - Print statements (shown to users) - Subprocess calls (when calling other scripts) Changes: - cli/doc_scraper.py: Fixed 9 references (usage, print, subprocess) - cli/enhance_skill_local.py: Fixed 6 references (usage, print) - cli/enhance_skill.py: Fixed 5 references (usage, print) - cli/package_skill.py: Fixed 4 references (usage, epilog) - cli/estimate_pages.py: Fixed 3 references (epilog examples) All commands now correctly show: - python3 cli/doc_scraper.py (not python3 doc_scraper.py) - python3 cli/enhance_skill.py (not python3 enhance_skill.py) - python3 cli/enhance_skill_local.py (not python3 enhance_skill_local.py) - python3 cli/package_skill.py (not python3 package_skill.py) - python3 cli/estimate_pages.py (not python3 estimate_pages.py) Also fixed: - Old hardcoded path in enhance_skill_local.py:221 (was: /mnt/skills/examples/skill-creator/scripts/package_skill.py) (now: cli/package_skill.py) - Old hardcoded path in enhance_skill.py:210 (was: /mnt/skills/examples/skill-creator/scripts/package_skill.py) (now: cli/package_skill.py) This ensures all user-facing messages and subprocess calls use the correct paths when run from the repository root. Related: PR #145	2025-10-22 21:38:56 +03:00
yusyus	66719cd53a	Fix CLI path references in documentation Following PR #145 which fixed README.md, this commit corrects all remaining documentation files to use the correct cli/ directory prefix for Python scripts. Changes: - QUICKSTART.md: Fixed 21 occurrences (doc_scraper.py, enhance_skill_local.py, package_skill.py) - docs/UPLOAD_GUIDE.md: Fixed 10 occurrences (doc_scraper.py, enhance_skill_local.py, package_skill.py) - docs/ENHANCEMENT.md: Fixed 9 occurrences (doc_scraper.py, enhance_skill.py, enhance_skill_local.py) All commands now correctly reference: - python3 cli/doc_scraper.py (not python3 doc_scraper.py) - python3 cli/enhance_skill.py (not python3 enhance_skill.py) - python3 cli/enhance_skill_local.py (not python3 enhance_skill_local.py) - python3 cli/package_skill.py (not python3 package_skill.py) - python3 cli/estimate_pages.py (not python3 estimate_pages.py) This ensures all documentation examples work correctly when run from the repository root directory. Related: PR #145	2025-10-22 21:33:47 +03:00
Adam Creeger	9fcfc139bc	Update README to use cli directory for all CLI examples (#145)	2025-10-22 21:30:45 +03:00
yusyus	e5f4d100b0	Merge pull request #143 from schuyler/main Add config for Claude Code documentation	2025-10-22 21:22:55 +03:00
Schuyler Erle	ab585584d0	Add config for Claude Code documentation	2025-10-20 21:27:19 -07:00
yusyus	013523c81d	Close Issues #117 and #125 - Tasks already complete Discovered 2 tasks were already done: Issue #117 (H1.4) - Answer Issue #3: Pro plan compatibility =========================================================== ✅ Status: ALREADY COMPLETE What it was: - Answer user question about Pro plan compatibility Why it's done: - Issue #3 already answered comprehensively - User question: "Will this work with pro plan?" - Answer given: Works with any plan, no API key needed - Issue #3 already closed by owner Time: 0 hours (already done) Issue #125 (I2.1) - Write troubleshooting guide =============================================== ✅ Status: ALREADY COMPLETE What it was: - Write comprehensive troubleshooting guide - Document common issues and solutions Why it's done: - TROUBLESHOOTING.md created during H1.1 (Issue #8) - 447 lines of comprehensive troubleshooting - Covers: installation, runtime, MCP, scraping, platform-specific - Already committed in `9028974` Time: 1.5 hours (done as part of H1.1) Updated Documentation: ===================== TODO.md: - Added H1.4 and I2.1 to completed tasks - Updated Category H summary (3/5 done) - Added to Progress Tracking section NEXT_TASKS.md: - Marked H1.4 as DONE (Issue #3 already answered) - Marked I2.1 as DONE (TROUBLESHOOTING.md created) - Updated sprint progress: 6/12 tasks (50%) - Added H1.5 to starter pack - Updated results summary Impact: ======= - H1 Group: 4/5 tasks complete (80%) - I2 Group: 1/5 tasks complete (20%) - Week Progress: 6/12 tasks (50%) - Only H1.3 and H1.5 remain in H1 Next Priority: H1.3 - Create example project folder (2-3 hours) Files modified: TODO.md, NEXT_TASKS.md Issues closed: #117, #125	2025-10-21 00:56:52 +03:00
yusyus	831ea67d58	Update task tracking and CLAUDE.md with latest progress Documentation Updates: ====================== TODO.md: -------- ✅ Added "Completed This Week" section: - H1.1: Issue #8 fixed (bulletproof docs + MCP setup) - H1.2: Issue #7 fixed (11/11 configs working) - H1.4: Issue #4 linked to roadmap - PR #5: Reviewed and approved ✅ Updated "Immediate Tasks" list: - Removed completed tasks - Added H1.3 (example project) as next priority ✅ Updated Progress Tracking: - 10 items completed this week - Clear visibility of accomplishments - Next steps clearly defined NEXT_TASKS.md: -------------- ✅ Marked completed tasks in Starter Pack: - H1.1 (Issue #8) - DONE - H1.2 (Issue #7) - DONE - H1.4 (Issue #4) - DONE - PR #5 Review - DONE ✅ Updated Current Sprint (Oct 20-27): - Monday/Tuesday: 4/4 tasks completed ✅ - Wednesday/Thursday: 3 tasks remaining - Progress: 4/10 tasks (40%) ✅ Added specific accomplishments: - Community engaged (3 issues) - All configs fixed (11/11) - PR security verified - Bulletproof documentation CLAUDE.md: ---------- ✅ Added "Current Status" section at top: - Version: v1.0.0 - Recent updates this week - Community response wins - Next priorities ✅ Added configs status: - 11/11 verified working (100%) - New Laravel config - All selectors tested ✅ Added roadmap reference: - 134 tasks in 22 groups - Project board link - Clear next steps ✅ Added Laravel to Quick Start examples ✅ Added "Available Production Configs" section: - All 11 configs listed with selectors - Content extraction stats - Organized by category - Verification date ✅ Updated Additional Documentation: - Added BULLETPROOF_QUICKSTART.md - Added TROUBLESHOOTING.md - Added FLEXIBLE_ROADMAP.md - Added NEXT_TASKS.md - Added TODO.md Impact: ------- - Clear visibility of progress (4 major items this week) - Updated guidance for Claude Code - Accurate config information (11 working configs) - Better onboarding with new docs - Transparent roadmap tracking Files modified: TODO.md, NEXT_TASKS.md, CLAUDE.md	2025-10-21 00:42:36 +03:00
yusyus	8bd3ccfcdf	Merge pull request #5 from jjshanks/anchor-fix Strip anchors from urls so that the pages aren't duplicated	2025-10-21 00:26:26 +03:00
yusyus	80382551b1	Fix Issue #7: Fix all broken configs and add Laravel support Tested and fixed all 11 production configs - now 100% working! Fixed Configs: 1. Django (configs/django.json) - ❌ Was using: div.document (selector doesn't exist) - ✅ Now using: article (1,688 chars of content) - Verified on: https://docs.djangoproject.com/en/stable/ 2. Astro (configs/astro.json) - ❌ Was using: homepage URL (no article element) - ✅ Now using: /en/getting-started/ with article selector - Added: start_urls, categories, improved URL patterns - Increased max_pages from 15 to 100 3. Tailwind (configs/tailwind.json) - ❌ Was using: article (selector doesn't exist) - ✅ Now using: div.prose (195 chars of content) - Verified on: https://tailwindcss.com/docs New Config: 4. Laravel (configs/laravel.json) - NEW! - Created complete Laravel 9.x config - Selector: #main-content (16,131 chars of content) - Base URL: https://laravel.com/docs/9.x/ - Includes: 8 start_urls covering installation, routing, controllers, views, Blade, Eloquent, migrations, auth - Categories: getting_started, routing, views, models, authentication, api - max_pages: 500 Test Results: ✅ 11/11 configs tested and verified (100%) ✅ All selectors extract content properly ✅ All base URLs accessible Working Configs: - ✅ astro.json - ✅ django.json - ✅ fastapi.json - ✅ godot.json - ✅ godot-large-example.json - ✅ kubernetes.json - ✅ laravel.json (NEW) - ✅ react.json - ✅ steam-economy-complete.json - ✅ tailwind.json - ✅ vue.json How I Tested: 1. Created test_selectors.py to find correct CSS selectors 2. Tested each config's base_url + selector combination 3. Verified content extraction (not just "found" but actual text) 4. Ensured meaningful content length (50+ chars minimum) Fixes Issue #7 - Laravel scraping not working Fixes #7	2025-10-21 00:16:39 +03:00
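
Note on commit 8bd3ccfcdf (pull request #5): the message says anchors are stripped from URLs so that pages are not duplicated. The log does not show how the repository does this, so the following is only a minimal sketch of the idea using the Python standard library. The helper names strip_anchor and dedupe_pages and the example.com URLs are hypothetical and are not taken from the project.

```python
from urllib.parse import urldefrag


def strip_anchor(url: str) -> str:
    """Return the URL without its '#fragment' part (hypothetical helper)."""
    return urldefrag(url).url


def dedupe_pages(urls):
    """Keep the first occurrence of each page, ignoring anchors (hypothetical helper)."""
    seen = set()
    unique = []
    for url in urls:
        page = strip_anchor(url)
        if page not in seen:
            seen.add(page)
            unique.append(page)
    return unique


if __name__ == "__main__":
    links = [
        "https://example.com/docs/intro",
        "https://example.com/docs/intro#install",
        "https://example.com/docs/api#usage",
    ]
    for page in dedupe_pages(links):
        print(page)
```

With this approach, two links that differ only in their anchor collapse to a single page, which is presumably the duplication the pull request title refers to.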

693 Commits