This commit fixes three critical limitations discovered during local repository skill extraction testing:
**Fix 1: Code Analyzer Import Issue**
- Changed unified_scraper.py to use package-qualified absolute imports instead of bare module imports
- Fixed: `from github_scraper import` → `from skill_seekers.cli.github_scraper import`
- Fixed: `from pdf_scraper import` → `from skill_seekers.cli.pdf_scraper import`
- Result: CodeAnalyzer now available during extraction, deep analysis works
**Fix 2: Unity Library Exclusions**
- Updated should_exclude_dir() to accept and check full directory paths
- Updated _extract_file_tree_local() to pass both dir name and full path
- Added exclusion config passing from unified_scraper to github_scraper
- Result: exclude_dirs_additional now works (297 files excluded in test)
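
The path-aware check described in Fix 2 could be sketched roughly as follows. The signature, the default exclusion set, and the matching rules here are assumptions for illustration only; the actual `should_exclude_dir()` in github_scraper.py may differ.

```python
from typing import Iterable, Optional

# Hypothetical defaults; the real list in github_scraper.py is not shown in this commit.
DEFAULT_EXCLUDED_DIRS = {".git", "node_modules", "__pycache__"}


def should_exclude_dir(
    dir_name: str,
    dir_path: Optional[str] = None,
    extra_excludes: Iterable[str] = (),
) -> bool:
    """Return True if a directory should be skipped while walking a repo.

    Entries without a slash (e.g. "Library") match on the directory name alone.
    Entries with a slash (e.g. "Assets/Plugins") match against the full path,
    either exactly or as a leading prefix.
    """
    if dir_name in DEFAULT_EXCLUDED_DIRS:
        return True

    for pattern in extra_excludes:
        pattern = pattern.strip("/")
        if "/" not in pattern:
            # Name-only entry: compare against the directory's own name.
            if dir_name == pattern:
                return True
        elif dir_path is not None:
            # Path entry: normalize separators, then compare against the full path.
            normalized = dir_path.replace("\\", "/").strip("/")
            if normalized == pattern or normalized.startswith(pattern + "/"):
                return True
    return False
```

Passing both the name and the full path lets a config entry such as `Assets/Plugins` exclude that one directory without also excluding every other directory named `Plugins`.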
**Fix 3: AI Enhancement for Single Sources**
- Changed read_reference_files() to use rglob() for recursive search
- Now finds reference files in subdirectories (e.g., references/github/README.md)
- Result: AI enhancement works with unified skills that have nested references
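
A minimal sketch of the recursive lookup from Fix 3, assuming reference files are Markdown and live under a `references/` directory inside the skill. The return type and the relative-path keys are illustrative choices, not necessarily what utils.py does.

```python
from pathlib import Path
from typing import Dict


def read_reference_files(skill_dir: str) -> Dict[str, str]:
    """Collect every Markdown file under <skill_dir>/references, at any depth."""
    references_dir = Path(skill_dir) / "references"
    if not references_dir.is_dir():
        return {}

    contents: Dict[str, str] = {}
    # rglob("*.md") descends into subdirectories; glob("*.md") would only
    # see files directly inside references/.
    for md_file in sorted(references_dir.rglob("*.md")):
        # Key by path relative to references/ so nested files stay distinguishable.
        key = md_file.relative_to(references_dir).as_posix()
        contents[key] = md_file.read_text(encoding="utf-8", errors="replace")
    return contents
```

With this approach a file such as references/github/README.md is picked up alongside files stored directly in references/.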
**Test Results:**
- Code Analyzer: ✅ Working (deep analysis running)
- Unity Exclusions: ✅ Working (297 of 679 files excluded)
- AI Enhancement: ✅ Working (finds and reads nested references)
**Files Changed:**
- src/skill_seekers/cli/unified_scraper.py (Fix 1 & 2)
- src/skill_seekers/cli/github_scraper.py (Fix 2)
- src/skill_seekers/cli/utils.py (Fix 3)
**Test Artifacts:**
- configs/deck_deck_go_local.json (test configuration)
- docs/LOCAL_REPO_TEST_RESULTS.md (comprehensive test report)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Major improvements:
- Configurable directory exclusions (Issue #203)
- Unlimited local repository analysis
- Skip llms.txt option (PR #198)
- 10+ bug fixes for GitHub scraper
- Test suite expanded to 427 tests
See CHANGELOG.md for full details.
Merges feat/add-skip-llm-to-config by @sogoiii.
This PR adds a valuable configuration option to explicitly skip llms.txt
detection, useful when a site's llms.txt is incomplete, incorrect, or when
specific HTML scraping is needed.
Key features:
- New 'skip_llms_txt' config option (default: false, backward compatible)
- Boolean type validation with warning for invalid values
- Support in both sync and async scraping modes
- 17 comprehensive tests (15 feature tests + 2 config validation tests)
All tests passing after fixing import paths to use proper package names.
Test results: ✅ 17/17 tests passing
Full test suite: ✅ 391 tests passing
Co-authored-by: sogoiii <sogoiii@users.noreply.github.com>
Fixes Issue #190 - "name 'logger' is not defined" error
**Problem:**
- Logger was used at line 40 (in code_analyzer import exception)
- Logger was defined at line 47
- Caused runtime error when code_analyzer import failed
**Solution:**
- Moved logging.basicConfig() and logger initialization to lines 34-39
- Now logger is defined BEFORE the code_analyzer import block
- Warning message now works correctly when code_analyzer is missing
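
The ordering fix can be illustrated with the sketch below. The flag name, the log format, and the warning text are hypothetical; only the ordering (logger first, optional import second) reflects the change described above.

```python
import logging

# Configure logging and create the module logger first...
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
logger = logging.getLogger(__name__)

# ...so the fallback branch of the optional import can safely use it.
try:
    from code_analyzer import CodeAnalyzer  # optional; may be missing
    CODE_ANALYZER_AVAILABLE = True  # hypothetical flag name
except ImportError:
    CodeAnalyzer = None
    CODE_ANALYZER_AVAILABLE = False
    logger.warning("code_analyzer not available; deep analysis is disabled")
```

If the two blocks were swapped, the `except` branch would raise `NameError: name 'logger' is not defined` instead of logging the warning.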
**Testing:**
- ✅ All 22 GitHub scraper tests pass
- ✅ Logger warning appears correctly when code_analyzer missing
- ✅ No similar issues found in other CLI files
Closes #190
Fixes #193 - PDF scraping broken for PyPI users
Changed 3 files from absolute to relative imports to fix
ModuleNotFoundError when package is installed via pip:
1. pdf_scraper.py:22
   - `from pdf_extractor_poc import` → `from .pdf_extractor_poc import`
   - Fixes: skill-seekers pdf command failed with import error
2. github_scraper.py:36
   - `from code_analyzer import` → `from .code_analyzer import`
   - Proactive fix: prevents future import errors
3. test_unified_simple.py:17
   - `from config_validator import` → `from .config_validator import`
   - Proactive fix: test helper file
These absolute imports worked locally due to sys.path differences
but failed when installed via PyPI (pip install skill-seekers).
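
The failure mode can be reproduced with a throwaway package, as in the sketch below. All package and module names (`demo_pkg`, `demo_helper_mod`, and so on) are hypothetical. The temporary directory stands in for site-packages: only the package root is on `sys.path`, which is the situation after a pip install, whereas running a script directly from its own directory also puts that directory on `sys.path`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_pkg/cli with one helper and two modules that import it."""
    cli = root / "demo_pkg" / "cli"
    cli.mkdir(parents=True)
    (root / "demo_pkg" / "__init__.py").write_text("")
    (cli / "__init__.py").write_text("")
    (cli / "demo_helper_mod.py").write_text("VALUE = 42\n")
    # Relative import: resolved against the containing package.
    (cli / "main_relative.py").write_text("from .demo_helper_mod import VALUE\n")
    # Bare import: only resolves if demo_pkg/cli itself is on sys.path.
    (cli / "main_bare.py").write_text("from demo_helper_mod import VALUE\n")


def try_import(module_name: str) -> str:
    """Import a module by name and report whether it succeeded."""
    try:
        importlib.import_module(module_name)
        return "ok"
    except ModuleNotFoundError as exc:
        return f"failed: {exc}"


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # only the package root is importable
    importlib.invalidate_caches()
    try:
        relative_result = try_import("demo_pkg.cli.main_relative")
        bare_result = try_import("demo_pkg.cli.main_bare")
    finally:
        sys.path.remove(tmp)
        # Drop the demo modules so the temporary package leaves no trace.
        for name in [m for m in sys.modules if m.startswith("demo_pkg")]:
            del sys.modules[name]

print("relative import:", relative_result)
print("bare import:", bare_result)
```

In this setup the relative import succeeds and the bare import raises `ModuleNotFoundError`, mirroring the behavior reported for the installed package.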
Tested with:
- skill-seekers pdf command now works ✅
- Extracted 32-page Godot Farming PDF successfully
All CLI commands should now work correctly when installed from PyPI.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add skip_llms_txt config option (default: False)
- Validate value is boolean, warn and default to False if not
- Support in both sync and async scraping modes
- Add 17 tests for config, behavior, and edge cases
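
The validation rule above (strict boolean, warn and fall back to False otherwise) might be implemented along these lines. The function name and the warning text are assumptions made for illustration.

```python
import logging
from typing import Any, Mapping

logger = logging.getLogger(__name__)


def resolve_skip_llms_txt(config: Mapping[str, Any]) -> bool:
    """Return the effective skip_llms_txt setting for a scraper config."""
    value = config.get("skip_llms_txt", False)
    # Only real booleans are accepted; 1, "true", or None are rejected
    # rather than coerced, so a typo in the config cannot silently enable it.
    if not isinstance(value, bool):
        logger.warning(
            "Invalid skip_llms_txt value %r (expected true/false); defaulting to False",
            value,
        )
        return False
    return value
```

Resolving the flag once and passing the resulting boolean to both the sync and async scraping paths keeps the two modes consistent.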
Phase 2 & 3: Quality assurance before packaging
New module: quality_checker.py
- Enhancement verification (checks for template text, code examples, sections)
- Structure validation (SKILL.md, references/ directory)
- Content quality checks (YAML frontmatter, language tags, "When to Use" section)
- Link validation (internal markdown links)
- Quality scoring system (0-100 score + A-F grade)
- Detailed reporting with errors, warnings, and info messages
- CLI with --verbose and --strict modes
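
One way to derive a 0-100 score and an A-F grade from collected messages is sketched below. The penalty weights, grade thresholds, and class name are illustrative assumptions; the values used by quality_checker.py are not stated in this commit.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative values only, not taken from quality_checker.py.
ERROR_PENALTY = 15
WARNING_PENALTY = 5
GRADE_THRESHOLDS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


@dataclass
class QualityReport:
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    info: List[str] = field(default_factory=list)

    @property
    def score(self) -> int:
        """Start at 100, subtract penalties, and clamp to the 0-100 range."""
        raw = 100 - ERROR_PENALTY * len(self.errors) - WARNING_PENALTY * len(self.warnings)
        return max(0, min(100, raw))

    @property
    def grade(self) -> str:
        """Map the numeric score to a letter; anything below the last threshold is F."""
        for threshold, letter in GRADE_THRESHOLDS:
            if self.score >= threshold:
                return letter
        return "F"


report = QualityReport(errors=["SKILL.md missing"], warnings=["no language tag"])
print(f"Quality score: {report.score}/100 (grade {report.grade})")
```

Info messages are recorded but do not affect the score in this sketch; a strict mode could treat warnings as errors by moving them into `errors` before scoring.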
Integration in package_skill.py:
- Automatic quality checks before packaging
- Display quality report with score and grade
- Ask user to confirm if warnings/errors found
- Add --skip-quality-check flag to bypass checks
- Updated help examples
Benefits:
- Catch quality issues before packaging
- Ensure SKILL.md is properly enhanced
- Validate all links work
- Give users confidence in skill quality
- Comprehensive quality reports
Addresses user request: "check some sort of quality check at the end
like is links working, skill is good etc and give report the user"
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 1: Fix race condition where main console exits before enhancement completes
Changes to enhance_skill_local.py:
- Add headless mode (default) using subprocess.run() which WAITS for completion
- Add timeout protection (default 10 minutes, configurable)
- Verify SKILL.md was actually updated (check mtime and size)
- Add --interactive-enhancement flag to use old terminal mode
- Detailed progress messages and error handling
- Clean up temp files after completion
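
The headless flow (block until the child exits, enforce a timeout, then verify that SKILL.md changed) could look roughly like this. The function name and messages are hypothetical, and the command to run is left to the caller; only the 10-minute default and the mtime/size check come from the description above.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def run_headless(command: Sequence[str], skill_md: Path, timeout_s: float = 600) -> bool:
    """Run command, wait for it to finish, and confirm skill_md was rewritten."""
    before = skill_md.stat() if skill_md.exists() else None

    try:
        # subprocess.run() blocks until the child exits, so the caller
        # cannot continue while enhancement is still in progress.
        result = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        print(f"Enhancement timed out after {timeout_s}s")
        return False

    if result.returncode != 0:
        print(f"Enhancement failed (exit {result.returncode}): {result.stderr.strip()}")
        return False

    if not skill_md.exists():
        print("SKILL.md not found after enhancement")
        return False

    after = skill_md.stat()
    # A clean exit is not enough: the file must actually have changed.
    changed = (
        before is None
        or after.st_mtime_ns != before.st_mtime_ns
        or after.st_size != before.st_size
    )
    if not changed:
        print("SKILL.md was not modified")
    return changed
```

Checking both mtime and size guards against a child process that exits successfully without writing anything.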
Changes to doc_scraper.py:
- Use skill-seekers-enhance entry point instead of direct python path
- Pass --interactive-enhancement flag through if requested
- Update help text to reflect new headless default behavior
- Show proper status messages (HEADLESS vs INTERACTIVE)
Benefits:
- Main console now waits for enhancement to complete
- No more "Package your skill" message while enhancement is running
- Timeout prevents infinite hangs
- Terminal mode still available for users who want it
- Better error messages and progress tracking
Fixes user request: "make sure 1. console wait for it to finish"
Co-Authored-By: Claude <noreply@anthropic.com>
Fixes #182
Changes:
- Add 'anthropic-beta: skills-2025-10-02' header (required by Anthropic Skills API)
- Change multipart field name from 'skill' to 'files[]' (correct API format)
Without these fixes, all upload attempts returned 404 errors.
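
The two request changes can be sketched as a small helper that builds the header and multipart parts. Only the `anthropic-beta` value and the `files[]` field name come from this commit; the helper name and the `application/zip` content type are assumptions, and the endpoint URL and authentication headers are deliberately left out.

```python
from pathlib import Path
from typing import Dict, List, Tuple

MultipartPart = Tuple[str, Tuple[str, bytes, str]]


def build_skill_upload_parts(zip_path: Path) -> Tuple[Dict[str, str], List[MultipartPart]]:
    """Return (headers, files) for a skill upload request."""
    # Beta header named in this commit.
    headers = {"anthropic-beta": "skills-2025-10-02"}
    # The multipart field must be named 'files[]' rather than 'skill'.
    files = [("files[]", (zip_path.name, zip_path.read_bytes(), "application/zip"))]
    return headers, files
```

The returned list of `(field_name, (filename, content, content_type))` tuples has the shape accepted by the `files=` argument of `requests.post`, so the caller can merge `headers` with its own authentication headers and send the request.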
Verified:
- All 379 tests passing (100%)
- No regressions in test suite
- Upload functionality corrected per API requirements
Co-authored-by: Straughter "BatmanOsama" Guthrie <straughterguthrie@gmail.com>
Original PR: #183