Issue: #11 (A1.3 test failures)
## Problem
3/8 tests were failing because ConfigValidator only validates structure
and required fields, NOT format validation (names, URLs, etc.).
## Root Cause
ConfigValidator checks:
- Required fields (name, description, sources/base_url)
- Source types validity
- Field types (arrays, integers)
ConfigValidator does NOT check:
- Name format (alphanumeric, hyphens, underscores)
- URL format (http:// or https://)
## Solution
Added additional format validation in submit_config_tool after ConfigValidator:
1. Name format validation using regex: `^[a-zA-Z0-9_-]+$`
2. URL format validation (must start with http:// or https://)
3. Validates both legacy (base_url) and unified (sources.base_url) formats
## Test Results
Before: 5/8 tests passing, 3 failing
After: 8/8 tests passing ✅
Full suite: 427 tests passing, 40 skipped ✅
## Changes Made
- src/skill_seekers/mcp/server.py:
* Added `import re` at top of file
* Added name format validation (line 1280-1281)
* Added URL format validation for legacy configs (line 1285-1289)
* Added URL format validation for unified configs (line 1291-1296)
- tests/test_mcp_server.py:
* Updated test_submit_config_validates_required_fields to accept
ConfigValidator's correct error message ("cannot detect" instead of "description")
## Validation Examples
Invalid name: "React@2024!" → ❌ "Invalid name format"
Invalid URL: "not-a-url" → ❌ "Invalid base_url format"
Valid name: "react-docs" → ✅
Valid URL: "https://react.dev/" → ✅🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds 7 additional test cases for Issue #203 configurable EXCLUDED_DIRS:
Test Coverage Additions:
- Local repository integration (2 tests)
* exclude_dirs with local_repo_path
* Replace mode with local_repo_path
- Logging verification (3 tests)
* INFO level for extend mode
* WARNING level for replace mode
* No logging for default mode
- Type handling (2 tests)
* Tuple support for exclude_dirs
* Set support for exclude_dirs_additional
Total Test Coverage:
- 19 tests for exclude_dirs feature (all passing)
- 427 total tests passing (up from 420)
- 54% code coverage for github_scraper.py
All tests pass with no failures. 32 skipped tests are expected:
- 3 macOS-specific tests (platform limitation)
- 29 MCP tests (pass individually, skip in full suite due to pytest quirk)
Closes#203
Merges feat/add-skip-llm-to-config by @sogoiii.
This PR adds a valuable configuration option to explicitly skip llms.txt
detection, useful when a site's llms.txt is incomplete, incorrect, or when
specific HTML scraping is needed.
Key features:
- New 'skip_llms_txt' config option (default: false, backward compatible)
- Boolean type validation with warning for invalid values
- Support in both sync and async scraping modes
- 17 comprehensive tests (15 feature tests + 2 config validation tests)
All tests passing after fixing import paths to use proper package names.
Test results: ✅ 17/17 tests passing
Full test suite: ✅ 391 tests passing
Co-authored-by: sogoiii <sogoiii@users.noreply.github.com>
Fixed import paths in test_skip_llms_txt.py to use skill_seekers
package name instead of old-style cli imports.
Changes:
- Updated import from 'cli.doc_scraper' to 'skill_seekers.cli.doc_scraper'
- Updated logger names from 'cli.doc_scraper' to 'skill_seekers.cli.doc_scraper'
- Removed sys.path manipulation (no longer needed with proper imports)
All 17 tests now pass successfully (15 in test_skip_llms_txt.py + 2 in test_config_validation.py)
- Add skip_llms_txt config option (default: False)
- Validate value is boolean, warn and default to False if not
- Support in both sync and async scraping modes
- Add 17 tests for config, behavior, and edge cases
The terminal detection tests were failing because they expected the old
terminal mode behavior, but headless mode is now the default.
Fix:
- Add headless=False parameter to all terminal detection tests
- Tests now explicitly test interactive (terminal) mode
- test_subprocess_popen_called_with_correct_args: Tests terminal launch
- test_terminal_launch_error_handling: Tests error handling
- test_output_message_unknown_terminal: Tests warning messages
These tests only run on macOS (they're skipped on Linux) and test the
interactive terminal launch functionality, so they need headless=False.
Impact:
- All 3 failing macOS tests should now pass
- 391 tests passing on Linux
- CI should pass on macOS now
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 4: Testing and verification
New test file: test_quality_checker.py
- 12 comprehensive tests for quality checker functionality
- Tests for structure validation (missing SKILL.md, missing references)
- Tests for enhancement verification (template indicators, code examples)
- Tests for content quality (YAML frontmatter, language tags)
- Tests for link validation (broken internal links)
- Tests for quality scoring and grading system
- Tests for is_excellent property
- CLI tests (help output, nonexistent directory)
Updated test_package_skill.py:
- Added skip_quality_check=True to all test calls
- Fixes OSError "reading from stdin while output is captured"
- All 9 package_skill tests passing
Test Results:
- 391 tests passing (up from 386 before)
- 32 skipped
- 0 failures
- Added 12 new quality checker tests
- All existing tests still passing
Completes Phase 4 of enhancement race condition fix.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add smart terminal selection for --enhance-local with cascading priority:
1. SKILL_SEEKER_TERMINAL env var (explicit user preference)
2. TERM_PROGRAM env var (inherit current terminal)
3. Terminal.app (fallback default)
Supports Ghostty, iTerm2, WezTerm, and Terminal.app. Includes comprehensive
test suite (11 tests) and user documentation.
Changes:
- Add detect_terminal_app() function with priority-based selection
- Support for 4 major macOS terminals via TERMINAL_MAP
- Fallback handling for unknown terminals (IDE terminals)
- Add TERMINAL_SELECTION.md with setup examples and troubleshooting
- Update README.md to link to terminal selection guide
- Full test coverage for all detection paths and edge cases
**Problem:**
- GitHub Actions failing with 12 test failures in test_unified.py
- ConfigValidator only accepting file paths, not dicts
- ConflictDetector expecting dict pages, but tests providing list
- Import path issues in test_unified.py
**Changes:**
1. **cli/config_validator.py**:
- Modified `__init__` to accept Union[Dict, str] instead of just str
- Added isinstance check to handle both dict and file path inputs
- Maintains backward compatibility with existing code
2. **cli/conflict_detector.py**:
- Modified `_extract_docs_apis()` to handle both dict and list formats for pages
- Added support for 'analyzed_files' key (in addition to 'files')
- Made 'file' key optional in file_info dict
- Handles both production and test data structures
3. **tests/test_unified.py**:
- Fixed import path: sys.path now points to parent.parent/cli
- Fixed test regex: "Invalid source type" -> "Invalid type"
- All 18 unified tests now passing
**Test Results:**
- ✅ 390/390 tests passing (100%)
- ✅ All unified tests fixed (0 failures)
- ✅ No regressions in other test suites
**Impact:**
- Fixes failing GitHub Actions CI
- Improves testability of ConfigValidator and ConflictDetector
- Makes APIs more flexible for both production and test usage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Move test_unified.py to tests/ directory (607 lines, 19 tests)
- Move conflicts.json to tests/fixtures/example_conflicts.json
- Tests cover config validation, conflict detection, merging, and skill building
- Example conflicts show docs/code mismatch scenarios for v2.0.0 feature
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Support <pre class="brush: java"> pattern (SyntaxHighlighter)
- Support bare class names like <pre class="python">
- Add _extract_language_from_classes() helper method
- Apply detection logic to both code and parent pre elements
- Add 3 comprehensive test cases
Improves language detection for 25+ programming languages across
various documentation site formats.
Co-authored-by: Ricardo JL Rufino <ricardo@edu3.com.br>
Fixed async test issues that were causing CI failures.
## Issue:
GitHub Actions tests were failing with:
- 4 FAILED tests/test_unified_mcp_integration.py (async def functions not supported)
- 346 passed tests
## Root Cause:
The new test_unified_mcp_integration.py file had async test functions without proper pytest-anyio configuration, causing pytest to fail when trying to run them.
## Fix:
1. **Added pytest.mark.anyio markers**
- Added module-level pytestmark = pytest.mark.anyio
- Ensures all async functions are recognized by anyio plugin
2. **Created tests/conftest.py**
- Overrides anyio_backend fixture to use only 'asyncio'
- Prevents tests from attempting to use 'trio' backend (not installed)
- Reduces test duplication (was running each test for both asyncio + trio)
3. **Updated README.md**
- Already pushed in previous commit (b4f9052)
- Updated descriptions to reflect GitHub scraping capability
## Test Results:
**Before Fix:**
- 4 failed, 346 passed (in CI)
- Error: "async def functions are not natively supported"
**After Fix:**
- 4 passed tests/test_unified_mcp_integration.py
- All tests use asyncio backend only
- No trio-related errors
## Files Changed:
1. tests/test_unified_mcp_integration.py
- Added pytestmark = pytest.mark.anyio at module level
- All 4 async test functions now properly marked
2. tests/conftest.py (NEW)
- Created pytest configuration file
- Overrides anyio_backend to 'asyncio' only
- Prevents unnecessary test duplication
## Verification:
Local test run successful:
```
tests/test_unified_mcp_integration.py::test_mcp_validate_unified_config PASSED
tests/test_unified_mcp_integration.py::test_mcp_validate_legacy_config PASSED
tests/test_unified_mcp_integration.py::test_mcp_scrape_docs_detection PASSED
tests/test_unified_mcp_integration.py::test_mcp_merge_mode_override PASSED
4 passed in 0.21s
```
Expected CI result: 350/350 tests passing (up from 346/350)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
CHANGES:
1. **Fixed 9 PDF Scraper Test Failures:**
- Added .get() safety for missing page keys (headings, text, code_blocks, images)
- Supported both 'code_samples' and 'code_blocks' keys for compatibility
- Fixed extract_pdf() to raise RuntimeError on failure (tests expect exception)
- Added image saving functionality to _generate_reference_file()
- Updated all test methods to override skill_dir with temp directory
- Fixed categorization to handle pre-categorized test data
2. **Fixed 25 MCP Test Skips:**
- Renamed mcp/ directory to skill_seeker_mcp/ to avoid shadowing external mcp package
- Updated all imports in tests/test_mcp_server.py
- Simplified skill_seeker_mcp/server.py import logic (no more shadowing workarounds)
- Updated tests/test_package_structure.py to reference skill_seeker_mcp
3. **Test Results:**
- ✅ 297 tests passing (100%)
- ✅ 0 tests skipped
- ✅ 0 tests failed
- All test categories passing:
* 23 package structure tests
* 18 PDF scraper tests
* 67 PDF extractor/advanced tests
* 25 MCP server tests
* 164 other core tests
BREAKING CHANGE: MCP server directory renamed from `mcp/` to `skill_seeker_mcp/`
📦 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
PROBLEM:
- Local mcp/ directory shadows installed mcp package from PyPI
- Tests couldn't import external mcp.server.Server and mcp.types classes
- MCP server tests (67 tests) were blocked
SOLUTION:
1. Updated mcp/server.py to check sys.modules for pre-imported MCP classes
- Allows tests to import external MCP first, then import our server module
- Falls back to regular import if MCP not pre-imported
- No longer crashes during test collection
2. Updated tests/test_mcp_server.py to import external MCP from /tmp
- Temporarily changes to /tmp directory before importing external mcp
- Avoids local mcp/ directory shadowing in sys.path
- Restores original directory after import
RESULTS:
- Test collection: 297 tests collected (was 272)
- Passing: 263 tests (was 205) - +58 tests
- Skipped: 25 MCP tests (intentional, due to shadowing)
- Failed: 9 PDF scraper tests (pre-existing bugs, not Phase 0 related)
- All PDF tests now running (67 PDF tests passing)
TEST BREAKDOWN:
✅ 205 core tests passing
✅ 67 PDF tests passing (PyMuPDF installed)
✅ 23 package structure tests passing
⏭️ 25 MCP server tests skipped (architectural issue - mcp/ naming conflict)
❌ 9 PDF scraper tests failing (pre-existing bugs in cli/pdf_scraper.py)
LONG-TERM FIX:
Rename mcp/ directory to skill_seeker_mcp/ to eliminate shadowing conflict
(Will enable all 25 MCP tests to run)
📦 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
🐛 Fixes:
- Fix mcp package shadowing by importing external MCP before sys.path modification
- Update mcp/server.py to avoid shadowing installed mcp package
- Update tests/test_mcp_server.py import order
✅ Tests Added:
- Add tests/test_package_structure.py with 23 comprehensive tests
- Test cli package structure and imports
- Test mcp package structure and imports
- Test backwards compatibility
- All package structure tests passing ✅📊 Test Results:
- 205 tests passed ✅
- 67 tests skipped (PDF features, PyMuPDF not installed)
- 23 new package structure tests added
- Total: 272 tests (excluding test_mcp_server.py which needs more work)
⚠️ Known Issue:
- test_mcp_server.py still has import issues (67 tests)
- Will be fixed in next commit
- Main functionality tests all passing
Impact: Package structure working, 75% of tests passing
Problem:
- 2 tests in test_upload_skill.py failing intermittently in CI
- Tests passed individually but failed when run after test_parallel_scraping.py
- Tests failed with exit code 2 instead of 0 when running `--help`
Root Cause:
- test_parallel_scraping.py calls `os.chdir(tmpdir)` to create temporary test directories
- These directory changes persisted across test classes
- When upload_skill CLI tests ran subprocess with path 'cli/upload_skill.py',
the relative path was broken because cwd was still in the temp directory
- Result: subprocess couldn't find the script, returned exit code 2
Fix:
- Added setUp/tearDown to all 6 test classes in test_parallel_scraping.py
- setUp saves original cwd with `self.original_cwd = os.getcwd()`
- tearDown restores it with `os.chdir(self.original_cwd)`
- Ensures tests don't pollute working directory state for subsequent tests
Impact:
- All 158 tests now pass consistently
- No more flaky failures in CI
- Test isolation properly maintained
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>