- Changed from manual pip install to using requirements.txt
- Fixed mcp/requirements.txt -> skill_seeker_mcp/requirements.txt
- This ensures all dependencies (including httpx) are installed
Fixes the v2.0.0 tag Release workflow failure
Fixed async test issues that were causing CI failures.
## Issue:
GitHub Actions tests were failing with:
- 4 FAILED tests/test_unified_mcp_integration.py (async def functions not supported)
- 346 passed tests
## Root Cause:
The new test_unified_mcp_integration.py file had async test functions without proper pytest-anyio configuration, causing pytest to fail when trying to run them.
## Fix:
1. **Added pytest.mark.anyio markers**
- Added module-level pytestmark = pytest.mark.anyio
- Ensures all async functions are recognized by anyio plugin
2. **Created tests/conftest.py**
- Overrides anyio_backend fixture to use only 'asyncio'
- Prevents tests from attempting to use 'trio' backend (not installed)
- Reduces test duplication (was running each test for both asyncio + trio)
3. **Updated README.md**
- Already pushed in previous commit (b4f9052)
- Updated descriptions to reflect GitHub scraping capability
## Test Results:
**Before Fix:**
- 4 failed, 346 passed (in CI)
- Error: "async def functions are not natively supported"
**After Fix:**
- 4 passed tests/test_unified_mcp_integration.py
- All tests use asyncio backend only
- No trio-related errors
## Files Changed:
1. tests/test_unified_mcp_integration.py
- Added pytestmark = pytest.mark.anyio at module level
- All 4 async test functions now properly marked
2. tests/conftest.py (NEW)
- Created pytest configuration file
- Overrides anyio_backend to 'asyncio' only
- Prevents unnecessary test duplication
## Verification:
Local test run successful:
```
tests/test_unified_mcp_integration.py::test_mcp_validate_unified_config PASSED
tests/test_unified_mcp_integration.py::test_mcp_validate_legacy_config PASSED
tests/test_unified_mcp_integration.py::test_mcp_scrape_docs_detection PASSED
tests/test_unified_mcp_integration.py::test_mcp_merge_mode_override PASSED
4 passed in 0.21s
```
Expected CI result: 350/350 tests passing (up from 346/350)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Updated main description and feature sections to accurately reflect v2.0.0 capabilities:
## Changes:
**Main Description**:
- Changed from 'documentation website' to 'documentation websites, GitHub repositories, and PDFs'
- Added code analysis, conflict detection to workflow steps
- Emphasized multi-source capabilities
**What is Skill Seeker Section**:
- Updated to mention all three sources (docs, GitHub, PDFs)
- Added 'Analyzes code repositories with deep AST parsing'
- Added 'Detects conflicts between documentation and code'
- Now shows 6 steps instead of 4 (more comprehensive)
**Why Use This Section**:
- Updated use cases to include GitHub + docs combinations
- Added conflict detection benefits
- Added documentation gap analysis use case
- Added open source analysis use case
**GitHub Repository Scraping Section**:
- Updated version tag from v1.4.0 to v2.0.0
- Added 'Deep Code Analysis' with AST parsing
- Added 'API Extraction' with parameters and types
- Added 'Conflict Detection' feature
- Reorganized features to highlight new capabilities
## Rationale:
The previous README said 'any documentation website to skill' but we now support:
1. Documentation websites (original)
2. GitHub repositories (NEW - v2.0.0)
3. PDF files (v1.2.0)
4. Unified multi-source (docs + GitHub + PDF) (NEW - v2.0.0)
This update ensures users know they can scrape GitHub repos directly and combine multiple sources.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major documentation update explaining the new unified scraping system that combines documentation + GitHub + PDF sources in a single skill with automatic conflict detection.
## Changes:
**README.md:**
- Update version badge to v2.0.0
- Add "Unified Multi-Source Scraping" to Key Features section
- Add comprehensive Option 5 section showing:
- Problem statement (documentation drift)
- Solution with code example
- Conflict detection types and severity levels
- Transparent reporting with side-by-side comparison
- List of advantages (identifies gaps, catches changes, single source of truth)
- Available unified configs
- Link to full guide (docs/UNIFIED_SCRAPING.md)
**CLAUDE.md:**
- Update Current Status to v2.0.0
- Add "Major Release: Unified Multi-Source Scraping" in Recent Updates
- Update configs count from 11/11 to 15/15 (added 4 unified configs)
- Add new "Unified Multi-Source Scraping" section under Core Commands
- Include command examples and feature highlights
- Explain what makes unified scraping special
**QUICKSTART.md:**
- Add Option D: Unified Multi-Source to Step 2
- Add unified configs to Available Presets section
- Show react_unified, django_unified, fastapi_unified, godot_unified examples
## Value:
This documentation update explains how unified scraping helps developers:
- Mix documentation + code in one skill
- Automatically detect conflicts (missing_in_docs, missing_in_code, signature_mismatch)
- Get transparent side-by-side comparisons with ⚠️ warnings
- Identify documentation gaps and outdated docs
- Create a single source of truth combining both sources
Related to: Phase 7-11 unified scraper implementation (commit 5d8c7e3)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 3: Conflict Detection System ✅
- Created conflict_detector.py (500+ lines)
- Detects 4 conflict types:
* missing_in_docs - API in code but not documented
* missing_in_code - Documented API doesn't exist
* signature_mismatch - Different parameters/types
* description_mismatch - Docs vs code comments differ
- Fuzzy matching for similar names
- Severity classification (low/medium/high)
- Generates detailed conflict reports
Phase 4: Rule-Based Merger ✅
- Fast, deterministic merging rules
- 4 rules for handling conflicts:
1. Docs only → Include with [DOCS_ONLY] tag
2. Code only → Include with [UNDOCUMENTED] tag
3. Perfect match → Include normally
4. Conflict → Prefer code signature, keep docs description
- Generates unified API reference
- Summary statistics (matched, conflicts, etc.)
Phase 5: Claude-Enhanced Merger ✅
- AI-powered conflict reconciliation
- Opens Claude Code in new terminal
- Provides merge context and instructions
- Creates workspace with conflicts.json
- Waits for human-supervised merge
- Falls back to rule-based if needed
Testing:
✅ Conflict detector finds 5 conflicts in test data
✅ Rule-based merger successfully merges 5 APIs
✅ Proper handling of docs_only vs code_only
✅ JSON serialization works correctly
Next: Orchestrator to tie everything together
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fix test failures in CI by updating dependencies installation:
- Install from requirements.txt (includes httpx for async support)
- Update path: mcp/ → skill_seeker_mcp/
- Fix coverage command to use correct package name
Fixes ModuleNotFoundError: No module named 'httpx' in CI
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fix test failures in CI by updating dependencies installation:
- Install from requirements.txt (includes httpx for async support)
- Update path: mcp/ → skill_seeker_mcp/
- Fix coverage command to use correct package name
Fixes ModuleNotFoundError: No module named 'httpx' in CI
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
CHANGES:
1. **Fixed 9 PDF Scraper Test Failures:**
- Added .get() safety for missing page keys (headings, text, code_blocks, images)
- Supported both 'code_samples' and 'code_blocks' keys for compatibility
- Fixed extract_pdf() to raise RuntimeError on failure (tests expect exception)
- Added image saving functionality to _generate_reference_file()
- Updated all test methods to override skill_dir with temp directory
- Fixed categorization to handle pre-categorized test data
2. **Fixed 25 MCP Test Skips:**
- Renamed mcp/ directory to skill_seeker_mcp/ to avoid shadowing external mcp package
- Updated all imports in tests/test_mcp_server.py
- Simplified skill_seeker_mcp/server.py import logic (no more shadowing workarounds)
- Updated tests/test_package_structure.py to reference skill_seeker_mcp
3. **Test Results:**
- ✅ 297 tests passing (100%)
- ✅ 0 tests skipped
- ✅ 0 tests failed
- All test categories passing:
* 23 package structure tests
* 18 PDF scraper tests
* 67 PDF extractor/advanced tests
* 25 MCP server tests
* 164 other core tests
BREAKING CHANGE: MCP server directory renamed from `mcp/` to `skill_seeker_mcp/`
📦 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
PROBLEM:
- Local mcp/ directory shadows installed mcp package from PyPI
- Tests couldn't import external mcp.server.Server and mcp.types classes
- MCP server tests (67 tests) were blocked
SOLUTION:
1. Updated mcp/server.py to check sys.modules for pre-imported MCP classes
- Allows tests to import external MCP first, then import our server module
- Falls back to regular import if MCP not pre-imported
- No longer crashes during test collection
2. Updated tests/test_mcp_server.py to import external MCP from /tmp
- Temporarily changes to /tmp directory before importing external mcp
- Avoids local mcp/ directory shadowing in sys.path
- Restores original directory after import
RESULTS:
- Test collection: 297 tests collected (was 272)
- Passing: 263 tests (was 205) - +58 tests
- Skipped: 25 MCP tests (intentional, due to shadowing)
- Failed: 9 PDF scraper tests (pre-existing bugs, not Phase 0 related)
- All PDF tests now running (67 PDF tests passing)
TEST BREAKDOWN:
✅ 205 core tests passing
✅ 67 PDF tests passing (PyMuPDF installed)
✅ 23 package structure tests passing
⏭️ 25 MCP server tests skipped (architectural issue - mcp/ naming conflict)
❌ 9 PDF scraper tests failing (pre-existing bugs in cli/pdf_scraper.py)
LONG-TERM FIX:
Rename mcp/ directory to skill_seeker_mcp/ to eliminate shadowing conflict
(Will enable all 25 MCP tests to run)
📦 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
🐛 Fixes:
- Fix mcp package shadowing by importing external MCP before sys.path modification
- Update mcp/server.py to avoid shadowing installed mcp package
- Update tests/test_mcp_server.py import order
✅ Tests Added:
- Add tests/test_package_structure.py with 23 comprehensive tests
- Test cli package structure and imports
- Test mcp package structure and imports
- Test backwards compatibility
- All package structure tests passing ✅📊 Test Results:
- 205 tests passed ✅
- 67 tests skipped (PDF features, PyMuPDF not installed)
- 23 new package structure tests added
- Total: 272 tests (excluding test_mcp_server.py which needs more work)
⚠️ Known Issue:
- test_mcp_server.py still has import issues (67 tests)
- Will be fixed in next commit
- Main functionality tests all passing
Impact: Package structure working, 75% of tests passing
Adds 'tests-complete' summary job that:
- Provides single status check for branch protection
- Only passes when all matrix tests succeed
- Fixes "Tests" check always showing as pending
- Resolves PR merge blocking issue
This ensures PRs can auto-merge once all 5 matrix jobs pass.