CRITICAL BUG FIX:
- Fixed documentation scraper overwriting list with dict
- Changed self.scraped_data['documentation'] = {...} to .append({...})
- Bug was breaking unified skill builder reference generation
AI ENHANCEMENT UPDATES:
- Added repo_id extraction in utils.py for multi-repo support
- Enhanced grouping by (source, repo_id) tuple in both enhancement files
- Added MULTI-REPOSITORY HANDLING section to AI prompts
- AI now correctly identifies and synthesizes multiple repos
CHANGES:
1. src/skill_seekers/cli/utils.py:
- _determine_source_metadata() now returns (source, confidence, repo_id)
- Extracts repo_id from codebase_analysis/{repo_id}/ paths
- Added repo_id field to reference metadata dict
2. src/skill_seekers/cli/enhance_skill_local.py:
- Group references by (source_type, repo_id) instead of just source_type
- Display repo identity in prompt sections
- Detect multiple repos and add explicit guidance to AI
3. src/skill_seekers/cli/enhance_skill.py:
- Same grouping and display logic as local enhancement
- Multi-repository handling section added
4. src/skill_seekers/cli/unified_scraper.py:
- FIX: Documentation scraper now appends to list instead of overwriting
- Added source_id, base_url, refs_dir to documentation metadata
- Update refs_dir after moving to cache
TESTING:
- All 57 tests passing (unified, C3, utilities)
- Single-source verified: httpx comprehensive (219→749 lines after enhancement)
- Multi-source verified: encode/httpx + encode/httpcore (523 lines)
- AI enhancement working: Professional output with source attribution
QUALITY:
- Enhanced httpx SKILL.md: 749 lines, 19KB, A+ quality
- Source attribution working correctly
- Multi-repo synthesis transparent and accurate
- Reference structure clean and organized
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add retry_with_backoff() and retry_with_backoff_async() for network operations.
Features:
- Configurable max attempts (default: 3)
- Exponential backoff with configurable base delay
- Operation name for meaningful log messages
- Both sync and async versions
Addresses E2.6: Add retry logic for network failures
Co-authored-by: Joseph Magly <1159087+jmagly@users.noreply.github.com>
This commit fixes three critical limitations discovered during local repository skill extraction testing:
**Fix 1: Code Analyzer Import Issue**
- Changed unified_scraper.py to use absolute imports instead of relative imports
- Fixed: `from github_scraper import` → `from skill_seekers.cli.github_scraper import`
- Fixed: `from pdf_scraper import` → `from skill_seekers.cli.pdf_scraper import`
- Result: CodeAnalyzer now available during extraction, deep analysis works
**Fix 2: Unity Library Exclusions**
- Updated should_exclude_dir() to accept and check full directory paths
- Updated _extract_file_tree_local() to pass both dir name and full path
- Added exclusion config passing from unified_scraper to github_scraper
- Result: exclude_dirs_additional now works (297 files excluded in test)
**Fix 3: AI Enhancement for Single Sources**
- Changed read_reference_files() to use rglob() for recursive search
- Now finds reference files in subdirectories (e.g., references/github/README.md)
- Result: AI enhancement works with unified skills that have nested references
**Test Results:**
- Code Analyzer: ✅ Working (deep analysis running)
- Unity Exclusions: ✅ Working (297 files excluded from 679)
- AI Enhancement: ✅ Working (finds and reads nested references)
**Files Changed:**
- src/skill_seekers/cli/unified_scraper.py (Fix 1 & 2)
- src/skill_seekers/cli/github_scraper.py (Fix 2)
- src/skill_seekers/cli/utils.py (Fix 3)
**Test Artifacts:**
- configs/deck_deck_go_local.json (test configuration)
- docs/LOCAL_REPO_TEST_RESULTS.md (comprehensive test report)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>