Commit Graph

123 Commits

Author SHA1 Message Date
yusyus
03ac78173b chore: Remove client-specific docs, fix linter errors, update documentation
- Remove SPYKE-related client documentation files
- Fix critical ruff linter errors:
  - Remove unused 'os' import in test_analyze_e2e.py
  - Remove unused 'setups' variable in test_test_example_extractor.py
  - Prefix unused output_dir parameter with underscore in codebase_scraper.py
  - Fix import sorting in test_integration.py
- Update CHANGELOG.md with comprehensive C3.9 and enhancement features
- Update CLAUDE.md with --enhance-level documentation

All critical code quality issues resolved.
2026-01-31 14:38:15 +03:00
YusufKaraaslanSpyke
170dd0fd75 feat(C3.9): Add project documentation extraction from markdown files
- Scan ALL .md files in project (README, docs/, etc.)
- Smart categorization by folder/filename (overview, architecture, guides, etc.)
- Processing depth: surface=raw copy, deep=parse+summarize, full=AI-enhanced
- AI enhancement at level 2+ adds topic extraction and cross-references
- New "Project Documentation" section in SKILL.md with summaries
- Output to references/documentation/ organized by category
- Default ON, use --skip-docs to disable
- Add skip_docs parameter to MCP scrape_codebase_tool
- Add 15 new tests for markdown documentation features

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 13:54:56 +03:00
YusufKaraaslanSpyke
4cfb94e14f feat: Add default_enhance_level to config system
- Add default_enhance_level setting (default: 1 = SKILL.md only)
- --enhance flag now uses config default instead of hardcoded 1
- Show enhance level in config --show output

Users can change default via config file:
~/.config/skill-seekers/config.json
  "ai_enhancement": {
    "default_enhance_level": 2,
    ...
  }
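
A minimal sketch of how a caller might read this setting. The file path and key names come from the message above; the helper names, the fallback of 1, and the range check are assumptions for illustration, not the project's actual code.

```python
import json
from pathlib import Path

# Path and key names are taken from the commit message; helper names are hypothetical.
CONFIG_PATH = Path.home() / ".config" / "skill-seekers" / "config.json"


def enhance_level_from_config(data: dict, fallback: int = 1) -> int:
    """Return ai_enhancement.default_enhance_level, or `fallback` if missing or invalid."""
    section = data.get("ai_enhancement")
    if not isinstance(section, dict):
        return fallback
    level = section.get("default_enhance_level", fallback)
    # Assumption: only integer levels 0-3 are valid; anything else falls back.
    if isinstance(level, int) and not isinstance(level, bool) and 0 <= level <= 3:
        return level
    return fallback


def load_default_enhance_level(path: Path = CONFIG_PATH) -> int:
    """Read the config file, tolerating a missing or malformed file."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return 1
    return enhance_level_from_config(data) if isinstance(data, dict) else 1
```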

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:45:46 +03:00
YusufKaraaslanSpyke
3abdf2d1f0 feat: Update MCP scrape_codebase_tool with enhance-level support
- Add enhance_level parameter (0-3) for granular AI control
- Add skip_* parameters for feature control (skip_patterns, skip_test_examples, etc.)
- Remove deprecated --build-* flags (features are now ON by default)
- Adjust timeout based on enhance_level (10min base, up to 60min for level 3)
- Update documentation with examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:43:46 +03:00
YusufKaraaslanSpyke
e953fc6276 fix: Correct LocalSkillEnhancer import and method call
- Fix import: SkillEnhancer → LocalSkillEnhancer
- Fix method: enhance() → run(headless=True, timeout=600)
- Fix constructor: pass force=True separately

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:39:11 +03:00
YusufKaraaslanSpyke
d7aa34a3af feat: Add --enhance-level for granular AI enhancement control
Levels:
- 0 (off): No AI enhancement (default)
- 1 (minimal): SKILL.md enhancement only (fast, high value)
- 2 (standard): SKILL.md + Architecture + Config enhancement
- 3 (full): Everything including patterns and test examples

--comprehensive and --enhance-level are INDEPENDENT:
- --comprehensive: Controls depth and features (full depth + all features)
- --enhance-level: Controls AI enhancement level

Usage examples:
  skill-seekers analyze --directory . --enhance-level 1  # SKILL.md AI only
  skill-seekers analyze --directory . --enhance          # Same as level 1
  skill-seekers analyze --directory . --comprehensive    # All features, no AI
  skill-seekers analyze --directory . --comprehensive --enhance-level 2  # All features + standard AI
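
The level table above is cumulative, which can be sketched as an enum plus a lookup. The level numbers and names follow the message; the target identifiers (`skill_md`, `architecture`, and so on) and the function name are hypothetical labels, not names from the codebase.

```python
from enum import IntEnum


class EnhanceLevel(IntEnum):
    OFF = 0       # no AI enhancement
    MINIMAL = 1   # SKILL.md only
    STANDARD = 2  # + architecture and config
    FULL = 3      # + patterns and test examples


# Targets that first become enabled at each level (hypothetical identifiers).
_NEW_TARGETS = {
    EnhanceLevel.MINIMAL: ("skill_md",),
    EnhanceLevel.STANDARD: ("architecture", "config"),
    EnhanceLevel.FULL: ("patterns", "test_examples"),
}


def targets_for(level: int) -> list[str]:
    """Return every enhancement target enabled at `level` (cumulative)."""
    level = EnhanceLevel(level)  # raises ValueError for levels outside 0-3
    enabled: list[str] = []
    for threshold, names in _NEW_TARGETS.items():
        if level >= threshold:
            enabled.extend(names)
    return enabled
```

Keeping `--comprehensive` out of this mapping mirrors the statement that the two flags are independent.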

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:32:07 +03:00
YusufKaraaslanSpyke
b8b5e9d6ef perf: Optimize LOCAL mode AI enhancement with parallel execution
- Increase default batch size from 5 to 20 patterns per CLI call
- Add parallel execution with 3 concurrent workers (configurable)
- Add ai_enhancement settings to config_manager:
  - local_batch_size: patterns per Claude CLI call (default: 20)
  - local_parallel_workers: concurrent CLI calls (default: 3)
- Expected speedup: 6-12x faster for large codebases

Config settings can be changed via:
  skill-seekers config (coming soon) or editing ~/.config/skill-seekers/config.json
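
A rough sketch of the batching and worker-pool idea, assuming a thread pool is an acceptable stand-in for concurrent CLI calls. The defaults (20 per batch, 3 workers) come from the message; `enhance_batch` is a hypothetical callable standing in for one Claude CLI call.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def enhance_in_parallel(patterns, enhance_batch, batch_size=20, workers=3):
    """Run `enhance_batch` over batches concurrently and flatten the results in order."""
    batches = chunked(list(patterns), batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in submission order, so output order matches input order.
        results = pool.map(enhance_batch, batches)
        return [item for batch in results for item in batch]
```

Larger batches reduce the number of CLI calls and the pool overlaps the remaining ones, which is where the speedup quoted above would come from.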

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:07:20 +03:00
YusufKaraaslanSpyke
8a0c1f5fc6 feat: Add LOCAL mode fallback for all AI enhancements
When no ANTHROPIC_API_KEY is set, ai_enhancer.py now falls back to LOCAL
mode (Claude Code CLI) instead of disabling AI enhancement entirely.

Changes to AIEnhancer base class:
- AUTO mode now falls back to LOCAL instead of disabling
- Added _check_claude_cli() to verify Claude CLI is available
- Added _call_claude_local() method using Claude Code CLI
- Refactored _call_claude() to dispatch to API or LOCAL mode
- 2 minute timeout per LOCAL call (reasonable for batch processing)

This affects all AI enhancements:
- Design pattern enhancement (C3.1) → LOCAL fallback 
- Test example enhancement (C3.2) → LOCAL fallback 
- Architectural analysis (C3.7) → LOCAL fallback 

Before (bad UX):
  ℹ️  AI enhancement disabled (no API key found)

After (good UX):
  ℹ️  No API key found, using LOCAL mode (Claude Code CLI)
   AI enhancement enabled (using LOCAL mode - Claude Code CLI)
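
The fallback decision could look roughly like this. It assumes the CLI executable is named `claude` and that availability is checked via the PATH; the function name and the returned mode strings are hypothetical, and this is not the actual `_check_claude_cli()` implementation.

```python
import os
import shutil


def resolve_ai_mode(env=None, cli_name="claude"):
    """Pick 'api' when a key is set, else 'local' if the CLI is on PATH, else 'disabled'."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return "api"
    # shutil.which returns None when the executable cannot be found on PATH.
    if shutil.which(cli_name):
        return "local"
    return "disabled"
```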

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 11:15:54 +03:00
YusufKaraaslanSpyke
a8372b8f9d feat: Auto-enhance SKILL.md when --enhance or --comprehensive flag is used
UX improvement: When running `skill-seekers analyze --enhance` or
`skill-seekers analyze --comprehensive`, the SKILL.md is now automatically
enhanced as the final step. No need for a separate `skill-seekers enhance`
command.

Changes:
- After analyze_main() completes successfully, check if --enhance or
  --comprehensive was used
- If SKILL.md exists, automatically run SkillEnhancer in headless mode
- Use force=True to skip all prompts (consistent with --comprehensive UX)
- 10 minute timeout for large codebases
- Graceful fallback with retry instructions if enhancement fails

Before (bad UX):
  skill-seekers analyze --directory /path --enhance
  # Then separately:
  skill-seekers enhance output/codebase/

After (good UX):
  skill-seekers analyze --directory /path --enhance
  # Everything done in one command!

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 11:07:33 +03:00
YusufKaraaslanSpyke
22e2edbc7f fix: Rewrite config_enhancer LOCAL mode to use Claude CLI properly
The previous implementation had a design flaw - it ran `claude prompt_file`
and expected Claude to magically create a JSON file in a temp directory.
This never worked because Claude CLI is interactive and doesn't auto-save.

Changes:
- Use `--dangerously-skip-permissions` flag to bypass permission prompts
- Create a dedicated temp directory for each enhancement session
- Embed absolute output file path in the prompt so Claude knows where to write
- Run Claude from the temp directory as working_dir
- Improved prompt with explicit Write tool instruction
- Better error handling and logging (file not found, JSON parse errors)
- Show settings preview in prompt for better AI context

The LOCAL mode now follows the same pattern as enhance_skill_local.py
which works correctly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 10:48:49 +03:00
YusufKaraaslanSpyke
be2353cf2f fix: Add C# test example extraction and fix config_type field mismatch
Bug fixes:
- Fix KeyError in config_enhancer.py where "config_type" was expected but
  config_extractor saves as "type". Now supports both field names for
  backward compatibility.
- Fix settings "value_type" vs "type" mismatch in the same file.
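
One way to accept both field names is sketched here. The keys `config_type` and `type` are from the message; the helper name and the `"unknown"` default are assumptions.

```python
def get_config_type(entry: dict, default: str = "unknown") -> str:
    """Prefer 'config_type', fall back to 'type', so old and new records both work."""
    for key in ("config_type", "type"):
        if key in entry:
            return entry[key]
    return default
```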

New features:
- Add C# support for regex-based test example extraction
- Add language alias mapping (C# -> csharp, C++ -> cpp)
- Enhanced C# patterns for NUnit, xUnit, MSTest test frameworks
- Support for mock patterns (NSubstitute, Moq)
- Support for Zenject dependency injection patterns
- Support for setup/teardown method extraction
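
The alias mapping might be as simple as a lookup table. The two aliases are from the message; lowercasing the input first and the names used here are assumptions.

```python
# Aliases named in the commit message; the lowercase normalisation is an assumption.
LANGUAGE_ALIASES = {"c#": "csharp", "c++": "cpp"}


def normalize_language(name: str) -> str:
    """Map a display name such as 'C#' to the identifier used for pattern lookup."""
    key = name.strip().lower()
    return LANGUAGE_ALIASES.get(key, key)
```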

Tests:
- Add 2 new C# test extraction tests (NUnit tests, mock patterns)
- All 1257 tests pass (165 skipped)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 10:12:45 +03:00
yusyus
380a71c714 feat: Add discoverable 'analyze' subcommand with preset flags (Phase 1 UX improvement)
Implements Phase 1 of the codebase analysis UX improvement plan, making the
command discoverable and adding intuitive preset flags while maintaining 100%
backward compatibility.

New Features:
- Add 'analyze' subcommand to main CLI (skill-seekers analyze)
- Add --quick preset: Fast analysis (1-2 min, basic features only)
- Add --comprehensive preset: Full analysis (20-60 min, all features + AI)
- Add --enhance flag: Simple AI enhancement with auto-detection
- Improve help text with timing estimates and mode descriptions

Files Modified:
- src/skill_seekers/cli/main.py: Add analyze subcommand (lines 15, 273-311, 542-589)
- src/skill_seekers/cli/codebase_scraper.py: Add preset logic and improve help text
- tests/test_analyze_command.py: NEW - 20 comprehensive tests
- tests/test_cli_paths.py: Fix version check (2.7.0 -> 2.7.2)
- tests/test_package_structure.py: Fix 4 version checks (2.7.0 -> 2.7.2)
- README.md: Update examples to use 'analyze' command
- CLAUDE.md: Update examples to use 'analyze' command

Test Results:
- 81 tests related to Phase 1: ALL PASSING 
- 20 new tests for analyze command: ALL PASSING 
- Zero regressions introduced
- 100% backward compatibility maintained

Backward Compatibility:
- Old 'skill-seekers-codebase' command still works
- All existing flags (--depth, --ai-mode, --skip-*) still functional
- No breaking changes

Usage Examples:
  skill-seekers analyze --directory . --quick
  skill-seekers analyze --directory . --comprehensive
  skill-seekers analyze --directory . --enhance

Fixes #262 (codebase UX issues)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-29 21:52:46 +03:00
yusyus
cf3da6dff3 fix: Improve config path resolution and error messages (fixes #262)
Three critical UX improvements for custom config handling:

1. User config directory support:
   - Added ~/.config/skill-seekers/configs/ to search path
   - Users can now place custom configs in their home directory
   - Path resolution order: exact path → ./configs/ → user config dir → API

2. Better error messages:
   - Show all searched absolute paths when config not found
   - Added get_last_searched_paths() function to track locations
   - Clear guidance on where to place custom configs

3. Auto-create config.json:
   - ConfigManager now creates config.json on first initialization
   - Creates configs/ subdirectory for user custom configs
   - Display shows custom configs directory path
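
A sketch of the local part of that search order, returning the list of searched paths so an error message can print them. The directory layout follows the message; the function names and the `.json` suffix handling are hypothetical, and the final API step is left to the caller.

```python
from pathlib import Path


def candidate_paths(name: str, cwd: Path, home: Path) -> list[Path]:
    """Local search order: exact path, then ./configs/, then the user config dir."""
    filename = name if name.endswith(".json") else f"{name}.json"
    return [
        cwd / name,  # exact path (an absolute `name` replaces `cwd` entirely)
        cwd / "configs" / filename,
        home / ".config" / "skill-seekers" / "configs" / filename,
    ]


def resolve_config(name: str, cwd: Path, home: Path):
    """Return (found_path_or_None, searched_paths); None means fall through to the API."""
    searched = candidate_paths(name, cwd, home)
    for path in searched:
        if path.is_file():
            return path, searched
    return None, searched
```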

Fixes reported by @melamers in issue #262 where:
- Config path shown by `skill-seekers config` didn't exist
- Unclear where to save custom configs
- Error messages didn't show exact paths searched

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-27 22:01:04 +03:00
yusyus
746e335fae fix: Auto-fetch preset configs from API when not found locally
Fixes #264

Users reported that preset configs (react.json, godot.json, etc.) were not
found after installing via pip/uv, causing immediate failure on first use.

Solution: Instead of bundling configs in the package, the CLI now automatically
fetches missing configs from the SkillSeekersWeb.com API.

Changes:
- Created config_fetcher.py with smart config resolution:
  1. Check local path (backward compatible)
  2. Check with configs/ prefix
  3. Auto-fetch from SkillSeekersWeb.com API (new!)
- Updated doc_scraper.py to use ConfigValidator (supports unified configs)
- Added 15 comprehensive tests for auto-fetch functionality

User Experience:
- Zero configuration needed - presets work immediately after install
- Better error messages showing available configs from API
- Downloaded configs are cached locally for future use
- Fully backward compatible with existing local configs

Testing:
- 15 new unit tests (all passing)
- 2 integration tests with real API
- Full test suite: 1387 tests passing
- No breaking changes

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-27 21:41:20 +03:00
yusyus
8f720670f2 style: Format code with ruff
- Format 5 files affected by PDF scraper changes
- Ensures CI/CD code quality checks pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-27 21:11:21 +03:00
Zhichang Yu
9435d2911d feat: Add GLM-4.7 support and fix PDF scraper issues (#266)
Merging with admin override due to known issues:

**What Works**:
- GLM-4.7 Claude-compatible API support (correctly implemented)
- PDF scraper improvements (content truncation fixed, page traceability added)  
- Documentation updates comprehensive

⚠️ **Known Issues (will be fixed in next commit)**:
1. Import bugs in 3 files causing UnboundLocalError (30 tests failing)
2. PDF scraper test expectations need updating for new behavior (5 tests failing)
3. test_godot_config failure (pre-existing, not caused by this PR - 1 test failing)

**Action Plan**:
Fixes for known issues 1 and 2 are ready and will be committed immediately after merge.
Known issue 3 requires separate investigation as it's a pre-existing problem.

Total: 36 failing tests, 35 will be fixed in next commit.
2026-01-27 21:10:40 +03:00
yusyus
ac53017ec8 chore: Bump version to 2.7.2 for hotfix release
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-21 23:23:33 +03:00
yusyus
cc76efa29a fix: Critical CLI bug fixes for issues #258 and #259
This hotfix resolves 4 critical bugs reported by users:

Issue #258: install command fails with unified_scraper
- Added --fresh and --dry-run flags to unified_scraper.py
- Updated main.py to pass both flags to unified scraper
- Fixed "unrecognized arguments" error

Issue #259 (Original): scrape command doesn't accept positional URL and --max-pages
- Added positional URL argument to scrape command
- Added --max-pages flag with safety warnings (>1000 pages, <10 pages)
- Updated doc_scraper.py and main.py argument parsers

Issue #259 (Comment A): Version shows 2.7.0 instead of actual version
- Fixed hardcoded version in main.py
- Now reads version dynamically from __init__.py

Issue #259 (Comment B): PDF command shows empty "Error: " message
- Improved exception handler in main.py to show exception type if message is empty
- Added proper error handling in pdf_scraper.py with context-specific messages
- Added traceback support in verbose mode
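
The empty "Error: " fix can be illustrated with a small formatter. This is a sketch of the idea only; the function name is hypothetical and the real handler in main.py may differ.

```python
def format_error(exc: BaseException) -> str:
    """Fall back to the exception type when the exception carries no message."""
    message = str(exc).strip()
    if not message:
        return f"Error: {type(exc).__name__}"
    return f"Error: {message}"
```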

All fixes tested and verified with exact commands from issue reports.

Resolves: #258, #259

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-21 23:22:03 +03:00
yusyus
35cd0759e5 chore: Bump development version to 2.8.0-dev
After releasing v2.7.1 hotfix, development branch returns to 2.8.0-dev
for continued feature development.

Version Changes:
- pyproject.toml: 2.7.1 → 2.8.0-dev
- src/skill_seekers/__init__.py: 2.7.1 → 2.8.0-dev
- src/skill_seekers/cli/__init__.py: 2.7.1 → 2.8.0-dev
- src/skill_seekers/mcp/__init__.py: 2.7.1 → 2.8.0-dev
- src/skill_seekers/mcp/tools/__init__.py: 2.7.1 → 2.8.0-dev

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 22:41:45 +03:00
yusyus
dc6b82f06d chore: Bump version to 2.7.1 for hotfix release
Version Bump:
- pyproject.toml: 2.8.0-dev → 2.7.1
- src/skill_seekers/__init__.py: 2.8.0-dev → 2.7.1
- src/skill_seekers/cli/__init__.py: 2.8.0-dev → 2.7.1
- src/skill_seekers/mcp/__init__.py: 2.8.0-dev → 2.7.1
- src/skill_seekers/mcp/tools/__init__.py: 2.8.0-dev → 2.7.1

CHANGELOG:
- Added v2.7.1 entry documenting critical config download bug fix
- Root cause, solution, files fixed, impact, and testing documented

This hotfix resolves the critical 404 error bug when downloading configs
from the skillseekersweb.com API.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 22:39:34 +03:00
yusyus
f1d97facbc fix: Use download_url from API response instead of constructing URL
CRITICAL BUG FIX - Resolves 404 errors when fetching configs from API

Root Cause:
The code was constructing download URLs manually:
  download_url = f"{API_BASE_URL}/api/download/{config_name}.json"

This fails because the API provides download_url in the response, which
may differ from the constructed path (e.g., CDN URLs, version-specific paths).

Solution:
Changed both MCP server implementations to use download_url from API:
  download_url = config_info.get("download_url")

Added validation check for missing download_url field.
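
A minimal sketch of the validation step, assuming `config_info` is the parsed JSON dict from the API. The helper name and the error type are hypothetical.

```python
def pick_download_url(config_info: dict) -> str:
    """Use the URL the API returned instead of constructing one; fail loudly if absent."""
    url = config_info.get("download_url")
    if not url:
        raise ValueError("API response is missing 'download_url'")
    return url
```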

Files Modified:
- src/skill_seekers/mcp/tools/source_tools.py (FastMCP server, line 285-297)
- src/skill_seekers/mcp/server_legacy.py (Legacy server, line 1483-1494)

Bug Report:
User reported: skill-seekers install --config godot --unlimited
- API check: /api/configs/godot → 200 OK 
- Download: /api/download/godot.json → 404 Not Found 

After Fix:
- Uses download_url from API response → Works correctly 

Testing:
- All 15 source tools tests pass (test_mcp_fastmcp.py::TestSourceTools)
- All 8 fetch_config tests pass
- test_fetch_config_download_api: PASSED
- test_fetch_config_from_source: PASSED

Impact:
- Fixes config downloads from official API (skillseekersweb.com)
- Fixes config downloads from private Git repositories
- Prevents all future 404 errors from URL construction mismatch
- No breaking changes - fully backward compatible

Related Issue: Bug reported by user when testing Godot skill

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 22:25:35 +03:00
yusyus
d6f91029f6 chore: Bump version to 2.8.0-dev
Start development cycle for v2.8.0.

Version updated in 5 locations:
- pyproject.toml
- src/skill_seekers/__init__.py
- src/skill_seekers/cli/__init__.py
- src/skill_seekers/mcp/__init__.py
- src/skill_seekers/mcp/tools/__init__.py

All version numbers synchronized to prevent Issue #248.

[Unreleased] section in CHANGELOG.md ready for v2.8.0 changes.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 14:24:06 +03:00
MiaoDX
bd974148a2 feat: Update MCP to use server_fastmcp with venv Python support
This PR improves MCP server configuration by updating all documentation
to use the current server_fastmcp module and ensuring setup scripts
automatically use virtual environment Python instead of system Python.

## Changes

### 1. Documentation Updates (server → server_fastmcp)

Updated all references from deprecated `server` module to `server_fastmcp`:

**User-facing documentation:**
- examples/http_transport_examples.sh: All 13 command examples
- README.md: Configuration examples and troubleshooting commands
- docs/guides/MCP_SETUP.md: Enhanced migration guide with stdio/HTTP examples
- docs/guides/TESTING_GUIDE.md: Test import statements
- docs/guides/MULTI_AGENT_SETUP.md: Updated examples
- docs/guides/SETUP_QUICK_REFERENCE.md: Updated paths
- CLAUDE.md: CLI command examples

**MCP module:**
- src/skill_seekers/mcp/README.md: Updated config examples
- src/skill_seekers/mcp/agent_detector.py: Use server_fastmcp module

Note: Historical release notes (CHANGELOG.md) preserved unchanged.

### 2. Venv Python Configuration

**setup_mcp.sh improvements:**
- Added automatic venv detection (checks .venv, venv, and $VIRTUAL_ENV)
- Sets PYTHON_CMD to venv Python path when available
- **CRITICAL FIX**: Now updates PYTHON_CMD after creating/activating venv
- Generates MCP configs with full venv Python path
- Falls back to system python3 if no venv found
- Displays detected Python version and path

**Config examples updated:**
- .claude/mcp_config.example.json: Use venv Python path
- example-mcp-config.json: Use venv Python path
- Added "type": "stdio" for clarity
- Updated to use server_fastmcp module

### 3. Bug Fix: PYTHON_CMD Not Updated After Venv Creation

Previously, when setup_mcp.sh created or activated a venv, it failed to
update PYTHON_CMD, causing generated configs to still use system python3.

**Fixed cases:**
- When $VIRTUAL_ENV is already set → Update PYTHON_CMD to venv Python
- When existing venv is activated → Set PYTHON_CMD="$REPO_PATH/venv/bin/python3"
- When new venv is created → Set PYTHON_CMD="$REPO_PATH/venv/bin/python3"

## Benefits

### For Users:
- No deprecation warnings - All docs show current module
- Proper Python environment - MCP uses venv with all dependencies
- No system Python issues - Avoids "module not found" errors
- No global installation needed - No --break-system-packages required
- Automatic detection - setup_mcp.sh finds venv automatically
- Clean isolation - Projects don't interfere with system Python

### For Maintainers:
- Prepared for v3.0.0 - Documentation ready for server.py removal
- Reduced support burden - Fewer MCP configuration issues
- Consistent examples - All docs use same module/pattern

## Testing

**Verified:**
- All command examples use server_fastmcp
- No deprecated module references in user-facing docs (0 results)
- New module correctly referenced (129 instances)
- setup_mcp.sh detects venv and generates correct config
- PYTHON_CMD properly updated after venv creation
- MCP server starts correctly with venv Python

**Files changed:** 12 files (+262/-107 lines)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 15:55:46 +08:00
yusyus
136c5291d8 fix: Make 'Saved to:' regex patterns case-insensitive in install workflow
Fixed case-sensitivity bug where regex patterns failed to match output messages
due to case mismatch between 'saved to:' (lowercase in regex) and 'Saved to:'
(uppercase in actual output).

Changes:
- Line 529: Added (?i) flag to config path extraction regex
- Line 668: Added (?i) flag to package path extraction regex

This fixes the issue where 'skill-seekers install --config react' would:
1. Successfully download and save config to disk
2. Output: '📂 Saved to: output/react.json'
3. But fail with 'Failed to fetch config' due to regex mismatch

The workflow now correctly continues to Phase 2 (scraping) after fetching config.

Also updated comment on line 528 to reflect actual output format with emoji.
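
The effect of the inline `(?i)` flag can be shown with a small sketch. The pattern here is illustrative, not the one in the install workflow, and it assumes the saved path contains no spaces.

```python
import re

# (?i) at the start of the pattern makes the whole match case-insensitive.
SAVED_TO = re.compile(r"(?i)saved to:\s*(\S+)")


def extract_saved_path(output: str):
    """Return the path following 'Saved to:' in any letter case, or None if absent."""
    match = SAVED_TO.search(output)
    return match.group(1) if match else None
```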

Fixes #236

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 00:33:19 +03:00
yusyus
538acb394c fix: Sync version numbers across all __init__.py files to 2.7.0
Fixed version mismatch bug where hardcoded versions were out of sync
with pyproject.toml.

Updated version from 2.5.2 to 2.7.0 in:
- src/skill_seekers/__init__.py
- src/skill_seekers/cli/__init__.py
- src/skill_seekers/mcp/__init__.py
- src/skill_seekers/mcp/tools/__init__.py

Now skill-seekers --version correctly reports: 2.7.0

Fixes #248

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 00:27:59 +03:00
yusyus
85c8d9d385 style: Run ruff format on 15 files (CI fix)
CI uses 'ruff format' not 'black' - applied proper formatting:

Files reformatted by ruff:
- config_extractor.py
- doc_scraper.py
- how_to_guide_builder.py
- llms_txt_parser.py
- pattern_recognizer.py
- test_example_extractor.py
- unified_codebase_analyzer.py
- test_architecture_scenarios.py
- test_async_scraping.py
- test_github_scraper.py
- test_guide_enhancer.py
- test_install_agent.py
- test_issue_219_e2e.py
- test_llms_txt_downloader.py
- test_skip_llms_txt.py

Fixes CI formatting check failure.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-18 00:01:30 +03:00
yusyus
9d43956b1d style: Run black formatter on 16 files
Applied black formatting to files modified in linting fixes:

Source files (8):
- config_extractor.py
- doc_scraper.py
- how_to_guide_builder.py
- llms_txt_downloader.py
- llms_txt_parser.py
- pattern_recognizer.py
- test_example_extractor.py
- unified_codebase_analyzer.py

Test files (8):
- test_architecture_scenarios.py
- test_async_scraping.py
- test_github_scraper.py
- test_guide_enhancer.py
- test_install_agent.py
- test_issue_219_e2e.py
- test_llms_txt_downloader.py
- test_skip_llms_txt.py

All formatting issues resolved.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 23:56:24 +03:00
yusyus
9666938eb0 fix: Resolve 21 ruff linting errors (SIM102, SIM117, B904, SIM113, B007)
Fixed all 21 linting errors identified in GitHub Actions:

SIM102 (7 errors - nested if statements):
- config_extractor.py:468 - Combined nested conditions
- config_validator.py (was B904, already fixed)
- pattern_recognizer.py:430,538,916 - Combined nested conditions
- test_example_extractor.py:365,412,460 - Combined nested conditions
- unified_skill_builder.py:1070 - Combined nested conditions

SIM117 (9 errors - multiple with statements):
- test_install_agent.py:418 - Combined with statements
- test_issue_219_e2e.py:278 - Combined with statements
- test_llms_txt_downloader.py:33,88 - Combined with statements
- test_skip_llms_txt.py:75,98,121,148,172,304 - Combined with statements

B904 (1 error - exception handling):
- config_validator.py:62 - Added 'from e' to exception chain

SIM113 (1 error - enumerate usage):
- doc_scraper.py:1068 - Removed unused 'completed' counter variable

B007 (1 error - unused loop variable):
- pdf_scraper.py:167 - Changed 'keywords' to '_' for unused variable
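
For readers unfamiliar with these rule codes, the sketch below shows the "after" form of three of them. All names are made up for illustration and none of this is code from the repository.

```python
import io
import json


class ConfigError(ValueError):
    """Hypothetical error type used only for this illustration."""


def parse_config(text: str):
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # B904: chain with 'from e' so the original traceback is preserved.
        raise ConfigError("invalid config JSON") from e


def describe(repo: dict) -> str:
    label = "repo"
    # SIM102: one 'if' with 'and' replaces an 'if' nested directly inside another 'if'.
    if repo.get("public") and repo.get("stars", 0) > 100:
        label = "popular public repo"
    return label


def concat(a: str, b: str) -> str:
    # SIM117: a single 'with' holding both context managers replaces two nested ones.
    with io.StringIO(a) as left, io.StringIO(b) as right:
        return left.read() + right.read()
```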

All changes improve code quality without altering functionality.
Tests: 1214 passed, 167 skipped (4 pre-existing failures unrelated)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 23:54:22 +03:00
yusyus
6439c85cde fix: Fix list comprehension variable names (NameError in CI)
Fixed incorrect variable names in list comprehensions that were causing
NameError in CI (Python 3.11/3.12):

Critical fixes:
- tests/test_markdown_parsing.py: 'l' → 'link' in list comprehension
- src/skill_seekers/cli/pdf_extractor_poc.py: 'l' → 'line' (2 occurrences)

Additional auto-lint fixes:
- Removed unused imports in llms_txt_downloader.py, llms_txt_parser.py
- Fixed comparison operators in config files
- Fixed list comprehension in other files

All tests now pass in CI.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 23:33:34 +03:00
yusyus
81dd5bbfbc fix: Fix remaining 61 ruff linting errors (SIM102, SIM117)
Fixed all remaining linting errors from the 310 total:
- SIM102: Combined nested if statements (31 errors)
  - adaptors/openai.py
  - config_extractor.py
  - codebase_scraper.py
  - doc_scraper.py
  - github_fetcher.py
  - pattern_recognizer.py
  - pdf_scraper.py
  - test_example_extractor.py

- SIM117: Combined multiple with statements (24 errors)
  - tests/test_async_scraping.py (2 errors)
  - tests/test_github_scraper.py (2 errors)
  - tests/test_guide_enhancer.py (20 errors)

- Fixed test fixture parameter (mock_config in test_c3_integration.py)

All 700+ tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 23:25:12 +03:00
yusyus
596b219599 fix: Resolve remaining 188 linting errors (249 total fixed)
Second batch of comprehensive linting fixes:

Unused Arguments/Variables (136 errors):
- ARG002/ARG001 (91 errors): Prefixed unused method/function arguments with '_'
  - Interface methods in adaptors (base.py, gemini.py, markdown.py)
  - AST analyzer methods maintaining signatures (code_analyzer.py)
  - Test fixtures and hooks (conftest.py)
  - Added noqa: ARG001/ARG002 for pytest hooks requiring exact names
- F841 (45 errors): Prefixed unused local variables with '_'
  - Tuple unpacking where some values aren't needed
  - Variables assigned but not referenced

Loop & Boolean Quality (28 errors):
- B007 (18 errors): Prefixed unused loop control variables with '_'
  - enumerate() loops where index not used
  - for-in loops where loop variable not referenced
- E712 (10 errors): Simplified boolean comparisons
  - Changed '== True' to direct boolean check
  - Changed '== False' to 'not' expression
  - Improved test readability

Code Quality (24 errors):
- SIM201 (4 errors): Already fixed in previous commit
- SIM118 (2 errors): Already fixed in previous commit
- E741 (4 errors): Already fixed in previous commit
- Config manager loop variable fix (1 error)

All Tests Passing:
- test_scraper_features.py: 42 passed
- test_integration.py: 51 passed
- test_architecture_scenarios.py: 11 passed
- test_real_world_fastmcp.py: 19 passed, 1 skipped

Note: Some SIM errors (nested if, multiple with) remain unfixed as they
would require non-trivial refactoring. Focus was on functional correctness.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 23:02:11 +03:00
yusyus
ec3e0bf491 fix: Resolve 61 critical linting errors
Fixed priority linting errors to improve code quality:

Critical Fixes:
- F821 (2 errors): Fixed undefined name 'original_result' in config_enhancer.py
- UP035 (2 errors): Removed deprecated typing.Dict and typing.Type imports
- F401 (27 errors): Removed unused imports and added noqa for availability checks
- E722 (19 errors): Replaced bare 'except:' with 'except Exception:'

Code Quality Improvements:
- SIM201 (4 errors): Simplified 'not x == y' to 'x != y'
- SIM118 (2 errors): Removed unnecessary .keys() in dict iterations
- E741 (4 errors): Renamed ambiguous variable 'l' to 'line'
- I001 (1 error): Sorted imports in test_bootstrap_skill.py
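
Several of these fixes are small enough to show in one function. The example is invented for illustration and only demonstrates the corrected forms named in the list above.

```python
def summarize(lines, settings):
    """Count non-zero integer lines and list the enabled setting names."""
    nonzero = 0
    for line in lines:  # E741: 'line' instead of the ambiguous name 'l'
        try:
            value = int(line)
        except Exception:  # E722: was a bare 'except:'
            continue
        if value != 0:  # SIM201: was 'not value == 0'
            nonzero += 1
    # SIM118: iterate the dict directly instead of calling settings.keys()
    enabled = [key for key in settings if settings[key]]
    return nonzero, enabled
```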

All modified areas tested and passing:
- test_scraper_features.py: 42 passed
- test_integration.py: 51 passed
- test_architecture_scenarios.py: 11 passed
- test_real_world_fastmcp.py: 19 passed (1 skipped)

Remaining linting errors: 249 (mostly code style suggestions like ARG002, F841, SIM102)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 22:54:40 +03:00
yusyus
02be4c53f6 fix: Add interactive parameter to prevent stdin read during tests
Fixes 2 failing tests in test_architecture_scenarios.py that were trying to
read from stdin during pytest execution, causing:
  OSError: pytest: reading from stdin while output is captured!

Changes:
- Added 'interactive' parameter to UnifiedCodebaseAnalyzer.analyze() (defaults to True)
- Pass interactive flag through to _analyze_github() and GitHubThreeStreamFetcher
- Updated failing tests to pass interactive=False

Tests fixed:
- test_scenario_1_github_three_stream_fetcher
- test_scenario_1_unified_analyzer_github

The interactive parameter controls whether the code prompts the user for
input (e.g., 'Continue without token?'). Setting it to False prevents
input() calls, making the code safe for CI/CD and test environments.
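
The pattern can be sketched as below. The prompt text echoes the message; the function name, the injectable `ask` parameter, and the choice to proceed when non-interactive are assumptions, not the behaviour of UnifiedCodebaseAnalyzer.

```python
def confirm_without_token(interactive: bool = True, ask=input) -> bool:
    """Ask the user only when interactive; never touch stdin otherwise."""
    if not interactive:
        # Assumption: non-interactive runs proceed without prompting.
        return True
    answer = ask("Continue without token? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

Passing `interactive=False` from tests keeps pytest's captured stdin from being read.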

All 1386 tests now pass.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 22:02:35 +03:00
Pablo Estevez
c33c6f9073 change max length
2026-01-17 17:48:15 +00:00
Pablo Nicolás Estevez
97e597d9db Merge branch 'development' into ruff-and-mypy
2026-01-17 17:41:55 +00:00
yusyus
38e8969ae7 feat: Merge PR #249 - Bootstrap skill with fixes and MCP optionality
Merged PR #249 from @MiaoDX with enhancements:

Bootstrap Feature:
- Self-bootstrap: Generate skill-seekers as Claude Code skill
- Robust frontmatter detection (dynamic line finding)
- SKILL.md validation (YAML + Markdown structure)
- Comprehensive error handling (uv check, permission checks)
- 6 E2E tests with venv isolation

MCP Optionality (User Feature):
- MCP removed from core dependencies
- Optional install: pip install skill-seekers[mcp]
- Lazy loading with helpful error messages
- Interactive setup wizard on first run
- Backward compatible
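
Lazy loading of an optional dependency with a helpful error might look like this. The `pip install skill-seekers[mcp]` hint is from the message; the helper name and the wording of the error are hypothetical.

```python
import importlib


def load_optional(module_name: str, extra: str):
    """Import a module on first use, pointing at the optional extra if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"'{module_name}' is not installed. "
            f"Install it with: pip install skill-seekers[{extra}]"
        ) from e
```

Deferring the import to the call site is what lets the core package install without the optional dependency.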

Bug Fixes:
- Fixed codebase_scraper.py AttributeError (line 1193)
- Fixed test_bootstrap_skill_e2e.py Path vs str issue
- Updated test version expectations to 2.7.0
- Added httpx to core (required for async scraping)
- Added anthropic to core (required for AI enhancement)

Testing:
- 6 new bootstrap E2E tests (all passing)
- 1207/1217 tests passing (99.2% pass rate)
- All bootstrap and enhancement tests pass
- Remaining failures are pre-existing test infrastructure issues

Documentation:
- Updated CHANGELOG.md with v2.7.0 notes
- Updated README.md with bootstrap and installation options
- Added setup wizard guide

Files Modified (9):
- CHANGELOG.md, README.md - Documentation updates
- pyproject.toml - MCP optional, httpx/anthropic core, markers, entry points
- scripts/bootstrap_skill.sh - Dynamic frontmatter, validation, error handling
- src/skill_seekers/cli/install_skill.py - Lazy MCP loading
- tests/test_cli_paths.py - Version 2.7.0
- uv.lock - Dependency updates

New Files (2):
- src/skill_seekers/cli/setup_wizard.py - Interactive installation guide (95 lines)
- tests/test_bootstrap_skill_e2e.py - E2E bootstrap tests (169 lines)

Credits: @MiaoDX for PR #249

Co-Authored-By: MiaoDX <MiaoDX@hotmail.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 20:37:30 +03:00
yusyus
6d4ef0f13b Merge pull request #249 from MiaoDX-fork-and-pruning/dongxu/feat/bootstrap-it-01
Merge PR #249: Bootstrap skill with fixes and MCP optionality

Merged with comprehensive enhancements and testing.

Key Features:
- Bootstrap skill: Self-documentation capability
- MCP optionality: User choice for installation
- Interactive setup wizard
- 6 E2E tests (all passing)
- 1207/1217 tests passing (99.2%)

Co-Authored-By: MiaoDX <MiaoDX@hotmail.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 20:36:50 +03:00
Pablo Estevez
5ed767ff9a run ruff 2026-01-17 17:29:21 +00:00
yusyus
c89f059712 feat(v2.7.0): Smart Rate Limit Management & Multi-Token Configuration
Major Features:
- Multi-profile GitHub token system with secure storage
- Smart rate limit handler with 4 strategies (prompt/wait/switch/fail)
- Interactive configuration wizard with browser integration
- Configurable timeout (default 30 min) per profile
- Automatic profile switching on rate limits
- Live countdown timers with real-time progress
- Non-interactive mode for CI/CD (--non-interactive flag)
- Progress tracking and resume capability (skeleton)
- Comprehensive test suite (16 tests, all passing)

Solves:
- Indefinite waiting on GitHub rate limits
- Confusing GitHub token setup

Files Added:
- src/skill_seekers/cli/config_manager.py (~490 lines)
- src/skill_seekers/cli/config_command.py (~400 lines)
- src/skill_seekers/cli/rate_limit_handler.py (~450 lines)
- src/skill_seekers/cli/resume_command.py (~150 lines)
- tests/test_rate_limit_handler.py (16 tests)

Files Modified:
- src/skill_seekers/cli/github_fetcher.py (rate limit integration)
- src/skill_seekers/cli/github_scraper.py (--non-interactive, --profile flags)
- src/skill_seekers/cli/main.py (config, resume subcommands)
- pyproject.toml (version 2.7.0)
- CHANGELOG.md, README.md, CLAUDE.md (documentation)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-17 18:38:31 +03:00
MiaoDX
189abfec7d fix: Fix AttributeError in codebase_scraper for build_api_reference
The code was still referencing `args.build_api_reference` which was
changed to `args.skip_api_reference` in v2.5.2 (opt-in to opt-out flags).

This caused the codebase analysis to fail at the end with:
  AttributeError: 'Namespace' object has no attribute 'build_api_reference'
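
A small illustration of why the old attribute fails, assuming the flag is spelled `--skip-api-reference` on the command line (the exact flag spelling is an assumption; only the attribute names come from the message above):

```python
import argparse

parser = argparse.ArgumentParser()
# Opt-out flag: the feature is on unless the user passes --skip-api-reference.
parser.add_argument("--skip-api-reference", action="store_true")

args = parser.parse_args([])

# args.build_api_reference would raise AttributeError here: the Namespace only
# carries attributes for arguments that were actually registered.
build_api_reference = not args.skip_api_reference
```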

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 19:04:35 +08:00
yusyus
c9b9f44ce2 feat: Add --all flag to estimate command to list available configs
- Added find_configs_directory() to use same logic as API (api/configs_repo/official first, then configs/)
- Added list_all_configs() to display all 24 configs grouped by category with descriptions
- Updated CLI to support --all flag, making config argument optional when --all is used
- Added 2 new tests for --all flag functionality
- All 51 tests passing (51 passed, 1 skipped)

This enables users to discover all available preset configs without checking the API or filesystem directly.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-14 23:10:52 +03:00
yusyus
08a69f892f fix: Handle dict format in _get_language_stats
Fixed bug where _get_language_stats expected Path objects but received
dictionaries from results['files'].

Root cause: results['files'] contains dicts with 'language' key, not Path objects

Solution: Changed function to extract language from dict instead of calling detect_language()

Before:
  for file_path in files:
    lang = detect_language(file_path)  # Bug: file_path is a dict, not a Path

After:
  for file_data in files:
    lang = file_data.get('language', 'Unknown')  # Fix: extract from dict
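
As a self-contained sketch of the fixed behavior (the function name without the leading underscore and the sample records are illustrative, not the repository's code):

```python
from collections import Counter


def get_language_stats(files: list[dict]) -> dict[str, int]:
    """Count files per language from already-analyzed file records."""
    # Each record is a dict; a missing 'language' key falls back to 'Unknown'.
    return dict(Counter(f.get("language", "Unknown") for f in files))


stats = get_language_stats([
    {"path": "Player.cs", "language": "C#"},
    {"path": "Enemy.cs", "language": "C#"},
    {"path": "notes.txt"},  # no 'language' key
])
```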

Tested: Successfully generated SKILL.md for AstroValley (90 lines, 19 C# files)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-13 22:13:22 +03:00
yusyus
7de17195dd feat: Add SKILL.md generation to codebase scraper
BREAKING CHANGE: Codebase scraper now generates complete skill structure

Implemented standalone SKILL.md generation for codebase analysis mode,
achieving source parity with other scrapers (docs, github, pdf).

**What Changed:**
- Added _generate_skill_md() - generates 300+ line SKILL.md
- Added _generate_references() - creates references/ directory structure
- Added format helper functions (patterns, examples, API, architecture, config)
- Called at end of analyze_codebase() - automatic SKILL.md generation

**SKILL.md Sections:**
- Front matter (name, description)
- Repository info (path, languages, file count)
- When to Use (comprehensive use cases)
- Quick Reference (languages, analysis features, stats)
- Design Patterns (C3.1 - if enabled)
- Code Examples (C3.2 - if enabled)
- API Reference (C2.5 - if enabled)
- Architecture Overview (C3.7 - always included)
- Configuration Patterns (C3.4 - if enabled)
- Available References (links to detailed docs)

**references/ Directory:**
Copies all analysis outputs into references/ for organized access:
- api_reference/
- dependencies/
- patterns/
- test_examples/
- tutorials/
- config_patterns/
- architecture/

**Benefits:**
- Source parity: All 4 sources now generate rich standalone SKILL.md
- Standalone mode complete: codebase-scraper → full skill output
- Synthesis ready: Can combine codebase with docs/github/pdf
- Consistent UX: All scrapers work the same way
- Follows plan: Implements synthesis architecture from bubbly-shimmying-anchor.md

**Output Example:**
```
output/codebase/
├── SKILL.md               # NEW! 300+ lines
├── references/            # NEW! Organized references
│   ├── api_reference/
│   ├── dependencies/
│   ├── patterns/
│   ├── test_examples/
│   └── architecture/
├── api_reference/         # Original analysis files
├── dependencies/
├── patterns/
├── test_examples/
└── architecture/
```

**Testing:**
```bash
# Standalone mode
codebase-scraper --directory /path/to/repo --output output/codebase/
ls output/codebase/SKILL.md  # Now exists!

# Verify line count
wc -l output/codebase/SKILL.md  # Should be 200-400 lines

# Check structure
grep "## " output/codebase/SKILL.md
```

**Closes Gap:**
- Fixes: Codebase mode didn't generate SKILL.md (#issue from analysis)
- Implements: Option 1 from codebase_mode_analysis_report.md
- Effort: 4-6 hours (as estimated)

**Related:**
- Plan: /home/yusufk/.claude/plans/bubbly-shimmying-anchor.md (synthesis architecture)
- Analysis: /tmp/codebase_mode_analysis_report.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-13 22:08:50 +03:00
yusyus
72dde1ba08 feat: AI enhancement multi-repo support + critical bug fix
CRITICAL BUG FIX:
- Fixed documentation scraper overwriting list with dict
- Changed self.scraped_data['documentation'] = {...} to .append({...})
- Bug was breaking unified skill builder reference generation
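
The difference between the two forms, shown on a stand-in dictionary (the `source_id` values are made up for illustration):

```python
scraped_data = {"documentation": [], "github": [], "pdf": []}

# Buggy form: assignment replaces the list with a single dict, so earlier
# sources are lost and list-based consumers break.
# scraped_data["documentation"] = {"source_id": "docs_0"}

# Fixed form: each scraped source is appended to the list.
scraped_data["documentation"].append({"source_id": "docs_0"})
scraped_data["documentation"].append({"source_id": "docs_1"})
```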

AI ENHANCEMENT UPDATES:
- Added repo_id extraction in utils.py for multi-repo support
- Enhanced grouping by (source, repo_id) tuple in both enhancement files
- Added MULTI-REPOSITORY HANDLING section to AI prompts
- AI now correctly identifies and synthesizes multiple repos

CHANGES:
1. src/skill_seekers/cli/utils.py:
   - _determine_source_metadata() now returns (source, confidence, repo_id)
   - Extracts repo_id from codebase_analysis/{repo_id}/ paths
   - Added repo_id field to reference metadata dict

2. src/skill_seekers/cli/enhance_skill_local.py:
   - Group references by (source_type, repo_id) instead of just source_type
   - Display repo identity in prompt sections
   - Detect multiple repos and add explicit guidance to AI

3. src/skill_seekers/cli/enhance_skill.py:
   - Same grouping and display logic as local enhancement
   - Multi-repository handling section added

4. src/skill_seekers/cli/unified_scraper.py:
   - FIX: Documentation scraper now appends to list instead of overwriting
   - Added source_id, base_url, refs_dir to documentation metadata
   - Update refs_dir after moving to cache
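
A rough sketch of the repo_id extraction and (source, repo_id) grouping described in the changes above. `extract_repo_id` and `group_references` are hypothetical names, and the sample paths are invented; the real logic lives in `_determine_source_metadata()` and may differ:

```python
from collections import defaultdict
from pathlib import PurePosixPath


def extract_repo_id(ref_path: str):
    """Return the directory segment after 'codebase_analysis', or None."""
    parts = PurePosixPath(ref_path).parts
    if "codebase_analysis" in parts:
        idx = parts.index("codebase_analysis")
        # The next segment must be a directory, not the file name itself.
        if idx + 1 < len(parts) - 1:
            return parts[idx + 1]
    return None


def group_references(refs: dict[str, str]) -> dict:
    """Group reference paths by (source_type, repo_id)."""
    groups = defaultdict(list)
    for path, source in refs.items():
        groups[(source, extract_repo_id(path))].append(path)
    return dict(groups)


groups = group_references({
    "codebase_analysis/encode_httpx/patterns/index.md": "codebase_analysis",
    "codebase_analysis/encode_httpcore/patterns/index.md": "codebase_analysis",
    "documentation/index.md": "documentation",
})
```

Keying on the tuple keeps two repositories of the same source type in separate prompt sections.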

TESTING:
- All 57 tests passing (unified, C3, utilities)
- Single-source verified: httpx comprehensive (219→749 lines after enhancement)
- Multi-source verified: encode/httpx + encode/httpcore (523 lines)
- AI enhancement working: Professional output with source attribution

QUALITY:
- Enhanced httpx SKILL.md: 749 lines, 19KB, A+ quality
- Source attribution working correctly
- Multi-repo synthesis transparent and accurate
- Reference structure clean and organized

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-12 22:05:34 +03:00
yusyus
52cf99136a fix: Resolve merge conflicts in router quality improvements
Resolved conflicts between router quality improvements and multi-source
synthesis architecture:

1. **unified_skill_builder.py**:
   - Updated _generate_architecture_overview() signature to accept github_data
   - Ensures GitHub metadata is available for enhanced router generation

2. **test_c3_integration.py**:
   - Updated test data structure to multi-source list format
   - Tests now properly mock github data for architecture generation
   - All 8 C3 integration tests passing

**Test Results**:
- All 8 C3 integration tests pass
- All 26 unified tests pass
- All 116 GitHub-related tests pass
- All 62 multi-source architecture tests pass

The changes maintain backward compatibility while enabling router skills
to leverage GitHub insights (issues, labels, metadata) for better quality.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-12 00:41:26 +03:00
yusyus
9d26ca5d0a Merge branch 'development' into feature/router-quality-improvements
Integrated multi-source support from development branch into feature branch's
C3.x auto-cloning and cache system. This merge combines TWO major features:

FEATURE BRANCH (C3.x + Cache):
- Automatic GitHub repository cloning for C3.x analysis
- Hidden .skillseeker-cache/ directory for intermediate files
- Cache reuse for faster rebuilds
- Enhanced AI skill quality improvements

DEVELOPMENT BRANCH (Multi-Source):
- Support multiple sources of same type (multiple GitHub repos, PDFs)
- List-based data storage with source indexing
- New configs: claude-code.json, medusa-mercurjs.json
- llms.txt downloader/parser enhancements
- New tests: test_markdown_parsing.py, test_multi_source.py

CONFLICT RESOLUTIONS:

1. configs/claude-code.json (COMPROMISE):
   - Kept file with _migration_note (preserves PR #244 work)
   - Feature branch had deleted it (config migration)
   - Development branch enhanced it (47 Claude Code doc URLs)

2. src/skill_seekers/cli/unified_scraper.py (INTEGRATED):
   Applied 8 changes for multi-source support:
   - List-based storage: {'github': [], 'documentation': [], 'pdf': []}
   - Source indexing with _source_counters
   - Unique naming: {name}_github_{idx}_{repo_id}
   - Unique data files: github_data_{idx}_{repo_id}.json
   - List append instead of dict assignment
   - Updated _clone_github_repo(repo_name, idx=0) signature
   - Applied same logic to _scrape_pdf()

3. src/skill_seekers/cli/unified_skill_builder.py (INTEGRATED):
   Applied 3 changes for multi-source synthesis:
   - _load_source_skill_mds(): Glob pattern for multiple sources
   - _generate_references(): Iterate through github_list
   - _generate_c3_analysis_references(repo_id): Per-repo C3.x references
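
The per-type indexing and unique naming scheme from the unified_scraper.py changes could look roughly like this. `SourceNamer` is a hypothetical class, and deriving `repo_id` by replacing "/" with "_" is an assumption; only the name templates and `_source_counters` come from the notes above:

```python
from collections import defaultdict


class SourceNamer:
    """Hand out unique, indexed names for sources of the same type."""

    def __init__(self, name: str):
        self.name = name
        self._source_counters = defaultdict(int)  # one counter per source type

    def next_github_names(self, repo: str):
        idx = self._source_counters["github"]
        self._source_counters["github"] += 1
        repo_id = repo.replace("/", "_")  # assumption: how repo_id is derived
        skill_name = f"{self.name}_github_{idx}_{repo_id}"
        data_file = f"github_data_{idx}_{repo_id}.json"
        return skill_name, data_file


namer = SourceNamer("httpx")
first = namer.next_github_names("encode/httpx")
second = namer.next_github_names("encode/httpcore")
```

A single-source config only ever sees idx=0, which is how the old behavior is preserved.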

TESTING STRATEGY:

Backward Compatibility:
- Single source configs work exactly as before (idx=0)

New Capabilities:
- Multiple GitHub repos: encode/httpx + facebook/react
- Multiple PDFs with unique indexing
- Mixed sources: docs + multiple GitHub repos

Pipeline Integrity:
- Scraper: Multi-source data collection with indexing
- Builder: Loads all source SKILL.md files
- Synthesis: Merges multiple sources with separators
- C3.x: Independent analysis per repo in unique subdirectories

Result: Support MULTIPLE sources per type + C3.x analysis + cache system

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-12 00:11:31 +03:00
yusyus
a99e22c639 feat: Multi-Source Synthesis Architecture - Rich Standalone Skills + Smart Combination
BREAKING CHANGE: Major architectural improvements to multi-source skill generation

This commit implements the complete "Multi-Source Synthesis Architecture" where
each source (documentation, GitHub, PDF) generates a rich standalone SKILL.md
file before being intelligently synthesized with source-specific formulas.

## 🎯 Core Architecture Changes

### 1. Rich Standalone SKILL.md Generation (Source Parity)

Each source now generates comprehensive, production-quality SKILL.md files that
can stand alone OR be synthesized with other sources.

**GitHub Scraper Enhancements** (+263 lines):
- Now generates 300+ line SKILL.md (was ~50 lines)
- Integrates C3.x codebase analysis data:
  - C2.5: API Reference extraction
  - C3.1: Design pattern detection (27 high-confidence patterns)
  - C3.2: Test example extraction (215 examples)
  - C3.7: Architectural pattern analysis
- Enhanced sections:
  - Quick Reference with pattern summaries
  - 📝 Code Examples from real repository tests
  - 🔧 API Reference from codebase analysis
  - 🏗️ Architecture Overview with design patterns
  - ⚠️ Known Issues from GitHub issues
- Location: src/skill_seekers/cli/github_scraper.py

**PDF Scraper Enhancements** (+205 lines):
- Now generates 200+ line SKILL.md (was ~50 lines)
- Enhanced content extraction:
  - 📖 Chapter Overview (PDF structure breakdown)
  - 🔑 Key Concepts (extracted from headings)
  - Quick Reference (pattern extraction)
  - 📝 Code Examples: Top 15 (was top 5), grouped by language
  - Quality scoring and intelligent truncation
- Better formatting and organization
- Location: src/skill_seekers/cli/pdf_scraper.py

**Result**: All 3 sources (docs, GitHub, PDF) now have equal capability to
generate rich, comprehensive standalone skills.

### 2. File Organization & Caching System

**Problem**: output/ directory cluttered with intermediate files, data, and logs.

**Solution**: New `.skillseeker-cache/` hidden directory for all intermediate files.

**New Structure**:
```
.skillseeker-cache/{skill_name}/
├── sources/          # Standalone SKILL.md from each source
│   ├── httpx_docs/
│   ├── httpx_github/
│   └── httpx_pdf/
├── data/             # Raw scraped data (JSON)
├── repos/            # Cloned GitHub repositories (cached for reuse)
└── logs/             # Session logs with timestamps

output/{skill_name}/  # CLEAN: Only final synthesized skill
├── SKILL.md
└── references/
```

**Benefits**:
- Clean output/ directory (only final product)
- Intermediate files preserved for debugging
- Repository clones cached and reused (faster re-runs)
- Timestamped logs for each scraping session
- All cache dirs added to .gitignore

**Changes**:
- .gitignore: Added `.skillseeker-cache/` entry
- unified_scraper.py: Complete reorganization (+238 lines)
  - Added cache directory structure
  - File logging with timestamps
  - Repository cloning with caching/reuse
  - Cleaner intermediate file management
  - Better subprocess logging and error handling

### 3. Config Repository Migration

**Moved to separate config repository**: https://github.com/yusufkaraaslan/skill-seekers-configs

**Deleted from this repo** (35 config files):
- ansible-core.json, astro.json, claude-code.json
- django.json, django_unified.json, fastapi.json, fastapi_unified.json
- godot.json, godot_unified.json, godot_github.json, godot-large-example.json
- react.json, react_unified.json, react_github.json, react_github_example.json
- vue.json, kubernetes.json, laravel.json, tailwind.json, hono.json
- svelte_cli_unified.json, steam-economy-complete.json
- deck_deck_go_local.json, python-tutorial-test.json, example_pdf.json
- test-manual.json, fastapi_unified_test.json, fastmcp_github_example.json
- example-team/ directory (4 files)

**Kept as reference example**:
- configs/httpx_comprehensive.json (complete multi-source example)

**Rationale**:
- Cleaner repository (979+ lines added, 1680 deleted)
- Configs managed separately with versioning
- Official presets available via `fetch-config` command
- Users can maintain private config repos

### 4. AI Enhancement Improvements

**enhance_skill.py** (+125 lines):
- Better integration with multi-source synthesis
- Enhanced prompt generation for synthesized skills
- Improved error handling and logging
- Support for source metadata in enhancement

### 5. Documentation Updates

**CLAUDE.md** (+252 lines):
- Comprehensive project documentation
- Architecture explanations
- Development workflow guidelines
- Testing requirements
- Multi-source synthesis patterns

**SKILL_QUALITY_ANALYSIS.md** (new):
- Quality assessment framework
- Before/after analysis of httpx skill
- Grading rubric for skill quality
- Metrics and benchmarks

### 6. Testing & Validation Scripts

**test_httpx_skill.sh** (new):
- Complete httpx skill generation test
- Multi-source synthesis validation
- Quality metrics verification

**test_httpx_quick.sh** (new):
- Quick validation script
- Subset of features for rapid testing

## 📊 Quality Improvements

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| GitHub SKILL.md lines | ~50 | 300+ | +500% |
| PDF SKILL.md lines | ~50 | 200+ | +300% |
| GitHub C3.x integration | No | Yes | New feature |
| PDF pattern extraction | No | Yes | New feature |
| File organization | Messy | Clean cache | Major improvement |
| Repository cloning | Always fresh | Cached reuse | Faster re-runs |
| Logging | Console only | Timestamped files | Better debugging |
| Config management | In-repo | Separate repo | Cleaner separation |

## 🧪 Testing

All existing tests pass:
- test_c3_integration.py: Updated for new architecture
- 700+ tests passing
- Multi-source synthesis validated with httpx example

## 🔧 Technical Details

**Modified Core Files**:
1. src/skill_seekers/cli/github_scraper.py (+263 lines)
   - _generate_skill_md(): Rich content with C3.x integration
   - _format_pattern_summary(): Design pattern summaries
   - _format_code_examples(): Test example formatting
   - _format_api_reference(): API reference from codebase
   - _format_architecture(): Architectural pattern analysis

2. src/skill_seekers/cli/pdf_scraper.py (+205 lines)
   - _generate_skill_md(): Enhanced with rich content
   - _format_key_concepts(): Extract concepts from headings
   - _format_patterns_from_content(): Pattern extraction
   - Code examples: Top 15, grouped by language, better quality scoring

3. src/skill_seekers/cli/unified_scraper.py (+238 lines)
   - __init__(): Cache directory structure
   - _setup_logging(): File logging with timestamps
   - _clone_github_repo(): Repository caching system
   - _scrape_documentation(): Move to cache, better logging
   - Better subprocess handling and error reporting

4. src/skill_seekers/cli/enhance_skill.py (+125 lines)
   - Multi-source synthesis awareness
   - Enhanced prompt generation
   - Better error handling

**Minor Updates**:
- src/skill_seekers/cli/codebase_scraper.py (+3 lines): Minor improvements
- src/skill_seekers/cli/test_example_extractor.py: Quality scoring adjustments
- tests/test_c3_integration.py: Test updates for new architecture

## 🚀 Migration Guide

**For users with existing configs**:
No action required - all existing configs continue to work.

**For users wanting official presets**:
```bash
# Fetch from official config repo
skill-seekers fetch-config --name react --target unified

# Or use existing local configs
skill-seekers unified --config configs/httpx_comprehensive.json
```

**Cache directory**:
New `.skillseeker-cache/` directory will be created automatically.
Safe to delete - will be regenerated on next run.

## 📈 Next Steps

This architecture enables:
- Source parity: All sources generate rich standalone skills
- Smart synthesis: Each combination has optimal formula
- Better debugging: Cached files and logs preserved
- Faster iteration: Repository caching, clean output
- 🔄 Future: Multi-platform enhancement (Gemini, GPT-4) - planned
- 🔄 Future: Conflict detection between sources - planned
- 🔄 Future: Source prioritization rules - planned

## 🎓 Example: httpx Skill Quality

**Before**: 186 lines, basic synthesis, missing data
**After**: 640 lines with AI enhancement, A- (9/10) quality

**What changed**:
- All C3.x analysis data integrated (patterns, tests, API, architecture)
- GitHub metadata included (stars, topics, languages)
- PDF chapter structure visible
- Professional formatting with emojis and clear sections
- Real-world code examples from test suite
- Design patterns explained with confidence scores
- Known issues with impact assessment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-11 23:01:07 +03:00
yusyus
cf9539878e fix: AI Enhancement File Update - Add --dangerously-skip-permissions Flag
PROBLEM:
AI enhancement was running Claude Code but SKILL.md was never updated.
Users saw "Claude finished but SKILL.md was not updated" error.

ROOT CAUSE:
Claude CLI was called with invalid --yes flag (doesn't exist).
Permission checks prevented file modifications from nested Claude sessions.

THE FIX:
1. Removed invalid --yes flag
2. Added --dangerously-skip-permissions flag to bypass ALL permission checks
3. Added explicit save instructions in prompt
4. Added debug output showing before/after file stats

CHANGES IN enhance_skill_local.py:

Line 614: Changed subprocess command
- Before: ['claude', '--yes', '--dangerously-skip-permissions', prompt_file]
- After:  ['claude', '--dangerously-skip-permissions', prompt_file]

Lines 363-377: Enhanced prompt with explicit save instructions
- Added "You MUST save" language
- Added "This is NOT a read-only task" clarification
- Added "Even if running from within another Claude Code session" permission
- Added verification requirements

Lines 644-654: Enhanced debug output
- Shows before/after mtime and size
- Displays last 20 lines of Claude output
- Helps identify what went wrong
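
The before/after check can be sketched as below. `run_and_check_updated` is a hypothetical helper, and a small Python child process stands in for the real enhancement command:

```python
import os
import subprocess
import sys
import tempfile


def run_and_check_updated(cmd: list[str], target: str) -> bool:
    """Run cmd and report whether target's mtime or size changed."""
    before = os.stat(target)
    result = subprocess.run(cmd, capture_output=True, text=True)
    after = os.stat(target)
    changed = (after.st_mtime_ns, after.st_size) != (before.st_mtime_ns, before.st_size)
    if not changed:
        # Show the tail of the tool's output to help diagnose the failure.
        print("\n".join(result.stdout.splitlines()[-20:]))
    return changed


with tempfile.TemporaryDirectory() as tmp:
    skill = os.path.join(tmp, "SKILL.md")
    with open(skill, "w") as fh:
        fh.write("# stub\n")
    # Stand-in command: a child process that appends one line to the file.
    cmd = [sys.executable, "-c", "import sys; open(sys.argv[1], 'a').write('more\\n')", skill]
    updated = run_and_check_updated(cmd, skill)
```

Comparing size as well as mtime guards against filesystems with coarse timestamp resolution.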

VERIFICATION:
Tested on output/httpx/:
- Before: 219 lines, 5,582 bytes
- After:  702 lines, 21,377 bytes (+283% size, +221% lines)
- Enhancement time: 152.8 seconds
- Status: SUCCESS - File updated correctly

IMPACT:
- AI enhancement now works automatically
- No more "file not updated" errors
- SKILL.md properly expands from 200 to 700+ lines
- Rich content with real examples from references
- Works even when called from within Claude Code session

The --dangerously-skip-permissions flag allows Claude Code to modify
files without permission prompts, essential for automated workflows.

🚨 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-11 22:29:14 +03:00
yusyus
424ddf01a1 fix: Skill Quality Improvements - C+ (6.5/10) → B+ (8/10) (+23%)
OVERALL IMPACT:
- Multi-source synthesis now properly merges all content from docs + GitHub
- AI enhancement reads 100% of references (was 44%)
- Pattern descriptions clean and readable (was unreadable walls of text)
- GitHub metadata fully displayed (stars, topics, languages, design patterns)

PHASE 1: AI Enhancement Reference Reading
- Fixed utils.py: Remove index.md skip logic (was losing 17KB of content)
- Fixed enhance_skill_local.py: Correct size calculation (ref['size'] not len(c))
- Fixed enhance_skill_local.py: Add working directory to subprocess (cwd)
- Fixed enhance_skill_local.py: Use relative paths instead of absolute
- Result: 4/9 files → 9/9 files, 54 chars → 29,971 chars (+55,400%)

PHASE 2: Content Synthesis
- Fixed unified_skill_builder.py: Add missing section emoji to parser (was breaking GitHub parsing)
- Enhanced unified_skill_builder.py: Rewrote _synthesize_docs_github() method
- Added GitHub metadata sections (Repository Info, Languages, Design Patterns)
- Fixed placeholder text replacement (httpx_docs → httpx)
- Result: 186 → 223 lines (+20%), added 27 design patterns, 3 metadata sections

PHASE 3: Content Formatting
- Fixed doc_scraper.py: Truncate pattern descriptions to first sentence (max 150 chars)
- Fixed unified_skill_builder.py: Remove duplicate content labels
- Result: Pattern readability 2/10 → 9/10 (+350%), eliminated 10KB of bloat
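
One way to implement the "first sentence, max 150 chars" rule is sketched here. `first_sentence` is a hypothetical helper, and the sentence-boundary regex is an assumption rather than what doc_scraper.py actually uses:

```python
import re


def first_sentence(text: str, max_chars: int = 150) -> str:
    """Return the first sentence of text, capped at max_chars characters."""
    text = " ".join(text.split())  # collapse newlines and repeated spaces
    # A sentence ends at '.', '!' or '?' followed by whitespace or end of text,
    # so version strings such as "v2.0" do not end the sentence early.
    match = re.search(r"[.!?](?:\s|$)", text)
    sentence = text[: match.end()].strip() if match else text
    if len(sentence) > max_chars:
        sentence = sentence[: max_chars - 3].rstrip() + "..."
    return sentence
```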

METRICS:
┌─────────────────────────┬──────────┬──────────┬──────────┐
│ Metric                  │ Before   │ After    │ Change   │
├─────────────────────────┼──────────┼──────────┼──────────┤
│ SKILL.md Lines          │ 186      │ 219      │ +18%     │
│ Reference Files Read    │ 4/9      │ 9/9      │ +125%    │
│ Reference Content       │ 54 ch    │ 29,971ch │ +55,400% │
│ Placeholder Issues      │ 5        │ 0        │ -100%    │
│ Duplicate Labels        │ 4        │ 0        │ -100%    │
│ GitHub Metadata         │ 0        │ 3        │ +∞       │
│ Design Patterns         │ 0        │ 27       │ +∞       │
│ Pattern Readability     │ 2/10     │ 9/10     │ +350%    │
│ Overall Quality         │ 6.5/10   │ 8.0/10   │ +23%     │
└─────────────────────────┴──────────┴──────────┴──────────┘

FILES MODIFIED:
- src/skill_seekers/cli/utils.py (Phase 1)
- src/skill_seekers/cli/enhance_skill_local.py (Phase 1)
- src/skill_seekers/cli/unified_skill_builder.py (Phase 2, 3)
- src/skill_seekers/cli/doc_scraper.py (Phase 3)
- docs/SKILL_QUALITY_FIX_PLAN.md (implementation plan)

CRITICAL BUGS FIXED:
1. Index.md files skipped in AI enhancement (losing 57% of content)
2. Wrong size calculation in enhancement stats
3. Missing section emoji in section parser (breaking GitHub Quick Reference)
4. Pattern descriptions output as 600+ char walls of text
5. Duplicate content labels in synthesis

🚨 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-11 22:16:37 +03:00
Nick Miethe
9042e1680c Enabling full support of the Claude Code documentation site, with support for all relevant pages and Anthropic's unconventional llms.txt 2026-01-11 14:15:32 +03:00