skill-seekers-reference

firefrost-gaming/skill-seekers-reference

Author	SHA1	Message	Date
yusyus	c896f8cb37	feat: add Codex CLI plugin manifest (#350) Add Codex plugin support for discovery via codex-plugin-scanner: - .codex-plugin/plugin.json — plugin manifest - .mcp.json — MCP server config (starts server_fastmcp) - skills/skill-seekers/SKILL.md — bundled skill for Codex - .gitignore — allow root .mcp.json to be tracked Co-authored-by: internet-dot <28622406+internet-dot@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 00:05:45 +03:00
yusyus	3cde94399e	fix: replace remaining glob('*.md') with rglob('*.md') in all adaptors Follow-up to #349 — the same bug existed in 4 more adaptor files: - base.py (2 locations) — affects all adaptors via inheritance - openai_compatible.py (2 locations) — affects minimax, deepseek, kimi, qwen, openrouter, together, fireworks - markdown.py (1 location) - streaming_adaptor.py (1 location) No glob("*.md") remains in any adaptor. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 23:22:45 +03:00
waewoo	3619e0c233	fix: replace glob('*.md') with rglob('*.md') in all adaptors (#349)	2026-04-06 23:21:36 +03:00
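The difference behind the two glob fixes above can be reproduced with `pathlib` alone: `glob('*.md')` matches only files directly inside a directory, while `rglob('*.md')` also descends into subdirectories. This is a minimal sketch, assuming a hypothetical temporary directory layout and a hypothetical helper `list_markdown`; it is not the project's adaptor code.

```python
import tempfile
from pathlib import Path


def list_markdown(root: Path, recursive: bool) -> list[str]:
    """Return .md paths under root, relative to root, in sorted order."""
    # rglob walks subdirectories; glob only looks at root's direct children.
    matches = root.rglob("*.md") if recursive else root.glob("*.md")
    return sorted(p.relative_to(root).as_posix() for p in matches)


# Hypothetical layout: one top-level file and one nested file.
with tempfile.TemporaryDirectory() as tmp:
    refs = Path(tmp)
    (refs / "index.md").write_text("# index\n")
    (refs / "api").mkdir()
    (refs / "api" / "client.md").write_text("# client\n")

    flat = list_markdown(refs, recursive=False)
    nested = list_markdown(refs, recursive=True)

print(flat)    # the nested file is missing here
print(nested)  # both files are found here
```

With the non-recursive pattern, any reference file stored in a subdirectory is silently skipped, which matches the kind of bug the commit titles describe.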
yusyus	ca9f23277b	fix: exclude slow MCP/e2e tests from coverage step to prevent CI timeout The coverage step re-runs the entire test suite with --cov, including MCP tests (~20min each) that are already tested in a prior step. This caused the 6-hour GitHub Actions timeout to be hit. Exclude the slow tests from coverage and add a 30-minute timeout guard. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 21:08:17 +03:00
yusyus	6d37e43b83	feat: Grand Unification — one command, one interface, direct converters (#346 ) * fix: resolve 8 pipeline bugs found during skill quality review - Fix 0 APIs extracted from documentation by enriching summary.json with individual page file content before conflict detection - Fix all "Unknown" entries in merged_api.md by injecting dict keys as API names and falling back to AI merger field names - Fix frontmatter using raw slugs instead of config name by normalizing frontmatter after SKILL.md generation - Fix leaked absolute filesystem paths in patterns/index.md by stripping .skillseeker-cache repo clone prefixes - Fix ARCHITECTURE.md file count always showing "1 files" by counting files per language from code_analysis data - Fix YAML parse errors on GitHub Actions workflows by converting boolean keys (on: true) to strings - Fix false React/Vue.js framework detection in C# projects by filtering web frameworks based on primary language - Improve how-to guide generation by broadening workflow example filter to include setup/config examples with sufficient complexity - Fix test_git_sources_e2e failures caused by git init default branch being 'main' instead of 'master' Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address 6 review issues in ExecutionContext implementation Fixes from code review: 1. Mode resolution (#3 critical): _args_to_data no longer unconditionally overwrites mode. Only writes mode="api" when --api-key explicitly passed. Env-var-based mode detection moved to _default_data() as lowest priority. 2. Re-initialization warning (#4): initialize() now logs debug message when called a second time instead of silently returning stale instance. 3. _raw_args preserved in override (#5): temp context now copies _raw_args from parent so get_raw() works correctly inside override blocks. 4. 
test_local_mode_detection env cleanup (#7): test now saves/restores API key env vars to prevent failures when ANTHROPIC_API_KEY is set. 5. _load_config_file error handling (#8): wraps FileNotFoundError and JSONDecodeError with user-friendly ValueError messages. 6. Lint fixes: added logging import, fixed Generator import from collections.abc, fixed AgentClient return type annotation. Remaining P2/P3 items (documented, not blocking): - Lock TOCTOU in override() — safe on CPython, needs fix for no-GIL - get() reads _instance without lock — same CPython caveat - config_path not stored on instance - AnalysisSettings.depth not Literal constrained Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address all remaining P2/P3 review issues in ExecutionContext 1. Thread safety: get() now acquires _lock before reading _instance (#2) 2. Thread safety: override() saves/restores _initialized flag to prevent re-init during override blocks (#10) 3. Config path stored: _config_path PrivateAttr + config_path property (#6) 4. Literal validation: AnalysisSettings.depth now uses Literal["surface", "deep", "full"] — rejects invalid values (#9) 5. Test updated: test_analysis_depth_choices now expects ValidationError for invalid depth, added test_analysis_depth_valid_choices 6. Lint cleanup: removed unused imports, fixed whitespace in tests All 10 previously reported issues now resolved. 26 tests pass, lint clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: restore 5 truncated scrapers, migrate unified_scraper, fix context init 5 scrapers had main() truncated with "# Original main continues here..." 
after Kimi's migration — business logic was never connected: - html_scraper.py — restored HtmlToSkillConverter extraction + build - pptx_scraper.py — restored PptxToSkillConverter extraction + build - confluence_scraper.py — restored ConfluenceToSkillConverter with 3 modes - notion_scraper.py — restored NotionToSkillConverter with 4 sources - chat_scraper.py — restored ChatToSkillConverter extraction + build unified_scraper.py — migrated main() to context-first pattern with argv fallback Fixed context initialization chain: - main.py no longer initializes ExecutionContext (was stealing init from commands) - create_command.py now passes config_path from source_info.parsed - execution_context.py handles SourceInfo.raw_input (not raw_source) All 18 scrapers now genuinely migrated. 26 tests pass, lint clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve 7 data flow conflicts between ExecutionContext and legacy paths Critical fixes (CLI args silently lost): - unified_scraper Phase 6: reads ctx.enhancement.level instead of raw JSON when args=None (#3, #4) - unified_scraper Phase 6 agent: reads ctx.enhancement.agent instead of 3 independent env var lookups (#5) - doc_scraper._run_enhancement: uses agent_client.api_key instead of raw os.environ.get() — respects config file api_key (#1) Important fixes: - main._handle_analyze_command: populates _fake_args from ExecutionContext so --agent and --api-key aren't lost in analyze→enhance path (#6) - doc_scraper type annotations: replaced forward refs with Any to avoid F821 undefined name errors All changes include RuntimeError fallback for backward compatibility when ExecutionContext isn't initialized. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: 3 crashes + 1 stub in migrated scrapers found by deep scan 1. github_scraper.py: args.scrape_only and args.enhance_level crash when args=None (context path). Guarded with if args and getattr(). 
Also fixed agent fallback to read ctx.enhancement.agent. 2. codebase_scraper.py: args.output and args.skip_api_reference crash in summary block when args=None. Replaced with output_dir local var and ctx.analysis.skip_api_reference. 3. epub_scraper.py: main() was still a stub ending with "# Rest of main() continues..." — restored full extraction + build + enhancement logic using ctx values exclusively. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: complete ExecutionContext migration for remaining scrapers Kimi's Phase 4 scraper migrations + Claude's review fixes. All 18 scrapers now use context-first pattern with argv fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Phase 1 — ExecutionContext.get() always returns context (no RuntimeError) get() now returns a default context instead of raising RuntimeError when not explicitly initialized. This eliminates the need for try/except RuntimeError blocks in all 18 scrapers. Components can always call ExecutionContext.get() safely — it returns defaults if not initialized, or the explicitly initialized instance. Updated tests: test_get_returns_defaults_when_not_initialized, test_reset_clears_instance (no longer expects RuntimeError). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Phase 2a-c — remove 16 individual scraper CLI commands Removed individual scraper commands from: - COMMAND_MODULES in main.py (16 entries: scrape, github, pdf, word, epub, video, jupyter, html, openapi, asciidoc, pptx, rss, manpage, confluence, notion, chat) - pyproject.toml entry points (16 skill-seekers-<type> binaries) - parsers/__init__.py (16 parser registrations) All source types now accessed via: skill-seekers create <source> Kept: create, unified, analyze, enhance, package, upload, install, install-agent, config, doctor, and utility commands. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: create SkillConverter base class + converter registry New base interface that all 17 converters will inherit: - SkillConverter.run() — extract + build (same call for all types) - SkillConverter.extract() — override in subclass - SkillConverter.build_skill() — override in subclass - get_converter(source_type, config) — factory from registry - CONVERTER_REGISTRY — maps source type → (module, class) create_command will use get_converter() instead of _call_module(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Grand Unification — one command, one interface, direct converters Complete the Grand Unification refactor: `skill-seekers create` is now the single entry point for all 18 source types. Individual scraper CLI commands (scrape, github, pdf, analyze, unified, etc.) are removed. ## Architecture changes - 18 SkillConverter subclasses: Every scraper now inherits SkillConverter with extract() + build_skill() + SOURCE_TYPE. Factory via get_converter(). - create_command.py rewritten: _build_config() constructs config dicts from ExecutionContext for each source type. Direct converter.run() calls replace the old _build_argv() + sys.argv swap + _call_module() machinery. - main.py simplified: create command bypasses _reconstruct_argv entirely, calls CreateCommand(args).execute() directly. analyze/unified commands removed (create handles both via auto-detection). - CreateParser mode="all": Top-level parser now accepts all 120+ flags (--browser, --max-pages, --depth, etc.) since create is the only entry. - Centralized enhancement: Runs once in create_command after converter, not duplicated in each scraper. - MCP tools use converters: 5 scraping tools call get_converter() directly instead of subprocess. Config type auto-detected from keys. - ConfigValidator → UniSkillConfigValidator: Renamed with backward- compat alias. 
- Data flow: AgentClient + LocalSkillEnhancer read ExecutionContext first, env vars as fallback. ## What was removed - main() from all 18 scraper files (~3400 lines) - 18 CLI commands from COMMAND_MODULES + pyproject.toml entry points - analyze + unified parsers from parser registry - _build_argv, _call_module, _SKIP_ARGS, _DEST_TO_FLAG, all _route_() - setup_argument_parser, get_configuration, _check_deprecated_flags - Tests referencing removed commands/functions ## Net impact 51 files changed, ~6000 lines removed. 2996 tests pass, 0 failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix: review fixes for Grand Unification PR - Add autouse conftest fixture to reset ExecutionContext singleton between tests - Replace hardcoded defaults in _is_explicitly_set() with parser-derived defaults - Upgrade ExecutionContext double-init log from debug to info - Use logger.exception() in SkillConverter.run() to preserve tracebacks - Fix docstring "17 types" → "18 types" in skill_converter.py - DRY up 10 copy-paste help handlers into dict + loop (~100 lines removed) - Fix 2 CI workflows still referencing removed `skill-seekers scrape` command - Remove broken pyproject.toml entry point for codebase_scraper:main Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve 12 logic/flow issues found in deep review Critical fixes: - UnifiedScraper.run(): replace sys.exit(1) with return 1, add return 0 - doc_scraper: use ExecutionContext.get() when already initialized instead of re-calling initialize() which silently discards new config - unified_scraper: define enhancement_config before try/except to prevent UnboundLocalError in LOCAL enhancement timeout read Important fixes: - override(): cleaner tuple save/restore for singleton swap - --agent without --api-key now sets mode="local" so env API key doesn't override explicit agent choice - Remove DeprecationWarning from _reconstruct_argv (fires on every non-create command in 
production) - Rewrite scrape_generic_tool to use get_converter() instead of subprocess calls to removed main() functions - SkillConverter.run() checks build_skill() return value, returns 1 if False - estimate_pages_tool uses -m module invocation instead of .py file path Low-priority fixes: - get_converter() raises descriptive ValueError on class name typo - test_default_values: save/clear API key env vars before asserting mode - test_get_converter_pdf: fix config key "path" → "pdf_path" 3056 passed, 4 failed (pre-existing dep version issues), 32 skipped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update MCP server tests to mock converter instead of subprocess scrape_docs_tool now uses get_converter() + _run_converter() in-process instead of run_subprocess_with_streaming. Update 4 TestScrapeDocsTool tests to mock the converter layer instead of the removed subprocess path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: YusufKaraaslanSpyke <yusuf@spykegames.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 23:00:52 +03:00
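The converter interface described in the commit above (a base class with `extract()`, `build_skill()`, and `run()`, plus a `get_converter()` factory backed by `CONVERTER_REGISTRY`) could look roughly like the sketch below. Only those names come from the commit message. Everything else is an assumption made for illustration: the `EchoConverter` class is hypothetical, the registry here maps a source type straight to a class rather than to a (module, class) pair, and the exact error wording is invented.

```python
from abc import ABC, abstractmethod


class SkillConverter(ABC):
    """Sketch of a shared converter interface (details are assumptions)."""

    SOURCE_TYPE = ""

    def __init__(self, config: dict):
        self.config = config

    @abstractmethod
    def extract(self) -> None:
        """Collect raw content from the source."""

    @abstractmethod
    def build_skill(self) -> bool:
        """Write the skill; return False on failure."""

    def run(self) -> int:
        # Same call for every source type: extract, then build.
        self.extract()
        # Exit-code style result: 1 when build_skill() reports failure.
        return 0 if self.build_skill() else 1


class EchoConverter(SkillConverter):
    """Hypothetical converter used only to exercise the interface."""

    SOURCE_TYPE = "echo"

    def extract(self) -> None:
        self.text = self.config.get("text", "")

    def build_skill(self) -> bool:
        return bool(self.text)


# Simplification: maps source type directly to a class.
CONVERTER_REGISTRY = {EchoConverter.SOURCE_TYPE: EchoConverter}


def get_converter(source_type: str, config: dict) -> SkillConverter:
    """Factory that fails with a descriptive error for unknown types."""
    try:
        cls = CONVERTER_REGISTRY[source_type]
    except KeyError:
        known = ", ".join(sorted(CONVERTER_REGISTRY))
        raise ValueError(
            f"Unknown source type {source_type!r}; known types: {known}"
        ) from None
    return cls(config)


ok_code = get_converter("echo", {"text": "hello"}).run()
fail_code = get_converter("echo", {}).run()
```

A single factory plus a uniform `run()` is what lets one command dispatch to every source type without per-scraper argument plumbing, which is the stated goal of the refactor.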
yusyus	2a14309342	docs: update changelog, readme, and docs for v3.5.0 - Add CHANGELOG.md entry for v3.5.0 with all PR #336 changes - Update README.md: version 3.5.0, agent-agnostic examples, marketplace pipeline, SPA discovery - Update CLAUDE.md: AgentClient architecture, 40 MCP tools, new modules - Update docs/: UML architecture, MCP reference (40 tools, new tool categories), enhancement modes (multi-provider/multi-agent), FAQ - Update src/skill_seekers/mcp/README.md: accurate tool count and paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 04:57:32 +03:00
yusyus	c6a6db01bf	feat: agent-agnostic refactor, smart SPA discovery, marketplace pipeline (#336 ) * feat: fix unified scraper pipeline gaps, add multi-agent support, and Unity skill configs Fix multiple bugs in the unified scraper pipeline discovered while creating Unity skills (Spine, Addressables, DOTween): - Fix doc scraper KeyError by passing base_url in temp config - Fix scraped_data list-vs-dict bug in detect_conflicts() and merge_sources() - Add Phase 6 auto-enhancement from config "enhancement" block (LOCAL + API mode) - Add "browser": true config support for JavaScript SPA documentation sites - Add Phase 3 skip message for better UX - Add subprocess timeout (3600s) for doc scraper - Fix SkillEnhancer missing skill_dir argument in API mode - Fix browser renderer defaults (60s timeout, domcontentloaded wait condition) - Fix C3.x JSON filename mismatch (design_patterns.json → all_patterns.json) - Fix workflow builtin target handling when no pattern data available - Make AI enhancement timeout configurable via SKILL_SEEKER_ENHANCE_TIMEOUT env var (300s default) - Add C#, Go, Rust, Swift, Ruby, PHP, GDScript to GitHub scraper extension map - Add multi-agent LOCAL mode support across all 17 scrapers (--agent flag) - Add Kimi/Moonshot platform support (API keys, agent presets, config wizard) - Add unity-game-dev.yaml workflow (7 stages covering Unity-specific patterns) - Add 3 Unity skill configs (Spine, Addressables, DOTween) - Add comprehensive Claude bias audit report Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: create AgentClient abstraction, remove hardcoded Claude from 5 enhancers (#334) Phase 1 of the full agent-agnostic refactor. Creates a centralized AgentClient that all enhancers use instead of hardcoded subprocess calls and model names. New file: - agent_client.py: Unified AI client supporting API mode (Anthropic, Moonshot, Google, OpenAI) and LOCAL mode (Claude Code, Kimi, Codex, Copilot, OpenCode, custom agents). 
Provides detect_api_key(), get_model(), detect_default_target(). Refactored (removed all hardcoded ["claude", ...] subprocess calls): - ai_enhancer.py: -140 lines, delegates to AgentClient - config_enhancer.py: -150 lines, removed _run_claude_cli() - guide_enhancer.py: -120 lines, removed _check_claude_cli(), _call_claude_() - unified_enhancer.py: -100 lines, removed _check_claude_cli(), _call_claude_() - codebase_scraper.py: collapsed 3 functions into 1 using AgentClient Fixed: - utils.py: has_api_key()/get_api_key() now check all providers - enhance_skill.py, video_scraper.py, video_visual.py: model names configurable via ANTHROPIC_MODEL env var - enhancement_workflow.py: uses call() with _call_claude() fallback Net: -153 lines of code while adding full multi-agent support. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Phase 2 agent-agnostic refactor — defaults, help text, merge mode, MCP (#334) Phase 2 of the full agent-agnostic refactor: Default targets: - Changed default="claude" to auto-detect from API keys in 5 argument files and 3 CLI scripts (install_skill, upload_skill, enhance_skill) - Added AgentClient.detect_default_target() for runtime resolution - MCP server functions now use "auto" default with runtime detection Help text (16+ argument files): - Replaced "ANTHROPIC_API_KEY" / "Claude Code" with agent-neutral wording - Now mentions all API keys (ANTHROPIC, MOONSHOT, etc.) 
and "AI coding agent" Log messages: - main.py, enhance_command.py: "Claude Code CLI" → dynamic agent name - enhance_command.py docstring: "Claude Code" → "AI coding agent" Merge mode rename: - Added "ai-enhanced" as the preferred merge mode name - "claude-enhanced" kept as backward-compatible alias - Renamed ClaudeEnhancedMerger → AIEnhancedMerger (with alias) - Updated choices, validators, and descriptions MCP server descriptions: - server_fastmcp.py: "Claude AI skills" → "LLM skills" in tool descriptions - packaging_tools.py: Updated defaults and dry-run messages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Phase 3 agent-agnostic refactor — docstrings, MCP descriptions, README (#334) Phase 3 of the full agent-agnostic refactor: Module docstrings (17+ scraper files): - "Claude Skill Converter" → "AI Skill Converter" - "Build Claude skill" → "Build AI/LLM skill" - "Asking Claude" → "Asking AI" - Updated doc_scraper, github_scraper, pdf_scraper, word_scraper, epub_scraper, video_scraper, enhance_skill, enhance_skill_local, unified_scraper, and others MCP server_legacy.py (30+ fixes): - All tool descriptions: "Claude skill" → "LLM skill" - "Upload to Claude" → "Upload skill" - "enhance with Claude Code" → "enhance with AI agent" - Kept claude.ai/skills URLs (platform-specific, correct) MCP README.md: - Added multi-agent support note at top - "Claude AI skills" → "LLM skills" throughout - Updated examples to show multi-platform usage - Kept Claude Code in supported agents list (accurate) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Phase 3 continued — remaining docstring and comment fixes (#334) Additional agent-neutral text fixes in 8 files missed from the initial Phase 3 commit: - config_extractor.py, config_manager.py, constants.py: comments - enhance_command.py: docstring and print messages - guide_enhancer.py: class/module docstrings - parsers/enhance_parser.py, install_parser.py: help text - 
signal_flow_analyzer.py: docstring Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * workflow added * fix: address code review issues in AgentClient and Phase 6 (#334) Fixes found during commit review: 1. AgentClient._call_local: Only append "Write your response to:" when caller explicitly passes output_file (was always appending) 2. Codex agent: Added uses_stdin flag to preset, pipe prompt via stdin instead of DEVNULL (codex reads from stdin with "-" arg) 3. Provider detection: Added _detect_provider_from_key() to detect provider from API key prefix (sk-ant- → anthropic, AIza → google) instead of always assuming anthropic 4. Phase 6 API mode: Replaced direct SkillEnhancer/ANTHROPIC_API_KEY with AgentClient for multi-provider support (Moonshot, Google, OpenAI) 5. config_enhancer: Removed output_file path from prompt — AgentClient manages temp files and output detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: make claude adaptor model name configurable via ANTHROPIC_MODEL env var Missed in the Phase 1 refactor — adaptors/claude.py:381 had a hardcoded model name without the os.environ.get() wrapper that all other files use. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add copilot stdin support, custom agent, and kimi aliases (#334) Additional agent improvements from Kimi review: - Added uses_stdin: True to copilot agent preset (reads from stdin like codex) - Added custom agent support via SKILL_SEEKER_AGENT_CMD env var in _call_local() - Added kimi_code/kimi-code aliases in normalize_agent_name() - Added "kimi" to --target choices in enhance arguments - Updated help text with MOONSHOT_API_KEY across argument files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Kimi CLI integration — add uses_stdin and output parsing (#334) Kimi CLI's --print mode requires stdin piping and outputs structured protocol messages (TurnBegin, TextPart, etc.) 
instead of plain text. Fixes: - Added uses_stdin: True to kimi preset (was not piping prompt) - Added parse_output: "kimi" flag to preset - Added _parse_kimi_output() to extract text from TextPart lines - Kimi now returns clean text instead of raw protocol dump Tested: kimi returns '{"status": "ok"}' correctly via AgentClient. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Kimi CLI in enhance_skill_local — remove wrong skip-permissions, use absolute path Two bugs in enhance_skill_local.py AGENT_PRESETS for Kimi: 1. supports_skip_permissions was True — Kimi doesn't support --dangerously-skip-permissions, only Claude does. Fixed to False. 2. {skill_dir} was resolved as relative path — Kimi CLI requires absolute paths for --work-dir. Fixed with .resolve(). Tested: `skill-seekers enhance output/test-e2e/ --agent kimi` now works end-to-end (107s, 9233 bytes output). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove invalid --enhance-level flag from enhance subprocess calls doc_scraper.py and video_scraper.py were passing --enhance-level to skill-seekers-enhance, which doesn't accept that flag. This caused enhancement to fail silently after scraping completed. Fixes: - Removed --enhance-level from enhance subprocess calls - Added --agent passthrough in doc_scraper.py - Fixed log messages to show correct command Tested: `skill-seekers create <url> --enhance-level 1` now chains scrape → enhance successfully. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add --agent and --agent-cmd to create command UNIVERSAL_ARGUMENTS The --agent flag was defined in common.py but not imported into the create command's UNIVERSAL_ARGUMENTS, so it wasn't available when using `skill-seekers create <source> --agent kimi`. Now all 17 source types support the --agent flag via the create command. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update docs data_file path after moving to cache directory The scraped_data["documentation"] stored the original output/ path for data_file, but the directory was moved to .skillseeker-cache/ afterward. Phase 2 conflict detection then failed with FileNotFoundError trying to read the old path. Now updates data_file to point to the cache location after the move. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: multi-language code signature extraction in GitHub scraper The GitHub scraper only analyzed files matching the primary language (by bytes). For multi-language repos like spine-runtimes (C++ primary but C# is the target), this meant 0 C# files were analyzed. Fix: Analyze top 3 languages with known extension mappings instead of just the primary. Also support "language" field in config source to explicitly target specific languages (e.g., "language": "C#"). Updated Unity configs to specify language: "C#" for focused analysis. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: per-file language detection + remove artificial analysis limits Rewrites GitHub scraper's _extract_signatures_and_tests() to detect language per-file from extension instead of only analyzing the primary language. This fixes multi-language repos like spine-runtimes (C++ primary) where C# files were never analyzed. 
Changes: - Build reverse ext→language map, detect language per-file - Analyze ALL files with known extensions (not just primary language) - Config "language" field works as optional filter, not a workaround - Store per-file language + languages_analyzed in output - Remove 50-file API mode limit (rate limiting already handles this) - Remove 100-file default config extraction limit (now unlimited by default) - Fix unified scraper default max_pages from 100 to 500 (matches constants.py) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove remaining 100-file limit in config_extractor.extract_from_directory The find_config_files default was changed to unlimited but extract_from_directory and CLI --max-files still defaulted to 100. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: replace interactive terminal merge with automated AgentClient call AIEnhancedMerger._launch_claude_merge() used to open a terminal window, run a bash script, and poll for a file — requiring manual interaction. Now uses AgentClient.call() to send the merge prompt directly and parse the JSON response. Fully automated, no terminal needed, works with any configured AI agent (Claude, Kimi, etc.). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add marketplace pipeline for publishing skills to Claude Code plugin repos Connect the three-repo pipeline: configs repo → Skill Seekers engine → plugin marketplace repos. Enables automated publishing of generated skills directly into Claude Code plugin repositories with proper plugin.json and marketplace.json structure. New components: - MarketplaceManager: Registry for plugin marketplace repos at ~/.skill-seekers/marketplaces.json with per-repo git tokens, branch config, and default author metadata - MarketplacePublisher: Clones marketplace repo, creates plugin directory structure (skills/, .claude-plugin/plugin.json), updates marketplace.json, commits and pushes. 
Includes skill_name validation to prevent path traversal, and cleanup of partial state on git failures - 4 MCP tools: add_marketplace, list_marketplaces, remove_marketplace, publish_to_marketplace — registered in FastMCP server - Phase 6 in install workflow: automatic marketplace publishing after packaging, triggered by --marketplace CLI arg or marketplace_targets config field CLI additions: - --marketplace NAME: publish to registered marketplace after packaging - --marketplace-category CAT: plugin category (default: development) - --create-branch: create feature branch instead of committing to main Security: - Skill name regex validation (^[a-zA-Z0-9][a-zA-Z0-9._-]$) prevents path traversal attacks via malicious SKILL.md frontmatter - has_api_key variable scoping fix in install workflow summary - try/finally cleanup of partial plugin directories on publish failure Config schema: - Optional marketplace_targets field in config JSON for multi-marketplace auto-publishing: [{"marketplace": "spyke", "category": "development"}] - Backward compatible — ignored by older versions Tests: 58 tests (36 manager + 22 publisher including 2 integration tests using file:// git protocol for full publish success path) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> feat: thread agent selection through entire enhancement pipeline Propagates the --agent and --agent-cmd CLI parameters through all enhancement components so users can use any supported coding agent (kimi, claude, copilot, codex, opencode) consistently across the full pipeline, not just in top-level enhancement. 
Agent parameter threading: - AIEnhancer: accepts agent param, passes to AgentClient - ConfigEnhancer: accepts agent param, passes to AgentClient - WorkflowEngine: accepts agent param, passes to sub-enhancers (PatternEnhancer, TestExampleEnhancer, AIEnhancer) - ArchitecturalPatternDetector: accepts agent param for AI enhancement - analyze_codebase(): accepts agent/agent_cmd, forwards to ConfigEnhancer, ArchitecturalPatternDetector, and doc processing - UnifiedScraper: reads agent from CLI args, forwards to doc scraper subprocess, C3.x analysis, and LOCAL enhancement - CreateCommand: forwards --agent and --agent-cmd to subprocess argv - workflow_runner: passes agent to WorkflowEngine for inline/named workflows Timeout improvements: - Default enhancement timeout increased from 300s (5min) to 2700s (45min) to accommodate large skill generation with local agents - New get_default_timeout() in agent_client.py with env var override (SKILL_SEEKER_ENHANCE_TIMEOUT) supporting 'unlimited' value - Config enhancement block supports "timeout": "unlimited" field - Removed hardcoded timeout=300 and timeout=600 calls in config_enhancer and merge_sources, now using centralized default CLI additions (unified_scraper): - --agent AGENT: select local coding agent for enhancement - --agent-cmd CMD: override agent command template (advanced) Config: unity-dotween.json updated with agent=kimi, timeout=unlimited, removed unused file_patterns Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add claude-code unified config for Claude Code CLI skill generation Unified config combining official Claude Code documentation and source code analysis. Covers internals, architecture, tools, commands, IDE integrations, MCP, plugins, skills, and development workflows. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add multi-agent support verification report and test artifacts - AGENT_SUPPORT_VERIFICATION.md: verification report confirming agent parameter threading works across all enhancement components - END_TO_END_EXAMPLES.md: complete workflows for all 17 source types with both Claude and Kimi agents - test_agents.sh: shell script for real-world testing of agent support across major CLI commands with both agents - test_realworld.md: real-world test scenarios for manual QA Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add .env to .gitignore to prevent secret exposure The .env file containing API keys (ANTHROPIC_API_KEY, GITHUB_TOKEN, etc.) was not in .gitignore, causing it to appear as untracked and risking accidental commit. Added .env, .env.local, and .env..local patterns. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix: URL filtering uses base directory instead of full page URL (#331) is_valid_url() checked url.startswith(self.base_url) where base_url could be a full page path like ".../manual/index.html". Sibling pages like ".../manual/LoadingAssets.html" failed the check because they don't start with ".../index.html". Now strips the filename to get the directory prefix: "https://example.com/docs/index.html" → "https://example.com/docs/" This fixes SPA sites like Unity's DocFX docs where browser mode renders the page but sibling links were filtered out. Closes #331 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: pass language config through to GitHub scraper in unified flow The unified scraper built github_config from source fields but didn't include the "language" field. The GitHub scraper's per-file detection read self.config.get("language", "") which was always empty, so it fell back to analyzing all languages instead of the focused C# filter. 
For DOTween (C# only repo), this caused 0 files analyzed because without the language filter, it analyzed top 3 languages but the file tree matching failed silently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: centralize all enhancement timeouts to 45min default with unlimited support All enhancement/AI timeouts now use get_default_timeout() from agent_client.py instead of scattered hardcoded values (120s, 300s, 600s). Default: 2700s (45 minutes) Override: SKILL_SEEKER_ENHANCE_TIMEOUT env var Unlimited: Set to "unlimited", "none", or "0" Updated: agent_client.py, enhance_skill_local.py, arguments/enhance.py, enhance_command.py, unified_enhancer.py, unified_scraper.py Not changed (different purposes): - Browser page load timeout (60s) - API HTTP request timeout (120s) - Doc scraper subprocess timeout (3600s) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add browser_wait_until and browser_extra_wait config for SPA docs DocFX sites (Unity docs) render navigation via JavaScript after initial page load. With domcontentloaded, only 1 link was found. With networkidle + 5s extra wait, 95 content pages are discovered. New config options for documentation sources: - browser_wait_until: "networkidle" \| "load" \| "domcontentloaded" - browser_extra_wait: milliseconds to wait after page load for lazy nav Updated Addressables config to use networkidle + 5000ms extra wait. Pass browser settings through unified scraper to doc scraper config. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: three-layer smart discovery engine for SPA documentation sites Replaces the browser_wait_until/browser_extra_wait config hacks with a proper discovery engine that runs before the BFS crawl loop: Layer 1: sitemap.xml — checks domain root for sitemap, parses <loc> tags Layer 2: llms.txt — existing mechanism (unchanged) Layer 3: SPA nav — renders index page with networkidle via Playwright, extracts all links from the fully-rendered DOM sidebar/TOC The BFS crawl then uses domcontentloaded (fast) since all pages are already discovered. No config hacks needed — browser mode automatically triggers SPA discovery when only 1 page is found. Tested: Unity Addressables DocFX site now discovers 95 pages (was 1). Removed browser_wait_until/browser_extra_wait from Addressables config. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: replace manual arg forwarding with dynamic routing in create command The create command manually hardcoded ~60% of scraper flags in _route_() methods, causing ~40 flags to be silently dropped. Every new flag required edits in 2 places (arguments/create.py + create_command.py), guaranteed to drift. Replaced with _build_argv() — a dynamic forwarder that iterates vars(self.args) and forwards all explicitly-set arguments automatically, using the same pattern as main.py::_reconstruct_argv(). This eliminates the root cause of all flag gaps. 
Changes in create_command.py (-380 lines, +175 lines = net -205): - Added _build_argv() dynamic arg forwarder with dest→flag translation map for mismatched names (async_mode→--async, video_playlist→--playlist, skip_config→--skip-config-patterns, workflow_var→--var) - Added _call_module() helper (dedup sys.argv swap pattern) - Simplified all _route_() methods from 50-70 lines to 5-10 lines each - Deleted _add_common_args() entirely (subsumed by _build_argv) - _route_generic() now forwards ALL args, not just universal ones New flags accessible via create command: - --from-json: build skill from pre-extracted JSON (all source types) - --skip-api-reference: skip API reference generation (local codebase) - --skip-dependency-graph: skip dependency analysis (local codebase) - --skip-config-patterns: skip config pattern extraction (local codebase) - --no-comments: skip comment extraction (local codebase) - --depth: analysis depth control (local codebase, deprecated) - --setup: auto-detect GPU/install video deps (video) Bug fix in unified_scraper.py: - Fixed C3.x pattern data loss: unified_scraper read patterns/detected_patterns.json but codebase_scraper writes patterns/all_patterns.json. Changed both read locations (line 828 for local sources, line 1597 for GitHub C3.x) to use the correct filename. This was causing 100% loss of design pattern data (e.g., 905 patterns detected but 0 included in final skill). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address 5 code review issues in marketplace and package pipeline Fixes found by automated code review of the marketplace feature and package command: 1. --marketplace flag silently ignored in package_skill.py CLI Added MarketplacePublisher invocation after successful packaging when --marketplace is provided. Previously the flag was parsed but never acted on. 2. 
Missing 7 platform choices in --target (package.py) Added minimax, opencode, deepseek, qwen, openrouter, together, fireworks to the argparse choices list. These platforms have registered adaptors but were rejected by the argument parser. 3. is_update always True for new marketplace registrations Two separate datetime.now() calls produced different microsecond timestamps, making added_at != updated_at always. Fixed by assigning a single timestamp to both fields. 4. Shallow clone (depth=1) caused push failures for marketplace repos MarketplacePublisher now does full clones instead of using GitConfigRepo's shallow clone (which is designed for read-only config fetching). Full clone is required for commit+push workflow. 5. Partial plugin dir not cleaned on force=True failure Removed the `and not force` guard from cleanup logic — if an operation fails midway, the partial directory should be cleaned regardless of whether force was set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address dynamic routing edge cases in create_command Fixes from code review of the _build_argv() refactor: 1. Non-None defaults forwarded unconditionally — added enhance_level=2, doc_version="", video_languages="en", whisper_model="base", platform="slack", visual_interval=0.7, visual_min_gap=0.5, visual_similarity=3.0 to the defaults dict so they're only forwarded when the user explicitly overrides them. This fixes video sources incorrectly getting --enhance-level 2 (video default is 0). 2. video_url dest not translated — added "video_url": "--url" to _DEST_TO_FLAG so create correctly forwards --video-url as --url to video_scraper.py. 3. Video positional args double-forwarded — added video_url, video_playlist, video_file to _SKIP_ARGS since _route_video() already handles them via positional args from source detection. 4. Removed dead workflow_var entry from _DEST_TO_FLAG — the create parser uses key "var" not "workflow_var", so the translation was never triggered. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve 15 broken tests and --from-json crash bug in create command Fixes found by Kimi code review of the dynamic routing refactor: 1. 3 test_create_arguments.py failures — UNIVERSAL_ARGUMENTS count changed from 19 to 21 (added agent, agent_cmd). Updated expected count and name set. Moved from_json out of UNIVERSAL to ADVANCED_ARGUMENTS since not all scrapers support it. 2. 12 test_create_integration_basic.py failures — tests called _add_common_args() which was deleted in the refactor. Rewrote _collect_argv() to use _build_argv() via CreateCommand with SourceDetector. Updated _make_args defaults to match new parameter set. 3. --from-json crash bug — was in UNIVERSAL_ARGUMENTS so create accepted it for all source types, but web/github/local scrapers don't support it. Forwarding it caused argparse "unrecognized arguments" errors. Moved to ADVANCED_ARGUMENTS with documentation listing which source types support it. 4. Additional _is_explicitly_set defaults — added enhance_level=2, doc_version="", video_languages="en", whisper_model="base", platform="slack", visual_interval/min_gap/similarity defaults to prevent unconditional forwarding of parser defaults. 5. Video arg handling — added video_url to _DEST_TO_FLAG translation map, added video_url/video_playlist/video_file to _SKIP_ARGS (handled as positionals by _route_video). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: C3.x analysis data loss — read from references/ after _generate_references() cleanup Root cause: _generate_references() in codebase_scraper.py copies analysis directories (patterns/, test_examples/, config_patterns/, architecture/, dependencies/, api_reference/) into references/ then DELETES the originals to avoid duplication (Issue #279). But unified_scraper.py reads from the original paths after analyze_codebase() returns — by which time the originals are gone. 
This caused 100% data loss for all 6 C3.x data types (design patterns, test examples, config patterns, architecture, dependencies, API reference) in the unified scraper pipeline. The data was correctly detected (e.g., 905 patterns in 510 files) but never made it into the final skill. Fix: Added _load_json_fallback() method that checks references/{subdir}/ first (where _generate_references moves the data), falling back to the original path. Applied to both GitHub C3.x analysis (line ~1599) and local source analysis (line ~828). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add allowlist to _build_argv for config route to unified_scraper _build_argv() was forwarding all CLI args (--name, --doc-version, etc.) to unified_scraper which doesn't accept them. Added allowlist parameter to _build_argv() — when provided, ONLY args in the allowlist are forwarded. The config route now uses _UNIFIED_SCRAPER_ARGS allowlist with the exact set of flags unified_scraper accepts. This is a targeted patch — the proper fix is the ExecutionContext singleton refactor planned separately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add force=True to marketplace publish from package CLI The package command's --marketplace flag didn't pass force=True to MarketplacePublisher, so re-publishing an existing skill would fail with "already exists" error instead of overwriting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add push_config tool for publishing configs to registered source repos New ConfigPublisher class that validates configs, places them in the correct category directory, commits, and pushes to registered source repositories. Follows the MarketplacePublisher pattern. 
Features: - Auto-detect category from config name/description - Validate via ConfigValidator + repo's validate-config.py - Support feature branch or direct push - Force overwrite existing configs - MCP tool: push_config(config_path, source_name, category) Usage: push_config(config_path="configs/unity-spine.json", source_name="spyke") Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: security hardening, error handling, tests, and cleanup Security: - Remove command injection via cloned repo script execution (config_publisher) - Replace git add -A with targeted staging (marketplace_publisher) - Clear auth tokens from cached .git/config after clone - Use defusedxml for sitemap XML parsing (XXE protection) - Add path traversal validation for config names Error handling: - AgentClient: specific exception handling for rate limit, auth, connection errors - AgentClient: log subprocess stderr on non-zero exit, raise on explicit API mode failure - config_publisher: only catch ValueError for validation warnings Logic bugs: - Fix _build_argv silently dropping --enhance-level 2 (matched default) - Fix URL filtering over-broadening (strip to parent instead of adding /) - Log warning when _call_module returns None exit code Tests (134 new): - test_agent_client.py: 71 tests for normalize, detect, init, timeout, model - test_config_publisher.py: 23 tests for detect_category, publish, errors - test_create_integration_basic.py: 20 tests for _build_argv routing - Fix 11 pre-existing failures (guide_enhancer, doctor, install_skill, marketplace) Cleanup: - Remove 5 dev artifact files (-1405 lines) - Rename _launch_claude_merge to _launch_ai_merge All 3194 tests pass, 39 expected skips. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: pin ruff==0.15.8 in CI and reformat packaging_tools.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add missing pytest install to vector DB adaptor test jobs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: reformat 7 files for ruff 0.15.8 and fix vector DB test path Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove test-week2-integration job referencing missing script Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update e2e test to accept dynamic platform name in upload phase Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: YusufKaraaslanSpyke <yusuf@spykegames.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 04:50:15 +03:00
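The centralized enhancement timeout described in the commit above (2700s default, SKILL_SEEKER_ENHANCE_TIMEOUT override, and "unlimited", "none", or "0" meaning no limit) can be sketched as follows. This is a minimal illustration, not the actual code in agent_client.py: the `env` parameter, the use of `None` to represent "unlimited", and the fallback to the default on unparsable values are all assumptions made for the example.

```python
import os

# 45 minutes, the default stated in the commit message.
DEFAULT_TIMEOUT_SECONDS = 2700

# Values the commit message lists as meaning "no timeout".
UNLIMITED_VALUES = ("unlimited", "none", "0")


def get_default_timeout(env=None):
    """Return the enhancement timeout in seconds, or None for unlimited.

    `env` is a hypothetical parameter added here so the function can be
    exercised without touching the real process environment.
    """
    env = os.environ if env is None else env
    raw = env.get("SKILL_SEEKER_ENHANCE_TIMEOUT", "").strip().lower()
    if not raw:
        return DEFAULT_TIMEOUT_SECONDS
    if raw in UNLIMITED_VALUES:
        return None  # assumption: None signals "no timeout" to callers
    try:
        value = int(raw)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return DEFAULT_TIMEOUT_SECONDS
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS
```

Returning `None` fits naturally with `subprocess.run(..., timeout=None)`, which waits indefinitely; whether the project passes the value through that way is not stated in the log.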
yusyus	c29fad606c	fix: upgrade deprecated GitHub Actions to v4/v5 and fix MCP test job actions/upload-artifact@v3 is fully deprecated and causes instant CI failure. Also upgrades checkout, setup-python, cache, github-script, and codecov actions to their latest major versions to resolve Node.js 20 deprecation warnings. Adds missing pytest install to MCP Vector DB test job and pins ruff>=0.15 in CI to match local tooling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 22:02:13 +03:00
yusyus	00c72ea4a3	fix: resolve CI failures across all GitHub Actions workflows - Fix ruff format issue in doc_scraper.py - Add pytest skip markers for browser renderer tests when Playwright is not installed in CI - Replace broken Python heredocs in 4 workflow YAML files (scheduled-updates, vector-db-export, quality-metrics, test-vector-dbs) with python3 -c calls to fix YAML parsing errors Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 20:40:45 +03:00
yusyus	6fded977dd	feat: add Kotlin language support for codebase analysis (#287 ) Adds full C3.x pipeline support for Kotlin (.kt, .kts): - Language detection patterns (40+ weighted patterns for data/sealed classes, coroutines, companion objects, KMP, etc.) - AST regex parser in code_analyzer.py (classes, objects, functions, extension functions, suspend functions) - Dependency extraction for Kotlin import statements (with alias support) - Design pattern adaptations (object→Singleton, companion→Factory, sealed→Strategy, data→Builder, Flow→Observer) - Test example extraction for JUnit 4/5, Kotest, MockK, Spek - Config detection for build.gradle.kts / settings.gradle.kts - Extension maps registered in codebase_scraper, unified_codebase_analyzer, github_scraper, generate_router Also fixes pre-existing parser count tests (35→36 for doctor command added in previous commit). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 23:25:12 +03:00
yusyus	ea4fed0be4	feat: add headless browser rendering for JavaScript SPA sites (#321 ) New BrowserRenderer class uses Playwright to render JavaScript-heavy documentation sites (React, Vue SPAs) that return empty HTML shells with requests.get(). Activated via --browser flag on web scraping. - browser_renderer.py: Playwright wrapper with lazy browser launch, auto-install Chromium on first use, context manager support - doc_scraper.py: browser_mode config, _render_with_browser() helper, integrated into scrape_page() and scrape_page_async() - SPA detection warnings now suggest --browser flag - Optional dep: pip install "skill-seekers[browser]" - 14 real e2e tests (actual Chromium, no mocks) - UML updated: Scrapers class diagram (BrowserRenderer + dependency), Parsers (DoctorParser), Utilities (Doctor), Components, and new Browser Rendering sequence diagram (#20) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 22:06:14 +03:00
yusyus	006cccabae	feat: add skill-seekers doctor health check command (#316) 8 diagnostic checks: Python version (3.10+), package install, git, 14 core deps, 10 optional deps, API keys, MCP server, output dir. Each check reports pass/warn/fail with --verbose for extra detail. Exit code 0 if no critical failures, 1 otherwise. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 21:27:17 +03:00
yusyus	43bdabb84f	feat: add prompt injection check workflow for content security (#324 ) New bundled workflow `prompt-injection-check` scans scraped content for prompt injection patterns (role assumption, instruction overrides, delimiter injection, hidden instructions, encoded payloads) using AI. Flags suspicious content without removing it — preserves documentation accuracy while warning about adversarial content. Added as first stage in both `default` and `security-focus` workflows so it runs automatically with --enhance-level >= 1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 21:17:57 +03:00
yusyus	6beff3d52f	fix: add path traversal protection to get_workflow_tool + tests (#325) PR #326 added _validate_name() to create/update/delete but missed get_workflow_tool, which would raise an unhandled ValueError instead of returning a user-friendly error. Added try/except handling and 6 tests covering all 4 tool functions with malicious names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 20:56:52 +03:00
Spidershield-contrib	a12743769e	fix: prevent path traversal in workflow name parameter (CWE-22) (#326) Co-authored-by: spidershield-contrib <spidershield-contrib@users.noreply.github.com>	2026-03-28 20:55:13 +03:00
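The two path-traversal commits above describe a name validator that raises ValueError, plus a tool function that turns that error into a friendly response. A rough sketch of that pattern follows. The allowed-character rule, the function names `validate_name` and `get_workflow`, and the shape of the returned dict are all hypothetical; the log only confirms that `_validate_name()` exists and that `get_workflow_tool` now catches its ValueError.

```python
import re

# Assumption: names are limited to letters, digits, "_" and "-".
# The project's real rule is not shown in the commit log.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")


def validate_name(name):
    """Raise ValueError unless `name` is a plain identifier-like string."""
    if not isinstance(name, str) or not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"Invalid workflow name: {name!r}")
    return name


def get_workflow(name):
    """Hypothetical tool entry point: report bad names instead of raising."""
    try:
        validate_name(name)
    except ValueError as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "name": name}
```

Because separators and dots are rejected outright, a name such as `../../etc/passwd` never reaches any filesystem call.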
yusyus	c6c17ada95	docs: add 6 behavioral UML diagrams verified against codebase 3 sequence diagrams (create command dispatch, GitHub+C3.x pipeline with all 5 stages, MCP dual-path invocation), 2 activity diagrams (source detection in correct code order, enhancement level flag mapping), and 1 component diagram with corrected runtime dependency arrows. All diagrams cross-referenced against source code for accuracy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 20:45:30 +03:00
yusyus	d381315340	fix: pass enhance_level instead of removed enhance_with_ai/ai_mode to analyze_codebase (#323) Two call sites (_run_c3_analysis in unified_scraper.py and _analyze_c3x in unified_codebase_analyzer.py) still passed the old enhance_with_ai and ai_mode kwargs which were replaced by enhance_level. This caused a TypeError when running C3.x codebase analysis. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 22:14:51 +03:00
yusyus	31a57c448b	style: apply ruff formatting to github_scraper.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 23:46:42 +03:00
yusyus	d71c1d3aa3	fix: filter non-integer metadata from GitHub languages API response (#322) PyGithub's get_languages() returns raw API JSON which in some environments includes non-integer metadata keys (e.g., "url"), causing a TypeError in sum(). Now filters to integer values only before calculating percentages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 23:44:52 +03:00
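The languages fix above amounts to dropping non-integer values before summing byte counts. A small sketch, assuming the input is a plain dict shaped like the GitHub languages response; the function name `language_percentages` and the rounding to two decimals are illustrative choices, not taken from github_scraper.py.

```python
def language_percentages(languages):
    """Compute per-language percentages, ignoring non-integer metadata."""
    # Keep only real integer byte counts; bool is excluded because it is
    # a subclass of int in Python.
    counts = {
        lang: value
        for lang, value in languages.items()
        if isinstance(value, int) and not isinstance(value, bool)
    }
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: round(n / total * 100, 2) for lang, n in counts.items()}
```

Without the filter, `sum()` over a dict that contains a string such as a "url" entry raises the TypeError mentioned in the commit message.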
yusyus	336ab6aaac	Merge development into main: release v3.4.0 8 new LLM platform adaptors, 7 new CLI agents, OpenCode skill tools, 8 bug fixes including SPA site detection, UML architecture docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 22:20:38 +03:00
yusyus	5a93003da4	chore: bump version to 3.4.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 21:58:57 +03:00
yusyus	d76ab1d9a4	fix: report accurate saved/skipped page counts and detect SPA sites (#320 , #321 ) The scraper previously reported len(visited_urls) as "Scraped N pages" even when save_page() silently skipped pages with empty content (<50 chars). For JavaScript SPA sites this meant "Scraped 190 pages" followed by "No scraped data found!" with no explanation. Changes: - Added pages_saved/pages_skipped counters to DocToSkillConverter - save_page() now increments pages_skipped on skip, pages_saved on save - New _log_scrape_completion() reports "(N saved, M skipped)" breakdown - SPA detection warns when all/most pages have empty content - build_skill() error now explains empty content cause when pages skipped - Updated both sync and async scrape completion paths - 14 new tests across 4 test classes (counting, messages, SPA, build) Fixes #320 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 22:26:35 +03:00
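The saved/skipped accounting in the commit above can be illustrated with a small counter. This is a sketch under assumptions: the class name `PageCounter` and the summary wording are hypothetical, and only the 50-character threshold and the `pages_saved`/`pages_skipped` counter names come from the commit message.

```python
class PageCounter:
    """Track saved vs. skipped pages, as the commit describes."""

    MIN_CONTENT_CHARS = 50  # pages with less content are skipped

    def __init__(self):
        self.pages_saved = 0
        self.pages_skipped = 0

    def save_page(self, content):
        """Return True if the page counts as saved, False if skipped."""
        if len(content.strip()) < self.MIN_CONTENT_CHARS:
            self.pages_skipped += 1
            return False
        self.pages_saved += 1
        return True

    def summary(self):
        total = self.pages_saved + self.pages_skipped
        return (
            f"Scraped {total} pages "
            f"({self.pages_saved} saved, {self.pages_skipped} skipped)"
        )
```

A run in which every page is skipped is the signal the commit uses to warn about a likely JavaScript SPA site.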
yusyus	8152045e38	chore: consolidate Docs/ into docs/ (single documentation directory) Move UML/ directory and Architecture.md from Docs/ to docs/. Rename Architecture.md to UML_ARCHITECTURE.md to avoid collision with existing docs/ARCHITECTURE.md (docs organization file). Update all references in README.md, CONTRIBUTING.md, CLAUDE.md, and the architecture file itself. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 20:02:53 +03:00
yusyus	a1934905f6	docs: remove awesome-mcp-servers from ecosystem tables Not a Skill Seekers-specific repo — better suited for MCP docs section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 19:24:37 +03:00
yusyus	a1eab63daf	docs: add ecosystem section linking all Skill Seekers repos Add cross-repo discoverability for the 6 related repositories (website, configs, GitHub Action, plugin, Homebrew tap, MCP servers). - README.md: ecosystem table, Trendshift badge, pepy.tech downloads badge - All 11 translated READMEs: translated ecosystem sections - CONTRIBUTING.md: related repositories table for contributors - pyproject.toml: ecosystem URLs in [project.urls] for PyPI sidebar Addresses contributor feedback about difficulty finding the website repo. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 19:22:16 +03:00
yusyus	073e6b5a54	docs: add architecture references to README.md and CONTRIBUTING.md - README: Add Architecture section with package overview diagram, module table, and links to UML docs - README: Add Architecture subsection to Documentation with links to diagrams, HTML API reference, and StarUML project - CONTRIBUTING: Add UML Architecture subsection with design patterns documented and guidance to keep UML in sync with code changes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 12:33:56 +03:00
yusyus	40603a3cf6	docs: remove stale UNIFIED_PARSERS.md superseded by UML architecture The parsers architecture is now fully documented in the StarUML project (Docs/UML/skill_seekers.mdj) with the Parsers class diagram showing all 28 SubcommandParser subclasses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 12:31:17 +03:00
yusyus	6b54988db5	docs: add StarUML HTML API reference documentation export 1,758 HTML files generated from StarUML project_export_doc containing full API reference for all ~200 classes, operations, attributes, and documentation across all 13 modules. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 12:29:14 +03:00
yusyus	30b877274b	docs: add full UML architecture with 14 class diagrams synced from source code - 14 StarUML diagrams covering all 13 modules (8 core + 5 utility) - ~200 classes with operations, attributes, and documentation from actual source - Package overview with 25 verified inter-module dependencies - Exported PNG diagrams in Docs/UML/exports/ - Architecture.md with embedded diagram descriptions - CLAUDE.md updated with architecture reference Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-22 12:24:43 +03:00
yusyus	d0d7d5a939	chore: remove stale root-level test scripts and junk files Remove files that should never have been committed: - test_api.py, test_httpx_quick.sh, test_httpx_skill.sh (ad-hoc test scripts) - test_week2_features.py (one-off validation script) - test_results.log (log file) - =0.24.0 (accidental pip error output) - demo_conflicts.py (demo script) - ruff_errors.txt (stale lint output) - TESTING_GAP_REPORT.md (stale one-time report) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 21:39:22 +03:00
yusyus	0fa99641aa	style: fix pre-existing ruff format issues in 5 files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 21:24:21 +03:00
yusyus	eb13f96ece	docs: update remaining docs for 12 LLM platforms Update platform counts (4→12) in: - docs/reference/CLAUDE_INTEGRATION.md (EN + zh-CN) - docs/guides/MCP_SETUP.md, UPLOAD_GUIDE.md, MIGRATION_GUIDE.md - docs/strategy/INTEGRATION_STRATEGY.md, DEEPWIKI_ANALYSIS.md, KIMI_ANALYSIS_COMPARISON.md - docs/archive/historical/HTTPX_SKILL_GRADING.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 20:50:50 +03:00
yusyus	6bb7078fbc	docs: update all documentation for 12 LLM platforms and 18 agents - README.md + 11 i18n READMEs: 5→12 LLM platforms, 11→18 agents, new platform/agent tables - CLAUDE.md: updated --target list, adaptor directory tree - CHANGELOG.md: added v3.4.0 entry with all Phase 1-4 changes - docs/reference/CLI_REFERENCE.md: new --target and --agent options - docs/reference/FEATURE_MATRIX.md: updated all platform counts and tables - docs/user-guide/04-packaging.md: new platform and agent rows - docs/FAQ.md: expanded platform/agent answers - docs/zh-CN/*: synchronized Chinese documentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 20:42:31 +03:00
yusyus	cd7b322b5e	feat: expand platform coverage with 8 new adaptors, 7 new CLI agents, and OpenCode skill tools Phase 1 - OpenCode Integration: - Add OpenCodeAdaptor with directory-based packaging and dual-format YAML frontmatter - Kebab-case name validation matching OpenCode's regex spec Phase 2 - OpenAI-Compatible LLM Platforms: - Extract OpenAICompatibleAdaptor base class from MiniMax (shared format/package/upload/enhance) - Refactor MiniMax to ~20 lines of constants inheriting from base - Add 6 new LLM adaptors: Kimi, DeepSeek, Qwen, OpenRouter, Together AI, Fireworks AI - All use OpenAI-compatible API with platform-specific constants Phase 3 - CLI Agent Expansion: - Add 7 new install-agent paths: roo, cline, aider, bolt, kilo, continue, kimi-code - Total agents: 11 -> 18 Phase 4 - Advanced Features: - OpenCode skill splitter (auto-split large docs into focused sub-skills with router) - Bi-directional skill format converter (import/export between OpenCode and any platform) - GitHub Actions template for automated skill updates Totals: 12 --target platforms, 18 --agent paths, 2915 tests passing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 20:31:51 +03:00
yusyus	1d3d7389d7	fix: sanitize_url crashes on Python 3.14 strict urlparse (#284 ) Python 3.14's urlparse() raises ValueError on URLs with unencoded brackets that look like malformed IPv6 (e.g. http://[fdaa:x:x:x::x from docs.openclaw.ai llms-full.txt). sanitize_url() called urlparse() BEFORE encoding brackets, so it crashed before it could fix them. Fix: catch ValueError from urlparse, encode ALL brackets, then retry. This is safe because if urlparse rejected the brackets, they are NOT valid IPv6 host literals and should be encoded anyway. Also fixed Discord e2e tests to skip gracefully on network issues. Fixes #284 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 00:30:48 +03:00
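The catch-encode-retry approach described in the commit above might look roughly like this. It is a simplified sketch, not the project's sanitize_url(): the helper `_encode_brackets` is hypothetical, and the success path here encodes brackets only in the path and query, which is an assumption about the intended behavior.

```python
from urllib.parse import urlparse


def _encode_brackets(text):
    """Percent-encode square brackets."""
    return text.replace("[", "%5B").replace("]", "%5D")


def sanitize_url(url):
    """Encode stray brackets without crashing on malformed hosts."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse rejected the brackets, so they cannot be a valid IPv6
        # host literal; encode all of them, then confirm the result parses.
        encoded = _encode_brackets(url)
        urlparse(encoded)
        return encoded
    # Parsed cleanly: leave the host alone (it may be a real IPv6 literal)
    # and encode brackets in the path and query only.
    return parsed._replace(
        path=_encode_brackets(parsed.path),
        query=_encode_brackets(parsed.query),
    ).geturl()
```

The order matters: parsing first preserves legitimate hosts such as `[::1]`, while the except branch covers inputs that never were valid hosts.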
yusyus	2ef6e59d06	fix: stop blindly appending /index.html.md to non-.md URLs (#277 ) The previous fix (`a82cf69`) only addressed anchor fragment stripping but left the fundamental problem: _convert_to_md_urls() blindly appended /index.html.md to ALL non-.md URLs from llms.txt. This only works for Docusaurus sites — for sites like Discord docs it generates mass 404s. Changes: - _convert_to_md_urls() now strips anchors and deduplicates only, preserving original URLs as-is instead of appending /index.html.md - New _has_md_extension() helper uses urlparse().path.endswith(".md") instead of error-prone ".md" in url substring matching - Fixed ".md" in url checks at 4 locations (lines 465, 554, 716, 775) - Removed 24 lines of dead commented-out code - Added real-world e2e test against docs.discord.com (no mocks) - Updated unit tests for new behavior (32 tests) Fixes #277 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 23:44:35 +03:00
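The difference between substring matching and the path-based check named in the commit above is easy to show. The sketch below uses un-prefixed names (`has_md_extension`, `convert_to_md_urls`) to make clear it is an illustration rather than the project's code; only the `urlparse().path.endswith(".md")` test and the strip-anchors-and-deduplicate behavior come from the commit message.

```python
from urllib.parse import urlparse


def has_md_extension(url):
    """True only when the URL path itself ends in .md."""
    return urlparse(url).path.endswith(".md")


def convert_to_md_urls(urls):
    """Strip anchor fragments and deduplicate, keeping first-seen order.

    URLs are otherwise preserved as-is; nothing is appended to them.
    """
    seen = set()
    result = []
    for url in urls:
        base = url.split("#", 1)[0]
        if base and base not in seen:
            seen.add(base)
            result.append(base)
    return result
```

A check like `".md" in url` would wrongly match a host such as `my.mdsite.com` or a file named `a.md.html`, which is the error-prone behavior the commit replaces.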
yusyus	f6131c6798	fix: unified scraper temp config uses unified format for doc_scraper (#317) The unified scraper's _scrape_documentation() was creating temp configs in flat/legacy format (no "sources" key), causing doc_scraper's ConfigValidator to reject them. Wrap the temp config in unified format with a "sources" array. Also remove dead code branches and fix a pre-existing test that didn't clear GITHUB_TOKEN from env. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 22:35:12 +03:00
yusyus	4f87de6b56	fix: improve MiniMax adaptor from PR #318 review (#319 ) * feat: add MiniMax AI as LLM platform adaptor Original implementation by octo-patch in PR #318. This commit includes comprehensive improvements and documentation. Code Improvements: - Fix API key validation to properly check JWT format (eyJ prefix) - Add specific exception handling for timeout and connection errors - Remove unused variable in upload method Dependencies: - Add MiniMax to [all-llms] extra group in pyproject.toml Tests: - Remove duplicate setUp method in integration test class - Add 4 new test methods: * test_package_excludes_backup_files * test_upload_success_mocked (with OpenAI mocking) * test_upload_network_error * test_upload_connection_error * test_validate_api_key_jwt_format - Update test_validate_api_key_valid to use JWT format keys - Fix test assertions for error message matching Documentation: - Create comprehensive MINIMAX_INTEGRATION.md guide (380+ lines) - Update MULTI_LLM_SUPPORT.md with MiniMax platform entry - Update 01-installation.md extras table - Update INTEGRATIONS.md AI platforms table - Update AGENTS.md adaptor import pattern example - Fix README.md platform count from 4 to 5 All tests pass (33 passed, 3 skipped) Lint checks pass Co-authored-by: octo-patch <octo-patch@users.noreply.github.com> * fix: improve MiniMax adaptor — typed exceptions, key validation, tests, docs - Remove invalid "minimax" self-reference from all-llms dependency group - Use typed OpenAI exceptions (APITimeoutError, APIConnectionError) instead of string-matching on generic Exception - Replace incorrect JWT assumption in validate_api_key with length check - Use DEFAULT_API_ENDPOINT constant instead of hardcoded URLs (3 sites) - Add Path() cast for output_path before .is_dir() call - Add sys.modules mock to test_enhance_missing_library - Add mocked test_enhance_success with backup/content verification - Update test assertions for new exception types and key validation - Add MiniMax 
to __init__.py docstrings (module, get_adaptor, list_platforms) - Add MiniMax sections to MULTI_LLM_SUPPORT.md (install, format, API key, workflow example, export-to-all) Follows up on PR #318 by @octo-patch (feat: add MiniMax AI as LLM platform adaptor). Co-Authored-By: Octopus <octo-patch@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: octo-patch <octo-patch@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 22:12:23 +03:00
yusyus	37a23e6c6d	fix: replace unicode arrows in CLI help text for Windows cp1252 compatibility Replace → (U+2192) with -> in argparse help strings. Windows cmd uses cp1252 encoding which cannot render unicode arrows, causing --help to crash with UnicodeEncodeError.	2026-03-19 00:10:50 +03:00
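The cp1252 failure behind the commit above can be reproduced without a Windows console, since the crash comes from encoding the help text. A minimal sketch with hypothetical helper names (`ascii_safe_help`, `encodable`):

```python
def ascii_safe_help(text):
    """Replace the rightwards arrow (U+2192) with an ASCII equivalent."""
    return text.replace("\u2192", "->")


def encodable(text, encoding="cp1252"):
    """Report whether `text` can be encoded in the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True
```

U+2192 has no mapping in cp1252, so any help string containing it fails to encode until the arrow is replaced.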
yusyus	26c2d0bd5c	fix: correct CLI flags in plugin slash commands (create uses --preset, package uses --target)	2026-03-17 22:03:20 +03:00
yusyus	5e4932e8b1	feat: add distribution files for Smithery, GitHub Action, and Claude Code Plugin - Add Claude Code Plugin: plugin.json, .mcp.json, 3 slash commands, skill-builder agent skill - Add GitHub Action: composite action.yml with 6 inputs/2 outputs, comprehensive README - Add Smithery: publishing guide with namespace yusufkaraaslan/skill-seekers created - Add render-mcp.yaml for MCP server deployment on Render - Fix Dockerfile.mcp: --transport flag (nonexistent) → --http, add dynamic PORT support - Update AGENTS.md to v3.3.0 with corrected test count and expanded CI section - Allow distribution/claude-plugin/.mcp.json in .gitignore	2026-03-16 23:29:50 +03:00
yusyus	2b725aa8f7	fix: update version strings and test expectations from 3.2.0 to 3.3.0 Fix CI failures: version hardcoded in _version.py fallbacks and test assertions (test_package_structure, test_cli_paths) still referenced 3.2.0 after the version bump.	2026-03-16 00:53:35 +03:00
yusyus	ca0890ba6f	chore: bump version to 3.3.0 and finalize changelog - Bump version in pyproject.toml: 3.2.0 -> 3.3.0 - Rename [Unreleased] to [3.3.0] - 2026-03-16 with theme line - Add Supported Source Types (17) reference table - Add 12 missing changelog entries: - feat: sync-config command (#306) - feat: best practices guide (#206) - docs: 32 files updated for 17 source types - docs: README translations for 10 languages - perf: pre-compiled regex, bisect line indexing, O(1) dedup (#309) - fix: Invalid IPv6 URL on bracket URLs (#284) - fix: GitHub scraper PaginatedList crash (#269) - fix: release workflow version mismatch and 3.10 compat - fix: infer_categories key mismatch - fix: flaky benchmark test - fix: CI branch protection pending	2026-03-16 00:23:48 +03:00
yusyus	9e405df9d0	docs: add README translations for 10 languages (12 total) Add machine-translated README files for Japanese, Korean, Spanish, French, German, Portuguese (BR), Turkish, Arabic, Hindi, and Russian. Update language selector in English and Chinese READMEs to link all 12 versions. New files: README.{ja,ko,es,fr,de,pt-BR,tr,ar,hi,ru}.md Modified: README.md, README.zh-CN.md (language selector bar)	2026-03-15 16:27:05 +03:00
yusyus	37cb307455	docs: update all documentation for 17 source types Update 32 documentation files across English and Chinese (zh-CN) docs to reflect the 10 new source types added in the previous commit. Updated files: - README.md, README.zh-CN.md — taglines, feature lists, examples, install extras - docs/reference/ — CLI_REFERENCE, FEATURE_MATRIX, MCP_REFERENCE, CONFIG_FORMAT, API_REFERENCE - docs/features/ — UNIFIED_SCRAPING with generic merge docs - docs/advanced/ — multi-source guide, MCP server guide - docs/getting-started/ — installation extras, quick-start examples - docs/user-guide/ — core-concepts, scraping, packaging, workflows (complex-merge) - docs/ — FAQ, TROUBLESHOOTING, BEST_PRACTICES, ARCHITECTURE, UNIFIED_PARSERS, README - Root — BULLETPROOF_QUICKSTART, CONTRIBUTING, ROADMAP - docs/zh-CN/ — Chinese translations for all of the above 32 files changed, +3,016 lines, -245 lines	2026-03-15 15:56:04 +03:00
yusyus	53b911b697	feat: add 10 new skill source types (17 total) with full pipeline integration Add Jupyter Notebook, Local HTML, OpenAPI/Swagger, AsciiDoc, PowerPoint, RSS/Atom, Man Pages, Confluence, Notion, and Slack/Discord Chat as new skill source types. Each type is fully integrated across: - Standalone CLI commands (skill-seekers <type>) - Auto-detection via 'skill-seekers create' (file extension + content sniffing) - Unified multi-source configs (scraped_data, dispatch, config validation) - Unified skill builder (generic merge + source-attributed synthesis) - MCP server (scrape_generic tool with per-type flag mapping) - pyproject.toml (entry points, optional deps, [all] group) Also fixes: EPUB unified pipeline gap, missing word/video config validators, OpenAPI yaml import guard, MCP flag mismatch for all 10 types, stale docstrings, and adds 77 integration tests + complex-merge workflow. 50 files changed, +20,201 lines	2026-03-15 15:30:15 +03:00
yusyus	64403a3686	docs: add best practices guide for high-quality skills (#206) Adds docs/BEST_PRACTICES.md — a comprehensive guide for creating high-quality Claude skills. Covers SKILL.md structure, code examples, prerequisites, troubleshooting, quality targets, and a real-world before/after example (Grade F to Grade A). Addresses roadmap item I2.2. Based on PR #206 by @jmagly from the AI Writing Guide project. Fixes applied: updated outdated CLI command, fixed broken doc links. Co-authored-by: jmagly <jmagly@users.noreply.github.com>	2026-03-15 02:51:02 +03:00
yusyus	7185531f94	fix: replace PaginatedList slicing with itertools.islice in _extract_issues PyGithub's PaginatedList slicing (issues[:max_issues]) may fail with 'list index out of range' on some PyGithub versions or when repos have no issues. Replace with itertools.islice() which works reliably with any iterable, including PaginatedList. Bug reported by @dream0438-cmd in PR #269. Closes #269	2026-03-15 02:44:06 +03:00
yusyus	2e30970dfb	feat: add EPUB input support (#310) Adds EPUB as a first-class input source for skill generation. - EpubToSkillConverter (epub_scraper.py, ~1200 lines) following PDF scraper pattern - Dublin Core metadata, spine items, code blocks, tables, images extraction - DRM detection (Adobe ADEPT, Apple FairPlay, Readium LCP) with fail-fast - EPUB 3 NCX TOC bug workaround (ignore_ncx=True) - ebooklib as optional dep: pip install skill-seekers[epub] - Wired into create command with .epub auto-detection - 104 tests, all passing Review fixes: removed 3 empty test stubs, fixed SVG double-counting in _extract_images(), added logger.debug to bare except pass. Based on PR #310 by @christianbaumann. Co-authored-by: Christian Baumann <mail@chriss-baumann.de>	2026-03-15 02:34:41 +03:00
yusyus	83b9a695ba	feat: add sync-config command to detect and update config start_urls (#306) ## Summary Add `skill-seekers sync-config` subcommand that crawls a docs site's navigation, diffs discovered URLs against a config's start_urls, and optionally writes the updated list back with --apply. - BFS link discovery with configurable depth (default 2), max-pages, rate-limit - Respects url_patterns.include/exclude from config - Supports optional nav_seed_urls config field - Handles both unified (sources array) and legacy flat config formats - MCP tool sync_config included - 57 tests (39 unit + 18 E2E with local HTTP server) - Fixed CI: renamed summary job to "Tests" to match branch protection rule Closes #306	2026-03-15 02:16:32 +03:00
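
Commit 7185531f94 in the listing above replaces slicing of a paginated result with itertools.islice(). The following is a minimal sketch of that idea, not the project's code: LazyPages and first_n are hypothetical names, and LazyPages is a plain iterable that does not model PyGithub's PaginatedList. It only shows that islice() caps the number of items taken from any iterable, including an empty one, without indexing into it.

```python
from itertools import islice


class LazyPages:
    """Hypothetical stand-in for a paginated API result.

    It is iterable but defines no __getitem__, so it cannot be sliced.
    """

    def __init__(self, items, page_size=2):
        self._items = list(items)
        self._page_size = page_size

    def __iter__(self):
        # Yield items one page at a time, as a paginated client might.
        for start in range(0, len(self._items), self._page_size):
            yield from self._items[start:start + self._page_size]


def first_n(iterable, n):
    """Return at most n items from any iterable, without slicing it."""
    return list(islice(iterable, n))


issues = LazyPages(["issue-1", "issue-2", "issue-3", "issue-4", "issue-5"])
print(first_n(issues, 3))         # ['issue-1', 'issue-2', 'issue-3']
print(first_n(LazyPages([]), 3))  # []
```

Because islice() only consumes the iterator protocol, the same helper works whether or not the underlying object supports indexing.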

690 Commits
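
Commit 37a23e6c6d in the listing above swaps the arrow character U+2192 for "->" in help strings because cp1252 cannot encode it. A small sketch of how such strings could be checked follows; cp1252_safe and asciify_arrows are hypothetical helper names introduced here for illustration and are not taken from the repository.

```python
def cp1252_safe(text: str) -> bool:
    """Return True if every character in text can be encoded as cp1252."""
    try:
        text.encode("cp1252")
    except UnicodeEncodeError:
        return False
    return True


def asciify_arrows(text: str) -> str:
    """Replace the rightwards arrow (U+2192) with an ASCII '->'."""
    return text.replace("\u2192", "->")


help_text = "json \u2192 skill"
print(cp1252_safe(help_text))                  # False
print(asciify_arrows(help_text))               # json -> skill
print(cp1252_safe(asciify_arrows(help_text)))  # True
```

A check like this could run in a unit test over all help strings, so an unencodable character is caught before it reaches a Windows console.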