- Fix #299: rename --chunk-size/--chunk-overlap to --streaming-chunk-size/
--streaming-overlap in arguments/package.py to avoid collision with the
RAG --chunk-size flag from arguments/common.py
- Phase 1a: make package_skill.py import args via add_package_arguments()
instead of a 105-line inline duplicate argparse block; fixes the root
cause of _reconstruct_argv() passing unrecognised flag names
- Phase 1b: centralise setup_logging() into utils.py and remove 4
duplicate module-level logging.basicConfig() calls from doc_scraper.py,
github_scraper.py, codebase_scraper.py, and unified_scraper.py
- Fix test_package_structure.py / test_cli_paths.py version strings
(3.1.1 → 3.1.2)
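The collision behind #299 can be reproduced with plain argparse. This is a minimal sketch, assuming both argument groups end up registered on one parser; the helper names and default values below are illustrative and are not the project's actual functions:

```python
import argparse


def add_rag_arguments(parser: argparse.ArgumentParser) -> None:
    # Stand-in for the RAG flag defined in arguments/common.py (default is made up).
    parser.add_argument("--chunk-size", type=int, default=512)


def add_streaming_arguments(parser: argparse.ArgumentParser, renamed: bool) -> None:
    # Stand-in for the packaging flags; `renamed` toggles old vs. new names.
    size_flag = "--streaming-chunk-size" if renamed else "--chunk-size"
    overlap_flag = "--streaming-overlap" if renamed else "--chunk-overlap"
    parser.add_argument(size_flag, type=int, default=4000)
    parser.add_argument(overlap_flag, type=int, default=200)


def build_parser(renamed: bool) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="package")
    add_rag_arguments(parser)
    add_streaming_arguments(parser, renamed)
    return parser
```

With the old names, `build_parser(renamed=False)` raises `argparse.ArgumentError` (conflicting option string); with the new names both flags parse independently.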
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix 1 — gemini.py: replace deprecated gemini-2.0-flash-exp (404 errors)
with gemini-2.5-flash (stable, GA, Google's recommended replacement).
Closes #290.
Fix 2 — enhance dispatcher: implement the documented auto-detection that
was missing from the code. skill-seekers enhance now correctly routes:
- ANTHROPIC_API_KEY set → Claude API mode (enhance_skill.py)
- GOOGLE_API_KEY set → Gemini API mode
- OPENAI_API_KEY set → OpenAI API mode
- No API keys → LOCAL mode (Claude Code Max, free)
Use --mode LOCAL to force local mode even when an API key is present.
9 new tests cover _detect_api_target() priority logic and main()
routing (API delegation, --mode LOCAL override, no-key fallback).
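The routing rules listed above can be sketched as follows. This is an illustration only: the function signatures are assumptions, the check order simply follows the order of the list, and the real `_detect_api_target()` may differ:

```python
from typing import Mapping, Optional

# Environment variables checked in the order they are listed in the message.
_KEY_TO_TARGET = (
    ("ANTHROPIC_API_KEY", "claude"),
    ("GOOGLE_API_KEY", "gemini"),
    ("OPENAI_API_KEY", "openai"),
)


def detect_api_target(env: Mapping[str, str]) -> Optional[str]:
    """Return the first API target whose key is set and non-empty, else None."""
    for key, target in _KEY_TO_TARGET:
        if env.get(key):
            return target
    return None


def choose_mode(env: Mapping[str, str], force_local: bool = False) -> str:
    """`force_local` models `--mode LOCAL`, which wins over any API key."""
    if force_local:
        return "LOCAL"
    return detect_api_target(env) or "LOCAL"
```

Passing the environment as a mapping (rather than reading `os.environ` inside the function) keeps the priority logic easy to test without monkeypatching.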
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All scrapers (scrape, github, analyze, pdf) now share a common argument
contract via add_all_standard_arguments() in arguments/common.py.
Universal flags (--dry-run, --verbose, --quiet, --name, --description,
workflow args) work consistently across all source types.
Previously, `create <url> --dry-run`, `create owner/repo --dry-run`,
and `create ./path --dry-run` would crash because sub-scrapers didn't
accept those flags. Also fixes main.py _handle_analyze_command() not
forwarding --dry-run, --preset, --quiet, --name, --description to
codebase_scraper.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The create command crashed with 'Namespace' object has no attribute
'max_pages' because it accessed args.max_pages directly instead of
using getattr() like all other source-specific attributes in the
same method.
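A minimal reproduction of the failure mode and the fix pattern, using a hypothetical helper name `read_max_pages`:

```python
import argparse


def read_max_pages(args: argparse.Namespace, default=None):
    # getattr() with a default tolerates namespaces built by parsers
    # that never registered --max-pages.
    return getattr(args, "max_pages", default)
```

Direct access (`args.max_pages`) on a namespace without that attribute raises `AttributeError`, which is the crash described above.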
Closes #293
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- README, CONTRIBUTING, QUALITY_GUIDELINES, AGENTS.md all aligned with
production best practices (accurate counts, no max_pages, unified format)
- validate-config.py: fix two bugs (unified config categories lookup,
max_pages warning logic)
- Delete old submit-config.md (duplicate of submit-config.yml with
outdated content)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review and update all 7 configs in build-tools:
esbuild, rollup, storybook, swc, turborepo, vite, webpack — all v1.1.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review and update both configs in api-tech:
- graphql.json: add mutations/subscriptions/variables categories,
more start_urls, v1.1.0
- trpc.json: update for tRPC v11, TanStack Query, more start_urls,
data_transformers category, v1.1.0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Points submodule to merged main commit (bf9b0ff) after ai-ml
category review and enhancement was merged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review and update all 34 configs in the ai-ml category:
- Remove max_pages from all configs
- Rewrite anthropic, openai-api, langchain, ollama for current state
- Fix URL patterns in chroma, seaborn, nltk, keras, deepspeed
- All configs pass dry-run validation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The api/configs_repo git submodule was pinned to commit d4c0710 which only
had 14 configs. Updated to latest main (4275d6f) which has 178 configs across
21 categories (web-frameworks, ai-ml, game-engines, databases, devops, etc.)
Also fixed ConfigAnalyzer._categorize_config() to use directory structure
(official/{category}/{name}.json) as authoritative category instead of
keyword matching, which was classifying most new configs as "uncategorized".
Result: API /api/configs now returns 178 configs (was 14).
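The directory-based categorisation can be sketched like this. The function name and the "uncategorized" fallback for paths outside the `official/{category}/{name}.json` layout are assumptions, not the actual `ConfigAnalyzer._categorize_config()` code:

```python
from pathlib import PurePosixPath


def categorize_config(relative_path: str) -> str:
    """Derive the category from official/{category}/{name}.json."""
    parts = PurePosixPath(relative_path).parts
    if len(parts) == 3 and parts[0] == "official" and parts[2].endswith(".json"):
        return parts[1]
    # Assumed fallback for paths that do not match the layout.
    return "uncategorized"
```

For example, `categorize_config("official/ai-ml/keras.json")` yields `"ai-ml"` regardless of what keywords the config itself contains.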
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove unused imports (F401): os/Path/json/threading in tests; os in estimate_pages;
Path in install_skill; pytest in test_unified_scraper_orchestration
- Fix F821 undefined 'args' in unified_scraper._scrape_local() by storing
self._cli_args = args in run() and reading via getattr in _scrape_local()
- Fix ARG001/ARG005 unused lambda/function arguments with _ prefix or # noqa:ARG001
where parameter names must be preserved for keyword-argument compatibility
- Fix C408 unnecessary dict() calls → dict literals in test_enhance_command
- Fix F841 unused variable 'stub' in test_enhance_command
- Fix SIM117 nested with statements → single with in test_unified_scraper_orchestration
- Fix SIM105 try/except/pass → contextlib.suppress in test_unified_scraper_orchestration
- Rewrite TestScrapeLocal to test fixed behavior (not the NameError bug)
All 2267 tests pass, 11 skipped.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- pyproject.toml: version 3.0.0 → 3.1.0
- src/skill_seekers/_version.py: update hardcoded fallback to 3.1.0
- CHANGELOG.md: comprehensive [3.1.0] release notes covering all
features and fixes since v3.0.0 (unified create command, workflow
presets, RST parser, smart enhance dispatcher, CLI flag parity,
60 new workflow YAMLs, test suite improvements)
- Deprecation messages: update "removed in v3.0.0" → "v4.0.0" across
analyze_presets.py, codebase_scraper.py, mcp/server.py
- tests/test_cli_paths.py: update version assertion to 3.1.0
- tests/test_package_structure.py: update __version__ assertions to 3.1.0
- tests/test_preset_system.py: update deprecation message version to v4.0.0
All 2267 tests passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes several categories of test failures to achieve a clean test suite:
**Python 3.14 / chromadb compatibility**
- chroma.py: broaden except clause to catch pydantic ConfigError on Python 3.14
- test_adaptors_e2e.py, test_integration_adaptors.py: skip on (ImportError, Exception)
**sys.modules corruption (test isolation)**
- test_swift_detection.py: save/restore all skill_seekers.cli modules AND parent
package attributes in test_empty_swift_patterns_handled_gracefully; prevents
@patch decorators in downstream test files from targeting stale module objects
**Removed unnecessary @unittest.skip decorators**
- test_claude_adaptor.py, test_gemini_adaptor.py, test_openai_adaptor.py: remove
skip from tests that already had a pass body or became compatible once deps were installed
**Fixed openai import guard for installed package**
- test_openai_adaptor.py: use patch.dict(sys.modules, {"openai": None}) for
test_upload_missing_library since openai is now a transitive dep
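Setting a `sys.modules` entry to `None` makes `import` raise `ImportError` even when the package is installed, which is what the test relies on. A small sketch with hypothetical helper names:

```python
import sys
from unittest.mock import patch


def openai_available() -> bool:
    try:
        import openai  # noqa: F401
    except ImportError:
        return False
    return True


def check_missing_library_path() -> bool:
    # A None entry in sys.modules blocks the import for the duration
    # of the context manager; patch.dict restores the original state.
    with patch.dict(sys.modules, {"openai": None}):
        return openai_available()
```

`check_missing_library_path()` returns `False` whether or not openai is installed, so the "missing library" branch is exercised deterministically.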
**langchain import path update**
- test_rag_chunker.py: fix import path, langchain.schema → langchain_core.documents
**config_extractor tomllib fallback**
- config_extractor.py: use stdlib tomllib (Python 3.11+) as fallback when
tomli/toml packages are not installed
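One way to express that fallback chain, as a sketch only: it covers tomli and tomllib (the older `toml` package is left out), and `parse_toml` is a hypothetical name:

```python
try:
    import tomli as _toml_reader  # third-party backport, if installed
except ImportError:
    try:
        import tomllib as _toml_reader  # stdlib on Python 3.11+
    except ImportError:
        _toml_reader = None  # no TOML reader available


def parse_toml(text: str):
    """Return the parsed document, or None when no reader is available."""
    if _toml_reader is None:
        return None
    return _toml_reader.loads(text)
```

Both tomli and tomllib expose the same `loads()` function for strings, so the caller does not need to know which one was imported.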
**Remove redundant sys.path.insert() calls**
- codebase_scraper.py, doc_scraper.py, enhance_skill.py, enhance_skill_local.py,
estimate_pages.py, install_skill.py: remove legacy path manipulation no longer
needed with pip install -e . (src/ layout)
**Test fixes: removed @requires_github from fully-mocked tests**
- test_unified_analyzer.py: 5 tests that mock GitHubThreeStreamFetcher don't
need a real token; remove decorator so they always run
**macOS-specific test improvements**
- test_terminal_detection.py: use @patch("sys.platform", "darwin") instead of
runtime skipTest() so tests run on all platforms
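Patching the platform string lets a macOS-only branch run on any host. A minimal sketch with hypothetical function names:

```python
import sys
from unittest.mock import patch


def is_macos() -> bool:
    return sys.platform == "darwin"


def check_macos_branch() -> bool:
    # The patch target is the dotted string "sys.platform"; the original
    # value is restored when the context manager exits.
    with patch("sys.platform", "darwin"):
        return is_macos()
```

The same target string works as a decorator (`@patch("sys.platform", "darwin")`) on a test method.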
**Dependency updates**
- pyproject.toml, uv.lock: add langchain and llama-index as core dependencies
**New workflow presets and tests**
- src/skill_seekers/workflows/: add 60 new domain-specific workflow YAML presets
- tests/test_mcp_workflow_tools.py: tests for MCP workflow tool implementations
- tests/test_unified_scraper_orchestration.py: tests for UnifiedScraper methods
Result: 2115 passed, 158 skipped (external services/long-running), 0 failures
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes issues #289 and #286 (agent switching and Docker/root failures).
enhance_command.py (new smart dispatcher):
- Routes skill-seekers enhance to API mode (Gemini/OpenAI/Claude API)
when an API key is available, or LOCAL mode (Claude Code CLI) otherwise
- Decision priority: --target flag > config default_agent > auto-detect
from env vars (ANTHROPIC_API_KEY → claude, GOOGLE_API_KEY → gemini,
OPENAI_API_KEY → openai) > LOCAL fallback
- Blocks LOCAL mode when running as root (Docker/VPS) with clear error
message + API mode instructions
- Supports --dry-run, --target, --api-key as first-class flags
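The precedence chain and the root guard described above might look roughly like this. All names here are illustrative, and the exception type and message are assumptions rather than the dispatcher's real behaviour:

```python
import os
from typing import Optional


def resolve_target(
    cli_target: Optional[str],
    config_default: Optional[str],
    auto_detected: Optional[str],
) -> str:
    """--target flag > config default_agent > env auto-detect > LOCAL."""
    for candidate in (cli_target, config_default, auto_detected):
        if candidate:
            return candidate
    return "local"


def running_as_root() -> bool:
    # os.geteuid does not exist on Windows, hence the getattr guard.
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def ensure_local_allowed(target: str, is_root: bool) -> None:
    if target == "local" and is_root:
        raise PermissionError(
            "LOCAL mode is blocked when running as root; "
            "set an API key to use API mode instead."
        )
```

A caller would combine them as `ensure_local_allowed(resolve_target(...), running_as_root())` before launching the local agent.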
arguments/enhance.py:
- Added --target, --api-key, --dry-run, --interactive-enhancement to
ENHANCE_ARGUMENTS (shared by unified CLI parser and standalone entry point)
enhance_skill_local.py:
- Error output no longer truncated at 200 chars (shows up to 20 lines)
- Detects root/permission errors in stderr and prints actionable hint
config_manager.py:
- Added default_agent field to DEFAULT_CONFIG ai_enhancement section
- Added get_default_agent() and set_default_agent() methods
main.py:
- enhance command routed to enhance_command (was enhance_skill_local)
- _handle_analyze_command uses smart dispatcher for post-analysis enhancement
pyproject.toml:
- skill-seekers-enhance entry point updated to enhance_command:main
Tests: 1977 passed, 0 failed (28 new tests in test_enhance_command.py)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Flag/option synchronization fixes:
- analyze: add --dry-run, --api-key, and all workflow flags (--enhance-workflow,
--enhance-stage, --var, --workflow-dry-run) via WORKFLOW_ARGUMENTS merge
- pdf: add --api-key to PDF_ARGUMENTS; replace 5 hardcoded add_argument() calls
in pdf_scraper.py:main() with add_pdf_arguments() to activate all defined args
- unified: add --api-key and --enhance-level (global override) to UNIFIED_ARGUMENTS
and standalone parser; wire enhance_level CLI override into run() per-source loop
- codebase_scraper: fix --enhance-workflow to use action="append" (was type=str),
enabling multiple workflow chaining instead of silently dropping all but last
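The difference between `type=str` and `action="append"` for a repeated flag is easy to demonstrate. A sketch with a hypothetical helper and placeholder workflow names:

```python
import argparse


def parse_workflows(argv, chained: bool):
    parser = argparse.ArgumentParser()
    if chained:
        # Each occurrence is collected into a list.
        parser.add_argument("--enhance-workflow", action="append", default=None)
    else:
        # A plain string option keeps only the last occurrence.
        parser.add_argument("--enhance-workflow", type=str, default=None)
    return parser.parse_args(argv).enhance_workflow
```

Given `["--enhance-workflow", "first", "--enhance-workflow", "second"]`, the `type=str` variant returns only `"second"`, while the `action="append"` variant returns both values in order.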
ConfigManager test isolation fix:
- __init__ now reads self.CONFIG_DIR/CONFIG_FILE/PROGRESS_DIR class variables
instead of calling _get_config_dir()/_get_progress_dir() directly, enabling
monkeypatching in tests (fixes pre-existing test_add_and_retrieve_github_profile)
Workflow JSON config support in unified_scraper:
- Phase 5 now reads workflows/workflow_stages/workflow_vars from top-level JSON
config and merges them with CLI args (CLI-first ordering); supports running
workflows even when unified scraper is called without CLI args (args=None)
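CLI-first merging can be sketched as a simple ordered concatenation. Whether the real code de-duplicates entries is not stated, so this sketch does not; `merge_workflows` is a hypothetical name:

```python
from typing import List, Optional, Sequence


def merge_workflows(
    cli_workflows: Optional[Sequence[str]],
    config_workflows: Optional[Sequence[str]],
) -> List[str]:
    """CLI entries come first; either side may be None (e.g. args=None)."""
    return [*(cli_workflows or []), *(config_workflows or [])]
```

Treating `None` as an empty list is what allows the JSON config alone to drive workflows when no CLI args are supplied.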
Tests: 1,949 passed, 0 failed (added 18 new tests across 3 test files)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously _route_config only forwarded --dry-run, silently dropping
all enhancement workflows, --merge-mode, and --skip-codebase-analysis.
Changes:
- arguments/create.py: add CONFIG_ARGUMENTS dict with merge_mode and
skip_codebase_analysis; wire into get_source_specific_arguments(),
get_compatible_arguments(), and add_create_arguments(mode='config')
- create_command.py: fix _route_config to forward --fresh, --merge-mode,
--skip-codebase-analysis, and all 4 workflow flags; add --help-config
handler (skill-seekers create --help-config) matching other help modes
- parsers/create_parser.py: add --help-config flag for unified CLI parity
- tests/test_create_arguments.py: import CONFIG_ARGUMENTS; update config
source tests to assert correct content instead of empty dict
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
unified_scraper.py was the only scraper missing --enhance-workflow,
--enhance-stage, --var, and --workflow-dry-run support. All other
scrapers (doc_scraper, github_scraper, pdf_scraper, codebase_scraper)
already called run_workflows() after building the skill.
Changes:
- arguments/unified.py: add 4 workflow args to UNIFIED_ARGUMENTS so
the unified CLI subparser picks them up automatically
- unified_scraper.py main(): register the same 4 workflow args in the
standalone parser
- unified_scraper.py run(): accept optional `args` parameter and call
run_workflows() after build_skill(), passing unified context
(name + description) consistent with doc_scraper pattern
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Square brackets in URL paths (e.g. /api/[v1]/users from API reference docs)
are technically invalid unencoded per RFC 3986. httpx interprets them as IPv6
address literals and raises "Invalid IPv6 URL", crashing the llms-full.md
parse step.
Fix _clean_url() in LlmsTxtParser to percent-encode [ and ] in the path and
query components (-> %5B / %5D) using urlparse/urlunparse so only the path and
query are touched, not the host. Anchor-stripping logic is preserved and runs first.
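A minimal sketch of the technique, assuming a standalone function rather than the actual LlmsTxtParser method:

```python
from urllib.parse import urlparse, urlunparse


def _encode_brackets(component: str) -> str:
    return component.replace("[", "%5B").replace("]", "%5D")


def clean_url(url: str) -> str:
    # Strip the anchor first, then encode brackets in path and query only.
    url = url.split("#", 1)[0]
    parts = urlparse(url)
    return urlunparse(
        parts._replace(
            path=_encode_brackets(parts.path),
            query=_encode_brackets(parts.query),
        )
    )
```

Because the netloc is left untouched, a genuine IPv6 host such as `http://[::1]:8080/...` keeps its brackets while brackets in the path are encoded.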
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add _get_config_dir() and _get_progress_dir() helpers that return
%APPDATA%/skill-seekers and %LOCALAPPDATA%/skill-seekers/progress on
Windows instead of Unix-only ~/.config and ~/.local/share paths
- Recompute paths at instance creation in __init__ so they are always
evaluated at runtime, not at class definition time
- Guard all chmod() calls with sys.platform != "win32": chmod with
Unix stat flags is a no-op on Windows, which caused config to appear
saved but be unreadable/unfindable on subsequent runs
- Fix should_show_welcome() and mark_welcome_shown() to use instance
config_dir instead of stale class-level WELCOME_FLAG constant
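The platform-aware path lookup and the chmod guard can be sketched as below. The function names, the `~/.config/skill-seekers` Unix location, and the fallback used when `APPDATA` is unset are assumptions for illustration:

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def get_config_dir(
    platform: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    # Resolve defaults at call time so nothing is frozen at import time.
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if platform == "win32":
        # Assumed fallback when APPDATA is not set.
        base = Path(env.get("APPDATA") or home / "AppData" / "Roaming")
        return base / "skill-seekers"
    return home / ".config" / "skill-seekers"


def restrict_permissions(path: Path, platform: Optional[str] = None) -> bool:
    """Apply owner-only permissions; skipped on Windows. Returns True if applied."""
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return False
    path.chmod(0o600)
    return True
```

Accepting `platform`, `env`, and `home` as parameters makes both branches testable on any operating system.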
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>