fix: RAG chunking crash using non-existent converter.output_dir

DocToSkillConverter has self.skill_dir (string), not self.output_dir. The --chunk-for-rag flag on scrape command crashed with AttributeError. Changed to Path(converter.skill_dir).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 22:26:21 +03:00
parent 4b59bd43be
commit 3bad7cf365
2 changed files with 4 additions and 2 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **`docx` optional dependency group** — `pip install skill-seekers[docx]` (mammoth + python-docx)

 ### Fixed
+- **RAG chunking crash (`AttributeError: output_dir`)** — `execute_scraping_and_building()` used `converter.output_dir` which doesn't exist on `DocToSkillConverter`. Changed to `Path(converter.skill_dir)`. Affected `--chunk-for-rag` flag on `scrape` command.
 - **Issue #301: `setup.sh` fails on macOS with mismatched Python/pip** — `pip3` can point to a different Python than `python3` (e.g. pip3 → 3.9, python3 → 3.14), causing "no matching distribution" errors. Changed `setup.sh` to use `python3 -m pip` instead of bare `pip3` to guarantee the correct interpreter.
 - **Issue #300: Selector fallback & dry-run link discovery** — `create https://reactflow.dev/` now finds 20+ pages (was 1). Root causes:
  - `extract_content()` extracted links after the early-return when no content selector matched, so they were never discovered. Moved link extraction before the early return.
--- a/src/skill_seekers/cli/doc_scraper.py
+++ b/src/skill_seekers/cli/doc_scraper.py
@@ -2289,10 +2289,11 @@ def execute_scraping_and_building(
        )

        # Chunk the skill
-        chunks = chunker.chunk_skill(converter.output_dir)
+        skill_dir = Path(converter.skill_dir)
+        chunks = chunker.chunk_skill(skill_dir)

        # Save chunks
-        chunks_path = converter.output_dir / "rag_chunks.json"
+        chunks_path = skill_dir / "rag_chunks.json"
        chunker.save_chunks(chunks, chunks_path)

        logger.info(f"✅ Generated {len(chunks)} RAG chunks")
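
The failure mode and the fix in the hunk above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `FakeConverter` is a hypothetical stand-in for `DocToSkillConverter` that models only the `skill_dir` string attribute, the `output/<name>` layout is an assumed value for illustration, and `chunks_path_for` is a made-up helper that mirrors the two fixed lines. It also shows why the `Path(...)` wrapper matters: joining a plain string with `/` would fail even after the attribute name is corrected.

```python
from pathlib import Path


class FakeConverter:
    """Hypothetical stand-in for DocToSkillConverter; only skill_dir is modeled."""

    def __init__(self, name: str) -> None:
        # Per the commit message, skill_dir is a plain string.
        # The "output/<name>" layout is an assumption for this sketch.
        self.skill_dir = f"output/{name}"


def chunks_path_for(converter) -> Path:
    """Mirror the fixed lines: wrap skill_dir in Path before joining."""
    skill_dir = Path(converter.skill_dir)
    return skill_dir / "rag_chunks.json"


converter = FakeConverter("react")

# Old code path: the attribute does not exist, so access raises AttributeError.
try:
    converter.output_dir
    old_code_failed = False
except AttributeError:
    old_code_failed = True

# Using the string directly with "/" raises TypeError, hence the Path() wrapper.
try:
    converter.skill_dir / "rag_chunks.json"
    str_join_failed = False
except TypeError:
    str_join_failed = True

# Fixed code path: a Path that supports "/" joins.
chunks_path = chunks_path_for(converter)
print(old_code_failed, str_join_failed, chunks_path.name)
```

A regression test along these lines, run against the real converter class, would likely have caught the crash before release, since the `--chunk-for-rag` branch is the only caller that touched the missing attribute.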