fix: RAG chunking crash using non-existent converter.output_dir

DocToSkillConverter exposes self.skill_dir (a string), not self.output_dir,
so the --chunk-for-rag flag on the scrape command crashed with an AttributeError.
The chunking step now uses Path(converter.skill_dir).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
yusyus
2026-02-27 22:26:21 +03:00
parent 4b59bd43be
commit 3bad7cf365
2 changed files with 4 additions and 2 deletions

@@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **`docx` optional dependency group** — `pip install skill-seekers[docx]` (mammoth + python-docx)
 ### Fixed
+- **RAG chunking crash (`AttributeError: output_dir`)** — `execute_scraping_and_building()` used `converter.output_dir` which doesn't exist on `DocToSkillConverter`. Changed to `Path(converter.skill_dir)`. Affected `--chunk-for-rag` flag on `scrape` command.
 - **Issue #301: `setup.sh` fails on macOS with mismatched Python/pip** — `pip3` can point to a different Python than `python3` (e.g. pip3 → 3.9, python3 → 3.14), causing "no matching distribution" errors. Changed `setup.sh` to use `python3 -m pip` instead of bare `pip3` to guarantee the correct interpreter.
 - **Issue #300: Selector fallback & dry-run link discovery** — `create https://reactflow.dev/` now finds 20+ pages (was 1). Root causes:
 - `extract_content()` extracted links after the early-return when no content selector matched, so they were never discovered. Moved link extraction before the early return.


@@ -2289,10 +2289,11 @@ def execute_scraping_and_building(
 )
 # Chunk the skill
-chunks = chunker.chunk_skill(converter.output_dir)
+skill_dir = Path(converter.skill_dir)
+chunks = chunker.chunk_skill(skill_dir)
 # Save chunks
-chunks_path = converter.output_dir / "rag_chunks.json"
+chunks_path = skill_dir / "rag_chunks.json"
 chunker.save_chunks(chunks, chunks_path)
 logger.info(f"✅ Generated {len(chunks)} RAG chunks")
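
The change above can be reproduced in isolation. The sketch below is illustrative only: `FakeConverter`, `chunks_path_for`, and `demo` are hypothetical names standing in for `DocToSkillConverter` and the chunking step, and the only assumption carried over from the commit is that `skill_dir` is a plain string. It shows why both parts of the fix matter: the old attribute does not exist, and a bare string cannot be joined with `/` without first being wrapped in `Path`.

```python
import json
import tempfile
from pathlib import Path


class FakeConverter:
    """Hypothetical stand-in for DocToSkillConverter: it has skill_dir (a str) and no output_dir."""

    def __init__(self, skill_dir: str) -> None:
        self.skill_dir = skill_dir


def chunks_path_for(converter) -> Path:
    # Wrap the string in Path so the "/" join operator is available.
    return Path(converter.skill_dir) / "rag_chunks.json"


def demo() -> dict:
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        converter = FakeConverter(tmp)

        # Old code path: the attribute simply is not there.
        try:
            converter.output_dir
        except AttributeError:
            results["old_attr"] = "AttributeError"

        # Swapping only the attribute name is not enough: str has no "/" operator.
        try:
            converter.skill_dir / "rag_chunks.json"
        except TypeError:
            results["str_div"] = "TypeError"

        # Fixed code path: build the path via Path and write a placeholder chunk list.
        path = chunks_path_for(converter)
        path.write_text(json.dumps([{"id": 0, "text": "example"}]))
        results["written"] = path.exists()
        results["name"] = path.name
    return results


if __name__ == "__main__":
    print(demo())
```

Binding `Path(converter.skill_dir)` to a local variable once, as the diff does, keeps the chunking call and the output path pointing at the same directory.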