fix: RAG chunking crash using non-existent converter.output_dir
DocToSkillConverter has self.skill_dir (a string), not self.output_dir, so the --chunk-for-rag flag on the scrape command crashed with AttributeError. Changed to Path(converter.skill_dir).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **`docx` optional dependency group** — `pip install skill-seekers[docx]` (mammoth + python-docx)

 ### Fixed
+- **RAG chunking crash (`AttributeError: output_dir`)** — `execute_scraping_and_building()` used `converter.output_dir` which doesn't exist on `DocToSkillConverter`. Changed to `Path(converter.skill_dir)`. Affected `--chunk-for-rag` flag on `scrape` command.
 - **Issue #301: `setup.sh` fails on macOS with mismatched Python/pip** — `pip3` can point to a different Python than `python3` (e.g. pip3 → 3.9, python3 → 3.14), causing "no matching distribution" errors. Changed `setup.sh` to use `python3 -m pip` instead of bare `pip3` to guarantee the correct interpreter.
 - **Issue #300: Selector fallback & dry-run link discovery** — `create https://reactflow.dev/` now finds 20+ pages (was 1). Root causes:
   - `extract_content()` extracted links after the early-return when no content selector matched, so they were never discovered. Moved link extraction before the early return.
@@ -2289,10 +2289,11 @@ def execute_scraping_and_building(
     )

     # Chunk the skill
-    chunks = chunker.chunk_skill(converter.output_dir)
+    skill_dir = Path(converter.skill_dir)
+    chunks = chunker.chunk_skill(skill_dir)

     # Save chunks
-    chunks_path = converter.output_dir / "rag_chunks.json"
+    chunks_path = skill_dir / "rag_chunks.json"
     chunker.save_chunks(chunks, chunks_path)

     logger.info(f"✅ Generated {len(chunks)} RAG chunks")
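
The failure mode and the fix can be reproduced in isolation. The sketch below is illustrative only: `FakeConverter` and `chunks_path_for` are hypothetical stand-ins (not part of the project), assuming only what the commit message states, namely that the converter exposes `skill_dir` as a string and has no `output_dir`. It also shows why the `Path(...)` wrap is needed: joining with `/` does not work on a plain string.

```python
from pathlib import Path


class FakeConverter:
    """Hypothetical stand-in for DocToSkillConverter: only skill_dir exists, as a str."""

    def __init__(self, skill_dir: str) -> None:
        self.skill_dir = skill_dir


def chunks_path_for(converter) -> Path:
    """Build the chunk file path the way the fixed code does."""
    # Wrap the string in Path so the "/" operator can join path segments.
    skill_dir = Path(converter.skill_dir)
    return skill_dir / "rag_chunks.json"


converter = FakeConverter("output/react")  # example directory, an assumption

# Old code path: the attribute does not exist on the converter.
try:
    converter.output_dir
except AttributeError as exc:
    old_error = type(exc).__name__

# Swapping in the raw string alone would still fail: str does not support "/".
try:
    converter.skill_dir / "rag_chunks.json"
except TypeError as exc:
    str_error = type(exc).__name__

new_path = chunks_path_for(converter)
print(old_error, str_error, new_path.name)
```

Converting once and reusing the local `skill_dir` keeps both the `chunk_skill` call and the output path on the same `Path` object, which matches the shape of the change in the hunk above.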