yusyus
|
064405c052
|
fix: resolve 18 bugs and code quality issues across adaptors, CLI, and chunking pipeline
Bug fixes:
- Fix --var flag silently dropped in create routing (args.workflow_var → args.var)
- Fix double _score_code_quality() call in word scraper
- Add .docx file extension validation in WordToSkillConverter
- Fix weaviate ImportError masked by generic Exception handler
- Fix RAG chunking crash caused by referencing non-existent converter.output_dir
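A minimal sketch of the ImportError-masking pattern named in the weaviate fix above. It is not the adaptor's actual code: the function names, the returned dict shape, and the error messages are hypothetical and chosen only for illustration. The point is that a dedicated `except ImportError` clause has to run before the generic handler, otherwise a missing dependency is reported as an ordinary failure.

```python
import importlib


def load_client_masked(module_name: str) -> dict:
    """Anti-pattern: the broad handler swallows ImportError with everything else."""
    try:
        module = importlib.import_module(module_name)
        return {"ok": True, "module": module.__name__}
    except Exception as exc:  # a missing package looks like any other failure
        return {"ok": False, "error": f"Upload failed: {exc}"}


def load_client_explicit(module_name: str) -> dict:
    """Handle ImportError first so a missing dependency gets its own message."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Hypothetical wording; the real adaptor's message may differ.
        return {"ok": False, "error": f"Missing dependency: install '{module_name}'"}
    except Exception as exc:
        return {"ok": False, "error": f"Upload failed: {exc}"}
    return {"ok": True, "module": module.__name__}
```

With the explicit variant, a caller can tell "package not installed" apart from a genuine upload error without parsing the exception text.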
Chunking pipeline improvements:
- Wire --chunk-overlap-tokens through entire package pipeline
(package_skill → adaptor.package → format_skill_md → _maybe_chunk_content → RAGChunker)
- Add auto-scaling overlap: max(50, chunk_tokens//10) when chunk size is non-default
- Rename --no-preserve-code to --no-preserve-code-blocks (backward-compat alias kept)
- Replace hardcoded 512/50 chunk defaults with DEFAULT_CHUNK_TOKENS/DEFAULT_CHUNK_OVERLAP_TOKENS
constants across all 12 concrete adaptors, rag_chunker, base, and package_skill
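A rough sketch of the overlap rule and the flag alias described above. The constant names and the 512/50 defaults come from this commit message. `resolve_overlap`, `build_parser`, the `--chunk-tokens` flag, and the rule that an explicitly passed overlap wins over auto-scaling are assumptions made for illustration.

```python
import argparse
from typing import Optional

DEFAULT_CHUNK_TOKENS = 512
DEFAULT_CHUNK_OVERLAP_TOKENS = 50


def resolve_overlap(chunk_tokens: int, overlap_tokens: Optional[int] = None) -> int:
    """Pick the overlap: explicit value, auto-scaled value, or the default."""
    if overlap_tokens is not None:
        # Assumption: a user-supplied overlap is never auto-scaled.
        return overlap_tokens
    if chunk_tokens != DEFAULT_CHUNK_TOKENS:
        # Auto-scaling rule from the commit message: max(50, chunk_tokens // 10).
        return max(50, chunk_tokens // 10)
    return DEFAULT_CHUNK_OVERLAP_TOKENS


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing one way to keep the old flag as an alias."""
    parser = argparse.ArgumentParser(prog="package")
    parser.add_argument("--chunk-tokens", type=int, default=DEFAULT_CHUNK_TOKENS)
    parser.add_argument("--chunk-overlap-tokens", type=int, default=None)
    # Both spellings write to the same destination, so old scripts keep working.
    parser.add_argument(
        "--no-preserve-code-blocks",
        "--no-preserve-code",
        dest="preserve_code_blocks",
        action="store_false",
    )
    return parser
```

Under these assumptions, a chunk size of 1024 yields an overlap of 102, while 256 stays at the floor of 50.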
Code quality:
- Extract shared _generate_openai_embeddings() and _generate_st_embeddings() to SkillAdaptor
base class, removing ~150 lines of duplication from chroma/weaviate/pinecone
- Add Pinecone adaptor with full upload support (pinecone_adaptor.py)
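A simplified sketch of the base-class extraction described above. Only the class name `SkillAdaptor` and the method name `_generate_openai_embeddings` come from this commit message. The signature, the injected `embed_batch` callable (standing in for the real API client call), the batch size, and the subclass names are hypothetical.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a call to an embedding backend.
EmbedBatch = Callable[[Sequence[str]], List[List[float]]]


class SkillAdaptor:
    """Stand-in base class holding the batching logic once."""

    def _generate_openai_embeddings(
        self, texts: Sequence[str], embed_batch: EmbedBatch, batch_size: int = 100
    ) -> List[List[float]]:
        vectors: List[List[float]] = []
        # Send the texts in fixed-size batches and keep the original order.
        for start in range(0, len(texts), batch_size):
            vectors.extend(embed_batch(texts[start:start + batch_size]))
        return vectors


class ChromaAdaptor(SkillAdaptor):
    pass


class WeaviateAdaptor(SkillAdaptor):
    pass


class PineconeAdaptor(SkillAdaptor):
    pass
```

Because the subclasses define no override, each one resolves the method to the same function object on the base class, which is the property an inheritance test can assert.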
Tests (14 new):
- chunk_overlap_tokens parameter wiring, auto-scaling overlap, preserve_code_blocks flag
- .docx/.doc/no-extension file validation, --var flag routing E2E
- Embedding method inheritance verification, backward-compatible flag aliases
Docs:
- Update CHANGELOG, CLI_REFERENCE, API_REFERENCE, packaging guide (EN+ZH)
- Update README test count badge (1880+ → 2283+)
All 2283 tests passing, 8 skipped, 0 failures.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-02-28 21:57:59 +03:00 |
|
yusyus
|
c44b88e801
|
docs: update stale version numbers, MCP counts, and test counts across docs/
Version headers/footers updated to 3.1.0-dev:
- docs/features/BOOTSTRAP_SKILL_TECHNICAL.md (was 2.8.0-dev)
- docs/reference/API_REFERENCE.md (was 2.7.0)
- docs/reference/CODE_QUALITY.md (was 2.7.0)
- docs/guides/TESTING_GUIDE.md (was 2.7.0)
- docs/guides/MIGRATION_GUIDE.md (was 2.7.0, historical tables untouched)
MCP tool count 18 → 26:
- docs/guides/MCP_SETUP.md
- docs/guides/TESTING_GUIDE.md
- docs/reference/CODE_QUALITY.md
- docs/reference/CLAUDE_INTEGRATION.md
- docs/integrations/CLINE.md
- docs/strategy/INTEGRATION_STRATEGY.md
Test count 700+/1200+ → 1,880+:
- docs/guides/MCP_SETUP.md
- docs/guides/TESTING_GUIDE.md
- docs/reference/CODE_QUALITY.md
- docs/reference/CLAUDE_INTEGRATION.md
- docs/features/HOW_TO_GUIDES.md
- docs/blog/UNIVERSAL_RAG_PREPROCESSOR.md
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
2026-02-18 22:36:08 +03:00 |
|
yusyus
|
6f1d0a9a45
|
docs: Comprehensive markdown documentation update for v2.7.0
Documentation Overhaul (7 new files, ~4,750 lines)
Version Consistency Updates:
- Updated all version references to v2.7.0 (ROADMAP.md)
- Standardized test counts to 1200+ tests (README.md, Quality Assurance)
- Updated MCP tool references to 18 tools (CHANGELOG.md)
New Documentation Files:
1. docs/reference/API_REFERENCE.md (750 lines)
- Complete programmatic usage guide for Python integration
- All 8 core APIs documented with examples
- Configuration schema reference and error handling
- CI/CD integration examples (GitHub Actions, GitLab CI)
- Performance optimization and batch processing
2. docs/features/BOOTSTRAP_SKILL.md (450 lines)
- Self-hosting capability documentation (dogfooding)
- Architecture and workflow explanation (3 components)
- Troubleshooting and testing guide
- CI/CD integration examples
- Advanced usage and customization
3. docs/reference/CODE_QUALITY.md (550 lines)
- Comprehensive Ruff linting documentation
- All 21 v2.7.0 fixes explained with examples
- Testing requirements and coverage standards
- CI/CD integration (GitHub Actions, pre-commit hooks)
- Security scanning with Bandit
- Development workflow best practices
4. docs/guides/TESTING_GUIDE.md (750 lines)
- Complete testing reference (1200+ tests)
- Unit, integration, E2E, and MCP testing guides
- Coverage analysis and improvement strategies
- Debugging tests and troubleshooting
- CI/CD matrix testing (2 OS, 4 Python versions)
- Best practices and common patterns
5. docs/QUICK_REFERENCE.md (300 lines)
- One-page cheat sheet for quick lookup
- All CLI commands with examples
- Common workflows and shortcuts
- Environment variables and configurations
- Tips & tricks for power users
6. docs/guides/MIGRATION_GUIDE.md (400 lines)
- Version upgrade guides (v1.0.0 → v2.7.0)
- Breaking changes and migration steps
- Compatibility tables for all versions
- Rollback instructions
- Common migration issues and solutions
7. docs/FAQ.md (550 lines)
- Comprehensive Q&A covering all major topics
- Installation, usage, platforms, features
- Troubleshooting shortcuts
- Platform-specific questions
- Advanced usage and programmatic integration
Navigation Improvements:
- Added "New in v2.7.0" section to docs/README.md
- Integrated all new docs into navigation structure
- Enhanced "Finding What You Need" section with new entries
- Updated developer quick links (testing, code quality, API)
- Cross-referenced related documentation
Documentation Quality:
- All version references consistent (v2.7.0)
- Test counts standardized (1200+ tests)
- MCP tool counts accurate (18 tools)
- All internal links validated
- Format consistency maintained
- Proper heading hierarchy
Impact:
- 64 markdown files reviewed and validated
- 7 new documentation files created (~4,750 lines)
- 4 files updated (ROADMAP, README, CHANGELOG, docs/README)
- Comprehensive coverage of all v2.7.0 features
- Enhanced developer onboarding experience
- Improved user documentation accessibility
Related Issues:
- Addresses documentation gaps identified in v2.7.0 planning
- Supports code quality improvements (21 ruff fixes)
- Documents bootstrap skill feature
- Provides migration path for users upgrading from older versions
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
2026-01-18 01:16:22 +03:00 |
|