yusyus 7496c2b5e0 feat: unified document parser system with RST/Markdown/PDF support
Implements a unified parser architecture for extracting structured
content from multiple documentation formats, with feature parity
across formats and quality scoring.

Key Features:
- Unified Document structure for all formats (RST, Markdown, PDF)
- Enhanced RST parser: tables, cross-refs, directives, field lists
- Enhanced Markdown parser: tables, images, admonitions, quality scoring
- PDF parser wrapper: unified output while preserving all features
- Quality scoring system for code blocks and tables
- Format converters: to_markdown(), to_skill_format()
- Auto-detection of document formats
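
The auto-detection feature listed above could work roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual detector: the function name `detect_format`, the extension map, and the content heuristics are all hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path

# Hypothetical extension map; the real detector may use different rules.
_EXTENSION_FORMATS = {
    ".rst": "rst",
    ".md": "markdown",
    ".markdown": "markdown",
    ".pdf": "pdf",
}

# An RST directive such as ".. note::" or ".. code-block:: python".
_RST_DIRECTIVE = re.compile(r"^\.\. [\w-]+::", re.MULTILINE)
# An ATX Markdown heading such as "# Title".
_MARKDOWN_HEADING = re.compile(r"^#{1,6} ", re.MULTILINE)


def detect_format(filename: str, text: str = "") -> str:
    """Guess a document format from the extension, then from the content."""
    suffix = Path(filename).suffix.lower()
    if suffix in _EXTENSION_FORMATS:
        return _EXTENSION_FORMATS[suffix]
    # Fall back to content heuristics when the extension is not recognised.
    if _RST_DIRECTIVE.search(text):
        return "rst"
    if _MARKDOWN_HEADING.search(text):
        return "markdown"
    return "unknown"
```

Checking the extension first keeps the common case cheap; the content heuristics only run for files whose extension says nothing about their format.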

Architecture:
- BaseParser abstract class with format-specific implementations
- ContentBlock universal container with 12 block types
- 14 cross-reference types (including Godot-specific)
- Backward compatible with legacy parsers
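
To make the architecture above concrete, here is a minimal sketch of how an abstract `BaseParser`, a `ContentBlock` container, and a `to_markdown()` converter might fit together. Only those three names come from the commit message. The `BlockType` enum (with an illustrative subset of three types, not the 12 mentioned), the `Document` fields, the `parse()` signature, and the toy `PlainHeadingParser` are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import Enum


class BlockType(Enum):
    # Illustrative subset only; the commit does not list the 12 block types.
    HEADING = "heading"
    PARAGRAPH = "paragraph"
    CODE = "code"


@dataclass
class ContentBlock:
    """One unit of parsed content (hypothetical fields)."""
    type: BlockType
    content: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Document:
    """Format-independent parse result (hypothetical fields)."""
    title: str
    blocks: list = field(default_factory=list)

    def to_markdown(self) -> str:
        fence = "`" * 3  # built at runtime to avoid a literal fence here
        parts = []
        for block in self.blocks:
            if block.type is BlockType.HEADING:
                level = block.metadata.get("level", 1)
                parts.append("#" * level + " " + block.content)
            elif block.type is BlockType.CODE:
                lang = block.metadata.get("language", "")
                parts.append(f"{fence}{lang}\n{block.content}\n{fence}")
            else:
                parts.append(block.content)
        return "\n\n".join(parts)


class BaseParser(ABC):
    """Common interface that each format-specific parser implements."""

    @abstractmethod
    def parse(self, text: str) -> Document:
        ...


class PlainHeadingParser(BaseParser):
    """Toy parser: '# ' lines become headings, other lines paragraphs."""

    def parse(self, text: str) -> Document:
        blocks = []
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith("# "):
                blocks.append(ContentBlock(BlockType.HEADING, line[2:], {"level": 1}))
            else:
                blocks.append(ContentBlock(BlockType.PARAGRAPH, line))
        title = next((b.content for b in blocks if b.type is BlockType.HEADING), "")
        return Document(title, blocks)
```

Because every parser returns the same `Document` shape, a converter such as `to_markdown()` is written once and works for any input format.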

Integration:
- doc_scraper.py: Enhanced MarkdownParser with graceful fallback
- codebase_scraper.py: RstParser for .rst file processing
- Maintains backward compatibility with existing workflows

Test Coverage:
- 75 tests passing (up from 42)
- 37 comprehensive parser tests (RST, Markdown, auto-detection, quality)
- Proper pytest fixtures and assertions
- Zero critical warnings

Documentation:
- Complete architecture guide (docs/architecture/UNIFIED_PARSERS.md)
- Class hierarchy diagrams and usage examples
- Integration guide and extension patterns

Impact:
- Godot documentation extraction: 20% → 90% content coverage (+70 percentage points)
- Tables: 0 → ~3,000+ extracted
- Cross-references: 0 → ~50,000+ extracted
- Directives: 0 → ~5,000+ extracted
- All with quality scoring and validation

Files Changed:
- New: src/skill_seekers/cli/parsers/extractors/ (7 files, ~100KB)
- New: tests/test_unified_parsers.py (37 tests)
- New: docs/architecture/UNIFIED_PARSERS.md (12KB)
- Modified: doc_scraper.py (enhanced Markdown extraction)
- Modified: codebase_scraper.py (RST file processing)

Breaking Changes: None (backward compatible)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 23:14:49 +03:00

Skill Seekers Documentation

Welcome to the Skill Seekers documentation hub. This directory contains comprehensive documentation organized by category.

📚 Quick Navigation

🆕 New in v2.7.0

Recently Added Documentation:

🚀 Getting Started

New to Skill Seekers? Start here:

📖 User Guides

Essential guides for setup and daily usage:

Feature Documentation

Learn about core features and capabilities:

Core Features

AI Enhancement

PDF Features

🔌 Platform Integrations

Multi-LLM platform support:

📘 Reference Documentation

Technical reference and architecture:

📋 Planning & Design

Development plans and designs:

📦 Archive

Historical documentation and completed features:

🤝 Contributing

Want to contribute? See:

📝 Changelog

  • CHANGELOG - Version history and release notes

For Users

For Developers

API & Tools

🔍 Finding What You Need

I want to...

  • Get started quickly: Quick Reference or Quickstart Guide
  • Find quick answers: FAQ - Frequently asked questions
  • Use Skill Seekers programmatically: API Reference - Python integration
  • Set up MCP server: MCP Setup Guide
  • Run tests: Testing Guide - 1200+ tests
  • Understand code quality standards: Code Quality - Linting and CI/CD
  • Upgrade to new version: Migration Guide - Version upgrades
  • Scrape documentation: Usage Guide → Documentation Scraping
  • Scrape GitHub repos: Usage Guide → GitHub Scraping
  • Scrape PDFs: PDF Scraper
  • Combine multiple sources: Unified Scraping
  • Enhance my skill with AI: AI Enhancement
  • Upload to Google Gemini: Gemini Integration
  • Upload to ChatGPT: OpenAI Integration
  • Understand design patterns: Pattern Detection
  • Extract test examples: Test Example Extraction
  • Generate how-to guides: How-To Guides
  • Create self-documenting skill: Bootstrap Skill - Dogfooding
  • Fix an issue: Troubleshooting or FAQ
  • Contribute code: Contributing Guide and Code Quality

📢 Support


Documentation Version: 2.7.0
Last Updated: 2026-01-18
Status: Complete & Organized