feat: add 10 new skill source types (17 total) with full pipeline integration

Add Jupyter Notebook, Local HTML, OpenAPI/Swagger, AsciiDoc, PowerPoint,
RSS/Atom, Man Pages, Confluence, Notion, and Slack/Discord Chat as new
skill source types. Each type is fully integrated across:

- Standalone CLI commands (skill-seekers <type>)
- Auto-detection via 'skill-seekers create' (file extension + content sniffing)
- Unified multi-source configs (scraped_data, dispatch, config validation)
- Unified skill builder (generic merge + source-attributed synthesis)
- MCP server (scrape_generic tool with per-type flag mapping)
- pyproject.toml (entry points, optional deps, [all] group)

Also fixes: EPUB unified pipeline gap, missing word/video config validators,
OpenAPI yaml import guard, MCP flag mismatch for all 10 types, stale
docstrings, and adds 77 integration tests + complex-merge workflow.
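
The auto-detection step listed above (file extension first, then content sniffing) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the names `EXTENSION_TO_TYPE`, `sniff_content`, and `detect_source_type`, the type labels, and the specific markers checked are all hypothetical.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table; the real project may use other names and types.
EXTENSION_TO_TYPE = {
    ".ipynb": "jupyter",
    ".html": "html",
    ".htm": "html",
    ".adoc": "asciidoc",
    ".asciidoc": "asciidoc",
    ".pptx": "pptx",
}


def sniff_content(head: str) -> Optional[str]:
    """Guess a source type from the first characters of a file (assumed heuristics)."""
    lowered = head.lower()
    lines = [line.strip() for line in lowered.splitlines()]
    # JSON-style OpenAPI/Swagger documents carry a top-level version key.
    if '"openapi"' in lowered or '"swagger"' in lowered:
        return "openapi"
    # YAML-style OpenAPI/Swagger documents start a line with the same key.
    if any(line.startswith(("openapi:", "swagger:")) for line in lines):
        return "openapi"
    # RSS and Atom feeds are recognizable by their root element.
    if "<rss" in lowered or "<feed" in lowered:
        return "rss"
    return None


def detect_source_type(path: str, head: str = "") -> Optional[str]:
    """Try the file extension first, then fall back to content sniffing."""
    suffix = Path(path).suffix.lower()
    if suffix in EXTENSION_TO_TYPE:
        return EXTENSION_TO_TYPE[suffix]
    return sniff_content(head)
```

Ambiguous extensions such as `.yaml`, `.json`, and `.xml` are deliberately left out of the table in this sketch, so that those files fall through to the content check instead of being claimed by extension alone.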

50 files changed, +20,201 lines
yusyus
2026-03-15 15:30:15 +03:00
parent 64403a3686
commit 53b911b697
50 changed files with 20193 additions and 856 deletions


@@ -24,12 +24,12 @@ class TestParserRegistry:
     def test_all_parsers_registered(self):
         """Test that all parsers are registered."""
-        assert len(PARSERS) == 25, f"Expected 25 parsers, got {len(PARSERS)}"
+        assert len(PARSERS) == 35, f"Expected 35 parsers, got {len(PARSERS)}"
     def test_get_parser_names(self):
         """Test getting list of parser names."""
         names = get_parser_names()
-        assert len(names) == 25
+        assert len(names) == 35
         assert "scrape" in names
         assert "github" in names
         assert "package" in names
@@ -243,9 +243,9 @@ class TestBackwardCompatibility:
             assert cmd in names, f"Command '{cmd}' not found in parser registry!"
     def test_command_count_matches(self):
-        """Test that we have exactly 25 commands (includes create, workflows, word, epub, video, and sync-config)."""
-        assert len(PARSERS) == 25
-        assert len(get_parser_names()) == 25
+        """Test that we have exactly 35 commands (25 original + 10 new source types)."""
+        assert len(PARSERS) == 35
+        assert len(get_parser_names()) == 35
 if __name__ == "__main__":