feat: add EPUB input support (#310)

Adds EPUB as a first-class input source for skill generation.

- EpubToSkillConverter (epub_scraper.py, ~1200 lines) following PDF scraper pattern
- Dublin Core metadata, spine items, code blocks, tables, images extraction
- DRM detection (Adobe ADEPT, Apple FairPlay, Readium LCP) with fail-fast
- EPUB 3 NCX TOC bug workaround (ignore_ncx=True)
- ebooklib as optional dep: pip install skill-seekers[epub]
- Wired into create command with .epub auto-detection
- 104 tests, all passing

Review fixes: removed 3 empty test stubs, fixed SVG double-counting in
_extract_images(), added logger.debug to bare except pass.

Based on PR #310 by @christianbaumann.
Co-authored-by: Christian Baumann <mail@chriss-baumann.de>
Author: yusyus
Date: 2026-03-15 02:34:41 +03:00
Committed by: GitHub
Parent: 83b9a695ba
Commit: 2e30970dfb
16 changed files with 4502 additions and 9 deletions


@@ -5,6 +5,18 @@ All notable changes to Skill Seeker will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- **EPUB (.epub) input support** via `skill-seekers create book.epub` or `skill-seekers epub --epub book.epub`
- Extracts chapters, metadata (Dublin Core), code blocks, images, and tables from EPUB 2 and EPUB 3 files
- DRM detection with clear error messages (Adobe ADEPT, Apple FairPlay, Readium LCP)
- Font obfuscation correctly identified as non-DRM
- EPUB 3 TOC bug workaround (`ignore_ncx` option)
- `--help-epub` flag for EPUB-specific help
- Optional dependency: `pip install "skill-seekers[epub]"` (ebooklib)
- 107 tests across 14 test classes
## [3.2.0] - 2026-03-01
**Theme:** Video source support, Word document support, Pinecone adaptor, and quality improvements. 94 files changed, +23,500 lines since v3.1.3. **2,540 tests passing.**


@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## 🎯 Project Overview
**Skill Seekers** is the **universal documentation preprocessor** for AI systems. It transforms documentation websites, GitHub repositories, PDFs, and EPUBs into production-ready formats for **16+ platforms**: RAG pipelines (LangChain, LlamaIndex, Haystack), vector databases (Pinecone, Chroma, Weaviate, FAISS, Qdrant), AI coding assistants (Cursor, Windsurf, Cline, Continue.dev), and LLM platforms (Claude, Gemini, OpenAI).
**Current Version:** v3.1.3
**Python Version:** 3.10+ required
@@ -222,6 +222,7 @@ src/skill_seekers/
│ ├── dependency_analyzer.py # Dependency graph analysis
│ ├── signal_flow_analyzer.py # C3.10 Signal flow analysis (Godot)
│ ├── pdf_scraper.py # PDF extraction
│ ├── epub_scraper.py # EPUB extraction
│ └── adaptors/ # ⭐ Platform adaptor pattern
│ ├── __init__.py # Factory: get_adaptor()
│ ├── base_adaptor.py # Abstract base
@@ -397,7 +398,7 @@ The unified CLI modifies `sys.argv` and calls existing `main()` functions to mai
# Transforms to: doc_scraper.main() with modified sys.argv
```
**Subcommands:** create, scrape, github, pdf, epub, unified, codebase, enhance, enhance-status, package, upload, estimate, install, install-agent, patterns, how-to-guides
### NEW: Unified `create` Command
@@ -409,6 +410,7 @@ skill-seekers create https://docs.react.dev/ # → Web scraping
skill-seekers create facebook/react # → GitHub analysis
skill-seekers create ./my-project # → Local codebase
skill-seekers create tutorial.pdf # → PDF extraction
skill-seekers create book.epub # → EPUB extraction
skill-seekers create configs/react.json # → Multi-source
# Progressive help system # Progressive help system
@@ -417,6 +419,7 @@ skill-seekers create --help-web # Shows web-specific options
skill-seekers create --help-github # Shows GitHub-specific options
skill-seekers create --help-local # Shows local analysis options
skill-seekers create --help-pdf # Shows PDF extraction options
skill-seekers create --help-epub # Shows EPUB extraction options
skill-seekers create --help-advanced # Shows advanced/rare options
skill-seekers create --help-all # Shows all 120+ flags
@@ -685,6 +688,7 @@ pytest tests/ -v -m ""
- `test_unified.py` - Multi-source scraping
- `test_github_scraper.py` - GitHub analysis
- `test_pdf_scraper.py` - PDF extraction
- `test_epub_scraper.py` - EPUB extraction
- `test_install_multiplatform.py` - Multi-platform packaging
- `test_integration.py` - End-to-end workflows
- `test_install_skill.py` - One-command install
@@ -741,6 +745,7 @@ skill-seekers-resume = "skill_seekers.cli.resume_command:main" #
skill-seekers-scrape = "skill_seekers.cli.doc_scraper:main"
skill-seekers-github = "skill_seekers.cli.github_scraper:main"
skill-seekers-pdf = "skill_seekers.cli.pdf_scraper:main"
skill-seekers-epub = "skill_seekers.cli.epub_scraper:main"
skill-seekers-unified = "skill_seekers.cli.unified_scraper:main"
skill-seekers-codebase = "skill_seekers.cli.codebase_scraper:main" # C2.x Local codebase analysis
skill-seekers-enhance = "skill_seekers.cli.enhance_skill_local:main"
@@ -1754,6 +1759,7 @@ This section helps you quickly locate the right files when implementing common c
| GitHub scraping | `src/skill_seekers/cli/github_scraper.py` | ~56KB | Repo analysis + metadata |
| GitHub API | `src/skill_seekers/cli/github_fetcher.py` | ~17KB | Rate limit handling |
| PDF extraction | `src/skill_seekers/cli/pdf_scraper.py` | Medium | PyMuPDF + OCR |
| EPUB extraction | `src/skill_seekers/cli/epub_scraper.py` | Medium | ebooklib + BeautifulSoup |
| Code analysis | `src/skill_seekers/cli/code_analyzer.py` | ~65KB | Multi-language AST parsing |
| Pattern detection | `src/skill_seekers/cli/pattern_recognizer.py` | Medium | C3.1 - 10 GoF patterns |
| Test extraction | `src/skill_seekers/cli/test_example_extractor.py` | Medium | C3.2 - 5 categories |
@@ -1777,7 +1783,7 @@ This section helps you quickly locate the right files when implementing common c
2. **Arguments:** `src/skill_seekers/cli/arguments/create.py`
- Three tiers of arguments:
- `UNIVERSAL_ARGUMENTS` (13 flags) - Work for all sources
- Source-specific dicts (`WEB_ARGUMENTS`, `GITHUB_ARGUMENTS`, `EPUB_ARGUMENTS`, etc.)
- `ADVANCED_ARGUMENTS` - Rare/advanced options
- `add_create_arguments(parser, mode)` - Multi-mode argument addition

File diff suppressed because it is too large


@@ -0,0 +1,271 @@
---
date: 2026-03-14T12:54:24.700367+00:00
git_commit: 7c90a4b9c9bccac8341b0769550d77aae3b4e524
branch: development
topic: "What files would be affected to add .epub support for input"
tags: [research, codebase, epub, input-format, scraper]
status: complete
---
# Research: What files would be affected to add .epub support for input
## Research Question
What files would be affected to add .epub support for input.
## Summary
Adding `.epub` input support follows an established pattern already used for PDF and Word (.docx) formats. The codebase has a consistent multi-layer architecture for document input formats: source detection, argument definitions, parser registration, create command routing, standalone scraper module, and tests. Based on analysis of the existing PDF and Word implementations, **12+ existing files would need modification** and **4 new files would need to be created**.
## Detailed Findings
### New Files to Create (4 files)
| File | Purpose |
|------|---------|
| `src/skill_seekers/cli/epub_scraper.py` | Core EPUB extraction and skill building logic (analog: `word_scraper.py` at ~750 lines) |
| `src/skill_seekers/cli/arguments/epub.py` | EPUB-specific argument definitions (analog: `arguments/word.py`) |
| `src/skill_seekers/cli/parsers/epub_parser.py` | Subcommand parser class (analog: `parsers/word_parser.py`) |
| `tests/test_epub_scraper.py` | Test suite (analog: `test_word_scraper.py` at ~750 lines, 130+ tests) |
### Existing Files to Modify (12+ files)
#### 1. Source Detection Layer
**`src/skill_seekers/cli/source_detector.py`** (3 locations)
- **`SourceDetector.detect()`** (line ~60): Add `.epub` extension check, following the `.docx` pattern at line 63-64:
```python
if source.endswith(".epub"):
return cls._detect_epub(source)
```
- **New method `_detect_epub()`**: Add detection method (following `_detect_word()` at lines 124-129):
```python
@classmethod
def _detect_epub(cls, source: str) -> SourceInfo:
name = os.path.splitext(os.path.basename(source))[0]
return SourceInfo(
type="epub", parsed={"file_path": source}, suggested_name=name, raw_input=source
)
```
- **`validate_source()`** (line ~250): Add epub validation block (following the word block at lines 273-278)
- **Error message** (line ~94): Add EPUB example to the `ValueError` help text
#### 2. CLI Dispatcher
**`src/skill_seekers/cli/main.py`** (2 locations)
- **`COMMAND_MODULES` dict** (line ~46): Add epub entry:
```python
"epub": "skill_seekers.cli.epub_scraper",
```
- **Module docstring** (line ~1): Add `epub` to the commands list
#### 3. Create Command Routing
**`src/skill_seekers/cli/create_command.py`** (3 locations)
- **`_route_to_scraper()`** (line ~121): Add `elif self.source_info.type == "epub":` routing case
- **New `_route_epub()` method**: Following the `_route_word()` pattern at lines 331-352:
```python
def _route_epub(self) -> int:
from skill_seekers.cli import epub_scraper
argv = ["epub_scraper"]
file_path = self.source_info.parsed["file_path"]
argv.extend(["--epub", file_path])
self._add_common_args(argv)
# epub-specific args here
...
```
- **`main()` epilog** (line ~537): Add EPUB example and source auto-detection entry
- **Progressive help** (line ~590): Add `--help-epub` flag and handler block
#### 4. Argument Definitions
**`src/skill_seekers/cli/arguments/create.py`** (4 locations)
- **New `EPUB_ARGUMENTS` dict** (~line 401): Define epub-specific arguments (e.g., `--epub` file path flag), following the `WORD_ARGUMENTS` pattern at lines 402-411
- **`get_source_specific_arguments()`** (line 595): Add `"epub": EPUB_ARGUMENTS` to the `source_args` dict
- **`add_create_arguments()`** (line 676): Add epub mode block:
```python
if mode in ["epub", "all"]:
for arg_name, arg_def in EPUB_ARGUMENTS.items():
parser.add_argument(*arg_def["flags"], **arg_def["kwargs"])
```
#### 5. Parser Registration
**`src/skill_seekers/cli/parsers/__init__.py`** (2 locations)
- **Import** (line ~15): Add `from .epub_parser import EpubParser`
- **`PARSERS` list** (line ~46): Add `EpubParser()` entry (near `WordParser()` and `PDFParser()`)
#### 6. Package Configuration
**`pyproject.toml`** (3 locations)
- **`[project.optional-dependencies]`** (line ~111): Add `epub` optional dependency group:
```toml
epub = [
"ebooklib>=0.18",
]
```
- **`all` optional dependency group** (line ~178): Add epub dependency to the combined `all` group
- **`[project.scripts]`** (line ~224): Add standalone entry point:
```toml
skill-seekers-epub = "skill_seekers.cli.epub_scraper:main"
```
#### 7. Argument Commons
**`src/skill_seekers/cli/arguments/common.py`**
- No changes required: `add_all_standard_arguments()` is imported and used as-is by the new `arguments/epub.py`
#### 8. Documentation / Configuration
**`CLAUDE.md`** (2 locations)
- **Commands section**: Add `epub` to the list of subcommands
- **Key source files table**: Add `epub_scraper.py` entry
**`CONTRIBUTING.md`** — Potentially update with epub format mention
**`CHANGELOG.md`** — New feature entry
### Files NOT Affected
These files do **not** need changes:
- **`unified_scraper.py`** — Multi-source configs could add epub support later but it's not required for basic input support
- **Platform adaptors** (`adaptors/*.py`) — Adaptors work on the output side (packaging), not input
- **Enhancement system** (`enhance_skill.py`, `enhance_skill_local.py`) — Works generically on SKILL.md
- **MCP server** (`mcp/server_fastmcp.py`) — Operates on completed skills
- **`pdf_extractor_poc.py`** — PDF-specific extraction; epub needs its own extractor
## Code References
### Pattern to Follow (Word .docx implementation)
- `src/skill_seekers/cli/word_scraper.py:1-750` — Full scraper with `WordToSkillConverter` class
- `src/skill_seekers/cli/arguments/word.py:1-75` — Argument definitions with `add_word_arguments()`
- `src/skill_seekers/cli/parsers/word_parser.py:1-33` — Parser class extending `SubcommandParser`
- `tests/test_word_scraper.py:1-750` — Comprehensive test suite with 130+ tests
### Key Integration Points
- `src/skill_seekers/cli/source_detector.py:57-65` — File extension detection order
- `src/skill_seekers/cli/source_detector.py:124-129` — `_detect_word()` method (template for `_detect_epub()`)
- `src/skill_seekers/cli/create_command.py:121-143` — `_route_to_scraper()` dispatch
- `src/skill_seekers/cli/create_command.py:331-352` — `_route_word()` (template for `_route_epub()`)
- `src/skill_seekers/cli/arguments/create.py:401-411` — `WORD_ARGUMENTS` dict (template)
- `src/skill_seekers/cli/arguments/create.py:595-604` — `get_source_specific_arguments()` mapping
- `src/skill_seekers/cli/arguments/create.py:676-678` — `add_create_arguments()` mode handling
- `src/skill_seekers/cli/parsers/__init__.py:35-59` — `PARSERS` registry list
- `src/skill_seekers/cli/main.py:46-70` — `COMMAND_MODULES` dict
- `pyproject.toml:111-115` — Optional dependency group pattern (docx)
- `pyproject.toml:213-246` — Script entry points
### Data Flow Architecture
The epub scraper would follow the same three-step pipeline as Word/PDF:
1. **Extract** — Parse `.epub` file → sections with text, headings, code, images → save to `output/{name}_extracted.json`
2. **Categorize** — Group sections by chapters/keywords
3. **Build** — Generate `SKILL.md`, `references/*.md`, `references/index.md`, `assets/`
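Step 2 has no analog shown in this document; a minimal keyword-grouping sketch (the function name and keyword map are illustrative, not taken from the codebase):

```python
def categorize_sections(
    sections: list[dict], keyword_map: dict[str, list[str]]
) -> dict[str, list[dict]]:
    """Group extracted sections into categories by keyword match on the heading.

    Sections matching no category fall into "misc". A sketch of the
    keyword-grouping idea in step 2; the real categorizer may differ.
    """
    categories: dict[str, list[dict]] = {name: [] for name in keyword_map}
    categories["misc"] = []
    for section in sections:
        heading = section.get("heading", "").lower()
        for name, keywords in keyword_map.items():
            if any(kw in heading for kw in keywords):
                categories[name].append(section)
                break
        else:
            categories["misc"].append(section)
    return categories
```

For example, `categorize_sections([{"heading": "Installation Guide"}], {"setup": ["install"]})` puts that section under `"setup"` and leaves `"misc"` empty.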
The intermediate JSON format uses the same structure as Word/PDF:
```python
{
"source_file": str,
"metadata": {"title", "author", "created", ...},
"total_sections": int,
"total_code_blocks": int,
"total_images": int,
"languages_detected": {str: int},
"pages": [ # sections
{
"section_number": int,
"heading": str,
"text": str,
"code_samples": [...],
"images": [...],
"headings": [...]
}
]
}
```
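Since EPUB content documents are XHTML, the per-section parsing in step 1 can be sketched even with the stdlib parser (the real scraper would use BeautifulSoup per the analysis above; the class and field names here are illustrative):

```python
from html.parser import HTMLParser

class ChapterParser(HTMLParser):
    """Collect headings, body text, and <pre>/<code> samples from one
    XHTML content document. A stdlib stand-in for the BeautifulSoup
    parsing the scraper would actually use."""

    def __init__(self) -> None:
        super().__init__()
        self.headings: list[str] = []
        self.code_samples: list[str] = []
        self.text_parts: list[str] = []
        self._stack: list[str] = []  # currently open tags

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        if not data.strip():
            return
        if any(t in self._stack for t in ("h1", "h2", "h3")):
            self.headings.append(data.strip())
        elif any(t in self._stack for t in ("pre", "code")):
            self.code_samples.append(data)
        else:
            self.text_parts.append(data.strip())

parser = ChapterParser()
parser.feed("<h1>Setup</h1><p>Install it.</p><pre>pip install x</pre>")
```

One `ChapterParser` per spine item would yield the `heading`/`text`/`code_samples` fields of the intermediate JSON above.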
## Architecture Documentation
### Document Input Format Pattern
Each input format follows a consistent architecture:
```
[source_detector.py] → detect type by extension
[create_command.py] → route to scraper
[{format}_scraper.py] → extract → categorize → build skill
[output/{name}/] → SKILL.md + references/ + assets/
```
Supporting files per format:
- `arguments/{format}.py` — CLI argument definitions
- `parsers/{format}_parser.py` — Subcommand parser class
- `tests/test_{format}_scraper.py` — Test suite
### Dependency Guard Pattern
The Word scraper uses an optional dependency guard that epub should replicate:
```python
try:
import ebooklib
from ebooklib import epub
EPUB_AVAILABLE = True
except ImportError:
EPUB_AVAILABLE = False
def _check_epub_deps():
if not EPUB_AVAILABLE:
raise RuntimeError(
"ebooklib is required for EPUB support.\n"
'Install with: pip install "skill-seekers[epub]"\n'
"Or: pip install ebooklib"
)
```
## Summary Table
| Category | Files | Action |
|----------|-------|--------|
| New files | 4 | Create from scratch |
| Source detection | 1 | Add epub detection + validation |
| CLI dispatcher | 1 | Add command module mapping |
| Create command | 1 | Add routing + help + examples |
| Arguments | 1 | Add EPUB_ARGUMENTS + register in helpers |
| Parser registry | 1 | Import + register EpubParser |
| Package config | 1 | Add deps + entry point |
| Documentation | 2+ | Update CLAUDE.md, CHANGELOG |
| **Total** | **12+ modified, 4 new** | |
## Open Questions
- Should epub support reuse any of the existing HTML parsing from `word_scraper.py` (which uses mammoth to convert to HTML then parses with BeautifulSoup)? EPUB internally contains XHTML files, so BeautifulSoup parsing would be directly applicable.
- Should the epub scraper support DRM-protected files, or only DRM-free epub files?
- Should epub-specific arguments include options like `--chapter-range` (similar to PDF's `--pages`)?


@@ -114,6 +114,11 @@ docx = [
"python-docx>=1.1.0", "python-docx>=1.1.0",
] ]
# EPUB (.epub) support
epub = [
"ebooklib>=0.18",
]
# Video processing (lightweight: YouTube transcripts + metadata)
video = [
"yt-dlp>=2024.12.0",
@@ -178,6 +183,7 @@ embedding = [
all = [
"mammoth>=1.6.0",
"python-docx>=1.1.0",
"ebooklib>=0.18",
"yt-dlp>=2024.12.0", "yt-dlp>=2024.12.0",
"youtube-transcript-api>=1.2.0", "youtube-transcript-api>=1.2.0",
"mcp>=1.25,<2", "mcp>=1.25,<2",
@@ -222,6 +228,7 @@ skill-seekers-scrape = "skill_seekers.cli.doc_scraper:main"
skill-seekers-github = "skill_seekers.cli.github_scraper:main"
skill-seekers-pdf = "skill_seekers.cli.pdf_scraper:main"
skill-seekers-word = "skill_seekers.cli.word_scraper:main"
skill-seekers-epub = "skill_seekers.cli.epub_scraper:main"
skill-seekers-video = "skill_seekers.cli.video_scraper:main"
skill-seekers-unified = "skill_seekers.cli.unified_scraper:main"
skill-seekers-enhance = "skill_seekers.cli.enhance_command:main"


@@ -410,6 +410,18 @@ WORD_ARGUMENTS: dict[str, dict[str, Any]] = {
},
}
# EPUB specific (from epub.py)
EPUB_ARGUMENTS: dict[str, dict[str, Any]] = {
"epub": {
"flags": ("--epub",),
"kwargs": {
"type": str,
"help": "EPUB file path",
"metavar": "PATH",
},
},
}
# Video specific (from video.py)
VIDEO_ARGUMENTS: dict[str, dict[str, Any]] = {
"video_url": {
@@ -598,6 +610,7 @@ def get_source_specific_arguments(source_type: str) -> dict[str, dict[str, Any]]
"local": LOCAL_ARGUMENTS, "local": LOCAL_ARGUMENTS,
"pdf": PDF_ARGUMENTS, "pdf": PDF_ARGUMENTS,
"word": WORD_ARGUMENTS, "word": WORD_ARGUMENTS,
"epub": EPUB_ARGUMENTS,
"video": VIDEO_ARGUMENTS, "video": VIDEO_ARGUMENTS,
"config": CONFIG_ARGUMENTS, "config": CONFIG_ARGUMENTS,
} }
@@ -636,6 +649,7 @@ def add_create_arguments(parser: argparse.ArgumentParser, mode: str = "default")
- 'local': Universal + local-specific
- 'pdf': Universal + pdf-specific
- 'word': Universal + word-specific
- 'epub': Universal + epub-specific
- 'video': Universal + video-specific
- 'advanced': Advanced/rare arguments
- 'all': All 120+ arguments
@@ -677,6 +691,10 @@ def add_create_arguments(parser: argparse.ArgumentParser, mode: str = "default")
for arg_name, arg_def in WORD_ARGUMENTS.items():
parser.add_argument(*arg_def["flags"], **arg_def["kwargs"])
if mode in ["epub", "all"]:
for arg_name, arg_def in EPUB_ARGUMENTS.items():
parser.add_argument(*arg_def["flags"], **arg_def["kwargs"])
if mode in ["video", "all"]: if mode in ["video", "all"]:
for arg_name, arg_def in VIDEO_ARGUMENTS.items(): for arg_name, arg_def in VIDEO_ARGUMENTS.items():
parser.add_argument(*arg_def["flags"], **arg_def["kwargs"]) parser.add_argument(*arg_def["flags"], **arg_def["kwargs"])


@@ -0,0 +1,66 @@
"""EPUB command argument definitions.
This module defines ALL arguments for the epub command in ONE place.
Both epub_scraper.py (standalone) and parsers/epub_parser.py (unified CLI)
import and use these definitions.
Shared arguments (name, description, output, enhance-level, api-key,
dry-run, verbose, quiet, workflow args) come from common.py / workflow.py
via ``add_all_standard_arguments()``.
"""
import argparse
from typing import Any
from .common import add_all_standard_arguments
# EPUB-specific argument definitions as data structure
# NOTE: Shared args (name, description, output, enhance_level, api_key, dry_run,
# verbose, quiet, workflow args) are registered by add_all_standard_arguments().
EPUB_ARGUMENTS: dict[str, dict[str, Any]] = {
"epub": {
"flags": ("--epub",),
"kwargs": {
"type": str,
"help": "Direct EPUB file path",
"metavar": "PATH",
},
},
"from_json": {
"flags": ("--from-json",),
"kwargs": {
"type": str,
"help": "Build skill from extracted JSON",
"metavar": "FILE",
},
},
}
def add_epub_arguments(parser: argparse.ArgumentParser) -> None:
"""Add all epub command arguments to a parser.
Registers shared args (name, description, output, enhance-level, api-key,
dry-run, verbose, quiet, workflow args) via add_all_standard_arguments(),
then adds EPUB-specific args on top.
The default for --enhance-level is overridden to 0 (disabled) for EPUB.
"""
# Shared universal args first
add_all_standard_arguments(parser)
# Override enhance-level default to 0 for EPUB
for action in parser._actions:
if hasattr(action, "dest") and action.dest == "enhance_level":
action.default = 0
action.help = (
"AI enhancement level (auto-detects API vs LOCAL mode): "
"0=disabled (default for EPUB), 1=SKILL.md only, 2=+architecture/config, 3=full enhancement. "
"Mode selection: uses API if ANTHROPIC_API_KEY is set, otherwise LOCAL (Claude Code)"
)
# EPUB-specific args
for arg_name, arg_def in EPUB_ARGUMENTS.items():
flags = arg_def["flags"]
kwargs = arg_def["kwargs"]
parser.add_argument(*flags, **kwargs)


@@ -134,6 +134,8 @@ class CreateCommand:
return self._route_pdf()
elif self.source_info.type == "word":
return self._route_word()
elif self.source_info.type == "epub":
return self._route_epub()
elif self.source_info.type == "video": elif self.source_info.type == "video":
return self._route_video() return self._route_video()
elif self.source_info.type == "config": elif self.source_info.type == "config":
@@ -351,6 +353,29 @@ class CreateCommand:
finally:
sys.argv = original_argv
def _route_epub(self) -> int:
"""Route to EPUB scraper (epub_scraper.py)."""
from skill_seekers.cli import epub_scraper
# Reconstruct argv for epub_scraper
argv = ["epub_scraper"]
# Add EPUB file
file_path = self.source_info.parsed["file_path"]
argv.extend(["--epub", file_path])
# Add universal arguments
self._add_common_args(argv)
# Call epub_scraper with modified argv
logger.debug(f"Calling epub_scraper with argv: {argv}")
original_argv = sys.argv
try:
sys.argv = argv
return epub_scraper.main()
finally:
sys.argv = original_argv
def _route_video(self) -> int:
"""Route to video scraper (video_scraper.py)."""
from skill_seekers.cli import video_scraper
@@ -541,6 +566,7 @@ Examples:
Local: skill-seekers create ./my-project -p comprehensive
PDF: skill-seekers create tutorial.pdf --ocr
DOCX: skill-seekers create document.docx
EPUB: skill-seekers create ebook.epub
Video: skill-seekers create https://youtube.com/watch?v=...
Video: skill-seekers create recording.mp4
Config: skill-seekers create configs/react.json
@@ -551,6 +577,7 @@ Source Auto-Detection:
• ./path → local codebase
• file.pdf → PDF extraction
• file.docx → Word document extraction
• file.epub → EPUB extraction
• youtube.com/... → Video transcript extraction
• file.mp4 → Video file extraction
• file.json → multi-source config
@@ -560,6 +587,7 @@ Progressive Help (13 → 120+ flags):
--help-github GitHub repository options
--help-local Local codebase analysis
--help-pdf PDF extraction options
--help-epub EPUB extraction options
--help-video Video extraction options
--help-advanced Rare/advanced options
--help-all All options + compatibility
@@ -591,6 +619,9 @@ Common Workflows:
parser.add_argument(
"--help-word", action="store_true", help=argparse.SUPPRESS, dest="_help_word"
)
parser.add_argument(
"--help-epub", action="store_true", help=argparse.SUPPRESS, dest="_help_epub"
)
parser.add_argument(
"--help-video", action="store_true", help=argparse.SUPPRESS, dest="_help_video"
)
@@ -652,6 +683,15 @@ Common Workflows:
add_create_arguments(parser_word, mode="word")
parser_word.print_help()
return 0
elif args._help_epub:
parser_epub = argparse.ArgumentParser(
prog="skill-seekers create",
description="Create skill from EPUB e-book (.epub)",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
add_create_arguments(parser_epub, mode="epub")
parser_epub.print_help()
return 0
elif args._help_video:
parser_video = argparse.ArgumentParser(
prog="skill-seekers create",

File diff suppressed because it is too large


@@ -13,6 +13,7 @@ Commands:
github Scrape GitHub repository
pdf Extract from PDF file
word Extract from Word (.docx) file
epub Extract from EPUB e-book (.epub)
video Extract from video (YouTube or local)
unified Multi-source scraping (docs + GitHub + PDF)
analyze Analyze local codebase and extract code knowledge
@@ -50,6 +51,7 @@ COMMAND_MODULES = {
"github": "skill_seekers.cli.github_scraper", "github": "skill_seekers.cli.github_scraper",
"pdf": "skill_seekers.cli.pdf_scraper", "pdf": "skill_seekers.cli.pdf_scraper",
"word": "skill_seekers.cli.word_scraper", "word": "skill_seekers.cli.word_scraper",
"epub": "skill_seekers.cli.epub_scraper",
"video": "skill_seekers.cli.video_scraper", "video": "skill_seekers.cli.video_scraper",
"unified": "skill_seekers.cli.unified_scraper", "unified": "skill_seekers.cli.unified_scraper",
"enhance": "skill_seekers.cli.enhance_command", "enhance": "skill_seekers.cli.enhance_command",


@@ -13,6 +13,7 @@ from .scrape_parser import ScrapeParser
from .github_parser import GitHubParser
from .pdf_parser import PDFParser
from .word_parser import WordParser
from .epub_parser import EpubParser
from .video_parser import VideoParser
from .unified_parser import UnifiedParser
from .enhance_parser import EnhanceParser
@@ -45,6 +46,7 @@ PARSERS = [
EnhanceStatusParser(),
PDFParser(),
WordParser(),
EpubParser(),
VideoParser(),
UnifiedParser(),
EstimateParser(),


@@ -0,0 +1,32 @@
+"""EPUB subcommand parser.
+
+Uses shared argument definitions from arguments.epub to ensure
+consistency with the standalone epub_scraper module.
+"""
+
+from .base import SubcommandParser
+from skill_seekers.cli.arguments.epub import add_epub_arguments
+
+
+class EpubParser(SubcommandParser):
+    """Parser for epub subcommand."""
+
+    @property
+    def name(self) -> str:
+        return "epub"
+
+    @property
+    def help(self) -> str:
+        return "Extract from EPUB e-book (.epub)"
+
+    @property
+    def description(self) -> str:
+        return "Extract content from EPUB e-book (.epub) and generate skill"
+
+    def add_arguments(self, parser):
+        """Add epub-specific arguments.
+
+        Uses shared argument definitions to ensure consistency
+        with epub_scraper.py (standalone scraper).
+        """
+        add_epub_arguments(parser)
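The new parser class only declares metadata and delegates argument wiring. To illustrate how such a class plugs into argparse subparsers, here is a self-contained sketch; the base-class plumbing and the `--epub` flag shown are assumptions for illustration, not the project's actual code:

```python
import argparse

class EpubParserSketch:
    """Stand-in for EpubParser; attribute names mirror the diff."""
    name = "epub"
    help = "Extract from EPUB e-book (.epub)"

    def add_arguments(self, parser: argparse.ArgumentParser) -> None:
        # Stand-in for add_epub_arguments(parser); the real flag set differs.
        parser.add_argument("--epub", required=True, help="Path to .epub file")

root = argparse.ArgumentParser(prog="skill-seekers")
sub = root.add_subparsers(dest="command")
p = EpubParserSketch()
p.add_arguments(sub.add_parser(p.name, help=p.help))

args = root.parse_args(["epub", "--epub", "book.epub"])
print(args.command, args.epub)  # → epub book.epub
```

Sharing one argument-definition function between the subcommand and the standalone scraper (as the docstring notes) keeps the two entry points from drifting apart.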


@@ -63,6 +63,9 @@ class SourceDetector:
        if source.endswith(".docx"):
            return cls._detect_word(source)
+        if source.endswith(".epub"):
+            return cls._detect_epub(source)
        # Video file extensions
        VIDEO_EXTENSIONS = (".mp4", ".mkv", ".avi", ".mov", ".webm", ".flv", ".wmv")
        if source.lower().endswith(VIDEO_EXTENSIONS):
@@ -99,6 +102,7 @@ class SourceDetector:
            " Local: skill-seekers create ./my-project\n"
            " PDF: skill-seekers create tutorial.pdf\n"
            " DOCX: skill-seekers create document.docx\n"
+            " EPUB: skill-seekers create ebook.epub\n"
            " Video: skill-seekers create https://youtube.com/watch?v=...\n"
            " Video: skill-seekers create recording.mp4\n"
            " Config: skill-seekers create configs/react.json"
@@ -128,6 +132,14 @@ class SourceDetector:
            type="word", parsed={"file_path": source}, suggested_name=name, raw_input=source
        )

+    @classmethod
+    def _detect_epub(cls, source: str) -> SourceInfo:
+        """Detect EPUB file source."""
+        name = os.path.splitext(os.path.basename(source))[0]
+        return SourceInfo(
+            type="epub", parsed={"file_path": source}, suggested_name=name, raw_input=source
+        )
+
    @classmethod
    def _detect_video_file(cls, source: str) -> SourceInfo:
        """Detect local video file source."""
@@ -277,6 +289,13 @@ class SourceDetector:
        if not os.path.isfile(file_path):
            raise ValueError(f"Path is not a file: {file_path}")
+        elif source_info.type == "epub":
+            file_path = source_info.parsed["file_path"]
+            if not os.path.exists(file_path):
+                raise ValueError(f"EPUB file does not exist: {file_path}")
+            if not os.path.isfile(file_path):
+                raise ValueError(f"Path is not a file: {file_path}")
        elif source_info.type == "video":
            if source_info.parsed.get("source_kind") == "file":
                file_path = source_info.parsed["file_path"]
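Read together, the hunks above split the work into extension-based detection (derive a skill name from the file name) and fail-fast validation (reject missing or non-file paths). A condensed, runnable sketch of that flow; the dict return is a hypothetical stand-in for the project's `SourceInfo` dataclass:

```python
import os

def detect_epub(source: str) -> dict:
    """Sketch of _detect_epub: the skill name comes from the file's base name."""
    name = os.path.splitext(os.path.basename(source))[0]
    return {"type": "epub", "parsed": {"file_path": source}, "suggested_name": name}

def validate_epub(info: dict) -> None:
    """Sketch of the validation hunk: fail fast before any parsing happens."""
    file_path = info["parsed"]["file_path"]
    if not os.path.exists(file_path):
        raise ValueError(f"EPUB file does not exist: {file_path}")
    if not os.path.isfile(file_path):
        raise ValueError(f"Path is not a file: {file_path}")

info = detect_epub("books/clean-code.epub")
print(info["suggested_name"])  # → clean-code
```

Validating in the detector (rather than deep inside the scraper) means `skill-seekers create typo.epub` errors immediately with a path-specific message.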


@@ -24,12 +24,12 @@ class TestParserRegistry:
    def test_all_parsers_registered(self):
        """Test that all parsers are registered."""
-        assert len(PARSERS) == 24, f"Expected 24 parsers, got {len(PARSERS)}"
+        assert len(PARSERS) == 25, f"Expected 25 parsers, got {len(PARSERS)}"

    def test_get_parser_names(self):
        """Test getting list of parser names."""
        names = get_parser_names()
-        assert len(names) == 24
+        assert len(names) == 25
        assert "scrape" in names
        assert "github" in names
        assert "package" in names
@@ -243,9 +243,9 @@ class TestBackwardCompatibility:
            assert cmd in names, f"Command '{cmd}' not found in parser registry!"

    def test_command_count_matches(self):
-        """Test that we have exactly 24 commands (includes create, workflows, word, video, and sync-config commands)."""
-        assert len(PARSERS) == 24
-        assert len(get_parser_names()) == 24
+        """Test that we have exactly 25 commands (includes create, workflows, word, epub, video, and sync-config)."""
+        assert len(PARSERS) == 25
+        assert len(get_parser_names()) == 25

if __name__ == "__main__":

tests/test_epub_scraper.py (new file, 1626 lines)

File diff suppressed because it is too large.

uv.lock (generated, 28 lines changed)

@@ -1078,6 +1078,19 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/b0/0d/9feae160378a3553fa9a339b0e9c1a048e147a4127210e286ef18b730f03/durationpy-0.10-py3-none-any.whl", hash = "sha256:3b41e1b601234296b4fb368338fdcd3e13e0b4fb5b67345948f4f2bf9868b286", size = 3922, upload-time = "2025-05-17T13:52:36.463Z" },
]
+[[package]]
+name = "ebooklib"
+version = "0.20"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "lxml" },
+    { name = "six" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/77/85/322e8882a582d4b707220d1929cfb74c125f2ba513991edbce40dbc462de/ebooklib-0.20.tar.gz", hash = "sha256:35e2f9d7d39907be8d39ae2deb261b19848945903ae3dbb6577b187ead69e985", size = 127066, upload-time = "2025-10-26T20:56:20.968Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/bf/ee/aa015c5de8b0dc42a8e507eae8c2de5d1c0e068c896858fec6d502402ed6/ebooklib-0.20-py3-none-any.whl", hash = "sha256:fff5322517a37e31c972d27be7d982cc3928c16b3dcc5fd7e8f7c0f5d7bcf42b", size = 40995, upload-time = "2025-10-26T20:56:19.104Z" },
+]
+
[[package]]
name = "exceptiongroup"
version = "1.3.1"
@@ -5609,6 +5622,7 @@ all = [
    { name = "azure-storage-blob" },
    { name = "boto3" },
    { name = "chromadb" },
+    { name = "ebooklib" },
    { name = "fastapi" },
    { name = "google-cloud-storage" },
    { name = "google-generativeai" },
@@ -5657,6 +5671,9 @@ embedding = [
    { name = "uvicorn" },
    { name = "voyageai" },
]
+epub = [
+    { name = "ebooklib" },
+]
gcs = [
    { name = "google-cloud-storage" },
]
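The `epub` extra above keeps ebooklib out of the base install, so the scraper presumably guards its import and points users at the extra. A sketch of that common pattern, not the project's actual guard code:

```python
# Guarded optional import: the standard pattern for extras-gated dependencies.
try:
    import ebooklib  # noqa: F401  # present only after: pip install "skill-seekers[epub]"
    HAS_EPUB = True
except ImportError:
    HAS_EPUB = False

def require_epub() -> None:
    """Fail fast with an actionable message when the optional extra is missing."""
    if not HAS_EPUB:
        raise ImportError(
            'EPUB support requires ebooklib. Install with: pip install "skill-seekers[epub]"'
        )
```

Checking the flag once at import time keeps every EPUB entry point's error message consistent.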
@@ -5737,6 +5754,8 @@ requires-dist = [
    { name = "chromadb", marker = "extra == 'chroma'", specifier = ">=0.4.0" },
    { name = "chromadb", marker = "extra == 'rag-upload'", specifier = ">=0.4.0" },
    { name = "click", specifier = ">=8.3.0" },
+    { name = "ebooklib", marker = "extra == 'all'", specifier = ">=0.18" },
+    { name = "ebooklib", marker = "extra == 'epub'", specifier = ">=0.18" },
    { name = "fastapi", marker = "extra == 'all'", specifier = ">=0.109.0" },
    { name = "fastapi", marker = "extra == 'embedding'", specifier = ">=0.109.0" },
    { name = "faster-whisper", marker = "extra == 'video-full'", specifier = ">=1.0.0" },
@@ -5808,7 +5827,7 @@ requires-dist = [
    { name = "yt-dlp", marker = "extra == 'video'", specifier = ">=2024.12.0" },
    { name = "yt-dlp", marker = "extra == 'video-full'", specifier = ">=2024.12.0" },
]
-provides-extras = ["mcp", "gemini", "openai", "all-llms", "s3", "gcs", "azure", "docx", "video", "video-full", "chroma", "weaviate", "sentence-transformers", "pinecone", "rag-upload", "all-cloud", "embedding", "all"]
+provides-extras = ["mcp", "gemini", "openai", "all-llms", "s3", "gcs", "azure", "docx", "epub", "video", "video-full", "chroma", "weaviate", "sentence-transformers", "pinecone", "rag-upload", "all-cloud", "embedding", "all"]

[package.metadata.requires-dev]
dev = [
@@ -6165,6 +6184,13 @@ wheels = [
    { url = "https://files.pythonhosted.org/packages/0f/8b/4b61d6e13f7108f36910df9ab4b58fd389cc2520d54d81b88660804aad99/torch-2.10.0-2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:418997cb02d0a0f1497cf6a09f63166f9f5df9f3e16c8a716ab76a72127c714f", size = 79423467, upload-time = "2026-02-10T21:44:48.711Z" },
    { url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202, upload-time = "2026-02-10T21:44:52.603Z" },
    { url = "https://files.pythonhosted.org/packages/ec/23/2c9fe0c9c27f7f6cb865abcea8a4568f29f00acaeadfc6a37f6801f84cb4/torch-2.10.0-2-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:e521c9f030a3774ed770a9c011751fb47c4d12029a3d6522116e48431f2ff89e", size = 79498254, upload-time = "2026-02-10T21:44:44.095Z" },
+    { url = "https://files.pythonhosted.org/packages/16/ee/efbd56687be60ef9af0c9c0ebe106964c07400eade5b0af8902a1d8cd58c/torch-2.10.0-3-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:a1ff626b884f8c4e897c4c33782bdacdff842a165fee79817b1dd549fdda1321", size = 915510070, upload-time = "2026-03-11T14:16:39.386Z" },
+    { url = "https://files.pythonhosted.org/packages/36/ab/7b562f1808d3f65414cd80a4f7d4bb00979d9355616c034c171249e1a303/torch-2.10.0-3-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:ac5bdcbb074384c66fa160c15b1ead77839e3fe7ed117d667249afce0acabfac", size = 915518691, upload-time = "2026-03-11T14:15:43.147Z" },
+    { url = "https://files.pythonhosted.org/packages/b3/7a/abada41517ce0011775f0f4eacc79659bc9bc6c361e6bfe6f7052a6b9363/torch-2.10.0-3-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:98c01b8bb5e3240426dcde1446eed6f40c778091c8544767ef1168fc663a05a6", size = 915622781, upload-time = "2026-03-11T14:17:11.354Z" },
+    { url = "https://files.pythonhosted.org/packages/ab/c6/4dfe238342ffdcec5aef1c96c457548762d33c40b45a1ab7033bb26d2ff2/torch-2.10.0-3-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:80b1b5bfe38eb0e9f5ff09f206dcac0a87aadd084230d4a36eea5ec5232c115b", size = 915627275, upload-time = "2026-03-11T14:16:11.325Z" },
+    { url = "https://files.pythonhosted.org/packages/d8/f0/72bf18847f58f877a6a8acf60614b14935e2f156d942483af1ffc081aea0/torch-2.10.0-3-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:46b3574d93a2a8134b3f5475cfb98e2eb46771794c57015f6ad1fb795ec25e49", size = 915523474, upload-time = "2026-03-11T14:17:44.422Z" },
+    { url = "https://files.pythonhosted.org/packages/f4/39/590742415c3030551944edc2ddc273ea1fdfe8ffb2780992e824f1ebee98/torch-2.10.0-3-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:b1d5e2aba4eb7f8e87fbe04f86442887f9167a35f092afe4c237dfcaaef6e328", size = 915632474, upload-time = "2026-03-11T14:15:13.666Z" },
+    { url = "https://files.pythonhosted.org/packages/b6/8e/34949484f764dde5b222b7fe3fede43e4a6f0da9d7f8c370bb617d629ee2/torch-2.10.0-3-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:0228d20b06701c05a8f978357f657817a4a63984b0c90745def81c18aedfa591", size = 915523882, upload-time = "2026-03-11T14:14:46.311Z" },
    { url = "https://files.pythonhosted.org/packages/0c/1a/c61f36cfd446170ec27b3a4984f072fd06dab6b5d7ce27e11adb35d6c838/torch-2.10.0-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:5276fa790a666ee8becaffff8acb711922252521b28fbce5db7db5cf9cb2026d", size = 145992962, upload-time = "2026-01-21T16:24:14.04Z" },
    { url = "https://files.pythonhosted.org/packages/b5/60/6662535354191e2d1555296045b63e4279e5a9dbad49acf55a5d38655a39/torch-2.10.0-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:aaf663927bcd490ae971469a624c322202a2a1e68936eb952535ca4cd3b90444", size = 915599237, upload-time = "2026-01-21T16:23:25.497Z" },
    { url = "https://files.pythonhosted.org/packages/40/b8/66bbe96f0d79be2b5c697b2e0b187ed792a15c6c4b8904613454651db848/torch-2.10.0-cp310-cp310-win_amd64.whl", hash = "sha256:a4be6a2a190b32ff5c8002a0977a25ea60e64f7ba46b1be37093c141d9c49aeb", size = 113720931, upload-time = "2026-01-21T16:24:23.743Z" },