feat: add skill-seekers video --setup for GPU auto-detection and dependency installation

Auto-detects NVIDIA (CUDA), AMD (ROCm), or CPU-only GPU and installs the
correct PyTorch variant + easyocr + all visual extraction dependencies.
Removes easyocr from video-full pip extras to avoid pulling ~2GB of wrong
CUDA packages on non-NVIDIA systems.

New files:
- video_setup.py (835 lines): GPU detection, PyTorch install, ROCm config,
  venv checks, system dep validation, module selection, verification
- test_video_setup.py (60 tests): Full coverage of detection, install, verify

Updated docs: CHANGELOG, AGENTS.md, CLAUDE.md, README.md, CLI_REFERENCE,
FAQ, TROUBLESHOOTING, installation guide, video dependency plan

All 2523 tests passing (15 skipped).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
yusyus
2026-03-01 18:39:16 +03:00
parent 12bc29ab36
commit cc9cc32417
24 changed files with 2439 additions and 508 deletions

View File

@@ -59,6 +59,24 @@ Each platform has a dedicated adaptor for optimal formatting and upload.
**Recommendation:** Use LOCAL mode for free AI enhancement or skip enhancement entirely.
### How do I set up video extraction?
**Quick setup:**
```bash
# 1. Install video support
pip install skill-seekers[video-full]
# 2. Auto-detect GPU and install visual deps
skill-seekers video --setup
```
The `--setup` command auto-detects your GPU vendor (NVIDIA CUDA, AMD ROCm, or CPU-only) and installs the correct PyTorch variant along with easyocr and other visual extraction dependencies. This avoids the ~2GB NVIDIA CUDA download that would happen if easyocr were installed via pip on non-NVIDIA systems.
**What it detects:**
- **NVIDIA:** Uses `nvidia-smi` to find CUDA version → installs matching `cu124`/`cu121`/`cu118` PyTorch
- **AMD:** Uses `rocminfo` to find ROCm version → installs matching ROCm PyTorch
- **CPU-only:** Installs lightweight CPU-only PyTorch
### How long does it take to create a skill?
**Typical Times:**

View File

@@ -90,6 +90,35 @@ pyenv install 3.12
pyenv global 3.12
```
### Issue: Video Visual Dependencies Missing
**Symptoms:**
```
Missing video dependencies: easyocr
RuntimeError: Required video visual dependencies not installed
```
**Solutions:**
```bash
# Run the GPU-aware setup command
skill-seekers video --setup
# This auto-detects your GPU and installs:
# - PyTorch (correct CUDA/ROCm/CPU variant)
# - easyocr, opencv, pytesseract, scenedetect, faster-whisper
# - yt-dlp, youtube-transcript-api
# Verify installation
python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
python -c "import easyocr; print('easyocr OK')"
```
**Common issues:**
- Running outside a virtual environment → `--setup` will warn you; create a venv first
- Missing system packages → Install `tesseract-ocr` and `ffmpeg` for your OS
- AMD GPU without ROCm → Install ROCm first, then re-run `--setup`
## Configuration Issues
### Issue: API Keys Not Recognized

View File

@@ -124,10 +124,14 @@ pip install skill-seekers[dev]
| `gcs` | Google Cloud Storage | `pip install skill-seekers[gcs]` |
| `azure` | Azure Blob Storage | `pip install skill-seekers[azure]` |
| `embedding` | Embedding server | `pip install skill-seekers[embedding]` |
| `video` | YouTube/video transcript extraction | `pip install skill-seekers[video]` |
| `video-full` | + Whisper transcription, scene detection | `pip install skill-seekers[video-full]` |
| `all-llms` | All LLM platforms | `pip install skill-seekers[all-llms]` |
| `all` | Everything | `pip install skill-seekers[all]` |
| `dev` | Development tools | `pip install skill-seekers[dev]` |
> **Video visual deps:** After installing `skill-seekers[video-full]`, run `skill-seekers video --setup` to auto-detect your GPU (NVIDIA/AMD/CPU) and install the correct PyTorch variant + easyocr.
---
## Post-Installation Setup

View File

@@ -113,7 +113,7 @@ All languages supported by:
| [`04_VIDEO_INTEGRATION.md`](./04_VIDEO_INTEGRATION.md) | CLI, config, source detection, unified scraper integration |
| [`05_VIDEO_OUTPUT.md`](./05_VIDEO_OUTPUT.md) | Output structure, SKILL.md integration, reference file format |
| [`06_VIDEO_TESTING.md`](./06_VIDEO_TESTING.md) | Test strategy, mocking, fixtures, CI considerations |
| [`07_VIDEO_DEPENDENCIES.md`](./07_VIDEO_DEPENDENCIES.md) | Dependency tiers, optional installs, system requirements |
| [`07_VIDEO_DEPENDENCIES.md`](./07_VIDEO_DEPENDENCIES.md) | Dependency tiers, optional installs, system requirements**IMPLEMENTED** (`video_setup.py`, GPU auto-detection, `--setup`) |
---

View File

@@ -4,6 +4,15 @@
**Document:** 07 of 07
**Status:** Planning
> **Status: IMPLEMENTED** — `skill-seekers video --setup` (see `video_setup.py`, 835 lines, 60 tests)
> - GPU auto-detection: NVIDIA (nvidia-smi/CUDA), AMD (rocminfo/ROCm), CPU fallback
> - Correct PyTorch index URL selection per GPU vendor
> - EasyOCR removed from pip extras, installed at runtime via --setup
> - ROCm configuration (MIOPEN_FIND_MODE, HSA_OVERRIDE_GFX_VERSION)
> - Virtual environment detection with --force override
> - System dependency checks (tesseract, ffmpeg)
> - Non-interactive mode for MCP/CI usage
---
## Table of Contents

View File

@@ -32,6 +32,7 @@
- [unified](#unified) - Multi-source scraping
- [update](#update) - Incremental updates
- [upload](#upload) - Upload to platform
- [video](#video) - Video extraction & setup
- [workflows](#workflows) - Manage workflow presets
- [Common Workflows](#common-workflows)
- [Exit Codes](#exit-codes)
@@ -1035,6 +1036,44 @@ skill-seekers upload output/react-weaviate.zip --target weaviate \
---
### video
Extract skills from video tutorials (YouTube, Vimeo, or local files).
### Usage
```bash
# Setup (first time — auto-detects GPU, installs PyTorch + visual deps)
skill-seekers video --setup
# Extract from YouTube
skill-seekers video --url https://www.youtube.com/watch?v=VIDEO_ID --name my-skill
# With visual frame extraction (requires --setup first)
skill-seekers video --url VIDEO_URL --name my-skill --visual
# Local video file
skill-seekers video --url /path/to/video.mp4 --name my-skill
```
### Key Flags
| Flag | Description |
|------|-------------|
| `--setup` | Auto-detect GPU and install visual extraction dependencies |
| `--url URL` | Video URL (YouTube, Vimeo) or local file path |
| `--name NAME` | Skill name for output |
| `--visual` | Enable visual frame extraction (OCR on keyframes) |
| `--vision-api` | Use Claude Vision API as OCR fallback for low-confidence frames |
### Notes
- `--setup` detects NVIDIA (CUDA), AMD (ROCm), or CPU-only and installs the correct PyTorch variant
- Requires `pip install skill-seekers[video]` (transcripts) or `skill-seekers[video-full]` (+ whisper + scene detection)
- EasyOCR is NOT included in pip extras — it is installed by `--setup` with the correct GPU backend
---
### workflows
Manage enhancement workflow presets.