feat: add skill-seekers video --setup for GPU auto-detection and dependency installation

Auto-detects the GPU vendor (NVIDIA CUDA, AMD ROCm, or CPU-only) and installs the correct PyTorch variant + easyocr + all visual extraction dependencies. Removes easyocr from video-full pip extras to avoid pulling ~2GB of wrong CUDA packages on non-NVIDIA systems.

New files:
- video_setup.py (835 lines): GPU detection, PyTorch install, ROCm config, venv checks, system dep validation, module selection, verification
- test_video_setup.py (60 tests): Full coverage of detection, install, verify

Updated docs: CHANGELOG, AGENTS.md, CLAUDE.md, README.md, CLI_REFERENCE, FAQ, TROUBLESHOOTING, installation guide, video dependency plan

All 2523 tests passing (15 skipped).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 18:39:16 +03:00
parent 12bc29ab36
commit cc9cc32417
24 changed files with 2439 additions and 508 deletions
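
The detection flow summarized in the commit message (look for `nvidia-smi`, then `rocminfo`, otherwise fall back to CPU, then pick a matching PyTorch wheel tag) can be sketched roughly as follows. This is an illustrative sketch only, not the contents of `video_setup.py`: the function names, the `CUDA_TAGS` table, and the rule "newest tag not newer than the driver's CUDA version" are all assumptions. The only details taken from the diff are the tool names and the `cu124`/`cu121`/`cu118` tags.

```python
import re
import shutil
from typing import Callable, Optional, Tuple

# Hypothetical table: minimum driver CUDA version -> PyTorch wheel tag.
# The tags are the ones named in the FAQ; the thresholds are an assumption.
CUDA_TAGS = [((12, 4), "cu124"), ((12, 1), "cu121"), ((11, 8), "cu118")]


def detect_vendor(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Guess the GPU vendor from which vendor tool is on PATH (assumed order)."""
    if which("nvidia-smi"):
        return "nvidia"
    if which("rocminfo"):
        return "amd"
    return "cpu"


def parse_cuda_version(smi_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def pick_cuda_tag(version: Tuple[int, int]) -> Optional[str]:
    """Return the newest wheel tag whose minimum version the driver satisfies."""
    for minimum, tag in CUDA_TAGS:
        if version >= minimum:
            return tag
    return None  # driver too old for any listed tag; a caller might fall back to CPU


def pytorch_index_url(tag: str) -> str:
    """Build a PyTorch wheel index URL for a tag such as 'cu124' or 'cpu'."""
    return f"https://download.pytorch.org/whl/{tag}"
```

Passing `which` as a parameter keeps the vendor check testable on machines without either tool installed. A caller would typically run `nvidia-smi` via `subprocess.run(..., capture_output=True, text=True)` and feed its standard output to `parse_cuda_version`.
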
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -59,6 +59,24 @@ Each platform has a dedicated adaptor for optimal formatting and upload.

 **Recommendation:** Use LOCAL mode for free AI enhancement or skip enhancement entirely.

+### How do I set up video extraction?
+
+**Quick setup:**
+```bash
+# 1. Install video support
+pip install skill-seekers[video-full]
+
+# 2. Auto-detect GPU and install visual deps
+skill-seekers video --setup
+```
+
+The `--setup` command auto-detects your GPU vendor (NVIDIA CUDA, AMD ROCm, or CPU-only) and installs the correct PyTorch variant along with easyocr and other visual extraction dependencies. This avoids the ~2GB NVIDIA CUDA download that would happen if easyocr were installed via pip on non-NVIDIA systems.
+
+**What it detects:**
+- **NVIDIA:** Uses `nvidia-smi` to find CUDA version → installs matching `cu124`/`cu121`/`cu118` PyTorch
+- **AMD:** Uses `rocminfo` to find ROCm version → installs matching ROCm PyTorch
+- **CPU-only:** Installs lightweight CPU-only PyTorch
+
 ### How long does it take to create a skill?

 **Typical Times:**
--- a/docs/TROUBLESHOOTING.md
+++ b/docs/TROUBLESHOOTING.md
@@ -90,6 +90,35 @@ pyenv install 3.12
 pyenv global 3.12
 ```

+### Issue: Video Visual Dependencies Missing
+
+**Symptoms:**
+```
+Missing video dependencies: easyocr
+RuntimeError: Required video visual dependencies not installed
+```
+
+**Solutions:**
+
+```bash
+# Run the GPU-aware setup command
+skill-seekers video --setup
+
+# This auto-detects your GPU and installs:
+# - PyTorch (correct CUDA/ROCm/CPU variant)
+# - easyocr, opencv, pytesseract, scenedetect, faster-whisper
+# - yt-dlp, youtube-transcript-api
+
+# Verify installation (ROCm builds of PyTorch also report the GPU through torch.cuda)
+python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
+python -c "import easyocr; print('easyocr OK')"
+```
+
+**Common issues:**
+- Running outside a virtual environment → `--setup` will warn you; create a venv first
+- Missing system packages → Install `tesseract-ocr` and `ffmpeg` for your OS
+- AMD GPU without ROCm → Install ROCm first, then re-run `--setup`
+
 ## Configuration Issues

 ### Issue: API Keys Not Recognized
--- a/docs/getting-started/01-installation.md
+++ b/docs/getting-started/01-installation.md
@@ -124,10 +124,14 @@ pip install skill-seekers[dev]
 | `gcs` | Google Cloud Storage | `pip install skill-seekers[gcs]` |
 | `azure` | Azure Blob Storage | `pip install skill-seekers[azure]` |
 | `embedding` | Embedding server | `pip install skill-seekers[embedding]` |
+| `video` | YouTube/video transcript extraction | `pip install skill-seekers[video]` |
+| `video-full` | + Whisper transcription, scene detection | `pip install skill-seekers[video-full]` |
 | `all-llms` | All LLM platforms | `pip install skill-seekers[all-llms]` |
 | `all` | Everything | `pip install skill-seekers[all]` |
 | `dev` | Development tools | `pip install skill-seekers[dev]` |

+> **Video visual deps:** After installing `skill-seekers[video-full]`, run `skill-seekers video --setup` to auto-detect your GPU (NVIDIA/AMD/CPU) and install the correct PyTorch variant + easyocr.
+
 ---

 ## Post-Installation Setup
--- a/docs/plans/video/00_VIDEO_SOURCE_OVERVIEW.md
+++ b/docs/plans/video/00_VIDEO_SOURCE_OVERVIEW.md
@@ -113,7 +113,7 @@ All languages supported by:
 | [`04_VIDEO_INTEGRATION.md`](./04_VIDEO_INTEGRATION.md) | CLI, config, source detection, unified scraper integration |
 | [`05_VIDEO_OUTPUT.md`](./05_VIDEO_OUTPUT.md) | Output structure, SKILL.md integration, reference file format |
 | [`06_VIDEO_TESTING.md`](./06_VIDEO_TESTING.md) | Test strategy, mocking, fixtures, CI considerations |
-| [`07_VIDEO_DEPENDENCIES.md`](./07_VIDEO_DEPENDENCIES.md) | Dependency tiers, optional installs, system requirements |
+| [`07_VIDEO_DEPENDENCIES.md`](./07_VIDEO_DEPENDENCIES.md) | Dependency tiers, optional installs, system requirements — **IMPLEMENTED** (`video_setup.py`, GPU auto-detection, `--setup`) |

 ---

--- a/docs/plans/video/07_VIDEO_DEPENDENCIES.md
+++ b/docs/plans/video/07_VIDEO_DEPENDENCIES.md
@@ -4,6 +4,15 @@
 **Document:** 07 of 07
 **Status:** Planning

+> **Status: IMPLEMENTED** — `skill-seekers video --setup` (see `video_setup.py`, 835 lines, 60 tests)
+> - GPU auto-detection: NVIDIA (nvidia-smi/CUDA), AMD (rocminfo/ROCm), CPU fallback
+> - Correct PyTorch index URL selection per GPU vendor
+> - EasyOCR removed from pip extras, installed at runtime via --setup
+> - ROCm configuration (MIOPEN_FIND_MODE, HSA_OVERRIDE_GFX_VERSION)
+> - Virtual environment detection with --force override
+> - System dependency checks (tesseract, ffmpeg)
+> - Non-interactive mode for MCP/CI usage
+
 ---

 ## Table of Contents
--- a/docs/reference/CLI_REFERENCE.md
+++ b/docs/reference/CLI_REFERENCE.md
@@ -32,6 +32,7 @@
  - [unified](#unified) - Multi-source scraping
  - [update](#update) - Incremental updates
  - [upload](#upload) - Upload to platform
+  - [video](#video) - Video extraction & setup
  - [workflows](#workflows) - Manage workflow presets
 - [Common Workflows](#common-workflows)
 - [Exit Codes](#exit-codes)
@@ -1035,6 +1036,44 @@ skill-seekers upload output/react-weaviate.zip --target weaviate \

 ---

+### video
+
+Extract skills from video tutorials (YouTube, Vimeo, or local files).
+
+### Usage
+
+```bash
+# Setup (first time — auto-detects GPU, installs PyTorch + visual deps)
+skill-seekers video --setup
+
+# Extract from YouTube
+skill-seekers video --url https://www.youtube.com/watch?v=VIDEO_ID --name my-skill
+
+# With visual frame extraction (requires --setup first)
+skill-seekers video --url VIDEO_URL --name my-skill --visual
+
+# Local video file
+skill-seekers video --url /path/to/video.mp4 --name my-skill
+```
+
+### Key Flags
+
+| Flag | Description |
+|------|-------------|
+| `--setup` | Auto-detect GPU and install visual extraction dependencies |
+| `--url URL` | Video URL (YouTube, Vimeo) or local file path |
+| `--name NAME` | Skill name for output |
+| `--visual` | Enable visual frame extraction (OCR on keyframes) |
+| `--vision-api` | Use Claude Vision API as OCR fallback for low-confidence frames |
+
+### Notes
+
+- `--setup` detects NVIDIA (CUDA), AMD (ROCm), or CPU-only and installs the correct PyTorch variant
+- Requires `pip install skill-seekers[video]` (transcripts) or `skill-seekers[video-full]` (+ whisper + scene detection)
+- EasyOCR is NOT included in pip extras — it is installed by `--setup` with the correct GPU backend
+
+---
+
 ### workflows

 Manage enhancement workflow presets.