feat: add video tutorial scraping pipeline with per-panel OCR and AI enhancement

Add a complete video tutorial extraction system that converts YouTube videos
and local video files into AI-consumable skills. The pipeline extracts
transcripts, runs OCR independently on each code editor panel, tracks how
code evolves across frames, and generates structured SKILL.md output.

Key features:
- Video metadata extraction (YouTube, local files, playlists)
- Multi-source transcript extraction (YouTube API, yt-dlp, Whisper fallback)
- Chapter-based and time-window segmentation
- Visual extraction: keyframe detection, frame classification, panel detection
- Per-panel sub-section OCR (each IDE panel OCR'd independently)
- Parallel OCR with ThreadPoolExecutor for multi-panel frames
- Narrow panel filtering (300px min width) to skip UI chrome
- Text block tracking with spatial panel position matching
- Code timeline with edit tracking across frames
- Audio-visual alignment (code + narrator pairs)
- Video-specific AI enhancement prompt for OCR denoising and code reconstruction
- video-tutorial.yaml workflow with 4 stages (OCR cleanup, language detection,
  tutorial synthesis, skill polish)
- CLI integration: skill-seekers video --url/--video-file/--playlist
- MCP tool: scrape_video for automation
- 161 tests passing
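
Illustrative sketch (not the code in this commit) of how the narrow panel
filter and the parallel per-panel OCR from the list above could fit together.
`Panel`, `ocr_panels`, and `fake_ocr` are hypothetical names, the OCR backend
is passed in as a plain callable, and treating the 300px threshold as
inclusive is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

MIN_PANEL_WIDTH = 300  # px; value taken from the feature list, inclusive by assumption


@dataclass(frozen=True)
class Panel:
    """Bounding box of one detected IDE panel (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int


def ocr_panels(
    panels: list[Panel],
    ocr_fn: Callable[[Panel], str],
    min_width: int = MIN_PANEL_WIDTH,
    max_workers: int = 4,
) -> list[tuple[Panel, str]]:
    """OCR every sufficiently wide panel independently and in parallel."""
    # Drop narrow panels (gutters, icon bars) before spending OCR time on them.
    wide = [p for p in panels if p.width >= min_width]
    if not wide:
        return []
    # Executor.map returns results in input order, so each text stays paired
    # with the panel it came from.
    with ThreadPoolExecutor(max_workers=min(max_workers, len(wide))) as pool:
        texts = list(pool.map(ocr_fn, wide))
    return list(zip(wide, texts))


def fake_ocr(panel: Panel) -> str:
    """Stand-in for a real OCR call; only echoes the panel origin."""
    return f"text@{panel.x},{panel.y}"


panels = [
    Panel(0, 0, 48, 1080),      # icon bar: filtered out as too narrow
    Panel(48, 0, 1200, 1080),   # editor
    Panel(1248, 0, 672, 1080),  # terminal
]
results = ocr_panels(panels, fake_ocr)
```

A thread pool suits this step when the OCR call spends most of its time in
native code or I/O; a CPU-bound pure-Python backend would likely need a
process pool instead.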

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
YusufKaraaslanSpyke
2026-02-27 23:10:19 +03:00
parent 3bad7cf365
commit 62071c4aa9
32 changed files with 15090 additions and 9 deletions

@@ -98,6 +98,7 @@ try:
        scrape_docs_impl,
        scrape_github_impl,
        scrape_pdf_impl,
        scrape_video_impl,
        # Splitting tools
        split_config_impl,
        submit_config_impl,
@@ -420,6 +421,55 @@ async def scrape_pdf(
    return str(result)


@safe_tool_decorator(
    description="Extract transcripts and metadata from videos (YouTube, Vimeo, local files) and build Claude skill."
)
async def scrape_video(
    url: str | None = None,
    video_file: str | None = None,
    playlist: str | None = None,
    name: str | None = None,
    description: str | None = None,
    languages: str | None = None,
    from_json: str | None = None,
) -> str:
    """
    Scrape video content and build Claude skill.

    Args:
        url: Video URL (YouTube, Vimeo)
        video_file: Local video file path
        playlist: Playlist URL
        name: Skill name
        description: Skill description
        languages: Transcript language preferences (comma-separated)
        from_json: Build from extracted JSON file

    Returns:
        Video scraping results with file paths.
    """
    args = {}
    if url:
        args["url"] = url
    if video_file:
        args["video_file"] = video_file
    if playlist:
        args["playlist"] = playlist
    if name:
        args["name"] = name
    if description:
        args["description"] = description
    if languages:
        args["languages"] = languages
    if from_json:
        args["from_json"] = from_json

    result = await scrape_video_impl(args)
    if isinstance(result, list) and result:
        return result[0].text if hasattr(result[0], "text") else str(result[0])
    return str(result)


@safe_tool_decorator(
    description="Analyze local codebase and extract code knowledge. Walks directory tree, analyzes code files, extracts signatures, docstrings, and optionally generates API reference documentation and dependency graphs."
)