diff --git a/CATALOG.md b/CATALOG.md index f26d395a..7210a41e 100644 --- a/CATALOG.md +++ b/CATALOG.md @@ -1,8 +1,8 @@ # Skill Catalog -Generated at: 2026-02-04T08:08:20.870Z +Generated at: 2026-02-04T20:37:02.333Z -Total skills: 631 +Total skills: 633 ## architecture (62) @@ -125,6 +125,7 @@ Total skills: 631 | `analytics-tracking` | Design, audit, and improve analytics tracking systems that produce reliable, decision-ready data. Use when the user wants to set up, fix, or evaluate analyti... | analytics, tracking | analytics, tracking, audit, improve, produce, reliable, decision, data, user, wants, set, up | | `angular-ui-patterns` | Modern Angular UI patterns for loading states, error handling, and data display. Use when building UI components, handling async data, or managing component ... | angular, ui | angular, ui, loading, states, error, handling, data, display, building, components, async, managing | | `api-documenter` | Master API documentation with OpenAPI 3.1, AI-powered tools, and modern developer experience practices. Create interactive docs, generate SDKs, and build com... | api, documenter | api, documenter, documentation, openapi, ai, powered, developer, experience, interactive, docs, generate, sdks | +| `audio-transcriber` | Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration | audio, transcription, whisper, meeting-minutes, speech-to-text | audio, transcription, whisper, meeting-minutes, speech-to-text, transcriber, transform, recordings, professional, markdown, documentation, intelligent | | `autonomous-agent-patterns` | Design patterns for building autonomous coding agents. Covers tool integration, permission systems, browser automation, and human-in-the-loop workflows. Use ... 
| autonomous, agent | autonomous, agent, building, coding, agents, covers, integration, permission, browser, automation, human, loop | | `autonomous-agents` | Autonomous agents are AI systems that can independently decompose goals, plan actions, execute tools, and self-correct without constant human guidance. The c... | autonomous, agents | autonomous, agents, ai, independently, decompose, goals, plan, actions, execute, self, correct, without | | `beautiful-prose` | Hard-edged writing style contract for timeless, forceful English prose without AI tics | beautiful, prose | beautiful, prose, hard, edged, writing, style, contract, timeless, forceful, english, without, ai | @@ -181,7 +182,6 @@ Total skills: 631 | `prisma-expert` | Prisma ORM expert for schema design, migrations, query optimization, relations modeling, and database operations. Use PROACTIVELY for Prisma schema issues, m... | prisma | prisma, orm, schema, migrations, query, optimization, relations, modeling, database, operations, proactively, issues | | `programmatic-seo` | Design and evaluate programmatic SEO strategies for creating SEO-driven pages at scale using templates and structured data. Use when the user mentions progra... | programmatic, seo | programmatic, seo, evaluate, creating, driven, pages, scale, structured, data, user, mentions, directory | | `prompt-caching` | Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache... | prompt, caching | prompt, caching, llm, prompts, including, anthropic, response, cag, cache, augmented, generation | -| `prompt-engineer` | Expert prompt engineer specializing in advanced prompting techniques, LLM optimization, and AI system design. Masters chain-of-thought, constitutional AI, an... 
| prompt | prompt, engineer, specializing, prompting, techniques, llm, optimization, ai, masters, chain, thought, constitutional | | `prompt-engineering-patterns` | Master advanced prompt engineering techniques to maximize LLM performance, reliability, and controllability in production. Use when optimizing prompts, impro... | prompt, engineering | prompt, engineering, techniques, maximize, llm, performance, reliability, controllability, optimizing, prompts, improving, outputs | | `rag-engineer` | Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LL... | rag | rag, engineer, building, retrieval, augmented, generation, masters, embedding, models, vector, databases, chunking | | `rag-implementation` | Build Retrieval-Augmented Generation (RAG) systems for LLM applications with vector databases and semantic search. Use when implementing knowledge-grounded A... | rag | rag, retrieval, augmented, generation, llm, applications, vector, databases, semantic, search, implementing, knowledge | @@ -298,7 +298,7 @@ TRIGGER: "shopify", "shopify app", "checkout extension",... | shopify | shopify, | `viral-generator-builder` | Expert in building shareable generator tools that go viral - name generators, quiz makers, avatar creators, personality tests, and calculator tools. Covers t... | viral, generator, builder | viral, generator, builder, building, shareable, go, name, generators, quiz, makers, avatar, creators | | `webapp-testing` | Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing... 
| webapp | webapp, testing, toolkit, interacting, local, web, applications, playwright, supports, verifying, frontend, functionality | -## general (129) +## general (130) | Skill | Description | Tags | Triggers | | --- | --- | --- | --- | @@ -401,6 +401,7 @@ TRIGGER: "shopify", "shopify app", "checkout extension",... | shopify | shopify, | `posix-shell-pro` | Expert in strict POSIX sh scripting for maximum portability across Unix-like systems. Specializes in shell scripts that run on any POSIX-compliant shell (das... | posix, shell | posix, shell, pro, strict, sh, scripting, maximum, portability, unix, like, specializes, scripts | | `pptx-official` | Presentation creation, editing, and analysis. When Claude needs to work with presentations (.pptx files) for: (1) Creating new presentations, (2) Modifying o... | pptx, official | pptx, official, presentation, creation, editing, analysis, claude, work, presentations, files, creating, new | | `privilege-escalation-methods` | This skill should be used when the user asks to "escalate privileges", "get root access", "become administrator", "privesc techniques", "abuse sudo", "exploi... | privilege, escalation, methods | privilege, escalation, methods, skill, should, used, user, asks, escalate, privileges, get, root | +| `prompt-engineer` | Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW) | prompt-engineering, optimization, frameworks, ai-enhancement | prompt-engineering, optimization, frameworks, ai-enhancement, prompt, engineer, transforms, user, prompts, optimized, rtf, risen | | `prompt-library` | Curated collection of high-quality prompts for various use cases. Includes role-based prompts, task-specific templates, and prompt refinement techniques. Use... 
| prompt, library | prompt, library, curated, collection, high, quality, prompts, various, cases, includes, role, task | | `receiving-code-review` | Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technic... | receiving, code | receiving, code, review, feedback, before, implementing, suggestions, especially, seems, unclear, technically, questionable | | `referral-program` | When the user wants to create, optimize, or analyze a referral program, affiliate program, or word-of-mouth strategy. Also use when the user mentions 'referr... | referral, program | referral, program, user, wants, optimize, analyze, affiliate, word, mouth, mentions, ambassador, viral | @@ -409,7 +410,6 @@ TRIGGER: "shopify", "shopify app", "checkout extension",... | shopify | shopify, | `sharp-edges` | Identify error-prone APIs and dangerous configurations | sharp, edges | sharp, edges, identify, error, prone, apis, dangerous, configurations | | `shellcheck-configuration` | Master ShellCheck static analysis configuration and usage for shell script quality. Use when setting up linting infrastructure, fixing code issues, or ensuri... | shellcheck, configuration | shellcheck, configuration, static, analysis, usage, shell, script, quality, setting, up, linting, infrastructure | | `signup-flow-cro` | When the user wants to optimize signup, registration, account creation, or trial activation flows. Also use when the user mentions "signup conversions," "reg... | signup, flow, cro | signup, flow, cro, user, wants, optimize, registration, account, creation, trial, activation, flows | -| `skill-creator` | Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capa... 
| skill, creator | skill, creator, creating, effective, skills, should, used, users, want, new, update, existing | | `skill-rails-upgrade` | Analyze Rails apps and provide upgrade assessments | skill, rails, upgrade | skill, rails, upgrade, analyze, apps, provide, assessments | | `slack-gif-creator` | Knowledge and utilities for creating animated GIFs optimized for Slack. Provides constraints, validation tools, and animation concepts. Use when users reques... | slack, gif, creator | slack, gif, creator, knowledge, utilities, creating, animated, gifs, optimized, provides, constraints, validation | | `social-content` | When the user wants help creating, scheduling, or optimizing social media content for LinkedIn, Twitter/X, Instagram, TikTok, Facebook, or other platforms. A... | social, content | social, content, user, wants, creating, scheduling, optimizing, media, linkedin, twitter, instagram, tiktok | @@ -431,6 +431,7 @@ TRIGGER: "shopify", "shopify app", "checkout extension",... | shopify | shopify, | `writing-plans` | Use when you have a spec or requirements for a multi-step task, before touching code | writing, plans | writing, plans, spec, requirements, multi, step, task, before, touching, code | | `writing-skills` | Use when creating, updating, or improving agent skills. | writing, skills | writing, skills, creating, updating, improving, agent | | `x-article-publisher-skill` | Publish articles to X/Twitter | x, article, publisher, skill | x, article, publisher, skill, publish, articles, twitter | +| `youtube-summarizer` | Extract transcripts from YouTube videos and generate comprehensive, detailed summaries using intelligent analysis frameworks | video, summarization, transcription, youtube, content-analysis | video, summarization, transcription, youtube, content-analysis, summarizer, extract, transcripts, videos, generate, detailed, summaries | ## infrastructure (79) @@ -660,7 +661,7 @@ TRIGGER: "shopify", "shopify app", "checkout extension",... 
| shopify | shopify, | `unit-testing-test-generate` | Generate comprehensive, maintainable unit tests across languages with strong coverage and edge case focus. | unit, generate | unit, generate, testing, test, maintainable, tests, languages, strong, coverage, edge, case | | `web3-testing` | Test smart contracts comprehensively using Hardhat and Foundry with unit tests, integration tests, and mainnet forking. Use when testing Solidity contracts, ... | web3 | web3, testing, test, smart, contracts, comprehensively, hardhat, foundry, unit, tests, integration, mainnet | -## workflow (16) +## workflow (17) | Skill | Description | Tags | Triggers | | --- | --- | --- | --- | @@ -678,5 +679,6 @@ TRIGGER: "shopify", "shopify app", "checkout extension",... | shopify | shopify, | `kaizen` | Guide for continuous improvement, error proofing, and standardization. Use this skill when the user wants to improve code quality, refactor, or discuss proce... | kaizen | kaizen, continuous, improvement, error, proofing, standardization, skill, user, wants, improve, code, quality | | `mermaid-expert` | Create Mermaid diagrams for flowcharts, sequences, ERDs, and architectures. Masters syntax for all diagram types and styling. Use PROACTIVELY for visual docu... | mermaid | mermaid, diagrams, flowcharts, sequences, erds, architectures, masters, syntax, all, diagram, types, styling | | `pdf-official` | Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs ... | pdf, official | pdf, official, manipulation, toolkit, extracting, text, tables, creating, new, pdfs, merging, splitting | +| `skill-creator` | This skill should be used when the user asks to create a new skill, build a skill, make a custom skill, develop a CLI skill, or wants to extend the CLI with ... 
| automation, scaffolding, skill-creation, meta-skill | automation, scaffolding, skill-creation, meta-skill, skill, creator, should, used, user, asks, new, custom |
 | `team-collaboration-issue` | You are a GitHub issue resolution expert specializing in systematic bug investigation, feature implementation, and collaborative development workflows. Your ... | team, collaboration, issue | team, collaboration, issue, github, resolution, specializing, systematic, bug, investigation, feature, collaborative, development |
 | `track-management` | Use this skill when creating, managing, or working with Conductor tracks - the logical work units for features, bugs, and refactors. Applies to spec.md, plan... | track | track, skill, creating, managing, working, conductor, tracks, logical, work, units, features, bugs |
diff --git a/README.md b/README.md
index 4d0d57e6..f20e769d 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
-# 🌌 Antigravity Awesome Skills: 631+ Agentic Skills for Claude Code, Gemini CLI, Cursor, Copilot & More
+# 🌌 Antigravity Awesome Skills: 633+ Agentic Skills for Claude Code, Gemini CLI, Cursor, Copilot & More
 
-> **The Ultimate Collection of 631+ Universal Agentic Skills for AI Coding Assistants — Claude Code, Gemini CLI, Codex CLI, Antigravity IDE, GitHub Copilot, Cursor, OpenCode, AdaL**
+> **The Ultimate Collection of 633+ Universal Agentic Skills for AI Coding Assistants — Claude Code, Gemini CLI, Codex CLI, Antigravity IDE, GitHub Copilot, Cursor, OpenCode, AdaL**
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Claude Code](https://img.shields.io/badge/Claude%20Code-Anthropic-purple)](https://claude.ai)
@@ -13,7 +13,7 @@
 [![AdaL CLI](https://img.shields.io/badge/AdaL%20CLI-SylphAI-pink)](https://sylph.ai/)
 [![ASK Supported](https://img.shields.io/badge/ASK-Supported-blue)](https://github.com/yeasy/ask)
 
-**Antigravity Awesome Skills** is a curated, battle-tested library of **631 high-performance agentic skills** designed to work seamlessly across all major AI coding assistants:
+**Antigravity Awesome Skills** is a curated, battle-tested library of **633 high-performance agentic skills** designed to work seamlessly across all major AI coding assistants:
 
 - 🟣 **Claude Code** (Anthropic CLI)
 - 🔵 **Gemini CLI** (Google DeepMind)
@@ -32,7 +32,7 @@ This repository provides essential skills to transform your AI assistant into a
 - [🔌 Compatibility & Invocation](#compatibility--invocation)
 - [📦 Features & Categories](#features--categories)
 - [🎁 Curated Collections (Bundles)](#curated-collections)
-- [📚 Browse 631+ Skills](#browse-631-skills)
+- [📚 Browse 633+ Skills](#browse-633-skills)
 - [🛠️ Installation](#installation)
 - [🤝 How to Contribute](#how-to-contribute)
 - [👥 Contributors & Credits](#credits--sources)
@@ -132,7 +132,7 @@ The repository is organized into specialized domains to transform your AI into a
 [Check out our Starter Packs in docs/BUNDLES.md](docs/BUNDLES.md) to find the perfect toolkit for your role.
 
-## Browse 631+ Skills
+## Browse 633+ Skills
 
 We have moved the full skill registry to a dedicated catalog to keep this README clean.
diff --git a/data/aliases.json b/data/aliases.json index c564c328..6800647b 100644 --- a/data/aliases.json +++ b/data/aliases.json @@ -1,5 +1,5 @@ { - "generatedAt": "2026-02-04T08:08:20.870Z", + "generatedAt": "2026-02-04T20:37:02.333Z", "aliases": { "accessibility-compliance-audit": "accessibility-compliance-accessibility-audit", "active directory attacks": "active-directory-attacks", diff --git a/data/bundles.json b/data/bundles.json index 06dd50cd..4a2df7e8 100644 --- a/data/bundles.json +++ b/data/bundles.json @@ -1,5 +1,5 @@ { - "generatedAt": "2026-02-04T08:08:20.870Z", + "generatedAt": "2026-02-04T20:37:02.333Z", "bundles": { "core-dev": { "description": "Core development skills across languages, frameworks, and backend/frontend fundamentals.", diff --git a/data/catalog.json b/data/catalog.json index 51763baf..ef84b0a0 100644 --- a/data/catalog.json +++ b/data/catalog.json @@ -1,6 +1,6 @@ { - "generatedAt": "2026-02-04T08:08:20.870Z", - "total": 631, + "generatedAt": "2026-02-04T20:37:02.333Z", + "total": 633, "skills": [ { "id": "3d-web-experience", @@ -1108,6 +1108,34 @@ ], "path": "skills/attack-tree-construction/SKILL.md" }, + { + "id": "audio-transcriber", + "name": "audio-transcriber", + "description": "Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration", + "category": "data-ai", + "tags": [ + "audio", + "transcription", + "whisper", + "meeting-minutes", + "speech-to-text" + ], + "triggers": [ + "audio", + "transcription", + "whisper", + "meeting-minutes", + "speech-to-text", + "transcriber", + "transform", + "recordings", + "professional", + "markdown", + "documentation", + "intelligent" + ], + "path": "skills/audio-transcriber/SKILL.md" + }, { "id": "auth-implementation-patterns", "name": "auth-implementation-patterns", @@ -10675,24 +10703,27 @@ { "id": "prompt-engineer", "name": "prompt-engineer", - "description": "Expert prompt engineer specializing in advanced prompting techniques, 
LLM optimization, and AI system design. Masters chain-of-thought, constitutional AI, and production prompt strategies. Use when building AI features, improving agent performance, or crafting system prompts.", - "category": "data-ai", + "description": "Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW)", + "category": "general", "tags": [ - "prompt" + "prompt-engineering", + "optimization", + "frameworks", + "ai-enhancement" ], "triggers": [ + "prompt-engineering", + "optimization", + "frameworks", + "ai-enhancement", "prompt", "engineer", - "specializing", - "prompting", - "techniques", - "llm", - "optimization", - "ai", - "masters", - "chain", - "thought", - "constitutional" + "transforms", + "user", + "prompts", + "optimized", + "rtf", + "risen" ], "path": "skills/prompt-engineer/SKILL.md" }, @@ -12671,25 +12702,27 @@ { "id": "skill-creator", "name": "skill-creator", - "description": "Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations.", - "category": "general", + "description": "This skill should be used when the user asks to create a new skill, build a skill, make a custom skill, develop a CLI skill, or wants to extend the CLI with new capabilities. 
Automates the entire skill creation workflow from brainstorming to installation.", + "category": "workflow", "tags": [ - "skill", - "creator" + "automation", + "scaffolding", + "skill-creation", + "meta-skill" ], "triggers": [ + "automation", + "scaffolding", + "skill-creation", + "meta-skill", "skill", "creator", - "creating", - "effective", - "skills", "should", "used", - "users", - "want", + "user", + "asks", "new", - "update", - "existing" + "custom" ], "path": "skills/skill-creator/SKILL.md" }, @@ -15345,6 +15378,34 @@ ], "path": "skills/xss-html-injection/SKILL.md" }, + { + "id": "youtube-summarizer", + "name": "youtube-summarizer", + "description": "Extract transcripts from YouTube videos and generate comprehensive, detailed summaries using intelligent analysis frameworks", + "category": "general", + "tags": [ + "video", + "summarization", + "transcription", + "youtube", + "content-analysis" + ], + "triggers": [ + "video", + "summarization", + "transcription", + "youtube", + "content-analysis", + "summarizer", + "extract", + "transcripts", + "videos", + "generate", + "detailed", + "summaries" + ], + "path": "skills/youtube-summarizer/SKILL.md" + }, { "id": "zapier-make-patterns", "name": "zapier-make-patterns", diff --git a/skills/audio-transcriber/CHANGELOG.md b/skills/audio-transcriber/CHANGELOG.md new file mode 100644 index 00000000..5c120e95 --- /dev/null +++ b/skills/audio-transcriber/CHANGELOG.md @@ -0,0 +1,137 @@ +# Changelog - audio-transcriber + +All notable changes to the audio-transcriber skill will be documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). 
+
+---
+
+## [1.1.0] - 2026-02-03
+
+### ✨ Added
+
+- **Intelligent Prompt Workflow** (Step 3b) - Complete integration with the prompt-engineer skill
+  - **Scenario A**: User-provided prompts are automatically improved with prompt-engineer
+    - Displays both original and improved versions side-by-side
+    - Single confirmation: "Usar versão melhorada? [s/n]" ("Use the improved version? [y/n]")
+  - **Scenario B**: Auto-generation when no prompt is provided
+    - Analyzes the transcript and suggests a document type (ata, resumo, notas: minutes, summary, or notes)
+    - Shows the suggestion and asks for confirmation
+    - Generates a complete structured prompt (RISEN/RODES/STAR)
+    - Shows a preview and asks for final confirmation
+    - Falls back to DEFAULT_MEETING_PROMPT if declined
+
+- **LLM Integration** - Process transcripts with Claude CLI or GitHub Copilot CLI
+  - Priority: Claude > GitHub Copilot > None (transcript-only mode)
+  - Step 0b: CLI detection logic documented
+  - Timeout handling (5 minutes by default)
+  - Graceful fallback if no CLI is available
+
+- **Progress Indicators** - Visual feedback during long operations
+  - `tqdm` progress bar for Whisper transcription segments
+  - `rich` spinner for LLM processing
+  - Clear status messages at each step
+
+- **Timestamp-based File Naming** - Avoids overwriting previous transcriptions
+  - Format: `transcript-YYYYMMDD-HHMMSS.md`
+  - Format: `ata-YYYYMMDD-HHMMSS.md`
+  - Prevents data loss from repeated runs
+
+- **Automatic Cleanup** - Removes temporary files after processing
+  - Deletes `metadata.json` and `transcription.json` automatically
+  - `--keep-temp` flag to preserve them if needed
+  - Clean output directory
+
+- **Rich Terminal UI** - Beautiful output with the `rich` library
+  - Formatted panels for prompt previews
+  - Color-coded status messages (green=success, yellow=warning, red=error)
+  - Spinner animations for long-running tasks
+
+- **Dual Output Support** - Generates both the transcript and the processed ata (meeting minutes)
+  - `transcript-*.md` - Raw transcription with timestamps
+  - `ata-*.md` - Intelligent summary/meeting minutes (if an LLM is available)
+  - The user can decline LLM processing to get a transcript-only run
+
+### 🔧 Changed
+
+- **SKILL.md** - Major documentation updates
+  - Added Step 0b (CLI Detection)
+  - Updated Step 2 (Progress Indicators)
+  - Added Step 3b (Intelligent Prompt Workflow, 150+ lines)
+  - Updated version to 1.1.0
+  - Added detailed workflow diagrams for both scenarios
+
+- **install-requirements.sh** - Added UI libraries
+  - Now installs the `tqdm` and `rich` packages
+  - Graceful fallback if installation fails
+  - Updated success messages
+
+- **Python Implementation** - Complete refactor
+  - Created `scripts/transcribe.py` (516 lines)
+  - Functions: `detect_cli_tool()`, `invoke_prompt_engineer()`, `handle_prompt_workflow()`, `process_with_llm()`, `transcribe_audio()`, `save_outputs()`, `cleanup_temp_files()`
+  - Command-line arguments: `--prompt`, `--model`, `--output-dir`, `--keep-temp`
+  - Auto-installs `rich` and `tqdm` if missing
+
+### 🐛 Fixed
+
+- **User prompts no longer ignored** - v1.0.0 completely ignored custom prompts
+  - Now processes all prompts (custom or auto-generated) with the LLM
+  - Improves simple prompts into structured frameworks
+
+- **Temporary file cleanup** - v1.0.0 left `metadata.json` and `transcription.json` behind as clutter
+  - Now removed automatically after processing
+  - Clean output directory
+
+- **File overwriting** - v1.0.0 used the same filename (e.g., `meeting.md`) every time
+  - Now uses a timestamp to prevent data loss
+  - Each run creates unique files
+
+- **Missing ata/summary** - v1.0.0 only generated the raw transcript
+  - Now generates an intelligent ata/resumo (minutes/summary) using the LLM
+  - Respects the user's prompt instructions
+
+- **No progress feedback** - v1.0.0 processed silently (users couldn't tell whether it had frozen)
+  - Now shows a progress bar for transcription
+  - Shows a spinner for LLM processing
+  - Clear status messages throughout
+
+### 📝 Notes
+
+- **Backward Compatibility:** Fully compatible with v1.0.0 workflows
+- **Requires:** Python 3.8+, faster-whisper OR whisper, tqdm, rich
+- **Optional:** Claude CLI or GitHub Copilot CLI for intelligent processing
+- **Optional:** prompt-engineer skill for automatic prompt generation
+
+### 🔗 Related Issues
+
+- Fixes #1: User-supplied RISEN prompt ignored
+- Fixes #2: Temporary files (metadata.json, transcription.json) left behind as clutter
+- Fixes #3: Incomplete output (raw transcript only, no ata)
+- Fixes #4: No visual progress indicator
+- Fixes #5: Output filenames lacked timestamps
+
+---
+
+## [1.0.0] - 2026-02-02
+
+### ✨ Initial Release
+
+- Audio transcription using Faster-Whisper or OpenAI Whisper
+- Automatic language detection
+- Speaker diarization (basic)
+- Voice Activity Detection (VAD)
+- Markdown output with metadata table
+- Installation script for dependencies
+- Example scripts for basic transcription
+- Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
+- FFmpeg integration for format conversion
+- Zero-configuration philosophy
+
+### 📝 Known Limitations (Fixed in v1.1.0)
+
+- User prompts ignored (no LLM integration)
+- Only the raw transcript generated (no ata/summary)
+- Temporary files not cleaned up
+- No progress indicators
+- Files overwritten on repeated runs
diff --git a/skills/audio-transcriber/README.md b/skills/audio-transcriber/README.md
new file mode 100644
index 00000000..aef3425a
--- /dev/null
+++ b/skills/audio-transcriber/README.md
@@ -0,0 +1,340 @@
+# Audio Transcriber Skill v1.1.0
+
+Transform audio recordings into professional Markdown documentation with **intelligent atas (meeting minutes) and summaries using LLM integration** (Claude/Copilot CLI) and automatic prompt engineering.
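The Claude-first, Copilot-fallback order described in this changelog and README (Claude CLI > GitHub Copilot CLI > transcript-only) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name mirrors the documented `detect_cli_tool()`, but the probed command names are assumptions, not the skill's actual `scripts/transcribe.py` code.

```python
# Illustrative sketch of the documented CLI fallback order:
#   Claude CLI > GitHub Copilot CLI > None (transcript-only mode).
# The probed command names ("claude", "gh") are assumptions.
import shutil
from typing import Optional  # Optional[...] keeps this Python 3.8+ compatible


def detect_cli_tool() -> Optional[str]:
    """Return which LLM CLI to use, or None for transcript-only mode."""
    if shutil.which("claude"):  # Claude CLI takes priority
        return "claude"
    if shutil.which("gh"):  # Copilot CLI ships as a gh extension
        return "copilot"
    return None  # no CLI found: emit only the raw transcript
```

When neither binary is on `PATH`, the documented behaviour is a graceful fallback to a transcript-only run rather than an error.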
+ +## πŸ†• What's New in v1.1.0 + +- **🧠 LLM Integration** - Claude CLI (primary) or GitHub Copilot CLI (fallback) for intelligent processing +- **✨ Smart Prompts** - Automatic integration with prompt-engineer skill + - User-provided prompts β†’ automatically improved β†’ user chooses version + - No prompt β†’ analyzes transcript β†’ suggests format β†’ generates structured prompt +- **πŸ“Š Progress Indicators** - Visual progress bars (tqdm) and spinners (rich) +- **πŸ“ Timestamp Filenames** - `transcript-YYYYMMDD-HHMMSS.md` + `ata-YYYYMMDD-HHMMSS.md` +- **🧹 Auto-Cleanup** - Removes temporary `metadata.json` and `transcription.json` +- **🎨 Rich Terminal UI** - Beautiful formatted output with panels and colors + +See **[CHANGELOG.md](./CHANGELOG.md)** for complete v1.1.0 details. + +## 🎯 Core Features + +- **πŸ“ Rich Markdown Output** - Structured reports with metadata tables, timestamps, and formatting +- **πŸŽ™οΈ Speaker Diarization** - Automatically identifies and labels different speakers +- **πŸ“Š Technical Metadata** - Extracts file size, duration, language, processing time +- **πŸ“‹ Intelligent Atas/Summaries** - Generated via LLM (Claude/Copilot) with customizable prompts +- **πŸ’‘ Executive Summaries** - AI-generated structured summaries with topics, decisions, action items +- **🌍 Multi-language** - Supports 99 languages with auto-detection +- **⚑ Zero Configuration** - Auto-discovers Faster-Whisper/Whisper installation +- **πŸ”’ Privacy-First** - 100% local Whisper processing, no cloud uploads +- **πŸš€ Flexible Modes** - Transcript-only or intelligent processing with LLM + +## πŸ“¦ Installation + +### Quick Install (NPX) + +```bash +npx cli-ai-skills@latest install audio-transcriber +``` + +This automatically: +- Downloads the skill +- Installs Python dependencies (faster-whisper, tqdm, rich) +- Installs ffmpeg (macOS via Homebrew) +- Sets up the skill globally + +### Manual Installation + +#### 1. 
Install Transcription Engine + +**Recommended (fastest):** +```bash +pip install faster-whisper tqdm rich +``` + +**Alternative (original Whisper):** +```bash +pip install openai-whisper tqdm rich +``` + +#### 2. Install Audio Tools (Optional) + +For format conversion support: +```bash +# macOS +brew install ffmpeg + +# Linux +apt install ffmpeg +``` + +#### 3. Install LLM CLI (Optional - for intelligent summaries) + +**Claude CLI (recommended):** +```bash +# Follow: https://docs.anthropic.com/en/docs/claude-cli +``` + +**GitHub Copilot CLI (alternative):** +```bash +gh extension install github/gh-copilot +``` + +#### 4. Install Skill + +**Global installation (auto-updates with git pull):** +```bash +cd /path/to/cli-ai-skills +./scripts/install-skills.sh $(pwd) +``` + +**Repository only:** +```bash +# Skill is already available if you cloned the repo +``` + +## πŸš€ Usage + +### Basic Transcription + +```bash +copilot> transcribe audio to markdown: meeting.mp3 +``` + +**Output:** +- `meeting.md` - Full Markdown report with metadata, transcription, minutes, summary + +### With Subtitles + +```bash +copilot> convert audio file to text with subtitles: interview.wav +``` + +**Generates:** +- `interview.md` - Markdown report +- `interview.srt` - Subtitle file + +### Batch Processing + +```bash +copilot> transcreva estes Γ‘udios: recordings/*.mp3 +``` + +**Processes all MP3 files in the directory.** + +### Trigger Phrases + +Activate the skill with any of these phrases: + +- "transcribe audio to markdown" +- "transcreva este Γ‘udio" +- "convert audio file to text" +- "extract speech from audio" +- "Γ‘udio para texto com metadados" + +## πŸ“‹ Use Cases + +### 1. Team Meetings +Record standups, planning sessions, or retrospectives and automatically generate: +- Participant list +- Discussion topics with timestamps +- Decisions made +- Action items assigned + +### 2. 
Client Calls +Transcribe client conversations with: +- Speaker identification +- Key agreements documented +- Follow-up tasks extracted + +### 3. Interviews +Convert interviews to text with: +- Question/answer attribution +- Subtitle generation for video +- Searchable transcript + +### 4. Lectures & Training +Document educational content with: +- Timestamped notes +- Topic breakdown +- Key concepts summary + +### 5. Content Creation +Analyze podcasts, videos, YouTube content: +- Full transcription +- Chapter markers (timestamps) +- Summary for show notes + +## πŸ“Š Output Example + +```markdown +# Audio Transcription Report + +## πŸ“Š Metadata + +| Field | Value | +|-------|-------| +| **File Name** | team-standup.mp3 | +| **File Size** | 3.2 MB | +| **Duration** | 00:12:47 | +| **Language** | English (en) | +| **Processed Date** | 2026-02-02 14:35:21 | +| **Speakers Identified** | 5 | +| **Transcription Engine** | Faster-Whisper (model: base) | + +--- + +## πŸŽ™οΈ Full Transcription + +**[00:00:12 β†’ 00:00:45]** *Speaker 1* +Good morning everyone. Let's start with updates from the frontend team. + +**[00:00:46 β†’ 00:01:23]** *Speaker 2* +We completed the dashboard redesign and deployed to staging yesterday. + +--- + +## πŸ“‹ Meeting Minutes + +### Participants +- Speaker 1 (Meeting Lead) +- Speaker 2 (Frontend Developer) +- Speaker 3 (Backend Developer) +- Speaker 4 (Designer) +- Speaker 5 (Product Manager) + +### Topics Discussed +1. **Dashboard Redesign** (00:00:46) + - Completed and deployed to staging + - Positive feedback from QA team + +2. 
**API Performance Issues** (00:03:12) + - Database query optimization needed + - Target response time < 200ms + +### Decisions Made +- βœ… Approved dashboard for production deployment +- βœ… Allocated 2 sprint points for API optimization + +### Action Items +- [ ] **Deploy dashboard to production** - Assigned to: Speaker 2 - Due: 2026-02-05 +- [ ] **Optimize database queries** - Assigned to: Speaker 3 +- [ ] **Schedule user testing session** - Assigned to: Speaker 5 + +--- + +## πŸ“ Executive Summary + +The team standup covered progress on the dashboard redesign, which has been successfully completed and is ready for production deployment. The frontend team received positive feedback from QA and the design aligns with user requirements. + +Backend performance concerns were raised regarding API response times. The team decided to prioritize query optimization in the current sprint, with a target of sub-200ms response times. + +Next steps include production deployment of the dashboard by end of week and scheduling user testing sessions to validate the new design with real users. + +### Key Points +- πŸ”Ή Dashboard redesign complete and staging-approved +- πŸ”Ή API performance optimization prioritized +- πŸ”Ή User testing scheduled for next week + +### Next Steps +1. Production deployment (Speaker 2) +2. Database optimization (Speaker 3) +3. User testing coordination (Speaker 5) +``` + +## βš™οΈ Configuration + +No configuration needed! 
The skill automatically:
+- Detects a Faster-Whisper or Whisper installation
+- Chooses the fastest available engine
+- Selects an appropriate model based on file size
+- Auto-detects language
+
+## πŸ”§ Troubleshooting
+
+### "No transcription tool found"
+**Solution:** Install Faster-Whisper:
+```bash
+pip install faster-whisper
+```
+
+### "Unsupported format"
+**Solution:** Install ffmpeg:
+```bash
+brew install ffmpeg  # macOS
+apt install ffmpeg   # Linux
+```
+
+### Slow processing
+**Solution:** Use a smaller Whisper model:
+```bash
+# Edit the skill to use the "tiny" or "base" model instead of "medium"
+```
+
+### Poor speaker identification
+**Solution:**
+- Ensure clear audio with minimal background noise
+- Use a better microphone for recordings
+- Try the "medium" or "large" Whisper model
+
+## πŸ› οΈ Advanced Usage
+
+### Custom Model Selection
+
+Edit `SKILL.md` Step 2 to change the model:
+```python
+model = WhisperModel("small", device="cpu")  # swap "small" for "tiny", "base", "medium", or "large"
+```
+
+### Output Language Control
+
+Force output in a specific language:
+```bash
+# Edit Step 3 to set the language explicitly
+```
+
+### Batch Settings
+
+Process specific file types only:
+```bash
+copilot> transcribe audio: recordings/*.wav  # Only WAV files
+```
+
+## πŸ“š FAQ
+
+**Q: Does this work offline?**
+A: Yes! 100% local processing; no internet required after the initial model download.
+
+**Q: What's the difference between Whisper and Faster-Whisper?**
+A: Faster-Whisper is 4-5x faster with the same quality. Prefer it whenever it is available.
+
+**Q: Can I transcribe YouTube videos?**
+A: Not directly. Use a YouTube downloader first, then transcribe the audio file. Or use the `youtube-summarizer` skill instead.
+
+**Q: How accurate is speaker identification?**
+A: Accuracy depends on audio quality. Clear recordings with distinct voices work best. Currently uses simple estimation; future versions will use advanced diarization. 
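For illustration, the kind of pause-based estimation used today can be sketched in a few lines (a hypothetical helper, not the skill's actual code): any silence between consecutive segments longer than a threshold is treated as a speaker change.

```python
def estimate_speakers(segments, gap_threshold=1.5):
    """Naive speaker labeling: a pause longer than gap_threshold
    seconds between consecutive segments is assumed to be a
    speaker change. Real diarization clusters voice embeddings."""
    labeled, current, prev_end = [], 1, None
    for seg in segments:
        if prev_end is not None and seg["start"] - prev_end > gap_threshold:
            current += 1
        labeled.append({**seg, "speaker": f"Speaker {current}"})
        prev_end = seg["end"]
    return labeled

segments = [
    {"start": 0.0, "end": 4.2, "text": "Good morning everyone."},
    {"start": 6.5, "end": 9.0, "text": "We completed the dashboard redesign."},
]
labeled = estimate_speakers(segments)  # 2.3s gap, so the second segment gets a new label
```

Note that this over-counts speakers when one person simply pauses, which is exactly why dedicated diarization models are planned.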
+ +**Q: What languages are supported?** +A: 99 languages including English, Portuguese, Spanish, French, German, Chinese, Japanese, Arabic, and more. + +**Q: Can I edit the meeting minutes format?** +A: Yes! Edit the Markdown template in SKILL.md Step 3. + +## πŸ”— Related Skills + +- **youtube-summarizer** - Extract and summarize YouTube video transcripts +- **prompt-engineer** - Optimize prompts for better AI summaries + +## πŸ“„ License + +This skill is part of the cli-ai-skills repository. +MIT License - See repository LICENSE file. + +## 🀝 Contributing + +Found a bug or have a feature request? +Open an issue in the [cli-ai-skills repository](https://github.com/yourusername/cli-ai-skills). + +--- + +**Version:** 1.0.0 +**Author:** Eric Andrade +**Created:** 2026-02-02 diff --git a/skills/audio-transcriber/SKILL.md b/skills/audio-transcriber/SKILL.md new file mode 100644 index 00000000..d28cdcb5 --- /dev/null +++ b/skills/audio-transcriber/SKILL.md @@ -0,0 +1,558 @@ +--- +name: audio-transcriber +description: "Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration" +version: 1.2.0 +author: Eric Andrade +created: 2025-02-01 +updated: 2026-02-04 +platforms: [github-copilot-cli, claude-code, codex] +category: content +tags: [audio, transcription, whisper, meeting-minutes, speech-to-text] +risk: safe +--- + +## Purpose + +This skill automates audio-to-text transcription with professional Markdown output, extracting rich technical metadata (speakers, timestamps, language, file size, duration) and generating structured meeting minutes and executive summaries. It uses Faster-Whisper or Whisper with zero configuration, working universally across projects without hardcoded paths or API keys. + +Inspired by tools like Plaud, this skill transforms raw audio recordings into actionable documentation, making it ideal for meetings, interviews, lectures, and content analysis. 
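Both engines ultimately yield a list of segments with `start`, `end`, and `text`, and every output format is derived from that list. As one illustration (a sketch, not the skill's exact implementation), an SRT subtitle file can be rendered from the segments like this:

```python
def to_srt(segments):
    """Render (start, end, text) segments as an SRT subtitle string."""
    def ts(sec):
        h, rem = divmod(int(sec), 3600)
        m, s = divmod(rem, 60)
        ms = int((sec - int(sec)) * 1000)  # truncate to milliseconds
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    blocks = []
    for i, seg in enumerate(segments, 1):
        blocks.append(f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['text']}\n")
    return "\n".join(blocks)

srt = to_srt([{"start": 0.0, "end": 2.5, "text": "Good morning everyone."}])
# "1\n00:00:00,000 --> 00:00:02,500\nGood morning everyone.\n"
```

The same segment list feeds the Markdown report, the VTT output, and the meeting-minutes extraction.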
+ +## When to Use + +Invoke this skill when: + +- User needs to transcribe audio/video files to text +- User wants meeting minutes automatically generated from recordings +- User requires speaker identification (diarization) in conversations +- User needs subtitles/captions (SRT, VTT formats) +- User wants executive summaries of long audio content +- User asks variations of "transcribe this audio", "convert audio to text", "generate meeting notes from recording" +- User has audio files in common formats (MP3, WAV, M4A, OGG, FLAC, WEBM) + +## Workflow + +### Step 0: Discovery (Auto-detect Transcription Tools) + +**Objective:** Identify available transcription engines without user configuration. + +**Actions:** + +Run detection commands to find installed tools: + +```bash +# Check for Faster-Whisper (preferred - 4-5x faster) +if python3 -c "import faster_whisper" 2>/dev/null; then + TRANSCRIBER="faster-whisper" + echo "βœ… Faster-Whisper detected (optimized)" +# Fallback to original Whisper +elif python3 -c "import whisper" 2>/dev/null; then + TRANSCRIBER="whisper" + echo "βœ… OpenAI Whisper detected" +else + TRANSCRIBER="none" + echo "⚠️ No transcription tool found" +fi + +# Check for ffmpeg (audio format conversion) +if command -v ffmpeg &>/dev/null; then + echo "βœ… ffmpeg available (format conversion enabled)" +else + echo "ℹ️ ffmpeg not found (limited format support)" +fi +``` + +**If no transcriber found:** + +Offer automatic installation using the provided script: + +```bash +echo "⚠️ No transcription tool found" +echo "" +echo "πŸ”§ Auto-install dependencies? (Recommended)" +read -p "Run installation script? [Y/n]: " AUTO_INSTALL + +if [[ ! 
"$AUTO_INSTALL" =~ ^[Nn] ]]; then + # Get skill directory (works for both repo and symlinked installations) + SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + + # Run installation script + if [[ -f "$SKILL_DIR/scripts/install-requirements.sh" ]]; then + bash "$SKILL_DIR/scripts/install-requirements.sh" + else + echo "❌ Installation script not found" + echo "" + echo "πŸ“¦ Manual installation:" + echo " pip install faster-whisper # Recommended" + echo " pip install openai-whisper # Alternative" + echo " brew install ffmpeg # Optional (macOS)" + exit 1 + fi + + # Verify installation succeeded + if python3 -c "import faster_whisper" 2>/dev/null || python3 -c "import whisper" 2>/dev/null; then + echo "βœ… Installation successful! Proceeding with transcription..." + else + echo "❌ Installation failed. Please install manually." + exit 1 + fi +else + echo "" + echo "πŸ“¦ Manual installation required:" + echo "" + echo "Recommended (fastest):" + echo " pip install faster-whisper" + echo "" + echo "Alternative (original):" + echo " pip install openai-whisper" + echo "" + echo "Optional (format conversion):" + echo " brew install ffmpeg # macOS" + echo " apt install ffmpeg # Linux" + echo "" + exit 1 +fi +``` + +This ensures users can install dependencies with one confirmation, or opt for manual installation if preferred. + +**If transcriber found:** + +Proceed to Step 0b (CLI Detection). + + +### Step 1: Validate Audio File + +**Objective:** Verify file exists, check format, and extract metadata. + +**Actions:** + +1. **Accept file path or URL** from user: + - Local file: `meeting.mp3` + - URL: `https://example.com/audio.mp3` (download to temp directory) + +2. **Verify file exists:** + +```bash +if [[ ! -f "$AUDIO_FILE" ]]; then + echo "❌ File not found: $AUDIO_FILE" + exit 1 +fi +``` + +3. 
**Extract metadata** using ffprobe or file utilities:
+
+```bash
+# Get file size
+FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1)
+
+# Get duration and format using ffprobe
+DURATION=$(ffprobe -v error -show_entries format=duration \
+  -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
+FORMAT=$(ffprobe -v error -select_streams a:0 -show_entries \
+  stream=codec_name -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null)
+
+# Convert duration to HH:MM:SS (GNU date reads "-d @epoch", BSD/macOS date reads "-r epoch")
+DURATION_HMS=$(date -u -d @"${DURATION%.*}" +%H:%M:%S 2>/dev/null \
+  || date -u -r "${DURATION%.*}" +%H:%M:%S 2>/dev/null \
+  || echo "Unknown")
+```
+
+4. **Check file size** (warn if large for cloud APIs):
+
+```bash
+SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1)
+if [[ $SIZE_MB -gt 25 ]]; then
+  echo "⚠️ Large file ($FILE_SIZE) - processing may take several minutes"
+fi
+```
+
+5. **Validate format** (supported: MP3, WAV, M4A, OGG, FLAC, WEBM):
+
+```bash
+# Lowercase portably (macOS ships bash 3.2, which lacks ${var,,})
+EXTENSION=$(echo "${AUDIO_FILE##*.}" | tr '[:upper:]' '[:lower:]')
+SUPPORTED_FORMATS=("mp3" "wav" "m4a" "ogg" "flac" "webm" "mp4")
+
+if [[ ! " ${SUPPORTED_FORMATS[*]} " =~ " ${EXTENSION} " ]]; then
+  echo "⚠️ Unsupported format: $EXTENSION"
+  if command -v ffmpeg &>/dev/null; then
+    echo "πŸ”„ Converting to WAV..."
+    ffmpeg -i "$AUDIO_FILE" -ar 16000 "${AUDIO_FILE%.*}.wav" -y
+    AUDIO_FILE="${AUDIO_FILE%.*}.wav"
+  else
+    echo "❌ Install ffmpeg to convert formats: brew install ffmpeg"
+    exit 1
+  fi
+fi
+```
+
+
+### Step 3: Generate Markdown Output
+
+**Objective:** Create structured Markdown with metadata, transcription, meeting minutes, and summary. 
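The `{duration_hms}` and timestamp fields below assume HH:MM:SS formatting; deriving it from the float duration that ffprobe or Whisper reports takes only a small helper (illustrative sketch):

```python
def fmt_hms(seconds):
    """Format a duration in seconds (float or int) as HH:MM:SS."""
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

fmt_hms(767.3)   # "00:12:47"
fmt_hms(8147.0)  # "02:15:47"
```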
+
+**Output Template:**
+
+```markdown
+# Audio Transcription Report
+
+## πŸ“Š Metadata
+
+| Field | Value |
+|-------|-------|
+| **File Name** | {file_name} |
+| **File Size** | {file_size} |
+| **Duration** | {duration_hms} |
+| **Language** | {language} ({language_code}) |
+| **Processed Date** | {process_date} |
+| **Speakers Identified** | {num_speakers} |
+| **Transcription Engine** | {engine} (model: {model}) |
+
+
+## πŸ“‹ Meeting Minutes
+
+### Participants
+- {speaker_1}
+- {speaker_2}
+- ...
+
+### Topics Discussed
+1. **{topic_1}** ({timestamp})
+   - {key_point_1}
+   - {key_point_2}
+
+2. **{topic_2}** ({timestamp})
+   - {key_point_1}
+
+### Decisions Made
+- βœ… {decision_1}
+- βœ… {decision_2}
+
+### Action Items
+- [ ] **{action_1}** - Assigned to: {speaker} - Due: {date_if_mentioned}
+- [ ] **{action_2}** - Assigned to: {speaker}
+
+
+*Generated by audio-transcriber skill v1.2.0*
+*Transcription engine: {engine} | Processing time: {elapsed_time}s*
+```
+
+**Implementation:**
+
+Use Python or bash with an AI model (Claude/GPT) for intelligent summarization:
+
+```python
+def generate_meeting_minutes(segments):
+    """Extract topics, decisions, action items from transcription."""
+
+    # Group segments by topic (simple clustering by timestamps)
+    topics = cluster_by_topic(segments)
+
+    # Identify action items (keywords: "should", "will", "need to", "action")
+    action_items = extract_action_items(segments)
+
+    # Identify decisions (keywords: "decided", "agreed", "approved")
+    decisions = extract_decisions(segments)
+
+    return {
+        "topics": topics,
+        "decisions": decisions,
+        "action_items": action_items
+    }
+
+def generate_summary(segments, max_paragraphs=5):
+    """Create executive summary using AI (Claude/GPT via API or local model)."""
+
+    full_text = " ".join([s["text"] for s in segments])
+
+    # Use Chain of Density approach (from prompt-engineer frameworks)
+    summary_prompt = f"""
+    Summarize the following transcription in {max_paragraphs} concise 
paragraphs.
+    Focus on key topics, decisions, and action items.
+
+    Transcription:
+    {full_text}
+    """
+
+    # Call AI model (placeholder - user can integrate Claude API or use local model)
+    summary = call_ai_model(summary_prompt)
+
+    return summary
+```
+
+**Output file naming:**
+
+```bash
+# v1.1.0: use a timestamp to avoid overwriting previous outputs
+TIMESTAMP=$(date +%Y%m%d-%H%M%S)
+TRANSCRIPT_FILE="transcript-${TIMESTAMP}.md"
+ATA_FILE="ata-${TIMESTAMP}.md"
+
+echo "$TRANSCRIPT_CONTENT" > "$TRANSCRIPT_FILE"
+echo "βœ… Transcript saved: $TRANSCRIPT_FILE"
+
+if [[ -n "$ATA_CONTENT" ]]; then
+  echo "$ATA_CONTENT" > "$ATA_FILE"
+  echo "βœ… Meeting minutes saved: $ATA_FILE"
+fi
+```
+
+
+#### **SCENARIO A: User Provided Custom Prompt**
+
+**Workflow:**
+
+1. **Display user's prompt:**
+   ```
+   πŸ“ User-provided prompt:
+   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+   β”‚ [User's prompt preview]            β”‚
+   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+   ```
+
+2. **Automatically improve with prompt-engineer (if available):**
+   ```bash
+   πŸ”§ Improving prompt with prompt-engineer...
+   [Invokes: gh copilot -p "improve this prompt: {user_prompt}"]
+   ```
+
+3. **Show both versions:**
+   ```
+   ✨ Improved version:
+   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+   β”‚ Role: You are a documenter...      β”‚
+   β”‚ Instructions: Transform...         β”‚
+   β”‚ Steps: 1) ... 2) ...               β”‚
+   β”‚ End Goal: ...                      β”‚
+   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+
+   πŸ“ Original version:
+   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+   β”‚ [User's original prompt]           β”‚
+   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+   ```
+
+4. 
**Ask which to use:**
+   ```bash
+   πŸ’‘ Use the improved version? [y/n] (default: y):
+   ```
+
+5. **Process with selected prompt:**
+   - If "y": use improved
+   - If "n": use original
+
+
+#### **LLM Processing (Both Scenarios)**
+
+Once the prompt is finalized:
+
+```python
+import subprocess
+
+from rich.progress import Progress, SpinnerColumn, TextColumn
+
+def process_with_llm(transcript, prompt, cli_tool='claude'):
+    full_prompt = f"{prompt}\n\n---\n\nTranscription:\n\n{transcript}"
+
+    with Progress(
+        SpinnerColumn(),
+        TextColumn("[progress.description]{task.description}"),
+        transient=True
+    ) as progress:
+        progress.add_task(
+            description=f"πŸ€– Processing with {cli_tool}...",
+            total=None
+        )
+
+        if cli_tool == 'claude':
+            result = subprocess.run(
+                ['claude', '-p'],  # --print mode; the prompt is piped via stdin
+                input=full_prompt,
+                capture_output=True,
+                text=True,
+                timeout=300  # 5 minutes
+            )
+        elif cli_tool == 'gh-copilot':
+            result = subprocess.run(
+                ['gh', 'copilot', 'suggest', '-t', 'shell', full_prompt],
+                capture_output=True,
+                text=True,
+                timeout=300
+            )
+
+        if result.returncode == 0:
+            return result.stdout.strip()
+        else:
+            return None
+```
+
+**Progress output:**
+```
+πŸ€– Processing with claude... β ‹
+[After completion:]
+βœ… Meeting minutes generated successfully!
+```
+
+
+#### **Final Output**
+
+**Success (both files):**
+```bash
+πŸ’Ύ Saving files...
+
+βœ… Files created:
+  - transcript-20260203-023045.md (raw transcript)
+  - ata-20260203-023045.md (processed with LLM)
+
+🧹 Removed temporary files: metadata.json, transcription.json
+
+βœ… Done! Total time: 3m 45s
+```
+
+**Transcript only (user declined LLM):**
+```bash
+πŸ’Ύ Saving files...
+
+βœ… File created:
+  - transcript-20260203-023045.md
+
+ℹ️ Meeting minutes not generated (LLM processing declined by user)
+
+🧹 Removed temporary files: metadata.json, transcription.json
+
+βœ… Done!
+```
+
+
+### Step 5: Display Results Summary
+
+**Objective:** Show completion status and next steps. 
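The summary below reports a `$WORD_COUNT` value; assuming the segment JSON produced earlier in the workflow, it can be computed with a one-line reduction (illustrative sketch):

```python
import json

def count_words(transcription_json):
    """Sum whitespace-separated tokens across all segment texts."""
    data = json.loads(transcription_json)
    return sum(len(seg["text"].split()) for seg in data["segments"])

doc = json.dumps({"segments": [
    {"text": "Good morning everyone."},
    {"text": "Updates from the frontend team."},
]})
count_words(doc)  # 8
```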
+ +**Output:** + +```bash +echo "" +echo "βœ… Transcription Complete!" +echo "" +echo "πŸ“Š Results:" +echo " File: $OUTPUT_FILE" +echo " Language: $LANGUAGE" +echo " Duration: $DURATION_HMS" +echo " Speakers: $NUM_SPEAKERS" +echo " Words: $WORD_COUNT" +echo " Processing time: ${ELAPSED_TIME}s" +echo "" +echo "πŸ“ Generated:" +echo " - $OUTPUT_FILE (Markdown report)" +[if alternative formats:] +echo " - ${OUTPUT_FILE%.*}.srt (Subtitles)" +echo " - ${OUTPUT_FILE%.*}.json (Structured data)" +echo "" +echo "🎯 Next steps:" +echo " 1. Review meeting minutes and action items" +echo " 2. Share report with participants" +echo " 3. Track action items to completion" +``` + + +## Example Usage + +### **Example 1: Basic Transcription** + +**User Input:** +```bash +copilot> transcribe audio to markdown: meeting-2026-02-02.mp3 +``` + +**Skill Output:** + +```bash +βœ… Faster-Whisper detected (optimized) +βœ… ffmpeg available (format conversion enabled) + +πŸ“‚ File: meeting-2026-02-02.mp3 +πŸ“Š Size: 12.3 MB +⏱️ Duration: 00:45:32 + +πŸŽ™οΈ Processing... +[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ] 100% + +βœ… Language detected: Portuguese (pt-BR) +πŸ‘₯ Speakers identified: 4 +πŸ“ Generating Markdown output... + +βœ… Transcription Complete! + +πŸ“Š Results: + File: meeting-2026-02-02.md + Language: pt-BR + Duration: 00:45:32 + Speakers: 4 + Words: 6,842 + Processing time: 127s + +πŸ“ Generated: + - meeting-2026-02-02.md (Markdown report) + +🎯 Next steps: + 1. Review meeting minutes and action items + 2. Share report with participants + 3. Track action items to completion +``` + + +### **Example 3: Batch Processing** + +**User Input:** +```bash +copilot> transcreva estes Γ‘udios: recordings/*.mp3 +``` + +**Skill Output:** + +```bash +πŸ“¦ Batch mode: 5 files found + 1. team-standup.mp3 + 2. client-call.mp3 + 3. brainstorm-session.mp3 + 4. product-demo.mp3 + 5. retrospective.mp3 + +πŸŽ™οΈ Processing batch... 
+ +[1/5] team-standup.mp3 βœ… (2m 34s) +[2/5] client-call.mp3 βœ… (15m 12s) +[3/5] brainstorm-session.mp3 βœ… (8m 47s) +[4/5] product-demo.mp3 βœ… (22m 03s) +[5/5] retrospective.mp3 βœ… (11m 28s) + +βœ… Batch Complete! +πŸ“ Generated 5 Markdown reports +⏱️ Total processing time: 6m 15s +``` + + +### **Example 5: Large File Warning** + +**User Input:** +```bash +copilot> transcribe audio to markdown: conference-keynote.mp3 +``` + +**Skill Output:** + +```bash +βœ… Faster-Whisper detected (optimized) + +πŸ“‚ File: conference-keynote.mp3 +πŸ“Š Size: 87.2 MB +⏱️ Duration: 02:15:47 +⚠️ Large file (87.2 MB) - processing may take several minutes + +Continue? [Y/n]: +``` + +**User:** `Y` + +```bash +πŸŽ™οΈ Processing... (this may take 10-15 minutes) +[β–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] 20% - Estimated time remaining: 12m +``` + + +This skill is **platform-agnostic** and works in any terminal context where GitHub Copilot CLI is available. It does not depend on specific project configurations or external APIs, following the zero-configuration philosophy. 
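The batch flow shown in Example 3 reduces to globbing and a per-file loop. A minimal, self-contained sketch of that control flow (it creates temporary files so it can run anywhere; a real run would glob the user's recordings and invoke the transcription steps for each match):

```python
import tempfile
from pathlib import Path

def batch_targets(directory, pattern="*.mp3"):
    """Return the sorted list of recordings a batch run would process."""
    return sorted(Path(directory).glob(pattern))

with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the user's recordings
    for name in ("team-standup.mp3", "client-call.mp3"):
        (Path(tmp) / name).touch()

    files = batch_targets(tmp)
    for i, f in enumerate(files, 1):
        print(f"[{i}/{len(files)}] would transcribe: {f.name}")
```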
diff --git a/skills/audio-transcriber/examples/basic-transcription.sh b/skills/audio-transcriber/examples/basic-transcription.sh
new file mode 100755
index 00000000..9d74d0ac
--- /dev/null
+++ b/skills/audio-transcriber/examples/basic-transcription.sh
@@ -0,0 +1,250 @@
+#!/usr/bin/env bash
+
+# Basic Audio Transcription Example
+# Demonstrates how to use the audio-transcriber skill manually
+
+set -euo pipefail
+
+# Configuration
+AUDIO_FILE="${1:-}"
+MODEL="${MODEL:-base}"  # Options: tiny, base, small, medium, large
+OUTPUT_FORMAT="${OUTPUT_FORMAT:-markdown}"  # Options: markdown, txt, srt, vtt, json
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m'  # No Color
+
+# Helper functions
+error() {
+  echo -e "${RED}❌ Error: $1${NC}" >&2
+  exit 1
+}
+
+success() {
+  echo -e "${GREEN}βœ… $1${NC}"
+}
+
+info() {
+  echo -e "${BLUE}ℹ️ $1${NC}"
+}
+
+warn() {
+  echo -e "${YELLOW}⚠️ $1${NC}"
+}
+
+# Check if audio file is provided
+if [[ -z "$AUDIO_FILE" ]]; then
+  error "Usage: $0 <audio-file>"
+fi
+
+# Verify file exists
+if [[ ! -f "$AUDIO_FILE" ]]; then
+  error "File not found: $AUDIO_FILE"
+fi
+
+# Step 0: Discovery - Check for transcription tools
+info "Step 0: Discovering transcription tools..."
+
+TRANSCRIBER=""
+if python3 -c "import faster_whisper" 2>/dev/null; then
+  TRANSCRIBER="faster-whisper"
+  success "Faster-Whisper detected (optimized)"
+elif python3 -c "import whisper" 2>/dev/null; then
+  TRANSCRIBER="whisper"
+  success "OpenAI Whisper detected"
+else
+  error "No transcription tool found. Install with: pip install faster-whisper"
+fi
+
+# Check for ffmpeg
+if command -v ffmpeg &>/dev/null; then
+  success "ffmpeg available (format conversion enabled)"
+else
+  warn "ffmpeg not found (limited format support)"
+fi
+
+# Step 1: Extract metadata
+info "Step 1: Extracting audio metadata..." 
+ +FILE_SIZE=$(du -h "$AUDIO_FILE" | cut -f1) +info "File size: $FILE_SIZE" + +# Get duration if ffprobe is available +if command -v ffprobe &>/dev/null; then + DURATION=$(ffprobe -v error -show_entries format=duration \ + -of default=noprint_wrappers=1:nokey=1 "$AUDIO_FILE" 2>/dev/null || echo "0") + + # Convert to HH:MM:SS + if command -v date &>/dev/null; then + if [[ "$OSTYPE" == "darwin"* ]]; then + # macOS + DURATION_HMS=$(date -u -r "${DURATION%.*}" +%H:%M:%S 2>/dev/null || echo "Unknown") + else + # Linux + DURATION_HMS=$(date -u -d @"${DURATION%.*}" +%H:%M:%S 2>/dev/null || echo "Unknown") + fi + else + DURATION_HMS="Unknown" + fi + + info "Duration: $DURATION_HMS" +else + warn "ffprobe not found - cannot extract duration" + DURATION="0" + DURATION_HMS="Unknown" +fi + +# Check file size warning +SIZE_MB=$(du -m "$AUDIO_FILE" | cut -f1) +if [[ $SIZE_MB -gt 25 ]]; then + warn "Large file ($FILE_SIZE) - processing may take several minutes" + read -p "Continue? [Y/n]: " CONTINUE + if [[ "$CONTINUE" =~ ^[Nn] ]]; then + info "Transcription cancelled" + exit 0 + fi +fi + +# Step 2: Transcribe using Python +info "Step 2: Transcribing audio..." 
+ +OUTPUT_FILE="${AUDIO_FILE%.*}.md" +TEMP_JSON="/tmp/transcription_$$.json" + +python3 << EOF +import sys +import json +from datetime import datetime + +try: + if "$TRANSCRIBER" == "faster-whisper": + from faster_whisper import WhisperModel + model = WhisperModel("$MODEL", device="cpu", compute_type="int8") + segments, info = model.transcribe("$AUDIO_FILE", language=None, vad_filter=True) + + data = { + "language": info.language, + "language_probability": round(info.language_probability, 2), + "duration": info.duration, + "segments": [] + } + + for segment in segments: + data["segments"].append({ + "start": round(segment.start, 2), + "end": round(segment.end, 2), + "text": segment.text.strip() + }) + else: + import whisper + model = whisper.load_model("$MODEL") + result = model.transcribe("$AUDIO_FILE") + + data = { + "language": result["language"], + "duration": result["segments"][-1]["end"] if result["segments"] else 0, + "segments": result["segments"] + } + + with open("$TEMP_JSON", "w") as f: + json.dump(data, f) + + print(f"βœ… Language detected: {data['language']}") + print(f"πŸ“ Transcribed {len(data['segments'])} segments") + +except Exception as e: + print(f"❌ Error: {e}", file=sys.stderr) + sys.exit(1) +EOF + +# Check if transcription succeeded +if [[ ! -f "$TEMP_JSON" ]]; then + error "Transcription failed" +fi + +# Step 3: Generate Markdown output +info "Step 3: Generating Markdown report..." 
+
+# NOTE: the heredoc delimiter below is intentionally unquoted so the shell
+# expands the dollar-prefixed variables before Python runs; Python f-string
+# braces (no dollar sign) are left untouched by the shell.
+python3 << EOF
+import json
+from datetime import datetime
+
+# Load transcription data
+with open("${TEMP_JSON}") as f:
+    data = json.load(f)
+
+# Prepare metadata
+filename = "${AUDIO_FILE}".split("/")[-1]
+file_size = "${FILE_SIZE}"
+duration_hms = "${DURATION_HMS}"
+language = data["language"]
+process_date = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+num_segments = len(data["segments"])
+
+# Generate Markdown
+markdown = f"""# Audio Transcription Report
+
+## πŸ“Š Metadata
+
+| Field | Value |
+|-------|-------|
+| **File Name** | {filename} |
+| **File Size** | {file_size} |
+| **Duration** | {duration_hms} |
+| **Language** | {language.upper()} |
+| **Processed Date** | {process_date} |
+| **Segments** | {num_segments} |
+| **Transcription Engine** | ${TRANSCRIBER} (model: ${MODEL}) |
+
+---
+
+## πŸŽ™οΈ Full Transcription
+
+"""
+
+# Add transcription with timestamps
+for seg in data["segments"]:
+    start_time = f"{int(seg['start'] // 60):02d}:{int(seg['start'] % 60):02d}"
+    end_time = f"{int(seg['end'] // 60):02d}:{int(seg['end'] % 60):02d}"
+    markdown += f"**[{start_time} β†’ {end_time}]** \n{seg['text']}\n\n"
+
+markdown += """---
+
+## πŸ“ Summary
+
+*Automatic summary generation requires AI integration (Claude/GPT).*
+*For now, review the full transcription above.*
+
+---
+
+*Generated by audio-transcriber skill example script*
+*Transcription engine: ${TRANSCRIBER} | Model: ${MODEL}*
+"""
+
+# Write to file
+with open("${OUTPUT_FILE}", "w") as f:
+    f.write(markdown)
+
+print(f"βœ… Markdown report saved: ${OUTPUT_FILE}")
+EOF
+
+# Check if transcription succeeded
+if [[ ! -f "$TEMP_JSON" ]]; then
+  error "Transcription failed"
+fi
+
+# Step 3: Generate Markdown output was handled above; clean up
+rm -f "$TEMP_JSON"
+
+# Step 4: Display summary
+success "Transcription complete!"
+echo ""
+echo "πŸ“Š Results:"
+echo "  Output file: $OUTPUT_FILE"
+echo "  Transcription engine: $TRANSCRIBER"
+echo "  Model: $MODEL"
+echo ""
+info "Next steps:"
+echo "  1. Review the transcription: cat $OUTPUT_FILE"
+echo "  2. Edit if needed: vim $OUTPUT_FILE"
+echo "  3. 
Share with team or archive"
diff --git a/skills/audio-transcriber/references/tools-comparison.md b/skills/audio-transcriber/references/tools-comparison.md
new file mode 100644
index 00000000..8a1f9bcf
--- /dev/null
+++ b/skills/audio-transcriber/references/tools-comparison.md
@@ -0,0 +1,352 @@
+# Transcription Tools Comparison
+
+Comprehensive comparison of audio transcription engines supported by the audio-transcriber skill.
+
+## Overview
+
+| Tool | Type | Speed | Quality | Cost | Privacy | Offline | Languages |
+|------|------|-------|---------|------|---------|---------|-----------|
+| **Faster-Whisper** | Open-source | ⚑⚑⚑⚑⚑ | ⭐⭐⭐⭐⭐ | Free | 100% | βœ… | 99 |
+| **Whisper** | Open-source | ⚑⚑⚑ | ⭐⭐⭐⭐⭐ | Free | 100% | βœ… | 99 |
+| Google Speech-to-Text | Commercial API | ⚑⚑⚑⚑ | ⭐⭐⭐⭐⭐ | $0.006/15s | Partial | ❌ | 125+ |
+| Azure Speech | Commercial API | ⚑⚑⚑⚑ | ⭐⭐⭐⭐ | $1/hour | Partial | ❌ | 100+ |
+| AssemblyAI | Commercial API | ⚑⚑⚑⚑ | ⭐⭐⭐⭐⭐ | $0.00025/s | Partial | ❌ | 99 |
+
+---
+
+## Faster-Whisper (Recommended)
+
+### Pros
+βœ… **4-5x faster** than original Whisper
+βœ… **Same quality** as original Whisper
+βœ… **Lower memory usage** (50-60% less RAM)
+βœ… **Free and open-source**
+βœ… **100% offline** (privacy guaranteed)
+βœ… **Easy installation** (`pip install faster-whisper`)
+βœ… **Drop-in replacement** for Whisper
+
+### Cons
+❌ Requires Python 3.8+
+❌ Initial model download (~100MB-1.5GB)
+❌ GPU optional but speeds up processing significantly
+
+### Installation
+
+```bash
+pip install faster-whisper
+```
+
+### Usage Example
+
+```python
+from faster_whisper import WhisperModel
+
+# Load model (auto-downloads on first run)
+model = WhisperModel("base", device="cpu", compute_type="int8")
+
+# Transcribe
+segments, info = model.transcribe("audio.mp3", language="pt")
+
+# Print results
+for segment in segments:
+    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
+```
+
+### Model Sizes
+
+| Model | Size | RAM | Speed (CPU) | Quality | 
+|-------|------|-----|-------------|---------| +| `tiny` | 39 MB | ~1 GB | Very fast (~10x realtime) | Basic | +| `base` | 74 MB | ~1 GB | Fast (~7x realtime) | Good | +| `small` | 244 MB | ~2 GB | Moderate (~4x realtime) | Very good | +| `medium` | 769 MB | ~5 GB | Slow (~2x realtime) | Excellent | +| `large` | 1550 MB | ~10 GB | Very slow (~1x realtime) | Best | + +**Recommendation:** `small` or `medium` for production use. + +--- + +## Whisper (Original) + +### Pros +βœ… **Official OpenAI model** +βœ… **Excellent quality** +βœ… **Free and open-source** +βœ… **100% offline** +βœ… **Well-documented** +βœ… **Large community** + +### Cons +❌ **Slower** than Faster-Whisper (4-5x) +❌ **Higher memory usage** +❌ Requires PyTorch (large dependency) +❌ GPU highly recommended for larger models + +### Installation + +```bash +pip install openai-whisper +``` + +### Usage Example + +```python +import whisper + +# Load model +model = whisper.load_model("base") + +# Transcribe +result = model.transcribe("audio.mp3", language="pt") + +# Print results +print(result["text"]) +``` + +### When to Use Whisper vs. 
Faster-Whisper + +**Use Faster-Whisper if:** +- Speed is important +- Limited RAM available +- Processing many files + +**Use Original Whisper if:** +- Faster-Whisper installation issues +- Need exact OpenAI implementation +- Already have Whisper in project dependencies + +--- + +## Google Cloud Speech-to-Text + +### Pros +βœ… **Very accurate** (industry-leading) +βœ… **Fast processing** (cloud infrastructure) +βœ… **125+ languages** +βœ… **Word-level timestamps** +βœ… **Punctuation & capitalization** +βœ… **Speaker diarization** (premium) + +### Cons +❌ **Requires internet** (cloud-only) +❌ **Costs money** (after free tier) +❌ **Privacy concerns** (audio uploaded to Google) +❌ Requires GCP account setup +❌ Complex authentication + +### Pricing + +- **Free tier:** 60 minutes/month +- **Standard:** $0.006 per 15 seconds ($1.44/hour) +- **Premium:** $0.009 per 15 seconds (with diarization) + +### Installation + +```bash +pip install google-cloud-speech +``` + +### Setup + +1. Create GCP project +2. Enable Speech-to-Text API +3. Create service account & download JSON key +4. 
Set environment variable: + ```bash + export GOOGLE_APPLICATION_CREDENTIALS="path/to/key.json" + ``` + +### Usage Example + +```python +from google.cloud import speech + +client = speech.SpeechClient() + +with open("audio.wav", "rb") as audio_file: + content = audio_file.read() + +audio = speech.RecognitionAudio(content=content) +config = speech.RecognitionConfig( + encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16, + sample_rate_hertz=16000, + language_code="pt-BR", +) + +response = client.recognize(config=config, audio=audio) + +for result in response.results: + print(result.alternatives[0].transcript) +``` + +--- + +## Azure Speech Services + +### Pros +βœ… **High accuracy** +βœ… **100+ languages** +βœ… **Real-time transcription** +βœ… **Custom models** (train on your data) +βœ… **Good Microsoft ecosystem integration** + +### Cons +❌ **Requires internet** +❌ **Costs money** (after free tier) +❌ **Privacy concerns** (cloud processing) +❌ Requires Azure account +❌ Complex setup + +### Pricing + +- **Free tier:** 5 hours/month +- **Standard:** $1.00 per audio hour + +### Installation + +```bash +pip install azure-cognitiveservices-speech +``` + +### Setup + +1. Create Azure account +2. Create Speech resource +3. Get API key and region +4. 
Set environment variables:
+   ```bash
+   export AZURE_SPEECH_KEY="your-key"
+   export AZURE_SPEECH_REGION="your-region"
+   ```
+
+### Usage Example
+
+```python
+import os
+
+import azure.cognitiveservices.speech as speechsdk
+
+speech_config = speechsdk.SpeechConfig(
+    subscription=os.environ.get('AZURE_SPEECH_KEY'),
+    region=os.environ.get('AZURE_SPEECH_REGION')
+)
+
+audio_config = speechsdk.audio.AudioConfig(filename="audio.wav")
+speech_recognizer = speechsdk.SpeechRecognizer(
+    speech_config=speech_config,
+    audio_config=audio_config
+)
+
+result = speech_recognizer.recognize_once()
+print(result.text)
+```
+
+---
+
+## AssemblyAI
+
+### Pros
+βœ… **Modern, developer-friendly API**
+βœ… **Excellent accuracy**
+βœ… **Advanced features** (sentiment, topic detection, PII redaction)
+βœ… **Speaker diarization** (included)
+βœ… **Fast processing**
+βœ… **Good documentation**
+
+### Cons
+❌ **Requires internet**
+❌ **Costs money** (no free tier, only trial credits)
+❌ **Privacy concerns** (cloud processing)
+❌ Requires API key
+
+### Pricing
+
+- **Free trial:** $50 credits
+- **Standard:** $0.00025 per second (~$0.90/hour)
+
+### Installation
+
+```bash
+pip install assemblyai
+```
+
+### Setup
+
+1. Sign up at assemblyai.com
+2. Get API key
+3. 
Set environment variable: + ```bash + export ASSEMBLYAI_API_KEY="your-key" + ``` + +### Usage Example + +```python +import os +import assemblyai as aai + +aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"] + +transcriber = aai.Transcriber() +transcript = transcriber.transcribe("audio.mp3") + +print(transcript.text) + +# Speaker diarization +for utterance in transcript.utterances: + print(f"Speaker {utterance.speaker}: {utterance.text}") +``` + +--- + +## Recommendation Matrix + +### Use Faster-Whisper if: +- βœ… Privacy is critical (local processing) +- βœ… Want zero cost (free forever) +- βœ… Need offline capability +- βœ… Processing many files (speed matters) +- βœ… Limited budget + +### Use Google Speech-to-Text if: +- βœ… Need absolute best accuracy +- βœ… Have budget for cloud services +- βœ… Want advanced features (punctuation, diarization) +- βœ… Already using GCP ecosystem + +### Use Azure Speech if: +- βœ… In Microsoft ecosystem +- βœ… Need custom model training +- βœ… Want real-time transcription +- βœ… Have Azure credits + +### Use AssemblyAI if: +- βœ… Need advanced features (sentiment, topics) +- βœ… Want easiest API experience +- βœ… Need automatic PII redaction +- βœ… Value developer experience + +--- + +## Performance Benchmarks + +**Test:** 1-hour podcast (MP3, 44.1kHz, stereo) + +| Tool | Processing Time | Accuracy | Cost | +|------|----------------|----------|------| +| Faster-Whisper (small) | 8 min | 94% | $0 | +| Whisper (small) | 32 min | 94% | $0 | +| Google Speech | 2 min | 96% | $1.44 | +| Azure Speech | 3 min | 95% | $1.00 | +| AssemblyAI | 4 min | 96% | $0.90 | + +*Benchmarks run on MacBook Pro M1, 16GB RAM* + +--- + +## Conclusion + +**For the audio-transcriber skill:** + +1. **Primary:** Faster-Whisper (best balance of speed, quality, privacy, cost) +2. **Fallback:** Whisper (if Faster-Whisper unavailable) +3. 
**Optional:** Cloud APIs (user choice for premium features) + +This ensures the skill works out-of-the-box for most users while allowing advanced users to integrate commercial services if needed. diff --git a/skills/audio-transcriber/scripts/install-requirements.sh b/skills/audio-transcriber/scripts/install-requirements.sh new file mode 100755 index 00000000..48f51d9f --- /dev/null +++ b/skills/audio-transcriber/scripts/install-requirements.sh @@ -0,0 +1,190 @@ +#!/usr/bin/env bash + +# Audio Transcriber - Requirements Installation Script +# Automatically installs and validates dependencies + +set -euo pipefail + +# Colors +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +RED='\033[0;31m' +BLUE='\033[0;34m' +NC='\033[0m' + +echo -e "${BLUE}πŸ”§ Audio Transcriber - Dependency Installation${NC}" +echo "" + +# Check Python +if ! command -v python3 &>/dev/null; then + echo -e "${RED}❌ Python 3 not found. Please install Python 3.8+${NC}" + exit 1 +fi + +PYTHON_VERSION=$(python3 --version | cut -d' ' -f2 | cut -d'.' -f1,2) +echo -e "${GREEN}βœ… Python ${PYTHON_VERSION} detected${NC}" + +# Check pip +if ! python3 -m pip --version &>/dev/null; then + echo -e "${RED}❌ pip not found. Please install pip${NC}" + exit 1 +fi + +echo -e "${GREEN}βœ… pip available${NC}" +echo "" + +# Install system dependencies (macOS only) +if [[ "$OSTYPE" == "darwin"* ]]; then + echo -e "${BLUE}πŸ“¦ Checking system dependencies (macOS)...${NC}" + + # Check for Homebrew + if command -v brew &>/dev/null; then + # Install pkg-config and ffmpeg if not present + NEED_INSTALL="" + + if ! brew list pkg-config &>/dev/null 2>&1; then + NEED_INSTALL="$NEED_INSTALL pkg-config" + fi + + if ! 
brew list ffmpeg &>/dev/null 2>&1; then + NEED_INSTALL="$NEED_INSTALL ffmpeg" + fi + + if [[ -n "$NEED_INSTALL" ]]; then + echo -e "${BLUE}Installing:$NEED_INSTALL${NC}" + brew install $NEED_INSTALL --quiet + echo -e "${GREEN}βœ… System dependencies installed${NC}" + else + echo -e "${GREEN}βœ… System dependencies already installed${NC}" + fi + else + echo -e "${YELLOW}⚠️ Homebrew not found. Install manually if needed:${NC}" + echo " /bin/bash -c \"\$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"" + fi +fi + +echo "" + +# Install faster-whisper (recommended) +echo -e "${BLUE}πŸ“¦ Installing Faster-Whisper...${NC}" + +# Try different installation methods based on Python environment +if python3 -m pip install faster-whisper --quiet 2>/dev/null; then + echo -e "${GREEN}βœ… Faster-Whisper installed successfully${NC}" +elif python3 -m pip install --user --break-system-packages faster-whisper --quiet 2>/dev/null; then + echo -e "${GREEN}βœ… Faster-Whisper installed successfully (user mode)${NC}" +else + echo -e "${YELLOW}⚠️ Faster-Whisper installation failed, trying Whisper...${NC}" + + if python3 -m pip install openai-whisper --quiet 2>/dev/null; then + echo -e "${GREEN}βœ… Whisper installed successfully${NC}" + elif python3 -m pip install --user --break-system-packages openai-whisper --quiet 2>/dev/null; then + echo -e "${GREEN}βœ… Whisper installed successfully (user mode)${NC}" + else + echo -e "${RED}❌ Failed to install transcription engine${NC}" + echo "" + echo -e "${YELLOW}Manual installation options:${NC}" + echo " 1. Use --break-system-packages (macOS/Homebrew Python):" + echo " python3 -m pip install --user --break-system-packages openai-whisper" + echo "" + echo " 2. Use virtual environment (recommended):" + echo " python3 -m venv ~/whisper-env" + echo " source ~/whisper-env/bin/activate" + echo " pip install faster-whisper" + echo "" + echo " 3. 
Use pipx (isolated):" + echo " brew install pipx" + echo " pipx install openai-whisper" + exit 1 + fi +fi + +# Install UI/progress libraries (tqdm, rich) +echo "" +echo -e "${BLUE}πŸ“¦ Installing UI libraries (tqdm, rich)...${NC}" + +if python3 -m pip install tqdm rich --quiet 2>/dev/null; then + echo -e "${GREEN}βœ… tqdm and rich installed successfully${NC}" +elif python3 -m pip install --user --break-system-packages tqdm rich --quiet 2>/dev/null; then + echo -e "${GREEN}βœ… tqdm and rich installed successfully (user mode)${NC}" +else + echo -e "${YELLOW}⚠️ Optional UI libraries not installed (skill will still work)${NC}" +fi + +# Check ffmpeg (optional but recommended) +echo "" +if command -v ffmpeg &>/dev/null; then + echo -e "${GREEN}βœ… ffmpeg already installed${NC}" +else + echo -e "${YELLOW}⚠️ ffmpeg not found (should have been installed earlier)${NC}" + if [[ "$OSTYPE" == "darwin"* ]] && command -v brew &>/dev/null; then + echo -e "${BLUE}Installing ffmpeg via Homebrew...${NC}" + brew install ffmpeg --quiet && echo -e "${GREEN}βœ… ffmpeg installed${NC}" + else + echo -e "${BLUE}ℹ️ ffmpeg is optional but recommended for format conversion${NC}" + echo "" + echo "Install ffmpeg:" + if [[ "$OSTYPE" == "darwin"* ]]; then + echo " brew install ffmpeg" + elif [[ "$OSTYPE" == "linux-gnu"* ]]; then + echo " sudo apt install ffmpeg # Debian/Ubuntu" + echo " sudo yum install ffmpeg # CentOS/RHEL" + fi + fi +fi + +# Verify installation +echo "" +echo -e "${BLUE}πŸ” Verifying installation...${NC}" + +if python3 -c "import faster_whisper" 2>/dev/null; then + echo -e "${GREEN}βœ… Faster-Whisper verified${NC}" + TRANSCRIBER="Faster-Whisper" +elif python3 -c "import whisper" 2>/dev/null; then + echo -e "${GREEN}βœ… Whisper verified${NC}" + TRANSCRIBER="Whisper" +else + echo -e "${RED}❌ No transcription engine found after installation${NC}" + exit 1 +fi + +# Download initial model (optional) +read -p "Download Whisper 'base' model now? 
(recommended, ~74MB) [Y/n]: " DOWNLOAD_MODEL + +if [[ ! "$DOWNLOAD_MODEL" =~ ^[Nn] ]]; then + echo "" + echo -e "${BLUE}πŸ“₯ Downloading 'base' model...${NC}" + + python3 << 'EOF' +try: + import faster_whisper + model = faster_whisper.WhisperModel("base", device="cpu", compute_type="int8") + print("βœ… Model downloaded successfully") +except Exception: + try: + import whisper + model = whisper.load_model("base") + print("βœ… Model downloaded successfully") + except Exception as e: + print(f"❌ Model download failed: {e}") +EOF +fi + +# Success summary +echo "" +echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}" +echo -e "${GREEN}βœ… Installation Complete!${NC}" +echo -e "${GREEN}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}" +echo "" +echo "πŸ“Š Installed components:" +echo " β€’ Transcription engine: $TRANSCRIBER" +if command -v ffmpeg &>/dev/null; then + echo " β€’ Format conversion: ffmpeg (available)" +else + echo " β€’ Format conversion: ffmpeg (not installed)" +fi +echo "" +echo "πŸš€ Ready to use! Try:" +echo " copilot> transcribe audio to markdown: myfile.mp3" +echo " claude> transcreva este Γ‘udio: myfile.mp3" +echo "" diff --git a/skills/audio-transcriber/scripts/transcribe.py b/skills/audio-transcriber/scripts/transcribe.py new file mode 100755 index 00000000..1bf724df --- /dev/null +++ b/skills/audio-transcriber/scripts/transcribe.py @@ -0,0 +1,510 @@ +#!/usr/bin/env python3 +""" +Audio Transcriber v1.1.0 +Transcreve Γ‘udio para texto e gera atas/resumos usando LLM. 
+""" + +import os +import sys +import json +import subprocess +import shutil +from datetime import datetime +from pathlib import Path + +# Rich for beautiful terminal output +try: + from rich.console import Console + from rich.prompt import Prompt + from rich.panel import Panel + from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn + from rich import print as rprint + RICH_AVAILABLE = True +except ImportError: + RICH_AVAILABLE = False + print("⚠️ Installing rich for better UI...") + subprocess.run([sys.executable, "-m", "pip", "install", "--user", "rich"], check=False) + from rich.console import Console + from rich.prompt import Prompt + from rich.panel import Panel + from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn + from rich import print as rprint + +# tqdm for progress bars +try: + from tqdm import tqdm +except ImportError: + print("⚠️ Installing tqdm for progress bars...") + subprocess.run([sys.executable, "-m", "pip", "install", "--user", "tqdm"], check=False) + from tqdm import tqdm + +# Whisper engines +try: + from faster_whisper import WhisperModel + TRANSCRIBER = "faster-whisper" +except ImportError: + try: + import whisper + TRANSCRIBER = "whisper" + except ImportError: + print("❌ Nenhum engine de transcriΓ§Γ£o encontrado!") + print(" Instale: pip install faster-whisper") + sys.exit(1) + +console = Console() + +# Template padrΓ£o RISEN para fallback +DEFAULT_MEETING_PROMPT = """ +Role: VocΓͺ Γ© um transcritor profissional especializado em documentaΓ§Γ£o. + +Instructions: Transforme a transcriΓ§Γ£o fornecida em um documento estruturado e profissional. + +Steps: +1. Identifique o tipo de conteΓΊdo (reuniΓ£o, palestra, entrevista, etc.) +2. Extraia os principais tΓ³picos e pontos-chave +3. Identifique participantes/speakers (se aplicΓ‘vel) +4. Extraia decisΓ΅es tomadas e aΓ§Γ΅es definidas (se reuniΓ£o) +5. Organize em formato apropriado com seΓ§Γ΅es claras +6. 
Use Markdown para formataΓ§Γ£o profissional + +End Goal: Documento final bem estruturado, legΓ­vel e pronto para distribuiΓ§Γ£o. + +Narrowing: +- Mantenha objetividade e clareza +- Preserve contexto importante +- Use formataΓ§Γ£o Markdown adequada +- Inclua timestamps relevantes quando aplicΓ‘vel +""" + + +def detect_cli_tool(): + """Detecta qual CLI de LLM estΓ‘ disponΓ­vel (claude > gh copilot).""" + if shutil.which('claude'): + return 'claude' + elif shutil.which('gh'): + result = subprocess.run(['gh', 'copilot', '--version'], + capture_output=True, text=True) + if result.returncode == 0: + return 'gh-copilot' + return None + + +def invoke_prompt_engineer(raw_prompt, timeout=90): + """ + Invoca prompt-engineer skill via CLI para melhorar/gerar prompts. + + Args: + raw_prompt: Prompt a ser melhorado ou meta-prompt + timeout: Timeout em segundos + + Returns: + str: Prompt melhorado ou DEFAULT_MEETING_PROMPT se falhar + """ + try: + # Tentar via gh copilot + console.print("[dim] Invocando prompt-engineer...[/dim]") + + result = subprocess.run( + ['gh', 'copilot', 'suggest', '-t', 'shell', raw_prompt], + capture_output=True, + text=True, + timeout=timeout + ) + + if result.returncode == 0 and result.stdout.strip(): + return result.stdout.strip() + else: + console.print("[yellow]⚠️ prompt-engineer nΓ£o respondeu, usando template padrΓ£o[/yellow]") + return DEFAULT_MEETING_PROMPT + + except subprocess.TimeoutExpired: + console.print(f"[red]⚠️ Timeout apΓ³s {timeout}s, usando template padrΓ£o[/red]") + return DEFAULT_MEETING_PROMPT + except Exception as e: + console.print(f"[red]⚠️ Erro ao invocar prompt-engineer: {e}[/red]") + return DEFAULT_MEETING_PROMPT + + +def handle_prompt_workflow(user_prompt, transcript): + """ + Gerencia fluxo completo de prompts com prompt-engineer. 
+ + CenΓ‘rio A: UsuΓ‘rio forneceu prompt β†’ Melhorar AUTOMATICAMENTE β†’ Confirmar + CenΓ‘rio B: Sem prompt β†’ Sugerir tipo β†’ Confirmar β†’ Gerar β†’ Confirmar + + Returns: + str: Prompt final a usar, ou None se usuΓ‘rio recusou processamento + """ + prompt_engineer_available = os.path.exists( + os.path.expanduser('~/.copilot/skills/prompt-engineer/SKILL.md') + ) + + # ========== CENÁRIO A: USUÁRIO FORNECEU PROMPT ========== + if user_prompt: + console.print("\n[cyan]πŸ“ Prompt fornecido pelo usuΓ‘rio[/cyan]") + console.print(Panel(user_prompt[:300] + ("..." if len(user_prompt) > 300 else ""), + title="Prompt original", border_style="dim")) + + if prompt_engineer_available: + # Melhora AUTOMATICAMENTE (sem perguntar) + console.print("\n[cyan]πŸ”§ Melhorando prompt com prompt-engineer...[/cyan]") + + improved_prompt = invoke_prompt_engineer( + f"melhore este prompt:\n\n{user_prompt}" + ) + + # Mostrar AMBAS versΓ΅es + console.print("\n[green]✨ VersΓ£o melhorada:[/green]") + console.print(Panel(improved_prompt[:500] + ("..." if len(improved_prompt) > 500 else ""), + title="Prompt otimizado", border_style="green")) + + console.print("\n[dim]πŸ“ VersΓ£o original:[/dim]") + console.print(Panel(user_prompt[:300] + ("..." 
if len(user_prompt) > 300 else ""), + title="Seu prompt", border_style="dim")) + + # Pergunta qual usar + confirm = Prompt.ask( + "\nπŸ’‘ Usar versΓ£o melhorada?", + choices=["s", "n"], + default="s" + ) + + return improved_prompt if confirm == "s" else user_prompt + else: + # prompt-engineer nΓ£o disponΓ­vel + console.print("[yellow]⚠️ prompt-engineer skill nΓ£o disponΓ­vel[/yellow]") + console.print("[dim]βœ… Usando seu prompt original[/dim]") + return user_prompt + + # ========== CENÁRIO B: SEM PROMPT - AUTO-GERAÇÃO ========== + else: + console.print("\n[yellow]⚠️ Nenhum prompt fornecido.[/yellow]") + + if not prompt_engineer_available: + console.print("[yellow]⚠️ prompt-engineer skill nΓ£o encontrado[/yellow]") + console.print("[dim]Usando template padrΓ£o...[/dim]") + return DEFAULT_MEETING_PROMPT + + # PASSO 1: Perguntar se quer auto-gerar + console.print("Posso analisar o transcript e sugerir um formato de resumo/ata?") + + generate = Prompt.ask( + "\nπŸ’‘ Gerar prompt automaticamente?", + choices=["s", "n"], + default="s" + ) + + if generate == "n": + console.print("[dim]βœ… Ok, gerando apenas transcript.md (sem ata)[/dim]") + return None # Sinaliza: nΓ£o processar com LLM + + # PASSO 2: Analisar transcript e SUGERIR tipo + console.print("\n[cyan]πŸ” Analisando transcript...[/cyan]") + + suggestion_meta_prompt = f""" +Analise este transcript ({len(transcript)} caracteres) e sugira: + +1. Tipo de conteΓΊdo (reuniΓ£o, palestra, entrevista, etc.) +2. Formato de saΓ­da recomendado (ata formal, resumo executivo, notas estruturadas) +3. Framework ideal (RISEN, RODES, STAR, etc.) + +Primeiros 4000 caracteres do transcript: +{transcript[:4000]} + +Responda em 2-3 linhas concisas. 
+""" + + suggested_type = invoke_prompt_engineer(suggestion_meta_prompt) + + # PASSO 3: Mostrar sugestΓ£o e CONFIRMAR + console.print("\n[green]πŸ’‘ SugestΓ£o de formato:[/green]") + console.print(Panel(suggested_type, title="AnΓ‘lise do transcript", border_style="green")) + + confirm_type = Prompt.ask( + "\nπŸ’‘ Usar este formato?", + choices=["s", "n"], + default="s" + ) + + if confirm_type == "n": + console.print("[dim]Usando template padrΓ£o...[/dim]") + return DEFAULT_MEETING_PROMPT + + # PASSO 4: Gerar prompt completo baseado na sugestΓ£o + console.print("\n[cyan]✨ Gerando prompt estruturado...[/cyan]") + + final_meta_prompt = f""" +Crie um prompt completo e estruturado (usando framework apropriado) para: + +{suggested_type} + +O prompt deve instruir uma IA a transformar o transcript em um documento +profissional e bem formatado em Markdown. +""" + + generated_prompt = invoke_prompt_engineer(final_meta_prompt) + + # PASSO 5: Mostrar prompt gerado e CONFIRMAR + console.print("\n[green]βœ… Prompt gerado:[/green]") + console.print(Panel(generated_prompt[:600] + ("..." if len(generated_prompt) > 600 else ""), + title="Preview", border_style="green")) + + confirm_final = Prompt.ask( + "\nπŸ’‘ Usar este prompt?", + choices=["s", "n"], + default="s" + ) + + if confirm_final == "s": + return generated_prompt + else: + console.print("[dim]Usando template padrΓ£o...[/dim]") + return DEFAULT_MEETING_PROMPT + + +def process_with_llm(transcript, prompt, cli_tool='claude', timeout=300): + """ + Processa transcript com LLM usando prompt fornecido. 
+ + Args: + transcript: Texto transcrito + prompt: Prompt instruindo como processar + cli_tool: 'claude' ou 'gh-copilot' + timeout: Timeout em segundos + + Returns: + str: Ata/resumo processado + """ + full_prompt = f"{prompt}\n\n---\n\nTranscriΓ§Γ£o:\n\n{transcript}" + + try: + with Progress( + SpinnerColumn(), + TextColumn("[progress.description]{task.description}"), + transient=True + ) as progress: + progress.add_task(description=f"πŸ€– Processando com {cli_tool}...", total=None) + + if cli_tool == 'claude': + result = subprocess.run( + ['claude', '-p'], # -p: modo nΓ£o interativo (print), lΓͺ o prompt do stdin + input=full_prompt, + capture_output=True, + text=True, + timeout=timeout + ) + elif cli_tool == 'gh-copilot': + result = subprocess.run( + ['gh', 'copilot', 'suggest', '-t', 'shell', full_prompt], + capture_output=True, + text=True, + timeout=timeout + ) + else: + raise ValueError(f"CLI tool desconhecido: {cli_tool}") + + if result.returncode == 0: + return result.stdout.strip() + else: + console.print(f"[red]❌ Erro ao processar com {cli_tool}[/red]") + console.print(f"[dim]{result.stderr[:200]}[/dim]") + return None + + except subprocess.TimeoutExpired: + console.print(f"[red]❌ Timeout apΓ³s {timeout}s[/red]") + return None + except Exception as e: + console.print(f"[red]❌ Erro: {e}[/red]") + return None + + +def transcribe_audio(audio_file, model="base"): + """ + Transcreve Γ‘udio usando Whisper com barra de progresso. 
+ + Returns: + dict: {language, duration, segments: [{start, end, text}]} + """ + console.print(f"\n[cyan]πŸŽ™οΈ Transcrevendo Γ‘udio com {TRANSCRIBER}...[/cyan]") + + try: + if TRANSCRIBER == "faster-whisper": + model_obj = WhisperModel(model, device="cpu", compute_type="int8") + segments, info = model_obj.transcribe( + audio_file, + language=None, + vad_filter=True, + word_timestamps=True + ) + + data = { + "language": info.language, + "language_probability": round(info.language_probability, 2), + "duration": info.duration, + "segments": [] + } + + # Converter generator em lista com progresso + console.print("[dim]Processando segmentos...[/dim]") + for segment in tqdm(segments, desc="Segmentos", unit="seg"): + data["segments"].append({ + "start": round(segment.start, 2), + "end": round(segment.end, 2), + "text": segment.text.strip() + }) + + else: # whisper original + import whisper + model_obj = whisper.load_model(model) + result = model_obj.transcribe(audio_file, word_timestamps=True) + + data = { + "language": result["language"], + "duration": result["segments"][-1]["end"] if result["segments"] else 0, + "segments": result["segments"] + } + + console.print(f"[green]βœ… TranscriΓ§Γ£o completa! Idioma: {data['language'].upper()}[/green]") + console.print(f"[dim] {len(data['segments'])} segmentos processados[/dim]") + + return data + + except Exception as e: + console.print(f"[red]❌ Erro na transcriΓ§Γ£o: {e}[/red]") + sys.exit(1) + + +def save_outputs(transcript_text, ata_text, audio_file, output_dir="."): + """ + Salva transcript e ata em arquivos .md com timestamp. 
+ + Returns: + tuple: (transcript_path, ata_path or None) + """ + timestamp = datetime.now().strftime("%Y%m%d-%H%M%S") + base_name = Path(audio_file).stem + + # Sempre salva transcript + transcript_filename = f"transcript-{timestamp}.md" + transcript_path = Path(output_dir) / transcript_filename + + with open(transcript_path, 'w', encoding='utf-8') as f: + f.write(transcript_text) + + console.print(f"[green]βœ… Transcript salvo:[/green] {transcript_filename}") + + # Salva ata se existir + ata_path = None + if ata_text: + ata_filename = f"ata-{timestamp}.md" + ata_path = Path(output_dir) / ata_filename + + with open(ata_path, 'w', encoding='utf-8') as f: + f.write(ata_text) + + console.print(f"[green]βœ… Ata salva:[/green] {ata_filename}") + + return str(transcript_path), str(ata_path) if ata_path else None + + +def cleanup_temp_files(output_dir=".", keep_temp=False): + """Remove arquivos temporΓ‘rios JSON se nΓ£o for para manter.""" + if keep_temp: + return + + temp_files = ["metadata.json", "transcription.json"] + removed = [] + + for filename in temp_files: + filepath = Path(output_dir) / filename + if filepath.exists(): + filepath.unlink() + removed.append(filename) + + if removed: + console.print(f"[dim]🧹 Removidos arquivos temporΓ‘rios: {', '.join(removed)}[/dim]") + + +def main(): + """FunΓ§Γ£o principal.""" + import argparse + + parser = argparse.ArgumentParser(description="Audio Transcriber v1.1.0") + parser.add_argument("audio_file", help="Arquivo de Γ‘udio para transcrever") + parser.add_argument("--prompt", help="Prompt customizado para processar transcript") + parser.add_argument("--model", default="base", help="Modelo Whisper (tiny/base/small/medium/large)") + parser.add_argument("--output-dir", default=".", help="DiretΓ³rio de saΓ­da") + parser.add_argument("--keep-temp", action="store_true", help="Manter arquivos temporΓ‘rios JSON") + + args = parser.parse_args() + + # Verificar arquivo existe + if not os.path.exists(args.audio_file): + 
console.print(f"[red]❌ Arquivo nΓ£o encontrado: {args.audio_file}[/red]") + sys.exit(1) + + console.print("[bold cyan]🎡 Audio Transcriber v1.1.0[/bold cyan]\n") + + # Step 1: Transcrever + transcription_data = transcribe_audio(args.audio_file, model=args.model) + + # Gerar texto do transcript + transcript_text = "# TranscriΓ§Γ£o de Áudio\n\n" + transcript_text += f"**Arquivo:** {Path(args.audio_file).name}\n" + transcript_text += f"**Idioma:** {transcription_data['language'].upper()}\n" + transcript_text += f"**Data:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n" + transcript_text += "---\n\n## TranscriΓ§Γ£o Completa\n\n" + + for seg in transcription_data["segments"]: + start_min = int(seg["start"] // 60) + start_sec = int(seg["start"] % 60) + end_min = int(seg["end"] // 60) + end_sec = int(seg["end"] % 60) + transcript_text += f"**[{start_min:02d}:{start_sec:02d} β†’ {end_min:02d}:{end_sec:02d}]** \n{seg['text']}\n\n" + + # Step 2: Detectar CLI + cli_tool = detect_cli_tool() + + if not cli_tool: + console.print("\n[yellow]⚠️ Nenhuma CLI de IA detectada (Claude ou GitHub Copilot)[/yellow]") + console.print("[dim]ℹ️ Salvando apenas transcript.md...[/dim]") + + save_outputs(transcript_text, None, args.audio_file, args.output_dir) + cleanup_temp_files(args.output_dir, args.keep_temp) + + console.print("\n[cyan]πŸ’‘ Para gerar ata/resumo:[/cyan]") + console.print(" - Instale Claude Code CLI: npm install -g @anthropic-ai/claude-code") + console.print(" - Ou instale a extensΓ£o Copilot CLI: gh extension install github/gh-copilot") + return + + console.print(f"\n[green]βœ… CLI detectada: {cli_tool}[/green]") + + # Step 3: Workflow de prompt + final_prompt = handle_prompt_workflow(args.prompt, transcript_text) + + if final_prompt is None: + # UsuΓ‘rio recusou processamento + save_outputs(transcript_text, None, args.audio_file, args.output_dir) + cleanup_temp_files(args.output_dir, args.keep_temp) + return + + # Step 4: Processar com LLM + ata_text = process_with_llm(transcript_text, final_prompt, cli_tool) 
+ + if ata_text: + console.print("[green]βœ… Ata gerada com sucesso![/green]") + else: + console.print("[yellow]⚠️ Falha ao gerar ata, salvando apenas transcript[/yellow]") + + # Step 5: Salvar arquivos + console.print("\n[cyan]πŸ’Ύ Salvando arquivos...[/cyan]") + save_outputs(transcript_text, ata_text, args.audio_file, args.output_dir) + + # Step 6: Cleanup + cleanup_temp_files(args.output_dir, args.keep_temp) + + console.print("\n[bold green]βœ… ConcluΓ­do![/bold green]") + + +if __name__ == "__main__": + main() diff --git a/skills/prompt-engineer/README.md b/skills/prompt-engineer/README.md new file mode 100644 index 00000000..1a757d5a --- /dev/null +++ b/skills/prompt-engineer/README.md @@ -0,0 +1,659 @@ +# 🎯 Prompt Engineer + +**Version:** 1.0.1 +**Status:** ✨ Zero-Config | 🌍 Universal + +Transform raw prompts into optimized, production-ready prompts using 11 established prompting frameworks. + +--- + +## πŸ“‹ Overview + +**Prompt Engineer** is an intelligent AI skill that analyzes your intentions and automatically generates optimized prompts for Claude, ChatGPT, or any other AI model. Instead of struggling with how to phrase complex requests, simply describe what you want - the skill handles the rest. + +This skill works in **"magic mode"** - it operates silently, only asking questions when absolutely necessary. You provide a rough idea, and it returns a polished, structured prompt ready to use. 
+ +### ✨ Key Features + +- **🎯 Intent Analysis:** Understands what you're trying to accomplish +- **🧠 11 Frameworks:** RTF, Chain of Thought, RISEN, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW +- **πŸ”„ Framework Blending:** Combines multiple frameworks when needed +- **❓ Smart Clarification:** Only asks questions when critically needed +- **πŸ“ Adaptive Output:** Matches prompt length to task complexity +- **🌍 Universal:** Works in any project, any terminal context +- **πŸͺ„ Magic Mode:** No technical explanations - just results + +--- + +## πŸš€ Quick Start + +### Invoke the Skill + +Use any of these trigger phrases: + +```bash +copilot> improve this prompt: [your rough prompt] +copilot> optimize prompt: [your idea] +copilot> refine prompt: [what you want] +copilot> prompt engineering for: [task description] +copilot> transform into prompt: [concept] +copilot> create prompt for: [goal] +``` + +### Example 1: Simple Task + +**Input:** +```bash +copilot> improve this prompt: help me write Python code +``` + +**Skill asks:** +``` +What type of Python code do you need? +- [ ] Data analysis +- [ ] Web scraping +- [ ] Task automation +- [ ] API/backend development +``` + +**You select:** "Web scraping" + +**Output:** +````markdown +``` +You are a Python developer expert in web scraping. + +Task: Create a Python script to scrape data from [specify target website]. + +Requirements: +- Use BeautifulSoup or Scrapy +- Implement pagination and rate limiting +- Store data in structured format (CSV or JSON) +- Include error handling for network issues + +Output format: Complete, commented Python code with usage examples. +``` +```` + +--- + +### Example 2: Complex Task (No Clarification Needed) + +**Input:** +```bash +copilot> create prompt for: product recommendation system with purchase history and market trends +``` + +**Output:** +````markdown +``` +Role: You are a senior machine learning engineer specializing in recommendation systems. 
+ +Objective: Design a product recommendation system that combines user purchase history, preferences, and market trends. + +Details: +- Input data sources: user transaction logs, preference surveys, market trend APIs +- Algorithm considerations: collaborative filtering, content-based filtering, hybrid approaches +- Performance metrics: precision, recall, diversity, novelty +- Scalability: system must handle 100k+ users + +Approach step-by-step: +1. Analyze the three data sources and their relationships +2. Propose 2-3 algorithm approaches with trade-offs +3. Design data pipeline architecture +4. Define evaluation metrics +5. Outline implementation roadmap + +Examples: Show pseudocode for core recommendation logic + +Sense Check: Validate that the solution balances personalization with discovery and scales to production. + +Output format: Technical design document with architecture diagrams and implementation roadmap. +``` +```` + +--- + +### Example 3: Ambiguous Task (Skill Clarifies) + +**Input:** +```bash +copilot> transform into prompt: I want to improve my process +``` + +**Skill asks:** +``` +What process do you want to improve? +- [ ] Work/productivity process +- [ ] Learning process +- [ ] Technical process (development, deployment, etc.) +``` + +**You select:** "Work process" + +**Skill asks:** +``` +What specific aspect? +- [ ] Time management +- [ ] Team collaboration +- [ ] Decision making +``` + +**You select:** "Decision making" + +**Output:** +````markdown +``` +Context: I struggle with making decisions quickly and effectively at work. + +Objective: Develop a structured framework to improve decision speed and quality. + +Current Reality: +- Decisions delayed by excessive analysis +- Lack of clear criteria +- Uncertainty about when to decide vs. gather information + +Desired Outcome: +- Consistent decision-making framework +- Clear criteria for different decision types +- Reduced decision fatigue + +Your task: +1. 
Assess current patterns and identify bottlenecks +2. Propose framework (Eisenhower Matrix, RAPID, Decision Tree) +3. Create decision criteria templates +4. Design tracking and iteration process + +Output format: Practical playbook with templates, examples, and metrics. +``` +```` + +--- + +## πŸ“š Supported Frameworks + +The **Prompt Engineer** skill uses **11 established frameworks** to optimize your prompts. The skill automatically selects and blends these frameworks based on your task - you never need to know or choose them manually. + +--- + +### 1. **RTF (Role-Task-Format)** + +**Structure:** Role β†’ Task β†’ Format + +**Best for:** Tasks requiring specific expertise or perspective + +**Components:** +- **Role:** "You are a [expert identity]" +- **Task:** "Your task is to [specific action]" +- **Format:** "Output format: [structure/style]" + +**Example:** +``` +You are a senior Python developer. +Task: Refactor this code for better performance. +Format: Provide refactored code with inline comments explaining changes. +``` + +--- + +### 2. **Chain of Thought** + +**Structure:** Problem β†’ Step 1 β†’ Step 2 β†’ ... β†’ Solution + +**Best for:** Complex reasoning, debugging, mathematical problems, logic puzzles + +**Components:** +- Break problem into sequential steps +- Show reasoning at each stage +- Build toward final solution + +**Example:** +``` +Solve this problem step-by-step: +1. Identify the core issue +2. Analyze contributing factors +3. Propose solution approach +4. Validate solution against requirements +``` + +--- + +### 3. **RISEN** + +**Structure:** Role, Instructions, Steps, End goal, Narrowing + +**Best for:** Multi-phase projects with clear deliverables and constraints + +**Components:** +- **Role:** Expert identity +- **Instructions:** What to do +- **Steps:** Sequential actions +- **End goal:** Desired outcome +- **Narrowing:** Constraints and focus areas + +**Example:** +``` +Role: You are a DevOps architect. 
+Instructions: Design a CI/CD pipeline for microservices. +Steps: 1) Analyze requirements 2) Select tools 3) Design workflow 4) Document +End goal: Automated deployment with zero-downtime releases. +Narrowing: Focus on AWS, limit to 3 environments (dev/staging/prod). +``` + +--- + +### 4. **RODES** + +**Structure:** Role, Objective, Details, Examples, Sense check + +**Best for:** Complex design, system architecture, research proposals + +**Components:** +- **Role:** Expert perspective +- **Objective:** What to achieve +- **Details:** Context and requirements +- **Examples:** Concrete illustrations +- **Sense check:** Validation criteria + +**Example:** +``` +Role: You are a system architect. +Objective: Design a scalable e-commerce platform. +Details: Handle 100k concurrent users, sub-200ms response time, multi-region. +Examples: Show database schema, caching strategy, load balancing. +Sense check: Validate solution meets latency and scalability requirements. +``` + +--- + +### 5. **Chain of Density** + +**Structure:** Iteration 1 (verbose) β†’ Iteration 2 β†’ ... β†’ Iteration 5 (maximum density) + +**Best for:** Summarization, compression, synthesis of long content + +**Process:** +- Start with verbose explanation +- Iteratively compress while preserving key information +- End with maximally dense version (high information per word) + +**Example:** +``` +Compress this article into progressively denser summaries: +1. Initial summary (300 words) +2. Compressed (200 words) +3. Further compressed (100 words) +4. Dense (50 words) +5. Maximum density (25 words, all critical points) +``` + +--- + +### 6. 
**RACE** + +**Structure:** Role, Audience, Context, Expectation + +**Best for:** Communication, presentations, stakeholder updates, storytelling + +**Components:** +- **Role:** Communicator identity +- **Audience:** Who you're addressing (expertise level, concerns) +- **Context:** Background/situation +- **Expectation:** What audience needs to know or do + +**Example:** +``` +Role: You are a product manager. +Audience: Non-technical executives. +Context: Quarterly business review, product performance down 5%. +Expectation: Explain root causes and recovery plan in non-technical terms. +``` + +--- + +### 7. **RISE** + +**Structure:** Research, Investigate, Synthesize, Evaluate + +**Best for:** Analysis, investigation, systematic exploration, diagnostic work + +**Process:** +1. **Research:** Gather information +2. **Investigate:** Deep dive into findings +3. **Synthesize:** Combine insights +4. **Evaluate:** Assess and recommend + +**Example:** +``` +Analyze customer churn data using RISE: +Research: Collect churn metrics, exit surveys, support tickets. +Investigate: Identify patterns in churned users. +Synthesize: Combine findings into themes. +Evaluate: Recommend retention strategies based on evidence. +``` + +--- + +### 8. **STAR** + +**Structure:** Situation, Task, Action, Result + +**Best for:** Problem-solving with rich context, case studies, retrospectives + +**Components:** +- **Situation:** Background context +- **Task:** Specific challenge +- **Action:** What needs doing +- **Result:** Expected outcome + +**Example:** +``` +Situation: Legacy monolith causing deployment delays (2 weeks per release). +Task: Modernize architecture to enable daily deployments. +Action: Migrate to microservices, implement CI/CD, containerize. +Result: Deploy 10+ times per day with <5% rollback rate. +``` + +--- + +### 9. 
**SOAP** + +**Structure:** Subjective, Objective, Assessment, Plan + +**Best for:** Structured documentation, medical records, technical logs, incident reports + +**Components:** +- **Subjective:** Reported information (symptoms, complaints) +- **Objective:** Observable facts (metrics, data) +- **Assessment:** Analysis and diagnosis +- **Plan:** Recommended actions + +**Example:** +``` +Incident Report (SOAP): +Subjective: Users report slow page loads starting 10 AM. +Objective: Average response time increased from 200ms to 3s. CPU at 95%. +Assessment: Database connection pool exhausted due to traffic spike. +Plan: 1) Scale pool size 2) Add monitoring alerts 3) Review query performance. +``` + +--- + +### 10. **CLEAR** + +**Structure:** Collaborative, Limited, Emotional, Appreciable, Refinable + +**Best for:** Goal-setting, OKRs, measurable objectives, team alignment + +**Components:** +- **Collaborative:** Who's involved +- **Limited:** Scope boundaries (time, resources) +- **Emotional:** Why it matters (motivation) +- **Appreciable:** Measurable progress indicators +- **Refinable:** How to iterate and improve + +**Example:** +``` +Q1 Objective (CLEAR): +Collaborative: Engineering + Product teams. +Limited: Complete by March 31, budget $50k, 2 engineers allocated. +Emotional: Reduces customer support load by 30%, improves satisfaction. +Appreciable: Track weekly via tickets resolved, NPS score, deployment count. +Refinable: Bi-weekly retrospectives, adjust priorities based on feedback. +``` + +--- + +### 11. **GROW** + +**Structure:** Goal, Reality, Options, Will + +**Best for:** Coaching, personal development, growth planning, mentorship + +**Components:** +- **Goal:** What to achieve +- **Reality:** Current situation (strengths, gaps) +- **Options:** Possible approaches +- **Will:** Commitment to action + +**Example:** +``` +Career Development (GROW): +Goal: Become senior engineer within 12 months. 
+Reality: Strong coding skills, weak in system design and leadership. +Options: 1) Take system design course 2) Lead a project 3) Find mentor. +Will: Commit to 5 hours/week study, lead Q2 project, find mentor by Feb. +``` + +--- + +### Framework Selection Logic + +The skill analyzes your input and: + +1. **Detects task type** + - Coding, writing, analysis, design, communication, etc. + +2. **Identifies complexity** + - Simple (1-2 sentences) β†’ Fast, minimal structure + - Moderate (paragraph) β†’ Standard framework + - Complex (detailed requirements) β†’ Advanced framework or blend + +3. **Selects primary framework** + - RTF β†’ Role-based tasks + - Chain of Thought β†’ Step-by-step reasoning + - RISEN/RODES β†’ Complex projects + - RACE β†’ Communication + - STAR β†’ Contextual problems + - And so on... + +4. **Blends secondary frameworks when needed** + - RODES + Chain of Thought β†’ Complex technical projects + - CLEAR + GROW β†’ Leadership goals + - RACE + STAR β†’ Strategic communication + +**You never choose the framework manually** - the skill does it automatically in "magic mode." + +--- + +### Common Framework Blends + +| Task Type | Primary Framework | Blended With | Result | +|-----------|------------------|--------------|--------| +| Complex technical design | RODES | Chain of Thought | Structured design with step-by-step reasoning | +| Leadership development | CLEAR | GROW | Measurable goals with action commitment | +| Strategic communication | RACE | STAR | Audience-aware storytelling with context | +| Incident investigation | RISE | SOAP | Systematic analysis with structured documentation | +| Project planning | RISEN | RTF | Multi-phase delivery with role clarity | + +--- + +## 🎯 How It Works + +``` +User Input (rough prompt) + ↓ +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ 1. Analyze Intent β”‚ What is the user trying to do? +β”‚ - Task type β”‚ Coding? Writing? Analysis? Design? 
+β”‚ - Complexity β”‚ Simple, moderate, complex? +β”‚ - Clarity β”‚ Clear or ambiguous? +β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + ↓ +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ 2. Clarify (Optional) β”‚ Only if critically needed +β”‚ - Ask 2-3 questions β”‚ Multiple choice when possible +β”‚ - Fill missing gaps β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + ↓ +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ 3. Select Framework(s) β”‚ Silent selection +β”‚ - Map task β†’ framework +β”‚ - Blend if needed β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + ↓ +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ 4. Generate Prompt β”‚ Apply framework rules +β”‚ - Add role/context β”‚ +β”‚ - Structure task β”‚ +β”‚ - Define format β”‚ +β”‚ - Add examples β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + ↓ +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ 5. 
Output β”‚ Clean, copy-ready +β”‚ Markdown code block β”‚ No explanations +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +--- + +## 🎨 Use Cases + +### Coding + +```bash +copilot> optimize prompt: create REST API in Python +``` + +β†’ Generates structured prompt with role, requirements, output format, examples + +--- + +### Writing + +```bash +copilot> create prompt for: write technical article about microservices +``` + +β†’ Generates audience-aware prompt with structure, tone, and content guidelines + +--- + +### Analysis + +```bash +copilot> refine prompt: analyze sales data and identify trends +``` + +β†’ Generates step-by-step analytical framework with visualization requirements + +--- + +### Decision Making + +```bash +copilot> improve this prompt: I need to decide between technology A and B +``` + +β†’ Generates decision framework with criteria, trade-offs, and validation + +--- + +### Learning + +```bash +copilot> transform into prompt: learn machine learning from zero +``` + +β†’ Generates learning path prompt with phases, resources, and milestones + +--- + +## ❓ FAQ + +### Q: Does this skill work outside of Obsidian vaults? +**A:** Yes! It's a **universal skill** that works in any terminal context. It doesn't depend on vault structure, project configuration, or external files. + +--- + +### Q: Do I need to know prompting frameworks? +**A:** No. The skill knows all 11 frameworks and selects the best one(s) automatically based on your task. + +--- + +### Q: Will the skill explain which framework it used? +**A:** No. It operates in "magic mode" - you get the polished prompt without technical explanations. If you want to know, you can ask explicitly. + +--- + +### Q: How many questions will the skill ask me? +**A:** Maximum 2-3 questions, and only when information is critically missing. Most of the time, it generates the prompt directly. + +--- + +### Q: Can I customize the frameworks? 
+**A:** The skill uses standard framework definitions. You can't customize them, but you can provide additional constraints in your input (e.g., "create a short prompt for..."). + +--- + +### Q: Does it support languages other than English? +**A:** Yes. Input in Portuguese produces a prompt in Portuguese, and English input produces English; mixed-language input defaults to English. + +--- + +### Q: What if I don't like the generated prompt? +**A:** You can ask the skill to refine it: "make it shorter", "add more examples", "focus on X aspect", etc. + +--- + +### Q: Can I use this for any AI model (Claude, ChatGPT, Gemini)? +**A:** Yes. The prompts are model-agnostic and work with any conversational AI. + +--- + +## πŸ”§ Installation (Global Setup) + +This skill is designed to work **globally** across all your projects. + +### Option 1: Use from Repository + +1. Clone the repository: + ```bash + git clone https://github.com/eric.andrade/cli-ai-skills.git + ``` + +2. Configure Copilot to load skills globally by adding this to `~/.copilot/config.json`: + ```json + { + "skills": { + "directories": [ + "/path/to/cli-ai-skills/.github/skills" + ] + } + } + ``` + +### Option 2: Copy to Global Skills Directory + +```bash +cp -r /path/to/cli-ai-skills/.github/skills/prompt-engineer ~/.copilot/global-skills/ +``` + +Then add this to `~/.copilot/config.json`: +```json +{ + "skills": { + "directories": [ + "~/.copilot/global-skills" + ] + } +} +``` + +--- + +## πŸ“– Learn More + +- **[Skill Development Guide](../../resources/skills-development.md)** - Learn how to create your own skills +- **[SKILL.md](./SKILL.md)** - Full technical specification of this skill +- **[Repository README](../../README.md)** - Overview of all available skills + +--- + +## πŸ“„ Version + +**v1.0.1** | Zero-Config | Universal +*Works in any project, any context, any terminal.* diff --git a/skills/prompt-engineer/SKILL.md b/skills/prompt-engineer/SKILL.md index 09425dcd..63dbf265 100644 --- a/skills/prompt-engineer/SKILL.md
+++ b/skills/prompt-engineer/SKILL.md @@ -1,272 +1,252 @@ --- name: prompt-engineer -description: Expert prompt engineer specializing in advanced prompting - techniques, LLM optimization, and AI system design. Masters chain-of-thought, - constitutional AI, and production prompt strategies. Use when building AI - features, improving agent performance, or crafting system prompts. -metadata: - model: inherit +description: "Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW)" +version: 1.1.0 +author: Eric Andrade +created: 2025-02-01 +updated: 2026-02-04 +platforms: [github-copilot-cli, claude-code, codex] +category: automation +tags: [prompt-engineering, optimization, frameworks, ai-enhancement] +risk: safe --- -## Use this skill when - -- Working on prompt engineer tasks or workflows -- Needing guidance, best practices, or checklists for prompt engineer - -## Do not use this skill when - -- The task is unrelated to prompt engineer -- You need a different domain or tool outside this scope - -## Instructions - -- Clarify goals, constraints, and required inputs. -- Apply relevant best practices and validate outcomes. -- Provide actionable steps and verification. -- If detailed examples are required, open `resources/implementation-playbook.md`. - -You are an expert prompt engineer specializing in crafting effective prompts for LLMs and optimizing AI system performance through advanced prompting techniques. - -IMPORTANT: When creating prompts, ALWAYS display the complete prompt text in a clearly marked section. Never describe a prompt without showing it. The prompt needs to be displayed in your response in a single block of text that can be copied and pasted. - ## Purpose -Expert prompt engineer specializing in advanced prompting methodologies and LLM optimization. 
Masters cutting-edge techniques including constitutional AI, chain-of-thought reasoning, and multi-agent prompt design. Focuses on production-ready prompt systems that are reliable, safe, and optimized for specific business outcomes. -## Capabilities +This skill transforms raw, unstructured user prompts into highly optimized prompts using established prompting frameworks. It analyzes user intent, identifies task complexity, and intelligently selects the most appropriate framework(s) to maximize Claude/ChatGPT output quality. -### Advanced Prompting Techniques +The skill operates in "magic mode" - it works silently behind the scenes, only interacting with users when clarification is critically needed. Users receive polished, ready-to-use prompts without technical explanations or framework jargon. -#### Chain-of-Thought & Reasoning -- Chain-of-thought (CoT) prompting for complex reasoning tasks -- Few-shot chain-of-thought with carefully crafted examples -- Zero-shot chain-of-thought with "Let's think step by step" -- Tree-of-thoughts for exploring multiple reasoning paths -- Self-consistency decoding with multiple reasoning chains -- Least-to-most prompting for complex problem decomposition -- Program-aided language models (PAL) for computational tasks +This is a **universal skill** that works in any terminal context, not limited to Obsidian vaults or specific project structures. 
-#### Constitutional AI & Safety -- Constitutional AI principles for self-correction and alignment -- Critique and revise patterns for output improvement -- Safety prompting techniques to prevent harmful outputs -- Jailbreak detection and prevention strategies -- Content filtering and moderation prompt patterns -- Ethical reasoning and bias mitigation in prompts -- Red teaming prompts for adversarial testing +## When to Use -#### Meta-Prompting & Self-Improvement -- Meta-prompting for prompt optimization and generation -- Self-reflection and self-evaluation prompt patterns -- Auto-prompting for dynamic prompt generation -- Prompt compression and efficiency optimization -- A/B testing frameworks for prompt performance -- Iterative prompt refinement methodologies -- Performance benchmarking and evaluation metrics +Invoke this skill when: -### Model-Specific Optimization +- User provides a vague or generic prompt (e.g., "help me code Python") +- User has a complex idea but struggles to articulate it clearly +- User's prompt lacks structure, context, or specific requirements +- Task requires step-by-step reasoning (debugging, analysis, design) +- User needs a prompt for a specific AI task but doesn't know prompting frameworks +- User wants to improve an existing prompt's effectiveness +- User asks variations of "how do I ask AI to..." or "create a prompt for..." 
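Phrases like the ones in the last bullet are easy to spot mechanically. As a toy illustration only (this document does not specify how the CLI actually routes requests), a pattern check might look like:

```python
import re

# Hypothetical trigger detection; patterns follow the invocation phrases
# this document lists ("optimize prompt:", "create prompt for:", etc.).
TRIGGERS = re.compile(
    r"(optimize|refine|improve|create|transform into)\s+(a\s+|this\s+)?prompt"
    r"|how do i ask (an\s+)?ai to",
    re.IGNORECASE,
)

def should_invoke(user_input: str) -> bool:
    """Return True when the input looks like a prompt-engineering request."""
    return TRIGGERS.search(user_input) is not None
```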
-#### OpenAI Models (GPT-4o, o1-preview, o1-mini) -- Function calling optimization and structured outputs -- JSON mode utilization for reliable data extraction -- System message design for consistent behavior -- Temperature and parameter tuning for different use cases -- Token optimization strategies for cost efficiency -- Multi-turn conversation management -- Image and multimodal prompt engineering +## Workflow -#### Anthropic Claude (4.5 Sonnet, Haiku, Opus) -- Constitutional AI alignment with Claude's training -- Tool use optimization for complex workflows -- Computer use prompting for automation tasks -- XML tag structuring for clear prompt organization -- Context window optimization for long documents -- Safety considerations specific to Claude's capabilities -- Harmlessness and helpfulness balancing +### Step 1: Analyze Intent -#### Open Source Models (Llama, Mixtral, Qwen) -- Model-specific prompt formatting and special tokens -- Fine-tuning prompt strategies for domain adaptation -- Instruction-following optimization for different architectures -- Memory and context management for smaller models -- Quantization considerations for prompt effectiveness -- Local deployment optimization strategies -- Custom system prompt design for specialized models +**Objective:** Understand what the user truly wants to accomplish. -### Production Prompt Systems +**Actions:** +1. Read the raw prompt provided by the user +2. Detect task characteristics: + - **Type:** coding, writing, analysis, design, learning, planning, decision-making, creative, etc. + - **Complexity:** simple (one-step), moderate (multi-step), complex (requires reasoning/design) + - **Clarity:** clear intention vs. ambiguous/vague + - **Domain:** technical, business, creative, academic, personal, etc. +3. Identify implicit requirements: + - Does user need examples? + - Is output format specified? + - Are there constraints (time, resources, scope)? + - Is this exploratory or execution-focused? 
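The complexity and clarity checks above can be approximated with rough heuristics. The thresholds in this sketch come from the detection patterns this skill defines (under 50 characters reads as simple, over 200 as complex, generic leading verbs as ambiguous); the function itself is only an illustration, not the skill's real logic.

```python
def analyze_intent(raw: str) -> dict:
    """Toy sketch of Step 1; thresholds follow this skill's detection patterns."""
    if len(raw) < 50:
        complexity = "simple"
    elif len(raw) > 200:
        complexity = "complex"
    else:
        complexity = "moderate"
    words = raw.lower().split()
    # Generic leading verbs ("help", "improve") with no clear object read as ambiguous.
    ambiguous = bool(words) and words[0] in {"help", "improve"}
    return {"complexity": complexity, "ambiguous": ambiguous}
```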
-#### Prompt Templates & Management -- Dynamic prompt templating with variable injection -- Conditional prompt logic based on context -- Multi-language prompt adaptation and localization -- Version control and A/B testing for prompts -- Prompt libraries and reusable component systems -- Environment-specific prompt configurations -- Rollback strategies for prompt deployments + +**Detection Patterns:** +- **Simple tasks:** Short prompts (<50 chars), single verb, no context +- **Complex tasks:** Long prompts (>200 chars), multiple requirements, conditional logic +- **Ambiguous tasks:** Generic verbs ("help", "improve"), missing object/context +- **Structured tasks:** Mentions steps, phases, deliverables, stakeholders -#### RAG & Knowledge Integration -- Retrieval-augmented generation prompt optimization -- Context compression and relevance filtering -- Query understanding and expansion prompts -- Multi-document reasoning and synthesis -- Citation and source attribution prompting -- Hallucination reduction techniques -- Knowledge graph integration prompts -#### Agent & Multi-Agent Prompting -- Agent role definition and persona creation -- Multi-agent collaboration and communication protocols -- Task decomposition and workflow orchestration -- Inter-agent knowledge sharing and memory management -- Conflict resolution and consensus building prompts -- Tool selection and usage optimization -- Agent evaluation and performance monitoring +### Step 2: Clarify (Conditional) + +**Objective:** Fill critical information gaps - and only those. + +**Actions:** +1. Skip this step entirely when Step 1 yields enough information +2. Otherwise ask at most 2-3 targeted questions, using multiple choice where possible +3. Fold the answers back into the intent analysis before continuing + +### Step 3: Select Framework(s) -### Specialized Applications +**Objective:** Map task characteristics to optimal prompting framework(s).
-#### Business & Enterprise -- Customer service chatbot optimization -- Sales and marketing copy generation -- Legal document analysis and generation -- Financial analysis and reporting prompts -- HR and recruitment screening assistance -- Executive summary and reporting automation -- Compliance and regulatory content generation +**Framework Mapping Logic:** -#### Creative & Content -- Creative writing and storytelling prompts -- Content marketing and SEO optimization -- Brand voice and tone consistency -- Social media content generation -- Video script and podcast outline creation -- Educational content and curriculum development -- Translation and localization prompts +| Task Type | Recommended Framework(s) | Rationale | +|-----------|-------------------------|-----------| +| **Role-based tasks** (act as expert, consultant) | **RTF** (Role-Task-Format) | Clear role definition + task + output format | +| **Step-by-step reasoning** (debugging, proof, logic) | **Chain of Thought** | Encourages explicit reasoning steps | +| **Structured projects** (multi-phase, deliverables) | **RISEN** (Role, Instructions, Steps, End goal, Narrowing) | Comprehensive structure for complex work | +| **Complex design/analysis** (systems, architecture) | **RODES** (Role, Objective, Details, Examples, Sense check) | Balances detail with validation | +| **Summarization** (compress, synthesize) | **Chain of Density** | Iterative refinement to essential info | +| **Communication** (reports, presentations, storytelling) | **RACE** (Role, Audience, Context, Expectation) | Audience-aware messaging | +| **Investigation/analysis** (research, diagnosis) | **RISE** (Research, Investigate, Synthesize, Evaluate) | Systematic analytical approach | +| **Contextual situations** (problem-solving with background) | **STAR** (Situation, Task, Action, Result) | Context-rich problem framing | +| **Documentation** (medical, technical, records) | **SOAP** (Subjective, Objective, Assessment, Plan) | Structured 
information capture | +| **Goal-setting** (OKRs, objectives, targets) | **CLEAR** (Collaborative, Limited, Emotional, Appreciable, Refinable) | Goal clarity and actionability | +| **Coaching/development** (mentoring, growth) | **GROW** (Goal, Reality, Options, Will) | Developmental conversation structure | -#### Technical & Code -- Code generation and optimization prompts -- Technical documentation and API documentation -- Debugging and error analysis assistance -- Architecture design and system analysis -- Test case generation and quality assurance -- DevOps and infrastructure as code prompts -- Security analysis and vulnerability assessment +**Blending Strategy:** +- **Combine 2-3 frameworks** when task spans multiple types +- Example: Complex technical project β†’ **RODES + Chain of Thought** (structure + reasoning) +- Example: Leadership decision β†’ **CLEAR + GROW** (goal clarity + development) -### Evaluation & Testing +**Selection Criteria:** +- Primary framework = best match to core task type +- Secondary framework(s) = address additional complexity dimensions +- Avoid over-engineering: simple tasks get simple frameworks -#### Performance Metrics -- Task-specific accuracy and quality metrics -- Response time and efficiency measurements -- Cost optimization and token usage analysis -- User satisfaction and engagement metrics -- Safety and alignment evaluation -- Consistency and reliability testing -- Edge case and robustness assessment +**Critical Rule:** This selection happens **silently** - do not explain framework choice to user. -#### Testing Methodologies -- Red team testing for prompt vulnerabilities -- Adversarial prompt testing and jailbreak attempts -- Cross-model performance comparison -- A/B testing frameworks for prompt optimization -- Statistical significance testing for improvements -- Bias and fairness evaluation across demographics -- Scalability testing for production workloads +### Step 4: Generate Prompt + +**Objective:** Apply the selected framework(s) to produce the final, copy-ready prompt. + +**Annotated example (RODES + RTF + Chain of Thought blend):** + +``` +Role: You are a senior software architect.
[RTF - Role] -### Advanced Patterns & Architectures +Objective: Design a microservices architecture for [system]. [RODES - Objective] -#### Prompt Chaining & Workflows -- Sequential prompt chaining for complex tasks -- Parallel prompt execution and result aggregation -- Conditional branching based on intermediate outputs -- Loop and iteration patterns for refinement -- Error handling and recovery mechanisms -- State management across prompt sequences -- Workflow optimization and performance tuning +Approach this step-by-step: [Chain of Thought] +1. Analyze current monolithic constraints +2. Identify service boundaries +3. Design inter-service communication +4. Plan data consistency strategy -#### Multimodal & Cross-Modal -- Vision-language model prompt optimization -- Image understanding and analysis prompts -- Document AI and OCR integration prompts -- Audio and speech processing integration -- Video analysis and content extraction -- Cross-modal reasoning and synthesis -- Multimodal creative and generative prompts +Details: [RODES - Details] +- Expected traffic: [X] +- Data volume: [Y] +- Team size: [Z] -## Behavioral Traits -- Always displays complete prompt text, never just descriptions -- Focuses on production reliability and safety over experimental techniques -- Considers token efficiency and cost optimization in all prompt designs -- Implements comprehensive testing and evaluation methodologies -- Stays current with latest prompting research and techniques -- Balances performance optimization with ethical considerations -- Documents prompt behavior and provides clear usage guidelines -- Iterates systematically based on empirical performance data -- Considers model limitations and failure modes in prompt design -- Emphasizes reproducibility and version control for prompt systems +Output Format: [RTF - Format] +Provide architecture diagram description, service definitions, and migration roadmap. 
-## Knowledge Base -- Latest research in prompt engineering and LLM optimization -- Model-specific capabilities and limitations across providers -- Production deployment patterns and best practices -- Safety and alignment considerations for AI systems -- Evaluation methodologies and performance benchmarking -- Cost optimization strategies for LLM applications -- Multi-agent and workflow orchestration patterns -- Multimodal AI and cross-modal reasoning techniques -- Industry-specific use cases and requirements -- Emerging trends in AI and prompt engineering - -## Response Approach -1. **Understand the specific use case** and requirements for the prompt -2. **Analyze target model capabilities** and optimization opportunities -3. **Design prompt architecture** with appropriate techniques and patterns -4. **Display the complete prompt text** in a clearly marked section -5. **Provide usage guidelines** and parameter recommendations -6. **Include evaluation criteria** and testing approaches -7. **Document safety considerations** and potential failure modes -8. **Suggest optimization strategies** for performance and cost - -## Required Output Format - -When creating any prompt, you MUST include: - -### The Prompt -``` -[Display the complete prompt text here - this is the most important part] +Sense Check: [RODES - Sense check] +Validate that services are loosely coupled, independently deployable, and aligned with business domains. ``` -### Implementation Notes -- Key techniques used and why they were chosen -- Model-specific optimizations and considerations -- Expected behavior and output format -- Parameter recommendations (temperature, max tokens, etc.) +**4.5. 
Language Adaptation** +- If original prompt is in Portuguese, generate prompt in Portuguese +- If original prompt is in English, generate prompt in English +- If mixed, default to English (more universal for AI models) -### Testing & Evaluation -- Suggested test cases and evaluation metrics -- Edge cases and potential failure modes -- A/B testing recommendations for optimization +**4.6. Quality Checks** +Before finalizing, verify: +- [ ] Prompt is self-contained (no external context needed) +- [ ] Task is specific and measurable +- [ ] Output format is clear +- [ ] No ambiguous language +- [ ] Appropriate level of detail for task complexity -### Usage Guidelines -- When and how to use this prompt effectively -- Customization options and variable parameters -- Integration considerations for production systems -## Example Interactions -- "Create a constitutional AI prompt for content moderation that self-corrects problematic outputs" -- "Design a chain-of-thought prompt for financial analysis that shows clear reasoning steps" -- "Build a multi-agent prompt system for customer service with escalation workflows" -- "Optimize a RAG prompt for technical documentation that reduces hallucinations" -- "Create a meta-prompt that generates optimized prompts for specific business use cases" -- "Design a safety-focused prompt for creative writing that maintains engagement while avoiding harm" -- "Build a structured prompt for code review that provides actionable feedback" -- "Create an evaluation framework for comparing prompt performance across different models" +## Critical Rules -## Before Completing Any Task +### **NEVER:** -Verify you have: -☐ Displayed the full prompt text (not just described it) -☐ Marked it clearly with headers or code blocks -☐ Provided usage instructions and implementation notes -☐ Explained your design choices and techniques used -☐ Included testing and evaluation recommendations -☐ Considered safety and ethical implications +- ❌ Assume information 
that wasn't provided - ALWAYS ask if critical details are missing +- ❌ Explain which framework was selected or why (magic mode - keep it invisible) +- ❌ Generate generic, one-size-fits-all prompts - always customize to context +- ❌ Use technical jargon in the final prompt (unless user's domain is technical) +- ❌ Ask more than 3 clarifying questions (avoid user fatigue) +- ❌ Include meta-commentary in the output ("This prompt uses...", "Note that...") +- ❌ Present output without code block formatting +- ❌ Mix languages inconsistently (if user writes in PT, respond in PT) -Remember: The best prompt is one that consistently produces the desired output with minimal post-processing. ALWAYS show the prompt, never just describe it. +### **ALWAYS:** + +- βœ… Analyze intent before generating (Step 1 is mandatory) +- βœ… Ask clarifying questions if critical information is ambiguous (Step 2 conditional) +- βœ… Select framework(s) based on task type and complexity (Step 3 mapping) +- βœ… Blend multiple frameworks when it improves prompt quality +- βœ… Adapt prompt length to original input complexity (simple β†’ short, complex β†’ detailed) +- βœ… Include output format specification in generated prompts +- βœ… Present final prompt in clean Markdown code block +- βœ… Make prompts self-contained (no dependency on external context) +- βœ… Use examples in complex prompts to illustrate expected output +- βœ… Validate prompt completeness before presenting (quality checks in Step 4.6) + + +### **Example 2: Complex Clear Prompt β†’ RODES + Chain of Thought (No Clarification)** + +**User Input:** +``` +copilot> create prompt for: I need to build a product recommendation system considering purchase history, user preferences, and market trends +``` + +**Skill Internal Process:** +1. **Analyze:** Clear intent, complex task (ML system design), technical domain +2. **Clarify:** Skip (sufficient information) +3. 
**Select:** RODES (complex design) + Chain of Thought (reasoning needed) + +**Skill Output:** + +````markdown +``` +Role: You are a senior machine learning engineer specializing in recommendation systems. + +Objective: Design a product recommendation system that combines user purchase history, preferences, and market trends. + +Details: +- Input data sources: user transaction logs, preference surveys, market trend APIs +- Algorithm considerations: collaborative filtering, content-based filtering, hybrid approaches +- Performance metrics: precision, recall, diversity, novelty +- Scalability: system must handle 100k+ users +- Real-time constraints: recommendations updated within 500ms + +Approach step-by-step: +1. Analyze the three data sources and their relationships +2. Propose 2-3 algorithm approaches with trade-offs analysis +3. Design data pipeline architecture (ingestion, processing, serving) +4. Define evaluation metrics and A/B testing strategy +5. Outline implementation roadmap with phases + +Examples: +- Show pseudocode for core recommendation logic +- Illustrate data flow diagram +- Provide sample API response structure + +Sense Check: Validate that the proposed solution balances personalization with discovery, scales to production load, and incorporates real-time market trends effectively. + +Output format: Technical design document with architecture diagrams, algorithm justification, and implementation roadmap. +``` +```` + + +### **Example 4: Code Debugging β†’ Chain of Thought** + +**User Input:** +``` +copilot> optimize prompt: my Python code isn't working, needs debugging +``` + +**Skill Internal Process:** +1. **Analyze:** Debugging task, needs step-by-step reasoning +2. **Clarify:** Skip (user will provide code context later) +3. **Select:** Chain of Thought + +**Skill Output:** + +````markdown +``` +You are an expert Python debugger. + +Task: Debug the Python code provided by the user by analyzing it step-by-step. + +Debugging approach: +1. 
Read and understand the code's intended purpose
+2. Identify the error or unexpected behavior
+3. Trace variable states at each execution step
+4. Locate the root cause of the issue
+5. Propose a fix with an explanation
+6. Suggest preventive measures to avoid similar bugs
+
+For each step, show your reasoning:
+- What you're checking
+- What you found
+- Why it matters
+
+Output format:
+- **Issue identified:** [Describe the bug]
+- **Root cause:** [Why it's happening]
+- **Fix:** [Corrected code with comments]
+- **Prevention:** [Best practices to avoid this in the future]
+
+Include a working example to verify the fix.
+```
+````
+
+
+## Notes
+
+This skill is **platform-agnostic** and works in any terminal context where GitHub Copilot CLI is available. It does not depend on:
+- Obsidian vault structure
+- Specific project configurations
+- External files or templates
+
+The skill is entirely self-contained, operating purely on user input and framework knowledge.
diff --git a/skills/skill-creator/README.md b/skills/skill-creator/README.md
new file mode 100644
index 00000000..982ec932
--- /dev/null
+++ b/skills/skill-creator/README.md
@@ -0,0 +1,270 @@
+# skill-creator
+
+**Automate CLI skill creation with best practices built-in.**
+
+## What It Does
+
+The skill-creator automates the entire workflow of creating new CLI skills for GitHub Copilot CLI and Claude Code. It guides you through brainstorming, applies standardized templates, validates content quality, and handles installation, all while following Anthropic's official best practices.
+
+## Key Features
+
+- **🎯 Interactive Brainstorming** - Collaborative session to define skill purpose and scope
+- **✨ Template Automation** - Automatic file generation with zero manual configuration
+- **🔍 Quality Validation** - Built-in checks for YAML, content quality, and writing style
+- **📦 Flexible Installation** - Choose repository-only, global, or hybrid installation
+- **📊 Visual Progress Bar** - Real-time progress indicator showing completion status (e.g., `[████████████░░░░░░░░] 60% - Step 3/5`)
+- **🔗 Prompt Engineer Integration** - Optional enhancement using the prompt-engineer skill
+
+## When to Use
+
+Use this skill when you want to:
+- Create a new CLI skill following official standards
+- Extend CLI functionality with custom capabilities
+- Package domain knowledge into a reusable skill format
+- Automate repetitive CLI tasks with a custom skill
+- Install skills locally or globally across your system
+
+## Installation
+
+### Prerequisites
+
+This skill is part of the `cli-ai-skills` repository. To use it:
+
+```bash
+# Clone the repository
+git clone https://github.com/yourusername/cli-ai-skills.git
+cd cli-ai-skills
+```
+
+### Install Globally (Recommended)
+
+Install via symlinks to make the skill available everywhere:
+
+```bash
+# For GitHub Copilot CLI
+mkdir -p ~/.copilot/skills && ln -sf "$(pwd)/.github/skills/skill-creator" ~/.copilot/skills/skill-creator
+
+# For Claude Code
+mkdir -p ~/.claude/skills && ln -sf "$(pwd)/.claude/skills/skill-creator" ~/.claude/skills/skill-creator
+```
+
+**Benefits of global installation:**
+- Works in any directory
+- Auto-updates when you `git pull` the repository
+- No configuration files needed
+
+### Repository-Only Installation
+
+If you prefer to use the skill only within this repository, no installation is needed. The skill will be available when working in the `cli-ai-skills` directory.
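The symlink commands above assume the target skills directories already exist, and a symlink can silently dangle if the clone is later moved. A small helper along these lines makes the install step self-checking; the function name and output format here are illustrative sketches, not part of the repository:

```bash
#!/usr/bin/env bash
# Hypothetical helper: symlink a skill into a platform's skills directory
# and verify that the link actually resolves. Sketch only; adapt paths as needed.
install_skill() {
  local src="$1" dest_dir="$2" name
  name=$(basename "$src")
  mkdir -p "$dest_dir"              # ln fails if the parent directory is missing
  ln -sfn "$src" "$dest_dir/$name"  # -n: replace an existing link, don't nest
  if [ -e "$dest_dir/$name" ]; then # -e follows the link; false if it dangles
    echo "installed: $dest_dir/$name -> $(readlink "$dest_dir/$name")"
  else
    echo "broken link: $dest_dir/$name" >&2
    return 1
  fi
}

# Example:
# install_skill "$(pwd)/.github/skills/skill-creator" "$HOME/.copilot/skills"
```

The same helper works for Claude Code by passing `"$HOME/.claude/skills"` as the destination.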
+
+## Usage
+
+### Basic Skill Creation
+
+Simply ask the CLI to create a new skill:
+
+```bash
+# GitHub Copilot CLI
+gh copilot "create a new skill for debugging Python errors"
+
+# Claude Code
+claude "create a skill that helps with git workflows"
+```
+
+The skill guides you through five phases with visual progress tracking:
+1. **Brainstorming** (20%) - Define purpose, triggers, and type
+2. **Prompt Enhancement** (40%, optional) - Enhance with prompt-engineer skill
+3. **File Generation** (60%) - Create files from templates
+4. **Validation** (80%) - Check quality and standards
+5. **Installation** (100%) - Choose local, global, or both
+
+Each phase displays a progress bar:
+```
+[████████████░░░░░░░░] 60% - Step 3/5: File Generation
+```
+
+### Advanced Usage
+
+#### Create Code Generation Skill
+
+```bash
+"Create a code skill that generates React components from descriptions"
+```
+
+The skill will:
+- Use the specialized `code-skill-template.md`
+- Ask about specific frameworks (React, Vue, etc.)
+- Include code examples in the `examples/` folder
+
+#### Create Documentation Skill
+
+```bash
+"Build a skill that writes API documentation from code"
+```
+
+The skill will:
+- Use `documentation-skill-template.md`
+- Ask about documentation formats
+- Set up references for style guides
+
+#### Install for Specific Platform
+
+```bash
+"Create a skill for Copilot only that analyzes TypeScript errors"
+```
+
+The skill will:
+- Generate files only in `.github/skills/`
+- Skip Claude-specific installation
+- Validate against Copilot requirements
+
+## Example Walkthrough
+
+Here's what creating a skill looks like:
+
+```
+You: "create a skill for database schema migrations"
+
+[████░░░░░░░░░░░░░░░░] 20% - Step 1/5: Brainstorming & Planning
+
+What should this skill do?
+> Helps users create and manage database schema migrations safely
+
+When should it trigger?
 (3-5 phrases)
+> "create migration", "generate schema change", "migrate database"
+
+What type of skill?
+> [x] General purpose
+
+Which platforms?
+> [x] Both (Copilot + Claude)
+
+[... continues through all phases ...]
+
+🎉 Skill created successfully!
+
+📦 Skill Name: database-migration
+📁 Location: .github/skills/database-migration/
+🔗 Installed: Global (Copilot + Claude)
+```
+
+## File Structure
+
+When you create a skill, this structure is generated:
+
+```
+.github/skills/your-skill-name/
+├── SKILL.md          # Main skill instructions (1.5-2k words)
+├── README.md         # User-facing documentation (this file)
+├── references/       # Detailed guides (2k-5k words each)
+│   └── (empty, ready for extended docs)
+├── examples/         # Working code samples
+│   └── (empty, ready for examples)
+└── scripts/          # Executable utilities
+    └── (empty, ready for automation)
+```
+
+## Configuration
+
+**No configuration needed!** This skill uses runtime discovery to:
+- Detect installed platforms (Copilot CLI, Claude Code)
+- Find repository root automatically
+- Extract author info from git config
+- Determine optimal file locations
+
+## Validation
+
+Every skill created is automatically validated for:
+- ✅ **YAML Frontmatter** - Required fields and format
+- ✅ **Description Format** - Third-person, trigger phrases
+- ✅ **Word Count** - 1,500-2,000 ideal, under 5,000 max
+- ✅ **Writing Style** - Imperative form, no second-person
+- ✅ **Progressive Disclosure** - Proper content organization
+
+## Frameworks Used
+
+This skill leverages several established methodologies:
+
+- **Progressive Disclosure** - 3-level content hierarchy (metadata → SKILL.md → bundled resources)
+- **Bundled Resources Pattern** - References, examples, and scripts as separate files
+- **Anthropic Best Practices** - Official skill development standards
+- **Zero-Config Design** - Runtime discovery, no hardcoded values
+- **Template-Driven Generation** - Consistent
structure across all skills + +## Troubleshooting + +### "Template not found" Error + +Ensure you're in the `cli-ai-skills` repository or have cloned it: + +```bash +git clone https://github.com/yourusername/cli-ai-skills.git +cd cli-ai-skills +``` + +### "Platform not detected" Warning + +If platforms aren't detected: +1. Choose "Repository only" installation +2. Manually specify platform during setup +3. Install globally later using provided commands + +### Validation Failures + +If validation finds issues: +- Review suggestions in the output +- Choose automatic fixes for common problems +- Manually edit files for complex issues +- Re-run validation: `scripts/validate-skill-yaml.sh .github/skills/your-skill` + +## Advanced Features + +### Prompt Engineer Integration + +Enhance your skill descriptions with AI: +1. Enable during Phase 2 (Prompt Refinement) +2. Skill will invoke `prompt-engineer` automatically +3. Review enhanced output before proceeding + +### Bundled Resources + +For complex skills, use bundled resources: +- **references/** - Detailed documentation (no word limit) +- **examples/** - Working code samples users can run +- **scripts/** - Automation utilities loaded on demand + +### Version Management + +Update existing skills: +```bash +scripts/update-skill-version.sh your-skill-name 1.1.0 +``` + +## Contributing + +Created a useful skill? Share it: +1. Ensure validation passes +2. Add usage examples +3. Update main README.md +4. 
Submit a pull request + +## Resources + +- **Writing Style Guide:** `resources/templates/writing-style-guide.md` +- **Anthropic Official Guide:** https://github.com/anthropics/claude-plugins-official +- **Templates Directory:** `resources/templates/` +- **Validation Scripts:** `scripts/validate-*.sh` + +## Support + +For issues or questions: +- Check existing skills in `.github/skills/` for examples +- Review `resources/skills-development.md` for methodology +- Open an issue in the repository + +--- + +**Version:** 1.1.0 +**Platform:** GitHub Copilot CLI, Claude Code +**Author:** Eric Andrade +**Last Updated:** 2026-02-01 diff --git a/skills/skill-creator/SKILL.md b/skills/skill-creator/SKILL.md index b7f86598..5ea8b178 100644 --- a/skills/skill-creator/SKILL.md +++ b/skills/skill-creator/SKILL.md @@ -1,356 +1,593 @@ --- name: skill-creator -description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations. -license: Complete terms in LICENSE.txt +description: "This skill should be used when the user asks to create a new skill, build a skill, make a custom skill, develop a CLI skill, or wants to extend the CLI with new capabilities. Automates the entire skill creation workflow from brainstorming to installation." +version: 1.3.0 +author: Eric Andrade +created: 2025-02-01 +updated: 2026-02-04 +platforms: [github-copilot-cli, claude-code, codex] +category: meta +tags: [automation, scaffolding, skill-creation, meta-skill] +risk: safe --- -# Skill Creator +# skill-creator -This skill provides guidance for creating effective skills. +## Purpose -## About Skills +To create new CLI skills following Anthropic's official best practices with zero manual configuration. 
This skill automates brainstorming, template application, validation, and installation processes while maintaining progressive disclosure patterns and writing style standards. -Skills are modular, self-contained packages that extend Claude's capabilities by providing -specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific -domains or tasksβ€”they transform Claude from a general-purpose agent into a specialized agent -equipped with procedural knowledge that no model can fully possess. +## When to Use This Skill -### What Skills Provide +This skill should be used when: +- User wants to extend CLI functionality with custom capabilities +- User needs to create a skill following official standards +- User wants to automate repetitive CLI tasks with a reusable skill +- User needs to package domain knowledge into a skill format +- User wants both local and global skill installation options -1. Specialized workflows - Multi-step procedures for specific domains -2. Tool integrations - Instructions for working with specific file formats or APIs -3. Domain expertise - Company-specific knowledge, schemas, business logic -4. Bundled resources - Scripts, references, and assets for complex and repetitive tasks +## Core Capabilities -## Core Principles +1. **Interactive Brainstorming** - Collaborative session to define skill purpose and scope +2. **Prompt Enhancement** - Optional integration with prompt-engineer skill for refinement +3. **Template Application** - Automatic file generation from standardized templates +4. **Validation** - YAML, content, and style checks against Anthropic standards +5. **Installation** - Local repository or global installation with symlinks +6. **Progress Tracking** - Visual gauge showing completion status at each step -### Concise is Key +## Step 0: Discovery -The context window is a public good. 
Skills share the context window with everything else Claude needs: system prompt, conversation history, other Skills' metadata, and the actual user request. - -**Default assumption: Claude is already very smart.** Only add context Claude doesn't already have. Challenge each piece of information: "Does Claude really need this explanation?" and "Does this paragraph justify its token cost?" - -Prefer concise examples over verbose explanations. - -### Set Appropriate Degrees of Freedom - -Match the level of specificity to the task's fragility and variability: - -**High freedom (text-based instructions)**: Use when multiple approaches are valid, decisions depend on context, or heuristics guide the approach. - -**Medium freedom (pseudocode or scripts with parameters)**: Use when a preferred pattern exists, some variation is acceptable, or configuration affects behavior. - -**Low freedom (specific scripts, few parameters)**: Use when operations are fragile and error-prone, consistency is critical, or a specific sequence must be followed. - -Think of Claude as exploring a path: a narrow bridge with cliffs needs specific guardrails (low freedom), while an open field allows many routes (high freedom). - -### Anatomy of a Skill - -Every skill consists of a required SKILL.md file and optional bundled resources: - -``` -skill-name/ -β”œβ”€β”€ SKILL.md (required) -β”‚ β”œβ”€β”€ YAML frontmatter metadata (required) -β”‚ β”‚ β”œβ”€β”€ name: (required) -β”‚ β”‚ └── description: (required) -β”‚ └── Markdown instructions (required) -└── Bundled Resources (optional) - β”œβ”€β”€ scripts/ - Executable code (Python/Bash/etc.) - β”œβ”€β”€ references/ - Documentation intended to be loaded into context as needed - └── assets/ - Files used in output (templates, icons, fonts, etc.) -``` - -#### SKILL.md (required) - -Every SKILL.md consists of: - -- **Frontmatter** (YAML): Contains `name` and `description` fields. 
These are the only fields that Claude reads to determine when the skill gets used, thus it is very important to be clear and comprehensive in describing what the skill is, and when it should be used. -- **Body** (Markdown): Instructions and guidance for using the skill. Only loaded AFTER the skill triggers (if at all). - -#### Bundled Resources (optional) - -##### Scripts (`scripts/`) - -Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten. - -- **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed -- **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks -- **Benefits**: Token efficient, deterministic, may be executed without loading into context -- **Note**: Scripts may still need to be read by Claude for patching or environment-specific adjustments - -##### References (`references/`) - -Documentation and reference material intended to be loaded as needed into context to inform Claude's process and thinking. - -- **When to include**: For documentation that Claude should reference while working -- **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications -- **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides -- **Benefits**: Keeps SKILL.md lean, loaded only when Claude determines it's needed -- **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md -- **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skillβ€”this keeps SKILL.md lean while making information discoverable without hogging the context window. 
Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files. - -##### Assets (`assets/`) - -Files not intended to be loaded into context, but rather used within the output Claude produces. - -- **When to include**: When the skill needs files that will be used in the final output -- **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography -- **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified -- **Benefits**: Separates output resources from documentation, enables Claude to use files without loading them into context - -#### What to Not Include in a Skill - -A skill should only contain essential files that directly support its functionality. Do NOT create extraneous documentation or auxiliary files, including: - -- README.md -- INSTALLATION_GUIDE.md -- QUICK_REFERENCE.md -- CHANGELOG.md -- etc. - -The skill should only contain the information needed for an AI agent to do the job at hand. It should not contain auxilary context about the process that went into creating it, setup and testing procedures, user-facing documentation, etc. Creating additional documentation files just adds clutter and confusion. - -### Progressive Disclosure Design Principle - -Skills use a three-level loading system to manage context efficiently: - -1. **Metadata (name + description)** - Always in context (~100 words) -2. **SKILL.md body** - When skill triggers (<5k words) -3. **Bundled resources** - As needed by Claude (Unlimited because scripts can be executed without reading into context window) - -#### Progressive Disclosure Patterns - -Keep SKILL.md body to the essentials and under 500 lines to minimize context bloat. Split content into separate files when approaching this limit. 
When splitting out content into other files, it is very important to reference them from SKILL.md and describe clearly when to read them, to ensure the reader of the skill knows they exist and when to use them. - -**Key principle:** When a skill supports multiple variations, frameworks, or options, keep only the core workflow and selection guidance in SKILL.md. Move variant-specific details (patterns, examples, configuration) into separate reference files. - -**Pattern 1: High-level guide with references** - -```markdown -# PDF Processing - -## Quick start - -Extract text with pdfplumber: -[code example] - -## Advanced features - -- **Form filling**: See [FORMS.md](FORMS.md) for complete guide -- **API reference**: See [REFERENCE.md](REFERENCE.md) for all methods -- **Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns -``` - -Claude loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed. - -**Pattern 2: Domain-specific organization** - -For Skills with multiple domains, organize content by domain to avoid loading irrelevant context: - -``` -bigquery-skill/ -β”œβ”€β”€ SKILL.md (overview and navigation) -└── reference/ - β”œβ”€β”€ finance.md (revenue, billing metrics) - β”œβ”€β”€ sales.md (opportunities, pipeline) - β”œβ”€β”€ product.md (API usage, features) - └── marketing.md (campaigns, attribution) -``` - -When a user asks about sales metrics, Claude only reads sales.md. - -Similarly, for skills supporting multiple frameworks or variants, organize by variant: - -``` -cloud-deploy/ -β”œβ”€β”€ SKILL.md (workflow + provider selection) -└── references/ - β”œβ”€β”€ aws.md (AWS deployment patterns) - β”œβ”€β”€ gcp.md (GCP deployment patterns) - └── azure.md (Azure deployment patterns) -``` - -When the user chooses AWS, Claude only reads aws.md. - -**Pattern 3: Conditional details** - -Show basic content, link to advanced content: - -```markdown -# DOCX Processing - -## Creating documents - -Use docx-js for new documents. See [DOCX-JS.md](DOCX-JS.md). 
- -## Editing documents - -For simple edits, modify the XML directly. - -**For tracked changes**: See [REDLINING.md](REDLINING.md) -**For OOXML details**: See [OOXML.md](OOXML.md) -``` - -Claude reads REDLINING.md or OOXML.md only when the user needs those features. - -**Important guidelines:** - -- **Avoid deeply nested references** - Keep references one level deep from SKILL.md. All reference files should link directly from SKILL.md. -- **Structure longer reference files** - For files longer than 100 lines, include a table of contents at the top so Claude can see the full scope when previewing. - -## Skill Creation Process - -Skill creation involves these steps: - -1. Understand the skill with concrete examples -2. Plan reusable skill contents (scripts, references, assets) -3. Initialize the skill (run init_skill.py) -4. Edit the skill (implement resources and write SKILL.md) -5. Package the skill (run package_skill.py) -6. Iterate based on real usage - -Follow these steps in order, skipping only if there is a clear reason why they are not applicable. - -### Step 1: Understanding the Skill with Concrete Examples - -Skip this step only when the skill's usage patterns are already clearly understood. It remains valuable even when working with an existing skill. - -To create an effective skill, clearly understand concrete examples of how the skill will be used. This understanding can come from either direct user examples or generated examples that are validated with user feedback. - -For example, when building an image-editor skill, relevant questions include: - -- "What functionality should the image-editor skill support? Editing, rotating, anything else?" -- "Can you give some examples of how this skill would be used?" -- "I can imagine users asking for things like 'Remove the red-eye from this image' or 'Rotate this image'. Are there other ways you imagine this skill being used?" -- "What would a user say that should trigger this skill?" 
- -To avoid overwhelming users, avoid asking too many questions in a single message. Start with the most important questions and follow up as needed for better effectiveness. - -Conclude this step when there is a clear sense of the functionality the skill should support. - -### Step 2: Planning the Reusable Skill Contents - -To turn concrete examples into an effective skill, analyze each example by: - -1. Considering how to execute on the example from scratch -2. Identifying what scripts, references, and assets would be helpful when executing these workflows repeatedly - -Example: When building a `pdf-editor` skill to handle queries like "Help me rotate this PDF," the analysis shows: - -1. Rotating a PDF requires re-writing the same code each time -2. A `scripts/rotate_pdf.py` script would be helpful to store in the skill - -Example: When designing a `frontend-webapp-builder` skill for queries like "Build me a todo app" or "Build me a dashboard to track my steps," the analysis shows: - -1. Writing a frontend webapp requires the same boilerplate HTML/React each time -2. An `assets/hello-world/` template containing the boilerplate HTML/React project files would be helpful to store in the skill - -Example: When building a `big-query` skill to handle queries like "How many users have logged in today?" the analysis shows: - -1. Querying BigQuery requires re-discovering the table schemas and relationships each time -2. A `references/schema.md` file documenting the table schemas would be helpful to store in the skill - -To establish the skill's contents, analyze each concrete example to create a list of the reusable resources to include: scripts, references, and assets. - -### Step 3: Initializing the Skill - -At this point, it is time to actually create the skill. - -Skip this step only if the skill being developed already exists, and iteration or packaging is needed. In this case, continue to the next step. 
-
-When creating a new skill from scratch, always run the `init_skill.py` script. The script conveniently generates a new template skill directory that automatically includes everything a skill requires, making the skill creation process much more efficient and reliable.
-
-Usage:
+Before starting skill creation, gather runtime information:
 
 ```bash
-scripts/init_skill.py --path
+# Detect available platforms
+COPILOT_INSTALLED=false
+CLAUDE_INSTALLED=false
+CODEX_INSTALLED=false
+
+if command -v gh &>/dev/null && gh copilot --version &>/dev/null; then
+  COPILOT_INSTALLED=true
+fi
+
+if [[ -d "$HOME/.claude" ]]; then
+  CLAUDE_INSTALLED=true
+fi
+
+if [[ -d "$HOME/.codex" ]]; then
+  CODEX_INSTALLED=true
+fi
+
+# Determine working directory
+REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || pwd)
+SKILLS_REPO="$REPO_ROOT"
+
+# Check if in cli-ai-skills repository
+if [[ ! -d "$SKILLS_REPO/.github/skills" ]]; then
+  echo "⚠️ Not in cli-ai-skills repository. Creating standalone skill."
+  STANDALONE=true
+fi
+
+# Get user info from git config
+AUTHOR=$(git config user.name || echo "Unknown")
+EMAIL=$(git config user.email || echo "")
 ```
-The script:
+**Key Information Needed:**
+- Which platforms to target (Copilot, Claude, Codex, or all three)
+- Installation preference (local, global, or both)
+- Skill name and purpose
+- Skill type (general, code, documentation, analysis)
-- Creates the skill directory at the specified path
-- Generates a SKILL.md template with proper frontmatter and TODO placeholders
-- Creates example resource directories: `scripts/`, `references/`, and `assets/`
-- Adds example files in each directory that can be customized or deleted
+## Main Workflow
-After initialization, customize or remove the generated SKILL.md and example files as needed.
+### Progress Tracking Guidelines
-### Step 4: Edit the Skill
+Throughout the workflow, display a visual progress bar before starting each phase to keep the user informed.
The progress bar format is:
+
+```
+[████████████░░░░░░░░] 60% - Step 3/5: Creating SKILL.md
+```
-When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Claude to use. Include information that would be beneficial and non-obvious to Claude. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Claude instance execute these tasks more effectively.
+**Format specifications:**
+- 20 characters wide (use █ for filled, ░ for empty)
+- Percentage based on current step (Step 1=20%, Step 2=40%, Step 3=60%, Step 4=80%, Step 5=100%)
+- Step counter showing current/total (e.g., "Step 3/5")
+- Brief description of current phase
-#### Learn Proven Design Patterns
+**Display the progress bar using:**
+```bash
+echo "[████░░░░░░░░░░░░░░░░] 20% - Step 1/5: Brainstorming & Planning"
+```
-Consult these helpful guides based on your skill's needs:
+### Phase 1: Brainstorming & Planning
-- **Multi-step processes**: See references/workflows.md for sequential workflows and conditional logic
-- **Specific output formats or quality standards**: See references/output-patterns.md for template and example patterns
-These files contain established best practices for effective skill design.
+**Progress:** Display before starting this phase:
+```bash
+echo "[████░░░░░░░░░░░░░░░░] 20% - Step 1/5: Brainstorming & Planning"
+```
-#### Start with Reusable Skill Contents
+Display progress:
+```
+╔══════════════════════════════════════════════════════════════╗
+║ 🛠️  SKILL CREATOR - Creating New Skill                       ║
+╠══════════════════════════════════════════════════════════════╣
+║ → Phase 1: Brainstorming                               [20%] ║
+║ ○ Phase 2: Prompt Refinement                                 ║
+║ ○ Phase 3: File Generation                                   ║
+║ ○ Phase 4: Validation                                        ║
+║ ○ Phase 5: Installation                                      ║
+╠══════════════════════════════════════════════════════════════╣
+║ Progress: ██████░░░░░░░░░░░░░░░░░░░░░░░░ 20%                 ║
+╚══════════════════════════════════════════════════════════════╝
+```
-To begin implementation, start with the reusable resources identified above: `scripts/`, `references/`, and `assets/` files. Note that this step may require user input. For example, when implementing a `brand-guidelines` skill, the user may need to provide brand assets or templates to store in `assets/`, or documentation to store in `references/`.
+**Ask the user:**
-Added scripts must be tested by actually running them to ensure there are no bugs and that the output matches what is expected. If there are many similar scripts, only a representative sample needs to be tested to ensure confidence that they all work while balancing time to completion.
+1. **What should this skill do?** (Free-form description)
+   - Example: "Help users debug Python code by analyzing stack traces"
+
+2. **When should it trigger?** (Provide 3-5 trigger phrases)
+   - Example: "debug Python error", "analyze stack trace", "fix Python exception"
-Any example files and directories not needed for the skill should be deleted.
The initialization script creates example files in `scripts/`, `references/`, and `assets/` to demonstrate structure, but most skills won't need all of them.
+3. **What type of skill is this?**
+   - [ ] General purpose (default template)
+   - [ ] Code generation/modification
+   - [ ] Documentation creation/maintenance
+   - [ ] Analysis/investigation
-#### Update SKILL.md
+4. **Which platforms should support this skill?**
+   - [ ] GitHub Copilot CLI
+   - [ ] Claude Code
+   - [ ] Codex
+   - [ ] All three (recommended)
-**Writing Guidelines:** Always use imperative/infinitive form.
+5. **Provide a one-sentence description** (will appear in metadata)
+   - Example: "Analyzes Python stack traces and suggests fixes"
-##### Frontmatter
+**Capture responses and prepare for next phase.**
-Write the YAML frontmatter with `name` and `description`:
+### Phase 2: Prompt Enhancement (Optional)
-- `name`: The skill name
-- `description`: This is the primary triggering mechanism for your skill, and helps Claude understand when to use the skill.
-  - Include both what the Skill does and specific triggers/contexts for when to use it.
-  - Include all "when to use" information here - Not in the body. The body is only loaded after triggering, so "When to Use This Skill" sections in the body are not helpful to Claude.
-  - Example description for a `docx` skill: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. Use when Claude needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
+**Progress:** Display before starting this phase:
+```bash
+echo "[████████░░░░░░░░░░░░] 40% - Step 2/5: Prompt Enhancement"
+```
-Do not include any other fields in YAML frontmatter.
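Each phase hard-codes its `echo` bar, which is how the 20-character format specification and the individual examples can drift apart. One way to keep them consistent is to derive the bar from the step number. A minimal sketch, assuming the format specifications above; the function name is illustrative and not an existing script in this repository:

```bash
#!/usr/bin/env bash
# Render the progress bar described in the format specifications:
# 20 characters wide, percentage = step/total, "Step N/TOTAL: label" suffix.
render_progress() {
  local step="$1" total="$2" label="$3"
  local width=20
  local filled=$(( step * width / total ))  # e.g. step 3 of 5 -> 12 filled cells
  local pct=$(( step * 100 / total ))       # e.g. step 3 of 5 -> 60%
  local bar="" i
  for (( i = 0; i < width; i++ )); do
    if (( i < filled )); then
      bar+="█"
    else
      bar+="░"
    fi
  done
  printf '[%s] %d%% - Step %d/%d: %s\n' "$bar" "$pct" "$step" "$total" "$label"
}

# render_progress 2 5 "Prompt Enhancement"
# -> [████████░░░░░░░░░░░░] 40% - Step 2/5: Prompt Enhancement
```

Calling this once per phase guarantees the bar width and percentage always agree with the step counter.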
+Update progress:
+```
+╔══════════════════════════════════════════════════════════════╗
+║ ✓ Phase 1: Brainstorming                                     ║
+║ → Phase 2: Prompt Refinement                           [40%] ║
+╠══════════════════════════════════════════════════════════════╣
+║ Progress: ████████████░░░░░░░░░░░░░░░░░░ 40%                 ║
+╚══════════════════════════════════════════════════════════════╝
+```
-##### Body
+**Ask the user:**
+"Would you like to refine the skill description using the prompt-engineer skill?"
+- [ ] Yes - Use prompt-engineer to enhance clarity and structure
+- [ ] No - Proceed with current description
-Write instructions for using the skill and its bundled resources.
+If **Yes**:
+1. Check if prompt-engineer skill is available
+2. Invoke with current description as input
+3. Review enhanced output with user
+4. Ask: "Accept enhanced version or keep original?"
-### Step 5: Packaging a Skill
+If **No** or prompt-engineer unavailable:
+- Proceed with original user input
-Once development of the skill is complete, it must be packaged into a distributable .skill file that gets shared with the user.
The packaging process automatically validates the skill first to ensure it meets all requirements: +### Phase 3: File Generation + +**Progress:** Display before starting this phase: +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘] 60% - Step 3/5: File Generation" +``` + +Update progress: +``` +╔══════════════════════════════════════════════════════════════╗ +β•‘ βœ“ Phase 1: Brainstorming β•‘ +β•‘ βœ“ Phase 2: Prompt Refinement β•‘ +β•‘ β†’ Phase 3: File Generation [50%] β•‘ +╠══════════════════════════════════════════════════════════════╣ +β•‘ Progress: β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘ 50% β•‘ +β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• +``` + +**Generate skill structure:** ```bash -scripts/package_skill.py +# Convert skill name to kebab-case +SKILL_NAME=$(echo "$USER_INPUT" | tr '[:upper:]' '[:lower:]' | tr ' ' '-') + +# Create directories +if [[ "$PLATFORM" =~ "copilot" ]]; then + mkdir -p ".github/skills/$SKILL_NAME"/{references,examples,scripts} +fi + +if [[ "$PLATFORM" =~ "claude" ]]; then + mkdir -p ".claude/skills/$SKILL_NAME"/{references,examples,scripts} +fi + +if [[ "$PLATFORM" =~ "codex" ]]; then + mkdir -p ".codex/skills/$SKILL_NAME"/{references,examples,scripts} +fi ``` -Optional output directory specification: +**Apply templates:** + +1. **SKILL.md** - Use appropriate template: + - `skill-template-copilot.md`, `skill-template-claude.md`, or `skill-template-codex.md` + - Substitute placeholders: + - `{{SKILL_NAME}}` β†’ kebab-case name + - `{{DESCRIPTION}}` β†’ one-line description + - `{{TRIGGERS}}` β†’ comma-separated trigger phrases + - `{{PURPOSE}}` β†’ detailed purpose from brainstorming + - `{{AUTHOR}}` β†’ from git config + - `{{DATE}}` β†’ current date (YYYY-MM-DD) + - `{{VERSION}}` β†’ "1.0.0" + +2. 
**README.md** - Use `readme-template.md`: + - User-facing documentation (300-500 words) + - Include installation instructions + - Add usage examples + +3. **References/** (optional but recommended): + - Create `detailed-guide.md` for extended documentation (2k-5k words) + - Move lengthy content here to keep SKILL.md under 2k words + +**File creation commands:** ```bash -scripts/package_skill.py ./dist +# Apply template with substitution +sed "s/{{SKILL_NAME}}/$SKILL_NAME/g; \ + s/{{DESCRIPTION}}/$DESCRIPTION/g; \ + s/{{AUTHOR}}/$AUTHOR/g; \ + s/{{DATE}}/$(date +%Y-%m-%d)/g" \ + resources/templates/skill-template-copilot.md \ + > ".github/skills/$SKILL_NAME/SKILL.md" + +# Create README +sed "s/{{SKILL_NAME}}/$SKILL_NAME/g" \ + resources/templates/readme-template.md \ + > ".github/skills/$SKILL_NAME/README.md" + +# Apply template for Codex if selected +if [[ "$PLATFORM" =~ "codex" ]]; then + sed "s/{{SKILL_NAME}}/$SKILL_NAME/g; \ + s/{{DESCRIPTION}}/$DESCRIPTION/g; \ + s/{{AUTHOR}}/$AUTHOR/g; \ + s/{{DATE}}/$(date +%Y-%m-%d)/g" \ + resources/templates/skill-template-codex.md \ + > ".codex/skills/$SKILL_NAME/SKILL.md" + + sed "s/{{SKILL_NAME}}/$SKILL_NAME/g" \ + resources/templates/readme-template.md \ + > ".codex/skills/$SKILL_NAME/README.md" +fi ``` -The packaging script will: +**Display created structure:** +``` +βœ… Created: + .github/skills/your-skill-name/ (if Copilot selected) + .claude/skills/your-skill-name/ (if Claude selected) + .codex/skills/your-skill-name/ (if Codex selected) + β”œβ”€β”€ SKILL.md (832 lines) + β”œβ”€β”€ README.md (347 lines) + β”œβ”€β”€ references/ + β”œβ”€β”€ examples/ + └── scripts/ +``` -1. 
**Validate** the skill automatically, checking: +### Phase 4: Validation - - YAML frontmatter format and required fields - - Skill naming conventions and directory structure - - Description completeness and quality - - File organization and resource references +**Progress:** Display before starting this phase: +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘] 80% - Step 4/5: Validation" +``` -2. **Package** the skill if validation passes, creating a .skill file named after the skill (e.g., `my-skill.skill`) that includes all files and maintains the proper directory structure for distribution. The .skill file is a zip file with a .skill extension. +Update progress: +``` +╔══════════════════════════════════════════════════════════════╗ +β•‘ βœ“ Phase 3: File Generation β•‘ +β•‘ β†’ Phase 4: Validation [70%] β•‘ +╠══════════════════════════════════════════════════════════════╣ +β•‘ Progress: β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘ 70% β•‘ +β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• +``` -If validation fails, the script will report the errors and exit without creating a package. Fix any validation errors and run the packaging command again. +**Run validation scripts:** -### Step 6: Iterate +```bash +# Validate YAML frontmatter +scripts/validate-skill-yaml.sh ".github/skills/$SKILL_NAME" -After testing the skill, users may request improvements. Often this happens right after using the skill, with fresh context of how the skill performed. +# Validate content quality +scripts/validate-skill-content.sh ".github/skills/$SKILL_NAME" +``` -**Iteration workflow:** +**Expected output:** +``` +πŸ” Validating YAML frontmatter... +βœ… YAML frontmatter valid! -1. Use the skill on real tasks -2. Notice struggles or inefficiencies -3. 
Identify how SKILL.md or bundled resources should be updated -4. Implement changes and test again +πŸ” Validating content... +βœ… Word count excellent: 1847 words +βœ… Content validation complete! +``` + +**If validation fails:** +- Display specific errors +- Offer to fix automatically (common issues) +- Ask user to manually correct complex issues + +**Common auto-fixes:** +- Convert second-person to imperative form +- Reformat description to third-person +- Add missing required fields + +### Phase 5: Installation + +**Progress:** Display before starting this phase: +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ] 100% - Step 5/5: Installation" +``` + +Update progress: +``` +╔══════════════════════════════════════════════════════════════╗ +β•‘ βœ“ Phase 4: Validation β•‘ +β•‘ β†’ Phase 5: Installation [90%] β•‘ +╠══════════════════════════════════════════════════════════════╣ +β•‘ Progress: β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘ 90% β•‘ +β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• +``` + +**Ask the user:** +"How would you like to install this skill?" 
+ +- [ ] **Repository only** - Files created in `.github/skills/` (works when in repo) +- [ ] **Global installation** - Create symlinks in `~/.copilot/skills/` (works everywhere) +- [ ] **Both** - Repository + global symlinks (recommended, auto-updates with git pull) +- [ ] **Skip installation** - Just create files + +**If global installation selected:** + +```bash +# Detect which platforms to install for +INSTALL_TARGETS=() + +if [[ "$COPILOT_INSTALLED" == "true" ]] && [[ "$PLATFORM" =~ "copilot" ]]; then + INSTALL_TARGETS+=("copilot") +fi + +if [[ "$CLAUDE_INSTALLED" == "true" ]] && [[ "$PLATFORM" =~ "claude" ]]; then + INSTALL_TARGETS+=("claude") +fi + +if [[ "$CODEX_INSTALLED" == "true" ]] && [[ "$PLATFORM" =~ "codex" ]]; then + INSTALL_TARGETS+=("codex") +fi + +# Ask user to confirm detected platforms +echo "Detected platforms: ${INSTALL_TARGETS[*]}" +echo "Install for these platforms? [Y/n]" +``` + +**Installation process:** + +```bash +# GitHub Copilot CLI +if [[ " ${INSTALL_TARGETS[*]} " =~ " copilot " ]]; then + ln -sf "$SKILLS_REPO/.github/skills/$SKILL_NAME" \ + "$HOME/.copilot/skills/$SKILL_NAME" + echo "βœ… Installed for GitHub Copilot CLI" +fi + +# Claude Code +if [[ " ${INSTALL_TARGETS[*]} " =~ " claude " ]]; then + ln -sf "$SKILLS_REPO/.claude/skills/$SKILL_NAME" \ + "$HOME/.claude/skills/$SKILL_NAME" + echo "βœ… Installed for Claude Code" +fi + +# Codex +if [[ " ${INSTALL_TARGETS[*]} " =~ " codex " ]]; then + ln -sf "$SKILLS_REPO/.codex/skills/$SKILL_NAME" \ + "$HOME/.codex/skills/$SKILL_NAME" + echo "βœ… Installed for Codex" +fi +``` + +**Verify installation:** + +```bash +# Check symlinks +ls -la ~/.copilot/skills/$SKILL_NAME 2>/dev/null +ls -la ~/.claude/skills/$SKILL_NAME 2>/dev/null +ls -la ~/.codex/skills/$SKILL_NAME 2>/dev/null +``` + +### Phase 6: Completion + +**Progress:** Display completion message: +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ] 100% - βœ“ Skill created successfully!" 
+``` + +Update progress: +``` +╔══════════════════════════════════════════════════════════════╗ +β•‘ βœ“ Phase 5: Installation β•‘ +β•‘ βœ… SKILL CREATION COMPLETE! β•‘ +╠══════════════════════════════════════════════════════════════╣ +β•‘ Progress: β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 100% β•‘ +β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• +``` + +**Display summary:** + +``` +πŸŽ‰ Skill created successfully! + +πŸ“¦ Skill Name: your-skill-name +πŸ“ Location: .github/skills/your-skill-name/ +πŸ”— Installed: Global (Copilot + Claude) + +πŸ“‹ Files Created: + βœ… SKILL.md (1,847 words) + βœ… README.md (423 words) + βœ… references/ (empty, ready for extended docs) + βœ… examples/ (empty, ready for code samples) + βœ… scripts/ (empty, ready for utilities) + +πŸš€ Next Steps: + 1. Test the skill: Try trigger phrases in CLI + 2. Add examples: Create working code samples in examples/ + 3. Extend docs: Add detailed guides to references/ + 4. Commit changes: git add .github/skills/your-skill-name && git commit + 5. Share: Push to repository for team use + +πŸ’‘ Pro Tips: + - Keep SKILL.md under 2,000 words (currently: 1,847) + - Move detailed content to references/ folder + - Add executable scripts to scripts/ folder + - Update README.md with real usage examples + - Run validation before committing: scripts/validate-skill-yaml.sh +``` + +## Error Handling + +### Platform Detection Issues + +If platforms cannot be detected: +``` +⚠️ Unable to detect GitHub Copilot CLI or Claude Code + +Would you like to: +1. Install for repository only (works when in repo) +2. Specify platform manually +3. 
Skip installation +``` + +### Template Not Found + +If templates are missing: +``` +❌ Error: Template not found at resources/templates/ + +This skill requires the cli-ai-skills repository structure. + +Options: +1. Clone cli-ai-skills: git clone +2. Create minimal skill structure manually +3. Exit and set up templates first +``` + +### Validation Failures + +If content doesn't meet standards: +``` +⚠️ Validation Issues Found: + +1. YAML: Description not in third-person format + Expected: "This skill should be used when..." + Found: "Use this skill when..." + +2. Content: Word count too high (5,342 words, max 5,000) + Suggestion: Move detailed sections to references/ + +Fix automatically? [Y/n] +``` + +### Installation Conflicts + +If symlink already exists: +``` +⚠️ Skill already installed at ~/.copilot/skills/your-skill-name + +Options: +1. Overwrite existing installation +2. Rename new skill +3. Skip installation +4. Install to different location +``` + +## Bundled Resources + +This skill includes additional resources in subdirectories: + +### references/ + +Detailed documentation loaded when needed: +- `anthropic-best-practices.md` - Official Anthropic skill development guidelines +- `writing-style-guide.md` - Writing standards and examples +- `progressive-disclosure.md` - Content organization patterns +- `validation-checklist.md` - Pre-commit quality checks + +### examples/ + +Working examples demonstrating skill usage: +- `basic-skill-creation.md` - Simple skill creation walkthrough +- `advanced-skill-bundled-resources.md` - Complex skill with references/ +- `global-installation.md` - Installing skills system-wide + +### scripts/ + +Executable utilities for skill maintenance: +- `validate-all-skills.sh` - Batch validation of all skills in repository +- `update-skill-version.sh` - Bump version and update changelog +- `generate-skill-index.sh` - Auto-generate skills catalog + +## Technical Implementation Notes + +**Template Substitution:** +- Use `sed` for 
simple replacements +- Preserve YAML formatting exactly +- Handle multi-line descriptions with proper escaping + +**Symlink Strategy:** +- Always use absolute paths: `ln -sf /full/path/to/source ~/.copilot/skills/name` +- Verify symlink before considering installation complete +- Benefits: Auto-updates when repository is pulled + +**Validation Integration:** +- Run validation before installation +- Block installation if critical errors found +- Warnings are informational only + +**Git Integration:** +- Extract author from `git config user.name` +- Use repository root detection: `git rev-parse --show-toplevel` +- Respect `.gitignore` patterns + +## Quality Standards + +**SKILL.md Requirements:** +- 1,500-2,000 words (ideal) +- Under 5,000 words (maximum) +- Third-person description format +- Imperative/infinitive writing style +- Progressive disclosure pattern + +**README.md Requirements:** +- 300-500 words +- User-facing language +- Clear installation instructions +- Practical usage examples + +**Validation Checks:** +- YAML frontmatter completeness +- Description format (third-person) +- Word count limits +- Writing style (no second-person) +- Required fields present + +## References + +- **Anthropic Official Skill Development Guide:** https://github.com/anthropics/claude-plugins-official/blob/main/plugins/plugin-dev/skills/skill-development/SKILL.md +- **Repository:** https://github.com/yourusername/cli-ai-skills +- **Writing Style Guide:** `resources/templates/writing-style-guide.md` +- **Progress Tracker Template:** `resources/templates/progress-tracker.md` diff --git a/skills/youtube-summarizer/README.md b/skills/youtube-summarizer/README.md new file mode 100644 index 00000000..227f8aef --- /dev/null +++ b/skills/youtube-summarizer/README.md @@ -0,0 +1,365 @@ +# πŸŽ₯ youtube-summarizer + +> Extract transcripts from YouTube videos and generate comprehensive, detailed summaries + +**Version:** 1.2.0 +**Status:** ✨ Zero-Config | 🌍 Universal +**Platforms:** GitHub 
Copilot CLI, Claude Code + +--- + +## Overview + +The **youtube-summarizer** skill automates the extraction of YouTube video transcripts and generates verbose, structured summaries using the STAR + R-I-S-E framework. Perfect for documenting educational content, lectures, tutorials, or any informational videos without rewatching them. + +--- + +## Features + +- 🎯 **Automatic transcript extraction** using `youtube-transcript-api` +- βœ… **Video validation** - Checks if video is accessible and has transcripts +- 🌍 **Multi-language support** - Prefers Portuguese, falls back to English +- πŸ“Š **Comprehensive summaries** - Prioritizes detail and completeness +- πŸ“ **Structured output** - Markdown with headers, sections, insights +- πŸ” **Metadata included** - Video title, channel, duration, URL +- ⚑ **Error handling** - Clear messages for all failure scenarios +- πŸ› οΈ **Dependency management** - Offers to install requirements automatically +- πŸ“Š **Progress gauge** - Visual processing tracker across all steps +- πŸ’Ύ **Flexible save options** - Summary-only, summary+transcript, or transcript-only (NEW v1.2.0) + +--- + +## Quick Start + +### Triggers + +Activate this skill with any of these phrases: + +```bash +# English +copilot> summarize this video: https://www.youtube.com/watch?v=VIDEO_ID +copilot> summarize youtube video https://youtu.be/VIDEO_ID +copilot> extract youtube transcript https://youtube.com/watch?v=VIDEO_ID + +# Portuguese (also supported) +copilot> resume este video: https://www.youtube.com/watch?v=VIDEO_ID +``` + +### First-Time Setup + +The skill will automatically check for dependencies and offer to install them: + +```bash +⚠️ youtube-transcript-api not installed + +Would you like me to install it now? +- [x] Yes - Install with pip +- [ ] No - I'll install manually +``` + +Select "Yes" and the skill handles installation automatically. + +--- + +## Use Cases + +### 1. 
**Educational Video Documentation** + +```bash +copilot> summarize this video: https://www.youtube.com/watch?v=abc123 +``` + +**Output:** +- Comprehensive summary of lecture content +- Key concepts and terminology +- Examples and practical applications +- Resources mentioned in the video + +### 2. **Technical Tutorial Analysis** + +```bash +copilot> summarize youtube video https://youtu.be/xyz789 +``` + +**Output:** +- Step-by-step breakdown of tutorial +- Code snippets and commands mentioned +- Best practices highlighted +- Troubleshooting tips documented + +### 3. **Conference Talk Reference** + +```bash +copilot> extract youtube transcript https://youtube.com/watch?v=def456 +``` + +**Output:** +- Speaker insights and arguments +- Statistics and data points +- Case studies and examples +- Q&A session summary + +### 4. **Language Learning Content** + +```bash +copilot> summarize youtube video https://youtu.be/ghi789 +``` + +**Output:** +- Vocabulary and expressions used +- Grammar points explained +- Cultural references +- Practice exercises mentioned + +### 5. **Research and Investigation** + +```bash +copilot> summarize youtube video https://www.youtube.com/watch?v=jkl012 +``` + +**Output:** +- Research findings presented +- Methodology explained +- Results and conclusions +- Future work suggestions + +--- + +## Output Structure + +Every summary follows this comprehensive structure: + +```markdown +# [Video Title] + +**Canal:** [Channel Name] +**DuraΓ§Γ£o:** [Duration] +**URL:** [Video URL] +**Data de PublicaΓ§Γ£o:** [Date] + +--- + +## πŸ“Š SΓ­ntese Executiva +[High-level overview, 2-3 paragraphs] + +--- + +## πŸ“ Resumo Detalhado +### [Topic 1] +[Detailed analysis with examples, data, quotes] + +### [Topic 2] +[Continued breakdown...] 
+ +--- + +## πŸ’‘ Principais Insights +- **Insight 1:** [Explanation] +- **Insight 2:** [Explanation] + +--- + +## πŸ“š Conceitos e Terminologia +- **Term 1:** [Definition] +- **Term 2:** [Definition] + +--- + +## πŸ”— Recursos Mencionados +- [Resource 1] +- [Resource 2] + +--- + +## πŸ“Œ ConclusΓ£o +[Final synthesis and key takeaways] +``` + +--- + +## Requirements + +- **Python 3.x** (usually pre-installed on macOS/Linux) +- **pip** (Python package manager) +- **[youtube-transcript-api](https://github.com/jdepoix/youtube-transcript-api)** by [Julien Depoix](https://github.com/jdepoix) (installed automatically by the skill) + +### Manual Installation (Optional) + +If you prefer to install dependencies manually: + +```bash +pip install youtube-transcript-api +``` + +--- + +## Supported URL Formats + +The skill recognizes these YouTube URL formats: + +- `https://www.youtube.com/watch?v=VIDEO_ID` +- `https://youtube.com/watch?v=VIDEO_ID` +- `https://youtu.be/VIDEO_ID` +- `https://m.youtube.com/watch?v=VIDEO_ID` + +--- + +## Limitations + +### Videos That Work + +βœ… Public videos with auto-generated captions +βœ… Videos with manual subtitles/captions +βœ… Videos with transcripts in any supported language + +### Videos That Don't Work + +❌ Private or unlisted videos +❌ Videos with transcripts disabled +❌ Age-restricted videos (may require authentication) +❌ Videos without any captions/subtitles + +--- + +## Error Messages + +### No Transcript Available + +``` +❌ No transcript available for this video + +This skill requires videos with auto-generated captions or manual subtitles. +Unfortunately, transcripts are not enabled for this video. +``` + +**Solution:** Try a different video that has captions enabled. + +### Invalid URL + +``` +❌ Invalid YouTube URL format + +Expected format examples: +- https://www.youtube.com/watch?v=VIDEO_ID +- https://youtu.be/VIDEO_ID +``` + +**Solution:** Ensure you're providing a complete, valid YouTube URL. 
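
As a quick local sanity check for the Invalid URL case, video-ID extraction over the supported formats can be sketched like this. This is illustrative only, not the code the skill actually runs, and it assumes the standard 11-character YouTube video ID:

```python
import re

# Mirrors the supported URL formats listed above (illustrative only).
# Assumes standard 11-character YouTube video IDs.
_VIDEO_ID = re.compile(
    r"(?:youtube\.com/watch\?(?:[^&\s]*&)*v=|youtu\.be/)([A-Za-z0-9_-]{11})"
)

def extract_video_id(url):
    """Return the 11-character video ID, or None if the URL is unsupported."""
    match = _VIDEO_ID.search(url)
    return match.group(1) if match else None

print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))                 # dQw4w9WgXcQ
print(extract_video_id("www.youtube.com/some-video"))                   # None
```

If the function returns `None`, the URL will also be rejected by the skill's validation step.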
+
+### Video Not Accessible
+
+```
+❌ Unable to access video
+
+Possible reasons:
+1. Video is private or unlisted
+2. Video has been removed
+3. Invalid video ID
+```
+
+**Solution:** Verify the URL and ensure the video is public.
+
+---
+
+## FAQ
+
+### Q: How long does it take to generate a summary?
+
+**A:** Depends on video length:
+- Short videos (5-10 min): 30-60 seconds
+- Medium videos (20-40 min): 1-2 minutes
+- Long videos (60+ min): 2-5 minutes
+
+### Q: Can I summarize videos in languages other than English/Portuguese?
+
+**A:** Partially. The extraction step prefers Portuguese and falls back to English; transcripts in other languages work only if a transcript exists in them and the language preference is adjusted accordingly.
+
+### Q: Will this work with YouTube Music videos?
+
+**A:** Only if the music video has captions/transcripts enabled. Most music videos don't have transcripts.
+
+### Q: Can I customize the summary length?
+
+**A:** The skill prioritizes completeness by design (verbose summaries). If you need shorter summaries, you can ask the AI to condense the output afterward.
+
+### Q: Does this download the video?
+
+**A:** No. Only the text transcript is extracted via YouTube's API. No video files are downloaded.
+
+### Q: Can I save the summary to a file?
+
+**A:** Yes! After the summary is generated, the skill offers flexible save options:
+- **Summary only** - Markdown file with structured summary
+- **Summary + transcript** - Markdown file with summary and raw transcript appended
+- **Transcript only** - Plain text file with raw transcript (NEW in v1.2.0)
+- **Display only** - No files saved, summary shown in terminal
+
+Files are saved as `resumo-{VIDEO_ID}-{YYYY-MM-DD}.md` (summary) or `transcript-{VIDEO_ID}-{YYYY-MM-DD}.txt` (transcript-only).
+
+### Q: When should I save just the transcript?
+ +**A:** Use the transcript-only option when you: +- Need raw content for further analysis +- Want to process the text with other tools +- Prefer to create your own summary later +- Need the transcript for documentation or archival purposes + +--- + +## Installation + +### Global Installation (Recommended) + +Install the skill globally to use it across all projects: + +```bash +# Clone the repository +git clone https://github.com/ericgandrade/cli-ai-skills.git +cd cli-ai-skills + +# Run the install script +./scripts/install-skills.sh $(pwd) +``` + +This creates symlinks in: +- `~/.copilot/skills/youtube-summarizer/` (GitHub Copilot CLI) +- `~/.claude/skills/youtube-summarizer/` (Claude Code) + +### Repository Installation + +Add to a specific project: + +```bash +# Copy skill to your project +cp -r cli-ai-skills/.github/skills/youtube-summarizer .github/skills/ +``` + +--- + +## Contributing + +Found a bug or have a feature request? Contributions welcome! + +1. Fork the repository +2. Create a feature branch: `git checkout -b feature/youtube-enhancement` +3. Commit changes: `git commit -m "feat(youtube-summarizer): add feature X"` +4. Push and create a Pull Request + +--- + +## License + +MIT License - see [LICENSE](../../../LICENSE) for details. 
+
+---
+
+## Acknowledgments
+
+- **[youtube-transcript-api](https://github.com/jdepoix/youtube-transcript-api)** by [Julien Depoix](https://github.com/jdepoix) - Python library for extracting YouTube video transcripts
+- **Anthropic STAR/R-I-S-E frameworks** - For structured summarization
+
+---
+
+**Built with ❀️ by Eric Andrade**
+
+*Version 1.2.0 | Last updated: February 2026*
diff --git a/skills/youtube-summarizer/SKILL.md b/skills/youtube-summarizer/SKILL.md
new file mode 100644
index 00000000..e3fc3bb6
--- /dev/null
+++ b/skills/youtube-summarizer/SKILL.md
@@ -0,0 +1,411 @@
+---
+name: youtube-summarizer
+description: "Extract transcripts from YouTube videos and generate comprehensive, detailed summaries using intelligent analysis frameworks"
+version: 1.2.0
+author: Eric Andrade
+created: 2025-02-01
+updated: 2026-02-04
+platforms: [github-copilot-cli, claude-code, codex]
+category: content
+tags: [video, summarization, transcription, youtube, content-analysis]
+risk: safe
+---
+
+# youtube-summarizer
+
+## Purpose
+
+This skill extracts transcripts from YouTube videos and generates comprehensive, verbose summaries using the STAR + R-I-S-E framework. It validates video availability, extracts transcripts using the `youtube-transcript-api` Python library, and produces detailed documentation capturing all insights, arguments, and key points.
+
+The skill is designed for users who need thorough content analysis and reference documentation from educational videos, lectures, tutorials, or informational content.
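
The core transformation the skill performs in Step 3 below β€” collapsing the segment list returned by `youtube-transcript-api` into one text block β€” can be previewed offline with dummy data. The dict shape mirrors the library's output; the consecutive-duplicate rule is an illustrative assumption, not the skill's exact dedup logic:

```python
def flatten_transcript(segments):
    """Join transcript segments into one string, skipping consecutive
    duplicate texts (a common artifact of auto-generated captions)."""
    parts = []
    for entry in segments:
        text = entry["text"].strip()
        if text and (not parts or parts[-1] != text):
            parts.append(text)
    return " ".join(parts)

# Dummy segments in the same {'text', 'start', 'duration'} shape the API returns
segments = [
    {"text": "Welcome to this tutorial", "start": 0.0, "duration": 2.1},
    {"text": "Welcome to this tutorial", "start": 1.9, "duration": 2.1},  # overlap artifact
    {"text": "on machine learning", "start": 4.0, "duration": 1.8},
]

print(flatten_transcript(segments))
# Welcome to this tutorial on machine learning
```

The flattened text is what gets written to the temporary transcript file and fed to the summarization prompt.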
+ +## When to Use This Skill + +This skill should be used when: + +- User provides a YouTube video URL and wants a detailed summary +- User needs to document video content for reference without rewatching +- User wants to extract insights, key points, and arguments from educational content +- User needs transcripts from YouTube videos for analysis +- User asks to "summarize", "resume", or "extract content" from YouTube videos +- User wants comprehensive documentation prioritizing completeness over brevity + +## Step 0: Discovery & Setup + +Before processing videos, validate the environment and dependencies: + +```bash +# Check if youtube-transcript-api is installed +python3 -c "import youtube_transcript_api" 2>/dev/null +if [ $? -ne 0 ]; then + echo "⚠️ youtube-transcript-api not found" + # Offer to install +fi + +# Check Python availability +if ! command -v python3 &>/dev/null; then + echo "❌ Python 3 is required but not installed" + exit 1 +fi +``` + +**Ask the user if dependency is missing:** + +``` +youtube-transcript-api is required but not installed. + +Would you like to install it now? +- [ ] Yes - Install with pip (pip install youtube-transcript-api) +- [ ] No - I'll install it manually +``` + +**If user selects "Yes":** + +```bash +pip install youtube-transcript-api +``` + +**Verify installation:** + +```bash +python3 -c "import youtube_transcript_api; print('βœ… youtube-transcript-api installed successfully')" +``` + +## Main Workflow + +### Progress Tracking Guidelines + +Throughout the workflow, display a visual progress gauge before each step to keep the user informed. 
The gauge format is: + +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] 20% - Step 1/5: Validating URL" +``` + +**Format specifications:** +- 20 characters wide (use β–ˆ for filled, β–‘ for empty) +- Percentage increments: Step 1=20%, Step 2=40%, Step 3=60%, Step 4=80%, Step 5=100% +- Step counter showing current/total (e.g., "Step 3/5") +- Brief description of current phase + +**Display the initial status box before Step 1:** + +``` +╔══════════════════════════════════════════════════════════════╗ +β•‘ πŸ“Ή YOUTUBE SUMMARIZER - Processing Video β•‘ +╠══════════════════════════════════════════════════════════════╣ +β•‘ β†’ Step 1: Validating URL [IN PROGRESS] β•‘ +β•‘ β—‹ Step 2: Checking Availability β•‘ +β•‘ β—‹ Step 3: Extracting Transcript β•‘ +β•‘ β—‹ Step 4: Generating Summary β•‘ +β•‘ β—‹ Step 5: Formatting Output β•‘ +╠══════════════════════════════════════════════════════════════╣ +β•‘ Progress: β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘ 20% β•‘ +β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• +``` + +### Step 1: Validate YouTube URL + +**Objective:** Extract video ID and validate URL format. 
+ +**Supported URL Formats:** +- `https://www.youtube.com/watch?v=VIDEO_ID` +- `https://youtube.com/watch?v=VIDEO_ID` +- `https://youtu.be/VIDEO_ID` +- `https://m.youtube.com/watch?v=VIDEO_ID` + +**Actions:** + +```bash +# Extract video ID using regex or URL parsing +URL="$USER_PROVIDED_URL" + +# Pattern 1: youtube.com/watch?v=VIDEO_ID +if echo "$URL" | grep -qE 'youtube\.com/watch\?v='; then + VIDEO_ID=$(echo "$URL" | sed -E 's/.*[?&]v=([^&]+).*/\1/') +# Pattern 2: youtu.be/VIDEO_ID +elif echo "$URL" | grep -qE 'youtu\.be/'; then + VIDEO_ID=$(echo "$URL" | sed -E 's/.*youtu\.be\/([^?]+).*/\1/') +else + echo "❌ Invalid YouTube URL format" + exit 1 +fi + +echo "πŸ“Ή Video ID extracted: $VIDEO_ID" +``` + +**If URL is invalid:** + +``` +❌ Invalid YouTube URL + +Please provide a valid YouTube URL in one of these formats: +- https://www.youtube.com/watch?v=VIDEO_ID +- https://youtu.be/VIDEO_ID + +Example: https://www.youtube.com/watch?v=dQw4w9WgXcQ +``` + +### Step 2: Check Video & Transcript Availability + +**Progress:** +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] 40% - Step 2/5: Checking Availability" +``` + +**Objective:** Verify video exists and transcript is accessible. 
+ +**Actions:** + +```python +from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled, NoTranscriptFound +import sys + +video_id = sys.argv[1] + +try: + # Get list of available transcripts + transcript_list = YouTubeTranscriptApi.list_transcripts(video_id) + + print(f"βœ… Video accessible: {video_id}") + print("πŸ“ Available transcripts:") + + for transcript in transcript_list: + print(f" - {transcript.language} ({transcript.language_code})") + if transcript.is_generated: + print(" [Auto-generated]") + +except TranscriptsDisabled: + print(f"❌ Transcripts are disabled for video {video_id}") + sys.exit(1) + +except NoTranscriptFound: + print(f"❌ No transcript found for video {video_id}") + sys.exit(1) + +except Exception as e: + print(f"❌ Error accessing video: {e}") + sys.exit(1) +``` + +**Error Handling:** + +| Error | Message | Action | +|-------|---------|--------| +| Video not found | "❌ Video does not exist or is private" | Ask user to verify URL | +| Transcripts disabled | "❌ Transcripts are disabled for this video" | Cannot proceed | +| No transcript available | "❌ No transcript found (not auto-generated or manually added)" | Cannot proceed | +| Private/restricted video | "❌ Video is private or restricted" | Ask for public video | + +### Step 3: Extract Transcript + +**Progress:** +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] 60% - Step 3/5: Extracting Transcript" +``` + +**Objective:** Retrieve transcript in preferred language. 
+
+**Actions:**
+
+```python
+import sys
+
+from youtube_transcript_api import YouTubeTranscriptApi
+
+video_id = "VIDEO_ID"
+
+try:
+    # Try to get transcript in user's preferred language first
+    # Fall back to English if not available
+    transcript = YouTubeTranscriptApi.get_transcript(
+        video_id,
+        languages=['pt', 'en']  # Prefer Portuguese, fallback to English
+    )
+
+    # Combine transcript segments into full text
+    full_text = " ".join([entry['text'] for entry in transcript])
+
+    # List available transcripts (language metadata)
+    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
+
+    print("βœ… Transcript extracted successfully")
+    print(f"πŸ“Š Transcript length: {len(full_text)} characters")
+
+    # Save to temporary file for processing
+    with open(f"/tmp/transcript_{video_id}.txt", "w") as f:
+        f.write(full_text)
+
+except Exception as e:
+    print(f"❌ Error extracting transcript: {e}")
+    sys.exit(1)
+```
+
+**Transcript Processing:**
+
+- Combine all transcript segments into coherent text
+- Preserve punctuation and formatting where available
+- Remove duplicate or overlapping segments (if auto-generated artifacts)
+- Store in temporary file for analysis
+
+### Step 4: Generate Comprehensive Summary
+
+**Progress:**
+```bash
+echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘] 80% - Step 4/5: Generating Summary"
+```
+
+**Objective:** Apply enhanced STAR + R-I-S-E prompt to create detailed summary.
+
+**Prompt Applied:**
+
+Use the enhanced prompt from Phase 2 (STAR + R-I-S-E framework) with the extracted transcript as input.
+
+**Actions:**
+
+1. Load the full transcript text
+2. Apply the comprehensive summarization prompt
+3. Use AI model (Claude/GPT) to generate structured summary
+4.
Ensure output follows the defined structure: + - Header with video metadata + - Executive synthesis + - Detailed section-by-section breakdown + - Key insights and conclusions + - Concepts and terminology + - Resources and references + +**Implementation:** + +```bash +# Use the transcript file as input to the AI prompt +TRANSCRIPT_FILE="/tmp/transcript_${VIDEO_ID}.txt" + +# The AI agent will: +# 1. Read the transcript +# 2. Apply the STAR + R-I-S-E summarization framework +# 3. Generate comprehensive Markdown output +# 4. Structure with headers, lists, and highlights + +Read "$TRANSCRIPT_FILE" # Read transcript into context +``` + +Then apply the full summarization prompt (from enhanced version in Phase 2). + +### Step 5: Format and Present Output + +**Progress:** +```bash +echo "[β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ] 100% - Step 5/5: Formatting Output" +``` + +**Objective:** Deliver the summary in clean, well-structured Markdown. + +**Output Structure:** + +```markdown +# [Video Title] + +**Canal:** [Channel Name] +**DuraΓ§Γ£o:** [Duration] +**URL:** [https://youtube.com/watch?v=VIDEO_ID] +**Data de PublicaΓ§Γ£o:** [Date if available] + + +## πŸ“ Detailed Summary + +### [Topic 1] + +[Comprehensive explanation with examples, data, quotes...] + +#### [Subtopic 1.1] + +[Detailed breakdown...] + +### [Topic 2] + +[Continued detailed analysis...] + + +## πŸ“š Concepts and Terminology + +- **[Term 1]:** [Definition and context] +- **[Term 2]:** [Definition and context] + + +## πŸ“Œ Conclusion + +[Final synthesis and takeaways] + + +### **Example 2: Missing Dependency** + +**User Input:** +``` +claude> summarize this youtube video https://youtu.be/abc123 +``` + +**Skill Response:** + +``` +⚠️ youtube-transcript-api not installed + +This skill requires the Python library 'youtube-transcript-api'. + +Would you like me to install it now? 
+
+- [ ] Yes - Install with pip
+- [ ] No - I'll install manually
+```
+
+**User selects "Yes":**
+
+```bash
+$ pip install youtube-transcript-api
+Successfully installed youtube-transcript-api-0.6.1
+
+✅ Installation complete! Proceeding with video summary...
+```
+
+
+### **Example 4: Invalid URL**
+
+**User Input:**
+```
+claude> summarize youtube video www.youtube.com/some-video
+```
+
+**Skill Response:**
+
+```
+❌ Invalid YouTube URL format
+
+Expected format examples:
+- https://www.youtube.com/watch?v=VIDEO_ID
+- https://youtu.be/VIDEO_ID
+
+Please provide a valid YouTube video URL.
+```
+
+
+**Generated summary (excerpt):**
+
+```markdown
+## 📊 Executive Summary
+
+This video provides a comprehensive introduction to the fundamental concepts of Artificial Intelligence (AI), designed for beginners and professionals who want to understand the technical foundations and practical applications of modern AI. The instructor covers everything from basic definitions to machine learning algorithms, using practical examples and visualizations to facilitate understanding.
+
+[... continued detailed summary ...]
+```
+
+**Save Options:**
+
+```
+What would you like to save?
+→ Summary + raw transcript
+
+✅ File saved: resumo-exemplo123-2026-02-01.md (includes raw transcript)
+[████████████████████] 100% - ✓ Processing complete!
+```
+
+**Appended raw transcript (excerpt):**
+
+```
+Welcome to this comprehensive tutorial on machine learning fundamentals. In today's video, we'll explore the core concepts that power modern AI systems...
+```
+
+
+**Version:** 1.2.0
+**Last Updated:** 2026-02-02
+**Maintained By:** Eric Andrade
diff --git a/skills/youtube-summarizer/scripts/extract-transcript.py b/skills/youtube-summarizer/scripts/extract-transcript.py
new file mode 100644
index 00000000..8c06569d
--- /dev/null
+++ b/skills/youtube-summarizer/scripts/extract-transcript.py
@@ -0,0 +1,65 @@
+#!/usr/bin/env python3
+"""
+Extract YouTube video transcript
+Usage: ./extract-transcript.py VIDEO_ID [LANGUAGE_CODE]
+"""
+
+import sys
+from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled, NoTranscriptFound
+
+def extract_transcript(video_id, language='en'):
+    """Extract transcript from YouTube video"""
+    try:
+        # Try to get transcript in specified language with fallback to English
+        transcript = YouTubeTranscriptApi.get_transcript(
+            video_id,
+            languages=[language, 'en']
+        )
+
+        # Combine all transcript segments
+        full_text = " ".join([entry['text'] for entry in transcript])
+        return full_text
+
+    except TranscriptsDisabled:
+        print(f"❌ Transcripts are disabled for video {video_id}", file=sys.stderr)
+        sys.exit(1)
+    except NoTranscriptFound:
+        print(f"❌ No transcript found for video {video_id}", file=sys.stderr)
+        sys.exit(1)
+    except Exception as e:
+        print(f"❌ Error: {e}", file=sys.stderr)
+        sys.exit(1)
+
+def list_available_transcripts(video_id):
+    """List all available transcripts for a video"""
+    try:
+        transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
+        print(f"✅ Available transcripts for {video_id}:")
+
+        for transcript in transcript_list:
+            generated = "[Auto-generated]" if transcript.is_generated else "[Manual]"
+            translatable = "(translatable)" if transcript.is_translatable else ""
+            print(f"  - {transcript.language} ({transcript.language_code}) {generated} {translatable}")
+
+        return True
+    except Exception as e:
+        print(f"❌ Error listing transcripts: {e}", file=sys.stderr)
+        return False
+
+if __name__ == "__main__":
+    if len(sys.argv) < 2:
+        print("Usage: ./extract-transcript.py VIDEO_ID [LANGUAGE_CODE]")
+        print("       ./extract-transcript.py VIDEO_ID --list  (list available transcripts)")
+        sys.exit(1)
+
+    video_id = sys.argv[1]
+
+    # Check if user wants to list available transcripts
+    if len(sys.argv) > 2 and sys.argv[2] == "--list":
+        list_available_transcripts(video_id)
+        sys.exit(0)
+
+    # Extract transcript
+    language = sys.argv[2] if len(sys.argv) > 2 else 'en'
+    transcript = extract_transcript(video_id, language)
+    print(transcript)
diff --git a/skills/youtube-summarizer/scripts/install-dependencies.sh b/skills/youtube-summarizer/scripts/install-dependencies.sh
new file mode 100644
index 00000000..59daee7c
--- /dev/null
+++ b/skills/youtube-summarizer/scripts/install-dependencies.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+# Install youtube-transcript-api dependency
+
+set -e
+
+echo "📦 Installing youtube-transcript-api..."
+
+if command -v pip3 &>/dev/null; then
+  pip3 install youtube-transcript-api
+  echo "✅ Installation complete using pip3!"
+elif command -v pip &>/dev/null; then
+  pip install youtube-transcript-api
+  echo "✅ Installation complete using pip!"
+else
+  echo "❌ Error: pip not found"
+  echo "Please install Python pip first:"
+  echo "  macOS: brew install python3"
+  echo "  Ubuntu/Debian: sudo apt install python3-pip"
+  echo "  Fedora: sudo dnf install python3-pip"
+  exit 1
+fi
+
+# Verify installation
+python3 -c "import youtube_transcript_api; print('✅ youtube-transcript-api is ready to use!')" 2>/dev/null || {
+  echo "⚠️ Installation completed but verification failed"
+  echo "Try running: python3 -c 'import youtube_transcript_api'"
+  exit 1
+}
diff --git a/skills_index.json b/skills_index.json
index 3b822827..0383e224 100644
--- a/skills_index.json
+++ b/skills_index.json
@@ -404,6 +404,15 @@
     "risk": "unknown",
     "source": "unknown"
   },
+  {
+    "id": "audio-transcriber",
+    "path": "skills/audio-transcriber",
+    "category": "uncategorized",
+    "name": "audio-transcriber",
+    "description": "Transform audio recordings into professional Markdown documentation with intelligent summaries using LLM integration",
+    "risk": "safe",
+    "source": "unknown"
+  },
   {
     "id": "auth-implementation-patterns",
     "path": "skills/auth-implementation-patterns",
     "category": "uncategorized",
@@ -3910,8 +3919,8 @@
     "path": "skills/prompt-engineer",
     "category": "uncategorized",
     "name": "prompt-engineer",
-    "description": "Expert prompt engineer specializing in advanced prompting techniques, LLM optimization, and AI system design. Masters chain-of-thought, constitutional AI, and production prompt strategies. Use when building AI features, improving agent performance, or crafting system prompts.",
-    "risk": "unknown",
+    "description": "Transforms user prompts into optimized prompts using frameworks (RTF, RISEN, Chain of Thought, RODES, Chain of Density, RACE, RISE, STAR, SOAP, CLEAR, GROW)",
+    "risk": "safe",
     "source": "unknown"
   },
   {
@@ -4639,8 +4648,8 @@
     "path": "skills/skill-creator",
     "category": "uncategorized",
     "name": "skill-creator",
-    "description": "Guide for creating effective skills.
This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations.", - "risk": "unknown", + "description": "This skill should be used when the user asks to create a new skill, build a skill, make a custom skill, develop a CLI skill, or wants to extend the CLI with new capabilities. Automates the entire skill creation workflow from brainstorming to installation.", + "risk": "safe", "source": "unknown" }, { @@ -5669,6 +5678,15 @@ "risk": "unknown", "source": "unknown" }, + { + "id": "youtube-summarizer", + "path": "skills/youtube-summarizer", + "category": "uncategorized", + "name": "youtube-summarizer", + "description": "Extract transcripts from YouTube videos and generate comprehensive, detailed summaries using intelligent analysis frameworks", + "risk": "safe", + "source": "unknown" + }, { "id": "zapier-make-patterns", "path": "skills/zapier-make-patterns",