- Add audio-transcriber skill (v1.2.0): Transform audio to Markdown with Whisper
- Add youtube-summarizer skill (v1.2.0): Generate summaries from YouTube videos
- Update prompt-engineer skill: Enhanced with 11 optimization frameworks
- Update skill-creator skill: Improved automation workflow

All skills are zero-config, cross-platform (Claude Code, Copilot CLI, Codex), and follow Quality Bar V4 standards.

Source: https://github.com/ericgandrade/cli-ai-skills

# Changelog - audio-transcriber

All notable changes to the audio-transcriber skill will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---

## [1.1.0] - 2026-02-03

### ✨ Added

- **Intelligent Prompt Workflow** (Step 3b) - Complete integration with the prompt-engineer skill
  - **Scenario A**: User-provided prompts are automatically improved with prompt-engineer
    - Displays the original and improved versions side by side
    - Single confirmation: "Usar versão melhorada? [s/n]" ("Use improved version? [y/n]")
  - **Scenario B**: Auto-generation when no prompt is provided
    - Analyzes the transcript and suggests a document type: ata (meeting minutes), resumo (summary), or notas (notes)
    - Shows the suggestion and asks for confirmation
    - Generates a complete structured prompt (RISEN/RODES/STAR)
    - Shows a preview and asks for final confirmation
    - Falls back to `DEFAULT_MEETING_PROMPT` if declined

- **LLM Integration** - Process transcripts with Claude CLI or GitHub Copilot CLI
  - Priority: Claude > GitHub Copilot > None (transcript-only mode)
  - Step 0b: CLI detection logic documented
  - Timeout handling (5 minutes default)
  - Graceful fallback if CLI unavailable

- **Progress Indicators** - Visual feedback during long operations
  - `tqdm` progress bar for Whisper transcription segments
  - `rich` spinner for LLM processing
  - Clear status messages at each step

- **Timestamp-based File Naming** - Avoid overwriting previous transcriptions
  - Format: `transcript-YYYYMMDD-HHMMSS.md`
  - Format: `ata-YYYYMMDD-HHMMSS.md`
  - Prevents data loss from repeated runs

- **Automatic Cleanup** - Remove temporary files after processing
  - Deletes `metadata.json` and `transcription.json` automatically
  - `--keep-temp` flag to preserve if needed
  - Clean output directory

- **Rich Terminal UI** - Formatted output with the `rich` library
  - Formatted panels for prompt previews
  - Color-coded status messages (green=success, yellow=warning, red=error)
  - Spinner animations for long-running tasks

- **Dual Output Support** - Generate both the transcript and a processed ata (meeting minutes)
  - `transcript-*.md` - Raw transcription with timestamps
  - `ata-*.md` - Intelligent summary/meeting minutes (if an LLM is available)
  - User can decline LLM processing to get a transcript only

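The LLM Integration entry above describes a priority order (Claude > GitHub Copilot > none) and a 5-minute default timeout. A minimal sketch of that behavior follows; it is not the skill's actual `detect_cli_tool()`. The executable names `claude` and `copilot`, and the helper names `pick_cli` and `run_with_timeout`, are assumptions for illustration.

```python
import shutil
import subprocess
from typing import Callable, List, Optional, Sequence

# Assumed executable names, highest priority first. The changelog does not
# state which binaries the skill actually probes for.
CLI_PRIORITY: Sequence[str] = ("claude", "copilot")


def pick_cli(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the first CLI found on PATH in priority order, or None."""
    for name in CLI_PRIORITY:
        if which(name):
            return name
    return None  # no CLI available: transcript-only mode


def run_with_timeout(cmd: List[str], timeout_s: float = 300.0) -> Optional[str]:
    """Run a command and return its stdout, or None on timeout or failure."""
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # 300 s mirrors the 5-minute default above
            check=True,
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, FileNotFoundError):
        return None  # caller falls back gracefully
    return result.stdout
```

Passing `which` as a parameter keeps the priority logic testable without either CLI installed; returning `None` from both helpers lets the caller treat "no CLI", "CLI failed", and "CLI timed out" as the same transcript-only fallback.
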
### 🔧 Changed

- **SKILL.md** - Major documentation updates
  - Added Step 0b (CLI Detection)
  - Updated Step 2 (Progress Indicators)
  - Added Step 3b (Intelligent Prompt Workflow with 150+ lines)
  - Updated version to 1.1.0
  - Added detailed workflow diagrams for both scenarios

- **install-requirements.sh** - Added UI libraries
  - Now installs `tqdm` and `rich` packages
  - Graceful fallback if installation fails
  - Updated success messages

- **Python Implementation** - Complete refactor
  - Created `scripts/transcribe.py` (516 lines)
  - Functions: `detect_cli_tool()`, `invoke_prompt_engineer()`, `handle_prompt_workflow()`, `process_with_llm()`, `transcribe_audio()`, `save_outputs()`, `cleanup_temp_files()`
  - Command-line arguments: `--prompt`, `--model`, `--output-dir`, `--keep-temp`
  - Auto-installs `rich` and `tqdm` if missing

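The four command-line arguments listed above could be declared roughly as follows. This is a sketch only: the positional `audio_file` argument, the default values, and the help strings are assumptions, since the changelog names the flags but not their semantics in detail.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the flags named in the changelog."""
    parser = argparse.ArgumentParser(description="Transcribe audio to Markdown")
    # Assumption: the input file is given as a positional argument.
    parser.add_argument("audio_file", help="path to the audio file to transcribe")
    parser.add_argument("--prompt", default=None,
                        help="custom prompt for LLM post-processing")
    # Assumption: "base" as default model size is illustrative only.
    parser.add_argument("--model", default="base",
                        help="Whisper model size to load")
    parser.add_argument("--output-dir", default=".",
                        help="directory for the generated Markdown files")
    parser.add_argument("--keep-temp", action="store_true",
                        help="keep metadata.json and transcription.json")
    return parser
```

With this layout, `build_parser().parse_args(["meeting.mp3", "--keep-temp"])` yields a namespace whose `keep_temp` is `True` and whose `prompt` is `None`, which a caller could use to choose between the two prompt scenarios.
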
### 🐛 Fixed

- **User prompts no longer ignored** - v1.0.0 completely ignored custom prompts
  - Now processes all prompts (custom or auto-generated) with the LLM
  - Improves simple prompts into structured frameworks

- **Temporary files cleanup** - v1.0.0 left `metadata.json` and `transcription.json` behind
  - Now automatically removed after processing
  - Clean output directory

- **File overwriting** - v1.0.0 used the same filename (e.g., `meeting.md`) every time
  - Now uses a timestamp to prevent data loss
  - Each run creates unique files

- **Missing ata/summary** - v1.0.0 only generated the raw transcript
  - Now generates an intelligent ata (meeting minutes) or resumo (summary) using the LLM
  - Respects the user's prompt instructions

- **No progress feedback** - v1.0.0 processed silently (users could not tell whether it had frozen)
  - Now shows a progress bar for transcription
  - Shows a spinner for LLM processing
  - Clear status messages throughout

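The file-overwriting and temporary-file fixes above can be illustrated with a short sketch. The helper names `timestamped_name` and `remove_temp_files` are hypothetical and are not taken from `scripts/transcribe.py`; only the filename pattern, the two JSON filenames, and the `--keep-temp` behavior come from this changelog.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional

# Temporary files named in the changelog.
TEMP_FILES = ("metadata.json", "transcription.json")


def timestamped_name(prefix: str, now: Optional[datetime] = None) -> str:
    """Build a name like 'transcript-YYYYMMDD-HHMMSS.md'."""
    now = now or datetime.now()
    return f"{prefix}-{now.strftime('%Y%m%d-%H%M%S')}.md"


def remove_temp_files(output_dir: Path, keep_temp: bool = False) -> List[str]:
    """Delete known temporary files and return the names that were removed."""
    removed: List[str] = []
    if keep_temp:  # mirrors the --keep-temp flag
        return removed
    for name in TEMP_FILES:
        path = output_dir / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed
```

Because the timestamp has one-second resolution, two runs started within the same second would still produce the same name; a real implementation might need an extra guard for that case.
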
### 📝 Notes

- **Backward Compatibility:** Fully compatible with v1.0.0 workflows
- **Requires:** Python 3.8+, faster-whisper OR whisper, tqdm, rich
- **Optional:** Claude CLI or GitHub Copilot CLI for intelligent processing
- **Optional:** prompt-engineer skill for automatic prompt generation

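The "faster-whisper OR whisper" requirement above implies that one of two backends is chosen at runtime. A minimal sketch of such a check follows, assuming the usual import names `faster_whisper` and `whisper`; the preference for faster-whisper and the helper name `pick_backend` are assumptions, not documented behavior of the skill.

```python
import importlib.util
from typing import Callable, Optional

# Import names of the two packages, in an assumed order of preference.
BACKENDS = ("faster_whisper", "whisper")


def pick_backend(
    find_spec: Callable[[str], Optional[object]] = importlib.util.find_spec,
) -> Optional[str]:
    """Return the first installed backend module name, or None if neither exists."""
    for module_name in BACKENDS:
        if find_spec(module_name) is not None:
            return module_name
    return None
```

`importlib.util.find_spec` checks whether a top-level module is importable without actually importing it, so the check stays cheap even though both libraries are slow to load.
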
### 🔗 Related Issues

- Fixes #1: User's RISEN prompt ignored
- Fixes #2: Temporary files (metadata.json, transcription.json) left behind
- Fixes #3: Incomplete output (raw transcript only, no ata)
- Fixes #4: Missing visual progress indicator
- Fixes #5: Output format without timestamp

---

## [1.0.0] - 2026-02-02

### ✨ Initial Release

- Audio transcription using Faster-Whisper or OpenAI Whisper
- Automatic language detection
- Speaker diarization (basic)
- Voice Activity Detection (VAD)
- Markdown output with metadata table
- Installation script for dependencies
- Example scripts for basic transcription
- Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
- FFmpeg integration for format conversion
- Zero-configuration philosophy

### 📝 Known Limitations (Fixed in v1.1.0)

- User prompts ignored (no LLM integration)
- Only raw transcript generated (no ata/summary)
- Temporary files not cleaned up
- No progress indicators
- Files overwritten on repeated runs