Files
antigravity-skills-reference/skills/audio-transcriber/CHANGELOG.md
Eric Andrade 801c8fa475 feat: add 4 universal skills from cli-ai-skills
- Add audio-transcriber skill (v1.2.0): Transform audio to Markdown with Whisper
- Add youtube-summarizer skill (v1.2.0): Generate summaries from YouTube videos
- Update prompt-engineer skill: Enhanced with 11 optimization frameworks
- Update skill-creator skill: Improved automation workflow

All skills are zero-config, cross-platform (Claude Code, Copilot CLI, Codex)
and follow Quality Bar V4 standards.

Source: https://github.com/ericgandrade/cli-ai-skills
2026-02-04 17:37:45 -03:00

5.3 KiB

Changelog - audio-transcriber

All notable changes to the audio-transcriber skill will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


[1.1.0] - 2026-02-03

Added

  • Intelligent Prompt Workflow (Step 3b) - Complete integration with prompt-engineer skill

    • Scenario A: User-provided prompts are automatically improved with prompt-engineer
      • Displays both original and improved versions side-by-side
      • Single confirmation: "Usar versão melhorada? [s/n]"
    • Scenario B: Auto-generation when no prompt provided
      • Analyzes transcript and suggests document type (ata, resumo, notas)
      • Shows suggestion and asks confirmation
      • Generates complete structured prompt (RISEN/RODES/STAR)
      • Shows preview and asks final confirmation
      • Falls back to DEFAULT_MEETING_PROMPT if declined
  • LLM Integration - Process transcripts with Claude CLI or GitHub Copilot CLI

    • Priority: Claude > GitHub Copilot > None (transcript-only mode)
    • Step 0b: CLI detection logic documented
    • Timeout handling (5 minutes default)
    • Graceful fallback if CLI unavailable
  • Progress Indicators - Visual feedback during long operations

    • tqdm progress bar for Whisper transcription segments
    • rich spinner for LLM processing
    • Clear status messages at each step
  • Timestamp-based File Naming - Avoid overwriting previous transcriptions

    • Format: transcript-YYYYMMDD-HHMMSS.md
    • Format: ata-YYYYMMDD-HHMMSS.md
    • Prevents data loss from repeated runs
  • Automatic Cleanup - Remove temporary files after processing

    • Deletes metadata.json and transcription.json automatically
    • --keep-temp flag to preserve if needed
    • Clean output directory
  • Rich Terminal UI - Beautiful output with rich library

    • Formatted panels for prompt previews
    • Color-coded status messages (green=success, yellow=warning, red=error)
    • Spinner animations for long-running tasks
  • Dual Output Support - Generate both transcript and processed ata

    • transcript-*.md - Raw transcription with timestamps
    • ata-*.md - Intelligent summary/meeting minutes (if LLM available)
    • User can decline LLM processing to get transcript-only

🔧 Changed

  • SKILL.md - Major documentation updates

    • Added Step 0b (CLI Detection)
    • Updated Step 2 (Progress Indicators)
    • Added Step 3b (Intelligent Prompt Workflow with 150+ lines)
    • Updated version to 1.1.0
    • Added detailed workflow diagrams for both scenarios
  • install-requirements.sh - Added UI libraries

    • Now installs tqdm and rich packages
    • Graceful fallback if installation fails
    • Updated success messages
  • Python Implementation - Complete refactor

    • Created scripts/transcribe.py (516 lines)
    • Functions: detect_cli_tool(), invoke_prompt_engineer(), handle_prompt_workflow(), process_with_llm(), transcribe_audio(), save_outputs(), cleanup_temp_files()
    • Command-line arguments: --prompt, --model, --output-dir, --keep-temp
    • Auto-installs rich and tqdm if missing

🐛 Fixed

  • User prompts no longer ignored - v1.0.0 completely ignored custom prompts

    • Now processes all prompts (custom or auto-generated) with LLM
    • Improves simple prompts into structured frameworks
  • Temporary files cleanup - v1.0.0 left metadata.json and transcription.json as trash

    • Now automatically removed after processing
    • Clean output directory
  • File overwriting - v1.0.0 used same filename (e.g., meeting.md) every time

    • Now uses timestamp to prevent data loss
    • Each run creates unique files
  • Missing ata/summary - v1.0.0 only generated raw transcript

    • Now generates intelligent ata/resumo using LLM
    • Respects user's prompt instructions
  • No progress feedback - v1.0.0 had silent processing (users didn't know if it froze)

    • Now shows progress bar for transcription
    • Shows spinner for LLM processing
    • Clear status messages throughout

📝 Notes

  • Backward Compatibility: Fully compatible with v1.0.0 workflows
  • Requires: Python 3.8+, faster-whisper OR whisper, tqdm, rich
  • Optional: Claude CLI or GitHub Copilot CLI for intelligent processing
  • Optional: prompt-engineer skill for automatic prompt generation
  • Fixes #1: Prompt do usuário RISEN ignorado
  • Fixes #2: Arquivos temporários (metadata.json, transcription.json) deixados como lixo
  • Fixes #3: Output incompleto (apenas transcript RAW, sem ata)
  • Fixes #4: Falta de indicador de progresso visual
  • Fixes #5: Formato de saída sem timestamp

[1.0.0] - 2026-02-02

Initial Release

  • Audio transcription using Faster-Whisper or OpenAI Whisper
  • Automatic language detection
  • Speaker diarization (basic)
  • Voice Activity Detection (VAD)
  • Markdown output with metadata table
  • Installation script for dependencies
  • Example scripts for basic transcription
  • Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC, WEBM)
  • FFmpeg integration for format conversion
  • Zero-configuration philosophy

📝 Known Limitations (Fixed in v1.1.0)

  • User prompts ignored (no LLM integration)
  • Only raw transcript generated (no ata/summary)
  • Temporary files not cleaned up
  • No progress indicators
  • Files overwritten on repeated runs