daymade bd0aa12004 Release v1.8.0: Add transcript-fixer skill
## New Skill: transcript-fixer v1.0.0

Correct speech-to-text (ASR/STT) transcription errors through dictionary-based rules and AI-powered corrections with automatic pattern learning.

**Features:**
- Two-stage correction pipeline (dictionary + AI)
- Automatic pattern detection and learning
- Domain-specific dictionaries (general, embodied_ai, finance, medical)
- SQLite-based correction repository
- Team collaboration with import/export
- GLM API integration for AI corrections
- Cost optimization through dictionary promotion

**Use cases:**
- Correcting meeting notes, lecture recordings, or interview transcripts
- Fixing Chinese/English homophone errors and technical terminology
- Building domain-specific correction dictionaries
- Improving transcript accuracy through iterative learning

**Documentation:**
- Complete workflow guides in references/
- SQL query templates
- Troubleshooting guide
- Team collaboration patterns
- API setup instructions

**Marketplace updates:**
- Updated marketplace to v1.8.0
- Added transcript-fixer plugin (category: productivity)
- Updated README.md with skill description and use cases
- Updated CLAUDE.md with skill listing and counts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 13:16:37 +08:00


Setup Guide

Complete installation and configuration guide for transcript-fixer.

Table of Contents

  • Installation
  • API Configuration
  • Environment Setup
  • Next Steps

Installation

Dependencies

Install required dependencies using uv:

uv pip install -r requirements.txt

Or sync the project environment:

uv sync

Required packages:

  • anthropic - For Claude API integration (planned)
  • requests - For GLM API calls

Note: difflib, used for diff generation, is part of the Python standard library and requires no installation.
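As a quick illustration of how difflib supports diff generation, the sketch below compares an original transcript line against its corrected form. The sample sentences are invented for this example; the skill's own diff output may differ.

```python
import difflib

original = "The pacient was transfered to the ward."
corrected = "The patient was transferred to the ward."

# unified_diff compares two sequences of lines; lineterm="" keeps output clean
diff = list(difflib.unified_diff(
    original.splitlines(),
    corrected.splitlines(),
    fromfile="original",
    tofile="corrected",
    lineterm="",
))
print("\n".join(diff))
```

Lines removed from the original are prefixed with `-`, and corrected lines with `+`, which makes a before/after review straightforward.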

Database Initialization

Initialize the SQLite database (first time only):

uv run scripts/fix_transcription.py --init

This creates ~/.transcript-fixer/corrections.db with the complete schema:

  • 8 tables (corrections, context_rules, history, suggestions, etc.)
  • 3 views (active_corrections, pending_suggestions, statistics)
  • ACID transactions enabled
  • Automatic backups before migrations

See file_formats.md for complete database schema.
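The database is plain SQLite, so corrections can also be inspected directly with Python's sqlite3 module. The sketch below uses an in-memory database as a stand-in for ~/.transcript-fixer/corrections.db; the column names (wrong_term, correct_term, active) are assumptions for illustration — see file_formats.md for the real schema.

```python
import sqlite3

# In-memory stand-in for ~/.transcript-fixer/corrections.db
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE corrections (wrong_term TEXT, correct_term TEXT, active INTEGER)"
)
conn.execute("INSERT INTO corrections VALUES ('transcirpt', 'transcript', 1)")

# Fetch only active correction rules
rows = conn.execute(
    "SELECT wrong_term, correct_term FROM corrections WHERE active = 1"
).fetchall()
print(rows)
```

To inspect the real database, point sqlite3.connect at the path created by --init instead of ":memory:".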

API Configuration

GLM API Key (Required for Stage 2)

Stage 2 AI corrections require a GLM API key.

  1. Obtain API key: Visit https://open.bigmodel.cn/
  2. Register for an account
  3. Generate an API key from the dashboard
  4. Set environment variable:
export GLM_API_KEY="your-api-key-here"

Persistence: Add to shell profile for permanent access:

# For bash
echo 'export GLM_API_KEY="your-key"' >> ~/.bashrc
source ~/.bashrc

# For zsh
echo 'export GLM_API_KEY="your-key"' >> ~/.zshrc
source ~/.zshrc
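A script can check for the key at startup and degrade gracefully when it is missing; a minimal sketch (the variable names here are illustrative, only GLM_API_KEY comes from the setup above):

```python
import os

# Stage 2 depends on GLM_API_KEY being set in the environment
api_key = os.environ.get("GLM_API_KEY", "")
stage2_enabled = bool(api_key)
print("Stage 2 AI corrections enabled:", stage2_enabled)
```

If the key is absent, Stage 1 dictionary corrections can still run; only the AI pass is skipped.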

Verify Configuration

Run validation to check setup:

uv run scripts/fix_transcription.py --validate

Expected output:

🔍 Validating transcript-fixer configuration...

✅ Configuration directory exists: ~/.transcript-fixer
✅ Database valid: 0 corrections
✅ All 8 tables present
✅ GLM_API_KEY is set

============================================================
✅ All checks passed! Configuration is valid.
============================================================

Environment Setup

Python Environment

Required: Python 3.8+

Recommended: Use uv for all Python operations:

# Use uv to run project scripts
uv run scripts/fix_transcription.py  # ✅ Correct

# Never invoke the system python directly
python scripts/fix_transcription.py  # ❌ Wrong

Directory Structure

After initialization, the directory structure is:

~/.transcript-fixer/
├── corrections.db              # SQLite database
├── corrections.YYYYMMDD.bak   # Automatic backups
└── (migration artifacts)

Important: The .db file should NOT be committed to Git. Export corrections to JSON for version control instead.
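Exporting to JSON keeps corrections diffable and mergeable in Git. The skill's own import/export commands handle this; the sketch below only illustrates the idea, using an in-memory database and an assumed two-column schema.

```python
import json
import sqlite3

# In-memory stand-in for corrections.db; schema is assumed for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE corrections (wrong_term TEXT, correct_term TEXT)")
conn.executemany(
    "INSERT INTO corrections VALUES (?, ?)",
    [("GLM for", "GLM-4"), ("embody AI", "embodied AI")],
)

# Dump rows as a JSON list suitable for version control
rows = conn.execute("SELECT wrong_term, correct_term FROM corrections").fetchall()
export = [{"wrong": w, "correct": c} for w, c in rows]
print(json.dumps(export, ensure_ascii=False, indent=2))
```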

Next Steps

After setup:

  1. Add initial corrections (5-10 terms)
  2. Run first correction on a test file
  3. Review learned suggestions after 3-5 runs
  4. Build domain-specific dictionaries
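To give a feel for what those first corrections do, here is a minimal sketch of a Stage 1 dictionary pass. The real pipeline lives in scripts/fix_transcription.py; this replacement logic and the sample rules are assumptions for illustration only.

```python
# Hypothetical correction rules: wrong term -> correct term
corrections = {
    "transcirpt": "transcript",
    "GLM for": "GLM-4",
}

def apply_dictionary(text, rules):
    # Apply longer rules first so overlapping terms don't clobber each other
    for wrong in sorted(rules, key=len, reverse=True):
        text = text.replace(wrong, rules[wrong])
    return text

print(apply_dictionary("Fix this transcirpt with GLM for.", corrections))
# → Fix this transcript with GLM-4.
```

Stage 2 would then send the dictionary-corrected text to the AI for context-aware fixes the rules cannot express.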

See workflow_guide.md for detailed usage instructions.