## New Skill: transcript-fixer v1.0.0 Correct speech-to-text (ASR/STT) transcription errors through dictionary-based rules and AI-powered corrections with automatic pattern learning. **Features:** - Two-stage correction pipeline (dictionary + AI) - Automatic pattern detection and learning - Domain-specific dictionaries (general, embodied_ai, finance, medical) - SQLite-based correction repository - Team collaboration with import/export - GLM API integration for AI corrections - Cost optimization through dictionary promotion **Use cases:** - Correcting meeting notes, lecture recordings, or interview transcripts - Fixing Chinese/English homophone errors and technical terminology - Building domain-specific correction dictionaries - Improving transcript accuracy through iterative learning **Documentation:** - Complete workflow guides in references/ - SQL query templates - Troubleshooting guide - Team collaboration patterns - API setup instructions **Marketplace updates:** - Updated marketplace to v1.8.0 - Added transcript-fixer plugin (category: productivity) - Updated README.md with skill description and use cases - Updated CLAUDE.md with skill listing and counts 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
314 lines
8.1 KiB
Markdown
314 lines
8.1 KiB
Markdown
# Troubleshooting Guide
|
|
|
|
Solutions to common issues and error conditions.
|
|
|
|
## Table of Contents
|
|
|
|
- [API Authentication Errors](#api-authentication-errors)
|
|
- [GLM_API_KEY Not Set](#glm_api_key-not-set)
|
|
- [Invalid API Key](#invalid-api-key)
|
|
- [Learning System Issues](#learning-system-issues)
|
|
- [No Suggestions Generated](#no-suggestions-generated)
|
|
- [Database Issues](#database-issues)
|
|
- [Database Not Found](#database-not-found)
|
|
- [Database Locked](#database-locked)
|
|
- [Corrupted Database](#corrupted-database)
|
|
- [Missing Tables](#missing-tables)
|
|
- [Common Pitfalls](#common-pitfalls)
|
|
- [1. Stage Order Confusion](#1-stage-order-confusion)
|
|
- [2. Overwriting Imports](#2-overwriting-imports)
|
|
- [3. Ignoring Learned Suggestions](#3-ignoring-learned-suggestions)
|
|
- [4. Testing on Large Files](#4-testing-on-large-files)
|
|
- [5. Manual Database Edits Without Validation](#5-manual-database-edits-without-validation)
|
|
- [6. Committing .db Files to Git](#6-committing-db-files-to-git)
|
|
- [Validation Commands](#validation-commands)
|
|
- [Quick Health Check](#quick-health-check)
|
|
- [Detailed Diagnostics](#detailed-diagnostics)
|
|
- [Getting Help](#getting-help)
|
|
|
|
## API Authentication Errors
|
|
|
|
### GLM_API_KEY Not Set
|
|
|
|
**Symptom**:
|
|
```
|
|
❌ Error: GLM_API_KEY environment variable not set
|
|
Set it with: export GLM_API_KEY='your-key'
|
|
```
|
|
|
|
**Solution**:
|
|
```bash
|
|
# Check if key is set
|
|
echo $GLM_API_KEY
|
|
|
|
# If empty, export key
|
|
export GLM_API_KEY="your-api-key-here"
|
|
|
|
# Verify
|
|
uv run scripts/fix_transcription.py --validate
|
|
```
|
|
|
|
**Persistence**: Add to shell profile (`.bashrc` or `.zshrc`) for permanent access.
|
|
|
|
See `glm_api_setup.md` for detailed API key management.
|
|
|
|
### Invalid API Key
|
|
|
|
**Symptom**: API calls fail with 401/403 errors
|
|
|
|
**Solutions**:
|
|
1. Verify key is correct (copy from https://open.bigmodel.cn/)
|
|
2. Check for extra spaces or quotes in the key
|
|
3. Regenerate key if compromised
|
|
4. Verify API quota hasn't been exceeded
|
|
|
|
## Learning System Issues
|
|
|
|
### No Suggestions Generated
|
|
|
|
**Symptom**: Running `--review-learned` shows no suggestions after multiple corrections.
|
|
|
|
**Requirements**:
|
|
- Minimum 3 correction runs with consistent patterns
|
|
- Learning frequency threshold ≥3 (default)
|
|
- Learning confidence threshold ≥0.8 (default)
|
|
|
|
**Diagnostic steps**:
|
|
|
|
```bash
|
|
# Check correction history count
|
|
sqlite3 ~/.transcript-fixer/corrections.db "SELECT COUNT(*) FROM correction_history;"
|
|
|
|
# If 0, no corrections have been run yet
|
|
# If >0 but <3, run more corrections
|
|
|
|
# Check suggestions table
|
|
sqlite3 ~/.transcript-fixer/corrections.db "SELECT * FROM learned_suggestions;"
|
|
|
|
# Check system configuration
|
|
sqlite3 ~/.transcript-fixer/corrections.db "SELECT key, value FROM system_config WHERE key LIKE 'learning%';"
|
|
```
|
|
|
|
**Solutions**:
|
|
1. Run at least 3 correction sessions
|
|
2. Ensure patterns repeat (same error → same correction)
|
|
3. Verify database permissions (should be readable/writable)
|
|
4. Check `correction_history` table has entries
|
|
|
|
## Database Issues
|
|
|
|
### Database Not Found
|
|
|
|
**Symptom**:
|
|
```
|
|
⚠️ Database not found: ~/.transcript-fixer/corrections.db
|
|
```
|
|
|
|
**Solution**:
|
|
```bash
|
|
uv run scripts/fix_transcription.py --init
|
|
```
|
|
|
|
This creates the database with the complete schema.
|
|
|
|
### Database Locked
|
|
|
|
**Symptom**:
|
|
```
|
|
Error: database is locked
|
|
```
|
|
|
|
**Causes**:
|
|
- Another process is accessing the database
|
|
- Unfinished transaction from crashed process
|
|
- File permissions issue
|
|
|
|
**Solutions**:
|
|
|
|
```bash
|
|
# Check for processes using the database
|
|
lsof ~/.transcript-fixer/corrections.db
|
|
|
|
# If processes found, kill them or wait for completion
|
|
|
|
# If database is corrupted, backup and recreate
|
|
cp ~/.transcript-fixer/corrections.db ~/.transcript-fixer/corrections_backup.db
|
|
sqlite3 ~/.transcript-fixer/corrections.db "VACUUM;"
|
|
```
|
|
|
|
### Corrupted Database
|
|
|
|
**Symptom**: SQLite errors, integrity check failures
|
|
|
|
**Solutions**:
|
|
|
|
```bash
|
|
# Check integrity
|
|
sqlite3 ~/.transcript-fixer/corrections.db "PRAGMA integrity_check;"
|
|
|
|
# If corrupted, attempt recovery
|
|
sqlite3 ~/.transcript-fixer/corrections.db ".recover" | sqlite3 ~/.transcript-fixer/corrections_new.db
|
|
|
|
# Replace database with recovered version
|
|
mv ~/.transcript-fixer/corrections.db ~/.transcript-fixer/corrections_corrupted.db
|
|
mv ~/.transcript-fixer/corrections_new.db ~/.transcript-fixer/corrections.db
|
|
```
|
|
|
|
### Missing Tables
|
|
|
|
**Symptom**:
|
|
```
|
|
❌ Database missing tables: ['corrections', ...]
|
|
```
|
|
|
|
**Solution**: Reinitialize schema (safe, uses IF NOT EXISTS):
|
|
|
|
```bash
|
|
python -c "from core import CorrectionRepository; from pathlib import Path; CorrectionRepository(Path.home() / '.transcript-fixer' / 'corrections.db')"
|
|
```
|
|
|
|
Or delete database and reinitialize:
|
|
|
|
```bash
|
|
# Backup first
|
|
cp ~/.transcript-fixer/corrections.db ~/corrections_backup_$(date +%Y%m%d).db
|
|
|
|
# Reinitialize
|
|
uv run scripts/fix_transcription.py --init
|
|
```
|
|
|
|
## Common Pitfalls
|
|
|
|
### 1. Stage Order Confusion
|
|
|
|
**Problem**: Running Stage 2 without Stage 1 output.
|
|
|
|
**Solution**: Use `--stage 3` for full pipeline, or run stages sequentially:
|
|
|
|
```bash
|
|
# Wrong: Stage 2 on raw file
|
|
uv run scripts/fix_transcription.py --input file.md --stage 2 # ❌
|
|
|
|
# Correct: Full pipeline
|
|
uv run scripts/fix_transcription.py --input file.md --stage 3 # ✅
|
|
|
|
# Or sequential stages
|
|
uv run scripts/fix_transcription.py --input file.md --stage 1
|
|
uv run scripts/fix_transcription.py --input file_stage1.md --stage 2
|
|
```
|
|
|
|
### 2. Overwriting Imports
|
|
|
|
**Problem**: Using `--import` without `--merge` overwrites existing corrections.
|
|
|
|
**Solution**: Always use `--merge` flag:
|
|
|
|
```bash
|
|
# Wrong: Overwrites existing
|
|
uv run scripts/fix_transcription.py --import team.json # ❌
|
|
|
|
# Correct: Merges with existing
|
|
uv run scripts/fix_transcription.py --import team.json --merge # ✅
|
|
```
|
|
|
|
### 3. Ignoring Learned Suggestions
|
|
|
|
**Problem**: Not reviewing learned patterns, missing free optimizations.
|
|
|
|
**Impact**: Patterns detected by AI remain expensive (Stage 2) instead of cheap (Stage 1).
|
|
|
|
**Solution**: Review suggestions every 3-5 runs:
|
|
|
|
```bash
|
|
uv run scripts/fix_transcription.py --review-learned
|
|
uv run scripts/fix_transcription.py --approve "错误" "正确"
|
|
```
|
|
|
|
### 4. Testing on Large Files
|
|
|
|
**Problem**: Testing dictionary changes on large files wastes API quota.
|
|
|
|
**Solution**: Start with `--stage 1` on small files (100-500 lines):
|
|
|
|
```bash
|
|
# Test dictionary changes first
|
|
uv run scripts/fix_transcription.py --input small_sample.md --stage 1
|
|
|
|
# Review output, adjust corrections
|
|
# Then run full pipeline
|
|
uv run scripts/fix_transcription.py --input large_file.md --stage 3
|
|
```
|
|
|
|
### 5. Manual Database Edits Without Validation
|
|
|
|
**Problem**: Direct SQL edits might violate schema constraints.
|
|
|
|
**Solution**: Always validate after manual changes:
|
|
|
|
```bash
|
|
sqlite3 ~/.transcript-fixer/corrections.db
|
|
# ... make changes ...
|
|
.quit
|
|
|
|
# Validate
|
|
uv run scripts/fix_transcription.py --validate
|
|
```
|
|
|
|
### 6. Committing .db Files to Git
|
|
|
|
**Problem**: Binary database files in Git cause merge conflicts and bloat repository.
|
|
|
|
**Solution**: Use JSON exports for version control:
|
|
|
|
```bash
|
|
# .gitignore
|
|
*.db
|
|
*.db-journal
|
|
*.bak
|
|
|
|
# Export for version control instead
|
|
uv run scripts/fix_transcription.py --export corrections_$(date +%Y%m%d).json
|
|
git add corrections_*.json
|
|
```
|
|
|
|
## Validation Commands
|
|
|
|
### Quick Health Check
|
|
|
|
```bash
|
|
uv run scripts/fix_transcription.py --validate
|
|
```
|
|
|
|
### Detailed Diagnostics
|
|
|
|
```bash
|
|
# Check database integrity
|
|
sqlite3 ~/.transcript-fixer/corrections.db "PRAGMA integrity_check;"
|
|
|
|
# Check table counts
|
|
sqlite3 ~/.transcript-fixer/corrections.db "
|
|
SELECT 'corrections' as table_name, COUNT(*) as count FROM corrections
|
|
UNION ALL
|
|
SELECT 'context_rules', COUNT(*) FROM context_rules
|
|
UNION ALL
|
|
SELECT 'learned_suggestions', COUNT(*) FROM learned_suggestions
|
|
UNION ALL
|
|
SELECT 'correction_history', COUNT(*) FROM correction_history;
|
|
"
|
|
|
|
# Check configuration
|
|
sqlite3 ~/.transcript-fixer/corrections.db "SELECT * FROM system_config;"
|
|
```
|
|
|
|
## Getting Help
|
|
|
|
If issues persist:
|
|
|
|
1. Run `--validate` to collect diagnostic information
|
|
2. Check `correction_history` and `audit_log` tables for errors
|
|
3. Review `references/file_formats.md` for schema details
|
|
4. Check `references/architecture.md` for component details
|
|
5. Verify Python and uv versions are up to date
|
|
|
|
For database corruption, automatic backups are created before migrations. Check for `.bak` files in `~/.transcript-fixer/`.
|