# Troubleshooting Guide

Solutions to common issues and error conditions.

## Table of Contents

- [API Authentication Errors](#api-authentication-errors)
  - [GLM_API_KEY Not Set](#glm_api_key-not-set)
  - [Invalid API Key](#invalid-api-key)
- [Learning System Issues](#learning-system-issues)
  - [No Suggestions Generated](#no-suggestions-generated)
- [Database Issues](#database-issues)
  - [Database Not Found](#database-not-found)
  - [Database Locked](#database-locked)
  - [Corrupted Database](#corrupted-database)
  - [Missing Tables](#missing-tables)
- [Common Pitfalls](#common-pitfalls)
  - [1. Stage Order Confusion](#1-stage-order-confusion)
  - [2. Overwriting Imports](#2-overwriting-imports)
  - [3. Ignoring Learned Suggestions](#3-ignoring-learned-suggestions)
  - [4. Testing on Large Files](#4-testing-on-large-files)
  - [5. Manual Database Edits Without Validation](#5-manual-database-edits-without-validation)
  - [6. Committing .db Files to Git](#6-committing-db-files-to-git)
- [Validation Commands](#validation-commands)
  - [Quick Health Check](#quick-health-check)
  - [Detailed Diagnostics](#detailed-diagnostics)
- [Getting Help](#getting-help)

## API Authentication Errors

### GLM_API_KEY Not Set

**Symptom**:
```
❌ Error: GLM_API_KEY environment variable not set
Set it with: export GLM_API_KEY='your-key'
```

**Solution**:
```bash
# Check if key is set
echo $GLM_API_KEY

# If empty, export key
export GLM_API_KEY="your-api-key-here"

# Verify
uv run scripts/fix_transcription.py --validate
```

**Persistence**: Add to shell profile (`.bashrc` or `.zshrc`) for permanent access.

See `glm_api_setup.md` for detailed API key management.

### Invalid API Key

**Symptom**: API calls fail with 401/403 errors

**Solutions**:
1. Verify key is correct (copy from https://open.bigmodel.cn/)
2. Check for extra spaces or quotes in the key
3. Regenerate key if compromised
4. Verify API quota hasn't been exceeded

## Learning System Issues

### No Suggestions Generated

**Symptom**: Running `--review-learned` shows no suggestions after multiple corrections.

**Requirements**:
- Minimum 3 correction runs with consistent patterns
- Learning frequency threshold ≥3 (default)
- Learning confidence threshold ≥0.8 (default)

**Diagnostic steps**:
```bash
# Check correction history count
sqlite3 ~/.transcript-fixer/corrections.db "SELECT COUNT(*) FROM correction_history;"

# If 0, no corrections have been run yet
# If >0 but <3, run more corrections

# Check suggestions table
sqlite3 ~/.transcript-fixer/corrections.db "SELECT * FROM learned_suggestions;"

# Check system configuration
sqlite3 ~/.transcript-fixer/corrections.db "SELECT key, value FROM system_config WHERE key LIKE 'learning%';"
```

**Solutions**:
1. Run at least 3 correction sessions
2. Ensure patterns repeat (same error → same correction)
3. Verify database permissions (should be readable/writable)
4. Check `correction_history` table has entries

## Database Issues

### Database Not Found

**Symptom**:
```
⚠️ Database not found: ~/.transcript-fixer/corrections.db
```

**Solution**:
```bash
uv run scripts/fix_transcription.py --init
```

This creates the database with the complete schema.

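Before reinitializing, it can help to confirm whether the file is actually absent or merely incomplete. The following is a minimal sketch, not part of the tool: `missing_tables` and `EXPECTED_TABLES` are hypothetical names, and the table list is assumed from the diagnostic queries in this guide rather than taken from the real schema. It opens the database read-only so the check never creates an empty file as a side effect.

```python
import sqlite3
from pathlib import Path
from typing import Optional, Set

# Assumed table names, taken from the queries in this guide; the real schema may differ.
EXPECTED_TABLES = {
    "corrections",
    "context_rules",
    "learned_suggestions",
    "correction_history",
    "system_config",
}


def missing_tables(db_path: Path) -> Optional[Set[str]]:
    """Return the expected tables absent from db_path, or None if the file does not exist."""
    if not db_path.exists():
        return None
    # mode=ro opens the file read-only instead of creating or modifying it.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return EXPECTED_TABLES - {name for (name,) in rows}


if __name__ == "__main__":
    db = Path.home() / ".transcript-fixer" / "corrections.db"
    result = missing_tables(db)
    if result is None:
        print(f"Database not found: {db}")
    elif result:
        print(f"Missing tables: {sorted(result)}")
    else:
        print("All expected tables are present")
```

A `None` result corresponds to the "Database not found" case above, while a non-empty set points to the Missing Tables section instead.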
### Database Locked

**Symptom**:
```
Error: database is locked
```

**Causes**:
- Another process is accessing the database
- Unfinished transaction from a crashed process
- File permissions issue

**Solutions**:
```bash
# Check for processes using the database
lsof ~/.transcript-fixer/corrections.db

# If processes are found, kill them or wait for completion

# If the database may be corrupted, back it up, then rebuild the file with VACUUM
cp ~/.transcript-fixer/corrections.db ~/.transcript-fixer/corrections_backup.db
sqlite3 ~/.transcript-fixer/corrections.db "VACUUM;"
```

### Corrupted Database

**Symptom**: SQLite errors, integrity check failures

**Solutions**:
```bash
# Check integrity
sqlite3 ~/.transcript-fixer/corrections.db "PRAGMA integrity_check;"

# If corrupted, attempt recovery
sqlite3 ~/.transcript-fixer/corrections.db ".recover" | sqlite3 ~/.transcript-fixer/corrections_new.db

# Replace database with recovered version
mv ~/.transcript-fixer/corrections.db ~/.transcript-fixer/corrections_corrupted.db
mv ~/.transcript-fixer/corrections_new.db ~/.transcript-fixer/corrections.db
```

### Missing Tables

**Symptom**:
```
❌ Database missing tables: ['corrections', ...]
```

**Solution**:

Reinitialize the schema (safe, since it uses `IF NOT EXISTS`):

```bash
python -c "from core import CorrectionRepository; from pathlib import Path; CorrectionRepository(Path.home() / '.transcript-fixer' / 'corrections.db')"
```

Or delete the database and reinitialize:

```bash
# Backup first
cp ~/.transcript-fixer/corrections.db ~/corrections_backup_$(date +%Y%m%d).db

# Reinitialize
uv run scripts/fix_transcription.py --init
```

## Common Pitfalls

### 1. Stage Order Confusion

**Problem**: Running Stage 2 without Stage 1 output.

**Solution**: Use `--stage 3` for the full pipeline, or run stages sequentially:

```bash
# Wrong: Stage 2 on raw file
uv run scripts/fix_transcription.py --input file.md --stage 2  # ❌

# Correct: Full pipeline
uv run scripts/fix_transcription.py --input file.md --stage 3  # ✅

# Or sequential stages
uv run scripts/fix_transcription.py --input file.md --stage 1
uv run scripts/fix_transcription.py --input file_stage1.md --stage 2
```

### 2. Overwriting Imports

**Problem**: Using `--import` without `--merge` overwrites existing corrections.

**Solution**: Always use the `--merge` flag:

```bash
# Wrong: Overwrites existing
uv run scripts/fix_transcription.py --import team.json  # ❌

# Correct: Merges with existing
uv run scripts/fix_transcription.py --import team.json --merge  # ✅
```

### 3. Ignoring Learned Suggestions

**Problem**: Not reviewing learned patterns, missing free optimizations.

**Impact**: Patterns detected by AI remain expensive (Stage 2) instead of cheap (Stage 1).

**Solution**: Review suggestions every 3-5 runs:

```bash
uv run scripts/fix_transcription.py --review-learned
uv run scripts/fix_transcription.py --approve "错误" "正确"
```

### 4. Testing on Large Files

**Problem**: Testing dictionary changes on large files wastes API quota.

**Solution**: Start with `--stage 1` on small files (100-500 lines):

```bash
# Test dictionary changes first
uv run scripts/fix_transcription.py --input small_sample.md --stage 1

# Review output, adjust corrections

# Then run full pipeline
uv run scripts/fix_transcription.py --input large_file.md --stage 3
```

### 5. Manual Database Edits Without Validation

**Problem**: Direct SQL edits might violate schema constraints.

**Solution**: Always validate after manual changes:

```bash
sqlite3 ~/.transcript-fixer/corrections.db
# ... make changes ...
.quit

# Validate
uv run scripts/fix_transcription.py --validate
```

### 6. Committing .db Files to Git

**Problem**: Binary database files in Git cause merge conflicts and bloat the repository.

**Solution**: Use JSON exports for version control:

```bash
# .gitignore
*.db
*.db-journal
*.bak

# Export for version control instead
uv run scripts/fix_transcription.py --export corrections_$(date +%Y%m%d).json
git add corrections_*.json
```

## Validation Commands

### Quick Health Check

```bash
uv run scripts/fix_transcription.py --validate
```

### Detailed Diagnostics

```bash
# Check database integrity
sqlite3 ~/.transcript-fixer/corrections.db "PRAGMA integrity_check;"

# Check table counts
sqlite3 ~/.transcript-fixer/corrections.db "
SELECT 'corrections' as table_name, COUNT(*) as count FROM corrections
UNION ALL
SELECT 'context_rules', COUNT(*) FROM context_rules
UNION ALL
SELECT 'learned_suggestions', COUNT(*) FROM learned_suggestions
UNION ALL
SELECT 'correction_history', COUNT(*) FROM correction_history;
"

# Check configuration
sqlite3 ~/.transcript-fixer/corrections.db "SELECT * FROM system_config;"
```

## Getting Help

If issues persist:
1. Run `--validate` to collect diagnostic information
2. Check `correction_history` and `audit_log` tables for errors
3. Review `references/file_formats.md` for schema details
4. Check `references/architecture.md` for component details
5. Verify Python and uv versions are up to date

For database corruption, automatic backups are created before migrations. Check for `.bak` files in `~/.transcript-fixer/`.
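
To find the most recent backup quickly, something like the following could be used. This is a sketch under assumptions: `list_backups` is a hypothetical helper, not part of the tool, and it assumes backups sit directly in the directory with a `.bak` suffix, as the note above suggests.

```python
from pathlib import Path
from typing import List


def list_backups(directory: Path) -> List[Path]:
    """Return the .bak files directly inside directory, newest first."""
    if not directory.is_dir():
        return []
    backups = [p for p in directory.glob("*.bak") if p.is_file()]
    # Sort by modification time so the most recent backup comes first.
    return sorted(backups, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for backup in list_backups(Path.home() / ".transcript-fixer"):
        print(backup)
```

Copy the current database aside before restoring from any backup, so the restore can be undone.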