Add word-level diff generator script

- Add scripts/generate_word_diff.py for generating word-by-word comparison HTML
- Shows complete word replacements (e.g., 'japanese 3 pro' → 'Gemini 3 Pro')
- More readable than character-level or line-level diffs
- Update SKILL.md with usage instructions and script documentation
2025-12-21 12:58:32 +08:00
parent dbcb53a376
commit 4a36e89195
2 changed files with 325 additions and 0 deletions
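
For readers unfamiliar with word-level diffing, the general technique can be sketched as follows. This is a minimal illustration built on Python's standard `difflib`, not the contents of `scripts/generate_word_diff.py`. The function names `tokenize` and `word_diff_html` and the choice of `<del>`/`<ins>` markup are assumptions made for this example.

```python
import difflib
import html
import re


def tokenize(text: str) -> list[str]:
    # Split into alternating runs of whitespace and non-whitespace so that
    # joining the tokens reproduces the input exactly.
    return re.findall(r"\s+|\S+", text)


def word_diff_html(original: str, corrected: str) -> str:
    # Hypothetical helper: compare token lists instead of characters or lines.
    a, b = tokenize(original), tokenize(corrected)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old = html.escape("".join(a[i1:i2]))
        new = html.escape("".join(b[j1:j2]))
        if tag == "equal":
            parts.append(old)
        if tag in ("delete", "replace"):
            parts.append(f"<del>{old}</del>")  # removed words
        if tag in ("insert", "replace"):
            parts.append(f"<ins>{new}</ins>")  # added words
    return "".join(parts)


if __name__ == "__main__":
    print(word_diff_html("use japanese 3 pro today", "use Gemini 3 Pro today"))
```

Because whole tokens are compared, a changed word is reported as one deletion plus one insertion rather than as scattered character edits. This sketch marks each changed word separately; grouping neighbouring changes into a single phrase-level replacement, as in the example in the commit message, would need an extra merging step that is not shown here.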
--- a/transcript-fixer/SKILL.md
+++ b/transcript-fixer/SKILL.md
@@ -65,6 +65,15 @@ uv run scripts/fix_transcription.py --review-learned
 - `*_stage2.md` - AI corrections applied (final version)
 - `*_对比.html` - Visual diff (open in browser for best experience)

+**Generate word-level diff** (recommended for reviewing corrections):
+```bash
+uv run scripts/generate_word_diff.py original.md corrected.md output.html
+```
+
+This creates an HTML file showing word-by-word differences with clear highlighting:
+- 🔴 `japanese 3 pro` → 🟢 `Gemini 3 Pro` (complete word replacements)
+- Easy to spot exactly what changed without character-level noise
+
 ## Example Session

 **Input transcript** (`meeting.md`):
@@ -153,6 +162,7 @@ sqlite3 ~/.transcript-fixer/corrections.db "SELECT value FROM system_config WHER
 - `ensure_deps.py` - Initialize shared virtual environment (run once, optional)
 - `fix_transcript_enhanced.py` - Enhanced wrapper (recommended for interactive use)
 - `fix_transcription.py` - Core CLI (for automation)
+- `generate_word_diff.py` - Generate word-level diff HTML for reviewing corrections
 - `examples/bulk_import.py` - Bulk import example

 **References** (load as needed):