Team Collaboration Guide
This guide explains how to share correction knowledge across teams using export/import and Git workflows.
Table of Contents
- Export/Import Workflow
- Git-Based Collaboration
- Selective Domain Sharing
- Git Branching Strategy
- Automated Sync (Advanced)
- Backup and Recovery
- Team Best Practices
- Integration with CI/CD
- Troubleshooting
- Security Considerations
- Further Reading
Export/Import Workflow
Export Corrections
Share your corrections with team members:
# Export specific domain
python scripts/fix_transcription.py --export team_corrections.json --domain embodied_ai
# Export general corrections
python scripts/fix_transcription.py --export team_corrections.json
Output: Creates a standalone JSON file with your corrections.
Import from Teammate
Two modes: merge (combine) or replace (overwrite):
# Merge (recommended) - combines with existing corrections
python scripts/fix_transcription.py --import team_corrections.json --merge
# Replace - overwrites existing corrections (dangerous!)
python scripts/fix_transcription.py --import team_corrections.json
Merge behavior:
- Adds new corrections
- Updates existing corrections with imported values
- Preserves corrections not in import file
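The merge rules above amount to a dictionary update in which the imported file wins on overlapping entries. A minimal sketch, assuming corrections are stored as a flat mapping from the wrong phrase to the right one (the real file layout may differ, and `merge_corrections` is a hypothetical helper, not a function of fix_transcription.py):

```python
def merge_corrections(existing: dict[str, str], imported: dict[str, str]) -> dict[str, str]:
    """Return a new mapping: imported entries are added or override existing ones."""
    merged = dict(existing)   # preserves corrections not in the import file
    merged.update(imported)   # adds new entries, updates overlapping ones
    return merged


existing = {"巨升": "具身", "错误": "旧值"}
imported = {"错误": "正确", "奇迹创坛": "奇绩创坛"}
merged = merge_corrections(existing, imported)
```

Replace mode, by contrast, would correspond to discarding `existing` and keeping only `imported`, which is why merging is the safer default.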
Team Workflow Example
Person A (Domain Expert):
# Build correction dictionary
python fix_transcription.py --add "巨升" "具身" --domain embodied_ai
python fix_transcription.py --add "奇迹创坛" "奇绩创坛" --domain embodied_ai
# ... add 50 more corrections ...
# Export for team
python fix_transcription.py --export ai_corrections.json --domain embodied_ai
# Send ai_corrections.json to team via Slack/email
Person B (Team Member):
# Receive ai_corrections.json
# Import and merge with existing corrections
python fix_transcription.py --import ai_corrections.json --merge
# Now Person B has all 50+ corrections!
Git-Based Collaboration
For teams using Git, version control the entire correction database.
Initial Setup
Person A (First User):
cd ~/.transcript-fixer
git init
git add corrections.json context_rules.json config.json
git add domains/
git commit -m "Initial correction database"
# Push to shared repo (rename the branch first, in case your Git still creates "master" by default)
git branch -M main
git remote add origin git@github.com:org/transcript-corrections.git
git push -u origin main
Team Members Clone
Person B, C, D (Team Members):
# Clone shared corrections (git clone fails if ~/.transcript-fixer already exists and is not empty, so move it aside first)
git clone git@github.com:org/transcript-corrections.git ~/.transcript-fixer
# Now everyone has the same corrections!
Ongoing Sync
Daily workflow:
# Morning: Pull team updates
cd ~/.transcript-fixer
git pull origin main
# During day: Add corrections
python fix_transcription.py --add "错误" "正确"
# Evening: Push your additions
cd ~/.transcript-fixer
git add corrections.json
git commit -m "Added 5 new embodied AI corrections"
git push origin main
Handling Conflicts
When two people add different corrections to the same file:
cd ~/.transcript-fixer
git pull origin main
# If conflict occurs:
# CONFLICT in corrections.json
# Option 1: Manual merge (recommended)
nano corrections.json # Edit to combine both changes
git add corrections.json
git commit -m "Merged corrections from teammate"
git push
# Option 2: Keep yours (with a merge-style pull; under git pull --rebase, --ours and --theirs are swapped)
git checkout --ours corrections.json
git add corrections.json
git commit -m "Kept local corrections"
git push
# Option 3: Keep theirs
git checkout --theirs corrections.json
git add corrections.json
git commit -m "Used teammate's corrections"
git push
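Best Practice: JSON merge conflicts are usually easy to resolve. Keep the correction entries from both versions, delete the conflict markers (<<<<<<<, =======, >>>>>>>), and check that the result is still valid JSON before committing.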
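Before combining entries by hand, it can help to list which ones actually disagree. A sketch, assuming both versions of corrections.json have been loaded as flat wrong-to-right mappings (a hypothetical layout; `combine_versions` is an illustrative helper, not part of the tool):

```python
def combine_versions(
    ours: dict[str, str], theirs: dict[str, str]
) -> tuple[dict[str, str], dict[str, tuple[str, str]]]:
    """Union two correction mappings and report entries whose values disagree."""
    combined = {**theirs, **ours}  # on disagreement, provisionally keep ours
    conflicts = {
        key: (ours[key], theirs[key])
        for key in ours.keys() & theirs.keys()
        if ours[key] != theirs[key]
    }
    return combined, conflicts


ours = {"巨升": "具身", "错误": "正确A"}
theirs = {"奇迹创坛": "奇绩创坛", "错误": "正确B"}
combined, conflicts = combine_versions(ours, theirs)
```

Entries listed in `conflicts` are the only ones that need a human decision; everything else can be taken from either side unchanged.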
Selective Domain Sharing
Share only specific domains with different teams:
Finance Team
# Finance team exports their domain
python fix_transcription.py --export finance_corrections.json --domain finance
# Share finance_corrections.json with finance team only
AI Team
# AI team exports their domain
python fix_transcription.py --export ai_corrections.json --domain embodied_ai
# Share ai_corrections.json with AI team only
Importing Multiple Domains
# Alice works on both finance and AI
python fix_transcription.py --import finance_corrections.json --merge
python fix_transcription.py --import ai_corrections.json --merge
Git Branching Strategy
For larger teams, use branches for different domains or workflows:
Feature Branches
# Create branch for major dictionary additions
git checkout -b add-medical-terms
python fix_transcription.py --add "医疗术语" "正确术语" --domain medical
# ... add 100 medical corrections ...
git add domains/medical.json
git commit -m "Added 100 medical terminology corrections"
git push origin add-medical-terms
# Create PR for review
# After approval, merge to main
Domain Branches (Alternative)
# Separate branches per domain, each created from main
git checkout -b domain/embodied-ai main
# Work on AI corrections
git push origin domain/embodied-ai
git checkout -b domain/finance main
# Work on finance corrections
git push origin domain/finance
Automated Sync (Advanced)
Set up automatic Git sync using cron/Task Scheduler:
macOS/Linux Cron
# Edit crontab
crontab -e
# Sync every day at 9 AM and 6 PM (this pushes only changes that are already committed)
0 9,18 * * * cd ~/.transcript-fixer && git pull origin main && git push origin main
Windows Task Scheduler
# Create scheduled task
$action = New-ScheduledTaskAction -Execute "git" -Argument "pull origin main" -WorkingDirectory "$env:USERPROFILE\.transcript-fixer"
$trigger = New-ScheduledTaskTrigger -Daily -At 9am
Register-ScheduledTask -Action $action -Trigger $trigger -TaskName "SyncTranscriptCorrections"
Backup and Recovery
Backup Strategy
# Weekly backup to cloud
cd ~/.transcript-fixer
tar -czf transcript-corrections-$(date +%Y%m%d).tar.gz corrections.json context_rules.json domains/
# Upload to Dropbox/Google Drive/S3
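On machines where the tar and date commands are not available, the same dated archive can be produced from Python. This is a sketch only: `backup_corrections` is a hypothetical helper, and the file names are the ones used in the command above.

```python
import tarfile
from datetime import date
from pathlib import Path


def backup_corrections(root: Path, dest_dir: Path) -> Path:
    """Archive the correction files under `root` into a dated .tar.gz in `dest_dir`."""
    archive = dest_dir / f"transcript-corrections-{date.today():%Y%m%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for name in ("corrections.json", "context_rules.json", "domains"):
            path = root / name
            if path.exists():  # skip files this installation does not have
                tar.add(path, arcname=name)
    return archive
```

For example, `backup_corrections(Path.home() / ".transcript-fixer", Path.home() / "backups")` writes the archive into an existing `backups` directory and returns its path.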
Recovery from Backup
# Extract backup
tar -xzf transcript-corrections-20250127.tar.gz -C ~/.transcript-fixer/
Recovery from Git
# View history
cd ~/.transcript-fixer
git log corrections.json
# Restore from 3 commits ago
git checkout HEAD~3 corrections.json
# Or restore specific version
git checkout abc123def corrections.json
Team Best Practices
- Pull Before Push: Always git pull before starting work
- Commit Often: Small, frequent commits better than large infrequent ones
- Descriptive Messages: "Added 5 finance terms" better than "updates"
- Review Process: Use PRs for major dictionary changes (100+ corrections)
- Domain Ownership: Assign domain experts as reviewers
- Weekly Sync: Schedule team sync meetings to review learned suggestions
- Backup Policy: Weekly backups of entire ~/.transcript-fixer/
Integration with CI/CD
For enterprise teams, integrate validation into CI:
GitHub Actions Example
# .github/workflows/validate-corrections.yml
name: Validate Corrections
on:
  pull_request:
    paths:
      - 'corrections.json'
      - 'domains/*.json'
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Validate JSON
        run: |
          python -m json.tool corrections.json > /dev/null
          for file in domains/*.json; do
            python -m json.tool "$file" > /dev/null
          done
      - name: Check for duplicates
        run: |
          python scripts/check_duplicates.py corrections.json
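The workflow calls scripts/check_duplicates.py, whose contents are not shown in this guide. One thing such a check might do is catch repeated keys, which `json.tool` accepts silently (the last value wins). A sketch under that assumption, with `find_duplicate_keys` as a hypothetical name:

```python
import json
from collections import Counter


def find_duplicate_keys(text: str) -> list[str]:
    """Return keys that appear more than once within any JSON object in `text`."""
    duplicates: list[str] = []

    def record_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in file order.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=record_pairs)
    return duplicates


sample = '{"巨升": "具身", "错误": "正确", "巨升": "巨声"}'
dupes = find_duplicate_keys(sample)
```

A CI script built on this idea could exit with a non-zero status whenever the returned list is non-empty, so that the pull request fails before a duplicate entry silently overrides an earlier one.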
Troubleshooting
Import Failed
# Check JSON validity
python -m json.tool team_corrections.json
# If invalid, fix JSON syntax errors
nano team_corrections.json
Git Sync Failed
# Check remote connection
git remote -v
# Fix the remote URL if needed
git remote set-url origin git@github.com:org/transcript-corrections.git
# Verify SSH keys
ssh -T git@github.com
Merge Conflicts Too Complex
# Nuclear option: Keep one version
git checkout --ours corrections.json # Keep yours
# OR
git checkout --theirs corrections.json # Keep theirs
# Complete the merge
git add corrections.json
git commit -m "Resolved conflict by keeping one version"
# Then re-import the other version
python fix_transcription.py --import other_version.json --merge
Security Considerations
- Private Repos: Use private Git repositories for company-specific corrections
- Access Control: Limit who can push to main branch
- Secret Scanning: Never commit API keys (already handled by security_scan.py)
- Audit Trail: Git history provides full audit trail of who changed what
- Backup Encryption: Encrypt backups if containing sensitive terminology
Further Reading
- Git workflows: https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows
- JSON validation: https://jsonlint.com/
- Team Git practices: https://github.com/git-guides