Files
daymade bd0aa12004 Release v1.8.0: Add transcript-fixer skill
## New Skill: transcript-fixer v1.0.0

Correct speech-to-text (ASR/STT) transcription errors through dictionary-based rules and AI-powered corrections with automatic pattern learning.

**Features:**
- Two-stage correction pipeline (dictionary + AI)
- Automatic pattern detection and learning
- Domain-specific dictionaries (general, embodied_ai, finance, medical)
- SQLite-based correction repository
- Team collaboration with import/export
- GLM API integration for AI corrections
- Cost optimization through dictionary promotion

**Use cases:**
- Correcting meeting notes, lecture recordings, or interview transcripts
- Fixing Chinese/English homophone errors and technical terminology
- Building domain-specific correction dictionaries
- Improving transcript accuracy through iterative learning

**Documentation:**
- Complete workflow guides in references/
- SQL query templates
- Troubleshooting guide
- Team collaboration patterns
- API setup instructions

**Marketplace updates:**
- Updated marketplace to v1.8.0
- Added transcript-fixer plugin (category: productivity)
- Updated README.md with skill description and use cases
- Updated CLAUDE.md with skill listing and counts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 13:16:37 +08:00

9.9 KiB

Team Collaboration Guide

This guide explains how to share correction knowledge across teams using export/import and Git workflows.

Table of Contents

Export/Import Workflow

Export Corrections

Share your corrections with team members:

# Export specific domain
python scripts/fix_transcription.py --export team_corrections.json --domain embodied_ai

# Export general corrections
python scripts/fix_transcription.py --export team_corrections.json

Output: Creates a standalone JSON file with your corrections.

Import from Teammate

Two modes: merge (combine) or replace (overwrite):

# Merge (recommended) - combines with existing corrections
python scripts/fix_transcription.py --import team_corrections.json --merge

# Replace - overwrites existing corrections (dangerous!)
python scripts/fix_transcription.py --import team_corrections.json

Merge behavior:

  • Adds new corrections
  • Updates existing corrections with imported values
  • Preserves corrections not in import file

Team Workflow Example

Person A (Domain Expert):

# Build correction dictionary
python fix_transcription.py --add "巨升" "具身" --domain embodied_ai
python fix_transcription.py --add "奇迹创坛" "奇绩创坛" --domain embodied_ai
# ... add 50 more corrections ...

# Export for team
python fix_transcription.py --export ai_corrections.json --domain embodied_ai
# Send ai_corrections.json to team via Slack/email

Person B (Team Member):

# Receive ai_corrections.json
# Import and merge with existing corrections
python fix_transcription.py --import ai_corrections.json --merge

# Now Person B has all 50+ corrections!

Git-Based Collaboration

For teams using Git, version control the entire correction database.

Initial Setup

Person A (First User):

cd ~/.transcript-fixer
git init
git add corrections.json context_rules.json config.json
git add domains/
git commit -m "Initial correction database"

# Push to shared repo
git remote add origin git@github.com:org/transcript-corrections.git
git push -u origin main

Team Members Clone

Person B, C, D (Team Members):

# Clone shared corrections
git clone git@github.com:org/transcript-corrections.git ~/.transcript-fixer

# Now everyone has the same corrections!

Ongoing Sync

Daily workflow:

# Morning: Pull team updates
cd ~/.transcript-fixer
git pull origin main

# During day: Add corrections
python fix_transcription.py --add "错误" "正确"

# Evening: Push your additions
cd ~/.transcript-fixer
git add corrections.json
git commit -m "Added 5 new embodied AI corrections"
git push origin main

Handling Conflicts

When two people add different corrections to same file:

cd ~/.transcript-fixer
git pull origin main

# If conflict occurs:
# CONFLICT in corrections.json

# Option 1: Manual merge (recommended)
nano corrections.json  # Edit to combine both changes
git add corrections.json
git commit -m "Merged corrections from teammate"
git push

# Option 2: Keep yours
git checkout --ours corrections.json
git add corrections.json
git commit -m "Kept local corrections"
git push

# Option 3: Keep theirs
git checkout --theirs corrections.json
git add corrections.json
git commit -m "Used teammate's corrections"
git push

Best Practice: JSON merge conflicts are usually easy - just combine the correction entries from both versions.

Selective Domain Sharing

Share only specific domains with different teams:

Finance Team

# Finance team exports their domain
python fix_transcription.py --export finance_corrections.json --domain finance

# Share finance_corrections.json with finance team only

AI Team

# AI team exports their domain
python fix_transcription.py --export ai_corrections.json --domain embodied_ai

# Share ai_corrections.json with AI team only

Individual imports specific domains

# Alice works on both finance and AI
python fix_transcription.py --import finance_corrections.json --merge
python fix_transcription.py --import ai_corrections.json --merge

Git Branching Strategy

For larger teams, use branches for different domains or workflows:

Feature Branches

# Create branch for major dictionary additions
git checkout -b add-medical-terms
python fix_transcription.py --add "医疗术语" "正确术语" --domain medical
# ... add 100 medical corrections ...
git add domains/medical.json
git commit -m "Added 100 medical terminology corrections"
git push origin add-medical-terms

# Create PR for review
# After approval, merge to main

Domain Branches (Alternative)

# Separate branches per domain
git checkout -b domain/embodied-ai
# Work on AI corrections
git push origin domain/embodied-ai

git checkout -b domain/finance
# Work on finance corrections
git push origin domain/finance

Automated Sync (Advanced)

Set up automatic Git sync using cron/Task Scheduler:

macOS/Linux Cron

# Edit crontab
crontab -e

# Add daily sync at 9 AM and 6 PM
0 9,18 * * * cd ~/.transcript-fixer && git pull origin main && git push origin main

Windows Task Scheduler

# Create scheduled task
$action = New-ScheduledTaskAction -Execute "git" -Argument "pull origin main" -WorkingDirectory "$env:USERPROFILE\.transcript-fixer"
$trigger = New-ScheduledTaskTrigger -Daily -At 9am
Register-ScheduledTask -Action $action -Trigger $trigger -TaskName "SyncTranscriptCorrections"

Backup and Recovery

Backup Strategy

# Weekly backup to cloud
cd ~/.transcript-fixer
tar -czf transcript-corrections-$(date +%Y%m%d).tar.gz corrections.json context_rules.json domains/
# Upload to Dropbox/Google Drive/S3

Recovery from Backup

# Extract backup
tar -xzf transcript-corrections-20250127.tar.gz -C ~/.transcript-fixer/

Recovery from Git

# View history
cd ~/.transcript-fixer
git log corrections.json

# Restore from 3 commits ago
git checkout HEAD~3 corrections.json

# Or restore specific version
git checkout abc123def corrections.json

Team Best Practices

  1. Pull Before Push: Always git pull before starting work
  2. Commit Often: Small, frequent commits better than large infrequent ones
  3. Descriptive Messages: "Added 5 finance terms" better than "updates"
  4. Review Process: Use PRs for major dictionary changes (100+ corrections)
  5. Domain Ownership: Assign domain experts as reviewers
  6. Weekly Sync: Schedule team sync meetings to review learned suggestions
  7. Backup Policy: Weekly backups of entire ~/.transcript-fixer/

Integration with CI/CD

For enterprise teams, integrate validation into CI:

GitHub Actions Example

# .github/workflows/validate-corrections.yml
name: Validate Corrections

on:
  pull_request:
    paths:
      - 'corrections.json'
      - 'domains/*.json'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Validate JSON
        run: |
          python -m json.tool corrections.json > /dev/null
          for file in domains/*.json; do
            python -m json.tool "$file" > /dev/null
          done

      - name: Check for duplicates
        run: |
          python scripts/check_duplicates.py corrections.json

Troubleshooting

Import Failed

# Check JSON validity
python -m json.tool team_corrections.json

# If invalid, fix JSON syntax errors
nano team_corrections.json

Git Sync Failed

# Check remote connection
git remote -v

# Re-add if needed
git remote set-url origin git@github.com:org/corrections.git

# Verify SSH keys
ssh -T git@github.com

Merge Conflicts Too Complex

# Nuclear option: Keep one version
git checkout --ours corrections.json  # Keep yours
# OR
git checkout --theirs corrections.json  # Keep theirs

# Then re-import the other version
python fix_transcription.py --import other_version.json --merge

Security Considerations

  1. Private Repos: Use private Git repositories for company-specific corrections
  2. Access Control: Limit who can push to main branch
  3. Secret Scanning: Never commit API keys (already handled by security_scan.py)
  4. Audit Trail: Git history provides full audit trail of who changed what
  5. Backup Encryption: Encrypt backups if containing sensitive terminology

Further Reading