firefrost-gaming/claude-code-skills-reference

Files

daymade bd0aa12004 Release v1.8.0: Add transcript-fixer skill

## New Skill: transcript-fixer v1.0.0

Correct speech-to-text (ASR/STT) transcription errors through dictionary-based rules and AI-powered corrections with automatic pattern learning.

**Features:**
- Two-stage correction pipeline (dictionary + AI)
- Automatic pattern detection and learning
- Domain-specific dictionaries (general, embodied_ai, finance, medical)
- SQLite-based correction repository
- Team collaboration with import/export
- GLM API integration for AI corrections
- Cost optimization through dictionary promotion

**Use cases:**
- Correcting meeting notes, lecture recordings, or interview transcripts
- Fixing Chinese/English homophone errors and technical terminology
- Building domain-specific correction dictionaries
- Improving transcript accuracy through iterative learning

**Documentation:**
- Complete workflow guides in references/
- SQL query templates
- Troubleshooting guide
- Team collaboration patterns
- API setup instructions

**Marketplace updates:**
- Updated marketplace to v1.8.0
- Added transcript-fixer plugin (category: productivity)
- Updated README.md with skill description and use cases
- Updated CLAUDE.md with skill listing and counts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-28 13:16:37 +08:00

26 KiB

Raw Blame History

Architecture Reference

Technical implementation details of the transcript-fixer system.

Module Structure
Design Principles
- SOLID Compliance
- File Length Limits
Module Architecture
Data Flow
SQLite Architecture (v2.0)
Module Details
State Management
- Database-Backed State
- Thread-Safe Access
Error Handling Strategy
Testing Strategy
Performance Considerations
Security Architecture
Extensibility Points
Dependencies
Deployment
Further Reading

Module Structure

The codebase follows a modular package structure for maintainability:

scripts/
├── fix_transcription.py        # Main entry point (~70 lines)
├── core/                       # Business logic & data access
│   ├── correction_repository.py # Data access layer (466 lines)
│   ├── correction_service.py    # Business logic layer (525 lines)
│   ├── schema.sql              # SQLite database schema (216 lines)
│   ├── dictionary_processor.py # Stage 1 processor (140 lines)
│   ├── ai_processor.py        # Stage 2 processor (199 lines)
│   └── learning_engine.py     # Pattern detection (252 lines)
├── cli/                        # Command-line interface
│   ├── commands.py            # Command handlers (180 lines)
│   └── argument_parser.py     # Argument config (95 lines)
└── utils/                      # Utility functions
    ├── diff_generator.py       # Multi-format diffs (132 lines)
    ├── logging_config.py       # Logging configuration (130 lines)
    └── validation.py          # SQLite validation (105 lines)

Benefits of modular structure:

Clear separation of concerns (business logic / CLI / utilities)
Easy to locate and modify specific functionality
Supports independent testing of modules
Scales well as codebase grows
Follows Python package best practices

Design Principles

SOLID Compliance

Every module follows SOLID principles for maintainability:

Single Responsibility Principle (SRP)
- Each module has exactly one reason to change
- CorrectionRepository: Database operations only
- CorrectionService: Business logic and validation only
- DictionaryProcessor: Text transformation only
- AIProcessor: API communication only
- LearningEngine: Pattern analysis only
Open/Closed Principle (OCP)
- Open for extension via SQL INSERT
- Closed for modification (no code changes needed)
- Add corrections via CLI or SQL without editing Python
Liskov Substitution Principle (LSP)
- All processors implement same interface
- Can swap implementations without breaking workflow
Interface Segregation Principle (ISP)
- Repository, Service, Processor, Engine are independent
- No unnecessary dependencies
Dependency Inversion Principle (DIP)
- Service depends on Repository interface
- CLI depends on Service interface
- Not tied to concrete implementations

File Length Limits

All files comply with code quality standards:

File	Lines	Limit	Status
`validation.py`	105	200	✅
`logging_config.py`	130	200	✅
`diff_generator.py`	132	200	✅
`dictionary_processor.py`	140	200	✅
`commands.py`	180	200	✅
`ai_processor.py`	199	250	✅
`schema.sql`	216	250	✅
`learning_engine.py`	252	250	✅
`correction_repository.py`	466	500	✅
`correction_service.py`	525	550	✅

Module Architecture

Layer Diagram

┌─────────────────────────────────────────┐
│   CLI Layer (fix_transcription.py)     │
│   - Argument parsing                    │
│   - Command routing                     │
│   - User interaction                    │
└───────────────┬─────────────────────────┘
                │
┌───────────────▼─────────────────────────┐
│   Business Logic Layer                  │
│                                         │
│  ┌──────────────────┐  ┌──────────────┐│
│  │ Dictionary       │  │ AI           ││
│  │ Processor        │  │ Processor    ││
│  │ (Stage 1)        │  │ (Stage 2)    ││
│  └──────────────────┘  └──────────────┘│
│                                         │
│  ┌──────────────────┐  ┌──────────────┐│
│  │ Learning         │  │ Diff         ││
│  │ Engine           │  │ Generator    ││
│  │ (Pattern detect) │  │ (Stage 3)    ││
│  └──────────────────┘  └──────────────┘│
└───────────────┬─────────────────────────┘
                │
┌───────────────▼─────────────────────────┐
│   Data Access Layer (SQLite-based)      │
│                                         │
│  ┌──────────────────────────────────┐  │
│  │ CorrectionManager (Facade)       │  │
│  │ - Backward-compatible API        │  │
│  └──────────────┬───────────────────┘  │
│                 │                       │
│  ┌──────────────▼───────────────────┐  │
│  │ CorrectionService                │  │
│  │ - Business logic                 │  │
│  │ - Validation                     │  │
│  │ - Import/Export                  │  │
│  └──────────────┬───────────────────┘  │
│                 │                       │
│  ┌──────────────▼───────────────────┐  │
│  │ CorrectionRepository             │  │
│  │ - ACID transactions              │  │
│  │ - Thread-safe connections        │  │
│  │ - Audit logging                  │  │
│  └──────────────────────────────────┘  │
└───────────────┬─────────────────────────┘
                │
┌───────────────▼─────────────────────────┐
│   Storage Layer                         │
│   ~/.transcript-fixer/corrections.db    │
│   - SQLite database (ACID compliant)    │
│   - 8 normalized tables + 3 views       │
│   - Comprehensive indexes               │
│   - Foreign key constraints             │
└─────────────────────────────────────────┘

Data Flow

Correction Workflow

1. User Input
   ↓
2. fix_transcription.py (Orchestrator)
   ↓
3. CorrectionService.get_corrections()
   ← Query from ~/.transcript-fixer/corrections.db
   ↓
4. DictionaryProcessor.process()
   - Apply context rules (regex)
   - Apply dictionary replacements
   - Track changes
   ↓
5. AIProcessor.process()
   - Split into chunks
   - Call GLM-4.6 API
   - Retry with fallback on error
   - Track AI changes
   ↓
6. CorrectionService.save_history()
   → Insert into correction_history table
   ↓
7. LearningEngine.analyze_and_suggest()
   - Query correction_history table
   - Detect patterns (frequency ≥3, confidence ≥80%)
   - Generate suggestions
   → Insert into learned_suggestions table
   ↓
8. Output Files
   - {filename}_stage1.md
   - {filename}_stage2.md

Learning Cycle

Run 1: meeting1.md
   AI corrects: "巨升" → "具身"
   ↓
   INSERT INTO correction_history

Run 2: meeting2.md
   AI corrects: "巨升" → "具身"
   ↓
   INSERT INTO correction_history

Run 3: meeting3.md
   AI corrects: "巨升" → "具身"
   ↓
   INSERT INTO correction_history
   ↓
   LearningEngine queries patterns:
   - SELECT ... GROUP BY from_text, to_text
   - Frequency: 3, Confidence: 100%
   ↓
   INSERT INTO learned_suggestions (status='pending')
   ↓
   User reviews: --review-learned
   ↓
   User approves: --approve "巨升" "具身"
   ↓
   INSERT INTO corrections (source='learned')
   UPDATE learned_suggestions (status='approved')
   ↓
   Future runs query corrections table (Stage 1 - faster!)

SQLite Architecture (v2.0)

Two-Layer Data Access (Simplified)

Design Principle: No users = no backward compatibility overhead.

The system uses a clean 2-layer architecture:

┌──────────────────────────────────────────┐
│ CLI Commands (commands.py)               │
│ - User interaction                       │
│ - Command routing                        │
└──────────────┬───────────────────────────┘
               │
┌──────────────▼───────────────────────────┐
│ CorrectionService (Business Logic)       │
│ - Input validation & sanitization        │
│ - Business rules enforcement             │
│ - Import/export orchestration            │
│ - Statistics calculation                 │
│ - History tracking                       │
└──────────────┬───────────────────────────┘
               │
┌──────────────▼───────────────────────────┐
│ CorrectionRepository (Data Access)       │
│ - ACID transactions                      │
│ - Thread-safe connections                │
│ - SQL query execution                    │
│ - Audit logging                          │
└──────────────┬───────────────────────────┘
               │
┌──────────────▼───────────────────────────┐
│ SQLite Database (corrections.db)         │
│ - 8 normalized tables                    │
│ - Foreign key constraints                │
│ - Comprehensive indexes                  │
│ - 3 views for common queries             │
└───────────────────────────────────────────┘

Database Schema (schema.sql)

Core Tables:

corrections (main correction storage)
- Primary key: id
- Unique constraint: (from_text, domain)
- Indexes: domain, source, added_at, is_active, from_text
- Fields: confidence (0.0-1.0), usage_count, notes
context_rules (regex-based rules)
- Pattern + replacement with priority ordering
- Indexes: priority (DESC), is_active
correction_history (audit trail for runs)
- Tracks: filename, domain, timestamps, change counts
- Links to correction_changes via foreign key
- Indexes: run_timestamp, domain, success
correction_changes (detailed change log)
- Links to history via foreign key (CASCADE delete)
- Stores: line_number, from/to text, rule_type, context
- Indexes: history_id, rule_type
learned_suggestions (AI-detected patterns)
- Status: pending → approved/rejected
- Unique constraint: (from_text, to_text, domain)
- Fields: frequency, confidence, timestamps
- Indexes: status, domain, confidence, frequency
suggestion_examples (occurrences of patterns)
- Links to learned_suggestions via foreign key
- Stores context where pattern occurred
system_config (configuration storage)
- Key-value store with type safety
- Stores: API settings, thresholds, defaults
audit_log (comprehensive audit trail)
- Tracks all database operations
- Fields: action, entity_type, entity_id, user, success
- Indexes: timestamp, action, entity_type, success

Views (for common queries):

active_corrections: Active corrections only
pending_suggestions: Suggestions pending review
correction_statistics: Statistics per domain

ACID Guarantees

Atomicity: All-or-nothing transactions

with self._transaction() as conn:
    conn.execute("INSERT ...")  # Either all succeed
    conn.execute("UPDATE ...")  # or all rollback

Consistency: Constraints enforced

Foreign key constraints
Check constraints (confidence 0.0-1.0, usage_count ≥ 0)
Unique constraints

Isolation: Serializable transactions

conn.execute("BEGIN IMMEDIATE")  # Acquire write lock

Durability: Changes persisted to disk

SQLite guarantees persistence after commit
Backup before migrations

Thread Safety

Thread-local connections:

def _get_connection(self):
    if not hasattr(self._local, 'connection'):
        self._local.connection = sqlite3.connect(...)
    return self._local.connection

Connection pooling:

One connection per thread
Automatic cleanup on close
Foreign keys enabled per connection

Clean Architecture (No Legacy)

Design Philosophy:

Clean 2-layer architecture (Service → Repository)
No backward compatibility overhead
Direct API design without legacy constraints
YAGNI principle: Build for current needs, not hypothetical migrations

Module Details

fix_transcription.py (Orchestrator)

Responsibilities:

Parse CLI arguments
Route commands to appropriate handlers
Coordinate workflow between modules
Display user feedback

Key Functions:

cmd_init()              # Initialize ~/.transcript-fixer/
cmd_add_correction()    # Add single correction
cmd_list_corrections()  # List corrections
cmd_run_correction()    # Execute correction workflow
cmd_review_learned()    # Review AI suggestions
cmd_approve()           # Approve learned correction

Design Pattern: Command pattern with function routing

correction_repository.py (Data Access Layer)

Responsibilities:

Execute SQL queries with ACID guarantees
Manage thread-safe database connections
Handle transactions (commit/rollback)
Perform audit logging
Convert between database rows and Python objects

Key Methods:

add_correction()          # INSERT with UNIQUE handling
get_correction()          # SELECT single correction
get_all_corrections()     # SELECT with filters
get_corrections_dict()    # For backward compatibility
update_correction()       # UPDATE with transaction
delete_correction()       # Soft delete (is_active=0)
increment_usage()         # Track usage statistics
bulk_import_corrections() # Batch INSERT with conflict resolution

Transaction Management:

@contextmanager
def _transaction(self):
    conn = self._get_connection()
    try:
        conn.execute("BEGIN IMMEDIATE")
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise

correction_service.py (Business Logic Layer)

Responsibilities:

Input validation and sanitization
Business rule enforcement
Orchestrate repository operations
Import/export with conflict detection
Statistics calculation

Key Methods:

# Validation
validate_correction_text()  # Check length, control chars, NULL bytes
validate_domain_name()      # Prevent path traversal, injection
validate_confidence()       # Range check (0.0-1.0)
validate_source()          # Enum validation

# Operations
add_correction()           # Validate + repository.add
get_corrections()          # Get corrections for domain
remove_correction()        # Validate + repository.delete

# Import/Export
import_corrections()       # Pre-validate + bulk import + conflict detection
export_corrections()       # Query + format as JSON

# Analytics
get_statistics()          # Calculate metrics per domain

Validation Rules:

@dataclass
class ValidationRules:
    max_text_length: int = 1000
    min_text_length: int = 1
    max_domain_length: int = 50
    allowed_domain_pattern: str = r'^[a-zA-Z0-9_-]+$'

CLI Integration (commands.py)

Direct Service Usage:

def _get_service():
    """Get configured CorrectionService instance."""
    config_dir = Path.home() / ".transcript-fixer"
    db_path = config_dir / "corrections.db"
    repository = CorrectionRepository(db_path)
    return CorrectionService(repository)

def cmd_add_correction(args):
    service = _get_service()
    service.add_correction(args.from_text, args.to_text, args.domain)

Benefits of Direct Integration:

No unnecessary abstraction layers
Clear data flow: CLI → Service → Repository
Easy to understand and debug
Performance: One less function call per operation

dictionary_processor.py (Stage 1)

Responsibilities:

Apply context-aware regex rules
Apply simple dictionary replacements
Track all changes with line numbers

Processing Order:

Context rules first (higher priority)
Dictionary replacements second

Key Methods:

process(text) -> (corrected_text, changes)
_apply_context_rules()
_apply_dictionary()
get_summary(changes)

Change Tracking:

@dataclass
class Change:
    line_number: int
    from_text: str
    to_text: str
    rule_type: str      # "dictionary" or "context_rule"
    rule_name: str

ai_processor.py (Stage 2)

Responsibilities:

Split text into API-friendly chunks
Call GLM-4.6 API
Handle retries with fallback model
Track AI-suggested changes

Key Methods:

process(text, context) -> (corrected_text, changes)
_split_into_chunks()     # Respect paragraph boundaries
_process_chunk()         # Single API call
_build_prompt()          # Construct correction prompt

Chunking Strategy:

Max 6000 characters per chunk
Split on paragraph boundaries (\n\n)
If paragraph too long, split on sentences
Preserve context across chunks

Error Handling:

Retry with fallback model (GLM-4.5-Air)
If both fail, use original text
Never lose user's data

learning_engine.py (Pattern Detection)

Responsibilities:

Analyze correction history
Detect recurring patterns
Calculate confidence scores
Generate suggestions for review
Track rejected suggestions

Algorithm:

1. Query correction_history table
2. Extract stage2 (AI) changes
3. Group by pattern (from→to)
4. Count frequency
5. Calculate confidence
6. Filter by thresholds:
   - frequency ≥ 3
   - confidence ≥ 0.8
7. Save to learned/pending_review.json

Confidence Calculation:

confidence = (
    0.5 * frequency_score +   # More occurrences = higher
    0.3 * consistency_score + # Always same correction
    0.2 * recency_score       # Recent = higher
)

Key Methods:

analyze_and_suggest()     # Main analysis pipeline
approve_suggestion()      # Move to corrections.json
reject_suggestion()       # Move to rejected.json
list_pending()           # Get all suggestions

diff_generator.py (Stage 3)

Responsibilities:

Generate comparison reports
Multiple output formats
Word-level diff analysis

Output Formats:

Markdown summary (statistics + change list)
Unified diff (standard diff format)
HTML side-by-side (visual comparison)
Inline marked ([-old-] [+new+])

Not Modified: Kept original 338-line file as-is (working well)

State Management

Database-Backed State

All state stored in ~/.transcript-fixer/corrections.db
SQLite handles caching and transactions
ACID guarantees prevent corruption
Backup created before migrations

Thread-Safe Access

Thread-local connections (one per thread)
BEGIN IMMEDIATE for write transactions
No global state or shared mutable data
Each operation is independent (stateless modules)

Soft Deletes

Records marked inactive (is_active=0) instead of DELETE
Preserves audit trail
Can be reactivated if needed

Error Handling Strategy

Fail Fast for User Errors

if not skill_path.exists():
    print(f"❌ Error: Skill directory not found")
    sys.exit(1)

Retry for Transient Errors

try:
    api_call(model_primary)
except Exception:
    try:
        api_call(model_fallback)
    except Exception:
        use_original_text()

Backup Before Destructive Operations

if target_file.exists():
    shutil.copy2(target_file, backup_file)
# Then overwrite target_file

Testing Strategy

Unit Testing (Recommended)

# Test dictionary processor
def test_dictionary_processor():
    corrections = {"错误": "正确"}
    processor = DictionaryProcessor(corrections, [])
    text = "这是错误的文本"
    result, changes = processor.process(text)
    assert result == "这是正确的文本"
    assert len(changes) == 1

# Test learning engine thresholds
def test_learning_thresholds():
    engine = LearningEngine(history_dir, learned_dir)
    # Create mock history with pattern appearing 3+ times
    suggestions = engine.analyze_and_suggest()
    assert len(suggestions) > 0

Integration Testing

# End-to-end test
python fix_transcription.py --init
python fix_transcription.py --add "test" "TEST"
python fix_transcription.py --input test.md --stage 3
# Verify output files exist

Performance Considerations

Bottlenecks

AI API calls: Slowest part (60s timeout per chunk)
File I/O: Negligible (JSON files are small)
Pattern matching: Fast (regex + dict lookups)

Optimization Strategies

Stage 1 First: Test dictionary corrections before expensive AI calls
Chunking: Process large files in parallel chunks (future enhancement)
Caching: Could cache API results by content hash (future enhancement)

Scalability

Current capabilities (v2.0 with SQLite):

File size: Unlimited (chunks handle large files)
Corrections: Tested up to 100,000 entries (with indexes)
History: Unlimited (database handles efficiently)
Concurrent access: Thread-safe with ACID guarantees
Query performance: O(log n) with B-tree indexes

Performance improvements from SQLite:

Indexed queries (domain, source, added_at)
Views for common aggregations
Batch imports with transactions
Soft deletes (no data loss)

Future improvements:

Parallel chunk processing for AI calls
API response caching
Full-text search for corrections

Security Architecture

Secret Management

API keys via environment variables only
Never hardcode credentials
Security scanner enforces this

Backup Security

.bak files same permissions as originals
No encryption (user's responsibility)
Recommendation: Use encrypted filesystems

Git Security

.gitignore for .bak files
Private repos recommended
Security scan before commits

Extensibility Points

Adding New Processors

Create new processor class
Implement process(text) -> (result, changes) interface
Add to orchestrator workflow

Example:

class SpellCheckProcessor:
    def process(self, text):
        # Custom spell checking logic
        return corrected_text, changes

Adding New Learning Algorithms

Subclass LearningEngine
Override _calculate_confidence()
Adjust thresholds as needed

Adding New Export Formats

Add method to CorrectionManager
Support new file format
Add CLI command

Dependencies

Required

Python 3.8+ (from __future__ import annotations)
httpx (for API calls)

Optional

diff command (for unified diffs)
Git (for version control)

Development

pytest (for testing)
black (for formatting)
mypy (for type checking)

Deployment

User Installation

# 1. Clone or download skill to workspace
git clone <repo> transcript-fixer
cd transcript-fixer

# 2. Install dependencies
pip install -r requirements.txt

# 3. Initialize
python scripts/fix_transcription.py --init

# 4. Set API key
export GLM_API_KEY="KEY_VALUE"

# Ready to use!

CI/CD Pipeline (Future)

# Potential GitHub Actions workflow
test:
  - Install dependencies
  - Run unit tests
  - Run integration tests
  - Check code style (black, mypy)

security:
  - Run security_scan.py
  - Check for secrets

deploy:
  - Package skill
  - Upload to skill marketplace

26 KiB Raw Blame History

Architecture Reference

Table of Contents

Module Structure

Design Principles

SOLID Compliance

File Length Limits

Module Architecture

Layer Diagram

Data Flow

Correction Workflow

Learning Cycle

SQLite Architecture (v2.0)

Two-Layer Data Access (Simplified)

Database Schema (schema.sql)

ACID Guarantees

Thread Safety

Clean Architecture (No Legacy)

Module Details

fix_transcription.py (Orchestrator)

correction_repository.py (Data Access Layer)

correction_service.py (Business Logic Layer)

CLI Integration (commands.py)

dictionary_processor.py (Stage 1)

ai_processor.py (Stage 2)

learning_engine.py (Pattern Detection)

diff_generator.py (Stage 3)

State Management

Database-Backed State

Thread-Safe Access

Soft Deletes

Error Handling Strategy

Fail Fast for User Errors

Retry for Transient Errors

Backup Before Destructive Operations

Testing Strategy

Unit Testing (Recommended)

Integration Testing

Performance Considerations

Bottlenecks

Optimization Strategies

Scalability

Security Architecture

Secret Management

Backup Security

Git Security

Extensibility Points

Adding New Processors

Adding New Learning Algorithms

Adding New Export Formats

Dependencies

Required

Optional

Development

Deployment

User Installation

CI/CD Pipeline (Future)

Further Reading

26 KiB

Raw Blame History