claude-skills-reference/engineering-team/senior-architect/references/tech_decision_guide.md
Alireza Rezvani 94224f2201 feat(senior-architect): Complete skill overhaul per Issue #48 (#88)
Addresses SkillzWave feedback and Anthropic best practices:

SKILL.md (343 lines):
- Third-person description with trigger phrases
- Added Table of Contents for navigation
- Concrete tool descriptions with usage examples
- Decision workflows: Database, Architecture Pattern, Monolith vs Microservices
- Removed marketing fluff, added actionable content

References (rewritten with real content):
- architecture_patterns.md: 9 patterns with trade-offs, code examples
  (Monolith, Modular Monolith, Microservices, Event-Driven, CQRS,
  Event Sourcing, Hexagonal, Clean Architecture, API Gateway)
- system_design_workflows.md: 6 step-by-step workflows
  (System Design Interview, Capacity Planning, API Design,
  Database Schema, Scalability Assessment, Migration Planning)
- tech_decision_guide.md: 7 decision frameworks with matrices
  (Database, Cache, Message Queue, Auth, Frontend, Cloud, API)

Scripts (fully functional, standard library only):
- architecture_diagram_generator.py: Mermaid + PlantUML + ASCII output
  Scans project structure, detects components, relationships
- dependency_analyzer.py: npm/pip/go/cargo support
  Circular dependency detection, coupling score calculation
- project_architect.py: Pattern detection (7 patterns)
  Layer violation detection, code quality metrics

All scripts tested and working.

Closes #48

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 10:29:14 +01:00


Technology Decision Guide

Decision frameworks and comparison matrices for common technology choices.

Decision Frameworks Index

  1. Database Selection
  2. Caching Strategy
  3. Message Queue Selection
  4. Authentication Strategy
  5. Frontend Framework Selection
  6. Cloud Provider Selection
  7. API Style Selection

1. Database Selection

SQL vs NoSQL Decision Matrix

| Factor | Choose SQL | Choose NoSQL |
|--------|------------|--------------|
| Data relationships | Complex, many-to-many | Simple, denormalized OK |
| Schema | Well-defined, stable | Evolving, flexible |
| Transactions | ACID required | Eventual consistency OK |
| Query patterns | Complex joins, aggregations | Key-value, document lookups |
| Scale | Vertical (some horizontal) | Horizontal first |
| Team expertise | Strong SQL skills | Document/KV experience |

Database Type Selection

Relational (SQL):

| Database | Best For | Avoid When |
|----------|----------|------------|
| PostgreSQL | General purpose, JSON support, extensions | Simple key-value only |
| MySQL | Web applications, read-heavy | Complex queries, JSON-heavy |
| SQLite | Embedded, development, small apps | Concurrent writes, scale |

Document (NoSQL):

| Database | Best For | Avoid When |
|----------|----------|------------|
| MongoDB | Flexible schema, rapid iteration | Complex transactions |
| CouchDB | Offline-first, sync required | High throughput |

Key-Value:

| Database | Best For | Avoid When |
|----------|----------|------------|
| Redis | Caching, sessions, real-time | Persistence critical |
| DynamoDB | Serverless, auto-scaling | Complex queries |

Wide-Column:

| Database | Best For | Avoid When |
|----------|----------|------------|
| Cassandra | Write-heavy, time-series | Complex queries, small scale |
| ScyllaDB | Cassandra alternative, performance | Small datasets |

Time-Series:

| Database | Best For | Avoid When |
|----------|----------|------------|
| TimescaleDB | Time-series with SQL | Non-time-series data |
| InfluxDB | Metrics, monitoring | Relational queries |

Search:

| Database | Best For | Avoid When |
|----------|----------|------------|
| Elasticsearch | Full-text search, logs | Primary data store |
| Meilisearch | Simple search, fast setup | Complex analytics |

Quick Decision Flow

Start
  │
  ├─ Need ACID transactions? ──Yes──► PostgreSQL/MySQL
  │
  ├─ Flexible schema needed? ──Yes──► MongoDB
  │
  ├─ Write-heavy (>50K/sec)? ──Yes──► Cassandra/ScyllaDB
  │
  ├─ Key-value access only? ──Yes──► Redis/DynamoDB
  │
  ├─ Time-series data? ──Yes──► TimescaleDB/InfluxDB
  │
  ├─ Full-text search? ──Yes──► Elasticsearch
  │
  └─ Default ──────────────────────► PostgreSQL
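
The flow above reads as a first-match rule list, which can be sketched in a few lines of Python. The `Workload` class and its field names are hypothetical, introduced only for this illustration; the thresholds and recommendations are the ones from the flow.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative only."""
    needs_acid: bool = False
    flexible_schema: bool = False
    writes_per_sec: int = 0
    key_value_only: bool = False
    time_series: bool = False
    full_text_search: bool = False


def pick_database(w: Workload) -> str:
    """Walk the decision flow top to bottom; the first matching branch wins."""
    if w.needs_acid:
        return "PostgreSQL/MySQL"
    if w.flexible_schema:
        return "MongoDB"
    if w.writes_per_sec > 50_000:
        return "Cassandra/ScyllaDB"
    if w.key_value_only:
        return "Redis/DynamoDB"
    if w.time_series:
        return "TimescaleDB/InfluxDB"
    if w.full_text_search:
        return "Elasticsearch"
    return "PostgreSQL"  # default when no branch matches
```

Because the branches are checked in order, a workload that needs ACID transactions and also stores time-series data still lands on PostgreSQL/MySQL. Reordering the checks changes the outcome, so treat the order as part of the decision.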

2. Caching Strategy

Cache Type Selection

| Type | Use Case | Invalidation | Complexity |
|------|----------|--------------|------------|
| Read-through | Frequent reads, stale data tolerable | On write/TTL | Low |
| Write-through | Data consistency critical | Automatic | Medium |
| Write-behind | High write throughput | Async | High |
| Cache-aside | Fine-grained control | Application | Medium |

Cache Technology Selection

| Technology | Best For | Limitations |
|------------|----------|-------------|
| Redis | General purpose, data structures | Memory cost |
| Memcached | Simple key-value, high throughput | No persistence |
| CDN (CloudFront, Fastly) | Static assets, edge caching | Dynamic content |
| Application cache | Per-instance, small data | Not distributed |

Cache Patterns

Cache-Aside (Lazy Loading):

Read:
1. Check cache
2. If miss, read from DB
3. Store in cache
4. Return data

Write:
1. Write to DB
2. Invalidate cache
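
A minimal sketch of the cache-aside steps, assuming an in-process dict as the cache and two caller-supplied callables for database access. `CacheAside`, `load_from_db`, and `save_to_db` are hypothetical names for this illustration, not part of any library.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside sketch: the application, not the cache, talks to the DB."""

    def __init__(
        self,
        load_from_db: Callable[[str], Optional[str]],
        save_to_db: Callable[[str, str], None],
    ) -> None:
        self._cache: Dict[str, str] = {}
        self._load = load_from_db
        self._save = save_to_db

    def read(self, key: str) -> Optional[str]:
        if key in self._cache:            # 1. check cache
            return self._cache[key]
        value = self._load(key)           # 2. on a miss, read from the DB
        if value is not None:
            self._cache[key] = value      # 3. store in cache
        return value                      # 4. return data

    def write(self, key: str, value: str) -> None:
        self._save(key, value)            # 1. write to the DB
        self._cache.pop(key, None)        # 2. invalidate the cached entry
```

With a plain dict standing in for the database, `CacheAside(db.get, db.__setitem__)` is enough to try it out. Invalidating on write rather than updating the cache keeps the next read authoritative, at the cost of one extra miss.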

Write-Through:

Write:
1. Write to cache
2. Cache writes to DB
3. Return success

Read:
1. Read from cache (a hit for any key that was written through the cache)
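
The write-through steps can be sketched the same way. This assumes a dict-like backing store and a synchronous write. `WriteThroughCache` is a hypothetical name, and the fallback read for keys that were never written through the cache is an addition of this sketch.

```python
from typing import Dict, Optional


class WriteThroughCache:
    """Write-through sketch: every write updates the cache and the DB together."""

    def __init__(self, db: Dict[str, str]) -> None:
        self._db = db                      # stand-in for the real database
        self._cache: Dict[str, str] = {}

    def write(self, key: str, value: str) -> bool:
        self._cache[key] = value           # 1. write to cache
        try:
            self._db[key] = value          # 2. cache layer writes to the DB
        except Exception:
            self._cache.pop(key, None)     # keep cache and DB consistent on failure
            raise
        return True                        # 3. return success

    def read(self, key: str) -> Optional[str]:
        if key in self._cache:             # hit for anything written through
            return self._cache[key]
        value = self._db.get(key)          # assumption: fall back for older keys
        if value is not None:
            self._cache[key] = value
        return value
```

The write returns only after both stores are updated, which is where the added write latency of this pattern comes from.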

TTL Guidelines:

| Data Type | Suggested TTL |
|-----------|---------------|
| User sessions | 24-48 hours |
| API responses | 1-5 minutes |
| Static content | 24 hours - 1 week |
| Database queries | 5-60 minutes |
| Feature flags | 1-5 minutes |
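
As an illustration of applying these TTLs, the sketch below keeps a per-data-type TTL table and expires entries lazily on read. The `TTL_SECONDS` keys and the `TTLCache` class are hypothetical, and the values use the lower bound of each suggested range as an assumption rather than a recommendation.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Lower bound of each suggested range above, in seconds (an assumption).
TTL_SECONDS: Dict[str, int] = {
    "user_session": 24 * 3600,
    "api_response": 60,
    "static_content": 24 * 3600,
    "db_query": 5 * 60,
    "feature_flag": 60,
}


class TTLCache:
    """In-memory cache whose entries expire after a per-data-type TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, data_type: str) -> None:
        expires_at = self._clock() + TTL_SECONDS[data_type]
        self._items[key] = (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:    # expired: evict lazily on read
            del self._items[key]
            return None
        return value
```

Shorter TTLs reduce staleness but raise load on the source of truth, so the ranges in the table are starting points to tune against measured hit rates.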

3. Message Queue Selection

Queue Technology Comparison

| Feature | RabbitMQ | Kafka | SQS | Redis Streams |
|---------|----------|-------|-----|---------------|
| Throughput | Medium (~10K msgs/s) | Very high (100K+ msgs/s) | Medium | High |
| Ordering | Per-queue | Per-partition | FIFO optional | Per-stream |
| Durability | Configurable | Strong | Strong | Configurable |
| Replay | No | Yes | No | Yes |
| Complexity | Medium | High | Low | Low |
| Cost | Self-hosted | Self-hosted | Pay-per-use | Self-hosted |

Decision Matrix

| Requirement | Recommendation |
|-------------|----------------|
| Simple task queue | SQS or Redis |
| Event streaming | Kafka |
| Complex routing | RabbitMQ |
| Log aggregation | Kafka |
| Serverless integration | SQS |
| Real-time analytics | Kafka |
| Request/reply pattern | RabbitMQ |

When to Use Each

RabbitMQ:

  • Complex routing logic (topic, fanout, headers)
  • Request/reply patterns
  • Priority queues
  • Message acknowledgment critical

Kafka:

  • Event sourcing
  • High throughput requirements (>50K messages/sec)
  • Message replay needed
  • Stream processing
  • Log aggregation

SQS:

  • AWS-native applications
  • Simple queue semantics
  • Serverless architectures
  • Don't want to manage infrastructure

Redis Streams:

  • Already using Redis
  • Moderate throughput
  • Simple streaming needs
  • Real-time features

4. Authentication Strategy

Method Selection

| Method | Best For | Avoid When |
|--------|----------|------------|
| Session-based | Traditional web apps, server-rendered | Mobile apps, microservices |
| JWT | SPAs, mobile apps, microservices | Need immediate revocation |
| OAuth 2.0 | Third-party access, social login | Internal-only apps |
| API Keys | Server-to-server, simple auth | User authentication |
| mTLS | Service mesh, high security | Public APIs |

JWT vs Sessions

| Factor | JWT | Sessions |
|--------|-----|----------|
| Scalability | Stateless, easy to scale | Requires session store |
| Revocation | Difficult (need blocklist) | Immediate |
| Payload | Can contain claims | Server-side only |
| Security | Token in client | Server-controlled |
| Mobile friendly | Yes | Requires cookies |

OAuth 2.0 Flow Selection

| Flow | Use Case |
|------|----------|
| Authorization Code | Web apps with backend |
| Authorization Code + PKCE | SPAs, mobile apps |
| Client Credentials | Machine-to-machine |
| Device Code | Smart TVs, CLI tools |

Avoid: Implicit flow (deprecated; use Authorization Code + PKCE instead) and Resource Owner Password Credentials (legacy integrations only).
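
For the Authorization Code + PKCE flow, the client generates a random code verifier and sends its SHA-256 hash (the code challenge, method `S256`) with the authorization request. A minimal sketch using only the standard library follows; the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Return a random URL-safe verifier; 32 bytes yields 43 characters."""
    return secrets.token_urlsafe(n_bytes)


def make_code_challenge(verifier: str) -> str:
    """Return the S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client keeps the verifier private and sends it only with the token request, so an intercepted authorization code is useless without it.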

Token Lifetimes

| Token Type | Suggested Lifetime |
|------------|--------------------|
| Access token | 15-60 minutes |
| Refresh token | 7-30 days |
| API key | No expiry (rotate quarterly) |
| Session | 24 hours - 7 days |
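
To show how a short access-token lifetime is enforced, here is a sketch of issuing and verifying an HS256-signed JWT with an `exp` claim, using only the standard library. `issue_token` and `verify_token` are hypothetical helpers written for this illustration; a maintained JWT library is the safer choice in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(subject: str, secret: bytes, ttl_seconds: int = 15 * 60,
                now: Optional[int] = None) -> str:
    """Issue an HS256 JWT that expires ttl_seconds after it is issued."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes, now: Optional[int] = None) -> dict:
    """Return the payload if the signature is valid and the token is unexpired."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    if now >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

The default TTL here is the 15-minute lower bound for access tokens. Because verification is stateless, a stolen token stays valid until `exp`, which is why the lifetime is kept short and paired with a longer-lived refresh token.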

5. Frontend Framework Selection

Framework Comparison

| Factor | React | Vue | Angular | Svelte |
|--------|-------|-----|---------|--------|
| Learning curve | Medium | Low | High | Low |
| Ecosystem | Largest | Large | Complete | Growing |
| Performance | Good | Good | Good | Excellent |
| Bundle size | Medium | Small | Large | Smallest |
| TypeScript | Good | Good | Native | Good |
| Job market | Largest | Growing | Enterprise | Niche |

Decision Matrix

| Requirement | Recommendation |
|-------------|----------------|
| Large team, enterprise | Angular |
| Startup, rapid iteration | React or Vue |
| Performance critical | Svelte or Solid |
| Existing React team | React |
| Progressive enhancement | Vue or Svelte |
| Component library needed | React (most options) |

Meta-Framework Selection

| Framework | Best For |
|-----------|----------|
| Next.js (React) | Full-stack React, SSR/SSG |
| Nuxt (Vue) | Full-stack Vue, SSR/SSG |
| SvelteKit | Full-stack Svelte |
| Remix | Data-heavy React apps |
| Astro | Content sites, multi-framework |

When to Use SSR vs SPA vs SSG

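| Rendering | Use When |
|-----------|----------|
| SSR (server-side rendering) | SEO critical, dynamic content, auth-gated |
| SPA (single-page application) | Internal tools, highly interactive, no SEO |
| SSG (static site generation) | Content sites, blogs, documentation |
| ISR (incremental static regeneration) | Mix of static and dynamic |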

6. Cloud Provider Selection

Provider Comparison

| Factor | AWS | GCP | Azure |
|--------|-----|-----|-------|
| Market share | Largest | Growing | Strong in enterprise |
| Service breadth | Most comprehensive | Strong ML/data | Best Microsoft integration |
| Pricing | Complex, volume discounts | Simpler, sustained-use discounts | Enterprise Agreement (EA) discounts |
| Kubernetes | EKS | GKE (best managed) | AKS |
| Serverless | Lambda (mature) | Cloud Functions | Azure Functions |
| Database | RDS, DynamoDB | Cloud SQL, Spanner | Azure SQL, Cosmos DB |

Decision Factors

| If You Need | Consider |
|-------------|----------|
| Microsoft ecosystem | Azure |
| Best Kubernetes experience | GCP |
| Widest service selection | AWS |
| Machine learning focus | GCP or AWS |
| Government compliance | AWS GovCloud or Azure Gov |
| Startup credits | All offer programs |

Multi-Cloud Considerations

Go multi-cloud when:

  • Regulatory requirements mandate it
  • Specific service (e.g., GCP BigQuery) is best-in-class
  • Negotiating leverage with vendors

Stay single-cloud when:

  • Team is small
  • Want to minimize complexity
  • Deep integration needed

Service Mapping

| Need | AWS | GCP | Azure |
|------|-----|-----|-------|
| Compute | EC2 | Compute Engine | Virtual Machines |
| Containers | ECS, EKS | GKE, Cloud Run | AKS, Container Apps |
| Serverless | Lambda | Cloud Functions | Azure Functions |
| Object Storage | S3 | Cloud Storage | Blob Storage |
| SQL Database | RDS | Cloud SQL | Azure SQL |
| NoSQL | DynamoDB | Firestore | Cosmos DB |
| CDN | CloudFront | Cloud CDN | Azure CDN |
| DNS | Route 53 | Cloud DNS | Azure DNS |

7. API Style Selection

REST vs GraphQL vs gRPC

| Factor | REST | GraphQL | gRPC |
|--------|------|---------|------|
| Use case | General purpose | Flexible queries | Microservices |
| Learning curve | Low | Medium | High |
| Over-fetching | Common | Solved | N/A |
| Caching | HTTP native | Complex | Custom |
| Browser support | Native | Native | Limited |
| Tooling | Mature | Growing | Strong |
| Performance | Good | Good | Excellent |

Decision Matrix

| Requirement | Recommendation |
|-------------|----------------|
| Public API | REST |
| Mobile apps with varied needs | GraphQL |
| Microservices communication | gRPC |
| Real-time updates | GraphQL subscriptions or WebSocket |
| File uploads | REST |
| Internal services only | gRPC |
| Third-party developers | REST + OpenAPI |

When to Choose Each

Choose REST when:

  • Building public APIs
  • Need HTTP caching
  • Simple CRUD operations
  • Team experienced with REST

Choose GraphQL when:

  • Multiple clients with different data needs
  • Rapid frontend iteration
  • Complex, nested data relationships
  • Want to reduce API calls

Choose gRPC when:

  • Service-to-service communication
  • Performance critical
  • Streaming required
  • Strong typing important

API Versioning Strategies

| Strategy | Pros | Cons |
|----------|------|------|
| URL path (`/v1/`) | Clear, easy to implement | URL pollution |
| Query param (`?version=1`) | Flexible | Easy to miss |
| Header (`Accept-Version: 1`) | Clean URLs | Less discoverable |
| No versioning (evolve) | Simple | Breaking changes risky |

Recommendation: URL path versioning for public APIs, header versioning for internal.
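
A small sketch of how a server might resolve the requested version under the two recommended strategies. The `resolve_version` function, the precedence of path over header, and the fallback to version 1 are assumptions of this example.

```python
import re
from typing import Mapping

# Matches a leading /v<digits> path segment, e.g. /v2/users or /v10
_PATH_VERSION = re.compile(r"^/v(\d+)(?:/|$)")


def resolve_version(path: str, headers: Mapping[str, str], default: int = 1) -> int:
    """Return the API version from the URL path, else the Accept-Version header."""
    match = _PATH_VERSION.match(path)
    if match:
        return int(match.group(1))
    for name, value in headers.items():
        # HTTP header names are case-insensitive
        if name.lower() == "accept-version" and value.strip().isdecimal():
            return int(value.strip())
    return default
```

For example, `resolve_version("/v2/users", {})` yields 2, while `resolve_version("/users", {"Accept-Version": "3"})` yields 3.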


Quick Reference

| Decision | Default Choice | Alternative When |
|----------|----------------|------------------|
| Database | PostgreSQL | Scale/flexibility → MongoDB, DynamoDB |
| Cache | Redis | Simple needs → Memcached |
| Queue | SQS (AWS) / RabbitMQ | Event streaming → Kafka |
| Auth | JWT + Refresh | Traditional web → Sessions |
| Frontend | React + Next.js | Simplicity → Vue, Performance → Svelte |
| Cloud | AWS | Microsoft shop → Azure, ML-first → GCP |
| API | REST | Mobile flexibility → GraphQL, Internal → gRPC |