claude-skills-reference/engineering-team/aws-solution-architect/references/service_selection.md
Alireza Rezvani c7dc957823 fix(skill): restructure aws-solution-architect for better organization (#61) (#114)
Complete restructure based on AI Agent Skills Benchmark feedback (original score: 66/100):

## Directory Reorganization
- Moved Python scripts to scripts/ directory
- Moved sample files to assets/ directory
- Created references/ directory with extracted content
- Removed HOW_TO_USE.md (integrated into SKILL.md)
- Removed __pycache__

## New Reference Files (3 files)
- architecture_patterns.md: 6 AWS patterns (serverless, microservices, three-tier,
  data processing, GraphQL, multi-region) with diagrams, cost breakdowns, pros/cons
- service_selection.md: Decision matrices for compute, database, storage, messaging,
  networking, security services with code examples
- best_practices.md: Serverless design, cost optimization, security hardening,
  scalability patterns, common pitfalls

## SKILL.md Rewrite
- Reduced from 345 lines to 307 lines (moved patterns to references/)
- Added trigger phrases to description ("design serverless architecture",
  "create CloudFormation templates", "optimize AWS costs")
- Structured around 6-step workflow instead of encyclopedia format
- Added Quick Start examples (MVP, Scaling, Cost Optimization, IaC)
- Removed marketing language ("Expert", "comprehensive")
- Consistent imperative voice throughout

## Structure Changes
- scripts/: architecture_designer.py, cost_optimizer.py, serverless_stack.py
- references/: architecture_patterns.md, service_selection.md, best_practices.md
- assets/: sample_input.json, expected_output.json

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 02:42:08 +01:00


AWS Service Selection Guide

Quick reference for choosing the right AWS service based on requirements.



Compute Services

Decision Matrix

| Requirement | Recommended Service |
|---|---|
| Event-driven, short tasks (<15 min) | Lambda |
| Containerized apps, predictable traffic | ECS Fargate |
| Custom configs, GPU/FPGA | EC2 |
| Simple container from source | App Runner |
| Kubernetes workloads | EKS |
| Batch processing | AWS Batch |
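
The matrix can be read as an ordered set of checks. The sketch below is one possible encoding, not an official rule set: the `Workload` fields, the priority order, and the EC2 fallback are assumptions made for illustration, and App Runner is left out because "simple container from source" is hard to express as a flag.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative only."""
    max_runtime_minutes: float
    event_driven: bool = False
    containerized: bool = False
    needs_gpu_or_fpga: bool = False
    needs_kubernetes: bool = False
    batch_job: bool = False


def recommend_compute(w: Workload) -> str:
    """Return the service the decision matrix suggests, most specific need first."""
    if w.needs_gpu_or_fpga:
        return "EC2"
    if w.needs_kubernetes:
        return "EKS"
    if w.batch_job:
        return "AWS Batch"
    if w.event_driven and w.max_runtime_minutes < 15:
        return "Lambda"
    if w.containerized:
        return "ECS Fargate"
    return "EC2"  # assumed fallback when nothing more specific applies
```

For example, `recommend_compute(Workload(max_runtime_minutes=1, event_driven=True))` returns `"Lambda"`.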

Lambda

Best for: Event-driven functions, API backends, scheduled tasks

Limits:
- Execution: 15 minutes max
- Memory: 128 MB - 10 GB
- Package: 50 MB (zipped, direct upload), 250 MB (unzipped), 10 GB (container image)
- Concurrency: 1000 default (soft limit)

Pricing: $0.20 per 1M requests + compute time

Use when:

  • Variable/unpredictable traffic
  • Pay-per-use is important
  • No server management desired
  • Short-duration operations

Avoid when:

  • Long-running processes (>15 min)
  • Consistent low-latency requirements (<50 ms); cold starts add latency
  • Heavy compute (consider Fargate)
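
A rough monthly estimate helps when weighing Lambda against Fargate. The sketch below uses the request price listed above. The per-GB-second compute rate is an assumption (the x86 rate commonly quoted for us-east-1), and the free tier is ignored, so check current pricing before relying on the numbers.

```python
REQUEST_PRICE_PER_MILLION = 0.20   # from the pricing line above
GB_SECOND_PRICE = 0.0000166667     # assumption: x86 rate in us-east-1


def lambda_monthly_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate monthly Lambda cost in USD, ignoring the free tier."""
    if not 128 <= memory_mb <= 10240:
        raise ValueError("memory must be between 128 MB and 10 GB")
    if avg_duration_ms > 15 * 60 * 1000:
        raise ValueError("execution time cannot exceed 15 minutes")
    request_cost = requests / 1_000_000 * REQUEST_PRICE_PER_MILLION
    # Compute is billed on duration multiplied by configured memory
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * GB_SECOND_PRICE
```

With 5 million requests at 200 ms and 512 MB, the estimate is about $9.33 per month: $1.00 for requests plus roughly $8.33 for compute.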

ECS Fargate

Best for: Containerized applications, microservices

Limits:
- vCPU: 0.25 - 16
- Memory: 0.5 GB - 120 GB
- Storage: 20 GB - 200 GB ephemeral

Pricing: Per vCPU-hour + GB-hour

Use when:

  • Containerized applications
  • Predictable traffic patterns
  • Long-running processes
  • Need more control than Lambda

EC2

Best for: Custom configurations, specialized hardware

Instance Types:
- General: t3, m6i
- Compute: c6i
- Memory: r6i
- GPU: p4d, g5
- Storage: i3, d3

Use when:

  • Need GPU/FPGA
  • Windows applications
  • Specific instance configurations
  • Reserved capacity makes sense

Database Services

Decision Matrix

| Data Type | Query Pattern | Scale | Recommended |
|---|---|---|---|
| Key-value | Simple lookups | Any | DynamoDB |
| Document | Flexible queries | <1TB | DocumentDB |
| Relational | Complex joins | Variable | Aurora Serverless |
| Relational | High volume | Fixed | Aurora Standard |
| Time-series | Time-based | Any | Timestream |
| Graph | Relationships | Any | Neptune |

DynamoDB

Best for: Key-value and document data, serverless applications

Limits:
- Item size: 400 KB max
- Partition key: 2048 bytes
- Sort key: 1024 bytes
- GSI: 20 per table

Pricing:
- On-demand: $1.25 per million writes, $0.25 per million reads
- Provisioned: Per RCU/WCU

Data Modeling Example:

# Single-table design for e-commerce
PK                  SK                  Attributes
USER#123            PROFILE             {name, email, ...}
USER#123            ORDER#456           {total, status, ...}
USER#123            ORDER#456#ITEM#1    {product, qty, ...}
PRODUCT#789         METADATA            {name, price, ...}
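
Key construction for this layout is easiest to keep consistent when it lives in small helpers. The functions below are a sketch: the helper names and the `PK`/`SK` attribute names mirror the table above but are otherwise assumptions. The returned dictionary is shaped for the low-level boto3 client call `query(**params)`.

```python
def user_pk(user_id: str) -> str:
    """Partition key shared by every item that belongs to one user."""
    return f"USER#{user_id}"


def order_sk(order_id: str) -> str:
    """Sort key for an order header item."""
    return f"ORDER#{order_id}"


def order_item_sk(order_id: str, item_no: int) -> str:
    """Sort key for a line item; it sorts directly after its order header."""
    return f"ORDER#{order_id}#ITEM#{item_no}"


def orders_query_params(table_name: str, user_id: str) -> dict:
    """Build query parameters that fetch a user's orders and their line items."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": user_pk(user_id)},
            ":prefix": {"S": "ORDER#"},
        },
    }
```

Because line items share the `ORDER#` prefix, one query returns both order headers and their items, which is the main reason for placing them in the same partition.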

Aurora

Best for: Relational data with complex queries

| Edition | Use Case | Scaling |
|---|---|---|
| Aurora Serverless v2 | Variable workloads | 0.5-128 ACUs, auto |
| Aurora Standard | Predictable workloads | Instance-based |
| Aurora Global | Multi-region | Cross-region replication |

Limits:
- Storage: 128 TB max
- Replicas: 15 read replicas
- Connections: Instance-dependent

Pricing:
- Serverless: $0.12 per ACU-hour
- Standard: Instance + storage + I/O

Comparison: DynamoDB vs Aurora

| Factor | DynamoDB | Aurora |
|---|---|---|
| Query flexibility | Limited (key-based) | Full SQL |
| Scaling | Instant, unlimited | Minutes, up to limits |
| Consistency | Eventual or strong (per read) | ACID |
| Cost model | Per-request | Per-hour |
| Operational | Zero management | Some management |
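
The per-request versus per-hour cost models can be compared with a quick calculation. This sketch uses the on-demand and ACU prices listed earlier in this guide. It assumes 730 hours per month and items small enough to consume one request unit each, and it leaves out storage and I/O charges, so treat the result as a first approximation.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def dynamodb_on_demand_cost(writes: int, reads: int) -> float:
    """Monthly request cost in USD at $1.25/M writes and $0.25/M reads."""
    return writes / 1_000_000 * 1.25 + reads / 1_000_000 * 0.25


def aurora_serverless_cost(avg_acus: float) -> float:
    """Monthly compute cost in USD at $0.12 per ACU-hour."""
    return avg_acus * 0.12 * HOURS_PER_MONTH
```

At 10 million writes and 40 million reads per month, DynamoDB requests cost about $22.50, while an Aurora Serverless v2 cluster idling at 0.5 ACU costs about $43.80 before any work is done.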

Storage Services

S3 Storage Classes

| Class | Access Pattern | Retrieval | Storage Cost (per GB-month) |
|---|---|---|---|
| Standard | Frequent | Instant | $0.023 |
| Intelligent-Tiering | Unknown | Instant | $0.023 + monitoring |
| Standard-IA | Infrequent (30+ days) | Instant | $0.0125 |
| One Zone-IA | Infrequent, single AZ | Instant | $0.01 |
| Glacier Instant | Archive, instant access | Instant | $0.004 |
| Glacier Flexible | Archive | Minutes-hours | $0.0036 |
| Glacier Deep Archive | Long-term archive | 12-48 hours | $0.00099 |
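
The per-GB prices above translate directly into a monthly storage estimate. The sketch below covers storage only: request, retrieval, monitoring, and minimum-storage-duration charges are deliberately left out. The dictionary keys are illustrative labels chosen here, not values taken from this guide.

```python
# Storage price per GB-month, copied from the table above
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "ONEZONE_IA": 0.01,
    "GLACIER_INSTANT": 0.004,
    "GLACIER_FLEXIBLE": 0.0036,
    "DEEP_ARCHIVE": 0.00099,
}


def monthly_storage_cost(gb_by_class: dict) -> float:
    """Sum storage cost in USD for a mapping of class label to stored GB."""
    return sum(PRICE_PER_GB_MONTH[cls] * gb for cls, gb in gb_by_class.items())
```

Storing 1 TB in Standard and 10 TB in Deep Archive comes to roughly $32.90 per month (about $23.00 plus $9.90).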

Lifecycle Policy Example

{
  "Rules": [
    {
      "ID": "Archive old data",
      "Filter": {"Prefix": ""},
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}
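
To sanity-check a policy like this one, it can help to compute which storage class an object should be in at a given age. The function below is a simplified model of the rule above. It assumes transitions take effect exactly on the day threshold, whereas S3 applies lifecycle actions asynchronously, so real transitions can lag slightly.

```python
# (age in days, target storage class), mirroring the lifecycle rule above
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555  # roughly seven years


def storage_class_at(age_days: int):
    """Return the expected storage class, or None once the object has expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"
    for threshold, storage_class in TRANSITIONS:
        if age_days >= threshold:
            current = storage_class
    return current
```

For example, an object that is 100 days old is expected to be in `GLACIER`, and one that is 2555 days old is expected to be gone.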

Block and File Storage

| Service | Use Case | Access |
|---|---|---|
| EBS | EC2 block storage | Single instance |
| EFS | Shared file system | Multiple instances |
| FSx for Lustre | HPC workloads | High throughput |
| FSx for Windows | Windows apps | SMB protocol |

Messaging and Events

Decision Matrix

| Pattern | Service | Use Case |
|---|---|---|
| Event routing | EventBridge | Microservices, SaaS integration |
| Pub/sub | SNS | Fan-out notifications |
| Queue | SQS | Decoupling, buffering |
| Streaming | Kinesis | Real-time analytics |
| Message broker | Amazon MQ | Legacy migrations |

EventBridge

Best for: Event-driven architectures, SaaS integration

EventBridge rule pattern:

{
    "source": ["orders.service"],
    "detail-type": ["OrderCreated"],
    "detail": {
        "total": [{"numeric": [">=", 100]}]
    }
}
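
Patterns like this can be unit-tested locally before deployment. The matcher below is a deliberately small sketch that handles only exact-value lists, nested objects, and `numeric` comparisons. It is not EventBridge's implementation and ignores the other operators (prefix, exists, anything-but, and so on).

```python
import operator

_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
        "<=": operator.le, "=": operator.eq}


def _value_matches(value, candidates) -> bool:
    """True if value satisfies any candidate in a pattern's value list."""
    for cand in candidates:
        if isinstance(cand, dict) and "numeric" in cand:
            spec = cand["numeric"]  # flat list, e.g. [">=", 100, "<", 500]
            pairs = zip(spec[0::2], spec[1::2])
            if isinstance(value, (int, float)) and all(
                    _OPS[op](value, bound) for op, bound in pairs):
                return True
        elif cand == value:
            return True
    return False


def matches(pattern: dict, event: dict) -> bool:
    """Check every pattern key against the event, recursing into nested objects."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif not _value_matches(event[key], expected):
            return False
    return True
```

With the rule above, an `OrderCreated` event from `orders.service` whose `detail.total` is 150 matches, while the same event with a total of 50 does not.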

SQS

Best for: Decoupling services, handling load spikes

| Feature | Standard | FIFO |
|---|---|---|
| Throughput | Nearly unlimited | 300 msg/sec (3,000 with batching) |
| Ordering | Best effort | Guaranteed |
| Delivery | At least once | Exactly-once processing |
| Deduplication | No | Yes |

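# SQS consumer with manual dead-letter handling
import logging

import boto3

logger = logging.getLogger(__name__)
sqs = boto3.client('sqs')


def process(message):
    """Placeholder handler; replace with application logic."""
    raise NotImplementedError


def process_with_dlq(queue_url, dlq_url, max_retries=3):
    # Long-poll for up to 10 messages and request each message's receive count
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
        AttributeNames=['ApproximateReceiveCount']
    )

    for message in response.get('Messages', []):
        receive_count = int(message['Attributes']['ApproximateReceiveCount'])

        try:
            process(message)
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message['ReceiptHandle'])
        except Exception:
            logger.exception("Processing failed (receive count %d)", receive_count)
            if receive_count >= max_retries:
                # Retries exhausted: copy to the DLQ, then remove from the source queue
                sqs.send_message(QueueUrl=dlq_url, MessageBody=message['Body'])
                sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message['ReceiptHandle'])
            # Otherwise leave the message; it becomes visible again after the visibility timeout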
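
An alternative to moving messages by hand is to let SQS do it through a redrive policy on the source queue. The helper below only builds the attribute payload. The function name and the example ARN are hypothetical, and the call that applies the payload is left as a comment because it needs AWS credentials.

```python
import json


def redrive_attributes(dlq_arn: str, max_receive_count: int = 3) -> dict:
    """Build the Attributes payload that attaches a DLQ to a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy as a JSON-encoded string
    return {"RedrivePolicy": json.dumps(policy)}


# Applying it (requires credentials and a real queue URL):
# boto3.client("sqs").set_queue_attributes(
#     QueueUrl=queue_url,
#     Attributes=redrive_attributes("arn:aws:sqs:us-east-1:123456789012:orders-dlq"),
# )
```

With a redrive policy in place, the consumer only needs to delete messages on success. SQS moves a message to the DLQ once its receive count exceeds `maxReceiveCount`.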

Kinesis

Best for: Real-time streaming data, analytics

| Service | Use Case |
|---|---|
| Data Streams | Custom processing |
| Data Firehose | Direct to S3/Redshift |
| Data Analytics | SQL on streams |
| Video Streams | Video ingestion |

API and Integration

API Gateway vs AppSync

| Factor | API Gateway | AppSync |
|---|---|---|
| Protocol | REST, WebSocket | GraphQL |
| Real-time | WebSocket setup | Built-in subscriptions |
| Caching | Response caching | Field-level caching |
| Integration | Lambda, HTTP, AWS services | Lambda, DynamoDB, HTTP |
| Pricing | Per request | Per request + data |

API Gateway Configuration

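# Throttling and caching
Resources:
  ApiGateway:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Name: my-api

  ApiStage:
    Type: AWS::ApiGateway::Stage
    Properties:
      RestApiId: !Ref ApiGateway          # required: ties the stage to the API
      # DeploymentId: !Ref ApiDeployment  # hypothetical name; add once an AWS::ApiGateway::Deployment exists
      StageName: prod
      CacheClusterEnabled: true           # stage cache must be on for CachingEnabled to take effect
      CacheClusterSize: "0.5"             # cache size in GB
      MethodSettings:
        - HttpMethod: "*"
          ResourcePath: "/*"
          ThrottlingBurstLimit: 500
          ThrottlingRateLimit: 1000
          CachingEnabled: true
          CacheTtlInSeconds: 300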
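
The interaction between the rate limit and the burst limit is easier to see in a small simulation. API Gateway documents its throttling as a token bucket. The class below is a generic token bucket written for illustration, with time passed in explicitly so the behavior is deterministic. It is not API Gateway's code.

```python
class TokenBucket:
    """Generic token bucket: `burst` is the capacity, `rate_per_sec` the refill rate."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now: float) -> bool:
        """Refill for the time elapsed since the last call, then try to spend one token."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_sec=1000, burst=500)
# 600 requests arriving at the same instant: only the burst capacity gets through
accepted = sum(bucket.allow(now=0.0) for _ in range(600))
```

Here `accepted` is 500. The remaining 100 requests would receive a 429 response until the bucket refills at 1000 tokens per second.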

Step Functions

Best for: Workflow orchestration, long-running processes

{
  "StartAt": "ProcessOrder",
  "States": {
    "ProcessOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:processOrder",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.inStock",
          "BooleanEquals": true,
          "Next": "ShipOrder"
        }
      ],
      "Default": "BackOrder"
    },
    "ShipOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:shipOrder",
      "End": true
    },
    "BackOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:...:backOrder",
      "End": true
    }
  }
}
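
The branching in this definition can be traced locally without deploying anything. The walker below is a toy that supports only `Task` states and `BooleanEquals` choices on top-level `$.field` variables. It never invokes Lambda and passes the input through unchanged, whereas in a real execution the output of `ProcessOrder` would become the input to `CheckInventory`. The `Resource` fields are omitted from `DEFINITION` because the sketch does not use them.

```python
DEFINITION = {
    "StartAt": "ProcessOrder",
    "States": {
        "ProcessOrder": {"Type": "Task", "Next": "CheckInventory"},
        "CheckInventory": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.inStock", "BooleanEquals": True, "Next": "ShipOrder"}
            ],
            "Default": "BackOrder",
        },
        "ShipOrder": {"Type": "Task", "End": True},
        "BackOrder": {"Type": "Task", "End": True},
    },
}


def trace_states(definition: dict, state_input: dict) -> list:
    """Return the ordered list of state names visited for the given input."""
    path = []
    name = definition["StartAt"]
    while True:
        state = definition["States"][name]
        path.append(name)
        if state["Type"] == "Choice":
            name = state["Default"]
            for choice in state["Choices"]:
                field = choice["Variable"][2:]  # strip the leading "$."
                if ("BooleanEquals" in choice
                        and state_input.get(field) is choice["BooleanEquals"]):
                    name = choice["Next"]
                    break
        elif state.get("End"):
            return path
        else:
            name = state["Next"]
```

Calling `trace_states(DEFINITION, {"inStock": True})` visits `ProcessOrder`, `CheckInventory`, then `ShipOrder`. Any other input falls through to `BackOrder`.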

Networking

VPC Components

| Component | Purpose |
|---|---|
| VPC | Isolated network |
| Subnet | Network segment (public/private) |
| Internet Gateway | Public internet access |
| NAT Gateway | Private subnet outbound |
| VPC Endpoint | Private AWS service access |
| Transit Gateway | VPC interconnection |

VPC Design Pattern

VPC: 10.0.0.0/16

Public Subnets (AZ a, b, c):
  10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24
  - ALB, NAT Gateway, Bastion

Private Subnets (AZ a, b, c):
  10.0.11.0/24, 10.0.12.0/24, 10.0.13.0/24
  - Application servers, Lambda

Database Subnets (AZ a, b, c):
  10.0.21.0/24, 10.0.22.0/24, 10.0.23.0/24
  - RDS, ElastiCache
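
A CIDR plan like this can be checked with Python's standard `ipaddress` module before it goes into a template. The sketch below verifies that every subnet sits inside the VPC range and that no two subnets overlap. The function names are illustrative, and the usable-address count relies on AWS reserving five addresses in each subnet.

```python
import ipaddress
from itertools import combinations

VPC_CIDR = "10.0.0.0/16"
TIERS = {
    "public": ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"],
    "private": ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"],
    "database": ["10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24"],
}


def validate_layout(vpc_cidr: str, tiers: dict) -> list:
    """Return a list of problems; an empty list means the plan is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    nets = [ipaddress.ip_network(c) for cidrs in tiers.values() for c in cidrs]
    problems = [f"{n} is outside {vpc}" for n in nets if not n.subnet_of(vpc)]
    problems += [f"{a} overlaps {b}" for a, b in combinations(nets, 2) if a.overlaps(b)]
    return problems


def aws_usable_addresses(cidr: str) -> int:
    """Addresses available to resources after the five AWS reserves per subnet."""
    return ipaddress.ip_network(cidr).num_addresses - 5
```

The layout above validates cleanly, and each /24 subnet leaves 251 addresses for resources.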

VPC Endpoints (Cost Savings)

# Interface endpoint for Secrets Manager
SecretsManagerEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    VpcId: !Ref VPC
    ServiceName: !Sub com.amazonaws.${AWS::Region}.secretsmanager
    VpcEndpointType: Interface
    SubnetIds: !Ref PrivateSubnets
    SecurityGroupIds:
      - !Ref EndpointSecurityGroup

Security and Identity

IAM Best Practices

Least privilege policy example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Query"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/users",
      "Condition": {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": ["${aws:userid}"]
        }
      }
    }
  ]
}
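
When the same policy shape is needed for several tables, generating it avoids copy-paste errors in the ARN. The builder below is a sketch: the function name is hypothetical, and it produces only the three actions and the `LeadingKeys` condition shown above. It also checks that the account ID has 12 digits, which is the length AWS account IDs use.

```python
import re


def dynamodb_read_write_policy(region: str, account_id: str, table_name: str) -> dict:
    """Build a policy limiting a caller to items whose partition key is their user ID."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are exactly 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
                "Resource": f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}",
                "Condition": {
                    "ForAllValues:StringEquals": {
                        # IAM substitutes this policy variable at evaluation time
                        "dynamodb:LeadingKeys": ["${aws:userid}"]
                    }
                },
            }
        ],
    }
```

Passing the result through `json.dumps` gives a document that can be supplied wherever a policy JSON string is expected.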

Secrets Manager vs Parameter Store

| Factor | Secrets Manager | Parameter Store |
|---|---|---|
| Auto-rotation | Built-in | Manual |
| Cross-account | Yes | Limited |
| Pricing | $0.40/secret/month | Free (standard) |
| Use case | Credentials, API keys | Config, non-secrets |

Cognito Configuration

UserPool:
  Type: AWS::Cognito::UserPool
  Properties:
    UserPoolName: my-app-users
    AutoVerifiedAttributes:
      - email
    MfaConfiguration: OPTIONAL
    EnabledMfas:
      - SOFTWARE_TOKEN_MFA
    Policies:
      PasswordPolicy:
        MinimumLength: 12
        RequireLowercase: true
        RequireUppercase: true
        RequireNumbers: true
        RequireSymbols: true
    AccountRecoverySetting:
      RecoveryMechanisms:
        - Name: verified_email
          Priority: 1