---
name: aws-serverless
description: Specialized skill for building production-ready serverless applications on AWS. Covers Lambda functions, API Gateway, DynamoDB, SQS/SNS event-driven patterns, SAM/CDK deployment, and cold start optimization.
risk: unknown
source: vibeship-spawner-skills (Apache 2.0)
date_added: 2026-02-27
---

# AWS Serverless

Specialized skill for building production-ready serverless applications on AWS. Covers Lambda functions, API Gateway, DynamoDB, SQS/SNS event-driven patterns, SAM/CDK deployment, and cold start optimization.

## Principles

- Right-size memory and timeout (measure before optimizing)
- Minimize cold starts for latency-sensitive workloads
- Use SnapStart for Java/.NET functions
- Prefer HTTP API over REST API for simple use cases
- Design for failure with DLQs and retries
- Keep deployment packages small
- Use environment variables for configuration
- Implement structured logging with correlation IDs

## Patterns

### Lambda Handler Pattern

Proper Lambda function structure with error handling

**When to use**: Any Lambda function implementation, API handlers, event processors, scheduled tasks

```javascript
// Node.js Lambda Handler
// handler.js

// Initialize outside handler (reused across invocations)
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, GetCommand } = require('@aws-sdk/lib-dynamodb');

const client = new DynamoDBClient({});
const docClient = DynamoDBDocumentClient.from(client);

// Handler function
exports.handler = async (event, context) => {
  // Optional: Don't wait for event loop to clear (Node.js)
  context.callbackWaitsForEmptyEventLoop = false;

  try {
    // Parse input based on event source
    const body = typeof event.body === 'string' ?
      JSON.parse(event.body) : event.body;

    // Business logic
    const result = await processRequest(body);

    // Return API Gateway compatible response
    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*'
      },
      body: JSON.stringify(result)
    };
  } catch (error) {
    console.error('Error:', JSON.stringify({
      error: error.message,
      stack: error.stack,
      requestId: context.awsRequestId
    }));

    return {
      statusCode: error.statusCode || 500,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        error: error.message || 'Internal server error'
      })
    };
  }
};

async function processRequest(data) {
  // Your business logic here
  const result = await docClient.send(new GetCommand({
    TableName: process.env.TABLE_NAME,
    Key: { id: data.id }
  }));
  return result.Item;
}
```

```python
# Python Lambda Handler
# handler.py
import json
import os
import logging
import boto3
from botocore.exceptions import ClientError

# Initialize outside handler (reused across invocations)
logger = logging.getLogger()
logger.setLevel(logging.INFO)

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

def handler(event, context):
    try:
        # Parse input (body may be missing or None, e.g. on GET requests)
        raw_body = event.get('body') or '{}'
        body = json.loads(raw_body) if isinstance(raw_body, str) else raw_body

        # Business logic
        result = process_request(body)

        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            # default=str serializes the Decimal values boto3 returns for numbers
            'body': json.dumps(result, default=str)
        }
    except ClientError as e:
        logger.error(f"DynamoDB error: {e.response['Error']['Message']}")
        return error_response(500, 'Database error')
    except json.JSONDecodeError:
        return error_response(400, 'Invalid JSON')
    except Exception as e:
        logger.error(f"Unexpected error: {str(e)}", exc_info=True)
        return error_response(500, 'Internal server error')

def process_request(data):
    response = table.get_item(Key={'id': data['id']})
    return response.get('Item')

def error_response(status_code, message):
    return {
        'statusCode': status_code,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'error': message})
    }
```

### Best Practices

- Initialize clients outside handler (reused across warm invocations)
- Always return proper API Gateway response format
- Log with structured JSON for CloudWatch Insights
- Include request ID in error logs for tracing

### API Gateway Integration Pattern

REST API and HTTP API integration with Lambda

**When to use**: Building REST APIs backed by Lambda, Need HTTP endpoints for functions

```yaml
# template.yaml (SAM)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: nodejs20.x
    Timeout: 30
    MemorySize: 256
    Environment:
      Variables:
        TABLE_NAME: !Ref ItemsTable

Resources:
  # HTTP API (recommended for simple use cases)
  HttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      StageName: prod
      CorsConfiguration:
        AllowOrigins:
          - "*"
        AllowMethods:
          - GET
          - POST
          - DELETE
        AllowHeaders:
          - "*"

  # Lambda Functions
  GetItemFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/handlers/get.handler
      Events:
        GetItem:
          Type: HttpApi
          Properties:
            ApiId: !Ref HttpApi
            Path: /items/{id}
            Method: GET
      Policies:
        - DynamoDBReadPolicy:
            TableName: !Ref ItemsTable

  CreateItemFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/handlers/create.handler
      Events:
        CreateItem:
          Type: HttpApi
          Properties:
            ApiId: !Ref HttpApi
            Path: /items
            Method: POST
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref ItemsTable

  # DynamoDB Table
  ItemsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST

Outputs:
  ApiUrl:
    Value: !Sub "https://${HttpApi}.execute-api.${AWS::Region}.amazonaws.com/prod"
```

```javascript
// src/handlers/get.js
const { getItem } = require('../lib/dynamodb');

exports.handler = async (event) => {
  const id = event.pathParameters?.id;

  if (!id) {
    return {
      statusCode: 400,
      body:
        JSON.stringify({ error: 'Missing id parameter' })
    };
  }

  const item = await getItem(id);

  if (!item) {
    return {
      statusCode: 404,
      body: JSON.stringify({ error: 'Item not found' })
    };
  }

  return {
    statusCode: 200,
    body: JSON.stringify(item)
  };
};
```

### Structure

    project/
    ├── template.yaml      # SAM template
    ├── src/
    │   ├── handlers/
    │   │   ├── get.js
    │   │   ├── create.js
    │   │   └── delete.js
    │   └── lib/
    │       └── dynamodb.js
    └── events/
        └── event.json     # Test events

### API Comparison

- HTTP API:
  - Lower latency (~10ms)
  - Lower cost (50-70% cheaper)
  - Simpler, fewer features
  - Best for: Most REST APIs
- REST API:
  - More features (caching, request validation, WAF)
  - Usage plans and API keys
  - Request/response transformation
  - Best for: Complex APIs, enterprise features

### Event-Driven SQS Pattern

Lambda triggered by SQS for reliable async processing

**When to use**: Decoupled, asynchronous processing, Need retry logic and DLQ, Processing messages in batches

```yaml
# template.yaml
Resources:
  ProcessorFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/handlers/processor.handler
      Events:
        SQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt ProcessingQueue.Arn
            BatchSize: 10
            FunctionResponseTypes:
              - ReportBatchItemFailures  # Partial batch failure handling

  ProcessingQueue:
    Type: AWS::SQS::Queue
    Properties:
      VisibilityTimeout: 180  # 6x Lambda timeout (assumes a 30-second function timeout)
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt DeadLetterQueue.Arn
        maxReceiveCount: 3

  DeadLetterQueue:
    Type: AWS::SQS::Queue
    Properties:
      MessageRetentionPeriod: 1209600  # 14 days
```

```javascript
// src/handlers/processor.js
exports.handler = async (event) => {
  const batchItemFailures = [];

  for (const record of event.Records) {
    try {
      const body = JSON.parse(record.body);
      await processMessage(body);
    } catch (error) {
      console.error(`Failed to process message ${record.messageId}:`, error);
      // Report this item as failed (will be retried)
      batchItemFailures.push({
        itemIdentifier: record.messageId
      });
    }
  }

  // Return failed items for retry
  return {
    batchItemFailures
  };
};

async function processMessage(message) {
  // Your processing logic
  console.log('Processing:', message);
  // Simulate work
  await saveToDatabase(message);
}
```

```python
# Python version
import json
import logging

logger = logging.getLogger()

def handler(event, context):
    batch_item_failures = []

    for record in event['Records']:
        try:
            body = json.loads(record['body'])
            process_message(body)
        except Exception as e:
            logger.error(f"Failed to process {record['messageId']}: {e}")
            batch_item_failures.append({
                'itemIdentifier': record['messageId']
            })

    return {'batchItemFailures': batch_item_failures}
```

### Best Practices

- Set VisibilityTimeout to at least 6x the Lambda timeout
- Use ReportBatchItemFailures for partial batch failure
- Always configure a DLQ for poison messages
- Process messages idempotently

### DynamoDB Streams Pattern

React to DynamoDB table changes with Lambda

**When to use**: Real-time reactions to data changes, Cross-region replication, Audit logging, notifications

```yaml
# template.yaml
Resources:
  ItemsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: items
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      BillingMode: PAY_PER_REQUEST
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES

  StreamProcessorFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/handlers/stream.handler
      Events:
        Stream:
          Type: DynamoDB
          Properties:
            Stream: !GetAtt ItemsTable.StreamArn
            StartingPosition: TRIM_HORIZON
            BatchSize: 100
            MaximumRetryAttempts: 3
            DestinationConfig:
              OnFailure:
                Destination: !GetAtt StreamDLQ.Arn

  StreamDLQ:
    Type: AWS::SQS::Queue
```

```javascript
// src/handlers/stream.js
exports.handler = async (event) => {
  for (const record of event.Records) {
    const eventName = record.eventName; // INSERT, MODIFY, REMOVE

    // Unmarshall DynamoDB format to plain JS objects
    const newImage = record.dynamodb.NewImage ?
      unmarshall(record.dynamodb.NewImage) : null;
    const oldImage = record.dynamodb.OldImage ?
      unmarshall(record.dynamodb.OldImage) : null;

    console.log(`${eventName}: `, { newImage, oldImage });

    switch (eventName) {
      case 'INSERT':
        await handleInsert(newImage);
        break;
      case 'MODIFY':
        await handleModify(oldImage, newImage);
        break;
      case 'REMOVE':
        await handleRemove(oldImage);
        break;
    }
  }
};

// Use AWS SDK v3 unmarshall
const { unmarshall } = require('@aws-sdk/util-dynamodb');
```

### Stream View Types

- KEYS_ONLY: Only key attributes
- NEW_IMAGE: After modification
- OLD_IMAGE: Before modification
- NEW_AND_OLD_IMAGES: Both before and after

### Cold Start Optimization Pattern

Minimize Lambda cold start latency

**When to use**: Latency-sensitive applications, User-facing APIs, High-traffic functions

## 1. Optimize Package Size

```javascript
// Use modular AWS SDK v3 imports

// GOOD - only imports what you need
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, GetCommand } = require('@aws-sdk/lib-dynamodb');

// BAD - imports entire SDK
const AWS = require('aws-sdk'); // Don't do this!
```

## 2. Use SnapStart (Java/.NET)

```yaml
# template.yaml
Resources:
  JavaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: com.example.Handler::handleRequest
      Runtime: java21
      SnapStart:
        ApplyOn: PublishedVersions  # Enable SnapStart
      AutoPublishAlias: live
```

## 3. Right-size Memory

```yaml
# More memory = more CPU = faster init
Resources:
  FastFunction:
    Type: AWS::Serverless::Function
    Properties:
      MemorySize: 1024  # CPU scales with memory; one full vCPU is reached at 1,769 MB
      Timeout: 30
```

## 4. Provisioned Concurrency (when needed)

```yaml
Resources:
  CriticalFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/handlers/critical.handler
      AutoPublishAlias: live  # Provisioned concurrency is applied to this alias
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 5
```

## 5. Keep Init Light

```python
# GOOD - Lazy initialization
import os
import boto3

_table = None

def get_table():
    global _table
    if _table is None:
        dynamodb = boto3.resource('dynamodb')
        _table = dynamodb.Table(os.environ['TABLE_NAME'])
    return _table

def handler(event, context):
    table = get_table()  # Only initializes on first use
    # ...
```

### Optimization Priority

1. Reduce package size (biggest impact)
2. Use SnapStart for Java/.NET
3. Increase memory for faster init
4. Delay heavy imports
5. Provisioned concurrency (last resort)

### SAM Local Development Pattern

Local testing and debugging with SAM CLI

**When to use**: Local development and testing, Debugging Lambda functions, Testing API Gateway locally

```bash
# Install SAM CLI
pip install aws-sam-cli

# Initialize new project
sam init --runtime nodejs20.x --name my-api

# Build the project
sam build

# Run locally
sam local start-api

# Invoke single function
sam local invoke GetItemFunction --event events/get.json

# Local debugging (Node.js with VS Code)
sam local invoke --debug-port 5858 GetItemFunction

# Deploy
sam deploy --guided
```

```json
// events/get.json (test event)
{
  "pathParameters": { "id": "123" },
  "httpMethod": "GET",
  "path": "/items/123"
}
```

```json
// .vscode/launch.json (for debugging)
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Attach to SAM CLI",
      "type": "node",
      "request": "attach",
      "address": "localhost",
      "port": 5858,
      "localRoot": "${workspaceRoot}/src",
      "remoteRoot": "/var/task/src",
      "protocol": "inspector"
    }
  ]
}
```

### Commands

- `sam build`: Build Lambda deployment packages
- `sam local start-api`: Start local API Gateway
- `sam local invoke`: Invoke single function
- `sam deploy`: Deploy to AWS
- `sam logs`: Tail CloudWatch logs

### CDK Serverless Pattern

Infrastructure as code with AWS CDK

**When to use**: Complex infrastructure beyond Lambda, Prefer programming languages over YAML, Need reusable constructs

```typescript
// lib/api-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as lambda from
  'aws-cdk-lib/aws-lambda';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

export class ApiStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // DynamoDB Table
    const table = new dynamodb.Table(this, 'ItemsTable', {
      partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      removalPolicy: cdk.RemovalPolicy.DESTROY, // For dev only
    });

    // Lambda Function
    const getItemFn = new lambda.Function(this, 'GetItemFunction', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'get.handler',
      code: lambda.Code.fromAsset('src/handlers'),
      environment: {
        TABLE_NAME: table.tableName,
      },
      memorySize: 256,
      timeout: cdk.Duration.seconds(30),
    });

    // Grant permissions
    table.grantReadData(getItemFn);

    // API Gateway
    const api = new apigateway.RestApi(this, 'ItemsApi', {
      restApiName: 'Items Service',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS,
      },
    });

    const items = api.root.addResource('items');
    const item = items.addResource('{id}');
    item.addMethod('GET', new apigateway.LambdaIntegration(getItemFn));

    // Output API URL
    new cdk.CfnOutput(this, 'ApiUrl', {
      value: api.url,
    });
  }
}
```

```bash
# CDK commands
npm install -g aws-cdk
cdk init app --language typescript
cdk synth    # Generate CloudFormation
cdk diff     # Show changes
cdk deploy   # Deploy to AWS
```

## Sharp Edges

### Cold Start INIT Phase Now Billed (Aug 2025)

Severity: HIGH

Situation: Running Lambda functions in production

Symptoms:
- Unexplained increase in Lambda costs (10-50% higher).
- Bill includes charges for function initialization.
- Functions with heavy startup logic cost more than expected.

Why this breaks:
As of August 1, 2025, AWS bills the INIT phase the same way it bills invocation duration.
Previously, the INIT phase was not billed for on-demand functions packaged as ZIP archives on managed runtimes.

This affects functions with:
- Heavy dependency loading (large packages)
- Slow initialization code
- Frequent cold starts (low traffic or poor concurrency)

Cold starts now directly impact your bill, not just latency.

Recommended fix:

## Measure your INIT phase

```bash
# Check CloudWatch Logs for INIT_REPORT
# Look for Init Duration in milliseconds

# Example log line:
# INIT_REPORT Init Duration: 423.45 ms
```

## Reduce INIT duration

```javascript
// 1. Minimize package size
// Use tree shaking, exclude dev dependencies
// npm prune --production

// 2. Lazy load heavy dependencies
let heavyLib = null;
function getHeavyLib() {
  if (!heavyLib) {
    heavyLib = require('heavy-library');
  }
  return heavyLib;
}

// 3. Use AWS SDK v3 modular imports
const { S3Client } = require('@aws-sdk/client-s3');
// NOT: const AWS = require('aws-sdk');
```

## Use SnapStart for Java/.NET

```yaml
Resources:
  JavaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: java21
      SnapStart:
        ApplyOn: PublishedVersions
```

## Monitor cold start frequency

```javascript
// Track cold starts with custom metric
let isColdStart = true;

exports.handler = async (event) => {
  if (isColdStart) {
    console.log('COLD_START');
    // CloudWatch custom metric here
    isColdStart = false;
  }
  // ...
};
```

### Lambda Timeout Misconfiguration

Severity: HIGH

Situation: Running Lambda functions, especially with external calls

Symptoms:
- Function times out unexpectedly.
- "Task timed out after X seconds" in logs.
- Partial processing with no response.
- Silent failures with no error caught.

Why this breaks:
Default Lambda timeout is only 3 seconds. Maximum is 15 minutes.

Common timeout causes:
- Default timeout too short for workload
- Downstream service taking longer than expected
- Network issues in VPC
- Infinite loops or blocking operations
- S3 downloads larger than expected

Lambda terminates at timeout without graceful shutdown.
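
Because the runtime is stopped without a graceful shutdown, a handler has to watch its own time budget. The following is a minimal Python sketch of that idea, not a definitive implementation: `FakeContext`, `process_with_budget`, and the 10-second safety margin are hypothetical names and values chosen for illustration, while `get_remaining_time_in_millis()` is the method the real Lambda context object exposes in Python.

```python
import time


class FakeContext:
    """Stand-in for the Lambda context object, for local experimentation only."""

    def __init__(self, budget_ms):
        self._deadline = time.monotonic() + budget_ms / 1000

    def get_remaining_time_in_millis(self):
        return max(0, int((self._deadline - time.monotonic()) * 1000))


def process_with_budget(items, context, process_item, safety_margin_ms=10_000):
    """Process items until the remaining time drops below the safety margin.

    Returns (processed, pending) so the caller can persist or re-queue the rest.
    """
    processed = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            # Stop early and hand back the unprocessed tail
            return processed, list(items[index:])
        processed.append(process_item(item))
    return processed, []
```

Returning the pending items, rather than raising, leaves the caller free to decide whether to save progress, re-queue the remainder, or fail the invocation.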

Recommended fix:

## Set appropriate timeout

```yaml
# template.yaml
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Timeout: 30  # Seconds (max 900)
      # Set to expected duration + buffer
```

## Implement timeout awareness

```javascript
exports.handler = async (event, context) => {
  // Get remaining time
  const remainingTime = context.getRemainingTimeInMillis();

  // If running low on time, fail gracefully
  if (remainingTime < 5000) {
    console.warn('Running low on time, aborting');
    throw new Error('Insufficient time remaining');
  }

  // For long operations, check periodically
  const items = event.items ?? []; // Assumes the work list arrives on the event
  const processedItems = [];
  for (const item of items) {
    if (context.getRemainingTimeInMillis() < 10000) {
      // Save progress and exit gracefully
      await saveProgress(processedItems);
      throw new Error('Timeout approaching, saved progress');
    }
    await processItem(item);
    processedItems.push(item);
  }
};
```

## Set downstream timeouts

```javascript
const axios = require('axios');

// Always set timeouts on HTTP calls
const response = await axios.get('https://api.example.com/data', {
  timeout: 5000 // 5 seconds
});
```

### Out of Memory (OOM) Crash

Severity: HIGH

Situation: Lambda function processing data

Symptoms:
- Function stops abruptly without error.
- CloudWatch logs appear truncated.
- "Max Memory Used" hits configured limit.
- Inconsistent behavior under load.

Why this breaks:
When Lambda exceeds memory allocation, AWS forcibly terminates the runtime. This happens without raising a catchable exception.

Common causes:
- Processing large files in memory
- Memory leaks across invocations
- Buffering entire response bodies
- Heavy libraries consuming too much memory

Recommended fix:

## Increase memory allocation

```yaml
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      MemorySize: 1024  # MB (128-10240)
      # More memory = more CPU too
```

## Stream large data

```javascript
// BAD - loads entire file into memory (SDK v2 style)
const data = await s3.getObject(params).promise();
const content = data.Body.toString();

// GOOD - stream processing
const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3');
const s3Client = new S3Client({});

const response = await s3Client.send(new GetObjectCommand(params));
const stream = response.Body;

// Process stream in chunks
for await (const chunk of stream) {
  await processChunk(chunk);
}
```

## Monitor memory usage

```javascript
exports.handler = async (event, context) => {
  const used = process.memoryUsage();
  console.log('Memory:', {
    heapUsed: Math.round(used.heapUsed / 1024 / 1024) + 'MB',
    heapTotal: Math.round(used.heapTotal / 1024 / 1024) + 'MB'
  });
  // ...
};
```

## Use Lambda Power Tuning

```bash
# Find optimal memory setting
# https://github.com/alexcasalboni/aws-lambda-power-tuning
```

### VPC-Attached Lambda Cold Start Delay

Severity: MEDIUM

Situation: Lambda functions in VPC accessing private resources

Symptoms:
- Extremely slow cold starts (was 10+ seconds, now ~100ms).
- Timeouts on first invocation after idle period.
- Functions work in VPC but slow compared to non-VPC.

Why this breaks:
Lambda functions in VPC need Elastic Network Interfaces (ENIs).
AWS improved this significantly with Hyperplane ENIs, but:
- First cold start in VPC still has overhead
- NAT Gateway issues can cause timeouts
- Security group misconfig blocks traffic
- DNS resolution can be slow

Recommended fix:

## Verify VPC configuration

```yaml
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      VpcConfig:
        SecurityGroupIds:
          - !Ref LambdaSecurityGroup
        SubnetIds:
          - !Ref PrivateSubnet1
          - !Ref PrivateSubnet2  # Multiple AZs

  LambdaSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Lambda SG
      VpcId: !Ref VPC
      SecurityGroupEgress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0  # Allow HTTPS outbound
```

## Use VPC endpoints for AWS services

```yaml
# Avoid NAT Gateway for AWS service calls
DynamoDBEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.dynamodb
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref PrivateRouteTable
    VpcEndpointType: Gateway

S3Endpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
    VpcId: !Ref VPC
    RouteTableIds:
      - !Ref PrivateRouteTable  # Gateway endpoints only take effect via route tables
    VpcEndpointType: Gateway
```

## Only use VPC when necessary

Don't attach Lambda to VPC unless you need:
- Access to RDS/ElastiCache in VPC
- Access to private EC2 instances
- Compliance requirements

Most AWS services can be accessed without VPC.

### Node.js Event Loop Not Cleared

Severity: MEDIUM

Situation: Node.js Lambda function with callbacks or timers

Symptoms:
- Function takes full timeout duration to return.
- "Task timed out" even though logic completed.
- Extra billing for idle time.

Why this breaks:
With callback-style (non-async) handlers, Lambda by default waits for the Node.js event loop to be empty before returning.

If you have:
- Unresolved setTimeout/setInterval
- Dangling database connections
- Pending callbacks

Lambda waits until timeout, even if your response was ready.

Recommended fix:

## Tell Lambda not to wait for event loop

```javascript
exports.handler = async (event, context) => {
  // Don't wait for event loop to clear
  context.callbackWaitsForEmptyEventLoop = false;

  // Your code here
  const result = await processRequest(event);

  return {
    statusCode: 200,
    body: JSON.stringify(result)
  };
};
```

## Close connections properly

```javascript
// For database connections, use connection pooling
// or close connections explicitly
const mysql = require('mysql2/promise');

exports.handler = async (event, context) => {
  context.callbackWaitsForEmptyEventLoop = false;

  const connection = await mysql.createConnection({ /* ... */ });

  try {
    const [rows] = await connection.query('SELECT * FROM users');
    return { statusCode: 200, body: JSON.stringify(rows) };
  } finally {
    await connection.end(); // Always close
  }
};
```

### API Gateway Payload Size Limits

Severity: MEDIUM

Situation: Returning large responses or receiving large requests

Symptoms:
- "413 Request Entity Too Large" error
- "Execution failed due to configuration error: Malformed Lambda proxy response"
- Response truncated or failed

Why this breaks:
API Gateway has hard payload limits:
- REST API: 10 MB request/response
- HTTP API: 10 MB request/response
- Lambda itself: 6 MB sync request/response, 256 KB async

Exceeding these causes failures that may not be obvious.
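
Since the failure mode is not obvious, it can help to measure a response before returning it. The following is a minimal Python sketch under stated assumptions: `response_size_bytes`, `choose_delivery`, and the 10% headroom are hypothetical names and values for illustration, and the 6 MB figure is taken from the limits listed above.

```python
import json

# Assumed from the limits listed above: 6 MB for a synchronous Lambda response
SYNC_RESPONSE_LIMIT_BYTES = 6 * 1024 * 1024


def response_size_bytes(payload):
    """Return the UTF-8 size of the JSON-serialized payload."""
    return len(json.dumps(payload).encode("utf-8"))


def choose_delivery(payload, limit_bytes=SYNC_RESPONSE_LIMIT_BYTES, headroom=0.9):
    """Return 'inline' if the payload fits under the limit with headroom, else 's3'."""
    if response_size_bytes(payload) <= limit_bytes * headroom:
        return "inline"
    return "s3"
```

The headroom leaves space for the status code and headers that wrap the body in a proxy response. A handler could call `choose_delivery(result)` and fall back to a presigned S3 URL whenever it returns `'s3'`.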

Recommended fix:

## For large file uploads

```javascript
// Use presigned S3 URLs instead of passing through API Gateway
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

exports.handler = async (event) => {
  const s3 = new S3Client({});

  const command = new PutObjectCommand({
    Bucket: process.env.BUCKET_NAME,
    Key: `uploads/${Date.now()}.file`
  });

  const uploadUrl = await getSignedUrl(s3, command, { expiresIn: 300 });

  return {
    statusCode: 200,
    body: JSON.stringify({ uploadUrl })
  };
};
```

## For large responses

```javascript
// Store in S3, return presigned download URL
const { S3Client, PutObjectCommand, GetObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

const s3 = new S3Client({});

exports.handler = async (event) => {
  const largeData = await generateLargeReport();
  const reportId = Date.now(); // Illustrative ID; substitute your own scheme

  await s3.send(new PutObjectCommand({
    Bucket: process.env.BUCKET_NAME,
    Key: `reports/${reportId}.json`,
    Body: JSON.stringify(largeData)
  }));

  const downloadUrl = await getSignedUrl(s3,
    new GetObjectCommand({
      Bucket: process.env.BUCKET_NAME,
      Key: `reports/${reportId}.json`
    }),
    { expiresIn: 3600 }
  );

  return {
    statusCode: 200,
    body: JSON.stringify({ downloadUrl })
  };
};
```

### Infinite Loop or Recursive Invocation

Severity: HIGH

Situation: Lambda triggered by events

Symptoms:
- Runaway costs.
- Thousands of invocations in minutes.
- CloudWatch logs show repeated invocations.
- Lambda writing to source bucket/table that triggers it.
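
The last symptom, a function writing back to the location that triggers it, can be guarded against inside the handler. The following is a minimal Python sketch, assuming an S3 event shape and a hypothetical `processed/` output prefix; `keys_to_process` is an illustrative helper, not part of any AWS library. Object keys in S3 event notifications are URL-encoded, so the sketch decodes them before comparing prefixes.

```python
from urllib.parse import unquote_plus

PROCESSED_PREFIX = "processed/"  # Hypothetical output prefix


def keys_to_process(event, skip_prefix=PROCESSED_PREFIX):
    """Return decoded object keys from an S3 event, skipping the function's own output."""
    keys = []
    for record in event.get("Records", []):
        # Keys arrive URL-encoded, with '+' standing in for spaces
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(skip_prefix):
            continue  # Written by this function: do not process again
        keys.append(key)
    return keys
```

A guard like this only limits the damage; a trigger filter or a separate output bucket stops the extra invocations from happening at all.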

Why this breaks:
Lambda can accidentally trigger itself:
- S3 trigger writes back to same bucket
- DynamoDB trigger updates same table
- SNS publishes to topic that triggers it
- Step Functions with wrong error handling

Recommended fix:

## Use different buckets/prefixes

```yaml
# S3 trigger with prefix filter
Events:
  S3Event:
    Type: S3
    Properties:
      Bucket: !Ref InputBucket
      Events: s3:ObjectCreated:*
      Filter:
        S3Key:
          Rules:
            - Name: prefix
              Value: uploads/  # Only trigger on uploads/

# Output to different bucket or prefix
# OutputBucket or processed/ prefix
```

## Add idempotency checks

```javascript
exports.handler = async (event) => {
  for (const record of event.Records) {
    // S3 event keys are URL-encoded ('+' stands for a space)
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    // Skip if this is a processed file
    if (key.startsWith('processed/')) {
      console.log('Skipping already processed file:', key);
      continue;
    }

    // Process and write to different location
    const result = await processFile(key);
    await writeToS3(`processed/${key}`, result);
  }
};
```

## Set reserved concurrency as circuit breaker

```yaml
Resources:
  RiskyFunction:
    Type: AWS::Serverless::Function
    Properties:
      ReservedConcurrentExecutions: 10  # Max 10 parallel
      # Limits blast radius of runaway invocations
```

## Monitor with CloudWatch alarms

```yaml
InvocationAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    MetricName: Invocations
    Namespace: AWS/Lambda
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 1000  # Alert if >1000 invocations/min
    ComparisonOperator: GreaterThanThreshold
```

## Validation Checks

### Hardcoded AWS Credentials

Severity: ERROR

AWS credentials must never be hardcoded

Message: Hardcoded AWS access key detected. Use IAM roles or environment variables.

### AWS Secret Key in Source Code

Severity: ERROR

Secret keys should use Secrets Manager or environment variables

Message: Hardcoded AWS secret key. Use IAM roles or Secrets Manager.

### Overly Permissive IAM Policy

Severity: WARNING

Avoid wildcard permissions in Lambda IAM roles

Message: Overly permissive IAM policy.
Use least privilege principle.

### Lambda Handler Without Error Handling

Severity: WARNING

Lambda handlers should have try/catch for graceful errors

Message: Lambda handler without error handling. Add try/catch.

### Missing callbackWaitsForEmptyEventLoop

Severity: INFO

Node.js handlers should set callbackWaitsForEmptyEventLoop

Message: Consider setting context.callbackWaitsForEmptyEventLoop = false

### Default Memory Configuration

Severity: INFO

Default 128MB may be too low for many workloads

Message: Using default 128MB memory. Consider increasing for better performance.

### Low Timeout Configuration

Severity: WARNING

Very low timeout may cause unexpected failures

Message: Timeout of 1-3 seconds may be too low. Increase if making external calls.

### No Dead Letter Queue Configuration

Severity: WARNING

Async functions should have DLQ for failed invocations

Message: No DLQ configured. Add for async invocations.

### Importing Full AWS SDK v2

Severity: WARNING

Import specific clients from AWS SDK v3 for smaller packages

Message: Importing full AWS SDK. Use modular SDK v3 imports for smaller packages.

### Hardcoded DynamoDB Table Name

Severity: WARNING

Table names should come from environment variables

Message: Hardcoded table name. Use environment variable for portability.

## Collaboration

### Delegation Triggers

- user needs GCP serverless -> gcp-cloud-run (Cloud Run for containers, Cloud Functions for events)
- user needs Azure serverless -> azure-functions (Azure Functions, Logic Apps)
- user needs database design -> postgres-wizard (RDS design, or use DynamoDB patterns)
- user needs authentication -> auth-specialist (Cognito, API Gateway authorizers)
- user needs complex workflows -> workflow-automation (Step Functions, EventBridge)
- user needs AI integration -> llm-architect (Lambda calling Bedrock or external LLMs)

## When to Use

Use this skill when the request clearly matches the capabilities and patterns described above.