secrets-vault-manager (403-line SKILL.md, 3 scripts, 3 references): - HashiCorp Vault, AWS SM, Azure KV, GCP SM integration - Secret rotation, dynamic secrets, audit logging, emergency procedures sql-database-assistant (457-line SKILL.md, 3 scripts, 3 references): - Query optimization, migration generation, schema exploration - Multi-DB support (PostgreSQL, MySQL, SQLite, SQL Server) - ORM patterns (Prisma, Drizzle, TypeORM, SQLAlchemy) gcp-cloud-architect (418-line SKILL.md, 3 scripts, 3 references): - 6-step workflow mirroring aws-solution-architect for GCP - Cloud Run, GKE, BigQuery, Cloud Functions, cost optimization - Completes cloud trifecta (AWS + Azure + GCP) soc2-compliance (417-line SKILL.md, 3 scripts, 3 references): - SOC 2 Type I & II preparation, Trust Service Criteria mapping - Control matrix generation, evidence tracking, gap analysis - First SOC 2 skill in ra-qm-team (joins GDPR, ISO 27001, ISO 13485) All 12 scripts pass --help. Docs generated, mkdocs.yml nav updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
548 lines
14 KiB
Markdown
548 lines
14 KiB
Markdown
# GCP Service Selection Guide
|
|
|
|
Quick reference for choosing the right GCP service based on requirements.
|
|
|
|
---
|
|
|
|
## Table of Contents
|
|
|
|
- [Compute Services](#compute-services)
|
|
- [Database Services](#database-services)
|
|
- [Storage Services](#storage-services)
|
|
- [Messaging and Events](#messaging-and-events)
|
|
- [API and Integration](#api-and-integration)
|
|
- [Networking](#networking)
|
|
- [Security and Identity](#security-and-identity)
|
|
|
|
---
|
|
|
|
## Compute Services
|
|
|
|
### Decision Matrix
|
|
|
|
| Requirement | Recommended Service |
|
|
|-------------|---------------------|
|
|
| HTTP-triggered containers, auto-scaling | Cloud Run |
|
|
| Event-driven, short tasks (<9 min) | Cloud Functions (2nd gen) |
|
|
| Kubernetes workloads, microservices | GKE Autopilot |
|
|
| Custom VMs, GPU/TPU | Compute Engine |
|
|
| Batch processing, HPC | Batch |
|
|
| Kubernetes with full control | GKE Standard |
|
|
|
|
### Cloud Run
|
|
|
|
**Best for:** Containerized HTTP services, APIs, web backends
|
|
|
|
```
|
|
Limits:
|
|
- vCPU: 1-8 per instance
|
|
- Memory: 128 MiB - 32 GiB
|
|
- Request timeout: 3600 seconds
|
|
- Concurrency: 1-1000 per instance
|
|
- Min instances: 0 (scale-to-zero)
|
|
- Max instances: 1000
|
|
|
|
Pricing: Per vCPU-second + GiB-second (free tier: 2M requests/month)
|
|
```
|
|
|
|
**Use when:**
|
|
- Containerized apps with HTTP endpoints
|
|
- Variable/unpredictable traffic
|
|
- Want scale-to-zero capability
|
|
- No Kubernetes expertise needed
|
|
|
|
**Avoid when:**
|
|
- Non-HTTP workloads (use Cloud Functions or GKE)
|
|
- Need GPU/TPU (use Compute Engine or GKE)
|
|
- Require persistent local storage
|
|
|
|
### Cloud Functions (2nd gen)
|
|
|
|
**Best for:** Event-driven functions, lightweight triggers, webhooks
|
|
|
|
```
|
|
Limits:
|
|
- Execution: 9 minutes max (2nd gen), 9 minutes (1st gen)
|
|
- Memory: 128 MB - 32 GB
|
|
- Concurrency: Up to 1000 per instance (2nd gen)
|
|
- Runtimes: Node.js, Python, Go, Java, .NET, Ruby, PHP
|
|
|
|
Pricing: $0.40 per million invocations + compute time
|
|
```
|
|
|
|
**Use when:**
|
|
- Event-driven processing (Pub/Sub, Cloud Storage, Firestore)
|
|
- Lightweight API endpoints
|
|
- Scheduled tasks (Cloud Scheduler triggers)
|
|
- Minimal infrastructure management
|
|
|
|
**Avoid when:**
|
|
- Long-running processes (>9 min)
|
|
- Complex multi-container apps
|
|
- Need fine-grained scaling control
|
|
|
|
### GKE Autopilot
|
|
|
|
**Best for:** Kubernetes workloads with managed node provisioning
|
|
|
|
```
|
|
Limits:
|
|
- Pod resources: 0.25-112 vCPU, 0.5-896 GiB memory
|
|
- GPU support: NVIDIA T4, L4, A100, H100
|
|
- Management fee: $0.10/hour per cluster ($74.40/month)
|
|
|
|
Pricing: Per pod vCPU-hour + GiB-hour (no node management)
|
|
```
|
|
|
|
**Use when:**
|
|
- Team has Kubernetes expertise
|
|
- Need pod-level resource control
|
|
- Multi-container services
|
|
- GPU workloads
|
|
|
|
### Compute Engine
|
|
|
|
**Best for:** Custom configurations, specialized hardware
|
|
|
|
```
|
|
Machine Types:
|
|
- General: e2, n2, n2d, c3
|
|
- Compute: c2, c2d
|
|
- Memory: m1, m2, m3
|
|
- Accelerator: a2 (GPU), a3 (GPU)
|
|
- Storage: z3
|
|
|
|
Pricing Options:
|
|
- On-demand, Spot (60-91% discount), Committed Use (37-55% discount)
|
|
```
|
|
|
|
**Use when:**
|
|
- Need GPU/TPU
|
|
- Windows workloads
|
|
- Specific hardware requirements
|
|
- Lift-and-shift migrations
|
|
|
|
---
|
|
|
|
## Database Services
|
|
|
|
### Decision Matrix
|
|
|
|
| Data Type | Query Pattern | Scale | Recommended |
|
|
|-----------|--------------|-------|-------------|
|
|
| Key-value, document | Simple lookups, real-time | Any | Firestore |
|
|
| Wide-column | High-throughput reads/writes | >1TB | Cloud Bigtable |
|
|
| Relational | Complex joins, ACID | Variable | Cloud SQL |
|
|
| Relational, global | Strong consistency, global | Large | Cloud Spanner |
|
|
| Time-series | Time-based queries | Any | Bigtable or BigQuery |
|
|
| Analytics, warehouse | SQL analytics | Petabytes | BigQuery |
|
|
|
|
### Firestore
|
|
|
|
**Best for:** Document data, mobile/web apps, real-time sync
|
|
|
|
```
|
|
Limits:
|
|
- Document size: 1 MiB max
|
|
- Field depth: 20 nested levels
|
|
- Write rate: 10,000 writes/sec per database
|
|
- Indexes: Automatic single-field, manual composite
|
|
|
|
Pricing:
|
|
- Reads: $0.036 per 100K reads
|
|
- Writes: $0.108 per 100K writes
|
|
- Storage: $0.108 per GiB/month
|
|
- Free tier: 50K reads, 20K writes, 1 GiB storage per day
|
|
```
|
|
|
|
**Use when:**
|
|
- Mobile/web apps needing offline sync
|
|
- Real-time data updates
|
|
- Flexible schema
|
|
- Serverless architecture
|
|
|
|
**Avoid when:**
|
|
- Complex SQL queries with joins
|
|
- Heavy analytics workloads
|
|
- Data >1 MiB per document
|
|
|
|
### Cloud SQL
|
|
|
|
**Best for:** Relational data with familiar SQL
|
|
|
|
| Engine | Version | Max Storage | Max Connections |
|
|
|--------|---------|-------------|-----------------|
|
|
| PostgreSQL | 15 | 64 TB | Instance-dependent |
|
|
| MySQL | 8.0 | 64 TB | Instance-dependent |
|
|
| SQL Server | 2022 | 64 TB | Instance-dependent |
|
|
|
|
```
|
|
Pricing:
|
|
- Machine type + storage + networking
|
|
- HA: 2x cost (regional instance)
|
|
- Read replicas: Per-replica pricing
|
|
```
|
|
|
|
**Use when:**
|
|
- Relational data with complex queries
|
|
- Existing SQL expertise
|
|
- Need ACID transactions
|
|
- Migration from on-premises databases
|
|
|
|
### Cloud Spanner
|
|
|
|
**Best for:** Globally distributed relational data
|
|
|
|
```
|
|
Limits:
|
|
- Storage: Unlimited
|
|
- Nodes: 1-100+ per instance
|
|
- Consistency: Strong global consistency
|
|
|
|
Pricing:
|
|
- Regional: $0.90/node-hour (~$657/month per node)
|
|
- Multi-region: $2.70/node-hour (~$1,971/month per node)
|
|
- Storage: $0.30/GiB/month
|
|
```
|
|
|
|
**Use when:**
|
|
- Global applications needing strong consistency
|
|
- Relational data at massive scale
|
|
- 99.999% availability requirement
|
|
- Horizontal scaling with SQL
|
|
|
|
### BigQuery
|
|
|
|
**Best for:** Analytics, data warehouse, SQL on massive datasets
|
|
|
|
```
|
|
Limits:
|
|
- Query: 6-hour timeout
|
|
- Concurrent queries: 100 default
|
|
- Streaming inserts: 100K rows/sec per table
|
|
|
|
Pricing:
|
|
- On-demand: $6.25 per TB queried (first 1 TB free/month)
|
|
- Editions: Autoscale slots starting at $0.04/slot-hour
|
|
- Storage: $0.02/GiB (active), $0.01/GiB (long-term)
|
|
```
|
|
|
|
### Firestore vs Cloud SQL vs Spanner
|
|
|
|
| Factor | Firestore | Cloud SQL | Cloud Spanner |
|
|
|--------|-----------|-----------|---------------|
|
|
| Query flexibility | Document-based | Full SQL | Full SQL |
|
|
| Scaling | Automatic | Vertical + read replicas | Horizontal |
|
|
| Consistency | Strong (single region) | ACID | Strong (global) |
|
|
| Cost model | Per-operation | Per-hour | Per-node-hour |
|
|
| Operational | Zero management | Managed (some ops) | Managed |
|
|
| Best for | Mobile/web apps | Traditional apps | Global scale |
|
|
|
|
---
|
|
|
|
## Storage Services
|
|
|
|
### Cloud Storage Classes
|
|
|
|
| Class | Access Pattern | Min Duration | Cost (GiB/mo) |
|
|
|-------|---------------|--------------|----------------|
|
|
| Standard | Frequent | None | $0.020 |
|
|
| Nearline | Monthly access | 30 days | $0.010 |
|
|
| Coldline | Quarterly access | 90 days | $0.004 |
|
|
| Archive | Annual access | 365 days | $0.0012 |
|
|
|
|
### Lifecycle Policy Example
|
|
|
|
```json
|
|
{
|
|
"lifecycle": {
|
|
"rule": [
|
|
{
|
|
"action": { "type": "SetStorageClass", "storageClass": "NEARLINE" },
|
|
"condition": { "age": 30, "matchesStorageClass": ["STANDARD"] }
|
|
},
|
|
{
|
|
"action": { "type": "SetStorageClass", "storageClass": "COLDLINE" },
|
|
"condition": { "age": 90, "matchesStorageClass": ["NEARLINE"] }
|
|
},
|
|
{
|
|
"action": { "type": "SetStorageClass", "storageClass": "ARCHIVE" },
|
|
"condition": { "age": 365, "matchesStorageClass": ["COLDLINE"] }
|
|
},
|
|
{
|
|
"action": { "type": "Delete" },
|
|
"condition": { "age": 2555 }
|
|
}
|
|
]
|
|
}
|
|
}
|
|
```
|
|
|
|
### Autoclass
|
|
|
|
Automatically transitions objects between storage classes based on access patterns. Recommended for mixed or unknown access patterns.
|
|
|
|
```bash
|
|
gsutil mb -l us-central1 --autoclass gs://my-bucket/
|
|
```
|
|
|
|
### Block and File Storage
|
|
|
|
| Service | Use Case | Access |
|
|
|---------|----------|--------|
|
|
| Persistent Disk | GCE/GKE block storage | Single instance (RW) or multi (RO) |
|
|
| Filestore | NFS shared file system | Multiple instances |
|
|
| Parallelstore | HPC parallel file system | High throughput |
|
|
| Cloud Storage FUSE | Mount GCS as filesystem | Any compute |
|
|
|
|
---
|
|
|
|
## Messaging and Events
|
|
|
|
### Decision Matrix
|
|
|
|
| Pattern | Service | Use Case |
|
|
|---------|---------|----------|
|
|
| Pub/sub messaging | Pub/Sub | Event streaming, microservice decoupling |
|
|
| Task queue | Cloud Tasks | Asynchronous task execution with retries |
|
|
| Workflow orchestration | Workflows | Multi-step service orchestration |
|
|
| Batch orchestration | Cloud Composer | Complex DAG-based pipelines (Airflow) |
|
|
| Event triggers | Eventarc | Route events to Cloud Run, GKE, Workflows |
|
|
|
|
### Pub/Sub
|
|
|
|
**Best for:** Event-driven architectures, stream processing
|
|
|
|
```
|
|
Limits:
|
|
- Message size: 10 MB max
|
|
- Throughput: Unlimited (auto-scaling)
|
|
- Retention: 7 days default (up to 31 days)
|
|
- Ordering: Per ordering key
|
|
|
|
Pricing: $40/TiB for message delivery
|
|
```
|
|
|
|
```python
|
|
# Pub/Sub publisher example
|
|
from google.cloud import pubsub_v1
|
|
import json
|
|
|
|
publisher = pubsub_v1.PublisherClient()
|
|
topic_path = publisher.topic_path('my-project', 'events')
|
|
|
|
def publish_event(event_type, payload):
|
|
data = json.dumps(payload).encode('utf-8')
|
|
future = publisher.publish(
|
|
topic_path,
|
|
data,
|
|
event_type=event_type
|
|
)
|
|
return future.result()
|
|
```
|
|
|
|
### Cloud Tasks
|
|
|
|
**Best for:** Asynchronous task execution with delivery guarantees
|
|
|
|
```
|
|
Features:
|
|
- Configurable retry policies
|
|
- Rate limiting
|
|
- Scheduled delivery
|
|
- HTTP and App Engine targets
|
|
|
|
Pricing: $0.40 per million operations
|
|
```
|
|
|
|
### Eventarc
|
|
|
|
**Best for:** Routing cloud events to services
|
|
|
|
```python
|
|
# Eventarc routes events from 130+ Google Cloud sources
|
|
# to Cloud Run, GKE, or Workflows
|
|
|
|
# Example: Trigger Cloud Run on Cloud Storage upload
|
|
# gcloud eventarc triggers create my-trigger \
|
|
# --destination-run-service=my-service \
|
|
# --event-filters="type=google.cloud.storage.object.v1.finalized" \
|
|
# --event-filters="bucket=my-bucket"
|
|
```
|
|
|
|
---
|
|
|
|
## API and Integration
|
|
|
|
### API Gateway vs Cloud Endpoints vs Cloud Run
|
|
|
|
| Factor | API Gateway | Cloud Endpoints | Cloud Run (direct) |
|
|
|--------|-------------|-----------------|---------------------|
|
|
| Protocol | REST, gRPC | REST, gRPC | Any HTTP |
|
|
| Auth | API keys, JWT, Firebase | API keys, JWT | IAM, custom |
|
|
| Rate limiting | Built-in | Built-in | Manual |
|
|
| Cost | Per-call pricing | Per-call pricing | Per-request |
|
|
| Best for | External APIs | Internal APIs | Simple services |
|
|
|
|
### Cloud Endpoints Configuration
|
|
|
|
```yaml
|
|
# openapi.yaml
|
|
swagger: "2.0"
|
|
info:
|
|
title: "My API"
|
|
version: "1.0.0"
|
|
host: "my-api-xyz.apigateway.my-project.cloud.goog"
|
|
schemes:
|
|
- "https"
|
|
paths:
|
|
/users:
|
|
get:
|
|
summary: "List users"
|
|
operationId: "listUsers"
|
|
x-google-backend:
|
|
address: "https://my-app-api-xyz.a.run.app"
|
|
security:
|
|
- api_key: []
|
|
securityDefinitions:
|
|
api_key:
|
|
type: "apiKey"
|
|
name: "key"
|
|
in: "query"
|
|
```
|
|
|
|
### Workflows
|
|
|
|
**Best for:** Orchestrating multi-service processes
|
|
|
|
```yaml
|
|
# workflow.yaml
|
|
main:
|
|
steps:
|
|
- processOrder:
|
|
call: http.post
|
|
args:
|
|
url: https://orders-service.run.app/process
|
|
body:
|
|
orderId: ${args.orderId}
|
|
result: orderResult
|
|
|
|
- checkInventory:
|
|
switch:
|
|
- condition: ${orderResult.body.inStock}
|
|
next: shipOrder
|
|
next: backOrder
|
|
|
|
- shipOrder:
|
|
call: http.post
|
|
args:
|
|
url: https://shipping-service.run.app/ship
|
|
body:
|
|
orderId: ${args.orderId}
|
|
result: shipResult
|
|
|
|
- backOrder:
|
|
call: http.post
|
|
args:
|
|
url: https://inventory-service.run.app/backorder
|
|
body:
|
|
orderId: ${args.orderId}
|
|
```
|
|
|
|
---
|
|
|
|
## Networking
|
|
|
|
### VPC Components
|
|
|
|
| Component | Purpose |
|
|
|-----------|---------|
|
|
| VPC | Isolated network (global resource) |
|
|
| Subnet | Regional network segment |
|
|
| Cloud NAT | Outbound internet for private instances |
|
|
| Cloud Router | Dynamic routing (BGP) |
|
|
| Private Google Access | Access GCP APIs without public IP |
|
|
| VPC Peering | Connect two VPC networks |
|
|
| Shared VPC | Share VPC across projects |
|
|
|
|
### VPC Design Pattern
|
|
|
|
```
|
|
VPC: 10.0.0.0/16 (global)
|
|
|
|
Subnet us-central1:
|
|
10.0.0.0/20 (primary)
|
|
10.4.0.0/14 (pods - secondary)
|
|
10.8.0.0/20 (services - secondary)
|
|
- GKE cluster, Cloud Run (VPC connector)
|
|
|
|
Subnet us-east1:
|
|
10.0.16.0/20 (primary)
|
|
- Cloud SQL (private IP), Memorystore
|
|
|
|
Subnet europe-west1:
|
|
10.0.32.0/20 (primary)
|
|
- DR / multi-region workloads
|
|
```
|
|
|
|
### Private Google Access
|
|
|
|
```bash
|
|
# Enable Private Google Access on a subnet
|
|
gcloud compute networks subnets update my-subnet \
|
|
--region=us-central1 \
|
|
--enable-private-google-access
|
|
```
|
|
|
|
---
|
|
|
|
## Security and Identity
|
|
|
|
### IAM Best Practices
|
|
|
|
```bash
|
|
# Prefer predefined roles over basic roles
|
|
# BAD: roles/editor (too broad)
|
|
# GOOD: roles/run.invoker (specific)
|
|
|
|
# Grant role to service account
|
|
gcloud projects add-iam-policy-binding my-project \
|
|
--member="serviceAccount:my-sa@my-project.iam.gserviceaccount.com" \
|
|
--role="roles/datastore.user" \
|
|
--condition='expression=resource.name.startsWith("projects/my-project/databases/(default)/documents/users"),title=firestore-users-only'
|
|
```
|
|
|
|
### Service Account Best Practices
|
|
|
|
| Practice | Description |
|
|
|----------|-------------|
|
|
| One SA per service | Separate service accounts per workload |
|
|
| Workload Identity | Bind K8s SAs to GCP SAs in GKE |
|
|
| Short-lived tokens | Use impersonation instead of key files |
|
|
| No SA keys | Avoid downloading JSON key files |
|
|
|
|
### Secret Manager vs Environment Variables
|
|
|
|
| Factor | Secret Manager | Env Variables |
|
|
|--------|---------------|---------------|
|
|
| Rotation | Automatic versioning | Manual redeploy |
|
|
| Audit | Cloud Audit Logs | No audit trail |
|
|
| Access control | IAM per-secret | Per-service |
|
|
| Pricing | $0.06/10K access ops | Free |
|
|
| Use case | Credentials, API keys | Non-sensitive config |
|
|
|
|
### Secret Manager Usage
|
|
|
|
```python
|
|
from google.cloud import secretmanager
|
|
|
|
def get_secret(project_id, secret_id, version="latest"):
|
|
client = secretmanager.SecretManagerServiceClient()
|
|
name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
|
|
response = client.access_secret_version(request={"name": name})
|
|
return response.payload.data.decode("UTF-8")
|
|
|
|
# Usage
|
|
db_password = get_secret("my-project", "db-password")
|
|
```
|