Files
claude-skills-reference/engineering/secrets-vault-manager/references/vault_patterns.md
Reza Rezvani 87f3a007c9 feat(engineering,ra-qm): add secrets-vault-manager, sql-database-assistant, gcp-cloud-architect, soc2-compliance
secrets-vault-manager (403-line SKILL.md, 3 scripts, 3 references):
- HashiCorp Vault, AWS SM, Azure KV, GCP SM integration
- Secret rotation, dynamic secrets, audit logging, emergency procedures

sql-database-assistant (457-line SKILL.md, 3 scripts, 3 references):
- Query optimization, migration generation, schema exploration
- Multi-DB support (PostgreSQL, MySQL, SQLite, SQL Server)
- ORM patterns (Prisma, Drizzle, TypeORM, SQLAlchemy)

gcp-cloud-architect (418-line SKILL.md, 3 scripts, 3 references):
- 6-step workflow mirroring aws-solution-architect for GCP
- Cloud Run, GKE, BigQuery, Cloud Functions, cost optimization
- Completes cloud trifecta (AWS + Azure + GCP)

soc2-compliance (417-line SKILL.md, 3 scripts, 3 references):
- SOC 2 Type I & II preparation, Trust Service Criteria mapping
- Control matrix generation, evidence tracking, gap analysis
- First SOC 2 skill in ra-qm-team (joins GDPR, ISO 27001, ISO 13485)

All 12 scripts pass --help. Docs generated, mkdocs.yml nav updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 14:05:11 +01:00

10 KiB

HashiCorp Vault Architecture & Patterns Reference

Architecture Overview

Vault operates as a centralized secret management service with a client-server model. All secrets are encrypted at rest and in transit. The seal/unseal mechanism protects the master encryption key.

Core Components

┌─────────────────────────────────────────────────┐
│                   Vault Cluster                  │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐   │
│  │  Leader    │  │ Standby   │  │ Standby   │   │
│  │  (active)  │  │ (forward) │  │ (forward) │   │
│  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘   │
│        │               │               │         │
│  ┌─────┴───────────────┴───────────────┴─────┐   │
│  │            Raft Storage Backend            │   │
│  └───────────────────────────────────────────┘   │
│                                                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
│  │ Auth     │  │ Secret   │  │ Audit        │   │
│  │ Methods  │  │ Engines  │  │ Devices      │   │
│  └──────────┘  └──────────┘  └──────────────┘   │
└─────────────────────────────────────────────────┘

Storage Backend Selection

Backend HA Support Operational Complexity Recommendation
Integrated Raft Yes Low Default choice — no external dependencies
Consul Yes Medium Legacy — use Raft unless already running Consul
S3/GCS/Azure Blob No Low Dev/test only — no HA
PostgreSQL/MySQL No Medium Not recommended — no HA, added dependency

High Availability Setup

Raft Cluster Configuration

Minimum 3 nodes for production (tolerates 1 failure). 5 nodes for critical workloads (tolerates 2 failures).

# vault-config.hcl (per node)
storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-1"

  retry_join {
    leader_api_addr = "https://vault-2.internal:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-3.internal:8200"
  }
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/opt/vault/tls/vault.crt"
  tls_key_file  = "/opt/vault/tls/vault.key"
}

api_addr     = "https://vault-1.internal:8200"
cluster_addr = "https://vault-1.internal:8201"

Auto-Unseal with AWS KMS

Eliminates manual unseal key management. Vault encrypts its master key with the KMS key.

seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal"
}

Requirements:

  • IAM role with kms:Encrypt, kms:Decrypt, kms:DescribeKey permissions
  • KMS key must be in the same region or accessible cross-region
  • KMS key should have restricted access — only Vault nodes

Auto-Unseal with Azure Key Vault

seal "azurekeyvault" {
  tenant_id  = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  vault_name = "vault-unseal-kv"
  key_name   = "vault-unseal-key"
}

Auto-Unseal with GCP KMS

seal "gcpckms" {
  project    = "my-project"
  region     = "global"
  key_ring   = "vault-keyring"
  crypto_key = "vault-unseal-key"
}

Namespaces (Enterprise)

Namespaces provide tenant isolation within a single Vault cluster. Each namespace has independent policies, auth methods, and secret engines.

root/
├── dev/           # Development environment
│   ├── auth/
│   └── secret/
├── staging/       # Staging environment
│   ├── auth/
│   └── secret/
└── production/    # Production environment
    ├── auth/
    └── secret/

OSS alternative: Use path-based isolation with strict policies. Prefix all paths with environment name (e.g., secret/data/production/...).

Policy Patterns

Templated Policies

Use identity-based templates for scalable policy management:

# Allow entities to manage their own secrets
path "secret/data/{{identity.entity.name}}/*" {
  capabilities = ["create", "read", "update", "delete"]
}

# Read shared config for the entity's group
path "secret/data/shared/{{identity.groups.names}}/*" {
  capabilities = ["read"]
}

Sentinel Policies (Enterprise)

Enforce governance rules beyond path-based access:

# Require MFA for production secret writes
import "mfa"

main = rule {
  request.path matches "secret/data/production/.*" and
  request.operation in ["create", "update", "delete"] and
  mfa.methods.totp.valid
}

Policy Hierarchy

  1. Global deny — Explicit deny on sys/*, auth/token/create-orphan
  2. Environment base — Read access to environment-specific paths
  3. Service-specific — Scoped to exact paths the service needs
  4. Admin override — Requires MFA, time-limited, audit-heavy

Secret Engine Configuration

KV v2 (Versioned Key-Value)

# Enable with custom config
vault secrets enable -path=secret -version=2 kv

# Configure version retention
vault write secret/config max_versions=10 cas_required=true delete_version_after=90d

Check-and-Set (CAS): Prevents accidental overwrites. Client must supply the current version number to update.

Database Engine

# Enable and configure PostgreSQL
vault secrets enable database

vault write database/config/postgres \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@db.internal:5432/app?sslmode=require" \
  allowed_roles="app-readonly,app-readwrite" \
  username="vault_admin" \
  password="INITIAL_PASSWORD"

# Rotate the root password (Vault manages it from now on)
vault write -f database/rotate-root/postgres

# Create a read-only role
vault write database/roles/app-readonly \
  db_name=postgres \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  revocation_statements="DROP ROLE IF EXISTS \"{{name}}\";" \
  default_ttl=1h \
  max_ttl=24h

PKI Engine (Certificate Authority)

# Enable PKI engine
vault secrets enable -path=pki pki
vault secrets tune -max-lease-ttl=87600h pki

# Generate root CA
vault write -field=certificate pki/root/generate/internal \
  common_name="Example Root CA" \
  ttl=87600h > root_ca.crt

# Enable intermediate CA
vault secrets enable -path=pki_int pki
vault secrets tune -max-lease-ttl=43800h pki_int

# Generate intermediate CSR
vault write -field=csr pki_int/intermediate/generate/internal \
  common_name="Example Intermediate CA" > intermediate.csr

# Sign with root CA
vault write -field=certificate pki/root/sign-intermediate \
  csr=@intermediate.csr format=pem_bundle ttl=43800h > intermediate.crt

# Set signed certificate
vault write pki_int/intermediate/set-signed certificate=@intermediate.crt

# Create role for leaf certificates
vault write pki_int/roles/web-server \
  allowed_domains="example.com" \
  allow_subdomains=true \
  max_ttl=2160h

Transit Engine (Encryption-as-a-Service)

vault secrets enable transit

# Create encryption key
vault write -f transit/keys/payment-data \
  type=aes256-gcm96

# Encrypt data
vault write transit/encrypt/payment-data \
  plaintext=$(echo "sensitive-data" | base64)

# Decrypt data
vault write transit/decrypt/payment-data \
  ciphertext="vault:v1:..."

# Rotate key (old versions still decrypt, new encrypts with latest)
vault write -f transit/keys/payment-data/rotate

# Rewrap ciphertext to latest key version
vault write transit/rewrap/payment-data \
  ciphertext="vault:v1:..."

Performance and Scaling

Performance Replication (Enterprise)

Primary cluster replicates to secondary clusters in other regions. Secondaries handle read traffic locally.

Performance Standbys (Enterprise)

Standby nodes serve read requests without forwarding to the leader, reducing leader load.

Response Wrapping

Wrap sensitive responses in a single-use token — the recipient unwraps exactly once:

# Wrap a secret (TTL = 5 minutes)
vault kv get -wrap-ttl=5m secret/data/production/db-creds

# Recipient unwraps
vault unwrap <wrapping_token>

Batch Tokens

For high-throughput workloads (Lambda, serverless), use batch tokens instead of service tokens. Batch tokens are not persisted to storage, reducing I/O.

Monitoring and Health

Key Metrics

Metric Alert Threshold Source
vault.core.unsealed 0 (sealed) Telemetry
vault.expire.num_leases >10,000 Telemetry
vault.audit.log_response Error rate >1% Telemetry
vault.runtime.alloc_bytes >80% memory Telemetry
vault.raft.leader.lastContact >500ms Telemetry
vault.token.count >50,000 Telemetry

Health Check Endpoint

# Returns 200 if initialized, unsealed, and active
curl -s https://vault.internal:8200/v1/sys/health

# Status codes:
# 200 — initialized, unsealed, active
# 429 — unsealed, standby
# 472 — disaster recovery secondary
# 473 — performance standby
# 501 — not initialized
# 503 — sealed

Disaster Recovery

Backup

# Raft snapshot (includes all data)
vault operator raft snapshot save backup-$(date +%Y%m%d).snap

# Schedule daily backups via cron
0 2 * * * /usr/local/bin/vault operator raft snapshot save /backups/vault-$(date +\%Y\%m\%d).snap

Restore

# Restore from snapshot (causes brief outage)
vault operator raft snapshot restore backup-20260320.snap

DR Replication (Enterprise)

Secondary cluster in standby. Promote on primary failure:

# On DR secondary
vault operator generate-root -dr-token
vault write sys/replication/dr/secondary/promote dr_operation_token=<token>