feat(engineering,ra-qm): add secrets-vault-manager, sql-database-assistant, gcp-cloud-architect, soc2-compliance

secrets-vault-manager (403-line SKILL.md, 3 scripts, 3 references):
- HashiCorp Vault, AWS SM, Azure KV, GCP SM integration
- Secret rotation, dynamic secrets, audit logging, emergency procedures

sql-database-assistant (457-line SKILL.md, 3 scripts, 3 references):
- Query optimization, migration generation, schema exploration
- Multi-DB support (PostgreSQL, MySQL, SQLite, SQL Server)
- ORM patterns (Prisma, Drizzle, TypeORM, SQLAlchemy)

gcp-cloud-architect (418-line SKILL.md, 3 scripts, 3 references):
- 6-step workflow mirroring aws-solution-architect for GCP
- Cloud Run, GKE, BigQuery, Cloud Functions, cost optimization
- Completes cloud trifecta (AWS + Azure + GCP)

soc2-compliance (417-line SKILL.md, 3 scripts, 3 references):
- SOC 2 Type I & II preparation, Trust Service Criteria mapping
- Control matrix generation, evidence tracking, gap analysis
- First SOC 2 skill in ra-qm-team (joins GDPR, ISO 27001, ISO 13485)

All 12 scripts pass --help. Docs generated, mkdocs.yml nav updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reza Rezvani
2026-03-25 14:05:11 +01:00
parent 7a2189fa21
commit 87f3a007c9
36 changed files with 13450 additions and 6 deletions

@@ -0,0 +1,403 @@
---
name: "secrets-vault-manager"
description: "Use when the user asks to set up secret management infrastructure, integrate HashiCorp Vault, configure cloud secret stores (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager), implement secret rotation, or audit secret access patterns."
---
# Secrets Vault Manager
**Tier:** POWERFUL
**Category:** Engineering
**Domain:** Security / Infrastructure / DevOps
---
## Overview
Production secret infrastructure management for teams running HashiCorp Vault, cloud-native secret stores, or hybrid architectures. This skill covers policy authoring, auth method configuration, automated rotation, dynamic secrets, audit logging, and incident response.
**Distinct from env-secrets-manager** which handles local `.env` file hygiene and leak detection. This skill operates at the infrastructure layer — Vault clusters, cloud KMS, certificate authorities, and CI/CD secret injection.
### When to Use
- Standing up a new Vault cluster or migrating to a managed secret store
- Designing auth methods for services, CI runners, and human operators
- Implementing automated credential rotation (database, API keys, certificates)
- Auditing secret access patterns for compliance (SOC 2, ISO 27001, HIPAA)
- Responding to a secret leak that requires mass revocation
- Integrating secrets into Kubernetes workloads or CI/CD pipelines
---
## HashiCorp Vault Patterns
### Architecture Decisions
| Decision | Recommendation | Rationale |
|----------|---------------|-----------|
| Deployment mode | HA with Raft storage | No external dependency, built-in leader election |
| Auto-unseal | Cloud KMS (AWS KMS / Azure Key Vault / GCP KMS) | Eliminates manual unseal, enables automated restarts |
| Namespaces | One per environment (dev/staging/prod) | Blast-radius isolation, independent policies |
| Audit devices | File + syslog (dual) | Vault refuses requests if all audit devices fail — dual prevents outages |
### Auth Methods
**AppRole** — Machine-to-machine authentication for services and batch jobs.
```hcl
# Policy (HCL) granting management of the AppRole auth mount.
# The mount itself is enabled with: vault auth enable approle
path "auth/approle/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
# Application-specific role (CLI command, not HCL)
vault write auth/approle/role/payment-service \
  token_ttl=1h \
  token_max_ttl=4h \
  secret_id_num_uses=1 \
  secret_id_ttl=10m \
  token_policies="payment-service-read"
```
**Kubernetes** — Pod-native authentication via service account tokens.
```hcl
vault write auth/kubernetes/role/api-server \
bound_service_account_names=api-server \
bound_service_account_namespaces=production \
policies=api-server-secrets \
ttl=1h
```
**OIDC** — Human operator access via SSO provider (Okta, Azure AD, Google Workspace).
```hcl
vault write auth/oidc/role/engineering \
bound_audiences="vault" \
allowed_redirect_uris="https://vault.example.com/ui/vault/auth/oidc/oidc/callback" \
user_claim="email" \
oidc_scopes="openid,profile,email" \
policies="engineering-read" \
ttl=8h
```
### Secret Engines
| Engine | Use Case | TTL Strategy |
|--------|----------|-------------|
| KV v2 | Static secrets (API keys, config) | Versioned, manual rotation |
| Database | Dynamic DB credentials | 1h default, 24h max |
| PKI | TLS certificates | 90d leaf certs, 5y intermediate CA |
| Transit | Encryption-as-a-service | Key rotation every 90d |
| SSH | Signed SSH certificates | 30m for interactive, 8h for automation |
### Policy Design
Follow least-privilege with path-based granularity:
```hcl
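# payment-service-read policy
path "secret/data/production/payment/*" {
  capabilities = ["read"]
}
path "database/creds/payment-readonly" {
  capabilities = ["read"]
}
# Deny access to admin paths explicitly.
# Caution: deny overrides every other policy on the token, including the
# default policy's grants such as sys/leases/renew and sys/capabilities-self.
path "sys/*" {
  capabilities = ["deny"]
}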
```
**Policy naming convention:** `{service}-{access-level}` (e.g., `payment-service-read`, `api-gateway-admin`).
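
As an illustration of the naming convention and path-based granularity above, the following is a minimal sketch of rendering a read-only policy from a list of paths. The function name `render_policy` and its arguments are hypothetical and are not taken from `vault_config_generator.py`.

```python
def render_policy(service, access_level, read_paths):
    """Return (policy_name, hcl_text) for a read-only, path-scoped policy."""
    for path in read_paths:
        # Reject wildcard-only paths, which defeat least privilege.
        if path.strip("/") in ("", "*"):
            raise ValueError(f"Refusing wildcard path: {path!r}")
    # Naming convention from the text: {service}-{access-level}
    name = f"{service}-{access_level}"
    stanzas = [
        f'path "{path}" {{\n  capabilities = ["read"]\n}}' for path in read_paths
    ]
    return name, "\n".join(stanzas) + "\n"


if __name__ == "__main__":
    name, hcl = render_policy(
        "payment-service", "read", ["secret/data/production/payment/*"]
    )
    print(f"# {name}")
    print(hcl)
```

The rendered text could then be written with `vault policy write <name> <file>`.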
---
## Cloud Secret Store Integration
### Comparison Matrix
| Feature | AWS Secrets Manager | Azure Key Vault | GCP Secret Manager |
|---------|--------------------|-----------------|--------------------|
| Rotation | Built-in Lambda | Custom logic via Functions | Cloud Functions |
| Versioning | Automatic | Automatic | Automatic |
| Encryption | AWS KMS (default or CMK) | HSM-backed | Google-managed or CMEK |
| Access control | IAM policies + resource policy | RBAC + Access Policies | IAM bindings |
| Cross-region | Replication supported | Geo-redundant by default | Replication supported |
| Audit | CloudTrail | Azure Monitor + Diagnostic Logs | Cloud Audit Logs |
| Pricing model | Per-secret + per-API call | Per-operation + per-key | Per-secret version + per-access |
### When to Use Which
- **AWS Secrets Manager**: RDS/Aurora credential rotation out of the box. Best when fully on AWS.
- **Azure Key Vault**: Certificate management strength. Required for Azure AD integrated workloads.
- **GCP Secret Manager**: Simplest API surface. Best for GKE-native workloads with Workload Identity.
- **HashiCorp Vault**: Multi-cloud, dynamic secrets, PKI, transit encryption. Best for complex or hybrid environments.
### SDK Access Patterns
**Principle:** Always fetch secrets at startup or via sidecar — never bake into images or config files.
```python
# AWS Secrets Manager pattern
import boto3, json
def get_secret(secret_name, region="us-east-1"):
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])
```
```python
# GCP Secret Manager pattern
from google.cloud import secretmanager
def get_secret(project_id, secret_id, version="latest"):
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")
```
```python
# Azure Key Vault pattern
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
def get_secret(vault_url, secret_name):
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=vault_url, credential=credential)
    return client.get_secret(secret_name).value
```
---
## Secret Rotation Workflows
### Rotation Strategy by Secret Type
| Secret Type | Rotation Frequency | Method | Downtime Risk |
|-------------|-------------------|--------|---------------|
| Database passwords | 30 days | Dual-account swap | Zero (A/B rotation) |
| API keys | 90 days | Generate new, deprecate old | Zero (overlap window) |
| TLS certificates | 60 days before expiry | ACME or Vault PKI | Zero (graceful reload) |
| SSH keys | 90 days | Vault-signed certificates | Zero (CA-based) |
| Service tokens | 24 hours | Dynamic generation | Zero (short-lived) |
| Encryption keys | 90 days | Key versioning (rewrap) | Zero (version coexistence) |
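
The fixed-interval rows of the table above can be turned into a simple overdue check. This is a sketch only: the inventory format (`name`, `type`, `last_rotated`) is an assumption, and TLS certificates are left out because their rule is relative to expiry, not to the last rotation date.

```python
from datetime import date, timedelta

# Rotation periods in days, copied from the table above.
ROTATION_DAYS = {
    "database_password": 30,
    "api_key": 90,
    "ssh_key": 90,
    "service_token": 1,
    "encryption_key": 90,
}


def overdue_secrets(inventory, today):
    """Return names of secrets whose last rotation is older than their period."""
    overdue = []
    for item in inventory:
        period = timedelta(days=ROTATION_DAYS[item["type"]])
        if today - item["last_rotated"] > period:
            overdue.append(item["name"])
    return overdue


if __name__ == "__main__":
    inventory = [
        {"name": "orders-db", "type": "database_password", "last_rotated": date(2026, 1, 1)},
        {"name": "maps-api", "type": "api_key", "last_rotated": date(2026, 1, 1)},
    ]
    print(overdue_secrets(inventory, today=date(2026, 3, 1)))
```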
### Database Credential Rotation (Dual-Account)
1. Two database accounts exist: `app_user_a` and `app_user_b`
2. Application currently uses `app_user_a`
3. The rotation job changes the `app_user_b` password and updates the secret store
4. Application switches to `app_user_b` on next credential fetch
5. After grace period, `app_user_a` password is rotated
6. Cycle repeats
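
The dual-account cycle above can be sketched as follows. The secret store is modeled as a plain dict and `set_db_password` is a hypothetical callback standing in for the real `ALTER ROLE` call; each `rotate()` call is one half-cycle, so the previously active account stays valid until the next call (the grace period).

```python
import secrets


class DualAccountRotator:
    """A/B rotation: always rotate the standby account, then switch to it."""

    def __init__(self, store, set_db_password):
        self.store = store  # dict-like secret store (assumption)
        self.set_db_password = set_db_password  # callable(user, password)

    def rotate(self):
        active = self.store["active_user"]
        standby = "app_user_b" if active == "app_user_a" else "app_user_a"
        new_password = secrets.token_urlsafe(32)
        # Change the standby account first; the active one is untouched.
        self.set_db_password(standby, new_password)
        self.store[f"{standby}_password"] = new_password
        # Applications pick up the new account on their next credential fetch.
        self.store["active_user"] = standby
        return standby


if __name__ == "__main__":
    database = {}
    store = {"active_user": "app_user_a", "app_user_a_password": "initial"}
    rotator = DualAccountRotator(store, lambda user, pw: database.update({user: pw}))
    print(rotator.rotate())  # switches to app_user_b
    print(rotator.rotate())  # after the grace period, back to app_user_a
```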
### API Key Rotation (Overlap Window)
1. Generate new API key with provider
2. Store new key in secret store as `current`, move old to `previous`
3. Deploy applications — they read `current`
4. Wait until all instances have restarted (or the TTL has expired) and monitoring confirms zero usage of the old key
5. Revoke `previous`
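
A minimal in-memory sketch of the overlap window described above. The class `OverlapKeyStore` and the `old_key_usage_count` input are hypothetical; in practice the usage count would come from provider or gateway metrics.

```python
class OverlapKeyStore:
    """Keeps `current` and `previous` keys valid during the overlap window."""

    def __init__(self, initial_key):
        self.current = initial_key
        self.previous = None

    def rotate(self, new_key):
        # The old key moves to `previous` and stays valid for now.
        self.previous, self.current = self.current, new_key

    def is_valid(self, key):
        return key is not None and key in (self.current, self.previous)

    def revoke_previous(self, old_key_usage_count):
        # Only revoke once monitoring shows the old key is no longer used.
        if old_key_usage_count > 0:
            raise RuntimeError("Old key still in use; extend the overlap window")
        self.previous = None


if __name__ == "__main__":
    keys = OverlapKeyStore("key-v1")
    keys.rotate("key-v2")
    print(keys.is_valid("key-v1"), keys.is_valid("key-v2"))
    keys.revoke_previous(old_key_usage_count=0)
    print(keys.is_valid("key-v1"))
```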
---
## Dynamic Secrets
Dynamic secrets are generated on-demand with automatic expiration. Prefer dynamic secrets over static credentials wherever possible.
### Database Dynamic Credentials (Vault)
```hcl
# Configure database engine
vault write database/config/postgres \
plugin_name=postgresql-database-plugin \
connection_url="postgresql://{{username}}:{{password}}@db.example.com:5432/app" \
allowed_roles="app-readonly,app-readwrite" \
username="vault_admin" \
password="<admin-password>"
# Create role with TTL
vault write database/roles/app-readonly \
db_name=postgres \
creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
default_ttl=1h \
max_ttl=24h
```
### Cloud IAM Dynamic Credentials
Vault can generate short-lived AWS IAM credentials, Azure service principal passwords, or GCP service account keys — eliminating long-lived cloud credentials entirely.
### SSH Certificate Authority
Replace SSH key distribution with a Vault-signed certificate model:
1. Vault acts as SSH CA
2. Users/machines request signed certificates with short TTL (30 min)
3. SSH servers trust the CA public key — no `authorized_keys` management
4. Certificates expire automatically — no revocation needed for normal operations
---
## Audit Logging
### What to Log
| Event | Priority | Retention |
|-------|----------|-----------|
| Secret read access | HIGH | 1 year minimum |
| Secret creation/update | HIGH | 1 year minimum |
| Auth method login | MEDIUM | 90 days |
| Policy changes | CRITICAL | 2 years (compliance) |
| Failed access attempts | CRITICAL | 1 year |
| Token creation/revocation | MEDIUM | 90 days |
| Seal/unseal operations | CRITICAL | Indefinite |
### Anomaly Detection Signals
- Secret accessed from new IP/CIDR range
- Access volume spike (>3x baseline for a path)
- Off-hours access for human auth methods
- Service accessing secrets outside its policy scope (denied requests)
- Multiple failed auth attempts from single source
- Token created with unusually long TTL
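
Two of the signals above (volume spike and repeated failed attempts) can be sketched like this. The event shape (`path`, `source_ip`, `ok`) and the thresholds are assumptions for illustration; real Vault and cloud audit records have different schemas, and this is not the logic of `audit_log_analyzer.py`.

```python
from collections import Counter


def detect_anomalies(events, baseline, spike_factor=3, max_failures=5):
    """Flag access spikes per path and repeated failures per source IP.

    events:   list of dicts with 'path', 'source_ip', 'ok' (bool)
    baseline: dict mapping path -> expected successful reads per window
    """
    findings = []
    reads = Counter(e["path"] for e in events if e["ok"])
    for path, count in sorted(reads.items()):
        expected = baseline.get(path)
        if expected and count > spike_factor * expected:
            findings.append(("volume_spike", path, count))
    failures = Counter(e["source_ip"] for e in events if not e["ok"])
    for ip, count in sorted(failures.items()):
        if count >= max_failures:
            findings.append(("repeated_failures", ip, count))
    return findings


if __name__ == "__main__":
    events = [{"path": "secret/data/a", "source_ip": "10.0.0.1", "ok": True}] * 7
    events += [{"path": "secret/data/a", "source_ip": "10.0.0.9", "ok": False}] * 5
    for finding in detect_anomalies(events, baseline={"secret/data/a": 2}):
        print(finding)
```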
### Compliance Reporting
Generate periodic reports covering:
1. **Access inventory** — Which identities accessed which secrets, when
2. **Rotation compliance** — Secrets overdue for rotation
3. **Policy drift** — Policies modified since last review
4. **Orphaned secrets** — Secrets with no recent access (>90 days)
Use `audit_log_analyzer.py` to parse Vault or cloud audit logs for these signals.
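
For illustration, the orphaned-secrets item of the report could be computed roughly as below. This is a sketch under assumed inputs (a list of known secret paths and access events with `path` and `time` fields); it does not describe how `audit_log_analyzer.py` works internally.

```python
from datetime import datetime, timedelta


def orphaned_secrets(secret_paths, access_events, now, max_idle_days=90):
    """Return secrets never accessed, or not accessed within max_idle_days."""
    last_access = {}
    for event in access_events:
        path, ts = event["path"], event["time"]
        if path not in last_access or ts > last_access[path]:
            last_access[path] = ts
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        p for p in secret_paths
        if p not in last_access or last_access[p] < cutoff
    )


if __name__ == "__main__":
    now = datetime(2026, 3, 25)
    events = [
        {"path": "secret/data/api", "time": datetime(2026, 3, 20)},
        {"path": "secret/data/legacy", "time": datetime(2025, 10, 1)},
    ]
    paths = ["secret/data/api", "secret/data/legacy", "secret/data/unused"]
    print(orphaned_secrets(paths, events, now))
```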
---
## Emergency Procedures
### Secret Leak Response (Immediate)
**Time target: Contain within 15 minutes of detection.**
1. **Identify scope** — Which secret(s) leaked, where (repo, log, error message, third party)
2. **Revoke immediately** — Rotate the compromised credential at the source (provider API, Vault, cloud SM)
3. **Invalidate tokens** — Revoke all Vault tokens that accessed the leaked secret
4. **Audit blast radius** — Query audit logs for usage of the compromised secret in the exposure window
5. **Notify stakeholders** — Security team, affected service owners, compliance (if PII/regulated data)
6. **Post-mortem** — Document root cause, update controls to prevent recurrence
### Vault Seal Operations
**When to seal:** Active security incident affecting Vault infrastructure, suspected key compromise.
**Sealing** stops all Vault operations. Use only as last resort.
**Unseal procedure:**
1. Gather quorum of unseal key holders (Shamir threshold)
2. Or confirm auto-unseal KMS key is accessible
3. Unseal via `vault operator unseal` or restart with auto-unseal
4. Verify audit devices reconnected
5. Check active leases and token validity
See `references/emergency_procedures.md` for complete playbooks.
---
## CI/CD Integration
### Vault Agent Sidecar (Kubernetes)
Vault Agent runs alongside application pods, handles authentication and secret rendering:
```yaml
# Pod annotation for Vault Agent Injector
annotations:
  vault.hashicorp.com/agent-inject: "true"
  vault.hashicorp.com/role: "api-server"
  vault.hashicorp.com/agent-inject-secret-db: "database/creds/app-readonly"
  vault.hashicorp.com/agent-inject-template-db: |
    {{- with secret "database/creds/app-readonly" -}}
    postgresql://{{ .Data.username }}:{{ .Data.password }}@db:5432/app
    {{- end }}
```
### External Secrets Operator (Kubernetes)
For teams preferring declarative GitOps over agent sidecars:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: api-credentials
  data:
    - secretKey: api-key
      remoteRef:
        key: secret/data/production/api
        property: key
```
### GitHub Actions OIDC
Eliminate long-lived secrets in CI by using OIDC federation:
```yaml
# The job must grant `permissions: id-token: write` so GitHub issues the OIDC token
- name: Authenticate to Vault
  uses: hashicorp/vault-action@v2
  with:
    url: https://vault.example.com
    method: jwt
    role: github-ci
    jwtGithubAudience: https://vault.example.com
    secrets: |
      secret/data/ci/deploy api_key | DEPLOY_API_KEY ;
      secret/data/ci/deploy db_password | DB_PASSWORD
```
---
## Anti-Patterns
| Anti-Pattern | Risk | Correct Approach |
|-------------|------|-----------------|
| Hardcoded secrets in source code | Leak via repo, logs, error output | Fetch from secret store at runtime |
| Long-lived static tokens (>30 days) | Stale credentials, no accountability | Dynamic secrets or short TTL + rotation |
| Shared service accounts | No audit trail per consumer | Per-service identity with unique credentials |
| No rotation policy | Compromised creds persist indefinitely | Automated rotation on schedule |
| Secrets in environment variables on CI | Visible in build logs, process table | Vault Agent or OIDC-based injection |
| Single unseal key holder | Bus factor of 1, recovery blocked | Shamir split (3-of-5) or auto-unseal |
| No audit device configured | Zero visibility into access | Dual audit devices (file + syslog) |
| Wildcard policies (`path "*"`) | Over-permissioned, violates least privilege | Explicit path-based policies per service |
---
## Tools
| Script | Purpose |
|--------|---------|
| `vault_config_generator.py` | Generate Vault policy and auth config from application requirements |
| `rotation_planner.py` | Create rotation schedule from a secret inventory file |
| `audit_log_analyzer.py` | Analyze audit logs for anomalies and compliance gaps |
---
## Cross-References
- **env-secrets-manager** — Local `.env` file hygiene, leak detection, drift awareness
- **senior-secops** — Security operations, incident response, threat modeling
- **ci-cd-pipeline-builder** — Pipeline design where secrets are consumed
- **docker-development** — Container secret injection patterns
- **helm-chart-builder** — Kubernetes secret management in Helm charts

@@ -0,0 +1,354 @@
# Cloud Secret Store Reference
## Provider Comparison
### Feature Matrix
| Feature | AWS Secrets Manager | Azure Key Vault | GCP Secret Manager |
|---------|--------------------|-----------------|--------------------|
| **Secret types** | String, binary | Secrets, keys, certificates | String, binary |
| **Max secret size** | 64 KB | 25 KB (secret), 200 KB (cert) | 64 KB |
| **Versioning** | Automatic (all versions) | Automatic (new version on each update) | Automatic |
| **Rotation** | Built-in Lambda rotation | Custom via Functions/Logic Apps | Custom via Cloud Functions |
| **Encryption** | AWS KMS (default or CMK) | HSM-backed (FIPS 140-2 L2) | Google-managed or CMEK |
| **Cross-region** | Replication to multiple regions | Geo-redundant by SKU | Replication supported |
| **Access control** | IAM + resource-based policies | RBAC + access policies | IAM bindings |
| **Audit** | CloudTrail | Azure Monitor + Diagnostics | Cloud Audit Logs |
| **Secret references** | ARN | Vault URI + secret name | Resource name |
| **Cost model** | $0.40/secret/mo + $0.05/10K calls | $0.03/10K ops (Standard) | $0.06/active version/mo + $0.03/10K access ops |
| **Free tier** | No | No | 6 active versions free |
### Decision Guide
**Choose AWS Secrets Manager when:**
- Fully on AWS
- Need native RDS/Aurora/Redshift rotation
- Using ECS/EKS with native AWS IAM integration
- Cross-account secret sharing via resource policies
**Choose Azure Key Vault when:**
- Azure-primary workloads
- Certificate lifecycle management is critical (built-in CA integration)
- Need HSM-backed key protection (Premium SKU)
- Azure AD conditional access integration required
**Choose GCP Secret Manager when:**
- GCP-primary workloads
- Using GKE with Workload Identity
- Want simplest API surface (few concepts, fast to integrate)
- Cost-sensitive (generous free tier)
**Choose HashiCorp Vault when:**
- Multi-cloud or hybrid environments
- Dynamic secrets (database, cloud IAM, SSH) are primary use case
- Need transit encryption, PKI, or SSH CA
- Regulatory requirement for self-hosted secret management
## AWS Secrets Manager
### Access Patterns
```python
import boto3
import json
from botocore.exceptions import ClientError
def get_secret(secret_name, region="us-east-1"):
    """Retrieve secret from AWS Secrets Manager."""
    client = boto3.client("secretsmanager", region_name=region)
    try:
        response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "ResourceNotFoundException":
            raise ValueError(f"Secret {secret_name} not found")
        elif code == "DecryptionFailureException":
            raise RuntimeError("KMS decryption failed — check key permissions")
        raise
    if "SecretString" in response:
        return json.loads(response["SecretString"])
    return response["SecretBinary"]
```
### Rotation with Lambda
```python
# rotation_lambda.py — skeleton for custom rotation
# generate_password, get_secret_version, apply_credentials, test_connection,
# and get_current_version are placeholders to implement per target service.
import json

import boto3


def lambda_handler(event, context):
    secret_id = event["SecretId"]
    step = event["Step"]
    token = event["ClientRequestToken"]
    client = boto3.client("secretsmanager")
    if step == "createSecret":
        # Generate new credentials
        new_password = generate_password()
        client.put_secret_value(
            SecretId=secret_id,
            ClientRequestToken=token,
            SecretString=json.dumps({"password": new_password}),
            VersionStages=["AWSPENDING"],
        )
    elif step == "setSecret":
        # Apply new credentials to the target service
        pending = get_secret_version(client, secret_id, "AWSPENDING", token)
        apply_credentials(pending)
    elif step == "testSecret":
        # Verify new credentials work
        pending = get_secret_version(client, secret_id, "AWSPENDING", token)
        test_connection(pending)
    elif step == "finishSecret":
        # Mark AWSPENDING as AWSCURRENT
        client.update_secret_version_stage(
            SecretId=secret_id,
            VersionStage="AWSCURRENT",
            MoveToVersionId=token,
            RemoveFromVersionId=get_current_version(client, secret_id),
        )
```
### IAM Policy for Secret Access
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["secretsmanager:GetSecretValue"],
"Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:production/api/*",
"Condition": {
"StringEquals": {
"aws:RequestedRegion": "us-east-1"
}
}
}
]
}
```
### Cross-Account Access
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {"AWS": "arn:aws:iam::987654321098:role/shared-secret-reader"},
"Action": "secretsmanager:GetSecretValue",
"Resource": "*",
"Condition": {
"ForAnyValue:StringEquals": {
"secretsmanager:VersionStage": "AWSCURRENT"
}
}
}
]
}
```
## Azure Key Vault
### Access Patterns
```python
from azure.identity import DefaultAzureCredential, ManagedIdentityCredential
from azure.keyvault.secrets import SecretClient
def get_secret(vault_url, secret_name, use_managed_identity=True):
    """Retrieve secret from Azure Key Vault."""
    if use_managed_identity:
        credential = ManagedIdentityCredential()
    else:
        credential = DefaultAzureCredential()
    client = SecretClient(vault_url=vault_url, credential=credential)
    return client.get_secret(secret_name).value


def list_secrets(vault_url):
    """List all secret names (not values)."""
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=vault_url, credential=credential)
    return [s.name for s in client.list_properties_of_secrets()]
```
### RBAC vs Access Policies
**RBAC (recommended):**
- Uses Azure AD roles (`Key Vault Secrets User`, `Key Vault Secrets Officer`)
- Managed at subscription/resource group/vault level
- Audit via Azure AD activity logs
**Access Policies (legacy):**
- Per-vault configuration
- Object ID based
- No inheritance from resource group
```bash
# Assign RBAC role
az role assignment create \
--role "Key Vault Secrets User" \
--assignee <service-principal-id> \
--scope /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault>
```
### Certificate Management
Azure Key Vault has first-class certificate management with automatic renewal:
```bash
# Create certificate with auto-renewal
az keyvault certificate create \
--vault-name my-vault \
--name api-tls \
--policy @cert-policy.json
# cert-policy.json
{
"issuerParameters": {"name": "Self"},
"keyProperties": {"keyType": "RSA", "keySize": 2048},
"lifetimeActions": [
{"action": {"actionType": "AutoRenew"}, "trigger": {"daysBeforeExpiry": 30}}
],
"x509CertificateProperties": {
"subject": "CN=api.example.com",
"validityInMonths": 12
}
}
```
## GCP Secret Manager
### Access Patterns
```python
from google.cloud import secretmanager
def get_secret(project_id, secret_id, version="latest"):
    """Retrieve secret from GCP Secret Manager."""
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


def create_secret(project_id, secret_id, secret_value):
    """Create a new secret with initial version."""
    client = secretmanager.SecretManagerServiceClient()
    parent = f"projects/{project_id}"
    # Create the secret resource
    secret = client.create_secret(
        request={
            "parent": parent,
            "secret_id": secret_id,
            "secret": {"replication": {"automatic": {}}},
        }
    )
    # Add a version with the secret value
    client.add_secret_version(
        request={
            "parent": secret.name,
            "payload": {"data": secret_value.encode("UTF-8")},
        }
    )
    return secret.name
```
### Workload Identity for GKE
Eliminate service account key files by binding Kubernetes service accounts to GCP IAM:
```bash
# Create IAM binding
gcloud iam service-accounts add-iam-policy-binding \
secret-accessor@my-project.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:my-project.svc.id.goog[namespace/ksa-name]"
# Annotate Kubernetes service account
kubectl annotate serviceaccount ksa-name \
--namespace namespace \
iam.gke.io/gcp-service-account=secret-accessor@my-project.iam.gserviceaccount.com
```
### IAM Policy
```bash
# Grant secret accessor role to a service account
gcloud secrets add-iam-policy-binding my-secret \
--member="serviceAccount:my-app@my-project.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
```
## Cross-Cloud Patterns
### Abstraction Layer
When operating multi-cloud, create a thin abstraction that normalizes secret access:
```python
# secret_client.py — cross-cloud abstraction
# AWSSecretClient, AzureSecretClient, GCPSecretClient, and VaultSecretClient are
# per-provider adapters (not shown) that each expose get(key) and set(key, value).
class SecretClient:
    def __init__(self, provider, **kwargs):
        if provider == "aws":
            self._client = AWSSecretClient(**kwargs)
        elif provider == "azure":
            self._client = AzureSecretClient(**kwargs)
        elif provider == "gcp":
            self._client = GCPSecretClient(**kwargs)
        elif provider == "vault":
            self._client = VaultSecretClient(**kwargs)
        else:
            raise ValueError(f"Unknown provider: {provider}")

    def get(self, key):
        return self._client.get(key)

    def set(self, key, value):
        return self._client.set(key, value)
```
### Migration Strategy
When migrating between providers:
1. **Dual-write phase** — Write to both old and new store simultaneously
2. **Dual-read phase** — Read from new store, fallback to old
3. **Cut-over** — Read exclusively from new store
4. **Cleanup** — Remove secrets from old store after grace period
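
The first three phases above can be sketched as a wrapper around two stores. Both stores are modeled as dicts here, and two details are assumptions the text does not state: reads still go to the old store during the dual-write phase, and writes keep going to both stores until cut-over.

```python
class MigratingStore:
    """Routes reads and writes according to the current migration phase."""

    PHASES = ("dual_write", "dual_read", "cut_over")

    def __init__(self, old, new, phase="dual_write"):
        if phase not in self.PHASES:
            raise ValueError(f"Unknown phase: {phase}")
        self.old, self.new, self.phase = old, new, phase

    def set(self, key, value):
        self.new[key] = value
        if self.phase != "cut_over":
            self.old[key] = value  # keep the old store in sync until cut-over

    def get(self, key):
        if self.phase == "dual_write":
            return self.old[key]
        if self.phase == "dual_read" and key not in self.new:
            return self.old[key]  # fallback for secrets not yet migrated
        return self.new[key]


if __name__ == "__main__":
    store = MigratingStore(old={"db": "v1"}, new={})
    store.set("api", "k1")
    store.phase = "dual_read"
    print(store.get("db"), store.get("api"))
```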
### Secret Synchronization
For hybrid setups (e.g., Vault as primary, cloud SM for specific workloads):
- Use Vault's cloud secret engines to generate cloud-native credentials dynamically
- Or use External Secrets Operator to sync Vault secrets into Kubernetes Secrets (its PushSecret resource can also push values into cloud-native stores)
- Never manually copy secrets between stores — always automate
## Caching and Performance
### Client-Side Caching
Client-side caching support varies by provider:
- **AWS:** `aws-secretsmanager-caching-python` — caches with configurable TTL
- **Azure:** Built-in HTTP caching in SDK, or use Azure App Configuration
- **GCP:** No official caching library — implement in-process cache with TTL
### Caching Rules
1. Cache TTL should be shorter than rotation period (e.g., cache 5 min if rotating every 30 days)
2. Implement cache invalidation on secret version change events
3. Never cache secrets to disk — in-memory only
4. Log cache hits/misses for debugging rotation issues
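
A minimal in-process cache that follows the rules above (TTL, explicit invalidation, memory only, hit/miss counters). The `fetch` callable is an assumption standing in for any of the provider `get_secret` functions; the injectable `clock` exists only to make the sketch testable.

```python
import time


class SecretCache:
    """In-memory TTL cache for secrets; nothing is written to disk."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # name -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, name):
        entry = self._entries.get(name)
        if entry is not None and self._clock() < entry[1]:
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = self._fetch(name)
        self._entries[name] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, name):
        # Call this from a secret-version-change event handler.
        self._entries.pop(name, None)


if __name__ == "__main__":
    cache = SecretCache(fetch=lambda name: f"value-of-{name}", ttl_seconds=300)
    cache.get("db-password")
    cache.get("db-password")
    print(cache.hits, cache.misses)
```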
## Compliance Mapping
| Requirement | AWS SM | Azure KV | GCP SM | Vault |
|------------|--------|----------|--------|-------|
| SOC 2 audit trail | CloudTrail | Monitor logs | Audit Logs | Audit device |
| HIPAA encryption | KMS (BAA) | HSM (BAA) | CMEK (BAA) | Auto-encrypt |
| PCI DSS key mgmt | KMS compliance | Premium HSM | CMEK | Transit engine |
| GDPR data residency | Region selection | Region selection | Region selection | Self-hosted |
| ISO 27001 | Certified | Certified | Certified | Self-certify |

@@ -0,0 +1,280 @@
# Emergency Procedures Reference
## Secret Leak Response Playbook
### Severity Classification
| Severity | Definition | Response Time | Example |
|----------|-----------|---------------|---------|
| **P0 — Critical** | Production credentials exposed publicly | Immediate (15 min) | Database password in public GitHub repo |
| **P1 — High** | Internal credentials exposed beyond intended scope | 1 hour | API key in build logs accessible to wider org |
| **P2 — Medium** | Non-production credentials exposed | 4 hours | Staging DB password in internal wiki |
| **P3 — Low** | Expired or limited-scope credential exposed | 24 hours | Rotated API key found in old commit history |
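
The severity table above can be approximated in code for triage tooling. This is a simplified sketch: the three boolean inputs are an assumption, and they reduce the table's definitions (for example, "limited-scope" is not modeled), so treat the result as a starting point for a human decision.

```python
from datetime import datetime, timedelta

# Response times from the table above.
RESPONSE_WINDOWS = {
    "P0": timedelta(minutes=15),
    "P1": timedelta(hours=1),
    "P2": timedelta(hours=4),
    "P3": timedelta(hours=24),
}


def classify_leak(production, publicly_exposed, still_valid=True):
    """Map a leak to a severity level using the table's main distinctions."""
    if not still_valid:
        return "P3"  # expired or already rotated credential
    if not production:
        return "P2"  # non-production credential
    return "P0" if publicly_exposed else "P1"


def response_deadline(severity, detected_at):
    """Return the time by which the response must have started."""
    return detected_at + RESPONSE_WINDOWS[severity]


if __name__ == "__main__":
    severity = classify_leak(production=True, publicly_exposed=True)
    print(severity, response_deadline(severity, datetime(2026, 3, 25, 14, 0)))
```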
### P0/P1 Response Procedure
**Phase 1: Contain (0-15 minutes)**
1. **Identify the leaked secret**
- What credential was exposed? (type, scope, permissions)
- Where was it exposed? (repo, log, error page, third-party service)
- When was it first exposed? (commit timestamp, log timestamp)
- Is the exposure still active? (repo public? log accessible?)
2. **Revoke immediately**
- Database password: `ALTER ROLE app_user WITH PASSWORD 'new_password';`
- API key: Regenerate via provider console/API
- Vault token: `vault token revoke <token>`
- AWS access key: `aws iam delete-access-key --access-key-id <key>`
- Cloud service account: Delete and recreate key
- TLS certificate: Revoke via CA, generate new certificate
3. **Remove exposure**
- Public repo: Remove the file, rewrite history to purge it (e.g., `git filter-repo`) and force-push, then request a GitHub cache purge
- Build logs: Delete log artifacts, rotate CI/CD secrets
- Error page: Deploy fix to suppress secret in error output
- Third-party: Contact vendor for log purge if applicable
4. **Deploy new credentials**
- Update secret store with rotated credential
- Restart affected services to pick up new credential
- Verify services are healthy with new credential
**Phase 2: Assess (15-60 minutes)**
5. **Audit blast radius**
- Query Vault/cloud SM audit logs for the compromised credential
- Check for unauthorized usage during the exposure window
- Review network logs for suspicious connections from unknown IPs
- Check if the compromised credential grants access to other secrets (privilege escalation)
6. **Notify stakeholders**
- Security team (always)
- Service owners for affected systems
- Compliance team if regulated data was potentially accessed
- Legal if customer data may have been compromised
- Executive leadership for P0 incidents
**Phase 3: Recover (1-24 hours)**
7. **Rotate adjacent credentials**
- If the leaked credential could access other secrets, rotate those too
- If a Vault token leaked, check what policies it had — rotate everything accessible
8. **Harden against recurrence**
- Add pre-commit hook to detect secrets (e.g., `gitleaks`, `detect-secrets`)
- Review CI/CD pipeline for secret masking
- Audit who has access to the source of the leak
**Phase 4: Post-Mortem (24-72 hours)**
9. **Document incident**
- Timeline of events
- Root cause analysis
- Impact assessment
- Remediation actions taken
- Preventive measures added
### Response Communication Template
```
SECURITY INCIDENT — SECRET EXPOSURE
Severity: P0/P1
Time detected: YYYY-MM-DD HH:MM UTC
Secret type: [database password / API key / token / certificate]
Exposure vector: [public repo / build log / error output / other]
Status: [CONTAINED / INVESTIGATING / RESOLVED]
Immediate actions taken:
- [ ] Credential revoked at source
- [ ] Exposure removed
- [ ] New credential deployed
- [ ] Services verified healthy
- [ ] Audit log review in progress
Blast radius assessment: [PENDING / COMPLETE — no unauthorized access / COMPLETE — unauthorized access detected]
Next update: [time]
Incident commander: [name]
```
## Vault Seal/Unseal Procedures
### Understanding Seal Status
Vault uses a **seal** mechanism to protect the encryption key hierarchy. When sealed, Vault cannot decrypt any data or serve any requests.
```
Sealed State:
Vault process running → YES
API responding → YES (503 Sealed)
Serving secrets → NO
All active leases → FROZEN (not revoked)
Audit logging → NO
Unsealed State:
Vault process running → YES
API responding → YES (200 OK)
Serving secrets → YES
Active leases → RESUMING
Audit logging → YES
```
### When to Seal Vault (Emergency Only)
Seal Vault when:
- Active intrusion on Vault infrastructure is confirmed
- Vault server compromise is suspected (unauthorized root access)
- Encryption key material may have been extracted
- Regulatory/legal hold requires immediate data access prevention
**Do NOT seal for:**
- Routine maintenance (use graceful shutdown instead)
- Single-node issues in HA cluster (let standby take over)
- Suspected secret leak (revoke the secret, don't seal Vault)
### Seal Procedure
```bash
# Seal a single node
vault operator seal
# Seal all nodes (HA cluster)
# Seal each node individually — leader last
vault operator seal -address=https://vault-standby-1:8200
vault operator seal -address=https://vault-standby-2:8200
vault operator seal -address=https://vault-leader:8200
```
**Impact of sealing:**
- All active client connections dropped immediately
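- Lease and token expiration is not processed while sealed; TTLs keep elapsing in wall-clock time, so anything that expired during the seal is revoked after unseal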
- Applications lose secret access — prepare for cascading failures
- Monitoring will fire alerts for sealed state
### Unseal Procedure (Shamir Keys)
Requires a quorum of key holders (e.g., 3 of 5).
```bash
# Each key holder provides their unseal key
vault operator unseal <key-1>
vault operator unseal <key-2>
vault operator unseal <key-3>
# Vault unseals after reaching threshold
```
**Operational checklist after unseal:**
1. Verify health: `vault status` shows `Sealed: false`
2. Check audit devices: `vault audit list` — confirm all enabled
3. Check auth methods: `vault auth list`
4. Verify HA status: `vault operator raft list-peers`
5. Check lease count: monitor `vault.expire.num_leases`
6. Verify applications reconnecting (check application logs)
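
The quorum requirement can be checked from Vault's seal status. The following is a minimal sketch, assuming the dictionary mirrors the JSON from `vault status -format=json` (fields `sealed`, `t` for the threshold, `n` for the total shares, and `progress`). The function name `unseal_keys_remaining` is hypothetical and not part of any Vault client.

```python
def unseal_keys_remaining(seal_status: dict) -> int:
    """Return how many more key shares are needed before Vault unseals.

    Assumes `seal_status` carries the `sealed`, `t` and `progress` fields
    as reported by Vault's seal status output.
    """
    # A missing `sealed` field is treated as sealed, the safer default
    if not seal_status.get("sealed", True):
        return 0
    threshold = int(seal_status.get("t", 0))
    progress = int(seal_status.get("progress", 0))
    return max(threshold - progress, 0)


if __name__ == "__main__":
    # Hypothetical status after two of three required shares were entered
    status = {"sealed": True, "t": 3, "n": 5, "progress": 2}
    print(unseal_keys_remaining(status))
```

A helper like this could feed a runbook prompt that tells the incident lead how many key holders still need to be paged.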
### Unseal Procedure (Auto-Unseal)
If using cloud KMS auto-unseal, Vault unseals automatically on restart:
```bash
# Restart Vault service
systemctl restart vault
# Verify unseal (should happen within seconds)
vault status
```
**If auto-unseal fails:**
- Check cloud KMS key permissions (IAM role may have been modified)
- Check network connectivity to cloud KMS endpoint
- Check KMS key status (not disabled, not scheduled for deletion)
- Check Vault logs: `journalctl -u vault -f`
## Mass Credential Rotation Procedure
When a broad compromise requires rotating many credentials simultaneously.
### Pre-Rotation Checklist
- [ ] Identify all credentials in scope
- [ ] Map credential dependencies (which services use which credentials)
- [ ] Determine rotation order (databases before applications)
- [ ] Prepare rollback plan for each credential
- [ ] Notify all service owners
- [ ] Schedule maintenance window if zero-downtime not possible
- [ ] Stage new credentials in secret store (but don't activate yet)
### Rotation Order
1. **Infrastructure credentials** — Database root passwords, cloud IAM admin keys
2. **Service credentials** — Application database users, API keys
3. **Integration credentials** — Third-party API keys, webhook secrets
4. **Human credentials** — Force password reset, revoke SSO sessions
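
The ordering above can be applied mechanically to an inventory. This is a small sketch under stated assumptions: the tier labels (`infrastructure`, `service`, `integration`, `human`) and the `Credential` type are hypothetical names chosen for illustration, not part of the skill's scripts.

```python
from dataclasses import dataclass

# Lower number rotates first; mirrors the four tiers listed above
ROTATION_TIERS = {
    "infrastructure": 1,
    "service": 2,
    "integration": 3,
    "human": 4,
}


@dataclass
class Credential:
    name: str
    tier: str


def rotation_order(credentials):
    """Sort credentials by tier; unknown tiers go last.

    sorted() is stable, so credentials within one tier keep their input order.
    """
    last = len(ROTATION_TIERS) + 1
    return sorted(credentials, key=lambda c: ROTATION_TIERS.get(c.tier, last))


if __name__ == "__main__":
    inventory = [
        Credential("stripe-webhook", "integration"),
        Credential("db-root", "infrastructure"),
        Credential("app-db-user", "service"),
    ]
    for cred in rotation_order(inventory):
        print(cred.tier, cred.name)
```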
### Rollback Plan
For each credential, document:
- Previous value (store in sealed emergency envelope or HSM)
- How to revert (specific command or API call)
- Verification step (how to confirm old credential works)
- Maximum time to rollback (SLA)
## Vault Recovery Procedures
### Lost Unseal Keys
If unseal keys are lost and auto-unseal is not configured:
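1. **If Vault is currently unsealed:** Do not seal or restart it. If a quorum of key holders is still available, rekey (`vault operator rekey`) or migrate to auto-unseal right away; both steps require a quorum of the existing keys. If the quorum is gone, keep Vault running and copy the secrets into a newly initialized cluster
2. **If Vault is sealed:** Data is irrecoverable without a quorum of keys. Raft snapshots are encrypted under the same key hierarchy, so restoring a backup does not bypass the missing keys
3. **Prevention:** Store unseal keys in separate, secure locations (HSMs, safety deposit boxes). Use auto-unseal for production.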
### Raft Cluster Recovery
**Single node failure (cluster still has quorum):**
```bash
# Remove failed peer
vault operator raft remove-peer <failed-node-id>
# Add replacement node
# (new node joins via retry_join in config)
```
**Loss of quorum (majority of nodes failed):**
```bash
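# Quorum lost: on the surviving node with the most recent data, stop Vault,
# write a peers.json listing only the surviving node(s) into <raft-path>/raft/,
# then start and unseal Vault so it re-forms the cluster from that file
# Rebuilt nodes then rejoin the recovered leader
vault operator raft join -leader-ca-cert=@ca.crt https://surviving-node:8200
# If no node survives, restore from snapshot on a freshly initialized cluster
vault operator raft snapshot restore /backups/latest.snap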
```
### Root Token Recovery
If root token is lost (it should be revoked after initial setup):
```bash
# Generate new root token (requires unseal key quorum)
vault operator generate-root -init
# Each key holder provides their key
vault operator generate-root -nonce=<nonce> <unseal-key>
# After quorum, decode the encoded token
vault operator generate-root -decode=<encoded-token> -otp=<otp>
```
**Best practice:** Generate a root token only when needed, complete the task, then revoke it:
```bash
vault token revoke <root-token>
```
## Incident Severity Escalation Matrix
| Signal | Escalation |
|--------|-----------|
| Single secret exposed in internal log | P2 — Rotate secret, add log masking |
| Secret in public repository (no evidence of use) | P1 — Immediate rotation, history scrub |
| Secret in public repository (evidence of unauthorized use) | P0 — Full incident response, legal notification |
| Vault node compromised | P0 — Seal cluster, rotate all accessible secrets |
| Cloud KMS key compromised | P0 — Create new key, re-encrypt all secrets, rotate all credentials |
| Audit log gap detected | P1 — Investigate cause, assume worst case for gap period |
| Multiple failed auth attempts from unknown source | P2 — Block source, investigate, rotate targeted credentials |


@@ -0,0 +1,342 @@
# HashiCorp Vault Architecture & Patterns Reference
## Architecture Overview
Vault operates as a centralized secret management service with a client-server model. All secrets are encrypted at rest and in transit. The seal/unseal mechanism protects the master encryption key.
### Core Components
```
┌─────────────────────────────────────────────────┐
│                  Vault Cluster                  │
│  ┌───────────┐   ┌───────────┐   ┌───────────┐  │
│  │  Leader   │   │  Standby  │   │  Standby  │  │
│  │ (active)  │   │ (forward) │   │ (forward) │  │
│  └─────┬─────┘   └─────┬─────┘   └─────┬─────┘  │
│        │               │               │        │
│  ┌─────┴───────────────┴───────────────┴─────┐  │
│  │           Raft Storage Backend            │  │
│  └───────────────────────────────────────────┘  │
│                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐   │
│  │   Auth   │  │  Secret  │  │    Audit     │   │
│  │ Methods  │  │ Engines  │  │   Devices    │   │
│  └──────────┘  └──────────┘  └──────────────┘   │
└─────────────────────────────────────────────────┘
```
### Storage Backend Selection
| Backend | HA Support | Operational Complexity | Recommendation |
|---------|-----------|----------------------|----------------|
| Integrated Raft | Yes | Low | **Default choice** — no external dependencies |
| Consul | Yes | Medium | Legacy — use Raft unless already running Consul |
| S3/GCS/Azure Blob | Partial (GCS only) | Low | Dev/test only: object stores are not suited to production HA |
| PostgreSQL/MySQL | Optional (`ha_enabled`) | Medium | Not recommended: community-supported, added dependency |
## High Availability Setup
### Raft Cluster Configuration
Minimum 3 nodes for production (tolerates 1 failure). 5 nodes for critical workloads (tolerates 2 failures).
```hcl
# vault-config.hcl (per node)
storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-1"

  retry_join {
    leader_api_addr = "https://vault-2.internal:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-3.internal:8200"
  }
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/opt/vault/tls/vault.crt"
  tls_key_file  = "/opt/vault/tls/vault.key"
}

api_addr     = "https://vault-1.internal:8200"
cluster_addr = "https://vault-1.internal:8201"
```
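
The node counts quoted above follow from Raft's majority quorum rule. A short sketch of the arithmetic (the function names are illustrative only):

```python
def raft_quorum(nodes: int) -> int:
    """Smallest majority of a cluster with `nodes` voting members."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def failures_tolerated(nodes: int) -> int:
    """Number of nodes that can fail while a majority remains."""
    return nodes - raft_quorum(nodes)


if __name__ == "__main__":
    for size in (3, 4, 5):
        print(size, raft_quorum(size), failures_tolerated(size))
```

Note that a 4-node cluster tolerates no more failures than a 3-node cluster, which is why odd cluster sizes are the usual choice.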
### Auto-Unseal with AWS KMS
Eliminates manual unseal key management. Vault encrypts its master key with the KMS key.
```hcl
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal"
}
```
**Requirements:**
- IAM role with `kms:Encrypt`, `kms:Decrypt`, `kms:DescribeKey` permissions
- KMS key must be in the same region or accessible cross-region
- KMS key should have restricted access — only Vault nodes
### Auto-Unseal with Azure Key Vault
```hcl
seal "azurekeyvault" {
  tenant_id  = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  vault_name = "vault-unseal-kv"
  key_name   = "vault-unseal-key"
}
```
### Auto-Unseal with GCP KMS
```hcl
seal "gcpckms" {
  project    = "my-project"
  region     = "global"
  key_ring   = "vault-keyring"
  crypto_key = "vault-unseal-key"
}
```
## Namespaces (Enterprise)
Namespaces provide tenant isolation within a single Vault cluster. Each namespace has independent policies, auth methods, and secret engines.
```
root/
├── dev/                 # Development environment
│   ├── auth/
│   └── secret/
├── staging/             # Staging environment
│   ├── auth/
│   └── secret/
└── production/          # Production environment
    ├── auth/
    └── secret/
```
**OSS alternative:** Use path-based isolation with strict policies. Prefix all paths with environment name (e.g., `secret/data/production/...`).
## Policy Patterns
### Templated Policies
Use identity-based templates for scalable policy management:
```hcl
# Allow entities to manage their own secrets
path "secret/data/{{identity.entity.name}}/*" {
  capabilities = ["create", "read", "update", "delete"]
}

# Read shared config for one of the entity's groups. Group templates must
# name a specific group: <group-id> is a placeholder for that group's ID.
path "secret/data/shared/{{identity.groups.ids.<group-id>.name}}/*" {
  capabilities = ["read"]
}
```
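
To make the effect of templating concrete, here is a toy model of how a templated path resolves for one entity. It is a simplified sketch, not Vault's matcher: it only substitutes `{{identity.entity.name}}` and only understands a trailing `*` glob (no `+` segment wildcards). Both function names are hypothetical.

```python
ENTITY_NAME_TEMPLATE = "{{identity.entity.name}}"


def render_policy_path(template: str, entity_name: str) -> str:
    """Substitute the entity name into a templated policy path."""
    return template.replace(ENTITY_NAME_TEMPLATE, entity_name)


def path_allowed(policy_path: str, request_path: str) -> bool:
    """Simplified match: a trailing '*' means prefix match, otherwise exact."""
    if policy_path.endswith("*"):
        return request_path.startswith(policy_path[:-1])
    return request_path == policy_path


if __name__ == "__main__":
    rendered = render_policy_path("secret/data/{{identity.entity.name}}/*", "payment-svc")
    print(rendered)
    print(path_allowed(rendered, "secret/data/payment-svc/db"))
    print(path_allowed(rendered, "secret/data/billing-svc/db"))
```

One policy document therefore serves every entity, and each entity only reaches the subtree carrying its own name.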
### Sentinel Policies (Enterprise)
Enforce governance rules beyond path-based access:
```python
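# Require MFA for production secret writes
import "mfa"

# True only for write operations on production secrets
production_write = rule {
    request.path matches "secret/data/production/.*" and
    request.operation in ["create", "update", "delete"]
}

# Other requests pass; production writes must carry a valid TOTP
main = rule when production_write {
    mfa.methods.totp.valid
}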
```
### Policy Hierarchy
1. **Global deny** — Explicit deny on `sys/*`, `auth/token/create-orphan`
2. **Environment base** — Read access to environment-specific paths
3. **Service-specific** — Scoped to exact paths the service needs
4. **Admin override** — Requires MFA, time-limited, audit-heavy
## Secret Engine Configuration
### KV v2 (Versioned Key-Value)
```bash
# Enable with custom config
vault secrets enable -path=secret -version=2 kv
# Configure version retention
vault write secret/config max_versions=10 cas_required=true delete_version_after=90d
```
**Check-and-Set (CAS):** Prevents accidental overwrites. Client must supply the current version number to update.
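
The CAS rule can be illustrated with a toy in-memory store. This is a sketch of the semantics only, assuming `cas=0` means "write only if the key does not exist yet" and otherwise `cas` must equal the current version. `ToyKV` and `CasConflict` are hypothetical names and do not reflect Vault's implementation.

```python
class CasConflict(Exception):
    """Raised when the supplied version does not match the stored one."""


class ToyKV:
    """Minimal versioned key-value store with check-and-set writes."""

    def __init__(self):
        self._versions = {}  # path -> list of values, oldest first

    def current_version(self, path):
        # Version 0 means the path has never been written
        return len(self._versions.get(path, []))

    def write(self, path, value, cas):
        current = self.current_version(path)
        if cas != current:
            raise CasConflict(f"{path}: expected version {current}, got cas={cas}")
        self._versions.setdefault(path, []).append(value)
        return current + 1


if __name__ == "__main__":
    kv = ToyKV()
    print(kv.write("app/config", {"mode": "a"}, cas=0))
    print(kv.write("app/config", {"mode": "b"}, cas=1))
    try:
        kv.write("app/config", {"mode": "c"}, cas=1)  # stale version
    except CasConflict as err:
        print("rejected:", err)
```

Two writers that both read version 1 cannot both succeed: the second write is rejected instead of silently overwriting the first.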
### Database Engine
```bash
# Enable and configure PostgreSQL
vault secrets enable database
vault write database/config/postgres \
plugin_name=postgresql-database-plugin \
connection_url="postgresql://{{username}}:{{password}}@db.internal:5432/app?sslmode=require" \
allowed_roles="app-readonly,app-readwrite" \
username="vault_admin" \
password="INITIAL_PASSWORD"
# Rotate the root password (Vault manages it from now on)
vault write -f database/rotate-root/postgres
# Create a read-only role
vault write database/roles/app-readonly \
db_name=postgres \
creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
revocation_statements="DROP ROLE IF EXISTS \"{{name}}\";" \
default_ttl=1h \
max_ttl=24h
```
### PKI Engine (Certificate Authority)
```bash
# Enable PKI engine
vault secrets enable -path=pki pki
vault secrets tune -max-lease-ttl=87600h pki
# Generate root CA
vault write -field=certificate pki/root/generate/internal \
common_name="Example Root CA" \
ttl=87600h > root_ca.crt
# Enable intermediate CA
vault secrets enable -path=pki_int pki
vault secrets tune -max-lease-ttl=43800h pki_int
# Generate intermediate CSR
vault write -field=csr pki_int/intermediate/generate/internal \
common_name="Example Intermediate CA" > intermediate.csr
# Sign with root CA
vault write -field=certificate pki/root/sign-intermediate \
csr=@intermediate.csr format=pem_bundle ttl=43800h > intermediate.crt
# Set signed certificate
vault write pki_int/intermediate/set-signed certificate=@intermediate.crt
# Create role for leaf certificates
vault write pki_int/roles/web-server \
allowed_domains="example.com" \
allow_subdomains=true \
max_ttl=2160h
```
### Transit Engine (Encryption-as-a-Service)
```bash
vault secrets enable transit
# Create encryption key
vault write -f transit/keys/payment-data \
type=aes256-gcm96
# Encrypt data
vault write transit/encrypt/payment-data \
  plaintext=$(printf '%s' "sensitive-data" | base64)
# Decrypt data
vault write transit/decrypt/payment-data \
ciphertext="vault:v1:..."
# Rotate key (old versions still decrypt, new encrypts with latest)
vault write -f transit/keys/payment-data/rotate
# Rewrap ciphertext to latest key version
vault write transit/rewrap/payment-data \
ciphertext="vault:v1:..."
```
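
Two details in the commands above are easy to get wrong from application code: the plaintext must be base64-encoded before it is sent, and the ciphertext prefix records which key version produced it. A minimal sketch, assuming the `vault:v<N>:<payload>` ciphertext layout shown above; both helper names are hypothetical.

```python
import base64


def encode_plaintext(data: bytes) -> str:
    """Base64-encode raw bytes for the transit `plaintext` parameter."""
    return base64.b64encode(data).decode("ascii")


def ciphertext_key_version(ciphertext: str) -> int:
    """Extract the key version from a `vault:v<N>:<payload>` ciphertext."""
    prefix, version, _payload = ciphertext.split(":", 2)
    if prefix != "vault" or not version.startswith("v"):
        raise ValueError(f"unexpected ciphertext format: {ciphertext!r}")
    return int(version[1:])


if __name__ == "__main__":
    print(encode_plaintext(b"sensitive-data"))
    print(ciphertext_key_version("vault:v2:abcdef"))
```

Comparing the parsed version with the key's latest version is one way to decide which stored ciphertexts still need a rewrap after rotation.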
## Performance and Scaling
### Performance Replication (Enterprise)
Primary cluster replicates to secondary clusters in other regions. Secondaries handle read traffic locally.
### Performance Standbys (Enterprise)
Standby nodes serve read requests without forwarding to the leader, reducing leader load.
### Response Wrapping
Wrap sensitive responses in a single-use token — the recipient unwraps exactly once:
```bash
# Wrap a secret (TTL = 5 minutes)
vault kv get -wrap-ttl=5m secret/production/db-creds
# Recipient unwraps
vault unwrap <wrapping_token>
```
### Batch Tokens
For high-throughput workloads (Lambda, serverless), use batch tokens instead of service tokens. Batch tokens are not persisted to storage, reducing I/O.
## Monitoring and Health
### Key Metrics
| Metric | Alert Threshold | Source |
|--------|----------------|--------|
| `vault.core.unsealed` | 0 (sealed) | Telemetry |
| `vault.expire.num_leases` | >10,000 | Telemetry |
| `vault.audit.log_response` | Error rate >1% | Telemetry |
| `vault.runtime.alloc_bytes` | >80% memory | Telemetry |
| `vault.raft.leader.lastContact` | >500ms | Telemetry |
| `vault.token.count` | >50,000 | Telemetry |
### Health Check Endpoint
```bash
# Returns 200 if initialized, unsealed, and active
curl -s https://vault.internal:8200/v1/sys/health
# Status codes:
# 200 — initialized, unsealed, active
# 429 — unsealed, standby
# 472 — disaster recovery secondary
# 473 — performance standby
# 501 — not initialized
# 503 — sealed
```
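
A load balancer probe or monitoring hook usually needs these codes as data. The sketch below simply restates the list above as a lookup table; the HTTP request itself is left out, and the names `HEALTH_STATES`, `describe_health` and `is_active` are illustrative.

```python
# Status codes of /v1/sys/health, as listed above
HEALTH_STATES = {
    200: "initialized, unsealed, active",
    429: "unsealed, standby",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_health(status_code: int) -> str:
    """Translate a health endpoint status code into a readable state."""
    return HEALTH_STATES.get(status_code, f"unexpected status {status_code}")


def is_active(status_code: int) -> bool:
    """Only an active, unsealed node answers with 200."""
    return status_code == 200


if __name__ == "__main__":
    for code in (200, 429, 503, 500):
        print(code, describe_health(code), is_active(code))
```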
## Disaster Recovery
### Backup
```bash
# Raft snapshot (includes all data)
vault operator raft snapshot save backup-$(date +%Y%m%d).snap
# Schedule daily backups via cron
0 2 * * * /usr/local/bin/vault operator raft snapshot save /backups/vault-$(date +\%Y\%m\%d).snap
```
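
Daily snapshots accumulate, so a retention step is commonly paired with the cron job. The following is a sketch only, assuming the `vault-YYYYMMDD.snap` naming from the cron line above and a hypothetical 14-day retention window; it only selects file names and deletes nothing.

```python
from datetime import date, datetime, timedelta


def snapshots_to_prune(filenames, today, keep_days=14):
    """Return snapshot names older than `keep_days`, judged by the date in the name.

    Files that do not match `vault-YYYYMMDD.snap` are ignored.
    """
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in filenames:
        if not (name.startswith("vault-") and name.endswith(".snap")):
            continue
        stamp = name[len("vault-"):-len(".snap")]
        try:
            taken = datetime.strptime(stamp, "%Y%m%d").date()
        except ValueError:
            continue  # not a dated snapshot name
        if taken < cutoff:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    files = ["vault-20260301.snap", "vault-20260320.snap", "notes.txt"]
    print(snapshots_to_prune(files, today=date(2026, 3, 25)))
```

Passing `today` explicitly keeps the function deterministic and easy to test before it is wired to a real backup directory.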
### Restore
```bash
# Restore from snapshot (causes brief outage)
vault operator raft snapshot restore backup-20260320.snap
```
### DR Replication (Enterprise)
Secondary cluster in standby. Promote on primary failure:
```bash
# On DR secondary
vault operator generate-root -dr-token
vault write sys/replication/dr/secondary/promote dr_operation_token=<token>
```


@@ -0,0 +1,330 @@
#!/usr/bin/env python3
"""Analyze Vault or cloud secret manager audit logs for anomalies.

Reads JSON-lines or JSON-array audit log files and flags unusual access
patterns including volume spikes, off-hours access, access from many
source IPs, and failed access attempts.

Usage:
    python audit_log_analyzer.py --log-file vault-audit.log --threshold 5
    python audit_log_analyzer.py --log-file audit.json --threshold 3 --json

Expected log entry format (JSON lines or JSON array):
    {
        "timestamp": "2026-03-20T14:32:00Z",
        "type": "request",
        "auth": {"accessor": "token-abc123", "entity_id": "eid-001", "display_name": "approle-payment-svc"},
        "request": {"path": "secret/data/production/payment/api-keys", "operation": "read"},
        "response": {"status_code": 200},
        "remote_address": "10.0.1.15"
    }

Fields are optional; the analyzer works with whatever is available.
"""

import argparse
import json
import sys
import textwrap
from collections import defaultdict
from datetime import datetime


def load_logs(path):
    """Load audit log entries from file. Supports JSON lines and JSON array."""
    entries = []
    try:
        with open(path, "r") as f:
            content = f.read().strip()
    except FileNotFoundError:
        print(f"ERROR: Log file not found: {path}", file=sys.stderr)
        sys.exit(1)

    if not content:
        return entries

    # Try JSON array first
    if content.startswith("["):
        try:
            entries = json.loads(content)
            return entries
        except json.JSONDecodeError:
            pass

    # Try JSON lines
    for i, line in enumerate(content.split("\n"), 1):
        line = line.strip()
        if not line:
            continue
        try:
            entries.append(json.loads(line))
        except json.JSONDecodeError:
            print(f"WARNING: Skipping malformed line {i}", file=sys.stderr)

    return entries


def parse_timestamp(raw):
    """Parse an ISO-8601-style timestamp string; return None if unparseable."""
    if not isinstance(raw, str) or not raw.strip():
        return None
    text = raw.strip()
    # Vault writes nanosecond precision, but %f accepts at most six digits,
    # so trim any extra fractional digits before parsing.
    head, dot, rest = text.partition(".")
    if dot:
        n = 0
        while n < len(rest) and rest[n].isdigit():
            n += 1
        text = f"{head}.{rest[:min(n, 6)]}{rest[n:]}"
    # %z accepts "Z", "+00:00" and "+0000" (Python 3.7+)
    formats = (
        "%Y-%m-%dT%H:%M:%S.%f%z",
        "%Y-%m-%dT%H:%M:%S%z",
        "%Y-%m-%dT%H:%M:%S.%f",
        "%Y-%m-%dT%H:%M:%S",
        "%Y-%m-%d %H:%M:%S",
    )
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def extract_fields(entry):
    """Extract normalized fields from a log entry."""
    ts = parse_timestamp(entry.get("timestamp", entry.get("time", "")))

    # Treat explicit nulls the same as missing sections
    auth = entry.get("auth") or {}
    request = entry.get("request") or {}
    response = entry.get("response") or {}

    return {
        "timestamp": ts,
        "hour": ts.hour if ts else None,
        "identity": auth.get("display_name", auth.get("entity_id", "unknown")),
        "path": request.get("path", entry.get("path", "unknown")),
        "operation": request.get("operation", entry.get("operation", "unknown")),
        "status_code": response.get("status_code", entry.get("status_code")),
        "remote_address": entry.get("remote_address", entry.get("source_address", "unknown")),
        "entry_type": entry.get("type", "unknown"),
    }


def analyze(entries, threshold):
    """Run anomaly detection across all log entries."""
    # Skip anything that is not a JSON object (e.g. stray strings in an array)
    parsed = [extract_fields(e) for e in entries if isinstance(e, dict)]

    # Counters
    access_by_identity = defaultdict(int)
    access_by_path = defaultdict(int)
    access_by_ip = defaultdict(set)  # identity -> set of IPs
    ip_to_identities = defaultdict(set)  # IP -> set of identities
    failed_by_source = defaultdict(int)
    off_hours_access = []
    path_by_identity = defaultdict(set)  # identity -> set of paths
    hourly_distribution = defaultdict(int)

    for p in parsed:
        identity = p["identity"]
        path = p["path"]
        ip = p["remote_address"]
        status = p["status_code"]
        hour = p["hour"]

        access_by_identity[identity] += 1
        access_by_path[path] += 1
        access_by_ip[identity].add(ip)
        ip_to_identities[ip].add(identity)
        path_by_identity[identity].add(path)

        if hour is not None:
            hourly_distribution[hour] += 1

        # Failed access: HTTP 4xx/5xx (non-integer status values are ignored)
        if isinstance(status, int) and status >= 400:
            failed_by_source[f"{identity}@{ip}"] += 1

        # Off-hours: before 6 AM or after 10 PM
        if hour is not None and (hour < 6 or hour >= 22):
            off_hours_access.append(p)

    # Build anomalies
    anomalies = []

    # 1. Volume spikes: identities accessing secrets more than threshold * average
    if access_by_identity:
        avg_access = sum(access_by_identity.values()) / len(access_by_identity)
        spike_threshold = max(threshold * avg_access, threshold)
        for identity, count in access_by_identity.items():
            if count >= spike_threshold:
                anomalies.append({
                    "type": "volume_spike",
                    "severity": "HIGH",
                    "identity": identity,
                    "access_count": count,
                    "threshold": round(spike_threshold, 1),
                    "description": f"Identity '{identity}' made {count} accesses (threshold: {round(spike_threshold, 1)})",
                })

    # 2. Multi-IP access: single identity from many IPs
    for identity, ips in access_by_ip.items():
        if len(ips) >= threshold:
            anomalies.append({
                "type": "multi_ip_access",
                "severity": "MEDIUM",
                "identity": identity,
                "ip_count": len(ips),
                "ips": sorted(ips),
                "description": f"Identity '{identity}' accessed from {len(ips)} different IPs",
            })

    # 3. Failed access attempts
    for source, count in failed_by_source.items():
        if count >= threshold:
            anomalies.append({
                "type": "failed_access",
                "severity": "HIGH",
                "source": source,
                "failure_count": count,
                "description": f"Source '{source}' had {count} failed access attempts",
            })

    # 4. Off-hours access
    if off_hours_access:
        off_hours_identities = defaultdict(int)
        for p in off_hours_access:
            off_hours_identities[p["identity"]] += 1

        for identity, count in off_hours_identities.items():
            if count >= max(threshold, 2):
                anomalies.append({
                    "type": "off_hours_access",
                    "severity": "MEDIUM",
                    "identity": identity,
                    "access_count": count,
                    "description": f"Identity '{identity}' made {count} accesses outside business hours (before 6 AM / after 10 PM)",
                })

    # 5. Broad path access: single identity touching many paths
    for identity, paths in path_by_identity.items():
        if len(paths) >= threshold * 2:
            anomalies.append({
                "type": "broad_access",
                "severity": "MEDIUM",
                "identity": identity,
                "path_count": len(paths),
                "paths": sorted(paths)[:10],
                "description": f"Identity '{identity}' accessed {len(paths)} distinct secret paths",
            })

    # Sort anomalies by severity
    severity_order = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
    anomalies.sort(key=lambda x: severity_order.get(x["severity"], 4))

    # Summary stats
    summary = {
        "total_entries": len(entries),
        "parsed_entries": len(parsed),
        "unique_identities": len(access_by_identity),
        "unique_paths": len(access_by_path),
        "unique_source_ips": len(ip_to_identities),
        "total_failures": sum(failed_by_source.values()),
        "off_hours_events": len(off_hours_access),
        "anomalies_found": len(anomalies),
    }

    # Top accessed paths
    top_paths = sorted(access_by_path.items(), key=lambda x: -x[1])[:10]

    return {
        "summary": summary,
        "anomalies": anomalies,
        "top_accessed_paths": [{"path": p, "count": c} for p, c in top_paths],
        "hourly_distribution": dict(sorted(hourly_distribution.items())),
    }


def print_human(result, threshold):
    """Print human-readable analysis report."""
    summary = result["summary"]
    anomalies = result["anomalies"]

    print("=== Audit Log Analysis Report ===")
    print(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}")
    print(f"Anomaly threshold: {threshold}")
    print()

    print("--- Summary ---")
    print(f" Total log entries: {summary['total_entries']}")
    print(f" Unique identities: {summary['unique_identities']}")
    print(f" Unique secret paths: {summary['unique_paths']}")
    print(f" Unique source IPs: {summary['unique_source_ips']}")
    print(f" Total failures: {summary['total_failures']}")
    print(f" Off-hours events: {summary['off_hours_events']}")
    print(f" Anomalies detected: {summary['anomalies_found']}")
    print()

    if anomalies:
        print("--- Anomalies ---")
        for a in anomalies:
            print(f" [{a['severity']}] {a['type']}: {a['description']}")
        print()
    else:
        print("--- No anomalies detected ---")
        print()

    if result["top_accessed_paths"]:
        print("--- Top Accessed Paths ---")
        for item in result["top_accessed_paths"]:
            print(f" {item['count']:5d} {item['path']}")
        print()

    if result["hourly_distribution"]:
        print("--- Hourly Distribution ---")
        max_count = max(result["hourly_distribution"].values())
        for hour in range(24):
            count = result["hourly_distribution"].get(hour, 0)
            bar_len = int((count / max_count) * 40) if max_count > 0 else 0
            marker = " *" if (hour < 6 or hour >= 22) else ""
            print(f" {hour:02d}:00 {'#' * bar_len:40s} {count}{marker}")
        print(" (* = off-hours)")


def main():
    parser = argparse.ArgumentParser(
        description="Analyze Vault/cloud secret manager audit logs for anomalies.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=textwrap.dedent("""\
            The analyzer detects:
              - Volume spikes (identity accessing secrets above threshold * average)
              - Multi-IP access (single identity from many source IPs)
              - Failed access attempts (repeated auth/access failures)
              - Off-hours access (before 6 AM or after 10 PM)
              - Broad path access (single identity accessing many distinct paths)

            Log format: JSON lines or JSON array. Each entry should include
            timestamp, auth info, request path/operation, response status,
            and remote address. Missing fields are handled gracefully.

            Examples:
              %(prog)s --log-file vault-audit.log --threshold 5
              %(prog)s --log-file audit.json --threshold 3 --json
        """),
    )
    parser.add_argument("--log-file", required=True, help="Path to audit log file (JSON lines or JSON array)")
    parser.add_argument(
        "--threshold",
        type=int,
        default=5,
        help="Anomaly sensitivity threshold; lower = more sensitive (default: 5)",
    )
    parser.add_argument("--json", action="store_true", dest="json_output", help="Output as JSON")

    args = parser.parse_args()

    entries = load_logs(args.log_file)
    if not entries:
        print("No log entries found in file.", file=sys.stderr)
        sys.exit(1)

    result = analyze(entries, args.threshold)
    result["log_file"] = args.log_file
    result["threshold"] = args.threshold
    result["analyzed_at"] = datetime.now().isoformat()

    if args.json_output:
        print(json.dumps(result, indent=2))
    else:
        print_human(result, args.threshold)


if __name__ == "__main__":
    main()


@@ -0,0 +1,280 @@
#!/usr/bin/env python3
"""Create a rotation schedule from a secret inventory file.

Reads a JSON inventory of secrets and produces a rotation plan based on
the selected policy (30d, 60d, 90d) with urgency classification.

Usage:
    python rotation_planner.py --inventory secrets.json --policy 30d
    python rotation_planner.py --inventory secrets.json --policy 90d --json

Inventory file format (JSON):
    [
        {
            "name": "prod-db-password",
            "type": "database",
            "store": "vault",
            "last_rotated": "2026-01-15",
            "owner": "platform-team",
            "environment": "production"
        },
        ...
    ]
"""

import argparse
import json
import sys
import textwrap
from datetime import datetime, timedelta

POLICY_DAYS = {
    "30d": 30,
    "60d": 60,
    "90d": 90,
}

# Maximum rotation period by secret type, in days. The effective interval
# is the shorter of this value and the selected policy.
TYPE_DEFAULTS = {
    "database": 30,
    "api-key": 90,
    "tls-certificate": 60,
    "ssh-key": 90,
    "service-token": 1,
    "encryption-key": 90,
    "oauth-secret": 90,
    "password": 30,
}

URGENCY_THRESHOLDS = {
    "critical": 0,  # Due today or already overdue
    "high": 7,  # Due within 7 days
    "medium": 14,  # Due within 14 days
    "low": 30,  # Due within 30 days
}


def load_inventory(path):
    """Load and validate secret inventory from JSON file."""
    try:
        with open(path, "r") as f:
            data = json.load(f)
    except FileNotFoundError:
        print(f"ERROR: Inventory file not found: {path}", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError as e:
        print(f"ERROR: Invalid JSON in {path}: {e}", file=sys.stderr)
        sys.exit(1)

    if not isinstance(data, list):
        print("ERROR: Inventory must be a JSON array of secret objects", file=sys.stderr)
        sys.exit(1)

    validated = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            print(f"WARNING: Skipping entry {i}: not an object", file=sys.stderr)
            continue

        name = entry.get("name", f"unnamed-{i}")
        secret_type = entry.get("type", "unknown")
        last_rotated = entry.get("last_rotated")

        if not last_rotated:
            print(f"WARNING: '{name}' has no last_rotated date; marking as overdue", file=sys.stderr)
            last_rotated_dt = None
        else:
            try:
                last_rotated_dt = datetime.strptime(last_rotated, "%Y-%m-%d")
            except ValueError:
                print(f"WARNING: '{name}' has invalid date '{last_rotated}'; marking as overdue", file=sys.stderr)
                last_rotated_dt = None

        validated.append({
            "name": name,
            "type": secret_type,
            "store": entry.get("store", "unknown"),
            "last_rotated": last_rotated_dt,
            "owner": entry.get("owner", "unassigned"),
            "environment": entry.get("environment", "unknown"),
        })

    return validated


def compute_schedule(inventory, policy_days):
    """Compute rotation schedule for each secret."""
    # Work in whole calendar days (midnight today). Using the current time of
    # day would floor partial days and report a secret due tomorrow as overdue.
    now = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
    schedule = []

    for secret in inventory:
        # Determine rotation interval
        type_default = TYPE_DEFAULTS.get(secret["type"], 90)
        rotation_interval = min(policy_days, type_default)

        if secret["last_rotated"] is None:
            days_since = 999
            next_rotation = now  # Immediate
            days_until = -999
        else:
            days_since = (now - secret["last_rotated"]).days
            next_rotation = secret["last_rotated"] + timedelta(days=rotation_interval)
            days_until = (next_rotation - now).days

        # Classify urgency
        if days_until <= URGENCY_THRESHOLDS["critical"]:
            urgency = "CRITICAL"
        elif days_until <= URGENCY_THRESHOLDS["high"]:
            urgency = "HIGH"
        elif days_until <= URGENCY_THRESHOLDS["medium"]:
            urgency = "MEDIUM"
        else:
            urgency = "LOW"

        schedule.append({
            "name": secret["name"],
            "type": secret["type"],
            "store": secret["store"],
            "owner": secret["owner"],
            "environment": secret["environment"],
            "last_rotated": secret["last_rotated"].strftime("%Y-%m-%d") if secret["last_rotated"] else "NEVER",
            "rotation_interval_days": rotation_interval,
            "next_rotation": next_rotation.strftime("%Y-%m-%d"),
            "days_until_due": days_until,
            "days_since_rotation": days_since,
            "urgency": urgency,
        })

    # Sort by urgency (critical first), then by days until due
    urgency_order = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
    schedule.sort(key=lambda x: (urgency_order.get(x["urgency"], 4), x["days_until_due"]))

    return schedule


def build_summary(schedule):
    """Build summary statistics."""
    total = len(schedule)
    by_urgency = {}
    by_type = {}
    by_owner = {}

    for entry in schedule:
        urg = entry["urgency"]
        by_urgency[urg] = by_urgency.get(urg, 0) + 1
        t = entry["type"]
        by_type[t] = by_type.get(t, 0) + 1
        o = entry["owner"]
        by_owner[o] = by_owner.get(o, 0) + 1

    return {
        "total_secrets": total,
        "by_urgency": by_urgency,
        "by_type": by_type,
        "by_owner": by_owner,
        "overdue_count": by_urgency.get("CRITICAL", 0),
        "due_within_7d": by_urgency.get("HIGH", 0),
    }


def print_human(schedule, summary, policy):
    """Print human-readable rotation plan."""
    print(f"=== Secret Rotation Plan (Policy: {policy}) ===")
    print(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}")
    print(f"Total secrets: {summary['total_secrets']}")
    print()

    print("--- Urgency Summary ---")
    for urg in ["CRITICAL", "HIGH", "MEDIUM", "LOW"]:
        count = summary["by_urgency"].get(urg, 0)
        if count > 0:
            print(f" {urg:10s} {count}")
    print()

    if not schedule:
        print("No secrets in inventory.")
        return

    print("--- Rotation Schedule ---")
    print(f" {'Name':30s} {'Type':15s} {'Urgency':10s} {'Last Rotated':12s} {'Next Due':12s} {'Owner'}")
    print(f" {'-'*30} {'-'*15} {'-'*10} {'-'*12} {'-'*12} {'-'*15}")

    for entry in schedule:
        overdue_marker = " **OVERDUE**" if entry["urgency"] == "CRITICAL" else ""
        print(
            f" {entry['name']:30s} {entry['type']:15s} {entry['urgency']:10s} "
            f"{entry['last_rotated']:12s} {entry['next_rotation']:12s} "
            f"{entry['owner']}{overdue_marker}"
        )
    print()

    print("--- Action Items ---")
    critical = [e for e in schedule if e["urgency"] == "CRITICAL"]
    high = [e for e in schedule if e["urgency"] == "HIGH"]

    if critical:
        print(f" IMMEDIATE: Rotate {len(critical)} overdue secret(s):")
        for e in critical:
            print(f" - {e['name']} ({e['type']}, owner: {e['owner']})")
    if high:
        print(f" THIS WEEK: Rotate {len(high)} secret(s) due within 7 days:")
        for e in high:
            print(f" - {e['name']} (due: {e['next_rotation']}, owner: {e['owner']})")
    if not critical and not high:
        print(" No urgent rotations needed.")


def main():
    parser = argparse.ArgumentParser(
        description="Create rotation schedule from a secret inventory file.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=textwrap.dedent("""\
            Policies:
              30d  Aggressive: all secrets rotate within 30 days max
              60d  Standard: 60-day maximum rotation window
              90d  Relaxed: 90-day maximum rotation window

            Note: Some secret types (e.g., database passwords) have shorter
            built-in defaults that override the policy maximum.

            Example inventory file (secrets.json):
              [
                {"name": "prod-db", "type": "database", "store": "vault",
                 "last_rotated": "2026-01-15", "owner": "platform-team",
                 "environment": "production"}
              ]
        """),
    )
    parser.add_argument("--inventory", required=True, help="Path to JSON inventory file")
    parser.add_argument(
        "--policy",
        required=True,
        choices=["30d", "60d", "90d"],
        help="Rotation policy (maximum rotation interval)",
    )
    parser.add_argument("--json", action="store_true", dest="json_output", help="Output as JSON")

    args = parser.parse_args()

    policy_days = POLICY_DAYS[args.policy]
    inventory = load_inventory(args.inventory)
    schedule = compute_schedule(inventory, policy_days)
    summary = build_summary(schedule)

    result = {
        "policy": args.policy,
        "policy_days": policy_days,
        "generated_at": datetime.now().isoformat(),
        "summary": summary,
        "schedule": schedule,
    }

    if args.json_output:
        print(json.dumps(result, indent=2))
    else:
        print_human(schedule, summary, args.policy)


if __name__ == "__main__":
    main()


@@ -0,0 +1,302 @@
#!/usr/bin/env python3
"""Generate Vault policy and auth configuration from application requirements.

Produces HCL policy files and auth method setup commands for HashiCorp Vault
based on application name, auth method, and required secret paths.

Usage:
    python vault_config_generator.py --app-name payment-service --auth-method approle --secrets "db-creds,api-key,tls-cert"
    python vault_config_generator.py --app-name api-gateway --auth-method kubernetes --secrets "db-creds" --namespace production --json
"""

import argparse
import json
import sys
import textwrap
from datetime import datetime

# Default TTLs by auth method
AUTH_METHOD_DEFAULTS = {
    "approle": {
        "token_ttl": "1h",
        "token_max_ttl": "4h",
        "secret_id_num_uses": 1,
        "secret_id_ttl": "10m",
    },
    "kubernetes": {
        "token_ttl": "1h",
        "token_max_ttl": "4h",
    },
    "oidc": {
        "token_ttl": "8h",
        "token_max_ttl": "12h",
    },
}

# Secret type templates
SECRET_TYPE_MAP = {
    "db-creds": {
        "engine": "database",
        "path": "database/creds/{app}-readonly",
        "capabilities": ["read"],
        "description": "Dynamic database credentials",
    },
    "db-admin": {
        "engine": "database",
        "path": "database/creds/{app}-readwrite",
        "capabilities": ["read"],
        "description": "Dynamic database admin credentials",
    },
    "api-key": {
        "engine": "kv-v2",
        "path": "secret/data/{env}/{app}/api-keys",
        "capabilities": ["read"],
        "description": "Static API keys (KV v2)",
    },
    "tls-cert": {
        "engine": "pki",
        "path": "pki/issue/{app}-cert",
        "capabilities": ["create", "update"],
        "description": "TLS certificate issuance",
    },
    "encryption": {
        "engine": "transit",
        "path": "transit/encrypt/{app}-key",
        "capabilities": ["update"],
        "description": "Transit encryption operations",
    },
    "ssh-cert": {
        "engine": "ssh",
        "path": "ssh/sign/{app}-role",
        "capabilities": ["create", "update"],
        "description": "SSH certificate signing",
    },
    "config": {
        "engine": "kv-v2",
        "path": "secret/data/{env}/{app}/config",
        "capabilities": ["read"],
        "description": "Application configuration secrets",
    },
}
def parse_secrets(secrets_str):
"""Parse comma-separated secret types into list."""
secrets = [s.strip() for s in secrets_str.split(",") if s.strip()]
valid = []
unknown = []
for s in secrets:
if s in SECRET_TYPE_MAP:
valid.append(s)
else:
unknown.append(s)
return valid, unknown
def generate_policy_hcl(app_name, secrets, environment="production"):
"""Generate HCL policy document."""
lines = [
f'# Vault policy for {app_name}',
f'# Generated: {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}',
f'# Environment: {environment}',
'',
]
for secret_type in secrets:
tmpl = SECRET_TYPE_MAP[secret_type]
path = tmpl["path"].format(app=app_name, env=environment)
caps = ", ".join(f'"{c}"' for c in tmpl["capabilities"])
lines.append(f'# {tmpl["description"]}')
lines.append(f'path "{path}" {{')
lines.append(f' capabilities = [{caps}]')
lines.append('}')
lines.append('')
# Always deny sys paths
lines.append('# Deny admin paths')
lines.append('path "sys/*" {')
lines.append(' capabilities = ["deny"]')
lines.append('}')
return "\n".join(lines)
def generate_auth_config(app_name, auth_method, policy_name, namespace=None):
"""Generate auth method setup commands."""
commands = []
defaults = AUTH_METHOD_DEFAULTS.get(auth_method, {})
if auth_method == "approle":
cmd = (
f"vault write auth/approle/role/{app_name} \\\n"
f" token_ttl={defaults['token_ttl']} \\\n"
f" token_max_ttl={defaults['token_max_ttl']} \\\n"
f" secret_id_num_uses={defaults['secret_id_num_uses']} \\\n"
f" secret_id_ttl={defaults['secret_id_ttl']} \\\n"
f" token_policies=\"{policy_name}\""
)
commands.append({"description": f"Create AppRole for {app_name}", "command": cmd})
commands.append({
"description": "Fetch RoleID",
"command": f"vault read auth/approle/role/{app_name}/role-id",
})
commands.append({
"description": "Generate SecretID (single-use)",
"command": f"vault write -f auth/approle/role/{app_name}/secret-id",
})
elif auth_method == "kubernetes":
ns = namespace or "default"
cmd = (
f"vault write auth/kubernetes/role/{app_name} \\\n"
f" bound_service_account_names={app_name} \\\n"
f" bound_service_account_namespaces={ns} \\\n"
f" policies={policy_name} \\\n"
f" ttl={defaults['token_ttl']}"
)
commands.append({"description": f"Create Kubernetes auth role for {app_name}", "command": cmd})
elif auth_method == "oidc":
cmd = (
f"vault write auth/oidc/role/{app_name} \\\n"
f" bound_audiences=\"vault\" \\\n"
f" allowed_redirect_uris=\"https://vault.example.com/ui/vault/auth/oidc/oidc/callback\" \\\n"
f" user_claim=\"email\" \\\n"
f" oidc_scopes=\"openid,profile,email\" \\\n"
f" policies=\"{policy_name}\" \\\n"
f" ttl={defaults['token_ttl']}"
)
commands.append({"description": f"Create OIDC role for {app_name}", "command": cmd})
return commands
def build_output(app_name, auth_method, secrets, environment, namespace):
"""Build complete configuration output."""
valid_secrets, unknown_secrets = parse_secrets(secrets)
if not valid_secrets:
return {
"error": "No valid secret types provided",
"unknown": unknown_secrets,
"available_types": list(SECRET_TYPE_MAP.keys()),
}
policy_name = f"{app_name}-policy"
policy_hcl = generate_policy_hcl(app_name, valid_secrets, environment)
auth_commands = generate_auth_config(app_name, auth_method, policy_name, namespace)
secret_details = []
for s in valid_secrets:
tmpl = SECRET_TYPE_MAP[s]
secret_details.append({
"type": s,
"engine": tmpl["engine"],
"path": tmpl["path"].format(app=app_name, env=environment),
"capabilities": tmpl["capabilities"],
"description": tmpl["description"],
})
result = {
"app_name": app_name,
"auth_method": auth_method,
"environment": environment,
"policy_name": policy_name,
"policy_hcl": policy_hcl,
"auth_commands": auth_commands,
"secrets": secret_details,
"generated_at": datetime.now().isoformat(),
}
if unknown_secrets:
result["warnings"] = [f"Unknown secret type '{u}' — skipped. Available: {list(SECRET_TYPE_MAP.keys())}" for u in unknown_secrets]
if namespace:
result["namespace"] = namespace
return result
def print_human(result):
"""Print human-readable output."""
if "error" in result:
print(f"ERROR: {result['error']}")
if result.get("unknown"):
print(f" Unknown types: {', '.join(result['unknown'])}")
print(f" Available types: {', '.join(result['available_types'])}")
sys.exit(1)
print(f"=== Vault Configuration for {result['app_name']} ===")
print(f"Auth Method: {result['auth_method']}")
print(f"Environment: {result['environment']}")
print(f"Policy Name: {result['policy_name']}")
print()
if result.get("warnings"):
for w in result["warnings"]:
print(f"WARNING: {w}")
print()
print("--- Policy HCL ---")
print(result["policy_hcl"])
print()
print(f"Write policy: vault policy write {result['policy_name']} {result['policy_name']}.hcl")
print()
print("--- Auth Method Setup ---")
for cmd_info in result["auth_commands"]:
print(f"# {cmd_info['description']}")
print(cmd_info["command"])
print()
print("--- Secret Paths ---")
for s in result["secrets"]:
caps = ", ".join(s["capabilities"])
print(f" {s['type']:15s} {s['path']:50s} [{caps}]")
def main():
parser = argparse.ArgumentParser(
description="Generate Vault policy and auth configuration from application requirements.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=textwrap.dedent("""\
Secret types:
db-creds Dynamic database credentials (read-only)
db-admin Dynamic database credentials (read-write)
api-key Static API keys in KV v2
tls-cert TLS certificate issuance via PKI
encryption Transit encryption-as-a-service
ssh-cert SSH certificate signing
config Application configuration secrets
Examples:
%(prog)s --app-name payment-svc --auth-method approle --secrets "db-creds,api-key"
%(prog)s --app-name api-gw --auth-method kubernetes --secrets "db-creds,config" --namespace prod --json
"""),
)
parser.add_argument("--app-name", required=True, help="Application or service name")
parser.add_argument(
"--auth-method",
required=True,
choices=["approle", "kubernetes", "oidc"],
help="Vault auth method to configure",
)
parser.add_argument("--secrets", required=True, help="Comma-separated secret types (e.g., db-creds,api-key,tls-cert)")
parser.add_argument("--environment", default="production", help="Target environment (default: production)")
parser.add_argument("--namespace", help="Kubernetes namespace (for kubernetes auth method)")
parser.add_argument("--json", action="store_true", dest="json_output", help="Output as JSON")
args = parser.parse_args()
result = build_output(args.app_name, args.auth_method, args.secrets, args.environment, args.namespace)
if args.json_output:
print(json.dumps(result, indent=2))
else:
print_human(result)
if __name__ == "__main__":
main()


@@ -0,0 +1,457 @@
---
name: "sql-database-assistant"
description: "Use when the user asks to write SQL queries, optimize database performance, generate migrations, explore database schemas, or work with ORMs like Prisma, Drizzle, TypeORM, or SQLAlchemy."
---
# SQL Database Assistant - POWERFUL Tier Skill
## Overview
The operational companion to database design. While **database-designer** focuses on schema architecture and **database-schema-designer** handles ERD modeling, this skill covers the day-to-day: writing queries, optimizing performance, generating migrations, and bridging the gap between application code and database engines.
### Core Capabilities
- **Natural Language to SQL** — translate requirements into correct, performant queries
- **Schema Exploration** — introspect live databases across PostgreSQL, MySQL, SQLite, SQL Server
- **Query Optimization** — EXPLAIN analysis, index recommendations, N+1 detection, rewrite patterns
- **Migration Generation** — up/down scripts, zero-downtime strategies, rollback plans
- **ORM Integration** — Prisma, Drizzle, TypeORM, SQLAlchemy patterns and escape hatches
- **Multi-Database Support** — dialect-aware SQL with compatibility guidance
### Tools
| Script | Purpose |
|--------|---------|
| `scripts/query_optimizer.py` | Static analysis of SQL queries for performance issues |
| `scripts/migration_generator.py` | Generate migration file templates from change descriptions |
| `scripts/schema_explorer.py` | Generate schema documentation from introspection queries |
---
## Natural Language to SQL
### Translation Patterns
When converting requirements to SQL, follow this sequence:
1. **Identify entities** — map nouns to tables
2. **Identify relationships** — map verbs to JOINs or subqueries
3. **Identify filters** — map adjectives/conditions to WHERE clauses
4. **Identify aggregations** — map "total", "average", "count" to GROUP BY
5. **Identify ordering** — map "top", "latest", "highest" to ORDER BY + LIMIT
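
As a rough illustration of the sequence above, the sketch below walks one sample requirement ("top 2 customers by total value of paid orders") through the five steps and runs the result against an in-memory SQLite database. The `customers` and `orders` tables, their columns, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate the mapping steps.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        status TEXT NOT NULL,
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES
        (1, 1, 'paid', 120.0), (2, 1, 'paid', 30.0),
        (3, 2, 'paid', 200.0), (4, 2, 'refunded', 500.0),
        (5, 3, 'paid', 10.0);
""")

# Requirement: "top 2 customers by total value of paid orders"
TOP_CUSTOMERS_SQL = """
    SELECT c.name, SUM(o.total) AS paid_total   -- 4. aggregation: "total"
    FROM customers c                            -- 1. entities
    JOIN orders o ON o.customer_id = c.id       -- 2. relationship
    WHERE o.status = ?                          -- 3. filter: "paid"
    GROUP BY c.id, c.name
    ORDER BY paid_total DESC                    -- 5. ordering: "top"
    LIMIT ?
"""

top_customers = conn.execute(TOP_CUSTOMERS_SQL, ("paid", 2)).fetchall()
print(top_customers)
```

The refunded order is excluded by the filter step, which is why Grace's total counts only her paid order.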
### Common Query Templates
**Top-N per group (window function)**
```sql
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rn
FROM employees
) ranked WHERE rn <= 3;
```
**Running totals**
```sql
SELECT date, amount,
SUM(amount) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM transactions;
```
**Gap detection**
```sql
SELECT curr.id, curr.seq_num, prev.seq_num AS prev_seq
FROM records curr
LEFT JOIN records prev ON prev.seq_num = curr.seq_num - 1
WHERE prev.id IS NULL AND curr.seq_num > 1;
```
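
To make the behavior of the gap-detection query concrete, here is a minimal sketch that runs it with Python's `sqlite3` module. The `records` table contents are made up for the example. Note that the query returns the first row *after* each gap, not the missing sequence numbers themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, seq_num INTEGER NOT NULL)")
# Illustrative data: 4, 5, 8 and 9 are missing.
conn.executemany(
    "INSERT INTO records (seq_num) VALUES (?)",
    [(n,) for n in (1, 2, 3, 6, 7, 10)],
)

GAP_SQL = """
    SELECT curr.seq_num
    FROM records curr
    LEFT JOIN records prev ON prev.seq_num = curr.seq_num - 1
    WHERE prev.id IS NULL AND curr.seq_num > 1
    ORDER BY curr.seq_num
"""

# Each value is the first sequence number found after a gap.
rows_after_gap = [row[0] for row in conn.execute(GAP_SQL)]
print(rows_after_gap)
```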
**UPSERT (PostgreSQL)**
```sql
INSERT INTO settings (key, value, updated_at)
VALUES ('theme', 'dark', NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at;
```
**UPSERT (MySQL)**
```sql
INSERT INTO settings (key_name, value, updated_at)
VALUES ('theme', 'dark', NOW())
ON DUPLICATE KEY UPDATE value = VALUES(value), updated_at = VALUES(updated_at);
```
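
The same insert-or-update behavior can be tried locally without a server. The sketch below assumes a SQLite build of 3.24 or newer (where `ON CONFLICT ... DO UPDATE` is available) and uses a cut-down, hypothetical `settings` table; `set_setting` is an illustrative helper, not part of this skill's scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT NOT NULL)")

UPSERT_SQL = """
    INSERT INTO settings (key, value) VALUES (?, ?)
    ON CONFLICT (key) DO UPDATE SET value = excluded.value
"""


def set_setting(key, value):
    """Insert the setting, or overwrite the value if the key already exists."""
    with conn:  # commit on success, roll back on error
        conn.execute(UPSERT_SQL, (key, value))


set_setting("theme", "dark")
set_setting("theme", "light")  # second call updates instead of failing

settings = conn.execute("SELECT key, value FROM settings").fetchall()
print(settings)
```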
> See references/query_patterns.md for JOINs, CTEs, window functions, JSON operations, and more.
---
## Schema Exploration
### Introspection Queries
**PostgreSQL — list tables and columns**
```sql
SELECT table_name, column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;
```
**PostgreSQL — foreign keys**
```sql
SELECT tc.table_name, kcu.column_name,
ccu.table_name AS foreign_table, ccu.column_name AS foreign_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu ON tc.constraint_name = ccu.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY';
```
**MySQL — table sizes**
```sql
SELECT table_name, table_rows,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(index_length / 1024 / 1024, 2) AS index_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY data_length DESC;
```
**SQLite — schema dump**
```sql
SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name;
```
**SQL Server — columns with types**
```sql
SELECT t.name AS table_name, c.name AS column_name,
ty.name AS data_type, c.max_length, c.is_nullable
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
JOIN sys.types ty ON c.user_type_id = ty.user_type_id
ORDER BY t.name, c.column_id;
```
### Generating Documentation from Schema
Use `scripts/schema_explorer.py` to produce markdown or JSON documentation:
```bash
python scripts/schema_explorer.py --dialect postgres --tables all --format md
python scripts/schema_explorer.py --dialect mysql --tables users,orders --format json --json
```
---
## Query Optimization
### EXPLAIN Analysis Workflow
1. **Run EXPLAIN ANALYZE** (PostgreSQL) or **EXPLAIN FORMAT=JSON** (MySQL)
2. **Identify the costliest node** — Seq Scan on large tables, Nested Loop with high row estimates
3. **Check for missing indexes** — sequential scans on filtered columns
4. **Look for estimation errors** — planned vs actual rows divergence signals stale statistics
5. **Evaluate JOIN order** — ensure the smallest result set drives the join
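
The first three steps can be rehearsed on a laptop. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for PostgreSQL's `EXPLAIN ANALYZE` (the output format differs, and SQLite reports no timings), and compares the plan before and after adding an index. The table, data, and the `plan` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (status, total) VALUES (?, ?)",
    [("paid" if i % 10 else "pending", float(i)) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE status = ?"


def plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


plan_before = plan(QUERY, ("pending",))   # full table scan on the filtered column
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
plan_after = plan(QUERY, ("pending",))    # index search on status

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the keywords `SCAN` and `SEARCH ... USING INDEX` than to match whole strings.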
### Index Recommendation Checklist
- Columns in WHERE clauses with high selectivity
- Columns in JOIN conditions (foreign keys)
- Columns in ORDER BY when combined with LIMIT
- Composite indexes matching multi-column WHERE predicates (most selective column first)
- Partial indexes for queries with constant filters (e.g., `WHERE status = 'active'`)
- Covering indexes to avoid table lookups for read-heavy queries
### Query Rewriting Patterns
| Anti-Pattern | Rewrite |
|-------------|---------|
| `SELECT * FROM orders` | `SELECT id, status, total FROM orders` (explicit columns) |
| `WHERE YEAR(created_at) = 2025` | `WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01'` (sargable) |
| Correlated subquery in SELECT | LEFT JOIN with aggregation |
| `NOT IN (SELECT ...)` with NULLs | `NOT EXISTS (SELECT 1 ...)` |
| `UNION` (dedup) when not needed | `UNION ALL` |
| `LIKE '%search%'` | Full-text search index (GIN/FULLTEXT) |
| `ORDER BY RAND()` | Application-side random sampling or `TABLESAMPLE` |
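
The `NOT IN` row in the table above is easy to get wrong, so here is a small sketch showing why the rewrite matters. It uses an in-memory SQLite database with hypothetical `customers` and `blocked` tables; a single `NULL` in the subquery makes `NOT IN` return no rows at all, while `NOT EXISTS` behaves as intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE blocked (customer_id INTEGER);  -- nullable on purpose
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO blocked VALUES (2), (NULL);
""")

# "id NOT IN (2, NULL)" evaluates to NULL for ids 1 and 3, so nothing matches.
not_in_rows = conn.execute(
    "SELECT id FROM customers WHERE id NOT IN (SELECT customer_id FROM blocked)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs are harmless.
not_exists_rows = conn.execute("""
    SELECT c.id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM blocked b WHERE b.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print(not_in_rows, not_exists_rows)
```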
### N+1 Detection
**Symptoms:**
- Application loop that executes one query per parent row
- ORM lazy-loading related entities inside a loop
- Query log shows hundreds of identical SELECT patterns with different IDs
**Fixes:**
- Use eager loading (`include` in Prisma, `joinedload` in SQLAlchemy)
- Batch queries with `WHERE id IN (...)`
- Use DataLoader pattern for GraphQL resolvers
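
A minimal sketch of the symptom and the batch-query fix, using plain `sqlite3` instead of an ORM so the round trips are visible. The `authors` and `posts` tables, the `run` helper, and the `query_log` list are all hypothetical names introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO posts VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'G1');
""")

query_log = []  # every statement sent to the database


def run(sql, params=()):
    """Execute a query and record it, so round trips can be counted."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """Anti-pattern: one query for the parents, then one per parent row."""
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run("SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,))
        result[name] = [title for (title,) in rows]
    return result


def titles_batched():
    """Fix: fetch all children in one query with WHERE ... IN (...)."""
    authors = run("SELECT id, name FROM authors ORDER BY id")
    if not authors:
        return {}
    ids = [author_id for author_id, _ in authors]
    placeholders = ", ".join("?" for _ in ids)
    rows = run(
        f"SELECT author_id, title FROM posts WHERE author_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    titles = {author_id: [] for author_id in ids}
    for author_id, title in rows:
        titles[author_id].append(title)
    return {name: titles[author_id] for author_id, name in authors}
```

With three authors, `titles_n_plus_one()` issues four statements and `titles_batched()` issues two; the gap grows linearly with the number of parent rows.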
### Static Analysis Tool
```bash
python scripts/query_optimizer.py --query "SELECT * FROM orders WHERE status = 'pending'" --dialect postgres
python scripts/query_optimizer.py --query queries.sql --dialect mysql --json
```
> See references/optimization_guide.md for EXPLAIN plan reading, index types, and connection pooling.
---
## Migration Generation
### Zero-Downtime Migration Patterns
**Adding a column (safe)**
```sql
-- Up
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Down
ALTER TABLE users DROP COLUMN phone;
```
**Renaming a column (expand-contract)**
```sql
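-- Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
-- Step 2: Deploy app writing to both columns (dual-write), still reading name
-- Step 3: Backfill rows written before the dual-write deploy
UPDATE users SET full_name = name WHERE full_name IS NULL;
-- Step 4: Deploy app reading from full_name
-- Step 5: Deploy app writing only full_name
-- Step 6: Drop old column
ALTER TABLE users DROP COLUMN name;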
```
**Adding a NOT NULL column (safe sequence)**
```sql
-- Step 1: Add nullable
ALTER TABLE orders ADD COLUMN region VARCHAR(50);
-- Step 2: Backfill with default
UPDATE orders SET region = 'unknown' WHERE region IS NULL;
-- Step 3: Add constraint
ALTER TABLE orders ALTER COLUMN region SET NOT NULL;
ALTER TABLE orders ALTER COLUMN region SET DEFAULT 'unknown';
```
**Index creation (non-blocking, PostgreSQL)**
```sql
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);
```
### Data Backfill Strategies
- **Batch updates** — process in chunks of 1000-10000 rows to avoid lock contention
- **Background jobs** — run backfills asynchronously with progress tracking
- **Dual-write** — write to old and new columns during transition period
- **Validation queries** — verify row counts and data integrity after each batch
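
The batch-update and validation ideas above can be sketched as follows. This is an illustrative loop against SQLite, assuming an `orders` table with a nullable `region` column as in the NOT NULL example; `backfill_in_batches` is a hypothetical helper, and a production version would typically add pauses between batches and progress logging.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Set region='unknown' on NULL rows, one short transaction per batch."""
    total = 0
    while True:
        with conn:  # each batch commits (or rolls back) on its own
            cur = conn.execute(
                """
                UPDATE orders SET region = 'unknown'
                WHERE id IN (
                    SELECT id FROM orders WHERE region IS NULL
                    ORDER BY id LIMIT ?
                )
                """,
                (batch_size,),
            )
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany("INSERT INTO orders (region) VALUES (?)", [(None,)] * 2500)
conn.commit()

updated = backfill_in_batches(conn, batch_size=1000)

# Validation query: nothing should be left to backfill.
remaining = conn.execute("SELECT COUNT(*) FROM orders WHERE region IS NULL").fetchone()[0]
print(updated, remaining)
```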
### Rollback Strategies
Every migration must have a reversible down script. For irreversible changes:
1. **Backup before execution** — `pg_dump` the affected tables
2. **Feature flags** — application can switch between old/new schema reads
3. **Shadow tables** — keep a copy of the original table during migration window
### Migration Generator Tool
```bash
python scripts/migration_generator.py --change "add email_verified boolean to users" --dialect postgres --format sql
python scripts/migration_generator.py --change "rename column name to full_name in customers" --dialect mysql --format alembic --json
```
---
## Multi-Database Support
### Dialect Differences
| Feature | PostgreSQL | MySQL | SQLite | SQL Server |
|---------|-----------|-------|--------|------------|
| UPSERT | `ON CONFLICT DO UPDATE` | `ON DUPLICATE KEY UPDATE` | `ON CONFLICT DO UPDATE` | `MERGE` |
| Boolean | Native `BOOLEAN` | `TINYINT(1)` | `INTEGER` | `BIT` |
| Auto-increment | `SERIAL` / `GENERATED` | `AUTO_INCREMENT` | `INTEGER PRIMARY KEY` | `IDENTITY` |
| JSON | `JSONB` (indexed) | `JSON` | Text (ext) | `NVARCHAR(MAX)` |
| Array | Native `ARRAY` | Not supported | Not supported | Not supported |
| CTE (recursive) | Full support | 8.0+ | 3.8.3+ | Full support |
| Window functions | Full support | 8.0+ | 3.25.0+ | Full support |
| Full-text search | `tsvector` + GIN | `FULLTEXT` index | FTS5 extension | Full-text catalog |
| LIMIT/OFFSET | `LIMIT n OFFSET m` | `LIMIT n OFFSET m` | `LIMIT n OFFSET m` | `OFFSET m ROWS FETCH NEXT n ROWS ONLY` |
### Compatibility Tips
- **Always use parameterized queries** — prevents SQL injection across all dialects
- **Avoid dialect-specific functions in shared code** — wrap in adapter layer
- **Test migrations on target engine** — `information_schema` varies between engines
- **Use ISO date format** — `'YYYY-MM-DD'` works everywhere
- **Quote identifiers** — use double quotes (SQL standard) or backticks (MySQL)
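
The first tip is the one most worth seeing in code. The sketch below contrasts string concatenation with a parameterized query using `sqlite3`; the `users` table and both helper functions are hypothetical. The placeholder style is driver-specific (`?` for `sqlite3`, `%s` for psycopg), but the principle is the same everywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)


def find_by_email_unsafe(email):
    # Anti-pattern: user input is spliced into the SQL text.
    return conn.execute(f"SELECT id FROM users WHERE email = '{email}'").fetchall()


def find_by_email(email):
    # The driver sends the value separately from the SQL text.
    return conn.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchall()


malicious = "' OR '1'='1"
leaked = find_by_email_unsafe(malicious)  # the injected OR clause matches every row
safe = find_by_email(malicious)           # treated as a literal string, matches nothing
print(leaked, safe)
```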
---
## ORM Patterns
### Prisma
**Schema definition**
```prisma
model User {
  id        Int      @id @default(autoincrement())
  email     String   @unique
  name      String?
  posts     Post[]
  createdAt DateTime @default(now())
}

model Post {
  id       Int    @id @default(autoincrement())
  title    String
  author   User   @relation(fields: [authorId], references: [id])
  authorId Int
}
```
**Migrations**: `npx prisma migrate dev --name add_user_email`
**Query API**: `prisma.user.findMany({ where: { email: { contains: '@' } }, include: { posts: true } })`
**Raw SQL escape hatch**: `` prisma.$queryRaw`SELECT * FROM users WHERE id = ${userId}` ``
### Drizzle
**Schema-first definition**
```typescript
import { pgTable, serial, text, timestamp, varchar } from 'drizzle-orm/pg-core';

export const users = pgTable('users', {
  id: serial('id').primaryKey(),
  email: varchar('email', { length: 255 }).notNull().unique(),
  name: text('name'),
  createdAt: timestamp('created_at').defaultNow(),
});
```
**Query builder**: `db.select().from(users).where(eq(users.email, email))`
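**Migrations**: `npx drizzle-kit generate:pg` to create migration files, or `npx drizzle-kit push:pg` to push the schema directly (newer drizzle-kit releases drop the `:pg` suffix and read the dialect from the config file)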
### TypeORM
**Entity decorators**
```typescript
import { Column, Entity, OneToMany, PrimaryGeneratedColumn } from 'typeorm';

@Entity()
export class User {
  @PrimaryGeneratedColumn()
  id: number;

  @Column({ unique: true })
  email: string;

  @OneToMany(() => Post, post => post.author)
  posts: Post[];
}
```
**Repository pattern**: `userRepo.find({ where: { email }, relations: ['posts'] })`
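**Migrations**: `npx typeorm migration:generate -n AddUserEmail` (TypeORM 0.2.x syntax; 0.3+ takes the migration path as a positional argument and requires `-d <data-source>`)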
### SQLAlchemy
**Declarative models**
```python
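from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    email = Column(String(255), unique=True, nullable=False)
    name = Column(String(255))
    posts = relationship('Post', back_populates='author')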
```
**Session management**: Always use `with Session() as session:` context manager
**Alembic migrations**: `alembic revision --autogenerate -m "add user email"`
> See references/orm_patterns.md for side-by-side comparisons and migration workflows per ORM.
---
## Data Integrity
### Constraint Strategy
- **Primary keys** — every table must have one; prefer surrogate keys (serial/UUID)
- **Foreign keys** — enforce referential integrity; define ON DELETE behavior explicitly
- **UNIQUE constraints** — for business-level uniqueness (email, slug, API key)
- **CHECK constraints** — validate ranges, enums, and business rules at the DB level
- **NOT NULL** — default to NOT NULL; make nullable only when genuinely optional
### Transaction Isolation Levels
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Use Case |
|-------|-----------|-------------------|-------------|----------|
| READ UNCOMMITTED | Yes | Yes | Yes | Never recommended |
| READ COMMITTED | No | Yes | Yes | Default for PostgreSQL, general OLTP |
| REPEATABLE READ | No | No | Yes (InnoDB: No) | Financial calculations |
| SERIALIZABLE | No | No | No | Critical consistency (billing, inventory) |
### Deadlock Prevention
1. **Consistent lock ordering** — always acquire locks in the same table/row order
2. **Short transactions** — minimize time between first lock and commit
3. **Advisory locks** — use `pg_advisory_lock()` for application-level coordination
4. **Retry logic** — catch deadlock errors and retry with exponential backoff
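
The retry step above can be sketched as a small helper. `DeadlockDetected` here is a stand-in exception defined for the example; in real code you would catch whatever your driver raises for a deadlock (PostgreSQL reports SQLSTATE `40P01`, MySQL reports error 1213). The `sleep` parameter is injectable only to keep the sketch easy to test.

```python
import random
import time


class DeadlockDetected(Exception):
    """Stand-in for the driver's deadlock error (hypothetical)."""


def retry_on_deadlock(fn, attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call fn(), retrying on deadlock with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except DeadlockDetected:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Delay doubles each attempt; jitter spreads out competing retries.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            sleep(delay)


# Usage: the whole transaction must be re-run, not just the failed statement.
calls = {"n": 0}


def flaky_transaction():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockDetected()
    return "committed"


delays = []
outcome = retry_on_deadlock(flaky_transaction, sleep=delays.append)
print(outcome, len(delays))
```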
---
## Backup & Restore
### PostgreSQL
```bash
# Full backup
pg_dump -Fc --no-owner dbname > backup.dump
# Restore
pg_restore -d dbname --clean --no-owner backup.dump
# Point-in-time recovery: configure WAL archiving + restore_command
```
### MySQL
```bash
# Full backup
mysqldump --single-transaction --routines --triggers dbname > backup.sql
# Restore
mysql dbname < backup.sql
# Binary log for PITR: mysqlbinlog --start-datetime="2025-01-01 00:00:00" binlog.000001
```
### SQLite
```bash
# Backup (safe with concurrent reads)
sqlite3 dbname ".backup backup.db"
```
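
From application code, the same online backup is available through `sqlite3.Connection.backup` (Python 3.7+). The following is a sketch with throwaway file paths and a hypothetical `notes` table; it also reads the copy back, in the spirit of testing restores.

```python
import os
import sqlite3
import tempfile


def backup_sqlite(source_path, dest_path):
    """Copy a live SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


# Demo with throwaway files (paths are illustrative).
workdir = tempfile.mkdtemp()
live_db = os.path.join(workdir, "app.db")
backup_db = os.path.join(workdir, "backup.db")

conn = sqlite3.connect(live_db)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES ('hello')")
conn.commit()
conn.close()

backup_sqlite(live_db, backup_db)

# Verify the copy by reading from it.
restored = sqlite3.connect(backup_db)
restored_rows = restored.execute("SELECT body FROM notes").fetchall()
restored.close()
print(restored_rows)
```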
### Backup Best Practices
- **Automate** — cron or systemd timer, never manual-only
- **Test restores** — untested backups are not backups
- **Offsite copies** — S3, GCS, or separate region
- **Retention policy** — daily for 7 days, weekly for 4 weeks, monthly for 12 months
- **Monitor backup size and duration** — sudden changes signal issues
---
## Anti-Patterns
| Anti-Pattern | Problem | Fix |
|-------------|---------|-----|
| `SELECT *` | Transfers unnecessary data, breaks on schema changes | Explicit column list |
| Missing indexes on FK columns | Slow JOINs and cascading deletes | Add indexes on all foreign keys |
| N+1 queries | 1 + N round trips to database | Eager loading or batch queries |
| Implicit type coercion | `WHERE id = '123'` prevents index use | Match types in predicates |
| No connection pooling | Exhausts connections under load | PgBouncer, ProxySQL, or ORM pool |
| Unbounded queries | No LIMIT risks returning millions of rows | Always paginate |
| Storing money as FLOAT | Rounding errors | Use `DECIMAL(19,4)` or integer cents |
| God tables | One table with 50+ columns | Normalize or use vertical partitioning |
| Soft deletes everywhere | Complicates every query with `WHERE deleted_at IS NULL` | Archive tables or event sourcing |
| Raw string concatenation | SQL injection | Parameterized queries always |
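
For the "money as FLOAT" row, a short sketch of the rounding problem and the two fixes named in the table. The amounts are arbitrary; the same reasoning applies to `DECIMAL` columns versus `FLOAT` columns in the database.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2

# Decimal arithmetic keeps exact base-10 values, like a DECIMAL column.
decimal_total = Decimal("0.10") + Decimal("0.20")

# Integer cents avoid fractional values entirely.
cents_total = 10 + 20

print(float_total == 0.3, decimal_total, cents_total)
```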
---
## Cross-References
| Skill | Relationship |
|-------|-------------|
| **database-designer** | Schema architecture, normalization analysis, ERD generation |
| **database-schema-designer** | Visual ERD modeling, relationship mapping |
| **migration-architect** | Complex multi-step migration orchestration |
| **api-design-reviewer** | Ensuring API endpoints align with query patterns |
| **observability-platform** | Query performance monitoring, slow query alerts |


@@ -0,0 +1,330 @@
# Query Optimization Guide
How to read EXPLAIN plans, choose the right index types, understand query plan operators, and configure connection pooling.
---
## Reading EXPLAIN Plans
### PostgreSQL — EXPLAIN ANALYZE
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT * FROM orders WHERE status = 'paid' ORDER BY created_at DESC LIMIT 20;
```
**Sample output:**
```
Limit  (cost=0.43..12.87 rows=20 width=128) (actual time=0.052..0.089 rows=20 loops=1)
  ->  Index Scan Backward using idx_orders_status_created on orders  (cost=0.43..4521.33 rows=7284 width=128) (actual time=0.051..0.085 rows=20 loops=1)
        Index Cond: (status = 'paid')
        Buffers: shared hit=4
Planning Time: 0.156 ms
Execution Time: 0.112 ms
```
**Key fields to check:**
| Field | What it tells you |
|-------|-------------------|
| `cost` | Estimated startup..total cost (arbitrary units) |
| `rows` | Estimated row count at that node |
| `actual time` | Real wall-clock time in milliseconds |
| `actual rows` | Real row count — compare against estimate |
| `Buffers: shared hit` | Pages read from cache (good) |
| `Buffers: shared read` | Pages read from disk (slow) |
| `loops` | How many times the node executed |
**Red flags:**
- `Seq Scan` on a large table with a WHERE clause — missing index
- `actual rows` >> `rows` (estimated) — stale statistics, run `ANALYZE`
- `Nested Loop` with high loop count — consider hash join or add index
- `Sort` with `external merge` — not enough `work_mem`, spilling to disk
- `Buffers: shared read` much higher than `shared hit` — cold cache or table too large for memory
### MySQL — EXPLAIN FORMAT=JSON
```sql
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE status = 'paid' ORDER BY created_at DESC LIMIT 20;
```
**Key fields:**
- `query_block.select_id` — identifies subqueries
- `table.access_type` — `ALL` (full scan), `ref` (index lookup), `range`, `index`, `const`
- `table.rows_examined_per_scan` — how many rows the engine reads
- `table.using_index` — covering index (no table lookup needed)
- `table.attached_condition` — the WHERE filter applied
**Access types ranked (best to worst):**
`system` > `const` > `eq_ref` > `ref` > `range` > `index` > `ALL`
---
## Index Types
### B-tree (default)
The workhorse index. Supports equality, range, prefix, and ORDER BY operations.
**Best for:** `=`, `<`, `>`, `<=`, `>=`, `BETWEEN`, `LIKE 'prefix%'`, `ORDER BY`, `MIN()`, `MAX()`
```sql
CREATE INDEX idx_orders_created ON orders (created_at);
```
**Composite B-tree:** Column order matters. The index is useful for queries that filter on a leftmost prefix of the indexed columns.
```sql
-- This index serves: WHERE status = ... AND created_at > ...
-- Also serves: WHERE status = ...
-- Does NOT serve: WHERE created_at > ... (without status)
CREATE INDEX idx_orders_status_created ON orders (status, created_at);
```
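
The leftmost-prefix rule can be observed directly. The sketch below builds the same composite index in an in-memory SQLite database and inspects the plan for each of the three query shapes in the comments above. It is an illustration under SQLite's planner; other engines may make different choices (for example, scanning the whole index for a non-leading column), so treat the output as indicative rather than universal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, total REAL)"
)
conn.execute("CREATE INDEX idx_orders_status_created ON orders (status, created_at)")


def plan_detail(sql):
    """Return the detail text of the first EXPLAIN QUERY PLAN row."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]


both_columns = plan_detail(
    "SELECT * FROM orders WHERE status = 'paid' AND created_at > '2025-01-01'"
)
leading_only = plan_detail("SELECT * FROM orders WHERE status = 'paid'")
trailing_only = plan_detail("SELECT * FROM orders WHERE created_at > '2025-01-01'")

print(both_columns)   # uses the index for both predicates
print(leading_only)   # uses the index via its leftmost column
print(trailing_only)  # falls back to a table scan
```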
### Hash
Equality-only lookups. Can be smaller and somewhat faster than B-tree for exact matches on long keys, but offers no range support.
**Best for:** `=` lookups on high-cardinality columns
```sql
-- PostgreSQL
CREATE INDEX idx_sessions_token ON sessions USING hash (token);
```
**Limitations:** No range queries, no ORDER BY, not WAL-logged before PostgreSQL 10.
### GIN (Generalized Inverted Index)
For multi-valued data: arrays, JSONB, full-text search vectors.
```sql
-- JSONB containment
CREATE INDEX idx_products_tags ON products USING gin (tags);
-- Query: SELECT * FROM products WHERE tags @> '["sale"]';
-- Full-text search
CREATE INDEX idx_articles_search ON articles USING gin (to_tsvector('english', title || ' ' || body));
```
### GiST (Generalized Search Tree)
For geometric, range, and proximity data.
```sql
-- Range type (e.g., date ranges)
CREATE INDEX idx_bookings_period ON bookings USING gist (during);
-- Query: SELECT * FROM bookings WHERE during && '[2025-01-01, 2025-01-31]';
-- PostGIS geometry
CREATE INDEX idx_locations_geom ON locations USING gist (geom);
```
### BRIN (Block Range INdex)
Tiny index for naturally ordered data (e.g., time-series append-only tables).
```sql
CREATE INDEX idx_events_created ON events USING brin (created_at);
```
**Best for:** Large tables where the indexed column correlates with physical row order. Much smaller than B-tree but less precise.
### Partial Index
Index only rows matching a condition. Smaller and faster for targeted queries.
```sql
-- Only index active users (skip millions of inactive)
CREATE INDEX idx_users_active_email ON users (email) WHERE status = 'active';
```
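
SQLite also supports partial indexes, which makes the behavior easy to try out. In the sketch below (hypothetical `users` table), the index is chosen only when the query's `WHERE` clause implies the index condition. Note the filter is written as a literal: SQLite cannot prove the implication from a bound parameter such as `status = ?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, status TEXT)")
conn.execute(
    "CREATE INDEX idx_users_active_email ON users (email) WHERE status = 'active'"
)


def plan_detail(sql):
    """Return the detail text of the first EXPLAIN QUERY PLAN row."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]


# Query restricted to active users: the partial index applies.
active_plan = plan_detail(
    "SELECT id FROM users WHERE status = 'active' AND email = 'a@example.com'"
)

# Query over all users: rows outside the index condition may match, so it cannot be used.
any_status_plan = plan_detail("SELECT id FROM users WHERE email = 'a@example.com'")

print(active_plan)
print(any_status_plan)
```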
### Covering Index (INCLUDE)
Store extra columns in the index to avoid table lookups (index-only scans).
```sql
-- PostgreSQL 11+
CREATE INDEX idx_orders_status ON orders (status) INCLUDE (total, created_at);
-- Query can be answered entirely from the index:
-- SELECT total, created_at FROM orders WHERE status = 'paid';
```
### Expression Index
Index the result of a function or expression.
```sql
CREATE INDEX idx_users_lower_email ON users (LOWER(email));
-- Query: SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
```
---
## Query Plan Operators
### Scan operators
| Operator | Description | Performance |
|----------|-------------|-------------|
| **Seq Scan** | Full table scan, reads every row | Slow on large tables |
| **Index Scan** | B-tree lookup + table fetch | Fast for selective queries |
| **Index Only Scan** | Reads only the index (covering) | Fastest for covered queries |
| **Bitmap Index Scan** | Builds a bitmap of matching pages | Good for medium selectivity |
| **Bitmap Heap Scan** | Fetches pages identified by bitmap | Pairs with bitmap index scan |
### Join operators
| Operator | Description | Best when |
|----------|-------------|-----------|
| **Nested Loop** | For each outer row, scan inner | Small outer set, indexed inner |
| **Hash Join** | Build hash table on inner, probe with outer | Medium-large sets, no index |
| **Merge Join** | Merge two sorted inputs | Both inputs already sorted |
### Other operators
| Operator | Description |
|----------|-------------|
| **Sort** | Sorts rows (may spill to disk if work_mem exceeded) |
| **Hash Aggregate** | GROUP BY using hash table |
| **Group Aggregate** | GROUP BY on pre-sorted input |
| **Limit** | Stops after N rows |
| **Materialize** | Caches subquery results in memory |
| **Gather / Gather Merge** | Collects results from parallel workers |
---
## Connection Pooling
### Why pool connections?
Each database connection consumes memory (5-10 MB in PostgreSQL). Without pooling:
- Application creates a new connection per request (slow: TCP + TLS + auth)
- Under load, connection count spikes past `max_connections`
- Database OOM or connection refused errors
### PgBouncer (PostgreSQL)
The standard external connection pooler for PostgreSQL.
**Modes:**
- **Session** — connection assigned for entire client session (safest, least efficient)
- **Transaction** — connection returned to pool after each transaction (recommended)
- **Statement** — connection returned after each statement (cannot use transactions)
```ini
# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
server_idle_timeout = 300
```
**Sizing formula:**
```
default_pool_size = num_cpu_cores * 2 + effective_spindle_count
```
For SSDs, start with `num_cpu_cores * 2` (typically 4-16 connections is optimal).
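
The formula is simple enough to encode as a helper for capacity planning. This is a sketch only: `suggested_pool_size` is a hypothetical function, and the result is a starting point to validate with load tests, not a guaranteed optimum.

```python
def suggested_pool_size(cpu_cores, effective_spindle_count=0):
    """Starting pool size: cores * 2 + spindles (spindles = 0 for SSDs)."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if effective_spindle_count < 0:
        raise ValueError("effective_spindle_count cannot be negative")
    return cpu_cores * 2 + effective_spindle_count


ssd_pool = suggested_pool_size(8)                             # SSD host: spindle term dropped
hdd_pool = suggested_pool_size(4, effective_spindle_count=2)  # 4 cores, 2 spinning disks
print(ssd_pool, hdd_pool)
```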
### ProxySQL (MySQL)
```ini
mysql_servers = ({ address="127.0.0.1", port=3306, hostgroup=0, max_connections=100 })
mysql_query_rules = ({ rule_id=1, match_pattern="^SELECT.*FOR UPDATE", destination_hostgroup=0 })
```
### Application-Level Pooling
Most ORMs and drivers include built-in pooling:
| Platform | Pool Configuration |
|----------|--------------------|
| **node-postgres** | `new Pool({ max: 20, idleTimeoutMillis: 30000 })` |
| **SQLAlchemy** | `create_engine(url, pool_size=20, max_overflow=5)` |
| **HikariCP (Java)** | `maximumPoolSize=20, minimumIdle=5, idleTimeout=300000` |
| **Prisma** | `connection_limit=20` in connection string |
### Pool Sizing Guidelines
| Metric | Guideline |
|--------|-----------|
| **Minimum** | Number of always-active background workers |
| **Maximum** | 2-4x CPU cores for OLTP; lower for OLAP |
| **Idle timeout** | 30-300 seconds (reclaim unused connections) |
| **Connection timeout** | 3-10 seconds (fail fast under pressure) |
| **Queue size** | 2-5x pool max (buffer bursts before rejecting) |
**Warning:** More connections do not mean better performance. Beyond the optimal point (usually 20-50), contention on locks, CPU, and I/O causes throughput to decrease.
---
## Statistics and Maintenance
### PostgreSQL
```sql
-- Update statistics for the query planner
ANALYZE orders;
ANALYZE; -- All tables
-- Check table bloat and dead tuples
SELECT relname, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
-- Identify unused indexes
SELECT indexrelname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey%'
ORDER BY pg_relation_size(indexrelid) DESC;
```
### MySQL
```sql
-- Update statistics
ANALYZE TABLE orders;
-- Check index usage
SELECT * FROM sys.schema_unused_indexes;
SELECT * FROM sys.schema_redundant_indexes;
-- Identify long-running queries
SELECT * FROM information_schema.processlist WHERE time > 10;
```
---
## Performance Checklist
Before deploying any query to production:
1. Run `EXPLAIN ANALYZE` and verify no unexpected sequential scans
2. Check that estimated rows are within 10x of actual rows
3. Verify index usage on all WHERE, JOIN, and ORDER BY columns
4. Ensure LIMIT is present for user-facing list queries
5. Confirm parameterized queries (no string concatenation)
6. Test with production-like data volume (not just 10 rows)
7. Monitor query time in application metrics after deployment
8. Set up slow query log alerting (> 100ms for OLTP, > 5s for reports)
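Item 1 of the checklist can be automated in a test suite. The sketch below uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for PostgreSQL's `EXPLAIN ANALYZE`; the `orders` table, the index name, and the helper functions are hypothetical, and the plan text format is specific to SQLite.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    # The SQL text here is trusted test input; user values still go through params.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


def has_full_scan(conn, sql, params=()):
    """True if any plan step is a table scan (detail starts with SCAN)."""
    return any(step.startswith("SCAN") for step in query_plan(conn, sql, params))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (status, total) VALUES (?, ?)",
    [("paid", 10.0), ("pending", 5.0), ("paid", 7.5)],
)

sql = "SELECT id, total FROM orders WHERE status = ?"
scan_before = has_full_scan(conn, sql, ("paid",))
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
scan_after = has_full_scan(conn, sql, ("paid",))
print(scan_before, scan_after)
```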
---
## Quick Reference: When to Use Which Index
| Query Pattern | Index Type |
|--------------|-----------|
| `WHERE col = value` | B-tree or Hash |
| `WHERE col > value` | B-tree |
| `WHERE col LIKE 'prefix%'` | B-tree (in PostgreSQL, with `text_pattern_ops` unless the collation is `C`) |
| `WHERE col LIKE '%substring%'` | GIN or GiST with `pg_trgm` (trigram) |
| `WHERE jsonb_col @> '{...}'` | GIN |
| `WHERE array_col && ARRAY[...]` | GIN |
| `WHERE range_col && '[a,b]'` | GiST |
| `WHERE ST_DWithin(geom, ...)` | GiST |
| `WHERE col = value` (append-only) | BRIN |
| `WHERE col = value AND status = 'active'` | Partial B-tree |
| `SELECT a, b WHERE c = value` | Covering (INCLUDE) |


@@ -0,0 +1,451 @@
# ORM Patterns Reference
Side-by-side comparison of Prisma, Drizzle, TypeORM, and SQLAlchemy patterns for common database operations.
---
## Schema Definition
### Prisma (schema.prisma)
```prisma
model User {
id Int @id @default(autoincrement())
email String @unique
name String?
role Role @default(USER)
posts Post[]
profile Profile?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
@@index([email])
@@map("users")
}
model Post {
id Int @id @default(autoincrement())
title String
body String?
published Boolean @default(false)
author User @relation(fields: [authorId], references: [id], onDelete: Cascade)
authorId Int
tags Tag[]
createdAt DateTime @default(now())
@@index([authorId])
@@index([published, createdAt])
@@map("posts")
}
enum Role {
USER
ADMIN
MODERATOR
}
```
### Drizzle (schema.ts)
```typescript
import { pgTable, serial, varchar, text, boolean, timestamp, integer, pgEnum, index } from 'drizzle-orm/pg-core';
export const roleEnum = pgEnum('role', ['USER', 'ADMIN', 'MODERATOR']);
export const users = pgTable('users', {
id: serial('id').primaryKey(),
email: varchar('email', { length: 255 }).notNull().unique(),
name: varchar('name', { length: 255 }),
role: roleEnum('role').default('USER').notNull(),
createdAt: timestamp('created_at').defaultNow().notNull(),
updatedAt: timestamp('updated_at').defaultNow().notNull(),
});
export const posts = pgTable('posts', {
id: serial('id').primaryKey(),
title: varchar('title', { length: 255 }).notNull(),
body: text('body'),
published: boolean('published').default(false).notNull(),
authorId: integer('author_id').notNull().references(() => users.id, { onDelete: 'cascade' }),
createdAt: timestamp('created_at').defaultNow().notNull(),
}, (table) => ({
authorIdx: index('idx_posts_author').on(table.authorId),
publishedIdx: index('idx_posts_published').on(table.published, table.createdAt),
}));
```
### TypeORM (entities)
```typescript
import { Entity, PrimaryGeneratedColumn, Column, ManyToOne, OneToMany, CreateDateColumn, UpdateDateColumn, Index } from 'typeorm';
export enum Role { USER = 'USER', ADMIN = 'ADMIN', MODERATOR = 'MODERATOR' }
@Entity('users')
export class User {
@PrimaryGeneratedColumn()
id: number;
@Column({ unique: true })
@Index()
email: string;
@Column({ nullable: true })
name: string;
@Column({ type: 'enum', enum: Role, default: Role.USER })
role: Role;
@OneToMany(() => Post, post => post.author)
posts: Post[];
@CreateDateColumn()
createdAt: Date;
@UpdateDateColumn()
updatedAt: Date;
}
@Entity('posts')
@Index(['published', 'createdAt'])
export class Post {
@PrimaryGeneratedColumn()
id: number;
@Column()
title: string;
@Column({ nullable: true, type: 'text' })
body: string;
@Column({ default: false })
published: boolean;
@ManyToOne(() => User, user => user.posts, { onDelete: 'CASCADE' })
author: User;
@Column()
authorId: number;
@CreateDateColumn()
createdAt: Date;
}
```
### SQLAlchemy (models.py)
```python
import enum
from datetime import datetime
from sqlalchemy import Column, Integer, String, Text, Boolean, DateTime, Enum, ForeignKey, Index
from sqlalchemy.orm import relationship, DeclarativeBase
class Base(DeclarativeBase):
    pass


class Role(enum.Enum):
    USER = "USER"
    ADMIN = "ADMIN"
    MODERATOR = "MODERATOR"


class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True, autoincrement=True)
    email = Column(String(255), unique=True, nullable=False, index=True)
    name = Column(String(255), nullable=True)
    role = Column(Enum(Role), default=Role.USER, nullable=False)
    posts = relationship('Post', back_populates='author', cascade='all, delete-orphan')
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False)


class Post(Base):
    __tablename__ = 'posts'
    __table_args__ = (
        Index('idx_posts_published', 'published', 'created_at'),
    )

    id = Column(Integer, primary_key=True, autoincrement=True)
    title = Column(String(255), nullable=False)
    body = Column(Text, nullable=True)
    published = Column(Boolean, default=False, nullable=False)
    author_id = Column(Integer, ForeignKey('users.id', ondelete='CASCADE'), nullable=False, index=True)
    author = relationship('User', back_populates='posts')
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
```
---
## CRUD Operations
### Create
| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.create({ data: { email, name } })` |
| **Drizzle** | `await db.insert(users).values({ email, name }).returning()` |
| **TypeORM** | `await userRepo.save(userRepo.create({ email, name }))` |
| **SQLAlchemy** | `session.add(User(email=email, name=name)); session.commit()` |
### Read (with filter)
| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.findMany({ where: { role: 'ADMIN' }, orderBy: { createdAt: 'desc' } })` |
| **Drizzle** | `await db.select().from(users).where(eq(users.role, 'ADMIN')).orderBy(desc(users.createdAt))` |
| **TypeORM** | `await userRepo.find({ where: { role: Role.ADMIN }, order: { createdAt: 'DESC' } })` |
| **SQLAlchemy** | `session.query(User).filter(User.role == Role.ADMIN).order_by(User.created_at.desc()).all()` |
### Update
| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.update({ where: { id }, data: { name } })` |
| **Drizzle** | `await db.update(users).set({ name }).where(eq(users.id, id))` |
| **TypeORM** | `await userRepo.update(id, { name })` |
| **SQLAlchemy** | `session.query(User).filter(User.id == id).update({User.name: name}); session.commit()` |
### Delete
| ORM | Pattern |
|-----|---------|
| **Prisma** | `await prisma.user.delete({ where: { id } })` |
| **Drizzle** | `await db.delete(users).where(eq(users.id, id))` |
| **TypeORM** | `await userRepo.delete(id)` |
| **SQLAlchemy** | `session.query(User).filter(User.id == id).delete(); session.commit()` |
---
## Relations and Eager Loading
### Prisma — include / select
```typescript
// Eager load posts with user
const user = await prisma.user.findUnique({
where: { id: 1 },
include: { posts: { where: { published: true }, orderBy: { createdAt: 'desc' } } },
});
// Nested create
await prisma.user.create({
data: {
email: 'new@example.com',
posts: { create: [{ title: 'First post' }] },
},
});
```
### Drizzle — relational queries
```typescript
const result = await db.query.users.findFirst({
where: eq(users.id, 1),
with: { posts: { where: eq(posts.published, true), orderBy: [desc(posts.createdAt)] } },
});
```
### TypeORM — relations / query builder
```typescript
// FindOptions
const user = await userRepo.findOne({ where: { id: 1 }, relations: ['posts'] });
// QueryBuilder for complex joins
const result = await userRepo.createQueryBuilder('u')
.leftJoinAndSelect('u.posts', 'p', 'p.published = :pub', { pub: true })
.where('u.id = :id', { id: 1 })
.getOne();
```
### SQLAlchemy — joinedload / selectinload
```python
from sqlalchemy.orm import joinedload, selectinload
# Eager load in one JOIN query
user = session.query(User).options(joinedload(User.posts)).filter(User.id == 1).first()
# Eager load in a separate IN query (better for collections)
users = session.query(User).options(selectinload(User.posts)).all()
```
---
## Raw SQL Escape Hatches
Each of these ORMs provides a way to execute raw SQL when a query is too complex for its query API:
| ORM | Pattern |
|-----|---------|
| **Prisma** | `` prisma.$queryRaw`SELECT * FROM users WHERE id = ${id}` `` |
| **Drizzle** | `` db.execute(sql`SELECT * FROM users WHERE id = ${id}`) `` |
| **TypeORM** | `dataSource.query('SELECT * FROM users WHERE id = $1', [id])` |
| **SQLAlchemy** | `session.execute(text('SELECT * FROM users WHERE id = :id'), {'id': id})` |
Always use parameterized queries in raw SQL to prevent injection.
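To make the risk concrete, here is a minimal sketch using Python's built-in `sqlite3` driver rather than any of the ORMs above. The `users` table and the malicious input are made up for illustration; the same contrast applies to every raw-SQL escape hatch in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

malicious = "nobody@example.com' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and rewrites the WHERE clause.
unsafe_sql = "SELECT id FROM users WHERE email = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the driver binds the input as a value, never as SQL.
safe_rows = conn.execute("SELECT id FROM users WHERE email = ?", (malicious,)).fetchall()

print(len(unsafe_rows), len(safe_rows))
```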
---
## Transaction Patterns
### Prisma
```typescript
await prisma.$transaction(async (tx) => {
const user = await tx.user.create({ data: { email } });
await tx.post.create({ data: { title: 'Welcome', authorId: user.id } });
});
```
### Drizzle
```typescript
await db.transaction(async (tx) => {
const [user] = await tx.insert(users).values({ email }).returning();
await tx.insert(posts).values({ title: 'Welcome', authorId: user.id });
});
```
### TypeORM
```typescript
await dataSource.transaction(async (manager) => {
const user = await manager.save(User, { email });
await manager.save(Post, { title: 'Welcome', authorId: user.id });
});
```
### SQLAlchemy
```python
with Session() as session:
    try:
        user = User(email=email)
        session.add(user)
        session.flush()  # Get user.id without committing
        session.add(Post(title='Welcome', author_id=user.id))
        session.commit()
    except Exception:
        session.rollback()
        raise
```
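The all-or-nothing behaviour these patterns rely on can be checked without an ORM. The sketch below uses Python's built-in `sqlite3` module; the two tables and the `create_user_with_post` helper are hypothetical stand-ins for the `User`/`Post` models above, and the connection's context manager plays the role of the commit/rollback logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL, author_id INTEGER NOT NULL)"
)
conn.commit()


def create_user_with_post(conn, email, title):
    # `with conn:` commits on success and rolls back if the block raises.
    with conn:
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO posts (title, author_id) VALUES (?, ?)",
            (title, cur.lastrowid),
        )


create_user_with_post(conn, "a@example.com", "Welcome")

try:
    # title=None violates NOT NULL, so the user insert is rolled back as well.
    create_user_with_post(conn, "b@example.com", None)
except sqlite3.IntegrityError:
    pass

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(user_count, post_count)
```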
---
## Migration Workflows
### Prisma
```bash
# Generate migration from schema changes
npx prisma migrate dev --name add_posts_table
# Apply in production
npx prisma migrate deploy
# Reset database (dev only)
npx prisma migrate reset
# Generate client after schema change
npx prisma generate
```
**Files:** `prisma/migrations/<timestamp>_<name>/migration.sql`
### Drizzle
```bash
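# Generate migration SQL from schema diff (dialect comes from drizzle.config.ts)
npx drizzle-kit generate
# Push schema directly (dev only, no migration files)
npx drizzle-kit push
# Apply migrations
npx drizzle-kit migrate
# Note: drizzle-kit releases before 0.21 used dialect-suffixed commands (generate:pg, push:pg)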
```
**Files:** `drizzle/<timestamp>_<name>.sql`
### TypeORM
```bash
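# TypeORM 0.3+: the migration path is positional (the -n flag was removed).
# With a TypeScript data source, run these through `npx typeorm-ts-node-commonjs` instead.
# Auto-generate migration from entity changes
npx typeorm migration:generate -d data-source.ts src/migrations/AddPostsTable
# Create empty migration
npx typeorm migration:create src/migrations/CustomMigration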
# Run pending migrations
npx typeorm migration:run -d data-source.ts
# Revert last migration
npx typeorm migration:revert -d data-source.ts
```
**Files:** `src/migrations/<timestamp>-<Name>.ts`
### SQLAlchemy (Alembic)
```bash
# Initialize Alembic
alembic init alembic
# Auto-generate migration from model changes
alembic revision --autogenerate -m "add posts table"
# Apply all pending
alembic upgrade head
# Revert one step
alembic downgrade -1
# Show current state
alembic current
```
**Files:** `alembic/versions/<hash>_<slug>.py`
---
## N+1 Prevention Cheat Sheet
| ORM | Lazy (N+1 risk) | Eager (fixed) |
|-----|-----------------|---------------|
| **Prisma** | Omitting `include` and fetching posts per user in a loop | `include: { posts: true }` |
| **Drizzle** | Separate queries | `with: { posts: true }` |
| **TypeORM** | `@ManyToOne(() => ..., { lazy: true })` | `relations: ['posts']` or `leftJoinAndSelect` |
| **SQLAlchemy** | Default `lazy='select'` | `joinedload()` or `selectinload()` |
**Rule of thumb:** If you access a relation inside a loop, you have an N+1 problem. Always load relations before the loop.
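The rule of thumb can be verified by counting statements. This sketch uses Python's built-in `sqlite3` module with a trace callback instead of an ORM; the tables, sample rows, and the `posts_lazy`/`posts_eager` helpers are hypothetical. The eager version mirrors what `selectinload()` does: one query for the parents and one `IN (...)` query for all children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO users (name) VALUES ('ann'), ('bob'), ('cy');
    INSERT INTO posts (author_id, title) VALUES (1, 'a1'), (1, 'a2'), (2, 'b1');
""")

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement executed


def posts_lazy(conn):
    """N+1: one query for users, then one more per user inside the loop."""
    result = {}
    for user_id, name in conn.execute("SELECT id, name FROM users").fetchall():
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (user_id,)
        )
        result[name] = [title for (title,) in rows]
    return result


def posts_eager(conn):
    """Two queries total: users, then all of their posts in one IN (...) lookup."""
    users = conn.execute("SELECT id, name FROM users").fetchall()
    names = dict(users)
    result = {name: [] for name in names.values()}
    marks = ",".join("?" * len(users))
    rows = conn.execute(
        f"SELECT author_id, title FROM posts WHERE author_id IN ({marks}) ORDER BY id",
        [user_id for user_id, _ in users],
    )
    for author_id, title in rows:
        result[names[author_id]].append(title)
    return result


statements.clear()
lazy = posts_lazy(conn)
lazy_queries = len(statements)

statements.clear()
eager = posts_eager(conn)
eager_queries = len(statements)

print(lazy_queries, eager_queries)
```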
---
## Connection Pooling
### Prisma
```
# In .env or connection string
DATABASE_URL="postgresql://user:pass@host/db?connection_limit=20&pool_timeout=10"
```
### Drizzle (with node-postgres)
```typescript
import { Pool } from 'pg';
import { drizzle } from 'drizzle-orm/node-postgres';
const pool = new Pool({ max: 20, idleTimeoutMillis: 30000, connectionTimeoutMillis: 5000 });
const db = drizzle(pool);
```
### TypeORM
```typescript
const dataSource = new DataSource({
type: 'postgres',
extra: { max: 20, idleTimeoutMillis: 30000 },
});
```
### SQLAlchemy
```python
from sqlalchemy import create_engine
engine = create_engine('postgresql://user:pass@host/db', pool_size=20, max_overflow=5, pool_timeout=30)
```
---
## Best Practices Summary
1. **Always use migrations** — never modify production schemas by hand
2. **Eager load relations** — prevent N+1 in every list/collection query
3. **Use transactions** — group related writes to maintain consistency
4. **Parameterize raw SQL** — never concatenate user input into queries
5. **Connection pooling** — configure pool size matching your workload
6. **Index foreign keys** — ORMs often skip this; add manually if needed
7. **Review generated SQL** — enable query logging in development to catch inefficiencies
8. **Type-safe queries** — leverage TypeScript/Python typing for compile-time checks
9. **Separate read/write models** — use views or read replicas for heavy reporting queries
10. **Test migrations both ways** — always verify that down migrations actually reverse up migrations


@@ -0,0 +1,406 @@
# SQL Query Patterns Reference
Common query patterns for everyday database operations. All examples use PostgreSQL syntax with dialect notes where they differ.
---
## JOIN Patterns
### INNER JOIN — matching rows in both tables
```sql
SELECT u.name, o.id AS order_id, o.total
FROM users u
INNER JOIN orders o ON o.user_id = u.id
WHERE o.status = 'paid';
```
### LEFT JOIN — all rows from left, matching from right
```sql
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name;
```
Returns users even if they have zero orders.
### Self JOIN — comparing rows within the same table
```sql
-- Find employees who earn more than their manager
SELECT e.name AS employee, m.name AS manager, e.salary, m.salary AS manager_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
```
### CROSS JOIN — every combination (cartesian product)
```sql
-- Generate a calendar grid
SELECT d.date, s.shift_name
FROM dates d
CROSS JOIN shifts s;
```
Use intentionally. Accidental cartesian joins are a performance killer.
### LATERAL JOIN (PostgreSQL) — correlated subquery as a table
```sql
-- Top 3 orders per user
SELECT u.name, top_orders.*
FROM users u
CROSS JOIN LATERAL (
SELECT id, total FROM orders
WHERE user_id = u.id
ORDER BY total DESC LIMIT 3
) top_orders;
```
MySQL: `LATERAL` derived tables are supported from 8.0.14; on earlier 8.0 releases, use a subquery with `ROW_NUMBER()` and filter on the row number.
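A minimal sketch of the `ROW_NUMBER()` alternative, run against SQLite through Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer). The `orders` rows are invented sample data, and the query text should port to MySQL 8.0 and PostgreSQL with little or no change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO orders (user_id, total) VALUES
        (1, 10), (1, 50), (1, 30), (1, 20),
        (2, 5), (2, 15);
""")

# Number each user's orders by total (largest first), then keep the first N.
TOP_N_SQL = """
    SELECT user_id, id, total
    FROM (
        SELECT user_id, id, total,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY total DESC) AS rn
        FROM orders
    ) AS ranked
    WHERE rn <= ?
    ORDER BY user_id, total DESC
"""

top_orders = conn.execute(TOP_N_SQL, (3,)).fetchall()
for row in top_orders:
    print(row)
```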
---
## Common Table Expressions (CTEs)
### Basic CTE — readable subquery
```sql
WITH active_users AS (
SELECT id, name, email
FROM users
WHERE last_login > CURRENT_DATE - INTERVAL '30 days'
)
SELECT au.name, COUNT(o.id) AS recent_orders
FROM active_users au
JOIN orders o ON o.user_id = au.id
GROUP BY au.id, au.name; -- group by id so users who share a name are not merged
```
### Multiple CTEs — chaining transformations
```sql
WITH monthly_revenue AS (
SELECT DATE_TRUNC('month', created_at) AS month, SUM(total) AS revenue
FROM orders WHERE status = 'paid'
GROUP BY 1
),
growth AS (
SELECT month, revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_revenue,
ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) * 100, 1) AS growth_pct -- NULLIF avoids division by zero
FROM monthly_revenue
)
SELECT * FROM growth ORDER BY month;
```
### Recursive CTE — hierarchical data
```sql
-- Organization tree
WITH RECURSIVE org_tree AS (
-- Base case: top-level managers
SELECT id, name, manager_id, 0 AS depth
FROM employees WHERE manager_id IS NULL
UNION ALL
-- Recursive case: subordinates
SELECT e.id, e.name, e.manager_id, ot.depth + 1
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.id
)
SELECT * FROM org_tree ORDER BY depth, name;
```
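The organization-tree query can be tried end to end with Python's built-in `sqlite3` module, since SQLite also supports `WITH RECURSIVE`. The four employees below are made-up sample data; the CTE itself is the same shape as the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees (id, name, manager_id) VALUES
        (1, 'Ada', NULL), (2, 'Ben', 1), (3, 'Cam', 1), (4, 'Dee', 2);
""")

ORG_TREE_SQL = """
    WITH RECURSIVE org_tree AS (
        SELECT id, name, manager_id, 0 AS depth
        FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, e.manager_id, ot.depth + 1
        FROM employees e
        JOIN org_tree ot ON e.manager_id = ot.id
    )
    SELECT name, depth FROM org_tree ORDER BY depth, name
"""

tree = conn.execute(ORG_TREE_SQL).fetchall()
for name, depth in tree:
    print("  " * depth + name)  # indent each employee by reporting depth
```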
### Recursive CTE — path traversal
```sql
-- Category breadcrumb
WITH RECURSIVE breadcrumb AS (
SELECT id, name, parent_id, name::TEXT AS path
FROM categories WHERE id = 42
UNION ALL
SELECT c.id, c.name, c.parent_id, c.name || ' > ' || b.path
FROM categories c
JOIN breadcrumb b ON c.id = b.parent_id
)
SELECT path FROM breadcrumb WHERE parent_id IS NULL;
```
---
## Window Functions
### ROW_NUMBER — assign unique rank per partition
```sql
SELECT *, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS row_num
FROM employees;
```
### RANK and DENSE_RANK — handle ties
```sql
-- RANK: 1, 2, 2, 4 (skips after tie)
-- DENSE_RANK: 1, 2, 2, 3 (no skip)
SELECT name, salary,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
```
### Running total and moving average
```sql
SELECT date, amount,
SUM(amount) OVER (ORDER BY date) AS running_total,
AVG(amount) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg_7d
FROM daily_revenue;
```
### LAG / LEAD — access adjacent rows
```sql
SELECT date, revenue,
LAG(revenue, 1) OVER (ORDER BY date) AS prev_day,
revenue - LAG(revenue, 1) OVER (ORDER BY date) AS day_over_day_change
FROM daily_revenue;
```
### NTILE — divide into buckets
```sql
-- Split customers into quartiles by total spend
SELECT customer_id, total_spend,
NTILE(4) OVER (ORDER BY total_spend DESC) AS spend_quartile
FROM customer_summary;
```
### FIRST_VALUE / LAST_VALUE
```sql
SELECT department_id, name, salary,
FIRST_VALUE(name) OVER (PARTITION BY department_id ORDER BY salary DESC) AS highest_paid
FROM employees;
```
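`LAST_VALUE` has a common pitfall: with an `ORDER BY` and no explicit frame, the window ends at the current row, so it returns the current row's value rather than the partition's last. The sketch below shows both forms using Python's built-in `sqlite3` module (SQLite 3.25+ assumed); the three employees are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES ('Ada', 1, 300), ('Ben', 1, 200), ('Cam', 1, 100);
""")

rows = conn.execute("""
    SELECT name,
           -- Default frame stops at the current row
           LAST_VALUE(name) OVER (
               PARTITION BY department_id ORDER BY salary DESC
           ) AS default_frame,
           -- Explicit frame covers the whole partition
           LAST_VALUE(name) OVER (
               PARTITION BY department_id ORDER BY salary DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS lowest_paid
    FROM employees
    ORDER BY salary DESC
""").fetchall()

for row in rows:
    print(row)
```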
---
## Subquery Patterns
### EXISTS — correlated existence check
```sql
-- Users who have placed at least one order
SELECT u.* FROM users u
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
```
### NOT EXISTS — safer than NOT IN for NULLs
```sql
-- Users who have never ordered
SELECT u.* FROM users u
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id);
```
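The NULL pitfall is easy to reproduce. In this sketch (Python's built-in `sqlite3`, with made-up `users` and `orders` rows), a single NULL `user_id` makes `NOT IN` return nothing, while `NOT EXISTS` still finds the user without orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id, name) VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders (user_id) VALUES (1), (NULL);
""")

# `2 NOT IN (1, NULL)` evaluates to NULL, not TRUE, so bob is filtered out.
not_in = conn.execute(
    "SELECT name FROM users WHERE id NOT IN (SELECT user_id FROM orders)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs do not interfere.
not_exists = conn.execute("""
    SELECT u.name FROM users u
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id)
""").fetchall()

print(not_in, not_exists)
```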
### Scalar subquery — single value
```sql
SELECT name, salary,
salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;
```
### Derived table — subquery in FROM
```sql
SELECT dept, avg_salary
FROM (
SELECT department_id AS dept, AVG(salary) AS avg_salary
FROM employees GROUP BY department_id
) dept_avg
WHERE avg_salary > 100000;
```
---
## Aggregation Patterns
### GROUP BY with HAVING
```sql
-- Departments with more than 10 employees
SELECT department_id, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 10;
```
### GROUPING SETS — multiple grouping levels
```sql
SELECT region, product_category, SUM(revenue)
FROM sales
GROUP BY GROUPING SETS (
(region, product_category),
(region),
(product_category),
()
);
```
### ROLLUP — hierarchical subtotals
```sql
SELECT region, city, SUM(revenue)
FROM sales
GROUP BY ROLLUP (region, city);
-- Produces: (region, city), (region), ()
```
### CUBE — all combinations
```sql
SELECT region, product, SUM(revenue)
FROM sales
GROUP BY CUBE (region, product);
```
### FILTER clause (PostgreSQL) — conditional aggregation
```sql
SELECT
COUNT(*) AS total,
COUNT(*) FILTER (WHERE status = 'paid') AS paid,
COUNT(*) FILTER (WHERE status = 'cancelled') AS cancelled,
SUM(total) FILTER (WHERE status = 'paid') AS paid_revenue
FROM orders;
```
MySQL/SQL Server equivalent: `SUM(CASE WHEN status = 'paid' THEN 1 ELSE 0 END)`.
---
## UPSERT Patterns
### PostgreSQL — ON CONFLICT
```sql
INSERT INTO user_settings (user_id, key, value, updated_at)
VALUES (1, 'theme', 'dark', NOW())
ON CONFLICT (user_id, key)
DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at;
```
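SQLite (3.24+) accepts the same `ON CONFLICT ... DO UPDATE` form, which makes the pattern easy to exercise locally. This is a sketch using Python's built-in `sqlite3` module; the column is named `key_name` here only to sidestep keyword concerns, and the table is a simplified version of `user_settings` without `updated_at`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_settings (
        user_id INTEGER NOT NULL,
        key_name TEXT NOT NULL,
        value TEXT,
        PRIMARY KEY (user_id, key_name)
    )
""")

UPSERT_SQL = """
    INSERT INTO user_settings (user_id, key_name, value)
    VALUES (?, ?, ?)
    ON CONFLICT (user_id, key_name)
    DO UPDATE SET value = excluded.value
"""

conn.execute(UPSERT_SQL, (1, "theme", "light"))  # no conflict: inserts a row
conn.execute(UPSERT_SQL, (1, "theme", "dark"))   # conflict: updates the same row
settings = conn.execute("SELECT user_id, key_name, value FROM user_settings").fetchall()
print(settings)
```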
### MySQL — ON DUPLICATE KEY
```sql
INSERT INTO user_settings (user_id, key_name, value, updated_at)
VALUES (1, 'theme', 'dark', NOW())
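ON DUPLICATE KEY UPDATE value = VALUES(value), updated_at = VALUES(updated_at);
-- MySQL 8.0.19+: prefer a row alias (... VALUES (...) AS new ON DUPLICATE KEY UPDATE value = new.value);
-- VALUES() in the UPDATE clause is deprecated as of 8.0.20.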
```
### SQL Server — MERGE
```sql
MERGE INTO user_settings AS target
USING (VALUES (1, 'theme', 'dark')) AS source (user_id, key_name, value)
ON target.user_id = source.user_id AND target.key_name = source.key_name
WHEN MATCHED THEN UPDATE SET value = source.value, updated_at = GETDATE()
WHEN NOT MATCHED THEN INSERT (user_id, key_name, value, updated_at)
VALUES (source.user_id, source.key_name, source.value, GETDATE());
```
---
## JSON Operations
### PostgreSQL JSONB
```sql
-- Extract field
SELECT data->>'name' AS name FROM products WHERE data->>'category' = 'electronics';
-- Array contains
SELECT * FROM products WHERE data->'tags' ? 'sale';
-- Update nested field
UPDATE products SET data = jsonb_set(data, '{price}', '29.99') WHERE id = 1;
-- Aggregate into JSON array
SELECT jsonb_agg(jsonb_build_object('id', id, 'name', name)) FROM users;
```
### MySQL JSON
```sql
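-- Extract field (JSON_EXTRACT returns a JSON value, so strings keep their quotes)
SELECT JSON_EXTRACT(data, '$.name') AS name FROM products;
-- Unquoted shorthand: SELECT data->>"$.name" (same as JSON_UNQUOTE(JSON_EXTRACT(...)))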
-- Search in array
SELECT * FROM products WHERE JSON_CONTAINS(data->"$.tags", '"sale"');
-- Update
UPDATE products SET data = JSON_SET(data, '$.price', 29.99) WHERE id = 1;
```
---
## Pagination Patterns
### Offset pagination (simple but slow for deep pages)
```sql
SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 40;
```
### Keyset pagination (fast, requires ordered unique column)
```sql
-- Page after the last seen id
SELECT * FROM products WHERE id > :last_seen_id ORDER BY id LIMIT 20;
```
### Keyset with composite sort
```sql
SELECT * FROM products
WHERE (created_at, id) < (:last_created_at, :last_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;
```
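On the application side, keyset pagination becomes a loop that remembers the last key it saw. A minimal sketch with Python's built-in `sqlite3` module follows; the `products` rows and the `iter_pages` helper are hypothetical, and starting from `0` assumes ids are positive integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [(f"item-{n}",) for n in range(1, 8)],
)


def iter_pages(conn, page_size):
    """Yield pages of (id, name), resuming after the last id seen."""
    last_seen_id = 0  # assumption: ids are positive integers
    while True:
        page = conn.execute(
            "SELECT id, name FROM products WHERE id > ? ORDER BY id LIMIT ?",
            (last_seen_id, page_size),
        ).fetchall()
        if not page:
            return
        yield page
        last_seen_id = page[-1][0]


pages = list(iter_pages(conn, 3))
print([len(page) for page in pages])
```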
---
## Bulk Operations
### Batch INSERT
```sql
INSERT INTO events (type, payload, created_at) VALUES
('click', '{"page": "/home"}', NOW()),
('view', '{"page": "/pricing"}', NOW()),
('click', '{"page": "/signup"}', NOW());
```
### Batch UPDATE with VALUES
```sql
UPDATE products AS p SET price = v.price
FROM (VALUES (1, 29.99), (2, 49.99), (3, 9.99)) AS v(id, price)
WHERE p.id = v.id;
```
### DELETE with subquery
```sql
DELETE FROM sessions
WHERE user_id IN (SELECT id FROM users WHERE deleted_at IS NOT NULL);
```
### COPY (PostgreSQL bulk load)
```sql
-- Reads a file on the database server; from psql, use \copy to load a client-side file
COPY products (name, price, category) FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);
```
---
## Utility Patterns
### Generate series (PostgreSQL)
```sql
-- Fill date gaps
SELECT d::date FROM generate_series('2025-01-01'::date, '2025-12-31', '1 day') d;
```
### Deduplicate rows
```sql
DELETE FROM events a USING events b
WHERE a.id > b.id AND a.user_id = b.user_id AND a.event_type = b.event_type
AND a.created_at = b.created_at;
```
### Pivot (manual)
```sql
SELECT user_id,
SUM(CASE WHEN month = 1 THEN revenue END) AS jan,
SUM(CASE WHEN month = 2 THEN revenue END) AS feb,
SUM(CASE WHEN month = 3 THEN revenue END) AS mar
FROM monthly_revenue
GROUP BY user_id;
```
### Conditional INSERT (skip if exists)
```sql
INSERT INTO tags (name) SELECT 'new-tag'
WHERE NOT EXISTS (SELECT 1 FROM tags WHERE name = 'new-tag');
```


@@ -0,0 +1,442 @@
#!/usr/bin/env python3
"""
Migration Generator
Generates database migration file templates (up/down) from natural-language
schema change descriptions.
Supported operations:
- Add column, drop column, rename column
- Add table, drop table, rename table
- Add index, drop index
- Add constraint, drop constraint
- Change column type
Usage:
python migration_generator.py --change "add email_verified boolean to users" --dialect postgres
python migration_generator.py --change "rename column name to full_name in customers" --format alembic
python migration_generator.py --change "add index on orders(status, created_at)" --output 001_add_index.sql
python migration_generator.py --change "create table reviews with id, user_id, rating, body" --json
"""
import argparse
import json
import os
import re
import sys
import textwrap
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import List, Optional, Tuple
@dataclass
class Migration:
    """A generated migration with up and down scripts."""
    description: str
    dialect: str
    format: str
    up: str
    down: str
    warnings: List[str]

    def to_dict(self):
        return asdict(self)
# ---------------------------------------------------------------------------
# Change parsers — extract structured intent from natural language
# ---------------------------------------------------------------------------
def parse_add_column(desc: str) -> Optional[dict]:
    """Parse: add <column> <type> to <table>"""
    m = re.match(
        r'add\s+(?:column\s+)?(\w+)\s+(\w[\w(),.]*)\s+(?:to|on)\s+(\w+)',
        desc, re.IGNORECASE,
    )
    if m:
        return {"op": "add_column", "column": m.group(1), "type": m.group(2), "table": m.group(3)}
    return None


def parse_drop_column(desc: str) -> Optional[dict]:
    """Parse: drop/remove <column> from <table>"""
    m = re.match(
        r'(?:drop|remove)\s+(?:column\s+)?(\w+)\s+from\s+(\w+)',
        desc, re.IGNORECASE,
    )
    if m:
        return {"op": "drop_column", "column": m.group(1), "table": m.group(2)}
    return None


def parse_rename_column(desc: str) -> Optional[dict]:
    """Parse: rename column <old> to <new> in <table>"""
    m = re.match(
        r'rename\s+column\s+(\w+)\s+to\s+(\w+)\s+in\s+(\w+)',
        desc, re.IGNORECASE,
    )
    if m:
        return {"op": "rename_column", "old": m.group(1), "new": m.group(2), "table": m.group(3)}
    return None


def parse_add_table(desc: str) -> Optional[dict]:
    """Parse: create table <name> with <col1>, <col2>, ..."""
    m = re.match(
        r'create\s+table\s+(\w+)\s+with\s+(.+)',
        desc, re.IGNORECASE,
    )
    if m:
        cols = [c.strip() for c in m.group(2).split(",")]
        return {"op": "add_table", "table": m.group(1), "columns": cols}
    return None


def parse_drop_table(desc: str) -> Optional[dict]:
    """Parse: drop table <name>"""
    m = re.match(r'drop\s+table\s+(\w+)', desc, re.IGNORECASE)
    if m:
        return {"op": "drop_table", "table": m.group(1)}
    return None


def parse_add_index(desc: str) -> Optional[dict]:
    """Parse: add index on <table>(<col1>, <col2>)"""
    m = re.match(
        r'add\s+(?:unique\s+)?index\s+(?:on\s+)?(\w+)\s*\(([^)]+)\)',
        desc, re.IGNORECASE,
    )
    if m:
        unique = "unique" in desc.lower()
        cols = [c.strip() for c in m.group(2).split(",")]
        return {"op": "add_index", "table": m.group(1), "columns": cols, "unique": unique}
    return None


def parse_change_type(desc: str) -> Optional[dict]:
    """Parse: change <column> type to <type> in <table>"""
    m = re.match(
        r'change\s+(?:column\s+)?(\w+)\s+type\s+to\s+(\w[\w(),.]*)\s+in\s+(\w+)',
        desc, re.IGNORECASE,
    )
    if m:
        return {"op": "change_type", "column": m.group(1), "new_type": m.group(2), "table": m.group(3)}
    return None
PARSERS = [
    parse_add_column,
    parse_drop_column,
    parse_rename_column,
    parse_add_table,
    parse_drop_table,
    parse_add_index,
    parse_change_type,
]


def parse_change(desc: str) -> Optional[dict]:
    """Try each parser in order and return the first structured match."""
    for parser in PARSERS:
        result = parser(desc)
        if result:
            return result
    return None
# ---------------------------------------------------------------------------
# SQL generators per dialect
# ---------------------------------------------------------------------------
TYPE_MAP = {
    "boolean": {"postgres": "BOOLEAN", "mysql": "TINYINT(1)", "sqlite": "INTEGER", "sqlserver": "BIT"},
    "text": {"postgres": "TEXT", "mysql": "TEXT", "sqlite": "TEXT", "sqlserver": "NVARCHAR(MAX)"},
    "integer": {"postgres": "INTEGER", "mysql": "INT", "sqlite": "INTEGER", "sqlserver": "INT"},
    "int": {"postgres": "INTEGER", "mysql": "INT", "sqlite": "INTEGER", "sqlserver": "INT"},
    "serial": {"postgres": "SERIAL", "mysql": "INT AUTO_INCREMENT", "sqlite": "INTEGER", "sqlserver": "INT IDENTITY(1,1)"},
    "varchar": {"postgres": "VARCHAR(255)", "mysql": "VARCHAR(255)", "sqlite": "TEXT", "sqlserver": "NVARCHAR(255)"},
    "timestamp": {"postgres": "TIMESTAMP", "mysql": "DATETIME", "sqlite": "TEXT", "sqlserver": "DATETIME2"},
    "uuid": {"postgres": "UUID", "mysql": "CHAR(36)", "sqlite": "TEXT", "sqlserver": "UNIQUEIDENTIFIER"},
    "json": {"postgres": "JSONB", "mysql": "JSON", "sqlite": "TEXT", "sqlserver": "NVARCHAR(MAX)"},
    "decimal": {"postgres": "DECIMAL(19,4)", "mysql": "DECIMAL(19,4)", "sqlite": "REAL", "sqlserver": "DECIMAL(19,4)"},
    "float": {"postgres": "DOUBLE PRECISION", "mysql": "DOUBLE", "sqlite": "REAL", "sqlserver": "FLOAT"},
}


def map_type(type_name: str, dialect: str) -> str:
    """Map a generic type name to a dialect-specific type."""
    # Parameterized types such as varchar(100) are not in TYPE_MAP and pass through upper-cased.
    key = type_name.lower().rstrip("()")
    if key in TYPE_MAP and dialect in TYPE_MAP[key]:
        return TYPE_MAP[key][dialect]
    return type_name.upper()
def gen_add_column(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
    col_type = map_type(change["type"], dialect)
    table = change["table"]
    col = change["column"]
    # SQL Server's ALTER TABLE ... ADD does not accept the COLUMN keyword.
    add_kw = "ADD" if dialect == "sqlserver" else "ADD COLUMN"
    up = f"ALTER TABLE {table} {add_kw} {col} {col_type};"
    down = f"ALTER TABLE {table} DROP COLUMN {col};"
    return up, down, []


def gen_drop_column(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
    table = change["table"]
    col = change["column"]
    # SQL Server's ALTER TABLE ... ADD does not accept the COLUMN keyword.
    add_kw = "ADD" if dialect == "sqlserver" else "ADD COLUMN"
    up = f"ALTER TABLE {table} DROP COLUMN {col};"
    down = f"-- WARNING: Cannot fully reverse DROP COLUMN. Provide the original type.\nALTER TABLE {table} {add_kw} {col} TEXT;"
    return up, down, ["Down migration uses TEXT as placeholder. Replace with the original column type."]
def gen_rename_column(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
    table = change["table"]
    old, new = change["old"], change["new"]
    warnings = []
    if dialect == "postgres":
        up = f"ALTER TABLE {table} RENAME COLUMN {old} TO {new};"
        down = f"ALTER TABLE {table} RENAME COLUMN {new} TO {old};"
    elif dialect == "mysql":
        up = f"ALTER TABLE {table} RENAME COLUMN {old} TO {new};"
        down = f"ALTER TABLE {table} RENAME COLUMN {new} TO {old};"
    elif dialect == "sqlite":
        up = f"ALTER TABLE {table} RENAME COLUMN {old} TO {new};"
        down = f"ALTER TABLE {table} RENAME COLUMN {new} TO {old};"
        warnings.append("SQLite RENAME COLUMN requires version 3.25.0+.")
    elif dialect == "sqlserver":
        up = f"EXEC sp_rename '{table}.{old}', '{new}', 'COLUMN';"
        down = f"EXEC sp_rename '{table}.{new}', '{old}', 'COLUMN';"
    else:
        up = f"ALTER TABLE {table} RENAME COLUMN {old} TO {new};"
        down = f"ALTER TABLE {table} RENAME COLUMN {new} TO {old};"
    return up, down, warnings
def gen_add_table(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
    table = change["table"]
    cols = change["columns"]
    col_defs = []
    has_id = False

    for col in cols:
        col = col.strip()
        if col.lower() == "id":
            has_id = True
            if dialect == "postgres":
                col_defs.append("    id SERIAL PRIMARY KEY")
            elif dialect == "mysql":
                col_defs.append("    id INT AUTO_INCREMENT PRIMARY KEY")
            elif dialect == "sqlite":
                col_defs.append("    id INTEGER PRIMARY KEY AUTOINCREMENT")
            elif dialect == "sqlserver":
                col_defs.append("    id INT IDENTITY(1,1) PRIMARY KEY")
        else:
            # Check if type is specified (e.g., "rating int")
            parts = col.split()
            if len(parts) >= 2:
                col_defs.append(f"    {parts[0]} {map_type(parts[1], dialect)}")
            else:
                col_defs.append(f"    {col} TEXT")

    cols_sql = ",\n".join(col_defs)
    up = f"CREATE TABLE {table} (\n{cols_sql}\n);"
    down = f"DROP TABLE {table};"
    warnings = []
    if not has_id:
        warnings.append("Table has no explicit primary key. Consider adding an 'id' column.")
    return up, down, warnings
def gen_drop_table(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
    table = change["table"]
    up = f"DROP TABLE {table};"
    down = f"-- WARNING: Cannot reverse DROP TABLE without original DDL.\nCREATE TABLE {table} (id INTEGER PRIMARY KEY);"
    return up, down, ["Down migration is a placeholder. Replace with the original CREATE TABLE statement."]
def gen_add_index(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
    table = change["table"]
    cols = change["columns"]
    unique = "UNIQUE " if change.get("unique") else ""
    idx_name = f"idx_{table}_{'_'.join(cols)}"
    if dialect == "postgres":
        up = f"CREATE {unique}INDEX CONCURRENTLY {idx_name} ON {table} ({', '.join(cols)});"
    else:
        up = f"CREATE {unique}INDEX {idx_name} ON {table} ({', '.join(cols)});"
    # MySQL and SQL Server require the table name when dropping an index.
    if dialect in ("mysql", "sqlserver"):
        down = f"DROP INDEX {idx_name} ON {table};"
    else:
        down = f"DROP INDEX {idx_name};"
    warnings = []
    if dialect == "postgres":
        warnings.append("CONCURRENTLY cannot run inside a transaction. Run outside migration transaction.")
    return up, down, warnings
def gen_change_type(change: dict, dialect: str) -> Tuple[str, str, List[str]]:
table = change["table"]
col = change["column"]
new_type = map_type(change["new_type"], dialect)
warnings = ["Down migration uses TEXT as placeholder. Replace with the original column type."]
if dialect == "postgres":
up = f"ALTER TABLE {table} ALTER COLUMN {col} TYPE {new_type};"
down = f"ALTER TABLE {table} ALTER COLUMN {col} TYPE TEXT;"
elif dialect == "mysql":
up = f"ALTER TABLE {table} MODIFY COLUMN {col} {new_type};"
down = f"ALTER TABLE {table} MODIFY COLUMN {col} TEXT;"
elif dialect == "sqlserver":
up = f"ALTER TABLE {table} ALTER COLUMN {col} {new_type};"
down = f"ALTER TABLE {table} ALTER COLUMN {col} NVARCHAR(MAX);"
else:
up = f"-- SQLite does not support ALTER COLUMN. Recreate the table."
down = f"-- SQLite does not support ALTER COLUMN. Recreate the table."
warnings.append("SQLite requires table recreation for type changes.")
return up, down, warnings
GENERATORS = {
"add_column": gen_add_column,
"drop_column": gen_drop_column,
"rename_column": gen_rename_column,
"add_table": gen_add_table,
"drop_table": gen_drop_table,
"add_index": gen_add_index,
"change_type": gen_change_type,
}
# ---------------------------------------------------------------------------
# Format wrappers
# ---------------------------------------------------------------------------
def wrap_sql(up: str, down: str, description: str) -> Tuple[str, str]:
    """Wrap as plain SQL migration files."""
    header = f"-- Migration: {description}\n-- Generated: {datetime.now().isoformat()}\n\n"
    return header + "-- Up\n" + up, header + "-- Down\n" + down
def wrap_prisma(up: str, down: str, description: str) -> Tuple[str, str]:
"""Format as Prisma migration SQL (Prisma uses raw SQL in migration.sql)."""
header = f"-- Migration: {description}\n-- Format: Prisma (migration.sql)\n\n"
return header + up, header + "-- Rollback\n" + down
def wrap_alembic(up: str, down: str, description: str) -> Tuple[str, str]:
"""Format as Alembic Python migration."""
slug = re.sub(r'\W+', '_', description.lower())[:40]
revision = datetime.now().strftime("%Y%m%d%H%M")
template = textwrap.dedent(f'''\
"""
{description}
Revision ID: {revision}
"""
from alembic import op
import sqlalchemy as sa
revision = '{revision}'
down_revision = None # Set to previous revision
def upgrade():
op.execute("""
{textwrap.indent(up, " ")}
""")
def downgrade():
op.execute("""
{textwrap.indent(down, " ")}
""")
''')
return template, ""
FORMATTERS = {
"sql": wrap_sql,
"prisma": wrap_prisma,
"alembic": wrap_alembic,
}
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(
description="Generate database migration templates from change descriptions.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Supported change descriptions:
"add email_verified boolean to users"
"drop column legacy_flag from accounts"
"rename column name to full_name in customers"
"create table reviews with id, user_id, rating int, body text"
"drop table temp_imports"
"add index on orders(status, created_at)"
"add unique index on users(email)"
"change email type to varchar in users"
Examples:
%(prog)s --change "add phone varchar to users" --dialect postgres
%(prog)s --change "create table reviews with id, user_id, rating int, body" --format prisma
%(prog)s --change "add index on orders(status)" --output migrations/001.sql --json
""",
)
parser.add_argument("--change", required=True, help="Natural-language description of the schema change")
parser.add_argument("--dialect", choices=["postgres", "mysql", "sqlite", "sqlserver"],
default="postgres", help="Target database dialect (default: postgres)")
parser.add_argument("--format", choices=["sql", "prisma", "alembic"], default="sql",
dest="fmt", help="Output format (default: sql)")
parser.add_argument("--output", help="Write migration to file instead of stdout")
parser.add_argument("--json", action="store_true", dest="json_output", help="Output as JSON")
args = parser.parse_args()
change = parse_change(args.change)
if not change:
print(f"Error: Could not parse change description: '{args.change}'", file=sys.stderr)
print("Run with --help to see supported patterns.", file=sys.stderr)
sys.exit(1)
gen_fn = GENERATORS.get(change["op"])
if not gen_fn:
print(f"Error: No generator for operation '{change['op']}'", file=sys.stderr)
sys.exit(1)
up, down, warnings = gen_fn(change, args.dialect)
fmt_fn = FORMATTERS[args.fmt]
up_formatted, down_formatted = fmt_fn(up, down, args.change)
migration = Migration(
description=args.change,
dialect=args.dialect,
format=args.fmt,
up=up_formatted,
down=down_formatted,
warnings=warnings,
)
    if args.json_output:
        print(json.dumps(migration.to_dict(), indent=2))
    else:
        if args.output:
            with open(args.output, "w") as f:
                f.write(migration.up)
            print(f"Migration written to {args.output}")
            if migration.down:
                # Swap only a trailing ".sql"; otherwise append the suffix so the
                # rollback file can never overwrite the up migration.
                if args.output.endswith(".sql"):
                    down_path = args.output[: -len(".sql")] + "_down.sql"
                else:
                    down_path = args.output + "_down.sql"
                with open(down_path, "w") as f:
                    f.write(migration.down)
                print(f"Rollback written to {down_path}")
        else:
            print(migration.up)
            if migration.down:
                print("\n" + "=" * 40 + " ROLLBACK " + "=" * 40 + "\n")
                print(migration.down)
        if warnings:
            print("\nWarnings:")
            for w in warnings:
                print(f"  - {w}")
if __name__ == "__main__":
main()
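
The dialect differences that `gen_add_index` has to handle can be seen in isolation. The following is a minimal standalone sketch, not the script's exact code: `index_statements` is a hypothetical helper name, and it assumes that MySQL and SQL Server take the `DROP INDEX ... ON table` form while PostgreSQL and SQLite do not.

```python
from typing import List, Tuple


def index_statements(table: str, cols: List[str], dialect: str,
                     unique: bool = False) -> Tuple[str, str]:
    """Return (up, down) SQL for adding an index (hypothetical helper)."""
    name = f"idx_{table}_{'_'.join(cols)}"
    kind = "UNIQUE INDEX" if unique else "INDEX"
    # PostgreSQL can build the index without blocking writes, but only
    # when the statement runs outside a transaction block.
    concurrently = " CONCURRENTLY" if dialect == "postgres" else ""
    up = f"CREATE {kind}{concurrently} {name} ON {table} ({', '.join(cols)});"
    # Assumption: MySQL and SQL Server need the table name on DROP INDEX.
    if dialect in ("mysql", "sqlserver"):
        down = f"DROP INDEX {name} ON {table};"
    else:
        down = f"DROP INDEX {name};"
    return up, down


if __name__ == "__main__":
    for d in ("postgres", "mysql", "sqlite", "sqlserver"):
        up, down = index_statements("orders", ["status", "created_at"], d)
        print(f"{d}:\n  {up}\n  {down}")
```

Keeping the up and down statements in one function makes it harder for the rollback to drift from the forward migration when a new dialect is added.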


@@ -0,0 +1,348 @@
#!/usr/bin/env python3
"""
SQL Query Optimizer — Static Analysis
Analyzes SQL queries for common performance issues:
- SELECT * usage
- Missing WHERE clauses on UPDATE/DELETE
- Cartesian joins (missing JOIN conditions)
- Subqueries in SELECT list
- Missing LIMIT on unbounded SELECTs
- Function calls on indexed columns (non-sargable)
- LIKE with leading wildcard
- ORDER BY RAND()
- UNION instead of UNION ALL
- NOT IN with subquery (NULL-unsafe)
Usage:
python query_optimizer.py --query "SELECT * FROM users"
python query_optimizer.py --query queries.sql --dialect postgres
python query_optimizer.py --query "SELECT * FROM orders" --json
"""
import argparse
import json
import os
import re
import sys
from dataclasses import dataclass, asdict
from typing import List, Optional
@dataclass
class Issue:
"""A single optimization issue found in a query."""
severity: str # critical, warning, info
rule: str
message: str
suggestion: str
line: Optional[int] = None
@dataclass
class QueryAnalysis:
"""Analysis result for one SQL query."""
query: str
issues: List[Issue]
score: int # 0-100, higher is better
def to_dict(self):
return {
"query": self.query[:200] + ("..." if len(self.query) > 200 else ""),
"issues": [asdict(i) for i in self.issues],
"issue_count": len(self.issues),
"score": self.score,
}
# ---------------------------------------------------------------------------
# Rule checkers
# ---------------------------------------------------------------------------
def check_select_star(sql: str) -> Optional[Issue]:
"""Detect SELECT * usage."""
if re.search(r'\bSELECT\s+\*\s', sql, re.IGNORECASE):
return Issue(
severity="warning",
rule="select-star",
message="SELECT * transfers unnecessary data and breaks on schema changes.",
suggestion="List only the columns you need: SELECT col1, col2, ...",
)
return None
def check_missing_where(sql: str) -> Optional[Issue]:
"""Detect UPDATE/DELETE without WHERE."""
upper = sql.upper().strip()
for keyword in ("UPDATE", "DELETE"):
if upper.startswith(keyword) and "WHERE" not in upper:
return Issue(
severity="critical",
rule="missing-where",
message=f"{keyword} without WHERE affects every row in the table.",
suggestion=f"Add a WHERE clause to restrict the {keyword} scope.",
)
return None
def check_cartesian_join(sql: str) -> Optional[Issue]:
"""Detect comma-separated tables without explicit JOIN or WHERE join condition."""
upper = sql.upper()
if "SELECT" not in upper:
return None
from_match = re.search(r'\bFROM\s+(.+?)(?:\bWHERE\b|\bGROUP\b|\bORDER\b|\bLIMIT\b|\bHAVING\b|;|$)',
sql, re.IGNORECASE | re.DOTALL)
if not from_match:
return None
from_clause = from_match.group(1)
# Skip if explicit JOINs are used
if re.search(r'\bJOIN\b', from_clause, re.IGNORECASE):
return None
# Count comma-separated tables
tables = [t.strip() for t in from_clause.split(",") if t.strip()]
if len(tables) > 1 and "WHERE" not in upper:
return Issue(
severity="critical",
rule="cartesian-join",
message="Multiple tables in FROM without JOIN or WHERE creates a cartesian product.",
suggestion="Use explicit JOIN syntax with ON conditions.",
)
return None
def check_subquery_in_select(sql: str) -> Optional[Issue]:
    """Detect subqueries in the SELECT list (these are often correlated)."""
    select_match = re.search(r'\bSELECT\b(.+?)\bFROM\b', sql, re.IGNORECASE | re.DOTALL)
    if select_match:
        select_clause = select_match.group(1)
        if re.search(r'\(\s*SELECT\b', select_clause, re.IGNORECASE):
            return Issue(
                severity="warning",
                rule="subquery-in-select",
                message="Subquery in SELECT list may execute once per row if it is correlated.",
                suggestion="Rewrite as a LEFT JOIN with aggregation.",
            )
    return None
def check_missing_limit(sql: str) -> Optional[Issue]:
"""Detect unbounded SELECT without LIMIT."""
upper = sql.upper().strip()
if not upper.startswith("SELECT"):
return None
# Skip if it's a subquery or aggregate-only
if re.search(r'\bCOUNT\s*\(', upper) and "GROUP BY" not in upper:
return None
if "LIMIT" not in upper and "FETCH" not in upper and "TOP " not in upper:
return Issue(
severity="info",
rule="missing-limit",
message="SELECT without LIMIT may return unbounded rows.",
suggestion="Add LIMIT to prevent returning excessive data.",
)
return None
def check_function_on_column(sql: str) -> Optional[Issue]:
"""Detect function calls on columns in WHERE (non-sargable)."""
where_match = re.search(r'\bWHERE\b(.+?)(?:\bGROUP\b|\bORDER\b|\bLIMIT\b|\bHAVING\b|;|$)',
sql, re.IGNORECASE | re.DOTALL)
if not where_match:
return None
where_clause = where_match.group(1)
non_sargable = re.search(
r'\b(YEAR|MONTH|DAY|DATE|UPPER|LOWER|TRIM|CAST|COALESCE|IFNULL|NVL)\s*\(',
where_clause, re.IGNORECASE
)
if non_sargable:
func = non_sargable.group(1).upper()
return Issue(
severity="warning",
rule="non-sargable",
message=f"Function {func}() on column in WHERE prevents index usage.",
suggestion="Rewrite to compare the raw column against transformed constants.",
)
return None
def check_leading_wildcard(sql: str) -> Optional[Issue]:
    """Detect LIKE '%...' patterns."""
    if re.search(r"LIKE\s+'%", sql, re.IGNORECASE):
        return Issue(
            severity="warning",
            rule="leading-wildcard",
            message="LIKE with leading wildcard prevents B-tree index usage.",
            suggestion="Use a trigram index (e.g. pg_trgm) for substring matching, or full-text search (GIN index, FULLTEXT, FTS5) for word matching.",
        )
    return None
def check_order_by_rand(sql: str) -> Optional[Issue]:
"""Detect ORDER BY RAND() / RANDOM()."""
if re.search(r'ORDER\s+BY\s+(RAND|RANDOM)\s*\(\)', sql, re.IGNORECASE):
return Issue(
severity="warning",
rule="order-by-rand",
message="ORDER BY RAND() scans and sorts the entire table.",
suggestion="Use application-side random sampling or TABLESAMPLE.",
)
return None
def check_union_vs_union_all(sql: str) -> Optional[Issue]:
"""Detect UNION without ALL (unnecessary dedup)."""
if re.search(r'\bUNION\b(?!\s+ALL\b)', sql, re.IGNORECASE):
return Issue(
severity="info",
rule="union-without-all",
message="UNION performs deduplication sort; use UNION ALL if duplicates are acceptable.",
suggestion="Replace UNION with UNION ALL unless you specifically need deduplication.",
)
return None
def check_not_in_subquery(sql: str) -> Optional[Issue]:
"""Detect NOT IN (SELECT ...) which is NULL-unsafe."""
if re.search(r'\bNOT\s+IN\s*\(\s*SELECT\b', sql, re.IGNORECASE):
return Issue(
severity="warning",
rule="not-in-subquery",
message="NOT IN with subquery returns no rows if any subquery result is NULL.",
suggestion="Use NOT EXISTS (SELECT 1 ...) instead.",
)
return None
ALL_CHECKS = [
check_select_star,
check_missing_where,
check_cartesian_join,
check_subquery_in_select,
check_missing_limit,
check_function_on_column,
check_leading_wildcard,
check_order_by_rand,
check_union_vs_union_all,
check_not_in_subquery,
]
# ---------------------------------------------------------------------------
# Analysis engine
# ---------------------------------------------------------------------------
def analyze_query(sql: str, dialect: str = "postgres") -> QueryAnalysis:
"""Run all checks against a single SQL query."""
issues: List[Issue] = []
for check_fn in ALL_CHECKS:
issue = check_fn(sql)
if issue:
issues.append(issue)
# Score: start at 100, deduct per severity
score = 100
for issue in issues:
if issue.severity == "critical":
score -= 25
elif issue.severity == "warning":
score -= 10
else:
score -= 5
score = max(0, score)
return QueryAnalysis(query=sql.strip(), issues=issues, score=score)
def split_queries(text: str) -> List[str]:
"""Split SQL text into individual statements."""
queries = []
for stmt in text.split(";"):
stmt = stmt.strip()
if stmt and len(stmt) > 5:
queries.append(stmt + ";")
return queries
# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------
SEVERITY_ICONS = {"critical": "[CRITICAL]", "warning": "[WARNING]", "info": "[INFO]"}
def format_text(analyses: List[QueryAnalysis]) -> str:
"""Format analysis results as human-readable text."""
lines = []
for i, analysis in enumerate(analyses, 1):
lines.append(f"{'='*60}")
lines.append(f"Query {i} (Score: {analysis.score}/100)")
lines.append(f" {analysis.query[:120]}{'...' if len(analysis.query) > 120 else ''}")
lines.append("")
if not analysis.issues:
lines.append(" No issues detected.")
for issue in analysis.issues:
icon = SEVERITY_ICONS.get(issue.severity, "")
lines.append(f" {icon} {issue.rule}: {issue.message}")
lines.append(f" -> {issue.suggestion}")
lines.append("")
return "\n".join(lines)
def format_json(analyses: List[QueryAnalysis]) -> str:
"""Format analysis results as JSON."""
return json.dumps(
{"analyses": [a.to_dict() for a in analyses], "total_queries": len(analyses)},
indent=2,
)
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(
description="Analyze SQL queries for common performance issues.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
%(prog)s --query "SELECT * FROM users"
%(prog)s --query queries.sql --dialect mysql
%(prog)s --query "DELETE FROM orders" --json
""",
)
parser.add_argument(
"--query", required=True,
help="SQL query string or path to a .sql file",
)
parser.add_argument(
"--dialect", choices=["postgres", "mysql", "sqlite", "sqlserver"],
default="postgres", help="SQL dialect (default: postgres)",
)
parser.add_argument(
"--json", action="store_true", dest="json_output",
help="Output results as JSON",
)
args = parser.parse_args()
# Determine if query is a file path or inline SQL
sql_text = args.query
if os.path.isfile(args.query):
with open(args.query, "r") as f:
sql_text = f.read()
queries = split_queries(sql_text)
if not queries:
# Treat the whole input as a single query
queries = [sql_text.strip()]
analyses = [analyze_query(q, args.dialect) for q in queries]
if args.json_output:
print(format_json(analyses))
else:
print(format_text(analyses))
if __name__ == "__main__":
main()
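
The `not-in-subquery` rule is easiest to appreciate by running it. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables and the function name `not_in_vs_not_exists` are made up for illustration. It compares `NOT IN` with `NOT EXISTS` when the subquery yields a NULL.

```python
import sqlite3


def not_in_vs_not_exists():
    """Compare NOT IN and NOT EXISTS when the subquery contains a NULL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY);
        CREATE TABLE orders (customer_id INTEGER);
        INSERT INTO customers (id) VALUES (1), (2), (3);
        INSERT INTO orders (customer_id) VALUES (2), (NULL);
    """)
    # Any comparison with NULL is unknown, so NOT IN never evaluates to
    # true for customers 1 and 3 and the query returns nothing.
    not_in = conn.execute(
        "SELECT id FROM customers "
        "WHERE id NOT IN (SELECT customer_id FROM orders) ORDER BY id"
    ).fetchall()
    # NOT EXISTS only asks whether a matching row exists, so the NULL
    # row in orders is simply never matched.
    not_exists = conn.execute(
        "SELECT id FROM customers c "
        "WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id) "
        "ORDER BY id"
    ).fetchall()
    conn.close()
    return not_in, not_exists


if __name__ == "__main__":
    not_in, not_exists = not_in_vs_not_exists()
    print("NOT IN:    ", not_in)
    print("NOT EXISTS:", not_exists)
```

With a NULL in `orders.customer_id`, the `NOT IN` query returns no rows, while `NOT EXISTS` returns customers 1 and 3.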


@@ -0,0 +1,315 @@
#!/usr/bin/env python3
"""
Schema Explorer
Generates schema documentation from database introspection queries.
Outputs the introspection SQL and sample documentation templates
for PostgreSQL, MySQL, SQLite, and SQL Server.
Since this tool runs without a live database connection, it generates:
1. The introspection queries you need to run
2. Documentation templates from the results
3. Sample schema docs for common table patterns
Usage:
python schema_explorer.py --dialect postgres --tables all --format md
python schema_explorer.py --dialect mysql --tables users,orders --format json
python schema_explorer.py --dialect sqlite --tables all --json
"""
import argparse
import json
import sys
import textwrap
from dataclasses import dataclass, asdict
from typing import List, Optional, Dict
# ---------------------------------------------------------------------------
# Introspection query templates per dialect
# ---------------------------------------------------------------------------
INTROSPECTION_QUERIES: Dict[str, Dict[str, str]] = {
"postgres": {
"tables": textwrap.dedent("""\
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public' AND table_type = 'BASE TABLE'
ORDER BY table_name;"""),
"columns": textwrap.dedent("""\
SELECT table_name, column_name, data_type, character_maximum_length,
is_nullable, column_default
FROM information_schema.columns
WHERE table_schema = 'public' {table_filter}
ORDER BY table_name, ordinal_position;"""),
"primary_keys": textwrap.dedent("""\
SELECT tc.table_name, kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
WHERE tc.constraint_type = 'PRIMARY KEY' AND tc.table_schema = 'public'
ORDER BY tc.table_name;"""),
"foreign_keys": textwrap.dedent("""\
SELECT tc.table_name, kcu.column_name,
ccu.table_name AS foreign_table, ccu.column_name AS foreign_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu
ON tc.constraint_name = ccu.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
ORDER BY tc.table_name;"""),
"indexes": textwrap.dedent("""\
SELECT schemaname, tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'public'
ORDER BY tablename, indexname;"""),
"table_sizes": textwrap.dedent("""\
SELECT relname AS table_name,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
pg_size_pretty(pg_relation_size(relid)) AS data_size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) AS index_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;"""),
},
"mysql": {
"tables": textwrap.dedent("""\
SELECT table_name
FROM information_schema.tables
WHERE table_schema = DATABASE() AND table_type = 'BASE TABLE'
ORDER BY table_name;"""),
"columns": textwrap.dedent("""\
SELECT table_name, column_name, column_type, is_nullable,
column_default, column_key, extra
FROM information_schema.columns
WHERE table_schema = DATABASE() {table_filter}
ORDER BY table_name, ordinal_position;"""),
"foreign_keys": textwrap.dedent("""\
SELECT table_name, column_name, referenced_table_name, referenced_column_name
FROM information_schema.key_column_usage
WHERE table_schema = DATABASE() AND referenced_table_name IS NOT NULL
ORDER BY table_name;"""),
"indexes": textwrap.dedent("""\
SELECT table_name, index_name, non_unique, column_name, seq_in_index
FROM information_schema.statistics
WHERE table_schema = DATABASE()
ORDER BY table_name, index_name, seq_in_index;"""),
"table_sizes": textwrap.dedent("""\
SELECT table_name, table_rows,
ROUND(data_length / 1024 / 1024, 2) AS data_mb,
ROUND(index_length / 1024 / 1024, 2) AS index_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY data_length DESC;"""),
},
"sqlite": {
"tables": textwrap.dedent("""\
SELECT name FROM sqlite_master
WHERE type = 'table' AND name NOT LIKE 'sqlite_%'
ORDER BY name;"""),
"columns": textwrap.dedent("""\
-- Run for each table:
PRAGMA table_info({table_name});"""),
"foreign_keys": textwrap.dedent("""\
-- Run for each table:
PRAGMA foreign_key_list({table_name});"""),
"indexes": textwrap.dedent("""\
SELECT name, tbl_name, sql FROM sqlite_master
WHERE type = 'index'
ORDER BY tbl_name, name;"""),
"schema_dump": textwrap.dedent("""\
SELECT name, sql FROM sqlite_master
WHERE type = 'table'
ORDER BY name;"""),
},
"sqlserver": {
"tables": textwrap.dedent("""\
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_NAME;"""),
"columns": textwrap.dedent("""\
SELECT t.name AS table_name, c.name AS column_name,
ty.name AS data_type, c.max_length, c.precision, c.scale,
c.is_nullable, dc.definition AS default_value
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
JOIN sys.types ty ON c.user_type_id = ty.user_type_id
LEFT JOIN sys.default_constraints dc ON c.default_object_id = dc.object_id
{table_filter}
ORDER BY t.name, c.column_id;"""),
"foreign_keys": textwrap.dedent("""\
SELECT fk.name AS fk_name,
tp.name AS parent_table, cp.name AS parent_column,
tr.name AS referenced_table, cr.name AS referenced_column
FROM sys.foreign_keys fk
JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
JOIN sys.tables tp ON fkc.parent_object_id = tp.object_id
JOIN sys.columns cp ON fkc.parent_object_id = cp.object_id AND fkc.parent_column_id = cp.column_id
JOIN sys.tables tr ON fkc.referenced_object_id = tr.object_id
JOIN sys.columns cr ON fkc.referenced_object_id = cr.object_id AND fkc.referenced_column_id = cr.column_id
ORDER BY tp.name;"""),
"indexes": textwrap.dedent("""\
SELECT t.name AS table_name, i.name AS index_name,
i.type_desc, i.is_unique, c.name AS column_name,
ic.key_ordinal
FROM sys.indexes i
JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
JOIN sys.tables t ON i.object_id = t.object_id
WHERE i.name IS NOT NULL
ORDER BY t.name, i.name, ic.key_ordinal;"""),
},
}
# ---------------------------------------------------------------------------
# Documentation generators
# ---------------------------------------------------------------------------
SAMPLE_TABLES = {
"users": {
"columns": [
{"name": "id", "type": "SERIAL / INT", "nullable": "NO", "default": "auto", "notes": "Primary key"},
{"name": "email", "type": "VARCHAR(255)", "nullable": "NO", "default": "-", "notes": "Unique, indexed"},
{"name": "name", "type": "VARCHAR(255)", "nullable": "YES", "default": "NULL", "notes": "Display name"},
{"name": "password_hash", "type": "VARCHAR(255)", "nullable": "NO", "default": "-", "notes": "bcrypt hash"},
{"name": "created_at", "type": "TIMESTAMP", "nullable": "NO", "default": "NOW()", "notes": ""},
{"name": "updated_at", "type": "TIMESTAMP", "nullable": "NO", "default": "NOW()", "notes": ""},
],
"indexes": ["PRIMARY KEY (id)", "UNIQUE INDEX (email)"],
"foreign_keys": [],
},
"orders": {
"columns": [
{"name": "id", "type": "SERIAL / INT", "nullable": "NO", "default": "auto", "notes": "Primary key"},
{"name": "user_id", "type": "INTEGER", "nullable": "NO", "default": "-", "notes": "FK -> users.id"},
{"name": "status", "type": "VARCHAR(50)", "nullable": "NO", "default": "'pending'", "notes": "pending/paid/shipped/cancelled"},
{"name": "total", "type": "DECIMAL(19,4)", "nullable": "NO", "default": "0", "notes": "Order total"},
{"name": "created_at", "type": "TIMESTAMP", "nullable": "NO", "default": "NOW()", "notes": ""},
],
"indexes": ["PRIMARY KEY (id)", "INDEX (user_id)", "INDEX (status, created_at)"],
"foreign_keys": ["user_id -> users.id ON DELETE CASCADE"],
},
}
def generate_md(dialect: str, tables: List[str]) -> str:
    """Generate markdown schema documentation."""
    lines = [f"# Database Schema Documentation ({dialect.upper()})\n"]
    lines.append("Generated by sql-database-assistant schema_explorer.\n")
    # Introspection queries section
    lines.append("## Introspection Queries\n")
    lines.append("Run these queries against your database to extract schema information:\n")
    queries = INTROSPECTION_QUERIES.get(dialect, {})
    table_filter = ""
    if "all" not in tables:
        tlist = ", ".join(f"'{t}'" for t in tables)
        # The SQL Server columns query has no WHERE clause and exposes the
        # table name as t.name, so it needs its own filter form.
        if dialect == "sqlserver":
            table_filter = f"WHERE t.name IN ({tlist})"
        else:
            table_filter = f"AND table_name IN ({tlist})"
    table_name = tables[0] if tables and tables[0] != "all" else "TABLE_NAME"
    for qname, qsql in queries.items():
        qsql = qsql.replace("{table_filter}", table_filter)
        qsql = qsql.replace("{table_name}", table_name)
        lines.append(f"### {qname.replace('_', ' ').title()}\n")
        lines.append(f"```sql\n{qsql}\n```\n")
    # Sample documentation
    lines.append("## Sample Table Documentation\n")
    lines.append("Below is an example of the documentation format produced from query results:\n")
    show_tables = tables if "all" not in tables else list(SAMPLE_TABLES.keys())
    for tname in show_tables:
        sample = SAMPLE_TABLES.get(tname)
        if not sample:
            lines.append(f"### {tname}\n")
            lines.append("_No sample data available. Run introspection queries above._\n")
            continue
        lines.append(f"### {tname}\n")
        lines.append("| Column | Type | Nullable | Default | Notes |")
        lines.append("|--------|------|----------|---------|-------|")
        for col in sample["columns"]:
            lines.append(f"| {col['name']} | {col['type']} | {col['nullable']} | {col['default']} | {col['notes']} |")
        lines.append("")
        if sample["indexes"]:
            lines.append("**Indexes:** " + ", ".join(sample["indexes"]))
        if sample["foreign_keys"]:
            lines.append("**Foreign Keys:** " + ", ".join(sample["foreign_keys"]))
        lines.append("")
    return "\n".join(lines)
def generate_json_output(dialect: str, tables: List[str]) -> dict:
    """Generate JSON schema documentation."""
    queries = INTROSPECTION_QUERIES.get(dialect, {})
    table_filter = ""
    if "all" not in tables:
        tlist = ", ".join(f"'{t}'" for t in tables)
        # The SQL Server columns query has no WHERE clause and exposes the
        # table name as t.name, so it needs its own filter form.
        if dialect == "sqlserver":
            table_filter = f"WHERE t.name IN ({tlist})"
        else:
            table_filter = f"AND table_name IN ({tlist})"
    table_name = tables[0] if tables and tables[0] != "all" else "TABLE_NAME"
    processed = {}
    for qname, qsql in queries.items():
        processed[qname] = qsql.replace("{table_filter}", table_filter).replace(
            "{table_name}", table_name
        )
    show_tables = tables if "all" not in tables else list(SAMPLE_TABLES.keys())
    sample_docs = {}
    for tname in show_tables:
        sample = SAMPLE_TABLES.get(tname)
        if sample:
            sample_docs[tname] = sample
    return {
        "dialect": dialect,
        "requested_tables": tables,
        "introspection_queries": processed,
        "sample_documentation": sample_docs,
        "instructions": "Run the introspection queries against your database, then use the results to populate documentation in the sample format shown.",
    }
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(
description="Generate schema documentation from database introspection.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
%(prog)s --dialect postgres --tables all --format md
%(prog)s --dialect mysql --tables users,orders --format json
%(prog)s --dialect sqlite --tables all --json
""",
)
parser.add_argument(
"--dialect", required=True, choices=["postgres", "mysql", "sqlite", "sqlserver"],
help="Target database dialect",
)
parser.add_argument(
"--tables", default="all",
help="Comma-separated table names or 'all' (default: all)",
)
parser.add_argument(
"--format", choices=["md", "json"], default="md", dest="fmt",
help="Output format (default: md)",
)
parser.add_argument(
"--json", action="store_true", dest="json_output",
help="Output as JSON (overrides --format)",
)
args = parser.parse_args()
tables = [t.strip() for t in args.tables.split(",")]
if args.json_output or args.fmt == "json":
result = generate_json_output(args.dialect, tables)
print(json.dumps(result, indent=2))
else:
print(generate_md(args.dialect, tables))
if __name__ == "__main__":
main()
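
Because the script only emits queries, turning their results into documentation is left to the user. As a rough sketch of that second step for SQLite, the code below runs the table listing query and `PRAGMA table_info` against a live connection and renders the same table layout as `generate_md`. The function names `list_sqlite_tables` and `describe_sqlite_table` are hypothetical, and the Nullable column simply reflects the `notnull` flag that the PRAGMA reports.

```python
import sqlite3
from typing import List


def list_sqlite_tables(conn: sqlite3.Connection) -> List[str]:
    """Return user table names, skipping SQLite's internal tables."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]


def describe_sqlite_table(conn: sqlite3.Connection, table: str) -> str:
    """Render one table's columns as a markdown table (hypothetical helper)."""
    # PRAGMA arguments cannot be bound as parameters, so the name is
    # interpolated; only pass names that came from sqlite_master.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    lines = [
        f"### {table}\n",
        "| Column | Type | Nullable | Default | Notes |",
        "|--------|------|----------|---------|-------|",
    ]
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    for _cid, name, col_type, notnull, default, pk in rows:
        nullable = "NO" if notnull else "YES"
        shown_default = "-" if default is None else default
        notes = "Primary key" if pk else ""
        lines.append(f"| {name} | {col_type} | {nullable} | {shown_default} | {notes} |")
    return "\n".join(lines)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    for t in list_sqlite_tables(conn):
        print(describe_sqlite_table(conn, t))
    conn.close()
```

The other dialects would follow the same shape, with a database driver of your choice executing the corresponding queries from `INTROSPECTION_QUERIES`.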