Zero-to-Deployment Checklist
A fresh instance with an empty data disk exposes every implicit dependency that production silently relies on. This checklist covers everything that must be explicitly created before services will start.
Pre-flight: cloud-init must handle
These run at OS boot, before Terraform provisioners:
- Mount data disk: format if new (`blkid` check), mount to `/data`, add to fstab
- Create service directories: `mkdir -p /data/{service1,service2,...}`; file provisioners fail if the target dir doesn't exist
- Install Docker + Compose: curl installer, enable the systemd service
- Configure swap: `fallocate` on the data disk (NOT the system disk)
- SSH hardening: key-only auth, no password root login
- Firewall: UFW + DOCKER-USER iptables chain
- Debconf preseed: for any package with interactive prompts (iptables-persistent, etc.)
- Signal readiness: write a timestamp to `/data/cloud-init.log`
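The disk-mount and readiness steps above can be sketched as a cloud-init user-data fragment. This is a sketch, not the original config: the device name `/dev/sdb`, the label, and the service directory names are assumptions.

```yaml
#cloud-config
# Sketch of the pre-flight steps; /dev/sdb and the dir names are assumptions.
runcmd:
  # Format only if blkid finds no filesystem on the device (fresh disk)
  - 'blkid /dev/sdb || mkfs.ext4 -L data /dev/sdb'
  - 'mkdir -p /data'
  # Idempotent fstab entry; nofail keeps boot alive if the disk is absent
  - 'grep -q " /data " /etc/fstab || echo "LABEL=data /data ext4 defaults,nofail 0 2" >> /etc/fstab'
  - 'mountpoint -q /data || mount /data'
  - 'mkdir -p /data/service1 /data/service2'
  # Signal readiness for Terraform provisioners to poll
  - 'date -u > /data/cloud-init.log'
```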
Provisioner ordering
Terraform provisioners execute in declaration order within a resource, but resources execute in parallel unless `depends_on` is set.
```
lobehub_deploy ──────────────────→ channel_sync (depends_on lobehub)
                                 → casdoor_sync (depends_on lobehub)
                                 → minio_sync   (depends_on lobehub)

claude4dev_deploy (depends_on lobehub_deploy)
  ├─ wait for cloud-init
  ├─ upload source (tarball via file provisioner)
  ├─ upload .env (staging variant)
  ├─ start stateful (postgres, redis) --no-recreate
  ├─ run DB migrations
  ├─ build stateless images
  ├─ fix volume permissions
  ├─ start stateless (relay, api, frontend, gateway)
  └─ verify health
```
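The "wait for cloud-init" step in the pipeline above can be sketched as a polling loop. The marker path and timeout are assumptions; cloud-init writes the marker as its final step (see the pre-flight checklist).

```shell
# Poll for the cloud-init readiness marker before running provisioner steps.
wait_for_cloud_init() {
  marker=$1
  timeout=${2:-300}   # seconds; assumption, tune per image
  waited=0
  until [ -f "$marker" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timeout after ${timeout}s waiting for $marker" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "cloud-init ready: $(cat "$marker")"
}
```

Usage from a `remote-exec` provisioner: `wait_for_cloud_init /data/cloud-init.log 600 || exit 1`.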
Database bootstrap
PostgreSQL databases
PostgreSQL's `docker-entrypoint-initdb.d` scripts run only when the data directory is empty (the first-ever start). On subsequent starts, init scripts are skipped, even if a database doesn't exist yet.
Fix: explicitly create databases in the provisioner:
```shell
# Wait for postgres to become healthy
sleep 10
# Create database if missing (idempotent)
docker exec my-postgres psql -U postgres -tc \
  "SELECT 1 FROM pg_database WHERE datname='mydb'" | grep -q 1 \
  || docker exec my-postgres psql -U postgres -c "CREATE DATABASE mydb;"
```
Schema migrations
Migrations must be idempotent. Track applied versions:
```shell
PSQL='docker compose exec -T postgres psql -v ON_ERROR_STOP=1 -U myuser -d mydb'

# Create tracking table
$PSQL -tAc "CREATE TABLE IF NOT EXISTS schema_migrations (
  version TEXT PRIMARY KEY,
  applied_at TIMESTAMPTZ DEFAULT now()
)"

# Apply each migration file in order
for f in migrations/*.sql; do
  VER=$(basename "$f")
  APPLIED=$($PSQL -tAc "SELECT 1 FROM schema_migrations WHERE version='$VER'" | tr -d ' ')
  if [ "$APPLIED" = "1" ]; then
    echo "Skip: $VER"
  else
    echo "Apply: $VER"
    { echo 'BEGIN;'; cat "$f"; echo 'COMMIT;'; } | $PSQL
    $PSQL -tAc "INSERT INTO schema_migrations(version) VALUES ('$VER') ON CONFLICT DO NOTHING"
  fi
done
```
Docker build on remote
Proxy mode
Docker Compose reads build args from `.env` via `${VAR:-default}` interpolation. Command-line env vars do NOT override `.env` values for compose interpolation.
```shell
# WRONG: compose still reads DOCKER_WITH_PROXY_MODE from .env
DOCKER_WITH_PROXY_MODE=disabled docker compose build myapp

# RIGHT: modify .env so compose reads the correct value
grep -q DOCKER_WITH_PROXY_MODE .env || echo 'DOCKER_WITH_PROXY_MODE=disabled' >> .env
docker compose build myapp
```
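The `grep || echo` pattern only appends when the variable is absent; if it already exists with the wrong value, nothing changes. A small helper that forces the value either way (a sketch; GNU sed is assumed for `-i`):

```shell
# Set KEY=VALUE in an env file: replace the line if present, append if not.
set_env_var() {
  file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file" 2>/dev/null; then
    # Overwrite the existing assignment in place (GNU sed -i)
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    echo "${key}=${value}" >> "$file"
  fi
}
```

Usage: `set_env_var .env DOCKER_WITH_PROXY_MODE disabled`.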
Memory management
Building Docker images while 10+ containers run can OOM on small instances (8GB). Strategy:
```shell
# Stop non-critical containers to free RAM
cd /data/other-project && docker compose stop search-engine analytics-db || true

# Build (memory-intensive)
cd /data/myproject && docker compose build myapp

# Restart stopped containers
cd /data/other-project && docker compose up -d search-engine analytics-db || true
```
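The stop/build/restart dance can be gated on actual memory pressure instead of running unconditionally. A sketch, reading `MemAvailable` from `/proc/meminfo` (Linux only; the 2048 MiB threshold in the usage line is an assumption):

```shell
# Report available memory in MiB (Linux /proc/meminfo).
mem_available_mib() {
  awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo
}

# Succeed only if at least $1 MiB of memory is available.
require_mem() {
  need_mib=$1
  avail=$(mem_available_mib)
  if [ "$avail" -lt "$need_mib" ]; then
    echo "only ${avail} MiB available, need ${need_mib} MiB" >&2
    return 1
  fi
}
```

Usage: `require_mem 2048 || docker compose stop search-engine analytics-db` (only stop neighbors when the build would actually be starved).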
Volume permissions
Containers running as non-root need writable volume directories:
```shell
# Before docker compose up:
mkdir -p data-dir logs-dir
chown -R 1001:1001 data-dir logs-dir   # match the container's UID
```
Find the UID from the Dockerfile:
```dockerfile
RUN adduser -S myuser -u 1001 -G mygroup
USER myuser   # runs as uid 1001
```
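When the Dockerfile isn't at hand, the runtime UID can be queried from the image itself. A sketch; `myimage` and the directory names are placeholders:

```shell
# Create volume dirs and chown them to the UID the image actually runs as.
prepare_volumes() {
  image=$1; shift
  # Run `id -u` inside the container to print its runtime UID (e.g. 1001)
  uid=$(docker run --rm --entrypoint id "$image" -u) || return 1
  for dir in "$@"; do
    mkdir -p "$dir"
    chown -R "$uid:$uid" "$dir"
  done
}
```

Usage: `prepare_volumes myimage data-dir logs-dir`.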
Environment-specific .env files
Production `.env` contains production URLs. Staging needs its own `.env` with:
| Variable | Production | Staging |
|---|---|---|
| `FRONTEND_URL` | `https://myapp.com` | `https://staging.myapp.com` |
| `CORS_ORIGIN` | `https://myapp.com` | `https://staging.myapp.com` |
| `NEW_API_URL` | `http://api-container:3000` | Same (internal Docker network) |
| `DOCKER_WITH_PROXY_MODE` | `required` (if behind proxy) | `disabled` (direct internet) |
Pattern: create `.env.staging` alongside `.env`. In Terraform:
```hcl
locals {
  env_src = "${local.repo}/.env.staging"   # staging-specific
}

provisioner "file" {
  source      = local.env_src
  destination = "${local.deploy_dir}/.env"
}
```
Rsync must exclude `.env` files (otherwise the production `.env` overwrites the staging one):

```
--exclude=.env --exclude='.env.*'
```
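In context, the upload step might look like the helper below. A sketch: the flag set and the idea of never shipping `.env` are from the notes above; the function name and paths are placeholders.

```shell
# Sync source to the deploy target, never overwriting remote .env files.
sync_source() {
  src=$1 dest=$2
  rsync -az --delete \
    --exclude=.env --exclude='.env.*' \
    "$src/" "$dest/"
}
```

Usage: `sync_source ./myproject deploy@staging-host:/data/myproject` (rsync also accepts a local destination, which makes the exclusion easy to verify offline).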
Verification template
After all services start, verify in the provisioner (not via ad-hoc SSH):
```shell
sleep 20
echo '=== Service logs ==='
docker logs my-critical-service --tail 20 2>&1 || true

echo '=== All containers ==='
docker ps --format 'table {{.Names}}\t{{.Status}}' 2>&1 || true

# Final gate (the only line that may fail).
# Match the literal "(healthy)": a bare "healthy" would also match "unhealthy".
docker ps --filter name=my-critical-service --format '{{.Status}}' | grep -q '(healthy)' \
  || { echo 'FATAL: service unhealthy'; exit 1; }
```