- pdf-creator v1.2.0: theme system (default/warm-terra), dual backend (weasyprint/chrome auto-detect), argparse CLI, extracted CSS to themes/ - terraform-skill: operational traps from real deployments (provisioner timing, DNS duplication, multi-env isolation, pre-deploy validation) - asr-transcribe-to-text: add security scan marker Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
169 lines
5.6 KiB
Markdown
169 lines
5.6 KiB
Markdown
# Zero-to-Deployment Checklist
|
|
|
|
A fresh instance with an empty data disk exposes every implicit dependency that production silently relies on. This checklist covers everything that must be explicitly created before services will start.
|
|
|
|
## Pre-flight: cloud-init must handle
|
|
|
|
These run at OS boot, before Terraform provisioners:
|
|
|
|
- [ ] **Mount data disk**: Format if new (`blkid` check), mount to `/data`, add to fstab
|
|
- [ ] **Create service directories**: `mkdir -p /data/{service1,service2,...}` — file provisioners fail if target dir doesn't exist
|
|
- [ ] **Install Docker + Compose**: Curl installer, enable systemd service
|
|
- [ ] **Configure swap**: `fallocate` on data disk (NOT system disk)
|
|
- [ ] **SSH hardening**: key-only auth, no password root login
|
|
- [ ] **Firewall**: UFW + DOCKER-USER iptables chain
|
|
- [ ] **Debconf preseed**: For any package with interactive prompts (iptables-persistent, etc.)
|
|
- [ ] **Signal readiness**: Write timestamp to `/data/cloud-init.log`
|
|
|
|
## Provisioner ordering
|
|
|
|
Terraform provisioners execute in declaration order within a resource, but resources execute in parallel unless `depends_on` is set.
|
|
|
|
```
|
|
lobehub_deploy ──────────────────→ channel_sync (depends_on lobehub)
|
|
→ casdoor_sync (depends_on lobehub)
|
|
→ minio_sync (depends_on lobehub)
|
|
|
|
claude4dev_deploy (depends_on lobehub_deploy)
|
|
├─ wait for cloud-init
|
|
├─ upload source (tarball via file provisioner)
|
|
├─ upload .env (staging variant)
|
|
├─ start stateful (postgres, redis) --no-recreate
|
|
├─ run DB migrations
|
|
├─ build stateless images
|
|
├─ fix volume permissions
|
|
├─ start stateless (relay, api, frontend, gateway)
|
|
└─ verify health
|
|
```
|
|
|
|
## Database bootstrap
|
|
|
|
### PostgreSQL databases
|
|
|
|
PostgreSQL `docker-entrypoint-initdb.d` scripts only run when the data directory is empty (first-ever start). On subsequent starts — even if a database doesn't exist — init scripts are skipped.
|
|
|
|
**Fix**: Explicitly create databases in provisioner:
|
|
```bash
|
|
# Wait for postgres healthy
|
|
sleep 10
|
|
# Create database if missing (idempotent)
|
|
docker exec my-postgres psql -U postgres -tc \
|
|
"SELECT 1 FROM pg_database WHERE datname='mydb'" | grep -q 1 \
|
|
|| docker exec my-postgres psql -U postgres -c "CREATE DATABASE mydb;"
|
|
```
|
|
|
|
### Schema migrations
|
|
|
|
Migrations must be idempotent. Track applied versions:
|
|
```bash
|
|
PSQL='docker compose exec -T postgres psql -v ON_ERROR_STOP=1 -U myuser -d mydb'
|
|
|
|
# Create tracking table
|
|
$PSQL -tAc "CREATE TABLE IF NOT EXISTS schema_migrations (
|
|
version TEXT PRIMARY KEY,
|
|
applied_at TIMESTAMPTZ DEFAULT now()
|
|
)"
|
|
|
|
# Apply each migration file in order
|
|
for f in migrations/*.sql; do
|
|
VER=$(basename $f)
|
|
APPLIED=$($PSQL -tAc "SELECT 1 FROM schema_migrations WHERE version='$VER'" | tr -d ' ')
|
|
if [ "$APPLIED" = "1" ]; then
|
|
echo "Skip: $VER"
|
|
else
|
|
echo "Apply: $VER"
|
|
{ echo 'BEGIN;'; cat $f; echo 'COMMIT;'; } | $PSQL
|
|
$PSQL -tAc "INSERT INTO schema_migrations(version) VALUES ('$VER') ON CONFLICT DO NOTHING"
|
|
fi
|
|
done
|
|
```
|
|
|
|
## Docker build on remote
|
|
|
|
### Proxy mode
|
|
|
|
Docker Compose reads build args from `.env` via `${VAR:-default}`. Command-line env vars do NOT override `.env` values for compose interpolation.
|
|
|
|
```bash
|
|
# WRONG: compose still reads DOCKER_WITH_PROXY_MODE from .env
|
|
DOCKER_WITH_PROXY_MODE=disabled docker compose build myapp
|
|
|
|
# RIGHT: modify .env so compose reads the correct value
|
|
grep -q DOCKER_WITH_PROXY_MODE .env || echo 'DOCKER_WITH_PROXY_MODE=disabled' >> .env
|
|
docker compose build myapp
|
|
```
|
|
|
|
### Memory management
|
|
|
|
Building Docker images while 10+ containers run can OOM on small instances (8GB). Strategy:
|
|
|
|
```bash
|
|
# Stop non-critical containers to free RAM
|
|
cd /data/other-project && docker compose stop search-engine analytics-db || true
|
|
|
|
# Build (memory-intensive)
|
|
cd /data/myproject && docker compose build myapp
|
|
|
|
# Restart stopped containers
|
|
cd /data/other-project && docker compose up -d search-engine analytics-db || true
|
|
```
|
|
|
|
## Volume permissions
|
|
|
|
Containers running as non-root need writable volume directories:
|
|
|
|
```bash
|
|
# Before docker compose up:
|
|
mkdir -p data-dir logs-dir
|
|
chown -R 1001:1001 data-dir logs-dir # match container UID
|
|
```
|
|
|
|
Find the UID from the Dockerfile:
|
|
```dockerfile
|
|
RUN adduser -S myuser -u 1001 -G mygroup
|
|
USER myuser # runs as uid 1001
|
|
```
|
|
|
|
## Environment-specific .env files
|
|
|
|
Production `.env` contains production URLs. Staging needs its own `.env` with:
|
|
|
|
| Variable | Production | Staging |
|
|
|---|---|---|
|
|
| `FRONTEND_URL` | `https://myapp.com` | `https://staging.myapp.com` |
|
|
| `CORS_ORIGIN` | `https://myapp.com` | `https://staging.myapp.com` |
|
|
| `NEW_API_URL` | `http://api-container:3000` | Same (internal Docker network) |
|
|
| `DOCKER_WITH_PROXY_MODE` | `required` (if behind proxy) | `disabled` (direct internet) |
|
|
|
|
**Pattern**: Create `.env.staging` alongside `.env`. In Terraform:
|
|
```hcl
|
|
locals {
|
|
env_src = "${local.repo}/.env.staging" # staging-specific
|
|
}
|
|
|
|
provisioner "file" {
|
|
source = local.env_src
|
|
destination = "${local.deploy_dir}/.env"
|
|
}
|
|
```
|
|
|
|
Rsync must exclude `.env` files (otherwise production .env overwrites staging .env):
|
|
```
|
|
--exclude=.env --exclude='.env.*'
|
|
```
|
|
|
|
## Verification template
|
|
|
|
After all services start, verify in the provisioner (not ad-hoc SSH):
|
|
|
|
```bash
|
|
sleep 20
|
|
echo '=== Service logs ==='
|
|
docker logs my-critical-service --tail 20 2>&1 || true
|
|
echo '=== All containers ==='
|
|
docker ps --format 'table {{.Names}}\t{{.Status}}' 2>&1 || true
|
|
# Final gate (only line that can fail)
|
|
docker ps --filter name=my-critical-service --format '{{.Status}}' | grep -q healthy \
|
|
|| { echo 'FATAL: service unhealthy'; exit 1; }
|
|
```
|