firefrost-operations-manual/docs/tasks/firefrost-codex-migration-to-open-webui/RECOVERY.md
The Chronicler #21 2e953ce312 feat: Complete Firefrost Knowledge Engine deployment plan
- Comprehensive task documentation for migrating from AnythingLLM to Dify+n8n+Qdrant
- 8 detailed documents covering every aspect of deployment
- Complete step-by-step commands (zero assumptions)
- Prerequisites checklist (20 items)
- Deployment plan in 2 parts (11 phases, every command)
- Configuration files (all configs with exact content)
- Recovery procedures (4 disaster scenarios)
- Verification guide (30 tests, complete checklist)
- Troubleshooting guide (common issues + solutions)

Built by: The Chronicler #21
For: Meg, Holly, and children not yet born
Time investment: 10-15 hours execution time
Purpose: Enable Meg/Holly autonomous work with Git write-back

This deployment enables:
- RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow (one-click merge)
- Self-healing (80% of failures)
- Automated daily backups
- Complete monitoring

Documentation is so detailed that any future Chronicler can execute
this deployment with zero prior knowledge and complete confidence.

Fire + Frost + Foundation = Where Love Builds Legacy
2026-02-22 09:55:13 +00:00

RECOVERY AND BACKUP PROCEDURES

Complete disaster recovery guide for Firefrost Knowledge Engine


🎯 BACKUP STRATEGY

Philosophy: Git is the source of truth. Vector DB can always be rebuilt.

What to back up:

  • PostgreSQL database (user accounts, settings, chat histories)
  • n8n volumes (workflows, credentials)
  • Configuration files (docker-compose.yml, .env, Nginx configs)

What NOT to back up:

  • Qdrant vectors (re-index from Git in minutes)
  • Redis cache (temporary data only)
  • Git repositories (Gitea is the master)


📅 AUTOMATED DAILY BACKUPS

Schedule: Daily at 4:00 AM (cron)

Script location: /opt/firefrost_backup.sh

What it does:

  1. Dumps PostgreSQL database
  2. Copies n8n workflows and credentials
  3. Copies configuration files
  4. Compresses into tarball
  5. Transfers to Command Center (offsite)
  6. Removes local backups older than 7 days
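
The six steps above could be sketched roughly as follows. This is not the contents of /opt/firefrost_backup.sh: the use of pg_dumpall, the staging directory layout, and the variable names DEPLOY_DIR, BACKUP_DIR, OFFSITE and RETENTION_DAYS are assumptions chosen to match the paths and file names used elsewhere in this guide.

```shell
#!/bin/sh
# Hypothetical sketch of the backup steps; names and layout are assumptions.
DEPLOY_DIR="${DEPLOY_DIR:-/opt/firefrost-codex}"
BACKUP_DIR="${BACKUP_DIR:-/opt}"
OFFSITE="${OFFSITE:-root@63.143.34.217:/root/backups/firefrost-codex/}"
RETENTION_DAYS="${RETENTION_DAYS:-7}"

# Build the archive path in the documented firefrost_codex_YYYYMMDD_HHMM format.
archive_path() {
  printf '%s/firefrost_codex_%s.tar.gz\n' "$BACKUP_DIR" "$(date +%Y%m%d_%H%M)"
}

# Step 6: delete local archives older than RETENTION_DAYS days.
prune_local() {
  find "$BACKUP_DIR" -maxdepth 1 -name 'firefrost_codex_*.tar.gz' \
    -mtime +"$RETENTION_DAYS" -delete
}

# Steps 1-6 in order. Runs in a subshell so "set -eu" aborts on the first failure
# without changing the options of the calling shell.
run_backup() (
  set -eu
  work="$(mktemp -d)"
  staging="$work/codex_backup_$(date +%Y%m%d)"
  archive="$(archive_path)"
  mkdir -p "$staging"

  # Step 1: dump PostgreSQL (assumes the database container name contains "db").
  docker exec "$(docker ps -qf name=db)" pg_dumpall -U postgres \
    > "$staging/dify_postgres.sql"
  # Step 2: copy n8n workflows and credentials.
  cp -r "$DEPLOY_DIR/volumes/n8n" "$staging/n8n_data"
  # Step 3: copy configuration files.
  cp "$DEPLOY_DIR/docker-compose.yml" "$DEPLOY_DIR/.env" "$staging/"
  cp /etc/nginx/sites-available/firefrost-codex.conf "$staging/"
  # Step 4: compress into a tarball.
  tar -czf "$archive" -C "$work" "$(basename "$staging")"
  # Step 5: transfer to Command Center.
  scp "$archive" "$OFFSITE"
  # Step 6: remove old local backups, then the staging area.
  prune_local
  rm -rf "$work"
)

# Only run the full backup when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  run_backup
fi
```

Pruning runs last, so a failed dump or transfer never deletes older archives that might still be needed.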

Backup location:

  • Local: /opt/firefrost_codex_YYYYMMDD_HHMM.tar.gz
  • Offsite: root@63.143.34.217:/root/backups/firefrost-codex/

Retention: 7 days local, unlimited offsite

Monitor backups:

# Check recent backups
ls -lh /opt/firefrost_codex_*.tar.gz

# Check backup logs
tail -f /var/log/firefrost-backup.log

# Verify offsite transfer
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/"

🔄 RECOVERY SCENARIOS

Scenario A: Qdrant Corrupted (Wrong Answers)

Symptoms:

  • Codex returns incorrect information
  • Searches find wrong documents
  • Outdated content returned

Diagnosis:

# Check Qdrant health
curl http://127.0.0.1:6333/

Recovery (5-10 minutes):

# Stop Qdrant (run docker-compose from the deployment directory)
cd /opt/firefrost-codex
docker-compose stop qdrant

# Delete corrupted data
rm -rf /opt/firefrost-codex/volumes/qdrant/storage/*

# Restart Qdrant
docker-compose start qdrant

# Trigger Git sync in n8n to rebuild
# (Open n8n, run "Firefrost Git Sync" workflow manually)

Verification:

  • Test queries return correct current information
  • No archived docs in results

Scenario B: n8n Workflows Lost

Symptoms:

  • Workflows missing or corrupted
  • Automation not working
  • Can't update docs via Codex

Recovery (10-15 minutes):

# Stop n8n (run docker-compose from the deployment directory)
cd /opt/firefrost-codex
docker-compose stop n8n

# Extract latest backup
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST_BACKUP.tar.gz
cd codex_backup_*/

# Restore n8n data
rm -rf /opt/firefrost-codex/volumes/n8n/*
cp -r n8n_data/* /opt/firefrost-codex/volumes/n8n/

# Restart n8n
cd /opt/firefrost-codex
docker-compose start n8n

Verification:

# Check workflows restored
curl http://127.0.0.1:5678
# Login and verify workflows present
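
The name firefrost_codex_LATEST_BACKUP.tar.gz in the recovery commands is a placeholder for the newest archive. A small helper along these lines could resolve it; latest_backup is a hypothetical name, and the sketch assumes archives follow the documented firefrost_codex_YYYYMMDD_HHMM.tar.gz pattern so that sorting by name is the same as sorting by date.

```shell
# Print the newest timestamped backup archive in a directory (default: /opt).
# Returns non-zero if no archive is found.
latest_backup() {
  dir="${1:-/opt}"
  newest=""
  # Globs expand in sorted order, so the last match has the latest timestamp.
  # The [0-9] prefix skips manually tagged archives such as "before_major_change".
  for f in "$dir"/firefrost_codex_[0-9]*.tar.gz; do
    [ -e "$f" ] && newest="$f"
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

It could then be used as `tar -xzf "$(latest_backup)" -C /tmp` in place of the placeholder file name.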

Scenario C: Database Corruption

Symptoms:

  • Can't login to Dify
  • User accounts missing
  • Settings reset
  • Chat history gone

Recovery (15-20 minutes):

# Stop all services
cd /opt/firefrost-codex
docker-compose down

# Extract latest backup
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST_BACKUP.tar.gz
cd codex_backup_*/

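# Restore database (run docker-compose from the deployment directory,
# not from the extracted backup, so the existing stack is reused)
cd /opt/firefrost-codex
docker-compose up -d db
sleep 30  # Wait for database to start
cat /tmp/codex_backup_*/dify_postgres.sql | docker exec -i $(docker ps -qf "name=db") psql -U postgres

# Restart all services
docker-compose up -d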

Verification:

  • Login to Dify works
  • User accounts present
  • Settings preserved
  • Chat history intact
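
A fixed sleep can be too short on a slow disk and needlessly long on a fast one. As an alternative, a polling helper such as the hypothetical wait_for below retries a command until it succeeds or a timeout expires. The sketch assumes the database container is based on the official postgres image, which ships the pg_isready utility.

```shell
# Retry a command once per second until it succeeds or timeout_s seconds pass.
# Usage: wait_for <timeout_s> <command> [args...]
wait_for() {
  timeout_s="$1"
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    elapsed=$((elapsed + 1))
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}
```

For the database restore this might look like `wait_for 60 docker exec "$(docker ps -qf name=db)" pg_isready -U postgres`, followed by the psql restore only if it returns success.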

Scenario D: Complete TX1 Server Crash

Symptoms:

  • TX1 completely down
  • Hardware failure
  • OS corrupted
  • Full rebuild needed

Recovery (30-60 minutes):

Step 1: Provision new server

  • Install Ubuntu 22.04 LTS
  • Configure network (same IP if possible)
  • Install Docker and Nginx

Step 2: Retrieve backup from Command Center

# On new TX1
mkdir -p /opt
scp root@63.143.34.217:/root/backups/firefrost-codex/firefrost_codex_LATEST.tar.gz /opt/

# Extract into /tmp (the following steps read from /tmp/codex_backup_*/)
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST.tar.gz
cd codex_backup_*/

Step 3: Restore configurations

# Create deployment directory
mkdir -p /opt/firefrost-codex
cd /opt/firefrost-codex

# Restore files
cp /tmp/codex_backup_*/docker-compose.yml .
cp /tmp/codex_backup_*/.env .
mkdir -p volumes/n8n
cp -r /tmp/codex_backup_*/n8n_data/* volumes/n8n/

# Restore Nginx config
cp /tmp/codex_backup_*/firefrost-codex.conf /etc/nginx/sites-available/
ln -s /etc/nginx/sites-available/firefrost-codex.conf /etc/nginx/sites-enabled/

Step 4: Regenerate SSL certificates

systemctl stop nginx
certbot certonly --standalone \
  -d codex.firefrostgaming.com \
  -d n8n.firefrostgaming.com \
  --email codex@firefrostgaming.com \
  --agree-tos
systemctl start nginx

Step 5: Start Docker stack

cd /opt/firefrost-codex
docker-compose up -d
sleep 60  # Wait for services

Step 6: Restore database

cat /tmp/codex_backup_*/dify_postgres.sql | docker exec -i $(docker ps -qf "name=db") psql -U postgres
docker-compose restart dify-api dify-worker

Step 7: Rebuild Qdrant from Git

# Trigger Git sync in n8n to rebuild
# (Open n8n, run "Firefrost Git Sync" workflow manually)

Total downtime: ~45 minutes (assuming new server ready)


🧪 TESTING BACKUPS

CRITICAL: Never trust an untested backup

Test quarterly (every 3 months):

Dry-Run Database Restore:

# Create temporary test database
docker run --name test-postgres \
  -e POSTGRES_PASSWORD=test \
  -d postgres:15-alpine
sleep 15  # Wait for PostgreSQL to accept connections

# Restore backup into test database
cat dify_postgres.sql | docker exec -i test-postgres psql -U postgres

# Check for errors (PostgreSQL logs to stderr, so merge it into the pipe)
docker logs test-postgres 2>&1 | grep ERROR

# Cleanup
docker rm -f test-postgres

If no errors: Backup is valid

Document test results:

echo "$(date): Backup test PASSED" >> /var/log/firefrost-backup-tests.log
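
Before the database dry run, it can also be worth confirming that the archive itself is readable and complete. The sketch below checks for the files that the recovery scenarios read from the extracted codex_backup_*/ directory. The function name verify_backup_archive is hypothetical, and the required-file list is inferred from the restore commands in this guide rather than from the backup script.

```shell
# Check that a backup tarball can be listed and contains every file the
# recovery steps depend on. Prints what is missing and returns non-zero.
verify_backup_archive() {
  archive="$1"
  if ! listing="$(tar -tzf "$archive" 2>/dev/null)"; then
    echo "unreadable archive: $archive" >&2
    return 1
  fi
  status=0
  for required in dify_postgres.sql docker-compose.yml .env \
                  firefrost-codex.conf n8n_data/; do
    # Entries are stored as codex_backup_<date>/<name>, so match on "/<name>".
    if ! printf '%s\n' "$listing" | grep -qF "/$required"; then
      echo "missing from archive: $required" >&2
      status=1
    fi
  done
  return "$status"
}
```

A passing check only shows the expected files are present. The dry-run restore is still what proves the SQL dump loads.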

🚨 EMERGENCY CONTACTS

If disaster recovery fails:

  1. Check TROUBLESHOOTING.md for common issues
  2. Review Docker logs: docker-compose logs
  3. Check Nginx logs: /var/log/nginx/error.log
  4. Verify backups exist: Both local and Command Center
  5. If stuck: Document what happened, wait for fresh Chronicler session

No external support needed - we built this ourselves


📊 BACKUP MONITORING

Verify backups running:

# Check cron job
crontab -l | grep firefrost

# Check recent backup log
tail -20 /var/log/firefrost-backup.log

# Check backup exists today
ls -lh /opt/firefrost_codex_$(date +%Y%m%d)*.tar.gz

# Verify transferred to Command Center
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/ | tail -5"

Add to Uptime Kuma (optional):

  • Monitor backup log file modified date
  • Alert if >25 hours since last backup
  • Indicates backup failure
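
The 25-hour rule can also be checked from the shell, for example from a cron job that feeds whatever alerting is in place. This is a minimal sketch assuming GNU stat (as on Ubuntu). The function name is_fresh is hypothetical.

```shell
# Succeed if the file exists and was modified within the last max_hours hours.
# Usage: is_fresh <file> [max_hours]   (default: 25)
is_fresh() {
  file="$1"
  max_hours="${2:-25}"
  [ -e "$file" ] || return 1
  now="$(date +%s)"
  modified="$(stat -c %Y "$file")"   # GNU stat: mtime in epoch seconds
  [ $((now - modified)) -le $((max_hours * 3600)) ]
}
```

For example, `is_fresh /var/log/firefrost-backup.log 25 || echo "Backup log is stale"` reports when the log has not been written for more than 25 hours.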

💾 MANUAL BACKUP (Before Major Changes)

Before any major changes, create manual backup:

# Run backup script manually
/opt/firefrost_backup.sh

# Tag the newest backup from today (the 4:00 AM run may match the date too)
mv "$(ls -t /opt/firefrost_codex_$(date +%Y%m%d)_*.tar.gz | head -n 1)" \
   /opt/firefrost_codex_before_major_change_$(date +%Y%m%d).tar.gz

Keep pre-change backups for 30 days


🎯 BACKUP SUCCESS CRITERIA

Daily backups must:

  • Complete without errors
  • Transfer to Command Center successfully
  • Be restorable (tested quarterly)
  • Include all critical data (DB, n8n, configs)
  • Be less than 24 hours old

If ANY criterion fails: Investigate immediately


Fire + Frost + Foundation = Where Data Never Dies 💙🔥❄️