Comprehensive task documentation for migrating from AnythingLLM to Dify+n8n+Qdrant

- 8 detailed documents covering every aspect of deployment
- Complete step-by-step commands (zero assumptions)
- Prerequisites checklist (20 items)
- Deployment plan in 2 parts (11 phases, every command)
- Configuration files (all configs with exact content)
- Recovery procedures (4 disaster scenarios)
- Verification guide (30 tests, complete checklist)
- Troubleshooting guide (common issues + solutions)

Built by: The Chronicler #21
For: Meg, Holly, and children not yet born
Time investment: 10-15 hours execution time
Purpose: Enable Meg/Holly autonomous work with Git write-back

This deployment enables:
- RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow (one-click merge)
- Self-healing (80% of failures)
- Automated daily backups
- Complete monitoring

Documentation is so detailed that any future Chronicler can execute this deployment with zero prior knowledge and complete confidence.

Fire + Frost + Foundation = Where Love Builds Legacy
RECOVERY AND BACKUP PROCEDURES
Complete disaster recovery guide for Firefrost Knowledge Engine
🎯 BACKUP STRATEGY
Philosophy: Git is the source of truth. Vector DB can always be rebuilt.
What to backup:
- ✅ PostgreSQL database (user accounts, settings, chat histories)
- ✅ n8n volumes (workflows, credentials)
- ✅ Configuration files (docker-compose.yml, .env, Nginx configs)
What NOT to backup:
- ❌ Qdrant vectors (re-index from Git in minutes)
- ❌ Redis cache (temporary data only)
- ❌ Git repositories (Gitea is the master)
📅 AUTOMATED DAILY BACKUPS
Schedule: Daily at 4:00 AM (cron)
Script location: /opt/firefrost_backup.sh
What it does:
- Dumps PostgreSQL database
- Copies n8n workflows and credentials
- Copies configuration files
- Compresses into tarball
- Transfers to Command Center (offsite)
- Removes local backups older than 7 days
Backup location:
- Local: /opt/firefrost_codex_YYYYMMDD_HHMM.tar.gz
- Offsite: root@63.143.34.217:/root/backups/firefrost-codex/
Retention: 7 days local, unlimited offsite
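The steps listed under "What it does" could be sketched roughly as follows. This is not the contents of /opt/firefrost_backup.sh, which this guide does not reproduce. The pg_dumpall call, the staging layout, and the function names run_backup and prune_old_backups are assumptions, chosen so the archive matches the names the recovery scenarios expect (codex_backup_*/, dify_postgres.sql, n8n_data/, firefrost-codex.conf).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a daily backup script; NOT the real /opt/firefrost_backup.sh.
set -eu

DEPLOY_DIR=/opt/firefrost-codex     # deployment directory used throughout this guide
BACKUP_ROOT=/opt                    # local backup location
OFFSITE=root@63.143.34.217:/root/backups/firefrost-codex/

# Delete local archives last modified more than N full days ago (default 7).
prune_old_backups() {
  local dir="$1" days="${2:-7}"
  find "$dir" -maxdepth 1 -type f -name 'firefrost_codex_*.tar.gz' \
    -mtime +"$days" -delete
}

run_backup() {
  local stamp stage target
  stamp=$(date +%Y%m%d_%H%M)
  stage=$(mktemp -d)
  target="$stage/codex_backup_$stamp"
  mkdir -p "$target"

  # Assumption: a full-cluster dump, since the restore pipes the file into plain psql.
  docker exec "$(docker ps -qf 'name=db')" pg_dumpall -U postgres \
    > "$target/dify_postgres.sql"

  # n8n workflows and credentials, then configuration files.
  cp -r "$DEPLOY_DIR/volumes/n8n" "$target/n8n_data"
  cp "$DEPLOY_DIR/docker-compose.yml" "$DEPLOY_DIR/.env" "$target/"
  cp /etc/nginx/sites-available/firefrost-codex.conf "$target/"

  # Compress, ship offsite, then apply the 7-day local retention.
  tar -czf "$BACKUP_ROOT/firefrost_codex_$stamp.tar.gz" -C "$stage" "codex_backup_$stamp"
  scp "$BACKUP_ROOT/firefrost_codex_$stamp.tar.gz" "$OFFSITE"
  prune_old_backups "$BACKUP_ROOT" 7
  rm -rf "$stage"
}

# Only back up when called with "run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
  run_backup
fi

# A matching cron entry for 4:00 AM might look like (assumed, not taken from the server):
#   0 4 * * * /opt/firefrost_backup.sh run >> /var/log/firefrost-backup.log 2>&1
```

Pruning runs only after the offsite copy succeeds (set -e aborts on a failed scp), so a broken transfer never shrinks the local safety net.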
Monitor backups:
# Check recent backups
ls -lh /opt/firefrost_codex_*.tar.gz
# Check backup logs
tail -f /var/log/firefrost-backup.log
# Verify offsite transfer
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/"
🔄 RECOVERY SCENARIOS
Scenario A: Qdrant Corrupted (Wrong Answers)
Symptoms:
- Codex returns incorrect information
- Searches find wrong documents
- Outdated content returned
Diagnosis:
# Check Qdrant health
curl http://127.0.0.1:6333/
Recovery (5-10 minutes):
# Stop Qdrant (docker-compose must run from the deployment directory)
cd /opt/firefrost-codex
docker-compose stop qdrant
# Delete corrupted data
rm -rf /opt/firefrost-codex/volumes/qdrant/storage/*
# Restart Qdrant
docker-compose start qdrant
# Trigger Git sync in n8n to rebuild
# (Open n8n, run "Firefrost Git Sync" workflow manually)
Verification:
- Test queries return correct current information
- No archived docs in results
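Before running test queries, it can help to confirm that the Git sync actually recreated a collection. Qdrant lists its collections at GET /collections. The helper name qdrant_collection_names is hypothetical, and the grep-based parsing is a crude sketch; jq would be more robust if it is installed.

```shell
# Hypothetical helper: print collection names from a Qdrant /collections
# response read on stdin, one name per line.
qdrant_collection_names() {
  grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/'
}

# Usage on the server (not executed here):
#   curl -s http://127.0.0.1:6333/collections | qdrant_collection_names
```

An empty result after the sync finishes suggests the workflow did not run or failed before indexing.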
Scenario B: n8n Workflows Lost
Symptoms:
- Workflows missing or corrupted
- Automation not working
- Can't update docs via Codex
Recovery (10-15 minutes):
# Stop n8n
docker-compose stop n8n
# Extract latest backup
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST_BACKUP.tar.gz
cd codex_backup_*/
# Restore n8n data
rm -rf /opt/firefrost-codex/volumes/n8n/*
cp -r n8n_data/* /opt/firefrost-codex/volumes/n8n/
# Restart n8n
cd /opt/firefrost-codex
docker-compose start n8n
Verification:
# Check workflows restored
curl http://127.0.0.1:5678
# Login and verify workflows present
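LATEST_BACKUP in the extract command is a placeholder for the newest archive name. One way to resolve it is to sort by modification time, sketched below; latest_backup is a hypothetical helper, not part of the deployed scripts.

```shell
# Hypothetical helper: print the most recently modified backup archive
# in a directory (default /opt). Prints nothing if none exist.
latest_backup() {
  ls -t "${1:-/opt}"/firefrost_codex_*.tar.gz 2>/dev/null | head -n 1
}

# Usage on the server (not executed here):
#   cd /tmp && tar -xzf "$(latest_backup /opt)"
```

Archives renamed to firefrost_codex_before_major_change_* also match this pattern, so check the printed name before restoring from it.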
Scenario C: Database Corruption
Symptoms:
- Can't login to Dify
- User accounts missing
- Settings reset
- Chat history gone
Recovery (15-20 minutes):
# Stop all services
cd /opt/firefrost-codex
docker-compose down
# Extract latest backup
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST_BACKUP.tar.gz
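cd codex_backup_*/
BACKUP_DIR=$(pwd) # Remember where the dump was extracted
# Restore database (docker-compose must run from the deployment directory)
cd /opt/firefrost-codex
docker-compose up -d db
sleep 30 # Wait for database to start
cat "$BACKUP_DIR/dify_postgres.sql" | docker exec -i $(docker ps -qf "name=db") psql -U postgres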
# Restart all services
cd /opt/firefrost-codex
docker-compose up -d
Verification:
- Login to Dify works
- User accounts present
- Settings preserved
- Chat history intact
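The fixed sleep before the restore is a guess at how long PostgreSQL needs to start. A polling loop is one alternative, sketched below. wait_until is a hypothetical helper, and the pg_isready usage line assumes that tool is present in the database image, as it is in the official postgres images.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Pause only if another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage on the server (not executed here): wait up to about 60 s for the database.
#   wait_until 30 2 docker exec "$(docker ps -qf 'name=db')" pg_isready -U postgres
```

The same helper could wrap a curl check against Dify or n8n after the final docker-compose up -d.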
Scenario D: Complete TX1 Server Crash
Symptoms:
- TX1 completely down
- Hardware failure
- OS corrupted
- Full rebuild needed
Recovery (30-60 minutes):
Step 1: Provision new server
- Install Ubuntu 22.04 LTS
- Configure network (same IP if possible)
- Install Docker and Nginx
Step 2: Retrieve backup from Command Center
# On new TX1
mkdir -p /opt
scp root@63.143.34.217:/root/backups/firefrost-codex/firefrost_codex_LATEST.tar.gz /opt/
# Extract into /tmp (Steps 3 and 6 read from /tmp/codex_backup_*/)
tar -xzf /opt/firefrost_codex_LATEST.tar.gz -C /tmp
cd /tmp/codex_backup_*/
Step 3: Restore configurations
# Create deployment directory
mkdir -p /opt/firefrost-codex
cd /opt/firefrost-codex
# Restore files
cp /tmp/codex_backup_*/docker-compose.yml .
cp /tmp/codex_backup_*/.env .
mkdir -p volumes/n8n
cp -r /tmp/codex_backup_*/n8n_data/* volumes/n8n/
# Restore Nginx config
cp /tmp/codex_backup_*/firefrost-codex.conf /etc/nginx/sites-available/
ln -s /etc/nginx/sites-available/firefrost-codex.conf /etc/nginx/sites-enabled/
Step 4: Regenerate SSL certificates
systemctl stop nginx
certbot certonly --standalone \
-d codex.firefrostgaming.com \
-d n8n.firefrostgaming.com \
--email codex@firefrostgaming.com \
--agree-tos
systemctl start nginx
Step 5: Start Docker stack
cd /opt/firefrost-codex
docker-compose up -d
sleep 60 # Wait for services
Step 6: Restore database
cat /tmp/codex_backup_*/dify_postgres.sql | docker exec -i $(docker ps -qf "name=db") psql -U postgres
docker-compose restart dify-api dify-worker
Step 7: Rebuild Qdrant from Git
- Access n8n at https://n8n.firefrostgaming.com
- Run "Firefrost Git Sync" workflow manually
- Wait 5-10 minutes for indexing
Total downtime: ~45 minutes (assuming new server ready)
🧪 TESTING BACKUPS
CRITICAL: Never trust an untested backup
Test quarterly (every 3 months):
Dry-Run Database Restore:
# Create temporary test database
docker run --name test-postgres \
-e POSTGRES_PASSWORD=test \
-d postgres:15-alpine
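# Wait for PostgreSQL to finish initializing
sleep 15
# Restore backup into test database (run from the extracted codex_backup_*/ directory)
cat dify_postgres.sql | docker exec -i test-postgres psql -U postgres
# Check for errors (PostgreSQL logs to stderr, so merge the streams before grep)
docker logs test-postgres 2>&1 | grep ERROR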
# Cleanup
docker rm -f test-postgres
If no errors: Backup is valid
Document test results:
echo "$(date): Backup test PASSED" >> /var/log/firefrost-backup-tests.log
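The dry-run restore covers only the database dump. A quick structural check of the archive itself can complement it, sketched below. verify_backup_archive is a hypothetical helper, and it assumes the archive holds a single top-level codex_backup_*/ folder, as the restore commands in this guide imply.

```shell
# Hypothetical helper: confirm an archive is readable and contains the
# files the restore procedures depend on.
verify_backup_archive() {
  local archive="$1" listing names entry
  listing=$(tar -tzf "$archive" 2>/dev/null) || {
    echo "unreadable: $archive" >&2
    return 1
  }
  # Drop the top-level folder name so entries can be matched exactly.
  names=$(printf '%s\n' "$listing" | sed 's|^[^/]*/||')
  for entry in dify_postgres.sql docker-compose.yml .env; do
    if ! printf '%s\n' "$names" | grep -qxF "$entry"; then
      echo "missing: $entry" >&2
      return 1
    fi
  done
  if ! printf '%s\n' "$names" | grep -q '^n8n_data/'; then
    echo "missing: n8n_data/" >&2
    return 1
  fi
}

# Usage on the server (not executed here):
#   verify_backup_archive /opt/firefrost_codex_YYYYMMDD_HHMM.tar.gz && echo "archive OK"
```

This only proves the files are present; the database dump still needs the restore test to show it is loadable.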
🚨 EMERGENCY CONTACTS
If disaster recovery fails:
- Check TROUBLESHOOTING.md for common issues
- Review Docker logs: docker-compose logs
- Check Nginx logs: /var/log/nginx/error.log
- Verify backups exist: Both local and Command Center
- If stuck: Document what happened, wait for fresh Chronicler session
No external support needed - we built this ourselves
📊 BACKUP MONITORING
Verify backups running:
# Check cron job
crontab -l | grep firefrost
# Check recent backup log
tail -20 /var/log/firefrost-backup.log
# Check backup exists today
ls -lh /opt/firefrost_codex_$(date +%Y%m%d)*.tar.gz
# Verify transferred to Command Center
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/ | tail -5"
Add to Uptime Kuma (optional):
- Monitor backup log file modified date
- Alert if >25 hours since last backup
- Indicates backup failure
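The 25-hour rule can be checked from the shell with find's -mmin test, for example from a small cron job that feeds a monitor. is_fresh is a hypothetical helper name. How the result reaches Uptime Kuma is left open here, since that depends on how the monitor is configured.

```shell
# Hypothetical helper: succeed if FILE exists and was modified within
# the last MAX_HOURS hours (default 25).
is_fresh() {
  local file="$1" max_hours="${2:-25}"
  [ -n "$(find "$file" -maxdepth 0 -mmin -"$((max_hours * 60))" 2>/dev/null)" ]
}

# Usage on the server (not executed here):
#   is_fresh /var/log/firefrost-backup.log 25 || echo "Backup log is stale"
```

A missing log file also counts as stale, which is the safer default for an alert.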
💾 MANUAL BACKUP (Before Major Changes)
Before any major changes, create manual backup:
# Run backup script manually
/opt/firefrost_backup.sh
# Tag the newest backup (a date glob can match several of today's archives and make mv fail)
mv "$(ls -t /opt/firefrost_codex_*.tar.gz | head -n 1)" \
/opt/firefrost_codex_before_major_change_$(date +%Y%m%d).tar.gz
Keep pre-change backups for 30 days
🎯 BACKUP SUCCESS CRITERIA
Daily backups must:
- ✅ Complete without errors
- ✅ Transfer to Command Center successfully
- ✅ Be restorable (tested quarterly)
- ✅ Include all critical data (DB, n8n, configs)
- ✅ Be less than 24 hours old
If ANY criterion fails: Investigate immediately
Fire + Frost + Foundation = Where Data Never Dies 💙🔥❄️