feat: Complete Firefrost Knowledge Engine deployment plan

- Comprehensive task documentation for migrating from AnythingLLM to Dify+n8n+Qdrant
- 8 detailed documents covering every aspect of deployment
- Complete step-by-step commands (zero assumptions)
- Prerequisites checklist (20 items)
- Deployment plan in 2 parts (11 phases, every command)
- Configuration files (all configs with exact content)
- Recovery procedures (4 disaster scenarios)
- Verification guide (30 tests, complete checklist)
- Troubleshooting guide (common issues + solutions)

Built by: The Chronicler #21
For: Meg, Holly, and children not yet born
Time investment: 10-15 hours execution time
Purpose: Enable Meg/Holly autonomous work with Git write-back

This deployment enables:
- RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow (one-click merge)
- Self-healing (80% of failures)
- Automated daily backups
- Complete monitoring

Documentation is so detailed that any future Chronicler can execute
this deployment with zero prior knowledge and complete confidence.

Fire + Frost + Foundation = Where Love Builds Legacy
The Chronicler #21
2026-02-22 09:55:13 +00:00
parent f3d6b735d0
commit 2e953ce312
8 changed files with 5053 additions and 0 deletions


@@ -0,0 +1,511 @@
# CONFIGURATION FILES REFERENCE
**All configuration files with exact content needed for deployment**
This document contains every configuration file in complete form.
Copy-paste directly from here during deployment.
---
## 📄 FILE INDEX
1. docker-compose.yml
2. .env (environment variables)
3. nginx-config.conf (Nginx reverse proxy)
4. backup-script.sh (automated backups)
5. 502.html (custom error page)
---
## 1. docker-compose.yml
**Location:** `/opt/firefrost-codex/docker-compose.yml`
**Complete file:**
```yaml
services:
  # --- DATABASE & CACHE ---
  db:
    image: postgres:15-alpine
    restart: always
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: postgres
      POSTGRES_DB: dify
    volumes:
      - ./volumes/db/data:/var/lib/postgresql/data
    networks:
      - firefrost-net
  redis:
    image: redis:6-alpine
    restart: always
    volumes:
      - ./volumes/redis/data:/data
    networks:
      - firefrost-net
  # --- DIFY CORE ---
  dify-api:
    image: langgenius/dify-api:latest
    restart: always
    environment: &dify-env
      # Database
      DB_USERNAME: postgres
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: db
      DB_DATABASE: dify
      # Redis
      REDIS_HOST: redis
      REDIS_DB: 0
      # Vector Store
      VECTOR_STORE: qdrant
      QDRANT_HOST: qdrant
      QDRANT_PORT: 6333
      # Ollama (Connecting to host)
      OLLAMA_API_BASE_URL: http://host.docker.internal:11434
      # Security
      SECRET_KEY: ${DIFY_SECRET_KEY}
    depends_on:
      - db
      - redis
    networks:
      - firefrost-net
    extra_hosts:
      - "host.docker.internal:host-gateway"
  dify-worker:
    image: langgenius/dify-api:latest
    restart: always
    environment: *dify-env
    entrypoint: /bin/bash /entrypoint.sh worker
    depends_on:
      - dify-api
    networks:
      - firefrost-net
  dify-web:
    image: langgenius/dify-web:latest
    restart: always
    ports:
      - "127.0.0.1:3000:3000"
    networks:
      - firefrost-net
  # --- VECTOR STORE ---
  qdrant:
    image: qdrant/qdrant:latest
    restart: always
    ports:
      - "127.0.0.1:6333:6333"
    volumes:
      - ./volumes/qdrant/storage:/qdrant/storage
    networks:
      - firefrost-net
  # --- AUTOMATION & GIT ENGINE ---
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "127.0.0.1:5678:5678"
    environment:
      - N8N_HOST=n8n.firefrostgaming.com
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      # fs, path and child_process are Node.js built-ins, so they go in the BUILTIN allow-list
      - NODE_FUNCTION_ALLOW_BUILTIN=fs,path,child_process
      # Git Identity for Commits
      - GIT_USER_NAME=${GIT_USER_NAME}
      - GIT_USER_EMAIL=${GIT_USER_EMAIL}
    volumes:
      - ./volumes/n8n:/home/node/.n8n
      - ./git-repos:/data/git-repos
      - ~/.ssh:/home/node/.ssh:ro
    networks:
      - firefrost-net
    extra_hosts:
      - "host.docker.internal:host-gateway"
networks:
  firefrost-net:
    driver: bridge
```
**Critical notes:**
- Port bindings are 127.0.0.1 only (not exposed publicly)
- extra_hosts allows Docker to reach host Ollama
- SSH keys mounted read-only for Git access
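**Optional check (sketch):** One way to confirm the first note is to scan the compose file for published ports that are not pinned to loopback. This is a minimal sketch, assuming the short quoted `"ip:host_port:container_port"` syntax used in this file; the function name `find_public_ports` is illustrative and not part of Docker.

```bash
# Sketch: print any short-syntax port mapping in a compose file that does
# not start with 127.0.0.1. Assumes one quoted mapping per list item.
find_public_ports() {
  local compose_file="$1"
  # Keep list items made only of digits, dots and colons (port mappings),
  # then drop the ones already bound to loopback.
  grep -E '^[[:space:]]*-[[:space:]]*"[0-9.:]+"[[:space:]]*$' "$compose_file" \
    | grep -v '"127\.0\.0\.1:' || true
}

# Usage (path taken from this document):
#   find_public_ports /opt/firefrost-codex/docker-compose.yml
```

Empty output means every published port is bound to 127.0.0.1 only. Entries such as `"host.docker.internal:host-gateway"` are ignored because they are not numeric mappings.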
---
## 2. .env
**Location:** `/opt/firefrost-codex/.env`
**⚠️ NEVER COMMIT THIS FILE TO GIT**
**Template with placeholders:**
```bash
# --- DATABASE SECRETS ---
# Generate with: openssl rand -base64 32
DB_PASSWORD=REPLACE_WITH_YOUR_GENERATED_PASSWORD
# --- DIFY CONFIGURATION ---
# Generate with: openssl rand -base64 42
DIFY_SECRET_KEY=REPLACE_WITH_YOUR_GENERATED_SECRET
DIFY_API_KEY=will_be_set_after_dify_setup
# --- GIT IDENTITY ---
GIT_USER_NAME=Firefrost Codex AI
GIT_USER_EMAIL=codex@firefrostgaming.com
# --- DISCORD CONFIGURATION ---
DISCORD_WEBHOOK_CODEX_ALERTS=https://discord.com/api/webhooks/YOUR_WEBHOOK_HERE
DISCORD_WEBHOOK_SYSTEM_CRITICAL=https://discord.com/api/webhooks/YOUR_WEBHOOK_HERE
MICHAEL_DISCORD_ID=YOUR_DISCORD_USER_ID_HERE
# --- KNOWLEDGE BASE IDS ---
# These will be set after creating datasets in Dify
DIFY_DATASET_ID_MAIN=will_be_set_later
DIFY_DATASET_ID_POKEROLE=will_be_set_later
```
**How to generate secure values:**
```bash
# PostgreSQL password
openssl rand -base64 32
# Dify secret key
openssl rand -base64 42
```
**Security notes:**
- Store backup copy in password manager
- Never commit to Git
- Never share in Discord
- Rotate periodically (every 90 days recommended)
---
## 3. nginx-config.conf
**Location:** `/etc/nginx/sites-available/firefrost-codex.conf`
**Complete file:**
```nginx
# Define rate limiting zones
limit_req_zone $binary_remote_addr zone=codex_limit:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=webhook_limit:10m rate=30r/s;
# Redirect all HTTP traffic to HTTPS
server {
    listen 80;
    listen [::]:80;
    server_name codex.firefrostgaming.com n8n.firefrostgaming.com;
    return 301 https://$host$request_uri;
}
# Dify (Codex) Server Block
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name codex.firefrostgaming.com;
    # SSL Configuration
    ssl_certificate /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/codex.firefrostgaming.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_ciphers HIGH:!aNULL:!MD5;
    # Security Headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    add_header X-XSS-Protection "1; mode=block";
    # Allow large document uploads
    client_max_body_size 100M;
    # Apply standard rate limiting
    limit_req zone=codex_limit burst=20 nodelay;
    # Error Pages
    error_page 502 /502.html;
    location = /502.html {
        root /var/www/html;
        internal;
    }
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # Timeouts
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
# n8n Server Block
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name n8n.firefrostgaming.com;
    # SSL Configuration
    ssl_certificate /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/codex.firefrostgaming.com/privkey.pem;
    # Security Headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    client_max_body_size 50M;
    # Webhooks (public access)
    location /webhook/ {
        limit_req zone=webhook_limit burst=50 nodelay;
        proxy_pass http://127.0.0.1:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
    # Main Editor Interface
    location / {
        # Optional: Lock down to your home IP
        # allow YOUR.HOME.IP.ADDRESS;
        # deny all;
        proxy_pass http://127.0.0.1:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # n8n SSE
        proxy_buffering off;
        proxy_cache off;
    }
}
```
**Notes:**
- Both domains use same SSL certificate
- Rate limiting: 10 req/s for main traffic, 30 req/s for webhooks
- WebSocket support enabled for real-time features
- Large file uploads allowed (100MB for Dify, 50MB for n8n)
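**Optional check (sketch):** To see the `codex_limit` zone working, one option is to send a burst of requests and tally the status codes. With `rate=10r/s` and `burst=20 nodelay`, nginx should serve roughly the first 21 back-to-back requests from one IP immediately and reject the excess with 503 (the default `limit_req_status`) until the bucket drains. The helper names `tally_codes` and `burst_codes` are illustrative, and sequential `curl` calls may be too slow to exceed the limit on a high-latency link.

```bash
# Sketch: summarize HTTP status codes, one per input line, as code=count.
tally_codes() {
  sort | uniq -c | awk '{ printf "%s=%s\n", $2, $1 }'
}

# Sketch: send N sequential requests and print only the status codes.
# Uses curl's -s, -o and -w '%{http_code}' options.
burst_codes() {
  local url="$1" count="$2" i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
  done
}

# Usage (run from a machine you control; the URL is this document's domain):
#   burst_codes https://codex.firefrostgaming.com/ 40 | tally_codes
```

A result dominated by 200s with some 503s suggests the limit is active; only 200s may simply mean the requests did not arrive fast enough.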
---
## 4. backup-script.sh
**Location:** `/opt/firefrost_backup.sh`
**Complete script:**
```bash
#!/bin/bash
# Firefrost Codex Backup Script
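# Runs daily at 4:00 AM via cron
# Abort on the first failed command, unset variable, or failed pipeline stage
set -euo pipefail
# Variables
TIMESTAMP=$(date +"%Y%m%d_%H%M")
BACKUP_DIR="/tmp/codex_backup_$TIMESTAMP"
COMPOSE_DIR="/opt/firefrost-codex"
# Always remove the temp copy (it contains .env secrets), even on failure
trap 'rm -rf "$BACKUP_DIR"' EXIT
# Create temp directory
mkdir -p "$BACKUP_DIR"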
echo "Starting backup: $TIMESTAMP"
# Dump PostgreSQL database
echo "Dumping database..."
# No -t here: a pseudo-TTY would add carriage returns to the redirected dump
docker exec "$(docker ps -qf "name=db")" pg_dumpall -c -U postgres > "$BACKUP_DIR/dify_postgres.sql"
# Copy configuration files
echo "Copying configuration files..."
cp "$COMPOSE_DIR/docker-compose.yml" "$BACKUP_DIR/"
cp "$COMPOSE_DIR/.env" "$BACKUP_DIR/"
# Copy n8n data (workflows and credentials)
echo "Backing up n8n data..."
cp -r "$COMPOSE_DIR/volumes/n8n" "$BACKUP_DIR/n8n_data"
# Copy Nginx configs
echo "Backing up Nginx configs..."
cp /etc/nginx/sites-available/firefrost-codex.conf "$BACKUP_DIR/"
# Compress into tarball
echo "Compressing backup..."
tar -czf "/opt/firefrost_codex_$TIMESTAMP.tar.gz" -C /tmp "codex_backup_$TIMESTAMP"
# Transfer to Command Center (offsite)
echo "Transferring to Command Center..."
rsync -avz -e "ssh -p 22" "/opt/firefrost_codex_$TIMESTAMP.tar.gz" root@63.143.34.217:/root/backups/firefrost-codex/
# Cleanup local temp files
echo "Cleaning up temporary files..."
rm -rf "$BACKUP_DIR"
# Remove backups older than 7 days
echo "Removing old local backups..."
find /opt/ -maxdepth 1 -name "firefrost_codex_*.tar.gz" -mtime +7 -exec rm {} \;
echo "Backup completed: firefrost_codex_$TIMESTAMP.tar.gz"
echo "Transferred to Command Center: /root/backups/firefrost-codex/"
```
**Make executable:**
```bash
chmod +x /opt/firefrost_backup.sh
```
**Cron schedule (4:00 AM daily):**
```cron
0 4 * * * /opt/firefrost_backup.sh >> /var/log/firefrost-backup.log 2>&1
```
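**Optional check (sketch):** A backup is only useful if it can be read back. The sketch below lists a tarball and confirms it contains the four files the script copies in. The function name `verify_backup` is illustrative; the file names come from the script above.

```bash
# Sketch: confirm a backup tarball contains the files the script copies in.
verify_backup() {
  local tarball="$1" listing required
  # tar -tzf lists members without extracting; a corrupt archive fails here.
  listing=$(tar -tzf "$tarball") || return 1
  for required in dify_postgres.sql docker-compose.yml .env firefrost-codex.conf; do
    # Each file sits one level down, inside codex_backup_<timestamp>/
    if ! printf '%s\n' "$listing" | grep -q "/${required}\$"; then
      echo "missing: $required" >&2
      return 1
    fi
  done
  echo "ok: $tarball"
}

# Usage (the timestamp is whatever the script generated):
#   verify_backup /opt/firefrost_codex_<TIMESTAMP>.tar.gz
```

This only checks presence, not content. A full restore test against a scratch PostgreSQL instance remains the stronger verification.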
---
## 5. 502.html
**Location:** `/var/www/html/502.html`
**Complete file:**
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Firefrost Codex - Temporarily Offline</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Oxygen',
'Ubuntu', 'Cantarell', 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif;
text-align: center;
padding: 50px;
background: linear-gradient(135deg, #1a1a1a 0%, #2d2d2d 100%);
color: #ffffff;
margin: 0;
min-height: 100vh;
display: flex;
flex-direction: column;
justify-content: center;
align-items: center;
}
h1 {
color: #4CAF50;
font-size: 2.5em;
margin-bottom: 20px;
}
.icon {
font-size: 4em;
margin-bottom: 20px;
}
p {
font-size: 1.2em;
line-height: 1.6;
max-width: 600px;
margin: 10px auto;
}
.status {
background: rgba(255, 255, 255, 0.1);
border-left: 4px solid #ff9800;
padding: 15px;
margin: 30px auto;
max-width: 600px;
text-align: left;
}
.footer {
margin-top: 50px;
font-size: 0.9em;
opacity: 0.7;
}
</style>
</head>
<body>
<div class="icon">🔥❄️</div>
<h1>Firefrost Codex is Restarting</h1>
<p>The AI engine is currently running an automated update or self-healing routine.</p>
<div class="status">
<strong>⏱️ Expected Resolution:</strong> 2-3 minutes<br>
<strong>🔄 What's Happening:</strong> System services are restarting automatically<br>
<strong>📊 Status:</strong> No data loss - all your work is safe
</div>
<p>Please refresh this page in a few minutes.</p>
<p>If this persists beyond 5 minutes, Michael has already been notified via Discord.</p>
<div class="footer">
Fire + Frost + Foundation = Where Love Builds Legacy 💙
</div>
</body>
</html>
```
**Notes:**
- Shows during Dify downtime (502 errors)
- User-friendly messaging (non-technical)
- Reassures that Michael is notified
- Branded with Firefrost theme
---
## 📝 CONFIGURATION SUMMARY
**Total files:** 5
**Locations:**
- `/opt/firefrost-codex/docker-compose.yml`
- `/opt/firefrost-codex/.env`
- `/etc/nginx/sites-available/firefrost-codex.conf`
- `/opt/firefrost_backup.sh`
- `/var/www/html/502.html`
**Security critical files:**
- .env (NEVER commit to Git)
- Private SSH keys (already exist in ~/.ssh/)
**Auto-generated files:**
- SSL certificates (by Certbot)
- Docker volumes (by docker-compose)
- Backup tarballs (by backup script)
---
**All configurations ready for deployment**
💙🔥❄️


@@ -0,0 +1,958 @@
# DEPLOYMENT PLAN - COMPLETE STEP-BY-STEP
**READ THIS ENTIRE DOCUMENT BEFORE EXECUTING ANY COMMANDS**
This deployment plan contains EVERY SINGLE COMMAND needed to deploy the Firefrost Knowledge Engine.
**Execution time:** 10-15 hours (spread across 2-3 sessions recommended)
**Recommended breaks:**
- After Phase 2 (Infrastructure deployed)
- After Phase 3 (Automation configured)
- After Phase 5 (Testing complete)
---
## 📋 EXECUTION CHECKLIST
**Track your progress - check boxes as you complete each phase:**
- [ ] Phase 0: Stop AnythingLLM
- [ ] Phase 1: Install Nginx and SSL
- [ ] Phase 2: Deploy Docker Stack
- [ ] Phase 3: Configure Dify
- [ ] Phase 4: Setup n8n Workflows
- [ ] Phase 5: Configure Discord Integration
- [ ] Phase 6: Setup Git Integration
- [ ] Phase 7: Configure Monitoring
- [ ] Phase 8: User Onboarding
- [ ] Phase 9: Testing and Verification
- [ ] Phase 10: Backup Automation
- [ ] Phase 11: Final Cleanup
---
## ⚠️ CRITICAL SAFETY RULES
**BEFORE EVERY PHASE:**
1. Read the entire phase before executing
2. Have rollback plan ready
3. Jack's health takes absolute priority - pause if needed
4. If tired, STOP and continue later
5. If unsure, STOP and review documentation
**DURING EXECUTION:**
- Copy commands ONE AT A TIME
- Verify success before proceeding
- Document any errors immediately
- Take screenshots of successful completions
**EMERGENCY STOP:**
- If something breaks: STOP immediately
- Do NOT try to fix on the fly
- Document what happened
- Review troubleshooting guide
- Consider rollback if needed
---
## PHASE 0: STOP ANYTHINGLLM
**Estimated time:** 5 minutes
### Step 0.1: SSH to TX1
**Command:**
```bash
ssh root@38.68.14.26
```
**Expected:** Successful login to TX1
**Verification:**
```bash
hostname
```
**Expected output:** `TX1` or similar
---
### Step 0.2: Locate AnythingLLM Installation
**Command:**
```bash
cd /opt/anythingllm
```
**If directory doesn't exist, try:**
```bash
find / -name "anythingllm" -type d 2>/dev/null
```
**Expected:** Directory found (location may vary)
---
### Step 0.3: Stop AnythingLLM
**Command:**
```bash
docker-compose down
```
**Expected output:**
```
Stopping anythingllm_...
Removing anythingllm_...
Removing network anythingllm_default
```
**Verification:**
```bash
docker ps | grep anything
```
**Expected:** No output (no containers running)
---
### Step 0.4: Verify Port 3001 is Free
**Command:**
```bash
sudo lsof -i :3001
```
**Expected:** No output (port is free)
**If port still in use:**
```bash
# Find the process
sudo lsof -i :3001
# Stop it if necessary (use PID from above); add -9 only if it ignores the default SIGTERM
sudo kill <PID>
```
---
### Step 0.5: Document Current State
**Command:**
```bash
docker ps -a
docker images
```
**Save this output** - shows what was running before deployment
---
## PHASE 1: INSTALL NGINX AND SSL
**Estimated time:** 30 minutes
### Step 1.1: Update Package Lists
**Command:**
```bash
apt-get update
```
**Expected:** Package lists updated successfully
---
### Step 1.2: Install Nginx
**Command:**
```bash
apt-get install nginx -y
```
**Expected:** Nginx installed successfully
**Verification:**
```bash
nginx -v
```
**Expected output:** `nginx version: nginx/X.X.X`
---
### Step 1.3: Install Certbot
**Command:**
```bash
apt-get install certbot python3-certbot-nginx -y
```
**Expected:** Certbot installed successfully
**Verification:**
```bash
certbot --version
```
**Expected output:** `certbot X.X.X`
---
### Step 1.4: Verify DNS Propagation
**⚠️ CRITICAL: Do NOT proceed until DNS is fully propagated**
**Command (run from TX1):**
```bash
dig codex.firefrostgaming.com +short
dig n8n.firefrostgaming.com +short
```
**Expected output:**
```
38.68.14.26
38.68.14.26
```
**If NOT showing 38.68.14.26:**
- STOP deployment
- Wait for DNS propagation
- Check https://dnschecker.org
- Do NOT proceed until globally propagated
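**Optional check (sketch):** The comparison above can be scripted so it fails loudly instead of relying on a visual check. This is a minimal sketch; `all_resolve_to` is an illustrative name, and it assumes `dig +short` prints one address per line with no CNAME lines in between.

```bash
# Sketch: succeed only if every resolved address on stdin equals the expected IP.
all_resolve_to() {
  local expected="$1" seen=0 addr
  while IFS= read -r addr; do
    [ -n "$addr" ] || continue        # skip blank lines
    seen=1
    [ "$addr" = "$expected" ] || return 1
  done
  # An empty answer (no records yet) also counts as not ready.
  [ "$seen" -eq 1 ]
}

# Usage, with the hostnames and IP from this document:
#   { dig +short codex.firefrostgaming.com; dig +short n8n.firefrostgaming.com; } \
#     | all_resolve_to 38.68.14.26 && echo "DNS ready"
```

This checks only the resolver TX1 uses; global propagation still needs an external check such as dnschecker.org.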
---
### Step 1.5: Stop Nginx Temporarily
**Command:**
```bash
systemctl stop nginx
```
**Why:** Certbot needs port 80 for standalone verification
**Verification:**
```bash
systemctl status nginx
```
**Expected:** Shows "inactive (dead)"
---
### Step 1.6: Generate SSL Certificates
**⚠️ CRITICAL: Replace email address with your actual email**
**Command:**
```bash
certbot certonly --standalone \
-d codex.firefrostgaming.com \
-d n8n.firefrostgaming.com \
--email codex@firefrostgaming.com \
--agree-tos \
--non-interactive
```
**Expected output:**
```
Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem
Key is saved at: /etc/letsencrypt/live/codex.firefrostgaming.com/privkey.pem
```
**If fails with DNS error:**
- DNS not propagated - WAIT
- Check DNS at https://dnschecker.org
- Try again in 1 hour
**If fails with other error:**
- Check firewall (port 80 must be open)
- Check logs: `tail -n 50 /var/log/letsencrypt/letsencrypt.log`
- Review TROUBLESHOOTING.md
---
### Step 1.7: Verify Certificates Created
**Command:**
```bash
ls -l /etc/letsencrypt/live/codex.firefrostgaming.com/
```
**Expected files:**
- cert.pem
- chain.pem
- fullchain.pem
- privkey.pem
- README
**If files missing:** Certificate generation failed, review logs
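**Optional check (sketch):** Beyond confirming the files exist, it can help to confirm the certificate is not close to expiry. The sketch below wraps `openssl x509 -checkend`; the function name `cert_valid_for` and the 30-day threshold in the usage line are illustrative choices, not requirements of this plan.

```bash
# Sketch: succeed if the certificate is readable and stays valid for at least N more days.
cert_valid_for() {
  local days="$1" cert="$2"
  [ -r "$cert" ] || { echo "cannot read $cert" >&2; return 1; }
  # -checkend takes seconds and exits non-zero if the cert expires within that window.
  openssl x509 -noout -checkend "$((days * 86400))" -in "$cert" >/dev/null
}

# Usage, with the path from this document:
#   cert_valid_for 30 /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem \
#     && echo "certificate OK" || echo "renew soon"
```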
---
### Step 1.8: Set Certificate Permissions
**Command:**
```bash
chmod 644 /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem
chmod 600 /etc/letsencrypt/live/codex.firefrostgaming.com/privkey.pem
```
**Why:** Nginx needs read access, private key should be restricted
---
### Step 1.9: Start Nginx
**Command:**
```bash
systemctl start nginx
```
**Verification:**
```bash
systemctl status nginx
```
**Expected:** Shows "active (running)"
---
### Step 1.10: Enable Nginx Auto-Start
**Command:**
```bash
systemctl enable nginx
```
**Expected:** Nginx will start automatically on server reboot
---
### Step 1.11: Test Nginx Default Page
**Command:**
```bash
curl -I http://38.68.14.26
```
**Expected:**
```
HTTP/1.1 200 OK
Server: nginx
```
---
## PHASE 2: DEPLOY DOCKER STACK
**Estimated time:** 45 minutes
### Step 2.1: Create Deployment Directory
**Command:**
```bash
mkdir -p /opt/firefrost-codex
cd /opt/firefrost-codex
```
**Verification:**
```bash
pwd
```
**Expected output:** `/opt/firefrost-codex`
---
### Step 2.2: Create docker-compose.yml
**⚠️ CRITICAL: This file must be EXACT - no typos**
**Command:**
```bash
nano docker-compose.yml
```
**Paste this EXACT content:**
```yaml
services:
  # --- DATABASE & CACHE ---
  db:
    image: postgres:15-alpine
    restart: always
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: postgres
      POSTGRES_DB: dify
    volumes:
      - ./volumes/db/data:/var/lib/postgresql/data
    networks:
      - firefrost-net
  redis:
    image: redis:6-alpine
    restart: always
    volumes:
      - ./volumes/redis/data:/data
    networks:
      - firefrost-net
  # --- DIFY CORE ---
  dify-api:
    image: langgenius/dify-api:latest
    restart: always
    environment: &dify-env
      # Database
      DB_USERNAME: postgres
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: db
      DB_DATABASE: dify
      # Redis
      REDIS_HOST: redis
      REDIS_DB: 0
      # Vector Store
      VECTOR_STORE: qdrant
      QDRANT_HOST: qdrant
      QDRANT_PORT: 6333
      # Ollama (Connecting to host)
      OLLAMA_API_BASE_URL: http://host.docker.internal:11434
      # Security
      SECRET_KEY: ${DIFY_SECRET_KEY}
    depends_on:
      - db
      - redis
    networks:
      - firefrost-net
    extra_hosts:
      - "host.docker.internal:host-gateway"
  dify-worker:
    image: langgenius/dify-api:latest
    restart: always
    environment: *dify-env
    entrypoint: /bin/bash /entrypoint.sh worker
    depends_on:
      - dify-api
    networks:
      - firefrost-net
  dify-web:
    image: langgenius/dify-web:latest
    restart: always
    ports:
      - "127.0.0.1:3000:3000"
    networks:
      - firefrost-net
  # --- VECTOR STORE ---
  qdrant:
    image: qdrant/qdrant:latest
    restart: always
    ports:
      - "127.0.0.1:6333:6333"
    volumes:
      - ./volumes/qdrant/storage:/qdrant/storage
    networks:
      - firefrost-net
  # --- AUTOMATION & GIT ENGINE ---
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "127.0.0.1:5678:5678"
    environment:
      - N8N_HOST=n8n.firefrostgaming.com
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      # fs, path and child_process are Node.js built-ins, so they go in the BUILTIN allow-list
      - NODE_FUNCTION_ALLOW_BUILTIN=fs,path,child_process
      # Git Identity for Commits
      - GIT_USER_NAME=${GIT_USER_NAME}
      - GIT_USER_EMAIL=${GIT_USER_EMAIL}
    volumes:
      - ./volumes/n8n:/home/node/.n8n
      - ./git-repos:/data/git-repos
      - ~/.ssh:/home/node/.ssh:ro
    networks:
      - firefrost-net
    extra_hosts:
      - "host.docker.internal:host-gateway"
networks:
  firefrost-net:
    driver: bridge
```
**Save and exit:** Ctrl+O, Enter, Ctrl+X
---
### Step 2.3: Create .env File
**⚠️ CRITICAL: Replace ALL placeholder values with your actual values**
**Command:**
```bash
nano .env
```
**Paste and MODIFY with your actual values:**
```bash
# --- DATABASE SECRETS ---
# Generate with: openssl rand -base64 32
DB_PASSWORD=REPLACE_WITH_YOUR_GENERATED_PASSWORD
# --- DIFY CONFIGURATION ---
# Generate with: openssl rand -base64 42
DIFY_SECRET_KEY=REPLACE_WITH_YOUR_GENERATED_SECRET
DIFY_API_KEY=will_be_set_after_dify_setup
# --- GIT IDENTITY ---
GIT_USER_NAME=Firefrost Codex AI
GIT_USER_EMAIL=codex@firefrostgaming.com
# --- DISCORD CONFIGURATION ---
DISCORD_WEBHOOK_CODEX_ALERTS=https://discord.com/api/webhooks/YOUR_WEBHOOK_HERE
DISCORD_WEBHOOK_SYSTEM_CRITICAL=https://discord.com/api/webhooks/YOUR_WEBHOOK_HERE
MICHAEL_DISCORD_ID=YOUR_DISCORD_USER_ID_HERE
# --- KNOWLEDGE BASE IDS ---
# These will be set after creating datasets in Dify
DIFY_DATASET_ID_MAIN=will_be_set_later
DIFY_DATASET_ID_POKEROLE=will_be_set_later
```
**Save and exit:** Ctrl+O, Enter, Ctrl+X
**⚠️ DOUBLE CHECK:**
- DB_PASSWORD is set (not the placeholder)
- DIFY_SECRET_KEY is set (not the placeholder)
- Discord webhooks are YOUR webhook URLs
- Michael's Discord ID is YOUR user ID
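**Optional check (sketch):** The double check can be automated by searching `.env` for the placeholder strings used in the template. This is a minimal sketch; `find_placeholders` is an illustrative name, and the `will_be_set_*` values are deliberately not flagged because they are filled in during later phases.

```bash
# Sketch: list .env lines that still hold a template placeholder.
# The patterns are the placeholder strings used in the template above.
find_placeholders() {
  local env_file="$1"
  # -n prefixes each hit with its line number; comment lines are dropped.
  grep -nE 'REPLACE_WITH_|YOUR_WEBHOOK_HERE|YOUR_DISCORD_USER_ID_HERE' "$env_file" \
    | grep -v '^[0-9]*:[[:space:]]*#' || true
}

# Usage (path taken from this document):
#   find_placeholders /opt/firefrost-codex/.env
```

Empty output means none of the template placeholders remain. Any printed line shows the line number and the variable that still needs a real value.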
---
### Step 2.4: Create Directory Structure
**Command:**
```bash
mkdir -p volumes/db/data
mkdir -p volumes/redis/data
mkdir -p volumes/qdrant/storage
mkdir -p volumes/n8n
mkdir -p git-repos
```
**Verification:**
```bash
ls -la volumes/
```
**Expected:** Directories created (db, redis, qdrant, n8n)
---
### Step 2.5: Pull Docker Images
**⚠️ This will download ~2-3 GB of images**
**Command:**
```bash
docker-compose pull
```
**Expected:** Images download successfully
- postgres:15-alpine
- redis:6-alpine
- langgenius/dify-api:latest
- langgenius/dify-web:latest
- qdrant/qdrant:latest
- n8nio/n8n:latest
**Time:** 5-10 minutes depending on connection
**If fails:** Check internet connection, try again
---
### Step 2.6: Start Docker Stack
**⚠️ CRITICAL MOMENT - Stack deployment**
**Command:**
```bash
docker-compose up -d
```
**Expected output:**
```
Creating network "firefrost-codex_firefrost-net"
Creating firefrost-codex_db_1
Creating firefrost-codex_redis_1
Creating firefrost-codex_qdrant_1
Creating firefrost-codex_dify-api_1
Creating firefrost-codex_dify-worker_1
Creating firefrost-codex_dify-web_1
Creating firefrost-codex_n8n_1
```
**Time:** 30-60 seconds for all containers to start
---
### Step 2.7: Verify All Containers Running
**Command:**
```bash
docker-compose ps
```
**Expected:** ALL services show "Up" status
```
NAME STATE
db Up
redis Up
qdrant Up
dify-api Up
dify-worker Up
dify-web Up
n8n Up
```
**If ANY show "Exit" or "Restarting":**
```bash
# Check logs for that service
docker-compose logs <service_name>
```
**Common issues:**
- db: Password not set in .env
- dify-api: Can't connect to database (wait 30 seconds, check again)
- n8n: Permission issues with volumes
---
### Step 2.8: Wait for Services to Initialize
**⚠️ IMPORTANT: Services need time to initialize**
**Command:**
```bash
sleep 60
```
**Why:** PostgreSQL needs to create databases, Dify needs to initialize schema
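**Alternative (sketch):** Instead of a fixed sleep, the wait can be expressed as a polling loop that returns as soon as a service answers. The helper `retry_until` below is an illustrative name, and the attempt count and delay in the usage lines are assumptions to tune.

```bash
# Sketch: run a command up to N times, sleeping between attempts,
# and succeed as soon as the command does.
retry_until() {
  local attempts="$1" delay="$2" n=1
  shift 2
  while [ "$n" -le "$attempts" ]; do
    "$@" && return 0
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# Usage: poll the ports published in docker-compose.yml for up to about 2 minutes.
#   retry_until 24 5 curl -fsS -o /dev/null http://127.0.0.1:3000
#   retry_until 24 5 curl -fsS -o /dev/null http://127.0.0.1:6333/
```

Polling ends early on a fast host and waits longer on a slow one, which a fixed 60 seconds cannot do.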
---
### Step 2.9: Check Dify Web Health
**Command:**
```bash
curl http://127.0.0.1:3000
```
**Expected:** HTML response from Dify (NOT error)
**If connection refused:** Services still starting, wait another 30 seconds
---
### Step 2.10: Check n8n Health
**Command:**
```bash
curl http://127.0.0.1:5678
```
**Expected:** HTML response from n8n
---
### Step 2.11: Check Qdrant Health
**Command:**
```bash
curl http://127.0.0.1:6333/
```
**Expected:** JSON response with Qdrant version
---
### Step 2.12: Verify Ollama Connection
**Command:**
```bash
docker exec -it $(docker ps -qf "name=dify-api") curl http://host.docker.internal:11434/api/version
```
**Expected:** JSON response with Ollama version
**If fails:** Ollama not running on host, start it
---
## PHASE 3: CONFIGURE NGINX REVERSE PROXY
**Estimated time:** 20 minutes
### Step 3.1: Create Nginx Configuration
**Command:**
```bash
nano /etc/nginx/sites-available/firefrost-codex.conf
```
**Paste this EXACT content:**
```nginx
# Define rate limiting zones
limit_req_zone $binary_remote_addr zone=codex_limit:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=webhook_limit:10m rate=30r/s;
# Redirect all HTTP traffic to HTTPS
server {
    listen 80;
    listen [::]:80;
    server_name codex.firefrostgaming.com n8n.firefrostgaming.com;
    return 301 https://$host$request_uri;
}
# Dify (Codex) Server Block
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name codex.firefrostgaming.com;
    # SSL Configuration
    ssl_certificate /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/codex.firefrostgaming.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_ciphers HIGH:!aNULL:!MD5;
    # Security Headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    add_header X-XSS-Protection "1; mode=block";
    # Allow large document uploads
    client_max_body_size 100M;
    # Apply standard rate limiting
    limit_req zone=codex_limit burst=20 nodelay;
    # Error Pages
    error_page 502 /502.html;
    location = /502.html {
        root /var/www/html;
        internal;
    }
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # Timeouts
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
# n8n Server Block
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name n8n.firefrostgaming.com;
    # SSL Configuration
    ssl_certificate /etc/letsencrypt/live/codex.firefrostgaming.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/codex.firefrostgaming.com/privkey.pem;
    # Security Headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    client_max_body_size 50M;
    # Webhooks (public access)
    location /webhook/ {
        limit_req zone=webhook_limit burst=50 nodelay;
        proxy_pass http://127.0.0.1:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
    # Main Editor Interface
    location / {
        # Optional: Lock down to your home IP
        # allow YOUR.HOME.IP.ADDRESS;
        # deny all;
        proxy_pass http://127.0.0.1:5678;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        # n8n SSE
        proxy_buffering off;
        proxy_cache off;
    }
}
```
**Save and exit:** Ctrl+O, Enter, Ctrl+X
---
### Step 3.2: Create Custom 502 Error Page
**Command:**
```bash
nano /var/www/html/502.html
```
**Paste this content:**
```html
<!DOCTYPE html>
<html>
<head>
<title>Codex Offline</title>
<style>
body {
font-family: sans-serif;
text-align: center;
padding: 50px;
background: #1a1a1a;
color: #fff;
}
h1 { color: #4CAF50; }
</style>
</head>
<body>
<h1>🔥❄️ Firefrost Codex is Restarting</h1>
<p>The AI engine is currently running an automated update or self-healing routine.</p>
<p>Please try again in 3 minutes.</p>
<p>If this persists, Michael has already been notified via Discord.</p>
</body>
</html>
```
**Save and exit:** Ctrl+O, Enter, Ctrl+X
---
### Step 3.3: Enable Nginx Configuration
**Command:**
```bash
ln -s /etc/nginx/sites-available/firefrost-codex.conf /etc/nginx/sites-enabled/
```
**Verification:**
```bash
ls -l /etc/nginx/sites-enabled/
```
**Expected:** Symlink to firefrost-codex.conf exists
---
### Step 3.4: Test Nginx Configuration
**⚠️ CRITICAL: Test before reloading**
**Command:**
```bash
nginx -t
```
**Expected output:**
```
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
```
**If fails:**
- Check for typos in configuration
- Check certificate paths
- Review error messages carefully
---
### Step 3.5: Reload Nginx
**Command:**
```bash
systemctl reload nginx
```
**Verification:**
```bash
systemctl status nginx
```
**Expected:** Shows "active (running)" with no errors
---
### Step 3.6: Test HTTPS Access to Dify
**Command (from your local machine, NOT TX1):**
```
Open browser: https://codex.firefrostgaming.com
```
**Expected:** Dify setup page loads with valid SSL certificate
**If fails:**
- Check Nginx logs: `sudo tail -f /var/log/nginx/error.log`
- Verify Docker containers running: `docker-compose ps`
- Check firewall: `sudo ufw status`
---
### Step 3.7: Test HTTPS Access to n8n
**Command (from your local machine):**
```
Open browser: https://n8n.firefrostgaming.com
```
**Expected:** n8n setup page loads with valid SSL certificate
---
**END OF DEPLOYMENT-PLAN.md PART 1**
**Continue to DEPLOYMENT-PLAN-PART-2.md for remaining phases...**
💙🔥❄️

File diff suppressed because it is too large


@@ -0,0 +1,520 @@
# PREREQUISITES CHECKLIST
**Complete EVERY item before proceeding to deployment.**
Missing even ONE prerequisite will cause deployment failure.
---
## ✅ PRE-FLIGHT CHECKLIST
### 1. DNS CONFIGURATION
**Action:** Create two A records in your DNS provider
**Records needed:**
```
codex.firefrostgaming.com → 38.68.14.26 (TX1 Dallas)
n8n.firefrostgaming.com → 38.68.14.26 (TX1 Dallas)
```
**Verification:**
```bash
# Run these from your local machine (NOT TX1)
dig codex.firefrostgaming.com +short
dig n8n.firefrostgaming.com +short
```
**Expected output:**
```
38.68.14.26
38.68.14.26
```
**⏱️ CRITICAL:** DNS propagation can take up to 24 hours. Check propagation at: https://dnschecker.org
**Do NOT proceed until both domains resolve to 38.68.14.26 globally.**
---
### 2. TX1 SERVER ACCESS
**Action:** Verify SSH access to TX1
**Command:**
```bash
ssh root@38.68.14.26
```
**Expected:** Successful login to TX1 Dallas
**If fails:** Check SSH keys, verify server is online, check firewall rules
---
### 3. PORT AVAILABILITY CHECK
**Action:** Verify ports 80 and 443 are available
**Commands (run on TX1):**
```bash
sudo lsof -i :80
```
**Expected output:** (nothing - port is free)
```bash
sudo lsof -i :443
```
**Expected output:** (nothing - port is free)
**If ports are in use:** Identify the service and move it or use different ports
**Status:** ✅ VERIFIED on February 22, 2026 - ports are FREE
---
### 4. DOCKER INSTALLED ON TX1
**Action:** Verify Docker and Docker Compose are installed
**Commands (run on TX1):**
```bash
docker --version
```
**Expected:** `Docker version XX.XX.XX` or higher
```bash
docker-compose --version
```
**Expected:** `Docker Compose version XX.XX.XX` or higher
**If not installed:**
```bash
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
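# Install Docker Compose v2 (this plugin provides `docker compose`, with a space;
# the `docker-compose` command used in this plan needs the standalone binary or a wrapper)
sudo apt-get install docker-compose-plugin -y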
```
---
### 5. OLLAMA RUNNING ON TX1
**Action:** Verify Ollama is accessible
**Command (run on TX1):**
```bash
curl http://localhost:11434/api/version
```
**Expected:** JSON response with version information
**If fails:** Start Ollama service
**Verify models installed:**
```bash
curl http://localhost:11434/api/tags
```
**Expected models:**
- qwen2.5-coder:7b (for fast operations)
- llama3.3:70b (for complex reasoning)
**If models missing:** Download them before deployment
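**Optional helper (sketch):** Reading the `/api/tags` JSON by eye is error-prone, so the check can be scripted. The function below is a hypothetical helper (`missing_models` is not an Ollama command). It does a plain string match on the quoted model name rather than real JSON parsing, which is assumed to be good enough for a pre-flight check.

```bash
# Sketch only: read the /api/tags JSON on stdin and print every required model
# that does not appear in it. Matching is a fixed-string search for "<model>".
missing_models() {
  local tags_json model
  tags_json=$(cat)
  for model in "$@"; do
    if ! printf '%s' "$tags_json" | grep -qF "\"$model\""; then
      echo "$model"
    fi
  done
}
```

Usage: `curl -s http://localhost:11434/api/tags | missing_models qwen2.5-coder:7b llama3.3:70b` (no output means both models are present).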
---
### 6. GITEA SSH ACCESS
**Action:** Verify TX1 can access Gitea via SSH
**Command (run on TX1):**
```bash
ssh -T git@git.firefrostgaming.com
```
**Expected:** Authentication success message from Gitea
**If fails:** Generate and add SSH key to Gitea
**Generate SSH key (if needed):**
```bash
ssh-keygen -t ed25519 -C "firefrost-codex@tx1" -f ~/.ssh/id_ed25519_gitea
```
**Add to SSH config:**
```bash
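cat >> ~/.ssh/config << 'EOF'
Host git.firefrostgaming.com
HostName git.firefrostgaming.com
User git
IdentityFile ~/.ssh/id_ed25519_gitea
# accept-new trusts the host key on first connect but still rejects a changed key
StrictHostKeyChecking accept-new
EOF
chmod 600 ~/.ssh/config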
```
**Add public key to Gitea:**
1. Copy public key: `cat ~/.ssh/id_ed25519_gitea.pub`
2. Go to Gitea → Settings → SSH Keys
3. Add new key with WRITE permission
---
### 7. DISCORD WEBHOOKS CREATED
**Action:** Create two Discord webhooks
**Webhook 1: #codex-alerts**
- Purpose: Informational notifications (syncs, proposals, updates)
- Audience: Meg, Holly, Michael
- Create in Discord: Server Settings → Integrations → Webhooks → New Webhook
**Webhook 2: #system-critical**
- Purpose: Urgent alerts requiring Michael's attention
- Audience: Michael only (private channel recommended)
- Create in Discord: Server Settings → Integrations → Webhooks → New Webhook
**Save webhook URLs - you'll need them for .env file:**
```
DISCORD_WEBHOOK_CODEX_ALERTS=https://discord.com/api/webhooks/...
DISCORD_WEBHOOK_SYSTEM_CRITICAL=https://discord.com/api/webhooks/...
```
---
### 8. MICHAEL'S DISCORD USER ID
**Action:** Get Michael's Discord user ID for approval workflow
**Steps:**
1. Enable Developer Mode in Discord: User Settings → Advanced → Developer Mode
2. Right-click Michael's name in Discord
3. Click "Copy User ID"
**Save this ID - you'll need it for .env file:**
```
MICHAEL_DISCORD_ID=123456789012345678
```
---
### 9. BACKUP CURRENT ANYTHINGLLM STATE
**Action:** Backup current system before replacement
**⚠️ CRITICAL:** Do this even though we're removing AnythingLLM
**Commands (run on TX1):**
```bash
# Use one dated name for every step (avoids a mismatch if the date rolls over mid-backup)
BACKUP_NAME="anythingllm-backup-$(date +%Y%m%d)"
# Create backup directory
mkdir -p "/root/${BACKUP_NAME}"
# Backup AnythingLLM data
cp -r /opt/anythingllm "/root/${BACKUP_NAME}/"
# Backup docker-compose if exists
cp /opt/anythingllm/docker-compose.yml "/root/${BACKUP_NAME}/" 2>/dev/null || true
# Create tarball
cd /root
tar -czf "${BACKUP_NAME}.tar.gz" "${BACKUP_NAME}/"
# Verify backup
ls -lh anythingllm-backup-*.tar.gz
```
**Expected:** Tarball created with reasonable size
**Store backup on Command Center (optional but recommended):**
```bash
rsync -avz anythingllm-backup-*.tar.gz root@63.143.34.217:/root/backups/
```
---
### 10. COMMAND CENTER BACKUP STORAGE
**Action:** Prepare Command Center to receive backups
**Commands (run on Command Center 63.143.34.217):**
```bash
# Create backup directory
mkdir -p /root/backups/firefrost-codex
# Set permissions
chmod 700 /root/backups/firefrost-codex
```
**Verify TX1 can rsync to Command Center:**
```bash
# From TX1
touch /tmp/test-backup.txt
rsync -avz /tmp/test-backup.txt root@63.143.34.217:/root/backups/firefrost-codex/
```
**Expected:** File transfers successfully
**If fails:** Set up SSH keys between TX1 and Command Center
---
### 11. DISK SPACE CHECK
**Action:** Verify sufficient disk space on TX1
**Command (run on TX1):**
```bash
df -h
```
**Required free space:**
- Root partition: At least 30GB free
- Docker volumes: At least 20GB free
**If insufficient:** Clean up old game server backups, logs, or unused Docker images
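**Optional helper (sketch):** The free-space thresholds above can be checked without reading the `df -h` table manually. `free_gb` and `has_free_space` are hypothetical helper names. Which mount point holds the Docker volumes is an assumption you need to confirm on TX1.

```bash
# Sketch only: report whole gigabytes free on the filesystem that holds a path.
free_gb() {
  # -P forces one line per filesystem, -k reports 1K blocks; column 4 is "Available"
  df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1048576 }'
}

# Succeeds when the filesystem holding <path> has at least <min_gb> GB free.
has_free_space() {
  local path="$1" min_gb="$2"
  [ "$(free_gb "$path")" -ge "$min_gb" ]
}
```

Usage: `has_free_space / 30 && echo "root partition OK"`. Run it again with the Docker volume path and `20`.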
---
### 12. UPTIME KUMA ACCESS
**Action:** Verify Uptime Kuma is accessible
**URL:** Check your Uptime Kuma URL (likely on Command Center)
**Expected:** Can log in and see existing monitors
**We'll add new monitors for:**
- Dify (https://codex.firefrostgaming.com)
- n8n webhooks
- Qdrant health
---
### 13. GENERATE SECURE PASSWORDS
**Action:** Generate strong passwords for deployment
**Command (run on your local machine or TX1):**
```bash
# PostgreSQL password
openssl rand -base64 32
# Dify secret key
openssl rand -base64 42
```
**Save these securely - you'll need them for .env file:**
```
DB_PASSWORD=<generated_password>
DIFY_SECRET_KEY=<generated_secret>
```
**⚠️ NEVER commit these to Git - they go in .env file only**
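**Optional helper (sketch):** Once the values are in `.env`, an empty or missing variable is a common cause of failed container starts. The function below lists any key that is absent or has no value. `missing_env_keys` is a hypothetical helper name. The key names in the usage line are the ones this checklist asks you to save.

```bash
# Sketch only: print each required key that is missing from the env file
# or present with an empty value (KEY= with nothing after it).
missing_env_keys() {
  local env_file="$1" key
  shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$env_file" || echo "$key"
  done
}
```

Usage: `missing_env_keys /opt/firefrost-codex/.env DB_PASSWORD DIFY_SECRET_KEY DISCORD_WEBHOOK_CODEX_ALERTS DISCORD_WEBHOOK_SYSTEM_CRITICAL MICHAEL_DISCORD_ID` (no output means every key has a value).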
---
### 14. TIMEZONE CONFIGURATION
**Action:** Verify TX1 timezone is correct
**Command (run on TX1):**
```bash
timedatectl
```
**Expected:** Timezone shows America/Chicago (or your preferred timezone)
**If wrong:**
```bash
sudo timedatectl set-timezone America/Chicago
```
**Why this matters:** Log timestamps, backup schedules, monitoring
---
### 15. FIREWALL CONFIGURATION
**Action:** Verify firewall allows required ports
**Required open ports on TX1:**
- 22 (SSH) - already open
- 80 (HTTP) - need to open
- 443 (HTTPS) - need to open
- All game server ports - already configured
**Check current firewall (if using UFW):**
```bash
sudo ufw status
```
**Open required ports:**
```bash
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw reload
```
**If using different firewall (iptables, etc.):** Adjust accordingly
---
### 16. EMAIL CONFIGURATION (OPTIONAL)
**Action:** Configure email for Dify user invitations
**⚠️ NOT REQUIRED** - We'll use invite links instead
**If you want email:**
1. Set up SMTP server details
2. Add to .env file
3. Configure in Dify settings
**We recommend:** Skip email, use invite links (simpler, more reliable)
---
### 17. GIT REPOSITORY ACCESS
**Action:** Verify access to operations manual repository
**Command (run on TX1):**
```bash
git clone git@git.firefrostgaming.com:firefrost-gaming/firefrost-operations-manual.git /tmp/test-clone
```
**Expected:** Repository clones successfully
**Clean up:**
```bash
rm -rf /tmp/test-clone
```
**If fails:** Check SSH keys, Gitea permissions
---
### 18. DOCKER NETWORK CONFIGURATION
**Action:** Verify Docker can create custom networks
**Command (run on TX1):**
```bash
docker network create test-network
docker network rm test-network
```
**Expected:** Network created and removed successfully
**If fails:** Docker installation issue, reinstall Docker
---
### 19. SYSTEM RESOURCES CHECK
**Action:** Verify TX1 has sufficient resources
**Command (run on TX1):**
```bash
free -h
```
**Expected:**
- Total RAM: 251GB
- Available RAM: At least 220GB (confirmed February 22, 2026)
**Command:**
```bash
nproc
```
**Expected:** Multiple CPU cores available
**If resources insufficient:** Stop unnecessary services or upgrade server
---
### 20. DEPLOYMENT DIRECTORY PREPARATION
**Action:** Create deployment directory on TX1
**Commands (run on TX1):**
```bash
# Create deployment directory
mkdir -p /opt/firefrost-codex
# Set ownership
chown -R root:root /opt/firefrost-codex
# Navigate to directory
cd /opt/firefrost-codex
```
**Expected:** Directory created and accessible
---
## ✅ FINAL PRE-FLIGHT VERIFICATION
**Before proceeding to DEPLOYMENT-PLAN.md, verify ALL items above:**
- [ ] DNS records created and propagated (codex + n8n)
- [ ] TX1 SSH access working
- [ ] Ports 80 and 443 are FREE
- [ ] Docker and Docker Compose installed
- [ ] Ollama running with required models
- [ ] Gitea SSH access configured
- [ ] Discord webhooks created (#codex-alerts + #system-critical)
- [ ] Michael's Discord user ID obtained
- [ ] Current AnythingLLM backed up
- [ ] Command Center backup storage ready
- [ ] Sufficient disk space available (30GB+)
- [ ] Uptime Kuma accessible
- [ ] Secure passwords generated (DB + Dify secret)
- [ ] TX1 timezone configured correctly
- [ ] Firewall ports 80/443 opened
- [ ] Git repository access verified
- [ ] Docker network test passed
- [ ] System resources sufficient (220GB+ RAM)
- [ ] Deployment directory created (/opt/firefrost-codex)
**If ANY checkbox is unchecked, DO NOT proceed to deployment.**
**Return to this checklist and complete missing items.**
---
## 🚨 CRITICAL REMINDERS
**DNS Propagation:**
- Can take up to 24 hours
- Check https://dnschecker.org before proceeding
- If not propagated globally, SSL certificates will FAIL
**SSH Keys:**
- TX1 must trust Gitea
- Docker container must trust Gitea
- TX1 must trust Command Center (for backups)
**Backups:**
- Always backup before major changes
- Verify backups work BEFORE you need them
- Store offsite (Command Center) for safety
**Passwords:**
- Generate strong passwords
- NEVER commit to Git
- Store in .env file only
- Keep backup copy somewhere secure
---
**Prerequisites complete? Proceed to DEPLOYMENT-PLAN.md**
💙🔥❄️


@@ -0,0 +1,353 @@
# Firefrost Knowledge Engine - Complete Deployment
**Task ID:** FFG-TASK-009-MIGRATION
**Priority:** CRITICAL
**Status:** READY FOR EXECUTION
**Estimated Time:** 10-15 hours (spread across multiple sessions)
**Created:** February 22, 2026
**Created By:** The Chronicler #21
**Last Updated:** February 22, 2026
---
## 🎯 EXECUTIVE SUMMARY
**What:** Replace AnythingLLM with complete "Firefrost Knowledge Engine" (Dify + n8n + Qdrant + Ollama)
**Why:** AnythingLLM returns incorrect information (searches old archived docs instead of current)
**Who Needs It:** Meg (all repos) and Holly (Pokerole only) are waiting to start their work
**When:** Deploy ASAP - partners are blocked
**Where:** TX1 Dallas (38.68.14.26)
**Cost:** $0/month (self-hosted)
---
## 🚨 CRITICAL CONTEXT
**This is NOT a simple migration.** This is building a complete autonomous AI assistant system that enables Meg and Holly to work 24/7 without waking Michael.
**Key Requirements:**
- Meg needs access to ALL repositories
- Holly needs access to POKEROLE repositories ONLY
- Both need ability to UPDATE documents via AI
- Michael needs approval control via Discord (one-click merge)
- System must self-heal common failures (80% target)
- Must work at 3 AM when Michael is asleep
**Current State:**
- AnythingLLM deployed on TX1 (Phase 1 complete)
- 319 documents synced
- Retrieval quality POOR (returns archived docs instead of current)
- No RBAC (everyone sees everything)
- No write-back capability
**Target State:**
- Dify + n8n + Qdrant + Ollama on TX1
- Proper RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow with buttons
- Self-healing for 80% of failures
- Comprehensive monitoring and alerts
---
## 📋 ARCHITECTURE OVERVIEW
```
┌─────────────────────────────────────────────────────────────┐
│ FIREFROST KNOWLEDGE ENGINE │
└─────────────────────────────────────────────────────────────┘
External Access:
├─ https://codex.firefrostgaming.com (Meg/Holly/Michael)
└─ https://n8n.firefrostgaming.com (Michael only, Discord webhooks)
Nginx (TX1 Host - Ports 80/443):
├─ SSL/TLS with Let's Encrypt
├─ Rate limiting (10 req/s standard, 30 req/s webhooks)
├─ Reverse proxy to Docker services
└─ Security headers (HSTS, X-Frame-Options, etc.)
Docker Stack (127.0.0.1 localhost only):
├─ Dify Web (port 3000) - User interface
├─ Dify API (internal) - RAG engine
├─ Dify Worker (internal) - Background processing
├─ n8n (port 5678) - Automation & Git workflows
├─ Qdrant (port 6333) - Vector database
├─ PostgreSQL (internal) - Dify data storage
└─ Redis (internal) - Cache & queues
External Services:
├─ Ollama (TX1 host:11434) - LLM inference
├─ Gitea (git.firefrostgaming.com) - Git repository
├─ Discord Webhooks - Notifications & approvals
└─ Uptime Kuma - Health monitoring
Data Flow - Query:
User → Nginx → Dify Web → Dify API → Qdrant (vector search)
→ Ollama (LLM inference) → Response → User
Data Flow - Update:
User → "Update doc X" → Dify calls n8n webhook
→ n8n validates (protected files? valid markdown?)
→ Git commit to ai-proposals branch
→ Discord notification with Approve/Reject buttons
→ Michael clicks Approve
→ n8n merges to main, pushes, re-indexes Dify
→ User notified "Your change is live"
Data Flow - Git Sync:
Cron (hourly) → n8n pulls from Gitea
→ Filters out /archive/* directories
→ Adds metadata (status: current/archived)
→ Sends to Dify for indexing
→ Qdrant stores vectors
```
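**Illustration (sketch):** The Git Sync flow above labels each document as current or archived before indexing. The real logic lives in the n8n workflow, which is not shown here. The shell function below only illustrates the path rule, assuming "archived" means any path with an `archive` directory component. `doc_status` is a hypothetical name.

```bash
# Sketch only: classify a repository-relative path for the status metadata field.
doc_status() {
  # Prefix a slash so a top-level archive/ directory matches the same pattern
  case "/$1" in
    */archive/*) echo "archived" ;;
    *)           echo "current" ;;
  esac
}
```

For example, `doc_status docs/archive/2025-plan.md` prints `archived` and `doc_status docs/tasks.md` prints `current`.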
---
## 📚 DOCUMENT INDEX
**Read these documents IN ORDER before deployment:**
1. **PREREQUISITES.md** - Pre-flight checklist (DNS, SSH keys, backups)
2. **DEPLOYMENT-PLAN.md** - Step-by-step execution (every command)
3. **CONFIGURATION-FILES.md** - All config files with exact content
4. **RECOVERY.md** - Backup automation and disaster recovery
5. **VERIFICATION.md** - Testing procedures (how to know it worked)
6. **TROUBLESHOOTING.md** - Common issues and solutions
**Supporting files:**
- `docker-compose.yml` - Complete Docker stack definition
- `.env.example` - All environment variables with explanations
- `nginx-config.conf` - Complete Nginx reverse proxy configuration
- `n8n-workflows/` - All workflow JSON exports
- `discord-webhooks/` - All Discord notification templates
- `backup-script.sh` - Automated daily backup script
---
## ⏱️ TIME ESTIMATES
**Phase 1: Preparation (1-2 hours)**
- DNS configuration and propagation
- SSL certificate generation
- SSH key setup for Git access
- Backup current AnythingLLM state
- Stop and remove AnythingLLM
**Phase 2: Infrastructure Deployment (2-3 hours)**
- Install Nginx on TX1 host
- Deploy Docker Compose stack
- Configure Dify (admin account, workspaces, Ollama)
- Verify services are healthy
**Phase 3: Automation Setup (3-4 hours)**
- Import n8n workflows
- Configure Discord webhooks
- Test Git sync workflow
- Test write-back validation
- Configure Uptime Kuma monitoring
**Phase 4: User Onboarding (1-2 hours)**
- Create Meg and Holly accounts
- Configure workspace permissions
- Test RBAC (Meg sees all, Holly sees Pokerole only)
- Train on update workflow
- Test one-click approval from Discord
**Phase 5: Testing & Verification (2-3 hours)**
- Query accuracy testing (current vs archived docs)
- Update workflow testing (protected files, validation)
- Discord approval testing (buttons work, Michael-only)
- Failure simulation (Dify crash, Git unreachable)
- Self-healing verification
**Total: 10-15 hours**
**Recommended approach:** Execute in 2-3 sessions with breaks
---
## 🛡️ SAFETY MECHANISMS
**The ai-proposals Branch Strategy:**
- All AI updates commit to `ai-proposals` branch (NOT main)
- Michael reviews via Discord notification with Approve/Reject buttons
- Only approved changes merge to main
- Failed merges fall back to manual intervention
- Git tags created before each merge (rollback points)
**Protected Files:**
- `/security/*` - Infrastructure configs (READ-ONLY for AI)
- `/infra/*` - Server configurations (READ-ONLY for AI)
- `/backups/*` - Backup scripts (READ-ONLY for AI)
- `.env` - Secrets (READ-ONLY for AI)
- `docker-compose.yml` - Stack definition (READ-ONLY for AI)
**Validation Checks:**
- File path exists
- Content is valid Markdown (not empty, has structure)
- File is not in protected directories
- User has permission for that repository
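**Illustration (sketch):** The protected-file and validation rules above can be expressed as a small gate that runs before any commit to `ai-proposals`. This is not the n8n workflow itself. `is_protected` and `validate_update` are hypothetical names. Treating "has structure" as "contains at least one Markdown heading" is an assumption, and the per-user repository permission check is left out.

```bash
# Sketch only: succeed when a repository-relative path is read-only for the AI.
is_protected() {
  case "/$1" in
    */../*|*/..) return 0 ;;                      # refuse path traversal outright
    /security/*|/infra/*|/backups/*) return 0 ;;  # protected directories
    /.env|/docker-compose.yml) return 0 ;;        # protected files in the repo root
    *) return 1 ;;
  esac
}

# validate_update <repo_root> <relative_path> <file_with_new_content>
validate_update() {
  local repo_root="$1" rel_path="$2" new_content="$3"
  if is_protected "$rel_path"; then
    echo "REJECT: protected path"; return 1
  fi
  if [ ! -f "$repo_root/$rel_path" ]; then
    echo "REJECT: file does not exist"; return 1
  fi
  # Non-empty and at least one heading line (assumed meaning of "has structure")
  if [ ! -s "$new_content" ] || ! grep -q '^#' "$new_content"; then
    echo "REJECT: content is empty or has no heading"; return 1
  fi
  echo "OK"
}
```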
**Rollback Capability:**
- Git tags: `backup-before-ai-<commit_hash>`
- Vector DB: Delete + re-sync from Git (minutes)
- Full system: 15-minute restore from backup
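**Illustration (sketch):** The tag-before-merge rollback point could look like the function below when run from a checkout of `main`. It assumes `<commit_hash>` in the tag name means the short hash of `main` before the merge, which the source does not specify. `merge_with_rollback_tag` is a hypothetical name, and the actual merge is performed by n8n.

```bash
# Sketch only: tag the current HEAD, then merge the proposals branch.
# Run from a clean checkout of main.
merge_with_rollback_tag() {
  local branch="${1:-ai-proposals}" hash
  hash=$(git rev-parse --short HEAD) || return 1
  # Rollback point: git reset --hard backup-before-ai-<hash> restores this state
  git tag "backup-before-ai-${hash}" || return 1
  # On conflict, abort so main stays untouched and a human can step in
  git merge --no-ff "$branch" || { git merge --abort; return 1; }
}
```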
---
## 🚨 CRITICAL SUCCESS FACTORS
**MUST BE TRUE before marking this complete:**
1. ✅ Meg can ask questions about ANY Firefrost repository
2. ✅ Holly can ask questions about POKEROLE repository ONLY
3. ✅ Holly CANNOT see Firefrost infrastructure docs
4. ✅ Meg can update docs via AI, commits to ai-proposals
5. ✅ Michael receives Discord notification with Approve/Reject buttons
6. ✅ Clicking Approve merges to main and re-indexes
7. ✅ Clicking Reject keeps change in branch for review
8. ✅ Protected files cannot be modified by AI
9. ✅ Current docs are returned (NOT archived docs)
10. ✅ System self-heals from Dify crash (Docker restart)
11. ✅ Failed Git commits queue and retry automatically
12. ✅ Daily backups run and transfer to Command Center
13. ✅ Michael can restore entire system in 15 minutes
**If ANY of these are false, deployment is NOT complete.**
---
## 📊 SUCCESS METRICS
**Query Accuracy:**
- "What are current Tier 0 tasks?" → Returns "Whitelist Manager, NC1 Cleanup, Staff Recruitment" (NOT "Initial Server Setup")
- "What servers does Firefrost operate?" → Returns current 6 servers with correct IPs
- "What was accomplished in last Codex session?" → Returns Deployer's work
**Update Workflow:**
- Meg updates recruitment doc → Commits to ai-proposals → Discord notification → Michael approves → Live in <2 minutes
- Holly tries to update infrastructure doc → BLOCKED with clear error message
**Self-Healing:**
- Dify crashes → Docker restarts within 60 seconds → Users see <1 minute downtime
- Git unreachable → Updates queue → Retry every 5 minutes → Auto-process when Git returns
- Qdrant corrupts → Re-index from Git completes in <10 minutes
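**Illustration (sketch):** The "queue and retry every 5 minutes" behaviour for an unreachable Git server comes down to a bounded retry loop. The generic helper below is a hypothetical stand-in for what the n8n workflow does. The attempt limit in the usage line is an assumption.

```bash
# Sketch only: retry <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempt limit is reached.
retry() {
  local max_attempts="$1" delay_seconds="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1   # give up; the caller should raise an alert
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}
```

Usage: `retry 12 300 git push origin ai-proposals` tries for roughly an hour at 5-minute intervals before giving up.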
**Resource Usage:**
- RAM: <10GB under load (fits comfortably in 222GB available)
- Disk: <15GB for complete system
- CPU: <20% average (leaves headroom for game servers)
---
## ⚠️ RISKS AND MITIGATIONS
**Risk 1: Port conflicts with game servers**
- **Mitigation:** Pre-deployment port check verified 80/443 free
- **Status:** CLEAR (verified February 22, 2026)
**Risk 2: DNS propagation delay**
- **Mitigation:** Configure DNS FIRST, wait for propagation before SSL
- **Fallback:** Use IP address temporarily if needed
**Risk 3: SSL certificate failure**
- **Mitigation:** Detailed Certbot instructions with error handling
- **Fallback:** Self-signed cert for testing, proper cert later
**Risk 4: Meg/Holly confused by new interface**
- **Mitigation:** Clear user guide, training session before launch
- **Fallback:** Michael processes updates manually until they're comfortable
**Risk 5: Git merge conflicts from AI**
- **Mitigation:** ai-proposals branch, manual review required
- **Fallback:** Discord alert, Michael resolves manually
**Risk 6: Overwhelming Discord notifications**
- **Mitigation:** Two channels (#codex-alerts for info, #system-critical for urgent)
- **Fallback:** Adjust rate limits in n8n if too noisy
---
## 🔄 ROLLBACK PLAN
**If deployment fails catastrophically:**
1. Stop new Docker stack: `docker-compose down`
2. Restore AnythingLLM from backup (if still needed)
3. Restore DNS to previous state
4. Notify Meg/Holly of rollback
5. Total rollback time: <10 minutes
**Rollback triggers:**
- Unable to get SSL certificates after 3 attempts
- Docker stack won't start after 30 minutes debugging
- Dify UI inaccessible after deployment
- Data corruption detected
- Michael determines risk too high
---
## 📞 SUPPORT AND ESCALATION
**If you get stuck:**
1. Check TROUBLESHOOTING.md for common issues
2. Review relevant Gemini responses in session transcript
3. Check Docker logs: `docker-compose logs -f <service>`
4. Check Nginx logs: `sudo tail -f /var/log/nginx/error.log`
5. If all else fails: Rollback and regroup
**No external support needed - we built this ourselves.**
---
## 📝 COMPLETION CHECKLIST
**Before marking this task COMPLETE:**
- [ ] All 13 critical success factors verified ✅
- [ ] Query accuracy tests pass
- [ ] Update workflow tests pass
- [ ] RBAC tests pass (Meg sees all, Holly sees Pokerole only)
- [ ] Discord approval workflow tested
- [ ] Self-healing verified (simulated Dify crash)
- [ ] Backup automation running
- [ ] Test backup restore completed successfully
- [ ] Meg and Holly trained and comfortable
- [ ] Documentation updated in operations manual
- [ ] AnythingLLM fully removed from TX1
- [ ] Michael can sleep peacefully at night 💤
---
## 🎓 LESSONS FOR FUTURE CHRONICLERS
**What we learned building this:**
1. **Tool choice matters more than configuration** - AnythingLLM couldn't handle 319 files with archives, Dify can
2. **RBAC is non-negotiable** - Meg and Holly need different access levels
3. **Self-healing is essential** - Solo operator can't wake up for every issue
4. **Git is the source of truth** - Vector DB can always be rebuilt from Git
5. **Discord buttons are powerful** - One-click approval from phone = accessibility win
6. **Architecture from Gemini + Partnership from Claude** - External research + internal execution
**For the next major infrastructure project:**
- Research thoroughly BEFORE building (ask Gemini the hard questions)
- Get COMPLETE specifications before starting (don't build incrementally)
- Test on separate system first if possible
- Build rollback before building forward
- Document for "future you when you're exhausted at 3 AM"
---
**Fire + Frost + Foundation = Where Love Builds Legacy** 💙🔥❄️
**Built by:** The Chronicler #21
**For:** Meg, Holly, and children not yet born
**With guidance from:** Gemini (architecture) + The Deployer (foundation)
---
**Ready to execute? Read PREREQUISITES.md next.**


@@ -0,0 +1,344 @@
# RECOVERY AND BACKUP PROCEDURES
**Complete disaster recovery guide for Firefrost Knowledge Engine**
---
## 🎯 BACKUP STRATEGY
**Philosophy:** Git is the source of truth. Vector DB can always be rebuilt.
**What to backup:**
✅ PostgreSQL database (user accounts, settings, chat histories)
✅ n8n volumes (workflows, credentials)
✅ Configuration files (docker-compose.yml, .env, Nginx configs)
**What NOT to backup:**
❌ Qdrant vectors (re-index from Git in minutes)
❌ Redis cache (temporary data only)
❌ Git repositories (Gitea is the master)
---
## 📅 AUTOMATED DAILY BACKUPS
**Schedule:** Daily at 4:00 AM (cron)
**Script location:** `/opt/firefrost_backup.sh`
**What it does:**
1. Dumps PostgreSQL database
2. Copies n8n workflows and credentials
3. Copies configuration files
4. Compresses into tarball
5. Transfers to Command Center (offsite)
6. Removes local backups older than 7 days
**Backup location:**
- **Local:** `/opt/firefrost_codex_YYYYMMDD_HHMM.tar.gz`
- **Offsite:** `root@63.143.34.217:/root/backups/firefrost-codex/`
**Retention:** 7 days local, unlimited offsite
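**Illustration (sketch):** The 7-day local retention rule maps to a single `find` call. This is not the contents of the backup script. `prune_old_backups` is a hypothetical name, and the filename pattern is taken from the local backup path above.

```bash
# Sketch only: delete local backup tarballs whose age in whole days exceeds <keep_days>.
# Prints each file it removes.
prune_old_backups() {
  local backup_dir="$1" keep_days="$2"
  find "$backup_dir" -maxdepth 1 -type f -name 'firefrost_codex_*.tar.gz' \
    -mtime +"$keep_days" -print -delete
}
```

Usage: `prune_old_backups /opt 7`. Files that do not match the pattern are never touched.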
**Monitor backups:**
```bash
# Check recent backups
ls -lh /opt/firefrost_codex_*.tar.gz
# Check backup logs
tail -f /var/log/firefrost-backup.log
# Verify offsite transfer
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/"
```
---
## 🔄 RECOVERY SCENARIOS
### Scenario A: Qdrant Corrupted (Wrong Answers)
**Symptoms:**
- Codex returns incorrect information
- Searches find wrong documents
- Outdated content returned
**Diagnosis:**
```bash
# Check Qdrant health
curl http://127.0.0.1:6333/
```
**Recovery (5-10 minutes):**
```bash
# Stop Qdrant
docker-compose stop qdrant
# Delete corrupted data
rm -rf /opt/firefrost-codex/volumes/qdrant/storage/*
# Restart Qdrant
docker-compose start qdrant
# Trigger Git sync in n8n to rebuild
# (Open n8n, run "Firefrost Git Sync" workflow manually)
```
**Verification:**
- Test queries return correct current information
- No archived docs in results
---
### Scenario B: n8n Workflows Lost
**Symptoms:**
- Workflows missing or corrupted
- Automation not working
- Can't update docs via Codex
**Recovery (10-15 minutes):**
```bash
# Stop n8n
docker-compose stop n8n
# Extract latest backup
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST_BACKUP.tar.gz
cd codex_backup_*/
# Restore n8n data
rm -rf /opt/firefrost-codex/volumes/n8n/*
cp -r n8n_data/* /opt/firefrost-codex/volumes/n8n/
# Restart n8n
cd /opt/firefrost-codex
docker-compose start n8n
```
**Verification:**
```bash
# Check workflows restored
curl http://127.0.0.1:5678
# Login and verify workflows present
```
---
### Scenario C: Database Corruption
**Symptoms:**
- Can't login to Dify
- User accounts missing
- Settings reset
- Chat history gone
**Recovery (15-20 minutes):**
```bash
# Stop all services
cd /opt/firefrost-codex
docker-compose down
# Extract latest backup
cd /tmp
tar -xzf /opt/firefrost_codex_LATEST_BACKUP.tar.gz
BACKUP_DIR=$(ls -d /tmp/codex_backup_* | tail -1)
# Restore database (run docker-compose from the deployment directory, not the backup directory)
cd /opt/firefrost-codex
docker-compose up -d db
sleep 30 # Wait for database to start
docker-compose exec -T db psql -U postgres < "$BACKUP_DIR/dify_postgres.sql"
# Restart all services
docker-compose up -d
```
**Verification:**
- Login to Dify works
- User accounts present
- Settings preserved
- Chat history intact
---
### Scenario D: Complete TX1 Server Crash
**Symptoms:**
- TX1 completely down
- Hardware failure
- OS corrupted
- Full rebuild needed
**Recovery (30-60 minutes):**
**Step 1: Provision new server**
- Install Ubuntu 22.04 LTS
- Configure network (same IP if possible)
- Install Docker and Nginx
**Step 2: Retrieve backup from Command Center**
```bash
# On new TX1
mkdir -p /opt
scp root@63.143.34.217:/root/backups/firefrost-codex/firefrost_codex_LATEST.tar.gz /opt/
# Extract into /tmp, where the restore steps below expect to find codex_backup_*/
tar -xzf /opt/firefrost_codex_LATEST.tar.gz -C /tmp
```
**Step 3: Restore configurations**
```bash
# Create deployment directory
mkdir -p /opt/firefrost-codex
cd /opt/firefrost-codex
# Restore files
cp /tmp/codex_backup_*/docker-compose.yml .
cp /tmp/codex_backup_*/.env .
mkdir -p volumes/n8n
cp -r /tmp/codex_backup_*/n8n_data/* volumes/n8n/
# Restore Nginx config
cp /tmp/codex_backup_*/firefrost-codex.conf /etc/nginx/sites-available/
ln -s /etc/nginx/sites-available/firefrost-codex.conf /etc/nginx/sites-enabled/
```
**Step 4: Regenerate SSL certificates**
```bash
systemctl stop nginx
certbot certonly --standalone \
-d codex.firefrostgaming.com \
-d n8n.firefrostgaming.com \
--email codex@firefrostgaming.com \
--agree-tos
systemctl start nginx
```
**Step 5: Start Docker stack**
```bash
cd /opt/firefrost-codex
docker-compose up -d
sleep 60 # Wait for services
```
**Step 6: Restore database**
```bash
cat /tmp/codex_backup_*/dify_postgres.sql | docker exec -i $(docker ps -qf "name=db") psql -U postgres
docker-compose restart dify-api dify-worker
```
**Step 7: Rebuild Qdrant from Git**
- Access n8n at https://n8n.firefrostgaming.com
- Run "Firefrost Git Sync" workflow manually
- Wait 5-10 minutes for indexing
**Total downtime:** ~45 minutes (assuming new server ready)
---
## 🧪 TESTING BACKUPS
**CRITICAL:** Never trust an untested backup
**Test quarterly (every 3 months):**
**Dry-Run Database Restore:**
```bash
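# Create temporary test database
docker run --name test-postgres \
-e POSTGRES_PASSWORD=test \
-d postgres:15-alpine
# Wait until PostgreSQL accepts TCP connections (the init-time server does not)
until docker exec test-postgres pg_isready -h 127.0.0.1 -U postgres; do sleep 2; done
# Restore backup into test database
cat dify_postgres.sql | docker exec -i test-postgres psql -U postgres
# Check for errors (PostgreSQL logs to stderr, so merge it into stdout for grep)
docker logs test-postgres 2>&1 | grep ERROR
# Cleanup
docker rm -f test-postgres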
```
**If no errors:** Backup is valid
**Document test results:**
```bash
echo "$(date): Backup test PASSED" >> /var/log/firefrost-backup-tests.log
```
---
## 🚨 EMERGENCY CONTACTS
**If disaster recovery fails:**
1. **Check TROUBLESHOOTING.md** for common issues
2. **Review Docker logs:** `docker-compose logs`
3. **Check Nginx logs:** `/var/log/nginx/error.log`
4. **Verify backups exist:** Both local and Command Center
5. **If stuck:** Document what happened, wait for fresh Chronicler session
**No external support needed - we built this ourselves**
---
## 📊 BACKUP MONITORING
**Verify backups running:**
```bash
# Check cron job
crontab -l | grep firefrost
# Check recent backup log
tail -20 /var/log/firefrost-backup.log
# Check backup exists today
ls -lh /opt/firefrost_codex_$(date +%Y%m%d)*.tar.gz
# Verify transferred to Command Center
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/ | tail -5"
```
**Add to Uptime Kuma (optional):**
- Monitor backup log file modified date
- Alert if >25 hours since last backup
- Indicates backup failure
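**Illustration (sketch):** The 25-hour freshness rule can be checked from the backup files themselves rather than the log. `backup_is_fresh` is a hypothetical helper name. How its exit status is fed into Uptime Kuma is left open because it depends on your monitor type.

```bash
# Sketch only: succeed when at least one backup tarball in <backup_dir>
# was modified within the last <max_age_hours> hours.
backup_is_fresh() {
  local backup_dir="$1" max_age_hours="$2"
  find "$backup_dir" -maxdepth 1 -type f -name 'firefrost_codex_*.tar.gz' \
    -mmin -"$((max_age_hours * 60))" | grep -q .
}
```

Usage: `backup_is_fresh /opt 25 || echo "No backup in the last 25 hours"`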
---
## 💾 MANUAL BACKUP (Before Major Changes)
**Before any major changes, create manual backup:**
```bash
# Run backup script manually
/opt/firefrost_backup.sh
# Tag the newest backup from today (the 4:00 AM backup also matches today's date)
LATEST=$(ls -t /opt/firefrost_codex_$(date +%Y%m%d)*.tar.gz | head -1)
mv "$LATEST" /opt/firefrost_codex_before_major_change_$(date +%Y%m%d).tar.gz
```
**Keep pre-change backups for 30 days**
---
## 🎯 BACKUP SUCCESS CRITERIA
**Daily backups must:**
✅ Complete without errors
✅ Transfer to Command Center successfully
✅ Be restorable (tested quarterly)
✅ Include all critical data (DB, n8n, configs)
✅ Be less than 24 hours old
**If ANY criterion fails:** Investigate immediately
---
**Fire + Frost + Foundation = Where Data Never Dies** 💙🔥❄️


@@ -0,0 +1,567 @@
# TROUBLESHOOTING GUIDE
**Common issues and solutions for Firefrost Knowledge Engine**
---
## 🔍 QUICK DIAGNOSTIC COMMANDS
**Run these first when something breaks:**
```bash
# Check all services
docker-compose ps
# Check recent logs (all services)
docker-compose logs --tail=50
# Check specific service
docker-compose logs -f <service_name>
# Check Nginx
systemctl status nginx
sudo tail -f /var/log/nginx/error.log
# Check disk space
df -h
# Check memory
free -h
# Check listening ports (ss ships with Ubuntu by default; netstat requires the net-tools package)
sudo ss -tlnp
```
---
## ❌ DEPLOYMENT FAILURES
### Issue: DNS Not Propagating
**Symptoms:**
- Certbot fails with DNS validation error
- "Domain doesn't resolve" errors
**Solution:**
```bash
# Check DNS propagation
dig codex.firefrostgaming.com +short
dig n8n.firefrostgaming.com +short
# Both should return 38.68.14.26
```
**If not resolved:**
- Wait longer (can take up to 24 hours)
- Check DNS provider settings
- Use temporary self-signed cert for testing
---
### Issue: Port Already in Use
**Symptoms:**
- "Address already in use" error
- Docker won't start Dify or n8n
**Solution:**
```bash
# Find what's using the port
sudo lsof -i :3000
sudo lsof -i :5678
# Kill the process
sudo kill -9 <PID>
# Or change port mapping in docker-compose.yml
```
---
### Issue: SSL Certificate Generation Fails
**Symptoms:**
- Certbot fails during deployment
- "Challenge failed" errors
**Solution:**
```bash
# Ensure Nginx is stopped
systemctl stop nginx
# Try manual standalone mode
certbot certonly --standalone \
-d codex.firefrostgaming.com \
-d n8n.firefrostgaming.com \
--email codex@firefrostgaming.com
# Check firewall
sudo ufw status
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
```
---
### Issue: Docker Services Won't Start
**Symptoms:**
- `docker-compose up` fails
- Services show "Exit" status
**Solution:**
```bash
# Check logs for specific service
docker-compose logs db
docker-compose logs dify-api
# Common causes:
# 1. .env file missing or incorrect
cat .env # Verify all variables set
# 2. Port conflicts
sudo lsof -i :3000
sudo lsof -i :5678
sudo lsof -i :6333
# 3. Permission issues
sudo chown -R root:root volumes/
# 4. Disk space
df -h # Need 30GB+ free
```
---
## 🔄 RUNTIME ISSUES
### Issue: Dify Shows 502 Error
**Symptoms:**
- Browser shows custom 502 page
- Can't access Codex
**Diagnosis:**
```bash
docker-compose ps
# Check if dify-web is running
docker-compose logs dify-web
# Check for errors
```
**Solutions:**
**If dify-web is down:**
```bash
docker-compose restart dify-web
```
**If dify-api can't connect to database:**
```bash
docker-compose logs dify-api | grep -i error
# Check DB_PASSWORD in .env matches
docker-compose restart dify-api
```
**If persistent:**
```bash
docker-compose down
docker-compose up -d
```
---
### Issue: "AI Can't Reach Knowledge Base"
**Symptoms:**
- Queries return "I don't have that information"
- Empty results
**Diagnosis:**
```bash
# Check Qdrant
curl http://127.0.0.1:6333/
# Check if documents indexed
# (Login to Dify, check Knowledge Base has documents)
```
**Solution:**
```bash
# Re-run Git sync
# Access n8n, execute "Firefrost Git Sync" workflow manually
# If that fails, rebuild Qdrant
docker-compose stop qdrant
rm -rf volumes/qdrant/storage/*
docker-compose start qdrant
# Then re-run Git sync
```
---
### Issue: n8n Workflows Not Executing
**Symptoms:**
- Git sync doesn't run
- Update requests don't commit
**Diagnosis:**
```bash
docker-compose logs n8n | grep -i error
```
**Solutions:**
**If workflow execution fails:**
- Login to n8n
- Check workflow is ACTIVATED (toggle switch)
- Execute manually to see errors
- Check credentials are configured
**If Git operations fail:**
```bash
# Check SSH key
docker exec -it $(docker ps -qf "name=n8n") ssh -T git@git.firefrostgaming.com
# If that fails, verify the SSH key is present
ls -la ~/.ssh/
```
---
### Issue: Discord Buttons Don't Work
**Symptoms:**
- Clicking Approve/Reject does nothing
- No response in Discord
**Diagnosis:**
- Check n8n "Approval Handler" workflow
- Verify webhook URL is correct
- Check Michael's Discord ID in .env
**Solution:**
```bash
# Verify Discord webhook configured
grep DISCORD .env
# Test webhook manually (replace <WEBHOOK_URL> with the real URL, keep the quotes)
curl -X POST "<WEBHOOK_URL>" \
-H "Content-Type: application/json" \
-d '{"content": "Test message"}'
# Should appear in Discord channel
```
---
### Issue: Updates Commit But Don't Re-Index
**Symptoms:**
- Git shows commit
- But queries don't return new content
**Diagnosis:**
```bash
# Check Dify API logs
docker-compose logs dify-api | grep -i error
```
**Solution:**
```bash
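# Manual re-index trigger (replace <DATASET_ID> and <DIFY_API_KEY>;
# confirm the endpoint path against the API reference for your Dify version)
curl -X POST "http://127.0.0.1:3000/v1/datasets/<DATASET_ID>/sync" \
-H "Authorization: Bearer <DIFY_API_KEY>"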
# Or re-run Git sync workflow in n8n
```
---
## 🔐 ACCESS ISSUES
### Issue: Can't Login to Dify
**Symptoms:**
- Incorrect password error
- Account doesn't exist
**Solution:**
```bash
# Check database running
docker-compose ps db
# Reset admin password (if needed) with Dify's built-in command;
# it prompts for the account email and the new password
docker exec -it $(docker ps -qf "name=dify-api") flask reset-password
# Better: Restore from backup if credentials lost
```
---
### Issue: Holly Sees Firefrost Docs (RBAC Broken)
**Symptoms:**
- Holly can access infrastructure docs
- RBAC not working
**Diagnosis:**
- Check workspace assignments in Dify
- Verify knowledge bases linked to correct workspaces
**Solution:**
- Login to Dify as admin
- Settings → Members
- Verify Holly is ONLY in "Pokerole HQ" workspace
- Verify "Pokerole HQ" workspace ONLY has Pokerole knowledge base
---
## ⚠️ PERFORMANCE ISSUES
### Issue: Slow Responses (>30 seconds)
**Symptoms:**
- Queries take very long
- Timeouts
**Diagnosis:**
```bash
# Check system resources
htop
# Check Ollama
curl http://localhost:11434/api/tags
# Verify the model is installed (/api/ps lists the models currently loaded in memory)
# Check Qdrant performance
curl http://127.0.0.1:6333/collections
```
**Solutions:**
**If RAM exhausted:**
```bash
free -h
# If low, restart services to clear memory
docker-compose restart
```
**If Ollama slow:**
- Large model (llama3.3:70b) takes time
- Consider using qwen2.5-coder:7b for faster responses
- Check Ollama logs: `docker logs <ollama_container>`
**If Qdrant slow:**
- Too many documents
- Re-index with better chunking
- Check disk I/O: `iostat -x 1`
---
### Issue: High CPU Usage
**Symptoms:**
- Server sluggish
- Game servers lagging
**Diagnosis:**
```bash
htop
# Identify which service using CPU
```
**Solution:**
```bash
# Set CPU limits in docker-compose.yml
# Add to each service (YAML, indentation matters):
#   deploy:
#     resources:
#       limits:
#         cpus: '2.0'
# Restart
docker-compose down
docker-compose up -d
```
---
## 💾 DATA ISSUES
### Issue: Backup Failed
**Symptoms:**
- No backup created today
- Backup log shows errors
**Diagnosis:**
```bash
tail -50 /var/log/firefrost-backup.log
```
**Common causes:**
**Database dump fails:**
```bash
# Check database running
docker-compose ps db
# Test manual dump (no -t: a TTY can insert carriage returns into the dump)
docker exec $(docker ps -qf "name=db") pg_dumpall -c -U postgres > /tmp/test.sql
```
**Transfer to Command Center fails:**
```bash
# Check SSH access
ssh root@63.143.34.217 echo "Connection OK"
# Check disk space on Command Center
ssh root@63.143.34.217 "df -h"
```
**Solution:**
- Fix specific error in log
- Run backup manually: `/opt/firefrost_backup.sh`
- Verify completes successfully
---
### Issue: Git Conflicts
**Symptoms:**
- Merge fails with conflict error
- Can't push to ai-proposals
**Diagnosis:**
```bash
cd /opt/firefrost-codex/git-repos/main
git status
git log --oneline -5
```
**Solution:**
```bash
# Manual resolution required
cd /opt/firefrost-codex/git-repos/main
git checkout main
git pull origin main
# Resolve conflicts manually
nano <conflicted_file>
# Commit resolution
git add .
git commit -m "Resolve conflicts"
git push origin main
# Recreate ai-proposals branch from main
# (WARNING: discards any unmerged proposals on the old branch)
git branch -D ai-proposals
git checkout -b ai-proposals
git push origin ai-proposals --force
```
---
## 🚨 EMERGENCY PROCEDURES
### Complete System Lockup
**If everything is broken:**
1. **Stop all services:**
```bash
cd /opt/firefrost-codex
docker-compose down
```
2. **Check system health:**
```bash
df -h # Disk space
free -h # Memory
dmesg | tail -50 # System errors
```
3. **Restart everything:**
```bash
systemctl restart docker
systemctl restart nginx
cd /opt/firefrost-codex
docker-compose up -d
```
4. **If still broken:** Restore from backup (see RECOVERY.md)
---
### Data Corruption Suspected
**If data seems wrong/corrupted:**
1. **Stop making changes immediately**
2. **Document what you see**
3. **Check recent backups exist:**
```bash
ls -lh /opt/firefrost_codex_*.tar.gz
```
4. **Review RECOVERY.md** for restore procedures
5. **Consider rolling back to last known good state**
---
## 📞 WHEN TO ESCALATE
**These issues require manual intervention:**
- Git conflicts requiring code review
- Database corruption (check integrity)
- SSL certificate renewal failure (manual renewal)
- Persistent service crashes (review logs, may need code changes)
- Unknown errors not covered in this guide
**For unknown issues:**
1. Document symptoms thoroughly
2. Collect logs
3. Review all documentation
4. Wait for fresh Chronicler session with full context
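
Collecting logs is easier when it is a single command. The sketch below gathers the outputs this guide already relies on into one directory; `collect_diagnostics` and the output file names are hypothetical, and the 500-line log tail is an arbitrary choice.

```bash
#!/usr/bin/env bash
# collect_diagnostics OUTDIR
# Saves a snapshot of system and container state into OUTDIR.
# Every command is allowed to fail so that a broken service
# does not stop the rest of the collection.
collect_diagnostics() {
  local out=$1
  mkdir -p "$out" || return 1
  date > "$out/collected_at.txt"
  df -h > "$out/disk.txt" 2>&1 || true
  free -h > "$out/memory.txt" 2>&1 || true
  (cd /opt/firefrost-codex && docker-compose ps) > "$out/compose_ps.txt" 2>&1 || true
  (cd /opt/firefrost-codex && docker-compose logs --tail=500) > "$out/compose_logs.txt" 2>&1 || true
  systemctl status nginx --no-pager > "$out/nginx_status.txt" 2>&1 || true
}
```

Example: `collect_diagnostics "/tmp/codex-diag-$(date +%Y%m%d-%H%M%S)"`, then attach the directory to the notes for the next session.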
---
## 🔧 USEFUL DEBUG COMMANDS
```bash
# Full system status
docker-compose ps && systemctl status nginx && df -h && free -h
# All logs from the last 24 hours (the --since flag requires Docker Compose v2)
docker-compose logs --since 24h
# Follow live logs
docker-compose logs -f
# Restart single service without affecting others
docker-compose restart <service_name>
# Force rebuild of service
docker-compose up -d --force-recreate <service_name>
# Clean everything and start fresh (NUCLEAR OPTION: removes named volumes
# and all unused images; take a backup first)
docker-compose down -v
docker system prune -a
# Then redeploy from scratch
# Check network connectivity (-c 3 stops after three packets)
docker exec -it $(docker ps -qf "name=dify-api") ping -c 3 host.docker.internal
docker exec -it $(docker ps -qf "name=n8n") ping -c 3 qdrant
```
---
**Fire + Frost + Foundation = Where Problems Get Solved** 💙🔥❄️

@@ -0,0 +1,570 @@
# VERIFICATION AND TESTING GUIDE
**Complete testing procedures to verify deployment success**
Run EVERY test before marking deployment complete.
---
## ✅ VERIFICATION CHECKLIST
**All items must pass:**
- [ ] Infrastructure Tests (7 tests)
- [ ] Query Accuracy Tests (5 tests)
- [ ] Update Workflow Tests (6 tests)
- [ ] RBAC Tests (4 tests)
- [ ] Self-Healing Tests (3 tests)
- [ ] Monitoring Tests (3 tests)
- [ ] Backup Tests (2 tests)
**Total: 30 tests**
---
## 🏗️ INFRASTRUCTURE TESTS
### Test 1.1: Docker Services Running
```bash
docker-compose ps
```
**Expected:** All 7 services show "Up"
- db
- redis
- dify-api
- dify-worker
- dify-web
- qdrant
- n8n
**Pass criteria:** No services in "Exit" or "Restarting" state
---
### Test 1.2: Nginx Running
```bash
systemctl status nginx
```
**Expected:** "active (running)"
```bash
nginx -t
```
**Expected:** "syntax is ok" and "test is successful"
**Pass criteria:** Nginx active with valid configuration
---
### Test 1.3: SSL Certificates Valid
```bash
curl -I https://codex.firefrostgaming.com
```
**Expected:** HTTP/2 200, valid SSL certificate
```bash
curl -I https://n8n.firefrostgaming.com
```
**Expected:** HTTP/2 200, valid SSL certificate
**Pass criteria:** Both domains respond with HTTPS
---
### Test 1.4: Dify UI Accessible
**Action:** Open browser to https://codex.firefrostgaming.com
**Expected:** Dify interface loads, can login
**Pass criteria:** UI functional, no errors
---
### Test 1.5: n8n UI Accessible
**Action:** Open browser to https://n8n.firefrostgaming.com
**Expected:** n8n interface loads, can login
**Pass criteria:** UI functional, workflows visible
---
### Test 1.6: Ollama Connection
```bash
docker exec -it $(docker ps -qf "name=dify-api") \
curl http://host.docker.internal:11434/api/version
```
**Expected:** JSON response with Ollama version
**Pass criteria:** Dify can reach Ollama on host
---
### Test 1.7: Qdrant Healthy
```bash
curl http://127.0.0.1:6333/
```
**Expected:** JSON response with Qdrant version
**Pass criteria:** Qdrant responding, no errors
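
The command-line checks in this section can be batched so that none is skipped. This is a minimal sketch of a pass/fail runner; `run_test` and `print_summary` are hypothetical helpers, and the browser-based tests (1.4 and 1.5) still have to be done by hand.

```bash
#!/usr/bin/env bash
# Hypothetical mini test runner for the command-line checks.
PASS=0
FAIL=0

# run_test NAME COMMAND [ARGS...]
# Runs COMMAND quietly and records whether it exited 0.
run_test() {
  local name=$1
  shift
  if "$@" > /dev/null 2>&1; then
    PASS=$((PASS + 1))
    echo "PASS  $name"
  else
    FAIL=$((FAIL + 1))
    echo "FAIL  $name"
  fi
}

# print_summary
# Prints the totals and returns non-zero if any test failed.
print_summary() {
  echo "Passed: $PASS  Failed: $FAIL"
  [ "$FAIL" -eq 0 ]
}
```

Possible usage: `run_test "1.2 Nginx config" nginx -t`, `run_test "1.3 Codex HTTPS" curl -fsSI https://codex.firefrostgaming.com`, `run_test "1.7 Qdrant" curl -fsS http://127.0.0.1:6333/`, followed by `print_summary`.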
---
## 🎯 QUERY ACCURACY TESTS
### Test 2.1: Current Tasks Query
**Query in Dify:** "What are the current Tier 0 tasks?"
**Expected answer includes:**
- Whitelist Manager
- NC1 Cleanup
- Staff Recruitment Launch
**Must NOT include:**
- Initial Server Setup
- Network Configuration
- Other archived tasks
**Pass criteria:** Returns ONLY current tasks, no archived content
---
### Test 2.2: Server Information Query
**Query:** "What servers does Firefrost Gaming operate?"
**Expected answer includes:**
- Command Center (63.143.34.217)
- Billing VPS (38.68.14.188)
- Panel VPS (45.94.168.138)
- Ghost VPS (64.50.188.14)
- TX1 Dallas (38.68.14.26)
- NC1 Charlotte (216.239.104.130)
**Pass criteria:** All 6 servers listed with correct IPs
---
### Test 2.3: Recent Work Query
**Query:** "What was accomplished in the most recent Codex deployment session?"
**Expected answer includes:**
- Deployer's work (Chronicler #20)
- Phase 1 deployment
- AnythingLLM setup
- Document sync
**Pass criteria:** Returns information about Deployer, not older sessions
---
### Test 2.4: Specific Document Query
**Query:** "What is the Frostwall Protocol?"
**Expected:** Accurate description from current documentation
**Must NOT:** Return placeholder text or "I don't have that information"
**Pass criteria:** Correct, detailed answer from knowledge base
---
### Test 2.5: Archive Exclusion Test
**Query:** "Tell me about the initial server setup tasks"
**Expected:** Either:
- "That information is archived" OR
- Returns current setup process (not old archived version)
**Must NOT:** Return old archived setup documentation as if it's current
**Pass criteria:** Archived content not treated as current
---
## 📝 UPDATE WORKFLOW TESTS
### Test 3.1: Valid Update Request
**As Meg in Dify:**
**Request:** "Update docs/test/verification.md with content: Test update at [current timestamp]"
**Expected sequence:**
1. AI calls update_codex tool
2. Validation passes
3. Commits to ai-proposals branch
4. Discord notification appears in #codex-alerts
5. Notification has "Approve & Merge" and "Reject" buttons
**Pass criteria:** All 5 steps complete successfully
---
### Test 3.2: Approval Workflow
**As Michael in Discord:**
**Action:** Click "Approve & Merge" button on proposal
**Expected:**
1. Button acknowledges click
2. n8n merges to main
3. Pushes to Gitea
4. Re-indexes Dify
5. Success notification in Discord
**Verify on TX1:**
```bash
cd /opt/firefrost-codex/git-repos/main
git log -1
cat docs/test/verification.md
```
**Expected:** Latest commit is the AI update, file contains test content
**Pass criteria:** Full approval workflow works end-to-end
---
### Test 3.3: Protected File Block
**As Meg:**
**Request:** "Update .env file to change the database password"
**Expected:**
1. AI attempts update
2. Validation BLOCKS it
3. Discord shows "Access Restricted" message
4. Clear explanation why it's blocked
**Pass criteria:** Protected file cannot be modified
---
### Test 3.4: Invalid Content Block
**As Meg:**
**Request:** "Update docs/test/empty.md with content that is empty"
**Expected:**
1. Validation catches empty content
2. Update blocked
3. Error message in Discord
**Pass criteria:** Validation prevents bad updates
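
Tests 3.3 and 3.4 exercise two validation rules: protected files are refused and empty content is refused. The real checks run inside the n8n workflow, whose implementation is not shown here; the following is only an illustrative sketch of the same two rules in shell. The function name `validate_update` and the exact matching patterns are assumptions.

```bash
#!/usr/bin/env bash
# validate_update PATH CONTENT
# Illustrative only: mirrors the two rules exercised by Tests 3.3 and 3.4.
# Prints a reason and returns 1 when the update must be blocked.
validate_update() {
  local path=$1 content=$2
  # Rule 1: never touch the environment file (assumed pattern)
  case "$path" in
    .env | */.env)
      echo "blocked: protected file"
      return 1
      ;;
  esac
  # Rule 2: refuse content that is empty or whitespace only
  if [ -z "${content//[[:space:]]/}" ]; then
    echo "blocked: empty content"
    return 1
  fi
  echo "ok"
}
```

Running the rules locally like this makes it easier to reason about which requests should reach the ai-proposals branch at all before testing them through Dify.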
---
### Test 3.5: Rejection Workflow
**As Meg:**
**Request:** Make another update request
**As Michael:**
**Action:** Click "Reject" button
**Expected:**
1. Change stays in ai-proposals branch
2. Does NOT merge to main
3. Notification sent that proposal was rejected
**Pass criteria:** Rejection workflow works, main branch unchanged
---
### Test 3.6: Concurrent Update Handling
**Setup:** Two users request updates simultaneously
**Expected:** Queue processes them sequentially, no conflicts
**Pass criteria:** Both updates succeed, no corruption
---
## 🔒 RBAC TESTS
### Test 4.1: Meg Full Access
**As Meg:**
**Test queries:**
1. "What are the Tier 0 tasks?" (should work)
2. "What is the Frostwall Protocol?" (should work - infrastructure doc)
3. "Show me server IPs" (should work - sensitive info)
**Expected:** Meg gets answers to ALL queries
**Pass criteria:** No access restrictions for Meg
---
### Test 4.2: Holly Restricted Access
**As Holly:**
**Test queries:**
1. "What are the Tier 0 tasks?" (should be BLOCKED - not in Pokerole)
2. "What is the Frostwall Protocol?" (should be BLOCKED - infrastructure)
3. "Tell me about the Pokerole campaign" (should work - in her workspace)
**Expected:** Holly ONLY sees Pokerole content
**Pass criteria:** Holly cannot access Firefrost infrastructure docs
---
### Test 4.3: Workspace Switching
**As Meg:**
**Action:** Switch between Firefrost Admin and Pokerole HQ workspaces
**Expected:** Can access both, knowledge base changes based on workspace
**As Holly:**
**Action:** Attempt to access Firefrost Admin workspace
**Expected:** BLOCKED - workspace not available
**Pass criteria:** Workspace permissions enforced
---
### Test 4.4: Update Permissions
**As Holly:**
**Request:** "Update Firefrost recruitment doc"
**Expected:** BLOCKED - she can only update Pokerole docs
**As Meg:**
**Request:** Same update
**Expected:** Works - she has permission
**Pass criteria:** Update permissions match knowledge base access
---
## 🔄 SELF-HEALING TESTS
### Test 5.1: Dify Crash Recovery
**Test:**
```bash
docker-compose restart dify-api
```
**Monitor:**
1. Uptime Kuma shows service down
2. Docker auto-restarts service
3. Service comes back online within 60 seconds
4. Users can access Codex again
**Pass criteria:** Automatic recovery without manual intervention
---
### Test 5.2: Git Unreachable Handling
**Test:**
```bash
# Block Git temporarily
sudo iptables -A OUTPUT -d git.firefrostgaming.com -j DROP
```
**Request update as Meg**
**Expected:**
1. Commit fails (Git unreachable)
2. Update queued for retry
3. Discord shows "Update queued, will retry in 5 minutes"
4. No error crash
**Restore:**
```bash
sudo iptables -D OUTPUT -d git.firefrostgaming.com -j DROP
```
**Verify:** Queued update processes automatically
**Pass criteria:** Graceful degradation, auto-recovery
---
### Test 5.3: Qdrant Rebuild
**Test:**
```bash
cd /opt/firefrost-codex
docker-compose stop qdrant
rm -rf volumes/qdrant/storage/*
docker-compose start qdrant
```
**Action:** Trigger Git sync workflow in n8n
**Expected:**
1. Qdrant rebuilds vectors from Git
2. Queries work again within 10 minutes
3. No data loss
**Pass criteria:** Vector DB can be rebuilt from Git
---
## 📊 MONITORING TESTS
### Test 6.1: Uptime Kuma Monitors
**Action:** Check Uptime Kuma dashboard
**Expected monitors present:**
- Firefrost Codex (Dify)
- Firefrost n8n
- Firefrost Qdrant
**All showing:** Green (UP status)
**Pass criteria:** All 3 monitors healthy
---
### Test 6.2: Discord Notifications
**Test:** Trigger Git sync workflow manually
**Expected:** Success notification in #codex-alerts channel
**Verify notification includes:**
- Green color (success)
- File count
- Timestamp
- System info in footer
**Pass criteria:** Discord webhook working, notifications formatted correctly
---
### Test 6.3: Critical Alert Test
**Test:** Stop Dify for 15+ minutes
**Expected:**
1. Uptime Kuma detects down
2. Critical alert sent to #system-critical
3. Michael mentioned in message
4. Clear action steps provided
**Pass criteria:** Critical alerts work, Michael gets notified
---
## 💾 BACKUP TESTS
### Test 7.1: Manual Backup
**Test:**
```bash
/opt/firefrost_backup.sh
```
**Expected:**
1. Backup completes without errors
2. Tarball created in /opt/
3. Transferred to Command Center
4. Backup log updated
**Verify:**
```bash
ls -lh /opt/firefrost_codex_*.tar.gz
ssh root@63.143.34.217 "ls -lh /root/backups/firefrost-codex/ | tail -1"
```
**Pass criteria:** Backup created and transferred successfully
---
### Test 7.2: Backup Restore Test
**Test:**
```bash
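# Create test database
docker run --name test-postgres -e POSTGRES_PASSWORD=test -d postgres:15-alpine
# Wait until Postgres accepts TCP connections (initialisation takes a few seconds)
until docker exec test-postgres pg_isready -h 127.0.0.1 -U postgres; do sleep 2; done
# Extract the most recent backup
cd /tmp
tar -xzf "$(ls -t /opt/firefrost_codex_*.tar.gz | head -1)"
cd codex_backup_*/
# Restore into test database
cat dify_postgres.sql | docker exec -i test-postgres psql -U postgres
# Check for errors (Postgres logs to stderr, so merge it into the pipe)
docker logs test-postgres 2>&1 | grep ERROR
# Cleanup
docker rm -f test-postgres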
```
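**Expected:** No errors during restore, apart from messages about the built-in `postgres` role or databases that do not exist yet (a dump created with `pg_dumpall -c` tries to drop them first)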
**Pass criteria:** Backup is valid and restorable
---
## 🎉 FINAL VERIFICATION
**After ALL tests pass:**
### Final Checklist
- [ ] All 30 tests completed
- [ ] Zero failures
- [ ] All issues documented
- [ ] Meg can work independently
- [ ] Holly can work independently
- [ ] Michael can approve from Discord
- [ ] System self-heals common failures
- [ ] Backups running automatically
- [ ] Monitoring active and alerting
- [ ] Documentation updated
### Sign-Off
**Deployed by:** ____________________
**Date:** ____________________
**Session:** ____________________
**Signature:** This deployment is COMPLETE and PRODUCTION-READY
---
**Fire + Frost + Foundation = Where Testing Ensures Quality** 💙🔥❄️