firefrost-operations-manual/docs/tasks/firefrost-codex-migration-to-open-webui/TROUBLESHOOTING.md
The Chronicler #21 2e953ce312 feat: Complete Firefrost Knowledge Engine deployment plan
- Comprehensive task documentation for migrating from AnythingLLM to Dify+n8n+Qdrant
- 8 detailed documents covering every aspect of deployment
- Complete step-by-step commands (zero assumptions)
- Prerequisites checklist (20 items)
- Deployment plan in 2 parts (11 phases, every command)
- Configuration files (all configs with exact content)
- Recovery procedures (4 disaster scenarios)
- Verification guide (30 tests, complete checklist)
- Troubleshooting guide (common issues + solutions)

Built by: The Chronicler #21
For: Meg, Holly, and children not yet born
Time investment: 10-15 hours execution time
Purpose: Enable Meg/Holly autonomous work with Git write-back

This deployment enables:
- RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow (one-click merge)
- Self-healing (80% of failures)
- Automated daily backups
- Complete monitoring

Documentation is so detailed that any future Chronicler can execute
this deployment with zero prior knowledge and complete confidence.

Fire + Frost + Foundation = Where Love Builds Legacy
2026-02-22 09:55:13 +00:00


TROUBLESHOOTING GUIDE

Common issues and solutions for Firefrost Knowledge Engine


🔍 QUICK DIAGNOSTIC COMMANDS

Run these first when something breaks:

# Check all services
docker-compose ps

# Check recent logs (all services)
docker-compose logs --tail=50

# Check specific service
docker-compose logs -f <service_name>

# Check Nginx
systemctl status nginx
sudo tail -f /var/log/nginx/error.log

# Check disk space
df -h

# Check memory
free -h

# Check ports
sudo netstat -tlnp | grep LISTEN
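
The disk check above can be turned into a pass/fail test. This is a minimal sketch, assuming the 30 GB minimum quoted under "Docker Services Won't Start" applies to the filesystem that holds /opt/firefrost-codex. The function names `disk_free_gb` and `require_free_gb` are hypothetical helpers, not part of the deployment.

```shell
# Print free space (whole GiB) on the filesystem that holds the given path.
disk_free_gb() {
  df -Pk "${1:-/}" | awk 'NR==2 { print int($4 / 1048576) }'
}

# Succeed only if at least <min_gb> GiB are free on <target>.
require_free_gb() {
  target="$1"; min_gb="$2"
  free_gb="$(disk_free_gb "$target")"
  if [ "$free_gb" -ge "$min_gb" ]; then
    echo "OK: ${free_gb}G free on $target"
  else
    echo "LOW: ${free_gb}G free on $target (need ${min_gb}G)"
    return 1
  fi
}

# Example: require_free_gb /opt/firefrost-codex 30
```

`df -P` keeps each filesystem on one line, so the awk field positions stay stable.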

DEPLOYMENT FAILURES

Issue: DNS Not Propagating

Symptoms:

  • Certbot fails with DNS validation error
  • "Domain doesn't resolve" errors

Solution:

# Check DNS propagation
dig codex.firefrostgaming.com +short
dig n8n.firefrostgaming.com +short

# Both should return 38.68.14.26

If not resolved:

  • Wait longer (can take up to 24 hours)
  • Check DNS provider settings
  • Use temporary self-signed cert for testing
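
To avoid re-running dig by hand while waiting, the comparison against the expected address can be scripted. This is a sketch only: `resolve` and `check_dns` are hypothetical helper names, and the expected IP is the one quoted above.

```shell
EXPECTED_IP="38.68.14.26"

# Resolve a hostname with dig and keep the last answer line,
# which is an address when the name resolves.
resolve() {
  dig +short A "$1" | tail -n 1
}

# Report whether each hostname already resolves to EXPECTED_IP.
check_dns() {
  rc=0
  for host in "$@"; do
    ip="$(resolve "$host")"
    if [ "$ip" = "$EXPECTED_IP" ]; then
      echo "OK $host"
    else
      echo "PENDING $host (got: ${ip:-no answer})"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check_dns codex.firefrostgaming.com n8n.firefrostgaming.com
```

The function returns non-zero while any host is still pending, so it can gate the Certbot step.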

Issue: Port Already in Use

Symptoms:

  • "Address already in use" error
  • Docker won't start Dify or n8n

Solution:

# Find what's using the port
sudo lsof -i :3000
sudo lsof -i :5678

# Stop the process (add -9 only if it ignores the default signal)
sudo kill <PID>

# Or change port mapping in docker-compose.yml
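
When lsof is not installed, a rough pre-flight check is still possible. The sketch below assumes bash (it relies on bash's /dev/tcp pseudo-device) and only detects listeners reachable on 127.0.0.1. `port_open` and `report_ports` are hypothetical names.

```shell
# Succeed if something accepts TCP connections on 127.0.0.1:<port>.
# Uses bash's /dev/tcp pseudo-device, so run it with bash, not sh.
port_open() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Report the state of each port passed as an argument.
report_ports() {
  for port in "$@"; do
    if port_open "$port"; then
      echo "$port in use"
    else
      echo "$port free"
    fi
  done
}

# Example: report_ports 3000 5678 6333
```

Unlike lsof, this does not identify the owning process. It only shows whether a port would conflict before `docker-compose up` is attempted.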

Issue: SSL Certificate Generation Fails

Symptoms:

  • Certbot fails during deployment
  • "Challenge failed" errors

Solution:

# Ensure Nginx is stopped (standalone mode needs port 80; start Nginx again afterwards)
systemctl stop nginx

# Try manual standalone mode
certbot certonly --standalone \
  -d codex.firefrostgaming.com \
  -d n8n.firefrostgaming.com \
  --email codex@firefrostgaming.com

# Check firewall
sudo ufw status
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp

Issue: Docker Services Won't Start

Symptoms:

  • docker-compose up fails
  • Services show "Exit" status

Solution:

# Check logs for specific service
docker-compose logs db
docker-compose logs dify-api

# Common causes:
# 1. .env file missing or incorrect
cat .env  # Verify all variables set

# 2. Port conflicts
sudo lsof -i :3000
sudo lsof -i :5678
sudo lsof -i :6333

# 3. Permission issues
sudo chown -R root:root volumes/

# 4. Disk space
df -h  # Need 30GB+ free
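
Reading `.env` by eye makes it easy to miss an empty value, and `cat` prints secrets to the terminal. The sketch below lists only the names that are missing or empty. `missing_env_vars` is a hypothetical helper, and apart from DB_PASSWORD the variable names in the example are illustrative, so substitute the names your `.env` actually defines.

```shell
# Print every required variable that is absent or empty in the env file.
missing_env_vars() {
  env_file="$1"; shift
  for name in "$@"; do
    # Match NAME=<at least one character> at the start of a line.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
    fi
  done
}

# Example (second name is illustrative only):
# missing_env_vars .env DB_PASSWORD DISCORD_WEBHOOK_URL
```

No output means every listed variable has a value.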

🔄 RUNTIME ISSUES

Issue: Dify Shows 502 Error

Symptoms:

  • Browser shows custom 502 page
  • Can't access Codex

Diagnosis:

docker-compose ps
# Check if dify-web is running

docker-compose logs dify-web
# Check for errors

Solutions:

If dify-web is down:

docker-compose restart dify-web

If dify-api can't connect to database:

docker-compose logs dify-api | grep -i error
# Check DB_PASSWORD in .env matches
docker-compose restart dify-api

If persistent:

docker-compose down
docker-compose up -d
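
To see at a glance which containers are down before choosing one of the fixes above, the `docker-compose ps` output can be filtered. This is a sketch that assumes the state text contains "Exit" or "Restarting" and that the container name is the first column. The layout differs between Compose versions, so treat the parsing as an assumption. `failed_containers` is a hypothetical name.

```shell
# Read `docker-compose ps` output on stdin and print the first column
# (container name) of every line that mentions Exit or Restarting.
failed_containers() {
  awk '/Exit|Restarting/ { print $1 }'
}

# Example: docker-compose ps | failed_containers
```

An empty result with a 502 still showing points toward Nginx or the upstream configuration instead of a stopped container.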

Issue: "AI Can't Reach Knowledge Base"

Symptoms:

  • Queries return "I don't have that information"
  • Empty results

Diagnosis:

# Check Qdrant
curl http://127.0.0.1:6333/

# Check if documents indexed
# (Login to Dify, check Knowledge Base has documents)

Solution:

# Re-run Git sync
# Access n8n, execute "Firefrost Git Sync" workflow manually

# If that fails, rebuild Qdrant (this deletes every stored vector; the Git sync repopulates them)
docker-compose stop qdrant
rm -rf volumes/qdrant/storage/*
docker-compose start qdrant
# Then re-run Git sync
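
Before wiping storage, it may be worth confirming whether Qdrant holds any collections at all. The sketch below extracts collection names from the JSON returned by Qdrant's `/collections` endpoint using only grep and sed. `collection_names` is a hypothetical helper, and the parsing assumes collection names contain no double quotes.

```shell
# Read the GET /collections JSON on stdin and print one collection
# name per line. Plain grep/sed is used so no JSON tool is required.
collection_names() {
  grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Example: curl -s http://127.0.0.1:6333/collections | collection_names
```

An empty list suggests nothing was ever indexed, in which case re-running the Git sync is the first step and a rebuild is unnecessary.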

Issue: n8n Workflows Not Executing

Symptoms:

  • Git sync doesn't run
  • Update requests don't commit

Diagnosis:

docker-compose logs n8n | grep -i error

Solutions:

If workflow execution fails:

  • Login to n8n
  • Check workflow is ACTIVATED (toggle switch)
  • Execute manually to see errors
  • Check credentials are configured

If Git operations fail:

# Check SSH key
docker exec -it $(docker ps -qf "name=n8n") ssh -T git@git.firefrostgaming.com

# If fails, verify SSH key mounted
ls -la ~/.ssh/

Issue: Discord Buttons Don't Work

Symptoms:

  • Clicking Approve/Reject does nothing
  • No response in Discord

Diagnosis:

  • Check n8n "Approval Handler" workflow
  • Verify webhook URL is correct
  • Check Michael's Discord ID in .env

Solution:

# Verify Discord webhook configured
grep DISCORD .env

# Test webhook manually
curl -X POST <WEBHOOK_URL> \
  -H "Content-Type: application/json" \
  -d '{"content": "Test message"}'

# Should appear in Discord channel
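
The manual curl test gives no feedback when the message never arrives. The sketch below captures the HTTP status instead. `is_success_status` and `test_webhook` are hypothetical names. Discord normally answers a successful webhook post with 204 No Content, so any 2xx code is treated as success.

```shell
# Succeed for any 2xx HTTP status code.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# POST a test message and report the HTTP status instead of discarding it.
test_webhook() {
  url="$1"
  code="$(curl -s -o /dev/null -w '%{http_code}' -X POST "$url" \
    -H 'Content-Type: application/json' \
    -d '{"content": "Test message"}')"
  if is_success_status "$code"; then
    echo "webhook OK (HTTP $code)"
  else
    echo "webhook FAILED (HTTP $code)"
    return 1
  fi
}

# Example: test_webhook "<WEBHOOK_URL>"
```

A 401 or 404 usually means the webhook URL in `.env` is wrong or was deleted. A status of 000 means curl could not connect at all.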

Issue: Updates Commit But Don't Re-Index

Symptoms:

  • Git shows commit
  • But queries don't return new content

Diagnosis:

# Check Dify API logs
docker-compose logs dify-api | grep -i error

Solution:

# Manual re-index trigger
curl -X POST http://127.0.0.1:3000/v1/datasets/<DATASET_ID>/sync \
  -H "Authorization: Bearer <DIFY_API_KEY>"

# Or re-run Git sync workflow in n8n

🔐 ACCESS ISSUES

Issue: Can't Login to Dify

Symptoms:

  • Incorrect password error
  • Account doesn't exist

Solution:

# Check database running
docker-compose ps db

# Reset admin password (if needed)
# Login to postgres container
docker exec -it $(docker ps -qf "name=db") psql -U postgres -d dify

# In postgres prompt:
# UPDATE users SET password_hash='<new_hash>' WHERE email='michael@example.com';

# Better: Restore from backup if credentials lost

Issue: Holly Sees Firefrost Docs (RBAC Broken)

Symptoms:

  • Holly can access infrastructure docs
  • RBAC not working

Diagnosis:

  • Check workspace assignments in Dify
  • Verify knowledge bases linked to correct workspaces

Solution:

  • Login to Dify as admin
  • Settings → Members
  • Verify Holly is ONLY in "Pokerole HQ" workspace
  • Verify "Pokerole HQ" workspace ONLY has Pokerole knowledge base

⚠️ PERFORMANCE ISSUES

Issue: Slow Responses (>30 seconds)

Symptoms:

  • Queries take very long
  • Timeouts

Diagnosis:

# Check system resources
htop

# Check Ollama
curl http://localhost:11434/api/tags
# Verify the model is listed (downloaded); /api/ps shows models currently loaded in memory

# Check Qdrant performance
curl http://127.0.0.1:6333/collections

Solutions:

If RAM exhausted:

free -h
# If low, restart services to clear memory
docker-compose restart
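
Whether memory is "low" can be decided with a threshold instead of by eye. This sketch assumes a Linux host with /proc/meminfo. The 2048 MiB threshold in the example is illustrative, and `mem_available_mb` and `has_free_memory` are hypothetical names.

```shell
# Print available memory in MiB, read from a meminfo-style file
# (defaults to /proc/meminfo on Linux).
mem_available_mb() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed if at least <min_mb> MiB is available.
has_free_memory() {
  min_mb="$1"
  avail="$(mem_available_mb "${2:-/proc/meminfo}")"
  [ "${avail:-0}" -ge "$min_mb" ]
}

# Example: has_free_memory 2048 || docker-compose restart
```

MemAvailable is the kernel's estimate of memory usable without swapping, which is the same figure `free -h` reports in its "available" column.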

If Ollama slow:

  • Large model (llama3.3:70b) takes time
  • Consider using qwen2.5-coder:7b for faster responses
  • Check Ollama logs: docker logs <ollama_container>

If Qdrant slow:

  • Too many documents
  • Re-index with better chunking
  • Check disk I/O: iostat -x 1

Issue: High CPU Usage

Symptoms:

  • Server sluggish
  • Game servers lagging

Diagnosis:

htop
# Identify which service is using the CPU

Solution:

# Set CPU limits in docker-compose.yml
# Add to each service (older docker-compose v1 releases apply deploy limits only when run with --compatibility):
deploy:
  resources:
    limits:
      cpus: '2.0'

# Restart
docker-compose down
docker-compose up -d

💾 DATA ISSUES

Issue: Backup Failed

Symptoms:

  • No backup created today
  • Backup log shows errors

Diagnosis:

tail -50 /var/log/firefrost-backup.log

Common causes:

Database dump fails:

# Check database running
docker-compose ps db

# Test manual dump (no -t: a TTY can add carriage returns to the dump file)
docker exec $(docker ps -qf "name=db") pg_dumpall -c -U postgres > /tmp/test.sql

Transfer to Command Center fails:

# Check SSH access
ssh root@63.143.34.217 echo "Connection OK"

# Check disk space on Command Center
ssh root@63.143.34.217 "df -h"

Solution:

  • Fix the specific error shown in the log
  • Run the backup manually: /opt/firefrost_backup.sh
  • Verify that it completes successfully
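
The "no backup created today" symptom can be checked directly. This is a sketch that assumes archives land in /opt with the `firefrost_codex_*.tar.gz` naming used elsewhere in this guide. `recent_backups` and `backup_is_fresh` are hypothetical names.

```shell
# List backup archives in <dir> modified within the last 24 hours.
recent_backups() {
  find "${1:-/opt}" -maxdepth 1 -name 'firefrost_codex_*.tar.gz' -mtime -1
}

# Succeed only if at least one recent archive exists.
backup_is_fresh() {
  [ -n "$(recent_backups "$1")" ]
}

# Example: backup_is_fresh /opt || tail -50 /var/log/firefrost-backup.log
```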

Issue: Git Conflicts

Symptoms:

  • Merge fails with conflict error
  • Can't push to ai-proposals

Diagnosis:

cd /opt/firefrost-codex/git-repos/main
git status
git log --oneline -5

Solution:

# Manual resolution required
cd /opt/firefrost-codex/git-repos/main
git checkout main
git pull origin main

# Resolve conflicts manually
nano <conflicted_file>

# Commit resolution
git add .
git commit -m "Resolve conflicts"
git push origin main

# Recreate ai-proposals branch
git branch -D ai-proposals
git checkout -b ai-proposals
git push origin ai-proposals --force
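
Because the resolution ends with `git add .`, a file that still contains conflict markers would be committed as-is. The sketch below lists such files so they can be fixed before the commit step. `files_with_conflict_markers` is a hypothetical helper name.

```shell
# List files under <dir> that still contain Git conflict markers
# (lines starting with <<<<<<< or >>>>>>>), skipping the .git directory.
files_with_conflict_markers() {
  grep -rlE --exclude-dir=.git '^(<<<<<<<|>>>>>>>)( |$)' "${1:-.}"
}

# Example: run before `git add .`
# files_with_conflict_markers /opt/firefrost-codex/git-repos/main
```

No output means no markers were found and the commit can proceed.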

🚨 EMERGENCY PROCEDURES

Complete System Lockup

If everything is broken:

  1. Stop all services:
cd /opt/firefrost-codex
docker-compose down
  2. Check system health:
df -h  # Disk space
free -h  # Memory
dmesg | tail -50  # System errors
  3. Restart everything:
systemctl restart docker
systemctl restart nginx
docker-compose up -d
  4. If still broken: Restore from backup (see RECOVERY.md)

Data Corruption Suspected

If data seems wrong/corrupted:

  1. Stop making changes immediately
  2. Document what you see
  3. Check recent backups exist:
ls -lh /opt/firefrost_codex_*.tar.gz
  4. Review RECOVERY.md for restore procedures
  5. Consider rolling back to last known good state

📞 WHEN TO ESCALATE

These issues require manual intervention:

  • Git conflicts requiring code review
  • Database corruption (check integrity)
  • SSL certificate renewal failure (manual renewal)
  • Persistent service crashes (review logs, may need code changes)
  • Unknown errors not covered in this guide

For unknown issues:

  1. Document symptoms thoroughly
  2. Collect logs
  3. Review all documentation
  4. Wait for fresh Chronicler session with full context

🔧 USEFUL DEBUG COMMANDS

# Full system status (';' keeps going even if one check reports a failure)
docker-compose ps; systemctl status nginx --no-pager; df -h; free -h

# All logs from the last 24 hours
docker-compose logs --since 24h

# Follow live logs
docker-compose logs -f

# Restart single service without affecting others
docker-compose restart <service_name>

# Force recreation of a service's containers (does not rebuild images)
docker-compose up -d --force-recreate <service_name>

# Clean everything and start fresh (NUCLEAR OPTION: -v deletes all volumes, including database and Qdrant data)
docker-compose down -v
docker system prune -a
# Then redeploy from scratch

# Check network connectivity
docker exec -it $(docker ps -qf "name=dify-api") ping -c 3 host.docker.internal
docker exec -it $(docker ps -qf "name=n8n") ping -c 3 qdrant

Fire + Frost + Foundation = Where Problems Get Solved 💙🔥❄️