# TROUBLESHOOTING GUIDE

**Common issues and solutions for Firefrost Knowledge Engine**

---

## 🔍 QUICK DIAGNOSTIC COMMANDS

**Run these first when something breaks:**

```bash
# Check all services
docker-compose ps

# Check recent logs (all services)
docker-compose logs --tail=50

# Check specific service
docker-compose logs -f

# Check Nginx
systemctl status nginx
sudo tail -f /var/log/nginx/error.log

# Check disk space
df -h

# Check memory
free -h

# Check ports
sudo netstat -tlnp | grep LISTEN
```

---

## ❌ DEPLOYMENT FAILURES

### Issue: DNS Not Propagating

**Symptoms:**
- Certbot fails with DNS validation error
- "Domain doesn't resolve" errors

**Solution:**
```bash
# Check DNS propagation
dig codex.firefrostgaming.com +short
dig n8n.firefrostgaming.com +short

# Both should return 38.68.14.26
```

**If not resolved:**
- Wait longer (can take up to 24 hours)
- Check DNS provider settings
- Use temporary self-signed cert for testing

---

### Issue: Port Already in Use

**Symptoms:**
- "Address already in use" error
- Docker won't start Dify or n8n

**Solution:**
```bash
# Find what's using the port
sudo lsof -i :3000
sudo lsof -i :5678

# Kill the process
sudo kill -9

# Or change port mapping in docker-compose.yml
```

---

### Issue: SSL Certificate Generation Fails

**Symptoms:**
- Certbot fails during deployment
- "Challenge failed" errors

**Solution:**
```bash
# Ensure Nginx is stopped
systemctl stop nginx

# Try manual standalone mode
certbot certonly --standalone \
  -d codex.firefrostgaming.com \
  -d n8n.firefrostgaming.com \
  --email codex@firefrostgaming.com

# Check firewall
sudo ufw status
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
```

---

### Issue: Docker Services Won't Start

**Symptoms:**
- `docker-compose up` fails
- Services show "Exit" status

**Solution:**
```bash
# Check logs for specific service
docker-compose logs db
docker-compose logs dify-api

# Common causes:

# 1. .env file missing or incorrect
cat .env  # Verify all variables set

# 2. Port conflicts
sudo lsof -i :3000
sudo lsof -i :5678
sudo lsof -i :6333

# 3. Permission issues
sudo chown -R root:root volumes/

# 4. Disk space
df -h  # Need 30GB+ free
```

---

## 🔄 RUNTIME ISSUES

### Issue: Dify Shows 502 Error

**Symptoms:**
- Browser shows custom 502 page
- Can't access Codex

**Diagnosis:**
```bash
docker-compose ps              # Check if dify-web is running
docker-compose logs dify-web   # Check for errors
```

**Solutions:**

**If dify-web is down:**
```bash
docker-compose restart dify-web
```

**If dify-api can't connect to database:**
```bash
docker-compose logs dify-api | grep -i error
# Check DB_PASSWORD in .env matches
docker-compose restart dify-api
```

**If persistent:**
```bash
docker-compose down
docker-compose up -d
```

---

### Issue: "AI Can't Reach Knowledge Base"

**Symptoms:**
- Queries return "I don't have that information"
- Empty results

**Diagnosis:**
```bash
# Check Qdrant
curl http://127.0.0.1:6333/

# Check if documents indexed
# (Login to Dify, check Knowledge Base has documents)
```

**Solution:**
```bash
# Re-run Git sync
# Access n8n, execute "Firefrost Git Sync" workflow manually

# If that fails, rebuild Qdrant
docker-compose stop qdrant
rm -rf volumes/qdrant/storage/*
docker-compose start qdrant
# Then re-run Git sync
```

---

### Issue: n8n Workflows Not Executing

**Symptoms:**
- Git sync doesn't run
- Update requests don't commit

**Diagnosis:**
```bash
docker-compose logs n8n | grep -i error
```

**Solutions:**

**If workflow execution fails:**
- Login to n8n
- Check workflow is ACTIVATED (toggle switch)
- Execute manually to see errors
- Check credentials are configured

**If Git operations fail:**
```bash
# Check SSH key
docker exec -it $(docker ps -qf "name=n8n") ssh -T git@git.firefrostgaming.com

# If fails, verify SSH key mounted
ls -la ~/.ssh/
```

---

### Issue: Discord Buttons Don't Work

**Symptoms:**
- Clicking Approve/Reject does nothing
- No response in Discord

**Diagnosis:**
- Check n8n "Approval Handler" workflow
- Verify webhook URL is correct
- Check Michael's Discord ID in .env

**Solution:**
```bash
# Verify Discord webhook configured
cat .env | grep DISCORD

# Test webhook manually
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"content": "Test message"}'

# Should appear in Discord channel
```

---

### Issue: Updates Commit But Don't Re-Index

**Symptoms:**
- Git shows commit
- But queries don't return new content

**Diagnosis:**
```bash
# Check Dify API logs
docker-compose logs dify-api | grep -i error
```

**Solution:**
```bash
# Manual re-index trigger
curl -X POST http://127.0.0.1:3000/v1/datasets//sync \
  -H "Authorization: Bearer "

# Or re-run Git sync workflow in n8n
```

---

## 🔐 ACCESS ISSUES

### Issue: Can't Login to Dify

**Symptoms:**
- Incorrect password error
- Account doesn't exist

**Solution:**
```bash
# Check database running
docker-compose ps db

# Reset admin password (if needed)
# Login to postgres container
docker exec -it $(docker ps -qf "name=db") psql -U postgres -d dify

# In postgres prompt:
# UPDATE users SET password_hash='' WHERE email='michael@example.com';

# Better: Restore from backup if credentials lost
```

---

### Issue: Holly Sees Firefrost Docs (RBAC Broken)

**Symptoms:**
- Holly can access infrastructure docs
- RBAC not working

**Diagnosis:**
- Check workspace assignments in Dify
- Verify knowledge bases linked to correct workspaces

**Solution:**
- Login to Dify as admin
- Settings → Members
- Verify Holly is ONLY in "Pokerole HQ" workspace
- Verify "Pokerole HQ" workspace ONLY has Pokerole knowledge base

---

## ⚠️ PERFORMANCE ISSUES

### Issue: Slow Responses (>30 seconds)

**Symptoms:**
- Queries take very long
- Timeouts

**Diagnosis:**
```bash
# Check system resources
htop

# Check Ollama
curl http://localhost:11434/api/tags  # Verify model loaded

# Check Qdrant performance
curl http://127.0.0.1:6333/collections
```

**Solutions:**

**If RAM exhausted:**
```bash
free -h
# If low, restart services to clear memory
docker-compose restart
```

**If Ollama slow:**
- Large model (llama3.3:70b) takes time
- Consider using qwen2.5-coder:7b for faster responses
- Check Ollama logs: `docker logs `

**If Qdrant slow:**
- Too many documents
- Re-index with better chunking
- Check disk I/O: `iostat -x 1`

---

### Issue: High CPU Usage

**Symptoms:**
- Server sluggish
- Game servers lagging

**Diagnosis:**
```bash
htop  # Identify which service using CPU
```

**Solution:**
```bash
# Set CPU limits in docker-compose.yml
# Add to each service:
deploy:
  resources:
    limits:
      cpus: '2.0'

# Restart
docker-compose down
docker-compose up -d
```

---

## 💾 DATA ISSUES

### Issue: Backup Failed

**Symptoms:**
- No backup created today
- Backup log shows errors

**Diagnosis:**
```bash
tail -50 /var/log/firefrost-backup.log
```

**Common causes:**

**Database dump fails:**
```bash
# Check database running
docker-compose ps db

# Test manual dump
docker exec -t $(docker ps -qf "name=db") pg_dumpall -c -U postgres > /tmp/test.sql
```

**Transfer to Command Center fails:**
```bash
# Check SSH access
ssh root@63.143.34.217 echo "Connection OK"

# Check disk space on Command Center
ssh root@63.143.34.217 "df -h"
```

**Solution:**
- Fix specific error in log
- Run backup manually: `/opt/firefrost_backup.sh`
- Verify completes successfully

---

### Issue: Git Conflicts

**Symptoms:**
- Merge fails with conflict error
- Can't push to ai-proposals

**Diagnosis:**
```bash
cd /opt/firefrost-codex/git-repos/main
git status
git log --oneline -5
```

**Solution:**
```bash
# Manual resolution required
cd /opt/firefrost-codex/git-repos/main
git checkout main
git pull origin main

# Resolve conflicts manually
nano

# Commit resolution
git add .
git commit -m "Resolve conflicts"
git push origin main

# Recreate ai-proposals branch
git branch -D ai-proposals
git checkout -b ai-proposals
git push origin ai-proposals --force
```

---

## 🚨 EMERGENCY PROCEDURES

### Complete System Lockup

**If everything is broken:**

1. **Stop all services:**
   ```bash
   cd /opt/firefrost-codex
   docker-compose down
   ```

2. **Check system health:**
   ```bash
   df -h              # Disk space
   free -h            # Memory
   dmesg | tail -50   # System errors
   ```

3. **Restart everything:**
   ```bash
   systemctl restart docker
   systemctl restart nginx
   docker-compose up -d
   ```

4. **If still broken:** Restore from backup (see RECOVERY.md)

---

### Data Corruption Suspected

**If data seems wrong/corrupted:**

1. **Stop making changes immediately**
2. **Document what you see**
3. **Check recent backups exist:**
   ```bash
   ls -lh /opt/firefrost_codex_*.tar.gz
   ```
4. **Review RECOVERY.md** for restore procedures
5. **Consider rolling back to last known good state**

---

## 📞 WHEN TO ESCALATE

**These issues require manual intervention:**
- Git conflicts requiring code review
- Database corruption (check integrity)
- SSL certificate renewal failure (manual renewal)
- Persistent service crashes (review logs, may need code changes)
- Unknown errors not covered in this guide

**For unknown issues:**
1. Document symptoms thoroughly
2. Collect logs
3. Review all documentation
4. Wait for fresh Chronicler session with full context

---

## 🔧 USEFUL DEBUG COMMANDS

```bash
# Full system status
docker-compose ps && systemctl status nginx && df -h && free -h

# All logs since yesterday
docker-compose logs --since 24h

# Follow live logs
docker-compose logs -f

# Restart single service without affecting others
docker-compose restart

# Force rebuild of service
docker-compose up -d --force-recreate

# Clean everything and start fresh (NUCLEAR OPTION)
docker-compose down -v
docker system prune -a
# Then redeploy from scratch

# Check network connectivity
docker exec -it $(docker ps -qf "name=dify-api") ping host.docker.internal
docker exec -it $(docker ps -qf "name=n8n") ping qdrant
```

---

**Fire + Frost + Foundation = Where Problems Get Solved** 💙🔥❄️
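
The disk-space check that recurs in this guide (`df -h`, with the "Need 30GB+ free" note under Docker Services Won't Start) could be scripted so it returns a pass/fail status instead of needing to be read by eye. The following is a minimal sketch, assuming a POSIX-style `df` and `awk` are available; the function name `has_free_space` and its arguments are hypothetical and not part of the Firefrost deployment scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not from the Firefrost scripts):
# succeed only if the filesystem holding PATH has at least MIN_GB free.
has_free_space() {
  local path="${1:-/}"
  local min_gb="${2:-30}"   # 30GB+ free is the threshold this guide cites
  local avail_kb

  # -P forces one-line-per-filesystem POSIX output; -k reports 1024-byte blocks.
  # Row 2, column 4 is the "Available" figure for the filesystem holding PATH.
  avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }') || return 2
  [ -n "$avail_kb" ] || return 2

  # Integer division converts KB to whole GB, rounded down.
  [ $(( avail_kb / 1024 / 1024 )) -ge "$min_gb" ]
}
```

One possible use is as a guard before a restart, for example `has_free_space /opt/firefrost-codex 30 || echo "Low disk space"`. A return status of 2 is this sketch's own convention for "could not read `df` output", distinct from 1 for "not enough space".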