# Firefrost Knowledge Engine - Complete Deployment

**Task ID:** FFG-TASK-009-MIGRATION  
**Priority:** CRITICAL  
**Status:** READY FOR EXECUTION  
**Estimated Time:** 10-15 hours (spread across multiple sessions)  
**Created:** February 22, 2026  
**Created By:** The Chronicler #21  
**Last Updated:** February 22, 2026

---

## 🎯 EXECUTIVE SUMMARY

**What:** Replace AnythingLLM with complete "Firefrost Knowledge Engine" (Dify + n8n + Qdrant + Ollama)

**Why:** AnythingLLM returns incorrect information (searches old archived docs instead of current)

**Who Needs It:** Meg (all repos) and Holly (Pokerole only) are waiting to start their work

**When:** Deploy ASAP - partners are blocked

**Where:** TX1 Dallas (38.68.14.26)

**Cost:** $0/month (self-hosted)

---

## 🚨 CRITICAL CONTEXT

**This is NOT a simple migration.** This is building a complete autonomous AI assistant system that enables Meg and Holly to work 24/7 without waking Michael.

**Key Requirements:**
- Meg needs access to ALL repositories
- Holly needs access to POKEROLE repositories ONLY
- Both need ability to UPDATE documents via AI
- Michael needs approval control via Discord (one-click merge)
- System must self-heal common failures (80% target)
- Must work at 3 AM when Michael is asleep

**Current State:**
- AnythingLLM deployed on TX1 (Phase 1 complete)
- 319 documents synced
- Retrieval quality POOR (returns archived docs instead of current)
- No RBAC (everyone sees everything)
- No write-back capability

**Target State:**
- Dify + n8n + Qdrant + Ollama on TX1
- Proper RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow with buttons
- Self-healing for 80% of failures
- Comprehensive monitoring and alerts

---

## 📋 ARCHITECTURE OVERVIEW

```
┌─────────────────────────────────────────────────────────────┐
│                 FIREFROST KNOWLEDGE ENGINE                  │
└─────────────────────────────────────────────────────────────┘

External Access:
├─ https://codex.firefrostgaming.com (Meg/Holly/Michael)
└─ https://n8n.firefrostgaming.com (Michael only, Discord webhooks)

Nginx (TX1 Host - Ports 80/443):
├─ SSL/TLS with Let's Encrypt
├─ Rate limiting (10 req/s standard, 30 req/s webhooks)
├─ Reverse proxy to Docker services
└─ Security headers (HSTS, X-Frame-Options, etc.)

Docker Stack (127.0.0.1 localhost only):
├─ Dify Web (port 3000) - User interface
├─ Dify API (internal) - RAG engine
├─ Dify Worker (internal) - Background processing
├─ n8n (port 5678) - Automation & Git workflows
├─ Qdrant (port 6333) - Vector database
├─ PostgreSQL (internal) - Dify data storage
└─ Redis (internal) - Cache & queues

External Services:
├─ Ollama (TX1 host:11434) - LLM inference
├─ Gitea (git.firefrostgaming.com) - Git repository
├─ Discord Webhooks - Notifications & approvals
└─ Uptime Kuma - Health monitoring

Data Flow - Query:
User → Nginx → Dify Web → Dify API
  → Qdrant (vector search)
  → Ollama (LLM inference)
  → Response → User

Data Flow - Update:
User → "Update doc X"
  → Dify calls n8n webhook
  → n8n validates (protected files? valid markdown?)
  → Git commit to ai-proposals branch
  → Discord notification with Approve/Reject buttons
  → Michael clicks Approve
  → n8n merges to main, pushes, re-indexes Dify
  → User notified "Your change is live"

Data Flow - Git Sync:
Cron (hourly)
  → n8n pulls from Gitea
  → Filters out /archive/* directories
  → Adds metadata (status: current/archived)
  → Sends to Dify for indexing
  → Qdrant stores vectors
```

---

## 📚 DOCUMENT INDEX

**Read these documents IN ORDER before deployment:**

1. **PREREQUISITES.md** - Pre-flight checklist (DNS, SSH keys, backups)
2. **DEPLOYMENT-PLAN.md** - Step-by-step execution (every command)
3. **CONFIGURATION-FILES.md** - All config files with exact content
4. **RECOVERY.md** - Backup automation and disaster recovery
5. **VERIFICATION.md** - Testing procedures (how to know it worked)
6. **TROUBLESHOOTING.md** - Common issues and solutions

**Supporting files:**
- `docker-compose.yml` - Complete Docker stack definition
- `.env.example` - All environment variables with explanations
- `nginx-config.conf` - Complete Nginx reverse proxy configuration
- `n8n-workflows/` - All workflow JSON exports
- `discord-webhooks/` - All Discord notification templates
- `backup-script.sh` - Automated daily backup script

---

## ⏱️ TIME ESTIMATES

**Phase 1: Preparation (1-2 hours)**
- DNS configuration and propagation
- SSL certificate generation
- SSH key setup for Git access
- Backup current AnythingLLM state
- Stop and remove AnythingLLM

**Phase 2: Infrastructure Deployment (2-3 hours)**
- Install Nginx on TX1 host
- Deploy Docker Compose stack
- Configure Dify (admin account, workspaces, Ollama)
- Verify services are healthy

**Phase 3: Automation Setup (3-4 hours)**
- Import n8n workflows
- Configure Discord webhooks
- Test Git sync workflow
- Test write-back validation
- Configure Uptime Kuma monitoring

**Phase 4: User Onboarding (1-2 hours)**
- Create Meg and Holly accounts
- Configure workspace permissions
- Test RBAC (Meg sees all, Holly sees Pokerole only)
- Train on update workflow
- Test one-click approval from Discord

**Phase 5: Testing & Verification (2-3 hours)**
- Query accuracy testing (current vs archived docs)
- Update workflow testing (protected files, validation)
- Discord approval testing (buttons work, Michael-only)
- Failure simulation (Dify crash, Git unreachable)
- Self-healing verification

**Total: 10-15 hours**

**Recommended approach:** Execute in 2-3 sessions with breaks

---

## 🛡️ SAFETY MECHANISMS

**The ai-proposals Branch Strategy:**
- All AI updates commit to `ai-proposals` branch (NOT main)
- Michael reviews via Discord notification with Approve/Reject buttons
- Only approved changes merge to main
- Failed merges fall back to manual intervention
- Git tags created before each merge (rollback points)

**Protected Files:**
- `/security/*` - Infrastructure configs (READ-ONLY for AI)
- `/infra/*` - Server configurations (READ-ONLY for AI)
- `/backups/*` - Backup scripts (READ-ONLY for AI)
- `.env` - Secrets (READ-ONLY for AI)
- `docker-compose.yml` - Stack definition (READ-ONLY for AI)

**Validation Checks:**
- File path exists
- Content is valid Markdown (not empty, has structure)
- File is not in protected directories
- User has permission for that repository

**Rollback Capability:**
- Git tags: `backup-before-ai-`
- Vector DB: Delete + re-sync from Git (minutes)
- Full system: 15-minute restore from backup

---

## 🚨 CRITICAL SUCCESS FACTORS

**MUST BE TRUE before marking this complete:**

1. ✅ Meg can ask questions about ANY Firefrost repository
2. ✅ Holly can ask questions about POKEROLE repository ONLY
3. ✅ Holly CANNOT see Firefrost infrastructure docs
4. ✅ Meg can update docs via AI, commits to ai-proposals
5. ✅ Michael receives Discord notification with Approve/Reject buttons
6. ✅ Clicking Approve merges to main and re-indexes
7. ✅ Clicking Reject keeps change in branch for review
8. ✅ Protected files cannot be modified by AI
9. ✅ Current docs are returned (NOT archived docs)
10. ✅ System self-heals from Dify crash (Docker restart)
11. ✅ Failed Git commits queue and retry automatically
12. ✅ Daily backups run and transfer to Command Center
13. ✅ Michael can restore entire system in 15 minutes

**If ANY of these are false, deployment is NOT complete.**

---

## 📊 SUCCESS METRICS

**Query Accuracy:**
- "What are current Tier 0 tasks?" → Returns "Whitelist Manager, NC1 Cleanup, Staff Recruitment" (NOT "Initial Server Setup")
- "What servers does Firefrost operate?" → Returns current 6 servers with correct IPs
- "What was accomplished in last Codex session?" → Returns Deployer's work

**Update Workflow:**
- Meg updates recruitment doc → Commits to ai-proposals → Discord notification → Michael approves → Live in <2 minutes
- Holly tries to update infrastructure doc → BLOCKED with clear error message

**Self-Healing:**
- Dify crashes → Docker restarts within 60 seconds → Users see <1 minute downtime
- Git unreachable → Updates queue → Retry every 5 minutes → Auto-process when Git returns
- Qdrant corrupts → Re-index from Git completes in <10 minutes

**Resource Usage:**
- RAM: <10GB under load (fits comfortably in 222GB available)
- Disk: <15GB for complete system
- CPU: <20% average (leaves headroom for game servers)

---

## ⚠️ RISKS AND MITIGATIONS

**Risk 1: Port conflicts with game servers**
- **Mitigation:** Pre-deployment port check verified 80/443 free
- **Status:** CLEAR (verified February 22, 2026)

**Risk 2: DNS propagation delay**
- **Mitigation:** Configure DNS FIRST, wait for propagation before SSL
- **Fallback:** Use IP address temporarily if needed

**Risk 3: SSL certificate failure**
- **Mitigation:** Detailed Certbot instructions with error handling
- **Fallback:** Self-signed cert for testing, proper cert later

**Risk 4: Meg/Holly confused by new interface**
- **Mitigation:** Clear user guide, training session before launch
- **Fallback:** Michael processes updates manually until they're comfortable

**Risk 5: Git merge conflicts from AI**
- **Mitigation:** ai-proposals branch, manual review required
- **Fallback:** Discord alert, Michael resolves manually

**Risk 6: Overwhelming Discord notifications**
- **Mitigation:** Two channels (#codex-alerts for info, #system-critical for urgent)
- **Fallback:** Adjust rate limits in n8n if too noisy

---

## 🔄 ROLLBACK PLAN

**If deployment fails catastrophically:**

1. Stop new Docker stack: `docker-compose down`
2. Restore AnythingLLM from backup (if still needed)
3. Restore DNS to previous state
4. Notify Meg/Holly of rollback
5. Total rollback time: <10 minutes

**Rollback triggers:**
- Unable to get SSL certificates after 3 attempts
- Docker stack won't start after 30 minutes debugging
- Dify UI inaccessible after deployment
- Data corruption detected
- Michael determines risk too high

---

## 📞 SUPPORT AND ESCALATION

**If you get stuck:**

1. Check TROUBLESHOOTING.md for common issues
2. Review relevant Gemini responses in session transcript
3. Check Docker logs: `docker-compose logs -f `
4. Check Nginx logs: `sudo tail -f /var/log/nginx/error.log`
5. If all else fails: Rollback and regroup

**No external support needed - we built this ourselves.**

---

## 📝 COMPLETION CHECKLIST

**Before marking this task COMPLETE:**

- [ ] All 13 critical success factors verified ✅
- [ ] Query accuracy tests pass
- [ ] Update workflow tests pass
- [ ] RBAC tests pass (Meg sees all, Holly sees Pokerole only)
- [ ] Discord approval workflow tested
- [ ] Self-healing verified (simulated Dify crash)
- [ ] Backup automation running
- [ ] Test backup restore completed successfully
- [ ] Meg and Holly trained and comfortable
- [ ] Documentation updated in operations manual
- [ ] AnythingLLM fully removed from TX1
- [ ] Michael can sleep peacefully at night 💤

---

## 🎓 LESSONS FOR FUTURE CHRONICLERS

**What we learned building this:**

1. **Tool choice matters more than configuration** - AnythingLLM couldn't handle 319 files with archives, Dify can
2. **RBAC is non-negotiable** - Meg and Holly need different access levels
3. **Self-healing is essential** - Solo operator can't wake up for every issue
4. **Git is the source of truth** - Vector DB can always be rebuilt from Git
5. **Discord buttons are powerful** - One-click approval from phone = accessibility win
6. **Architecture from Gemini + Partnership from Claude** - External research + internal execution

**For the next major infrastructure project:**
- Research thoroughly BEFORE building (ask Gemini the hard questions)
- Get COMPLETE specifications before starting (don't build incrementally)
- Test on separate system first if possible
- Build rollback before building forward
- Document for "future you when you're exhausted at 3 AM"

---

**Fire + Frost + Foundation = Where Love Builds Legacy** 💙🔥❄️

**Built by:** The Chronicler #21  
**For:** Meg, Holly, and children not yet born  
**With guidance from:** Gemini (architecture) + The Deployer (foundation)

---

**Ready to execute? Read PREREQUISITES.md next.**
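
---

**Appendix sketch: write-back validation**

The validation checks listed under Safety Mechanisms (path exists, valid Markdown, not protected, user has permission) could be sketched roughly as below. This is an illustration in Python, not the actual n8n workflow. The function names (`normalize`, `is_protected`, `validate_update`), the `allowed_prefixes` permission model, and the rule that "has structure" means "contains at least one heading line" are all assumptions made for the example.

```python
import posixpath
from fnmatch import fnmatchcase

# Patterns copied from the Protected Files list in this document.
PROTECTED_DIR_PATTERNS = ["/security/*", "/infra/*", "/backups/*"]
PROTECTED_FILE_NAMES = {".env", "docker-compose.yml"}


def normalize(path: str) -> str:
    """Collapse '..' segments and extra slashes into a rooted repo path."""
    return posixpath.normpath("/" + path.lstrip("/"))


def is_protected(path: str) -> bool:
    """Return True if the path is read-only for the AI."""
    normalized = normalize(path)
    # Assumption: protected file names are blocked in any directory.
    if posixpath.basename(normalized) in PROTECTED_FILE_NAMES:
        return True
    return any(fnmatchcase(normalized, p) for p in PROTECTED_DIR_PATTERNS)


def validate_update(path, content, existing_paths, allowed_prefixes):
    """Return a list of human-readable errors; an empty list means OK."""
    normalized = normalize(path)
    errors = []
    if normalized not in existing_paths:
        errors.append("file path does not exist")
    if is_protected(normalized):
        errors.append("file is protected (read-only for AI)")
    if not content.strip():
        errors.append("content is empty")
    elif not any(line.lstrip().startswith("#") for line in content.splitlines()):
        # Assumption: "has structure" is approximated by one heading line.
        errors.append("content has no Markdown heading")
    if not any(normalized.startswith(prefix) for prefix in allowed_prefixes):
        errors.append("user lacks permission for this repository")
    return errors
```

Under these assumptions, a request from Holly would be checked with something like `allowed_prefixes=["/pokerole/"]` (a hypothetical prefix), so an attempt to edit `infra/tx1.md` would come back with both a protection error and a permission error. Normalizing the path first means `docs/../infra/tx1.md` is caught as well.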
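
---

**Appendix sketch: archive filtering during Git sync**

The Git Sync data flow tags each document as current or archived and keeps `/archive/*` directories out of the index. A minimal Python sketch of that filtering step follows. The function names `status_for` and `select_for_indexing`, the dictionary shape, and the rule that any directory named exactly `archive` marks a file as archived are assumptions for illustration, not the real n8n node logic.

```python
from pathlib import PurePosixPath


def status_for(path: str) -> str:
    """Label a repo path 'archived' if any parent directory is 'archive'."""
    directories = PurePosixPath(path).parts[:-1]  # drop the file name
    return "archived" if "archive" in directories else "current"


def select_for_indexing(paths):
    """Attach status metadata and keep only current documents."""
    tagged = [{"path": p, "status": status_for(p)} for p in paths]
    return [doc for doc in tagged if doc["status"] == "current"]
```

Only directory components are checked, so a file such as `docs/archive.md` would still be indexed while `docs/archive/old-tasks.md` would not. Whether archived documents should be dropped entirely or indexed with `status: archived` and filtered at query time is a design choice this document does not settle; the sketch takes the first option.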