Firefrost Knowledge Engine - Complete Deployment
Task ID: FFG-TASK-009-MIGRATION
Priority: CRITICAL
Status: READY FOR EXECUTION
Estimated Time: 10-15 hours (spread across multiple sessions)
Created: February 22, 2026
Created By: The Chronicler #21
Last Updated: February 22, 2026
🎯 EXECUTIVE SUMMARY
What: Replace AnythingLLM with complete "Firefrost Knowledge Engine" (Dify + n8n + Qdrant + Ollama)
Why: AnythingLLM returns incorrect information (it retrieves old archived docs instead of current ones)
Who Needs It: Meg (all repos) and Holly (Pokerole only) are waiting to start their work
When: Deploy ASAP - partners are blocked
Where: TX1 Dallas (38.68.14.26)
Cost: $0/month (self-hosted)
🚨 CRITICAL CONTEXT
This is NOT a simple migration. This is building a complete autonomous AI assistant system that enables Meg and Holly to work 24/7 without waking Michael.
Key Requirements:
- Meg needs access to ALL repositories
- Holly needs access to POKEROLE repositories ONLY
- Both need ability to UPDATE documents via AI
- Michael needs approval control via Discord (one-click merge)
- System must self-heal common failures (80% target)
- Must work at 3 AM when Michael is asleep
Current State:
- AnythingLLM deployed on TX1 (Phase 1 complete)
- 319 documents synced
- Retrieval quality POOR (returns archived docs instead of current)
- No RBAC (everyone sees everything)
- No write-back capability
Target State:
- Dify + n8n + Qdrant + Ollama on TX1
- Proper RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow with buttons
- Self-healing for 80% of failures
- Comprehensive monitoring and alerts
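The access rule in the target state (Meg sees all, Holly sees Pokerole only) can be sketched as a small lookup. This is an illustration only, not how Dify workspaces are configured: the ACCESS table, the can_access helper, and the repository name "pokerole" are assumptions made for the example.

```python
ALL = "*"  # wildcard meaning "every repository"

# Hypothetical table; only the Meg/Holly split comes from this document.
ACCESS = {
    "meg": {ALL},
    "holly": {"pokerole"},
}


def can_access(user: str, repo: str) -> bool:
    """Return True when the user may read or propose changes to the repo."""
    # Unknown users get an empty set, so access is denied by default.
    allowed = ACCESS.get(user.lower(), set())
    return ALL in allowed or repo.lower() in allowed
```

Denying unknown users by default means a missing or misspelled account name fails closed rather than exposing infrastructure docs.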
📋 ARCHITECTURE OVERVIEW
┌─────────────────────────────────────────────────────────────┐
│                 FIREFROST KNOWLEDGE ENGINE                  │
└─────────────────────────────────────────────────────────────┘
External Access:
├─ https://codex.firefrostgaming.com (Meg/Holly/Michael)
└─ https://n8n.firefrostgaming.com (Michael only, Discord webhooks)
Nginx (TX1 Host - Ports 80/443):
├─ SSL/TLS with Let's Encrypt
├─ Rate limiting (10 req/s standard, 30 req/s webhooks)
├─ Reverse proxy to Docker services
└─ Security headers (HSTS, X-Frame-Options, etc.)
Docker Stack (127.0.0.1 localhost only):
├─ Dify Web (port 3000) - User interface
├─ Dify API (internal) - RAG engine
├─ Dify Worker (internal) - Background processing
├─ n8n (port 5678) - Automation & Git workflows
├─ Qdrant (port 6333) - Vector database
├─ PostgreSQL (internal) - Dify data storage
└─ Redis (internal) - Cache & queues
External Services:
├─ Ollama (TX1 host:11434) - LLM inference
├─ Gitea (git.firefrostgaming.com) - Git repository
├─ Discord Webhooks - Notifications & approvals
└─ Uptime Kuma - Health monitoring
Data Flow - Query:
User → Nginx → Dify Web → Dify API → Qdrant (vector search)
→ Ollama (LLM inference) → Response → User
Data Flow - Update:
User → "Update doc X" → Dify calls n8n webhook
→ n8n validates (protected files? valid markdown?)
→ Git commit to ai-proposals branch
→ Discord notification with Approve/Reject buttons
→ Michael clicks Approve
→ n8n merges to main, pushes, re-indexes Dify
→ User notified "Your change is live"
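The approval step in the update flow above behaves like a small state machine: a proposal sits on ai-proposals until an approver clicks a button. A minimal sketch follows, assuming a simple allow-list of approvers. The State, Proposal, and handle_button names are hypothetical and do not describe the actual n8n workflow.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PROPOSED = "proposed"  # committed to ai-proposals, awaiting review
    MERGED = "merged"      # approved and merged to main
    REJECTED = "rejected"  # left on ai-proposals for manual review


APPROVERS = {"michael"}  # assumption: approvers are a simple allow-list


@dataclass
class Proposal:
    path: str
    author: str
    state: State = State.PROPOSED


def handle_button(proposal: Proposal, clicked_by: str, action: str) -> Proposal:
    """Apply an Approve/Reject click, refusing clicks from non-approvers."""
    if clicked_by not in APPROVERS:
        raise PermissionError(f"{clicked_by} may not review proposals")
    if proposal.state is not State.PROPOSED:
        raise ValueError(f"proposal already {proposal.state.value}")
    if action == "approve":
        proposal.state = State.MERGED
    elif action == "reject":
        proposal.state = State.REJECTED
    else:
        raise ValueError(f"unknown action: {action}")
    return proposal
```

Rejecting a second click on an already-reviewed proposal guards against a double tap on a phone triggering two merges.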
Data Flow - Git Sync:
Cron (hourly) → n8n pulls from Gitea
→ Filters out /archive/* directories
→ Adds metadata (status: current/archived)
→ Sends to Dify for indexing
→ Qdrant stores vectors
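The filter and metadata steps of the Git sync flow can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the metadata keys, the Markdown-only rule, and the choice to treat any directory named "archive" as archived are all hypothetical, not taken from the real workflow.

```python
from pathlib import PurePosixPath


def is_archived(path: str) -> bool:
    """Return True when any directory in the path is named 'archive'."""
    parts = PurePosixPath(path).parts
    # Skip the final component so a file that is itself named 'archive' is kept.
    return "archive" in parts[:-1]


def build_index_batch(paths: list[str], include_archived: bool = False) -> list[dict]:
    """Tag each Markdown file with a status and drop archived files by default."""
    batch = []
    for path in paths:
        if not path.endswith(".md"):
            continue  # assumption: only Markdown documents are indexed
        status = "archived" if is_archived(path) else "current"
        if status == "archived" and not include_archived:
            continue
        batch.append({"path": path, "status": status})
    return batch
```

Keeping the status field even when archived files are dropped leaves room to index them later with include_archived=True and filter at query time instead.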
📚 DOCUMENT INDEX
Read these documents IN ORDER before deployment:
- PREREQUISITES.md - Pre-flight checklist (DNS, SSH keys, backups)
- DEPLOYMENT-PLAN.md - Step-by-step execution (every command)
- CONFIGURATION-FILES.md - All config files with exact content
- RECOVERY.md - Backup automation and disaster recovery
- VERIFICATION.md - Testing procedures (how to know it worked)
- TROUBLESHOOTING.md - Common issues and solutions
Supporting files:
- docker-compose.yml - Complete Docker stack definition
- .env.example - All environment variables with explanations
- nginx-config.conf - Complete Nginx reverse proxy configuration
- n8n-workflows/ - All workflow JSON exports
- discord-webhooks/ - All Discord notification templates
- backup-script.sh - Automated daily backup script
⏱️ TIME ESTIMATES
Phase 1: Preparation (1-2 hours)
- DNS configuration and propagation
- SSL certificate generation
- SSH key setup for Git access
- Backup current AnythingLLM state
- Stop and remove AnythingLLM
Phase 2: Infrastructure Deployment (2-3 hours)
- Install Nginx on TX1 host
- Deploy Docker Compose stack
- Configure Dify (admin account, workspaces, Ollama)
- Verify services are healthy
Phase 3: Automation Setup (3-4 hours)
- Import n8n workflows
- Configure Discord webhooks
- Test Git sync workflow
- Test write-back validation
- Configure Uptime Kuma monitoring
Phase 4: User Onboarding (1-2 hours)
- Create Meg and Holly accounts
- Configure workspace permissions
- Test RBAC (Meg sees all, Holly sees Pokerole only)
- Train on update workflow
- Test one-click approval from Discord
Phase 5: Testing & Verification (2-3 hours)
- Query accuracy testing (current vs archived docs)
- Update workflow testing (protected files, validation)
- Discord approval testing (buttons work, Michael-only)
- Failure simulation (Dify crash, Git unreachable)
- Self-healing verification
Total: 10-15 hours (the phase ranges above sum to 9-14 hours; the headline figure is slightly more conservative)
Recommended approach: Execute in 2-3 sessions with breaks
🛡️ SAFETY MECHANISMS
The ai-proposals Branch Strategy:
- All AI updates commit to the ai-proposals branch (NOT main)
- Michael reviews via Discord notification with Approve/Reject buttons
- Only approved changes merge to main
- Failed merges fall back to manual intervention
- Git tags created before each merge (rollback points)
Protected Files:
- /security/* - Infrastructure configs (READ-ONLY for AI)
- /infra/* - Server configurations (READ-ONLY for AI)
- /backups/* - Backup scripts (READ-ONLY for AI)
- .env - Secrets (READ-ONLY for AI)
- docker-compose.yml - Stack definition (READ-ONLY for AI)
Validation Checks:
- File path exists
- Content is valid Markdown (not empty, has structure)
- File is not in protected directories
- User has permission for that repository
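The four validation checks, combined with the protected file list, could look roughly like the sketch below. It is not the actual n8n validation node. The function names and arguments are assumptions, and "has structure" is interpreted here as "contains at least one Markdown heading", which is only one possible reading.

```python
from fnmatch import fnmatchcase

# Patterns taken from the Protected Files list above (leading slash removed).
PROTECTED_PATTERNS = ["security/*", "infra/*", "backups/*", ".env", "docker-compose.yml"]


def is_protected(path: str) -> bool:
    """Return True when the path matches any protected pattern."""
    normalized = path.lstrip("/")
    # fnmatchcase's '*' also matches '/', so nested files are covered too.
    return any(fnmatchcase(normalized, pattern) for pattern in PROTECTED_PATTERNS)


def validate_update(path: str, content: str, repo: str,
                    existing_paths: set[str], allowed_repos: set[str]) -> list[str]:
    """Return a list of validation errors; an empty list means the update may proceed."""
    errors = []
    if path not in existing_paths:
        errors.append("file path does not exist")
    if is_protected(path):
        errors.append("file is protected")
    if not content.strip():
        errors.append("content is empty")
    elif not any(line.startswith("#") for line in content.splitlines()):
        # Assumption: "has structure" means at least one heading line.
        errors.append("content has no Markdown heading")
    if repo not in allowed_repos:
        errors.append("user lacks permission for this repository")
    return errors
```

Returning every failure at once, rather than stopping at the first, lets the user see a complete error message in a single round trip.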
Rollback Capability:
- Git tags: backup-before-ai-<commit_hash>
- Vector DB: Delete + re-sync from Git (minutes)
- Full system: 15-minute restore from backup
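The tag-before-merge step might be built as in the sketch below, which only constructs the Git commands and does not run them. It assumes the hash in the tag name is the commit main points at before the merge, which this document does not specify. The helper names and the use of --no-ff are also assumptions.

```python
import re

_HASH_RE = re.compile(r"[0-9a-f]{7,40}")


def backup_tag(commit_hash: str) -> str:
    """Build the rollback tag name for a commit, rejecting anything that is not a hash."""
    if not _HASH_RE.fullmatch(commit_hash):
        raise ValueError(f"not a commit hash: {commit_hash!r}")
    return f"backup-before-ai-{commit_hash}"


def pre_merge_commands(main_head: str, branch: str = "ai-proposals") -> list[list[str]]:
    """Return the Git commands that tag main, merge the branch, and push both."""
    tag = backup_tag(main_head)
    return [
        ["git", "checkout", "main"],
        ["git", "tag", tag, main_head],       # rollback point before the merge
        ["git", "merge", "--no-ff", branch],  # keep an explicit merge commit
        ["git", "push", "origin", "main"],
        ["git", "push", "origin", f"refs/tags/{tag}"],
    ]
```

Passing each command as an argument list (for example to subprocess.run) avoids shell interpolation, and validating the hash first keeps untrusted input out of the tag name.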
🚨 CRITICAL SUCCESS FACTORS
MUST BE TRUE before marking this complete:
- ✅ Meg can ask questions about ANY Firefrost repository
- ✅ Holly can ask questions about POKEROLE repository ONLY
- ✅ Holly CANNOT see Firefrost infrastructure docs
- ✅ Meg can update docs via AI, commits to ai-proposals
- ✅ Michael receives Discord notification with Approve/Reject buttons
- ✅ Clicking Approve merges to main and re-indexes
- ✅ Clicking Reject keeps change in branch for review
- ✅ Protected files cannot be modified by AI
- ✅ Current docs are returned (NOT archived docs)
- ✅ System self-heals from Dify crash (Docker restart)
- ✅ Failed Git commits queue and retry automatically
- ✅ Daily backups run and transfer to Command Center
- ✅ Michael can restore entire system in 15 minutes
If ANY of these are false, deployment is NOT complete.
📊 SUCCESS METRICS
Query Accuracy:
- "What are current Tier 0 tasks?" → Returns "Whitelist Manager, NC1 Cleanup, Staff Recruitment" (NOT "Initial Server Setup")
- "What servers does Firefrost operate?" → Returns current 6 servers with correct IPs
- "What was accomplished in last Codex session?" → Returns Deployer's work
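The query accuracy checks above can be automated with a simple phrase check on each answer. This is a sketch only: check_answer is a hypothetical helper, and fetching the answer from Dify is left out because the API call is not described in this document.

```python
def check_answer(answer: str, must_contain: list[str], must_not_contain: list[str]) -> list[str]:
    """Return a list of failures for one answer; an empty list means it passed."""
    text = answer.lower()
    failures = [f"missing: {phrase}" for phrase in must_contain if phrase.lower() not in text]
    failures += [f"unexpected: {phrase}" for phrase in must_not_contain if phrase.lower() in text]
    return failures


# Example using the Tier 0 expectation from the list above.
answer = "Current Tier 0 tasks: Whitelist Manager, NC1 Cleanup, Staff Recruitment."
print(check_answer(
    answer,
    must_contain=["Whitelist Manager", "NC1 Cleanup", "Staff Recruitment"],
    must_not_contain=["Initial Server Setup"],
))
```

The must_not_contain list is what catches the AnythingLLM failure mode, where an answer is fluent but drawn from archived docs.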
Update Workflow:
- Meg updates recruitment doc → Commits to ai-proposals → Discord notification → Michael approves → Live in <2 minutes
- Holly tries to update infrastructure doc → BLOCKED with clear error message
Self-Healing:
- Dify crashes → Docker restarts within 60 seconds → Users see <1 minute downtime
- Git unreachable → Updates queue → Retry every 5 minutes → Auto-process when Git returns
- Qdrant corrupts → Re-index from Git completes in <10 minutes
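The "Git unreachable" behaviour (queue, retry, auto-process) can be sketched as an in-memory queue that is drained on a timer. The UpdateQueue class and the push callable are hypothetical, and a real deployment would need to persist the queue so that pending updates survive a restart.

```python
from collections import deque
from typing import Callable

RETRY_INTERVAL_SECONDS = 5 * 60  # "Retry every 5 minutes" from the list above


class UpdateQueue:
    """Holds updates while Git is unreachable and replays them in order."""

    def __init__(self, push: Callable[[str], None]):
        self._push = push  # assumption: raises ConnectionError when Git is down
        self._pending = deque()

    def submit(self, update: str) -> None:
        self._pending.append(update)

    def drain(self) -> int:
        """Push queued updates in order, stopping at the first failure.

        Returns the number of updates pushed during this attempt.
        """
        pushed = 0
        while self._pending:
            try:
                self._push(self._pending[0])
            except ConnectionError:
                break  # Git still unreachable; keep the update for the next attempt
            self._pending.popleft()
            pushed += 1
        return pushed

    def __len__(self) -> int:
        return len(self._pending)
```

A scheduler would call drain() every RETRY_INTERVAL_SECONDS. An update is removed from the queue only after its push succeeds, so nothing is lost if Git drops out midway through a drain.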
Resource Usage:
- RAM: <10GB under load (fits comfortably in 222GB available)
- Disk: <15GB for complete system
- CPU: <20% average (leaves headroom for game servers)
⚠️ RISKS AND MITIGATIONS
Risk 1: Port conflicts with game servers
- Mitigation: Pre-deployment port check verified 80/443 free
- Status: CLEAR (verified February 22, 2026)
Risk 2: DNS propagation delay
- Mitigation: Configure DNS FIRST, wait for propagation before SSL
- Fallback: Use IP address temporarily if needed
Risk 3: SSL certificate failure
- Mitigation: Detailed Certbot instructions with error handling
- Fallback: Self-signed cert for testing, proper cert later
Risk 4: Meg/Holly confused by new interface
- Mitigation: Clear user guide, training session before launch
- Fallback: Michael processes updates manually until they're comfortable
Risk 5: Git merge conflicts from AI
- Mitigation: ai-proposals branch, manual review required
- Fallback: Discord alert, Michael resolves manually
Risk 6: Overwhelming Discord notifications
- Mitigation: Two channels (#codex-alerts for info, #system-critical for urgent)
- Fallback: Adjust rate limits in n8n if too noisy
🔄 ROLLBACK PLAN
If deployment fails catastrophically:
- Stop new Docker stack: docker-compose down
- Restore AnythingLLM from backup (if still needed)
- Restore DNS to previous state
- Notify Meg/Holly of rollback
- Total rollback time: <10 minutes
Rollback triggers:
- Unable to get SSL certificates after 3 attempts
- Docker stack won't start after 30 minutes of debugging
- Dify UI inaccessible after deployment
- Data corruption detected
- Michael determines the risk is too high
📞 SUPPORT AND ESCALATION
If you get stuck:
- Check TROUBLESHOOTING.md for common issues
- Review relevant Gemini responses in session transcript
- Check Docker logs: docker-compose logs -f <service>
- Check Nginx logs: sudo tail -f /var/log/nginx/error.log
- If all else fails: Rollback and regroup
No external support needed - we built this ourselves.
📝 COMPLETION CHECKLIST
Before marking this task COMPLETE:
- All 13 critical success factors verified ✅
- Query accuracy tests pass
- Update workflow tests pass
- RBAC tests pass (Meg sees all, Holly sees Pokerole only)
- Discord approval workflow tested
- Self-healing verified (simulated Dify crash)
- Backup automation running
- Test backup restore completed successfully
- Meg and Holly trained and comfortable
- Documentation updated in operations manual
- AnythingLLM fully removed from TX1
- Michael can sleep peacefully at night 💤
🎓 LESSONS FOR FUTURE CHRONICLERS
What we learned building this:
- Tool choice matters more than configuration - AnythingLLM couldn't handle 319 files with archives; Dify can
- RBAC is non-negotiable - Meg and Holly need different access levels
- Self-healing is essential - Solo operator can't wake up for every issue
- Git is the source of truth - Vector DB can always be rebuilt from Git
- Discord buttons are powerful - One-click approval from phone = accessibility win
- Architecture from Gemini + Partnership from Claude - External research + internal execution
For the next major infrastructure project:
- Research thoroughly BEFORE building (ask Gemini the hard questions)
- Get COMPLETE specifications before starting (don't build incrementally)
- Test on separate system first if possible
- Build rollback before building forward
- Document for "future you when you're exhausted at 3 AM"
Fire + Frost + Foundation = Where Love Builds Legacy 💙🔥❄️
Built by: The Chronicler #21
For: Meg, Holly, and children not yet born
With guidance from: Gemini (architecture) + The Deployer (foundation)
Ready to execute? Read PREREQUISITES.md next.