firefrost-operations-manual/docs/tasks/firefrost-codex-migration-to-open-webui/README.md
The Chronicler #21 2e953ce312 feat: Complete Firefrost Knowledge Engine deployment plan
- Comprehensive task documentation for migrating from AnythingLLM to Dify+n8n+Qdrant
- 8 detailed documents covering every aspect of deployment
- Complete step-by-step commands (zero assumptions)
- Prerequisites checklist (20 items)
- Deployment plan in 2 parts (11 phases, every command)
- Configuration files (all configs with exact content)
- Recovery procedures (4 disaster scenarios)
- Verification guide (30 tests, complete checklist)
- Troubleshooting guide (common issues + solutions)

Built by: The Chronicler #21
For: Meg, Holly, and children not yet born
Time investment: 10-15 hours execution time
Purpose: Enable Meg/Holly autonomous work with Git write-back

This deployment enables:
- RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow (one-click merge)
- Self-healing (80% of failures)
- Automated daily backups
- Complete monitoring

Documentation is so detailed that any future Chronicler can execute
this deployment with zero prior knowledge and complete confidence.

Fire + Frost + Foundation = Where Love Builds Legacy
2026-02-22 09:55:13 +00:00


# Firefrost Knowledge Engine - Complete Deployment
**Task ID:** FFG-TASK-009-MIGRATION
**Priority:** CRITICAL
**Status:** READY FOR EXECUTION
**Estimated Time:** 10-15 hours (spread across multiple sessions)
**Created:** February 22, 2026
**Created By:** The Chronicler #21
**Last Updated:** February 22, 2026
---
## 🎯 EXECUTIVE SUMMARY
**What:** Replace AnythingLLM with complete "Firefrost Knowledge Engine" (Dify + n8n + Qdrant + Ollama)
**Why:** AnythingLLM returns incorrect information (it retrieves old archived docs instead of current ones)
**Who Needs It:** Meg (all repos) and Holly (Pokerole only) are waiting to start their work
**When:** Deploy ASAP - partners are blocked
**Where:** TX1 Dallas (38.68.14.26)
**Cost:** $0/month (self-hosted)
---
## 🚨 CRITICAL CONTEXT
**This is NOT a simple migration.** This is building a complete autonomous AI assistant system that enables Meg and Holly to work 24/7 without waking Michael.
**Key Requirements:**
- Meg needs access to ALL repositories
- Holly needs access to POKEROLE repositories ONLY
- Both need ability to UPDATE documents via AI
- Michael needs approval control via Discord (one-click merge)
- System must self-heal common failures (80% target)
- Must work at 3 AM when Michael is asleep
**Current State:**
- AnythingLLM deployed on TX1 (Phase 1 complete)
- 319 documents synced
- Retrieval quality POOR (returns archived docs instead of current)
- No RBAC (everyone sees everything)
- No write-back capability
**Target State:**
- Dify + n8n + Qdrant + Ollama on TX1
- Proper RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow with buttons
- Self-healing for 80% of failures
- Comprehensive monitoring and alerts
---
## 📋 ARCHITECTURE OVERVIEW
```
┌─────────────────────────────────────────────────────────────┐
│ FIREFROST KNOWLEDGE ENGINE │
└─────────────────────────────────────────────────────────────┘
External Access:
├─ https://codex.firefrostgaming.com (Meg/Holly/Michael)
└─ https://n8n.firefrostgaming.com (Michael only, Discord webhooks)
Nginx (TX1 Host - Ports 80/443):
├─ SSL/TLS with Let's Encrypt
├─ Rate limiting (10 req/s standard, 30 req/s webhooks)
├─ Reverse proxy to Docker services
└─ Security headers (HSTS, X-Frame-Options, etc.)
Docker Stack (127.0.0.1 localhost only):
├─ Dify Web (port 3000) - User interface
├─ Dify API (internal) - RAG engine
├─ Dify Worker (internal) - Background processing
├─ n8n (port 5678) - Automation & Git workflows
├─ Qdrant (port 6333) - Vector database
├─ PostgreSQL (internal) - Dify data storage
└─ Redis (internal) - Cache & queues
External Services:
├─ Ollama (TX1 host:11434) - LLM inference
├─ Gitea (git.firefrostgaming.com) - Git repository
├─ Discord Webhooks - Notifications & approvals
└─ Uptime Kuma - Health monitoring
Data Flow - Query:
User → Nginx → Dify Web → Dify API → Qdrant (vector search)
→ Ollama (LLM inference) → Response → User
Data Flow - Update:
User → "Update doc X" → Dify calls n8n webhook
→ n8n validates (protected files? valid markdown?)
→ Git commit to ai-proposals branch
→ Discord notification with Approve/Reject buttons
→ Michael clicks Approve
→ n8n merges to main, pushes, re-indexes Dify
→ User notified "Your change is live"
Data Flow - Git Sync:
Cron (hourly) → n8n pulls from Gitea
→ Filters out /archive/* directories
→ Adds metadata (status: current/archived)
→ Sends to Dify for indexing
→ Qdrant stores vectors
```
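The Git sync flow above filters on `/archive/*` and attaches a `status` field. A minimal sketch of that classification step, assuming repo-relative POSIX paths and a hypothetical rule that any directory named `archive` marks a file as archived (the function names are illustrative, not part of the n8n workflows):

```python
from pathlib import PurePosixPath


def classify_document(repo_path: str) -> dict:
    """Attach status metadata to one repo-relative path.

    Assumption: a file counts as archived when any parent directory
    is literally named "archive". The file name itself is ignored.
    """
    parent_dirs = PurePosixPath(repo_path).parts[:-1]
    status = "archived" if "archive" in parent_dirs else "current"
    return {"path": repo_path, "status": status}


def select_for_indexing(paths: list[str]) -> list[dict]:
    """Return metadata only for the documents that should reach Dify."""
    tagged = [classify_document(p) for p in paths]
    return [doc for doc in tagged if doc["status"] == "current"]
```

Tagging first and filtering second keeps the archived entries available for logging before they are dropped from indexing.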
---
## 📚 DOCUMENT INDEX
**Read these documents IN ORDER before deployment:**
1. **PREREQUISITES.md** - Pre-flight checklist (DNS, SSH keys, backups)
2. **DEPLOYMENT-PLAN.md** - Step-by-step execution (every command)
3. **CONFIGURATION-FILES.md** - All config files with exact content
4. **RECOVERY.md** - Backup automation and disaster recovery
5. **VERIFICATION.md** - Testing procedures (how to know it worked)
6. **TROUBLESHOOTING.md** - Common issues and solutions
**Supporting files:**
- `docker-compose.yml` - Complete Docker stack definition
- `.env.example` - All environment variables with explanations
- `nginx-config.conf` - Complete Nginx reverse proxy configuration
- `n8n-workflows/` - All workflow JSON exports
- `discord-webhooks/` - All Discord notification templates
- `backup-script.sh` - Automated daily backup script
---
## ⏱️ TIME ESTIMATES
**Phase 1: Preparation (1-2 hours)**
- DNS configuration and propagation
- SSL certificate generation
- SSH key setup for Git access
- Backup current AnythingLLM state
- Stop and remove AnythingLLM
**Phase 2: Infrastructure Deployment (2-3 hours)**
- Install Nginx on TX1 host
- Deploy Docker Compose stack
- Configure Dify (admin account, workspaces, Ollama)
- Verify services are healthy
**Phase 3: Automation Setup (3-4 hours)**
- Import n8n workflows
- Configure Discord webhooks
- Test Git sync workflow
- Test write-back validation
- Configure Uptime Kuma monitoring
**Phase 4: User Onboarding (1-2 hours)**
- Create Meg and Holly accounts
- Configure workspace permissions
- Test RBAC (Meg sees all, Holly sees Pokerole only)
- Train on update workflow
- Test one-click approval from Discord
**Phase 5: Testing & Verification (2-3 hours)**
- Query accuracy testing (current vs archived docs)
- Update workflow testing (protected files, validation)
- Discord approval testing (buttons work, Michael-only)
- Failure simulation (Dify crash, Git unreachable)
- Self-healing verification
**Total: 10-15 hours** (the five phase estimates above sum to 9-14 hours, so the total includes roughly an hour of slack)
**Recommended approach:** Execute in 2-3 sessions with breaks
---
## 🛡️ SAFETY MECHANISMS
**The ai-proposals Branch Strategy:**
- All AI updates commit to `ai-proposals` branch (NOT main)
- Michael reviews via Discord notification with Approve/Reject buttons
- Only approved changes merge to main
- Failed merges fall back to manual intervention
- Git tags created before each merge (rollback points)
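The Approve/Reject review step can be sketched as a message payload plus a click handler. This is an illustration only: the `custom_id` format, the function names, and the approver check are assumptions, and the numeric `type` and `style` values follow Discord's message component conventions (action row = 1, button = 2, success = 3, danger = 4). Discord renders interactive buttons only on messages sent by an application (a bot or an application-owned webhook), so a plain channel webhook may not be enough.

```python
APPROVE, REJECT = "approve", "reject"


def build_approval_message(proposal_id: str, path: str, requested_by: str) -> dict:
    """Build a Discord message body with Approve/Reject buttons."""
    return {
        "content": f"AI proposal {proposal_id}: {requested_by} wants to update `{path}`",
        "components": [
            {
                "type": 1,  # action row
                "components": [
                    {"type": 2, "style": 3, "label": "Approve",
                     "custom_id": f"{APPROVE}:{proposal_id}"},
                    {"type": 2, "style": 4, "label": "Reject",
                     "custom_id": f"{REJECT}:{proposal_id}"},
                ],
            }
        ],
    }


def handle_click(custom_id: str, clicked_by: str, approver_id: str) -> tuple[str, str]:
    """Map a button click to (action, proposal_id); only the approver may act."""
    action, _, proposal_id = custom_id.partition(":")
    if action not in (APPROVE, REJECT):
        raise ValueError(f"unknown action in custom_id: {custom_id!r}")
    if clicked_by != approver_id:
        return ("denied", proposal_id)
    return (action, proposal_id)
```

Encoding the proposal ID in `custom_id` lets the handler stay stateless: the click itself says which proposal to merge or leave on the branch.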
**Protected Files:**
- `/security/*` - Infrastructure configs (READ-ONLY for AI)
- `/infra/*` - Server configurations (READ-ONLY for AI)
- `/backups/*` - Backup scripts (READ-ONLY for AI)
- `.env` - Secrets (READ-ONLY for AI)
- `docker-compose.yml` - Stack definition (READ-ONLY for AI)
**Validation Checks:**
- File path exists
- Content is valid Markdown (not empty, has structure)
- File is not in protected directories
- User has permission for that repository
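The protected-file list and the four validation checks could be combined roughly as follows. This is a sketch under stated assumptions: paths are repo-relative, "has structure" is interpreted as "contains at least one Markdown heading", and `allowed_prefixes` is a hypothetical stand-in for whatever per-user permission model the real workflow uses.

```python
import posixpath
from fnmatch import fnmatchcase

# Patterns taken from the Protected Files list, relative to the repo root.
PROTECTED_PATTERNS = ("security/*", "infra/*", "backups/*", ".env", "docker-compose.yml")


def normalize(path: str) -> str:
    """Collapse '..' and duplicate slashes, then drop any leading slash."""
    return posixpath.normpath(path.strip()).lstrip("/")


def is_protected(path: str) -> bool:
    normalized = normalize(path)
    return any(fnmatchcase(normalized, pattern) for pattern in PROTECTED_PATTERNS)


def looks_like_markdown(content: str) -> bool:
    """Assumed rule: non-empty and contains at least one heading line."""
    lines = [line for line in content.splitlines() if line.strip()]
    return any(line.lstrip().startswith("#") for line in lines)


def validate_update(path, content, existing_paths, allowed_prefixes) -> list[str]:
    """Return a list of validation errors; an empty list means the update may proceed."""
    errors = []
    normalized = normalize(path)
    if normalized.startswith(".."):
        errors.append("path escapes the repository")
    if normalized not in existing_paths:
        errors.append("file does not exist")
    if is_protected(normalized):
        errors.append("file is protected")
    if not looks_like_markdown(content):
        errors.append("content is not valid Markdown")
    if not any(normalized.startswith(prefix) for prefix in allowed_prefixes):
        errors.append("user lacks permission")
    return errors
```

Normalizing before matching matters: without it, a path such as `docs/../infra/nginx.md` would slip past a naive prefix check.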
**Rollback Capability:**
- Git tags: `backup-before-ai-<commit_hash>`
- Vector DB: Delete + re-sync from Git (minutes)
- Full system: 15-minute restore from backup
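The tag-then-merge sequence might look like the sketch below. It only builds the command lists, so they can be inspected before anything runs. Assumptions: `<commit_hash>` is the hash of `main` before the merge, the remote is named `origin`, and the helper names are hypothetical.

```python
import re


def backup_tag(commit_hash: str) -> str:
    """Build the rollback tag name for a given commit hash."""
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_hash):
        raise ValueError(f"not a hex commit hash: {commit_hash!r}")
    return f"backup-before-ai-{commit_hash}"


def approve_commands(main_head: str) -> list[list[str]]:
    """Commands to tag the current main, merge ai-proposals, and push both."""
    tag = backup_tag(main_head)
    return [
        ["git", "checkout", "main"],
        ["git", "tag", tag, main_head],
        ["git", "merge", "--no-ff", "ai-proposals"],
        ["git", "push", "origin", "main", tag],
    ]


def rollback_commands(main_head: str) -> list[list[str]]:
    """Commands to return the local main to the tagged state.

    Publishing the rollback (force push or revert) is deliberately left out.
    """
    return [
        ["git", "checkout", "main"],
        ["git", "reset", "--hard", backup_tag(main_head)],
    ]
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` so that a failed merge stops the sequence and falls back to manual intervention.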
---
## 🚨 CRITICAL SUCCESS FACTORS
**MUST BE TRUE before marking this complete:**
1. ✅ Meg can ask questions about ANY Firefrost repository
2. ✅ Holly can ask questions about POKEROLE repository ONLY
3. ✅ Holly CANNOT see Firefrost infrastructure docs
4. ✅ Meg can update docs via AI, commits to ai-proposals
5. ✅ Michael receives Discord notification with Approve/Reject buttons
6. ✅ Clicking Approve merges to main and re-indexes
7. ✅ Clicking Reject keeps change in branch for review
8. ✅ Protected files cannot be modified by AI
9. ✅ Current docs are returned (NOT archived docs)
10. ✅ System self-heals from Dify crash (Docker restart)
11. ✅ Failed Git commits queue and retry automatically
12. ✅ Daily backups run and transfer to Command Center
13. ✅ Michael can restore entire system in 15 minutes
**If ANY of these are false, deployment is NOT complete.**
---
## 📊 SUCCESS METRICS
**Query Accuracy:**
- "What are current Tier 0 tasks?" → Returns "Whitelist Manager, NC1 Cleanup, Staff Recruitment" (NOT "Initial Server Setup")
- "What servers does Firefrost operate?" → Returns current 6 servers with correct IPs
- "What was accomplished in last Codex session?" → Returns Deployer's work
**Update Workflow:**
- Meg updates recruitment doc → Commits to ai-proposals → Discord notification → Michael approves → Live in <2 minutes
- Holly tries to update infrastructure doc → BLOCKED with clear error message
**Self-Healing:**
- Dify crashes → Docker restarts within 60 seconds → Users see <1 minute downtime
- Git unreachable → Updates queue → Retry every 5 minutes → Auto-process when Git returns
- Qdrant corrupts → Re-index from Git completes in <10 minutes
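The "queue and retry every 5 minutes" behaviour can be illustrated with a small in-memory queue. This is a sketch, not the n8n implementation: the class and field names are hypothetical, the `commit` callable is assumed to raise `ConnectionError` when Git is unreachable, and a real deployment would need to persist the queue across restarts.

```python
from collections import deque
from dataclasses import dataclass

RETRY_INTERVAL_SECONDS = 5 * 60  # "Retry every 5 minutes"


@dataclass
class PendingUpdate:
    path: str
    content: str
    next_attempt: float = 0.0
    attempts: int = 0


class UpdateQueue:
    def __init__(self, commit):
        self._commit = commit  # callable(path, content); raises ConnectionError on failure
        self._pending = deque()

    def submit(self, path: str, content: str, now: float) -> None:
        self._pending.append(PendingUpdate(path, content, next_attempt=now))

    def process(self, now: float) -> int:
        """Try each due update once; return how many were committed."""
        committed = 0
        for _ in range(len(self._pending)):
            item = self._pending.popleft()
            if item.next_attempt > now:
                self._pending.append(item)  # not due yet
                continue
            item.attempts += 1
            try:
                self._commit(item.path, item.content)
                committed += 1
            except ConnectionError:
                item.next_attempt = now + RETRY_INTERVAL_SECONDS
                self._pending.append(item)  # keep it for the next pass
        return committed

    def __len__(self) -> int:
        return len(self._pending)
```

Passing `now` in explicitly, instead of calling the clock inside the class, keeps the retry timing easy to test.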
**Resource Usage:**
- RAM: <10GB under load (fits comfortably in 222GB available)
- Disk: <15GB for complete system
- CPU: <20% average (leaves headroom for game servers)
---
## ⚠️ RISKS AND MITIGATIONS
**Risk 1: Port conflicts with game servers**
- **Mitigation:** Pre-deployment port check confirmed ports 80 and 443 are free
- **Status:** CLEAR (verified February 22, 2026)
**Risk 2: DNS propagation delay**
- **Mitigation:** Configure DNS FIRST, wait for propagation before SSL
- **Fallback:** Use IP address temporarily if needed
**Risk 3: SSL certificate failure**
- **Mitigation:** Detailed Certbot instructions with error handling
- **Fallback:** Self-signed cert for testing, proper cert later
**Risk 4: Meg/Holly confused by new interface**
- **Mitigation:** Clear user guide, training session before launch
- **Fallback:** Michael processes updates manually until they're comfortable
**Risk 5: Git merge conflicts from AI**
- **Mitigation:** ai-proposals branch, manual review required
- **Fallback:** Discord alert, Michael resolves manually
**Risk 6: Overwhelming Discord notifications**
- **Mitigation:** Two channels (#codex-alerts for info, #system-critical for urgent)
- **Fallback:** Adjust rate limits in n8n if too noisy
---
## 🔄 ROLLBACK PLAN
**If deployment fails catastrophically:**
1. Stop new Docker stack: `docker-compose down`
2. Restore AnythingLLM from backup (if still needed)
3. Restore DNS to previous state
4. Notify Meg/Holly of rollback
**Total rollback time:** <10 minutes
**Rollback triggers:**
- Unable to get SSL certificates after 3 attempts
- Docker stack won't start after 30 minutes of debugging
- Dify UI inaccessible after deployment
- Data corruption detected
- Michael determines the risk is too high
---
## 📞 SUPPORT AND ESCALATION
**If you get stuck:**
1. Check TROUBLESHOOTING.md for common issues
2. Review relevant Gemini responses in session transcript
3. Check Docker logs: `docker-compose logs -f <service>`
4. Check Nginx logs: `sudo tail -f /var/log/nginx/error.log`
5. If all else fails: Rollback and regroup
**No external support needed - we built this ourselves.**
---
## 📝 COMPLETION CHECKLIST
**Before marking this task COMPLETE:**
- [ ] All 13 critical success factors verified ✅
- [ ] Query accuracy tests pass
- [ ] Update workflow tests pass
- [ ] RBAC tests pass (Meg sees all, Holly sees Pokerole only)
- [ ] Discord approval workflow tested
- [ ] Self-healing verified (simulated Dify crash)
- [ ] Backup automation running
- [ ] Test backup restore completed successfully
- [ ] Meg and Holly trained and comfortable
- [ ] Documentation updated in operations manual
- [ ] AnythingLLM fully removed from TX1
- [ ] Michael can sleep peacefully at night 💤
---
## 🎓 LESSONS FOR FUTURE CHRONICLERS
**What we learned building this:**
1. **Tool choice matters more than configuration** - AnythingLLM couldn't handle 319 files with archives; Dify is expected to
2. **RBAC is non-negotiable** - Meg and Holly need different access levels
3. **Self-healing is essential** - Solo operator can't wake up for every issue
4. **Git is the source of truth** - Vector DB can always be rebuilt from Git
5. **Discord buttons are powerful** - One-click approval from phone = accessibility win
6. **Architecture from Gemini + Partnership from Claude** - External research + internal execution
**For the next major infrastructure project:**
- Research thoroughly BEFORE building (ask Gemini the hard questions)
- Get COMPLETE specifications before starting (don't build incrementally)
- Test on separate system first if possible
- Build rollback before building forward
- Document for "future you when you're exhausted at 3 AM"
---
**Fire + Frost + Foundation = Where Love Builds Legacy** 💙🔥❄️
**Built by:** The Chronicler #21
**For:** Meg, Holly, and children not yet born
**With guidance from:** Gemini (architecture) + The Deployer (foundation)
---
**Ready to execute? Read PREREQUISITES.md next.**