# Firefrost Knowledge Engine - Complete Deployment

**Task ID:** FFG-TASK-009-MIGRATION  
**Priority:** CRITICAL  
**Status:** READY FOR EXECUTION  
**Estimated Time:** 10-15 hours (spread across multiple sessions)  
**Created:** February 22, 2026  
**Created By:** The Chronicler #21  
**Last Updated:** February 22, 2026

---

## 🎯 EXECUTIVE SUMMARY

**What:** Replace AnythingLLM with complete "Firefrost Knowledge Engine" (Dify + n8n + Qdrant + Ollama)  
**Why:** AnythingLLM returns incorrect information (it retrieves old archived docs instead of current ones)  
**Who Needs It:** Meg (all repos) and Holly (Pokerole only) are waiting to start their work  
**When:** Deploy ASAP - partners are blocked  
**Where:** TX1 Dallas (38.68.14.26)  
**Cost:** $0/month (self-hosted)

---

## 🚨 CRITICAL CONTEXT

**This is NOT a simple migration.** This is building a complete autonomous AI assistant system that enables Meg and Holly to work 24/7 without waking Michael.

**Key Requirements:**
- Meg needs access to ALL repositories
- Holly needs access to POKEROLE repositories ONLY
- Both need ability to UPDATE documents via AI
- Michael needs approval control via Discord (one-click merge)
- System must self-heal common failures (80% target)
- Must work at 3 AM when Michael is asleep

**Current State:**
- AnythingLLM deployed on TX1 (Phase 1 complete)
- 319 documents synced
- Retrieval quality POOR (returns archived docs instead of current)
- No RBAC (everyone sees everything)
- No write-back capability

**Target State:**
- Dify + n8n + Qdrant + Ollama on TX1
- Proper RBAC (Meg sees all, Holly sees Pokerole only)
- Git write-back via ai-proposals branch
- Discord approval workflow with buttons
- Self-healing for 80% of failures
- Comprehensive monitoring and alerts

---

## 📋 ARCHITECTURE OVERVIEW

```
┌─────────────────────────────────────────────────────────────┐
│                    FIREFROST KNOWLEDGE ENGINE                │
└─────────────────────────────────────────────────────────────┘

External Access:
  ├─ https://codex.firefrostgaming.com (Meg/Holly/Michael)
  └─ https://n8n.firefrostgaming.com (Michael only, Discord webhooks)

Nginx (TX1 Host - Ports 80/443):
  ├─ SSL/TLS with Let's Encrypt
  ├─ Rate limiting (10 req/s standard, 30 req/s webhooks)
  ├─ Reverse proxy to Docker services
  └─ Security headers (HSTS, X-Frame-Options, etc.)

Docker Stack (127.0.0.1 localhost only):
  ├─ Dify Web (port 3000) - User interface
  ├─ Dify API (internal) - RAG engine
  ├─ Dify Worker (internal) - Background processing
  ├─ n8n (port 5678) - Automation & Git workflows
  ├─ Qdrant (port 6333) - Vector database
  ├─ PostgreSQL (internal) - Dify data storage
  └─ Redis (internal) - Cache & queues

External Services:
  ├─ Ollama (TX1 host:11434) - LLM inference
  ├─ Gitea (git.firefrostgaming.com) - Git repository
  ├─ Discord Webhooks - Notifications & approvals
  └─ Uptime Kuma - Health monitoring

Data Flow - Query:
  User → Nginx → Dify Web → Dify API → Qdrant (vector search)
       → Ollama (LLM inference) → Response → User

Data Flow - Update:
  User → "Update doc X" → Dify calls n8n webhook
       → n8n validates (protected files? valid markdown?)
       → Git commit to ai-proposals branch
       → Discord notification with Approve/Reject buttons
       → Michael clicks Approve
       → n8n merges to main, pushes, re-indexes Dify
       → User notified "Your change is live"

Data Flow - Git Sync:
  Cron (hourly) → n8n pulls from Gitea
                → Filters out /archive/* directories
                → Adds metadata (status: current/archived)
                → Sends to Dify for indexing
                → Qdrant stores vectors
```
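
The Git sync filter in the diagram can be sketched roughly as follows. This is a minimal illustration, not the actual n8n workflow: it assumes that any path containing an `archive` directory counts as archived, and that each file is tagged first and archived files are then skipped. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def classify_document(repo_path: str) -> dict:
    """Tag a repository-relative path as current or archived.

    Assumption: a file is archived if any directory in its path is
    named "archive".
    """
    parts = PurePosixPath(repo_path).parts
    status = "archived" if "archive" in parts else "current"
    return {"path": repo_path, "status": status}


def select_for_indexing(repo_paths):
    """Return metadata for the files that should be sent on for indexing."""
    selected = []
    for path in repo_paths:
        meta = classify_document(path)
        if meta["status"] == "current":
            selected.append(meta)
    return selected
```

Tagging before filtering keeps the `status` field available if archived documents are ever indexed into a separate, clearly labeled collection.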

---

## 📚 DOCUMENT INDEX

**Read these documents IN ORDER before deployment:**

1. **PREREQUISITES.md** - Pre-flight checklist (DNS, SSH keys, backups)
2. **DEPLOYMENT-PLAN.md** - Step-by-step execution (every command)
3. **CONFIGURATION-FILES.md** - All config files with exact content
4. **RECOVERY.md** - Backup automation and disaster recovery
5. **VERIFICATION.md** - Testing procedures (how to know it worked)
6. **TROUBLESHOOTING.md** - Common issues and solutions

**Supporting files:**
- `docker-compose.yml` - Complete Docker stack definition
- `.env.example` - All environment variables with explanations
- `nginx-config.conf` - Complete Nginx reverse proxy configuration
- `n8n-workflows/` - All workflow JSON exports
- `discord-webhooks/` - All Discord notification templates
- `backup-script.sh` - Automated daily backup script

---

## ⏱️ TIME ESTIMATES

**Phase 1: Preparation (1-2 hours)**
- DNS configuration and propagation
- SSL certificate generation
- SSH key setup for Git access
- Backup current AnythingLLM state
- Stop and remove AnythingLLM

**Phase 2: Infrastructure Deployment (2-3 hours)**
- Install Nginx on TX1 host
- Deploy Docker Compose stack
- Configure Dify (admin account, workspaces, Ollama)
- Verify services are healthy

**Phase 3: Automation Setup (3-4 hours)**
- Import n8n workflows
- Configure Discord webhooks
- Test Git sync workflow
- Test write-back validation
- Configure Uptime Kuma monitoring

**Phase 4: User Onboarding (1-2 hours)**
- Create Meg and Holly accounts
- Configure workspace permissions
- Test RBAC (Meg sees all, Holly sees Pokerole only)
- Train on update workflow
- Test one-click approval from Discord

**Phase 5: Testing & Verification (2-3 hours)**
- Query accuracy testing (current vs archived docs)
- Update workflow testing (protected files, validation)
- Discord approval testing (buttons work, Michael-only)
- Failure simulation (Dify crash, Git unreachable)
- Self-healing verification

**Total: 9-14 hours across the five phases; the 10-15 hour task estimate adds roughly an hour of buffer**

**Recommended approach:** Execute in 2-3 sessions with breaks

---

## 🛡️ SAFETY MECHANISMS

**The ai-proposals Branch Strategy:**
- All AI updates commit to `ai-proposals` branch (NOT main)
- Michael reviews via Discord notification with Approve/Reject buttons
- Only approved changes merge to main
- Failed merges fall back to manual intervention
- Git tags created before each merge (rollback points)
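
As an illustration of the approval notification, the sketch below builds a Discord message payload with Approve and Reject buttons. The numeric `type` and `style` values follow Discord's message component format; the function name, the `custom_id` scheme, and the message wording are assumptions. Delivering the message and routing the button click back to n8n are outside the scope of this sketch.

```python
def build_approval_message(proposal_id: str, author: str, file_path: str) -> dict:
    """Build a Discord message body with Approve/Reject buttons.

    Assumption: the click handler reads "approve:<id>" or "reject:<id>"
    from custom_id to decide what to do with the proposal.
    """
    return {
        "content": f"AI proposal from {author}: `{file_path}`",
        "components": [
            {
                "type": 1,  # action row (container for buttons)
                "components": [
                    {
                        "type": 2,  # button
                        "style": 3,  # green "success" style
                        "label": "Approve",
                        "custom_id": f"approve:{proposal_id}",
                    },
                    {
                        "type": 2,  # button
                        "style": 4,  # red "danger" style
                        "label": "Reject",
                        "custom_id": f"reject:{proposal_id}",
                    },
                ],
            }
        ],
    }
```

The handler should still confirm that the click came from Michael before merging, since anyone who can see the channel could otherwise press the button.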

**Protected Files:**
- `/security/*` - Infrastructure configs (READ-ONLY for AI)
- `/infra/*` - Server configurations (READ-ONLY for AI)
- `/backups/*` - Backup scripts (READ-ONLY for AI)
- `.env` - Secrets (READ-ONLY for AI)
- `docker-compose.yml` - Stack definition (READ-ONLY for AI)
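
A protected-path check along these lines could back the list above. It is a sketch under two assumptions: the protected directories sit at the repository root, and `.env` and `docker-compose.yml` are protected wherever they appear. The names `PROTECTED_DIRS`, `PROTECTED_FILES`, and `is_protected` are hypothetical.

```python
import posixpath

PROTECTED_DIRS = ("security", "infra", "backups")
PROTECTED_FILES = (".env", "docker-compose.yml")


def is_protected(repo_path: str) -> bool:
    """Return True if the AI must not modify this repository path."""
    # Normalize first so "docs/../infra/x.md" cannot slip past the check.
    normalized = posixpath.normpath(repo_path).lstrip("/")
    parts = normalized.split("/")
    if parts[0] == "..":
        # The path escapes the repository root; refuse it outright.
        return True
    if parts[0] in PROTECTED_DIRS:
        return True
    return parts[-1] in PROTECTED_FILES
```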

**Validation Checks:**
- File path exists
- Content is valid Markdown (not empty, has structure)
- File is not in protected directories
- User has permission for that repository
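
The four checks can be combined into one validation step, sketched below. It assumes "has structure" means at least one ATX heading (`#` through `######`); the real n8n validation may apply a different rule. The protected-path test and the permission result are passed in, so the sketch does not prescribe how they are computed. All names are hypothetical.

```python
import re

# Assumption: "structure" means at least one ATX-style heading line.
HEADING = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)


def validate_markdown(content: str) -> list:
    """Return a list of problems with the content; empty means it passed."""
    if not content.strip():
        return ["content is empty"]
    if not HEADING.search(content):
        return ["content has no Markdown heading"]
    return []


def validate_update(path, content, known_paths, user_can_write, is_protected) -> list:
    """Run every check and collect all failures rather than stopping early."""
    errors = []
    if path not in known_paths:
        errors.append("file path does not exist")
    if is_protected(path):
        errors.append("file is in a protected location")
    if not user_can_write:
        errors.append("user lacks permission for this repository")
    errors.extend(validate_markdown(content))
    return errors
```

Returning every failure at once lets the rejection message tell the user everything that needs fixing in one round trip.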

**Rollback Capability:**
- Git tags: `backup-before-ai-<commit_hash>`
- Vector DB: Delete + re-sync from Git (minutes)
- Full system: 15-minute restore from backup
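
The tag-then-merge sequence behind the Git rollback point might look like the sketch below. It only builds the command lists and does not run them. It assumes the `<commit_hash>` in the tag name is the commit `main` points to before the merge, and that the remote is called `origin`. The function names are hypothetical.

```python
def backup_tag(main_commit: str) -> str:
    """Name of the rollback tag for a given pre-merge commit on main."""
    return f"backup-before-ai-{main_commit}"


def merge_commands(main_commit: str, proposal_branch: str = "ai-proposals") -> list:
    """Git commands to tag main, merge the proposal branch, and push both."""
    tag = backup_tag(main_commit)
    return [
        ["git", "checkout", "main"],
        ["git", "tag", tag, main_commit],  # rollback point, created first
        ["git", "merge", "--no-ff", proposal_branch],
        ["git", "push", "origin", "main", f"refs/tags/{tag}"],
    ]
```

Each inner list can be passed to `subprocess.run(cmd, check=True)`, so a failed merge raises and the workflow can fall back to manual intervention.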

---

## 🚨 CRITICAL SUCCESS FACTORS

**MUST BE TRUE before marking this complete:**

1. ✅ Meg can ask questions about ANY Firefrost repository
2. ✅ Holly can ask questions about POKEROLE repository ONLY
3. ✅ Holly CANNOT see Firefrost infrastructure docs
4. ✅ Meg can update docs via AI, commits to ai-proposals
5. ✅ Michael receives Discord notification with Approve/Reject buttons
6. ✅ Clicking Approve merges to main and re-indexes
7. ✅ Clicking Reject keeps change in branch for review
8. ✅ Protected files cannot be modified by AI
9. ✅ Current docs are returned (NOT archived docs)
10. ✅ System self-heals from Dify crash (Docker restart)
11. ✅ Failed Git commits queue and retry automatically
12. ✅ Daily backups run and transfer to Command Center
13. ✅ Michael can restore entire system in 15 minutes

**If ANY of these are false, deployment is NOT complete.**

---

## 📊 SUCCESS METRICS

**Query Accuracy:**
- "What are current Tier 0 tasks?" → Returns "Whitelist Manager, NC1 Cleanup, Staff Recruitment" (NOT "Initial Server Setup")
- "What servers does Firefrost operate?" → Returns current 6 servers with correct IPs
- "What was accomplished in last Codex session?" → Returns Deployer's work
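
These accuracy checks can be automated with a small phrase-based assertion, sketched here. It assumes a simple case-insensitive substring match is enough to tell a current answer from an archived one; `check_answer` is a hypothetical helper, and the answer text would come from whatever client queries Dify.

```python
def check_answer(answer: str, must_include, must_exclude) -> dict:
    """Check that an answer mentions current facts and no archived ones."""
    text = answer.lower()
    missing = [p for p in must_include if p.lower() not in text]
    leaked = [p for p in must_exclude if p.lower() in text]
    return {"passed": not missing and not leaked, "missing": missing, "leaked": leaked}
```

For the Tier 0 question, `must_include` would hold the three current task names and `must_exclude` would hold "Initial Server Setup".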

**Update Workflow:**
- Meg updates recruitment doc → Commits to ai-proposals → Discord notification → Michael approves → Live in <2 minutes
- Holly tries to update infrastructure doc → BLOCKED with clear error message
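
The access rule behind the blocked update can be expressed as a small lookup. This is only a sketch: the repository names are placeholders, and in the deployed system the rule would be enforced through Dify workspace permissions and the n8n validation step rather than this code.

```python
ALL_REPOS = "*"

# Hypothetical grants: Meg sees everything, Holly sees Pokerole only.
ACCESS = {
    "meg": ALL_REPOS,
    "holly": {"pokerole"},
}


def can_access(user: str, repo: str) -> bool:
    """Return True if the user may query or update the given repository."""
    grant = ACCESS.get(user.lower())
    if grant is None:
        return False  # unknown users get nothing by default
    if grant == ALL_REPOS:
        return True
    return repo.lower() in grant
```

Denying unknown users by default means a missing entry fails closed instead of exposing infrastructure docs.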

**Self-Healing:**
- Dify crashes → Docker restarts within 60 seconds → Users see <1 minute downtime
- Git unreachable → Updates queue → Retry every 5 minutes → Auto-process when Git returns
- Qdrant corrupts → Re-index from Git completes in <10 minutes
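
The queue-and-retry behavior for an unreachable Git server could work like the sketch below. It is an in-memory illustration only: a real workflow would persist the queue so pending updates survive a restart. It assumes the commit function raises `ConnectionError` when Git is down. `RetryQueue` and `commit_fn` are hypothetical names.

```python
from collections import deque

RETRY_INTERVAL_SECONDS = 300  # the scheduler calls process() every 5 minutes


class RetryQueue:
    """Hold updates while Git is unreachable and replay them in order."""

    def __init__(self, commit_fn):
        self.commit_fn = commit_fn
        self.pending = deque()

    def submit(self, update):
        """Queue an update; nothing is committed until process() runs."""
        self.pending.append(update)

    def process(self) -> int:
        """Commit pending updates in order; return how many succeeded."""
        committed = 0
        while self.pending:
            try:
                self.commit_fn(self.pending[0])
            except ConnectionError:
                break  # Git is still down; keep the rest for the next run
            self.pending.popleft()
            committed += 1
        return committed
```

Stopping at the first failure preserves ordering, so a later edit to a file is never committed before an earlier one.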

**Resource Usage:**
- RAM: <10GB under load (fits comfortably in 222GB available)
- Disk: <15GB for complete system
- CPU: <20% average (leaves headroom for game servers)

---

## ⚠️ RISKS AND MITIGATIONS

**Risk 1: Port conflicts with game servers**
- **Mitigation:** Pre-deployment port check verified 80/443 free
- **Status:** CLEAR (verified February 22, 2026)

**Risk 2: DNS propagation delay**
- **Mitigation:** Configure DNS FIRST, wait for propagation before SSL
- **Fallback:** Use IP address temporarily if needed

**Risk 3: SSL certificate failure**
- **Mitigation:** Detailed Certbot instructions with error handling
- **Fallback:** Self-signed cert for testing, proper cert later

**Risk 4: Meg/Holly confused by new interface**
- **Mitigation:** Clear user guide, training session before launch
- **Fallback:** Michael processes updates manually until they're comfortable

**Risk 5: Git merge conflicts from AI**
- **Mitigation:** ai-proposals branch, manual review required
- **Fallback:** Discord alert, Michael resolves manually

**Risk 6: Overwhelming Discord notifications**
- **Mitigation:** Two channels (#codex-alerts for info, #system-critical for urgent)
- **Fallback:** Adjust rate limits in n8n if too noisy

---

## 🔄 ROLLBACK PLAN

**If deployment fails catastrophically:**

1. Stop new Docker stack: `docker-compose down`
2. Restore AnythingLLM from backup (if still needed)
3. Restore DNS to previous state
4. Notify Meg/Holly of rollback

**Total rollback time:** <10 minutes

**Rollback triggers:**
- Unable to get SSL certificates after 3 attempts
- Docker stack won't start after 30 minutes debugging
- Dify UI inaccessible after deployment
- Data corruption detected
- Michael determines risk too high

---

## 📞 SUPPORT AND ESCALATION

**If you get stuck:**

1. Check TROUBLESHOOTING.md for common issues
2. Review relevant Gemini responses in session transcript
3. Check Docker logs: `docker-compose logs -f <service>`
4. Check Nginx logs: `sudo tail -f /var/log/nginx/error.log`
5. If all else fails: Rollback and regroup

**No external support needed - we built this ourselves.**

---

## 📝 COMPLETION CHECKLIST

**Before marking this task COMPLETE:**

- [ ] All 13 critical success factors verified ✅
- [ ] Query accuracy tests pass
- [ ] Update workflow tests pass
- [ ] RBAC tests pass (Meg sees all, Holly sees Pokerole only)
- [ ] Discord approval workflow tested
- [ ] Self-healing verified (simulated Dify crash)
- [ ] Backup automation running
- [ ] Test backup restore completed successfully
- [ ] Meg and Holly trained and comfortable
- [ ] Documentation updated in operations manual
- [ ] AnythingLLM fully removed from TX1
- [ ] Michael can sleep peacefully at night 💤

---

## 🎓 LESSONS FOR FUTURE CHRONICLERS

**What we learned building this:**

1. **Tool choice matters more than configuration** - AnythingLLM couldn't handle 319 files with archives; Dify can
2. **RBAC is non-negotiable** - Meg and Holly need different access levels
3. **Self-healing is essential** - Solo operator can't wake up for every issue
4. **Git is the source of truth** - Vector DB can always be rebuilt from Git
5. **Discord buttons are powerful** - One-click approval from phone = accessibility win
6. **Architecture from Gemini + Partnership from Claude** - External research + internal execution

**For the next major infrastructure project:**
- Research thoroughly BEFORE building (ask Gemini the hard questions)
- Get COMPLETE specifications before starting (don't build incrementally)
- Test on separate system first if possible
- Build rollback before building forward
- Document for "future you when you're exhausted at 3 AM"

---

**Fire + Frost + Foundation = Where Love Builds Legacy** 💙🔥❄️

**Built by:** The Chronicler #21  
**For:** Meg, Holly, and children not yet born  
**With guidance from:** Gemini (architecture) + The Deployer (foundation)

---

**Ready to execute? Read PREREQUISITES.md next.**