Task #9: Rewrite AI Stack architecture for DERP compliance

Complete rewrite of self-hosted AI stack (Task #9) with new DERP-compliant architecture:

CHANGES:
- Architecture: AnythingLLM+OpenWebUI → Dify+Ollama (DERP-compliant)
- Cost model: $0/month additional (self-hosted on TX1, no external APIs)
- Usage tiers: Claude Projects (primary) → DERP backup (emergency) → Discord bots (staff/subscribers)
- Time estimate: 8-12hrs → 6-8hrs (more focused deployment)
- Resource allocation: 97GB storage, 92GB RAM when active (vs 150GB/110GB)

NEW DOCUMENTATION:
- README.md: Complete architecture rewrite with three-tier usage model
- deployment-plan.md: Step-by-step deployment (6 phases, all commands included)
- usage-guide.md: Decision tree for when to use Claude vs DERP vs bots
- resource-requirements.md: TX1 capacity planning, monitoring, disaster recovery
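The three-tier decision tree in usage-guide.md can be sketched as follows (a minimal sketch; the function name, tier labels, and requester categories are illustrative, not taken from the repo):

```python
def route_request(requester: str, claude_available: bool) -> str:
    """Pick an AI tier per the three-tier usage model:
    Claude Projects (primary) -> DERP backup (emergency) -> Discord/Wiki bots.
    """
    # Staff and subscribers always go through the role-gated Discord/Wiki bots.
    if requester in ("staff", "subscriber"):
        return "discord-bot"
    # Operator traffic prefers Claude Projects, falling back to the
    # self-hosted DERP stack only when Claude is unavailable.
    return "claude-projects" if claude_available else "derp-backup"
```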

KEY FEATURES:
- Zero additional monthly cost (beyond existing $20 Claude Pro)
- True DERP compliance (fully self-hosted when Claude unavailable)
- Knowledge graph RAG (indexes entire 416-file repo)
- Discord bot integration (role-based staff/subscriber access)
- Emergency procedures documented
- Capacity planning for growth (up to 18 game servers)
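The role-based staff/subscriber gate for the Discord bots could look like this (a sketch only; the role names are assumptions, and a real bot would read roles from the Discord API):

```python
# Hypothetical role names; the actual Discord server may use different ones.
BOT_ALLOWED_ROLES = {"staff", "subscriber"}

def bot_access_allowed(member_roles) -> bool:
    """Allow bot access if the member holds any permitted role
    (case-insensitive match against the allow-list)."""
    return not BOT_ALLOWED_ROLES.isdisjoint(r.lower() for r in member_roles)
```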

MODELS:
- Qwen 2.5 Coder 72B (infrastructure/coding, 128K context)
- Llama 3.3 70B (general reasoning, 128K context)
- Llama 3.2 Vision 11B (screenshot analysis)
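As a rough sanity check, the combined parameter counts of these models land near the ~92GB active-RAM figure above. This assumes roughly 0.6 bytes per parameter, a rule-of-thumb covering ~4-bit quantized weights plus KV-cache and runtime overhead, not a measured value:

```python
# Parameter counts in billions, from the model list above.
MODELS_B_PARAMS = {
    "qwen2.5-coder": 72,    # infrastructure/coding, 128K context
    "llama3.3": 70,         # general reasoning, 128K context
    "llama3.2-vision": 11,  # screenshot analysis
}

BYTES_PER_PARAM = 0.6  # assumed: ~Q4 quantized weights + runtime overhead

def est_ram_gb(models=MODELS_B_PARAMS) -> float:
    # 0.6 bytes/param is ~0.6 GB per billion parameters.
    return sum(models.values()) * BYTES_PER_PARAM

print(round(est_ram_gb()))  # prints 92 with all three models loaded
```

In practice Ollama loads models on demand, so peak usage depends on which models are resident at once; the estimate above is the worst case with all three loaded.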

Updated tasks.md summary to reflect new architecture.

Status: Ready for deployment (pending medical clearance)

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️
The Chronicler
2026-02-18 17:27:25 +00:00
parent b9517306c7
commit b32afdd1db
5 changed files with 1365 additions and 30 deletions


@@ -178,15 +178,17 @@ Professional @firefrostgaming.com email on NC1. Self-hosted, $120/year saved, el
---
### 9. Self-Hosted AI Stack on TX1
-**Time:** 8-12 hours (3-4 active, rest downloads)
+**Time:** 6-8 hours (3-4 active, rest downloads)
**Status:** BLOCKED - Medical clearance
**Documentation:** `docs/tasks/self-hosted-ai-stack-on-tx1/`
-Dual AI deployment: AnythingLLM (ops) + Open WebUI (staff). DERP backup, unlimited AI access.
+DERP-compliant AI infrastructure: Dify + Ollama + self-hosted models. Three-tier usage: Claude Projects (primary) → DERP backup (emergency) → Discord/Wiki bots (staff/subscribers).
+**Architecture:** Dify with knowledge graph RAG, Ollama model server
**Models:** Qwen 2.5 Coder 72B, Llama 3.3 70B, Llama 3.2 Vision 11B
-**Storage:** ~150GB
-**RAM:** ~110GB when loaded
+**Storage:** ~97GB
+**RAM:** ~92GB when DERP activated, ~8GB idle
+**Monthly Cost:** $0 (self-hosted, no additional cost beyond Claude Pro)
---