Task #9: Rewrite AI Stack architecture for DERP compliance

Complete rewrite of self-hosted AI stack (Task #9) with new DERP-compliant architecture:

CHANGES:
- Architecture: AnythingLLM+OpenWebUI → Dify+Ollama (DERP-compliant)
- Cost model: $0/month additional (self-hosted on TX1, no external APIs)
- Usage tiers: Claude Projects (primary) → DERP backup (emergency) → Discord bots (staff/subscribers)
- Time estimate: 8-12hrs → 6-8hrs (more focused deployment)
- Resource allocation: 97GB storage, 92GB RAM when active (vs 150GB/110GB)

NEW DOCUMENTATION:
- README.md: Complete architecture rewrite with three-tier usage model
- deployment-plan.md: Step-by-step deployment (6 phases, all commands included)
- usage-guide.md: Decision tree for when to use Claude vs DERP vs bots
- resource-requirements.md: TX1 capacity planning, monitoring, disaster recovery
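The three-tier decision tree in usage-guide.md can be sketched as follows (a minimal sketch; the function name, tier labels, and requester categories are illustrative, not taken from the repo):

```python
def route_request(requester: str, claude_available: bool) -> str:
    """Pick an AI tier per the three-tier usage model:
    Claude Projects (primary) -> DERP backup (emergency) -> Discord/Wiki bots.
    """
    # Staff and subscribers always go through the role-gated Discord/Wiki bots.
    if requester in ("staff", "subscriber"):
        return "discord-bot"
    # Operator traffic prefers Claude Projects, falling back to the
    # self-hosted DERP stack only when Claude is unavailable.
    return "claude-projects" if claude_available else "derp-backup"
```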

KEY FEATURES:
- Zero additional monthly cost (beyond existing $20 Claude Pro)
- True DERP compliance (fully self-hosted when Claude unavailable)
- Knowledge graph RAG (indexes entire 416-file repo)
- Discord bot integration (role-based staff/subscriber access)
- Emergency procedures documented
- Capacity planning for growth (up to 18 game servers)
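The role-based staff/subscriber gate for the Discord bots could look like this (a sketch only; the role names are assumptions, and a real bot would read roles from the Discord API):

```python
# Hypothetical role names; the actual Discord server may use different ones.
BOT_ALLOWED_ROLES = {"staff", "subscriber"}

def bot_access_allowed(member_roles) -> bool:
    """Allow bot access if the member holds any permitted role
    (case-insensitive match against the allow-list)."""
    return not BOT_ALLOWED_ROLES.isdisjoint(r.lower() for r in member_roles)
```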

MODELS:
- Qwen 2.5 Coder 72B (infrastructure/coding, 128K context)
- Llama 3.3 70B (general reasoning, 128K context)
- Llama 3.2 Vision 11B (screenshot analysis)
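As a rough sanity check, the combined parameter counts of these models land near the ~92GB active-RAM figure above. This assumes roughly 0.6 bytes per parameter, a rule-of-thumb covering ~4-bit quantized weights plus KV-cache and runtime overhead, not a measured value:

```python
# Parameter counts in billions, from the model list above.
MODELS_B_PARAMS = {
    "qwen2.5-coder": 72,    # infrastructure/coding, 128K context
    "llama3.3": 70,         # general reasoning, 128K context
    "llama3.2-vision": 11,  # screenshot analysis
}

BYTES_PER_PARAM = 0.6  # assumed: ~Q4 quantized weights + runtime overhead

def est_ram_gb(models=MODELS_B_PARAMS) -> float:
    # 0.6 bytes/param is ~0.6 GB per billion parameters.
    return sum(models.values()) * BYTES_PER_PARAM

print(round(est_ram_gb()))  # prints 92 with all three models loaded
```

In practice Ollama loads models on demand, so peak usage depends on which models are resident at once; the estimate above is the worst case with all three loaded.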

Updated tasks.md summary to reflect new architecture.

Status: Ready for deployment (pending medical clearance)

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️
The Chronicler
2026-02-18 17:27:25 +00:00
parent b9517306c7
commit b32afdd1db
5 changed files with 1365 additions and 30 deletions


@@ -178,15 +178,17 @@ Professional @firefrostgaming.com email on NC1. Self-hosted, $120/year saved, el
---
### 9. Self-Hosted AI Stack on TX1
-**Time:** 8-12 hours (3-4 active, rest downloads)
+**Time:** 6-8 hours (3-4 active, rest downloads)
**Status:** BLOCKED - Medical clearance
**Documentation:** `docs/tasks/self-hosted-ai-stack-on-tx1/`
-Dual AI deployment: AnythingLLM (ops) + Open WebUI (staff). DERP backup, unlimited AI access.
+DERP-compliant AI infrastructure: Dify + Ollama + self-hosted models. Three-tier usage: Claude Projects (primary) → DERP backup (emergency) → Discord/Wiki bots (staff/subscribers).
+**Architecture:** Dify with knowledge graph RAG, Ollama model server
**Models:** Qwen 2.5 Coder 72B, Llama 3.3 70B, Llama 3.2 Vision 11B
-**Storage:** ~150GB
-**RAM:** ~110GB when loaded
+**Storage:** ~97GB
+**RAM:** ~92GB when DERP activated, ~8GB idle
+**Monthly Cost:** $0 (self-hosted, no additional cost beyond Claude Pro)
---