Task #9: Rewrite AI Stack architecture for DERP compliance
Complete rewrite of self-hosted AI stack (Task #9) with new DERP-compliant architecture.

CHANGES:
- Architecture: AnythingLLM + Open WebUI → Dify + Ollama (DERP-compliant)
- Cost model: $0/month additional (self-hosted on TX1, no external APIs)
- Usage tiers: Claude Projects (primary) → DERP backup (emergency) → Discord bots (staff/subscribers)
- Time estimate: 8-12 hrs → 6-8 hrs (more focused deployment)
- Resource allocation: 97GB storage, 92GB RAM when active (vs 150GB/110GB)

NEW DOCUMENTATION:
- README.md: Complete architecture rewrite with three-tier usage model
- deployment-plan.md: Step-by-step deployment (6 phases, all commands included)
- usage-guide.md: Decision tree for when to use Claude vs DERP vs bots
- resource-requirements.md: TX1 capacity planning, monitoring, disaster recovery

KEY FEATURES:
- Zero additional monthly cost (beyond existing $20 Claude Pro)
- True DERP compliance (fully self-hosted when Claude unavailable)
- Knowledge graph RAG (indexes entire 416-file repo)
- Discord bot integration (role-based staff/subscriber access)
- Emergency procedures documented
- Capacity planning for growth (up to 18 game servers)

MODELS:
- Qwen 2.5 Coder 72B (infrastructure/coding, 128K context)
- Llama 3.3 70B (general reasoning, 128K context)
- Llama 3.2 Vision 11B (screenshot analysis)

Updated tasks.md summary to reflect new architecture.

Status: Ready for deployment (pending medical clearance)

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️
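The three-tier usage model above (Claude Projects → DERP backup → Discord bots) can be sketched as a small decision helper. This is a hypothetical illustration of the routing logic, not the actual code from usage-guide.md; the function and tier names are assumptions:

```python
def choose_ai_tier(claude_available: bool, requester_role: str) -> str:
    """Pick an AI tier per the three-tier usage model.

    Tier 1: Claude Projects (primary; existing $20/mo Claude Pro).
    Tier 2: DERP backup on TX1 (emergency only, when Claude is unavailable;
            activating it loads local models, taking RAM from ~8GB to ~92GB).
    Tier 3: Discord/Wiki bots (role-gated staff and subscriber access).
    """
    if requester_role in ("staff", "subscriber"):
        # Staff/subscriber requests go through the role-gated Discord bots.
        return "discord-bot"
    if claude_available:
        # Primary tier for infrastructure/ops work.
        return "claude-projects"
    # Emergency fallback: self-hosted Dify + Ollama on TX1.
    return "derp-backup"

print(choose_ai_tier(claude_available=False, requester_role="ops"))
```

The point of the sketch is the ordering: DERP is never the first choice, it is only activated when the primary tier is unavailable, which is what keeps TX1 at its ~8GB idle footprint most of the time.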
@@ -178,15 +178,17 @@ Professional @firefrostgaming.com email on NC1. Self-hosted, $120/year saved, el
 
 ---
 
 ### 9. Self-Hosted AI Stack on TX1
 
-**Time:** 8-12 hours (3-4 active, rest downloads)
+**Time:** 6-8 hours (3-4 active, rest downloads)
 **Status:** BLOCKED - Medical clearance
 **Documentation:** `docs/tasks/self-hosted-ai-stack-on-tx1/`
 
-Dual AI deployment: AnythingLLM (ops) + Open WebUI (staff). DERP backup, unlimited AI access.
+DERP-compliant AI infrastructure: Dify + Ollama + self-hosted models. Three-tier usage: Claude Projects (primary) → DERP backup (emergency) → Discord/Wiki bots (staff/subscribers).
 
 **Architecture:** Dify with knowledge graph RAG, Ollama model server
 **Models:** Qwen 2.5 Coder 72B, Llama 3.3 70B, Llama 3.2 Vision 11B
-**Storage:** ~150GB
-**RAM:** ~110GB when loaded
+**Storage:** ~97GB
+**RAM:** ~92GB when DERP activated, ~8GB idle
+**Monthly Cost:** $0 (self-hosted, no additional cost beyond Claude Pro)
 
 ---