Self-Hosted AI Stack on TX1
Status: Blocked (pending medical clearance)
Priority: Tier 2 - Major Infrastructure
Time: 8-12 hours (3-4 active, rest downloads)
Location: TX1 Dallas
Last Updated: 2026-02-16
Overview
Dual AI deployment: AnythingLLM for Michael and Meg's document-heavy work, plus Open WebUI as a staff assistant. Goals: DERP backup, unlimited AI access, and a foundation for staff tooling.
Architecture
Primary: AnythingLLM (ai.firefrostgaming.com)
- 1,000+ document libraries
- LanceDB vector database
- Workspace isolation (Operations, Pokerole, Brainstorming)
Secondary: Open WebUI (staff-ai.firefrostgaming.com)
- Lighter-weight stack, serving the staff wiki
- Chroma vector DB
- ChatGPT-like interface
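The dual deployment above could be sketched as a single Compose file. This is a hedged outline, not the actual TX1 config: image tags, container ports, and volume paths are assumptions to verify (and pin) before deploying, and both web UIs would sit behind a reverse proxy for their respective subdomains.

```yaml
# Sketch of the Phase 1 stack -- images, ports, and paths are assumptions.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama-models:/root/.ollama          # model weights live here
  anythingllm:
    image: mintplexlabs/anythingllm
    ports:
      - "3001:3001"                          # proxy as ai.firefrostgaming.com
    volumes:
      - anythingllm-data:/app/server/storage # LanceDB + workspace data
    depends_on:
      - ollama
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "8080:8080"                          # proxy as staff-ai.firefrostgaming.com
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434  # shared model backend
    volumes:
      - open-webui-data:/app/backend/data    # Chroma vector DB lives here
    depends_on:
      - ollama
volumes:
  ollama-models:
  anythingllm-data:
  open-webui-data:
```

Sharing one Ollama backend between both front ends avoids loading duplicate copies of the 70B-class models into RAM.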
Phases
Phase 1: Deploy stack (1-2 hours)
Phase 2: Load models (6-8 hours, overnight)
Phase 3: Document ingestion (2-3 hours active, 6-8 total)
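For Phase 3, a small helper can walk the document libraries and batch-upload them. This is a sketch under assumptions: the endpoint path, auth header, and upload format here are placeholders, not AnythingLLM's confirmed API, so check the AnythingLLM developer-API docs before using the `upload` half.

```python
# Phase 3 sketch: gather documents for ingestion into AnythingLLM.
# The upload endpoint/headers below are ASSUMPTIONS, not the real API.
from pathlib import Path

DOC_EXTS = {".md", ".pdf", ".txt", ".docx"}

def collect_documents(root: str, exts=DOC_EXTS) -> list[Path]:
    """Gather ingestible files under root, sorted for repeatable batches."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in exts
    )

def upload(path: Path, base_url: str, api_key: str) -> None:
    """Hypothetical upload call -- replace with the real AnythingLLM endpoint."""
    import urllib.request
    req = urllib.request.Request(
        f"{base_url}/api/v1/document/upload",      # placeholder path
        data=path.read_bytes(),
        headers={
            "Authorization": f"Bearer {api_key}",  # placeholder auth scheme
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    urllib.request.urlopen(req)  # raises on non-2xx responses
```

Sorting the file list makes re-runs deterministic, so a partially completed overnight ingestion can be resumed from a known point.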
Models
- Qwen 2.5 Coder 72B (~40GB)
- Llama 3.3 70B (~40GB)
- Llama 3.2 Vision 11B (~7GB)
- Embeddings: all-MiniLM-L6-v2 (~400MB)
Total: ~150GB storage; ~110GB RAM with all models loaded
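Phase 2 could be driven by a small wrapper around the Ollama CLI. The model tags below are assumptions mirroring the list above; verify each exists in the Ollama library first (for example, Qwen2.5-Coder's largest published size is 32B, so the 72B tag likely needs adjusting). The default dry-run only prints the commands.

```python
# Phase 2 sketch: pull models overnight via the Ollama CLI.
# Tags are ASSUMPTIONS copied from the plan -- confirm before pulling.
import shutil
import subprocess

MODELS = [
    "qwen2.5-coder:72b",   # plan says 72B; largest published Coder is 32B
    "llama3.3:70b",
    "llama3.2-vision:11b",
]

def pull_all(models, dry_run=True):
    """Build `ollama pull` commands; execute them only when dry_run=False."""
    cmds = [["ollama", "pull", m] for m in models]
    if not dry_run:
        if shutil.which("ollama") is None:
            raise RuntimeError("ollama CLI not found on PATH")
        for cmd in cmds:
            subprocess.run(cmd, check=True)  # interrupted pulls can resume
    return cmds

if __name__ == "__main__":
    for cmd in pull_all(MODELS, dry_run=True):
        print(" ".join(cmd))
```

Kicking this off before bed matches the 6-8 hour overnight window; `ollama pull` resumes interrupted downloads, so a dropped connection mid-pull is recoverable.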
Success Criteria
- ✅ Both stacks deployed
- ✅ Models loaded and operational
- ✅ Documents ingested (Ops, Pokerole, Brainstorming)
- ✅ DERP backup functional
See: deployment-plan.md for detailed phases
Fire + Frost + Foundation 💙🔥❄️