
# Self-Hosted AI Stack on TX1

**Status:** Blocked - Medical clearance
**Priority:** Tier 2 - Major Infrastructure
**Time:** 8-12 hours (3-4 hours active; the rest is unattended downloads)
**Location:** TX1 Dallas
**Last Updated:** 2026-02-16

## Overview
Dual AI deployment: AnythingLLM for Michael and Meg (document-heavy work) plus Open WebUI as the staff assistant. The stack provides a DERP backup, unlimited AI access, and a foundation for staff use.

## Architecture
**Primary: AnythingLLM** (ai.firefrostgaming.com)
- 1,000+ document libraries
- LanceDB vector database
- Workspace isolation (Operations, Pokerole, Brainstorming)

**Secondary: Open WebUI** (staff-ai.firefrostgaming.com)
- Lighter-weight, for the staff wiki
- Chroma vector DB
- ChatGPT-like interface

## Phases
**Phase 1:** Deploy stack (1-2 hours)
**Phase 2:** Load models (6-8 hours overnight)
**Phase 3:** Document ingestion (2-3 hours active, 6-8 hours total)
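
As a rough cross-check of the phase estimates, the sketch below adds up the ranges listed above. It assumes the phases run strictly back to back, that Phase 1 is entirely hands-on, and that Phase 2 needs no hands-on time because it runs overnight; none of these assumptions is stated in the plan.

```python
# (min_hours, max_hours) of wall-clock time per phase, copied from the list above.
WALL_CLOCK_HOURS = {
    "Phase 1: Deploy stack": (1, 2),
    "Phase 2: Load models": (6, 8),
    "Phase 3: Document ingestion": (6, 8),
}

# Hands-on hours per phase. Phase 1 as fully active and Phase 2 as zero
# (overnight downloads) are assumptions for illustration.
ACTIVE_HOURS = {
    "Phase 1: Deploy stack": (1, 2),
    "Phase 2: Load models": (0, 0),
    "Phase 3: Document ingestion": (2, 3),
}


def sum_ranges(ranges):
    """Add up (min, max) pairs and return the combined (min, max)."""
    pairs = list(ranges)
    return sum(p[0] for p in pairs), sum(p[1] for p in pairs)


sequential = sum_ranges(WALL_CLOCK_HOURS.values())
active = sum_ranges(ACTIVE_HOURS.values())
print(f"Sequential wall-clock: {sequential[0]}-{sequential[1]} hours")
print(f"Hands-on: {active[0]}-{active[1]} hours")
```

Run back to back, the phases come to 13-18 hours of wall-clock time and 3-5 hours hands-on, so a shorter overall estimate presumably relies on some overlap between phases.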

## Models
- Qwen 2.5 Coder 72B (~40GB)
- Llama 3.3 70B (~40GB)
- Llama 3.2 Vision 11B (~7GB)
- Embeddings: all-MiniLM-L6-v2 (~90MB)

**Total:** ~150GB storage, ~110GB RAM when loaded
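
As a quick sanity check on these totals, the sketch below sums the three LLM sizes listed above and compares them with the stated budgets. The embedding model is left out because it is well under 1GB. Treating the remainder as headroom for vector databases, documents, and runtime overhead is an assumption, not something the plan states.

```python
# Approximate on-disk sizes in GB, copied from the model list above.
# The embedding model is omitted: it is small enough to ignore here.
MODEL_SIZES_GB = {
    "Qwen 2.5 Coder 72B": 40.0,
    "Llama 3.3 70B": 40.0,
    "Llama 3.2 Vision 11B": 7.0,
}

STORAGE_BUDGET_GB = 150.0  # "~150GB storage" from the Total line
RAM_BUDGET_GB = 110.0      # "~110GB RAM when loaded" from the Total line


def total_model_size_gb(sizes):
    """Sum the listed model sizes in GB."""
    return sum(sizes.values())


def headroom_gb(budget_gb, sizes):
    """Budget minus the summed model sizes, in GB."""
    return budget_gb - total_model_size_gb(sizes)


print(f"Models: ~{total_model_size_gb(MODEL_SIZES_GB):.0f} GB")
print(f"Storage headroom: ~{headroom_gb(STORAGE_BUDGET_GB, MODEL_SIZES_GB):.0f} GB")
print(f"RAM headroom: ~{headroom_gb(RAM_BUDGET_GB, MODEL_SIZES_GB):.0f} GB")
```

With the listed sizes, the models come to about 87GB, which leaves roughly 63GB of the storage figure and 23GB of the RAM figure for everything else.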

## Success Criteria
- ✅ Both stacks deployed
- ✅ Models loaded and operational
- ✅ Documents ingested (Ops, Pokerole, Brainstorming)
- ✅ DERP backup functional
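
The first criterion could be spot-checked with a small script like the one below. It is a minimal sketch that assumes both hostnames from the Architecture section are served over HTTPS and answer a plain GET on the root path; the function names and the injectable `opener` parameter are hypothetical helpers, not part of either product.

```python
import urllib.error
import urllib.request

# Hostnames come from the Architecture section; the https:// scheme is an assumption.
ENDPOINTS = {
    "AnythingLLM": "https://ai.firefrostgaming.com",
    "Open WebUI": "https://staff-ai.firefrostgaming.com",
}


def is_up(url, opener=urllib.request.urlopen, timeout=5.0):
    """Return True if a GET on `url` ends in a 2xx or 3xx response.

    `opener` is injectable so the check can be exercised without a network.
    Note that urlopen raises HTTPError for 4xx/5xx responses, so a page that
    demands authentication (401/403) is reported as down by this sketch.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


def check_all(endpoints=ENDPOINTS, opener=urllib.request.urlopen):
    """Map each service name to whether its endpoint responded."""
    return {name: is_up(url, opener=opener) for name, url in endpoints.items()}
```

Calling `check_all()` from a machine that can reach TX1 returns a dictionary of service name to `True` or `False`. It only confirms that the web front ends respond, not that models are loaded or documents are ingested.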

**See:** deployment-plan.md for detailed phases

**Fire + Frost + Foundation** 💙🔥❄️