# Self-Hosted AI Stack on TX1

**Status:** Blocked - Medical clearance
**Priority:** Tier 2 - Major Infrastructure
**Time:** 8-12 hours (3-4 active, rest downloads)
**Location:** TX1 Dallas
**Last Updated:** 2026-02-16

## Overview
Dual AI deployment: AnythingLLM for Michael and Meg (document-heavy work) plus Open WebUI as the staff assistant. Goals: DERP backup, unlimited AI access, and a staff foundation.

## Architecture
**Primary: AnythingLLM** (ai.firefrostgaming.com)
- 1,000+ document libraries
- LanceDB vector database
- Workspace isolation (Operations, Pokerole, Brainstorming)

**Secondary: Open WebUI** (staff-ai.firefrostgaming.com)
- Lighter-weight, for the staff wiki
- Chroma vector DB
- ChatGPT-like interface
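
As a rough illustration, the two deployments above could be captured in a small Python structure, for instance as the basis of an inventory or health-check script. The `Deployment` class, its field names, and the `deployment_for` helper are hypothetical; only the hostnames, vector databases, and workspace names come from the list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Deployment:
    """One AI front end in the stack (hypothetical layout, for illustration)."""
    name: str
    host: str
    vector_db: str
    workspaces: Tuple[str, ...] = ()


# Values copied from the architecture list; the structure is an assumption.
DEPLOYMENTS = {
    "primary": Deployment(
        name="AnythingLLM",
        host="ai.firefrostgaming.com",
        vector_db="LanceDB",
        workspaces=("Operations", "Pokerole", "Brainstorming"),
    ),
    "secondary": Deployment(
        name="Open WebUI",
        host="staff-ai.firefrostgaming.com",
        vector_db="Chroma",
    ),
}


def deployment_for(workspace: str) -> Optional[Deployment]:
    """Return the deployment that hosts a named workspace, or None."""
    for dep in DEPLOYMENTS.values():
        if workspace in dep.workspaces:
            return dep
    return None
```

For example, `deployment_for("Pokerole")` returns the AnythingLLM entry, while an unknown workspace name returns `None`.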

## Phases
**Phase 1:** Deploy stack (1-2 hours)
**Phase 2:** Load models (6-8 hours overnight)
**Phase 3:** Document ingestion (2-3 hours active, 6-8 total)
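
The phase estimates can be totaled with a few lines of Python. This is only a sketch: the `PHASES` layout and `total_hours` helper are hypothetical, Phase 1 is assumed to be entirely active work, Phase 2 is assumed to be entirely unattended, and the phases are assumed to run back to back.

```python
# (low, high) hour ranges copied from the phase list.
# Assumption: Phase 1 is all active work; Phase 2 needs no active time.
PHASES = {
    "Phase 1: Deploy stack": {"active": (1, 2), "total": (1, 2)},
    "Phase 2: Load models": {"active": (0, 0), "total": (6, 8)},
    "Phase 3: Document ingestion": {"active": (2, 3), "total": (6, 8)},
}


def total_hours(kind):
    """Sum the low and high ends of one kind of estimate across all phases."""
    low = sum(phase[kind][0] for phase in PHASES.values())
    high = sum(phase[kind][1] for phase in PHASES.values())
    return low, high


print("active hours:", total_hours("active"))
print("elapsed hours:", total_hours("total"))
```

Under these assumptions the sums come to 3-5 active hours and 13-18 elapsed hours, so the 8-12 hour headline estimate presumably assumes that some phases overlap.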

## Models
- Qwen 2.5 Coder 72B (~40GB)
- Llama 3.3 70B (~40GB)
- Llama 3.2 Vision 11B (~7GB)
- Embeddings: all-MiniLM-L6-v2 (~400MB)

**Total:** ~150GB storage, ~110GB RAM when loaded
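
A quick way to sanity-check the storage figure is to add up the listed model sizes. The sketch below uses the approximate sizes from the list; the names `MODEL_SIZES_GB`, `models_total_gb`, and `headroom_gb` are hypothetical, and what fills the remaining space (documents, vector databases, container images) is an open question this note does not answer.

```python
# Approximate on-disk sizes in GB, copied from the model list.
MODEL_SIZES_GB = {
    "Qwen 2.5 Coder 72B": 40,
    "Llama 3.3 70B": 40,
    "Llama 3.2 Vision 11B": 7,
    "all-MiniLM-L6-v2 (embeddings)": 0.4,
}

STORAGE_BUDGET_GB = 150  # "~150GB storage" from the total line


def models_total_gb(sizes=MODEL_SIZES_GB):
    """Sum the listed model sizes."""
    return sum(sizes.values())


def headroom_gb(budget=STORAGE_BUDGET_GB):
    """Storage left in the budget after the listed models are downloaded."""
    return budget - models_total_gb()


print(f"models: ~{models_total_gb():.1f} GB, headroom: ~{headroom_gb():.1f} GB")
```

The listed models add up to roughly 87 GB, which leaves a bit over 60 GB of the ~150 GB figure for everything else.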

## Success Criteria
- ✅ Both stacks deployed
- ✅ Models loaded and operational
- ✅ Documents ingested (Ops, Pokerole, Brainstorming)
- ✅ DERP backup functional

**See:** deployment-plan.md for detailed phases

**Fire + Frost + Foundation** 💙🔥❄️