diff --git a/docs/tasks/self-hosted-ai-stack-on-tx1/README.md b/docs/tasks/self-hosted-ai-stack-on-tx1/README.md
new file mode 100644
index 0000000..516b8fe
--- /dev/null
+++ b/docs/tasks/self-hosted-ai-stack-on-tx1/README.md
@@ -0,0 +1,44 @@
+# Self-Hosted AI Stack on TX1
+
+**Status:** Blocked - Medical clearance
+**Priority:** Tier 2 - Major Infrastructure
+**Time:** 8-12 hours (3-4 active, rest downloads)
+**Location:** TX1 Dallas
+**Last Updated:** 2026-02-16
+
+## Overview
+Dual AI deployment: AnythingLLM (Michael/Meg, document-heavy workloads) + Open WebUI (staff assistant). Goals: DERP backup, unlimited AI access, and a foundation for staff tooling.
+
+## Architecture
+**Primary: AnythingLLM** (ai.firefrostgaming.com)
+- 1,000+ document libraries
+- LanceDB vector database
+- Workspace isolation (Operations, Pokerole, Brainstorming)
+
+**Secondary: Open WebUI** (staff-ai.firefrostgaming.com)
+- Lighter-weight stack for the staff wiki
+- Chroma vector DB
+- ChatGPT-like interface
+
+## Phases
+**Phase 1:** Deploy stack (1-2 hours)
+**Phase 2:** Load models (6-8 hours, overnight)
+**Phase 3:** Document ingestion (2-3 hours active, 6-8 total)
+
+## Models
+- Qwen 2.5 Coder 72B (~40GB)
+- Llama 3.3 70B (~40GB)
+- Llama 3.2 Vision 11B (~7GB)
+- Embeddings: all-MiniLM-L6-v2 (~400MB)
+
+**Total:** ~150GB storage, ~110GB RAM when loaded
+
+## Success Criteria
+- ✅ Both stacks deployed
+- ✅ Models loaded and operational
+- ✅ Documents ingested (Ops, Pokerole, Brainstorming)
+- ✅ DERP backup functional
+
+**See:** deployment-plan.md for detailed phases
+
+**Fire + Frost + Foundation** 💙🔥❄️
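
The Phase 1 dual-stack deployment above could be sketched with plain `docker run` commands. This is a hedged sketch only: it assumes Docker on TX1 and a local Ollama instance as the model backend (the README does not name the runtime or orchestrator), and the host paths, ports, and container tags are illustrative placeholders, not the actual TX1 configuration. Reverse-proxy wiring for the `ai.firefrostgaming.com` / `staff-ai.firefrostgaming.com` hostnames is omitted.

```shell
# Hypothetical Phase 1 sketch -- assumes Docker plus a local Ollama backend.
# Paths, ports, and tags are placeholders; adapt to the real TX1 layout.

# Primary stack: AnythingLLM (ships with LanceDB as its default vector DB)
docker run -d --name anythingllm \
  -p 3001:3001 \
  -v /opt/anythingllm:/app/server/storage \
  -e STORAGE_DIR=/app/server/storage \
  mintplexlabs/anythingllm:latest

# Secondary stack: Open WebUI for the staff assistant
docker run -d --name open-webui \
  -p 8080:8080 \
  -v /opt/open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  ghcr.io/open-webui/open-webui:main
```

Running the two stacks as separate containers with separate data volumes matches the README's workspace-isolation goal: either assistant can be rebuilt or upgraded without touching the other's vector store.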
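
Assuming Ollama as the model runtime (the README lists models but not a serving layer), the overnight Phase 2 download could be driven by a short pull script. The tags below are the closest public Ollama library names and are assumptions to verify before starting, not a confirmed manifest; in particular, the README's "Qwen 2.5 Coder 72B" may correspond to a different public tag.

```shell
# Hypothetical Phase 2 sketch -- pulls tens of GB of weights; run overnight.
# Tags assume the public Ollama library; confirm names and sizes first.
ollama pull llama3.3:70b          # Llama 3.3 70B (~40GB), general assistant
ollama pull llama3.2-vision:11b   # Llama 3.2 Vision 11B (~7GB), vision tasks
ollama pull qwen2.5-coder:32b     # nearest public Qwen 2.5 Coder tag; the
                                  # README's "72B" entry needs confirming
# all-MiniLM-L6-v2 embeddings are typically bundled with AnythingLLM's
# built-in embedder rather than pulled through Ollama.
```

Pulling sequentially keeps disk and bandwidth pressure predictable on a single host, which fits the "6-8 hours overnight" window in the phase plan.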