firefrost-operations-manual/docs/tasks/self-hosted-ai-stack-on-tx1/deployment-plan.md
Commit b32afdd1db (The Chronicler) — Task #9: Rewrite AI Stack architecture for DERP compliance
Complete rewrite of self-hosted AI stack (Task #9) with new DERP-compliant architecture:

CHANGES:
- Architecture: AnythingLLM+OpenWebUI → Dify+Ollama (DERP-compliant)
- Cost model: $0/month additional (self-hosted on TX1, no external APIs)
- Usage tiers: Claude Projects (primary) → DERP backup (emergency) → Discord bots (staff/subscribers)
- Time estimate: 8-12hrs → 6-8hrs (more focused deployment)
- Resource allocation: 97GB storage, 92GB RAM when active (vs 150GB/110GB)

NEW DOCUMENTATION:
- README.md: Complete architecture rewrite with three-tier usage model
- deployment-plan.md: Step-by-step deployment (6 phases, all commands included)
- usage-guide.md: Decision tree for when to use Claude vs DERP vs bots
- resource-requirements.md: TX1 capacity planning, monitoring, disaster recovery

KEY FEATURES:
- Zero additional monthly cost (beyond existing $20 Claude Pro)
- True DERP compliance (fully self-hosted when Claude unavailable)
- Knowledge graph RAG (indexes entire 416-file repo)
- Discord bot integration (role-based staff/subscriber access)
- Emergency procedures documented
- Capacity planning for growth (up to 18 game servers)

MODELS:
- Qwen 2.5 Coder 32B (infrastructure/coding, 128K context)
- Llama 3.3 70B (general reasoning, 128K context)
- Llama 3.2 Vision 11B (screenshot analysis)

Updated tasks.md summary to reflect new architecture.

Status: Ready for deployment (pending medical clearance)

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️
2026-02-18 17:27:25 +00:00


# Self-Hosted AI Stack - Deployment Plan
**Task:** Self-Hosted AI Stack on TX1
**Location:** TX1 Dallas (38.68.14.26)
**Total Time:** 6-8 hours (3-4 active, rest overnight downloads)
**Last Updated:** 2026-02-18

---
## Prerequisites
### Before Starting
- [ ] SSH access to TX1
- [ ] Docker installed on TX1
- [ ] Docker Compose installed
- [ ] Sufficient storage (~100GB free)
- [ ] No game servers under heavy load (model downloads are bandwidth-intensive)
### Domain Configuration
- [ ] DNS A record: ai.firefrostgaming.com → 38.68.14.26
- [ ] SSL certificate ready (Let's Encrypt)
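
The free-space prerequisite can be checked from a script rather than by eyeballing `df`. A minimal sketch using only Python's standard library (the `/opt` path and 100GB threshold mirror this plan; the function name is illustrative):

```python
import shutil

def has_free_space(path: str = "/opt", needed_gb: int = 100) -> bool:
    """Check whether the filesystem holding `path` has at least needed_gb free."""
    free_gb = shutil.disk_usage(path).free // 2**30
    return free_gb >= needed_gb
```

Run it on TX1 before starting Phase 1; if it returns False, free up space before pulling any models.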
---
## Phase 1: Deploy Dify (2-3 hours)
### Step 1.1: Create Directory Structure
```bash
ssh root@38.68.14.26
cd /opt
mkdir -p dify
cd dify
```
### Step 1.2: Download Dify Docker Compose
```bash
wget https://raw.githubusercontent.com/langgenius/dify/main/docker/docker-compose.yaml
```
### Step 1.3: Configure Environment
```bash
# Create .env file
cat > .env << 'EOF'
# Dify Configuration
DIFY_VERSION=0.6.0
API_URL=https://ai.firefrostgaming.com
WEB_API_URL=https://ai.firefrostgaming.com
# Database
POSTGRES_PASSWORD=<generate_secure_password>
POSTGRES_DB=dify
# Redis
REDIS_PASSWORD=<generate_secure_password>
# Secret Key (generate with: openssl rand -base64 32)
SECRET_KEY=<generate_secret_key>
# Storage
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/storage
EOF
```
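
If `openssl` is not handy, the `SECRET_KEY` above can be generated the same way from Python's standard library (equivalent to `openssl rand -base64 32`):

```python
import base64
import secrets

# 32 cryptographically secure random bytes, base64-encoded
secret_key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
print(secret_key)
```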
### Step 1.4: Deploy Dify
```bash
docker-compose up -d
```
**Wait:** 5-10 minutes for all services to start
### Step 1.5: Verify Deployment
```bash
docker-compose ps
# All services should show "Up"
curl http://localhost/health
# Should return: {"status":"ok"}
```
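
The manual `curl` check above can be folded into a monitoring script. A sketch that validates the response body, assuming the `{"status":"ok"}` shape shown in step 1.5 (the function name is illustrative):

```python
import json

def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports status ok."""
    try:
        return json.loads(body).get("status") == "ok"
    except json.JSONDecodeError:
        # Non-JSON response (e.g. an Nginx error page) counts as unhealthy
        return False
```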
### Step 1.6: Configure Nginx Reverse Proxy
```bash
# Create Nginx config
cat > /etc/nginx/sites-available/ai.firefrostgaming.com << 'EOF'
server {
    listen 80;
    server_name ai.firefrostgaming.com;

    location / {
        # Dify's own compose stack must publish its web entry point on a
        # non-80 host port (e.g. 8080), otherwise it clashes with this
        # Nginx instance binding port 80 and the proxy loops back on itself.
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF
# Enable site
ln -s /etc/nginx/sites-available/ai.firefrostgaming.com /etc/nginx/sites-enabled/
nginx -t
systemctl reload nginx
# Get SSL certificate
certbot --nginx -d ai.firefrostgaming.com
```
### Step 1.7: Initial Configuration
1. Visit https://ai.firefrostgaming.com
2. Create admin account (Michael)
3. Configure workspaces:
- **Operations** (infrastructure docs)
- **Brainstorming** (creative docs)
---
## Phase 2: Install Ollama and Models (Overnight)
### Step 2.1: Install Ollama
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### Step 2.2: Download Models (Overnight - Large Files)
**Download Qwen 2.5 Coder 32B (the largest size Ollama offers for the Coder series):**
```bash
ollama pull qwen2.5-coder:32b
```
**Size:** ~20GB
**Time:** 1-2 hours (depending on connection)
**Download Llama 3.3 70B:**
```bash
ollama pull llama3.3:70b
```
**Size:** ~40GB
**Time:** 2-4 hours
**Download Llama 3.2 Vision 11B:**
```bash
ollama pull llama3.2-vision:11b
```
**Size:** ~7GB
**Time:** 30-60 minutes
**Total download time:** 6-8 hours (run overnight)
### Step 2.3: Verify Models
```bash
ollama list
# Should show all three models
# Test Qwen
ollama run qwen2.5-coder:32b "Write a bash script to check disk space"
# Should generate script
# Test Llama 3.3
ollama run llama3.3:70b "Explain Firefrost Gaming's Fire + Frost philosophy"
# Should respond
# Test Vision
ollama run llama3.2-vision:11b "Describe this image: /path/to/test/image.jpg"
# Should analyze image
```
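
`ollama list` prints a table (`NAME  ID  SIZE  MODIFIED`), which makes the check above scriptable. A small parser sketch (the sample output below is fabricated for illustration; only the column layout is real):

```python
def installed_models(ollama_list_output: str) -> set[str]:
    """Parse `ollama list` output: a header row, then one model per line."""
    lines = ollama_list_output.strip().splitlines()
    return {line.split()[0] for line in lines[1:] if line.split()}

SAMPLE = """NAME                 ID            SIZE    MODIFIED
llama3.3:70b         aaaaaaaaaaaa  42 GB   2 hours ago
llama3.2-vision:11b  bbbbbbbbbbbb  7.9 GB  3 hours ago
"""

missing = {"llama3.3:70b", "llama3.2-vision:11b"} - installed_models(SAMPLE)
```

Feed it the real output (`subprocess.run(["ollama", "list"], ...)`) and alert if `missing` is non-empty.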
### Step 2.4: Configure Ollama as Dify Backend
In Dify web interface:
1. Go to Settings → Model Providers
2. Add Ollama provider
3. URL: http://localhost:11434
4. Add models:
- qwen2.5-coder:32b
- llama3.3:70b
- llama3.2-vision:11b
5. Set Qwen as default for coding queries
6. Set Llama 3.3 as default for general queries
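
The per-query defaults in step 2.4 amount to a simple routing rule. A hypothetical sketch of that routing (the keyword lists and function name are illustrative, not part of Dify's API):

```python
CODING_MARKERS = ("script", "docker", "nginx", "bash", "config", "deploy", "error")

def pick_model(query: str) -> str:
    """Route vision queries to the vision model, coding/infrastructure
    queries to the coder model, everything else to the generalist."""
    q = query.lower()
    if "screenshot" in q or "image" in q:
        return "llama3.2-vision:11b"
    if any(marker in q for marker in CODING_MARKERS):
        return "qwen2.5-coder"
    return "llama3.3:70b"
```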
---
## Phase 3: Index Git Repository (1-2 hours)
### Step 3.1: Clone Operations Manual to TX1
```bash
cd /opt/dify
git clone https://git.firefrostgaming.com/firefrost-gaming/firefrost-operations-manual.git
```
### Step 3.2: Configure Dify Knowledge Base
**Operations Workspace:**
1. In Dify, go to Operations workspace
2. Create Knowledge Base: "Infrastructure Docs"
3. Upload folder: `/opt/dify/firefrost-operations-manual/docs/`
4. Processing: Automatic chunking with Q&A segmentation
5. Embedding model: Default (all-MiniLM-L6-v2)
**Brainstorming Workspace:**
1. Go to Brainstorming workspace
2. Create Knowledge Base: "Creative Docs"
3. Upload folder: `/opt/dify/firefrost-operations-manual/docs/planning/`
4. Same processing settings
**Wait:** 30-60 minutes for indexing (416 files)
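
The 416-file figure can be sanity-checked before indexing starts. A sketch that counts Markdown files under the docs tree cloned in step 3.1:

```python
from pathlib import Path

def count_markdown_files(root: str) -> int:
    """Recursively count .md files, as a pre-indexing sanity check."""
    return sum(1 for p in Path(root).rglob("*.md") if p.is_file())
```

Point it at `/opt/dify/firefrost-operations-manual/docs/` and compare against the expected total before waiting on the indexer.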
### Step 3.3: Test Knowledge Retrieval
In Operations workspace:
- Query: "What is the Frostwall Protocol?"
- Should return relevant docs with citations
In Brainstorming workspace:
- Query: "What is the Terraria branding training arc?"
- Should return planning docs
---
## Phase 4: Discord Bot (2-3 hours)
### Step 4.1: Create Bot on Discord Developer Portal
1. Go to https://discord.com/developers/applications
2. Create new application: "Firefrost AI Assistant"
3. Go to Bot section
4. Create bot
5. Copy bot token
6. Enable Privileged Gateway Intents:
- Message Content Intent
- Server Members Intent
### Step 4.2: Install Bot Code on TX1
```bash
cd /opt
mkdir -p firefrost-discord-bot
cd firefrost-discord-bot
# Create requirements.txt
cat > requirements.txt << 'EOF'
discord.py==2.3.2
aiohttp==3.9.1
python-dotenv==1.0.0
EOF
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
### Step 4.3: Create Bot Script
```bash
cat > bot.py << 'EOF'
import discord
from discord.ext import commands
import aiohttp
import os
from dotenv import load_dotenv

load_dotenv()
TOKEN = os.getenv('DISCORD_TOKEN')
DIFY_API_URL = os.getenv('DIFY_API_URL')
DIFY_API_KEY = os.getenv('DIFY_API_KEY')

intents = discord.Intents.default()
intents.message_content = True
bot = commands.Bot(command_prefix='/', intents=intents)


@bot.event
async def on_ready():
    print(f'{bot.user} is now running!')


@bot.command(name='ask')
async def ask(ctx, *, question):
    """Ask the AI a question"""
    # Check user roles
    is_staff = any(role.name in ('Staff', 'Admin') for role in ctx.author.roles)
    is_subscriber = any(role.name == 'Subscriber' for role in ctx.author.roles)
    if not (is_staff or is_subscriber):
        await ctx.send("You need Staff or Subscriber role to use this command.")
        return

    # Determine workspace based on role
    workspace = 'operations' if is_staff else 'general'
    await ctx.send("🤔 Thinking...")

    async with aiohttp.ClientSession() as session:
        async with session.post(
            f'{DIFY_API_URL}/v1/chat-messages',
            headers={
                'Authorization': f'Bearer {DIFY_API_KEY}',
                'Content-Type': 'application/json'
            },
            json={
                # Dify's chat-messages API expects inputs and response_mode;
                # omitting conversation_id starts a fresh conversation each time
                'inputs': {},
                'query': question,
                'response_mode': 'blocking',
                'user': str(ctx.author.id),
                'workspace': workspace
            }
        ) as resp:
            if resp.status == 200:
                data = await resp.json()
                answer = data.get('answer', 'No response')
                # Discord caps messages at 2000 characters; split long responses
                if len(answer) > 2000:
                    for i in range(0, len(answer), 2000):
                        await ctx.send(answer[i:i + 2000])
                else:
                    await ctx.send(answer)
            else:
                await ctx.send("❌ Error connecting to AI. Please try again.")


bot.run(TOKEN)
EOF
```
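
The bot above slices replies at exactly 2,000 characters (Discord's message limit), which can cut a word or code line in half. A boundary-aware variant that could replace the naive slicing (the function name is illustrative):

```python
def chunk_message(text: str, limit: int = 2000) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    preferring newline boundaries, then spaces, then a hard cut."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = text.rfind(" ", 0, limit)
        if cut <= 0:
            cut = limit  # no boundary found: hard split
        chunks.append(text[:cut])
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```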
### Step 4.4: Configure Bot
```bash
# Create .env file
cat > .env << 'EOF'
DISCORD_TOKEN=<your_bot_token>
DIFY_API_URL=https://ai.firefrostgaming.com
DIFY_API_KEY=<get_from_dify_settings>
EOF
```
### Step 4.5: Create Systemd Service
```bash
cat > /etc/systemd/system/firefrost-discord-bot.service << 'EOF'
[Unit]
Description=Firefrost Discord Bot
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/opt/firefrost-discord-bot
Environment="PATH=/opt/firefrost-discord-bot/venv/bin"
ExecStart=/opt/firefrost-discord-bot/venv/bin/python bot.py
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable firefrost-discord-bot
systemctl start firefrost-discord-bot
```
### Step 4.6: Invite Bot to Discord
1. Go to OAuth2 → URL Generator
2. Select scopes: bot, applications.commands
3. Select permissions: Send Messages, Read Message History
4. Copy generated URL
5. Open in browser and invite to Firefrost Discord
### Step 4.7: Test Bot
In Discord:
```
/ask What is the Frostwall Protocol?
```
Should return answer from Operations workspace (staff only)

---
## Phase 5: Testing and Validation (30 minutes)
### Test 1: DERP Backup (Strategic Query)
**Simulate Claude outage:**
1. Load Qwen model: `ollama run qwen2.5-coder:32b`
2. In Dify Operations workspace, ask:
- "Should I deploy Mailcow before or after Frostwall Protocol?"
3. Verify:
- Response references both task docs
- Shows dependency understanding
- Recommends Frostwall first
### Test 2: Discord Bot (Staff Query)
As staff member in Discord:
```
/ask How many game servers are running?
```
Should return infrastructure details
### Test 3: Discord Bot (Subscriber Query)
As subscriber in Discord:
```
/ask What modpacks are available?
```
Should return modpack list (limited to public info)
### Test 4: Resource Monitoring
```bash
# Check RAM usage with a model loaded
free -h
# Used memory should jump by roughly the loaded model's size (tens of GB)
# Check disk usage
df -h /opt/dify
# Should show ~97GB used
# Check Docker containers
docker ps
# All Dify services should be running
```
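
The spot checks above can be collected into one monitoring helper. A standard-library sketch; the `MemAvailable` parsing assumes Linux's `/proc/meminfo` format, and the function names are illustrative:

```python
import shutil

def disk_usage_gb(path: str = "/") -> tuple[int, int]:
    """Return (used_gb, free_gb) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used // 2**30, usage.free // 2**30

def mem_available_gb(meminfo: str) -> float:
    """Parse MemAvailable (in kB) from /proc/meminfo content."""
    for line in meminfo.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) / 1024 / 1024  # kB -> GB
    raise ValueError("MemAvailable not found")
```

On TX1, call `disk_usage_gb("/opt/dify")` and pass `open("/proc/meminfo").read()` to `mem_available_gb`, then alert when either drops below a chosen threshold.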
---
## Phase 6: Documentation (1 hour)
### Create Usage Guide
Document at `/opt/dify/USAGE-GUIDE.md`:
- When to use Claude (primary)
- When to use DERP (Claude down)
- When to use Discord bot (routine queries)
- Emergency procedures
### Update Operations Manual
Commit changes to Git:
- Task documentation updated
- Deployment plan complete
- Usage guide created
---
## Success Criteria Checklist
- [ ] Dify deployed and accessible at https://ai.firefrostgaming.com
- [ ] Ollama running with all 3 models loaded
- [ ] Operations workspace indexing complete (416 files)
- [ ] Brainstorming workspace indexing complete
- [ ] DERP backup tested (strategic query works)
- [ ] Discord bot deployed and running
- [ ] Staff can query via Discord (/ask command)
- [ ] Subscribers have limited access
- [ ] Resource usage within TX1 limits (~92GB RAM, ~97GB storage)
- [ ] Documentation complete and committed to Git
- [ ] Zero additional monthly cost confirmed
---
## Rollback Plan
If deployment fails:
```bash
# Stop all services
cd /opt/dify
docker-compose down
# Stop Discord bot
systemctl stop firefrost-discord-bot
systemctl disable firefrost-discord-bot
# Remove installation
rm -rf /opt/dify
rm -rf /opt/firefrost-discord-bot
rm /etc/systemd/system/firefrost-discord-bot.service
systemctl daemon-reload
# Remove Nginx config
rm /etc/nginx/sites-enabled/ai.firefrostgaming.com
rm /etc/nginx/sites-available/ai.firefrostgaming.com
nginx -t && systemctl reload nginx
# Uninstall Ollama (manual steps; the installer does not ship an uninstall script)
systemctl stop ollama
systemctl disable ollama
rm /etc/systemd/system/ollama.service
rm $(which ollama)
rm -rf /usr/share/ollama
userdel ollama
groupdel ollama
```
---
**Fire + Frost + Foundation + DERP = True Independence** 💙🔥❄️