# Self-Hosted AI Stack - Deployment Plan

**Task:** Self-Hosted AI Stack on TX1
**Location:** TX1 Dallas (38.68.14.26)
**Total Time:** 6-8 hours (3-4 hours active; the rest is overnight model downloads)
**Last Updated:** 2026-02-18
---

## Prerequisites

### Before Starting

- [ ] SSH access to TX1
- [ ] Docker installed on TX1
- [ ] Docker Compose installed
- [ ] Sufficient storage (~100GB free)
- [ ] No game servers under heavy load (model downloads are bandwidth-intensive)
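
The storage item above can be verified up front. A minimal pre-flight sketch (it assumes everything lands under `/opt` and that GNU `df` is available):

```shell
# Pre-flight check: warn if /opt has less than 100 GB free.
# The /opt path and the 100 GB threshold mirror the checklist above.
avail_gb=$(df --output=avail -BG /opt | tail -n 1 | tr -dc '0-9')
if [ "$avail_gb" -lt 100 ]; then
    echo "WARNING: only ${avail_gb}G free on /opt" >&2
fi
echo "available: ${avail_gb}G"
```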

### Domain Configuration

- [ ] DNS A record: ai.firefrostgaming.com → 38.68.14.26
- [ ] SSL certificate ready (Let's Encrypt)

---

## Phase 1: Deploy Dify (2-3 hours)

### Step 1.1: Create Directory Structure

```bash
ssh root@38.68.14.26
cd /opt
mkdir -p dify
cd dify
```

### Step 1.2: Download Dify Docker Compose

```bash
wget https://raw.githubusercontent.com/langgenius/dify/main/docker/docker-compose.yaml
```

### Step 1.3: Configure Environment

```bash
# Create .env file
cat > .env << 'EOF'
# Dify Configuration
DIFY_VERSION=0.6.0
API_URL=https://ai.firefrostgaming.com
WEB_API_URL=https://ai.firefrostgaming.com

# Database
POSTGRES_PASSWORD=<generate_secure_password>
POSTGRES_DB=dify

# Redis
REDIS_PASSWORD=<generate_secure_password>

# Secret Key (generate with: openssl rand -base64 32)
SECRET_KEY=<generate_secret_key>

# Storage
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/storage
EOF
```
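
The three placeholder values above should be generated, not hand-picked. A quick sketch using the `openssl rand` invocation the file itself suggests (paste the results into `.env` manually or with your editor):

```shell
# Generate one candidate value per placeholder:
# 32 random bytes, base64-encoded (always 44 characters).
PG_PASS=$(openssl rand -base64 32)
REDIS_PASS=$(openssl rand -base64 32)
SECRET=$(openssl rand -base64 32)
echo "POSTGRES_PASSWORD candidate: $PG_PASS"
echo "REDIS_PASSWORD candidate:    $REDIS_PASS"
echo "SECRET_KEY candidate:        $SECRET"
```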

### Step 1.4: Deploy Dify

```bash
docker-compose up -d
```

**Wait:** 5-10 minutes for all services to start

### Step 1.5: Verify Deployment

```bash
docker-compose ps
# All services should show "Up"

curl http://localhost/health
# Should return: {"status":"ok"}
```
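
Rather than guessing when the stack is up, the health check above can be polled. A small retry helper (a sketch; the commented line shows the intended TX1 usage):

```shell
# Poll until a command succeeds instead of sleeping a fixed 5-10 minutes.
# Usage: wait_for <attempts> <delay_seconds> <command...>
wait_for() {
    attempts=$1; delay=$2; shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then return 0; fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# On TX1 (commented out here; only meaningful while Dify is starting):
# wait_for 60 10 curl -fsS http://localhost/health
wait_for 3 0 true && echo "ok"
```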

### Step 1.6: Configure Nginx Reverse Proxy

```bash
# Create Nginx config
cat > /etc/nginx/sites-available/ai.firefrostgaming.com << 'EOF'
server {
    listen 80;
    server_name ai.firefrostgaming.com;

    location / {
        # Dify's bundled nginx cannot also bind host port 80 while this
        # server block owns it; publish Dify on another port (e.g. 8080,
        # via EXPOSE_NGINX_PORT in Dify's .env) and proxy to that port here.
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF

# Enable site
ln -s /etc/nginx/sites-available/ai.firefrostgaming.com /etc/nginx/sites-enabled/
nginx -t
systemctl reload nginx

# Get SSL certificate
certbot --nginx -d ai.firefrostgaming.com
```

### Step 1.7: Initial Configuration

1. Visit https://ai.firefrostgaming.com
2. Create admin account (Michael)
3. Configure workspaces:
   - **Operations** (infrastructure docs)
   - **Brainstorming** (creative docs)

---

## Phase 2: Install Ollama and Models (Overnight)

### Step 2.1: Install Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

### Step 2.2: Download Models (Overnight - Large Files)

**Download Qwen 2.5 Coder 72B:**
```bash
ollama pull qwen2.5-coder:72b
```
**Size:** ~40GB
**Time:** 2-4 hours (depending on connection)

**Download Llama 3.3 70B:**
```bash
ollama pull llama3.3:70b
```
**Size:** ~40GB
**Time:** 2-4 hours

**Download Llama 3.2 Vision 11B:**
```bash
ollama pull llama3.2-vision:11b
```
**Size:** ~7GB
**Time:** 30-60 minutes

**Total download time:** 6-8 hours (run overnight)
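
The three pulls above can be queued sequentially in a detached script, so an overnight SSH disconnect doesn't kill the downloads. A sketch (the `/tmp` paths and `nohup` invocation are illustrative):

```shell
# Write the pulls to a script so they run one after another;
# sequential pulls avoid saturating the link with parallel downloads.
cat > /tmp/pull-models.sh << 'EOF'
#!/bin/sh
set -e
for model in qwen2.5-coder:72b llama3.3:70b llama3.2-vision:11b; do
    ollama pull "$model"
done
EOF
chmod +x /tmp/pull-models.sh

# Kick it off detached on TX1 (commented out here):
# nohup /tmp/pull-models.sh > /tmp/pull-models.log 2>&1 &
```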

### Step 2.3: Verify Models

```bash
ollama list
# Should show all three models

# Test Qwen
ollama run qwen2.5-coder:72b "Write a bash script to check disk space"
# Should generate script

# Test Llama 3.3
ollama run llama3.3:70b "Explain Firefrost Gaming's Fire + Frost philosophy"
# Should respond

# Test Vision
ollama run llama3.2-vision:11b "Describe this image: /path/to/test/image.jpg"
# Should analyze image
```

### Step 2.4: Configure Ollama as Dify Backend

In the Dify web interface:

1. Go to Settings → Model Providers
2. Add the Ollama provider
3. URL: http://localhost:11434
4. Add models:
   - qwen2.5-coder:72b
   - llama3.3:70b
   - llama3.2-vision:11b
5. Set Qwen as default for coding queries
6. Set Llama 3.3 as default for general queries
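
Before adding the provider in Dify, it's worth confirming the Ollama API answers at that URL. `GET /api/tags` is Ollama's endpoint for listing installed models:

```shell
# Query Ollama's model list; fall back to a marker if the daemon is down.
out=$(curl -s --max-time 5 http://localhost:11434/api/tags \
      || echo '{"error":"unreachable"}')
echo "$out"
```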

---

## Phase 3: Index Git Repository (1-2 hours)

### Step 3.1: Clone Operations Manual to TX1

```bash
cd /opt/dify
git clone https://git.firefrostgaming.com/firefrost-gaming/firefrost-operations-manual.git
```
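
A quick sanity check before configuring the knowledge bases: count what will actually be indexed (run from `/opt/dify`; it prints 0 anywhere the clone doesn't exist):

```shell
# Count the files under the docs tree that Dify will ingest.
count=$(find firefrost-operations-manual/docs -type f 2>/dev/null | wc -l)
echo "files to index: $count"
```

On TX1 this should land near the 416 files cited elsewhere in this plan.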

### Step 3.2: Configure Dify Knowledge Base

**Operations Workspace:**

1. In Dify, go to the Operations workspace
2. Create Knowledge Base: "Infrastructure Docs"
3. Upload folder: `/opt/dify/firefrost-operations-manual/docs/`
4. Processing: Automatic chunking with Q&A segmentation
5. Embedding model: Default (all-MiniLM-L6-v2)

**Brainstorming Workspace:**

1. Go to the Brainstorming workspace
2. Create Knowledge Base: "Creative Docs"
3. Upload folder: `/opt/dify/firefrost-operations-manual/docs/planning/`
4. Same processing settings

**Wait:** 30-60 minutes for indexing (416 files)

### Step 3.3: Test Knowledge Retrieval

In the Operations workspace:

- Query: "What is the Frostwall Protocol?"
- Should return relevant docs with citations

In the Brainstorming workspace:

- Query: "What is the Terraria branding training arc?"
- Should return planning docs

---

## Phase 4: Discord Bot (2-3 hours)

### Step 4.1: Create Bot on Discord Developer Portal

1. Go to https://discord.com/developers/applications
2. Create new application: "Firefrost AI Assistant"
3. Go to the Bot section
4. Create the bot
5. Copy the bot token
6. Enable Privileged Gateway Intents:
   - Message Content Intent
   - Server Members Intent

### Step 4.2: Install Bot Code on TX1

```bash
cd /opt
mkdir -p firefrost-discord-bot
cd firefrost-discord-bot

# Create requirements.txt
cat > requirements.txt << 'EOF'
discord.py==2.3.2
aiohttp==3.9.1
python-dotenv==1.0.0
EOF

# Create virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

### Step 4.3: Create Bot Script

```bash
cat > bot.py << 'EOF'
import discord
from discord.ext import commands
import aiohttp
import os
from dotenv import load_dotenv

load_dotenv()

TOKEN = os.getenv('DISCORD_TOKEN')
DIFY_API_URL = os.getenv('DIFY_API_URL')
DIFY_API_KEY = os.getenv('DIFY_API_KEY')

intents = discord.Intents.default()
intents.message_content = True
bot = commands.Bot(command_prefix='/', intents=intents)

@bot.event
async def on_ready():
    print(f'{bot.user} is now running!')

@bot.command(name='ask')
async def ask(ctx, *, question):
    """Ask the AI a question"""
    # Check user roles
    is_staff = any(role.name in ['Staff', 'Admin'] for role in ctx.author.roles)
    is_subscriber = any(role.name == 'Subscriber' for role in ctx.author.roles)

    if not (is_staff or is_subscriber):
        await ctx.send("You need the Staff or Subscriber role to use this command.")
        return

    # Determine workspace based on role
    workspace = 'operations' if is_staff else 'general'

    await ctx.send("🤔 Thinking...")

    async with aiohttp.ClientSession() as session:
        async with session.post(
            f'{DIFY_API_URL}/v1/chat-messages',
            headers={
                'Authorization': f'Bearer {DIFY_API_KEY}',
                'Content-Type': 'application/json'
            },
            json={
                'inputs': {},
                'query': question,
                'response_mode': 'blocking',
                'user': str(ctx.author.id),
                # Dify scopes each request by API key (one app per key);
                # 'workspace' is extra metadata for app-side routing
                'workspace': workspace
            }
        ) as resp:
            if resp.status == 200:
                data = await resp.json()
                answer = data.get('answer', 'No response')

                # Split responses longer than Discord's 2000-character limit
                if len(answer) > 2000:
                    chunks = [answer[i:i+2000] for i in range(0, len(answer), 2000)]
                    for chunk in chunks:
                        await ctx.send(chunk)
                else:
                    await ctx.send(answer)
            else:
                await ctx.send("❌ Error connecting to AI. Please try again.")

bot.run(TOKEN)
EOF
```

### Step 4.4: Configure Bot

```bash
# Create .env file
cat > .env << 'EOF'
DISCORD_TOKEN=<your_bot_token>
DIFY_API_URL=https://ai.firefrostgaming.com
DIFY_API_KEY=<get_from_dify_settings>
EOF
```
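
A small guard (a sketch, not part of the bot itself) can catch a `.env` that still contains unfilled `<placeholders>` before the service starts:

```shell
# check_env fails if any value still looks like an unfilled <placeholder>.
check_env() {
    ! grep -q '<[a-z_]*>' "$1"
}

# Demo with a deliberately unfinished file (illustrative path):
printf 'DISCORD_TOKEN=<your_bot_token>\n' > /tmp/demo.env
check_env /tmp/demo.env && echo "env ok" || echo "placeholders remain"
# prints "placeholders remain"
```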

### Step 4.5: Create Systemd Service

```bash
cat > /etc/systemd/system/firefrost-discord-bot.service << 'EOF'
[Unit]
Description=Firefrost Discord Bot
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/firefrost-discord-bot
Environment="PATH=/opt/firefrost-discord-bot/venv/bin"
ExecStart=/opt/firefrost-discord-bot/venv/bin/python bot.py
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable firefrost-discord-bot
systemctl start firefrost-discord-bot
```

### Step 4.6: Invite Bot to Discord

1. Go to OAuth2 → URL Generator
2. Select scopes: bot, applications.commands
3. Select permissions: Send Messages, Read Message History
4. Copy the generated URL
5. Open it in a browser and invite the bot to the Firefrost Discord

### Step 4.7: Test Bot

In Discord:

```
/ask What is the Frostwall Protocol?
```

Should return an answer from the Operations workspace (staff only).

---

## Phase 5: Testing and Validation (30 minutes)

### Test 1: DERP Backup (Strategic Query)

**Simulate Claude outage:**

1. Load the Qwen model: `ollama run qwen2.5-coder:72b`
2. In the Dify Operations workspace, ask:
   - "Should I deploy Mailcow before or after Frostwall Protocol?"
3. Verify:
   - Response references both task docs
   - Shows dependency understanding
   - Recommends Frostwall first

### Test 2: Discord Bot (Staff Query)

As a staff member in Discord:

```
/ask How many game servers are running?
```

Should return infrastructure details.

### Test 3: Discord Bot (Subscriber Query)

As a subscriber in Discord:

```
/ask What modpacks are available?
```

Should return the modpack list (limited to public info).

### Test 4: Resource Monitoring

```bash
# Check RAM usage with model loaded
free -h
# Should show ~92GB used when Qwen loaded

# Check disk usage
df -h /opt/dify
# Should show ~97GB used

# Check Docker containers
docker ps
# All Dify services should be running
```

---

## Phase 6: Documentation (1 hour)

### Create Usage Guide

Document at `/opt/dify/USAGE-GUIDE.md`:

- When to use Claude (primary)
- When to use DERP (Claude down)
- When to use the Discord bot (routine queries)
- Emergency procedures

### Update Operations Manual

Commit changes to Git:

- Task documentation updated
- Deployment plan complete
- Usage guide created

---

## Success Criteria Checklist

- [ ] Dify deployed and accessible at https://ai.firefrostgaming.com
- [ ] Ollama running with all 3 models loaded
- [ ] Operations workspace indexing complete (416 files)
- [ ] Brainstorming workspace indexing complete
- [ ] DERP backup tested (strategic query works)
- [ ] Discord bot deployed and running
- [ ] Staff can query via Discord (/ask command)
- [ ] Subscribers have limited access
- [ ] Resource usage within TX1 limits (~92GB RAM, ~97GB storage)
- [ ] Documentation complete and committed to Git
- [ ] Zero additional monthly cost confirmed

---

## Rollback Plan

If deployment fails:

```bash
# Stop all services
cd /opt/dify
docker-compose down

# Stop Discord bot
systemctl stop firefrost-discord-bot
systemctl disable firefrost-discord-bot

# Remove installation
rm -rf /opt/dify
rm -rf /opt/firefrost-discord-bot
rm /etc/systemd/system/firefrost-discord-bot.service
systemctl daemon-reload

# Remove Nginx config
rm /etc/nginx/sites-enabled/ai.firefrostgaming.com
rm /etc/nginx/sites-available/ai.firefrostgaming.com
nginx -t && systemctl reload nginx

# Uninstall Ollama (the installer does not ship an uninstall script;
# these are the documented manual removal steps)
systemctl stop ollama
systemctl disable ollama
rm /etc/systemd/system/ollama.service
rm "$(which ollama)"
rm -rf /usr/share/ollama
```

---

**Fire + Frost + Foundation + DERP = True Independence** 💙🔥❄️