Self-Hosted AI Stack - Deployment Plan

Task: Self-Hosted AI Stack on TX1
Location: TX1 Dallas (38.68.14.26)
Total Time: 6-8 hours (3-4 active, rest overnight downloads)
Last Updated: 2026-02-18


Prerequisites

Before Starting

  • SSH access to TX1
  • Docker installed on TX1
  • Docker Compose installed
  • Sufficient storage (~100GB free)
  • No game servers under heavy load (model downloads are bandwidth-intensive)

Domain Configuration

  • DNS A record: ai.firefrostgaming.com → 38.68.14.26
  • SSL certificate ready (Let's Encrypt)
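The checklist above can be run as a quick pre-flight script before touching anything (a sketch: the ~100GB threshold, the /opt mount, and the hostname are taken from this plan; adjust if TX1's layout differs):

```shell
#!/usr/bin/env bash
# Pre-flight checks for the AI stack deployment (sketch).
set -u

# Free space in GiB on the filesystem holding a given path
free_gb() {
  df -BG --output=avail "$1" | tail -1 | tr -dc '0-9'
}

[ "$(free_gb /opt)" -ge 100 ] \
  && echo "OK: storage" || echo "FAIL: need ~100GB free under /opt"

command -v docker >/dev/null \
  && echo "OK: docker" || echo "FAIL: docker missing"

docker compose version >/dev/null 2>&1 || command -v docker-compose >/dev/null \
  && echo "OK: compose" || echo "FAIL: compose missing"

# DNS should already point at TX1 before requesting the certificate
getent hosts ai.firefrostgaming.com >/dev/null \
  && echo "OK: DNS" || echo "WARN: ai.firefrostgaming.com not resolving yet"
```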

Phase 1: Deploy Dify (2-3 hours)

Step 1.1: Create Directory Structure

ssh root@38.68.14.26
cd /opt
mkdir -p dify
cd dify

Step 1.2: Download Dify Docker Files

# The compose file expects its sibling config files (nginx templates,
# volumes, middleware settings), so fetch the whole docker/ directory
# rather than the single file:
git clone --depth 1 https://github.com/langgenius/dify.git /tmp/dify-src
cp -r /tmp/dify-src/docker/. .

Step 1.3: Configure Environment

# Create .env file
cat > .env << 'EOF'
# Dify Configuration
DIFY_VERSION=0.6.0
API_URL=https://ai.firefrostgaming.com
WEB_API_URL=https://ai.firefrostgaming.com

# Database
POSTGRES_PASSWORD=<generate_secure_password>
POSTGRES_DB=dify

# Redis
REDIS_PASSWORD=<generate_secure_password>

# Secret Key (generate with: openssl rand -base64 32)
SECRET_KEY=<generate_secret_key>

# Storage
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/storage
EOF
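The placeholders above can be filled in one pass rather than by hand (a sketch; the key lengths are sensible defaults, not Dify requirements, and it assumes the .env created above sits in the current directory):

```shell
# Generate the three secrets and splice them into .env (sketch).
PG_PW="$(openssl rand -base64 24)"
REDIS_PW="$(openssl rand -base64 24)"
SECRET_KEY="$(openssl rand -base64 32)"

for kv in "POSTGRES_PASSWORD=$PG_PW" "REDIS_PASSWORD=$REDIS_PW" "SECRET_KEY=$SECRET_KEY"; do
  key=${kv%%=*}
  # Anchor on the key name so the two identical placeholders don't collide
  sed -i "s|^${key}=.*|${kv}|" .env 2>/dev/null || true
done
echo "Secrets generated -- keep a copy of .env somewhere safe"
```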

Step 1.4: Deploy Dify

docker-compose up -d

Wait: 5-10 minutes for all services to start

Step 1.5: Verify Deployment

docker-compose ps
# All services should show "Up"

curl http://localhost/health
# Should return: {"status":"ok"}
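Rather than guessing when the 5-10 minute window has passed, the health endpoint can be polled (a minimal sketch; `wait_healthy` and its defaults are this plan's invention, not a Dify tool):

```shell
# Poll a URL until it answers, up to a retry budget (sketch).
wait_healthy() {
  url=$1 tries=${2:-60} delay=${3:-10}
  i=0
  until curl -fsS "$url" >/dev/null 2>&1; do
    i=$((i + 1))
    [ "$i" -ge "$tries" ] && return 1
    sleep "$delay"
  done
}

# Usage on TX1, once `docker-compose up -d` has returned:
#   wait_healthy http://localhost/health && echo "Dify is up"
```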

Step 1.6: Configure Nginx Reverse Proxy

# Host Nginx also needs port 80, so first move Dify's bundled nginx to
# another host port: set EXPOSE_NGINX_PORT=8080 in /opt/dify/.env and
# re-run `docker-compose up -d`, then proxy to it:
cat > /etc/nginx/sites-available/ai.firefrostgaming.com << 'EOF'
server {
    listen 80;
    server_name ai.firefrostgaming.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF

# Enable site
ln -s /etc/nginx/sites-available/ai.firefrostgaming.com /etc/nginx/sites-enabled/
nginx -t
systemctl reload nginx

# Get SSL certificate
certbot --nginx -d ai.firefrostgaming.com

Step 1.7: Initial Configuration

  1. Visit https://ai.firefrostgaming.com
  2. Create admin account (Michael)
  3. Configure workspaces:
    • Operations (infrastructure docs)
    • Brainstorming (creative docs)

Phase 2: Install Ollama and Models (Overnight)

Step 2.1: Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

Step 2.2: Download Models (Overnight - Large Files)

Download Qwen 2.5 Coder 72B:

ollama pull qwen2.5-coder:72b

Size: ~40GB
Time: 2-4 hours (depending on connection)

Download Llama 3.3 70B:

ollama pull llama3.3:70b

Size: ~40GB
Time: 2-4 hours

Download Llama 3.2 Vision 11B:

ollama pull llama3.2-vision:11b

Size: ~7GB
Time: 30-60 minutes

Total download time: 6-8 hours (run overnight)
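Since each pull takes hours, the three downloads above can be queued as one detached job that survives an SSH disconnect (a sketch using the model tags listed above):

```shell
# Queue all three pulls in one background job (sketch).
MODELS="qwen2.5-coder:72b llama3.3:70b llama3.2-vision:11b"

nohup bash -c "for m in $MODELS; do ollama pull \"\$m\"; done" \
  > ollama-pulls.log 2>&1 &
echo "Downloads running as PID $! -- progress: tail -f ollama-pulls.log"

# Next morning, `ollama list` should show all three models.
```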

Step 2.3: Verify Models

ollama list
# Should show all three models

# Test Qwen
ollama run qwen2.5-coder:72b "Write a bash script to check disk space"
# Should generate script

# Test Llama 3.3
ollama run llama3.3:70b "Explain Firefrost Gaming's Fire + Frost philosophy"
# Should respond

# Test Vision
ollama run llama3.2-vision:11b "Describe this image: /path/to/test/image.jpg"
# Should analyze image

Step 2.4: Configure Ollama as Dify Backend

In Dify web interface:

  1. Go to Settings → Model Providers
  2. Add Ollama provider
  3. URL: http://host.docker.internal:11434 (from inside Dify's containers, localhost is the container itself, not the host; on Linux, add extra_hosts: "host.docker.internal:host-gateway" to the api and worker services, or use the Docker bridge IP, typically 172.17.0.1)
  4. Add models:
    • qwen2.5-coder:72b
    • llama3.3:70b
    • llama3.2-vision:11b
  5. Set Qwen as default for coding queries
  6. Set Llama 3.3 as default for general queries

Phase 3: Index Git Repository (1-2 hours)

Step 3.1: Clone Operations Manual to TX1

cd /opt/dify
git clone https://git.firefrostgaming.com/firefrost-gaming/firefrost-operations-manual.git

Step 3.2: Configure Dify Knowledge Base

Operations Workspace:

  1. In Dify, go to Operations workspace
  2. Create Knowledge Base: "Infrastructure Docs"
  3. Upload folder: /opt/dify/firefrost-operations-manual/docs/
  4. Processing: Automatic chunking with Q&A segmentation
  5. Embedding model: Default (all-MiniLM-L6-v2)

Brainstorming Workspace:

  1. Go to Brainstorming workspace
  2. Create Knowledge Base: "Creative Docs"
  3. Upload folder: /opt/dify/firefrost-operations-manual/docs/planning/
  4. Same processing settings

Wait: 30-60 minutes for indexing (416 files)
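Before indexing kicks off, it's worth confirming the clone actually contains the expected corpus (a sketch; `corpus_size` is a hypothetical helper, and 416 is the figure quoted above):

```shell
# Count regular files under a directory tree (sketch).
corpus_size() {
  find "$1" -type f | wc -l
}

corpus_size /opt/dify/firefrost-operations-manual/docs
# Expect a number close to 416; far fewer suggests an incomplete clone
```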

Step 3.3: Test Knowledge Retrieval

In Operations workspace:

  • Query: "What is the Frostwall Protocol?"
  • Should return relevant docs with citations

In Brainstorming workspace:

  • Query: "What is the Terraria branding training arc?"
  • Should return planning docs

Phase 4: Discord Bot (2-3 hours)

Step 4.1: Create Bot on Discord Developer Portal

  1. Go to https://discord.com/developers/applications
  2. Create new application: "Firefrost AI Assistant"
  3. Go to Bot section
  4. Create bot
  5. Copy bot token
  6. Enable Privileged Gateway Intents:
    • Message Content Intent
    • Server Members Intent

Step 4.2: Install Bot Code on TX1

cd /opt
mkdir -p firefrost-discord-bot
cd firefrost-discord-bot

# Create requirements.txt
cat > requirements.txt << 'EOF'
discord.py==2.3.2
aiohttp==3.9.1
python-dotenv==1.0.0
EOF

# Create virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Step 4.3: Create Bot Script

cat > bot.py << 'EOF'
import discord
from discord.ext import commands
import aiohttp
import os
from dotenv import load_dotenv

load_dotenv()

TOKEN = os.getenv('DISCORD_TOKEN')
DIFY_API_URL = os.getenv('DIFY_API_URL')
DIFY_API_KEY = os.getenv('DIFY_API_KEY')

intents = discord.Intents.default()
intents.message_content = True
bot = commands.Bot(command_prefix='/', intents=intents)

@bot.event
async def on_ready():
    print(f'{bot.user} is now running!')

@bot.command(name='ask')
async def ask(ctx, *, question):
    """Ask the AI a question"""
    # Check user roles
    is_staff = any(role.name in ['Staff', 'Admin'] for role in ctx.author.roles)
    is_subscriber = any(role.name == 'Subscriber' for role in ctx.author.roles)
    
    if not (is_staff or is_subscriber):
        await ctx.send("You need Staff or Subscriber role to use this command.")
        return
    
    # Dify routes requests by app API key rather than a "workspace" field;
    # to give staff a different knowledge base than subscribers, create one
    # Dify app per tier and select the matching key here (one key shown).
    api_key = DIFY_API_KEY

    await ctx.send("🤔 Thinking...")

    async with aiohttp.ClientSession() as session:
        async with session.post(
            f'{DIFY_API_URL}/v1/chat-messages',
            headers={
                'Authorization': f'Bearer {api_key}',
                'Content-Type': 'application/json'
            },
            json={
                'inputs': {},                 # required by the API, empty here
                'query': question,
                'response_mode': 'blocking',  # wait for the complete answer
                'conversation_id': '',        # start a fresh conversation
                'user': str(ctx.author.id)
            }
        ) as resp:
            if resp.status == 200:
                data = await resp.json()
                answer = data.get('answer', 'No response')
                
                # Split long responses
                if len(answer) > 2000:
                    chunks = [answer[i:i+2000] for i in range(0, len(answer), 2000)]
                    for chunk in chunks:
                        await ctx.send(chunk)
                else:
                    await ctx.send(answer)
            else:
                await ctx.send("❌ Error connecting to AI. Please try again.")

bot.run(TOKEN)
EOF

Step 4.4: Configure Bot

# Create .env file
cat > .env << 'EOF'
DISCORD_TOKEN=<your_bot_token>
DIFY_API_URL=https://ai.firefrostgaming.com
DIFY_API_KEY=<get_from_dify_settings>
EOF

Step 4.5: Create Systemd Service

cat > /etc/systemd/system/firefrost-discord-bot.service << 'EOF'
[Unit]
Description=Firefrost Discord Bot
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/firefrost-discord-bot
Environment="PATH=/opt/firefrost-discord-bot/venv/bin"
ExecStart=/opt/firefrost-discord-bot/venv/bin/python bot.py
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable firefrost-discord-bot
systemctl start firefrost-discord-bot
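A quick way to confirm the unit came up and stayed up (a sketch; `unit_state` is a hypothetical helper that prints "unknown" where systemd isn't available):

```shell
# Report a systemd unit's active state, degrading gracefully (sketch).
unit_state() {
  state=$(systemctl is-active "$1" 2>/dev/null)
  echo "${state:-unknown}"
}

unit_state firefrost-discord-bot
# If it reports "failed", check recent logs:
#   journalctl -u firefrost-discord-bot -n 20 --no-pager
```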

Step 4.6: Invite Bot to Discord

  1. Go to OAuth2 → URL Generator
  2. Select scopes: bot, applications.commands
  3. Select permissions: Send Messages, Read Message History
  4. Copy generated URL
  5. Open in browser and invite to Firefrost Discord

Step 4.7: Test Bot

In Discord:

/ask What is the Frostwall Protocol?

Should return answer from Operations workspace (staff only)


Phase 5: Testing and Validation (30 minutes)

Test 1: DERP Backup (Strategic Query)

Simulate Claude outage:

  1. Load Qwen model: ollama run qwen2.5-coder:72b
  2. In Dify Operations workspace, ask:
    • "Should I deploy Mailcow before or after Frostwall Protocol?"
  3. Verify:
    • Response references both task docs
    • Shows dependency understanding
    • Recommends Frostwall first

Test 2: Discord Bot (Staff Query)

As staff member in Discord:

/ask How many game servers are running?

Should return infrastructure details

Test 3: Discord Bot (Subscriber Query)

As subscriber in Discord:

/ask What modpacks are available?

Should return modpack list (limited to public info)

Test 4: Resource Monitoring

# Check RAM usage with model loaded
free -h
# Should show ~92GB used when Qwen loaded

# Check disk usage
df -h /opt/dify
# Should show ~97GB used

# Check Docker containers
docker ps
# All Dify services should be running
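The spot checks above can be turned into a cron-able guard (a sketch; the 100 GiB threshold is an assumption, not a measured TX1 limit):

```shell
# Warn when RAM use crosses a threshold while a model is loaded (sketch).
ram_used_gb() {
  free -g | awk '/^Mem:/ {print $3}'
}

if [ "$(ram_used_gb)" -gt 100 ]; then
  echo "WARN: RAM above 100GB -- consider unloading a model from Ollama"
fi
```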

Phase 6: Documentation (1 hour)

Create Usage Guide

Document at /opt/dify/USAGE-GUIDE.md:

  • When to use Claude (primary)
  • When to use DERP (Claude down)
  • When to use Discord bot (routine queries)
  • Emergency procedures

Update Operations Manual

Commit changes to Git:

  • Task documentation updated
  • Deployment plan complete
  • Usage guide created

Success Criteria Checklist

  • Dify deployed and accessible at https://ai.firefrostgaming.com
  • Ollama running with all 3 models loaded
  • Operations workspace indexing complete (416 files)
  • Brainstorming workspace indexing complete
  • DERP backup tested (strategic query works)
  • Discord bot deployed and running
  • Staff can query via Discord (/ask command)
  • Subscribers have limited access
  • Resource usage within TX1 limits (~92GB RAM, ~97GB storage)
  • Documentation complete and committed to Git
  • Zero additional monthly cost confirmed

Rollback Plan

If deployment fails:

# Stop all services and remove their volumes
cd /opt/dify
docker-compose down -v

# Stop Discord bot
systemctl stop firefrost-discord-bot
systemctl disable firefrost-discord-bot

# Remove installation
rm -rf /opt/dify
rm -rf /opt/firefrost-discord-bot
rm /etc/systemd/system/firefrost-discord-bot.service
systemctl daemon-reload

# Remove Nginx config
rm /etc/nginx/sites-enabled/ai.firefrostgaming.com
rm /etc/nginx/sites-available/ai.firefrostgaming.com
nginx -t && systemctl reload nginx

# Uninstall Ollama (it ships no uninstall script; remove it manually)
systemctl stop ollama
systemctl disable ollama
rm -f /etc/systemd/system/ollama.service
systemctl daemon-reload
rm -f /usr/local/bin/ollama
rm -rf /usr/share/ollama   # downloaded models live here by default

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️