Self-Hosted AI Stack - Deployment Plan
Task: Self-Hosted AI Stack on TX1
Location: TX1 Dallas (38.68.14.26)
Total Time: 6-8 hours (3-4 active, rest overnight downloads)
Last Updated: 2026-02-18
Prerequisites
Before Starting
- SSH access to TX1
- Docker installed on TX1
- Docker Compose installed
- Sufficient storage (~100GB free)
- No game servers under heavy load (model downloads are bandwidth-intensive)
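The checks above can be scripted as a quick preflight before touching anything. A minimal sketch — the 100GB threshold and the `/opt` mount point are assumptions taken from this plan:

```shell
# Preflight: verify tooling and free space before starting (sketch).
NEED_GB=100   # rough total for models (~87GB) plus Dify volumes
MOUNT=/opt    # install target assumed by this plan

have_gb=$(df --output=avail -BG "$MOUNT" 2>/dev/null | tail -1 | tr -dc '0-9')
if [ "${have_gb:-0}" -ge "$NEED_GB" ]; then
  echo "disk: OK (${have_gb}G free on $MOUNT)"
else
  echo "disk: WARNING (${have_gb:-0}G free, need ${NEED_GB}G)"
fi

for bin in docker docker-compose nginx certbot; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "$bin: OK"
  else
    echo "$bin: MISSING"
  fi
done
```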
Domain Configuration
- DNS A record: ai.firefrostgaming.com → 38.68.14.26
- SSL certificate ready (Let's Encrypt)
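Before requesting the certificate, it is worth confirming the A record has actually propagated; a small sketch (assumes `dig` from dnsutils is installed):

```shell
# Check that the A record resolves to TX1 before running certbot (sketch).
EXPECTED_IP=38.68.14.26
resolved=$(dig +short A ai.firefrostgaming.com 2>/dev/null | head -1)
if [ "$resolved" = "$EXPECTED_IP" ]; then
  echo "DNS OK: ai.firefrostgaming.com -> $resolved"
else
  echo "DNS not ready: got '${resolved:-nothing}', expected $EXPECTED_IP"
fi
```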
Phase 1: Deploy Dify (2-3 hours)
Step 1.1: Create Directory Structure
ssh root@38.68.14.26
cd /opt
mkdir -p dify
cd dify
Step 1.2: Download Dify Docker Compose
wget https://raw.githubusercontent.com/langgenius/dify/main/docker/docker-compose.yaml
Step 1.3: Configure Environment
# Create .env file
cat > .env << 'EOF'
# Dify Configuration
DIFY_VERSION=0.6.0
API_URL=https://ai.firefrostgaming.com
WEB_API_URL=https://ai.firefrostgaming.com
# Database
POSTGRES_PASSWORD=<generate_secure_password>
POSTGRES_DB=dify
# Redis
REDIS_PASSWORD=<generate_secure_password>
# Secret Key (generate with: openssl rand -base64 32)
SECRET_KEY=<generate_secret_key>
# Storage
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=/app/storage
# Port for Dify's bundled Nginx container; must not collide with the host
# Nginx on port 80 (8080 is an arbitrary but conventional choice)
EXPOSE_NGINX_PORT=8080
EOF
Step 1.4: Deploy Dify
docker-compose up -d
Wait: 5-10 minutes for all services to start
Step 1.5: Verify Deployment
docker-compose ps
# All services should show "Up"
curl http://localhost:8080/health
# Should return: {"status":"ok"}
# (8080 assumes Dify's bundled Nginx is remapped via EXPOSE_NGINX_PORT so it
# does not collide with the host Nginx on port 80)
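Rather than guessing at the 5-10 minute startup window, the health endpoint can be polled until it answers. A sketch — the URL and the 8080 mapping are assumptions, so point it at wherever Dify's Nginx is actually exposed:

```shell
# Poll a health URL until it answers, instead of sleeping blindly (sketch).
wait_healthy() {
  url=$1
  tries=${2:-60}                  # default: 60 attempts
  interval=${WAIT_INTERVAL:-10}   # seconds between attempts
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after $((i * interval))s"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for $url" >&2
  return 1
}
# wait_healthy http://localhost:8080/health
```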
Step 1.6: Configure Nginx Reverse Proxy
# Create Nginx config
cat > /etc/nginx/sites-available/ai.firefrostgaming.com << 'EOF'
server {
listen 80;
server_name ai.firefrostgaming.com;
location / {
# Dify's bundled Nginx, remapped off port 80 (proxying to localhost:80 here
# would loop back into this same server block)
proxy_pass http://localhost:8080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
EOF
# Enable site
ln -s /etc/nginx/sites-available/ai.firefrostgaming.com /etc/nginx/sites-enabled/
nginx -t
systemctl reload nginx
# Get SSL certificate
certbot --nginx -d ai.firefrostgaming.com
Step 1.7: Initial Configuration
- Visit https://ai.firefrostgaming.com
- Create admin account (Michael)
- Configure workspaces:
- Operations (infrastructure docs)
- Brainstorming (creative docs)
Phase 2: Install Ollama and Models (Overnight)
Step 2.1: Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
Step 2.2: Download Models (Overnight - Large Files)
Download Qwen 2.5 Coder 72B:
ollama pull qwen2.5-coder:72b
Size: ~40GB
Time: 2-4 hours (depending on connection)
Download Llama 3.3 70B:
ollama pull llama3.3:70b
Size: ~40GB
Time: 2-4 hours
Download Llama 3.2 Vision 11B:
ollama pull llama3.2-vision:11b
Size: ~7GB
Time: 30-60 minutes
Total download time: 6-8 hours (run overnight)
Step 2.3: Verify Models
ollama list
# Should show all three models
# Test Qwen
ollama run qwen2.5-coder:72b "Write a bash script to check disk space"
# Should generate script
# Test Llama 3.3
ollama run llama3.3:70b "Explain Firefrost Gaming's Fire + Frost philosophy"
# Should respond
# Test Vision
ollama run llama3.2-vision:11b "Describe this image: /path/to/test/image.jpg"
# Should analyze image
Step 2.4: Configure Ollama as Dify Backend
In Dify web interface:
- Go to Settings → Model Providers
- Add Ollama provider
- URL: http://localhost:11434
- Add models:
- qwen2.5-coder:72b
- llama3.3:70b
- llama3.2-vision:11b
- Set Qwen as default for coding queries
- Set Llama 3.3 as default for general queries
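Before relying on the Dify integration, Ollama's HTTP API can be exercised directly. A sketch of the request body a client sends to `/api/generate` — the `gen_payload` helper name is mine, and `"stream": false` makes Ollama return a single JSON object instead of a stream:

```shell
# Build the JSON body for Ollama's /api/generate endpoint (sketch).
gen_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

payload=$(gen_payload "qwen2.5-coder:72b" "Say hello in one word")
echo "$payload"
# Once Ollama is running, send it:
# curl -s http://localhost:11434/api/generate -d "$payload"
# Listing pulled models is a quick liveness check:
# curl -s http://localhost:11434/api/tags
```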
Phase 3: Index Git Repository (1-2 hours)
Step 3.1: Clone Operations Manual to TX1
cd /opt/dify
git clone https://git.firefrostgaming.com/firefrost-gaming/firefrost-operations-manual.git
Step 3.2: Configure Dify Knowledge Base
Operations Workspace:
- In Dify, go to Operations workspace
- Create Knowledge Base: "Infrastructure Docs"
- Upload folder: /opt/dify/firefrost-operations-manual/docs/
- Processing: Automatic chunking with Q&A segmentation
- Embedding model: Default (all-MiniLM-L6-v2)
Brainstorming Workspace:
- Go to Brainstorming workspace
- Create Knowledge Base: "Creative Docs"
- Upload folder: /opt/dify/firefrost-operations-manual/docs/planning/
- Same processing settings
Wait: 30-60 minutes for indexing (416 files)
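Before uploading, the corpus size can be sanity-checked against the expected 416 files; a sketch that assumes the docs are markdown:

```shell
# Count the markdown files Dify will be asked to index (sketch).
count_docs() {
  find "$1" -type f -name '*.md' 2>/dev/null | wc -l | tr -d ' '
}
echo "docs to index: $(count_docs /opt/dify/firefrost-operations-manual/docs)"
```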
Step 3.3: Test Knowledge Retrieval
In Operations workspace:
- Query: "What is the Frostwall Protocol?"
- Should return relevant docs with citations
In Brainstorming workspace:
- Query: "What is the Terraria branding training arc?"
- Should return planning docs
Phase 4: Discord Bot (2-3 hours)
Step 4.1: Create Bot on Discord Developer Portal
- Go to https://discord.com/developers/applications
- Create new application: "Firefrost AI Assistant"
- Go to Bot section
- Create bot
- Copy bot token
- Enable Privileged Gateway Intents:
- Message Content Intent
- Server Members Intent
Step 4.2: Install Bot Code on TX1
cd /opt
mkdir -p firefrost-discord-bot
cd firefrost-discord-bot
# Create requirements.txt
cat > requirements.txt << 'EOF'
discord.py==2.3.2
aiohttp==3.9.1
python-dotenv==1.0.0
EOF
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Step 4.3: Create Bot Script
cat > bot.py << 'EOF'
import discord
from discord.ext import commands
import aiohttp
import os
from dotenv import load_dotenv

load_dotenv()
TOKEN = os.getenv('DISCORD_TOKEN')
DIFY_API_URL = os.getenv('DIFY_API_URL')
DIFY_API_KEY = os.getenv('DIFY_API_KEY')

intents = discord.Intents.default()
intents.message_content = True
bot = commands.Bot(command_prefix='/', intents=intents)

@bot.event
async def on_ready():
    print(f'{bot.user} is now running!')

@bot.command(name='ask')
async def ask(ctx, *, question):
    """Ask the AI a question"""
    # Check user roles
    is_staff = any(role.name in ['Staff', 'Admin'] for role in ctx.author.roles)
    is_subscriber = any(role.name == 'Subscriber' for role in ctx.author.roles)
    if not (is_staff or is_subscriber):
        await ctx.send("You need Staff or Subscriber role to use this command.")
        return
    # Determine workspace based on role. Note: Dify routes by app/API key
    # rather than a request field, so staff vs. subscriber routing may need
    # separate API keys per app.
    workspace = 'operations' if is_staff else 'general'
    await ctx.send("🤔 Thinking...")
    async with aiohttp.ClientSession() as session:
        async with session.post(
            f'{DIFY_API_URL}/v1/chat-messages',
            headers={
                'Authorization': f'Bearer {DIFY_API_KEY}',
                'Content-Type': 'application/json'
            },
            json={
                'inputs': {},               # required by Dify, may be empty
                'query': question,
                'response_mode': 'blocking',
                'user': str(ctx.author.id),
                'conversation_id': ''       # empty string starts a new conversation
            }
        ) as resp:
            if resp.status == 200:
                data = await resp.json()
                answer = data.get('answer', 'No response')
                # Discord caps messages at 2000 characters; split long answers
                if len(answer) > 2000:
                    chunks = [answer[i:i+2000] for i in range(0, len(answer), 2000)]
                    for chunk in chunks:
                        await ctx.send(chunk)
                else:
                    await ctx.send(answer)
            else:
                await ctx.send("❌ Error connecting to AI. Please try again.")

bot.run(TOKEN)
EOF
Step 4.4: Configure Bot
# Create .env file
cat > .env << 'EOF'
DISCORD_TOKEN=<your_bot_token>
DIFY_API_URL=https://ai.firefrostgaming.com
DIFY_API_KEY=<get_from_dify_settings>
EOF
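The API key can be smoke-tested before the bot ever starts. A sketch against Dify's `chat-messages` endpoint in blocking mode — `dify_smoke` is a hypothetical helper, and a 200 status means the key is valid:

```shell
# Return the HTTP status Dify gives for a minimal chat request (sketch).
dify_smoke() {
  curl -s -o /dev/null -w '%{http_code}' \
    "$1/v1/chat-messages" \
    -H "Authorization: Bearer $2" \
    -H "Content-Type: application/json" \
    -d '{"inputs": {}, "query": "ping", "response_mode": "blocking", "user": "smoke-test"}'
}
# dify_smoke "https://ai.firefrostgaming.com" "$DIFY_API_KEY"   # expect 200; 401 = bad key
```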
Step 4.5: Create Systemd Service
cat > /etc/systemd/system/firefrost-discord-bot.service << 'EOF'
[Unit]
Description=Firefrost Discord Bot
After=network.target
[Service]
Type=simple
User=root
WorkingDirectory=/opt/firefrost-discord-bot
Environment="PATH=/opt/firefrost-discord-bot/venv/bin"
ExecStart=/opt/firefrost-discord-bot/venv/bin/python bot.py
Restart=always
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable firefrost-discord-bot
systemctl start firefrost-discord-bot
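A quick check that the unit actually came up and stayed up; the journalctl hint is the usual place to look if it did not:

```shell
# Verify the bot service started (sketch).
state=$(systemctl is-active firefrost-discord-bot 2>/dev/null) || true
state=${state:-unknown}
echo "firefrost-discord-bot: $state"
if [ "$state" != "active" ]; then
  echo "inspect logs with: journalctl -u firefrost-discord-bot -n 50 --no-pager"
fi
```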
Step 4.6: Invite Bot to Discord
- Go to OAuth2 → URL Generator
- Select scopes: bot, applications.commands
- Select permissions: Send Messages, Read Message History
- Copy generated URL
- Open in browser and invite to Firefrost Discord
Step 4.7: Test Bot
In Discord:
/ask What is the Frostwall Protocol?
Should return answer from Operations workspace (staff only)
Phase 5: Testing and Validation (30 minutes)
Test 1: DERP Backup (Strategic Query)
Simulate Claude outage:
- Load Qwen model:
ollama run qwen2.5-coder:72b
- In Dify Operations workspace, ask:
- "Should I deploy Mailcow before or after Frostwall Protocol?"
- Verify:
- Response references both task docs
- Shows dependency understanding
- Recommends Frostwall first
Test 2: Discord Bot (Staff Query)
As staff member in Discord:
/ask How many game servers are running?
Should return infrastructure details
Test 3: Discord Bot (Subscriber Query)
As subscriber in Discord:
/ask What modpacks are available?
Should return modpack list (limited to public info)
Test 4: Resource Monitoring
# Check RAM usage with model loaded
free -h
# Should show ~92GB used when Qwen loaded
# Check disk usage
df -h /opt/dify
# Should show ~97GB used
# Check Docker containers
docker ps
# All Dify services should be running
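Since TX1 also hosts game servers, a recurring guard on available memory is worth keeping around. A sketch — the 8GB floor is an assumption to tune, and `MemAvailable` comes from `/proc/meminfo`:

```shell
# Warn when available RAM drops below a floor while a 70B model is loaded (sketch).
MIN_FREE_MB=8192   # floor to keep game servers healthy (assumed value; tune it)
avail_mb=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)
if [ "${avail_mb:-0}" -lt "$MIN_FREE_MB" ]; then
  echo "WARNING: ${avail_mb:-0}MB available, below ${MIN_FREE_MB}MB floor — consider unloading a model"
else
  echo "RAM OK: ${avail_mb}MB available"
fi
```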
Phase 6: Documentation (1 hour)
Create Usage Guide
Document at /opt/dify/USAGE-GUIDE.md:
- When to use Claude (primary)
- When to use DERP (Claude down)
- When to use Discord bot (routine queries)
- Emergency procedures
Update Operations Manual
Commit changes to Git:
- Task documentation updated
- Deployment plan complete
- Usage guide created
Success Criteria Checklist
- Dify deployed and accessible at https://ai.firefrostgaming.com
- Ollama running with all 3 models loaded
- Operations workspace indexing complete (416 files)
- Brainstorming workspace indexing complete
- DERP backup tested (strategic query works)
- Discord bot deployed and running
- Staff can query via Discord (/ask command)
- Subscribers have limited access
- Resource usage within TX1 limits (~92GB RAM, ~97GB storage)
- Documentation complete and committed to Git
- Zero additional monthly cost confirmed
Rollback Plan
If deployment fails:
# Stop all services
cd /opt/dify
docker-compose down
# Stop Discord bot
systemctl stop firefrost-discord-bot
systemctl disable firefrost-discord-bot
# Remove installation
rm -rf /opt/dify
rm -rf /opt/firefrost-discord-bot
rm /etc/systemd/system/firefrost-discord-bot.service
systemctl daemon-reload
# Remove Nginx config
rm /etc/nginx/sites-enabled/ai.firefrostgaming.com
rm /etc/nginx/sites-available/ai.firefrostgaming.com
nginx -t && systemctl reload nginx
# Uninstall Ollama (the installer does not ship an uninstall script; these are
# the documented manual removal steps)
systemctl stop ollama
systemctl disable ollama
rm /etc/systemd/system/ollama.service
rm "$(which ollama)"
rm -r /usr/share/ollama
userdel ollama
groupdel ollama
Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️