
# AI Stack Usage Guide

**Purpose:** Know which AI system to use when
**Last Updated:** 2026-02-18

---

## The Three-Tier System

### Tier 1: Claude Projects (Primary) - **USE THIS FIRST**

**Who:** Michael + Meg
**Where:** claude.ai or Claude app
**Cost:** $20/month (already paying)

**When to use:**
- ✅ **Normal daily operations** (99% of the time)
- ✅ **Strategic decision-making** (deployment order, architecture)
- ✅ **Complex reasoning** (tradeoffs, dependencies)
- ✅ **Session continuity** (remembers context across days)
- ✅ **Best experience** (fastest, most capable)

**What Claude can do:**
- Search entire 416-file operations manual
- Write deployment scripts
- Review infrastructure decisions
- Generate documentation
- Debug issues
- Plan roadmaps

**Example queries:**
- "Should I deploy Mailcow or AI stack first?"
- "Write a script to deploy Frostwall Protocol"
- "What tasks depend on NC1 cleanup?"
- "Help me troubleshoot this Pterodactyl error"

**Limitations:**
- Requires internet connection
- Subject to Anthropic availability

---

### Tier 2: DERP Backup (Emergency Only) - **WHEN CLAUDE IS DOWN**

**Who:** Michael + Meg
**Where:** https://ai.firefrostgaming.com
**Cost:** $0/month (self-hosted on TX1)

**When to use:**
- ❌ **Not for normal operations** (Claude is faster/better)
- ✅ **Anthropic outage** (Claude unavailable for hours)
- ✅ **Emergency infrastructure decisions** (can't wait for Claude)
- ✅ **Critical troubleshooting** (server down, need immediate help)

**What DERP can do:**
- Query indexed operations manual (416 files)
- Strategic reasoning with 128K context
- Infrastructure troubleshooting
- Code generation
- Emergency deployment guidance

**Available models:**
- **Qwen 2.5 Coder 72B** - Infrastructure/coding questions
- **Llama 3.3 70B** - General reasoning
- **Llama 3.2 Vision 11B** - Screenshot analysis

**Example queries:**
- "Claude is down. What's the deployment order for Frostwall?"
- "Emergency: Mailcow not starting. Check logs and diagnose."
- "Need to deploy something NOW. What dependencies are missing?"

**Limitations:**
- Slower inference than Claude
- No session continuity
- Manual model selection
- Uses TX1 resources (~80GB RAM when active)

**How to activate:**
1. Verify Claude is unavailable (try multiple times)
2. Go to https://ai.firefrostgaming.com
3. Select workspace:
   - **Operations** - Infrastructure decisions
   - **Brainstorming** - Creative work
4. Select model:
   - **Qwen 2.5 Coder** - For deployment/troubleshooting
   - **Llama 3.3** - For general questions
5. Ask your question
6. Copy/paste response as needed

**When to deactivate:**
- Claude comes back online
- Emergency resolved
- Free up TX1 RAM for game servers
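
The last bullet may not need a full restart. The sketch below assumes the installed Ollama release provides the `ollama ps` and `ollama stop` subcommands (older releases may not), and that `ollama ps` prints a header row followed by one row per loaded model. The helper names are illustrative, not part of the Firefrost tooling.

```bash
#!/usr/bin/env bash
# Sketch: unload every model Ollama currently holds in memory.
# Assumption: `ollama ps` output has a header row, then one row per
# loaded model with the model name in the first column.

# Print the names of the loaded models, one per line.
loaded_models() {
    ollama ps | awk 'NR > 1 && NF > 0 { print $1 }'
}

# Ask Ollama to unload each loaded model.
unload_all_models() {
    local model
    for model in $(loaded_models); do
        echo "Unloading ${model}"
        ollama stop "${model}"
    done
}
```

After running `unload_all_models` on TX1, `free -h` should show memory returning toward the idle figure.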

---

### Tier 3: Discord Bot (Staff/Subscribers) - **ROUTINE QUERIES**

**Who:** Staff + Subscribers
**Where:** Firefrost Discord server
**Cost:** $0/month (same infrastructure)

**When to use:**
- ✅ **Routine questions** (daily operations)
- ✅ **Quick lookups** (server status, modpack info)
- ✅ **Staff training** (how-to queries)
- ✅ **Subscriber support** (basic info)

**Commands:**

**`/ask [question]`**
- Available to: Staff + Subscribers
- Searches: Operations workspace (staff) or public docs (subscribers)
- Rate limit: 10 queries/hour per user

**Example queries (Staff):**
```
/ask How many game servers are running?
/ask What's the Whitelist Manager deployment status?
/ask How do I restart a Minecraft server?
```

**Example queries (Subscribers):**
```
/ask What modpacks are available?
/ask How do I join a server?
/ask What's the difference between Fire and Frost paths?
```

**Role-based access:**
- **Staff:** Full Operations workspace access
- **Subscribers:** Public documentation only
- **No role:** Cannot use bot

**Limitations:**
- Simple queries only (no complex reasoning)
- No file uploads
- No strategic decisions
- Rate limited
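
The access rules and the rate limit above reduce to two small checks. The sketch below is illustrative only: it is not the bot's actual code, the function names, role strings, and source labels are hypothetical, and it assumes the 10 queries/hour limit is a rolling one-hour window (the guide does not say whether the window is rolling or fixed).

```bash
#!/usr/bin/env bash
# Sketch of the /ask access rules (not the bot's real implementation).

RATE_LIMIT=10        # queries allowed per user...
RATE_WINDOW=3600     # ...per rolling window, in seconds (assumed rolling)

# Map a Discord role to the knowledge source /ask may search.
# Prints the source name; returns 1 when the role has no access.
source_for_role() {
    case "$1" in
        staff)      echo "operations-workspace" ;;
        subscriber) echo "public-docs" ;;
        *)          return 1 ;;
    esac
}

# Decide whether a user may run another query.
# $1 = current Unix time; remaining args = Unix times of earlier queries.
within_rate_limit() {
    local now=$1 recent=0 ts
    shift
    for ts in "$@"; do
        if (( now - ts < RATE_WINDOW )); then
            recent=$(( recent + 1 ))
        fi
    done
    (( recent < RATE_LIMIT ))
}
```

For example, `source_for_role staff` prints `operations-workspace`, while an unknown role prints nothing and returns a non-zero status.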

---

## Decision Tree

```
┌─────────────────────────────────────┐
│    Do you need AI assistance?      │
└─────────────┬───────────────────────┘
              │
              ▼
      ┌───────────────┐
      │ Is it urgent? │
      └───┬───────┬───┘
          │       │
        NO│       │YES
          │       │
          ▼       ▼
    ┌─────────┐ ┌──────────────┐
    │ Claude  │ │ Is Claude    │
    │ working?│ │ available?   │
    └───┬─────┘ └──┬───────┬───┘
        │          │       │
     YES│       YES│       │NO
        │          │       │
        ▼          ▼       ▼
  ┌──────────┐ ┌──────────┐ ┌─────────┐
  │Use Claude│ │Use Claude│ │Use DERP │
  │Projects  │ │Projects  │ │Backup   │
  └──────────┘ └──────────┘ └─────────┘
```

**For staff/subscribers:**
```
┌────────────────────────────┐
│   Simple routine query?    │
└──────────┬─────────────────┘
           │
          YES
           │
           ▼
   ┌──────────────┐
   │ Use Discord  │
   │ Bot: /ask    │
   └──────────────┘
```
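
The two diagrams above can be read as a single function. This is a sketch under assumptions: the role names and tier labels are hypothetical, and the "Claude unavailable, not urgent" case is not drawn in the diagram, so the code assumes the answer is to wait, in line with the rule that DERP is for emergencies only.

```bash
#!/usr/bin/env bash
# Sketch of the decision trees as one function.
# $1 = role: owner (Michael/Meg), staff, or subscriber
# $2 = claude_available: yes or no
# $3 = urgent: yes or no
choose_tier() {
    local role=$1 claude_available=$2 urgent=$3
    case "$role" in
        staff|subscriber)
            echo "discord-bot" ;;
        owner)
            if [ "$claude_available" = "yes" ]; then
                echo "claude-projects"
            elif [ "$urgent" = "yes" ]; then
                echo "derp-backup"
            else
                # Not drawn in the diagram; assumed from "DERP for emergencies only".
                echo "wait-for-claude"
            fi ;;
        *)
            echo "no-access"
            return 1 ;;
    esac
}
```

Note that urgency only matters once Claude is unavailable: both urgent and non-urgent requests go to Claude Projects whenever it is up.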

---

## Emergency Procedures

### Scenario 1: Claude Down, Need Strategic Decision

**Problem:** Anthropic outage, need to deploy something NOW

**Solution:**
1. Verify Claude is truly unavailable (try web + app)
2. Go to https://ai.firefrostgaming.com
3. Log in with Michael's account
4. Select Operations workspace
5. Select Qwen 2.5 Coder model
6. Ask strategic question
7. Copy deployment commands
8. Execute carefully (no session memory!)

**Note:** DERP doesn't remember context. Be explicit in each query.
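
One way to stay explicit is to build every query from the same fields. The helper below is a hypothetical sketch, not an existing Firefrost script. It only formats text to paste into the DERP chat box, and the field names are an assumption about what a self-contained query needs.

```bash
#!/usr/bin/env bash
# Sketch: assemble a self-contained DERP query, since each query starts
# with no memory of earlier ones.
# $1 = goal, $2 = current state, $3 = what has already been tried
build_derp_query() {
    printf 'Goal: %s\nCurrent state: %s\nAlready tried: %s\nAnswer with exact commands and state any assumptions.\n' \
        "$1" "$2" "$3"
}
```

For example, `build_derp_query "Get Mailcow running" "Mailcow not starting" "Restarted the containers once"` prints a four-line prompt that restates the full situation.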

### Scenario 2: Discord Bot Down

**Problem:** Staff reporting bot not responding

**Check status:**
```bash
ssh root@38.68.14.26
systemctl status firefrost-discord-bot
```

**If stopped:**
```bash
systemctl start firefrost-discord-bot
```

**If errors:**
```bash
journalctl -u firefrost-discord-bot -f
# Check for API errors, token issues
```

**If Dify down:**
```bash
cd /opt/dify
docker-compose ps
# If services down:
docker-compose up -d
```
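
The checks above can be folded into one triage helper. This is a sketch that assumes it runs on the bot host; the function name and message wording are illustrative. It relies on `systemctl is-active`, which prints the unit state (`active`, `inactive`, `failed`, and so on) and exits non-zero when the unit is not active.

```bash
#!/usr/bin/env bash
# Sketch: report the bot service state and suggest the next step.
SERVICE="firefrost-discord-bot"

bot_next_step() {
    local state
    # `|| true` keeps a non-active unit from aborting scripts run with `set -e`.
    state=$(systemctl is-active "$SERVICE" 2>/dev/null) || true
    case "$state" in
        active) echo "running: check Dify with docker-compose ps in /opt/dify" ;;
        failed) echo "failed: read logs with journalctl -u $SERVICE -n 50" ;;
        *)      echo "stopped: run systemctl start $SERVICE" ;;
    esac
}
```

Calling `bot_next_step` prints a single line naming which of the steps above to try first.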

### Scenario 3: Model Won't Load

**Problem:** DERP system reports "model unavailable"

**Check Ollama:**
```bash
ollama list
# Should show: qwen2.5-coder:72b, llama3.3:70b, llama3.2-vision:11b
```

**If models missing:**
```bash
# Re-download
ollama pull qwen2.5-coder:72b
ollama pull llama3.3:70b
ollama pull llama3.2-vision:11b
```

**Check RAM:**
```bash
free -h
# If the "available" column shows <90GB, stop game servers temporarily
```
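
The model and RAM checks above can be scripted as a preflight. This is a sketch under assumptions: the function names are hypothetical, `ollama list` is assumed to print a header row with the model name in the first column of each following row, and memory is read from the `MemAvailable` field of `/proc/meminfo` rather than parsed from `free -h`.

```bash
#!/usr/bin/env bash
# Sketch: preflight for the DERP models.
REQUIRED_MODELS="qwen2.5-coder:72b llama3.3:70b llama3.2-vision:11b"

# Read `ollama list` output on stdin; print each required model that is absent.
missing_models() {
    local listing model
    listing=$(awk 'NR > 1 { print $1 }')
    for model in $REQUIRED_MODELS; do
        if ! grep -qxF "$model" <<< "$listing"; then
            echo "$model"
        fi
    done
}

# Print available memory in whole GiB, read from a meminfo-style file
# (defaults to /proc/meminfo, whose MemAvailable value is in kB).
available_gib() {
    awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}
```

Typical use would be `ollama list | missing_models` to see what needs an `ollama pull`, followed by `available_gib` to compare against the RAM threshold above.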

---

## Cost Tracking

### Monthly Costs
- **Claude Projects:** $20/month (primary system)
- **Dify:** $0/month (self-hosted)
- **Ollama:** $0/month (self-hosted)
- **Discord Bot:** $0/month (self-hosted)
- **Total:** $20/month ✅

### Resource Usage (TX1)
- **Storage:** ~97GB (one-time)
- **RAM (active DERP):** ~92GB (temporary)
- **RAM (idle):** <5GB (normal)
- **Bandwidth:** Models downloaded once, minimal ongoing

---

## Performance Expectations

### Claude Projects (Primary)
- **Response time:** 5-30 seconds
- **Quality:** Excellent (GPT-4 class)
- **Context:** Full repo (416 files)
- **Session memory:** Yes

### DERP Backup (Emergency)
- **Response time:** 30-120 seconds (slower than Claude)
- **Quality:** Good (GPT-3.5 to GPT-4 class depending on model)
- **Context:** 128K tokens per query
- **Session memory:** No (each query independent)

### Discord Bot (Routine)
- **Response time:** 10-45 seconds
- **Quality:** Good for simple queries
- **Context:** Knowledge base search
- **Rate limit:** 10 queries/hour per user

---

## Best Practices

### For Michael + Meg:
1. ✅ **Always use Claude Projects first** (best experience)
2. ✅ **Only use DERP for true emergencies** (Claude unavailable)
3. ✅ **Document DERP usage** (so Claude can learn from it later)
4. ✅ **Free TX1 RAM after DERP use** (restart Ollama if needed)

### For Staff:
1. ✅ **Use Discord bot for quick lookups** (fast, simple)
2. ✅ **Ask Michael/Meg for complex questions** (they have Claude)
3. ✅ **Don't abuse rate limits** (10 queries/hour is generous)
4. ✅ **Report bot issues immediately** (don't let it stay broken)

### For Subscribers:
1. ✅ **Use Discord bot for server info** (join instructions, modpacks)
2. ✅ **Don't ask for staff-only info** (bot will decline)
3. ✅ **Be patient** (bot shares resources with staff)

---

## Training & Onboarding

### New Staff Training:
1. Introduce Discord bot commands (`/ask`)
2. Show example queries (moderation, server management)
3. Explain rate limits
4. Explain when to escalate to Michael/Meg

### Subscriber Communication:
1. Announce bot in Discord
2. Pin message with `/ask` command
3. Post example queries in the welcome channel
4. Add an FAQ entry: "What can the bot answer?"

---

**Fire + Frost + Foundation + DERP = True Independence** 💙🔥❄️

**Remember: Claude first, DERP only when necessary, Discord bot for routine queries.**

**Monthly cost: $20 (no increase)**