firefrost-gaming/firefrost-operations-manual

The Chronicler 96f20e8715 Task #9 : Rewrite AI Stack architecture for DERP compliance

Complete rewrite of self-hosted AI stack (Task #9) with new DERP-compliant architecture:

CHANGES:
- Architecture: AnythingLLM+OpenWebUI → Dify+Ollama (DERP-compliant)
- Cost model: $0/month additional (self-hosted on TX1, no external APIs)
- Usage tiers: Claude Projects (primary) → DERP backup (emergency) → Discord bots (staff/subscribers)
- Time estimate: 8-12hrs → 6-8hrs (more focused deployment)
- Resource allocation: 97GB storage, 92GB RAM when active (vs 150GB/110GB)

NEW DOCUMENTATION:
- README.md: Complete architecture rewrite with three-tier usage model
- deployment-plan.md: Step-by-step deployment (6 phases, all commands included)
- usage-guide.md: Decision tree for when to use Claude vs DERP vs bots
- resource-requirements.md: TX1 capacity planning, monitoring, disaster recovery

KEY FEATURES:
- Zero additional monthly cost (beyond existing $20 Claude Pro)
- True DERP compliance (fully self-hosted when Claude unavailable)
- Knowledge graph RAG (indexes entire 416-file repo)
- Discord bot integration (role-based staff/subscriber access)
- Emergency procedures documented
- Capacity planning for growth (up to 18 game servers)

MODELS:
- Qwen 2.5 Coder 72B (infrastructure/coding, 128K context)
- Llama 3.3 70B (general reasoning, 128K context)
- Llama 3.2 Vision 11B (screenshot analysis)

Updated tasks.md summary to reflect new architecture.

Status: Ready for deployment (pending medical clearance)

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️

2026-02-18 17:27:25 +00:00

AI Stack Usage Guide

Purpose: Know which AI system to use when
Last Updated: 2026-02-18

The Three-Tier System

Tier 1: Claude Projects (Primary) - USE THIS FIRST

Who: Michael + Meg
Where: claude.ai or Claude app
Cost: $20/month (already paying)

When to use:

✅ Normal daily operations (99% of the time)
✅ Strategic decision-making (deployment order, architecture)
✅ Complex reasoning (tradeoffs, dependencies)
✅ Session continuity (remembers context across days)
✅ Best experience (fastest, most capable)

What Claude can do:

Search entire 416-file operations manual
Write deployment scripts
Review infrastructure decisions
Generate documentation
Debug issues
Plan roadmaps

Example queries:

"Should I deploy Mailcow or AI stack first?"
"Write a script to deploy Frostwall Protocol"
"What tasks depend on NC1 cleanup?"
"Help me troubleshoot this Pterodactyl error"

Limitations:

Requires internet connection
Subject to Anthropic availability

Tier 2: DERP Backup (Emergency Only) - WHEN CLAUDE IS DOWN

Who: Michael + Meg
Where: https://ai.firefrostgaming.com
Cost: $0/month (self-hosted on TX1)

When to use:

❌ Not for normal operations (Claude is faster/better)
✅ Anthropic outage (Claude unavailable for hours)
✅ Emergency infrastructure decisions (can't wait for Claude)
✅ Critical troubleshooting (server down, need immediate help)

What DERP can do:

Query indexed operations manual (416 files)
Strategic reasoning with 128K context
Infrastructure troubleshooting
Code generation
Emergency deployment guidance

Available models:

Qwen 2.5 Coder 72B - Infrastructure/coding questions
Llama 3.3 70B - General reasoning
Llama 3.2 Vision 11B - Screenshot analysis

Example queries:

"Claude is down. What's the deployment order for Frostwall?"
"Emergency: Mailcow not starting. Check logs and diagnose."
"Need to deploy something NOW. What dependencies are missing?"

Limitations:

Slower inference than Claude
No session continuity
Manual model selection
Uses TX1 resources (~92GB RAM when active)

How to activate:

1. Verify Claude is unavailable (try multiple times)
2. Go to https://ai.firefrostgaming.com
3. Select workspace:
   - Operations - Infrastructure decisions
   - Brainstorming - Creative work
4. Select model:
   - Qwen 2.5 Coder - For deployment/troubleshooting
   - Llama 3.3 - For general questions
5. Ask question
6. Copy/paste response as needed

When to deactivate:

Claude comes back online
Emergency resolved
Free up TX1 RAM for game servers
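
Freeing the RAM can be scripted. The sketch below assumes an Ollama release that provides the `ollama ps` and `ollama stop` subcommands and that the model name is the first column of `ollama ps` output; the function name is illustrative and not part of the deployment.

```shell
# unload_loaded_models: stop every model that `ollama ps` reports as loaded,
# which releases the RAM it holds.
# Assumption: the model name is the first column, after one header line.
unload_loaded_models() {
    ollama ps | awk 'NR > 1 { print $1 }' | while read -r model; do
        # Redirect stdin so the command cannot consume the model list.
        ollama stop "$model" < /dev/null
    done
}
```

Run `unload_loaded_models` on TX1 once Claude is back, then confirm with `free -h` that idle usage has dropped again.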

Tier 3: Discord Bot (Staff/Subscribers) - ROUTINE QUERIES

Who: Staff + Subscribers
Where: Firefrost Discord server
Cost: $0/month (same infrastructure)

When to use:

✅ Routine questions (daily operations)
✅ Quick lookups (server status, modpack info)
✅ Staff training (how-to queries)
✅ Subscriber support (basic info)

Commands:

/ask [question]

Available to: Staff + Subscribers
Searches: Operations workspace (staff) or public docs (subscribers)
Rate limit: 10 queries/hour per user

Example queries (Staff):

/ask How many game servers are running?
/ask What's the Whitelist Manager deployment status?
/ask How do I restart a Minecraft server?

Example queries (Subscribers):

/ask What modpacks are available?
/ask How do I join a server?
/ask What's the difference between Fire and Frost paths?

Role-based access:

Staff: Full Operations workspace access
Subscribers: Public documentation only
No role: Cannot use bot
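
The role mapping above can be sketched as a lookup. This is only an illustration: the role and workspace identifiers below are hypothetical labels, and the real bot may resolve Discord roles differently.

```shell
# workspace_for_role ROLE
# Prints the knowledge source a role may search, or fails for unknown roles.
# "staff", "subscriber", "operations" and "public-docs" are illustrative names.
workspace_for_role() {
    case "$1" in
        staff)      echo "operations" ;;
        subscriber) echo "public-docs" ;;
        *)          return 1 ;;   # no role: cannot use the bot
    esac
}
```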

Limitations:

Simple queries only (no complex reasoning)
No file uploads
No strategic decisions
Rate limited
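
This guide does not describe how the bot enforces its limit of 10 queries/hour per user. One possible approach is a sliding one-hour window, sketched below; the function name, the log file format, and the assumption that user identifiers contain no spaces are all hypothetical.

```shell
# allow_query USER NOW_EPOCH LOG_FILE
# Sliding-window limiter: succeeds if USER made fewer than 10 queries in the
# 3600 seconds before NOW_EPOCH, and records the new query on success.
# LOG_FILE holds one "user timestamp" pair per line.
allow_query() {
    user=$1
    now=$2
    log=$3
    touch "$log"
    # Count this user's entries newer than one hour ago.
    recent=$(awk -v u="$user" -v cutoff=$((now - 3600)) \
        '$1 == u && $2 > cutoff { n++ } END { print n + 0 }' "$log")
    [ "$recent" -lt 10 ] || return 1
    echo "$user $now" >> "$log"
}
```

A caller would pass the current time, for example `allow_query "$user_id" "$(date +%s)" /tmp/ask.log`, and decline the query when the function fails.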

Decision Tree

┌─────────────────────────────────────┐
│    Do you need AI assistance?      │
└─────────────┬───────────────────────┘
              │
              ▼
      ┌───────────────┐
      │ Is it urgent? │
      └───┬───────┬───┘
          │       │
        NO│       │YES
          │       │
          ▼       ▼
    ┌─────────┐ ┌──────────────┐
    │ Claude  │ │ Is Claude    │
    │ working?│ │ available?   │
    └───┬─────┘ └──┬───────┬───┘
        │          │       │
     YES│       YES│       │NO
        │          │       │
        ▼          ▼       ▼
  ┌──────────┐ ┌──────────┐ ┌─────────┐
  │Use Claude│ │Use Claude│ │Use DERP │
  │Projects  │ │Projects  │ │Backup   │
  └──────────┘ └──────────┘ └─────────┘
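
The same logic for Michael + Meg can be written as a small function. This is a sketch: the tier labels are illustrative, and the "wait-for-claude" outcome is an assumption, since the diagram shows no branch for a non-urgent request while Claude is down (the guide elsewhere reserves DERP for emergencies).

```shell
# pick_tier URGENT CLAUDE_UP
#   URGENT     "yes" or "no"  (is the request urgent?)
#   CLAUDE_UP  "yes" or "no"  (is Claude reachable?)
# Prints the tier to use.
pick_tier() {
    urgent=$1
    claude_up=$2
    if [ "$claude_up" = "yes" ]; then
        echo "claude-projects"      # Claude first, urgent or not
    elif [ "$urgent" = "yes" ]; then
        echo "derp-backup"          # emergency and Claude is down
    else
        echo "wait-for-claude"      # assumption: not shown in the diagram
    fi
}
```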

For staff/subscribers:

┌────────────────────────────┐
│   Simple routine query?    │
└──────────┬─────────────────┘
           │
          YES
           │
           ▼
   ┌──────────────┐
   │ Use Discord  │
   │ Bot: /ask    │
   └──────────────┘

Emergency Procedures

Scenario 1: Claude Down, Need Strategic Decision

Problem: Anthropic outage, need to deploy something NOW

Solution:

1. Verify Claude is truly unavailable (try web + app)
2. Go to https://ai.firefrostgaming.com
3. Log in with Michael's account
4. Select Operations workspace
5. Select Qwen 2.5 Coder model
6. Ask strategic question
7. Copy deployment commands
8. Execute carefully (no session memory!)

Note: DERP doesn't remember context. Be explicit in each query.

Scenario 2: Discord Bot Down

Problem: Staff reporting bot not responding

Check status:

ssh root@38.68.14.26
systemctl status firefrost-discord-bot

If stopped:

systemctl start firefrost-discord-bot

If errors:

journalctl -u firefrost-discord-bot -f
# Check for API errors, token issues

If Dify down:

cd /opt/dify
docker-compose ps
# If services down:
docker-compose up -d
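
The first two checks can be combined into one helper. A minimal sketch, assuming the systemd unit name used above and a shell with permission to start services; the function names are illustrative.

```shell
# bot_state: print the unit's state ("active", "inactive", "failed", ...).
bot_state() {
    systemctl is-active firefrost-discord-bot
}

# ensure_bot_running: start the bot only when it is not already active.
ensure_bot_running() {
    # is-active exits non-zero when the unit is not active, so ignore the
    # exit status and rely on the printed state instead.
    state=$(bot_state) || true
    if [ "$state" = "active" ]; then
        echo "bot: active"
    else
        echo "bot: $state, starting"
        systemctl start firefrost-discord-bot
    fi
}
```

If the bot is reported active but still does not respond, continue with the journalctl and Dify checks above.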

Scenario 3: Model Won't Load

Problem: DERP system reports "model unavailable"

Check Ollama:

ollama list
# Should show: qwen2.5-coder:72b, llama3.3:70b, llama3.2-vision:11b

If models missing:

# Re-download
ollama pull qwen2.5-coder:72b
ollama pull llama3.3:70b
ollama pull llama3.2-vision:11b
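
Each of these models is a large download, so it may be worth pulling only what is actually missing. A sketch, assuming the model tag is the first column of `ollama list` output after one header line; the function names are illustrative.

```shell
# Model tags taken from the expected `ollama list` output above.
EXPECTED_MODELS="qwen2.5-coder:72b llama3.3:70b llama3.2-vision:11b"

# missing_models: print each expected tag that `ollama list` does not show.
missing_models() {
    installed=$(ollama list | awk 'NR > 1 { print $1 }')
    for model in $EXPECTED_MODELS; do
        # -x matches the whole line, -F treats the tag as a literal string.
        echo "$installed" | grep -qxF "$model" || echo "$model"
    done
}

# pull_missing_models: download only the tags reported as missing.
pull_missing_models() {
    for model in $(missing_models); do
        ollama pull "$model"
    done
}
```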

Check RAM:

free -h
# If the "available" column shows <90GB, stop game servers temporarily
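
The RAM check can also be done without reading the table by eye. A sketch, assuming Linux's /proc/meminfo and treating the 90GB threshold as 90 x 1024 x 1024 kB; the function names are illustrative.

```shell
# ram_ok AVAILABLE_KB [NEEDED_GB]
# Succeeds when AVAILABLE_KB is at least NEEDED_GB (default 90).
ram_ok() {
    available_kb=$1
    needed_gb=${2:-90}
    [ "$available_kb" -ge $((needed_gb * 1024 * 1024)) ]
}

# available_kb: read MemAvailable (in kB) from /proc/meminfo.
available_kb() {
    awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}
```

For example: `ram_ok "$(available_kb)" || echo "Not enough free RAM for a 70B model"`.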

Cost Tracking

Monthly Costs

Claude Projects: $20/month (primary system)
Dify: $0/month (self-hosted)
Ollama: $0/month (self-hosted)
Discord Bot: $0/month (self-hosted)
Total: $20/month ✅

Resource Usage (TX1)

Storage: ~97GB (one-time)
RAM (active DERP): ~92GB (temporary)
RAM (idle): <5GB (normal)
Bandwidth: Models downloaded once, minimal ongoing

Performance Expectations

Claude Projects (Primary)

Response time: 5-30 seconds
Quality: Excellent (GPT-4 class)
Context: Full repo (416 files)
Session memory: Yes

DERP Backup (Emergency)

Response time: 30-120 seconds (slower than Claude)
Quality: Good (GPT-3.5 to GPT-4 class depending on model)
Context: 128K tokens per query
Session memory: No (each query independent)

Discord Bot (Routine)

Response time: 10-45 seconds
Quality: Good for simple queries
Context: Knowledge base search
Rate limit: 10 queries/hour per user

Best Practices

For Michael + Meg:

✅ Always use Claude Projects first (best experience)
✅ Only use DERP for true emergencies (Claude unavailable)
✅ Document DERP usage (so the decisions can be shared with Claude later)
✅ Free TX1 RAM after DERP use (restart Ollama if needed)

For Staff:

✅ Use Discord bot for quick lookups (fast, simple)
✅ Ask Michael/Meg for complex questions (they have Claude)
✅ Don't abuse rate limits (10 queries/hour is generous)
✅ Report bot issues immediately (don't let it stay broken)

For Subscribers:

✅ Use Discord bot for server info (join instructions, modpacks)
✅ Don't ask for staff-only info (bot will decline)
✅ Be patient (bot shares resources with staff)

Training & Onboarding

New Staff Training:

Introduce Discord bot commands (/ask)
Show example queries (moderation, server management)
Explain rate limits
Explain when to escalate to Michael/Meg

Subscriber Communication:

Announce bot in Discord
Pin message with /ask command
Post example queries in welcome channel
FAQ: "What can the bot answer?"

Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️

Remember: Claude first, DERP only when necessary, Discord bot for routine queries.

Monthly cost: $20 (no increase)
