AI Stack Usage Guide
Purpose: Know which AI system to use when
Last Updated: 2026-02-18
The Three-Tier System
Tier 1: Claude Projects (Primary) - USE THIS FIRST
Who: Michael + Meg
Where: claude.ai or Claude app
Cost: $20/month (already paying)
When to use:
- ✅ Normal daily operations (99% of the time)
- ✅ Strategic decision-making (deployment order, architecture)
- ✅ Complex reasoning (tradeoffs, dependencies)
- ✅ Session continuity (remembers context across days)
- ✅ Best experience (fastest, most capable)
What Claude can do:
- Search entire 416-file operations manual
- Write deployment scripts
- Review infrastructure decisions
- Generate documentation
- Debug issues
- Plan roadmaps
Example queries:
- "Should I deploy Mailcow or AI stack first?"
- "Write a script to deploy Frostwall Protocol"
- "What tasks depend on NC1 cleanup?"
- "Help me troubleshoot this Pterodactyl error"
Limitations:
- Requires internet connection
- Subject to Anthropic availability
Tier 2: DERP Backup (Emergency Only) - WHEN CLAUDE IS DOWN
Who: Michael + Meg
Where: https://ai.firefrostgaming.com
Cost: $0/month (self-hosted on TX1)
When to use:
- ❌ Not for normal operations (Claude is faster/better)
- ✅ Anthropic outage (Claude unavailable for hours)
- ✅ Emergency infrastructure decisions (can't wait for Claude)
- ✅ Critical troubleshooting (server down, need immediate help)
What DERP can do:
- Query indexed operations manual (416 files)
- Strategic reasoning with 128K context
- Infrastructure troubleshooting
- Code generation
- Emergency deployment guidance
Available models:
- Qwen 2.5 Coder 72B - Infrastructure/coding questions
- Llama 3.3 70B - General reasoning
- Llama 3.2 Vision 11B - Screenshot analysis
Example queries:
- "Claude is down. What's the deployment order for Frostwall?"
- "Emergency: Mailcow not starting. Check logs and diagnose."
- "Need to deploy something NOW. What dependencies are missing?"
Limitations:
- Slower inference than Claude
- No session continuity
- Manual model selection
- Uses TX1 resources (~92GB RAM when active)
How to activate:
- Verify Claude is unavailable (try multiple times)
- Go to https://ai.firefrostgaming.com
- Select workspace:
  - Operations - Infrastructure decisions
  - Brainstorming - Creative work
- Select model:
  - Qwen 2.5 Coder - For deployment/troubleshooting
  - Llama 3.3 - For general questions
- Ask question
- Copy/paste response as needed
When to deactivate:
- Claude comes back online
- Emergency resolved
- Free up TX1 RAM for game servers
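Freeing RAM can be sketched as unloading whichever models Ollama still holds in memory. This is a minimal sketch, assuming an Ollama release that provides `ollama ps` and `ollama stop`; the helper names `loaded_models` and `unload_all` are hypothetical and not part of the deployed stack.

```shell
#!/bin/sh
# Hypothetical helpers; assumes an Ollama version with `ollama ps` and `ollama stop`.

# Print the NAME column of `ollama ps`-style output read from stdin,
# skipping the header row and blank lines.
loaded_models() {
    awk 'NR > 1 && NF > 0 { print $1 }'
}

# Unload every model that is currently resident in memory.
unload_all() {
    ollama ps | loaded_models | while read -r model; do
        echo "Unloading $model"
        ollama stop "$model"
    done
}

# Usage on TX1 (not run automatically):
#   unload_all && free -h
```

Parsing is kept separate from the `ollama` calls so the parsing step can be checked without touching the running service.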
Tier 3: Discord Bot (Staff/Subscribers) - ROUTINE QUERIES
Who: Staff + Subscribers
Where: Firefrost Discord server
Cost: $0/month (same infrastructure)
When to use:
- ✅ Routine questions (daily operations)
- ✅ Quick lookups (server status, modpack info)
- ✅ Staff training (how-to queries)
- ✅ Subscriber support (basic info)
Commands:
/ask [question]
- Available to: Staff + Subscribers
- Searches: Operations workspace (staff) or public docs (subscribers)
- Rate limit: 10 queries/hour per user
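The per-user rate limit can be pictured as a sliding one-hour window over a user's past query times. The sketch below is an assumption about how such a check might work, not the bot's actual implementation; `allow_query`, `LIMIT`, and `WINDOW` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical sliding-window check; the bot's real implementation is not documented here.
LIMIT=10        # queries allowed per window (from this guide)
WINDOW=3600     # window length in seconds (one hour)

# Usage: allow_query NOW < timestamps
# Reads one Unix timestamp per line (a user's past queries) from stdin.
# Succeeds (exit 0) if fewer than LIMIT of them fall inside the last WINDOW seconds.
allow_query() {
    now=$1
    recent=$(awk -v now="$now" -v win="$WINDOW" '$1 > now - win { n++ } END { print n + 0 }')
    [ "$recent" -lt "$LIMIT" ]
}
```

A caller would pass the current time and the user's stored timestamps, answer the query when `allow_query` succeeds, and otherwise reply with a "try again later" message.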
Example queries (Staff):
/ask How many game servers are running?
/ask What's the Whitelist Manager deployment status?
/ask How do I restart a Minecraft server?
Example queries (Subscribers):
/ask What modpacks are available?
/ask How do I join a server?
/ask What's the difference between Fire and Frost paths?
Role-based access:
- Staff: Full Operations workspace access
- Subscribers: Public documentation only
- No role: Cannot use bot
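The role rules above amount to a small lookup from Discord role to searchable scope. A minimal sketch, assuming hypothetical role and scope names (`staff`, `subscriber`, `operations`, `public-docs`) that may not match the bot's configuration:

```shell
#!/bin/sh
# Hypothetical mapping from Discord role to searchable scope; names are placeholders.

# Usage: scope_for_role ROLE
# Prints the knowledge scope the role may search.
# Fails (exit 1) when the role is not allowed to use /ask.
scope_for_role() {
    case "$1" in
        staff)      echo "operations" ;;
        subscriber) echo "public-docs" ;;
        *)          echo "denied"; return 1 ;;
    esac
}
```

Defaulting every unrecognized role to `denied` keeps the check fail-closed if new roles are added to the server later.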
Limitations:
- Simple queries only (no complex reasoning)
- No file uploads
- No strategic decisions
- Rate limited
Decision Tree
┌─────────────────────────────────────┐
│ Do you need AI assistance?          │
└─────────────┬───────────────────────┘
              │
              ▼
      ┌───────────────┐
      │ Is it urgent? │
      └───┬───────┬───┘
          │       │
        NO│       │YES
          │       │
          ▼       ▼
    ┌─────────┐ ┌────────────────┐
    │ Claude  │ │ Is Claude      │
    │ working?│ │ available?     │
    └───┬─────┘ └──┬──────────┬──┘
        │          │          │
     YES│       YES│          │NO
        │          │          │
        ▼          ▼          ▼
  ┌──────────┐ ┌──────────┐ ┌─────────┐
  │Use Claude│ │Use Claude│ │Use DERP │
  │Projects  │ │Projects  │ │Backup   │
  └──────────┘ └──────────┘ └─────────┘
For staff/subscribers:
┌────────────────────────────┐
│ Simple routine query?      │
└──────────┬─────────────────┘
           │
          YES
           │
           ▼
    ┌──────────────┐
    │ Use Discord  │
    │ Bot: /ask    │
    └──────────────┘
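The two trees can also be written as a single lookup. This is a sketch only: the function name, argument values, and tier labels are hypothetical, and the `wait-for-claude` result for a non-urgent request during an outage is an assumption drawn from the "Emergency Only" rule for Tier 2, since the diagram does not show that branch.

```shell
#!/bin/sh
# Hypothetical helper that encodes the decision trees; all names are placeholders.

# Usage: choose_tier USER_TYPE CLAUDE_AVAILABLE URGENT
#   USER_TYPE:        owner (Michael/Meg), staff, or subscriber
#   CLAUDE_AVAILABLE: yes or no
#   URGENT:           yes or no
choose_tier() {
    user=$1; claude=$2; urgent=$3
    case "$user" in
        staff|subscriber)
            echo "discord-bot"
            ;;
        owner)
            if [ "$claude" = "yes" ]; then
                echo "claude-projects"
            elif [ "$urgent" = "yes" ]; then
                echo "derp-backup"
            else
                # Assumption: DERP is emergency-only, so non-urgent work waits.
                echo "wait-for-claude"
            fi
            ;;
        *)
            echo "none"
            return 1
            ;;
    esac
}
```

For example, `choose_tier owner no yes` prints `derp-backup`, while `choose_tier staff yes no` prints `discord-bot`.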
Emergency Procedures
Scenario 1: Claude Down, Need Strategic Decision
Problem: Anthropic outage, need to deploy something NOW
Solution:
- Verify Claude is truly unavailable (try web + app)
- Go to https://ai.firefrostgaming.com
- Login with Michael's account
- Select Operations workspace
- Select Qwen 2.5 Coder model
- Ask strategic question
- Copy deployment commands
- Execute carefully (no session memory!)
Note: DERP doesn't remember context. Be explicit in each query.
Scenario 2: Discord Bot Down
Problem: Staff reporting bot not responding
Check status:
ssh root@38.68.14.26
systemctl status firefrost-discord-bot
If stopped:
systemctl start firefrost-discord-bot
If errors:
journalctl -u firefrost-discord-bot -f
# Check for API errors, token issues
If Dify down:
cd /opt/dify
docker-compose ps
# If services down:
docker-compose up -d
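The checks in this scenario can be strung together as a small triage helper. This is a sketch, assuming the service state comes from `systemctl is-active`; the function name `next_step` is hypothetical, and it only prints a suggested command rather than running anything.

```shell
#!/bin/sh
# Hypothetical triage helper for Scenario 2; it only prints a suggested next step.

# Usage: next_step STATE
#   STATE is the output of `systemctl is-active firefrost-discord-bot`.
next_step() {
    case "$1" in
        active)   echo "Bot is running; check Dify: cd /opt/dify && docker-compose ps" ;;
        inactive) echo "Bot is stopped; run: systemctl start firefrost-discord-bot" ;;
        failed)   echo "Bot crashed; run: journalctl -u firefrost-discord-bot -n 50" ;;
        *)        echo "Unexpected state '$1'; run: systemctl status firefrost-discord-bot" ;;
    esac
}

# Usage on TX1 (not run automatically):
#   next_step "$(systemctl is-active firefrost-discord-bot)"
```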
Scenario 3: Model Won't Load
Problem: DERP system reports "model unavailable"
Check Ollama:
ollama list
# Should show: qwen2.5-coder:72b, llama3.3:70b, llama3.2-vision:11b
If models missing:
# Re-download
ollama pull qwen2.5-coder:72b
ollama pull llama3.3:70b
ollama pull llama3.2-vision:11b
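To avoid re-downloading models that are already present, the list check and the pull can be combined. A minimal sketch, assuming `ollama list` prints a header row followed by one model per line with the tag in the first column; `missing_models` and `REQUIRED_MODELS` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper; the model tags are the ones listed in this guide.
REQUIRED_MODELS="qwen2.5-coder:72b llama3.3:70b llama3.2-vision:11b"

# Reads `ollama list`-style output from stdin and prints each required model
# whose tag does not appear in the NAME column.
missing_models() {
    installed=$(awk 'NR > 1 { print $1 }')
    for model in $REQUIRED_MODELS; do
        echo "$installed" | grep -qxF "$model" || echo "$model"
    done
}

# Usage on TX1 (not run automatically):
#   ollama list | missing_models | while read -r m; do ollama pull "$m"; done
```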
Check RAM:
free -h
# If less than 90GB is free, stop some game servers temporarily
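The RAM check can be made mechanical by reading the `free` output instead of eyeballing it. This is a sketch under two assumptions: the installed `free` prints an "available" column (recent procps versions do), and "available" rather than "free" is the right figure to compare, since it counts reclaimable cache. `enough_ram` and `REQUIRED_GB` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical pre-flight check; the 90 GB threshold comes from this guide.
REQUIRED_GB=90

# Reads `free -g`-style output from stdin and succeeds (exit 0) when the
# "available" column (field 7) of the Mem: row is at least REQUIRED_GB.
enough_ram() {
    avail=$(awk '/^Mem:/ { print $7 }')
    [ -n "$avail" ] && [ "$avail" -ge "$REQUIRED_GB" ]
}

# Usage on TX1 (not run automatically):
#   free -g | enough_ram || echo "Not enough RAM for DERP; stop some game servers first"
```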
Cost Tracking
Monthly Costs
- Claude Projects: $20/month (primary system)
- Dify: $0/month (self-hosted)
- Ollama: $0/month (self-hosted)
- Discord Bot: $0/month (self-hosted)
- Total: $20/month ✅
Resource Usage (TX1)
- Storage: ~97GB (one-time)
- RAM (active DERP): ~92GB (temporary)
- RAM (idle): <5GB (normal)
- Bandwidth: Models downloaded once, minimal ongoing
Performance Expectations
Claude Projects (Primary)
- Response time: 5-30 seconds
- Quality: Excellent (GPT-4 class)
- Context: Full repo (416 files)
- Session memory: Yes
DERP Backup (Emergency)
- Response time: 30-120 seconds (slower than Claude)
- Quality: Good (GPT-3.5 to GPT-4 class depending on model)
- Context: 128K tokens per query
- Session memory: No (each query independent)
Discord Bot (Routine)
- Response time: 10-45 seconds
- Quality: Good for simple queries
- Context: Knowledge base search
- Rate limit: 10 queries/hour per user
Best Practices
For Michael + Meg:
- ✅ Always use Claude Projects first (best experience)
- ✅ Only use DERP for true emergencies (Claude unavailable)
- ✅ Document DERP usage (so Claude can learn from it later)
- ✅ Free TX1 RAM after DERP use (restart Ollama if needed)
For Staff:
- ✅ Use Discord bot for quick lookups (fast, simple)
- ✅ Ask Michael/Meg for complex questions (they have Claude)
- ✅ Don't abuse rate limits (10 queries/hour is generous)
- ✅ Report bot issues immediately (don't let it stay broken)
For Subscribers:
- ✅ Use Discord bot for server info (join instructions, modpacks)
- ✅ Don't ask for staff-only info (bot will decline)
- ✅ Be patient (bot shares resources with staff)
Training & Onboarding
New Staff Training:
- Introduce Discord bot commands (/ask)
- Show example queries (moderation, server management)
- Explain rate limits
- When to escalate to Michael/Meg
Subscriber Communication:
- Announce bot in Discord
- Pin message with /ask command
- Example queries in welcome channel
- FAQ: "What can the bot answer?"
Fire + Frost + Foundation + DERP = True Independence 💙🔥❄️
Remember: Claude first, DERP only when necessary, Discord bot for routine queries.
Monthly cost: $20 (no increase)