Task #96: Gemma 4 Deployment Log

Date: April 11, 2026
Chronicler: #78
Status: Model deployed, Dify connection pending


Deployment Steps Completed

1. Ollama Update

  • Before: Docker container, Ollama 0.16.2
  • Problem: Container had broken bridge networking (no internet access)
  • Fix: Recreated container with --network host
  • After: Ollama 0.20.5 with full network access
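The fix amounts to destroying and recreating the container; a sketch of the sequence (the volume path matches the Docker configuration later in this log, and the `ollama/ollama` image tag is left unpinned as in the original command):

```shell
# Stop and remove the old bridge-networked container. Models and config
# live in the host volume, so they survive the recreation.
docker stop ollama
docker rm ollama

# Recreate with host networking so the container shares the host's
# DNS and routing instead of the broken bridge network
docker run -d \
  --name ollama \
  --network host \
  -v /usr/share/ollama/.ollama:/root/.ollama \
  --restart unless-stopped \
  ollama/ollama
```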

2. Model Pull

  • Tag: gemma4:26b-a4b-it-q8_0 (not 26b-a4b-q8_0, as the Gemini consultation mistakenly gives; see Errata)
  • Size: 28GB
  • Download speed: ~250 MB/s
  • Total download time: ~3 minutes
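The pull is a single command; verifying the exact tag afterwards guards against the typo called out in the Errata:

```shell
# Pull the instruction-tuned q8_0 build (note the `-it` in the tag)
ollama pull gemma4:26b-a4b-it-q8_0

# Confirm the tag and on-disk size landed as expected
ollama list | grep gemma4
```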

3. Inference Test

  • First query response: "I am a large language model, trained by Google."
  • Speed: 14.4 tokens/sec
  • Total time: 13.1s for 175 eval tokens
  • Loading time: First run had ~30 second model load (loading into RAM)
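The tokens/sec figure follows from the stats Ollama reports: the `/api/generate` response includes `eval_count` (generated tokens) and `eval_duration` (nanoseconds), and the rate is their ratio. A sketch of the arithmetic; `eval_duration_ns` below is an assumed value consistent with the logged 14.4 tokens/sec (the 13.1s total time additionally includes prompt evaluation):

```shell
# tokens/sec = eval_count / (eval_duration / 1e9)
# eval_duration_ns is an assumed sample value, not taken from the real response
eval_count=175
eval_duration_ns=12150000000
rate=$(awk -v c="$eval_count" -v d="$eval_duration_ns" \
  'BEGIN { printf "%.1f", c / (d / 1e9) }')
echo "${rate} tokens/sec"
```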

4. RAM Impact

| Metric    | Before | After |
|-----------|--------|-------|
| Total RAM | 251GB  | 251GB |
| Used      | 65GB   | 93GB  |
| Available | 186GB  | 157GB |

Verdict: 28GB used by model, 157GB still available. Game servers unaffected.
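These figures come from the `Mem:` row of `free -g`. A sketch of parsing it; the sample line is a stand-in mirroring the post-deployment numbers, and on the live host you would read the real `free -g` output instead:

```shell
# Columns of `free -g` row 2: Mem: total used free shared buff/cache available
# Sample stand-in line (assumed values for free/shared/buff-cache)
sample_mem_row="Mem:           251          93           1           0         156         157"
read -r _ total used _ _ _ avail <<EOF
$sample_mem_row
EOF
echo "total=${total}G used=${used}G available=${avail}G"
```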

5. Player Impact Check

  • Queried all 20 Minecraft servers via MC ping protocol
  • 0 players online at time of deployment (9:45 AM Saturday)
  • No game server performance impact expected even under load
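The per-server check can be scripted with the `mcstatus` CLI (from the Python `mcstatus` package, which speaks the MC ping protocol). The host and port range below are placeholder assumptions, not the actual server list:

```shell
# Assumption: the 20 servers listen on consecutive ports 25565-25584
# on localhost; substitute the real host/port list.
for port in $(seq 25565 25584); do
  mcstatus "localhost:${port}" status | grep -i players
done
```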

Docker Container Configuration

docker run -d \
  --name ollama \
  --network host \
  -v /usr/share/ollama/.ollama:/root/.ollama \
  --restart unless-stopped \
  ollama/ollama

Key change from original: --network host instead of default bridge networking. Bridge mode had broken DNS/routing in the container.
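A quick way to confirm the host-networked container is healthy, since with `--network host` Ollama binds directly to the host's port 11434:

```shell
# /api/version is part of Ollama's HTTP API; a response here means
# the service is up and reachable on the host network
curl -s http://localhost:11434/api/version
```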


Remaining Steps

  1. Connect to Dify (requires web UI — Michael)

    • codex.firefrostgaming.com → Settings → Model Providers → Ollama
    • Model: gemma4:26b-a4b-it-q8_0
    • Base URL: http://host.docker.internal:11434 or http://172.17.0.1:11434
    • Context Length: 65536
  2. Test RAG queries against operations manual

  3. Benchmark quality — compare against previous Dify responses
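Before saving the provider settings, it may be worth checking which of the two candidate base URLs is reachable from inside Dify's own containers. The container name `dify-api` is an assumption (check `docker ps`), and this presumes `curl` exists in that image:

```shell
# A non-empty model list from /api/tags means that base URL will work
# when entered into Dify's Ollama provider settings
docker exec dify-api curl -s http://172.17.0.1:11434/api/tags
```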


Errata

Gemini consultation typo: The consultation at docs/consultations/gemini-gemma4-selfhosting-2026-04-06.md references gemma4:26b-a4b-q8_0. The correct Ollama tag is gemma4:26b-a4b-it-q8_0 (the it suffix denotes the instruction-tuned variant).


Fire + Frost + Foundation = Where Love Builds Legacy 💙🔥❄️