Commit Graph

2 Commits

Claude | 0fe2753fd8 | 2026-04-11 15:00:11 +00:00
docs: Task #96 deployment log — Gemma 4 live on TX1

Ollama 0.20.5 (updated from 0.16.2; fixed Docker networking)
Model: gemma4:26b-a4b-it-q8_0 (28GB, q8_0 quantization)
Speed: 14.4 tokens/sec, CPU-only
RAM: 93GB/251GB used, 157GB available for game servers
Remaining: connect to Dify as a model provider (web UI step)

Chronicler #78 | firefrost-operations-manual
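For the remaining Dify step, Dify's Ollama provider needs the server's base URL and the model name. A minimal verification sketch, assuming Ollama is listening on its default port 11434 on the TX1 host (the host and port are assumptions; the model tag is from the log above):

```shell
# List installed models; the output should include gemma4:26b-a4b-it-q8_0.
curl http://localhost:11434/api/tags

# One-shot generation against the same endpoint Dify will use.
curl http://localhost:11434/api/generate \
  -d '{"model": "gemma4:26b-a4b-it-q8_0", "prompt": "ping", "stream": false}'
```

If both calls succeed, the base URL (`http://localhost:11434`, or the TX1 address as seen from the Dify container) and model name can be entered in Dify's model-provider settings.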
Claude | e3be9a1dd1 | 2026-04-11 14:44:13 +00:00
docs: Task #96 spec — Gemma 4 Self-Hosted LLM

Full context from Gemini consultation (April 6, 2026).
Gemma 4 26B A4B MoE recommended for TX1 (251GB RAM, CPU-only).
~26GB at q8_0 quantization; zero monthly API cost.
Tightly coupled with Task #93 (Trinity Codex).

Chronicler #78 | firefrost-operations-manual