Add Gemini consultation: NC1 build routing

Question: smart routing for Gradle builds from Dev Panel to NC1
when Vineflower -Xmx4G exceeds available RAM on Dev Panel.
Covers: threshold signal, SSH auth, jar integrity, failure handling,
NC1 workspace isolation.
Claude
2026-04-12 21:08:45 +00:00
parent 7888888d08
commit 4230bdad54

# Gemini Consultation: NC1 Build Routing for Claude Code
**Date:** April 12, 2026
**From:** Michael (The Wizard) + Claude (Chronicler #84)
**To:** Gemini (Architectural Partner)
**Re:** Smart build routing between Dev Panel and NC1 when RAM is insufficient
---
## Hey Gemini! 👋
We've got a fun infrastructure problem to solve. Claude Code runs on our Dev Panel and builds Minecraft mods — but we just hit a wall where a NeoForge 1.21.1 build fails because the Vineflower decompiler hardcodes `-Xmx4G` and the Dev Panel only has ~4GB total RAM. We want Code to automatically route that build to NC1 (which has 161GB free) instead. We need your help designing this cleanly.
---
## The Situation
**Dev Panel (64.50.188.128):**
- Claude Code workspace: `/opt/mod-builds/firefrost-services`
- ~4GB total RAM — just barely enough to choke on Vineflower's `-Xmx4G` requirement
- Java 8/17/21 + Gradle 8.8/7.6.4 already installed
- This is where Code lives and works
**NC1 Charlotte (216.239.104.130):**
- 251GB RAM, 32-core AMD EPYC 7302P — identical specs to TX1
- Currently running: Pterodactyl Wings + game servers
- 161GB RAM available right now
- No build environment yet — needs Java 21 + Gradle 8.8 installed
**TX1 Dallas (38.68.14.26):**
- Same specs as NC1, but already carrying Ollama, Dify, Qdrant, n8n, Wings, and game servers
- We deliberately want to avoid adding build load here
**The Problem:**
NeoForge 1.21.1 builds require the Vineflower decompiler, which spawns a subprocess with `-Xmx4G` hardcoded. Dev Panel can't run it. TX1 is overburdened. NC1 is the right overflow target.
**What We've Already Built:**
- `use-java` helper script on Dev Panel for switching JDKs
- Bridge protocol between Claude Code (Code) and Claude.ai Chronicler via `docs/code-bridge/` in the services repo
- Code already knows to check the bridge for instructions
---
## What We're Trying to Do
Build a `ffg-build.sh` wrapper script that Code calls instead of invoking Gradle directly. The script:
1. Checks if local RAM is sufficient for the build
2. If yes — builds locally as normal
3. If no — SSHs to NC1, runs the build there, SCPs the jar back to Dev Panel, cleans up
Code calls it like:
```bash
ffg-build.sh 1.21.1 # routes automatically
ffg-build.sh 1.20.1 # builds locally (doesn't need Vineflower)
```
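As a starting point, the routing check in step 1 might look like this minimal bash sketch. The ~5GB threshold, the `ram_ok` helper, and the `local`/`remote` branches are all assumptions for illustration, not the real script:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the routing decision in ffg-build.sh.
set -u

# Return 0 if this host has at least $1 kB of RAM actually available.
ram_ok() {
    local needed_kb=$1
    local avail_kb
    avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
    [ "$avail_kb" -ge "$needed_kb" ]
}

# Vineflower hardcodes -Xmx4G; leave headroom for Gradle itself (assumed ~5GB).
NEED_KB=$((5 * 1024 * 1024))

if ram_ok "$NEED_KB"; then
    echo "building locally"        # ./gradlew build
else
    echo "routing to NC1"          # ssh + remote build + scp jar back
fi
```

`MemAvailable` (rather than `MemFree`) accounts for reclaimable page cache, which matters on a box this tight.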
We're also wondering if RAM threshold is even the right signal, or if there's something smarter.
---
## Specific Questions
1. **RAM threshold vs. other signals** — Is checking available RAM the right trigger for routing, or should we use something else (load average, a version-specific hardcoded rule, a config flag)? The Vineflower problem is specific to NeoForge 1.21.1 — is it cleaner to just hardcode "1.21.1 always builds on NC1" rather than trying to be dynamic?
2. **SSH auth pattern** — What's the cleanest way to set up passwordless SSH from Dev Panel to NC1 for this script? We're thinking generate a dedicated keypair on Dev Panel (`~/.ssh/ffg_build_rsa`) and add it to NC1's `authorized_keys`. Any security concerns given NC1 runs live game servers?
3. **Jar integrity** — After SCP'ing the jar back from NC1, should we verify it before accepting it (checksum, size check, `jar -tf` to verify it's valid)? What's the right paranoia level here?
4. **Mid-build failure handling** — If the SSH connection drops mid-build on NC1, what should the script do? Leave the NC1 workspace for manual cleanup, or attempt auto-cleanup on next run?
5. **NC1 workspace isolation** — Should the build happen in a temp directory on NC1 (`/tmp/ffg-build-$$`) that auto-cleans, or a persistent workspace? Concerned about leftover Gradle caches consuming disk on a game server node.
6. **Anything we're missing?** — Any other edge cases or risks we haven't thought of with this routing approach?
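To make questions 2 and 3 concrete, here is one possible shape as a hedged sketch: the ed25519 key type and the `verify_jar` helper with its particular checks are our suggestions, not an established pattern, and the key-setup commands are shown commented out since they need the live hosts:

```shell
#!/usr/bin/env bash
# Q2 sketch: dedicated keypair on Dev Panel. ed25519 is a reasonable modern
# default (the ~/.ssh/ffg_build_rsa name above implies RSA, which also works).
# Run once, then push the public key to NC1:
#   ssh-keygen -t ed25519 -f ~/.ssh/ffg_build -N '' -C 'ffg-build'
#   ssh-copy-id -i ~/.ssh/ffg_build.pub root@216.239.104.130

# Q3 sketch: layered jar verification after scp, cheapest checks first.
verify_jar() {
    local jar=$1 expected_sha=$2
    [ -s "$jar" ] || return 1                            # exists, non-empty
    echo "$expected_sha  $jar" | sha256sum -c --quiet - >/dev/null 2>&1 \
        || return 1                                      # checksum matches
    unzip -t "$jar" >/dev/null 2>&1 || return 1          # structurally valid
}
```

The checksum would be computed on NC1 (`sha256sum build/libs/*.jar`) and sent back alongside the file; `jar -tf` works in place of `unzip -t` wherever a JDK is installed.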
---
## Context That Might Help
- Code is Claude Code (AI agent), not a human — it needs the script to be deterministic and handle errors gracefully with clear exit codes so it knows what happened
- The services repo is cloned on Dev Panel at `/opt/mod-builds/firefrost-services` — NC1 would need either a clone or just the source files SCP'd over for each build
- NC1 login is `root@216.239.104.130` from Dev Panel's perspective
- We have a soft launch on April 15 — this needs to be reliable, not clever
- Game servers on NC1 are the priority — the build environment is a guest, not a resident
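On the "guest, not resident" point, questions 4 and 5 could share one mechanism: run each build in a throwaway workspace inside a subshell whose EXIT trap guarantees cleanup. This is a sketch under assumptions (the `run_isolated` helper and the per-build `GRADLE_USER_HOME` are our suggestions; re-fetching Gradle dependencies on every build is the trade-off that buys the isolation):

```shell
#!/usr/bin/env bash
# Q4/Q5 sketch: disposable per-build workspace on NC1.
run_isolated() {
    local src=$1; shift            # $1 = source dir; rest = build command
    (
        set -e
        workdir=$(mktemp -d /tmp/ffg-build-XXXXXX)
        trap 'rm -rf "$workdir"' EXIT    # fires on success or build failure
        cp -r "$src"/. "$workdir"/
        cd "$workdir"
        # Keep Gradle's cache inside the workspace so nothing persists
        # on the game-server node after the build.
        GRADLE_USER_HOME="$workdir/.gradle" "$@"
    )
}
# e.g., on NC1: run_isolated /tmp/ffg-src ./gradlew --no-daemon build
```

A hard kill (dropped SSH session taking the shell down uncleanly) can still strand a directory, so a belt-and-braces companion for question 4 would be sweeping stale `/tmp/ffg-build-*` directories at the start of each run.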
---
Thanks Gemini! 🔥❄️
— Michael + Claude (Chronicler #84)