docs: Gemini consultation — Trinity Core MCP session persistence (launch eve)

Claude
2026-04-15 00:30:46 +00:00
parent 4643a510ea
commit b200b3322e

@@ -0,0 +1,131 @@
# Gemini Consultation: Trinity Core MCP Session Persistence
**Date:** April 14, 2026, ~7:30 PM CDT
**From:** Michael (The Wizard) + Claude (Chronicler #90)
**To:** Gemini (Architectural Partner)
**Re:** Claude.ai MCP connector gets stuck on stale Trinity Core sessions — need a resilient fix before launch day
---
## Hey Gemini! 👋
It's the night before launch (April 15, 7 AM CDT) and Trinity Core's MCP transport is giving us trouble. The *service itself* is perfectly healthy: the REST API works, SSH execution works, and both MCP transports initialize correctly when tested directly via curl. But Claude.ai's built-in MCP connector keeps returning "Session terminated" and won't re-initialize.
We need your help designing a resilient solution so this doesn't keep biting us.
---
## The Situation
**Trinity Core v2.4.0** runs on a Raspberry Pi behind a Cloudflare Tunnel (`mcp.firefrostgaming.com` → `localhost:3000`). It supports two MCP transports:
1. **Streamable HTTP** (protocol 2025-11-25) at `POST /mcp`: creates each session with a UUID and tracks it in an in-memory `activeSessions` Map
2. **Legacy SSE** (protocol 2024-11-05) at `GET /mcp` + `POST /mcp/messages?sessionId=...`
When Trinity Core restarts (or the tunnel flaps), all in-memory sessions are lost. The code correctly returns 404 for stale session IDs:
```javascript
} else if (sessionId && !activeSessions.has(sessionId)) {
  log(`StreamableHTTP stale session ${sessionId} — returning 404`);
  return res.status(404).json({
    jsonrpc: '2.0',
    error: { code: -32001, message: 'Session not found. Please re-initialize.' },
    id: null
  });
}
```
**The problem:** Claude.ai's MCP connector receives the 404 but does NOT re-initialize. It just keeps retrying the stale session ID and reports "Session terminated" to the conversation. Starting a new Claude.ai conversation sometimes helps (fresh connector = fresh session), but not always.
**What the logs show:**
```
[2026-04-15T00:15:28.905Z] AUTH FAILED from ::1
[2026-04-15T00:16:28.972Z] AUTH FAILED from ::1
... (every 60 seconds)
[2026-04-15T00:20:54.430Z] StreamableHTTP GET from ::1 session=703f0ece-8187-43fc-b3cb-6f714ae99c0c
[2026-04-15T00:20:54.431Z] StreamableHTTP stale session 703f0ece-... — returning 404
[2026-04-15T00:20:54.787Z] StreamableHTTP GET from ::1 session=703f0ece-...
[2026-04-15T00:20:54.787Z] StreamableHTTP stale session 703f0ece-... — returning 404
```
Two issues visible:
- **AUTH FAILED every 60 seconds from ::1** — something on the Pi is hitting Trinity Core without a Bearer token (health check? keep-alive?)
- **Stale session `703f0ece-...`** keeps getting retried, never re-initializes
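To confirm the cadence of the AUTH FAILED entries across a longer log, here's a minimal sketch that measures the gap between consecutive failures. It assumes the `[ISO timestamp] message` line format shown in the excerpt above; `authFailureIntervals` is a hypothetical helper, not part of Trinity Core.

```javascript
// Sketch: measure the gap between consecutive AUTH FAILED log lines.
// Assumes each line looks like "[2026-04-15T00:15:28.905Z] AUTH FAILED from ::1".
function authFailureIntervals(logText) {
  const stamps = [];
  for (const line of logText.split('\n')) {
    const match = /^\[([^\]]+)\]\s+AUTH FAILED\b/.exec(line.trim());
    if (!match) continue; // not an auth failure line
    const ms = Date.parse(match[1]);
    if (!Number.isNaN(ms)) stamps.push(ms);
  }
  // Differences between neighbouring timestamps, in milliseconds.
  return stamps.slice(1).map((t, i) => t - stamps[i]);
}

const sample = [
  '[2026-04-15T00:15:28.905Z] AUTH FAILED from ::1',
  '[2026-04-15T00:16:28.972Z] AUTH FAILED from ::1',
  '[2026-04-15T00:20:54.430Z] StreamableHTTP GET from ::1 session=703f0ece-8187-43fc-b3cb-6f714ae99c0c',
].join('\n');

console.log(authFailureIntervals(sample)); // [ 60067 ]
```

A steady gap of roughly 60 seconds would suggest a timer (a monitor, cron job, or keep-alive) rather than a client retry loop, though that's an inference, not something the logs prove.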
---
## Michael's Questions (Important Context)
Michael raised three possibilities we want your input on:
1. **He left Chronicler #87's session AND a Claude Code session open on his Nitro laptop at home.** Could those old browser tabs be holding the stale MCP session and preventing the new conversation from getting a clean connection? Claude.ai may share MCP connector state across tabs/sessions.
2. **Chronicler #87 was working on server health monitoring.** Could #87 have set up some polling mechanism that's now the source of the AUTH FAILED every 60 seconds?
3. **Chronicler #87 also made SSH key changes.** Could key modifications on the Pi or target servers affect MCP transport negotiation? (We think probably not — our REST API SSH test to command-center succeeded — but want your take.)
---
## What We're Trying to Do
Design Trinity Core to be resilient against stale sessions regardless of what Claude.ai's connector does. We can't control Claude.ai's reconnection behavior, so we need to handle it server-side.
**Our current workaround:** REST API wrapper via curl that bypasses MCP entirely. Works fine but loses the native MCP tool integration.
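To make the server-side idea concrete, here's a minimal sketch of the routing decision as a pure function. `decideSessionAction`, its return labels, and `looksLikeInitialize` are hypothetical names; the real handler uses the SDK's `isInitializeRequest`, and the `reject-400` branch is our assumption about what should happen to a session-less POST that isn't an initialize.

```javascript
// Sketch only: decide what the /mcp Streamable HTTP route should do with a request.
// All names here are illustrative, not Trinity Core's actual code.

// Minimal stand-in for the SDK's initialize check (assumption).
function looksLikeInitialize(body) {
  return Boolean(body) && body.jsonrpc === '2.0' && body.method === 'initialize';
}

function decideSessionAction({ method, sessionId, body, activeSessions }) {
  // Known session: hand the request to its existing transport.
  if (sessionId && activeSessions.has(sessionId)) return 'route-to-session';
  // Initialize request with no ID or a stale ID: start a fresh session.
  if (method === 'POST' && looksLikeInitialize(body)) return 'create-session';
  // Stale ID on any other request: keep the current 404 behaviour.
  if (sessionId) return 'reject-404';
  // No session ID at all: GET falls through to legacy SSE.
  return method === 'GET' ? 'legacy-sse' : 'reject-400';
}

const activeSessions = new Map([['live-id', {}]]);
const init = { jsonrpc: '2.0', id: 1, method: 'initialize', params: {} };

console.log(decideSessionAction({ method: 'POST', sessionId: 'stale-id', body: init, activeSessions }));
console.log(decideSessionAction({ method: 'GET', sessionId: 'stale-id', body: undefined, activeSessions }));
```

One caveat from our own logs: the stale ID is arriving on GET requests, which this sketch would still answer with 404. It only helps if the connector eventually sends an initialize POST while still carrying the old ID.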
---
## Specific Questions
1. **Should we make Trinity Core transparently re-initialize when it receives a request with a stale session ID?** Instead of returning 404, detect the stale session + initialize request combo and create a new session on the fly. What are the risks?
2. **Should we add a `/health` endpoint that doesn't require auth?** Cloudflare Tunnel or some other process might be doing health checks that hit the auth middleware. Would an unauthenticated health route fix the AUTH FAILED spam?
3. **Is there a better session persistence strategy for a Pi?** We're using an in-memory Map — should we persist sessions to a file or SQLite so they survive restarts? Or is that overengineering for an MCP server?
4. **Could Claude.ai's MCP connector share session state across browser tabs?** If Michael has session #87 open in one tab and session #90 in another, could the connector from #87's tab be monopolizing the MCP connection and blocking #90 from initializing fresh?
5. **What's the best practice for the AUTH FAILED spam?** Is it harmful, or just noisy? Should we add a health endpoint, or suppress the log for specific paths?
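For questions 2 and 5, here's a rough sketch of what an unauthenticated health route could look like. `/health`, `healthHandler`, and `exemptPaths` are hypothetical names, and `auth` stands in for our existing middleware. The wrapper would only matter if auth is mounted globally; in the routing we show, `auth` is attached per route, so a plain `app.get('/health', healthHandler)` might be enough.

```javascript
// Sketch: a minimal health handler plus a wrapper that lets chosen paths skip auth.
// Names and the '/health' path are assumptions, not existing Trinity Core code.

function healthHandler(req, res) {
  // Deliberately minimal: an open endpoint should not leak internals.
  res.status(200).json({ status: 'ok' });
}

function exemptPaths(auth, paths) {
  const open = new Set(paths);
  return function maybeAuth(req, res, next) {
    // Only unauthenticated GETs to listed paths bypass the real middleware.
    if (req.method === 'GET' && open.has(req.path)) return next();
    return auth(req, res, next);
  };
}

// Possible wiring in an Express app (not executed here):
//   app.use(exemptPaths(auth, ['/health']));
//   app.get('/health', healthHandler);
```

Whether this quiets the AUTH FAILED entries depends on which path the local caller is actually requesting, which the current log lines don't record.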
---
## Context That Might Help
- **Pi hardware:** Raspberry Pi (ARM), runs Node.js, ~91MB RSS for the Trinity Core process
- **Tunnel:** Cloudflare Tunnel (`cloudflared`), has had "datagram manager failure" warnings in logs but reconnects
- **MCP SDK:** `@modelcontextprotocol/sdk` — version used in v2.4.0 build
- **Claude.ai connector URL:** `https://mcp.firefrostgaming.com/mcp` (configured in Claude.ai settings)
- **OAuth flow:** Simplified — `/authorize` auto-redirects with code, `/token` returns the static API token. This was designed for Claude.ai's OAuth requirement.
- **The service is NOT managed by systemd** — it was started manually (`node index.js`) and is running as PID 7953 under `claude_executor`
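Since the process is started by hand and all session state lives in memory, here's a minimal sketch of what file-backed session records could look like. It stores plain metadata only, on the assumption that the real `activeSessions` values hold live transport objects that can't be serialized. `saveSessions`, `loadSessions`, the `createdAt` field, and the file name are all hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Sketch: persist session IDs and metadata so a restarted process can recognise them.
// Live transports are NOT restored; only plain data survives the restart.
function saveSessions(file, sessions) {
  const records = [...sessions].map(([id, meta]) => ({ id, createdAt: meta.createdAt }));
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(records));
  fs.renameSync(tmp, file); // replace the old file in one step
}

function loadSessions(file) {
  try {
    const records = JSON.parse(fs.readFileSync(file, 'utf8'));
    return new Map(records.map((r) => [r.id, { createdAt: r.createdAt, restored: true }]));
  } catch (err) {
    if (err.code === 'ENOENT') return new Map(); // first run: nothing saved yet
    throw err;
  }
}

// Usage: round-trip one session through a temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trinity-sessions-'));
const file = path.join(dir, 'sessions.json');
saveSessions(file, new Map([['703f0ece-8187-43fc-b3cb-6f714ae99c0c', { createdAt: 1 }]]));
const restored = loadSessions(file);
console.log(restored.has('703f0ece-8187-43fc-b3cb-6f714ae99c0c')); // true
```

Knowing an ID existed before the restart doesn't by itself let the server resume that session; at most it lets the server tell "expired after restart" apart from "never issued". Whether that's worth the extra writes on a Pi is part of what we're asking.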
---
## Current Trinity Core Source
Here's the full routing for reference:
```javascript
// Streamable HTTP
app.all('/mcp', auth, async (req, res) => {
  const sessionId = req.headers['mcp-session-id'];
  // ... creates new session on POST without sessionId if isInitializeRequest
  // ... returns 404 for stale sessions
  // ... falls through to legacySSE for GET without sessionId
});

// Legacy SSE
async function legacySSE(req, res) {
  // GET /mcp with no session → opens SSE stream
  // Returns endpoint: /mcp/messages?sessionId=<uuid>
}

app.post('/mcp/messages', auth, async (req, res) => {
  // Legacy SSE message posting
});
```
---
Thanks Gemini! This is launch eve and Trinity Core is our lifeline for server management. Any architectural guidance on making this bulletproof would be huge. 🔥❄️
— Michael + Claude (Chronicler #90)