docs(consult): Complete Claude Code/MCP architecture consultation

4 rounds with Gemini, all questions answered:
- Node.js MCP server on headless HP laptop
- Discord buttons for zero-typing approval (Frostwall-Overwatch bot)
- 10min timeout, auto-deny + notify
- Sudoers as allowlist, sequential queue
- Ephemeral state on failure
- PM2 for process management
- Cloudflare Tunnel with Service Tokens

Architecture complete, ready to build.
Claude
2026-04-08 07:21:19 +00:00
parent 33978f9be7
commit 046483d89c


@@ -224,3 +224,162 @@ claude_executor ALL=(ALL) NOPASSWD: /usr/bin/git -C /opt/arbiter-3.0 pull
3. **Audit Trail:** Logs local, Gitea, or both?
---
## Gemini's Response — Round 2 (April 8, 2026)
**Summary:** Option B for approval (MCP server prompts), Cloudflare Service Tokens, dual-stage logging.
### Key Decisions:
1. **MCP Tool Schema:** Option B — approval at MCP server level, not in payload. Prevents AI from hallucinating `"approved": true`.
2. **Cloudflare Auth:** Service Tokens (CF-Access-Client-Id + CF-Access-Client-Secret) in MCP config.
3. **Audit Trail:** Both — local `.log` file for immediate access, n8n webhook to commit to Gitea for institutional memory.
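A minimal sketch of decision 1, assuming a tool called `run_command` and injected `requestApproval` / `execute` callbacks (all hypothetical names, as is the generic denial string). The point it illustrates: the input schema has no approval field, and anything outside the schema is dropped before the approval step, so a hallucinated `"approved": true` has no effect.

```javascript
// Hypothetical tool definition. There is deliberately no "approved" property.
const runCommandTool = {
  name: 'run_command', // illustrative name, not from the spec
  description: 'Run a shell command on a managed server after human approval.',
  inputSchema: {
    type: 'object',
    properties: {
      host: { type: 'string' },
      command: { type: 'string' },
    },
    required: ['host', 'command'],
    additionalProperties: false,
  },
};

// Keep only the fields the schema defines; extra keys are silently discarded.
function sanitizeArgs(args) {
  const { host, command } = args;
  if (typeof host !== 'string' || typeof command !== 'string') {
    throw new TypeError('host and command must be strings');
  }
  return { host, command };
}

// Approval can only come from requestApproval (for example the Discord
// button flow), never from the payload the model sent.
async function handleToolCall(args, { requestApproval, execute }) {
  const request = sanitizeArgs(args);
  const approved = await requestApproval(request);
  if (!approved) return 'Execution Denied: User did not approve.';
  return execute(request);
}
```

Because the callbacks are injected, the gate can be exercised without Discord or SSH by passing stubs that return `true` or `false`.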
---
## Round 3: Clarification on HP Laptop Usage
**Correction:** HP laptop is headless, on a shelf, 24/7 — NOT Michael's active workstation.
**New Questions:**
1. Approval mechanism for headless setup?
2. Always-on reliability recommendations?
3. Python vs Node.js for MCP server?
---
## Gemini's Response — Round 3 (April 8, 2026)
**Summary:** Node.js, Discord buttons for approval, PM2 for process management.
### Key Decisions:
1. **Tech Stack:** Node.js — familiar (Arbiter), official MCP SDK, stable `ssh2` library.
2. **Approval Mechanism:** Discord bot with interactive buttons in private `#mcp-approvals` channel. Zero typing, works from phone or PC.
3. **Always-On Reliability:**
   - PM2 for process management (`pm2 startup`, `pm2 save`)
   - Power settings: Never sleep, lid close = do nothing
   - BIOS: Restore on AC power loss
---
## Round 4: Final Architecture Questions
Architecture diagram confirmed. Five remaining questions:
1. Discord Bot: New or existing (Arbiter)?
2. Timeout handling for unanswered approvals?
3. Command allowlist vs blocklist?
4. Multiple pending approvals handling?
5. Failure recovery on reboot?
---
## Gemini's Response — Round 4 (April 8, 2026)
**Summary:** All five questions answered definitively. Architecture complete.
### Final Decisions:
1. **Discord Bot:** NEW dedicated bot ("Frostwall-Overwatch")
   - Self-contained on HP laptop
   - Out-of-Band management — can restart Command Center even if Arbiter is down
   - Don't rely on infrastructure you're trying to manage
2. **Timeout:** 10 minutes, auto-deny + notify
   - Returns: `"Execution Denied: Timeout reached. User did not approve."`
   - Discord: Edit message to `[EXPIRED]`, disable buttons
   - Visual audit trail without hanging processes
3. **Command Restrictions:** Sudoers IS the allowlist
   - Node app passes command blindly
   - Linux OS rejects unauthorized commands via `claude_executor` sudo restrictions
   - Keep Node app dumb, let target OS handle security
4. **Multiple Approvals:** Queue sequentially
   - One at a time: Post → Wait → Execute → Return → Next
   - Safer than batch (no wall of code to review)
   - Safer than parallel (no accidental taps, correct order)
5. **Failure Recovery:** Ephemeral state (let it die)
   - Pending approvals orphaned on reboot
   - In-flight SSH commands severed naturally
   - Claude.ai times out, Chronicler re-requests
   - Clean slate, no ghost processes
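Decisions 2, 4, and 5 can be sketched together as an in-memory queue. This is an illustration under assumptions: `ApprovalQueue`, `askHuman`, and `execute` are hypothetical names, the "declined" string is invented, and the Discord side (editing the message to `[EXPIRED]`, disabling buttons) is left to whatever `askHuman` wraps. State lives only in memory, so a reboot drops pending approvals as decided.

```javascript
const TIMEOUT_MESSAGE = 'Execution Denied: Timeout reached. User did not approve.';

// Resolve with the human's decision, or with 'timeout' if nobody answers in time.
function withTimeout(decisionPromise, timeoutMs) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  return Promise.race([decisionPromise, timeout]).finally(() => clearTimeout(timer));
}

class ApprovalQueue {
  constructor({ askHuman, execute, timeoutMs = 10 * 60 * 1000 }) {
    this.askHuman = askHuman; // posts the request, resolves 'approve' or 'deny'
    this.execute = execute;   // runs the command, resolves with its output
    this.timeoutMs = timeoutMs;
    this.tail = Promise.resolve(); // end of the chain of pending requests
  }

  // Chain each request behind the previous one:
  // Post -> Wait -> Execute -> Return -> Next.
  submit(request) {
    const result = this.tail.then(() => this.process(request));
    this.tail = result.catch(() => {}); // a failed request must not block the queue
    return result;
  }

  async process(request) {
    const decision = await withTimeout(this.askHuman(request), this.timeoutMs);
    if (decision === 'timeout') return TIMEOUT_MESSAGE;
    if (decision !== 'approve') return 'Execution Denied: User declined.';
    return this.execute(request);
  }
}
```

Each `submit` call returns a promise for that request's result, so the MCP tool handler can simply `await queue.submit(request)` and return the string to Claude.ai. The second request is not posted until the first has been approved, denied, or timed out.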
---
## Complete Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                             MICHAEL                             │
│                  (Main PC or Phone - anywhere)                  │
└─────────────────────┬───────────────────────────────────────────┘
                      │ Discord Button Click (Approve/Deny)
┌─────────────────────────────────────────────────────────────────┐
│                             DISCORD                             │
│                     #mcp-approvals channel                      │
│                     Frostwall-Overwatch bot                     │
└─────────────────────┬───────────────────────────────────────────┘
                      │ Button interaction event
┌─────────────────────────────────────────────────────────────────┐
│                   HP LAPTOP (Headless, 24/7)                    │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │ Node.js MCP Server (PM2 managed)                        │   │
│   │ - Receives requests from Claude.ai via Cloudflare       │   │
│   │ - Posts approval request to Discord                     │   │
│   │ - Waits for button click (10min timeout)                │   │
│   │ - Executes SSH command on approval                      │   │
│   │ - Returns result to Claude.ai                           │   │
│   │ - Logs locally + fires to n8n for archival              │   │
│   └─────────────────────────────────────────────────────────┘   │
└─────────────────────┬───────────────────────────────────────────┘
                      │ Cloudflare Tunnel (Service Token auth)
┌─────────────────────────────────────────────────────────────────┐
│                            CLAUDE.AI                            │
│                   Chronicler using MCP tools                    │
└─────────────────────────────────────────────────────────────────┘
                      │ SSH via ssh2 library
┌─────────────────────────────────────────────────────────────────┐
│                         TARGET SERVERS                          │
│        Command Center | Dev Panel | (others as allowed)         │
│         (claude_executor user with restricted sudoers)          │
└─────────────────────────────────────────────────────────────────┘
```
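The "Logs locally + fires to n8n" step could look roughly like the sketch below. The function name `audit`, the JSON-lines format, and the `logPath` / `webhookUrl` parameters are assumptions; the spec only says a local `.log` file plus an n8n webhook. Global `fetch` is assumed to be available, which holds on the Node.js 18+ listed in the checklist.

```javascript
const { appendFile } = require('node:fs/promises');

// Hypothetical helper: write one JSON line locally, then try the n8n webhook.
async function audit(entry, { logPath, webhookUrl, fetchImpl = fetch }) {
  const record = { timestamp: new Date().toISOString(), ...entry };

  // Stage 1: local log for immediate access. A failure here is allowed to throw.
  await appendFile(logPath, JSON.stringify(record) + '\n', 'utf8');

  // Stage 2: archival is best effort, so a webhook outage does not fail the command.
  try {
    const res = await fetchImpl(webhookUrl, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(record),
    });
    return { logged: true, archived: res.ok };
  } catch {
    return { logged: true, archived: false };
  }
}
```

Treating the webhook as best effort is a design choice of this sketch: the local file stays complete even when n8n or Gitea is unreachable, and the `archived` flag tells the caller whether a retry is worth scheduling.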
---
## Pre-requisites Checklist
- [ ] Create Discord bot "Frostwall-Overwatch" and get token
- [ ] Create `#mcp-approvals` channel, get channel ID
- [ ] Configure Cloudflare Tunnel on HP laptop
- [ ] Generate Cloudflare Service Token (Client ID + Secret)
- [ ] Create `claude_executor` user on Command Center with restricted sudoers
- [ ] Generate SSH keys for `claude_executor`
- [ ] Install Node.js 18+ on HP laptop
- [ ] Install PM2 globally on HP laptop
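Several checklist items produce secrets or IDs the server needs at startup. A small fail-fast check, sketched here with invented environment variable names, would surface a missed item before the bot tries to connect. The Cloudflare Service Token is not included because, per Round 2, it lives in the MCP client config.

```javascript
// Hypothetical variable names; the spec does not define them.
const REQUIRED_ENV = [
  'DISCORD_BOT_TOKEN',
  'DISCORD_APPROVAL_CHANNEL_ID',
  'SSH_PRIVATE_KEY_PATH',
];

// Return a config object, or throw one error that lists every missing setting.
function loadConfig(env = process.env) {
  const missing = REQUIRED_ENV.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required settings: ${missing.join(', ')}`);
  }
  return {
    discordToken: env.DISCORD_BOT_TOKEN,
    approvalChannelId: env.DISCORD_APPROVAL_CHANNEL_ID,
    sshKeyPath: env.SSH_PRIVATE_KEY_PATH,
    sshUser: 'claude_executor',       // user named in the checklist
    approvalTimeoutMs: 10 * 60 * 1000, // 10 minute approval window
  };
}
```

Calling `loadConfig()` as the first line of `index.js` means PM2 shows a clear startup error instead of a bot that silently never posts.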
---
## Conclusion
Architecture fully specified. Ready to build.
**Next Steps:**
1. Update Task #92 spec with this complete architecture
2. Work through pre-requisites checklist
3. Build `index.js` for MCP server
🔥❄️
— Michael + Claude (Chronicler #69)