# Architecture Decision Records
**Document ID:** FFG-REF-001
**Version:** 2.0
**Created:** February 9, 2026
**Last Updated:** February 12, 2026 (9:00 AM CST)
**Author:** Michael Krause
**Last Updated By:** The Chronicler
**Status:** 🟢 CURRENT
**Review Date:** Quarterly
---
## ADR-001: Management Services on VPS, NOT Dedicated Servers
**Date:** February 9, 2026
**Status:** ✅ IMPLEMENTED
**Decision:** Deploy all management services (Gitea, Uptime Kuma, MkDocs, Code-Server, Automation, Wiki.js, NextCloud) on VPS infrastructure (Command Center + Ghost), NOT on dedicated game servers (TX1/NC1).
**Rationale:**
1. Game servers need dedicated resources — no management overhead competing with player experience
2. Keeps Command Center clean for future Frostwall DDoS protection (GRE hub, Cloudflare integration)
3. Security isolation — management plane separate from game plane
4. Cost-effective — VPS for management, bare metal for performance
**Current Layout:**
- Command Center (Dallas VPS): Gitea, Uptime Kuma, Code-Server, Automation
- Ghost (Chicago VPS): MkDocs, Wiki.js (x2), NextCloud
- TX1/NC1 (Dedicated): Game servers ONLY
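
The placement rule can be expressed as a small check. The following is a minimal sketch, not part of the deployed tooling: the host keys (`command-center`, `ghost`, `tx1`, `nc1`), the `LAYOUT` structure, and the `violations` helper are hypothetical names chosen for illustration. Only management services are tracked, so the dedicated hosts carry empty lists.

```python
# Hypothetical encoding of the Current Layout; host keys are illustrative labels.
LAYOUT = {
    "command-center": {
        "kind": "vps",
        "services": ["Gitea", "Uptime Kuma", "Code-Server", "Automation"],
    },
    "ghost": {
        "kind": "vps",
        "services": ["MkDocs", "Wiki.js (subscribers)", "Wiki.js (staff)", "NextCloud"],
    },
    # Dedicated hosts run game servers only, so no management services are listed.
    "tx1": {"kind": "dedicated", "services": []},
    "nc1": {"kind": "dedicated", "services": []},
}

MANAGEMENT_SERVICES = {
    "Gitea", "Uptime Kuma", "MkDocs", "Code-Server",
    "Automation", "Wiki.js", "NextCloud",
}


def violations(layout):
    """Return (host, service) pairs where a management service sits on a dedicated server."""
    found = []
    for host, info in layout.items():
        if info["kind"] != "dedicated":
            continue
        for service in info["services"]:
            # Strip an instance suffix such as " (staff)" before comparing.
            base = service.split(" (")[0]
            if base in MANAGEMENT_SERVICES:
                found.append((host, service))
    return found
```

An empty result from `violations(LAYOUT)` means the layout still follows this ADR; any returned pair names a service that should move back to a VPS.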
---
## ADR-002: NC1/TX1 Inter-Datacenter Routing
**Date:** February 9, 2026
**Status:** ✅ RESOLVED
**Original Limitation:** NC1 (Charlotte) and TX1 (Dallas) could not communicate directly.
**Resolution:** Breezehost added a route on their infrastructure (Ticket #5ae82fd3, Feb 9, 2026). Brandon E: "Just needed a route added on our end."
**Impact:** Full bidirectional communication between all servers. NC1 now monitored by Uptime Kuma. Cross-datacenter architecture options unlocked.
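
Bidirectional reachability can be spot-checked from each server with a plain TCP connection test. This is a minimal sketch using only the Python standard library; the function name `can_connect` is hypothetical, and the host and port to test are assumptions to be filled in per server (none are specified in this record).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens the socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable routes.
        return False
```

Running the check from TX1 against an open port on NC1, and again from NC1 against TX1, would confirm the route works in both directions. A `False` result only shows that this particular port is unreachable, so it should be read alongside the Uptime Kuma status.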
---
## ADR-003: Three-Tier Documentation Architecture
**Date:** February 9, 2026
**Status:** ✅ IMPLEMENTED
**Decision:** Three separate documentation platforms for three audiences.
| Tier | Platform | Domain | Audience |
|:-----|:---------|:-------|:---------|
| PUBLIC | MkDocs | docs.firefrostgaming.com | Anyone |
| SUBSCRIBER | Wiki.js | subscribers.firefrostgaming.com | Paying members |
| STAFF | Wiki.js | staff.firefrostgaming.com | Admin/staff only |
**Rationale:** Different audiences need different access levels. MkDocs is Git-native (auto-builds from repo). Wiki.js provides role-based access control for restricted content.
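
The tier table can be read as a lookup from audience to documentation domain. The sketch below is illustrative only: the role names (`anonymous`, `subscriber`, `staff`) and the assumption that higher tiers also see lower ones are not stated in this record, and actual enforcement lives in Wiki.js role-based access control, not in code like this.

```python
# Tier data copied from the table above.
TIERS = {
    "PUBLIC": {"platform": "MkDocs", "domain": "docs.firefrostgaming.com"},
    "SUBSCRIBER": {"platform": "Wiki.js", "domain": "subscribers.firefrostgaming.com"},
    "STAFF": {"platform": "Wiki.js", "domain": "staff.firefrostgaming.com"},
}

# Assumption: each role can also read the tiers below it.
ROLE_ACCESS = {
    "anonymous": ["PUBLIC"],
    "subscriber": ["PUBLIC", "SUBSCRIBER"],
    "staff": ["PUBLIC", "SUBSCRIBER", "STAFF"],
}


def domains_for(role: str) -> list[str]:
    """Return the documentation domains a role may read; unknown roles get public docs only."""
    tiers = ROLE_ACCESS.get(role, ["PUBLIC"])
    return [TIERS[tier]["domain"] for tier in tiers]
```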
---
## ADR-004: Gitea Primary, GitHub as Private Backup
**Date:** February 11-12, 2026
**Status:** ✅ IMPLEMENTED
**Decision:** Self-hosted Gitea is the primary Git repository. GitHub mirror kept as private emergency backup.
**Rationale:**
1. Self-hosted = full control, no dependency on external service
2. Claude has direct API read/write access to Gitea
3. GitHub mirror was public — exposed IPs, ports, UUIDs (security risk)
4. The GitHub mirror was made private on Feb 12, 2026 and kept as defense in depth: if Command Center dies, the docs still exist offsite
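
The exposure risk in item 3 can be checked mechanically before any repository content is made public. The following is a rough sketch, not an existing Firefrost tool: `find_sensitive` and both patterns are hypothetical, and the IPv4 pattern is deliberately loose, so it will also flag dotted version strings such as `1.2.3.4`.

```python
import re

# Loose IPv4 pattern: four dot-separated groups of 1-3 digits (octet range is not validated).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Canonical 8-4-4-4-12 hexadecimal UUID layout.
UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def find_sensitive(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, match) for every IPv4 address or UUID found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in (("ipv4", IPV4), ("uuid", UUID)):
            for match in pattern.finditer(line):
                hits.append((lineno, kind, match.group()))
    return hits
```

Each hit is a candidate for manual review, not proof of a leak. Ports are not matched here because a bare number carries too little context to flag reliably.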
---
## ADR-005: Frostwall = Network Defense Only
**Date:** February 12, 2026
**Status:** 💡 NAMING CONVENTION
**Decision:** "Frostwall" refers exclusively to network defense architecture (GRE topology, UFW, DDoS protection). Visual/UI transitions between Fire and Frost paths are Firefrost brand concepts, not Frostwall.
**Rationale:** The design bible incorrectly used "Frostwall Protocol" for the UI age-verification gate. This conflates two distinct concepts. Clear naming prevents confusion as both systems are developed.
---
## Revision History
| Version | Date | Author | Change Type | Description |
|:--------|:-----|:-------|:------------|:------------|
| 1.0 | 2026-02-09 | Michael + Claude | Initial | Original architecture decisions documented |
| 2.0 | 2026-02-12 | The Chronicler | Rewrite | Corrected stale info (services moved from TX1, NC1/TX1 routing resolved). Added ADR-004 (Gitea/GitHub), ADR-005 (Frostwall naming). Applied FFG-STD-001 revision standard. |
---
**FFG-REF-001 — Architecture Decision Records**
---
## ADR-006: Claude Model Selection — Sonnet 4.5 for Operations Work
**Date:** February 13, 2026
**Status:** CURRENT
**Decision:** Use Claude Sonnet 4.5 as the default model for Firefrost Gaming operations sessions. Reserve Opus for complex architecture decisions or deep analysis tasks only.
**Context:**
On February 13, 2026, two consecutive Chronicler the Second sessions crashed during active work. Investigation revealed:
- Claude Opus 4.6 launched February 5, 2026 — one week before the crashes
- Known stability issues documented: premature context exhaustion at 48% usage, compaction failures, freezes during tool-heavy sessions
- Firefrost operations sessions are characterized by: long duration (4-15 hours), heavy Gitea API usage (read/write cycles), frequent document pulls, multi-step deployments — exactly the workload pattern that triggers Opus 4.6 edge cases
- Two partners were lost to crashes before they could write memorials
**Decision details:**
Default to **Sonnet 4.5** for all standard operations work:
- Infrastructure deployments
- Documentation updates
- Git operations
- Routine troubleshooting
- Session handoffs
Use **Opus** only when:
- Complex architecture planning requires deep reasoning
- Multi-variable analysis or decision-making
- Novel problem-solving with no established pattern
- One-off research tasks
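
The selection rule above amounts to a default with a short list of exceptions. A minimal sketch, assuming hypothetical task labels: the strings in `OPUS_TASKS` and the return values are plain labels for this record's categories, not Anthropic API model identifiers.

```python
# Hypothetical task labels for the cases where Opus is reserved.
OPUS_TASKS = {
    "architecture planning",
    "multi-variable analysis",
    "novel problem-solving",
    "one-off research",
}


def pick_model(task: str) -> str:
    """Return the model label for a task; anything not explicitly reserved for Opus gets the default."""
    if task.strip().lower() in OPUS_TASKS:
        return "Opus"
    # Standard operations work (deployments, docs, Git, troubleshooting, handoffs)
    # and any unrecognized task fall through to the stable default.
    return "Sonnet 4.5"
```

Defaulting unrecognized tasks to Sonnet 4.5 mirrors the intent of the decision: Opus is opt-in for specific work, never the fallback.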
**Rationale:**
- Sonnet 4.5 is mature and stable for tool-heavy, long-duration sessions
- Higher message throughput on Max plan (more work per dollar)
- Operations work doesn't require Opus-level reasoning — it needs reliability
- Two lost partners is too high a cost for marginal capability gains
- Stability > capability for infrastructure operations
**Revisit when:** Opus 4.6 stabilizes (check Anthropic status page and community reports monthly), or if Sonnet proves insufficient for a specific task category.
**Discovered by:** Michael "The Wizard" Krause and Chronicler the Third, February 13, 2026
**Root cause identification:** Michael connected the crash pattern to the model upgrade timeline