Add TX1 crisis status - all 6 game servers down due to IP allocation mismatch

Firefrost Automation
2026-02-11 17:20:02 -06:00
parent dc40a573f4
commit d0fc4e7a0d

@@ -1761,3 +1761,94 @@ This prevents documentation drift and ensures every future Claude session has cu
---
**Last Updated:** February 11, 2026 4:30 PM CST
---
## 🚨 CRITICAL: TX1 GAME SERVERS CRISIS (FEB 11, 2026 - 5:00 PM)
### **THE DISCOVERY:**
While diagnosing Holly's FoundryVTT access issues (50% success rate), we discovered:
**❌ ALL 6 TX1 GAME SERVERS ARE DOWN**
### **ROOT CAUSE:**
**Pterodactyl Panel misconfiguration:**
- All TX1 servers configured with IP: `38.68.14.188` (Billing VPS)
- TX1's actual IPs: `38.68.14.26-30` (game servers)
- Docker cannot bind to an IP address that is not assigned to the host
- Result: All 6 servers fail to start (Exit Code 128)
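
The bind failure described above can be reproduced outside Docker. The following is a minimal sketch, assuming a Linux host with default network settings; the helper name `can_bind` is hypothetical and not part of Pterodactyl or Docker.

```python
import errno
import socket


def can_bind(ip: str, port: int = 0) -> bool:
    """Return True if this host can bind a TCP socket to ip:port.

    Port 0 asks the OS for any free port, so only the IP is being tested.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((ip, port))
        except OSError as exc:
            # EADDRNOTAVAIL: the address is not assigned to any local interface.
            if exc.errno == errno.EADDRNOTAVAIL:
                return False
            raise
    return True


if __name__ == "__main__":
    # Run on TX1: the Billing VPS address is expected to fail,
    # while the TX1 game-server addresses are expected to succeed.
    for ip in ("38.68.14.188", "38.68.14.26"):
        print(ip, "bindable" if can_bind(ip) else "NOT assigned to this host")
```

Running this on TX1 before assigning an allocation would presumably flag `38.68.14.188` as not bindable, which matches the root cause above.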
### **AFFECTED SERVERS:**
1. Stoneblock 4 (wrong: 38.68.14.188:25565)
2. Reclamation (wrong: 38.68.14.188:25567)
3. Society: Sunlit Valley (wrong: 38.68.14.188:25568)
4. Vanilla 1.21.11 (wrong: 38.68.14.188:25571)
5. All The Mons (wrong: 38.68.14.188:25572)
6. FoundryVTT (wrong: 38.68.14.188:30000)
### **TX1 VERIFIED IPs:**
```
38.68.14.26 (game servers)
38.68.14.27 (game servers)
38.68.14.28 (game servers)
38.68.14.29 (game servers)
38.68.14.30 (game servers)
74.63.218.202 (Code-Server - already working)
74.63.218.203 (unused)
74.63.218.204 (unused)
74.63.218.205 (unused)
```
### **FIX IN PROGRESS:**
**Step 1: Allocate Port Ranges (IN PROGRESS)**
- For each IP (38.68.14.26-30):
  - Minecraft: 25565-25580 (16 ports)
  - Hytale: 5520-5521 (2 ports)
- FoundryVTT: 30000 (only on 38.68.14.26)
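
To sanity-check the allocation plan before entering it in the Panel, the full list of IP:port pairs can be enumerated. This is only an illustrative sketch: the constant and function names are hypothetical, and it does not call the Pterodactyl API.

```python
from ipaddress import IPv4Address

# TX1 game-server IPs: 38.68.14.26 through 38.68.14.30
GAME_IPS = [str(IPv4Address("38.68.14.26") + i) for i in range(5)]

# Port ranges to allocate on every game-server IP (range end is exclusive)
PORT_RANGES = {
    "minecraft": range(25565, 25581),  # 25565-25580, 16 ports
    "hytale": range(5520, 5522),       # 5520-5521, 2 ports
}

# FoundryVTT gets a single port on one IP only
FOUNDRY_ALLOCATION = ("38.68.14.26", 30000)


def planned_allocations() -> list[tuple[str, int]]:
    """Return every (ip, port) pair the plan calls for."""
    allocations = []
    for ip in GAME_IPS:
        for ports in PORT_RANGES.values():
            allocations.extend((ip, port) for port in ports)
    allocations.append(FOUNDRY_ALLOCATION)
    return allocations


if __name__ == "__main__":
    plan = planned_allocations()
    print(f"{len(plan)} allocations to create")
```

With the ranges above this yields 5 × (16 + 2) + 1 = 91 allocations, which gives a number to compare against what the Panel shows once allocation is complete.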
**Step 2: Assign Correct Allocations**
- Stoneblock 4 → 38.68.14.26:25565
- Reclamation → 38.68.14.27:25565
- Society → 38.68.14.28:25565
- Vanilla → 38.68.14.29:25565
- All The Mons → 38.68.14.30:25565
- FoundryVTT → 38.68.14.26:30000
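
A quick consistency check on the assignments above could look like the following sketch. It only validates the plan as written here (each IP belongs to TX1, no IP:port pair is used twice); the names `ASSIGNMENTS` and `find_problems` are hypothetical.

```python
# Verified TX1 game-server IPs
TX1_GAME_IPS = {
    "38.68.14.26", "38.68.14.27", "38.68.14.28", "38.68.14.29", "38.68.14.30",
}

# Planned server -> (ip, port) assignments from Step 2
ASSIGNMENTS = {
    "Stoneblock 4": ("38.68.14.26", 25565),
    "Reclamation": ("38.68.14.27", 25565),
    "Society: Sunlit Valley": ("38.68.14.28", 25565),
    "Vanilla 1.21.11": ("38.68.14.29", 25565),
    "All The Mons": ("38.68.14.30", 25565),
    "FoundryVTT": ("38.68.14.26", 30000),
}


def find_problems(assignments, valid_ips):
    """Return human-readable problems: unknown IPs and duplicate ip:port pairs."""
    problems = []
    seen = {}  # (ip, port) -> first server that claimed it
    for name, (ip, port) in assignments.items():
        if ip not in valid_ips:
            problems.append(f"{name}: {ip} is not a TX1 game-server IP")
        if (ip, port) in seen:
            problems.append(
                f"{name}: {ip}:{port} already assigned to {seen[(ip, port)]}"
            )
        seen.setdefault((ip, port), name)
    return problems


if __name__ == "__main__":
    issues = find_problems(ASSIGNMENTS, TX1_GAME_IPS)
    print("\n".join(issues) if issues else "All assignments look consistent")
```

Feeding the old configuration (every server on `38.68.14.188`) through the same function would report each server, which is the kind of check that could have caught the original mismatch.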
**Step 3: Restart All Servers**
**Step 4: Add Game Servers to Uptime Kuma**
- CRITICAL LESSON: "I would have known earlier"
- We monitor VPS infrastructure but NOT the game servers
- This created a blind spot
- All 12 game servers (TX1 + NC1) need monitoring
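
Until the game servers are added to Uptime Kuma, a plain TCP reachability check can serve as a stopgap. This is a minimal sketch that only tests whether a port accepts TCP connections; it does not speak the Minecraft or FoundryVTT protocols, and the `tcp_port_open` helper and the target list are illustrative assumptions based on the planned allocations.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Targets taken from the planned allocations in Step 2.
    targets = {
        "Stoneblock 4": ("38.68.14.26", 25565),
        "Reclamation": ("38.68.14.27", 25565),
        "Society": ("38.68.14.28", 25565),
        "Vanilla": ("38.68.14.29", 25565),
        "All The Mons": ("38.68.14.30", 25565),
        "FoundryVTT": ("38.68.14.26", 30000),
    }
    for name, (host, port) in targets.items():
        state = "UP" if tcp_port_open(host, port) else "DOWN"
        print(f"{name:<14} {host}:{port} {state}")
```

An open port only shows that something is listening, so it complements rather than replaces a proper monitor per server.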
### **CURRENT STATUS (5:30 PM CST):**
- 🔧 Allocating port ranges in Pterodactyl Panel
- ⏳ Awaiting allocation completion
- 📋 Next: Assign to servers, restart, monitor
### **LESSONS LEARNED:**
1. **Monitoring Gap:** Game servers not in Uptime Kuma = delayed crisis detection
2. **Allocation Verification:** Should have verified IPs during initial setup
3. **Complete Testing:** Node running ≠ servers running
4. **User-Facing Monitoring:** Monitor what USERS interact with, not just infrastructure
### **PRIORITY AFTER FIX:**
- ✅ Add all 12 game servers to Uptime Kuma
- ✅ Verify allocations match actual server IPs
- ✅ Test each server individually
- ✅ Document proper allocation process
---
**Crisis Discovered:** February 11, 2026 5:00 PM CST
**Status:** Fix in progress (allocating port ranges)
**ETA:** 30-60 minutes to full recovery