NEW: phase0-dismantling.md, mkdocs-deployment.md. UPDATED: architecture-decisions.md, pterodactyl-extensions-plan.md, INDEX.md. Audit: February 9, 2026 - All 5 gaps fixed.
Phase 0: Infrastructure Dismantling & Vanilla Reset
Date: February 7, 2026
Status: COMPLETE
Purpose: Document what was removed and why during the Phase 0 vanilla reset
Executive Summary
On February 7, 2026, we dismantled "Frostwall Protocol v1.0", a GRE tunnel architecture that was causing more problems than it solved. This document preserves the technical details for future reference and explains the strategic decision to rebuild from a "vanilla baseline."
What Was Dismantled
Command Center (63.143.34.217)
GRE Tunnels Removed:
- gre-nc1 - Tunnel to NC1 Charlotte (192.168.20.1/30)
- gre-tx1 - Tunnel to TX1 Dallas (192.168.10.1/30)
Processes Killed:
- 68 leaked tunnel-related processes
- master_restore.sh background processes
- reboot_audit.sh background processes
Cron Jobs Disabled:
- master_restore.sh - Auto-restore tunnel configuration
- reboot_audit.sh - Tunnel health monitoring
iptables Rules Cleaned:
- All GRE-related NAT rules
- All tunnel routing rules
- Reset to default firewall policy
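This document does not record how the cron jobs were disabled. As one possible approach, the sketch below comments out any active crontab line that references one of the two retired scripts. The script names come from the lists above; the sample crontab (its schedules, paths, and the unrelated third entry) is hypothetical and only there to exercise the function.

```python
# Script names are taken from this document; everything else is illustrative.
RETIRED_SCRIPTS = ("master_restore.sh", "reboot_audit.sh")


def disable_cron_entries(crontab_text, scripts=RETIRED_SCRIPTS):
    """Comment out active crontab lines that reference a retired script."""
    out = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        # Blank lines and existing comments are left untouched.
        active = bool(stripped) and not stripped.startswith("#")
        if active and any(name in line for name in scripts):
            out.append("# DISABLED (Phase 0): " + line)
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical crontab: schedules and paths are placeholders, not real config.
SAMPLE = "\n".join([
    "@reboot /opt/scripts/master_restore.sh",
    "*/5 * * * * /opt/scripts/reboot_audit.sh",
    "0 3 * * * /usr/bin/certbot renew",
])

if __name__ == "__main__":
    print(disable_cron_entries(SAMPLE))
```

Because already-commented lines are skipped, running the function a second time on its own output changes nothing, which makes it safe to re-run during an audit.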
NC1 Charlotte (216.239.104.130)
GRE Tunnel Removed:
- gre-cc - Tunnel to Command Center
- Tunnel IP: 192.168.20.2/30
- Peer: 63.143.34.217
TX1 Dallas (38.68.14.26)
GRE Tunnel Removed:
- gre-tx1 - Tunnel to Command Center
- Tunnel IP: 192.168.10.2/30
- Secondary IP on tunnel: 38.68.14.188/32 (Billing Portal routing)
- Peer: 63.143.34.217
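Each tunnel used a /30, which leaves exactly two usable addresses, one per endpoint. If a simplified GRE design is ever rebuilt, a small check like the following could catch mismatched endpoint addressing before anything is brought up. It is a minimal sketch using Python's standard `ipaddress` module; the addresses are the ones recorded above, and the helper name `same_p2p_subnet` is made up for illustration.

```python
import ipaddress

# Tunnel endpoint addresses as recorded in this document.
TUNNELS = {
    "Command Center <-> NC1 Charlotte": ("192.168.20.1/30", "192.168.20.2/30"),
    "Command Center <-> TX1 Dallas": ("192.168.10.1/30", "192.168.10.2/30"),
}


def same_p2p_subnet(a, b):
    """True if a and b are distinct usable host addresses in the same subnet."""
    ia = ipaddress.ip_interface(a)
    ib = ipaddress.ip_interface(b)
    usable = set(ia.network.hosts())  # a /30 has exactly two usable hosts
    return (
        ia.network == ib.network
        and ia.ip != ib.ip
        and {ia.ip, ib.ip} <= usable
    )


if __name__ == "__main__":
    for name, (local, remote) in TUNNELS.items():
        status = "ok" if same_p2p_subnet(local, remote) else "MISMATCH"
        print(f"{name}: {local} / {remote} -> {status}")
```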
Why It Was Removed
Problem 1: CosmicGuard Double-Encapsulation
The original Charlotte node was behind CosmicGuard DDoS protection, which automatically creates its own GRE tunnels. Running our GRE tunnel inside theirs encapsulated every packet twice, shrinking the usable MTU and causing MTU-related problems.
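To put rough numbers on the MTU cost, the arithmetic can be sketched as below. It assumes a 1500-byte underlying link MTU and plain GRE over IPv4 with no optional key, checksum, or sequence fields at both layers (20 bytes of outer IPv4 header plus 4 bytes of GRE header per layer). The actual CosmicGuard tunnel parameters are not recorded here, so treat the result as an estimate.

```python
OUTER_IPV4_HEADER = 20  # bytes, IPv4 header without options
GRE_BASE_HEADER = 4     # bytes, GRE header without key/checksum/sequence fields
GRE_OVERHEAD = OUTER_IPV4_HEADER + GRE_BASE_HEADER  # 24 bytes per GRE layer


def effective_mtu(link_mtu, gre_layers):
    """Payload MTU left after the given number of nested GRE encapsulations."""
    return link_mtu - gre_layers * GRE_OVERHEAD


if __name__ == "__main__":
    link_mtu = 1500  # assumption: standard Ethernet MTU on the underlying path
    for layers in (0, 1, 2):
        print(f"{layers} GRE layer(s): {effective_mtu(link_mtu, layers)} bytes")
```

Under these assumptions a single tunnel leaves 1476 bytes and a tunnel inside a tunnel leaves 1452, so any inner interface still configured for 1476 would send packets that need fragmentation or get dropped.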
Problem 2: Protocol 47 Blocking
The upstream carrier was black-holing IP protocol 47 (GRE) on 38.x IP ranges, which required migrating to the 216.239.104.x range.
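When planning any future GRE endpoints, a quick membership check can flag addresses that sit inside the range described above. The sketch below reads "38.x" as the whole 38.0.0.0/8 block, which is an assumption (the carrier's filtering may have covered a narrower set of prefixes). The addresses are the ones listed earlier in this document.

```python
import ipaddress

# Assumption: "38.x" is interpreted as the full 38.0.0.0/8 block.
AFFECTED_RANGE = ipaddress.ip_network("38.0.0.0/8")

# Public addresses as recorded in this document.
ADDRESSES = {
    "Command Center": "63.143.34.217",
    "NC1 Charlotte": "216.239.104.130",
    "TX1 Dallas": "38.68.14.26",
    "TX1 Dallas (secondary, Billing Portal)": "38.68.14.188",
}


def in_affected_range(addr, network=AFFECTED_RANGE):
    """True if addr falls inside the range where GRE was reported black-holed."""
    return ipaddress.ip_address(addr) in network


if __name__ == "__main__":
    for name, addr in ADDRESSES.items():
        flag = "inside 38.x" if in_affected_range(addr) else "outside 38.x"
        print(f"{name}: {addr} -> {flag}")
```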
Problem 3: Complexity vs. Benefit
The tunnels brought constant connectivity issues, difficult troubleshooting, 68+ leaked processes, and midnight emergencies, a cost that outweighed the benefit they provided.
Problem 4: Maintenance Burden
With Michael's health and family planning goals, midnight pages were unsustainable.
The Decision: Vanilla Reset
Philosophy: "Start from a clean baseline and rebuild properly."
Future Plan (Phase 1):
- Design simplified DDoS protection
- Choose between Cloudflare Spectrum and a simplified GRE design (decision pending)
- Focus on reliability over complexity
Lessons Learned
- Complexity has a cost: every added layer is a potential failure point
- Health matters: infrastructure should support life, not consume it
- Document before dismantling: this document preserves institutional knowledge
- Vanilla baseline enables iteration: it is easier to build correctly from scratch
- Provider relationships matter: Breezehost's Jon Beard was crucial
Revision History
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2026-02-09 | Initial documentation (retroactive) |