diff --git a/docs/planning/migration-plan.md b/docs/planning/migration-plan.md
deleted file mode 100644
index 5828d7a..0000000
--- a/docs/planning/migration-plan.md
+++ /dev/null
@@ -1,412 +0,0 @@
-# 🔥❄️ FIREFROST GAMING: DEDIS → VPS MIGRATION PLAN
-
-**Created:** February 9, 2026
-**Status:** Planning Phase
-**Priority:** 🔴 Critical - Blocks Phase 0.5 Completion
-
----
-
-## Executive Summary
-
-**DECISION:** Move ALL management services from dedicated servers (TX1/NC1) back to VPS tier.
-
-**REASON:** Dedicated servers have excessive networking complexity:
-- Manual IP assignment required
-- Routing issues from internet
-- SSL certificate challenges
-- VPS = plug-and-play, Dedis = DIY everything
-
-**GOAL:** TX1 and NC1 become **game servers ONLY**. All management infrastructure on VPS tier where networking "just works."
-
----
-
-## Current State (What We Built)
-
-### On TX1 Dallas (Dedicated):
-1. ✅ Gitea (git.firefrostgaming.com) - Port 3000, IP 74.63.218.202
-2. ✅ Uptime Kuma (status.firefrostgaming.com) - Port 3001, IP 74.63.218.203
-3. ✅ MkDocs (docs.firefrostgaming.com) - Port 8000, IP 74.63.218.204
-4. ❌ Wiki.js (FAILED) - Port 3000 conflict, IP 74.63.218.205 not routable
-
-### Issues Discovered:
-- Command Center /29 block (74.63.218.201-206) not routed from internet
-- Manual IP assignment required (doesn't survive reboot without netplan config)
-- Port conflicts (Gitea on 3000, Wiki.js also wants 3000)
-- Wiki.js crashing on startup (exit code 1)
-- SSL certificates work but routing doesn't
-
----
-
-## Target State (Where We're Going)
-
-### VPS Allocation Strategy
-
-**OPTION A: Spread Services Across Existing VPS (Recommended)**
-
-| VPS | Current Use | Add Services | Reasoning |
-|-----|-------------|--------------|-----------|
-| **Panel** (45.94.168.138) | Pterodactyl Panel | - | Leave dedicated to Panel (performance) |
-| **Command Center** (63.143.34.217) | Reserved for DDoS | Gitea + Uptime Kuma | Management hub (original purpose) |
-| **Billing** (38.68.14.188) | Paymenter | - | Keep billing isolated |
-| **Ghost** (64.50.188.14) | Ghost CMS | MkDocs + Wiki.js + NextCloud | Documentation cluster |
-
-**OPTION B: Consolidate on Command Center (Alternative)**
-
-| VPS | Services | Reasoning |
-|-----|----------|-----------|
-| **Command Center** (63.143.34.217) | Gitea + Uptime Kuma + MkDocs + Wiki.js + NextCloud | Single management hub |
-| Others | Original purposes | Simpler, all management in one place |
-
-**OPTION C: New VPS for Management (Cleanest)**
-
-| Server | Purpose | Cost |
-|--------|---------|------|
-| **NEW: Management VPS** | All 5 Phase 0.5 services | ~$10-15/month |
-| Existing VPS | Original purposes | No changes |
-| TX1/NC1 | Game servers only | Simplified |
-
----
-
-## Migration Phases
-
-### Phase M1: Planning & Preparation (TODAY)
-- ✅ Document current state
-- ✅ Choose VPS allocation strategy
-- ⏳ Create migration checklist
-- ⏳ Plan DNS cutover strategy
-- ⏳ Document rollback procedures
-
-### Phase M2: Gitea Migration (Day 1)
-**Target:** Command Center VPS (63.143.34.217)
-
-**Steps:**
-1. Backup Gitea data from TX1 (`/var/lib/gitea`)
-2. Export Gitea database (SQLite dump)
-3. Install Gitea on Command Center VPS
-4. Restore data and database
-5. Update DNS: git.firefrostgaming.com → 63.143.34.217
-6. SSL certificate (easy on VPS - no routing issues)
-7. Test and verify
-8. Decommission TX1 Gitea
-
-**Downtime:** ~10-15 minutes (DNS propagation)
-
-### Phase M3: Uptime Kuma Migration (Day 1)
-**Target:** Command Center VPS (63.143.34.217)
-
-**Steps:**
-1. Export Uptime Kuma config/monitors
-2. Install on Command Center VPS
-3. Import monitors
-4. Update DNS: status.firefrostgaming.com → 63.143.34.217
-5. SSL certificate
-6. Reconfigure Discord webhook (if needed)
-7. Test monitoring
-8. Decommission TX1 Uptime Kuma
-
-**Downtime:** ~5 minutes
-
-### Phase M4: MkDocs Migration (Day 1-2)
-**Target:** Ghost VPS (64.50.188.14) OR Command Center
-
-**Steps:**
-1. Copy MkDocs source files
-2. Install MkDocs + Material theme
-3. Configure build webhook
-4. Update DNS: docs.firefrostgaming.com → [target VPS]
-5. SSL certificate
-6. Test site build
-7. Decommission TX1 MkDocs
-
-**Downtime:** ~5 minutes
-
-### Phase M5: Wiki.js Fresh Deploy (Day 2)
-**Target:** Ghost VPS (64.50.188.14) OR Command Center
-
-**Steps:**
-1. Fresh Wiki.js install (no migration needed - never worked)
-2. Dual domain config: subscribers + staff
-3. DNS: Both → [target VPS]
-4. SSL certificates (single command, no routing issues)
-5. Complete setup wizard
-6. Configure Git sync with Gitea
-
-**Downtime:** None (new service)
-
-### Phase M6: NextCloud Deploy (Day 2-3)
-**Target:** Ghost VPS (64.50.188.14) OR Command Center
-
-**Steps:**
-1. Install NextCloud
-2. Configure storage for world downloads
-3. DNS: downloads.firefrostgaming.com → [target VPS]
-4. SSL certificate
-5. Configure external storage (if needed)
-6. Test upload/download
-
-**Downtime:** None (new service)
-
----
-
-## DNS Changes Required
-
-**Before Migration:**
-| Domain | Current IP | Type |
-|--------|-----------|------|
-| git.firefrostgaming.com | 38.68.14.26 (TX1) | A |
-| status.firefrostgaming.com | 38.68.14.26 (TX1) | A |
-| docs.firefrostgaming.com | 38.68.14.26 (TX1) | A |
-
-**After Migration (Option A - Spread):**
-| Domain | New IP | Type |
-|--------|--------|------|
-| git.firefrostgaming.com | 63.143.34.217 (Command Center) | A |
-| status.firefrostgaming.com | 63.143.34.217 (Command Center) | A |
-| docs.firefrostgaming.com | 64.50.188.14 (Ghost) | A |
-| subscribers.firefrostgaming.com | 64.50.188.14 (Ghost) | A |
-| staff.firefrostgaming.com | 64.50.188.14 (Ghost) | A |
-| downloads.firefrostgaming.com | 64.50.188.14 (Ghost) | A |
-
-**After Migration (Option B - Consolidate):**
-| Domain | New IP | Type |
-|--------|--------|------|
-| All management services | 63.143.34.217 (Command Center) | A |
-
----
-
-## Rollback Plan
-
-**If migration fails:**
-1. Services still running on TX1 (don't decommission until verified)
-2. Revert DNS changes in Cloudflare (instant)
-3. Old services continue working
-4. Max downtime: DNS TTL (5 minutes)
-
-**Safety net:**
-- All TX1 services stay running during migration
-- Only decommission after 24 hours of stable operation
-- Keep backups of TX1 data for 7 days
-
----
-
-## Benefits of Migration
-
-### Technical Benefits:
-- ✅ No manual IP assignment
-- ✅ No routing issues
-- ✅ SSL certificates "just work"
-- ✅ Simpler networking (everything on standard ports)
-- ✅ Faster deployments (no dedi complexity)
-
-### Operational Benefits:
-- ✅ TX1/NC1 dedicated to game servers (cleaner architecture)
-- ✅ Management services isolated from game server load
-- ✅ Easier troubleshooting (VPS networking predictable)
-- ✅ Less hand strain for Michael (fewer manual operations)
-
-### Psychological Benefits:
-- ✅ No more fighting with dedi networking
-- ✅ Services deploy in minutes, not hours
-- ✅ Predictable, reliable infrastructure
-- ✅ Focus on game servers, not infrastructure headaches
-
----
-
-## Cost Analysis
-
-**Current Setup:** $0 additional (using TX1 resources)
-
-**Option A (Spread):** $0 additional (using existing VPS)
-
-**Option B (Consolidate):** $0 additional (using Command Center)
-
-**Option C (New VPS):** ~$10-15/month
-- Pros: Cleanest separation, dedicated management server
-- Cons: Additional cost
-
-**Recommendation:** Option A or B (no additional cost)
-
----
-
-## Timeline Estimate
-
-**Conservative (Safe):**
-- Day 1: Gitea + Uptime Kuma migration (3-4 hours)
-- Day 2: MkDocs + Wiki.js deployment (2-3 hours)
-- Day 3: NextCloud deployment (1-2 hours)
-- **Total:** 3 days, 6-9 hours work
-
-**Aggressive (If marathon session):**
-- Day 1: All 5 services (6-8 hours)
-- Day 2: Testing and verification
-- **Total:** 1-2 days
-
----
-
-## Risk Assessment
-
-| Risk | Probability | Impact | Mitigation |
-|------|------------|--------|------------|
-| Data loss during migration | Low | High | Full backups before migration |
-| DNS propagation delay | Medium | Low | Low TTL, staged cutover |
-| SSL certificate issues | Low | Medium | VPS networking reliable |
-| Port conflicts | Low | Low | Plan port allocation upfront |
-| Service crashes | Low | Medium | Keep TX1 services running as backup |
-
----
-
-## Decision Point
-
-**MICHAEL: Which option do you prefer?**
-
-**Option A:** Spread services (Command Center + Ghost)
-- Gitea + Uptime Kuma on Command Center
-- MkDocs + Wiki.js + NextCloud on Ghost
-
-**Option B:** Consolidate all on Command Center
-- All 5 services on one VPS
-- Simpler management, single point
-
-**Option C:** New dedicated management VPS
-- Cleanest architecture
-- ~$10-15/month additional cost
-
-**My Recommendation:** Option A (spread)
-- Balances load across existing VPS
-- No additional cost
-- Ghost VPS currently underutilized
-
----
-
-**Fire + Frost = Where Passion Meets Precision** 🔥❄️
-
-**Next Step:** Michael reviews and chooses option, then we execute migration plan.
-
----
-
-## CRITICAL: Automation System Migration
-
-**OVERSIGHT IDENTIFIED:** The Firefrost Automation System currently runs on TX1!
-
-### Current Automation Setup
-- **Location:** `/root/firefrost-work/firefrost-operations-manual/automation/`
-- **Daemon:** Running on TX1 (PID management)
-- **Function:** Polls Git repo, executes tasks, commits results
-- **Usage:** 95% reduction in manual operations
-
-### Migration Decision Required
-
-**Option 1: Keep Automation on TX1**
-- Pros: Already working, no changes needed
-- Cons: Requires TX1 SSH access for management work
-- Use case: If TX1 remains partially management server
-
-**Option 2: Move Automation to Command Center VPS**
-- Pros: All management tools in one place
-- Cons: Need to set up SSH keys, test thoroughly
-- Use case: If going full VPS for management
-
-**Option 3: Run Automation on BOTH**
-- Pros: Redundancy, can manage either server
-- Cons: Complexity, two daemons to monitor
-- Use case: Hybrid approach
-
-**Option 4: Eliminate Need for Automation on Dedis**
-- Pros: VPS deployments are simpler (no dedi complexity)
-- Cons: Lose automation benefits
-- Reality check: VPS might not need automation as much
-
-### Recommendation: Option 2 (Move to Command Center)
-
-**Reasoning:**
-- All management work happens on VPS tier
-- Automation system designed for management services
-- Keep TX1/NC1 as "appliances" (game servers only)
-- Single management hub = cleaner architecture
-
-### Automation Migration Steps
-
-1. **Clone repo to Command Center VPS:**
-```bash
-cd /root
-git clone https://git.firefrostgaming.com/firefrost-gaming/firefrost-operations-manual.git firefrost-work/firefrost-operations-manual
-```
-
-2. **Set up Git authentication:**
-```bash
-# SSH key or HTTPS token
-git config --global user.name "Firefrost Automation"
-git config --global user.email "automation@firefrostgaming.com"
-```
-
-3. **Start daemon on Command Center:**
-```bash
-cd ~/firefrost-work/firefrost-operations-manual
-nohup bash automation/automation-daemon.sh > /dev/null 2>&1 &
-echo "Daemon PID: $!"
-```
-
-4. **Test task execution:**
-- Queue test task
-- Verify execution
-- Check Git commit
-
-5. **Stop TX1 daemon:**
-```bash
-# On TX1
-ps aux | grep automation-daemon
-kill [PID]
-```
-
-6. **Update documentation:**
-- USAGE.md updated with new location
-- Session handoff updated
-
-### Timeline
-
-**When to migrate automation:**
-- **After** Gitea migration (needs working Git repo)
-- **Before** other services (to use automation for migrations)
-- **Estimated time:** 30 minutes
-
-### Automation in Migration Workflow
-
-**Use automation for:**
-- ✅ Backing up services
-- ✅ Deploying to VPS
-- ✅ Testing configurations
-- ✅ Committing migration logs
-
-**Don't use automation for:**
-- ❌ Initial Git clone (chicken-egg problem)
-- ❌ DNS changes (manual in Cloudflare)
-- ❌ Critical rollbacks (need manual control)
-
----
-
-## Updated Migration Phase Order
-
-### Phase M0: Automation System Migration (NEW - FIRST!)
-**Target:** Command Center VPS
-**Duration:** 30 minutes
-**Prerequisite:** Gitea migrated first
-
-**Steps:**
-1. Migrate Gitea to Command Center (Service 1)
-2. Clone repo on Command Center
-3. Configure Git authentication
-4. Start automation daemon
-5. Test with simple task
-6. Stop TX1 daemon
-7. Update documentation
-
-**Why first:** Enables automation for remaining migrations!
-
-### Phase M1-M6: Continue as planned
-(All other services use automation system on Command Center)
-
----
-
-**CRITICAL NOTE:** This is why good planning matters! Almost missed a key component.
-
-**Fire + Frost = Where Passion Meets Precision** 🔥❄️
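
Reviewer note on the removed plan: the rollback plan and the "Stop TX1 daemon" step both depend on DNS already pointing at the new VPS before anything on TX1 is shut down. Below is a minimal sketch of how that check could be scripted. It assumes `dig` is installed where it runs. `resolve_a` and `check_cutover` are hypothetical helper names, not part of the existing automation scripts. The host and IP in the usage line come from the plan's Option A DNS table.

```shell
#!/usr/bin/env bash
# Sketch only: confirm a DNS cutover before decommissioning the old service.

# resolve_a HOST
# Hypothetical helper; assumes `dig` is available. Prints the last line
# of the A lookup. Override this function to test without network access.
resolve_a() {
  dig +short "$1" A | tail -n 1
}

# check_cutover HOST EXPECTED_IP
# Returns 0 when HOST resolves to EXPECTED_IP, 1 otherwise.
check_cutover() {
  local host="$1" expected="$2" actual
  actual="$(resolve_a "$host")"
  if [ "$actual" = "$expected" ]; then
    echo "OK   $host -> $actual"
    return 0
  fi
  echo "WAIT $host -> ${actual:-<no answer>} (expected $expected)"
  return 1
}
```

A possible use, run before stopping the TX1 daemon: `check_cutover git.firefrostgaming.com 63.143.34.217 && echo "Safe to continue"`. The check only shows what the local resolver sees, so other clients may still hold the old record until the TTL expires.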