# Netdata Deployment - Complete Guide

**Status:** Ready to Deploy
**Priority:** Tier 2 - Infrastructure Monitoring
**Time Estimate:** 30 minutes (all servers)
**Last Updated:** 2026-02-17

---

## Overview

Deploy Netdata real-time monitoring across all Firefrost infrastructure. It provides dashboards for CPU, RAM, disk, network, and application metrics, and works out of the box with no configuration.

**What is Netdata?**

- Real-time performance monitoring
- Beautiful web dashboards
- Zero configuration needed
- Extremely lightweight (< 3% CPU, ~100 MB RAM)
- Open source and free

---

## Deployment Targets

**All 4 infrastructure servers:**

1. **Command Center** (63.143.34.217) - Dallas hub
   - Services: Gitea, Uptime Kuma, Code-Server, Automation
   - Dashboard: `http://63.143.34.217:19999`
2. **TX1** (38.68.14.26) - Dallas game servers
   - Services: 5 Minecraft servers + FoundryVTT
   - Dashboard: `http://38.68.14.26:19999`
3. **NC1** (216.239.104.130) - Charlotte game servers
   - Services: 6 Minecraft servers + Hytale
   - Dashboard: `http://216.239.104.130:19999`
4. **Ghost VPS** (64.50.188.14) - Chicago staff services
   - Services: MkDocs, Wiki.js (x2), NextCloud
   - Dashboard: `http://64.50.188.14:19999`

---

## Installation (Per Server)

### One-Line Install

**On each server:**

```bash
# Install Netdata
bash <(curl -Ss https://my-netdata.io/kickstart.sh)

# The installer will:
# - Auto-detect your OS
# - Install dependencies
# - Install Netdata (prebuilt package or source build)
# - Start the service
# - Serve the dashboard on port 19999 (it does not change firewall rules)
```

**Installation takes:** 2-5 minutes per server

---

## Step-by-Step Deployment

### Phase 1: Install on Command Center (10 min)

```bash
# SSH to Command Center
ssh root@63.143.34.217

# Run installer
bash <(curl -Ss https://my-netdata.io/kickstart.sh)

# Wait for installation to complete
# Answer prompts (usually just press Enter for defaults)

# Verify installation
systemctl status netdata
# Should show: active (running)

# Test dashboard
curl http://localhost:19999
# Should return HTML
```

**Open in browser:** `http://63.143.34.217:19999`

You should see the Netdata dashboard!

---

### Phase 2: Install on TX1 (5 min)

```bash
# SSH to TX1
ssh root@38.68.14.26

# Run installer
bash <(curl -Ss https://my-netdata.io/kickstart.sh)

# Verify
systemctl status netdata

# Test
curl http://localhost:19999
```

**Open in browser:** `http://38.68.14.26:19999`

---

### Phase 3: Install on NC1 (5 min)

```bash
# SSH to NC1
ssh root@216.239.104.130

# Run installer
bash <(curl -Ss https://my-netdata.io/kickstart.sh)

# Verify
systemctl status netdata

# Test
curl http://localhost:19999
```

**Open in browser:** `http://216.239.104.130:19999`

---

### Phase 4: Install on Ghost VPS (5 min)

```bash
# SSH to Ghost
ssh root@64.50.188.14

# Run installer
bash <(curl -Ss https://my-netdata.io/kickstart.sh)

# Verify
systemctl status netdata

# Test
curl http://localhost:19999
```

**Open in browser:** `http://64.50.188.14:19999`

---

## Post-Installation Configuration

### 1. Configure UFW Firewall

**On each server:**

```bash
# Allow Netdata port from Michael's management IP only
ufw allow from MICHAEL_MANAGEMENT_IP to any port 19999 proto tcp

# Verify
ufw status | grep 19999
```

**Security note:** Netdata dashboards contain sensitive server information. Only allow access from trusted IPs.

---

### 2. Set Up Parent-Child Streaming (Optional)

**Benefit:** View all servers from one dashboard (Command Center)

**On Command Center (parent):**

```bash
# Edit config
nano /etc/netdata/stream.conf

# Add (the section name is the API key; replace this placeholder
# with a real UUID, e.g. one generated by `uuidgen`):
[11111111-2222-3333-4444-555555555555]
    enabled = yes
    default history = 3600
    default memory mode = save
    health enabled by default = yes

# Restart netdata so the parent accepts the key
systemctl restart netdata
```

**On TX1, NC1, Ghost (children):**

```bash
# Edit config
nano /etc/netdata/stream.conf

# Add:
[stream]
    enabled = yes
    destination = 63.143.34.217:19999
    api key = 11111111-2222-3333-4444-555555555555

# Restart netdata
systemctl restart netdata
```

Each child's IP must also be allowed through the parent's UFW on port 19999, because the firewall rule from step 1 only admits the management IP.

**Result:** All server metrics visible on Command Center dashboard

---

### 3. Configure Alerts

**Edit alert config:**

```bash
nano /etc/netdata/health.d/custom.conf
```

**Example alerts:**

```yaml
# Alert when CPU usage > 80% for 5 minutes
 alarm: cpu_usage
    on: system.cpu
  calc: $user + $system
 every: 1m
  warn: $this > 80
  crit: $this > 95
 delay: up 5m down 15m
  info: CPU usage is too high

# Alert when RAM usage > 90%
# Cache and buffers count toward the total so reclaimable memory
# is not reported as pressure
 alarm: ram_usage
    on: system.ram
  calc: $used * 100 / ($used + $cached + $free + $buffers)
 every: 1m
  warn: $this > 90
  crit: $this > 95
 delay: up 5m down 15m
  info: RAM usage is too high

# Alert when disk space < 20%
# disk.space is a context (one chart per mount point), so this is a
# template rather than an alarm
template: disk_space
      on: disk.space
    calc: $avail * 100 / ($avail + $used)
   every: 1m
    warn: $this < 20
    crit: $this < 10
   delay: up 5m down 15m
    info: Disk space is running low
```

**Reload config:**

```bash
killall -USR2 netdata
```

---

### 4. Discord Integration (Optional)

**Set up Discord webhook for alerts:**

```bash
# Edit alarm notification config
nano /etc/netdata/health_alarm_notify.conf

# Find Discord section and configure:
SEND_DISCORD="YES"
DISCORD_WEBHOOK_URL="YOUR_DISCORD_WEBHOOK_URL_HERE"
DEFAULT_RECIPIENT_DISCORD="network-alerts"
```

**Test alert:**

```bash
# Trigger test alert
/usr/libexec/netdata/plugins.d/alarm-notify.sh test
```

Check Discord for the test notification.

---

## Dashboard Access

### Quick Access Links

**Save these bookmarks:**

- Command Center: http://63.143.34.217:19999
- TX1: http://38.68.14.26:19999
- NC1: http://216.239.104.130:19999
- Ghost: http://64.50.188.14:19999

**Unified View (if streaming configured):**

- All servers: http://63.143.34.217:19999 → View nodes

---

### Key Metrics to Monitor

**CPU:**

- User % (application load)
- System % (kernel load)
- IOWait % (disk bottleneck indicator)

**RAM:**

- Used vs Available
- Cache (should be high, that's good!)
- Swap usage (should be low)

**Disk:**

- Disk space remaining
- Read/write speeds
- IOPS

**Network:**

- Bandwidth usage
- Packet drops
- Connection count

**Minecraft Servers (TX1/NC1):**

- Java heap usage
- GC activity
- Thread count

---

## Maintenance

### Daily

- Quick glance at dashboards (bookmark all 4)
- Check for any red alerts

### Weekly

- Review CPU/RAM trends
- Check disk space projections
- Verify alerts are working

### Monthly

- Review historical data
- Adjust alert thresholds if needed
- Update Netdata if a new version is available

---

## Updates

**Auto-update (recommended):** With auto-update enabled, Netdata checks for updates daily and installs them automatically.

**Or check for updates manually:**

```bash
# On each server
netdata-updater.sh
```
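Running the manual updater on four servers by hand is easy to forget. As a minimal sketch, assuming root SSH access to the hosts listed under Deployment Targets and that `netdata-updater.sh` is on the remote `PATH` as in the command above, a loop like this could run it everywhere. The names `SERVERS`, `updater_cmd`, and `DRY_RUN` are hypothetical and not part of Netdata. By default the script only prints what it would run; set `DRY_RUN=0` to actually connect.

```bash
#!/usr/bin/env bash
# Sketch: run the Netdata updater on every server over SSH.
# SERVERS, updater_cmd, and DRY_RUN are illustrative names, not part of Netdata.

# name=ip pairs taken from the Deployment Targets section
SERVERS="command-center=63.143.34.217 tx1=38.68.14.26 nc1=216.239.104.130 ghost=64.50.188.14"

# Default to a dry run; set DRY_RUN=0 to actually connect.
DRY_RUN="${DRY_RUN:-1}"

# Build the SSH command line for one host (used for the dry-run output).
updater_cmd() {
    printf 'ssh root@%s netdata-updater.sh' "$1"
}

for pair in $SERVERS; do
    name="${pair%%=*}"   # text before the first "="
    ip="${pair#*=}"      # text after the first "="
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] $name: $(updater_cmd "$ip")"
    else
        echo "== $name ($ip) =="
        ssh "root@$ip" netdata-updater.sh || echo "update failed on $name" >&2
    fi
done
```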

---

## Troubleshooting

### Dashboard won't load

**Check service:**

```bash
systemctl status netdata
```

**Restart if needed:**

```bash
systemctl restart netdata
```

**Check firewall:**

```bash
ufw status | grep 19999
telnet localhost 19999
```

---

### High CPU usage from Netdata

Netdata should use < 3% CPU normally.

**Disable unused plugins to reduce load:**

```bash
# Disable some plugins if needed
nano /etc/netdata/netdata.conf

# Under [plugins], disable unused:
python.d = no
node.d = no
```

---

### Streaming not working

**Verify:**

- Parent (Command Center) has stream.conf with API key
- Children have correct parent IP
- Port 19999 accessible from children to parent
- API keys match exactly

**Debug:**

```bash
# On child
tail -f /var/log/netdata/error.log | grep stream
```

---

### Alerts not sending to Discord

**Check:**

- Discord webhook URL correct
- `SEND_DISCORD="YES"` set
- Test alert sent successfully

**Debug:**

```bash
# Enable verbose output for the test run
# (an extra argument after "test" is treated as a recipient role, not a debug flag)
NETDATA_ALARM_NOTIFY_DEBUG=1 /usr/libexec/netdata/plugins.d/alarm-notify.sh test
```

---

## Advanced Features (Optional)

### Netdata Cloud (Free)

**Benefits:**

- Centralized dashboard for all servers
- Mobile app
- Longer data retention
- Collaboration features

**Setup:**

1. Go to https://app.netdata.cloud
2. Create free account
3. Claim nodes:

```bash
# On each server
netdata-claim.sh -token=YOUR_TOKEN -rooms=YOUR_ROOM -url=https://app.netdata.cloud
```

---

### Custom Dashboards

Create custom dashboards with specific metrics:

1. Open Netdata dashboard
2. Click "Create Dashboard"
3. Add charts
4. Save and share URL

---

## Success Criteria Checklist

- [ ] Netdata installed on Command Center
- [ ] Netdata installed on TX1
- [ ] Netdata installed on NC1
- [ ] Netdata installed on Ghost VPS
- [ ] All dashboards accessible via browser
- [ ] UFW rules configured (management IP only)
- [ ] Alerts configured for CPU/RAM/Disk
- [ ] (Optional) Discord integration working
- [ ] (Optional) Parent-child streaming configured
- [ ] Dashboards bookmarked for quick access

---

## Related Tasks

- **Staggered Server Restart System** - Monitor impact on resources
- **World Backup Automation** - Monitor backup job duration
- **Command Center Security** - Part of monitoring infrastructure
- **Frostwall Protocol** - Monitor tunnel performance

---

**Fire + Frost + Foundation = Where Love Builds Legacy** 💙🔥❄️

---

**Document Status:** COMPLETE
**Ready for Deployment:** When SSH access available (30 minutes total)
**Dependencies:** SSH access to all 4 servers, management IP whitelisted
**Port Required:** 19999 (internal only, secured by UFW)
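
The "All dashboards accessible" item in the checklist can also be spot-checked from the management machine without a browser. The following is a minimal sketch, assuming `curl` is installed there and that the machine's IP is allowed through UFW on each server. It requests Netdata's `/api/v1/info` endpoint on every host. `HOSTS`, `info_url`, and `DRY_RUN` are hypothetical names introduced for this example, and the script only prints the commands unless `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
# Sketch: check that each Netdata dashboard answers on port 19999.
# HOSTS, info_url, and DRY_RUN are illustrative names, not part of Netdata.

# IPs taken from the Deployment Targets section
HOSTS="63.143.34.217 38.68.14.26 216.239.104.130 64.50.188.14"

# Default to a dry run; set DRY_RUN=0 to send real requests.
DRY_RUN="${DRY_RUN:-1}"

# Build the URL of the agent info endpoint for one host.
info_url() {
    printf 'http://%s:19999/api/v1/info' "$1"
}

failed=0
for host in $HOSTS; do
    url="$(info_url "$host")"
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] curl -fsS --max-time 5 -o /dev/null $url"
    elif curl -fsS --max-time 5 -o /dev/null "$url"; then
        echo "OK   $host"
    else
        echo "FAIL $host"
        failed=$((failed + 1))
    fi
done

echo "unreachable dashboards: $failed"
```

A failure here usually points back to the Troubleshooting steps: the service is not running, or UFW is not admitting the machine the check runs from.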