firefrost-operations-manual/docs/tasks/netdata-deployment/deployment-guide.md
Claude 4f27f25a74 docs: Add comprehensive Netdata deployment guide
Task: Netdata Deployment (Tier 2)
FFG-STD-002 compliant
2026-02-17 22:58:31 +00:00


Netdata Deployment - Complete Guide

Status: Ready to Deploy
Priority: Tier 2 - Infrastructure Monitoring
Time Estimate: 30 minutes (all servers)
Last Updated: 2026-02-17


Overview

Deploy Netdata real-time monitoring across all Firefrost infrastructure. Netdata provides web dashboards for CPU, RAM, disk, network, and application metrics, and its default collectors need no configuration. The firewall, streaming, and alert steps below are the only manual setup.

What is Netdata?

  • Real-time performance monitoring
  • Beautiful web dashboards
  • Zero configuration needed
  • Extremely lightweight (< 3% CPU, ~100 MB RAM)
  • Open source and free

Deployment Targets

All 4 infrastructure servers:

  1. Command Center (63.143.34.217) - Dallas hub

    • Services: Gitea, Uptime Kuma, Code-Server, Automation
    • Dashboard: http://63.143.34.217:19999
  2. TX1 (38.68.14.26) - Dallas game servers

    • Services: 5 Minecraft servers + FoundryVTT
    • Dashboard: http://38.68.14.26:19999
  3. NC1 (216.239.104.130) - Charlotte game servers

    • Services: 6 Minecraft servers + Hytale
    • Dashboard: http://216.239.104.130:19999
  4. Ghost VPS (64.50.188.14) - Chicago staff services

    • Services: MkDocs, Wiki.js (x2), NextCloud
    • Dashboard: http://64.50.188.14:19999

Installation (Per Server)

One-Line Install

On each server:

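# Install Netdata (current kickstart URL; my-netdata.io is the legacy address)
bash <(curl -SsL https://get.netdata.cloud/kickstart.sh)

# The installer will:
# - Auto-detect your OS
# - Install dependencies
# - Install Netdata from a native package or static build (compiling from source only as a fallback)
# - Start the service, which listens on port 19999
# It does not change firewall rules; see Post-Installation Configuration.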

Installation takes: 2-5 minutes per server


Step-by-Step Deployment

Phase 1: Install on Command Center (10 min)

# SSH to Command Center
ssh root@63.143.34.217

# Run installer
bash <(curl -SsL https://get.netdata.cloud/kickstart.sh)

# Wait for installation to complete
# Answer prompts (usually just press Enter for defaults)

# Verify installation
systemctl status netdata

# Should show: active (running)

# Test dashboard
curl http://localhost:19999

# Should return HTML

Open in browser: http://63.143.34.217:19999

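You should see the Netdata dashboard! If UFW is already enabled on the server, the page will not load until the rule in step 1 of Post-Installation Configuration has been added.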


Phase 2: Install on TX1 (5 min)

# SSH to TX1
ssh root@38.68.14.26

# Run installer
bash <(curl -SsL https://get.netdata.cloud/kickstart.sh)

# Verify
systemctl status netdata

# Test
curl http://localhost:19999

Open in browser: http://38.68.14.26:19999


Phase 3: Install on NC1 (5 min)

# SSH to NC1
ssh root@216.239.104.130

# Run installer
bash <(curl -SsL https://get.netdata.cloud/kickstart.sh)

# Verify
systemctl status netdata

# Test
curl http://localhost:19999

Open in browser: http://216.239.104.130:19999


Phase 4: Install on Ghost VPS (5 min)

# SSH to Ghost
ssh root@64.50.188.14

# Run installer
bash <(curl -SsL https://get.netdata.cloud/kickstart.sh)

# Verify
systemctl status netdata

# Test
curl http://localhost:19999

Open in browser: http://64.50.188.14:19999
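
Once all four phases are done, the per-server checks can be repeated in one pass from the management machine. The following is a minimal sketch, not part of the original procedure: `dashboard_url`, `check_dashboard`, and `check_all` are hypothetical helper names, the host list is copied from Deployment Targets, and it assumes `curl` is installed and that the firewall already admits the machine running it. It requests `/api/v1/info`, an endpoint of the agent's v1 REST API.

```bash
#!/usr/bin/env bash
# Sketch: check that each Netdata agent answers over HTTP.
# Host list copied from "Deployment Targets"; function names are illustrative.

NETDATA_PORT=19999
HOSTS="63.143.34.217 38.68.14.26 216.239.104.130 64.50.188.14"

# Print the dashboard base URL for a host.
dashboard_url() {
    printf 'http://%s:%s\n' "$1" "$NETDATA_PORT"
}

# Return 0 if the agent's /api/v1/info endpoint answers within 5 seconds.
check_dashboard() {
    curl -fsS --max-time 5 -o /dev/null "$(dashboard_url "$1")/api/v1/info"
}

# Report OK/FAIL per host; return non-zero if any host failed.
check_all() {
    failed=0
    for host in $HOSTS; do
        if check_dashboard "$host"; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

After sourcing the file, `check_all` prints one line per server and returns non-zero if any dashboard did not respond.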


Post-Installation Configuration

1. Configure UFW Firewall

On each server:

# Allow Netdata port from Michael's management IP only
ufw allow from MICHAEL_MANAGEMENT_IP to any port 19999 proto tcp

# Verify
ufw status | grep 19999

Security note: Netdata dashboards contain sensitive server information. Only allow access from trusted IPs.
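
The rule above covers dashboard access only. If parent-child streaming (step 2) is used, the parent also has to accept connections on port 19999 from each child, as the streaming troubleshooting checklist notes. A minimal sketch that prints the `ufw` commands for review instead of running them; the function name `ufw_rules` and its `parent`/`child` role argument are hypothetical, and the management IP stays a value you supply.

```bash
#!/usr/bin/env bash
# Sketch: print (not execute) the UFW rules for a Netdata host.
# Child IPs are copied from "Deployment Targets"; names are illustrative.

NETDATA_PORT=19999
CHILD_IPS="38.68.14.26 216.239.104.130 64.50.188.14"

# ufw_rules ROLE MGMT_IP
#   ROLE is "parent" (Command Center) or "child" (TX1, NC1, Ghost VPS).
ufw_rules() {
    role=$1
    mgmt_ip=$2
    # Dashboard access from the management IP, on every server.
    echo "ufw allow from $mgmt_ip to any port $NETDATA_PORT proto tcp"
    # The streaming parent must also accept connections from each child.
    if [ "$role" = "parent" ]; then
        for ip in $CHILD_IPS; do
            echo "ufw allow from $ip to any port $NETDATA_PORT proto tcp"
        done
    fi
}
```

For example, `ufw_rules parent 203.0.113.10` prints four rules (203.0.113.10 is a documentation address standing in for the real management IP). Review the output before running any of it as root.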


2. Set Up Parent-Child Streaming (Optional)

Benefit: View all servers from one dashboard (Command Center)

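On Command Center (parent):

# Edit config
nano /etc/netdata/stream.conf

# Add a section named after the API key (replace the placeholder with a real UUID):
[11111111-2222-3333-4444-555555555555]
    enabled = yes
    default history = 3600
    default memory mode = save
    health enabled by default = yes

# Restart netdata
systemctl restart netdata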

On TX1, NC1, Ghost (children):

# Edit config
nano /etc/netdata/stream.conf

# Add:
[stream]
    enabled = yes
    destination = 63.143.34.217:19999
    api key = 11111111-2222-3333-4444-555555555555
    
# Restart netdata
systemctl restart netdata

Result: All server metrics visible on Command Center dashboard
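
The API key shown above is a placeholder. Any valid UUID works, and `uuidgen` is the usual way to produce one. The sketch below validates a key and renders the child-side `[stream]` block so the same value can be pasted on all three children; `is_uuid` and `render_child_stream_conf` are hypothetical helper names, and `uuidgen` is assumed to be installed.

```bash
#!/usr/bin/env bash
# Sketch: validate a streaming API key and render the child-side [stream] block.
# Function names are illustrative, not part of Netdata.

# Return 0 if the argument looks like a UUID (8-4-4-4-12 hex digits).
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eqi '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Print a [stream] section for a child, given the parent address and API key.
render_child_stream_conf() {
    parent=$1
    api_key=$2
    if ! is_uuid "$api_key"; then
        echo "error: API key is not a valid UUID: $api_key" >&2
        return 1
    fi
    printf '[stream]\n'
    printf '    enabled = yes\n'
    printf '    destination = %s\n' "$parent"
    printf '    api key = %s\n' "$api_key"
}
```

Typical use: `key=$(uuidgen)` once, then `render_child_stream_conf 63.143.34.217:19999 "$key"` for the children, with the same `$key` as the section name in the parent's `stream.conf`.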


3. Configure Alerts

Edit alert config:

nano /etc/netdata/health.d/custom.conf

Example alerts:

# Alert when CPU usage > 80% for 5 minutes
alarm: cpu_usage
    on: system.cpu
  calc: $user + $system
 every: 1m
  warn: $this > 80
  crit: $this > 95
 delay: up 5m down 15m
  info: CPU usage is too high

# Alert when RAM usage > 90% (cache and buffers are counted in the total, not as used)
alarm: ram_usage
    on: system.ram
  calc: $used * 100 / ($used + $cached + $free + $buffers)
 every: 1m
  warn: $this > 90
  crit: $this > 95
 delay: up 5m down 15m
  info: RAM usage is too high

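# Alert when free disk space < 20%
# (disk.space is a chart context, so this must be a template; it applies to every mounted filesystem)
template: disk_space
      on: disk.space
    calc: $avail * 100 / ($avail + $used)
   every: 1m
    warn: $this < 20
    crit: $this < 10
   delay: up 5m down 15m
    info: Disk space is running low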

Reload config:

killall -USR2 netdata
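
After the reload it is worth confirming that the agent actually registered the custom alerts, because a mistake in `custom.conf` shows up in the Netdata log rather than on the terminal. A minimal sketch that queries the agent's `/api/v1/alarms?all` endpoint; `alarm_loaded` and `report_alarms` are hypothetical helpers, and the default port and `curl` are assumed.

```bash
#!/usr/bin/env bash
# Sketch: confirm that custom alerts are present in the agent's alarm list.
# Helper names are illustrative; alert names match the examples above.

# Return 0 if an alert with the given name is listed by the agent.
alarm_loaded() {
    name=$1
    base=${2:-http://localhost:19999}
    curl -fsS --max-time 5 "$base/api/v1/alarms?all" |
        grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$name\""
}

# Print loaded/MISSING for each custom alert from this guide.
report_alarms() {
    for name in cpu_usage ram_usage disk_space; do
        if alarm_loaded "$name"; then
            echo "loaded   $name"
        else
            echo "MISSING  $name"
        fi
    done
}
```

Run `report_alarms` on the server itself. A `MISSING` line usually means the definition was rejected, so check the log for the parse error.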

4. Discord Integration (Optional)

Set up Discord webhook for alerts:

# Edit alarm notification config
nano /etc/netdata/health_alarm_notify.conf

# Find Discord section and configure:
SEND_DISCORD="YES"
DISCORD_WEBHOOK_URL="YOUR_DISCORD_WEBHOOK_URL_HERE"
DEFAULT_RECIPIENT_DISCORD="network-alerts"

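Test alert:

# Trigger test alert as the netdata user, so the same permissions apply as for real alerts
# (static installs keep the script under /opt/netdata/usr/libexec/netdata/plugins.d/)
sudo -u netdata /usr/libexec/netdata/plugins.d/alarm-notify.sh test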

Check Discord for test notification.
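
If the test notification never arrives, it helps to rule out the webhook itself before debugging Netdata. The sketch below posts a message straight to the webhook with `curl`; Discord webhooks accept a JSON body with a `content` field. `discord_payload` and `send_discord_test` are hypothetical helper names, and the URL is whatever was set as `DISCORD_WEBHOOK_URL`.

```bash
#!/usr/bin/env bash
# Sketch: send a message directly to a Discord webhook, bypassing Netdata.
# Helper names are illustrative; single-line messages are assumed.

# Build the JSON body, escaping backslashes and double quotes in the message.
discord_payload() {
    msg=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"content": "%s"}\n' "$msg"
}

# POST the message to the webhook URL; non-zero exit on HTTP errors.
send_discord_test() {
    url=$1
    msg=${2:-"Netdata webhook test"}
    curl -fsS --max-time 10 -H 'Content-Type: application/json' \
        -d "$(discord_payload "$msg")" "$url"
}
```

Usage: `send_discord_test "YOUR_DISCORD_WEBHOOK_URL_HERE"`. If this message appears in the channel but Netdata's test alert does not, the problem is in `health_alarm_notify.conf` rather than in Discord.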


Dashboard Access

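Save these bookmarks:

  • Command Center: http://63.143.34.217:19999
  • TX1: http://38.68.14.26:19999
  • NC1: http://216.239.104.130:19999
  • Ghost VPS: http://64.50.188.14:19999

Unified View (if streaming configured):

  • http://63.143.34.217:19999 (Command Center, the streaming parent)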

Key Metrics to Monitor

CPU:

  • User % (application load)
  • System % (kernel load)
  • IOWait % (disk bottleneck indicator)

RAM:

  • Used vs Available
  • Cache (should be high, that's good!)
  • Swap usage (should be low)

Disk:

  • Disk space remaining
  • Read/write speeds
  • IOPS

Network:

  • Bandwidth usage
  • Packet drops
  • Connection count

Minecraft Servers (TX1/NC1):

  • Java heap usage
  • GC activity
  • Thread count
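
The system-level metrics above can also be read without the dashboard, through the agent's `/api/v1/data` endpoint. A minimal sketch that builds such a query for one chart; `metric_url` and `fetch_metric` are hypothetical helpers, and the chart IDs `system.cpu` and `system.ram` are the ones used in the alert examples. The query parameters (`chart`, `after`, `points`, `group`, `format`) belong to the v1 API, but check the exact output format against your installed version.

```bash
#!/usr/bin/env bash
# Sketch: query one chart's recent average through the Netdata v1 API.
# Helper names are illustrative.

# metric_url HOST CHART SECONDS
#   Build a URL returning a single averaged point covering the last SECONDS seconds.
metric_url() {
    host=$1
    chart=$2
    seconds=$3
    printf 'http://%s:19999/api/v1/data?chart=%s&after=-%s&points=1&group=average&format=csv\n' \
        "$host" "$chart" "$seconds"
}

# Fetch the CSV for the given host, chart, and window.
fetch_metric() {
    curl -fsS --max-time 5 "$(metric_url "$1" "$2" "$3")"
}
```

For example, `fetch_metric localhost system.cpu 300` run on a server returns the CPU dimensions averaged over the last five minutes.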

Maintenance

Daily

  • Quick glance at dashboards (bookmark all 4)
  • Check for any red alerts

Weekly

  • Review CPU/RAM trends
  • Check disk space projections
  • Verify alerts working

Monthly

  • Review historical data
  • Adjust alert thresholds if needed
  • Update Netdata if new version available

Updates

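Check for updates:

# On each server (the updater is not on PATH; static installs keep it under /opt/netdata)
/usr/libexec/netdata/netdata-updater.sh

Or auto-update (recommended):

The kickstart installer enables automatic updates by default. The updater checks for a new version daily and installs it without intervention.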
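
Where the updater script lives depends on how Netdata was installed. The sketch below looks in two common locations, `/usr/libexec/netdata` and `/opt/netdata/usr/libexec/netdata` (static builds). Treat both paths as assumptions to verify on each server, and `find_updater` as a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: locate netdata-updater.sh before running it.
# The directory list is an assumption; override UPDATER_DIRS if your layout differs.

UPDATER_DIRS=${UPDATER_DIRS:-"/usr/libexec/netdata /opt/netdata/usr/libexec/netdata"}

# Print the path of the first executable updater found; return 1 if none exists.
find_updater() {
    for dir in $UPDATER_DIRS; do
        if [ -x "$dir/netdata-updater.sh" ]; then
            echo "$dir/netdata-updater.sh"
            return 0
        fi
    done
    echo "netdata-updater.sh not found in: $UPDATER_DIRS" >&2
    return 1
}
```

Usage: `updater=$(find_updater) && "$updater"` runs the update only if the script was found.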


Troubleshooting

Dashboard won't load

Check service:

systemctl status netdata

Restart if needed:

systemctl restart netdata

Check firewall:

ufw status | grep 19999

# Confirm Netdata is listening locally (a local check does not pass through UFW)
ss -tlnp | grep 19999
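
The checks above separate three cases: the service is down, the service is up but not listening, or it is listening and something between the browser and the server is blocking it. A rough triage sketch, run on the affected server; `port_listening` and `triage_dashboard` are hypothetical helpers, and `ss` (iproute2) and systemd are assumed.

```bash
#!/usr/bin/env bash
# Sketch: narrow down why the dashboard does not load.
# Helper names are illustrative.

# Return 0 if some TCP socket is listening on the given port.
port_listening() {
    port=$1
    ss -tln | awk -v p=":$port" 'NR > 1 && $4 ~ (p "$") { found = 1 } END { exit !found }'
}

# Print the most likely cause and the next thing to check.
triage_dashboard() {
    if ! systemctl is-active --quiet netdata; then
        echo "netdata service is not running: systemctl restart netdata"
    elif ! port_listening 19999; then
        echo "netdata is running but nothing listens on 19999: check the [web] section of netdata.conf"
    else
        echo "netdata is listening locally: check the UFW rule and the client IP"
    fi
}
```

`triage_dashboard` prints one line. The last case points at the firewall rule from step 1 or at a management IP that has changed.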

High CPU usage from Netdata

Netdata should use < 3% CPU normally.

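If usage stays higher, disable plugins you do not need:

# Edit the main config
nano /etc/netdata/netdata.conf

# Under [plugins], disable unused:
python.d = no
# node.d exists only in older Netdata releases; omit this line if your version no longer ships it
node.d = no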

Streaming not working

Verify:

  • Parent (Command Center) has stream.conf with API key
  • Children have correct parent IP
  • Port 19999 accessible from children to parent
  • API keys match exactly

Debug:

# On child
tail -f /var/log/netdata/error.log | grep stream
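
The third item in the checklist, port 19999 reachable from child to parent, can be tested directly from a child. A minimal sketch using `nc`; `can_reach` and `check_parent` are hypothetical helpers, `nc` (netcat) is assumed to be installed, and the default address is the Command Center IP from Deployment Targets.

```bash
#!/usr/bin/env bash
# Sketch: test TCP reachability of the streaming parent from a child.
# Helper names are illustrative.

# Return 0 if a TCP connection to HOST PORT succeeds within 3 seconds.
can_reach() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# Report whether the parent accepts connections on the streaming port.
check_parent() {
    parent=${1:-63.143.34.217}
    port=${2:-19999}
    if can_reach "$parent" "$port"; then
        echo "parent $parent:$port is reachable"
    else
        echo "cannot reach $parent:$port: check UFW on the parent"
        return 1
    fi
}
```

Run `check_parent` on TX1, NC1, and Ghost VPS. If it fails, the parent's firewall most likely lacks an allow rule for that child's IP.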

Alerts not sending to Discord

Check:

  • Discord webhook URL correct
  • SEND_DISCORD="YES" set
  • Test alert sent successfully

Debug:

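# Debug output is enabled through an environment variable; the argument after "test" is a role name, not a debug flag
NETDATA_ALARM_NOTIFY_DEBUG=1 /usr/libexec/netdata/plugins.d/alarm-notify.sh test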

Advanced Features (Optional)

Netdata Cloud (Free)

Benefits:

  • Centralized dashboard for all servers
  • Mobile app
  • Longer data retention
  • Collaboration features

Setup:

  1. Go to https://app.netdata.cloud
  2. Create free account
  3. Claim nodes:
# On each server
netdata-claim.sh -token=YOUR_TOKEN -rooms=YOUR_ROOM -url=https://app.netdata.cloud

Custom Dashboards

Create custom dashboards with specific metrics:

  1. Open Netdata dashboard
  2. Click "Create Dashboard"
  3. Add charts
  4. Save and share URL

Success Criteria Checklist

  • Netdata installed on Command Center
  • Netdata installed on TX1
  • Netdata installed on NC1
  • Netdata installed on Ghost VPS
  • All dashboards accessible via browser
  • UFW rules configured (management IP only)
  • Alerts configured for CPU/RAM/Disk
  • (Optional) Discord integration working
  • (Optional) Parent-child streaming configured
  • Dashboards bookmarked for quick access

Related Tasks

  • Staggered Server Restart System - Monitor impact on resources
  • World Backup Automation - Monitor backup job duration
  • Command Center Security - Part of monitoring infrastructure
  • Frostwall Protocol - Monitor tunnel performance

Fire + Frost + Foundation = Where Love Builds Legacy 💙🔥❄️


Document Status: COMPLETE
Ready for Deployment: When SSH access available (30 minutes total)
Dependencies: SSH access to all 4 servers, management IP whitelisted
Port Required: 19999 (restricted by UFW to the management IP)