firefrost-gaming/firefrost-operations-manual

Michael Krause 4f9d922713 Add Wizard & Michael collaborative workflow guide

2026-02-08 02:09:33 -06:00

13 KiB

The Wizard & Michael: Collaborative Workflow Guide

Version: 1.0
Created: February 8, 2026
Purpose: Standard operating procedures for Phase 0.5+ deployments
Status: Active Protocol

Our Roles

Michael (The Operator)

Executes commands on the live server via SSH
Maintains control of production infrastructure
Reviews changes before they go live
Final authority on all decisions

Claude "The Wizard" (The Architect)

Designs solutions and provides step-by-step guidance
Generates configurations and documentation
Troubleshoots issues and provides context
Maintains accessibility with micro-block commands

Core Principles

Security First: Michael maintains root access; Claude operates in an advisory role
Visibility Always: Every command is shown before execution
Micro-Blocks: Max 8-10 lines per code block (accessibility requirement)
Checkpoints: Pause for verification at critical steps
Documentation: Everything gets archived in Git

Standard Deployment Workflow

Phase 1: Planning & Strategy

Claude Provides:

Service architecture overview
IP allocation strategy
Port registry
Frostwall rules
Nginx configuration plan
DNS requirements

Michael Reviews:

Confirms strategy aligns with infrastructure
Identifies potential conflicts
Approves IP/port assignments
Checkpoint: "Approved" or requests changes

Phase 2: Pre-Deployment Audit

Claude Guides:

# Example: Verify IP is active
ip addr show ens3 | grep [TARGET_IP]

Michael Executes:

Connects to server via SSH
Runs verification commands
Reports output to Claude
Checkpoint: "Success" when verified

Common Pre-Deployment Checks:

IP addresses active
DNS records configured
Ports available (not in use)
Firewall status
Existing service conflicts
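
The "ports available" check above can be scripted. The following is a minimal sketch, not part of the original procedure: port_in_use is a hypothetical helper that reads `ss -tln` output on standard input and succeeds only if some listener is bound to the given port.

```shell
# Hypothetical helper (not from the original guide).
# Reads `ss -tln` output on stdin; exits 0 if a listener uses port $1.
port_in_use() {
  awk -v port="$1" '
    NR > 1 {                      # skip the header row
      n = split($4, parts, ":")   # column 4 is Local Address:Port
      if (parts[n] == port) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}
```

On the server this might be run as its own micro-block, for example `ss -tln | port_in_use 8080 && echo "port 8080 is taken"`, where 8080 is only an illustrative port.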

Phase 3: Service Installation

The Micro-Block Process:

Step 1: Claude provides ONE small command block

# Example
apt update

Step 2: Michael executes and reports

Michael: "success"

Step 3: Claude provides NEXT command block

apt install -y nginx

Step 4: Repeat until service is installed

Critical Rules:

✅ One block at a time
✅ Wait for "success" before continuing
✅ Never combine unrelated operations
✅ Separate creation, ownership, and permissions into different blocks

Example of GOOD micro-blocks:

# Block 1: Create directory
mkdir -p /var/lib/service

# Block 2: Set ownership
chown service:service /var/lib/service

# Block 3: Set permissions
chmod 750 /var/lib/service

Example of BAD (too long):

# DON'T DO THIS - too many operations
mkdir -p /var/lib/service && \
chown service:service /var/lib/service && \
chmod 750 /var/lib/service && \
systemctl enable service && \
systemctl start service
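
The 8-10 line limit from Core Principles could also be checked mechanically before a block is sent. This is a rough sketch under assumptions: block_within_limit is a hypothetical helper, the block is assumed to be saved in a file, and every line (comments included) counts toward the limit.

```shell
# Hypothetical helper (not from the original guide).
# Exits 0 if FILE has at most MAX lines (default 10, the upper bound
# named in Core Principles).
block_within_limit() {
  max="${2:-10}"
  count=$(grep -c '' "$1")   # counts every line, including an unterminated last one
  [ "$count" -le "$max" ]
}
```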

Phase 4: Configuration

For Config Files:

Option A: Full file paste (preferred for accessibility)

nano /etc/service/config.conf
# Claude provides complete file content
# Michael pastes, saves (Ctrl+X, Y, Enter)

Option B: Targeted edits (only if necessary)

# Claude provides specific sed/awk commands
# One change per block

Security Checkpoint:

Michael reviews config for sensitive data
Claude reminds about .gitignore for secrets
Create sanitized templates before Git commit

Phase 5: Service Startup

Standard Sequence:

# Block 1: Reload systemd
systemctl daemon-reload

# Block 2: Enable on boot
systemctl enable [service]

# Block 3: Start service
systemctl start [service]

# Block 4: Verify status
systemctl status [service]

Michael reports output; Claude analyzes for issues

Phase 6: Frostwall Configuration

Per-Service Firewall Rules:

# Block 1: Allow HTTP
ufw allow in on ens3 to [IP] port 80 proto tcp

# Block 2: Allow HTTPS
ufw allow in on ens3 to [IP] port 443 proto tcp

# Block 3: Reload firewall
ufw reload

# Block 4: Verify rules
ufw status numbered | grep [IP]

Michael confirms rules are active
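
In keeping with "Visibility Always", the per-service rules can be generated as text for review before anything is executed. A minimal sketch, assuming the ens3 interface used throughout this guide; frostwall_rules is a hypothetical helper and 192.0.2.10 is a documentation-only placeholder address.

```shell
# Hypothetical helper (not from the original guide).
# Prints one ufw rule per port for review; it does not run ufw itself.
frostwall_rules() {
  ip="$1"
  shift
  for port in "$@"; do
    printf 'ufw allow in on ens3 to %s port %s proto tcp\n' "$ip" "$port"
  done
}

# Example: show the HTTP and HTTPS rules for a placeholder address.
frostwall_rules 192.0.2.10 80 443
```

Each printed line can then be pasted and run as a separate micro-block.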

Phase 7: SSL Certificate

Let's Encrypt Installation:

# Block 1: Install Certbot
apt install -y certbot python3-certbot-nginx

# Block 2: Obtain certificate
certbot --nginx -d [subdomain].firefrostgaming.com

Interactive prompts (Michael handles):

Email: mkrause612@gmail.com
Terms: Y
Share email: N
Redirect: 2 (Yes)

# Block 3: Verify certificate
ls -la /etc/letsencrypt/live/[subdomain].firefrostgaming.com/

Phase 8: Verification & Testing

Standard Test Sequence:

# Block 1: Test HTTPS
curl -I https://[subdomain].firefrostgaming.com

# Block 2: Check port bindings
ss -tlnp | grep [IP]

# Block 3: Verify DNS
nslookup [subdomain].firefrostgaming.com

# Block 4: Test service functionality
# (Service-specific commands)

Success Criteria:

HTTP/2 200 response
Ports bound to correct IP
DNS resolves correctly
Service responds as expected
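
The first success criterion can be turned into a pass/fail check on the output of the `curl -I` command above. A minimal sketch; http_ok is a hypothetical helper, and it inspects only the first status line it receives.

```shell
# Hypothetical helper (not from the original guide).
# Reads `curl -I` headers on stdin; exits 0 if the status line reports 200.
http_ok() {
  awk 'NR == 1 { ok = ($2 + 0 == 200) } END { exit ok ? 0 : 1 }'
}
```

A possible use would be `curl -sI https://[subdomain].firefrostgaming.com | http_ok && echo "HTTPS OK"`.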

Phase 9: Git Archiving

Repository Update Process:

Step 1: Navigate to repo

cd /root/firefrost-master-configs

Step 2: Copy configs

# Example
cp /etc/nginx/sites-available/[service].conf web/

Step 3: Check sensitive data

grep -inE "secret|password|token|key" [file]

If sensitive data found:

Create .gitignore entry
Create sanitized template
Only commit template
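
One way a sanitized template could be produced is sketched below. It is an illustration, not the procedure used for the Gitea deployment: sanitize_config is a hypothetical helper, it assumes a simple KEY=VALUE file format, and CHANGE_ME is an arbitrary placeholder.

```shell
# Hypothetical helper (not from the original guide).
# Prints FILE with the value of every KEY=VALUE line whose key looks
# sensitive replaced by a placeholder. Assumes a simple KEY=VALUE format.
sanitize_config() {
  awk -F= '
    NF > 1 && tolower($1) ~ /secret|password|token|key/ {
      key = $1
      sub(/[ \t]+$/, "", key)     # trim spaces before the "="
      print key " = CHANGE_ME"
      next
    }
    { print }
  ' "$1"
}
```

For example, `sanitize_config app.ini > app.ini.template` (hypothetical file names) would write the template, which should still be reviewed by eye before it is committed.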

Step 4: Stage changes

git add [files]

Step 5: Review what will be committed

git status

Step 6: Commit

git commit -m "[Descriptive message about what changed]"

Step 7: Push to Gitea

git push

Michael enters credentials when prompted

Phase 10: Documentation

Claude Generates:

Technical dossier (specs, changelog, troubleshooting)
User guide (if applicable)
Deployment summary

Michael Creates:

cd /root/firefrost-master-configs/docs
nano [service]-deployment.md
# Paste Claude's documentation
# Save: Ctrl+X, Y, Enter

Commit Documentation:

git add docs/
git commit -m "Add [service] deployment documentation"
git push

Communication Protocol

Michael's Status Codes

Response	Meaning
"success"	Command executed successfully, continue
"checkpoint"	Pause, need clarification or review
"error"	Command failed, need troubleshooting
"pause"	Taking a break, resume later
"proceed"	Approved to continue after review
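
The status codes above amount to a small dispatch table. Purely as an illustration, with a hypothetical next_action helper and action names invented for this sketch, the mapping could be written as:

```shell
# Hypothetical helper (not from the original guide).
# Maps one of Michael's status codes to the next step in the session.
next_action() {
  case "$1" in
    success|proceed) echo "continue" ;;
    checkpoint)      echo "pause for review" ;;
    error*)          echo "troubleshoot" ;;   # "error - [message]" also matches
    pause)           echo "resume later" ;;
    *)               echo "ask for clarification" ;;
  esac
}
```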

Claude's Responsibilities

Always Provide:

✅ Clear command with context
✅ Expected output description
✅ Why this step is necessary
✅ What could go wrong

Never Provide:

❌ Multiple unrelated commands in one block
❌ Commands without explanation
❌ Assumptions about file locations without verification
❌ Complex one-liners when multiple simple commands are clearer

Checkpoint Triggers

Michael Should Call "Checkpoint" When:

Something unexpected appears in output
Unsure about a configuration option
Want to verify understanding before proceeding
Need to review security implications
Want to discuss alternative approaches

Claude Will Call "Checkpoint" When:

Critical configuration decision needed
Multiple valid approaches exist
Security/data loss risk detected
Deviation from standard procedure required

Error Handling Protocol

When Something Goes Wrong

Step 1: Michael reports the error

Michael: "error - [paste error message]"

Step 2: Claude analyzes

Identifies root cause
Explains what happened
Provides solution options

Step 3: Remediation

Claude provides fix in micro-blocks
Michael executes
Verify issue resolved

Step 4: Documentation

Add to "Troubleshooting" section
Note for future reference

Service-Specific Templates

New Service Deployment Checklist

Pre-Deployment:

IP assigned from /29 block
Port registry updated (avoid conflicts)
DNS A record created in Cloudflare
Frostwall strategy planned

Installation:

System user created
Directories created with correct ownership/permissions
Service binary/package installed
Configuration file created

Network Setup:

Nginx site config created
Temporary self-signed cert (if needed)
Nginx enabled and restarted
Frostwall rules applied

SSL & Security:

Let's Encrypt certificate obtained
Auto-renewal verified
Permissions locked down

Verification:

HTTPS responding correctly
Service functionality tested
Ports bound to correct IP
DNS propagated

Documentation:

Configs copied to Git repo
Sensitive data sanitized
Changes committed and pushed
Technical documentation created
User guide created (if needed)

Quick Reference Commands

System Information

# Check IP addresses
ip addr show ens3

# Check listening ports
ss -tlnp | grep [IP or PORT]

# Check running services
systemctl status [service]

# Check firewall
ufw status numbered

Git Operations

# Stage files
git add [file or folder]

# Commit changes
git commit -m "Descriptive message"

# Push to Gitea
git push

# Check status
git status

# View history
git log --oneline

Service Management

# Start service
systemctl start [service]

# Stop service
systemctl stop [service]

# Restart service
systemctl restart [service]

# Enable on boot
systemctl enable [service]

# View logs
journalctl -u [service] -f

Nginx

# Test config
nginx -t

# Reload config
systemctl reload nginx

# Restart nginx
systemctl restart nginx

# Check bindings
ss -tlnp | grep nginx

SSL/Certbot

# Test renewal
certbot renew --dry-run

# Check certificates
certbot certificates

# Force renewal
certbot renew --force-renewal

End-of-Session Protocol

Before Taking a Break

Michael:

Verify all services are running
Check for hanging processes
Exit SSH cleanly
Note stopping point for next session

Claude:

Summarize what was accomplished
Note current phase progress (X/5 services)
Preview next steps
Provide status update

Session Summary Template:

✅ [Service Name] - COMPLETE
- Deployed on [IP]
- SSL configured
- Frostwall active
- Documented in Git

⏳ Next Service: [Name] ([IP]) - [subdomain]

Phase 0.5 Progress: X/5 (XX%)
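
The progress line in the template is simple integer arithmetic. A small sketch, assuming the five-service total shown in the template; phase_progress is a hypothetical helper.

```shell
# Hypothetical helper (not from the original guide).
# Prints the progress line of the session summary using integer arithmetic.
phase_progress() {
  done_count="$1"
  total="${2:-5}"   # the template tracks 5 services
  printf 'Phase 0.5 Progress: %d/%d (%d%%)\n' \
    "$done_count" "$total" $(( done_count * 100 / total ))
}

phase_progress 2   # prints: Phase 0.5 Progress: 2/5 (40%)
```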

Emergency Procedures

If Service Breaks Production

Step 1: Assess impact

# Check what's affected
systemctl list-units --failed

Step 2: Quick rollback

# Stop problematic service
systemctl stop [service]

# Disable if needed
systemctl disable [service]

Step 3: Restore from Git

cd /root/firefrost-master-configs
git log  # Find last working commit
# Copy old config back

Step 4: Restart affected services

systemctl restart [service]

If Locked Out

Prevention:

Always test SSH access before closing terminal
Keep firewall rule for port 22
Have backup access method (VPS console)

Recovery:

Use VPS provider's console access
Review ufw rules
Re-enable SSH if blocked

Success Metrics

A Successful Deployment Includes:

✅ Service running and responding
✅ SSL certificate active (HTTPS working)
✅ Frostwall rules applied
✅ DNS resolving correctly
✅ All configs backed up to Git
✅ Documentation complete
✅ No errors in service logs
✅ Michael can take a break without worry

Notes & Lessons Learned

From Gitea Deployment (Service 1):

What Worked Well:

Micro-block format for commands
Complete file paste for configs (vs line-by-line edits)
IP isolation strategy (one IP per service)
Checkpoint system for reviews
Sanitized templates for sensitive configs

Issues Encountered:

Default Nginx site conflicted with IP binding (removed default)
Port 80 required full nginx restart (not just reload) to clear inherited sockets
Needed self-signed cert before Let's Encrypt
UFW installation removed iptables-persistent

Solutions Applied:

Remove /etc/nginx/sites-enabled/default
Use systemctl restart nginx after major config changes
Generate temporary self-signed cert for testing
Documented UFW as standard Frostwall tool

Carry Forward:

Always check for default configs that bind 0.0.0.0
Full restart after major changes
Keep templates for SSL cert generation
UFW is now standard (Phase 0 used iptables, Phase 0.5+ uses UFW)
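
The first carry-forward item can be approximated with a search for nginx listen directives that name no specific address. This is a heuristic sketch, not a full nginx parser; find_wildcard_listens is a hypothetical helper and the pattern covers only the common forms (a bare port, 0.0.0.0:PORT, and [::]:PORT).

```shell
# Hypothetical helper (not from the original guide).
# Lists nginx `listen` directives under DIR that name no specific address,
# i.e. a bare port, 0.0.0.0:PORT, or [::]:PORT.
find_wildcard_listens() {
  grep -rnE '^[[:space:]]*listen[[:space:]]+(0\.0\.0\.0:|\[::\]:)?[0-9]+([[:space:]][^;]*)?;' "$1"
}
```

Running it against /etc/nginx/sites-enabled before a deployment would likely have flagged the default site that conflicted with the Gitea IP binding.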

Revision History

Version	Date	Changes
1.0	2026-02-08	Initial workflow guide created after Gitea deployment success

END OF WORKFLOW GUIDE

The Wizard & Michael: Building Firefrost Infrastructure, One Service at a Time 🧙‍♂️⚡
