# The Wizard & Michael: Collaborative Workflow Guide
**Version:** 1.0
**Created:** February 8, 2026
**Purpose:** Standard operating procedures for Phase 0.5+ deployments
**Status:** Active Protocol

---
## Our Roles
### Michael (The Operator)
- **Executes commands** on the live server via SSH
- **Maintains control** of production infrastructure
- **Reviews changes** before they go live
- **Final authority** on all decisions
### Claude "The Wizard" (The Architect)
- **Designs solutions** and provides step-by-step guidance
- **Generates configurations** and documentation
- **Troubleshoots issues** and provides context
- **Maintains accessibility** with micro-block commands
---
## Core Principles
1. **Security First:** Michael maintains root access; Claude operates in an advisory role
2. **Visibility Always:** Every command is shown before execution
3. **Micro-Blocks:** Max 8-10 lines per code block (accessibility requirement)
4. **Checkpoints:** Pause for verification at critical steps
5. **Documentation:** Everything gets archived in Git
---
## Standard Deployment Workflow
### Phase 1: Planning & Strategy
**Claude Provides:**
- Service architecture overview
- IP allocation strategy
- Port registry
- Frostwall rules
- Nginx configuration plan
- DNS requirements

**Michael Reviews:**
- Confirms strategy aligns with infrastructure
- Identifies potential conflicts
- Approves IP/port assignments
- **Checkpoint: "Approved" or requests changes**
---
### Phase 2: Pre-Deployment Audit
**Claude Guides:**
```bash
# Example: Verify IP is active
ip addr show ens3 | grep [TARGET_IP]
```
**Michael Executes:**
- Connects to server via SSH
- Runs verification commands
- Reports output to Claude
- **Checkpoint: "Success" when verified**

**Common Pre-Deployment Checks:**
- IP addresses active
- DNS records configured
- Ports available (not in use)
- Firewall status
- Existing service conflicts
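The "ports available" check can be sketched as a small filter over `ss -tln` output. This is a minimal sketch, not part of the original procedure: `port_in_use` is a hypothetical helper name, and it assumes the default `ss` column layout in which the fourth field is the local address and port.

```bash
# Hypothetical helper: reads `ss -tln` output on stdin.
# Succeeds if any listener's local address ends in :PORT.
port_in_use() {
  awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}
# Usage: ss -tln | port_in_use 8080 && echo "port 8080 is taken"
```

Reading from stdin keeps the helper free of side effects, so it can be tried against pasted output before it is trusted on the live server.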
---
### Phase 3: Service Installation
**The Micro-Block Process:**
**Step 1: Claude provides ONE small command block**
```bash
# Example
apt update
```
**Step 2: Michael executes and reports**
```
Michael: "success"
```
**Step 3: Claude provides NEXT command block**
```bash
apt install -y nginx
```
**Step 4: Repeat until service is installed**
**Critical Rules:**
- ✅ One block at a time
- ✅ Wait for "success" before continuing
- ✅ Never combine unrelated operations
- ✅ Separate creation, ownership, and permissions into different blocks

**Example of GOOD micro-blocks:**
```bash
# Block 1: Create directory
mkdir -p /var/lib/service
# Block 2: Set ownership
chown service:service /var/lib/service
# Block 3: Set permissions
chmod 750 /var/lib/service
```
**Example of BAD (too long):**
```bash
# DON'T DO THIS - too many operations
mkdir -p /var/lib/service && \
chown service:service /var/lib/service && \
chmod 750 /var/lib/service && \
systemctl enable service && \
systemctl start service
```
---
### Phase 4: Configuration
**For Config Files:**
**Option A: Full file paste (preferred for accessibility)**
```bash
nano /etc/service/config.conf
# Claude provides complete file content
# Michael pastes, saves (Ctrl+X, Y, Enter)
```
**Option B: Targeted edits (only if necessary)**
```bash
# Claude provides specific sed/awk commands
# One change per block
```
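A targeted edit under Option B might look like the sketch below. The key name `listen_port`, the new value, and the `key = value` layout are assumptions made for illustration, and `set_listen_port` is a hypothetical helper; the real file format decides the actual pattern. The `-i.bak` option keeps a backup copy next to the file so the change can be reverted.

```bash
# Hypothetical helper: one targeted change, with a .bak backup kept.
# Assumes a "listen_port = VALUE" line exists in the file.
set_listen_port() {
  sed -i.bak -E "s/^(listen_port[[:space:]]*=[[:space:]]*).*/\1$2/" "$1"
}
# Usage: set_listen_port /etc/service/config.conf 8080
```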
**Security Checkpoint:**
- Michael reviews config for sensitive data
- Claude reminds about .gitignore for secrets
- Create sanitized templates before Git commit
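Creating a sanitized template can be sketched as an `awk` pass that replaces the values of secret-looking keys. This is only an illustration: the `KEY = value` layout, the key-name pattern, and the `CHANGE_ME` placeholder are assumptions, and `make_template` is a hypothetical name. The output should still be reviewed by eye before it is committed.

```bash
# Hypothetical helper: print FILE with secret-looking values replaced.
# Assumes "KEY = value" lines; the key-name pattern is an assumption.
make_template() {
  awk -F= 'NF > 1 && tolower($1) ~ /secret|password|token|key/ {
    print $1 "= CHANGE_ME"; next
  } { print }' "$1"
}
# Usage: make_template config.conf > config.conf.template
```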
---
### Phase 5: Service Startup
**Standard Sequence:**
```bash
# Block 1: Reload systemd
systemctl daemon-reload
```
```bash
# Block 2: Enable on boot
systemctl enable [service]
```
```bash
# Block 3: Start service
systemctl start [service]
```
```bash
# Block 4: Verify status
systemctl status [service]
```
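Where the full `systemctl status` output is more than needed, a two-line summary may be easier to read aloud or paste. The sketch below relies only on the exit status of `systemctl is-active` and `systemctl is-enabled`; `service_summary` is a hypothetical helper, not an existing tool.

```bash
# Hypothetical helper: summarize a unit's state in two short lines.
service_summary() {
  if systemctl is-active --quiet "$1"; then echo "$1: active"
  else echo "$1: NOT active"; fi
  if systemctl is-enabled --quiet "$1"; then echo "$1: enabled"
  else echo "$1: NOT enabled"; fi
}
# Usage: service_summary nginx
```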
**Michael reports output; Claude analyzes for issues**

---
### Phase 6: Frostwall Configuration
**Per-Service Firewall Rules:**
```bash
# Block 1: Allow HTTP
ufw allow in on ens3 to [IP] port 80 proto tcp
```
```bash
# Block 2: Allow HTTPS
ufw allow in on ens3 to [IP] port 443 proto tcp
```
```bash
# Block 3: Reload firewall
ufw reload
```
```bash
# Block 4: Verify rules
ufw status numbered | grep [IP]
```
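In keeping with the "Visibility Always" principle, the allow rules for a service can be printed for review before any of them is run. The sketch below only echoes commands and changes nothing; `frostwall_rules` is a hypothetical helper, and `203.0.113.10` is a documentation-range address used as a stand-in for the real IP.

```bash
# Hypothetical helper: print (not run) the ufw rules for one service IP.
# Usage: frostwall_rules IP PORT [PORT...]
frostwall_rules() {
  ip="$1"; shift
  for port in "$@"; do
    echo "ufw allow in on ens3 to $ip port $port proto tcp"
  done
}
# Example: frostwall_rules 203.0.113.10 80 443
```

Each printed line can then be executed as its own micro-block.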
**Michael confirms rules are active**

---
### Phase 7: SSL Certificate
**Let's Encrypt Installation:**
```bash
# Block 1: Install Certbot
apt install -y certbot python3-certbot-nginx
```
```bash
# Block 2: Obtain certificate
certbot --nginx -d [subdomain].firefrostgaming.com
```
**Interactive prompts (Michael handles):**
- Email: mkrause612@gmail.com
- Terms: Y
- Share email: N
- Redirect: 2 (Yes)
```bash
# Block 3: Verify certificate
ls -la /etc/letsencrypt/live/[subdomain].firefrostgaming.com/
```
---
### Phase 8: Verification & Testing
**Standard Test Sequence:**
```bash
# Block 1: Test HTTPS
curl -I https://[subdomain].firefrostgaming.com
```
```bash
# Block 2: Check port bindings
ss -tlnp | grep [IP]
```
```bash
# Block 3: Verify DNS
nslookup [subdomain].firefrostgaming.com
```
```bash
# Block 4: Test service functionality
# (Service-specific commands)
```
**Success Criteria:**
- HTTP 200 response (for example, `HTTP/2 200` in the `curl -I` output; `HTTP/1.1 200 OK` if HTTP/2 is not enabled)
- Ports bound to correct IP
- DNS resolves correctly
- Service responds as expected
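The first success criterion can be checked mechanically from the `curl -I` output. A minimal sketch, assuming the status line is the first line of the response headers; `is_http_200` is a hypothetical helper name.

```bash
# Hypothetical helper: reads `curl -sI` output on stdin.
# Succeeds if the first (status) line reports 200, for any HTTP version.
is_http_200() {
  head -n 1 | grep -qE '^HTTP/[0-9.]+ 200'
}
# Usage: curl -sI https://[subdomain].firefrostgaming.com | is_http_200 && echo "OK"
```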
---
### Phase 9: Git Archiving
**Repository Update Process:**
**Step 1: Navigate to repo**
```bash
cd /root/firefrost-master-configs
```
**Step 2: Copy configs**
```bash
# Example
cp /etc/nginx/sites-available/[service].conf web/
```
**Step 3: Check sensitive data**
```bash
grep -iE "secret|password|token|key" [file]
```
**If sensitive data found:**
- Create .gitignore entry
- Create sanitized template
- Only commit template
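The Step 3 check can also be used as a gate in front of `git add`. The sketch below is one possible approach, not the established procedure: `no_secrets` is a hypothetical helper that succeeds only when `grep` finds nothing. False positives are expected (an Nginx `ssl_certificate_key` path matches `key`, for instance), so a match means "review this", not "this is a secret".

```bash
# Hypothetical helper: succeeds only when grep finds no match (exit 1).
# Matches print as file:line:text; grep errors also count as failure.
no_secrets() {
  grep -HniE 'secret|password|token|key' "$@"
  [ "$?" -eq 1 ]
}
# Usage: no_secrets web/[service].conf && git add web/[service].conf
```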
**Step 4: Stage changes**
```bash
git add [files]
```
**Step 5: Review what will be committed**
```bash
git status
```
**Step 6: Commit**
```bash
git commit -m "[Descriptive message about what changed]"
```
**Step 7: Push to Gitea**
```bash
git push
```
**Michael enters credentials when prompted**

---
### Phase 10: Documentation
**Claude Generates:**
- Technical dossier (specs, changelog, troubleshooting)
- User guide (if applicable)
- Deployment summary

**Michael Creates:**
```bash
cd /root/firefrost-master-configs/docs
nano [service]-deployment.md
# Paste Claude's documentation
# Save: Ctrl+X, Y, Enter
```
**Commit Documentation:**
```bash
git add docs/
git commit -m "Add [service] deployment documentation"
git push
```
---
## Communication Protocol
### Michael's Status Codes
| Response | Meaning |
|----------|---------|
| **"success"** | Command executed successfully, continue |
| **"checkpoint"** | Pause, need clarification or review |
| **"error"** | Command failed, need troubleshooting |
| **"pause"** | Taking a break, resume later |
| **"proceed"** | Approved to continue after review |
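Read as a protocol, the table maps each response to one next action. The sketch below restates that mapping as a `case` statement purely for illustration; `next_action` and the action strings are hypothetical, and `error*` also covers the `error - [message]` form.

```bash
# Hypothetical helper: map a status code to the next action.
next_action() {
  case "$1" in
    success|proceed) echo "continue" ;;
    checkpoint)      echo "pause for review" ;;
    error*)          echo "troubleshoot" ;;
    pause)           echo "resume later" ;;
    *)               echo "unknown status" ;;
  esac
}
```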
### Claude's Responsibilities
**Always Provide:**
- ✅ Clear command with context
- ✅ Expected output description
- ✅ Why this step is necessary
- ✅ What could go wrong

**Never Provide:**
- ❌ Multiple unrelated commands in one block
- ❌ Commands without explanation
- ❌ Assumptions about file locations without verification
- ❌ Complex one-liners when multiple simple commands are clearer
---
## Checkpoint Triggers
**Michael Should Call "Checkpoint" When:**
- Something unexpected appears in output
- Unsure about a configuration option
- Want to verify understanding before proceeding
- Need to review security implications
- Want to discuss alternative approaches

**Claude Will Call "Checkpoint" When:**
- Critical configuration decision needed
- Multiple valid approaches exist
- Security/data loss risk detected
- Deviation from standard procedure required
---
## Error Handling Protocol
### When Something Goes Wrong
**Step 1: Michael reports the error**
```
Michael: "error - [paste error message]"
```
**Step 2: Claude analyzes**
- Identifies root cause
- Explains what happened
- Provides solution options

**Step 3: Remediation**
- Claude provides fix in micro-blocks
- Michael executes
- Verify issue resolved

**Step 4: Documentation**
- Add to "Troubleshooting" section
- Note for future reference
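The Step 1 report format can be produced directly from a command's exit status. A minimal sketch, assuming the command's combined output is short enough to paste; `report` is a hypothetical helper. It hides output on success, so commands whose output needs review should still be run directly.

```bash
# Hypothetical helper: run one command and print a status line
# in the format used above ("success" or "error - <output>").
report() {
  if out=$("$@" 2>&1); then echo "success"
  else echo "error - $out"; fi
}
# Usage: report nginx -t
```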
---
## Service-Specific Templates
### New Service Deployment Checklist
**Pre-Deployment:**
- [ ] IP assigned from /29 block
- [ ] Port registry updated (avoid conflicts)
- [ ] DNS A record created in Cloudflare
- [ ] Frostwall strategy planned

**Installation:**
- [ ] System user created
- [ ] Directories created with correct ownership/permissions
- [ ] Service binary/package installed
- [ ] Configuration file created

**Network Setup:**
- [ ] Nginx site config created
- [ ] Temporary self-signed cert (if needed)
- [ ] Nginx enabled and restarted
- [ ] Frostwall rules applied

**SSL & Security:**
- [ ] Let's Encrypt certificate obtained
- [ ] Auto-renewal verified
- [ ] Permissions locked down

**Verification:**
- [ ] HTTPS responding correctly
- [ ] Service functionality tested
- [ ] Ports bound to correct IP
- [ ] DNS propagated

**Documentation:**
- [ ] Configs copied to Git repo
- [ ] Sensitive data sanitized
- [ ] Changes committed and pushed
- [ ] Technical documentation created
- [ ] User guide created (if needed)
---
## Quick Reference Commands
### System Information
```bash
# Check IP addresses
ip addr show ens3
# Check listening ports
ss -tlnp | grep [IP or PORT]
# Check running services
systemctl status [service]
# Check firewall
ufw status numbered
```
### Git Operations
```bash
# Stage files
git add [file or folder]
# Commit changes
git commit -m "Descriptive message"
# Push to Gitea
git push
# Check status
git status
# View history
git log --oneline
```
### Service Management
```bash
# Start service
systemctl start [service]
# Stop service
systemctl stop [service]
# Restart service
systemctl restart [service]
# Enable on boot
systemctl enable [service]
# View logs
journalctl -u [service] -f
```
### Nginx
```bash
# Test config
nginx -t
# Reload config
systemctl reload nginx
# Restart nginx
systemctl restart nginx
# Check bindings
ss -tlnp | grep nginx
```
### SSL/Certbot
```bash
# Test renewal
certbot renew --dry-run
# Check certificates
certbot certificates
# Force renewal
certbot renew --force-renewal
```
---
## End-of-Session Protocol
### Before Taking a Break
**Michael:**
1. Verify all services are running
2. Check that no processes are hanging
3. Exit SSH cleanly
4. Note stopping point for next session

**Claude:**
1. Summarize what was accomplished
2. Note current phase progress (X/5 services)
3. Preview next steps
4. Provide status update

**Session Summary Template:**
```
✅ [Service Name] - COMPLETE
- Deployed on [IP]
- SSL configured
- Frostwall active
- Documented in Git
⏳ Next Service: [Name] ([IP]) - [subdomain]
Phase 0.5 Progress: X/5 (XX%)
```
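The progress line in the template is simple arithmetic: completed services divided by the total, as a percentage. A small sketch, with `progress_line` as a hypothetical helper; it uses integer arithmetic, so the percentage is rounded down.

```bash
# Hypothetical helper: print the progress line from the summary template.
# Arguments: completed count, then total count.
progress_line() {
  echo "Phase 0.5 Progress: $1/$2 ($(( $1 * 100 / $2 ))%)"
}
# Usage: progress_line 1 5
```

With one of five services complete, this prints `Phase 0.5 Progress: 1/5 (20%)`.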
---
## Emergency Procedures
### If Service Breaks Production
**Step 1: Assess impact**
```bash
# Check what's affected
systemctl status --failed
```
**Step 2: Quick rollback**
```bash
# Stop problematic service
systemctl stop [service]
# Disable if needed
systemctl disable [service]
```
**Step 3: Restore from Git**
```bash
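cd /root/firefrost-master-configs
# Find the last working commit for the affected file
git log --oneline -- [file]
# Print that version for review, then copy it back into place
git show [commit]:[file]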
```
**Step 4: Restart affected services**
```bash
systemctl restart [service]
```
### If Locked Out
**Prevention:**
- Always test SSH access before closing terminal
- Keep firewall rule for port 22
- Have backup access method (VPS console)

**Recovery:**
- Use VPS provider's console access
- Review ufw rules
- Re-enable SSH if blocked
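The "keep firewall rule for port 22" item can be checked before a terminal is closed. The sketch below is an assumption-laden illustration: `ssh_allowed` is a hypothetical helper that scans `ufw status` output for an allow rule naming port 22, and it does not recognize rules written with an application profile name.

```bash
# Hypothetical helper: reads `ufw status` output on stdin.
# Succeeds if some rule allows port 22 (app profiles are not detected).
ssh_allowed() {
  grep -qE '(^|[[:space:]])22(/tcp)?[[:space:]].*ALLOW'
}
# Usage: ufw status | ssh_allowed || echo "WARNING: no allow rule for port 22"
```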
---
## Success Metrics
**A Successful Deployment Includes:**
- ✅ Service running and responding
- ✅ SSL certificate active (HTTPS working)
- ✅ Frostwall rules applied
- ✅ DNS resolving correctly
- ✅ All configs backed up to Git
- ✅ Documentation complete
- ✅ No errors in service logs
- ✅ Michael can take a break without worry
---
## Notes & Lessons Learned
**From Gitea Deployment (Service 1):**

**What Worked Well:**
- Micro-block format for commands
- Complete file paste for configs (vs line-by-line edits)
- IP isolation strategy (one IP per service)
- Checkpoint system for reviews
- Sanitized templates for sensitive configs

**Issues Encountered:**
- Default Nginx site conflicted with IP binding (removed default)
- Port 80 required full nginx restart (not just reload) to clear inherited sockets
- Needed self-signed cert before Let's Encrypt
- UFW installation removed iptables-persistent

**Solutions Applied:**
- Remove /etc/nginx/sites-enabled/default
- Use `systemctl restart nginx` after major config changes
- Generate temporary self-signed cert for testing
- Documented UFW as standard Frostwall tool

**Carry Forward:**
- Always check for default configs that bind 0.0.0.0
- Full restart after major changes
- Keep templates for SSL cert generation
- UFW is now standard (Phase 0 used iptables, Phase 0.5+ uses UFW)
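The first carry-forward item, checking for listeners bound to every address, can be sketched as a filter over `ss -tln` output. This assumes the default `ss` column layout with the local address in the fourth field; `wildcard_listeners` is a hypothetical helper name.

```bash
# Hypothetical helper: reads `ss -tln` output on stdin and prints
# listeners bound to every address (0.0.0.0, * or [::]).
wildcard_listeners() {
  awk '$4 ~ /^(0\.0\.0\.0|\*|\[::\]):/ { print $4 }'
}
# Usage: ss -tln | wildcard_listeners
```

Empty output suggests every listener is bound to a specific IP, which is what the one-IP-per-service strategy expects.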
---
## Revision History
| Version | Date | Changes |
|---------|------|---------|
| **1.0** | 2026-02-08 | Initial workflow guide created after Gitea deployment success |
---
**END OF WORKFLOW GUIDE**
**The Wizard & Michael: Building Firefrost Infrastructure, One Service at a Time** 🧙‍♂️⚡