# The Wizard & Michael: Collaborative Workflow Guide
**Version:** 1.0
**Created:** February 8, 2026
**Purpose:** Standard operating procedures for Phase 0.5+ deployments
**Status:** Active Protocol
---
## Our Roles
### Michael (The Operator)
- **Executes commands** on the live server via SSH
- **Maintains control** of production infrastructure
- **Reviews changes** before they go live
- **Final authority** on all decisions
### Claude "The Wizard" (The Architect)
- **Designs solutions** and provides step-by-step guidance
- **Generates configurations** and documentation
- **Troubleshoots issues** and provides context
- **Maintains accessibility** with micro-block commands
---
## Core Principles
1. **Security First:** Michael maintains root access; Claude operates in advisory role
2. **Visibility Always:** Every command is shown before execution
3. **Micro-Blocks:** Max 8-10 lines per code block (accessibility requirement)
4. **Checkpoints:** Pause for verification at critical steps
5. **Documentation:** Everything gets archived in Git
---
## Standard Deployment Workflow
### Phase 1: Planning & Strategy
**Claude Provides:**
- Service architecture overview
- IP allocation strategy
- Port registry
- Frostwall rules
- Nginx configuration plan
- DNS requirements
**Michael Reviews:**
- Confirms strategy aligns with infrastructure
- Identifies potential conflicts
- Approves IP/port assignments
- **Checkpoint: "Approved" or requests changes**
---
### Phase 2: Pre-Deployment Audit
**Claude Guides:**
```bash
# Example: Verify IP is active
ip addr show ens3 | grep [TARGET_IP]
```
**Michael Executes:**
- Connects to server via SSH
- Runs verification commands
- Reports output to Claude
- **Checkpoint: "Success" when verified**
**Common Pre-Deployment Checks:**
- IP addresses active
- DNS records configured
- Ports available (not in use)
- Firewall status
- Existing service conflicts
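The "ports available" check can be sketched as a small helper. This is a minimal sketch, not part of the established procedure: `listener_present` is a hypothetical name, and it assumes the usual column layout of `ss -tln` output (state first, local address and port fourth).

```bash
# Hypothetical helper (not from the original guide).
# Reads `ss -tln` output on stdin and succeeds if ADDRESS:PORT is listening.
# Assumes the usual column order: State, Recv-Q, Send-Q, Local Address:Port, ...
listener_present() {
    local want="$1"
    awk -v want="$want" '
        $1 == "LISTEN" && $4 == want { found = 1 }
        END { exit !found }
    '
}
```

Possible usage: `ss -tln | listener_present "[TARGET_IP]:80" && echo "port 80 already in use"`.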
---
### Phase 3: Service Installation
**The Micro-Block Process:**
**Step 1: Claude provides ONE small command block**
```bash
# Example
apt update
```
**Step 2: Michael executes and reports**
```
Michael: "success"
```
**Step 3: Claude provides NEXT command block**
```bash
apt install -y nginx
```
**Step 4: Repeat until service is installed**
**Critical Rules:**
- ✅ One block at a time
- ✅ Wait for "success" before continuing
- ✅ Never combine unrelated operations
- ✅ Separate creation, ownership, and permissions into different blocks
**Example of GOOD micro-blocks:**
```bash
# Block 1: Create directory
mkdir -p /var/lib/service
# Block 2: Set ownership
chown service:service /var/lib/service
# Block 3: Set permissions
chmod 750 /var/lib/service
```
**Example of BAD (too long):**
```bash
# DON'T DO THIS - too many operations
mkdir -p /var/lib/service && \
chown service:service /var/lib/service && \
chmod 750 /var/lib/service && \
systemctl enable service && \
systemctl start service
```
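One way to check a proposed block against the micro-block limit is to count its non-blank lines. The helper below is a sketch only: `within_micro_block_limit` is a hypothetical name, and the default of 10 is taken from the upper bound stated under Core Principles.

```bash
# Hypothetical helper (not from the original guide).
# Reads a proposed command block on stdin and succeeds if it has at most
# MAX non-blank lines (default 10, the upper bound named in Core Principles).
within_micro_block_limit() {
    local max="${1:-10}"
    local count
    # grep -c prints the count even when it is zero; "|| true" keeps the
    # helper usable in scripts that run with `set -e`.
    count=$(grep -c '[^[:space:]]' || true)
    [ "$count" -le "$max" ]
}
```

Possible usage: paste a block into a file and run `within_micro_block_limit < block.sh || echo "split this block"` (the file name is illustrative).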
---
### Phase 4: Configuration
**For Config Files:**
**Option A: Full file paste (preferred for accessibility)**
```bash
nano /etc/service/config.conf
# Claude provides complete file content
# Michael pastes, saves (Ctrl+X, Y, Enter)
```
**Option B: Targeted edits (only if necessary)**
```bash
# Claude provides specific sed/awk commands
# One change per block
```
**Security Checkpoint:**
- Michael reviews config for sensitive data
- Claude reminds about .gitignore for secrets
- Create sanitized templates before Git commit
---
### Phase 5: Service Startup
**Standard Sequence:**
```bash
# Block 1: Reload systemd
systemctl daemon-reload
```
```bash
# Block 2: Enable on boot
systemctl enable [service]
```
```bash
# Block 3: Start service
systemctl start [service]
```
```bash
# Block 4: Verify status
systemctl status [service]
```
**Michael reports output; Claude analyzes for issues**
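If a shorter report than the full `systemctl status` output is wanted, the two facts that matter after startup (running now, enabled on boot) can be condensed into one line. This is a sketch under assumptions: `service_ready` is a hypothetical name, and it relies only on `systemctl is-active --quiet` and `systemctl is-enabled --quiet`.

```bash
# Hypothetical helper (not from the original guide).
# Prints a one-line summary for a unit and succeeds only if it is both
# running now and enabled on boot.
service_ready() {
    local unit="$1"
    local active="no" enabled="no"
    systemctl is-active --quiet "$unit" && active="yes"
    systemctl is-enabled --quiet "$unit" && enabled="yes"
    printf '%s: active=%s enabled=%s\n' "$unit" "$active" "$enabled"
    [ "$active" = "yes" ] && [ "$enabled" = "yes" ]
}
```

Possible usage: `service_ready [service]`, with the printed line pasted back to Claude.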
---
### Phase 6: Frostwall Configuration
**Per-Service Firewall Rules:**
```bash
# Block 1: Allow HTTP
ufw allow in on ens3 to [IP] port 80 proto tcp
```
```bash
# Block 2: Allow HTTPS
ufw allow in on ens3 to [IP] port 443 proto tcp
```
```bash
# Block 3: Reload firewall
ufw reload
```
```bash
# Block 4: Verify rules
ufw status numbered | grep [IP]
```
**Michael confirms rules are active**
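In keeping with the "Visibility Always" principle, the per-service rules can be printed for review before anything is run. The sketch below only prints commands and never executes them; `print_frostwall_rules` is a hypothetical name, and the interface, IP, and ports are passed in as arguments.

```bash
# Hypothetical helper (not from the original guide).
# Prints, without executing, one ufw rule per port for a given
# interface and IP, in the same form used in the blocks above.
print_frostwall_rules() {
    local iface="$1" ip="$2"
    shift 2
    local port
    for port in "$@"; do
        printf 'ufw allow in on %s to %s port %s proto tcp\n' \
            "$iface" "$ip" "$port"
    done
}
```

Possible usage: `print_frostwall_rules ens3 [IP] 80 443`, after which each printed line is run as its own block.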
---
### Phase 7: SSL Certificate
**Let's Encrypt Installation:**
```bash
# Block 1: Install Certbot
apt install -y certbot python3-certbot-nginx
```
```bash
# Block 2: Obtain certificate
certbot --nginx -d [subdomain].firefrostgaming.com
```
**Interactive prompts (Michael handles):**
- Email: mkrause612@gmail.com
- Terms: Y
- Share email: N
- Redirect: 2 (Yes)
```bash
# Block 3: Verify certificate
ls -la /etc/letsencrypt/live/[subdomain].firefrostgaming.com/
```
---
### Phase 8: Verification & Testing
**Standard Test Sequence:**
```bash
# Block 1: Test HTTPS
curl -I https://[subdomain].firefrostgaming.com
```
```bash
# Block 2: Check port bindings
ss -tlnp | grep [IP]
```
```bash
# Block 3: Verify DNS
nslookup [subdomain].firefrostgaming.com
```
```bash
# Block 4: Test service functionality
# (Service-specific commands)
```
**Success Criteria:**
- HTTP/2 200 response
- Ports bound to correct IP
- DNS resolves correctly
- Service responds as expected
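The first success criterion can be checked mechanically by pulling the status code out of the `curl -I` output. This is a minimal sketch: `http_status` is a hypothetical name, and it assumes the status line is the first line of the output (true when redirects are not followed).

```bash
# Hypothetical helper (not from the original guide).
# Reads `curl -I` output on stdin and prints the status code from the
# first line, for example "200" from "HTTP/2 200".
http_status() {
    # Strip carriage returns first: HTTP header lines end in CRLF.
    tr -d '\r' | awk 'NR == 1 { print $2 }'
}
```

Possible usage: `curl -sI https://[subdomain].firefrostgaming.com | http_status`.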
---
### Phase 9: Git Archiving
**Repository Update Process:**
**Step 1: Navigate to repo**
```bash
cd /root/firefrost-master-configs
```
**Step 2: Copy configs**
```bash
# Example
cp /etc/nginx/sites-available/[service].conf web/
```
**Step 3: Check sensitive data**
```bash
grep -iE "secret|password|token|key" [file]
```
**If sensitive data found:**
- Create .gitignore entry
- Create sanitized template
- Only commit template
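A sanitized template can be produced by masking the value on any line whose key name contains one of the sensitive words. The sketch below is an assumption-laden starting point, not the established method: `sanitize_config` is a hypothetical name, it only handles `key = value` and `key: value` styles, and the result still needs a manual review before commit.

```bash
# Hypothetical helper (not from the original guide).
# Reads a config on stdin and replaces the value with REDACTED on lines
# whose key contains secret, password, token, or key (case-insensitive).
# Only "name = value" and "name: value" lines are handled.
sanitize_config() {
    awk '{
        line = $0
        if (match(tolower(line), /(secret|password|token|key)[a-z0-9_.-]*[ \t]*[=:]/)) {
            # Keep everything up to and including the separator.
            print substr(line, 1, RSTART + RLENGTH - 1) " REDACTED"
        } else {
            print line
        }
    }'
}
```

Possible usage: `sanitize_config < [file] > [file].template`, then add `[file]` to .gitignore and commit only the template.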
**Step 4: Stage changes**
```bash
git add [files]
```
**Step 5: Review what will be committed**
```bash
git status
```
**Step 6: Commit**
```bash
git commit -m "[Descriptive message about what changed]"
```
**Step 7: Push to Gitea**
```bash
git push
```
**Michael enters credentials when prompted**
---
### Phase 10: Documentation
**Claude Generates:**
- Technical dossier (specs, changelog, troubleshooting)
- User guide (if applicable)
- Deployment summary
**Michael Creates:**
```bash
cd /root/firefrost-master-configs/docs
nano [service]-deployment.md
# Paste Claude's documentation
# Save: Ctrl+X, Y, Enter
```
**Commit Documentation:**
```bash
git add docs/
git commit -m "Add [service] deployment documentation"
git push
```
---
## Communication Protocol
### Michael's Status Codes
| Response | Meaning |
|----------|---------|
| **"success"** | Command executed successfully, continue |
| **"checkpoint"** | Pause, need clarification or review |
| **"error"** | Command failed, need troubleshooting |
| **"pause"** | Taking a break, resume later |
| **"proceed"** | Approved to continue after review |
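Read as a protocol, the table maps each status code to one follow-up action. The sketch below spells that mapping out; `next_action` and the action wording are illustrative assumptions, and only the first word of the response is inspected so that reports such as "error - [message]" are still recognized.

```bash
# Hypothetical helper (not from the original guide).
# Maps one of Michael's status codes to the action it implies.
next_action() {
    # Keep only the first word, so "error - details" is read as "error".
    local code="${1%% *}"
    case "$code" in
        success|proceed) echo "send next block" ;;
        checkpoint)      echo "stop and clarify" ;;
        error)           echo "troubleshoot" ;;
        pause)           echo "summarize and wait" ;;
        *)               echo "ask for a status code"; return 1 ;;
    esac
}
```

Possible usage: `next_action "error - nginx failed to start"`.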
### Claude's Responsibilities
**Always Provide:**
- ✅ Clear command with context
- ✅ Expected output description
- ✅ Why this step is necessary
- ✅ What could go wrong
**Never Provide:**
- ❌ Multiple unrelated commands in one block
- ❌ Commands without explanation
- ❌ Assumptions about file locations without verification
- ❌ Complex one-liners when multiple simple commands are clearer
---
## Checkpoint Triggers
**Michael Should Call "Checkpoint" When:**
- Something unexpected appears in output
- Unsure about a configuration option
- Want to verify understanding before proceeding
- Need to review security implications
- Want to discuss alternative approaches
**Claude Will Call "Checkpoint" When:**
- Critical configuration decision needed
- Multiple valid approaches exist
- Security/data loss risk detected
- Deviation from standard procedure required
---
## Error Handling Protocol
### When Something Goes Wrong
**Step 1: Michael reports the error**
```
Michael: "error - [paste error message]"
```
**Step 2: Claude analyzes**
- Identifies root cause
- Explains what happened
- Provides solution options
**Step 3: Remediation**
- Claude provides fix in micro-blocks
- Michael executes
- Verify issue resolved
**Step 4: Documentation**
- Add to "Troubleshooting" section
- Note for future reference
---
## Service-Specific Templates
### New Service Deployment Checklist
**Pre-Deployment:**
- [ ] IP assigned from /29 block
- [ ] Port registry updated (avoid conflicts)
- [ ] DNS A record created in Cloudflare
- [ ] Frostwall strategy planned
**Installation:**
- [ ] System user created
- [ ] Directories created with correct ownership/permissions
- [ ] Service binary/package installed
- [ ] Configuration file created
**Network Setup:**
- [ ] Nginx site config created
- [ ] Temporary self-signed cert (if needed)
- [ ] Nginx enabled and restarted
- [ ] Frostwall rules applied
**SSL & Security:**
- [ ] Let's Encrypt certificate obtained
- [ ] Auto-renewal verified
- [ ] Permissions locked down
**Verification:**
- [ ] HTTPS responding correctly
- [ ] Service functionality tested
- [ ] Ports bound to correct IP
- [ ] DNS propagated
**Documentation:**
- [ ] Configs copied to Git repo
- [ ] Sensitive data sanitized
- [ ] Changes committed and pushed
- [ ] Technical documentation created
- [ ] User guide created (if needed)
---
## Quick Reference Commands
### System Information
```bash
# Check IP addresses
ip addr show ens3
# Check listening ports
ss -tlnp | grep [IP or PORT]
# Check running services
systemctl status [service]
# Check firewall
ufw status numbered
```
### Git Operations
```bash
# Stage files
git add [file or folder]
# Commit changes
git commit -m "Descriptive message"
# Push to Gitea
git push
# Check status
git status
# View history
git log --oneline
```
### Service Management
```bash
# Start service
systemctl start [service]
# Stop service
systemctl stop [service]
# Restart service
systemctl restart [service]
# Enable on boot
systemctl enable [service]
# View logs
journalctl -u [service] -f
```
### Nginx
```bash
# Test config
nginx -t
# Reload config
systemctl reload nginx
# Restart nginx
systemctl restart nginx
# Check bindings
ss -tlnp | grep nginx
```
### SSL/Certbot
```bash
# Test renewal
certbot renew --dry-run
# Check certificates
certbot certificates
# Force renewal
certbot renew --force-renewal
```
---
## End-of-Session Protocol
### Before Taking a Break
**Michael:**
1. Verify all services are running
2. Check for hanging processes
3. Exit SSH cleanly
4. Note stopping point for next session
**Claude:**
1. Summarize what was accomplished
2. Note current phase progress (X/5 services)
3. Preview next steps
4. Provide status update
**Session Summary Template:**
```
✅ [Service Name] - COMPLETE
- Deployed on [IP]
- SSL configured
- Frostwall active
- Documented in Git
⏳ Next Service: [Name] ([IP]) - [subdomain]
Phase 0.5 Progress: X/5 (XX%)
```
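The progress line of the template is simple arithmetic: completed services divided by the total, as a percentage. A minimal sketch follows; `phase_progress` is a hypothetical name, and the percentage is rounded down because shell arithmetic is integer-only.

```bash
# Hypothetical helper (not from the original guide).
# Prints the progress line of the session summary from a completed count
# and a total, using integer arithmetic (percent is rounded down).
phase_progress() {
    local completed="$1" total="$2"
    printf 'Phase 0.5 Progress: %d/%d (%d%%)\n' \
        "$completed" "$total" $(( completed * 100 / total ))
}
```

Possible usage: `phase_progress 1 5` after the first of five services is complete.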
---
## Emergency Procedures
### If Service Breaks Production
**Step 1: Assess impact**
```bash
# Check what's affected
systemctl status --failed
```
**Step 2: Quick rollback**
```bash
# Stop problematic service
systemctl stop [service]
# Disable if needed
systemctl disable [service]
```
**Step 3: Restore from Git**
```bash
cd /root/firefrost-master-configs
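git log --oneline  # Find last working commit
# Copy old config back (example path from Phase 9; adjust to the affected file)
git show [commit]:web/[service].conf > /etc/nginx/sites-available/[service].conf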
```
**Step 4: Restart affected services**
```bash
systemctl restart [service]
```
### If Locked Out
**Prevention:**
- Always test SSH access before closing terminal
- Keep firewall rule for port 22
- Have backup access method (VPS console)
**Recovery:**
- Use VPS provider's console access
- Review ufw rules
- Re-enable SSH if blocked
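The "keep firewall rule for port 22" precaution can be checked before closing the terminal by scanning the `ufw status` output. This is a sketch under assumptions: `ssh_rule_present` is a hypothetical name, and it only recognizes rules listed as `22` or `22/tcp`; a rule created by application name is not detected.

```bash
# Hypothetical helper (not from the original guide).
# Reads `ufw status` output on stdin and succeeds if some ALLOW rule
# mentions port 22. Assumes rules are listed as "22" or "22/tcp";
# a rule written by application name (such as OpenSSH) is not detected.
ssh_rule_present() {
    grep -Eq '(^|[[:space:]])22(/tcp)?[[:space:]].*ALLOW'
}
```

Possible usage: `ufw status | ssh_rule_present || echo "no SSH rule found"`.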
---
## Success Metrics
**A Successful Deployment Includes:**
- ✅ Service running and responding
- ✅ SSL certificate active (HTTPS working)
- ✅ Frostwall rules applied
- ✅ DNS resolving correctly
- ✅ All configs backed up to Git
- ✅ Documentation complete
- ✅ No errors in service logs
- ✅ Michael can take a break without worry
---
## Notes & Lessons Learned
**From Gitea Deployment (Service 1):**
**What Worked Well:**
- Micro-block format for commands
- Complete file paste for configs (vs line-by-line edits)
- IP isolation strategy (one IP per service)
- Checkpoint system for reviews
- Sanitized templates for sensitive configs
**Issues Encountered:**
- Default Nginx site conflicted with IP binding (removed default)
- Port 80 required full nginx restart (not just reload) to clear inherited sockets
- Needed self-signed cert before Let's Encrypt
- UFW installation removed iptables-persistent
**Solutions Applied:**
- Remove /etc/nginx/sites-enabled/default
- Use `systemctl restart nginx` after major config changes
- Generate temporary self-signed cert for testing
- Documented UFW as standard Frostwall tool
**Carry Forward:**
- Always check for default configs that bind 0.0.0.0
- Full restart after major changes
- Keep templates for SSL cert generation
- UFW is now standard (Phase 0 used iptables, Phase 0.5+ uses UFW)
---
## Revision History
| Version | Date | Changes |
|---------|------|---------|
| **1.0** | 2026-02-08 | Initial workflow guide created after Gitea deployment success |
---
**END OF WORKFLOW GUIDE**
**The Wizard & Michael: Building Firefrost Infrastructure, One Service at a Time** 🧙‍♂️⚡
---
## Documentation Maintenance Protocol
**Core Principle:** *"Always revise ALL documents when changes occur"*
### Why This Matters
**The Documentation Drift Problem:**
- Project files get stale → Future Claude sessions get wrong context
- Memory becomes outdated → Contradictions emerge
- Session handoff falls behind → Time wasted catching up
- Technical decisions get lost → "I thought we documented that?"
**The Solution:**
When ANY significant change occurs, update ALL affected documents in the same session.
### What Triggers Documentation Updates
**ALWAYS update when:**
1. Architecture pivot (e.g., BookStack → MkDocs)
2. New system deployed (e.g., automation framework)
3. Phase completion or status change
4. Major technical decision made
5. Process change identified
**Documents to Check:**
- ✅ FIREFROST-PROJECT-SCOPE-V2.md (master document)
- ✅ session-handoff.md (current status)
- ✅ Project memory (via Claude interface)
- ✅ Project instructions (if workflow changes)
- ✅ Deployment guides (if applicable)
### The Update Workflow
**Step 1: Identify Impact**
- What changed?
- Which documents reference this?
- What will future Claude need to know?
**Step 2: Update Documents**
- Master scope (if architecture/phase/goals change)
- Session handoff (if current state changes)
- Deployment guides (if technical details change)
- Memory/instructions (if process changes)
**Step 3: Commit Everything**
- Single commit with all related updates
- Clear commit message explaining the change
- Push to Git immediately
**Step 4: Verify**
- Quickly review updated docs for consistency
- Check that future Claude will have correct context
### Examples of Good Documentation Discipline
**Example 1: Automation System Deployed**
- ✅ Added to Project Scope (Infrastructure section)
- ✅ Updated session-handoff (Current State)
- ✅ Updated memory (Tools & Resources)
- ✅ Updated instructions (Automation First)
- ✅ Created USAGE.md guide
- Result: Future Claude knows to use automation
**Example 2: BookStack → MkDocs Pivot**
- ✅ Updated Project Scope (Three-tier architecture)
- ✅ Archived old plans
- ✅ Created new deployment guide
- ✅ Updated timeline
- Result: No confusion about which system we're using
**Example 3: Phase 1 DDoS Gap Identified**
- ✅ Added Phase 1 section to Project Scope
- ✅ Documented decision to revise when gaps found
- Result: Complete picture of all phases
### Anti-Patterns to Avoid
**❌ "I'll document it later"**
- Later never comes
- Context is lost
- Future sessions waste time reconstructing
**❌ "It's in my head"**
- Future Claude can't read your mind
- Team members can't help
- You'll forget details
**❌ "Just one quick change"**
- Leads to drift between docs
- Causes contradictions
- Breaks continuity
### The 5-Minute Rule
**If a change takes 5 minutes to implement, spend 5 minutes documenting it.**
This includes:
- Updating master scope
- Quick note in session handoff
- Git commit message with context
**Investment:** 10 minutes total
**Payoff:** Hours saved in future sessions
### Success Metrics
**Good Documentation Discipline:**
- Future Claude sessions start fast (no 10-minute catch-up)
- No contradictions between documents
- Clear audit trail of decisions
- Team can contribute without confusion
**Poor Documentation Discipline:**
- "Wait, I thought we removed that?"
- "What was the reason we chose X over Y?"
- "Is this document current?"
- Wasted time reconstructing context
---
**Remember:** Documentation is not separate from the work—it IS the work.
**Fire + Frost = Where Passion Meets Precision** 🔥❄️
---
## Sandbox AI Integration (Added Feb 9, 2026)
**Purpose:** Keep exploratory AI (Gemini) in sync with production progress
**Auto-Update Trigger:**
After completing ANY of these milestones:
- Service deployment
- Phase completion
- Major infrastructure change
- Architecture decision
**Update Command:**
```bash
bash automation/update-sandbox-briefing.sh
git add docs/SANDBOX-BRIEFING.md project-files/SANDBOX-BRIEFING.md
git commit -m "Update sandbox briefing: [what changed]"
git push
```
**Gemini Access:**
https://raw.githubusercontent.com/frostystyle/firefrost-operations-manual/master/project-files/SANDBOX-BRIEFING.md
**Workflow:**
1. Production work happens with Claude
2. Auto-update sandbox briefing after milestones
3. Gemini always has current context for brainstorming
4. Ideas validated in sandbox → implemented in production