diff --git a/docs/core/workflow-guide.md b/docs/core/workflow-guide.md new file mode 100644 index 0000000..7477dff --- /dev/null +++ b/docs/core/workflow-guide.md @@ -0,0 +1,937 @@ +# The Wizard & Michael: Collaborative Workflow Guide + +**Version:** 1.0 +**Created:** February 8, 2026 +**Purpose:** Standard operating procedures for Phase 0.5+ deployments +**Status:** Active Protocol + +--- + +## Our Roles + +### Michael (The Operator) +- **Executes commands** on the live server via SSH +- **Maintains control** of production infrastructure +- **Reviews changes** before they go live +- **Final authority** on all decisions + +### Claude "The Wizard" (The Architect) +- **Designs solutions** and provides step-by-step guidance +- **Generates configurations** and documentation +- **Troubleshoots issues** and provides context +- **Maintains accessibility** with micro-block commands + +--- + +## Core Principles + +1. **Security First:** Michael maintains root access; Claude operates in advisory role +2. **Visibility Always:** Every command is shown before execution +3. **Micro-Blocks:** Max 8-10 lines per code block (accessibility requirement) +4. **Checkpoints:** Pause for verification at critical steps +5. **Documentation:** Everything gets archived in Git + +--- + +## Standard Deployment Workflow + +### Phase 1: Planning & Strategy + +**Claude Provides:** +- Service architecture overview +- IP allocation strategy +- Port registry +- Frostwall rules +- Nginx configuration plan +- DNS requirements + +**Michael Reviews:** +- Confirms strategy aligns with infrastructure +- Identifies potential conflicts +- Approves IP/port assignments +- **Checkpoint: "Approved" or requests changes** + +--- + +### Phase 2: Pre-Deployment Audit + +**Claude Guides:** +```bash +# Example: Verify IP is active +ip addr show ens3 | grep [TARGET_IP] +``` + +**Michael Executes:** +- Connects to server via SSH +- Runs verification commands +- Reports output to Claude +- **Checkpoint: "Success" when verified** + +**Common Pre-Deployment Checks:** +- IP addresses active +- DNS records configured +- Ports available (not in use) +- Firewall status +- Existing service conflicts + +--- + +### Phase 3: Service Installation + +**The Micro-Block Process:** + +**Step 1: Claude provides ONE small command block** +```bash +# Example +apt update +``` + +**Step 2: Michael executes and reports** +``` +Michael: "success" +``` + +**Step 3: Claude provides NEXT command block** +```bash +apt install -y nginx +``` + +**Step 4: Repeat until service is installed** + +**Critical Rules:** +- ✅ One block at a time +- ✅ Wait for "success" before continuing +- ✅ Never combine unrelated operations +- ✅ Separate: creation, ownership, permissions into different blocks + +**Example of GOOD micro-blocks:** +```bash +# Block 1: Create directory +mkdir -p /var/lib/service + +# Block 2: Set ownership +chown service:service /var/lib/service + +# Block 3: Set permissions +chmod 750 /var/lib/service +``` + +**Example of BAD (too long):** +```bash +# DON'T DO THIS - too many operations +mkdir -p /var/lib/service && \ +chown service:service /var/lib/service && \ +chmod 750 /var/lib/service && \ +systemctl enable service && \ +systemctl start service +``` + +--- + +### Phase 4: Configuration + +**For Config Files:** + +**Option A: Full file paste (preferred for accessibility)** +```bash +nano /etc/service/config.conf +# Claude provides complete file content +# Michael pastes, saves (Ctrl+X, Y, Enter) +``` + +**Option B: Targeted edits (only if necessary)** +```bash +# Claude provides specific sed/awk commands +# One change per block +``` + +**Security Checkpoint:** +- Michael reviews config for sensitive data +- Claude reminds about .gitignore for secrets +- Create sanitized templates before Git commit + +--- + +### Phase 5: Service Startup + +**Standard Sequence:** +```bash +# Block 1: Reload systemd +systemctl daemon-reload +``` +```bash +# Block 2: Enable on boot +systemctl enable [service] +``` +```bash +# Block 3: Start service +systemctl start [service] +``` +```bash +# Block 4: Verify status +systemctl status [service] +``` + +**Michael reports output; Claude analyzes for issues** + +--- + +### Phase 6: Frostwall Configuration + +**Per-Service Firewall Rules:** +```bash +# Block 1: Allow HTTP +ufw allow in on ens3 to [IP] port 80 proto tcp +``` +```bash +# Block 2: Allow HTTPS +ufw allow in on ens3 to [IP] port 443 proto tcp +``` +```bash +# Block 3: Reload firewall +ufw reload +``` +```bash +# Block 4: Verify rules +ufw status numbered | grep [IP] +``` + +**Michael confirms rules are active** + +--- + +### Phase 7: SSL Certificate + +**Let's Encrypt Installation:** +```bash +# Block 1: Install Certbot +apt install -y certbot python3-certbot-nginx +``` +```bash +# Block 2: Obtain certificate +certbot --nginx -d [subdomain].firefrostgaming.com +``` + +**Interactive prompts (Michael handles):** +- Email: mkrause612@gmail.com +- Terms: Y +- Share email: N +- Redirect: 2 (Yes) +```bash +# Block 3: Verify certificate +ls -la /etc/letsencrypt/live/[subdomain].firefrostgaming.com/ +``` + +--- + +### Phase 8: Verification & Testing + +**Standard Test Sequence:** +```bash +# Block 1: Test HTTPS +curl -I https://[subdomain].firefrostgaming.com +``` +```bash +# Block 2: Check port bindings +ss -tlnp | grep [IP] +``` +```bash +# Block 3: Verify DNS +nslookup [subdomain].firefrostgaming.com +``` +```bash +# Block 4: Test service functionality +# (Service-specific commands) +``` + +**Success Criteria:** +- HTTP/2 200 response +- Ports bound to correct IP +- DNS resolves correctly +- Service responds as expected + +--- + +### Phase 9: Git Archiving + +**Repository Update Process:** + +**Step 1: Navigate to repo** +```bash +cd /root/firefrost-master-configs +``` + +**Step 2: Copy configs** +```bash +# Example +cp /etc/nginx/sites-available/[service].conf web/ +``` + +**Step 3: Check sensitive data** +```bash +cat [file] | grep -i "secret\|password\|token\|key" +``` + +**If sensitive data found:** +- Create .gitignore entry +- Create sanitized template +- Only commit template + +**Step 4: Stage changes** +```bash +git add [files] +``` + +**Step 5: Review what will be committed** +```bash +git status +``` + +**Step 6: Commit** +```bash +git commit -m "[Descriptive message about what changed]" +``` + +**Step 7: Push to Gitea** +```bash +git push +``` + +**Michael enters credentials when prompted** + +--- + +### Phase 10: Documentation + +**Claude Generates:** +- Technical dossier (specs, changelog, troubleshooting) +- User guide (if applicable) +- Deployment summary + +**Michael Creates:** +```bash +cd /root/firefrost-master-configs/docs +nano [service]-deployment.md +# Paste Claude's documentation +# Save: Ctrl+X, Y, Enter +``` + +**Commit Documentation:** +```bash +git add docs/ +git commit -m "Add [service] deployment documentation" +git push +``` + +--- + +## Communication Protocol + +### Michael's Status Codes + +| Response | Meaning | +|----------|---------| +| **"success"** | Command executed successfully, continue | +| **"checkpoint"** | Pause, need clarification or review | +| **"error"** | Command failed, need troubleshooting | +| **"pause"** | Taking a break, resume later | +| **"proceed"** | Approved to continue after review | + +### Claude's Responsibilities + +**Always Provide:** +- ✅ Clear command with context +- ✅ Expected output description +- ✅ Why this step is necessary +- ✅ What could go wrong + +**Never Provide:** +- ❌ Multiple unrelated commands in one block +- ❌ Commands without explanation +- ❌ Assumptions about file locations without verification +- ❌ Complex one-liners when multiple simple commands are clearer + +--- + +## Checkpoint Triggers + +**Michael Should Call "Checkpoint" When:** +- Something unexpected appears in output +- Unsure about a configuration option +- Want to verify understanding before proceeding +- Need to review security implications +- Want to discuss alternative approaches + +**Claude Will Call "Checkpoint" When:** +- Critical configuration decision needed +- Multiple valid approaches exist +- Security/data loss risk detected +- Deviation from standard procedure required + +--- + +## Error Handling Protocol + +### When Something Goes Wrong + +**Step 1: Michael reports the error** +``` +Michael: "error - [paste error message]" +``` + +**Step 2: Claude analyzes** +- Identifies root cause +- Explains what happened +- Provides solution options + +**Step 3: Remediation** +- Claude provides fix in micro-blocks +- Michael executes +- Verify issue resolved + +**Step 4: Documentation** +- Add to "Troubleshooting" section +- Note for future reference + +--- + +## Service-Specific Templates + +### New Service Deployment Checklist + +**Pre-Deployment:** +- [ ] IP assigned from /29 block +- [ ] Port registry updated (avoid conflicts) +- [ ] DNS A record created in Cloudflare +- [ ] Frostwall strategy planned + +**Installation:** +- [ ] System user created +- [ ] Directories created with correct ownership/permissions +- [ ] Service binary/package installed +- [ ] Configuration file created + +**Network Setup:** +- [ ] Nginx site config created +- [ ] Temporary self-signed cert (if needed) +- [ ] Nginx enabled and restarted +- [ ] Frostwall rules applied + +**SSL & Security:** +- [ ] Let's Encrypt certificate obtained +- [ ] Auto-renewal verified +- [ ] Permissions locked down + +**Verification:** +- [ ] HTTPS responding correctly +- [ ] Service functionality tested +- [ ] Ports bound to correct IP +- [ ] DNS propagated + +**Documentation:** +- [ ] Configs copied to Git repo +- [ ] Sensitive data sanitized +- [ ] Changes committed and pushed +- [ ] Technical documentation created +- [ ] User guide created (if needed) + +--- + +## Quick Reference Commands + +### System Information +```bash +# Check IP addresses +ip addr show ens3 + +# Check listening ports +ss -tlnp | grep [IP or PORT] + +# Check running services +systemctl status [service] + +# Check firewall +ufw status numbered +``` + +### Git Operations +```bash +# Stage files +git add [file or folder] + +# Commit changes +git commit -m "Descriptive message" + +# Push to Gitea +git push + +# Check status +git status + +# View history +git log --oneline +``` + +### Service Management +```bash +# Start service +systemctl start [service] + +# Stop service +systemctl stop [service] + +# Restart service +systemctl restart [service] + +# Enable on boot +systemctl enable [service] + +# View logs +journalctl -u [service] -f +``` + +### Nginx +```bash +# Test config +nginx -t + +# Reload config +systemctl reload nginx + +# Restart nginx +systemctl restart nginx + +# Check bindings +ss -tlnp | grep nginx +``` + +### SSL/Certbot +```bash +# Test renewal +certbot renew --dry-run + +# Check certificates +certbot certificates + +# Force renewal +certbot renew --force-renewal +``` + +--- + +## End-of-Session Protocol + +### Before Taking a Break + +**Michael:** +1. Verify all services are running +2. Check no hanging processes +3. Exit SSH cleanly +4. Note stopping point for next session + +**Claude:** +1. Summarize what was accomplished +2. Note current phase progress (X/5 services) +3. Preview next steps +4. Provide status update + +**Session Summary Template:** +``` +✅ [Service Name] - COMPLETE +- Deployed on [IP] +- SSL configured +- Frostwall active +- Documented in Git + +⏳ Next Service: [Name] ([IP]) - [subdomain] + +Phase 0.5 Progress: X/5 (XX%) +``` + +--- + +## Emergency Procedures + +### If Service Breaks Production + +**Step 1: Assess impact** +```bash +# Check what's affected +systemctl status --failed +``` + +**Step 2: Quick rollback** +```bash +# Stop problematic service +systemctl stop [service] + +# Disable if needed +systemctl disable [service] +``` + +**Step 3: Restore from Git** +```bash +cd /root/firefrost-master-configs +git log # Find last working commit +# Copy old config back +``` + +**Step 4: Restart affected services** +```bash +systemctl restart [service] +``` + +### If Locked Out + +**Prevention:** +- Always test SSH access before closing terminal +- Keep firewall rule for port 22 +- Have backup access method (VPS console) + +**Recovery:** +- Use VPS provider's console access +- Review ufw rules +- Re-enable SSH if blocked + +--- + +## Success Metrics + +**A Successful Deployment Includes:** +- ✅ Service running and responding +- ✅ SSL certificate active (HTTPS working) +- ✅ Frostwall rules applied +- ✅ DNS resolving correctly +- ✅ All configs backed up to Git +- ✅ Documentation complete +- ✅ No errors in service logs +- ✅ Michael can take a break without worry + +--- + +## Notes & Lessons Learned + +**From Gitea Deployment (Service 1):** + +**What Worked Well:** +- Micro-block format for commands +- Complete file paste for configs (vs line-by-line edits) +- IP isolation strategy (one IP per service) +- Checkpoint system for reviews +- Sanitized templates for sensitive configs + +**Issues Encountered:** +- Default Nginx site conflicted with IP binding (removed default) +- Port 80 required full nginx restart (not just reload) to clear inherited sockets +- Needed self-signed cert before Let's Encrypt +- UFW installation removed iptables-persistent + +**Solutions Applied:** +- Remove /etc/nginx/sites-enabled/default +- Use `systemctl restart nginx` after major config changes +- Generate temporary self-signed cert for testing +- Documented UFW as standard Frostwall tool + +**Carry Forward:** +- Always check for default configs that bind 0.0.0.0 +- Full restart after major changes +- Keep templates for SSL cert generation +- UFW is now standard (Phase 0 used iptables, Phase 0.5+ uses UFW) + +--- + +## Revision History + +| Version | Date | Changes | +|---------|------|---------| +| **1.0** | 2026-02-08 | Initial workflow guide created after Gitea deployment success | + +--- + +**END OF WORKFLOW GUIDE** + +**The Wizard & Michael: Building Firefrost Infrastructure, One Service at a Time** 🧙‍♂️⚡ + +--- + +## Documentation Maintenance Protocol + +**Core Principle:** *"Always revise ALL documents when changes occur"* + +### Why This Matters + +**The Documentation Drift Problem:** +- Project files get stale → Future Claude sessions get wrong context +- Memory becomes outdated → Contradictions emerge +- Session handoff falls behind → Time wasted catching up +- Technical decisions get lost → "I thought we documented that?" + +**The Solution:** +When ANY significant change occurs, update ALL affected documents in the same session. + +### What Triggers Documentation Updates + +**ALWAYS update when:** +1. Architecture pivot (e.g., BookStack → MkDocs) +2. New system deployed (e.g., automation framework) +3. Phase completion or status change +4. Major technical decision made +5. Process change identified + +**Documents to Check:** +- ✅ FIREFROST-PROJECT-SCOPE-V2.md (master document) +- ✅ session-handoff.md (current status) +- ✅ Project memory (via Claude interface) +- ✅ Project instructions (if workflow changes) +- ✅ Deployment guides (if applicable) + +### The Update Workflow + +**Step 1: Identify Impact** +- What changed? +- Which documents reference this? +- What will future Claude need to know? + +**Step 2: Update Documents** +- Master scope (if architecture/phase/goals change) +- Session handoff (if current state changes) +- Deployment guides (if technical details change) +- Memory/instructions (if process changes) + +**Step 3: Commit Everything** +- Single commit with all related updates +- Clear commit message explaining the change +- Push to Git immediately + +**Step 4: Verify** +- Quickly review updated docs for consistency +- Check that future Claude will have correct context + +### Examples of Good Documentation Discipline + +**Example 1: Automation System Deployed** +- ✅ Added to Project Scope (Infrastructure section) +- ✅ Updated session-handoff (Current State) +- ✅ Updated memory (Tools & Resources) +- ✅ Updated instructions (Automation First) +- ✅ Created USAGE.md guide +- Result: Future Claude knows to use automation + +**Example 2: BookStack → MkDocs Pivot** +- ✅ Updated Project Scope (Three-tier architecture) +- ✅ Archived old plans +- ✅ Created new deployment guide +- ✅ Updated timeline +- Result: No confusion about which system we're using + +**Example 3: Phase 1 DDoS Gap Identified** +- ✅ Added Phase 1 section to Project Scope +- ✅ Documented decision to revise when gaps found +- Result: Complete picture of all phases + +### Anti-Patterns to Avoid + +**❌ "I'll document it later"** +- Later never comes +- Context is lost +- Future sessions waste time reconstructing + +**❌ "It's in my head"** +- Future Claude can't read your mind +- Team members can't help +- You'll forget details + +**❌ "Just one quick change"** +- Leads to drift between docs +- Causes contradictions +- Breaks continuity + +### The 5-Minute Rule + +**If a change takes 5 minutes to implement, spend 5 minutes documenting it.** + +This includes: +- Updating master scope +- Quick note in session handoff +- Git commit message with context + +**Investment:** 10 minutes total +**Payoff:** Hours saved in future sessions + +### Success Metrics + +**Good Documentation Discipline:** +- Future Claude sessions start fast (no 10-minute catchup) +- No contradictions between documents +- Clear audit trail of decisions +- Team can contribute without confusion + +**Poor Documentation Discipline:** +- "Wait, I thought we removed that?" +- "What was the reason we chose X over Y?" +- "Is this document current?" +- Wasted time reconstructing context + +--- + +**Remember:** Documentation is not separate from the work—it IS the work. + +**Fire + Frost = Where Passion Meets Precision** 🔥❄️ + + +--- + +## Documentation Maintenance Protocol + +**Core Principle:** *"Always revise ALL documents when changes occur"* + +### Why This Matters + +**The Documentation Drift Problem:** +- Project files get stale → Future Claude sessions get wrong context +- Memory becomes outdated → Contradictions emerge +- Session handoff falls behind → Time wasted catching up +- Technical decisions get lost → "I thought we documented that?" + +**The Solution:** +When ANY significant change occurs, update ALL affected documents in the same session. + +### What Triggers Documentation Updates + +**ALWAYS update when:** +1. Architecture pivot (e.g., BookStack → MkDocs) +2. New system deployed (e.g., automation framework) +3. Phase completion or status change +4. Major technical decision made +5. Process change identified + +**Documents to Check:** +- ✅ FIREFROST-PROJECT-SCOPE-V2.md (master document) +- ✅ session-handoff.md (current status) +- ✅ Project memory (via Claude interface) +- ✅ Project instructions (if workflow changes) +- ✅ Deployment guides (if applicable) + +### The Update Workflow + +**Step 1: Identify Impact** +- What changed? +- Which documents reference this? +- What will future Claude need to know? + +**Step 2: Update Documents** +- Master scope (if architecture/phase/goals change) +- Session handoff (if current state changes) +- Deployment guides (if technical details change) +- Memory/instructions (if process changes) + +**Step 3: Commit Everything** +- Single commit with all related updates +- Clear commit message explaining the change +- Push to Git immediately + +**Step 4: Verify** +- Quickly review updated docs for consistency +- Check that future Claude will have correct context + +### Examples of Good Documentation Discipline + +**Example 1: Automation System Deployed** +- ✅ Added to Project Scope (Infrastructure section) +- ✅ Updated session-handoff (Current State) +- ✅ Updated memory (Tools & Resources) +- ✅ Updated instructions (Automation First) +- ✅ Created USAGE.md guide +- Result: Future Claude knows to use automation + +**Example 2: BookStack → MkDocs Pivot** +- ✅ Updated Project Scope (Three-tier architecture) +- ✅ Archived old plans +- ✅ Created new deployment guide +- ✅ Updated timeline +- Result: No confusion about which system we're using + +**Example 3: Phase 1 DDoS Gap Identified** +- ✅ Added Phase 1 section to Project Scope +- ✅ Documented decision to revise when gaps found +- Result: Complete picture of all phases + +### Anti-Patterns to Avoid + +**❌ "I'll document it later"** +- Later never comes +- Context is lost +- Future sessions waste time reconstructing + +**❌ "It's in my head"** +- Future Claude can't read your mind +- Team members can't help +- You'll forget details + +**❌ "Just one quick change"** +- Leads to drift between docs +- Causes contradictions +- Breaks continuity + +### The 5-Minute Rule + +**If a change takes 5 minutes to implement, spend 5 minutes documenting it.** + +This includes: +- Updating master scope +- Quick note in session handoff +- Git commit message with context + +**Investment:** 10 minutes total +**Payoff:** Hours saved in future sessions + +### Success Metrics + +**Good Documentation Discipline:** +- Future Claude sessions start fast (no 10-minute catchup) +- No contradictions between documents +- Clear audit trail of decisions +- Team can contribute without confusion + +**Poor Documentation Discipline:** +- "Wait, I thought we removed that?" +- "What was the reason we chose X over Y?" +- "Is this document current?" +- Wasted time reconstructing context + +--- + +**Remember:** Documentation is not separate from the work—it IS the work. + +**Fire + Frost = Where Passion Meets Precision** 🔥❄️ + + +--- + +## Sandbox AI Integration (Added Feb 9, 2026) + +**Purpose:** Keep exploratory AI (Gemini) in sync with production progress + +**Auto-Update Trigger:** +After completing ANY of these milestones: +- Service deployment +- Phase completion +- Major infrastructure change +- Architecture decision + +**Update Command:** +```bash +bash automation/update-sandbox-briefing.sh +git add docs/SANDBOX-BRIEFING.md project-files/SANDBOX-BRIEFING.md +git commit -m "Update sandbox briefing: [what changed]" +git push +``` + +**Gemini Access:** +https://raw.githubusercontent.com/frostystyle/firefrost-operations-manual/master/project-files/SANDBOX-BRIEFING.md + +**Workflow:** +1. Production work happens with Claude +2. Auto-update sandbox briefing after milestones +3. Gemini always has current context for brainstorming +4. Ideas validated in sandbox → implemented in production