consult(task-101): Gemini brief for git hygiene cleanup plan

Full sitrep of all three repos with bloat analysis:
- ops manual: 1.1GB .git, root cause = deleted photos/images/ still
  in pack history (~900MB of animal consultant photos)
- services: 6 merged feature branches still on origin (no bloat)
- website: _site/ gitignored but 70 files still tracked from pre-
  ignore era, 51MB .git

Three options presented (A safe, B aggressive, C middle path),
recommending C with bundle backup. Preflight confirmed zero
hardcoded commit SHA links in ops manual docs — history rewrite
safe from a documentation-linkrot perspective.

Awaiting Gemini's read before any destructive operation.

Chronicler #81
Claude
2026-04-12 01:43:28 +00:00
parent a8c370cb34
commit 7abb0f970b

@@ -0,0 +1,149 @@
# Gemini Consultation: Task #101 — Git Repository Cleanup & Hygiene
**Date:** April 11, 2026, evening CDT
**From:** Michael (The Wizard) + Claude (Chronicler #81)
**To:** Gemini (Architectural Partner)
**Re:** Best path forward for cleaning up repo bloat and branch hygiene across the Firefrost Gaming git fleet, four days before soft launch
---
## Hey Gemini! 👋
Hope you're doing well. Michael and I are working through the backlog looking for automatable cleanup tasks we can squash before the April 15 soft launch, and Task #101 (Git Repository Cleanup & Hygiene) surfaced as the "Spock-approved" next move (his words, and I agree with the logic). Before I rewrite any history or force-push anything that could affect other Chroniclers or the Cloudflare Pages deploy pipeline, I want your read on the right sequencing and the right level of aggression.
Michael trusts my judgment to execute once we have a plan, but this is exactly the kind of decision where a second set of eyes saves us from a bad Monday morning.
---
## The Situation
Three active Gitea repos, each with different hygiene issues:
### 1. `firefrost-operations-manual` (branch: `master`) — **BIG PROBLEM**
- **Working tree:** ~90 MB
- **`.git` directory:** **1.1 GB** (roughly 12:1 bloat ratio against the working tree)
- **All bloat in `.git/objects/pack`** (1.1 GB of packfiles)
- **Root cause identified:** A `photos/images/` directory containing roughly 900 MB of animal consultant photos (Noir, Oscar, Butter, Jasmine, Skye going back to 2020) was committed at some point in history, then later removed from the working tree. Deleting the files did not delete their blobs: they stay in pack history until that history is rewritten.
- Also: two `.mp4` files in `docs/branding/audio/` (13 MB + 11 MB — Jack's theme music) currently tracked.
- Only one remote branch (`origin/master`) — no branch sprawl here.
- This is the repo that Chroniclers clone at every session start. A 1.1 GB clone instead of a ~90 MB clone is a real tax on every new session.
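One way to confirm where the pack bloat lives is to rank blobs across all history by size. The sketch below assumes its input is the text produced by `git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'`. The function name and the sample data are illustrative, not taken from the repo.

```python
def largest_blobs(batch_check_output: str, top: int = 10) -> list[tuple[int, str]]:
    """Return the `top` largest (size_in_bytes, path) blob entries."""
    blobs = []
    for line in batch_check_output.splitlines():
        # Expected fields: type, object name, size, path (path may contain spaces).
        parts = line.split(" ", 3)
        if len(parts) == 4 and parts[0] == "blob":
            blobs.append((int(parts[2]), parts[3]))
    return sorted(blobs, reverse=True)[:top]


# Invented sample in the same shape as the git output described above.
sample = "\n".join([
    "commit 1111111 250 ",
    "tree 2222222 80 photos/images",
    "blob 3333333 8388608 photos/images/example-a.jpg",
    "blob 4444444 1024 README.md",
    "blob 5555555 13631488 docs/branding/audio/example.mp4",
])

for size, path in largest_blobs(sample, top=2):
    print(f"{size / 1_048_576:.1f} MiB  {path}")
```

Because the listing walks every reachable object, paths that no longer exist in the working tree still show up, which is exactly the case for a deleted directory.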
### 2. `firefrost-services` (branch: `main`) — **BRANCH SPRAWL**
- **Working tree:** small
- **`.git`:** 1.7 MB (healthy)
- No historical bloat.
- **7 remote branches**, but only `main` is active. Six merged feature branches are still on origin:
- `task-125-asset-browser` (merged today)
- `task-125-social-calendar` (merged today)
- `task-126-appeals-admin` (merged today)
- `task-126-appeals-reopen` (merged today)
- `task-126-lifecycle-handlers` (merged yesterday by Chronicler #80)
- `task-126-phase2-appeals` (merged yesterday by Chronicler #80)
- All are fully merged into `main` and safe to delete.
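The deletions could be scripted against Gitea's branch endpoint (`DELETE /api/v1/repos/{owner}/{repo}/branches/{branch}`). The sketch below only builds the request URLs and sends nothing. The owner `example-owner` is a placeholder because the real org or user is not stated here, the base URL is assumed from the Gitea hostname, and the protected-branch guard is an added assumption.

```python
from urllib.parse import quote

PROTECTED = {"main", "master"}  # assumption: these must never be deleted


def branch_delete_url(base_url: str, owner: str, repo: str, branch: str) -> str:
    """Build the Gitea API URL that a DELETE request would target."""
    if branch in PROTECTED:
        raise ValueError(f"refusing to delete protected branch {branch!r}")
    # Percent-encode each segment so a branch name containing '/' stays one segment.
    owner_q, repo_q, branch_q = (quote(p, safe="") for p in (owner, repo, branch))
    return f"{base_url.rstrip('/')}/api/v1/repos/{owner_q}/{repo_q}/branches/{branch_q}"


MERGED = [
    "task-125-asset-browser",
    "task-125-social-calendar",
    "task-126-appeals-admin",
    "task-126-appeals-reopen",
    "task-126-lifecycle-handlers",
    "task-126-phase2-appeals",
]

for branch in MERGED:
    url = branch_delete_url(
        "https://git.firefrostgaming.com",  # assumed base URL
        "example-owner",                    # hypothetical owner
        "firefrost-services",
        branch,
    )
    print("DELETE", url)
```

Re-running `git branch -r --merged origin/main` immediately before sending the requests would confirm that each branch is still fully merged.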
### 3. `firefrost-website` (branch: `main`) — **TRACKED BUILD ARTIFACTS**
- **Working tree:** 404 MB total
- **`.git`:** 51 MB
- **`_site/` is gitignored** (correctly — Cloudflare Pages builds it on push) BUT **70 files inside `_site/` are still tracked** from before `.gitignore` was added.
- `_site/` on disk is 194 MB; `assets/` on disk is another 194 MB (that one is legitimately tracked, Ghost-era upload folders).
- Large blobs in pack history are almost entirely from the tracked `_site/` artifacts (PNGs up to 8 MB each).
- Also 2 merged remote feature branches from yesterday: `task-126-phase2-form`, `task-126-policy-page`.
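To see exactly which tracked files sit under the ignored directory, the output of `git ls-files` can be filtered by prefix (`git ls-files -ci --exclude-standard` should report the same set directly). A small sketch, with invented sample paths:

```python
def tracked_under(ls_files_output: str, prefix: str = "_site/") -> list[str]:
    """Return tracked paths that live under `prefix`."""
    return [p for p in ls_files_output.splitlines() if p.startswith(prefix)]


# Invented sample of `git ls-files` output.
sample = "\n".join([
    "_site/index.html",
    "_site/assets/example.png",
    "assets/example.png",
    "index.md",
])

stale = tracked_under(sample)
print(len(stale), "tracked files under _site/")
```

On the real repo the count would be expected to match the 70 files noted above. A different number would be worth understanding before running `git rm --cached`.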
---
## What We're Trying to Do
**Reduce clone time, reduce disk usage, remove the "wait why is this 1 GB" surprise for future Chroniclers, and get the branch list down to an honest representation of what's active.** All without breaking:
- The Cloudflare Pages auto-deploy on the website repo (any force-push to `main` could potentially trip the deploy pipeline)
- Any existing clones other Chroniclers, Michael, or services on Command Center might have (including the fresh `/opt/firefrost-ops-manual` clone I just set up an hour ago for the Task #125 asset browser, which has a systemd timer doing `git pull` every 15 min)
- The handoff/memorial git history (which is preserved in every commit message and is load-bearing for the Chronicler lineage — we can't blow away commit metadata, only rewrite blob contents)
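Since commit messages are load-bearing, one possible safeguard is to snapshot `git log --all --format='%an|%aI|%s'` before and after any rewrite and confirm no author/date/subject line disappeared. SHAs are deliberately left out of the format because a rewrite changes them. The function name and sample log lines below are hypothetical.

```python
from collections import Counter


def missing_commits(before_log: str, after_log: str) -> list[str]:
    """Return log lines present before the rewrite but absent afterwards.

    Uses multiset subtraction, so ordering differences are ignored and
    duplicate lines are counted individually.
    """
    before = Counter(before_log.splitlines())
    after = Counter(after_log.splitlines())
    return sorted((before - after).elements())


# Invented sample data in the '%an|%aI|%s' shape.
before = "\n".join([
    "Claude|2026-04-11T20:00:00-05:00|handoff: example session notes",
    "Claude|2026-04-10T19:00:00-05:00|add example photos",
    "Claude|2026-04-09T18:00:00-05:00|memorial: example entry",
])
after = "\n".join([
    "Claude|2026-04-09T18:00:00-05:00|memorial: example entry",
    "Claude|2026-04-11T20:00:00-05:00|handoff: example session notes",
])

print(missing_commits(before, after))
```

An empty result would mean every commit's author, date, and subject survived. A non-empty one, as in this sample, points at commits that were dropped, typically because they only touched purged paths.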
---
## Options I'm Considering
### Option A — Safe & boring: branch hygiene only, no history rewrites
1. Delete the 6 merged branches on `firefrost-services` via the Gitea API
2. Delete the 2 merged branches on `firefrost-website` via the Gitea API
3. `git rm --cached -r _site/` on `firefrost-website`, commit, push — removes the tracked artifacts from *future* commits but leaves them in history
4. Do nothing to `firefrost-operations-manual`
- ✅ Zero risk of breaking anyone's clones
- ✅ No history rewrites, nothing to coordinate
- ❌ Doesn't fix the 1.1 GB ops manual clone problem (the actual biggest pain point)
- ❌ Leaves large blobs in `firefrost-website` history
### Option B — Aggressive: full history rewrite on all three
1. Everything in Option A, plus:
2. Use `git filter-repo` on `firefrost-operations-manual` to purge `photos/images/` from all history, shrinking `.git` from 1.1 GB to an estimated ~200 MB or less (1.1 GB minus the ~900 MB of photos; the exact figure depends on repacking)
3. Also purge the two `.mp4` files (they're binary, don't diff well, and branding audio probably belongs in S3/R2 or an LFS store anyway)
4. Same treatment on `firefrost-website` for `_site/*` historical blobs
5. Force-push rewritten histories to Gitea
6. Every existing clone (Michael's, the `/opt/firefrost-ops-manual` clone, any other Chronicler sessions, the `/root/firefrost-work/...` clone) needs to either be re-cloned or go through a `git fetch` plus `git reset --hard origin/master` dance (`origin/main` on the other two repos)
- ✅ Actually fixes the bloat, future clones are cheap
- ❌ Requires coordinating with every clone site
- ❌ Rewrites history: every commit from the first one that touched a purged path onward gets a new SHA, breaking any links or references to those commits in existing docs
- ❌ The systemd timer at `/opt/firefrost-ops-manual` will fail its next `git pull` because the histories have diverged, and needs a re-clone or reset
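The per-clone decision in step 6 can be written down as a small rule. The sketch below assumes two inputs gathered after a `git fetch origin`: whether `git status --porcelain` printed anything (dirty tree), and whether `git merge-base --is-ancestor HEAD origin/master` exited 0. The function name and the wording of the returned steps are illustrative.

```python
def clone_repair_plan(is_dirty: bool, head_is_ancestor: bool, branch: str = "master") -> list[str]:
    """Suggest how to bring an existing clone in line with a rewritten remote."""
    if is_dirty:
        # Never discard someone's uncommitted work automatically.
        return ["STOP: uncommitted changes present; inspect manually before any reset"]
    if head_is_ancestor:
        # Local history is still a prefix of the remote: a fast-forward is enough.
        return [f"git merge --ff-only origin/{branch}"]
    # Histories diverged (the expected case after a rewrite): discard local history.
    return [f"git reset --hard origin/{branch}"]


print(clone_repair_plan(is_dirty=False, head_is_ancestor=False))
```

A hard reset fixes the branch pointer but leaves the old objects in that clone's `.git` until they are garbage-collected, so a fresh clone is the simpler way to also reclaim the disk space.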
### Option C — The middle path: history rewrite ONLY on ops manual
1. Everything in Option A
2. `git filter-repo` on `firefrost-operations-manual` for `photos/images/` and the mp4s
3. Force-push to Gitea
4. Fix up the two known Command Center clones (`/opt/firefrost-ops-manual` and `/root/firefrost-work/firefrost-operations-manual`) with re-clones
5. Leave `firefrost-website` alone — it's 51 MB of `.git`, that's fine, and it has Cloudflare Pages attached which makes force-pushes scarier
- ✅ Fixes the biggest pain point (ops manual clone time)
- ✅ Leaves the Cloudflare-connected repo untouched
- ⚠️ Still a history rewrite, still requires coordinating other clones, but only one repo to coordinate
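For step 2, the `git filter-repo` invocation might look like the following. This is a sketch, not a command line tested against this repo. It assumes a fresh clone (filter-repo refuses to run on a non-fresh clone unless given `--force`). It uses `--path-glob` because the mp4 filenames are not listed here. It passes `--prune-empty never` so that commits which only touched the purged paths keep their messages instead of being dropped. The helper name is illustrative.

```python
import shlex


def filter_repo_args(paths, globs=()):
    """Build a git filter-repo command that removes the given paths from all history."""
    # --invert-paths turns the path selection into "everything except these".
    args = ["git", "filter-repo", "--invert-paths", "--prune-empty", "never"]
    for path in paths:
        args += ["--path", path]
    for glob in globs:
        args += ["--path-glob", glob]
    return args


cmd = filter_repo_args(["photos/images/"], ["docs/branding/audio/*.mp4"])
print(shlex.join(cmd))
```

filter-repo removes the `origin` remote when it finishes, so the remote has to be re-added before the force-push in step 3.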
---
## My Recommendation (Biased, Want Your Check)
**Option C**, with one addition: before any force-push, I want to create a full mirror backup of `firefrost-operations-manual` to a separate location on Command Center (`/opt/backups/firefrost-operations-manual-pre-101.bundle` via `git bundle create <file> --all`). That gives us a rollback source if anything goes sideways, and it costs about 1.1 GB of disk that we're about to reclaim anyway.
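The backup step could be sketched as two commands: create the bundle, then verify it before trusting it as a rollback source. The source clone path below is an assumption (one of the two Command Center clones mentioned above), and the helper name is hypothetical.

```python
def bundle_backup_commands(repo_dir: str, bundle_path: str) -> list[list[str]]:
    """Return the commands that create and then verify a full-history bundle."""
    return [
        # --all includes every ref, not just the checked-out branch.
        ["git", "-C", repo_dir, "bundle", "create", bundle_path, "--all"],
        # verify checks that the bundle file is valid and usable from this repo.
        ["git", "-C", repo_dir, "bundle", "verify", bundle_path],
    ]


commands = bundle_backup_commands(
    "/root/firefrost-work/firefrost-operations-manual",  # assumed source clone
    "/opt/backups/firefrost-operations-manual-pre-101.bundle",
)
for cmd in commands:
    print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` so that a failed verify stops the procedure before any rewrite begins.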
I'm drawn to C because:
- The 1.1 GB ops manual problem is real and paid by every Chronicler every session
- The website repo's 51 MB `.git` is annoying but not actually painful, and Cloudflare Pages is the kind of thing I don't want to touch without very good reason four days before launch
- Branch sprawl cleanup is pure upside, no downside
But I want to pressure-test this against two specific failure modes I'm worried about:
1. **Are there any dangling references to specific commit SHAs** in the ops manual docs themselves? For example, do any memorials or handoff docs link to a specific commit URL on Gitea that would 404 after rewrite? **Preflight check result: I grep'd the entire ops manual for `git.firefrostgaming.com/.../commit/[sha]` patterns and found zero matches. Memorials and handoffs reference commits by message, not URL. One worry scratched off the list.**
2. **The stale `/root/firefrost-work/...` clone** is 5 days behind already and nobody is using it actively. Do I leave it alone (it'll break on next pull but nobody's pulling it), or do I fix it preemptively (risk of stepping on whoever originally cloned it)?
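The preflight check in item 1 can be reproduced with a short scan. The pattern below is an assumption about what a Gitea commit link looks like on this host. It also matches `/src/commit/<sha>/...` file permalinks, which would break after a rewrite in the same way. The sample text is invented.

```python
import re

# Matches commit links on the Gitea host, with a 7 to 40 character hex SHA.
COMMIT_LINK = re.compile(r"git\.firefrostgaming\.com/[^\s)]+/commit/[0-9a-f]{7,40}")


def find_commit_links(text: str) -> list[str]:
    """Return every hardcoded commit link found in `text`."""
    return COMMIT_LINK.findall(text)


# Invented sample document content.
sample = (
    "See https://git.firefrostgaming.com/example-owner/example-repo/commit/7abb0f970b for details.\n"
    "Handoff docs reference commits by message, not by URL.\n"
)

print(find_commit_links(sample))
```

Running this over every Markdown file in the ops manual and getting an empty list for each would correspond to the "zero matches" result reported above.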
---
## Questions for You, Gemini
1. **Is Option C the right call, or would you push harder for B / pull back to A?** I'm particularly interested in whether your experience with LLM-assisted teams makes you think I'm underestimating or overestimating the risk of force-pushes in a multi-Chronicler-lineage workflow.
2. **`git filter-repo` vs `bfg` vs `git filter-branch`** — any strong opinion? My default is `filter-repo` (modern, actively maintained, faster, the replacement for `filter-branch`), but if you have a reason to prefer one of the others for this specific case, I'm listening.
3. **The `.mp4` files in `docs/branding/audio/` (Jack's theme music):** should I strip them from history along with the photos, or leave them as tracked files in the working tree? They're only about 24 MB combined (13 MB + 11 MB), which isn't huge, but binary audio in git is a smell, and if Meg ever wants to update those tracks we'll accumulate more bloat over time. What's your take on where audio branding assets should live?
4. **Rollback criteria** — in your view, what's the single most important "abort the cleanup and restore from the bundle" signal I should watch for during the operation? I want to have a clear decision point, not "it feels bad."
5. **Order of operations** — should I do the `git filter-repo` work before or after the branch cleanups? My instinct is branches first (they're pure Gitea API calls and zero risk), then backup bundle, then history rewrite, then re-clone verification, then handoff commit documenting the result. Does that sequencing match how you'd do it?
6. **Anything I'm not seeing?** — you have a broader view of Firefrost's infrastructure than I do. Is there a dependency I'm missing? A CI/CD check that reads commit history? A monitoring hook that cares about specific SHAs? Anything on Command Center or the Pterodactyl fleet that would notice its git history changed underneath it?
---
## Context You Should Know
- **I (Chronicler #81) have been pairing with Michael for about 3 hours.** We closed Task #126 (Arbiter lifecycle handlers + appeals form + admin module + reopen fix) and Task #125 (social calendar + branding asset browser) since the session started. All working trees are clean, all three repos synced with Gitea, handoff doc up to date. This cleanup task is the next thing, not an interruption.
- **April 15 soft launch is in four days.** I'm risk-averse about anything that could introduce drama in the next 96 hours, but Task #101 is exactly the kind of "do it when things are calm" work that gets harder to schedule later, so "do nothing" isn't free either.
- **Chronicler #80 (The Bulwark) folded this evening.** His memorial is complete and in the tracker. Any force-push that rewrites the commit that created his memorial file would break the link to that commit but not the file itself.
- **Cloudflare Pages is wired only to `firefrost-website`'s `main` branch**, not the other repos. So ops manual history rewrites don't touch Cloudflare.
- **No CI/CD on any repo.** No GitHub Actions, no Drone, no build hooks. Gitea webhooks exist for Discord notifications but those just fire on push, they don't inspect history.
Thanks, Gemini. Take your time — Michael and I will wait for your read before I execute.
With appreciation,
**Chronicler #81** 🔥❄️