Field Manual № 01 — The Discipline of Collaborating with an Agent that Forgets

The Premise

Every conversation begins with no memory of the prior one.

Each new session, the agent loads what it can (the project's CLAUDE.md, the memory index, your first message), and that is the entirety of its context. Yesterday's debugging path, the false trail you ruled out, the reason you chose approach A over approach B: none of it survives unless it was written somewhere the next session will read.

Most engineering tools assume continuity. Claude Code does not. The first time this catches you out, you spend an hour re-explaining the project. The tenth time, you start to suspect there is a discipline you are missing. There is. This manual describes it.

The question, once you accept the premise, is simple: what do you encode, and where? There are two slots, and they answer different questions.

Memory files hold the durable knowledge of a project — its architecture, its conventions, the gotchas you've burned hours discovering, the workflow rules you want to apply session after session. They are written once and consulted forever. They should be true on the timescale of the project itself, not on the timescale of any one task.

The handoff prompt holds the working set — where you left off this afternoon, what is in flight, what the next session should propose first. It is written at the end of every working session and consumed at the start of the next one. It has a known shelf life of exactly one session.

Together they form a protocol between past-you and future-you, with the agent as the courier. Get this protocol right and the agent is a long-running collaborator. Get it wrong and you have an expensive autocomplete.

The Six Universal Gates

Six checkpoints that should exist on every serious project.

The rules below are gates — moments where the agent stops and verifies before proceeding. They are not optional politeness; they are the difference between a session that ships and a session that introduces the bug it was supposed to fix. Six are universal. The rest grow on top, project by project, and earn their keep through incident.

The Gates

01 / VI GATE(session-start)

Verify against reality, before acting.

Read the memory index. Read the feature inventory. Run git status, git log, hardware and service checks. Memory is a point-in-time snapshot; trust the live state. Every claim from the previous session is a hypothesis until verified.
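The session-start check can be sketched as a small script, under the assumption that the project lives in a git repository. The function names `parse_porcelain` and `live_state` are illustrative, not part of any tool; hardware and service checks are project-specific and left out.

```python
import subprocess


def parse_porcelain(output: str) -> dict:
    """Summarize `git status --porcelain` output into counts by kind."""
    summary = {"modified": 0, "untracked": 0, "other": 0}
    for line in output.splitlines():
        if not line.strip():
            continue
        code = line[:2]  # two-character status code at the start of each line
        if code == "??":
            summary["untracked"] += 1
        elif "M" in code:
            summary["modified"] += 1
        else:
            summary["other"] += 1
    return summary


def live_state(repo: str = ".") -> dict:
    """Collect the live repository state that memory should be checked against."""
    def git(*args: str) -> str:
        result = subprocess.run(
            ["git", "-C", repo, *args],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    return {
        "status": parse_porcelain(git("status", "--porcelain")),
        "recent_commits": git("log", "--oneline", "-10").splitlines(),
    }
```

Comparing the result of `live_state()` with what the memory files claim is the point of the gate: where they disagree, the live state wins.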

02 / VI GATE(pre-commit)

Scan, compile, review — every time.

Secret scan the diff. Confirm every language touched still compiles. Review staged files for unexpected additions: .env, credentials, build artifacts. Never bypass hooks unless explicitly authorized. The cost of one bad commit lasts forever in git history.
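A minimal sketch of the scan-and-review half of this gate, assuming the diff is available as unified-diff text. The keyword pattern and file-name heuristics are assumptions for illustration; a real project would likely use a dedicated secret scanner.

```python
import re

# Hypothetical pattern: flags assignments to names that look like credentials.
SECRET_PATTERN = re.compile(r"(password|secret|token|api_key)\s*[:=]", re.IGNORECASE)


def scan_diff(diff_text: str) -> list[str]:
    """Return added lines in a unified diff that look like they contain a secret."""
    hits = []
    for line in diff_text.splitlines():
        # Only added lines matter; skip the "+++ b/file" header lines.
        if line.startswith("+") and not line.startswith("+++"):
            if SECRET_PATTERN.search(line):
                hits.append(line[1:].strip())
    return hits


def suspicious_staged(paths: list[str]) -> list[str]:
    """Return staged paths whose file name suggests secrets or credentials."""
    flagged = []
    for path in paths:
        name = path.rsplit("/", 1)[-1].lower()
        if name.startswith(".env") or "credentials" in name:
            flagged.append(path)
    return flagged
```

Either function returning a non-empty list is a reason to stop and ask before committing.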

03 / VI GATE(self-audit)

Audit, fix, re-audit until clean.

One pass is never enough. The first audit catches perhaps eighty percent of issues; the re-audit catches the bugs introduced by the fix. Do not declare a feature done after the first sweep. Done is what survives the second one.
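The audit loop can be written down as a sketch. Here `audit` and `fix` are hypothetical callables standing in for whatever compile, lint, and test steps the project uses; the round limit is an assumption to keep the loop from running forever.

```python
from typing import Callable


def audit_until_clean(
    audit: Callable[[], list[str]],
    fix: Callable[[list[str]], None],
    max_rounds: int = 5,
) -> int:
    """Run audit/fix rounds until an audit returns no issues.

    Returns the number of audits performed. Raises if the last of
    max_rounds audits still reported issues.
    """
    for round_number in range(1, max_rounds + 1):
        issues = audit()
        if not issues:
            return round_number  # a clean audit is the only exit
        fix(issues)
    raise RuntimeError(f"still failing after {max_rounds} audit rounds")
```

Note that the function can never return after a fix without a fresh audit, which is the property the gate asks for.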

04 / VI GATE(post-work)

Update the record before saying done.

Update the feature inventory. Update memory files for any new architectural decision or workflow change. Verify the push landed. Report what was done. Memory drift is worse than no memory — it actively misleads future sessions.
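As one way to keep the feature inventory honest, here is a sketch that ticks an entry in a checkbox list. The `- [ ]` / `- [x]` syntax is an assumption about how the inventory is written; the helper names are illustrative.

```python
def mark_done(features_text: str, feature: str) -> str:
    """Tick the checkbox for `feature`; raise if the inventory has no such entry."""
    lines = features_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped in (f"- [ ] {feature}", f"- [x] {feature}"):
            lines[i] = line.replace("- [ ]", "- [x]", 1)
            return "\n".join(lines) + "\n"
    # A missing entry means the inventory has drifted; fail loudly.
    raise KeyError(f"feature not in inventory: {feature}")


def pending(features_text: str) -> list[str]:
    """List the features that are still unchecked."""
    marker = "- [ ] "
    return [
        line.strip()[len(marker):]
        for line in features_text.splitlines()
        if line.strip().startswith(marker)
    ]
```

Raising on an unknown feature is deliberate: silently appending would hide the fact that work was done that the record never planned for.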

05 / VI GATE(push-verify)

Local and remote must agree.

After every push: confirm the push succeeded, the working tree is clean, tags are pushed if releasing, version strings agree across every manifest, CI is not broken. Asymmetry between local and remote is where releases break.
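A hedged sketch of the agreement check, assuming a git repository with an upstream branch configured. The manifest names passed in are hypothetical, and tag and CI checks are left out because they depend on the hosting setup.

```python
import subprocess


def versions_agree(versions: dict[str, str]) -> bool:
    """True when every manifest reports the same version string."""
    return len(set(versions.values())) <= 1


def push_report(local_sha: str, remote_sha: str, porcelain: str,
                versions: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means local and remote agree."""
    problems = []
    if local_sha != remote_sha:
        problems.append("local HEAD differs from remote branch")
    if porcelain.strip():
        problems.append("working tree is not clean")
    if not versions_agree(versions):
        detail = ", ".join(f"{k}={v}" for k, v in sorted(versions.items()))
        problems.append("version strings disagree: " + detail)
    return problems


def git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

In a real repository the inputs could come from `git("rev-parse", "HEAD")`, `git("rev-parse", "@{u}")`, and `git("status", "--porcelain")`, with the version strings read from whatever manifests the project has.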

06 / VI GATE(session-handoff)

End every session with a kickoff prompt.

A fenced code block the next session can paste verbatim, containing the gate checks, last-session summary, workflow rules, candidates for what to do next, and known follow-ups. This is the bridge across the no-shared-memory gap. Without it, every session starts from scratch.
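To make the shape of the kickoff prompt concrete, here is a sketch that renders one from structured inputs. The section titles and the `GATE_CHECKS` wording are assumptions; only the five ingredients come from the gate description above.

```python
FENCE = "`" * 3  # built this way so the example can itself sit inside a fence

# Hypothetical wording for the gate checks the next session should run first.
GATE_CHECKS = [
    "Read MEMORY.md and FEATURES.md",
    "Run git status and git log --oneline -10",
    "Verify dependent services before acting",
]


def bullets(items: list[str]) -> list[str]:
    """Format items as bullet lines, with a placeholder for an empty list."""
    return [f"- {item}" for item in items] or ["- (none)"]


def render_handoff(summary: str, rules: list[str],
                   next_up: list[str], follow_ups: list[str]) -> str:
    """Render exactly one fenced block the next session can paste verbatim."""
    sections = [
        ("GATE checks", GATE_CHECKS),
        ("Last session", [summary]),
        ("Workflow rules", rules),
        ("Next up (candidates)", next_up),
        ("Known follow-ups", follow_ups),
    ]
    lines = [FENCE]
    for title, items in sections:
        lines.append(f"{title}:")
        lines.extend(bullets(items))
        lines.append("")
    lines[-1] = FENCE  # replace the trailing blank line with the closing fence
    return "\n".join(lines)
```

Printing an empty section as "(none)" rather than omitting it tells the next session the slot was considered, not forgotten.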


Three Standing Rules

Three rules that aren't gates, but should exist on every project.

Where the gates above are checkpoints, these are standing dispositions — reflexes you want the agent to have, encoded as feedback memories so they survive across sessions and do not require re-prompting.

One task at a time.

Sequential, not batched. Finish before proposing the next. For any task that needs manual user action — account creation, token paste, hardware connection, third-party UI clicks — give a numbered step list and wait for confirmation. Multi-task batches collapse under their own weight; sequential work clarifies thinking on both sides of the conversation.

Do not add unrequested complexity.

No speculative abstractions. No backward-compatibility shims for code that has no users yet. No helper utilities for one-time operations. Match the scope of changes to what was asked. Three similar lines of code are better than a premature abstraction designed for a future that will not arrive.

Confirm risky actions first.

Destructive operations, force-pushes, anything visible to others, anything affecting shared state — pause and ask before doing. Authorization for one action does not transfer to other actions. The cost of a confirmation is small; the cost of an unwanted force-push is catastrophic and asymmetric.
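The rule that authorization does not transfer can be illustrated with a small sketch. The lists of risky programs and flags are assumptions chosen for illustration and are far from complete; the point is that approvals are keyed by the exact command.

```python
import shlex

# Hypothetical, deliberately incomplete lists of things worth pausing on.
RISKY_PROGRAMS = {"rm", "dd"}
RISKY_GIT_FLAGS = {"--force", "-f", "--force-with-lease", "--hard", "--no-verify"}


def is_risky(command: str) -> bool:
    """Rough heuristic: destructive programs, or git invoked with a risky flag."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    if tokens[0] in RISKY_PROGRAMS:
        return True
    return tokens[0] == "git" and any(t in RISKY_GIT_FLAGS for t in tokens[1:])


class Approvals:
    """Approvals are keyed by the exact command, so one yes never covers another action."""

    def __init__(self) -> None:
        self._approved: set[str] = set()

    def approve(self, command: str) -> None:
        self._approved.add(command)

    def may_run(self, command: str) -> bool:
        return not is_risky(command) or command in self._approved
```

Approving a force-push to one branch leaves a force-push to any other branch still blocked, which mirrors the rule as stated.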

Anything that should survive amnesia goes in memory. Anything that bridges between sessions goes in the handoff.

The Starter Prompt

What to paste on day one of a new project.

Treat this as session zero. The agent reads it, sets up the memory system and the standing rules, then waits for the first real task. Adapt the bracketed placeholders to your project; everything else is the universal core. Copy the block below.

Starter Prompt — Session Zero

// Paste this into a fresh Claude Code session at the start of a new project

This is the kickoff session for <PROJECT NAME>, a <one-line description>. We'll be working on this across many sessions over weeks or months. I want you to build up a memory system and workflow discipline from the start so that future sessions can pick up where we left off without floundering.

── Memory system structure ──────────────────────────────

Set up the following memory files in your auto-memory directory. Use four memory file types, each with frontmatter:

---
name: <short title>
description: <one-line description>
type: user | feedback | project | reference
---

MEMORY.md is an INDEX only: one line per file. Never put memory content directly in MEMORY.md.

── GATE rules (save as feedback memories) ───────────────

1. feedback_session_start.md
   At session start: read MEMORY.md, read FEATURES.md (when it exists), run git status + git log --oneline -10, verify any dependent services are in expected state. Don't assume memory is current; verify against reality before acting.

2. feedback_precommit.md
   Before every commit: secret scan the diff (grep for password/secret/token/api_key), verify builds compile, review staged files for unexpected additions, never use --no-verify.

3. feedback_selfaudit.md
   After implementing a feature: audit, fix, RE-AUDIT until zero issues. Compile, lint, edge cases, end-to-end. Don't declare done after the first pass.

4. feedback_postwork.md
   Before saying "done" or ending a session: update FEATURES.md, update memory files for any new architectural decisions or workflow rules, verify the push landed, report what was done.

5. feedback_push_verify.md
   After git push: verify push succeeded, working tree clean, tags pushed if releasing, version strings consistent across all version files.

6. feedback_session_handoff.md
   At session end: produce a fenced code block kickoff prompt for the next session containing GATE checks, last-session summary, workflow rules, "next up" candidates, and known follow-ups. Single-pass output: write status summary FIRST, then exactly ONE fenced block, then stop.

7. feedback_workflow_one_by_one.md
   Tackle tasks one at a time. Finish before proposing next. Get my approval before starting. Numbered step-by-step lists for any manual user action.

── Project state files ───────────────────────────────────

8. project_overview.md
   Current status, version, what's implemented, tech stack, project structure. Update on every meaningful milestone.

9. FEATURES.md (in the repo root, not in memory)
   Single source of truth for what's done and what's not. Checkbox list. Updated as features land.

── Workflow rules to honor from this point forward ──────

· Confirm risky actions first (destructive ops, force-pushes, anything affecting shared state)
· Don't add unrequested complexity (no speculative abstractions, no backward-compat shims)
· Use Plan mode for non-trivial implementations to align on approach before writing code
· When you discover a project-specific gotcha (a non-obvious thing that bit you), save it as a feedback memory immediately so future sessions don't hit it
· When you make an architectural decision, save the WHY in a project memory, not just the what

── Domain-specific rules (fill in your own) ─────────────

<e.g., "Real device hardware testing required before any tagged release">
<e.g., "Releases ship to N distribution channels: A, B, C">
<e.g., "Test database is NEVER mocked in integration tests">

── Bootstrapping ─────────────────────────────────────────

After you've set the memory system up:

1. Tell me exactly what files you created (list them).
2. Ask me about anything project-specific you need to write the first project_overview.md and user_<myname>.md.
3. Confirm by reading back to me: the GATE list, the standing rules, and one sentence about what you understand the project to be.
4. STOP and wait for me to give the first real task. Don't write any product code yet.

Today's first task: <one-line description of what you want to build first>.

Don't auto-pick the next task after finishing; propose one with clear scope, get my approval, then start.

Session zero. Adapt the bracketed placeholders. Everything else is universal.
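The memory layout the prompt describes, one file per memory plus an index with one line per file, can be sketched as two pure functions. The index line format (`- filename: description`) is an assumption; the frontmatter fields and the four types come from the prompt itself.

```python
MEMORY_TYPES = {"user", "feedback", "project", "reference"}


def memory_file_text(name: str, description: str, mem_type: str, body: str) -> str:
    """Build the text of one memory file: frontmatter first, then the body."""
    if mem_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {mem_type}")
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        f"type: {mem_type}\n"
        "---\n\n"
        f"{body.strip()}\n"
    )


def update_index(index_text: str, filename: str, description: str) -> str:
    """Return MEMORY.md text containing exactly one line for `filename`."""
    prefix = f"- {filename}:"
    # Drop blank lines and any earlier entry for this file, then append the new one.
    kept = [
        line for line in index_text.splitlines()
        if line and not line.startswith(prefix)
    ]
    kept.append(f"{prefix} {description}")
    return "\n".join(kept) + "\n"
```

Keeping the index update separate from the file body makes it hard to put memory content into MEMORY.md by accident, which is the constraint the prompt insists on.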

How Memory Grows

You cannot write the full rule set on day one.

Because you do not yet know what is going to bite you. The memory system grows organically, in two ways, and the trick is to recognize the moments when accretion is happening.

From corrections. Every time you tell the agent "no, don't do that" or "stop doing X" — that is a feedback memory waiting to be saved. One project I work on has a rule against launching the development binary with the user's active display, because doing so once put a window onscreen and clobbered the running production app. One incident, one rule, never again.

From validations. Every time you tell the agent "yes, that was the right call" on a non-obvious choice — that is also a feedback memory. These are quieter, easier to miss, and equally important. Without them, the agent drifts away from approaches you have already approved, and you end up re-litigating decisions weekly.

Project memories accumulate similarly. You do not write them up front. You write them when you make an architectural decision worth not re-litigating in the next session. The trigger is simple: this should not need to be discovered twice.

Universal vs Specific

What belongs in the starter prompt, and what does not.

The rules in this manual are universal. They should exist on every project where Claude Code does multi-session engineering work, regardless of language, framework, or domain.

Some rules are not universal and should not be in the starter prompt. A hardware-test gate only matters if you have hardware in the loop. A multi-endpoint release flow only matters if you ship to multiple distribution channels. A polish-pass discipline only matters if you want a deliberate "feature complete" pass over each shipped feature. A library-version-lockstep rule only matters if you publish a library.

These belong in the memory system once the project grows into needing them — not before. The starter prompt is scaffolding; the project-specific rules are the building. Trying to write them up front produces a generic checklist that nobody follows; letting them accrete from real incident produces a memory system that the agent and the engineer both trust.


The Deeper Principle

Treat the agent as a long-running collaborator that has amnesia.

The discipline that makes any of this work is not any single rule. It is the meta-discipline of internalizing one fact and letting it shape everything else: the agent has amnesia, and it is your job to bridge the gap.

Once that fact is fully internalized — really internalized, not just nodded at — the rules write themselves. Anything that should survive amnesia goes in memory. Anything that bridges between sessions goes in the handoff. Anything that is a reflex you want the agent to have goes in a feedback memory. Anything that is true today and false next week goes nowhere; let it stay in the conversation and die there.

The project this manual is drawn from has roughly thirty memory files now. On day one it had zero. Each one was earned by an incident — a bug shipped, a correction given, a validation made explicit, an architectural decision worth not re-explaining. That accretion is the actual product. The starter prompt above is just enough scaffolding to make the accretion happen.

If the discipline feels heavy at first, that is because it is replacing something that used to be free: the assumption that your tools remember you. They do not. But once you start writing the memory yourself, the agent starts to feel less like a chatbot and more like the long-running engineering collaborator you wanted in the first place.
