feat(writing-skills): refactor skill to gold standard architecture

This commit is contained in:
Antigravity Agent
2026-01-29 15:12:41 -03:00
parent f2c3e783b5
commit 10ca106b79
16 changed files with 1966 additions and 698 deletions

View File

@@ -1,737 +1,125 @@
---
name: writing-skills
description: Use when creating, updating, or verifying agent skills before deployment.
metadata:
  category: meta
  author: ozy
  triggers: new skill, create skill, update skill, skill documentation, skill template, agent skill, writing skill
  references: anti-rationalization, cso, standards, templates, testing, tier-1-simple, tier-2-expanded, tier-3-platform
---
# Writing Skills
## Overview
Dispatcher for skill creation excellence. Use the decision tree below to find the right template and standards.
**Writing skills IS Test-Driven Development applied to process documentation.** You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
## ⚡ Quick Decision Tree
**Personal skills live in agent-specific directories (`~/.claude/skills` for Claude Code, `~/.codex/skills` for Codex).**
### What do you need to do?
1. **Create a NEW skill:**
   - Is it simple (single file, <200 lines)? → [Tier 1 Architecture](references/tier-1-simple/README.md)
   - Is it complex (multi-concept, 200-1000 lines)? → [Tier 2 Architecture](references/tier-2-expanded/README.md)
   - Is it a massive platform (10+ products, AWS, Convex)? → [Tier 3 Architecture](references/tier-3-platform/README.md)
2. **Improve an EXISTING skill:**
   - Fix "it's too long" → [Modularize (Tier 3)](references/templates/tier-3-platform.md)
   - Fix "AI ignores rules" → [Anti-Rationalization](references/anti-rationalization/README.md)
   - Fix "users can't find it" → [CSO (Search Optimization)](references/cso/README.md)
3. **Verify Compliance:**
   - Check metadata/naming → [Standards](references/standards/README.md)
   - Add tests → [Testing Guide](references/testing/README.md)

**Core principle:** If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
**REQUIRED BACKGROUND:** You MUST understand superpowers:test-driven-development before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation.
**Official guidance:** For Anthropic's official skill authoring best practices, see anthropic-best-practices.md. This document provides additional patterns and guidelines that complement the TDD-focused approach in this skill.
## 📚 Component Index
| Component | Purpose |
|-----------|---------|
| **[CSO](references/cso/README.md)** | "SEO for LLMs". How to write descriptions that trigger. |
| **[Standards](references/standards/README.md)** | File naming, YAML frontmatter, directory structure. |
| **[Anti-Rationalization](references/anti-rationalization/README.md)** | How to write rules that agents won't ignore. |
| **[Testing](references/testing/README.md)** | How to ensure your skill actually works. |
## What is a Skill?
A **skill** is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches.
**Skills are:** Reusable techniques, patterns, tools, reference guides
**Skills are NOT:** Narratives about how you solved a problem once
## 🛠️ Templates
## TDD Mapping for Skills
| TDD Concept | Skill Creation |
| ----------------------- | ------------------------------------------------ |
| **Test case** | Pressure scenario with subagent |
| **Production code** | Skill document (SKILL.md) |
| **Test fails (RED)** | Agent violates rule without skill (baseline) |
| **Test passes (GREEN)** | Agent complies with skill present |
| **Refactor** | Close loopholes while maintaining compliance |
| **Write test first** | Run baseline scenario BEFORE writing skill |
| **Watch it fail** | Document exact rationalizations agent uses |
| **Minimal code** | Write skill addressing those specific violations |
| **Watch it pass** | Verify agent now complies |
| **Refactor cycle** | Find new rationalizations → plug → re-verify |
The entire skill creation process follows RED-GREEN-REFACTOR.
## When to Create a Skill
**Create when:**
- Technique wasn't intuitively obvious to you
- You'd reference this again across projects
- Pattern applies broadly (not project-specific)
- Others would benefit
**Don't create for:**
- One-off solutions
- Standard practices well-documented elsewhere
- Project-specific conventions (put in CLAUDE.md)
- Mechanical constraints (if it's enforceable with regex/validation, automate it—save documentation for judgment calls)
## Skill Types
### Technique
Concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
### Pattern
Way of thinking about problems (flatten-with-flags, test-invariants)
### Reference
API docs, syntax guides, tool documentation (office docs)
## Directory Structure
```
skills/
  skill-name/
    SKILL.md            # Main reference (required)
    supporting-file.*   # Only if needed
```
**Flat namespace** - all skills in one searchable namespace
**Separate files for:**
1. **Heavy reference** (100+ lines) - API docs, comprehensive syntax
2. **Reusable tools** - Scripts, utilities, templates
**Keep inline:**
- Principles and concepts
- Code patterns (< 50 lines)
- Everything else
## Set Appropriate Degrees of Freedom
Match the level of specificity to the task's fragility and variability:
- **High freedom (text-based instructions)**: Use when multiple approaches are valid or decisions depend on context.
- **Medium freedom (pseudocode or scripts with parameters)**: Use when a preferred pattern exists but some variation is acceptable.
- **Low freedom (specific scripts, no-context instructions)**: Use when operations are fragile, error-prone, or consistency is critical.
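The middle tier can be sketched as a script with parameters: the pattern (directory plus `SKILL.md`) is fixed, while parameters absorb the acceptable variation. The helper below is purely illustrative — `create_skill_dir` is not part of any real tool.

```python
from pathlib import Path

def create_skill_dir(root: str, name: str, with_references: bool = False) -> Path:
    """Medium-freedom scaffold: the layout is fixed, callers vary only parameters."""
    skill = Path(root) / name
    target = skill / "references" if with_references else skill
    target.mkdir(parents=True, exist_ok=True)  # creates the skill dir too
    (skill / "SKILL.md").touch()
    return skill
```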
## Progressive Disclosure
Manage context efficiently by splitting detailed information into separate files:
1. **Metadata (name + description)**: Always loaded for discovery.
2. **SKILL.md body**: Core workflow and high-level guidance. Keep under 500 lines.
3. **Bundled resources**:
   - `scripts/`: Deterministic code/logic.
   - `references/`: Detailed schemas, API docs, or domain knowledge.
   - `assets/`: Templates, images, or static files.
**Pattern**: Link to advanced content or variant-specific details (e.g., `aws.md` vs `gcp.md`) from the main `SKILL.md`.
## SKILL.md Structure
**Frontmatter (YAML):**
- Only two fields supported: `name` and `description`
- Max 1024 characters total
- `name`: Use letters, numbers, and hyphens only (no parentheses, special chars)
- `description`: Third-person, describes ONLY when to use (NOT what it does)
- Start with "Use when..." to focus on triggering conditions
- Include specific symptoms, situations, and contexts
- **NEVER summarize the skill's process or workflow** (see CSO section for why)
- Keep under 500 characters if possible
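These frontmatter rules can be linted mechanically. A minimal sketch, assuming the conventions listed above (not an official validator):

```python
import re

def check_frontmatter(skill_md: str) -> list[str]:
    """Return rule violations for a SKILL.md's YAML frontmatter."""
    problems = []
    m = re.match(r"^---\n(.*?)\n---\n", skill_md, re.DOTALL)
    if not m:
        return ["missing YAML frontmatter"]
    fm = m.group(1)
    if len(fm) > 1024:
        problems.append("frontmatter exceeds 1024 characters")
    name = re.search(r"^name:\s*(.+?)\s*$", fm, re.MULTILINE)
    if not name or not re.fullmatch(r"[A-Za-z0-9-]+", name.group(1)):
        problems.append("name must use only letters, numbers, and hyphens")
    desc = re.search(r"^description:\s*(.+?)\s*$", fm, re.MULTILINE)
    if not desc or not desc.group(1).startswith("Use when"):
        problems.append('description should start with "Use when..."')
    return problems
```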
```markdown
---
name: skill-name-with-hyphens
description: Use when [specific triggering conditions and symptoms]
---
# Skill Name

## Overview
What is this? Core principle in 1-2 sentences.
```
- [Technique Skill](references/templates/technique.md) (How-to)
- [Reference Skill](references/templates/reference.md) (Docs)
- [Discipline Skill](references/templates/discipline.md) (Rules)
- [Pattern Skill](references/templates/pattern.md) (Design Patterns)
## When to Use
[Small inline flowchart IF decision non-obvious]
- Creating a NEW skill from scratch
- Improving an EXISTING skill that agents ignore
- Debugging why a skill isn't being triggered
- Standardizing skills across a team
Bullet list with SYMPTOMS and use cases
When NOT to use
## How It Works
1. **Identify goal** → Use decision tree above
2. **Select template** → From `references/templates/`
3. **Apply CSO** → Optimize description for discovery
4. **Add anti-rationalization** → For discipline skills
5. **Test** → RED-GREEN-REFACTOR cycle
## Core Pattern (for techniques/patterns)
Before/after code comparison
## Quick Example
```markdown
---
name: my-technique
description: Use when [specific symptom occurs].
metadata:
  category: technique
  triggers: error-text, symptom, tool-name
---
# My Technique

## When to Use
- [Symptom A]
- [Error message]

## Implementation
Inline code for simple patterns; link to a file for heavy reference or reusable tools.
```
## Quick Reference
Table or bullets for scanning common operations
## Common Mistakes
What goes wrong + fixes:
| Mistake | Fix |
|---------|-----|
| Description summarizes workflow | Use "Use when..." triggers only |
| No `metadata.triggers` | Add 3+ keywords |
| Generic name ("helper") | Use gerund (`creating-skills`) |
| Long monolithic SKILL.md | Split into `references/` |
See [gotchas.md](gotchas.md) for more.
## Real-World Impact (optional)
Concrete results
## ✅ Pre-Deploy Checklist
Before deploying any skill:
- [ ] `name` field matches directory name exactly
- [ ] `SKILL.md` filename is ALL CAPS
- [ ] Description starts with "Use when..."
- [ ] `metadata.triggers` has 3+ keywords
- [ ] Total lines < 500 (use `references/` for more)
- [ ] No `@` force-loading in cross-references
- [ ] Tested with real scenarios
## 🔗 Related Skills
- **[opencode-expert](skill://opencode-expert)**: For OpenCode environment configuration
- Use `/write-skill` command for guided skill creation
## Claude Search Optimization (CSO)
**Critical for discovery:** Future Claude needs to FIND your skill
### 1. Rich Description Field
**Purpose:** Claude reads description to decide which skills to load for a given task. Make it answer: "Should I read this skill right now?"
**Format:** Start with "Use when..." to focus on triggering conditions
**CRITICAL: Description = When to Use, NOT What the Skill Does**
The description should ONLY describe triggering conditions. Do NOT summarize the skill's process or workflow in the description.
**Why this matters:** Testing revealed that when a description summarizes the skill's workflow, Claude may follow the description instead of reading the full skill content. A description saying "code review between tasks" caused Claude to do ONE review, even though the skill's flowchart clearly showed TWO reviews (spec compliance then code quality).
When the description was changed to just "Use when executing implementation plans with independent tasks" (no workflow summary), Claude correctly read the flowchart and followed the two-stage review process.
**The trap:** Descriptions that summarize workflow create a shortcut Claude will take. The skill body becomes documentation Claude skips.
```yaml
# ❌ BAD: Summarizes workflow - Claude may follow this instead of reading skill
description: Use when executing plans - dispatches subagent per task with code review between tasks
# ❌ BAD: Too much process detail
description: Use for TDD - write test first, watch it fail, write minimal code, refactor
# ✅ GOOD: Just triggering conditions, no workflow summary
description: Use when executing implementation plans with independent tasks in the current session
# ✅ GOOD: Triggering conditions only
description: Use when implementing any feature or bugfix, before writing implementation code
```
**Content:**
- Use concrete triggers, symptoms, and situations that signal this skill applies
- Describe the _problem_ (race conditions, inconsistent behavior) not _language-specific symptoms_ (setTimeout, sleep)
- Keep triggers technology-agnostic unless the skill itself is technology-specific
- If skill is technology-specific, make that explicit in the trigger
- Write in third person (injected into system prompt)
- **NEVER summarize the skill's process or workflow**
```yaml
# ❌ BAD: Too abstract, vague, doesn't include when to use
description: For async testing
# ❌ BAD: First person
description: I can help you with async tests when they're flaky
# ❌ BAD: Mentions technology but skill isn't specific to it
description: Use when tests use setTimeout/sleep and are flaky
# ✅ GOOD: Starts with "Use when", describes problem, no workflow
description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently
# ✅ GOOD: Technology-specific skill with explicit trigger
description: Use when using React Router and handling authentication redirects
```
### 2. Keyword Coverage
Use words Claude would search for:
- Error messages: "Hook timed out", "ENOTEMPTY", "race condition"
- Symptoms: "flaky", "hanging", "zombie", "pollution"
- Synonyms: "timeout/hang/freeze", "cleanup/teardown/afterEach"
- Tools: Actual commands, library names, file types
### 3. Descriptive Naming
**Use active voice, verb-first:**
- `creating-skills` not `skill-creation`
- `condition-based-waiting` not `async-test-helpers`
### 4. Token Efficiency (Critical)
**Problem:** getting-started and frequently-referenced skills load into EVERY conversation. Every token counts.
**Target word counts:**
- getting-started workflows: <150 words each
- Frequently-loaded skills: <200 words total
- Other skills: <500 words (still be concise)
**Techniques:**
**Move details to tool help:**
```markdown
# ❌ BAD: Document all flags in SKILL.md
search-conversations supports --text, --both, --after DATE, --before DATE, --limit N

# ✅ GOOD: Reference --help
search-conversations supports multiple modes and filters. Run --help for details.
```
**Create a Tier 1 skill:**
```bash
mkdir -p ~/.config/opencode/skills/my-technique
touch ~/.config/opencode/skills/my-technique/SKILL.md
```
**Use cross-references:**
```markdown
# ❌ BAD: Repeat workflow details
When searching, dispatch subagent with template...
[20 lines of repeated instructions]
# ✅ GOOD: Reference other skill
Always use subagents (50-100x context savings). REQUIRED: Use [other-skill-name] for workflow.
```
**Compress examples:**
```markdown
# ❌ BAD: Verbose example (42 words)
your human partner: "How did we handle authentication errors in React Router before?"
You: I'll search past conversations for React Router authentication patterns.
[Dispatch subagent with search query: "React Router authentication error handling 401"]
# ✅ GOOD: Minimal example (20 words)
Partner: "How did we handle auth errors in React Router?"
You: Searching...
[Dispatch subagent → synthesis]
```
**Eliminate redundancy:**
- Don't repeat what's in cross-referenced skills
- Don't explain what's obvious from command
- Don't include multiple examples of same pattern
**Verification:**
```bash
wc -w skills/path/SKILL.md
# getting-started workflows: aim for <150 each
# Other frequently-loaded: aim for <200 total
```
**Create a Tier 2 skill:**
```bash
mkdir -p ~/.config/opencode/skills/my-skill/references/core
touch ~/.config/opencode/skills/my-skill/{SKILL.md,gotchas.md}
touch ~/.config/opencode/skills/my-skill/references/core/README.md
```
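The word-count check can run across every skill at once. A sketch, assuming a `skills/<name>/SKILL.md` layout:

```python
from pathlib import Path

def over_budget(root: str, limit: int = 500) -> list[tuple[str, int]]:
    """List skills whose SKILL.md word count exceeds the budget."""
    offenders = []
    for f in sorted(Path(root).glob("*/SKILL.md")):
        words = len(f.read_text().split())
        if words > limit:
            offenders.append((f.parent.name, words))
    return offenders
```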
**Name by what you DO or core insight:**
- `condition-based-waiting` > `async-test-helpers`
- `using-skills` not `skill-usage`
- `flatten-with-flags` > `data-structure-refactoring`
- `root-cause-tracing` > `debugging-techniques`
**Gerunds (-ing) work well for processes:**
- `creating-skills`, `testing-skills`, `debugging-with-logs`
- Active, describes the action you're taking
### 5. Cross-Referencing Other Skills
**When writing documentation that references other skills:**
Use skill name only, with explicit requirement markers:
- ✅ Good: `**REQUIRED SUB-SKILL:** Use superpowers:test-driven-development`
- ✅ Good: `**REQUIRED BACKGROUND:** You MUST understand superpowers:systematic-debugging`
- ❌ Bad: `See skills/testing/test-driven-development` (unclear if required)
- ❌ Bad: `@skills/testing/test-driven-development/SKILL.md` (force-loads, burns context)
**Why no @ links:** `@` syntax force-loads files immediately, consuming 200k+ context before you need them.
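Accidental force-loads are easy to catch before committing. A quick sketch; the path pattern is an assumption about how `@` references look in these docs:

```python
import re

def find_force_loads(markdown: str) -> list[str]:
    """Flag @path references that would force-load files into context."""
    return re.findall(r"@[\w./-]+\.(?:md|dot)", markdown)
```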
## Flowchart Usage
```dot
digraph when_flowchart {
  "Need to show information?" [shape=diamond];
  "Decision where I might go wrong?" [shape=diamond];
  "Use markdown" [shape=box];
  "Small inline flowchart" [shape=box];

  "Need to show information?" -> "Decision where I might go wrong?" [label="yes"];
  "Decision where I might go wrong?" -> "Small inline flowchart" [label="yes"];
  "Decision where I might go wrong?" -> "Use markdown" [label="no"];
}
```
**Use flowcharts ONLY for:**
- Non-obvious decision points
- Process loops where you might stop too early
- "When to use A vs B" decisions
**Never use flowcharts for:**
- Reference material → Tables, lists
- Code examples → Markdown blocks
- Linear instructions → Numbered lists
- Labels without semantic meaning (step1, helper2)
See graphviz-conventions.dot for graphviz style rules.
**Visualizing for your human partner:** Use `render-graphs.js` in this directory to render a skill's flowcharts to SVG:
```bash
./render-graphs.js ../some-skill # Each diagram separately
./render-graphs.js ../some-skill --combine # All diagrams in one SVG
```
## Code Examples
**One excellent example beats many mediocre ones**
Choose most relevant language:
- Testing techniques → TypeScript/JavaScript
- System debugging → Shell/Python
- Data processing → Python
**Good example:**
- Complete and runnable
- Well-commented explaining WHY
- From real scenario
- Shows pattern clearly
- Ready to adapt (not generic template)
**Don't:**
- Implement in 5+ languages
- Create fill-in-the-blank templates
- Write contrived examples
You're good at porting - one great example is enough.
## File Organization
### Self-Contained Skill
```
defense-in-depth/
  SKILL.md              # Everything inline
```
When: All content fits, no heavy reference needed
### Skill with Reusable Tool
```
condition-based-waiting/
  SKILL.md              # Overview + patterns
  example.ts            # Working helpers to adapt
```
When: Tool is reusable code, not just narrative
### Skill with Heavy Reference
```
pptx/
  SKILL.md              # Overview + workflows
  pptxgenjs.md          # 600 lines API reference
  ooxml.md              # 500 lines XML structure
  scripts/              # Executable tools
```
When: Reference material too large for inline
## The Iron Law (Same as TDD)
```
NO SKILL WITHOUT A FAILING TEST FIRST
```
This applies to NEW skills AND EDITS to existing skills.
Write skill before testing? Delete it. Start over.
Edit skill without testing? Same violation.
**No exceptions:**
- Not for "simple additions"
- Not for "just adding a section"
- Not for "documentation updates"
- Don't keep untested changes as "reference"
- Don't "adapt" while running tests
- Delete means delete
**REQUIRED BACKGROUND:** The superpowers:test-driven-development skill explains why this matters. Same principles apply to documentation.
## Testing All Skill Types
Different skill types need different test approaches:
### Discipline-Enforcing Skills (rules/requirements)
**Examples:** TDD, verification-before-completion, designing-before-coding
**Test with:**
- Academic questions: Do they understand the rules?
- Pressure scenarios: Do they comply under stress?
- Multiple pressures combined: time + sunk cost + exhaustion
- Identify rationalizations and add explicit counters
**Success criteria:** Agent follows rule under maximum pressure
### Technique Skills (how-to guides)
**Examples:** condition-based-waiting, root-cause-tracing, defensive-programming
**Test with:**
- Application scenarios: Can they apply the technique correctly?
- Variation scenarios: Do they handle edge cases?
- Missing information tests: Do instructions have gaps?
**Success criteria:** Agent successfully applies technique to new scenario
### Pattern Skills (mental models)
**Examples:** reducing-complexity, information-hiding concepts
**Test with:**
- Recognition scenarios: Do they recognize when pattern applies?
- Application scenarios: Can they use the mental model?
- Counter-examples: Do they know when NOT to apply?
**Success criteria:** Agent correctly identifies when/how to apply pattern
### Reference Skills (documentation/APIs)
**Examples:** API documentation, command references, library guides
**Test with:**
- Retrieval scenarios: Can they find the right information?
- Application scenarios: Can they use what they found correctly?
- Gap testing: Are common use cases covered?
**Success criteria:** Agent finds and correctly applies reference information
## Common Rationalizations for Skipping Testing
| Excuse | Reality |
| ------------------------------ | ---------------------------------------------------------------- |
| "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. |
| "It's just a reference" | References can have gaps, unclear sections. Test retrieval. |
| "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. |
| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
| "Too tedious to test" | Testing is less tedious than debugging bad skill in production. |
| "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. |
| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
| "No time to test" | Deploying untested skill wastes more time fixing it later. |
**All of these mean: Test before deploying. No exceptions.**
## Bulletproofing Skills Against Rationalization
Skills that enforce discipline (like TDD) need to resist rationalization. Agents are smart and will find loopholes when under pressure.
**Psychology note:** Understanding WHY persuasion techniques work helps you apply them systematically. See persuasion-principles.md for research foundation (Cialdini, 2021; Meincke et al., 2025) on authority, commitment, scarcity, social proof, and unity principles.
### Close Every Loophole Explicitly
Don't just state the rule - forbid specific workarounds:
<Bad>
```markdown
Write code before test? Delete it.
```
</Bad>
<Good>
```markdown
Write code before test? Delete it. Start over.
**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
```
</Good>
### Address "Spirit vs Letter" Arguments
Add foundational principle early:
```markdown
**Violating the letter of the rules is violating the spirit of the rules.**
```
This cuts off entire class of "I'm following the spirit" rationalizations.
### Build Rationalization Table
Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
```markdown
| Excuse | Reality |
| -------------------------------- | ----------------------------------------------------------------------- |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
```
### Create Red Flags List
Make it easy for agents to self-check when rationalizing:
```markdown
## Red Flags - STOP and Start Over
- Code before test
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "This is different because..."
**All of these mean: Delete code. Start over with TDD.**
```
### Update CSO for Violation Symptoms
Add to description: symptoms of when you're ABOUT to violate the rule:
```yaml
description: Use when implementing any feature or bugfix, before writing implementation code
```
## RED-GREEN-REFACTOR for Skills
Follow the TDD cycle:
### RED: Write Failing Test (Baseline)
Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:
- What choices did they make?
- What rationalizations did they use (verbatim)?
- Which pressures triggered violations?
This is "watch the test fail" - you must see what agents naturally do before writing the skill.
### GREEN: Write Minimal Skill
Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
Run same scenarios WITH skill. Agent should now comply.
### REFACTOR: Close Loopholes
Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
**Testing methodology:** See testing-skills-with-subagents.md for the complete testing methodology:
- How to write pressure scenarios
- Pressure types (time, sunk cost, authority, exhaustion)
- Plugging holes systematically
- Meta-testing techniques
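When grading baseline transcripts, a simple phrase scan helps tally rationalizations across iterations. A sketch; the red-flag list below is illustrative, not exhaustive:

```python
def find_rationalizations(transcript: str, red_flags: list[str]) -> list[str]:
    """Return the red-flag phrases that appear in an agent transcript."""
    lower = transcript.lower()
    return [flag for flag in red_flags if flag.lower() in lower]

# Example red flags, drawn from the rationalization tables in this skill
RED_FLAGS = [
    "i already manually tested",
    "too simple to test",
    "this is different because",
]
```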
## Anti-Patterns
### ❌ Narrative Example
"In session 2025-10-03, we found empty projectDir caused..."
**Why bad:** Too specific, not reusable
### ❌ Multi-Language Dilution
example-js.js, example-py.py, example-go.go
**Why bad:** Mediocre quality, maintenance burden
### ❌ Code in Flowcharts
```dot
step1 [label="import fs"];
step2 [label="read file"];
```
**Why bad:** Can't copy-paste, hard to read
### ❌ Generic Labels
helper1, helper2, step3, pattern4
**Why bad:** Labels should have semantic meaning
## STOP: Before Moving to Next Skill
**After writing ANY skill, you MUST STOP and complete the deployment process.**
**Do NOT:**
- Create multiple skills in batch without testing each
- Move to next skill before current one is verified
- Skip testing because "batching is more efficient"
**The deployment checklist below is MANDATORY for EACH skill.**
Deploying untested skills = deploying untested code. It's a violation of quality standards.
## Skill Creation Checklist (TDD Adapted)
**IMPORTANT: Use TodoWrite to create todos for EACH checklist item below.**
**RED Phase - Write Failing Test:**
- [ ] Create pressure scenarios (3+ combined pressures for discipline skills)
- [ ] Run scenarios WITHOUT skill - document baseline behavior verbatim
- [ ] Identify patterns in rationalizations/failures
**GREEN Phase - Write Minimal Skill:**
- [ ] Name uses only letters, numbers, hyphens (no parentheses/special chars)
- [ ] YAML frontmatter with only name and description (max 1024 chars)
- [ ] Description starts with "Use when..." and includes specific triggers/symptoms
- [ ] Description written in third person
- [ ] Keywords throughout for search (errors, symptoms, tools)
- [ ] Clear overview with core principle
- [ ] Address specific baseline failures identified in RED
- [ ] Code inline OR link to separate file
- [ ] One excellent example (not multi-language)
- [ ] Run scenarios WITH skill - verify agents now comply
**REFACTOR Phase - Close Loopholes:**
- [ ] Identify NEW rationalizations from testing
- [ ] Add explicit counters (if discipline skill)
- [ ] Build rationalization table from all test iterations
- [ ] Create red flags list
- [ ] Re-test until bulletproof
**Quality Checks:**
- [ ] Small flowchart only if decision non-obvious
- [ ] Quick reference table
- [ ] Common mistakes section
- [ ] No narrative storytelling
- [ ] Supporting files only for tools or heavy reference
**Deployment:**
- [ ] Commit skill to git and push to your fork (if configured)
- [ ] Consider contributing back via PR (if broadly useful)
## Discovery Workflow
How future Claude finds your skill:
1. **Encounters problem** ("tests are flaky")
2. **Finds SKILL** (description matches)
3. **Scans overview** (is this relevant?)
4. **Reads patterns** (quick reference table)
5. **Loads example** (only when implementing)
**Optimize for this flow** - put searchable terms early and often.
## The Bottom Line
**Creating skills IS TDD for process documentation.**
Same Iron Law: No skill without failing test first.
Same cycle: RED (baseline) → GREEN (write skill) → REFACTOR (close loopholes).
Same benefits: Better quality, fewer surprises, bulletproof results.
If you follow TDD for code, follow it for skills. It's the same discipline applied to documentation.

View File

@@ -0,0 +1,282 @@
# Skill Templates & Examples
Complete, copy-paste templates for each skill type.
---
## Template: Technique Skill
For how-to guides that teach a specific method.
```markdown
---
name: technique-name
description: >-
Use when [specific symptom].
metadata:
category: technique
triggers: error-text, symptom, tool-name
---
# Technique Name
## Overview
[1-2 sentence core principle]
## When to Use
- [Symptom A]
- [Symptom B]
- [Error message text]
**NOT for:**
- [When to avoid]
## The Problem
```javascript
// Bad example
function badCode() {
  // problematic pattern
}
```
## The Solution
```javascript
// Good example
function goodCode() {
  // improved pattern
}
```
## Step-by-Step
1. [First step]
2. [Second step]
3. [Final step]
## Quick Reference
| Scenario | Approach |
|----------|----------|
| Case A | Solution A |
| Case B | Solution B |
## Common Mistakes
**Mistake 1:** [Description]
- Wrong: `bad code`
- Right: `good code`
```
---
## Template: Reference Skill
For documentation, APIs, and lookup tables.
```markdown
---
name: reference-name
description: >-
  Use when working with [domain].
metadata:
  category: reference
  triggers: tool, api, specific-terms
---
# Reference Name
## Quick Reference
| Command | Purpose |
|---------|---------|
| `cmd1` | Does X |
| `cmd2` | Does Y |
## Common Patterns
**Pattern A:**
```bash
example command
```
**Pattern B:**
```bash
another example
```
## Detailed Docs
For more options, run `--help` or see:
- [patterns.md](patterns.md)
- [examples.md](examples.md)
```
---
## Template: Discipline Skill
For rules that agents must follow. Requires anti-rationalization techniques.
```markdown
---
name: discipline-name
description: >-
  Use when [BEFORE violation].
metadata:
  category: discipline
  triggers: new feature, code change, implementation
---
# Rule Name
## Iron Law
**[SINGLE SENTENCE ABSOLUTE RULE]**
Violating the letter IS violating the spirit.
## The Rule
1. ALWAYS [step 1]
2. NEVER [step 2]
3. [Step 3]
## Violations
[Action before rule]? **Delete it. Start over.**
**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it
- Delete means delete
## Common Rationalizations
| Excuse | Reality |
|--------|---------|
| "Too simple" | Simple code breaks. Rule takes 30 seconds. |
| "I'll do it after" | After = never. Do it now. |
| "Spirit not ritual" | The ritual IS the spirit. |
## Red Flags - STOP
- [Flag 1]
- [Flag 2]
- "This is different because..."
**All mean:** Delete. Start over.
## Valid Exceptions
- [Exception 1]
- [Exception 2]
**Everything else:** Follow the rule.
```
---
## Template: Pattern Skill
For mental models and design patterns.
```markdown
---
name: pattern-name
description: >-
  Use when [recognizable symptom].
metadata:
  category: pattern
  triggers: complexity, hard-to-follow, nested
---
# Pattern Name
## The Pattern
[1-2 sentence core idea]
## Recognition Signs
- [Sign that pattern applies]
- [Another sign]
- [Code smell]
## Before
```typescript
// Complex/problematic
function before() {
  // nested, confusing
}
```
## After
```typescript
// Clean/improved
function after() {
  // flat, clear
}
```
## When NOT to Use
- [Over-engineering case]
- [Simple case that doesn't need it]
## Impact
**Before:** [Problem metric]
**After:** [Improved metric]
```
---
## Real Example: Condition-Based Waiting
```yaml
---
name: condition-based-waiting
description: >-
  Use when tests have race conditions or timing dependencies.
metadata:
  category: technique
  triggers: flaky tests, timeout, race condition, sleep, setTimeout
---
```
```markdown
# Condition-Based Waiting
## Overview
Replace `sleep(ms)` with `waitFor(() => condition)`.
## When to Use
- Tests pass sometimes, fail other times
- Tests use `sleep()` or `setTimeout()`
- "Works on my machine"
## The Fix
```typescript
// ❌ Bad
await sleep(2000);
expect(element).toBeVisible();
// ✅ Good
await waitFor(() => element.isVisible(), { timeout: 5000 });
expect(element).toBeVisible();
```
## Impact
- Flaky tests: 15/100 → 0/100
- Speed: 40% faster (no over-waiting)
```

View File

@@ -0,0 +1,197 @@
---
description: Common pitfalls and tribal knowledge for skill creation.
metadata:
  tags: [gotchas, troubleshooting, mistakes]
---
# Skill Writing Gotchas
Tribal knowledge to avoid common mistakes.
## YAML Frontmatter
### Invalid Syntax
```yaml
# ❌ BAD: Mixed list and map
metadata:
  references:
    triggers: a, b, c
    - item1
    - item2

# ✅ GOOD: Consistent structure
metadata:
  triggers: a, b, c
  references:
    - item1
    - item2
```
### Multiline Description
```yaml
# ❌ BAD: Line breaks create parsing errors
description: Use when creating skills.
Also for updating.
# ✅ GOOD: Use YAML multiline syntax
description: >-
Use when creating or updating skills.
Triggers: new skill, update skill
```
## Naming
### Directory Must Match `name` Field
```
# ❌ BAD
directory: my-skill/
name: mySkill # Mismatch!
# ✅ GOOD
directory: my-skill/
name: my-skill # Exact match
```
### SKILL.md Must Be ALL CAPS
```
# ❌ BAD
skill.md
Skill.md
# ✅ GOOD
SKILL.md
```
## Discovery
### Description = Triggers, NOT Workflow
```yaml
# ❌ BAD: Agent reads this and skips the full skill
description: Analyzes code, finds bugs, suggests fixes
# ✅ GOOD: Agent reads full skill to understand workflow
description: Use when debugging errors or reviewing code quality
```
### Pre-Violation Triggers for Discipline Skills
```yaml
# ❌ BAD: Triggers AFTER violation
description: Use when you forgot to write tests
# ✅ GOOD: Triggers BEFORE violation
description: Use when implementing any feature, before writing code
```
## Token Efficiency
### Skill Loaded Every Conversation = Token Drain
- Frequently-loaded skills: <200 words
- All others: <500 words
- Move details to `references/` files
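These budgets are easy to check mechanically. A minimal Python sketch (the function name and thresholds-as-code are illustrative, not part of any tool; the numbers come from the guideline above):

```python
def word_budget_ok(skill_text: str, frequently_loaded: bool) -> bool:
    """Check a SKILL.md body against the word-count targets above (sketch)."""
    # Frequently-loaded skills burn tokens in every conversation, so the budget is tighter
    budget = 200 if frequently_loaded else 500
    return len(skill_text.split()) <= budget
```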
### Don't Duplicate CLI Help
```markdown
# ❌ BAD: 50 lines documenting all flags
# ✅ GOOD: One line
Run `mytool --help` for all options.
```
## Anti-Rationalization (Discipline Skills Only)
### Agents Are Smart at Finding Loopholes
```markdown
# ❌ BAD: Trust agents will "get the spirit"
Write test before code.
# ✅ GOOD: Close every loophole explicitly
Write test before code.
**No exceptions:**
- Don't keep code as "reference"
- Don't "adapt" existing code
- Delete means delete
```
### Build Rationalization Table
Every excuse from baseline testing goes in the table:
| Excuse | Reality |
|--------|---------|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests-after prove nothing immediately. |
## Cross-References
### Keep References One Level Deep
```markdown
# ❌ BAD: Nested chain (A → B → C)
See [patterns.md] → which links to [advanced.md] → which links to [deep.md]
# ✅ GOOD: Flat (A → B, A → C)
See [patterns.md] and [advanced.md]
```
### Never Force-Load with @
```markdown
# ❌ BAD: Burns context immediately
@skills/my-skill/SKILL.md
# ✅ GOOD: Agent loads when needed
See [my-skill] for details.
```
## OpenCode Integration
### Correct Skill Directory
```bash
# ❌ BAD: Old singular path
~/.config/opencode/skill/my-skill/
# ✅ GOOD: Plural path
~/.config/opencode/skills/my-skill/
```
### Skill Cross-Reference Syntax
```markdown
# ❌ BAD: File path (fragile)
See /home/user/.config/opencode/skills/my-skill/SKILL.md
# ✅ GOOD: Skill protocol
See [my-skill](skill://my-skill)
```
## Tier Selection
### Don't Overthink Tier Choice
```markdown
# ❌ BAD: Starting with Tier 3 "just in case"
# Result: Wasted effort, empty reference files
# ✅ GOOD: Start with Tier 1, upgrade when needed
# Can always add references/ later
```
### Signals You Need to Upgrade
| Signal | Action |
|--------|--------|
| SKILL.md > 200 lines | → Tier 2 |
| 3+ related sub-topics | → Tier 2 |
| 10+ products/services | → Tier 3 |
| "I need X" vs "I want Y" | → Tier 3 decision trees |

# Anti-Rationalization Guide
Techniques for bulletproofing skills against agent rationalization.
## The Problem
Discipline-enforcing skills (like TDD) face a unique challenge: smart agents under pressure will find loopholes.
**Example**: Skill says "Write test first". Agent under deadline thinks:
- "This is too simple to test"
- "I'll test after, same result"
- "It's the spirit that matters, not ritual"
## Psychology Foundation
Understanding WHY persuasion works helps apply it systematically.
**Research basis**: Cialdini (2021), Meincke et al. (2025)
**Core principles**:
- **Authority**: "The TDD community agrees..."
- **Commitment**: "You already said you follow TDD..."
- **Scarcity**: "Missing tests now = bugs later"
- **Social Proof**: "All tested code passing CI proves value"
- **Unity**: "We're engineers who value quality"
## Technique 1: Close Every Loophole Explicitly
Don't just state the rule - forbid specific workarounds.
### Bad Example
```markdown
Write code before test? Delete it.
```
### Good Example
```markdown
Write code before test? Delete it. Start over.
**No exceptions**:
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
```
**Why it works**: Agents try specific workarounds. Counter each explicitly.
## Technique 2: Address "Spirit vs Letter" Arguments
Add foundational principle early:
```markdown
**Violating the letter of the rules is violating the spirit of the rules.**
```
**Why it works**: Cuts off entire class of "I'm following the spirit" rationalizations.
## Technique 3: Build Rationalization Table
Capture excuses from baseline testing. Every rationalization goes in table:
```markdown
| Excuse | Reality |
| -------------------------------- | ----------------------------------------------------------------------- |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "It's about spirit not ritual" | The letter IS the spirit. TDD's value comes from the specific sequence. |
```
**Why it works**: Agents read table, recognize their own thinking, see the counter-argument.
## Technique 4: Create Red Flags List
Make it easy for agents to self-check when rationalizing:
```markdown
## Red Flags - STOP and Start Over
- Code before test
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "This is different because..."
**All of these mean**: Delete code. Start over with TDD.
```
**Why it works**: Simple checklist, clear action (delete & restart).
## Technique 5: Update Description for Violation Symptoms
Add to description: symptoms of when you're ABOUT to violate:
```yaml
# ❌ BAD: Only describes what skill does
description: TDD methodology for writing code
# ✅ GOOD: Includes pre-violation symptoms
description: Use when implementing any feature or bugfix, before writing implementation code
metadata:
triggers: new feature, bug fix, code change
```
**Why it works**: Triggers skill BEFORE violation, not after.
## Technique 6: Use Strong Language
Weak language invites rationalization:
```markdown
# Weak
You should write tests first.
Generally, test before code.
It's better to test first.
# Strong
ALWAYS write test first.
NEVER write code before test.
Test-first is MANDATORY.
```
**Why it works**: No ambiguity, no wiggle room.
## Technique 7: Invoke Commitment & Consistency
Reference agent's own standards:
```markdown
You claimed to follow TDD.
TDD means test-first.
Code-first is NOT TDD.
**Either**:
- Follow TDD (test-first), or
- Admit you're not doing TDD
Don't redefine TDD to fit what you already did.
```
**Why it works**: Agents resist cognitive dissonance (Festinger, 1957).
## Technique 8: Provide Escape Hatch for Legitimate Cases
If there ARE valid exceptions, state them explicitly:
```markdown
## When NOT to Use TDD
- Spike solutions (throwaway exploratory code)
- One-off scripts you'll delete within the hour
- Generated boilerplate (verified via other means)
**Everything else**: Use TDD. No exceptions.
```
**Why it works**: Removes "but this is different" argument for non-exception cases.
## Complete Bulletproofing Checklist
For discipline-enforcing skills:
**Loophole Closing**:
- [ ] Forbade each specific workaround explicitly?
- [ ] Added "spirit vs letter" principle?
- [ ] Built rationalization table from baseline tests?
- [ ] Created red flags list?
**Strength**:
- [ ] Used strong language (ALWAYS/NEVER)?
- [ ] Invoked commitment & consistency?
- [ ] Provided explicit escape hatch?
**Discovery**:
- [ ] Description includes pre-violation symptoms?
- [ ] Keywords target moment BEFORE violation?
**Testing**:
- [ ] Tested with combined pressures?
- [ ] Agent complied under maximum pressure?
- [ ] No new rationalizations found?
## Real-World Example: TDD Skill
### Baseline Rationalizations Found
1. "Too simple to test"
2. "I'll test after"
3. "Spirit not ritual"
4. "Already manually tested"
5. "This is different because..."
### Counters Applied
**Rationalization table**:
```markdown
| Excuse | Reality |
| -------------------- | -------------------------------------------------------------- |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Spirit not ritual" | The letter IS the spirit. TDD's value comes from the sequence. |
| "Manually tested" | Manual tests don't run automatically. They rot. |
```
**Red flags**:
```markdown
## Red Flags - STOP
- Code before test
- "I already tested manually"
- "Spirit not ritual"
- "This is different..."
All mean: Delete code. Start over.
```
**Result**: Agent compliance under combined time + sunk cost pressure.
## Common Mistakes
| Mistake | Fix |
| -------------------------------------- | -------------------------------------------------------------- |
| Trust agents will "get the spirit" | Close explicit loopholes. Agents are smart at rationalization. |
| Use weak language ("should", "better") | Use ALWAYS/NEVER for discipline rules. |
| Skip rationalization table | Every excuse needs explicit counter. |
| No red flags list | Make self-checking easy. |
| Generic description | Add pre-violation symptoms to trigger skill earlier. |
## Meta-Strategy
**For each new rationalization**:
1. Document it verbatim (from failed test)
2. Add to rationalization table
3. Update red flags list
4. Re-test
**Iterate until**: Agent can't find ANY rationalization that works.
That's bulletproof.

# CSO Guide - Claude Search Optimization
Advanced techniques for making skills discoverable by agents.
## The Discovery Problem
You have 100+ skills. Agent receives a task. How does it find the RIGHT skill?
**Answer**: The `description` field.
## Critical Rule: Description = Triggers, NOT Workflow
### The Trap
When description summarizes workflow, agents take a shortcut.
**Real example that failed**:
```yaml
# Agent did ONE review instead of TWO
description: Code review between tasks
# Skill body had flowchart showing TWO reviews:
# 1. Spec compliance
# 2. Code quality
```
**Why it failed**: Agent read description, thought "code review between tasks means one review", never read the flowchart.
**Fix**:
```yaml
# Agent now reads full skill and follows flowchart
description: Use when executing implementation plans with independent tasks
```
### The Pattern
```yaml
# ❌ BAD: Workflow summary
description: Analyzes git diff, generates commit message in conventional format
# ✅ GOOD: Trigger conditions only
description: Use when generating commit messages or reviewing staged changes
```
## Token Efficiency Critical for Skills
**Problem**: Frequently-loaded skills consume tokens in EVERY conversation.
**Target word counts**:
- Frequently-loaded skills: <200 words total
- Other skills: <500 words
### Techniques
**1. Move details to tool help**:
```bash
# ❌ BAD: Document all flags in SKILL.md
search-conversations supports --text, --both, --after DATE, --before DATE, --limit N
# ✅ GOOD: Reference --help
search-conversations supports multiple modes. Run --help for details.
```
**2. Use cross-references**:
```markdown
# ❌ BAD: Repeat workflow
When searching, dispatch agent with template...
[20 lines of repeated instructions]
# ✅ GOOD: Reference other skill
Use subagents for searches. See [delegating-to-subagents] for workflow.
```
**3. Compress examples**:
```markdown
# ❌ BAD: Verbose (42 words)
Partner: "How did we handle auth errors in React Router?"
You: I'll search past conversations for patterns.
[Dispatch subagent with query: "React Router authentication error handling 401"]
# ✅ GOOD: Minimal (20 words)
Partner: "Auth errors in React Router?"
You: Searching...
[Dispatch subagent → synthesis]
```
## Keyword Strategy
### Error Messages
Include EXACT error text users will see:
- "Hook timed out after 5000ms"
- "ENOTEMPTY: directory not empty"
- "jest --watch is not responding"
### Symptoms
Use words users naturally say:
- "flaky", "hangs", "zombie process"
- "slow", "timeout", "race condition"
- "cleanup failed", "pollution"
### Tools & Commands
Actual names, not descriptions:
- "pytest", not "Python testing"
- "git rebase", not "rebasing"
- ".docx files", not "Word documents"
### Synonyms
Cover multiple ways to describe same thing:
- timeout/hang/freeze
- cleanup/teardown/afterEach
- mock/stub/fake
## Naming Conventions
### Gerunds (-ing) for Processes
`creating-skills`, `debugging-with-logs`, `testing-async-code`
### Verb-first for Actions
`flatten-with-flags`, `reduce-complexity`, `trace-root-cause`
### ❌ Avoid
- `skill-creation` (passive, less searchable)
- `async-test-helpers` (too generic)
- `debugging-techniques` (vague)
## Description Template
```yaml
description: "Use when [SPECIFIC TRIGGER]."
metadata:
triggers: [error1], [symptom2], [tool3]
```
**Examples**:
```yaml
# Technique skill
description: "Use when tests have race conditions, timing dependencies, or pass/fail inconsistently."
metadata:
triggers: flaky tests, timeout, race condition
# Pattern skill
description: "Use when complex data structures make code hard to follow."
metadata:
triggers: nested loops, multiple flags, confusing state
# Reference skill
description: "Use when working with React Router and authentication."
metadata:
triggers: 401 redirect, login flow, protected routes
# Discipline skill
description: "Use when implementing any feature or bugfix, before writing implementation code."
metadata:
triggers: new feature, bug fix, code change
```
## Third Person Rule
Description is injected into system prompt. Inconsistent POV breaks discovery.
```yaml
# ❌ BAD: First person
description: "I can help you with async tests"
# ❌ BAD: Second person
description: "You can use this for race conditions"
# ✅ GOOD: Third person
description: "Handles async tests with race conditions"
```
## Cross-Referencing Other Skills
**When documenting a skill that references other skills**:
Use skill name only, with explicit requirement markers:
```markdown
# ✅ GOOD: Clear requirement
**REQUIRED BACKGROUND**: You MUST understand test-driven-development before using this skill.
**REQUIRED SUB-SKILL**: Use defensive-programming for error handling.
# ❌ BAD: Unclear if required
See test-driven-development skill for context.
# ❌ NEVER: Force-loads (burns context)
@skills/testing/test-driven-development/SKILL.md
```
**Why no @ links**: `@` syntax force-loads files immediately, consuming tokens before needed.
## Verification Checklist
Before deploying:
- [ ] Description starts with "Use when..."?
- [ ] Description is <500 characters?
- [ ] Description lists ONLY triggers, not workflow?
- [ ] Includes 3+ keywords (errors/symptoms/tools)?
- [ ] Third person throughout?
- [ ] Name uses gerund or verb-first format?
- [ ] Name has only letters, numbers, hyphens?
- [ ] No @ syntax for cross-references?
- [ ] Word count <200 (frequent) or <500 (other)?
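Most of this checklist can be linted automatically. A heuristic Python sketch (function name and messages are illustrative; the limits are the ones stated above):

```python
def description_problems(description: str) -> list[str]:
    """Flag common CSO issues in a skill description (heuristic sketch)."""
    problems = []
    if not description.startswith("Use when"):
        problems.append('does not start with "Use when..."')
    if len(description) > 500:
        problems.append("longer than 500 characters")
    # Crude third-person check: first/second-person pronouns as standalone words
    words = description.split()
    if "I" in words or "You" in words or "you" in words:
        problems.append("not third person")
    return problems
```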
## Real-World Examples
### Before/After: TDD Skill
**Before** (workflow in description):
```yaml
description: Write test first, watch it fail, write minimal code, refactor
```
Result: Agents followed description, skipped reading full skill.
**After** (triggers only):
```yaml
description: Use when implementing any feature or bugfix, before writing implementation code
```
Result: Agents read full skill, followed complete TDD cycle.
### Before/After: BigQuery Skill
**Before** (too vague):
```yaml
description: Helps with database queries
```
Result: Never loaded (too generic, agents couldn't identify relevance).
**After** (specific triggers):
```yaml
description: Use when analyzing BigQuery data. Triggers: revenue metrics, pipeline data, API usage, campaign attribution.
```
Result: Loads for relevant queries, includes domain keywords.

---
description: Standards and naming rules for creating agent skills.
metadata:
tags: [standards, naming, yaml, structure]
---
# Skill Development Guide
Comprehensive reference for creating effective agent skills.
## Directory Structure
```
~/.config/opencode/skills/
{skill-name}/ # kebab-case, matches `name` field
SKILL.md # Required: main skill definition
references/ # Optional: supporting documentation
README.md # Sub-topic entry point
*.md # Additional files
```
**Project-local alternative:**
```
.agent/skills/{skill-name}/SKILL.md
```
## Naming Rules
| Element | Rule | Example |
|---------|------|---------|
| Directory | kebab-case, 1-64 chars | `react-best-practices` |
| `SKILL.md` | ALL CAPS, exact filename | `SKILL.md` (not `skill.md`) |
| `name` field | Must match directory name | `name: react-best-practices` |
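These rules lend themselves to a quick lint pass. A minimal Python sketch (the function and its messages are illustrative, not an OpenCode API):

```python
def naming_problems(dir_name: str, filenames: list[str], name_field: str) -> list[str]:
    """Flag violations of the naming rules in the table above (sketch)."""
    problems = []
    # SKILL.md must be spelled exactly, in ALL CAPS
    if "SKILL.md" not in filenames:
        problems.append("SKILL.md must be ALL CAPS, exact filename")
    # The frontmatter `name` field must match the directory name exactly
    if name_field != dir_name:
        problems.append(f"name field {name_field!r} must match directory {dir_name!r}")
    return problems
```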
## SKILL.md Structure
```markdown
---
name: {skill-name}
description: >-
Use when [trigger condition].
metadata:
category: technique
triggers: keyword1, keyword2, error-text
---
# Skill Title
Brief description of what this skill does.
## When to Use
- Symptom or situation A
- Symptom or situation B
## How It Works
Step-by-step instructions or reference content.
## Examples
Concrete usage examples.
## Common Mistakes
What to avoid and why.
```
## Description Best Practices
The `description` field is critical for skill discovery:
```yaml
# ❌ BAD: Workflow summary (agent skips reading full skill)
description: Analyzes code, finds bugs, suggests fixes
# ✅ GOOD: Trigger conditions only
description: Use when debugging errors or reviewing code quality.
metadata:
triggers: bug, error, code review
```
**Rules:**
- Start with "Use when..."
- Put triggers under `metadata.triggers`
- Keep under 500 characters
- Use third person (not "I" or "You")
## Context Efficiency
Skills load into context on-demand. Optimize for token usage:
| Guideline | Reason |
|-----------|--------|
| Keep SKILL.md < 500 lines | Reduces context consumption |
| Put details in supporting files | Agent reads only what's needed |
| Use tables for reference data | More compact than prose |
| Link to `--help` for CLI tools | Avoids duplicating docs |
## Supporting Files
For complex skills, use additional files:
```
my-skill/
SKILL.md # Overview + navigation
patterns.md # Detailed patterns
examples.md # Code examples
troubleshooting.md # Common issues
```
**Supporting file frontmatter is required** (for any `.md` besides `SKILL.md`):
```markdown
---
description: >-
Short summary used for search and retrieval.
metadata:
tags: [pattern, troubleshooting, api]
source: internal
---
```
This frontmatter helps the LLM locate the right file when referenced from `SKILL.md`.
Reference from SKILL.md:
```markdown
## Detailed Reference
- [Patterns](patterns.md) - Common usage patterns
- [Examples](examples.md) - Code samples
```
## Skill Types
| Type | Purpose | Example |
|------|---------|---------|
| **Reference** | Documentation, APIs | `bigquery-analysis` |
| **Technique** | How-to guides | `condition-based-waiting` |
| **Pattern** | Mental models | `flatten-with-flags` |
| **Discipline** | Rules to enforce | `test-driven-development` |
## Verification Checklist
Before deploying:
- [ ] `name` matches directory name?
- [ ] `SKILL.md` is ALL CAPS?
- [ ] Description starts with "Use when..."?
- [ ] Triggers listed under metadata?
- [ ] Under 500 lines?
- [ ] Tested with real scenarios?

# SKILL.md Metadata Standard
Official frontmatter fields recognized by OpenCode.
## Required Fields
```yaml
---
name: skill-name
description: >-
Use when [trigger condition].
metadata:
triggers: keyword1, keyword2, error-message
---
```
| Field | Rules |
|-------|-------|
| `name` | 1-64 chars, lowercase, hyphens only, must match directory name |
| `description` | 1-1024 chars, should describe when to use |
## Optional Fields
```yaml
---
name: skill-name
description: Purpose and triggers.
metadata:
license: MIT
compatibility: opencode
author: "your-name"
version: "1.0.0"
category: "reference"
tags: "tag1, tag2"
---
```
| Field | Purpose |
|-------|---------|
| `license` | License identifier (e.g., MIT, Apache-2.0) |
| `compatibility` | Tool compatibility marker |
| `metadata` | String-to-string map for custom key-values |
## Name Validation
```regex
^[a-z0-9]+(-[a-z0-9]+)*$
```
**Valid**: `my-skill`, `git-release`, `tdd`
**Invalid**: `My-Skill`, `my_skill`, `-my-skill`, `my--skill`
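The pattern can be applied directly. A minimal Python sketch (the 1-64 length bound comes from the required-fields table above; the function name is illustrative):

```python
import re

# The name rule above: lowercase alphanumeric runs separated by single hyphens
SKILL_NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_valid_skill_name(name: str) -> bool:
    """True if `name` passes the skill-name validation rule (sketch)."""
    return bool(SKILL_NAME_RE.match(name)) and 1 <= len(name) <= 64
```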
## Common Metadata Keys
Use these conventions for consistency across skills:
| Key | Example | Purpose |
|-----|---------|---------|
| `author` | `"your-name"` | Skill creator |
| `version` | `"1.0.0"` | Semantic version |
| `category` | `"reference"` | Type: reference, technique, discipline, pattern |
| `tags` | `"react, hooks"` | Searchable keywords |
> [!IMPORTANT]
> Any field not listed here is **ignored** by OpenCode's skill loader.

---
name: discipline-name
description: >-
Use when [BEFORE violation].
metadata:
category: discipline
triggers: new feature, code change, implementation
---
# Rule Name
## Iron Law
**[SINGLE SENTENCE ABSOLUTE RULE]**
Violating the letter IS violating the spirit.
## The Rule
1. ALWAYS [step 1]
2. NEVER [step 2]
3. [Step 3]
## Violations
[Action before rule]? **Delete it. Start over.**
**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it
- Delete means delete
## Common Rationalizations
| Excuse | Reality |
|--------|---------|
| "Too simple" | Simple code breaks. Rule takes 30 seconds. |
| "I'll do it after" | After = never. Do it now. |
| "Spirit not ritual" | The ritual IS the spirit. |
## Red Flags - STOP
- [Flag 1]
- [Flag 2]
- "This is different because..."
**All mean:** Delete. Start over.
## Valid Exceptions
- [Exception 1]
- [Exception 2]
**Everything else:** Follow the rule.

---
name: pattern-name
description: >-
Use when [recognizable symptom].
metadata:
category: pattern
triggers: complexity, hard-to-follow, nested
---
# Pattern Name
## The Pattern
[1-2 sentence core idea]
## Recognition Signs
- [Sign that pattern applies]
- [Another sign]
- [Code smell]
## Before
```typescript
// Complex/problematic
function before() {
// nested, confusing
}
```
## After
```typescript
// Clean/improved
function after() {
// flat, clear
}
```
## When NOT to Use
- [Over-engineering case]
- [Simple case that doesn't need it]
## Impact
**Before:** [Problem metric]
**After:** [Improved metric]

---
name: reference-name
description: >-
Use when working with [domain].
metadata:
category: reference
triggers: tool, api, specific-terms
---
# Reference Name
## Quick Reference
| Command | Purpose |
|---------|---------|
| `cmd1` | Does X |
| `cmd2` | Does Y |
## Common Patterns
**Pattern A:**
```bash
example command
```
**Pattern B:**
```bash
another example
```
## Detailed Docs
For more options, run `--help` or see:
- [patterns.md](patterns.md)
- [examples.md](examples.md)

---
name: technique-name
description: Use when [specific symptom].
metadata:
category: technique
triggers: error-text, symptom, tool-name
---
# Technique Name
## Overview
[1-2 sentence core principle]
## When to Use
- [Symptom A]
- [Symptom B]
- [Error message text]
**NOT for:**
- [When to avoid]
## The Problem
```javascript
// Bad example
function badCode() {
// problematic pattern
}
```
## The Solution
```javascript
// Good example
function goodCode() {
// improved pattern
}
```
## Step-by-Step
1. [First step]
2. [Second step]
3. [Final step]
## Quick Reference
| Scenario | Approach |
|----------|----------|
| Case A | Solution A |
| Case B | Solution B |
## Common Mistakes
**Mistake 1:** [Description]
- Wrong: `bad code`
- Right: `good code`

# Platform Name Skill
Template for complex Tier 3 skills.
## Structure
```
skill/
├── SKILL.md # Dispatcher
├── commands/
│ └── skill.md # Orchestrator
└── references/
└── topic/
├── README.md # Overview
├── api.md # API Reference
├── config.md # Configuration
├── patterns.md # Recipes
└── gotchas.md # Critical Errors
```

# Testing Guide - TDD for Skills
Complete methodology for testing skills using RED-GREEN-REFACTOR cycle.
## Testing All Skill Types
Different skill types need different test approaches.
### Discipline-Enforcing Skills (rules/requirements)
**Examples**: TDD, verification-before-completion, designing-before-coding
**Test with**:
- Academic questions: Do they understand the rules?
- Pressure scenarios: Do they comply under stress?
- Multiple pressures combined: time + sunk cost + exhaustion
- Identify rationalizations and add explicit counters
**Success criteria**: Agent follows rule under maximum pressure
### Technique Skills (how-to guides)
**Examples**: condition-based-waiting, root-cause-tracing, defensive-programming
**Test with**:
- Application scenarios: Can they apply the technique correctly?
- Variation scenarios: Do they handle edge cases?
- Missing information tests: Do instructions have gaps?
**Success criteria**: Agent successfully applies technique to new scenario
### Pattern Skills (mental models)
**Examples**: reducing-complexity, information-hiding concepts
**Test with**:
- Recognition scenarios: Do they recognize when pattern applies?
- Application scenarios: Can they use the mental model?
- Counter-examples: Do they know when NOT to apply?
**Success criteria**: Agent correctly identifies when/how to apply pattern
### Reference Skills (documentation/APIs)
**Examples**: API documentation, command references, library guides
**Test with**:
- Retrieval scenarios: Can they find the right information?
- Application scenarios: Can they use what they found correctly?
- Gap testing: Are common use cases covered?
**Success criteria**: Agent finds and correctly applies reference information
## Pressure Types for Testing
### Time Pressure
"You have 5 minutes to complete this task"
### Sunk Cost Pressure
"You already spent 2 hours on this, just finish it quickly"
### Authority Pressure
"The senior developer said to skip tests for this quick bug fix"
### Exhaustion Pressure
"This is the 10th task today, let's wrap it up"
## RED Phase: Baseline Testing
**Goal**: Watch the agent fail WITHOUT the skill.
**Steps**:
1. Design pressure scenario (combine 2-3 pressures)
2. Give agent the task WITHOUT the skill loaded
3. Document EXACT behavior:
- What rationalization did they use?
- Which pressure triggered the violation?
- How did they justify the shortcut?
**Critical**: Copy exact quotes. You'll need them for GREEN phase.
**Example Baseline**:
```
Scenario: Implement feature under time pressure
Pressure: "You have 10 minutes"
Agent response: "Since we're short on time, I'll implement the feature first
and add tests after. Testing later achieves the same goal."
```
## GREEN Phase: Minimal Implementation
**Goal**: Write skill that addresses SPECIFIC baseline failures.
**Steps**:
1. Review baseline rationalizations
2. Write skill sections that counter THOSE EXACT arguments
3. Re-run scenario WITH skill
4. Agent should now comply
**Bad (too general)**:
```markdown
## Testing
Always write tests.
```
**Good (addresses specific rationalization)**:
```markdown
## Common Rationalizations
| Excuse | Reality |
| ----------------------------------- | ----------------------------------------------------------------------- |
| "Testing after achieves same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
```
## REFACTOR Phase: Loophole Closing
**Goal**: Find and plug new rationalizations.
**Steps**:
1. Agent found new workaround? Document it.
2. Add explicit counter to skill
3. Re-test same scenario
4. Repeat until bulletproof
**Pattern**:
```markdown
## Red Flags - STOP and Start Over
- Code before test
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "This is different because..."
**All of these mean**: Delete code. Start over with TDD.
```
## Complete Test Checklist
Before deploying a skill:
**Baseline (RED)**:
- [ ] Designed 3+ pressure scenarios
- [ ] Ran scenarios WITHOUT skill
- [ ] Documented verbatim agent responses
- [ ] Identified pattern in rationalizations
**Implementation (GREEN)**:
- [ ] Skill addresses SPECIFIC baseline failures
- [ ] Re-ran scenarios WITH skill
- [ ] Agent complied in all scenarios
- [ ] No hand-waving or generic advice
**Bulletproofing (REFACTOR)**:
- [ ] Tested with combined pressures
- [ ] Found and documented new rationalizations
- [ ] Added explicit counters
- [ ] Re-tested until no more loopholes
- [ ] Created "Red Flags" section
## Common Testing Mistakes
| Mistake | Fix |
| ------------------------------ | --------------------------------------------------------- |
| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
| "Skill is obviously clear" | Clear to you ≠ clear to agents. Test it. |
| "Testing is overkill" | Untested skills have issues. Always. |
| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
## Meta-Testing
**Test the test**: If agent passes too easily, your test is weak.
**Good test indicators**:
- Agent fails WITHOUT skill (proves skill is needed)
- Agent passes WITH skill (proves skill works)
- Multiple pressures needed to trigger failure (proves realistic)
**Bad test indicators**:
- Agent passes even without skill (test is irrelevant)
- Agent fails even with skill (skill is unclear)
- Single obvious scenario (test is too simple)

---
description: When to use Tier 1 (Simple) skill architecture.
metadata:
tags: [tier-1, simple, single-file]
---
# Tier 1: Simple Skills
Single-file skills for focused, specific purposes.
## When to Use
- **Single concept**: One technique, one pattern, one reference
- **Under 200 lines**: Can fit comfortably in one file
- **No complex decision logic**: User knows exactly what they need
- **Frequently loaded**: Needs minimal token footprint
## Structure
```
my-skill/
└── SKILL.md # Everything in one file
```
## Example
````markdown
---
name: flatten-with-flags
description: Use when simplifying deeply nested conditionals.
metadata:
  category: pattern
  triggers: nested if, complex conditionals, early return
---
# Flatten with Flags
## When to Use
- Code has 3+ levels of nesting
- Conditions are hard to follow
## The Pattern
Replace nested conditions with early returns and flag variables.
## Before
```javascript
function process(data) {
  if (data) {
    if (data.valid) {
      if (data.ready) {
        return doWork(data);
      }
    }
  }
  return null;
}
```
## After
```javascript
function process(data) {
  if (!data) return null;
  if (!data.valid) return null;
  if (!data.ready) return null;
  return doWork(data);
}
```
````
## Checklist
- [ ] Fits in <200 lines
- [ ] Single focused purpose
- [ ] No need for `references/` directory
- [ ] Description uses "Use when..." pattern

---
description: When to use Tier 2 (Expanded) skill architecture.
metadata:
tags: [tier-2, expanded, multi-file]
---
# Tier 2: Expanded Skills
Multi-file skills for complex topics with multiple sub-concepts.
## When to Use
- **Multiple related concepts**: Needs separation of concerns
- **200-1000 lines total**: Too big for one file
- **Needs reference files**: Patterns, examples, troubleshooting
- **Cross-linking**: Users need to navigate between sub-topics
## Structure
```
my-skill/
├── SKILL.md # Overview + navigation
└── references/
├── core/
│ ├── README.md # Main concept
│ └── api.md # API reference
├── patterns/
│ └── README.md # Usage patterns
└── troubleshooting/
└── README.md # Common issues
```
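The layout above can be scaffolded in a few lines so every `references/` subdirectory starts from a stub `README.md`. A sketch; the function name and the idea of empty stubs are illustrative, not a required workflow:

```python
from pathlib import Path

def scaffold_tier2(root: str, sections: list[str]) -> None:
    """Create a Tier 2 skeleton: SKILL.md plus one references/ subdir per section."""
    skill = Path(root)
    skill.mkdir(parents=True, exist_ok=True)
    (skill / "SKILL.md").touch()  # overview + navigation lives here
    for section in sections:
        sub = skill / "references" / section
        sub.mkdir(parents=True, exist_ok=True)
        (sub / "README.md").touch()  # each subdir gets a README.md

scaffold_tier2("my-skill", ["core", "patterns", "troubleshooting"])
```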
## Example
The `writing-skills` skill itself is Tier 2:
```
writing-skills/
├── SKILL.md # Decision tree + navigation
├── gotchas.md # Tribal knowledge
└── references/
├── anti-rationalization/
├── cso/
├── standards/
├── templates/
└── testing/
```
## Progressive Disclosure
1. **Metadata** (~100 tokens): Name + description loaded at startup
2. **SKILL.md** (<500 lines): Decision tree + index
3. **References** (as needed): Loaded only when user navigates
## Key Differences from Tier 1
| Aspect | Tier 1 | Tier 2 |
|--------|--------|--------|
| Files | 1 | 5-20 |
| Total lines | <200 | 200-1000 |
| Decision logic | None | Simple tree |
| Token cost | Minimal | Medium (progressive) |
## Checklist
- [ ] SKILL.md has clear navigation links
- [ ] Each `references/` subdir has README.md
- [ ] No circular references between files
- [ ] Decision tree points to specific files
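The "no circular references" item is easy to miss by eye once a skill has 10+ files. A sketch of a cycle check over relative markdown links (the regex is deliberately simplistic — it ignores anchors and external URLs — and the function name is hypothetical):

```python
import re
from pathlib import Path

LINK = re.compile(r"\]\(([^)#]+\.md)")

def find_cycles(skill_dir: str) -> bool:
    """True if any chain of relative .md links in the skill forms a cycle (DFS)."""
    root = Path(skill_dir)
    graph = {}
    for md in root.rglob("*.md"):
        targets = set()
        for rel in LINK.findall(md.read_text()):
            target = (md.parent / rel).resolve()
            if target.exists():
                targets.add(target)
        graph[md.resolve()] = targets

    visiting, done = set(), set()

    def dfs(node):
        if node in done:
            return False
        if node in visiting:  # back edge: we looped onto the current path
            return True
        visiting.add(node)
        if any(dfs(t) for t in graph.get(node, ())):
            return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(dfs(n) for n in list(graph))
```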
@@ -0,0 +1,98 @@
---
description: When to use Tier 3 (Platform) skill architecture for large platforms.
metadata:
tags: [tier-3, platform, enterprise, cloudflare-pattern]
---
# Tier 3: Platform Skills
Enterprise-grade skills for entire platforms (AWS, Cloudflare, Convex, etc.).
## When to Use
- **Entire platform**: 10+ products/services
- **1000+ lines total**: Would overwhelm context if monolithic
- **Complex decision logic**: Users start with "I need X" not "I want product Y"
- **Undocumented gotchas**: Tribal knowledge is critical
## The Cloudflare Pattern
Based on `cloudflare-skill` by Dillon Mulroy.
### Structure
```
my-platform/
├── SKILL.md # Decision trees only
└── references/
└── <product>/
├── README.md # Overview, when to use
├── api.md # Runtime API reference
├── configuration.md # Config options
├── patterns.md # Usage patterns
└── gotchas.md # Pitfalls, limits
```
### The 5-File Pattern
Each product directory has exactly 5 files:
| File | Purpose | When to Load |
|------|---------|--------------|
| `README.md` | Overview, when to use | Always first |
| `api.md` | Runtime APIs, methods | Implementing features |
| `configuration.md` | Config, environment | Setting up |
| `patterns.md` | Common workflows | Best practices |
| `gotchas.md` | Pitfalls, limits | Debugging |
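Because the pattern is rigid, conformance is trivial to check. A sketch (the expected file list is taken from the table above; the function itself is illustrative):

```python
from pathlib import Path

EXPECTED = {"README.md", "api.md", "configuration.md", "patterns.md", "gotchas.md"}

def check_product_dir(product_dir: str) -> tuple[set, set]:
    """Return (missing, extra) file names relative to the 5-file pattern."""
    actual = {p.name for p in Path(product_dir).iterdir() if p.is_file()}
    return EXPECTED - actual, actual - EXPECTED
```

Run it over every subdirectory of `references/` and fail the build on any non-empty result.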
## Decision Trees
The power of Tier 3 is decision trees that help the AI **choose**:
```markdown
Need to store data?
├─ Simple key-value → kv/
├─ Relational queries → d1/
├─ Large files/blobs → r2/
├─ Per-user state → durable-objects/
└─ Vector embeddings → vectorize/
```
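The same tree can also live next to the skill as data, so tooling (or a test) can exercise the routing instead of re-reading prose. A sketch — the need-to-product mapping copies the example tree above; the keyword matching and fallback are illustrative assumptions:

```python
# The storage decision tree from the example above, as a lookup table.
STORAGE_TREE = {
    "key-value": "kv/",
    "relational": "d1/",
    "blobs": "r2/",
    "per-user state": "durable-objects/",
    "vector embeddings": "vectorize/",
}

def route(need: str) -> str:
    """Map a storage need to the product reference directory to load."""
    for keyword, product in STORAGE_TREE.items():
        if keyword in need.lower():
            return product
    return "SKILL.md"  # no match: fall back to the top-level decision trees
```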
## Slash Command Integration
Create a slash command to orchestrate loading the skill:
```markdown
---
description: Load platform skill and get contextual guidance
---
## Workflow
1. Load skill: `skill({ name: 'my-platform' })`
2. Identify product from decision tree
3. Load relevant reference files based on task
| Task | Files |
|------|-------|
| New setup | README.md + configuration.md |
| Implement feature | api.md + patterns.md |
| Debug issue | gotchas.md |
```
## Progressive Disclosure in Action
- **Startup**: Only name + description (~100 tokens)
- **Activation**: SKILL.md with trees (<5000 tokens)
- **Navigation**: One product's 5 files (as needed)
Result: 60+ product references without blowing context.
## Checklist
- [ ] SKILL.md contains ONLY decision trees + index
- [ ] Each product has exactly 5 files
- [ ] Decision trees cover all "I need X" scenarios
- [ ] Cross-references stay one level deep
- [ ] Slash command created for orchestration
- [ ] Every product has `gotchas.md`