**Bug fixes (run_experiment.py):**
- Fix broken revert logic: was saving HEAD as pre_commit (no-op revert), now uses git reset --hard HEAD~1 for correct rollback
- Remove broken --loop mode (agent IS the loop, script handles one iteration)
- Fix shell injection: all git commands use subprocess list form
- Replace shell tail with Python file read

**Bug fixes (other scripts):**
- setup_experiment.py: fix shell injection in git branch creation, remove dead --skip-baseline flag, fix evaluator docstring parsing
- log_results.py: fix 6 falsy-zero bugs (baseline=0 treated as None), add domain_filter to CSV/markdown export, move import time to top
- evaluators: add FileNotFoundError handling, fix output format mismatch in llm_judge_copy, add peak_kb on macOS, add ValueError handling

**Plugin packaging (NEW):**
- plugin.json, settings.json, CLAUDE.md for plugin registry
- 5 slash commands: /ar:setup, /ar:run, /ar:loop, /ar:status, /ar:resume
- /ar:loop supports user-selected intervals (10m, 1h, daily, weekly, monthly)
- experiment-runner agent for autonomous loop iterations
- Registered in marketplace.json as plugin #20

**SKILL.md rewrite:**
- Replace ambiguous "Loop Protocol" with clear "Agent Protocol"
- Add results.tsv format spec, strategy escalation, self-improvement
- Replace "NEVER STOP" with resumable stopping logic

**Docs & sync:**
- Codex (157 skills), Gemini (229 items), convert.sh all pick up the skill
- 6 new MkDocs pages, mkdocs.yml nav updated
- Counts updated: 17 agents, 22 slash commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| title | description |
|---|---|
| /ar:resume — Resume Experiment | /ar:resume — Resume Experiment - Claude Code skill from the Engineering - POWERFUL domain. |
/ar:resume — Resume Experiment
:material-rocket-launch: Engineering - POWERFUL
:material-identifier: `resume`
:material-github: Source
Install:
claude /plugin install engineering-advanced-skills
Resume a paused or context-limited experiment. Reads all history and continues where you left off.
Usage
/ar:resume # List experiments, let user pick
/ar:resume engineering/api-speed # Resume specific experiment
What It Does
Step 1: List experiments if needed
If no experiment is specified:
python {skill_path}/scripts/setup_experiment.py --list
Show the status of each experiment (active, paused, or done, based on the age of results.tsv), then let the user pick.
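The age-based status check could look roughly like the sketch below. The source says only that status depends on the age of results.tsv, so the thresholds, the function names, and the "unknown" fallback are all assumptions made for illustration. This is not the logic in setup_experiment.py.

```python
import time
from pathlib import Path

# Hypothetical thresholds: the page does not define the cutoffs.
ACTIVE_WINDOW_S = 60 * 60            # touched within the last hour  -> "active"
PAUSED_WINDOW_S = 7 * 24 * 60 * 60   # touched within the last week  -> "paused"


def classify_status(age_seconds: float) -> str:
    """Map the age of results.tsv (in seconds) to a status label."""
    if age_seconds <= ACTIVE_WINDOW_S:
        return "active"
    if age_seconds <= PAUSED_WINDOW_S:
        return "paused"
    return "done"


def experiment_status(results_path, now=None) -> str:
    """Return a status label based on the file's last-modified time."""
    path = Path(results_path)
    if not path.exists():
        # Assumption: an experiment with no results file has no known status.
        return "unknown"
    current = time.time() if now is None else now
    return classify_status(current - path.stat().st_mtime)
```

Keeping the age-to-label mapping in its own function means the cutoffs can be tuned or tested without touching the filesystem.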
Step 2: Load full context
# Check out the experiment branch
git checkout autoresearch/{domain}/{name}
# Read config
cat .autoresearch/{domain}/{name}/config.cfg
# Read strategy
cat .autoresearch/{domain}/{name}/program.md
# Read full results history
cat .autoresearch/{domain}/{name}/results.tsv
# Read recent git log for the branch
git log --oneline -20
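For an agent that prefers to gather the same files programmatically, the reads above can be sketched in Python. The function name `load_context` and the choice to return `None` for a missing file are assumptions for illustration. The branch checkout and `git log` steps are left to the shell commands shown above.

```python
from pathlib import Path

# File names as they appear in the commands above.
CONTEXT_FILES = (
    ("config", "config.cfg"),
    ("strategy", "program.md"),
    ("results", "results.tsv"),
)


def load_context(domain, name, root="."):
    """Read config, strategy, and results for one experiment.

    Hypothetical helper: returns a dict of file contents, with None
    for any file that does not exist yet.
    """
    base = Path(root) / ".autoresearch" / domain / name
    context = {}
    for key, filename in CONTEXT_FILES:
        path = base / filename
        context[key] = path.read_text(encoding="utf-8") if path.exists() else None
    return context
```

For example, `load_context("engineering", "api-speed")` would read from `.autoresearch/engineering/api-speed/` under the current directory.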
Step 3: Report current state
Summarize for the user:
Resuming: engineering/api-speed
Target: src/api/search.py
Metric: p50_ms (lower is better)
Experiments: 23 total — 8 kept, 12 discarded, 3 crashed
Best: 185ms (-42% from baseline of 320ms)
Last experiment: "added response caching" → KEEP (185ms)
Recent patterns:
- Caching changes: 3 kept, 1 discarded (consistently helpful)
- Algorithm changes: 2 discarded, 1 crashed (high risk, low reward so far)
- I/O optimization: 2 kept (promising direction)
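The counts and the percentage in a summary like this can be derived from results.tsv. The sketch below assumes a hypothetical layout with a `metric` column and a `status` column whose values are `baseline`, `keep`, `discard`, or `crash`. The actual column names are defined in SKILL.md and may differ. The baseline check uses explicit `is not None` and `!= 0` tests, so a baseline of 0 is not mistaken for a missing value.

```python
import csv
import io


def summarize_results(tsv_text, lower_is_better=True):
    """Summarize experiment history from TSV text.

    Assumed (hypothetical) columns: metric, status.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = {"keep": 0, "discard": 0, "crash": 0}
    baseline = None
    kept = []
    for row in reader:
        status = (row.get("status") or "").strip().lower()
        if status == "baseline":
            baseline = float(row["metric"])
        elif status in counts:
            counts[status] += 1
            if status == "keep":
                # Only kept runs are candidates for "best".
                kept.append(float(row["metric"]))

    best = None
    if kept:
        best = min(kept) if lower_is_better else max(kept)

    change_pct = None
    if best is not None and baseline is not None and baseline != 0:
        change_pct = (best - baseline) / baseline * 100.0

    return {
        "total": sum(counts.values()),
        "counts": counts,
        "baseline": baseline,
        "best": best,
        "change_pct": change_pct,
    }
```

With a baseline of 320 and a best kept value of 185, `change_pct` is (185 - 320) / 320 × 100 ≈ -42.2, which matches the "-42%" in the report above.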
Step 4: Ask next action
How would you like to continue?
1. Single iteration (/ar:run) — I'll make one change and evaluate
2. Start a loop (/ar:loop) — Autonomous with scheduled interval
3. Just show me the results — I'll review and decide
If the user picks loop, hand off to /ar:loop with the experiment pre-selected.
If the user picks a single iteration, hand off to /ar:run.