firefrost-gaming/claude-skills-reference


Reza Rezvani 7911cf957a feat(autoresearch-agent): fix critical bugs, package as plugin with 5 slash commands

**Bug fixes (run_experiment.py):**
- Fix broken revert logic: was saving HEAD as pre_commit (no-op revert),
  now uses git reset --hard HEAD~1 for correct rollback
- Remove broken --loop mode (agent IS the loop, script handles one iteration)
- Fix shell injection: all git commands use subprocess list form
- Replace shell tail with Python file read

**Bug fixes (other scripts):**
- setup_experiment.py: fix shell injection in git branch creation,
  remove dead --skip-baseline flag, fix evaluator docstring parsing
- log_results.py: fix 6 falsy-zero bugs (baseline=0 treated as None),
  add domain_filter to CSV/markdown export, move import time to top
- evaluators: add FileNotFoundError handling, fix output format mismatch
  in llm_judge_copy, add peak_kb on macOS, add ValueError handling

**Plugin packaging (NEW):**
- plugin.json, settings.json, CLAUDE.md for plugin registry
- 5 slash commands: /ar:setup, /ar:run, /ar:loop, /ar:status, /ar:resume
- /ar:loop supports user-selected intervals (10m, 1h, daily, weekly, monthly)
- experiment-runner agent for autonomous loop iterations
- Registered in marketplace.json as plugin #20

**SKILL.md rewrite:**
- Replace ambiguous "Loop Protocol" with clear "Agent Protocol"
- Add results.tsv format spec, strategy escalation, self-improvement
- Replace "NEVER STOP" with resumable stopping logic

**Docs & sync:**
- Codex (157 skills), Gemini (229 items), convert.sh all pick up the skill
- 6 new MkDocs pages, mkdocs.yml nav updated
- Counts updated: 17 agents, 22 slash commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-13 14:38:59 +01:00


title: /ar:resume — Resume Experiment
description: /ar:resume — Resume Experiment - Claude Code skill from the Engineering - POWERFUL domain.

/ar:resume — Resume Experiment

:material-rocket-launch: Engineering - POWERFUL :material-identifier: `resume` :material-github: Source

Install: claude /plugin install engineering-advanced-skills

Resume a paused or context-limited experiment. Reads all history and continues where you left off.

Usage

/ar:resume                                  # List experiments, let user pick
/ar:resume engineering/api-speed            # Resume specific experiment

What It Does

Step 1: List experiments if needed

If no experiment is specified:

python {skill_path}/scripts/setup_experiment.py --list

Show the status of each experiment (active, paused, or done, based on the age of its results.tsv) and let the user pick one.
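The age-based status could be derived from the modification time of results.tsv. The sketch below is only an illustration: the source does not state the cutoffs, so the one-day and seven-day thresholds and the function names are assumptions.

```python
import time
from pathlib import Path
from typing import Optional

# Hypothetical cutoffs; the source does not define what counts as active or paused.
ACTIVE_MAX_AGE_S = 24 * 3600        # results.tsv touched within the last day
PAUSED_MAX_AGE_S = 7 * 24 * 3600    # results.tsv touched within the last week


def classify_age(age_seconds: float) -> str:
    """Map the age of results.tsv (in seconds) to a status label."""
    if age_seconds <= ACTIVE_MAX_AGE_S:
        return "active"
    if age_seconds <= PAUSED_MAX_AGE_S:
        return "paused"
    return "done"


def experiment_status(results_path: Path, now: Optional[float] = None) -> str:
    """Classify an experiment by the modification time of its results.tsv."""
    # Compare against None explicitly so that now=0 is not treated as missing.
    current = time.time() if now is None else now
    return classify_age(current - results_path.stat().st_mtime)
```

For example, `experiment_status(Path(".autoresearch/engineering/api-speed/results.tsv"))` would classify that experiment under these assumed thresholds.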

Step 2: Load full context

# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}

# Read config
cat .autoresearch/{domain}/{name}/config.cfg

# Read strategy
cat .autoresearch/{domain}/{name}/program.md

# Read full results history
cat .autoresearch/{domain}/{name}/results.tsv

# Read recent git log for the branch
git log --oneline -20
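The same loading step can be sketched in Python, which may be convenient when the context is gathered by a script rather than typed by hand. The helper names `experiment_paths` and `load_context` are hypothetical; only the branch name and file paths come from the commands above.

```python
import subprocess
from pathlib import Path


def experiment_paths(spec: str, root: Path = Path(".")) -> dict:
    """Map 'domain/name' to the branch and files read in Step 2."""
    domain, sep, name = spec.partition("/")
    if not sep or not domain or not name or "/" in name:
        raise ValueError(f"expected 'domain/name', got {spec!r}")
    base = root / ".autoresearch" / domain / name
    return {
        "branch": f"autoresearch/{domain}/{name}",
        "config": base / "config.cfg",
        "program": base / "program.md",
        "results": base / "results.tsv",
    }


def load_context(spec: str, root: Path = Path(".")) -> dict:
    """Check out the branch, then read config, strategy, results, and recent log."""
    paths = experiment_paths(spec, root)
    # List-form arguments keep the experiment name out of any shell parsing.
    subprocess.run(["git", "checkout", paths["branch"]], cwd=root, check=True)
    context = {
        key: paths[key].read_text(encoding="utf-8")
        for key in ("config", "program", "results")
    }
    log = subprocess.run(
        ["git", "log", "--oneline", "-20"],
        cwd=root, check=True, capture_output=True, text=True,
    )
    context["git_log"] = log.stdout
    return context
```

Calling `load_context("engineering/api-speed")` from the repository root would return the four pieces of context as strings, and raises if the branch or any file is missing.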

Step 3: Report current state

Summarize for the user:

Resuming: engineering/api-speed
  Target: src/api/search.py
  Metric: p50_ms (lower is better)
  Experiments: 23 total — 8 kept, 12 discarded, 3 crashed
  Best: 185ms (-42% from baseline of 320ms)
  Last experiment: "added response caching" → KEEP (185ms)

  Recent patterns:
  - Caching changes: 3 kept, 1 discarded (consistently helpful)
  - Algorithm changes: 2 discarded, 1 crashed (high risk, low reward so far)
  - I/O optimization: 2 kept (promising direction)
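The headline numbers in such a summary can be computed directly from results.tsv. The sketch below is a rough illustration: the column names `metric` and `status` and the status values `keep`, `discard`, and `crash` are assumptions, since this page does not show the file format, and the baseline is passed in rather than read from the file.

```python
import csv
import io
from collections import Counter


def summarize(tsv_text: str, baseline: float, lower_is_better: bool = True) -> dict:
    """Count outcomes and find the best kept metric relative to the baseline."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    # Assumed columns: 'metric' and 'status'; crashed rows may have no metric.
    counts = Counter(row["status"].lower() for row in rows)
    kept = [float(row["metric"]) for row in rows if row["status"].lower() == "keep"]

    best = None
    if kept:
        best = min(kept) if lower_is_better else max(kept)

    change_pct = None
    # Explicit None and zero checks: a metric of 0 is a valid value, not "missing".
    if best is not None and baseline != 0:
        change_pct = round((best - baseline) / baseline * 100)

    return {
        "total": len(rows),
        "counts": dict(counts),
        "best": best,
        "change_pct": change_pct,
    }
```

With a baseline of 320 and a best kept value of 185, `change_pct` comes out as -42, which matches the "-42% from baseline" figure in the example report.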

Step 4: Ask next action

How would you like to continue?
  1. Single iteration (/ar:run)  — I'll make one change and evaluate
  2. Start a loop (/ar:loop)     — Autonomous with scheduled interval
  3. Just show me the results    — I'll review and decide

If the user picks loop, hand off to /ar:loop with the experiment pre-selected. If single, hand off to /ar:run.
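The handoff can be pictured as a small lookup from the menu choice to the next command. This is a sketch under assumptions: the function name and the returned dictionary shape are hypothetical, and it deliberately does not guess at how /ar:run or /ar:loop accept the experiment as an argument.

```python
from typing import Optional

# Menu options from Step 4 mapped to the command that takes over.
# Option 3 only displays results, so no command is handed off.
HANDOFF = {
    1: "/ar:run",
    2: "/ar:loop",
    3: None,
}


def next_action(choice: int, experiment: str) -> dict:
    """Return the command to hand off to, with the experiment pre-selected."""
    if choice not in HANDOFF:
        raise ValueError(f"choice must be 1, 2, or 3, got {choice!r}")
    command: Optional[str] = HANDOFF[choice]
    return {"command": command, "experiment": experiment}
```

Keeping the command and the experiment as separate fields leaves the actual invocation syntax to whichever command receives the handoff.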
