AgentHub — Multi-Agent Collaboration for Claude Code

AgentHub spawns N parallel agents in isolated git worktrees to compete on the same task, then evaluates results by metric or LLM judge and merges the winner. It turns any optimization, refactoring, content generation, or design problem into a tournament where the best solution wins.

Quick Start

# One command — full lifecycle
/hub:run --task "Reduce p50 latency" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower \
  --template optimizer

Or step by step:

# 1. Initialize a session — define the task, agent count, and evaluation criteria
/hub:init --task "Reduce API p50 latency" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower

# 2. Spawn agents — launches 3 parallel agents in isolated worktrees
/hub:spawn --template optimizer

# 3. Check progress
/hub:status

# 4. Evaluate — rank agents by metric
/hub:eval

# 5. Merge the winner into your branch
/hub:merge

Commands Reference

| Command | Purpose | Example |
|---|---|---|
| /hub:init | Create session with task, agents, eval criteria | /hub:init --task "Optimize DB queries" --agents 4 --eval "python bench.py" --metric query_ms --direction lower |
| /hub:spawn | Launch all agents in parallel worktrees | /hub:spawn (uses latest session) |
| /hub:status | Show DAG state, branches, progress posts | /hub:status |
| /hub:eval | Rank results by metric or LLM judge | /hub:eval --judge (LLM judge mode) |
| /hub:merge | Merge winner, archive losers, cleanup | /hub:merge --agent agent-2 (force pick) |
| /hub:board | Read/write the message board | /hub:board --read progress |
| /hub:run | One-shot full lifecycle | /hub:run --task "Reduce latency" --agents 3 --eval "pytest bench.py" --metric p50_ms --direction lower --template optimizer |

The Optimizer Pattern

AgentHub's most powerful pattern: N agents compete using different strategies, each running an iterative improvement loop in its own worktree.

How It Works

/hub:run --task "Reduce p50 latency" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower \
  --template optimizer

Each agent follows the same loop independently:

┌─── Agent 1 (worktree) ──────────────────────────┐
│  Strategy: Caching                              │
│  Loop: edit → eval → keep/discard → repeat ×10  │
└─────────────────────────────────────────────────┘
┌─── Agent 2 (worktree) ──────────────────────────┐
│  Strategy: Algorithm optimization               │
│  Loop: edit → eval → keep/discard → repeat ×10  │
└─────────────────────────────────────────────────┘
┌─── Agent 3 (worktree) ──────────────────────────┐
│  Strategy: I/O batching                         │
│  Loop: edit → eval → keep/discard → repeat ×10  │
└─────────────────────────────────────────────────┘

The optimizer template embeds the iteration loop directly in each agent's dispatch prompt — no external dependencies required.
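The loop in the diagram above can be sketched in a few lines of Python. This is a minimal illustration, not the template's actual prompt logic: `optimize`, `candidates`, and `evaluate` are hypothetical names, with `evaluate` standing in for running the eval command and extracting the metric.

```python
from typing import Callable


def optimize(
    candidates: list[str],
    evaluate: Callable[[str], float],
    baseline: float,
    direction: str = "lower",
    max_iterations: int = 10,
) -> tuple[float, list[str]]:
    """Keep/discard loop: try one candidate edit per iteration.

    `candidates` stands in for the edits an agent would make (assumption);
    `evaluate` stands in for running the eval command (assumption).
    """
    best = baseline
    kept: list[str] = []
    for edit in candidates[:max_iterations]:
        score = evaluate(edit)
        improved = score < best if direction == "lower" else score > best
        if improved:
            best = score  # keep: a real agent would commit here
            kept.append(edit)
        # otherwise discard: a real agent would revert its worktree
    return best, kept


# Toy metric table standing in for real benchmark runs (values are made up).
fake_p50_ms = {"add-cache": 160.0, "bigger-pool": 175.0, "batch-io": 150.0}
best, kept = optimize(list(fake_p50_ms), fake_p50_ms.__getitem__, baseline=180.0)
print(best, kept)
```

The toy scores each edit in isolation; a real agent's edits accumulate in its worktree, so each eval reflects everything kept so far.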

Agent Templates

Templates define the dispatch prompt pattern. Use --template with /hub:spawn or /hub:run:

| Template | Pattern | Use Case |
|---|---|---|
| optimizer | Edit → eval → keep/discard → repeat ×10 | Performance, latency, size reduction, content quality, research depth |
| refactorer | Restructure → test → iterate until green | Code quality, tech debt, document restructuring |
| test-writer | Write tests → measure coverage → repeat | Test coverage gaps |
| bug-fixer | Reproduce → diagnose → fix → verify | Bug fix with competing approaches |

Templates live in references/agent-templates.md.

Example: 3 Strategies Competing on API Latency

/hub:run --task "Reduce API p50 latency below 150ms" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower \
  --template optimizer

AgentHub automatically:

  1. Captures baseline (e.g., p50_ms = 180ms)
  2. Assigns diverse strategies (caching, algorithm, I/O batching)
  3. Spawns 3 agents — each iterates up to 10 times in its worktree
  4. Agents commit improvements, revert failures, post progress to the board
  5. Evaluates final metrics across all agents
  6. Presents ranked results for merge confirmation

# Example output:
# RANK  AGENT    METRIC   DELTA    FILES
# 1     agent-2  128ms    -52ms    4
# 2     agent-1  145ms    -35ms    2
# 3     agent-3  171ms    -9ms     1
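
The ranking in the example output can be reproduced with a short sketch. `rank_agents` is a hypothetical helper, not the interface of result_ranker.py; it assumes each agent's final metric has already been extracted and that the delta is simply the metric minus the baseline.

```python
def rank_agents(
    results: dict[str, float], baseline: float, direction: str = "lower"
) -> list[tuple[int, str, float, float]]:
    """Return (rank, agent, metric, delta_from_baseline), best first."""
    ordered = sorted(
        results.items(),
        key=lambda item: item[1],
        reverse=(direction == "higher"),
    )
    return [
        (rank, agent, value, value - baseline)
        for rank, (agent, value) in enumerate(ordered, start=1)
    ]


# Metric values taken from the example output above; baseline is 180ms.
results = {"agent-1": 145.0, "agent-2": 128.0, "agent-3": 171.0}
for rank, agent, value, delta in rank_agents(results, baseline=180.0):
    print(f"{rank}  {agent}  {value:.0f}ms  {delta:+.0f}ms")
```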

Example: 3 Agents Drafting Competing Blog Posts

/hub:run --task "Write a 1500-word blog post on zero-downtime deployments" \
  --agents 3 --judge

No eval command is needed: --judge activates LLM judge mode. Each agent takes a different angle (tutorial, case study, opinion piece). The coordinator reads all three drafts and picks the winner based on clarity, depth, and engagement.

Power-Up: Autoresearch for Richer Tracking

If you also have the autoresearch-agent skill installed, agents can optionally use its tracking tools for:

  • results.tsv — structured iteration history with timestamps and metrics
  • Strategy escalation — start with low-hanging fruit, escalate to radical changes
  • program.md — self-improving agent instructions that refine across iterations

This is entirely optional. The optimizer template works standalone — autoresearch just adds richer per-agent tracking if available. Both plugins remain independent; install either or both.

Step-by-Step Walkthrough

  1. Init + Baseline — Sets up session config with task description, agent count, eval command, and target metric. Captures baseline metric value. Creates .agenthub/sessions/{id}/config.yaml.

  2. Spawn — The coordinator writes dispatch posts to .agenthub/board/dispatch/ with strategy-specific instructions for each agent. Each agent is launched with isolation: "worktree" so it works on an isolated copy of the repo. When a template is used, the dispatch prompt contains the full iteration loop.

  3. Agents iterate — Each agent independently: reads its dispatch instructions, makes changes to the target files, runs the eval command, commits improvements, reverts failures, and posts progress updates.

  4. Evaluate — Runs result_ranker.py which checks out each agent's final branch, runs the eval command, extracts the metric, and produces a ranked table with deltas from baseline.

  5. Merge — Merges the winning branch with --no-ff, archives loser branches as tags (hub/archive/{session}/{agent}), cleans up worktrees, and posts a merge summary to the board.
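
As a rough sketch of the merge step, the git commands involved could be assembled like this. `merge_plan` is a hypothetical helper that only builds the command lists and runs nothing. Deleting the loser's branch ref after tagging is an assumption about how "archive as tag" is realized, and worktree cleanup is omitted because the worktree paths are not specified here.

```python
def merge_plan(session: str, winner: str, branches: dict[str, str]) -> list[list[str]]:
    """Build (but do not run) git commands for merging the winner.

    `branches` maps agent name -> branch name.
    """
    plan = [["git", "merge", "--no-ff", branches[winner]]]
    for agent, branch in branches.items():
        if agent == winner:
            continue
        # Archive tag name follows hub/archive/{session}/{agent}.
        plan.append(["git", "tag", f"hub/archive/{session}/{agent}", branch])
        # Assumption: the branch ref is dropped once the tag preserves it.
        plan.append(["git", "branch", "-D", branch])
    return plan


session = "20260317-143022"
branches = {f"agent-{n}": f"hub/{session}/agent-{n}/attempt-1" for n in (1, 2, 3)}
plan = merge_plan(session, "agent-2", branches)
for command in plan:
    print(" ".join(command))
```

Each entry can be passed to `subprocess.run(command, check=True)` once the plan has been reviewed.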

Coordination Patterns

Fan-Out / Fan-In (Default)

One coordinator spawns N agents with the same task but different strategies or constraints. All run in parallel. Results are evaluated and one winner is merged.

Best for: Performance optimization, competing implementations, A/B testing approaches, competing marketing copy variations.
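
A minimal sketch of fan-out/fan-in, using threads in place of agents. `fan_out_fan_in` and `fake_agent` are hypothetical names; in AgentHub the parallel workers are agents in separate worktrees, not Python threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fan_out_fan_in(
    task: str,
    strategies: list[str],
    run_agent: Callable[[str, str], float],
    direction: str = "lower",
) -> tuple[str, float]:
    """Run one worker per strategy in parallel, then pick a single winner."""
    # Fan-out: same task, different strategy per worker.
    with ThreadPoolExecutor(max_workers=len(strategies)) as pool:
        scores = list(pool.map(lambda s: run_agent(task, s), strategies))
    # Fan-in: compare all results and select the best one.
    pick = min if direction == "lower" else max
    index = pick(range(len(scores)), key=scores.__getitem__)
    return strategies[index], scores[index]


def fake_agent(task: str, strategy: str) -> float:
    # Made-up p50 values standing in for a real eval run.
    return {"caching": 145.0, "algorithm": 128.0, "io-batching": 171.0}[strategy]


winner, score = fan_out_fan_in(
    "Reduce p50 latency", ["caching", "algorithm", "io-batching"], fake_agent
)
print(winner, score)
```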

Tournament (Multi-Round Elimination)

Multiple rounds of fan-out/fan-in. Winners advance, losers are eliminated. Each round can narrow the strategy space.

Best for: Large solution spaces where you want to prune early, iterative refinement of the best approaches, multi-round content refinement.
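
The elimination logic might look like the sketch below. `tournament`, `keep_fraction`, and the toy `score` function are illustrative assumptions; in practice each round would re-run the surviving strategies as a fresh fan-out.

```python
from typing import Callable


def tournament(
    entrants: list[str],
    score: Callable[[str, int], float],
    keep_fraction: float = 0.5,
    direction: str = "lower",
) -> tuple[str, list[list[str]]]:
    """Repeat rank-and-prune rounds until one entrant remains."""
    survivors = list(entrants)
    rounds: list[list[str]] = []
    round_index = 0
    while len(survivors) > 1:
        ranked = sorted(
            survivors,
            key=lambda entrant: score(entrant, round_index),
            reverse=(direction == "higher"),
        )
        # Always drop at least one entrant so the loop terminates.
        keep = min(len(ranked) - 1, max(1, int(len(ranked) * keep_fraction)))
        survivors = ranked[:keep]
        rounds.append(survivors)
        round_index += 1
    return survivors[0], rounds


# Made-up latencies; each later round shaves 5ms to mimic refinement.
base_ms = {"caching": 145.0, "algorithm": 128.0, "io-batching": 171.0, "precompute": 160.0}
winner, rounds = tournament(list(base_ms), lambda name, rnd: base_ms[name] - 5 * rnd)
print(winner, rounds)
```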

Ensemble (Combine All Agents' Work)

Each agent works on a different subtask rather than competing on the same one. Results are combined rather than compared.

Best for: Large refactoring tasks, multi-file changes where work can be parallelized by module, multi-section reports or whitepapers.

Pipeline (Sequential Phases)

Agent 1's output becomes Agent 2's input. Each phase builds on the previous result.

Best for: Multi-stage workflows (e.g., Agent 1 writes tests, Agent 2 writes implementation, Agent 3 optimizes) or research → draft → edit pipelines.
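
A pipeline reduces to function composition over an artifact. The sketch below is illustrative only: `run_pipeline` and the three stage functions are hypothetical stand-ins for agents that hand their output to the next phase.

```python
from functools import reduce
from typing import Callable

Stage = Callable[[str], str]


def run_pipeline(initial: str, stages: list[Stage]) -> str:
    """Feed each stage's output into the next stage, in order."""
    return reduce(lambda artifact, stage: stage(artifact), stages, initial)


# Toy stages mirroring a research -> draft -> edit pipeline.
def research(topic: str) -> str:
    return f"notes on {topic}"


def draft(notes: str) -> str:
    return f"draft from {notes}"


def edit(text: str) -> str:
    return f"edited {text}"


final = run_pipeline("zero-downtime deployments", [research, draft, edit])
print(final)
```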

Scripts

| Script | Purpose | Example |
|---|---|---|
| hub_init.py | Create session directory and config | python scripts/hub_init.py --task "Optimize queries" --agents 3 --eval "python bench.py" --metric query_ms --direction lower |
| board_manager.py | Message board CRUD (dispatch, progress, results) | python scripts/board_manager.py --post --channel progress --author agent-1 --message "Iteration 3: p50=145ms" |
| session_manager.py | Session state machine (init→running→evaluating→merged) | python scripts/session_manager.py --list |
| dag_analyzer.py | Git DAG analysis: frontier detection, branch status | python scripts/dag_analyzer.py --status --session 20260317-143022 |
| result_ranker.py | Evaluate and rank agent results by metric or diff | python scripts/result_ranker.py --session 20260317-143022 --eval-cmd "pytest bench.py --json" --metric p50_ms --direction lower |

All scripts support --help for full usage and --demo for example output.

Installation

Claude Code

# Install from ClawHub
claude install agenthub

# Or add manually — copy the engineering/agenthub/ folder to your project,
# then add to your .claude/settings.json:
{
  "skills": ["./engineering/agenthub"]
}

OpenAI Codex

# Copy to your agents directory
cp -r engineering/agenthub/ .codex/agents/agenthub/

OpenClaw

openclaw install agenthub

Architecture

Session Model

Each /hub:init creates a session with a timestamp-based ID (YYYYMMDD-HHMMSS). Sessions progress through states:

init → running → evaluating → merged
                            → archived

State is tracked in .agenthub/sessions/{session-id}/state.json.
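
The state diagram above can be expressed as a small transition table. This is a sketch, not the code in session_manager.py: `TRANSITIONS` and `advance` are hypothetical names, and the table assumes the diagram means that `evaluating` may end in either `merged` or `archived`.

```python
# Allowed transitions, read off the diagram (terminal states map to nothing).
TRANSITIONS: dict[str, set[str]] = {
    "init": {"running"},
    "running": {"evaluating"},
    "evaluating": {"merged", "archived"},
    "merged": set(),
    "archived": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


state = "init"
for step in ("running", "evaluating", "merged"):
    state = advance(state, step)
print(state)
```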

Branch Naming

hub/{session-id}/agent-{N}/attempt-{M}

# Example:
hub/20260317-143022/agent-1/attempt-1
hub/20260317-143022/agent-2/attempt-1
hub/20260317-143022/agent-3/attempt-1
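
For tooling that needs to build or parse these names, a sketch along the following lines may help. The helper names and the regular expression are assumptions derived from the pattern above and the YYYYMMDD-HHMMSS session ID format.

```python
import re
from datetime import datetime

BRANCH_RE = re.compile(
    r"^hub/(?P<session>\d{8}-\d{6})/agent-(?P<agent>\d+)/attempt-(?P<attempt>\d+)$"
)


def session_id(now: datetime) -> str:
    """Format a timestamp as a YYYYMMDD-HHMMSS session ID."""
    return now.strftime("%Y%m%d-%H%M%S")


def branch_name(session: str, agent: int, attempt: int = 1) -> str:
    return f"hub/{session}/agent-{agent}/attempt-{attempt}"


def parse_branch(name: str) -> tuple[str, int, int]:
    """Split a branch name into (session, agent number, attempt number)."""
    match = BRANCH_RE.match(name)
    if match is None:
        raise ValueError(f"not an AgentHub branch: {name}")
    return match["session"], int(match["agent"]), int(match["attempt"])


sid = session_id(datetime(2026, 3, 17, 14, 30, 22))
name = branch_name(sid, agent=2)
print(name, parse_branch(name))
```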

Board Channels

The message board uses three channels stored as YAML-frontmatter markdown files:

| Channel | Direction | Purpose |
|---|---|---|
| dispatch | coordinator → agents | Task assignments and strategy prompts |
| progress | agents → coordinator | Status updates, iteration results |
| results | bidirectional | Final metrics, merge summaries |

Immutability Rules

  • Append-only board — posts are never edited or deleted
  • Append-only DAG — no rebase, no force-push
  • Archive losers — loser branches become tags, not deleted
  • Worktree cleanup — removed only after merge is complete
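
One way to enforce the append-only board rule is to create every post as a new file and refuse to overwrite. The sketch below is not board_manager.py: the filename scheme and the frontmatter fields (`author`, `channel`, `timestamp`) are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def append_post(board: Path, channel: str, author: str, message: str, timestamp: str) -> Path:
    """Write a new post file; existing posts are never touched."""
    channel_dir = board / channel
    channel_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: zero-padded sequence number plus author.
    sequence = len(list(channel_dir.glob("*.md"))) + 1
    path = channel_dir / f"{sequence:04d}-{author}.md"
    body = (
        "---\n"
        f"author: {author}\n"
        f"channel: {channel}\n"
        f"timestamp: {timestamp}\n"
        "---\n\n"
        f"{message}\n"
    )
    # Mode "x" raises FileExistsError instead of overwriting a post.
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(body)
    return path


with tempfile.TemporaryDirectory() as tmp:
    first = append_post(Path(tmp), "progress", "agent-1", "Iteration 3: p50=145ms", "2026-03-17T14:35:00")
    second = append_post(Path(tmp), "progress", "agent-1", "Iteration 4: p50=141ms", "2026-03-17T14:37:00")
    first_name, second_name = first.name, second.name
    first_text = first.read_text(encoding="utf-8")
print(first_name, second_name)
```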

Directory Structure

.agenthub/
├── sessions/{session-id}/
│   ├── config.yaml          # Task, agents, eval criteria
│   └── state.json           # State machine, agent status
└── board/
    ├── _index.json          # Channel metadata
    ├── dispatch/            # Coordinator → agents
    ├── progress/            # Agents → coordinator
    └── results/             # Final results + merge summary
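
The layout above can be created with a few pathlib calls. This is a sketch rather than what hub_init.py does: `scaffold` is a hypothetical helper, and the initial contents written to state.json and _index.json are placeholder assumptions.

```python
import json
import tempfile
from pathlib import Path

CHANNELS = ("dispatch", "progress", "results")


def scaffold(root: Path, session: str) -> list[str]:
    """Create the .agenthub tree and return the created paths, sorted."""
    hub = root / ".agenthub"
    session_dir = hub / "sessions" / session
    session_dir.mkdir(parents=True, exist_ok=True)
    (session_dir / "config.yaml").touch()
    # Placeholder content (assumption): only the initial state is recorded.
    (session_dir / "state.json").write_text(json.dumps({"state": "init"}))
    board = hub / "board"
    for channel in CHANNELS:
        (board / channel).mkdir(parents=True, exist_ok=True)
    # Placeholder content (assumption): a plain list of channel names.
    (board / "_index.json").write_text(json.dumps({"channels": list(CHANNELS)}))
    return sorted(p.relative_to(root).as_posix() for p in hub.rglob("*"))


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp), "20260317-143022")
print("\n".join(created))
```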

License

MIT