| title | description |
|---|---|
| Fix Failing or Flaky Tests — Agent Skill & Codex Plugin | Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw. |
Fix Failing or Flaky Tests
Install:
claude /plugin install engineering-skills
Diagnose and fix a Playwright test that fails or passes intermittently using a systematic taxonomy.
Input
$ARGUMENTS contains:
- A test file path: e2e/login.spec.ts
- A test name: "should redirect after login"
- A description: "the checkout test fails in CI but passes locally"
Steps
1. Reproduce the Failure
Run the test to capture the error:
npx playwright test <file> --reporter=list
If the test passes on this run, it is likely flaky. Run a burn-in to surface the intermittent failure:
npx playwright test <file> --repeat-each=10 --reporter=list
If it still passes, try with parallel workers:
npx playwright test --fully-parallel --workers=4 --repeat-each=5
2. Capture Trace
Run with full tracing:
npx playwright test <file> --trace=on --retries=0
Read the trace output. Use /debug to analyze trace files if available.
3. Categorize the Failure
Load flaky-taxonomy.md from this skill directory.
Every failing test falls into one of four categories:
| Category | Symptom | Diagnosis |
|---|---|---|
| Timing/Async | Fails intermittently everywhere | --repeat-each=20 reproduces locally |
| Test Isolation | Fails in suite, passes alone | --workers=1 --grep "test name" passes |
| Environment | Fails in CI, passes locally | Compare CI vs local screenshots/traces |
| Infrastructure | Random, no pattern | Error references browser internals |
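The table above can be read as a small decision procedure. The following is a minimal sketch, assuming the four diagnoses are recorded as booleans. The `Observation` type, its field names, the `categorize` function, and the order in which the checks run are all illustrative assumptions, not part of this skill or of Playwright.

```typescript
// Hypothetical record of what the diagnosis steps showed for one test.
type Observation = {
  errorMentionsBrowserInternals: boolean; // error references browser internals
  failsInCiPassesLocally: boolean;        // fails in CI, passes locally
  failsInSuitePassesAlone: boolean;       // --workers=1 --grep "test name" passes
  reproducesWithRepeatEach: boolean;      // --repeat-each=20 reproduces locally
};

type Category =
  | "Timing/Async"
  | "Test Isolation"
  | "Environment"
  | "Infrastructure";

// Maps observations to a category. The check order is an assumption:
// the most specific symptoms are tested first.
function categorize(o: Observation): Category {
  if (o.errorMentionsBrowserInternals) return "Infrastructure";
  if (o.failsInCiPassesLocally) return "Environment";
  if (o.failsInSuitePassesAlone) return "Test Isolation";
  if (o.reproducesWithRepeatEach) return "Timing/Async";
  // Random failures with no pattern fall back to Infrastructure, per the table.
  return "Infrastructure";
}
```

A test that fails only inside the full suite would be passed in with `failsInSuitePassesAlone: true` and come back as "Test Isolation", which points at the isolation fixes in step 4.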
4. Apply Targeted Fix
Timing/Async:
- Replace waitForTimeout() with web-first assertions
- Add missing await to Playwright calls
- Wait for specific network responses before asserting
- Use toBeVisible() before interacting with elements
Test Isolation:
- Remove shared mutable state between tests
- Create test data per-test via API or fixtures
- Use unique identifiers (timestamps, random strings) for test data
- Check for database state leaks
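One way to get unique identifiers for test data is a small helper that combines a timestamp, a per-process counter, and a random suffix. This is a sketch only: `uniqueId` and its format are hypothetical and not part of this skill or of Playwright.

```typescript
// Hypothetical helper: returns an identifier that differs on every call,
// so tests running in parallel do not collide on shared records.
let uniqueIdCounter = 0;

function uniqueId(prefix: string): string {
  const timestamp = Date.now().toString(36);             // changes between runs
  const random = Math.random().toString(36).slice(2, 8); // differs across worker processes
  uniqueIdCounter += 1;                                  // differs within one process
  return `${prefix}-${timestamp}-${uniqueIdCounter}-${random}`;
}

// Example: a per-test email address for a user created via API or fixture.
const testEmail = `${uniqueId("user")}@example.test`;
```

The counter guarantees distinct values within one worker even when two calls land in the same millisecond. The random suffix makes collisions between workers unlikely.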
Environment:
- Match viewport sizes between local and CI
- Account for font rendering differences in screenshots
- Use docker locally to match CI environment
- Check for timezone-dependent assertions
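A timezone-dependent assertion usually comes from formatting a date in local time, which can differ between a laptop and a CI runner. As a minimal sketch, the expected value can be computed in UTC instead. `formatUtcDate` is a hypothetical helper name.

```typescript
// Formats a Date as YYYY-MM-DD in UTC. toISOString() always reports UTC,
// so the result does not depend on the machine's timezone setting.
function formatUtcDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// 23:30 UTC is already the next day in timezones east of UTC,
// so a local-time format of this instant could vary by machine.
const lateEvening = new Date("2024-01-01T23:30:00Z");
const expectedLabel = formatUtcDate(lateEvening); // "2024-01-01"
```

This only helps when the application under test also renders the date in UTC. If the app renders local time, pinning the timezone in the test environment is the alternative.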
Infrastructure:
- Increase timeout for slow CI runners
- Add retries in CI config (retries: 2)
- Check for browser OOM (reduce parallel workers)
- Ensure browser dependencies are installed
5. Verify the Fix
Run the test 10 times to confirm stability:
npx playwright test <file> --repeat-each=10 --reporter=list
All 10 must pass. If any fail, go back to step 3.
6. Prevent Recurrence
Suggest:
- Add retries: 2 to the CI config if not already set
- Enable trace: 'on-first-retry' in config
- Add the fix pattern to project's test conventions doc
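The two config suggestions above might look like this in a Playwright config file. This is a sketch, assuming the project uses playwright.config.ts and that the CI runner sets a `CI` environment variable. Both are assumptions to adjust for the project's setup.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry twice on CI only (assumes the runner sets CI); keep local runs strict.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace only when a test is retried after a first failure.
    trace: 'on-first-retry',
  },
});
```

Keeping retries at 0 locally means a flaky test still fails visibly during development, while CI retries collect a trace for the first retry.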
Output
- Root cause category and specific issue
- The fix applied (with diff)
- Verification result (10/10 passes)
- Prevention recommendation