firefrost-gaming/claude-skills-reference

Author	SHA1	Message	Date
Reza Rezvani	7533d34978	chore: post-merge sync — statistical-analyst plugin, spec-to-repo skill, docs update New: - feat(product-team): add spec-to-repo skill — natural-language spec to runnable repo 1 Python tool (validate_project.py), 2 references, 3 concrete examples - feat(engineering): add statistical-analyst plugin.json + marketplace entry (32 total) Sync: - Update all counts to 233 skills, 305 tools, 424 refs, 25 agents, 22 commands - Fix engineering-advanced plugin description: 42 → 43 skills - Sync Codex (194 skills), Gemini (282 items), MkDocs (281 pages → 313 HTML) - Update CLAUDE.md, README.md, docs/index.md, docs/getting-started.md, mkdocs.yml - Expand product-analytics SKILL.md + add JSON output to metrics_calculator.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 12:09:55 +02:00
Reza Rezvani	7c2564845a	refactor(engineering): move statistical-analyst to engineering/, fix cross-refs - Move from data-analysis/ to engineering/ - Fix 5 cross-references to use correct domain paths - Fix Python 3.9 compat in sample_size_calculator.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 11:18:12 +02:00
Reza Rezvani	5710a7b763	chore: post-merge sync — plugins, audits, docs, cross-platform indexes New skills integrated: - engineering/behuman, code-tour, demo-video, data-quality-auditor Plugins & marketplace: - Add plugin.json for code-tour, demo-video, data-quality-auditor - Add all 3 to marketplace.json (31 total plugins) - Update marketplace counts to 248 skills, 332 tools, 460 refs Skill fixes: - Move data-quality-auditor from data-analysis/ to engineering/ - Fix cross-refs: code-tour, demo-video, data-quality-auditor - Add evals.json for code-tour (5 scenarios) and demo-video (4 scenarios) - demo-video: add output artifacts, prereqs check, references extraction - code-tour: add default persona, parallel discovery, trivial repo guidance - Fix Python 3.9 compat (from __future__ import annotations) product-analytics audit fixes: - Expand SKILL.md from 82 to 147 lines (anti-patterns, cross-refs, examples) - Add --format json to all metrics_calculator.py subcommands - Add error handling (FileNotFoundError, KeyError) Docs & indexes: - Update CLAUDE.md, README.md, docs/index.md, docs/getting-started.md counts - Sync Codex (192 skills) and Gemini (280 items) indexes - Regenerate MkDocs pages (279 pages, 311 HTML) - Add 3 new nav entries to mkdocs.yml - Update mkdocs.yml site_description Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 02:05:19 +02:00
Alireza Rezvani	6c89d8f591	Merge pull request #475 from vaddisrinivas/add-framecraft feat(engineering): add demo-video skill	2026-04-04 01:18:10 +02:00
Srinivas Vaddi	8be3cd56e8	feat(engineering): add code-tour skill Add a skill for creating CodeTour .tour files — persona-targeted, step-by-step walkthroughs that link to real files and line numbers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 02:43:59 -04:00
Srinivas Vaddi	01ab7433ac	feat(engineering): add demo-video skill Add a skill for creating polished demo videos from screenshots and scene descriptions. Orchestrates playwright, ffmpeg, and edge-tts MCPs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 02:43:44 -04:00
Reza Rezvani	baca2e61ac	feat(engineering): add behuman skill — Self-Mirror consciousness loop Based on issue #464 submission by voidborne-d. Enhanced with English-only content (removed all Chinese), anti-patterns section, cross-references, plugin.json, convention-compliant frontmatter, and English eval scenarios. behuman (193 lines + reference + 8 eval scenarios): - Self-Mirror loop: instinctive response → reflection → conscious revision - Show mode (2.5-3x tokens) and quiet mode (1.5-2x tokens) - 3 English examples: emotional support, life advice, personal writing - Based on Lacan's Mirror Stage + Kahneman's Dual Process Theory - Zero dependencies — pure prompt technique Co-Authored-By: voidborne-d <voidborne-d@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 01:54:01 +02:00
Alireza Rezvani	1a06eacbb8	Merge pull request #430 from xingzihai/feat/security-dimension-v2 feat(skill-tester): add Security dimension to quality scoring system	2026-03-31 15:25:43 +02:00
Reza Rezvani	d02cc1c9b2	feat(plugins): add standalone plugin.json for 4 new community skills Each skill is now individually installable: - llm-cost-optimizer - prompt-governance - business-investment-advisor - video-content-strategist Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 12:36:16 +02:00
Reza Rezvani	3cd885aa33	chore: sync indexes, update marketplace and docs for 4 new community skills - Codex CLI: 182 skills, 4 new symlinks - Gemini CLI: 274 items, 4 new - engineering plugin.json: 36→38 - finance plugin.json: 2→3 - marketing plugin.json: 43→44 - marketplace.json: updated 3 bundle descriptions - mkdocs.yml: 4 new nav entries - docs/index.md + getting-started.md: domain counts updated - 273 docs pages generated Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 12:32:57 +02:00
Reza Rezvani	1f374e7492	feat: add 4 community skills — llm-cost-optimizer, prompt-governance, business-investment-advisor, video-content-strategist Based on PR #448 by chad848. Enhanced with frontmatter normalization, anti-patterns sections, ghost script reference removal, and broken cross-reference fixes. Automotive-electrical-engineer excluded (out of scope for software/AI skills library). llm-cost-optimizer (engineering/, 192 lines): - Reduce LLM API spend 40-80% via model routing, caching, compression - 3 modes: Cost Audit, Optimize, Design Cost-Efficient Architecture prompt-governance (engineering/, 224 lines): - Production prompt lifecycle: versioning, eval pipelines, A/B testing - Distinct from senior-prompt-engineer (writing) — this is ops/governance business-investment-advisor (finance/, 220 lines): - Capital allocation: ROI, NPV, IRR, payback, build-vs-buy, lease-vs-buy - NOT securities advice — business capex decisions only video-content-strategist (marketing-skill/, 218 lines): - YouTube strategy, video scripting, short-form pipelines, content atomization - Fills video gap in 44-skill marketing pod Co-Authored-By: chad848 <chad848@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 11:43:03 +02:00
Claude	6fa92054bd	release(v2.2.0): 223 skills, security suite, self-eval, full docs update - Add 8 new skills: 6 security (adversarial-reviewer, ai-security, cloud-security, incident-response, red-team, threat-detection), self-eval, snowflake-development - Update all counts: 223 skills, 298 tools, 416 references, 23 agents, 22 commands - Update CHANGELOG.md with v2.2.0 entry - Update all plugin.json versions to 2.2.0 - Update CLAUDE.md, README.md, docs/index.md, docs/getting-started.md, mkdocs.yml - Verify MkDocs build (301 pages), Codex/Gemini sync, all new scripts pass --help https://claude.ai/code/session_011CHSDjqWBPRcEJ3oJrAUHS	2026-03-31 05:55:51 +00:00
Claude	c8520885f9	feat: full ecosystem integration for PR #435 (5 security skills) and PR #436 (self-eval) - Updated domain plugin.json counts (engineering-team: 36, engineering: 36) - Added 6 new skills to mkdocs.yml navigation - Updated engineering-team/CLAUDE.md with security skills section - Generated docs pages for all 6 new skills - Synced Codex + Gemini indexes and symlinks - Ran cross-platform conversion (Cursor, Aider, Windsurf, KiloCode, OpenCode, Augment, Antigravity) https://claude.ai/code/session_01XY4i7SR4BHLWJpdjwGnNLG	2026-03-30 19:11:46 +00:00
Ethan Kreloff	c48c92aa96	feat(engineering): add self-eval skill Adds self-eval skill for honest AI work quality evaluation. Uses two-axis scoring (ambition x execution), mandatory devil's advocate reasoning, and cross-session anti-inflation detection via .self-eval-scores.jsonl persistence.	2026-03-30 21:07:45 +02:00
xingzihai	e0e683ee5e	fix(skill-tester): make Security dimension opt-in with --include-security flag - Add --include-security flag to quality_scorer.py - Default: 4 dimensions × 25% (backward compatible) - With --include-security: 5 dimensions × 20% - Update tier recommendation logic for optional Security - Update documentation to reflect opt-in behavior This addresses the breaking change concern from PR review: the weight change from 25% to 20% would affect all existing audit baselines. The new opt-in approach preserves backward compatibility.	2026-03-27 10:05:12 +00:00
xingzihai	2f92a1dfcb	feat(skill-tester): add Security dimension to quality scoring system - Add SecurityScorer module (605 lines) with comprehensive security assessment - Add 4 security scoring components: - Sensitive data exposure prevention (hardcoded credentials detection) - Safe file operations (path traversal prevention) - Command injection prevention (shell=True, eval, exec detection) - Input validation quality (argparse, error handling, type checking) - Add 53 unit tests with 850 lines of test code - Update quality_scorer.py to integrate Security dimension (20% weight) - Rebalance all dimensions from 25% to 20% (5 dimensions total) - Update tier requirements: - POWERFUL: Security ≥70 - STANDARD: Security ≥50 - BASIC: Security ≥40 - Update documentation (quality-scoring-rubric.md, tier-requirements-matrix.md) - Version bump to 2.0.0 This addresses the feedback from PR #420 by providing a focused, well-tested implementation of the Security dimension without bundling other changes.	2026-03-26 13:25:27 +00:00
Reza Rezvani	86fc905e97	chore: sync cross-platform indexes, regenerate docs, fix plugin.json counts - Codex CLI: 174 skills synced, 11 new symlinks - Gemini CLI: 262 items synced, 11 new - engineering plugin.json: 33 → 35 skills - engineering-team plugin.json: 28 → 29 skills - Docs regenerated: 261 pages (214 skills + 25 agents + 22 commands) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 15:42:39 +01:00
Reza Rezvani	f352e8cdd0	fix: trim 3 SKILL.md files to comply with Anthropic 500-line limit Per Anthropic docs: "Keep SKILL.md under 500 lines. Move detailed reference material to separate files." - browser-automation: 564 → 266 lines (moved examples to references/) - spec-driven-workflow: 586 → 333 lines (moved full spec example to references/) - security-pen-testing: 850 → 306 lines (condensed OWASP/attack details, moved to references/) No content deleted — all moved to existing reference files with pointers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 15:20:47 +01:00
Reza Rezvani	268061b0fd	fix: move browser-automation and spec-driven-workflow scripts to scripts/ directory Validator expects scripts in scripts/ subdirectory, not at skill root. Moved 6 scripts to match repo convention. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 14:53:14 +01:00
Reza Rezvani	43bb5c4d59	Merge branch 'feature/sprint-phase-3-gaps' into dev # Conflicts: # docs/skills/engineering-team/index.md # docs/skills/engineering/index.md # mkdocs.yml	2026-03-25 14:23:21 +01:00
Alireza Rezvani	c1b2aacb74	Merge pull request #408 from alirezarezvani/feature/sprint-improvements improve(engineering): enhance 5 existing skills — tdd-guide, env-secrets-manager, senior-secops, database-designer, senior-devops	2026-03-25 14:22:04 +01:00
Alireza Rezvani	ea2b33ab52	Merge pull request #407 from alirezarezvani/feature/sprint-phase-2-cloud feat(engineering-team): add azure-cloud-architect, security-pen-testing; extend terraform-patterns	2026-03-25 14:22:01 +01:00
Reza Rezvani	87f3a007c9	feat(engineering,ra-qm): add secrets-vault-manager, sql-database-assistant, gcp-cloud-architect, soc2-compliance secrets-vault-manager (403-line SKILL.md, 3 scripts, 3 references): - HashiCorp Vault, AWS SM, Azure KV, GCP SM integration - Secret rotation, dynamic secrets, audit logging, emergency procedures sql-database-assistant (457-line SKILL.md, 3 scripts, 3 references): - Query optimization, migration generation, schema exploration - Multi-DB support (PostgreSQL, MySQL, SQLite, SQL Server) - ORM patterns (Prisma, Drizzle, TypeORM, SQLAlchemy) gcp-cloud-architect (418-line SKILL.md, 3 scripts, 3 references): - 6-step workflow mirroring aws-solution-architect for GCP - Cloud Run, GKE, BigQuery, Cloud Functions, cost optimization - Completes cloud trifecta (AWS + Azure + GCP) soc2-compliance (417-line SKILL.md, 3 scripts, 3 references): - SOC 2 Type I & II preparation, Trust Service Criteria mapping - Control matrix generation, evidence tracking, gap analysis - First SOC 2 skill in ra-qm-team (joins GDPR, ISO 27001, ISO 13485) All 12 scripts pass --help. Docs generated, mkdocs.yml nav updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 14:05:11 +01:00
Reza Rezvani	67e2bfabfa	improve(engineering): enhance tdd-guide, env-secrets-manager, senior-secops, database-designer, senior-devops tdd-guide (164 → 412 lines): - Spec-first workflow, per-language examples (TS/Python/Go) - Bounded autonomy rules, property-based testing, mutation testing env-secrets-manager (78 → 260 lines): - Cloud secret store integration (Vault, AWS SM, Azure KV, GCP SM) - Secret rotation workflow, CI/CD injection, pre-commit detection, audit logging senior-secops (422 → 505 lines): - OWASP Top 10 quick-check, secret scanning tools comparison - Supply chain security (SBOM, Sigstore, SLSA levels) database-designer (66 → 289 lines): - Query patterns (JOINs, CTEs, window functions), migration patterns - Performance optimization (indexing, EXPLAIN, N+1, connection pooling) - Multi-DB decision matrix, sharding & replication senior-devops (275 → 323 lines): - Multi-cloud cross-references (AWS, Azure, GCP architects) - Cloud-agnostic IaC section (Terraform/OpenTofu, Pulumi) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 13:49:25 +01:00
Reza Rezvani	2056ba251f	feat(engineering-team): add azure-cloud-architect, security-pen-testing; extend terraform-patterns azure-cloud-architect (451-line SKILL.md, 3 scripts, 3 references): - 6-step workflow mirroring aws-solution-architect for Azure - Bicep/ARM templates, AKS, Functions, Cosmos DB, cost optimization - architecture_designer.py, cost_optimizer.py, bicep_generator.py security-pen-testing (850-line SKILL.md, 3 scripts, 3 references): - OWASP Top 10 systematic audit, offensive security testing - XSS/SQLi/SSRF/IDOR detection, secret scanning, API security - vulnerability_scanner.py, dependency_auditor.py, pentest_report_generator.py - Responsible disclosure workflow included terraform-patterns extended (487 → 740 lines): - Multi-cloud provider configuration - OpenTofu compatibility notes - Infracost integration for PR cost estimation - Import existing infrastructure patterns - Terragrunt DRY multi-environment patterns Updated engineering-team plugin.json (26 → 28 skills). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 13:32:22 +01:00
Reza Rezvani	97952ccbee	feat(engineering): add browser-automation and spec-driven-workflow skills browser-automation (564-line SKILL.md, 3 scripts, 3 references): - Web scraping, form filling, screenshot capture, data extraction - Anti-detection patterns, cookie/session management, dynamic content - scraping_toolkit.py, form_automation_builder.py, anti_detection_checker.py - NOT testing (that's playwright-pro) — this is automation & scraping spec-driven-workflow (586-line SKILL.md, 3 scripts, 3 references): - Spec-first development: write spec BEFORE code - Bounded autonomy rules, 6-phase workflow, self-review checklist - spec_generator.py, spec_validator.py, test_extractor.py - Pairs with tdd-guide for red-green-refactor after spec Updated engineering plugin.json (31 → 33 skills). Added both to mkdocs.yml nav and generated docs pages. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 12:57:18 +01:00
Reza Rezvani	ce1d7925cc	feat(engineering): integrate focused-fix skill — docs, command, agent, marketplace - Normalize SKILL.md frontmatter to repo standard (remove non-standard license, metadata.* fields; inline description) - Generate docs page (docs/skills/engineering/focused-fix.md) - Add to mkdocs.yml nav (skills + commands) - Create /focused-fix slash command (commands/ + .claude/commands/) - Add to cs-senior-engineer agent (skill integration + new workflow #4) - Update marketplace.json and plugin.json descriptions (30 → 31 skills) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 13:59:30 +01:00
Reza Rezvani	e065a8c4d0	feat(engineering): add focused-fix — deep-dive feature repair skill 5-phase protocol (SCOPE → TRACE → DIAGNOSE → FIX → VERIFY) for systematically repairing entire features/modules. Includes bidirectional dependency tracing, root-cause confirmation, risk labeling, 3-strike architecture escalation, and phase-skip guards. Cherry-picked from PR #388 (avinashchby). Co-Authored-By: avinashchby <24788443+avinashchby@users.noreply.github.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 13:14:51 +01:00
Reza Rezvani	193f71e56f	fix: correct broken install paths, improve skill descriptions, standardize counts Cherry-picked from PR #387 (ssmanji89) and rebased on dev. - Fix 6 wrong PM skill install paths in INSTALLATION.md - Fix content-creator → content-production script paths - Fix senior-devops CLI flags to match actual deployment_manager.py - Replace vague descriptions with trigger-oriented "Use when..." on 7 engineering skills - Standardize skill count 170 → 205+, finance 1 → 2, version 2.1.1 → 2.1.2 - Use python3 instead of python for macOS compatibility - Remove broken integrations/ link in README.md Excluded: *.zip gitignore wildcard (overrides intentional design decision) Co-Authored-By: sully <ssmanji89@gmail.com> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 11:57:40 +01:00
Reza Rezvani	ea04644987	fix(plugins): change author from string to object in plugin.json Claude Code plugin manifest requires author as {"name": "..."}, not a plain string. Fixes install error: "author: Invalid input: expected object, received string" Affected: agenthub, a11y-audit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 09:02:28 +01:00
Reza Rezvani	4709662631	feat(marketplace): add 6 missing standalone plugins — total 22→28 Added to marketplace: - a11y-audit (WCAG 2.2 accessibility audit) - executive-mentor (adversarial thinking partner) - docker-development (Dockerfile, compose, multi-stage) - helm-chart-builder (Helm chart scaffolding) - terraform-patterns (IaC module design) - research-summarizer (structured research synthesis) Also fixed version 1.0.0 → 2.1.2 on 4 plugin.json files (executive-mentor, docker-development, helm-chart-builder, research-summarizer) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 09:01:31 +01:00
Reza Rezvani	6453a29ecf	fix(security-auditor): reduce false positives — whitelist plugin dirs, remove 'token' from exfil pattern - Add .claude-plugin, .codex, .gemini to hidden file allowlist (FS-HIDDEN) These are required plugin infrastructure directories, not secrets. - Remove 'tokens?' from PROMPT-EXFIL regex — 'access token' is a standard technical term in auth reference docs, causing false positives on every skill that documents JWT/OAuth flows (e.g. saas-scaffolder auth-billing-guide) - Remaining PROMPT-EXFIL patterns (credentials, secrets, api_keys, .env, .ssh, .aws, ~/home, /etc) are specific enough to catch real threats Fixes: CI security audit failure on PR #370 (7 CRITICAL false positives) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-17 15:43:37 +01:00
Reza Rezvani	2f57ef8948	feat(agenthub): add AgentHub plugin with cross-domain examples, SEO optimization, and docs site fixes - AgentHub: 13 files updated with non-engineering examples (content drafts, research, strategy) — engineering stays primary, cross-domain secondary - AgentHub: 7 slash commands, 5 Python scripts, 3 references, 1 agent, dry_run.py validation (57 checks) - Marketplace: agenthub entry added with cross-domain keywords, engineering POWERFUL updated (25→30), product (12→13), counts synced across all configs - SEO: generate-docs.py now produces keyword-rich <title> tags and meta descriptions using SKILL.md frontmatter — "Claude Code Skills" in site_name propagates to all 276 HTML pages - SEO: per-domain title suffixes (Agent Skill for Codex & OpenClaw, etc.), slug-as-title cleanup, domain label stripping from titles - Broken links: 141→0 warnings — new rewrite_skill_internal_links() converts references/, scripts/, assets/ links to GitHub source URLs; skills/index.md phantom slugs fixed (6 marketing, 7 RA/QM) - Counts synced: 204 skills, 266 tools, 382 refs, 16 agents, 17 commands, 21 plugins — consistent across CLAUDE.md, README.md, docs/index.md, marketplace.json, getting-started.md, mkdocs.yml - Platform sync: Codex 163 skills, Gemini 246 items, OpenClaw compatible Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-17 12:10:46 +01:00
Reza Rezvani	de724ae5c4	fix(terraform-patterns): align plugin.json version to repo versioning (2.1.2) Review gate flagged version 1.0.0 as non-compliant with CLAUDE.md rule: "Version follows repo versioning." Updated to 2.1.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 16:10:07 +01:00
Leo	dac49ee9f9	feat(skills): add terraform-patterns agent skill Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 23:29:01 +01:00
Leo	0c31067556	feat(skills): add helm-chart-builder agent skill Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 23:28:54 +01:00
Leo	5aaf3e5e0b	seo: optimize all 9 pack descriptions — add Gemini CLI, Cursor, OpenClaw keywords + 'agent skill/plugin' framing Consistent format: '<N> <domain> agent skills and plugins for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw' Updated both SKILL.md frontmatter and plugin.json for each pack.	2026-03-15 23:22:14 +01:00
Leo	bf1473b1be	feat(skills): add research-summarizer and docker-development agent skills research-summarizer (product-team/): - Structured research summarization for papers, articles, reports - Slash commands: /research:summarize, /research:compare, /research:cite - Python tools: extract_citations.py (5 citation formats), format_summary.py (6 templates) - References: summary-templates.md, citation-formats.md docker-development (engineering/): - Dockerfile optimization, compose orchestration, container security - Slash commands: /docker:optimize, /docker:compose, /docker:security - Python tools: dockerfile_analyzer.py (15 rules), compose_validator.py (best practices) - References: dockerfile-best-practices.md, compose-patterns.md Both skills include .claude-plugin/plugin.json and follow POWERFUL tier conventions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-15 22:47:16 +01:00
Reza Rezvani	7911cf957a	feat(autoresearch-agent): fix critical bugs, package as plugin with 5 slash commands Bug fixes (run_experiment.py): - Fix broken revert logic: was saving HEAD as pre_commit (no-op revert), now uses git reset --hard HEAD~1 for correct rollback - Remove broken --loop mode (agent IS the loop, script handles one iteration) - Fix shell injection: all git commands use subprocess list form - Replace shell tail with Python file read Bug fixes (other scripts): - setup_experiment.py: fix shell injection in git branch creation, remove dead --skip-baseline flag, fix evaluator docstring parsing - log_results.py: fix 6 falsy-zero bugs (baseline=0 treated as None), add domain_filter to CSV/markdown export, move import time to top - evaluators: add FileNotFoundError handling, fix output format mismatch in llm_judge_copy, add peak_kb on macOS, add ValueError handling Plugin packaging (NEW): - plugin.json, settings.json, CLAUDE.md for plugin registry - 5 slash commands: /ar:setup, /ar:run, /ar:loop, /ar:status, /ar:resume - /ar:loop supports user-selected intervals (10m, 1h, daily, weekly, monthly) - experiment-runner agent for autonomous loop iterations - Registered in marketplace.json as plugin #20 SKILL.md rewrite: - Replace ambiguous "Loop Protocol" with clear "Agent Protocol" - Add results.tsv format spec, strategy escalation, self-improvement - Replace "NEVER STOP" with resumable stopping logic Docs & sync: - Codex (157 skills), Gemini (229 items), convert.sh all pick up the skill - 6 new MkDocs pages, mkdocs.yml nav updated - Counts updated: 17 agents, 22 slash commands Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 14:38:59 +01:00
Leo	12591282da	refactor: autoresearch-agent v2.0 — multi-experiment, multi-domain, real-world evaluators Major rewrite based on deep study of Karpathy's autoresearch repo. Architecture changes: - Multi-experiment support: .autoresearch/{domain}/{name}/ structure - Domain categories: engineering, marketing, content, prompts, custom - Project-level (git-tracked, shareable) or user-level (~/.autoresearch/) scope - User chooses scope during setup, not installation New evaluators (8 ready-to-use): - Free: benchmark_speed, benchmark_size, test_pass_rate, build_speed, memory_usage - LLM judge (uses existing subscription): llm_judge_content, llm_judge_prompt, llm_judge_copy - LLM judges call user's CLI tool (claude/codex/gemini) — no extra API keys needed Script improvements: - setup_experiment.py: --domain, --scope, --evaluator, --list, --list-evaluators - run_experiment.py: --experiment domain/name, --resume, --loop, --single - log_results.py: --dashboard, --domain, --format csv|markdown|terminal, --output Results export: - Terminal (default), CSV, and Markdown formats - Per-experiment, per-domain, or cross-experiment dashboard view SKILL.md rewritten: - Clear activation triggers (when the skill should activate) - Practical examples for each domain - Evaluator documentation with cost transparency - Simplified loop protocol matching Karpathy's original philosophy	2026-03-13 08:22:29 +01:00
Leo	a799d8bdb8	feat: add autoresearch-agent — autonomous experiment loop for ML, prompt, code & skill optimization Inspired by Karpathy's autoresearch. The agent modifies a target file, runs a fixed evaluation, keeps improvements (git commit), discards failures (git reset), and loops indefinitely — no human in the loop. Includes: - SKILL.md with setup wizard, 4 domain configs, experiment loop protocol - 3 stdlib-only Python scripts (setup, run, log — 687 lines) - Reference docs: experiment domains guide, program.md templates Domains: ML training (val_bpb), prompt engineering (eval_score), code performance (p50_ms), agent skill optimization (pass_rate). Cherry-picked from feat/autoresearch-agent and rebased onto dev. Fixes: timeout inconsistency (2x→2.5x), results.tsv tracking clarity, zero-metric edge case, installation section aligned with multi-tool support.	2026-03-13 07:21:44 +01:00
Leo	5a34d661aa	fix(engineering): address Claude Code review findings - performance-profiler: add Quick Start section with script usage examples - interview-system-designer: fix references to match actual filenames	2026-03-11 20:46:48 +01:00
Leo	93eee35b83	fix(engineering): improve interview-system-designer - add scripts + extract references	2026-03-11 20:25:11 +01:00
Leo	dc61de798d	fix(engineering): improve runbook-generator - add scripts + extract references	2026-03-11 20:24:23 +01:00
Leo	6f55bc4fd6	fix(engineering): improve env-secrets-manager - add scripts + extract references	2026-03-11 20:23:50 +01:00
Leo	bafb155334	fix(engineering): improve agent-workflow-designer - add scripts + extract references	2026-03-11 20:23:01 +01:00
Leo	9e590c81fb	fix(engineering): improve monorepo-navigator - add scripts + extract references	2026-03-11 20:22:16 +01:00
Leo	abab3b528e	fix(engineering): improve codebase-onboarding - add scripts + extract references	2026-03-11 20:21:34 +01:00
Leo	60ad9d3873	fix(engineering): improve performance-profiler - add scripts + extract references	2026-03-11 20:20:26 +01:00
Leo	a851de0f94	fix(security): add disclaimers to sample code and scaffolding templates - payment_processor.py: add disclaimer header + replace realistic-looking keys with EXAMPLE_NOT_REAL - project_scaffolder.py: add SCAFFOLDING PLACEHOLDER comments to generated secrets - pipeline_orchestrator.py: no change needed (compile() used for syntax validation only)	2026-03-11 20:18:27 +01:00
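
The opt-in weighting described in commit e0e683ee5e (four dimensions at 25% each by default, five at 20% each with --include-security) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in quality_scorer.py: the function name `overall_score` and the dimension names are hypothetical, since the log does not name the four default dimensions or show how the scorer combines them.

```python
from __future__ import annotations  # keeps these annotations valid on Python 3.9


def overall_score(base_scores: dict[str, float], security: float | None = None) -> float:
    """Equal-weight average of dimension scores (hypothetical helper).

    With four base dimensions each one carries 25%. Passing a security
    score adds a fifth dimension, so each one carries 20%.
    """
    scores = list(base_scores.values())
    if security is not None:  # explicit None check so a score of 0 still counts
        scores.append(security)
    if not scores:
        raise ValueError("at least one dimension score is required")
    return sum(scores) / len(scores)


# Placeholder dimension names and values; the real ones are not given in the log.
baseline = {"dim_a": 80.0, "dim_b": 80.0, "dim_c": 80.0, "dim_d": 80.0}
default_total = overall_score(baseline)                # 4 x 25%
opt_in_total = overall_score(baseline, security=30.0)  # 5 x 20%
```

With these made-up inputs the default total is 80.0 and the opt-in total is 70.0, which illustrates the concern raised in the commit message: applying the 20% weights unconditionally would shift every existing audit baseline, while the opt-in flag leaves the default result unchanged.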

80 Commits