chore: post-merge sync — statistical-analyst plugin, spec-to-repo skill, docs update

New:
- feat(product-team): add spec-to-repo skill — natural-language spec to runnable repo
  1 Python tool (validate_project.py), 2 references, 3 concrete examples
- feat(engineering): add statistical-analyst plugin.json + marketplace entry (32 total)

Sync:
- Update all counts to 233 skills, 305 tools, 424 refs, 25 agents, 22 commands
- Fix engineering-advanced plugin description: 42 → 43 skills
- Sync Codex (194 skills), Gemini (282 items), MkDocs (281 pages → 313 HTML)
- Update CLAUDE.md, README.md, docs/index.md, docs/getting-started.md, mkdocs.yml
- Expand product-analytics SKILL.md + add JSON output to metrics_calculator.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reza Rezvani
2026-04-07 12:09:38 +02:00
parent 986fa1f581
commit 7533d34978
21 changed files with 1602 additions and 53 deletions

View File

@@ -1,6 +1,6 @@
---
title: Install Agent Skills — Codex, Gemini CLI, OpenClaw Setup
description: "How to install 248 Claude Code skills and agent plugins for 11 AI coding tools. Step-by-step setup for Claude Code, OpenAI Codex, Gemini CLI, OpenClaw, Cursor, Aider, Windsurf, and more."
description: "How to install 233 Claude Code skills and agent plugins for 11 AI coding tools. Step-by-step setup for Claude Code, OpenAI Codex, Gemini CLI, OpenClaw, Cursor, Aider, Windsurf, and more."
---
# Getting Started
@@ -141,9 +141,9 @@ Choose your platform and follow the steps:
| Bundle | Install Command | Skills |
|--------|----------------|--------|
| **Engineering Core** | `/plugin install engineering-skills@claude-code-skills` | 37 |
| **Engineering POWERFUL** | `/plugin install engineering-advanced-skills@claude-code-skills` | 42 |
| **Engineering POWERFUL** | `/plugin install engineering-advanced-skills@claude-code-skills` | 43 |
| **Product** | `/plugin install product-skills@claude-code-skills` | 15 |
| **Marketing** | `/plugin install marketing-skills@claude-code-skills` | 45 |
| **Marketing** | `/plugin install marketing-skills@claude-code-skills` | 44 |
| **Regulatory & Quality** | `/plugin install ra-qm-skills@claude-code-skills` | 14 |
| **Project Management** | `/plugin install pm-skills@claude-code-skills` | 9 |
| **C-Level Advisory** | `/plugin install c-level-skills@claude-code-skills` | 34 |
@@ -182,7 +182,7 @@ AI-augmented development. Optimize for SEO.
## Python Tools
All 332 tools use the standard library only — zero pip installs, all verified.
All 305 tools use the standard library only — zero pip installs, all verified.
```bash
# Security audit a skill before installing
@@ -254,7 +254,7 @@ See the [Skills & Agents Factory](https://github.com/alirezarezvani/claude-code-
Yes. Run `./scripts/gemini-install.sh` to set up skills for Gemini CLI. A sync script (`scripts/sync-gemini-skills.py`) generates the skills index automatically.
??? question "Does this work with Cursor, Windsurf, Aider, or other tools?"
Yes. All 248 skills can be converted to native formats for Cursor, Aider, Kilo Code, Windsurf, OpenCode, Augment, and Antigravity. Run `./scripts/convert.sh --tool all` and then install with `./scripts/install.sh --tool <name>`. See [Multi-Tool Integrations](integrations.md) for details.
Yes. All 233 skills can be converted to native formats for Cursor, Aider, Kilo Code, Windsurf, OpenCode, Augment, and Antigravity. Run `./scripts/convert.sh --tool all` and then install with `./scripts/install.sh --tool <name>`. See [Multi-Tool Integrations](integrations.md) for details.
??? question "Can I use Agent Skills in ChatGPT?"
Yes. We have [6 Custom GPTs](custom-gpts.md) that bring Agent Skills directly into ChatGPT — no installation needed. Just click and start chatting.

View File

@@ -1,6 +1,6 @@
---
title: 248 Agent Skills for Codex, Gemini CLI & OpenClaw
description: "248 production-ready Claude Code skills and agent plugins for 11 AI coding tools. Engineering, product, marketing, compliance, and finance agent skills for Claude Code, OpenAI Codex, Gemini CLI, Cursor, and OpenClaw."
title: 233 Agent Skills for Codex, Gemini CLI & OpenClaw
description: "233 production-ready Claude Code skills and agent plugins for 11 AI coding tools. Engineering, product, marketing, compliance, and finance agent skills for Claude Code, OpenAI Codex, Gemini CLI, Cursor, and OpenClaw."
hide:
- toc
- edit
@@ -14,7 +14,7 @@ hide:
# Agent Skills
248 production-ready skills, 23 agents, 3 personas, and an orchestration protocol for AI coding tools.
233 production-ready skills, 25 agents, 3 personas, and an orchestration protocol for AI coding tools.
{ .hero-subtitle }
[Get Started](getting-started.md){ .md-button .md-button--primary }
@@ -49,7 +49,7 @@ hide:
<div class="grid cards" markdown>
- :material-toolbox:{ .lg .middle } **248 Skills**
- :material-toolbox:{ .lg .middle } **233 Skills**
---
@@ -57,7 +57,7 @@ hide:
[:octicons-arrow-right-24: Browse skills](skills/)
- :material-robot:{ .lg .middle } **23 Agents**
- :material-robot:{ .lg .middle } **25 Agents**
---
@@ -81,7 +81,7 @@ hide:
[:octicons-arrow-right-24: Learn patterns](orchestration.md)
- :material-language-python:{ .lg .middle } **332 Python Tools**
- :material-language-python:{ .lg .middle } **305 Python Tools**
---
@@ -143,7 +143,7 @@ hide:
Agent designer, RAG architect, database designer, CI/CD builder, MCP server builder, security auditor, tech debt tracker
[:octicons-arrow-right-24: 42 skills](skills/engineering/)
[:octicons-arrow-right-24: 43 skills](skills/engineering/)
- :material-bullseye-arrow:{ .lg .middle } **Product**
@@ -151,7 +151,7 @@ hide:
Product manager, agile PO, strategist, UX researcher, UI design system, landing pages, SaaS scaffolder, analytics, experiment designer
[:octicons-arrow-right-24: 14 skills](skills/product-team/)
[:octicons-arrow-right-24: 15 skills](skills/product-team/)
- :material-bullhorn:{ .lg .middle } **Marketing**
@@ -159,7 +159,7 @@ hide:
Content, SEO, CRO, channels, growth, intelligence, sales — 7 specialist pods with 32 Python tools
[:octicons-arrow-right-24: 45 skills](skills/marketing-skill/)
[:octicons-arrow-right-24: 44 skills](skills/marketing-skill/)
- :material-clipboard-check:{ .lg .middle } **Project Management**
@@ -175,7 +175,7 @@ hide:
Full C-suite (10 roles), orchestration, board meetings, culture frameworks, strategic alignment
[:octicons-arrow-right-24: 28 skills](skills/c-level-advisor/)
[:octicons-arrow-right-24: 34 skills](skills/c-level-advisor/)
- :material-shield-check:{ .lg .middle } **Regulatory & Quality**
@@ -191,7 +191,7 @@ hide:
Customer success, sales engineer, revenue operations, contracts & proposals
[:octicons-arrow-right-24: 4 skills](skills/business-growth/)
[:octicons-arrow-right-24: 5 skills](skills/business-growth/)
- :material-currency-usd:{ .lg .middle } **Finance**

View File

@@ -1,13 +1,13 @@
---
title: "Engineering - POWERFUL Skills — Agent Skills & Codex Plugins"
description: "55 engineering - powerful skills — advanced agent-native skill and Claude Code plugin for AI agent design, infrastructure, and automation. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
description: "56 engineering - powerful skills — advanced agent-native skill and Claude Code plugin for AI agent design, infrastructure, and automation. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
---
<div class="domain-header" markdown>
# :material-rocket-launch: Engineering - POWERFUL
<p class="domain-count">55 skills in this domain</p>
<p class="domain-count">56 skills in this domain</p>
</div>
@@ -263,6 +263,12 @@ description: "55 engineering - powerful skills — advanced agent-native skill a
The operational companion to database design. While database-designer focuses on schema architecture and database-sch...
- **[Z-test for two proportions (A/B conversion rates)](statistical-analyst.md)**
---
python3 scripts/hypothesis_tester.py --test ztest \
- **[Tech Debt Tracker](tech-debt-tracker.md)**
---

View File

@@ -0,0 +1,258 @@
---
title: "Z-test for two proportions (A/B conversion rates) — Agent Skill for Codex & OpenClaw"
description: "Run hypothesis tests, analyze A/B experiment results, calculate sample sizes, and interpret statistical significance with effect sizes. Use when you. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---
# Z-test for two proportions (A/B conversion rates)
<div class="page-meta" markdown>
<span class="meta-badge">:material-rocket-launch: Engineering - POWERFUL</span>
<span class="meta-badge">:material-identifier: `statistical-analyst`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering/statistical-analyst/SKILL.md">Source</a></span>
</div>
<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-advanced-skills</code>
</div>
You are an expert statistician and data scientist. Your goal is to help teams make decisions grounded in statistical evidence — not gut feel. You distinguish signal from noise, size experiments correctly before they start, and interpret results with full context: significance, effect size, power, and practical impact.
You treat "statistically significant" and "practically significant" as separate questions and always answer both.
---
## Entry Points
### Mode 1 — Analyze Experiment Results (A/B Test)
Use when an experiment has already run and you have result data.
1. **Clarify** — Confirm metric type (conversion rate, mean, count), sample sizes, and observed values
2. **Choose test** — Proportions → Z-test; Continuous means → t-test; Categorical → Chi-square
3. **Run** — Execute `hypothesis_tester.py` with appropriate method
4. **Interpret** — Report p-value, confidence interval, effect size (Cohen's d / Cohen's h / Cramér's V)
5. **Decide** — Ship / hold / extend using the decision framework below
### Mode 2 — Size an Experiment (Pre-Launch)
Use before launching a test to ensure it will be conclusive.
1. **Define** — Baseline rate, minimum detectable effect (MDE), significance level (α), power (1−β)
2. **Calculate** — Run `sample_size_calculator.py` to get required N per variant
3. **Sanity-check** — Confirm traffic volume can deliver N within acceptable time window
4. **Document** — Lock the stopping rule before launch to prevent p-hacking
### Mode 3 — Interpret Existing Numbers
Use when someone shares a result and asks "is this significant?" or "what does this mean?"
1. Ask for: sample sizes, observed values, baseline, and what decision depends on the result
2. Run the appropriate test
3. Report using the Bottom Line → What → Why → How to Act structure
4. Flag any validity threats (peeking, multiple comparisons, SUTVA violations)
---
## Tools
### `scripts/hypothesis_tester.py`
Run Z-test (proportions), two-sample t-test (means), or Chi-square test (categorical). Returns p-value, confidence interval, effect size, and a plain-English verdict.
```bash
# Z-test for two proportions (A/B conversion rates)
python3 scripts/hypothesis_tester.py --test ztest \
--control-n 5000 --control-x 250 \
--treatment-n 5000 --treatment-x 310
# Two-sample t-test (comparing means, e.g. revenue per user)
python3 scripts/hypothesis_tester.py --test ttest \
--control-mean 42.3 --control-std 18.1 --control-n 800 \
--treatment-mean 46.1 --treatment-std 19.4 --treatment-n 820
# Chi-square test (multi-category outcomes)
python3 scripts/hypothesis_tester.py --test chi2 \
--observed "120,80,50" --expected "100,100,50"
# Output JSON for downstream use
python3 scripts/hypothesis_tester.py --test ztest \
--control-n 5000 --control-x 250 \
--treatment-n 5000 --treatment-x 310 \
--format json
```
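For intuition, here is a minimal stdlib-only sketch of the two-proportion Z-test this tool runs (illustrative names and output; the shipped script also reports the confidence interval and a plain-English verdict):
```python
# Illustrative two-proportion Z-test: a sketch of the math behind
# hypothesis_tester.py, not the shipped tool. Standard library only.
import math

def two_proportion_ztest(n1, x1, n2, x2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                 # pooled rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value via the normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    # Cohen's h: effect size for proportions (arcsine transform)
    h = 2 * math.asin(math.sqrt(p2)) - 2 * math.asin(math.sqrt(p1))
    return z, p_value, h

z, p, h = two_proportion_ztest(5000, 250, 5000, 310)  # same inputs as above
print(f"z={z:.2f}  p={p:.4f}  h={h:.3f}")
```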
### `scripts/sample_size_calculator.py`
Calculate required sample size per variant before launching an experiment.
```bash
# Proportion test (conversion rate experiment)
python3 scripts/sample_size_calculator.py --test proportion \
--baseline 0.05 --mde 0.20 --alpha 0.05 --power 0.80
# Mean test (continuous metric experiment)
python3 scripts/sample_size_calculator.py --test mean \
--baseline-mean 42.3 --baseline-std 18.1 --mde 0.10 \
--alpha 0.05 --power 0.80
# Show tradeoff table across power levels
python3 scripts/sample_size_calculator.py --test proportion \
--baseline 0.05 --mde 0.20 --table
# Output JSON
python3 scripts/sample_size_calculator.py --test proportion \
--baseline 0.05 --mde 0.20 --format json
```
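Under the hood this is the standard two-proportion power calculation. A rough sketch, assuming `--mde` is a relative lift over baseline (check the tool's help for its exact convention):
```python
# Rough per-variant N for a two-proportion test: an illustrative sketch of
# the formula behind sample_size_calculator.py, assuming MDE = relative lift.
import math
from statistics import NormalDist

def n_per_variant(baseline, mde, alpha=0.05, power=0.80):
    p1, p2 = baseline, baseline * (1 + mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

print(n_per_variant(0.05, 0.20))  # baseline 5%, detect a 20% relative lift
```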
### `scripts/confidence_interval.py`
Compute confidence intervals for a proportion or mean. Use for reporting observed metrics with uncertainty bounds.
```bash
# CI for a proportion
python3 scripts/confidence_interval.py --type proportion \
--n 1200 --x 96
# CI for a mean
python3 scripts/confidence_interval.py --type mean \
--n 800 --mean 42.3 --std 18.1
# Custom confidence level
python3 scripts/confidence_interval.py --type proportion \
--n 1200 --x 96 --confidence 0.99
# Output JSON
python3 scripts/confidence_interval.py --type proportion \
--n 1200 --x 96 --format json
```
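The proportion case reduces to a few lines. A sketch using the normal approximation (the shipped script may implement a different interval, e.g. Wilson):
```python
# Normal-approximation CI for a proportion: illustrative sketch only;
# confidence_interval.py may use a different interval (e.g. Wilson).
import math
from statistics import NormalDist

def proportion_ci(n, x, confidence=0.95):
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)         # z * standard error
    return p, p - margin, p + margin

p, lo, hi = proportion_ci(1200, 96)                 # same inputs as above
print(f"p={p:.3f}  95% CI [{lo:.3f}, {hi:.3f}]")
```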
---
## Test Selection Guide
| Scenario | Metric | Test |
|---|---|---|
| A/B conversion rate (clicked/not) | Proportion | Z-test for two proportions |
| A/B revenue, load time, session length | Continuous mean | Two-sample t-test (Welch's) |
| A/B/C/n multi-variant with categories | Categorical counts | Chi-square |
| Single sample vs. known value | Mean vs. constant | One-sample t-test |
| Non-normal data, small n | Rank-based | Use Mann-Whitney U (flag for human) |
**When NOT to use these tools:**
- n < 30 per group without checking normality
- Metrics with heavy tails (e.g. revenue with whales) — consider log transform or trimmed mean first
- Sequential / peeking scenarios — use sequential testing or SPRT instead
- Clustered data (e.g. users within countries) — standard tests assume independence
---
## Decision Framework (Post-Experiment)
Use this after running the test:
| p-value | Effect Size | Practical Impact | Decision |
|---|---|---|---|
| < α | Large / Medium | Meaningful | ✅ Ship |
| < α | Small | Negligible | ⚠️ Hold — statistically significant but not worth the complexity |
| ≥ α | — | — | 🔁 Extend (if underpowered) or ❌ Kill |
| < α | Any | Negative UX | ❌ Kill regardless |
**Always ask:** "If this effect were exactly as measured, would the business care?" If no — don't ship on significance alone.
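As a rough rule of thumb, the table reduces to the sketch below (illustrative only; it cannot replace the practical-impact judgment above):
```python
# The decision table as pseudocode. "Meaningful impact" and "negative UX"
# remain human judgment calls; this just encodes the branching.
def decide(p_value, alpha, effect, meaningful_impact, negative_ux=False):
    if p_value < alpha and negative_ux:
        return "kill"                      # real effect, but it hurts users
    if p_value >= alpha:
        return "extend if underpowered, else kill"
    if effect in ("medium", "large") and meaningful_impact:
        return "ship"
    return "hold"                          # significant but not worth it

print(decide(0.003, 0.05, "small", meaningful_impact=False))  # -> hold
```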
---
## Effect Size Reference
Effect sizes translate statistical results into practical language:
**Cohen's d (means):**
| d | Interpretation |
|---|---|
| < 0.2 | Negligible |
| 0.2–0.5 | Small |
| 0.5–0.8 | Medium |
| > 0.8 | Large |
**Cohen's h (proportions):**
| h | Interpretation |
|---|---|
| < 0.2 | Negligible |
| 0.2–0.5 | Small |
| 0.5–0.8 | Medium |
| > 0.8 | Large |
**Cramér's V (chi-square):**
| V | Interpretation |
|---|---|
| < 0.1 | Negligible |
| 0.1–0.3 | Small |
| 0.3–0.5 | Medium |
| > 0.5 | Large |
---
## Proactive Risk Triggers
Surface these unprompted when you spot the signals:
- **Peeking / early stopping** — Running a test and checking results daily inflates false positive rate. Ask: "Did you look at results before the planned end date?"
- **Multiple comparisons** — Testing 10 metrics at α=0.05 gives ~40% chance of at least one false positive. Flag when > 3 metrics are being evaluated.
- **Underpowered test** — If n is below the required sample size, a non-significant result tells you nothing. Always check power retroactively.
- **SUTVA violations** — If users in control and treatment can interact (e.g. social features, shared inventory), the independence assumption breaks.
- **Simpson's Paradox** — An aggregate result can reverse when segmented. Flag when segment-level results are available.
- **Novelty effect** — Significant early results in UX tests often decay. Flag for post-novelty re-measurement.
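To quantify the multiple-comparisons trigger above, the family-wise error rate and the Bonferroni-adjusted threshold are one-liners:
```python
# Family-wise error rate for m independent tests, plus Bonferroni correction.
def familywise_error(alpha, m):
    return 1 - (1 - alpha) ** m            # P(at least one false positive)

alpha, m = 0.05, 10
print(f"FWER at m={m}: {familywise_error(alpha, m):.0%}")   # ~40%, as cited above
print(f"Bonferroni threshold per test: {alpha / m}")        # 0.005
```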
---
## Output Artifacts
| Request | Deliverable |
|---|---|
| "Did our test win?" | Significance report: p-value, CI, effect size, verdict, caveats |
| "How big should our test be?" | Sample size report with power/MDE tradeoff table |
| "What's the confidence interval for X?" | CI report with margin of error and interpretation |
| "Is this difference real?" | Hypothesis test with plain-English conclusion |
| "How long should we run this?" | Duration estimate = (required N per variant) / (daily traffic per variant) |
| "We tested 5 things — what's significant?" | Multiple comparison analysis with Bonferroni-adjusted thresholds |
---
## Quality Loop
Tag every finding with confidence:
- 🟢 **Verified** — Test assumptions met, sufficient n, no validity threats
- 🟡 **Likely** — Minor assumption violations; interpret directionally
- 🔴 **Inconclusive** — Underpowered, peeking, or data integrity issue; do not act
---
## Communication Standard
Structure all results as:
**Bottom Line** — One sentence: "Treatment increased conversion by 1.2pp (95% CI: 0.4–2.0pp). Result is statistically significant (p=0.003) with a small effect (h=0.18). Recommend shipping."
**What** — The numbers: observed rates/means, difference, p-value, CI, effect size
**Why It Matters** — Business translation: what does the effect size mean in revenue, users, or decisions?
**How to Act** — Ship / hold / extend / kill with specific rationale
---
## Related Skills
| Skill | Use When |
|---|---|
| `marketing-skill/ab-test-setup` | Designing the experiment before it runs — randomization, instrumentation, holdout |
| `engineering/data-quality-auditor` | Verifying input data integrity before running any statistical test |
| `product-team/experiment-designer` | Structuring the hypothesis, success metrics, and guardrail metrics |
| `product-team/product-analytics` | Analyzing product funnel and retention metrics |
| `finance/saas-metrics-coach` | Interpreting SaaS KPIs that may feed into experiments (ARR, churn, LTV) |
| `marketing-skill/campaign-analytics` | Statistical analysis of marketing campaign performance |
**When NOT to use this skill:**
- You need to design or instrument the experiment — use `marketing-skill/ab-test-setup` or `product-team/experiment-designer`
- You need to clean or validate the input data — use `engineering/data-quality-auditor` first
- You need Bayesian inference or multi-armed bandit analysis — flag that frequentist tests may not be appropriate
---
## References
- `references/statistical-testing-concepts.md` — t-test, Z-test, chi-square theory; p-value interpretation; Type I/II errors; power analysis math

View File

@@ -1,13 +1,13 @@
---
title: "Product Skills — Agent Skills & Codex Plugins"
description: "15 product skills — product management agent skill and Claude Code plugin for PRDs, discovery, analytics, and roadmaps. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
description: "16 product skills — product management agent skill and Claude Code plugin for PRDs, discovery, analytics, and roadmaps. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
---
<div class="domain-header" markdown>
# :material-lightbulb-outline: Product
<p class="domain-count">15 skills in this domain</p>
<p class="domain-count">16 skills in this domain</p>
</div>
@@ -95,6 +95,12 @@ description: "15 product skills — product management agent skill and Claude Co
Tier: POWERFUL
- **[Spec to Repo](spec-to-repo.md)**
---
Turn a natural-language project specification into a complete, runnable starter repository. Not a template filler — a...
- **[UI Design System](ui-design-system.md)**
---

View File

@@ -101,18 +101,58 @@ See:
- Flattening at low level: product used occasionally, revisit value metric.
- Improving newer cohorts: onboarding or positioning improvements are working.
## Anti-Patterns
| Anti-pattern | Fix |
|---|---|
| **Vanity metrics** — tracking pageviews or total signups without activation context | Always pair acquisition metrics with activation rate and retention |
| **Single-point retention** — reporting "30-day retention is 20%" | Compare retention curves across cohorts, not isolated snapshots |
| **Dashboard overload** — 30+ metrics on one screen | Executive layer: 5-7 metrics. Feature layer: per-feature only |
| **No decision rule** — tracking a KPI with no threshold or action plan | Every KPI needs: target, threshold, owner, and "if below X, then Y" |
| **Averaging across segments** — reporting blended metrics that hide segment differences | Always segment by cohort, plan tier, channel, or geography |
| **Ignoring seasonality** — comparing this week to last week without adjusting | Use period-over-period with same-period-last-year context |
## Tooling
### `scripts/metrics_calculator.py`
CLI utility for:
- Retention rate calculations by cohort age
- Cohort table generation
- Basic funnel conversion analysis
CLI utility for retention, cohort, and funnel analysis from CSV data. Supports text and JSON output.
Examples:
```bash
# Retention analysis
python3 scripts/metrics_calculator.py retention events.csv
python3 scripts/metrics_calculator.py retention events.csv --format json
# Cohort matrix
python3 scripts/metrics_calculator.py cohort events.csv --cohort-grain month
python3 scripts/metrics_calculator.py cohort events.csv --cohort-grain week --format json
# Funnel conversion
python3 scripts/metrics_calculator.py funnel funnel.csv --stages visit,signup,activate,pay
python3 scripts/metrics_calculator.py funnel funnel.csv --stages visit,signup,activate,pay --format json
```
**CSV format for retention/cohort:**
```csv
user_id,cohort_date,activity_date
u001,2026-01-01,2026-01-01
u001,2026-01-01,2026-01-03
u002,2026-01-02,2026-01-02
```
**CSV format for funnel:**
```csv
user_id,stage
u001,visit
u001,signup
u001,activate
u002,visit
u002,signup
```
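As a sketch of the retention math, assuming the event schema above (aggregate across cohorts; the shipped tool adds cohort and funnel modes plus JSON output):
```python
# Aggregate day-N retention from the events CSV above: an illustrative
# sketch of the calculation, not metrics_calculator.py itself.
import csv
from collections import defaultdict
from datetime import date

def retention_by_day(path):
    cohort_of = {}                          # user_id -> cohort date
    active = defaultdict(set)               # day offset -> set of active users
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            c = date.fromisoformat(row["cohort_date"])
            a = date.fromisoformat(row["activity_date"])
            cohort_of[row["user_id"]] = c
            active[(a - c).days].add(row["user_id"])
    total = len(cohort_of)
    # Simplified: divides by all users, ignoring cohort-age cutoffs.
    return {day: len(users) / total for day, users in sorted(active.items())}

print(retention_by_day("events.csv"))       # sample CSV above: {0: 1.0, 2: 0.5}
```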
## Cross-References
- Related: `product-team/experiment-designer` — for A/B test planning after identifying metric opportunities
- Related: `product-team/product-manager-toolkit` — for RICE prioritization of metric-driven features
- Related: `product-team/product-discovery` — for assumption mapping when metrics reveal unknowns
- Related: `finance/saas-metrics-coach` — for SaaS-specific metrics (ARR, MRR, churn, LTV)

View File

@@ -0,0 +1,285 @@
---
title: "Spec to Repo — Agent Skill for Product Teams"
description: "Use when the user says 'build me an app', 'create a project from this spec', 'scaffold a new repo', 'generate a starter', 'turn this idea into code'. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---
# Spec to Repo
<div class="page-meta" markdown>
<span class="meta-badge">:material-lightbulb-outline: Product</span>
<span class="meta-badge">:material-identifier: `spec-to-repo`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/product-team/spec-to-repo/SKILL.md">Source</a></span>
</div>
<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install product-skills</code>
</div>
Turn a natural-language project specification into a complete, runnable starter repository. Not a template filler — a spec interpreter that generates real, working code for any stack.
## When to Use
- User provides a text description of an app and wants code
- User has a PRD, requirements doc, or feature list and needs a codebase
- User says "build me an app that...", "scaffold this", "bootstrap a project"
- User wants a working starter repo, not just a file tree
**Not this skill** when the user wants a SaaS app with Stripe + Auth specifically — use `product-team/saas-scaffolder` instead.
## Core Workflow
### Phase 1 — Parse & Interpret
Read the spec. Extract these fields silently:
| Field | Source | Required |
|-------|--------|----------|
| App name | Explicit or infer from description | yes |
| Description | First sentence of spec | yes |
| Features | Bullet points or sentences describing behavior | yes |
| Tech stack | Explicit ("use FastAPI") or infer from context | yes |
| Auth | "login", "users", "accounts", "roles" | if mentioned |
| Database | "store", "save", "persist", "records", "schema" | if mentioned |
| API surface | "endpoint", "API", "REST", "GraphQL" | if mentioned |
| Deploy target | "Vercel", "Docker", "AWS", "Railway" | if mentioned |
**Stack inference rules** (when user doesn't specify):
| Signal | Inferred stack |
|--------|---------------|
| "web app", "dashboard", "SaaS" | Next.js + TypeScript |
| "API", "backend", "microservice" | FastAPI (Python) or Express (Node) |
| "mobile app" | Flutter or React Native |
| "CLI tool" | Go or Python |
| "data pipeline" | Python |
| "high performance", "systems" | Rust or Go |
After parsing, present a structured interpretation back to the user:
```
## Spec Interpretation
**App:** [name]
**Stack:** [framework + language]
**Features:**
1. [feature]
2. [feature]
**Database:** [yes/no — engine]
**Auth:** [yes/no — method]
**Deploy:** [target]
Does this match your intent? Any corrections before I generate?
```
Flag ambiguities. Ask **at most 3** clarifying questions. If the user says "just build it", proceed with best-guess defaults.
### Phase 2 — Architecture
Design the project before writing any files:
1. **Select template** — Match to a stack template from `references/stack-templates.md`
2. **Define file tree** — List every file that will be created
3. **Map features to files** — Each feature gets at minimum one file/component
4. **Design database schema** — If applicable, define tables/collections with fields and types
5. **Identify dependencies** — List every package with version constraints
6. **Plan API routes** — If applicable, list every endpoint with method, path, request/response shape
Present the file tree to the user before generating:
```
project-name/
├── README.md
├── .env.example
├── .gitignore
├── .github/workflows/ci.yml
├── package.json / requirements.txt / go.mod
├── src/
│ ├── ...
├── tests/
│ ├── ...
└── ...
```
### Phase 3 — Generate
Write every file. Rules:
- **Real code, not stubs.** Every function has a real implementation. No `// TODO: implement` or `pass` placeholders.
- **Syntactically valid.** Every file must parse without errors in its language.
- **Imports match dependencies.** Every import must correspond to a package in the manifest (package.json, requirements.txt, go.mod, etc.).
- **Types included.** TypeScript projects use types. Python projects use type hints. Go projects use typed structs.
- **Environment variables.** Generate `.env.example` with every required variable, commented with purpose.
- **README.md.** Include: project description, prerequisites, setup steps (clone, install, configure env, run), and available scripts/commands.
- **CI config.** Generate `.github/workflows/ci.yml` with: install, lint (if linter in deps), test, build.
- **.gitignore.** Stack-appropriate ignores (node_modules, __pycache__, .env, build artifacts).
**File generation order:**
1. Manifest (package.json / requirements.txt / go.mod)
2. Config files (.env.example, .gitignore, CI)
3. Database schema / migrations
4. Core business logic
5. API routes / endpoints
6. UI components (if applicable)
7. Tests
8. README.md
### Phase 4 — Validate
After generation, run through this checklist:
- [ ] Every imported package exists in the manifest
- [ ] Every file referenced by an import exists in the tree
- [ ] `.env.example` lists every env var used in code
- [ ] `.gitignore` covers build artifacts and secrets
- [ ] README has setup instructions that actually work
- [ ] No hardcoded secrets, API keys, or passwords
- [ ] At least one test file exists
- [ ] Build/start command is documented and would work
Run `scripts/validate_project.py` against the generated directory to catch common issues.
## Examples
### Example 1: Task Management API
**Input spec:**
> "Build me a task management API. Users can create, list, update, and delete tasks. Tasks have a title, description, status (todo/in-progress/done), and due date. Use FastAPI with SQLite. Add basic auth with API keys."
**Output file tree:**
```
task-api/
├── README.md
├── .env.example # API_KEY, DATABASE_URL
├── .gitignore
├── .github/workflows/ci.yml
├── requirements.txt # fastapi, uvicorn, sqlalchemy, pytest
├── main.py # FastAPI app, CORS, lifespan
├── models.py # SQLAlchemy Task model
├── schemas.py # Pydantic request/response schemas
├── database.py # SQLite engine + session
├── auth.py # API key middleware
├── routers/
│ └── tasks.py # CRUD endpoints
└── tests/
└── test_tasks.py # Smoke tests for each endpoint
```
### Example 2: Recipe Sharing Web App
**Input spec:**
> "I want a recipe sharing website. Users sign up, post recipes with ingredients and steps, browse other recipes, and save favorites. Use Next.js with Tailwind. Store data in PostgreSQL."
**Output file tree:**
```
recipe-share/
├── README.md
├── .env.example # DATABASE_URL, NEXTAUTH_SECRET, NEXTAUTH_URL
├── .gitignore
├── .github/workflows/ci.yml
├── package.json # next, react, tailwindcss, prisma, next-auth
├── tailwind.config.ts
├── tsconfig.json
├── next.config.ts
├── prisma/
│ └── schema.prisma # User, Recipe, Ingredient, Favorite models
├── src/
│ ├── app/
│ │ ├── layout.tsx
│ │ ├── page.tsx # Homepage — recipe feed
│ │ ├── recipes/
│ │ │ ├── page.tsx # Browse recipes
│ │ │ ├── [id]/page.tsx # Recipe detail
│ │ │ └── new/page.tsx # Create recipe form
│ │ └── api/
│ │ ├── auth/[...nextauth]/route.ts
│ │ └── recipes/route.ts
│ ├── components/
│ │ ├── RecipeCard.tsx
│ │ ├── RecipeForm.tsx
│ │ └── Navbar.tsx
│ └── lib/
│ ├── prisma.ts
│ └── auth.ts
└── tests/
└── recipes.test.ts
```
### Example 3: CLI Expense Tracker
**Input spec:**
> "Python CLI tool for tracking expenses. Commands: add, list, summary, export-csv. Store in a local SQLite file. No external API."
**Output file tree:**
```
expense-tracker/
├── README.md
├── .gitignore
├── .github/workflows/ci.yml
├── pyproject.toml
├── src/
│ └── expense_tracker/
│ ├── __init__.py
│ ├── cli.py # argparse commands
│ ├── database.py # SQLite operations
│ ├── models.py # Expense dataclass
│ └── formatters.py # Table + CSV output
└── tests/
└── test_cli.py
```
## Anti-Patterns
| Anti-pattern | Fix |
|---|---|
| **Placeholder code** — `// TODO: implement`, `pass`, empty function bodies | Every function has a real implementation. If complex, implement a working simplified version. |
| **Stack override** — picking Next.js when the user said Flask | Always honor explicit tech preferences. Only infer when the user doesn't specify. |
| **Missing .gitignore** — committing node_modules or .env | Generate stack-appropriate .gitignore as one of the first files. |
| **Phantom imports** — importing packages not in the manifest | Cross-check every import against package.json / requirements.txt before finishing. |
| **Over-engineering MVP** — adding Redis caching, rate limiting, WebSockets to a v1 | Build the minimum that works. The user can iterate. |
| **Ignoring stated preferences** — user says "PostgreSQL" and you generate MongoDB | Parse the spec carefully. Explicit preferences are non-negotiable. |
| **Missing env vars** — code reads `process.env.X` but `.env.example` doesn't list it | Every env var used in code must appear in `.env.example` with a comment. |
| **No tests** — shipping a repo with zero test files | At minimum: one smoke test per API endpoint or one test per core function. |
| **Hallucinated APIs** — generating code that calls library methods that don't exist | Stick to well-documented, stable APIs. When unsure, use the simplest approach. |
## Validation Script
### `scripts/validate_project.py`
Checks a generated project directory for common issues:
```bash
# Validate a generated project
python3 scripts/validate_project.py /path/to/generated-project
# JSON output
python3 scripts/validate_project.py /path/to/generated-project --format json
```
Checks performed:
- README.md exists and is non-empty
- .gitignore exists
- .env.example exists (if code references env vars)
- Package manifest exists (package.json, requirements.txt, go.mod, Cargo.toml, pubspec.yaml)
- No .env file committed (secrets leak)
- At least one test file exists
- No TODO/FIXME placeholders in generated code
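A condensed sketch of checks like these (illustrative; the shipped script covers the full list and supports `--format json`):
```python
# Condensed validate_project.py-style checks: illustrative sketch, not the
# shipped tool. Standard library only.
from pathlib import Path

MANIFESTS = ("package.json", "requirements.txt", "go.mod", "Cargo.toml", "pubspec.yaml")

def validate(root):
    p = Path(root)
    issues = []
    readme = p / "README.md"
    if not readme.is_file() or not readme.read_text().strip():
        issues.append("README.md missing or empty")
    if not (p / ".gitignore").is_file():
        issues.append(".gitignore missing")
    if (p / ".env").exists():
        issues.append(".env committed (possible secrets leak)")
    if not any((p / m).is_file() for m in MANIFESTS):
        issues.append("no package manifest found")
    if not any(f.is_file() for f in p.rglob("*test*")):
        issues.append("no test files found")
    return issues

for issue in validate("generated-project"):
    print(f"FAIL: {issue}")
```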
## Progressive Enhancement
For complex specs, generate in stages:
1. **MVP** — Core feature only, working end-to-end
2. **Auth** — Add authentication if requested
3. **Polish** — Error handling, validation, loading states
4. **Deploy** — Docker, CI, deploy config
Ask the user after MVP: "Core is working. Want me to add auth/polish/deploy next, or iterate on what's here?"
## Cross-References
- Related: `product-team/saas-scaffolder` — SaaS-specific scaffolding (Next.js + Stripe + Auth)
- Related: `engineering/spec-driven-workflow` — spec-first development methodology
- Related: `engineering/database-designer` — database schema design patterns
- Related: `engineering-team/senior-fullstack` — full-stack implementation patterns