feat(engineering): add browser-automation and spec-driven-workflow skills
browser-automation (564-line SKILL.md, 3 scripts, 3 references):
- Web scraping, form filling, screenshot capture, data extraction
- Anti-detection patterns, cookie/session management, dynamic content
- scraping_toolkit.py, form_automation_builder.py, anti_detection_checker.py
- NOT testing (that's playwright-pro) — this is automation & scraping

spec-driven-workflow (586-line SKILL.md, 3 scripts, 3 references):
- Spec-first development: write spec BEFORE code
- Bounded autonomy rules, 6-phase workflow, self-review checklist
- spec_generator.py, spec_validator.py, test_extractor.py
- Pairs with tdd-guide for red-green-refactor after spec

Updated engineering plugin.json (31 → 33 skills). Added both to mkdocs.yml nav and generated docs pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -1,6 +1,6 @@
 {
   "name": "engineering-advanced-skills",
-  "description": "31 advanced engineering skills: agent designer, agent workflow designer, AgentHub, RAG architect, database designer, migration architect, observability designer, dependency auditor, release manager, API reviewer, CI/CD pipeline builder, MCP server builder, skill security auditor, performance profiler, Helm chart builder, Terraform patterns, focused-fix, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
+  "description": "33 advanced engineering skills: agent designer, agent workflow designer, AgentHub, RAG architect, database designer, migration architect, observability designer, dependency auditor, release manager, API reviewer, CI/CD pipeline builder, MCP server builder, skill security auditor, performance profiler, Helm chart builder, Terraform patterns, focused-fix, browser-automation, spec-driven-workflow, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
   "version": "2.1.2",
   "author": {
     "name": "Alireza Rezvani",
564
engineering/browser-automation/SKILL.md
Normal file
@@ -0,0 +1,564 @@
---
name: "browser-automation"
description: "Use when the user asks to automate browser tasks, scrape websites, fill forms, capture screenshots, extract structured data from web pages, or build web automation workflows. NOT for testing — use playwright-pro for that."
---

# Browser Automation - POWERFUL

## Overview

The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation workflows using Playwright. This skill covers data extraction, form filling, screenshot capture, session management, and anti-detection patterns for reliable browser automation at scale.

**When to use this skill:**
- Scraping structured data from websites (tables, listings, search results)
- Automating multi-step browser workflows (login, fill forms, download files)
- Capturing screenshots or PDFs of web pages
- Extracting data from SPAs and JavaScript-heavy sites
- Building repeatable browser-based data pipelines

**When NOT to use this skill:**
- Writing browser tests or E2E test suites — use **playwright-pro** instead
- Testing API endpoints — use **api-test-suite-builder** instead
- Load testing or performance benchmarking — use **performance-profiler** instead

**Why Playwright over Selenium or Puppeteer:**
- **Auto-wait built in** — no explicit `sleep()` or `waitForElement()` needed for most actions
- **Multi-browser from one API** — Chromium, Firefox, WebKit with zero config changes
- **Network interception** — block ads, mock responses, capture API calls natively
- **Browser contexts** — isolated sessions without spinning up new browser instances
- **Codegen** — `playwright codegen` records your actions and generates scripts
- **Async-first** — Python async/await for high-throughput scraping

## Core Competencies

### 1. Web Scraping Patterns

#### DOM Extraction with CSS Selectors
CSS selectors are the primary tool for element targeting. Prefer them over XPath for readability and performance.

**Selector priority (most to least reliable):**
1. `data-testid`, `data-id`, or custom data attributes — stable across redesigns
2. `#id` selectors — unique but may change between deploys
3. Semantic selectors: `article`, `nav`, `main`, `section` — resilient to CSS changes
4. Class-based: `.product-card`, `.price` — brittle if classes are generated (e.g., CSS modules)
5. Positional: `nth-child()`, `nth-of-type()` — last resort, breaks on layout changes

**Compound selectors for precision:**
```python
# Product cards within a specific container
page.query_selector_all("div.search-results > article.product-card")

# Price inside a product card (scoped)
card.query_selector("span[data-field='price']")

# Links with specific text content
page.locator("a", has_text="Next Page")
```

#### XPath for Complex Traversal
Use XPath only when CSS cannot express the relationship:
```python
# Find element by text content (XPath strength)
page.locator("//td[contains(text(), 'Total')]/following-sibling::td[1]")

# Navigate up the DOM tree
page.locator("//span[@class='price']/ancestor::div[@class='product']")
```

#### Pagination Patterns
- **Next-button pagination**: Click "Next" until disabled or absent
- **URL-based pagination**: Increment `?page=N` or `&offset=N` in URL
- **Infinite scroll**: Scroll to bottom, wait for new content, repeat until no change
- **Load-more button**: Click button, wait for DOM mutation, repeat
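For URL-based pagination specifically, building each page URL up front is often simpler than clicking through. A minimal sketch using the stdlib `urllib.parse` (the `page` parameter name is an assumption; sites vary):

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

def paged_url(base_url: str, page: int, param: str = "page") -> str:
    """Return base_url with the page query parameter set (or overwritten)."""
    parts = urlparse(base_url)
    query = parse_qs(parts.query)
    query[param] = [str(page)]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

# Then iterate instead of clicking "Next":
#     for n in range(1, max_pages + 1):
#         await page.goto(paged_url(url, n))
```

This keeps each page fetch independent, so a failed page can be retried without replaying the click sequence.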
#### Infinite Scroll Handling
```python
async def scroll_to_bottom(page, max_scrolls=50, pause_ms=1500):
    previous_height = 0
    scrolls = 0
    for _ in range(max_scrolls):
        current_height = await page.evaluate("document.body.scrollHeight")
        if current_height == previous_height:
            break
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(pause_ms)
        previous_height = current_height
        scrolls += 1
    return scrolls  # number of scrolls performed
```

### 2. Form Filling & Multi-Step Workflows

#### Login Flows
```python
async def login(page, url, username, password):
    await page.goto(url)
    await page.fill("input[name='username']", username)
    await page.fill("input[name='password']", password)
    await page.click("button[type='submit']")
    # Wait for navigation to complete (post-login redirect)
    await page.wait_for_url("**/dashboard**")
```

#### Multi-Page Forms
Break multi-step forms into discrete functions per step. Each function:
1. Fills the fields for that step
2. Clicks the "Next" or "Continue" button
3. Waits for the next step to load (URL change or DOM element)

```python
async def fill_step_1(page, data):
    await page.fill("#first-name", data["first_name"])
    await page.fill("#last-name", data["last_name"])
    await page.select_option("#country", data["country"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-2-form")


async def fill_step_2(page, data):
    await page.fill("#address", data["address"])
    await page.fill("#city", data["city"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-3-form")
```

#### File Uploads
```python
# Single file
await page.set_input_files("input[type='file']", "/path/to/file.pdf")

# Multiple files
await page.set_input_files("input[type='file']", [
    "/path/to/file1.pdf",
    "/path/to/file2.pdf"
])

# Drag-and-drop upload zones (no visible input element)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```

#### Dropdown and Select Handling
```python
# Native <select> element
await page.select_option("#country", value="US")
await page.select_option("#country", label="United States")

# Custom dropdown (div-based)
await page.click("div.dropdown-trigger")
await page.click("div.dropdown-option:has-text('United States')")
```

### 3. Screenshot & PDF Capture

#### Screenshot Strategies
```python
# Full page (scrolls automatically)
await page.screenshot(path="full-page.png", full_page=True)

# Viewport only (what's visible)
await page.screenshot(path="viewport.png")

# Specific element
element = page.locator("div.chart-container")
await element.screenshot(path="chart.png")

# With custom viewport for consistency
context = await browser.new_context(viewport={"width": 1920, "height": 1080})
```

#### PDF Generation
```python
# Only works in Chromium
await page.pdf(
    path="output.pdf",
    format="A4",
    margin={"top": "1cm", "right": "1cm", "bottom": "1cm", "left": "1cm"},
    print_background=True
)
```

#### Visual Regression Baselines
Take screenshots at known states and compare pixel-by-pixel. Store baselines in version control. Use naming conventions: `{page}_{viewport}_{state}.png`.
### 4. Structured Data Extraction

#### Tables to JSON
```python
async def extract_table(page, selector):
    headers = await page.eval_on_selector_all(
        f"{selector} thead th",
        "elements => elements.map(e => e.textContent.trim())"
    )
    rows = await page.eval_on_selector_all(
        f"{selector} tbody tr",
        """rows => rows.map(row => {
            return Array.from(row.querySelectorAll('td'))
                .map(cell => cell.textContent.trim())
        })"""
    )
    return [dict(zip(headers, row)) for row in rows]
```

#### Listings to Arrays
```python
async def extract_listings(page, container_sel, field_map):
    """
    field_map example: {"title": "h3.title", "price": "span.price", "url": "a::attr(href)"}
    """
    items = []
    cards = await page.query_selector_all(container_sel)
    for card in cards:
        item = {}
        for field, sel in field_map.items():
            if "::attr(" in sel:
                attr_sel, attr_name = sel.split("::attr(")
                attr_name = attr_name.rstrip(")")
                el = await card.query_selector(attr_sel)
                item[field] = await el.get_attribute(attr_name) if el else None
            else:
                el = await card.query_selector(sel)
                item[field] = (await el.text_content()).strip() if el else None
        items.append(item)
    return items
```

#### Nested Data Extraction
For threaded content (comments with replies), use recursive extraction:
```python
async def extract_comments(scope, parent_selector):
    # `scope` is a Page or an ElementHandle; both support query_selector_all
    comments = []
    elements = await scope.query_selector_all(f"{parent_selector} > .comment")
    for el in elements:
        text = await (await el.query_selector(".comment-body")).text_content()
        author = await (await el.query_selector(".author")).text_content()
        replies = await extract_comments(el, ".replies")
        comments.append({
            "author": author.strip(),
            "text": text.strip(),
            "replies": replies
        })
    return comments
```

### 5. Cookie & Session Management

#### Save and Restore Sessions
```python
import json

# Save cookies after login
cookies = await context.cookies()
with open("session.json", "w") as f:
    json.dump(cookies, f)

# Restore session in new context
with open("session.json", "r") as f:
    cookies = json.load(f)
context = await browser.new_context()
await context.add_cookies(cookies)
```

#### Storage State (Cookies + Local Storage)
```python
# Save full state (cookies + localStorage; sessionStorage is not captured)
await context.storage_state(path="state.json")

# Restore full state
context = await browser.new_context(storage_state="state.json")
```

**Best practice:** Save state after login, reuse across scraping sessions. Check session validity before starting a long job — make a lightweight request to a protected page and verify you are not redirected to login.
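That validity probe reduces to a URL check after visiting a protected page. A sketch, where `/login` is an assumed login path and the surrounding Playwright calls are shown as comments:

```python
from urllib.parse import urlparse

def session_is_valid(final_url: str, login_path: str = "/login") -> bool:
    """The probe succeeded if the protected page did not bounce us to login."""
    return urlparse(final_url).path.rstrip("/") != login_path.rstrip("/")

# Usage with Playwright:
#     await page.goto("https://portal.example.com/dashboard")
#     if not session_is_valid(page.url):
#         await login(page, url, user, password)
#         await context.storage_state(path="state.json")
```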
### 6. Anti-Detection Patterns

Modern websites detect automation through multiple vectors. Address all of them:

#### User Agent Rotation
Never use the default Playwright user agent. Rotate through real browser user agents:
```python
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
```

#### Viewport and Screen Size
Set realistic viewport dimensions. Playwright's default (1280x720) is common among bots, and unusual sizes like 800x600 are a red flag:
```python
context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    screen={"width": 1920, "height": 1080},
    user_agent=random.choice(USER_AGENTS),
)
```

#### WebDriver Flag Removal
Playwright sets `navigator.webdriver = true`. Remove it:
```python
await page.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")
```

#### Request Throttling
Add human-like delays between actions:
```python
import asyncio
import random

async def human_delay(min_ms=500, max_ms=2000):
    delay = random.randint(min_ms, max_ms)
    await asyncio.sleep(delay / 1000)
```

#### Proxy Support
```python
browser = await playwright.chromium.launch(
    proxy={"server": "http://proxy.example.com:8080"}
)
# Or per-context:
context = await browser.new_context(
    proxy={"server": "http://proxy.example.com:8080",
           "username": "user", "password": "pass"}
)
```

### 7. Dynamic Content Handling

#### SPA Rendering
SPAs render content client-side. Wait for the actual content, not the page load:
```python
await page.goto(url)
# Wait for the data to render, not just the shell
await page.wait_for_selector("div.product-list article", state="attached")
```

#### AJAX / Fetch Waiting
Intercept and wait for specific API calls:
```python
async with page.expect_response("**/api/products*") as response_info:
    await page.click("button.load-more")
response = await response_info.value
data = await response.json()  # You can use the API data directly
```

#### Shadow DOM Traversal
```python
# Playwright CSS selectors pierce open Shadow DOM by default; >> chains selectors
await page.locator("custom-element >> .inner-class").click()
```

#### Lazy-Loaded Images
Scroll elements into view to trigger lazy loading:
```python
images = await page.query_selector_all("img[data-src]")
for img in images:
    await img.scroll_into_view_if_needed()
    await page.wait_for_timeout(200)
```

### 8. Error Handling & Retry Logic

#### Retry Wrapper Pattern
```python
import asyncio

async def with_retry(coro_factory, max_retries=3, backoff_base=2):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            wait = backoff_base ** attempt
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait}s...")
            await asyncio.sleep(wait)
```

#### Handling Common Failures
```python
from playwright.async_api import TimeoutError as PlaywrightTimeout

try:
    await page.click("button.submit", timeout=5000)
except PlaywrightTimeout:
    # Element did not appear — page structure may have changed
    # Try fallback selector
    await page.click("[type='submit']", timeout=5000)
except Exception:
    # Network error, browser crash, etc.
    await page.screenshot(path="error-state.png")
    raise
```

#### Rate Limit Detection
```python
async def check_rate_limit(response):
    if response.status == 429:
        retry_after = response.headers.get("retry-after", "60")
        wait_seconds = int(retry_after)
        print(f"Rate limited. Waiting {wait_seconds}s...")
        await asyncio.sleep(wait_seconds)
        return True
    return False
```

## Workflows

### Workflow 1: Single-Page Data Extraction

**Scenario:** Extract product data from a single page with JavaScript-rendered content.

**Steps:**
1. Launch browser in headed mode during development (`headless=False`), switch to headless for production
2. Navigate to URL and wait for content selector
3. Extract data using `query_selector_all` with field mapping
4. Validate extracted data (check for nulls, expected types)
5. Output as JSON

```python
from playwright.async_api import async_playwright

async def extract_single_page(url, selectors):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 ..."
        )
        page = await context.new_page()
        await page.goto(url, wait_until="networkidle")
        data = await extract_listings(page, selectors["container"], selectors["fields"])
        await browser.close()
        return data
```

### Workflow 2: Multi-Page Scraping with Pagination

**Scenario:** Scrape search results across 50+ pages.

**Steps:**
1. Launch browser with anti-detection settings
2. Navigate to first page
3. Extract data from current page
4. Check if "Next" button exists and is enabled
5. Click next, wait for new content to load (not just navigation)
6. Repeat until no next page or max pages reached
7. Deduplicate results by unique key
8. Write output incrementally (don't hold everything in memory)

```python
async def scrape_paginated(base_url, selectors, max_pages=100):
    all_data = []
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await (await browser.new_context()).new_page()
        await page.goto(base_url)

        for page_num in range(max_pages):
            items = await extract_listings(page, selectors["container"], selectors["fields"])
            all_data.extend(items)

            next_btn = page.locator(selectors["next_button"])
            if await next_btn.count() == 0 or await next_btn.is_disabled():
                break

            await next_btn.click()
            await page.wait_for_selector(selectors["container"])
            await human_delay(800, 2000)

        await browser.close()
    return all_data
```
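Step 7's deduplication is not shown in the code above; it can be sketched as a small helper keyed on a unique field (here `url`, an assumed key):

```python
def dedupe(items, key="url"):
    """Keep the first occurrence of each record, identified by a unique key."""
    seen = set()
    unique = []
    for item in items:
        k = item.get(key)
        if k in seen:
            continue
        seen.add(k)
        unique.append(item)
    return unique
```

For step 8, append each page's deduped items to a JSONL file as you go instead of accumulating everything in memory.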
### Workflow 3: Authenticated Workflow Automation

**Scenario:** Log into a portal, navigate a multi-step form, download a report.

**Steps:**
1. Check for existing session state file
2. If no session, perform login and save state
3. Navigate to target page using saved session
4. Fill multi-step form with provided data
5. Wait for download to trigger
6. Save downloaded file to target directory

```python
import os

async def authenticated_workflow(credentials, form_data, download_dir):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        state_file = "session_state.json"

        # Restore or create session
        if os.path.exists(state_file):
            context = await browser.new_context(storage_state=state_file)
        else:
            context = await browser.new_context()
            page = await context.new_page()
            await login(page, credentials["url"], credentials["user"], credentials["pass"])
            await context.storage_state(path=state_file)

        page = await context.new_page()
        await page.goto(form_data["target_url"])

        # Fill form steps
        for step_fn in [fill_step_1, fill_step_2]:
            await step_fn(page, form_data)

        # Handle download
        async with page.expect_download() as dl_info:
            await page.click("button:has-text('Download Report')")
        download = await dl_info.value
        await download.save_as(os.path.join(download_dir, download.suggested_filename))

        await browser.close()
```

## Tools Reference

| Script | Purpose | Key Flags | Output |
|--------|---------|-----------|--------|
| `scraping_toolkit.py` | Generate Playwright scraping script skeleton | `--url`, `--selectors`, `--paginate`, `--output` | Python script or JSON config |
| `form_automation_builder.py` | Generate form-fill automation script from field spec | `--fields`, `--url`, `--output` | Python automation script |
| `anti_detection_checker.py` | Audit a Playwright script for detection vectors | `--file`, `--verbose` | Risk report with score |

All scripts are stdlib-only. Run `python3 <script> --help` for full usage.

## Anti-Patterns

### Hardcoded Waits
**Bad:** `await page.wait_for_timeout(5000)` before every action.
**Good:** Use `wait_for_selector`, `wait_for_url`, `expect_response`, or `wait_for_load_state`. Hardcoded waits are flaky and slow.

### No Error Recovery
**Bad:** Linear script that crashes on first failure.
**Good:** Wrap each page interaction in try/except. Take error-state screenshots. Implement retry with exponential backoff.

### Ignoring robots.txt
**Bad:** Scraping without checking robots.txt directives.
**Good:** Fetch and parse robots.txt before scraping. Respect `Crawl-delay`. Skip disallowed paths. Add your bot name to User-Agent if running at scale.
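The Python standard library covers this: `urllib.robotparser` parses directives and crawl delays. A minimal sketch, parsing from a string so no network is needed (in practice you would call `rp.set_url(...)` and `rp.read()`):

```python
from urllib.robotparser import RobotFileParser

def build_robot_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content into a queryable rule set."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp

rules = build_robot_rules("User-agent: *\nDisallow: /private/\nCrawl-delay: 2")
allowed = rules.can_fetch("my-scraper", "https://example.com/products")
delay = rules.crawl_delay("my-scraper")
```

Check `can_fetch` before every `page.goto`, and feed `crawl_delay` into your throttling helper.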
### Storing Credentials in Scripts
**Bad:** Hardcoding usernames and passwords in Python files.
**Good:** Use environment variables, `.env` files (gitignored), or a secrets manager. Pass credentials via CLI arguments.
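A sketch of the environment-variable approach; `SCRAPER_USER` and `SCRAPER_PASS` are hypothetical variable names:

```python
import os

def load_credentials() -> dict:
    """Read scraper credentials from the environment; fail fast when unset."""
    try:
        # SCRAPER_USER / SCRAPER_PASS are illustrative names, not a convention
        return {"user": os.environ["SCRAPER_USER"], "pass": os.environ["SCRAPER_PASS"]}
    except KeyError as missing:
        raise SystemExit(f"Set the {missing} environment variable before running.")
```

Failing fast at startup beats a cryptic login failure halfway through a long job.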
### No Rate Limiting
**Bad:** Hammering a site with 100 requests/second.
**Good:** Add random delays between requests (1-3s for polite scraping). Monitor for 429 responses. Implement exponential backoff.

### Selector Fragility
**Bad:** Relying on auto-generated class names (`.css-1a2b3c`) or deep nesting (`div > div > div > span:nth-child(3)`).
**Good:** Use data attributes, semantic HTML, or text-based locators. Test selectors in browser DevTools first.

### Not Cleaning Up Browser Instances
**Bad:** Launching browsers without closing them, leading to resource leaks.
**Good:** Always use `try/finally` or async context managers to ensure `browser.close()` is called.

### Running Headed in Production
**Bad:** Using `headless=False` in production/CI.
**Good:** Develop with headed mode for debugging, deploy with `headless=True`. Use environment variable to toggle: `headless = os.environ.get("HEADLESS", "true") == "true"`.

## Cross-References

- **playwright-pro** — Browser testing skill. Use for E2E tests, test assertions, test fixtures. Browser Automation is for data extraction and workflow automation, not testing.
- **api-test-suite-builder** — When the website has a public API, hit the API directly instead of scraping the rendered page. Faster, more reliable, less detectable.
- **performance-profiler** — If your automation scripts are slow, profile the bottlenecks before adding concurrency.
- **env-secrets-manager** — For securely managing credentials used in authenticated automation workflows.
520
engineering/browser-automation/anti_detection_checker.py
Normal file
@@ -0,0 +1,520 @@
#!/usr/bin/env python3
|
||||
"""
|
||||
Anti-Detection Checker - Audits Playwright scripts for common bot detection vectors.
|
||||
|
||||
Analyzes a Playwright automation script and identifies patterns that make the
|
||||
browser detectable as a bot. Produces a risk score (0-100) with specific
|
||||
recommendations for each issue found.
|
||||
|
||||
Detection vectors checked:
|
||||
- Headless mode usage
|
||||
- Default/missing user agent configuration
|
||||
- Viewport size (default 800x600 is a red flag)
|
||||
- WebDriver flag (navigator.webdriver)
|
||||
- Navigator property overrides
|
||||
- Request throttling / human-like delays
|
||||
- Cookie/session management
|
||||
- Proxy configuration
|
||||
- Error handling patterns
|
||||
|
||||
No external dependencies - uses only Python standard library.
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from dataclasses import dataclass, asdict
|
||||
from typing import List, Optional
|
||||
|
||||
|
||||
@dataclass
|
||||
class Finding:
|
||||
"""A single detection risk finding."""
|
||||
category: str
|
||||
severity: str # "critical", "high", "medium", "low", "info"
|
||||
description: str
|
||||
line: Optional[int]
|
||||
recommendation: str
|
||||
weight: int # Points added to risk score (0-15)
|
||||
|
||||
|
||||
SEVERITY_WEIGHTS = {
|
||||
"critical": 15,
|
||||
"high": 10,
|
||||
"medium": 5,
|
||||
"low": 2,
|
||||
"info": 0,
|
||||
}
|
||||
|
||||
|
||||
class AntiDetectionChecker:
|
||||
"""Analyzes Playwright scripts for bot detection vulnerabilities."""
|
||||
|
||||
def __init__(self, script_content: str, file_path: str = "<stdin>"):
|
||||
self.content = script_content
|
||||
self.lines = script_content.split("\n")
|
||||
self.file_path = file_path
|
||||
self.findings: List[Finding] = []
|
||||
|
||||
def check_all(self) -> List[Finding]:
|
||||
"""Run all detection checks."""
|
||||
self._check_headless_mode()
|
||||
self._check_user_agent()
|
||||
self._check_viewport()
|
||||
self._check_webdriver_flag()
|
||||
self._check_navigator_properties()
|
||||
self._check_request_delays()
|
||||
self._check_error_handling()
|
||||
self._check_proxy()
|
||||
self._check_session_management()
|
||||
self._check_browser_close()
|
||||
self._check_stealth_imports()
|
||||
return self.findings
|
||||
|
||||
def _find_line(self, pattern: str) -> Optional[int]:
|
||||
"""Find the first line number matching a regex pattern."""
|
||||
for i, line in enumerate(self.lines, 1):
|
||||
if re.search(pattern, line):
|
||||
return i
|
||||
return None
|
||||
|
||||
def _has_pattern(self, pattern: str) -> bool:
|
||||
"""Check if pattern exists anywhere in the script."""
|
||||
return bool(re.search(pattern, self.content))
|
||||
|
||||
def _check_headless_mode(self):
|
||||
"""Check if headless mode is properly configured."""
|
||||
if self._has_pattern(r"headless\s*=\s*False"):
|
||||
self.findings.append(Finding(
|
||||
category="Headless Mode",
|
||||
severity="high",
|
||||
description="Browser launched in headed mode (headless=False). This is fine for development but should be headless=True in production.",
|
||||
line=self._find_line(r"headless\s*=\s*False"),
|
||||
recommendation="Use headless=True for production. Toggle via environment variable: headless=os.environ.get('HEADLESS', 'true') == 'true'",
|
||||
weight=SEVERITY_WEIGHTS["high"],
|
||||
))
|
||||
elif not self._has_pattern(r"headless"):
|
||||
# Default is headless=True in Playwright, which is correct
|
||||
self.findings.append(Finding(
|
||||
category="Headless Mode",
|
||||
severity="info",
|
||||
description="Using default headless mode (True). Good for production.",
|
||||
line=None,
|
||||
recommendation="No action needed. Default headless=True is correct.",
|
||||
weight=SEVERITY_WEIGHTS["info"],
|
||||
))
|
||||
|
||||
def _check_user_agent(self):
|
||||
"""Check if a custom user agent is set."""
|
||||
has_ua = self._has_pattern(r"user_agent\s*=") or self._has_pattern(r"userAgent")
|
||||
has_ua_list = self._has_pattern(r"USER_AGENTS?\s*=\s*\[")
|
||||
has_random_ua = self._has_pattern(r"random\.choice.*(?:USER_AGENT|user_agent|ua)")
|
||||
|
||||
if not has_ua:
|
||||
self.findings.append(Finding(
|
||||
category="User Agent",
|
||||
severity="critical",
|
||||
                description="No custom user agent configured. Playwright's default user agent contains 'HeadlessChrome', which is trivially detected.",
                line=None,
                recommendation="Set a realistic user agent: context = await browser.new_context(user_agent='Mozilla/5.0 ...')",
                weight=SEVERITY_WEIGHTS["critical"],
            ))
        elif has_ua_list and has_random_ua:
            self.findings.append(Finding(
                category="User Agent",
                severity="info",
                description="User agent rotation detected. Good anti-detection practice.",
                line=self._find_line(r"USER_AGENTS?\s*=\s*\["),
                recommendation="Ensure user agents are recent and match the browser being launched (e.g., Chrome UA for Chromium).",
                weight=SEVERITY_WEIGHTS["info"],
            ))
        elif has_ua:
            self.findings.append(Finding(
                category="User Agent",
                severity="low",
                description="Custom user agent set but no rotation detected. A single user agent is fingerprintable at scale.",
                line=self._find_line(r"user_agent\s*="),
                recommendation="Rotate through 5-10 recent user agents using random.choice().",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_viewport(self):
        """Check viewport configuration."""
        has_viewport = self._has_pattern(r"viewport\s*=\s*\{") or self._has_pattern(r"viewport.*width")

        if not has_viewport:
            self.findings.append(Finding(
                category="Viewport Size",
                severity="high",
                description="No viewport configured. The default Playwright viewport (1280x720) is common among bots. Sites may flag unusual viewport distributions.",
                line=None,
                recommendation="Set a common desktop viewport: viewport={'width': 1920, 'height': 1080}. Vary across runs.",
                weight=SEVERITY_WEIGHTS["high"],
            ))
        else:
            # Check for suspiciously small viewports
            match = re.search(r"width['\"]?\s*[:=]\s*(\d+)", self.content)
            if match:
                width = int(match.group(1))
                if width < 1024:
                    self.findings.append(Finding(
                        category="Viewport Size",
                        severity="medium",
                        description=f"Viewport width {width}px is unusually small. Most desktop browsers are 1366px+ wide.",
                        line=self._find_line(r"width.*" + str(width)),
                        recommendation="Use 1366x768 (most common) or 1920x1080. Avoid unusual sizes like 800x600.",
                        weight=SEVERITY_WEIGHTS["medium"],
                    ))
                else:
                    self.findings.append(Finding(
                        category="Viewport Size",
                        severity="info",
                        description=f"Viewport width {width}px is reasonable.",
                        line=self._find_line(r"width.*" + str(width)),
                        recommendation="No action needed.",
                        weight=SEVERITY_WEIGHTS["info"],
                    ))
    def _check_webdriver_flag(self):
        """Check if navigator.webdriver is being removed."""
        has_webdriver_override = (
            self._has_pattern(r"navigator.*webdriver") or
            self._has_pattern(r"webdriver.*undefined") or
            self._has_pattern(r"add_init_script.*webdriver")
        )

        if not has_webdriver_override:
            self.findings.append(Finding(
                category="WebDriver Flag",
                severity="critical",
                description="navigator.webdriver is not overridden. This is the most common bot detection check. Every major anti-bot service tests this property.",
                line=None,
                recommendation=(
                    "Add init script to remove the flag:\n"
                    "  await page.add_init_script(\"Object.defineProperty(navigator, 'webdriver', {get: () => undefined});\")"
                ),
                weight=SEVERITY_WEIGHTS["critical"],
            ))
        else:
            self.findings.append(Finding(
                category="WebDriver Flag",
                severity="info",
                description="navigator.webdriver override detected.",
                line=self._find_line(r"webdriver"),
                recommendation="No action needed.",
                weight=SEVERITY_WEIGHTS["info"],
            ))

    def _check_navigator_properties(self):
        """Check for additional navigator property hardening."""
        checks = {
            "plugins": (r"navigator.*plugins", "navigator.plugins is empty in headless mode. Real browsers report installed plugins."),
            "languages": (r"navigator.*languages", "navigator.languages should be set to match the user agent locale."),
            "platform": (r"navigator.*platform", "navigator.platform should match the user agent OS."),
        }

        overridden_count = 0
        for prop, (pattern, desc) in checks.items():
            if self._has_pattern(pattern):
                overridden_count += 1

        if overridden_count == 0:
            self.findings.append(Finding(
                category="Navigator Properties",
                severity="medium",
                description="No navigator property hardening detected. Advanced anti-bot services check plugins, languages, and platform properties.",
                line=None,
                recommendation="Override navigator.plugins, navigator.languages, and navigator.platform via add_init_script() to match realistic browser fingerprints.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))
        elif overridden_count < 3:
            self.findings.append(Finding(
                category="Navigator Properties",
                severity="low",
                description=f"Partial navigator hardening ({overridden_count}/3 properties). Consider covering all three: plugins, languages, platform.",
                line=None,
                recommendation="Add overrides for any missing properties among: plugins, languages, platform.",
                weight=SEVERITY_WEIGHTS["low"],
            ))
    def _check_request_delays(self):
        """Check for human-like request delays."""
        has_sleep = self._has_pattern(r"asyncio\.sleep") or self._has_pattern(r"wait_for_timeout")
        has_random_delay = (
            self._has_pattern(r"random\.(uniform|randint|random)") and has_sleep
        )

        if not has_sleep:
            self.findings.append(Finding(
                category="Request Timing",
                severity="high",
                description="No delays between actions detected. Machine-speed interactions are the easiest behavior-based detection signal.",
                line=None,
                recommendation="Add random delays between page interactions: await asyncio.sleep(random.uniform(0.5, 2.0))",
                weight=SEVERITY_WEIGHTS["high"],
            ))
        elif not has_random_delay:
            self.findings.append(Finding(
                category="Request Timing",
                severity="medium",
                description="Fixed delays detected but no randomization. Constant timing intervals are detectable patterns.",
                line=self._find_line(r"(asyncio\.sleep|wait_for_timeout)"),
                recommendation="Use random delays: random.uniform(min_seconds, max_seconds) instead of fixed values.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))
        else:
            self.findings.append(Finding(
                category="Request Timing",
                severity="info",
                description="Randomized delays detected between actions.",
                line=self._find_line(r"random\.(uniform|randint)"),
                recommendation="No action needed. Ensure delays are realistic (0.5-3s for browsing, 1-5s for reading).",
                weight=SEVERITY_WEIGHTS["info"],
            ))

    def _check_error_handling(self):
        """Check for error handling patterns."""
        has_try_except = self._has_pattern(r"try\s*:") and self._has_pattern(r"except")
        has_retry = self._has_pattern(r"retr(y|ies)") or self._has_pattern(r"max_retries|max_attempts")

        if not has_try_except:
            self.findings.append(Finding(
                category="Error Handling",
                severity="medium",
                description="No try/except blocks found. Unhandled errors will crash the automation and leave browser instances running.",
                line=None,
                recommendation="Wrap page interactions in try/except. Handle TimeoutError, network errors, and element-not-found gracefully.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))
        elif not has_retry:
            self.findings.append(Finding(
                category="Error Handling",
                severity="low",
                description="Error handling present but no retry logic detected. Transient failures (network blips, slow loads) will cause data loss.",
                line=None,
                recommendation="Add retry with exponential backoff for network operations and element interactions.",
                weight=SEVERITY_WEIGHTS["low"],
            ))
    def _check_proxy(self):
        """Check for proxy configuration."""
        has_proxy = self._has_pattern(r"proxy\s*=\s*\{") or self._has_pattern(r"proxy.*server")

        if not has_proxy:
            self.findings.append(Finding(
                category="Proxy",
                severity="low",
                description="No proxy configuration detected. Running from a single IP address is fine for small jobs but will trigger rate limits at scale.",
                line=None,
                recommendation="For high-volume scraping, use rotating proxies: proxy={'server': 'http://proxy:port'}",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_session_management(self):
        """Check for session/cookie management."""
        has_storage_state = self._has_pattern(r"storage_state")
        has_cookies = self._has_pattern(r"cookies\(\)") or self._has_pattern(r"add_cookies")

        if not has_storage_state and not has_cookies:
            self.findings.append(Finding(
                category="Session Management",
                severity="low",
                description="No session persistence detected. Each run will start fresh, requiring re-authentication.",
                line=None,
                recommendation="Use storage_state() to save/restore sessions across runs. This avoids repeated logins that may trigger security alerts.",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_browser_close(self):
        """Check if the browser is properly closed."""
        has_close = self._has_pattern(r"browser\.close\(\)") or self._has_pattern(r"await.*close")
        has_context_manager = self._has_pattern(r"async\s+with\s+async_playwright")

        if not has_close and not has_context_manager:
            self.findings.append(Finding(
                category="Resource Cleanup",
                severity="medium",
                description="No browser.close() or context manager detected. Browser processes will leak on failure.",
                line=None,
                recommendation="Use 'async with async_playwright() as p:' or ensure browser.close() is in a finally block.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))

    def _check_stealth_imports(self):
        """Check for stealth/anti-detection library usage."""
        has_stealth = self._has_pattern(r"playwright_stealth|stealth_async|undetected")
        if has_stealth:
            self.findings.append(Finding(
                category="Stealth Library",
                severity="info",
                description="Third-party stealth library detected. These provide additional fingerprint evasion but add dependencies.",
                line=self._find_line(r"playwright_stealth|stealth_async|undetected"),
                recommendation="Stealth libraries are helpful but not a silver bullet. Still implement manual checks for user agent, viewport, and timing.",
                weight=SEVERITY_WEIGHTS["info"],
            ))
    def get_risk_score(self) -> int:
        """Calculate the overall risk score (0-100). Higher = more detectable."""
        raw_score = sum(f.weight for f in self.findings)
        # Cap at 100
        return min(raw_score, 100)

    def get_risk_level(self) -> str:
        """Get a human-readable risk level."""
        score = self.get_risk_score()
        if score <= 10:
            return "LOW"
        elif score <= 30:
            return "MODERATE"
        elif score <= 50:
            return "HIGH"
        else:
            return "CRITICAL"

    def get_summary(self) -> dict:
        """Get a summary of the analysis."""
        severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
        for f in self.findings:
            severity_counts[f.severity] += 1

        return {
            "file": self.file_path,
            "risk_score": self.get_risk_score(),
            "risk_level": self.get_risk_level(),
            "total_findings": len(self.findings),
            "severity_counts": severity_counts,
            "actionable_findings": len([f for f in self.findings if f.severity != "info"]),
        }
def format_text_report(checker: AntiDetectionChecker, verbose: bool = False) -> str:
    """Format findings as human-readable text."""
    lines = []
    summary = checker.get_summary()

    lines.append("=" * 60)
    lines.append("  ANTI-DETECTION AUDIT REPORT")
    lines.append("=" * 60)
    lines.append(f"File: {summary['file']}")
    lines.append(f"Risk Score: {summary['risk_score']}/100 ({summary['risk_level']})")
    lines.append(f"Total Issues: {summary['actionable_findings']} actionable, {summary['severity_counts']['info']} info")
    lines.append("")

    # Severity breakdown
    for sev in ["critical", "high", "medium", "low"]:
        count = summary["severity_counts"][sev]
        if count > 0:
            lines.append(f"  {sev.upper():10s} {count}")
    lines.append("")

    # Findings grouped by severity
    severity_order = ["critical", "high", "medium", "low"]
    if verbose:
        severity_order.append("info")

    for sev in severity_order:
        sev_findings = [f for f in checker.findings if f.severity == sev]
        if not sev_findings:
            continue

        lines.append(f"--- {sev.upper()} ---")
        for f in sev_findings:
            line_info = f" (line {f.line})" if f.line else ""
            lines.append(f"  [{f.category}]{line_info}")
            lines.append(f"    {f.description}")
            lines.append(f"    Fix: {f.recommendation}")
        lines.append("")

    # Exit code guidance
    lines.append("-" * 60)
    score = summary["risk_score"]
    if score <= 10:
        lines.append("Result: PASS - Low detection risk.")
    elif score <= 30:
        lines.append("Result: PASS with warnings - Address medium/high issues for production use.")
    else:
        lines.append("Result: FAIL - High detection risk. Fix critical and high issues before deploying.")
    lines.append("")

    return "\n".join(lines)
def main():
    parser = argparse.ArgumentParser(
        description="Audit a Playwright script for common bot detection vectors.",
        epilog=(
            "Examples:\n"
            "  %(prog)s --file scraper.py\n"
            "  %(prog)s --file scraper.py --verbose\n"
            "  %(prog)s --file scraper.py --json\n"
            "\n"
            "Exit codes:\n"
            "  0 - Low risk (score 0-10)\n"
            "  1 - Moderate to high risk (score 11-50)\n"
            "  2 - Critical risk (score 51+)\n"
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--file",
        required=True,
        help="Path to the Playwright script to audit",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        default=False,
        help="Output results as JSON",
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        default=False,
        help="Include informational (non-actionable) findings in output",
    )

    args = parser.parse_args()

    file_path = os.path.abspath(args.file)
    if not os.path.isfile(file_path):
        print(f"Error: File not found: {file_path}", file=sys.stderr)
        sys.exit(2)

    try:
        with open(file_path, "r", encoding="utf-8") as f:
            content = f.read()
    except Exception as e:
        print(f"Error reading file: {e}", file=sys.stderr)
        sys.exit(2)

    if not content.strip():
        print("Error: File is empty.", file=sys.stderr)
        sys.exit(2)

    checker = AntiDetectionChecker(content, file_path)
    checker.check_all()

    if args.json_output:
        output = checker.get_summary()
        output["findings"] = [asdict(f) for f in checker.findings]
        if not args.verbose:
            output["findings"] = [f for f in output["findings"] if f["severity"] != "info"]
        print(json.dumps(output, indent=2))
    else:
        print(format_text_report(checker, verbose=args.verbose))

    # Exit code based on risk
    score = checker.get_risk_score()
    if score <= 10:
        sys.exit(0)
    elif score <= 50:
        sys.exit(1)
    else:
        sys.exit(2)


if __name__ == "__main__":
    main()
engineering/browser-automation/form_automation_builder.py (new file, 324 lines)
@@ -0,0 +1,324 @@
#!/usr/bin/env python3
"""
Form Automation Builder - Generates Playwright form-fill automation scripts.

Takes a JSON field specification and target URL, then produces a ready-to-run
Playwright script that fills forms, handles multi-step flows, and manages
file uploads.

No external dependencies - uses only the Python standard library.
"""

import argparse
import json
import os
import sys
import textwrap
from datetime import datetime


SUPPORTED_FIELD_TYPES = {
    "text": "page.fill('{selector}', '{value}')",
    "password": "page.fill('{selector}', '{value}')",
    "email": "page.fill('{selector}', '{value}')",
    "textarea": "page.fill('{selector}', '{value}')",
    "select": "page.select_option('{selector}', value='{value}')",
    "checkbox": "page.check('{selector}')",  # check vs. uncheck is decided per-field at generation time
    "radio": "page.check('{selector}')",
    "file": "page.set_input_files('{selector}', '{value}')",
    "click": "page.click('{selector}')",
}

def validate_fields(fields):
    """Validate the field specification format. Returns a list of issues."""
    issues = []
    if not isinstance(fields, list):
        issues.append("Top-level structure must be a JSON array of field objects.")
        return issues

    for i, field in enumerate(fields):
        if not isinstance(field, dict):
            issues.append(f"Field {i}: must be a JSON object.")
            continue
        if "selector" not in field:
            issues.append(f"Field {i}: missing required 'selector' key.")
        if "type" not in field:
            issues.append(f"Field {i}: missing required 'type' key.")
        elif field["type"] not in SUPPORTED_FIELD_TYPES:
            issues.append(
                f"Field {i}: unsupported type '{field['type']}'. "
                f"Supported: {', '.join(sorted(SUPPORTED_FIELD_TYPES.keys()))}"
            )
        if field.get("type") not in ("checkbox", "radio", "click") and "value" not in field:
            issues.append(f"Field {i}: missing 'value' for type '{field.get('type', '?')}'.")

    return issues


def generate_field_action(field, indent=8):
    """Generate the Playwright action lines for a single field."""
    ftype = field["type"]
    selector = field["selector"]
    value = field.get("value", "")
    label = field.get("label", selector)
    prefix = " " * indent

    lines = []
    lines.append(f"{prefix}# {label}")

    if ftype == "checkbox":
        # str() so JSON booleans (true/false) are handled as well as strings
        if str(field.get("value", "true")).lower() in ("true", "yes", "1", "on"):
            lines.append(f'{prefix}await page.check("{selector}")')
        else:
            lines.append(f'{prefix}await page.uncheck("{selector}")')
    elif ftype == "radio":
        lines.append(f'{prefix}await page.check("{selector}")')
    elif ftype == "click":
        lines.append(f'{prefix}await page.click("{selector}")')
    elif ftype == "select":
        lines.append(f'{prefix}await page.select_option("{selector}", value="{value}")')
    elif ftype == "file":
        lines.append(f'{prefix}await page.set_input_files("{selector}", "{value}")')
    else:
        # text, password, email, textarea
        lines.append(f'{prefix}await page.fill("{selector}", "{value}")')

    # Add an optional wait after the action
    wait_after = field.get("wait_after")
    if wait_after:
        lines.append(f'{prefix}await page.wait_for_selector("{wait_after}")')

    return "\n".join(lines)
def build_form_script(url, fields, output_format="script"):
    """Build a Playwright form automation script from the field specification."""

    issues = validate_fields(fields)
    if issues:
        return None, issues

    if output_format == "json":
        config = {
            "url": url,
            "fields": fields,
            "field_count": len(fields),
            "field_types": sorted(set(f["type"] for f in fields)),
            "has_file_upload": any(f["type"] == "file" for f in fields),
            "generated_at": datetime.now().isoformat(),
        }
        return config, None

    # Group fields into steps if step markers are present
    steps = {}
    for field in fields:
        step = field.get("step", 1)
        if step not in steps:
            steps[step] = []
        steps[step].append(field)

    multi_step = len(steps) > 1

    # Generate step functions; actions use indent=4 to match the
    # flush-left function bodies in the generated module
    step_functions = []
    for step_num in sorted(steps.keys()):
        step_fields = steps[step_num]
        actions = "\n".join(generate_field_action(f, indent=4) for f in step_fields)

        if multi_step:
            fn = textwrap.dedent(f"""\
async def fill_step_{step_num}(page):
    \"\"\"Fill form step {step_num} ({len(step_fields)} fields).\"\"\"
    print("Filling step {step_num}...")
{actions}
    print("Step {step_num} complete.")
""")
        else:
            fn = textwrap.dedent(f"""\
async def fill_form(page):
    \"\"\"Fill form ({len(step_fields)} fields).\"\"\"
    print("Filling form...")
{actions}
    print("Form filled.")
""")
        step_functions.append(fn)

    step_functions_str = "\n\n".join(step_functions)

    # Generate the main() call sequence (8-space indent: these lines land
    # inside main()'s `async with` block in the generated script)
    if multi_step:
        step_calls = "\n".join(
            f"        await fill_step_{n}(page)" for n in sorted(steps.keys())
        )
    else:
        step_calls = "        await fill_form(page)"

    submit_selector = None
    for field in fields:
        if field.get("type") == "click" and field.get("is_submit"):
            submit_selector = field["selector"]
            break

    submit_block = ""
    if submit_selector:
        submit_block = (
            "\n"
            "        # Submit\n"
            f'        await page.click("{submit_selector}")\n'
            '        await page.wait_for_load_state("networkidle")\n'
            '        print("Form submitted.")\n'
        )

    script = textwrap.dedent(f'''\
#!/usr/bin/env python3
"""
Auto-generated Playwright form automation script.
Target: {url}
Fields: {len(fields)}
Steps: {len(steps)}
Generated: {datetime.now().isoformat()}

Requirements:
    pip install playwright
    playwright install chromium
"""

import asyncio
import random

from playwright.async_api import async_playwright

URL = "{url}"

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


{step_functions_str}

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={{"width": 1920, "height": 1080}},
            user_agent=random.choice(USER_AGENTS),
        )
        page = await context.new_page()

        await page.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {{get: () => undefined}});"
        )

        print(f"Navigating to {{URL}}...")
        await page.goto(URL, wait_until="networkidle")

{step_calls}
{submit_block}
        print("Automation complete.")
        await browser.close()


if __name__ == "__main__":
    asyncio.run(main())
''')

    return script, None
def main():
    parser = argparse.ArgumentParser(
        description="Generate Playwright form-fill automation scripts from a JSON field specification.",
        epilog=textwrap.dedent("""\
            Examples:
              %(prog)s --url https://example.com/signup --fields fields.json
              %(prog)s --url https://example.com/signup --fields fields.json --output fill_form.py
              %(prog)s --url https://example.com/signup --fields fields.json --json

            Field specification format (fields.json):
              [
                {"selector": "#email", "type": "email", "value": "user@example.com", "label": "Email"},
                {"selector": "#password", "type": "password", "value": "s3cret"},
                {"selector": "#country", "type": "select", "value": "US"},
                {"selector": "#terms", "type": "checkbox", "value": "true"},
                {"selector": "#avatar", "type": "file", "value": "/path/to/photo.jpg"},
                {"selector": "button[type='submit']", "type": "click", "is_submit": true}
              ]

            Supported field types: text, password, email, textarea, select, checkbox, radio, file, click

            Multi-step forms: add "step": N to each field to group fields into steps.
            """),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--url",
        required=True,
        help="Target form URL",
    )
    parser.add_argument(
        "--fields",
        required=True,
        help="Path to JSON file containing field specifications",
    )
    parser.add_argument(
        "--output",
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        default=False,
        help="Output JSON configuration instead of a Python script",
    )

    args = parser.parse_args()

    # Load fields
    fields_path = os.path.abspath(args.fields)
    if not os.path.isfile(fields_path):
        print(f"Error: Fields file not found: {fields_path}", file=sys.stderr)
        sys.exit(2)

    try:
        with open(fields_path, "r") as f:
            fields = json.load(f)
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in {fields_path}: {e}", file=sys.stderr)
        sys.exit(2)

    output_format = "json" if args.json_output else "script"
    result, errors = build_form_script(
        url=args.url,
        fields=fields,
        output_format=output_format,
    )

    if errors:
        print("Validation errors:", file=sys.stderr)
        for err in errors:
            print(f"  - {err}", file=sys.stderr)
        sys.exit(2)

    if args.json_output:
        output_text = json.dumps(result, indent=2)
    else:
        output_text = result

    if args.output:
        output_path = os.path.abspath(args.output)
        with open(output_path, "w") as f:
            f.write(output_text)
        if not args.json_output:
            os.chmod(output_path, 0o755)
        print(f"Written to {output_path}", file=sys.stderr)
        sys.exit(0)
    else:
        print(output_text)
        sys.exit(0)


if __name__ == "__main__":
    main()
@@ -0,0 +1,453 @@
# Anti-Detection Patterns for Browser Automation

This reference covers techniques to make Playwright automation less detectable by anti-bot services. These are defense-in-depth measures — no single technique is sufficient, but combining them significantly reduces detection risk.

## Detection Vectors

Anti-bot systems detect automation through multiple signals. Understanding what they check helps you counter effectively.

### Tier 1: Trivial Detection (Every Site Checks These)

1. **navigator.webdriver** — Set to `true` by all automation frameworks
2. **User-Agent string** — The default headless UA contains "HeadlessChrome"
3. **WebGL renderer** — Headless Chrome reports "SwiftShader" or "Google SwiftShader"

### Tier 2: Common Detection (Most Anti-Bot Services)

4. **Viewport/screen dimensions** — Unusual sizes flag automation
5. **Plugins array** — Empty in headless mode, populated in real browsers
6. **Languages** — Missing or mismatched locale
7. **Request timing** — Machine-speed interactions
8. **Mouse movement** — No mouse events between clicks
9. **Canvas fingerprint** — Headless renders differently
10. **WebGL fingerprint** — GPU-specific rendering variations
11. **Audio fingerprint** — AudioContext processing differences
12. **Font enumeration** — Different available fonts in headless
13. **Behavioral analysis** — Scroll patterns, click patterns, reading time

### Tier 3: Advanced Detection (Cloudflare, DataDome, PerimeterX)

The tier-3 services combine the fingerprint vectors above (canvas, WebGL, audio, fonts) with behavioral analysis over time.
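Vectors 8 and 13 are behavioral: a cursor that teleports straight from element to element with no intermediate mouse events is easy to flag. One way to counter this is to move through jittered intermediate points. A minimal sketch (the helper name and parameters are illustrative, not from any library):

```python
import math
import random


def human_mouse_path(start, end, waypoints=12):
    """Generate jittered intermediate points between two coordinates.

    Feed each point to `await page.mouse.move(x, y)` with a short random
    sleep in between to avoid the "teleporting cursor" signal.
    """
    (x0, y0), (x1, y1) = start, end
    path = []
    for i in range(1, waypoints + 1):
        t = i / waypoints
        # Ease in/out so speed drops near the endpoints, like a human hand
        eased = (1 - math.cos(t * math.pi)) / 2
        # Small lateral jitter, except on the final point (must hit the target)
        jitter = 0 if i == waypoints else random.uniform(-3, 3)
        path.append((x0 + (x1 - x0) * eased + jitter,
                     y0 + (y1 - y0) * eased + jitter))
    return path
```

Playwright's built-in `steps=` argument to `mouse.move()` gives a simpler linear version of the same idea; the sketch above adds jitter and easing on top.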
## Stealth Techniques

### 1. WebDriver Flag Removal

The most critical fix. Every anti-bot check starts here.

```python
await page.add_init_script("""
    // Remove webdriver flag
    Object.defineProperty(navigator, 'webdriver', {
        get: () => undefined,
    });

    // Remove Playwright-specific properties
    delete window.__playwright;
    delete window.__pw_manual;
""")
```

### 2. User Agent Configuration

Match the user agent to the browser you are launching. A Chrome UA with Firefox-specific headers is a red flag.

```python
# Chrome 120 on Windows 10 (most common configuration globally)
CHROME_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

# Chrome 120 on macOS
CHROME_MAC = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

# Chrome 120 on Linux
CHROME_LINUX = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

# Firefox 121 on Windows
FIREFOX_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0"
```

**Rules:**

- Update UAs every 2-3 months as browser versions increment
- Match the UA platform to the `navigator.platform` override
- If using Chromium, use Chrome UAs. If Firefox, use Firefox UAs.
- Never use obviously fake or ancient UAs
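One way to enforce the second rule is to select the UA and the matching `navigator.platform` override together, so the two can never drift apart. A sketch, with illustrative names:

```python
import random

# Each profile keeps the UA and the spoofed navigator.platform consistent.
# (UA_PROFILES and pick_profile are illustrative names, not from any library.)
UA_PROFILES = [
    {
        "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "platform": "Win32",
    },
    {
        "user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "platform": "MacIntel",
    },
]


def pick_profile():
    """Pick a UA plus the matching navigator.platform init script."""
    profile = random.choice(UA_PROFILES)
    init_script = (
        "Object.defineProperty(navigator, 'platform', "
        f"{{get: () => '{profile['platform']}'}});"
    )
    return profile["user_agent"], init_script
```

The returned UA goes to `browser.new_context(user_agent=...)` and the script to `context.add_init_script(...)`.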

### 3. Viewport and Screen Properties

Common real-world screen resolutions (from analytics data):

| Resolution | Market Share | Use For |
|------------|--------------|---------|
| 1920x1080  | ~23%         | Default choice |
| 1366x768   | ~14%         | Laptop simulation |
| 1536x864   | ~9%          | Scaled laptop |
| 1440x900   | ~7%          | MacBook |
| 2560x1440  | ~5%          | High-end desktop |

```python
import random

VIEWPORTS = [
    {"width": 1920, "height": 1080},
    {"width": 1366, "height": 768},
    {"width": 1536, "height": 864},
    {"width": 1440, "height": 900},
]

viewport = random.choice(VIEWPORTS)
context = await browser.new_context(
    viewport=viewport,
    screen=viewport,  # screen should match viewport
)
```
### 4. Navigator Properties Hardening

```python
STEALTH_INIT = """
// Plugins (headless Chrome has 0 plugins, real Chrome has 3-5)
Object.defineProperty(navigator, 'plugins', {
    get: () => {
        const plugins = [
            { name: 'Chrome PDF Plugin', filename: 'internal-pdf-viewer' },
            { name: 'Chrome PDF Viewer', filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai' },
            { name: 'Native Client', filename: 'internal-nacl-plugin' },
        ];
        plugins.length = 3;
        return plugins;
    },
});

// Languages
Object.defineProperty(navigator, 'languages', {
    get: () => ['en-US', 'en'],
});

// Platform (match to user agent)
Object.defineProperty(navigator, 'platform', {
    get: () => 'Win32',  // or 'MacIntel' for a macOS UA
});

// Hardware concurrency (real browsers report CPU cores)
Object.defineProperty(navigator, 'hardwareConcurrency', {
    get: () => 8,
});

// Device memory (Chrome-specific)
Object.defineProperty(navigator, 'deviceMemory', {
    get: () => 8,
});

// Connection info
Object.defineProperty(navigator, 'connection', {
    get: () => ({
        effectiveType: '4g',
        rtt: 50,
        downlink: 10,
        saveData: false,
    }),
});
"""

await context.add_init_script(STEALTH_INIT)
```
### 5. WebGL Fingerprint Evasion

Headless Chrome uses SwiftShader for WebGL, which anti-bot services detect.

```python
# Option A: Launch with a real GPU (headed mode on a machine with a GPU)
browser = await p.chromium.launch(headless=False)

# Option B: Override WebGL renderer info
await page.add_init_script("""
    const getParameter = WebGLRenderingContext.prototype.getParameter;
    WebGLRenderingContext.prototype.getParameter = function(parameter) {
        if (parameter === 37445) {
            return 'Intel Inc.';  // UNMASKED_VENDOR_WEBGL
        }
        if (parameter === 37446) {
            return 'Intel(R) Iris(TM) Plus Graphics 640';  // UNMASKED_RENDERER_WEBGL
        }
        return getParameter.call(this, parameter);
    };
""")
```
### 6. Canvas Fingerprint Noise

Anti-bot services render text/shapes to a canvas and hash the output. Headless Chrome produces a different hash.

```python
await page.add_init_script("""
    const originalToDataURL = HTMLCanvasElement.prototype.toDataURL;
    HTMLCanvasElement.prototype.toDataURL = function(type) {
        if (type === 'image/png' || type === undefined) {
            // Add minimal noise to the canvas to change the fingerprint
            const ctx = this.getContext('2d');
            if (ctx) {
                const imageData = ctx.getImageData(0, 0, this.width, this.height);
                for (let i = 0; i < imageData.data.length; i += 4) {
                    // Flip the low bit of the red channel (imperceptible)
                    imageData.data[i] = imageData.data[i] ^ 1;
                }
                ctx.putImageData(imageData, 0, 0);
            }
        }
        return originalToDataURL.apply(this, arguments);
    };
""")
```

## Request Throttling Patterns

### Human-Like Delays

Real users do not click at machine speed. Add realistic delays between actions.

```python
import random
import asyncio

async def human_delay(action_type="browse"):
    """Add realistic delay based on action type."""
    delays = {
        "browse": (1.0, 3.0),   # Browsing between pages
        "read": (2.0, 8.0),     # Reading content
        "fill": (0.3, 0.8),     # Between form fields
        "click": (0.1, 0.5),    # Before clicking
        "scroll": (0.5, 1.5),   # Between scroll actions
    }
    min_s, max_s = delays.get(action_type, (0.5, 2.0))
    await asyncio.sleep(random.uniform(min_s, max_s))
```

### Request Rate Limiting

```python
import time
import asyncio

class RateLimiter:
    """Enforce minimum delay between requests."""

    def __init__(self, min_interval_seconds=1.0):
        self.min_interval = min_interval_seconds
        self.last_request_time = 0

    async def wait(self):
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_interval:
            await asyncio.sleep(self.min_interval - elapsed)
        self.last_request_time = time.time()

# Usage
limiter = RateLimiter(min_interval_seconds=2.0)
for url in urls:
    await limiter.wait()
    await page.goto(url)
```

### Exponential Backoff on Errors

```python
async def with_backoff(coro_factory, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.1f}s...")
            await asyncio.sleep(delay)
```

## Proxy Rotation Strategies

### Single Proxy

```python
browser = await p.chromium.launch(
    proxy={"server": "http://proxy.example.com:8080"}
)
```

### Authenticated Proxy

```python
context = await browser.new_context(
    proxy={
        "server": "http://proxy.example.com:8080",
        "username": "user",
        "password": "pass",
    }
)
```

### Rotating Proxy Pool

```python
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

async def create_context_with_proxy(browser):
    proxy = random.choice(PROXIES)
    return await browser.new_context(
        proxy={"server": proxy}
    )
```

### Per-Request Proxy (via Context Rotation)

Playwright does not support per-request proxy switching. Achieve it by creating a new context for each request or batch:

```python
async def scrape_url(browser, url, proxy):
    context = await browser.new_context(proxy={"server": proxy})
    page = await context.new_page()
    try:
        await page.goto(url)
        data = await extract_data(page)
        return data
    finally:
        await context.close()
```

### SOCKS5 Proxy

```python
browser = await p.chromium.launch(
    proxy={"server": "socks5://proxy.example.com:1080"}
)
```

## Headless Detection Avoidance

### Running Chrome Channel Instead of Chromium

The bundled Chromium binary has different properties than a real Chrome install. Using the Chrome channel makes the browser much harder to distinguish from a normal install.

```python
# Use installed Chrome instead of bundled Chromium
browser = await p.chromium.launch(channel="chrome", headless=True)
```

**Requirement:** Chrome must be installed on the system.

### New Headless Mode (Chrome 112+)

Chrome's "new headless" mode is harder to detect than the old one:

```python
browser = await p.chromium.launch(
    args=["--headless=new"],
)
```

### Avoiding Common Flags

Do NOT pass these flags — they are headless-detection signals:

- `--disable-gpu` (old headless workaround, not needed)
- `--no-sandbox` (security risk, detectable)
- `--disable-setuid-sandbox` (same as above)
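
If launch arguments come from shared configuration, it can be worth filtering these signals out programmatically. A minimal sketch (`DETECTABLE_FLAGS` and `sanitize_launch_args` are our own illustrative names, not Playwright APIs):

```python
# Flags that detection scripts commonly treat as automation signals.
# Illustrative list, not exhaustive.
DETECTABLE_FLAGS = {"--disable-gpu", "--no-sandbox", "--disable-setuid-sandbox"}

def sanitize_launch_args(args):
    """Drop detection-signal flags, keeping everything else in order."""
    return [a for a in args if a.split("=")[0] not in DETECTABLE_FLAGS]

# Usage:
# browser = await p.chromium.launch(args=sanitize_launch_args(configured_args))
```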

## Behavioral Evasion

### Mouse Movement Simulation

Anti-bot services track mouse events. A click without preceding mouse movement is suspicious.

```python
async def human_click(page, selector):
    """Click with preceding mouse movement."""
    element = await page.query_selector(selector)
    box = await element.bounding_box()
    if box:
        # Move to element with slight offset
        x = box["x"] + box["width"] / 2 + random.uniform(-5, 5)
        y = box["y"] + box["height"] / 2 + random.uniform(-5, 5)
        await page.mouse.move(x, y, steps=random.randint(5, 15))
        await asyncio.sleep(random.uniform(0.05, 0.2))
        await page.mouse.click(x, y)
```

### Typing Speed Variation

```python
async def human_type(page, selector, text):
    """Type with variable speed like a human."""
    await page.click(selector)
    for char in text:
        await page.keyboard.type(char)
        # Faster for common keys, slower for special characters
        if char in "aeiou tnrs":
            await asyncio.sleep(random.uniform(0.03, 0.08))
        else:
            await asyncio.sleep(random.uniform(0.08, 0.20))
```

### Scroll Behavior

Real users scroll gradually, not in instant jumps.

```python
async def human_scroll(page, distance=None):
    """Scroll down gradually like a human."""
    if distance is None:
        distance = random.randint(300, 800)

    current = 0
    while current < distance:
        step = random.randint(50, 150)
        await page.mouse.wheel(0, step)
        current += step
        await asyncio.sleep(random.uniform(0.05, 0.15))
```

## Detection Testing

### Self-Check Script

Navigate to these URLs to test your stealth configuration:

- `https://bot.sannysoft.com/` — Comprehensive bot detection test
- `https://abrahamjuliot.github.io/creepjs/` — Advanced fingerprint analysis
- `https://browserleaks.com/webgl` — WebGL fingerprint details
- `https://browserleaks.com/canvas` — Canvas fingerprint details

### Quick Test Pattern

```python
async def test_stealth(page):
    """Navigate to detection test page and report results."""
    await page.goto("https://bot.sannysoft.com/")
    await page.wait_for_timeout(3000)

    # Check for failed tests
    failed = await page.eval_on_selector_all(
        "td.failed",
        "els => els.map(e => e.parentElement.querySelector('td').textContent)"
    )

    if failed:
        print(f"FAILED checks: {failed}")
    else:
        print("All checks passed.")

    await page.screenshot(path="stealth_test.png", full_page=True)
```

## Recommended Stealth Stack

For most automation tasks, apply these in order of priority:

1. **WebDriver flag removal** — Critical, takes 2 lines
2. **Custom user agent** — Critical, takes 1 line
3. **Viewport configuration** — High priority, takes 1 line
4. **Request delays** — High priority, add random.uniform() calls
5. **Navigator properties** — Medium priority, init script block
6. **Chrome channel** — Medium priority, one launch option
7. **WebGL override** — Low priority unless hitting advanced anti-bot
8. **Canvas noise** — Low priority unless hitting advanced anti-bot
9. **Proxy rotation** — Only for high-volume or repeated scraping
10. **Behavioral simulation** — Only for sites with behavioral analysis
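
The top three items can live in one small helper so every script starts from the same baseline. A sketch under the assumption of a Windows Chrome user agent string; the helper name and values are our own, not part of Playwright:

```python
# Covers priorities 1-3: webdriver flag removal, custom UA, viewport.
WEBDRIVER_PATCH = """
Object.defineProperty(navigator, 'webdriver', { get: () => undefined });
"""

def stealth_context_kwargs(user_agent=None):
    """Build kwargs for browser.new_context() covering the top three items."""
    return {
        "user_agent": user_agent or (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
        ),
        "viewport": {"width": 1920, "height": 1080},
    }

# Usage:
# context = await browser.new_context(**stealth_context_kwargs())
# await context.add_init_script(WEBDRIVER_PATCH)
```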

# Data Extraction Recipes

Practical patterns for extracting structured data from web pages using Playwright. Each recipe is a self-contained pattern you can adapt to your target site.

## CSS Selector Patterns for Common Structures

### E-Commerce Product Listings

```python
PRODUCT_SELECTORS = {
    "container": "div.product-card, article.product, li.product-item",
    "fields": {
        "title": "h2.product-title, h3.product-name, [data-testid='product-title']",
        "price": "span.price, .product-price, [data-testid='price']",
        "original_price": "span.original-price, .was-price, del",
        "rating": "span.rating, .star-rating, [data-rating]",
        "review_count": "span.review-count, .num-reviews",
        "image_url": "img.product-image::attr(src), img::attr(data-src)",
        "product_url": "a.product-link::attr(href), h2 a::attr(href)",
        "availability": "span.stock-status, .availability",
    }
}
```

### News/Blog Article Listings

```python
ARTICLE_SELECTORS = {
    "container": "article, div.post, div.article-card",
    "fields": {
        "headline": "h2 a, h3 a, .article-title",
        "summary": "p.excerpt, .article-summary, .post-excerpt",
        "author": "span.author, .byline, [rel='author']",
        "date": "time, span.date, .published-date",
        "category": "span.category, a.tag, .article-category",
        "url": "h2 a::attr(href), .article-title a::attr(href)",
        "image_url": "img.thumbnail::attr(src), .article-image img::attr(src)",
    }
}
```

### Job Listings

```python
JOB_SELECTORS = {
    "container": "div.job-card, li.job-listing, article.job",
    "fields": {
        "title": "h2.job-title, a.job-link, [data-testid='job-title']",
        "company": "span.company-name, .employer, [data-testid='company']",
        "location": "span.location, .job-location, [data-testid='location']",
        "salary": "span.salary, .compensation, [data-testid='salary']",
        "job_type": "span.job-type, .employment-type",
        "posted_date": "time, span.posted, .date-posted",
        "url": "a.job-link::attr(href), h2 a::attr(href)",
    }
}
```

### Search Engine Results

```python
SERP_SELECTORS = {
    "container": "div.g, .search-result, li.result",
    "fields": {
        "title": "h3, .result-title",
        "url": "a::attr(href), cite",
        "snippet": "div.VwiC3b, .result-snippet, .search-description",
        "displayed_url": "cite, .result-url",
    }
}
```

## Table Extraction Recipes

### Simple HTML Table to JSON

The most common extraction pattern. Works for any standard `<table>` with `<thead>` and `<tbody>`.

```python
async def extract_table(page, table_selector="table"):
    """Extract an HTML table into a list of dictionaries."""
    data = await page.evaluate(f"""
        (selector) => {{
            const table = document.querySelector(selector);
            if (!table) return null;

            // Get headers
            const headers = Array.from(table.querySelectorAll('thead th, thead td'))
                .map(th => th.textContent.trim());

            // If no thead, use first row as headers
            if (headers.length === 0) {{
                const firstRow = table.querySelector('tr');
                if (firstRow) {{
                    headers.push(...Array.from(firstRow.querySelectorAll('th, td'))
                        .map(cell => cell.textContent.trim()));
                }}
            }}

            // Get data rows
            const rows = Array.from(table.querySelectorAll('tbody tr'));
            return rows.map(row => {{
                const cells = Array.from(row.querySelectorAll('td'));
                const obj = {{}};
                cells.forEach((cell, i) => {{
                    if (i < headers.length) {{
                        obj[headers[i]] = cell.textContent.trim();
                    }}
                }});
                return obj;
            }});
        }}
    """, table_selector)
    return data or []
```

### Table with Links and Attributes

When table cells contain links or data attributes, not just text:

```python
async def extract_rich_table(page, table_selector="table"):
    """Extract table including links and data attributes."""
    return await page.evaluate(f"""
        (selector) => {{
            const table = document.querySelector(selector);
            if (!table) return [];

            const headers = Array.from(table.querySelectorAll('thead th'))
                .map(th => th.textContent.trim());

            return Array.from(table.querySelectorAll('tbody tr')).map(row => {{
                const obj = {{}};
                Array.from(row.querySelectorAll('td')).forEach((cell, i) => {{
                    const key = headers[i] || `col_${{i}}`;
                    obj[key] = cell.textContent.trim();

                    // Extract link if present
                    const link = cell.querySelector('a');
                    if (link) {{
                        obj[key + '_url'] = link.href;
                    }}

                    // Extract data attributes
                    for (const attr of cell.attributes) {{
                        if (attr.name.startsWith('data-')) {{
                            obj[key + '_' + attr.name] = attr.value;
                        }}
                    }}
                }});
                return obj;
            }});
        }}
    """, table_selector)
```

### Multi-Page Table (Paginated)

```python
async def extract_paginated_table(page, table_selector, next_selector, max_pages=50):
    """Extract data from a table that spans multiple pages."""
    all_rows = []
    headers = None

    for page_num in range(max_pages):
        # Extract current page
        page_data = await page.evaluate(f"""
            (selector) => {{
                const table = document.querySelector(selector);
                if (!table) return {{ headers: [], rows: [] }};

                const hs = Array.from(table.querySelectorAll('thead th'))
                    .map(th => th.textContent.trim());

                const rs = Array.from(table.querySelectorAll('tbody tr')).map(row =>
                    Array.from(row.querySelectorAll('td')).map(td => td.textContent.trim())
                );

                return {{ headers: hs, rows: rs }};
            }}
        """, table_selector)

        if headers is None and page_data["headers"]:
            headers = page_data["headers"]

        for row in page_data["rows"]:
            all_rows.append(dict(zip(headers or [], row)))

        # Check for next page
        next_btn = page.locator(next_selector)
        if await next_btn.count() == 0 or await next_btn.is_disabled():
            break

        await next_btn.click()
        await page.wait_for_load_state("networkidle")
        await page.wait_for_timeout(random.randint(800, 2000))

    return all_rows
```

## Product Listing Extraction

### Generic Listing Extractor

Works for any repeating card/list pattern:

```python
async def extract_listings(page, container_sel, field_map):
    """
    Extract data from repeating elements.

    field_map: dict mapping field names to CSS selectors.
    Special suffixes:
        ::attr(name) — extract attribute instead of text
        ::html — extract innerHTML
    """
    items = []
    cards = await page.query_selector_all(container_sel)

    for card in cards:
        item = {}
        for field_name, selector in field_map.items():
            try:
                if "::attr(" in selector:
                    sel, attr = selector.split("::attr(")
                    attr = attr.rstrip(")")
                    el = await card.query_selector(sel)
                    item[field_name] = await el.get_attribute(attr) if el else None
                elif selector.endswith("::html"):
                    sel = selector.replace("::html", "")
                    el = await card.query_selector(sel)
                    item[field_name] = await el.inner_html() if el else None
                else:
                    el = await card.query_selector(selector)
                    item[field_name] = (await el.text_content()).strip() if el else None
            except Exception:
                item[field_name] = None
        items.append(item)

    return items
```

### With Price Parsing

```python
import re

def parse_price(text):
    """Extract numeric price from text like '$1,234.56' or '1.234,56 EUR'."""
    if not text:
        return None
    # Remove currency symbols and whitespace
    cleaned = re.sub(r'[^\d.,]', '', text.strip())
    if not cleaned:
        return None
    # Handle European format (1.234,56)
    if ',' in cleaned and '.' in cleaned:
        if cleaned.rindex(',') > cleaned.rindex('.'):
            cleaned = cleaned.replace('.', '').replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    elif ',' in cleaned:
        # Could be 1,234 or 1,23 — check decimal places
        parts = cleaned.split(',')
        if len(parts[-1]) <= 2:
            cleaned = cleaned.replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    try:
        return float(cleaned)
    except ValueError:
        return None

async def extract_products_with_prices(page, container_sel, field_map, price_field="price"):
    """Extract listings and parse prices into floats."""
    items = await extract_listings(page, container_sel, field_map)
    for item in items:
        if price_field in item and item[price_field]:
            item[f"{price_field}_raw"] = item[price_field]
            item[price_field] = parse_price(item[price_field])
    return items
```

## Pagination Handling

### Next-Button Pagination

The most common pattern. Click "Next" until the button disappears or is disabled.

```python
async def paginate_via_next_button(page, next_selector, content_selector, max_pages=100):
    """
    Yield page objects as you paginate through results.

    next_selector: CSS selector for the "Next" button/link
    content_selector: CSS selector to wait for after navigation (confirms new page loaded)
    """
    pages_scraped = 0

    while pages_scraped < max_pages:
        yield page  # Caller extracts data from current page
        pages_scraped += 1

        next_btn = page.locator(next_selector)
        if await next_btn.count() == 0:
            break

        try:
            is_disabled = await next_btn.is_disabled()
        except Exception:
            is_disabled = True

        if is_disabled:
            break

        await next_btn.click()
        await page.wait_for_selector(content_selector, state="attached")
        await page.wait_for_timeout(random.randint(500, 1500))
```
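
Because this helper is an async generator, callers drive it with `async for`. A self-contained sketch of the consumption pattern, with a stub generator standing in for a real page:

```python
import asyncio

async def fake_paginator(max_pages=3):
    """Stub with the same shape as paginate_via_next_button above."""
    for num in range(max_pages):
        yield {"page_num": num}  # a real paginator yields the Page object itself

async def main():
    results = []
    async for page in fake_paginator():
        # In real use: results.extend(await extract_listings(page, ...))
        results.append(page["page_num"])
    return results

print(asyncio.run(main()))  # [0, 1, 2]
```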

### URL-Based Pagination

When pages follow a predictable URL pattern:

```python
async def paginate_via_url(page, url_template, start=1, max_pages=100):
    """
    Navigate through pages using URL parameters.

    url_template: URL with {page} placeholder, e.g., "https://example.com/search?page={page}"
    """
    for page_num in range(start, start + max_pages):
        url = url_template.format(page=page_num)
        response = await page.goto(url, wait_until="networkidle")

        if response and response.status == 404:
            break

        yield page, page_num
        await page.wait_for_timeout(random.randint(800, 2500))
```

### Infinite Scroll

For sites that load content as you scroll:

```python
async def paginate_via_scroll(page, item_selector, max_scrolls=100, no_change_limit=3):
    """
    Scroll to load more content until no new items appear.

    item_selector: CSS selector for individual items (used to count progress)
    no_change_limit: Stop after N scrolls with no new items
    """
    previous_count = 0
    no_change_streak = 0
    current_count = 0  # defined even if the loop never runs

    for scroll_num in range(max_scrolls):
        # Count current items
        current_count = await page.locator(item_selector).count()

        if current_count == previous_count:
            no_change_streak += 1
            if no_change_streak >= no_change_limit:
                break
        else:
            no_change_streak = 0

        previous_count = current_count

        # Scroll to bottom
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(random.randint(1000, 2500))

        # Check for "Load More" button that might appear
        load_more = page.locator("button:has-text('Load More'), button:has-text('Show More')")
        if await load_more.count() > 0 and await load_more.is_visible():
            await load_more.click()
            await page.wait_for_timeout(random.randint(1000, 2000))

    return current_count
```

### Load-More Button

Simpler variant of infinite scroll where content loads via a button:

```python
async def paginate_via_load_more(page, button_selector, item_selector, max_clicks=50):
    """Click a 'Load More' button repeatedly until it disappears."""
    for click_num in range(max_clicks):
        btn = page.locator(button_selector)
        if await btn.count() == 0 or not await btn.is_visible():
            break

        count_before = await page.locator(item_selector).count()
        await btn.click()

        # Wait for new items to appear
        try:
            await page.wait_for_function(
                f"document.querySelectorAll('{item_selector}').length > {count_before}",
                timeout=10000,
            )
        except Exception:
            break  # No new items loaded

        await page.wait_for_timeout(random.randint(500, 1500))

    return await page.locator(item_selector).count()
```

## Nested Data Extraction

### Comments with Replies (Threaded)

```python
async def extract_threaded_comments(page, parent_selector=".comments"):
    """Recursively extract threaded comments."""
    return await page.evaluate(f"""
        (parentSelector) => {{
            function extractThread(container) {{
                const comments = [];
                const directChildren = container.querySelectorAll(':scope > .comment');

                for (const comment of directChildren) {{
                    const authorEl = comment.querySelector('.author, .username');
                    const textEl = comment.querySelector('.comment-text, .comment-body');
                    const dateEl = comment.querySelector('time, .date');
                    const repliesContainer = comment.querySelector('.replies, .children');

                    comments.push({{
                        author: authorEl ? authorEl.textContent.trim() : null,
                        text: textEl ? textEl.textContent.trim() : null,
                        date: dateEl ? (dateEl.getAttribute('datetime') || dateEl.textContent.trim()) : null,
                        replies: repliesContainer ? extractThread(repliesContainer) : [],
                    }});
                }}

                return comments;
            }}

            const root = document.querySelector(parentSelector);
            return root ? extractThread(root) : [];
        }}
    """, parent_selector)
```

### Nested Categories (Sidebar/Menu)

```python
async def extract_category_tree(page, root_selector="nav.categories"):
    """Extract nested category structure from a sidebar or menu."""
    return await page.evaluate(f"""
        (rootSelector) => {{
            function extractLevel(container) {{
                const items = [];
                const directItems = container.querySelectorAll(':scope > li, :scope > div.category');

                for (const item of directItems) {{
                    const link = item.querySelector(':scope > a');
                    const subMenu = item.querySelector(':scope > ul, :scope > div.sub-categories');

                    items.push({{
                        name: link ? link.textContent.trim() : item.textContent.trim().split('\\n')[0],
                        url: link ? link.href : null,
                        children: subMenu ? extractLevel(subMenu) : [],
                    }});
                }}

                return items;
            }}

            const root = document.querySelector(rootSelector);
            return root ? extractLevel(root.querySelector('ul') || root) : [];
        }}
    """, root_selector)
```

### Accordion/Expandable Content

Some content is hidden behind accordion/expand toggles. Click to reveal, then extract.

```python
async def extract_accordion(page, toggle_selector, content_selector):
    """Expand all accordion items and extract their content."""
    items = []
    toggles = await page.query_selector_all(toggle_selector)

    for toggle in toggles:
        title = (await toggle.text_content() or "").strip()

        # Click to expand
        await toggle.click()
        await page.wait_for_timeout(300)

        # Find the associated content panel. evaluate_handle returns a JSHandle,
        # which is always truthy; as_element() converts it to an ElementHandle
        # (or None when the JS expression matched nothing).
        handle = await toggle.evaluate_handle(
            f"el => el.closest('.accordion-item, .faq-item')?.querySelector('{content_selector}')"
        )
        content = handle.as_element()

        body = None
        if content:
            body = await content.text_content()
            if body:
                body = body.strip()

        items.append({"title": title, "content": body})

    return items
```

## Data Cleaning Utilities

### Post-Extraction Cleaning

```python
import re

def clean_text(text):
    """Normalize whitespace, remove zero-width characters."""
    if not text:
        return None
    # Remove zero-width characters
    text = re.sub(r'[\u200b\u200c\u200d\ufeff]', '', text)
    # Normalize whitespace
    text = re.sub(r'\s+', ' ', text).strip()
    return text if text else None

def clean_url(url, base_url=None):
    """Convert relative URLs to absolute."""
    if not url:
        return None
    url = url.strip()
    if url.startswith("//"):
        return "https:" + url
    if url.startswith("/") and base_url:
        return base_url.rstrip("/") + url
    return url

def deduplicate(items, key_field):
    """Remove duplicate items based on a key field."""
    seen = set()
    unique = []
    for item in items:
        key = item.get(key_field)
        if key and key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```
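
These utilities compose into a single post-extraction pass. A runnable sketch, with `clean_text` condensed inline so the example stands alone (the `postprocess` name and the sample records are our own):

```python
import re

def clean_text(text):
    # Condensed copy of the helper above
    if not text:
        return None
    text = re.sub(r'[\u200b\u200c\u200d\ufeff]', '', text)
    return re.sub(r'\s+', ' ', text).strip() or None

def postprocess(items, base_url, key_field="url"):
    """Clean every string field, absolutize the key URL, then dedupe on it."""
    seen, out = set(), []
    for item in items:
        item = {k: (clean_text(v) if isinstance(v, str) else v) for k, v in item.items()}
        url = item.get(key_field)
        if url and url.startswith("/"):
            item[key_field] = base_url.rstrip("/") + url
        key = item.get(key_field)
        if key and key not in seen:
            seen.add(key)
            out.append(item)
    return out

records = [
    {"title": "  Widget\u200b A ", "url": "/a"},
    {"title": "Widget A", "url": "/a"},  # duplicate once cleaned
]
print(postprocess(records, "https://example.com"))
```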

### Output Formats

```python
import json
import csv

def to_jsonl(items, file_path):
    """Write items as JSON Lines (one JSON object per line)."""
    with open(file_path, "w", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")

def to_csv(items, file_path):
    """Write items as CSV."""
    if not items:
        return
    headers = list(items[0].keys())
    with open(file_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=headers)
        writer.writeheader()
        writer.writerows(items)

def to_json(items, file_path, indent=2):
    """Write items as a JSON array."""
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(items, f, indent=indent, ensure_ascii=False)
```

# Playwright Browser API Reference (Automation Focus)

This reference covers Playwright's Python async API for browser automation tasks — NOT testing. For test-specific APIs (assertions, fixtures, test runners), see playwright-pro.

## Browser Launch & Context

### Launching the Browser

```python
from playwright.async_api import async_playwright

async with async_playwright() as p:
    # Chromium (recommended for most automation)
    browser = await p.chromium.launch(headless=True)

    # Firefox (better for some anti-detection scenarios)
    browser = await p.firefox.launch(headless=True)

    # WebKit (Safari engine — useful for Apple-specific sites)
    browser = await p.webkit.launch(headless=True)
```

**Launch options:**

| Option | Type | Default | Purpose |
|--------|------|---------|---------|
| `headless` | bool | True | Run without visible window |
| `slow_mo` | int | 0 | Milliseconds to slow each operation (debugging) |
| `proxy` | dict | None | Proxy server configuration |
| `args` | list | [] | Additional Chromium flags |
| `downloads_path` | str | None | Directory for downloads |
| `channel` | str | None | Browser channel: "chrome", "msedge" |

### Browser Contexts (Session Isolation)

Browser contexts are isolated environments within a single browser instance. Each context has its own cookies, localStorage, and cache. Use them instead of launching multiple browsers.

```python
# Create isolated context
context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    user_agent="Mozilla/5.0 ...",
    locale="en-US",
    timezone_id="America/New_York",
    geolocation={"latitude": 40.7128, "longitude": -74.0060},
    permissions=["geolocation"],
)

# Multiple contexts share one browser (resource efficient)
context_a = await browser.new_context()  # User A session
context_b = await browser.new_context()  # User B session
```

### Storage State (Session Persistence)

```python
# Save state after login (cookies + localStorage)
await context.storage_state(path="auth_state.json")

# Restore state in new context
context = await browser.new_context(storage_state="auth_state.json")
```
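
Saved state goes stale once its cookies expire. A small hypothetical helper (our own convention, not a Playwright API) to decide whether to reuse the file or log in again, assuming a simple max-age policy:

```python
import os
import time

def storage_state_usable(path, max_age_seconds=12 * 3600):
    """True if the saved state file exists and is younger than max_age_seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:  # missing file means we must log in again
        return False
    return age < max_age_seconds

# Usage:
# if storage_state_usable("auth_state.json"):
#     context = await browser.new_context(storage_state="auth_state.json")
# else:
#     context = await browser.new_context()
#     # ... log in ...
#     await context.storage_state(path="auth_state.json")
```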
|
||||
|
||||
## Page Navigation

### Basic Navigation

```python
page = await context.new_page()

# Navigate with different wait strategies
await page.goto("https://example.com")  # Default: "load"
await page.goto("https://example.com", wait_until="domcontentloaded")  # Faster
await page.goto("https://example.com", wait_until="networkidle")  # Wait for network quiet
await page.goto("https://example.com", timeout=30000)  # Custom timeout (ms)
```

**`wait_until` options:**
- `"load"` — wait for the `load` event (all resources loaded)
- `"domcontentloaded"` — DOM is ready, images/styles may still load
- `"networkidle"` — no network requests for 500ms (best for SPAs)
- `"commit"` — response received, before any rendering

### Wait Strategies

```python
import re  # needed for the regex URL pattern below

# Wait for a specific element to appear
await page.wait_for_selector("div.content", state="visible")
await page.wait_for_selector("div.loading", state="hidden")  # Wait for loading to finish
await page.wait_for_selector("table tbody tr", state="attached")  # In DOM but maybe not visible

# Wait for URL change
await page.wait_for_url("**/dashboard**")
await page.wait_for_url(re.compile(r"/dashboard/\d+"))

# Wait for specific network response
async with page.expect_response("**/api/data*") as resp_info:
    await page.click("button.load")
response = await resp_info.value
json_data = await response.json()

# Wait for page load state
await page.wait_for_load_state("networkidle")

# Fixed wait (use sparingly — prefer the methods above)
await page.wait_for_timeout(1000)  # milliseconds
```

### Navigation History

```python
await page.go_back()
await page.go_forward()
await page.reload()
```

## Element Interaction

### Finding Elements

```python
# Single element (returns first match)
element = await page.query_selector("css=div.product")
element = await page.query_selector("xpath=//div[@class='product']")

# Multiple elements
elements = await page.query_selector_all("div.product")

# Locator API (recommended — auto-waits, re-queries on each action)
locator = page.locator("div.product")
count = await locator.count()
first = locator.first
nth = locator.nth(2)
```

**Locator vs query_selector:**
- `query_selector` — returns an ElementHandle at a point in time. Can go stale if DOM changes.
- `locator` — returns a Locator that re-queries each time you interact with it. Preferred for reliability.

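The staleness difference can be illustrated without a browser. This is a hypothetical pure-Python analogy (none of these names are Playwright APIs): a mutable dict stands in for the live page, a captured "handle" keeps the value it saw at query time, while a minimal "locator" stores only the selector and re-queries on every access.

```python
# Hypothetical analogy only; `dom` stands in for the live page,
# and MiniLocator is NOT a Playwright class.
dom = {"div.product": "Old price: $10"}

# "ElementHandle": captures the node's value at query time.
handle = dom["div.product"]


class MiniLocator:
    """Stores a selector and performs a fresh lookup on each use."""

    def __init__(self, dom, selector):
        self.dom, self.selector = dom, selector

    def text_content(self):
        return self.dom[self.selector]  # fresh lookup every call


locator = MiniLocator(dom, "div.product")

dom["div.product"] = "New price: $8"  # the page re-renders

print(handle)                  # stale snapshot: "Old price: $10"
print(locator.text_content())  # current value: "New price: $8"
```

The same shape explains why Playwright recommends locators for dynamic pages: the selector, not the node, is what you hold on to.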
### Clicking

```python
await page.click("button.submit")
await page.click("a:has-text('Next')")
await page.dblclick("div.editable")
await page.click("button", position={"x": 10, "y": 10})  # Click at offset
await page.click("button", force=True)  # Skip actionability checks
await page.click("button", modifiers=["Shift"])  # With modifier key
```

### Text Input

```python
# Fill (clears existing content first)
await page.fill("input#email", "user@example.com")

# Type (simulates keystroke-by-keystroke input — slower, more realistic;
# deprecated in recent Playwright versions in favor of locator.press_sequentially())
await page.type("input#search", "query text", delay=50)  # 50ms between keys

# Press specific keys
await page.press("input#search", "Enter")
await page.press("body", "Control+a")
```

### Dropdowns & Select

```python
# Native <select> element
await page.select_option("select#country", value="US")
await page.select_option("select#country", label="United States")
await page.select_option("select#tags", value=["tag1", "tag2"])  # Multi-select

# Custom dropdown (non-native)
await page.click("div.dropdown-trigger")
await page.click("li.option:has-text('United States')")
```

### Checkboxes & Radio Buttons

```python
await page.check("input#agree")
await page.uncheck("input#newsletter")
is_checked = await page.is_checked("input#agree")
```

### File Upload

```python
# Standard file input
await page.set_input_files("input[type='file']", "/path/to/file.pdf")
await page.set_input_files("input[type='file']", ["/path/a.pdf", "/path/b.pdf"])

# Clear file selection
await page.set_input_files("input[type='file']", [])

# Non-standard upload (drag-and-drop zones)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```

### Hover & Focus

```python

```python
await page.hover("div.menu-item")
await page.focus("input#search")
```

## Data Extraction

### Text Content

```python
# Get text content of an element
text = await page.text_content("h1.title")
inner_text = await page.inner_text("div.description")  # Visible text only
inner_html = await page.inner_html("div.content")  # HTML markup

# Get attribute
href = await page.get_attribute("a.link", "href")
src = await page.get_attribute("img.photo", "src")
```

### JavaScript Evaluation

```python

```python
# Evaluate in page context
title = await page.evaluate("document.title")
scroll_height = await page.evaluate("document.body.scrollHeight")

# Evaluate on a specific element
text = await page.eval_on_selector("h1", "el => el.textContent")
texts = await page.eval_on_selector_all("li", "els => els.map(e => e.textContent.trim())")

# Complex extraction
data = await page.evaluate("""
    () => {
        const rows = document.querySelectorAll('table tbody tr');
        return Array.from(rows).map(row => {
            const cells = row.querySelectorAll('td');
            return {
                name: cells[0]?.textContent.trim(),
                value: cells[1]?.textContent.trim(),
            };
        });
    }
""")
```

### Screenshots & PDF

```python
# Full page screenshot
await page.screenshot(path="page.png", full_page=True)

# Viewport screenshot
await page.screenshot(path="viewport.png")

# Element screenshot
await page.locator("div.chart").screenshot(path="chart.png")

# PDF (Chromium only)
await page.pdf(path="page.pdf", format="A4", print_background=True)

# Screenshot as bytes (for processing without saving)
buffer = await page.screenshot()
```

## Network Interception

## Network Interception

### Monitoring Requests

```python
# Listen for all responses
page.on("response", lambda response: print(f"{response.status} {response.url}"))

# Wait for a specific API call
async with page.expect_response("**/api/products*") as resp:
    await page.click("button.load")
response = await resp.value
data = await response.json()
```

### Blocking Resources (Speed Up Scraping)

```python
# Block images, fonts, and CSS to speed up scraping
await page.route("**/*.{png,jpg,jpeg,gif,svg,woff,woff2,ttf}", lambda route: route.abort())
await page.route("**/*.css", lambda route: route.abort())

# Block specific domains (ads, analytics)
await page.route("**/google-analytics.com/**", lambda route: route.abort())
await page.route("**/facebook.com/**", lambda route: route.abort())
```

### Modifying Requests

```python
# Add custom headers
await page.route("**/*", lambda route: route.continue_(headers={
    **route.request.headers,
    "X-Custom-Header": "value",
}))

# Mock API responses
await page.route("**/api/data", lambda route: route.fulfill(
    status=200,
    content_type="application/json",
    body=json.dumps({"items": []}),
))
```

## Dialog Handling

```python
# Auto-accept all dialogs
page.on("dialog", lambda dialog: dialog.accept())

# Handle specific dialog types
async def handle_dialog(dialog):
    if dialog.type == "confirm":
        await dialog.accept()
    elif dialog.type == "prompt":
        await dialog.accept("my input")
    elif dialog.type == "alert":
        await dialog.dismiss()

page.on("dialog", handle_dialog)
```

## File Downloads

```python
# Wait for download to start
async with page.expect_download() as dl_info:
    await page.click("a.download-link")
download = await dl_info.value

# Save to specific path
await download.save_as("/path/to/downloads/" + download.suggested_filename)

# Get the path of the downloaded temp file
path = await download.path()

# Set download behavior at context level
context = await browser.new_context(accept_downloads=True)
```

## Frames & Iframes

```python
# Access iframe by selector
frame = page.frame_locator("iframe#content")
await frame.locator("button.submit").click()

# Access frame by name
frame = page.frame(name="editor")

# Access all frames
for frame in page.frames:
    print(frame.url)
```

## Cookie Management

```python
# Get all cookies
cookies = await context.cookies()

# Get cookies for specific URL
cookies = await context.cookies(["https://example.com"])

# Add cookies
await context.add_cookies([{
    "name": "session",
    "value": "abc123",
    "domain": "example.com",
    "path": "/",
    "httpOnly": True,
    "secure": True,
}])

# Clear cookies
await context.clear_cookies()
```

## Concurrency Patterns

### Multiple Pages in One Context

```python
# Open multiple tabs in the same session
pages = []
for url in urls:
    page = await context.new_page()
    await page.goto(url)
    pages.append(page)

# Process all pages
for page in pages:
    data = await extract_data(page)  # extract_data: your own extraction helper
    await page.close()
```

### Multiple Contexts for Parallel Sessions

```python
import asyncio
import random  # for user-agent rotation below

async def scrape_with_context(browser, url):
    # USER_AGENTS is a list of UA strings you maintain
    context = await browser.new_context(user_agent=random.choice(USER_AGENTS))
    page = await context.new_page()
    await page.goto(url)
    data = await extract_data(page)
    await context.close()
    return data

# Run 5 concurrent scraping tasks
tasks = [scrape_with_context(browser, url) for url in urls[:5]]
results = await asyncio.gather(*tasks)
```

## Init Scripts (Stealth)

Init scripts run before any page script, in every new page/context.

```python
# Remove webdriver flag
await context.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")

# Override plugins (headless Chrome has empty plugins)
await context.add_init_script("""
    Object.defineProperty(navigator, 'plugins', {
        get: () => [1, 2, 3, 4, 5],
    });
""")

# Override languages
await context.add_init_script("""
    Object.defineProperty(navigator, 'languages', {
        get: () => ['en-US', 'en'],
    });
""")

# From file
await context.add_init_script(path="stealth.js")
```

## Common Automation Patterns

### Scrolling

```python
# Scroll to bottom
await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")

# Scroll element into view
await page.locator("div.target").scroll_into_view_if_needed()

# Smooth scroll simulation
await page.evaluate("""
    async () => {
        const delay = ms => new Promise(r => setTimeout(r, ms));
        for (let i = 0; i < document.body.scrollHeight; i += 300) {
            window.scrollTo(0, i);
            await delay(100);
        }
    }
""")
```

### Clipboard Operations

```python
# Copy text
await page.evaluate("navigator.clipboard.writeText('hello')")

# Paste via keyboard
await page.keyboard.press("Control+v")
```

### Shadow DOM

```python
# Playwright CSS selectors pierce open shadow DOM automatically;
# the >> operator chains selectors across components
await page.locator("my-component >> .inner-button").click()

# Or use the css= engine with >> for chained piercing
await page.locator("css=host-element >> css=.shadow-child").click()
```

248
engineering/browser-automation/scraping_toolkit.py
Normal file
@@ -0,0 +1,248 @@
#!/usr/bin/env python3
"""
Scraping Toolkit - Generates Playwright scraping script skeletons.

Takes a URL pattern and CSS selectors as input and produces a ready-to-run
Playwright scraping script with pagination support, error handling, and
anti-detection patterns baked in.

No external dependencies - uses only Python standard library.
"""

import argparse
import json
import os
import sys
import textwrap
from datetime import datetime


def build_scraping_script(url, selectors, paginate=False, output_format="script"):
    """Build a Playwright scraping script from the given parameters."""

    selector_list = [s.strip() for s in selectors.split(",") if s.strip()]
    if not selector_list:
        return None, "No valid selectors provided."

    field_names = []
    for sel in selector_list:
        # Derive field name from selector: .product-title -> product_title
        name = sel.strip("#.[]()>:+~ ")
        name = name.replace("-", "_").replace(" ", "_").replace(".", "_")
        # Remove non-alphanumeric
        name = "".join(c if c.isalnum() or c == "_" else "" for c in name)
        if not name:
            name = f"field_{len(field_names)}"
        field_names.append(name)

    field_map = dict(zip(field_names, selector_list))

    if output_format == "json":
        config = {
            "url": url,
            "selectors": field_map,
            "pagination": {
                "enabled": paginate,
                "next_selector": "a:has-text('Next'), button:has-text('Next')",
                "max_pages": 50,
            },
            "anti_detection": {
                "random_delay_ms": [800, 2500],
                "user_agent_rotation": True,
                "viewport": {"width": 1920, "height": 1080},
            },
            "output": {
                "format": "jsonl",
                "deduplicate_by": field_names[0] if field_names else None,
            },
            "generated_at": datetime.now().isoformat(),
        }
        return config, None

    # Build Python script
    fields_dict_str = "{\n"
    for name, sel in field_map.items():
        fields_dict_str += f'    "{name}": "{sel}",\n'
    fields_dict_str += "}"

    pagination_block = ""
    if paginate:
        pagination_block = textwrap.dedent("""\

            # --- Pagination ---
            async def scrape_all_pages(page, container, fields, next_sel, max_pages=50):
                all_items = []
                for page_num in range(max_pages):
                    print(f"Scraping page {page_num + 1}...")
                    items = await extract_items(page, container, fields)
                    all_items.extend(items)

                    next_btn = page.locator(next_sel)
                    if await next_btn.count() == 0:
                        break
                    try:
                        is_disabled = await next_btn.is_disabled()
                    except Exception:
                        is_disabled = True
                    if is_disabled:
                        break

                    await next_btn.click()
                    await page.wait_for_load_state("networkidle")
                    await asyncio.sleep(random.uniform(0.8, 2.5))

                return all_items
            """)

    main_call = "scrape_all_pages(page, CONTAINER, FIELDS, NEXT_SELECTOR)" if paginate else "extract_items(page, CONTAINER, FIELDS)"

    # Content is kept flush-left so multi-line substitutions like
    # {fields_dict_str} do not break textwrap.dedent's common margin.
    script = textwrap.dedent(f'''\
#!/usr/bin/env python3
"""
Auto-generated Playwright scraping script.
Target: {url}
Generated: {datetime.now().isoformat()}

Requirements:
    pip install playwright
    playwright install chromium
"""

import asyncio
import json
import random
from playwright.async_api import async_playwright

# --- Configuration ---
URL = "{url}"
CONTAINER = "body"  # Adjust to the repeating item container selector
FIELDS = {fields_dict_str}
NEXT_SELECTOR = "a:has-text('Next'), button:has-text('Next')"

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


async def extract_items(page, container_selector, field_map):
    """Extract structured data from repeating elements."""
    items = []
    cards = await page.query_selector_all(container_selector)
    for card in cards:
        item = {{}}
        for name, selector in field_map.items():
            el = await card.query_selector(selector)
            if el:
                item[name] = (await el.text_content() or "").strip()
            else:
                item[name] = None
        items.append(item)
    return items

{pagination_block}
async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={{"width": 1920, "height": 1080}},
            user_agent=random.choice(USER_AGENTS),
        )
        page = await context.new_page()

        # Remove WebDriver flag
        await page.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {{get: () => undefined}});"
        )

        print(f"Navigating to {{URL}}...")
        await page.goto(URL, wait_until="networkidle")

        data = await {main_call}
        print(json.dumps(data, indent=2, ensure_ascii=False))

        await browser.close()


if __name__ == "__main__":
    asyncio.run(main())
''')

    return script, None


def main():
    parser = argparse.ArgumentParser(
        description="Generate Playwright scraping script skeletons from URL and selectors.",
        epilog=(
            "Examples:\n"
            "  %(prog)s --url https://example.com/products --selectors '.title,.price,.rating'\n"
            "  %(prog)s --url https://example.com/search --selectors '.name,.desc' --paginate\n"
            "  %(prog)s --url https://example.com --selectors '.item' --json\n"
            "  %(prog)s --url https://example.com --selectors '.item' --output scraper.py\n"
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--url",
        required=True,
        help="Target URL to scrape",
    )
    parser.add_argument(
        "--selectors",
        required=True,
        help="Comma-separated CSS selectors for data fields (e.g. '.title,.price,.rating')",
    )
    parser.add_argument(
        "--paginate",
        action="store_true",
        default=False,
        help="Include pagination handling in generated script",
    )
    parser.add_argument(
        "--output",
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        default=False,
        help="Output JSON configuration instead of Python script",
    )

    args = parser.parse_args()

    output_format = "json" if args.json_output else "script"
    result, error = build_scraping_script(
        url=args.url,
        selectors=args.selectors,
        paginate=args.paginate,
        output_format=output_format,
    )

    if error:
        print(f"Error: {error}", file=sys.stderr)
        sys.exit(2)

    if args.json_output:
        output_text = json.dumps(result, indent=2)
    else:
        output_text = result

    if args.output:
        output_path = os.path.abspath(args.output)
        with open(output_path, "w") as f:
            f.write(output_text)
        if not args.json_output:
            os.chmod(output_path, 0o755)
        print(f"Written to {output_path}", file=sys.stderr)
        sys.exit(0)
    else:
        print(output_text)
        sys.exit(0)


if __name__ == "__main__":
    main()
586
engineering/spec-driven-workflow/SKILL.md
Normal file
@@ -0,0 +1,586 @@
---
name: "spec-driven-workflow"
description: "Use when the user asks to write specs before code, define acceptance criteria, plan features before implementation, generate tests from specifications, or follow spec-first development practices."
---

# Spec-Driven Workflow — POWERFUL

## Overview

Spec-driven workflow enforces a single, non-negotiable rule: **write the specification BEFORE you write any code.** Not alongside. Not after. Before.

This is not documentation. This is a contract. A spec defines what the system MUST do, what it SHOULD do, and what it explicitly WILL NOT do. Every line of code you write traces back to a requirement in the spec. Every test traces back to an acceptance criterion. If it is not in the spec, it does not get built.

### Why Spec-First Matters

1. **Eliminates rework.** 60-80% of defects originate from requirements, not implementation. Catching ambiguity in a spec costs minutes; catching it in production costs days.
2. **Forces clarity.** If you cannot write what the system should do in plain language, you do not understand the problem well enough to write code.
3. **Enables parallelism.** Once a spec is approved, frontend, backend, QA, and documentation can all start simultaneously.
4. **Creates accountability.** The spec is the definition of done. No arguments about whether a feature is "complete" — either it satisfies the acceptance criteria or it does not.
5. **Feeds TDD directly.** Acceptance criteria in Given/When/Then format translate 1:1 into test cases. The spec IS the test plan.

### The Iron Law

```
NO CODE WITHOUT AN APPROVED SPEC.
NO EXCEPTIONS. NO "QUICK PROTOTYPES." NO "I'LL DOCUMENT IT LATER."
```

If the spec is not written, reviewed, and approved, implementation does not begin. Period.

---

## The Spec Format

Every spec follows this structure. No sections are optional — if a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not forgotten.

### 1. Title and Context

```markdown
# Spec: [Feature Name]

**Author:** [name]
**Date:** [ISO 8601]
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** [list]
**Related specs:** [links]

## Context

[Why does this feature exist? What problem does it solve? What is the business
motivation? Include links to user research, support tickets, or metrics that
justify this work. 2-4 paragraphs maximum.]
```

### 2. Functional Requirements (RFC 2119)

Use RFC 2119 keywords precisely:

| Keyword | Meaning |
|---------|---------|
| **MUST** | Absolute requirement. Failing this means the implementation is non-conformant. |
| **MUST NOT** | Absolute prohibition. Doing this means the implementation is broken. |
| **SHOULD** | Recommended. May be omitted with documented justification. |
| **SHOULD NOT** | Discouraged. May be included with documented justification. |
| **MAY** | Optional. Purely at the implementer's discretion. |

```markdown
## Functional Requirements

- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
- FR-2: The system MUST reject tokens older than 24 hours.
- FR-3: The system SHOULD support refresh token rotation.
- FR-4: The system MAY cache user profiles for up to 5 minutes.
- FR-5: The system MUST NOT store plaintext passwords under any circumstance.
```

Number every requirement. Use `FR-` prefix. Each requirement is a single, testable statement.

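Consistent numbering lends itself to mechanical checking. A minimal sketch of such a check (a hypothetical helper for illustration, not the bundled `spec_validator.py`) that extracts `FR-*` IDs from a spec and flags gaps or duplicates:

```python
import re

def check_fr_numbering(spec_text):
    """Return a list of problems found in FR-* requirement numbering."""
    ids = [int(m.group(1)) for m in re.finditer(r"\bFR-(\d+):", spec_text)]
    problems = []
    if ids != sorted(ids):
        problems.append("FR requirements are out of order")
    if sorted(ids) != list(range(1, len(ids) + 1)):
        problems.append(f"FR numbering has gaps or duplicates: {sorted(ids)}")
    return problems

spec = """
- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
- FR-2: The system MUST reject tokens older than 24 hours.
- FR-4: The system MAY cache user profiles for up to 5 minutes.
"""
print(check_fr_numbering(spec))  # flags the missing FR-3
```

The same pattern extends to NFR-, AC-, EC-, and OS- prefixes.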
### 3. Non-Functional Requirements

```markdown
## Non-Functional Requirements

### Performance
- NFR-P1: Login flow MUST complete in < 500ms (p95) under normal load.
- NFR-P2: Token validation MUST complete in < 50ms (p99).

### Security
- NFR-S1: All tokens MUST be transmitted over TLS 1.2+.
- NFR-S2: The system MUST rate-limit login attempts to 5/minute per IP.

### Accessibility
- NFR-A1: Login form MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.

### Scalability
- NFR-SC1: The system SHOULD handle 10,000 concurrent sessions.

### Reliability
- NFR-R1: The authentication service MUST maintain 99.9% uptime.
```

### 4. Acceptance Criteria (Given/When/Then)

Every functional requirement maps to one or more acceptance criteria. Use Gherkin syntax:

```markdown
## Acceptance Criteria

### AC-1: Successful login (FR-1)
Given a user with valid credentials
When they submit the login form with correct email and password
Then they receive a valid access token
And they are redirected to the dashboard
And the login event is logged with timestamp and IP

### AC-2: Expired token rejection (FR-2)
Given a user with an access token issued 25 hours ago
When they make an API request with that token
Then they receive a 401 Unauthorized response
And the response body contains error code "TOKEN_EXPIRED"
And they are NOT redirected (API clients handle their own flow)

### AC-3: Rate limiting (NFR-S2)
Given an IP address that has made 5 failed login attempts in the last minute
When a 6th login attempt arrives from that IP
Then the request is rejected with 429 Too Many Requests
And the response includes a Retry-After header
```

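Criteria in this shape translate mechanically into tests. A sketch of AC-3 as a pytest-style test against a stubbed service (`FakeAuthService` is hypothetical, invented here for illustration; a real test would target the actual login endpoint):

```python
import time

class FakeAuthService:
    """Stand-in for the real login endpoint; illustrative only."""
    LIMIT = 5  # NFR-S2: max failed attempts per IP per minute

    def __init__(self):
        self.attempts = {}  # ip -> timestamps of recent attempts

    def login(self, ip, email, password):
        now = time.monotonic()
        recent = [t for t in self.attempts.get(ip, []) if now - t < 60]
        if len(recent) >= self.LIMIT:
            return {"status": 429, "headers": {"Retry-After": "60"}}
        self.attempts[ip] = recent + [now]
        return {"status": 401, "body": {"error": "INVALID_CREDENTIALS"}}

def test_ac3_rate_limiting():
    # Given an IP that has made 5 failed login attempts in the last minute
    svc = FakeAuthService()
    for _ in range(5):
        svc.login("203.0.113.7", "user@example.com", "wrong")
    # When a 6th login attempt arrives from that IP
    resp = svc.login("203.0.113.7", "user@example.com", "wrong")
    # Then it is rejected with 429 Too Many Requests
    assert resp["status"] == 429
    # And the response includes a Retry-After header
    assert "Retry-After" in resp["headers"]

test_ac3_rate_limiting()
print("AC-3 test passed")
```

Each Given/When/Then clause becomes a setup, action, or assertion line, which is why the spec doubles as the test plan.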
### 5. Edge Cases and Error Scenarios

```markdown
## Edge Cases

- EC-1: User submits login form with empty email → Show validation error, do not hit API.
- EC-2: OAuth provider is down → Show "Service temporarily unavailable", retry after 30s.
- EC-3: User has account but no password (social-only) → Redirect to social login.
- EC-4: Concurrent login from two devices → Both sessions are valid (no single-session enforcement).
- EC-5: Token expires mid-request → Complete the current request, return warning header.
```

### 6. API Contracts

Define request/response shapes using TypeScript-style notation:

````markdown
## API Contracts

### POST /api/auth/login
Request:
```typescript
interface LoginRequest {
  email: string;         // MUST be valid email format
  password: string;      // MUST be 8-128 characters
  rememberMe?: boolean;  // Default: false
}
```

Success Response (200):
```typescript
interface LoginResponse {
  accessToken: string;   // JWT, expires in 24h
  refreshToken: string;  // Opaque, expires in 30d
  expiresIn: number;     // Seconds until access token expires
  user: {
    id: string;
    email: string;
    displayName: string;
  };
}
```

Error Response (401):
```typescript
interface AuthError {
  error: "INVALID_CREDENTIALS" | "TOKEN_EXPIRED" | "ACCOUNT_LOCKED";
  message: string;
  retryAfter?: number;  // Seconds, present for rate-limited responses
}
```
````

### 7. Data Models

```markdown
## Data Models

### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| email | string | Unique, max 255 chars, valid email format |
| passwordHash | string | bcrypt, never exposed via API |
| createdAt | timestamp | UTC, immutable |
| lastLoginAt | timestamp | UTC, updated on each login |
| loginAttempts | integer | Reset to 0 on successful login |
| lockedUntil | timestamp | Null if not locked |
```

### 8. Out of Scope

Explicit exclusions prevent scope creep:

```markdown
## Out of Scope

- OS-1: Multi-factor authentication (separate spec: SPEC-042)
- OS-2: Social login providers beyond Google and GitHub
- OS-3: Admin impersonation of user accounts
- OS-4: Password complexity rules beyond minimum length (deferred to v2)
- OS-5: Session management UI (users cannot see/revoke active sessions yet)
```

If someone asks for an out-of-scope item during implementation, point them to this section. Do not build it.

---

## Bounded Autonomy Rules

These rules define when an agent (human or AI) MUST stop and ask for guidance vs. when they can proceed independently.

### STOP and Ask When:

1. **Scope creep detected.** The implementation requires something not in the spec. Even if it seems obviously needed, STOP. The spec might have excluded it deliberately.

2. **Ambiguity exceeds 30%.** If you cannot determine the correct behavior from the spec for more than 30% of a given requirement, the spec is incomplete. Do not guess.

3. **Breaking changes required.** The implementation would change an existing API contract, database schema, or public interface. Always escalate.

4. **Security implications.** Any change that touches authentication, authorization, encryption, or PII handling requires explicit approval.

5. **Performance characteristics unknown.** If a requirement says "MUST complete in < 500ms" but you have no way to measure or guarantee that, escalate before implementing a guess.

6. **Cross-team dependencies.** If the spec requires coordination with another team or service, confirm the dependency before building against it.

### Continue Autonomously When:

1. **Spec is clear and unambiguous** for the current task.
2. **All acceptance criteria have passing tests** and you are refactoring internals.
3. **Changes are non-breaking** — no public API, schema, or behavior changes.
4. **Implementation is a direct translation** of a well-defined acceptance criterion.
5. **Error handling follows established patterns** already documented in the codebase.

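The STOP conditions collapse into a simple guard: any single trigger forces escalation. A hypothetical sketch (the flag names are invented here, not part of this skill's scripts):

```python
def autonomy_decision(*, scope_creep, ambiguity_ratio, breaking_change,
                      touches_security, unmeasured_performance, cross_team_dep):
    """Return 'STOP' if any escalation trigger fires, else 'CONTINUE'."""
    triggers = [
        scope_creep,              # rule 1: needs something not in the spec
        ambiguity_ratio > 0.30,   # rule 2: too much guesswork
        breaking_change,          # rule 3: API/schema/interface change
        touches_security,         # rule 4: auth, crypto, PII
        unmeasured_performance,   # rule 5: cannot verify the budget
        cross_team_dep,           # rule 6: unconfirmed dependency
    ]
    return "STOP" if any(triggers) else "CONTINUE"

print(autonomy_decision(scope_creep=False, ambiguity_ratio=0.10,
                        breaking_change=False, touches_security=False,
                        unmeasured_performance=False, cross_team_dep=False))
```

Note the asymmetry: CONTINUE requires every condition to hold, while a single STOP trigger is decisive.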
### Escalation Protocol

When you must stop, provide:

```markdown
## Escalation: [Brief Title]

**Blocked on:** [requirement ID, e.g., FR-3]
**Question:** [Specific, answerable question — not "what should I do?"]
**Options considered:**
A. [Option] — Pros: [...] Cons: [...]
B. [Option] — Pros: [...] Cons: [...]
**My recommendation:** [A or B, with reasoning]
**Impact of waiting:** [What is blocked until this is resolved?]
```

Never escalate without a recommendation. Never present an open-ended question. Always give options.

See `references/bounded_autonomy_rules.md` for the complete decision matrix.

---
## Workflow — 6 Phases

### Phase 1: Gather Requirements

**Goal:** Understand what needs to be built and why.

1. **Interview the user.** Ask:
   - What problem does this solve?
   - Who are the users?
   - What does success look like?
   - What explicitly should NOT be built?
2. **Read existing code.** Understand the current system before proposing changes.
3. **Identify constraints.** Performance budgets, security requirements, backward compatibility.
4. **List unknowns.** Every unknown is a risk. Surface them now, not during implementation.

**Exit criteria:** You can explain the feature to someone unfamiliar with the project in 2 minutes.
### Phase 2: Write Spec

**Goal:** Produce a complete spec document following The Spec Format above.

1. Fill every section of the template. No section left blank.
2. Number all requirements (FR-*, NFR-*, AC-*, EC-*, OS-*).
3. Use RFC 2119 keywords precisely.
4. Write acceptance criteria in Given/When/Then format.
5. Define API contracts with TypeScript-style types.
6. List explicit exclusions in Out of Scope.

**Exit criteria:** The spec can be handed to a developer who was not in the requirements meeting, and they can implement the feature without asking clarifying questions.
### Phase 3: Validate Spec

**Goal:** Verify the spec is complete, consistent, and implementable.

Run `spec_validator.py` against the spec file:

```bash
python spec_validator.py --file spec.md --strict
```

Manual validation checklist:

- [ ] Every functional requirement has at least one acceptance criterion
- [ ] Every acceptance criterion is testable (no subjective language)
- [ ] API contracts cover all endpoints mentioned in requirements
- [ ] Data models cover all entities mentioned in requirements
- [ ] Edge cases cover failure modes for every external dependency
- [ ] Out of scope is explicit about what was considered and rejected
- [ ] Non-functional requirements have measurable thresholds

**Exit criteria:** The spec scores 80+ on the validator, and all manual checklist items pass.
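The first checklist item — every functional requirement covered by an acceptance criterion — is the kind of check the validator automates. A minimal sketch of such a traceability check (a hypothetical helper, not the actual `spec_validator.py` implementation), assuming the `- FR-N:` and `### AC-N: ... (FR-N)` conventions used in this document:

```python
import re

def uncovered_requirements(spec_text: str) -> list[str]:
    """Return FR-* IDs that no acceptance criterion references."""
    # Requirement IDs are defined as list items, e.g. "- FR-1: ..."
    reqs = set(re.findall(r"^-\s*(FR-\d+):", spec_text, re.MULTILINE))
    # AC headings reference requirements in parentheses, e.g. "### AC-1: name (FR-1, FR-5)"
    covered: set[str] = set()
    for refs in re.findall(r"^### AC-\d+:.*\(([^)]*)\)", spec_text, re.MULTILINE):
        covered.update(re.findall(r"FR-\d+", refs))
    return sorted(reqs - covered)

spec = """\
- FR-1: The system MUST allow users to request a reset.
- FR-2: The link MUST expire after 1 hour.

### AC-1: Request reset (FR-1)
"""
print(uncovered_requirements(spec))  # ['FR-2'] — FR-2 has no acceptance criterion
```

The same idea extends to the reverse direction (orphaned ACs that cite no requirement), which is Anti-Pattern 6 below.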
### Phase 4: Generate Tests

**Goal:** Extract test cases from acceptance criteria before writing implementation code.

Run `test_extractor.py` against the approved spec:

```bash
python test_extractor.py --file spec.md --framework pytest --output tests/
```

1. Each acceptance criterion becomes one or more test cases.
2. Each edge case becomes a test case.
3. Tests are stubs — they define the assertion but not the implementation.
4. All tests MUST fail initially (red phase of TDD).

**Exit criteria:** You have a test file where every test fails with "not implemented" or equivalent.
### Phase 5: Implement

**Goal:** Write code that makes failing tests pass, one acceptance criterion at a time.

1. Pick one acceptance criterion (start with the simplest).
2. Make its test(s) pass with minimal code.
3. Run the full test suite — no regressions.
4. Commit.
5. Pick the next acceptance criterion. Repeat.

**Rules:**

- Do NOT implement anything not in the spec.
- Do NOT optimize before all acceptance criteria pass.
- Do NOT refactor before all acceptance criteria pass.
- If you discover a missing requirement, STOP and update the spec first.

**Exit criteria:** All tests pass. All acceptance criteria satisfied.
### Phase 6: Self-Review

**Goal:** Verify the implementation matches the spec before marking it done.

Run through the Self-Review Checklist below. If any item fails, fix it before declaring the task complete.

---
## Self-Review Checklist

Before marking any implementation as done, verify ALL of the following:

- [ ] **Every acceptance criterion has a passing test.** No exceptions. If AC-3 exists, a test for AC-3 exists and passes.
- [ ] **Every edge case has a test.** EC-1 through EC-N all have corresponding test cases.
- [ ] **No scope creep.** The implementation does not include features not in the spec. If you added something, either update the spec or remove it.
- [ ] **API contracts match implementation.** Request/response shapes in code match the spec exactly. Field names, types, status codes — all of it.
- [ ] **Error scenarios tested.** Every error response defined in the spec has a test that triggers it.
- [ ] **Non-functional requirements verified.** If the spec says < 500ms, you have evidence (benchmark, load test, profiling) that it meets the threshold.
- [ ] **Data model matches.** Database schema matches the spec. No extra columns, no missing constraints.
- [ ] **Out-of-scope items not built.** Double-check that nothing from the Out of Scope section leaked into the implementation.

---
## Integration with TDD Guide

Spec-driven workflow and TDD are complementary, not competing:

```
Spec-Driven Workflow                 TDD (Red-Green-Refactor)
─────────────────────                ──────────────────────────
Phase 1: Gather Requirements
Phase 2: Write Spec
Phase 3: Validate Spec
Phase 4: Generate Tests      ──→     RED: Tests exist and fail
Phase 5: Implement           ──→     GREEN: Minimal code to pass
Phase 6: Self-Review         ──→     REFACTOR: Clean up internals
```

**The handoff:** Spec-driven workflow produces the test stubs (Phase 4). TDD takes over from there. The spec tells you WHAT to test. TDD tells you HOW to implement.

Use `engineering-team/tdd-guide` for:

- Red-green-refactor cycle discipline
- Coverage analysis and gap detection
- Framework-specific test patterns (Jest, Pytest, JUnit)

Use `engineering/spec-driven-workflow` for:

- Defining what to build before building it
- Acceptance criteria authoring
- Completeness validation
- Scope control

---
## Examples

### Full Spec: User Password Reset

```markdown
# Spec: Password Reset Flow

**Author:** Engineering Team
**Date:** 2026-03-25
**Status:** Approved

## Context

Users who forget their passwords currently have no self-service recovery option.
Support receives ~200 password reset requests per week, costing approximately
8 hours of support time. This feature eliminates that burden entirely.

## Functional Requirements

- FR-1: The system MUST allow users to request a password reset via email.
- FR-2: The system MUST send a reset link that expires after 1 hour.
- FR-3: The system MUST invalidate all previous reset links when a new one is requested.
- FR-4: The system MUST enforce a minimum password length of 8 characters on reset.
- FR-5: The system MUST NOT reveal whether an email exists in the system.
- FR-6: The system SHOULD log all reset attempts for audit purposes.

## Acceptance Criteria

### AC-1: Request reset (FR-1, FR-5)
Given a user on the password reset page
When they enter any email address and submit
Then they see "If an account exists, a reset link has been sent"
And the response is identical whether the email exists or not

### AC-2: Valid reset link (FR-2)
Given a user who received a reset email 30 minutes ago
When they click the reset link
Then they see the password reset form

### AC-3: Expired reset link (FR-2)
Given a user who received a reset email 2 hours ago
When they click the reset link
Then they see "This link has expired. Please request a new one."

### AC-4: Previous links invalidated (FR-3)
Given a user who requested two reset emails
When they click the link from the first email
Then they see "This link is no longer valid."

## Edge Cases

- EC-1: User submits reset for non-existent email → Same success message (FR-5).
- EC-2: User clicks reset link twice → Second click shows "already used" if the password was changed.
- EC-3: Email delivery fails → Log the error, do not retry automatically.
- EC-4: User requests reset while already logged in → Allow it, do not force logout.

## Out of Scope

- OS-1: Security questions as an alternative reset method.
- OS-2: SMS-based password reset.
- OS-3: Admin-initiated password reset (separate spec).
```
### Extracted Test Cases (from above spec)

```python
# Generated by test_extractor.py --framework pytest

class TestPasswordReset:
    def test_ac1_request_reset_existing_email(self):
        """AC-1: Request reset with existing email shows generic message."""
        # Given a user on the password reset page
        # When they enter a registered email and submit
        # Then they see "If an account exists, a reset link has been sent"
        raise NotImplementedError("Implement this test")

    def test_ac1_request_reset_nonexistent_email(self):
        """AC-1: Request reset with unknown email shows same generic message."""
        # Given a user on the password reset page
        # When they enter an unregistered email and submit
        # Then they see identical response to existing email case
        raise NotImplementedError("Implement this test")

    def test_ac2_valid_reset_link(self):
        """AC-2: Reset link works within expiry window."""
        raise NotImplementedError("Implement this test")

    def test_ac3_expired_reset_link(self):
        """AC-3: Reset link rejected after 1 hour."""
        raise NotImplementedError("Implement this test")

    def test_ac4_previous_links_invalidated(self):
        """AC-4: Old reset links stop working when a new one is requested."""
        raise NotImplementedError("Implement this test")

    def test_ec1_nonexistent_email_same_response(self):
        """EC-1: Non-existent email produces identical response."""
        raise NotImplementedError("Implement this test")

    def test_ec2_reset_link_used_twice(self):
        """EC-2: Already-used reset link shows appropriate message."""
        raise NotImplementedError("Implement this test")
```

---
## Anti-Patterns

### 1. Coding Before Spec Approval

**Symptom:** "I'll start coding while the spec is being reviewed."
**Problem:** The review will surface changes. Now you have code that implements a rejected design.
**Rule:** Implementation does not begin until the spec status is "Approved."

### 2. Vague Acceptance Criteria

**Symptom:** "The system should work well" or "The UI should be responsive."
**Problem:** Untestable. What does "well" mean? What does "responsive" mean?
**Rule:** Every acceptance criterion must be verifiable by a machine. If you cannot write a test for it, rewrite the criterion.

### 3. Missing Edge Cases

**Symptom:** The happy path is specified; error paths are not.
**Problem:** Developers invent error handling on the fly, leading to inconsistent behavior.
**Rule:** For every external dependency (API, database, file system, user input), specify at least one failure scenario.

### 4. Spec as Post-Hoc Documentation

**Symptom:** "Let me write the spec now that the feature is done."
**Problem:** This is documentation, not specification. It describes what was built, not what should have been built. It cannot catch design errors because the design is already frozen.
**Rule:** If the spec was written after the code, it is not a spec. Relabel it as documentation.

### 5. Gold-Plating Beyond Spec

**Symptom:** "While I was in there, I also added..."
**Problem:** Untested code. Unreviewed design. Potential for subtle bugs in the "bonus" feature.
**Rule:** If it is not in the spec, it does not get built. File a new spec for additional features.

### 6. Acceptance Criteria Without Requirement Traceability

**Symptom:** AC-7 exists but does not reference any FR-* or NFR-*.
**Problem:** Orphaned criteria mean either a requirement is missing or the criterion is unnecessary.
**Rule:** Every AC-* MUST reference at least one FR-* or NFR-*.

### 7. Skipping Validation

**Symptom:** "The spec looks fine, let's just start."
**Problem:** Missing sections discovered during implementation cause blocking delays.
**Rule:** Always run `spec_validator.py --strict` before starting implementation. Fix all warnings.

---
## Cross-References

- **`engineering-team/tdd-guide`** — Red-green-refactor cycle, test generation, coverage analysis. Use after Phase 4 of this workflow.
- **`engineering/focused-fix`** — Deep-dive feature repair. When a spec-driven implementation has systemic issues, use focused-fix for diagnosis.
- **`engineering/rag-architect`** — If the feature involves retrieval or knowledge systems, use rag-architect for the technical design within the spec.
- **`references/spec_format_guide.md`** — Complete template with section-by-section explanations.
- **`references/bounded_autonomy_rules.md`** — Full decision matrix for when to stop vs. continue.
- **`references/acceptance_criteria_patterns.md`** — Pattern library for writing Given/When/Then criteria.

---
## Tools

| Script | Purpose | Key Flags |
|--------|---------|-----------|
| `spec_generator.py` | Generate spec template from feature name/description | `--name`, `--description`, `--format`, `--json` |
| `spec_validator.py` | Validate spec completeness (0-100 score) | `--file`, `--strict`, `--json` |
| `test_extractor.py` | Extract test stubs from acceptance criteria | `--file`, `--framework`, `--output`, `--json` |

```bash
# Generate a spec template
python spec_generator.py --name "User Authentication" --description "OAuth 2.0 login flow"

# Validate a spec
python spec_validator.py --file specs/auth.md --strict

# Extract test cases
python test_extractor.py --file specs/auth.md --framework pytest --output tests/test_auth.py
```
@@ -0,0 +1,497 @@
# Acceptance Criteria Patterns

A pattern library for writing Given/When/Then acceptance criteria across common feature types. Use these as starting points — adapt to your domain.

---

## Pattern Structure

Every acceptance criterion follows this structure:

```
### AC-N: [Descriptive name] (FR-N, NFR-N)
Given [precondition — the system/user is in this state]
When [trigger — the user or system performs this action]
Then [outcome — this observable, testable result occurs]
And [additional outcome — and this also happens]
```

**Rules:**

1. One scenario per AC. Multiple Given/When/Then blocks = multiple ACs.
2. Every AC references at least one FR-* or NFR-*.
3. Outcomes must be observable and testable — no subjective language.
4. Preconditions must be achievable in a test setup.

---
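As an illustration, the four rules above are mechanically checkable. A rough sketch of such a lint pass (hypothetical — not the actual `spec_validator.py`), assuming the heading and clause layout shown:

```python
import re

# Subjective words that make an outcome untestable (see Writing Tips below)
BANNED = {"quickly", "properly", "nicely", "user-friendly", "well"}

def lint_criterion(ac_text: str) -> list[str]:
    """Return a list of problems found in a single acceptance criterion."""
    problems = []
    lines = [l.strip() for l in ac_text.strip().splitlines()]
    header = lines[0] if lines else ""
    # Rule 2: header must carry a requirement reference, e.g. "(FR-1, NFR-S2)"
    if not re.match(r"### AC-\d+: .+ \((FR|NFR)-", header):
        problems.append("header must look like '### AC-N: name (FR-N/NFR-N)'")
    body = lines[1:]
    # Structure: each of Given/When/Then must appear at least once
    for keyword in ("Given", "When", "Then"):
        if not any(l.startswith(keyword + " ") for l in body):
            problems.append(f"missing {keyword} clause")
    # Rule 1: a second When means two scenarios crammed into one AC
    if sum(l.startswith("When ") for l in body) > 1:
        problems.append("multiple When clauses — split into separate ACs")
    # Rule 3: flag subjective language
    words = set(re.findall(r"[a-z-]+", ac_text.lower()))
    for w in sorted(BANNED & words):
        problems.append(f"subjective word: {w!r}")
    return problems

ac = """### AC-9: Fast response
Given a user on the dashboard
When they refresh
Then the page loads quickly"""
print(lint_criterion(ac))  # flags the missing (FR-*) reference and "quickly"
```

A clean criterion — traceable header, one Given/When/Then scenario, objective outcomes — returns an empty list.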
## Authentication Patterns

### Login — Happy Path

```markdown
### AC-1: Successful login with valid credentials (FR-1)
Given a registered user with email "user@example.com" and password "V@lidP4ss!"
When they POST /api/auth/login with email "user@example.com" and password "V@lidP4ss!"
Then the response status is 200
And the response body contains a valid JWT access token
And the response body contains a refresh token
And the access token expires in 24 hours
```

### Login — Invalid Credentials

```markdown
### AC-2: Login rejected with wrong password (FR-1)
Given a registered user with email "user@example.com"
When they POST /api/auth/login with email "user@example.com" and an incorrect password
Then the response status is 401
And the response body contains error code "INVALID_CREDENTIALS"
And no token is issued
And the failed attempt is logged
```

### Login — Account Locked

```markdown
### AC-3: Login rejected for locked account (FR-1, NFR-S2)
Given a user whose account is locked due to 5 consecutive failed login attempts
When they POST /api/auth/login with correct credentials
Then the response status is 403
And the response body contains error code "ACCOUNT_LOCKED"
And the response includes a "retryAfter" field with seconds until unlock
```

### Token Refresh

```markdown
### AC-4: Token refresh with valid refresh token (FR-3)
Given a user with a valid, non-expired refresh token
When they POST /api/auth/refresh with that refresh token
Then the response status is 200
And a new access token is issued
And the old refresh token is invalidated
And a new refresh token is issued (rotation)
```

### Logout

```markdown
### AC-5: Logout invalidates session (FR-4)
Given an authenticated user with a valid access token
When they POST /api/auth/logout with that token
Then the response status is 204
And the access token is no longer accepted for API calls
And the refresh token is invalidated
```

---
## CRUD Patterns

### Create

```markdown
### AC-6: Create resource with valid data (FR-1)
Given an authenticated user with "editor" role
When they POST /api/resources with valid payload {name: "Test", type: "A"}
Then the response status is 201
And the response body contains the created resource with a generated UUID
And the resource's "createdAt" field is set to the current UTC timestamp
And the resource's "createdBy" field matches the authenticated user's ID
```

### Create — Validation Failure

```markdown
### AC-7: Create resource rejected with invalid data (FR-1)
Given an authenticated user
When they POST /api/resources with payload missing required field "name"
Then the response status is 400
And the response body contains error code "VALIDATION_ERROR"
And the response body contains field-level detail: {"name": "Required field"}
And no resource is created in the database
```

### Read — Single Item

```markdown
### AC-8: Read resource by ID (FR-2)
Given an existing resource with ID "abc-123"
When an authenticated user GETs /api/resources/abc-123
Then the response status is 200
And the response body contains the resource with all fields
```

### Read — Not Found

```markdown
### AC-9: Read non-existent resource returns 404 (FR-2)
Given no resource exists with ID "nonexistent-id"
When an authenticated user GETs /api/resources/nonexistent-id
Then the response status is 404
And the response body contains error code "NOT_FOUND"
```

### Update

```markdown
### AC-10: Update resource with valid data (FR-3)
Given an existing resource with ID "abc-123" owned by the authenticated user
When they PATCH /api/resources/abc-123 with {name: "Updated Name"}
Then the response status is 200
And the resource's "name" field is "Updated Name"
And the resource's "updatedAt" field is updated to the current UTC timestamp
And fields not included in the patch are unchanged
```

### Update — Ownership Check

```markdown
### AC-11: Update rejected for non-owner (FR-3, FR-6)
Given an existing resource with ID "abc-123" owned by user "other-user"
When the authenticated user (not "other-user") PATCHes /api/resources/abc-123
Then the response status is 403
And the response body contains error code "FORBIDDEN"
And the resource is unchanged
```

### Delete — Soft Delete

```markdown
### AC-12: Soft delete resource (FR-5)
Given an existing resource with ID "abc-123" owned by the authenticated user
When they DELETE /api/resources/abc-123
Then the response status is 204
And the resource's "deletedAt" field is set to the current UTC timestamp
And the resource no longer appears in GET /api/resources (list endpoint)
And the resource still exists in the database (soft deleted)
```

### List — Pagination

```markdown
### AC-13: List resources with default pagination (FR-4)
Given 50 resources exist for the authenticated user
When they GET /api/resources without pagination parameters
Then the response status is 200
And the response contains the first 20 resources (default page size)
And the response includes "totalCount: 50"
And the response includes "page: 1"
And the response includes "pageSize: 20"
And the response includes "hasNextPage: true"
```
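A criterion like AC-13 pins the response envelope down precisely enough to sketch the handler logic. A minimal illustration (hypothetical helper — field names are taken from the criterion above):

```python
def paginate(items: list, page: int = 1, page_size: int = 20) -> dict:
    """Build the list-endpoint envelope described in AC-13."""
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "totalCount": len(items),
        "page": page,
        "pageSize": page_size,
        "hasNextPage": start + page_size < len(items),
    }

resp = paginate(list(range(50)))
print(resp["totalCount"], resp["hasNextPage"], len(resp["items"]))  # 50 True 20
```

Because every field is named in the criterion, the extracted test can assert on the exact envelope rather than on vague "pagination works" language.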
### List — Filtered

```markdown
### AC-14: List resources with type filter (FR-4)
Given 30 resources of type "A" and 20 resources of type "B" exist
When the authenticated user GETs /api/resources?type=A
Then the response status is 200
And all returned resources have type "A"
And the response "totalCount" is 30
```

---

## Search Patterns

### Basic Search

```markdown
### AC-15: Search returns matching results (FR-7)
Given resources with names "Alpha Report", "Beta Analysis", "Alpha Summary" exist
When the user GETs /api/resources?q=Alpha
Then the response contains "Alpha Report" and "Alpha Summary"
And the response does not contain "Beta Analysis"
And results are ordered by relevance score (descending)
```

### Search — Empty Results

```markdown
### AC-16: Search with no matches returns empty list (FR-7)
Given no resources match the query "xyznonexistent"
When the user GETs /api/resources?q=xyznonexistent
Then the response status is 200
And the response contains an empty "items" array
And "totalCount" is 0
```

### Search — Special Characters

```markdown
### AC-17: Search handles special characters safely (FR-7, NFR-S1)
Given resources exist in the database
When the user GETs /api/resources?q="; DROP TABLE resources;--
Then the response status is 200
And no SQL injection occurs
And the search treats the input as a literal string
```
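The usual way to satisfy AC-17's literal-string requirement is a parameterized query: the driver binds the input as data, never as SQL. A minimal `sqlite3` sketch (hypothetical schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (name TEXT)")
conn.execute("INSERT INTO resources VALUES ('Alpha Report')")

def search(q: str) -> list:
    # The "?" placeholder binds q as data, so an injection attempt like
    # '"; DROP TABLE resources;--' is matched as literal text, not executed.
    return conn.execute(
        "SELECT name FROM resources WHERE name LIKE ?", (f"%{q}%",)
    ).fetchall()

print(search('"; DROP TABLE resources;--'))  # [] — no match, table intact
print(search("Alpha"))                       # [('Alpha Report',)]
```

The corresponding test simply runs the hostile query, asserts a 200-equivalent (no exception), and then verifies the table still exists.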
---

## File Upload Patterns

### Upload — Happy Path

```markdown
### AC-18: Upload file within size limit (FR-8)
Given an authenticated user
When they POST /api/files with a 5MB PNG file
Then the response status is 201
And the response contains the file's URL, size, and MIME type
And the file is stored in the configured storage backend
And the file is associated with the authenticated user
```

### Upload — Size Exceeded

```markdown
### AC-19: Upload rejected for oversized file (FR-8)
Given the maximum file size is 10MB
When the user POSTs /api/files with a 15MB file
Then the response status is 413
And the response contains error code "FILE_TOO_LARGE"
And no file is stored
```

### Upload — Invalid Type

```markdown
### AC-20: Upload rejected for disallowed file type (FR-8, NFR-S3)
Given allowed file types are PNG, JPG, PDF
When the user POSTs /api/files with an .exe file
Then the response status is 415
And the response contains error code "UNSUPPORTED_MEDIA_TYPE"
And no file is stored
```

---
## Payment Patterns

### Charge — Happy Path

```markdown
### AC-21: Successful payment charge (FR-10)
Given a user with a valid payment method on file
When they POST /api/payments with amount 49.99 and currency "USD"
Then the payment gateway is charged $49.99
And the response status is 201
And the response contains a transaction ID
And a payment record is created with status "completed"
And a receipt email is sent to the user
```

### Charge — Declined

```markdown
### AC-22: Payment declined by gateway (FR-10)
Given a user with an expired credit card on file
When they POST /api/payments with amount 49.99
Then the payment gateway returns a decline
And the response status is 402
And the response contains error code "PAYMENT_DECLINED"
And no payment record is created with status "completed"
And the user is prompted to update their payment method
```

### Charge — Idempotency

```markdown
### AC-23: Duplicate payment request is idempotent (FR-10, NFR-R1)
Given a payment was successfully processed with idempotency key "key-123"
When the same request is sent again with idempotency key "key-123"
Then the response status is 200
And the response contains the original transaction ID
And the user is NOT charged a second time
```
---

## Notification Patterns

### Email Notification

```markdown
### AC-24: Email notification sent on event (FR-11)
Given a user with notification preferences set to "email"
When their order status changes to "shipped"
Then an email is sent to their registered email address
And the email subject contains the order number
And the email body contains the tracking URL
And a notification record is created with status "sent"
```

### Notification — Delivery Failure

```markdown
### AC-25: Failed notification is retried (FR-11, NFR-R2)
Given the email service returns a 5xx error on first attempt
When a notification is triggered
Then the system retries up to 3 times with exponential backoff (1s, 4s, 16s)
And if all retries fail, the notification status is set to "failed"
And an alert is sent to the ops channel
```
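The retry schedule in AC-25 translates directly into a small loop. A sketch (hypothetical `send` callable; `sleep` is injectable so the extracted test does not actually wait):

```python
import time

def send_with_retry(send, delays=(1, 4, 16), sleep=time.sleep) -> str:
    """Try `send` once, then retry per the backoff schedule; return final status."""
    for delay in (0,) + tuple(delays):   # initial attempt + up to 3 retries
        if delay:
            sleep(delay)
        try:
            send()
            return "sent"
        except Exception:
            continue
    return "failed"   # all retries exhausted -> set status, alert ops channel

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("email service 5xx")

print(send_with_retry(flaky, sleep=lambda s: None))  # "sent" on the 3rd attempt
```

Specifying the exact delays (1s, 4s, 16s) in the criterion is what makes this testable: the test asserts the sleep arguments, not just "it retries".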
---

## Negative Test Patterns

### Unauthorized Access

```markdown
### AC-26: Unauthenticated request rejected (NFR-S1)
Given no authentication token is provided
When the user GETs /api/resources
Then the response status is 401
And the response contains error code "AUTHENTICATION_REQUIRED"
And no resource data is returned
```

### Invalid Input — Type Mismatch

```markdown
### AC-27: String provided for numeric field (FR-1)
Given the "quantity" field expects an integer
When the user POSTs with quantity: "abc"
Then the response status is 400
And the response body contains field error: {"quantity": "Must be an integer"}
```

### Rate Limiting

```markdown
### AC-28: Rate limit enforced (NFR-S2)
Given the rate limit is 100 requests per minute per API key
When the user sends the 101st request within 60 seconds
Then the response status is 429
And the response includes header "Retry-After" with seconds until reset
And the response contains error code "RATE_LIMITED"
```
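A criterion like AC-28 implies a fixed-window counter per API key. A minimal in-memory sketch (hypothetical — production systems typically keep these counters in Redis and may prefer sliding windows):

```python
import time

class RateLimiter:
    """Fixed-window counter: `limit` requests per `window` seconds per key."""

    def __init__(self, limit=100, window=60):
        self.limit, self.window = limit, window
        self.counters = {}   # api_key -> (window_start, count)

    def check(self, api_key, now=None):
        now = time.time() if now is None else now
        start, count = self.counters.get(api_key, (int(now), 0))
        if now - start >= self.window:            # window expired — reset
            start, count = int(now), 0
        if count >= self.limit:
            retry_after = int(start + self.window - now)
            return 429, {"error": "RATE_LIMITED", "retryAfter": retry_after}
        self.counters[api_key] = (start, count + 1)
        return 200, {}

rl = RateLimiter(limit=2, window=60)
print([rl.check("k", now=t)[0] for t in (0, 1, 2, 61)])  # [200, 200, 429, 200]
```

Passing `now` explicitly is what makes the "101st request within 60 seconds" clause testable without a real clock.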
### Concurrent Modification

```markdown
### AC-29: Optimistic locking prevents lost updates (NFR-R1)
Given a resource with version 5
When user A PATCHes with version 5 and user B PATCHes with version 5 simultaneously
Then one succeeds with status 200 (version becomes 6)
And the other receives status 409 with error code "CONFLICT"
And the 409 response includes the current version number
```
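AC-29 is the classic compare-and-swap on a version column. A minimal in-memory sketch (hypothetical store — a real implementation would do this in a single atomic `UPDATE ... WHERE id = ? AND version = ?`):

```python
class Conflict(Exception):
    """Maps to HTTP 409 with the current version in the body."""
    def __init__(self, current_version):
        self.current_version = current_version

store = {"abc-123": {"name": "Old", "version": 5}}

def patch(resource_id, expected_version, changes):
    resource = store[resource_id]
    if resource["version"] != expected_version:   # someone else updated first
        raise Conflict(resource["version"])
    resource.update(changes)
    resource["version"] += 1                      # 5 -> 6 on success
    return resource

patch("abc-123", 5, {"name": "A"})                # user A wins; version is now 6
try:
    patch("abc-123", 5, {"name": "B"})            # user B loses the race
except Conflict as e:
    print(e.current_version)  # 6
```

The extracted test performs both writes with the same starting version and asserts exactly one 200 and one 409 carrying the new version.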
---

## Performance Criteria Patterns

### Response Time

```markdown
### AC-30: API response time under load (NFR-P1)
Given the system is handling 1,000 concurrent users
When a user GETs /api/dashboard
Then the response is returned in < 500ms (p95)
And the response is returned in < 1000ms (p99)
```

### Throughput

```markdown
### AC-31: System handles target throughput (NFR-P2)
Given normal production traffic patterns
When the system receives 5,000 requests per second
Then all requests are processed without queue overflow
And the error rate remains below 0.1%
```

### Resource Usage

```markdown
### AC-32: Memory usage within bounds (NFR-P3)
Given the service is processing normal traffic
When measured over a 24-hour period
Then memory usage does not exceed 512MB RSS
And no memory leaks are detected (RSS growth < 5% over 24h)
```

---

## Accessibility Criteria Patterns

### Keyboard Navigation

```markdown
### AC-33: Form is fully keyboard navigable (NFR-A1)
Given the user is on the login page using only a keyboard
When they press Tab
Then focus moves through: email field -> password field -> submit button
And each focused element has a visible focus indicator
And pressing Enter on the submit button submits the form
```

### Screen Reader

```markdown
### AC-34: Error messages announced to screen readers (NFR-A2)
Given the user submits the form with invalid data
When validation errors appear
Then each error is associated with its form field via aria-describedby
And the error container has role="alert" for immediate announcement
And the first error field receives focus
```

### Color Contrast

```markdown
### AC-35: Text meets contrast requirements (NFR-A3)
Given the default theme is active
When measuring text against background colors
Then all body text meets a 4.5:1 contrast ratio (WCAG AA)
And all large text (18px+ or 14px+ bold) meets a 3:1 contrast ratio
And all interactive element states (hover, focus, active) meet 3:1
```

### Reduced Motion

```markdown
### AC-36: Animations respect user preference (NFR-A4)
Given the user has enabled "prefers-reduced-motion" in their OS settings
When they load any page with animations
Then all non-essential animations are disabled
And essential animations (e.g., loading spinner) use a reduced version
And no content is hidden behind animation-only interactions
```

---

## Writing Tips

### Do

- Start Given with the system/user state, not the action
- Make When a single, specific trigger
- Make Then observable — status codes, field values, side effects
- Include And for additional assertions on the same outcome
- Reference requirement IDs in the AC title

### Do Not

- Write "Then the system works correctly" (not testable)
- Combine multiple scenarios in one AC
- Use subjective words: "quickly", "properly", "nicely", "user-friendly"
- Skip the precondition — Given is required even if it seems obvious
- Write Given/When/Then as prose paragraphs — use the structured format

### Smell Tests

If your AC has any of these, rewrite it:

| Smell | Example | Fix |
|-------|---------|-----|
| No Given clause | "When user clicks, then page loads" | Add "Given user is on the dashboard" |
| Vague Then | "Then it works" | Specify status code, body, side effects |
| Multiple Whens | "When user clicks A and then clicks B" | Split into two ACs |
| Implementation detail | "Then the Redux store is updated" | Focus on user-observable outcome |
|
||||
| No requirement reference | "AC-5: Dashboard loads" | "AC-5: Dashboard loads (FR-7)" |
|
||||
# Bounded Autonomy Rules

Decision framework for when an agent (human or AI) should stop and ask vs. continue working autonomously during spec-driven development.

---

## The Core Principle

**Autonomy is earned by clarity.** The clearer the spec, the more autonomy the implementer has. The more ambiguous the spec, the more the implementer must stop and ask.

This is not about trust. It is about risk. A clear spec means low risk of building the wrong thing. An ambiguous spec means high risk.

---

## Decision Matrix

| Signal | Action | Rationale |
|--------|--------|-----------|
| Spec is Approved, requirement is clear, tests exist | **Continue** | Low risk. Build it. |
| Requirement is clear but no test exists yet | **Continue** (write the test first) | You can infer the test from the requirement. |
| Requirement uses SHOULD/MAY keywords | **Continue** with your best judgment | These are intentionally flexible. Document your choice. |
| Requirement is ambiguous (multiple valid interpretations) | **STOP** if ambiguity > 30% of the task | Ask the spec author to clarify. |
| Implementation requires changing an API contract | **STOP** always | Breaking changes need explicit approval. |
| Implementation requires a new database migration | **STOP** if it changes existing columns/tables | New tables are lower risk than schema changes. |
| Security-related change (auth, crypto, PII) | **STOP** always | Security changes need review regardless of spec clarity. |
| Performance-critical path with no benchmark data | **STOP** | You cannot prove NFR compliance without measurement. |
| Bug found in existing code unrelated to spec | **STOP** — file a separate issue | Do not fix unrelated bugs in a spec-scoped implementation. |
| Spec says "N/A" for a section you think needs content | **STOP** | The author may have a reason, or they may have missed it. |

---

## Ambiguity Scoring

When you encounter ambiguity, quantify it before deciding to stop or continue.

### How to Score Ambiguity

For each requirement you are implementing, ask:

1. **Can I write a test for this right now?** (No = +20% ambiguity)
2. **Are there multiple valid interpretations?** (Yes = +20% ambiguity)
3. **Does the spec contradict itself?** (Yes = +30% ambiguity)
4. **Am I making assumptions about user behavior?** (Yes = +15% ambiguity)
5. **Does this depend on an undocumented external system?** (Yes = +15% ambiguity)

### Threshold

| Ambiguity Score | Action |
|-----------------|--------|
| 0-15% | Continue. Minor ambiguity is normal. Document your interpretation. |
| 16-30% | Continue with caution. Add a comment explaining your interpretation. Flag in PR. |
| 31-50% | STOP. Ask the spec author one specific question. Do not continue until answered. |
| 51%+ | STOP. The spec is incomplete. Request a revision before proceeding. |

### Example

**Requirement:** "FR-7: The system MUST notify the user when their order ships."

Questions:
1. Can I write a test? Partially — I know WHAT to test but not HOW (email? push? in-app?). +20%
2. Multiple interpretations? Yes — notification channel is unclear. +20%
3. Contradicts itself? No. +0%
4. Assuming user behavior? Yes — I am assuming they want email. +15%
5. Undocumented external system? Maybe — depends on notification service. +15%

**Total: 70%.** STOP. The spec needs to specify the notification channel.

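The scoring walkthrough above is mechanical enough to automate. A minimal sketch, where the weights mirror the five checklist questions; the function names and the answers dict shape are illustrative, not part of the skill's real tooling:

```python
# Minimal sketch of the ambiguity-scoring checklist. Weights mirror
# the five questions; names are illustrative only.

WEIGHTS = {
    "cannot_write_test": 20,        # Q1: No = +20%
    "multiple_interpretations": 20, # Q2: Yes = +20%
    "self_contradiction": 30,       # Q3: Yes = +30%
    "assumed_user_behavior": 15,    # Q4: Yes = +15%
    "undocumented_dependency": 15,  # Q5: Yes = +15%
}

def ambiguity_score(answers: dict) -> int:
    """Sum the weights of every checklist question answered True."""
    return sum(WEIGHTS[key] for key, hit in answers.items() if hit)

def action_for(score: int) -> str:
    """Map a score onto the threshold table."""
    if score <= 15:
        return "continue"
    if score <= 30:
        return "continue-with-caution"
    return "stop"

# The FR-7 example: partial test (+20), unclear channel (+20),
# assumed email (+15), unknown notification service (+15) = 70%.
fr7 = {
    "cannot_write_test": True,
    "multiple_interpretations": True,
    "self_contradiction": False,
    "assumed_user_behavior": True,
    "undocumented_dependency": True,
}
print(ambiguity_score(fr7), action_for(ambiguity_score(fr7)))  # 70 stop
```

Encoding the thresholds this way also keeps the stop/continue decision auditable: the score and its inputs can be pasted directly into an escalation note.
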
---

## Scope Creep Detection

### What Is Scope Creep?

Scope creep is implementing functionality not described in the spec. It includes:

- Adding features the spec does not mention
- "Improving" behavior beyond what acceptance criteria require
- Handling edge cases the spec explicitly excluded
- Refactoring unrelated code "while you're in there"
- Building infrastructure for future features

### Detection Patterns

| Pattern | Example | Risk |
|---------|---------|------|
| "While I'm here..." | Refactoring a utility function unrelated to the spec | Medium — unreviewed changes |
| "This would be easy to add..." | Adding a search filter the spec does not mention | High — untested, unspecified |
| "Users will probably want..." | Building a feature based on assumption | High — may conflict with future specs |
| "This is obviously needed..." | Adding logging, metrics, or caching not in NFRs | Medium — may be overkill or wrong approach |
| "The spec forgot to mention..." | Building something the spec excluded | Critical — may be deliberately excluded |

### Response Protocol

When you detect scope creep in your own work:

1. **Stop immediately.** Do not commit the extra code.
2. **Check Out of Scope.** Is this item explicitly excluded?
3. **If excluded:** Delete the code. The spec author had a reason.
4. **If not mentioned:** File a note for the spec author. Ask if it should be added.
5. **If approved:** Update the spec FIRST, then implement.

---

## Breaking Change Identification

### What Counts as a Breaking Change?

A breaking change is any modification that could cause existing clients, tests, or integrations to fail.

| Category | Breaking | Not Breaking |
|----------|----------|--------------|
| API endpoint removed | Yes | - |
| API endpoint added | - | No |
| Required field added to request | Yes | - |
| Optional field added to request | - | No |
| Field removed from response | Yes | - |
| Field added to response | - | No (usually) |
| Status code changed | Yes | - |
| Error code string changed | Yes | - |
| Database column removed | Yes | - |
| Database column added (nullable) | - | No |
| Database column added (not null, no default) | Yes | - |
| Enum value removed | Yes | - |
| Enum value added | - | No (usually) |
| Behavior change for existing input | Yes | - |

### Breaking Change Protocol

1. **Identify** the breaking change before implementing it.
2. **Escalate** immediately — do not implement without approval.
3. **Propose** a migration path (versioned API, feature flag, deprecation period).
4. **Document** the breaking change in the spec's changelog.

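The required-vs-optional request-field rows in the table deserve a concrete illustration. A sketch with a hypothetical user-creation parser (the function and field names are illustrative): making a new field optional with a server-side default leaves old clients working, while making it required breaks every existing caller.

```python
# Hypothetical request parser showing the additive-change rule:
# a new OPTIONAL field with a default does not break old clients,
# while a new REQUIRED field does. Names are illustrative.

def parse_create_user(body: dict, role_required: bool = False) -> dict:
    for field in ("email", "password", "displayName"):
        if field not in body:
            raise ValueError(f"missing required field: {field}")
    if role_required and "role" not in body:
        # Breaking: every existing client omits "role" and now fails.
        raise ValueError("missing required field: role")
    # Non-breaking: default the new field when absent.
    return {**body, "role": body.get("role", "user")}

old_client_body = {"email": "a@b.c", "password": "P@ssw0rd123", "displayName": "A"}
print(parse_create_user(old_client_body)["role"])  # user
```
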
---

## Security Implication Checklist

Any change touching the following areas MUST be escalated, even if the spec seems clear.

### Always Escalate

- [ ] Authentication logic (login, logout, token generation)
- [ ] Authorization logic (role checks, permission gates)
- [ ] Encryption/hashing (algorithm choice, key management)
- [ ] PII handling (storage, transmission, logging)
- [ ] Input validation bypass (new endpoints, parameter changes)
- [ ] Rate limiting changes (thresholds, scope)
- [ ] CORS or CSP policy changes
- [ ] File upload handling
- [ ] SQL/NoSQL query construction (injection risk)
- [ ] Deserialization of user input
- [ ] Redirect URLs from user input (open redirect risk)
- [ ] Secrets in code, config, or logs

### Security Escalation Template

```markdown
## Security Escalation: [Title]

**Affected area:** [authentication/authorization/encryption/PII/etc.]
**Spec reference:** [FR-N or NFR-SN]
**Risk:** [What could go wrong if implemented incorrectly]
**Current protection:** [What exists today]
**Proposed change:** [What the spec requires]
**My concern:** [Specific security question]
**Recommendation:** [Proposed approach with security rationale]
```

---

## Escalation Templates

### Template 1: Ambiguous Requirement

```markdown
## Escalation: Ambiguous Requirement

**Blocked on:** FR-7 ("notify the user when their order ships")
**Ambiguity score:** 70%
**Question:** What notification channel should be used?
**Options considered:**
A. Email only — Pros: simple, reliable. Cons: not real-time.
B. Email + in-app notification — Pros: covers both async and real-time. Cons: more implementation effort.
C. Configurable per user — Pros: maximum flexibility. Cons: requires preference UI (not in spec).
**My recommendation:** B (email + in-app). Covers most use cases without requiring new UI.
**Impact of waiting:** Cannot implement FR-7 until resolved. No other work blocked.
```

### Template 2: Missing Edge Case

```markdown
## Escalation: Missing Edge Case

**Related to:** FR-3 (password reset link expires after 1 hour)
**Scenario:** User clicks a reset link, but their account was deleted between requesting and clicking.
**Not in spec:** Edge cases section does not cover this.
**Options considered:**
A. Show generic "link invalid" error — Pros: secure (no info leak). Cons: confusing for deleted user.
B. Show "account not found" error — Pros: clear. Cons: confirms account deletion to link holder.
**My recommendation:** A. Security over clarity — do not reveal account existence.
**Impact of waiting:** Can implement other ACs; this is blocking only AC-2 completion.
```

### Template 3: Potential Breaking Change

```markdown
## Escalation: Potential Breaking Change

**Spec requires:** Adding required field "role" to POST /api/users request (FR-6)
**Current behavior:** POST /api/users accepts {email, password, displayName}
**Breaking:** Yes — existing clients will get 400 errors (missing required field)
**Options considered:**
A. Make "role" required as spec says — Pros: matches spec. Cons: breaks mobile app v2.1.
B. Make "role" optional with default "user" — Pros: backward compatible. Cons: deviates from spec.
C. Version the API (v2) — Pros: clean separation. Cons: maintenance burden.
**My recommendation:** B. Default to "user" for backward compatibility. Update spec to reflect MAY instead of MUST.
**Impact of waiting:** Frontend team is building against the new contract. Need answer within 2 days.
```

### Template 4: Scope Creep Proposal

```markdown
## Escalation: Potential Addition to Spec

**Context:** While implementing FR-2 (password validation), I noticed the spec does not mention password strength feedback.
**Not in spec:** No requirement for showing strength indicators.
**Checked Out of Scope:** Not listed there either.
**Proposal:** Add FR-7: "The system SHOULD display password strength feedback during registration."
**Effort:** ~2 hours additional implementation.
**Question:** Should this be added to current spec, filed as a separate spec, or skipped?
**Impact of waiting:** FR-2 implementation is not blocked. This is an enhancement question only.
```

---

## Quick Reference Card

```
CONTINUE if:
- Spec is approved
- Requirement uses MUST and is unambiguous
- Tests can be written directly from the AC
- Changes are additive and non-breaking
- You are refactoring internals only (no behavior change)

STOP if:
- Ambiguity > 30%
- Any breaking change
- Any security-related change
- Spec says N/A but you think it shouldn't
- You are about to build something not in the spec
- You cannot write a test for the requirement
- External dependency is undocumented
```

---

## Anti-Patterns in Autonomy

### 1. "I'll Ask Later"
Continuing past an ambiguity checkpoint because asking feels slow. The rework from building the wrong thing is always slower.

### 2. "It's Obviously Needed"
Assuming a missing feature was accidentally omitted. It may have been deliberately excluded. Check Out of Scope first.

### 3. "The Spec Is Wrong"
Implementing what you think the spec SHOULD say instead of what it DOES say. If the spec is wrong, escalate. Do not silently "fix" it.

### 4. "Just This Once"
Bypassing the escalation protocol for a "small" change. Small changes compound. The protocol exists because humans are bad at judging risk in the moment.

### 5. "I Already Built It"
Presenting completed work that was never in the spec and hoping it gets accepted. This creates review pressure and wastes everyone's time if rejected. Ask BEFORE building.

423 engineering/spec-driven-workflow/references/spec_format_guide.md Normal file

# Spec Format Guide

Complete reference for writing feature specifications. Every section is explained with examples, rationale, and common mistakes.

---

## The Spec Document Structure

A spec has 9 mandatory sections. If a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not skipped.

```
1. Title and Metadata
2. Context
3. Functional Requirements
4. Non-Functional Requirements
5. Acceptance Criteria
6. Edge Cases and Error Scenarios
7. API Contracts
8. Data Models
9. Out of Scope
```

---

## Section 1: Title and Metadata

```markdown
# Spec: [Feature Name]

**Author:** Jane Doe
**Date:** 2026-03-25
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** John Smith, Alice Chen
**Related specs:** SPEC-018 (User Registration), SPEC-023 (Session Management)
```

### Status Lifecycle

| Status | Meaning | Who Can Change |
|--------|---------|----------------|
| Draft | Author is still writing. Not ready for review. | Author |
| In Review | Ready for feedback. Implementation blocked. | Author |
| Approved | Reviewed and accepted. Implementation may begin. | Reviewer |
| Superseded | Replaced by a newer spec. Link to replacement. | Author |

**Rule:** Implementation MUST NOT begin until status is "Approved."

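The lifecycle above can be enforced with a small transition table. A sketch, assuming the forward-only transitions implied by the table (Draft to In Review by the author, In Review to Approved by a reviewer, Approved to Superseded by the author); the names are illustrative:

```python
# Sketch of a status-lifecycle guard. The transition set is inferred
# from the lifecycle table; function and dict names are illustrative.

ALLOWED = {
    ("Draft", "In Review"): "Author",
    ("In Review", "Approved"): "Reviewer",
    ("Approved", "Superseded"): "Author",
}

def can_transition(current: str, new: str, actor_role: str) -> bool:
    """True when actor_role may move a spec from current to new."""
    return ALLOWED.get((current, new)) == actor_role

print(can_transition("In Review", "Approved", "Reviewer"))  # True
print(can_transition("Draft", "Approved", "Author"))        # False: review is not skippable
```

A guard like this makes the "Implementation MUST NOT begin until Approved" rule checkable in CI rather than a matter of convention.
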
---

## Section 2: Context

The context section answers: **Why does this feature exist?**

### What to Include

- The problem being solved (with evidence: support tickets, metrics, user research)
- The current state (what exists today and what is broken or missing)
- The business justification (revenue impact, cost savings, user retention)
- Constraints or dependencies (regulatory, technical, timeline)

### What to Exclude

- Implementation details (that is the engineer's job)
- Solution proposals (the spec says WHAT, not HOW)
- Lengthy background (2-4 paragraphs maximum)

### Good Example

```markdown
## Context

Users who forget their passwords currently have no self-service recovery.
Support handles ~200 password reset requests per week, consuming approximately
8 hours of agent time at $45/hour ($360/week, $18,720/year). Additionally,
12% of users who contact support for a reset never return.

This feature provides self-service password reset via email, eliminating
support burden and reducing user churn from the reset flow.
```

### Bad Example

```markdown
## Context

We need a password reset feature. Users forget their passwords sometimes
and need to reset them. We should build this.
```

**Why it is bad:** No evidence, no metrics, no business justification. "We should build this" is not a reason.

---

## Section 3: Functional Requirements — RFC 2119

### RFC 2119 Keywords

These keywords have precise meanings per [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). Do not use them casually.

| Keyword | Meaning | Testing Implication |
|---------|---------|---------------------|
| **MUST** | Absolute requirement. The implementation is non-conformant without this. | Must have a passing test. Failure = release blocker. |
| **MUST NOT** | Absolute prohibition. Doing this = broken implementation. | Must have a test proving this cannot happen. |
| **SHOULD** | Strongly recommended. Can be omitted only with documented justification. | Should have a test. Omission requires written rationale. |
| **SHOULD NOT** | Strongly discouraged. Can be done only with documented justification. | Should have a test confirming the behavior does not occur. |
| **MAY** | Truly optional. Implementer's discretion. | Test is optional. Document if implemented. |

### Writing Good Requirements

**Each requirement MUST be:**
1. **Atomic** — One behavior per requirement. Not "The system MUST authenticate users and log them in."
2. **Testable** — You can write a test that proves it works or does not.
3. **Numbered** — Sequential FR-N format for traceability.
4. **Specific** — No ambiguous adjectives ("fast", "secure", "user-friendly").

### Good Requirements

```markdown
- FR-1: The system MUST accept login via email and password.
- FR-2: The system MUST reject passwords shorter than 8 characters.
- FR-3: The system MUST return a JWT access token on successful login.
- FR-4: The system MUST NOT include the password hash in any API response.
- FR-5: The system SHOULD support "remember me" with a 30-day refresh token.
- FR-6: The system MAY display last login time on the dashboard.
```

### Bad Requirements

```markdown
- FR-1: The login system must be fast and secure.
  (Untestable: what is "fast"? What is "secure"?)

- FR-2: The system must handle all edge cases.
  (Vague: which edge cases? This delegates the spec to the implementer.)

- FR-3: Users should be able to log in easily.
  (Subjective: "easily" is not measurable.)
```

---

## Section 4: Non-Functional Requirements

Non-functional requirements define quality attributes. Every requirement needs a **measurable threshold**.

### Categories

#### Performance

```markdown
- NFR-P1: Login API MUST respond in < 500ms (p95) under 1,000 concurrent users.
- NFR-P2: Dashboard page MUST achieve Largest Contentful Paint < 2.5s.
- NFR-P3: Search results MUST return within 200ms for queries under 100 characters.
```

**Bad:** "The system should be fast." (Not measurable.)

#### Security

```markdown
- NFR-S1: All API endpoints MUST require authentication except /health and /login.
- NFR-S2: Failed login attempts MUST be rate-limited to 5 per minute per IP.
- NFR-S3: Passwords MUST be hashed with bcrypt (cost factor >= 12).
- NFR-S4: Session tokens MUST be invalidated on password change.
```

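A threshold like NFR-S2 is directly implementable and testable. A minimal sliding-window sketch (the names, the in-memory store, and the injectable clock are illustrative assumptions, not the skill's real scripts):

```python
# Minimal sliding-window rate limiter matching NFR-S2
# (5 failed login attempts per minute per IP). Illustrative sketch.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_ATTEMPTS = 5
_attempts = defaultdict(deque)

def allow_login_attempt(ip, now=None):
    """Record one failed-login attempt; False once the IP exceeds the limit."""
    now = time.monotonic() if now is None else now
    window = _attempts[ip]
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()                 # drop attempts older than the window
    if len(window) >= MAX_ATTEMPTS:
        return False
    window.append(now)
    return True

# First five attempts pass; the sixth within the same minute is blocked.
results = [allow_login_attempt("10.0.0.1", now=t) for t in range(6)]
print(results)  # [True, True, True, True, True, False]
```

The injectable `now` parameter is what makes the NFR machine-testable without real sleeps, which matters for the acceptance criteria section below.
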
#### Accessibility

```markdown
- NFR-A1: All form inputs MUST have associated labels (WCAG 1.3.1).
- NFR-A2: Color contrast MUST meet 4.5:1 ratio (WCAG 1.4.3).
- NFR-A3: All interactive elements MUST be keyboard-navigable (WCAG 2.1.1).
```

#### Scalability

```markdown
- NFR-SC1: The system SHOULD handle 50,000 registered users.
- NFR-SC2: Database queries MUST use indexes; no full table scans on tables > 10K rows.
```

#### Reliability

```markdown
- NFR-R1: The authentication service MUST maintain 99.9% uptime (< 8.77h downtime/year).
- NFR-R2: Data MUST NOT be lost on service restart (durable storage required).
```

---

## Section 5: Acceptance Criteria — Given/When/Then

Acceptance criteria are the contract between the spec author and the implementer. They define "done."

### The Given/When/Then Pattern

```
Given [precondition — the world is in this state]
When [action — the user or system does this]
Then [outcome — this observable result occurs]
And [additional outcome — and also this]
```

### Rules for Acceptance Criteria

1. **Every AC MUST reference at least one FR-* or NFR-*.** Orphaned criteria indicate missing requirements.
2. **Every AC MUST be testable by a machine.** If you cannot write an automated test, rewrite the criterion.
3. **No subjective language.** Not "should look good" but "MUST render within the design-system grid."
4. **One scenario per AC.** If you have multiple Given/When/Then blocks, split into separate ACs.

### Example: Authentication Feature

```markdown
### AC-1: Successful login (FR-1, FR-3)
Given a registered user with email "user@example.com" and password "P@ssw0rd123"
When they POST /api/auth/login with those credentials
Then they receive a 200 response with a valid JWT token
And the token expires in 24 hours
And the response includes the user's display name

### AC-2: Invalid password (FR-1)
Given a registered user with email "user@example.com"
When they POST /api/auth/login with an incorrect password
Then they receive a 401 response
And the response body contains error "INVALID_CREDENTIALS"
And no token is issued

### AC-3: Short password rejected on registration (FR-2)
Given a new user attempting to register
When they submit a password with 7 characters
Then they receive a 400 response
And the response body contains error "PASSWORD_TOO_SHORT"
And the account is not created
```

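ACs in this shape translate almost mechanically into tests, which is what "testable by a machine" buys you. A sketch of what AC-3 becomes, assuming a hypothetical `validate_password` helper standing in for the registration endpoint (not one of the skill's real scripts):

```python
# AC-3 translated into a test. validate_password is a hypothetical
# stand-in for the registration endpoint's validation step.

def validate_password(password):
    """Return (status, error_code) per FR-2: reject passwords < 8 chars."""
    if len(password) < 8:
        return 400, "PASSWORD_TOO_SHORT"
    return 200, None

def test_ac3_short_password_rejected():
    # Given a new user attempting to register
    # When they submit a password with 7 characters
    status, error = validate_password("P@ssw0r")
    # Then they receive a 400 response
    assert status == 400
    # And the response body contains error "PASSWORD_TOO_SHORT"
    assert error == "PASSWORD_TOO_SHORT"

test_ac3_short_password_rejected()
```

Note how each Given/When/Then line survives as a comment, preserving traceability from test back to AC-3 and FR-2.
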
### Common Mistakes

| Mistake | Example | Fix |
|---------|---------|-----|
| Vague outcome | "Then the system works correctly" | "Then the response status is 200 and body contains {field: value}" |
| Missing precondition | "When user logs in, then token is issued" | "Given a registered user, when they POST valid credentials, then..." |
| Multiple scenarios | AC with 3 different When clauses | Split into 3 separate ACs |
| No FR reference | "AC-5: User sees dashboard" | "AC-5: User sees dashboard (FR-7)" |

---

## Section 6: Edge Cases and Error Scenarios

### What Counts as an Edge Case

- Invalid or malformed input
- External service failures (API down, timeout, rate-limited)
- Concurrent operations (race conditions)
- Boundary values (empty string, max length, zero, negative numbers)
- State conflicts (already exists, already deleted, expired)

### Format

```markdown
- EC-1: Empty email field → Return 400 with error "EMAIL_REQUIRED". Do not call auth service.
- EC-2: Email exceeds 255 characters → Return 400 with error "EMAIL_TOO_LONG".
- EC-3: OAuth provider returns 503 → Return 503 with "Service temporarily unavailable". Retry after 30s.
- EC-4: Two users register same email simultaneously → First succeeds, second gets 409 Conflict.
- EC-5: User clicks reset link after password was already changed → Show "Link already used."
```

### Coverage Rule

For every external dependency, specify at least one failure:
- Database: connection lost, timeout, constraint violation
- API: 4xx, 5xx, timeout, invalid response
- File system: file not found, permission denied, disk full
- User input: empty, too long, wrong type, injection attempt

---

## Section 7: API Contracts

### Notation

Use TypeScript-style interfaces. They are readable by both frontend and backend engineers.

```typescript
interface CreateUserRequest {
  email: string;            // MUST be valid email, max 255 chars
  password: string;         // MUST be 8-128 chars
  displayName: string;      // MUST be 1-100 chars, no HTML
  role?: "user" | "admin";  // Default: "user"
}
```

### What to Define

For each endpoint:
1. **HTTP method and path** (e.g., POST /api/users)
2. **Request body** (fields, types, constraints, defaults)
3. **Success response** (status code, body shape)
4. **Error responses** (each error code with its status and body)
5. **Headers** (Authorization, Content-Type, custom headers)

### Error Response Convention

```typescript
interface ApiError {
  error: string;                     // Machine-readable code: "INVALID_CREDENTIALS"
  message: string;                   // Human-readable: "The email or password is incorrect."
  details?: Record<string, string>;  // Field-level errors for validation
}
```

Always include:
- 400 for validation errors
- 401 for authentication failures
- 403 for authorization failures
- 404 for not found
- 409 for conflicts
- 429 for rate limiting
- 500 for unexpected errors (keep it generic — do not leak internals)

---

## Section 8: Data Models

### Table Format

```markdown
### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | PK, auto-generated, immutable |
| email | varchar(255) | Unique, not null, valid email |
| passwordHash | varchar(60) | Not null, bcrypt, never in API responses |
| displayName | varchar(100) | Not null |
| role | enum('user','admin') | Default: 'user' |
| createdAt | timestamp | UTC, immutable, auto-set |
| updatedAt | timestamp | UTC, auto-updated |
| deletedAt | timestamp | Null unless soft-deleted |
```

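A table in this format maps to DDL nearly one-to-one, which is a good smoke test of the constraints column. A sketch using SQLite for portability (SQLite has no native UUID or enum types, so CHECK constraints and TEXT stand in; a production schema would use the real database's types):

```python
# Sketch: the User data model expressed as DDL. SQLite lacks
# UUID/enum types, so TEXT + CHECK constraints stand in.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id           TEXT PRIMARY KEY,      -- UUID, immutable
    email        TEXT NOT NULL UNIQUE,  -- varchar(255), valid email
    passwordHash TEXT NOT NULL,         -- bcrypt, never in API responses
    displayName  TEXT NOT NULL,         -- varchar(100)
    role         TEXT NOT NULL DEFAULT 'user'
                 CHECK (role IN ('user', 'admin')),
    createdAt    TEXT NOT NULL DEFAULT (datetime('now')),
    updatedAt    TEXT NOT NULL DEFAULT (datetime('now')),
    deletedAt    TEXT                   -- NULL unless soft-deleted
);
""")

conn.execute(
    "INSERT INTO users (id, email, passwordHash, displayName) VALUES (?, ?, ?, ?)",
    ("u1", "user@example.com", "$2b$12$...", "Jane"),
)
row = conn.execute("SELECT role FROM users WHERE id = 'u1'").fetchone()
print(row[0])  # user
```

Translating the table this way surfaces missing constraints early: if a rule from the requirements (unique email, default role) cannot be written as DDL, the model row is probably underspecified.
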
### Rules

1. **Every entity in requirements MUST have a data model.** If FR-1 mentions "users", there must be a User model.
2. **Constraints MUST match requirements.** If FR-2 says passwords >= 8 chars, the model must note that.
3. **Include indexes.** If NFR-P1 says < 500ms queries, note which fields need indexes.
4. **Specify soft vs. hard delete.** State it explicitly.

---

## Section 9: Out of Scope

### Why This Section Matters

Out of Scope prevents scope creep during implementation. When someone says "while you're in there, could you also..." — point them to this section.

### Format

```markdown
- OS-1: Multi-factor authentication — Planned for Q3 (SPEC-045).
- OS-2: Social login beyond Google/GitHub — Insufficient user demand (< 2% requests).
- OS-3: Admin impersonation — Security review pending. Separate spec required.
- OS-4: Password strength meter UI — Nice-to-have, deferred to design sprint 12.
```

### Rules

1. **Every feature discussed and rejected MUST be listed.** This creates a paper trail.
2. **Include the reason.** "Not now" is not a reason. "Insufficient demand (< 2% of requests)" is.
3. **Link to future specs** when the exclusion is a deferral, not a rejection.

---

## Feature-Type Templates

### CRUD Feature

Focus on: all 4 operations, validation rules, authorization, pagination for list endpoints.

```markdown
- FR-1: Users MUST be able to create a [resource] with [required fields].
- FR-2: Users MUST be able to read a [resource] by ID.
- FR-3: Users MUST be able to list [resources] with pagination (default: 20/page).
- FR-4: Users MUST be able to update [mutable fields] of their own [resources].
- FR-5: Users MUST be able to delete their own [resources] (soft delete).
- FR-6: Users MUST NOT be able to modify or delete other users' [resources].
```

### Integration Feature

Focus on: external API contract, retry/fallback behavior, data mapping, error propagation.

```markdown
- FR-1: The system MUST call [external API] to [purpose].
- FR-2: The system MUST retry failed calls up to 3 times with exponential backoff.
- FR-3: The system MUST map [external field] to [internal field].
- FR-4: The system MUST NOT expose external API errors directly to users.
- EC-1: External API returns 5xx → Log error, return cached data if < 1h old, else 503.
- EC-2: External API response schema changes → Log warning, reject unmappable fields.
```

### Migration Feature
|
||||
|
||||
Focus on: backward compatibility, rollback plan, data integrity, zero-downtime deployment.
|
||||
|
||||
```markdown
|
||||
- FR-1: The migration MUST transform [old schema] to [new schema].
|
||||
- FR-2: The migration MUST be reversible (rollback script required).
|
||||
- FR-3: The migration MUST NOT cause downtime exceeding 30 seconds.
|
||||
- FR-4: The migration MUST validate data integrity post-run (row count, checksum).
|
||||
- EC-1: Migration fails mid-way → Automatic rollback, alert ops team.
|
||||
- EC-2: New schema has stricter constraints → Log invalid rows, quarantine for manual review.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Checklist: Is This Spec Ready for Review?
|
||||
|
||||
- [ ] Every section is filled (or marked N/A with reason)
|
||||
- [ ] All requirements use FR-N, NFR-N numbering
|
||||
- [ ] RFC 2119 keywords are UPPERCASE
|
||||
- [ ] Every AC references at least one requirement
|
||||
- [ ] Every AC uses Given/When/Then
|
||||
- [ ] Edge cases cover each external dependency failure
|
||||
- [ ] API contracts define success AND error responses
|
||||
- [ ] Data models include all entities from requirements
|
||||
- [ ] Out of Scope lists items discussed and rejected
|
||||
- [ ] No placeholder text remains
|
||||
- [ ] Context includes evidence (metrics, tickets, research)
|
||||
- [ ] Status is "In Review" (not still "Draft")
|
||||
338
engineering/spec-driven-workflow/spec_generator.py
Normal file
@@ -0,0 +1,338 @@
#!/usr/bin/env python3
"""
Spec Generator - Generates a feature specification template from a name and description.

Produces a complete spec document with all required sections pre-filled with
guidance prompts. Output can be markdown or structured JSON.

No external dependencies - uses only the Python standard library.
"""

import argparse
import json
import sys
import textwrap
from datetime import date
from pathlib import Path
from typing import Any, Dict


SPEC_TEMPLATE = """\
# Spec: {name}

**Author:** [your name]
**Date:** {date}
**Status:** Draft
**Reviewers:** [list reviewers]
**Related specs:** [links to related specs, or "None"]

---

## Context

{context_prompt}

---

## Functional Requirements

_Use RFC 2119 keywords: MUST, MUST NOT, SHOULD, SHOULD NOT, MAY._
_Each requirement is a single, testable statement. Number sequentially._

- FR-1: The system MUST [describe required behavior].
- FR-2: The system MUST [describe another required behavior].
- FR-3: The system SHOULD [describe recommended behavior].
- FR-4: The system MAY [describe optional behavior].
- FR-5: The system MUST NOT [describe prohibited behavior].

---

## Non-Functional Requirements

### Performance
- NFR-P1: [Operation] MUST complete in < [threshold] (p95) under [conditions].
- NFR-P2: [Operation] SHOULD handle [throughput] requests per second.

### Security
- NFR-S1: All data in transit MUST be encrypted via TLS 1.2+.
- NFR-S2: The system MUST rate-limit [operation] to [limit] per [period] per [scope].

### Accessibility
- NFR-A1: [UI component] MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.

### Scalability
- NFR-SC1: The system SHOULD handle [number] concurrent [entities].

### Reliability
- NFR-R1: The [service] MUST maintain [percentage]% uptime.

---

## Acceptance Criteria

_Write in Given/When/Then (Gherkin) format._
_Each criterion MUST reference at least one FR-* or NFR-*._

### AC-1: [Descriptive name] (FR-1)
Given [precondition]
When [action]
Then [expected result]
And [additional assertion]

### AC-2: [Descriptive name] (FR-2)
Given [precondition]
When [action]
Then [expected result]

### AC-3: [Descriptive name] (NFR-S2)
Given [precondition]
When [action]
Then [expected result]
And [additional assertion]

---

## Edge Cases

_For every external dependency (API, database, file system, user input), specify at least one failure scenario._

- EC-1: [Input/condition] -> [expected behavior].
- EC-2: [Input/condition] -> [expected behavior].
- EC-3: [External service] is unavailable -> [expected behavior].
- EC-4: [Concurrent/race condition] -> [expected behavior].
- EC-5: [Boundary value] -> [expected behavior].

---

## API Contracts

_Define request/response shapes using TypeScript-style notation._
_Cover all endpoints referenced in functional requirements._

### [METHOD] [endpoint]

Request:
```typescript
interface [Name]Request {{
  field: string; // Description, constraints
  optional?: number; // Default: [value]
}}
```

Success Response ([status code]):
```typescript
interface [Name]Response {{
  id: string;
  field: string;
  createdAt: string; // ISO 8601
}}
```

Error Response ([status code]):
```typescript
interface [Name]Error {{
  error: "[ERROR_CODE]";
  message: string;
}}
```

---

## Data Models

_Define all entities referenced in requirements._

### [Entity Name]
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| [field] | [type] | [constraints] |
| createdAt | timestamp | UTC, immutable |
| updatedAt | timestamp | UTC, auto-updated |

---

## Out of Scope

_Explicit exclusions prevent scope creep. If someone asks for these during implementation, point them here._

- OS-1: [Feature/capability] — [reason for exclusion or link to future spec].
- OS-2: [Feature/capability] — [reason for exclusion].
- OS-3: [Feature/capability] — deferred to [version/sprint].

---

## Open Questions

_Track unresolved questions here. Each must be resolved before status moves to "Approved"._

- [ ] Q1: [Question] — Owner: [name], Due: [date]
- [ ] Q2: [Question] — Owner: [name], Due: [date]
"""


def generate_context_prompt(description: str) -> str:
    """Generate a context section prompt based on the provided description."""
    if description:
        return textwrap.dedent(f"""\
            {description}

            _Expand this context section to include:_
            _- Why does this feature exist? What problem does it solve?_
            _- What is the business motivation? (link to user research, support tickets, metrics)_
            _- What is the current state? (what exists today, what pain points exist)_
            _- 2-4 paragraphs maximum._""")
    return textwrap.dedent("""\
        _Why does this feature exist? What problem does it solve? What is the business
        motivation? Include links to user research, support tickets, or metrics that
        justify this work. 2-4 paragraphs maximum._""")


def generate_spec(name: str, description: str) -> str:
    """Generate a spec document from name and description."""
    context_prompt = generate_context_prompt(description)
    return SPEC_TEMPLATE.format(
        name=name,
        date=date.today().isoformat(),
        context_prompt=context_prompt,
    )


def generate_spec_json(name: str, description: str) -> Dict[str, Any]:
    """Generate a structured JSON representation of the spec template."""
    return {
        "spec": {
            "title": f"Spec: {name}",
            "metadata": {
                "author": "[your name]",
                "date": date.today().isoformat(),
                "status": "Draft",
                "reviewers": [],
                "related_specs": [],
            },
            "context": description or "[Describe why this feature exists]",
            "functional_requirements": [
                {"id": "FR-1", "keyword": "MUST", "description": "[describe required behavior]"},
                {"id": "FR-2", "keyword": "MUST", "description": "[describe another required behavior]"},
                {"id": "FR-3", "keyword": "SHOULD", "description": "[describe recommended behavior]"},
                {"id": "FR-4", "keyword": "MAY", "description": "[describe optional behavior]"},
                {"id": "FR-5", "keyword": "MUST NOT", "description": "[describe prohibited behavior]"},
            ],
            "non_functional_requirements": {
                "performance": [
                    {"id": "NFR-P1", "description": "[operation] MUST complete in < [threshold]"},
                ],
                "security": [
                    {"id": "NFR-S1", "description": "All data in transit MUST be encrypted via TLS 1.2+"},
                ],
                "accessibility": [
                    {"id": "NFR-A1", "description": "[UI component] MUST meet WCAG 2.1 AA"},
                ],
                "scalability": [
                    {"id": "NFR-SC1", "description": "[system] SHOULD handle [N] concurrent [entities]"},
                ],
                "reliability": [
                    {"id": "NFR-R1", "description": "[service] MUST maintain [N]% uptime"},
                ],
            },
            "acceptance_criteria": [
                {
                    "id": "AC-1",
                    "name": "[descriptive name]",
                    "references": ["FR-1"],
                    "given": "[precondition]",
                    "when": "[action]",
                    "then": "[expected result]",
                },
            ],
            "edge_cases": [
                {"id": "EC-1", "condition": "[input/condition]", "behavior": "[expected behavior]"},
            ],
            "api_contracts": [
                {
                    "method": "[METHOD]",
                    "endpoint": "[/api/path]",
                    "request_fields": [{"name": "field", "type": "string", "constraints": "[description]"}],
                    "success_response": {"status": 200, "fields": []},
                    "error_response": {"status": 400, "fields": []},
                },
            ],
            "data_models": [
                {
                    "name": "[Entity]",
                    "fields": [
                        {"name": "id", "type": "UUID", "constraints": "Primary key, auto-generated"},
                    ],
                },
            ],
            "out_of_scope": [
                {"id": "OS-1", "description": "[feature/capability]", "reason": "[reason]"},
            ],
            "open_questions": [],
        },
        "metadata": {
            "generated_by": "spec_generator.py",
            "feature_name": name,
            "feature_description": description,
        },
    }


def main():
    parser = argparse.ArgumentParser(
        description="Generate a feature specification template from a name and description.",
        epilog="Example: python spec_generator.py --name 'User Auth' --description 'OAuth 2.0 login flow'",
    )
    parser.add_argument(
        "--name",
        required=True,
        help="Feature name (used as spec title)",
    )
    parser.add_argument(
        "--description",
        default="",
        help="Brief feature description (used to seed the context section)",
    )
    parser.add_argument(
        "--output",
        "-o",
        default=None,
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--format",
        choices=["md", "json"],
        default="md",
        help="Output format: md (markdown) or json (default: md)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_flag",
        help="Shorthand for --format json",
    )

    args = parser.parse_args()

    output_format = "json" if args.json_flag else args.format

    if output_format == "json":
        result = generate_spec_json(args.name, args.description)
        output = json.dumps(result, indent=2)
    else:
        output = generate_spec(args.name, args.description)

    if args.output:
        out_path = Path(args.output)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(output, encoding="utf-8")
        print(f"Spec template written to {out_path}", file=sys.stderr)
    else:
        print(output)

    sys.exit(0)


if __name__ == "__main__":
    main()
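At its core, spec generation here is a single `str.format` pass over `SPEC_TEMPLATE`. A minimal sketch of the same mechanism — the `MINI_TEMPLATE` constant below is illustrative only, not the real template:

```python
from datetime import date

# Illustrative stand-in for SPEC_TEMPLATE; the real constant has many more sections.
MINI_TEMPLATE = """\
# Spec: {name}

**Date:** {date}
**Status:** Draft
"""

def render(name: str) -> str:
    # Same mechanism as generate_spec(): fill named placeholders via str.format.
    return MINI_TEMPLATE.format(name=name, date=date.today().isoformat())

print(render("User Auth").splitlines()[0])  # prints "# Spec: User Auth"
```

Because the braces inside the template's TypeScript examples are doubled (`{{`, `}}`), `str.format` leaves them as literal braces while still filling `{name}`, `{date}`, and `{context_prompt}`.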
461
engineering/spec-driven-workflow/spec_validator.py
Normal file
@@ -0,0 +1,461 @@
#!/usr/bin/env python3
"""
Spec Validator - Validates a feature specification for completeness and quality.

Checks that a spec document contains all required sections, uses RFC 2119 keywords
correctly, has acceptance criteria in Given/When/Then format, and scores overall
completeness from 0-100.

Sections checked:
- Context, Functional Requirements, Non-Functional Requirements
- Acceptance Criteria, Edge Cases, API Contracts, Data Models, Out of Scope

Exit codes: 0 = pass, 1 = warnings, 2 = critical (or --strict with score < 80)

No external dependencies - uses only the Python standard library.
"""

import argparse
import json
import re
import sys
from pathlib import Path
from typing import Any, Dict, List


# Section definitions: (key, display_name, required_header_patterns, weight)
SECTIONS = [
    ("context", "Context", [r"^##\s+Context"], 10),
    ("functional_requirements", "Functional Requirements", [r"^##\s+Functional\s+Requirements"], 15),
    ("non_functional_requirements", "Non-Functional Requirements", [r"^##\s+Non-Functional\s+Requirements"], 10),
    ("acceptance_criteria", "Acceptance Criteria", [r"^##\s+Acceptance\s+Criteria"], 20),
    ("edge_cases", "Edge Cases", [r"^##\s+Edge\s+Cases"], 10),
    ("api_contracts", "API Contracts", [r"^##\s+API\s+Contracts"], 10),
    ("data_models", "Data Models", [r"^##\s+Data\s+Models"], 10),
    ("out_of_scope", "Out of Scope", [r"^##\s+Out\s+of\s+Scope"], 10),
    ("metadata", "Metadata (Author/Date/Status)", [r"\*\*Author:\*\*", r"\*\*Date:\*\*", r"\*\*Status:\*\*"], 5),
]

RFC_KEYWORDS = ["MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY"]

# Patterns that indicate placeholder/unfilled content
PLACEHOLDER_PATTERNS = [
    r"\[your\s+name\]",
    r"\[list\s+reviewers\]",
    r"\[describe\s+",
    r"\[input/condition\]",
    r"\[precondition\]",
    r"\[action\]",
    r"\[expected\s+result\]",
    r"\[feature/capability\]",
    r"\[operation\]",
    r"\[threshold\]",
    r"\[UI\s+component\]",
    r"\[service\]",
    r"\[percentage\]",
    r"\[number\]",
    r"\[METHOD\]",
    r"\[endpoint\]",
    r"\[Name\]",
    r"\[Entity\s+Name\]",
    r"\[type\]",
    r"\[constraints\]",
    r"\[field\]",
    r"\[reason\]",
]


class SpecValidator:
    """Validates a spec document for completeness and quality."""

    def __init__(self, content: str, file_path: str = ""):
        self.content = content
        self.file_path = file_path
        self.lines = content.split("\n")
        self.findings: List[Dict[str, Any]] = []
        self.section_scores: Dict[str, Dict[str, Any]] = {}

    def validate(self) -> Dict[str, Any]:
        """Run all validation checks and return results."""
        self._check_sections_present()
        self._check_functional_requirements()
        self._check_acceptance_criteria()
        self._check_edge_cases()
        self._check_rfc_keywords()
        self._check_api_contracts()
        self._check_data_models()
        self._check_out_of_scope()
        self._check_placeholders()
        self._check_traceability()

        total_score = self._calculate_score()

        return {
            "file": self.file_path,
            "score": total_score,
            "grade": self._score_to_grade(total_score),
            "sections": self.section_scores,
            "findings": self.findings,
            "summary": self._build_summary(total_score),
        }

    def _add_finding(self, severity: str, section: str, message: str):
        """Record a validation finding."""
        self.findings.append({
            "severity": severity,  # "error", "warning", "info"
            "section": section,
            "message": message,
        })

    def _find_section_content(self, header_pattern: str) -> str:
        """Extract content between a section header and the next ## header."""
        in_section = False
        section_lines = []
        for line in self.lines:
            if re.match(header_pattern, line, re.IGNORECASE):
                in_section = True
                continue
            if in_section and re.match(r"^##\s+", line):
                break
            if in_section:
                section_lines.append(line)
        return "\n".join(section_lines)

    def _check_sections_present(self):
        """Check that all required sections exist."""
        for key, name, patterns, weight in SECTIONS:
            found = False
            for pattern in patterns:
                for line in self.lines:
                    if re.search(pattern, line, re.IGNORECASE):
                        found = True
                        break
                if found:
                    break

            if found:
                self.section_scores[key] = {"name": name, "present": True, "score": weight, "max": weight}
            else:
                self.section_scores[key] = {"name": name, "present": False, "score": 0, "max": weight}
                self._add_finding("error", key, f"Missing section: {name}")

    def _check_functional_requirements(self):
        """Validate functional requirements format and content."""
        content = self._find_section_content(r"^##\s+Functional\s+Requirements")
        if not content.strip():
            return

        fr_pattern = re.compile(r"-\s+FR-(\d+):")
        matches = fr_pattern.findall(content)

        if not matches:
            self._add_finding("error", "functional_requirements", "No numbered requirements found (expected FR-N: format)")
            if "functional_requirements" in self.section_scores:
                self.section_scores["functional_requirements"]["score"] = max(
                    0, self.section_scores["functional_requirements"]["score"] - 10
                )
            return

        fr_count = len(matches)
        if fr_count < 3:
            self._add_finding("warning", "functional_requirements", f"Only {fr_count} requirements found. Most features need 3+.")

        # Check for RFC keywords
        has_keyword = False
        for kw in RFC_KEYWORDS:
            if kw in content:
                has_keyword = True
                break
        if not has_keyword:
            self._add_finding("warning", "functional_requirements", "No RFC 2119 keywords (MUST/SHOULD/MAY) found.")

    def _check_acceptance_criteria(self):
        """Validate acceptance criteria use Given/When/Then format."""
        content = self._find_section_content(r"^##\s+Acceptance\s+Criteria")
        if not content.strip():
            return

        ac_pattern = re.compile(r"###\s+AC-(\d+):")
        matches = ac_pattern.findall(content)

        if not matches:
            self._add_finding("error", "acceptance_criteria", "No numbered acceptance criteria found (expected ### AC-N: format)")
            if "acceptance_criteria" in self.section_scores:
                self.section_scores["acceptance_criteria"]["score"] = max(
                    0, self.section_scores["acceptance_criteria"]["score"] - 15
                )
            return

        ac_count = len(matches)

        # Check Given/When/Then
        given_count = len(re.findall(r"(?i)\bgiven\b", content))
        when_count = len(re.findall(r"(?i)\bwhen\b", content))
        then_count = len(re.findall(r"(?i)\bthen\b", content))

        if given_count < ac_count:
            self._add_finding("warning", "acceptance_criteria",
                              f"Found {ac_count} criteria but only {given_count} 'Given' clauses. Each AC needs Given/When/Then.")
        if when_count < ac_count:
            self._add_finding("warning", "acceptance_criteria",
                              f"Found {ac_count} criteria but only {when_count} 'When' clauses.")
        if then_count < ac_count:
            self._add_finding("warning", "acceptance_criteria",
                              f"Found {ac_count} criteria but only {then_count} 'Then' clauses.")

        # Check for FR references
        fr_refs = re.findall(r"\(FR-\d+", content)
        if not fr_refs:
            self._add_finding("warning", "acceptance_criteria",
                              "No acceptance criteria reference functional requirements (expected (FR-N) in title).")

    def _check_edge_cases(self):
        """Validate edge cases section."""
        content = self._find_section_content(r"^##\s+Edge\s+Cases")
        if not content.strip():
            return

        ec_pattern = re.compile(r"-\s+EC-(\d+):")
        matches = ec_pattern.findall(content)

        if not matches:
            self._add_finding("warning", "edge_cases", "No numbered edge cases found (expected EC-N: format)")
        elif len(matches) < 3:
            self._add_finding("warning", "edge_cases", f"Only {len(matches)} edge cases. Consider failure modes for each external dependency.")

    def _check_rfc_keywords(self):
        """Check RFC 2119 keywords are used consistently (capitalized)."""
        # Look for lowercase must/should/may that might be intended as RFC keywords
        context_content = self._find_section_content(r"^##\s+Functional\s+Requirements")
        context_content += self._find_section_content(r"^##\s+Non-Functional\s+Requirements")

        for kw in ["must", "should", "may"]:
            # Find lowercase usage in requirement-like sentences
            pattern = rf"(?:system|service|API|endpoint)\s+{kw}\s+"
            if re.search(pattern, context_content):
                self._add_finding("warning", "rfc_keywords",
                                  f"Found lowercase '{kw}' in requirements. RFC 2119 keywords should be UPPERCASE: {kw.upper()}")

    def _check_api_contracts(self):
        """Validate API contracts section."""
        content = self._find_section_content(r"^##\s+API\s+Contracts")
        if not content.strip():
            return

        # Check for at least one endpoint definition
        has_endpoint = bool(re.search(r"(GET|POST|PUT|PATCH|DELETE)\s+/", content))
        if not has_endpoint:
            self._add_finding("warning", "api_contracts", "No HTTP method + path found (expected e.g., POST /api/endpoint)")

        # Check for request/response definitions
        has_interface = bool(re.search(r"interface\s+\w+", content))
        if not has_interface:
            self._add_finding("info", "api_contracts", "No TypeScript interfaces found. Consider defining request/response shapes.")

    def _check_data_models(self):
        """Validate data models section."""
        content = self._find_section_content(r"^##\s+Data\s+Models")
        if not content.strip():
            return

        # Check for table format
        has_table = bool(re.search(r"\|.*\|.*\|", content))
        if not has_table:
            self._add_finding("warning", "data_models", "No table-formatted data models found. Use | Field | Type | Constraints | format.")

    def _check_out_of_scope(self):
        """Validate out of scope section."""
        content = self._find_section_content(r"^##\s+Out\s+of\s+Scope")
        if not content.strip():
            return

        os_pattern = re.compile(r"-\s+OS-(\d+):")
        matches = os_pattern.findall(content)

        if not matches:
            self._add_finding("warning", "out_of_scope", "No numbered exclusions found (expected OS-N: format)")
        elif len(matches) < 2:
            self._add_finding("info", "out_of_scope", "Only 1 exclusion listed. Consider what was deliberately left out.")

    def _check_placeholders(self):
        """Check for unfilled placeholder text."""
        placeholder_count = 0
        for pattern in PLACEHOLDER_PATTERNS:
            matches = re.findall(pattern, self.content, re.IGNORECASE)
            placeholder_count += len(matches)

        if placeholder_count > 0:
            self._add_finding("warning", "placeholders",
                              f"Found {placeholder_count} placeholder(s) that need to be filled in (e.g., [your name], [describe ...]).")
            # Deduct from overall score proportionally
            for key in self.section_scores:
                if self.section_scores[key]["present"]:
                    deduction = min(3, self.section_scores[key]["score"])
                    self.section_scores[key]["score"] = max(0, self.section_scores[key]["score"] - deduction)

    def _check_traceability(self):
        """Check that acceptance criteria reference functional requirements."""
        ac_content = self._find_section_content(r"^##\s+Acceptance\s+Criteria")
        fr_content = self._find_section_content(r"^##\s+Functional\s+Requirements")

        if not ac_content.strip() or not fr_content.strip():
            return

        # Extract FR IDs
        fr_ids = set(re.findall(r"FR-(\d+)", fr_content))
        # Extract FR references from AC
        ac_fr_refs = set(re.findall(r"FR-(\d+)", ac_content))

        unreferenced = fr_ids - ac_fr_refs
        if unreferenced:
            unreferenced_list = ", ".join(f"FR-{i}" for i in sorted(unreferenced))
            self._add_finding("warning", "traceability",
                              f"Functional requirements without acceptance criteria: {unreferenced_list}")

    def _calculate_score(self) -> int:
        """Calculate the total completeness score."""
        total = sum(s["score"] for s in self.section_scores.values())
        maximum = sum(s["max"] for s in self.section_scores.values())

        if maximum == 0:
            return 0

        # Apply finding-based deductions
        error_count = sum(1 for f in self.findings if f["severity"] == "error")
        warning_count = sum(1 for f in self.findings if f["severity"] == "warning")

        base_score = round((total / maximum) * 100)
        deduction = (error_count * 5) + (warning_count * 2)

        return max(0, min(100, base_score - deduction))

    @staticmethod
    def _score_to_grade(score: int) -> str:
        """Convert score to letter grade."""
        if score >= 90:
            return "A"
        if score >= 80:
            return "B"
        if score >= 70:
            return "C"
        if score >= 60:
            return "D"
        return "F"

    def _build_summary(self, score: int) -> str:
        """Build a human-readable summary."""
        errors = [f for f in self.findings if f["severity"] == "error"]
        warnings = [f for f in self.findings if f["severity"] == "warning"]
        infos = [f for f in self.findings if f["severity"] == "info"]

        lines = [
            f"Spec Completeness Score: {score}/100 (Grade: {self._score_to_grade(score)})",
            f"Errors: {len(errors)}, Warnings: {len(warnings)}, Info: {len(infos)}",
            "",
        ]

        if errors:
            lines.append("ERRORS (must fix):")
            for e in errors:
                lines.append(f"  [{e['section']}] {e['message']}")
            lines.append("")

        if warnings:
            lines.append("WARNINGS (should fix):")
            for w in warnings:
                lines.append(f"  [{w['section']}] {w['message']}")
            lines.append("")

        if infos:
            lines.append("INFO:")
            for i in infos:
                lines.append(f"  [{i['section']}] {i['message']}")
            lines.append("")

        # Section breakdown
        lines.append("Section Breakdown:")
        for key, data in self.section_scores.items():
            status = "PRESENT" if data["present"] else "MISSING"
            lines.append(f"  {data['name']}: {data['score']}/{data['max']} ({status})")

        return "\n".join(lines)


def format_human(result: Dict[str, Any]) -> str:
    """Format validation result for human reading."""
    lines = [
        "=" * 60,
        "SPEC VALIDATION REPORT",
        "=" * 60,
        "",
    ]
    if result["file"]:
        lines.append(f"File: {result['file']}")
        lines.append("")

    lines.append(result["summary"])

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Validate a feature specification for completeness and quality.",
        epilog="Example: python spec_validator.py --file spec.md --strict",
    )
    parser.add_argument(
        "--file",
        "-f",
        required=True,
        help="Path to the spec markdown file",
    )
    parser.add_argument(
        "--strict",
        action="store_true",
        help="Exit with code 2 if score is below 80",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_flag",
        help="Output results as JSON",
    )

    args = parser.parse_args()

    file_path = Path(args.file)
    if not file_path.exists():
        print(f"Error: File not found: {file_path}", file=sys.stderr)
        sys.exit(2)

    content = file_path.read_text(encoding="utf-8")

    if not content.strip():
        print(f"Error: File is empty: {file_path}", file=sys.stderr)
        sys.exit(2)

    validator = SpecValidator(content, str(file_path))
    result = validator.validate()

    if args.json_flag:
        print(json.dumps(result, indent=2))
    else:
        print(format_human(result))

    # Determine exit code
    score = result["score"]
    has_errors = any(f["severity"] == "error" for f in result["findings"])
    has_warnings = any(f["severity"] == "warning" for f in result["findings"])

    if args.strict and score < 80:
        sys.exit(2)
    elif has_errors:
        sys.exit(2)
    elif has_warnings:
        sys.exit(1)
    else:
        sys.exit(0)


if __name__ == "__main__":
    main()
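The completeness score reduces to simple arithmetic: the weighted section ratio scaled to 100, minus 5 points per error and 2 per warning, clamped to 0-100. A self-contained restatement of `_calculate_score` (the sample section weights below are made up for illustration):

```python
def calculate_score(sections, errors, warnings):
    # Mirrors SpecValidator._calculate_score: weighted ratio minus per-finding penalties.
    total = sum(s["score"] for s in sections)
    maximum = sum(s["max"] for s in sections)
    if maximum == 0:
        return 0
    base = round((total / maximum) * 100)
    return max(0, min(100, base - errors * 5 - warnings * 2))

# Hypothetical spec: one full-credit section, one at quarter credit.
sections = [{"score": 10, "max": 10}, {"score": 5, "max": 20}]
print(calculate_score(sections, errors=1, warnings=2))  # 50 - 9 = 41
```

Because penalties are subtracted after the ratio is scaled, a spec with every section present can still fall below the `--strict` threshold of 80 on findings alone.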
431
engineering/spec-driven-workflow/test_extractor.py
Normal file
@@ -0,0 +1,431 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Test Extractor - Extracts test case stubs from a feature specification.
|
||||
|
||||
Parses acceptance criteria (Given/When/Then) and edge cases from a spec
|
||||
document, then generates test stubs for the specified framework.
|
||||
|
||||
Supported frameworks: pytest, jest, go-test
|
||||
|
||||
Exit codes: 0 = success, 1 = warnings (some criteria unparseable), 2 = critical error
|
||||
|
||||
No external dependencies - uses only Python standard library.
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
import textwrap
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Any, Optional, Tuple
|
||||
|
||||
|
||||
class SpecParser:
    """Parses spec documents to extract testable criteria."""

    def __init__(self, content: str):
        self.content = content
        self.lines = content.split("\n")

    def extract_acceptance_criteria(self) -> List[Dict[str, Any]]:
        """Extract AC-N blocks with Given/When/Then clauses."""
        criteria = []
        ac_pattern = re.compile(r"###\s+AC-(\d+):\s*(.+?)(?:\s*\(([^)]+)\))?\s*$")

        in_ac = False
        current_ac: Optional[Dict[str, Any]] = None
        body_lines: List[str] = []

        for line in self.lines:
            match = ac_pattern.match(line)
            if match:
                # Save the previous AC before starting a new one
                if current_ac is not None:
                    current_ac["body"] = "\n".join(body_lines).strip()
                    self._parse_gwt(current_ac)
                    criteria.append(current_ac)

                ac_id = int(match.group(1))
                name = match.group(2).strip()
                refs = match.group(3).strip() if match.group(3) else ""

                current_ac = {
                    "id": f"AC-{ac_id}",
                    "name": name,
                    "references": [r.strip() for r in refs.split(",") if r.strip()] if refs else [],
                    "given": "",
                    "when": "",
                    "then": [],
                    "body": "",
                }
                body_lines = []
                in_ac = True
            elif in_ac:
                # A new ## section (but not another ### heading) ends the AC block
                if re.match(r"^##\s+", line) and not re.match(r"^###\s+", line):
                    in_ac = False
                    if current_ac is not None:
                        current_ac["body"] = "\n".join(body_lines).strip()
                        self._parse_gwt(current_ac)
                        criteria.append(current_ac)
                        current_ac = None
                else:
                    body_lines.append(line)

        # Don't forget the last AC in the document
        if current_ac is not None:
            current_ac["body"] = "\n".join(body_lines).strip()
            self._parse_gwt(current_ac)
            criteria.append(current_ac)

        return criteria
    def extract_edge_cases(self) -> List[Dict[str, Any]]:
        """Extract EC-N edge case items."""
        edge_cases = []
        ec_pattern = re.compile(r"-\s+EC-(\d+):\s*(.+?)(?:\s*->\s*|\s*→\s*)(.+)")

        in_section = False
        for line in self.lines:
            if re.match(r"^##\s+Edge\s+Cases", line, re.IGNORECASE):
                in_section = True
                continue
            if in_section and re.match(r"^##\s+", line):
                break
            if in_section:
                match = ec_pattern.match(line.strip())
                if match:
                    edge_cases.append({
                        "id": f"EC-{match.group(1)}",
                        "condition": match.group(2).strip().rstrip("."),
                        "behavior": match.group(3).strip().rstrip("."),
                    })

        return edge_cases
    def extract_spec_title(self) -> str:
        """Extract the spec title from the first H1."""
        for line in self.lines:
            match = re.match(r"^#\s+(?:Spec:\s*)?(.+)", line)
            if match:
                return match.group(1).strip()
        return "UnknownFeature"
    @staticmethod
    def _parse_gwt(ac: Dict[str, Any]):
        """Parse Given/When/Then clauses from the AC body text."""
        body = ac["body"]
        lines = body.split("\n")

        current_section = None
        for line in lines:
            stripped = line.strip()
            if not stripped:
                continue

            lower = stripped.lower()
            if lower.startswith("given "):
                current_section = "given"
                ac["given"] = stripped[6:].strip()
            elif lower.startswith("when "):
                current_section = "when"
                ac["when"] = stripped[5:].strip()
            elif lower.startswith("then "):
                current_section = "then"
                ac["then"].append(stripped[5:].strip())
            elif lower.startswith("and "):
                # "And" continues whichever clause preceded it
                if current_section == "then":
                    ac["then"].append(stripped[4:].strip())
                elif current_section == "given":
                    ac["given"] += " AND " + stripped[4:].strip()
                elif current_section == "when":
                    ac["when"] += " AND " + stripped[4:].strip()


def _sanitize_name(name: str) -> str:
    """Convert a human-readable name to a valid function/method name."""
    # Remove parenthetical references like (FR-1)
    name = re.sub(r"\([^)]*\)", "", name)
    # Replace runs of non-alphanumeric characters with a single underscore
    name = re.sub(r"[^a-zA-Z0-9]+", "_", name)
    # Remove leading/trailing underscores and lowercase the result
    name = name.strip("_").lower()
    return name or "unnamed"


def _to_pascal_case(name: str) -> str:
    """Convert to PascalCase for Go test names."""
    parts = _sanitize_name(name).split("_")
    return "".join(p.capitalize() for p in parts if p)


class PytestGenerator:
    """Generates pytest test stubs."""

    def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
        class_name = "Test" + _to_pascal_case(title)
        lines = [
            '"""',
            f"Test suite for: {title}",
            f"Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
            "",
            "All tests are stubs — implement the test body to make them pass.",
            '"""',
            "",
            "import pytest",
            "",
            "",
            f"class {class_name}:",
            f'    """Tests for {title}."""',
            "",
        ]

        for ac in criteria:
            method_name = f"test_{ac['id'].lower().replace('-', '')}_{_sanitize_name(ac['name'])}"
            docstring = f'{ac["id"]}: {ac["name"]}'
            ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""

            lines.append(f"    def {method_name}(self):")
            lines.append(f'        """{docstring}{ref_str}"""')

            if ac["given"]:
                lines.append(f"        # Given {ac['given']}")
            if ac["when"]:
                lines.append(f"        # When {ac['when']}")
            for t in ac["then"]:
                lines.append(f"        # Then {t}")

            lines.append('        raise NotImplementedError("Implement this test")')
            lines.append("")

        if edge_cases:
            lines.append("    # --- Edge Cases ---")
            lines.append("")

        for ec in edge_cases:
            method_name = f"test_{ec['id'].lower().replace('-', '')}_{_sanitize_name(ec['condition'])}"
            lines.append(f"    def {method_name}(self):")
            lines.append(f'        """{ec["id"]}: {ec["condition"]} -> {ec["behavior"]}"""')
            lines.append(f"        # Condition: {ec['condition']}")
            lines.append(f"        # Expected: {ec['behavior']}")
            lines.append('        raise NotImplementedError("Implement this test")')
            lines.append("")

        return "\n".join(lines)


class JestGenerator:
    """Generates Jest/Vitest test stubs (TypeScript)."""

    def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
        lines = [
            "/**",
            f" * Test suite for: {title}",
            f" * Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
            " *",
            " * All tests are stubs — implement the test body to make them pass.",
            " */",
            "",
            f'describe("{title}", () => {{',
        ]

        for ac in criteria:
            ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""
            test_name = f"{ac['id']}: {ac['name']}{ref_str}"

            lines.append(f'  it("{test_name}", () => {{')
            if ac["given"]:
                lines.append(f"    // Given {ac['given']}")
            if ac["when"]:
                lines.append(f"    // When {ac['when']}")
            for t in ac["then"]:
                lines.append(f"    // Then {t}")
            lines.append("")
            lines.append('    throw new Error("Not implemented");')
            lines.append("  });")
            lines.append("")

        if edge_cases:
            lines.append("  // --- Edge Cases ---")
            lines.append("")

        for ec in edge_cases:
            test_name = f"{ec['id']}: {ec['condition']}"
            lines.append(f'  it("{test_name}", () => {{')
            lines.append(f"    // Condition: {ec['condition']}")
            lines.append(f"    // Expected: {ec['behavior']}")
            lines.append("")
            lines.append('    throw new Error("Not implemented");')
            lines.append("  });")
            lines.append("")

        lines.append("});")
        lines.append("")

        return "\n".join(lines)


class GoTestGenerator:
    """Generates Go test stubs."""

    def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
        package_name = _sanitize_name(title).split("_")[0] or "feature"

        lines = [
            f"package {package_name}_test",
            "",
            "import (",
            '\t"testing"',
            ")",
            "",
            f"// Test suite for: {title}",
            f"// Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
            "// All tests are stubs — implement the test body to make them pass.",
            "",
        ]

        for ac in criteria:
            func_name = "Test" + _to_pascal_case(ac["id"] + " " + ac["name"])
            ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""

            lines.append(f"// {ac['id']}: {ac['name']}{ref_str}")
            lines.append(f"func {func_name}(t *testing.T) {{")

            if ac["given"]:
                lines.append(f"\t// Given {ac['given']}")
            if ac["when"]:
                lines.append(f"\t// When {ac['when']}")
            for then_clause in ac["then"]:
                lines.append(f"\t// Then {then_clause}")

            lines.append("")
            lines.append('\tt.Fatal("Not implemented")')
            lines.append("}")
            lines.append("")

        if edge_cases:
            lines.append("// --- Edge Cases ---")
            lines.append("")

        for ec in edge_cases:
            func_name = "Test" + _to_pascal_case(ec["id"] + " " + ec["condition"])
            lines.append(f"// {ec['id']}: {ec['condition']} -> {ec['behavior']}")
            lines.append(f"func {func_name}(t *testing.T) {{")
            lines.append(f"\t// Condition: {ec['condition']}")
            lines.append(f"\t// Expected: {ec['behavior']}")
            lines.append("")
            lines.append('\tt.Fatal("Not implemented")')
            lines.append("}")
            lines.append("")

        return "\n".join(lines)


GENERATORS = {
    "pytest": PytestGenerator,
    "jest": JestGenerator,
    "go-test": GoTestGenerator,
}

FILE_EXTENSIONS = {
    "pytest": ".py",
    "jest": ".test.ts",
    "go-test": "_test.go",
}


def main():
    parser = argparse.ArgumentParser(
        description="Extract test case stubs from a feature specification.",
        epilog="Example: python test_extractor.py --file spec.md --framework pytest --output tests/test_feature.py",
    )
    parser.add_argument(
        "--file",
        "-f",
        required=True,
        help="Path to the spec markdown file",
    )
    parser.add_argument(
        "--framework",
        choices=list(GENERATORS.keys()),
        default="pytest",
        help="Target test framework (default: pytest)",
    )
    parser.add_argument(
        "--output",
        "-o",
        default=None,
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_flag",
        help="Output extracted criteria as JSON instead of test code",
    )

    args = parser.parse_args()

    file_path = Path(args.file)
    if not file_path.exists():
        print(f"Error: File not found: {file_path}", file=sys.stderr)
        sys.exit(2)

    content = file_path.read_text(encoding="utf-8")
    if not content.strip():
        print(f"Error: File is empty: {file_path}", file=sys.stderr)
        sys.exit(2)

    spec_parser = SpecParser(content)
    title = spec_parser.extract_spec_title()
    criteria = spec_parser.extract_acceptance_criteria()
    edge_cases = spec_parser.extract_edge_cases()

    if not criteria and not edge_cases:
        print("Error: No acceptance criteria or edge cases found in spec.", file=sys.stderr)
        sys.exit(2)

    warnings = []
    for ac in criteria:
        if not ac["given"] and not ac["when"]:
            warnings.append(f"{ac['id']}: Could not parse Given/When/Then — check format.")

    if args.json_flag:
        result = {
            "spec_title": title,
            "framework": args.framework,
            "acceptance_criteria": criteria,
            "edge_cases": edge_cases,
            "warnings": warnings,
            "counts": {
                "acceptance_criteria": len(criteria),
                "edge_cases": len(edge_cases),
                "total_test_cases": len(criteria) + len(edge_cases),
            },
        }
        output = json.dumps(result, indent=2)
    else:
        generator_class = GENERATORS[args.framework]
        generator = generator_class()
        output = generator.generate(title, criteria, edge_cases)

    if args.output:
        out_path = Path(args.output)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(output, encoding="utf-8")
        total = len(criteria) + len(edge_cases)
        print(f"Generated {total} test stubs -> {out_path}", file=sys.stderr)
    else:
        print(output)

    if warnings:
        for w in warnings:
            print(f"Warning: {w}", file=sys.stderr)
        sys.exit(1)

    sys.exit(0)


if __name__ == "__main__":
    main()
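For context, the spec format this extractor parses can be sketched with a minimal, hypothetical spec. The two regexes are copied from `SpecParser` above; the spec text itself is invented for illustration, and the snippet skips the section-gating that `extract_edge_cases` performs:

```python
import re

# Hypothetical spec in the expected format: "### AC-N:" headings with
# Given/When/Then lines, and "- EC-N: condition -> behavior" bullets.
spec = """# Spec: Login Throttling

## Acceptance Criteria

### AC-1: Lockout after failed attempts (FR-2)
Given a user with 4 failed login attempts
When they fail a fifth attempt
Then the account is locked for 15 minutes

## Edge Cases
- EC-1: Attempt counter at exactly the threshold -> lock immediately
"""

# Same patterns as SpecParser; MULTILINE lets $ anchor at each line end.
ac_pattern = re.compile(r"###\s+AC-(\d+):\s*(.+?)(?:\s*\(([^)]+)\))?\s*$", re.MULTILINE)
ec_pattern = re.compile(r"-\s+EC-(\d+):\s*(.+?)\s*->\s*(.+)")

acs = ac_pattern.findall(spec)  # (id, name, parenthesized references)
ecs = [ec_pattern.match(line.strip()).groups()
       for line in spec.splitlines() if line.strip().startswith("- EC-")]

print(acs)  # [('1', 'Lockout after failed attempts', 'FR-2')]
print(ecs)  # [('1', 'Attempt counter at exactly the threshold', 'lock immediately')]
```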