Merge pull request #405 from alirezarezvani/feature/sprint-phase-1-high
feat(engineering): add browser-automation and spec-driven-workflow skills
575
docs/skills/engineering/browser-automation.md
Normal file
@@ -0,0 +1,575 @@
---
title: "Browser Automation — Agent Skill for Codex & OpenClaw"
description: "Use when the user asks to automate browser tasks, scrape websites, fill forms, capture screenshots, extract structured data from web pages, or build browser-based data pipelines. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---

# Browser Automation

<div class="page-meta" markdown>
<span class="meta-badge">:material-rocket-launch: Engineering - POWERFUL</span>
<span class="meta-badge">:material-identifier: `browser-automation`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering/browser-automation/SKILL.md">Source</a></span>
</div>

<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-advanced-skills</code>
</div>

## Overview

The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation workflows using Playwright. This skill covers data extraction, form filling, screenshot capture, session management, and anti-detection patterns for reliable browser automation at scale.

**When to use this skill:**
- Scraping structured data from websites (tables, listings, search results)
- Automating multi-step browser workflows (login, fill forms, download files)
- Capturing screenshots or PDFs of web pages
- Extracting data from SPAs and JavaScript-heavy sites
- Building repeatable browser-based data pipelines

**When NOT to use this skill:**
- Writing browser tests or E2E test suites — use **playwright-pro** instead
- Testing API endpoints — use **api-test-suite-builder** instead
- Load testing or performance benchmarking — use **performance-profiler** instead

**Why Playwright over Selenium or Puppeteer:**
- **Auto-wait built in** — no explicit `sleep()` or `waitForElement()` needed for most actions
- **Multi-browser from one API** — Chromium, Firefox, WebKit with zero config changes
- **Network interception** — block ads, mock responses, capture API calls natively
- **Browser contexts** — isolated sessions without spinning up new browser instances
- **Codegen** — `playwright codegen` records your actions and generates scripts
- **Async-first** — Python async/await for high-throughput scraping

## Core Competencies

### 1. Web Scraping Patterns

#### DOM Extraction with CSS Selectors
CSS selectors are the primary tool for element targeting. Prefer them over XPath for readability and performance.

**Selector priority (most to least reliable):**
1. `data-testid`, `data-id`, or custom data attributes — stable across redesigns
2. `#id` selectors — unique but may change between deploys
3. Semantic selectors: `article`, `nav`, `main`, `section` — resilient to CSS changes
4. Class-based: `.product-card`, `.price` — brittle if classes are generated (e.g., CSS modules)
5. Positional: `nth-child()`, `nth-of-type()` — last resort, breaks on layout changes

**Compound selectors for precision:**
```python
# Product cards within a specific container
page.query_selector_all("div.search-results > article.product-card")

# Price inside a product card (scoped)
card.query_selector("span[data-field='price']")

# Links with specific text content
page.locator("a", has_text="Next Page")
```

#### XPath for Complex Traversal
Use XPath only when CSS cannot express the relationship:
```python
# Find element by text content (XPath strength)
page.locator("//td[contains(text(), 'Total')]/following-sibling::td[1]")

# Navigate up the DOM tree
page.locator("//span[@class='price']/ancestor::div[@class='product']")
```

#### Pagination Patterns
- **Next-button pagination**: Click "Next" until disabled or absent
- **URL-based pagination**: Increment `?page=N` or `&offset=N` in the URL
- **Infinite scroll**: Scroll to bottom, wait for new content, repeat until no change
- **Load-more button**: Click button, wait for DOM mutation, repeat
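
For URL-based pagination, generating the candidate URLs up front keeps the crawl loop trivial. A minimal sketch; the `page` parameter name and 1-based numbering are assumptions to adapt per site:

```python
def page_urls(base_url, param="page", start=1, max_pages=100):
    """Yield candidate page URLs; the caller stops once a page returns no items."""
    sep = "&" if "?" in base_url else "?"
    for n in range(start, start + max_pages):
        yield f"{base_url}{sep}{param}={n}"
```

The caller iterates until a fetched page yields zero items, which bounds the crawl even when the site does not report a total page count.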

#### Infinite Scroll Handling
```python
async def scroll_to_bottom(page, max_scrolls=50, pause_ms=1500):
    scrolls = 0
    previous_height = 0
    for _ in range(max_scrolls):
        current_height = await page.evaluate("document.body.scrollHeight")
        if current_height == previous_height:
            break
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(pause_ms)
        previous_height = current_height
        scrolls += 1
    return scrolls  # number of scrolls actually performed
```

### 2. Form Filling & Multi-Step Workflows

#### Login Flows
```python
async def login(page, url, username, password):
    await page.goto(url)
    await page.fill("input[name='username']", username)
    await page.fill("input[name='password']", password)
    await page.click("button[type='submit']")
    # Wait for navigation to complete (post-login redirect)
    await page.wait_for_url("**/dashboard**")
```

#### Multi-Page Forms
Break multi-step forms into discrete functions per step. Each function:
1. Fills the fields for that step
2. Clicks the "Next" or "Continue" button
3. Waits for the next step to load (URL change or DOM element)

```python
async def fill_step_1(page, data):
    await page.fill("#first-name", data["first_name"])
    await page.fill("#last-name", data["last_name"])
    await page.select_option("#country", data["country"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-2-form")

async def fill_step_2(page, data):
    await page.fill("#address", data["address"])
    await page.fill("#city", data["city"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-3-form")
```

#### File Uploads
```python
# Single file
await page.set_input_files("input[type='file']", "/path/to/file.pdf")

# Multiple files
await page.set_input_files("input[type='file']", [
    "/path/to/file1.pdf",
    "/path/to/file2.pdf"
])

# Drag-and-drop upload zones (no visible input element)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```

#### Dropdown and Select Handling
```python
# Native <select> element
await page.select_option("#country", value="US")
await page.select_option("#country", label="United States")

# Custom dropdown (div-based)
await page.click("div.dropdown-trigger")
await page.click("div.dropdown-option:has-text('United States')")
```

### 3. Screenshot & PDF Capture

#### Screenshot Strategies
```python
# Full page (scrolls automatically)
await page.screenshot(path="full-page.png", full_page=True)

# Viewport only (what's visible)
await page.screenshot(path="viewport.png")

# Specific element
element = page.locator("div.chart-container")
await element.screenshot(path="chart.png")

# With custom viewport for consistency
context = await browser.new_context(viewport={"width": 1920, "height": 1080})
```

#### PDF Generation
```python
# Only works in Chromium
await page.pdf(
    path="output.pdf",
    format="A4",
    margin={"top": "1cm", "right": "1cm", "bottom": "1cm", "left": "1cm"},
    print_background=True
)
```

#### Visual Regression Baselines
Take screenshots at known states and compare pixel-by-pixel. Store baselines in version control. Use a naming convention: `{page}_{viewport}_{state}.png`.
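
The comparison step can start as an exact-match check. A sketch, assuming byte-level equality is acceptable; a production setup would use a perceptual diff with a tolerance threshold to absorb antialiasing noise:

```python
import hashlib
from pathlib import Path

def baseline_path(page: str, viewport: str, state: str) -> str:
    # Naming convention from above: {page}_{viewport}_{state}.png
    return f"{page}_{viewport}_{state}.png"

def images_match(baseline, candidate) -> bool:
    # Exact byte-for-byte comparison of the two screenshot files.
    return (hashlib.sha256(Path(baseline).read_bytes()).digest()
            == hashlib.sha256(Path(candidate).read_bytes()).digest())
```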

### 4. Structured Data Extraction

#### Tables to JSON
```python
async def extract_table(page, selector):
    headers = await page.eval_on_selector_all(
        f"{selector} thead th",
        "elements => elements.map(e => e.textContent.trim())"
    )
    rows = await page.eval_on_selector_all(
        f"{selector} tbody tr",
        """rows => rows.map(row => {
            return Array.from(row.querySelectorAll('td'))
                .map(cell => cell.textContent.trim())
        })"""
    )
    return [dict(zip(headers, row)) for row in rows]
```

#### Listings to Arrays
```python
async def extract_listings(page, container_sel, field_map):
    """
    field_map example: {"title": "h3.title", "price": "span.price", "url": "a::attr(href)"}
    """
    items = []
    cards = await page.query_selector_all(container_sel)
    for card in cards:
        item = {}
        for field, sel in field_map.items():
            if "::attr(" in sel:
                attr_sel, attr_name = sel.split("::attr(")
                attr_name = attr_name.rstrip(")")
                el = await card.query_selector(attr_sel)
                item[field] = await el.get_attribute(attr_name) if el else None
            else:
                el = await card.query_selector(sel)
                item[field] = (await el.text_content()).strip() if el else None
        items.append(item)
    return items
```

#### Nested Data Extraction
For threaded content (comments with replies), use recursive extraction:
```python
async def extract_comments(scope, parent_selector):
    # `scope` is a Page on the first call and an ElementHandle on recursive
    # calls; both expose query_selector_all
    comments = []
    elements = await scope.query_selector_all(f"{parent_selector} > .comment")
    for el in elements:
        text = await (await el.query_selector(".comment-body")).text_content()
        author = await (await el.query_selector(".author")).text_content()
        replies = await extract_comments(el, ".replies")
        comments.append({
            "author": author.strip(),
            "text": text.strip(),
            "replies": replies
        })
    return comments
```

### 5. Cookie & Session Management

#### Save and Restore Sessions
```python
import json

# Save cookies after login
cookies = await context.cookies()
with open("session.json", "w") as f:
    json.dump(cookies, f)

# Restore session in new context
with open("session.json", "r") as f:
    cookies = json.load(f)
context = await browser.new_context()
await context.add_cookies(cookies)
```

#### Storage State (Cookies + Local Storage)
```python
# Save full state (cookies + localStorage + sessionStorage)
await context.storage_state(path="state.json")

# Restore full state
context = await browser.new_context(storage_state="state.json")
```

**Best practice:** Save state after login, reuse across scraping sessions. Check session validity before starting a long job — make a lightweight request to a protected page and verify you are not redirected to login.

### 6. Anti-Detection Patterns

Modern websites detect automation through multiple vectors. Address all of them:

#### User Agent Rotation
Never use the default Playwright user agent. Rotate through real browser user agents:
```python
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
```

#### Viewport and Screen Size
Set realistic viewport dimensions. The framework default (1280x720 in Playwright) is a known automation fingerprint:
```python
context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    screen={"width": 1920, "height": 1080},
    user_agent=random.choice(USER_AGENTS),
)
```

#### WebDriver Flag Removal
Automated browsers expose `navigator.webdriver = true`. Hide it:
```python
await page.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")
```

#### Request Throttling
Add human-like delays between actions:
```python
import asyncio
import random

async def human_delay(min_ms=500, max_ms=2000):
    # asyncio.sleep keeps the helper independent of any page object
    delay_ms = random.randint(min_ms, max_ms)
    await asyncio.sleep(delay_ms / 1000)
```

#### Proxy Support
```python
browser = await playwright.chromium.launch(
    proxy={"server": "http://proxy.example.com:8080"}
)
# Or per-context:
context = await browser.new_context(
    proxy={
        "server": "http://proxy.example.com:8080",
        "username": "user",
        "password": "pass",
    }
)
```

### 7. Dynamic Content Handling

#### SPA Rendering
SPAs render content client-side. Wait for the actual content, not the page load:
```python
await page.goto(url)
# Wait for the data to render, not just the shell
await page.wait_for_selector("div.product-list article", state="attached")
```

#### AJAX / Fetch Waiting
Intercept and wait for specific API calls:
```python
async with page.expect_response("**/api/products*") as response_info:
    await page.click("button.load-more")
response = await response_info.value
data = await response.json()  # you can use the API data directly
```

#### Shadow DOM Traversal
```python
# Playwright's CSS selectors pierce open shadow roots automatically;
# `>>` chains selectors
await page.locator("custom-element >> .inner-class").click()
```

#### Lazy-Loaded Images
Scroll elements into view to trigger lazy loading:
```python
images = await page.query_selector_all("img[data-src]")
for img in images:
    await img.scroll_into_view_if_needed()
    await page.wait_for_timeout(200)
```

### 8. Error Handling & Retry Logic

#### Retry Decorator Pattern
```python
import asyncio

async def with_retry(coro_factory, max_retries=3, backoff_base=2):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            wait = backoff_base ** attempt
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait}s...")
            await asyncio.sleep(wait)
```

#### Handling Common Failures
```python
from playwright.async_api import TimeoutError as PlaywrightTimeout

try:
    await page.click("button.submit", timeout=5000)
except PlaywrightTimeout:
    # Element did not appear — page structure may have changed
    # Try fallback selector
    await page.click("[type='submit']", timeout=5000)
except Exception:
    # Network error, browser crash, etc.
    await page.screenshot(path="error-state.png")
    raise
```

#### Rate Limit Detection
```python
import asyncio

async def check_rate_limit(response):
    if response.status == 429:
        retry_after = response.headers.get("retry-after", "60")
        wait_seconds = int(retry_after)  # assumes the delta-seconds form
        print(f"Rate limited. Waiting {wait_seconds}s...")
        await asyncio.sleep(wait_seconds)
        return True
    return False
```
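
The `int()` call handles only the delta-seconds form of Retry-After; RFC 9110 also allows an HTTP-date. A more defensive parser, sketched with stdlib helpers:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value, default=60):
    """Parse a Retry-After header: delta-seconds or an HTTP-date (RFC 9110)."""
    try:
        return max(0, int(value))
    except (TypeError, ValueError):
        pass
    try:
        dt = parsedate_to_datetime(value)
        return max(0, int((dt - datetime.now(timezone.utc)).total_seconds()))
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to a polite default
```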

## Workflows

### Workflow 1: Single-Page Data Extraction

**Scenario:** Extract product data from a single page with JavaScript-rendered content.

**Steps:**
1. Launch browser in headed mode during development (`headless=False`), switch to headless for production
2. Navigate to URL and wait for content selector
3. Extract data using `query_selector_all` with field mapping
4. Validate extracted data (check for nulls, expected types)
5. Output as JSON

```python
from playwright.async_api import async_playwright

async def extract_single_page(url, selectors):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 ..."
        )
        page = await context.new_page()
        await page.goto(url, wait_until="networkidle")
        data = await extract_listings(page, selectors["container"], selectors["fields"])
        await browser.close()
        return data
```

### Workflow 2: Multi-Page Scraping with Pagination

**Scenario:** Scrape search results across 50+ pages.

**Steps:**
1. Launch browser with anti-detection settings
2. Navigate to first page
3. Extract data from current page
4. Check if "Next" button exists and is enabled
5. Click next, wait for new content to load (not just navigation)
6. Repeat until no next page or max pages reached
7. Deduplicate results by unique key
8. Write output incrementally (don't hold everything in memory)

```python
async def scrape_paginated(base_url, selectors, max_pages=100):
    all_data = []
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await (await browser.new_context()).new_page()
        await page.goto(base_url)

        for page_num in range(max_pages):
            items = await extract_listings(page, selectors["container"], selectors["fields"])
            all_data.extend(items)

            next_btn = page.locator(selectors["next_button"])
            if await next_btn.count() == 0 or await next_btn.is_disabled():
                break

            await next_btn.click()
            await page.wait_for_selector(selectors["container"])
            await human_delay(800, 2000)

        await browser.close()
    return all_data
```
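
Steps 7 and 8 can be sketched with plain helpers; the `url` dedupe key and the JSONL output format are assumptions:

```python
import json

def dedupe(items, key="url"):
    """Drop items whose key was already seen, keeping the first occurrence."""
    seen = set()
    unique = []
    for item in items:
        k = item.get(key)
        if k in seen:
            continue
        seen.add(k)
        unique.append(item)
    return unique

def append_jsonl(items, path):
    """Append one JSON object per line, so a crash loses at most one page."""
    with open(path, "a", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")
```

Calling `append_jsonl` once per scraped page keeps memory flat regardless of how many pages the crawl covers.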

### Workflow 3: Authenticated Workflow Automation

**Scenario:** Log into a portal, navigate a multi-step form, download a report.

**Steps:**
1. Check for existing session state file
2. If no session, perform login and save state
3. Navigate to target page using saved session
4. Fill multi-step form with provided data
5. Wait for download to trigger
6. Save downloaded file to target directory

```python
import os

from playwright.async_api import async_playwright

async def authenticated_workflow(credentials, form_data, download_dir):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        state_file = "session_state.json"

        # Restore or create session
        if os.path.exists(state_file):
            context = await browser.new_context(storage_state=state_file)
        else:
            context = await browser.new_context()
            page = await context.new_page()
            await login(page, credentials["url"], credentials["user"], credentials["pass"])
            await context.storage_state(path=state_file)

        page = await context.new_page()
        await page.goto(form_data["target_url"])

        # Fill form steps
        for step_fn in [fill_step_1, fill_step_2]:
            await step_fn(page, form_data)

        # Handle download
        async with page.expect_download() as dl_info:
            await page.click("button:has-text('Download Report')")
        download = await dl_info.value
        await download.save_as(os.path.join(download_dir, download.suggested_filename))

        await browser.close()
```

## Tools Reference

| Script | Purpose | Key Flags | Output |
|--------|---------|-----------|--------|
| `scraping_toolkit.py` | Generate Playwright scraping script skeleton | `--url`, `--selectors`, `--paginate`, `--output` | Python script or JSON config |
| `form_automation_builder.py` | Generate form-fill automation script from field spec | `--fields`, `--url`, `--output` | Python automation script |
| `anti_detection_checker.py` | Audit a Playwright script for detection vectors | `--file`, `--verbose` | Risk report with score |

All scripts are stdlib-only. Run `python3 <script> --help` for full usage.

## Anti-Patterns

### Hardcoded Waits
**Bad:** `await page.wait_for_timeout(5000)` before every action.
**Good:** Use `wait_for_selector`, `wait_for_url`, `expect_response`, or `wait_for_load_state`. Hardcoded waits are flaky and slow.

### No Error Recovery
**Bad:** Linear script that crashes on first failure.
**Good:** Wrap each page interaction in try/except. Take error-state screenshots. Implement retry with exponential backoff.

### Ignoring robots.txt
**Bad:** Scraping without checking robots.txt directives.
**Good:** Fetch and parse robots.txt before scraping. Respect `Crawl-delay`. Skip disallowed paths. Add your bot name to the User-Agent if running at scale.
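
The stdlib's `urllib.robotparser` handles the directive parsing. A sketch of the allow-check, assuming robots.txt content has already been fetched:

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against already-fetched robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

The same parser object also exposes `crawl_delay(user_agent)` for honoring `Crawl-delay`.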

### Storing Credentials in Scripts
**Bad:** Hardcoding usernames and passwords in Python files.
**Good:** Use environment variables, `.env` files (gitignored), or a secrets manager. Pass credentials via CLI arguments.
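
A minimal environment-variable loader; the `PORTAL_*` variable names are illustrative:

```python
import os

def load_credentials(prefix="PORTAL"):
    """Read credentials from the environment instead of source code."""
    user = os.environ.get(f"{prefix}_USER")
    password = os.environ.get(f"{prefix}_PASS")
    if not user or not password:
        raise RuntimeError(f"Set {prefix}_USER and {prefix}_PASS before running")
    return {"user": user, "pass": password}
```

Failing fast with a clear message beats a cryptic `KeyError` halfway through an authenticated run.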

### No Rate Limiting
**Bad:** Hammering a site with 100 requests/second.
**Good:** Add random delays between requests (1-3s for polite scraping). Monitor for 429 responses. Implement exponential backoff.

### Selector Fragility
**Bad:** Relying on auto-generated class names (`.css-1a2b3c`) or deep nesting (`div > div > div > span:nth-child(3)`).
**Good:** Use data attributes, semantic HTML, or text-based locators. Test selectors in browser DevTools first.

### Not Cleaning Up Browser Instances
**Bad:** Launching browsers without closing them, leading to resource leaks.
**Good:** Always use `try/finally` or async context managers to ensure `browser.close()` is called.

### Running Headed in Production
**Bad:** Using `headless=False` in production/CI.
**Good:** Develop with headed mode for debugging, deploy with `headless=True`. Use an environment variable to toggle: `headless = os.environ.get("HEADLESS", "true") == "true"`.

## Cross-References

- **playwright-pro** — Browser testing skill. Use for E2E tests, test assertions, test fixtures. Browser Automation is for data extraction and workflow automation, not testing.
- **api-test-suite-builder** — When the website has a public API, hit the API directly instead of scraping the rendered page. Faster, more reliable, less detectable.
- **performance-profiler** — If your automation scripts are slow, profile the bottlenecks before adding concurrency.
- **env-secrets-manager** — For securely managing credentials used in authenticated automation workflows.
@@ -1,13 +1,13 @@
 ---
 title: "Engineering - POWERFUL Skills — Agent Skills & Codex Plugins"
-description: "44 engineering - powerful skills — advanced agent-native skill and Claude Code plugin for AI agent design, infrastructure, and automation. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
+description: "46 engineering - powerful skills — advanced agent-native skill and Claude Code plugin for AI agent design, infrastructure, and automation. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
 ---

 <div class="domain-header" markdown>

 # :material-rocket-launch: Engineering - POWERFUL

-<p class="domain-count">44 skills in this domain</p>
+<p class="domain-count">46 skills in this domain</p>

 </div>

@@ -53,6 +53,12 @@ description: "44 engineering - powerful skills — advanced agent-native skill a

 > You sleep. The agent experiments. You wake up to results.

+- **[Browser Automation - POWERFUL](browser-automation.md)**
+
+---
+
+The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation ...
+
 - **[Changelog Generator](changelog-generator.md)**

 ---

@@ -197,6 +203,12 @@ description: "44 engineering - powerful skills — advanced agent-native skill a

 ---

+- **[Spec-Driven Workflow — POWERFUL](spec-driven-workflow.md)**
+
+---
+
+Spec-driven workflow enforces a single, non-negotiable rule: write the specification BEFORE you write any code. Not a...
+
 - **[Tech Debt Tracker](tech-debt-tracker.md)**

 ---

597
docs/skills/engineering/spec-driven-workflow.md
Normal file
@@ -0,0 +1,597 @@
---
title: "Spec-Driven Workflow — Agent Skill for Codex & OpenClaw"
description: "Use when the user asks to write specs before code, define acceptance criteria, plan features before implementation, or generate tests from acceptance criteria. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---

# Spec-Driven Workflow

<div class="page-meta" markdown>
<span class="meta-badge">:material-rocket-launch: Engineering - POWERFUL</span>
<span class="meta-badge">:material-identifier: `spec-driven-workflow`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering/spec-driven-workflow/SKILL.md">Source</a></span>
</div>

<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-advanced-skills</code>
</div>

## Overview

Spec-driven workflow enforces a single, non-negotiable rule: **write the specification BEFORE you write any code.** Not alongside. Not after. Before.

This is not documentation. This is a contract. A spec defines what the system MUST do, what it SHOULD do, and what it explicitly WILL NOT do. Every line of code you write traces back to a requirement in the spec. Every test traces back to an acceptance criterion. If it is not in the spec, it does not get built.

### Why Spec-First Matters

1. **Eliminates rework.** 60-80% of defects originate from requirements, not implementation. Catching ambiguity in a spec costs minutes; catching it in production costs days.
2. **Forces clarity.** If you cannot write what the system should do in plain language, you do not understand the problem well enough to write code.
3. **Enables parallelism.** Once a spec is approved, frontend, backend, QA, and documentation can all start simultaneously.
4. **Creates accountability.** The spec is the definition of done. No arguments about whether a feature is "complete" — either it satisfies the acceptance criteria or it does not.
5. **Feeds TDD directly.** Acceptance criteria in Given/When/Then format translate 1:1 into test cases. The spec IS the test plan.
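
As a sketch of that 1:1 translation, a Given/When/Then criterion such as "expired tokens are rejected with 401" maps directly onto a test function. The `validate_token` helper and its error shape are hypothetical, invented only to show the mapping:

```python
# Given a token issued 25 hours ago / When it is used / Then 401 TOKEN_EXPIRED
def validate_token(issued_hours_ago, max_age_hours=24):
    # Hypothetical stand-in for the system under test
    if issued_hours_ago > max_age_hours:
        return {"status": 401, "error": "TOKEN_EXPIRED"}
    return {"status": 200, "error": None}

def test_expired_token_rejected():
    result = validate_token(issued_hours_ago=25)
    assert result["status"] == 401
    assert result["error"] == "TOKEN_EXPIRED"
```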

### The Iron Law

```
NO CODE WITHOUT AN APPROVED SPEC.
NO EXCEPTIONS. NO "QUICK PROTOTYPES." NO "I'LL DOCUMENT IT LATER."
```

If the spec is not written, reviewed, and approved, implementation does not begin. Period.

---

## The Spec Format

Every spec follows this structure. No sections are optional — if a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not forgotten.

### 1. Title and Context

```markdown
# Spec: [Feature Name]

**Author:** [name]
**Date:** [ISO 8601]
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** [list]
**Related specs:** [links]

## Context

[Why does this feature exist? What problem does it solve? What is the business
motivation? Include links to user research, support tickets, or metrics that
justify this work. 2-4 paragraphs maximum.]
```

### 2. Functional Requirements (RFC 2119)

Use RFC 2119 keywords precisely:

| Keyword | Meaning |
|---------|---------|
| **MUST** | Absolute requirement. Failing this means the implementation is non-conformant. |
| **MUST NOT** | Absolute prohibition. Doing this means the implementation is broken. |
| **SHOULD** | Recommended. May be omitted with documented justification. |
| **SHOULD NOT** | Discouraged. May be included with documented justification. |
| **MAY** | Optional. Purely at the implementer's discretion. |

```markdown
## Functional Requirements

- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
- FR-2: The system MUST reject tokens older than 24 hours.
- FR-3: The system SHOULD support refresh token rotation.
- FR-4: The system MAY cache user profiles for up to 5 minutes.
- FR-5: The system MUST NOT store plaintext passwords under any circumstance.
```

Number every requirement. Use the `FR-` prefix. Each requirement is a single, testable statement.
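
Both rules can be enforced mechanically. An illustrative lint sketch, assuming requirement lines follow the bulleted format shown above:

```python
import re

RFC2119_KEYWORDS = ("MUST NOT", "MUST", "SHOULD NOT", "SHOULD", "MAY")

def is_well_formed_requirement(line: str) -> bool:
    """An FR line needs an FR-n id and at least one RFC 2119 keyword."""
    has_id = re.match(r"-\s*FR-\d+:", line.strip()) is not None
    has_keyword = any(kw in line for kw in RFC2119_KEYWORDS)
    return has_id and has_keyword
```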

### 3. Non-Functional Requirements

```markdown
## Non-Functional Requirements

### Performance
- NFR-P1: Login flow MUST complete in < 500ms (p95) under normal load.
- NFR-P2: Token validation MUST complete in < 50ms (p99).

### Security
- NFR-S1: All tokens MUST be transmitted over TLS 1.2+.
- NFR-S2: The system MUST rate-limit login attempts to 5/minute per IP.

### Accessibility
- NFR-A1: Login form MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.

### Scalability
- NFR-SC1: The system SHOULD handle 10,000 concurrent sessions.

### Reliability
- NFR-R1: The authentication service MUST maintain 99.9% uptime.
```

### 4. Acceptance Criteria (Given/When/Then)

Every functional requirement maps to one or more acceptance criteria. Use Gherkin syntax:

```markdown
## Acceptance Criteria

### AC-1: Successful login (FR-1)
Given a user with valid credentials
When they submit the login form with correct email and password
Then they receive a valid access token
And they are redirected to the dashboard
And the login event is logged with timestamp and IP

### AC-2: Expired token rejection (FR-2)
Given a user with an access token issued 25 hours ago
When they make an API request with that token
Then they receive a 401 Unauthorized response
And the response body contains error code "TOKEN_EXPIRED"
And they are NOT redirected (API clients handle their own flow)

### AC-3: Rate limiting (NFR-S2)
Given an IP address that has made 5 failed login attempts in the last minute
When a 6th login attempt arrives from that IP
Then the request is rejected with 429 Too Many Requests
And the response includes a Retry-After header
```
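
Gherkin blocks like these are machine-readable. A minimal sketch of a parser that splits one criterion into Given/When/Then steps, attaching `And` lines to the preceding keyword — a simplified illustration of what a tool such as `test_extractor.py` might do internally (the function name is mine):

```python
def parse_gherkin(block):
    """Split a Given/When/Then block into structured steps."""
    steps = {"given": [], "when": [], "then": []}
    current = None
    for line in block.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        keyword = keyword.lower()
        if keyword in steps:
            current = keyword
            steps[current].append(rest)
        elif keyword == "and" and current:
            # "And" extends whichever clause came before it
            steps[current].append(rest)
    return steps
```

The resulting dict maps directly onto a test's arrange/act/assert phases.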

### 5. Edge Cases and Error Scenarios

```markdown
## Edge Cases

- EC-1: User submits login form with empty email → Show validation error, do not hit API.
- EC-2: OAuth provider is down → Show "Service temporarily unavailable", retry after 30s.
- EC-3: User has account but no password (social-only) → Redirect to social login.
- EC-4: Concurrent login from two devices → Both sessions are valid (no single-session enforcement).
- EC-5: Token expires mid-request → Complete the current request, return warning header.
```

### 6. API Contracts

Define request/response shapes using TypeScript-style notation:

```markdown
## API Contracts

### POST /api/auth/login
Request:
```typescript
interface LoginRequest {
  email: string;        // MUST be valid email format
  password: string;     // MUST be 8-128 characters
  rememberMe?: boolean; // Default: false
}
```

Success Response (200):
```typescript
interface LoginResponse {
  accessToken: string;  // JWT, expires in 24h
  refreshToken: string; // Opaque, expires in 30d
  expiresIn: number;    // Seconds until access token expires
  user: {
    id: string;
    email: string;
    displayName: string;
  };
}
```

Error Response (401):
```typescript
interface AuthError {
  error: "INVALID_CREDENTIALS" | "TOKEN_EXPIRED" | "ACCOUNT_LOCKED";
  message: string;
  retryAfter?: number; // Seconds, present for rate-limited responses
}
```
```
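
Server-side code can mirror the TypeScript shapes so the contract is enforced in one more place. A hedged Python sketch using `TypedDict` — the class and function names are mine, not part of the spec:

```python
from typing import TypedDict, Literal

class LoginRequest(TypedDict, total=False):
    email: str
    password: str
    rememberMe: bool

class AuthError(TypedDict, total=False):
    error: Literal["INVALID_CREDENTIALS", "TOKEN_EXPIRED", "ACCOUNT_LOCKED"]
    message: str
    retryAfter: int

def validate_login_request(payload: dict) -> bool:
    """Minimal shape check mirroring the LoginRequest contract above."""
    return (
        isinstance(payload.get("email"), str)
        and isinstance(payload.get("password"), str)
        and 8 <= len(payload["password"]) <= 128
    )
```

Keeping both sides generated from one source (or at least reviewed together) prevents the contract drift called out in the self-review checklist.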

### 7. Data Models

```markdown
## Data Models

### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| email | string | Unique, max 255 chars, valid email format |
| passwordHash | string | bcrypt, never exposed via API |
| createdAt | timestamp | UTC, immutable |
| lastLoginAt | timestamp | UTC, updated on each login |
| loginAttempts | integer | Reset to 0 on successful login |
| lockedUntil | timestamp | Null if not locked |
```

### 8. Out of Scope

Explicit exclusions prevent scope creep:

```markdown
## Out of Scope

- OS-1: Multi-factor authentication (separate spec: SPEC-042)
- OS-2: Social login providers beyond Google and GitHub
- OS-3: Admin impersonation of user accounts
- OS-4: Password complexity rules beyond minimum length (deferred to v2)
- OS-5: Session management UI (users cannot see/revoke active sessions yet)
```

If someone asks for an out-of-scope item during implementation, point them to this section. Do not build it.

---

## Bounded Autonomy Rules

These rules define when an agent (human or AI) MUST stop and ask for guidance vs. when they can proceed independently.

### STOP and Ask When:

1. **Scope creep detected.** The implementation requires something not in the spec. Even if it seems obviously needed, STOP. The spec might have excluded it deliberately.
2. **Ambiguity exceeds 30%.** If you cannot determine the correct behavior from the spec for more than 30% of a given requirement, the spec is incomplete. Do not guess.
3. **Breaking changes required.** The implementation would change an existing API contract, database schema, or public interface. Always escalate.
4. **Security implications.** Any change that touches authentication, authorization, encryption, or PII handling requires explicit approval.
5. **Performance characteristics unknown.** If a requirement says "MUST complete in < 500ms" but you have no way to measure or guarantee that, escalate before implementing a guess.
6. **Cross-team dependencies.** If the spec requires coordination with another team or service, confirm the dependency before building against it.

### Continue Autonomously When:

1. **Spec is clear and unambiguous** for the current task.
2. **All acceptance criteria have passing tests** and you are refactoring internals.
3. **Changes are non-breaking** — no public API, schema, or behavior changes.
4. **Implementation is a direct translation** of a well-defined acceptance criterion.
5. **Error handling follows established patterns** already documented in the codebase.

### Escalation Protocol

When you must stop, provide:

```markdown
## Escalation: [Brief Title]

**Blocked on:** [requirement ID, e.g., FR-3]
**Question:** [Specific, answerable question — not "what should I do?"]
**Options considered:**
A. [Option] — Pros: [...] Cons: [...]
B. [Option] — Pros: [...] Cons: [...]
**My recommendation:** [A or B, with reasoning]
**Impact of waiting:** [What is blocked until this is resolved?]
```

Never escalate without a recommendation. Never present an open-ended question. Always give options.

See `references/bounded_autonomy_rules.md` for the complete decision matrix.

---

## Workflow — 6 Phases

### Phase 1: Gather Requirements

**Goal:** Understand what needs to be built and why.

1. **Interview the user.** Ask:
   - What problem does this solve?
   - Who are the users?
   - What does success look like?
   - What explicitly should NOT be built?
2. **Read existing code.** Understand the current system before proposing changes.
3. **Identify constraints.** Performance budgets, security requirements, backward compatibility.
4. **List unknowns.** Every unknown is a risk. Surface them now, not during implementation.

**Exit criteria:** You can explain the feature to someone unfamiliar with the project in 2 minutes.

### Phase 2: Write Spec

**Goal:** Produce a complete spec document following The Spec Format above.

1. Fill every section of the template. No section left blank.
2. Number all requirements (FR-*, NFR-*, AC-*, EC-*, OS-*).
3. Use RFC 2119 keywords precisely.
4. Write acceptance criteria in Given/When/Then format.
5. Define API contracts with TypeScript-style types.
6. List explicit exclusions in Out of Scope.

**Exit criteria:** The spec can be handed to a developer who was not in the requirements meeting, and they can implement the feature without asking clarifying questions.

### Phase 3: Validate Spec

**Goal:** Verify the spec is complete, consistent, and implementable.

Run `spec_validator.py` against the spec file:

```bash
python spec_validator.py --file spec.md --strict
```

Manual validation checklist:

- [ ] Every functional requirement has at least one acceptance criterion
- [ ] Every acceptance criterion is testable (no subjective language)
- [ ] API contracts cover all endpoints mentioned in requirements
- [ ] Data models cover all entities mentioned in requirements
- [ ] Edge cases cover failure modes for every external dependency
- [ ] Out of scope is explicit about what was considered and rejected
- [ ] Non-functional requirements have measurable thresholds

**Exit criteria:** Spec scores 80+ on the validator, and all manual checklist items pass.

### Phase 4: Generate Tests

**Goal:** Extract test cases from acceptance criteria before writing implementation code.

Run `test_extractor.py` against the approved spec:

```bash
python test_extractor.py --file spec.md --framework pytest --output tests/
```

1. Each acceptance criterion becomes one or more test cases.
2. Each edge case becomes a test case.
3. Tests are stubs — they define the assertion but not the implementation.
4. All tests MUST fail initially (red phase of TDD).

**Exit criteria:** You have a test file where every test fails with "not implemented" or equivalent.

### Phase 5: Implement

**Goal:** Write code that makes failing tests pass, one acceptance criterion at a time.

1. Pick one acceptance criterion (start with the simplest).
2. Make its test(s) pass with minimal code.
3. Run the full test suite — no regressions.
4. Commit.
5. Pick the next acceptance criterion. Repeat.

**Rules:**

- Do NOT implement anything not in the spec.
- Do NOT optimize before all acceptance criteria pass.
- Do NOT refactor before all acceptance criteria pass.
- If you discover a missing requirement, STOP and update the spec first.

**Exit criteria:** All tests pass. All acceptance criteria satisfied.

### Phase 6: Self-Review

**Goal:** Verify the implementation matches the spec before marking it done.

Run through the Self-Review Checklist below. If any item fails, fix it before declaring the task complete.

---

## Self-Review Checklist

Before marking any implementation as done, verify ALL of the following:

- [ ] **Every acceptance criterion has a passing test.** No exceptions. If AC-3 exists, a test for AC-3 exists and passes.
- [ ] **Every edge case has a test.** EC-1 through EC-N all have corresponding test cases.
- [ ] **No scope creep.** The implementation does not include features not in the spec. If you added something, either update the spec or remove it.
- [ ] **API contracts match implementation.** Request/response shapes in code match the spec exactly. Field names, types, status codes — all of it.
- [ ] **Error scenarios tested.** Every error response defined in the spec has a test that triggers it.
- [ ] **Non-functional requirements verified.** If the spec says < 500ms, you have evidence (benchmark, load test, profiling) that it meets the threshold.
- [ ] **Data model matches.** Database schema matches the spec. No extra columns, no missing constraints.
- [ ] **Out-of-scope items not built.** Double-check that nothing from the Out of Scope section leaked into the implementation.

---

## Integration with TDD Guide

Spec-driven workflow and TDD are complementary, not competing:

```
Spec-Driven Workflow              TDD (Red-Green-Refactor)
─────────────────────             ──────────────────────────
Phase 1: Gather Requirements
Phase 2: Write Spec
Phase 3: Validate Spec
Phase 4: Generate Tests    ──→    RED: Tests exist and fail
Phase 5: Implement         ──→    GREEN: Minimal code to pass
Phase 6: Self-Review       ──→    REFACTOR: Clean up internals
```

**The handoff:** Spec-driven workflow produces the test stubs (Phase 4). TDD takes over from there. The spec tells you WHAT to test. TDD tells you HOW to implement.

Use `engineering-team/tdd-guide` for:

- Red-green-refactor cycle discipline
- Coverage analysis and gap detection
- Framework-specific test patterns (Jest, Pytest, JUnit)

Use `engineering/spec-driven-workflow` for:

- Defining what to build before building it
- Acceptance criteria authoring
- Completeness validation
- Scope control

---

## Examples

### Full Spec: User Password Reset

```markdown
# Spec: Password Reset Flow

**Author:** Engineering Team
**Date:** 2026-03-25
**Status:** Approved

## Context

Users who forget their passwords currently have no self-service recovery option.
Support receives ~200 password reset requests per week, costing approximately
8 hours of support time. This feature eliminates that burden entirely.

## Functional Requirements

- FR-1: The system MUST allow users to request a password reset via email.
- FR-2: The system MUST send a reset link that expires after 1 hour.
- FR-3: The system MUST invalidate all previous reset links when a new one is requested.
- FR-4: The system MUST enforce a minimum password length of 8 characters on reset.
- FR-5: The system MUST NOT reveal whether an email exists in the system.
- FR-6: The system SHOULD log all reset attempts for audit purposes.

## Acceptance Criteria

### AC-1: Request reset (FR-1, FR-5)
Given a user on the password reset page
When they enter any email address and submit
Then they see "If an account exists, a reset link has been sent"
And the response is identical whether the email exists or not

### AC-2: Valid reset link (FR-2)
Given a user who received a reset email 30 minutes ago
When they click the reset link
Then they see the password reset form

### AC-3: Expired reset link (FR-2)
Given a user who received a reset email 2 hours ago
When they click the reset link
Then they see "This link has expired. Please request a new one."

### AC-4: Previous links invalidated (FR-3)
Given a user who requested two reset emails
When they click the link from the first email
Then they see "This link is no longer valid."

## Edge Cases

- EC-1: User submits reset for non-existent email → Same success message (FR-5).
- EC-2: User clicks reset link twice → Second click shows "already used" if password was changed.
- EC-3: Email delivery fails → Log error, do not retry automatically.
- EC-4: User requests reset while already logged in → Allow it, do not force logout.

## Out of Scope

- OS-1: Security questions as an alternative reset method.
- OS-2: SMS-based password reset.
- OS-3: Admin-initiated password reset (separate spec).
```

### Extracted Test Cases (from above spec)

```python
# Generated by test_extractor.py --framework pytest

class TestPasswordReset:
    def test_ac1_request_reset_existing_email(self):
        """AC-1: Request reset with existing email shows generic message."""
        # Given a user on the password reset page
        # When they enter a registered email and submit
        # Then they see "If an account exists, a reset link has been sent"
        raise NotImplementedError("Implement this test")

    def test_ac1_request_reset_nonexistent_email(self):
        """AC-1: Request reset with unknown email shows same generic message."""
        # Given a user on the password reset page
        # When they enter an unregistered email and submit
        # Then they see an identical response to the existing-email case
        raise NotImplementedError("Implement this test")

    def test_ac2_valid_reset_link(self):
        """AC-2: Reset link works within expiry window."""
        raise NotImplementedError("Implement this test")

    def test_ac3_expired_reset_link(self):
        """AC-3: Reset link rejected after 1 hour."""
        raise NotImplementedError("Implement this test")

    def test_ac4_previous_links_invalidated(self):
        """AC-4: Old reset links stop working when a new one is requested."""
        raise NotImplementedError("Implement this test")

    def test_ec1_nonexistent_email_same_response(self):
        """EC-1: Non-existent email produces identical response."""
        raise NotImplementedError("Implement this test")

    def test_ec2_reset_link_used_twice(self):
        """EC-2: Already-used reset link shows appropriate message."""
        raise NotImplementedError("Implement this test")
```

---

## Anti-Patterns

### 1. Coding Before Spec Approval

**Symptom:** "I'll start coding while the spec is being reviewed."
**Problem:** The review will surface changes. Now you have code that implements a rejected design.
**Rule:** Implementation does not begin until the spec status is "Approved."

### 2. Vague Acceptance Criteria

**Symptom:** "The system should work well" or "The UI should be responsive."
**Problem:** Untestable. What does "well" mean? What does "responsive" mean?
**Rule:** Every acceptance criterion must be verifiable by a machine. If you cannot write a test for it, rewrite the criterion.

### 3. Missing Edge Cases

**Symptom:** The happy path is specified; error paths are not.
**Problem:** Developers invent error handling on the fly, leading to inconsistent behavior.
**Rule:** For every external dependency (API, database, file system, user input), specify at least one failure scenario.

### 4. Spec as Post-Hoc Documentation

**Symptom:** "Let me write the spec now that the feature is done."
**Problem:** This is documentation, not specification. It describes what was built, not what should have been built. It cannot catch design errors because the design is already frozen.
**Rule:** If the spec was written after the code, it is not a spec. Relabel it as documentation.

### 5. Gold-Plating Beyond Spec

**Symptom:** "While I was in there, I also added..."
**Problem:** Untested code. Unreviewed design. Potential for subtle bugs in the "bonus" feature.
**Rule:** If it is not in the spec, it does not get built. File a new spec for additional features.

### 6. Acceptance Criteria Without Requirement Traceability

**Symptom:** AC-7 exists but does not reference any FR-* or NFR-*.
**Problem:** Orphaned criteria mean either a requirement is missing or the criterion is unnecessary.
**Rule:** Every AC-* MUST reference at least one FR-* or NFR-*.

### 7. Skipping Validation

**Symptom:** "The spec looks fine, let's just start."
**Problem:** Missing sections discovered during implementation cause blocking delays.
**Rule:** Always run `spec_validator.py --strict` before starting implementation. Fix all warnings.

---

## Cross-References

- **`engineering-team/tdd-guide`** — Red-green-refactor cycle, test generation, coverage analysis. Use after Phase 4 of this workflow.
- **`engineering/focused-fix`** — Deep-dive feature repair. When a spec-driven implementation has systemic issues, use focused-fix for diagnosis.
- **`engineering/rag-architect`** — If the feature involves retrieval or knowledge systems, use rag-architect for the technical design within the spec.
- **`references/spec_format_guide.md`** — Complete template with section-by-section explanations.
- **`references/bounded_autonomy_rules.md`** — Full decision matrix for when to stop vs. continue.
- **`references/acceptance_criteria_patterns.md`** — Pattern library for writing Given/When/Then criteria.

---

## Tools

| Script | Purpose | Key Flags |
|--------|---------|-----------|
| `spec_generator.py` | Generate spec template from feature name/description | `--name`, `--description`, `--format`, `--json` |
| `spec_validator.py` | Validate spec completeness (0-100 score) | `--file`, `--strict`, `--json` |
| `test_extractor.py` | Extract test stubs from acceptance criteria | `--file`, `--framework`, `--output`, `--json` |

```bash
# Generate a spec template
python spec_generator.py --name "User Authentication" --description "OAuth 2.0 login flow"

# Validate a spec
python spec_validator.py --file specs/auth.md --strict

# Extract test cases
python test_extractor.py --file specs/auth.md --framework pytest --output tests/test_auth.py
```

@@ -1,6 +1,6 @@
 {
   "name": "engineering-advanced-skills",
-  "description": "31 advanced engineering skills: agent designer, agent workflow designer, AgentHub, RAG architect, database designer, migration architect, observability designer, dependency auditor, release manager, API reviewer, CI/CD pipeline builder, MCP server builder, skill security auditor, performance profiler, Helm chart builder, Terraform patterns, focused-fix, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
+  "description": "33 advanced engineering skills: agent designer, agent workflow designer, AgentHub, RAG architect, database designer, migration architect, observability designer, dependency auditor, release manager, API reviewer, CI/CD pipeline builder, MCP server builder, skill security auditor, performance profiler, Helm chart builder, Terraform patterns, focused-fix, browser-automation, spec-driven-workflow, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
   "version": "2.1.2",
   "author": {
     "name": "Alireza Rezvani",

564
engineering/browser-automation/SKILL.md
Normal file
@@ -0,0 +1,564 @@
---
name: "browser-automation"
description: "Use when the user asks to automate browser tasks, scrape websites, fill forms, capture screenshots, extract structured data from web pages, or build web automation workflows. NOT for testing — use playwright-pro for that."
---

# Browser Automation - POWERFUL

## Overview

The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation workflows using Playwright. This skill covers data extraction, form filling, screenshot capture, session management, and anti-detection patterns for reliable browser automation at scale.

**When to use this skill:**

- Scraping structured data from websites (tables, listings, search results)
- Automating multi-step browser workflows (login, fill forms, download files)
- Capturing screenshots or PDFs of web pages
- Extracting data from SPAs and JavaScript-heavy sites
- Building repeatable browser-based data pipelines

**When NOT to use this skill:**

- Writing browser tests or E2E test suites — use **playwright-pro** instead
- Testing API endpoints — use **api-test-suite-builder** instead
- Load testing or performance benchmarking — use **performance-profiler** instead

**Why Playwright over Selenium or Puppeteer:**

- **Auto-wait built in** — no explicit `sleep()` or `waitForElement()` needed for most actions
- **Multi-browser from one API** — Chromium, Firefox, WebKit with zero config changes
- **Network interception** — block ads, mock responses, capture API calls natively
- **Browser contexts** — isolated sessions without spinning up new browser instances
- **Codegen** — `playwright codegen` records your actions and generates scripts
- **Async-first** — Python async/await for high-throughput scraping

## Core Competencies

### 1. Web Scraping Patterns

#### DOM Extraction with CSS Selectors
CSS selectors are the primary tool for element targeting. Prefer them over XPath for readability and performance.

**Selector priority (most to least reliable):**

1. `data-testid`, `data-id`, or custom data attributes — stable across redesigns
2. `#id` selectors — unique but may change between deploys
3. Semantic selectors: `article`, `nav`, `main`, `section` — resilient to CSS changes
4. Class-based: `.product-card`, `.price` — brittle if classes are generated (e.g., CSS modules)
5. Positional: `nth-child()`, `nth-of-type()` — last resort, breaks on layout changes

**Compound selectors for precision:**
```python
# Product cards within a specific container
page.query_selector_all("div.search-results > article.product-card")

# Price inside a product card (scoped)
card.query_selector("span[data-field='price']")

# Links with specific text content
page.locator("a", has_text="Next Page")
```

#### XPath for Complex Traversal
Use XPath only when CSS cannot express the relationship:
```python
# Find an element by its text content (an XPath strength)
page.locator("//td[contains(text(), 'Total')]/following-sibling::td[1]")

# Navigate up the DOM tree
page.locator("//span[@class='price']/ancestor::div[@class='product']")
```

#### Pagination Patterns
- **Next-button pagination**: Click "Next" until the button is disabled or absent
- **URL-based pagination**: Increment `?page=N` or `&offset=N` in the URL
- **Infinite scroll**: Scroll to the bottom, wait for new content, repeat until the height stops changing
- **Load-more button**: Click the button, wait for the DOM mutation, repeat

#### Infinite Scroll Handling
```python
async def scroll_to_bottom(page, max_scrolls=50, pause_ms=1500):
    previous_height = 0
    for i in range(max_scrolls):
        current_height = await page.evaluate("document.body.scrollHeight")
        if current_height == previous_height:
            break
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(pause_ms)
        previous_height = current_height
    return i + 1  # number of scrolls performed
```
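
For URL-based pagination, the URL sequence can be computed up front. A small sketch (the helper name is mine); the commented loop shows how it would drive a Playwright `Page`:

```python
def paged_urls(base_url, max_pages, param="page", start=1):
    """Build the URL sequence for URL-based pagination (?page=N or &page=N)."""
    sep = "&" if "?" in base_url else "?"
    return [f"{base_url}{sep}{param}={n}" for n in range(start, start + max_pages)]

# Sketch of the scraping loop (page is a Playwright Page):
# for url in paged_urls("https://example.com/search?q=shoes", max_pages=5):
#     await page.goto(url)
#     ... extract items from the page ...
```

Computing URLs up front makes the crawl resumable: persist the list and record which URLs have been scraped.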

### 2. Form Filling & Multi-Step Workflows

#### Login Flows
```python
async def login(page, url, username, password):
    await page.goto(url)
    await page.fill("input[name='username']", username)
    await page.fill("input[name='password']", password)
    await page.click("button[type='submit']")
    # Wait for navigation to complete (post-login redirect)
    await page.wait_for_url("**/dashboard**")
```

#### Multi-Page Forms
Break multi-step forms into discrete functions, one per step. Each function:

1. Fills the fields for that step
2. Clicks the "Next" or "Continue" button
3. Waits for the next step to load (URL change or DOM element)

```python
async def fill_step_1(page, data):
    await page.fill("#first-name", data["first_name"])
    await page.fill("#last-name", data["last_name"])
    await page.select_option("#country", data["country"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-2-form")

async def fill_step_2(page, data):
    await page.fill("#address", data["address"])
    await page.fill("#city", data["city"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-3-form")
```

#### File Uploads
```python
# Single file
await page.set_input_files("input[type='file']", "/path/to/file.pdf")

# Multiple files
await page.set_input_files("input[type='file']", [
    "/path/to/file1.pdf",
    "/path/to/file2.pdf"
])

# Drag-and-drop upload zones (no visible input element)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```

#### Dropdown and Select Handling
```python
# Native <select> element
await page.select_option("#country", value="US")
await page.select_option("#country", label="United States")

# Custom dropdown (div-based)
await page.click("div.dropdown-trigger")
await page.click("div.dropdown-option:has-text('United States')")
```

### 3. Screenshot & PDF Capture

#### Screenshot Strategies
```python
# Full page (scrolls automatically)
await page.screenshot(path="full-page.png", full_page=True)

# Viewport only (what's visible)
await page.screenshot(path="viewport.png")

# Specific element
element = page.locator("div.chart-container")
await element.screenshot(path="chart.png")

# With a custom viewport for consistency
context = await browser.new_context(viewport={"width": 1920, "height": 1080})
```

#### PDF Generation
```python
# PDF generation only works in Chromium
await page.pdf(
    path="output.pdf",
    format="A4",
    margin={"top": "1cm", "right": "1cm", "bottom": "1cm", "left": "1cm"},
    print_background=True
)
```

#### Visual Regression Baselines
Take screenshots at known states and compare them pixel-by-pixel. Store baselines in version control. Use a naming convention such as `{page}_{viewport}_{state}.png`.
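
The pixel comparison itself can be sketched without any imaging library, assuming the pixel sequences come from something like Pillow's `Image.getdata()` (the function name is mine):

```python
def pixel_diff_ratio(pixels_a, pixels_b):
    """Fraction of positions that differ between two equal-length pixel sequences."""
    if len(pixels_a) != len(pixels_b):
        return 1.0  # treat a size mismatch as a full regression
    diffs = sum(1 for a, b in zip(pixels_a, pixels_b) if a != b)
    return diffs / len(pixels_a)
```

In practice you would compare against a tolerance (e.g. flag anything above 0.1% to absorb anti-aliasing noise) rather than require an exact match.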

### 4. Structured Data Extraction

#### Tables to JSON
```python
async def extract_table(page, selector):
    headers = await page.eval_on_selector_all(
        f"{selector} thead th",
        "elements => elements.map(e => e.textContent.trim())"
    )
    rows = await page.eval_on_selector_all(
        f"{selector} tbody tr",
        """rows => rows.map(row => {
            return Array.from(row.querySelectorAll('td'))
                .map(cell => cell.textContent.trim())
        })"""
    )
    return [dict(zip(headers, row)) for row in rows]
```

#### Listings to Arrays
```python
async def extract_listings(page, container_sel, field_map):
    """
    field_map example: {"title": "h3.title", "price": "span.price", "url": "a::attr(href)"}
    """
    items = []
    cards = await page.query_selector_all(container_sel)
    for card in cards:
        item = {}
        for field, sel in field_map.items():
            if "::attr(" in sel:
                attr_sel, attr_name = sel.split("::attr(")
                attr_name = attr_name.rstrip(")")
                el = await card.query_selector(attr_sel)
                item[field] = await el.get_attribute(attr_name) if el else None
            else:
                el = await card.query_selector(sel)
                item[field] = (await el.text_content()).strip() if el else None
        items.append(item)
    return items
```

#### Nested Data Extraction
For threaded content (comments with replies), use recursive extraction:
```python
async def extract_comments(root, parent_selector):
    # root can be a Page or an ElementHandle — both support query_selector_all,
    # which lets the recursion scope itself to each comment element
    comments = []
    elements = await root.query_selector_all(f"{parent_selector} > .comment")
    for el in elements:
        text = await (await el.query_selector(".comment-body")).text_content()
        author = await (await el.query_selector(".author")).text_content()
        replies = await extract_comments(el, ".replies")
        comments.append({
            "author": author.strip(),
            "text": text.strip(),
            "replies": replies
        })
    return comments
```
|
||||
|
||||
### 5. Cookie & Session Management

#### Save and Restore Sessions

```python
import json

# Save cookies after login
cookies = await context.cookies()
with open("session.json", "w") as f:
    json.dump(cookies, f)

# Restore session in new context
with open("session.json", "r") as f:
    cookies = json.load(f)
context = await browser.new_context()
await context.add_cookies(cookies)
```

#### Storage State (Cookies + Local Storage)

```python
# Save full state (cookies + localStorage + sessionStorage)
await context.storage_state(path="state.json")

# Restore full state
context = await browser.new_context(storage_state="state.json")
```

**Best practice:** Save state after login, reuse across scraping sessions. Check session validity before starting a long job — make a lightweight request to a protected page and verify you are not redirected to login.

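The redirect check itself can be a pure URL test, which keeps it easy to unit-test; `/login` as the login route is an assumption for illustration:

```python
from urllib.parse import urlparse

def redirected_to_login(final_url: str, login_path: str = "/login") -> bool:
    """True if the post-navigation URL landed on the login route,
    i.e. the saved session is no longer valid."""
    return urlparse(final_url).path.rstrip("/") == login_path.rstrip("/")

# Usage inside an async workflow, after `await page.goto(protected_url)`:
#     if redirected_to_login(page.url):
#         log in again and re-save storage_state
```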
### 6. Anti-Detection Patterns

Modern websites detect automation through multiple vectors. Address all of them:

#### User Agent Rotation

Never use the default Playwright user agent. Rotate through real browser user agents:

```python
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
```

#### Viewport and Screen Size

Set realistic viewport dimensions. Playwright's default viewport (1280x720) is a well-known automation signature:

```python
import random

context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    screen={"width": 1920, "height": 1080},
    user_agent=random.choice(USER_AGENTS),
)
```

#### WebDriver Flag Removal

Playwright sets `navigator.webdriver = true`. Remove it:

```python
await page.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")
```

#### Request Throttling

Add human-like delays between actions:

```python
import asyncio
import random

async def human_delay(min_ms=500, max_ms=2000):
    # asyncio.sleep keeps the helper independent of any page object
    await asyncio.sleep(random.randint(min_ms, max_ms) / 1000)
```

#### Proxy Support

```python
browser = await playwright.chromium.launch(
    proxy={"server": "http://proxy.example.com:8080"}
)
# Or per-context:
context = await browser.new_context(
    proxy={"server": "http://proxy.example.com:8080",
           "username": "user", "password": "pass"}
)
```

### 7. Dynamic Content Handling

#### SPA Rendering

SPAs render content client-side. Wait for the actual content, not the page load:

```python
await page.goto(url)
# Wait for the data to render, not just the shell
await page.wait_for_selector("div.product-list article", state="attached")
```

#### AJAX / Fetch Waiting

Intercept and wait for specific API calls:

```python
async with page.expect_response("**/api/products*") as response_info:
    await page.click("button.load-more")
response = await response_info.value
data = await response.json()  # You can use the API data directly
```

#### Shadow DOM Traversal

```python
# Playwright pierces open Shadow DOM automatically with >>
await page.locator("custom-element >> .inner-class").click()
```

#### Lazy-Loaded Images

Scroll elements into view to trigger lazy loading:

```python
images = await page.query_selector_all("img[data-src]")
for img in images:
    await img.scroll_into_view_if_needed()
    await page.wait_for_timeout(200)
```

### 8. Error Handling & Retry Logic

#### Retry Decorator Pattern

```python
import asyncio

async def with_retry(coro_factory, max_retries=3, backoff_base=2):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            wait = backoff_base ** attempt
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait}s...")
            await asyncio.sleep(wait)
```

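A usage sketch for the retry pattern: the factory must be a zero-argument callable so every attempt gets a fresh coroutine. `flaky_fetch` is a hypothetical stand-in for a page interaction, and the retry helper is repeated here so the snippet runs standalone.

```python
import asyncio

async def with_retry(coro_factory, max_retries=3, backoff_base=2):
    # Same pattern as above, repeated so this snippet runs on its own
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception:
            if attempt == max_retries - 1:
                raise
            await asyncio.sleep(backoff_base ** attempt)

attempts = {"n": 0}

async def flaky_fetch():
    # Hypothetical stand-in for a page interaction: fails twice, then succeeds
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "payload"

# Pass a lambda, not flaky_fetch(): each retry needs a fresh coroutine object,
# because awaiting the same coroutine twice raises RuntimeError.
# backoff_base=0 keeps later waits at zero (base**0 is still 1, so the first
# retry waits one second).
result = asyncio.run(with_retry(lambda: flaky_fetch(), backoff_base=0))
```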
#### Handling Common Failures

```python
from playwright.async_api import TimeoutError as PlaywrightTimeout

try:
    await page.click("button.submit", timeout=5000)
except PlaywrightTimeout:
    # Element did not appear — page structure may have changed
    # Try fallback selector
    await page.click("[type='submit']", timeout=5000)
except Exception:
    # Network error, browser crash, etc.
    await page.screenshot(path="error-state.png")
    raise
```

#### Rate Limit Detection

```python
import asyncio

async def check_rate_limit(response):
    if response.status == 429:
        retry_after = response.headers.get("retry-after", "60")
        wait_seconds = int(retry_after)
        print(f"Rate limited. Waiting {wait_seconds}s...")
        await asyncio.sleep(wait_seconds)
        return True
    return False
```

## Workflows

### Workflow 1: Single-Page Data Extraction

**Scenario:** Extract product data from a single page with JavaScript-rendered content.

**Steps:**

1. Launch browser in headed mode during development (`headless=False`), switch to headless for production
2. Navigate to URL and wait for content selector
3. Extract data using `query_selector_all` with field mapping
4. Validate extracted data (check for nulls, expected types)
5. Output as JSON

```python
from playwright.async_api import async_playwright

async def extract_single_page(url, selectors):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 ..."
        )
        page = await context.new_page()
        await page.goto(url, wait_until="networkidle")
        data = await extract_listings(page, selectors["container"], selectors["fields"])
        await browser.close()
        return data
```

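Step 4 can be a plain post-processing pass. A minimal sketch of null and type validation (the field names used here are illustrative):

```python
def validate_items(items, required_fields, types=None):
    """Split extracted items into (valid, rejected).

    An item is rejected when any required field is missing or None,
    or when a field fails its expected-type check.
    """
    types = types or {}
    valid, rejected = [], []
    for item in items:
        has_required = all(item.get(f) is not None for f in required_fields)
        types_ok = all(
            isinstance(item[f], t)
            for f, t in types.items()
            if item.get(f) is not None
        )
        (valid if has_required and types_ok else rejected).append(item)
    return valid, rejected
```

Logging the rejected items (rather than silently dropping them) is usually worth it: a sudden spike in rejects is the first sign a selector broke.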
### Workflow 2: Multi-Page Scraping with Pagination

**Scenario:** Scrape search results across 50+ pages.

**Steps:**

1. Launch browser with anti-detection settings
2. Navigate to first page
3. Extract data from current page
4. Check if "Next" button exists and is enabled
5. Click next, wait for new content to load (not just navigation)
6. Repeat until no next page or max pages reached
7. Deduplicate results by unique key
8. Write output incrementally (don't hold everything in memory)

```python
async def scrape_paginated(base_url, selectors, max_pages=100):
    all_data = []
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await (await browser.new_context()).new_page()
        await page.goto(base_url)

        for page_num in range(max_pages):
            items = await extract_listings(page, selectors["container"], selectors["fields"])
            all_data.extend(items)

            next_btn = page.locator(selectors["next_button"])
            if await next_btn.count() == 0 or await next_btn.is_disabled():
                break

            await next_btn.click()
            await page.wait_for_selector(selectors["container"])
            await human_delay(800, 2000)

        await browser.close()
    return all_data
```

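Steps 7 and 8 are not shown in the sketch above. One minimal way to implement them, assuming a `url` field works as the unique key and JSON Lines as the incremental output format:

```python
import json

def dedupe_by_key(items, key):
    """Drop later duplicates, preserving first-seen order (step 7)."""
    seen, unique = set(), []
    for item in items:
        k = item.get(key)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique

def append_jsonl(items, path):
    """Append one JSON object per line so a long crawl never holds
    the full result set in memory (step 8)."""
    with open(path, "a", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item) + "\n")
```

Calling `append_jsonl` once per scraped page (instead of once at the end) also means a crash mid-crawl loses at most one page of results.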
### Workflow 3: Authenticated Workflow Automation

**Scenario:** Log into a portal, navigate a multi-step form, download a report.

**Steps:**

1. Check for existing session state file
2. If no session, perform login and save state
3. Navigate to target page using saved session
4. Fill multi-step form with provided data
5. Wait for download to trigger
6. Save downloaded file to target directory

```python
import os

async def authenticated_workflow(credentials, form_data, download_dir):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        state_file = "session_state.json"

        # Restore or create session
        if os.path.exists(state_file):
            context = await browser.new_context(storage_state=state_file)
        else:
            context = await browser.new_context()
            page = await context.new_page()
            await login(page, credentials["url"], credentials["user"], credentials["pass"])
            await context.storage_state(path=state_file)

        page = await context.new_page()
        await page.goto(form_data["target_url"])

        # Fill form steps
        for step_fn in [fill_step_1, fill_step_2]:
            await step_fn(page, form_data)

        # Handle download
        async with page.expect_download() as dl_info:
            await page.click("button:has-text('Download Report')")
        download = await dl_info.value
        await download.save_as(os.path.join(download_dir, download.suggested_filename))

        await browser.close()
```

## Tools Reference

| Script | Purpose | Key Flags | Output |
|--------|---------|-----------|--------|
| `scraping_toolkit.py` | Generate Playwright scraping script skeleton | `--url`, `--selectors`, `--paginate`, `--output` | Python script or JSON config |
| `form_automation_builder.py` | Generate form-fill automation script from field spec | `--fields`, `--url`, `--output` | Python automation script |
| `anti_detection_checker.py` | Audit a Playwright script for detection vectors | `--file`, `--verbose` | Risk report with score |

All scripts are stdlib-only. Run `python3 <script> --help` for full usage.

## Anti-Patterns

### Hardcoded Waits

**Bad:** `await page.wait_for_timeout(5000)` before every action.

**Good:** Use `wait_for_selector`, `wait_for_url`, `expect_response`, or `wait_for_load_state`. Hardcoded waits are flaky and slow.

### No Error Recovery

**Bad:** Linear script that crashes on first failure.

**Good:** Wrap each page interaction in try/except. Take error-state screenshots. Implement retry with exponential backoff.

### Ignoring robots.txt

**Bad:** Scraping without checking robots.txt directives.

**Good:** Fetch and parse robots.txt before scraping. Respect `Crawl-delay`. Skip disallowed paths. Add your bot name to User-Agent if running at scale.

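The standard library already ships a robots.txt parser; a sketch using `urllib.robotparser` (the robots.txt body and bot name below are illustrative):

```python
from urllib.robotparser import RobotFileParser

def parse_robots(robots_txt: str) -> RobotFileParser:
    """Parse a robots.txt body fetched earlier (e.g. with urllib.request)."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp

rp = parse_robots("User-agent: *\nCrawl-delay: 2\nDisallow: /private/\n")
allowed = rp.can_fetch("MyScraperBot", "https://example.com/search?q=x")
blocked = rp.can_fetch("MyScraperBot", "https://example.com/private/data")
delay = rp.crawl_delay("MyScraperBot")  # honor this between requests
```

Check `can_fetch` before every URL and sleep at least `crawl_delay` seconds between requests when the site declares one.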
### Storing Credentials in Scripts

**Bad:** Hardcoding usernames and passwords in Python files.

**Good:** Use environment variables, `.env` files (gitignored), or a secrets manager. Pass credentials via CLI arguments.

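A minimal sketch of the environment-variable approach (`PORTAL_USER` and `PORTAL_PASS` are hypothetical variable names):

```python
import os

def load_credentials():
    """Read credentials from the environment; fail fast when one is missing."""
    try:
        return {"user": os.environ["PORTAL_USER"], "pass": os.environ["PORTAL_PASS"]}
    except KeyError as missing:
        raise SystemExit(f"Missing required environment variable: {missing}")
```

Failing fast with a named variable beats discovering a `None` password three pages into an automated login flow.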
### No Rate Limiting

**Bad:** Hammering a site with 100 requests/second.

**Good:** Add random delays between requests (1-3s for polite scraping). Monitor for 429 responses. Implement exponential backoff.

### Selector Fragility

**Bad:** Relying on auto-generated class names (`.css-1a2b3c`) or deep nesting (`div > div > div > span:nth-child(3)`).

**Good:** Use data attributes, semantic HTML, or text-based locators. Test selectors in browser DevTools first.

### Not Cleaning Up Browser Instances

**Bad:** Launching browsers without closing them, leading to resource leaks.

**Good:** Always use `try/finally` or async context managers to ensure `browser.close()` is called.

### Running Headed in Production

**Bad:** Using `headless=False` in production/CI.

**Good:** Develop with headed mode for debugging, deploy with `headless=True`. Use environment variable to toggle: `headless = os.environ.get("HEADLESS", "true") == "true"`.

## Cross-References

- **playwright-pro** — Browser testing skill. Use for E2E tests, test assertions, test fixtures. Browser Automation is for data extraction and workflow automation, not testing.
- **api-test-suite-builder** — When the website has a public API, hit the API directly instead of scraping the rendered page. Faster, more reliable, less detectable.
- **performance-profiler** — If your automation scripts are slow, profile the bottlenecks before adding concurrency.
- **env-secrets-manager** — For securely managing credentials used in authenticated automation workflows.

520
engineering/browser-automation/anti_detection_checker.py
Normal file

#!/usr/bin/env python3
"""
Anti-Detection Checker - Audits Playwright scripts for common bot detection vectors.

Analyzes a Playwright automation script and identifies patterns that make the
browser detectable as a bot. Produces a risk score (0-100) with specific
recommendations for each issue found.

Detection vectors checked:
- Headless mode usage
- Default/missing user agent configuration
- Viewport size (Playwright's default 1280x720 is a red flag)
- WebDriver flag (navigator.webdriver)
- Navigator property overrides
- Request throttling / human-like delays
- Cookie/session management
- Proxy configuration
- Error handling patterns

No external dependencies - uses only Python standard library.
"""

import argparse
import json
import os
import re
import sys
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Finding:
    """A single detection risk finding."""
    category: str
    severity: str  # "critical", "high", "medium", "low", "info"
    description: str
    line: Optional[int]
    recommendation: str
    weight: int  # Points added to risk score (0-15)


SEVERITY_WEIGHTS = {
    "critical": 15,
    "high": 10,
    "medium": 5,
    "low": 2,
    "info": 0,
}


class AntiDetectionChecker:
    """Analyzes Playwright scripts for bot detection vulnerabilities."""

    def __init__(self, script_content: str, file_path: str = "<stdin>"):
        self.content = script_content
        self.lines = script_content.split("\n")
        self.file_path = file_path
        self.findings: List[Finding] = []

    def check_all(self) -> List[Finding]:
        """Run all detection checks."""
        self._check_headless_mode()
        self._check_user_agent()
        self._check_viewport()
        self._check_webdriver_flag()
        self._check_navigator_properties()
        self._check_request_delays()
        self._check_error_handling()
        self._check_proxy()
        self._check_session_management()
        self._check_browser_close()
        self._check_stealth_imports()
        return self.findings

    def _find_line(self, pattern: str) -> Optional[int]:
        """Find the first line number matching a regex pattern."""
        for i, line in enumerate(self.lines, 1):
            if re.search(pattern, line):
                return i
        return None

    def _has_pattern(self, pattern: str) -> bool:
        """Check if pattern exists anywhere in the script."""
        return bool(re.search(pattern, self.content))

    def _check_headless_mode(self):
        """Check if headless mode is properly configured."""
        if self._has_pattern(r"headless\s*=\s*False"):
            self.findings.append(Finding(
                category="Headless Mode",
                severity="high",
                description="Browser launched in headed mode (headless=False). This is fine for development but should be headless=True in production.",
                line=self._find_line(r"headless\s*=\s*False"),
                recommendation="Use headless=True for production. Toggle via environment variable: headless=os.environ.get('HEADLESS', 'true') == 'true'",
                weight=SEVERITY_WEIGHTS["high"],
            ))
        elif not self._has_pattern(r"headless"):
            # Default is headless=True in Playwright, which is correct
            self.findings.append(Finding(
                category="Headless Mode",
                severity="info",
                description="Using default headless mode (True). Good for production.",
                line=None,
                recommendation="No action needed. Default headless=True is correct.",
                weight=SEVERITY_WEIGHTS["info"],
            ))

    def _check_user_agent(self):
        """Check if a custom user agent is set."""
        has_ua = self._has_pattern(r"user_agent\s*=") or self._has_pattern(r"userAgent")
        has_ua_list = self._has_pattern(r"USER_AGENTS?\s*=\s*\[")
        has_random_ua = self._has_pattern(r"random\.choice.*(?:USER_AGENT|user_agent|ua)")

        if not has_ua:
            self.findings.append(Finding(
                category="User Agent",
                severity="critical",
                description="No custom user agent configured. Playwright's default user agent contains 'HeadlessChrome' which is trivially detected.",
                line=None,
                recommendation="Set a realistic user agent: context = await browser.new_context(user_agent='Mozilla/5.0 ...')",
                weight=SEVERITY_WEIGHTS["critical"],
            ))
        elif has_ua_list and has_random_ua:
            self.findings.append(Finding(
                category="User Agent",
                severity="info",
                description="User agent rotation detected. Good anti-detection practice.",
                line=self._find_line(r"USER_AGENTS?\s*=\s*\["),
                recommendation="Ensure user agents are recent and match the browser being launched (e.g., Chrome UA for Chromium).",
                weight=SEVERITY_WEIGHTS["info"],
            ))
        elif has_ua:
            self.findings.append(Finding(
                category="User Agent",
                severity="low",
                description="Custom user agent set but no rotation detected. Single user agent is fingerprint-able at scale.",
                line=self._find_line(r"user_agent\s*="),
                recommendation="Rotate through 5-10 recent user agents using random.choice().",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_viewport(self):
        """Check viewport configuration."""
        has_viewport = self._has_pattern(r"viewport\s*=\s*\{") or self._has_pattern(r"viewport.*width")

        if not has_viewport:
            self.findings.append(Finding(
                category="Viewport Size",
                severity="high",
                description="No viewport configured. Default Playwright viewport (1280x720) is common among bots. Sites may flag unusual viewport distributions.",
                line=None,
                recommendation="Set a common desktop viewport: viewport={'width': 1920, 'height': 1080}. Vary across runs.",
                weight=SEVERITY_WEIGHTS["high"],
            ))
        else:
            # Check for suspiciously small viewports
            match = re.search(r"width['\"]?\s*[:=]\s*(\d+)", self.content)
            if match:
                width = int(match.group(1))
                if width < 1024:
                    self.findings.append(Finding(
                        category="Viewport Size",
                        severity="medium",
                        description=f"Viewport width {width}px is unusually small. Most desktop browsers are 1366px+ wide.",
                        line=self._find_line(r"width.*" + str(width)),
                        recommendation="Use 1366x768 (most common) or 1920x1080. Avoid unusual sizes like 800x600.",
                        weight=SEVERITY_WEIGHTS["medium"],
                    ))
                else:
                    self.findings.append(Finding(
                        category="Viewport Size",
                        severity="info",
                        description=f"Viewport width {width}px is reasonable.",
                        line=self._find_line(r"width.*" + str(width)),
                        recommendation="No action needed.",
                        weight=SEVERITY_WEIGHTS["info"],
                    ))

    def _check_webdriver_flag(self):
        """Check if navigator.webdriver is being removed."""
        has_webdriver_override = (
            self._has_pattern(r"navigator.*webdriver") or
            self._has_pattern(r"webdriver.*undefined") or
            self._has_pattern(r"add_init_script.*webdriver")
        )

        if not has_webdriver_override:
            self.findings.append(Finding(
                category="WebDriver Flag",
                severity="critical",
                description="navigator.webdriver is not overridden. This is the most common bot detection check. Every major anti-bot service tests this property.",
                line=None,
                recommendation=(
                    "Add init script to remove the flag:\n"
                    "  await page.add_init_script(\"Object.defineProperty(navigator, 'webdriver', {get: () => undefined});\")"
                ),
                weight=SEVERITY_WEIGHTS["critical"],
            ))
        else:
            self.findings.append(Finding(
                category="WebDriver Flag",
                severity="info",
                description="navigator.webdriver override detected.",
                line=self._find_line(r"webdriver"),
                recommendation="No action needed.",
                weight=SEVERITY_WEIGHTS["info"],
            ))

    def _check_navigator_properties(self):
        """Check for additional navigator property hardening."""
        checks = {
            "plugins": (r"navigator.*plugins", "navigator.plugins is empty in headless mode. Real browsers report installed plugins."),
            "languages": (r"navigator.*languages", "navigator.languages should be set to match the user agent locale."),
            "platform": (r"navigator.*platform", "navigator.platform should match the user agent OS."),
        }

        overridden_count = 0
        for prop, (pattern, desc) in checks.items():
            if self._has_pattern(pattern):
                overridden_count += 1

        if overridden_count == 0:
            self.findings.append(Finding(
                category="Navigator Properties",
                severity="medium",
                description="No navigator property hardening detected. Advanced anti-bot services check plugins, languages, and platform properties.",
                line=None,
                recommendation="Override navigator.plugins, navigator.languages, and navigator.platform via add_init_script() to match realistic browser fingerprints.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))
        elif overridden_count < 3:
            self.findings.append(Finding(
                category="Navigator Properties",
                severity="low",
                description=f"Partial navigator hardening ({overridden_count}/3 properties). Consider covering all three: plugins, languages, platform.",
                line=None,
                recommendation="Add overrides for any missing properties among: plugins, languages, platform.",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_request_delays(self):
        """Check for human-like request delays."""
        has_sleep = self._has_pattern(r"asyncio\.sleep") or self._has_pattern(r"wait_for_timeout")
        has_random_delay = (
            self._has_pattern(r"random\.(uniform|randint|random)") and has_sleep
        )

        if not has_sleep:
            self.findings.append(Finding(
                category="Request Timing",
                severity="high",
                description="No delays between actions detected. Machine-speed interactions are the easiest behavior-based detection signal.",
                line=None,
                recommendation="Add random delays between page interactions: await asyncio.sleep(random.uniform(0.5, 2.0))",
                weight=SEVERITY_WEIGHTS["high"],
            ))
        elif not has_random_delay:
            self.findings.append(Finding(
                category="Request Timing",
                severity="medium",
                description="Fixed delays detected but no randomization. Constant timing intervals are detectable patterns.",
                line=self._find_line(r"(asyncio\.sleep|wait_for_timeout)"),
                recommendation="Use random delays: random.uniform(min_seconds, max_seconds) instead of fixed values.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))
        else:
            self.findings.append(Finding(
                category="Request Timing",
                severity="info",
                description="Randomized delays detected between actions.",
                line=self._find_line(r"random\.(uniform|randint)"),
                recommendation="No action needed. Ensure delays are realistic (0.5-3s for browsing, 1-5s for reading).",
                weight=SEVERITY_WEIGHTS["info"],
            ))

    def _check_error_handling(self):
        """Check for error handling patterns."""
        has_try_except = self._has_pattern(r"try\s*:") and self._has_pattern(r"except")
        has_retry = self._has_pattern(r"retr(y|ies)") or self._has_pattern(r"max_retries|max_attempts")

        if not has_try_except:
            self.findings.append(Finding(
                category="Error Handling",
                severity="medium",
                description="No try/except blocks found. Unhandled errors will crash the automation and leave browser instances running.",
                line=None,
                recommendation="Wrap page interactions in try/except. Handle TimeoutError, network errors, and element-not-found gracefully.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))
        elif not has_retry:
            self.findings.append(Finding(
                category="Error Handling",
                severity="low",
                description="Error handling present but no retry logic detected. Transient failures (network blips, slow loads) will cause data loss.",
                line=None,
                recommendation="Add retry with exponential backoff for network operations and element interactions.",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_proxy(self):
        """Check for proxy configuration."""
        has_proxy = self._has_pattern(r"proxy\s*=\s*\{") or self._has_pattern(r"proxy.*server")

        if not has_proxy:
            self.findings.append(Finding(
                category="Proxy",
                severity="low",
                description="No proxy configuration detected. Running from a single IP address is fine for small jobs but will trigger rate limits at scale.",
                line=None,
                recommendation="For high-volume scraping, use rotating proxies: proxy={'server': 'http://proxy:port'}",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_session_management(self):
        """Check for session/cookie management."""
        has_storage_state = self._has_pattern(r"storage_state")
        has_cookies = self._has_pattern(r"cookies\(\)") or self._has_pattern(r"add_cookies")

        if not has_storage_state and not has_cookies:
            self.findings.append(Finding(
                category="Session Management",
                severity="low",
                description="No session persistence detected. Each run will start fresh, requiring re-authentication.",
                line=None,
                recommendation="Use storage_state() to save/restore sessions across runs. This avoids repeated logins that may trigger security alerts.",
                weight=SEVERITY_WEIGHTS["low"],
            ))

    def _check_browser_close(self):
        """Check if browser is properly closed."""
        has_close = self._has_pattern(r"browser\.close\(\)") or self._has_pattern(r"await.*close")
        has_context_manager = self._has_pattern(r"async\s+with\s+async_playwright")

        if not has_close and not has_context_manager:
            self.findings.append(Finding(
                category="Resource Cleanup",
                severity="medium",
                description="No browser.close() or context manager detected. Browser processes will leak on failure.",
                line=None,
                recommendation="Use 'async with async_playwright() as p:' or ensure browser.close() is in a finally block.",
                weight=SEVERITY_WEIGHTS["medium"],
            ))

    def _check_stealth_imports(self):
        """Check for stealth/anti-detection library usage."""
        has_stealth = self._has_pattern(r"playwright_stealth|stealth_async|undetected")
        if has_stealth:
            self.findings.append(Finding(
                category="Stealth Library",
                severity="info",
                description="Third-party stealth library detected. These provide additional fingerprint evasion but add dependencies.",
                line=self._find_line(r"playwright_stealth|stealth_async|undetected"),
                recommendation="Stealth libraries are helpful but not a silver bullet. Still implement manual checks for user agent, viewport, and timing.",
                weight=SEVERITY_WEIGHTS["info"],
            ))

    def get_risk_score(self) -> int:
        """Calculate overall risk score (0-100). Higher = more detectable."""
        raw_score = sum(f.weight for f in self.findings)
        # Cap at 100
        return min(raw_score, 100)

    def get_risk_level(self) -> str:
        """Get human-readable risk level."""
        score = self.get_risk_score()
        if score <= 10:
            return "LOW"
        elif score <= 30:
            return "MODERATE"
        elif score <= 50:
            return "HIGH"
        else:
            return "CRITICAL"

    def get_summary(self) -> dict:
        """Get a summary of the analysis."""
        severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
        for f in self.findings:
            severity_counts[f.severity] += 1

        return {
            "file": self.file_path,
            "risk_score": self.get_risk_score(),
            "risk_level": self.get_risk_level(),
            "total_findings": len(self.findings),
            "severity_counts": severity_counts,
            "actionable_findings": len([f for f in self.findings if f.severity != "info"]),
        }


def format_text_report(checker: AntiDetectionChecker, verbose: bool = False) -> str:
    """Format findings as human-readable text."""
    lines = []
    summary = checker.get_summary()

    lines.append("=" * 60)
    lines.append(" ANTI-DETECTION AUDIT REPORT")
    lines.append("=" * 60)
    lines.append(f"File: {summary['file']}")
    lines.append(f"Risk Score: {summary['risk_score']}/100 ({summary['risk_level']})")
    lines.append(f"Total Issues: {summary['actionable_findings']} actionable, {summary['severity_counts']['info']} info")
    lines.append("")

    # Severity breakdown
    for sev in ["critical", "high", "medium", "low"]:
        count = summary["severity_counts"][sev]
        if count > 0:
            lines.append(f"  {sev.upper():10s} {count}")
    lines.append("")

    # Findings grouped by severity
    severity_order = ["critical", "high", "medium", "low"]
    if verbose:
        severity_order.append("info")

    for sev in severity_order:
        sev_findings = [f for f in checker.findings if f.severity == sev]
        if not sev_findings:
            continue

        lines.append(f"--- {sev.upper()} ---")
        for f in sev_findings:
            line_info = f" (line {f.line})" if f.line else ""
            lines.append(f"  [{f.category}]{line_info}")
            lines.append(f"    {f.description}")
            lines.append(f"    Fix: {f.recommendation}")
        lines.append("")

    # Exit code guidance
    lines.append("-" * 60)
    score = summary["risk_score"]
    if score <= 10:
        lines.append("Result: PASS - Low detection risk.")
    elif score <= 30:
        lines.append("Result: PASS with warnings - Address medium/high issues for production use.")
    else:
        lines.append("Result: FAIL - High detection risk. Fix critical and high issues before deploying.")
    lines.append("")

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Audit a Playwright script for common bot detection vectors.",
        epilog=(
            "Examples:\n"
            "  %(prog)s --file scraper.py\n"
            "  %(prog)s --file scraper.py --verbose\n"
            "  %(prog)s --file scraper.py --json\n"
            "\n"
            "Exit codes:\n"
            "  0 - Low risk (score 0-10)\n"
            "  1 - Moderate to high risk (score 11-50)\n"
            "  2 - Critical risk (score 51+)\n"
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--file",
        required=True,
        help="Path to the Playwright script to audit",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        default=False,
        help="Output results as JSON",
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        default=False,
        help="Include informational (non-actionable) findings in output",
    )

    args = parser.parse_args()

    file_path = os.path.abspath(args.file)
    if not os.path.isfile(file_path):
        print(f"Error: File not found: {file_path}", file=sys.stderr)
        sys.exit(2)

    try:
        with open(file_path, "r", encoding="utf-8") as f:
            content = f.read()
    except Exception as e:
        print(f"Error reading file: {e}", file=sys.stderr)
        sys.exit(2)

    if not content.strip():
        print("Error: File is empty.", file=sys.stderr)
        sys.exit(2)

    checker = AntiDetectionChecker(content, file_path)
    checker.check_all()

    if args.json_output:
        output = checker.get_summary()
        output["findings"] = [asdict(f) for f in checker.findings]
        if not args.verbose:
            output["findings"] = [f for f in output["findings"] if f["severity"] != "info"]
        print(json.dumps(output, indent=2))
    else:
        print(format_text_report(checker, verbose=args.verbose))

    # Exit code based on risk
    score = checker.get_risk_score()
    if score <= 10:
        sys.exit(0)
    elif score <= 50:
        sys.exit(1)
    else:
        sys.exit(2)


if __name__ == "__main__":
    main()
324
engineering/browser-automation/form_automation_builder.py
Normal file
@@ -0,0 +1,324 @@
#!/usr/bin/env python3
"""
Form Automation Builder - Generates Playwright form-fill automation scripts.

Takes a JSON field specification and target URL, then produces a ready-to-run
Playwright script that fills forms, handles multi-step flows, and manages
file uploads.

No external dependencies - uses only Python standard library.
"""

import argparse
import json
import os
import sys
import textwrap
from datetime import datetime


SUPPORTED_FIELD_TYPES = {
    "text": "page.fill('{selector}', '{value}')",
    "password": "page.fill('{selector}', '{value}')",
    "email": "page.fill('{selector}', '{value}')",
    "textarea": "page.fill('{selector}', '{value}')",
    "select": "page.select_option('{selector}', value='{value}')",
    # check/uncheck is decided per-field in generate_field_action
    "checkbox": "page.check('{selector}')",
    "radio": "page.check('{selector}')",
    "file": "page.set_input_files('{selector}', '{value}')",
    "click": "page.click('{selector}')",
}


def validate_fields(fields):
    """Validate the field specification format. Returns list of issues."""
    issues = []
    if not isinstance(fields, list):
        issues.append("Top-level structure must be a JSON array of field objects.")
        return issues

    for i, field in enumerate(fields):
        if not isinstance(field, dict):
            issues.append(f"Field {i}: must be a JSON object.")
            continue
        if "selector" not in field:
            issues.append(f"Field {i}: missing required 'selector' key.")
        if "type" not in field:
            issues.append(f"Field {i}: missing required 'type' key.")
        elif field["type"] not in SUPPORTED_FIELD_TYPES:
            issues.append(
                f"Field {i}: unsupported type '{field['type']}'. "
                f"Supported: {', '.join(sorted(SUPPORTED_FIELD_TYPES.keys()))}"
            )
        if field.get("type") not in ("checkbox", "radio", "click") and "value" not in field:
            issues.append(f"Field {i}: missing 'value' for type '{field.get('type', '?')}'.")

    return issues


def generate_field_action(field, indent=8):
    """Generate the Playwright action line for a single field."""
    ftype = field["type"]
    selector = field["selector"]
    value = field.get("value", "")
    label = field.get("label", selector)
    prefix = " " * indent

    lines = []
    lines.append(f'{prefix}# {label}')

    if ftype == "checkbox":
        if field.get("value", "true").lower() in ("true", "yes", "1", "on"):
            lines.append(f'{prefix}await page.check("{selector}")')
        else:
            lines.append(f'{prefix}await page.uncheck("{selector}")')
    elif ftype == "radio":
        lines.append(f'{prefix}await page.check("{selector}")')
    elif ftype == "click":
        lines.append(f'{prefix}await page.click("{selector}")')
    elif ftype == "select":
        lines.append(f'{prefix}await page.select_option("{selector}", value="{value}")')
    elif ftype == "file":
        lines.append(f'{prefix}await page.set_input_files("{selector}", "{value}")')
    else:
        # text, password, email, textarea
        lines.append(f'{prefix}await page.fill("{selector}", "{value}")')

    # Add optional wait_after
    wait_after = field.get("wait_after")
    if wait_after:
        lines.append(f'{prefix}await page.wait_for_selector("{wait_after}")')

    return "\n".join(lines)


def build_form_script(url, fields, output_format="script"):
    """Build a Playwright form automation script from the field specification."""

    issues = validate_fields(fields)
    if issues:
        return None, issues

    if output_format == "json":
        config = {
            "url": url,
            "fields": fields,
            "field_count": len(fields),
            "field_types": list(set(f["type"] for f in fields)),
            "has_file_upload": any(f["type"] == "file" for f in fields),
            "generated_at": datetime.now().isoformat(),
        }
        return config, None

    # Group fields into steps if step markers are present
    steps = {}
    for field in fields:
        step = field.get("step", 1)
        if step not in steps:
            steps[step] = []
        steps[step].append(field)

    multi_step = len(steps) > 1

    # Generate step functions (bodies use generate_field_action's default
    # 8-space indent)
    step_functions = []
    for step_num in sorted(steps.keys()):
        step_fields = steps[step_num]
        actions = "\n".join(generate_field_action(f) for f in step_fields)

        if multi_step:
            fn = textwrap.dedent(f"""\
async def fill_step_{step_num}(page):
        \"\"\"Fill form step {step_num} ({len(step_fields)} fields).\"\"\"
        print(f"Filling step {step_num}...")
{actions}
        print(f"Step {step_num} complete.")
""")
        else:
            fn = textwrap.dedent(f"""\
async def fill_form(page):
        \"\"\"Fill form ({len(step_fields)} fields).\"\"\"
        print("Filling form...")
{actions}
        print("Form filled.")
""")
        step_functions.append(fn)

    step_functions_str = "\n\n".join(step_functions)

    # Generate main() call sequence
    if multi_step:
        step_calls = "\n".join(
            f"        await fill_step_{n}(page)" for n in sorted(steps.keys())
        )
    else:
        step_calls = "        await fill_form(page)"

    submit_selector = None
    for field in fields:
        if field.get("type") == "click" and field.get("is_submit"):
            submit_selector = field["selector"]
            break

    submit_block = ""
    if submit_selector:
        # Indented to sit inside main()'s async-with block in the template below.
        submit_block = f"""
        # Submit
        await page.click("{submit_selector}")
        await page.wait_for_load_state("networkidle")
        print("Form submitted.")
"""

    script = textwrap.dedent(f'''\
#!/usr/bin/env python3
"""
Auto-generated Playwright form automation script.
Target: {url}
Fields: {len(fields)}
Steps: {len(steps)}
Generated: {datetime.now().isoformat()}

Requirements:
    pip install playwright
    playwright install chromium
"""

import asyncio
import random
from playwright.async_api import async_playwright

URL = "{url}"

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


{step_functions_str}

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={{"width": 1920, "height": 1080}},
            user_agent=random.choice(USER_AGENTS),
        )
        page = await context.new_page()

        await page.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {{get: () => undefined}});"
        )

        print(f"Navigating to {{URL}}...")
        await page.goto(URL, wait_until="networkidle")

{step_calls}
{submit_block}
        print("Automation complete.")
        await browser.close()


if __name__ == "__main__":
    asyncio.run(main())
''')

    return script, None


def main():
    parser = argparse.ArgumentParser(
        description="Generate Playwright form-fill automation scripts from a JSON field specification.",
        epilog=textwrap.dedent("""\
Examples:
  %(prog)s --url https://example.com/signup --fields fields.json
  %(prog)s --url https://example.com/signup --fields fields.json --output fill_form.py
  %(prog)s --url https://example.com/signup --fields fields.json --json

Field specification format (fields.json):
  [
    {"selector": "#email", "type": "email", "value": "user@example.com", "label": "Email"},
    {"selector": "#password", "type": "password", "value": "s3cret"},
    {"selector": "#country", "type": "select", "value": "US"},
    {"selector": "#terms", "type": "checkbox", "value": "true"},
    {"selector": "#avatar", "type": "file", "value": "/path/to/photo.jpg"},
    {"selector": "button[type='submit']", "type": "click", "is_submit": true}
  ]

Supported field types: text, password, email, textarea, select, checkbox, radio, file, click

Multi-step forms: Add "step": N to each field to group into steps.
"""),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--url",
        required=True,
        help="Target form URL",
    )
    parser.add_argument(
        "--fields",
        required=True,
        help="Path to JSON file containing field specifications",
    )
    parser.add_argument(
        "--output",
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        default=False,
        help="Output JSON configuration instead of Python script",
    )

    args = parser.parse_args()

    # Load fields
    fields_path = os.path.abspath(args.fields)
    if not os.path.isfile(fields_path):
        print(f"Error: Fields file not found: {fields_path}", file=sys.stderr)
        sys.exit(2)

    try:
        with open(fields_path, "r") as f:
            fields = json.load(f)
    except json.JSONDecodeError as e:
        print(f"Error: Invalid JSON in {fields_path}: {e}", file=sys.stderr)
        sys.exit(2)

    output_format = "json" if args.json_output else "script"
    result, errors = build_form_script(
        url=args.url,
        fields=fields,
        output_format=output_format,
    )

    if errors:
        print("Validation errors:", file=sys.stderr)
        for err in errors:
            print(f"  - {err}", file=sys.stderr)
        sys.exit(2)

    if args.json_output:
        output_text = json.dumps(result, indent=2)
    else:
        output_text = result

    if args.output:
        output_path = os.path.abspath(args.output)
        with open(output_path, "w") as f:
            f.write(output_text)
        if not args.json_output:
            os.chmod(output_path, 0o755)
        print(f"Written to {output_path}", file=sys.stderr)
        sys.exit(0)
    else:
        print(output_text)
        sys.exit(0)


if __name__ == "__main__":
    main()
@@ -0,0 +1,453 @@
# Anti-Detection Patterns for Browser Automation

This reference covers techniques to make Playwright automation less detectable by anti-bot services. These are defense-in-depth measures — no single technique is sufficient, but combining them significantly reduces detection risk.

## Detection Vectors

Anti-bot systems detect automation through multiple signals. Understanding what they check helps you counter them effectively.

### Tier 1: Trivial Detection (Every Site Checks These)

1. **navigator.webdriver** — Set to `true` by all automation frameworks
2. **User-Agent string** — Default headless UA contains "HeadlessChrome"
3. **WebGL renderer** — Headless Chrome reports "SwiftShader" or "Google SwiftShader"

### Tier 2: Common Detection (Most Anti-Bot Services)

4. **Viewport/screen dimensions** — Unusual sizes flag automation
5. **Plugins array** — Empty in headless mode, populated in real browsers
6. **Languages** — Missing or mismatched locale
7. **Request timing** — Machine-speed interactions
8. **Mouse movement** — No mouse events between clicks

### Tier 3: Advanced Detection (Cloudflare, DataDome, PerimeterX)

9. **Canvas fingerprint** — Headless renders differently
10. **WebGL fingerprint** — GPU-specific rendering variations
11. **Audio fingerprint** — AudioContext processing differences
12. **Font enumeration** — Different available fonts in headless
13. **Behavioral analysis** — Scroll patterns, click patterns, reading time
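The Tier 1 and 2 signals above can be inspected from a live Playwright page with one-line JavaScript expressions. This is a hedged sketch: `PROBES` and `probe_signals` are illustrative names, not a Playwright API, though `page.evaluate` itself is standard.

```python
# JS expressions probing the Tier 1/2 signals listed above.
PROBES = {
    "webdriver": "navigator.webdriver",            # should be undefined after stealth fixes
    "user_agent": "navigator.userAgent",           # should not contain 'HeadlessChrome'
    "plugin_count": "navigator.plugins.length",    # 0 is a headless giveaway
    "languages": "navigator.languages.join(',')",  # empty or mismatched locale is a flag
    "viewport": "`${window.innerWidth}x${window.innerHeight}`",
}


async def probe_signals(page):
    """Evaluate each probe on a live page and return {name: value}."""
    return {name: await page.evaluate(js) for name, js in PROBES.items()}
```

Run `await probe_signals(page)` once before and once after applying the stealth techniques below to see which signals each fix changes.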
## Stealth Techniques

### 1. WebDriver Flag Removal

The most critical fix. Every anti-bot check starts here.

```python
await page.add_init_script("""
    // Remove webdriver flag
    Object.defineProperty(navigator, 'webdriver', {
        get: () => undefined,
    });

    // Remove Playwright-specific properties
    delete window.__playwright;
    delete window.__pw_manual;
""")
```

### 2. User Agent Configuration

Match the user agent to the browser you are launching. A Chrome UA with Firefox-specific headers is a red flag.

```python
# Chrome 120 on Windows 10 (most common configuration globally)
CHROME_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

# Chrome 120 on macOS
CHROME_MAC = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

# Chrome 120 on Linux
CHROME_LINUX = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

# Firefox 121 on Windows
FIREFOX_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0"
```

**Rules:**

- Update UAs every 2-3 months as browser versions increment
- Match UA platform to the `navigator.platform` override
- If using Chromium, use Chrome UAs. If Firefox, use Firefox UAs.
- Never use obviously fake or ancient UAs
|
||||
### 3. Viewport and Screen Properties
|
||||
|
||||
Common real-world screen resolutions (from analytics data):
|
||||
|
||||
| Resolution | Market Share | Use For |
|
||||
|-----------|-------------|---------|
|
||||
| 1920x1080 | ~23% | Default choice |
|
||||
| 1366x768 | ~14% | Laptop simulation |
|
||||
| 1536x864 | ~9% | Scaled laptop |
|
||||
| 1440x900 | ~7% | MacBook |
|
||||
| 2560x1440 | ~5% | High-end desktop |
|
||||
|
||||
```python
|
||||
import random
|
||||
|
||||
VIEWPORTS = [
|
||||
{"width": 1920, "height": 1080},
|
||||
{"width": 1366, "height": 768},
|
||||
{"width": 1536, "height": 864},
|
||||
{"width": 1440, "height": 900},
|
||||
]
|
||||
|
||||
viewport = random.choice(VIEWPORTS)
|
||||
context = await browser.new_context(
|
||||
viewport=viewport,
|
||||
screen=viewport, # screen should match viewport
|
||||
)
|
||||
```
|
||||
|
||||
### 4. Navigator Properties Hardening

```python
STEALTH_INIT = """
// Plugins (headless Chrome has 0 plugins, real Chrome has 3-5)
Object.defineProperty(navigator, 'plugins', {
    get: () => {
        const plugins = [
            { name: 'Chrome PDF Plugin', filename: 'internal-pdf-viewer' },
            { name: 'Chrome PDF Viewer', filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai' },
            { name: 'Native Client', filename: 'internal-nacl-plugin' },
        ];
        plugins.length = 3;
        return plugins;
    },
});

// Languages
Object.defineProperty(navigator, 'languages', {
    get: () => ['en-US', 'en'],
});

// Platform (match to user agent)
Object.defineProperty(navigator, 'platform', {
    get: () => 'Win32',  // or 'MacIntel' for macOS UA
});

// Hardware concurrency (real browsers report CPU cores)
Object.defineProperty(navigator, 'hardwareConcurrency', {
    get: () => 8,
});

// Device memory (Chrome-specific)
Object.defineProperty(navigator, 'deviceMemory', {
    get: () => 8,
});

// Connection info
Object.defineProperty(navigator, 'connection', {
    get: () => ({
        effectiveType: '4g',
        rtt: 50,
        downlink: 10,
        saveData: false,
    }),
});
"""

await context.add_init_script(STEALTH_INIT)
```
### 5. WebGL Fingerprint Evasion

Headless Chrome uses SwiftShader for WebGL, which anti-bot services detect.

```python
# Option A: Launch with a real GPU (headed mode on a machine with a GPU)
browser = await p.chromium.launch(headless=False)

# Option B: Override WebGL renderer info
await page.add_init_script("""
    const getParameter = WebGLRenderingContext.prototype.getParameter;
    WebGLRenderingContext.prototype.getParameter = function(parameter) {
        if (parameter === 37445) {
            return 'Intel Inc.';  // UNMASKED_VENDOR_WEBGL
        }
        if (parameter === 37446) {
            return 'Intel(R) Iris(TM) Plus Graphics 640';  // UNMASKED_RENDERER_WEBGL
        }
        return getParameter.call(this, parameter);
    };
""")
```

### 6. Canvas Fingerprint Noise

Anti-bot services render text/shapes to a canvas and hash the output. Headless Chrome produces a different hash.

```python
await page.add_init_script("""
    const originalToDataURL = HTMLCanvasElement.prototype.toDataURL;
    HTMLCanvasElement.prototype.toDataURL = function(type) {
        if (type === 'image/png' || type === undefined) {
            // Add minimal noise to the canvas to change the fingerprint
            const ctx = this.getContext('2d');
            if (ctx) {
                const imageData = ctx.getImageData(0, 0, this.width, this.height);
                for (let i = 0; i < imageData.data.length; i += 4) {
                    // Flip the lowest bit of one channel (imperceptible)
                    imageData.data[i] = imageData.data[i] ^ 1;
                }
                ctx.putImageData(imageData, 0, 0);
            }
        }
        return originalToDataURL.apply(this, arguments);
    };
""")
```
## Request Throttling Patterns

### Human-Like Delays

Real users do not click at machine speed. Add realistic delays between actions.

```python
import random
import asyncio


async def human_delay(action_type="browse"):
    """Add a realistic delay based on action type."""
    delays = {
        "browse": (1.0, 3.0),   # Browsing between pages
        "read": (2.0, 8.0),     # Reading content
        "fill": (0.3, 0.8),     # Between form fields
        "click": (0.1, 0.5),    # Before clicking
        "scroll": (0.5, 1.5),   # Between scroll actions
    }
    min_s, max_s = delays.get(action_type, (0.5, 2.0))
    await asyncio.sleep(random.uniform(min_s, max_s))
```

### Request Rate Limiting

```python
import time
import asyncio


class RateLimiter:
    """Enforce a minimum delay between requests."""

    def __init__(self, min_interval_seconds=1.0):
        self.min_interval = min_interval_seconds
        self.last_request_time = 0

    async def wait(self):
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_interval:
            await asyncio.sleep(self.min_interval - elapsed)
        self.last_request_time = time.time()


# Usage
limiter = RateLimiter(min_interval_seconds=2.0)
for url in urls:
    await limiter.wait()
    await page.goto(url)
```

### Exponential Backoff on Errors

```python
import random
import asyncio


async def with_backoff(coro_factory, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.1f}s...")
            await asyncio.sleep(delay)
```
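To see the retry behavior in isolation, here is a runnable sketch that swaps the page navigation for a stand-in coroutine that fails twice before succeeding. The `flaky` helper and the short `base_delay` are purely illustrative; in real use the factory would wrap something like `lambda: page.goto(url)`.

```python
import asyncio
import random


async def with_backoff(coro_factory, max_retries=5, base_delay=0.01):
    """Retry an awaitable factory with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.01)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.3f}s...")
            await asyncio.sleep(delay)


calls = {"count": 0}


async def flaky():
    # Stand-in for page.goto(url): fails on the first two attempts.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient error")
    return "loaded"


result = asyncio.run(with_backoff(flaky))
print(result, "after", calls["count"], "attempts")
```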
## Proxy Rotation Strategies

### Single Proxy

```python
browser = await p.chromium.launch(
    proxy={"server": "http://proxy.example.com:8080"}
)
```

### Authenticated Proxy

```python
context = await browser.new_context(
    proxy={
        "server": "http://proxy.example.com:8080",
        "username": "user",
        "password": "pass",
    }
)
```

### Rotating Proxy Pool

```python
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]


async def create_context_with_proxy(browser):
    proxy = random.choice(PROXIES)
    return await browser.new_context(
        proxy={"server": proxy}
    )
```

### Per-Request Proxy (via Context Rotation)

Playwright does not support per-request proxy switching. Achieve it by creating a new context for each request or batch:

```python
async def scrape_url(browser, url, proxy):
    context = await browser.new_context(proxy={"server": proxy})
    page = await context.new_page()
    try:
        await page.goto(url)
        data = await extract_data(page)
        return data
    finally:
        await context.close()
```

### SOCKS5 Proxy

```python
browser = await p.chromium.launch(
    proxy={"server": "socks5://proxy.example.com:1080"}
)
```
## Headless Detection Avoidance

### Running Chrome Channel Instead of Chromium

The bundled Chromium binary has different properties than a real Chrome install. Using the Chrome channel makes the browser indistinguishable from a normal install.

```python
# Use installed Chrome instead of bundled Chromium
browser = await p.chromium.launch(channel="chrome", headless=True)
```

**Requirements:** Chrome must be installed on the system.

### New Headless Mode (Chrome 112+)

Chrome's "new headless" mode is harder to detect than the old one:

```python
browser = await p.chromium.launch(
    args=["--headless=new"],
)
```

### Avoiding Common Flags

Do NOT pass these flags — they are headless-detection signals:

- `--disable-gpu` (old headless workaround, not needed)
- `--no-sandbox` (security risk, detectable)
- `--disable-setuid-sandbox` (same as above)
## Behavioral Evasion

### Mouse Movement Simulation

Anti-bot services track mouse events. A click without preceding mouse movement is suspicious.

```python
import random
import asyncio


async def human_click(page, selector):
    """Click with preceding mouse movement."""
    element = await page.query_selector(selector)
    box = await element.bounding_box()
    if box:
        # Move to the element with a slight offset
        x = box["x"] + box["width"] / 2 + random.uniform(-5, 5)
        y = box["y"] + box["height"] / 2 + random.uniform(-5, 5)
        await page.mouse.move(x, y, steps=random.randint(5, 15))
        await asyncio.sleep(random.uniform(0.05, 0.2))
        await page.mouse.click(x, y)
```

### Typing Speed Variation

```python
async def human_type(page, selector, text):
    """Type with variable speed like a human."""
    await page.click(selector)
    for char in text:
        await page.keyboard.type(char)
        # Faster for common keys, slower for special characters
        if char in "aeiou tnrs":
            await asyncio.sleep(random.uniform(0.03, 0.08))
        else:
            await asyncio.sleep(random.uniform(0.08, 0.20))
```

### Scroll Behavior

Real users scroll gradually, not in instant jumps.

```python
async def human_scroll(page, distance=None):
    """Scroll down gradually like a human."""
    if distance is None:
        distance = random.randint(300, 800)

    current = 0
    while current < distance:
        step = random.randint(50, 150)
        await page.mouse.wheel(0, step)
        current += step
        await asyncio.sleep(random.uniform(0.05, 0.15))
```
## Detection Testing

### Self-Check Script

Navigate to these URLs to test your stealth configuration:

- `https://bot.sannysoft.com/` — Comprehensive bot detection test
- `https://abrahamjuliot.github.io/creepjs/` — Advanced fingerprint analysis
- `https://browserleaks.com/webgl` — WebGL fingerprint details
- `https://browserleaks.com/canvas` — Canvas fingerprint details

### Quick Test Pattern

```python
async def test_stealth(page):
    """Navigate to a detection test page and report results."""
    await page.goto("https://bot.sannysoft.com/")
    await page.wait_for_timeout(3000)

    # Check for failed tests
    failed = await page.eval_on_selector_all(
        "td.failed",
        "els => els.map(e => e.parentElement.querySelector('td').textContent)"
    )

    if failed:
        print(f"FAILED checks: {failed}")
    else:
        print("All checks passed.")

    await page.screenshot(path="stealth_test.png", full_page=True)
```
## Recommended Stealth Stack

For most automation tasks, apply these in order of priority:

1. **WebDriver flag removal** — Critical, takes 2 lines
2. **Custom user agent** — Critical, takes 1 line
3. **Viewport configuration** — High priority, takes 1 line
4. **Request delays** — High priority, add `random.uniform()` calls
5. **Navigator properties** — Medium priority, init script block
6. **Chrome channel** — Medium priority, one launch option
7. **WebGL override** — Low priority unless hitting advanced anti-bot
8. **Canvas noise** — Low priority unless hitting advanced anti-bot
9. **Proxy rotation** — Only for high-volume or repeated scraping
10. **Behavioral simulation** — Only for sites with behavioral analysis
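The top of this list can be bundled into one small helper. The sketch below only assembles `browser.new_context()` keyword arguments plus the init script; `user_agent`, `viewport`, `screen`, and `locale` are standard Playwright context options, while `stealth_context_options` itself is an illustrative name.

```python
import random

# Item 1: webdriver flag removal, passed to context.add_init_script()
STEALTH_INIT = "Object.defineProperty(navigator, 'webdriver', {get: () => undefined});"

# Items 2-3: realistic user agent and viewport
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
VIEWPORTS = [
    {"width": 1920, "height": 1080},
    {"width": 1366, "height": 768},
    {"width": 1536, "height": 864},
]


def stealth_context_options():
    """Assemble kwargs for browser.new_context() covering stack items 2-3."""
    viewport = random.choice(VIEWPORTS)
    return {
        "user_agent": random.choice(USER_AGENTS),
        "viewport": viewport,
        "screen": viewport,  # screen should match viewport
        "locale": "en-US",
    }
```

Usage would be `context = await browser.new_context(**stealth_context_options())` followed by `await context.add_init_script(STEALTH_INIT)`; item 4 (request delays) still belongs in your action loop.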
@@ -0,0 +1,580 @@
# Data Extraction Recipes

Practical patterns for extracting structured data from web pages using Playwright. Each recipe is a self-contained pattern you can adapt to your target site.

## CSS Selector Patterns for Common Structures

### E-Commerce Product Listings

```python
PRODUCT_SELECTORS = {
    "container": "div.product-card, article.product, li.product-item",
    "fields": {
        "title": "h2.product-title, h3.product-name, [data-testid='product-title']",
        "price": "span.price, .product-price, [data-testid='price']",
        "original_price": "span.original-price, .was-price, del",
        "rating": "span.rating, .star-rating, [data-rating]",
        "review_count": "span.review-count, .num-reviews",
        "image_url": "img.product-image::attr(src), img::attr(data-src)",
        "product_url": "a.product-link::attr(href), h2 a::attr(href)",
        "availability": "span.stock-status, .availability",
    }
}
```

### News/Blog Article Listings

```python
ARTICLE_SELECTORS = {
    "container": "article, div.post, div.article-card",
    "fields": {
        "headline": "h2 a, h3 a, .article-title",
        "summary": "p.excerpt, .article-summary, .post-excerpt",
        "author": "span.author, .byline, [rel='author']",
        "date": "time, span.date, .published-date",
        "category": "span.category, a.tag, .article-category",
        "url": "h2 a::attr(href), .article-title a::attr(href)",
        "image_url": "img.thumbnail::attr(src), .article-image img::attr(src)",
    }
}
```

### Job Listings

```python
JOB_SELECTORS = {
    "container": "div.job-card, li.job-listing, article.job",
    "fields": {
        "title": "h2.job-title, a.job-link, [data-testid='job-title']",
        "company": "span.company-name, .employer, [data-testid='company']",
        "location": "span.location, .job-location, [data-testid='location']",
        "salary": "span.salary, .compensation, [data-testid='salary']",
        "job_type": "span.job-type, .employment-type",
        "posted_date": "time, span.posted, .date-posted",
        "url": "a.job-link::attr(href), h2 a::attr(href)",
    }
}
```

### Search Engine Results

```python
SERP_SELECTORS = {
    "container": "div.g, .search-result, li.result",
    "fields": {
        "title": "h3, .result-title",
        "url": "a::attr(href), cite",
        "snippet": "div.VwiC3b, .result-snippet, .search-description",
        "displayed_url": "cite, .result-url",
    }
}
```

## Table Extraction Recipes

### Simple HTML Table to JSON

The most common extraction pattern. Works for any standard `<table>` with `<thead>` and `<tbody>`.

```python
async def extract_table(page, table_selector="table"):
    """Extract an HTML table into a list of dictionaries."""
    data = await page.evaluate(f"""
        (selector) => {{
            const table = document.querySelector(selector);
            if (!table) return null;

            // Get headers
            const headers = Array.from(table.querySelectorAll('thead th, thead td'))
                .map(th => th.textContent.trim());

            // If no thead, use first row as headers
            if (headers.length === 0) {{
                const firstRow = table.querySelector('tr');
                if (firstRow) {{
                    headers.push(...Array.from(firstRow.querySelectorAll('th, td'))
                        .map(cell => cell.textContent.trim()));
                }}
            }}

            // Get data rows
            const rows = Array.from(table.querySelectorAll('tbody tr'));
            return rows.map(row => {{
                const cells = Array.from(row.querySelectorAll('td'));
                const obj = {{}};
                cells.forEach((cell, i) => {{
                    if (i < headers.length) {{
                        obj[headers[i]] = cell.textContent.trim();
                    }}
                }});
                return obj;
            }});
        }}
    """, table_selector)
    return data or []
```
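
If the table HTML has already been captured (for example via `await page.content()`), the same list-of-dicts shape can be produced offline with the standard library's `html.parser`. A minimal sketch, assuming a well-formed `<thead>`/`<tbody>` table:

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect a simple <thead>/<tbody> table into a list of dicts."""
    def __init__(self):
        super().__init__()
        self.headers, self.rows = [], []
        self.in_head = self.in_cell = False
        self.cell = ""
        self.row = None

    def handle_starttag(self, tag, attrs):
        if tag == "thead":
            self.in_head = True
        elif tag == "tr" and not self.in_head:
            self.row = []  # Start a data row
        elif tag in ("td", "th"):
            self.in_cell, self.cell = True, ""

    def handle_endtag(self, tag):
        if tag == "thead":
            self.in_head = False
        elif tag in ("td", "th"):
            self.in_cell = False
            text = self.cell.strip()
            if self.in_head:
                self.headers.append(text)
            elif self.row is not None:
                self.row.append(text)
        elif tag == "tr" and self.row is not None:
            self.rows.append(dict(zip(self.headers, self.row)))
            self.row = None

    def handle_data(self, data):
        if self.in_cell:
            self.cell += data

html = """<table><thead><tr><th>Name</th><th>Price</th></tr></thead>
<tbody><tr><td>Widget</td><td>9.99</td></tr></tbody></table>"""
p = TableParser()
p.feed(html)
print(p.rows)  # [{'Name': 'Widget', 'Price': '9.99'}]
```

This avoids a browser round-trip per table, at the cost of handling only static markup.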

### Table with Links and Attributes

When table cells contain links or data attributes, not just text:

```python
async def extract_rich_table(page, table_selector="table"):
    """Extract table including links and data attributes."""
    return await page.evaluate(f"""
        (selector) => {{
            const table = document.querySelector(selector);
            if (!table) return [];

            const headers = Array.from(table.querySelectorAll('thead th'))
                .map(th => th.textContent.trim());

            return Array.from(table.querySelectorAll('tbody tr')).map(row => {{
                const obj = {{}};
                Array.from(row.querySelectorAll('td')).forEach((cell, i) => {{
                    const key = headers[i] || `col_${{i}}`;
                    obj[key] = cell.textContent.trim();

                    // Extract link if present
                    const link = cell.querySelector('a');
                    if (link) {{
                        obj[key + '_url'] = link.href;
                    }}

                    // Extract data attributes
                    for (const attr of cell.attributes) {{
                        if (attr.name.startsWith('data-')) {{
                            obj[key + '_' + attr.name] = attr.value;
                        }}
                    }}
                }});
                return obj;
            }});
        }}
    """, table_selector)
```

### Multi-Page Table (Paginated)

```python
import random


async def extract_paginated_table(page, table_selector, next_selector, max_pages=50):
    """Extract data from a table that spans multiple pages."""
    all_rows = []
    headers = None

    for page_num in range(max_pages):
        # Extract current page
        page_data = await page.evaluate(f"""
            (selector) => {{
                const table = document.querySelector(selector);
                if (!table) return {{ headers: [], rows: [] }};

                const hs = Array.from(table.querySelectorAll('thead th'))
                    .map(th => th.textContent.trim());

                const rs = Array.from(table.querySelectorAll('tbody tr')).map(row =>
                    Array.from(row.querySelectorAll('td')).map(td => td.textContent.trim())
                );

                return {{ headers: hs, rows: rs }};
            }}
        """, table_selector)

        if headers is None and page_data["headers"]:
            headers = page_data["headers"]

        for row in page_data["rows"]:
            all_rows.append(dict(zip(headers or [], row)))

        # Check for next page
        next_btn = page.locator(next_selector)
        if await next_btn.count() == 0 or await next_btn.is_disabled():
            break

        await next_btn.click()
        await page.wait_for_load_state("networkidle")
        await page.wait_for_timeout(random.randint(800, 2000))

    return all_rows
```

## Product Listing Extraction

### Generic Listing Extractor

Works for any repeating card/list pattern:

```python
async def extract_listings(page, container_sel, field_map):
    """
    Extract data from repeating elements.

    field_map: dict mapping field names to CSS selectors.
    Special suffixes:
        ::attr(name) — extract attribute instead of text
        ::html — extract innerHTML
    """
    items = []
    cards = await page.query_selector_all(container_sel)

    for card in cards:
        item = {}
        for field_name, selector in field_map.items():
            try:
                if "::attr(" in selector:
                    sel, attr = selector.split("::attr(")
                    attr = attr.rstrip(")")
                    el = await card.query_selector(sel)
                    item[field_name] = await el.get_attribute(attr) if el else None
                elif selector.endswith("::html"):
                    sel = selector.replace("::html", "")
                    el = await card.query_selector(sel)
                    item[field_name] = await el.inner_html() if el else None
                else:
                    el = await card.query_selector(selector)
                    item[field_name] = (await el.text_content()).strip() if el else None
            except Exception:
                item[field_name] = None
        items.append(item)

    return items
```

### With Price Parsing

```python
import re


def parse_price(text):
    """Extract numeric price from text like '$1,234.56' or '1.234,56 EUR'."""
    if not text:
        return None
    # Remove currency symbols and whitespace
    cleaned = re.sub(r'[^\d.,]', '', text.strip())
    if not cleaned:
        return None
    # Handle European format (1.234,56)
    if ',' in cleaned and '.' in cleaned:
        if cleaned.rindex(',') > cleaned.rindex('.'):
            cleaned = cleaned.replace('.', '').replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    elif ',' in cleaned:
        # Could be 1,234 or 1,23 — check decimal places
        parts = cleaned.split(',')
        if len(parts[-1]) <= 2:
            cleaned = cleaned.replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    try:
        return float(cleaned)
    except ValueError:
        return None


async def extract_products_with_prices(page, container_sel, field_map, price_field="price"):
    """Extract listings and parse prices into floats."""
    items = await extract_listings(page, container_sel, field_map)
    for item in items:
        if price_field in item and item[price_field]:
            item[f"{price_field}_raw"] = item[price_field]
            item[price_field] = parse_price(item[price_field])
    return items
```
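
A few concrete inputs make the branching in the parser easier to verify. The function is repeated here so the check runs on its own:

```python
import re

def parse_price(text):
    # Same heuristics as above: strip currency, detect EU vs US separators.
    if not text:
        return None
    cleaned = re.sub(r'[^\d.,]', '', text.strip())
    if not cleaned:
        return None
    if ',' in cleaned and '.' in cleaned:
        if cleaned.rindex(',') > cleaned.rindex('.'):
            cleaned = cleaned.replace('.', '').replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    elif ',' in cleaned:
        parts = cleaned.split(',')
        if len(parts[-1]) <= 2:
            cleaned = cleaned.replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    try:
        return float(cleaned)
    except ValueError:
        return None

print(parse_price("$1,234.56"))     # 1234.56 (US format)
print(parse_price("1.234,56 EUR"))  # 1234.56 (European format)
print(parse_price("1,234"))         # 1234.0 (comma as thousands separator)
print(parse_price("Contact us"))    # None (no digits)
```

Note the ambiguity: a lone comma with three digits after it is treated as a thousands separator, which is a heuristic, not a guarantee.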

## Pagination Handling

### Next-Button Pagination

The most common pattern. Click "Next" until the button disappears or is disabled.

```python
import random


async def paginate_via_next_button(page, next_selector, content_selector, max_pages=100):
    """
    Yield page objects as you paginate through results.

    next_selector: CSS selector for the "Next" button/link
    content_selector: CSS selector to wait for after navigation (confirms new page loaded)
    """
    pages_scraped = 0

    while pages_scraped < max_pages:
        yield page  # Caller extracts data from current page
        pages_scraped += 1

        next_btn = page.locator(next_selector)
        if await next_btn.count() == 0:
            break

        try:
            is_disabled = await next_btn.is_disabled()
        except Exception:
            is_disabled = True

        if is_disabled:
            break

        await next_btn.click()
        await page.wait_for_selector(content_selector, state="attached")
        await page.wait_for_timeout(random.randint(500, 1500))
```
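
Because this is an async generator, the caller drives it with `async for`, extracting data from each yielded page. A runnable sketch of that consumption pattern, using a hypothetical stand-in paginator so no browser is needed:

```python
import asyncio

async def fake_paginator(max_pages=3):
    # Stand-in for paginate_via_next_button: yields one "page" per iteration.
    for page_num in range(max_pages):
        yield {"page": page_num, "rows": [f"row-{page_num}-{i}" for i in range(2)]}

async def main():
    all_rows = []
    # The real generator is consumed the same way: extract from each yielded page.
    async for page in fake_paginator():
        all_rows.extend(page["rows"])
    return all_rows

rows = asyncio.run(main())
print(len(rows))  # 6
```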

### URL-Based Pagination

When pages follow a predictable URL pattern:

```python
import random


async def paginate_via_url(page, url_template, start=1, max_pages=100):
    """
    Navigate through pages using URL parameters.

    url_template: URL with {page} placeholder, e.g., "https://example.com/search?page={page}"
    """
    for page_num in range(start, start + max_pages):
        url = url_template.format(page=page_num)
        response = await page.goto(url, wait_until="networkidle")

        if response and response.status == 404:
            break

        yield page, page_num
        await page.wait_for_timeout(random.randint(800, 2500))
```

### Infinite Scroll

For sites that load content as you scroll:

```python
import random


async def paginate_via_scroll(page, item_selector, max_scrolls=100, no_change_limit=3):
    """
    Scroll to load more content until no new items appear.

    item_selector: CSS selector for individual items (used to count progress)
    no_change_limit: Stop after N scrolls with no new items
    """
    previous_count = 0
    no_change_streak = 0

    for scroll_num in range(max_scrolls):
        # Count current items
        current_count = await page.locator(item_selector).count()

        if current_count == previous_count:
            no_change_streak += 1
            if no_change_streak >= no_change_limit:
                break
        else:
            no_change_streak = 0

        previous_count = current_count

        # Scroll to bottom
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(random.randint(1000, 2500))

        # Check for a "Load More" button that might appear
        load_more = page.locator("button:has-text('Load More'), button:has-text('Show More')")
        if await load_more.count() > 0 and await load_more.is_visible():
            await load_more.click()
            await page.wait_for_timeout(random.randint(1000, 2000))

    # Re-count at the end so the return value is defined even if the loop never ran
    return await page.locator(item_selector).count()
```

### Load-More Button

Simpler variant of infinite scroll where content loads via a button:

```python
import random


async def paginate_via_load_more(page, button_selector, item_selector, max_clicks=50):
    """Click a 'Load More' button repeatedly until it disappears."""
    for click_num in range(max_clicks):
        btn = page.locator(button_selector)
        if await btn.count() == 0 or not await btn.is_visible():
            break

        count_before = await page.locator(item_selector).count()
        await btn.click()

        # Wait for new items to appear
        try:
            await page.wait_for_function(
                f"document.querySelectorAll('{item_selector}').length > {count_before}",
                timeout=10000,
            )
        except Exception:
            break  # No new items loaded

        await page.wait_for_timeout(random.randint(500, 1500))

    return await page.locator(item_selector).count()
```

## Nested Data Extraction

### Comments with Replies (Threaded)

```python
async def extract_threaded_comments(page, parent_selector=".comments"):
    """Recursively extract threaded comments."""
    return await page.evaluate(f"""
        (parentSelector) => {{
            function extractThread(container) {{
                const comments = [];
                const directChildren = container.querySelectorAll(':scope > .comment');

                for (const comment of directChildren) {{
                    const authorEl = comment.querySelector('.author, .username');
                    const textEl = comment.querySelector('.comment-text, .comment-body');
                    const dateEl = comment.querySelector('time, .date');
                    const repliesContainer = comment.querySelector('.replies, .children');

                    comments.push({{
                        author: authorEl ? authorEl.textContent.trim() : null,
                        text: textEl ? textEl.textContent.trim() : null,
                        date: dateEl ? (dateEl.getAttribute('datetime') || dateEl.textContent.trim()) : null,
                        replies: repliesContainer ? extractThread(repliesContainer) : [],
                    }});
                }}

                return comments;
            }}

            const root = document.querySelector(parentSelector);
            return root ? extractThread(root) : [];
        }}
    """, parent_selector)
```

### Nested Categories (Sidebar/Menu)

```python
async def extract_category_tree(page, root_selector="nav.categories"):
    """Extract nested category structure from a sidebar or menu."""
    return await page.evaluate(f"""
        (rootSelector) => {{
            function extractLevel(container) {{
                const items = [];
                const directItems = container.querySelectorAll(':scope > li, :scope > div.category');

                for (const item of directItems) {{
                    const link = item.querySelector(':scope > a');
                    const subMenu = item.querySelector(':scope > ul, :scope > div.sub-categories');

                    items.push({{
                        name: link ? link.textContent.trim() : item.textContent.trim().split('\\n')[0],
                        url: link ? link.href : null,
                        children: subMenu ? extractLevel(subMenu) : [],
                    }});
                }}

                return items;
            }}

            const root = document.querySelector(rootSelector);
            return root ? extractLevel(root.querySelector('ul') || root) : [];
        }}
    """, root_selector)
```
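
The nested `{name, url, children}` records are often easier to store flat. A small pure-Python helper (the field names are assumptions matching the extractor above) that flattens the tree into breadcrumb paths:

```python
def flatten_tree(nodes, parent_path=()):
    """Yield (breadcrumb_path, url) pairs from a nested category tree."""
    for node in nodes:
        path = parent_path + (node["name"],)
        yield " > ".join(path), node.get("url")
        # Recurse into children, carrying the accumulated path.
        yield from flatten_tree(node.get("children", []), path)

tree = [
    {"name": "Electronics", "url": "/electronics", "children": [
        {"name": "Phones", "url": "/electronics/phones", "children": []},
    ]},
]
flat = list(flatten_tree(tree))
print(flat)
# [('Electronics', '/electronics'), ('Electronics > Phones', '/electronics/phones')]
```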

### Accordion/Expandable Content

Some content is hidden behind accordion/expand toggles. Click to reveal, then extract.

```python
async def extract_accordion(page, toggle_selector, content_selector):
    """Expand all accordion items and extract their content."""
    items = []
    toggles = await page.query_selector_all(toggle_selector)

    for toggle in toggles:
        title = (await toggle.text_content()).strip()

        # Click to expand
        await toggle.click()
        await page.wait_for_timeout(300)

        # Find the associated content panel
        content = await toggle.evaluate_handle(
            f"el => el.closest('.accordion-item, .faq-item')?.querySelector('{content_selector}')"
        )

        # evaluate_handle returns a JSHandle even when the result is null;
        # as_element() is None in that case, so check it instead of the handle itself
        body = None
        content_el = content.as_element()
        if content_el:
            body = await content_el.text_content()
            if body:
                body = body.strip()

        items.append({"title": title, "content": body})

    return items
```

## Data Cleaning Utilities

### Post-Extraction Cleaning

```python
import re


def clean_text(text):
    """Normalize whitespace, remove zero-width characters."""
    if not text:
        return None
    # Remove zero-width characters
    text = re.sub(r'[\u200b\u200c\u200d\ufeff]', '', text)
    # Normalize whitespace
    text = re.sub(r'\s+', ' ', text).strip()
    return text if text else None


def clean_url(url, base_url=None):
    """Convert relative URLs to absolute."""
    if not url:
        return None
    url = url.strip()
    if url.startswith("//"):
        return "https:" + url
    if url.startswith("/") and base_url:
        return base_url.rstrip("/") + url
    return url


def deduplicate(items, key_field):
    """Remove duplicate items based on a key field."""
    seen = set()
    unique = []
    for item in items:
        key = item.get(key_field)
        if key and key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```
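
`clean_url` handles the two common cases; for full relative-reference resolution (`../`, sibling paths, scheme-relative links) the standard library's `urljoin` already implements RFC 3986. A sketch:

```python
from urllib.parse import urljoin

# urljoin resolves any relative reference against the page URL.
page_url = "https://example.com/catalog/page2"
print(urljoin(page_url, "/item/1"))   # https://example.com/item/1
print(urljoin(page_url, "item/1"))    # https://example.com/catalog/item/1
print(urljoin(page_url, "../sale"))   # https://example.com/sale
print(urljoin(page_url, "//cdn.example.com/a.png"))  # https://cdn.example.com/a.png
```

Passing the current `page.url` as the base handles sites whose listing pages live under nested paths.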

### Output Formats

```python
import json
import csv


def to_jsonl(items, file_path):
    """Write items as JSON Lines (one JSON object per line)."""
    with open(file_path, "w") as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")


def to_csv(items, file_path):
    """Write items as CSV."""
    if not items:
        return
    headers = list(items[0].keys())
    with open(file_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=headers)
        writer.writeheader()
        writer.writerows(items)


def to_json(items, file_path, indent=2):
    """Write items as a JSON array."""
    with open(file_path, "w") as f:
        json.dump(items, f, indent=indent, ensure_ascii=False)
```
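
JSON Lines is append-friendly and streams back one record at a time, which matters for long scraping runs. A self-contained round-trip sketch:

```python
import json
import tempfile

items = [{"name": "Widget", "price": 9.99}, {"name": "Gadget", "price": 19.5}]

with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for item in items:
        f.write(json.dumps(item, ensure_ascii=False) + "\n")
    path = f.name

# Read back lazily, one record per line, without loading the whole file.
with open(path) as f:
    restored = [json.loads(line) for line in f]

print(restored == items)  # True
```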

@@ -0,0 +1,492 @@

# Playwright Browser API Reference (Automation Focus)

This reference covers Playwright's Python async API for browser automation tasks — NOT testing. For test-specific APIs (assertions, fixtures, test runners), see playwright-pro.

## Browser Launch & Context

### Launching the Browser

```python
from playwright.async_api import async_playwright

async with async_playwright() as p:
    # Chromium (recommended for most automation)
    browser = await p.chromium.launch(headless=True)

    # Firefox (better for some anti-detection scenarios)
    browser = await p.firefox.launch(headless=True)

    # WebKit (Safari engine — useful for Apple-specific sites)
    browser = await p.webkit.launch(headless=True)
```

**Launch options:**

| Option | Type | Default | Purpose |
|--------|------|---------|---------|
| `headless` | bool | True | Run without visible window |
| `slow_mo` | int | 0 | Milliseconds to slow each operation (debugging) |
| `proxy` | dict | None | Proxy server configuration |
| `args` | list | [] | Additional Chromium flags |
| `downloads_path` | str | None | Directory for downloads |
| `channel` | str | None | Browser channel: "chrome", "msedge" |

### Browser Contexts (Session Isolation)

Browser contexts are isolated environments within a single browser instance. Each context has its own cookies, localStorage, and cache. Use them instead of launching multiple browsers.

```python
# Create isolated context
context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    user_agent="Mozilla/5.0 ...",
    locale="en-US",
    timezone_id="America/New_York",
    geolocation={"latitude": 40.7128, "longitude": -74.0060},
    permissions=["geolocation"],
)

# Multiple contexts share one browser (resource efficient)
context_a = await browser.new_context()  # User A session
context_b = await browser.new_context()  # User B session
```

### Storage State (Session Persistence)

```python
# Save state after login (cookies + localStorage)
await context.storage_state(path="auth_state.json")

# Restore state in new context
context = await browser.new_context(storage_state="auth_state.json")
```

## Page Navigation

### Basic Navigation

```python
page = await context.new_page()

# Navigate with different wait strategies
await page.goto("https://example.com")  # Default: "load"
await page.goto("https://example.com", wait_until="domcontentloaded")  # Faster
await page.goto("https://example.com", wait_until="networkidle")  # Wait for network quiet
await page.goto("https://example.com", timeout=30000)  # Custom timeout (ms)
```

**`wait_until` options:**

- `"load"` — wait for the `load` event (all resources loaded)
- `"domcontentloaded"` — DOM is ready, images/styles may still load
- `"networkidle"` — no network requests for 500ms (best for SPAs)
- `"commit"` — response received, before any rendering

### Wait Strategies

```python
import re

# Wait for a specific element to appear
await page.wait_for_selector("div.content", state="visible")
await page.wait_for_selector("div.loading", state="hidden")  # Wait for loading to finish
await page.wait_for_selector("table tbody tr", state="attached")  # In DOM but maybe not visible

# Wait for URL change
await page.wait_for_url("**/dashboard**")
await page.wait_for_url(re.compile(r"/dashboard/\d+"))

# Wait for specific network response
async with page.expect_response("**/api/data*") as resp_info:
    await page.click("button.load")
response = await resp_info.value
json_data = await response.json()

# Wait for page load state
await page.wait_for_load_state("networkidle")

# Fixed wait (use sparingly — prefer the methods above)
await page.wait_for_timeout(1000)  # milliseconds
```

### Navigation History

```python
await page.go_back()
await page.go_forward()
await page.reload()
```

## Element Interaction

### Finding Elements

```python
# Single element (returns first match)
element = await page.query_selector("css=div.product")
element = await page.query_selector("xpath=//div[@class='product']")

# Multiple elements
elements = await page.query_selector_all("div.product")

# Locator API (recommended — auto-waits, re-queries on each action)
locator = page.locator("div.product")
count = await locator.count()
first = locator.first
nth = locator.nth(2)
```

**Locator vs query_selector:**

- `query_selector` — returns an ElementHandle at a point in time. Can go stale if the DOM changes.
- `locator` — returns a Locator that re-queries each time you interact with it. Preferred for reliability.

### Clicking

```python
await page.click("button.submit")
await page.click("a:has-text('Next')")
await page.dblclick("div.editable")
await page.click("button", position={"x": 10, "y": 10})  # Click at offset
await page.click("button", force=True)  # Skip actionability checks
await page.click("button", modifiers=["Shift"])  # With modifier key
```

### Text Input

```python
# Fill (clears existing content first)
await page.fill("input#email", "user@example.com")

# Type (simulates keystroke-by-keystroke input — slower, more realistic)
await page.type("input#search", "query text", delay=50)  # 50ms between keys

# Press specific keys
await page.press("input#search", "Enter")
await page.press("body", "Control+a")
```

### Dropdowns & Select

```python
# Native <select> element
await page.select_option("select#country", value="US")
await page.select_option("select#country", label="United States")
await page.select_option("select#tags", value=["tag1", "tag2"])  # Multi-select

# Custom dropdown (non-native)
await page.click("div.dropdown-trigger")
await page.click("li.option:has-text('United States')")
```

### Checkboxes & Radio Buttons

```python
await page.check("input#agree")
await page.uncheck("input#newsletter")
is_checked = await page.is_checked("input#agree")
```

### File Upload

```python
# Standard file input
await page.set_input_files("input[type='file']", "/path/to/file.pdf")
await page.set_input_files("input[type='file']", ["/path/a.pdf", "/path/b.pdf"])

# Clear file selection
await page.set_input_files("input[type='file']", [])

# Non-standard upload (drag-and-drop zones)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```

### Hover & Focus

```python
await page.hover("div.menu-item")
await page.focus("input#search")
```

## Data Extraction

### Text Content

```python
# Get text content of an element
text = await page.text_content("h1.title")
inner_text = await page.inner_text("div.description")  # Visible text only
inner_html = await page.inner_html("div.content")  # HTML markup

# Get attribute
href = await page.get_attribute("a.link", "href")
src = await page.get_attribute("img.photo", "src")
```

### JavaScript Evaluation

```python
# Evaluate in page context
title = await page.evaluate("document.title")
scroll_height = await page.evaluate("document.body.scrollHeight")

# Evaluate on a specific element
text = await page.eval_on_selector("h1", "el => el.textContent")
texts = await page.eval_on_selector_all("li", "els => els.map(e => e.textContent.trim())")

# Complex extraction
data = await page.evaluate("""
    () => {
        const rows = document.querySelectorAll('table tbody tr');
        return Array.from(rows).map(row => {
            const cells = row.querySelectorAll('td');
            return {
                name: cells[0]?.textContent.trim(),
                value: cells[1]?.textContent.trim(),
            };
        });
    }
""")
```

### Screenshots & PDF

```python
# Full page screenshot
await page.screenshot(path="page.png", full_page=True)

# Viewport screenshot
await page.screenshot(path="viewport.png")

# Element screenshot
await page.locator("div.chart").screenshot(path="chart.png")

# PDF (Chromium only)
await page.pdf(path="page.pdf", format="A4", print_background=True)

# Screenshot as bytes (for processing without saving)
buffer = await page.screenshot()
```

## Network Interception

### Monitoring Requests

```python
# Listen for all responses
page.on("response", lambda response: print(f"{response.status} {response.url}"))

# Wait for a specific API call
async with page.expect_response("**/api/products*") as resp:
    await page.click("button.load")
response = await resp.value
data = await response.json()
```

### Blocking Resources (Speed Up Scraping)

```python
# Block images, fonts, and CSS to speed up scraping
await page.route("**/*.{png,jpg,jpeg,gif,svg,woff,woff2,ttf}", lambda route: route.abort())
await page.route("**/*.css", lambda route: route.abort())

# Block specific domains (ads, analytics)
await page.route("**/google-analytics.com/**", lambda route: route.abort())
await page.route("**/facebook.com/**", lambda route: route.abort())
```

### Modifying Requests

```python
import json

# Add custom headers
await page.route("**/*", lambda route: route.continue_(headers={
    **route.request.headers,
    "X-Custom-Header": "value"
}))

# Mock API responses
await page.route("**/api/data", lambda route: route.fulfill(
    status=200,
    content_type="application/json",
    body=json.dumps({"items": []}),
))
```

## Dialog Handling

```python
# Auto-accept all dialogs
page.on("dialog", lambda dialog: dialog.accept())

# Handle specific dialog types
async def handle_dialog(dialog):
    if dialog.type == "confirm":
        await dialog.accept()
    elif dialog.type == "prompt":
        await dialog.accept("my input")
    elif dialog.type == "alert":
        await dialog.dismiss()

page.on("dialog", handle_dialog)
```

## File Downloads

```python
# Wait for download to start
async with page.expect_download() as dl_info:
    await page.click("a.download-link")
download = await dl_info.value

# Save to specific path
await download.save_as("/path/to/downloads/" + download.suggested_filename)

# Get the path to the downloaded temp file
path = await download.path()

# Set download behavior at context level
context = await browser.new_context(accept_downloads=True)
```

## Frames & Iframes

```python
# Access iframe by selector
frame = page.frame_locator("iframe#content")
await frame.locator("button.submit").click()

# Access frame by name
frame = page.frame(name="editor")

# Access all frames
for frame in page.frames:
    print(frame.url)
```

## Cookie Management

```python
# Get all cookies
cookies = await context.cookies()

# Get cookies for specific URL
cookies = await context.cookies(["https://example.com"])

# Add cookies
await context.add_cookies([{
    "name": "session",
    "value": "abc123",
    "domain": "example.com",
    "path": "/",
    "httpOnly": True,
    "secure": True,
}])

# Clear cookies
await context.clear_cookies()
```

## Concurrency Patterns

### Multiple Pages in One Context

```python
# Open multiple tabs in the same session
pages = []
for url in urls:
    page = await context.new_page()
    await page.goto(url)
    pages.append(page)

# Process all pages
for page in pages:
    data = await extract_data(page)
    await page.close()
```

### Multiple Contexts for Parallel Sessions

```python
import asyncio
import random


async def scrape_with_context(browser, url):
    # USER_AGENTS: a list of user-agent strings defined elsewhere
    context = await browser.new_context(user_agent=random.choice(USER_AGENTS))
    page = await context.new_page()
    await page.goto(url)
    data = await extract_data(page)
    await context.close()
    return data

# Run 5 concurrent scraping tasks
tasks = [scrape_with_context(browser, url) for url in urls[:5]]
results = await asyncio.gather(*tasks)
```
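
Unbounded `gather` over a long URL list can overwhelm both the target site and local memory; a semaphore caps how many tasks run at once. A runnable sketch with a stand-in coroutine in place of real browser work:

```python
import asyncio

async def bounded_gather(coro_fn, args_list, limit=5):
    """Run coro_fn over args_list with at most `limit` tasks in flight."""
    sem = asyncio.Semaphore(limit)

    async def guarded(arg):
        async with sem:
            return await coro_fn(arg)

    # gather preserves input order in its results
    return await asyncio.gather(*(guarded(a) for a in args_list))

async def fake_scrape(url):
    await asyncio.sleep(0.01)  # Stand-in for browser work
    return {"url": url, "ok": True}

urls = [f"https://example.com/page/{i}" for i in range(12)]
results = asyncio.run(bounded_gather(fake_scrape, urls, limit=4))
print(len(results))  # 12
```

In the real pipeline, `fake_scrape` would be replaced by a function like `scrape_with_context` above.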

## Init Scripts (Stealth)

Init scripts run before any page script, in every new page/context.

```python
# Remove webdriver flag
await context.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")

# Override plugins (headless Chrome has empty plugins)
await context.add_init_script("""
    Object.defineProperty(navigator, 'plugins', {
        get: () => [1, 2, 3, 4, 5],
    });
""")

# Override languages
await context.add_init_script("""
    Object.defineProperty(navigator, 'languages', {
        get: () => ['en-US', 'en'],
    });
""")

# From file
await context.add_init_script(path="stealth.js")
```

## Common Automation Patterns

### Scrolling

```python
# Scroll to bottom
await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")

# Scroll element into view
await page.locator("div.target").scroll_into_view_if_needed()

# Smooth scroll simulation
await page.evaluate("""
    async () => {
        const delay = ms => new Promise(r => setTimeout(r, ms));
        for (let i = 0; i < document.body.scrollHeight; i += 300) {
            window.scrollTo(0, i);
            await delay(100);
        }
    }
""")
```
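
For infinite-scroll feeds, a common variant is to keep scrolling until the document height stops growing. A sketch (the function name and parameters are assumptions; it works with any object exposing Playwright's `page.evaluate`):

```python
import asyncio

async def scroll_to_end(page, step_pause=0.5, max_rounds=30):
    """Scroll until document height stops growing (infinite feeds)."""
    last_height = 0
    for _ in range(max_rounds):
        height = await page.evaluate("document.body.scrollHeight")
        if height == last_height:
            break  # nothing new loaded since the last scroll
        last_height = height
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await asyncio.sleep(step_pause)  # give lazy content time to render
    return last_height
```

The `max_rounds` cap prevents an endless loop on feeds that genuinely never stop growing.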

### Clipboard Operations

```python
# Copy text
await page.evaluate("navigator.clipboard.writeText('hello')")

# Paste via keyboard (use "Meta+v" on macOS)
await page.keyboard.press("Control+v")
```
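
Note that `navigator.clipboard` calls are permission-gated: in headless Chromium they fail unless the context grants clipboard access. A sketch (the helper name is an assumption; the permission names are Chromium-specific):

```python
async def new_clipboard_context(browser):
    """Create a context where navigator.clipboard read/write is permitted."""
    # "clipboard-read"/"clipboard-write" are Chromium permission names;
    # Firefox and WebKit handle clipboard access differently.
    return await browser.new_context(
        permissions=["clipboard-read", "clipboard-write"],
    )
```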

### Shadow DOM

```python
# Playwright's CSS engine pierces open shadow DOM automatically;
# >> chains selectors, resolving each within the previous match
await page.locator("my-component >> .inner-button").click()

# Equivalent form with explicit css= engine prefixes
await page.locator("css=host-element >> css=.shadow-child").click()
```
248
engineering/browser-automation/scraping_toolkit.py
Normal file
@@ -0,0 +1,248 @@
#!/usr/bin/env python3
"""
Scraping Toolkit - Generates Playwright scraping script skeletons.

Takes a URL pattern and CSS selectors as input and produces a ready-to-run
Playwright scraping script with pagination support, error handling, and
anti-detection patterns baked in.

No external dependencies - uses only Python standard library.
"""

import argparse
import json
import os
import sys
import textwrap
from datetime import datetime


def build_scraping_script(url, selectors, paginate=False, output_format="script"):
    """Build a Playwright scraping script from the given parameters."""

    selector_list = [s.strip() for s in selectors.split(",") if s.strip()]
    if not selector_list:
        return None, "No valid selectors provided."

    field_names = []
    for sel in selector_list:
        # Derive field name from selector: .product-title -> product_title
        name = sel.strip("#.[]()>:+~ ")
        name = name.replace("-", "_").replace(" ", "_").replace(".", "_")
        # Remove non-alphanumeric
        name = "".join(c if c.isalnum() or c == "_" else "" for c in name)
        if not name:
            name = f"field_{len(field_names)}"
        field_names.append(name)

    field_map = dict(zip(field_names, selector_list))

    if output_format == "json":
        config = {
            "url": url,
            "selectors": field_map,
            "pagination": {
                "enabled": paginate,
                "next_selector": "a:has-text('Next'), button:has-text('Next')",
                "max_pages": 50,
            },
            "anti_detection": {
                "random_delay_ms": [800, 2500],
                "user_agent_rotation": True,
                "viewport": {"width": 1920, "height": 1080},
            },
            "output": {
                "format": "jsonl",
                "deduplicate_by": field_names[0] if field_names else None,
            },
            "generated_at": datetime.now().isoformat(),
        }
        return config, None

    # Build Python script
    fields_dict_str = "{\n"
    for name, sel in field_map.items():
        fields_dict_str += f'    "{name}": "{sel}",\n'
    fields_dict_str += "}"

    pagination_block = ""
    if paginate:
        pagination_block = textwrap.dedent("""\

            # --- Pagination ---
            async def scrape_all_pages(page, container, fields, next_sel, max_pages=50):
                all_items = []
                for page_num in range(max_pages):
                    print(f"Scraping page {page_num + 1}...")
                    items = await extract_items(page, container, fields)
                    all_items.extend(items)

                    next_btn = page.locator(next_sel)
                    if await next_btn.count() == 0:
                        break
                    try:
                        is_disabled = await next_btn.is_disabled()
                    except Exception:
                        is_disabled = True
                    if is_disabled:
                        break

                    await next_btn.click()
                    await page.wait_for_load_state("networkidle")
                    await asyncio.sleep(random.uniform(0.8, 2.5))

                return all_items
            """)

    main_call = "scrape_all_pages(page, CONTAINER, FIELDS, NEXT_SELECTOR)" if paginate else "extract_items(page, CONTAINER, FIELDS)"

    # The template is kept flush-left so interpolated multi-line blocks
    # (fields_dict_str, pagination_block) need no re-indentation or dedent.
    script = f'''#!/usr/bin/env python3
"""
Auto-generated Playwright scraping script.
Target: {url}
Generated: {datetime.now().isoformat()}

Requirements:
    pip install playwright
    playwright install chromium
"""

import asyncio
import json
import random
from playwright.async_api import async_playwright

# --- Configuration ---
URL = "{url}"
CONTAINER = "body"  # Adjust to the repeating item container selector
FIELDS = {fields_dict_str}
NEXT_SELECTOR = "a:has-text('Next'), button:has-text('Next')"

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


async def extract_items(page, container_selector, field_map):
    """Extract structured data from repeating elements."""
    items = []
    cards = await page.query_selector_all(container_selector)
    for card in cards:
        item = {{}}
        for name, selector in field_map.items():
            el = await card.query_selector(selector)
            if el:
                item[name] = (await el.text_content() or "").strip()
            else:
                item[name] = None
        items.append(item)
    return items

{pagination_block}

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={{"width": 1920, "height": 1080}},
            user_agent=random.choice(USER_AGENTS),
        )
        page = await context.new_page()

        # Remove WebDriver flag
        await page.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {{get: () => undefined}});"
        )

        print(f"Navigating to {{URL}}...")
        await page.goto(URL, wait_until="networkidle")

        data = await {main_call}
        print(json.dumps(data, indent=2, ensure_ascii=False))

        await browser.close()


if __name__ == "__main__":
    asyncio.run(main())
'''

    return script, None


def main():
    parser = argparse.ArgumentParser(
        description="Generate Playwright scraping script skeletons from URL and selectors.",
        epilog=(
            "Examples:\n"
            "  %(prog)s --url https://example.com/products --selectors '.title,.price,.rating'\n"
            "  %(prog)s --url https://example.com/search --selectors '.name,.desc' --paginate\n"
            "  %(prog)s --url https://example.com --selectors '.item' --json\n"
            "  %(prog)s --url https://example.com --selectors '.item' --output scraper.py\n"
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "--url",
        required=True,
        help="Target URL to scrape",
    )
    parser.add_argument(
        "--selectors",
        required=True,
        help="Comma-separated CSS selectors for data fields (e.g. '.title,.price,.rating')",
    )
    parser.add_argument(
        "--paginate",
        action="store_true",
        default=False,
        help="Include pagination handling in generated script",
    )
    parser.add_argument(
        "--output",
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_output",
        default=False,
        help="Output JSON configuration instead of Python script",
    )

    args = parser.parse_args()

    output_format = "json" if args.json_output else "script"
    result, error = build_scraping_script(
        url=args.url,
        selectors=args.selectors,
        paginate=args.paginate,
        output_format=output_format,
    )

    if error:
        print(f"Error: {error}", file=sys.stderr)
        sys.exit(2)

    if args.json_output:
        output_text = json.dumps(result, indent=2)
    else:
        output_text = result

    if args.output:
        output_path = os.path.abspath(args.output)
        with open(output_path, "w") as f:
            f.write(output_text)
        if not args.json_output:
            os.chmod(output_path, 0o755)
        print(f"Written to {output_path}", file=sys.stderr)
        sys.exit(0)
    else:
        print(output_text)
        sys.exit(0)


if __name__ == "__main__":
    main()
586
engineering/spec-driven-workflow/SKILL.md
Normal file
@@ -0,0 +1,586 @@
---
name: "spec-driven-workflow"
description: "Use when the user asks to write specs before code, define acceptance criteria, plan features before implementation, generate tests from specifications, or follow spec-first development practices."
---

# Spec-Driven Workflow — POWERFUL

## Overview

Spec-driven workflow enforces a single, non-negotiable rule: **write the specification BEFORE you write any code.** Not alongside. Not after. Before.

This is not documentation. This is a contract. A spec defines what the system MUST do, what it SHOULD do, and what it explicitly WILL NOT do. Every line of code you write traces back to a requirement in the spec. Every test traces back to an acceptance criterion. If it is not in the spec, it does not get built.

### Why Spec-First Matters

1. **Eliminates rework.** 60-80% of defects originate from requirements, not implementation. Catching ambiguity in a spec costs minutes; catching it in production costs days.
2. **Forces clarity.** If you cannot write what the system should do in plain language, you do not understand the problem well enough to write code.
3. **Enables parallelism.** Once a spec is approved, frontend, backend, QA, and documentation can all start simultaneously.
4. **Creates accountability.** The spec is the definition of done. No arguments about whether a feature is "complete" — either it satisfies the acceptance criteria or it does not.
5. **Feeds TDD directly.** Acceptance criteria in Given/When/Then format translate 1:1 into test cases. The spec IS the test plan.

### The Iron Law

```
NO CODE WITHOUT AN APPROVED SPEC.
NO EXCEPTIONS. NO "QUICK PROTOTYPES." NO "I'LL DOCUMENT IT LATER."
```

If the spec is not written, reviewed, and approved, implementation does not begin. Period.

---

## The Spec Format

Every spec follows this structure. No sections are optional — if a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not forgotten.

### 1. Title and Context

```markdown
# Spec: [Feature Name]

**Author:** [name]
**Date:** [ISO 8601]
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** [list]
**Related specs:** [links]

## Context

[Why does this feature exist? What problem does it solve? What is the business
motivation? Include links to user research, support tickets, or metrics that
justify this work. 2-4 paragraphs maximum.]
```
|
||||
### 2. Functional Requirements (RFC 2119)
|
||||
|
||||
Use RFC 2119 keywords precisely:
|
||||
|
||||
| Keyword | Meaning |
|
||||
|---------|---------|
|
||||
| **MUST** | Absolute requirement. Failing this means the implementation is non-conformant. |
|
||||
| **MUST NOT** | Absolute prohibition. Doing this means the implementation is broken. |
|
||||
| **SHOULD** | Recommended. May be omitted with documented justification. |
|
||||
| **SHOULD NOT** | Discouraged. May be included with documented justification. |
|
||||
| **MAY** | Optional. Purely at the implementer's discretion. |
|
||||
|
||||
```markdown
|
||||
## Functional Requirements
|
||||
|
||||
- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
|
||||
- FR-2: The system MUST reject tokens older than 24 hours.
|
||||
- FR-3: The system SHOULD support refresh token rotation.
|
||||
- FR-4: The system MAY cache user profiles for up to 5 minutes.
|
||||
- FR-5: The system MUST NOT store plaintext passwords under any circumstance.
|
||||
```
|
||||
|
||||
Number every requirement. Use `FR-` prefix. Each requirement is a single, testable statement.
|
||||
|
||||

### 3. Non-Functional Requirements

```markdown
## Non-Functional Requirements

### Performance
- NFR-P1: Login flow MUST complete in < 500ms (p95) under normal load.
- NFR-P2: Token validation MUST complete in < 50ms (p99).

### Security
- NFR-S1: All tokens MUST be transmitted over TLS 1.2+.
- NFR-S2: The system MUST rate-limit login attempts to 5/minute per IP.

### Accessibility
- NFR-A1: Login form MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.

### Scalability
- NFR-SC1: The system SHOULD handle 10,000 concurrent sessions.

### Reliability
- NFR-R1: The authentication service MUST maintain 99.9% uptime.
```
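
Measurable thresholds like these map directly onto code and tests. As one illustration (not part of this skill's tooling), a requirement like NFR-S2 could be enforced with a small sliding-window limiter; an in-memory sketch, where a real deployment would more likely use a shared store such as Redis:

```python
import time
from collections import defaultdict, deque

class LoginRateLimiter:
    """Allow at most `limit` attempts per `window` seconds, per IP (NFR-S2)."""

    def __init__(self, limit=5, window=60.0):
        self.limit = limit
        self.window = window
        self._attempts = defaultdict(deque)  # ip -> timestamps of recent attempts

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self._attempts[ip]
        # Drop attempts that have aged out of the window
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # caller should answer 429 with Retry-After
        q.append(now)
        return True

    def retry_after(self, ip, now=None):
        """Seconds until the oldest in-window attempt expires (for Retry-After)."""
        now = time.monotonic() if now is None else now
        q = self._attempts[ip]
        return max(0.0, self.window - (now - q[0])) if q else 0.0
```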

### 4. Acceptance Criteria (Given/When/Then)

Every functional requirement maps to one or more acceptance criteria. Use Gherkin syntax:

```markdown
## Acceptance Criteria

### AC-1: Successful login (FR-1)
Given a user with valid credentials
When they submit the login form with correct email and password
Then they receive a valid access token
And they are redirected to the dashboard
And the login event is logged with timestamp and IP

### AC-2: Expired token rejection (FR-2)
Given a user with an access token issued 25 hours ago
When they make an API request with that token
Then they receive a 401 Unauthorized response
And the response body contains error code "TOKEN_EXPIRED"
And they are NOT redirected (API clients handle their own flow)

### AC-3: Rate limiting (NFR-S2)
Given an IP address that has made 5 failed login attempts in the last minute
When a 6th login attempt arrives from that IP
Then the request is rejected with 429 Too Many Requests
And the response includes a Retry-After header
```

### 5. Edge Cases and Error Scenarios

```markdown
## Edge Cases

- EC-1: User submits login form with empty email → Show validation error, do not hit API.
- EC-2: OAuth provider is down → Show "Service temporarily unavailable", retry after 30s.
- EC-3: User has account but no password (social-only) → Redirect to social login.
- EC-4: Concurrent login from two devices → Both sessions are valid (no single-session enforcement).
- EC-5: Token expires mid-request → Complete the current request, return warning header.
```

### 6. API Contracts

Define request/response shapes using TypeScript-style notation:

````markdown
## API Contracts

### POST /api/auth/login
Request:
```typescript
interface LoginRequest {
  email: string;        // MUST be valid email format
  password: string;     // MUST be 8-128 characters
  rememberMe?: boolean; // Default: false
}
```

Success Response (200):
```typescript
interface LoginResponse {
  accessToken: string;  // JWT, expires in 24h
  refreshToken: string; // Opaque, expires in 30d
  expiresIn: number;    // Seconds until access token expires
  user: {
    id: string;
    email: string;
    displayName: string;
  };
}
```

Error Response (401):
```typescript
interface AuthError {
  error: "INVALID_CREDENTIALS" | "TOKEN_EXPIRED" | "ACCOUNT_LOCKED";
  message: string;
  retryAfter?: number;  // Seconds, present for rate-limited responses
}
```
````

### 7. Data Models

```markdown
## Data Models

### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| email | string | Unique, max 255 chars, valid email format |
| passwordHash | string | bcrypt, never exposed via API |
| createdAt | timestamp | UTC, immutable |
| lastLoginAt | timestamp | UTC, updated on each login |
| loginAttempts | integer | Reset to 0 on successful login |
| lockedUntil | timestamp | Null if not locked |
```

### 8. Out of Scope

Explicit exclusions prevent scope creep:

```markdown
## Out of Scope

- OS-1: Multi-factor authentication (separate spec: SPEC-042)
- OS-2: Social login providers beyond Google and GitHub
- OS-3: Admin impersonation of user accounts
- OS-4: Password complexity rules beyond minimum length (deferred to v2)
- OS-5: Session management UI (users cannot see/revoke active sessions yet)
```

If someone asks for an out-of-scope item during implementation, point them to this section. Do not build it.

---

## Bounded Autonomy Rules

These rules define when an agent (human or AI) MUST stop and ask for guidance vs. when they can proceed independently.

### STOP and Ask When:

1. **Scope creep detected.** The implementation requires something not in the spec. Even if it seems obviously needed, STOP. The spec might have excluded it deliberately.
2. **Ambiguity exceeds 30%.** If you cannot determine the correct behavior from the spec for more than 30% of a given requirement, the spec is incomplete. Do not guess.
3. **Breaking changes required.** The implementation would change an existing API contract, database schema, or public interface. Always escalate.
4. **Security implications.** Any change that touches authentication, authorization, encryption, or PII handling requires explicit approval.
5. **Performance characteristics unknown.** If a requirement says "MUST complete in < 500ms" but you have no way to measure or guarantee that, escalate before implementing a guess.
6. **Cross-team dependencies.** If the spec requires coordination with another team or service, confirm the dependency before building against it.

### Continue Autonomously When:

1. **Spec is clear and unambiguous** for the current task.
2. **All acceptance criteria have passing tests** and you are refactoring internals.
3. **Changes are non-breaking** — no public API, schema, or behavior changes.
4. **Implementation is a direct translation** of a well-defined acceptance criterion.
5. **Error handling follows established patterns** already documented in the codebase.
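
The stop/continue rules above amount to a simple gate: any raised STOP condition blocks autonomous work. A sketch of that logic (the flag names are illustrative, not part of any real tooling):

```python
STOP_CONDITIONS = {
    "scope_creep": "Implementation needs something not in the spec",
    "ambiguity": "Correct behavior unclear for >30% of the requirement",
    "breaking_change": "Public API, schema, or interface would change",
    "security": "Touches auth, encryption, or PII handling",
    "unmeasurable_nfr": "Performance threshold cannot be verified",
    "cross_team": "Unconfirmed dependency on another team or service",
}

def autonomy_gate(flags):
    """Return ('stop', reasons) if any STOP condition is raised, else ('continue', [])."""
    reasons = [STOP_CONDITIONS[f] for f in flags if f in STOP_CONDITIONS]
    return ("stop", reasons) if reasons else ("continue", [])
```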

### Escalation Protocol

When you must stop, provide:

```markdown
## Escalation: [Brief Title]

**Blocked on:** [requirement ID, e.g., FR-3]
**Question:** [Specific, answerable question — not "what should I do?"]
**Options considered:**
A. [Option] — Pros: [...] Cons: [...]
B. [Option] — Pros: [...] Cons: [...]
**My recommendation:** [A or B, with reasoning]
**Impact of waiting:** [What is blocked until this is resolved?]
```

Never escalate without a recommendation. Never present an open-ended question. Always give options.

See `references/bounded_autonomy_rules.md` for the complete decision matrix.

---

## Workflow — 6 Phases

### Phase 1: Gather Requirements

**Goal:** Understand what needs to be built and why.

1. **Interview the user.** Ask:
   - What problem does this solve?
   - Who are the users?
   - What does success look like?
   - What explicitly should NOT be built?
2. **Read existing code.** Understand the current system before proposing changes.
3. **Identify constraints.** Performance budgets, security requirements, backward compatibility.
4. **List unknowns.** Every unknown is a risk. Surface them now, not during implementation.

**Exit criteria:** You can explain the feature to someone unfamiliar with the project in 2 minutes.

### Phase 2: Write Spec

**Goal:** Produce a complete spec document following The Spec Format above.

1. Fill every section of the template. No section left blank.
2. Number all requirements (FR-*, NFR-*, AC-*, EC-*, OS-*).
3. Use RFC 2119 keywords precisely.
4. Write acceptance criteria in Given/When/Then format.
5. Define API contracts with TypeScript-style types.
6. List explicit exclusions in Out of Scope.

**Exit criteria:** The spec can be handed to a developer who was not in the requirements meeting, and they can implement the feature without asking clarifying questions.

### Phase 3: Validate Spec

**Goal:** Verify the spec is complete, consistent, and implementable.

Run `spec_validator.py` against the spec file:

```bash
python spec_validator.py --file spec.md --strict
```

Manual validation checklist:

- [ ] Every functional requirement has at least one acceptance criterion
- [ ] Every acceptance criterion is testable (no subjective language)
- [ ] API contracts cover all endpoints mentioned in requirements
- [ ] Data models cover all entities mentioned in requirements
- [ ] Edge cases cover failure modes for every external dependency
- [ ] Out of scope is explicit about what was considered and rejected
- [ ] Non-functional requirements have measurable thresholds

**Exit criteria:** Spec scores 80+ on validator, and all manual checklist items pass.
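
The first checklist item is mechanically checkable. A minimal sketch of the kind of check a validator could perform (illustrative only; `spec_validator.py` itself may work differently):

```python
import re

def uncovered_requirements(spec_text):
    """Return FR-* IDs never referenced by any acceptance criterion heading."""
    frs = set(re.findall(r"^- (FR-\d+):", spec_text, flags=re.M))
    # AC headings reference requirements in parentheses,
    # e.g. "### AC-1: Successful login (FR-1, FR-5)"
    covered = set()
    for heading in re.findall(r"^### AC-\d+:.*$", spec_text, flags=re.M):
        covered.update(re.findall(r"FR-\d+", heading))
    return sorted(frs - covered)
```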

### Phase 4: Generate Tests

**Goal:** Extract test cases from acceptance criteria before writing implementation code.

Run `test_extractor.py` against the approved spec:

```bash
python test_extractor.py --file spec.md --framework pytest --output tests/
```

1. Each acceptance criterion becomes one or more test cases.
2. Each edge case becomes a test case.
3. Tests are stubs — they define the assertion but not the implementation.
4. All tests MUST fail initially (red phase of TDD).

**Exit criteria:** You have a test file where every test fails with "not implemented" or equivalent.

### Phase 5: Implement

**Goal:** Write code that makes failing tests pass, one acceptance criterion at a time.

1. Pick one acceptance criterion (start with the simplest).
2. Make its test(s) pass with minimal code.
3. Run the full test suite — no regressions.
4. Commit.
5. Pick the next acceptance criterion. Repeat.

**Rules:**

- Do NOT implement anything not in the spec.
- Do NOT optimize before all acceptance criteria pass.
- Do NOT refactor before all acceptance criteria pass.
- If you discover a missing requirement, STOP and update the spec first.

**Exit criteria:** All tests pass. All acceptance criteria satisfied.

### Phase 6: Self-Review

**Goal:** Verify implementation matches spec before marking done.

Run through the Self-Review Checklist below. If any item fails, fix it before declaring the task complete.

---

## Self-Review Checklist

Before marking any implementation as done, verify ALL of the following:

- [ ] **Every acceptance criterion has a passing test.** No exceptions. If AC-3 exists, a test for AC-3 exists and passes.
- [ ] **Every edge case has a test.** EC-1 through EC-N all have corresponding test cases.
- [ ] **No scope creep.** The implementation does not include features not in the spec. If you added something, either update the spec or remove it.
- [ ] **API contracts match implementation.** Request/response shapes in code match the spec exactly. Field names, types, status codes — all of it.
- [ ] **Error scenarios tested.** Every error response defined in the spec has a test that triggers it.
- [ ] **Non-functional requirements verified.** If the spec says < 500ms, you have evidence (benchmark, load test, profiling) that it meets the threshold.
- [ ] **Data model matches.** Database schema matches the spec. No extra columns, no missing constraints.
- [ ] **Out-of-scope items not built.** Double-check that nothing from the Out of Scope section leaked into the implementation.

---

## Integration with TDD Guide

Spec-driven workflow and TDD are complementary, not competing:

```
Spec-Driven Workflow             TDD (Red-Green-Refactor)
─────────────────────            ──────────────────────────
Phase 1: Gather Requirements
Phase 2: Write Spec
Phase 3: Validate Spec
Phase 4: Generate Tests     ──→  RED: Tests exist and fail
Phase 5: Implement          ──→  GREEN: Minimal code to pass
Phase 6: Self-Review        ──→  REFACTOR: Clean up internals
```

**The handoff:** Spec-driven workflow produces the test stubs (Phase 4). TDD takes over from there. The spec tells you WHAT to test. TDD tells you HOW to implement.

Use `engineering-team/tdd-guide` for:

- Red-green-refactor cycle discipline
- Coverage analysis and gap detection
- Framework-specific test patterns (Jest, Pytest, JUnit)

Use `engineering/spec-driven-workflow` for:

- Defining what to build before building it
- Acceptance criteria authoring
- Completeness validation
- Scope control

---

## Examples

### Full Spec: User Password Reset

```markdown
# Spec: Password Reset Flow

**Author:** Engineering Team
**Date:** 2026-03-25
**Status:** Approved

## Context

Users who forget their passwords currently have no self-service recovery option.
Support receives ~200 password reset requests per week, costing approximately
8 hours of support time. This feature eliminates that burden entirely.

## Functional Requirements

- FR-1: The system MUST allow users to request a password reset via email.
- FR-2: The system MUST send a reset link that expires after 1 hour.
- FR-3: The system MUST invalidate all previous reset links when a new one is requested.
- FR-4: The system MUST enforce minimum password length of 8 characters on reset.
- FR-5: The system MUST NOT reveal whether an email exists in the system.
- FR-6: The system SHOULD log all reset attempts for audit purposes.

## Acceptance Criteria

### AC-1: Request reset (FR-1, FR-5)
Given a user on the password reset page
When they enter any email address and submit
Then they see "If an account exists, a reset link has been sent"
And the response is identical whether the email exists or not

### AC-2: Valid reset link (FR-2)
Given a user who received a reset email 30 minutes ago
When they click the reset link
Then they see the password reset form

### AC-3: Expired reset link (FR-2)
Given a user who received a reset email 2 hours ago
When they click the reset link
Then they see "This link has expired. Please request a new one."

### AC-4: Previous links invalidated (FR-3)
Given a user who requested two reset emails
When they click the link from the first email
Then they see "This link is no longer valid."

## Edge Cases

- EC-1: User submits reset for non-existent email → Same success message (FR-5).
- EC-2: User clicks reset link twice → Second click shows "already used" if password was changed.
- EC-3: Email delivery fails → Log error, do not retry automatically.
- EC-4: User requests reset while already logged in → Allow it, do not force logout.

## Out of Scope

- OS-1: Security questions as alternative reset method.
- OS-2: SMS-based password reset.
- OS-3: Admin-initiated password reset (separate spec).
```

### Extracted Test Cases (from above spec)

```python
# Generated by test_extractor.py --framework pytest

class TestPasswordReset:
    def test_ac1_request_reset_existing_email(self):
        """AC-1: Request reset with existing email shows generic message."""
        # Given a user on the password reset page
        # When they enter a registered email and submit
        # Then they see "If an account exists, a reset link has been sent"
        raise NotImplementedError("Implement this test")

    def test_ac1_request_reset_nonexistent_email(self):
        """AC-1: Request reset with unknown email shows same generic message."""
        # Given a user on the password reset page
        # When they enter an unregistered email and submit
        # Then they see identical response to existing email case
        raise NotImplementedError("Implement this test")

    def test_ac2_valid_reset_link(self):
        """AC-2: Reset link works within expiry window."""
        raise NotImplementedError("Implement this test")

    def test_ac3_expired_reset_link(self):
        """AC-3: Reset link rejected after 1 hour."""
        raise NotImplementedError("Implement this test")

    def test_ac4_previous_links_invalidated(self):
        """AC-4: Old reset links stop working when new one is requested."""
        raise NotImplementedError("Implement this test")

    def test_ec1_nonexistent_email_same_response(self):
        """EC-1: Non-existent email produces identical response."""
        raise NotImplementedError("Implement this test")

    def test_ec2_reset_link_used_twice(self):
        """EC-2: Already-used reset link shows appropriate message."""
        raise NotImplementedError("Implement this test")
```

---

## Anti-Patterns

### 1. Coding Before Spec Approval

**Symptom:** "I'll start coding while the spec is being reviewed."
**Problem:** The review will surface changes. Now you have code that implements a rejected design.
**Rule:** Implementation does not begin until spec status is "Approved."

### 2. Vague Acceptance Criteria

**Symptom:** "The system should work well" or "The UI should be responsive."
**Problem:** Untestable. What does "well" mean? What does "responsive" mean?
**Rule:** Every acceptance criterion must be verifiable by a machine. If you cannot write a test for it, rewrite the criterion.
|
||||
|
||||
### 3. Missing Edge Cases
|
||||
|
||||
**Symptom:** Happy path is specified, error paths are not.
|
||||
**Problem:** Developers invent error handling on the fly, leading to inconsistent behavior.
|
||||
**Rule:** For every external dependency (API, database, file system, user input), specify at least one failure scenario.
|
||||
|
||||
### 4. Spec as Post-Hoc Documentation
|
||||
|
||||
**Symptom:** "Let me write the spec now that the feature is done."
|
||||
**Problem:** This is documentation, not specification. It describes what was built, not what should have been built. It cannot catch design errors because the design is already frozen.
|
||||
**Rule:** If the spec was written after the code, it is not a spec. Relabel it as documentation.
|
||||
|
||||
### 5. Gold-Plating Beyond Spec
|
||||
|
||||
**Symptom:** "While I was in there, I also added..."
|
||||
**Problem:** Untested code. Unreviewed design. Potential for subtle bugs in the "bonus" feature.
|
||||
**Rule:** If it is not in the spec, it does not get built. File a new spec for additional features.
|
||||
|
||||
### 6. Acceptance Criteria Without Requirement Traceability
|
||||
|
||||
**Symptom:** AC-7 exists but does not reference any FR-* or NFR-*.
|
||||
**Problem:** Orphaned criteria mean either a requirement is missing or the criterion is unnecessary.
|
||||
**Rule:** Every AC-* MUST reference at least one FR-* or NFR-*.
|
||||
|
||||
### 7. Skipping Validation
|
||||
|
||||
**Symptom:** "The spec looks fine, let's just start."
|
||||
**Problem:** Missing sections discovered during implementation cause blocking delays.
|
||||
**Rule:** Always run `spec_validator.py --strict` before starting implementation. Fix all warnings.
|
||||
|
||||
---
|
||||
|
||||
## Cross-References
|
||||
|
||||
- **`engineering-team/tdd-guide`** — Red-green-refactor cycle, test generation, coverage analysis. Use after Phase 4 of this workflow.
|
||||
- **`engineering/focused-fix`** — Deep-dive feature repair. When a spec-driven implementation has systemic issues, use focused-fix for diagnosis.
|
||||
- **`engineering/rag-architect`** — If the feature involves retrieval or knowledge systems, use rag-architect for the technical design within the spec.
|
||||
- **`references/spec_format_guide.md`** — Complete template with section-by-section explanations.
|
||||
- **`references/bounded_autonomy_rules.md`** — Full decision matrix for when to stop vs. continue.
|
||||
- **`references/acceptance_criteria_patterns.md`** — Pattern library for writing Given/When/Then criteria.
|
||||
|
||||
---
|
||||
|
||||
## Tools
|
||||
|
||||
| Script | Purpose | Key Flags |
|
||||
|--------|---------|-----------|
|
||||
| `spec_generator.py` | Generate spec template from feature name/description | `--name`, `--description`, `--format`, `--json` |
|
||||
| `spec_validator.py` | Validate spec completeness (0-100 score) | `--file`, `--strict`, `--json` |
|
||||
| `test_extractor.py` | Extract test stubs from acceptance criteria | `--file`, `--framework`, `--output`, `--json` |
|
||||
|
||||
```bash
|
||||
# Generate a spec template
|
||||
python spec_generator.py --name "User Authentication" --description "OAuth 2.0 login flow"
|
||||
|
||||
# Validate a spec
|
||||
python spec_validator.py --file specs/auth.md --strict
|
||||
|
||||
# Extract test cases
|
||||
python test_extractor.py --file specs/auth.md --framework pytest --output tests/test_auth.py
|
||||
```
|
||||
# Acceptance Criteria Patterns

A pattern library for writing Given/When/Then acceptance criteria across common feature types. Use these as starting points — adapt to your domain.

---

## Pattern Structure

Every acceptance criterion follows this structure:

```
### AC-N: [Descriptive name] (FR-N, NFR-N)
Given [precondition — the system/user is in this state]
When [trigger — the user or system performs this action]
Then [outcome — this observable, testable result occurs]
And [additional outcome — and this also happens]
```

**Rules:**
1. One scenario per AC. Multiple Given/When/Then blocks = multiple ACs.
2. Every AC references at least one FR-* or NFR-*.
3. Outcomes must be observable and testable — no subjective language.
4. Preconditions must be achievable in a test setup.

---

## Authentication Patterns

### Login — Happy Path

```markdown
### AC-1: Successful login with valid credentials (FR-1)
Given a registered user with email "user@example.com" and password "V@lidP4ss!"
When they POST /api/auth/login with email "user@example.com" and password "V@lidP4ss!"
Then the response status is 200
And the response body contains a valid JWT access token
And the response body contains a refresh token
And the access token expires in 24 hours
```

### Login — Invalid Credentials

```markdown
### AC-2: Login rejected with wrong password (FR-1)
Given a registered user with email "user@example.com"
When they POST /api/auth/login with email "user@example.com" and an incorrect password
Then the response status is 401
And the response body contains error code "INVALID_CREDENTIALS"
And no token is issued
And the failed attempt is logged
```

### Login — Account Locked

```markdown
### AC-3: Login rejected for locked account (FR-1, NFR-S2)
Given a user whose account is locked due to 5 consecutive failed login attempts
When they POST /api/auth/login with correct credentials
Then the response status is 403
And the response body contains error code "ACCOUNT_LOCKED"
And the response includes a "retryAfter" field with seconds until unlock
```

### Token Refresh

```markdown
### AC-4: Token refresh with valid refresh token (FR-3)
Given a user with a valid, non-expired refresh token
When they POST /api/auth/refresh with that refresh token
Then the response status is 200
And a new access token is issued
And the old refresh token is invalidated
And a new refresh token is issued (rotation)
```

### Logout

```markdown
### AC-5: Logout invalidates session (FR-4)
Given an authenticated user with a valid access token
When they POST /api/auth/logout with that token
Then the response status is 204
And the access token is no longer accepted for API calls
And the refresh token is invalidated
```

---

## CRUD Patterns

### Create

```markdown
### AC-6: Create resource with valid data (FR-1)
Given an authenticated user with "editor" role
When they POST /api/resources with valid payload {name: "Test", type: "A"}
Then the response status is 201
And the response body contains the created resource with a generated UUID
And the resource's "createdAt" field is set to the current UTC timestamp
And the resource's "createdBy" field matches the authenticated user's ID
```

### Create — Validation Failure

```markdown
### AC-7: Create resource rejected with invalid data (FR-1)
Given an authenticated user
When they POST /api/resources with payload missing required field "name"
Then the response status is 400
And the response body contains error code "VALIDATION_ERROR"
And the response body contains field-level detail: {"name": "Required field"}
And no resource is created in the database
```

### Read — Single Item

```markdown
### AC-8: Read resource by ID (FR-2)
Given an existing resource with ID "abc-123"
When an authenticated user GETs /api/resources/abc-123
Then the response status is 200
And the response body contains the resource with all fields
```

### Read — Not Found

```markdown
### AC-9: Read non-existent resource returns 404 (FR-2)
Given no resource exists with ID "nonexistent-id"
When an authenticated user GETs /api/resources/nonexistent-id
Then the response status is 404
And the response body contains error code "NOT_FOUND"
```

### Update

```markdown
### AC-10: Update resource with valid data (FR-3)
Given an existing resource with ID "abc-123" owned by the authenticated user
When they PATCH /api/resources/abc-123 with {name: "Updated Name"}
Then the response status is 200
And the resource's "name" field is "Updated Name"
And the resource's "updatedAt" field is updated to the current UTC timestamp
And fields not included in the patch are unchanged
```

### Update — Ownership Check

```markdown
### AC-11: Update rejected for non-owner (FR-3, FR-6)
Given an existing resource with ID "abc-123" owned by user "other-user"
When the authenticated user (not "other-user") PATCHes /api/resources/abc-123
Then the response status is 403
And the response body contains error code "FORBIDDEN"
And the resource is unchanged
```

### Delete — Soft Delete

```markdown
### AC-12: Soft delete resource (FR-5)
Given an existing resource with ID "abc-123" owned by the authenticated user
When they DELETE /api/resources/abc-123
Then the response status is 204
And the resource's "deletedAt" field is set to the current UTC timestamp
And the resource no longer appears in GET /api/resources (list endpoint)
And the resource still exists in the database (soft deleted)
```

### List — Pagination

```markdown
### AC-13: List resources with default pagination (FR-4)
Given 50 resources exist for the authenticated user
When they GET /api/resources without pagination parameters
Then the response status is 200
And the response contains the first 20 resources (default page size)
And the response includes "totalCount: 50"
And the response includes "page: 1"
And the response includes "pageSize: 20"
And the response includes "hasNextPage: true"
```

### List — Filtered

```markdown
### AC-14: List resources with type filter (FR-4)
Given 30 resources of type "A" and 20 resources of type "B" exist
When the authenticated user GETs /api/resources?type=A
Then the response status is 200
And all returned resources have type "A"
And the response "totalCount" is 30
```

---

## Search Patterns

### Basic Search

```markdown
### AC-15: Search returns matching results (FR-7)
Given resources with names "Alpha Report", "Beta Analysis", "Alpha Summary" exist
When the user GETs /api/resources?q=Alpha
Then the response contains "Alpha Report" and "Alpha Summary"
And the response does not contain "Beta Analysis"
And results are ordered by relevance score (descending)
```

### Search — Empty Results

```markdown
### AC-16: Search with no matches returns empty list (FR-7)
Given no resources match the query "xyznonexistent"
When the user GETs /api/resources?q=xyznonexistent
Then the response status is 200
And the response contains an empty "items" array
And "totalCount" is 0
```

### Search — Special Characters

```markdown
### AC-17: Search handles special characters safely (FR-7, NFR-S1)
Given resources exist in the database
When the user GETs /api/resources?q="; DROP TABLE resources;--
Then the response status is 200
And no SQL injection occurs
And the search treats the input as a literal string
```

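AC-17 is usually satisfied by binding the query as a parameter rather than interpolating it into the SQL string. A minimal sketch using Python's `sqlite3` (table and column names are illustrative, not part of the pattern):

```python
import sqlite3


def search_resources(conn, query):
    # Placeholder binding treats the input as a literal string, so
    # '"; DROP TABLE resources;--' cannot terminate the statement.
    cur = conn.execute(
        "SELECT name FROM resources WHERE name LIKE ?",
        (f"%{query}%",),
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (name TEXT)")
conn.executemany(
    "INSERT INTO resources VALUES (?)",
    [("Alpha Report",), ("Beta Analysis",)],
)

# Hostile input comes back as zero literal matches; the table survives.
assert search_resources(conn, '"; DROP TABLE resources;--') == []
assert search_resources(conn, "Alpha") == ["Alpha Report"]
```

The same principle applies to any driver or ORM: the query text and the user input must travel to the database as separate values.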
---

## File Upload Patterns

### Upload — Happy Path

```markdown
### AC-18: Upload file within size limit (FR-8)
Given an authenticated user
When they POST /api/files with a 5MB PNG file
Then the response status is 201
And the response contains the file's URL, size, and MIME type
And the file is stored in the configured storage backend
And the file is associated with the authenticated user
```

### Upload — Size Exceeded

```markdown
### AC-19: Upload rejected for oversized file (FR-8)
Given the maximum file size is 10MB
When the user POSTs /api/files with a 15MB file
Then the response status is 413
And the response contains error code "FILE_TOO_LARGE"
And no file is stored
```

### Upload — Invalid Type

```markdown
### AC-20: Upload rejected for disallowed file type (FR-8, NFR-S3)
Given allowed file types are PNG, JPG, PDF
When the user POSTs /api/files with an .exe file
Then the response status is 415
And the response contains error code "UNSUPPORTED_MEDIA_TYPE"
And no file is stored
```

---

## Payment Patterns

### Charge — Happy Path

```markdown
### AC-21: Successful payment charge (FR-10)
Given a user with a valid payment method on file
When they POST /api/payments with amount 49.99 and currency "USD"
Then the payment gateway is charged $49.99
And the response status is 201
And the response contains a transaction ID
And a payment record is created with status "completed"
And a receipt email is sent to the user
```

### Charge — Declined

```markdown
### AC-22: Payment declined by gateway (FR-10)
Given a user with an expired credit card on file
When they POST /api/payments with amount 49.99
Then the payment gateway returns a decline
And the response status is 402
And the response contains error code "PAYMENT_DECLINED"
And no payment record is created with status "completed"
And the user is prompted to update their payment method
```

### Charge — Idempotency

```markdown
### AC-23: Duplicate payment request is idempotent (FR-10, NFR-R1)
Given a payment was successfully processed with idempotency key "key-123"
When the same request is sent again with idempotency key "key-123"
Then the response status is 200
And the response contains the original transaction ID
And the user is NOT charged a second time
```

---

## Notification Patterns

### Email Notification

```markdown
### AC-24: Email notification sent on event (FR-11)
Given a user with notification preferences set to "email"
When their order status changes to "shipped"
Then an email is sent to their registered email address
And the email subject contains the order number
And the email body contains the tracking URL
And a notification record is created with status "sent"
```

### Notification — Delivery Failure

```markdown
### AC-25: Failed notification is retried (FR-11, NFR-R2)
Given the email service returns a 5xx error on first attempt
When a notification is triggered
Then the system retries up to 3 times with exponential backoff (1s, 4s, 16s)
And if all retries fail, the notification status is set to "failed"
And an alert is sent to the ops channel
```

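The retry schedule in AC-25 can be sketched as a small loop with an injectable sleep function, so the test for it runs instantly. The send and alert callables are illustrative stand-ins for the email service and ops channel:

```python
BACKOFF_SECONDS = [1, 4, 16]


def send_with_retry(send, alert_ops, sleep):
    """One initial attempt plus up to 3 retries; returns final status."""
    attempts = 1 + len(BACKOFF_SECONDS)
    for attempt in range(attempts):
        try:
            send()
            return "sent"
        except Exception:
            if attempt < len(BACKOFF_SECONDS):
                sleep(BACKOFF_SECONDS[attempt])
    alert_ops("notification permanently failed")
    return "failed"


def always_5xx():
    raise RuntimeError("503 from email service")


# Simulate an email service that always fails
sleeps, alerts = [], []
status = send_with_retry(always_5xx, alerts.append, sleeps.append)
assert status == "failed"
assert sleeps == [1, 4, 16]    # the exponential backoff schedule
assert alerts                  # ops channel was alerted
```

Injecting `sleep` rather than calling `time.sleep` directly is what makes the AC testable without a 21-second test run.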
---

## Negative Test Patterns

### Unauthorized Access

```markdown
### AC-26: Unauthenticated request rejected (NFR-S1)
Given no authentication token is provided
When the user GETs /api/resources
Then the response status is 401
And the response contains error code "AUTHENTICATION_REQUIRED"
And no resource data is returned
```

### Invalid Input — Type Mismatch

```markdown
### AC-27: String provided for numeric field (FR-1)
Given the "quantity" field expects an integer
When the user POSTs with quantity: "abc"
Then the response status is 400
And the response body contains field error: {"quantity": "Must be an integer"}
```

### Rate Limiting

```markdown
### AC-28: Rate limit enforced (NFR-S2)
Given the rate limit is 100 requests per minute per API key
When the user sends the 101st request within 60 seconds
Then the response status is 429
And the response includes header "Retry-After" with seconds until reset
And the response contains error code "RATE_LIMITED"
```

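One common way to implement AC-28 is a fixed-window counter per API key. This in-memory sketch shows the logic only; a production limiter would typically live in the API gateway or a shared store such as Redis:

```python
WINDOW = 60      # seconds
LIMIT = 100      # requests per window per key


class RateLimiter:
    def __init__(self):
        self.counts = {}   # (api_key, window index) -> request count

    def check(self, api_key, now):
        window = int(now // WINDOW)
        bucket = (api_key, window)
        self.counts[bucket] = self.counts.get(bucket, 0) + 1
        if self.counts[bucket] > LIMIT:
            retry_after = WINDOW - int(now % WINDOW)
            return 429, {"Retry-After": retry_after}
        return 200, {}


rl = RateLimiter()
statuses = [rl.check("key-1", now=10.0)[0] for _ in range(101)]
assert statuses[:100] == [200] * 100
assert statuses[100] == 429                 # the 101st request
status, headers = rl.check("key-1", now=10.0)
assert headers["Retry-After"] == 50         # seconds until window reset
```

Fixed windows allow a brief burst at the window boundary; a sliding-window or token-bucket variant trades a little complexity for smoother enforcement.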
### Concurrent Modification

```markdown
### AC-29: Optimistic locking prevents lost updates (NFR-R1)
Given a resource with version 5
When user A PATCHes with version 5 and user B PATCHes with version 5 simultaneously
Then one succeeds with status 200 (version becomes 6)
And the other receives status 409 with error code "CONFLICT"
And the 409 response includes the current version number
```

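AC-29 is commonly implemented with a version-guarded UPDATE (compare-and-swap): the statement only matches when the row still has the version the client read. A sketch using `sqlite3`, where the cursor's rowcount reveals whether the guard matched (schema is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resources (id TEXT PRIMARY KEY, name TEXT, version INTEGER)"
)
conn.execute("INSERT INTO resources VALUES ('abc-123', 'Old', 5)")


def patch(conn, rid, name, expected_version):
    cur = conn.execute(
        "UPDATE resources SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (name, rid, expected_version),
    )
    if cur.rowcount == 1:
        # Guard matched, so the new version is exactly expected + 1.
        return 200, expected_version + 1
    current = conn.execute(
        "SELECT version FROM resources WHERE id = ?", (rid,)
    ).fetchone()[0]
    return 409, current   # CONFLICT: report the current version


# Both clients read version 5; only the first write wins.
assert patch(conn, "abc-123", "From A", 5) == (200, 6)
assert patch(conn, "abc-123", "From B", 5) == (409, 6)
```

Because the check and the write happen in a single statement, no explicit lock is needed; the database's row-level atomicity does the work.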
---

## Performance Criteria Patterns

### Response Time

```markdown
### AC-30: API response time under load (NFR-P1)
Given the system is handling 1,000 concurrent users
When a user GETs /api/dashboard
Then the response is returned in < 500ms (p95)
And the response is returned in < 1000ms (p99)
```

### Throughput

```markdown
### AC-31: System handles target throughput (NFR-P2)
Given normal production traffic patterns
When the system receives 5,000 requests per second
Then all requests are processed without queue overflow
And error rate remains below 0.1%
```

### Resource Usage

```markdown
### AC-32: Memory usage within bounds (NFR-P3)
Given the service is processing normal traffic
When measured over a 24-hour period
Then memory usage does not exceed 512MB RSS
And no memory leaks are detected (RSS growth < 5% over 24h)
```

---

## Accessibility Criteria Patterns

### Keyboard Navigation

```markdown
### AC-33: Form is fully keyboard navigable (NFR-A1)
Given the user is on the login page using only a keyboard
When they press Tab
Then focus moves through: email field -> password field -> submit button
And each focused element has a visible focus indicator
And pressing Enter on the submit button submits the form
```

### Screen Reader

```markdown
### AC-34: Error messages announced to screen readers (NFR-A2)
Given the user submits the form with invalid data
When validation errors appear
Then each error is associated with its form field via aria-describedby
And the error container has role="alert" for immediate announcement
And the first error field receives focus
```

### Color Contrast

```markdown
### AC-35: Text meets contrast requirements (NFR-A3)
Given the default theme is active
When measuring text against background colors
Then all body text meets 4.5:1 contrast ratio (WCAG AA)
And all large text (18pt+ or 14pt+ bold) meets 3:1 contrast ratio
And all interactive element states (hover, focus, active) meet 3:1
```

### Reduced Motion

```markdown
### AC-36: Animations respect user preference (NFR-A4)
Given the user has enabled "prefers-reduced-motion" in their OS settings
When they load any page with animations
Then all non-essential animations are disabled
And essential animations (e.g., loading spinner) use a reduced version
And no content is hidden behind animation-only interactions
```

---

## Writing Tips

### Do

- Start Given with the system/user state, not the action
- Make When a single, specific trigger
- Make Then observable — status codes, field values, side effects
- Include And for additional assertions on the same outcome
- Reference requirement IDs in the AC title

### Do Not

- Write "Then the system works correctly" (not testable)
- Combine multiple scenarios in one AC
- Use subjective words: "quickly", "properly", "nicely", "user-friendly"
- Skip the precondition — Given is required even if it seems obvious
- Write Given/When/Then as prose paragraphs — use the structured format

### Smell Tests

If your AC has any of these, rewrite it:

| Smell | Example | Fix |
|-------|---------|-----|
| No Given clause | "When user clicks, then page loads" | Add "Given user is on the dashboard" |
| Vague Then | "Then it works" | Specify status code, body, side effects |
| Multiple Whens | "When user clicks A and then clicks B" | Split into two ACs |
| Implementation detail | "Then the Redux store is updated" | Focus on user-observable outcome |
| No requirement reference | "AC-5: Dashboard loads" | "AC-5: Dashboard loads (FR-7)" |
# Bounded Autonomy Rules

Decision framework for when an agent (human or AI) should stop and ask vs. continue working autonomously during spec-driven development.

---

## The Core Principle

**Autonomy is earned by clarity.** The clearer the spec, the more autonomy the implementer has. The more ambiguous the spec, the more the implementer must stop and ask.

This is not about trust. It is about risk. A clear spec means low risk of building the wrong thing. An ambiguous spec means high risk.

---

## Decision Matrix

| Signal | Action | Rationale |
|--------|--------|-----------|
| Spec is Approved, requirement is clear, tests exist | **Continue** | Low risk. Build it. |
| Requirement is clear but no test exists yet | **Continue** (write the test first) | You can infer the test from the requirement. |
| Requirement uses SHOULD/MAY keywords | **Continue** with your best judgment | These are intentionally flexible. Document your choice. |
| Requirement is ambiguous (multiple valid interpretations) | **STOP** if ambiguity > 30% of the task | Ask the spec author to clarify. |
| Implementation requires changing an API contract | **STOP** always | Breaking changes need explicit approval. |
| Implementation requires a new database migration | **STOP** if it changes existing columns/tables | New tables are lower risk than schema changes. |
| Security-related change (auth, crypto, PII) | **STOP** always | Security changes need review regardless of spec clarity. |
| Performance-critical path with no benchmark data | **STOP** | You cannot prove NFR compliance without measurement. |
| Bug found in existing code unrelated to spec | **STOP** — file a separate issue | Do not fix unrelated bugs in a spec-scoped implementation. |
| Spec says "N/A" for a section you think needs content | **STOP** | The author may have a reason, or they may have missed it. |

---

## Ambiguity Scoring

When you encounter ambiguity, quantify it before deciding to stop or continue.

### How to Score Ambiguity

For each requirement you are implementing, ask:

1. **Can I write a test for this right now?** (No = +20% ambiguity)
2. **Are there multiple valid interpretations?** (Yes = +20% ambiguity)
3. **Does the spec contradict itself?** (Yes = +30% ambiguity)
4. **Am I making assumptions about user behavior?** (Yes = +15% ambiguity)
5. **Does this depend on an undocumented external system?** (Yes = +15% ambiguity)

### Threshold

| Ambiguity Score | Action |
|-----------------|--------|
| 0-15% | Continue. Minor ambiguity is normal. Document your interpretation. |
| 16-30% | Continue with caution. Add a comment explaining your interpretation. Flag in PR. |
| 31-50% | STOP. Ask the spec author one specific question. Do not continue until answered. |
| 51%+ | STOP. The spec is incomplete. Request a revision before proceeding. |

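The five questions and the threshold table can be folded into a small helper. The weights come straight from the checklist above; the key names and action labels are illustrative:

```python
WEIGHTS = {
    "cannot_write_test_now": 20,
    "multiple_interpretations": 20,
    "spec_contradicts_itself": 30,
    "assuming_user_behavior": 15,
    "undocumented_external_system": 15,
}


def ambiguity_action(answers):
    """Map yes/no answers to the checklist onto a score and an action."""
    score = sum(WEIGHTS[k] for k, yes in answers.items() if yes)
    if score <= 15:
        action = "continue"
    elif score <= 30:
        action = "continue-with-caution"
    elif score <= 50:
        action = "stop-ask-one-question"
    else:
        action = "stop-request-revision"
    return score, action


# The FR-7 notification scenario from the original example: 70% -> stop.
score, action = ambiguity_action({
    "cannot_write_test_now": True,
    "multiple_interpretations": True,
    "spec_contradicts_itself": False,
    "assuming_user_behavior": True,
    "undocumented_external_system": True,
})
assert (score, action) == (70, "stop-request-revision")
```
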
### Example

**Requirement:** "FR-7: The system MUST notify the user when their order ships."

Questions:
1. Can I write a test? Partially — I know WHAT to test but not HOW (email? push? in-app?). +20%
2. Multiple interpretations? Yes — notification channel is unclear. +20%
3. Contradicts itself? No. +0%
4. Assuming user behavior? Yes — I am assuming they want email. +15%
5. Undocumented external system? Maybe — depends on notification service. +15%

**Total: 70%.** STOP. The spec needs to specify the notification channel.

---

## Scope Creep Detection

### What Is Scope Creep?

Scope creep is implementing functionality not described in the spec. It includes:

- Adding features the spec does not mention
- "Improving" behavior beyond what acceptance criteria require
- Handling edge cases the spec explicitly excluded
- Refactoring unrelated code "while you're in there"
- Building infrastructure for future features

### Detection Patterns

| Pattern | Example | Risk |
|---------|---------|------|
| "While I'm here..." | Refactoring a utility function unrelated to the spec | Medium — unreviewed changes |
| "This would be easy to add..." | Adding a search filter the spec does not mention | High — untested, unspecified |
| "Users will probably want..." | Building a feature based on assumption | High — may conflict with future specs |
| "This is obviously needed..." | Adding logging, metrics, or caching not in NFRs | Medium — may be overkill or wrong approach |
| "The spec forgot to mention..." | Building something the spec excluded | Critical — may be deliberately excluded |

### Response Protocol

When you detect scope creep in your own work:

1. **Stop immediately.** Do not commit the extra code.
2. **Check Out of Scope.** Is this item explicitly excluded?
3. **If excluded:** Delete the code. The spec author had a reason.
4. **If not mentioned:** File a note for the spec author. Ask if it should be added.
5. **If approved:** Update the spec FIRST, then implement.

---

## Breaking Change Identification

### What Counts as a Breaking Change?

A breaking change is any modification that could cause existing clients, tests, or integrations to fail.

| Category | Breaking | Not Breaking |
|----------|----------|--------------|
| API endpoint removed | Yes | - |
| API endpoint added | - | No |
| Required field added to request | Yes | - |
| Optional field added to request | - | No |
| Field removed from response | Yes | - |
| Field added to response | - | No (usually) |
| Status code changed | Yes | - |
| Error code string changed | Yes | - |
| Database column removed | Yes | - |
| Database column added (nullable) | - | No |
| Database column added (not null, no default) | Yes | - |
| Enum value removed | Yes | - |
| Enum value added | - | No (usually) |
| Behavior change for existing input | Yes | - |

### Breaking Change Protocol

1. **Identify** the breaking change before implementing it.
2. **Escalate** immediately — do not implement without approval.
3. **Propose** a migration path (versioned API, feature flag, deprecation period).
4. **Document** the breaking change in the spec's changelog.

---

## Security Implication Checklist

Any change touching the following areas MUST be escalated, even if the spec seems clear.

### Always Escalate

- [ ] Authentication logic (login, logout, token generation)
- [ ] Authorization logic (role checks, permission gates)
- [ ] Encryption/hashing (algorithm choice, key management)
- [ ] PII handling (storage, transmission, logging)
- [ ] Input validation bypass (new endpoints, parameter changes)
- [ ] Rate limiting changes (thresholds, scope)
- [ ] CORS or CSP policy changes
- [ ] File upload handling
- [ ] SQL/NoSQL query construction (injection risk)
- [ ] Deserialization of user input
- [ ] Redirect URLs from user input (open redirect risk)
- [ ] Secrets in code, config, or logs

### Security Escalation Template

```markdown
## Security Escalation: [Title]

**Affected area:** [authentication/authorization/encryption/PII/etc.]
**Spec reference:** [FR-N or NFR-SN]
**Risk:** [What could go wrong if implemented incorrectly]
**Current protection:** [What exists today]
**Proposed change:** [What the spec requires]
**My concern:** [Specific security question]
**Recommendation:** [Proposed approach with security rationale]
```

---

## Escalation Templates

### Template 1: Ambiguous Requirement

```markdown
## Escalation: Ambiguous Requirement

**Blocked on:** FR-7 ("notify the user when their order ships")
**Ambiguity score:** 70%
**Question:** What notification channel should be used?
**Options considered:**
A. Email only — Pros: simple, reliable. Cons: not real-time.
B. Email + in-app notification — Pros: covers both async and real-time. Cons: more implementation effort.
C. Configurable per user — Pros: maximum flexibility. Cons: requires preference UI (not in spec).
**My recommendation:** B (email + in-app). Covers most use cases without requiring new UI.
**Impact of waiting:** Cannot implement FR-7 until resolved. No other work blocked.
```

### Template 2: Missing Edge Case

```markdown
## Escalation: Missing Edge Case

**Related to:** FR-3 (password reset link expires after 1 hour)
**Scenario:** User clicks a reset link, but their account was deleted between requesting and clicking.
**Not in spec:** Edge cases section does not cover this.
**Options considered:**
A. Show generic "link invalid" error — Pros: secure (no info leak). Cons: confusing for deleted user.
B. Show "account not found" error — Pros: clear. Cons: confirms account deletion to link holder.
**My recommendation:** A. Security over clarity — do not reveal account existence.
**Impact of waiting:** Can implement other ACs; this is blocking only AC-2 completion.
```

### Template 3: Potential Breaking Change

```markdown
## Escalation: Potential Breaking Change

**Spec requires:** Adding required field "role" to POST /api/users request (FR-6)
**Current behavior:** POST /api/users accepts {email, password, displayName}
**Breaking:** Yes — existing clients will get 400 errors (missing required field)
**Options considered:**
A. Make "role" required as spec says — Pros: matches spec. Cons: breaks mobile app v2.1.
B. Make "role" optional with default "user" — Pros: backward compatible. Cons: deviates from spec.
C. Version the API (v2) — Pros: clean separation. Cons: maintenance burden.
**My recommendation:** B. Default to "user" for backward compatibility. Update spec to reflect MAY instead of MUST.
**Impact of waiting:** Frontend team is building against the new contract. Need answer within 2 days.
```

### Template 4: Scope Creep Proposal

```markdown
## Escalation: Potential Addition to Spec

**Context:** While implementing FR-2 (password validation), I noticed the spec does not mention password strength feedback.
**Not in spec:** No requirement for showing strength indicators.
**Checked Out of Scope:** Not listed there either.
**Proposal:** Add FR-7: "The system SHOULD display password strength feedback during registration."
**Effort:** ~2 hours additional implementation.
**Question:** Should this be added to current spec, filed as a separate spec, or skipped?
**Impact of waiting:** FR-2 implementation is not blocked. This is an enhancement question only.
```

---

## Quick Reference Card

```
CONTINUE if:
|
||||
- Spec is approved
|
||||
- Requirement uses MUST and is unambiguous
|
||||
- Tests can be written directly from the AC
|
||||
- Changes are additive and non-breaking
|
||||
- You are refactoring internals only (no behavior change)
|
||||
|
||||
STOP if:
|
||||
- Ambiguity > 30%
|
||||
- Any breaking change
|
||||
- Any security-related change
|
||||
- Spec says N/A but you think it shouldn't
|
||||
- You are about to build something not in the spec
|
||||
- You cannot write a test for the requirement
|
||||
- External dependency is undocumented
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns in Autonomy
|
||||
|
||||
### 1. "I'll Ask Later"
|
||||
Continuing past an ambiguity checkpoint because asking feels slow. The rework from building the wrong thing is always slower.
|
||||
|
||||
### 2. "It's Obviously Needed"
|
||||
Assuming a missing feature was accidentally omitted. It may have been deliberately excluded. Check Out of Scope first.
|
||||
|
||||
### 3. "The Spec Is Wrong"
|
||||
Implementing what you think the spec SHOULD say instead of what it DOES say. If the spec is wrong, escalate. Do not silently "fix" it.
|
||||
|
||||
### 4. "Just This Once"
|
||||
Bypassing the escalation protocol for a "small" change. Small changes compound. The protocol exists because humans are bad at judging risk in the moment.
|
||||
|
||||
### 5. "I Already Built It"
|
||||
Presenting completed work that was never in the spec and hoping it gets accepted. This creates review pressure and wastes everyone's time if rejected. Ask BEFORE building.
|
||||
423
engineering/spec-driven-workflow/references/spec_format_guide.md
Normal file
@@ -0,0 +1,423 @@
# Spec Format Guide

Complete reference for writing feature specifications. Every section is explained with examples, rationale, and common mistakes.

---

## The Spec Document Structure

A spec has 9 mandatory sections. If a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not skipped.

```
1. Title and Metadata
2. Context
3. Functional Requirements
4. Non-Functional Requirements
5. Acceptance Criteria
6. Edge Cases and Error Scenarios
7. API Contracts
8. Data Models
9. Out of Scope
```

---

## Section 1: Title and Metadata

```markdown
# Spec: [Feature Name]

**Author:** Jane Doe
**Date:** 2026-03-25
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** John Smith, Alice Chen
**Related specs:** SPEC-018 (User Registration), SPEC-023 (Session Management)
```

### Status Lifecycle

| Status | Meaning | Who Can Change |
|--------|---------|----------------|
| Draft | Author is still writing. Not ready for review. | Author |
| In Review | Ready for feedback. Implementation blocked. | Author |
| Approved | Reviewed and accepted. Implementation may begin. | Reviewer |
| Superseded | Replaced by a newer spec. Link to replacement. | Author |

**Rule:** Implementation MUST NOT begin until status is "Approved."

---

## Section 2: Context

The context section answers: **Why does this feature exist?**

### What to Include

- The problem being solved (with evidence: support tickets, metrics, user research)
- The current state (what exists today and what is broken or missing)
- The business justification (revenue impact, cost savings, user retention)
- Constraints or dependencies (regulatory, technical, timeline)

### What to Exclude

- Implementation details (that is the engineer's job)
- Solution proposals (the spec says WHAT, not HOW)
- Lengthy background (2-4 paragraphs maximum)

### Good Example

```markdown
## Context

Users who forget their passwords currently have no self-service recovery.
Support handles ~200 password reset requests per week, consuming approximately
8 hours of agent time at $45/hour ($360/week, $18,720/year). Additionally,
12% of users who contact support for a reset never return.

This feature provides self-service password reset via email, eliminating
support burden and reducing user churn from the reset flow.
```

### Bad Example

```markdown
## Context

We need a password reset feature. Users forget their passwords sometimes
and need to reset them. We should build this.
```

**Why it is bad:** No evidence, no metrics, no business justification. "We should build this" is not a reason.

---

## Section 3: Functional Requirements — RFC 2119

### RFC 2119 Keywords

These keywords have precise meanings per [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). Do not use them casually.

| Keyword | Meaning | Testing Implication |
|---------|---------|---------------------|
| **MUST** | Absolute requirement. The implementation is non-conformant without this. | Must have a passing test. Failure = release blocker. |
| **MUST NOT** | Absolute prohibition. Doing this = broken implementation. | Must have a test proving this cannot happen. |
| **SHOULD** | Strongly recommended. Can be omitted only with documented justification. | Should have a test. Omission requires written rationale. |
| **SHOULD NOT** | Strongly discouraged. Can be done only with documented justification. | Should have a test confirming the behavior does not occur. |
| **MAY** | Truly optional. Implementer's discretion. | Test is optional. Document if implemented. |

### Writing Good Requirements

**Each requirement MUST be:**

1. **Atomic** — One behavior per requirement. Not "The system MUST authenticate users and log them in."
2. **Testable** — You can write a test that proves it works or does not.
3. **Numbered** — Sequential FR-N format for traceability.
4. **Specific** — No ambiguous adjectives ("fast", "secure", "user-friendly").

### Good Requirements

```markdown
- FR-1: The system MUST accept login via email and password.
- FR-2: The system MUST reject passwords shorter than 8 characters.
- FR-3: The system MUST return a JWT access token on successful login.
- FR-4: The system MUST NOT include the password hash in any API response.
- FR-5: The system SHOULD support "remember me" with a 30-day refresh token.
- FR-6: The system MAY display last login time on the dashboard.
```

### Bad Requirements

```markdown
- FR-1: The login system must be fast and secure.
  (Untestable: what is "fast"? What is "secure"?)

- FR-2: The system must handle all edge cases.
  (Vague: which edge cases? This delegates the spec to the implementer.)

- FR-3: Users should be able to log in easily.
  (Subjective: "easily" is not measurable.)
```
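
These rules are mechanical enough to lint automatically. A minimal sketch in Python (the keyword pattern follows RFC 2119; the `lint_requirement` name and the ambiguous-adjective list are illustrative assumptions, not an exhaustive rule set):

```python
import re

# Uppercase RFC 2119 keywords; longer phrases first so "MUST NOT" wins over "MUST".
RFC2119 = re.compile(r"\b(MUST NOT|MUST|SHOULD NOT|SHOULD|MAY)\b")
# Illustrative (assumed) list of adjectives that make a requirement untestable.
AMBIGUOUS = re.compile(r"\b(fast|secure|easy|easily|user-friendly|robust)\b", re.IGNORECASE)


def lint_requirement(line: str) -> list:
    """Return a list of problems found in a single FR line."""
    problems = []
    if not re.match(r"^- FR-\d+:", line):
        problems.append("missing FR-N numbering")
    if not RFC2119.search(line):  # case-sensitive: lowercase "must" does not count
        problems.append("no uppercase RFC 2119 keyword")
    if AMBIGUOUS.search(line):
        problems.append("ambiguous adjective (not testable)")
    return problems


print(lint_requirement("- FR-1: The system MUST accept login via email and password."))  # []
print(lint_requirement("- FR-1: The login system must be fast and secure."))
```

Running the bad requirement through the linter flags both the missing uppercase keyword and the ambiguous adjectives, which is exactly the review feedback a human would give.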

---

## Section 4: Non-Functional Requirements

Non-functional requirements define quality attributes. Every requirement needs a **measurable threshold**.

### Categories

#### Performance
```markdown
- NFR-P1: Login API MUST respond in < 500ms (p95) under 1,000 concurrent users.
- NFR-P2: Dashboard page MUST achieve Largest Contentful Paint < 2.5s.
- NFR-P3: Search results MUST return within 200ms for queries under 100 characters.
```

**Bad:** "The system should be fast." (Not measurable.)

#### Security
```markdown
- NFR-S1: All API endpoints MUST require authentication except /health and /login.
- NFR-S2: Failed login attempts MUST be rate-limited to 5 per minute per IP.
- NFR-S3: Passwords MUST be hashed with bcrypt (cost factor >= 12).
- NFR-S4: Session tokens MUST be invalidated on password change.
```

#### Accessibility
```markdown
- NFR-A1: All form inputs MUST have associated labels (WCAG 1.3.1).
- NFR-A2: Color contrast MUST meet 4.5:1 ratio (WCAG 1.4.3).
- NFR-A3: All interactive elements MUST be keyboard-navigable (WCAG 2.1.1).
```

#### Scalability
```markdown
- NFR-SC1: The system SHOULD handle 50,000 registered users.
- NFR-SC2: Database queries MUST use indexes; no full table scans on tables > 10K rows.
```

#### Reliability
```markdown
- NFR-R1: The authentication service MUST maintain 99.9% uptime (< 8.77h downtime/year).
- NFR-R2: Data MUST NOT be lost on service restart (durable storage required).
```

---

## Section 5: Acceptance Criteria — Given/When/Then

Acceptance criteria are the contract between the spec author and the implementer. They define "done."

### The Given/When/Then Pattern

```
Given [precondition — the world is in this state]
When [action — the user or system does this]
Then [outcome — this observable result occurs]
And [additional outcome — and also this]
```

### Rules for Acceptance Criteria

1. **Every AC MUST reference at least one FR-* or NFR-*.** Orphaned criteria indicate missing requirements.
2. **Every AC MUST be testable by a machine.** If you cannot write an automated test, rewrite the criterion.
3. **No subjective language.** Not "should look good" but "MUST render within the design-system grid."
4. **One scenario per AC.** If you have multiple Given/When/Then blocks, split into separate ACs.

### Example: Authentication Feature

```markdown
### AC-1: Successful login (FR-1, FR-3)
Given a registered user with email "user@example.com" and password "P@ssw0rd123"
When they POST /api/auth/login with those credentials
Then they receive a 200 response with a valid JWT token
And the token expires in 24 hours
And the response includes the user's display name

### AC-2: Invalid password (FR-1)
Given a registered user with email "user@example.com"
When they POST /api/auth/login with an incorrect password
Then they receive a 401 response
And the response body contains error "INVALID_CREDENTIALS"
And no token is issued

### AC-3: Short password rejected on registration (FR-2)
Given a new user attempting to register
When they submit a password with 7 characters
Then they receive a 400 response
And the response body contains error "PASSWORD_TOO_SHORT"
And the account is not created
```
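
Because each AC names concrete statuses and fields, it maps one-to-one onto an automated test. A sketch for AC-1 and AC-2 in Python, using a hard-coded stand-in for the login endpoint (`fake_login` is an assumption for illustration; a real suite would call the service through an HTTP test client):

```python
def fake_login(email: str, password: str) -> tuple:
    """Stand-in for POST /api/auth/login, hard-coded to the spec's example user."""
    if email == "user@example.com" and password == "P@ssw0rd123":
        return 200, {"token": "jwt-placeholder", "displayName": "Example User"}
    return 401, {"error": "INVALID_CREDENTIALS"}


def test_ac1_successful_login():
    # Given a registered user / When they POST valid credentials
    status, body = fake_login("user@example.com", "P@ssw0rd123")
    assert status == 200
    assert body["token"]              # Then a token is issued (FR-3)
    assert "displayName" in body      # And the response includes the display name


def test_ac2_invalid_password():
    # Given a registered user / When they POST an incorrect password
    status, body = fake_login("user@example.com", "wrong-password")
    assert status == 401
    assert body["error"] == "INVALID_CREDENTIALS"
    assert "token" not in body        # And no token is issued


test_ac1_successful_login()
test_ac2_invalid_password()
print("AC-1 and AC-2 pass")
```

Note how every assertion traces back to a Then/And clause; if a clause cannot be asserted, the criterion needs rewriting.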

### Common Mistakes

| Mistake | Example | Fix |
|---------|---------|-----|
| Vague outcome | "Then the system works correctly" | "Then the response status is 200 and body contains {field: value}" |
| Missing precondition | "When user logs in, then token is issued" | "Given a registered user, when they POST valid credentials, then..." |
| Multiple scenarios | AC with 3 different When clauses | Split into 3 separate ACs |
| No FR reference | "AC-5: User sees dashboard" | "AC-5: User sees dashboard (FR-7)" |

---

## Section 6: Edge Cases and Error Scenarios

### What Counts as an Edge Case

- Invalid or malformed input
- External service failures (API down, timeout, rate-limited)
- Concurrent operations (race conditions)
- Boundary values (empty string, max length, zero, negative numbers)
- State conflicts (already exists, already deleted, expired)

### Format

```markdown
- EC-1: Empty email field → Return 400 with error "EMAIL_REQUIRED". Do not call the auth service.
- EC-2: Email exceeds 255 characters → Return 400 with error "EMAIL_TOO_LONG".
- EC-3: OAuth provider returns 503 → Return 503 with "Service temporarily unavailable". Retry after 30s.
- EC-4: Two users register the same email simultaneously → First succeeds, second gets 409 Conflict.
- EC-5: User clicks reset link after the password was already changed → Show "Link already used."
```

### Coverage Rule

For every external dependency, specify at least one failure scenario:

- Database: connection lost, timeout, constraint violation
- API: 4xx, 5xx, timeout, invalid response
- File system: file not found, permission denied, disk full
- User input: empty, too long, wrong type, injection attempt
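
The coverage rule can be turned into a checklist generator that emits one EC prompt per dependency failure. A small sketch (the failure lists mirror the bullets above; the `edge_case_prompts` helper is a hypothetical name):

```python
# Failure modes per dependency class, copied from the coverage rule above.
FAILURES = {
    "database": ["connection lost", "timeout", "constraint violation"],
    "api": ["4xx", "5xx", "timeout", "invalid response"],
    "file system": ["file not found", "permission denied", "disk full"],
    "user input": ["empty", "too long", "wrong type", "injection attempt"],
}


def edge_case_prompts(dependencies: list) -> list:
    """Return one EC-prompt line per dependency failure for the spec author to fill in."""
    prompts = []
    for dep in dependencies:
        # Unknown dependency classes still get a generic "unavailable" prompt.
        for failure in FAILURES.get(dep.lower(), ["unavailable"]):
            prompts.append(f"EC-?: {dep} {failure} -> [expected behavior]")
    return prompts


for line in edge_case_prompts(["database", "api"]):
    print(line)
```

Feeding the spec's dependency list through this produces the minimum edge-case section a reviewer should expect to see.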

---

## Section 7: API Contracts

### Notation

Use TypeScript-style interfaces. They are readable by both frontend and backend engineers.

```typescript
interface CreateUserRequest {
  email: string;            // MUST be a valid email, max 255 chars
  password: string;         // MUST be 8-128 chars
  displayName: string;      // MUST be 1-100 chars, no HTML
  role?: "user" | "admin";  // Default: "user"
}
```

### What to Define

For each endpoint:

1. **HTTP method and path** (e.g., POST /api/users)
2. **Request body** (fields, types, constraints, defaults)
3. **Success response** (status code, body shape)
4. **Error responses** (each error code with its status and body)
5. **Headers** (Authorization, Content-Type, custom headers)

### Error Response Convention

```typescript
interface ApiError {
  error: string;                     // Machine-readable code: "INVALID_CREDENTIALS"
  message: string;                   // Human-readable: "The email or password is incorrect."
  details?: Record<string, string>;  // Field-level errors for validation
}
```

Always include:

- 400 for validation errors
- 401 for authentication failures
- 403 for authorization failures
- 404 for not found
- 409 for conflicts
- 429 for rate limiting
- 500 for unexpected errors (keep it generic — do not leak internals)
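
On the server side, the convention is easiest to keep consistent when every handler builds errors through one helper. A minimal Python sketch (the `api_error` helper is an assumption; only the field names come from the `ApiError` interface above):

```python
from typing import Optional


def api_error(code: str, message: str, details: Optional[dict] = None) -> dict:
    """Build an error body matching the ApiError convention."""
    body = {
        "error": code,        # machine-readable, e.g. "INVALID_CREDENTIALS"
        "message": message,   # human-readable explanation
    }
    if details:
        body["details"] = details  # optional field-level validation errors
    return body


print(api_error("INVALID_CREDENTIALS", "The email or password is incorrect."))
print(api_error("VALIDATION_FAILED", "Request validation failed.",
                {"email": "EMAIL_REQUIRED"}))
```

Centralizing the shape this way makes the "keep 500s generic" rule a one-line policy rather than something each handler remembers on its own.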

---

## Section 8: Data Models

### Table Format

```markdown
### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | PK, auto-generated, immutable |
| email | varchar(255) | Unique, not null, valid email |
| passwordHash | varchar(60) | Not null, bcrypt, never in API responses |
| displayName | varchar(100) | Not null |
| role | enum('user','admin') | Default: 'user' |
| createdAt | timestamp | UTC, immutable, auto-set |
| updatedAt | timestamp | UTC, auto-updated |
| deletedAt | timestamp | Null unless soft-deleted |
```
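
For traceability it can help to mirror the table in code, with each constraint noted where the requirement lives. An illustrative Python sketch (an assumption; the spec itself is storage-agnostic, and real enforcement would sit in the database schema and validators):

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class User:
    email: str                  # unique, <= 255 chars, valid email
    password_hash: str          # bcrypt, never serialized into API responses
    display_name: str           # 1-100 chars
    role: str = "user"          # 'user' | 'admin'
    id: str = field(default_factory=lambda: str(uuid.uuid4()))  # PK, immutable
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))     # UTC, auto-set
    updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))     # UTC, auto-updated
    deleted_at: Optional[datetime] = None                       # null unless soft-deleted


u = User(email="user@example.com", password_hash="$2b$12$...", display_name="Jane")
print(u.role)  # -> user
```

Keeping the comments in sync with FR/NFR numbers makes it obvious when the model and the spec drift apart.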

### Rules

1. **Every entity in requirements MUST have a data model.** If FR-1 mentions "users", there must be a User model.
2. **Constraints MUST match requirements.** If FR-2 says passwords >= 8 chars, the model must note that.
3. **Include indexes.** If NFR-P1 requires < 500ms queries, note which fields need indexes.
4. **Specify soft vs. hard delete.** State it explicitly.

---

## Section 9: Out of Scope

### Why This Section Matters

Out of Scope prevents scope creep during implementation. When someone says "while you're in there, could you also..." — point them to this section.

### Format

```markdown
- OS-1: Multi-factor authentication — Planned for Q3 (SPEC-045).
- OS-2: Social login beyond Google/GitHub — Insufficient user demand (< 2% of requests).
- OS-3: Admin impersonation — Security review pending. Separate spec required.
- OS-4: Password strength meter UI — Nice-to-have, deferred to design sprint 12.
```

### Rules

1. **Every feature discussed and rejected MUST be listed.** This creates a paper trail.
2. **Include the reason.** "Not now" is not a reason. "Insufficient demand (< 2% of requests)" is.
3. **Link to future specs** when the exclusion is a deferral, not a rejection.

---

## Feature-Type Templates

### CRUD Feature

Focus on: all four operations, validation rules, authorization, pagination for list endpoints.

```markdown
- FR-1: Users MUST be able to create a [resource] with [required fields].
- FR-2: Users MUST be able to read a [resource] by ID.
- FR-3: Users MUST be able to list [resources] with pagination (default: 20/page).
- FR-4: Users MUST be able to update [mutable fields] of their own [resources].
- FR-5: Users MUST be able to delete their own [resources] (soft delete).
- FR-6: Users MUST NOT be able to modify or delete other users' [resources].
```

### Integration Feature

Focus on: external API contract, retry/fallback behavior, data mapping, error propagation.

```markdown
- FR-1: The system MUST call [external API] to [purpose].
- FR-2: The system MUST retry failed calls up to 3 times with exponential backoff.
- FR-3: The system MUST map [external field] to [internal field].
- FR-4: The system MUST NOT expose external API errors directly to users.
- EC-1: External API returns 5xx → Log the error, return cached data if < 1h old, else 503.
- EC-2: External API response schema changes → Log a warning, reject unmappable fields.
```

### Migration Feature

Focus on: backward compatibility, rollback plan, data integrity, zero-downtime deployment.

```markdown
- FR-1: The migration MUST transform [old schema] to [new schema].
- FR-2: The migration MUST be reversible (rollback script required).
- FR-3: The migration MUST NOT cause downtime exceeding 30 seconds.
- FR-4: The migration MUST validate data integrity post-run (row count, checksum).
- EC-1: Migration fails mid-way → Automatic rollback, alert the ops team.
- EC-2: New schema has stricter constraints → Log invalid rows, quarantine for manual review.
```

---

## Checklist: Is This Spec Ready for Review?

- [ ] Every section is filled (or marked N/A with a reason)
- [ ] All requirements use FR-N, NFR-N numbering
- [ ] RFC 2119 keywords are UPPERCASE
- [ ] Every AC references at least one requirement
- [ ] Every AC uses Given/When/Then
- [ ] Edge cases cover each external dependency failure
- [ ] API contracts define success AND error responses
- [ ] Data models include all entities from requirements
- [ ] Out of Scope lists items discussed and rejected
- [ ] No placeholder text remains
- [ ] Context includes evidence (metrics, tickets, research)
- [ ] Status is "In Review" (not still "Draft")
338
engineering/spec-driven-workflow/spec_generator.py
Normal file
@@ -0,0 +1,338 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Spec Generator - Generates a feature specification template from a name and description.
|
||||
|
||||
Produces a complete spec document with all required sections pre-filled with
|
||||
guidance prompts. Output can be markdown or structured JSON.
|
||||
|
||||
No external dependencies - uses only Python standard library.
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import sys
|
||||
import textwrap
|
||||
from datetime import date
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any, Optional
|
||||
|
||||
|
||||
SPEC_TEMPLATE = """\
|
||||
# Spec: {name}
|
||||
|
||||
**Author:** [your name]
|
||||
**Date:** {date}
|
||||
**Status:** Draft
|
||||
**Reviewers:** [list reviewers]
|
||||
**Related specs:** [links to related specs, or "None"]
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
{context_prompt}
|
||||
|
||||
---
|
||||
|
||||
## Functional Requirements
|
||||
|
||||
_Use RFC 2119 keywords: MUST, MUST NOT, SHOULD, SHOULD NOT, MAY._
|
||||
_Each requirement is a single, testable statement. Number sequentially._
|
||||
|
||||
- FR-1: The system MUST [describe required behavior].
|
||||
- FR-2: The system MUST [describe another required behavior].
|
||||
- FR-3: The system SHOULD [describe recommended behavior].
|
||||
- FR-4: The system MAY [describe optional behavior].
|
||||
- FR-5: The system MUST NOT [describe prohibited behavior].
|
||||
|
||||
---
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
### Performance
|
||||
- NFR-P1: [Operation] MUST complete in < [threshold] (p95) under [conditions].
|
||||
- NFR-P2: [Operation] SHOULD handle [throughput] requests per second.
|
||||
|
||||
### Security
|
||||
- NFR-S1: All data in transit MUST be encrypted via TLS 1.2+.
|
||||
- NFR-S2: The system MUST rate-limit [operation] to [limit] per [period] per [scope].
|
||||
|
||||
### Accessibility
|
||||
- NFR-A1: [UI component] MUST meet WCAG 2.1 AA standards.
|
||||
- NFR-A2: Error messages MUST be announced to screen readers.
|
||||
|
||||
### Scalability
|
||||
- NFR-SC1: The system SHOULD handle [number] concurrent [entities].
|
||||
|
||||
### Reliability
|
||||
- NFR-R1: The [service] MUST maintain [percentage]% uptime.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
_Write in Given/When/Then (Gherkin) format._
|
||||
_Each criterion MUST reference at least one FR-* or NFR-*._
|
||||
|
||||
### AC-1: [Descriptive name] (FR-1)
|
||||
Given [precondition]
|
||||
When [action]
|
||||
Then [expected result]
|
||||
And [additional assertion]
|
||||
|
||||
### AC-2: [Descriptive name] (FR-2)
|
||||
Given [precondition]
|
||||
When [action]
|
||||
Then [expected result]
|
||||
|
||||
### AC-3: [Descriptive name] (NFR-S2)
|
||||
Given [precondition]
|
||||
When [action]
|
||||
Then [expected result]
|
||||
And [additional assertion]
|
||||
|
||||
---
|
||||
|
||||
## Edge Cases
|
||||
|
||||
_For every external dependency (API, database, file system, user input), specify at least one failure scenario._
|
||||
|
||||
- EC-1: [Input/condition] -> [expected behavior].
|
||||
- EC-2: [Input/condition] -> [expected behavior].
|
||||
- EC-3: [External service] is unavailable -> [expected behavior].
|
||||
- EC-4: [Concurrent/race condition] -> [expected behavior].
|
||||
- EC-5: [Boundary value] -> [expected behavior].
|
||||
|
||||
---
|
||||
|
||||
## API Contracts
|
||||
|
||||
_Define request/response shapes using TypeScript-style notation._
|
||||
_Cover all endpoints referenced in functional requirements._
|
||||
|
||||
### [METHOD] [endpoint]
|
||||
|
||||
Request:
|
||||
```typescript
|
||||
interface [Name]Request {{
|
||||
field: string; // Description, constraints
|
||||
optional?: number; // Default: [value]
|
||||
}}
|
||||
```
|
||||
|
||||
Success Response ([status code]):
|
||||
```typescript
|
||||
interface [Name]Response {{
|
||||
id: string;
|
||||
field: string;
|
||||
createdAt: string; // ISO 8601
|
||||
}}
|
||||
```
|
||||
|
||||
Error Response ([status code]):
|
||||
```typescript
|
||||
interface [Name]Error {{
|
||||
error: "[ERROR_CODE]";
|
||||
message: string;
|
||||
}}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Models
|
||||
|
||||
_Define all entities referenced in requirements._
|
||||
|
||||
### [Entity Name]
|
||||
| Field | Type | Constraints |
|
||||
|-------|------|-------------|
|
||||
| id | UUID | Primary key, auto-generated |
|
||||
| [field] | [type] | [constraints] |
|
||||
| createdAt | timestamp | UTC, immutable |
|
||||
| updatedAt | timestamp | UTC, auto-updated |
|
||||
|
||||
---
|
||||
|
||||
## Out of Scope
|
||||
|
||||
_Explicit exclusions prevent scope creep. If someone asks for these during implementation, point them here._
|
||||
|
||||
- OS-1: [Feature/capability] — [reason for exclusion or link to future spec].
|
||||
- OS-2: [Feature/capability] — [reason for exclusion].
|
||||
- OS-3: [Feature/capability] — deferred to [version/sprint].
|
||||
|
||||
---
|
||||
|
||||
## Open Questions
|
||||
|
||||
_Track unresolved questions here. Each must be resolved before status moves to "Approved"._
|
||||
|
||||
- [ ] Q1: [Question] — Owner: [name], Due: [date]
|
||||
- [ ] Q2: [Question] — Owner: [name], Due: [date]
|
||||
"""
|
||||
|
||||
|
||||
def generate_context_prompt(description: str) -> str:
|
||||
"""Generate a context section prompt based on the provided description."""
|
||||
if description:
|
||||
return textwrap.dedent(f"""\
|
||||
{description}
|
||||
|
||||
_Expand this context section to include:_
|
||||
_- Why does this feature exist? What problem does it solve?_
|
||||
_- What is the business motivation? (link to user research, support tickets, metrics)_
|
||||
_- What is the current state? (what exists today, what pain points exist)_
|
||||
_- 2-4 paragraphs maximum._""")
|
||||
return textwrap.dedent("""\
|
||||
_Why does this feature exist? What problem does it solve? What is the business
|
||||
motivation? Include links to user research, support tickets, or metrics that
|
||||
justify this work. 2-4 paragraphs maximum._""")
|
||||
|
||||
|
||||
def generate_spec(name: str, description: str) -> str:
|
||||
"""Generate a spec document from name and description."""
|
||||
context_prompt = generate_context_prompt(description)
|
||||
return SPEC_TEMPLATE.format(
|
||||
name=name,
|
||||
date=date.today().isoformat(),
|
||||
context_prompt=context_prompt,
|
||||
)
|
||||
|
||||
|
||||
def generate_spec_json(name: str, description: str) -> Dict[str, Any]:
|
||||
"""Generate structured JSON representation of the spec template."""
|
||||
return {
|
||||
"spec": {
|
||||
"title": f"Spec: {name}",
|
||||
"metadata": {
|
||||
"author": "[your name]",
|
||||
"date": date.today().isoformat(),
|
||||
"status": "Draft",
|
||||
"reviewers": [],
|
||||
"related_specs": [],
|
||||
},
|
||||
"context": description or "[Describe why this feature exists]",
|
||||
"functional_requirements": [
|
||||
{"id": "FR-1", "keyword": "MUST", "description": "[describe required behavior]"},
|
||||
{"id": "FR-2", "keyword": "MUST", "description": "[describe another required behavior]"},
|
||||
{"id": "FR-3", "keyword": "SHOULD", "description": "[describe recommended behavior]"},
|
||||
{"id": "FR-4", "keyword": "MAY", "description": "[describe optional behavior]"},
|
||||
{"id": "FR-5", "keyword": "MUST NOT", "description": "[describe prohibited behavior]"},
|
||||
],
|
||||
"non_functional_requirements": {
|
||||
"performance": [
|
||||
{"id": "NFR-P1", "description": "[operation] MUST complete in < [threshold]"},
|
||||
],
|
||||
"security": [
|
||||
{"id": "NFR-S1", "description": "All data in transit MUST be encrypted via TLS 1.2+"},
|
||||
],
|
||||
"accessibility": [
|
||||
{"id": "NFR-A1", "description": "[UI component] MUST meet WCAG 2.1 AA"},
|
||||
],
|
||||
"scalability": [
|
||||
{"id": "NFR-SC1", "description": "[system] SHOULD handle [N] concurrent [entities]"},
|
||||
],
|
||||
"reliability": [
|
||||
{"id": "NFR-R1", "description": "[service] MUST maintain [N]% uptime"},
|
||||
],
|
||||
},
|
||||
"acceptance_criteria": [
|
||||
{
|
||||
"id": "AC-1",
|
||||
"name": "[descriptive name]",
|
||||
"references": ["FR-1"],
|
||||
"given": "[precondition]",
|
||||
"when": "[action]",
|
||||
"then": "[expected result]",
|
||||
},
|
||||
],
|
||||
"edge_cases": [
|
||||
{"id": "EC-1", "condition": "[input/condition]", "behavior": "[expected behavior]"},
|
||||
],
|
||||
"api_contracts": [
|
||||
{
|
||||
"method": "[METHOD]",
|
||||
"endpoint": "[/api/path]",
|
||||
"request_fields": [{"name": "field", "type": "string", "constraints": "[description]"}],
|
||||
"success_response": {"status": 200, "fields": []},
|
||||
"error_response": {"status": 400, "fields": []},
|
||||
},
|
||||
],
|
||||
"data_models": [
|
||||
{
|
||||
"name": "[Entity]",
|
||||
"fields": [
|
||||
{"name": "id", "type": "UUID", "constraints": "Primary key, auto-generated"},
|
||||
],
|
||||
},
|
||||
],
|
||||
"out_of_scope": [
|
||||
{"id": "OS-1", "description": "[feature/capability]", "reason": "[reason]"},
|
||||
],
|
||||
"open_questions": [],
|
||||
},
|
||||
"metadata": {
|
||||
"generated_by": "spec_generator.py",
|
||||
"feature_name": name,
|
||||
"feature_description": description,
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Generate a feature specification template from a name and description.",
|
||||
epilog="Example: python spec_generator.py --name 'User Auth' --description 'OAuth 2.0 login flow'",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--name",
|
||||
required=True,
|
||||
help="Feature name (used as spec title)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--description",
|
||||
default="",
|
||||
help="Brief feature description (used to seed the context section)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--output",
|
||||
"-o",
|
||||
default=None,
|
||||
help="Output file path (default: stdout)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--format",
|
||||
choices=["md", "json"],
|
||||
default="md",
|
||||
help="Output format: md (markdown) or json (default: md)",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--json",
|
||||
action="store_true",
|
||||
dest="json_flag",
|
||||
help="Shorthand for --format json",
|
||||
)
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
output_format = "json" if args.json_flag else args.format
|
||||
|
||||
if output_format == "json":
|
||||
result = generate_spec_json(args.name, args.description)
|
||||
output = json.dumps(result, indent=2)
|
||||
else:
|
||||
output = generate_spec(args.name, args.description)
|
||||
|
||||
if args.output:
|
||||
out_path = Path(args.output)
|
||||
out_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
out_path.write_text(output, encoding="utf-8")
|
||||
print(f"Spec template written to {out_path}", file=sys.stderr)
|
||||
else:
|
||||
print(output)
|
||||
|
||||
sys.exit(0)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
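An aside on the CLI above: spec_generator.py accepts both `--format json` and the `--json` shorthand, and the shorthand wins when both are given. The precedence logic can be checked with argparse alone (a standalone sketch, not importing the script):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--format", choices=["md", "json"], default="md")
parser.add_argument("--json", action="store_true", dest="json_flag")

# --json overrides --format, exactly as in the script's main()
args = parser.parse_args(["--format", "md", "--json"])
output_format = "json" if args.json_flag else args.format
print(output_format)  # json
```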
461
engineering/spec-driven-workflow/spec_validator.py
Normal file
@@ -0,0 +1,461 @@
#!/usr/bin/env python3
"""
Spec Validator - Validates a feature specification for completeness and quality.

Checks that a spec document contains all required sections, uses RFC 2119 keywords
correctly, has acceptance criteria in Given/When/Then format, and scores overall
completeness from 0-100.

Sections checked:
- Context, Functional Requirements, Non-Functional Requirements
- Acceptance Criteria, Edge Cases, API Contracts, Data Models, Out of Scope

Exit codes: 0 = pass, 1 = warnings, 2 = critical (or --strict with score < 80)

No external dependencies - uses only Python standard library.
"""

import argparse
import json
import re
import sys
from pathlib import Path
from typing import Dict, List, Any, Tuple


# Section definitions: (key, display_name, required_header_patterns, weight)
SECTIONS = [
    ("context", "Context", [r"^##\s+Context"], 10),
    ("functional_requirements", "Functional Requirements", [r"^##\s+Functional\s+Requirements"], 15),
    ("non_functional_requirements", "Non-Functional Requirements", [r"^##\s+Non-Functional\s+Requirements"], 10),
    ("acceptance_criteria", "Acceptance Criteria", [r"^##\s+Acceptance\s+Criteria"], 20),
    ("edge_cases", "Edge Cases", [r"^##\s+Edge\s+Cases"], 10),
    ("api_contracts", "API Contracts", [r"^##\s+API\s+Contracts"], 10),
    ("data_models", "Data Models", [r"^##\s+Data\s+Models"], 10),
    ("out_of_scope", "Out of Scope", [r"^##\s+Out\s+of\s+Scope"], 10),
    ("metadata", "Metadata (Author/Date/Status)", [r"\*\*Author:\*\*", r"\*\*Date:\*\*", r"\*\*Status:\*\*"], 5),
]

RFC_KEYWORDS = ["MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY"]

# Patterns that indicate placeholder/unfilled content
PLACEHOLDER_PATTERNS = [
    r"\[your\s+name\]",
    r"\[list\s+reviewers\]",
    r"\[describe\s+",
    r"\[input/condition\]",
    r"\[precondition\]",
    r"\[action\]",
    r"\[expected\s+result\]",
    r"\[feature/capability\]",
    r"\[operation\]",
    r"\[threshold\]",
    r"\[UI\s+component\]",
    r"\[service\]",
    r"\[percentage\]",
    r"\[number\]",
    r"\[METHOD\]",
    r"\[endpoint\]",
    r"\[Name\]",
    r"\[Entity\s+Name\]",
    r"\[type\]",
    r"\[constraints\]",
    r"\[field\]",
    r"\[reason\]",
]


class SpecValidator:
    """Validates a spec document for completeness and quality."""

    def __init__(self, content: str, file_path: str = ""):
        self.content = content
        self.file_path = file_path
        self.lines = content.split("\n")
        self.findings: List[Dict[str, Any]] = []
        self.section_scores: Dict[str, Dict[str, Any]] = {}

    def validate(self) -> Dict[str, Any]:
        """Run all validation checks and return results."""
        self._check_sections_present()
        self._check_functional_requirements()
        self._check_acceptance_criteria()
        self._check_edge_cases()
        self._check_rfc_keywords()
        self._check_api_contracts()
        self._check_data_models()
        self._check_out_of_scope()
        self._check_placeholders()
        self._check_traceability()

        total_score = self._calculate_score()

        return {
            "file": self.file_path,
            "score": total_score,
            "grade": self._score_to_grade(total_score),
            "sections": self.section_scores,
            "findings": self.findings,
            "summary": self._build_summary(total_score),
        }

    def _add_finding(self, severity: str, section: str, message: str):
        """Record a validation finding."""
        self.findings.append({
            "severity": severity,  # "error", "warning", "info"
            "section": section,
            "message": message,
        })

    def _find_section_content(self, header_pattern: str) -> str:
        """Extract content between a section header and the next ## header."""
        in_section = False
        section_lines = []
        for line in self.lines:
            if re.match(header_pattern, line, re.IGNORECASE):
                in_section = True
                continue
            if in_section and re.match(r"^##\s+", line):
                break
            if in_section:
                section_lines.append(line)
        return "\n".join(section_lines)

    def _check_sections_present(self):
        """Check that all required sections exist."""
        for key, name, patterns, weight in SECTIONS:
            found = False
            for pattern in patterns:
                for line in self.lines:
                    if re.search(pattern, line, re.IGNORECASE):
                        found = True
                        break
                if found:
                    break

            if found:
                self.section_scores[key] = {"name": name, "present": True, "score": weight, "max": weight}
            else:
                self.section_scores[key] = {"name": name, "present": False, "score": 0, "max": weight}
                self._add_finding("error", key, f"Missing section: {name}")

    def _check_functional_requirements(self):
        """Validate functional requirements format and content."""
        content = self._find_section_content(r"^##\s+Functional\s+Requirements")
        if not content.strip():
            return

        fr_pattern = re.compile(r"-\s+FR-(\d+):")
        matches = fr_pattern.findall(content)

        if not matches:
            self._add_finding("error", "functional_requirements", "No numbered requirements found (expected FR-N: format)")
            if "functional_requirements" in self.section_scores:
                self.section_scores["functional_requirements"]["score"] = max(
                    0, self.section_scores["functional_requirements"]["score"] - 10
                )
            return

        fr_count = len(matches)
        if fr_count < 3:
            self._add_finding("warning", "functional_requirements", f"Only {fr_count} requirements found. Most features need 3+.")

        # Check for RFC keywords
        has_keyword = False
        for kw in RFC_KEYWORDS:
            if kw in content:
                has_keyword = True
                break
        if not has_keyword:
            self._add_finding("warning", "functional_requirements", "No RFC 2119 keywords (MUST/SHOULD/MAY) found.")

    def _check_acceptance_criteria(self):
        """Validate acceptance criteria use Given/When/Then format."""
        content = self._find_section_content(r"^##\s+Acceptance\s+Criteria")
        if not content.strip():
            return

        ac_pattern = re.compile(r"###\s+AC-(\d+):")
        matches = ac_pattern.findall(content)

        if not matches:
            self._add_finding("error", "acceptance_criteria", "No numbered acceptance criteria found (expected ### AC-N: format)")
            if "acceptance_criteria" in self.section_scores:
                self.section_scores["acceptance_criteria"]["score"] = max(
                    0, self.section_scores["acceptance_criteria"]["score"] - 15
                )
            return

        ac_count = len(matches)

        # Check Given/When/Then
        given_count = len(re.findall(r"(?i)\bgiven\b", content))
        when_count = len(re.findall(r"(?i)\bwhen\b", content))
        then_count = len(re.findall(r"(?i)\bthen\b", content))

        if given_count < ac_count:
            self._add_finding("warning", "acceptance_criteria",
                              f"Found {ac_count} criteria but only {given_count} 'Given' clauses. Each AC needs Given/When/Then.")
        if when_count < ac_count:
            self._add_finding("warning", "acceptance_criteria",
                              f"Found {ac_count} criteria but only {when_count} 'When' clauses.")
        if then_count < ac_count:
            self._add_finding("warning", "acceptance_criteria",
                              f"Found {ac_count} criteria but only {then_count} 'Then' clauses.")

        # Check for FR references
        fr_refs = re.findall(r"\(FR-\d+", content)
        if not fr_refs:
            self._add_finding("warning", "acceptance_criteria",
                              "No acceptance criteria reference functional requirements (expected (FR-N) in title).")

    def _check_edge_cases(self):
        """Validate edge cases section."""
        content = self._find_section_content(r"^##\s+Edge\s+Cases")
        if not content.strip():
            return

        ec_pattern = re.compile(r"-\s+EC-(\d+):")
        matches = ec_pattern.findall(content)

        if not matches:
            self._add_finding("warning", "edge_cases", "No numbered edge cases found (expected EC-N: format)")
        elif len(matches) < 3:
            self._add_finding("warning", "edge_cases", f"Only {len(matches)} edge cases. Consider failure modes for each external dependency.")

    def _check_rfc_keywords(self):
        """Check RFC 2119 keywords are used consistently (capitalized)."""
        # Look for lowercase must/should/may that might be intended as RFC keywords
        context_content = self._find_section_content(r"^##\s+Functional\s+Requirements")
        context_content += self._find_section_content(r"^##\s+Non-Functional\s+Requirements")

        for kw in ["must", "should", "may"]:
            # Find lowercase usage in requirement-like sentences
            pattern = rf"(?:system|service|API|endpoint)\s+{kw}\s+"
            if re.search(pattern, context_content):
                self._add_finding("warning", "rfc_keywords",
                                  f"Found lowercase '{kw}' in requirements. RFC 2119 keywords should be UPPERCASE: {kw.upper()}")

    def _check_api_contracts(self):
        """Validate API contracts section."""
        content = self._find_section_content(r"^##\s+API\s+Contracts")
        if not content.strip():
            return

        # Check for at least one endpoint definition
        has_endpoint = bool(re.search(r"(GET|POST|PUT|PATCH|DELETE)\s+/", content))
        if not has_endpoint:
            self._add_finding("warning", "api_contracts", "No HTTP method + path found (expected e.g., POST /api/endpoint)")

        # Check for request/response definitions
        has_interface = bool(re.search(r"interface\s+\w+", content))
        if not has_interface:
            self._add_finding("info", "api_contracts", "No TypeScript interfaces found. Consider defining request/response shapes.")

    def _check_data_models(self):
        """Validate data models section."""
        content = self._find_section_content(r"^##\s+Data\s+Models")
        if not content.strip():
            return

        # Check for table format
        has_table = bool(re.search(r"\|.*\|.*\|", content))
        if not has_table:
            self._add_finding("warning", "data_models", "No table-formatted data models found. Use | Field | Type | Constraints | format.")

    def _check_out_of_scope(self):
        """Validate out of scope section."""
        content = self._find_section_content(r"^##\s+Out\s+of\s+Scope")
        if not content.strip():
            return

        os_pattern = re.compile(r"-\s+OS-(\d+):")
        matches = os_pattern.findall(content)

        if not matches:
            self._add_finding("warning", "out_of_scope", "No numbered exclusions found (expected OS-N: format)")
        elif len(matches) < 2:
            self._add_finding("info", "out_of_scope", "Only 1 exclusion listed. Consider what was deliberately left out.")

    def _check_placeholders(self):
        """Check for unfilled placeholder text."""
        placeholder_count = 0
        for pattern in PLACEHOLDER_PATTERNS:
            matches = re.findall(pattern, self.content, re.IGNORECASE)
            placeholder_count += len(matches)

        if placeholder_count > 0:
            self._add_finding("warning", "placeholders",
                              f"Found {placeholder_count} placeholder(s) that need to be filled in (e.g., [your name], [describe ...]).")
            # Deduct from overall score proportionally
            for key in self.section_scores:
                if self.section_scores[key]["present"]:
                    deduction = min(3, self.section_scores[key]["score"])
                    self.section_scores[key]["score"] = max(0, self.section_scores[key]["score"] - deduction)

    def _check_traceability(self):
        """Check that acceptance criteria reference functional requirements."""
        ac_content = self._find_section_content(r"^##\s+Acceptance\s+Criteria")
        fr_content = self._find_section_content(r"^##\s+Functional\s+Requirements")

        if not ac_content.strip() or not fr_content.strip():
            return

        # Extract FR IDs
        fr_ids = set(re.findall(r"FR-(\d+)", fr_content))
        # Extract FR references from AC
        ac_fr_refs = set(re.findall(r"FR-(\d+)", ac_content))

        unreferenced = fr_ids - ac_fr_refs
        if unreferenced:
            unreferenced_list = ", ".join(f"FR-{i}" for i in sorted(unreferenced))
            self._add_finding("warning", "traceability",
                              f"Functional requirements without acceptance criteria: {unreferenced_list}")

    def _calculate_score(self) -> int:
        """Calculate the total completeness score."""
        total = sum(s["score"] for s in self.section_scores.values())
        maximum = sum(s["max"] for s in self.section_scores.values())

        if maximum == 0:
            return 0

        # Apply finding-based deductions
        error_count = sum(1 for f in self.findings if f["severity"] == "error")
        warning_count = sum(1 for f in self.findings if f["severity"] == "warning")

        base_score = round((total / maximum) * 100)
        deduction = (error_count * 5) + (warning_count * 2)

        return max(0, min(100, base_score - deduction))

    @staticmethod
    def _score_to_grade(score: int) -> str:
        """Convert score to letter grade."""
        if score >= 90:
            return "A"
        if score >= 80:
            return "B"
        if score >= 70:
            return "C"
        if score >= 60:
            return "D"
        return "F"

    def _build_summary(self, score: int) -> str:
        """Build human-readable summary."""
        errors = [f for f in self.findings if f["severity"] == "error"]
        warnings = [f for f in self.findings if f["severity"] == "warning"]
        infos = [f for f in self.findings if f["severity"] == "info"]

        lines = [
            f"Spec Completeness Score: {score}/100 (Grade: {self._score_to_grade(score)})",
            f"Errors: {len(errors)}, Warnings: {len(warnings)}, Info: {len(infos)}",
            "",
        ]

        if errors:
            lines.append("ERRORS (must fix):")
            for e in errors:
                lines.append(f"  [{e['section']}] {e['message']}")
            lines.append("")

        if warnings:
            lines.append("WARNINGS (should fix):")
            for w in warnings:
                lines.append(f"  [{w['section']}] {w['message']}")
            lines.append("")

        if infos:
            lines.append("INFO:")
            for i in infos:
                lines.append(f"  [{i['section']}] {i['message']}")
            lines.append("")

        # Section breakdown
        lines.append("Section Breakdown:")
        for key, data in self.section_scores.items():
            status = "PRESENT" if data["present"] else "MISSING"
            lines.append(f"  {data['name']}: {data['score']}/{data['max']} ({status})")

        return "\n".join(lines)


def format_human(result: Dict[str, Any]) -> str:
    """Format validation result for human reading."""
    lines = [
        "=" * 60,
        "SPEC VALIDATION REPORT",
        "=" * 60,
        "",
    ]
    if result["file"]:
        lines.append(f"File: {result['file']}")
        lines.append("")

    lines.append(result["summary"])

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Validate a feature specification for completeness and quality.",
        epilog="Example: python spec_validator.py --file spec.md --strict",
    )
    parser.add_argument(
        "--file",
        "-f",
        required=True,
        help="Path to the spec markdown file",
    )
    parser.add_argument(
        "--strict",
        action="store_true",
        help="Exit with code 2 if score is below 80",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_flag",
        help="Output results as JSON",
    )

    args = parser.parse_args()

    file_path = Path(args.file)
    if not file_path.exists():
        print(f"Error: File not found: {file_path}", file=sys.stderr)
        sys.exit(2)

    content = file_path.read_text(encoding="utf-8")

    if not content.strip():
        print(f"Error: File is empty: {file_path}", file=sys.stderr)
        sys.exit(2)

    validator = SpecValidator(content, str(file_path))
    result = validator.validate()

    if args.json_flag:
        print(json.dumps(result, indent=2))
    else:
        print(format_human(result))

    # Determine exit code
    score = result["score"]
    has_errors = any(f["severity"] == "error" for f in result["findings"])
    has_warnings = any(f["severity"] == "warning" for f in result["findings"])

    if args.strict and score < 80:
        sys.exit(2)
    elif has_errors:
        sys.exit(2)
    elif has_warnings:
        sys.exit(1)
    else:
        sys.exit(0)


if __name__ == "__main__":
    main()
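The scoring model in spec_validator.py reduces to a weighted section percentage with flat deductions per finding. A standalone sketch mirroring `_calculate_score` (not importing the file):

```python
def spec_score(total: int, maximum: int, errors: int, warnings: int) -> int:
    """Section points as a percentage of the maximum, minus 5 per error and 2 per warning, clamped to 0..100."""
    if maximum == 0:
        return 0
    base = round((total / maximum) * 100)
    return max(0, min(100, base - errors * 5 - warnings * 2))

print(spec_score(90, 100, 1, 2))  # 81: base 90, minus 5 for the error, minus 4 for two warnings
```

Note that a spec can lose its "B" grade (80+) through findings alone, even when every section is present, which is what `--strict` is designed to catch.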
431
engineering/spec-driven-workflow/test_extractor.py
Normal file
@@ -0,0 +1,431 @@
#!/usr/bin/env python3
"""
Test Extractor - Extracts test case stubs from a feature specification.

Parses acceptance criteria (Given/When/Then) and edge cases from a spec
document, then generates test stubs for the specified framework.

Supported frameworks: pytest, jest, go-test

Exit codes: 0 = success, 1 = warnings (some criteria unparseable), 2 = critical error

No external dependencies - uses only Python standard library.
"""

import argparse
import json
import re
import sys
import textwrap
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple


class SpecParser:
    """Parses spec documents to extract testable criteria."""

    def __init__(self, content: str):
        self.content = content
        self.lines = content.split("\n")

    def extract_acceptance_criteria(self) -> List[Dict[str, Any]]:
        """Extract AC-N blocks with Given/When/Then clauses."""
        criteria = []
        ac_pattern = re.compile(r"###\s+AC-(\d+):\s*(.+?)(?:\s*\(([^)]+)\))?\s*$")

        in_ac = False
        current_ac: Optional[Dict[str, Any]] = None
        body_lines: List[str] = []

        for line in self.lines:
            match = ac_pattern.match(line)
            if match:
                # Save previous AC
                if current_ac is not None:
                    current_ac["body"] = "\n".join(body_lines).strip()
                    self._parse_gwt(current_ac)
                    criteria.append(current_ac)

                ac_id = int(match.group(1))
                name = match.group(2).strip()
                refs = match.group(3).strip() if match.group(3) else ""

                current_ac = {
                    "id": f"AC-{ac_id}",
                    "name": name,
                    "references": [r.strip() for r in refs.split(",") if r.strip()] if refs else [],
                    "given": "",
                    "when": "",
                    "then": [],
                    "body": "",
                }
                body_lines = []
                in_ac = True
            elif in_ac:
                # Check if we hit another ## section
                if re.match(r"^##\s+", line) and not re.match(r"^###\s+", line):
                    in_ac = False
                    if current_ac is not None:
                        current_ac["body"] = "\n".join(body_lines).strip()
                        self._parse_gwt(current_ac)
                        criteria.append(current_ac)
                        current_ac = None
                else:
                    body_lines.append(line)

        # Don't forget the last one
        if current_ac is not None:
            current_ac["body"] = "\n".join(body_lines).strip()
            self._parse_gwt(current_ac)
            criteria.append(current_ac)

        return criteria

    def extract_edge_cases(self) -> List[Dict[str, Any]]:
        """Extract EC-N edge case items."""
        edge_cases = []
        ec_pattern = re.compile(r"-\s+EC-(\d+):\s*(.+?)(?:\s*->\s*|\s*→\s*)(.+)")

        in_section = False
        for line in self.lines:
            if re.match(r"^##\s+Edge\s+Cases", line, re.IGNORECASE):
                in_section = True
                continue
            if in_section and re.match(r"^##\s+", line):
                break
            if in_section:
                match = ec_pattern.match(line.strip())
                if match:
                    edge_cases.append({
                        "id": f"EC-{match.group(1)}",
                        "condition": match.group(2).strip().rstrip("."),
                        "behavior": match.group(3).strip().rstrip("."),
                    })

        return edge_cases

    def extract_spec_title(self) -> str:
        """Extract the spec title from the first H1."""
        for line in self.lines:
            match = re.match(r"^#\s+(?:Spec:\s*)?(.+)", line)
            if match:
                return match.group(1).strip()
        return "UnknownFeature"

    @staticmethod
    def _parse_gwt(ac: Dict[str, Any]):
        """Parse Given/When/Then from the AC body text."""
        body = ac["body"]
        lines = body.split("\n")

        current_section = None
        for line in lines:
            stripped = line.strip()
            if not stripped:
                continue

            lower = stripped.lower()
            if lower.startswith("given "):
                current_section = "given"
                ac["given"] = stripped[6:].strip()
            elif lower.startswith("when "):
                current_section = "when"
                ac["when"] = stripped[5:].strip()
            elif lower.startswith("then "):
                current_section = "then"
                ac["then"].append(stripped[5:].strip())
            elif lower.startswith("and "):
                if current_section == "then":
                    ac["then"].append(stripped[4:].strip())
                elif current_section == "given":
                    ac["given"] += " AND " + stripped[4:].strip()
                elif current_section == "when":
                    ac["when"] += " AND " + stripped[4:].strip()
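The `_parse_gwt` state machine above decides which clause an "And" line extends by remembering the last keyword seen. A condensed standalone sketch of that logic (only the "Then" branch of the "And" handling is shown; Given/When continuation is omitted for brevity):

```python
def parse_gwt(body: str) -> dict:
    """Track the last keyword seen so trailing 'And' lines attach to the right clause."""
    ac = {"given": "", "when": "", "then": []}
    section = None
    for raw in body.split("\n"):
        line = raw.strip()
        lower = line.lower()
        if lower.startswith("given "):
            section, ac["given"] = "given", line[6:].strip()
        elif lower.startswith("when "):
            section, ac["when"] = "when", line[5:].strip()
        elif lower.startswith("then "):
            section = "then"
            ac["then"].append(line[5:].strip())
        elif lower.startswith("and ") and section == "then":
            ac["then"].append(line[4:].strip())
    return ac

parsed = parse_gwt("Given a logged-in user\nWhen the session expires\nThen redirect to login\nAnd show a notice")
print(parsed["then"])  # ['redirect to login', 'show a notice']
```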
def _sanitize_name(name: str) -> str:
    """Convert a human-readable name to a valid function/method name."""
    # Remove parenthetical references like (FR-1)
    name = re.sub(r"\([^)]*\)", "", name)
    # Replace non-alphanumeric with underscore
    name = re.sub(r"[^a-zA-Z0-9]+", "_", name)
    # Remove leading/trailing underscores
    name = name.strip("_").lower()
    return name or "unnamed"


def _to_pascal_case(name: str) -> str:
    """Convert to PascalCase for Go test names."""
    parts = _sanitize_name(name).split("_")
    return "".join(p.capitalize() for p in parts if p)
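The two naming helpers above feed every generated test name. A standalone sketch mirroring their regexes, with a sample input to show the intended round trip:

```python
import re

def sanitize_name(name: str) -> str:
    name = re.sub(r"\([^)]*\)", "", name)       # drop "(FR-1)"-style references
    name = re.sub(r"[^a-zA-Z0-9]+", "_", name)  # every other non-alphanumeric run becomes "_"
    return name.strip("_").lower() or "unnamed"

def to_pascal_case(name: str) -> str:
    return "".join(p.capitalize() for p in sanitize_name(name).split("_") if p)

print(sanitize_name("Valid login (FR-1)"))   # valid_login
print(to_pascal_case("Valid login (FR-1)"))  # ValidLogin
```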
class PytestGenerator:
    """Generates pytest test stubs."""

    def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
        class_name = "Test" + _to_pascal_case(title)
        lines = [
            '"""',
            f"Test suite for: {title}",
            f"Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
            "",
            "All tests are stubs — implement the test body to make them pass.",
            '"""',
            "",
            "import pytest",
            "",
            "",
            f"class {class_name}:",
            f'    """Tests for {title}."""',
            "",
        ]

        for ac in criteria:
            method_name = f"test_{ac['id'].lower().replace('-', '')}_{_sanitize_name(ac['name'])}"
            docstring = f'{ac["id"]}: {ac["name"]}'
            ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""

            lines.append(f"    def {method_name}(self):")
            lines.append(f'        """{docstring}{ref_str}"""')

            if ac["given"]:
                lines.append(f"        # Given {ac['given']}")
            if ac["when"]:
                lines.append(f"        # When {ac['when']}")
            for t in ac["then"]:
                lines.append(f"        # Then {t}")

            lines.append('        raise NotImplementedError("Implement this test")')
            lines.append("")

        if edge_cases:
            lines.append("    # --- Edge Cases ---")
            lines.append("")

        for ec in edge_cases:
            method_name = f"test_{ec['id'].lower().replace('-', '')}_{_sanitize_name(ec['condition'])}"
            lines.append(f"    def {method_name}(self):")
            lines.append(f'        """{ec["id"]}: {ec["condition"]} -> {ec["behavior"]}"""')
            lines.append(f"        # Condition: {ec['condition']}")
            lines.append(f"        # Expected: {ec['behavior']}")
            lines.append('        raise NotImplementedError("Implement this test")')
            lines.append("")

        return "\n".join(lines)


class JestGenerator:
    """Generates Jest/Vitest test stubs (TypeScript)."""

    def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
        lines = [
            "/**",
            f" * Test suite for: {title}",
            f" * Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
            " *",
            " * All tests are stubs — implement the test body to make them pass.",
            " */",
            "",
            f'describe("{title}", () => {{',
        ]

        for ac in criteria:
            ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""
            test_name = f"{ac['id']}: {ac['name']}{ref_str}"

            lines.append(f'  it("{test_name}", () => {{')
            if ac["given"]:
                lines.append(f"    // Given {ac['given']}")
            if ac["when"]:
                lines.append(f"    // When {ac['when']}")
            for t in ac["then"]:
                lines.append(f"    // Then {t}")
            lines.append("")
            lines.append('    throw new Error("Not implemented");')
            lines.append("  });")
            lines.append("")

        if edge_cases:
            lines.append("  // --- Edge Cases ---")
            lines.append("")

        for ec in edge_cases:
            test_name = f"{ec['id']}: {ec['condition']}"
            lines.append(f'  it("{test_name}", () => {{')
            lines.append(f"    // Condition: {ec['condition']}")
            lines.append(f"    // Expected: {ec['behavior']}")
            lines.append("")
            lines.append('    throw new Error("Not implemented");')
            lines.append("  });")
            lines.append("")

        lines.append("});")
        lines.append("")

        return "\n".join(lines)


class GoTestGenerator:
    """Generates Go test stubs."""

    def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
        package_name = _sanitize_name(title).split("_")[0] or "feature"

        lines = [
            f"package {package_name}_test",
            "",
            "import (",
            '\t"testing"',
            ")",
            "",
            f"// Test suite for: {title}",
            f"// Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
            "// All tests are stubs — implement the test body to make them pass.",
            "",
        ]

        for ac in criteria:
            func_name = "Test" + _to_pascal_case(ac["id"] + " " + ac["name"])
            ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""

            lines.append(f"// {ac['id']}: {ac['name']}{ref_str}")
            lines.append(f"func {func_name}(t *testing.T) {{")

            if ac["given"]:
                lines.append(f"\t// Given {ac['given']}")
            if ac["when"]:
                lines.append(f"\t// When {ac['when']}")
            for then_clause in ac["then"]:
                lines.append(f"\t// Then {then_clause}")

            lines.append("")
            lines.append('\tt.Fatal("Not implemented")')
            lines.append("}")
            lines.append("")

        if edge_cases:
            lines.append("// --- Edge Cases ---")
            lines.append("")

        for ec in edge_cases:
            func_name = "Test" + _to_pascal_case(ec["id"] + " " + ec["condition"])
            lines.append(f"// {ec['id']}: {ec['condition']} -> {ec['behavior']}")
            lines.append(f"func {func_name}(t *testing.T) {{")
            lines.append(f"\t// Condition: {ec['condition']}")
            lines.append(f"\t// Expected: {ec['behavior']}")
            lines.append("")
            lines.append('\tt.Fatal("Not implemented")')
            lines.append("}")
            lines.append("")

        return "\n".join(lines)


GENERATORS = {
    "pytest": PytestGenerator,
    "jest": JestGenerator,
    "go-test": GoTestGenerator,
}

FILE_EXTENSIONS = {
    "pytest": ".py",
    "jest": ".test.ts",
    "go-test": "_test.go",
}


def main():
    parser = argparse.ArgumentParser(
        description="Extract test case stubs from a feature specification.",
        epilog="Example: python test_extractor.py --file spec.md --framework pytest --output tests/test_feature.py",
    )
    parser.add_argument(
        "--file",
        "-f",
        required=True,
        help="Path to the spec markdown file",
    )
    parser.add_argument(
        "--framework",
        choices=list(GENERATORS.keys()),
        default="pytest",
        help="Target test framework (default: pytest)",
    )
    parser.add_argument(
        "--output",
        "-o",
        default=None,
        help="Output file path (default: stdout)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        dest="json_flag",
        help="Output extracted criteria as JSON instead of test code",
    )

    args = parser.parse_args()

    file_path = Path(args.file)
    if not file_path.exists():
        print(f"Error: File not found: {file_path}", file=sys.stderr)
        sys.exit(2)

    content = file_path.read_text(encoding="utf-8")
    if not content.strip():
        print(f"Error: File is empty: {file_path}", file=sys.stderr)
        sys.exit(2)

    spec_parser = SpecParser(content)
    title = spec_parser.extract_spec_title()
    criteria = spec_parser.extract_acceptance_criteria()
    edge_cases = spec_parser.extract_edge_cases()

    if not criteria and not edge_cases:
        print("Error: No acceptance criteria or edge cases found in spec.", file=sys.stderr)
        sys.exit(2)

    warnings = []
    for ac in criteria:
        if not ac["given"] and not ac["when"]:
            warnings.append(f"{ac['id']}: Could not parse Given/When/Then — check format.")

    if args.json_flag:
        result = {
            "spec_title": title,
            "framework": args.framework,
            "acceptance_criteria": criteria,
            "edge_cases": edge_cases,
            "warnings": warnings,
            "counts": {
                "acceptance_criteria": len(criteria),
                "edge_cases": len(edge_cases),
                "total_test_cases": len(criteria) + len(edge_cases),
            },
        }
        output = json.dumps(result, indent=2)
    else:
        generator_class = GENERATORS[args.framework]
        generator = generator_class()
        output = generator.generate(title, criteria, edge_cases)

    if args.output:
        out_path = Path(args.output)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(output, encoding="utf-8")
        total = len(criteria) + len(edge_cases)
        print(f"Generated {total} test stubs -> {out_path}", file=sys.stderr)
    else:
        print(output)

    if warnings:
        for w in warnings:
            print(f"Warning: {w}", file=sys.stderr)
        sys.exit(1)

    sys.exit(0)


if __name__ == "__main__":
    main()
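The edge-case lines that test_extractor.py consumes follow a "- EC-N: condition -> behavior" shape. The regex below is a condensed version of the pattern used in the file (it accepts both "->" and "→" as the separator), tried in isolation:

```python
import re

ec_pattern = re.compile(r"-\s+EC-(\d+):\s*(.+?)(?:\s*->\s*|\s*→\s*)(.+)")

m = ec_pattern.match("- EC-2: Upstream timeout -> Retry twice, then fail with 504.")
condition = m.group(2).strip().rstrip(".")  # lazy (.+?) stops at the first arrow
behavior = m.group(3).strip().rstrip(".")
print(m.group(1), condition, behavior, sep=" | ")  # 2 | Upstream timeout | Retry twice, then fail with 504
```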
@@ -174,6 +174,7 @@ nav:
       - "Agent Workflow Designer": skills/engineering/agent-workflow-designer.md
       - "API Design Reviewer": skills/engineering/api-design-reviewer.md
       - "API Test Suite Builder": skills/engineering/api-test-suite-builder.md
+      - "Browser Automation": skills/engineering/browser-automation.md
       - "Changelog Generator": skills/engineering/changelog-generator.md
       - "CI/CD Pipeline Builder": skills/engineering/ci-cd-pipeline-builder.md
       - "Codebase Onboarding": skills/engineering/codebase-onboarding.md
@@ -195,6 +196,7 @@ nav:
       - "Runbook Generator": skills/engineering/runbook-generator.md
       - "Skill Security Auditor": skills/engineering/skill-security-auditor.md
       - "Skill Tester": skills/engineering/skill-tester.md
+      - "Spec-Driven Workflow": skills/engineering/spec-driven-workflow.md
       - "Tech Debt Tracker": skills/engineering/tech-debt-tracker.md
       - "Terraform Patterns": skills/engineering/terraform-patterns.md
       - "Helm Chart Builder": skills/engineering/helm-chart-builder.md