feat(engineering): add browser-automation and spec-driven-workflow skills

browser-automation (564-line SKILL.md, 3 scripts, 3 references):
- Web scraping, form filling, screenshot capture, data extraction
- Anti-detection patterns, cookie/session management, dynamic content
- scraping_toolkit.py, form_automation_builder.py, anti_detection_checker.py
- NOT testing (that's playwright-pro) — this is automation & scraping

spec-driven-workflow (586-line SKILL.md, 3 scripts, 3 references):
- Spec-first development: write spec BEFORE code
- Bounded autonomy rules, 6-phase workflow, self-review checklist
- spec_generator.py, spec_validator.py, test_extractor.py
- Pairs with tdd-guide for red-green-refactor after spec

Updated engineering plugin.json (31 → 33 skills).
Added both to mkdocs.yml nav and generated docs pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Reza Rezvani
Date: 2026-03-25 12:57:18 +01:00
Parent: 7a2189fa21
Commit: 97952ccbee
19 changed files with 7379 additions and 3 deletions


@@ -0,0 +1,575 @@
---
title: "Browser Automation — Agent Skill for Codex & OpenClaw"
description: "Use when the user asks to automate browser tasks, scrape websites, fill forms, capture screenshots, extract structured data from web pages, or build. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---
# Browser Automation
<div class="page-meta" markdown>
<span class="meta-badge">:material-rocket-launch: Engineering - POWERFUL</span>
<span class="meta-badge">:material-identifier: `browser-automation`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering/browser-automation/SKILL.md">Source</a></span>
</div>
<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-advanced-skills</code>
</div>
## Overview
The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation workflows using Playwright. This skill covers data extraction, form filling, screenshot capture, session management, and anti-detection patterns for reliable browser automation at scale.
**When to use this skill:**
- Scraping structured data from websites (tables, listings, search results)
- Automating multi-step browser workflows (login, fill forms, download files)
- Capturing screenshots or PDFs of web pages
- Extracting data from SPAs and JavaScript-heavy sites
- Building repeatable browser-based data pipelines
**When NOT to use this skill:**
- Writing browser tests or E2E test suites — use **playwright-pro** instead
- Testing API endpoints — use **api-test-suite-builder** instead
- Load testing or performance benchmarking — use **performance-profiler** instead
**Why Playwright over Selenium or Puppeteer:**
- **Auto-wait built in** — no explicit `sleep()` or `waitForElement()` needed for most actions
- **Multi-browser from one API** — Chromium, Firefox, WebKit with zero config changes
- **Network interception** — block ads, mock responses, capture API calls natively
- **Browser contexts** — isolated sessions without spinning up new browser instances
- **Codegen** — `playwright codegen` records your actions and generates scripts
- **Async-first** — Python async/await for high-throughput scraping
## Core Competencies
### 1. Web Scraping Patterns
#### DOM Extraction with CSS Selectors
CSS selectors are the primary tool for element targeting. Prefer them over XPath for readability and performance.
**Selector priority (most to least reliable):**
1. `data-testid`, `data-id`, or custom data attributes — stable across redesigns
2. `#id` selectors — unique but may change between deploys
3. Semantic selectors: `article`, `nav`, `main`, `section` — resilient to CSS changes
4. Class-based: `.product-card`, `.price` — brittle if classes are generated (e.g., CSS modules)
5. Positional: `nth-child()`, `nth-of-type()` — last resort, breaks on layout changes
**Compound selectors for precision:**
```python
# Product cards within a specific container
page.query_selector_all("div.search-results > article.product-card")
# Price inside a product card (scoped)
card.query_selector("span[data-field='price']")
# Links with specific text content
page.locator("a", has_text="Next Page")
```
#### XPath for Complex Traversal
Use XPath only when CSS cannot express the relationship:
```python
# Find element by text content (XPath strength)
page.locator("//td[contains(text(), 'Total')]/following-sibling::td[1]")
# Navigate up the DOM tree
page.locator("//span[@class='price']/ancestor::div[@class='product']")
```
#### Pagination Patterns
- **Next-button pagination**: Click "Next" until disabled or absent
- **URL-based pagination**: Increment `?page=N` or `&offset=N` in URL
- **Infinite scroll**: Scroll to bottom, wait for new content, repeat until no change
- **Load-more button**: Click button, wait for DOM mutation, repeat
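For URL-based pagination, a small stdlib helper can build each page's URL without string surgery. This is a sketch; the `page_url` name and the default `page` parameter are illustrative:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def page_url(base_url, page_num, param="page"):
    """Return base_url with the pagination query parameter set to page_num."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query[param] = str(page_num)
    return urlunsplit(parts._replace(query=urlencode(query)))

# page_url("https://example.com/search?q=shoes", 3)
# -> "https://example.com/search?q=shoes&page=3"
```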
#### Infinite Scroll Handling
```python
async def scroll_to_bottom(page, max_scrolls=50, pause_ms=1500):
    previous_height = 0
    scrolls = 0
    for _ in range(max_scrolls):
        current_height = await page.evaluate("document.body.scrollHeight")
        if current_height == previous_height:
            break
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(pause_ms)
        previous_height = current_height
        scrolls += 1
    return scrolls  # number of scrolls performed
```
### 2. Form Filling & Multi-Step Workflows
#### Login Flows
```python
async def login(page, url, username, password):
    await page.goto(url)
    await page.fill("input[name='username']", username)
    await page.fill("input[name='password']", password)
    await page.click("button[type='submit']")
    # Wait for navigation to complete (post-login redirect)
    await page.wait_for_url("**/dashboard**")
```
#### Multi-Page Forms
Break multi-step forms into discrete functions per step. Each function:
1. Fills the fields for that step
2. Clicks the "Next" or "Continue" button
3. Waits for the next step to load (URL change or DOM element)
```python
async def fill_step_1(page, data):
    await page.fill("#first-name", data["first_name"])
    await page.fill("#last-name", data["last_name"])
    await page.select_option("#country", data["country"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-2-form")

async def fill_step_2(page, data):
    await page.fill("#address", data["address"])
    await page.fill("#city", data["city"])
    await page.click("button:has-text('Continue')")
    await page.wait_for_selector("#step-3-form")
```
#### File Uploads
```python
# Single file
await page.set_input_files("input[type='file']", "/path/to/file.pdf")

# Multiple files
await page.set_input_files("input[type='file']", [
    "/path/to/file1.pdf",
    "/path/to/file2.pdf",
])

# Drag-and-drop upload zones (no visible input element)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```
#### Dropdown and Select Handling
```python
# Native <select> element
await page.select_option("#country", value="US")
await page.select_option("#country", label="United States")
# Custom dropdown (div-based)
await page.click("div.dropdown-trigger")
await page.click("div.dropdown-option:has-text('United States')")
```
### 3. Screenshot & PDF Capture
#### Screenshot Strategies
```python
# Full page (scrolls automatically)
await page.screenshot(path="full-page.png", full_page=True)
# Viewport only (what's visible)
await page.screenshot(path="viewport.png")
# Specific element
element = page.locator("div.chart-container")
await element.screenshot(path="chart.png")
# With custom viewport for consistency
context = await browser.new_context(viewport={"width": 1920, "height": 1080})
```
#### PDF Generation
```python
# Only works in Chromium
await page.pdf(
    path="output.pdf",
    format="A4",
    margin={"top": "1cm", "right": "1cm", "bottom": "1cm", "left": "1cm"},
    print_background=True,
)
```
#### Visual Regression Baselines
Take screenshots at known states and compare pixel-by-pixel. Store baselines in version control. Use naming conventions: `{page}_{viewport}_{state}.png`.
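The comparison itself can stay dependency-free; a minimal sketch, where the `pixel_diff_ratio` name and the flat pixel-tuple format are illustrative (in practice a library such as `pixelmatch` or Pillow's `ImageChops` adds tolerance thresholds and anti-aliasing handling):

```python
def pixel_diff_ratio(baseline, current):
    """Fraction of differing pixels between two same-size images,
    each given as a flat sequence of (r, g, b) tuples."""
    if len(baseline) != len(current):
        raise ValueError("baseline and current must have the same dimensions")
    differing = sum(1 for a, b in zip(baseline, current) if a != b)
    return differing / len(baseline)
```

A regression gate is then a threshold check, e.g. fail the run when the ratio exceeds 0.001.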
### 4. Structured Data Extraction
#### Tables to JSON
```python
async def extract_table(page, selector):
    headers = await page.eval_on_selector_all(
        f"{selector} thead th",
        "elements => elements.map(e => e.textContent.trim())",
    )
    rows = await page.eval_on_selector_all(
        f"{selector} tbody tr",
        """rows => rows.map(row => {
            return Array.from(row.querySelectorAll('td'))
                .map(cell => cell.textContent.trim())
        })""",
    )
    return [dict(zip(headers, row)) for row in rows]
```
#### Listings to Arrays
```python
async def extract_listings(page, container_sel, field_map):
    """
    field_map example: {"title": "h3.title", "price": "span.price", "url": "a::attr(href)"}
    """
    items = []
    cards = await page.query_selector_all(container_sel)
    for card in cards:
        item = {}
        for field, sel in field_map.items():
            if "::attr(" in sel:
                attr_sel, attr_name = sel.split("::attr(")
                attr_name = attr_name.rstrip(")")
                el = await card.query_selector(attr_sel)
                item[field] = await el.get_attribute(attr_name) if el else None
            else:
                el = await card.query_selector(sel)
                item[field] = (await el.text_content()).strip() if el else None
        items.append(item)
    return items
```
#### Nested Data Extraction
For threaded content (comments with replies), use recursive extraction:
```python
async def extract_comments(scope, parent_selector):
    # scope may be a Page or an ElementHandle; recursion passes elements
    comments = []
    elements = await scope.query_selector_all(f"{parent_selector} > .comment")
    for el in elements:
        text = await (await el.query_selector(".comment-body")).text_content()
        author = await (await el.query_selector(".author")).text_content()
        replies = await extract_comments(el, ".replies")
        comments.append({
            "author": author.strip(),
            "text": text.strip(),
            "replies": replies,
        })
    return comments
```
### 5. Cookie & Session Management
#### Save and Restore Sessions
```python
import json

# Save cookies after login
cookies = await context.cookies()
with open("session.json", "w") as f:
    json.dump(cookies, f)

# Restore session in new context
with open("session.json", "r") as f:
    cookies = json.load(f)
context = await browser.new_context()
await context.add_cookies(cookies)
```
#### Storage State (Cookies + Local Storage)
```python
# Save full state (cookies + localStorage + sessionStorage)
await context.storage_state(path="state.json")
# Restore full state
context = await browser.new_context(storage_state="state.json")
```
**Best practice:** Save state after login, reuse across scraping sessions. Check session validity before starting a long job — make a lightweight request to a protected page and verify you are not redirected to login.
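A cheap pre-flight check can also inspect the saved cookies for expiry before launching a browser at all. A sketch, assuming the cookie shape Playwright's `context.cookies()` returns, where `expires` is a Unix timestamp and `-1` marks a session cookie:

```python
import time

def expired_cookie_names(cookies, now=None):
    """Names of cookies whose `expires` timestamp has passed.
    Session cookies (expires == -1) are never reported as expired."""
    now = time.time() if now is None else now
    return [
        c["name"]
        for c in cookies
        if c.get("expires", -1) != -1 and c["expires"] < now
    ]
```

If the auth cookie appears here, log in again before starting the long job.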
### 6. Anti-Detection Patterns
Modern websites detect automation through multiple vectors. Address all of them:
#### User Agent Rotation
Never use the default Playwright user agent. Rotate through real browser user agents:
```python
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
```
#### Viewport and Screen Size
Set realistic viewport dimensions. Default automation viewports (headless Chrome's 800x600 window, Playwright's 1280x720) are red flags:
```python
import random

context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    screen={"width": 1920, "height": 1080},
    user_agent=random.choice(USER_AGENTS),
)
```
#### WebDriver Flag Removal
Browsers driven by automation report `navigator.webdriver = true`. Mask it:
```python
await page.add_init_script("""
Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")
```
#### Request Throttling
Add human-like delays between actions:
```python
import asyncio
import random

async def human_delay(min_ms=500, max_ms=2000):
    # Sleep without needing a page handle, so any coroutine can call this
    await asyncio.sleep(random.randint(min_ms, max_ms) / 1000)
```
#### Proxy Support
```python
browser = await playwright.chromium.launch(
    proxy={"server": "http://proxy.example.com:8080"}
)
# Or per-context:
context = await browser.new_context(
    proxy={
        "server": "http://proxy.example.com:8080",
        "username": "user",
        "password": "pass",
    }
)
```
### 7. Dynamic Content Handling
#### SPA Rendering
SPAs render content client-side. Wait for the actual content, not the page load:
```python
await page.goto(url)
# Wait for the data to render, not just the shell
await page.wait_for_selector("div.product-list article", state="attached")
```
#### AJAX / Fetch Waiting
Intercept and wait for specific API calls:
```python
async with page.expect_response("**/api/products*") as response_info:
    await page.click("button.load-more")
response = await response_info.value
data = await response.json()  # You can use the API data directly
```
#### Shadow DOM Traversal
```python
# Playwright pierces open Shadow DOM automatically with >>
await page.locator("custom-element >> .inner-class").click()
```
#### Lazy-Loaded Images
Scroll elements into view to trigger lazy loading:
```python
images = await page.query_selector_all("img[data-src]")
for img in images:
    await img.scroll_into_view_if_needed()
    await page.wait_for_timeout(200)
```
### 8. Error Handling & Retry Logic
#### Retry Decorator Pattern
```python
import asyncio

async def with_retry(coro_factory, max_retries=3, backoff_base=2):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            wait = backoff_base ** attempt
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait}s...")
            await asyncio.sleep(wait)
```
#### Handling Common Failures
```python
from playwright.async_api import TimeoutError as PlaywrightTimeout

try:
    await page.click("button.submit", timeout=5000)
except PlaywrightTimeout:
    # Element did not appear: page structure may have changed.
    # Try a fallback selector.
    await page.click("[type='submit']", timeout=5000)
except Exception:
    # Network error, browser crash, etc.
    await page.screenshot(path="error-state.png")
    raise
```
#### Rate Limit Detection
```python
import asyncio

async def check_rate_limit(response):
    if response.status == 429:
        retry_after = response.headers.get("retry-after", "60")
        wait_seconds = int(retry_after)
        print(f"Rate limited. Waiting {wait_seconds}s...")
        await asyncio.sleep(wait_seconds)
        return True
    return False
```
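Note that `Retry-After` may also arrive as an HTTP-date rather than delta-seconds, which would make a bare `int()` raise. A hedged parser covering both forms (the function name and default are illustrative):

```python
import time
from email.utils import parsedate_to_datetime

def retry_after_seconds(value, default=60):
    """Parse a Retry-After header value: delta-seconds or an HTTP-date."""
    if value is None:
        return default
    try:
        return max(0, int(value))
    except ValueError:
        pass
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2026 07:28:00 GMT"
        return max(0, int(parsedate_to_datetime(value).timestamp() - time.time()))
    except (TypeError, ValueError):
        return default
```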
## Workflows
### Workflow 1: Single-Page Data Extraction
**Scenario:** Extract product data from a single page with JavaScript-rendered content.
**Steps:**
1. Launch browser in headed mode during development (`headless=False`), switch to headless for production
2. Navigate to URL and wait for content selector
3. Extract data using `query_selector_all` with field mapping
4. Validate extracted data (check for nulls, expected types)
5. Output as JSON
```python
from playwright.async_api import async_playwright

async def extract_single_page(url, selectors):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 ...",
        )
        page = await context.new_page()
        await page.goto(url, wait_until="networkidle")
        data = await extract_listings(page, selectors["container"], selectors["fields"])
        await browser.close()
        return data
```
### Workflow 2: Multi-Page Scraping with Pagination
**Scenario:** Scrape search results across 50+ pages.
**Steps:**
1. Launch browser with anti-detection settings
2. Navigate to first page
3. Extract data from current page
4. Check if "Next" button exists and is enabled
5. Click next, wait for new content to load (not just navigation)
6. Repeat until no next page or max pages reached
7. Deduplicate results by unique key
8. Write output incrementally (don't hold everything in memory)
```python
async def scrape_paginated(base_url, selectors, max_pages=100):
    all_data = []
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await (await browser.new_context()).new_page()
        await page.goto(base_url)
        for page_num in range(max_pages):
            items = await extract_listings(page, selectors["container"], selectors["fields"])
            all_data.extend(items)
            next_btn = page.locator(selectors["next_button"])
            if await next_btn.count() == 0 or await next_btn.is_disabled():
                break
            await next_btn.click()
            await page.wait_for_selector(selectors["container"])
            await human_delay(800, 2000)
        await browser.close()
    return all_data
```
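Step 7's deduplication can be a small order-preserving pass over the collected items. A sketch; the default `url` key is illustrative and should be whatever uniquely identifies a listing:

```python
def dedupe(items, key="url"):
    """Drop items whose key value has been seen before, preserving order."""
    seen = set()
    unique = []
    for item in items:
        k = item.get(key)
        if k in seen:
            continue
        seen.add(k)
        unique.append(item)
    return unique
```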
### Workflow 3: Authenticated Workflow Automation
**Scenario:** Log into a portal, navigate a multi-step form, download a report.
**Steps:**
1. Check for existing session state file
2. If no session, perform login and save state
3. Navigate to target page using saved session
4. Fill multi-step form with provided data
5. Wait for download to trigger
6. Save downloaded file to target directory
```python
import os

async def authenticated_workflow(credentials, form_data, download_dir):
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        state_file = "session_state.json"
        # Restore or create session
        if os.path.exists(state_file):
            context = await browser.new_context(storage_state=state_file)
        else:
            context = await browser.new_context()
            page = await context.new_page()
            await login(page, credentials["url"], credentials["user"], credentials["pass"])
            await context.storage_state(path=state_file)
        page = await context.new_page()
        await page.goto(form_data["target_url"])
        # Fill form steps
        for step_fn in [fill_step_1, fill_step_2]:
            await step_fn(page, form_data)
        # Handle download
        async with page.expect_download() as dl_info:
            await page.click("button:has-text('Download Report')")
        download = await dl_info.value
        await download.save_as(os.path.join(download_dir, download.suggested_filename))
        await browser.close()
```
## Tools Reference
| Script | Purpose | Key Flags | Output |
|--------|---------|-----------|--------|
| `scraping_toolkit.py` | Generate Playwright scraping script skeleton | `--url`, `--selectors`, `--paginate`, `--output` | Python script or JSON config |
| `form_automation_builder.py` | Generate form-fill automation script from field spec | `--fields`, `--url`, `--output` | Python automation script |
| `anti_detection_checker.py` | Audit a Playwright script for detection vectors | `--file`, `--verbose` | Risk report with score |
All scripts are stdlib-only. Run `python3 <script> --help` for full usage.
## Anti-Patterns
### Hardcoded Waits
**Bad:** `await page.wait_for_timeout(5000)` before every action.
**Good:** Use `wait_for_selector`, `wait_for_url`, `expect_response`, or `wait_for_load_state`. Hardcoded waits are flaky and slow.
### No Error Recovery
**Bad:** Linear script that crashes on first failure.
**Good:** Wrap each page interaction in try/except. Take error-state screenshots. Implement retry with exponential backoff.
### Ignoring robots.txt
**Bad:** Scraping without checking robots.txt directives.
**Good:** Fetch and parse robots.txt before scraping. Respect `Crawl-delay`. Skip disallowed paths. Add your bot name to User-Agent if running at scale.
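The stdlib `urllib.robotparser` handles the parsing. A sketch that checks a URL against already-fetched robots.txt text (the function and bot names are illustrative):

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_txt, url, user_agent="MyScraperBot"):
    """True if robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

`RobotFileParser.crawl_delay(user_agent)` likewise exposes any `Crawl-delay` directive for pacing requests.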
### Storing Credentials in Scripts
**Bad:** Hardcoding usernames and passwords in Python files.
**Good:** Use environment variables, `.env` files (gitignored), or a secrets manager. Pass credentials via CLI arguments.
### No Rate Limiting
**Bad:** Hammering a site with 100 requests/second.
**Good:** Add random delays between requests (1-3s for polite scraping). Monitor for 429 responses. Implement exponential backoff.
### Selector Fragility
**Bad:** Relying on auto-generated class names (`.css-1a2b3c`) or deep nesting (`div > div > div > span:nth-child(3)`).
**Good:** Use data attributes, semantic HTML, or text-based locators. Test selectors in browser DevTools first.
### Not Cleaning Up Browser Instances
**Bad:** Launching browsers without closing them, leading to resource leaks.
**Good:** Always use `try/finally` or async context managers to ensure `browser.close()` is called.
### Running Headed in Production
**Bad:** Using `headless=False` in production/CI.
**Good:** Develop with headed mode for debugging, deploy with `headless=True`. Use environment variable to toggle: `headless = os.environ.get("HEADLESS", "true") == "true"`.
## Cross-References
- **playwright-pro** — Browser testing skill. Use for E2E tests, test assertions, test fixtures. Browser Automation is for data extraction and workflow automation, not testing.
- **api-test-suite-builder** — When the website has a public API, hit the API directly instead of scraping the rendered page. Faster, more reliable, less detectable.
- **performance-profiler** — If your automation scripts are slow, profile the bottlenecks before adding concurrency.
- **env-secrets-manager** — For securely managing credentials used in authenticated automation workflows.


@@ -1,13 +1,13 @@
 ---
 title: "Engineering - POWERFUL Skills — Agent Skills & Codex Plugins"
-description: "44 engineering - powerful skills — advanced agent-native skill and Claude Code plugin for AI agent design, infrastructure, and automation. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
+description: "46 engineering - powerful skills — advanced agent-native skill and Claude Code plugin for AI agent design, infrastructure, and automation. Works with Claude Code, Codex CLI, Gemini CLI, and OpenClaw."
 ---
 <div class="domain-header" markdown>
 # :material-rocket-launch: Engineering - POWERFUL
-<p class="domain-count">44 skills in this domain</p>
+<p class="domain-count">46 skills in this domain</p>
 </div>
@@ -53,6 +53,12 @@
 > You sleep. The agent experiments. You wake up to results.
+- **[Browser Automation - POWERFUL](browser-automation.md)**
+---
+The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation ...
 - **[Changelog Generator](changelog-generator.md)**
 ---
@@ -197,6 +203,12 @@
 ---
+- **[Spec-Driven Workflow — POWERFUL](spec-driven-workflow.md)**
+---
+Spec-driven workflow enforces a single, non-negotiable rule: write the specification BEFORE you write any code. Not a...
 - **[Tech Debt Tracker](tech-debt-tracker.md)**
 ---


@@ -0,0 +1,597 @@
---
title: "Spec-Driven Workflow — Agent Skill for Codex & OpenClaw"
description: "Use when the user asks to write specs before code, define acceptance criteria, plan features before implementation, generate tests from. Agent skill for Claude Code, Codex CLI, Gemini CLI, OpenClaw."
---
# Spec-Driven Workflow
<div class="page-meta" markdown>
<span class="meta-badge">:material-rocket-launch: Engineering - POWERFUL</span>
<span class="meta-badge">:material-identifier: `spec-driven-workflow`</span>
<span class="meta-badge">:material-github: <a href="https://github.com/alirezarezvani/claude-skills/tree/main/engineering/spec-driven-workflow/SKILL.md">Source</a></span>
</div>
<div class="install-banner" markdown>
<span class="install-label">Install:</span> <code>claude /plugin install engineering-advanced-skills</code>
</div>
## Overview
Spec-driven workflow enforces a single, non-negotiable rule: **write the specification BEFORE you write any code.** Not alongside. Not after. Before.
This is not documentation. This is a contract. A spec defines what the system MUST do, what it SHOULD do, and what it explicitly WILL NOT do. Every line of code you write traces back to a requirement in the spec. Every test traces back to an acceptance criterion. If it is not in the spec, it does not get built.
### Why Spec-First Matters
1. **Eliminates rework.** 60-80% of defects originate from requirements, not implementation. Catching ambiguity in a spec costs minutes; catching it in production costs days.
2. **Forces clarity.** If you cannot write what the system should do in plain language, you do not understand the problem well enough to write code.
3. **Enables parallelism.** Once a spec is approved, frontend, backend, QA, and documentation can all start simultaneously.
4. **Creates accountability.** The spec is the definition of done. No arguments about whether a feature is "complete" — either it satisfies the acceptance criteria or it does not.
5. **Feeds TDD directly.** Acceptance criteria in Given/When/Then format translate 1:1 into test cases. The spec IS the test plan.
### The Iron Law
```
NO CODE WITHOUT AN APPROVED SPEC.
NO EXCEPTIONS. NO "QUICK PROTOTYPES." NO "I'LL DOCUMENT IT LATER."
```
If the spec is not written, reviewed, and approved, implementation does not begin. Period.
---
## The Spec Format
Every spec follows this structure. No sections are optional — if a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not forgotten.
### 1. Title and Context
```markdown
# Spec: [Feature Name]
**Author:** [name]
**Date:** [ISO 8601]
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** [list]
**Related specs:** [links]
## Context
[Why does this feature exist? What problem does it solve? What is the business
motivation? Include links to user research, support tickets, or metrics that
justify this work. 2-4 paragraphs maximum.]
```
### 2. Functional Requirements (RFC 2119)
Use RFC 2119 keywords precisely:
| Keyword | Meaning |
|---------|---------|
| **MUST** | Absolute requirement. Failing this means the implementation is non-conformant. |
| **MUST NOT** | Absolute prohibition. Doing this means the implementation is broken. |
| **SHOULD** | Recommended. May be omitted with documented justification. |
| **SHOULD NOT** | Discouraged. May be included with documented justification. |
| **MAY** | Optional. Purely at the implementer's discretion. |
```markdown
## Functional Requirements
- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
- FR-2: The system MUST reject tokens older than 24 hours.
- FR-3: The system SHOULD support refresh token rotation.
- FR-4: The system MAY cache user profiles for up to 5 minutes.
- FR-5: The system MUST NOT store plaintext passwords under any circumstance.
```
Number every requirement. Use `FR-` prefix. Each requirement is a single, testable statement.
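`spec_validator.py` presumably enforces rules like these; a minimal sketch of the numbering-and-keyword check (not the actual script, and the message strings are illustrative):

```python
import re

RFC2119 = ("MUST NOT", "MUST", "SHOULD NOT", "SHOULD", "MAY")

def check_requirements(lines):
    """Flag FR lines with a malformed id or no RFC 2119 keyword."""
    problems = []
    for n, line in enumerate(lines, 1):
        if not line.startswith("- FR"):
            continue
        if not re.match(r"- FR-\d+: ", line):
            problems.append((n, "malformed requirement id"))
        elif not any(kw in line for kw in RFC2119):
            problems.append((n, "missing RFC 2119 keyword"))
    return problems
```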
### 3. Non-Functional Requirements
```markdown
## Non-Functional Requirements
### Performance
- NFR-P1: Login flow MUST complete in < 500ms (p95) under normal load.
- NFR-P2: Token validation MUST complete in < 50ms (p99).
### Security
- NFR-S1: All tokens MUST be transmitted over TLS 1.2+.
- NFR-S2: The system MUST rate-limit login attempts to 5/minute per IP.
### Accessibility
- NFR-A1: Login form MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.
### Scalability
- NFR-SC1: The system SHOULD handle 10,000 concurrent sessions.
### Reliability
- NFR-R1: The authentication service MUST maintain 99.9% uptime.
```
### 4. Acceptance Criteria (Given/When/Then)
Every functional requirement maps to one or more acceptance criteria. Use Gherkin syntax:
```markdown
## Acceptance Criteria
### AC-1: Successful login (FR-1)
Given a user with valid credentials
When they submit the login form with correct email and password
Then they receive a valid access token
And they are redirected to the dashboard
And the login event is logged with timestamp and IP
### AC-2: Expired token rejection (FR-2)
Given a user with an access token issued 25 hours ago
When they make an API request with that token
Then they receive a 401 Unauthorized response
And the response body contains error code "TOKEN_EXPIRED"
And they are NOT redirected (API clients handle their own flow)
### AC-3: Rate limiting (NFR-S2)
Given an IP address that has made 5 failed login attempts in the last minute
When a 6th login attempt arrives from that IP
Then the request is rejected with 429 Too Many Requests
And the response includes a Retry-After header
```
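`test_extractor.py` presumably begins from a parse like this; a hedged sketch (not the actual script) that splits one scenario into Given/When/Then clause lists, with `And` attaching to the preceding keyword:

```python
def split_scenario(text):
    """Split a Gherkin scenario into Given/When/Then clause lists."""
    clauses = {"Given": [], "When": [], "Then": []}
    current = None
    for raw in text.strip().splitlines():
        line = raw.strip()
        for keyword in clauses:
            if line.startswith(keyword + " "):
                current = keyword
                line = line[len(keyword) + 1:]
                break
        else:
            # "And" continues whichever keyword came before it
            if line.startswith("And ") and current:
                line = line[4:]
        if current:
            clauses[current].append(line)
    return clauses
```

Each `Then`/`And` clause then maps to one assertion in the generated test.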
### 5. Edge Cases and Error Scenarios
```markdown
## Edge Cases
- EC-1: User submits login form with empty email → Show validation error, do not hit API.
- EC-2: OAuth provider is down → Show "Service temporarily unavailable", retry after 30s.
- EC-3: User has account but no password (social-only) → Redirect to social login.
- EC-4: Concurrent login from two devices → Both sessions are valid (no single-session enforcement).
- EC-5: Token expires mid-request → Complete the current request, return warning header.
```
### 6. API Contracts
Define request/response shapes using TypeScript-style notation:
````markdown
## API Contracts
### POST /api/auth/login
Request:
```typescript
interface LoginRequest {
  email: string;          // MUST be valid email format
  password: string;       // MUST be 8-128 characters
  rememberMe?: boolean;   // Default: false
}
```
Success Response (200):
```typescript
interface LoginResponse {
  accessToken: string;    // JWT, expires in 24h
  refreshToken: string;   // Opaque, expires in 30d
  expiresIn: number;      // Seconds until access token expires
  user: {
    id: string;
    email: string;
    displayName: string;
  };
}
```
Error Response (401):
```typescript
interface AuthError {
  error: "INVALID_CREDENTIALS" | "TOKEN_EXPIRED" | "ACCOUNT_LOCKED";
  message: string;
  retryAfter?: number;    // Seconds, present for rate-limited responses
}
```
````
### 7. Data Models
```markdown
## Data Models
### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| email | string | Unique, max 255 chars, valid email format |
| passwordHash | string | bcrypt, never exposed via API |
| createdAt | timestamp | UTC, immutable |
| lastLoginAt | timestamp | UTC, updated on each login |
| loginAttempts | integer | Reset to 0 on successful login |
| lockedUntil | timestamp | Null if not locked |
```
### 8. Out of Scope
Explicit exclusions prevent scope creep:
```markdown
## Out of Scope
- OS-1: Multi-factor authentication (separate spec: SPEC-042)
- OS-2: Social login providers beyond Google and GitHub
- OS-3: Admin impersonation of user accounts
- OS-4: Password complexity rules beyond minimum length (deferred to v2)
- OS-5: Session management UI (users cannot see/revoke active sessions yet)
```
If someone asks for an out-of-scope item during implementation, point them to this section. Do not build it.
---
## Bounded Autonomy Rules
These rules define when an agent (human or AI) MUST stop and ask for guidance vs. when they can proceed independently.
### STOP and Ask When:
1. **Scope creep detected.** The implementation requires something not in the spec. Even if it seems obviously needed, STOP. The spec might have excluded it deliberately.
2. **Ambiguity exceeds 30%.** If you cannot determine the correct behavior from the spec for more than 30% of a given requirement, the spec is incomplete. Do not guess.
3. **Breaking changes required.** The implementation would change an existing API contract, database schema, or public interface. Always escalate.
4. **Security implications.** Any change that touches authentication, authorization, encryption, or PII handling requires explicit approval.
5. **Performance characteristics unknown.** If a requirement says "MUST complete in < 500ms" but you have no way to measure or guarantee that, escalate before implementing a guess.
6. **Cross-team dependencies.** If the spec requires coordination with another team or service, confirm the dependency before building against it.
### Continue Autonomously When:
1. **Spec is clear and unambiguous** for the current task.
2. **All acceptance criteria have passing tests** and you are refactoring internals.
3. **Changes are non-breaking** — no public API, schema, or behavior changes.
4. **Implementation is a direct translation** of a well-defined acceptance criterion.
5. **Error handling follows established patterns** already documented in the codebase.
### Escalation Protocol
When you must stop, provide:
```markdown
## Escalation: [Brief Title]
**Blocked on:** [requirement ID, e.g., FR-3]
**Question:** [Specific, answerable question — not "what should I do?"]
**Options considered:**
A. [Option] — Pros: [...] Cons: [...]
B. [Option] — Pros: [...] Cons: [...]
**My recommendation:** [A or B, with reasoning]
**Impact of waiting:** [What is blocked until this is resolved?]
```
Never escalate without a recommendation. Never present an open-ended question. Always give options.
See `references/bounded_autonomy_rules.md` for the complete decision matrix.
---
## Workflow — 6 Phases
### Phase 1: Gather Requirements
**Goal:** Understand what needs to be built and why.
1. **Interview the user.** Ask:
- What problem does this solve?
- Who are the users?
- What does success look like?
- What explicitly should NOT be built?
2. **Read existing code.** Understand the current system before proposing changes.
3. **Identify constraints.** Performance budgets, security requirements, backward compatibility.
4. **List unknowns.** Every unknown is a risk. Surface them now, not during implementation.
**Exit criteria:** You can explain the feature to someone unfamiliar with the project in 2 minutes.
### Phase 2: Write Spec
**Goal:** Produce a complete spec document following The Spec Format above.
1. Fill every section of the template. No section left blank.
2. Number all requirements (FR-*, NFR-*, AC-*, EC-*, OS-*).
3. Use RFC 2119 keywords precisely.
4. Write acceptance criteria in Given/When/Then format.
5. Define API contracts with TypeScript-style types.
6. List explicit exclusions in Out of Scope.
**Exit criteria:** The spec can be handed to a developer who was not in the requirements meeting, and they can implement the feature without asking clarifying questions.
### Phase 3: Validate Spec
**Goal:** Verify the spec is complete, consistent, and implementable.
Run `spec_validator.py` against the spec file:
```bash
python spec_validator.py --file spec.md --strict
```
Manual validation checklist:
- [ ] Every functional requirement has at least one acceptance criterion
- [ ] Every acceptance criterion is testable (no subjective language)
- [ ] API contracts cover all endpoints mentioned in requirements
- [ ] Data models cover all entities mentioned in requirements
- [ ] Edge cases cover failure modes for every external dependency
- [ ] Out of scope is explicit about what was considered and rejected
- [ ] Non-functional requirements have measurable thresholds
**Exit criteria:** Spec scores 80+ on validator, and all manual checklist items pass.
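One core check a validator performs here — requirement-to-criterion traceability — can be sketched in a few lines. This is an illustrative sketch assuming this skill's conventions (`- FR-n: ...` requirement lines, `### AC-n: Title (FR-x)` criterion headings), not the actual `spec_validator.py` implementation:

```python
import re

def find_orphan_requirements(spec_text: str):
    """Return FR IDs that no acceptance criterion heading cites."""
    defined = set(re.findall(r"^- (FR-\d+):", spec_text, re.MULTILINE))
    cited = set()
    for heading in re.findall(r"^### AC-\d+.*$", spec_text, re.MULTILINE):
        cited.update(re.findall(r"FR-\d+", heading))
    return sorted(defined - cited)
```

An orphaned FR means either a missing criterion or a requirement that should be cut — both are blockers for Phase 4.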
### Phase 4: Generate Tests
**Goal:** Extract test cases from acceptance criteria before writing implementation code.
Run `test_extractor.py` against the approved spec:
```bash
python test_extractor.py --file spec.md --framework pytest --output tests/
```
1. Each acceptance criterion becomes one or more test cases.
2. Each edge case becomes a test case.
3. Tests are stubs — they define the assertion but not the implementation.
4. All tests MUST fail initially (red phase of TDD).
**Exit criteria:** You have a test file where every test fails with "not implemented" or equivalent.
### Phase 5: Implement
**Goal:** Write code that makes failing tests pass, one acceptance criterion at a time.
1. Pick one acceptance criterion (start with the simplest).
2. Make its test(s) pass with minimal code.
3. Run the full test suite — no regressions.
4. Commit.
5. Pick the next acceptance criterion. Repeat.
**Rules:**
- Do NOT implement anything not in the spec.
- Do NOT optimize before all acceptance criteria pass.
- Do NOT refactor before all acceptance criteria pass.
- If you discover a missing requirement, STOP and update the spec first.
**Exit criteria:** All tests pass. All acceptance criteria satisfied.
### Phase 6: Self-Review
**Goal:** Verify implementation matches spec before marking done.
Run through the Self-Review Checklist below. If any item fails, fix it before declaring the task complete.
---
## Self-Review Checklist
Before marking any implementation as done, verify ALL of the following:
- [ ] **Every acceptance criterion has a passing test.** No exceptions. If AC-3 exists, a test for AC-3 exists and passes.
- [ ] **Every edge case has a test.** EC-1 through EC-N all have corresponding test cases.
- [ ] **No scope creep.** The implementation does not include features not in the spec. If you added something, either update the spec or remove it.
- [ ] **API contracts match implementation.** Request/response shapes in code match the spec exactly. Field names, types, status codes — all of it.
- [ ] **Error scenarios tested.** Every error response defined in the spec has a test that triggers it.
- [ ] **Non-functional requirements verified.** If the spec says < 500ms, you have evidence (benchmark, load test, profiling) that it meets the threshold.
- [ ] **Data model matches.** Database schema matches the spec. No extra columns, no missing constraints.
- [ ] **Out-of-scope items not built.** Double-check that nothing from the Out of Scope section leaked into the implementation.
---
## Integration with TDD Guide
Spec-driven workflow and TDD are complementary, not competing:
```
Spec-Driven Workflow TDD (Red-Green-Refactor)
───────────────────── ──────────────────────────
Phase 1: Gather Requirements
Phase 2: Write Spec
Phase 3: Validate Spec
Phase 4: Generate Tests ──→ RED: Tests exist and fail
Phase 5: Implement ──→ GREEN: Minimal code to pass
Phase 6: Self-Review ──→ REFACTOR: Clean up internals
```
**The handoff:** Spec-driven workflow produces the test stubs (Phase 4). TDD takes over from there. The spec tells you WHAT to test. TDD tells you HOW to implement.
Use `engineering-team/tdd-guide` for:
- Red-green-refactor cycle discipline
- Coverage analysis and gap detection
- Framework-specific test patterns (Jest, Pytest, JUnit)
Use `engineering/spec-driven-workflow` for:
- Defining what to build before building it
- Acceptance criteria authoring
- Completeness validation
- Scope control
---
## Examples
### Full Spec: User Password Reset
```markdown
# Spec: Password Reset Flow
**Author:** Engineering Team
**Date:** 2026-03-25
**Status:** Approved
## Context
Users who forget their passwords currently have no self-service recovery option.
Support receives ~200 password reset requests per week, costing approximately
8 hours of support time. This feature eliminates that burden entirely.
## Functional Requirements
- FR-1: The system MUST allow users to request a password reset via email.
- FR-2: The system MUST send a reset link that expires after 1 hour.
- FR-3: The system MUST invalidate all previous reset links when a new one is requested.
- FR-4: The system MUST enforce minimum password length of 8 characters on reset.
- FR-5: The system MUST NOT reveal whether an email exists in the system.
- FR-6: The system SHOULD log all reset attempts for audit purposes.
## Acceptance Criteria
### AC-1: Request reset (FR-1, FR-5)
Given a user on the password reset page
When they enter any email address and submit
Then they see "If an account exists, a reset link has been sent"
And the response is identical whether the email exists or not
### AC-2: Valid reset link (FR-2)
Given a user who received a reset email 30 minutes ago
When they click the reset link
Then they see the password reset form
### AC-3: Expired reset link (FR-2)
Given a user who received a reset email 2 hours ago
When they click the reset link
Then they see "This link has expired. Please request a new one."
### AC-4: Previous links invalidated (FR-3)
Given a user who requested two reset emails
When they click the link from the first email
Then they see "This link is no longer valid."
## Edge Cases
- EC-1: User submits reset for non-existent email → Same success message (FR-5).
- EC-2: User clicks reset link twice → Second click shows "already used" if password was changed.
- EC-3: Email delivery fails → Log error, do not retry automatically.
- EC-4: User requests reset while already logged in → Allow it, do not force logout.
## Out of Scope
- OS-1: Security questions as alternative reset method.
- OS-2: SMS-based password reset.
- OS-3: Admin-initiated password reset (separate spec).
```
### Extracted Test Cases (from above spec)
```python
# Generated by test_extractor.py --framework pytest
class TestPasswordReset:
def test_ac1_request_reset_existing_email(self):
"""AC-1: Request reset with existing email shows generic message."""
# Given a user on the password reset page
# When they enter a registered email and submit
# Then they see "If an account exists, a reset link has been sent"
raise NotImplementedError("Implement this test")
def test_ac1_request_reset_nonexistent_email(self):
"""AC-1: Request reset with unknown email shows same generic message."""
# Given a user on the password reset page
# When they enter an unregistered email and submit
# Then they see identical response to existing email case
raise NotImplementedError("Implement this test")
def test_ac2_valid_reset_link(self):
"""AC-2: Reset link works within expiry window."""
raise NotImplementedError("Implement this test")
def test_ac3_expired_reset_link(self):
"""AC-3: Reset link rejected after 1 hour."""
raise NotImplementedError("Implement this test")
def test_ac4_previous_links_invalidated(self):
"""AC-4: Old reset links stop working when new one is requested."""
raise NotImplementedError("Implement this test")
def test_ec1_nonexistent_email_same_response(self):
"""EC-1: Non-existent email produces identical response."""
raise NotImplementedError("Implement this test")
def test_ec2_reset_link_used_twice(self):
"""EC-2: Already-used reset link shows appropriate message."""
raise NotImplementedError("Implement this test")
```
---
## Anti-Patterns
### 1. Coding Before Spec Approval
**Symptom:** "I'll start coding while the spec is being reviewed."
**Problem:** The review will surface changes. Now you have code that implements a rejected design.
**Rule:** Implementation does not begin until spec status is "Approved."
### 2. Vague Acceptance Criteria
**Symptom:** "The system should work well" or "The UI should be responsive."
**Problem:** Untestable. What does "well" mean? What does "responsive" mean?
**Rule:** Every acceptance criterion must be verifiable by a machine. If you cannot write a test for it, rewrite the criterion.
### 3. Missing Edge Cases
**Symptom:** Happy path is specified, error paths are not.
**Problem:** Developers invent error handling on the fly, leading to inconsistent behavior.
**Rule:** For every external dependency (API, database, file system, user input), specify at least one failure scenario.
### 4. Spec as Post-Hoc Documentation
**Symptom:** "Let me write the spec now that the feature is done."
**Problem:** This is documentation, not specification. It describes what was built, not what should have been built. It cannot catch design errors because the design is already frozen.
**Rule:** If the spec was written after the code, it is not a spec. Relabel it as documentation.
### 5. Gold-Plating Beyond Spec
**Symptom:** "While I was in there, I also added..."
**Problem:** Untested code. Unreviewed design. Potential for subtle bugs in the "bonus" feature.
**Rule:** If it is not in the spec, it does not get built. File a new spec for additional features.
### 6. Acceptance Criteria Without Requirement Traceability
**Symptom:** AC-7 exists but does not reference any FR-* or NFR-*.
**Problem:** Orphaned criteria mean either a requirement is missing or the criterion is unnecessary.
**Rule:** Every AC-* MUST reference at least one FR-* or NFR-*.
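The inverse check — criteria that cite nothing — is just as mechanical. A sketch assuming the `### AC-n: Title (FR-x)` heading convention (not the `spec_validator.py` implementation):

```python
import re

def acs_without_traceability(spec_text: str):
    """Return AC IDs whose heading cites no FR-* or NFR-*."""
    orphans = []
    for m in re.finditer(r"^### (AC-\d+):?(.*)$", spec_text, re.MULTILINE):
        if not re.search(r"\b(?:FR|NFR)-\d+\b", m.group(2)):
            orphans.append(m.group(1))
    return orphans
```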
### 7. Skipping Validation
**Symptom:** "The spec looks fine, let's just start."
**Problem:** Missing sections discovered during implementation cause blocking delays.
**Rule:** Always run `spec_validator.py --strict` before starting implementation. Fix all warnings.
---
## Cross-References
- **`engineering-team/tdd-guide`** — Red-green-refactor cycle, test generation, coverage analysis. Use after Phase 4 of this workflow.
- **`engineering/focused-fix`** — Deep-dive feature repair. When a spec-driven implementation has systemic issues, use focused-fix for diagnosis.
- **`engineering/rag-architect`** — If the feature involves retrieval or knowledge systems, use rag-architect for the technical design within the spec.
- **`references/spec_format_guide.md`** — Complete template with section-by-section explanations.
- **`references/bounded_autonomy_rules.md`** — Full decision matrix for when to stop vs. continue.
- **`references/acceptance_criteria_patterns.md`** — Pattern library for writing Given/When/Then criteria.
---
## Tools
| Script | Purpose | Key Flags |
|--------|---------|-----------|
| `spec_generator.py` | Generate spec template from feature name/description | `--name`, `--description`, `--format`, `--json` |
| `spec_validator.py` | Validate spec completeness (0-100 score) | `--file`, `--strict`, `--json` |
| `test_extractor.py` | Extract test stubs from acceptance criteria | `--file`, `--framework`, `--output`, `--json` |
```bash
# Generate a spec template
python spec_generator.py --name "User Authentication" --description "OAuth 2.0 login flow"
# Validate a spec
python spec_validator.py --file specs/auth.md --strict
# Extract test cases
python test_extractor.py --file specs/auth.md --framework pytest --output tests/test_auth.py
```


@@ -1,6 +1,6 @@
{
"name": "engineering-advanced-skills",
"description": "31 advanced engineering skills: agent designer, agent workflow designer, AgentHub, RAG architect, database designer, migration architect, observability designer, dependency auditor, release manager, API reviewer, CI/CD pipeline builder, MCP server builder, skill security auditor, performance profiler, Helm chart builder, Terraform patterns, focused-fix, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
"description": "33 advanced engineering skills: agent designer, agent workflow designer, AgentHub, RAG architect, database designer, migration architect, observability designer, dependency auditor, release manager, API reviewer, CI/CD pipeline builder, MCP server builder, skill security auditor, performance profiler, Helm chart builder, Terraform patterns, focused-fix, browser-automation, spec-driven-workflow, and more. Agent skill and plugin for Claude Code, Codex, Gemini CLI, Cursor, OpenClaw.",
"version": "2.1.2",
"author": {
"name": "Alireza Rezvani",


@@ -0,0 +1,564 @@
---
name: "browser-automation"
description: "Use when the user asks to automate browser tasks, scrape websites, fill forms, capture screenshots, extract structured data from web pages, or build web automation workflows. NOT for testing — use playwright-pro for that."
---
# Browser Automation - POWERFUL
## Overview
The Browser Automation skill provides comprehensive tools and knowledge for building production-grade web automation workflows using Playwright. This skill covers data extraction, form filling, screenshot capture, session management, and anti-detection patterns for reliable browser automation at scale.
**When to use this skill:**
- Scraping structured data from websites (tables, listings, search results)
- Automating multi-step browser workflows (login, fill forms, download files)
- Capturing screenshots or PDFs of web pages
- Extracting data from SPAs and JavaScript-heavy sites
- Building repeatable browser-based data pipelines
**When NOT to use this skill:**
- Writing browser tests or E2E test suites — use **playwright-pro** instead
- Testing API endpoints — use **api-test-suite-builder** instead
- Load testing or performance benchmarking — use **performance-profiler** instead
**Why Playwright over Selenium or Puppeteer:**
- **Auto-wait built in** — no explicit `sleep()` or `waitForElement()` needed for most actions
- **Multi-browser from one API** — Chromium, Firefox, WebKit with zero config changes
- **Network interception** — block ads, mock responses, capture API calls natively
- **Browser contexts** — isolated sessions without spinning up new browser instances
- **Codegen** — `playwright codegen` records your actions and generates scripts
- **Async-first** — Python async/await for high-throughput scraping
## Core Competencies
### 1. Web Scraping Patterns
#### DOM Extraction with CSS Selectors
CSS selectors are the primary tool for element targeting. Prefer them over XPath for readability and performance.
**Selector priority (most to least reliable):**
1. `data-testid`, `data-id`, or custom data attributes — stable across redesigns
2. `#id` selectors — unique but may change between deploys
3. Semantic selectors: `article`, `nav`, `main`, `section` — resilient to CSS changes
4. Class-based: `.product-card`, `.price` — brittle if classes are generated (e.g., CSS modules)
5. Positional: `nth-child()`, `nth-of-type()` — last resort, breaks on layout changes
**Compound selectors for precision:**
```python
# Product cards within a specific container
page.query_selector_all("div.search-results > article.product-card")
# Price inside a product card (scoped)
card.query_selector("span[data-field='price']")
# Links with specific text content
page.locator("a", has_text="Next Page")
```
#### XPath for Complex Traversal
Use XPath only when CSS cannot express the relationship:
```python
# Find element by text content (XPath strength)
page.locator("//td[contains(text(), 'Total')]/following-sibling::td[1]")
# Navigate up the DOM tree
page.locator("//span[@class='price']/ancestor::div[@class='product']")
```
#### Pagination Patterns
- **Next-button pagination**: Click "Next" until disabled or absent
- **URL-based pagination**: Increment `?page=N` or `&offset=N` in URL
- **Infinite scroll**: Scroll to bottom, wait for new content, repeat until no change
- **Load-more button**: Click button, wait for DOM mutation, repeat
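Next-button pagination appears in Workflow 2 below; URL-based pagination can be sketched as follows (the `page` query parameter and the container selector are assumptions about the target site):

```python
def page_url(base_url: str, page_num: int) -> str:
    # append page=N with the right separator, with or without an existing query string
    sep = "&" if "?" in base_url else "?"
    return f"{base_url}{sep}page={page_num}"

async def scrape_url_paginated(page, base_url, container_sel, max_pages=20):
    all_items = []
    for n in range(1, max_pages + 1):
        await page.goto(page_url(base_url, n))
        cards = await page.query_selector_all(container_sel)
        if not cards:
            break  # walked past the last page
        for card in cards:
            all_items.append((await card.text_content()).strip())
    return all_items
```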
#### Infinite Scroll Handling
```python
async def scroll_to_bottom(page, max_scrolls=50, pause_ms=1500):
    previous_height = 0
    scrolls = 0
    for _ in range(max_scrolls):
        current_height = await page.evaluate("document.body.scrollHeight")
        if current_height == previous_height:
            break  # no new content loaded since the last scroll
        await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        await page.wait_for_timeout(pause_ms)
        previous_height = current_height
        scrolls += 1
    return scrolls  # number of scrolls actually performed
```
### 2. Form Filling & Multi-Step Workflows
#### Login Flows
```python
async def login(page, url, username, password):
await page.goto(url)
await page.fill("input[name='username']", username)
await page.fill("input[name='password']", password)
await page.click("button[type='submit']")
# Wait for navigation to complete (post-login redirect)
await page.wait_for_url("**/dashboard**")
```
#### Multi-Page Forms
Break multi-step forms into discrete functions per step. Each function:
1. Fills the fields for that step
2. Clicks the "Next" or "Continue" button
3. Waits for the next step to load (URL change or DOM element)
```python
async def fill_step_1(page, data):
await page.fill("#first-name", data["first_name"])
await page.fill("#last-name", data["last_name"])
await page.select_option("#country", data["country"])
await page.click("button:has-text('Continue')")
await page.wait_for_selector("#step-2-form")
async def fill_step_2(page, data):
await page.fill("#address", data["address"])
await page.fill("#city", data["city"])
await page.click("button:has-text('Continue')")
await page.wait_for_selector("#step-3-form")
```
#### File Uploads
```python
# Single file
await page.set_input_files("input[type='file']", "/path/to/file.pdf")
# Multiple files
await page.set_input_files("input[type='file']", [
"/path/to/file1.pdf",
"/path/to/file2.pdf"
])
# Drag-and-drop upload zones (no visible input element)
async with page.expect_file_chooser() as fc_info:
await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```
#### Dropdown and Select Handling
```python
# Native <select> element
await page.select_option("#country", value="US")
await page.select_option("#country", label="United States")
# Custom dropdown (div-based)
await page.click("div.dropdown-trigger")
await page.click("div.dropdown-option:has-text('United States')")
```
### 3. Screenshot & PDF Capture
#### Screenshot Strategies
```python
# Full page (scrolls automatically)
await page.screenshot(path="full-page.png", full_page=True)
# Viewport only (what's visible)
await page.screenshot(path="viewport.png")
# Specific element
element = page.locator("div.chart-container")
await element.screenshot(path="chart.png")
# With custom viewport for consistency
context = await browser.new_context(viewport={"width": 1920, "height": 1080})
```
#### PDF Generation
```python
# Only works in Chromium
await page.pdf(
path="output.pdf",
format="A4",
margin={"top": "1cm", "right": "1cm", "bottom": "1cm", "left": "1cm"},
print_background=True
)
```
#### Visual Regression Baselines
Take screenshots at known states and compare pixel-by-pixel. Store baselines in version control. Use naming conventions: `{page}_{viewport}_{state}.png`.
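A minimal sketch of the naming convention plus an exact-match comparison. This only detects byte-identical screenshots; tolerance-based pixel diffs need an image library such as Pillow, which is outside this stdlib-only toolkit:

```python
import hashlib
from pathlib import Path

def baseline_name(page_name: str, viewport: str, state: str) -> str:
    return f"{page_name}_{viewport}_{state}.png"

def screenshots_match(baseline: str, candidate: str) -> bool:
    # exact byte comparison; any rendering difference fails the check
    def digest(p):
        return hashlib.sha256(Path(p).read_bytes()).hexdigest()
    return digest(baseline) == digest(candidate)
```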
### 4. Structured Data Extraction
#### Tables to JSON
```python
async def extract_table(page, selector):
headers = await page.eval_on_selector_all(
f"{selector} thead th",
"elements => elements.map(e => e.textContent.trim())"
)
rows = await page.eval_on_selector_all(
f"{selector} tbody tr",
"""rows => rows.map(row => {
return Array.from(row.querySelectorAll('td'))
.map(cell => cell.textContent.trim())
})"""
)
return [dict(zip(headers, row)) for row in rows]
```
#### Listings to Arrays
```python
async def extract_listings(page, container_sel, field_map):
"""
field_map example: {"title": "h3.title", "price": "span.price", "url": "a::attr(href)"}
"""
items = []
cards = await page.query_selector_all(container_sel)
for card in cards:
item = {}
for field, sel in field_map.items():
if "::attr(" in sel:
attr_sel, attr_name = sel.split("::attr(")
attr_name = attr_name.rstrip(")")
el = await card.query_selector(attr_sel)
item[field] = await el.get_attribute(attr_name) if el else None
else:
el = await card.query_selector(sel)
item[field] = (await el.text_content()).strip() if el else None
items.append(item)
return items
```
#### Nested Data Extraction
For threaded content (comments with replies), use recursive extraction:
```python
async def extract_comments(root, parent_selector):
    # `root` may be a Page or an ElementHandle — both support query_selector_all
    comments = []
    elements = await root.query_selector_all(f"{parent_selector} > .comment")
    for el in elements:
        body_el = await el.query_selector(".comment-body")
        author_el = await el.query_selector(".author")
        replies = await extract_comments(el, ".replies")
        comments.append({
            "author": (await author_el.text_content()).strip() if author_el else None,
            "text": (await body_el.text_content()).strip() if body_el else None,
            "replies": replies
        })
    return comments
```
### 5. Cookie & Session Management
#### Save and Restore Sessions
```python
import json
# Save cookies after login
cookies = await context.cookies()
with open("session.json", "w") as f:
json.dump(cookies, f)
# Restore session in new context
with open("session.json", "r") as f:
cookies = json.load(f)
context = await browser.new_context()
await context.add_cookies(cookies)
```
#### Storage State (Cookies + Local Storage)
```python
# Save full state (cookies + localStorage + sessionStorage)
await context.storage_state(path="state.json")
# Restore full state
context = await browser.new_context(storage_state="state.json")
```
**Best practice:** Save state after login, reuse across scraping sessions. Check session validity before starting a long job — make a lightweight request to a protected page and verify you are not redirected to login.
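That validity check can be sketched as follows (the `/login` path is an assumption about the target site):

```python
from urllib.parse import urlparse

def is_login_redirect(final_url: str, login_path: str = "/login") -> bool:
    return urlparse(final_url).path.startswith(login_path)

async def session_is_valid(page, protected_url: str) -> bool:
    # a stale session typically redirects the protected page to the login form
    await page.goto(protected_url, wait_until="domcontentloaded")
    return not is_login_redirect(page.url)
```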
### 6. Anti-Detection Patterns
Modern websites detect automation through multiple vectors. Address all of them:
#### User Agent Rotation
Never use the default Playwright user agent. Rotate through real browser user agents:
```python
USER_AGENTS = [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
```
#### Viewport and Screen Size
Set realistic viewport dimensions. An unchanged library default (Playwright's is 1280x720) is a fingerprinting red flag:
```python
context = await browser.new_context(
viewport={"width": 1920, "height": 1080},
screen={"width": 1920, "height": 1080},
user_agent=random.choice(USER_AGENTS),
)
```
#### WebDriver Flag Removal
Playwright sets `navigator.webdriver = true`. Remove it:
```python
await page.add_init_script("""
Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")
```
#### Request Throttling
Add human-like delays between actions:
```python
import asyncio
import random

async def human_delay(min_ms=500, max_ms=2000):
    # asyncio.sleep needs no page handle, so any coroutine can call this
    await asyncio.sleep(random.randint(min_ms, max_ms) / 1000)
```
#### Proxy Support
```python
browser = await playwright.chromium.launch(
proxy={"server": "http://proxy.example.com:8080"}
)
# Or per-context:
context = await browser.new_context(
proxy={"server": "http://proxy.example.com:8080",
"username": "user", "password": "pass"}
)
```
### 7. Dynamic Content Handling
#### SPA Rendering
SPAs render content client-side. Wait for the actual content, not the page load:
```python
await page.goto(url)
# Wait for the data to render, not just the shell
await page.wait_for_selector("div.product-list article", state="attached")
```
#### AJAX / Fetch Waiting
Intercept and wait for specific API calls:
```python
async with page.expect_response("**/api/products*") as response_info:
await page.click("button.load-more")
response = await response_info.value
data = await response.json() # You can use the API data directly
```
#### Shadow DOM Traversal
```python
# Playwright's CSS engine pierces open Shadow DOM automatically;
# >> chains selectors so each segment resolves inside the previous match
await page.locator("custom-element >> .inner-class").click()
```
#### Lazy-Loaded Images
Scroll elements into view to trigger lazy loading:
```python
images = await page.query_selector_all("img[data-src]")
for img in images:
await img.scroll_into_view_if_needed()
await page.wait_for_timeout(200)
```
### 8. Error Handling & Retry Logic
#### Retry Decorator Pattern
```python
import asyncio
async def with_retry(coro_factory, max_retries=3, backoff_base=2):
for attempt in range(max_retries):
try:
return await coro_factory()
except Exception as e:
if attempt == max_retries - 1:
raise
wait = backoff_base ** attempt
print(f"Attempt {attempt + 1} failed: {e}. Retrying in {wait}s...")
await asyncio.sleep(wait)
```
#### Handling Common Failures
```python
from playwright.async_api import TimeoutError as PlaywrightTimeout
try:
await page.click("button.submit", timeout=5000)
except PlaywrightTimeout:
# Element did not appear — page structure may have changed
# Try fallback selector
await page.click("[type='submit']", timeout=5000)
except Exception as e:
# Network error, browser crash, etc.
await page.screenshot(path="error-state.png")
raise
```
#### Rate Limit Detection
```python
import asyncio

async def check_rate_limit(response):
    if response.status == 429:
        retry_after = response.headers.get("retry-after", "60")
        try:
            wait_seconds = int(retry_after)
        except ValueError:
            wait_seconds = 60  # Retry-After may also be an HTTP date; fall back
        print(f"Rate limited. Waiting {wait_seconds}s...")
        await asyncio.sleep(wait_seconds)
        return True
    return False
```
## Workflows
### Workflow 1: Single-Page Data Extraction
**Scenario:** Extract product data from a single page with JavaScript-rendered content.
**Steps:**
1. Launch browser in headed mode during development (`headless=False`), switch to headless for production
2. Navigate to URL and wait for content selector
3. Extract data using `query_selector_all` with field mapping
4. Validate extracted data (check for nulls, expected types)
5. Output as JSON
```python
async def extract_single_page(url, selectors):
async with async_playwright() as p:
browser = await p.chromium.launch(headless=True)
context = await browser.new_context(
viewport={"width": 1920, "height": 1080},
user_agent="Mozilla/5.0 ..."
)
page = await context.new_page()
await page.goto(url, wait_until="networkidle")
data = await extract_listings(page, selectors["container"], selectors["fields"])
await browser.close()
return data
```
### Workflow 2: Multi-Page Scraping with Pagination
**Scenario:** Scrape search results across 50+ pages.
**Steps:**
1. Launch browser with anti-detection settings
2. Navigate to first page
3. Extract data from current page
4. Check if "Next" button exists and is enabled
5. Click next, wait for new content to load (not just navigation)
6. Repeat until no next page or max pages reached
7. Deduplicate results by unique key
8. Write output incrementally (don't hold everything in memory)
```python
async def scrape_paginated(base_url, selectors, max_pages=100):
all_data = []
async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        context = await browser.new_context(  # per step 1: anti-detection settings
            viewport={"width": 1920, "height": 1080},
            user_agent=random.choice(USER_AGENTS),
        )
        page = await context.new_page()
        await page.goto(base_url)
for page_num in range(max_pages):
items = await extract_listings(page, selectors["container"], selectors["fields"])
all_data.extend(items)
next_btn = page.locator(selectors["next_button"])
if await next_btn.count() == 0 or await next_btn.is_disabled():
break
await next_btn.click()
await page.wait_for_selector(selectors["container"])
await human_delay(800, 2000)
await browser.close()
return all_data
```
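Steps 7 and 8 (deduplicate, write incrementally) can be sketched with stdlib helpers — call `append_jsonl` once per page instead of returning one large list, so a crash loses at most the current page (the `url` key is an assumed unique field):

```python
import json

def dedupe(items, key="url"):
    """Keep the first item for each distinct value of `key`."""
    seen, unique = set(), []
    for item in items:
        k = item.get(key)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique

def append_jsonl(path, items):
    """Append one JSON object per line to a JSON Lines file."""
    with open(path, "a", encoding="utf-8") as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + "\n")
```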
### Workflow 3: Authenticated Workflow Automation
**Scenario:** Log into a portal, navigate a multi-step form, download a report.
**Steps:**
1. Check for existing session state file
2. If no session, perform login and save state
3. Navigate to target page using saved session
4. Fill multi-step form with provided data
5. Wait for download to trigger
6. Save downloaded file to target directory
```python
async def authenticated_workflow(credentials, form_data, download_dir):
async with async_playwright() as p:
browser = await p.chromium.launch(headless=True)
state_file = "session_state.json"
# Restore or create session
if os.path.exists(state_file):
context = await browser.new_context(storage_state=state_file)
else:
context = await browser.new_context()
page = await context.new_page()
await login(page, credentials["url"], credentials["user"], credentials["pass"])
await context.storage_state(path=state_file)
page = await context.new_page()
await page.goto(form_data["target_url"])
# Fill form steps
for step_fn in [fill_step_1, fill_step_2]:
await step_fn(page, form_data)
# Handle download
async with page.expect_download() as dl_info:
await page.click("button:has-text('Download Report')")
download = await dl_info.value
await download.save_as(os.path.join(download_dir, download.suggested_filename))
await browser.close()
```
## Tools Reference
| Script | Purpose | Key Flags | Output |
|--------|---------|-----------|--------|
| `scraping_toolkit.py` | Generate Playwright scraping script skeleton | `--url`, `--selectors`, `--paginate`, `--output` | Python script or JSON config |
| `form_automation_builder.py` | Generate form-fill automation script from field spec | `--fields`, `--url`, `--output` | Python automation script |
| `anti_detection_checker.py` | Audit a Playwright script for detection vectors | `--file`, `--verbose` | Risk report with score |
All scripts are stdlib-only. Run `python3 <script> --help` for full usage.
## Anti-Patterns
### Hardcoded Waits
**Bad:** `await page.wait_for_timeout(5000)` before every action.
**Good:** Use `wait_for_selector`, `wait_for_url`, `expect_response`, or `wait_for_load_state`. Hardcoded waits are flaky and slow.
### No Error Recovery
**Bad:** Linear script that crashes on first failure.
**Good:** Wrap each page interaction in try/except. Take error-state screenshots. Implement retry with exponential backoff.
### Ignoring robots.txt
**Bad:** Scraping without checking robots.txt directives.
**Good:** Fetch and parse robots.txt before scraping. Respect `Crawl-delay`. Skip disallowed paths. Add your bot name to User-Agent if running at scale.
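The stdlib covers the parsing. A sketch that evaluates an already-fetched robots.txt body against a target URL (the bot name is a placeholder):

```python
from urllib import robotparser

def robots_allows(robots_txt: str, url: str, user_agent: str = "MyScraperBot") -> bool:
    """Check whether robots.txt permits fetching `url` for this user agent."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)
```

`RobotFileParser.crawl_delay(user_agent)` also exposes any `Crawl-delay` directive, which should feed your request throttling.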
### Storing Credentials in Scripts
**Bad:** Hardcoding usernames and passwords in Python files.
**Good:** Use environment variables, `.env` files (gitignored), or a secrets manager. Pass credentials via CLI arguments.
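A minimal environment-variable sketch (the variable names are placeholders):

```python
import os

def load_credentials():
    # raises KeyError loudly if a variable is missing,
    # rather than silently logging in with empty strings
    return os.environ["SCRAPER_USER"], os.environ["SCRAPER_PASS"]
```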
### No Rate Limiting
**Bad:** Hammering a site with 100 requests/second.
**Good:** Add random delays between requests (1-3s for polite scraping). Monitor for 429 responses. Implement exponential backoff.
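One way to combine polite jitter with 429 backoff is a single delay function, sketched here with illustrative defaults:

```python
import random

def next_delay(status_code, attempt, base=1.0, cap=60.0):
    """Polite 1-3s jitter normally; exponential backoff after a 429."""
    if status_code == 429:
        # Double the wait on each consecutive 429, capped at `cap` seconds
        return min(cap, base * (2 ** attempt))
    return random.uniform(1.0, 3.0)

normal = next_delay(200, attempt=0)
throttled = [next_delay(429, attempt=a) for a in range(8)]
```

Reset `attempt` to zero as soon as a request succeeds so the backoff does not stick.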
### Selector Fragility
**Bad:** Relying on auto-generated class names (`.css-1a2b3c`) or deep nesting (`div > div > div > span:nth-child(3)`).
**Good:** Use data attributes, semantic HTML, or text-based locators. Test selectors in browser DevTools first.
### Not Cleaning Up Browser Instances
**Bad:** Launching browsers without closing them, leading to resource leaks.
**Good:** Always use `try/finally` or async context managers to ensure `browser.close()` is called.
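The guarantee is easy to demonstrate without launching anything; here a stub stands in for the Playwright browser object:

```python
import asyncio

class StubBrowser:
    """Stand-in for a Playwright browser; records whether close() ran."""
    def __init__(self):
        self.closed = False
    async def close(self):
        self.closed = True

async def run_job(browser, should_fail):
    try:
        if should_fail:
            raise RuntimeError("page interaction failed")
        return "done"
    finally:
        # Runs on success AND on failure, so the process never leaks
        await browser.close()

ok_browser, bad_browser = StubBrowser(), StubBrowser()
result = asyncio.run(run_job(ok_browser, should_fail=False))
try:
    asyncio.run(run_job(bad_browser, should_fail=True))
except RuntimeError:
    pass
```

With the real library, `async with async_playwright() as p:` gives the same guarantee for the driver process itself.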
### Running Headed in Production
**Bad:** Using `headless=False` in production/CI.
**Good:** Develop with headed mode for debugging, deploy with `headless=True`. Use environment variable to toggle: `headless = os.environ.get("HEADLESS", "true") == "true"`.
## Cross-References
- **playwright-pro** — Browser testing skill. Use for E2E tests, test assertions, test fixtures. Browser Automation is for data extraction and workflow automation, not testing.
- **api-test-suite-builder** — When the website has a public API, hit the API directly instead of scraping the rendered page. Faster, more reliable, less detectable.
- **performance-profiler** — If your automation scripts are slow, profile the bottlenecks before adding concurrency.
- **env-secrets-manager** — For securely managing credentials used in authenticated automation workflows.


@@ -0,0 +1,520 @@
#!/usr/bin/env python3
"""
Anti-Detection Checker - Audits Playwright scripts for common bot detection vectors.
Analyzes a Playwright automation script and identifies patterns that make the
browser detectable as a bot. Produces a risk score (0-100) with specific
recommendations for each issue found.
Detection vectors checked:
- Headless mode usage
- Default/missing user agent configuration
- Viewport size (framework defaults like 1280x720 are a red flag)
- WebDriver flag (navigator.webdriver)
- Navigator property overrides
- Request throttling / human-like delays
- Cookie/session management
- Proxy configuration
- Error handling patterns
No external dependencies - uses only Python standard library.
"""
import argparse
import json
import os
import re
import sys
from dataclasses import dataclass, asdict
from typing import List, Optional
@dataclass
class Finding:
"""A single detection risk finding."""
category: str
severity: str # "critical", "high", "medium", "low", "info"
description: str
line: Optional[int]
recommendation: str
weight: int # Points added to risk score (0-15)
SEVERITY_WEIGHTS = {
"critical": 15,
"high": 10,
"medium": 5,
"low": 2,
"info": 0,
}
class AntiDetectionChecker:
"""Analyzes Playwright scripts for bot detection vulnerabilities."""
def __init__(self, script_content: str, file_path: str = "<stdin>"):
self.content = script_content
self.lines = script_content.split("\n")
self.file_path = file_path
self.findings: List[Finding] = []
def check_all(self) -> List[Finding]:
"""Run all detection checks."""
self._check_headless_mode()
self._check_user_agent()
self._check_viewport()
self._check_webdriver_flag()
self._check_navigator_properties()
self._check_request_delays()
self._check_error_handling()
self._check_proxy()
self._check_session_management()
self._check_browser_close()
self._check_stealth_imports()
return self.findings
def _find_line(self, pattern: str) -> Optional[int]:
"""Find the first line number matching a regex pattern."""
for i, line in enumerate(self.lines, 1):
if re.search(pattern, line):
return i
return None
def _has_pattern(self, pattern: str) -> bool:
"""Check if pattern exists anywhere in the script."""
return bool(re.search(pattern, self.content))
def _check_headless_mode(self):
"""Check if headless mode is properly configured."""
if self._has_pattern(r"headless\s*=\s*False"):
self.findings.append(Finding(
category="Headless Mode",
severity="high",
description="Browser launched in headed mode (headless=False). This is fine for development but should be headless=True in production.",
line=self._find_line(r"headless\s*=\s*False"),
recommendation="Use headless=True for production. Toggle via environment variable: headless=os.environ.get('HEADLESS', 'true') == 'true'",
weight=SEVERITY_WEIGHTS["high"],
))
elif not self._has_pattern(r"headless"):
# Default is headless=True in Playwright, which is correct
self.findings.append(Finding(
category="Headless Mode",
severity="info",
description="Using default headless mode (True). Good for production.",
line=None,
recommendation="No action needed. Default headless=True is correct.",
weight=SEVERITY_WEIGHTS["info"],
))
def _check_user_agent(self):
"""Check if a custom user agent is set."""
has_ua = self._has_pattern(r"user_agent\s*=") or self._has_pattern(r"userAgent")
has_ua_list = self._has_pattern(r"USER_AGENTS?\s*=\s*\[")
has_random_ua = self._has_pattern(r"random\.choice.*(?:USER_AGENT|user_agent|ua)")
if not has_ua:
self.findings.append(Finding(
category="User Agent",
severity="critical",
description="No custom user agent configured. Playwright's default user agent contains 'HeadlessChrome' which is trivially detected.",
line=None,
recommendation="Set a realistic user agent: context = await browser.new_context(user_agent='Mozilla/5.0 ...')",
weight=SEVERITY_WEIGHTS["critical"],
))
elif has_ua_list and has_random_ua:
self.findings.append(Finding(
category="User Agent",
severity="info",
description="User agent rotation detected. Good anti-detection practice.",
line=self._find_line(r"USER_AGENTS?\s*=\s*\["),
recommendation="Ensure user agents are recent and match the browser being launched (e.g., Chrome UA for Chromium).",
weight=SEVERITY_WEIGHTS["info"],
))
elif has_ua:
self.findings.append(Finding(
category="User Agent",
severity="low",
description="Custom user agent set but no rotation detected. Single user agent is fingerprint-able at scale.",
line=self._find_line(r"user_agent\s*="),
recommendation="Rotate through 5-10 recent user agents using random.choice().",
weight=SEVERITY_WEIGHTS["low"],
))
def _check_viewport(self):
"""Check viewport configuration."""
has_viewport = self._has_pattern(r"viewport\s*=\s*\{") or self._has_pattern(r"viewport.*width")
if not has_viewport:
self.findings.append(Finding(
category="Viewport Size",
severity="high",
description="No viewport configured. Default Playwright viewport (1280x720) is common among bots. Sites may flag unusual viewport distributions.",
line=None,
recommendation="Set a common desktop viewport: viewport={'width': 1920, 'height': 1080}. Vary across runs.",
weight=SEVERITY_WEIGHTS["high"],
))
else:
# Check for suspiciously small viewports
match = re.search(r"width['\"]?\s*[:=]\s*(\d+)", self.content)
if match:
width = int(match.group(1))
if width < 1024:
self.findings.append(Finding(
category="Viewport Size",
severity="medium",
description=f"Viewport width {width}px is unusually small. Most desktop browsers are 1366px+ wide.",
line=self._find_line(r"width.*" + str(width)),
recommendation="Use 1366x768 (most common) or 1920x1080. Avoid unusual sizes like 800x600.",
weight=SEVERITY_WEIGHTS["medium"],
))
else:
self.findings.append(Finding(
category="Viewport Size",
severity="info",
description=f"Viewport width {width}px is reasonable.",
line=self._find_line(r"width.*" + str(width)),
recommendation="No action needed.",
weight=SEVERITY_WEIGHTS["info"],
))
def _check_webdriver_flag(self):
"""Check if navigator.webdriver is being removed."""
has_webdriver_override = (
self._has_pattern(r"navigator.*webdriver") or
self._has_pattern(r"webdriver.*undefined") or
self._has_pattern(r"add_init_script.*webdriver")
)
if not has_webdriver_override:
self.findings.append(Finding(
category="WebDriver Flag",
severity="critical",
description="navigator.webdriver is not overridden. This is the most common bot detection check. Every major anti-bot service tests this property.",
line=None,
recommendation=(
"Add init script to remove the flag:\n"
" await page.add_init_script(\"Object.defineProperty(navigator, 'webdriver', {get: () => undefined});\")"
),
weight=SEVERITY_WEIGHTS["critical"],
))
else:
self.findings.append(Finding(
category="WebDriver Flag",
severity="info",
description="navigator.webdriver override detected.",
line=self._find_line(r"webdriver"),
recommendation="No action needed.",
weight=SEVERITY_WEIGHTS["info"],
))
def _check_navigator_properties(self):
"""Check for additional navigator property hardening."""
checks = {
"plugins": (r"navigator.*plugins", "navigator.plugins is empty in headless mode. Real browsers report installed plugins."),
"languages": (r"navigator.*languages", "navigator.languages should be set to match the user agent locale."),
"platform": (r"navigator.*platform", "navigator.platform should match the user agent OS."),
}
overridden_count = 0
for prop, (pattern, desc) in checks.items():
if self._has_pattern(pattern):
overridden_count += 1
if overridden_count == 0:
self.findings.append(Finding(
category="Navigator Properties",
severity="medium",
description="No navigator property hardening detected. Advanced anti-bot services check plugins, languages, and platform properties.",
line=None,
recommendation="Override navigator.plugins, navigator.languages, and navigator.platform via add_init_script() to match realistic browser fingerprints.",
weight=SEVERITY_WEIGHTS["medium"],
))
elif overridden_count < 3:
self.findings.append(Finding(
category="Navigator Properties",
severity="low",
description=f"Partial navigator hardening ({overridden_count}/3 properties). Consider covering all three: plugins, languages, platform.",
line=None,
recommendation="Add overrides for any missing properties among: plugins, languages, platform.",
weight=SEVERITY_WEIGHTS["low"],
))
def _check_request_delays(self):
"""Check for human-like request delays."""
has_sleep = self._has_pattern(r"asyncio\.sleep") or self._has_pattern(r"wait_for_timeout")
has_random_delay = (
self._has_pattern(r"random\.(uniform|randint|random)") and has_sleep
)
if not has_sleep:
self.findings.append(Finding(
category="Request Timing",
severity="high",
description="No delays between actions detected. Machine-speed interactions are the easiest behavior-based detection signal.",
line=None,
recommendation="Add random delays between page interactions: await asyncio.sleep(random.uniform(0.5, 2.0))",
weight=SEVERITY_WEIGHTS["high"],
))
elif not has_random_delay:
self.findings.append(Finding(
category="Request Timing",
severity="medium",
description="Fixed delays detected but no randomization. Constant timing intervals are detectable patterns.",
line=self._find_line(r"(asyncio\.sleep|wait_for_timeout)"),
recommendation="Use random delays: random.uniform(min_seconds, max_seconds) instead of fixed values.",
weight=SEVERITY_WEIGHTS["medium"],
))
else:
self.findings.append(Finding(
category="Request Timing",
severity="info",
description="Randomized delays detected between actions.",
line=self._find_line(r"random\.(uniform|randint)"),
recommendation="No action needed. Ensure delays are realistic (0.5-3s for browsing, 1-5s for reading).",
weight=SEVERITY_WEIGHTS["info"],
))
def _check_error_handling(self):
"""Check for error handling patterns."""
has_try_except = self._has_pattern(r"try\s*:") and self._has_pattern(r"except")
has_retry = self._has_pattern(r"retr(y|ies)") or self._has_pattern(r"max_retries|max_attempts")
if not has_try_except:
self.findings.append(Finding(
category="Error Handling",
severity="medium",
description="No try/except blocks found. Unhandled errors will crash the automation and leave browser instances running.",
line=None,
recommendation="Wrap page interactions in try/except. Handle TimeoutError, network errors, and element-not-found gracefully.",
weight=SEVERITY_WEIGHTS["medium"],
))
elif not has_retry:
self.findings.append(Finding(
category="Error Handling",
severity="low",
description="Error handling present but no retry logic detected. Transient failures (network blips, slow loads) will cause data loss.",
line=None,
recommendation="Add retry with exponential backoff for network operations and element interactions.",
weight=SEVERITY_WEIGHTS["low"],
))
def _check_proxy(self):
"""Check for proxy configuration."""
has_proxy = self._has_pattern(r"proxy\s*=\s*\{") or self._has_pattern(r"proxy.*server")
if not has_proxy:
self.findings.append(Finding(
category="Proxy",
severity="low",
description="No proxy configuration detected. Running from a single IP address is fine for small jobs but will trigger rate limits at scale.",
line=None,
recommendation="For high-volume scraping, use rotating proxies: proxy={'server': 'http://proxy:port'}",
weight=SEVERITY_WEIGHTS["low"],
))
def _check_session_management(self):
"""Check for session/cookie management."""
has_storage_state = self._has_pattern(r"storage_state")
has_cookies = self._has_pattern(r"cookies\(\)") or self._has_pattern(r"add_cookies")
if not has_storage_state and not has_cookies:
self.findings.append(Finding(
category="Session Management",
severity="low",
description="No session persistence detected. Each run will start fresh, requiring re-authentication.",
line=None,
recommendation="Use storage_state() to save/restore sessions across runs. This avoids repeated logins that may trigger security alerts.",
weight=SEVERITY_WEIGHTS["low"],
))
def _check_browser_close(self):
"""Check if browser is properly closed."""
has_close = self._has_pattern(r"browser\.close\(\)") or self._has_pattern(r"await.*close")
has_context_manager = self._has_pattern(r"async\s+with\s+async_playwright")
if not has_close and not has_context_manager:
self.findings.append(Finding(
category="Resource Cleanup",
severity="medium",
description="No browser.close() or context manager detected. Browser processes will leak on failure.",
line=None,
recommendation="Use 'async with async_playwright() as p:' or ensure browser.close() is in a finally block.",
weight=SEVERITY_WEIGHTS["medium"],
))
def _check_stealth_imports(self):
"""Check for stealth/anti-detection library usage."""
has_stealth = self._has_pattern(r"playwright_stealth|stealth_async|undetected")
if has_stealth:
self.findings.append(Finding(
category="Stealth Library",
severity="info",
description="Third-party stealth library detected. These provide additional fingerprint evasion but add dependencies.",
line=self._find_line(r"playwright_stealth|stealth_async|undetected"),
recommendation="Stealth libraries are helpful but not a silver bullet. Still implement manual checks for user agent, viewport, and timing.",
weight=SEVERITY_WEIGHTS["info"],
))
def get_risk_score(self) -> int:
"""Calculate overall risk score (0-100). Higher = more detectable."""
raw_score = sum(f.weight for f in self.findings)
# Cap at 100
return min(raw_score, 100)
def get_risk_level(self) -> str:
"""Get human-readable risk level."""
score = self.get_risk_score()
if score <= 10:
return "LOW"
elif score <= 30:
return "MODERATE"
elif score <= 50:
return "HIGH"
else:
return "CRITICAL"
def get_summary(self) -> dict:
"""Get a summary of the analysis."""
severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
for f in self.findings:
severity_counts[f.severity] += 1
return {
"file": self.file_path,
"risk_score": self.get_risk_score(),
"risk_level": self.get_risk_level(),
"total_findings": len(self.findings),
"severity_counts": severity_counts,
"actionable_findings": len([f for f in self.findings if f.severity != "info"]),
}
def format_text_report(checker: AntiDetectionChecker, verbose: bool = False) -> str:
"""Format findings as human-readable text."""
lines = []
summary = checker.get_summary()
lines.append("=" * 60)
lines.append(" ANTI-DETECTION AUDIT REPORT")
lines.append("=" * 60)
lines.append(f"File: {summary['file']}")
lines.append(f"Risk Score: {summary['risk_score']}/100 ({summary['risk_level']})")
lines.append(f"Total Issues: {summary['actionable_findings']} actionable, {summary['severity_counts']['info']} info")
lines.append("")
# Severity breakdown
for sev in ["critical", "high", "medium", "low"]:
count = summary["severity_counts"][sev]
if count > 0:
lines.append(f" {sev.upper():10s} {count}")
lines.append("")
# Findings grouped by severity
severity_order = ["critical", "high", "medium", "low"]
if verbose:
severity_order.append("info")
for sev in severity_order:
sev_findings = [f for f in checker.findings if f.severity == sev]
if not sev_findings:
continue
lines.append(f"--- {sev.upper()} ---")
for f in sev_findings:
line_info = f" (line {f.line})" if f.line else ""
lines.append(f" [{f.category}]{line_info}")
lines.append(f" {f.description}")
lines.append(f" Fix: {f.recommendation}")
lines.append("")
# Exit code guidance
lines.append("-" * 60)
score = summary["risk_score"]
if score <= 10:
lines.append("Result: PASS - Low detection risk.")
elif score <= 30:
lines.append("Result: PASS with warnings - Address medium/high issues for production use.")
else:
lines.append("Result: FAIL - High detection risk. Fix critical and high issues before deploying.")
lines.append("")
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(
description="Audit a Playwright script for common bot detection vectors.",
epilog=(
"Examples:\n"
" %(prog)s --file scraper.py\n"
" %(prog)s --file scraper.py --verbose\n"
" %(prog)s --file scraper.py --json\n"
"\n"
"Exit codes:\n"
" 0 - Low risk (score 0-10)\n"
" 1 - Moderate to high risk (score 11-50)\n"
" 2 - Critical risk (score 51+)\n"
),
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
"--file",
required=True,
help="Path to the Playwright script to audit",
)
parser.add_argument(
"--json",
action="store_true",
dest="json_output",
default=False,
help="Output results as JSON",
)
parser.add_argument(
"--verbose",
action="store_true",
default=False,
help="Include informational (non-actionable) findings in output",
)
args = parser.parse_args()
file_path = os.path.abspath(args.file)
if not os.path.isfile(file_path):
print(f"Error: File not found: {file_path}", file=sys.stderr)
sys.exit(2)
try:
with open(file_path, "r", encoding="utf-8") as f:
content = f.read()
except Exception as e:
print(f"Error reading file: {e}", file=sys.stderr)
sys.exit(2)
if not content.strip():
print("Error: File is empty.", file=sys.stderr)
sys.exit(2)
checker = AntiDetectionChecker(content, file_path)
checker.check_all()
if args.json_output:
output = checker.get_summary()
output["findings"] = [asdict(f) for f in checker.findings]
if not args.verbose:
output["findings"] = [f for f in output["findings"] if f["severity"] != "info"]
print(json.dumps(output, indent=2))
else:
print(format_text_report(checker, verbose=args.verbose))
# Exit code based on risk
score = checker.get_risk_score()
if score <= 10:
sys.exit(0)
elif score <= 50:
sys.exit(1)
else:
sys.exit(2)
if __name__ == "__main__":
main()


@@ -0,0 +1,324 @@
#!/usr/bin/env python3
"""
Form Automation Builder - Generates Playwright form-fill automation scripts.
Takes a JSON field specification and target URL, then produces a ready-to-run
Playwright script that fills forms, handles multi-step flows, and manages
file uploads.
No external dependencies - uses only Python standard library.
"""
import argparse
import json
import os
import sys
import textwrap
from datetime import datetime
SUPPORTED_FIELD_TYPES = {
"text": "page.fill('{selector}', '{value}')",
"password": "page.fill('{selector}', '{value}')",
"email": "page.fill('{selector}', '{value}')",
"textarea": "page.fill('{selector}', '{value}')",
"select": "page.select_option('{selector}', value='{value}')",
"checkbox": "page.check('{selector}')" if True else "page.uncheck('{selector}')",
"radio": "page.check('{selector}')",
"file": "page.set_input_files('{selector}', '{value}')",
"click": "page.click('{selector}')",
}
def validate_fields(fields):
"""Validate the field specification format. Returns list of issues."""
issues = []
if not isinstance(fields, list):
issues.append("Top-level structure must be a JSON array of field objects.")
return issues
for i, field in enumerate(fields):
if not isinstance(field, dict):
issues.append(f"Field {i}: must be a JSON object.")
continue
if "selector" not in field:
issues.append(f"Field {i}: missing required 'selector' key.")
if "type" not in field:
issues.append(f"Field {i}: missing required 'type' key.")
elif field["type"] not in SUPPORTED_FIELD_TYPES:
issues.append(
f"Field {i}: unsupported type '{field['type']}'. "
f"Supported: {', '.join(sorted(SUPPORTED_FIELD_TYPES.keys()))}"
)
if field.get("type") not in ("checkbox", "radio", "click") and "value" not in field:
issues.append(f"Field {i}: missing 'value' for type '{field.get('type', '?')}'.")
return issues
def generate_field_action(field, indent=8):
"""Generate the Playwright action line for a single field."""
ftype = field["type"]
selector = field["selector"]
value = field.get("value", "")
label = field.get("label", selector)
prefix = " " * indent
lines = []
lines.append(f'{prefix}# {label}')
if ftype == "checkbox":
if field.get("value", "true").lower() in ("true", "yes", "1", "on"):
lines.append(f'{prefix}await page.check("{selector}")')
else:
lines.append(f'{prefix}await page.uncheck("{selector}")')
elif ftype == "radio":
lines.append(f'{prefix}await page.check("{selector}")')
elif ftype == "click":
lines.append(f'{prefix}await page.click("{selector}")')
elif ftype == "select":
lines.append(f'{prefix}await page.select_option("{selector}", value="{value}")')
elif ftype == "file":
lines.append(f'{prefix}await page.set_input_files("{selector}", "{value}")')
else:
# text, password, email, textarea
lines.append(f'{prefix}await page.fill("{selector}", "{value}")')
# Add optional wait_after
wait_after = field.get("wait_after")
if wait_after:
lines.append(f'{prefix}await page.wait_for_selector("{wait_after}")')
return "\n".join(lines)
def build_form_script(url, fields, output_format="script"):
"""Build a Playwright form automation script from the field specification."""
issues = validate_fields(fields)
if issues:
return None, issues
if output_format == "json":
config = {
"url": url,
"fields": fields,
"field_count": len(fields),
"field_types": list(set(f["type"] for f in fields)),
"has_file_upload": any(f["type"] == "file" for f in fields),
"generated_at": datetime.now().isoformat(),
}
return config, None
# Group fields into steps if step markers are present
steps = {}
for field in fields:
step = field.get("step", 1)
if step not in steps:
steps[step] = []
steps[step].append(field)
multi_step = len(steps) > 1
# Generate step functions
step_functions = []
for step_num in sorted(steps.keys()):
step_fields = steps[step_num]
actions = "\n".join(generate_field_action(f) for f in step_fields)
if multi_step:
fn = textwrap.dedent(f"""\
async def fill_step_{step_num}(page):
\"\"\"Fill form step {step_num} ({len(step_fields)} fields).\"\"\"
print(f"Filling step {step_num}...")
{actions}
print(f"Step {step_num} complete.")
""")
else:
fn = textwrap.dedent(f"""\
async def fill_form(page):
\"\"\"Fill form ({len(step_fields)} fields).\"\"\"
print("Filling form...")
{actions}
print("Form filled.")
""")
step_functions.append(fn)
step_functions_str = "\n\n".join(step_functions)
# Generate main() call sequence
if multi_step:
step_calls = "\n".join(
f" await fill_step_{n}(page)" for n in sorted(steps.keys())
)
else:
step_calls = " await fill_form(page)"
submit_selector = None
for field in fields:
if field.get("type") == "click" and field.get("is_submit"):
submit_selector = field["selector"]
break
submit_block = ""
if submit_selector:
submit_block = textwrap.dedent(f"""\
# Submit
await page.click("{submit_selector}")
await page.wait_for_load_state("networkidle")
print("Form submitted.")
""")
script = textwrap.dedent(f'''\
#!/usr/bin/env python3
"""
Auto-generated Playwright form automation script.
Target: {url}
Fields: {len(fields)}
Steps: {len(steps)}
Generated: {datetime.now().isoformat()}
Requirements:
pip install playwright
playwright install chromium
"""
import asyncio
import random
from playwright.async_api import async_playwright
URL = "{url}"
USER_AGENTS = [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
{step_functions_str}
async def main():
async with async_playwright() as p:
browser = await p.chromium.launch(headless=True)
context = await browser.new_context(
viewport={{"width": 1920, "height": 1080}},
user_agent=random.choice(USER_AGENTS),
)
page = await context.new_page()
await page.add_init_script(
"Object.defineProperty(navigator, \'webdriver\', {{get: () => undefined}});"
)
print(f"Navigating to {{URL}}...")
await page.goto(URL, wait_until="networkidle")
{step_calls}
{submit_block}
print("Automation complete.")
await browser.close()
if __name__ == "__main__":
asyncio.run(main())
''')
return script, None
def main():
parser = argparse.ArgumentParser(
description="Generate Playwright form-fill automation scripts from a JSON field specification.",
epilog=textwrap.dedent("""\
Examples:
%(prog)s --url https://example.com/signup --fields fields.json
%(prog)s --url https://example.com/signup --fields fields.json --output fill_form.py
%(prog)s --url https://example.com/signup --fields fields.json --json
Field specification format (fields.json):
[
{"selector": "#email", "type": "email", "value": "user@example.com", "label": "Email"},
{"selector": "#password", "type": "password", "value": "s3cret"},
{"selector": "#country", "type": "select", "value": "US"},
{"selector": "#terms", "type": "checkbox", "value": "true"},
{"selector": "#avatar", "type": "file", "value": "/path/to/photo.jpg"},
{"selector": "button[type='submit']", "type": "click", "is_submit": true}
]
Supported field types: text, password, email, textarea, select, checkbox, radio, file, click
Multi-step forms: Add "step": N to each field to group into steps.
"""),
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
"--url",
required=True,
help="Target form URL",
)
parser.add_argument(
"--fields",
required=True,
help="Path to JSON file containing field specifications",
)
parser.add_argument(
"--output",
help="Output file path (default: stdout)",
)
parser.add_argument(
"--json",
action="store_true",
dest="json_output",
default=False,
help="Output JSON configuration instead of Python script",
)
args = parser.parse_args()
# Load fields
fields_path = os.path.abspath(args.fields)
if not os.path.isfile(fields_path):
print(f"Error: Fields file not found: {fields_path}", file=sys.stderr)
sys.exit(2)
try:
with open(fields_path, "r") as f:
fields = json.load(f)
except json.JSONDecodeError as e:
print(f"Error: Invalid JSON in {fields_path}: {e}", file=sys.stderr)
sys.exit(2)
output_format = "json" if args.json_output else "script"
result, errors = build_form_script(
url=args.url,
fields=fields,
output_format=output_format,
)
if errors:
print("Validation errors:", file=sys.stderr)
for err in errors:
print(f" - {err}", file=sys.stderr)
sys.exit(2)
if args.json_output:
output_text = json.dumps(result, indent=2)
else:
output_text = result
if args.output:
output_path = os.path.abspath(args.output)
with open(output_path, "w") as f:
f.write(output_text)
if not args.json_output:
os.chmod(output_path, 0o755)
print(f"Written to {output_path}", file=sys.stderr)
sys.exit(0)
else:
print(output_text)
sys.exit(0)
if __name__ == "__main__":
main()


@@ -0,0 +1,453 @@
# Anti-Detection Patterns for Browser Automation
This reference covers techniques to make Playwright automation less detectable by anti-bot services. These are defense-in-depth measures — no single technique is sufficient, but combining them significantly reduces detection risk.
## Detection Vectors
Anti-bot systems detect automation through multiple signals. Understanding what they check helps you counter effectively.
### Tier 1: Trivial Detection (Every Site Checks These)
1. **navigator.webdriver** — Set to `true` by all automation frameworks
2. **User-Agent string** — Default headless UA contains "HeadlessChrome"
3. **WebGL renderer** — Headless Chrome reports "SwiftShader" or "Google SwiftShader"
### Tier 2: Common Detection (Most Anti-Bot Services)
4. **Viewport/screen dimensions** — Unusual sizes flag automation
5. **Plugins array** — Empty in headless mode, populated in real browsers
6. **Languages** — Missing or mismatched locale
7. **Request timing** — Machine-speed interactions
8. **Mouse movement** — No mouse events between clicks
### Tier 3: Advanced Detection (Cloudflare, DataDome, PerimeterX)
9. **Canvas fingerprint** — Headless renders differently
10. **WebGL fingerprint** — GPU-specific rendering variations
11. **Audio fingerprint** — AudioContext processing differences
12. **Font enumeration** — Different available fonts in headless
13. **Behavioral analysis** — Scroll patterns, click patterns, reading time
## Stealth Techniques
### 1. WebDriver Flag Removal
The most critical fix. Every anti-bot check starts here.
```python
await page.add_init_script("""
// Remove webdriver flag
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined,
});
// Remove Playwright-specific properties
delete window.__playwright;
delete window.__pw_manual;
""")
```
### 2. User Agent Configuration
Match the user agent to the browser you are launching. A Chrome UA with Firefox-specific headers is a red flag.
```python
# Chrome 120 on Windows 10 (most common configuration globally)
CHROME_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
# Chrome 120 on macOS
CHROME_MAC = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
# Chrome 120 on Linux
CHROME_LINUX = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
# Firefox 121 on Windows
FIREFOX_WIN = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0"
```
**Rules:**
- Update UAs every 2-3 months as browser versions increment
- Match UA platform to `navigator.platform` override
- If using Chromium, use Chrome UAs. If Firefox, use Firefox UAs.
- Never use obviously fake or ancient UAs
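Since the UA and `navigator.platform` must agree, one option is to select them as a pair. A sketch (the UA strings and platform values are examples to refresh periodically):

```python
import random

# Map each user agent to the navigator.platform value it implies
UA_PROFILES = [
    ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
     "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36", "Win32"),
    ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
     "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36", "MacIntel"),
    ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
     "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36", "Linux x86_64"),
]

def pick_profile(rng=random):
    """Pick a UA together with its matching platform so they never disagree."""
    ua, platform = rng.choice(UA_PROFILES)
    return {"user_agent": ua, "platform": platform}

profile = pick_profile()
```

Pass `profile["user_agent"]` to `new_context()` and use `profile["platform"]` in the `navigator.platform` init-script override.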
### 3. Viewport and Screen Properties
Common real-world screen resolutions (from analytics data):
| Resolution | Market Share | Use For |
|-----------|-------------|---------|
| 1920x1080 | ~23% | Default choice |
| 1366x768 | ~14% | Laptop simulation |
| 1536x864 | ~9% | Scaled laptop |
| 1440x900 | ~7% | MacBook |
| 2560x1440 | ~5% | High-end desktop |
```python
import random
VIEWPORTS = [
{"width": 1920, "height": 1080},
{"width": 1366, "height": 768},
{"width": 1536, "height": 864},
{"width": 1440, "height": 900},
]
viewport = random.choice(VIEWPORTS)
context = await browser.new_context(
viewport=viewport,
screen=viewport, # screen should match viewport
)
```
### 4. Navigator Properties Hardening
```python
STEALTH_INIT = """
// Plugins (headless Chrome has 0 plugins, real Chrome has 3-5)
Object.defineProperty(navigator, 'plugins', {
get: () => {
const plugins = [
{ name: 'Chrome PDF Plugin', filename: 'internal-pdf-viewer' },
{ name: 'Chrome PDF Viewer', filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai' },
{ name: 'Native Client', filename: 'internal-nacl-plugin' },
];
plugins.length = 3;
return plugins;
},
});
// Languages
Object.defineProperty(navigator, 'languages', {
get: () => ['en-US', 'en'],
});
// Platform (match to user agent)
Object.defineProperty(navigator, 'platform', {
get: () => 'Win32', // or 'MacIntel' for macOS UA
});
// Hardware concurrency (real browsers report CPU cores)
Object.defineProperty(navigator, 'hardwareConcurrency', {
get: () => 8,
});
// Device memory (Chrome-specific)
Object.defineProperty(navigator, 'deviceMemory', {
get: () => 8,
});
// Connection info
Object.defineProperty(navigator, 'connection', {
get: () => ({
effectiveType: '4g',
rtt: 50,
downlink: 10,
saveData: false,
}),
});
"""
await context.add_init_script(STEALTH_INIT)
```
### 5. WebGL Fingerprint Evasion
Headless Chrome uses SwiftShader for WebGL, which anti-bot services detect.
```python
# Option A: Launch with a real GPU (headed mode on a machine with GPU)
browser = await p.chromium.launch(headless=False)
# Option B: Override WebGL renderer info
await page.add_init_script("""
const getParameter = WebGLRenderingContext.prototype.getParameter;
WebGLRenderingContext.prototype.getParameter = function(parameter) {
if (parameter === 37445) {
return 'Intel Inc.'; // UNMASKED_VENDOR_WEBGL
}
if (parameter === 37446) {
return 'Intel(R) Iris(TM) Plus Graphics 640'; // UNMASKED_RENDERER_WEBGL
}
return getParameter.call(this, parameter);
};
""")
```
### 6. Canvas Fingerprint Noise
Anti-bot services render text/shapes to a canvas and hash the output. Headless Chrome produces a different hash.
```python
await page.add_init_script("""
const originalToDataURL = HTMLCanvasElement.prototype.toDataURL;
HTMLCanvasElement.prototype.toDataURL = function(type) {
if (type === 'image/png' || type === undefined) {
// Add minimal noise to the canvas to change fingerprint
const ctx = this.getContext('2d');
if (ctx) {
const imageData = ctx.getImageData(0, 0, this.width, this.height);
for (let i = 0; i < imageData.data.length; i += 4) {
// Shift one channel by +/- 1 (imperceptible)
imageData.data[i] = imageData.data[i] ^ 1;
}
ctx.putImageData(imageData, 0, 0);
}
}
return originalToDataURL.apply(this, arguments);
};
""")
```
## Request Throttling Patterns
### Human-Like Delays
Real users do not click at machine speed. Add realistic delays between actions.
```python
import random
import asyncio
async def human_delay(action_type="browse"):
"""Add realistic delay based on action type."""
delays = {
"browse": (1.0, 3.0), # Browsing between pages
"read": (2.0, 8.0), # Reading content
"fill": (0.3, 0.8), # Between form fields
"click": (0.1, 0.5), # Before clicking
"scroll": (0.5, 1.5), # Between scroll actions
}
min_s, max_s = delays.get(action_type, (0.5, 2.0))
await asyncio.sleep(random.uniform(min_s, max_s))
```
### Request Rate Limiting
```python
import asyncio
import time
class RateLimiter:
"""Enforce minimum delay between requests."""
def __init__(self, min_interval_seconds=1.0):
self.min_interval = min_interval_seconds
self.last_request_time = 0
async def wait(self):
elapsed = time.time() - self.last_request_time
if elapsed < self.min_interval:
await asyncio.sleep(self.min_interval - elapsed)
self.last_request_time = time.time()
# Usage
limiter = RateLimiter(min_interval_seconds=2.0)
for url in urls:
await limiter.wait()
await page.goto(url)
```
### Exponential Backoff on Errors
```python
async def with_backoff(coro_factory, max_retries=5, base_delay=1.0):
for attempt in range(max_retries):
try:
return await coro_factory()
except Exception as e:
if attempt == max_retries - 1:
raise
delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.1f}s...")
await asyncio.sleep(delay)
```
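Consumed like this (the helper is restated so the sketch runs standalone, with delays shortened for the demo):

```python
import asyncio
import random

async def with_backoff(coro_factory, max_retries=5, base_delay=0.01):
    for attempt in range(max_retries):
        try:
            return await coro_factory()
        except Exception:
            if attempt == max_retries - 1:
                raise
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.01))

attempts = {"n": 0}

async def flaky():
    # Fails twice, then succeeds, like a transiently blocked request
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient")
    return "ok"

result = asyncio.run(with_backoff(flaky))
```

With a real page the factory would be something like `lambda: page.goto(url)`.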
## Proxy Rotation Strategies
### Single Proxy
```python
browser = await p.chromium.launch(
proxy={"server": "http://proxy.example.com:8080"}
)
```
### Authenticated Proxy
```python
context = await browser.new_context(
proxy={
"server": "http://proxy.example.com:8080",
"username": "user",
"password": "pass",
}
)
```
### Rotating Proxy Pool
```python
PROXIES = [
"http://proxy1.example.com:8080",
"http://proxy2.example.com:8080",
"http://proxy3.example.com:8080",
]
async def create_context_with_proxy(browser):
proxy = random.choice(PROXIES)
return await browser.new_context(
proxy={"server": proxy}
)
```
### Per-Request Proxy (via Context Rotation)
Playwright does not support per-request proxy switching. Achieve it by creating a new context for each request or batch:
```python
async def scrape_url(browser, url, proxy):
context = await browser.new_context(proxy={"server": proxy})
page = await context.new_page()
try:
await page.goto(url)
data = await extract_data(page)
return data
finally:
await context.close()
```
### SOCKS5 Proxy
```python
browser = await p.chromium.launch(
proxy={"server": "socks5://proxy.example.com:1080"}
)
```
## Headless Detection Avoidance
### Running Chrome Channel Instead of Chromium
The bundled Chromium binary has different properties than a real Chrome install. Using the Chrome channel removes many of those differences and makes the browser much harder to distinguish from a normal install.
```python
# Use installed Chrome instead of bundled Chromium
browser = await p.chromium.launch(channel="chrome", headless=True)
```
**Requirements:** Chrome must be installed on the system.
### New Headless Mode (Chrome 112+)
Chrome's "new headless" mode is harder to detect than the old one:
```python
browser = await p.chromium.launch(
    headless=False,            # suppresses Playwright's legacy --headless flag
    args=["--headless=new"],   # Chrome still runs without a window, in the new mode
)
```
### Avoiding Common Flags
Do NOT pass these flags — they are headless-detection signals:
- `--disable-gpu` (old headless workaround, not needed)
- `--no-sandbox` (security risk, detectable)
- `--disable-setuid-sandbox` (same as above)
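If launch flags come from configuration, it is worth rejecting the detection-signal flags programmatically rather than by convention. A small sketch (the safe-flag list here is a minimal assumption, not exhaustive):

```python
# Flags that mark a browser as automated or as old-headless
DETECTION_SIGNALS = {"--disable-gpu", "--no-sandbox", "--disable-setuid-sandbox"}

def build_launch_args(extra_args=None):
    """Validate extra launch args, refusing known detection signals."""
    args = ["--disable-blink-features=AutomationControlled"]
    for flag in extra_args or []:
        if flag.split("=", 1)[0] in DETECTION_SIGNALS:
            raise ValueError(f"detection-signal flag rejected: {flag}")
        args.append(flag)
    return args
```

Then launch with `await p.chromium.launch(args=build_launch_args(["--lang=en-US"]))`.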
## Behavioral Evasion
### Mouse Movement Simulation
Anti-bot services track mouse events. A click without preceding mouse movement is suspicious.
```python
async def human_click(page, selector):
"""Click with preceding mouse movement."""
    element = await page.query_selector(selector)
    if element is None:
        raise ValueError(f"no element matches {selector!r}")
    box = await element.bounding_box()
if box:
# Move to element with slight offset
x = box["x"] + box["width"] / 2 + random.uniform(-5, 5)
y = box["y"] + box["height"] / 2 + random.uniform(-5, 5)
await page.mouse.move(x, y, steps=random.randint(5, 15))
await asyncio.sleep(random.uniform(0.05, 0.2))
await page.mouse.click(x, y)
```
### Typing Speed Variation
```python
async def human_type(page, selector, text):
"""Type with variable speed like a human."""
await page.click(selector)
for char in text:
await page.keyboard.type(char)
# Faster for common keys, slower for special characters
if char in "aeiou tnrs":
await asyncio.sleep(random.uniform(0.03, 0.08))
else:
await asyncio.sleep(random.uniform(0.08, 0.20))
```
### Scroll Behavior
Real users scroll gradually, not in instant jumps.
```python
async def human_scroll(page, distance=None):
"""Scroll down gradually like a human."""
if distance is None:
distance = random.randint(300, 800)
current = 0
while current < distance:
step = random.randint(50, 150)
await page.mouse.wheel(0, step)
current += step
await asyncio.sleep(random.uniform(0.05, 0.15))
```
## Detection Testing
### Self-Check Script
Navigate to these URLs to test your stealth configuration:
- `https://bot.sannysoft.com/` — Comprehensive bot detection test
- `https://abrahamjuliot.github.io/creepjs/` — Advanced fingerprint analysis
- `https://browserleaks.com/webgl` — WebGL fingerprint details
- `https://browserleaks.com/canvas` — Canvas fingerprint details
### Quick Test Pattern
```python
async def test_stealth(page):
"""Navigate to detection test page and report results."""
await page.goto("https://bot.sannysoft.com/")
await page.wait_for_timeout(3000)
# Check for failed tests
failed = await page.eval_on_selector_all(
"td.failed",
"els => els.map(e => e.parentElement.querySelector('td').textContent)"
)
if failed:
print(f"FAILED checks: {failed}")
else:
print("All checks passed.")
await page.screenshot(path="stealth_test.png", full_page=True)
```
## Recommended Stealth Stack
For most automation tasks, apply these in order of priority:
1. **WebDriver flag removal** — Critical, takes 2 lines
2. **Custom user agent** — Critical, takes 1 line
3. **Viewport configuration** — High priority, takes 1 line
4. **Request delays** — High priority, add random.uniform() calls
5. **Navigator properties** — Medium priority, init script block
6. **Chrome channel** — Medium priority, one launch option
7. **WebGL override** — Low priority unless hitting advanced anti-bot
8. **Canvas noise** — Low priority unless hitting advanced anti-bot
9. **Proxy rotation** — Only for high-volume or repeated scraping
10. **Behavioral simulation** — Only for sites with behavioral analysis
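Priorities 1 through 3 combine into a few lines. A sketch of the baseline setup (the UA string is illustrative):

```python
import random

WEBDRIVER_PATCH = (
    "Object.defineProperty(navigator, 'webdriver', {get: () => undefined});"
)
UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
VIEWPORTS = [
    {"width": 1920, "height": 1080},
    {"width": 1366, "height": 768},
]

def baseline_context_options():
    """new_context kwargs covering priorities 2-3; add the patch afterwards."""
    viewport = random.choice(VIEWPORTS)
    return {"user_agent": UA, "viewport": viewport, "screen": viewport}
```

Usage: `context = await browser.new_context(**baseline_context_options())` followed by `await context.add_init_script(WEBDRIVER_PATCH)`; priority 4 is the `human_delay()` helper above.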

# Data Extraction Recipes
Practical patterns for extracting structured data from web pages using Playwright. Each recipe is a self-contained pattern you can adapt to your target site.
## CSS Selector Patterns for Common Structures
### E-Commerce Product Listings
```python
PRODUCT_SELECTORS = {
"container": "div.product-card, article.product, li.product-item",
"fields": {
"title": "h2.product-title, h3.product-name, [data-testid='product-title']",
"price": "span.price, .product-price, [data-testid='price']",
"original_price": "span.original-price, .was-price, del",
"rating": "span.rating, .star-rating, [data-rating]",
"review_count": "span.review-count, .num-reviews",
"image_url": "img.product-image::attr(src), img::attr(data-src)",
"product_url": "a.product-link::attr(href), h2 a::attr(href)",
"availability": "span.stock-status, .availability",
}
}
```
### News/Blog Article Listings
```python
ARTICLE_SELECTORS = {
"container": "article, div.post, div.article-card",
"fields": {
"headline": "h2 a, h3 a, .article-title",
"summary": "p.excerpt, .article-summary, .post-excerpt",
"author": "span.author, .byline, [rel='author']",
"date": "time, span.date, .published-date",
"category": "span.category, a.tag, .article-category",
"url": "h2 a::attr(href), .article-title a::attr(href)",
"image_url": "img.thumbnail::attr(src), .article-image img::attr(src)",
}
}
```
### Job Listings
```python
JOB_SELECTORS = {
"container": "div.job-card, li.job-listing, article.job",
"fields": {
"title": "h2.job-title, a.job-link, [data-testid='job-title']",
"company": "span.company-name, .employer, [data-testid='company']",
"location": "span.location, .job-location, [data-testid='location']",
"salary": "span.salary, .compensation, [data-testid='salary']",
"job_type": "span.job-type, .employment-type",
"posted_date": "time, span.posted, .date-posted",
"url": "a.job-link::attr(href), h2 a::attr(href)",
}
}
```
### Search Engine Results
```python
SERP_SELECTORS = {
"container": "div.g, .search-result, li.result",
"fields": {
"title": "h3, .result-title",
"url": "a::attr(href), cite",
"snippet": "div.VwiC3b, .result-snippet, .search-description",
"displayed_url": "cite, .result-url",
}
}
```
## Table Extraction Recipes
### Simple HTML Table to JSON
The most common extraction pattern. Works for any standard `<table>` with `<thead>` and `<tbody>`.
```python
async def extract_table(page, table_selector="table"):
"""Extract an HTML table into a list of dictionaries."""
data = await page.evaluate(f"""
(selector) => {{
const table = document.querySelector(selector);
if (!table) return null;
// Get headers
const headers = Array.from(table.querySelectorAll('thead th, thead td'))
.map(th => th.textContent.trim());
// If no thead, use first row as headers
if (headers.length === 0) {{
const firstRow = table.querySelector('tr');
if (firstRow) {{
headers.push(...Array.from(firstRow.querySelectorAll('th, td'))
.map(cell => cell.textContent.trim()));
}}
}}
// Get data rows
const rows = Array.from(table.querySelectorAll('tbody tr'));
return rows.map(row => {{
const cells = Array.from(row.querySelectorAll('td'));
const obj = {{}};
cells.forEach((cell, i) => {{
if (i < headers.length) {{
obj[headers[i]] = cell.textContent.trim();
}}
}});
return obj;
}});
}}
""", table_selector)
return data or []
```
### Table with Links and Attributes
When table cells contain links or data attributes, not just text:
```python
async def extract_rich_table(page, table_selector="table"):
"""Extract table including links and data attributes."""
return await page.evaluate(f"""
(selector) => {{
const table = document.querySelector(selector);
if (!table) return [];
const headers = Array.from(table.querySelectorAll('thead th'))
.map(th => th.textContent.trim());
return Array.from(table.querySelectorAll('tbody tr')).map(row => {{
const obj = {{}};
Array.from(row.querySelectorAll('td')).forEach((cell, i) => {{
const key = headers[i] || `col_${{i}}`;
obj[key] = cell.textContent.trim();
// Extract link if present
const link = cell.querySelector('a');
if (link) {{
obj[key + '_url'] = link.href;
}}
// Extract data attributes
for (const attr of cell.attributes) {{
if (attr.name.startsWith('data-')) {{
obj[key + '_' + attr.name] = attr.value;
}}
}}
}});
return obj;
}});
}}
""", table_selector)
```
### Multi-Page Table (Paginated)
```python
import random

async def extract_paginated_table(page, table_selector, next_selector, max_pages=50):
"""Extract data from a table that spans multiple pages."""
all_rows = []
headers = None
for page_num in range(max_pages):
# Extract current page
page_data = await page.evaluate(f"""
(selector) => {{
const table = document.querySelector(selector);
if (!table) return {{ headers: [], rows: [] }};
const hs = Array.from(table.querySelectorAll('thead th'))
.map(th => th.textContent.trim());
const rs = Array.from(table.querySelectorAll('tbody tr')).map(row =>
Array.from(row.querySelectorAll('td')).map(td => td.textContent.trim())
);
return {{ headers: hs, rows: rs }};
}}
""", table_selector)
if headers is None and page_data["headers"]:
headers = page_data["headers"]
for row in page_data["rows"]:
all_rows.append(dict(zip(headers or [], row)))
# Check for next page
next_btn = page.locator(next_selector)
if await next_btn.count() == 0 or await next_btn.is_disabled():
break
await next_btn.click()
await page.wait_for_load_state("networkidle")
await page.wait_for_timeout(random.randint(800, 2000))
return all_rows
```
## Product Listing Extraction
### Generic Listing Extractor
Works for any repeating card/list pattern:
```python
async def extract_listings(page, container_sel, field_map):
"""
Extract data from repeating elements.
field_map: dict mapping field names to CSS selectors.
Special suffixes:
::attr(name) — extract attribute instead of text
::html — extract innerHTML
"""
items = []
cards = await page.query_selector_all(container_sel)
for card in cards:
item = {}
for field_name, selector in field_map.items():
try:
if "::attr(" in selector:
sel, attr = selector.split("::attr(")
attr = attr.rstrip(")")
el = await card.query_selector(sel)
item[field_name] = await el.get_attribute(attr) if el else None
elif selector.endswith("::html"):
sel = selector.replace("::html", "")
el = await card.query_selector(sel)
item[field_name] = await el.inner_html() if el else None
else:
el = await card.query_selector(selector)
item[field_name] = (await el.text_content()).strip() if el else None
except Exception:
item[field_name] = None
items.append(item)
return items
```
### With Price Parsing
```python
import re
def parse_price(text):
"""Extract numeric price from text like '$1,234.56' or '1.234,56 EUR'."""
if not text:
return None
# Remove currency symbols and whitespace
cleaned = re.sub(r'[^\d.,]', '', text.strip())
if not cleaned:
return None
# Handle European format (1.234,56)
if ',' in cleaned and '.' in cleaned:
if cleaned.rindex(',') > cleaned.rindex('.'):
cleaned = cleaned.replace('.', '').replace(',', '.')
else:
cleaned = cleaned.replace(',', '')
elif ',' in cleaned:
# Could be 1,234 or 1,23 — check decimal places
parts = cleaned.split(',')
if len(parts[-1]) <= 2:
cleaned = cleaned.replace(',', '.')
else:
cleaned = cleaned.replace(',', '')
try:
return float(cleaned)
except ValueError:
return None
async def extract_products_with_prices(page, container_sel, field_map, price_field="price"):
"""Extract listings and parse prices into floats."""
items = await extract_listings(page, container_sel, field_map)
for item in items:
if price_field in item and item[price_field]:
item[f"{price_field}_raw"] = item[price_field]
item[price_field] = parse_price(item[price_field])
return items
```
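A few worked inputs make the branching in `parse_price` concrete (the function is restated verbatim so the checks run standalone):

```python
import re

def parse_price(text):  # restated from above
    if not text:
        return None
    cleaned = re.sub(r'[^\d.,]', '', text.strip())
    if not cleaned:
        return None
    if ',' in cleaned and '.' in cleaned:
        if cleaned.rindex(',') > cleaned.rindex('.'):
            cleaned = cleaned.replace('.', '').replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    elif ',' in cleaned:
        parts = cleaned.split(',')
        if len(parts[-1]) <= 2:
            cleaned = cleaned.replace(',', '.')
        else:
            cleaned = cleaned.replace(',', '')
    try:
        return float(cleaned)
    except ValueError:
        return None

cases = {
    "$1,234.56": 1234.56,      # US format
    "1.234,56 EUR": 1234.56,   # European format
    "1,234": 1234.0,           # thousands separator, no decimals
    "€0,99": 0.99,             # comma as decimal separator
    "Call for price": None,    # no digits at all
}
```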
## Pagination Handling
### Next-Button Pagination
The most common pattern. Click "Next" until the button disappears or is disabled.
```python
import random

async def paginate_via_next_button(page, next_selector, content_selector, max_pages=100):
"""
Yield page objects as you paginate through results.
next_selector: CSS selector for the "Next" button/link
content_selector: CSS selector to wait for after navigation (confirms new page loaded)
"""
pages_scraped = 0
while pages_scraped < max_pages:
yield page # Caller extracts data from current page
pages_scraped += 1
next_btn = page.locator(next_selector)
if await next_btn.count() == 0:
break
try:
is_disabled = await next_btn.is_disabled()
except Exception:
is_disabled = True
if is_disabled:
break
await next_btn.click()
await page.wait_for_selector(content_selector, state="attached")
await page.wait_for_timeout(random.randint(500, 1500))
```
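Because this is an async generator, it must be consumed with `async for`, not awaited or iterated like a list. The pattern, shown with a browser-free stand-in that mirrors the generator's shape:

```python
import asyncio

async def fake_paginator(pages):
    # Stand-in with the same shape as paginate_via_next_button
    for p in pages:
        yield p

async def collect():
    seen = []
    # With the real generator this line becomes:
    #   async for current in paginate_via_next_button(page, "a.next", "div.results"):
    async for current in fake_paginator(["page-1", "page-2", "page-3"]):
        seen.append(current)
    return seen

result = asyncio.run(collect())
```

Inside the loop body you would call one of the extraction recipes on `current` before the generator clicks through to the next page.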
### URL-Based Pagination
When pages follow a predictable URL pattern:
```python
async def paginate_via_url(page, url_template, start=1, max_pages=100):
"""
Navigate through pages using URL parameters.
url_template: URL with {page} placeholder, e.g., "https://example.com/search?page={page}"
"""
for page_num in range(start, start + max_pages):
url = url_template.format(page=page_num)
response = await page.goto(url, wait_until="networkidle")
if response and response.status == 404:
break
yield page, page_num
await page.wait_for_timeout(random.randint(800, 2500))
```
### Infinite Scroll
For sites that load content as you scroll:
```python
async def paginate_via_scroll(page, item_selector, max_scrolls=100, no_change_limit=3):
"""
Scroll to load more content until no new items appear.
item_selector: CSS selector for individual items (used to count progress)
no_change_limit: Stop after N scrolls with no new items
"""
previous_count = 0
no_change_streak = 0
for scroll_num in range(max_scrolls):
# Count current items
current_count = await page.locator(item_selector).count()
if current_count == previous_count:
no_change_streak += 1
if no_change_streak >= no_change_limit:
break
else:
no_change_streak = 0
previous_count = current_count
# Scroll to bottom
await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
await page.wait_for_timeout(random.randint(1000, 2500))
# Check for "Load More" button that might appear
load_more = page.locator("button:has-text('Load More'), button:has-text('Show More')")
if await load_more.count() > 0 and await load_more.is_visible():
await load_more.click()
await page.wait_for_timeout(random.randint(1000, 2000))
return current_count
```
### Load-More Button
Simpler variant of infinite scroll where content loads via a button:
```python
async def paginate_via_load_more(page, button_selector, item_selector, max_clicks=50):
"""Click a 'Load More' button repeatedly until it disappears."""
for click_num in range(max_clicks):
btn = page.locator(button_selector)
if await btn.count() == 0 or not await btn.is_visible():
break
count_before = await page.locator(item_selector).count()
await btn.click()
# Wait for new items to appear
try:
await page.wait_for_function(
f"document.querySelectorAll('{item_selector}').length > {count_before}",
timeout=10000,
)
except Exception:
break # No new items loaded
await page.wait_for_timeout(random.randint(500, 1500))
return await page.locator(item_selector).count()
```
## Nested Data Extraction
### Comments with Replies (Threaded)
```python
async def extract_threaded_comments(page, parent_selector=".comments"):
"""Recursively extract threaded comments."""
return await page.evaluate(f"""
(parentSelector) => {{
function extractThread(container) {{
const comments = [];
const directChildren = container.querySelectorAll(':scope > .comment');
for (const comment of directChildren) {{
const authorEl = comment.querySelector('.author, .username');
const textEl = comment.querySelector('.comment-text, .comment-body');
const dateEl = comment.querySelector('time, .date');
const repliesContainer = comment.querySelector('.replies, .children');
comments.push({{
author: authorEl ? authorEl.textContent.trim() : null,
text: textEl ? textEl.textContent.trim() : null,
date: dateEl ? (dateEl.getAttribute('datetime') || dateEl.textContent.trim()) : null,
replies: repliesContainer ? extractThread(repliesContainer) : [],
}});
}}
return comments;
}}
const root = document.querySelector(parentSelector);
return root ? extractThread(root) : [];
}}
""", parent_selector)
```
### Nested Categories (Sidebar/Menu)
```python
async def extract_category_tree(page, root_selector="nav.categories"):
"""Extract nested category structure from a sidebar or menu."""
return await page.evaluate(f"""
(rootSelector) => {{
function extractLevel(container) {{
const items = [];
const directItems = container.querySelectorAll(':scope > li, :scope > div.category');
for (const item of directItems) {{
const link = item.querySelector(':scope > a');
const subMenu = item.querySelector(':scope > ul, :scope > div.sub-categories');
items.push({{
name: link ? link.textContent.trim() : item.textContent.trim().split('\\n')[0],
url: link ? link.href : null,
children: subMenu ? extractLevel(subMenu) : [],
}});
}}
return items;
}}
const root = document.querySelector(rootSelector);
return root ? extractLevel(root.querySelector('ul') || root) : [];
}}
""", root_selector)
```
### Accordion/Expandable Content
Some content is hidden behind accordion/expand toggles. Click to reveal, then extract.
```python
async def extract_accordion(page, toggle_selector, content_selector):
"""Expand all accordion items and extract their content."""
items = []
toggles = await page.query_selector_all(toggle_selector)
for toggle in toggles:
title = (await toggle.text_content()).strip()
# Click to expand
await toggle.click()
await page.wait_for_timeout(300)
# Find the associated content panel
        handle = await toggle.evaluate_handle(
            f"el => el.closest('.accordion-item, .faq-item')?.querySelector('{content_selector}')"
        )
        # A JSHandle is truthy even when it wraps null; convert to an element first
        content = handle.as_element()
        body = None
        if content:
            body = await content.text_content()
if body:
body = body.strip()
items.append({"title": title, "content": body})
return items
```
## Data Cleaning Utilities
### Post-Extraction Cleaning
```python
import re
def clean_text(text):
"""Normalize whitespace, remove zero-width characters."""
if not text:
return None
# Remove zero-width characters
text = re.sub(r'[\u200b\u200c\u200d\ufeff]', '', text)
# Normalize whitespace
text = re.sub(r'\s+', ' ', text).strip()
return text if text else None
def clean_url(url, base_url=None):
"""Convert relative URLs to absolute."""
if not url:
return None
url = url.strip()
if url.startswith("//"):
return "https:" + url
if url.startswith("/") and base_url:
return base_url.rstrip("/") + url
return url
def deduplicate(items, key_field):
"""Remove duplicate items based on a key field."""
seen = set()
unique = []
for item in items:
key = item.get(key_field)
if key and key not in seen:
seen.add(key)
unique.append(item)
return unique
```
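Worked examples for the URL and deduplication helpers (both restated so the checks run standalone; the domain names are placeholders):

```python
def clean_url(url, base_url=None):  # restated from above
    if not url:
        return None
    url = url.strip()
    if url.startswith("//"):
        return "https:" + url
    if url.startswith("/") and base_url:
        return base_url.rstrip("/") + url
    return url

def deduplicate(items, key_field):  # restated from above
    seen = set()
    unique = []
    for item in items:
        key = item.get(key_field)
        if key and key not in seen:
            seen.add(key)
            unique.append(item)
    return unique

# Relative path joined against the site root
absolute = clean_url("/p/42", base_url="https://shop.example.com/")
# Protocol-relative URL gets an explicit scheme
schemed = clean_url("//cdn.example.com/a.png")
# Duplicate rows collapsed on the url field
unique = deduplicate([{"url": "a"}, {"url": "a"}, {"url": "b"}], key_field="url")
```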
### Output Formats
```python
import json
import csv
import io
def to_jsonl(items, file_path):
"""Write items as JSON Lines (one JSON object per line)."""
with open(file_path, "w") as f:
for item in items:
f.write(json.dumps(item, ensure_ascii=False) + "\n")
def to_csv(items, file_path):
"""Write items as CSV."""
if not items:
return
headers = list(items[0].keys())
with open(file_path, "w", newline="") as f:
writer = csv.DictWriter(f, fieldnames=headers)
writer.writeheader()
writer.writerows(items)
def to_json(items, file_path, indent=2):
"""Write items as a JSON array."""
with open(file_path, "w") as f:
json.dump(items, f, indent=indent, ensure_ascii=False)
```

# Playwright Browser API Reference (Automation Focus)
This reference covers Playwright's Python async API for browser automation tasks — NOT testing. For test-specific APIs (assertions, fixtures, test runners), see playwright-pro.
## Browser Launch & Context
### Launching the Browser
```python
from playwright.async_api import async_playwright
async with async_playwright() as p:
# Chromium (recommended for most automation)
browser = await p.chromium.launch(headless=True)
# Firefox (better for some anti-detection scenarios)
browser = await p.firefox.launch(headless=True)
# WebKit (Safari engine — useful for Apple-specific sites)
browser = await p.webkit.launch(headless=True)
```
**Launch options:**
| Option | Type | Default | Purpose |
|--------|------|---------|---------|
| `headless` | bool | True | Run without visible window |
| `slow_mo` | int | 0 | Milliseconds to slow each operation (debugging) |
| `proxy` | dict | None | Proxy server configuration |
| `args` | list | [] | Additional Chromium flags |
| `downloads_path` | str | None | Directory for downloads |
| `channel` | str | None | Browser channel: "chrome", "msedge" |
### Browser Contexts (Session Isolation)
Browser contexts are isolated environments within a single browser instance. Each context has its own cookies, localStorage, and cache. Use them instead of launching multiple browsers.
```python
# Create isolated context
context = await browser.new_context(
viewport={"width": 1920, "height": 1080},
user_agent="Mozilla/5.0 ...",
locale="en-US",
timezone_id="America/New_York",
geolocation={"latitude": 40.7128, "longitude": -74.0060},
permissions=["geolocation"],
)
# Multiple contexts share one browser (resource efficient)
context_a = await browser.new_context() # User A session
context_b = await browser.new_context() # User B session
```
### Storage State (Session Persistence)
```python
# Save state after login (cookies + localStorage)
await context.storage_state(path="auth_state.json")
# Restore state in new context
context = await browser.new_context(storage_state="auth_state.json")
```
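A common pattern is to log in once and reuse the saved state on later runs. A sketch of the decision logic (the `auth_state.json` location and the login steps are placeholders):

```python
from pathlib import Path

AUTH_FILE = Path("auth_state.json")  # hypothetical location

def context_kwargs():
    """Reuse saved auth state when present, otherwise start clean."""
    if AUTH_FILE.exists():
        return {"storage_state": str(AUTH_FILE)}
    return {}

# Usage sketch:
# context = await browser.new_context(**context_kwargs())
# if not AUTH_FILE.exists():
#     ... perform login on a page in this context ...
#     await context.storage_state(path=str(AUTH_FILE))
```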
## Page Navigation
### Basic Navigation
```python
page = await context.new_page()
# Navigate with different wait strategies
await page.goto("https://example.com") # Default: "load"
await page.goto("https://example.com", wait_until="domcontentloaded") # Faster
await page.goto("https://example.com", wait_until="networkidle") # Wait for network quiet
await page.goto("https://example.com", timeout=30000) # Custom timeout (ms)
```
**`wait_until` options:**
- `"load"` — wait for the `load` event (all resources loaded)
- `"domcontentloaded"` — DOM is ready, images/styles may still load
- `"networkidle"` — no network requests for 500ms (best for SPAs)
- `"commit"` — response received, before any rendering
### Wait Strategies
```python
# Wait for a specific element to appear
await page.wait_for_selector("div.content", state="visible")
await page.wait_for_selector("div.loading", state="hidden") # Wait for loading to finish
await page.wait_for_selector("table tbody tr", state="attached") # In DOM but maybe not visible
# Wait for URL change (glob pattern, or a compiled regex via `import re`)
await page.wait_for_url("**/dashboard**")
await page.wait_for_url(re.compile(r"/dashboard/\d+"))
# Wait for specific network response
async with page.expect_response("**/api/data*") as resp_info:
await page.click("button.load")
response = await resp_info.value
json_data = await response.json()
# Wait for page load state
await page.wait_for_load_state("networkidle")
# Fixed wait (use sparingly — prefer the methods above)
await page.wait_for_timeout(1000) # milliseconds
```
### Navigation History
```python
await page.go_back()
await page.go_forward()
await page.reload()
```
## Element Interaction
### Finding Elements
```python
# Single element (returns first match)
element = await page.query_selector("css=div.product")
element = await page.query_selector("xpath=//div[@class='product']")
# Multiple elements
elements = await page.query_selector_all("div.product")
# Locator API (recommended — auto-waits, re-queries on each action)
locator = page.locator("div.product")
count = await locator.count()
first = locator.first
nth = locator.nth(2)
```
**Locator vs query_selector:**
- `query_selector` — returns an ElementHandle at a point in time. Can go stale if DOM changes.
- `locator` — returns a Locator that re-queries each time you interact with it. Preferred for reliability.
### Clicking
```python
await page.click("button.submit")
await page.click("a:has-text('Next')")
await page.dblclick("div.editable")
await page.click("button", position={"x": 10, "y": 10}) # Click at offset
await page.click("button", force=True) # Skip actionability checks
await page.click("button", modifiers=["Shift"]) # With modifier key
```
### Text Input
```python
# Fill (clears existing content first)
await page.fill("input#email", "user@example.com")
# Type (simulates keystroke-by-keystroke input — slower, more realistic)
await page.type("input#search", "query text", delay=50) # 50ms between keys
# Press specific keys
await page.press("input#search", "Enter")
await page.press("body", "Control+a")
```
### Dropdowns & Select
```python
# Native <select> element
await page.select_option("select#country", value="US")
await page.select_option("select#country", label="United States")
await page.select_option("select#tags", value=["tag1", "tag2"]) # Multi-select
# Custom dropdown (non-native)
await page.click("div.dropdown-trigger")
await page.click("li.option:has-text('United States')")
```
### Checkboxes & Radio Buttons
```python
await page.check("input#agree")
await page.uncheck("input#newsletter")
is_checked = await page.is_checked("input#agree")
```
### File Upload
```python
# Standard file input
await page.set_input_files("input[type='file']", "/path/to/file.pdf")
await page.set_input_files("input[type='file']", ["/path/a.pdf", "/path/b.pdf"])
# Clear file selection
await page.set_input_files("input[type='file']", [])
# Non-standard upload (drag-and-drop zones)
async with page.expect_file_chooser() as fc_info:
await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")
```
### Hover & Focus
```python
await page.hover("div.menu-item")
await page.focus("input#search")
```
## Data Extraction
### Text Content
```python
# Get text content of an element
text = await page.text_content("h1.title")
inner_text = await page.inner_text("div.description") # Visible text only
inner_html = await page.inner_html("div.content") # HTML markup
# Get attribute
href = await page.get_attribute("a.link", "href")
src = await page.get_attribute("img.photo", "src")
```
### JavaScript Evaluation
```python
# Evaluate in page context
title = await page.evaluate("document.title")
scroll_height = await page.evaluate("document.body.scrollHeight")
# Evaluate on a specific element
text = await page.eval_on_selector("h1", "el => el.textContent")
texts = await page.eval_on_selector_all("li", "els => els.map(e => e.textContent.trim())")
# Complex extraction
data = await page.evaluate("""
() => {
const rows = document.querySelectorAll('table tbody tr');
return Array.from(rows).map(row => {
const cells = row.querySelectorAll('td');
return {
name: cells[0]?.textContent.trim(),
value: cells[1]?.textContent.trim(),
};
});
}
""")
```
### Screenshots & PDF
```python
# Full page screenshot
await page.screenshot(path="page.png", full_page=True)
# Viewport screenshot
await page.screenshot(path="viewport.png")
# Element screenshot
await page.locator("div.chart").screenshot(path="chart.png")
# PDF (headless Chromium only)
await page.pdf(path="page.pdf", format="A4", print_background=True)
# Screenshot as bytes (for processing without saving)
buffer = await page.screenshot()
```
## Network Interception
### Monitoring Requests
```python
# Listen for all responses
page.on("response", lambda response: print(f"{response.status} {response.url}"))
# Wait for a specific API call
async with page.expect_response("**/api/products*") as resp:
await page.click("button.load")
response = await resp.value
data = await response.json()
```
### Blocking Resources (Speed Up Scraping)
```python
# Block images, fonts, and CSS to speed up scraping
await page.route("**/*.{png,jpg,jpeg,gif,svg,woff,woff2,ttf}", lambda route: route.abort())
await page.route("**/*.css", lambda route: route.abort())
# Block specific domains (ads, analytics)
await page.route("**/google-analytics.com/**", lambda route: route.abort())
await page.route("**/facebook.com/**", lambda route: route.abort())
```
### Modifying Requests
```python
# Add custom headers
await page.route("**/*", lambda route: route.continue_(headers={
**route.request.headers,
"X-Custom-Header": "value"
}))
# Mock API responses
await page.route("**/api/data", lambda route: route.fulfill(
status=200,
content_type="application/json",
body=json.dumps({"items": []}),
))
```
## Dialog Handling
```python
# Auto-accept all dialogs
page.on("dialog", lambda dialog: dialog.accept())
# Handle specific dialog types
async def handle_dialog(dialog):
if dialog.type == "confirm":
await dialog.accept()
elif dialog.type == "prompt":
await dialog.accept("my input")
elif dialog.type == "alert":
await dialog.dismiss()
page.on("dialog", handle_dialog)
```
## File Downloads
```python
# Wait for download to start
async with page.expect_download() as dl_info:
await page.click("a.download-link")
download = await dl_info.value
# Save to specific path
await download.save_as("/path/to/downloads/" + download.suggested_filename)
# Get the path to the temporary download file
path = await download.path()
# Set download behavior at context level
context = await browser.new_context(accept_downloads=True)
```
## Frames & Iframes
```python
# Access iframe by selector
frame = page.frame_locator("iframe#content")
await frame.locator("button.submit").click()
# Access frame by name
frame = page.frame(name="editor")
# Access all frames
for frame in page.frames:
print(frame.url)
```
## Cookie Management
```python
# Get all cookies
cookies = await context.cookies()
# Get cookies for specific URL
cookies = await context.cookies(["https://example.com"])
# Add cookies
await context.add_cookies([{
"name": "session",
"value": "abc123",
"domain": "example.com",
"path": "/",
"httpOnly": True,
"secure": True,
}])
# Clear cookies
await context.clear_cookies()
```
## Concurrency Patterns
### Multiple Pages in One Context
```python
# Open multiple tabs in the same session
pages = []
for url in urls:
page = await context.new_page()
await page.goto(url)
pages.append(page)
# Process all pages
for page in pages:
data = await extract_data(page)
await page.close()
```
### Multiple Contexts for Parallel Sessions
```python
import asyncio
async def scrape_with_context(browser, url):
context = await browser.new_context(user_agent=random.choice(USER_AGENTS))
page = await context.new_page()
await page.goto(url)
data = await extract_data(page)
await context.close()
return data
# Run 5 concurrent scraping tasks
tasks = [scrape_with_context(browser, url) for url in urls[:5]]
results = await asyncio.gather(*tasks)
```
## Init Scripts (Stealth)
Init scripts run before any page script, in every new page/context.
```python
# Remove webdriver flag
await context.add_init_script("""
Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")
# Override plugins (headless Chrome has empty plugins)
await context.add_init_script("""
Object.defineProperty(navigator, 'plugins', {
get: () => [1, 2, 3, 4, 5],
});
""")
# Override languages
await context.add_init_script("""
Object.defineProperty(navigator, 'languages', {
get: () => ['en-US', 'en'],
});
""")
# From file
await context.add_init_script(path="stealth.js")
```
## Common Automation Patterns
### Scrolling
```python
# Scroll to bottom
await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
# Scroll element into view
await page.locator("div.target").scroll_into_view_if_needed()
# Smooth scroll simulation
await page.evaluate("""
async () => {
const delay = ms => new Promise(r => setTimeout(r, ms));
for (let i = 0; i < document.body.scrollHeight; i += 300) {
window.scrollTo(0, i);
await delay(100);
}
}
""")
```
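Infinite-scroll pages need a stopping condition: keep scrolling until the page height stops growing. A sketch of that loop, deliberately decoupled from Playwright so it is easy to test — wire `get_height` to `page.evaluate("document.body.scrollHeight")` and `scroll_to` to a `window.scrollTo` call:

```python
import asyncio

async def scroll_until_stable(get_height, scroll_to, max_rounds=20, settle_s=0.5):
    """Scroll until the measured height stops growing, then return it."""
    last = -1
    for _ in range(max_rounds):
        height = await get_height()
        if height == last:  # No new content loaded since the last round
            break
        last = height
        await scroll_to(height)
        await asyncio.sleep(settle_s)  # Give lazy-loaded content time to render
    return last

# Playwright wiring:
#   height = await scroll_until_stable(
#       lambda: page.evaluate("document.body.scrollHeight"),
#       lambda y: page.evaluate(f"window.scrollTo(0, {y})"),
#   )
```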
### Clipboard Operations
```python
# Grant clipboard access first (permission names are Chromium-specific)
await context.grant_permissions(["clipboard-read", "clipboard-write"])
# Copy text
await page.evaluate("navigator.clipboard.writeText('hello')")
# Paste via keyboard (use "Meta+v" on macOS)
await page.keyboard.press("Control+v")
```
### Shadow DOM
```python
# Playwright's CSS engine pierces open shadow DOM automatically
await page.locator("my-component .inner-button").click()
# The >> operator chains selector engines; each hop also pierces open shadow roots
await page.locator("css=host-element >> css=.shadow-child").click()
```

---
#!/usr/bin/env python3
"""
Scraping Toolkit - Generates Playwright scraping script skeletons.
Takes a URL pattern and CSS selectors as input and produces a ready-to-run
Playwright scraping script with pagination support, error handling, and
anti-detection patterns baked in.
No external dependencies - uses only Python standard library.
"""
import argparse
import json
import os
import sys
import textwrap
from datetime import datetime
def build_scraping_script(url, selectors, paginate=False, output_format="script"):
"""Build a Playwright scraping script from the given parameters."""
selector_list = [s.strip() for s in selectors.split(",") if s.strip()]
if not selector_list:
return None, "No valid selectors provided."
field_names = []
for sel in selector_list:
# Derive field name from selector: .product-title -> product_title
name = sel.strip("#.[]()>:+~ ")
name = name.replace("-", "_").replace(" ", "_").replace(".", "_")
# Remove non-alphanumeric
name = "".join(c if c.isalnum() or c == "_" else "" for c in name)
if not name:
name = f"field_{len(field_names)}"
field_names.append(name)
field_map = dict(zip(field_names, selector_list))
if output_format == "json":
config = {
"url": url,
"selectors": field_map,
"pagination": {
"enabled": paginate,
"next_selector": "a:has-text('Next'), button:has-text('Next')",
"max_pages": 50,
},
"anti_detection": {
"random_delay_ms": [800, 2500],
"user_agent_rotation": True,
"viewport": {"width": 1920, "height": 1080},
},
"output": {
"format": "jsonl",
"deduplicate_by": field_names[0] if field_names else None,
},
"generated_at": datetime.now().isoformat(),
}
return config, None
# Build Python script
fields_dict_str = "{\n"
for name, sel in field_map.items():
fields_dict_str += f' "{name}": "{sel}",\n'
fields_dict_str += " }"
pagination_block = ""
if paginate:
pagination_block = textwrap.dedent("""\
# --- Pagination ---
async def scrape_all_pages(page, container, fields, next_sel, max_pages=50):
all_items = []
for page_num in range(max_pages):
print(f"Scraping page {page_num + 1}...")
items = await extract_items(page, container, fields)
all_items.extend(items)
next_btn = page.locator(next_sel)
if await next_btn.count() == 0:
break
try:
is_disabled = await next_btn.is_disabled()
except Exception:
is_disabled = True
if is_disabled:
break
await next_btn.click()
await page.wait_for_load_state("networkidle")
await asyncio.sleep(random.uniform(0.8, 2.5))
return all_items
""")
main_call = "scrape_all_pages(page, CONTAINER, FIELDS, NEXT_SELECTOR)" if paginate else "extract_items(page, CONTAINER, FIELDS)"
script = textwrap.dedent(f'''\
#!/usr/bin/env python3
"""
Auto-generated Playwright scraping script.
Target: {url}
Generated: {datetime.now().isoformat()}
Requirements:
pip install playwright
playwright install chromium
"""
import asyncio
import json
import random
from playwright.async_api import async_playwright
# --- Configuration ---
URL = "{url}"
CONTAINER = "body" # Adjust to the repeating item container selector
FIELDS = {fields_dict_str}
NEXT_SELECTOR = "a:has-text('Next'), button:has-text('Next')"
USER_AGENTS = [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
async def extract_items(page, container_selector, field_map):
"""Extract structured data from repeating elements."""
items = []
cards = await page.query_selector_all(container_selector)
for card in cards:
item = {{}}
for name, selector in field_map.items():
el = await card.query_selector(selector)
if el:
item[name] = (await el.text_content() or "").strip()
else:
item[name] = None
items.append(item)
return items
{pagination_block}
async def main():
async with async_playwright() as p:
browser = await p.chromium.launch(headless=True)
context = await browser.new_context(
viewport={{"width": 1920, "height": 1080}},
user_agent=random.choice(USER_AGENTS),
)
page = await context.new_page()
# Remove WebDriver flag
await page.add_init_script(
"Object.defineProperty(navigator, \'webdriver\', {{get: () => undefined}});"
)
print(f"Navigating to {{URL}}...")
await page.goto(URL, wait_until="networkidle")
data = await {main_call}
print(json.dumps(data, indent=2, ensure_ascii=False))
await browser.close()
if __name__ == "__main__":
asyncio.run(main())
''')
return script, None
def main():
parser = argparse.ArgumentParser(
description="Generate Playwright scraping script skeletons from URL and selectors.",
epilog=(
"Examples:\n"
" %(prog)s --url https://example.com/products --selectors '.title,.price,.rating'\n"
" %(prog)s --url https://example.com/search --selectors '.name,.desc' --paginate\n"
" %(prog)s --url https://example.com --selectors '.item' --json\n"
" %(prog)s --url https://example.com --selectors '.item' --output scraper.py\n"
),
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
"--url",
required=True,
help="Target URL to scrape",
)
parser.add_argument(
"--selectors",
required=True,
help="Comma-separated CSS selectors for data fields (e.g. '.title,.price,.rating')",
)
parser.add_argument(
"--paginate",
action="store_true",
default=False,
help="Include pagination handling in generated script",
)
parser.add_argument(
"--output",
help="Output file path (default: stdout)",
)
parser.add_argument(
"--json",
action="store_true",
dest="json_output",
default=False,
help="Output JSON configuration instead of Python script",
)
args = parser.parse_args()
output_format = "json" if args.json_output else "script"
result, error = build_scraping_script(
url=args.url,
selectors=args.selectors,
paginate=args.paginate,
output_format=output_format,
)
if error:
print(f"Error: {error}", file=sys.stderr)
sys.exit(2)
if args.json_output:
output_text = json.dumps(result, indent=2)
else:
output_text = result
if args.output:
output_path = os.path.abspath(args.output)
with open(output_path, "w") as f:
f.write(output_text)
if not args.json_output:
os.chmod(output_path, 0o755)
print(f"Written to {output_path}", file=sys.stderr)
sys.exit(0)
else:
print(output_text)
sys.exit(0)
if __name__ == "__main__":
main()

---
name: "spec-driven-workflow"
description: "Use when the user asks to write specs before code, define acceptance criteria, plan features before implementation, generate tests from specifications, or follow spec-first development practices."
---
# Spec-Driven Workflow — POWERFUL
## Overview
Spec-driven workflow enforces a single, non-negotiable rule: **write the specification BEFORE you write any code.** Not alongside. Not after. Before.
This is not documentation. This is a contract. A spec defines what the system MUST do, what it SHOULD do, and what it explicitly WILL NOT do. Every line of code you write traces back to a requirement in the spec. Every test traces back to an acceptance criterion. If it is not in the spec, it does not get built.
### Why Spec-First Matters
1. **Eliminates rework.** 60-80% of defects originate from requirements, not implementation. Catching ambiguity in a spec costs minutes; catching it in production costs days.
2. **Forces clarity.** If you cannot write what the system should do in plain language, you do not understand the problem well enough to write code.
3. **Enables parallelism.** Once a spec is approved, frontend, backend, QA, and documentation can all start simultaneously.
4. **Creates accountability.** The spec is the definition of done. No arguments about whether a feature is "complete" — either it satisfies the acceptance criteria or it does not.
5. **Feeds TDD directly.** Acceptance criteria in Given/When/Then format translate 1:1 into test cases. The spec IS the test plan.
### The Iron Law
```
NO CODE WITHOUT AN APPROVED SPEC.
NO EXCEPTIONS. NO "QUICK PROTOTYPES." NO "I'LL DOCUMENT IT LATER."
```
If the spec is not written, reviewed, and approved, implementation does not begin. Period.
---
## The Spec Format
Every spec follows this structure. No sections are optional — if a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not forgotten.
### 1. Title and Context
```markdown
# Spec: [Feature Name]
**Author:** [name]
**Date:** [ISO 8601]
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** [list]
**Related specs:** [links]
## Context
[Why does this feature exist? What problem does it solve? What is the business
motivation? Include links to user research, support tickets, or metrics that
justify this work. 2-4 paragraphs maximum.]
```
### 2. Functional Requirements (RFC 2119)
Use RFC 2119 keywords precisely:
| Keyword | Meaning |
|---------|---------|
| **MUST** | Absolute requirement. Failing this means the implementation is non-conformant. |
| **MUST NOT** | Absolute prohibition. Doing this means the implementation is broken. |
| **SHOULD** | Recommended. May be omitted with documented justification. |
| **SHOULD NOT** | Discouraged. May be included with documented justification. |
| **MAY** | Optional. Purely at the implementer's discretion. |
```markdown
## Functional Requirements
- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
- FR-2: The system MUST reject tokens older than 24 hours.
- FR-3: The system SHOULD support refresh token rotation.
- FR-4: The system MAY cache user profiles for up to 5 minutes.
- FR-5: The system MUST NOT store plaintext passwords under any circumstance.
```
Number every requirement. Use `FR-` prefix. Each requirement is a single, testable statement.
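These numbering and keyword rules are mechanical enough to lint. A minimal sketch of the kind of check a validator performs — illustrative only, not `spec_validator.py`'s actual implementation:

```python
import re

RFC2119 = ("MUST NOT", "MUST", "SHOULD NOT", "SHOULD", "MAY")

def lint_requirements(spec_text):
    """Return warnings for FR-* lines that lack an RFC 2119 keyword."""
    warnings = []
    for line in spec_text.splitlines():
        m = re.match(r"\s*-\s*(FR-\d+):\s*(.+)", line)
        if m and not any(kw in m.group(2) for kw in RFC2119):
            warnings.append(f"{m.group(1)}: no RFC 2119 keyword")
    return warnings

spec = """
- FR-1: The system MUST authenticate users via OAuth 2.0 PKCE flow.
- FR-2: The system handles expired tokens.
"""
print(lint_requirements(spec))  # ['FR-2: no RFC 2119 keyword']
```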
### 3. Non-Functional Requirements
```markdown
## Non-Functional Requirements
### Performance
- NFR-P1: Login flow MUST complete in < 500ms (p95) under normal load.
- NFR-P2: Token validation MUST complete in < 50ms (p99).
### Security
- NFR-S1: All tokens MUST be transmitted over TLS 1.2+.
- NFR-S2: The system MUST rate-limit login attempts to 5/minute per IP.
### Accessibility
- NFR-A1: Login form MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.
### Scalability
- NFR-SC1: The system SHOULD handle 10,000 concurrent sessions.
### Reliability
- NFR-R1: The authentication service MUST maintain 99.9% uptime.
```
### 4. Acceptance Criteria (Given/When/Then)
Every functional requirement maps to one or more acceptance criteria. Use Gherkin syntax:
```markdown
## Acceptance Criteria
### AC-1: Successful login (FR-1)
Given a user with valid credentials
When they submit the login form with correct email and password
Then they receive a valid access token
And they are redirected to the dashboard
And the login event is logged with timestamp and IP
### AC-2: Expired token rejection (FR-2)
Given a user with an access token issued 25 hours ago
When they make an API request with that token
Then they receive a 401 Unauthorized response
And the response body contains error code "TOKEN_EXPIRED"
And they are NOT redirected (API clients handle their own flow)
### AC-3: Rate limiting (NFR-S2)
Given an IP address that has made 5 failed login attempts in the last minute
When a 6th login attempt arrives from that IP
Then the request is rejected with 429 Too Many Requests
And the response includes a Retry-After header
```
### 5. Edge Cases and Error Scenarios
```markdown
## Edge Cases
- EC-1: User submits login form with empty email → Show validation error, do not hit API.
- EC-2: OAuth provider is down → Show "Service temporarily unavailable", retry after 30s.
- EC-3: User has account but no password (social-only) → Redirect to social login.
- EC-4: Concurrent login from two devices → Both sessions are valid (no single-session enforcement).
- EC-5: Token expires mid-request → Complete the current request, return warning header.
```
### 6. API Contracts
Define request/response shapes using TypeScript-style notation:
````markdown
## API Contracts
### POST /api/auth/login
Request:
```typescript
interface LoginRequest {
  email: string;        // MUST be valid email format
  password: string;     // MUST be 8-128 characters
  rememberMe?: boolean; // Default: false
}
```
Success Response (200):
```typescript
interface LoginResponse {
  accessToken: string;  // JWT, expires in 24h
  refreshToken: string; // Opaque, expires in 30d
  expiresIn: number;    // Seconds until access token expires
  user: {
    id: string;
    email: string;
    displayName: string;
  };
}
```
Error Response (401):
```typescript
interface AuthError {
  error: "INVALID_CREDENTIALS" | "TOKEN_EXPIRED" | "ACCOUNT_LOCKED";
  message: string;
  retryAfter?: number; // Seconds, present for rate-limited responses
}
```
````
### 7. Data Models
```markdown
## Data Models
### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| email | string | Unique, max 255 chars, valid email format |
| passwordHash | string | bcrypt, never exposed via API |
| createdAt | timestamp | UTC, immutable |
| lastLoginAt | timestamp | UTC, updated on each login |
| loginAttempts | integer | Reset to 0 on successful login |
| lockedUntil | timestamp | Null if not locked |
```
### 8. Out of Scope
Explicit exclusions prevent scope creep:
```markdown
## Out of Scope
- OS-1: Multi-factor authentication (separate spec: SPEC-042)
- OS-2: Social login providers beyond Google and GitHub
- OS-3: Admin impersonation of user accounts
- OS-4: Password complexity rules beyond minimum length (deferred to v2)
- OS-5: Session management UI (users cannot see/revoke active sessions yet)
```
If someone asks for an out-of-scope item during implementation, point them to this section. Do not build it.
---
## Bounded Autonomy Rules
These rules define when an agent (human or AI) MUST stop and ask for guidance vs. when they can proceed independently.
### STOP and Ask When:
1. **Scope creep detected.** The implementation requires something not in the spec. Even if it seems obviously needed, STOP. The spec might have excluded it deliberately.
2. **Ambiguity exceeds 30%.** If you cannot determine the correct behavior from the spec for more than 30% of a given requirement, the spec is incomplete. Do not guess.
3. **Breaking changes required.** The implementation would change an existing API contract, database schema, or public interface. Always escalate.
4. **Security implications.** Any change that touches authentication, authorization, encryption, or PII handling requires explicit approval.
5. **Performance characteristics unknown.** If a requirement says "MUST complete in < 500ms" but you have no way to measure or guarantee that, escalate before implementing a guess.
6. **Cross-team dependencies.** If the spec requires coordination with another team or service, confirm the dependency before building against it.
### Continue Autonomously When:
1. **Spec is clear and unambiguous** for the current task.
2. **All acceptance criteria have passing tests** and you are refactoring internals.
3. **Changes are non-breaking** — no public API, schema, or behavior changes.
4. **Implementation is a direct translation** of a well-defined acceptance criterion.
5. **Error handling follows established patterns** already documented in the codebase.
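The stop/continue rules above amount to a simple guard: if any stop condition applies, escalate; otherwise proceed. A sketch with illustrative flag names (see `references/bounded_autonomy_rules.md` for the full decision matrix):

```python
STOP_CONDITIONS = (
    "scope_creep",
    "high_ambiguity",        # > 30% of a requirement's behavior is unclear
    "breaking_change",
    "security_implications",
    "unmeasurable_nfr",
    "cross_team_dependency",
)

def next_action(flags):
    """Return which STOP conditions apply, or 'continue' if none do."""
    triggered = [c for c in STOP_CONDITIONS if flags.get(c)]
    return ("escalate", triggered) if triggered else ("continue", [])
```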
### Escalation Protocol
When you must stop, provide:
```markdown
## Escalation: [Brief Title]
**Blocked on:** [requirement ID, e.g., FR-3]
**Question:** [Specific, answerable question — not "what should I do?"]
**Options considered:**
A. [Option] — Pros: [...] Cons: [...]
B. [Option] — Pros: [...] Cons: [...]
**My recommendation:** [A or B, with reasoning]
**Impact of waiting:** [What is blocked until this is resolved?]
```
Never escalate without a recommendation. Never present an open-ended question. Always give options.
See `references/bounded_autonomy_rules.md` for the complete decision matrix.
---
## Workflow — 6 Phases
### Phase 1: Gather Requirements
**Goal:** Understand what needs to be built and why.
1. **Interview the user.** Ask:
- What problem does this solve?
- Who are the users?
- What does success look like?
- What explicitly should NOT be built?
2. **Read existing code.** Understand the current system before proposing changes.
3. **Identify constraints.** Performance budgets, security requirements, backward compatibility.
4. **List unknowns.** Every unknown is a risk. Surface them now, not during implementation.
**Exit criteria:** You can explain the feature to someone unfamiliar with the project in 2 minutes.
### Phase 2: Write Spec
**Goal:** Produce a complete spec document following The Spec Format above.
1. Fill every section of the template. No section left blank.
2. Number all requirements (FR-*, NFR-*, AC-*, EC-*, OS-*).
3. Use RFC 2119 keywords precisely.
4. Write acceptance criteria in Given/When/Then format.
5. Define API contracts with TypeScript-style types.
6. List explicit exclusions in Out of Scope.
**Exit criteria:** The spec can be handed to a developer who was not in the requirements meeting, and they can implement the feature without asking clarifying questions.
### Phase 3: Validate Spec
**Goal:** Verify the spec is complete, consistent, and implementable.
Run `spec_validator.py` against the spec file:
```bash
python spec_validator.py --file spec.md --strict
```
Manual validation checklist:
- [ ] Every functional requirement has at least one acceptance criterion
- [ ] Every acceptance criterion is testable (no subjective language)
- [ ] API contracts cover all endpoints mentioned in requirements
- [ ] Data models cover all entities mentioned in requirements
- [ ] Edge cases cover failure modes for every external dependency
- [ ] Out of scope is explicit about what was considered and rejected
- [ ] Non-functional requirements have measurable thresholds
**Exit criteria:** Spec scores 80+ on validator, and all manual checklist items pass.
### Phase 4: Generate Tests
**Goal:** Extract test cases from acceptance criteria before writing implementation code.
Run `test_extractor.py` against the approved spec:
```bash
python test_extractor.py --file spec.md --framework pytest --output tests/
```
1. Each acceptance criterion becomes one or more test cases.
2. Each edge case becomes a test case.
3. Tests are stubs — they define the assertion but not the implementation.
4. All tests MUST fail initially (red phase of TDD).
**Exit criteria:** You have a test file where every test fails with "not implemented" or equivalent.
### Phase 5: Implement
**Goal:** Write code that makes failing tests pass, one acceptance criterion at a time.
1. Pick one acceptance criterion (start with the simplest).
2. Make its test(s) pass with minimal code.
3. Run the full test suite — no regressions.
4. Commit.
5. Pick the next acceptance criterion. Repeat.
**Rules:**
- Do NOT implement anything not in the spec.
- Do NOT optimize before all acceptance criteria pass.
- Do NOT refactor before all acceptance criteria pass.
- If you discover a missing requirement, STOP and update the spec first.
**Exit criteria:** All tests pass. All acceptance criteria satisfied.
### Phase 6: Self-Review
**Goal:** Verify implementation matches spec before marking done.
Run through the Self-Review Checklist below. If any item fails, fix it before declaring the task complete.
---
## Self-Review Checklist
Before marking any implementation as done, verify ALL of the following:
- [ ] **Every acceptance criterion has a passing test.** No exceptions. If AC-3 exists, a test for AC-3 exists and passes.
- [ ] **Every edge case has a test.** EC-1 through EC-N all have corresponding test cases.
- [ ] **No scope creep.** The implementation does not include features not in the spec. If you added something, either update the spec or remove it.
- [ ] **API contracts match implementation.** Request/response shapes in code match the spec exactly. Field names, types, status codes — all of it.
- [ ] **Error scenarios tested.** Every error response defined in the spec has a test that triggers it.
- [ ] **Non-functional requirements verified.** If the spec says < 500ms, you have evidence (benchmark, load test, profiling) that it meets the threshold.
- [ ] **Data model matches.** Database schema matches the spec. No extra columns, no missing constraints.
- [ ] **Out-of-scope items not built.** Double-check that nothing from the Out of Scope section leaked into the implementation.
---
## Integration with TDD Guide
Spec-driven workflow and TDD are complementary, not competing:
```
Spec-Driven Workflow TDD (Red-Green-Refactor)
───────────────────── ──────────────────────────
Phase 1: Gather Requirements
Phase 2: Write Spec
Phase 3: Validate Spec
Phase 4: Generate Tests ──→ RED: Tests exist and fail
Phase 5: Implement ──→ GREEN: Minimal code to pass
Phase 6: Self-Review ──→ REFACTOR: Clean up internals
```
**The handoff:** Spec-driven workflow produces the test stubs (Phase 4). TDD takes over from there. The spec tells you WHAT to test. TDD tells you HOW to implement.
Use `engineering-team/tdd-guide` for:
- Red-green-refactor cycle discipline
- Coverage analysis and gap detection
- Framework-specific test patterns (Jest, Pytest, JUnit)
Use `engineering/spec-driven-workflow` for:
- Defining what to build before building it
- Acceptance criteria authoring
- Completeness validation
- Scope control
---
## Examples
### Full Spec: User Password Reset
```markdown
# Spec: Password Reset Flow
**Author:** Engineering Team
**Date:** 2026-03-25
**Status:** Approved
## Context
Users who forget their passwords currently have no self-service recovery option.
Support receives ~200 password reset requests per week, costing approximately
8 hours of support time. This feature eliminates that burden entirely.
## Functional Requirements
- FR-1: The system MUST allow users to request a password reset via email.
- FR-2: The system MUST send a reset link that expires after 1 hour.
- FR-3: The system MUST invalidate all previous reset links when a new one is requested.
- FR-4: The system MUST enforce minimum password length of 8 characters on reset.
- FR-5: The system MUST NOT reveal whether an email exists in the system.
- FR-6: The system SHOULD log all reset attempts for audit purposes.
## Acceptance Criteria
### AC-1: Request reset (FR-1, FR-5)
Given a user on the password reset page
When they enter any email address and submit
Then they see "If an account exists, a reset link has been sent"
And the response is identical whether the email exists or not
### AC-2: Valid reset link (FR-2)
Given a user who received a reset email 30 minutes ago
When they click the reset link
Then they see the password reset form
### AC-3: Expired reset link (FR-2)
Given a user who received a reset email 2 hours ago
When they click the reset link
Then they see "This link has expired. Please request a new one."
### AC-4: Previous links invalidated (FR-3)
Given a user who requested two reset emails
When they click the link from the first email
Then they see "This link is no longer valid."
## Edge Cases
- EC-1: User submits reset for non-existent email → Same success message (FR-5).
- EC-2: User clicks reset link twice → Second click shows "already used" if password was changed.
- EC-3: Email delivery fails → Log error, do not retry automatically.
- EC-4: User requests reset while already logged in → Allow it, do not force logout.
## Out of Scope
- OS-1: Security questions as alternative reset method.
- OS-2: SMS-based password reset.
- OS-3: Admin-initiated password reset (separate spec).
```
### Extracted Test Cases (from above spec)
```python
# Generated by test_extractor.py --framework pytest
class TestPasswordReset:
def test_ac1_request_reset_existing_email(self):
"""AC-1: Request reset with existing email shows generic message."""
# Given a user on the password reset page
# When they enter a registered email and submit
# Then they see "If an account exists, a reset link has been sent"
raise NotImplementedError("Implement this test")
def test_ac1_request_reset_nonexistent_email(self):
"""AC-1: Request reset with unknown email shows same generic message."""
# Given a user on the password reset page
# When they enter an unregistered email and submit
# Then they see identical response to existing email case
raise NotImplementedError("Implement this test")
def test_ac2_valid_reset_link(self):
"""AC-2: Reset link works within expiry window."""
raise NotImplementedError("Implement this test")
def test_ac3_expired_reset_link(self):
"""AC-3: Reset link rejected after 1 hour."""
raise NotImplementedError("Implement this test")
def test_ac4_previous_links_invalidated(self):
"""AC-4: Old reset links stop working when new one is requested."""
raise NotImplementedError("Implement this test")
def test_ec1_nonexistent_email_same_response(self):
"""EC-1: Non-existent email produces identical response."""
raise NotImplementedError("Implement this test")
def test_ec2_reset_link_used_twice(self):
"""EC-2: Already-used reset link shows appropriate message."""
raise NotImplementedError("Implement this test")
```
---
## Anti-Patterns
### 1. Coding Before Spec Approval
**Symptom:** "I'll start coding while the spec is being reviewed."
**Problem:** The review will surface changes. Now you have code that implements a rejected design.
**Rule:** Implementation does not begin until spec status is "Approved."
### 2. Vague Acceptance Criteria
**Symptom:** "The system should work well" or "The UI should be responsive."
**Problem:** Untestable. What does "well" mean? What does "responsive" mean?
**Rule:** Every acceptance criterion must be verifiable by a machine. If you cannot write a test for it, rewrite the criterion.
### 3. Missing Edge Cases
**Symptom:** Happy path is specified, error paths are not.
**Problem:** Developers invent error handling on the fly, leading to inconsistent behavior.
**Rule:** For every external dependency (API, database, file system, user input), specify at least one failure scenario.
### 4. Spec as Post-Hoc Documentation
**Symptom:** "Let me write the spec now that the feature is done."
**Problem:** This is documentation, not specification. It describes what was built, not what should have been built. It cannot catch design errors because the design is already frozen.
**Rule:** If the spec was written after the code, it is not a spec. Relabel it as documentation.
### 5. Gold-Plating Beyond Spec
**Symptom:** "While I was in there, I also added..."
**Problem:** Untested code. Unreviewed design. Potential for subtle bugs in the "bonus" feature.
**Rule:** If it is not in the spec, it does not get built. File a new spec for additional features.
### 6. Acceptance Criteria Without Requirement Traceability
**Symptom:** AC-7 exists but does not reference any FR-* or NFR-*.
**Problem:** Orphaned criteria mean either a requirement is missing or the criterion is unnecessary.
**Rule:** Every AC-* MUST reference at least one FR-* or NFR-*.
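This traceability rule is also lintable. A hedged sketch — the regexes here are illustrative, not `spec_validator.py`'s actual logic:

```python
import re

def orphaned_criteria(spec_text):
    """Find AC headings that do not reference any FR-* or NFR-* requirement."""
    orphans = []
    for line in spec_text.splitlines():
        m = re.match(r"#+\s*(AC-\d+):\s*(.*)", line)
        if m and not re.search(r"\((?:FR|NFR)[^)]*\)", m.group(2)):
            orphans.append(m.group(1))
    return orphans

spec = """### AC-1: Successful login (FR-1)
### AC-7: Looks nice
"""
print(orphaned_criteria(spec))  # ['AC-7']
```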
### 7. Skipping Validation
**Symptom:** "The spec looks fine, let's just start."
**Problem:** Missing sections discovered during implementation cause blocking delays.
**Rule:** Always run `spec_validator.py --strict` before starting implementation. Fix all warnings.
---
## Cross-References
- **`engineering-team/tdd-guide`** — Red-green-refactor cycle, test generation, coverage analysis. Use after Phase 4 of this workflow.
- **`engineering/focused-fix`** — Deep-dive feature repair. When a spec-driven implementation has systemic issues, use focused-fix for diagnosis.
- **`engineering/rag-architect`** — If the feature involves retrieval or knowledge systems, use rag-architect for the technical design within the spec.
- **`references/spec_format_guide.md`** — Complete template with section-by-section explanations.
- **`references/bounded_autonomy_rules.md`** — Full decision matrix for when to stop vs. continue.
- **`references/acceptance_criteria_patterns.md`** — Pattern library for writing Given/When/Then criteria.
---
## Tools
| Script | Purpose | Key Flags |
|--------|---------|-----------|
| `spec_generator.py` | Generate spec template from feature name/description | `--name`, `--description`, `--format`, `--json` |
| `spec_validator.py` | Validate spec completeness (0-100 score) | `--file`, `--strict`, `--json` |
| `test_extractor.py` | Extract test stubs from acceptance criteria | `--file`, `--framework`, `--output`, `--json` |
```bash
# Generate a spec template
python spec_generator.py --name "User Authentication" --description "OAuth 2.0 login flow"
# Validate a spec
python spec_validator.py --file specs/auth.md --strict
# Extract test cases
python test_extractor.py --file specs/auth.md --framework pytest --output tests/test_auth.py
```

---
# Acceptance Criteria Patterns
A pattern library for writing Given/When/Then acceptance criteria across common feature types. Use these as starting points — adapt to your domain.
---
## Pattern Structure
Every acceptance criterion follows this structure:
```
### AC-N: [Descriptive name] (FR-N, NFR-N)
Given [precondition — the system/user is in this state]
When [trigger — the user or system performs this action]
Then [outcome — this observable, testable result occurs]
And [additional outcome — and this also happens]
```
**Rules:**
1. One scenario per AC. Multiple Given/When/Then blocks = multiple ACs.
2. Every AC references at least one FR-* or NFR-*.
3. Outcomes must be observable and testable — no subjective language.
4. Preconditions must be achievable in a test setup.
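The structure is regular enough to parse mechanically. A minimal sketch in Python (independent of the bundled `test_extractor.py`, whose actual parsing logic may differ):

```python
import re

def parse_ac(text: str) -> dict:
    """Parse a Given/When/Then block into {'given': [...], 'when': [...], 'then': [...]}."""
    steps = {"given": [], "when": [], "then": []}
    current = None
    for line in text.strip().splitlines():
        m = re.match(r"\s*(Given|When|Then|And)\s+(.*)", line)
        if not m:
            continue
        keyword, clause = m.group(1).lower(), m.group(2)
        if keyword != "and":
            current = keyword  # "And" clauses attach to the most recent keyword
        if current:
            steps[current].append(clause)
    return steps

ac = """
Given a registered user with email "user@example.com"
When they POST /api/auth/login with valid credentials
Then the response status is 200
And the response body contains a valid JWT access token
"""
print(parse_ac(ac)["then"])
```

Because `And` clauses attach to the preceding primary keyword, a block with two `When` lines parses as two triggers, which is rule 1's signal that the AC should be split.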
---
## Authentication Patterns
### Login — Happy Path
```markdown
### AC-1: Successful login with valid credentials (FR-1)
Given a registered user with email "user@example.com" and password "V@lidP4ss!"
When they POST /api/auth/login with email "user@example.com" and password "V@lidP4ss!"
Then the response status is 200
And the response body contains a valid JWT access token
And the response body contains a refresh token
And the access token expires in 24 hours
```
### Login — Invalid Credentials
```markdown
### AC-2: Login rejected with wrong password (FR-1)
Given a registered user with email "user@example.com"
When they POST /api/auth/login with email "user@example.com" and an incorrect password
Then the response status is 401
And the response body contains error code "INVALID_CREDENTIALS"
And no token is issued
And the failed attempt is logged
```
### Login — Account Locked
```markdown
### AC-3: Login rejected for locked account (FR-1, NFR-S2)
Given a user whose account is locked due to 5 consecutive failed login attempts
When they POST /api/auth/login with correct credentials
Then the response status is 403
And the response body contains error code "ACCOUNT_LOCKED"
And the response includes a "retryAfter" field with seconds until unlock
```
### Token Refresh
```markdown
### AC-4: Token refresh with valid refresh token (FR-3)
Given a user with a valid, non-expired refresh token
When they POST /api/auth/refresh with that refresh token
Then the response status is 200
And a new access token is issued
And the old refresh token is invalidated
And a new refresh token is issued (rotation)
```
### Logout
```markdown
### AC-5: Logout invalidates session (FR-4)
Given an authenticated user with a valid access token
When they POST /api/auth/logout with that token
Then the response status is 204
And the access token is no longer accepted for API calls
And the refresh token is invalidated
```
---
## CRUD Patterns
### Create
```markdown
### AC-6: Create resource with valid data (FR-1)
Given an authenticated user with "editor" role
When they POST /api/resources with valid payload {name: "Test", type: "A"}
Then the response status is 201
And the response body contains the created resource with a generated UUID
And the resource's "createdAt" field is set to the current UTC timestamp
And the resource's "createdBy" field matches the authenticated user's ID
```
### Create — Validation Failure
```markdown
### AC-7: Create resource rejected with invalid data (FR-1)
Given an authenticated user
When they POST /api/resources with payload missing required field "name"
Then the response status is 400
And the response body contains error code "VALIDATION_ERROR"
And the response body contains field-level detail: {"name": "Required field"}
And no resource is created in the database
```
### Read — Single Item
```markdown
### AC-8: Read resource by ID (FR-2)
Given an existing resource with ID "abc-123"
When an authenticated user GETs /api/resources/abc-123
Then the response status is 200
And the response body contains the resource with all fields
```
### Read — Not Found
```markdown
### AC-9: Read non-existent resource returns 404 (FR-2)
Given no resource exists with ID "nonexistent-id"
When an authenticated user GETs /api/resources/nonexistent-id
Then the response status is 404
And the response body contains error code "NOT_FOUND"
```
### Update
```markdown
### AC-10: Update resource with valid data (FR-3)
Given an existing resource with ID "abc-123" owned by the authenticated user
When they PATCH /api/resources/abc-123 with {name: "Updated Name"}
Then the response status is 200
And the resource's "name" field is "Updated Name"
And the resource's "updatedAt" field is updated to the current UTC timestamp
And fields not included in the patch are unchanged
```
### Update — Ownership Check
```markdown
### AC-11: Update rejected for non-owner (FR-3, FR-6)
Given an existing resource with ID "abc-123" owned by user "other-user"
When the authenticated user (not "other-user") PATCHes /api/resources/abc-123
Then the response status is 403
And the response body contains error code "FORBIDDEN"
And the resource is unchanged
```
### Delete — Soft Delete
```markdown
### AC-12: Soft delete resource (FR-5)
Given an existing resource with ID "abc-123" owned by the authenticated user
When they DELETE /api/resources/abc-123
Then the response status is 204
And the resource's "deletedAt" field is set to the current UTC timestamp
And the resource no longer appears in GET /api/resources (list endpoint)
And the resource still exists in the database (soft deleted)
```
### List — Pagination
```markdown
### AC-13: List resources with default pagination (FR-4)
Given 50 resources exist for the authenticated user
When they GET /api/resources without pagination parameters
Then the response status is 200
And the response contains the first 20 resources (default page size)
And the response includes "totalCount: 50"
And the response includes "page: 1"
And the response includes "pageSize: 20"
And the response includes "hasNextPage: true"
```
### List — Filtered
```markdown
### AC-14: List resources with type filter (FR-4)
Given 30 resources of type "A" and 20 resources of type "B" exist
When the authenticated user GETs /api/resources?type=A
Then the response status is 200
And all returned resources have type "A"
And the response "totalCount" is 30
```
---
## Search Patterns
### Basic Search
```markdown
### AC-15: Search returns matching results (FR-7)
Given resources with names "Alpha Report", "Beta Analysis", "Alpha Summary" exist
When the user GETs /api/resources?q=Alpha
Then the response contains "Alpha Report" and "Alpha Summary"
And the response does not contain "Beta Analysis"
And results are ordered by relevance score (descending)
```
### Search — Empty Results
```markdown
### AC-16: Search with no matches returns empty list (FR-7)
Given no resources match the query "xyznonexistent"
When the user GETs /api/resources?q=xyznonexistent
Then the response status is 200
And the response contains an empty "items" array
And "totalCount" is 0
```
### Search — Special Characters
```markdown
### AC-17: Search handles special characters safely (FR-7, NFR-S1)
Given resources exist in the database
When the user GETs /api/resources?q="; DROP TABLE resources;--
Then the response status is 200
And no SQL injection occurs
And the search treats the input as a literal string
```
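AC-17 is normally satisfied by parameterized queries rather than string interpolation. A minimal sketch using Python's stdlib `sqlite3` (placeholder syntax varies by driver; this is illustrative, not a prescribed implementation):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO resources (name) VALUES ('Alpha Report')")

q = '"; DROP TABLE resources;--'  # hostile input, treated as a literal string
# The ? placeholder binds q as data, never as SQL
rows = conn.execute(
    "SELECT name FROM resources WHERE name LIKE '%' || ? || '%'", (q,)
).fetchall()
print(rows)  # [] (no match, and the table still exists)
```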
---
## File Upload Patterns
### Upload — Happy Path
```markdown
### AC-18: Upload file within size limit (FR-8)
Given an authenticated user
When they POST /api/files with a 5MB PNG file
Then the response status is 201
And the response contains the file's URL, size, and MIME type
And the file is stored in the configured storage backend
And the file is associated with the authenticated user
```
### Upload — Size Exceeded
```markdown
### AC-19: Upload rejected for oversized file (FR-8)
Given the maximum file size is 10MB
When the user POSTs /api/files with a 15MB file
Then the response status is 413
And the response contains error code "FILE_TOO_LARGE"
And no file is stored
```
### Upload — Invalid Type
```markdown
### AC-20: Upload rejected for disallowed file type (FR-8, NFR-S3)
Given allowed file types are PNG, JPG, PDF
When the user POSTs /api/files with an .exe file
Then the response status is 415
And the response contains error code "UNSUPPORTED_MEDIA_TYPE"
And no file is stored
```
---
## Payment Patterns
### Charge — Happy Path
```markdown
### AC-21: Successful payment charge (FR-10)
Given a user with a valid payment method on file
When they POST /api/payments with amount 49.99 and currency "USD"
Then the payment gateway is charged $49.99
And the response status is 201
And the response contains a transaction ID
And a payment record is created with status "completed"
And a receipt email is sent to the user
```
### Charge — Declined
```markdown
### AC-22: Payment declined by gateway (FR-10)
Given a user with an expired credit card on file
When they POST /api/payments with amount 49.99
Then the payment gateway returns a decline
And the response status is 402
And the response contains error code "PAYMENT_DECLINED"
And no payment record is created with status "completed"
And the user is prompted to update their payment method
```
### Charge — Idempotency
```markdown
### AC-23: Duplicate payment request is idempotent (FR-10, NFR-R1)
Given a payment was successfully processed with idempotency key "key-123"
When the same request is sent again with idempotency key "key-123"
Then the response status is 200
And the response contains the original transaction ID
And the user is NOT charged a second time
```
---
## Notification Patterns
### Email Notification
```markdown
### AC-24: Email notification sent on event (FR-11)
Given a user with notification preferences set to "email"
When their order status changes to "shipped"
Then an email is sent to their registered email address
And the email subject contains the order number
And the email body contains the tracking URL
And a notification record is created with status "sent"
```
### Notification — Delivery Failure
```markdown
### AC-25: Failed notification is retried (FR-11, NFR-R2)
Given the email service returns a 5xx error on first attempt
When a notification is triggered
Then the system retries up to 3 times with exponential backoff (1s, 4s, 16s)
And if all retries fail, the notification status is set to "failed"
And an alert is sent to the ops channel
```
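The retry schedule in AC-25 (initial attempt plus up to 3 retries, with 1s/4s/16s delays) can be sketched as:

```python
import time

def send_with_retry(send, max_retries=3, base=1.0, factor=4.0):
    """Call send(); on failure, retry with exponentially growing delays (1s, 4s, 16s)."""
    delay = base
    for attempt in range(max_retries + 1):
        try:
            return send()
        except Exception:
            if attempt == max_retries:
                raise  # all retries exhausted; caller marks the notification "failed"
            time.sleep(delay)
            delay *= factor
```

The alerting step from AC-25 belongs in the caller's exception handler, after the final raise.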
---
## Negative Test Patterns
### Unauthorized Access
```markdown
### AC-26: Unauthenticated request rejected (NFR-S1)
Given no authentication token is provided
When the user GETs /api/resources
Then the response status is 401
And the response contains error code "AUTHENTICATION_REQUIRED"
And no resource data is returned
```
### Invalid Input — Type Mismatch
```markdown
### AC-27: String provided for numeric field (FR-1)
Given the "quantity" field expects an integer
When the user POSTs with quantity: "abc"
Then the response status is 400
And the response body contains field error: {"quantity": "Must be an integer"}
```
### Rate Limiting
```markdown
### AC-28: Rate limit enforced (NFR-S2)
Given the rate limit is 100 requests per minute per API key
When the user sends the 101st request within 60 seconds
Then the response status is 429
And the response includes header "Retry-After" with seconds until reset
And the response contains error code "RATE_LIMITED"
```
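A fixed-window counter is the simplest mechanism behind AC-28. A minimal per-key sketch (production systems usually prefer sliding windows or token buckets backed by a shared store):

```python
import time
from collections import defaultdict

WINDOW = 60   # seconds
LIMIT = 100   # requests per window per API key
_counters = defaultdict(lambda: [0.0, 0])  # key -> [window_start, count]

def check_rate_limit(api_key, now=None):
    """Return (allowed, retry_after_seconds)."""
    now = time.time() if now is None else now
    window_start, count = _counters[api_key]
    if now - window_start >= WINDOW:
        _counters[api_key] = [now, 1]  # new window
        return True, 0
    if count < LIMIT:
        _counters[api_key][1] = count + 1
        return True, 0
    return False, int(WINDOW - (now - window_start))  # maps to 429 + Retry-After

# 100 requests pass, the 101st is limited
for _ in range(100):
    assert check_rate_limit("key-a", now=1000.0)[0]
print(check_rate_limit("key-a", now=1030.0))  # (False, 30)
```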
### Concurrent Modification
```markdown
### AC-29: Optimistic locking prevents lost updates (NFR-R1)
Given a resource with version 5
When user A PATCHes with version 5 and user B PATCHes with version 5 simultaneously
Then one succeeds with status 200 (version becomes 6)
And the other receives status 409 with error code "CONFLICT"
And the 409 response includes the current version number
```
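AC-29's version check can be sketched as a compare-and-set on the stored version:

```python
resources = {"abc-123": {"version": 5, "name": "Original"}}

def patch(resource_id: str, expected_version: int, changes: dict):
    """Apply changes only if the caller saw the current version; otherwise 409."""
    resource = resources[resource_id]
    if resource["version"] != expected_version:
        return 409, {"error": "CONFLICT", "currentVersion": resource["version"]}
    resource.update(changes)
    resource["version"] += 1
    return 200, resource

status_a, _ = patch("abc-123", 5, {"name": "A's update"})       # first writer wins
status_b, body_b = patch("abc-123", 5, {"name": "B's update"})  # stale version
print(status_a, status_b, body_b["currentVersion"])  # 200 409 6
```

In a real database the check and increment must be atomic, e.g. `UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?`, with the affected-row count deciding between 200 and 409.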
---
## Performance Criteria Patterns
### Response Time
```markdown
### AC-30: API response time under load (NFR-P1)
Given the system is handling 1,000 concurrent users
When a user GETs /api/dashboard
Then the response is returned in < 500ms (p95)
And the response is returned in < 1000ms (p99)
```
### Throughput
```markdown
### AC-31: System handles target throughput (NFR-P2)
Given normal production traffic patterns
When the system receives 5,000 requests per second
Then all requests are processed without queue overflow
And error rate remains below 0.1%
```
### Resource Usage
```markdown
### AC-32: Memory usage within bounds (NFR-P3)
Given the service is processing normal traffic
When measured over a 24-hour period
Then memory usage does not exceed 512MB RSS
And no memory leaks are detected (RSS growth < 5% over 24h)
```
---
## Accessibility Criteria Patterns
### Keyboard Navigation
```markdown
### AC-33: Form is fully keyboard navigable (NFR-A1)
Given the user is on the login page using only a keyboard
When they press Tab
Then focus moves through: email field -> password field -> submit button
And each focused element has a visible focus indicator
And pressing Enter on the submit button submits the form
```
### Screen Reader
```markdown
### AC-34: Error messages announced to screen readers (NFR-A2)
Given the user submits the form with invalid data
When validation errors appear
Then each error is associated with its form field via aria-describedby
And the error container has role="alert" for immediate announcement
And the first error field receives focus
```
### Color Contrast
```markdown
### AC-35: Text meets contrast requirements (NFR-A3)
Given the default theme is active
When measuring text against background colors
Then all body text meets 4.5:1 contrast ratio (WCAG AA)
And all large text (18px+ or 14px+ bold) meets 3:1 contrast ratio
And all interactive element states (hover, focus, active) meet 3:1
```
### Reduced Motion
```markdown
### AC-36: Animations respect user preference (NFR-A4)
Given the user has enabled "prefers-reduced-motion" in their OS settings
When they load any page with animations
Then all non-essential animations are disabled
And essential animations (e.g., loading spinner) use a reduced version
And no content is hidden behind animation-only interactions
```
---
## Writing Tips
### Do
- Start Given with the system/user state, not the action
- Make When a single, specific trigger
- Make Then observable — status codes, field values, side effects
- Include And for additional assertions on the same outcome
- Reference requirement IDs in the AC title
### Do Not
- Write "Then the system works correctly" (not testable)
- Combine multiple scenarios in one AC
- Use subjective words: "quickly", "properly", "nicely", "user-friendly"
- Skip the precondition — Given is required even if it seems obvious
- Write Given/When/Then as prose paragraphs — use the structured format
### Smell Tests
If your AC has any of these, rewrite it:
| Smell | Example | Fix |
|-------|---------|-----|
| No Given clause | "When user clicks, then page loads" | Add "Given user is on the dashboard" |
| Vague Then | "Then it works" | Specify status code, body, side effects |
| Multiple Whens | "When user clicks A and then clicks B" | Split into two ACs |
| Implementation detail | "Then the Redux store is updated" | Focus on user-observable outcome |
| No requirement reference | "AC-5: Dashboard loads" | "AC-5: Dashboard loads (FR-7)" |

---
# Bounded Autonomy Rules
Decision framework for when an agent (human or AI) should stop and ask vs. continue working autonomously during spec-driven development.
---
## The Core Principle
**Autonomy is earned by clarity.** The clearer the spec, the more autonomy the implementer has. The more ambiguous the spec, the more the implementer must stop and ask.
This is not about trust. It is about risk. A clear spec means low risk of building the wrong thing. An ambiguous spec means high risk.
---
## Decision Matrix
| Signal | Action | Rationale |
|--------|--------|-----------|
| Spec is Approved, requirement is clear, tests exist | **Continue** | Low risk. Build it. |
| Requirement is clear but no test exists yet | **Continue** (write the test first) | You can infer the test from the requirement. |
| Requirement uses SHOULD/MAY keywords | **Continue** with your best judgment | These are intentionally flexible. Document your choice. |
| Requirement is ambiguous (multiple valid interpretations) | **STOP** if ambiguity > 30% of the task | Ask the spec author to clarify. |
| Implementation requires changing an API contract | **STOP** always | Breaking changes need explicit approval. |
| Implementation requires a new database migration | **STOP** if it changes existing columns/tables | New tables are lower risk than schema changes. |
| Security-related change (auth, crypto, PII) | **STOP** always | Security changes need review regardless of spec clarity. |
| Performance-critical path with no benchmark data | **STOP** | You cannot prove NFR compliance without measurement. |
| Bug found in existing code unrelated to spec | **STOP** — file a separate issue | Do not fix unrelated bugs in a spec-scoped implementation. |
| Spec says "N/A" for a section you think needs content | **STOP** | The author may have a reason, or they may have missed it. |
---
## Ambiguity Scoring
When you encounter ambiguity, quantify it before deciding to stop or continue.
### How to Score Ambiguity
For each requirement you are implementing, ask:
1. **Can I write a test for this right now?** (No = +20% ambiguity)
2. **Are there multiple valid interpretations?** (Yes = +20% ambiguity)
3. **Does the spec contradict itself?** (Yes = +30% ambiguity)
4. **Am I making assumptions about user behavior?** (Yes = +15% ambiguity)
5. **Does this depend on an undocumented external system?** (Yes = +15% ambiguity)
### Threshold
| Ambiguity Score | Action |
|-----------------|--------|
| 0-15% | Continue. Minor ambiguity is normal. Document your interpretation. |
| 16-30% | Continue with caution. Add a comment explaining your interpretation. Flag in PR. |
| 31-50% | STOP. Ask the spec author one specific question. Do not continue until answered. |
| 51%+ | STOP. The spec is incomplete. Request a revision before proceeding. |
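The five questions and their weights translate directly into a small scoring helper. A sketch using the thresholds from the table (the signal names are illustrative shorthand for the questions above):

```python
WEIGHTS = {
    "cannot_write_test": 20,
    "multiple_interpretations": 20,
    "self_contradiction": 30,
    "assumes_user_behavior": 15,
    "undocumented_dependency": 15,
}

def ambiguity_score(signals: set) -> tuple:
    """Sum the weights of the signals present, then map to the threshold table."""
    score = sum(WEIGHTS[s] for s in signals)
    if score <= 15:
        action = "continue"
    elif score <= 30:
        action = "continue with caution, flag in PR"
    elif score <= 50:
        action = "STOP: ask one specific question"
    else:
        action = "STOP: request spec revision"
    return score, action

print(ambiguity_score({
    "cannot_write_test", "multiple_interpretations",
    "assumes_user_behavior", "undocumented_dependency",
}))  # (70, 'STOP: request spec revision')
```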
### Example
**Requirement:** "FR-7: The system MUST notify the user when their order ships."
Questions:
1. Can I write a test? Partially — I know WHAT to test but not HOW (email? push? in-app?). +20%
2. Multiple interpretations? Yes — notification channel is unclear. +20%
3. Contradicts itself? No. +0%
4. Assuming user behavior? Yes — I am assuming they want email. +15%
5. Undocumented external system? Maybe — depends on notification service. +15%
**Total: 70%.** STOP. The spec needs to specify the notification channel.
---
## Scope Creep Detection
### What Is Scope Creep?
Scope creep is implementing functionality not described in the spec. It includes:
- Adding features the spec does not mention
- "Improving" behavior beyond what acceptance criteria require
- Handling edge cases the spec explicitly excluded
- Refactoring unrelated code "while you're in there"
- Building infrastructure for future features
### Detection Patterns
| Pattern | Example | Risk |
|---------|---------|------|
| "While I'm here..." | Refactoring a utility function unrelated to the spec | Medium — unreviewed changes |
| "This would be easy to add..." | Adding a search filter the spec does not mention | High — untested, unspecified |
| "Users will probably want..." | Building a feature based on assumption | High — may conflict with future specs |
| "This is obviously needed..." | Adding logging, metrics, or caching not in NFRs | Medium — may be overkill or wrong approach |
| "The spec forgot to mention..." | Building something the spec excluded | Critical — may be deliberately excluded |
### Response Protocol
When you detect scope creep in your own work:
1. **Stop immediately.** Do not commit the extra code.
2. **Check Out of Scope.** Is this item explicitly excluded?
3. **If excluded:** Delete the code. The spec author had a reason.
4. **If not mentioned:** File a note for the spec author. Ask if it should be added.
5. **If approved:** Update the spec FIRST, then implement.
---
## Breaking Change Identification
### What Counts as a Breaking Change?
A breaking change is any modification that could cause existing clients, tests, or integrations to fail.
| Category | Breaking | Not Breaking |
|----------|----------|--------------|
| API endpoint removed | Yes | - |
| API endpoint added | - | No |
| Required field added to request | Yes | - |
| Optional field added to request | - | No |
| Field removed from response | Yes | - |
| Field added to response | - | No (usually) |
| Status code changed | Yes | - |
| Error code string changed | Yes | - |
| Database column removed | Yes | - |
| Database column added (nullable) | - | No |
| Database column added (not null, no default) | Yes | - |
| Enum value removed | Yes | - |
| Enum value added | - | No (usually) |
| Behavior change for existing input | Yes | - |
### Breaking Change Protocol
1. **Identify** the breaking change before implementing it.
2. **Escalate** immediately — do not implement without approval.
3. **Propose** a migration path (versioned API, feature flag, deprecation period).
4. **Document** the breaking change in the spec's changelog.
---
## Security Implication Checklist
Any change touching the following areas MUST be escalated, even if the spec seems clear.
### Always Escalate
- [ ] Authentication logic (login, logout, token generation)
- [ ] Authorization logic (role checks, permission gates)
- [ ] Encryption/hashing (algorithm choice, key management)
- [ ] PII handling (storage, transmission, logging)
- [ ] Input validation bypass (new endpoints, parameter changes)
- [ ] Rate limiting changes (thresholds, scope)
- [ ] CORS or CSP policy changes
- [ ] File upload handling
- [ ] SQL/NoSQL query construction (injection risk)
- [ ] Deserialization of user input
- [ ] Redirect URLs from user input (open redirect risk)
- [ ] Secrets in code, config, or logs
### Security Escalation Template
```markdown
## Security Escalation: [Title]
**Affected area:** [authentication/authorization/encryption/PII/etc.]
**Spec reference:** [FR-N or NFR-SN]
**Risk:** [What could go wrong if implemented incorrectly]
**Current protection:** [What exists today]
**Proposed change:** [What the spec requires]
**My concern:** [Specific security question]
**Recommendation:** [Proposed approach with security rationale]
```
---
## Escalation Templates
### Template 1: Ambiguous Requirement
```markdown
## Escalation: Ambiguous Requirement
**Blocked on:** FR-7 ("notify the user when their order ships")
**Ambiguity score:** 70%
**Question:** What notification channel should be used?
**Options considered:**
A. Email only — Pros: simple, reliable. Cons: not real-time.
B. Email + in-app notification — Pros: covers both async and real-time. Cons: more implementation effort.
C. Configurable per user — Pros: maximum flexibility. Cons: requires preference UI (not in spec).
**My recommendation:** B (email + in-app). Covers most use cases without requiring new UI.
**Impact of waiting:** Cannot implement FR-7 until resolved. No other work blocked.
```
### Template 2: Missing Edge Case
```markdown
## Escalation: Missing Edge Case
**Related to:** FR-3 (password reset link expires after 1 hour)
**Scenario:** User clicks a reset link, but their account was deleted between requesting and clicking.
**Not in spec:** Edge cases section does not cover this.
**Options considered:**
A. Show generic "link invalid" error — Pros: secure (no info leak). Cons: confusing for deleted user.
B. Show "account not found" error — Pros: clear. Cons: confirms account deletion to link holder.
**My recommendation:** A. Security over clarity — do not reveal account existence.
**Impact of waiting:** Can implement other ACs; this is blocking only AC-2 completion.
```
### Template 3: Potential Breaking Change
```markdown
## Escalation: Potential Breaking Change
**Spec requires:** Adding required field "role" to POST /api/users request (FR-6)
**Current behavior:** POST /api/users accepts {email, password, displayName}
**Breaking:** Yes — existing clients will get 400 errors (missing required field)
**Options considered:**
A. Make "role" required as spec says — Pros: matches spec. Cons: breaks mobile app v2.1.
B. Make "role" optional with default "user" — Pros: backward compatible. Cons: deviates from spec.
C. Version the API (v2) — Pros: clean separation. Cons: maintenance burden.
**My recommendation:** B. Default to "user" for backward compatibility. Update spec to reflect MAY instead of MUST.
**Impact of waiting:** Frontend team is building against the new contract. Need answer within 2 days.
```
### Template 4: Scope Creep Proposal
```markdown
## Escalation: Potential Addition to Spec
**Context:** While implementing FR-2 (password validation), I noticed the spec does not mention password strength feedback.
**Not in spec:** No requirement for showing strength indicators.
**Checked Out of Scope:** Not listed there either.
**Proposal:** Add FR-7: "The system SHOULD display password strength feedback during registration."
**Effort:** ~2 hours additional implementation.
**Question:** Should this be added to current spec, filed as a separate spec, or skipped?
**Impact of waiting:** FR-2 implementation is not blocked. This is an enhancement question only.
```
---
## Quick Reference Card
```
CONTINUE if:
- Spec is approved
- Requirement uses MUST and is unambiguous
- Tests can be written directly from the AC
- Changes are additive and non-breaking
- You are refactoring internals only (no behavior change)
STOP if:
- Ambiguity > 30%
- Any breaking change
- Any security-related change
- Spec says N/A but you think it shouldn't
- You are about to build something not in the spec
- You cannot write a test for the requirement
- External dependency is undocumented
```
---
## Anti-Patterns in Autonomy
### 1. "I'll Ask Later"
Continuing past an ambiguity checkpoint because asking feels slow. The rework from building the wrong thing is always slower.
### 2. "It's Obviously Needed"
Assuming a missing feature was accidentally omitted. It may have been deliberately excluded. Check Out of Scope first.
### 3. "The Spec Is Wrong"
Implementing what you think the spec SHOULD say instead of what it DOES say. If the spec is wrong, escalate. Do not silently "fix" it.
### 4. "Just This Once"
Bypassing the escalation protocol for a "small" change. Small changes compound. The protocol exists because humans are bad at judging risk in the moment.
### 5. "I Already Built It"
Presenting completed work that was never in the spec and hoping it gets accepted. This creates review pressure and wastes everyone's time if rejected. Ask BEFORE building.

---
# Spec Format Guide
Complete reference for writing feature specifications. Every section is explained with examples, rationale, and common mistakes.
---
## The Spec Document Structure
A spec has 9 mandatory sections. If a section does not apply, write "N/A — [reason]" so reviewers know it was considered, not skipped.
```
1. Title and Metadata
2. Context
3. Functional Requirements
4. Non-Functional Requirements
5. Acceptance Criteria
6. Edge Cases and Error Scenarios
7. API Contracts
8. Data Models
9. Out of Scope
```
---
## Section 1: Title and Metadata
```markdown
# Spec: [Feature Name]
**Author:** Jane Doe
**Date:** 2026-03-25
**Status:** Draft | In Review | Approved | Superseded
**Reviewers:** John Smith, Alice Chen
**Related specs:** SPEC-018 (User Registration), SPEC-023 (Session Management)
```
### Status Lifecycle
| Status | Meaning | Who Can Change |
|--------|---------|----------------|
| Draft | Author is still writing. Not ready for review. | Author |
| In Review | Ready for feedback. Implementation blocked. | Author |
| Approved | Reviewed and accepted. Implementation may begin. | Reviewer |
| Superseded | Replaced by a newer spec. Link to replacement. | Author |
**Rule:** Implementation MUST NOT begin until status is "Approved."
---
## Section 2: Context
The context section answers: **Why does this feature exist?**
### What to Include
- The problem being solved (with evidence: support tickets, metrics, user research)
- The current state (what exists today and what is broken or missing)
- The business justification (revenue impact, cost savings, user retention)
- Constraints or dependencies (regulatory, technical, timeline)
### What to Exclude
- Implementation details (that is the engineer's job)
- Solution proposals (the spec says WHAT, not HOW)
- Lengthy background (2-4 paragraphs maximum)
### Good Example
```markdown
## Context
Users who forget their passwords currently have no self-service recovery.
Support handles ~200 password reset requests per week, consuming approximately
8 hours of agent time at $45/hour ($360/week, $18,720/year). Additionally,
12% of users who contact support for a reset never return.
This feature provides self-service password reset via email, eliminating
support burden and reducing user churn from the reset flow.
```
### Bad Example
```markdown
## Context
We need a password reset feature. Users forget their passwords sometimes
and need to reset them. We should build this.
```
**Why it is bad:** No evidence, no metrics, no business justification. "We should build this" is not a reason.
---
## Section 3: Functional Requirements — RFC 2119
### RFC 2119 Keywords
These keywords have precise meanings per [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). Do not use them casually.
| Keyword | Meaning | Testing Implication |
|---------|---------|---------------------|
| **MUST** | Absolute requirement. The implementation is non-conformant without this. | Must have a passing test. Failure = release blocker. |
| **MUST NOT** | Absolute prohibition. Doing this = broken implementation. | Must have a test proving this cannot happen. |
| **SHOULD** | Strongly recommended. Can be omitted only with documented justification. | Should have a test. Omission requires written rationale. |
| **SHOULD NOT** | Strongly discouraged. Can be done only with documented justification. | Should have a test confirming the behavior does not occur. |
| **MAY** | Truly optional. Implementer's discretion. | Test is optional. Document if implemented. |
### Writing Good Requirements
**Each requirement MUST be:**
1. **Atomic** — One behavior per requirement. Not "The system MUST authenticate users and log them in."
2. **Testable** — You can write a test that proves it works or does not.
3. **Numbered** — Sequential FR-N format for traceability.
4. **Specific** — No ambiguous adjectives ("fast", "secure", "user-friendly").
### Good Requirements
```markdown
- FR-1: The system MUST accept login via email and password.
- FR-2: The system MUST reject passwords shorter than 8 characters.
- FR-3: The system MUST return a JWT access token on successful login.
- FR-4: The system MUST NOT include the password hash in any API response.
- FR-5: The system SHOULD support "remember me" with a 30-day refresh token.
- FR-6: The system MAY display last login time on the dashboard.
```
### Bad Requirements
```markdown
- FR-1: The login system must be fast and secure.
(Untestable: what is "fast"? What is "secure"?)
- FR-2: The system must handle all edge cases.
(Vague: which edge cases? This delegates the spec to the implementer.)
- FR-3: Users should be able to log in easily.
(Subjective: "easily" is not measurable.)
```
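The contrast between the two lists lends itself to a mechanical check. A minimal sketch of such a linter — the vague-word list and function name are illustrative, not part of this skill's shipped tooling:

```python
import re

# Words that signal an untestable requirement (illustrative list)
VAGUE = {"fast", "secure", "user-friendly", "easily", "robust", "intuitive"}
RFC_KEYWORDS = ("MUST NOT", "MUST", "SHOULD NOT", "SHOULD", "MAY")

def lint_requirement(line: str) -> list:
    """Return the problems found in one requirement line (empty list = clean)."""
    problems = []
    if not re.match(r"-\s+FR-\d+:", line):
        problems.append("missing FR-N: prefix")
    if not any(kw in line for kw in RFC_KEYWORDS):
        problems.append("no RFC 2119 keyword")
    words = {w.strip(".,\"'").lower() for w in line.split()}
    vague_hits = sorted(words & VAGUE)
    if vague_hits:
        problems.append("vague adjectives: " + ", ".join(vague_hits))
    return problems
```

Run against the lists above: FR-1 from the good list comes back clean, while "The login system must be fast and secure." is flagged both for the lowercase keyword and for the vague adjectives.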
---
## Section 4: Non-Functional Requirements
Non-functional requirements define quality attributes. Every requirement needs a **measurable threshold**.
### Categories
#### Performance
```markdown
- NFR-P1: Login API MUST respond in < 500ms (p95) under 1,000 concurrent users.
- NFR-P2: Dashboard page MUST achieve Largest Contentful Paint < 2.5s.
- NFR-P3: Search results MUST return within 200ms for queries under 100 characters.
```
**Bad:** "The system should be fast." (Not measurable.)
#### Security
```markdown
- NFR-S1: All API endpoints MUST require authentication except /health and /login.
- NFR-S2: Failed login attempts MUST be rate-limited to 5 per minute per IP.
- NFR-S3: Passwords MUST be hashed with bcrypt (cost factor >= 12).
- NFR-S4: Session tokens MUST be invalidated on password change.
```
#### Accessibility
```markdown
- NFR-A1: All form inputs MUST have associated labels (WCAG 1.3.1).
- NFR-A2: Color contrast MUST meet 4.5:1 ratio (WCAG 1.4.3).
- NFR-A3: All interactive elements MUST be keyboard-navigable (WCAG 2.1.1).
```
#### Scalability
```markdown
- NFR-SC1: The system SHOULD handle 50,000 registered users.
- NFR-SC2: Database queries MUST use indexes; no full table scans on tables > 10K rows.
```
#### Reliability
```markdown
- NFR-R1: The authentication service MUST maintain 99.9% uptime (< 8.77h downtime/year).
- NFR-R2: Data MUST NOT be lost on service restart (durable storage required).
```
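The p95 thresholds in NFR-P1 are only meaningful if everyone computes the percentile the same way. As a quick illustration, p95 under the nearest-rank method — the helper name and sample data are invented for the example:

```python
import math

def p95_ms(samples_ms):
    """95th-percentile latency via the nearest-rank method."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank of the p95 sample
    return ordered[rank - 1]

# 100 synthetic response times: 95 at 120ms, 5 slow outliers at 900ms.
# p95 here is 120ms — exactly 5% of requests may be slow without breaching
# a "< 500ms (p95)" requirement, which is why the percentile must be stated.
samples = [120] * 95 + [900] * 5
```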
---
## Section 5: Acceptance Criteria — Given/When/Then
Acceptance criteria are the contract between the spec author and the implementer. They define "done."
### The Given/When/Then Pattern
```
Given [precondition — the world is in this state]
When [action — the user or system does this]
Then [outcome — this observable result occurs]
And [additional outcome — and also this]
```
### Rules for Acceptance Criteria
1. **Every AC MUST reference at least one FR-* or NFR-*.** Orphaned criteria indicate missing requirements.
2. **Every AC MUST be testable by a machine.** If you cannot write an automated test, rewrite the criterion.
3. **No subjective language.** Not "should look good" but "MUST render within the design-system grid."
4. **One scenario per AC.** If you have multiple Given/When/Then blocks, split into separate ACs.
### Example: Authentication Feature
```markdown
### AC-1: Successful login (FR-1, FR-3)
Given a registered user with email "user@example.com" and password "P@ssw0rd123"
When they POST /api/auth/login with those credentials
Then they receive a 200 response with a valid JWT token
And the token expires in 24 hours
And the response includes the user's display name
### AC-2: Invalid password (FR-1)
Given a registered user with email "user@example.com"
When they POST /api/auth/login with an incorrect password
Then they receive a 401 response
And the response body contains error "INVALID_CREDENTIALS"
And no token is issued
### AC-3: Short password rejected on registration (FR-2)
Given a new user attempting to register
When they submit a password with 7 characters
Then they receive a 400 response
And the response body contains error "PASSWORD_TOO_SHORT"
And the account is not created
```
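Criteria written this way translate almost line-for-line into tests. A sketch against a toy in-memory stand-in for the login endpoint — `login`, `USERS`, and the token value are placeholders, not the real service:

```python
USERS = {"user@example.com": "P@ssw0rd123"}  # toy credential store

def login(email, password):
    """Stand-in for POST /api/auth/login; returns (status, body)."""
    if USERS.get(email) != password:
        return 401, {"error": "INVALID_CREDENTIALS"}
    return 200, {"token": "jwt-placeholder", "displayName": "Example User"}

def test_ac1_successful_login():  # AC-1 (FR-1, FR-3)
    status, body = login("user@example.com", "P@ssw0rd123")
    assert status == 200
    assert "token" in body and "displayName" in body

def test_ac2_invalid_password():  # AC-2 (FR-1)
    status, body = login("user@example.com", "wrong-password")
    assert status == 401
    assert body["error"] == "INVALID_CREDENTIALS"
    assert "token" not in body
```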
### Common Mistakes
| Mistake | Example | Fix |
|---------|---------|-----|
| Vague outcome | "Then the system works correctly" | "Then the response status is 200 and body contains {field: value}" |
| Missing precondition | "When user logs in, then token is issued" | "Given a registered user, when they POST valid credentials, then..." |
| Multiple scenarios | AC with 3 different When clauses | Split into 3 separate ACs |
| No FR reference | "AC-5: User sees dashboard" | "AC-5: User sees dashboard (FR-7)" |
---
## Section 6: Edge Cases and Error Scenarios
### What Counts as an Edge Case
- Invalid or malformed input
- External service failures (API down, timeout, rate-limited)
- Concurrent operations (race conditions)
- Boundary values (empty string, max length, zero, negative numbers)
- State conflicts (already exists, already deleted, expired)
### Format
```markdown
- EC-1: Empty email field → Return 400 with error "EMAIL_REQUIRED". Do not call auth service.
- EC-2: Email exceeds 255 characters → Return 400 with error "EMAIL_TOO_LONG".
- EC-3: OAuth provider returns 503 → Return 503 with "Service temporarily unavailable". Retry after 30s.
- EC-4: Two users register same email simultaneously → First succeeds, second gets 409 Conflict.
- EC-5: User clicks reset link after password was already changed → Show "Link already used."
```
### Coverage Rule
For every external dependency, specify at least one failure scenario:
- Database: connection lost, timeout, constraint violation
- API: 4xx, 5xx, timeout, invalid response
- File system: file not found, permission denied, disk full
- User input: empty, too long, wrong type, injection attempt
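Edge cases listed in the EC-N format drop naturally into a parametrized table. A sketch using EC-1/EC-2-style email checks — the validator and error codes here are illustrative:

```python
def validate_email(email):
    """Return an error code for rejected input, or None if acceptable."""
    if not email:
        return "EMAIL_REQUIRED"
    if len(email) > 255:
        return "EMAIL_TOO_LONG"
    if "@" not in email:
        return "EMAIL_INVALID"
    return None

EDGE_CASES = [  # (input, expected error) — one row per EC-N entry
    ("", "EMAIL_REQUIRED"),
    ("x" * 250 + "@example.com", "EMAIL_TOO_LONG"),
    ("not-an-email", "EMAIL_INVALID"),
    ("user@example.com", None),
]

for value, expected in EDGE_CASES:
    assert validate_email(value) == expected, f"failed for {value!r}"
```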
---
## Section 7: API Contracts
### Notation
Use TypeScript-style interfaces. They are readable by both frontend and backend engineers.
```typescript
interface CreateUserRequest {
email: string; // MUST be valid email, max 255 chars
password: string; // MUST be 8-128 chars
displayName: string; // MUST be 1-100 chars, no HTML
role?: "user" | "admin"; // Default: "user"
}
```
### What to Define
For each endpoint:
1. **HTTP method and path** (e.g., POST /api/users)
2. **Request body** (fields, types, constraints, defaults)
3. **Success response** (status code, body shape)
4. **Error responses** (each error code with its status and body)
5. **Headers** (Authorization, Content-Type, custom headers)
### Error Response Convention
```typescript
interface ApiError {
error: string; // Machine-readable code: "INVALID_CREDENTIALS"
message: string; // Human-readable: "The email or password is incorrect."
details?: Record<string, string>; // Field-level errors for validation
}
```
Always include:
- 400 for validation errors
- 401 for authentication failures
- 403 for authorization failures
- 404 for not found
- 409 for conflicts
- 429 for rate limiting
- 500 for unexpected errors (keep it generic — do not leak internals)
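A small helper keeps error bodies consistent with this shape across endpoints. A minimal sketch — the function is illustrative, not a prescribed API:

```python
from typing import Optional

def api_error(code: str, message: str, details: Optional[dict] = None) -> dict:
    """Build an error body matching the ApiError convention."""
    body = {"error": code, "message": message}
    if details:  # field-level validation errors only
        body["details"] = details
    return body
```

A 400 validation failure carries `details` with per-field messages; a 500 stays generic with no `details`, so nothing internal leaks.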
---
## Section 8: Data Models
### Table Format
```markdown
### User
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | PK, auto-generated, immutable |
| email | varchar(255) | Unique, not null, valid email |
| passwordHash | varchar(60) | Not null, bcrypt, never in API responses |
| displayName | varchar(100) | Not null |
| role | enum('user','admin') | Default: 'user' |
| createdAt | timestamp | UTC, immutable, auto-set |
| updatedAt | timestamp | UTC, auto-updated |
| deletedAt | timestamp | Null unless soft-deleted |
```
### Rules
1. **Every entity in requirements MUST have a data model.** If FR-1 mentions "users", there must be a User model.
2. **Constraints MUST match requirements.** If FR-2 says passwords >= 8 chars, the model must note that.
3. **Include indexes.** If NFR-P1 says < 500ms queries, note which fields need indexes.
4. **Specify soft vs. hard delete.** State it explicitly.
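The table maps directly onto an in-code model. A sketch of the User entity as a plain dataclass — illustrative only, not tied to any ORM; note how the serializer enforces the "never in API responses" constraint on the password hash:

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class User:
    email: str
    password_hash: str        # bcrypt output; never serialized to clients
    display_name: str
    role: str = "user"
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    deleted_at: Optional[datetime] = None  # null unless soft-deleted

    def to_api(self) -> dict:
        """API representation — deliberately omits password_hash."""
        return {
            "id": self.id,
            "email": self.email,
            "displayName": self.display_name,
            "role": self.role,
        }
```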
---
## Section 9: Out of Scope
### Why This Section Matters
Out of Scope prevents scope creep during implementation. When someone says "while you're in there, could you also..." — point them to this section.
### Format
```markdown
- OS-1: Multi-factor authentication — Planned for Q3 (SPEC-045).
- OS-2: Social login beyond Google/GitHub — Insufficient user demand (< 2% requests).
- OS-3: Admin impersonation — Security review pending. Separate spec required.
- OS-4: Password strength meter UI — Nice-to-have, deferred to design sprint 12.
```
### Rules
1. **Every feature discussed and rejected MUST be listed.** This creates a paper trail.
2. **Include the reason.** "Not now" is not a reason. "Insufficient demand (< 2% of requests)" is.
3. **Link to future specs** when the exclusion is a deferral, not a rejection.
---
## Feature-Type Templates
### CRUD Feature
Focus on: all 4 operations, validation rules, authorization, pagination for list endpoints.
```markdown
- FR-1: Users MUST be able to create a [resource] with [required fields].
- FR-2: Users MUST be able to read a [resource] by ID.
- FR-3: Users MUST be able to list [resources] with pagination (default: 20/page).
- FR-4: Users MUST be able to update [mutable fields] of their own [resources].
- FR-5: Users MUST be able to delete their own [resources] (soft delete).
- FR-6: Users MUST NOT be able to modify or delete other users' [resources].
```
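FR-3's pagination default is easy to get subtly wrong (off-by-one page math). A minimal sketch of the list-endpoint slice — names are illustrative:

```python
def paginate(items, page=1, per_page=20):
    """Return one page of results plus paging metadata (FR-3-style default)."""
    start = (page - 1) * per_page          # page numbers are 1-based
    return {
        "items": items[start:start + per_page],
        "page": page,
        "perPage": per_page,
        "total": len(items),
    }
```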
### Integration Feature
Focus on: external API contract, retry/fallback behavior, data mapping, error propagation.
```markdown
- FR-1: The system MUST call [external API] to [purpose].
- FR-2: The system MUST retry failed calls up to 3 times with exponential backoff.
- FR-3: The system MUST map [external field] to [internal field].
- FR-4: The system MUST NOT expose external API errors directly to users.
- EC-1: External API returns 5xx → Log error, return cached data if < 1h old, else 503.
- EC-2: External API response schema changes → Log warning, reject unmappable fields.
```
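FR-2's retry rule is worth pinning down precisely: "exponential backoff" here means the delay doubles on each attempt. A sketch with the sleep function injected so tests can run instantly — names and delays are illustrative:

```python
import time

def call_with_retry(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn, retrying up to `attempts` times with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: propagate so the caller can apply EC-1's fallback
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Injecting `sleep` as a parameter is a small design choice that makes the backoff schedule assertable in tests without real delays.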
### Migration Feature
Focus on: backward compatibility, rollback plan, data integrity, zero-downtime deployment.
```markdown
- FR-1: The migration MUST transform [old schema] to [new schema].
- FR-2: The migration MUST be reversible (rollback script required).
- FR-3: The migration MUST NOT cause downtime exceeding 30 seconds.
- FR-4: The migration MUST validate data integrity post-run (row count, checksum).
- EC-1: Migration fails mid-way → Automatic rollback, alert ops team.
- EC-2: New schema has stricter constraints → Log invalid rows, quarantine for manual review.
```
---
## Checklist: Is This Spec Ready for Review?
- [ ] Every section is filled (or marked N/A with reason)
- [ ] All requirements use FR-N, NFR-N numbering
- [ ] RFC 2119 keywords are UPPERCASE
- [ ] Every AC references at least one requirement
- [ ] Every AC uses Given/When/Then
- [ ] Edge cases cover each external dependency failure
- [ ] API contracts define success AND error responses
- [ ] Data models include all entities from requirements
- [ ] Out of Scope lists items discussed and rejected
- [ ] No placeholder text remains
- [ ] Context includes evidence (metrics, tickets, research)
- [ ] Status is "In Review" (not still "Draft")


@@ -0,0 +1,338 @@
#!/usr/bin/env python3
"""
Spec Generator - Generates a feature specification template from a name and description.
Produces a complete spec document with all required sections pre-filled with
guidance prompts. Output can be markdown or structured JSON.
No external dependencies - uses only Python standard library.
"""
import argparse
import json
import sys
import textwrap
from datetime import date
from pathlib import Path
from typing import Dict, Any
SPEC_TEMPLATE = """\
# Spec: {name}
**Author:** [your name]
**Date:** {date}
**Status:** Draft
**Reviewers:** [list reviewers]
**Related specs:** [links to related specs, or "None"]
---
## Context
{context_prompt}
---
## Functional Requirements
_Use RFC 2119 keywords: MUST, MUST NOT, SHOULD, SHOULD NOT, MAY._
_Each requirement is a single, testable statement. Number sequentially._
- FR-1: The system MUST [describe required behavior].
- FR-2: The system MUST [describe another required behavior].
- FR-3: The system SHOULD [describe recommended behavior].
- FR-4: The system MAY [describe optional behavior].
- FR-5: The system MUST NOT [describe prohibited behavior].
---
## Non-Functional Requirements
### Performance
- NFR-P1: [Operation] MUST complete in < [threshold] (p95) under [conditions].
- NFR-P2: [Operation] SHOULD handle [throughput] requests per second.
### Security
- NFR-S1: All data in transit MUST be encrypted via TLS 1.2+.
- NFR-S2: The system MUST rate-limit [operation] to [limit] per [period] per [scope].
### Accessibility
- NFR-A1: [UI component] MUST meet WCAG 2.1 AA standards.
- NFR-A2: Error messages MUST be announced to screen readers.
### Scalability
- NFR-SC1: The system SHOULD handle [number] concurrent [entities].
### Reliability
- NFR-R1: The [service] MUST maintain [percentage]% uptime.
---
## Acceptance Criteria
_Write in Given/When/Then (Gherkin) format._
_Each criterion MUST reference at least one FR-* or NFR-*._
### AC-1: [Descriptive name] (FR-1)
Given [precondition]
When [action]
Then [expected result]
And [additional assertion]
### AC-2: [Descriptive name] (FR-2)
Given [precondition]
When [action]
Then [expected result]
### AC-3: [Descriptive name] (NFR-S2)
Given [precondition]
When [action]
Then [expected result]
And [additional assertion]
---
## Edge Cases
_For every external dependency (API, database, file system, user input), specify at least one failure scenario._
- EC-1: [Input/condition] -> [expected behavior].
- EC-2: [Input/condition] -> [expected behavior].
- EC-3: [External service] is unavailable -> [expected behavior].
- EC-4: [Concurrent/race condition] -> [expected behavior].
- EC-5: [Boundary value] -> [expected behavior].
---
## API Contracts
_Define request/response shapes using TypeScript-style notation._
_Cover all endpoints referenced in functional requirements._
### [METHOD] [endpoint]
Request:
```typescript
interface [Name]Request {{
field: string; // Description, constraints
optional?: number; // Default: [value]
}}
```
Success Response ([status code]):
```typescript
interface [Name]Response {{
id: string;
field: string;
createdAt: string; // ISO 8601
}}
```
Error Response ([status code]):
```typescript
interface [Name]Error {{
error: "[ERROR_CODE]";
message: string;
}}
```
---
## Data Models
_Define all entities referenced in requirements._
### [Entity Name]
| Field | Type | Constraints |
|-------|------|-------------|
| id | UUID | Primary key, auto-generated |
| [field] | [type] | [constraints] |
| createdAt | timestamp | UTC, immutable |
| updatedAt | timestamp | UTC, auto-updated |
---
## Out of Scope
_Explicit exclusions prevent scope creep. If someone asks for these during implementation, point them here._
- OS-1: [Feature/capability] — [reason for exclusion or link to future spec].
- OS-2: [Feature/capability] — [reason for exclusion].
- OS-3: [Feature/capability] — deferred to [version/sprint].
---
## Open Questions
_Track unresolved questions here. Each must be resolved before status moves to "Approved"._
- [ ] Q1: [Question] — Owner: [name], Due: [date]
- [ ] Q2: [Question] — Owner: [name], Due: [date]
"""
def generate_context_prompt(description: str) -> str:
"""Generate a context section prompt based on the provided description."""
if description:
return textwrap.dedent(f"""\
{description}
_Expand this context section to include:_
_- Why does this feature exist? What problem does it solve?_
_- What is the business motivation? (link to user research, support tickets, metrics)_
_- What is the current state? (what exists today, what pain points exist)_
_- 2-4 paragraphs maximum._""")
return textwrap.dedent("""\
_Why does this feature exist? What problem does it solve? What is the business
motivation? Include links to user research, support tickets, or metrics that
justify this work. 2-4 paragraphs maximum._""")
def generate_spec(name: str, description: str) -> str:
"""Generate a spec document from name and description."""
context_prompt = generate_context_prompt(description)
return SPEC_TEMPLATE.format(
name=name,
date=date.today().isoformat(),
context_prompt=context_prompt,
)
def generate_spec_json(name: str, description: str) -> Dict[str, Any]:
"""Generate structured JSON representation of the spec template."""
return {
"spec": {
"title": f"Spec: {name}",
"metadata": {
"author": "[your name]",
"date": date.today().isoformat(),
"status": "Draft",
"reviewers": [],
"related_specs": [],
},
"context": description or "[Describe why this feature exists]",
"functional_requirements": [
{"id": "FR-1", "keyword": "MUST", "description": "[describe required behavior]"},
{"id": "FR-2", "keyword": "MUST", "description": "[describe another required behavior]"},
{"id": "FR-3", "keyword": "SHOULD", "description": "[describe recommended behavior]"},
{"id": "FR-4", "keyword": "MAY", "description": "[describe optional behavior]"},
{"id": "FR-5", "keyword": "MUST NOT", "description": "[describe prohibited behavior]"},
],
"non_functional_requirements": {
"performance": [
{"id": "NFR-P1", "description": "[operation] MUST complete in < [threshold]"},
],
"security": [
{"id": "NFR-S1", "description": "All data in transit MUST be encrypted via TLS 1.2+"},
],
"accessibility": [
{"id": "NFR-A1", "description": "[UI component] MUST meet WCAG 2.1 AA"},
],
"scalability": [
{"id": "NFR-SC1", "description": "[system] SHOULD handle [N] concurrent [entities]"},
],
"reliability": [
{"id": "NFR-R1", "description": "[service] MUST maintain [N]% uptime"},
],
},
"acceptance_criteria": [
{
"id": "AC-1",
"name": "[descriptive name]",
"references": ["FR-1"],
"given": "[precondition]",
"when": "[action]",
"then": "[expected result]",
},
],
"edge_cases": [
{"id": "EC-1", "condition": "[input/condition]", "behavior": "[expected behavior]"},
],
"api_contracts": [
{
"method": "[METHOD]",
"endpoint": "[/api/path]",
"request_fields": [{"name": "field", "type": "string", "constraints": "[description]"}],
"success_response": {"status": 200, "fields": []},
"error_response": {"status": 400, "fields": []},
},
],
"data_models": [
{
"name": "[Entity]",
"fields": [
{"name": "id", "type": "UUID", "constraints": "Primary key, auto-generated"},
],
},
],
"out_of_scope": [
{"id": "OS-1", "description": "[feature/capability]", "reason": "[reason]"},
],
"open_questions": [],
},
"metadata": {
"generated_by": "spec_generator.py",
"feature_name": name,
"feature_description": description,
},
}
def main():
parser = argparse.ArgumentParser(
description="Generate a feature specification template from a name and description.",
epilog="Example: python spec_generator.py --name 'User Auth' --description 'OAuth 2.0 login flow'",
)
parser.add_argument(
"--name",
required=True,
help="Feature name (used as spec title)",
)
parser.add_argument(
"--description",
default="",
help="Brief feature description (used to seed the context section)",
)
parser.add_argument(
"--output",
"-o",
default=None,
help="Output file path (default: stdout)",
)
parser.add_argument(
"--format",
choices=["md", "json"],
default="md",
help="Output format: md (markdown) or json (default: md)",
)
parser.add_argument(
"--json",
action="store_true",
dest="json_flag",
help="Shorthand for --format json",
)
args = parser.parse_args()
output_format = "json" if args.json_flag else args.format
if output_format == "json":
result = generate_spec_json(args.name, args.description)
output = json.dumps(result, indent=2)
else:
output = generate_spec(args.name, args.description)
if args.output:
out_path = Path(args.output)
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_text(output, encoding="utf-8")
print(f"Spec template written to {out_path}", file=sys.stderr)
else:
print(output)
sys.exit(0)
if __name__ == "__main__":
main()


@@ -0,0 +1,461 @@
#!/usr/bin/env python3
"""
Spec Validator - Validates a feature specification for completeness and quality.
Checks that a spec document contains all required sections, uses RFC 2119 keywords
correctly, has acceptance criteria in Given/When/Then format, and scores overall
completeness from 0-100.
Sections checked:
- Context, Functional Requirements, Non-Functional Requirements
- Acceptance Criteria, Edge Cases, API Contracts, Data Models, Out of Scope
Exit codes: 0 = pass, 1 = warnings, 2 = critical (or --strict with score < 80)
No external dependencies - uses only Python standard library.
"""
import argparse
import json
import re
import sys
from pathlib import Path
from typing import Dict, List, Any
# Section definitions: (key, display_name, required_header_patterns, weight)
SECTIONS = [
("context", "Context", [r"^##\s+Context"], 10),
("functional_requirements", "Functional Requirements", [r"^##\s+Functional\s+Requirements"], 15),
("non_functional_requirements", "Non-Functional Requirements", [r"^##\s+Non-Functional\s+Requirements"], 10),
("acceptance_criteria", "Acceptance Criteria", [r"^##\s+Acceptance\s+Criteria"], 20),
("edge_cases", "Edge Cases", [r"^##\s+Edge\s+Cases"], 10),
("api_contracts", "API Contracts", [r"^##\s+API\s+Contracts"], 10),
("data_models", "Data Models", [r"^##\s+Data\s+Models"], 10),
("out_of_scope", "Out of Scope", [r"^##\s+Out\s+of\s+Scope"], 10),
("metadata", "Metadata (Author/Date/Status)", [r"\*\*Author:\*\*", r"\*\*Date:\*\*", r"\*\*Status:\*\*"], 5),
]
RFC_KEYWORDS = ["MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY"]
# Patterns that indicate placeholder/unfilled content
PLACEHOLDER_PATTERNS = [
r"\[your\s+name\]",
r"\[list\s+reviewers\]",
r"\[describe\s+",
r"\[input/condition\]",
r"\[precondition\]",
r"\[action\]",
r"\[expected\s+result\]",
r"\[feature/capability\]",
r"\[operation\]",
r"\[threshold\]",
r"\[UI\s+component\]",
r"\[service\]",
r"\[percentage\]",
r"\[number\]",
r"\[METHOD\]",
r"\[endpoint\]",
r"\[Name\]",
r"\[Entity\s+Name\]",
r"\[type\]",
r"\[constraints\]",
r"\[field\]",
r"\[reason\]",
]
class SpecValidator:
"""Validates a spec document for completeness and quality."""
def __init__(self, content: str, file_path: str = ""):
self.content = content
self.file_path = file_path
self.lines = content.split("\n")
self.findings: List[Dict[str, Any]] = []
self.section_scores: Dict[str, Dict[str, Any]] = {}
def validate(self) -> Dict[str, Any]:
"""Run all validation checks and return results."""
self._check_sections_present()
self._check_functional_requirements()
self._check_acceptance_criteria()
self._check_edge_cases()
self._check_rfc_keywords()
self._check_api_contracts()
self._check_data_models()
self._check_out_of_scope()
self._check_placeholders()
self._check_traceability()
total_score = self._calculate_score()
return {
"file": self.file_path,
"score": total_score,
"grade": self._score_to_grade(total_score),
"sections": self.section_scores,
"findings": self.findings,
"summary": self._build_summary(total_score),
}
def _add_finding(self, severity: str, section: str, message: str):
"""Record a validation finding."""
self.findings.append({
"severity": severity, # "error", "warning", "info"
"section": section,
"message": message,
})
def _find_section_content(self, header_pattern: str) -> str:
"""Extract content between a section header and the next ## header."""
in_section = False
section_lines = []
for line in self.lines:
if re.match(header_pattern, line, re.IGNORECASE):
in_section = True
continue
if in_section and re.match(r"^##\s+", line):
break
if in_section:
section_lines.append(line)
return "\n".join(section_lines)
def _check_sections_present(self):
"""Check that all required sections exist."""
for key, name, patterns, weight in SECTIONS:
found = False
for pattern in patterns:
for line in self.lines:
if re.search(pattern, line, re.IGNORECASE):
found = True
break
if found:
break
if found:
self.section_scores[key] = {"name": name, "present": True, "score": weight, "max": weight}
else:
self.section_scores[key] = {"name": name, "present": False, "score": 0, "max": weight}
self._add_finding("error", key, f"Missing section: {name}")
def _check_functional_requirements(self):
"""Validate functional requirements format and content."""
content = self._find_section_content(r"^##\s+Functional\s+Requirements")
if not content.strip():
return
fr_pattern = re.compile(r"-\s+FR-(\d+):")
matches = fr_pattern.findall(content)
if not matches:
self._add_finding("error", "functional_requirements", "No numbered requirements found (expected FR-N: format)")
if "functional_requirements" in self.section_scores:
self.section_scores["functional_requirements"]["score"] = max(
0, self.section_scores["functional_requirements"]["score"] - 10
)
return
fr_count = len(matches)
if fr_count < 3:
self._add_finding("warning", "functional_requirements", f"Only {fr_count} requirements found. Most features need 3+.")
# Check for RFC keywords
has_keyword = False
for kw in RFC_KEYWORDS:
if kw in content:
has_keyword = True
break
if not has_keyword:
self._add_finding("warning", "functional_requirements", "No RFC 2119 keywords (MUST/SHOULD/MAY) found.")
def _check_acceptance_criteria(self):
"""Validate acceptance criteria use Given/When/Then format."""
content = self._find_section_content(r"^##\s+Acceptance\s+Criteria")
if not content.strip():
return
ac_pattern = re.compile(r"###\s+AC-(\d+):")
matches = ac_pattern.findall(content)
if not matches:
self._add_finding("error", "acceptance_criteria", "No numbered acceptance criteria found (expected ### AC-N: format)")
if "acceptance_criteria" in self.section_scores:
self.section_scores["acceptance_criteria"]["score"] = max(
0, self.section_scores["acceptance_criteria"]["score"] - 15
)
return
ac_count = len(matches)
# Check Given/When/Then
given_count = len(re.findall(r"(?i)\bgiven\b", content))
when_count = len(re.findall(r"(?i)\bwhen\b", content))
then_count = len(re.findall(r"(?i)\bthen\b", content))
if given_count < ac_count:
self._add_finding("warning", "acceptance_criteria",
f"Found {ac_count} criteria but only {given_count} 'Given' clauses. Each AC needs Given/When/Then.")
if when_count < ac_count:
self._add_finding("warning", "acceptance_criteria",
f"Found {ac_count} criteria but only {when_count} 'When' clauses.")
if then_count < ac_count:
self._add_finding("warning", "acceptance_criteria",
f"Found {ac_count} criteria but only {then_count} 'Then' clauses.")
# Check for FR references
fr_refs = re.findall(r"\(FR-\d+", content)
if not fr_refs:
self._add_finding("warning", "acceptance_criteria",
"No acceptance criteria reference functional requirements (expected (FR-N) in title).")
def _check_edge_cases(self):
"""Validate edge cases section."""
content = self._find_section_content(r"^##\s+Edge\s+Cases")
if not content.strip():
return
ec_pattern = re.compile(r"-\s+EC-(\d+):")
matches = ec_pattern.findall(content)
if not matches:
self._add_finding("warning", "edge_cases", "No numbered edge cases found (expected EC-N: format)")
elif len(matches) < 3:
self._add_finding("warning", "edge_cases", f"Only {len(matches)} edge cases. Consider failure modes for each external dependency.")
def _check_rfc_keywords(self):
"""Check RFC 2119 keywords are used consistently (capitalized)."""
# Look for lowercase must/should/may that might be intended as RFC keywords
context_content = self._find_section_content(r"^##\s+Functional\s+Requirements")
context_content += self._find_section_content(r"^##\s+Non-Functional\s+Requirements")
for kw in ["must", "should", "may"]:
# Find lowercase usage in requirement-like sentences
pattern = rf"(?:system|service|API|endpoint)\s+{kw}\s+"
if re.search(pattern, context_content):
self._add_finding("warning", "rfc_keywords",
f"Found lowercase '{kw}' in requirements. RFC 2119 keywords should be UPPERCASE: {kw.upper()}")
def _check_api_contracts(self):
"""Validate API contracts section."""
content = self._find_section_content(r"^##\s+API\s+Contracts")
if not content.strip():
return
# Check for at least one endpoint definition
has_endpoint = bool(re.search(r"(GET|POST|PUT|PATCH|DELETE)\s+/", content))
if not has_endpoint:
self._add_finding("warning", "api_contracts", "No HTTP method + path found (expected e.g., POST /api/endpoint)")
# Check for request/response definitions
has_interface = bool(re.search(r"interface\s+\w+", content))
if not has_interface:
self._add_finding("info", "api_contracts", "No TypeScript interfaces found. Consider defining request/response shapes.")
def _check_data_models(self):
"""Validate data models section."""
content = self._find_section_content(r"^##\s+Data\s+Models")
if not content.strip():
return
# Check for table format
has_table = bool(re.search(r"\|.*\|.*\|", content))
if not has_table:
self._add_finding("warning", "data_models", "No table-formatted data models found. Use | Field | Type | Constraints | format.")
def _check_out_of_scope(self):
"""Validate out of scope section."""
content = self._find_section_content(r"^##\s+Out\s+of\s+Scope")
if not content.strip():
return
os_pattern = re.compile(r"-\s+OS-(\d+):")
matches = os_pattern.findall(content)
if not matches:
self._add_finding("warning", "out_of_scope", "No numbered exclusions found (expected OS-N: format)")
elif len(matches) < 2:
self._add_finding("info", "out_of_scope", "Only 1 exclusion listed. Consider what was deliberately left out.")
def _check_placeholders(self):
"""Check for unfilled placeholder text."""
placeholder_count = 0
for pattern in PLACEHOLDER_PATTERNS:
matches = re.findall(pattern, self.content, re.IGNORECASE)
placeholder_count += len(matches)
if placeholder_count > 0:
self._add_finding("warning", "placeholders",
f"Found {placeholder_count} placeholder(s) that need to be filled in (e.g., [your name], [describe ...]).")
# Deduct from overall score proportionally
for key in self.section_scores:
if self.section_scores[key]["present"]:
deduction = min(3, self.section_scores[key]["score"])
self.section_scores[key]["score"] = max(0, self.section_scores[key]["score"] - deduction)
def _check_traceability(self):
"""Check that acceptance criteria reference functional requirements."""
ac_content = self._find_section_content(r"^##\s+Acceptance\s+Criteria")
fr_content = self._find_section_content(r"^##\s+Functional\s+Requirements")
if not ac_content.strip() or not fr_content.strip():
return
# Extract FR IDs
fr_ids = set(re.findall(r"FR-(\d+)", fr_content))
# Extract FR references from AC
ac_fr_refs = set(re.findall(r"FR-(\d+)", ac_content))
unreferenced = fr_ids - ac_fr_refs
if unreferenced:
unreferenced_list = ", ".join(f"FR-{i}" for i in sorted(unreferenced))
self._add_finding("warning", "traceability",
f"Functional requirements without acceptance criteria: {unreferenced_list}")
def _calculate_score(self) -> int:
"""Calculate the total completeness score."""
total = sum(s["score"] for s in self.section_scores.values())
maximum = sum(s["max"] for s in self.section_scores.values())
if maximum == 0:
return 0
# Apply finding-based deductions
error_count = sum(1 for f in self.findings if f["severity"] == "error")
warning_count = sum(1 for f in self.findings if f["severity"] == "warning")
base_score = round((total / maximum) * 100)
deduction = (error_count * 5) + (warning_count * 2)
return max(0, min(100, base_score - deduction))
@staticmethod
def _score_to_grade(score: int) -> str:
"""Convert score to letter grade."""
if score >= 90:
return "A"
if score >= 80:
return "B"
if score >= 70:
return "C"
if score >= 60:
return "D"
return "F"
def _build_summary(self, score: int) -> str:
"""Build human-readable summary."""
errors = [f for f in self.findings if f["severity"] == "error"]
warnings = [f for f in self.findings if f["severity"] == "warning"]
infos = [f for f in self.findings if f["severity"] == "info"]
lines = [
f"Spec Completeness Score: {score}/100 (Grade: {self._score_to_grade(score)})",
f"Errors: {len(errors)}, Warnings: {len(warnings)}, Info: {len(infos)}",
"",
]
if errors:
lines.append("ERRORS (must fix):")
for e in errors:
lines.append(f" [{e['section']}] {e['message']}")
lines.append("")
if warnings:
lines.append("WARNINGS (should fix):")
for w in warnings:
lines.append(f" [{w['section']}] {w['message']}")
lines.append("")
if infos:
lines.append("INFO:")
for i in infos:
lines.append(f" [{i['section']}] {i['message']}")
lines.append("")
# Section breakdown
lines.append("Section Breakdown:")
for key, data in self.section_scores.items():
status = "PRESENT" if data["present"] else "MISSING"
lines.append(f" {data['name']}: {data['score']}/{data['max']} ({status})")
return "\n".join(lines)
def format_human(result: Dict[str, Any]) -> str:
"""Format validation result for human reading."""
lines = [
"=" * 60,
"SPEC VALIDATION REPORT",
"=" * 60,
"",
]
if result["file"]:
lines.append(f"File: {result['file']}")
lines.append("")
lines.append(result["summary"])
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(
description="Validate a feature specification for completeness and quality.",
epilog="Example: python spec_validator.py --file spec.md --strict",
)
parser.add_argument(
"--file",
"-f",
required=True,
help="Path to the spec markdown file",
)
parser.add_argument(
"--strict",
action="store_true",
help="Exit with code 2 if score is below 80",
)
parser.add_argument(
"--json",
action="store_true",
dest="json_flag",
help="Output results as JSON",
)
args = parser.parse_args()
file_path = Path(args.file)
if not file_path.exists():
print(f"Error: File not found: {file_path}", file=sys.stderr)
sys.exit(2)
content = file_path.read_text(encoding="utf-8")
if not content.strip():
print(f"Error: File is empty: {file_path}", file=sys.stderr)
sys.exit(2)
validator = SpecValidator(content, str(file_path))
result = validator.validate()
if args.json_flag:
print(json.dumps(result, indent=2))
else:
print(format_human(result))
# Determine exit code
score = result["score"]
has_errors = any(f["severity"] == "error" for f in result["findings"])
has_warnings = any(f["severity"] == "warning" for f in result["findings"])
if args.strict and score < 80:
sys.exit(2)
elif has_errors:
sys.exit(2)
elif has_warnings:
sys.exit(1)
else:
sys.exit(0)
if __name__ == "__main__":
main()
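The exit-code policy at the end of `main()` can be restated as a small standalone function (the sample scores are hypothetical):

```python
# Standalone restatement of the validator's exit-code policy:
# --strict enforces a floor of 80 before error/warning severity is considered.
def exit_code(score: int, has_errors: bool, has_warnings: bool, strict: bool) -> int:
    if strict and score < 80:
        return 2
    if has_errors:
        return 2
    if has_warnings:
        return 1
    return 0

print(exit_code(85, False, True, strict=False))  # 1: warnings only
print(exit_code(75, False, False, strict=True))  # 2: below the strict floor
print(exit_code(95, False, False, strict=True))  # 0: clean pass
```

This makes the script CI-friendly: a pipeline can gate merges on exit code 0, or tolerate 1 while failing on 2.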


@@ -0,0 +1,431 @@
#!/usr/bin/env python3
"""
Test Extractor - Extracts test case stubs from a feature specification.
Parses acceptance criteria (Given/When/Then) and edge cases from a spec
document, then generates test stubs for the specified framework.
Supported frameworks: pytest, jest, go-test
Exit codes: 0 = success, 1 = warnings (some criteria unparseable), 2 = critical error
No external dependencies - uses only Python standard library.
"""
import argparse
import json
import re
import sys
import textwrap
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
class SpecParser:
"""Parses spec documents to extract testable criteria."""
def __init__(self, content: str):
self.content = content
self.lines = content.split("\n")
def extract_acceptance_criteria(self) -> List[Dict[str, Any]]:
"""Extract AC-N blocks with Given/When/Then clauses."""
criteria = []
ac_pattern = re.compile(r"###\s+AC-(\d+):\s*(.+?)(?:\s*\(([^)]+)\))?\s*$")
in_ac = False
current_ac: Optional[Dict[str, Any]] = None
body_lines: List[str] = []
for line in self.lines:
match = ac_pattern.match(line)
if match:
# Save previous AC
if current_ac is not None:
current_ac["body"] = "\n".join(body_lines).strip()
self._parse_gwt(current_ac)
criteria.append(current_ac)
ac_id = int(match.group(1))
name = match.group(2).strip()
refs = match.group(3).strip() if match.group(3) else ""
current_ac = {
"id": f"AC-{ac_id}",
"name": name,
"references": [r.strip() for r in refs.split(",") if r.strip()] if refs else [],
"given": "",
"when": "",
"then": [],
"body": "",
}
body_lines = []
in_ac = True
elif in_ac:
# Check if we hit another ## section
if re.match(r"^##\s+", line) and not re.match(r"^###\s+", line):
in_ac = False
if current_ac is not None:
current_ac["body"] = "\n".join(body_lines).strip()
self._parse_gwt(current_ac)
criteria.append(current_ac)
current_ac = None
else:
body_lines.append(line)
# Don't forget the last one
if current_ac is not None:
current_ac["body"] = "\n".join(body_lines).strip()
self._parse_gwt(current_ac)
criteria.append(current_ac)
return criteria
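The `ac_pattern` heading regex can be exercised in isolation; the sample headings below are hypothetical:

```python
import re

# Same heading pattern used by extract_acceptance_criteria: captures the AC
# number, the name, and an optional parenthesized reference list.
ac_pattern = re.compile(r"###\s+AC-(\d+):\s*(.+?)(?:\s*\(([^)]+)\))?\s*$")

m = ac_pattern.match("### AC-3: User can reset password (FR-2, FR-5)")
print(m.group(1), "|", m.group(2), "|", m.group(3))
# 3 | User can reset password | FR-2, FR-5

m2 = ac_pattern.match("### AC-1: Valid login")
print(m2.group(3))  # None (references are optional)
```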
def extract_edge_cases(self) -> List[Dict[str, Any]]:
"""Extract EC-N edge case items."""
edge_cases = []
ec_pattern = re.compile(r"-\s+EC-(\d+):\s*(.+?)(?:\s*->\s*|\s*→\s*)(.+)")
in_section = False
for line in self.lines:
if re.match(r"^##\s+Edge\s+Cases", line, re.IGNORECASE):
in_section = True
continue
if in_section and re.match(r"^##\s+", line):
break
if in_section:
match = ec_pattern.match(line.strip())
if match:
edge_cases.append({
"id": f"EC-{match.group(1)}",
"condition": match.group(2).strip().rstrip("."),
"behavior": match.group(3).strip().rstrip("."),
})
return edge_cases
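A condensed form of the EC bullet pattern (accepting either an ASCII `->` or a Unicode `→` separator) can be checked against a hypothetical bullet:

```python
import re

# Condensed EC bullet pattern: id, condition, separator arrow, expected behavior.
# The sample bullet is hypothetical.
ec_pattern = re.compile(r"-\s+EC-(\d+):\s*(.+?)\s*(?:->|→)\s*(.+)")

m = ec_pattern.match("- EC-2: Upstream API times out -> Retry once, then surface a 503")
print(m.group(1))  # 2
print(m.group(2))  # Upstream API times out
print(m.group(3))  # Retry once, then surface a 503
```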
def extract_spec_title(self) -> str:
"""Extract the spec title from the first H1."""
for line in self.lines:
match = re.match(r"^#\s+(?:Spec:\s*)?(.+)", line)
if match:
return match.group(1).strip()
return "UnknownFeature"
@staticmethod
def _parse_gwt(ac: Dict[str, Any]):
"""Parse Given/When/Then from the AC body text."""
body = ac["body"]
lines = body.split("\n")
current_section = None
for line in lines:
stripped = line.strip()
if not stripped:
continue
lower = stripped.lower()
if lower.startswith("given "):
current_section = "given"
ac["given"] = stripped[6:].strip()
elif lower.startswith("when "):
current_section = "when"
ac["when"] = stripped[5:].strip()
elif lower.startswith("then "):
current_section = "then"
ac["then"].append(stripped[5:].strip())
elif lower.startswith("and "):
if current_section == "then":
ac["then"].append(stripped[4:].strip())
elif current_section == "given":
ac["given"] += " AND " + stripped[4:].strip()
elif current_section == "when":
ac["when"] += " AND " + stripped[4:].strip()
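The `And` continuation rules in `_parse_gwt` can be sketched without the class machinery; an `And` clause attaches to whichever section was seen last (the clause texts are hypothetical):

```python
# Minimal re-creation of _parse_gwt's section tracking and "And" handling.
ac = {"given": "", "when": "", "then": []}
section = None
clauses = [
    "Given a logged-in user",
    "And a valid session",
    "When the user logs out",
    "Then the session is destroyed",
    "And the user is redirected",
]
for clause in clauses:
    lower = clause.lower()
    if lower.startswith("given "):
        section, ac["given"] = "given", clause[6:]
    elif lower.startswith("when "):
        section, ac["when"] = "when", clause[5:]
    elif lower.startswith("then "):
        section = "then"
        ac["then"].append(clause[5:])
    elif lower.startswith("and ") and section == "then":
        ac["then"].append(clause[4:])
    elif lower.startswith("and ") and section in ("given", "when"):
        ac[section] += " AND " + clause[4:]

print(ac["given"])  # a logged-in user AND a valid session
print(ac["then"])   # ['the session is destroyed', 'the user is redirected']
```

`Then` clauses accumulate as a list (each becomes its own assertion comment), while `Given`/`When` continuations are folded into a single string.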
def _sanitize_name(name: str) -> str:
"""Convert a human-readable name to a valid function/method name."""
# Remove parenthetical references like (FR-1)
name = re.sub(r"\([^)]*\)", "", name)
# Replace non-alphanumeric with underscore
name = re.sub(r"[^a-zA-Z0-9]+", "_", name)
# Remove leading/trailing underscores
name = name.strip("_").lower()
return name or "unnamed"
def _to_pascal_case(name: str) -> str:
"""Convert to PascalCase for Go test names."""
parts = _sanitize_name(name).split("_")
return "".join(p.capitalize() for p in parts if p)
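The two name helpers compose as follows (self-contained copies; the sample inputs are hypothetical):

```python
import re

# Copies of the spec-name helpers above, renamed to avoid the private prefix.
def sanitize_name(name: str) -> str:
    name = re.sub(r"\([^)]*\)", "", name)       # drop "(FR-1)"-style references
    name = re.sub(r"[^a-zA-Z0-9]+", "_", name)  # collapse non-alphanumerics
    return name.strip("_").lower() or "unnamed"

def to_pascal_case(name: str) -> str:
    return "".join(p.capitalize() for p in sanitize_name(name).split("_") if p)

print(sanitize_name("User can reset password (FR-2)"))  # user_can_reset_password
print(to_pascal_case("AC-1 valid login"))               # Ac1ValidLogin
print(sanitize_name("(refs only)"))                     # unnamed
```

The `or "unnamed"` fallback matters: a name that is nothing but a parenthesized reference would otherwise sanitize to an empty, invalid identifier.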
class PytestGenerator:
"""Generates pytest test stubs."""
def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
class_name = "Test" + _to_pascal_case(title)
lines = [
'"""',
f"Test suite for: {title}",
f"Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
"",
"All tests are stubs — implement the test body to make them pass.",
'"""',
"",
"import pytest",
"",
"",
f"class {class_name}:",
f' """Tests for {title}."""',
"",
]
for ac in criteria:
method_name = f"test_{ac['id'].lower().replace('-', '')}_{_sanitize_name(ac['name'])}"
docstring = f'{ac["id"]}: {ac["name"]}'
ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""
lines.append(f" def {method_name}(self):")
lines.append(f' """{docstring}{ref_str}"""')
if ac["given"]:
lines.append(f" # Given {ac['given']}")
if ac["when"]:
lines.append(f" # When {ac['when']}")
for t in ac["then"]:
lines.append(f" # Then {t}")
lines.append(' raise NotImplementedError("Implement this test")')
lines.append("")
if edge_cases:
lines.append(" # --- Edge Cases ---")
lines.append("")
for ec in edge_cases:
method_name = f"test_{ec['id'].lower().replace('-', '')}_{_sanitize_name(ec['condition'])}"
lines.append(f" def {method_name}(self):")
lines.append(f' """{ec["id"]}: {ec["condition"]} -> {ec["behavior"]}"""')
lines.append(f" # Condition: {ec['condition']}")
lines.append(f" # Expected: {ec['behavior']}")
lines.append(' raise NotImplementedError("Implement this test")')
lines.append("")
return "\n".join(lines)
class JestGenerator:
"""Generates Jest/Vitest test stubs (TypeScript)."""
def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
lines = [
"/**",
f" * Test suite for: {title}",
f" * Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
" *",
" * All tests are stubs — implement the test body to make them pass.",
" */",
"",
f'describe("{title}", () => {{',
]
for ac in criteria:
ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""
test_name = f"{ac['id']}: {ac['name']}{ref_str}"
lines.append(f' it("{test_name}", () => {{')
if ac["given"]:
lines.append(f" // Given {ac['given']}")
if ac["when"]:
lines.append(f" // When {ac['when']}")
for t in ac["then"]:
lines.append(f" // Then {t}")
lines.append("")
lines.append(' throw new Error("Not implemented");')
lines.append(" });")
lines.append("")
if edge_cases:
lines.append(" // --- Edge Cases ---")
lines.append("")
for ec in edge_cases:
test_name = f"{ec['id']}: {ec['condition']}"
lines.append(f' it("{test_name}", () => {{')
lines.append(f" // Condition: {ec['condition']}")
lines.append(f" // Expected: {ec['behavior']}")
lines.append("")
lines.append(' throw new Error("Not implemented");')
lines.append(" });")
lines.append("")
lines.append("});")
lines.append("")
return "\n".join(lines)
class GoTestGenerator:
"""Generates Go test stubs."""
def generate(self, title: str, criteria: List[Dict], edge_cases: List[Dict]) -> str:
package_name = _sanitize_name(title).split("_")[0] or "feature"
lines = [
f"package {package_name}_test",
"",
"import (",
'\t"testing"',
")",
"",
f"// Test suite for: {title}",
f"// Auto-generated from spec. {len(criteria)} acceptance criteria, {len(edge_cases)} edge cases.",
f"// All tests are stubs — implement the test body to make them pass.",
"",
]
for ac in criteria:
func_name = "Test" + _to_pascal_case(ac["id"] + " " + ac["name"])
ref_str = f" [{', '.join(ac['references'])}]" if ac["references"] else ""
lines.append(f"// {ac['id']}: {ac['name']}{ref_str}")
lines.append(f"func {func_name}(t *testing.T) {{")
if ac["given"]:
lines.append(f"\t// Given {ac['given']}")
if ac["when"]:
lines.append(f"\t// When {ac['when']}")
for then_clause in ac["then"]:
lines.append(f"\t// Then {then_clause}")
lines.append("")
lines.append('\tt.Fatal("Not implemented")')
lines.append("}")
lines.append("")
if edge_cases:
lines.append("// --- Edge Cases ---")
lines.append("")
for ec in edge_cases:
func_name = "Test" + _to_pascal_case(ec["id"] + " " + ec["condition"])
lines.append(f"// {ec['id']}: {ec['condition']} -> {ec['behavior']}")
lines.append(f"func {func_name}(t *testing.T) {{")
lines.append(f"\t// Condition: {ec['condition']}")
lines.append(f"\t// Expected: {ec['behavior']}")
lines.append("")
lines.append('\tt.Fatal("Not implemented")')
lines.append("}")
lines.append("")
return "\n".join(lines)
GENERATORS = {
"pytest": PytestGenerator,
"jest": JestGenerator,
"go-test": GoTestGenerator,
}
FILE_EXTENSIONS = {
"pytest": ".py",
"jest": ".test.ts",
"go-test": "_test.go",
}
def main():
parser = argparse.ArgumentParser(
description="Extract test case stubs from a feature specification.",
epilog="Example: python test_extractor.py --file spec.md --framework pytest --output tests/test_feature.py",
)
parser.add_argument(
"--file",
"-f",
required=True,
help="Path to the spec markdown file",
)
parser.add_argument(
"--framework",
choices=list(GENERATORS.keys()),
default="pytest",
help="Target test framework (default: pytest)",
)
parser.add_argument(
"--output",
"-o",
default=None,
help="Output file path (default: stdout)",
)
parser.add_argument(
"--json",
action="store_true",
dest="json_flag",
help="Output extracted criteria as JSON instead of test code",
)
args = parser.parse_args()
file_path = Path(args.file)
if not file_path.exists():
print(f"Error: File not found: {file_path}", file=sys.stderr)
sys.exit(2)
content = file_path.read_text(encoding="utf-8")
if not content.strip():
print(f"Error: File is empty: {file_path}", file=sys.stderr)
sys.exit(2)
spec_parser = SpecParser(content)
title = spec_parser.extract_spec_title()
criteria = spec_parser.extract_acceptance_criteria()
edge_cases = spec_parser.extract_edge_cases()
if not criteria and not edge_cases:
print("Error: No acceptance criteria or edge cases found in spec.", file=sys.stderr)
sys.exit(2)
warnings = []
for ac in criteria:
if not ac["given"] and not ac["when"]:
warnings.append(f"{ac['id']}: Could not parse Given/When/Then — check format.")
if args.json_flag:
result = {
"spec_title": title,
"framework": args.framework,
"acceptance_criteria": criteria,
"edge_cases": edge_cases,
"warnings": warnings,
"counts": {
"acceptance_criteria": len(criteria),
"edge_cases": len(edge_cases),
"total_test_cases": len(criteria) + len(edge_cases),
},
}
output = json.dumps(result, indent=2)
else:
generator_class = GENERATORS[args.framework]
generator = generator_class()
output = generator.generate(title, criteria, edge_cases)
if args.output:
out_path = Path(args.output)
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_text(output, encoding="utf-8")
total = len(criteria) + len(edge_cases)
print(f"Generated {total} test stubs -> {out_path}", file=sys.stderr)
else:
print(output)
if warnings:
for w in warnings:
print(f"Warning: {w}", file=sys.stderr)
sys.exit(1)
sys.exit(0)
if __name__ == "__main__":
main()


@@ -174,6 +174,7 @@ nav:
- "Agent Workflow Designer": skills/engineering/agent-workflow-designer.md
- "API Design Reviewer": skills/engineering/api-design-reviewer.md
- "API Test Suite Builder": skills/engineering/api-test-suite-builder.md
- "Browser Automation": skills/engineering/browser-automation.md
- "Changelog Generator": skills/engineering/changelog-generator.md
- "CI/CD Pipeline Builder": skills/engineering/ci-cd-pipeline-builder.md
- "Codebase Onboarding": skills/engineering/codebase-onboarding.md
@@ -195,6 +196,7 @@ nav:
- "Runbook Generator": skills/engineering/runbook-generator.md
- "Skill Security Auditor": skills/engineering/skill-security-auditor.md
- "Skill Tester": skills/engineering/skill-tester.md
- "Spec-Driven Workflow": skills/engineering/spec-driven-workflow.md
- "Tech Debt Tracker": skills/engineering/tech-debt-tracker.md
- "Terraform Patterns": skills/engineering/terraform-patterns.md
- "Helm Chart Builder": skills/engineering/helm-chart-builder.md