Files
claude-skills-reference/engineering/browser-automation/references/playwright_browser_api.md
Reza Rezvani 97952ccbee feat(engineering): add browser-automation and spec-driven-workflow skills
browser-automation (564-line SKILL.md, 3 scripts, 3 references):
- Web scraping, form filling, screenshot capture, data extraction
- Anti-detection patterns, cookie/session management, dynamic content
- scraping_toolkit.py, form_automation_builder.py, anti_detection_checker.py
- NOT testing (that's playwright-pro) — this is automation & scraping

spec-driven-workflow (586-line SKILL.md, 3 scripts, 3 references):
- Spec-first development: write spec BEFORE code
- Bounded autonomy rules, 6-phase workflow, self-review checklist
- spec_generator.py, spec_validator.py, test_extractor.py
- Pairs with tdd-guide for red-green-refactor after spec

Updated engineering plugin.json (31 → 33 skills).
Added both to mkdocs.yml nav and generated docs pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 12:57:18 +01:00

13 KiB

Playwright Browser API Reference (Automation Focus)

This reference covers Playwright's Python async API for browser automation tasks — NOT testing. For test-specific APIs (assertions, fixtures, test runners), see playwright-pro.

Browser Launch & Context

Launching the Browser

from playwright.async_api import async_playwright

async with async_playwright() as p:
    # Chromium (recommended for most automation)
    browser = await p.chromium.launch(headless=True)

    # Firefox (better for some anti-detection scenarios)
    browser = await p.firefox.launch(headless=True)

    # WebKit (Safari engine — useful for Apple-specific sites)
    browser = await p.webkit.launch(headless=True)

Launch options:

Option Type Default Purpose
headless bool True Run without visible window
slow_mo int 0 Milliseconds to slow each operation (debugging)
proxy dict None Proxy server configuration
args list [] Additional Chromium flags
downloads_path str None Directory for downloads
channel str None Browser channel: "chrome", "msedge"

Browser Contexts (Session Isolation)

Browser contexts are isolated environments within a single browser instance. Each context has its own cookies, localStorage, and cache. Use them instead of launching multiple browsers.

# Create isolated context
context = await browser.new_context(
    viewport={"width": 1920, "height": 1080},
    user_agent="Mozilla/5.0 ...",
    locale="en-US",
    timezone_id="America/New_York",
    geolocation={"latitude": 40.7128, "longitude": -74.0060},
    permissions=["geolocation"],
)

# Multiple contexts share one browser (resource efficient)
context_a = await browser.new_context()  # User A session
context_b = await browser.new_context()  # User B session

Storage State (Session Persistence)

# Save state after login (cookies + localStorage)
await context.storage_state(path="auth_state.json")

# Restore state in new context
context = await browser.new_context(storage_state="auth_state.json")

Page Navigation

Basic Navigation

page = await context.new_page()

# Navigate with different wait strategies
await page.goto("https://example.com")                          # Default: "load"
await page.goto("https://example.com", wait_until="domcontentloaded")  # Faster
await page.goto("https://example.com", wait_until="networkidle")       # Wait for network quiet
await page.goto("https://example.com", timeout=30000)                  # Custom timeout (ms)

wait_until options:

  • "load" — wait for the load event (all resources loaded)
  • "domcontentloaded" — DOM is ready, images/styles may still load
  • "networkidle" — no network requests for 500ms (best for SPAs)
  • "commit" — response received, before any rendering

Wait Strategies

# Wait for a specific element to appear
await page.wait_for_selector("div.content", state="visible")
await page.wait_for_selector("div.loading", state="hidden")     # Wait for loading to finish
await page.wait_for_selector("table tbody tr", state="attached") # In DOM but maybe not visible

# Wait for URL change
await page.wait_for_url("**/dashboard**")
await page.wait_for_url(re.compile(r"/dashboard/\d+"))

# Wait for specific network response
async with page.expect_response("**/api/data*") as resp_info:
    await page.click("button.load")
response = await resp_info.value
json_data = await response.json()

# Wait for page load state
await page.wait_for_load_state("networkidle")

# Fixed wait (use sparingly — prefer the methods above)
await page.wait_for_timeout(1000)  # milliseconds

Navigation History

await page.go_back()
await page.go_forward()
await page.reload()

Element Interaction

Finding Elements

# Single element (returns first match)
element = await page.query_selector("css=div.product")
element = await page.query_selector("xpath=//div[@class='product']")

# Multiple elements
elements = await page.query_selector_all("div.product")

# Locator API (recommended — auto-waits, re-queries on each action)
locator = page.locator("div.product")
count = await locator.count()
first = locator.first
nth = locator.nth(2)

Locator vs query_selector:

  • query_selector — returns an ElementHandle at a point in time. Can go stale if DOM changes.
  • locator — returns a Locator that re-queries each time you interact with it. Preferred for reliability.

Clicking

await page.click("button.submit")
await page.click("a:has-text('Next')")
await page.dblclick("div.editable")
await page.click("button", position={"x": 10, "y": 10})  # Click at offset
await page.click("button", force=True)  # Skip actionability checks
await page.click("button", modifiers=["Shift"])  # With modifier key

Text Input

# Fill (clears existing content first)
await page.fill("input#email", "user@example.com")

# Type (simulates keystroke-by-keystroke input — slower, more realistic)
await page.type("input#search", "query text", delay=50)  # 50ms between keys

# Press specific keys
await page.press("input#search", "Enter")
await page.press("body", "Control+a")

Dropdowns & Select

# Native <select> element
await page.select_option("select#country", value="US")
await page.select_option("select#country", label="United States")
await page.select_option("select#tags", value=["tag1", "tag2"])  # Multi-select

# Custom dropdown (non-native)
await page.click("div.dropdown-trigger")
await page.click("li.option:has-text('United States')")

Checkboxes & Radio Buttons

await page.check("input#agree")
await page.uncheck("input#newsletter")
is_checked = await page.is_checked("input#agree")

File Upload

# Standard file input
await page.set_input_files("input[type='file']", "/path/to/file.pdf")
await page.set_input_files("input[type='file']", ["/path/a.pdf", "/path/b.pdf"])

# Clear file selection
await page.set_input_files("input[type='file']", [])

# Non-standard upload (drag-and-drop zones)
async with page.expect_file_chooser() as fc_info:
    await page.click("div.upload-zone")
file_chooser = await fc_info.value
await file_chooser.set_files("/path/to/file.pdf")

Hover & Focus

await page.hover("div.menu-item")
await page.focus("input#search")

Data Extraction

Text Content

# Get text content of an element
text = await page.text_content("h1.title")
inner_text = await page.inner_text("div.description")  # Visible text only
inner_html = await page.inner_html("div.content")       # HTML markup

# Get attribute
href = await page.get_attribute("a.link", "href")
src = await page.get_attribute("img.photo", "src")

JavaScript Evaluation

# Evaluate in page context
title = await page.evaluate("document.title")
scroll_height = await page.evaluate("document.body.scrollHeight")

# Evaluate on a specific element
text = await page.eval_on_selector("h1", "el => el.textContent")
texts = await page.eval_on_selector_all("li", "els => els.map(e => e.textContent.trim())")

# Complex extraction
data = await page.evaluate("""
    () => {
        const rows = document.querySelectorAll('table tbody tr');
        return Array.from(rows).map(row => {
            const cells = row.querySelectorAll('td');
            return {
                name: cells[0]?.textContent.trim(),
                value: cells[1]?.textContent.trim(),
            };
        });
    }
""")

Screenshots & PDF

# Full page screenshot
await page.screenshot(path="page.png", full_page=True)

# Viewport screenshot
await page.screenshot(path="viewport.png")

# Element screenshot
await page.locator("div.chart").screenshot(path="chart.png")

# PDF (Chromium only)
await page.pdf(path="page.pdf", format="A4", print_background=True)

# Screenshot as bytes (for processing without saving)
buffer = await page.screenshot()

Network Interception

Monitoring Requests

# Listen for all responses
page.on("response", lambda response: print(f"{response.status} {response.url}"))

# Wait for a specific API call
async with page.expect_response("**/api/products*") as resp:
    await page.click("button.load")
response = await resp.value
data = await response.json()

Blocking Resources (Speed Up Scraping)

# Block images, fonts, and CSS to speed up scraping
await page.route("**/*.{png,jpg,jpeg,gif,svg,woff,woff2,ttf}", lambda route: route.abort())
await page.route("**/*.css", lambda route: route.abort())

# Block specific domains (ads, analytics)
await page.route("**/google-analytics.com/**", lambda route: route.abort())
await page.route("**/facebook.com/**", lambda route: route.abort())

Modifying Requests

# Add custom headers
await page.route("**/*", lambda route: route.continue_(headers={
    **route.request.headers,
    "X-Custom-Header": "value"
}))

# Mock API responses
await page.route("**/api/data", lambda route: route.fulfill(
    status=200,
    content_type="application/json",
    body=json.dumps({"items": []}),
))

Dialog Handling

# Auto-accept all dialogs
page.on("dialog", lambda dialog: dialog.accept())

# Handle specific dialog types
async def handle_dialog(dialog):
    if dialog.type == "confirm":
        await dialog.accept()
    elif dialog.type == "prompt":
        await dialog.accept("my input")
    elif dialog.type == "alert":
        await dialog.dismiss()

page.on("dialog", handle_dialog)

File Downloads

# Wait for download to start
async with page.expect_download() as dl_info:
    await page.click("a.download-link")
download = await dl_info.value

# Save to specific path
await download.save_as("/path/to/downloads/" + download.suggested_filename)

# Get download as bytes
path = await download.path()  # Temp file path

# Set download behavior at context level
context = await browser.new_context(accept_downloads=True)

Frames & Iframes

# Access iframe by selector
frame = page.frame_locator("iframe#content")
await frame.locator("button.submit").click()

# Access frame by name
frame = page.frame(name="editor")

# Access all frames
for frame in page.frames:
    print(frame.url)
# Get all cookies
cookies = await context.cookies()

# Get cookies for specific URL
cookies = await context.cookies(["https://example.com"])

# Add cookies
await context.add_cookies([{
    "name": "session",
    "value": "abc123",
    "domain": "example.com",
    "path": "/",
    "httpOnly": True,
    "secure": True,
}])

# Clear cookies
await context.clear_cookies()

Concurrency Patterns

Multiple Pages in One Context

# Open multiple tabs in the same session
pages = []
for url in urls:
    page = await context.new_page()
    await page.goto(url)
    pages.append(page)

# Process all pages
for page in pages:
    data = await extract_data(page)
    await page.close()

Multiple Contexts for Parallel Sessions

import asyncio

async def scrape_with_context(browser, url):
    context = await browser.new_context(user_agent=random.choice(USER_AGENTS))
    page = await context.new_page()
    await page.goto(url)
    data = await extract_data(page)
    await context.close()
    return data

# Run 5 concurrent scraping tasks
tasks = [scrape_with_context(browser, url) for url in urls[:5]]
results = await asyncio.gather(*tasks)

Init Scripts (Stealth)

Init scripts run before any page script, in every new page/context.

# Remove webdriver flag
await context.add_init_script("""
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
""")

# Override plugins (headless Chrome has empty plugins)
await context.add_init_script("""
    Object.defineProperty(navigator, 'plugins', {
        get: () => [1, 2, 3, 4, 5],
    });
""")

# Override languages
await context.add_init_script("""
    Object.defineProperty(navigator, 'languages', {
        get: () => ['en-US', 'en'],
    });
""")

# From file
await context.add_init_script(path="stealth.js")

Common Automation Patterns

Scrolling

# Scroll to bottom
await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")

# Scroll element into view
await page.locator("div.target").scroll_into_view_if_needed()

# Smooth scroll simulation
await page.evaluate("""
    async () => {
        const delay = ms => new Promise(r => setTimeout(r, ms));
        for (let i = 0; i < document.body.scrollHeight; i += 300) {
            window.scrollTo(0, i);
            await delay(100);
        }
    }
""")

Clipboard Operations

# Copy text
await page.evaluate("navigator.clipboard.writeText('hello')")

# Paste via keyboard
await page.keyboard.press("Control+v")

Shadow DOM

# Playwright pierces open shadow DOM with >> operator
await page.locator("my-component >> .inner-button").click()

# Or use the css= engine with >> for chained piercing
await page.locator("css=host-element >> css=.shadow-child").click()