- Fix ruff format issue in doc_scraper.py
- Add pytest skip markers for browser renderer tests when Playwright is
not installed in CI
- Replace broken Python heredocs in 4 workflow YAML files
(scheduled-updates, vector-db-export, quality-metrics, test-vector-dbs)
with python3 -c calls to fix YAML parsing errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

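The skip markers mentioned above could look roughly like the sketch below. It is an illustration, not the code from this commit: the names PLAYWRIGHT_AVAILABLE, requires_playwright, and test_renders_javascript_page are hypothetical, and only standard pytest and importlib calls are used.

```python
import importlib.util

import pytest

# True when the optional Playwright package can be imported in this environment.
PLAYWRIGHT_AVAILABLE = importlib.util.find_spec("playwright") is not None

# Reusable marker (hypothetical name). Decorate individual browser tests with it,
# or set `pytestmark = requires_playwright` at module level to skip a whole file.
requires_playwright = pytest.mark.skipif(
    not PLAYWRIGHT_AVAILABLE,
    reason="Playwright is not installed; install skill-seekers[browser] to run",
)


@requires_playwright
def test_renders_javascript_page():  # hypothetical test name
    # Imported inside the test so collection still works without Playwright.
    from playwright.sync_api import sync_playwright

    assert callable(sync_playwright)
```

With this shape, CI jobs that lack Playwright report the browser tests as skipped instead of failing at import time.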
New BrowserRenderer class uses Playwright to render JavaScript-heavy
documentation sites (React, Vue SPAs) that return empty HTML shells
when fetched with requests.get(). Activated via the --browser flag
for web scraping.
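A minimal sketch of the rendering idea described above, assuming the Playwright sync API. This is not the actual browser_renderer.py: the method names render, close, and _ensure_browser, the timeout_ms parameter, and the install-on-launch-failure strategy are all assumptions made for illustration.

```python
import subprocess
import sys


class BrowserRenderer:
    """Illustrative sketch only; the real browser_renderer.py may differ."""

    def __init__(self, timeout_ms: int = 30_000):
        self.timeout_ms = timeout_ms  # hypothetical parameter
        self._playwright = None
        self._browser = None  # stays None until the first render (lazy launch)

    def _ensure_browser(self):
        if self._browser is not None:
            return
        # Imported here so this module still loads without the optional extra.
        from playwright.sync_api import Error, sync_playwright

        self._playwright = sync_playwright().start()
        try:
            self._browser = self._playwright.chromium.launch(headless=True)
        except Error:
            # Assumption: a launch failure means Chromium is not downloaded yet,
            # so install it once and retry.
            subprocess.run(
                [sys.executable, "-m", "playwright", "install", "chromium"],
                check=True,
            )
            self._browser = self._playwright.chromium.launch(headless=True)

    def render(self, url: str) -> str:
        """Return the page HTML after client-side JavaScript has run."""
        self._ensure_browser()
        page = self._browser.new_page()
        try:
            page.goto(url, wait_until="networkidle", timeout=self.timeout_ms)
            return page.content()
        finally:
            page.close()

    def close(self):
        if self._browser is not None:
            self._browser.close()
            self._browser = None
        if self._playwright is not None:
            self._playwright.stop()
            self._playwright = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
```

Under these assumptions, a caller would write `with BrowserRenderer() as renderer:` and call `renderer.render(url)` inside the block. Nothing is launched if render() is never called, so constructing the object stays cheap when most pages are static.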
- browser_renderer.py: Playwright wrapper with lazy browser launch,
automatic Chromium install on first use, and context manager support
- doc_scraper.py: browser_mode config, _render_with_browser() helper,
integrated into scrape_page() and scrape_page_async()
- SPA detection warnings now suggest --browser flag
- Optional dep: pip install "skill-seekers[browser]"
- 14 real e2e tests (actual Chromium, no mocks)
- UML updated: Scrapers class diagram (BrowserRenderer + dependency),
Parsers (DoctorParser), Utilities (Doctor), Components, and new
Browser Rendering sequence diagram (#20)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
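The SPA detection warning mentioned in the list above could be driven by a heuristic along these lines. This is a hypothetical sketch, not the check in doc_scraper.py: the function names, the mount-point ids, and the 200-character threshold are all assumptions.

```python
import re

# Hypothetical markers: mount points commonly emitted by React, Vue, Next.js,
# and Nuxt builds.
SPA_ROOT_PATTERN = re.compile(r'<div[^>]+id=["\'](root|app|__next|__nuxt)["\']', re.I)
# Matches whole script/style/noscript elements, or any single tag.
TAG_PATTERN = re.compile(r"<(script|style|noscript)\b.*?</\1>|<[^>]+>", re.I | re.S)


def looks_like_spa_shell(html: str, min_text_chars: int = 200) -> bool:
    """Guess whether a response is a JavaScript shell with almost no static text."""
    visible_text = " ".join(TAG_PATTERN.sub(" ", html).split())
    return bool(SPA_ROOT_PATTERN.search(html)) and len(visible_text) < min_text_chars


def spa_warning(url: str, html: str):
    """Return a warning suggesting --browser, or None if the page looks static."""
    if looks_like_spa_shell(html):
        return f"{url} looks like a JavaScript-rendered page; try re-running with --browser"
    return None
```

A regex-based check like this is deliberately crude. It only decides whether to print a hint, so false positives cost little.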