firefrost-gaming/claude-code-skills-reference

daymade 2192458ef7 release: add scrapling-skill and fix script compatibility

- add scrapling-skill with validated CLI workflow, diagnostics, packaging, and docs integration
- fix skill-creator package_skill.py so direct script invocation works from repo root
- fix continue-claude-work extract_resume_context.py typing compatibility for local python3
- bump marketplace to 1.39.0 and updated skill versions

2026-03-18 23:08:55 +08:00

Scrapling Troubleshooting

Contents

Installation modes
Verified failure modes
Static vs dynamic fetch choice
WeChat extraction pattern
Smoke test commands

Installation Modes

Use the CLI path as the default:

uv tool install 'scrapling[shell]'

Do not assume uv tool install scrapling is enough for CLI usage. The base package may install the executable wrapper without the optional CLI dependencies.

Verified Failure Modes

1. CLI installed without extras

Symptom:

scrapling --help fails
Output mentions missing click
Output says Scrapling must be installed with extras

Recovery:

uv tool uninstall scrapling
uv tool install 'scrapling[shell]'
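The symptom check above can be sketched in Python. This is a minimal illustration, not part of the skill's scripts: the function names and the hint substrings are assumptions derived from the symptoms listed, and the exact wording of Scrapling's error output may differ between versions.

```python
import shutil
import subprocess

# Substrings taken from the symptoms listed above. The real error text
# may be worded differently (assumption), so treat a match as a hint only.
MISSING_EXTRAS_HINTS = ("click", "extras")


def looks_like_missing_extras(output: str) -> bool:
    """Return True when failed help output resembles the missing-extras case."""
    lowered = output.lower()
    return any(hint in lowered for hint in MISSING_EXTRAS_HINTS)


def check_cli() -> str:
    """Run `scrapling --help` and classify the result.

    Returns one of: 'missing', 'ok', 'needs-extras', 'unknown-failure'.
    """
    if shutil.which("scrapling") is None:
        return "missing"
    result = subprocess.run(
        ["scrapling", "--help"], capture_output=True, text=True, check=False
    )
    if result.returncode == 0:
        return "ok"
    combined = result.stdout + result.stderr
    if looks_like_missing_extras(combined):
        return "needs-extras"
    return "unknown-failure"
```

A result of needs-extras points at the uninstall and reinstall recovery above; any other failure deserves a look at the raw output first.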

2. Browser-backed fetchers not ready

Symptom:

extract fetch or extract stealthy-fetch fails because the Playwright runtime is not installed
Scrapling has not downloaded Chromium or Chrome Headless Shell

Recovery:

scrapling install

Success signals:

A later run of scrapling install reports "The dependencies are already installed"
Browser caches contain both:
- chromium-*
- chromium_headless_shell-*

Typical cache roots:

~/Library/Caches/ms-playwright/ (macOS)
~/.cache/ms-playwright/ (Linux)
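The cache check can also be done programmatically. The sketch below is an assumption-laden illustration: the helper names are hypothetical, only the two cache roots listed above are inspected, and a relocated browser cache is not handled.

```python
from pathlib import Path
from typing import List

# Cache roots and directory prefixes copied from the lists above.
CACHE_ROOTS = (
    Path.home() / "Library" / "Caches" / "ms-playwright",
    Path.home() / ".cache" / "ms-playwright",
)
REQUIRED_PREFIXES = ("chromium-", "chromium_headless_shell-")


def missing_browsers(root: Path) -> List[str]:
    """Return the required prefixes with no matching directory under root."""
    if not root.is_dir():
        return list(REQUIRED_PREFIXES)
    names = [entry.name for entry in root.iterdir() if entry.is_dir()]
    return [
        prefix
        for prefix in REQUIRED_PREFIXES
        if not any(name.startswith(prefix) for name in names)
    ]


def browsers_ready() -> bool:
    """True when at least one known cache root holds both browser builds."""
    return any(not missing_browsers(root) for root in CACHE_ROOTS)
```

If browsers_ready() returns False, running scrapling install as described above is the next step.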

3. Static fetch TLS trust-store failure

Symptom:

extract get fails with curl: (60) SSL certificate problem

Interpretation:

Treat this as a local certificate verification problem first
Do not assume the target URL or Scrapling itself is broken

Recovery:

Retry the same static command with:

--no-verify

Do not make --no-verify the default. Use it only after the failure matches this certificate-verification pattern.
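The rule of retrying with --no-verify only after a matching failure can be expressed as a small guard. This is a sketch under stated assumptions: the function names are hypothetical, and the marker strings are copied from the symptom above rather than from Scrapling's source.

```python
from typing import List

# Marker text copied from the symptom described above.
CERT_FAILURE_CODE = "curl: (60)"
CERT_FAILURE_TEXT = "SSL certificate problem"


def is_cert_failure(stderr: str) -> bool:
    """True only when stderr matches the certificate-verification pattern."""
    return CERT_FAILURE_CODE in stderr and CERT_FAILURE_TEXT in stderr


def retry_args(args: List[str], stderr: str) -> List[str]:
    """Return a copy of args, adding --no-verify only for a cert failure."""
    if is_cert_failure(stderr) and "--no-verify" not in args:
        return list(args) + ["--no-verify"]
    return list(args)
```

Because the flag is added only when the error matches, unrelated failures such as timeouts are retried, if at all, with verification still enabled.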

Static vs Dynamic Fetch Choice

Use this order:

1. extract get
2. extract fetch
3. extract stealthy-fetch

Use extract get when:

The page is mostly server-rendered
The content is likely already present in raw HTML
The target is an article page with a stable content container

Use extract fetch when:

Static HTML does not contain the real content
The site depends on JavaScript rendering
The page content appears only after runtime hydration

Use extract stealthy-fetch when:

extract fetch still fails
The target site shows challenge or anti-bot behavior
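The escalation order above can be sketched as a loop that stops at the first mode that works. The names here are hypothetical, and `attempt` is an assumed caller-supplied callback that runs the command and decides whether the saved output contains the real content; the command shape follows the WeChat example in this document.

```python
from typing import Callable, List, Optional

# Cheapest mode first, as recommended above.
FETCH_ORDER = ("get", "fetch", "stealthy-fetch")


def build_command(mode: str, url: str, output: str) -> List[str]:
    """Build the argv for one `scrapling extract <mode>` invocation."""
    return ["scrapling", "extract", mode, url, output]


def first_working_mode(
    url: str, output: str, attempt: Callable[[List[str]], bool]
) -> Optional[str]:
    """Try each mode in order and return the first one `attempt` accepts."""
    for mode in FETCH_ORDER:
        if attempt(build_command(mode, url, output)):
            return mode
    return None
```

In practice `attempt` might wrap subprocess.run and a content check on the output file; keeping it as a parameter lets the ladder be exercised without network access.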

WeChat Extraction Pattern

For mp.weixin.qq.com public article pages:

Start with extract get
Use the selector #js_content
Validate the saved file immediately

Example:

scrapling extract get 'https://mp.weixin.qq.com/s/ARTICLE_ID?scene=1' article.md -s '#js_content'

Observed behavior:

The static fetch can already contain the real article body
Browser-backed fetch is often unnecessary for article extraction

Smoke Test Commands

Basic diagnosis

python3 scripts/diagnose_scrapling.py

Static extraction smoke test

python3 scripts/diagnose_scrapling.py --url 'https://example.com'

WeChat article smoke test

python3 scripts/diagnose_scrapling.py \
  --url 'https://mp.weixin.qq.com/s/ARTICLE_ID?scene=1' \
  --selector '#js_content'

Dynamic extraction smoke test

python3 scripts/diagnose_scrapling.py \
  --url 'https://example.com' \
  --dynamic

Validate saved output

wc -c article.md
sed -n '1,40p' article.md
rg -n '<title>|js_content|main|rich_media_title' page.html
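Where wc, sed, or rg are unavailable, roughly the same checks can be done in Python. This is a minimal sketch: the function name is hypothetical, it applies all three checks to a single file, and it reuses the marker pattern from the rg command above.

```python
import re
from pathlib import Path
from typing import Dict

# Same alternation as the rg pattern above.
MARKERS = re.compile(r"<title>|js_content|main|rich_media_title")


def summarize(path: Path, head_lines: int = 40) -> Dict[str, object]:
    """Report byte size, the first lines, and line numbers matching MARKERS."""
    data = path.read_bytes()
    # Replace undecodable bytes instead of failing on mixed encodings.
    lines = data.decode("utf-8", errors="replace").splitlines()
    return {
        "bytes": len(data),
        "head": lines[:head_lines],
        "marker_lines": [
            number
            for number, line in enumerate(lines, start=1)
            if MARKERS.search(line)
        ],
    }
```

A very small byte count or an empty marker_lines list for a saved HTML page suggests the fetch returned a shell page rather than the article body.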
