skill-seekers-reference

firefrost-gaming/skill-seekers-reference

Author	SHA1	Message	Date

yusyus

0fa99641aa

style: fix pre-existing ruff format issues in 5 files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-21 21:24:21 +03:00

yusyus

1d3d7389d7

fix: sanitize_url crashes on Python 3.14 strict urlparse (#284)

Python 3.14's urlparse() raises ValueError on URLs with unencoded
brackets that look like malformed IPv6 (e.g. http://[fdaa:x:x:x::x
from docs.openclaw.ai llms-full.txt). sanitize_url() called urlparse()
BEFORE encoding brackets, so it crashed before it could fix them.

Fix: catch ValueError from urlparse, encode ALL brackets, then retry.
This is safe because if urlparse rejected the brackets, they are NOT
valid IPv6 host literals and should be encoded anyway.
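
The catch-encode-retry flow described above might look roughly like the sketch below. It is an illustration only, not the repository's actual sanitize_url(): the names sanitize_url_sketch and _encode_brackets are hypothetical, and the choice to encode brackets only in the path and query when parsing succeeds is an assumption.

```python
from urllib.parse import urlparse, urlunparse


def _encode_brackets(text: str) -> str:
    # Hypothetical helper: percent-encode square brackets.
    return text.replace("[", "%5B").replace("]", "%5D")


def sanitize_url_sketch(url: str) -> str:
    # Hypothetical stand-in for sanitize_url(); not the project's code.
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse rejected the brackets, so they cannot form a valid
        # IPv6 host literal: encode every bracket, then parse again.
        parsed = urlparse(_encode_brackets(url))
    # Assumption: brackets in the netloc that survived parsing belong to a
    # real IPv6 literal and are kept; path and query brackets are encoded.
    cleaned = parsed._replace(
        path=_encode_brackets(parsed.path),
        query=_encode_brackets(parsed.query),
    )
    return urlunparse(cleaned)


if __name__ == "__main__":
    for candidate in (
        "http://[fdaa:x:x:x::x/docs",
        "https://example.com/a[1].html",
        "http://[::1]:8080/x",
    ):
        print(candidate, "->", sanitize_url_sketch(candidate))
```

Retrying after encoding keeps a well-formed IPv6 host such as [::1] intact, because that URL never reaches the except branch.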

Also fixed Discord e2e tests to skip gracefully on network issues.

Fixes #284

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-21 00:30:48 +03:00

yusyus

2ef6e59d06

fix: stop blindly appending /index.html.md to non-.md URLs (#277)

The previous fix (a82cf69) only addressed anchor fragment stripping but
left the fundamental problem: _convert_to_md_urls() blindly appended
/index.html.md to ALL non-.md URLs from llms.txt. This only works for
Docusaurus sites — for sites like Discord docs it generates mass 404s.

Changes:
- _convert_to_md_urls() now strips anchors and deduplicates only,
  preserving original URLs as-is instead of appending /index.html.md
- New _has_md_extension() helper uses urlparse().path.endswith(".md")
  instead of error-prone ".md" in url substring matching
- Fixed ".md" in url checks at 4 locations (lines 465, 554, 716, 775)
- Removed 24 lines of dead commented-out code
- Added real-world e2e test against docs.discord.com (no mocks)
- Updated unit tests for new behavior (32 tests)
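
As a rough illustration of the first two changes listed above, the sketch below contrasts a path-based extension check with substring matching, and shows anchor stripping with order-preserving deduplication. The function names has_md_extension_sketch and strip_anchors_and_dedupe are hypothetical, and the use of urldefrag is an assumption; the repository's _has_md_extension() and _convert_to_md_urls() may differ in detail.

```python
from urllib.parse import urldefrag, urlparse


def has_md_extension_sketch(url: str) -> bool:
    # Look only at the path component, so ".md" appearing in the host,
    # a directory name, or the query string does not count.
    return urlparse(url).path.endswith(".md")


def strip_anchors_and_dedupe(urls: list[str]) -> list[str]:
    # Drop "#fragment" parts and keep the first occurrence of each URL,
    # leaving the URL itself otherwise unchanged (no suffix appended).
    seen: set[str] = set()
    result: list[str] = []
    for url in urls:
        base, _fragment = urldefrag(url)
        if base not in seen:
            seen.add(base)
            result.append(base)
    return result


if __name__ == "__main__":
    tricky = "https://example.com/.md-files/intro"
    print(".md" in tricky)                  # substring match says True
    print(has_md_extension_sketch(tricky))  # path check says False
    print(strip_anchors_and_dedupe([
        "https://example.com/guide#install",
        "https://example.com/guide#usage",
        "https://example.com/api.md",
    ]))
```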

Fixes #277

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-20 23:44:35 +03:00

3 Commits