Files
skill-seekers-reference/tests
yusyus 2ef6e59d06 fix: stop blindly appending /index.html.md to non-.md URLs (#277)
The previous fix (a82cf69) only addressed anchor fragment stripping but
left the fundamental problem: _convert_to_md_urls() blindly appended
/index.html.md to ALL non-.md URLs from llms.txt. This only works for
Docusaurus sites — for sites like Discord docs it generates mass 404s.

Changes:
- _convert_to_md_urls() now strips anchors and deduplicates only,
  preserving original URLs as-is instead of appending /index.html.md
- New _has_md_extension() helper uses urlparse().path.endswith(".md")
  instead of error-prone ".md" in url substring matching
- Fixed ".md" in url checks at 4 locations (lines 465, 554, 716, 775)
- Removed 24 lines of dead commented-out code
- Added real-world e2e test against docs.discord.com (no mocks)
- Updated unit tests for new behavior (32 tests)

Fixes #277

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:44:35 +03:00
..
2026-01-17 17:29:21 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:29:21 +00:00
2026-01-17 17:29:21 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:29:21 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:48:15 +00:00
2026-01-17 17:29:21 +00:00