fix: Complete fix for Issue #219 - All three problems resolved

**Problem #1: Large File Encoding Error** (FIXED)
- Add large file download support via download_url
- Detect encoding='none' for files >1MB
- Download via GitHub raw URL instead of API
- Handles ccxt/ccxt's 1.4MB CHANGELOG.md successfully
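
A minimal, self-contained sketch of the fallback described above. It is not the code from this commit: `read_file_content` and the injectable `fetch` callable are hypothetical names used so the example runs without network access, and the `content` object only mimics the `encoding`, `content`, and `download_url` attributes the commit relies on.

```python
import base64
from types import SimpleNamespace
from typing import Callable, Optional


def read_file_content(content, fetch: Callable[[str], str]) -> Optional[str]:
    """Return file text, falling back to a raw download for large files.

    `fetch` is a hypothetical hook (for example, a thin wrapper around
    requests.get(url).text) so the logic can be exercised offline.
    """
    encoding = getattr(content, "encoding", None)
    if encoding in (None, "none"):
        # Large file: the API response carries no inline payload.
        url = getattr(content, "download_url", None)
        if not url:
            return None
        return fetch(url)
    # Regular file: the payload is base64-encoded inline.
    return base64.b64decode(content.content).decode("utf-8")


# Usage with stand-in objects instead of real API responses.
large = SimpleNamespace(encoding="none", content="",
                        download_url="https://example.invalid/CHANGELOG.md")
print(read_file_content(large, lambda url: f"downloaded {url}"))
```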

**Problem #2: Missing CLI Enhancement Flags** (FIXED)
- Add --enhance, --enhance-local, --api-key to main.py github_parser
- Add flag forwarding in CLI dispatcher
- Fixes 'unrecognized arguments' error
- Users can now use: skill-seekers github --repo owner/repo --enhance-local
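
The flag registration and forwarding can be illustrated roughly as follows. This is a sketch under assumptions, not the contents of main.py: `build_parser` and `forward_github_args` are hypothetical helpers, and only the flag names and the `github` subcommand come from the commit message.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a `github` subcommand that accepts the new flags."""
    parser = argparse.ArgumentParser(prog="skill-seekers")
    subparsers = parser.add_subparsers(dest="command", required=True)
    github_parser = subparsers.add_parser("github")
    github_parser.add_argument("--repo", required=True)
    github_parser.add_argument("--enhance", action="store_true")
    github_parser.add_argument("--enhance-local", action="store_true")
    github_parser.add_argument("--api-key")
    return parser


def forward_github_args(args: argparse.Namespace) -> List[str]:
    """Rebuild the argv that a dispatcher would hand to the scraper."""
    argv = ["--repo", args.repo]
    if args.enhance:
        argv.append("--enhance")
    if args.enhance_local:
        argv.append("--enhance-local")
    if args.api_key:
        argv.extend(["--api-key", args.api_key])
    return argv


args = build_parser().parse_args(
    ["github", "--repo", "owner/repo", "--enhance-local"]
)
print(forward_github_args(args))
```

If the subparser does not declare a flag, argparse rejects it with an "unrecognized arguments" error before any forwarding happens, which matches the symptom described above.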

**Problem #3: Custom API Endpoint Support** (FIXED)
- Support ANTHROPIC_BASE_URL environment variable
- Support ANTHROPIC_AUTH_TOKEN (alternative to ANTHROPIC_API_KEY)
- Fix ThinkingBlock.text error with newer Anthropic SDK
- Find TextBlock in response content array (handles thinking blocks)
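
A rough sketch of both ideas, written so it runs without the Anthropic SDK installed. `client_kwargs` and `first_text` are hypothetical names, and preferring `ANTHROPIC_API_KEY` over `ANTHROPIC_AUTH_TOKEN` when both are set is an assumption, not something the commit message states. The resulting dictionary is meant to be passed as keyword arguments (`api_key`, `base_url`) to the SDK client constructor.

```python
import os
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Mapping, Optional


def client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect client settings from the environment."""
    kwargs: Dict[str, str] = {}
    # Assumption: the API key wins if both variables are present.
    api_key = env.get("ANTHROPIC_API_KEY") or env.get("ANTHROPIC_AUTH_TOKEN")
    if api_key:
        kwargs["api_key"] = api_key
    base_url = env.get("ANTHROPIC_BASE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs


def first_text(blocks: Iterable[Any]) -> Optional[str]:
    """Return the text of the first block that has a string `text` attribute.

    Thinking blocks carry no `text` attribute, so reading `content[0].text`
    fails when such a block comes first; scanning the list avoids that.
    """
    for block in blocks:
        text = getattr(block, "text", None)
        if isinstance(text, str):
            return text
    return None


# Stand-in objects that mimic a response with a leading thinking block.
blocks = [
    SimpleNamespace(type="thinking", thinking="(reasoning)"),
    SimpleNamespace(type="text", text="enhanced SKILL.md body"),
]
print(first_text(blocks))
```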

**Changes**:
- src/skill_seekers/cli/enhance_skill.py:
  - Support custom base_url parameter
  - Support both ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN
  - Iterate through content blocks to find text (handles ThinkingBlock)

- src/skill_seekers/cli/main.py:
  - Add --enhance, --enhance-local, --api-key to github_parser
  - Forward flags to github_scraper.py in dispatcher

- src/skill_seekers/cli/github_scraper.py:
  - Add large file detection (encoding=None/"none")
  - Download via download_url with requests
  - Log file size and download progress

- tests/test_github_scraper.py:
  - Add test_get_file_content_large_file
  - Add test_extract_changelog_large_file
  - All 31 tests passing 

**Credits**:
- Thanks to @XGCoder for detailed bug report
- Thanks to @gorquan for local fixes and guidance

Fixes #219

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
yusyus
2026-01-01 20:57:03 +03:00
parent 58286f454a
commit f2faebb8d5
4 changed files with 124 additions and 6 deletions

@@ -355,6 +355,26 @@ class GitHubScraper:
                    logger.warning(f"Symlink {file_path} has no target")
                    return None

            # Handle large files (encoding="none") - download via URL
            # GitHub API doesn't base64-encode files >1MB
            if hasattr(content, 'encoding') and content.encoding in [None, "none"]:
                download_url = getattr(content, 'download_url', None)
                file_size = getattr(content, 'size', 0)
                if download_url:
                    logger.info(f"File {file_path} is large ({file_size:,} bytes), downloading via URL...")
                    try:
                        import requests
                        response = requests.get(download_url, timeout=30)
                        response.raise_for_status()
                        return response.text
                    except Exception as e:
                        logger.warning(f"Failed to download {file_path} from {download_url}: {e}")
                        return None
                else:
                    logger.warning(f"File {file_path} has no download URL (encoding={content.encoding})")
                    return None

            # Handle regular files - decode content
            try:
                if isinstance(content.decoded_content, bytes):