fix: Complete fix for Issue #219 - All three problems resolved

**Problem #1: Large File Encoding Error** ✅ FIXED
- Add large file download support via download_url
- Detect encoding='none' for files >1MB
- Download via GitHub raw URL instead of API
- Handles ccxt/ccxt's 1.4MB CHANGELOG.md successfully

**Problem #2: Missing CLI Enhancement Flags** ✅ FIXED
- Add --enhance, --enhance-local, --api-key to main.py github_parser
- Add flag forwarding in CLI dispatcher
- Fixes 'unrecognized arguments' error
- Users can now use: skill-seekers github --repo owner/repo --enhance-local

**Problem #3: Custom API Endpoint Support** ✅ FIXED
- Support ANTHROPIC_BASE_URL environment variable
- Support ANTHROPIC_AUTH_TOKEN (alternative to ANTHROPIC_API_KEY)
- Fix ThinkingBlock.text error with newer Anthropic SDK
- Find TextBlock in response content array (handles thinking blocks)

**Changes**:
- src/skill_seekers/cli/enhance_skill.py:
  - Support custom base_url parameter
  - Support both ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN
  - Iterate through content blocks to find text (handles ThinkingBlock)
- src/skill_seekers/cli/main.py:
  - Add --enhance, --enhance-local, --api-key to github_parser
  - Forward flags to github_scraper.py in dispatcher
- src/skill_seekers/cli/github_scraper.py:
  - Add large file detection (encoding=None/"none")
  - Download via download_url with requests
  - Log file size and download progress
- tests/test_github_scraper.py:
  - Add test_get_file_content_large_file
  - Add test_extract_changelog_large_file
  - All 31 tests passing ✅

**Credits**:
- Thanks to @XGCoder for detailed bug report
- Thanks to @gorquan for local fixes and guidance

Fixes #219

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
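
The Problem #3 items can be illustrated with a minimal sketch. It is not the code from enhance_skill.py, which is not shown in this section. The `ThinkingBlock` and `TextBlock` classes are hypothetical stand-ins for the SDK's content blocks, `first_text` and `resolve_client_settings` are illustrative names, and the precedence of ANTHROPIC_API_KEY over ANTHROPIC_AUTH_TOKEN is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping, Optional


# Hypothetical stand-ins for SDK content blocks (not the real SDK types).
@dataclass
class ThinkingBlock:
    thinking: str  # note: no `.text` attribute, which is what triggers the error


@dataclass
class TextBlock:
    text: str


def first_text(blocks: Iterable[object]) -> Optional[str]:
    """Return the text of the first block that carries a string `text` attribute."""
    for block in blocks:
        text = getattr(block, "text", None)
        if isinstance(text, str):
            return text
    return None


def resolve_client_settings(env: Mapping[str, str]) -> dict:
    """Pick credentials and an optional custom endpoint from environment values.

    Assumption: ANTHROPIC_API_KEY wins when both variables are set.
    """
    api_key = env.get("ANTHROPIC_API_KEY") or env.get("ANTHROPIC_AUTH_TOKEN")
    return {"api_key": api_key, "base_url": env.get("ANTHROPIC_BASE_URL")}
```

Reading `response.content[0].text` directly fails when the first block is a thinking block, so scanning for the first block with a `text` attribute is the safer pattern. In real use, `resolve_client_settings(os.environ)` would be the natural call.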
2026-01-01 20:57:03 +03:00
parent 58286f454a
commit f2faebb8d5
4 changed files with 124 additions and 6 deletions
--- a/src/skill_seekers/cli/github_scraper.py
+++ b/src/skill_seekers/cli/github_scraper.py
@@ -355,6 +355,26 @@ class GitHubScraper:
                    logger.warning(f"Symlink {file_path} has no target")
                    return None

+            # Handle large files (encoding="none") - download via URL
+            # GitHub API doesn't base64-encode files >1MB
+            if hasattr(content, 'encoding') and content.encoding in [None, "none"]:
+                download_url = getattr(content, 'download_url', None)
+                file_size = getattr(content, 'size', 0)
+
+                if download_url:
+                    logger.info(f"File {file_path} is large ({file_size:,} bytes), downloading via URL...")
+                    try:
+                        import requests
+                        response = requests.get(download_url, timeout=30)
+                        response.raise_for_status()
+                        return response.text
+                    except Exception as e:
+                        logger.warning(f"Failed to download {file_path} from {download_url}: {e}")
+                        return None
+                else:
+                    logger.warning(f"File {file_path} has no download URL (encoding={content.encoding})")
+                    return None
+
            # Handle regular files - decode content
            try:
                if isinstance(content.decoded_content, bytes):
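
The branching added in this hunk can be exercised in isolation with a small sketch. It mirrors the hunk's logic but is not the project's method: `read_content` is a hypothetical helper, the `fetch` callable is injected in place of `requests.get` so the example runs offline, and the `SimpleNamespace` objects stand in for the content objects the scraper receives.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def read_content(content: object, fetch: Callable[[str], str]) -> Optional[str]:
    """Return file text, falling back to a URL download for large files."""
    # Objects without an `encoding` attribute take the regular path,
    # matching the hasattr() guard in the hunk above.
    encoding = getattr(content, "encoding", "base64")
    if encoding in (None, "none"):
        url = getattr(content, "download_url", None)
        if not url:
            return None  # nothing to download from
        try:
            return fetch(url)
        except Exception:
            return None  # download failed; the caller treats this as "no content"
    data = content.decoded_content
    return data.decode("utf-8", errors="replace") if isinstance(data, bytes) else str(data)


# Usage with fake content objects and a fake fetcher (no network access).
small = SimpleNamespace(encoding="base64", decoded_content=b"# Changelog")
large = SimpleNamespace(encoding="none", download_url="https://example.invalid/raw/CHANGELOG.md")
print(read_content(small, fetch=lambda url: ""))
print(read_content(large, fetch=lambda url: f"downloaded {url}"))
```

Injecting the fetcher keeps the fallback testable without network access, which is presumably also why the new tests in tests/test_github_scraper.py can cover the large-file path.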