fix: filter non-integer metadata from GitHub languages API response (#322)

PyGithub's get_languages() returns the raw API JSON, which in some
environments includes metadata entries with non-integer values (e.g., a
"url" string). Those values caused a TypeError in sum(). The scraper now
keeps only integer values before calculating percentages.
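
A minimal, standalone sketch of the failure and the filtering described
above. The function name language_percentages, the sample payload, and the
rounding are hypothetical and not taken from the scraper. The explicit bool
check is an extra assumption (bool is a subclass of int in Python); the
commit itself only tests isinstance(v, int).

```python
def language_percentages(languages):
    """Return {language: percent} computed from integer byte counts only."""
    # Keep integer byte counts. Excluding bool is an assumption of this
    # sketch, not part of the commit: isinstance(True, int) is True in Python.
    byte_counts = {
        name: count
        for name, count in languages.items()
        if isinstance(count, int) and not isinstance(count, bool)
    }
    total_bytes = sum(byte_counts.values())
    if total_bytes == 0:
        # Nothing to report, and dividing by zero must be avoided.
        return {}
    return {
        name: round(count / total_bytes * 100, 2)
        for name, count in byte_counts.items()
    }


# Hypothetical payload shaped like the problematic response: two byte counts
# plus a string-valued "url" entry (placeholder URL).
raw = {
    "Python": 7500,
    "Shell": 2500,
    "url": "https://api.github.com/repos/example/example/languages",
}

try:
    sum(raw.values())  # unfiltered: int + str is not defined
except TypeError as exc:
    print(f"unfiltered sum failed: {exc}")

print(language_percentages(raw))  # {'Python': 75.0, 'Shell': 25.0}
```

Filtering on the value type rather than on a fixed list of key names means
any other string-valued metadata is dropped as well, without needing to know
the key in advance.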

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
yusyus
2026-03-26 23:44:52 +03:00
parent 336ab6aaac
commit d71c1d3aa3
4 changed files with 36 additions and 1 deletions

@@ -519,6 +519,13 @@ class GitHubScraper:
try:
    languages = self.repo.get_languages()
    # Filter out non-integer metadata (e.g., "url" key from some API configurations)
    non_lang_keys = {k for k, v in languages.items() if not isinstance(v, int)}
    if non_lang_keys:
        logger.debug(
            f"Filtered non-language keys from API response: {non_lang_keys}"
        )
    languages = {k: v for k, v in languages.items() if isinstance(v, int)}
    total_bytes = sum(languages.values())
    self.extracted_data["languages"] = {