fix: filter non-integer metadata from GitHub languages API response (#322)

PyGithub's get_languages() returns the raw API JSON, which in some environments includes keys with non-integer values (e.g., "url"), causing a TypeError in sum(). The scraper now keeps only integer values before calculating percentages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@@ -519,6 +519,13 @@ class GitHubScraper:
 
         try:
             languages = self.repo.get_languages()
+            # Filter out non-integer metadata (e.g., "url" key from some API configurations)
+            non_lang_keys = {k for k, v in languages.items() if not isinstance(v, int)}
+            if non_lang_keys:
+                logger.debug(
+                    f"Filtered non-language keys from API response: {non_lang_keys}"
+                )
+            languages = {k: v for k, v in languages.items() if isinstance(v, int)}
             total_bytes = sum(languages.values())
 
             self.extracted_data["languages"] = {
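
The filtering step in this change can be tried in isolation. Below is a minimal sketch, not the scraper's actual code: the function name language_percentages, the sample mapping, and its placeholder URL are hypothetical, and the bool exclusion and zero-total guard are additions that the commit itself does not make.

```python
def language_percentages(languages):
    """Return {language: percent} from a get_languages()-style mapping."""
    # Keep only integer byte counts. bool is excluded explicitly because
    # it is a subclass of int (an extra assumption, not in the commit).
    byte_counts = {
        name: count
        for name, count in languages.items()
        if isinstance(count, int) and not isinstance(count, bool)
    }
    total_bytes = sum(byte_counts.values())
    if total_bytes == 0:
        # Nothing to report; also avoids dividing by zero.
        return {}
    return {
        name: round(count / total_bytes * 100, 2)
        for name, count in byte_counts.items()
    }


# Hypothetical response containing a stray non-integer "url" entry.
sample = {
    "Python": 7500,
    "Shell": 2500,
    "url": "https://example.invalid/languages",
}
print(language_percentages(sample))  # {'Python': 75.0, 'Shell': 25.0}
```

Without the filter, sum() over the sample's values would raise a TypeError when it reaches the string stored under "url".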