feat: Add 6 new languages to codebase analysis system (C#, Go, Rust, Java, Ruby, PHP)
Expands language support from 3 to 9 languages across entire codebase scraping system.

**New Languages Added:**
- C# (Unity/.NET support) - classes, methods, properties, async/await, XML docs
- Go - structs, functions, methods with receivers, multiple return values
- Rust - structs, functions, async functions, impl blocks
- Java - classes, methods, inheritance, interfaces, generics
- Ruby - classes, methods, inheritance, predicate methods
- PHP - classes, methods, namespaces, inheritance

**Code Analysis (code_analyzer.py):**
- Added 6 new language analyzers (~1000 lines)
- Regex-based parsers inspired by official language specs
- Extract classes, functions, signatures, async detection
- Comprehensive comment extraction for all languages

**Dependency Analysis (dependency_analyzer.py):**
- Added 6 new import extractors (~300 lines)
- C#: using statements, static using, aliases
- Go: import blocks, aliases
- Rust: use statements, curly braces, crate/super
- Java: import statements, static imports, wildcards
- Ruby: require, require_relative, load
- PHP: require/include, namespace use

**File Extensions (codebase_scraper.py):**
- Added mappings: .cs, .go, .rs, .java, .rb, .php

**Test Coverage:**
- Added 24 new tests for 6 languages (4 tests each)
- Added 19 dependency analyzer tests
- Added 6 language detection tests
- Total: 118 tests, 100% passing ✅

**Credits:**
- Regex patterns based on official language specifications:
  - Microsoft C# Language Specification
  - Go Language Specification
  - Rust Language Reference
  - Oracle Java Language Specification
  - Ruby Documentation
  - PHP Language Reference
- NetworkX for graph algorithms

**Issues Resolved:**
- Closes #166 (C# support request)
- Closes #140 (E1.7 MCP tool scrape_codebase)

**Test Results:**
- test_code_analyzer.py: 54 tests passing
- test_dependency_analyzer.py: 43 tests passing
- test_codebase_scraper.py: 21 tests passing
- Total execution: ~0.41s

🚀 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
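
The regex-based import extraction described under **Dependency Analysis** could look roughly like the sketch below. The patterns and the function names `extract_java_imports` and `extract_ruby_requires` are assumptions made for illustration; the commit's actual regexes in dependency_analyzer.py are not shown on this page and may differ.

```python
import re

# Hypothetical patterns, not the ones from dependency_analyzer.py.
# Java: plain imports, static imports, and wildcard imports.
JAVA_IMPORT = re.compile(
    r'^\s*import\s+(static\s+)?([\w.]+(?:\.\*)?)\s*;',
    re.MULTILINE,
)

# Ruby: require, require_relative, and load with a quoted path.
RUBY_REQUIRE = re.compile(
    r'''^\s*(require_relative|require|load)\s*\(?\s*['"]([^'"]+)['"]''',
    re.MULTILINE,
)


def extract_java_imports(source):
    """Return the imported names found in Java source text."""
    return [m.group(2) for m in JAVA_IMPORT.finditer(source)]


def extract_ruby_requires(source):
    """Return (keyword, path) pairs found in Ruby source text."""
    return [(m.group(1), m.group(2)) for m in RUBY_REQUIRE.finditer(source)]


java_src = """
import java.util.List;
import static org.junit.Assert.*;
import java.io.*;
"""

ruby_src = """
require 'json'
require_relative "lib/helper"
load 'tasks.rb'
"""

print(extract_java_imports(java_src))
print(extract_ruby_requires(ruby_src))
```

Line-anchored regexes like these are cheap and need no parser, but they will also match import-like text inside comments or string literals, which is a known trade-off of this approach.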
@@ -12,10 +12,16 @@ Usage:

Features:
- File tree walking with .gitignore support
- Multi-language code analysis (Python, JavaScript, C++)
- Multi-language code analysis (9 languages: Python, JavaScript/TypeScript, C/C++, C#, Go, Rust, Java, Ruby, PHP)
- API reference generation
- Comment extraction
- Dependency graph analysis
- Configurable depth levels

Credits:
- Language parsing patterns inspired by official language specifications
- NetworkX for dependency graph analysis: https://networkx.org/
- pathspec for .gitignore support: https://pypi.org/project/pathspec/
"""

import os
@@ -61,6 +67,13 @@ LANGUAGE_EXTENSIONS = {
    '.h': 'C++',
    '.hpp': 'C++',
    '.hxx': 'C++',
    '.c': 'C',
    '.cs': 'C#',
    '.go': 'Go',
    '.rs': 'Rust',
    '.java': 'Java',
    '.rb': 'Ruby',
    '.php': 'PHP',
}

# Default directories to exclude
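
As an illustration of how an extension table like LANGUAGE_EXTENSIONS can drive language detection, here is a minimal sketch. It copies only the entries visible in the hunk above (the real table has more), and the helper name `detect_language` and the 'Unknown' fallback are assumptions, not names taken from codebase_scraper.py.

```python
from pathlib import Path

# Only the entries visible in the diff hunk; the full mapping is larger.
LANGUAGE_EXTENSIONS = {
    '.h': 'C++',
    '.hpp': 'C++',
    '.hxx': 'C++',
    '.c': 'C',
    '.cs': 'C#',
    '.go': 'Go',
    '.rs': 'Rust',
    '.java': 'Java',
    '.rb': 'Ruby',
    '.php': 'PHP',
}


def detect_language(path):
    """Hypothetical helper: map a file path to a language by its extension.

    The suffix is lower-cased so that 'Main.JAVA' and 'main.java' agree.
    Files with no known extension fall back to 'Unknown' (an assumed default).
    """
    return LANGUAGE_EXTENSIONS.get(Path(path).suffix.lower(), 'Unknown')


for name in ['Player.cs', 'cmd/server/main.go', 'src/lib.rs', 'Makefile']:
    print(name, '->', detect_language(name))
```

A plain dictionary lookup keeps adding a language to a one-line change, which matches the shape of the hunk above.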