YusufKaraaslanSpyke
170dd0fd75
feat(C3.9): Add project documentation extraction from markdown files
...
- Scan ALL .md files in project (README, docs/, etc.)
- Smart categorization by folder/filename (overview, architecture, guides, etc.)
- Processing depth: surface=raw copy, deep=parse+summarize, full=AI-enhanced
- AI enhancement at level 2+ adds topic extraction and cross-references
- New "Project Documentation" section in SKILL.md with summaries
- Output to references/documentation/ organized by category
- Default ON, use --skip-docs to disable
- Add skip_docs parameter to MCP scrape_codebase_tool
- Add 15 new tests for markdown documentation features
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com >
2026-01-31 13:54:56 +03:00
yusyus
3408315f40
feat: Add 6 new languages to codebase analysis system (C#, Go, Rust, Java, Ruby, PHP)
...
Expands language support from 3 to 9 languages across entire codebase scraping system.
**New Languages Added:**
- C# (Unity/.NET support) - classes, methods, properties, async/await, XML docs
- Go - structs, functions, methods with receivers, multiple return values
- Rust - structs, functions, async functions, impl blocks
- Java - classes, methods, inheritance, interfaces, generics
- Ruby - classes, methods, inheritance, predicate methods
- PHP - classes, methods, namespaces, inheritance
**Code Analysis (code_analyzer.py):**
- Added 6 new language analyzers (~1000 lines)
- Regex-based parsers inspired by official language specs
- Extract classes, functions, signatures, async detection
- Comprehensive comment extraction for all languages
**Dependency Analysis (dependency_analyzer.py):**
- Added 6 new import extractors (~300 lines)
- C#: using statements, static using, aliases
- Go: import blocks, aliases
- Rust: use statements, curly braces, crate/super
- Java: import statements, static imports, wildcards
- Ruby: require, require_relative, load
- PHP: require/include, namespace use
**File Extensions (codebase_scraper.py):**
- Added mappings: .cs, .go, .rs, .java, .rb, .php
**Test Coverage:**
- Added 24 new tests for 6 languages (4 tests each)
- Added 19 dependency analyzer tests
- Added 6 language detection tests
- Total: 118 tests, 100% passing ✅
**Credits:**
- Regex patterns based on official language specifications:
- Microsoft C# Language Specification
- Go Language Specification
- Rust Language Reference
- Oracle Java Language Specification
- Ruby Documentation
- PHP Language Reference
- NetworkX for graph algorithms
**Issues Resolved:**
- Closes #166 (C# support request)
- Closes #140 (E1.7 MCP tool scrape_codebase)
**Test Results:**
- test_code_analyzer.py: 54 tests passing
- test_dependency_analyzer.py: 43 tests passing
- test_codebase_scraper.py: 21 tests passing
- Total execution: ~0.41s
🚀 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com >
2026-01-02 21:28:21 +03:00
yusyus
ae96526d4b
feat(C2.7): Add standalone codebase-scraper CLI tool
...
- Created src/skill_seekers/cli/codebase_scraper.py (450 lines)
- Standalone tool for analyzing local codebases without GitHub API
- Full .gitignore support using pathspec library
Features:
- Directory tree walking with .gitignore respect
- Multi-language code analysis (Python, JavaScript, TypeScript, C++)
- Language filtering (--languages Python,JavaScript)
- File pattern matching (--file-patterns "*.py,src/**/*.js")
- API reference generation (--build-api-reference)
- Comment extraction (enabled by default)
- Configurable analysis depth (surface/deep/full)
- Smart directory exclusion (node_modules, venv, .git, etc.)
CLI Usage:
skill-seekers-codebase --directory /path/to/repo --output output/codebase/
skill-seekers-codebase --directory . --depth deep --build-api-reference
skill-seekers-codebase --directory . --languages Python,JavaScript
Output:
- code_analysis.json - Complete analysis results
- api_reference/*.md - Generated API documentation (optional)
Tests:
- Created tests/test_codebase_scraper.py with 15 tests
- All tests passing ✅
- Test coverage: Language detection (5 tests), directory exclusion (4 tests),
directory walking (4 tests), .gitignore loading (2 tests)
Dependencies Added:
- pathspec>=0.12.1 - For .gitignore parsing
Entry Point:
- Added skill-seekers-codebase to pyproject.toml
Related Issues:
- Closes #69 (C2.7 Create codebase_scraper.py CLI tool)
- Part of C2 Local Codebase Scraping roadmap (TIER 3)
Files Modified:
- src/skill_seekers/cli/codebase_scraper.py (CREATE - 450 lines)
- tests/test_codebase_scraper.py (CREATE - 160 lines)
- pyproject.toml (+2 lines - pathspec dependency + entry point)
2026-01-01 23:10:55 +03:00