From 8f99ed0003673314c17771cd4c9fd5cbc6fe1291 Mon Sep 17 00:00:00 2001
From: yusyus
Date: Wed, 4 Feb 2026 21:01:40 +0300
Subject: [PATCH] docs: Add documentation for 7 new programming languages

Update documentation for PR #275 extended language detection:
- CHANGELOG.md: Add comprehensive section for new languages
- language_detector.py: Update docstrings from 20+ to 27+ languages

New languages:
- Dart (Flutter framework)
- Scala (pattern matching, case classes)
- SCSS/SASS (CSS preprocessors)
- Elixir (functional, pipe operator)
- Lua (game scripting)
- Perl (text processing)

70 regex patterns with confidence scoring (0.6-0.8+ thresholds)
7 new tests, 30/30 passing (100%)

Co-Authored-By: Claude Sonnet 4.5
---
 CHANGELOG.md                               | 18 ++++++++++++++++++
 src/skill_seekers/cli/language_detector.py |  4 ++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0310016..b861491 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+#### Extended Language Detection (NEW)
+- **7 New Programming Languages**: Dart, Scala, SCSS, SASS, Elixir, Lua, Perl
+  - Pattern-based detection with confidence scoring (0.6-0.8+ thresholds)
+  - **70 regex patterns** prioritizing unique identifiers (weight 5)
+  - Framework-specific patterns:
+    - **Dart**: Flutter widgets (`StatelessWidget`, `StatefulWidget`, `Widget build()`)
+    - **Scala**: Pattern matching (`case class`, `trait`, `match {}`)
+    - **SCSS**: Preprocessor features (`$variables`, `@mixin`, `@include`, `@extend`)
+    - **SASS**: Indented syntax (`=mixin`, `+include`, `$variables`)
+    - **Elixir**: Functional patterns (`defmodule`, `def ... do`, pipe operator `|>`)
+    - **Lua**: Game scripting (`local`, `repeat...until`, `~=`, `elseif`)
+    - **Perl**: Text processing (`my $`, `use strict`, `sub`, `chomp`, regex `=~`)
+- **Comprehensive test coverage**: 7 new tests, 30/30 passing (100%)
+- **False positive prevention**: Unique identifiers (weight 5) + confidence thresholds
+- **No regressions**: All existing language detection tests still pass
+- **Total language support**: Now 27+ programming languages
+- **Credit**: Contributed by @PaawanBarach via PR #275
+
 #### Multi-Agent Support for Local Enhancement (NEW)
 - **Multiple Coding Agent Support**: Choose your preferred local coding agent for SKILL.md enhancement
   - **Claude Code** (default): Claude Code CLI with `--dangerously-skip-permissions`
diff --git a/src/skill_seekers/cli/language_detector.py b/src/skill_seekers/cli/language_detector.py
index 5694d35..4582ca7 100644
--- a/src/skill_seekers/cli/language_detector.py
+++ b/src/skill_seekers/cli/language_detector.py
@@ -3,7 +3,7 @@
 Unified Language Detection for Code Blocks
 
 Provides confidence-based language detection for documentation scrapers.
-Supports 20+ programming languages with weighted pattern matching.
+Supports 27+ programming languages with weighted pattern matching.
 
 Author: Skill Seekers Project
 """
@@ -505,7 +505,7 @@ class LanguageDetector:
     """
     Unified confidence-based language detection for code blocks.
 
-    Supports 20+ programming languages with weighted pattern matching.
+    Supports 27+ programming languages with weighted pattern matching.
     Uses two-stage detection:
     1. CSS class extraction (high confidence = 1.0)
     2. Pattern-based heuristics with confidence scoring (0.0-1.0)
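
Reviewer note: the two-stage detection described in the docstring (a CSS class hint first, then weighted regex patterns checked against a confidence threshold) could look roughly like the sketch below. This is not the code in `language_detector.py`. The pattern table, the `detect` function, the `FULL_SCORE` normalisation, and the `language-`/`lang-` class prefixes are all assumptions made for illustration. Only the weight of 5 for unique identifiers and the 0.6 lower threshold come from the changelog text.

```python
import re

# Hypothetical pattern table: language -> [(regex, weight)].
# Weight 5 marks a "unique identifier", as in the changelog; the
# remaining weights are illustrative guesses.
PATTERNS = {
    "elixir": [
        (r"\bdefmodule\s+[A-Z]\w*", 5),
        (r"\bdef\s+\w+.*\bdo\b", 3),
        (r"\|>", 3),
    ],
    "lua": [
        (r"\brepeat\b[\s\S]*\buntil\b", 5),
        (r"\blocal\s+\w+\s*=", 3),
        (r"~=", 3),
        (r"\belseif\b", 2),
    ],
    "perl": [
        (r"\buse\s+strict\b", 5),
        (r"\bmy\s+\$\w+", 5),
        (r"=~", 3),
        (r"\bchomp\b", 3),
    ],
}

MIN_CONFIDENCE = 0.6  # lower bound of the 0.6-0.8+ range in the changelog
FULL_SCORE = 10       # assumed score at which confidence saturates at 1.0


def detect(code, css_classes=()):
    """Return (language, confidence) for a code block."""
    # Stage 1: an explicit CSS class such as "language-dart" is trusted fully.
    for cls in css_classes:
        match = re.fullmatch(r"(?:language|lang)-([\w+#-]+)", cls)
        if match:
            return match.group(1).lower(), 1.0

    # Stage 2: sum the weights of every pattern that matches, per language.
    scores = {
        lang: sum(weight for regex, weight in patterns if re.search(regex, code))
        for lang, patterns in PATTERNS.items()
    }
    best = max(scores, key=scores.get)
    confidence = min(scores[best] / FULL_SCORE, 1.0)

    # Below the threshold, report "unknown" rather than risk a false positive.
    if confidence < MIN_CONFIDENCE:
        return "unknown", confidence
    return best, confidence
```

In this sketch a single weak signal, such as a lone `~=`, stays below the threshold and is reported as `unknown`, which is the false-positive behaviour the changelog attributes to combining unique identifiers with confidence thresholds.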