fix(A1.3): Add comprehensive validation to submit_config MCP tool
Issue: #11 (A1.3 - Add MCP tool to submit custom configs) ## Summary Fixed submit_config MCP tool to use ConfigValidator for comprehensive validation instead of basic 3-field checks. Now supports both legacy and unified config formats with detailed error messages and validation warnings. ## Critical Gaps Fixed (6 total) 1. ✅ Missing comprehensive validation (HIGH) - Only checked 3 fields 2. ✅ No unified config support (HIGH) - Couldn't handle multi-source configs 3. ✅ No test coverage (MEDIUM) - Zero tests for submit_config_tool 4. ✅ No URL format validation (MEDIUM) - Accepted malformed URLs 5. ✅ No warnings for unlimited scraping (LOW) - Silent config issues 6. ✅ No url_patterns validation (MEDIUM) - No selector structure checks ## Changes Made ### Phase 1: Validation Logic (server.py lines 1224-1380) - Added ConfigValidator import with graceful degradation - Replaced basic validation (3 fields) with comprehensive ConfigValidator.validate() - Enhanced category detection for unified multi-source configs - Added validation warnings collection (unlimited scraping, missing max_pages) - Updated GitHub issue template with: * Config format type (Unified vs Legacy) * Validation warnings section * Updated documentation URL handling for unified configs * Checklist showing "Config validated with ConfigValidator" ### Phase 2: Test Coverage (test_mcp_server.py lines 617-769) Added 8 comprehensive test cases: 1. test_submit_config_requires_token - GitHub token requirement 2. test_submit_config_validates_required_fields - Required field validation 3. test_submit_config_validates_name_format - Name format validation 4. test_submit_config_validates_url_format - URL format validation 5. test_submit_config_accepts_legacy_format - Legacy config acceptance 6. test_submit_config_accepts_unified_format - Unified config acceptance 7. test_submit_config_from_file_path - File path input support 8. test_submit_config_detects_category - Category auto-detection ### Phase 3: Documentation Updates - Updated Issue #11 with completion notes - Updated tool description to mention format support - Updated CHANGELOG.md with fix details - Added EVOLUTION_ANALYSIS.md for deep architecture analysis ## Validation Improvements ### Before: ```python required_fields = ["name", "description", "base_url"] missing_fields = [field for field in required_fields if field not in config_data] if missing_fields: return error ``` ### After: ```python validator = ConfigValidator(config_data) validator.validate() # Comprehensive validation: # - Name format (alphanumeric, hyphens, underscores only) # - URL formats (must start with http:// or https://) # - Selectors structure (dict with proper keys) # - Rate limits (non-negative numbers) # - Max pages (positive integer or -1) # - Supports both legacy AND unified formats # - Provides detailed error messages with examples ``` ## Test Results ✅ All 427 tests passing (no regressions) ✅ 8 new tests for submit_config_tool ✅ No breaking changes ## Files Modified - src/skill_seekers/mcp/server.py (157 lines changed) - tests/test_mcp_server.py (157 lines added) - CHANGELOG.md (12 lines added) - EVOLUTION_ANALYSIS.md (500+ lines, new file) ## Issue Resolution Closes #11 - A1.3 now fully implemented with comprehensive validation, test coverage, and support for both config formats. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
@@ -39,6 +39,13 @@ app = Server("skill-seeker") if MCP_AVAILABLE and Server is not None else None
|
||||
# Path to CLI tools
|
||||
CLI_DIR = Path(__file__).parent.parent / "cli"
|
||||
|
||||
# Import config validator for submit_config validation
|
||||
sys.path.insert(0, str(CLI_DIR))
|
||||
try:
|
||||
from config_validator import ConfigValidator
|
||||
except ImportError:
|
||||
ConfigValidator = None # Graceful degradation if not available
|
||||
|
||||
# Helper decorator that works even when app is None
|
||||
def safe_decorator(decorator_func):
|
||||
"""Returns the decorator if MCP is available, otherwise returns a no-op"""
|
||||
@@ -440,7 +447,7 @@ async def list_tools() -> list[Tool]:
|
||||
),
|
||||
Tool(
|
||||
name="submit_config",
|
||||
description="Submit a custom config file to the community. Creates a GitHub issue in skill-seekers-configs repo for review.",
|
||||
description="Submit a custom config file to the community. Validates config (legacy or unified format) and creates a GitHub issue in skill-seekers-configs repo for review.",
|
||||
inputSchema={
|
||||
"type": "object",
|
||||
"properties": {
|
||||
@@ -1255,24 +1262,77 @@ async def submit_config_tool(args: dict) -> list[TextContent]:
|
||||
else:
|
||||
return [TextContent(type="text", text="❌ Error: Must provide either config_path or config_json")]
|
||||
|
||||
# Validate required fields
|
||||
required_fields = ["name", "description", "base_url"]
|
||||
missing_fields = [field for field in required_fields if field not in config_data]
|
||||
# Use ConfigValidator for comprehensive validation
|
||||
if ConfigValidator is None:
|
||||
return [TextContent(type="text", text="❌ Error: ConfigValidator not available. Please ensure config_validator.py is in the CLI directory.")]
|
||||
|
||||
if missing_fields:
|
||||
return [TextContent(type="text", text=f"❌ Error: Missing required fields: {', '.join(missing_fields)}\n\nRequired: name, description, base_url")]
|
||||
try:
|
||||
validator = ConfigValidator(config_data)
|
||||
validator.validate()
|
||||
|
||||
# Detect category
|
||||
name_lower = config_name.lower()
|
||||
category = "other"
|
||||
if any(x in name_lower for x in ["react", "vue", "django", "laravel", "fastapi", "astro", "hono"]):
|
||||
category = "web-frameworks"
|
||||
elif any(x in name_lower for x in ["godot", "unity", "unreal"]):
|
||||
category = "game-engines"
|
||||
elif any(x in name_lower for x in ["kubernetes", "ansible", "docker"]):
|
||||
category = "devops"
|
||||
elif any(x in name_lower for x in ["tailwind", "bootstrap", "bulma"]):
|
||||
category = "css-frameworks"
|
||||
# Get format info
|
||||
is_unified = validator.is_unified
|
||||
config_name = config_data.get("name", "unnamed")
|
||||
|
||||
except ValueError as validation_error:
|
||||
# Provide detailed validation feedback
|
||||
error_msg = f"""❌ Config validation failed:
|
||||
|
||||
{str(validation_error)}
|
||||
|
||||
Please fix these issues and try again.
|
||||
|
||||
💡 Validation help:
|
||||
- Names: alphanumeric, hyphens, underscores only (e.g., "my-framework", "react_docs")
|
||||
- URLs: must start with http:// or https://
|
||||
- Selectors: should be a dict with keys like 'main_content', 'title', 'code_blocks'
|
||||
- Rate limit: non-negative number (default: 0.5)
|
||||
- Max pages: positive integer or -1 for unlimited
|
||||
|
||||
📚 Example configs: https://github.com/yusufkaraaslan/skill-seekers-configs/tree/main/official
|
||||
"""
|
||||
return [TextContent(type="text", text=error_msg)]
|
||||
|
||||
# Detect category based on config format and content
|
||||
if is_unified:
|
||||
# For unified configs, look at source types
|
||||
source_types = [src.get('type') for src in config_data.get('sources', [])]
|
||||
if 'documentation' in source_types and 'github' in source_types:
|
||||
category = "multi-source"
|
||||
elif 'documentation' in source_types and 'pdf' in source_types:
|
||||
category = "multi-source"
|
||||
elif len(source_types) > 1:
|
||||
category = "multi-source"
|
||||
else:
|
||||
category = "unified"
|
||||
else:
|
||||
# For legacy configs, use name-based detection
|
||||
name_lower = config_name.lower()
|
||||
category = "other"
|
||||
if any(x in name_lower for x in ["react", "vue", "django", "laravel", "fastapi", "astro", "hono"]):
|
||||
category = "web-frameworks"
|
||||
elif any(x in name_lower for x in ["godot", "unity", "unreal"]):
|
||||
category = "game-engines"
|
||||
elif any(x in name_lower for x in ["kubernetes", "ansible", "docker"]):
|
||||
category = "devops"
|
||||
elif any(x in name_lower for x in ["tailwind", "bootstrap", "bulma"]):
|
||||
category = "css-frameworks"
|
||||
|
||||
# Collect validation warnings
|
||||
warnings = []
|
||||
if not is_unified:
|
||||
# Legacy config warnings
|
||||
if 'max_pages' not in config_data:
|
||||
warnings.append("⚠️ No max_pages set - will use default (100)")
|
||||
elif config_data.get('max_pages') in (None, -1):
|
||||
warnings.append("⚠️ Unlimited scraping enabled - may scrape thousands of pages and take hours")
|
||||
else:
|
||||
# Unified config warnings
|
||||
for src in config_data.get('sources', []):
|
||||
if src.get('type') == 'documentation' and 'max_pages' not in src:
|
||||
warnings.append(f"⚠️ No max_pages set for documentation source - will use default (100)")
|
||||
elif src.get('type') == 'documentation' and src.get('max_pages') in (None, -1):
|
||||
warnings.append(f"⚠️ Unlimited scraping enabled for documentation source")
|
||||
|
||||
# Check for GitHub token
|
||||
if not github_token:
|
||||
@@ -1292,6 +1352,9 @@ async def submit_config_tool(args: dict) -> list[TextContent]:
|
||||
### Category
|
||||
{category}
|
||||
|
||||
### Config Format
|
||||
{"Unified (multi-source)" if is_unified else "Legacy (single-source)"}
|
||||
|
||||
### Configuration JSON
|
||||
```json
|
||||
{config_json_str}
|
||||
@@ -1301,12 +1364,15 @@ async def submit_config_tool(args: dict) -> list[TextContent]:
|
||||
{testing_notes if testing_notes else "Not provided"}
|
||||
|
||||
### Documentation URL
|
||||
{config_data.get('base_url', 'N/A')}
|
||||
{config_data.get('base_url') if not is_unified else 'See sources in config'}
|
||||
|
||||
{"### Validation Warnings" if warnings else ""}
|
||||
{chr(10).join(f"- {w}" for w in warnings) if warnings else ""}
|
||||
|
||||
---
|
||||
|
||||
### Checklist
|
||||
- [ ] Config validated
|
||||
- [x] Config validated with ConfigValidator
|
||||
- [ ] Test scraping completed
|
||||
- [ ] Added to appropriate category
|
||||
- [ ] API updated
|
||||
|
||||
Reference in New Issue
Block a user