chore: remove stale root-level test scripts and junk files
Remove files that should never have been committed:

- test_api.py, test_httpx_quick.sh, test_httpx_skill.sh (ad-hoc test scripts)
- test_week2_features.py (one-off validation script)
- test_results.log (log file)
- =0.24.0 (accidental pip error output)
- demo_conflicts.py (demo script)
- ruff_errors.txt (stale lint output)
- TESTING_GAP_REPORT.md (stale one-time report)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
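As a reference, a cleanup commit like this one is just `git rm` plus a commit. A minimal sketch — the throwaway repository, identity, and file contents below are hypothetical; only the file names come from this commit's message:

```shell
set -e
# Throwaway demo repo (hypothetical setup, for illustration only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Simulate junk files that were committed by accident
printf 'stale output\n' > test_results.log
printf 'stale output\n' > ruff_errors.txt
git add -A
git commit -qm "accidentally commit junk"

# The actual cleanup: remove the files from the tree and record it
git rm -q test_results.log ruff_errors.txt
git commit -qm "chore: remove stale root-level test scripts and junk files"
git ls-files   # prints nothing: the junk is gone from the tracked tree
```

Adding the removed names to `.gitignore` in the same commit would keep them from being re-added later.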
=0.24.0
@@ -1,18 +0,0 @@
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try 'pacman -S
    python-xyz', where xyz is the package you are trying to
    install.

    If you wish to install a non-Arch-packaged Python package,
    create a virtual environment using 'python -m venv path/to/venv'.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip.

    If you wish to install a non-Arch packaged Python application,
    it may be easiest to use 'pipx install xyz', which will manage a
    virtual environment for you. Make sure you have python-pipx
    installed via pacman.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.
TESTING_GAP_REPORT.md
@@ -1,345 +0,0 @@
# Comprehensive Testing Gap Report

**Project:** Skill Seekers v3.1.0
**Date:** 2026-02-22
**Total Test Files:** 113
**Total Test Functions:** ~208+ (collected: 2173 tests)

---

## Executive Summary

### Overall Test Health: 🟡 GOOD with Gaps

| Category | Status | Coverage | Key Gaps |
|----------|--------|----------|----------|
| CLI Arguments | ✅ Good | 85% | Some edge cases |
| Workflow System | ✅ Excellent | 90% | Inline stage parsing edge cases |
| Scrapers | 🟡 Moderate | 70% | Missing real HTTP/PDF tests |
| Enhancement | 🟡 Partial | 60% | Core logic not tested |
| MCP Tools | 🟡 Good | 75% | 9 tools not covered |
| Integration/E2E | 🟡 Moderate | 65% | Heavy mocking |
| Adaptors | ✅ Good | 80% | Good coverage per platform |

---

## Detailed Findings by Category
### 1. CLI Argument Tests ✅ GOOD

**Files Reviewed:**
- `test_analyze_command.py` (269 lines, 26 tests)
- `test_unified.py` - TestUnifiedCLIArguments class (6 tests)
- `test_pdf_scraper.py` - TestPDFCLIArguments class (4 tests)
- `test_create_arguments.py` (399 lines)
- `test_create_integration_basic.py` (310 lines, 23 tests)

**Strengths:**
- All new workflow flags are tested (`--enhance-workflow`, `--enhance-stage`, `--var`, `--workflow-dry-run`)
- Argument parsing thoroughly tested
- Default values verified
- Complex command combinations tested

**Gaps:**
- `test_create_integration_basic.py`: 2 tests skipped (source auto-detection not fully tested)
- No tests for invalid argument combinations beyond basic parsing errors

---

### 2. Workflow Tests ✅ EXCELLENT

**Files Reviewed:**
- `test_workflow_runner.py` (445 lines, 30+ tests)
- `test_workflows_command.py` (571 lines, 40+ tests)
- `test_workflow_tools_mcp.py` (295 lines, 20+ tests)

**Strengths:**
- Comprehensive workflow execution tests
- Variable substitution thoroughly tested
- Dry-run mode tested
- Workflow chaining tested
- All 6 workflow subcommands tested (list, show, copy, add, remove, validate)
- MCP workflow tools tested

**Minor Gaps:**
- No tests for `_build_inline_engine` edge cases
- No tests for malformed stage specs (empty, invalid format)

---
### 3. Scraper Tests 🟡 MODERATE with Significant Gaps

**Files Reviewed:**
- `test_scraper_features.py` (524 lines) - Doc scraper features
- `test_codebase_scraper.py` (478 lines) - Codebase analysis
- `test_pdf_scraper.py` (558 lines) - PDF scraper
- `test_github_scraper.py` (1015 lines) - GitHub scraper
- `test_unified_analyzer.py` (428 lines) - Unified analyzer

**Critical Gaps:**

#### A. Missing Real External Resource Tests

| Resource | Test Type | Status |
|----------|-----------|--------|
| HTTP Requests (docs) | Mocked only | ❌ Gap |
| PDF Extraction | Mocked only | ❌ Gap |
| GitHub API | Mocked only | ❌ Gap (acceptable) |
| Local Files | Real tests | ✅ Good |

#### B. Missing Core Function Tests

| Function | Location | Priority |
|----------|----------|----------|
| `UnifiedScraper.run()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_documentation()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_github()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_pdf()` | unified_scraper.py | 🔴 High |
| `UnifiedScraper._scrape_local()` | unified_scraper.py | 🟡 Medium |
| `DocToSkillConverter.scrape()` | doc_scraper.py | 🔴 High |
| `PDFToSkillConverter.extract_pdf()` | pdf_scraper.py | 🔴 High |

#### C. PDF Scraper Limited Coverage
- No actual PDF parsing tests (only mocked)
- OCR functionality not tested
- Page range extraction not tested

---
### 4. Enhancement Tests 🟡 PARTIAL - MAJOR GAPS

**Files Reviewed:**
- `test_enhance_command.py` (367 lines, 25+ tests)
- `test_enhance_skill_local.py` (163 lines, 14 tests)

**Critical Gap in `test_enhance_skill_local.py`:**

| Function | Lines | Tested? | Priority |
|----------|-------|---------|----------|
| `summarize_reference()` | ~50 | ❌ No | 🔴 High |
| `create_enhancement_prompt()` | ~200 | ❌ No | 🔴 High |
| `run()` | ~100 | ❌ No | 🔴 High |
| `_run_headless()` | ~130 | ❌ No | 🔴 High |
| `_run_background()` | ~80 | ❌ No | 🟡 Medium |
| `_run_daemon()` | ~60 | ❌ No | 🟡 Medium |
| `write_status()` | ~30 | ❌ No | 🟡 Medium |
| `read_status()` | ~40 | ❌ No | 🟡 Medium |
| `detect_terminal_app()` | ~80 | ❌ No | 🟡 Medium |

**Current Tests Only Cover:**
- Agent presets configuration
- Command building
- Agent name normalization
- Environment variable handling

**Recommendation:** Add comprehensive tests for the core enhancement logic.

---
### 5. MCP Tool Tests 🟡 GOOD with Coverage Gaps

**Files Reviewed:**
- `test_mcp_fastmcp.py` (868 lines)
- `test_mcp_server.py` (715 lines)
- `test_mcp_vector_dbs.py` (259 lines)
- `test_real_world_fastmcp.py` (558 lines)

**Coverage Analysis:**

| Tool Category | Tools | Tested | Coverage |
|---------------|-------|--------|----------|
| Config Tools | 3 | 3 | ✅ 100% |
| Scraping Tools | 8 | 4 | 🟡 50% |
| Packaging Tools | 4 | 4 | ✅ 100% |
| Splitting Tools | 2 | 2 | ✅ 100% |
| Source Tools | 5 | 5 | ✅ 100% |
| Vector DB Tools | 4 | 4 | ✅ 100% |
| Workflow Tools | 5 | 0 | ❌ 0% |
| **Total** | **31** | **22** | **🟡 71%** |

**Untested Tools:**
1. `detect_patterns`
2. `extract_test_examples`
3. `build_how_to_guides`
4. `extract_config_patterns`
5. `list_workflows`
6. `get_workflow`
7. `create_workflow`
8. `update_workflow`
9. `delete_workflow`

**Note:** `test_mcp_server.py` tests the legacy server; `test_mcp_fastmcp.py` tests the modern server.

---
### 6. Integration/E2E Tests 🟡 MODERATE

**Files Reviewed:**
- `test_create_integration_basic.py` (310 lines)
- `test_e2e_three_stream_pipeline.py` (598 lines)
- `test_analyze_e2e.py` (344 lines)
- `test_install_skill_e2e.py` (533 lines)
- `test_c3_integration.py` (362 lines)

**Issues Found:**

1. **Skipped Tests:**
   - `test_create_detects_web_url` - Source auto-detection incomplete
   - `test_create_invalid_source_shows_error` - Error handling incomplete
   - `test_cli_via_unified_command` - Asyncio issues

2. **Heavy Mocking:**
   - Most GitHub API tests use mocking
   - No real HTTP tests for doc scraping
   - Integration tests don't test actual integration

3. **Limited Scope:**
   - Only `--quick` preset tested (not `--comprehensive`)
   - C3.x tests use mock data only
   - Most E2E tests are unit tests with mocks

---
### 7. Adaptor Tests ✅ GOOD

**Files Reviewed:**
- `test_adaptors/test_adaptors_e2e.py` (893 lines)
- `test_adaptors/test_claude_adaptor.py` (314 lines)
- `test_adaptors/test_gemini_adaptor.py` (146 lines)
- `test_adaptors/test_openai_adaptor.py` (188 lines)
- Plus 8 more platform adaptors

**Strengths:**
- Each adaptor has dedicated tests
- Package format testing
- Upload success/failure scenarios
- Platform-specific features tested

**Minor Gaps:**
- Some adaptors only test 1-2 scenarios
- Error handling coverage varies by platform

---

### 8. Config/Validation Tests ✅ GOOD

**Files Reviewed:**
- `test_config_validation.py` (270 lines)
- `test_config_extractor.py` (629 lines)
- `test_config_fetcher.py` (340 lines)

**Strengths:**
- Unified vs legacy format detection
- Field validation comprehensive
- Error message quality tested

---
## Summary of Critical Testing Gaps

### 🔴 HIGH PRIORITY (Must Fix)

1. **Enhancement Core Logic**
   - File: `test_enhance_skill_local.py`
   - Missing: 9 major functions
   - Impact: Core feature untested

2. **Unified Scraper Main Flow**
   - File: New tests needed
   - Missing: `_scrape_*()` methods, `run()` orchestration
   - Impact: Multi-source scraping untested

3. **Actual HTTP/PDF/GitHub Integration**
   - Missing: Real external resource tests
   - Impact: Only mock tests exist

### 🟡 MEDIUM PRIORITY (Should Fix)

4. **MCP Workflow Tools**
   - Missing: 5 workflow tools (0% coverage)
   - Impact: MCP workflow features untested

5. **Skipped Integration Tests**
   - 3 tests skipped
   - Impact: Source auto-detection incomplete

6. **PDF Real Extraction**
   - Missing: Actual PDF parsing
   - Impact: PDF feature quality unknown

### 🟢 LOW PRIORITY (Nice to Have)

7. **Additional Scraping Tools**
   - Missing: 4 scraping tool tests
   - Impact: Low (core tools covered)

8. **Edge Case Coverage**
   - Missing: Invalid argument combinations
   - Impact: Low (happy path covered)

---
## Recommendations

### Immediate Actions (Next Sprint)

1. **Add Enhancement Logic Tests** (~400 lines)
   - Test `summarize_reference()`
   - Test `create_enhancement_prompt()`
   - Test `run()` method
   - Test status read/write

2. **Fix Skipped Tests** (~100 lines)
   - Fix asyncio issues in `test_cli_via_unified_command`
   - Complete source auto-detection tests

3. **Add MCP Workflow Tool Tests** (~200 lines)
   - Test all 5 workflow tools

### Short Term (Next Month)

4. **Add Unified Scraper Integration Tests** (~300 lines)
   - Test main orchestration flow
   - Test individual source scraping

5. **Add Real PDF Tests** (~150 lines)
   - Test with actual PDF files
   - Test OCR if available

### Long Term (Next Quarter)

6. **HTTP Integration Tests** (~200 lines)
   - Test with real websites (use test sites)
   - Mock server approach

7. **Complete E2E Pipeline** (~300 lines)
   - Full workflow from scrape to upload
   - Real GitHub repo (fork test repo)

---
## Test Quality Metrics

| Metric | Score | Notes |
|--------|-------|-------|
| Test Count | 🟢 Good | 2173+ tests |
| Coverage | 🟡 Moderate | ~75% estimated |
| Real Tests | 🟡 Moderate | Many mocked |
| Documentation | 🟢 Good | Most tests documented |
| Maintenance | 🟢 Good | Tests recently updated |

---

## Conclusion

The Skill Seekers test suite is **comprehensive in quantity** (2173+ tests) but has **quality gaps** in critical areas:

1. **Core enhancement logic** is largely untested
2. **Multi-source scraping** orchestration lacks integration tests
3. **MCP workflow tools** have zero coverage
4. **Real external resource** testing is minimal

**Priority:** Fix the 🔴 HIGH priority gaps first, as they impact core functionality.

---

*Report generated: 2026-02-22*
*Reviewer: Systematic test review with parallel subagent analysis*
demo_conflicts.py
@@ -1,204 +0,0 @@
#!/usr/bin/env python3
"""
Demo: Conflict Detection and Reporting

This demonstrates the unified scraper's ability to detect and report
conflicts between documentation and code implementation.
"""

import json
import sys
from pathlib import Path

# Add CLI to path
sys.path.insert(0, str(Path(__file__).parent / "cli"))


print("=" * 70)
print("UNIFIED SCRAPER - CONFLICT DETECTION DEMO")
print("=" * 70)
print()

# Load test data
print("📂 Loading test data...")
print("   - Documentation APIs from example docs")
print("   - Code APIs from example repository")
print()

with open("cli/conflicts.json") as f:
    conflicts_data = json.load(f)

conflicts = conflicts_data["conflicts"]
summary = conflicts_data["summary"]

print(f"✅ Loaded {summary['total']} conflicts")
print()
# Display summary
print("=" * 70)
print("CONFLICT SUMMARY")
print("=" * 70)
print()

print(f"📊 **Total Conflicts**: {summary['total']}")
print()

print("**By Type:**")
for conflict_type, count in summary["by_type"].items():
    if count > 0:
        emoji = (
            "📖"
            if conflict_type == "missing_in_docs"
            else "💻"
            if conflict_type == "missing_in_code"
            else "⚠️"
        )
        print(f"   {emoji} {conflict_type}: {count}")
print()

print("**By Severity:**")
for severity, count in summary["by_severity"].items():
    if count > 0:
        emoji = "🔴" if severity == "high" else "🟡" if severity == "medium" else "🟢"
        print(f"   {emoji} {severity.upper()}: {count}")
print()

# Display detailed conflicts
print("=" * 70)
print("DETAILED CONFLICT REPORTS")
print("=" * 70)
print()

# Group by severity
high = [c for c in conflicts if c["severity"] == "high"]
medium = [c for c in conflicts if c["severity"] == "medium"]
low = [c for c in conflicts if c["severity"] == "low"]

# Show high severity first
if high:
    print("🔴 **HIGH SEVERITY CONFLICTS** (Requires immediate attention)")
    print("-" * 70)
    for conflict in high:
        print()
        print(f"**API**: `{conflict['api_name']}`")
        print(f"**Type**: {conflict['type']}")
        print(f"**Issue**: {conflict['difference']}")
        print(f"**Suggestion**: {conflict['suggestion']}")

        if conflict["docs_info"]:
            print("\n**Documented as**:")
            print(f"   Signature: {conflict['docs_info'].get('raw_signature', 'N/A')}")

        if conflict["code_info"]:
            print("\n**Implemented as**:")
            params = conflict["code_info"].get("parameters", [])
            param_str = ", ".join(
                f"{p['name']}: {p.get('type_hint', 'Any')}" for p in params if p["name"] != "self"
            )
            print(f"   Signature: {conflict['code_info']['name']}({param_str})")
            print(f"   Return type: {conflict['code_info'].get('return_type', 'None')}")
            print(
                f"   Location: {conflict['code_info'].get('source', 'N/A')}:{conflict['code_info'].get('line', '?')}"
            )
    print()

# Show medium severity
if medium:
    print("🟡 **MEDIUM SEVERITY CONFLICTS** (Review recommended)")
    print("-" * 70)
    for conflict in medium[:3]:  # Show first 3
        print()
        print(f"**API**: `{conflict['api_name']}`")
        print(f"**Type**: {conflict['type']}")
        print(f"**Issue**: {conflict['difference']}")

        if conflict["code_info"]:
            print(f"**Location**: {conflict['code_info'].get('source', 'N/A')}")

    if len(medium) > 3:
        print(f"\n   ... and {len(medium) - 3} more medium severity conflicts")
    print()
# Example: How conflicts appear in final skill
print("=" * 70)
print("HOW CONFLICTS APPEAR IN SKILL.MD")
print("=" * 70)
print()

example_conflict = high[0] if high else medium[0] if medium else conflicts[0]

print("```markdown")
print("## 🔧 API Reference")
print()
print("### ⚠️ APIs with Conflicts")
print()
print(f"#### `{example_conflict['api_name']}`")
print()
print(f"⚠️ **Conflict**: {example_conflict['difference']}")
print()

if example_conflict.get("docs_info"):
    print("**Documentation says:**")
    print("```")
    print(example_conflict["docs_info"].get("raw_signature", "N/A"))
    print("```")
    print()

if example_conflict.get("code_info"):
    print("**Code implementation:**")
    print("```python")
    params = example_conflict["code_info"].get("parameters", [])
    param_strs = []
    for p in params:
        if p["name"] == "self":
            continue
        param_str = p["name"]
        if p.get("type_hint"):
            param_str += f": {p['type_hint']}"
        if p.get("default"):
            param_str += f" = {p['default']}"
        param_strs.append(param_str)

    sig = f"def {example_conflict['code_info']['name']}({', '.join(param_strs)})"
    if example_conflict["code_info"].get("return_type"):
        sig += f" -> {example_conflict['code_info']['return_type']}"

    print(sig)
    print("```")
    print()

print("*Source: both (conflict)*")
print("```")
print()

# Key takeaways
print("=" * 70)
print("KEY TAKEAWAYS")
print("=" * 70)
print()

print("✅ **What the Unified Scraper Does:**")
print("   1. Extracts APIs from both documentation and code")
print("   2. Compares them to detect discrepancies")
print("   3. Classifies conflicts by type and severity")
print("   4. Provides actionable suggestions")
print("   5. Shows both versions transparently in the skill")
print()

print("⚠️ **Common Conflict Types:**")
print("   - **Missing in docs**: Undocumented features in code")
print("   - **Missing in code**: Documented but not implemented")
print("   - **Signature mismatch**: Different parameters/types")
print("   - **Description mismatch**: Different explanations")
print()

print("🎯 **Value:**")
print("   - Identifies documentation gaps")
print("   - Catches outdated documentation")
print("   - Highlights implementation differences")
print("   - Creates single source of truth showing reality")
print()

print("=" * 70)
print("END OF DEMO")
print("=" * 70)
ruff_errors.txt
@@ -1,439 +0,0 @@
ARG002 Unused method argument: `config_type`
 --> src/skill_seekers/cli/config_extractor.py:294:47
  |
292 | return None
293 |
294 | def _infer_purpose(self, file_path: Path, config_type: str) -> str:
  | ^^^^^^^^^^^
295 | """Infer configuration purpose from file path and name"""
296 | path_lower = str(file_path).lower()
  |

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/config_extractor.py:469:17
  |
468 | for node in ast.walk(tree):
469 | / if isinstance(node, ast.Assign):
470 | | # Get variable name and skip private variables
471 | | if len(node.targets) == 1 and isinstance(node.targets[0], ast.Name) and not node.targets[0].id.startswith("_"):
  | |___________________________________________________________________________________________________________________________________^
472 | key = node.targets[0].id
  |
help: Combine `if` statements using `and`

ARG002 Unused method argument: `node`
 --> src/skill_seekers/cli/config_extractor.py:585:41
  |
583 | return ""
584 |
585 | def _extract_python_docstring(self, node: ast.AST) -> str:
  | ^^^^
586 | """Extract docstring/comment for Python node"""
587 | # This is simplified - real implementation would need more context
  |

B904 Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling
 --> src/skill_seekers/cli/config_validator.py:60:13
  |
58 | return json.load(f)
59 | except FileNotFoundError:
60 | raise ValueError(f"Config file not found: {self.config_path}")
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
61 | except json.JSONDecodeError as e:
62 | raise ValueError(f"Invalid JSON in config file: {e}")
  |

B904 Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling
 --> src/skill_seekers/cli/config_validator.py:62:13
  |
60 | raise ValueError(f"Config file not found: {self.config_path}")
61 | except json.JSONDecodeError as e:
62 | raise ValueError(f"Invalid JSON in config file: {e}")
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
63 |
64 | def _detect_format(self) -> bool:
  |

SIM113 Use `enumerate()` for index variable `completed` in `for` loop
 --> src/skill_seekers/cli/doc_scraper.py:1068:25
  |
1066 | logger.warning(" ⚠️ Worker exception: %s", e)
1067 |
1068 | completed += 1
  | ^^^^^^^^^^^^^^
1069 |
1070 | with self.lock:
  |
B904 Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling
 --> src/skill_seekers/cli/github_scraper.py:353:17
  |
351 | except GithubException as e:
352 | if e.status == 404:
353 | raise ValueError(f"Repository not found: {self.repo_name}")
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
354 | raise
  |

E402 Module level import not at top of file
 --> src/skill_seekers/cli/llms_txt_downloader.py:5:1
  |
3 | """ABOUTME: Validates markdown content and handles timeouts with exponential backoff"""
4 |
5 | import time
  | ^^^^^^^^^^^
6 |
7 | import requests
  |

E402 Module level import not at top of file
 --> src/skill_seekers/cli/llms_txt_downloader.py:7:1
  |
5 | import time
6 |
7 | import requests
  | ^^^^^^^^^^^^^^^
  |

E402 Module level import not at top of file
 --> src/skill_seekers/cli/llms_txt_parser.py:5:1
  |
3 | """ABOUTME: Extracts titles, content, code samples, and headings from markdown"""
4 |
5 | import re
  | ^^^^^^^^^
6 | from urllib.parse import urljoin
  |

E402 Module level import not at top of file
 --> src/skill_seekers/cli/llms_txt_parser.py:6:1
  |
5 | import re
6 | from urllib.parse import urljoin
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/pattern_recognizer.py:430:13
  |
428 | # Python: __init__ or __new__
429 | # Java/C#: private constructor (detected by naming)
430 | / if method.name in ["__new__", "__init__", "constructor"]:
431 | | # Check if it has logic (not just pass)
432 | | if method.docstring or len(method.parameters) > 1:
  | |__________________________________________________________________^
433 | evidence.append(f"Controlled initialization: {method.name}")
434 | confidence += 0.3
  |
help: Combine `if` statements using `and`

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/pattern_recognizer.py:538:13
  |
536 | for method in class_sig.methods:
537 | method_lower = method.name.lower()
538 | / if any(name in method_lower for name in factory_method_names):
539 | | # Check if method returns something (has return type or is not void)
540 | | if method.return_type or "create" in method_lower:
  | |__________________________________________________________________^
541 | return PatternInstance(
542 | pattern_type=self.pattern_type,
  |
help: Combine `if` statements using `and`

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/pattern_recognizer.py:916:9
  |
914 | # Check __init__ for composition (takes object parameter)
915 | init_method = next((m for m in class_sig.methods if m.name == "__init__"), None)
916 | / if init_method:
917 | | # Check if takes object parameter (not just self)
918 | | if len(init_method.parameters) > 1: # More than just 'self'
  | |_______________________________________________^
919 | param_names = [p.name for p in init_method.parameters if p.name != "self"]
920 | if any(
  |
help: Combine `if` statements using `and`
F821 Undefined name `l`
 --> src/skill_seekers/cli/pdf_extractor_poc.py:302:28
  |
300 | 1 for line in code.split("\n") if line.strip().startswith(("#", "//", "/*", "*", "--"))
301 | )
302 | total_lines = len([l for line in code.split("\n") if line.strip()])
  | ^
303 | if total_lines > 0 and comment_lines / total_lines > 0.7:
304 | issues.append("Mostly comments")
  |

F821 Undefined name `l`
 --> src/skill_seekers/cli/pdf_extractor_poc.py:330:18
  |
329 | # Factor 3: Number of lines
330 | lines = [l for line in code.split("\n") if line.strip()]
  | ^
331 | if 2 <= len(lines) <= 50:
332 | score += 1.0
  |

B007 Loop control variable `keywords` not used within loop body
 --> src/skill_seekers/cli/pdf_scraper.py:167:30
  |
165 | # Keyword-based categorization
166 | # Initialize categories
167 | for cat_key, keywords in self.categories.items():
  | ^^^^^^^^
168 | categorized[cat_key] = {"title": cat_key.replace("_", " ").title(), "pages": []}
  |
help: Rename unused `keywords` to `_keywords`

SIM115 Use a context manager for opening files
 --> src/skill_seekers/cli/pdf_scraper.py:434:26
  |
432 | f.write("**Generated by Skill Seeker** | PDF Documentation Scraper\n")
433 |
434 | line_count = len(open(filename, encoding="utf-8").read().split("\n"))
  | ^^^^
435 | print(f" Generated: (unknown) ({line_count} lines)")
  |

E741 Ambiguous variable name: `l`
 --> src/skill_seekers/cli/quality_checker.py:318:44
  |
316 | else:
317 | if links:
318 | internal_links = [l for t, l in links if not l.startswith("http")]
  | ^
319 | if internal_links:
320 | self.report.add_info(
  |

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/test_example_extractor.py:364:13
  |
363 | for node in ast.walk(func_node):
364 | / if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
365 | | # Check if meaningful instantiation
366 | | if self._is_meaningful_instantiation(node):
  | |___________________________________________________________^
367 | code = ast.unparse(node)
  |
help: Combine `if` statements using `and`

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/test_example_extractor.py:412:13
  |
410 | for i, stmt in enumerate(statements):
411 | # Look for method calls
412 | / if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
413 | | # Check if next statement is an assertion
414 | | if i + 1 < len(statements):
  | |___________________________________________^
415 | next_stmt = statements[i + 1]
416 | if self._is_assertion(next_stmt):
  |
help: Combine `if` statements using `and`

SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/test_example_extractor.py:460:13
  |
459 | for node in ast.walk(func_node):
460 | / if isinstance(node, ast.Assign) and isinstance(node.value, ast.Dict):
461 | | # Must have 2+ keys and be meaningful
462 | | if len(node.value.keys) >= 2:
  | |_____________________________________________^
463 | code = ast.unparse(node)
  |
help: Combine `if` statements using `and`
SIM102 Use a single `if` statement instead of nested `if` statements
 --> src/skill_seekers/cli/unified_skill_builder.py:1070:13
  |
1069 | # If no languages from C3.7, try to get from GitHub data
1070 | / if not languages:
1071 | | # github_data already available from method scope
1072 | | if github_data.get("languages"):
  | |________________________________________________^
1073 | # GitHub data has languages as list, convert to dict with count 1
1074 | languages = dict.fromkeys(github_data["languages"], 1)
  |
help: Combine `if` statements using `and`

ARG001 Unused function argument: `request`
 --> src/skill_seekers/mcp/server_fastmcp.py:1159:32
  |
1157 | from starlette.routing import Route
1158 |
1159 | async def health_check(request):
  | ^^^^^^^
1160 | """Health check endpoint."""
1161 | return JSONResponse(
  |

ARG002 Unused method argument: `tmp_path`
 --> tests/test_bootstrap_skill.py:54:56
  |
53 | @pytest.mark.slow
54 | def test_bootstrap_script_runs(self, project_root, tmp_path):
  | ^^^^^^^^
55 | """Test that bootstrap script runs successfully.
  |

B007 Loop control variable `message` not used within loop body
 --> tests/test_install_agent.py:374:44
  |
372 | # With force - should succeed
373 | results_with_force = install_to_all_agents(self.skill_dir, force=True)
374 | for _agent_name, (success, message) in results_with_force.items():
  | ^^^^^^^
375 | assert success is True
  |
help: Rename unused `message` to `_message`

SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
 --> tests/test_install_agent.py:418:9
  |
416 | def test_cli_requires_agent_flag(self):
417 | """Test that CLI fails without --agent flag."""
418 | / with pytest.raises(SystemExit) as exc_info:
419 | | with patch("sys.argv", ["install_agent.py", str(self.skill_dir)]):
  | |______________________________________________________________________________^
420 | main()
  |
help: Combine `with` statements

SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
 --> tests/test_issue_219_e2e.py:278:9
  |
276 | self.skipTest("anthropic package not installed")
277 |
278 | / with patch.dict(os.environ, {"ANTHROPIC_API_KEY": "test-key"}):
279 | | with patch("skill_seekers.cli.enhance_skill.anthropic.Anthropic") as mock_anthropic:
  | |________________________________________________________________________________________________^
280 | enhancer = SkillEnhancer(self.skill_dir)
  |
help: Combine `with` statements

SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_llms_txt_downloader.py:33:5
|
||||
|
|
||||
31 | downloader = LlmsTxtDownloader("https://example.com/llms.txt", max_retries=2)
|
||||
32 |
|
||||
33 | / with patch("requests.get", side_effect=requests.Timeout("Connection timeout")) as mock_get:
|
||||
34 | | with patch("time.sleep") as mock_sleep: # Mock sleep to speed up test
|
||||
| |_______________________________________________^
|
||||
35 | content = downloader.download()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_llms_txt_downloader.py:88:5
|
||||
|
|
||||
86 | downloader = LlmsTxtDownloader("https://example.com/llms.txt", max_retries=3)
|
||||
87 |
|
||||
88 | / with patch("requests.get", side_effect=requests.Timeout("Connection timeout")):
|
||||
89 | | with patch("time.sleep") as mock_sleep:
|
||||
| |_______________________________________________^
|
||||
90 | content = downloader.download()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
F821 Undefined name `l`
|
||||
--> tests/test_markdown_parsing.py:100:21
|
||||
|
|
||||
98 | )
|
||||
99 | # Should only include .md links
|
||||
100 | md_links = [l for line in result["links"] if ".md" in l]
|
||||
| ^
|
||||
101 | self.assertEqual(len(md_links), len(result["links"]))
|
||||
|
|
||||
|
||||
F821 Undefined name `l`
|
||||
--> tests/test_markdown_parsing.py:100:63
|
||||
|
|
||||
98 | )
|
||||
99 | # Should only include .md links
|
||||
100 | md_links = [l for line in result["links"] if ".md" in l]
|
||||
| ^
|
||||
101 | self.assertEqual(len(md_links), len(result["links"]))
|
||||
|
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_skip_llms_txt.py:75:17
|
||||
|
|
||||
73 | converter = DocToSkillConverter(config, dry_run=False)
|
||||
74 |
|
||||
75 | / with patch.object(converter, "_try_llms_txt", return_value=False) as mock_try:
|
||||
76 | | with patch.object(converter, "scrape_page"):
|
||||
| |________________________________________________________________^
|
||||
77 | with patch.object(converter, "save_summary"):
|
||||
78 | converter.scrape_all()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_skip_llms_txt.py:98:17
|
||||
|
|
||||
96 | converter = DocToSkillConverter(config, dry_run=False)
|
||||
97 |
|
||||
98 | / with patch.object(converter, "_try_llms_txt") as mock_try:
|
||||
99 | | with patch.object(converter, "scrape_page"):
|
||||
| |________________________________________________________________^
|
||||
100 | with patch.object(converter, "save_summary"):
|
||||
101 | converter.scrape_all()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_skip_llms_txt.py:121:17
|
||||
|
|
||||
119 | converter = DocToSkillConverter(config, dry_run=True)
|
||||
120 |
|
||||
121 | / with patch.object(converter, "_try_llms_txt") as mock_try:
|
||||
122 | | with patch.object(converter, "save_summary"):
|
||||
| |_________________________________________________________________^
|
||||
123 | converter.scrape_all()
|
||||
124 | mock_try.assert_not_called()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_skip_llms_txt.py:148:17
|
||||
|
|
||||
146 | converter = DocToSkillConverter(config, dry_run=False)
|
||||
147 |
|
||||
148 | / with patch.object(converter, "_try_llms_txt", return_value=False) as mock_try:
|
||||
149 | | with patch.object(converter, "scrape_page_async", return_value=None):
|
||||
| |_________________________________________________________________________________________^
|
||||
150 | with patch.object(converter, "save_summary"):
|
||||
151 | converter.scrape_all()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_skip_llms_txt.py:172:17
|
||||
|
|
||||
170 | converter = DocToSkillConverter(config, dry_run=False)
|
||||
171 |
|
||||
172 | / with patch.object(converter, "_try_llms_txt") as mock_try:
|
||||
173 | | with patch.object(converter, "scrape_page_async", return_value=None):
|
||||
| |_________________________________________________________________________________________^
|
||||
174 | with patch.object(converter, "save_summary"):
|
||||
175 | converter.scrape_all()
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
SIM117 Use a single `with` statement with multiple contexts instead of nested `with` statements
|
||||
--> tests/test_skip_llms_txt.py:304:17
|
||||
|
|
||||
302 | return None
|
||||
303 |
|
||||
304 | / with patch.object(converter, "scrape_page", side_effect=mock_scrape):
|
||||
305 | | with patch.object(converter, "save_summary"):
|
||||
| |_________________________________________________________________^
|
||||
306 | converter.scrape_all()
|
||||
307 | # Should have attempted to scrape the base URL
|
||||
|
|
||||
help: Combine `with` statements
|
||||
|
||||
Found 38 errors.
|
||||
43
test_api.py
@@ -1,43 +0,0 @@
#!/usr/bin/env python3
"""Quick test of the config analyzer"""

import sys

sys.path.insert(0, "api")

from pathlib import Path

from api.config_analyzer import ConfigAnalyzer

# Initialize analyzer
config_dir = Path("configs")
analyzer = ConfigAnalyzer(config_dir, base_url="https://api.skillseekersweb.com")

# Test analyzing all configs
print("Testing config analyzer...")
print("-" * 60)

configs = analyzer.analyze_all_configs()
print(f"\n✅ Found {len(configs)} configs")

# Show first 3 configs
print("\n📋 Sample Configs:")
for config in configs[:3]:
    print(f"\n  Name: {config['name']}")
    print(f"  Type: {config['type']}")
    print(f"  Category: {config['category']}")
    print(f"  Tags: {', '.join(config['tags'])}")
    print(f"  Source: {config['primary_source'][:50]}...")
    print(f"  File Size: {config['file_size']} bytes")

# Test category counts
print("\n\n📊 Categories:")
categories = {}
for config in configs:
    cat = config["category"]
    categories[cat] = categories.get(cat, 0) + 1

for cat, count in sorted(categories.items()):
    print(f"  {cat}: {count} configs")

print("\n✅ All tests passed!")
@@ -1,62 +0,0 @@
#!/bin/bash
# Quick Test - HTTPX Skill (Documentation Only, No GitHub)
# For faster testing without full C3.x analysis

set -e

echo "🚀 Quick HTTPX Skill Test (Docs Only)"
echo "======================================"
echo ""

# Simple config - docs only
CONFIG_FILE="configs/httpx_quick.json"

# Create quick config (docs only)
cat > "$CONFIG_FILE" << 'EOF'
{
  "name": "httpx_quick",
  "description": "HTTPX HTTP client for Python - Quick test version",
  "base_url": "https://www.python-httpx.org/",
  "selectors": {
    "main_content": "article.md-content__inner",
    "title": "h1",
    "code_blocks": "pre code"
  },
  "url_patterns": {
    "include": ["/quickstart/", "/advanced/", "/api/"],
    "exclude": ["/changelog/", "/contributing/"]
  },
  "categories": {
    "getting_started": ["quickstart", "install"],
    "api": ["api", "reference"],
    "advanced": ["async", "http2"]
  },
  "rate_limit": 0.3,
  "max_pages": 50
}
EOF

echo "✓ Created quick config (docs only, max 50 pages)"
echo ""

# Run scraper
echo "🔍 Scraping documentation..."
START_TIME=$(date +%s)

skill-seekers scrape --config "$CONFIG_FILE" --output output/httpx_quick

END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))

echo ""
echo "✅ Complete in ${DURATION}s"
echo ""
echo "📊 Results:"
echo "  Output: output/httpx_quick/"
echo "  SKILL.md: $(wc -l < output/httpx_quick/SKILL.md) lines"
echo "  References: $(find output/httpx_quick/references -name "*.md" 2>/dev/null | wc -l) files"
echo ""
echo "🔍 Preview:"
head -30 output/httpx_quick/SKILL.md
echo ""
echo "📦 Next: skill-seekers package output/httpx_quick/"
@@ -1,249 +0,0 @@
#!/bin/bash
# Test Script for HTTPX Skill Generation
# Tests all C3.x features and experimental capabilities

set -e  # Exit on error

echo "=================================="
echo "🧪 HTTPX Skill Generation Test"
echo "=================================="
echo ""
echo "This script will test:"
echo "  ✓ Unified multi-source scraping (docs + GitHub)"
echo "  ✓ Three-stream GitHub analysis"
echo "  ✓ C3.x features (patterns, tests, guides, configs, architecture)"
echo "  ✓ AI enhancement (LOCAL mode)"
echo "  ✓ Quality metrics"
echo "  ✓ Packaging"
echo ""
read -p "Press Enter to start (or Ctrl+C to cancel)..."

# Configuration
CONFIG_FILE="configs/httpx_comprehensive.json"
OUTPUT_DIR="output/httpx"
SKILL_NAME="httpx"

# Step 1: Clean previous output
echo ""
echo "📁 Step 1: Cleaning previous output..."
if [ -d "$OUTPUT_DIR" ]; then
    rm -rf "$OUTPUT_DIR"
    echo "  ✓ Cleaned $OUTPUT_DIR"
fi

# Step 2: Validate config
echo ""
echo "🔍 Step 2: Validating configuration..."
if [ ! -f "$CONFIG_FILE" ]; then
    echo "  ✗ Config file not found: $CONFIG_FILE"
    exit 1
fi
echo "  ✓ Config file found"

# Show config summary
echo ""
echo "📋 Config Summary:"
echo "  Name: httpx"
echo "  Sources: Documentation + GitHub (C3.x analysis)"
echo "  Analysis Depth: c3x (full analysis)"
echo "  Features: API ref, patterns, test examples, guides, architecture"
echo ""

# Step 3: Run unified scraper
echo "🚀 Step 3: Running unified scraper (this will take 10-20 minutes)..."
echo "  This includes:"
echo "  - Documentation scraping"
echo "  - GitHub repo cloning and analysis"
echo "  - C3.1: Design pattern detection"
echo "  - C3.2: Test example extraction"
echo "  - C3.3: How-to guide generation"
echo "  - C3.4: Configuration extraction"
echo "  - C3.5: Architectural overview"
echo "  - C3.6: AI enhancement preparation"
echo ""

START_TIME=$(date +%s)

# Run unified scraper with all features
python -m skill_seekers.cli.unified_scraper \
    --config "$CONFIG_FILE" \
    --output "$OUTPUT_DIR" \
    --verbose

SCRAPE_END_TIME=$(date +%s)
SCRAPE_DURATION=$((SCRAPE_END_TIME - START_TIME))

echo ""
echo "  ✓ Scraping completed in ${SCRAPE_DURATION}s"

# Step 4: Show analysis results
echo ""
echo "📊 Step 4: Analysis Results Summary"
echo ""

# Check for C3.1 patterns
if [ -f "$OUTPUT_DIR/c3_1_patterns.json" ]; then
    PATTERN_COUNT=$(python3 -c "import json; print(len(json.load(open('$OUTPUT_DIR/c3_1_patterns.json', 'r'))))")
    echo "  C3.1 Design Patterns: $PATTERN_COUNT patterns detected"
fi

# Check for C3.2 test examples
if [ -f "$OUTPUT_DIR/c3_2_test_examples.json" ]; then
    EXAMPLE_COUNT=$(python3 -c "import json; data=json.load(open('$OUTPUT_DIR/c3_2_test_examples.json', 'r')); print(len(data.get('examples', [])))")
    echo "  C3.2 Test Examples: $EXAMPLE_COUNT examples extracted"
fi

# Check for C3.3 guides
GUIDE_COUNT=0
if [ -d "$OUTPUT_DIR/guides" ]; then
    GUIDE_COUNT=$(find "$OUTPUT_DIR/guides" -name "*.md" | wc -l)
    echo "  C3.3 How-To Guides: $GUIDE_COUNT guides generated"
fi

# Check for C3.4 configs
if [ -f "$OUTPUT_DIR/c3_4_configs.json" ]; then
    CONFIG_COUNT=$(python3 -c "import json; print(len(json.load(open('$OUTPUT_DIR/c3_4_configs.json', 'r'))))")
    echo "  C3.4 Configurations: $CONFIG_COUNT config patterns found"
fi

# Check for C3.5 architecture
if [ -f "$OUTPUT_DIR/c3_5_architecture.md" ]; then
    ARCH_LINES=$(wc -l < "$OUTPUT_DIR/c3_5_architecture.md")
    echo "  C3.5 Architecture: Overview generated ($ARCH_LINES lines)"
fi

# Check for API reference
if [ -f "$OUTPUT_DIR/api_reference.md" ]; then
    API_LINES=$(wc -l < "$OUTPUT_DIR/api_reference.md")
    echo "  API Reference: Generated ($API_LINES lines)"
fi

# Check for dependency graph
if [ -f "$OUTPUT_DIR/dependency_graph.json" ]; then
    echo "  Dependency Graph: Generated"
fi

# Check SKILL.md
if [ -f "$OUTPUT_DIR/SKILL.md" ]; then
    SKILL_LINES=$(wc -l < "$OUTPUT_DIR/SKILL.md")
    echo "  SKILL.md: Generated ($SKILL_LINES lines)"
fi

echo ""

# Step 5: Quality assessment (pre-enhancement)
echo "📈 Step 5: Quality Assessment (Pre-Enhancement)"
echo ""

# Count references
if [ -d "$OUTPUT_DIR/references" ]; then
    REF_COUNT=$(find "$OUTPUT_DIR/references" -name "*.md" | wc -l)
    TOTAL_REF_LINES=$(find "$OUTPUT_DIR/references" -name "*.md" -exec wc -l {} + | tail -1 | awk '{print $1}')
    echo "  Reference Files: $REF_COUNT files ($TOTAL_REF_LINES total lines)"
fi

# Estimate quality score (basic heuristics)
QUALITY_SCORE=3  # Base score

# Add points for features
[ -f "$OUTPUT_DIR/c3_1_patterns.json" ] && QUALITY_SCORE=$((QUALITY_SCORE + 1))
[ -f "$OUTPUT_DIR/c3_2_test_examples.json" ] && QUALITY_SCORE=$((QUALITY_SCORE + 1))
[ $GUIDE_COUNT -gt 0 ] && QUALITY_SCORE=$((QUALITY_SCORE + 1))
[ -f "$OUTPUT_DIR/c3_4_configs.json" ] && QUALITY_SCORE=$((QUALITY_SCORE + 1))
[ -f "$OUTPUT_DIR/c3_5_architecture.md" ] && QUALITY_SCORE=$((QUALITY_SCORE + 1))
[ -f "$OUTPUT_DIR/api_reference.md" ] && QUALITY_SCORE=$((QUALITY_SCORE + 1))

echo "  Estimated Quality (Pre-Enhancement): $QUALITY_SCORE/10"
echo ""

# Step 6: AI Enhancement (LOCAL mode)
echo "🤖 Step 6: AI Enhancement (LOCAL mode)"
echo ""
echo "  This will use Claude Code to enhance the skill"
echo "  Expected improvement: $QUALITY_SCORE/10 → 8-9/10"
echo ""

read -p "  Run AI enhancement? (y/n) [y]: " RUN_ENHANCEMENT
RUN_ENHANCEMENT=${RUN_ENHANCEMENT:-y}

if [ "$RUN_ENHANCEMENT" = "y" ]; then
    echo "  Running LOCAL enhancement (force mode ON)..."

    python -m skill_seekers.cli.enhance_skill_local \
        "$OUTPUT_DIR" \
        --mode LOCAL \
        --force

    ENHANCE_END_TIME=$(date +%s)
    ENHANCE_DURATION=$((ENHANCE_END_TIME - SCRAPE_END_TIME))

    echo ""
    echo "  ✓ Enhancement completed in ${ENHANCE_DURATION}s"

    # Post-enhancement quality
    POST_QUALITY=9  # Assume significant improvement
    echo "  Estimated Quality (Post-Enhancement): $POST_QUALITY/10"
else
    echo "  Skipping enhancement"
fi

echo ""

# Step 7: Package skill
echo "📦 Step 7: Packaging Skill"
echo ""

python -m skill_seekers.cli.package_skill \
    "$OUTPUT_DIR" \
    --target claude \
    --output output/

PACKAGE_FILE="output/${SKILL_NAME}.zip"

if [ -f "$PACKAGE_FILE" ]; then
    PACKAGE_SIZE=$(du -h "$PACKAGE_FILE" | cut -f1)
    echo "  ✓ Package created: $PACKAGE_FILE ($PACKAGE_SIZE)"
else
    echo "  ✗ Package creation failed"
    exit 1
fi

echo ""

# Step 8: Final Summary
END_TIME=$(date +%s)
TOTAL_DURATION=$((END_TIME - START_TIME))
MINUTES=$((TOTAL_DURATION / 60))
SECONDS=$((TOTAL_DURATION % 60))

echo "=================================="
echo "✅ Test Complete!"
echo "=================================="
echo ""
echo "📊 Summary:"
echo "  Total Time: ${MINUTES}m ${SECONDS}s"
echo "  Output Directory: $OUTPUT_DIR"
echo "  Package: $PACKAGE_FILE ($PACKAGE_SIZE)"
echo ""
echo "📈 Features Tested:"
echo "  ✓ Multi-source scraping (docs + GitHub)"
echo "  ✓ Three-stream analysis"
echo "  ✓ C3.1 Pattern detection"
echo "  ✓ C3.2 Test examples"
echo "  ✓ C3.3 How-to guides"
echo "  ✓ C3.4 Config extraction"
echo "  ✓ C3.5 Architecture overview"
if [ "$RUN_ENHANCEMENT" = "y" ]; then
    echo "  ✓ AI enhancement (LOCAL)"
fi
echo "  ✓ Packaging"
echo ""
echo "🔍 Next Steps:"
echo "  1. Review SKILL.md: cat $OUTPUT_DIR/SKILL.md | head -50"
echo "  2. Check patterns: cat $OUTPUT_DIR/c3_1_patterns.json | jq '.'"
echo "  3. Review guides: ls $OUTPUT_DIR/guides/"
echo "  4. Upload to Claude: skill-seekers upload $PACKAGE_FILE"
echo ""
echo "📁 File Structure:"
tree -L 2 "$OUTPUT_DIR" | head -30
echo ""
@@ -1,65 +0,0 @@
============================= test session starts ==============================
platform linux -- Python 3.14.2, pytest-8.4.2, pluggy-1.6.0 -- /usr/bin/python
cachedir: .pytest_cache
hypothesis profile 'default'
rootdir: /mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers
configfile: pyproject.toml
plugins: anyio-4.12.1, hypothesis-6.150.0, cov-6.1.1, typeguard-4.4.4
collecting ... collected 1940 items / 1 error

==================================== ERRORS ====================================
_________________ ERROR collecting tests/test_preset_system.py _________________
ImportError while importing test module '/mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers/tests/test_preset_system.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.14/site-packages/_pytest/python.py:498: in importtestmodule
    mod = import_path(
/usr/lib/python3.14/site-packages/_pytest/pathlib.py:587: in import_path
    importlib.import_module(module_name)
/usr/lib/python3.14/importlib/__init__.py:88: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<frozen importlib._bootstrap>:1398: in _gcd_import
    ???
<frozen importlib._bootstrap>:1371: in _find_and_load
    ???
<frozen importlib._bootstrap>:1342: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:938: in _load_unlocked
    ???
/usr/lib/python3.14/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
    exec(co, module.__dict__)
tests/test_preset_system.py:9: in <module>
    from skill_seekers.cli.presets import PresetManager, PRESETS, AnalysisPreset
E   ImportError: cannot import name 'PresetManager' from 'skill_seekers.cli.presets' (/mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers/src/skill_seekers/cli/presets/__init__.py)
=============================== warnings summary ===============================
../../../../usr/lib/python3.14/site-packages/_pytest/config/__init__.py:1474
  /usr/lib/python3.14/site-packages/_pytest/config/__init__.py:1474: PytestConfigWarning: Unknown config option: asyncio_default_fixture_loop_scope

    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

../../../../usr/lib/python3.14/site-packages/_pytest/config/__init__.py:1474
  /usr/lib/python3.14/site-packages/_pytest/config/__init__.py:1474: PytestConfigWarning: Unknown config option: asyncio_mode

    self._warn_or_fail_if_strict(f"Unknown config option: {key}\n")

tests/test_mcp_fastmcp.py:21
  /mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers/tests/test_mcp_fastmcp.py:21: DeprecationWarning: The legacy server.py is deprecated and will be removed in v3.0.0. Please update your MCP configuration to use 'server_fastmcp' instead:
      OLD: python -m skill_seekers.mcp.server
      NEW: python -m skill_seekers.mcp.server_fastmcp
    The new server provides the same functionality with improved performance.
    from mcp.server import FastMCP

src/skill_seekers/cli/test_example_extractor.py:50
  /mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers/src/skill_seekers/cli/test_example_extractor.py:50: PytestCollectionWarning: cannot collect test class 'TestExample' because it has a __init__ constructor (from: tests/test_test_example_extractor.py)
    @dataclass

src/skill_seekers/cli/test_example_extractor.py:920
  /mnt/1ece809a-2821-4f10-aecb-fcdf34760c0b/Git/Skill_Seekers/src/skill_seekers/cli/test_example_extractor.py:920: PytestCollectionWarning: cannot collect test class 'TestExampleExtractor' because it has a __init__ constructor (from: tests/test_test_example_extractor.py)
    class TestExampleExtractor:

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
ERROR tests/test_preset_system.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
========================= 5 warnings, 1 error in 1.11s =========================
@@ -1,273 +0,0 @@
#!/usr/bin/env python3
"""
Quick validation script for Week 2 features.
Run this to verify all new capabilities are working.
"""

import sys
from pathlib import Path
import tempfile
import shutil

# Add src to path for testing
sys.path.insert(0, str(Path(__file__).parent / "src"))


def test_vector_databases():
    """Test all 4 vector database adaptors."""
    from skill_seekers.cli.adaptors import get_adaptor
    import json

    print("📦 Testing vector database adaptors...")

    # Create minimal test data
    with tempfile.TemporaryDirectory() as tmpdir:
        skill_dir = Path(tmpdir) / 'test_skill'
        skill_dir.mkdir()
        (skill_dir / 'SKILL.md').write_text('# Test\n\nContent.')

        targets = ['weaviate', 'chroma', 'faiss', 'qdrant']
        for target in targets:
            try:
                adaptor = get_adaptor(target)
                package_path = adaptor.package(skill_dir, Path(tmpdir))
                assert package_path.exists(), f"{target} package not created"
                print(f"  ✅ {target.capitalize()}")
            except Exception as e:
                print(f"  ❌ {target.capitalize()}: {e}")
                return False

    return True


def test_streaming():
    """Test streaming ingestion."""
    from skill_seekers.cli.streaming_ingest import StreamingIngester

    print("📈 Testing streaming ingestion...")

    try:
        large_content = "Test content. " * 500
        ingester = StreamingIngester(chunk_size=1000, chunk_overlap=100)

        chunks = list(ingester.chunk_document(
            large_content,
            {'source': 'test'}
        ))

        assert len(chunks) > 5, "Expected multiple chunks"
        assert all(len(chunk[0]) <= 1100 for chunk in chunks), "Chunk too large"

        print(f"  ✅ Chunked {len(large_content)} chars into {len(chunks)} chunks")
        return True
    except Exception as e:
        print(f"  ❌ Streaming test failed: {e}")
        return False


def test_incremental():
    """Test incremental updates."""
    from skill_seekers.cli.incremental_updater import IncrementalUpdater
    import time

    print("⚡ Testing incremental updates...")

    try:
        with tempfile.TemporaryDirectory() as tmpdir:
            skill_dir = Path(tmpdir) / 'test_skill'
            skill_dir.mkdir()

            # Create references directory
            refs_dir = skill_dir / 'references'
            refs_dir.mkdir()

            # Create initial version
            (skill_dir / 'SKILL.md').write_text('# V1\n\nInitial content.')
            (refs_dir / 'guide.md').write_text('# Guide\n\nInitial guide.')

            updater = IncrementalUpdater(skill_dir)
            updater.current_versions = updater._scan_documents()  # Scan before saving
            updater.save_current_versions()

            # Small delay to ensure different timestamps
            time.sleep(0.01)

            # Make changes
            (skill_dir / 'SKILL.md').write_text('# V2\n\nUpdated content.')
            (refs_dir / 'new_ref.md').write_text('# New Reference\n\nNew documentation.')

            # Detect changes (loads previous versions internally)
            updater2 = IncrementalUpdater(skill_dir)
            changes = updater2.detect_changes()

            # Verify we have changes
            assert changes.has_changes, "No changes detected"
            assert len(changes.added) > 0, "New file not detected"
            assert len(changes.modified) > 0, "Modified file not detected"

            print(f"  ✅ Detected {len(changes.added)} added, {len(changes.modified)} modified")
            return True
    except Exception as e:
        print(f"  ❌ Incremental test failed: {e}")
        return False


def test_multilang():
    """Test multi-language support."""
    from skill_seekers.cli.multilang_support import (
        LanguageDetector,
        MultiLanguageManager
    )

    print("🌍 Testing multi-language support...")

    try:
        detector = LanguageDetector()

        # Test language detection
        en_text = "This is an English document about programming."
        es_text = "Este es un documento en español sobre programación."

        en_detected = detector.detect(en_text)
        es_detected = detector.detect(es_text)

        assert en_detected.code == 'en', f"Expected 'en', got '{en_detected.code}'"
        assert es_detected.code == 'es', f"Expected 'es', got '{es_detected.code}'"

        # Test filename detection
        assert detector.detect_from_filename('README.en.md') == 'en'
        assert detector.detect_from_filename('guide.es.md') == 'es'

        # Test manager
        manager = MultiLanguageManager()
        manager.add_document('doc.md', en_text, {})
        manager.add_document('doc.es.md', es_text, {})

        languages = manager.get_languages()
        assert 'en' in languages and 'es' in languages

        print(f"  ✅ Detected {len(languages)} languages")
        return True
    except Exception as e:
        print(f"  ❌ Multi-language test failed: {e}")
        return False


def test_embeddings():
    """Test embedding pipeline."""
    from skill_seekers.cli.embedding_pipeline import (
        EmbeddingPipeline,
        EmbeddingConfig
    )

    print("💰 Testing embedding pipeline...")

    try:
        with tempfile.TemporaryDirectory() as tmpdir:
            config = EmbeddingConfig(
                provider='local',
                model='test-model',
                dimension=64,
                batch_size=10,
                cache_dir=Path(tmpdir)
            )

            pipeline = EmbeddingPipeline(config)

            # Test generation (first run)
            texts = ['doc1', 'doc2', 'doc3']
            result1 = pipeline.generate_batch(texts, show_progress=False)

            assert len(result1.embeddings) == 3, "Expected 3 embeddings"
            assert len(result1.embeddings[0]) == 64, "Wrong dimension"
            assert result1.generated_count == 3, "Should generate all on first run"

            # Test caching (second run with same texts)
            result2 = pipeline.generate_batch(texts, show_progress=False)

            assert result2.cached_count == 3, "Caching not working"
            assert result2.generated_count == 0, "Should not generate on second run"

            print(f"  ✅ First run: {result1.generated_count} generated")
            print(f"  ✅ Second run: {result2.cached_count} cached (100% cache hit)")
            return True
    except Exception as e:
        print(f"  ❌ Embedding test failed: {e}")
        return False


def test_quality():
    """Test quality metrics."""
    from skill_seekers.cli.quality_metrics import QualityAnalyzer

    print("📊 Testing quality metrics...")

    try:
        with tempfile.TemporaryDirectory() as tmpdir:
            skill_dir = Path(tmpdir) / 'test_skill'
            skill_dir.mkdir()

            # Create test skill
            (skill_dir / 'SKILL.md').write_text('# Test Skill\n\nContent.')

            refs_dir = skill_dir / 'references'
            refs_dir.mkdir()
            (refs_dir / 'guide.md').write_text('# Guide\n\nGuide content.')

            # Analyze quality
            analyzer = QualityAnalyzer(skill_dir)
            report = analyzer.generate_report()

            assert report.overall_score.total_score > 0, "Score is 0"
            assert report.overall_score.grade in ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D', 'F']
            assert len(report.metrics) == 4, "Expected 4 metrics"

            print(f"  ✅ Grade: {report.overall_score.grade} ({report.overall_score.total_score:.1f}/100)")
            return True
    except Exception as e:
        print(f"  ❌ Quality test failed: {e}")
        return False


def main():
    """Run all tests."""
    print("=" * 70)
    print("🧪 Week 2 Feature Validation")
    print("=" * 70)
    print()

    tests = [
        ("Vector Databases", test_vector_databases),
        ("Streaming Ingestion", test_streaming),
        ("Incremental Updates", test_incremental),
        ("Multi-Language", test_multilang),
        ("Embedding Pipeline", test_embeddings),
        ("Quality Metrics", test_quality),
    ]

    passed = 0
    failed = 0

    for name, test_func in tests:
        try:
            if test_func():
                passed += 1
            else:
                failed += 1
        except Exception as e:
            print(f"  ❌ Unexpected error: {e}")
            failed += 1
        print()

    print("=" * 70)
    print(f"📊 Results: {passed}/{len(tests)} passed")

    if failed == 0:
        print("🎉 All Week 2 features validated successfully!")
        return 0
    else:
        print(f"⚠️  {failed} test(s) failed")
        return 1


if __name__ == '__main__':
    sys.exit(main())