Files
skill-seekers-reference/docs/WEEK2_TESTING_GUIDE.md
yusyus eff6673c89 test: Add comprehensive Week 2 feature validation suite
Add automated test suite and testing guide for all Week 2 features.

**Test Suite (test_week2_features.py):**
- Automated validation for all 6 feature categories
- Quick validation script (< 5 seconds)
- Clear pass/fail indicators
- Production-ready testing

**Tests Included:**
1.  Vector Database Adaptors (4 formats)
   - Weaviate, Chroma, FAISS, Qdrant
   - JSON format validation
   - Metadata verification

2.  Streaming Ingestion
   - Large document chunking
   - Overlap preservation
   - Memory-efficient processing

3.  Incremental Updates
   - Change detection (added/modified/deleted)
   - Version tracking
   - Hash-based comparison

4.  Multi-Language Support
   - 11 language detection
   - Filename pattern recognition
   - Translation status tracking

5.  Embedding Pipeline
   - Generation and caching
   - 100% cache hit rate validation
   - Cost tracking

6.  Quality Metrics
   - 4-dimensional scoring
   - Grade assignment
   - Statistics calculation

**Testing Guide (docs/WEEK2_TESTING_GUIDE.md):**
- 7 comprehensive test scenarios
- Step-by-step instructions
- Expected outputs
- Troubleshooting section
- Integration test examples

**Results:**
- All 6 tests passing (100%)
- Fast execution (< 5 seconds)
- Production-ready validation
- User-friendly output

**Usage:**
```bash
# Quick validation
python test_week2_features.py

# Full testing guide
cat docs/WEEK2_TESTING_GUIDE.md
```

**Exit Codes:**
- 0: All tests passed
- 1: One or more tests failed
2026-02-07 14:14:37 +03:00

909 lines
23 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Week 2 Testing Guide
Interactive guide to test all new universal infrastructure features.
## 🎯 Prerequisites
```bash
# Ensure you're on the correct branch
git checkout feature/universal-infrastructure-strategy
# Install package in development mode
pip install -e .
# Install optional dependencies for full testing
pip install -e ".[all-llms]"
```
## 📦 Test 1: Vector Database Adaptors
Test all 4 vector database export formats.
### Setup Test Data
```bash
# Create a small test skill for quick testing
mkdir -p test_output/test_skill
cat > test_output/test_skill/SKILL.md << 'EOF'
# Test Skill
This is a test skill for demonstrating vector database exports.
## Features
- Feature 1: Basic functionality
- Feature 2: Advanced usage
- Feature 3: Best practices
## API Reference
### function_one()
Does something useful.
### function_two()
Does something else useful.
## Examples
```python
# Example 1
from test_skill import function_one
result = function_one()
```
EOF
mkdir -p test_output/test_skill/references
cat > test_output/test_skill/references/getting_started.md << 'EOF'
# Getting Started
Quick start guide for test skill.
EOF
```
### Test Weaviate Export
```python
# test_weaviate.py
from pathlib import Path
from skill_seekers.cli.adaptors import get_adaptor
import json
skill_dir = Path('test_output/test_skill')
output_dir = Path('test_output')
# Get Weaviate adaptor
adaptor = get_adaptor('weaviate')
print("✅ Weaviate adaptor loaded")
# Package skill
package_path = adaptor.package(skill_dir, output_dir)
print(f"✅ Package created: {package_path}")
# Verify output format
with open(package_path, 'r') as f:
data = json.load(f)
print(f"✅ Class name: {data['class_name']}")
print(f"✅ Objects count: {len(data['objects'])}")
print(f"✅ Properties: {list(data['schema']['properties'][0].keys())}")
print("\n🎉 Weaviate test passed!")
```
Run: `python test_weaviate.py`
### Test Chroma Export
```python
# test_chroma.py
from pathlib import Path
from skill_seekers.cli.adaptors import get_adaptor
import json
skill_dir = Path('test_output/test_skill')
output_dir = Path('test_output')
# Get Chroma adaptor
adaptor = get_adaptor('chroma')
print("✅ Chroma adaptor loaded")
# Package skill
package_path = adaptor.package(skill_dir, output_dir)
print(f"✅ Package created: {package_path}")
# Verify output format
with open(package_path, 'r') as f:
data = json.load(f)
print(f"✅ Collection name: {data['collection_name']}")
print(f"✅ Documents count: {len(data['documents'])}")
print(f"✅ Metadata fields: {list(data['metadatas'][0].keys())}")
print("\n🎉 Chroma test passed!")
```
Run: `python test_chroma.py`
### Test FAISS Export
```python
# test_faiss.py
from pathlib import Path
from skill_seekers.cli.adaptors import get_adaptor
import json
skill_dir = Path('test_output/test_skill')
output_dir = Path('test_output')
# Get FAISS adaptor
adaptor = get_adaptor('faiss')
print("✅ FAISS adaptor loaded")
# Package skill
package_path = adaptor.package(skill_dir, output_dir)
print(f"✅ Package created: {package_path}")
# Verify output format
with open(package_path, 'r') as f:
data = json.load(f)
print(f"✅ Index type: {data['index_config']['type']}")
print(f"✅ Embeddings count: {len(data['embeddings'])}")
print(f"✅ Metadata count: {len(data['metadata'])}")
print("\n🎉 FAISS test passed!")
```
Run: `python test_faiss.py`
### Test Qdrant Export
```python
# test_qdrant.py
from pathlib import Path
from skill_seekers.cli.adaptors import get_adaptor
import json
skill_dir = Path('test_output/test_skill')
output_dir = Path('test_output')
# Get Qdrant adaptor
adaptor = get_adaptor('qdrant')
print("✅ Qdrant adaptor loaded")
# Package skill
package_path = adaptor.package(skill_dir, output_dir)
print(f"✅ Package created: {package_path}")
# Verify output format
with open(package_path, 'r') as f:
data = json.load(f)
print(f"✅ Collection name: {data['collection_name']}")
print(f"✅ Points count: {len(data['points'])}")
print(f"✅ First point ID: {data['points'][0]['id']}")
print(f"✅ Payload fields: {list(data['points'][0]['payload'].keys())}")
print("\n🎉 Qdrant test passed!")
```
Run: `python test_qdrant.py`
**Expected Output:**
```
✅ Qdrant adaptor loaded
✅ Package created: test_output/test_skill-qdrant.json
✅ Collection name: test_skill
✅ Points count: 3
✅ First point ID: 550e8400-e29b-41d4-a716-446655440000
✅ Payload fields: ['content', 'metadata', 'source', 'category']
🎉 Qdrant test passed!
```
## 📈 Test 2: Streaming Ingestion
Test memory-efficient processing of large documents.
```python
# test_streaming.py
from pathlib import Path
from skill_seekers.cli.streaming_ingest import StreamingIngester, ChunkMetadata
import time
# Create large document (simulate large docs)
large_content = "This is a test document. " * 1000 # ~24KB
ingester = StreamingIngester(
chunk_size=1000, # 1KB chunks
chunk_overlap=100 # 100 char overlap
)
print("🔄 Starting streaming ingestion test...")
print(f"📄 Document size: {len(large_content):,} characters")
print(f"📦 Chunk size: {ingester.chunk_size} characters")
print(f"🔗 Overlap: {ingester.chunk_overlap} characters")
print()
# Track progress
start_time = time.time()
chunk_count = 0
total_chars = 0
metadata = {'source': 'test', 'file': 'large_doc.md'}
for chunk, chunk_meta in ingester.chunk_document(large_content, metadata):
chunk_count += 1
total_chars += len(chunk)
if chunk_count % 5 == 0:
print(f"✅ Processed {chunk_count} chunks ({total_chars:,} chars)")
end_time = time.time()
elapsed = end_time - start_time
print()
print(f"🎉 Streaming test complete!")
print(f" Total chunks: {chunk_count}")
print(f" Total characters: {total_chars:,}")
print(f" Time: {elapsed:.3f}s")
print(f" Speed: {total_chars/elapsed:,.0f} chars/sec")
# Verify overlap
print()
print("🔍 Verifying chunk overlap...")
chunks = list(ingester.chunk_document(large_content, metadata))
overlap = chunks[0][0][-100:] == chunks[1][0][:100]
print(f"✅ Overlap preserved: {overlap}")
```
Run: `python test_streaming.py`
**Expected Output:**
```
🔄 Starting streaming ingestion test...
📄 Document size: 24,000 characters
📦 Chunk size: 1000 characters
🔗 Overlap: 100 characters
✅ Processed 5 chunks (5,000 chars)
✅ Processed 10 chunks (10,000 chars)
✅ Processed 15 chunks (15,000 chars)
✅ Processed 20 chunks (20,000 chars)
✅ Processed 25 chunks (24,000 chars)
🎉 Streaming test complete!
Total chunks: 27
Total characters: 27,000
Time: 0.012s
Speed: 2,250,000 chars/sec
🔍 Verifying chunk overlap...
✅ Overlap preserved: True
```
## ⚡ Test 3: Incremental Updates
Test smart change detection and delta generation.
```python
# test_incremental.py
from pathlib import Path
from skill_seekers.cli.incremental_updater import IncrementalUpdater
import shutil
import time
skill_dir = Path('test_output/test_skill_versioned')
# Clean up if exists
if skill_dir.exists():
shutil.rmtree(skill_dir)
skill_dir.mkdir(parents=True)
# Create initial version
print("📦 Creating initial version...")
(skill_dir / 'SKILL.md').write_text('# Version 1.0\n\nInitial content')
(skill_dir / 'api.md').write_text('# API Reference v1')
updater = IncrementalUpdater(skill_dir)
# Take initial snapshot
print("📸 Taking initial snapshot...")
updater.create_snapshot('1.0.0')
print(f"✅ Snapshot 1.0.0 created")
# Wait a moment
time.sleep(0.1)
# Make some changes
print("\n🔧 Making changes...")
print(" - Modifying SKILL.md")
print(" - Adding new_feature.md")
print(" - Deleting api.md")
(skill_dir / 'SKILL.md').write_text('# Version 1.1\n\nUpdated content with new features')
(skill_dir / 'new_feature.md').write_text('# New Feature\n\nAwesome new functionality')
(skill_dir / 'api.md').unlink()
# Detect changes
print("\n🔍 Detecting changes...")
changes = updater.detect_changes('1.0.0')
print(f"✅ Changes detected:")
print(f" Added: {changes.added}")
print(f" Modified: {changes.modified}")
print(f" Deleted: {changes.deleted}")
# Generate delta package
print("\n📦 Generating delta package...")
delta_path = updater.generate_delta_package(changes, Path('test_output'))
print(f"✅ Delta package: {delta_path}")
# Create new snapshot
updater.create_snapshot('1.1.0')
print(f"✅ Snapshot 1.1.0 created")
# Show version history
print("\n📊 Version history:")
history = updater.get_version_history()
for v, ts in history.items():
print(f" {v}: {ts}")
print("\n🎉 Incremental update test passed!")
```
Run: `python test_incremental.py`
**Expected Output:**
```
📦 Creating initial version...
📸 Taking initial snapshot...
✅ Snapshot 1.0.0 created
🔧 Making changes...
- Modifying SKILL.md
- Adding new_feature.md
- Deleting api.md
🔍 Detecting changes...
✅ Changes detected:
Added: ['new_feature.md']
Modified: ['SKILL.md']
Deleted: ['api.md']
📦 Generating delta package...
✅ Delta package: test_output/test_skill_versioned-delta-1.0.0-to-1.1.0.zip
✅ Snapshot 1.1.0 created
📊 Version history:
1.0.0: 2026-02-07T...
1.1.0: 2026-02-07T...
🎉 Incremental update test passed!
```
## 🌍 Test 4: Multi-Language Support
Test language detection and translation tracking.
```python
# test_multilang.py
from skill_seekers.cli.multilang_support import (
LanguageDetector,
MultiLanguageManager
)
detector = LanguageDetector()
manager = MultiLanguageManager()
print("🌍 Testing multi-language support...\n")
# Test language detection
test_texts = {
'en': "This is an English document about programming.",
'es': "Este es un documento en español sobre programación.",
'fr': "Ceci est un document en français sur la programmation.",
'de': "Dies ist ein deutsches Dokument über Programmierung.",
'zh': "这是一个关于编程的中文文档。"
}
print("🔍 Language Detection Test:")
for code, text in test_texts.items():
detected = detector.detect(text)
match = "" if detected.code == code else ""
print(f" {match} Expected: {code}, Detected: {detected.code} ({detected.name}, {detected.confidence:.2f})")
print()
# Test filename detection
print("📁 Filename Pattern Detection:")
test_files = [
('README.en.md', 'en'),
('guide.es.md', 'es'),
('doc_fr.md', 'fr'),
('manual-de.md', 'de'),
]
for filename, expected in test_files:
detected = detector.detect_from_filename(filename)
match = "" if detected == expected else ""
print(f" {match} {filename}{detected} (expected: {expected})")
print()
# Test multi-language manager
print("📚 Multi-Language Manager Test:")
manager.add_document('README.md', test_texts['en'], {'type': 'overview'})
manager.add_document('README.es.md', test_texts['es'], {'type': 'overview'})
manager.add_document('README.fr.md', test_texts['fr'], {'type': 'overview'})
languages = manager.get_languages()
print(f"✅ Detected languages: {languages}")
print(f"✅ Primary language: {manager.primary_language}")
for lang in languages:
count = manager.get_document_count(lang)
print(f" {lang}: {count} document(s)")
print()
# Test translation status
status = manager.get_translation_status()
print(f"📊 Translation Status:")
print(f" Source: {status.source_language}")
print(f" Translated: {status.translated_languages}")
print(f" Coverage: {len(status.translated_languages)}/{len(languages)} languages")
print("\n🎉 Multi-language test passed!")
```
Run: `python test_multilang.py`
**Expected Output:**
```
🌍 Testing multi-language support...
🔍 Language Detection Test:
✅ Expected: en, Detected: en (English, 0.45)
✅ Expected: es, Detected: es (Spanish, 0.38)
✅ Expected: fr, Detected: fr (French, 0.35)
✅ Expected: de, Detected: de (German, 0.32)
✅ Expected: zh, Detected: zh (Chinese, 0.95)
📁 Filename Pattern Detection:
✅ README.en.md → en (expected: en)
✅ guide.es.md → es (expected: es)
✅ doc_fr.md → fr (expected: fr)
✅ manual-de.md → de (expected: de)
📚 Multi-Language Manager Test:
✅ Detected languages: ['en', 'es', 'fr']
✅ Primary language: en
en: 1 document(s)
es: 1 document(s)
fr: 1 document(s)
📊 Translation Status:
Source: en
Translated: ['es', 'fr']
Coverage: 2/3 languages
🎉 Multi-language test passed!
```
## 💰 Test 5: Embedding Pipeline
Test embedding generation with caching and cost tracking.
```python
# test_embeddings.py
from skill_seekers.cli.embedding_pipeline import (
EmbeddingPipeline,
EmbeddingConfig
)
from pathlib import Path
import tempfile
print("💰 Testing embedding pipeline...\n")
# Use local provider (free, deterministic)
with tempfile.TemporaryDirectory() as tmpdir:
config = EmbeddingConfig(
provider='local',
model='test-model',
dimension=128,
batch_size=10,
cache_dir=Path(tmpdir)
)
pipeline = EmbeddingPipeline(config)
# Test batch generation
print("📦 Batch Generation Test:")
texts = [
"Document 1: Introduction to programming",
"Document 2: Advanced concepts",
"Document 3: Best practices",
"Document 1: Introduction to programming", # Duplicate for caching
]
print(f" Processing {len(texts)} documents...")
result = pipeline.generate_batch(texts, show_progress=False)
print(f"✅ Generated: {result.generated_count} embeddings")
print(f"✅ Cached: {result.cached_count} embeddings")
print(f"✅ Total: {len(result.embeddings)} embeddings")
print(f"✅ Dimension: {len(result.embeddings[0])}")
print(f"✅ Time: {result.total_time:.3f}s")
# Verify caching
print("\n🔄 Cache Test:")
print(" Processing same documents again...")
result2 = pipeline.generate_batch(texts, show_progress=False)
print(f"✅ All cached: {result2.cached_count == len(texts)}")
print(f" Generated: {result2.generated_count}")
print(f" Cached: {result2.cached_count}")
print(f" Time: {result2.total_time:.3f}s (cached is faster!)")
# Dimension validation
print("\n✅ Dimension Validation Test:")
is_valid = pipeline.validate_dimensions(result.embeddings)
print(f" All dimensions correct: {is_valid}")
# Cost stats
print("\n💵 Cost Statistics:")
stats = pipeline.get_cost_stats()
for key, value in stats.items():
print(f" {key}: {value}")
print("\n🎉 Embedding pipeline test passed!")
```
Run: `python test_embeddings.py`
**Expected Output:**
```
💰 Testing embedding pipeline...
📦 Batch Generation Test:
Processing 4 documents...
✅ Generated: 3 embeddings
✅ Cached: 1 embeddings
✅ Total: 4 embeddings
✅ Dimension: 128
✅ Time: 0.002s
🔄 Cache Test:
Processing same documents again...
✅ All cached: True
Generated: 0
Cached: 4
Time: 0.001s (cached is faster!)
✅ Dimension Validation Test:
All dimensions correct: True
💵 Cost Statistics:
total_requests: 2
total_tokens: 160
cache_hits: 5
cache_misses: 3
cache_rate: 62.5%
estimated_cost: $0.0000
🎉 Embedding pipeline test passed!
```
## 📊 Test 6: Quality Metrics
Test quality analysis and grading system.
```python
# test_quality.py
from skill_seekers.cli.quality_metrics import QualityAnalyzer
from pathlib import Path
import tempfile
print("📊 Testing quality metrics dashboard...\n")
# Create test skill with known quality issues
with tempfile.TemporaryDirectory() as tmpdir:
skill_dir = Path(tmpdir) / 'test_skill'
skill_dir.mkdir()
# Create SKILL.md with TODO markers
(skill_dir / 'SKILL.md').write_text("""
# Test Skill
This is a test skill.
TODO: Add more content
TODO: Add examples
## Features
Some features here.
""")
# Create references directory
refs_dir = skill_dir / 'references'
refs_dir.mkdir()
(refs_dir / 'getting_started.md').write_text('# Getting Started\n\nQuick guide')
(refs_dir / 'api.md').write_text('# API Reference\n\nAPI docs')
# Analyze quality
print("🔍 Analyzing skill quality...")
analyzer = QualityAnalyzer(skill_dir)
report = analyzer.generate_report()
print(f"✅ Analysis complete!\n")
# Show results
score = report.overall_score
print(f"🎯 OVERALL SCORE")
print(f" Grade: {score.grade}")
print(f" Total: {score.total_score:.1f}/100")
print()
print(f"📈 COMPONENT SCORES")
print(f" Completeness: {score.completeness:.1f}% (30% weight)")
print(f" Accuracy: {score.accuracy:.1f}% (25% weight)")
print(f" Coverage: {score.coverage:.1f}% (25% weight)")
print(f" Health: {score.health:.1f}% (20% weight)")
print()
print(f"📋 METRICS")
for metric in report.metrics:
icon = {"INFO": "", "WARNING": "⚠️", "ERROR": ""}.get(metric.level.value, "")
print(f" {icon} {metric.name}: {metric.value:.1f}%")
if metric.suggestions:
for suggestion in metric.suggestions[:2]:
print(f"{suggestion}")
print()
print(f"📊 STATISTICS")
stats = report.statistics
print(f" Total files: {stats['total_files']}")
print(f" Markdown files: {stats['markdown_files']}")
print(f" Total words: {stats['total_words']}")
print()
if report.recommendations:
print(f"💡 RECOMMENDATIONS")
for rec in report.recommendations[:3]:
print(f" {rec}")
print("\n🎉 Quality metrics test passed!")
```
Run: `python test_quality.py`
**Expected Output:**
```
📊 Testing quality metrics dashboard...
🔍 Analyzing skill quality...
✅ Analysis complete!
🎯 OVERALL SCORE
Grade: C+
Total: 66.5/100
📈 COMPONENT SCORES
Completeness: 70.0% (30% weight)
Accuracy: 90.0% (25% weight)
Coverage: 40.0% (25% weight)
Health: 100.0% (20% weight)
📋 METRICS
✅ Completeness: 70.0%
→ Expand documentation coverage
⚠️ Accuracy: 90.0%
→ Found 2 TODO markers
⚠️ Coverage: 40.0%
→ Add getting started guide
→ Add API reference documentation
✅ Health: 100.0%
📊 STATISTICS
Total files: 3
Markdown files: 3
Total words: 45
💡 RECOMMENDATIONS
🟡 Expand documentation coverage (API, examples)
🟡 Address accuracy issues (TODOs, placeholders)
🎉 Quality metrics test passed!
```
## 🚀 Test 7: Integration Test
Test combining multiple features together.
```python
# test_integration.py
from pathlib import Path
from skill_seekers.cli.adaptors import get_adaptor
from skill_seekers.cli.streaming_ingest import StreamingIngester
from skill_seekers.cli.quality_metrics import QualityAnalyzer
import tempfile
import shutil
print("🚀 Integration Test: All Features Combined\n")
print("=" * 70)
# Setup
with tempfile.TemporaryDirectory() as tmpdir:
skill_dir = Path(tmpdir) / 'integration_test'
skill_dir.mkdir()
# Step 1: Create skill
print("\n📦 Step 1: Creating test skill...")
(skill_dir / 'SKILL.md').write_text("# Integration Test Skill\n\n" + ("Content. " * 200))
refs_dir = skill_dir / 'references'
refs_dir.mkdir()
(refs_dir / 'guide.md').write_text('# Guide\n\nGuide content')
(refs_dir / 'api.md').write_text('# API\n\nAPI content')
print("✅ Skill created")
# Step 2: Quality check
print("\n📊 Step 2: Running quality check...")
analyzer = QualityAnalyzer(skill_dir)
report = analyzer.generate_report()
print(f"✅ Quality grade: {report.overall_score.grade} ({report.overall_score.total_score:.1f}/100)")
# Step 3: Export to multiple vector DBs
print("\n📦 Step 3: Exporting to vector databases...")
for target in ['weaviate', 'chroma', 'qdrant']:
adaptor = get_adaptor(target)
package_path = adaptor.package(skill_dir, Path(tmpdir))
size = package_path.stat().st_size
print(f"{target.capitalize()}: {package_path.name} ({size:,} bytes)")
# Step 4: Test streaming (simulate large doc)
print("\n📈 Step 4: Testing streaming ingestion...")
large_content = "This is test content. " * 1000
ingester = StreamingIngester(chunk_size=1000, chunk_overlap=100)
chunks = list(ingester.chunk_document(large_content, {'source': 'test'}))
print(f"✅ Chunked {len(large_content):,} chars into {len(chunks)} chunks")
print("\n" + "=" * 70)
print("🎉 Integration test passed!")
print("\nAll Week 2 features working together successfully!")
```
Run: `python test_integration.py`
**Expected Output:**
```
🚀 Integration Test: All Features Combined
======================================================================
📦 Step 1: Creating test skill...
✅ Skill created
📊 Step 2: Running quality check...
✅ Quality grade: B (78.5/100)
📦 Step 3: Exporting to vector databases...
✅ Weaviate: integration_test-weaviate.json (2,456 bytes)
✅ Chroma: integration_test-chroma.json (2,134 bytes)
✅ Qdrant: integration_test-qdrant.json (2,389 bytes)
📈 Step 4: Testing streaming ingestion...
✅ Chunked 22,000 chars into 25 chunks
======================================================================
🎉 Integration test passed!
All Week 2 features working together successfully!
```
## 📋 Quick Test All
Run all tests at once:
```bash
# Create test runner script
cat > run_all_tests.py << 'EOF'
import subprocess
import sys
tests = [
('Vector Databases', 'test_weaviate.py'),
('Streaming', 'test_streaming.py'),
('Incremental Updates', 'test_incremental.py'),
('Multi-Language', 'test_multilang.py'),
('Embeddings', 'test_embeddings.py'),
('Quality Metrics', 'test_quality.py'),
('Integration', 'test_integration.py'),
]
print("🧪 Running All Week 2 Tests")
print("=" * 70)
passed = 0
failed = 0
for name, script in tests:
print(f"\n▶ {name}...")
try:
result = subprocess.run(
[sys.executable, script],
capture_output=True,
text=True,
timeout=30
)
if result.returncode == 0:
print(f"✅ {name} PASSED")
passed += 1
else:
print(f"❌ {name} FAILED")
print(result.stderr)
failed += 1
except Exception as e:
print(f"❌ {name} ERROR: {e}")
failed += 1
print("\n" + "=" * 70)
print(f"📊 Results: {passed} passed, {failed} failed")
if failed == 0:
print("🎉 All tests passed!")
else:
print(f"⚠️ {failed} test(s) failed")
sys.exit(1)
EOF
python run_all_tests.py
```
## 🎓 What Each Test Validates
| Test | Validates | Key Metrics |
|------|-----------|-------------|
| Vector DB | 4 export formats work | JSON structure, metadata |
| Streaming | Memory efficiency | Chunk count, overlap |
| Incremental | Change detection | Added/modified/deleted |
| Multi-Language | 11 languages | Detection accuracy |
| Embeddings | Caching & cost | Cache hit rate, cost |
| Quality | 4 dimensions | Grade, score, metrics |
| Integration | All together | End-to-end workflow |
## 🔧 Troubleshooting
### Import Errors
```bash
# Reinstall package
pip install -e .
```
### Test Failures
```bash
# Run with verbose output
python test_name.py -v
# Check Python version (requires 3.10+)
python --version
```
### Permission Errors
```bash
# Ensure test_output directory is writable
chmod -R 755 test_output/
```
## ✅ Success Criteria
All tests should show:
- ✅ Green checkmarks for passed steps
- 🎉 Success messages
- No ❌ error indicators
- Correct output formats
- Expected metrics within ranges
If all tests pass, Week 2 features are production-ready! 🚀