feat: Add integration testing with real vector databases (Phase 5)

Phase 5 of optional enhancements: Integration Testing

**New Files:**
- tests/docker-compose.test.yml (Docker Compose configuration)
  - Weaviate service (port 8080) with health checks
  - Qdrant service (ports 6333, 6334) with persistent storage
  - ChromaDB service (port 8000) with persistent storage
  - Restart on failure (up to 3 attempts) and health checks for all services
  - Named volumes (qdrant_data, chroma_data) for data persistence

- tests/test_integration_adaptors.py (695 lines)
  - 6 comprehensive integration tests with pytest
  - 3 test classes: TestWeaviateIntegration, TestChromaIntegration, TestQdrantIntegration
  - Complete workflows: package → upload → query → verify → cleanup
  - Metadata preservation tests
  - Query filtering tests (ChromaDB, Qdrant)
  - Graceful skipping when services are unavailable
  - Best-effort cleanup in all tests
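
A minimal sketch of how the graceful skipping could work, assuming a plain TCP reachability check against the ports published in tests/docker-compose.test.yml. The names `service_available`, `SERVICE_PORTS`, and `unavailable_services` are hypothetical and are not taken from tests/test_integration_adaptors.py:

```python
import socket

# Host ports published by tests/docker-compose.test.yml.
SERVICE_PORTS = {"weaviate": 8080, "qdrant": 6333, "chroma": 8000}


def service_available(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or unresolvable host: treat as "not running".
        return False


def unavailable_services(host: str = "localhost") -> list[str]:
    """Names of the services whose port does not accept connections."""
    return [
        name
        for name, port in SERVICE_PORTS.items()
        if not service_available(host, port)
    ]
```

Inside a pytest test, a guard such as `if not service_available("localhost", 8080): pytest.skip("Weaviate not running")` would then mark the test as skipped instead of failed. An open port does not prove the database is ready, so the actual tests may use a stricter check.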

- scripts/run_integration_tests.sh (executable runner)
  - Colored terminal output
  - Automated service lifecycle management
  - Health check verification for all services
  - Automatic client library installation
  - Commands: start, stop, test, run, logs, status, help
  - Complete workflow: start → test → stop
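
The health check verification can be pictured as a polling loop. The sketch below is written in Python for illustration only (the runner itself is a shell script) and reuses the readiness URLs and the 5s interval / 10 retries from the Compose healthchecks. `http_ok` and `wait_until_ready` are hypothetical names:

```python
import http.client
import time
import urllib.request
from typing import Callable

# Readiness URLs probed by the healthchecks in tests/docker-compose.test.yml.
HEALTH_URLS = {
    "weaviate": "http://localhost:8080/v1/.well-known/ready",
    "qdrant": "http://localhost:6333/",
    "chroma": "http://localhost:8000/api/v1/heartbeat",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses: any failure means "not ready".
        return False


def wait_until_ready(
    url: str,
    retries: int = 10,
    interval: float = 5.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `retries` times, sleeping `interval` seconds between attempts."""
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

Calling `all(wait_until_ready(url) for url in HEALTH_URLS.values())` before running the tests would block until every service reports ready or the retries run out. The `probe` and `sleep` parameters exist only so the loop can be exercised without a network.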

**Test Results:**
- All 6 integration tests skip gracefully when services are not running
- All 164 adaptor tests still pass
- No regressions detected

**Usage:**
# Complete workflow (start services, run tests, clean up)
./scripts/run_integration_tests.sh

# Or manage manually
docker-compose -f tests/docker-compose.test.yml up -d
pytest tests/test_integration_adaptors.py -v -m integration
docker-compose -f tests/docker-compose.test.yml down -v

# Individual commands
./scripts/run_integration_tests.sh start   # Start services only
./scripts/run_integration_tests.sh test    # Run tests only
./scripts/run_integration_tests.sh stop    # Stop services
./scripts/run_integration_tests.sh logs    # View service logs
./scripts/run_integration_tests.sh status  # Check service status

**Test Coverage:**
✓ Weaviate: Complete workflow + metadata preservation (2 tests)
✓ ChromaDB: Complete workflow + query filtering (2 tests)
✓ Qdrant: Complete workflow + payload filtering (2 tests)

**Key Features:**
• Real database integration (not mocks)
• Complete end-to-end workflows
• Metadata validation across all platforms
• Query filtering demonstrations
• Automatic cleanup (best-effort)
• Graceful degradation (skip if services unavailable)
• Health checks ensure service readiness
• Persistent storage with Docker volumes
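
As an illustration of the end-to-end pattern with best-effort cleanup, here is a minimal sketch. `InMemoryStore` is a hypothetical stand-in for a real database client, not any library's API, and the packaging step is omitted:

```python
import contextlib


class InMemoryStore:
    """Hypothetical stand-in for a vector database client (not a real API)."""

    def __init__(self):
        self.collections = {}

    def upload(self, name, docs):
        self.collections[name] = list(docs)

    def query(self, name, where):
        # Return documents whose metadata matches every key/value in `where`.
        return [
            d
            for d in self.collections[name]
            if all(d["metadata"].get(k) == v for k, v in where.items())
        ]

    def delete(self, name):
        del self.collections[name]


def run_workflow(store, name, docs, where):
    """upload -> query -> verify, followed by best-effort cleanup."""
    try:
        store.upload(name, docs)
        hits = store.query(name, where)
        # Verify that the filtered metadata survived the round trip.
        assert all(h["metadata"].items() >= where.items() for h in hits)
        return hits
    finally:
        # Best-effort: a failed cleanup must not mask the test outcome.
        with contextlib.suppress(Exception):
            store.delete(name)
```

Putting the cleanup in `finally` and suppressing its errors means a leftover collection never turns a passing test into a failure, at the cost of possibly leaving stale data behind.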

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
yusyus
2026-02-07 22:55:02 +03:00
parent b7e800614a
commit 6f9584ba67
3 changed files with 936 additions and 0 deletions


@@ -0,0 +1,66 @@
version: '3.8'

services:
  # Weaviate vector database
  weaviate:
    image: semitechnologies/weaviate:latest
    container_name: skill_seekers_test_weaviate
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      QUERY_DEFAULTS_LIMIT: 20
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
    restart: on-failure:3
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/v1/.well-known/ready"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s

  # Qdrant vector database
  qdrant:
    image: qdrant/qdrant:latest
    container_name: skill_seekers_test_qdrant
    ports:
      - "6333:6333"
      - "6334:6334"
    environment:
      QDRANT__SERVICE__GRPC_PORT: 6334
    volumes:
      - qdrant_data:/qdrant/storage
    restart: on-failure:3
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:6333/"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s

  # ChromaDB vector database
  chroma:
    image: chromadb/chroma:latest
    container_name: skill_seekers_test_chroma
    ports:
      - "8000:8000"
    environment:
      IS_PERSISTENT: 'TRUE'  # quoted so YAML passes a string, not a boolean
      ANONYMIZED_TELEMETRY: 'FALSE'  # quoted for the same reason
    volumes:
      - chroma_data:/chroma/chroma
    restart: on-failure:3
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8000/api/v1/heartbeat"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s

volumes:
  qdrant_data:
    driver: local
  chroma_data:
    driver: local