diff --git a/docs/DOCKER_GUIDE.md b/docs/DOCKER_GUIDE.md new file mode 100644 index 0000000..771aeec --- /dev/null +++ b/docs/DOCKER_GUIDE.md @@ -0,0 +1,575 @@ +# Docker Deployment Guide + +Complete guide for deploying Skill Seekers using Docker and Docker Compose. + +## Quick Start + +### 1. Prerequisites + +- Docker 20.10+ installed +- Docker Compose 2.0+ installed +- 2GB+ available RAM +- 5GB+ available disk space + +```bash +# Check Docker installation +docker --version +docker-compose --version +``` + +### 2. Clone Repository + +```bash +git clone https://github.com/your-org/skill-seekers.git +cd skill-seekers +``` + +### 3. Configure Environment + +```bash +# Copy environment template +cp .env.example .env + +# Edit .env with your API keys +nano .env # or your preferred editor +``` + +**Minimum Required:** +- `ANTHROPIC_API_KEY` - For AI enhancement features + +### 4. Start Services + +```bash +# Start all services (CLI + MCP server + vector DBs) +docker-compose up -d + +# Or start specific services +docker-compose up -d mcp-server weaviate +``` + +### 5. Verify Deployment + +```bash +# Check service status +docker-compose ps + +# Test CLI +docker-compose run skill-seekers skill-seekers --version + +# Test MCP server +curl http://localhost:8765/health +``` + +--- + +## Available Images + +### 1. skill-seekers (CLI) + +**Purpose:** Main CLI application for documentation scraping and skill generation + +**Usage:** +```bash +# Run CLI command +docker run --rm \ + -v $(pwd)/output:/output \ + -e ANTHROPIC_API_KEY=your-key \ + skill-seekers skill-seekers scrape --config /configs/react.json + +# Interactive shell +docker run -it --rm skill-seekers bash +``` + +**Image Size:** ~400MB +**Platforms:** linux/amd64, linux/arm64 + +### 2. 
skill-seekers-mcp (MCP Server) + +**Purpose:** MCP server with 25 tools for AI assistants + +**Usage:** +```bash +# HTTP mode (default) +docker run -d -p 8765:8765 \ + -e ANTHROPIC_API_KEY=your-key \ + skill-seekers-mcp + +# Stdio mode +docker run -it \ + -e ANTHROPIC_API_KEY=your-key \ + skill-seekers-mcp \ + python -m skill_seekers.mcp.server_fastmcp --transport stdio +``` + +**Image Size:** ~450MB +**Platforms:** linux/amd64, linux/arm64 +**Health Check:** http://localhost:8765/health + +--- + +## Docker Compose Services + +### Service Architecture + +``` +┌─────────────────────┐ +│ skill-seekers │ CLI Application +└─────────────────────┘ + +┌─────────────────────┐ +│ mcp-server │ MCP Server (25 tools) +│ Port: 8765 │ +└─────────────────────┘ + +┌─────────────────────┐ +│ weaviate │ Vector DB (hybrid search) +│ Port: 8080 │ +└─────────────────────┘ + +┌─────────────────────┐ +│ qdrant │ Vector DB (native filtering) +│ Ports: 6333/6334 │ +└─────────────────────┘ + +┌─────────────────────┐ +│ chroma │ Vector DB (local-first) +│ Port: 8000 │ +└─────────────────────┘ +``` + +### Service Commands + +```bash +# Start all services +docker-compose up -d + +# Start specific services +docker-compose up -d mcp-server weaviate + +# Stop all services +docker-compose down + +# View logs +docker-compose logs -f mcp-server + +# Restart service +docker-compose restart mcp-server + +# Scale service (if supported) +docker-compose up -d --scale mcp-server=3 +``` + +--- + +## Common Use Cases + +### Use Case 1: Scrape Documentation + +```bash +# Create skill from React documentation +docker-compose run skill-seekers \ + skill-seekers scrape --config /configs/react.json + +# Output will be in ./output/react/ +``` + +### Use Case 2: Export to Vector Databases + +```bash +# Export React skill to all vector databases +docker-compose run skill-seekers bash -c " + skill-seekers scrape --config /configs/react.json && + python -c ' +import sys +from pathlib import Path +sys.path.insert(0, 
\"/app/src\") +from skill_seekers.cli.adaptors import get_adaptor + +for target in [\"weaviate\", \"chroma\", \"faiss\", \"qdrant\"]: + adaptor = get_adaptor(target) + adaptor.package(Path(\"/output/react\"), Path(\"/output\")) + print(f\"✅ Exported to {target}\") + ' +" +``` + +### Use Case 3: Run Quality Analysis + +```bash +# Generate quality report for a skill +docker-compose run skill-seekers bash -c " + python3 <<'EOF' +import sys +from pathlib import Path +sys.path.insert(0, '/app/src') +from skill_seekers.cli.quality_metrics import QualityAnalyzer + +analyzer = QualityAnalyzer(Path('/output/react')) +report = analyzer.generate_report() +print(analyzer.format_report(report)) +EOF +" +``` + +### Use Case 4: MCP Server Integration + +```bash +# Start MCP server +docker-compose up -d mcp-server + +# Configure Claude Desktop +# Add to ~/Library/Application Support/Claude/claude_desktop_config.json: +{ + "mcpServers": { + "skill-seekers": { + "url": "http://localhost:8765/sse" + } + } +} +``` + +--- + +## Volume Management + +### Default Volumes + +| Volume | Path | Purpose | +|--------|------|---------| +| `./data` | `/data` | Persistent data (cache, logs) | +| `./configs` | `/configs` | Configuration files (read-only) | +| `./output` | `/output` | Generated skills and exports | +| `weaviate-data` | N/A | Weaviate database storage | +| `qdrant-data` | N/A | Qdrant database storage | +| `chroma-data` | N/A | Chroma database storage | + +### Backup Volumes + +```bash +# Backup vector database data +docker run --rm -v skill-seekers_weaviate-data:/data -v $(pwd):/backup \ + alpine tar czf /backup/weaviate-backup.tar.gz -C /data . 
+ +# Restore from backup +docker run --rm -v skill-seekers_weaviate-data:/data -v $(pwd):/backup \ + alpine tar xzf /backup/weaviate-backup.tar.gz -C /data +``` + +### Clean Up Volumes + +```bash +# Remove all volumes (WARNING: deletes all data) +docker-compose down -v + +# Remove specific volume +docker volume rm skill-seekers_weaviate-data +``` + +--- + +## Environment Variables + +### Required Variables + +| Variable | Description | Example | +|----------|-------------|---------| +| `ANTHROPIC_API_KEY` | Claude AI API key | `sk-ant-...` | + +### Optional Variables + +| Variable | Description | Default | +|----------|-------------|---------| +| `GOOGLE_API_KEY` | Gemini API key | - | +| `OPENAI_API_KEY` | OpenAI API key | - | +| `GITHUB_TOKEN` | GitHub API token | - | +| `MCP_TRANSPORT` | MCP transport mode | `http` | +| `MCP_PORT` | MCP server port | `8765` | + +### Setting Variables + +**Option 1: .env file (recommended)** +```bash +cp .env.example .env +# Edit .env with your keys +``` + +**Option 2: Export in shell** +```bash +export ANTHROPIC_API_KEY=sk-ant-your-key +docker-compose up -d +``` + +**Option 3: Inline** +```bash +ANTHROPIC_API_KEY=sk-ant-your-key docker-compose up -d +``` + +--- + +## Building Images Locally + +### Build CLI Image + +```bash +docker build -t skill-seekers:local -f Dockerfile . +``` + +### Build MCP Server Image + +```bash +docker build -t skill-seekers-mcp:local -f Dockerfile.mcp . +``` + +### Build with Custom Base Image + +```bash +# Use slim base (smaller) +docker build -t skill-seekers:slim \ + --build-arg BASE_IMAGE=python:3.12-slim \ + -f Dockerfile . + +# Use alpine base (smallest) +docker build -t skill-seekers:alpine \ + --build-arg BASE_IMAGE=python:3.12-alpine \ + -f Dockerfile . 
+```
+
+---
+
+## Troubleshooting
+
+### Issue: MCP Server Won't Start
+
+**Symptoms:**
+- Container exits immediately
+- Health check fails
+
+**Solutions:**
+```bash
+# Check logs
+docker-compose logs mcp-server
+
+# Verify port is available
+lsof -i :8765
+
+# Test MCP package installation
+docker-compose run mcp-server python -c "import mcp; print('OK')"
+```
+
+### Issue: Permission Denied
+
+**Symptoms:**
+- Cannot write to /output
+- Cannot access /configs
+
+**Solutions:**
+```bash
+# Fix ownership to match the container user (UID 1000)
+sudo chown -R 1000:1000 data/ output/
+
+# Quick workaround for local testing only (makes the directories world-writable)
+chmod -R 777 data/ output/
+
+# Or use specific user ID
+docker-compose run -u "$(id -u):$(id -g)" skill-seekers ...
+```
+
+### Issue: Out of Memory
+
+**Symptoms:**
+- Container killed
+- Exit code 137 in `docker-compose ps`, or `OOMKilled: true` in `docker inspect`
+
+**Solutions:**
+```bash
+# Increase Docker memory limit
+# Edit docker-compose.yml, add:
+#   services:
+#     skill-seekers:
+#       mem_limit: 4g
+#       memswap_limit: 4g
+
+# Or use streaming for large docs
+docker-compose run skill-seekers \
+  skill-seekers scrape --config /configs/react.json --streaming
+```
+
+### Issue: Vector Database Connection Failed
+
+**Symptoms:**
+- Cannot connect to Weaviate/Qdrant/Chroma
+- Connection refused errors
+
+**Solutions:**
+```bash
+# Check if services are running
+docker-compose ps
+
+# Test connectivity
+docker-compose exec skill-seekers curl http://weaviate:8080
+docker-compose exec skill-seekers curl http://qdrant:6333
+docker-compose exec skill-seekers curl http://chroma:8000
+
+# Restart services
+docker-compose restart weaviate qdrant chroma
+```
+
+### Issue: Slow Performance
+
+**Symptoms:**
+- Long scraping times
+- Slow container startup
+
+**Solutions:**
+```bash
+# Use smaller image (build the slim variant locally, see "Build with Custom Base Image")
+docker build -t skill-seekers:slim \
+  --build-arg BASE_IMAGE=python:3.12-slim \
+  -f Dockerfile .
+
+# Enable BuildKit cache
+export DOCKER_BUILDKIT=1
+docker build -t skill-seekers:local .
+
+# Increase CPU allocation
+# `docker-compose up` has no --cpu-shares flag: set `cpus` (or `cpu_shares`)
+# for the service in docker-compose.yml, then recreate the container
+docker-compose up -d skill-seekers
+```
+
+---
+
+## Production Deployment
+
+### Security Hardening
+
+1. 
**Use secrets management**
+```bash
+# Docker secrets (Swarm mode)
+echo "sk-ant-your-key" | docker secret create anthropic_key -
+
+# Kubernetes secrets
+kubectl create secret generic skill-seekers-secrets \
+  --from-literal=anthropic-api-key=sk-ant-your-key
+```
+
+2. **Run as non-root**
+```dockerfile
+# Already configured in Dockerfile (UID 1000).
+# Dockerfile comments must be on their own line, not after an instruction.
+USER skillseeker
+```
+
+3. **Read-only filesystems**
+```yaml
+# docker-compose.yml
+services:
+  mcp-server:
+    read_only: true
+    tmpfs:
+      - /tmp
+```
+
+4. **Resource limits**
+```yaml
+services:
+  mcp-server:
+    deploy:
+      resources:
+        limits:
+          cpus: '2.0'
+          memory: 2G
+        reservations:
+          cpus: '0.5'
+          memory: 512M
+```
+
+### Monitoring
+
+1. **Health checks**
+```bash
+# Check all services
+docker-compose ps
+
+# Detailed health status
+docker inspect --format='{{.State.Health.Status}}' skill-seekers-mcp
+```
+
+2. **Logs**
+```bash
+# Stream logs
+docker-compose logs -f --tail=100
+
+# Export logs
+docker-compose logs > skill-seekers-logs.txt
+```
+
+3. **Metrics**
+```bash
+# Resource usage
+docker stats
+
+# Container inspect
+docker-compose exec mcp-server ps aux
+docker-compose exec mcp-server df -h
+```
+
+### Scaling
+
+1. **Horizontal scaling**
+```bash
+# Scale MCP servers
+# (if the service publishes a fixed host port such as 8765, replicas will
+# conflict on it; remove the host port mapping or put a proxy in front first)
+docker-compose up -d --scale mcp-server=3
+
+# Use load balancer
+# Add nginx/haproxy in docker-compose.yml
+```
+
+2. **Vertical scaling**
+```yaml
+# Increase resources
+services:
+  mcp-server:
+    deploy:
+      resources:
+        limits:
+          cpus: '4.0'
+          memory: 8G
+```
+
+---
+
+## Best Practices
+
+### 1. Use Multi-Stage Builds
+✅ Already implemented in Dockerfile
+- Builder stage for dependencies
+- Runtime stage for production
+
+### 2. Minimize Image Size
+- Use slim base images
+- Clean up apt cache
+- Remove unnecessary files via .dockerignore
+
+### 3. Security
+- Run as non-root user (UID 1000)
+- Use secrets for sensitive data
+- Keep images updated
+
+### 4. 
Persistence
+- Use named volumes for databases
+- Mount ./output for generated skills
+- Back up vector DB data regularly
+
+### 5. Monitoring
+- Enable health checks
+- Stream logs to external service
+- Monitor resource usage
+
+---
+
+## Additional Resources
+
+- [Docker Documentation](https://docs.docker.com/)
+- [Docker Compose Reference](https://docs.docker.com/compose/compose-file/)
+- [Skill Seekers Documentation](https://skillseekersweb.com/)
+- [MCP Server Setup](MCP_SETUP.md)
+- [Vector Database Integration](strategy/WEEK2_COMPLETE.md)
+
+---
+
+**Last Updated:** February 7, 2026
+**Docker Version:** 20.10+
+**Compose Version:** 2.0+
diff --git a/docs/KUBERNETES_GUIDE.md b/docs/KUBERNETES_GUIDE.md
new file mode 100644
index 0000000..f5fe8e8
--- /dev/null
+++ b/docs/KUBERNETES_GUIDE.md
@@ -0,0 +1,957 @@
+# Kubernetes Deployment Guide
+
+Complete guide for deploying Skill Seekers to Kubernetes using Helm charts.
+
+## Table of Contents
+
+- [Prerequisites](#prerequisites)
+- [Quick Start](#quick-start)
+- [Installation Methods](#installation-methods)
+- [Configuration](#configuration)
+- [Accessing Services](#accessing-services)
+- [Scaling](#scaling)
+- [Persistence](#persistence)
+- [Vector Databases](#vector-databases)
+- [Security](#security)
+- [Monitoring](#monitoring)
+- [Troubleshooting](#troubleshooting)
+- [Production Best Practices](#production-best-practices)
+- [Upgrade Guide](#upgrade-guide)
+- [Uninstallation](#uninstallation)
+
+## Prerequisites
+
+### Required
+
+- Kubernetes cluster (1.23+)
+- Helm 3.8+
+- kubectl configured for your cluster
+- 20GB+ available storage (for persistence)
+
+### Recommended
+
+- Ingress controller (nginx, traefik)
+- cert-manager (for TLS certificates)
+- Prometheus operator (for monitoring)
+- Persistent storage provisioner
+
+### Cluster Resource Requirements
+
+**Minimum (Development):**
+- 2 CPU cores
+- 8GB RAM
+- 20GB storage
+
+**Recommended (Production):**
+- 8+ CPU cores
+- 32GB+ RAM
+- 200GB+ storage (persistent volumes)
+
+## Quick Start
+
+### 1. 
Add Helm Repository (if published) + +```bash +# Add Helm repo +helm repo add skill-seekers https://yourusername.github.io/skill-seekers +helm repo update + +# Install with default values +helm install my-skill-seekers skill-seekers/skill-seekers \ + --create-namespace \ + --namespace skill-seekers +``` + +### 2. Install from Local Chart + +```bash +# Clone repository +git clone https://github.com/yourusername/skill-seekers.git +cd skill-seekers + +# Install chart +helm install my-skill-seekers ./helm/skill-seekers \ + --create-namespace \ + --namespace skill-seekers +``` + +### 3. Quick Test + +```bash +# Port-forward MCP server +kubectl port-forward -n skill-seekers svc/my-skill-seekers-mcp 8765:8765 + +# Test health endpoint +curl http://localhost:8765/health + +# Expected response: {"status": "ok"} +``` + +## Installation Methods + +### Method 1: Minimal Installation (Testing) + +Smallest deployment for testing - no persistence, no vector databases. + +```bash +helm install my-skill-seekers ./helm/skill-seekers \ + --namespace skill-seekers \ + --create-namespace \ + --set persistence.enabled=false \ + --set vectorDatabases.weaviate.enabled=false \ + --set vectorDatabases.qdrant.enabled=false \ + --set vectorDatabases.chroma.enabled=false \ + --set mcpServer.replicaCount=1 \ + --set mcpServer.autoscaling.enabled=false +``` + +### Method 2: Development Installation + +Moderate resources with persistence for local development. + +```bash +helm install my-skill-seekers ./helm/skill-seekers \ + --namespace skill-seekers \ + --create-namespace \ + --set persistence.data.size=5Gi \ + --set persistence.output.size=10Gi \ + --set vectorDatabases.weaviate.persistence.size=20Gi \ + --set mcpServer.replicaCount=1 \ + --set secrets.anthropicApiKey="sk-ant-..." +``` + +### Method 3: Production Installation + +Full production deployment with autoscaling, persistence, and all vector databases. 
+ +```bash +helm install my-skill-seekers ./helm/skill-seekers \ + --namespace skill-seekers \ + --create-namespace \ + --values production-values.yaml +``` + +**production-values.yaml:** +```yaml +global: + environment: production + +mcpServer: + enabled: true + replicaCount: 3 + autoscaling: + enabled: true + minReplicas: 3 + maxReplicas: 20 + targetCPUUtilizationPercentage: 70 + resources: + limits: + cpu: 2000m + memory: 4Gi + requests: + cpu: 500m + memory: 1Gi + +persistence: + data: + size: 20Gi + storageClass: "fast-ssd" + output: + size: 50Gi + storageClass: "fast-ssd" + +vectorDatabases: + weaviate: + enabled: true + persistence: + size: 100Gi + storageClass: "fast-ssd" + qdrant: + enabled: true + persistence: + size: 100Gi + storageClass: "fast-ssd" + chroma: + enabled: true + persistence: + size: 50Gi + storageClass: "fast-ssd" + +ingress: + enabled: true + className: nginx + annotations: + cert-manager.io/cluster-issuer: "letsencrypt-prod" + nginx.ingress.kubernetes.io/ssl-redirect: "true" + hosts: + - host: skill-seekers.example.com + paths: + - path: /mcp + pathType: Prefix + backend: + service: + name: mcp + port: 8765 + tls: + - secretName: skill-seekers-tls + hosts: + - skill-seekers.example.com + +secrets: + anthropicApiKey: "sk-ant-..." 
+ googleApiKey: "" + openaiApiKey: "" + githubToken: "" +``` + +### Method 4: Custom Values Installation + +```bash +# Create custom values +cat > my-values.yaml < skill-seekers-data-backup.tar.gz +``` + +**Restore:** +```bash +# Using Velero +velero restore create --from-backup skill-seekers-backup + +# Manual restore +kubectl exec -i -n skill-seekers deployment/my-skill-seekers-mcp -- \ + tar xzf - -C /data < skill-seekers-data-backup.tar.gz +``` + +## Vector Databases + +### Weaviate + +**Access:** +```bash +kubectl port-forward -n skill-seekers svc/my-skill-seekers-weaviate 8080:8080 +``` + +**Query:** +```bash +curl http://localhost:8080/v1/schema +``` + +### Qdrant + +**Access:** +```bash +# HTTP API +kubectl port-forward -n skill-seekers svc/my-skill-seekers-qdrant 6333:6333 + +# gRPC +kubectl port-forward -n skill-seekers svc/my-skill-seekers-qdrant 6334:6334 +``` + +**Query:** +```bash +curl http://localhost:6333/collections +``` + +### Chroma + +**Access:** +```bash +kubectl port-forward -n skill-seekers svc/my-skill-seekers-chroma 8000:8000 +``` + +**Query:** +```bash +curl http://localhost:8000/api/v1/collections +``` + +### Disable Vector Databases + +To disable individual vector databases: + +```yaml +vectorDatabases: + weaviate: + enabled: false + qdrant: + enabled: false + chroma: + enabled: false +``` + +## Security + +### Pod Security Context + +Runs as non-root user (UID 1000): + +```yaml +podSecurityContext: + runAsNonRoot: true + runAsUser: 1000 + fsGroup: 1000 + +securityContext: + capabilities: + drop: + - ALL + readOnlyRootFilesystem: false + allowPrivilegeEscalation: false +``` + +### Network Policies + +Create network policies for isolation: + +```yaml +networkPolicy: + enabled: true + policyTypes: + - Ingress + - Egress + ingress: + - from: + - namespaceSelector: + matchLabels: + name: ingress-nginx + egress: + - to: + - namespaceSelector: {} +``` + +### RBAC + +Enable RBAC with minimal permissions: + +```yaml +rbac: + create: true + 
rules: + - apiGroups: [""] + resources: ["configmaps", "secrets"] + verbs: ["get", "list"] +``` + +### Secrets Management + +**Best Practices:** +1. Never commit secrets to git +2. Use external secret managers (AWS Secrets Manager, HashiCorp Vault) +3. Enable encryption at rest in Kubernetes +4. Rotate secrets regularly + +**Example with Sealed Secrets:** +```bash +# Create sealed secret +kubectl create secret generic skill-seekers-secrets \ + --from-literal=ANTHROPIC_API_KEY="sk-ant-..." \ + --dry-run=client -o yaml | \ + kubeseal -o yaml > sealed-secret.yaml + +# Apply sealed secret +kubectl apply -f sealed-secret.yaml -n skill-seekers +``` + +## Monitoring + +### Pod Metrics + +```bash +# View pod status +kubectl get pods -n skill-seekers + +# View pod metrics (requires metrics-server) +kubectl top pods -n skill-seekers + +# View pod logs +kubectl logs -n skill-seekers -l app.kubernetes.io/component=mcp-server --tail=100 -f +``` + +### Prometheus Integration + +Enable ServiceMonitor (requires Prometheus Operator): + +```yaml +serviceMonitor: + enabled: true + interval: 30s + scrapeTimeout: 10s + labels: + prometheus: kube-prometheus +``` + +### Grafana Dashboards + +Import dashboard JSON from `helm/skill-seekers/dashboards/`. 
+
+### Health Checks
+
+MCP server has built-in health checks:
+
+```yaml
+livenessProbe:
+  httpGet:
+    path: /health
+    port: 8765
+  initialDelaySeconds: 30
+  periodSeconds: 10
+
+readinessProbe:
+  httpGet:
+    path: /health
+    port: 8765
+  initialDelaySeconds: 10
+  periodSeconds: 5
+```
+
+Test manually:
+```bash
+kubectl exec -n skill-seekers deployment/my-skill-seekers-mcp -- \
+  curl http://localhost:8765/health
+```
+
+## Troubleshooting
+
+### Pods Not Starting
+
+```bash
+# Check pod status
+kubectl get pods -n skill-seekers
+
+# View events
+kubectl get events -n skill-seekers --sort-by='.lastTimestamp'
+
+# Describe pod
+kubectl describe pod -n skill-seekers <pod-name>
+
+# Check logs
+kubectl logs -n skill-seekers <pod-name>
+```
+
+### Common Issues
+
+**Issue: ImagePullBackOff**
+```bash
+# Check image pull secrets
+kubectl get secrets -n skill-seekers
+
+# Verify image exists
+docker pull <image>
+```
+
+**Issue: CrashLoopBackOff**
+```bash
+# View logs from the previous (crashed) container
+kubectl logs -n skill-seekers <pod-name> --previous
+
+# Check environment variables
+kubectl exec -n skill-seekers <pod-name> -- env
+```
+
+**Issue: PVC Pending**
+```bash
+# Check storage class
+kubectl get storageclass
+
+# View PVC events
+kubectl describe pvc -n skill-seekers
+
+# Check if provisioner is running
+kubectl get pods -n kube-system | grep provisioner
+```
+
+**Issue: API Key Not Working**
+```bash
+# Verify secret exists
+kubectl get secret -n skill-seekers my-skill-seekers
+
+# Check secret contents (base64 encoded)
+kubectl get secret -n skill-seekers my-skill-seekers -o yaml
+
+# Confirm the key is present in the container environment
+kubectl exec -n skill-seekers deployment/my-skill-seekers-mcp -- \
+  env | grep ANTHROPIC
+```
+
+### Debug Container
+
+Run debug container in same namespace:
+
+```bash
+kubectl run debug -n skill-seekers --rm -it \
+  --image=nicolaka/netshoot \
+  --restart=Never -- bash
+
+# Inside debug container:
+# Test MCP server connectivity
+curl http://my-skill-seekers-mcp:8765/health
+
+# Test vector database connectivity
+curl 
http://my-skill-seekers-weaviate:8080/v1/.well-known/ready
+```
+
+## Production Best Practices
+
+### 1. Resource Planning
+
+**Capacity Planning:**
+- MCP Server: 500m CPU + 1Gi RAM per 10 concurrent requests
+- Vector DBs: 2GB RAM + 10GB storage per 100K documents
+- Reserve 30% overhead for spikes
+
+**Example Production Setup:**
+```yaml
+mcpServer:
+  replicaCount: 5 # Handle 50 concurrent requests (10 per replica)
+  resources:
+    requests: # Requests apply per pod, not to the whole deployment
+      cpu: 500m
+      memory: 1Gi
+  autoscaling:
+    minReplicas: 5
+    maxReplicas: 20
+```
+
+### 2. High Availability
+
+**Anti-Affinity Rules:**
+```yaml
+mcpServer:
+  affinity:
+    podAntiAffinity:
+      requiredDuringSchedulingIgnoredDuringExecution:
+        - labelSelector:
+            matchExpressions:
+              - key: app.kubernetes.io/component
+                operator: In
+                values:
+                  - mcp-server
+          topologyKey: kubernetes.io/hostname
+```
+
+**Multiple Replicas:**
+- MCP Server: 3+ replicas across different nodes
+- Vector DBs: 2+ replicas with replication
+
+### 3. Monitoring and Alerting
+
+**Key Metrics to Monitor:**
+- Pod restart count (> 5 per hour = critical)
+- Memory usage (> 90% = warning)
+- CPU throttling (> 50% = investigate)
+- Request latency (p95 > 1s = warning)
+- Error rate (> 1% = critical)
+
+**Prometheus Alerts:**
+```yaml
+# Matches the "> 5 restarts per hour" threshold above
+- alert: HighPodRestarts
+  expr: increase(kube_pod_container_status_restarts_total{namespace="skill-seekers"}[1h]) > 5
+  for: 5m
+  labels:
+    severity: critical
+```
+
+### 4. Backup Strategy
+
+**Automated Backups:**
+```yaml
+# CronJob for daily backups
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: skill-seekers-backup
+spec:
+  schedule: "0 2 * * *" # 2 AM daily
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          restartPolicy: OnFailure # Required for Job pods
+          containers:
+            - name: backup
+              image: skill-seekers:latest
+              command:
+                - /bin/sh
+                - -c
+                - tar czf /backup/data-$(date +%Y%m%d).tar.gz /data
+              # Mount the /data and /backup volumes here (omitted in this example)
+```
+
+### 5. 
Security Hardening
+
+**Security Checklist:**
+- [ ] Enable Pod Security Standards
+- [ ] Use Network Policies
+- [ ] Enable RBAC with least privilege
+- [ ] Rotate secrets every 90 days
+- [ ] Scan images for vulnerabilities
+- [ ] Enable audit logging
+- [ ] Use private container registry
+- [ ] Enable encryption at rest
+
+### 6. Cost Optimization
+
+**Strategies:**
+- Use spot/preemptible instances for non-critical workloads
+- Enable cluster autoscaler
+- Right-size resource requests
+- Use storage tiering (hot/warm/cold)
+- Schedule downscaling during off-hours
+
+**Example Cost Optimization:**
+```yaml
+# Development environment: downscale at night
+# Create CronJob to scale down replicas
+apiVersion: batch/v1
+kind: CronJob
+metadata:
+  name: downscale-dev
+spec:
+  schedule: "0 20 * * *" # 8 PM
+  jobTemplate:
+    spec:
+      template:
+        spec:
+          serviceAccountName: scaler
+          restartPolicy: OnFailure # Required for Job pods
+          containers:
+            - name: kubectl
+              image: bitnami/kubectl
+              command:
+                - kubectl
+                - scale
+                - deployment
+                - my-skill-seekers-mcp
+                - --replicas=1
+```
+
+### 7. Update Strategy
+
+**Rolling Updates:**
+```yaml
+mcpServer:
+  strategy:
+    type: RollingUpdate
+    rollingUpdate:
+      maxSurge: 1
+      maxUnavailable: 0
+```
+
+**Update Process:**
+```bash
+# 1. Test in staging
+helm upgrade my-skill-seekers ./helm/skill-seekers \
+  --namespace skill-seekers-staging \
+  --values staging-values.yaml
+
+# 2. Run smoke tests
+./scripts/smoke-test.sh
+
+# 3. Deploy to production
+helm upgrade my-skill-seekers ./helm/skill-seekers \
+  --namespace skill-seekers \
+  --values production-values.yaml
+
+# 4. Wait for the rollout to finish (up to 15 minutes), then keep monitoring
+kubectl rollout status deployment/my-skill-seekers-mcp -n skill-seekers --timeout=15m
+
+# 5. 
Rollback if issues
+helm rollback my-skill-seekers -n skill-seekers
+```
+
+## Upgrade Guide
+
+### Minor Version Upgrade
+
+```bash
+# Fetch latest chart
+helm repo update
+
+# Upgrade with existing values
+helm upgrade my-skill-seekers skill-seekers/skill-seekers \
+  --namespace skill-seekers \
+  --reuse-values
+```
+
+### Major Version Upgrade
+
+```bash
+# Backup current values
+helm get values my-skill-seekers -n skill-seekers > backup-values.yaml
+
+# Review CHANGELOG for breaking changes
+curl https://raw.githubusercontent.com/yourusername/skill-seekers/main/CHANGELOG.md
+
+# Upgrade with migration steps
+# (--force makes Helm update resources through a replacement strategy;
+# add it only when a normal upgrade fails)
+helm upgrade my-skill-seekers skill-seekers/skill-seekers \
+  --namespace skill-seekers \
+  --values backup-values.yaml \
+  --force
+```
+
+## Uninstallation
+
+### Full Cleanup
+
+```bash
+# Delete Helm release
+helm uninstall my-skill-seekers -n skill-seekers
+
+# Delete PVCs (if you want to remove data)
+kubectl delete pvc -n skill-seekers --all
+
+# Delete namespace
+kubectl delete namespace skill-seekers
+```
+
+### Keep Data
+
+```bash
+# Delete release but keep PVCs
+# (PVCs templated directly in the chart are removed with the release unless
+# annotated with helm.sh/resource-policy: keep; check before uninstalling)
+helm uninstall my-skill-seekers -n skill-seekers
+
+# Confirm the PVCs remain for later use
+kubectl get pvc -n skill-seekers
+```
+
+## Additional Resources
+
+- [Helm Documentation](https://helm.sh/docs/)
+- [Kubernetes Documentation](https://kubernetes.io/docs/)
+- [Skill Seekers GitHub](https://github.com/yourusername/skill-seekers)
+- [Issue Tracker](https://github.com/yourusername/skill-seekers/issues)
+
+---
+
+**Need Help?**
+- GitHub Issues: https://github.com/yourusername/skill-seekers/issues
+- Documentation: https://skillseekersweb.com
+- Community: [Link to Discord/Slack]