# Kubernetes Deployment Guide
Complete guide for deploying Skill Seekers on Kubernetes.
## Table of Contents
- [Prerequisites](#prerequisites)
- [Quick Start with Helm](#quick-start-with-helm)
- [Manual Deployment](#manual-deployment)
- [Configuration](#configuration)
- [Scaling](#scaling)
- [High Availability](#high-availability)
- [Monitoring](#monitoring)
- [Ingress & Load Balancing](#ingress--load-balancing)
- [Storage](#storage)
- [Security](#security)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
- [Next Steps](#next-steps)
## Prerequisites
### 1. Kubernetes Cluster
**Minimum requirements:**
- Kubernetes v1.21+ (the `autoscaling/v2` HorizontalPodAutoscaler manifest in this guide requires v1.23+)
- kubectl configured
- 2 nodes (minimum)
- 4 CPU cores total
- 8 GB RAM total
**Cloud providers:**
- **AWS:** EKS (Elastic Kubernetes Service)
- **GCP:** GKE (Google Kubernetes Engine)
- **Azure:** AKS (Azure Kubernetes Service)
- **Local:** Minikube, kind, k3s
### 2. Required Tools
```bash
# kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# Helm 3
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify installations
kubectl version --client
helm version
```
### 3. Cluster Access
```bash
# Verify cluster connection
kubectl cluster-info
kubectl get nodes
# Create namespace
kubectl create namespace skillseekers
kubectl config set-context --current --namespace=skillseekers
```
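The minimum version listed under Prerequisites can be checked in a script rather than by eye. The following is a minimal sketch, not part of Skill Seekers: `version_ge` is a hypothetical helper name, it compares only the major and minor components, and the live-cluster usage in the comments assumes `jq` is installed.

```bash
# Succeeds (exit 0) when version $1 is at least version $2, comparing major.minor only.
# The body runs in a subshell "( ... )" so its variables do not leak into the caller.
version_ge() (
  have="${1#v}"; want="${2#v}"
  have_major="${have%%.*}"; want_major="${want%%.*}"
  # Keep only the leading digits of the minor part ("28.3" -> "28", "27+" -> "27")
  have_minor="${have#*.}"; have_minor="${have_minor%%[!0-9]*}"
  want_minor="${want#*.}"; want_minor="${want_minor%%[!0-9]*}"
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
    return
  fi
  [ "$have_minor" -ge "$want_minor" ]
)

# Pure example with a made-up version string
version_ge v1.28.3 1.21 && echo "version ok"

# Example against a live cluster (assumes jq is installed):
#   server="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
#   version_ge "$server" 1.21 || echo "cluster is older than v1.21" >&2
```

Patch versions and vendor suffixes are ignored on purpose, since the requirement here is stated only as a minor version.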
## Quick Start with Helm
### 1. Install with Default Values
```bash
# Add Helm repository (when available)
helm repo add skillseekers https://charts.skillseekers.io
helm repo update
# Install release
helm install skillseekers skillseekers/skillseekers \
--namespace skillseekers \
--create-namespace
# Or install from local chart
helm install skillseekers ./helm/skillseekers \
--namespace skillseekers \
--create-namespace
```
### 2. Install with Custom Values
```bash
# Create values file
cat > values-prod.yaml <<EOF
replicaCount: 3
secrets:
  anthropicApiKey: "sk-ant-..."
  githubToken: "ghp_..."
  openaiApiKey: "sk-..."
resources:
  limits:
    cpu: 2000m
    memory: 4Gi
  requests:
    cpu: 1000m
    memory: 2Gi
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: api.skillseekers.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: skillseekers-tls
      hosts:
        - api.skillseekers.example.com
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
EOF
# Install with custom values
helm install skillseekers ./helm/skillseekers \
--namespace skillseekers \
--create-namespace \
--values values-prod.yaml
```
### 3. Helm Commands
```bash
# List releases
helm list -n skillseekers
# Get status
helm status skillseekers -n skillseekers
# Upgrade release
helm upgrade skillseekers ./helm/skillseekers \
--namespace skillseekers \
--values values-prod.yaml
# Rollback
helm rollback skillseekers 1 -n skillseekers
# Uninstall
helm uninstall skillseekers -n skillseekers
```
## Manual Deployment
### 1. Secrets
Create secrets for API keys:
```yaml
# secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: skillseekers-secrets
  namespace: skillseekers
type: Opaque
stringData:
  ANTHROPIC_API_KEY: "sk-ant-..."
  GITHUB_TOKEN: "ghp_..."
  OPENAI_API_KEY: "sk-..."
  VOYAGE_API_KEY: "..."
```
```bash
kubectl apply -f secrets.yaml
```
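The manifest above uses `stringData`, which accepts plain text; Kubernetes stores the values base64-encoded under `data`. If you ever need to write a `data:` entry by hand, the encoding can be sketched as below. `encode_secret_value` is a hypothetical helper name, and `sk-test` is a placeholder rather than a real key.

```bash
# Encode a value the way it would appear under `data:` in a Secret manifest.
# printf avoids the trailing newline that `echo` would add to the encoded value;
# tr strips the line wrapping that some base64 implementations insert.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

encode_secret_value 'sk-test'   # prints c2stdGVzdA==

# Alternative that avoids writing keys to a file at all
# (Secret and key names taken from secrets.yaml above):
#   kubectl create secret generic skillseekers-secrets -n skillseekers \
#     --from-literal=ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY"
```

Base64 is an encoding, not encryption, so a `secrets.yaml` holding real keys should stay out of version control either way.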
### 2. ConfigMap
```yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: skillseekers-config
  namespace: skillseekers
data:
  MCP_TRANSPORT: "http"
  MCP_PORT: "8765"
  LOG_LEVEL: "INFO"
  CACHE_TTL: "86400"
```
```bash
kubectl apply -f configmap.yaml
```
### 3. Deployment
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
  labels:
    app: skillseekers
    component: mcp-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: skillseekers
      component: mcp-server
  template:
    metadata:
      labels:
        app: skillseekers
        component: mcp-server
    spec:
      containers:
        - name: mcp-server
          image: skillseekers:2.9.0
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8765
              name: http
              protocol: TCP
          env:
            - name: MCP_TRANSPORT
              valueFrom:
                configMapKeyRef:
                  name: skillseekers-config
                  key: MCP_TRANSPORT
            - name: MCP_PORT
              valueFrom:
                configMapKeyRef:
                  name: skillseekers-config
                  key: MCP_PORT
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: skillseekers-secrets
                  key: ANTHROPIC_API_KEY
            - name: GITHUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: skillseekers-secrets
                  key: GITHUB_TOKEN
          resources:
            requests:
              cpu: 1000m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 4Gi
          livenessProbe:
            httpGet:
              path: /health
              port: 8765
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 8765
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 2
          volumeMounts:
            - name: data
              mountPath: /app/data
            - name: cache
              mountPath: /app/cache
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: skillseekers-data
        - name: cache
          emptyDir: {}
```
```bash
kubectl apply -f deployment.yaml
```
### 4. Service
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
  labels:
    app: skillseekers
    component: mcp-server
spec:
  type: ClusterIP
  ports:
    - port: 8765
      targetPort: 8765
      protocol: TCP
      name: http
  selector:
    app: skillseekers
    component: mcp-server
```
```bash
kubectl apply -f service.yaml
```
### 5. Verify Deployment
```bash
# Check pods
kubectl get pods -n skillseekers
# Check services
kubectl get svc -n skillseekers
# Check logs
kubectl logs -n skillseekers -l app=skillseekers --tail=100 -f
# Port forward for testing
kubectl port-forward -n skillseekers svc/skillseekers-mcp 8765:8765
# Test endpoint
curl http://localhost:8765/health
```
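Right after a rollout the `/health` endpoint may not answer on the first try, so a single `curl` can fail even when the deployment is fine. A small retry wrapper, sketched below, makes the check repeatable in scripts. `retry` is a hypothetical helper, not something Skill Seekers ships; the `curl` and `kubectl rollout status` calls in the comments use standard flags.

```bash
# Run a command until it succeeds, up to $1 attempts, sleeping $2 seconds between tries.
# Returns 0 on the first success and 1 if every attempt fails.
retry() (
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Only sleep when another attempt is still to come
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$(( i + 1 ))
  done
  return 1
)

# Example (requires the port-forward above to be running in another terminal):
#   retry 10 3 curl -fsS http://localhost:8765/health
#
# To wait for the rollout itself rather than the endpoint:
#   kubectl rollout status deployment/skillseekers-mcp -n skillseekers --timeout=120s
```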
## Configuration
### 1. Resource Requests & Limits
```yaml
resources:
  requests:
    cpu: 500m # Guaranteed CPU
    memory: 1Gi # Guaranteed memory
  limits:
    cpu: 2000m # Maximum CPU
    memory: 4Gi # Maximum memory
```
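The scheduler places pods by their requests, not their limits, so the sum of requests across all replicas has to fit into what the nodes can offer. The arithmetic can be sketched as follows, using the requests from this section (500m CPU, 1Gi memory) against the minimum cluster from Prerequisites. Treating 4 cores as 4000m and 8 GB as 8192Mi is an assumption, real allocatable capacity is lower once system components are accounted for, and both function names are hypothetical.

```bash
# Print the total CPU (millicores) and memory (Mi) requested by a deployment.
# Arguments: replicas, per-pod CPU request in millicores, per-pod memory request in Mi.
total_requests() {
  echo "$(( $1 * $2 ))m $(( $1 * $3 ))Mi"
}

# Succeed when the total requests fit into the given capacity.
# Arguments: replicas, cpu_m, mem_mi, capacity_cpu_m, capacity_mem_mi.
fits_capacity() {
  [ "$(( $1 * $2 ))" -le "$4" ] && [ "$(( $1 * $3 ))" -le "$5" ]
}

total_requests 3 500 1024     # 3 replicas: prints "1500m 3072Mi"
total_requests 10 500 1024    # 10 replicas: prints "5000m 10240Mi"
fits_capacity 10 500 1024 4000 8192 || echo "10 replicas exceed the minimum cluster"
```

With these numbers, three replicas fit comfortably, while the ten replicas allowed by the autoscaler settings elsewhere in this guide would need more nodes than the stated minimum.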
### 2. Environment Variables
```yaml
env:
  # From ConfigMap
  - name: LOG_LEVEL
    valueFrom:
      configMapKeyRef:
        name: skillseekers-config
        key: LOG_LEVEL
  # From Secret
  - name: ANTHROPIC_API_KEY
    valueFrom:
      secretKeyRef:
        name: skillseekers-secrets
        key: ANTHROPIC_API_KEY
  # Direct value
  - name: MCP_TRANSPORT
    value: "http"
```
### 3. Multi-Environment Setup
```bash
# Development
helm install skillseekers-dev ./helm/skillseekers \
--namespace skillseekers-dev \
--values values-dev.yaml
# Staging
helm install skillseekers-staging ./helm/skillseekers \
--namespace skillseekers-staging \
--values values-staging.yaml
# Production
helm install skillseekers-prod ./helm/skillseekers \
--namespace skillseekers-prod \
--values values-prod.yaml
```
## Scaling
### 1. Manual Scaling
```bash
# Scale deployment
kubectl scale deployment skillseekers-mcp -n skillseekers --replicas=5
# Verify
kubectl get pods -n skillseekers
```
### 2. Horizontal Pod Autoscaler (HPA)
```yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: skillseekers-mcp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 2
          periodSeconds: 15
      selectPolicy: Max
```
```bash
kubectl apply -f hpa.yaml
# Monitor autoscaling
kubectl get hpa -n skillseekers --watch
```
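To reason about what the autoscaler will do, it helps to work through its core formula: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to `minReplicas` and `maxReplicas`. The sketch below implements only that formula with integer arithmetic. It is a simplification: the real controller also applies a tolerance band (10% by default), takes the largest result when several metrics are configured, and rate-limits changes through the `behavior` policies. The function name is hypothetical.

```bash
# ceil(current_replicas * current_utilization / target_utilization), clamped to [min, max].
# Arguments: current replicas, current utilization %, target utilization %, min, max.
hpa_desired_replicas() (
  current="$1"; utilization="$2"; target="$3"; min="$4"; max="$5"
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired="$min"
  [ "$desired" -gt "$max" ] && desired="$max"
  echo "$desired"
)

hpa_desired_replicas 3 90 70 2 10     # 3 pods at 90% CPU, target 70%: prints 4
hpa_desired_replicas 10 100 70 2 10   # would be 15, capped at maxReplicas: prints 10
```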
### 3. Vertical Pod Autoscaler (VPA)
```yaml
# vpa.yaml
# Requires the Vertical Pod Autoscaler components, which are not part of a default cluster.
# Avoid combining "Auto" mode with an HPA that scales on the same CPU/memory metrics.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: skillseekers-mcp
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: mcp-server
        minAllowed:
          cpu: 500m
          memory: 1Gi
        maxAllowed:
          cpu: 4000m
          memory: 8Gi
```
## High Availability
### 1. Pod Disruption Budget
```yaml
# pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: skillseekers
      component: mcp-server
```
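A PodDisruptionBudget with `minAvailable: 2` permits a voluntary eviction only while more than two pods are healthy. That interacts with the HPA's `minReplicas: 2`: when the deployment has scaled down to two pods, no eviction is allowed and a node drain will wait. The arithmetic is sketched below; `pdb_allowed_disruptions` is a hypothetical helper name.

```bash
# Number of pods that may be evicted voluntarily: healthy - minAvailable, never below 0.
# Arguments: healthy pod count, minAvailable.
pdb_allowed_disruptions() (
  healthy="$1"; min_available="$2"
  allowed=$(( healthy - min_available ))
  [ "$allowed" -lt 0 ] && allowed=0
  echo "$allowed"
)

pdb_allowed_disruptions 3 2   # prints 1: one pod can be evicted at a time
pdb_allowed_disruptions 2 2   # prints 0: drains block until another pod is ready
```

If drains must always make progress, one option is to keep the autoscaler's minimum at least one above `minAvailable`.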
### 2. Pod Anti-Affinity
```yaml
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - skillseekers
            topologyKey: kubernetes.io/hostname
```
### 3. Node Affinity
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-role
                operator: In
                values:
                  - worker
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: node-type
                operator: In
                values:
                  - high-cpu
```
### 4. Multi-Zone Deployment
```yaml
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: skillseekers
```
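`maxSkew: 1` limits how uneven the spread may be: the zone with the most matching pods may hold at most one more than the zone with the fewest. With `DoNotSchedule`, a pod that would push the difference past that stays Pending. The check can be sketched as below, where `zone_skew` is a hypothetical helper that takes one pod count per zone.

```bash
# Print the skew of a pod distribution: highest per-zone count minus lowest.
# Arguments: one pod count per zone.
zone_skew() (
  max="$1"; min="$1"
  for n in "$@"; do
    [ "$n" -gt "$max" ] && max="$n"
    [ "$n" -lt "$min" ] && min="$n"
  done
  echo $(( max - min ))
)

zone_skew 1 1 1   # prints 0: evenly spread
zone_skew 2 1 0   # prints 2: violates maxSkew 1
```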
## Monitoring
### 1. Prometheus Metrics
```yaml
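# servicemonitor.yaml
# Requires the Prometheus Operator CRDs.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
spec:
  selector:
    matchLabels:
      app: skillseekers
  endpoints:
    - port: metrics # must match a named port on the Service; service.yaml in this guide only names "http"
      interval: 30s
      path: /metrics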
```
### 2. Grafana Dashboard
```bash
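# Import dashboard
# (kubectl only accepts Kubernetes manifests; a raw Grafana export must first be wrapped, e.g. in a ConfigMap)
kubectl apply -f grafana/dashboard.json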
```
### 3. Logging with Fluentd
```yaml
# fluentd-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/skillseekers*.log
      pos_file /var/log/fluentd-skillseekers.pos
      tag kubernetes.*
      format json
    </source>
    <match **>
      @type elasticsearch
      host elasticsearch
      port 9200
    </match>
```
## Ingress & Load Balancing
### 1. Nginx Ingress
```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: skillseekers
  namespace: skillseekers
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # ingress-nginx has no "rate-limit" annotation; limit-rps caps requests per second per client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx # replaces the deprecated kubernetes.io/ingress.class annotation
  tls:
    - hosts:
        - api.skillseekers.example.com
      secretName: skillseekers-tls
  rules:
    - host: api.skillseekers.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: skillseekers-mcp
                port:
                  number: 8765
```
### 2. TLS with cert-manager
```bash
# Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
# Create ClusterIssuer
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
```
## Storage
### 1. Persistent Volume
```yaml
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: skillseekers-data
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    path: /mnt/skillseekers-data
```
### 2. Persistent Volume Claim
```yaml
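# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: skillseekers-data
  namespace: skillseekers
spec:
  accessModes:
    # ReadWriteOnce can be mounted by one node at a time; replicas spread
    # across nodes that share this claim need a ReadWriteMany volume instead.
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: standard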
```
### 3. StatefulSet (for stateful workloads)
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: skillseekers-cache
spec:
  serviceName: skillseekers-cache
  replicas: 3
  # selector and template (the pod spec) are required fields; they are omitted from this excerpt
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
```
## Security
### 1. Network Policies
```yaml
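# networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: skillseekers-mcp
  namespace: skillseekers
spec:
  podSelector:
    matchLabels:
      app: skillseekers
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Set automatically on every namespace; a plain "name" label exists only if you add it
              kubernetes.io/metadata.name: skillseekers
      ports:
        - protocol: TCP
          port: 8765
  egress:
    # No "to" clause, so these ports are allowed to any destination, including external APIs.
    # (An empty namespaceSelector would match in-cluster pods only.)
    - ports:
        - protocol: TCP
          port: 443 # HTTPS
        - protocol: TCP
          port: 80 # HTTP
    # DNS lookups, without which hostnames cannot be resolved
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53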
```
### 2. Pod Security Policy
```yaml
# psp.yaml
# Note: PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25.
# On v1.25+ use Pod Security Admission (namespace labels) instead.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: skillseekers-restricted
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
```
### 3. RBAC
```yaml
# rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: skillseekers
  namespace: skillseekers
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: skillseekers
  namespace: skillseekers
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: skillseekers
  namespace: skillseekers
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: skillseekers
subjects:
  - kind: ServiceAccount
    name: skillseekers
    namespace: skillseekers
```
## Troubleshooting
### Common Issues
#### 1. Pods Not Starting
```bash
# Check pod status
kubectl get pods -n skillseekers
# Describe pod
kubectl describe pod <pod-name> -n skillseekers
# Check events
kubectl get events -n skillseekers --sort-by='.lastTimestamp'
# Check logs
kubectl logs <pod-name> -n skillseekers
```
#### 2. Image Pull Errors
```bash
# Check image pull secrets
kubectl get secrets -n skillseekers
# Create image pull secret
kubectl create secret docker-registry regcred \
--docker-server=registry.example.com \
--docker-username=user \
--docker-password=password \
-n skillseekers
# Then reference it in the pod spec (YAML, not a shell command):
#   spec:
#     imagePullSecrets:
#       - name: regcred
```
#### 3. Resource Constraints
```bash
# Check node resources
kubectl top nodes
# Check pod resources
kubectl top pods -n skillseekers
# Increase resources
kubectl edit deployment skillseekers-mcp -n skillseekers
```
#### 4. Service Not Accessible
```bash
# Check service
kubectl get svc -n skillseekers
kubectl describe svc skillseekers-mcp -n skillseekers
# Check endpoints
kubectl get endpoints -n skillseekers
# Port forward
kubectl port-forward svc/skillseekers-mcp 8765:8765 -n skillseekers
```
### Debug Commands
```bash
# Execute command in pod
kubectl exec -it <pod-name> -n skillseekers -- /bin/bash
# Copy files from pod
kubectl cp skillseekers/<pod-name>:/app/data ./data
# Check pod networking
kubectl exec <pod-name> -n skillseekers -- nslookup google.com
# View full pod spec
kubectl get pod <pod-name> -n skillseekers -o yaml
# Restart deployment
kubectl rollout restart deployment skillseekers-mcp -n skillseekers
```
## Best Practices
1. **Always set resource requests and limits**
2. **Use namespaces for environment separation**
3. **Enable autoscaling for variable workloads**
4. **Implement health checks (liveness & readiness)**
5. **Use Secrets for sensitive data**
6. **Enable monitoring and logging**
7. **Implement Pod Disruption Budgets for HA**
8. **Use RBAC for access control**
9. **Enable Network Policies**
10. **Back up persistent volumes regularly**
## Next Steps
- Review [PRODUCTION_DEPLOYMENT.md](./PRODUCTION_DEPLOYMENT.md) for general guidelines
- See [DOCKER_DEPLOYMENT.md](./DOCKER_DEPLOYMENT.md) for container-specific details
- Check [TROUBLESHOOTING.md](./TROUBLESHOOTING.md) for common issues
---
**Need help?** Open an issue on [GitHub](https://github.com/yusufkaraaslan/Skill_Seekers/issues).