
Production Deployment Guide

Complete guide for deploying Skill Seekers in production environments.

Prerequisites

System Requirements

Minimum:

  • CPU: 2 cores
  • RAM: 4 GB
  • Disk: 10 GB
  • Python: 3.10+

Recommended (for production):

  • CPU: 4+ cores
  • RAM: 8+ GB
  • Disk: 50+ GB SSD
  • Python: 3.12+
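The minimums above can also be checked programmatically before installation. A small preflight sketch (the thresholds mirror the table; the `preflight` helper name is ours, not part of the package):

```python
import os
import shutil
import sys

MIN_CPUS = 2          # cores
MIN_DISK_GB = 10      # free space on the install volume
MIN_PYTHON = (3, 10)

def preflight(path="/"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor} < 3.10"
        )
    if (os.cpu_count() or 0) < MIN_CPUS:
        problems.append(f"only {os.cpu_count()} CPU core(s), need {MIN_CPUS}")
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < MIN_DISK_GB:
        problems.append(f"only {free_gb:.1f} GB free, need {MIN_DISK_GB}")
    return problems
```

Run it against the target install volume and refuse to proceed if the list is non-empty.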

Dependencies

Required:

# System packages (Ubuntu/Debian)
sudo apt update
sudo apt install -y python3.12 python3.12-venv python3-pip \
  git curl wget build-essential libssl-dev

# System packages (RHEL/CentOS -- package names vary by release,
# e.g. python3.12 on recent RHEL 9)
sudo yum install -y python3.12 python3.12-devel git curl wget \
  gcc gcc-c++ openssl-devel

Optional (for specific features):

# OCR support (PDF scraping)
sudo apt install -y tesseract-ocr

# Cloud storage
# (Install provider-specific SDKs via pip)

# Embedding generation
# (GPU support requires CUDA)

Installation

1. Production Installation

# Create dedicated user and install directory
sudo useradd -m -s /bin/bash skillseekers
sudo mkdir -p /opt/skillseekers
sudo chown skillseekers:skillseekers /opt/skillseekers
sudo su - skillseekers

# Create virtual environment
python3.12 -m venv /opt/skillseekers/venv
source /opt/skillseekers/venv/bin/activate

# Install package
pip install --upgrade pip
pip install skill-seekers[all]

# Verify installation
skill-seekers --version

2. Configuration Directory

# Create config directory
mkdir -p ~/.config/skill-seekers/{configs,output,logs,cache}

# Set permissions
chmod 700 ~/.config/skill-seekers

3. Environment Variables

Create /opt/skillseekers/.env:

# API Keys
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
OPENAI_API_KEY=sk-...
VOYAGE_API_KEY=...

# GitHub Tokens (use skill-seekers config --github for multiple)
GITHUB_TOKEN=ghp_...

# Cloud Storage (optional)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
GOOGLE_APPLICATION_CREDENTIALS=/path/to/gcs-key.json
AZURE_STORAGE_CONNECTION_STRING=...

# MCP Server
MCP_TRANSPORT=http
MCP_PORT=8765

# Sync Monitoring (optional)
SYNC_WEBHOOK_URL=https://...
SLACK_WEBHOOK_URL=https://hooks.slack.com/...

# Logging
LOG_LEVEL=INFO
LOG_FILE=/var/log/skillseekers/app.log

Security Note: Never commit .env files to version control!

# Secure the env file
chmod 600 /opt/skillseekers/.env

Configuration

1. GitHub Configuration

Use the interactive configuration wizard:

skill-seekers config --github

This will:

  • Add GitHub personal access tokens
  • Configure rate limit strategies
  • Test token validity
  • Support multiple profiles (work, personal, etc.)

2. API Keys Configuration

skill-seekers config --api-keys

Configure:

  • Claude API (Anthropic)
  • Gemini API (Google)
  • OpenAI API
  • Voyage AI (embeddings)

3. Connection Testing

skill-seekers config --test

Verifies:

  • GitHub token(s) validity and rate limits
  • Claude API connectivity
  • Gemini API connectivity
  • OpenAI API connectivity
  • Cloud storage access (if configured)

Deployment Options

Option 1: Systemd Service

Create /etc/systemd/system/skillseekers-mcp.service:

[Unit]
Description=Skill Seekers MCP Server
After=network.target

[Service]
Type=simple
User=skillseekers
Group=skillseekers
WorkingDirectory=/opt/skillseekers
EnvironmentFile=/opt/skillseekers/.env
ExecStart=/opt/skillseekers/venv/bin/python -m skill_seekers.mcp.server_fastmcp --transport http --port 8765
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=skillseekers-mcp

# Security
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/skillseekers /var/log/skillseekers

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable skillseekers-mcp
sudo systemctl start skillseekers-mcp
sudo systemctl status skillseekers-mcp

Option 2: Docker Deployment

See Docker Deployment Guide for detailed instructions.

Quick Start:

# Build image
docker build -t skillseekers:latest .

# Run container
docker run -d \
  --name skillseekers-mcp \
  -p 8765:8765 \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  -e GITHUB_TOKEN=$GITHUB_TOKEN \
  -v /opt/skillseekers/data:/app/data \
  --restart unless-stopped \
  skillseekers:latest

Option 3: Kubernetes Deployment

See Kubernetes Deployment Guide for detailed instructions.

Quick Start:

# Install with Helm
helm install skillseekers ./helm/skillseekers \
  --namespace skillseekers \
  --create-namespace \
  --set secrets.anthropicApiKey=$ANTHROPIC_API_KEY \
  --set secrets.githubToken=$GITHUB_TOKEN

Option 4: Docker Compose

See Docker Compose Guide for multi-service deployment.

# Start all services
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f

Monitoring & Observability

1. Health Checks

MCP Server Health:

# HTTP transport
curl http://localhost:8765/health

# Expected response:
{
  "status": "healthy",
  "version": "2.9.0",
  "uptime": 3600,
  "tools": 25
}
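An external monitor can poll that endpoint and evaluate the payload. A small helper, assuming the response shape shown above (the function names are ours):

```python
import json
import urllib.request

def check_health(payload: dict, min_tools: int = 1) -> bool:
    """Return True if a /health payload reports a healthy server."""
    return payload.get("status") == "healthy" and payload.get("tools", 0) >= min_tools

def probe(url="http://localhost:8765/health", timeout=5) -> bool:
    """Fetch the health endpoint; False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return check_health(json.load(resp))
    except OSError:
        return False
```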

2. Logging

Configure structured logging:

# config/logging.yaml
version: 1
formatters:
  json:
    format: '{"time":"%(asctime)s","level":"%(levelname)s","msg":"%(message)s"}'
handlers:
  file:
    class: logging.handlers.RotatingFileHandler
    filename: /var/log/skillseekers/app.log
    maxBytes: 10485760  # 10MB
    backupCount: 5
    formatter: json
loggers:
  skill_seekers:
    level: INFO
    handlers: [file]
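The YAML above maps directly onto logging.config.dictConfig. The same configuration expressed in Python (writing to /tmp here for illustration; use /var/log/skillseekers/app.log in production):

```python
import logging
import logging.config

LOG_CONFIG = {
    "version": 1,
    "formatters": {
        "json": {
            "format": '{"time":"%(asctime)s","level":"%(levelname)s","msg":"%(message)s"}'
        },
    },
    "handlers": {
        "file": {
            "class": "logging.handlers.RotatingFileHandler",
            "filename": "/tmp/skillseekers-app.log",
            "maxBytes": 10 * 1024 * 1024,  # 10 MB
            "backupCount": 5,
            "formatter": "json",
        },
    },
    "loggers": {
        "skill_seekers": {"level": "INFO", "handlers": ["file"]},
    },
}

logging.config.dictConfig(LOG_CONFIG)
logging.getLogger("skill_seekers").info("logging configured")
```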

Log aggregation options:

  • ELK Stack: Elasticsearch + Logstash + Kibana
  • Grafana Loki: Lightweight log aggregation
  • CloudWatch Logs: For AWS deployments
  • Stackdriver: For GCP deployments

3. Metrics

Prometheus metrics endpoint:

# Add to MCP server
from prometheus_client import start_http_server, Counter, Histogram

# Metrics
scraping_requests = Counter('scraping_requests_total', 'Total scraping requests')
scraping_duration = Histogram('scraping_duration_seconds', 'Scraping duration')

# Start metrics server
start_http_server(9090)

Key metrics to monitor:

  • Request rate
  • Response time (p50, p95, p99)
  • Error rate
  • Memory usage
  • CPU usage
  • Disk I/O
  • GitHub API rate limit remaining
  • Claude API token usage

4. Alerting

Example Prometheus alert rules:

groups:
  - name: skillseekers
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
        for: 5m
        annotations:
          summary: "High error rate detected"

      - alert: HighMemoryUsage
        expr: process_resident_memory_bytes > 2e9  # 2GB
        for: 10m
        annotations:
          summary: "Memory usage above 2GB"

      - alert: GitHubRateLimitLow
        expr: github_rate_limit_remaining < 100
        for: 1m
        annotations:
          summary: "GitHub rate limit low"

Security

1. API Key Management

Best Practices:

DO:

  • Store keys in environment variables or secret managers
  • Use different keys for dev/staging/prod
  • Rotate keys regularly (quarterly minimum)
  • Use least-privilege IAM roles for cloud services
  • Monitor key usage for anomalies

DON'T:

  • Commit keys to version control
  • Share keys via email/Slack
  • Use production keys in development
  • Grant overly broad permissions

Recommended Secret Managers:

  • Kubernetes Secrets (for K8s deployments)
  • AWS Secrets Manager (for AWS)
  • Google Secret Manager (for GCP)
  • Azure Key Vault (for Azure)
  • HashiCorp Vault (cloud-agnostic)

2. Network Security

Firewall Rules:

# Allow only necessary ports (set defaults and allow SSH *before* enabling,
# or you may lock yourself out)
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 8765/tcp  # MCP server (if public)
sudo ufw enable

Reverse Proxy (Nginx):

# /etc/nginx/sites-available/skillseekers
server {
    listen 80;
    server_name api.skillseekers.example.com;

    # Redirect to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name api.skillseekers.example.com;

    ssl_certificate /etc/letsencrypt/live/api.skillseekers.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.skillseekers.example.com/privkey.pem;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;

    # Rate limiting (the zone must be declared at http {} level, e.g. in
    # nginx.conf: limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;)
    limit_req zone=api burst=20 nodelay;

    location / {
        proxy_pass http://localhost:8765;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}

3. TLS/SSL

Let's Encrypt (free certificates):

# Install certbot
sudo apt install certbot python3-certbot-nginx

# Obtain certificate
sudo certbot --nginx -d api.skillseekers.example.com

# Auto-renewal (certbot typically installs a systemd timer;
# otherwise add this to root's crontab)
0 12 * * * /usr/bin/certbot renew --quiet

4. Authentication & Authorization

API Key Authentication (optional):

# Add to MCP server
import os
import secrets

from fastapi import Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

security = HTTPBearer()

async def verify_token(credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    # Constant-time comparison avoids leaking the key via timing
    if not secrets.compare_digest(token, os.getenv("API_SECRET_KEY", "")):
        raise HTTPException(status_code=401, detail="Invalid token")
    return token

Scaling

1. Vertical Scaling

Increase resources:

# Kubernetes resource limits
resources:
  requests:
    cpu: "2"
    memory: "4Gi"
  limits:
    cpu: "4"
    memory: "8Gi"

2. Horizontal Scaling

Deploy multiple instances:

# Kubernetes HPA (Horizontal Pod Autoscaler)
kubectl autoscale deployment skillseekers-mcp \
  --cpu-percent=70 \
  --min=2 \
  --max=10

Load Balancing:

# Nginx load balancer
upstream skillseekers {
    least_conn;
    server 10.0.0.1:8765;
    server 10.0.0.2:8765;
    server 10.0.0.3:8765;
}

server {
    listen 80;
    location / {
        proxy_pass http://skillseekers;
    }
}

3. Database/Storage Scaling

Distributed caching:

# Redis for distributed cache
import redis

cache = redis.Redis(host='redis.example.com', port=6379, db=0)

Object storage:

  • Use S3/GCS/Azure Blob for skill packages
  • Enable CDN for static assets
  • Use read replicas for databases

4. Rate Limit Management

Multiple GitHub tokens:

# Configure multiple profiles
skill-seekers config --github

# Automatic token rotation on rate limit
# (handled by rate_limit_handler.py)
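The rotation idea can be sketched: prefer whichever token has the most remaining quota, falling back to the one that resets soonest when all are exhausted. This is illustrative only; the actual logic in rate_limit_handler.py may differ:

```python
def pick_token(tokens):
    """Choose a token from a list of dicts shaped like
    {"token": str, "remaining": int, "reset": epoch_seconds}.

    Returns the token with the most remaining calls, or the one
    that resets soonest if every token is exhausted.
    """
    usable = [t for t in tokens if t["remaining"] > 0]
    if usable:
        return max(usable, key=lambda t: t["remaining"])["token"]
    return min(tokens, key=lambda t: t["reset"])["token"]
```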

Backup & Disaster Recovery

1. Data Backup

What to backup:

  • Configuration files (~/.config/skill-seekers/)
  • Generated skills (output/)
  • Database/cache (if applicable)
  • Logs (for forensics)

Backup script:

#!/bin/bash
# /opt/skillseekers/scripts/backup.sh

BACKUP_DIR="/backups/skillseekers"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)

# Create backup
tar -czf "$BACKUP_DIR/backup_$TIMESTAMP.tar.gz" \
  ~/.config/skill-seekers \
  /opt/skillseekers/output \
  /opt/skillseekers/.env

# Retain last 30 days
find "$BACKUP_DIR" -name "backup_*.tar.gz" -mtime +30 -delete

# Upload to S3 (optional)
aws s3 cp "$BACKUP_DIR/backup_$TIMESTAMP.tar.gz" \
  s3://backups/skillseekers/

Schedule backups:

# Crontab
0 2 * * * /opt/skillseekers/scripts/backup.sh

2. Disaster Recovery Plan

Recovery steps:

  1. Provision new infrastructure

    # Deploy from backup
    terraform apply
    
  2. Restore configuration

    tar -xzf backup_20250207.tar.gz -C /
    
  3. Verify services

    skill-seekers config --test
    systemctl status skillseekers-mcp
    
  4. Test functionality

    skill-seekers scrape --config configs/test.json --max-pages 10
    

RTO/RPO targets:

  • RTO (Recovery Time Objective): < 2 hours
  • RPO (Recovery Point Objective): < 24 hours

Troubleshooting

Common Issues

1. High Memory Usage

Symptoms:

  • OOM kills
  • Slow performance
  • Swapping

Solutions:

# Check memory usage
ps aux --sort=-%mem | head -10

# Reduce batch size
skill-seekers scrape --config config.json --batch-size 10

# Enable memory limits
docker run --memory=4g skillseekers:latest
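Alongside ps, peak memory can be read from inside the process itself, which is handy for logging it at the end of a scraping run. A stdlib sketch (note that ru_maxrss is kilobytes on Linux but bytes on macOS; the resource module is not available on Windows):

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of the current process, in MB."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes, macOS reports bytes
    return rss / 1024 if sys.platform != "darwin" else rss / (1024 * 1024)

print(f"peak RSS: {peak_rss_mb():.1f} MB")
```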

2. GitHub Rate Limits

Symptoms:

  • 403 Forbidden errors
  • "API rate limit exceeded" messages

Solutions:

# Check rate limit
curl -H "Authorization: token $GITHUB_TOKEN" \
  https://api.github.com/rate_limit

# Add more tokens
skill-seekers config --github

# Use rate limit strategy
# (automatic with multi-token config)
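The JSON returned by /rate_limit can also drive monitoring. A parser for the core fields, assuming GitHub's documented response shape (resources.core.remaining and resources.core.reset as a Unix timestamp):

```python
import datetime

def core_rate_limit(payload: dict):
    """Extract (remaining, reset_time_utc) from a GitHub /rate_limit response."""
    core = payload["resources"]["core"]
    reset = datetime.datetime.fromtimestamp(core["reset"], tz=datetime.timezone.utc)
    return core["remaining"], reset
```

Feed the remaining count into the GitHubRateLimitLow alert shown earlier.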

3. Slow Scraping

Symptoms:

  • Long scraping times
  • Timeouts

Solutions:

# Enable async scraping (2-3x faster)
skill-seekers scrape --config config.json --async

# Increase concurrency
# (adjust in config: "concurrency": 10)

# Use caching
skill-seekers scrape --config config.json --use-cache

4. API Errors

Symptoms:

  • 401 Unauthorized
  • 429 Too Many Requests

Solutions:

# Verify API keys
skill-seekers config --test

# Check API key validity
# Claude API: https://console.anthropic.com/
# OpenAI: https://platform.openai.com/api-keys
# Google: https://console.cloud.google.com/apis/credentials

# Rotate keys if compromised

5. Service Won't Start

Symptoms:

  • systemd service fails
  • Container exits immediately

Solutions:

# Check logs
journalctl -u skillseekers-mcp -n 100

# Or for Docker
docker logs skillseekers-mcp

# Common causes:
# - Missing environment variables
# - Port already in use
# - Permission issues

# Verify config
skill-seekers config --show

Debug Mode

Enable detailed logging:

# Set debug level
export LOG_LEVEL=DEBUG

# Run with verbose output
skill-seekers scrape --config config.json --verbose

Getting Help

Community Support:

Log Collection:

# Collect diagnostic info (exclude .env -- it contains secrets and must
# never be shared)
tar -czf skillseekers-debug.tar.gz \
  /var/log/skillseekers/ \
  ~/.config/skill-seekers/configs/

Performance Tuning

1. Scraping Performance

Optimization techniques:

# Enable async scraping
"async_scraping": true,
"concurrency": 20,  # Adjust based on resources

# Optimize selectors
"selectors": {
    "main_content": "article",  # More specific = faster
    "code_blocks": "pre code"
}

# Enable caching
"use_cache": true,
"cache_ttl": 86400  # 24 hours

2. Embedding Performance

GPU acceleration (if available):

# GPU support comes from a CUDA-enabled PyTorch build (there is no
# [gpu] pip extra); pick the wheel index matching your CUDA version
pip install torch --index-url https://download.pytorch.org/whl/cu121
pip install sentence-transformers

# Configure
export CUDA_VISIBLE_DEVICES=0

Batch processing:

# Generate embeddings in batches
generator.generate_batch(texts, batch_size=32)
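generate_batch above is assumed from the project's embedding generator; if your client lacks such a helper, the underlying pattern is simple slicing:

```python
def batched(items, batch_size=32):
    """Yield successive fixed-size slices of items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Usage sketch (model is hypothetical):
# embeddings = [vec for batch in batched(texts) for vec in model.encode(batch)]
```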

3. Storage Performance

Use SSD for:

  • SQLite databases
  • Cache directories
  • Log files

Use object storage for:

  • Skill packages
  • Backup archives
  • Large datasets

Next Steps

  1. Choose the deployment option that fits your infrastructure
  2. Configure monitoring and alerting
  3. Set up backups and disaster recovery
  4. Test failover procedures
  5. Document your specific deployment
  6. Train your team on operations

Need help? See TROUBLESHOOTING.md or open an issue on GitHub.