# Skill Seekers Docker Environment Configuration
# Copy this file to .env and fill in your API keys

# Claude AI / Anthropic API
# Required for AI enhancement features
# Get your key from: https://console.anthropic.com/
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Google Gemini API (Optional)
# Required only for Gemini platform support
# Get your key from: https://makersuite.google.com/app/apikey
GOOGLE_API_KEY=

# OpenAI API (Optional)
# Required only for OpenAI/ChatGPT platform support
# Get your key from: https://platform.openai.com/api-keys
OPENAI_API_KEY=

# GitHub Token (Optional, but recommended)
# Increases GitHub API rate limits from 60 to 5000 requests/hour
# Create token at: https://github.com/settings/tokens
# Required scopes: public_repo (for public repos)
GITHUB_TOKEN=

# MCP Server Configuration
MCP_TRANSPORT=http
MCP_PORT=8765

# Docker Resource Limits (Optional)
# Uncomment to set custom limits
# DOCKER_CPU_LIMIT=2.0
# DOCKER_MEMORY_LIMIT=4g

# Vector Database Ports (Optional - change if needed)
# WEAVIATE_PORT=8080
# QDRANT_PORT=6333
# CHROMA_PORT=8000

# Logging (Optional)
# SKILL_SEEKERS_LOG_LEVEL=INFO
# SKILL_SEEKERS_LOG_FILE=/data/logs/skill-seekers.log