Weaviate Vector Database Example
This example demonstrates how to use Skill Seekers with Weaviate, a powerful vector database with hybrid search capabilities (keyword + semantic).
What You'll Learn
- How to generate skills in Weaviate format
- How to create a Weaviate schema and upload data
- How to perform hybrid searches (keyword + vector)
- How to filter by metadata categories
Prerequisites
1. Weaviate Instance
Option A: Weaviate Cloud (Recommended for production)
- Sign up at https://console.weaviate.cloud/
- Create a free sandbox cluster
- Get your cluster URL and API key
Option B: Local Docker (Recommended for development)
docker run -d \
--name weaviate \
-p 8080:8080 \
-e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
-e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
semitechnologies/weaviate:latest
2. Python Dependencies
pip install -r requirements.txt
Step-by-Step Guide
Step 1: Generate Skill from Documentation
First, we'll scrape React documentation and package it for Weaviate:
python 1_generate_skill.py
This script will:
- Scrape React docs (limited to 20 pages for demo)
- Package the skill in Weaviate format (JSON with schema + objects)
- Save to sample_output/react-weaviate.json
Expected Output:
✅ Weaviate data packaged successfully!
📦 Output: output/react-weaviate.json
📊 Total objects: 21
📂 Categories: overview (1), guides (8), api (12)
What's in the JSON?
{
"schema": {
"class": "React",
"description": "React documentation skill",
"properties": [
{"name": "content", "dataType": ["text"]},
{"name": "source", "dataType": ["text"]},
{"name": "category", "dataType": ["text"]},
...
]
},
"objects": [
{
"id": "uuid-here",
"properties": {
"content": "React is a JavaScript library...",
"source": "react",
"category": "overview",
...
}
}
],
"class_name": "React"
}
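Before uploading, it can help to tally the objects per category. The following is a minimal sketch that assumes only the top-level keys shown above (`schema`, `objects`, `class_name`); the helper names `summarize_skill` and `load_skill` are illustrative and not part of Skill Seekers.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_skill(data: dict) -> dict:
    """Return the class name, object count, and per-category counts."""
    objects = data.get("objects", [])
    categories = Counter(
        obj.get("properties", {}).get("category", "unknown") for obj in objects
    )
    return {
        "class_name": data.get("class_name"),
        "total_objects": len(objects),
        "categories": dict(categories),
    }


def load_skill(path: str) -> dict:
    """Load a packaged skill file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Tiny in-memory stand-in with the same shape as the packaged file.
sample = {
    "schema": {"class": "React", "properties": []},
    "objects": [
        {"id": "a", "properties": {"content": "...", "category": "overview"}},
        {"id": "b", "properties": {"content": "...", "category": "api"}},
        {"id": "c", "properties": {"content": "...", "category": "api"}},
    ],
    "class_name": "React",
}
print(summarize_skill(sample))
```

For the real file, pass the result of `load_skill(...)` with the path from Step 1 instead of `sample`.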
Step 2: Upload to Weaviate
Now we'll create the schema and upload all objects to Weaviate:
python 2_upload_to_weaviate.py
For local Docker:
python 2_upload_to_weaviate.py --url http://localhost:8080
For Weaviate Cloud:
python 2_upload_to_weaviate.py \
--url https://your-cluster.weaviate.network \
--api-key YOUR_API_KEY
This script will:
- Connect to your Weaviate instance
- Create the schema (class + properties)
- Batch upload all objects
- Verify the upload was successful
Expected Output:
🔗 Connecting to Weaviate at http://localhost:8080...
✅ Weaviate is ready!
📊 Creating schema: React
✅ Schema created successfully!
📤 Uploading 21 objects in batches...
✅ Batch 1/1 uploaded (21 objects)
✅ Successfully uploaded 21 documents to Weaviate
🔍 Class 'React' now contains 21 objects
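The upload steps can be sketched roughly as follows. This is not the contents of `2_upload_to_weaviate.py`; it is a minimal illustration that assumes the v3 Python client (`weaviate-client` 3.x), which matches the `client.schema` and `client.query` calls used elsewhere in this README. The `chunked` and `upload` helpers and the batch size of 100 are assumptions.

```python
from typing import Iterator, List


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upload(data: dict, url: str = "http://localhost:8080", batch_size: int = 100) -> int:
    """Create the class and upload all objects; returns the number of objects sent.

    Assumes weaviate-client 3.x is installed and Weaviate is reachable at `url`.
    """
    import weaviate  # imported lazily so `chunked` works without the package

    client = weaviate.Client(url)
    if not client.is_ready():
        raise RuntimeError(f"Weaviate at {url} is not ready")

    # The packaged "schema" entry is a single class definition.
    client.schema.create_class(data["schema"])

    sent = 0
    for batch_items in chunked(data["objects"], batch_size):
        # Leaving the context manager flushes the pending objects.
        with client.batch as batch:
            for obj in batch_items:
                batch.add_data_object(
                    data_object=obj["properties"],
                    class_name=data["class_name"],
                    uuid=obj["id"],
                )
        sent += len(batch_items)
    return sent
```

With 21 objects and a batch size of 100, `chunked` yields a single batch, which is consistent with the "Batch 1/1" line in the expected output.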
Step 3: Query and Search
Now the fun part - querying your knowledge base!
python 3_query_example.py
For local Docker:
python 3_query_example.py --url http://localhost:8080
For Weaviate Cloud:
python 3_query_example.py \
--url https://your-cluster.weaviate.network \
--api-key YOUR_API_KEY
This script demonstrates:
- Keyword Search: Traditional text search
- Hybrid Search: Combines keyword + vector similarity
- Metadata Filtering: Filter by category
- Limit and Offset: Pagination
Example Queries:
Query 1: Hybrid Search
Query: "How do I use React hooks?"
Alpha: 0.5 (50% keyword, 50% vector)
Results:
1. Category: api
Snippet: Hooks are functions that let you "hook into" React state and lifecycle...
2. Category: guides
Snippet: To use a Hook, you need to call it at the top level of your component...
Query 2: Filter by Category
Query: API reference
Category: api
Results:
1. useState Hook - Manage component state
2. useEffect Hook - Perform side effects
3. useContext Hook - Access context values
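A query combining hybrid search, a category filter, and pagination might look like the sketch below. It assumes the v3 Python client's query builder and is not taken from `3_query_example.py`; the function names and the choice of returned properties (`content`, `category`) are illustrative.

```python
from typing import List, Optional


def hybrid_search(client, class_name: str, query: str, alpha: float = 0.5,
                  category: Optional[str] = None, limit: int = 5,
                  offset: int = 0) -> dict:
    """Run a hybrid query through the v3 builder API and return the raw response."""
    builder = (
        client.query.get(class_name, ["content", "category"])
        .with_hybrid(query=query, alpha=alpha)
        .with_limit(limit)
        .with_offset(offset)
    )
    if category is not None:
        builder = builder.with_where(
            {"path": ["category"], "operator": "Equal", "valueText": category}
        )
    return builder.do()


def extract_hits(response: dict, class_name: str) -> List[dict]:
    """Pull the result list out of a GraphQL Get response; empty list if absent."""
    return ((response.get("data") or {}).get("Get") or {}).get(class_name) or []


# Hand-written response in the shape the client returns for a Get query.
sample_response = {
    "data": {"Get": {"React": [{"content": "Hooks are functions...", "category": "api"}]}}
}
for hit in extract_hits(sample_response, "React"):
    print(hit["category"], hit["content"])
```

Keeping `extract_hits` separate from the query makes it easy to handle error responses, which carry an `errors` key instead of usable `data`.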
Understanding Weaviate Features
Hybrid Search (alpha parameter)
Weaviate's standout feature is hybrid search, which combines two ranking methods:
- Keyword Search (BM25): Traditional text matching
- Vector Search (ANN): Semantic similarity
Control the balance with alpha:
- alpha=0: Pure keyword search (BM25 only)
- alpha=0.5: Balanced (the value these examples use, recommended; Weaviate's own default is 0.75)
- alpha=1: Pure vector search (semantic only)
When to use what:
- Exact terms (API names, error messages): alpha=0 to alpha=0.3
- Concepts (how to do X, why does Y): alpha=0.7 to alpha=1
- General queries: alpha=0.5 (balanced)
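To build intuition for `alpha`, the blend can be pictured as a weighted average of two normalized scores. The sketch below is a deliberately simplified illustration, not Weaviate's actual fusion algorithm; the function name and the sample scores are made up.

```python
def blend_scores(keyword_score: float, vector_score: float, alpha: float) -> float:
    """Weighted average: alpha=0 keeps only the keyword score, alpha=1 only the vector score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * keyword_score + alpha * vector_score


# Hypothetical document that matches the query terms well
# but is less similar to the query semantically.
keyword_score, vector_score = 0.9, 0.4
for alpha in (0.0, 0.5, 1.0):
    print(alpha, round(blend_scores(keyword_score, vector_score, alpha), 2))
```

A document like this one ranks higher at low `alpha` values, which is why exact-term queries tend to benefit from values near 0.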
Metadata Filtering
Filter results by any property:
.with_where({
"path": ["category"],
"operator": "Equal",
"valueText": "api"
})
Supported operators:
- Equal, NotEqual
- GreaterThan, LessThan
- And, Or, Not
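Where filters are plain dictionaries, so small helpers can keep them readable. This is a sketch with hypothetical helper names (`equal`, `all_of`); it only builds the dictionaries passed to `.with_where(...)` and assumes text-valued properties.

```python
from typing import Any, Dict


def equal(prop: str, value: str) -> Dict[str, Any]:
    """Filter matching objects whose text property equals `value`."""
    return {"path": [prop], "operator": "Equal", "valueText": value}


def all_of(*operands: Dict[str, Any]) -> Dict[str, Any]:
    """Combine two or more filters with the And operator."""
    if len(operands) < 2:
        raise ValueError("And needs at least two operands")
    return {"operator": "And", "operands": list(operands)}


api_reference = all_of(equal("category", "api"), equal("type", "reference"))
print(api_reference)
```

The resulting `api_reference` dictionary can be passed to `.with_where(...)` unchanged.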
Schema Design
Our schema includes:
- content: The actual documentation text (vectorized)
- source: Skill name (e.g., "react")
- category: Document category (e.g., "api", "guides")
- file: Source file name
- type: Document type ("overview" or "reference")
- version: Skill version
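As a rough illustration, the six properties could be expressed as a class definition like the one below. The property names come from the list above; using `text` as the data type for every property is an assumption (the packaged JSON shows it only for the first three), and both helper names are hypothetical.

```python
from typing import List

# Property names from the schema list; the "text" dataType for all six is assumed.
PROPERTY_NAMES = ["content", "source", "category", "file", "type", "version"]


def build_class(class_name: str, description: str) -> dict:
    """Build a Weaviate class definition with one text property per field."""
    return {
        "class": class_name,
        "description": description,
        "properties": [{"name": n, "dataType": ["text"]} for n in PROPERTY_NAMES],
    }


def undeclared_properties(obj: dict, class_def: dict) -> List[str]:
    """List keys in an object's properties that the class does not declare."""
    declared = {p["name"] for p in class_def["properties"]}
    return sorted(k for k in obj.get("properties", {}) if k not in declared)


react_class = build_class("React", "React documentation skill")
print([p["name"] for p in react_class["properties"]])
```

Running `undeclared_properties` over the objects before upload is one way to catch fields that the class definition does not declare.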
Customization
Generate Your Own Skill
Want to use a different documentation source? Easy:
# 1_generate_skill.py (modify line 10)
"--config", "configs/vue.json", # Change to your config
Or scrape from scratch:
skill-seekers scrape --config configs/your_framework.json
skill-seekers package output/your_framework --target weaviate
Adjust Search Parameters
In 3_query_example.py, modify:
# Adjust hybrid search balance
alpha=0.7 # More semantic, less keyword
# Adjust result count
.with_limit(10) # Get more results
# Add more filters
.with_where({
"operator": "And",
"operands": [
{"path": ["category"], "operator": "Equal", "valueText": "api"},
{"path": ["type"], "operator": "Equal", "valueText": "reference"}
]
})
Troubleshooting
Connection Refused
Error: Connection refused to http://localhost:8080
Solution: Ensure Weaviate is running:
docker ps | grep weaviate
# If not running, start it:
docker start weaviate
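A container that has just started may need a few seconds before it accepts requests. The sketch below polls Weaviate's readiness endpoint (`/v1/.well-known/ready`) using only the standard library; the function names, retry count, and delay are illustrative choices.

```python
import time
import urllib.request


def ready_url(base_url: str) -> str:
    """Build the readiness endpoint URL for a Weaviate base URL."""
    return base_url.rstrip("/") + "/v1/.well-known/ready"


def is_weaviate_ready(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_until_ready(base_url: str = "http://localhost:8080",
                     attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll the readiness endpoint up to `attempts` times."""
    for _ in range(attempts):
        if is_weaviate_ready(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_ready()` before the upload script avoids a connection error when the container is still booting.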
Schema Already Exists
Error: Class 'React' already exists
Solution: Delete the existing class:
# In Python or using Weaviate API
client.schema.delete_class("React")
Or use the example's built-in reset:
python 2_upload_to_weaviate.py --reset
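To delete the class only when it is actually present, a check along these lines may help. It assumes the v3 client, where `client.schema.get()` returns a dictionary with a `classes` list; this is not necessarily how `--reset` is implemented, and the helper names are hypothetical.

```python
def class_exists(schema: dict, class_name: str) -> bool:
    """Check a schema dump (as returned by client.schema.get()) for a class."""
    return any(c.get("class") == class_name for c in schema.get("classes", []))


def reset_class(client, class_name: str) -> bool:
    """Delete the class if present; returns True when something was deleted."""
    if class_exists(client.schema.get(), class_name):
        client.schema.delete_class(class_name)
        return True
    return False


# Hand-written schema dump for illustration.
example_schema = {"classes": [{"class": "React", "properties": []}]}
print(class_exists(example_schema, "React"), class_exists(example_schema, "Vue"))
```

Deleting a class also deletes every object stored in it, so the upload step must be run again afterwards.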
Empty Results
Query returned 0 results
Possible causes:
- No embeddings: vector and hybrid queries need object vectors, so Weaviate must have a vectorizer configured (this example uses the default)
- Wrong class name: Check the class name matches
- Data not uploaded: Verify with client.query.aggregate("React").with_meta_count().do()
Solution: Check object count:
result = client.query.aggregate("React").with_meta_count().do()
print(result) # Should show {"data": {"Aggregate": {"React": [{"meta": {"count": 21}}]}}}
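The nested response can be awkward to read, so a small helper that extracts just the count may be convenient. This sketch assumes the response shape shown in the comment above; `object_count` is a hypothetical name.

```python
def object_count(response: dict, class_name: str) -> int:
    """Extract the meta count from an Aggregate response; 0 if the shape is unexpected."""
    try:
        return int(response["data"]["Aggregate"][class_name][0]["meta"]["count"])
    except (KeyError, IndexError, TypeError):
        return 0


# Same shape as the aggregate result shown above.
sample_result = {"data": {"Aggregate": {"React": [{"meta": {"count": 21}}]}}}
print(object_count(sample_result, "React"))
```

A return value of 0 means either that the class is empty or that the response carried an error, so it is worth printing the raw result in that case.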
Next Steps
- Try other skills: Generate skills for your favorite frameworks
- Production deployment: Use Weaviate Cloud for scalability
- Add custom vectorizers: Use OpenAI, Cohere, or local models
- Build RAG apps: Integrate with LangChain or LlamaIndex
Resources
- Weaviate Docs: https://weaviate.io/developers/weaviate
- Hybrid Search: https://weaviate.io/developers/weaviate/search/hybrid
- Python Client: https://weaviate.io/developers/weaviate/client-libraries/python
- Skill Seekers Docs: https://github.com/yourusername/skill-seekers
File Structure
weaviate-example/
├── README.md # This file
├── requirements.txt # Python dependencies
├── 1_generate_skill.py # Generate Weaviate-format skill
├── 2_upload_to_weaviate.py # Upload to Weaviate instance
├── 3_query_example.py # Query demonstrations
└── sample_output/ # Example outputs
├── react-weaviate.json # Generated skill (21 objects)
└── query_results.txt # Sample query results
Last Updated: February 2026
Tested With: Weaviate v1.25.0, Python 3.10+, skill-seekers v2.10.0