# Skill Seekers + Qdrant Integration

**Source:** https://github.com/yusufkaraaslan/Skill_Seekers
**License:** MIT
**Gitea Fork:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference

## Overview

Skill Seekers converts documentation sites, GitHub repos, PDFs, and 17+ source types into structured knowledge assets ready for RAG pipelines. This is directly applicable to Trinity Codex.

## Installation

```bash
pip install skill-seekers
```

## Quick Start

```bash
# Convert docs to skill
skill-seekers create https://docs.example.com/

# Package for Qdrant
skill-seekers package output/example --target qdrant
```

## Supported Sources (17 types)

- Documentation websites
- GitHub repositories
- PDF documents
- Word documents (.docx)
- EPUB e-books
- Jupyter Notebooks
- OpenAPI specs
- PowerPoint presentations
- AsciiDoc documents
- HTML files
- RSS/Atom feeds
- Man pages
- YouTube videos (with `skill-seekers[video]`)

## Qdrant Pipeline

### Step 1: Generate Skill

```python
#!/usr/bin/env python3
import subprocess
from pathlib import Path

# Scrape documentation
subprocess.run([
    "skill-seekers", "scrape",
    "--config", "configs/your-config.json",
    "--max-pages", "20"
], check=True)

# Package for Qdrant
subprocess.run([
    "skill-seekers", "package",
    "output/your-skill",
    "--target", "qdrant"
], check=True)

output = Path("output/your-skill-qdrant.json")
print(f"Ready: {output} ({output.stat().st_size/1024:.1f} KB)")
```

### Step 2: Upload to Qdrant

```python
#!/usr/bin/env python3
import json
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Connect to Qdrant (our instance will be on TX1)
client = QdrantClient(url="http://localhost:6333")

# Load packaged data
with open("output/your-skill-qdrant.json") as f:
    data = json.load(f)

collection_name = data["collection_name"]
config = data["config"]

# Create collection
client.create_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(
        size=config["vector_size"],
        distance=Distance.COSINE
    )
)

# Upload points (add real embeddings in production)
points = []
for point in data["points"]:
    points.append(PointStruct(
        id=point["id"],
        vector=[0.0] * config["vector_size"],  # Replace with real embeddings
        payload=point["payload"]
    ))

client.upsert(collection_name=collection_name, points=points)
print(f"Uploaded {len(points)} points to {collection_name}")
```

### Step 3: Query

```python
#!/usr/bin/env python3
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient(url="http://localhost:6333")
collection_name = "your-collection"

# Filter by category
result = client.scroll(
    collection_name=collection_name,
    scroll_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="api")
            )
        ]
    ),
    limit=5
)

for point in result[0]:
    print(f"- {point.payload['file']}: {point.payload['content'][:100]}...")
```

## Trinity Codex Application

### Phase 1: Documentation Ingestion

Convert key Firefrost documentation sources:

```bash
# Pterodactyl docs
skill-seekers create https://pterodactyl.io/project/introduction.html
skill-seekers package output/pterodactyl --target qdrant

# Minecraft Wiki (modding)
skill-seekers create https://minecraft.wiki/w/Mods

# Operations Manual (local)
skill-seekers create ./docs/
skill-seekers package output/docs --target qdrant
```

### Phase 2: Vector Database Setup

Qdrant runs on TX1 (38.68.14.26) alongside Dify:

```bash
# Docker deployment
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -v /opt/qdrant/storage:/qdrant/storage \
  qdrant/qdrant:latest
```

### Phase 3: Dify Integration

Dify connects to Qdrant for RAG queries. See Dify documentation for knowledge base configuration.

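
Before running the Step 2 upload against the TX1 instance, it may be worth sanity-checking the packaged JSON and splitting large point lists into smaller upserts. The sketch below uses only the standard library. It assumes the file layout implied by Step 2 (`collection_name`, `config.vector_size`, and `points` entries with `id` and `payload`); the helper names `validate_package` and `batched`, and the batch size of 100, are hypothetical and not part of Skill Seekers or the Qdrant client.

```python
from typing import Any, Dict, Iterator, List


def validate_package(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the layout looks usable.

    The expected keys are inferred from the Step 2 script, not from a
    published Skill Seekers schema.
    """
    problems: List[str] = []
    for key in ("collection_name", "config", "points"):
        if key not in data:
            problems.append(f"missing top-level key: {key}")
    if problems:
        return problems

    config = data["config"]
    size = config.get("vector_size") if isinstance(config, dict) else None
    if not isinstance(size, int) or isinstance(size, bool) or size <= 0:
        problems.append("config.vector_size must be a positive integer")

    for index, point in enumerate(data["points"]):
        for key in ("id", "payload"):
            if key not in point:
                problems.append(f"point {index} is missing '{key}'")
    return problems


def batched(points: List[Any], batch_size: int = 100) -> Iterator[List[Any]]:
    """Yield consecutive slices of at most batch_size points."""
    for start in range(0, len(points), batch_size):
        yield points[start:start + batch_size]
```

In the upload script, `validate_package(data)` could run right after `json.load`, and the single `client.upsert(...)` call could be replaced by one call per chunk from `batched(points)`, so that a failure partway through affects only one batch.
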
## Key Features for Firefrost

| Feature | Benefit |
|---------|---------|
| Multi-source ingestion | Combine wiki, docs, PDFs into one knowledge base |
| Qdrant-native output | Direct integration with our planned stack |
| Smart chunking | Preserves code blocks and context |
| Metadata preservation | Category, file, type fields for filtering |
| 500+ line SKILL.md | High-quality Claude skills from any source |

## Resources

- **Full repo:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
- **Original:** https://github.com/yusufkaraaslan/Skill_Seekers
- **Website:** https://skillseekersweb.com/
- **Qdrant Docs:** https://qdrant.tech/documentation/

---

*Added by Chronicler #73 on 2026-04-09*