diff --git a/docs/skills/SKILLS-INDEX.md b/docs/skills/SKILLS-INDEX.md
index 8a905bc..98b486c 100644
--- a/docs/skills/SKILLS-INDEX.md
+++ b/docs/skills/SKILLS-INDEX.md
@@ -529,6 +529,30 @@ docs/skills/
 **License:** MIT
 **Added:** 2026-04-09 by Chronicler #73
 
+### skill-seekers-reference (Gitea)
+**Location:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
+**Source:** Fork of [yusufkaraaslan/Skill_Seekers](https://github.com/yusufkaraaslan/Skill_Seekers)
+**License:** MIT
+**Added:** 2026-04-09 by Chronicler #73
+
+**Purpose:** Convert documentation sites, GitHub repos, PDFs, and 17+ source types into AI skills and RAG pipelines. Directly applicable to Task #93 (Trinity Codex).
+
+| Feature | Description |
+|---------|-------------|
+| **17 source types** | Docs, GitHub, PDFs, YouTube, notebooks, wikis, OpenAPI |
+| **Multiple targets** | Claude, Gemini, OpenAI, LangChain, LlamaIndex, Qdrant |
+| **Qdrant pipeline** | Direct integration with our planned RAG stack |
+| **Smart chunking** | Preserves code blocks and context |
+
+**Quick start:**
+```bash
+pip install skill-seekers
+skill-seekers create https://docs.example.com/
+skill-seekers package output/example --target qdrant
+```
+
+**See also:** `docs/tasks/task-093-trinity-codex/references/skill-seekers-qdrant.md`
+
 **Contents:** 2,375 files across 20+ categories — a comprehensive AI skills library for full review.
 
 | Category | Items | Highlights |
diff --git a/docs/tasks/task-093-trinity-codex/references/skill-seekers-qdrant.md b/docs/tasks/task-093-trinity-codex/references/skill-seekers-qdrant.md
new file mode 100644
index 0000000..7828b5b
--- /dev/null
+++ b/docs/tasks/task-093-trinity-codex/references/skill-seekers-qdrant.md
@@ -0,0 +1,193 @@
+# Skill Seekers + Qdrant Integration
+
+**Source:** https://github.com/yusufkaraaslan/Skill_Seekers
+**License:** MIT
+**Gitea Fork:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
+
+## Overview
+
+Skill Seekers converts documentation sites, GitHub repos, PDFs, and 17+ source types into structured knowledge assets ready for RAG pipelines. This is directly applicable to Trinity Codex.
+
+## Installation
+
+```bash
+pip install skill-seekers
+```
+
+## Quick Start
+
+```bash
+# Convert docs to skill
+skill-seekers create https://docs.example.com/
+
+# Package for Qdrant
+skill-seekers package output/example --target qdrant
+```
+
+## Supported Sources (17 types)
+
+- Documentation websites
+- GitHub repositories
+- PDF documents
+- Word documents (.docx)
+- EPUB e-books
+- Jupyter Notebooks
+- OpenAPI specs
+- PowerPoint presentations
+- AsciiDoc documents
+- HTML files
+- RSS/Atom feeds
+- Man pages
+- YouTube videos (with `skill-seekers[video]`)
+
+## Qdrant Pipeline
+
+### Step 1: Generate Skill
+
+```python
+#!/usr/bin/env python3
+import subprocess
+from pathlib import Path
+
+# Scrape documentation
+subprocess.run([
+    "skill-seekers", "scrape",
+    "--config", "configs/your-config.json",
+    "--max-pages", "20"
+], check=True)
+
+# Package for Qdrant
+subprocess.run([
+    "skill-seekers", "package",
+    "output/your-skill",
+    "--target", "qdrant"
+], check=True)
+
+output = Path("output/your-skill-qdrant.json")
+print(f"Ready: {output} ({output.stat().st_size/1024:.1f} KB)")
+```
+
+### Step 2: Upload to Qdrant
+
+```python
+#!/usr/bin/env python3
+import json
+from qdrant_client import QdrantClient
+from qdrant_client.models import Distance, VectorParams, PointStruct
+
+# Connect to Qdrant (our instance will be on TX1)
+client = QdrantClient(url="http://localhost:6333")
+
+# Load packaged data
+with open("output/your-skill-qdrant.json") as f:
+    data = json.load(f)
+
+collection_name = data["collection_name"]
+config = data["config"]
+
+# Create collection
+client.create_collection(
+    collection_name=collection_name,
+    vectors_config=VectorParams(
+        size=config["vector_size"],
+        distance=Distance.COSINE
+    )
+)
+
+# Upload points (add real embeddings in production)
+points = []
+for point in data["points"]:
+    points.append(PointStruct(
+        id=point["id"],
+        vector=[0.0] * config["vector_size"],  # Replace with real embeddings
+        payload=point["payload"]
+    ))
+
+client.upsert(collection_name=collection_name, points=points)
+print(f"Uploaded {len(points)} points to {collection_name}")
+```
+
+### Step 3: Query
+
+```python
+#!/usr/bin/env python3
+from qdrant_client import QdrantClient
+from qdrant_client.models import Filter, FieldCondition, MatchValue
+
+client = QdrantClient(url="http://localhost:6333")
+collection_name = "your-collection"
+
+# Filter by category
+result = client.scroll(
+    collection_name=collection_name,
+    scroll_filter=Filter(
+        must=[
+            FieldCondition(
+                key="category",
+                match=MatchValue(value="api")
+            )
+        ]
+    ),
+    limit=5
+)
+
+for point in result[0]:
+    print(f"- {point.payload['file']}: {point.payload['content'][:100]}...")
+```
+
+## Trinity Codex Application
+
+### Phase 1: Documentation Ingestion
+
+Convert key Firefrost documentation sources:
+
+```bash
+# Pterodactyl docs
+skill-seekers create https://pterodactyl.io/project/introduction.html
+skill-seekers package output/pterodactyl --target qdrant
+
+# Minecraft Wiki (modding)
+skill-seekers create https://minecraft.wiki/w/Mods
+
+# Operations Manual (local)
+skill-seekers create ./docs/
+skill-seekers package output/docs --target qdrant
+```
+
+### Phase 2: Vector Database Setup
+
+Qdrant runs on TX1 (38.68.14.26) alongside Dify:
+
+```bash
+# Docker deployment
+docker run -d \
+  --name qdrant \
+  -p 6333:6333 \
+  -v /opt/qdrant/storage:/qdrant/storage \
+  qdrant/qdrant:latest
+```
+
+### Phase 3: Dify Integration
+
+Dify connects to Qdrant for RAG queries. See Dify documentation for knowledge base configuration.
+
+## Key Features for Firefrost
+
+| Feature | Benefit |
+|---------|---------|
+| Multi-source ingestion | Combine wiki, docs, PDFs into one knowledge base |
+| Qdrant-native output | Direct integration with our planned stack |
+| Smart chunking | Preserves code blocks and context |
+| Metadata preservation | Category, file, type fields for filtering |
+| 500+ line SKILL.md | High-quality Claude skills from any source |
+
+## Resources
+
+- **Full repo:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
+- **Original:** https://github.com/yusufkaraaslan/Skill_Seekers
+- **Website:** https://skillseekersweb.com/
+- **Qdrant Docs:** https://qdrant.tech/documentation/
+
+---
+
+*Added by Chronicler #73 on 2026-04-09*
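Review note on Step 2: the upload script reads `collection_name`, `config["vector_size"]`, and each point's `id` and `payload` without checking them, so a malformed package would only fail partway through the upload. A minimal sketch of a pre-upload check follows. It assumes the packaged JSON has exactly the shape those reads imply, which is inferred from the script and not confirmed against the Skill Seekers documentation. The helper names `validate_package` and `validate_file` are hypothetical and use only the standard library.

```python
#!/usr/bin/env python3
"""Sanity-check a packaged Qdrant JSON file before running the Step 2 upload.

ASSUMPTION: the layout checked here is inferred only from the keys the
Step 2 script reads (collection_name, config.vector_size, points[].id,
points[].payload). It is not taken from the Skill Seekers documentation.
"""
import json
from pathlib import Path


def validate_package(data):
    """Return a list of problems; an empty list means the shape looks usable."""
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]

    problems = []

    name = data.get("collection_name")
    if not isinstance(name, str) or not name:
        problems.append("collection_name is missing or empty")

    config = data.get("config")
    size = config.get("vector_size") if isinstance(config, dict) else None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
        problems.append("config.vector_size must be a positive integer")

    points = data.get("points")
    if not isinstance(points, list) or not points:
        problems.append("points is missing or empty")
        return problems

    seen_ids = set()
    for index, point in enumerate(points):
        if not isinstance(point, dict):
            problems.append(f"points[{index}] is not an object")
            continue
        point_id = point.get("id")
        # Qdrant point IDs are unsigned integers or UUID strings.
        if isinstance(point_id, bool) or not isinstance(point_id, (int, str)):
            problems.append(f"points[{index}].id must be an integer or string")
        elif point_id in seen_ids:
            problems.append(f"points[{index}].id duplicates an earlier id")
        else:
            seen_ids.add(point_id)
        if not isinstance(point.get("payload"), dict):
            problems.append(f"points[{index}].payload is not an object")
    return problems


def validate_file(path):
    """Load a packaged JSON file and validate its contents."""
    with Path(path).open(encoding="utf-8") as handle:
        return validate_package(json.load(handle))
```

One way to use it would be to call `validate_file("output/your-skill-qdrant.json")` before `client.create_collection(...)` and stop if the returned list is not empty. The check covers structure only and says nothing about embedding quality.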