docs: Add Skill Seekers reference for Trinity Codex (Task #93)

- Forked yusufkaraaslan/Skill_Seekers to Gitea (MIT License)
- Added Qdrant integration guide for Task #93
- Tool converts docs/repos/PDFs to RAG-ready format
- Directly applicable to Trinity Codex knowledge base

Chronicler #73
Claude
2026-04-09 13:35:44 +00:00
parent 874f185435
commit 791c131fac
2 changed files with 217 additions and 0 deletions

@@ -529,6 +529,30 @@ docs/skills/
**License:** MIT
**Added:** 2026-04-09 by Chronicler #73
### skill-seekers-reference (Gitea)
**Location:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
**Source:** Fork of [yusufkaraaslan/Skill_Seekers](https://github.com/yusufkaraaslan/Skill_Seekers)
**License:** MIT
**Added:** 2026-04-09 by Chronicler #73
**Purpose:** Convert 17+ source types (documentation sites, GitHub repos, PDFs, and more) into AI skills and RAG-ready output. Directly applicable to Task #93 (Trinity Codex).
| Feature | Description |
|---------|-------------|
| **17 source types** | Docs, GitHub, PDFs, YouTube, notebooks, wikis, OpenAPI |
| **Multiple targets** | Claude, Gemini, OpenAI, LangChain, LlamaIndex, Qdrant |
| **Qdrant pipeline** | Direct integration with our planned RAG stack |
| **Smart chunking** | Preserves code blocks and context |
**Quick start:**
```bash
pip install skill-seekers
skill-seekers create https://docs.example.com/
skill-seekers package output/example --target qdrant
```
**See also:** `docs/tasks/task-093-trinity-codex/references/skill-seekers-qdrant.md`
**Contents:** 2,375 files across 20+ categories — a comprehensive AI skills library for full review.
| Category | Items | Highlights |

@@ -0,0 +1,193 @@
# Skill Seekers + Qdrant Integration
**Source:** https://github.com/yusufkaraaslan/Skill_Seekers
**License:** MIT
**Gitea Fork:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
## Overview
Skill Seekers converts 17+ source types (documentation sites, GitHub repos, PDFs, and more) into structured knowledge assets ready for RAG pipelines. This makes it directly applicable to Trinity Codex.
## Installation
```bash
pip install skill-seekers
```
## Quick Start
```bash
# Convert docs to skill
skill-seekers create https://docs.example.com/
# Package for Qdrant
skill-seekers package output/example --target qdrant
```
## Supported Sources (17 types)
Partial list; 13 of the 17 types are shown here:
- Documentation websites
- GitHub repositories
- PDF documents
- Word documents (.docx)
- EPUB e-books
- Jupyter Notebooks
- OpenAPI specs
- PowerPoint presentations
- AsciiDoc documents
- HTML files
- RSS/Atom feeds
- Man pages
- YouTube videos (with `skill-seekers[video]`)
## Qdrant Pipeline
### Step 1: Generate Skill
```python
#!/usr/bin/env python3
import subprocess
from pathlib import Path

# Scrape documentation
subprocess.run(
    [
        "skill-seekers", "scrape",
        "--config", "configs/your-config.json",
        "--max-pages", "20",
    ],
    check=True,
)

# Package for Qdrant
subprocess.run(
    [
        "skill-seekers", "package",
        "output/your-skill",
        "--target", "qdrant",
    ],
    check=True,
)

output = Path("output/your-skill-qdrant.json")
print(f"Ready: {output} ({output.stat().st_size/1024:.1f} KB)")
```
### Step 2: Upload to Qdrant
```python
#!/usr/bin/env python3
import json
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Connect to Qdrant (our instance will be on TX1)
client = QdrantClient(url="http://localhost:6333")

# Load packaged data
with open("output/your-skill-qdrant.json") as f:
    data = json.load(f)

collection_name = data["collection_name"]
config = data["config"]

# Create collection
client.create_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(
        size=config["vector_size"],
        distance=Distance.COSINE,
    ),
)

# Upload points (add real embeddings in production)
points = []
for point in data["points"]:
    points.append(PointStruct(
        id=point["id"],
        vector=[0.0] * config["vector_size"],  # Replace with real embeddings
        payload=point["payload"],
    ))

client.upsert(collection_name=collection_name, points=points)
print(f"Uploaded {len(points)} points to {collection_name}")
```
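Before uploading, it may be worth checking that the packaged file has the shape the upload script expects. The sketch below is an assumption based only on the keys the script above reads (`collection_name`, `config.vector_size`, `points[].id`, `points[].payload`); `validate_package` and the `sample` data are hypothetical names for illustration, not part of Skill Seekers or qdrant-client.

```python
from typing import Any


def validate_package(data: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a packaged Qdrant JSON document.

    Hypothetical helper: it only checks the keys the upload script reads.
    """
    problems: list[str] = []

    # Top-level keys read by the upload script.
    for key in ("collection_name", "config", "points"):
        if key not in data:
            problems.append(f"missing top-level key: {key}")
    if problems:
        return problems

    # The collection is created with this vector size.
    size = data["config"].get("vector_size")
    if not isinstance(size, int) or size <= 0:
        problems.append("config.vector_size must be a positive integer")

    # Every point needs an id and a payload.
    for index, point in enumerate(data["points"]):
        for key in ("id", "payload"):
            if key not in point:
                problems.append(f"point {index}: missing {key}")

    return problems


# Illustrative data only; real files come from `skill-seekers package`.
sample = {
    "collection_name": "example",
    "config": {"vector_size": 4},
    "points": [{"id": 1, "payload": {"category": "api"}}, {"id": 2}],
}
print(validate_package(sample))
```

An empty list means the file matches what the upload script reads; anything else can be fixed before a collection is created.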
### Step 3: Query
```python
#!/usr/bin/env python3
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue

client = QdrantClient(url="http://localhost:6333")
collection_name = "your-collection"

# Filter by category; scroll() returns (points, next_page_offset)
points, _next_offset = client.scroll(
    collection_name=collection_name,
    scroll_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="api"),
            )
        ]
    ),
    limit=5,
)

for point in points:
    print(f"- {point.payload['file']}: {point.payload['content'][:100]}...")
```
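The same category filter can also be written as a plain JSON body for Qdrant's REST scroll endpoint, which may help when debugging with `curl` or calling Qdrant from a tool without the Python client. This is a sketch: `build_scroll_request` is a hypothetical helper, and the collection name and host are the placeholders used above.

```python
import json


def build_scroll_request(base_url, collection, key, value, limit=5):
    """Build the URL and JSON body for a filtered scroll request.

    Hypothetical helper; it only constructs the request and sends nothing.
    """
    url = f"{base_url.rstrip('/')}/collections/{collection}/points/scroll"
    body = {
        # Equivalent of Filter(must=[FieldCondition(key=..., match=MatchValue(...))])
        "filter": {"must": [{"key": key, "match": {"value": value}}]},
        "limit": limit,
        "with_payload": True,
    }
    return url, body


url, body = build_scroll_request(
    "http://localhost:6333", "your-collection", "category", "api"
)
print("POST", url)
print(json.dumps(body, indent=2))
```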
## Trinity Codex Application
### Phase 1: Documentation Ingestion
Convert key Firefrost documentation sources:
```bash
# Pterodactyl docs
skill-seekers create https://pterodactyl.io/project/introduction.html
skill-seekers package output/pterodactyl --target qdrant
# Minecraft Wiki (modding)
skill-seekers create https://minecraft.wiki/w/Mods
# Operations Manual (local)
skill-seekers create ./docs/
skill-seekers package output/docs --target qdrant
```
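If the source list grows, the commands above could be generated from a table instead of being typed by hand. The following is a minimal sketch that only builds and prints the argument lists; `SOURCES` and `build_commands` are hypothetical names, and the output directories are the ones shown above (the Minecraft Wiki entry has no package step in the list above, so none is generated for it).

```python
import shlex

# (source, output directory to package, or None to skip packaging)
SOURCES = [
    ("https://pterodactyl.io/project/introduction.html", "output/pterodactyl"),
    ("https://minecraft.wiki/w/Mods", None),
    ("./docs/", "output/docs"),
]


def build_commands(sources):
    """Return the skill-seekers argv lists for each source, in order."""
    commands = []
    for source, output_dir in sources:
        commands.append(["skill-seekers", "create", source])
        if output_dir is not None:
            commands.append(
                ["skill-seekers", "package", output_dir, "--target", "qdrant"]
            )
    return commands


# Dry run: print the commands instead of executing them.
for argv in build_commands(SOURCES):
    print(shlex.join(argv))
```

To execute rather than print, each list can be passed to `subprocess.run(argv, check=True)`, as in the Step 1 script.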
### Phase 2: Vector Database Setup
Qdrant will run on TX1 (38.68.14.26) alongside Dify:
```bash
# Docker deployment
docker run -d \
--name qdrant \
-p 6333:6333 \
-v /opt/qdrant/storage:/qdrant/storage \
qdrant/qdrant:latest
```
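Once the container is started, a quick reachability check can confirm the REST port is answering before any upload is attempted. This sketch assumes an unauthenticated instance and uses Qdrant's `GET /collections` endpoint; `qdrant_is_up` is a hypothetical helper, and an instance protected by an API key would report `False` here.

```python
import urllib.error
import urllib.request


def qdrant_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/collections answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/collections"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    print("Qdrant reachable:", qdrant_is_up("http://localhost:6333"))
```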
### Phase 3: Dify Integration
Dify connects to Qdrant for RAG queries. See Dify documentation for knowledge base configuration.
## Key Features for Firefrost
| Feature | Benefit |
|---------|---------|
| Multi-source ingestion | Combine wiki, docs, PDFs into one knowledge base |
| Qdrant-native output | Direct integration with our planned stack |
| Smart chunking | Preserves code blocks and context |
| Metadata preservation | Category, file, type fields for filtering |
| 500+ line SKILL.md | High-quality Claude skills from any source |
## Resources
- **Full repo:** https://git.firefrostgaming.com/firefrost-gaming/skill-seekers-reference
- **Original:** https://github.com/yusufkaraaslan/Skill_Seekers
- **Website:** https://skillseekersweb.com/
- **Qdrant Docs:** https://qdrant.tech/documentation/
---
*Added by Chronicler #73 on 2026-04-09*