feat: add distribution files for Smithery, GitHub Action, and Claude Code Plugin

- Add Claude Code Plugin: plugin.json, .mcp.json, 3 slash commands, skill-builder agent skill
- Add GitHub Action: composite action.yml with 6 inputs/2 outputs, comprehensive README
- Add Smithery: publishing guide with namespace yusufkaraaslan/skill-seekers created
- Add render-mcp.yaml for MCP server deployment on Render
- Fix Dockerfile.mcp: --transport flag (nonexistent) → --http, add dynamic PORT support
- Update AGENTS.md to v3.3.0 with corrected test count and expanded CI section
- Allow distribution/claude-plugin/.mcp.json in .gitignore
1
.gitignore
vendored
@@ -61,5 +61,6 @@ htmlcov/
 skill-seekers-configs/
 .claude/skills
 .mcp.json
+!distribution/claude-plugin/.mcp.json
 settings.json
 USER_GUIDE.md
81
AGENTS.md
@@ -1,18 +1,20 @@
|
||||
# AGENTS.md - Skill Seekers
|
||||
|
||||
Concise reference for AI coding agents. Skill Seekers is a Python CLI tool (v3.2.0) that converts documentation sites, GitHub repos, PDFs, videos, notebooks, wikis, and more into AI-ready skills for 16+ LLM platforms and RAG pipelines.
|
||||
Concise reference for AI coding agents. Skill Seekers is a Python CLI tool (v3.3.0) that converts documentation sites, GitHub repos, PDFs, videos, notebooks, wikis, and more into AI-ready skills for 16+ LLM platforms and RAG pipelines.
|
||||
|
||||
## Setup
|
||||
|
||||
```bash
|
||||
# REQUIRED before running tests (src/ layout — tests fail without this)
|
||||
# REQUIRED before running tests (src/ layout — tests hard-exit if package not installed)
|
||||
pip install -e .
|
||||
# With dev tools
|
||||
# With dev tools (pytest, ruff, mypy, coverage)
|
||||
pip install -e ".[dev]"
|
||||
# With all optional deps
|
||||
pip install -e ".[all]"
|
||||
```
|
||||
|
||||
Note: `tests/conftest.py` checks that `skill_seekers` is importable and calls `sys.exit(1)` if not. Always install in editable mode first.
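
A minimal sketch of what such a guard could look like, assuming two hypothetical helpers named `package_is_importable` and `require_package` (the actual `tests/conftest.py` may be written differently). It uses `importlib.util.find_spec` so the check locates the package without importing it.

```python
"""Hypothetical sketch of an import guard for tests/conftest.py."""
import importlib.util
import sys


def package_is_importable(name: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ValueError when a loaded module has no usable spec
        return False


def require_package(name: str = "skill_seekers") -> None:
    """Exit with status 1 and an install hint if the package is missing."""
    if not package_is_importable(name):
        print(f"ERROR: '{name}' is not installed. Run: pip install -e .", file=sys.stderr)
        sys.exit(1)
```

Calling `require_package()` at module level in `conftest.py` would stop the session before test collection starts, which matches the hard-exit behaviour described in the note.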
|
||||
|
||||
## Build / Test / Lint Commands
|
||||
|
||||
```bash
|
||||
@@ -46,8 +48,10 @@ ruff format src/ tests/
|
||||
mypy src/skill_seekers --show-error-codes --pretty
|
||||
```
|
||||
|
||||
**Test markers:** `slow`, `integration`, `e2e`, `venv`, `bootstrap`, `benchmark`
|
||||
**Async tests:** use `@pytest.mark.asyncio`; asyncio_mode is `auto`.
|
||||
**Pytest config** (from pyproject.toml): `addopts = "-v --tb=short --strict-markers"`, `asyncio_mode = "auto"`, `asyncio_default_fixture_loop_scope = "function"`.
|
||||
**Test markers:** `slow`, `integration`, `e2e`, `venv`, `bootstrap`, `benchmark`, `asyncio`.
|
||||
**Async tests:** use `@pytest.mark.asyncio`; asyncio_mode is `auto` so the decorator is often implicit.
|
||||
**Test count:** 120 test files (107 in `tests/`, 13 in `tests/test_adaptors/`).
|
||||
|
||||
## Code Style
|
||||
|
||||
@@ -61,61 +65,47 @@ mypy src/skill_seekers --show-error-codes --pretty
|
||||
- Sort with isort (via ruff); `skill_seekers` is first-party
|
||||
- Standard library → third-party → first-party, separated by blank lines
|
||||
- Use `from __future__ import annotations` only if needed for forward refs
|
||||
- Guard optional imports with try/except ImportError (see `adaptors/__init__.py` pattern)
|
||||
- Guard optional imports with try/except ImportError (see `adaptors/__init__.py` pattern):
|
||||
  ```python
  try:
      from .claude import ClaudeAdaptor
  except ImportError:
      ClaudeAdaptor = None
  ```
|
||||
|
||||
### Naming Conventions
|
||||
- **Files:** `snake_case.py`
|
||||
- **Classes:** `PascalCase` (e.g., `SkillAdaptor`, `ClaudeAdaptor`)
|
||||
- **Functions/methods:** `snake_case`
|
||||
- **Constants:** `UPPER_CASE` (e.g., `ADAPTORS`, `DEFAULT_CHUNK_TOKENS`)
|
||||
- **Private:** prefix with `_`
|
||||
- **Files:** `snake_case.py` (e.g., `source_detector.py`, `config_validator.py`)
|
||||
- **Classes:** `PascalCase` (e.g., `SkillAdaptor`, `ClaudeAdaptor`, `SourceDetector`)
|
||||
- **Functions/methods:** `snake_case` (e.g., `get_adaptor()`, `detect_language()`)
|
||||
- **Constants:** `UPPER_CASE` (e.g., `ADAPTORS`, `DEFAULT_CHUNK_TOKENS`, `VALID_SOURCE_TYPES`)
|
||||
- **Private:** prefix with `_` (e.g., `_read_existing_content()`, `_validate_unified()`)
|
||||
|
||||
### Type Hints
|
||||
- Gradual typing — add hints where practical, not enforced everywhere
|
||||
- Use modern syntax: `str | None` not `Optional[str]`, `list[str]` not `List[str]`
|
||||
- MyPy config: `disallow_untyped_defs = false`, `check_untyped_defs = true`, `ignore_missing_imports = true`
|
||||
- Tests are excluded from strict type checking (`disallow_untyped_defs = false`, `check_untyped_defs = false` for `tests.*`)
|
||||
|
||||
### Docstrings
|
||||
- Module-level docstring on every file (triple-quoted, describes purpose)
|
||||
- Google-style or standard docstrings for public functions/classes
|
||||
- Google-style docstrings for public functions/classes
|
||||
- Include `Args:`, `Returns:`, `Raises:` sections where useful
|
||||
|
||||
### Error Handling
|
||||
- Use specific exceptions, never bare `except:`
|
||||
- Provide helpful error messages with context (see `get_adaptor()` in `adaptors/__init__.py`)
|
||||
- Provide helpful error messages with context
|
||||
- Use `raise ValueError(...)` for invalid arguments, `raise RuntimeError(...)` for state errors
|
||||
- Guard optional dependency imports with try/except and give clear install instructions on failure
|
||||
- Chain exceptions with `raise ... from e` when wrapping
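
The error-handling rules above can be illustrated with a short sketch. The helper names `load_optional` and `parse_config` are hypothetical and do not come from the codebase; raising `RuntimeError` for a missing optional dependency is an assumption made for this example.

```python
"""Sketch of the error-handling conventions; all names here are illustrative."""
import importlib
import json
from types import ModuleType


def load_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or fail with a clear install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Chain with `from e` so the original import failure stays visible
        raise RuntimeError(
            f"Optional dependency '{module_name}' is missing. "
            f'Install it with: pip install "skill-seekers[{extra}]"'
        ) from e


def parse_config(text: str) -> dict:
    """Parse a JSON config string, wrapping decoder errors with context."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as e:
        raise ValueError(f"Config is not valid JSON: {e.msg} (line {e.lineno})") from e
    if not isinstance(data, dict):
        raise ValueError(f"Config must be a JSON object, got {type(data).__name__}")
    return data
```

Each `except` clause names a specific exception type, and the wrapped error keeps its cause in `__cause__`, so a traceback shows both the helpful message and the underlying failure.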
|
||||
|
||||
### Suppressing Lint Warnings
|
||||
- Use inline `# noqa: XXXX` comments (e.g., `# noqa: F401` for re-exports, `# noqa: ARG001` for required but unused params)
|
||||
|
||||
## Supported Source Types (17)
|
||||
|
||||
| Type | CLI Command | Config Type | Detection |
|
||||
|------|------------|-------------|-----------|
|
||||
| Documentation (web) | `scrape` / `create <url>` | `documentation` | HTTP/HTTPS URLs |
|
||||
| GitHub repo | `github` / `create owner/repo` | `github` | `owner/repo` or github.com URLs |
|
||||
| PDF | `pdf` / `create file.pdf` | `pdf` | `.pdf` extension |
|
||||
| Word (.docx) | `word` / `create file.docx` | `word` | `.docx` extension |
|
||||
| EPUB | `epub` / `create file.epub` | `epub` | `.epub` extension |
|
||||
| Video | `video` / `create <url/file>` | `video` | YouTube/Vimeo URLs, video extensions |
|
||||
| Local codebase | `analyze` / `create ./path` | `local` | Directory paths |
|
||||
| Jupyter Notebook | `jupyter` / `create file.ipynb` | `jupyter` | `.ipynb` extension |
|
||||
| Local HTML | `html` / `create file.html` | `html` | `.html`/`.htm` extensions |
|
||||
| OpenAPI/Swagger | `openapi` / `create spec.yaml` | `openapi` | `.yaml`/`.yml` with OpenAPI content |
|
||||
| AsciiDoc | `asciidoc` / `create file.adoc` | `asciidoc` | `.adoc`/`.asciidoc` extensions |
|
||||
| PowerPoint | `pptx` / `create file.pptx` | `pptx` | `.pptx` extension |
|
||||
| RSS/Atom | `rss` / `create feed.rss` | `rss` | `.rss`/`.atom` extensions |
|
||||
| Man pages | `manpage` / `create cmd.1` | `manpage` | `.1`-`.8`/`.man` extensions |
|
||||
| Confluence | `confluence` | `confluence` | API or export directory |
|
||||
| Notion | `notion` | `notion` | API or export directory |
|
||||
| Slack/Discord | `chat` | `chat` | Export directory or API |
|
||||
|
||||
## Project Layout
|
||||
|
||||
```
|
||||
src/skill_seekers/ # Main package (src/ layout)
|
||||
cli/ # CLI commands and entry points
|
||||
cli/ # CLI commands and entry points (96 files)
|
||||
adaptors/ # Platform adaptors (Strategy pattern, inherit SkillAdaptor)
|
||||
arguments/ # CLI argument definitions (one per source type)
|
||||
parsers/ # Subcommand parsers (one per source type)
|
||||
@@ -127,15 +117,15 @@ src/skill_seekers/ # Main package (src/ layout)
|
||||
unified_scraper.py # Multi-source orchestrator (scraped_data + dispatch)
|
||||
unified_skill_builder.py # Pairwise synthesis + generic merge
|
||||
mcp/ # MCP server (FastMCP + legacy)
|
||||
tools/ # MCP tool implementations by category
|
||||
tools/ # MCP tool implementations by category (10 files)
|
||||
sync/ # Sync monitoring (Pydantic models)
|
||||
benchmark/ # Benchmarking framework
|
||||
embedding/ # FastAPI embedding server
|
||||
workflows/ # 67 YAML workflow presets (includes complex-merge.yaml)
|
||||
workflows/ # 67 YAML workflow presets
|
||||
_version.py # Reads version from pyproject.toml
|
||||
tests/ # 115+ test files (pytest)
|
||||
tests/ # 120 test files (pytest)
|
||||
configs/ # Preset JSON scraping configs
|
||||
docs/ # 80+ markdown doc files
|
||||
docs/ # Documentation (guides, integrations, architecture)
|
||||
```
|
||||
|
||||
## Key Patterns
|
||||
@@ -150,6 +140,8 @@ docs/ # 80+ markdown doc files
|
||||
|
||||
**CLI subcommands** — git-style in `cli/main.py`. Each delegates to a module's `main()` function.
|
||||
|
||||
**Supported source types (17):** documentation (web), github, pdf, word, epub, video, local codebase, jupyter, html, openapi, asciidoc, pptx, rss, manpage, confluence, notion, chat (slack/discord). Each detected automatically by `source_detector.py`.
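
As a rough illustration of automatic detection, the sketch below classifies a source string by URL host, directory check, and file extension. It is not the real `source_detector.py`: the function name `detect_source_type`, the host list, and the rule order are assumptions, it skips the OpenAPI content check for YAML files, and it ignores video file extensions.

```python
"""Illustrative source-type detection; not the real source_detector.py."""
import os
import re
from urllib.parse import urlparse

EXTENSION_TYPES = {
    ".pdf": "pdf", ".docx": "word", ".epub": "epub", ".ipynb": "jupyter",
    ".html": "html", ".htm": "html", ".adoc": "asciidoc", ".asciidoc": "asciidoc",
    ".pptx": "pptx", ".rss": "rss", ".atom": "rss", ".man": "manpage",
    # Assumption: any YAML file is treated as OpenAPI; no content check here.
    ".yaml": "openapi", ".yml": "openapi",
}
VIDEO_HOSTS = ("youtube.com", "youtu.be", "vimeo.com")
OWNER_REPO = re.compile(r"^[\w.-]+/[\w.-]+$")


def detect_source_type(source: str) -> str:
    """Map a user-supplied source string to a config type name."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = parsed.netloc.lower().removeprefix("www.")
        if host in VIDEO_HOSTS:
            return "video"
        if host == "github.com":
            return "github"
        return "documentation"
    if os.path.isdir(source):
        return "local"
    ext = os.path.splitext(source)[1].lower()
    if ext in EXTENSION_TYPES:
        return EXTENSION_TYPES[ext]
    if re.fullmatch(r"\.[1-8]", ext):
        return "manpage"  # man page sections .1 to .8
    if OWNER_REPO.match(source):
        return "github"  # owner/repo shorthand
    raise ValueError(f"Cannot detect source type for {source!r}")
```

File extensions are checked before the `owner/repo` pattern so that a relative path such as `docs/manual.pdf` is not mistaken for a GitHub repository.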
|
||||
|
||||
## Git Workflow
|
||||
|
||||
- **`main`** — production, protected
|
||||
@@ -168,4 +160,11 @@ Never commit API keys. Use env vars: `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY`, `OPE
|
||||
|
||||
## CI
|
||||
|
||||
GitHub Actions (`.github/workflows/tests.yml`): ruff + mypy lint job, then pytest matrix (Ubuntu + macOS, Python 3.10-3.12) with Codecov upload.
|
||||
GitHub Actions (7 workflows in `.github/workflows/`):
|
||||
- **tests.yml** — ruff + mypy lint job, then pytest matrix (Ubuntu + macOS, Python 3.10-3.12) with Codecov upload
|
||||
- **release.yml** — tag-triggered: tests → version verification → PyPI publish via `uv build`
|
||||
- **test-vector-dbs.yml** — tests vector DB adaptors (weaviate, chroma, faiss, qdrant)
|
||||
- **docker-publish.yml** — multi-platform Docker builds (amd64, arm64) for CLI + MCP images
|
||||
- **quality-metrics.yml** — quality analysis with configurable threshold
|
||||
- **scheduled-updates.yml** — weekly skill updates for popular frameworks
|
||||
- **vector-db-export.yml** — weekly vector DB exports
|
||||
|
||||
Dockerfile.mcp
@@ -4,8 +4,8 @@
 FROM python:3.12-slim
 
 LABEL maintainer="Skill Seekers <noreply@skillseekers.dev>"
-LABEL description="Skill Seekers MCP Server - 25 tools for AI skills generation"
-LABEL version="2.9.0"
+LABEL description="Skill Seekers MCP Server - 35 tools for AI skills generation"
+LABEL version="3.3.0"
 
 WORKDIR /app
 
@@ -48,9 +48,10 @@ HEALTHCHECK --interval=30s --timeout=10s --start-period=10s --retries=3 \
 # Volumes
 VOLUME ["/data", "/configs", "/output"]
 
-# Expose MCP server port
-EXPOSE 8765
+# Expose MCP server port (default 8765, overridden by $PORT on cloud platforms)
+EXPOSE ${MCP_PORT:-8765}
 
 # Start MCP server in HTTP mode by default
-# Use --transport stdio for stdio mode
-CMD ["python", "-m", "skill_seekers.mcp.server_fastmcp", "--transport", "http", "--port", "8765"]
+# Uses shell form so $PORT/$MCP_PORT env vars are expanded at runtime
+# Cloud platforms (Render, Railway, etc.) set $PORT automatically
+CMD python -m skill_seekers.mcp.server_fastmcp --http --host 0.0.0.0 --port ${PORT:-${MCP_PORT:-8765}}
11
distribution/claude-plugin/.claude-plugin/plugin.json
Normal file
@@ -0,0 +1,11 @@
{
  "name": "skill-seekers",
  "description": "Transform 17 source types (docs, GitHub, PDFs, videos, Jupyter, Confluence, Notion, Slack, and more) into AI-ready skills and RAG knowledge for 16+ LLM platforms.",
  "version": "3.3.0",
  "author": {
    "name": "Yusuf Karaaslan"
  },
  "homepage": "https://github.com/yusufkaraaslan/Skill_Seekers",
  "repository": "https://github.com/yusufkaraaslan/Skill_Seekers",
  "license": "MIT"
}
6
distribution/claude-plugin/.mcp.json
Normal file
@@ -0,0 +1,6 @@
{
  "skill-seekers": {
    "command": "python",
    "args": ["-m", "skill_seekers.mcp.server_fastmcp"]
  }
}
93
distribution/claude-plugin/README.md
Normal file
@@ -0,0 +1,93 @@
|
||||
# Skill Seekers — Claude Code Plugin
|
||||
|
||||
Transform 17 source types into AI-ready skills and RAG knowledge, directly from Claude Code.
|
||||
|
||||
## Installation
|
||||
|
||||
### From the Official Plugin Directory
|
||||
|
||||
```
|
||||
/plugin install skill-seekers@claude-plugin-directory
|
||||
```
|
||||
|
||||
Or browse for it in `/plugin > Discover`.
|
||||
|
||||
### Local Installation (for development)
|
||||
|
||||
```bash
|
||||
claude --plugin-dir ./path/to/skill-seekers-plugin
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
|
||||
The plugin requires `skill-seekers` to be installed:
|
||||
|
||||
```bash
|
||||
pip install skill-seekers[mcp]
|
||||
```
|
||||
|
||||
## What's Included
|
||||
|
||||
### MCP Server (35 tools)
|
||||
|
||||
The plugin bundles the Skill Seekers MCP server providing tools for:
|
||||
- Scraping documentation, GitHub repos, PDFs, videos, and 13 other source types
|
||||
- Packaging skills for 16+ LLM platforms
|
||||
- Exporting to vector databases (Weaviate, Chroma, FAISS, Qdrant)
|
||||
- Managing configs, workflows, and sources
|
||||
|
||||
### Slash Commands
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `/skill-seekers:create-skill <source>` | Create a skill from any source (auto-detects type) |
|
||||
| `/skill-seekers:sync-config <config>` | Sync config URLs against live docs |
|
||||
| `/skill-seekers:install-skill <source>` | End-to-end: fetch, scrape, enhance, package, install |
|
||||
|
||||
### Agent Skill
|
||||
|
||||
The **skill-builder** skill is automatically available to Claude. It detects source types and uses the appropriate MCP tools to build skills autonomously.
|
||||
|
||||
## Usage Examples
|
||||
|
||||
```
|
||||
# Create a skill from a documentation site
|
||||
/skill-seekers:create-skill https://react.dev
|
||||
|
||||
# Create from a GitHub repo, targeting LangChain
|
||||
/skill-seekers:create-skill pallets/flask --target langchain
|
||||
|
||||
# Full install workflow with AI enhancement
|
||||
/skill-seekers:install-skill https://fastapi.tiangolo.com --enhance
|
||||
|
||||
# Sync an existing config
|
||||
/skill-seekers:sync-config react
|
||||
```
|
||||
|
||||
Or just ask Claude naturally:
|
||||
> "Create an AI skill from the React documentation"
|
||||
> "Scrape the Flask GitHub repo and package it for OpenAI"
|
||||
> "Export my skill to a Chroma vector database"
|
||||
|
||||
The skill-builder agent skill will automatically detect the intent and use the right tools.
|
||||
|
||||
## Remote MCP Alternative
|
||||
|
||||
By default, the plugin runs the MCP server locally via `python -m skill_seekers.mcp.server_fastmcp`. To use a remote server instead, edit `.mcp.json`:
|
||||
|
||||
```json
{
  "skill-seekers": {
    "type": "http",
    "url": "https://your-hosted-server.com/mcp"
  }
}
```
|
||||
|
||||
## Supported Source Types
|
||||
|
||||
Documentation (web), GitHub repos, PDFs, Word docs, EPUBs, videos, local codebases, Jupyter notebooks, HTML files, OpenAPI specs, AsciiDoc, PowerPoint, RSS/Atom feeds, man pages, Confluence, Notion, Slack/Discord exports.
|
||||
|
||||
## License
|
||||
|
||||
MIT — https://github.com/yusufkaraaslan/Skill_Seekers
|
||||
52
distribution/claude-plugin/commands/create-skill.md
Normal file
@@ -0,0 +1,52 @@
|
||||
---
|
||||
description: Create an AI skill from any source (URL, repo, PDF, video, notebook, etc.)
|
||||
---
|
||||
|
||||
# Create Skill
|
||||
|
||||
Create an AI-ready skill from a source. The source type is auto-detected.
|
||||
|
||||
## Usage
|
||||
|
||||
```
|
||||
/skill-seekers:create-skill <source> [--target <platform>] [--output <dir>]
|
||||
```
|
||||
|
||||
## Instructions
|
||||
|
||||
When the user provides a source via `$ARGUMENTS`, run the `skill-seekers create` command to generate a skill.
|
||||
|
||||
1. Parse the arguments: extract the source (first argument) and any flags.
|
||||
2. If no `--target` is specified, default to `claude`.
|
||||
3. If no `--output` is specified, default to `./output`.
|
||||
4. Run the command:
|
||||
```bash
|
||||
skill-seekers create "$SOURCE" --target "$TARGET" --output "$OUTPUT"
|
||||
```
|
||||
5. After completion, read the generated `SKILL.md` and summarize what was created.
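
Steps 1 to 4 above can be sketched in Python as follows. This is an illustration only: `build_create_command` is a hypothetical helper, and the plugin itself leaves the parsing to Claude rather than to a script. The defaults and the final command line are taken from the instructions above.

```python
"""Sketch of steps 1-4: parse the arguments, apply defaults, build the command."""
import argparse
import shlex


def build_create_command(arguments: str) -> list[str]:
    """Turn a raw argument string into an argv list for `skill-seekers create`."""
    parser = argparse.ArgumentParser(prog="create-skill")
    parser.add_argument("source")                         # step 1: first positional argument
    parser.add_argument("--target", default="claude")     # step 2: default target
    parser.add_argument("--output", default="./output")   # step 3: default output directory
    args = parser.parse_args(shlex.split(arguments))
    # step 4: the command from the instructions, as an argv list
    return [
        "skill-seekers", "create", args.source,
        "--target", args.target,
        "--output", args.output,
    ]
```

Returning a list rather than a single string means the command can be passed to `subprocess.run` without a shell, so sources containing spaces or shell metacharacters need no extra quoting.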
|
||||
|
||||
## Source Types (auto-detected)
|
||||
|
||||
- **URL** (https://...) → Documentation scraping
|
||||
- **owner/repo** or github.com URL → GitHub repo analysis
|
||||
- **file.pdf** → PDF extraction
|
||||
- **file.ipynb** → Jupyter notebook
|
||||
- **file.docx** → Word document
|
||||
- **file.epub** → EPUB book
|
||||
- **YouTube/Vimeo URL** → Video transcript
|
||||
- **./directory** → Local codebase analysis
|
||||
- **file.yaml** with OpenAPI → API spec
|
||||
- **file.pptx** → PowerPoint
|
||||
- **file.adoc** → AsciiDoc
|
||||
- **file.html** → HTML page
|
||||
- **file.rss** → RSS/Atom feed
|
||||
- **cmd.1** → Man page
|
||||
|
||||
## Examples
|
||||
|
||||
```
|
||||
/skill-seekers:create-skill https://react.dev
|
||||
/skill-seekers:create-skill pallets/flask --target langchain
|
||||
/skill-seekers:create-skill ./docs/api.pdf --target openai
|
||||
/skill-seekers:create-skill https://youtube.com/watch?v=abc123
|
||||
```
|
||||
44
distribution/claude-plugin/commands/install-skill.md
Normal file
@@ -0,0 +1,44 @@
|
||||
---
|
||||
description: One-command skill installation — fetch config, scrape, enhance, package, and install
|
||||
---
|
||||
|
||||
# Install Skill
|
||||
|
||||
Complete end-to-end workflow: fetch a config (from preset or URL), scrape the source, optionally enhance with AI, package for the target platform, and install.
|
||||
|
||||
## Usage
|
||||
|
||||
```
|
||||
/skill-seekers:install-skill <config-or-source> [--target <platform>] [--enhance]
|
||||
```
|
||||
|
||||
## Instructions
|
||||
|
||||
When the user provides a source or config via `$ARGUMENTS`:
|
||||
|
||||
1. Determine if the argument is a config preset name, config file path, or a direct source.
|
||||
2. Use the `install_skill` MCP tool if available, or run the equivalent CLI commands:
|
||||
```bash
|
||||
# For preset configs
|
||||
skill-seekers install --config "$CONFIG" --target "$TARGET"
|
||||
|
||||
# For direct sources
|
||||
skill-seekers create "$SOURCE" --target "$TARGET"
|
||||
```
|
||||
3. If `--enhance` is specified, run enhancement after initial scraping:
|
||||
```bash
|
||||
skill-seekers enhance "$SKILL_DIR" --target "$TARGET"
|
||||
```
|
||||
4. Report the final skill location and how to use it.
|
||||
|
||||
## Target Platforms
|
||||
|
||||
`claude`, `openai`, `gemini`, `langchain`, `llamaindex`, `haystack`, `cursor`, `windsurf`, `continue`, `cline`, `markdown`
|
||||
|
||||
## Examples
|
||||
|
||||
```
|
||||
/skill-seekers:install-skill react --target claude
|
||||
/skill-seekers:install-skill https://fastapi.tiangolo.com --target langchain --enhance
|
||||
/skill-seekers:install-skill pallets/flask
|
||||
```
|
||||
32
distribution/claude-plugin/commands/sync-config.md
Normal file
@@ -0,0 +1,32 @@
|
||||
---
|
||||
description: Sync a scraping config's URLs against the live documentation site
|
||||
---
|
||||
|
||||
# Sync Config
|
||||
|
||||
Synchronize a Skill Seekers config file with the current state of a documentation site. Detects new pages, removed pages, and URL changes.
|
||||
|
||||
## Usage
|
||||
|
||||
```
|
||||
/skill-seekers:sync-config <config-path-or-name>
|
||||
```
|
||||
|
||||
## Instructions
|
||||
|
||||
When the user provides a config path or preset name via `$ARGUMENTS`:
|
||||
|
||||
1. If it's a preset name (e.g., `react`, `godot`), look for it in the `configs/` directory or fetch from the API.
|
||||
2. Run the sync command:
|
||||
```bash
|
||||
skill-seekers sync-config "$CONFIG"
|
||||
```
|
||||
3. Report what changed: new URLs found, removed URLs, and any conflicts.
|
||||
4. Ask the user if they want to update the config and re-scrape.
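
The report in step 3 boils down to comparing two URL sets. The sketch below shows one way to do that; `UrlDiff`, `normalize`, and `diff_urls` are hypothetical names, and the normalization rule (drop fragments and trailing slashes) is an assumption, not the behaviour of `skill-seekers sync-config`.

```python
"""Sketch of the comparison behind step 3; the real sync logic may differ."""
from dataclasses import dataclass, field


@dataclass
class UrlDiff:
    added: list[str] = field(default_factory=list)      # on the live site, not in the config
    removed: list[str] = field(default_factory=list)    # in the config, gone from the site
    unchanged: list[str] = field(default_factory=list)

    @property
    def has_changes(self) -> bool:
        return bool(self.added or self.removed)


def normalize(url: str) -> str:
    """Drop fragments and trailing slashes so trivially different URLs compare equal."""
    return url.split("#", 1)[0].rstrip("/")


def diff_urls(configured: list[str], live: list[str]) -> UrlDiff:
    """Compare the URLs stored in a config with those found on the live site."""
    old = {normalize(u) for u in configured}
    new = {normalize(u) for u in live}
    return UrlDiff(
        added=sorted(new - old),
        removed=sorted(old - new),
        unchanged=sorted(old & new),
    )
```

A result with `has_changes` set to `False` would mean there is nothing to report and no reason to offer a re-scrape in step 4.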
|
||||
|
||||
## Examples
|
||||
|
||||
```
|
||||
/skill-seekers:sync-config configs/react.json
|
||||
/skill-seekers:sync-config react
|
||||
```
|
||||
69
distribution/claude-plugin/skills/skill-builder/SKILL.md
Normal file
@@ -0,0 +1,69 @@
|
||||
---
|
||||
name: skill-builder
|
||||
description: Automatically detect source types and build AI skills using Skill Seekers. Use when the user wants to create skills from documentation, repos, PDFs, videos, or other knowledge sources.
|
||||
---
|
||||
|
||||
# Skill Builder
|
||||
|
||||
You have access to the Skill Seekers MCP server which provides 35 tools for converting knowledge sources into AI-ready skills.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
Use this skill when the user:
|
||||
- Wants to create an AI skill from a documentation site, GitHub repo, PDF, video, or other source
|
||||
- Needs to convert documentation into a format suitable for LLM consumption
|
||||
- Wants to update or sync existing skills with their source documentation
|
||||
- Needs to export skills to vector databases (Weaviate, Chroma, FAISS, Qdrant)
|
||||
- Asks about scraping, converting, or packaging documentation for AI
|
||||
|
||||
## Source Type Detection
|
||||
|
||||
Automatically detect the source type from user input:
|
||||
|
||||
| Input Pattern | Source Type | Tool to Use |
|
||||
|---------------|-------------|-------------|
|
||||
| `https://...` (not GitHub/YouTube) | Documentation | `scrape_docs` |
|
||||
| `owner/repo` or `github.com/...` | GitHub | `scrape_github` |
|
||||
| `*.pdf` | PDF | `scrape_pdf` |
|
||||
| YouTube/Vimeo URL or video file | Video | `scrape_video` |
|
||||
| Local directory path | Codebase | `scrape_codebase` |
|
||||
| `*.ipynb`, `*.html`, `*.yaml` (OpenAPI), `*.adoc`, `*.pptx`, `*.rss`, `*.1`-`.8` | Various | `scrape_generic` |
|
||||
| JSON config file | Unified | Use config with `scrape_docs` |
|
||||
|
||||
## Recommended Workflow
|
||||
|
||||
1. **Detect source type** from the user's input
|
||||
2. **Generate or fetch config** using `generate_config` or `fetch_config` if needed
|
||||
3. **Estimate scope** with `estimate_pages` for documentation sites
|
||||
4. **Scrape the source** using the appropriate scraping tool
|
||||
5. **Enhance** with `enhance_skill` if the user wants AI-powered improvements
|
||||
6. **Package** with `package_skill` for the target platform
|
||||
7. **Export to vector DB** if requested using `export_to_*` tools
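
The workflow above can be read as an ordered plan of tool calls. The sketch below builds such a plan from the tool names listed in this skill; `plan_workflow` is a hypothetical helper, and limiting `generate_config` and `estimate_pages` to documentation sources is an assumption made for the example.

```python
"""Sketch of the recommended workflow as an ordered list of MCP tool names."""
from __future__ import annotations

SCRAPE_TOOLS = {
    "documentation": "scrape_docs",
    "github": "scrape_github",
    "pdf": "scrape_pdf",
    "video": "scrape_video",
    "codebase": "scrape_codebase",
}
VECTOR_DBS = ("weaviate", "chroma", "faiss", "qdrant")


def plan_workflow(
    source_type: str,
    *,
    has_config: bool = False,
    enhance: bool = False,
    vector_db: str | None = None,
) -> list[str]:
    """Return the tool names to call, in order, for one skill build."""
    steps: list[str] = []
    if source_type == "documentation":
        if not has_config:
            steps.append("generate_config")   # step 2, only if no config exists yet
        steps.append("estimate_pages")        # step 3
    # step 4: anything without a dedicated tool falls back to scrape_generic
    steps.append(SCRAPE_TOOLS.get(source_type, "scrape_generic"))
    if enhance:
        steps.append("enhance_skill")         # step 5
    steps.append("package_skill")             # step 6
    if vector_db is not None:
        if vector_db not in VECTOR_DBS:
            raise ValueError(f"Unsupported vector DB: {vector_db!r}")
        steps.append(f"export_to_{vector_db}")  # step 7
    return steps
```

For example, `plan_workflow("github", enhance=True)` yields `scrape_github`, `enhance_skill`, `package_skill` in that order.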
|
||||
|
||||
## Available MCP Tools
|
||||
|
||||
### Config Management
|
||||
- `generate_config` — Generate a scraping config from a URL
|
||||
- `list_configs` — List available preset configs
|
||||
- `validate_config` — Validate a config file
|
||||
|
||||
### Scraping (use based on source type)
|
||||
- `scrape_docs` — Documentation sites
|
||||
- `scrape_github` — GitHub repositories
|
||||
- `scrape_pdf` — PDF files
|
||||
- `scrape_video` — Video transcripts
|
||||
- `scrape_codebase` — Local code analysis
|
||||
- `scrape_generic` — Jupyter, HTML, OpenAPI, AsciiDoc, PPTX, RSS, manpage, Confluence, Notion, chat
|
||||
|
||||
### Post-processing
|
||||
- `enhance_skill` — AI-powered skill enhancement
|
||||
- `package_skill` — Package for target platform
|
||||
- `upload_skill` — Upload to platform API
|
||||
- `install_skill` — End-to-end install workflow
|
||||
|
||||
### Advanced
|
||||
- `detect_patterns` — Design pattern detection in code
|
||||
- `extract_test_examples` — Extract usage examples from tests
|
||||
- `build_how_to_guides` — Generate how-to guides from tests
|
||||
- `split_config` — Split large configs into focused skills
|
||||
- `export_to_weaviate`, `export_to_chroma`, `export_to_faiss`, `export_to_qdrant` — Vector DB export
|
||||
147
distribution/github-action/README.md
Normal file
@@ -0,0 +1,147 @@
|
||||
# Skill Seekers GitHub Action
|
||||
|
||||
Transform documentation, GitHub repos, PDFs, videos, and 13 other source types into AI-ready skills and RAG knowledge — directly in your CI/CD pipeline.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```yaml
- uses: yusufkaraaslan/skill-seekers-action@v3
  with:
    source: 'https://react.dev'
```
|
||||
|
||||
## Inputs
|
||||
|
||||
| Input | Required | Default | Description |
|
||||
|-------|----------|---------|-------------|
|
||||
| `source` | Yes | — | Source URL, file path, or `owner/repo` |
|
||||
| `command` | No | `create` | Command: `create`, `scrape`, `github`, `pdf`, `video`, `analyze`, `unified` |
|
||||
| `target` | No | `claude` | Target platform: `claude`, `openai`, `gemini`, `langchain`, `llamaindex`, `markdown` |
|
||||
| `config` | No | — | Path to JSON config file |
|
||||
| `output-dir` | No | `output` | Output directory |
|
||||
| `extra-args` | No | — | Additional CLI arguments |
|
||||
|
||||
## Outputs
|
||||
|
||||
| Output | Description |
|
||||
|--------|-------------|
|
||||
| `skill-dir` | Path to the generated skill directory |
|
||||
| `skill-name` | Name of the generated skill |
|
||||
|
||||
## Examples
|
||||
|
||||
### Auto-update documentation skill weekly
|
||||
|
||||
```yaml
name: Update AI Skills
on:
  schedule:
    - cron: '0 6 * * 1'  # Every Monday 6am UTC
  workflow_dispatch:

jobs:
  update-skills:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: yusufkaraaslan/skill-seekers-action@v3
        with:
          source: 'https://react.dev'
          target: 'langchain'

      - uses: actions/upload-artifact@v4
        with:
          name: react-skill
          path: output/
```
|
||||
|
||||
### Generate skill from GitHub repo
|
||||
|
||||
```yaml
- uses: yusufkaraaslan/skill-seekers-action@v3
  with:
    source: 'pallets/flask'
    command: 'github'
    target: 'claude'
```
|
||||
|
||||
### Process PDF documentation
|
||||
|
||||
```yaml
- uses: actions/checkout@v4

- uses: yusufkaraaslan/skill-seekers-action@v3
  with:
    source: 'docs/api-reference.pdf'
    command: 'pdf'
```
|
||||
|
||||
### Unified multi-source build with config
|
||||
|
||||
```yaml
- uses: actions/checkout@v4

- uses: yusufkaraaslan/skill-seekers-action@v3
  with:
    config: 'configs/my-project.json'
    command: 'unified'
    target: 'openai'
```
|
||||
|
||||
### Commit generated skill back to repo
|
||||
|
||||
```yaml
- uses: actions/checkout@v4

- uses: yusufkaraaslan/skill-seekers-action@v3
  id: generate
  with:
    source: 'https://fastapi.tiangolo.com'

- name: Commit skill
  run: |
    git config user.name "github-actions[bot]"
    git config user.email "github-actions[bot]@users.noreply.github.com"
    git add output/
    git diff --staged --quiet || git commit -m "Update AI skill: ${{ steps.generate.outputs.skill-name }}"
    git push
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
Pass API keys as environment variables for AI-enhanced skills:
|
||||
|
||||
```yaml
env:
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
  GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
|
||||
|
||||
## Supported Source Types
|
||||
|
||||
| Type | Example Source |
|
||||
|------|---------------|
|
||||
| Documentation (web) | `https://react.dev` |
|
||||
| GitHub repo | `pallets/flask` or `https://github.com/pallets/flask` |
|
||||
| PDF | `docs/manual.pdf` |
|
||||
| Video | `https://youtube.com/watch?v=...` |
|
||||
| Local codebase | `./src` |
|
||||
| Jupyter Notebook | `analysis.ipynb` |
|
||||
| OpenAPI/Swagger | `openapi.yaml` |
|
||||
| Word (.docx) | `docs/guide.docx` |
|
||||
| EPUB | `book.epub` |
|
||||
| PowerPoint | `slides.pptx` |
|
||||
| AsciiDoc | `docs/guide.adoc` |
|
||||
| HTML | `page.html` |
|
||||
| RSS/Atom | `feed.rss` |
|
||||
| Man pages | `tool.1` |
|
||||
| Confluence | Via config file |
|
||||
| Notion | Via config file |
|
||||
| Chat (Slack/Discord) | Via config file |
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
92
distribution/github-action/action.yml
Normal file
@@ -0,0 +1,92 @@
|
||||
name: 'Skill Seekers - AI Knowledge Builder'
|
||||
description: 'Transform documentation, repos, PDFs, videos, and 13 other source types into AI skills and RAG knowledge'
|
||||
author: 'Yusuf Karaaslan'
|
||||
|
||||
branding:
|
||||
icon: 'book-open'
|
||||
color: 'blue'
|
||||
|
||||
inputs:
|
||||
source:
|
||||
description: 'Source URL, file path, or owner/repo for GitHub repos'
|
||||
required: true
|
||||
command:
|
||||
description: 'Command to run: create (auto-detect), scrape, github, pdf, video, analyze, unified'
|
||||
required: false
|
||||
default: 'create'
|
||||
target:
|
||||
description: 'Output target platform: claude, openai, gemini, langchain, llamaindex, markdown, cursor, windsurf'
|
||||
required: false
|
||||
default: 'claude'
|
||||
config:
|
||||
description: 'Path to JSON config file (for unified/advanced scraping)'
|
||||
required: false
|
||||
output-dir:
|
||||
description: 'Output directory for generated skills'
|
||||
required: false
|
||||
default: 'output'
|
||||
extra-args:
|
||||
description: 'Additional CLI arguments to pass to skill-seekers'
|
||||
required: false
|
||||
default: ''
|
||||
|
||||
outputs:
|
||||
skill-dir:
|
||||
description: 'Path to the generated skill directory'
|
||||
value: ${{ steps.run.outputs.skill-dir }}
|
||||
skill-name:
|
||||
description: 'Name of the generated skill'
|
||||
value: ${{ steps.run.outputs.skill-name }}
|
||||
|
||||
runs:
|
||||
using: 'composite'
|
||||
steps:
|
||||
- name: Set up Python
|
||||
uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: '3.12'
|
||||
|
||||
- name: Install Skill Seekers
|
||||
shell: bash
|
||||
run: pip install skill-seekers
|
||||
|
||||
- name: Run Skill Seekers
|
||||
id: run
|
||||
shell: bash
|
||||
env:
|
||||
ANTHROPIC_API_KEY: ${{ env.ANTHROPIC_API_KEY }}
|
||||
OPENAI_API_KEY: ${{ env.OPENAI_API_KEY }}
|
||||
GOOGLE_API_KEY: ${{ env.GOOGLE_API_KEY }}
|
||||
GITHUB_TOKEN: ${{ env.GITHUB_TOKEN }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
|
||||
OUTPUT_DIR="${{ inputs.output-dir }}"
|
||||
mkdir -p "$OUTPUT_DIR"
|
||||
|
||||
CMD="${{ inputs.command }}"
|
||||
SOURCE="${{ inputs.source }}"
|
||||
TARGET="${{ inputs.target }}"
|
||||
CONFIG="${{ inputs.config }}"
|
||||
EXTRA="${{ inputs.extra-args }}"
|
||||
|
||||
# Build the command
|
||||
if [ "$CMD" = "create" ]; then
|
||||
skill-seekers create "$SOURCE" --target "$TARGET" --output "$OUTPUT_DIR" $EXTRA
|
||||
elif [ -n "$CONFIG" ]; then
|
||||
skill-seekers "$CMD" --config "$CONFIG" --target "$TARGET" --output "$OUTPUT_DIR" $EXTRA
|
||||
else
|
||||
skill-seekers "$CMD" "$SOURCE" --target "$TARGET" --output "$OUTPUT_DIR" $EXTRA
|
||||
fi
|
||||
|
||||
# Find the generated skill directory
|
||||
SKILL_DIR=$(find "$OUTPUT_DIR" -name "SKILL.md" -exec dirname {} \; | head -1)
|
||||
SKILL_NAME=$(basename "${SKILL_DIR:-unknown}")
|
||||
|
||||
echo "skill-dir=$SKILL_DIR" >> "$GITHUB_OUTPUT"
|
||||
echo "skill-name=$SKILL_NAME" >> "$GITHUB_OUTPUT"
|
||||
|
||||
echo "### Skill Generated" >> "$GITHUB_STEP_SUMMARY"
|
||||
echo "- **Name:** $SKILL_NAME" >> "$GITHUB_STEP_SUMMARY"
|
||||
echo "- **Directory:** $SKILL_DIR" >> "$GITHUB_STEP_SUMMARY"
|
||||
echo "- **Target:** $TARGET" >> "$GITHUB_STEP_SUMMARY"
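The skill-directory lookup at the end of the run step above can be tried locally before wiring it into a workflow. The following is a minimal sketch, not part of the action: `find_skill_dir` and `skill_name` are hypothetical helper names, and the `sort` is an added assumption to make the choice deterministic when several `SKILL.md` files exist (the action itself takes whichever match `find` reports first).

```bash
# Hypothetical helper: print the directory containing the first SKILL.md
# found under the given root, or print nothing if there is none.
find_skill_dir() {
  local root="$1" first
  first="$(find "$root" -name 'SKILL.md' -print | sort | head -n 1)"
  if [ -n "$first" ]; then
    dirname "$first"
  fi
}

# Hypothetical helper: derive the skill name from its directory,
# falling back to "unknown" when no directory was found.
skill_name() {
  local dir="$1"
  if [ -n "$dir" ]; then
    basename "$dir"
  else
    echo "unknown"
  fi
}

# Usage:
#   dir="$(find_skill_dir output)"
#   echo "skill-dir=$dir, skill-name=$(skill_name "$dir")"
```

Handling the empty case explicitly matters because `basename` succeeds on an empty string, so a `|| echo "unknown"` fallback after it would not fire.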
|
||||
107
distribution/smithery/README.md
Normal file
@@ -0,0 +1,107 @@
|
||||
# Skill Seekers — Smithery MCP Registry
|
||||
|
||||
Publishing guide for the Skill Seekers MCP server on [Smithery](https://smithery.ai).
|
||||
|
||||
## Status
|
||||
|
||||
- **Namespace created:** `yusufkaraaslan`
|
||||
- **Server created:** `yusufkaraaslan/skill-seekers`
|
||||
- **Server page:** https://smithery.ai/servers/yusufkaraaslan/skill-seekers
|
||||
- **Release status:** Needs re-publish. The initial release failed because Smithery could not scan the GitHub URL as an MCP endpoint.
|
||||
|
||||
## Publishing
|
||||
|
||||
Smithery requires a live, scannable MCP HTTP endpoint for URL-based publishing. Two options:
|
||||
|
||||
### Option A: Publish via Web UI (Recommended)
|
||||
|
||||
1. Go to https://smithery.ai/servers/yusufkaraaslan/skill-seekers/releases
|
||||
2. The server already exists — create a new release
|
||||
3. For the "Local" tab: follow the prompts to publish as a stdio server
|
||||
4. For the "URL" tab: provide a hosted HTTP endpoint URL
|
||||
|
||||
### Option B: Deploy HTTP endpoint first, then publish via CLI
|
||||
|
||||
1. Deploy the MCP server on Render/Railway/Fly.io:
|
||||
```bash
|
||||
# Using existing Dockerfile.mcp
|
||||
docker build -f Dockerfile.mcp -t skill-seekers-mcp .
|
||||
# Deploy to your hosting provider
|
||||
```
|
||||
2. Publish the live URL:
|
||||
```bash
|
||||
npx @smithery/cli@latest auth login
|
||||
npx @smithery/cli@latest mcp publish "https://your-deployed-url/mcp" \
|
||||
-n yusufkaraaslan/skill-seekers
|
||||
```
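
Before publishing, it may help to confirm that the deployment actually responds. The sketch below only builds the two URLs involved; `health_url` and `mcp_url` are hypothetical helper names, `/mcp` is taken from the publish command above, and `/health` is assumed from the `healthCheckPath` in `render-mcp.yaml`. Whether your deployment serves both paths is something to verify, not something this snippet guarantees.

```bash
# Hypothetical helpers: derive endpoint URLs from a deployment base URL.
# A single trailing slash on the base URL is stripped before joining.
health_url() {
  local base="${1%/}"
  printf '%s/health\n' "$base"
}

mcp_url() {
  local base="${1%/}"
  printf '%s/mcp\n' "$base"
}

# Usage (requires network access and a real deployment):
#   BASE_URL="https://your-deployed-url"
#   curl -fsS "$(health_url "$BASE_URL")" >/dev/null && echo "health check passed"
#   npx @smithery/cli@latest mcp publish "$(mcp_url "$BASE_URL")" -n yusufkaraaslan/skill-seekers
```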
|
||||
|
||||
### CLI Authentication (already done)
|
||||
|
||||
```bash
|
||||
# Install via npx (no global install needed)
|
||||
npx @smithery/cli@latest auth login
|
||||
npx @smithery/cli@latest namespace show # Should show: yusufkaraaslan
|
||||
```
|
||||
|
||||
### After Publishing
|
||||
|
||||
Update the server page with metadata:
|
||||
|
||||
**Display name:** Skill Seekers — AI Skill & RAG Toolkit
|
||||
|
||||
**Description:**
|
||||
> Transform 17 source types into AI-ready skills and RAG knowledge. Ingest documentation sites, GitHub repos, PDFs, Jupyter notebooks, videos, Confluence, Notion, Slack/Discord exports, and more. Package for 16+ LLM platforms including Claude, GPT, Gemini, LangChain, LlamaIndex, and vector databases.
|
||||
|
||||
**Tags:** `ai`, `rag`, `documentation`, `skills`, `preprocessing`, `mcp`, `knowledge-base`, `vector-database`
|
||||
|
||||
## User Installation
|
||||
|
||||
Once published, users can add the server to their MCP client:
|
||||
|
||||
```bash
|
||||
# Via Smithery CLI (adds to Claude Desktop, Cursor, etc.)
|
||||
smithery mcp add yusufkaraaslan/skill-seekers --client claude
|
||||
|
||||
# Or configure manually — users need skill-seekers installed:
|
||||
pip install "skill-seekers[mcp]"
|
||||
```
|
||||
|
||||
### Manual MCP Configuration
|
||||
|
||||
For clients that use JSON config (Claude Desktop, Claude Code, Cursor):
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"skill-seekers": {
|
||||
"command": "python",
|
||||
"args": ["-m", "skill_seekers.mcp.server_fastmcp"]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
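
The configuration above can also be written from a script. This is a minimal sketch under stated assumptions: `write_mcp_config` is a hypothetical helper, the target path is up to you (config locations differ per client, so check your client's documentation), and the function overwrites the target file rather than merging into an existing one.

```bash
# Hypothetical helper: write the stdio server entry shown above to a file.
# WARNING: this replaces the target file; it does not merge existing entries.
write_mcp_config() {
  local target="$1"
  mkdir -p "$(dirname "$target")"
  cat > "$target" <<'JSON'
{
  "mcpServers": {
    "skill-seekers": {
      "command": "python",
      "args": ["-m", "skill_seekers.mcp.server_fastmcp"]
    }
  }
}
JSON
}

# Usage: write a project-level config, then confirm it parses as JSON.
#   write_mcp_config ./.mcp.json
#   python -m json.tool ./.mcp.json >/dev/null && echo "valid JSON"
```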
|
||||
|
||||
## Available Tools (35)
|
||||
|
||||
| Category | Tools | Description |
|
||||
|----------|-------|-------------|
|
||||
| Config | 3 | Generate, list, validate scraping configs |
|
||||
| Sync | 1 | Sync config URLs against live docs |
|
||||
| Scraping | 11 | Scrape docs, GitHub, PDF, video, codebase, generic (10 types) |
|
||||
| Packaging | 4 | Package, upload, enhance, install skills |
|
||||
| Splitting | 2 | Split large configs, generate routers |
|
||||
| Sources | 5 | Fetch, submit, manage config sources |
|
||||
| Vector DB | 4 | Export to Weaviate, Chroma, FAISS, Qdrant |
|
||||
| Workflows | 5 | List, get, create, update, delete workflows |
|
||||
|
||||
## Maintenance
|
||||
|
||||
- Update description/tags on major releases
|
||||
- No code changes needed: users get the latest published release whenever they install or upgrade with `pip install -U skill-seekers`
|
||||
|
||||
## Notes
|
||||
|
||||
- Smithery CLI v4.7.0 removed the `--transport stdio` flag from the docs
|
||||
- The CLI `publish` command only supports URL-based (external) publishing
|
||||
- For local/stdio servers, use the web UI at smithery.ai/servers/new
|
||||
- The namespace and server entity are already created; only the release needs to succeed
|
||||
17
render-mcp.yaml
Normal file
@@ -0,0 +1,17 @@
|
||||
services:
|
||||
# MCP Server Service (HTTP mode)
|
||||
- type: web
|
||||
name: skill-seekers-mcp
|
||||
runtime: docker
|
||||
plan: free
|
||||
dockerfilePath: ./Dockerfile.mcp
|
||||
envVars:
|
||||
- key: MCP_PORT
|
||||
value: "8765"
|
||||
- key: PORT
|
||||
fromService:
|
||||
type: web
|
||||
name: skill-seekers-mcp
|
||||
property: port
|
||||
healthCheckPath: /health
|
||||
autoDeploy: true
|
||||