skill-seekers-reference/docs/zh-CN/user-guide/04-packaging.md
yusyus 064405c052 fix: resolve 18 bugs and code quality issues across adaptors, CLI, and chunking pipeline
Bug fixes:
- Fix --var flag silently dropped in create routing (args.workflow_var → args.var)
- Fix double _score_code_quality() call in word scraper
- Add .docx file extension validation in WordToSkillConverter
- Fix weaviate ImportError masked by generic Exception handler
- Fix RAG chunking crash using non-existent converter.output_dir

Chunking pipeline improvements:
- Wire --chunk-overlap-tokens through entire package pipeline
  (package_skill → adaptor.package → format_skill_md → _maybe_chunk_content → RAGChunker)
- Add auto-scaling overlap: max(50, chunk_tokens//10) when chunk size is non-default
- Rename --no-preserve-code to --no-preserve-code-blocks (backward-compat alias kept)
- Replace hardcoded 512/50 chunk defaults with DEFAULT_CHUNK_TOKENS/DEFAULT_CHUNK_OVERLAP_TOKENS
  constants across all 12 concrete adaptors, rag_chunker, base, and package_skill

Code quality:
- Extract shared _generate_openai_embeddings() and _generate_st_embeddings() to SkillAdaptor
  base class, removing ~150 lines of duplication from chroma/weaviate/pinecone
- Add Pinecone adaptor with full upload support (pinecone_adaptor.py)

Tests (14 new):
- chunk_overlap_tokens parameter wiring, auto-scaling overlap, preserve_code_blocks flag
- .docx/.doc/no-extension file validation, --var flag routing E2E
- Embedding method inheritance verification, backward-compatible flag aliases

Docs:
- Update CHANGELOG, CLI_REFERENCE, API_REFERENCE, packaging guide (EN+ZH)
- Update README test count badge (1880+ → 2283+)

All 2283 tests passing, 8 skipped, 0 failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 21:57:59 +03:00


Packaging Guide

Skill Seekers v3.1.0
Export skills to AI platforms and vector databases


Overview

Packaging converts your skill directory into a platform-specific format:

output/my-skill/ ──▶ Packager ──▶ output/my-skill-{platform}.{format}
    ↓                                ↓
(SKILL.md +        Platform-specific  (ZIP, tar.gz,
 references)        formatting        directories,
                                     FAISS index)

Supported Platforms

Platform       | Format         | Extension | Best For
---------------|----------------|-----------|------------------------
Claude AI      | ZIP + YAML     | .zip      | Claude Code, Claude API
Google Gemini  | tar.gz         | .tar.gz   | Gemini skills
OpenAI ChatGPT | ZIP + Vector   | .zip      | Custom GPTs
LangChain      | Documents      | directory | RAG pipelines
LlamaIndex     | TextNodes      | directory | Query engines
Haystack       | Documents      | directory | Enterprise RAG
Pinecone       | Markdown       | .zip      | Vector upsert
ChromaDB       | Collection     | .zip      | Local vector DB
Weaviate       | Objects        | .zip      | Vector database
Qdrant         | Points         | .zip      | Vector database
FAISS          | Index          | .faiss    | Local similarity
Markdown       | ZIP            | .zip      | Universal export
Cursor         | .cursorrules   | file      | IDE AI context
Windsurf       | .windsurfrules | file      | IDE AI context
Cline          | .clinerules    | file      | VS Code AI

Basic Packaging

Package for Claude (Default)

# Default packaging
skill-seekers package output/my-skill/

# Explicit target
skill-seekers package output/my-skill/ --target claude

# Output: output/my-skill-claude.zip

Package for Other Platforms

# Google Gemini
skill-seekers package output/my-skill/ --target gemini
# Output: output/my-skill-gemini.tar.gz

# OpenAI
skill-seekers package output/my-skill/ --target openai
# Output: output/my-skill-openai.zip

# LangChain
skill-seekers package output/my-skill/ --target langchain
# Output: output/my-skill-langchain/ directory

# ChromaDB
skill-seekers package output/my-skill/ --target chroma
# Output: output/my-skill-chroma.zip
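
The naming pattern in the examples above (output/my-skill-{platform}.{format}) can be sketched in Python as follows. This is an illustrative helper, not part of Skill Seekers: the function name expected_output and the OUTPUT_SUFFIX table are assumptions, and the suffixes are copied only from the outputs shown in this guide.

```python
from pathlib import PurePosixPath

# Suffix per target, taken from the example outputs in this guide.
# "/" marks targets that produce a directory instead of an archive.
OUTPUT_SUFFIX = {
    "claude": ".zip",
    "gemini": ".tar.gz",
    "openai": ".zip",
    "langchain": "/",
    "chroma": ".zip",
    "weaviate": ".zip",
}


def expected_output(skill_dir: str, target: str = "claude") -> str:
    """Return the path the guide shows for a packaged skill (hypothetical helper)."""
    skill = PurePosixPath(skill_dir)  # a trailing slash is ignored
    suffix = OUTPUT_SUFFIX[target]    # KeyError for targets not listed above
    return f"{skill.parent}/{skill.name}-{target}{suffix}"
```

For example, expected_output("output/my-skill/", "gemini") yields output/my-skill-gemini.tar.gz, matching the comment in the Gemini example.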

Multi-Platform Packaging

Package for All Platforms

# Create skill once
skill-seekers create <source>

# Package for multiple platforms
for platform in claude gemini openai langchain; do
  echo "Packaging for $platform..."
  skill-seekers package output/my-skill/ --target "$platform"
done

# Results:
# output/my-skill-claude.zip
# output/my-skill-gemini.tar.gz
# output/my-skill-openai.zip
# output/my-skill-langchain/

Batch Packaging Script

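#!/bin/bash
SKILL_DIR="output/my-skill"
PLATFORMS="claude gemini openai langchain llama-index chroma"
failed=0

for platform in $PLATFORMS; do
  echo "▶️ Packaging for $platform..."

  # Test the command directly rather than inspecting $? afterwards
  if skill-seekers package "$SKILL_DIR" --target "$platform"; then
    echo "✅ $platform done"
  else
    echo "❌ $platform failed"
    failed=1
  fi
done

# Only report success if every platform packaged cleanly
if [ "$failed" -eq 0 ]; then
  echo "🎉 All platforms packaged!"
else
  echo "⚠️ Some platforms failed"
  exit 1
fi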
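
The same batch loop could also be driven from Python, which makes it easier to collect per-platform results. The sketch below is a minimal illustration: build_package_command and package_all are hypothetical names, and the only CLI syntax it uses is the package command and --target flag shown in this guide. The run parameter is injectable so the loop can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable, Dict, Iterable, List, Optional


def build_package_command(skill_dir: str, target: str) -> List[str]:
    """Build the argv for one packaging run (same flags as the shell script)."""
    return ["skill-seekers", "package", skill_dir, "--target", target]


def package_all(
    skill_dir: str,
    targets: Iterable[str],
    run: Optional[Callable[[List[str]], int]] = None,
) -> Dict[str, bool]:
    """Package for each target and report success per target.

    `run` takes an argv list and returns an exit code; by default it
    shells out with subprocess.run.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd).returncode
    results: Dict[str, bool] = {}
    for target in targets:
        results[target] = run(build_package_command(skill_dir, target)) == 0
    return results
```

Calling package_all("output/my-skill", ["claude", "gemini"]) returns a dict such as {"claude": True, "gemini": False}, so a caller can decide whether to retry or abort.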

Packaging Options

Skip Quality Check

# Skip validation (faster)
skill-seekers package output/my-skill/ --skip-quality-check

Don't Open Output Folder

# Prevent opening folder after packaging
skill-seekers package output/my-skill/ --no-open

Auto-Upload After Packaging

# Package and upload
export ANTHROPIC_API_KEY=sk-ant-...
skill-seekers package output/my-skill/ --target claude --upload

Streaming Mode

For very large skills, use streaming to reduce memory usage:

# Enable streaming
skill-seekers package output/large-skill/ --streaming

# Custom chunk size
skill-seekers package output/large-skill/ \
  --streaming \
  --streaming-chunk-chars 2000 \
  --streaming-overlap-chars 100

When to use:

  • Skills > 500 pages
  • Limited RAM (< 8GB)
  • Batch processing many skills
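
To make the two streaming parameters concrete, here is a minimal sketch of character-based chunking with overlap. It is an assumption about what --streaming-chunk-chars and --streaming-overlap-chars mean, not the tool's actual implementation; the function name stream_chunks is hypothetical.

```python
from typing import Iterator


def stream_chunks(text: str, chunk_chars: int = 2000, overlap_chars: int = 100) -> Iterator[str]:
    """Yield fixed-size character chunks, each sharing `overlap_chars`
    characters with the previous chunk.

    Being a generator, it hands out one chunk at a time instead of
    building the full list in memory.
    """
    if chunk_chars <= 0:
        raise ValueError("chunk_chars must be positive")
    if not 0 <= overlap_chars < chunk_chars:
        raise ValueError("overlap_chars must be >= 0 and smaller than chunk_chars")

    step = chunk_chars - overlap_chars
    start = 0
    while start < len(text):
        yield text[start:start + chunk_chars]
        if start + chunk_chars >= len(text):
            break  # the chunk just yielded reached the end of the text
        start += step
```

With chunk_chars=4 and overlap_chars=1, the text "abcdefghij" is split into "abcd", "defg", and "ghij": each chunk starts with the last character of the previous one.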

RAG Chunking

Optimize for Retrieval-Augmented Generation:

# Enable semantic chunking
skill-seekers package output/my-skill/ \
  --target langchain \
  --chunk-for-rag \
  --chunk-tokens 512

# Custom chunk size
skill-seekers package output/my-skill/ \
  --target chroma \
  --chunk-tokens 256 \
  --chunk-overlap-tokens 50

Chunking Options:

Option                    | Default | Description
--------------------------|---------|--------------------------------
--chunk-for-rag           | auto    | Enable chunking
--chunk-tokens            | 512     | Tokens per chunk
--chunk-overlap-tokens    | 50      | Overlap between chunks (tokens)
--no-preserve-code-blocks | -       | Allow splitting code blocks

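Auto-scaling overlap: When --chunk-tokens is set to a non-default value but --chunk-overlap-tokens is left at its default (50), the overlap is automatically scaled to max(50, chunk_tokens / 10) to preserve more context in larger chunks.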
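
The auto-scaling rule can be written out as a short sketch. The defaults 512 and 50 come from the options table above; the function name effective_overlap is hypothetical, and integer division is assumed for chunk_tokens / 10, so treat this as an illustration of the rule rather than the packaged implementation.

```python
DEFAULT_CHUNK_TOKENS = 512
DEFAULT_CHUNK_OVERLAP_TOKENS = 50


def effective_overlap(
    chunk_tokens: int = DEFAULT_CHUNK_TOKENS,
    chunk_overlap_tokens: int = DEFAULT_CHUNK_OVERLAP_TOKENS,
) -> int:
    """Return the overlap that would be used for the given settings.

    The overlap is scaled only when the chunk size was changed AND the
    overlap was left at its default; an explicit overlap always wins.
    """
    chunk_size_changed = chunk_tokens != DEFAULT_CHUNK_TOKENS
    overlap_untouched = chunk_overlap_tokens == DEFAULT_CHUNK_OVERLAP_TOKENS
    if chunk_size_changed and overlap_untouched:
        # Assumption: integer division, so 1024 tokens gives 102
        return max(DEFAULT_CHUNK_OVERLAP_TOKENS, chunk_tokens // 10)
    return chunk_overlap_tokens
```

Under these assumptions, --chunk-tokens 1024 gives an overlap of 102 tokens, while --chunk-tokens 256 stays at 50 because 25 is below the floor.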


Platform-Specific Details

Claude AI

skill-seekers package output/my-skill/ --target claude

Upload:

# Auto-upload
skill-seekers package output/my-skill/ --target claude --upload

# Manual upload
skill-seekers upload output/my-skill-claude.zip --target claude

Format:

  • ZIP archive
  • Contains SKILL.md + references/
  • Includes YAML manifest

Google Gemini

skill-seekers package output/my-skill/ --target gemini

Upload:

export GOOGLE_API_KEY=AIza...
skill-seekers upload output/my-skill-gemini.tar.gz --target gemini

Format:

  • tar.gz archive
  • Optimized for Gemini's format

OpenAI ChatGPT

skill-seekers package output/my-skill/ --target openai

Upload:

export OPENAI_API_KEY=sk-...
skill-seekers upload output/my-skill-openai.zip --target openai

Format:

  • ZIP with vector embeddings
  • Ready for Assistants API

LangChain

skill-seekers package output/my-skill/ --target langchain

Usage:

# Recent LangChain releases ship the loaders in the langchain-community package
from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader("output/my-skill-langchain/")
docs = loader.load()

# Use in RAG pipeline

Format:

  • Directory of Document objects
  • JSON metadata

ChromaDB

skill-seekers package output/my-skill/ --target chroma

Upload:

# Local ChromaDB
skill-seekers upload output/my-skill-chroma.zip --target chroma

# With custom URL
skill-seekers upload output/my-skill-chroma.zip \
  --target chroma \
  --chroma-url http://localhost:8000

Usage:

import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_collection("my-skill")

Weaviate

skill-seekers package output/my-skill/ --target weaviate

Upload:

# Local Weaviate
skill-seekers upload output/my-skill-weaviate.zip --target weaviate

# Weaviate Cloud
skill-seekers upload output/my-skill-weaviate.zip \
  --target weaviate \
  --use-cloud \
  --cluster-url https://xxx.weaviate.network

Cursor IDE

# Package (actually creates .cursorrules file)
skill-seekers package output/my-skill/ --target cursor

# Or install directly
skill-seekers install-agent output/my-skill/ --agent cursor

Result: .cursorrules file in your project root.


Windsurf IDE

skill-seekers install-agent output/my-skill/ --agent windsurf

Result: .windsurfrules file in your project root.


Quality Check

Before packaging, skills are validated:

# Check quality
skill-seekers quality output/my-skill/

# Detailed report
skill-seekers quality output/my-skill/ --report

# Set minimum threshold
skill-seekers quality output/my-skill/ --threshold 7.0

Quality Metrics:

  • SKILL.md completeness
  • Code example coverage
  • Navigation structure
  • Reference file organization

Output Structure

After Packaging

output/
├── my-skill/                    # Source skill
│   ├── SKILL.md
│   └── references/
│
├── my-skill-claude.zip          # Claude package
├── my-skill-gemini.tar.gz       # Gemini package
├── my-skill-openai.zip          # OpenAI package
├── my-skill-langchain/          # LangChain directory
├── my-skill-chroma.zip          # ChromaDB package
└── my-skill-weaviate.zip        # Weaviate package

Troubleshooting

"Package validation failed"

Problem: SKILL.md is missing or malformed

Solution:

# Check skill structure
ls output/my-skill/

# Rebuild if needed
skill-seekers create --config my-config --skip-scrape

# Or recreate
skill-seekers create <source>

"Target platform not supported"

Problem: Typo in target name

Solution:

# Check available targets
skill-seekers package --help

# Common targets: claude, gemini, openai, langchain, chroma, weaviate
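
When the cause is a typo, a wrapper script could suggest the nearest valid name before calling the CLI. The sketch below uses Python's standard difflib module; the COMMON_TARGETS list is copied from the comment above and is not the full set of supported targets, and suggest_target is a hypothetical helper.

```python
import difflib
from typing import Optional

# Only the common targets named in this guide; the CLI supports more.
COMMON_TARGETS = ["claude", "gemini", "openai", "langchain", "chroma", "weaviate"]


def suggest_target(name: str) -> Optional[str]:
    """Return the closest known target name, or None if nothing is similar."""
    matches = difflib.get_close_matches(name.lower(), COMMON_TARGETS, n=1)
    return matches[0] if matches else None
```

For instance, suggest_target("claud") returns "claude", while a string with no resemblance to any target returns None.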

"Upload failed"

Problem: Missing API key

Solution:

# Set API key
export ANTHROPIC_API_KEY=sk-ant-...
export GOOGLE_API_KEY=AIza...
export OPENAI_API_KEY=sk-...

# Try again
skill-seekers upload output/my-skill-claude.zip --target claude
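
A pre-flight check can catch a missing key before the upload starts. This sketch maps each target to the environment variable shown in this guide; missing_api_key is a hypothetical helper, and targets without an entry (for example a local ChromaDB) are assumed to need no key.

```python
import os
from typing import Mapping, Optional

# Environment variable per upload target, as listed in this guide.
API_KEY_ENV = {
    "claude": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def missing_api_key(target: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset variable for `target`, or None if all is well.

    `env` defaults to the real process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    var = API_KEY_ENV.get(target)
    if var is None:
        return None  # assumption: this target needs no API key
    return None if env.get(var) else var
```

A caller might print "Set ANTHROPIC_API_KEY before uploading" when the function returns a name, and proceed with skill-seekers upload otherwise.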

"Out of memory"

Problem: Skill too large for memory

Solution:

# Use streaming mode
skill-seekers package output/my-skill/ --streaming

# Smaller chunks
skill-seekers package output/my-skill/ --streaming --streaming-chunk-chars 1000

Best Practices

1. Package Once, Use Everywhere

# Create once
skill-seekers create <source>

# Package for all needed platforms
for platform in claude gemini langchain; do
  skill-seekers package output/my-skill/ --target "$platform"
done

2. Check Quality Before Packaging

# Validate first
skill-seekers quality output/my-skill/ --threshold 6.0

# Then package
skill-seekers package output/my-skill/
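
The validate-then-package sequence can be chained so packaging runs only when the quality check passes. The sketch below makes one loud assumption that this guide does not confirm: that skill-seekers quality exits with a non-zero status when the score is below --threshold. The function name quality_then_package is hypothetical, and the run parameter is injectable for testing.

```python
import subprocess
from typing import Callable, List, Optional


def quality_then_package(
    skill_dir: str,
    threshold: float = 6.0,
    run: Optional[Callable[[List[str]], int]] = None,
) -> bool:
    """Run the quality check, then package only if it succeeded.

    ASSUMPTION: the quality command returns a non-zero exit code when
    the skill scores below the threshold.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd).returncode

    quality_cmd = ["skill-seekers", "quality", skill_dir, "--threshold", str(threshold)]
    package_cmd = ["skill-seekers", "package", skill_dir]

    if run(quality_cmd) != 0:
        return False  # skip packaging when validation fails
    return run(package_cmd) == 0
```

If the assumption about exit codes does not hold for your version, the quality output would need to be parsed instead.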

3. Use Streaming for Large Skills

# Streaming is detected automatically, but you can force it
skill-seekers package output/large-skill/ --streaming

4. Keep Original Skill Directory

Don't delete output/my-skill/ after packaging - you might want to:

  • Re-package for other platforms
  • Apply different workflows
  • Update and re-enhance

Next Steps