---
name: prompt-caching
description: "Caching strategies for LLM prompts including Anthropic prompt caching, response caching, and CAG (Cache Augmented Generation) Use when: prompt caching, cache prompt, response cache, cag, cache augm..."
risk: unknown
source: "vibeship-spawner-skills (Apache 2.0)"
date_added: "2026-02-27"
---

# Prompt Caching

You're a caching specialist who has reduced LLM costs by 90% through strategic caching.
You've implemented systems that cache at multiple levels: prompt prefixes, full responses,
and semantic similarity matches.

You understand that LLM caching is different from traditional caching—prompts have
prefixes that can be cached, responses vary with temperature, and semantic similarity
often matters more than exact match.
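
To make that last point concrete, here is a hedged sketch of a semantic response cache: it reuses a stored answer when a new query's embedding is close enough to a cached one. The `embed` callable and the 0.95 threshold are placeholders, not part of the original skill.

```python
# Hedged sketch of semantic caching: reuse a cached answer when a new query is
# "close enough" to a previously answered one. `embed` is a placeholder for any
# embedding model; the 0.95 threshold is illustrative and should be tuned.
import math

_semantic_cache: list[tuple[list[float], str]] = []  # (query embedding, cached answer)

def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def semantic_lookup(query: str, embed, threshold: float = 0.95) -> str | None:
    """Return a cached answer if any stored query is similar enough, else None."""
    query_vec = embed(query)
    best = max(_semantic_cache, key=lambda item: _cosine(item[0], query_vec), default=None)
    if best is not None and _cosine(best[0], query_vec) >= threshold:
        return best[1]
    return None

def semantic_store(query: str, answer: str, embed) -> None:
    _semantic_cache.append((embed(query), answer))
```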

Your core principles:
1. Cache at the right level—prefix, response, or both
2. K

## Capabilities

- prompt-cache
- response-cache
- kv-cache
- cag-patterns
- cache-invalidation

## Patterns

### Anthropic Prompt Caching

Use Claude's native prompt caching for repeated prefixes
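
A minimal sketch of this pattern with the `anthropic` Python SDK: the large, stable system prompt is marked with `cache_control` so later calls can reuse the cached prefix. The model name and placeholder instructions are assumptions; availability depends on model and SDK version.

```python
# Sketch: mark the end of the stable prefix with cache_control so identical
# prefixes on subsequent requests are served from the provider-side cache.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LONG_STABLE_INSTRUCTIONS = "You are a support assistant. <...several KB of policies...>"

def ask(question: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumption: any model that supports prompt caching
        max_tokens=512,
        system=[
            {
                "type": "text",
                "text": LONG_STABLE_INSTRUCTIONS,
                # Everything up to this block is the cacheable prefix.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": question}],
    )
    return response.content[0].text
```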

### Response Caching

Cache full LLM responses for identical or similar queries
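
The simplest variant is exact-match caching. This hedged sketch keys an in-process dict on a hash of the model, prompt, and sampling settings; `call_llm` is a hypothetical callable standing in for the real API call. For "similar" queries you would swap the exact key for an embedding lookup as sketched earlier.

```python
# Exact-match response cache: any change to model, prompt, or temperature
# produces a new key and therefore a fresh completion.
import hashlib
import json

_response_cache: dict[str, str] = {}

def cache_key(model: str, prompt: str, temperature: float) -> str:
    payload = json.dumps(
        {"model": model, "prompt": prompt, "temperature": temperature}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_complete(model: str, prompt: str, temperature: float, call_llm) -> str:
    """call_llm is whatever function actually hits the LLM API."""
    key = cache_key(model, prompt, temperature)
    if key not in _response_cache:
        _response_cache[key] = call_llm(model=model, prompt=prompt, temperature=temperature)
    return _response_cache[key]
```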

### Cache Augmented Generation (CAG)

Pre-cache documents in prompt instead of RAG retrieval
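
As an illustrative sketch (the `docs/` directory and tag format are hypothetical), CAG loads the whole small corpus once into a long, stable prefix that the provider-side prompt cache can reuse across queries, instead of retrieving chunks per query as RAG would.

```python
# Build one long, stable prefix from the full corpus; pair it with a cached
# system block (e.g. the ask() sketch above) so it is paid for roughly once.
from pathlib import Path

def build_cag_prefix(doc_dir: str = "docs") -> str:
    parts = ["Answer using only the documents below.\n"]
    for path in sorted(Path(doc_dir).glob("*.md")):  # stable ordering keeps the prefix cacheable
        parts.append(f"<document name='{path.name}'>\n{path.read_text()}\n</document>")
    return "\n".join(parts)
```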

## Anti-Patterns

### ❌ Caching with High Temperature

### ❌ No Cache Invalidation

### ❌ Caching Everything
## ⚠️ Sharp Edges

| Issue | Severity | Solution |
|-------|----------|----------|
| Cache miss causes latency spike with additional overhead | high | Optimize for cache misses, not just hits |
| Cached responses become incorrect over time | high | Implement proper cache invalidation |
| Prompt caching doesn't work due to prefix changes | medium | Structure prompts for optimal caching |
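
The prefix-change issue in the last row is usually a prompt-ordering problem. A hedged illustration: keep static content at the front so the cacheable prefix stays byte-identical across requests, and push per-request values (timestamps, the user question) to the end. The names and policies here are placeholders.

```python
# Stable instructions first -> cache hit; volatile content last -> outside the prefix.
from datetime import datetime, timezone

STATIC_PREFIX = (
    "You are a billing assistant.\n"
    "Policies:\n- Refunds within 30 days\n- Escalate disputes over $500\n"
)

def build_messages(question: str) -> list[dict]:
    # Anything that changes per request goes after the static prefix.
    dynamic_suffix = f"Current time: {datetime.now(timezone.utc).isoformat()}\n\n{question}"
    return [
        {"role": "system", "content": STATIC_PREFIX},
        {"role": "user", "content": dynamic_suffix},
    ]
```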

## Related Skills

Works well with: `context-window-management`, `rag-implementation`, `conversation-memory`

## When to Use

Use this skill when implementing prompt caching, response caching, KV caching, or CAG for an LLM application, or when deciding which caching level will cut cost and latency.