# Prompt Engineering Patterns
Specific prompt techniques with example inputs and expected outputs.
## Patterns Index
- Zero-Shot Prompting
- Few-Shot Prompting
- Chain-of-Thought (CoT)
- Role Prompting
- Structured Output
- Self-Consistency
- ReAct (Reasoning + Acting)
- Tree of Thoughts
- Retrieval-Augmented Generation
- Meta-Prompting
## 1. Zero-Shot Prompting

**When to use:** Simple, well-defined tasks where the model has sufficient training knowledge.

**Pattern:**
[Task instruction]
[Input]
**Example:**
Input:
Classify the following customer review as positive, negative, or neutral.
Review: "The shipping was fast but the product quality was disappointing."
**Expected Output:**

negative

**Best practices:**
- Be explicit about output format
- Use clear, unambiguous verbs (classify, extract, summarize)
- Specify constraints (word limits, format requirements)
**When to avoid:**
- Tasks requiring specific formatting the model hasn't seen
- Domain-specific tasks requiring specialized knowledge
- Tasks where consistency is critical
## 2. Few-Shot Prompting

**When to use:** Tasks requiring consistent formatting or domain-specific patterns.

**Pattern:**
[Task description]
Example 1:
Input: [example input]
Output: [example output]
Example 2:
Input: [example input]
Output: [example output]
Now process:
Input: [actual input]
Output:
**Example:**
Input:
Extract the company name and founding year from the text.
Example 1:
Input: "Apple Inc. was founded in 1976 by Steve Jobs."
Output: {"company": "Apple Inc.", "year": 1976}
Example 2:
Input: "Microsoft Corporation started in 1975."
Output: {"company": "Microsoft Corporation", "year": 1975}
Example 3:
Input: "Founded in 1994, Amazon has grown into a tech giant."
Output: {"company": "Amazon", "year": 1994}
Now process:
Input: "Tesla, Inc. was established in 2003 by Martin Eberhard."
Output:
**Expected Output:**

{"company": "Tesla, Inc.", "year": 2003}

**Example selection guidelines:**
| Example Type | Purpose | Count |
|---|---|---|
| Simple/typical | Establish basic pattern | 1-2 |
| Edge case | Handle ambiguity | 1 |
| Different format | Show variations | 1 |
| **Total** | | **3-5** |

**Common mistakes:**
- Too many examples (wastes tokens, may confuse)
- Inconsistent formatting between examples
- Examples too similar (model doesn't learn variation)
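The pattern above can be sketched as a helper that renders every example with identical labels, which guards against the inconsistent-formatting mistake; `build_few_shot_prompt` is a hypothetical name.

```python
def build_few_shot_prompt(task: str, examples: list[tuple[str, str]],
                          actual_input: str) -> str:
    """Render the few-shot pattern: task description, numbered examples
    with consistent Input/Output labels, then the real input."""
    lines = [task, ""]
    for i, (example_in, example_out) in enumerate(examples, start=1):
        lines += [f"Example {i}:", f"Input: {example_in}",
                  f"Output: {example_out}", ""]
    # End on a bare "Output:" so the model completes the final answer.
    lines += ["Now process:", f"Input: {actual_input}", "Output:"]
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Extract the company name and founding year from the text.",
    [('"Apple Inc. was founded in 1976 by Steve Jobs."',
      '{"company": "Apple Inc.", "year": 1976}'),
     ('"Microsoft Corporation started in 1975."',
      '{"company": "Microsoft Corporation", "year": 1975}')],
    '"Tesla, Inc. was established in 2003 by Martin Eberhard."',
)
```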
## 3. Chain-of-Thought (CoT)

**When to use:** Math problems, logical reasoning, multi-step analysis.

**Pattern:**
[Problem]
Think through this step by step:
**Example:**
Input:
A store sells apples for $2 each. If you buy 5 or more, you get 20% off.
How much do 7 apples cost?
Think through this step by step:
**Expected Output:**
Step 1: Calculate the regular price for 7 apples
7 apples × $2 = $14
Step 2: Check if discount applies
7 apples ≥ 5 apples, so 20% discount applies
Step 3: Calculate the discount amount
20% of $14 = 0.20 × $14 = $2.80
Step 4: Calculate final price
$14 - $2.80 = $11.20
The 7 apples cost $11.20
**Variations:**
| Variation | Trigger Phrase | Use Case |
|---|---|---|
| Basic CoT | "Think step by step" | General reasoning |
| Explicit steps | "Break this into numbered steps" | Complex procedures |
| Show work | "Show your calculations" | Math problems |
| Reasoning chain | "Explain your reasoning" | Justification needed |
**Zero-shot CoT:** Simply append "Let's think step by step." to any prompt.

## 4. Role Prompting

**When to use:** Tasks requiring expertise, specific perspective, or consistent persona.

**Pattern:**
You are a [role/expert description].
[Context about the task]
[Specific instruction]
**Example:**
Input:
You are a senior security engineer with 15 years of experience in web application security.
Review the following code snippet for security vulnerabilities:
```python
def login(username, password):
query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
result = db.execute(query)
return result
```

Identify vulnerabilities and provide specific fixes.

**Expected Output:**

Security Vulnerabilities Identified:

1. **SQL Injection (Critical)**
   - The code directly interpolates user input into the SQL query
   - An attacker can input: `username=' OR '1'='1' --`
   - This bypasses authentication entirely
   - Fix:

     ```python
     def login(username, password):
         query = "SELECT * FROM users WHERE username=? AND password=?"
         result = db.execute(query, (username, password))
         return result
     ```

2. **Plain-text Password Storage (Critical)**
   - Passwords appear to be stored and compared in plain text
   - Fix: Use bcrypt or argon2 for password hashing

3. **Missing Rate Limiting (Medium)**
   - No protection against brute force attacks
   - Fix: Implement login attempt throttling
**Effective role descriptions:**
| Weak | Strong |
|------|--------|
| "You are an expert" | "You are a senior data scientist with 10 years of experience in NLP" |
| "You are helpful" | "You are a patient teacher explaining concepts to beginners" |
| "You know about X" | "You are a certified AWS solutions architect specializing in serverless" |
---
## 5. Structured Output
**When to use:** When you need parseable responses (JSON, XML, CSV).
**Pattern:**
[Task instruction]
Respond in JSON format with exactly these fields:
- field1 (type): description
- field2 (type): description
[Input]
Return ONLY valid JSON, no markdown or explanation.
**Example:**
Input:
Extract meeting details from this email.
Respond in JSON format with exactly these fields:
- date (string, ISO format): Meeting date
- time (string, 24h format): Meeting time
- attendees (array of strings): List of attendees
- topic (string): Meeting topic
- location (string or null): Meeting location if mentioned
Email: "Hi team, let's meet tomorrow at 2pm to discuss Q4 planning. Sarah, Mike, and Lisa should attend. We'll use Conference Room B."
Today's date is 2024-01-15.
Return ONLY valid JSON, no markdown or explanation.
**Expected Output:**
```json
{
"date": "2024-01-16",
"time": "14:00",
"attendees": ["Sarah", "Mike", "Lisa"],
"topic": "Q4 planning",
"location": "Conference Room B"
}
```

**Format enforcement techniques:**

```text
# Strong enforcement
"Return ONLY valid JSON. Start with { and end with }"

# Schema validation hint
"The output must be valid JSON matching this TypeScript type:
type Output = { name: string; age: number; active: boolean }"

# Negative instruction
"Do NOT include markdown code blocks. Do NOT add explanations."
```
## 6. Self-Consistency

**When to use:** Complex reasoning where multiple valid paths exist.

**Pattern:**
1. Generate multiple reasoning paths (temperature > 0)
2. Extract final answers from each path
3. Select the most common answer (majority vote)

**Example approach:**

```text
# Run this prompt 5 times with temperature=0.7
Solve this logic puzzle. Think through it step by step.

Three friends (Alice, Bob, Carol) each have a different pet (cat, dog, bird).
- Alice doesn't have the dog
- The person with the bird is not Carol
- Bob's pet has four legs

What pet does each person have?
```
**Implementation:**

```python
# Pseudo-code for self-consistency
answers = []
for i in range(5):
    response = llm.generate(prompt, temperature=0.7)
    answer = extract_final_answer(response)
    answers.append(answer)
final_answer = majority_vote(answers)
```
**When to use:**
- Math word problems
- Logic puzzles
- Any task where you can verify consistency
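The majority-vote step can be made concrete with `collections.Counter`; the `canned` answers below are a deterministic stub standing in for `llm.generate` plus answer extraction.

```python
from collections import Counter
from typing import Callable

def self_consistent_answer(sample: Callable[[], str], n: int = 5) -> str:
    """Sample n reasoning paths, collect their final answers, and
    return the majority-vote winner (ties break by first-seen order)."""
    answers = [sample() for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Stub sampler standing in for a temperature>0 LLM call + answer extraction.
canned = iter(["$11.20", "$14.00", "$11.20", "$11.20", "$14.00"])
result = self_consistent_answer(lambda: next(canned), n=5)
```

In practice `sample` would issue the same prompt each time and parse out only the final answer, so that divergent reasoning paths still vote on a comparable value.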
## 7. ReAct (Reasoning + Acting)

**When to use:** Tasks requiring external tool use or information gathering.

**Pattern:**
Answer the question using the available tools.
Tools available:
- search(query): Search the web
- calculate(expression): Evaluate math
- lookup(term): Look up in knowledge base
Format your response as:
Thought: [reasoning about what to do]
Action: [tool_name(arguments)]
Observation: [result from tool]
... (repeat as needed)
Thought: I now have enough information
Final Answer: [answer]
Question: [user question]
**Example:**
Input:
Question: What is the population of France divided by the population of Belgium?
**Expected Output:**
Thought: I need to find the population of France and Belgium, then calculate the ratio.
Action: search("population of France 2024")
Observation: France has a population of approximately 67.75 million.
Thought: Now I need Belgium's population.
Action: search("population of Belgium 2024")
Observation: Belgium has a population of approximately 11.7 million.
Thought: Now I can calculate the ratio.
Action: calculate(67.75 / 11.7)
Observation: 5.79
Thought: I now have enough information to answer.
Final Answer: The population of France is approximately 5.79 times the population of Belgium.
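A minimal dispatcher for the `Action:` lines in the transcript above might look like this; the regex, `run_react_step`, and the stub tools are illustrative, and a real `calculate` tool should use a safe expression parser rather than `eval`.

```python
import re

def run_react_step(response_line: str, tools: dict) -> str:
    """Parse one 'Action: tool_name(arguments)' line from a ReAct
    transcript, dispatch it, and return the Observation line."""
    match = re.match(r"Action:\s*(\w+)\((.*)\)", response_line)
    if not match:
        raise ValueError(f"not an action line: {response_line!r}")
    name, arg = match.group(1), match.group(2).strip("\"'")
    if name not in tools:
        return f"Observation: unknown tool {name}"
    return f"Observation: {tools[name](arg)}"

# Stub tools standing in for real search/calculator backends.
# eval is used only for this toy example, with builtins disabled.
tools = {
    "calculate": lambda expr: round(eval(expr, {"__builtins__": {}}), 2),
    "search": lambda q: "67.75 million" if "France" in q else "11.7 million",
}
```

The driver loop would feed each Observation back to the model and repeat until a `Final Answer:` line appears.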
## 8. Tree of Thoughts

**When to use:** Complex problems requiring exploration of multiple solution paths.

**Pattern:**
Problem: [complex problem]
Generate 3 different approaches to solve this:
Approach 1: [description]
Evaluation: [pros/cons]
Approach 2: [description]
Evaluation: [pros/cons]
Approach 3: [description]
Evaluation: [pros/cons]
Best approach: [selection with reasoning]
Solution using best approach: [detailed solution]
**Example:**
Input:
Problem: Design a system to handle 10,000 concurrent user sessions with sub-100ms response times.
Generate 3 different architectural approaches:
**Expected Output:**
Approach 1: Horizontal scaling with load balancer
- Multiple stateless app servers behind load balancer
- Session state in Redis cluster
Evaluation: Simple, proven pattern. May have Redis as bottleneck.
Approach 2: Serverless with edge computing
- AWS Lambda@Edge or Cloudflare Workers
- Session state in DynamoDB Global Tables
Evaluation: Auto-scales, but cold starts may impact latency.
Approach 3: Actor model with sticky sessions
- Akka/Orleans cluster with virtual actors
- Session state co-located with actor
Evaluation: Lowest latency, but complex to operate.
Best approach: Approach 1 (Horizontal scaling)
Reasoning: Best balance of simplicity, reliability, and performance for the given requirements.
Solution: [detailed architecture follows]
## 9. Retrieval-Augmented Generation

**When to use:** Tasks requiring factual accuracy or current information.

**Pattern:**
Use the following context to answer the question. If the answer is not in the context, say "I don't have information about that."
Context:
[retrieved documents]
Question: [user question]
Answer based only on the context provided:
**Example:**
Input:
Use the following context to answer the question.
Context:
Document 1: "The Acme Widget 3000 was released in March 2024. It features
a 12-hour battery life and weighs 250 grams."
Document 2: "Pricing for the Acme Widget 3000 starts at $299 for the base
model. The Pro version with extended battery is $399."
Question: How much does the Acme Widget 3000 cost and what is its battery life?
Answer based only on the context provided:
**Expected Output:**
The Acme Widget 3000 starts at $299 for the base model, with a Pro version
available at $399. The standard model has a 12-hour battery life.
**Chunking strategies:**
| Strategy | Chunk Size | Overlap | Best For |
|---|---|---|---|
| Fixed | 500 tokens | 50 tokens | General purpose |
| Semantic | Paragraph | None | Well-structured docs |
| Sentence | 3-5 sentences | 1 sentence | Q&A systems |
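As a sketch of the "Fixed" row, with whitespace-split words standing in for tokens (a real implementation would count with the model's tokenizer):

```python
def chunk_fixed(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks of `chunk_size` words, with the
    last `overlap` words of each chunk repeated at the start of the next."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # advance by size minus overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), step)]
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk; note the final chunk may be shorter than `chunk_size`.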
## 10. Meta-Prompting

**When to use:** Generating or improving prompts programmatically.

**Pattern:**
You are a prompt engineering expert.
Task: [description of what the prompt should do]
Target model: [GPT-4/Claude/etc.]
Constraints: [length limits, format requirements]
Generate an optimized prompt for this task.
**Example:**
Input:
You are a prompt engineering expert.
Task: Create a prompt that extracts action items from meeting notes
Target model: GPT-4
Constraints:
- Output must be valid JSON
- Each action item needs: task, owner, due_date
- Handle cases where information is missing
Generate an optimized prompt for this task.
**Expected Output:**
Extract action items from the following meeting notes.
For each action item, identify:
- task: The specific action to be taken
- owner: Person responsible (use "Unassigned" if not mentioned)
- due_date: Deadline in YYYY-MM-DD format (use null if not specified)
Meeting Notes:
{meeting_notes}
Respond with a JSON array. Example format:
[
{"task": "Review proposal", "owner": "Sarah", "due_date": "2024-01-20"},
{"task": "Send update", "owner": "Unassigned", "due_date": null}
]
Return ONLY the JSON array, no additional text.
## Pattern Selection Guide
| Task Type | Recommended Pattern |
|---|---|
| Simple classification | Zero-shot |
| Consistent formatting needed | Few-shot |
| Math/logic problems | Chain-of-Thought |
| Need expertise/perspective | Role Prompting |
| API integration | Structured Output |
| High-stakes decisions | Self-Consistency |
| Tool use required | ReAct |
| Complex problem solving | Tree of Thoughts |
| Factual Q&A | RAG |
| Prompt generation | Meta-Prompting |