firefrost-gaming/claude-skills-reference

Files

Alireza Rezvani 6723bc6977 fix(skill): rewrite senior-prompt-engineer with unique, actionable content (#91 )

Issue #49 feedback implementation:

SKILL.md:
- Added YAML frontmatter with trigger phrases
- Removed marketing language ("world-class", etc.)
- Added Table of Contents
- Converted vague bullets to concrete workflows
- Added input/output examples for all tools

Reference files (all 3 previously 100% identical):
- prompt_engineering_patterns.md: 10 patterns with examples
  (Zero-Shot, Few-Shot, CoT, Role, Structured Output, etc.)
- llm_evaluation_frameworks.md: 7 sections on metrics
  (BLEU, ROUGE, BERTScore, RAG metrics, A/B testing)
- agentic_system_design.md: 6 agent architecture sections
  (ReAct, Plan-Execute, Tool Use, Multi-Agent, Memory)

Python scripts (all 3 previously identical placeholders):
- prompt_optimizer.py: Token counting, clarity analysis,
  few-shot extraction, optimization suggestions
- rag_evaluator.py: Context relevance, faithfulness,
  retrieval metrics (Precision@K, MRR, NDCG)
- agent_orchestrator.py: Config parsing, validation,
  ASCII/Mermaid visualization, cost estimation

Total: 3,571 lines added, 587 deleted
Before: ~785 lines duplicate boilerplate
After: 3,750 lines unique, actionable content

Closes #49

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-26 11:03:37 +01:00

21 KiB

Raw Blame History

Agentic System Design

Agent architectures, tool use patterns, and multi-agent orchestration with pseudocode.

1. ReAct Pattern

Reasoning + Acting: The agent alternates between thinking about what to do and taking actions.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                        ReAct Loop                           │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐ │
│   │ Thought │───▶│ Action  │───▶│  Tool   │───▶│Observat.│ │
│   └─────────┘    └─────────┘    └─────────┘    └────┬────┘ │
│        ▲                                            │       │
│        └────────────────────────────────────────────┘       │
│                         (loop until done)                   │
└─────────────────────────────────────────────────────────────┘

Pseudocode

def react_agent(query, tools, max_iterations=10):
    """
    ReAct agent implementation.

    Args:
        query: User question
        tools: Dict of available tools {name: function}
        max_iterations: Safety limit
    """
    context = f"Question: {query}\n"

    for i in range(max_iterations):
        # Generate thought and action
        response = llm.generate(
            REACT_PROMPT.format(
                tools=format_tools(tools),
                context=context
            )
        )

        # Parse response
        thought = extract_thought(response)
        action = extract_action(response)

        context += f"Thought: {thought}\n"

        # Check for final answer
        if action.name == "finish":
            return action.argument

        # Execute tool
        if action.name in tools:
            observation = tools[action.name](action.argument)
            context += f"Action: {action.name}({action.argument})\n"
            context += f"Observation: {observation}\n"
        else:
            context += f"Error: Unknown tool {action.name}\n"

    return "Max iterations reached"

Prompt Template

You are a helpful assistant that can use tools to answer questions.

Available tools:
{tools}

Answer format:
Thought: [your reasoning about what to do next]
Action: [tool_name(argument)] OR finish(final_answer)

{context}

Continue:

When to Use

Scenario	ReAct Fit
Simple Q&A with lookup	Good
Multi-step research	Good
Math calculations	Good
Creative writing	Poor
Real-time conversation	Poor

2. Plan-and-Execute

Two-phase approach: First create a plan, then execute each step.

Architecture

┌──────────────────────────────────────────────────────────────┐
│                     Plan-and-Execute                         │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  Phase 1: Planning                                           │
│  ┌──────────┐    ┌──────────────────────────────────────┐   │
│  │  Query   │───▶│  Generate step-by-step plan          │   │
│  └──────────┘    └──────────────────────────────────────┘   │
│                              │                               │
│                              ▼                               │
│                  ┌──────────────────────┐                    │
│                  │ Plan: [S1, S2, S3]   │                    │
│                  └──────────┬───────────┘                    │
│                             │                                │
│  Phase 2: Execution         │                                │
│                  ┌──────────▼───────────┐                    │
│                  │   Execute Step 1     │                    │
│                  └──────────┬───────────┘                    │
│                             │                                │
│                  ┌──────────▼───────────┐                    │
│                  │   Execute Step 2     │──▶ Replan?         │
│                  └──────────┬───────────┘                    │
│                             │                                │
│                  ┌──────────▼───────────┐                    │
│                  │   Execute Step 3     │                    │
│                  └──────────┬───────────┘                    │
│                             │                                │
│                  ┌──────────▼───────────┐                    │
│                  │    Final Answer      │                    │
│                  └──────────────────────┘                    │
└──────────────────────────────────────────────────────────────┘

Pseudocode

def plan_and_execute(query, tools):
    """
    Plan-and-Execute agent.

    Separates planning from execution for complex tasks.
    """
    # Phase 1: Generate plan
    plan = generate_plan(query)

    results = []

    # Phase 2: Execute each step
    for i, step in enumerate(plan.steps):
        # Execute step
        result = execute_step(step, tools, results)
        results.append(result)

        # Optional: Check if replanning needed
        if should_replan(step, result, plan):
            remaining_steps = plan.steps[i+1:]
            new_plan = replan(query, results, remaining_steps)
            plan.steps = plan.steps[:i+1] + new_plan.steps

    # Synthesize final answer
    return synthesize_answer(query, results)


def generate_plan(query):
    """Generate execution plan from query."""
    prompt = f"""
    Create a step-by-step plan to answer this question:
    {query}

    Format each step as:
    Step N: [action description]

    Keep the plan concise (3-7 steps).
    """
    response = llm.generate(prompt)
    return parse_plan(response)


def execute_step(step, tools, previous_results):
    """Execute a single step using available tools."""
    prompt = f"""
    Execute this step: {step.description}

    Previous results:
    {format_results(previous_results)}

    Available tools: {format_tools(tools)}

    Provide the result of this step.
    """
    return llm.generate(prompt)

When to Use

Task Complexity	Recommendation
Simple (1-2 steps)	Use ReAct
Medium (3-5 steps)	Plan-and-Execute
Complex (6+ steps)	Plan-and-Execute with replanning
Highly dynamic	ReAct with adaptive planning

3. Tool Use / Function Calling

Structured tool invocation: LLM generates structured calls that are executed externally.

Tool Definition Schema

{
  "name": "search_web",
  "description": "Search the web for current information",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query"
      },
      "num_results": {
        "type": "integer",
        "default": 5,
        "description": "Number of results to return"
      }
    },
    "required": ["query"]
  }
}

Implementation Pattern

class ToolRegistry:
    """Registry for agent tools."""

    def __init__(self):
        self.tools = {}

    def register(self, name, func, schema):
        """Register a tool with its schema."""
        self.tools[name] = {
            "function": func,
            "schema": schema
        }

    def get_schemas(self):
        """Get all tool schemas for LLM."""
        return [t["schema"] for t in self.tools.values()]

    def execute(self, name, arguments):
        """Execute a tool by name."""
        if name not in self.tools:
            raise ValueError(f"Unknown tool: {name}")

        func = self.tools[name]["function"]
        return func(**arguments)


def tool_use_agent(query, registry):
    """Agent with function calling."""
    messages = [{"role": "user", "content": query}]

    while True:
        # Call LLM with tools
        response = llm.chat(
            messages=messages,
            tools=registry.get_schemas(),
            tool_choice="auto"
        )

        # Check if done
        if response.finish_reason == "stop":
            return response.content

        # Execute tool calls
        if response.tool_calls:
            for call in response.tool_calls:
                result = registry.execute(
                    call.function.name,
                    json.loads(call.function.arguments)
                )
                messages.append({
                    "role": "tool",
                    "tool_call_id": call.id,
                    "content": str(result)
                })

Tool Design Best Practices

Practice	Example
Clear descriptions	"Search web for query" not "search"
Type hints	Use JSON Schema types
Default values	Provide sensible defaults
Error handling	Return error messages, not exceptions
Idempotency	Same input = same output

4. Multi-Agent Collaboration

Orchestration Patterns

Pattern 1: Sequential Pipeline

Agent A → Agent B → Agent C → Output

Use case: Research → Analysis → Writing

Pattern 2: Hierarchical

        ┌─────────────┐
        │ Coordinator │
        └──────┬──────┘
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌───────┐ ┌───────┐ ┌───────┐
│Agent A│ │Agent B│ │Agent C│
└───────┘ └───────┘ └───────┘

Use case: Complex task decomposition

Pattern 3: Debate/Consensus

┌───────┐     ┌───────┐
│Agent A│◄───▶│Agent B│
└───┬───┘     └───┬───┘
    │             │
    └──────┬──────┘
           ▼
    ┌─────────────┐
    │   Arbiter   │
    └─────────────┘

Use case: Critical decisions, fact-checking

Pseudocode: Hierarchical Multi-Agent

class CoordinatorAgent:
    """Coordinates multiple specialized agents."""

    def __init__(self, agents):
        self.agents = agents  # Dict[str, Agent]

    def process(self, query):
        # Decompose task
        subtasks = self.decompose(query)

        # Assign to agents
        results = {}
        for subtask in subtasks:
            agent_name = self.select_agent(subtask)
            result = self.agents[agent_name].execute(subtask)
            results[subtask.id] = result

        # Synthesize
        return self.synthesize(query, results)

    def decompose(self, query):
        """Break query into subtasks."""
        prompt = f"""
        Break this task into subtasks for specialized agents:

        Task: {query}

        Available agents:
        - researcher: Gathers information
        - analyst: Analyzes data
        - writer: Produces content

        Format:
        1. [agent]: [subtask description]
        """
        response = llm.generate(prompt)
        return parse_subtasks(response)

    def select_agent(self, subtask):
        """Select best agent for subtask."""
        return subtask.assigned_agent

    def synthesize(self, query, results):
        """Combine agent results into final answer."""
        prompt = f"""
        Combine these results to answer: {query}

        Results:
        {format_results(results)}

        Provide a coherent final answer.
        """
        return llm.generate(prompt)

Communication Protocols

Protocol	Description	Use When
Direct	Agent calls agent	Simple pipelines
Message queue	Async message passing	High throughput
Shared state	Shared memory/database	Collaborative editing
Broadcast	One-to-many	Status updates

5. Memory and State Management

Memory Types

┌─────────────────────────────────────────────────────────────┐
│                    Agent Memory System                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────┐  ┌─────────────────┐                  │
│  │ Working Memory  │  │  Episodic Memory │                  │
│  │ (Current task)  │  │ (Past sessions)  │                  │
│  └────────┬────────┘  └────────┬─────────┘                  │
│           │                    │                            │
│           └────────┬───────────┘                            │
│                    ▼                                        │
│  ┌─────────────────────────────────────────┐               │
│  │           Semantic Memory               │               │
│  │    (Long-term knowledge, embeddings)    │               │
│  └─────────────────────────────────────────┘               │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Implementation

class AgentMemory:
    """Memory system for conversational agents."""

    def __init__(self, embedding_model, vector_store):
        self.embedding_model = embedding_model
        self.vector_store = vector_store
        self.working_memory = []  # Current conversation
        self.buffer_size = 10     # Recent messages to keep

    def add_message(self, role, content):
        """Add message to working memory."""
        self.working_memory.append({
            "role": role,
            "content": content,
            "timestamp": datetime.now()
        })

        # Trim if too long
        if len(self.working_memory) > self.buffer_size:
            # Summarize old messages before removing
            old_messages = self.working_memory[:5]
            summary = self.summarize(old_messages)
            self.store_long_term(summary)
            self.working_memory = self.working_memory[5:]

    def store_long_term(self, content):
        """Store in semantic memory (vector store)."""
        embedding = self.embedding_model.embed(content)
        self.vector_store.add(
            embedding=embedding,
            metadata={"content": content, "type": "summary"}
        )

    def retrieve_relevant(self, query, k=5):
        """Retrieve relevant memories for context."""
        query_embedding = self.embedding_model.embed(query)
        results = self.vector_store.search(query_embedding, k=k)
        return [r.metadata["content"] for r in results]

    def get_context(self, query):
        """Build context for LLM from memories."""
        relevant = self.retrieve_relevant(query)
        recent = self.working_memory[-self.buffer_size:]

        return {
            "relevant_memories": relevant,
            "recent_conversation": recent
        }

    def summarize(self, messages):
        """Summarize messages for long-term storage."""
        content = "\n".join([
            f"{m['role']}: {m['content']}"
            for m in messages
        ])
        prompt = f"Summarize this conversation:\n{content}"
        return llm.generate(prompt)

State Persistence Patterns

Pattern	Storage	Use Case
In-memory	Dict/List	Single session
Redis	Key-value	Multi-session, fast
PostgreSQL	Relational	Complex queries
Vector DB	Embeddings	Semantic search

6. Agent Design Patterns

Pattern: Reflection

Agent reviews and critiques its own output.

def reflective_agent(query, tools):
    """Agent that reflects on its answers."""
    # Initial response
    response = react_agent(query, tools)

    # Reflection
    critique = llm.generate(f"""
    Review this answer for:
    1. Accuracy - Is the information correct?
    2. Completeness - Does it fully answer the question?
    3. Clarity - Is it easy to understand?

    Question: {query}
    Answer: {response}

    Critique:
    """)

    # Check if revision needed
    if needs_revision(critique):
        revised = llm.generate(f"""
        Improve this answer based on the critique:

        Original: {response}
        Critique: {critique}

        Improved answer:
        """)
        return revised

    return response

Pattern: Self-Ask

Break complex questions into simpler sub-questions.

def self_ask_agent(query, tools):
    """Agent that asks itself follow-up questions."""
    context = []

    while True:
        prompt = f"""
        Question: {query}

        Previous Q&A:
        {format_qa(context)}

        Do you need to ask a follow-up question to answer this?
        If yes: "Follow-up: [question]"
        If no: "Final Answer: [answer]"
        """

        response = llm.generate(prompt)

        if response.startswith("Final Answer:"):
            return response.replace("Final Answer:", "").strip()

        # Answer follow-up question
        follow_up = response.replace("Follow-up:", "").strip()
        answer = simple_qa(follow_up, tools)
        context.append({"q": follow_up, "a": answer})

Pattern: Expert Routing

Route queries to specialized sub-agents.

class ExpertRouter:
    """Routes queries to expert agents."""

    def __init__(self):
        self.experts = {
            "code": CodeAgent(),
            "math": MathAgent(),
            "research": ResearchAgent(),
            "general": GeneralAgent()
        }

    def route(self, query):
        """Determine best expert for query."""
        prompt = f"""
        Classify this query into one category:
        - code: Programming questions
        - math: Mathematical calculations
        - research: Fact-finding, current events
        - general: Everything else

        Query: {query}
        Category:
        """
        category = llm.generate(prompt).strip().lower()
        return self.experts.get(category, self.experts["general"])

    def process(self, query):
        expert = self.route(query)
        return expert.execute(query)

Quick Reference: Pattern Selection

Need	Pattern
Simple tool use	ReAct
Complex multi-step	Plan-and-Execute
API integration	Function Calling
Multiple perspectives	Multi-Agent Debate
Quality assurance	Reflection
Complex reasoning	Self-Ask
Domain expertise	Expert Routing
Conversation continuity	Memory System

21 KiB Raw Blame History

Agentic System Design

Architectures Index

1. ReAct Pattern

Architecture

Pseudocode

Prompt Template

When to Use

2. Plan-and-Execute

Architecture

Pseudocode

When to Use

3. Tool Use / Function Calling

Tool Definition Schema

Implementation Pattern

Tool Design Best Practices

4. Multi-Agent Collaboration

Orchestration Patterns

Pseudocode: Hierarchical Multi-Agent

Communication Protocols

5. Memory and State Management

Memory Types

Implementation

State Persistence Patterns

6. Agent Design Patterns

Pattern: Reflection

Pattern: Self-Ask

Pattern: Expert Routing

Quick Reference: Pattern Selection

21 KiB

Raw Blame History