feat: Add Official Microsoft & Gemini Skills (845+ Total)

## 🚀 Impact

Significantly expands the capabilities of **Antigravity Awesome Skills** by integrating official skill collections from **Microsoft** and **Google Gemini**. This update increases the total skill count to **845+**, making the library even more comprehensive for AI coding assistants.

## Key Changes

### 1. New Official Skills

- **Microsoft Skills**: Added the complete collection of official skills from [microsoft/skills](https://github.com/microsoft/skills).
  - Includes Azure, .NET, Python, TypeScript, and Semantic Kernel skills.
  - Preserves the original directory structure under `skills/official/microsoft/`.
  - Includes plugin skills from the `.github/plugins` directory.
- **Gemini Skills**: Added official Gemini API development skills under `skills/gemini-api-dev/`.

### 2. New Scripts & Tooling

- **`scripts/sync_microsoft_skills.py`**: A robust synchronization script (sketched after this list) that:
  - Clones the official Microsoft repository.
  - Preserves the original directory hierarchy.
  - Handles symlinks and plugin locations.
  - Generates attribution metadata.
- **`scripts/tests/inspect_microsoft_repo.py`**: Debug tool to inspect the remote repository structure.
- **`scripts/tests/test_comprehensive_coverage.py`**: Verification script to ensure 100% of skills are captured during sync.
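
A minimal sketch of the sync flow, assuming a hypothetical `SKILL.md`-per-directory layout and illustrative helper names (not the actual `sync_microsoft_skills.py` code):

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hedged sketch, not the real script: shallow-clone the upstream repo,
# then copy each skill directory while preserving its relative path.
def sync_microsoft_skills(dest: Path = Path("skills/official/microsoft")) -> None:
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(
            ["git", "clone", "--depth", "1",
             "https://github.com/microsoft/skills.git", tmp],
            check=True,
        )
        repo = Path(tmp)
        for skill_md in repo.rglob("SKILL.md"):
            skill_dir = skill_md.parent
            target = dest / skill_dir.relative_to(repo)
            # symlinks=False copies the linked files themselves
            shutil.copytree(skill_dir, target, symlinks=False, dirs_exist_ok=True)

if __name__ == "__main__":
    sync_microsoft_skills()
```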

### 3. Core Improvements

- **`scripts/generate_index.py`**: Enhanced frontmatter parsing to safely handle unquoted values containing `@` symbols and commas, fixing indexing failures on some Microsoft skill descriptions (see the sketch after this list).
- **`package.json`**: Added `sync:microsoft` and `sync:all-official` scripts for easy maintenance.
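
A sketch of the kind of tolerant parsing involved — values are taken verbatim after the first colon, so unquoted `@` or commas cannot derail a strict YAML load. Names here are illustrative, not the actual `generate_index.py` code:

```python
import re

def parse_frontmatter(text: str) -> dict[str, str]:
    """Tolerantly parse a skill file's YAML-style frontmatter block."""
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    if not match:
        return {}
    fields: dict[str, str] = {}
    key = None
    for line in match.group(1).splitlines():
        if line.startswith((" ", "\t")) and key:
            # Continuation line of a multi-line value (e.g. "description: |")
            fields[key] = (fields[key] + " " + line.strip()).strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip().lstrip("|").strip()
    return fields

doc = "---\nname: demo\ndescription: uses @azure/identity, commas, unquoted\n---\nbody"
print(parse_frontmatter(doc)["description"])
```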

### 4. Documentation

- Updated `README.md` to reflect the new skill count (845+) and added Microsoft/Gemini to the provider list.
- Updated `CATALOG.md` and `skills_index.json` with the new skills.

## 🧪 Verification

- Ran `scripts/tests/test_comprehensive_coverage.py` to verify all Microsoft skills are detected.
- Validated `generate_index.py` fixes by successfully indexing the new skills.

@@ -0,0 +1,333 @@
---
name: agent-framework-azure-ai-py
description: Build Azure AI Foundry agents using the Microsoft Agent Framework Python SDK (agent-framework-azure-ai). Use when creating persistent agents with AzureAIAgentsProvider, using hosted tools (code interpreter, file search, web search), integrating MCP servers, managing conversation threads, or implementing streaming responses. Covers function tools, structured outputs, and multi-tool agents.
package: agent-framework-azure-ai
---
# Agent Framework Azure Hosted Agents
Build persistent agents on Azure AI Foundry using the Microsoft Agent Framework Python SDK.
## Architecture
```
User Query → AzureAIAgentsProvider → Azure AI Agent Service (Persistent)
    Agent.run() / Agent.run_stream()
    Tools: Functions | Hosted (Code/Search/Web) | MCP
    AgentThread (conversation persistence)
```
## Installation
```bash
# Full framework (recommended)
pip install agent-framework --pre
# Or Azure-specific package only
pip install agent-framework-azure-ai --pre
```
## Environment Variables
```bash
export AZURE_AI_PROJECT_ENDPOINT="https://<project>.services.ai.azure.com/api/projects/<project-id>"
export AZURE_AI_MODEL_DEPLOYMENT_NAME="gpt-4o-mini"
export BING_CONNECTION_ID="your-bing-connection-id" # For web search
```
## Authentication
```python
from azure.identity.aio import AzureCliCredential, DefaultAzureCredential
# Development
credential = AzureCliCredential()
# Production
credential = DefaultAzureCredential()
```
## Core Workflow
### Basic Agent
```python
import asyncio
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential
async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="MyAgent",
            instructions="You are a helpful assistant.",
        )
        result = await agent.run("Hello!")
        print(result.text)

asyncio.run(main())
```
### Agent with Function Tools
```python
from typing import Annotated
from pydantic import Field
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential
def get_weather(
    location: Annotated[str, Field(description="City name to get weather for")],
) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

def get_current_time() -> str:
    """Get the current UTC time."""
    from datetime import datetime, timezone
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="WeatherAgent",
            instructions="You help with weather and time queries.",
            tools=[get_weather, get_current_time],  # Pass functions directly
        )
        result = await agent.run("What's the weather in Seattle?")
        print(result.text)
```
### Agent with Hosted Tools
```python
from agent_framework import (
HostedCodeInterpreterTool,
HostedFileSearchTool,
HostedWebSearchTool,
)
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential
async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="MultiToolAgent",
            instructions="You can execute code, search files, and search the web.",
            tools=[
                HostedCodeInterpreterTool(),
                HostedWebSearchTool(name="Bing"),
            ],
        )
        result = await agent.run("Calculate the factorial of 20 in Python")
        print(result.text)
```
### Streaming Responses
```python
async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="StreamingAgent",
            instructions="You are a helpful assistant.",
        )
        print("Agent: ", end="", flush=True)
        async for chunk in agent.run_stream("Tell me a short story"):
            if chunk.text:
                print(chunk.text, end="", flush=True)
        print()
```
### Conversation Threads
```python
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential
async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="ChatAgent",
            instructions="You are a helpful assistant.",
            tools=[get_weather],
        )
        # Create thread for conversation persistence
        thread = agent.get_new_thread()
        # First turn
        result1 = await agent.run("What's the weather in Seattle?", thread=thread)
        print(f"Agent: {result1.text}")
        # Second turn - context is maintained
        result2 = await agent.run("What about Portland?", thread=thread)
        print(f"Agent: {result2.text}")
        # Save thread ID for later resumption
        print(f"Conversation ID: {thread.conversation_id}")
```
### Structured Outputs
```python
from pydantic import BaseModel, ConfigDict
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential
class WeatherResponse(BaseModel):
    model_config = ConfigDict(extra="forbid")
    location: str
    temperature: float
    unit: str
    conditions: str

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="StructuredAgent",
            instructions="Provide weather information in structured format.",
            response_format=WeatherResponse,
        )
        result = await agent.run("Weather in Seattle?")
        weather = WeatherResponse.model_validate_json(result.text)
        print(f"{weather.location}: {weather.temperature}°{weather.unit}")
```
## Provider Methods
| Method | Description |
|--------|-------------|
| `create_agent()` | Create new agent on Azure AI service |
| `get_agent(agent_id)` | Retrieve existing agent by ID |
| `as_agent(sdk_agent)` | Wrap SDK Agent object (no HTTP call) |
## Hosted Tools Quick Reference
| Tool | Import | Purpose |
|------|--------|---------|
| `HostedCodeInterpreterTool` | `from agent_framework import HostedCodeInterpreterTool` | Execute Python code |
| `HostedFileSearchTool` | `from agent_framework import HostedFileSearchTool` | Search vector stores |
| `HostedWebSearchTool` | `from agent_framework import HostedWebSearchTool` | Bing web search |
| `HostedMCPTool` | `from agent_framework import HostedMCPTool` | Service-managed MCP |
| `MCPStreamableHTTPTool` | `from agent_framework import MCPStreamableHTTPTool` | Client-managed MCP |
## Complete Example
```python
import asyncio
from typing import Annotated
from pydantic import BaseModel, Field
from agent_framework import (
HostedCodeInterpreterTool,
HostedWebSearchTool,
MCPStreamableHTTPTool,
)
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential
def get_weather(
    location: Annotated[str, Field(description="City name")],
) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

class AnalysisResult(BaseModel):
    summary: str
    key_findings: list[str]
    confidence: float

async def main():
    async with (
        AzureCliCredential() as credential,
        MCPStreamableHTTPTool(
            name="Docs MCP",
            url="https://learn.microsoft.com/api/mcp",
        ) as mcp_tool,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="ResearchAssistant",
            instructions="You are a research assistant with multiple capabilities.",
            tools=[
                get_weather,
                HostedCodeInterpreterTool(),
                HostedWebSearchTool(name="Bing"),
                mcp_tool,
            ],
        )
        thread = agent.get_new_thread()
        # Non-streaming
        result = await agent.run(
            "Search for Python best practices and summarize",
            thread=thread,
        )
        print(f"Response: {result.text}")
        # Streaming
        print("\nStreaming: ", end="")
        async for chunk in agent.run_stream("Continue with examples", thread=thread):
            if chunk.text:
                print(chunk.text, end="", flush=True)
        print()
        # Structured output
        result = await agent.run(
            "Analyze findings",
            thread=thread,
            response_format=AnalysisResult,
        )
        analysis = AnalysisResult.model_validate_json(result.text)
        print(f"\nConfidence: {analysis.confidence}")

if __name__ == "__main__":
    asyncio.run(main())
```
## Conventions
- Always use async context managers: `async with provider:`
- Pass functions directly to `tools=` parameter (auto-converted to AIFunction)
- Use `Annotated[type, Field(description=...)]` for function parameters
- Use `get_new_thread()` for multi-turn conversations
- Prefer `HostedMCPTool` for service-managed MCP, `MCPStreamableHTTPTool` for client-managed
## Reference Files
- [references/tools.md](references/tools.md): Detailed hosted tool patterns
- [references/mcp.md](references/mcp.md): MCP integration (hosted + local)
- [references/threads.md](references/threads.md): Thread and conversation management
- [references/advanced.md](references/advanced.md): OpenAPI, citations, structured outputs


@@ -0,0 +1,325 @@
---
name: agents-v2-py
description: |
Build container-based Foundry Agents using Azure AI Projects SDK with ImageBasedHostedAgentDefinition.
Use when creating hosted agents that run custom code in Azure AI Foundry with your own container images.
Triggers: "ImageBasedHostedAgentDefinition", "hosted agent", "container agent", "Foundry Agent",
"create_version", "ProtocolVersionRecord", "AgentProtocol.RESPONSES", "custom agent image".
package: azure-ai-projects
---
# Azure AI Hosted Agents (Python)
Build container-based hosted agents using `ImageBasedHostedAgentDefinition` from the Azure AI Projects SDK.
## Installation
```bash
pip install azure-ai-projects>=2.0.0b3 azure-identity
```
**Minimum SDK Version:** `2.0.0b3` or later required for hosted agent support.
## Environment Variables
```bash
AZURE_AI_PROJECT_ENDPOINT=https://<resource>.services.ai.azure.com/api/projects/<project>
```
## Prerequisites
Before creating hosted agents:
1. **Container Image** - Build and push to Azure Container Registry (ACR)
2. **ACR Pull Permissions** - Grant your project's managed identity the `AcrPull` role on the ACR (see the command after this list)
3. **Capability Host** - Account-level capability host with `enablePublicHostingEnvironment=true`
4. **SDK Version** - Ensure `azure-ai-projects>=2.0.0b3`
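The role assignment in step 2 can be granted with the Azure CLI; the registry and principal names below are placeholders:
```bash
# Grant the project's managed identity pull access on your ACR
ACR_ID=$(az acr show --name myregistry --query id --output tsv)
az role assignment create \
  --assignee "<project-managed-identity-principal-id>" \
  --role AcrPull \
  --scope "$ACR_ID"
```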
## Authentication
Always use `DefaultAzureCredential`:
```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

credential = DefaultAzureCredential()
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=credential
)
```
## Core Workflow
### 1. Imports
```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import (
    ImageBasedHostedAgentDefinition,
    ProtocolVersionRecord,
    AgentProtocol,
)
```
### 2. Create Hosted Agent
```python
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential()
)

agent = client.agents.create_version(
    agent_name="my-hosted-agent",
    definition=ImageBasedHostedAgentDefinition(
        container_protocol_versions=[
            ProtocolVersionRecord(protocol=AgentProtocol.RESPONSES, version="v1")
        ],
        cpu="1",
        memory="2Gi",
        image="myregistry.azurecr.io/my-agent:latest",
        tools=[{"type": "code_interpreter"}],
        environment_variables={
            "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
            "MODEL_NAME": "gpt-4o-mini"
        }
    )
)
print(f"Created agent: {agent.name} (version: {agent.version})")
```
### 3. List Agent Versions
```python
versions = client.agents.list_versions(agent_name="my-hosted-agent")
for version in versions:
    print(f"Version: {version.version}, State: {version.state}")
```
### 4. Delete Agent Version
```python
client.agents.delete_version(
    agent_name="my-hosted-agent",
    version=agent.version
)
```
## ImageBasedHostedAgentDefinition Parameters
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `container_protocol_versions` | `list[ProtocolVersionRecord]` | Yes | Protocol versions the agent supports |
| `image` | `str` | Yes | Full container image path (registry/image:tag) |
| `cpu` | `str` | No | CPU allocation (e.g., "1", "2") |
| `memory` | `str` | No | Memory allocation (e.g., "2Gi", "4Gi") |
| `tools` | `list[dict]` | No | Tools available to the agent |
| `environment_variables` | `dict[str, str]` | No | Environment variables for the container |
## Protocol Versions
The `container_protocol_versions` parameter specifies which protocols your agent supports:
```python
from azure.ai.projects.models import ProtocolVersionRecord, AgentProtocol
# RESPONSES protocol - standard agent responses
container_protocol_versions=[
    ProtocolVersionRecord(protocol=AgentProtocol.RESPONSES, version="v1")
]
```
**Available Protocols:**
| Protocol | Description |
|----------|-------------|
| `AgentProtocol.RESPONSES` | Standard response protocol for agent interactions |
## Resource Allocation
Specify CPU and memory for your container:
```python
definition=ImageBasedHostedAgentDefinition(
    container_protocol_versions=[...],
    image="myregistry.azurecr.io/my-agent:latest",
    cpu="2",      # 2 CPU cores
    memory="4Gi"  # 4 GiB memory
)
```
**Resource Limits:**
| Resource | Min | Max | Default |
|----------|-----|-----|---------|
| CPU | 0.5 | 4 | 1 |
| Memory | 1Gi | 8Gi | 2Gi |
## Tools Configuration
Add tools to your hosted agent:
### Code Interpreter
```python
tools=[{"type": "code_interpreter"}]
```
### MCP Tools
```python
tools=[
    {"type": "code_interpreter"},
    {
        "type": "mcp",
        "server_label": "my-mcp-server",
        "server_url": "https://my-mcp-server.example.com"
    }
]
```
### Multiple Tools
```python
tools=[
    {"type": "code_interpreter"},
    {"type": "file_search"},
    {
        "type": "mcp",
        "server_label": "custom-tool",
        "server_url": "https://custom-tool.example.com"
    }
]
```
## Environment Variables
Pass configuration to your container:
```python
environment_variables={
    "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    "MODEL_NAME": "gpt-4o-mini",
    "LOG_LEVEL": "INFO",
    "CUSTOM_CONFIG": "value"
}
```
**Best Practice:** Never hardcode secrets. Use environment variables or Azure Key Vault.
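One hedged way to follow this: resolve secrets from Key Vault at deployment time and inject only their values as environment variables (vault and secret names below are placeholders):
```python
import os
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Fetch the secret once, at deploy time, and pass only its value to the container
kv = SecretClient(
    vault_url="https://<vault-name>.vault.azure.net",
    credential=DefaultAzureCredential(),
)
environment_variables = {
    "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    "API_TOKEN": kv.get_secret("my-agent-api-token").value,
}
```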
## Complete Example
```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import (
    ImageBasedHostedAgentDefinition,
    ProtocolVersionRecord,
    AgentProtocol,
)

def create_hosted_agent():
    """Create a hosted agent with custom container image."""
    client = AIProjectClient(
        endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
        credential=DefaultAzureCredential()
    )
    agent = client.agents.create_version(
        agent_name="data-processor-agent",
        definition=ImageBasedHostedAgentDefinition(
            container_protocol_versions=[
                ProtocolVersionRecord(
                    protocol=AgentProtocol.RESPONSES,
                    version="v1"
                )
            ],
            image="myregistry.azurecr.io/data-processor:v1.0",
            cpu="2",
            memory="4Gi",
            tools=[
                {"type": "code_interpreter"},
                {"type": "file_search"}
            ],
            environment_variables={
                "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
                "MODEL_NAME": "gpt-4o-mini",
                "MAX_RETRIES": "3"
            }
        )
    )
    print(f"Created hosted agent: {agent.name}")
    print(f"Version: {agent.version}")
    print(f"State: {agent.state}")
    return agent

if __name__ == "__main__":
    create_hosted_agent()
```
## Async Pattern
```python
import os
from azure.identity.aio import DefaultAzureCredential
from azure.ai.projects.aio import AIProjectClient
from azure.ai.projects.models import (
    ImageBasedHostedAgentDefinition,
    ProtocolVersionRecord,
    AgentProtocol,
)

async def create_hosted_agent_async():
    """Create a hosted agent asynchronously."""
    async with DefaultAzureCredential() as credential:
        async with AIProjectClient(
            endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
            credential=credential
        ) as client:
            agent = await client.agents.create_version(
                agent_name="async-agent",
                definition=ImageBasedHostedAgentDefinition(
                    container_protocol_versions=[
                        ProtocolVersionRecord(
                            protocol=AgentProtocol.RESPONSES,
                            version="v1"
                        )
                    ],
                    image="myregistry.azurecr.io/async-agent:latest",
                    cpu="1",
                    memory="2Gi"
                )
            )
            return agent
```
## Common Errors
| Error | Cause | Solution |
|-------|-------|----------|
| `ImagePullBackOff` | ACR pull permission denied | Grant `AcrPull` role to project's managed identity |
| `InvalidContainerImage` | Image not found | Verify image path and tag exist in ACR |
| `CapabilityHostNotFound` | No capability host configured | Create account-level capability host |
| `ProtocolVersionNotSupported` | Invalid protocol version | Use `AgentProtocol.RESPONSES` with version `"v1"` |
## Best Practices
1. **Version Your Images** - Use specific tags, not `latest` in production
2. **Minimal Resources** - Start with minimum CPU/memory, scale up as needed
3. **Environment Variables** - Use for all configuration, never hardcode
4. **Error Handling** - Wrap agent creation in try/except blocks (see the sketch after this list)
5. **Cleanup** - Delete unused agent versions to free resources
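A sketch of practices 4 and 5 combined, assuming the `client` and a `definition` like those in the examples above:
```python
from azure.core.exceptions import HttpResponseError

try:
    agent = client.agents.create_version(
        agent_name="data-processor-agent",
        definition=definition,  # an ImageBasedHostedAgentDefinition as above
    )
except HttpResponseError as e:
    print(f"Agent creation failed: {e.message}")
    raise
else:
    # Delete superseded versions once the new one is confirmed healthy
    for old in client.agents.list_versions(agent_name="data-processor-agent"):
        if old.version != agent.version:
            client.agents.delete_version(
                agent_name="data-processor-agent",
                version=old.version,
            )
```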
## Reference Links
- [Azure AI Projects SDK](https://pypi.org/project/azure-ai-projects/)
- [Hosted Agents Documentation](https://learn.microsoft.com/azure/ai-services/agents/how-to/hosted-agents)
- [Azure Container Registry](https://learn.microsoft.com/azure/container-registry/)


@@ -0,0 +1,214 @@
---
name: azure-ai-contentsafety-py
description: |
Azure AI Content Safety SDK for Python. Use for detecting harmful content in text and images with multi-severity classification.
Triggers: "azure-ai-contentsafety", "ContentSafetyClient", "content moderation", "harmful content", "text analysis", "image analysis".
package: azure-ai-contentsafety
---
# Azure AI Content Safety SDK for Python
Detect harmful user-generated and AI-generated content in applications.
## Installation
```bash
pip install azure-ai-contentsafety
```
## Environment Variables
```bash
CONTENT_SAFETY_ENDPOINT=https://<resource>.cognitiveservices.azure.com
CONTENT_SAFETY_KEY=<your-api-key>
```
## Authentication
### API Key
```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.core.credentials import AzureKeyCredential
import os
client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"])
)
```
### Entra ID
```python
import os
from azure.ai.contentsafety import ContentSafetyClient
from azure.identity import DefaultAzureCredential

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```
## Analyze Text
```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions, TextCategory
from azure.core.credentials import AzureKeyCredential
client = ContentSafetyClient(endpoint, AzureKeyCredential(key))
request = AnalyzeTextOptions(text="Your text content to analyze")
response = client.analyze_text(request)
# Check each category
for category in [TextCategory.HATE, TextCategory.SELF_HARM,
                 TextCategory.SEXUAL, TextCategory.VIOLENCE]:
    result = next((r for r in response.categories_analysis
                   if r.category == category), None)
    if result:
        print(f"{category}: severity {result.severity}")
```
## Analyze Image
```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData
from azure.core.credentials import AzureKeyCredential
import base64
client = ContentSafetyClient(endpoint, AzureKeyCredential(key))
# From file
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

request = AnalyzeImageOptions(
    image=ImageData(content=image_data)
)
response = client.analyze_image(request)
for result in response.categories_analysis:
    print(f"{result.category}: severity {result.severity}")
```
### Image from URL
```python
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData
request = AnalyzeImageOptions(
    image=ImageData(blob_url="https://example.com/image.jpg")
)
response = client.analyze_image(request)
```
## Text Blocklist Management
### Create Blocklist
```python
from azure.ai.contentsafety import BlocklistClient
from azure.ai.contentsafety.models import TextBlocklist
from azure.core.credentials import AzureKeyCredential
blocklist_client = BlocklistClient(endpoint, AzureKeyCredential(key))
blocklist = TextBlocklist(
    blocklist_name="my-blocklist",
    description="Custom terms to block"
)
result = blocklist_client.create_or_update_text_blocklist(
    blocklist_name="my-blocklist",
    options=blocklist
)
```
### Add Block Items
```python
from azure.ai.contentsafety.models import AddOrUpdateTextBlocklistItemsOptions, TextBlocklistItem
items = AddOrUpdateTextBlocklistItemsOptions(
    blocklist_items=[
        TextBlocklistItem(text="blocked-term-1"),
        TextBlocklistItem(text="blocked-term-2")
    ]
)
result = blocklist_client.add_or_update_blocklist_items(
    blocklist_name="my-blocklist",
    options=items
)
```
### Analyze with Blocklist
```python
from azure.ai.contentsafety.models import AnalyzeTextOptions
request = AnalyzeTextOptions(
    text="Text containing blocked-term-1",
    blocklist_names=["my-blocklist"],
    halt_on_blocklist_hit=True
)
response = client.analyze_text(request)
if response.blocklists_match:
    for match in response.blocklists_match:
        print(f"Blocked: {match.blocklist_item_text}")
```
## Severity Levels
Text analysis returns 4 severity levels (0, 2, 4, 6) by default. For 8 levels (0-7):
```python
from azure.ai.contentsafety.models import AnalyzeTextOptions, AnalyzeTextOutputType
request = AnalyzeTextOptions(
    text="Your text",
    output_type=AnalyzeTextOutputType.EIGHT_SEVERITY_LEVELS
)
```
## Harm Categories
| Category | Description |
|----------|-------------|
| `Hate` | Attacks based on identity (race, religion, gender, etc.) |
| `Sexual` | Sexual content, relationships, anatomy |
| `Violence` | Physical harm, weapons, injury |
| `SelfHarm` | Self-injury, suicide, eating disorders |
## Severity Scale
| Level | Text Range | Image Range | Meaning |
|-------|------------|-------------|---------|
| 0 | Safe | Safe | No harmful content |
| 2 | Low | Low | Mild references |
| 4 | Medium | Medium | Moderate content |
| 6 | High | High | Severe content |
## Client Types
| Client | Purpose |
|--------|---------|
| `ContentSafetyClient` | Analyze text and images |
| `BlocklistClient` | Manage custom blocklists |
## Best Practices
1. **Use blocklists** for domain-specific terms
2. **Set severity thresholds** appropriate for your use case (see the sketch after this list)
3. **Handle multiple categories** — content can be harmful in multiple ways
4. **Use halt_on_blocklist_hit** for immediate rejection
5. **Log analysis results** for audit and improvement
6. **Consider 8-severity mode** for finer-grained control
7. **Pre-moderate AI outputs** before showing to users
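A sketch of practice 2, reusing the `client` from the examples above; the threshold value is an application choice, not an SDK default:
```python
from azure.ai.contentsafety.models import AnalyzeTextOptions

SEVERITY_THRESHOLD = 4  # reject anything at Medium or above

def is_allowed(response) -> bool:
    return all(
        (result.severity or 0) < SEVERITY_THRESHOLD
        for result in response.categories_analysis
    )

response = client.analyze_text(AnalyzeTextOptions(text="user input here"))
if not is_allowed(response):
    print("Content rejected")
```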


@@ -0,0 +1,273 @@
---
name: azure-ai-contentunderstanding-py
description: |
Azure AI Content Understanding SDK for Python. Use for multimodal content extraction from documents, images, audio, and video.
Triggers: "azure-ai-contentunderstanding", "ContentUnderstandingClient", "multimodal analysis", "document extraction", "video analysis", "audio transcription".
package: azure-ai-contentunderstanding
---
# Azure AI Content Understanding SDK for Python
Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.
## Installation
```bash
pip install azure-ai-contentunderstanding
```
## Environment Variables
```bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
```
## Authentication
```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
```
## Core Workflow
Content Understanding operations are asynchronous long-running operations:
1. **Begin Analysis** — Start the analysis operation with `begin_analyze()` (returns a poller)
2. **Poll for Results** — Poll until analysis completes (SDK handles this with `.result()`)
3. **Process Results** — Extract structured results from `AnalyzeResult.contents`
## Prebuilt Analyzers
| Analyzer | Content Type | Purpose |
|----------|--------------|---------|
| `prebuilt-documentSearch` | Documents | Extract markdown for RAG applications |
| `prebuilt-imageSearch` | Images | Extract content from images |
| `prebuilt-audioSearch` | Audio | Transcribe audio with timing |
| `prebuilt-videoSearch` | Video | Extract frames, transcripts, summaries |
| `prebuilt-invoice` | Documents | Extract invoice fields |
## Analyze Document
```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)

# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)
result = poller.result()
# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
```
## Access Document Content Details
```python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent
content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
```
## Analyze Image
```python
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
```
## Analyze Video
```python
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)
result = poller.result()

# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")

# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
```
## Analyze Audio
```python
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)
result = poller.result()

# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
```
## Custom Analyzers
Create custom analyzers with field schemas for specialized extraction:
```python
# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)

# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)
result = poller.result()

# Access extracted fields
print(result.fields["vendor_name"])
print(result.fields["invoice_total"])
```
## Analyzer Management
```python
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")

# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
```
## Async Client
```python
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential
async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
    credential = DefaultAzureCredential()
    async with ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown

asyncio.run(analyze_document())
```
## Content Types
| Class | For | Provides |
|-------|-----|----------|
| `DocumentContent` | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| `AudioVisualContent` | Audio, video files | Transcript phrases, timing, key frames |
Both derive from `MediaContent` which provides basic info and markdown representation.
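That makes `isinstance` dispatch a natural way to handle mixed results; a small sketch using the classes and result shape shown above:
```python
from azure.ai.contentunderstanding.models import AudioVisualContent, DocumentContent

content = result.contents[0]  # result from any begin_analyze() call above
if isinstance(content, DocumentContent):
    print(f"Document starts at page {content.start_page_number}")
elif isinstance(content, AudioVisualContent):
    for phrase in content.transcript_phrases:
        print(phrase.text)
print(content.markdown)  # available on every MediaContent
```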
## Model Imports
```python
from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)
```
## Client Types
| Client | Purpose |
|--------|---------|
| `ContentUnderstandingClient` | Sync client for all operations |
| `ContentUnderstandingClient` (aio) | Async client for all operations |
## Best Practices
1. **Use `begin_analyze` with `AnalyzeInput`** — this is the correct method signature
2. **Access results via `result.contents[0]`** — results are returned as a list
3. **Use prebuilt analyzers** for common scenarios (document/image/audio/video search)
4. **Create custom analyzers** only for domain-specific field extraction
5. **Use async client** for high-throughput scenarios with `azure.identity.aio` credentials
6. **Handle long-running operations** — video/audio analysis can take minutes (see the sketch after this list)
7. **Use URL sources** when possible to avoid upload overhead
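For practice 6, the poller returned by `begin_analyze` is a standard azure-core long-running-operation poller, so a bounded wait is one hedged option:
```python
# Bound how long to block on a long video/audio analysis
poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)
poller.wait(timeout=600)  # seconds
if not poller.done():
    raise TimeoutError("Analysis still running after 10 minutes")
result = poller.result()
```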


@@ -0,0 +1,271 @@
---
name: azure-ai-ml-py
description: |
Azure Machine Learning SDK v2 for Python. Use for ML workspaces, jobs, models, datasets, compute, and pipelines.
Triggers: "azure-ai-ml", "MLClient", "workspace", "model registry", "training jobs", "datasets".
package: azure-ai-ml
---
# Azure Machine Learning SDK v2 for Python
Client library for managing Azure ML resources: workspaces, jobs, models, data, and compute.
## Installation
```bash
pip install azure-ai-ml
```
## Environment Variables
```bash
AZURE_SUBSCRIPTION_ID=<your-subscription-id>
AZURE_RESOURCE_GROUP=<your-resource-group>
AZURE_ML_WORKSPACE_NAME=<your-workspace-name>
```
## Authentication
```python
import os
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    workspace_name=os.environ["AZURE_ML_WORKSPACE_NAME"]
)
```
### From Config File
```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
# Uses config.json in current directory or parent
ml_client = MLClient.from_config(
    credential=DefaultAzureCredential()
)
```
## Workspace Management
### Create Workspace
```python
from azure.ai.ml.entities import Workspace
ws = Workspace(
    name="my-workspace",
    location="eastus",
    display_name="My Workspace",
    description="ML workspace for experiments",
    tags={"purpose": "demo"}
)
ml_client.workspaces.begin_create(ws).result()
```
### List Workspaces
```python
for ws in ml_client.workspaces.list():
    print(f"{ws.name}: {ws.location}")
```
## Data Assets
### Register Data
```python
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
# Register a file
my_data = Data(
    name="my-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/train.csv",
    type=AssetTypes.URI_FILE,
    description="Training data"
)
ml_client.data.create_or_update(my_data)
```
### Register Folder
```python
my_data = Data(
    name="my-folder-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/",
    type=AssetTypes.URI_FOLDER
)
ml_client.data.create_or_update(my_data)
```
## Model Registry
### Register Model
```python
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
model = Model(
    name="my-model",
    version="1",
    path="./model/",
    type=AssetTypes.CUSTOM_MODEL,
    description="My trained model"
)
ml_client.models.create_or_update(model)
```
### List Models
```python
for model in ml_client.models.list(name="my-model"):
    print(f"{model.name} v{model.version}")
```
## Compute
### Create Compute Cluster
```python
from azure.ai.ml.entities import AmlCompute
cluster = AmlCompute(
    name="cpu-cluster",
    type="amlcompute",
    size="Standard_DS3_v2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=120
)
ml_client.compute.begin_create_or_update(cluster).result()
```
### List Compute
```python
for compute in ml_client.compute.list():
    print(f"{compute.name}: {compute.type}")
```
## Jobs
### Command Job
```python
from azure.ai.ml import command, Input
job = command(
    code="./src",
    command="python train.py --data ${{inputs.data}} --lr ${{inputs.learning_rate}}",
    inputs={
        "data": Input(type="uri_folder", path="azureml:my-dataset:1"),
        "learning_rate": 0.01
    },
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
    display_name="training-job"
)
returned_job = ml_client.jobs.create_or_update(job)
print(f"Job URL: {returned_job.studio_url}")
```
### Monitor Job
```python
ml_client.jobs.stream(returned_job.name)
```
## Pipelines
```python
from azure.ai.ml import dsl, Input, Output
from azure.ai.ml.entities import Pipeline

# prep_component and train_component are previously defined/loaded components
@dsl.pipeline(
    compute="cpu-cluster",
    description="Training pipeline"
)
def training_pipeline(data_input):
    prep_step = prep_component(data=data_input)
    train_step = train_component(
        data=prep_step.outputs.output_data,
        learning_rate=0.01
    )
    return {"model": train_step.outputs.model}

pipeline = training_pipeline(
    data_input=Input(type="uri_folder", path="azureml:my-dataset:1")
)
pipeline_job = ml_client.jobs.create_or_update(pipeline)
```
## Environments
### Create Custom Environment
```python
from azure.ai.ml.entities import Environment
env = Environment(
    name="my-env",
    version="1",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="./environment.yml"
)
ml_client.environments.create_or_update(env)
```
## Datastores
### List Datastores
```python
for ds in ml_client.datastores.list():
    print(f"{ds.name}: {ds.type}")
```
### Get Default Datastore
```python
default_ds = ml_client.datastores.get_default()
print(f"Default: {default_ds.name}")
```
## MLClient Operations
| Property | Operations |
|----------|------------|
| `workspaces` | create, get, list, delete |
| `jobs` | create_or_update, get, list, stream, cancel |
| `models` | create_or_update, get, list, archive |
| `data` | create_or_update, get, list |
| `compute` | begin_create_or_update, get, list, delete |
| `environments` | create_or_update, get, list |
| `datastores` | create_or_update, get, list, get_default |
| `components` | create_or_update, get, list |
## Best Practices
1. **Use versioning** for data, models, and environments
2. **Configure idle scale-down** to reduce compute costs
3. **Use environments** for reproducible training
4. **Stream job logs** to monitor progress
5. **Register models** after successful training jobs
6. **Use pipelines** for multi-step workflows
7. **Tag resources** for organization and cost tracking


@@ -0,0 +1,295 @@
---
name: azure-ai-projects-py
description: Build AI applications using the Azure AI Projects Python SDK (azure-ai-projects). Use when working with Foundry project clients, creating versioned agents with PromptAgentDefinition, running evaluations, managing connections/deployments/datasets/indexes, or using OpenAI-compatible clients. This is the high-level Foundry SDK - for low-level agent operations, use azure-ai-agents-python skill.
package: azure-ai-projects
---
# Azure AI Projects Python SDK (Foundry SDK)
Build AI applications on Microsoft Foundry using the `azure-ai-projects` SDK.
## Installation
```bash
pip install azure-ai-projects azure-identity
```
## Environment Variables
```bash
AZURE_AI_PROJECT_ENDPOINT="https://<resource>.services.ai.azure.com/api/projects/<project>"
AZURE_AI_MODEL_DEPLOYMENT_NAME="gpt-4o-mini"
```
## Authentication
```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
credential = DefaultAzureCredential()
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=credential,
)
```
## Client Operations Overview
| Operation | Access | Purpose |
|-----------|--------|---------|
| `client.agents` | `.agents.*` | Agent CRUD, versions, threads, runs |
| `client.connections` | `.connections.*` | List/get project connections |
| `client.deployments` | `.deployments.*` | List model deployments |
| `client.datasets` | `.datasets.*` | Dataset management |
| `client.indexes` | `.indexes.*` | Index management |
| `client.evaluations` | `.evaluations.*` | Run evaluations |
| `client.red_teams` | `.red_teams.*` | Red team operations |
## Two Client Approaches
### 1. AIProjectClient (Native Foundry)
```python
from azure.ai.projects import AIProjectClient
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

# Use Foundry-native operations
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="my-agent",
    instructions="You are helpful.",
)
```
### 2. OpenAI-Compatible Client
```python
# Get OpenAI-compatible client from project
openai_client = client.get_openai_client()
# Use standard OpenAI API
response = openai_client.chat.completions.create(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    messages=[{"role": "user", "content": "Hello!"}],
)
```
## Agent Operations
### Create Agent (Basic)
```python
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="my-agent",
    instructions="You are a helpful assistant.",
)
```
### Create Agent with Tools
```python
from azure.ai.agents import CodeInterpreterTool, FileSearchTool
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="tool-agent",
    instructions="You can execute code and search files.",
    tools=[CodeInterpreterTool(), FileSearchTool()],
)
```
### Versioned Agents with PromptAgentDefinition
```python
from azure.ai.projects.models import PromptAgentDefinition
# Create a versioned agent
agent_version = client.agents.create_version(
    agent_name="customer-support-agent",
    definition=PromptAgentDefinition(
        model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
        instructions="You are a customer support specialist.",
        tools=[],  # Add tools as needed
    ),
    version_label="v1.0",
)
```
See [references/agents.md](references/agents.md) for detailed agent patterns.
## Tools Overview
| Tool | Class | Use Case |
|------|-------|----------|
| Code Interpreter | `CodeInterpreterTool` | Execute Python, generate files |
| File Search | `FileSearchTool` | RAG over uploaded documents |
| Bing Grounding | `BingGroundingTool` | Web search (requires connection) |
| Azure AI Search | `AzureAISearchTool` | Search your indexes |
| Function Calling | `FunctionTool` | Call your Python functions |
| OpenAPI | `OpenApiTool` | Call REST APIs |
| MCP | `McpTool` | Model Context Protocol servers |
| Memory Search | `MemorySearchTool` | Search agent memory stores |
| SharePoint | `SharepointGroundingTool` | Search SharePoint content |
See [references/tools.md](references/tools.md) for all tool patterns.
## Thread and Message Flow
```python
# 1. Create thread
thread = client.agents.threads.create()

# 2. Add message
client.agents.messages.create(
    thread_id=thread.id,
    role="user",
    content="What's the weather like?",
)

# 3. Create and process run
run = client.agents.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
)

# 4. Get response
if run.status == "completed":
    messages = client.agents.messages.list(thread_id=thread.id)
    for msg in messages:
        if msg.role == "assistant":
            print(msg.content[0].text.value)
```
## Connections
```python
# List all connections
connections = client.connections.list()
for conn in connections:
    print(f"{conn.name}: {conn.connection_type}")

# Get specific connection
connection = client.connections.get(connection_name="my-search-connection")
```
See [references/connections.md](references/connections.md) for connection patterns.
## Deployments
```python
# List available model deployments
deployments = client.deployments.list()
for deployment in deployments:
    print(f"{deployment.name}: {deployment.model}")
```
See [references/deployments.md](references/deployments.md) for deployment patterns.
## Datasets and Indexes
```python
# List datasets
datasets = client.datasets.list()
# List indexes
indexes = client.indexes.list()
```
See [references/datasets-indexes.md](references/datasets-indexes.md) for data operations.
## Evaluation
```python
# Using OpenAI client for evals
openai_client = client.get_openai_client()
# Create evaluation with built-in evaluators
eval_run = openai_client.evals.runs.create(
    eval_id="my-eval",
    name="quality-check",
    data_source={
        "type": "custom",
        "item_references": [{"item_id": "test-1"}],
    },
    testing_criteria=[
        {"type": "fluency"},
        {"type": "task_adherence"},
    ],
)
```
See [references/evaluation.md](references/evaluation.md) for evaluation patterns.
## Async Client
```python
from azure.ai.projects.aio import AIProjectClient
from azure.identity.aio import DefaultAzureCredential

async with AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
) as client:
    agent = await client.agents.create_agent(...)
    # ... async operations
```
See [references/async-patterns.md](references/async-patterns.md) for async patterns.
## Memory Stores
```python
# Create memory store for agent
memory_store = client.agents.create_memory_store(
    name="conversation-memory",
)

# Attach to agent for persistent memory
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="memory-agent",
    tools=[MemorySearchTool()],
    tool_resources={"memory": {"store_ids": [memory_store.id]}},
)
```
## Best Practices
1. **Use context managers** for async client: `async with AIProjectClient(...) as client:`
2. **Clean up agents** when done: `client.agents.delete_agent(agent.id)`
3. **Use `create_and_process`** for simple runs, **streaming** for real-time UX
4. **Use versioned agents** for production deployments
5. **Prefer connections** for external service integration (AI Search, Bing, etc.)
## SDK Comparison
| Feature | `azure-ai-projects` | `azure-ai-agents` |
|---------|---------------------|-------------------|
| Level | High-level (Foundry) | Low-level (Agents) |
| Client | `AIProjectClient` | `AgentsClient` |
| Versioning | `create_version()` | Not available |
| Connections | Yes | No |
| Deployments | Yes | No |
| Datasets/Indexes | Yes | No |
| Evaluation | Via OpenAI client | No |
| When to use | Full Foundry integration | Standalone agent apps |
## Reference Files
- [references/agents.md](references/agents.md): Agent operations with PromptAgentDefinition
- [references/tools.md](references/tools.md): All agent tools with examples
- [references/evaluation.md](references/evaluation.md): Evaluation operations overview
- [references/built-in-evaluators.md](references/built-in-evaluators.md): Complete built-in evaluator reference
- [references/custom-evaluators.md](references/custom-evaluators.md): Code and prompt-based evaluator patterns
- [references/connections.md](references/connections.md): Connection operations
- [references/deployments.md](references/deployments.md): Deployment enumeration
- [references/datasets-indexes.md](references/datasets-indexes.md): Dataset and index operations
- [references/async-patterns.md](references/async-patterns.md): Async client usage
- [references/api-reference.md](references/api-reference.md): Complete API reference for all 373 SDK exports (v2.0.0b4)
- [scripts/run_batch_evaluation.py](scripts/run_batch_evaluation.py): CLI tool for batch evaluations


@@ -0,0 +1,528 @@
---
name: azure-search-documents-py
description: |
Azure AI Search SDK for Python. Use for vector search, hybrid search, semantic ranking, indexing, and skillsets.
Triggers: "azure-search-documents", "SearchClient", "SearchIndexClient", "vector search", "hybrid search", "semantic search".
package: azure-search-documents
---
# Azure AI Search SDK for Python
Full-text, vector, and hybrid search with AI enrichment capabilities.
## Installation
```bash
pip install azure-search-documents
```
## Environment Variables
```bash
AZURE_SEARCH_ENDPOINT=https://<service-name>.search.windows.net
AZURE_SEARCH_API_KEY=<your-api-key>
AZURE_SEARCH_INDEX_NAME=<your-index-name>
```
## Authentication
### API Key
```python
import os
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX_NAME"],
    credential=AzureKeyCredential(os.environ["AZURE_SEARCH_API_KEY"])
)
```
### Entra ID (Recommended)
```python
import os
from azure.search.documents import SearchClient
from azure.identity import DefaultAzureCredential

client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX_NAME"],
    credential=DefaultAzureCredential()
)
```
## Client Types
| Client | Purpose |
|--------|---------|
| `SearchClient` | Search and document operations |
| `SearchIndexClient` | Index management, synonym maps |
| `SearchIndexerClient` | Indexers, data sources, skillsets |
## Create Index with Vector Field
```python
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    SearchFieldDataType,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
    SearchableField,
    SimpleField
)

index_client = SearchIndexClient(endpoint, AzureKeyCredential(key))

fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchableField(name="title", type=SearchFieldDataType.String),
    SearchableField(name="content", type=SearchFieldDataType.String),
    SearchField(
        name="content_vector",
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        vector_search_dimensions=1536,
        vector_search_profile_name="my-vector-profile"
    )
]

vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(name="my-hnsw")
    ],
    profiles=[
        VectorSearchProfile(
            name="my-vector-profile",
            algorithm_configuration_name="my-hnsw"
        )
    ]
)

index = SearchIndex(
    name="my-index",
    fields=fields,
    vector_search=vector_search
)
index_client.create_or_update_index(index)
```
## Upload Documents
```python
from azure.search.documents import SearchClient
client = SearchClient(endpoint, "my-index", AzureKeyCredential(key))
documents = [
    {
        "id": "1",
        "title": "Azure AI Search",
        "content": "Full-text and vector search service",
        "content_vector": [0.1, 0.2, ...]  # 1536 dimensions
    }
]
result = client.upload_documents(documents)
print(f"Uploaded {len(result)} documents")
```
## Keyword Search
```python
results = client.search(
    search_text="azure search",
    select=["id", "title", "content"],
    top=10
)
for result in results:
    print(f"{result['title']}: {result['@search.score']}")
```
## Vector Search
```python
from azure.search.documents.models import VectorizedQuery
# Your query embedding (1536 dimensions); get_embedding is your own
# embedding helper (e.g. an Azure OpenAI embeddings call)
query_vector = get_embedding("semantic search capabilities")

vector_query = VectorizedQuery(
    vector=query_vector,
    k_nearest_neighbors=10,
    fields="content_vector"
)
results = client.search(
    vector_queries=[vector_query],
    select=["id", "title", "content"]
)
for result in results:
    print(f"{result['title']}: {result['@search.score']}")
```
## Hybrid Search (Vector + Keyword)
```python
from azure.search.documents.models import VectorizedQuery
vector_query = VectorizedQuery(
    vector=query_vector,
    k_nearest_neighbors=10,
    fields="content_vector"
)
results = client.search(
    search_text="azure search",
    vector_queries=[vector_query],
    select=["id", "title", "content"],
    top=10
)
```
## Semantic Ranking
```python
from azure.search.documents.models import QueryType
results = client.search(
    search_text="what is azure search",
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name="my-semantic-config",
    select=["id", "title", "content"],
    top=10
)
for result in results:
    print(f"{result['title']}")
    if result.get("@search.captions"):
        print(f"  Caption: {result['@search.captions'][0].text}")
```
## Filters
```python
results = client.search(
    search_text="*",
    filter="category eq 'Technology' and rating gt 4",
    order_by=["rating desc"],
    select=["id", "title", "category", "rating"]
)
```
## Facets
```python
results = client.search(
    search_text="*",
    facets=["category,count:10", "rating"],
    top=0  # Only get facets, no documents
)
for facet_name, facet_values in results.get_facets().items():
    print(f"{facet_name}:")
    for facet in facet_values:
        print(f"  {facet['value']}: {facet['count']}")
```
## Autocomplete & Suggest
```python
# Autocomplete
results = client.autocomplete(
    search_text="sea",
    suggester_name="my-suggester",
    mode="twoTerms"
)

# Suggest
results = client.suggest(
    search_text="sea",
    suggester_name="my-suggester",
    select=["title"]
)
```
## Indexer with Skillset
```python
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexer,
    SearchIndexerDataSourceConnection,
    SearchIndexerSkillset,
    EntityRecognitionSkill,
    InputFieldMappingEntry,
    OutputFieldMappingEntry
)

indexer_client = SearchIndexerClient(endpoint, AzureKeyCredential(key))

# Create data source
data_source = SearchIndexerDataSourceConnection(
    name="my-datasource",
    type="azureblob",
    connection_string=connection_string,
    container={"name": "documents"}
)
indexer_client.create_or_update_data_source_connection(data_source)

# Create skillset
skillset = SearchIndexerSkillset(
    name="my-skillset",
    skills=[
        EntityRecognitionSkill(
            inputs=[InputFieldMappingEntry(name="text", source="/document/content")],
            outputs=[OutputFieldMappingEntry(name="organizations", target_name="organizations")]
        )
    ]
)
indexer_client.create_or_update_skillset(skillset)

# Create indexer
indexer = SearchIndexer(
    name="my-indexer",
    data_source_name="my-datasource",
    target_index_name="my-index",
    skillset_name="my-skillset"
)
indexer_client.create_or_update_indexer(indexer)
```
## Best Practices
1. **Use hybrid search** for best relevance combining vector and keyword
2. **Enable semantic ranking** for natural language queries
3. **Index in batches** of 100-1000 documents for efficiency
4. **Use filters** to narrow results before ranking
5. **Configure vector dimensions** to match your embedding model
6. **Use HNSW algorithm** for large-scale vector search
7. **Create suggesters** at index creation time (cannot add later)
## Reference Files
| File | Contents |
|------|----------|
| [references/vector-search.md](references/vector-search.md) | HNSW configuration, integrated vectorization, multi-vector queries |
| [references/semantic-ranking.md](references/semantic-ranking.md) | Semantic configuration, captions, answers, hybrid patterns |
| [scripts/setup_vector_index.py](scripts/setup_vector_index.py) | CLI script to create vector-enabled search index |
---
## Additional Azure AI Search Patterns
Write clean, idiomatic Python code for Azure AI Search using `azure-search-documents`.
## Installation
```bash
pip install azure-search-documents azure-identity
```
## Environment Variables
```bash
AZURE_SEARCH_ENDPOINT=https://<search-service>.search.windows.net
AZURE_SEARCH_INDEX_NAME=<index-name>
# For API key auth (not recommended for production)
AZURE_SEARCH_API_KEY=<api-key>
```
## Authentication
**DefaultAzureCredential (preferred)**:
```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
credential = DefaultAzureCredential()
client = SearchClient(endpoint, index_name, credential)
```
**API Key**:
```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
client = SearchClient(endpoint, index_name, AzureKeyCredential(api_key))
```
## Client Selection
| Client | Purpose |
|--------|---------|
| `SearchClient` | Query indexes, upload/update/delete documents |
| `SearchIndexClient` | Create/manage indexes, knowledge sources, knowledge bases |
| `SearchIndexerClient` | Manage indexers, skillsets, data sources |
| `KnowledgeBaseRetrievalClient` | Agentic retrieval with LLM-powered Q&A |
## Index Creation Pattern
```python
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
SearchIndex, SearchField, VectorSearch, VectorSearchProfile,
HnswAlgorithmConfiguration, AzureOpenAIVectorizer,
AzureOpenAIVectorizerParameters, SemanticSearch,
SemanticConfiguration, SemanticPrioritizedFields, SemanticField
)
index = SearchIndex(
name=index_name,
fields=[
SearchField(name="id", type="Edm.String", key=True),
SearchField(name="content", type="Edm.String", searchable=True),
SearchField(name="embedding", type="Collection(Edm.Single)",
vector_search_dimensions=3072,
vector_search_profile_name="vector-profile"),
],
vector_search=VectorSearch(
profiles=[VectorSearchProfile(
name="vector-profile",
algorithm_configuration_name="hnsw-algo",
vectorizer_name="openai-vectorizer"
)],
algorithms=[HnswAlgorithmConfiguration(name="hnsw-algo")],
vectorizers=[AzureOpenAIVectorizer(
vectorizer_name="openai-vectorizer",
parameters=AzureOpenAIVectorizerParameters(
resource_url=aoai_endpoint,
deployment_name=embedding_deployment,
model_name=embedding_model
)
)]
),
semantic_search=SemanticSearch(
default_configuration_name="semantic-config",
configurations=[SemanticConfiguration(
name="semantic-config",
prioritized_fields=SemanticPrioritizedFields(
content_fields=[SemanticField(field_name="content")]
)
)]
)
)
index_client = SearchIndexClient(endpoint, credential)
index_client.create_or_update_index(index)
```
## Document Operations
```python
from azure.search.documents import SearchIndexingBufferedSender
# Batch upload with automatic batching
with SearchIndexingBufferedSender(endpoint, index_name, credential) as sender:
    sender.upload_documents(documents)

# Direct operations via SearchClient
search_client = SearchClient(endpoint, index_name, credential)
search_client.upload_documents(documents)           # Add new
search_client.merge_documents(documents)            # Update existing
search_client.merge_or_upload_documents(documents)  # Upsert
search_client.delete_documents(documents)           # Remove
```
## Search Patterns
```python
# Basic search
results = search_client.search(search_text="query")
# Vector search
from azure.search.documents.models import VectorizedQuery
results = search_client.search(
search_text=None,
vector_queries=[VectorizedQuery(
vector=embedding,
k_nearest_neighbors=5,
fields="embedding"
)]
)
# Hybrid search (vector + keyword)
results = search_client.search(
search_text="query",
vector_queries=[VectorizedQuery(vector=embedding, k_nearest_neighbors=5, fields="embedding")],
query_type="semantic",
semantic_configuration_name="semantic-config"
)
# With filters
results = search_client.search(
search_text="query",
filter="category eq 'technology'",
select=["id", "title", "content"],
top=10
)
```
## Agentic Retrieval (Knowledge Bases)
For LLM-powered Q&A with answer synthesis, see [references/agentic-retrieval.md](references/agentic-retrieval.md).
Key concepts:
- **Knowledge Source**: Points to a search index
- **Knowledge Base**: Wraps knowledge sources + LLM for query planning and synthesis
- **Output modes**: `EXTRACTIVE_DATA` (raw chunks) or `ANSWER_SYNTHESIS` (LLM-generated answers)
## Async Pattern
```python
from azure.search.documents.aio import SearchClient
async with SearchClient(endpoint, index_name, credential) as client:
results = await client.search(search_text="query")
async for result in results:
print(result["title"])
```
## Best Practices
1. **Use environment variables** for endpoints, keys, and deployment names
2. **Prefer `DefaultAzureCredential`** over API keys for production
3. **Use `SearchIndexingBufferedSender`** for batch uploads (handles batching/retries)
4. **Always define semantic configuration** for agentic retrieval indexes
5. **Use `create_or_update_index`** for idempotent index creation
6. **Close clients** with context managers or explicit `close()`
## Field Types Reference
| EDM Type | Python | Notes |
|----------|--------|-------|
| `Edm.String` | str | Searchable text |
| `Edm.Int32` | int | Integer |
| `Edm.Int64` | int | Long integer |
| `Edm.Double` | float | Floating point |
| `Edm.Boolean` | bool | True/False |
| `Edm.DateTimeOffset` | datetime | ISO 8601 |
| `Collection(Edm.Single)` | List[float] | Vector embeddings |
| `Collection(Edm.String)` | List[str] | String arrays |
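As a sketch, a field list exercising several of these types (field names are illustrative; `SearchField` is imported as in the index creation pattern above):

```python
fields = [
    SearchField(name="id", type="Edm.String", key=True),
    SearchField(name="title", type="Edm.String", searchable=True),
    SearchField(name="rating", type="Edm.Double", filterable=True, sortable=True),
    SearchField(name="published", type="Edm.DateTimeOffset", filterable=True),
    SearchField(name="tags", type="Collection(Edm.String)", filterable=True, facetable=True),
    SearchField(name="embedding", type="Collection(Edm.Single)",
                vector_search_dimensions=3072,
                vector_search_profile_name="vector-profile"),
]
```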
## Error Handling
```python
from azure.core.exceptions import (
HttpResponseError,
ResourceNotFoundError,
ResourceExistsError
)
try:
result = search_client.get_document(key="123")
except ResourceNotFoundError:
print("Document not found")
except HttpResponseError as e:
print(f"Search error: {e.message}")
```

@@ -0,0 +1,372 @@
---
name: azure-speech-to-text-rest-py
description: |
Azure Speech to Text REST API for short audio (Python). Use for simple speech recognition of audio files up to 60 seconds without the Speech SDK.
Triggers: "speech to text REST", "short audio transcription", "speech recognition REST API", "STT REST", "recognize speech REST".
DO NOT USE FOR: Long audio (>60 seconds), real-time streaming, batch transcription, custom speech models, speech translation. Use Speech SDK or Batch Transcription API instead.
---
# Azure Speech to Text REST API for Short Audio
Simple REST API for speech-to-text transcription of short audio files (up to 60 seconds). No SDK required - just HTTP requests.
## Prerequisites
1. **Azure subscription** - [Create one free](https://azure.microsoft.com/free/)
2. **Speech resource** - Create in [Azure Portal](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices)
3. **Get credentials** - After deployment, go to resource > Keys and Endpoint
## Environment Variables
```bash
# Required
AZURE_SPEECH_KEY=<your-speech-resource-key>
AZURE_SPEECH_REGION=<region> # e.g., eastus, westus2, westeurope
# Alternative: Use endpoint directly
AZURE_SPEECH_ENDPOINT=https://<region>.stt.speech.microsoft.com
```
## Installation
```bash
pip install requests
```
## Quick Start
```python
import os
import requests
def transcribe_audio(audio_file_path: str, language: str = "en-US") -> dict:
"""Transcribe short audio file (max 60 seconds) using REST API."""
region = os.environ["AZURE_SPEECH_REGION"]
api_key = os.environ["AZURE_SPEECH_KEY"]
url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"
headers = {
"Ocp-Apim-Subscription-Key": api_key,
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
"Accept": "application/json"
}
params = {
"language": language,
"format": "detailed" # or "simple"
}
with open(audio_file_path, "rb") as audio_file:
response = requests.post(url, headers=headers, params=params, data=audio_file)
response.raise_for_status()
return response.json()
# Usage
result = transcribe_audio("audio.wav", "en-US")
print(result["DisplayText"])
```
## Audio Requirements
| Format | Codec | Sample Rate | Notes |
|--------|-------|-------------|-------|
| WAV | PCM | 16 kHz, mono | **Recommended** |
| OGG | OPUS | 16 kHz, mono | Smaller file size |
**Limitations:**
- Maximum 60 seconds of audio
- For pronunciation assessment: maximum 30 seconds
- No partial/interim results (final only)
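If your source audio is in another format, convert it before sending. A minimal sketch, assuming `pydub` (and its ffmpeg dependency) is installed:

```python
from pydub import AudioSegment

# Normalize arbitrary input audio to WAV PCM, 16 kHz, mono, 16-bit samples
audio = AudioSegment.from_file("input.mp3")
audio = audio.set_frame_rate(16000).set_channels(1).set_sample_width(2)
audio.export("audio.wav", format="wav")
```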
## Content-Type Headers
```python
# WAV PCM 16kHz
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000"
# OGG OPUS
"Content-Type": "audio/ogg; codecs=opus"
```
## Response Formats
### Simple Format (default)
```python
params = {"language": "en-US", "format": "simple"}
```
```json
{
"RecognitionStatus": "Success",
"DisplayText": "Remind me to buy 5 pencils.",
"Offset": "1236645672289",
"Duration": "1236645672289"
}
```
### Detailed Format
```python
params = {"language": "en-US", "format": "detailed"}
```
```json
{
"RecognitionStatus": "Success",
"Offset": "1236645672289",
"Duration": "1236645672289",
"NBest": [
{
"Confidence": 0.9052885,
"Display": "What's the weather like?",
"ITN": "what's the weather like",
"Lexical": "what's the weather like",
"MaskedITN": "what's the weather like"
}
]
}
```
## Chunked Transfer (Recommended)
For lower latency, stream audio in chunks:
```python
import os
import requests
def transcribe_chunked(audio_file_path: str, language: str = "en-US") -> dict:
"""Stream audio in chunks for lower latency."""
region = os.environ["AZURE_SPEECH_REGION"]
api_key = os.environ["AZURE_SPEECH_KEY"]
url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"
headers = {
"Ocp-Apim-Subscription-Key": api_key,
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
"Accept": "application/json",
"Transfer-Encoding": "chunked",
"Expect": "100-continue"
}
params = {"language": language, "format": "detailed"}
def generate_chunks(file_path: str, chunk_size: int = 1024):
with open(file_path, "rb") as f:
while chunk := f.read(chunk_size):
yield chunk
response = requests.post(
url,
headers=headers,
params=params,
data=generate_chunks(audio_file_path)
)
response.raise_for_status()
return response.json()
```
## Authentication Options
### Option 1: Subscription Key (Simple)
```python
headers = {
"Ocp-Apim-Subscription-Key": os.environ["AZURE_SPEECH_KEY"]
}
```
### Option 2: Bearer Token
```python
import requests
import os
def get_access_token() -> str:
"""Get access token from the token endpoint."""
region = os.environ["AZURE_SPEECH_REGION"]
api_key = os.environ["AZURE_SPEECH_KEY"]
token_url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
response = requests.post(
token_url,
headers={
"Ocp-Apim-Subscription-Key": api_key,
"Content-Type": "application/x-www-form-urlencoded",
"Content-Length": "0"
}
)
response.raise_for_status()
return response.text
# Use token in requests (valid for 10 minutes)
token = get_access_token()
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
"Accept": "application/json"
}
```
## Query Parameters
| Parameter | Required | Values | Description |
|-----------|----------|--------|-------------|
| `language` | **Yes** | `en-US`, `de-DE`, etc. | Language of speech |
| `format` | No | `simple`, `detailed` | Result format (default: simple) |
| `profanity` | No | `masked`, `removed`, `raw` | Profanity handling (default: masked) |
## Recognition Status Values
| Status | Description |
|--------|-------------|
| `Success` | Recognition succeeded |
| `NoMatch` | Speech detected but no words matched |
| `InitialSilenceTimeout` | Only silence detected |
| `BabbleTimeout` | Only noise detected |
| `Error` | Internal service error |
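A sketch of branching on these values, reusing `transcribe_audio` from the Quick Start:

```python
result = transcribe_audio("audio.wav", "en-US")
status = result.get("RecognitionStatus")

if status == "Success":
    print(result["DisplayText"])
elif status in ("NoMatch", "InitialSilenceTimeout", "BabbleTimeout"):
    print(f"No usable speech detected: {status}")
else:  # "Error"
    print("Service error - consider retrying with backoff")
```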
## Profanity Handling
```python
# Mask profanity with asterisks (default)
params = {"language": "en-US", "profanity": "masked"}
# Remove profanity entirely
params = {"language": "en-US", "profanity": "removed"}
# Include profanity as-is
params = {"language": "en-US", "profanity": "raw"}
```
## Error Handling
```python
import os
import requests
def transcribe_with_error_handling(audio_path: str, language: str = "en-US") -> dict | None:
"""Transcribe with proper error handling."""
region = os.environ["AZURE_SPEECH_REGION"]
api_key = os.environ["AZURE_SPEECH_KEY"]
url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"
try:
with open(audio_path, "rb") as audio_file:
response = requests.post(
url,
headers={
"Ocp-Apim-Subscription-Key": api_key,
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
"Accept": "application/json"
},
params={"language": language, "format": "detailed"},
data=audio_file
)
if response.status_code == 200:
result = response.json()
if result.get("RecognitionStatus") == "Success":
return result
else:
print(f"Recognition failed: {result.get('RecognitionStatus')}")
return None
        elif response.status_code == 400:
            print("Bad request: check language code or audio format")
        elif response.status_code == 401:
            print("Unauthorized: check API key or token")
        elif response.status_code == 403:
            print("Forbidden: missing authorization header")
else:
print(f"Error {response.status_code}: {response.text}")
return None
except requests.exceptions.RequestException as e:
print(f"Request failed: {e}")
return None
```
## Async Version
```python
import os
import aiohttp
import asyncio
async def transcribe_async(audio_file_path: str, language: str = "en-US") -> dict:
"""Async version using aiohttp."""
region = os.environ["AZURE_SPEECH_REGION"]
api_key = os.environ["AZURE_SPEECH_KEY"]
url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"
headers = {
"Ocp-Apim-Subscription-Key": api_key,
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
"Accept": "application/json"
}
params = {"language": language, "format": "detailed"}
async with aiohttp.ClientSession() as session:
with open(audio_file_path, "rb") as f:
audio_data = f.read()
async with session.post(url, headers=headers, params=params, data=audio_data) as response:
response.raise_for_status()
return await response.json()
# Usage
result = asyncio.run(transcribe_async("audio.wav", "en-US"))
print(result["DisplayText"])
```
## Supported Languages
Common language codes (see [full list](https://learn.microsoft.com/azure/ai-services/speech-service/language-support)):
| Code | Language |
|------|----------|
| `en-US` | English (US) |
| `en-GB` | English (UK) |
| `de-DE` | German |
| `fr-FR` | French |
| `es-ES` | Spanish (Spain) |
| `es-MX` | Spanish (Mexico) |
| `zh-CN` | Chinese (Mandarin) |
| `ja-JP` | Japanese |
| `ko-KR` | Korean |
| `pt-BR` | Portuguese (Brazil) |
## Best Practices
1. **Use WAV PCM 16kHz mono** for best compatibility
2. **Enable chunked transfer** for lower latency
3. **Cache access tokens** for about 9 minutes (they are valid for 10; see the sketch below)
4. **Specify the correct language** for accurate recognition
5. **Use detailed format** when you need confidence scores
6. **Handle all RecognitionStatus values** in production code
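A minimal token-cache sketch built on `get_access_token` from the Bearer Token section above:

```python
import time

_token = {"value": None, "expires_at": 0.0}

def get_cached_token() -> str:
    """Return a cached bearer token, refreshing ~1 minute before expiry."""
    if time.time() >= _token["expires_at"]:
        _token["value"] = get_access_token()
        _token["expires_at"] = time.time() + 9 * 60  # tokens are valid for 10 minutes
    return _token["value"]
```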
## When NOT to Use This API
Use the Speech SDK or Batch Transcription API instead when you need:
- Audio longer than 60 seconds
- Real-time streaming transcription
- Partial/interim results
- Speech translation
- Custom speech models
- Batch transcription of many files
## Reference Files
| File | Contents |
|------|----------|
| [references/pronunciation-assessment.md](references/pronunciation-assessment.md) | Pronunciation assessment parameters and scoring |
@@ -0,0 +1,227 @@
---
name: azure-ai-textanalytics-py
description: |
Azure AI Text Analytics SDK for sentiment analysis, entity recognition, key phrases, language detection, PII, and healthcare NLP. Use for natural language processing on text.
Triggers: "text analytics", "sentiment analysis", "entity recognition", "key phrase", "PII detection", "TextAnalyticsClient".
package: azure-ai-textanalytics
---
# Azure AI Text Analytics SDK for Python
Client library for Azure AI Language service NLP capabilities including sentiment, entities, key phrases, and more.
## Installation
```bash
pip install azure-ai-textanalytics
```
## Environment Variables
```bash
AZURE_LANGUAGE_ENDPOINT=https://<resource>.cognitiveservices.azure.com
AZURE_LANGUAGE_KEY=<your-api-key> # If using API key
```
## Authentication
### API Key
```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient
endpoint = os.environ["AZURE_LANGUAGE_ENDPOINT"]
key = os.environ["AZURE_LANGUAGE_KEY"]
client = TextAnalyticsClient(endpoint, AzureKeyCredential(key))
```
### Entra ID (Recommended)
```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.identity import DefaultAzureCredential
client = TextAnalyticsClient(
endpoint=os.environ["AZURE_LANGUAGE_ENDPOINT"],
credential=DefaultAzureCredential()
)
```
## Sentiment Analysis
```python
documents = [
"I had a wonderful trip to Seattle last week!",
"The food was terrible and the service was slow."
]
result = client.analyze_sentiment(documents, show_opinion_mining=True)
for doc in result:
if not doc.is_error:
print(f"Sentiment: {doc.sentiment}")
print(f"Scores: pos={doc.confidence_scores.positive:.2f}, "
f"neg={doc.confidence_scores.negative:.2f}, "
f"neu={doc.confidence_scores.neutral:.2f}")
# Opinion mining (aspect-based sentiment)
for sentence in doc.sentences:
for opinion in sentence.mined_opinions:
target = opinion.target
print(f" Target: '{target.text}' - {target.sentiment}")
for assessment in opinion.assessments:
print(f" Assessment: '{assessment.text}' - {assessment.sentiment}")
```
## Entity Recognition
```python
documents = ["Microsoft was founded by Bill Gates and Paul Allen in Albuquerque."]
result = client.recognize_entities(documents)
for doc in result:
if not doc.is_error:
for entity in doc.entities:
print(f"Entity: {entity.text}")
print(f" Category: {entity.category}")
print(f" Subcategory: {entity.subcategory}")
print(f" Confidence: {entity.confidence_score:.2f}")
```
## PII Detection
```python
documents = ["My SSN is 123-45-6789 and my email is john@example.com"]
result = client.recognize_pii_entities(documents)
for doc in result:
if not doc.is_error:
print(f"Redacted: {doc.redacted_text}")
for entity in doc.entities:
print(f"PII: {entity.text} ({entity.category})")
```
## Key Phrase Extraction
```python
documents = ["Azure AI provides powerful machine learning capabilities for developers."]
result = client.extract_key_phrases(documents)
for doc in result:
if not doc.is_error:
print(f"Key phrases: {doc.key_phrases}")
```
## Language Detection
```python
documents = ["Ce document est en francais.", "This is written in English."]
result = client.detect_language(documents)
for doc in result:
if not doc.is_error:
print(f"Language: {doc.primary_language.name} ({doc.primary_language.iso6391_name})")
print(f"Confidence: {doc.primary_language.confidence_score:.2f}")
```
## Healthcare Text Analytics
```python
documents = ["Patient has diabetes and was prescribed metformin 500mg twice daily."]
poller = client.begin_analyze_healthcare_entities(documents)
result = poller.result()
for doc in result:
if not doc.is_error:
for entity in doc.entities:
print(f"Entity: {entity.text}")
print(f" Category: {entity.category}")
print(f" Normalized: {entity.normalized_text}")
# Entity links (UMLS, etc.)
for link in entity.data_sources:
print(f" Link: {link.name} - {link.entity_id}")
```
## Multiple Analysis (Batch)
```python
from azure.ai.textanalytics import (
RecognizeEntitiesAction,
ExtractKeyPhrasesAction,
AnalyzeSentimentAction
)
documents = ["Microsoft announced new Azure AI features at Build conference."]
poller = client.begin_analyze_actions(
documents,
actions=[
RecognizeEntitiesAction(),
ExtractKeyPhrasesAction(),
AnalyzeSentimentAction()
]
)
results = poller.result()
for doc_results in results:
for result in doc_results:
if result.kind == "EntityRecognition":
print(f"Entities: {[e.text for e in result.entities]}")
elif result.kind == "KeyPhraseExtraction":
print(f"Key phrases: {result.key_phrases}")
elif result.kind == "SentimentAnalysis":
print(f"Sentiment: {result.sentiment}")
```
## Async Client
```python
from azure.ai.textanalytics.aio import TextAnalyticsClient
from azure.identity.aio import DefaultAzureCredential
async def analyze():
async with TextAnalyticsClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
) as client:
result = await client.analyze_sentiment(documents)
# Process results...
```
## Client Types
| Client | Purpose |
|--------|---------|
| `TextAnalyticsClient` | All text analytics operations |
| `TextAnalyticsClient` (aio) | Async version |
## Available Operations
| Method | Description |
|--------|-------------|
| `analyze_sentiment` | Sentiment analysis with opinion mining |
| `recognize_entities` | Named entity recognition |
| `recognize_pii_entities` | PII detection and redaction |
| `recognize_linked_entities` | Entity linking to Wikipedia |
| `extract_key_phrases` | Key phrase extraction |
| `detect_language` | Language detection |
| `begin_analyze_healthcare_entities` | Healthcare NLP (long-running) |
| `begin_analyze_actions` | Multiple analyses in batch |
## Best Practices
1. **Use batch operations** for multiple documents (up to 10 per request; see the sketch below)
2. **Enable opinion mining** for detailed aspect-based sentiment
3. **Use async client** for high-throughput scenarios
4. **Handle document errors** — results list may contain errors for some docs
5. **Specify language** when known to improve accuracy
6. **Use context manager** or close client explicitly
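A sketch of the batching pattern from item 1 (the service accepts at most 10 documents per call for these operations):

```python
def analyze_sentiment_all(client, documents, batch_size=10):
    """Run sentiment analysis over any number of documents, 10 per request."""
    successes = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        for doc in client.analyze_sentiment(batch):
            if doc.is_error:
                print(f"Document {doc.id} failed: {doc.error.message}")
            else:
                successes.append(doc)
    return successes
```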
@@ -0,0 +1,69 @@
---
name: azure-ai-transcription-py
description: |
Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
package: azure-ai-transcription
---
# Azure AI Transcription SDK for Python
Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.
## Installation
```bash
pip install azure-ai-transcription
```
## Environment Variables
```bash
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>
```
## Authentication
Use subscription key authentication (DefaultAzureCredential is not supported for this client):
```python
import os
from azure.ai.transcription import TranscriptionClient
client = TranscriptionClient(
endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
credential=os.environ["TRANSCRIPTION_KEY"]
)
```
## Transcription (Batch)
```python
job = client.begin_transcription(
name="meeting-transcription",
locale="en-US",
content_urls=["https://<storage>/audio.wav"],
diarization_enabled=True
)
result = job.result()
print(result.status)
```
## Transcription (Real-time)
```python
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
for event in stream:
print(event.text)
```
## Best Practices
1. **Enable diarization** when multiple speakers are present
2. **Use batch transcription** for long files stored in blob storage
3. **Capture timestamps** for subtitle generation
4. **Specify language** to improve recognition accuracy
5. **Handle streaming backpressure** for real-time transcription
6. **Close transcription sessions** when complete
@@ -0,0 +1,249 @@
---
name: azure-ai-translation-document-py
description: |
Azure AI Document Translation SDK for batch translation of documents with format preservation. Use for translating Word, PDF, Excel, PowerPoint, and other document formats at scale.
Triggers: "document translation", "batch translation", "translate documents", "DocumentTranslationClient".
package: azure-ai-translation-document
---
# Azure AI Document Translation SDK for Python
Client library for Azure AI Translator document translation service for batch document translation with format preservation.
## Installation
```bash
pip install azure-ai-translation-document
```
## Environment Variables
```bash
AZURE_DOCUMENT_TRANSLATION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
AZURE_DOCUMENT_TRANSLATION_KEY=<your-api-key> # If using API key
# Storage for source and target documents
AZURE_SOURCE_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
AZURE_TARGET_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
```
## Authentication
### API Key
```python
import os
from azure.ai.translation.document import DocumentTranslationClient
from azure.core.credentials import AzureKeyCredential
endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]
client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
```
### Entra ID (Recommended)
```python
from azure.ai.translation.document import DocumentTranslationClient
from azure.identity import DefaultAzureCredential
client = DocumentTranslationClient(
endpoint=os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"],
credential=DefaultAzureCredential()
)
```
## Basic Document Translation
```python
from azure.ai.translation.document import DocumentTranslationInput, TranslationTarget
source_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_url = os.environ["AZURE_TARGET_CONTAINER_URL"]
# Start translation job
poller = client.begin_translation(
inputs=[
DocumentTranslationInput(
source_url=source_url,
targets=[
TranslationTarget(
target_url=target_url,
language="es" # Translate to Spanish
)
]
)
]
)
# Wait for completion
result = poller.result()
print(f"Status: {poller.status()}")
print(f"Documents translated: {poller.details.documents_succeeded_count}")
print(f"Documents failed: {poller.details.documents_failed_count}")
```
## Multiple Target Languages
```python
poller = client.begin_translation(
inputs=[
DocumentTranslationInput(
source_url=source_url,
targets=[
TranslationTarget(target_url=target_url_es, language="es"),
TranslationTarget(target_url=target_url_fr, language="fr"),
TranslationTarget(target_url=target_url_de, language="de")
]
)
]
)
```
## Translate Single Document
```python
from azure.ai.translation.document import SingleDocumentTranslationClient
single_client = SingleDocumentTranslationClient(endpoint, AzureKeyCredential(key))
with open("document.docx", "rb") as f:
document_content = f.read()
result = single_client.translate(
body=document_content,
target_language="es",
content_type="application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)
# Save translated document
with open("document_es.docx", "wb") as f:
f.write(result)
```
## Check Translation Status
```python
# Get all translation operations
operations = client.list_translation_statuses()
for op in operations:
print(f"Operation ID: {op.id}")
print(f"Status: {op.status}")
print(f"Created: {op.created_on}")
print(f"Total documents: {op.documents_total_count}")
print(f"Succeeded: {op.documents_succeeded_count}")
print(f"Failed: {op.documents_failed_count}")
```
## List Document Statuses
```python
# Get status of individual documents in a job
operation_id = poller.id
document_statuses = client.list_document_statuses(operation_id)
for doc in document_statuses:
print(f"Document: {doc.source_document_url}")
print(f" Status: {doc.status}")
print(f" Translated to: {doc.translated_to}")
if doc.error:
print(f" Error: {doc.error.message}")
```
## Cancel Translation
```python
# Cancel a running translation
client.cancel_translation(operation_id)
```
## Using Glossary
```python
from azure.ai.translation.document import TranslationGlossary
poller = client.begin_translation(
inputs=[
DocumentTranslationInput(
source_url=source_url,
targets=[
TranslationTarget(
target_url=target_url,
language="es",
glossaries=[
TranslationGlossary(
glossary_url="https://<storage>.blob.core.windows.net/glossary/terms.csv?<sas>",
file_format="csv"
)
]
)
]
)
]
)
```
## List Supported Document Formats
```python
# Get supported formats
formats = client.get_supported_document_formats()
for fmt in formats:
print(f"Format: {fmt.format}")
print(f" Extensions: {fmt.file_extensions}")
print(f" Content types: {fmt.content_types}")
```
## Supported Languages
```python
# Get supported languages
languages = client.get_supported_languages()
for lang in languages:
print(f"Language: {lang.name} ({lang.code})")
```
## Async Client
```python
from azure.ai.translation.document.aio import DocumentTranslationClient
from azure.identity.aio import DefaultAzureCredential
async def translate_documents():
async with DocumentTranslationClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
) as client:
poller = await client.begin_translation(inputs=[...])
result = await poller.result()
```
## Supported Formats
| Category | Formats |
|----------|---------|
| Documents | DOCX, PDF, PPTX, XLSX, HTML, TXT, RTF |
| Structured | CSV, TSV, JSON, XML |
| Localization | XLIFF, XLF, MHTML |
## Storage Requirements
- Source and target containers must be Azure Blob Storage
- Use SAS tokens with appropriate permissions:
- Source: Read, List
- Target: Write, List
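A sketch of generating such SAS URLs, assuming `azure-storage-blob` is installed and you hold the storage account key (`account` and `key` are placeholders for your values):

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

def container_sas_url(account, container, account_key, permission):
    """Build a container URL with a short-lived, minimally scoped SAS token."""
    sas = generate_container_sas(
        account_name=account,
        container_name=container,
        account_key=account_key,
        permission=permission,
        expiry=datetime.now(timezone.utc) + timedelta(hours=2),
    )
    return f"https://{account}.blob.core.windows.net/{container}?{sas}"

source_url = container_sas_url(account, "source-docs", key,
                               ContainerSasPermissions(read=True, list=True))
target_url = container_sas_url(account, "translated-docs", key,
                               ContainerSasPermissions(write=True, list=True))
```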
## Best Practices
1. **Use SAS tokens** with minimal required permissions
2. **Monitor long-running operations** with `poller.status()`
3. **Handle document-level errors** by iterating document statuses
4. **Use glossaries** for domain-specific terminology
5. **Separate target containers** for each language
6. **Use async client** for multiple concurrent jobs
7. **Check supported formats** before submitting documents
@@ -0,0 +1,274 @@
---
name: azure-ai-translation-text-py
description: |
Azure AI Text Translation SDK for real-time text translation, transliteration, language detection, and dictionary lookup. Use for translating text content in applications.
Triggers: "text translation", "translator", "translate text", "transliterate", "TextTranslationClient".
package: azure-ai-translation-text
---
# Azure AI Text Translation SDK for Python
Client library for Azure AI Translator text translation service for real-time text translation, transliteration, and language operations.
## Installation
```bash
pip install azure-ai-translation-text
```
## Environment Variables
```bash
AZURE_TRANSLATOR_KEY=<your-api-key>
AZURE_TRANSLATOR_REGION=<your-region> # e.g., eastus, westus2
# Or use custom endpoint
AZURE_TRANSLATOR_ENDPOINT=https://<resource>.cognitiveservices.azure.com
```
## Authentication
### API Key with Region
```python
import os
from azure.ai.translation.text import TextTranslationClient
from azure.core.credentials import AzureKeyCredential
key = os.environ["AZURE_TRANSLATOR_KEY"]
region = os.environ["AZURE_TRANSLATOR_REGION"]
# Create credential with region
credential = AzureKeyCredential(key)
client = TextTranslationClient(credential=credential, region=region)
```
### API Key with Custom Endpoint
```python
endpoint = os.environ["AZURE_TRANSLATOR_ENDPOINT"]
client = TextTranslationClient(
credential=AzureKeyCredential(key),
endpoint=endpoint
)
```
### Entra ID (Recommended)
```python
from azure.ai.translation.text import TextTranslationClient
from azure.identity import DefaultAzureCredential
client = TextTranslationClient(
credential=DefaultAzureCredential(),
endpoint=os.environ["AZURE_TRANSLATOR_ENDPOINT"]
)
```
## Basic Translation
```python
# Translate to a single language
result = client.translate(
body=["Hello, how are you?", "Welcome to Azure!"],
to=["es"] # Spanish
)
for item in result:
for translation in item.translations:
print(f"Translated: {translation.text}")
print(f"Target language: {translation.to}")
```
## Translate to Multiple Languages
```python
result = client.translate(
body=["Hello, world!"],
to=["es", "fr", "de", "ja"] # Spanish, French, German, Japanese
)
for item in result:
print(f"Source: {item.detected_language.language if item.detected_language else 'unknown'}")
for translation in item.translations:
print(f" {translation.to}: {translation.text}")
```
## Specify Source Language
```python
result = client.translate(
body=["Bonjour le monde"],
from_parameter="fr", # Source is French
to=["en", "es"]
)
```
## Language Detection
```python
result = client.translate(
body=["Hola, como estas?"],
to=["en"]
)
for item in result:
if item.detected_language:
print(f"Detected language: {item.detected_language.language}")
print(f"Confidence: {item.detected_language.score:.2f}")
```
## Transliteration
Convert text from one script to another:
```python
result = client.transliterate(
body=["konnichiwa"],
language="ja",
from_script="Latn", # From Latin script
to_script="Jpan" # To Japanese script
)
for item in result:
print(f"Transliterated: {item.text}")
print(f"Script: {item.script}")
```
## Dictionary Lookup
Find alternate translations and definitions:
```python
result = client.lookup_dictionary_entries(
body=["fly"],
from_parameter="en",
to="es"
)
for item in result:
print(f"Source: {item.normalized_source} ({item.display_source})")
for translation in item.translations:
print(f" Translation: {translation.normalized_target}")
print(f" Part of speech: {translation.pos_tag}")
print(f" Confidence: {translation.confidence:.2f}")
```
## Dictionary Examples
Get usage examples for translations:
```python
from azure.ai.translation.text.models import DictionaryExampleTextItem
result = client.lookup_dictionary_examples(
body=[DictionaryExampleTextItem(text="fly", translation="volar")],
from_parameter="en",
to="es"
)
for item in result:
for example in item.examples:
print(f"Source: {example.source_prefix}{example.source_term}{example.source_suffix}")
print(f"Target: {example.target_prefix}{example.target_term}{example.target_suffix}")
```
## Get Supported Languages
```python
# Get all supported languages
languages = client.get_supported_languages()
# Translation languages
print("Translation languages:")
for code, lang in languages.translation.items():
print(f" {code}: {lang.name} ({lang.native_name})")
# Transliteration languages
print("\nTransliteration languages:")
for code, lang in languages.transliteration.items():
print(f" {code}: {lang.name}")
for script in lang.scripts:
print(f" {script.code} -> {[t.code for t in script.to_scripts]}")
# Dictionary languages
print("\nDictionary languages:")
for code, lang in languages.dictionary.items():
print(f" {code}: {lang.name}")
```
## Break Sentence
Identify sentence boundaries:
```python
result = client.find_sentence_boundaries(
body=["Hello! How are you? I hope you are well."],
language="en"
)
for item in result:
print(f"Sentence lengths: {item.sent_len}")
```
## Translation Options
```python
result = client.translate(
body=["Hello, world!"],
to=["de"],
text_type="html", # "plain" or "html"
profanity_action="Marked", # "NoAction", "Deleted", "Marked"
profanity_marker="Asterisk", # "Asterisk", "Tag"
include_alignment=True, # Include word alignment
include_sentence_length=True # Include sentence boundaries
)
for item in result:
translation = item.translations[0]
print(f"Translated: {translation.text}")
if translation.alignment:
print(f"Alignment: {translation.alignment.proj}")
if translation.sent_len:
print(f"Sentence lengths: {translation.sent_len.src_sent_len}")
```
## Async Client
```python
from azure.ai.translation.text.aio import TextTranslationClient
from azure.core.credentials import AzureKeyCredential
async def translate_text():
async with TextTranslationClient(
credential=AzureKeyCredential(key),
region=region
) as client:
result = await client.translate(
body=["Hello, world!"],
to=["es"]
)
print(result[0].translations[0].text)
```
## Client Methods
| Method | Description |
|--------|-------------|
| `translate` | Translate text to one or more languages |
| `transliterate` | Convert text between scripts |
| `detect` | Detect language of text |
| `find_sentence_boundaries` | Identify sentence boundaries |
| `lookup_dictionary_entries` | Dictionary lookup for translations |
| `lookup_dictionary_examples` | Get usage examples |
| `get_supported_languages` | List supported languages |
## Best Practices
1. **Batch translations** — Send multiple texts in one request (up to 100; see the sketch below)
2. **Specify source language** when known to improve accuracy
3. **Use async client** for high-throughput scenarios
4. **Cache language list** — Supported languages don't change frequently
5. **Handle profanity** appropriately for your application
6. **Use html text_type** when translating HTML content
7. **Include alignment** for applications needing word mapping
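A sketch of item 1's batching for inputs larger than one request allows:

```python
def translate_all(client, texts, to_langs, batch_size=100):
    """Translate any number of strings, batch_size texts per request."""
    out = []
    for start in range(0, len(texts), batch_size):
        for item in client.translate(body=texts[start:start + batch_size], to=to_langs):
            out.append({t.to: t.text for t in item.translations})
    return out
```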
@@ -0,0 +1,260 @@
---
name: azure-ai-vision-imageanalysis-py
description: |
Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks.
Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".
package: azure-ai-vision-imageanalysis
---
# Azure AI Vision Image Analysis SDK for Python
Client library for Azure AI Vision 4.0 image analysis including captions, tags, objects, OCR, and more.
## Installation
```bash
pip install azure-ai-vision-imageanalysis
```
## Environment Variables
```bash
VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
VISION_KEY=<your-api-key> # If using API key
```
## Authentication
### API Key
```python
import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential
endpoint = os.environ["VISION_ENDPOINT"]
key = os.environ["VISION_KEY"]
client = ImageAnalysisClient(
endpoint=endpoint,
credential=AzureKeyCredential(key)
)
```
### Entra ID (Recommended)
```python
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.identity import DefaultAzureCredential
client = ImageAnalysisClient(
endpoint=os.environ["VISION_ENDPOINT"],
credential=DefaultAzureCredential()
)
```
## Analyze Image from URL
```python
from azure.ai.vision.imageanalysis.models import VisualFeatures
image_url = "https://example.com/image.jpg"
result = client.analyze_from_url(
image_url=image_url,
visual_features=[
VisualFeatures.CAPTION,
VisualFeatures.TAGS,
VisualFeatures.OBJECTS,
VisualFeatures.READ,
VisualFeatures.PEOPLE,
VisualFeatures.SMART_CROPS,
VisualFeatures.DENSE_CAPTIONS
],
gender_neutral_caption=True,
language="en"
)
```
## Analyze Image from File
```python
with open("image.jpg", "rb") as f:
image_data = f.read()
result = client.analyze(
image_data=image_data,
visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)
```
## Image Caption
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION],
gender_neutral_caption=True
)
if result.caption:
print(f"Caption: {result.caption.text}")
print(f"Confidence: {result.caption.confidence:.2f}")
```
## Dense Captions (Multiple Regions)
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.DENSE_CAPTIONS]
)
if result.dense_captions:
for caption in result.dense_captions.list:
print(f"Caption: {caption.text}")
print(f" Confidence: {caption.confidence:.2f}")
print(f" Bounding box: {caption.bounding_box}")
```
## Tags
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.TAGS]
)
if result.tags:
for tag in result.tags.list:
print(f"Tag: {tag.name} (confidence: {tag.confidence:.2f})")
```
## Object Detection
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.OBJECTS]
)
if result.objects:
for obj in result.objects.list:
print(f"Object: {obj.tags[0].name}")
print(f" Confidence: {obj.tags[0].confidence:.2f}")
box = obj.bounding_box
print(f" Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```
## OCR (Text Extraction)
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.READ]
)
if result.read:
for block in result.read.blocks:
for line in block.lines:
print(f"Line: {line.text}")
print(f" Bounding polygon: {line.bounding_polygon}")
# Word-level details
for word in line.words:
print(f" Word: {word.text} (confidence: {word.confidence:.2f})")
```
## People Detection
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.PEOPLE]
)
if result.people:
for person in result.people.list:
print(f"Person detected:")
print(f" Confidence: {person.confidence:.2f}")
box = person.bounding_box
print(f" Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```
## Smart Cropping
```python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.SMART_CROPS],
smart_crops_aspect_ratios=[0.9, 1.33, 1.78] # Portrait, 4:3, 16:9
)
if result.smart_crops:
for crop in result.smart_crops.list:
print(f"Aspect ratio: {crop.aspect_ratio}")
box = crop.bounding_box
print(f" Crop region: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```
## Async Client
```python
from azure.ai.vision.imageanalysis.aio import ImageAnalysisClient
from azure.identity.aio import DefaultAzureCredential
async def analyze_image():
async with ImageAnalysisClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
) as client:
result = await client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION]
)
print(result.caption.text)
```
## Visual Features
| Feature | Description |
|---------|-------------|
| `CAPTION` | Single sentence describing the image |
| `DENSE_CAPTIONS` | Captions for multiple regions |
| `TAGS` | Content tags (objects, scenes, actions) |
| `OBJECTS` | Object detection with bounding boxes |
| `READ` | OCR text extraction |
| `PEOPLE` | People detection with bounding boxes |
| `SMART_CROPS` | Suggested crop regions for thumbnails |
## Error Handling
```python
from azure.core.exceptions import HttpResponseError
try:
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION]
)
except HttpResponseError as e:
print(f"Status code: {e.status_code}")
print(f"Reason: {e.reason}")
print(f"Message: {e.error.message}")
```
## Image Requirements
- Formats: JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO
- Max size: 20 MB
- Dimensions: 50x50 to 16000x16000 pixels
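A sketch of pre-validating an image against these limits, assuming Pillow is installed:

```python
import os
from PIL import Image

MAX_BYTES = 20 * 1024 * 1024  # 20 MB service limit

def is_analyzable(path: str) -> bool:
    """Check file size and pixel dimensions before calling the service."""
    if os.path.getsize(path) > MAX_BYTES:
        return False
    with Image.open(path) as img:
        width, height = img.size
    return 50 <= width <= 16000 and 50 <= height <= 16000
```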
## Best Practices
1. **Select only needed features** to optimize latency and cost
2. **Use async client** for high-throughput scenarios
3. **Handle HttpResponseError** for invalid images or auth issues
4. **Enable gender_neutral_caption** for inclusive descriptions
5. **Specify language** for localized captions
6. **Use smart_crops_aspect_ratios** matching your thumbnail requirements
7. **Cache results** when analyzing the same image multiple times
@@ -0,0 +1,309 @@
---
name: azure-ai-voicelive-py
description: Build real-time voice AI applications using Azure AI Voice Live SDK (azure-ai-voicelive). Use this skill when creating Python applications that need real-time bidirectional audio communication with Azure AI, including voice assistants, voice-enabled chatbots, real-time speech-to-speech translation, voice-driven avatars, or any WebSocket-based audio streaming with AI models. Supports Server VAD (Voice Activity Detection), turn-based conversation, function calling, MCP tools, avatar integration, and transcription.
package: azure-ai-voicelive
---
# Azure AI Voice Live SDK
Build real-time voice AI applications with bidirectional WebSocket communication.
## Installation
```bash
pip install azure-ai-voicelive aiohttp azure-identity
```
## Environment Variables
```bash
AZURE_COGNITIVE_SERVICES_ENDPOINT=https://<region>.api.cognitive.microsoft.com
# For API key auth (not recommended for production)
AZURE_COGNITIVE_SERVICES_KEY=<api-key>
```
## Authentication
**DefaultAzureCredential (preferred)**:
```python
from azure.ai.voicelive.aio import connect
from azure.identity.aio import DefaultAzureCredential
async with connect(
endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
credential=DefaultAzureCredential(),
model="gpt-4o-realtime-preview",
credential_scopes=["https://cognitiveservices.azure.com/.default"]
) as conn:
...
```
**API Key**:
```python
from azure.ai.voicelive.aio import connect
from azure.core.credentials import AzureKeyCredential
async with connect(
endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
credential=AzureKeyCredential(os.environ["AZURE_COGNITIVE_SERVICES_KEY"]),
model="gpt-4o-realtime-preview"
) as conn:
...
```
## Quick Start
```python
import asyncio
import os
from azure.ai.voicelive.aio import connect
from azure.identity.aio import DefaultAzureCredential
async def main():
async with connect(
endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
credential=DefaultAzureCredential(),
model="gpt-4o-realtime-preview",
credential_scopes=["https://cognitiveservices.azure.com/.default"]
) as conn:
# Update session with instructions
await conn.session.update(session={
"instructions": "You are a helpful assistant.",
"modalities": ["text", "audio"],
"voice": "alloy"
})
# Listen for events
async for event in conn:
print(f"Event: {event.type}")
if event.type == "response.audio_transcript.done":
print(f"Transcript: {event.transcript}")
elif event.type == "response.done":
break
asyncio.run(main())
```
## Core Architecture
### Connection Resources
The `VoiceLiveConnection` exposes these resources:
| Resource | Purpose | Key Methods |
|----------|---------|-------------|
| `conn.session` | Session configuration | `update(session=...)` |
| `conn.response` | Model responses | `create()`, `cancel()` |
| `conn.input_audio_buffer` | Audio input | `append()`, `commit()`, `clear()` |
| `conn.output_audio_buffer` | Audio output | `clear()` |
| `conn.conversation` | Conversation state | `item.create()`, `item.delete()`, `item.truncate()` |
| `conn.transcription_session` | Transcription config | `update(session=...)` |
## Session Configuration
```python
from azure.ai.voicelive.models import RequestSession, FunctionTool
await conn.session.update(session=RequestSession(
instructions="You are a helpful voice assistant.",
modalities=["text", "audio"],
voice="alloy", # or "echo", "shimmer", "sage", etc.
input_audio_format="pcm16",
output_audio_format="pcm16",
turn_detection={
"type": "server_vad",
"threshold": 0.5,
"prefix_padding_ms": 300,
"silence_duration_ms": 500
},
tools=[
FunctionTool(
type="function",
name="get_weather",
description="Get current weather",
parameters={
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
)
]
))
```
## Audio Streaming
### Send Audio (Base64 PCM16)
```python
import base64
# Read audio chunk (16-bit PCM, 24kHz mono)
audio_chunk = await read_audio_from_microphone()
b64_audio = base64.b64encode(audio_chunk).decode()
await conn.input_audio_buffer.append(audio=b64_audio)
```
### Receive Audio
```python
async for event in conn:
if event.type == "response.audio.delta":
audio_bytes = base64.b64decode(event.delta)
await play_audio(audio_bytes)
elif event.type == "response.audio.done":
print("Audio complete")
```
## Event Handling
```python
async for event in conn:
match event.type:
# Session events
case "session.created":
print(f"Session: {event.session}")
case "session.updated":
print("Session updated")
# Audio input events
case "input_audio_buffer.speech_started":
print(f"Speech started at {event.audio_start_ms}ms")
case "input_audio_buffer.speech_stopped":
print(f"Speech stopped at {event.audio_end_ms}ms")
# Transcription events
case "conversation.item.input_audio_transcription.completed":
print(f"User said: {event.transcript}")
case "conversation.item.input_audio_transcription.delta":
print(f"Partial: {event.delta}")
# Response events
case "response.created":
print(f"Response started: {event.response.id}")
case "response.audio_transcript.delta":
print(event.delta, end="", flush=True)
case "response.audio.delta":
audio = base64.b64decode(event.delta)
case "response.done":
print(f"Response complete: {event.response.status}")
# Function calls
case "response.function_call_arguments.done":
result = handle_function(event.name, event.arguments)
await conn.conversation.item.create(item={
"type": "function_call_output",
"call_id": event.call_id,
"output": json.dumps(result)
})
await conn.response.create()
# Errors
case "error":
print(f"Error: {event.error.message}")
```
## Common Patterns
### Manual Turn Mode (No VAD)
```python
await conn.session.update(session={"turn_detection": None})
# Manually control turns
await conn.input_audio_buffer.append(audio=b64_audio)
await conn.input_audio_buffer.commit() # End of user turn
await conn.response.create() # Trigger response
```
### Interrupt Handling
```python
async for event in conn:
if event.type == "input_audio_buffer.speech_started":
# User interrupted - cancel current response
await conn.response.cancel()
await conn.output_audio_buffer.clear()
```
### Conversation History
```python
# Add system message
await conn.conversation.item.create(item={
"type": "message",
"role": "system",
"content": [{"type": "input_text", "text": "Be concise."}]
})
# Add user message
await conn.conversation.item.create(item={
"type": "message",
"role": "user",
"content": [{"type": "input_text", "text": "Hello!"}]
})
await conn.response.create()
```
## Voice Options
| Voice | Description |
|-------|-------------|
| `alloy` | Neutral, balanced |
| `echo` | Warm, conversational |
| `shimmer` | Clear, professional |
| `sage` | Calm, authoritative |
| `coral` | Friendly, upbeat |
| `ash` | Deep, measured |
| `ballad` | Expressive |
| `verse` | Storytelling |
Azure voices: Use `AzureStandardVoice`, `AzureCustomVoice`, or `AzurePersonalVoice` models.
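A sketch of selecting an Azure neural voice (this assumes `AzureStandardVoice` accepts the voice `name`; see references/models.md for the exact model shape):

```python
from azure.ai.voicelive.models import AzureStandardVoice, RequestSession

await conn.session.update(session=RequestSession(
    voice=AzureStandardVoice(name="en-US-AvaNeural"),  # assumed voice name
    modalities=["text", "audio"],
))
```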
## Audio Formats
| Format | Sample Rate | Use Case |
|--------|-------------|----------|
| `pcm16` | 24kHz | Default, high quality |
| `pcm16-8000hz` | 8kHz | Telephony |
| `pcm16-16000hz` | 16kHz | Voice assistants |
| `g711_ulaw` | 8kHz | Telephony (US) |
| `g711_alaw` | 8kHz | Telephony (EU) |
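Formats are chosen per direction in the session, using the same `session.update` pattern shown earlier. A telephony-oriented sketch:

```python
await conn.session.update(session={
    "input_audio_format": "g711_ulaw",   # 8 kHz mu-law from the phone network
    "output_audio_format": "g711_ulaw",  # play back in the same format
    "modalities": ["text", "audio"],
})
```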
## Turn Detection Options
```python
# Server VAD (default)
{"type": "server_vad", "threshold": 0.5, "silence_duration_ms": 500}
# Azure Semantic VAD (smarter detection)
{"type": "azure_semantic_vad"}
{"type": "azure_semantic_vad_en"} # English optimized
{"type": "azure_semantic_vad_multilingual"}
```
## Error Handling
```python
from azure.ai.voicelive.aio import ConnectionError, ConnectionClosed
try:
async with connect(...) as conn:
async for event in conn:
if event.type == "error":
print(f"API Error: {event.error.code} - {event.error.message}")
except ConnectionClosed as e:
print(f"Connection closed: {e.code} - {e.reason}")
except ConnectionError as e:
print(f"Connection error: {e}")
```
## References
- **Detailed API Reference**: See [references/api-reference.md](references/api-reference.md)
- **Complete Examples**: See [references/examples.md](references/examples.md)
- **All Models & Types**: See [references/models.md](references/models.md)