feat: Add Official Microsoft & Gemini Skills (845+ Total)
## 🚀 Impact

Significantly expands the capabilities of **Antigravity Awesome Skills** by integrating official skill collections from **Microsoft** and **Google Gemini**. This update increases the total skill count to **845+**, making the library even more comprehensive for AI coding assistants.

## ✨ Key Changes

### 1. New Official Skills

- **Microsoft Skills**: Added a large collection of official skills from [microsoft/skills](https://github.com/microsoft/skills).
  - Includes Azure, .NET, Python, TypeScript, and Semantic Kernel skills.
  - Preserves the original directory structure under `skills/official/microsoft/`.
  - Includes plugin skills from the `.github/plugins` directory.
- **Gemini Skills**: Added official Gemini API development skills under `skills/gemini-api-dev/`.

### 2. New Scripts & Tooling

- **`scripts/sync_microsoft_skills.py`**: A robust synchronization script that:
  - Clones the official Microsoft repository.
  - Preserves the original directory hierarchy.
  - Handles symlinks and plugin locations.
  - Generates attribution metadata.
- **`scripts/tests/inspect_microsoft_repo.py`**: Debug tool to inspect the remote repository structure.
- **`scripts/tests/test_comprehensive_coverage.py`**: Verification script to ensure 100% of skills are captured during sync.

### 3. Core Improvements

- **`scripts/generate_index.py`**: Enhanced frontmatter parsing to safely handle unquoted values containing `@` symbols and commas (fixing issues with some Microsoft skill descriptions).
- **`package.json`**: Added `sync:microsoft` and `sync:all-official` scripts for easy maintenance.

### 4. Documentation

- Updated `README.md` to reflect the new skill count (845+) and added Microsoft/Gemini to the provider list.
- Updated `CATALOG.md` and `skills_index.json` with the new skills.

## 🧪 Verification

- Ran `scripts/tests/test_comprehensive_coverage.py` to verify all Microsoft skills are detected.
- Validated the `generate_index.py` fixes by successfully indexing the new skills.
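The kind of frontmatter fix described for `generate_index.py` can be illustrated with a minimal sketch (this is not the actual script's code): a tolerant parser that treats everything after the first `:` as an opaque string survives unquoted values containing `@` or commas, which stricter scalar handling may reject. Multi-line `description: |` blocks would need extra handling.

```python
# Illustrative sketch only -- not the actual generate_index.py implementation.
# Unquoted values like "Uses provider@2.0, MCP, and more" can trip strict
# YAML scalar parsing; treating the rest of the line as plain text avoids that.

def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` frontmatter lines tolerantly."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "---" or ":" not in line:
            continue
        key, _, value = line.partition(":")  # split on the FIRST colon only
        fields[key.strip()] = value.strip().strip("'\"")
    return fields

fm = parse_frontmatter("---\nname: demo\ndescription: Uses provider@2.0, MCP, and more\n---")
print(fm["description"])  # Uses provider@2.0, MCP, and more
```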
---
name: agent-framework-azure-ai-py
description: Build Azure AI Foundry agents using the Microsoft Agent Framework Python SDK (agent-framework-azure-ai). Use when creating persistent agents with AzureAIAgentsProvider, using hosted tools (code interpreter, file search, web search), integrating MCP servers, managing conversation threads, or implementing streaming responses. Covers function tools, structured outputs, and multi-tool agents.
package: agent-framework-azure-ai
---

# Agent Framework Azure Hosted Agents

Build persistent agents on Azure AI Foundry using the Microsoft Agent Framework Python SDK.

## Architecture

```
User Query → AzureAIAgentsProvider → Azure AI Agent Service (Persistent)
                          ↓
            Agent.run() / Agent.run_stream()
                          ↓
    Tools: Functions | Hosted (Code/Search/Web) | MCP
                          ↓
         AgentThread (conversation persistence)
```

## Installation

```bash
# Full framework (recommended)
pip install agent-framework --pre

# Or Azure-specific package only
pip install agent-framework-azure-ai --pre
```

## Environment Variables

```bash
export AZURE_AI_PROJECT_ENDPOINT="https://<project>.services.ai.azure.com/api/projects/<project-id>"
export AZURE_AI_MODEL_DEPLOYMENT_NAME="gpt-4o-mini"
export BING_CONNECTION_ID="your-bing-connection-id"  # For web search
```

## Authentication

```python
from azure.identity.aio import AzureCliCredential, DefaultAzureCredential

# Development
credential = AzureCliCredential()

# Production
credential = DefaultAzureCredential()
```

## Core Workflow

### Basic Agent

```python
import asyncio
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="MyAgent",
            instructions="You are a helpful assistant.",
        )

        result = await agent.run("Hello!")
        print(result.text)

asyncio.run(main())
```

### Agent with Function Tools

```python
from typing import Annotated
from pydantic import Field
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential

def get_weather(
    location: Annotated[str, Field(description="City name to get weather for")],
) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

def get_current_time() -> str:
    """Get the current UTC time."""
    from datetime import datetime, timezone
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="WeatherAgent",
            instructions="You help with weather and time queries.",
            tools=[get_weather, get_current_time],  # Pass functions directly
        )

        result = await agent.run("What's the weather in Seattle?")
        print(result.text)
```

### Agent with Hosted Tools

```python
from agent_framework import (
    HostedCodeInterpreterTool,
    HostedFileSearchTool,
    HostedWebSearchTool,
)
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="MultiToolAgent",
            instructions="You can execute code, search files, and search the web.",
            tools=[
                HostedCodeInterpreterTool(),
                HostedWebSearchTool(name="Bing"),
            ],
        )

        result = await agent.run("Calculate the factorial of 20 in Python")
        print(result.text)
```

### Streaming Responses

```python
async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="StreamingAgent",
            instructions="You are a helpful assistant.",
        )

        print("Agent: ", end="", flush=True)
        async for chunk in agent.run_stream("Tell me a short story"):
            if chunk.text:
                print(chunk.text, end="", flush=True)
        print()
```

### Conversation Threads

```python
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="ChatAgent",
            instructions="You are a helpful assistant.",
            tools=[get_weather],
        )

        # Create thread for conversation persistence
        thread = agent.get_new_thread()

        # First turn
        result1 = await agent.run("What's the weather in Seattle?", thread=thread)
        print(f"Agent: {result1.text}")

        # Second turn - context is maintained
        result2 = await agent.run("What about Portland?", thread=thread)
        print(f"Agent: {result2.text}")

        # Save thread ID for later resumption
        print(f"Conversation ID: {thread.conversation_id}")
```

### Structured Outputs

```python
from pydantic import BaseModel, ConfigDict
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential

class WeatherResponse(BaseModel):
    model_config = ConfigDict(extra="forbid")

    location: str
    temperature: float
    unit: str
    conditions: str

async def main():
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="StructuredAgent",
            instructions="Provide weather information in structured format.",
            response_format=WeatherResponse,
        )

        result = await agent.run("Weather in Seattle?")
        weather = WeatherResponse.model_validate_json(result.text)
        print(f"{weather.location}: {weather.temperature}°{weather.unit}")
```

## Provider Methods

| Method | Description |
|--------|-------------|
| `create_agent()` | Create new agent on Azure AI service |
| `get_agent(agent_id)` | Retrieve existing agent by ID |
| `as_agent(sdk_agent)` | Wrap SDK Agent object (no HTTP call) |

## Hosted Tools Quick Reference

| Tool | Import | Purpose |
|------|--------|---------|
| `HostedCodeInterpreterTool` | `from agent_framework import HostedCodeInterpreterTool` | Execute Python code |
| `HostedFileSearchTool` | `from agent_framework import HostedFileSearchTool` | Search vector stores |
| `HostedWebSearchTool` | `from agent_framework import HostedWebSearchTool` | Bing web search |
| `HostedMCPTool` | `from agent_framework import HostedMCPTool` | Service-managed MCP |
| `MCPStreamableHTTPTool` | `from agent_framework import MCPStreamableHTTPTool` | Client-managed MCP |

## Complete Example

```python
import asyncio
from typing import Annotated
from pydantic import BaseModel, Field
from agent_framework import (
    HostedCodeInterpreterTool,
    HostedWebSearchTool,
    MCPStreamableHTTPTool,
)
from agent_framework.azure import AzureAIAgentsProvider
from azure.identity.aio import AzureCliCredential


def get_weather(
    location: Annotated[str, Field(description="City name")],
) -> str:
    """Get weather for a location."""
    return f"Weather in {location}: 72°F, sunny"


class AnalysisResult(BaseModel):
    summary: str
    key_findings: list[str]
    confidence: float


async def main():
    async with (
        AzureCliCredential() as credential,
        MCPStreamableHTTPTool(
            name="Docs MCP",
            url="https://learn.microsoft.com/api/mcp",
        ) as mcp_tool,
        AzureAIAgentsProvider(credential=credential) as provider,
    ):
        agent = await provider.create_agent(
            name="ResearchAssistant",
            instructions="You are a research assistant with multiple capabilities.",
            tools=[
                get_weather,
                HostedCodeInterpreterTool(),
                HostedWebSearchTool(name="Bing"),
                mcp_tool,
            ],
        )

        thread = agent.get_new_thread()

        # Non-streaming
        result = await agent.run(
            "Search for Python best practices and summarize",
            thread=thread,
        )
        print(f"Response: {result.text}")

        # Streaming
        print("\nStreaming: ", end="")
        async for chunk in agent.run_stream("Continue with examples", thread=thread):
            if chunk.text:
                print(chunk.text, end="", flush=True)
        print()

        # Structured output
        result = await agent.run(
            "Analyze findings",
            thread=thread,
            response_format=AnalysisResult,
        )
        analysis = AnalysisResult.model_validate_json(result.text)
        print(f"\nConfidence: {analysis.confidence}")


if __name__ == "__main__":
    asyncio.run(main())
```

## Conventions

- Always use async context managers: `async with provider:`
- Pass functions directly to `tools=` parameter (auto-converted to AIFunction)
- Use `Annotated[type, Field(description=...)]` for function parameters
- Use `get_new_thread()` for multi-turn conversations
- Prefer `HostedMCPTool` for service-managed MCP, `MCPStreamableHTTPTool` for client-managed
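The auto-conversion of plain functions into tools relies on standard `Annotated` metadata attached to parameters. A stdlib-only sketch of that mechanism (using a plain string in place of pydantic's `Field`, purely for illustration — the framework's own extraction is more involved):

```python
from typing import Annotated, get_args, get_origin, get_type_hints

def get_weather(location: Annotated[str, "City name to get weather for"]) -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"

def describe_parameters(func) -> dict[str, str]:
    """Collect per-parameter descriptions from Annotated metadata."""
    descriptions = {}
    hints = get_type_hints(func, include_extras=True)  # keep Annotated wrappers
    for name, hint in hints.items():
        if name == "return":
            continue
        if get_origin(hint) is Annotated:
            _, *metadata = get_args(hint)  # (base_type, *annotations)
            descriptions[name] = metadata[0]
    return descriptions

print(describe_parameters(get_weather))  # {'location': 'City name to get weather for'}
```

This is the same signature information a tool-calling model receives, which is why descriptive `Annotated` metadata on every parameter matters.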

## Reference Files

- [references/tools.md](references/tools.md): Detailed hosted tool patterns
- [references/mcp.md](references/mcp.md): MCP integration (hosted + local)
- [references/threads.md](references/threads.md): Thread and conversation management
- [references/advanced.md](references/advanced.md): OpenAPI, citations, structured outputs
skills/official/microsoft/python/foundry/agents-v2/SKILL.md
---
name: agents-v2-py
description: |
  Build container-based Foundry Agents using Azure AI Projects SDK with ImageBasedHostedAgentDefinition.
  Use when creating hosted agents that run custom code in Azure AI Foundry with your own container images.
  Triggers: "ImageBasedHostedAgentDefinition", "hosted agent", "container agent", "Foundry Agent",
  "create_version", "ProtocolVersionRecord", "AgentProtocol.RESPONSES", "custom agent image".
package: azure-ai-projects
---

# Azure AI Hosted Agents (Python)

Build container-based hosted agents using `ImageBasedHostedAgentDefinition` from the Azure AI Projects SDK.

## Installation

```bash
# Quote the requirement so the shell does not treat ">=" as a redirect
pip install "azure-ai-projects>=2.0.0b3" azure-identity
```

**Minimum SDK Version:** `2.0.0b3` or later required for hosted agent support.

## Environment Variables

```bash
AZURE_AI_PROJECT_ENDPOINT=https://<resource>.services.ai.azure.com/api/projects/<project>
```

## Prerequisites

Before creating hosted agents:

1. **Container Image** - Build and push to Azure Container Registry (ACR)
2. **ACR Pull Permissions** - Grant your project's managed identity `AcrPull` role on the ACR
3. **Capability Host** - Account-level capability host with `enablePublicHostingEnvironment=true`
4. **SDK Version** - Ensure `azure-ai-projects>=2.0.0b3`

## Authentication

Always use `DefaultAzureCredential`:

```python
import os

from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

credential = DefaultAzureCredential()
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=credential
)
```

## Core Workflow

### 1. Imports

```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import (
    ImageBasedHostedAgentDefinition,
    ProtocolVersionRecord,
    AgentProtocol,
)
```

### 2. Create Hosted Agent

```python
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential()
)

agent = client.agents.create_version(
    agent_name="my-hosted-agent",
    definition=ImageBasedHostedAgentDefinition(
        container_protocol_versions=[
            ProtocolVersionRecord(protocol=AgentProtocol.RESPONSES, version="v1")
        ],
        cpu="1",
        memory="2Gi",
        image="myregistry.azurecr.io/my-agent:latest",
        tools=[{"type": "code_interpreter"}],
        environment_variables={
            "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
            "MODEL_NAME": "gpt-4o-mini"
        }
    )
)

print(f"Created agent: {agent.name} (version: {agent.version})")
```

### 3. List Agent Versions

```python
versions = client.agents.list_versions(agent_name="my-hosted-agent")
for version in versions:
    print(f"Version: {version.version}, State: {version.state}")
```

### 4. Delete Agent Version

```python
client.agents.delete_version(
    agent_name="my-hosted-agent",
    version=agent.version
)
```

## ImageBasedHostedAgentDefinition Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `container_protocol_versions` | `list[ProtocolVersionRecord]` | Yes | Protocol versions the agent supports |
| `image` | `str` | Yes | Full container image path (registry/image:tag) |
| `cpu` | `str` | No | CPU allocation (e.g., "1", "2") |
| `memory` | `str` | No | Memory allocation (e.g., "2Gi", "4Gi") |
| `tools` | `list[dict]` | No | Tools available to the agent |
| `environment_variables` | `dict[str, str]` | No | Environment variables for the container |

## Protocol Versions

The `container_protocol_versions` parameter specifies which protocols your agent supports:

```python
from azure.ai.projects.models import ProtocolVersionRecord, AgentProtocol

# RESPONSES protocol - standard agent responses
container_protocol_versions=[
    ProtocolVersionRecord(protocol=AgentProtocol.RESPONSES, version="v1")
]
```

**Available Protocols:**

| Protocol | Description |
|----------|-------------|
| `AgentProtocol.RESPONSES` | Standard response protocol for agent interactions |

## Resource Allocation

Specify CPU and memory for your container:

```python
definition=ImageBasedHostedAgentDefinition(
    container_protocol_versions=[...],
    image="myregistry.azurecr.io/my-agent:latest",
    cpu="2",      # 2 CPU cores
    memory="4Gi"  # 4 GiB memory
)
```

**Resource Limits:**

| Resource | Min | Max | Default |
|----------|-----|-----|---------|
| CPU | 0.5 | 4 | 1 |
| Memory | 1Gi | 8Gi | 2Gi |
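Requests outside these limits fail on the service side; a local pre-check against the documented limits can surface the problem earlier. This helper is hypothetical and not part of the SDK:

```python
# Hypothetical local sanity check against the documented limits
# (CPU 0.5-4 cores, memory 1Gi-8Gi); not an SDK function.

def validate_resources(cpu: str, memory: str) -> list[str]:
    """Return a list of problems with the requested allocation."""
    problems = []
    cores = float(cpu)
    if not 0.5 <= cores <= 4:
        problems.append(f"cpu {cpu!r} outside 0.5-4 cores")
    if not memory.endswith("Gi"):
        problems.append(f"memory {memory!r} must use the Gi suffix")
    else:
        gib = float(memory[:-2])
        if not 1 <= gib <= 8:
            problems.append(f"memory {memory!r} outside 1Gi-8Gi")
    return problems

print(validate_resources("2", "4Gi"))   # []
print(validate_resources("8", "16Gi"))  # two problems reported
```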
## Tools Configuration

Add tools to your hosted agent:

### Code Interpreter

```python
tools=[{"type": "code_interpreter"}]
```

### MCP Tools

```python
tools=[
    {"type": "code_interpreter"},
    {
        "type": "mcp",
        "server_label": "my-mcp-server",
        "server_url": "https://my-mcp-server.example.com"
    }
]
```

### Multiple Tools

```python
tools=[
    {"type": "code_interpreter"},
    {"type": "file_search"},
    {
        "type": "mcp",
        "server_label": "custom-tool",
        "server_url": "https://custom-tool.example.com"
    }
]
```

## Environment Variables

Pass configuration to your container:

```python
environment_variables={
    "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    "MODEL_NAME": "gpt-4o-mini",
    "LOG_LEVEL": "INFO",
    "CUSTOM_CONFIG": "value"
}
```

**Best Practice:** Never hardcode secrets. Use environment variables or Azure Key Vault.

## Complete Example

```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import (
    ImageBasedHostedAgentDefinition,
    ProtocolVersionRecord,
    AgentProtocol,
)

def create_hosted_agent():
    """Create a hosted agent with custom container image."""

    client = AIProjectClient(
        endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
        credential=DefaultAzureCredential()
    )

    agent = client.agents.create_version(
        agent_name="data-processor-agent",
        definition=ImageBasedHostedAgentDefinition(
            container_protocol_versions=[
                ProtocolVersionRecord(
                    protocol=AgentProtocol.RESPONSES,
                    version="v1"
                )
            ],
            image="myregistry.azurecr.io/data-processor:v1.0",
            cpu="2",
            memory="4Gi",
            tools=[
                {"type": "code_interpreter"},
                {"type": "file_search"}
            ],
            environment_variables={
                "AZURE_AI_PROJECT_ENDPOINT": os.environ["AZURE_AI_PROJECT_ENDPOINT"],
                "MODEL_NAME": "gpt-4o-mini",
                "MAX_RETRIES": "3"
            }
        )
    )

    print(f"Created hosted agent: {agent.name}")
    print(f"Version: {agent.version}")
    print(f"State: {agent.state}")

    return agent

if __name__ == "__main__":
    create_hosted_agent()
```

## Async Pattern

```python
import os
from azure.identity.aio import DefaultAzureCredential
from azure.ai.projects.aio import AIProjectClient
from azure.ai.projects.models import (
    ImageBasedHostedAgentDefinition,
    ProtocolVersionRecord,
    AgentProtocol,
)

async def create_hosted_agent_async():
    """Create a hosted agent asynchronously."""

    async with DefaultAzureCredential() as credential:
        async with AIProjectClient(
            endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
            credential=credential
        ) as client:
            agent = await client.agents.create_version(
                agent_name="async-agent",
                definition=ImageBasedHostedAgentDefinition(
                    container_protocol_versions=[
                        ProtocolVersionRecord(
                            protocol=AgentProtocol.RESPONSES,
                            version="v1"
                        )
                    ],
                    image="myregistry.azurecr.io/async-agent:latest",
                    cpu="1",
                    memory="2Gi"
                )
            )
            return agent
```

## Common Errors

| Error | Cause | Solution |
|-------|-------|----------|
| `ImagePullBackOff` | ACR pull permission denied | Grant `AcrPull` role to project's managed identity |
| `InvalidContainerImage` | Image not found | Verify image path and tag exist in ACR |
| `CapabilityHostNotFound` | No capability host configured | Create account-level capability host |
| `ProtocolVersionNotSupported` | Invalid protocol version | Use `AgentProtocol.RESPONSES` with version `"v1"` |

## Best Practices

1. **Version Your Images** - Use specific tags, not `latest` in production
2. **Minimal Resources** - Start with minimum CPU/memory, scale up as needed
3. **Environment Variables** - Use for all configuration, never hardcode
4. **Error Handling** - Wrap agent creation in try/except blocks
5. **Cleanup** - Delete unused agent versions to free resources
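Best practice 4 can be sketched as a generic retry wrapper. This is illustrative only; in real code you would catch the Azure SDK's `azure.core.exceptions.HttpResponseError` rather than bare `Exception`, and pass `lambda: client.agents.create_version(...)` as the callable:

```python
import time

def create_with_retries(create_fn, max_retries: int = 3, delay: float = 1.0):
    """Call create_fn, retrying up to max_retries times on failure."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return create_fn()
        except Exception as exc:  # narrow to SDK exceptions in real code
            last_error = exc
            time.sleep(delay * attempt)  # simple linear backoff
    raise RuntimeError(
        f"agent creation failed after {max_retries} attempts"
    ) from last_error
```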

## Reference Links

- [Azure AI Projects SDK](https://pypi.org/project/azure-ai-projects/)
- [Hosted Agents Documentation](https://learn.microsoft.com/azure/ai-services/agents/how-to/hosted-agents)
- [Azure Container Registry](https://learn.microsoft.com/azure/container-registry/)
skills/official/microsoft/python/foundry/contentsafety/SKILL.md
---
name: azure-ai-contentsafety-py
description: |
  Azure AI Content Safety SDK for Python. Use for detecting harmful content in text and images with multi-severity classification.
  Triggers: "azure-ai-contentsafety", "ContentSafetyClient", "content moderation", "harmful content", "text analysis", "image analysis".
package: azure-ai-contentsafety
---

# Azure AI Content Safety SDK for Python

Detect harmful user-generated and AI-generated content in applications.

## Installation

```bash
pip install azure-ai-contentsafety
```

## Environment Variables

```bash
CONTENT_SAFETY_ENDPOINT=https://<resource>.cognitiveservices.azure.com
CONTENT_SAFETY_KEY=<your-api-key>
```

## Authentication

### API Key

```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.core.credentials import AzureKeyCredential
import os

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["CONTENT_SAFETY_KEY"])
)
```

### Entra ID

```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.identity import DefaultAzureCredential
import os

client = ContentSafetyClient(
    endpoint=os.environ["CONTENT_SAFETY_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```

## Analyze Text

```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions, TextCategory
from azure.core.credentials import AzureKeyCredential

client = ContentSafetyClient(endpoint, AzureKeyCredential(key))

request = AnalyzeTextOptions(text="Your text content to analyze")
response = client.analyze_text(request)

# Check each category
for category in [TextCategory.HATE, TextCategory.SELF_HARM,
                 TextCategory.SEXUAL, TextCategory.VIOLENCE]:
    result = next((r for r in response.categories_analysis
                   if r.category == category), None)
    if result:
        print(f"{category}: severity {result.severity}")
```

## Analyze Image

```python
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData
from azure.core.credentials import AzureKeyCredential
import base64

client = ContentSafetyClient(endpoint, AzureKeyCredential(key))

# From file
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

request = AnalyzeImageOptions(
    image=ImageData(content=image_data)
)

response = client.analyze_image(request)

for result in response.categories_analysis:
    print(f"{result.category}: severity {result.severity}")
```

### Image from URL

```python
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

request = AnalyzeImageOptions(
    image=ImageData(blob_url="https://example.com/image.jpg")
)

response = client.analyze_image(request)
```

## Text Blocklist Management

### Create Blocklist

```python
from azure.ai.contentsafety import BlocklistClient
from azure.ai.contentsafety.models import TextBlocklist
from azure.core.credentials import AzureKeyCredential

blocklist_client = BlocklistClient(endpoint, AzureKeyCredential(key))

blocklist = TextBlocklist(
    blocklist_name="my-blocklist",
    description="Custom terms to block"
)

result = blocklist_client.create_or_update_text_blocklist(
    blocklist_name="my-blocklist",
    options=blocklist
)
```

### Add Block Items

```python
from azure.ai.contentsafety.models import AddOrUpdateTextBlocklistItemsOptions, TextBlocklistItem

items = AddOrUpdateTextBlocklistItemsOptions(
    blocklist_items=[
        TextBlocklistItem(text="blocked-term-1"),
        TextBlocklistItem(text="blocked-term-2")
    ]
)

result = blocklist_client.add_or_update_blocklist_items(
    blocklist_name="my-blocklist",
    options=items
)
```

### Analyze with Blocklist

```python
from azure.ai.contentsafety.models import AnalyzeTextOptions

request = AnalyzeTextOptions(
    text="Text containing blocked-term-1",
    blocklist_names=["my-blocklist"],
    halt_on_blocklist_hit=True
)

response = client.analyze_text(request)

if response.blocklists_match:
    for match in response.blocklists_match:
        print(f"Blocked: {match.blocklist_item_text}")
```

## Severity Levels

Text analysis returns 4 severity levels (0, 2, 4, 6) by default. For 8 levels (0-7):

```python
from azure.ai.contentsafety.models import AnalyzeTextOptions, AnalyzeTextOutputType

request = AnalyzeTextOptions(
    text="Your text",
    output_type=AnalyzeTextOutputType.EIGHT_SEVERITY_LEVELS
)
```
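Assuming the default four-level mode corresponds to trimming the finer 0-7 scale down to the even buckets 0/2/4/6 (an assumption for illustration, not an SDK guarantee), eight-level scores can be normalized locally so that one set of thresholds works for both modes:

```python
def to_four_level(severity: int) -> int:
    """Map an 8-level severity (0-7) onto the default buckets 0/2/4/6."""
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in 0-7")
    return severity // 2 * 2  # 0,1 -> 0; 2,3 -> 2; 4,5 -> 4; 6,7 -> 6

print([to_four_level(s) for s in range(8)])  # [0, 0, 2, 2, 4, 4, 6, 6]
```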
## Harm Categories

| Category | Description |
|----------|-------------|
| `Hate` | Attacks based on identity (race, religion, gender, etc.) |
| `Sexual` | Sexual content, relationships, anatomy |
| `Violence` | Physical harm, weapons, injury |
| `SelfHarm` | Self-injury, suicide, eating disorders |

## Severity Scale

| Level | Text Range | Image Range | Meaning |
|-------|------------|-------------|---------|
| 0 | Safe | Safe | No harmful content |
| 2 | Low | Low | Mild references |
| 4 | Medium | Medium | Moderate content |
| 6 | High | High | Severe content |

## Client Types

| Client | Purpose |
|--------|---------|
| `ContentSafetyClient` | Analyze text and images |
| `BlocklistClient` | Manage custom blocklists |

## Best Practices

1. **Use blocklists** for domain-specific terms
2. **Set severity thresholds** appropriate for your use case
3. **Handle multiple categories** — content can be harmful in multiple ways
4. **Use halt_on_blocklist_hit** for immediate rejection
5. **Log analysis results** for audit and improvement
6. **Consider 8-severity mode** for finer-grained control
7. **Pre-moderate AI outputs** before showing to users
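Best practices 2 and 3 can be sketched as a small policy helper over per-category severities. The helper and the threshold values are hypothetical, not part of the SDK; in real code you would build the input dict from `response.categories_analysis`:

```python
# Hypothetical policy layer: block when a category's severity meets or
# exceeds its threshold. Tune thresholds to your use case.
DEFAULT_THRESHOLDS = {"Hate": 2, "Sexual": 2, "Violence": 4, "SelfHarm": 2}

def blocked_categories(results: dict[str, int],
                       thresholds: dict[str, int] = DEFAULT_THRESHOLDS) -> list[str]:
    """Return every category whose severity meets or exceeds its threshold."""
    return [cat for cat, severity in results.items()
            if severity >= thresholds.get(cat, 2)]  # unknown categories default to 2

print(blocked_categories({"Hate": 0, "Violence": 4}))  # ['Violence']
```

Checking every category rather than stopping at the first hit matters because a single text can score in several categories at once.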
---
name: azure-ai-contentunderstanding-py
description: |
  Azure AI Content Understanding SDK for Python. Use for multimodal content extraction from documents, images, audio, and video.
  Triggers: "azure-ai-contentunderstanding", "ContentUnderstandingClient", "multimodal analysis", "document extraction", "video analysis", "audio transcription".
package: azure-ai-contentunderstanding
---

# Azure AI Content Understanding SDK for Python

Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.

## Installation

```bash
pip install azure-ai-contentunderstanding
```

## Environment Variables

```bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
```

## Authentication

```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
```

## Core Workflow

Content Understanding operations are asynchronous long-running operations:

1. **Begin Analysis** — Start the analysis operation with `begin_analyze()` (returns a poller)
2. **Poll for Results** — Poll until analysis completes (SDK handles this with `.result()`)
3. **Process Results** — Extract structured results from `AnalyzeResult.contents`
## Prebuilt Analyzers

| Analyzer | Content Type | Purpose |
|----------|--------------|---------|
| `prebuilt-documentSearch` | Documents | Extract markdown for RAG applications |
| `prebuilt-imageSearch` | Images | Extract content from images |
| `prebuilt-audioSearch` | Audio | Transcribe audio with timing |
| `prebuilt-videoSearch` | Video | Extract frames, transcripts, summaries |
| `prebuilt-invoice` | Documents | Extract invoice fields |

## Analyze Document

```python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)

# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)

result = poller.result()

# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
```

## Access Document Content Details

```python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent

content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
```

## Analyze Image

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
```

## Analyze Video

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)

result = poller.result()

# Access video content (AudioVisualContent)
content = result.contents[0]

# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")

# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
```

## Analyze Audio

```python
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)

result = poller.result()

# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
```

## Custom Analyzers

Create custom analyzers with field schemas for specialized extraction:

```python
# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)

# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput

poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)

result = poller.result()

# Access extracted fields
print(result.fields["vendor_name"])
print(result.fields["invoice_total"])
```

## Analyzer Management

```python
# List all analyzers
analyzers = client.list_analyzers()
for analyzer in analyzers:
    print(f"{analyzer.analyzer_id}: {analyzer.description}")

# Get specific analyzer
analyzer = client.get_analyzer("prebuilt-documentSearch")

# Delete custom analyzer
client.delete_analyzer("my-custom-analyzer")
```

## Async Client

```python
import asyncio
import os
from azure.ai.contentunderstanding.aio import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity.aio import DefaultAzureCredential

async def analyze_document():
    endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
    credential = DefaultAzureCredential()

    async with ContentUnderstandingClient(
        endpoint=endpoint,
        credential=credential
    ) as client:
        poller = await client.begin_analyze(
            analyzer_id="prebuilt-documentSearch",
            inputs=[AnalyzeInput(url="https://example.com/doc.pdf")]
        )
        result = await poller.result()
        content = result.contents[0]
        return content.markdown

asyncio.run(analyze_document())
```

## Content Types

| Class | For | Provides |
|-------|-----|----------|
| `DocumentContent` | PDF, images, Office docs | Pages, tables, figures, paragraphs |
| `AudioVisualContent` | Audio, video files | Transcript phrases, timing, key frames |

Both derive from `MediaContent` which provides basic info and markdown representation.
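Because `contents` can mix kinds, downstream handling typically branches on `content.kind` as in the document example above. A dependency-free sketch of that dispatch pattern (the dict-based content and kind strings are illustrative stand-ins for the SDK models):

```python
def describe(content):
    # Branch on the content kind, mirroring the DocumentContent /
    # AudioVisualContent split in the table above.
    handlers = {
        "document": lambda c: f"{c['pages']} page(s) of markdown",
        "audioVisual": lambda c: f"{len(c['transcript_phrases'])} transcript phrase(s)",
    }
    handler = handlers.get(content["kind"])
    return handler(content) if handler else "unsupported kind"

print(describe({"kind": "document", "pages": 3}))
print(describe({"kind": "audioVisual", "transcript_phrases": ["hi", "there"]}))
```

With the real SDK, the branch key would be `content.kind` compared against `MediaContentKind` values rather than a raw string.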
## Model Imports

```python
from azure.ai.contentunderstanding.models import (
    AnalyzeInput,
    AnalyzeResult,
    MediaContentKind,
    DocumentContent,
    AudioVisualContent,
)
```

## Client Types

| Client | Purpose |
|--------|---------|
| `ContentUnderstandingClient` | Sync client for all operations |
| `ContentUnderstandingClient` (aio) | Async client for all operations |

## Best Practices

1. **Use `begin_analyze` with `AnalyzeInput`** — this is the correct method signature
2. **Access results via `result.contents[0]`** — results are returned as a list
3. **Use prebuilt analyzers** for common scenarios (document/image/audio/video search)
4. **Create custom analyzers** only for domain-specific field extraction
5. **Use async client** for high-throughput scenarios with `azure.identity.aio` credentials
6. **Handle long-running operations** — video/audio analysis can take minutes
7. **Use URL sources** when possible to avoid upload overhead

skills/official/microsoft/python/foundry/ml/SKILL.md (new file, 271 lines)
---
name: azure-ai-ml-py
description: |
  Azure Machine Learning SDK v2 for Python. Use for ML workspaces, jobs, models, datasets, compute, and pipelines.
  Triggers: "azure-ai-ml", "MLClient", "workspace", "model registry", "training jobs", "datasets".
package: azure-ai-ml
---

# Azure Machine Learning SDK v2 for Python

Client library for managing Azure ML resources: workspaces, jobs, models, data, and compute.

## Installation

```bash
pip install azure-ai-ml
```

## Environment Variables

```bash
AZURE_SUBSCRIPTION_ID=<your-subscription-id>
AZURE_RESOURCE_GROUP=<your-resource-group>
AZURE_ML_WORKSPACE_NAME=<your-workspace-name>
```

## Authentication

```python
import os

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    workspace_name=os.environ["AZURE_ML_WORKSPACE_NAME"]
)
```

### From Config File

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Uses config.json in current directory or parent
ml_client = MLClient.from_config(
    credential=DefaultAzureCredential()
)
```

## Workspace Management

### Create Workspace

```python
from azure.ai.ml.entities import Workspace

ws = Workspace(
    name="my-workspace",
    location="eastus",
    display_name="My Workspace",
    description="ML workspace for experiments",
    tags={"purpose": "demo"}
)

ml_client.workspaces.begin_create(ws).result()
```

### List Workspaces

```python
for ws in ml_client.workspaces.list():
    print(f"{ws.name}: {ws.location}")
```

## Data Assets

### Register Data

```python
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# Register a file
my_data = Data(
    name="my-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/train.csv",
    type=AssetTypes.URI_FILE,
    description="Training data"
)

ml_client.data.create_or_update(my_data)
```

### Register Folder

```python
my_data = Data(
    name="my-folder-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/",
    type=AssetTypes.URI_FOLDER
)

ml_client.data.create_or_update(my_data)
```

## Model Registry

### Register Model

```python
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    name="my-model",
    version="1",
    path="./model/",
    type=AssetTypes.CUSTOM_MODEL,
    description="My trained model"
)

ml_client.models.create_or_update(model)
```

### List Models

```python
for model in ml_client.models.list(name="my-model"):
    print(f"{model.name} v{model.version}")
```

## Compute

### Create Compute Cluster

```python
from azure.ai.ml.entities import AmlCompute

cluster = AmlCompute(
    name="cpu-cluster",
    type="amlcompute",
    size="Standard_DS3_v2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=120
)

ml_client.compute.begin_create_or_update(cluster).result()
```

### List Compute

```python
for compute in ml_client.compute.list():
    print(f"{compute.name}: {compute.type}")
```

## Jobs

### Command Job

```python
from azure.ai.ml import command, Input

job = command(
    code="./src",
    command="python train.py --data ${{inputs.data}} --lr ${{inputs.learning_rate}}",
    inputs={
        "data": Input(type="uri_folder", path="azureml:my-dataset:1"),
        "learning_rate": 0.01
    },
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
    display_name="training-job"
)

returned_job = ml_client.jobs.create_or_update(job)
print(f"Job URL: {returned_job.studio_url}")
```
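The `${{inputs.*}}` placeholders in the command string are resolved by the Azure ML service at job submission, with each name bound to the matching entry in `inputs`. As a rough illustration of the substitution semantics (not the SDK's actual implementation):

```python
import re

def resolve_command(command, inputs):
    # Replace each ${{inputs.name}} placeholder with the bound input value,
    # roughly as the service does when materializing the job command.
    return re.sub(
        r"\$\{\{inputs\.(\w+)\}\}",
        lambda m: str(inputs[m.group(1)]),
        command,
    )

cmd = "python train.py --data ${{inputs.data}} --lr ${{inputs.learning_rate}}"
print(resolve_command(cmd, {"data": "/mnt/data", "learning_rate": 0.01}))
# python train.py --data /mnt/data --lr 0.01
```

In the real service, `uri_folder` inputs resolve to mounted or downloaded paths on the compute target, not the literal asset ID.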
### Monitor Job

```python
ml_client.jobs.stream(returned_job.name)
```

## Pipelines

```python
from azure.ai.ml import dsl, Input

# prep_component and train_component are pipeline components defined or
# loaded elsewhere (e.g. via load_component)

@dsl.pipeline(
    compute="cpu-cluster",
    description="Training pipeline"
)
def training_pipeline(data_input):
    prep_step = prep_component(data=data_input)
    train_step = train_component(
        data=prep_step.outputs.output_data,
        learning_rate=0.01
    )
    return {"model": train_step.outputs.model}

pipeline = training_pipeline(
    data_input=Input(type="uri_folder", path="azureml:my-dataset:1")
)

pipeline_job = ml_client.jobs.create_or_update(pipeline)
```

## Environments

### Create Custom Environment

```python
from azure.ai.ml.entities import Environment

env = Environment(
    name="my-env",
    version="1",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="./environment.yml"
)

ml_client.environments.create_or_update(env)
```

## Datastores

### List Datastores

```python
for ds in ml_client.datastores.list():
    print(f"{ds.name}: {ds.type}")
```

### Get Default Datastore

```python
default_ds = ml_client.datastores.get_default()
print(f"Default: {default_ds.name}")
```

## MLClient Operations

| Property | Operations |
|----------|------------|
| `workspaces` | create, get, list, delete |
| `jobs` | create_or_update, get, list, stream, cancel |
| `models` | create_or_update, get, list, archive |
| `data` | create_or_update, get, list |
| `compute` | begin_create_or_update, get, list, delete |
| `environments` | create_or_update, get, list |
| `datastores` | create_or_update, get, list, get_default |
| `components` | create_or_update, get, list |

## Best Practices

1. **Use versioning** for data, models, and environments
2. **Configure idle scale-down** to reduce compute costs
3. **Use environments** for reproducible training
4. **Stream job logs** to monitor progress
5. **Register models** after successful training jobs
6. **Use pipelines** for multi-step workflows
7. **Tag resources** for organization and cost tracking

skills/official/microsoft/python/foundry/projects/SKILL.md (new file, 295 lines)
---
name: azure-ai-projects-py
description: Build AI applications using the Azure AI Projects Python SDK (azure-ai-projects). Use when working with Foundry project clients, creating versioned agents with PromptAgentDefinition, running evaluations, managing connections/deployments/datasets/indexes, or using OpenAI-compatible clients. This is the high-level Foundry SDK - for low-level agent operations, use the azure-ai-agents-python skill.
package: azure-ai-projects
---

# Azure AI Projects Python SDK (Foundry SDK)

Build AI applications on Microsoft Foundry using the `azure-ai-projects` SDK.

## Installation

```bash
pip install azure-ai-projects azure-identity
```

## Environment Variables

```bash
AZURE_AI_PROJECT_ENDPOINT="https://<resource>.services.ai.azure.com/api/projects/<project>"
AZURE_AI_MODEL_DEPLOYMENT_NAME="gpt-4o-mini"
```

## Authentication

```python
import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

credential = DefaultAzureCredential()
client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=credential,
)
```

## Client Operations Overview

| Operation | Access | Purpose |
|-----------|--------|---------|
| `client.agents` | `.agents.*` | Agent CRUD, versions, threads, runs |
| `client.connections` | `.connections.*` | List/get project connections |
| `client.deployments` | `.deployments.*` | List model deployments |
| `client.datasets` | `.datasets.*` | Dataset management |
| `client.indexes` | `.indexes.*` | Index management |
| `client.evaluations` | `.evaluations.*` | Run evaluations |
| `client.red_teams` | `.red_teams.*` | Red team operations |

## Two Client Approaches

### 1. AIProjectClient (Native Foundry)

```python
from azure.ai.projects import AIProjectClient

client = AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
)

# Use Foundry-native operations
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="my-agent",
    instructions="You are helpful.",
)
```

### 2. OpenAI-Compatible Client

```python
# Get OpenAI-compatible client from project
openai_client = client.get_openai_client()

# Use standard OpenAI API
response = openai_client.chat.completions.create(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    messages=[{"role": "user", "content": "Hello!"}],
)
```

## Agent Operations

### Create Agent (Basic)

```python
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="my-agent",
    instructions="You are a helpful assistant.",
)
```

### Create Agent with Tools

```python
from azure.ai.agents import CodeInterpreterTool, FileSearchTool

agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="tool-agent",
    instructions="You can execute code and search files.",
    tools=[CodeInterpreterTool(), FileSearchTool()],
)
```

### Versioned Agents with PromptAgentDefinition

```python
from azure.ai.projects.models import PromptAgentDefinition

# Create a versioned agent
agent_version = client.agents.create_version(
    agent_name="customer-support-agent",
    definition=PromptAgentDefinition(
        model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
        instructions="You are a customer support specialist.",
        tools=[],  # Add tools as needed
    ),
    version_label="v1.0",
)
```

See [references/agents.md](references/agents.md) for detailed agent patterns.

## Tools Overview

| Tool | Class | Use Case |
|------|-------|----------|
| Code Interpreter | `CodeInterpreterTool` | Execute Python, generate files |
| File Search | `FileSearchTool` | RAG over uploaded documents |
| Bing Grounding | `BingGroundingTool` | Web search (requires connection) |
| Azure AI Search | `AzureAISearchTool` | Search your indexes |
| Function Calling | `FunctionTool` | Call your Python functions |
| OpenAPI | `OpenApiTool` | Call REST APIs |
| MCP | `McpTool` | Model Context Protocol servers |
| Memory Search | `MemorySearchTool` | Search agent memory stores |
| SharePoint | `SharepointGroundingTool` | Search SharePoint content |

See [references/tools.md](references/tools.md) for all tool patterns.

## Thread and Message Flow

```python
# 1. Create thread
thread = client.agents.threads.create()

# 2. Add message
client.agents.messages.create(
    thread_id=thread.id,
    role="user",
    content="What's the weather like?",
)

# 3. Create and process run
run = client.agents.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
)

# 4. Get response
if run.status == "completed":
    messages = client.agents.messages.list(thread_id=thread.id)
    for msg in messages:
        if msg.role == "assistant":
            print(msg.content[0].text.value)
```
## Connections

```python
# List all connections
connections = client.connections.list()
for conn in connections:
    print(f"{conn.name}: {conn.connection_type}")

# Get specific connection
connection = client.connections.get(connection_name="my-search-connection")
```

See [references/connections.md](references/connections.md) for connection patterns.

## Deployments

```python
# List available model deployments
deployments = client.deployments.list()
for deployment in deployments:
    print(f"{deployment.name}: {deployment.model}")
```

See [references/deployments.md](references/deployments.md) for deployment patterns.

## Datasets and Indexes

```python
# List datasets
datasets = client.datasets.list()

# List indexes
indexes = client.indexes.list()
```

See [references/datasets-indexes.md](references/datasets-indexes.md) for data operations.

## Evaluation

```python
# Using OpenAI client for evals
openai_client = client.get_openai_client()

# Create evaluation with built-in evaluators
eval_run = openai_client.evals.runs.create(
    eval_id="my-eval",
    name="quality-check",
    data_source={
        "type": "custom",
        "item_references": [{"item_id": "test-1"}],
    },
    testing_criteria=[
        {"type": "fluency"},
        {"type": "task_adherence"},
    ],
)
```

See [references/evaluation.md](references/evaluation.md) for evaluation patterns.

## Async Client

```python
from azure.ai.projects.aio import AIProjectClient

async with AIProjectClient(
    endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    credential=DefaultAzureCredential(),
) as client:
    agent = await client.agents.create_agent(...)
    # ... async operations
```

See [references/async-patterns.md](references/async-patterns.md) for async patterns.

## Memory Stores

```python
# Create memory store for agent
memory_store = client.agents.create_memory_store(
    name="conversation-memory",
)

# Attach to agent for persistent memory
agent = client.agents.create_agent(
    model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
    name="memory-agent",
    tools=[MemorySearchTool()],
    tool_resources={"memory": {"store_ids": [memory_store.id]}},
)
```

## Best Practices

1. **Use context managers** for the async client: `async with AIProjectClient(...) as client:`
2. **Clean up agents** when done: `client.agents.delete_agent(agent.id)`
3. **Use `create_and_process`** for simple runs, **streaming** for real-time UX
4. **Use versioned agents** for production deployments
5. **Prefer connections** for external service integration (AI Search, Bing, etc.)

## SDK Comparison

| Feature | `azure-ai-projects` | `azure-ai-agents` |
|---------|---------------------|-------------------|
| Level | High-level (Foundry) | Low-level (Agents) |
| Client | `AIProjectClient` | `AgentsClient` |
| Versioning | `create_version()` | Not available |
| Connections | Yes | No |
| Deployments | Yes | No |
| Datasets/Indexes | Yes | No |
| Evaluation | Via OpenAI client | No |
| When to use | Full Foundry integration | Standalone agent apps |

## Reference Files

- [references/agents.md](references/agents.md): Agent operations with PromptAgentDefinition
- [references/tools.md](references/tools.md): All agent tools with examples
- [references/evaluation.md](references/evaluation.md): Evaluation operations overview
- [references/built-in-evaluators.md](references/built-in-evaluators.md): Complete built-in evaluator reference
- [references/custom-evaluators.md](references/custom-evaluators.md): Code and prompt-based evaluator patterns
- [references/connections.md](references/connections.md): Connection operations
- [references/deployments.md](references/deployments.md): Deployment enumeration
- [references/datasets-indexes.md](references/datasets-indexes.md): Dataset and index operations
- [references/async-patterns.md](references/async-patterns.md): Async client usage
- [references/api-reference.md](references/api-reference.md): Complete API reference for all 373 SDK exports (v2.0.0b4)
- [scripts/run_batch_evaluation.py](scripts/run_batch_evaluation.py): CLI tool for batch evaluations
---
name: azure-search-documents-py
description: |
  Azure AI Search SDK for Python. Use for vector search, hybrid search, semantic ranking, indexing, and skillsets.
  Triggers: "azure-search-documents", "SearchClient", "SearchIndexClient", "vector search", "hybrid search", "semantic search".
package: azure-search-documents
---

# Azure AI Search SDK for Python

Full-text, vector, and hybrid search with AI enrichment capabilities.

## Installation

```bash
pip install azure-search-documents
```

## Environment Variables

```bash
AZURE_SEARCH_ENDPOINT=https://<service-name>.search.windows.net
AZURE_SEARCH_API_KEY=<your-api-key>
AZURE_SEARCH_INDEX_NAME=<your-index-name>
```

## Authentication

### API Key

```python
import os

from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX_NAME"],
    credential=AzureKeyCredential(os.environ["AZURE_SEARCH_API_KEY"])
)
```

### Entra ID (Recommended)

```python
import os

from azure.search.documents import SearchClient
from azure.identity import DefaultAzureCredential

client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX_NAME"],
    credential=DefaultAzureCredential()
)
```

## Client Types

| Client | Purpose |
|--------|---------|
| `SearchClient` | Search and document operations |
| `SearchIndexClient` | Index management, synonym maps |
| `SearchIndexerClient` | Indexers, data sources, skillsets |

## Create Index with Vector Field

```python
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    SearchFieldDataType,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
    SearchableField,
    SimpleField
)

index_client = SearchIndexClient(endpoint, AzureKeyCredential(key))

fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchableField(name="title", type=SearchFieldDataType.String),
    SearchableField(name="content", type=SearchFieldDataType.String),
    SearchField(
        name="content_vector",
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        vector_search_dimensions=1536,
        vector_search_profile_name="my-vector-profile"
    )
]

vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(name="my-hnsw")
    ],
    profiles=[
        VectorSearchProfile(
            name="my-vector-profile",
            algorithm_configuration_name="my-hnsw"
        )
    ]
)

index = SearchIndex(
    name="my-index",
    fields=fields,
    vector_search=vector_search
)

index_client.create_or_update_index(index)
```

## Upload Documents

```python
from azure.search.documents import SearchClient

client = SearchClient(endpoint, "my-index", AzureKeyCredential(key))

documents = [
    {
        "id": "1",
        "title": "Azure AI Search",
        "content": "Full-text and vector search service",
        "content_vector": [0.1, 0.2, ...]  # 1536 dimensions
    }
]

result = client.upload_documents(documents)
print(f"Uploaded {len(result)} documents")
```

## Keyword Search

```python
results = client.search(
    search_text="azure search",
    select=["id", "title", "content"],
    top=10
)

for result in results:
    print(f"{result['title']}: {result['@search.score']}")
```

## Vector Search

```python
from azure.search.documents.models import VectorizedQuery

# Your query embedding (1536 dimensions)
query_vector = get_embedding("semantic search capabilities")

vector_query = VectorizedQuery(
    vector=query_vector,
    k_nearest_neighbors=10,
    fields="content_vector"
)

results = client.search(
    vector_queries=[vector_query],
    select=["id", "title", "content"]
)

for result in results:
    print(f"{result['title']}: {result['@search.score']}")
```

## Hybrid Search (Vector + Keyword)

```python
from azure.search.documents.models import VectorizedQuery

vector_query = VectorizedQuery(
    vector=query_vector,
    k_nearest_neighbors=10,
    fields="content_vector"
)

results = client.search(
    search_text="azure search",
    vector_queries=[vector_query],
    select=["id", "title", "content"],
    top=10
)
```
## Semantic Ranking
|
||||
|
||||
```python
|
||||
from azure.search.documents.models import QueryType
|
||||
|
||||
results = client.search(
|
||||
search_text="what is azure search",
|
||||
query_type=QueryType.SEMANTIC,
|
||||
semantic_configuration_name="my-semantic-config",
|
||||
select=["id", "title", "content"],
|
||||
top=10
|
||||
)
|
||||
|
||||
for result in results:
|
||||
print(f"{result['title']}")
|
||||
if result.get("@search.captions"):
|
||||
print(f" Caption: {result['@search.captions'][0].text}")
|
||||
```
|
||||
|
||||
## Filters
|
||||
|
||||
```python
|
||||
results = client.search(
|
||||
search_text="*",
|
||||
filter="category eq 'Technology' and rating gt 4",
|
||||
order_by=["rating desc"],
|
||||
select=["id", "title", "category", "rating"]
|
||||
)
|
||||
```
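OData filter strings like the one above are easy to get wrong by hand: string values need single quotes, and embedded quotes are doubled. A small helper can assemble them safely. This is our own sketch (`odata_eq` and `odata_and` are not SDK functions):

```python
def odata_eq(field: str, value) -> str:
    """Build a single OData equality clause, quoting strings safely."""
    if isinstance(value, str):
        escaped = value.replace("'", "''")  # OData escapes ' by doubling it
        return f"{field} eq '{escaped}'"
    if isinstance(value, bool):
        return f"{field} eq {str(value).lower()}"  # true / false
    return f"{field} eq {value}"

def odata_and(*clauses: str) -> str:
    """Join clauses with 'and'."""
    return " and ".join(clauses)

# Reproduce the filter used above
f = odata_and(odata_eq("category", "Technology"), "rating gt 4")
```

The `bool` check must come before the numeric fallback, since `bool` is a subclass of `int` in Python.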

## Facets

```python
results = client.search(
    search_text="*",
    facets=["category,count:10", "rating"],
    top=0  # Only get facets, no documents
)

for facet_name, facet_values in results.get_facets().items():
    print(f"{facet_name}:")
    for facet in facet_values:
        print(f"  {facet['value']}: {facet['count']}")
```

## Autocomplete & Suggest

```python
# Autocomplete
results = client.autocomplete(
    search_text="sea",
    suggester_name="my-suggester",
    mode="twoTerms"
)

# Suggest
results = client.suggest(
    search_text="sea",
    suggester_name="my-suggester",
    select=["title"]
)
```

## Indexer with Skillset

```python
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexer,
    SearchIndexerDataSourceConnection,
    SearchIndexerDataSourceContainer,
    SearchIndexerSkillset,
    EntityRecognitionSkill,
    InputFieldMappingEntry,
    OutputFieldMappingEntry
)

indexer_client = SearchIndexerClient(endpoint, AzureKeyCredential(key))

# Create data source
data_source = SearchIndexerDataSourceConnection(
    name="my-datasource",
    type="azureblob",
    connection_string=connection_string,
    container=SearchIndexerDataSourceContainer(name="documents")
)
indexer_client.create_or_update_data_source_connection(data_source)

# Create skillset
skillset = SearchIndexerSkillset(
    name="my-skillset",
    skills=[
        EntityRecognitionSkill(
            inputs=[InputFieldMappingEntry(name="text", source="/document/content")],
            outputs=[OutputFieldMappingEntry(name="organizations", target_name="organizations")]
        )
    ]
)
indexer_client.create_or_update_skillset(skillset)

# Create indexer
indexer = SearchIndexer(
    name="my-indexer",
    data_source_name="my-datasource",
    target_index_name="my-index",
    skillset_name="my-skillset"
)
indexer_client.create_or_update_indexer(indexer)
```

## Best Practices

1. **Use hybrid search** for the best relevance, combining vector and keyword scoring
2. **Enable semantic ranking** for natural language queries
3. **Index in batches** of 100-1000 documents for efficiency
4. **Use filters** to narrow results before ranking
5. **Configure vector dimensions** to match your embedding model
6. **Use the HNSW algorithm** for large-scale vector search
7. **Create suggesters** at index creation time (they cannot be added later)
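Practice 3 (batching) can be done with a small generator when you are not using a buffered sender; the helper below is our own sketch, not part of the SDK:

```python
from typing import Iterator

def batched(docs: list[dict], batch_size: int = 1000) -> Iterator[list[dict]]:
    """Yield documents in fixed-size batches suitable for upload_documents()."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]

# Usage (client as created above):
#   for batch in batched(documents, 500):
#       client.upload_documents(batch)
```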

## Reference Files

| File | Contents |
|------|----------|
| [references/vector-search.md](references/vector-search.md) | HNSW configuration, integrated vectorization, multi-vector queries |
| [references/semantic-ranking.md](references/semantic-ranking.md) | Semantic configuration, captions, answers, hybrid patterns |
| [scripts/setup_vector_index.py](scripts/setup_vector_index.py) | CLI script to create vector-enabled search index |

---

## Additional Azure AI Search Patterns

# Azure AI Search Python SDK

Write clean, idiomatic Python code for Azure AI Search using `azure-search-documents`.

## Installation

```bash
pip install azure-search-documents azure-identity
```

## Environment Variables

```bash
AZURE_SEARCH_ENDPOINT=https://<search-service>.search.windows.net
AZURE_SEARCH_INDEX_NAME=<index-name>
# For API key auth (not recommended for production)
AZURE_SEARCH_API_KEY=<api-key>
```

## Authentication

**DefaultAzureCredential (preferred)**:
```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

credential = DefaultAzureCredential()
client = SearchClient(endpoint, index_name, credential)
```

**API Key**:
```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(endpoint, index_name, AzureKeyCredential(api_key))
```

## Client Selection

| Client | Purpose |
|--------|---------|
| `SearchClient` | Query indexes, upload/update/delete documents |
| `SearchIndexClient` | Create/manage indexes, knowledge sources, knowledge bases |
| `SearchIndexerClient` | Manage indexers, skillsets, data sources |
| `KnowledgeBaseRetrievalClient` | Agentic retrieval with LLM-powered Q&A |

## Index Creation Pattern

```python
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex, SearchField, VectorSearch, VectorSearchProfile,
    HnswAlgorithmConfiguration, AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters, SemanticSearch,
    SemanticConfiguration, SemanticPrioritizedFields, SemanticField
)

index = SearchIndex(
    name=index_name,
    fields=[
        SearchField(name="id", type="Edm.String", key=True),
        SearchField(name="content", type="Edm.String", searchable=True),
        SearchField(name="embedding", type="Collection(Edm.Single)",
                    vector_search_dimensions=3072,
                    vector_search_profile_name="vector-profile"),
    ],
    vector_search=VectorSearch(
        profiles=[VectorSearchProfile(
            name="vector-profile",
            algorithm_configuration_name="hnsw-algo",
            vectorizer_name="openai-vectorizer"
        )],
        algorithms=[HnswAlgorithmConfiguration(name="hnsw-algo")],
        vectorizers=[AzureOpenAIVectorizer(
            vectorizer_name="openai-vectorizer",
            parameters=AzureOpenAIVectorizerParameters(
                resource_url=aoai_endpoint,
                deployment_name=embedding_deployment,
                model_name=embedding_model
            )
        )]
    ),
    semantic_search=SemanticSearch(
        default_configuration_name="semantic-config",
        configurations=[SemanticConfiguration(
            name="semantic-config",
            prioritized_fields=SemanticPrioritizedFields(
                content_fields=[SemanticField(field_name="content")]
            )
        )]
    )
)

index_client = SearchIndexClient(endpoint, credential)
index_client.create_or_update_index(index)
```

## Document Operations

```python
from azure.search.documents import SearchIndexingBufferedSender

# Batch upload with automatic batching
with SearchIndexingBufferedSender(endpoint, index_name, credential) as sender:
    sender.upload_documents(documents)

# Direct operations via SearchClient
search_client = SearchClient(endpoint, index_name, credential)
search_client.upload_documents(documents)           # Add new
search_client.merge_documents(documents)            # Update existing
search_client.merge_or_upload_documents(documents)  # Upsert
search_client.delete_documents(documents)           # Remove
```

## Search Patterns

```python
# Basic search
results = search_client.search(search_text="query")

# Vector search
from azure.search.documents.models import VectorizedQuery

results = search_client.search(
    search_text=None,
    vector_queries=[VectorizedQuery(
        vector=embedding,
        k_nearest_neighbors=5,
        fields="embedding"
    )]
)

# Hybrid search (vector + keyword)
results = search_client.search(
    search_text="query",
    vector_queries=[VectorizedQuery(vector=embedding, k_nearest_neighbors=5, fields="embedding")],
    query_type="semantic",
    semantic_configuration_name="semantic-config"
)

# With filters
results = search_client.search(
    search_text="query",
    filter="category eq 'technology'",
    select=["id", "title", "content"],
    top=10
)
```

## Agentic Retrieval (Knowledge Bases)

For LLM-powered Q&A with answer synthesis, see [references/agentic-retrieval.md](references/agentic-retrieval.md).

Key concepts:
- **Knowledge Source**: Points to a search index
- **Knowledge Base**: Wraps knowledge sources + LLM for query planning and synthesis
- **Output modes**: `EXTRACTIVE_DATA` (raw chunks) or `ANSWER_SYNTHESIS` (LLM-generated answers)

## Async Pattern

```python
from azure.search.documents.aio import SearchClient

async with SearchClient(endpoint, index_name, credential) as client:
    results = await client.search(search_text="query")
    async for result in results:
        print(result["title"])
```

## Best Practices

1. **Use environment variables** for endpoints, keys, and deployment names
2. **Prefer `DefaultAzureCredential`** over API keys for production
3. **Use `SearchIndexingBufferedSender`** for batch uploads (it handles batching and retries)
4. **Always define a semantic configuration** for agentic retrieval indexes
5. **Use `create_or_update_index`** for idempotent index creation
6. **Close clients** with context managers or an explicit `close()`

## Field Types Reference

| EDM Type | Python | Notes |
|----------|--------|-------|
| `Edm.String` | str | Searchable text |
| `Edm.Int32` | int | Integer |
| `Edm.Int64` | int | Long integer |
| `Edm.Double` | float | Floating point |
| `Edm.Boolean` | bool | True/False |
| `Edm.DateTimeOffset` | datetime | ISO 8601 |
| `Collection(Edm.Single)` | List[float] | Vector embeddings |
| `Collection(Edm.String)` | List[str] | String arrays |
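The mapping above can be captured as a small lookup table, which is handy when generating field schemas or validating documents programmatically. This is our own sketch; `python_type_for` is not an SDK function:

```python
from datetime import datetime
from typing import List

# EDM type name -> Python type, mirroring the reference table above
EDM_TO_PYTHON = {
    "Edm.String": str,
    "Edm.Int32": int,
    "Edm.Int64": int,
    "Edm.Double": float,
    "Edm.Boolean": bool,
    "Edm.DateTimeOffset": datetime,
    "Collection(Edm.Single)": List[float],
    "Collection(Edm.String)": List[str],
}

def python_type_for(edm_type: str):
    """Look up the Python type for an EDM type name."""
    return EDM_TO_PYTHON[edm_type]
```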

## Error Handling

```python
from azure.core.exceptions import (
    HttpResponseError,
    ResourceNotFoundError,
    ResourceExistsError
)

try:
    result = search_client.get_document(key="123")
except ResourceNotFoundError:
    print("Document not found")
except HttpResponseError as e:
    print(f"Search error: {e.message}")
```

---
name: azure-speech-to-text-rest-py
description: |
  Azure Speech to Text REST API for short audio (Python). Use for simple speech recognition of audio files up to 60 seconds without the Speech SDK.
  Triggers: "speech to text REST", "short audio transcription", "speech recognition REST API", "STT REST", "recognize speech REST".
  DO NOT USE FOR: Long audio (>60 seconds), real-time streaming, batch transcription, custom speech models, speech translation. Use the Speech SDK or Batch Transcription API instead.
---

# Azure Speech to Text REST API for Short Audio

Simple REST API for speech-to-text transcription of short audio files (up to 60 seconds). No SDK required - just HTTP requests.

## Prerequisites

1. **Azure subscription** - [Create one free](https://azure.microsoft.com/free/)
2. **Speech resource** - Create one in the [Azure Portal](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices)
3. **Get credentials** - After deployment, go to the resource > Keys and Endpoint

## Environment Variables

```bash
# Required
AZURE_SPEECH_KEY=<your-speech-resource-key>
AZURE_SPEECH_REGION=<region>  # e.g., eastus, westus2, westeurope

# Alternative: Use endpoint directly
AZURE_SPEECH_ENDPOINT=https://<region>.stt.speech.microsoft.com
```

## Installation

```bash
pip install requests
```

## Quick Start

```python
import os
import requests

def transcribe_audio(audio_file_path: str, language: str = "en-US") -> dict:
    """Transcribe a short audio file (max 60 seconds) using the REST API."""
    region = os.environ["AZURE_SPEECH_REGION"]
    api_key = os.environ["AZURE_SPEECH_KEY"]

    url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"

    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json"
    }

    params = {
        "language": language,
        "format": "detailed"  # or "simple"
    }

    with open(audio_file_path, "rb") as audio_file:
        response = requests.post(url, headers=headers, params=params, data=audio_file)

    response.raise_for_status()
    return response.json()

# Usage
result = transcribe_audio("audio.wav", "en-US")
print(result["DisplayText"])
```

## Audio Requirements

| Format | Codec | Sample Rate | Notes |
|--------|-------|-------------|-------|
| WAV | PCM | 16 kHz, mono | **Recommended** |
| OGG | OPUS | 16 kHz, mono | Smaller file size |

**Limitations:**
- Maximum 60 seconds of audio
- For pronunciation assessment: maximum 30 seconds
- No partial/interim results (final only)
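Before posting audio, you can verify a WAV file matches the recommended 16 kHz mono PCM profile with the standard-library `wave` module. This is a local sanity check of our own, not part of the service API:

```python
import wave

def is_recommended_wav(path: str) -> bool:
    """Return True if the file is a 16 kHz, mono, 16-bit PCM WAV."""
    with wave.open(path, "rb") as w:
        return (
            w.getframerate() == 16000   # 16 kHz sample rate
            and w.getnchannels() == 1   # mono
            and w.getsampwidth() == 2   # 16-bit samples
        )
```

Checking locally is cheaper than waiting for a `400 Bad Request` from the service.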

## Content-Type Headers

```python
# WAV PCM 16 kHz
"Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000"

# OGG OPUS
"Content-Type": "audio/ogg; codecs=opus"
```

## Response Formats

### Simple Format (default)

```python
params = {"language": "en-US", "format": "simple"}
```

```json
{
  "RecognitionStatus": "Success",
  "DisplayText": "Remind me to buy 5 pencils.",
  "Offset": "1236645672289",
  "Duration": "1236645672289"
}
```

### Detailed Format

```python
params = {"language": "en-US", "format": "detailed"}
```

```json
{
  "RecognitionStatus": "Success",
  "Offset": "1236645672289",
  "Duration": "1236645672289",
  "NBest": [
    {
      "Confidence": 0.9052885,
      "Display": "What's the weather like?",
      "ITN": "what's the weather like",
      "Lexical": "what's the weather like",
      "MaskedITN": "what's the weather like"
    }
  ]
}
```

## Chunked Transfer (Recommended)

For lower latency, stream the audio in chunks:

```python
import os
import requests

def transcribe_chunked(audio_file_path: str, language: str = "en-US") -> dict:
    """Stream audio in chunks for lower latency."""
    region = os.environ["AZURE_SPEECH_REGION"]
    api_key = os.environ["AZURE_SPEECH_KEY"]

    url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"

    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
        "Transfer-Encoding": "chunked",
        "Expect": "100-continue"
    }

    params = {"language": language, "format": "detailed"}

    def generate_chunks(file_path: str, chunk_size: int = 1024):
        with open(file_path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk

    response = requests.post(
        url,
        headers=headers,
        params=params,
        data=generate_chunks(audio_file_path)
    )

    response.raise_for_status()
    return response.json()
```

## Authentication Options

### Option 1: Subscription Key (Simple)

```python
headers = {
    "Ocp-Apim-Subscription-Key": os.environ["AZURE_SPEECH_KEY"]
}
```

### Option 2: Bearer Token

```python
import os
import requests

def get_access_token() -> str:
    """Get an access token from the token endpoint."""
    region = os.environ["AZURE_SPEECH_REGION"]
    api_key = os.environ["AZURE_SPEECH_KEY"]

    token_url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"

    response = requests.post(
        token_url,
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/x-www-form-urlencoded",
            "Content-Length": "0"
        }
    )
    response.raise_for_status()
    return response.text

# Use the token in requests (valid for 10 minutes)
token = get_access_token()
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
    "Accept": "application/json"
}
```

## Query Parameters

| Parameter | Required | Values | Description |
|-----------|----------|--------|-------------|
| `language` | **Yes** | `en-US`, `de-DE`, etc. | Language of the speech |
| `format` | No | `simple`, `detailed` | Result format (default: simple) |
| `profanity` | No | `masked`, `removed`, `raw` | Profanity handling (default: masked) |

## Recognition Status Values

| Status | Description |
|--------|-------------|
| `Success` | Recognition succeeded |
| `NoMatch` | Speech was detected but no words matched |
| `InitialSilenceTimeout` | Only silence was detected |
| `BabbleTimeout` | Only noise was detected |
| `Error` | Internal service error |

## Profanity Handling

```python
# Mask profanity with asterisks (default)
params = {"language": "en-US", "profanity": "masked"}

# Remove profanity entirely
params = {"language": "en-US", "profanity": "removed"}

# Include profanity as-is
params = {"language": "en-US", "profanity": "raw"}
```

## Error Handling

```python
import os
import requests

def transcribe_with_error_handling(audio_path: str, language: str = "en-US") -> dict | None:
    """Transcribe with proper error handling."""
    region = os.environ["AZURE_SPEECH_REGION"]
    api_key = os.environ["AZURE_SPEECH_KEY"]

    url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"

    try:
        with open(audio_path, "rb") as audio_file:
            response = requests.post(
                url,
                headers={
                    "Ocp-Apim-Subscription-Key": api_key,
                    "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
                    "Accept": "application/json"
                },
                params={"language": language, "format": "detailed"},
                data=audio_file
            )

        if response.status_code == 200:
            result = response.json()
            if result.get("RecognitionStatus") == "Success":
                return result
            else:
                print(f"Recognition failed: {result.get('RecognitionStatus')}")
                return None
        elif response.status_code == 400:
            print("Bad request: check the language code or audio format")
        elif response.status_code == 401:
            print("Unauthorized: check the API key or token")
        elif response.status_code == 403:
            print("Forbidden: missing authorization header")
        else:
            print(f"Error {response.status_code}: {response.text}")

        return None

    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
```

## Async Version

```python
import asyncio
import os

import aiohttp

async def transcribe_async(audio_file_path: str, language: str = "en-US") -> dict:
    """Async version using aiohttp."""
    region = os.environ["AZURE_SPEECH_REGION"]
    api_key = os.environ["AZURE_SPEECH_KEY"]

    url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"

    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json"
    }

    params = {"language": language, "format": "detailed"}

    async with aiohttp.ClientSession() as session:
        with open(audio_file_path, "rb") as f:
            audio_data = f.read()

        async with session.post(url, headers=headers, params=params, data=audio_data) as response:
            response.raise_for_status()
            return await response.json()

# Usage
result = asyncio.run(transcribe_async("audio.wav", "en-US"))
print(result["DisplayText"])
```

## Supported Languages

Common language codes (see the [full list](https://learn.microsoft.com/azure/ai-services/speech-service/language-support)):

| Code | Language |
|------|----------|
| `en-US` | English (US) |
| `en-GB` | English (UK) |
| `de-DE` | German |
| `fr-FR` | French |
| `es-ES` | Spanish (Spain) |
| `es-MX` | Spanish (Mexico) |
| `zh-CN` | Chinese (Mandarin) |
| `ja-JP` | Japanese |
| `ko-KR` | Korean |
| `pt-BR` | Portuguese (Brazil) |

## Best Practices

1. **Use WAV PCM 16 kHz mono** for the best compatibility
2. **Enable chunked transfer** for lower latency
3. **Cache access tokens** for 9 minutes (they are valid for 10)
4. **Specify the correct language** for accurate recognition
5. **Use the detailed format** when you need confidence scores
6. **Handle all RecognitionStatus values** in production code
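Practice 3 (token caching) can be implemented with a tiny time-based cache. The sketch below takes the fetch function as a parameter so it works with any token source, such as the `get_access_token` helper shown earlier; `TokenCache` is our own helper, not part of the service API:

```python
import time

class TokenCache:
    """Cache a bearer token, refreshing shortly before its 10-minute expiry."""

    def __init__(self, fetch, ttl_seconds: float = 540):  # refresh at 9 minutes
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()       # only hits the network on miss/expiry
            self._expires_at = now + self._ttl
        return self._token

# Usage:
#   cache = TokenCache(get_access_token)
#   headers = {"Authorization": f"Bearer {cache.get()}"}
```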

## When NOT to Use This API

Use the Speech SDK or the Batch Transcription API instead when you need:

- Audio longer than 60 seconds
- Real-time streaming transcription
- Partial/interim results
- Speech translation
- Custom speech models
- Batch transcription of many files

## Reference Files

| File | Contents |
|------|----------|
| [references/pronunciation-assessment.md](references/pronunciation-assessment.md) | Pronunciation assessment parameters and scoring |

skills/official/microsoft/python/foundry/textanalytics/SKILL.md
---
name: azure-ai-textanalytics-py
description: |
  Azure AI Text Analytics SDK for sentiment analysis, entity recognition, key phrases, language detection, PII, and healthcare NLP. Use for natural language processing on text.
  Triggers: "text analytics", "sentiment analysis", "entity recognition", "key phrase", "PII detection", "TextAnalyticsClient".
package: azure-ai-textanalytics
---

# Azure AI Text Analytics SDK for Python

Client library for the Azure AI Language service's NLP capabilities, including sentiment, entities, key phrases, and more.

## Installation

```bash
pip install azure-ai-textanalytics
```

## Environment Variables

```bash
AZURE_LANGUAGE_ENDPOINT=https://<resource>.cognitiveservices.azure.com
AZURE_LANGUAGE_KEY=<your-api-key>  # If using API key
```

## Authentication

### API Key

```python
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

endpoint = os.environ["AZURE_LANGUAGE_ENDPOINT"]
key = os.environ["AZURE_LANGUAGE_KEY"]

client = TextAnalyticsClient(endpoint, AzureKeyCredential(key))
```

### Entra ID (Recommended)

```python
import os
from azure.ai.textanalytics import TextAnalyticsClient
from azure.identity import DefaultAzureCredential

client = TextAnalyticsClient(
    endpoint=os.environ["AZURE_LANGUAGE_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```

## Sentiment Analysis

```python
documents = [
    "I had a wonderful trip to Seattle last week!",
    "The food was terrible and the service was slow."
]

result = client.analyze_sentiment(documents, show_opinion_mining=True)

for doc in result:
    if not doc.is_error:
        print(f"Sentiment: {doc.sentiment}")
        print(f"Scores: pos={doc.confidence_scores.positive:.2f}, "
              f"neg={doc.confidence_scores.negative:.2f}, "
              f"neu={doc.confidence_scores.neutral:.2f}")

        # Opinion mining (aspect-based sentiment)
        for sentence in doc.sentences:
            for opinion in sentence.mined_opinions:
                target = opinion.target
                print(f"  Target: '{target.text}' - {target.sentiment}")
                for assessment in opinion.assessments:
                    print(f"    Assessment: '{assessment.text}' - {assessment.sentiment}")
```

## Entity Recognition

```python
documents = ["Microsoft was founded by Bill Gates and Paul Allen in Albuquerque."]

result = client.recognize_entities(documents)

for doc in result:
    if not doc.is_error:
        for entity in doc.entities:
            print(f"Entity: {entity.text}")
            print(f"  Category: {entity.category}")
            print(f"  Subcategory: {entity.subcategory}")
            print(f"  Confidence: {entity.confidence_score:.2f}")
```

## PII Detection

```python
documents = ["My SSN is 123-45-6789 and my email is john@example.com"]

result = client.recognize_pii_entities(documents)

for doc in result:
    if not doc.is_error:
        print(f"Redacted: {doc.redacted_text}")
        for entity in doc.entities:
            print(f"PII: {entity.text} ({entity.category})")
```

## Key Phrase Extraction

```python
documents = ["Azure AI provides powerful machine learning capabilities for developers."]

result = client.extract_key_phrases(documents)

for doc in result:
    if not doc.is_error:
        print(f"Key phrases: {doc.key_phrases}")
```

## Language Detection

```python
documents = ["Ce document est en français.", "This is written in English."]

result = client.detect_language(documents)

for doc in result:
    if not doc.is_error:
        print(f"Language: {doc.primary_language.name} ({doc.primary_language.iso6391_name})")
        print(f"Confidence: {doc.primary_language.confidence_score:.2f}")
```

## Healthcare Text Analytics

```python
documents = ["Patient has diabetes and was prescribed metformin 500mg twice daily."]

poller = client.begin_analyze_healthcare_entities(documents)
result = poller.result()

for doc in result:
    if not doc.is_error:
        for entity in doc.entities:
            print(f"Entity: {entity.text}")
            print(f"  Category: {entity.category}")
            print(f"  Normalized: {entity.normalized_text}")

            # Entity links (UMLS, etc.)
            for link in entity.data_sources:
                print(f"  Link: {link.name} - {link.entity_id}")
```

## Multiple Analyses (Batch)

```python
from azure.ai.textanalytics import (
    RecognizeEntitiesAction,
    ExtractKeyPhrasesAction,
    AnalyzeSentimentAction
)

documents = ["Microsoft announced new Azure AI features at the Build conference."]

poller = client.begin_analyze_actions(
    documents,
    actions=[
        RecognizeEntitiesAction(),
        ExtractKeyPhrasesAction(),
        AnalyzeSentimentAction()
    ]
)

results = poller.result()
for doc_results in results:
    for result in doc_results:
        if result.kind == "EntityRecognition":
            print(f"Entities: {[e.text for e in result.entities]}")
        elif result.kind == "KeyPhraseExtraction":
            print(f"Key phrases: {result.key_phrases}")
        elif result.kind == "SentimentAnalysis":
            print(f"Sentiment: {result.sentiment}")
```

## Async Client

```python
from azure.ai.textanalytics.aio import TextAnalyticsClient
from azure.identity.aio import DefaultAzureCredential

async def analyze():
    async with TextAnalyticsClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    ) as client:
        result = await client.analyze_sentiment(documents)
        # Process results...
```

## Client Types

| Client | Purpose |
|--------|---------|
| `TextAnalyticsClient` | All text analytics operations |
| `TextAnalyticsClient` (aio) | Async version |

## Available Operations

| Method | Description |
|--------|-------------|
| `analyze_sentiment` | Sentiment analysis with opinion mining |
| `recognize_entities` | Named entity recognition |
| `recognize_pii_entities` | PII detection and redaction |
| `recognize_linked_entities` | Entity linking to Wikipedia |
| `extract_key_phrases` | Key phrase extraction |
| `detect_language` | Language detection |
| `begin_analyze_healthcare_entities` | Healthcare NLP (long-running) |
| `begin_analyze_actions` | Multiple analyses in batch |

## Best Practices

1. **Use batch operations** for multiple documents (up to 10 per request)
2. **Enable opinion mining** for detailed aspect-based sentiment
3. **Use the async client** for high-throughput scenarios
4. **Handle document errors** — the results list may contain errors for some docs
5. **Specify the language** when known to improve accuracy
6. **Use a context manager** or close the client explicitly
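Practice 1 implies that inputs larger than 10 documents must be split before calling the client. A minimal sketch of our own (the 10-document cap comes from the list above; `document_batches` is not an SDK function):

```python
def document_batches(documents: list[str], max_per_request: int = 10):
    """Split documents into service-sized batches, preserving order."""
    for start in range(0, len(documents), max_per_request):
        yield documents[start:start + max_per_request]

# Usage (client as created above):
#   for batch in document_batches(docs):
#       for doc in client.analyze_sentiment(batch):
#           ...
```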
|
||||
@@ -0,0 +1,69 @@
---
name: azure-ai-transcription-py
description: |
  Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
  Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
package: azure-ai-transcription
---

# Azure AI Transcription SDK for Python

Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.

## Installation

```bash
pip install azure-ai-transcription
```

## Environment Variables

```bash
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>
```

## Authentication

Use subscription key authentication (DefaultAzureCredential is not supported for this client):

```python
import os
from azure.ai.transcription import TranscriptionClient

client = TranscriptionClient(
    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
    credential=os.environ["TRANSCRIPTION_KEY"]
)
```

## Transcription (Batch)

```python
job = client.begin_transcription(
    name="meeting-transcription",
    locale="en-US",
    content_urls=["https://<storage>/audio.wav"],
    diarization_enabled=True
)
result = job.result()
print(result.status)
```

## Transcription (Real-time)

```python
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
for event in stream:
    print(event.text)
```

## Best Practices

1. **Enable diarization** when multiple speakers are present
2. **Use batch transcription** for long files stored in blob storage
3. **Capture timestamps** for subtitle generation
4. **Specify language** to improve recognition accuracy
5. **Handle streaming backpressure** for real-time transcription
6. **Close transcription sessions** when complete
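Best practice 3 typically ends in subtitle output. The SRT timestamp format (`HH:MM:SS,mmm`) is a standard, so a small formatting helper is safe to show regardless of how the SDK exposes word or phrase offsets; obtaining the millisecond values from the transcription result is left to you:

```python
# Convert millisecond offsets into SRT-style timestamps and cues.
def srt_timestamp(ms: int) -> str:
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    seconds, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"

def srt_cue(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Render one numbered SRT cue block."""
    return f"{index}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"

print(srt_timestamp(3_723_456))  # → 01:02:03,456
```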
@@ -0,0 +1,249 @@
---
name: azure-ai-translation-document-py
description: |
  Azure AI Document Translation SDK for batch translation of documents with format preservation. Use for translating Word, PDF, Excel, PowerPoint, and other document formats at scale.
  Triggers: "document translation", "batch translation", "translate documents", "DocumentTranslationClient".
package: azure-ai-translation-document
---

# Azure AI Document Translation SDK for Python

Client library for the Azure AI Translator document translation service, for batch document translation with format preservation.

## Installation

```bash
pip install azure-ai-translation-document
```

## Environment Variables

```bash
AZURE_DOCUMENT_TRANSLATION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
AZURE_DOCUMENT_TRANSLATION_KEY=<your-api-key>  # If using API key

# Storage for source and target documents
AZURE_SOURCE_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
AZURE_TARGET_CONTAINER_URL=https://<storage>.blob.core.windows.net/<container>?<sas>
```

## Authentication

### API Key

```python
import os
from azure.ai.translation.document import DocumentTranslationClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"]
key = os.environ["AZURE_DOCUMENT_TRANSLATION_KEY"]

client = DocumentTranslationClient(endpoint, AzureKeyCredential(key))
```

### Entra ID (Recommended)

```python
from azure.ai.translation.document import DocumentTranslationClient
from azure.identity import DefaultAzureCredential

client = DocumentTranslationClient(
    endpoint=os.environ["AZURE_DOCUMENT_TRANSLATION_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```

## Basic Document Translation

```python
from azure.ai.translation.document import DocumentTranslationInput, TranslationTarget

source_url = os.environ["AZURE_SOURCE_CONTAINER_URL"]
target_url = os.environ["AZURE_TARGET_CONTAINER_URL"]

# Start translation job
poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(
                    target_url=target_url,
                    language="es"  # Translate to Spanish
                )
            ]
        )
    ]
)

# Wait for completion
result = poller.result()

print(f"Status: {poller.status()}")
print(f"Documents translated: {poller.details.documents_succeeded_count}")
print(f"Documents failed: {poller.details.documents_failed_count}")
```

## Multiple Target Languages

```python
poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(target_url=target_url_es, language="es"),
                TranslationTarget(target_url=target_url_fr, language="fr"),
                TranslationTarget(target_url=target_url_de, language="de")
            ]
        )
    ]
)
```

## Translate Single Document

```python
from azure.ai.translation.document import SingleDocumentTranslationClient

single_client = SingleDocumentTranslationClient(endpoint, AzureKeyCredential(key))

with open("document.docx", "rb") as f:
    document_content = f.read()

result = single_client.translate(
    body=document_content,
    target_language="es",
    content_type="application/vnd.openxmlformats-officedocument.wordprocessingml.document"
)

# Save translated document
with open("document_es.docx", "wb") as f:
    f.write(result)
```

## Check Translation Status

```python
# Get all translation operations
operations = client.list_translation_statuses()

for op in operations:
    print(f"Operation ID: {op.id}")
    print(f"Status: {op.status}")
    print(f"Created: {op.created_on}")
    print(f"Total documents: {op.documents_total_count}")
    print(f"Succeeded: {op.documents_succeeded_count}")
    print(f"Failed: {op.documents_failed_count}")
```

## List Document Statuses

```python
# Get status of individual documents in a job
operation_id = poller.id
document_statuses = client.list_document_statuses(operation_id)

for doc in document_statuses:
    print(f"Document: {doc.source_document_url}")
    print(f"  Status: {doc.status}")
    print(f"  Translated to: {doc.translated_to}")
    if doc.error:
        print(f"  Error: {doc.error.message}")
```

## Cancel Translation

```python
# Cancel a running translation
client.cancel_translation(operation_id)
```

## Using Glossary

```python
from azure.ai.translation.document import TranslationGlossary

poller = client.begin_translation(
    inputs=[
        DocumentTranslationInput(
            source_url=source_url,
            targets=[
                TranslationTarget(
                    target_url=target_url,
                    language="es",
                    glossaries=[
                        TranslationGlossary(
                            glossary_url="https://<storage>.blob.core.windows.net/glossary/terms.csv?<sas>",
                            file_format="csv"
                        )
                    ]
                )
            ]
        )
    ]
)
```

## Supported Document Formats

```python
# Get supported formats
formats = client.get_supported_document_formats()

for fmt in formats:
    print(f"Format: {fmt.format}")
    print(f"  Extensions: {fmt.file_extensions}")
    print(f"  Content types: {fmt.content_types}")
```

## Supported Languages

```python
# Get supported languages
languages = client.get_supported_languages()

for lang in languages:
    print(f"Language: {lang.name} ({lang.code})")
```

## Async Client

```python
from azure.ai.translation.document.aio import DocumentTranslationClient
from azure.identity.aio import DefaultAzureCredential

async def translate_documents():
    async with DocumentTranslationClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    ) as client:
        poller = await client.begin_translation(inputs=[...])
        result = await poller.result()
```

## Supported Formats

| Category | Formats |
|----------|---------|
| Documents | DOCX, PDF, PPTX, XLSX, HTML, TXT, RTF |
| Structured | CSV, TSV, JSON, XML |
| Localization | XLIFF, XLF, MHTML |

## Storage Requirements

- Source and target containers must be Azure Blob Storage
- Use SAS tokens with appropriate permissions:
  - Source: Read, List
  - Target: Write, List
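Since a missing or wrong SAS token is the most common cause of failed jobs, a cheap pre-flight check before submitting is to confirm each container URL actually carries a SAS query string (Azure SAS tokens always include a `sig` parameter). A minimal sanity check, not an authorization test:

```python
# Pre-flight check: does a container URL carry a SAS token at all?
from urllib.parse import urlparse, parse_qs

def has_sas_token(container_url: str) -> bool:
    """True if the URL's query string contains the SAS `sig` parameter."""
    query = parse_qs(urlparse(container_url).query)
    return "sig" in query

print(has_sas_token("https://acct.blob.core.windows.net/src?sv=2022-11-02&sig=abc"))  # → True
```

This does not verify the token's permissions or expiry; it only catches the case where the `?<sas>` suffix was dropped entirely.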
## Best Practices

1. **Use SAS tokens** with minimal required permissions
2. **Monitor long-running operations** with `poller.status()`
3. **Handle document-level errors** by iterating document statuses
4. **Use glossaries** for domain-specific terminology
5. **Separate target containers** for each language
6. **Use async client** for multiple concurrent jobs
7. **Check supported formats** before submitting documents
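Best practice 3 can be sketched as a tally over the per-document records returned by `list_document_statuses()`. Here a stub record stands in for the SDK's status objects so the pattern is runnable on its own:

```python
# Summarize document-level outcomes of a translation job.
# `DocStatus` is a stand-in for the SDK's per-document status records.
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class DocStatus:
    status: str                          # e.g. "Succeeded", "Failed", "Running"
    error_message: Optional[str] = None  # set when status == "Failed"

def summarize(doc_statuses):
    """Return (status counts, list of failure messages)."""
    counts = Counter(d.status for d in doc_statuses)
    errors = [d.error_message for d in doc_statuses if d.status == "Failed"]
    return counts, errors

counts, errors = summarize([
    DocStatus("Succeeded"),
    DocStatus("Failed", "InvalidDocument"),
    DocStatus("Succeeded"),
])
```

Against the real SDK you would feed `client.list_document_statuses(operation_id)` into the same function, reading `doc.status` and `doc.error.message` as shown in the List Document Statuses section.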
@@ -0,0 +1,274 @@
---
name: azure-ai-translation-text-py
description: |
  Azure AI Text Translation SDK for real-time text translation, transliteration, language detection, and dictionary lookup. Use for translating text content in applications.
  Triggers: "text translation", "translator", "translate text", "transliterate", "TextTranslationClient".
package: azure-ai-translation-text
---

# Azure AI Text Translation SDK for Python

Client library for the Azure AI Translator text translation service, for real-time text translation, transliteration, and language operations.

## Installation

```bash
pip install azure-ai-translation-text
```

## Environment Variables

```bash
AZURE_TRANSLATOR_KEY=<your-api-key>
AZURE_TRANSLATOR_REGION=<your-region>  # e.g., eastus, westus2
# Or use custom endpoint
AZURE_TRANSLATOR_ENDPOINT=https://<resource>.cognitiveservices.azure.com
```

## Authentication

### API Key with Region

```python
import os
from azure.ai.translation.text import TextTranslationClient
from azure.core.credentials import AzureKeyCredential

key = os.environ["AZURE_TRANSLATOR_KEY"]
region = os.environ["AZURE_TRANSLATOR_REGION"]

# Create credential with region
credential = AzureKeyCredential(key)
client = TextTranslationClient(credential=credential, region=region)
```

### API Key with Custom Endpoint

```python
endpoint = os.environ["AZURE_TRANSLATOR_ENDPOINT"]

client = TextTranslationClient(
    credential=AzureKeyCredential(key),
    endpoint=endpoint
)
```

### Entra ID (Recommended)

```python
from azure.ai.translation.text import TextTranslationClient
from azure.identity import DefaultAzureCredential

client = TextTranslationClient(
    credential=DefaultAzureCredential(),
    endpoint=os.environ["AZURE_TRANSLATOR_ENDPOINT"]
)
```

## Basic Translation

```python
# Translate to a single language
result = client.translate(
    body=["Hello, how are you?", "Welcome to Azure!"],
    to=["es"]  # Spanish
)

for item in result:
    for translation in item.translations:
        print(f"Translated: {translation.text}")
        print(f"Target language: {translation.to}")
```

## Translate to Multiple Languages

```python
result = client.translate(
    body=["Hello, world!"],
    to=["es", "fr", "de", "ja"]  # Spanish, French, German, Japanese
)

for item in result:
    print(f"Source: {item.detected_language.language if item.detected_language else 'unknown'}")
    for translation in item.translations:
        print(f"  {translation.to}: {translation.text}")
```

## Specify Source Language

```python
result = client.translate(
    body=["Bonjour le monde"],
    from_parameter="fr",  # Source is French
    to=["en", "es"]
)
```

## Language Detection

```python
result = client.translate(
    body=["Hola, como estas?"],
    to=["en"]
)

for item in result:
    if item.detected_language:
        print(f"Detected language: {item.detected_language.language}")
        print(f"Confidence: {item.detected_language.score:.2f}")
```

## Transliteration

Convert text from one script to another:

```python
result = client.transliterate(
    body=["konnichiwa"],
    language="ja",
    from_script="Latn",  # From Latin script
    to_script="Jpan"     # To Japanese script
)

for item in result:
    print(f"Transliterated: {item.text}")
    print(f"Script: {item.script}")
```

## Dictionary Lookup

Find alternate translations and definitions:

```python
result = client.lookup_dictionary_entries(
    body=["fly"],
    from_parameter="en",
    to="es"
)

for item in result:
    print(f"Source: {item.normalized_source} ({item.display_source})")
    for translation in item.translations:
        print(f"  Translation: {translation.normalized_target}")
        print(f"  Part of speech: {translation.pos_tag}")
        print(f"  Confidence: {translation.confidence:.2f}")
```

## Dictionary Examples

Get usage examples for translations:

```python
from azure.ai.translation.text.models import DictionaryExampleTextItem

result = client.lookup_dictionary_examples(
    body=[DictionaryExampleTextItem(text="fly", translation="volar")],
    from_parameter="en",
    to="es"
)

for item in result:
    for example in item.examples:
        print(f"Source: {example.source_prefix}{example.source_term}{example.source_suffix}")
        print(f"Target: {example.target_prefix}{example.target_term}{example.target_suffix}")
```

## Get Supported Languages

```python
# Get all supported languages
languages = client.get_supported_languages()

# Translation languages
print("Translation languages:")
for code, lang in languages.translation.items():
    print(f"  {code}: {lang.name} ({lang.native_name})")

# Transliteration languages
print("\nTransliteration languages:")
for code, lang in languages.transliteration.items():
    print(f"  {code}: {lang.name}")
    for script in lang.scripts:
        print(f"    {script.code} -> {[t.code for t in script.to_scripts]}")

# Dictionary languages
print("\nDictionary languages:")
for code, lang in languages.dictionary.items():
    print(f"  {code}: {lang.name}")
```

## Break Sentence

Identify sentence boundaries:

```python
result = client.find_sentence_boundaries(
    body=["Hello! How are you? I hope you are well."],
    language="en"
)

for item in result:
    print(f"Sentence lengths: {item.sent_len}")
```

## Translation Options

```python
result = client.translate(
    body=["Hello, world!"],
    to=["de"],
    text_type="html",              # "plain" or "html"
    profanity_action="Marked",     # "NoAction", "Deleted", "Marked"
    profanity_marker="Asterisk",   # "Asterisk", "Tag"
    include_alignment=True,        # Include word alignment
    include_sentence_length=True   # Include sentence boundaries
)

for item in result:
    translation = item.translations[0]
    print(f"Translated: {translation.text}")
    if translation.alignment:
        print(f"Alignment: {translation.alignment.proj}")
    if translation.sent_len:
        print(f"Sentence lengths: {translation.sent_len.src_sent_len}")
```

## Async Client

```python
from azure.ai.translation.text.aio import TextTranslationClient
from azure.core.credentials import AzureKeyCredential

async def translate_text():
    async with TextTranslationClient(
        credential=AzureKeyCredential(key),
        region=region
    ) as client:
        result = await client.translate(
            body=["Hello, world!"],
            to=["es"]
        )
        print(result[0].translations[0].text)
```

## Client Methods

| Method | Description |
|--------|-------------|
| `translate` | Translate text to one or more languages |
| `transliterate` | Convert text between scripts |
| `detect` | Detect language of text |
| `find_sentence_boundaries` | Identify sentence boundaries |
| `lookup_dictionary_entries` | Dictionary lookup for translations |
| `lookup_dictionary_examples` | Get usage examples |
| `get_supported_languages` | List supported languages |

## Best Practices

1. **Batch translations** — Send multiple texts in one request (up to 100)
2. **Specify source language** when known to improve accuracy
3. **Use async client** for high-throughput scenarios
4. **Cache language list** — Supported languages don't change frequently
5. **Handle profanity** appropriately for your application
6. **Use html text_type** when translating HTML content
7. **Include alignment** for applications needing word mapping
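Best practice 1 implies splitting large workloads into request-sized batches (the 100-text ceiling comes from the note above). The chunking itself is plain Python and can be shown runnably; calling `client.translate()` once per chunk is then straightforward:

```python
# Split a large list of texts into batches no bigger than the request limit.
from typing import Iterator, List

def chunked(texts: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `size` items."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]

batches = list(chunked([f"text {i}" for i in range(250)]))
print([len(b) for b in batches])  # → [100, 100, 50]
```

With a live client the loop body would be `result = client.translate(body=batch, to=["es"])`, collecting results across batches in order.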
@@ -0,0 +1,260 @@
---
name: azure-ai-vision-imageanalysis-py
description: |
  Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks.
  Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".
package: azure-ai-vision-imageanalysis
---

# Azure AI Vision Image Analysis SDK for Python

Client library for Azure AI Vision 4.0 image analysis including captions, tags, objects, OCR, and more.

## Installation

```bash
pip install azure-ai-vision-imageanalysis
```

## Environment Variables

```bash
VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
VISION_KEY=<your-api-key>  # If using API key
```

## Authentication

### API Key

```python
import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["VISION_ENDPOINT"]
key = os.environ["VISION_KEY"]

client = ImageAnalysisClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key)
)
```

### Entra ID (Recommended)

```python
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.identity import DefaultAzureCredential

client = ImageAnalysisClient(
    endpoint=os.environ["VISION_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```

## Analyze Image from URL

```python
from azure.ai.vision.imageanalysis.models import VisualFeatures

image_url = "https://example.com/image.jpg"

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[
        VisualFeatures.CAPTION,
        VisualFeatures.TAGS,
        VisualFeatures.OBJECTS,
        VisualFeatures.READ,
        VisualFeatures.PEOPLE,
        VisualFeatures.SMART_CROPS,
        VisualFeatures.DENSE_CAPTIONS
    ],
    gender_neutral_caption=True,
    language="en"
)
```

## Analyze Image from File

```python
with open("image.jpg", "rb") as f:
    image_data = f.read()

result = client.analyze(
    image_data=image_data,
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)
```

## Image Caption

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.CAPTION],
    gender_neutral_caption=True
)

if result.caption:
    print(f"Caption: {result.caption.text}")
    print(f"Confidence: {result.caption.confidence:.2f}")
```

## Dense Captions (Multiple Regions)

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.DENSE_CAPTIONS]
)

if result.dense_captions:
    for caption in result.dense_captions.list:
        print(f"Caption: {caption.text}")
        print(f"  Confidence: {caption.confidence:.2f}")
        print(f"  Bounding box: {caption.bounding_box}")
```

## Tags

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.TAGS]
)

if result.tags:
    for tag in result.tags.list:
        print(f"Tag: {tag.name} (confidence: {tag.confidence:.2f})")
```

## Object Detection

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.OBJECTS]
)

if result.objects:
    for obj in result.objects.list:
        print(f"Object: {obj.tags[0].name}")
        print(f"  Confidence: {obj.tags[0].confidence:.2f}")
        box = obj.bounding_box
        print(f"  Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```

## OCR (Text Extraction)

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.READ]
)

if result.read:
    for block in result.read.blocks:
        for line in block.lines:
            print(f"Line: {line.text}")
            print(f"  Bounding polygon: {line.bounding_polygon}")

            # Word-level details
            for word in line.words:
                print(f"  Word: {word.text} (confidence: {word.confidence:.2f})")
```

## People Detection

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.PEOPLE]
)

if result.people:
    for person in result.people.list:
        print("Person detected:")
        print(f"  Confidence: {person.confidence:.2f}")
        box = person.bounding_box
        print(f"  Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```

## Smart Cropping

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.SMART_CROPS],
    smart_crops_aspect_ratios=[0.9, 1.33, 1.78]  # Portrait, 4:3, 16:9
)

if result.smart_crops:
    for crop in result.smart_crops.list:
        print(f"Aspect ratio: {crop.aspect_ratio}")
        box = crop.bounding_box
        print(f"  Crop region: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```

## Async Client

```python
from azure.ai.vision.imageanalysis.aio import ImageAnalysisClient
from azure.identity.aio import DefaultAzureCredential

async def analyze_image():
    async with ImageAnalysisClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    ) as client:
        result = await client.analyze_from_url(
            image_url=image_url,
            visual_features=[VisualFeatures.CAPTION]
        )
        print(result.caption.text)
```

## Visual Features

| Feature | Description |
|---------|-------------|
| `CAPTION` | Single sentence describing the image |
| `DENSE_CAPTIONS` | Captions for multiple regions |
| `TAGS` | Content tags (objects, scenes, actions) |
| `OBJECTS` | Object detection with bounding boxes |
| `READ` | OCR text extraction |
| `PEOPLE` | People detection with bounding boxes |
| `SMART_CROPS` | Suggested crop regions for thumbnails |

## Error Handling

```python
from azure.core.exceptions import HttpResponseError

try:
    result = client.analyze_from_url(
        image_url=image_url,
        visual_features=[VisualFeatures.CAPTION]
    )
except HttpResponseError as e:
    print(f"Status code: {e.status_code}")
    print(f"Reason: {e.reason}")
    print(f"Message: {e.error.message}")
```

## Image Requirements

- Formats: JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO
- Max size: 20 MB
- Dimensions: 50x50 to 16000x16000 pixels
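The size and dimension limits above can be checked locally before uploading, avoiding a round trip that would only fail with an `HttpResponseError`. A minimal pre-flight check encoding those limits (how you obtain the width and height, e.g. via Pillow, is up to you):

```python
# Pre-flight validation against the documented Image Analysis limits.
def meets_vision_limits(size_bytes: int, width: int, height: int) -> bool:
    """True if the image fits within the 20 MB / 50x50 to 16000x16000 limits."""
    MAX_BYTES = 20 * 1024 * 1024  # 20 MB
    return (size_bytes <= MAX_BYTES
            and 50 <= width <= 16000
            and 50 <= height <= 16000)

print(meets_vision_limits(1_000_000, 1920, 1080))  # → True
print(meets_vision_limits(1_000_000, 32, 32))      # → False
```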
## Best Practices

1. **Select only needed features** to optimize latency and cost
2. **Use async client** for high-throughput scenarios
3. **Handle HttpResponseError** for invalid images or auth issues
4. **Enable gender_neutral_caption** for inclusive descriptions
5. **Specify language** for localized captions
6. **Use smart_crops_aspect_ratios** matching your thumbnail requirements
7. **Cache results** when analyzing the same image multiple times
309
skills/official/microsoft/python/foundry/voicelive/SKILL.md
Normal file
@@ -0,0 +1,309 @@
---
name: azure-ai-voicelive-py
description: Build real-time voice AI applications using Azure AI Voice Live SDK (azure-ai-voicelive). Use this skill when creating Python applications that need real-time bidirectional audio communication with Azure AI, including voice assistants, voice-enabled chatbots, real-time speech-to-speech translation, voice-driven avatars, or any WebSocket-based audio streaming with AI models. Supports Server VAD (Voice Activity Detection), turn-based conversation, function calling, MCP tools, avatar integration, and transcription.
package: azure-ai-voicelive
---

# Azure AI Voice Live SDK

Build real-time voice AI applications with bidirectional WebSocket communication.

## Installation

```bash
pip install azure-ai-voicelive aiohttp azure-identity
```

## Environment Variables

```bash
AZURE_COGNITIVE_SERVICES_ENDPOINT=https://<region>.api.cognitive.microsoft.com
# For API key auth (not recommended for production)
AZURE_COGNITIVE_SERVICES_KEY=<api-key>
```

## Authentication

**DefaultAzureCredential (preferred)**:

```python
from azure.ai.voicelive.aio import connect
from azure.identity.aio import DefaultAzureCredential

async with connect(
    endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
    credential=DefaultAzureCredential(),
    model="gpt-4o-realtime-preview",
    credential_scopes=["https://cognitiveservices.azure.com/.default"]
) as conn:
    ...
```

**API Key**:

```python
from azure.ai.voicelive.aio import connect
from azure.core.credentials import AzureKeyCredential

async with connect(
    endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_COGNITIVE_SERVICES_KEY"]),
    model="gpt-4o-realtime-preview"
) as conn:
    ...
```

## Quick Start

```python
import asyncio
import os
from azure.ai.voicelive.aio import connect
from azure.identity.aio import DefaultAzureCredential

async def main():
    async with connect(
        endpoint=os.environ["AZURE_COGNITIVE_SERVICES_ENDPOINT"],
        credential=DefaultAzureCredential(),
        model="gpt-4o-realtime-preview",
        credential_scopes=["https://cognitiveservices.azure.com/.default"]
    ) as conn:
        # Update session with instructions
        await conn.session.update(session={
            "instructions": "You are a helpful assistant.",
            "modalities": ["text", "audio"],
            "voice": "alloy"
        })

        # Listen for events
        async for event in conn:
            print(f"Event: {event.type}")
            if event.type == "response.audio_transcript.done":
                print(f"Transcript: {event.transcript}")
            elif event.type == "response.done":
                break

asyncio.run(main())
```

## Core Architecture

### Connection Resources

The `VoiceLiveConnection` exposes these resources:

| Resource | Purpose | Key Methods |
|----------|---------|-------------|
| `conn.session` | Session configuration | `update(session=...)` |
| `conn.response` | Model responses | `create()`, `cancel()` |
| `conn.input_audio_buffer` | Audio input | `append()`, `commit()`, `clear()` |
| `conn.output_audio_buffer` | Audio output | `clear()` |
| `conn.conversation` | Conversation state | `item.create()`, `item.delete()`, `item.truncate()` |
| `conn.transcription_session` | Transcription config | `update(session=...)` |

## Session Configuration

```python
from azure.ai.voicelive.models import RequestSession, FunctionTool

await conn.session.update(session=RequestSession(
    instructions="You are a helpful voice assistant.",
    modalities=["text", "audio"],
    voice="alloy",  # or "echo", "shimmer", "sage", etc.
    input_audio_format="pcm16",
    output_audio_format="pcm16",
    turn_detection={
        "type": "server_vad",
        "threshold": 0.5,
        "prefix_padding_ms": 300,
        "silence_duration_ms": 500
    },
    tools=[
        FunctionTool(
            type="function",
            name="get_weather",
            description="Get current weather",
            parameters={
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        )
    ]
))
```

## Audio Streaming

### Send Audio (Base64 PCM16)

```python
import base64

# Read audio chunk (16-bit PCM, 24kHz mono)
audio_chunk = await read_audio_from_microphone()
b64_audio = base64.b64encode(audio_chunk).decode()

await conn.input_audio_buffer.append(audio=b64_audio)
```
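The `append` call above takes one base64 chunk at a time, so a capture loop usually slices raw PCM into fixed-duration pieces first. A runnable helper assuming the 16-bit, 24 kHz, mono format noted in the comment above (chunk duration is a free choice):

```python
# Slice raw 16-bit mono PCM into base64-encoded chunks of fixed duration.
import base64

def pcm16_chunks(pcm: bytes, chunk_ms: int = 100, sample_rate: int = 24_000):
    """Yield base64 strings, each covering chunk_ms of audio (2 bytes/sample)."""
    bytes_per_chunk = sample_rate * 2 * chunk_ms // 1000
    for start in range(0, len(pcm), bytes_per_chunk):
        yield base64.b64encode(pcm[start:start + bytes_per_chunk]).decode()

# One second of silence at 24 kHz mono → ten 100 ms chunks
chunks = list(pcm16_chunks(b"\x00" * 48_000))
print(len(chunks))  # → 10
```

In a capture loop, each yielded string would be passed to `await conn.input_audio_buffer.append(audio=chunk)`.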
|
||||
### Receive Audio
|
||||
|
||||
```python
|
||||
async for event in conn:
|
||||
if event.type == "response.audio.delta":
|
||||
audio_bytes = base64.b64decode(event.delta)
|
||||
await play_audio(audio_bytes)
|
||||
elif event.type == "response.audio.done":
|
||||
print("Audio complete")
|
||||
```
|
||||
|
||||
## Event Handling

```python
import base64
import json

async for event in conn:
    match event.type:
        # Session events
        case "session.created":
            print(f"Session: {event.session}")
        case "session.updated":
            print("Session updated")

        # Audio input events
        case "input_audio_buffer.speech_started":
            print(f"Speech started at {event.audio_start_ms}ms")
        case "input_audio_buffer.speech_stopped":
            print(f"Speech stopped at {event.audio_end_ms}ms")

        # Transcription events
        case "conversation.item.input_audio_transcription.completed":
            print(f"User said: {event.transcript}")
        case "conversation.item.input_audio_transcription.delta":
            print(f"Partial: {event.delta}")

        # Response events
        case "response.created":
            print(f"Response started: {event.response.id}")
        case "response.audio_transcript.delta":
            print(event.delta, end="", flush=True)
        case "response.audio.delta":
            audio = base64.b64decode(event.delta)
        case "response.done":
            print(f"Response complete: {event.response.status}")

        # Function calls
        case "response.function_call_arguments.done":
            result = handle_function(event.name, event.arguments)
            await conn.conversation.item.create(item={
                "type": "function_call_output",
                "call_id": event.call_id,
                "output": json.dumps(result)
            })
            await conn.response.create()

        # Errors
        case "error":
            print(f"Error: {event.error.message}")
```
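
The `handle_function` call in the event loop is left to the application: arguments arrive as a JSON string, and the return value is serialized back as the `function_call_output`. A minimal stdlib-only dispatcher sketch (the registry and tool implementation are illustrative, not part of the SDK):

```python
import json

# Hypothetical registry mapping tool names to Python callables.
FUNCTIONS = {
    "get_weather": lambda location: {"location": location, "forecast": "sunny"},
}

def handle_function(name: str, arguments: str):
    """Decode the model's JSON arguments and invoke the matching tool."""
    if name not in FUNCTIONS:
        return {"error": f"unknown function: {name}"}
    kwargs = json.loads(arguments)  # arguments arrive as a JSON string
    return FUNCTIONS[name](**kwargs)

result = handle_function("get_weather", '{"location": "Seattle"}')
print(json.dumps(result))  # this string becomes the function_call_output
```

Returning an error object for unknown names keeps the conversation alive instead of raising inside the event loop.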
## Common Patterns

### Manual Turn Mode (No VAD)

```python
await conn.session.update(session={"turn_detection": None})

# Manually control turns
await conn.input_audio_buffer.append(audio=b64_audio)
await conn.input_audio_buffer.commit()  # End of user turn
await conn.response.create()  # Trigger response
```

### Interrupt Handling

```python
async for event in conn:
    if event.type == "input_audio_buffer.speech_started":
        # User interrupted - cancel current response
        await conn.response.cancel()
        await conn.output_audio_buffer.clear()
```
### Conversation History

```python
# Add system message
await conn.conversation.item.create(item={
    "type": "message",
    "role": "system",
    "content": [{"type": "input_text", "text": "Be concise."}]
})

# Add user message
await conn.conversation.item.create(item={
    "type": "message",
    "role": "user",
    "content": [{"type": "input_text", "text": "Hello!"}]
})

await conn.response.create()
```
## Voice Options

| Voice | Description |
|-------|-------------|
| `alloy` | Neutral, balanced |
| `echo` | Warm, conversational |
| `shimmer` | Clear, professional |
| `sage` | Calm, authoritative |
| `coral` | Friendly, upbeat |
| `ash` | Deep, measured |
| `ballad` | Expressive |
| `verse` | Storytelling |

Azure voices: Use `AzureStandardVoice`, `AzureCustomVoice`, or `AzurePersonalVoice` models.
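
As a configuration sketch, an Azure voice can be passed in place of the string voice name. This assumes `AzureStandardVoice` accepts the Azure TTS voice identifier as `name` (the voice name below is illustrative; see [references/models.md](references/models.md) for the exact fields):

```python
from azure.ai.voicelive.models import RequestSession, AzureStandardVoice

# Hypothetical voice name - any Azure neural TTS voice identifier
await conn.session.update(session=RequestSession(
    voice=AzureStandardVoice(name="en-US-AvaNeural"),
    modalities=["text", "audio"],
))
```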
## Audio Formats

| Format | Sample Rate | Use Case |
|--------|-------------|----------|
| `pcm16` | 24kHz | Default, high quality |
| `pcm16-8000hz` | 8kHz | Telephony |
| `pcm16-16000hz` | 16kHz | Voice assistants |
| `g711_ulaw` | 8kHz | Telephony (US) |
| `g711_alaw` | 8kHz | Telephony (EU) |
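
The format determines bandwidth: PCM16 uses 2 bytes per sample, while the G.711 codecs compress each sample to 1 byte. A stdlib-only helper for sizing buffers per format (the helper and table are illustrative, not part of the SDK):

```python
# format -> (sample_rate_hz, bytes_per_sample), rates from the table above
FORMATS = {
    "pcm16": (24_000, 2),
    "pcm16-8000hz": (8_000, 2),
    "pcm16-16000hz": (16_000, 2),
    "g711_ulaw": (8_000, 1),  # G.711 encodes each sample in one byte
    "g711_alaw": (8_000, 1),
}

def bytes_per_second(fmt: str) -> int:
    """Raw audio bandwidth for a given format."""
    rate, width = FORMATS[fmt]
    return rate * width

print(bytes_per_second("pcm16"))      # 48000
print(bytes_per_second("g711_ulaw"))  # 8000
```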
## Turn Detection Options

```python
# Server VAD (default)
{"type": "server_vad", "threshold": 0.5, "silence_duration_ms": 500}

# Azure Semantic VAD (smarter detection)
{"type": "azure_semantic_vad"}
{"type": "azure_semantic_vad_en"}  # English optimized
{"type": "azure_semantic_vad_multilingual"}
```
## Error Handling

```python
from azure.ai.voicelive.aio import connect, ConnectionError, ConnectionClosed

try:
    async with connect(...) as conn:
        async for event in conn:
            if event.type == "error":
                print(f"API Error: {event.error.code} - {event.error.message}")
except ConnectionClosed as e:
    print(f"Connection closed: {e.code} - {e.reason}")
except ConnectionError as e:
    print(f"Connection error: {e}")
```
## References

- **Detailed API Reference**: See [references/api-reference.md](references/api-reference.md)
- **Complete Examples**: See [references/examples.md](references/examples.md)
- **All Models & Types**: See [references/models.md](references/models.md)