---
name: azure-cosmos-py
description: Azure Cosmos DB SDK for Python (NoSQL API). Use for document CRUD, queries, containers, and globally distributed data.
risk: unknown
source: community
date_added: '2026-02-27'
---

# Azure Cosmos DB SDK for Python

Client library for the Azure Cosmos DB NoSQL API, a globally distributed, multi-model database service.

## Installation

```bash
pip install azure-cosmos azure-identity
```

## Environment Variables

```bash
COSMOS_ENDPOINT=https://<account>.documents.azure.com:443/
COSMOS_DATABASE=mydb
COSMOS_CONTAINER=mycontainer
```

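The variables above can be read once at startup instead of hard-coding values. The following is a minimal sketch: `load_cosmos_settings` and `REQUIRED_VARS` are hypothetical names introduced for illustration and are not part of the SDK.

```python
import os

# Variable names come from the list above; the helper itself is hypothetical.
REQUIRED_VARS = ("COSMOS_ENDPOINT", "COSMOS_DATABASE", "COSMOS_CONTAINER")


def load_cosmos_settings(env=None):
    """Return the Cosmos DB settings as a dict, failing fast on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "endpoint": env["COSMOS_ENDPOINT"],
        "database": env["COSMOS_DATABASE"],
        "container": env["COSMOS_CONTAINER"],
    }
```

The returned `endpoint` can then be passed as the `url` argument when constructing `CosmosClient`, and the other two values to `get_database_client` and `get_container_client`.
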
## Authentication

```python
from azure.identity import DefaultAzureCredential
from azure.cosmos import CosmosClient

credential = DefaultAzureCredential()
endpoint = "https://<account>.documents.azure.com:443/"

client = CosmosClient(url=endpoint, credential=credential)
```

## Client Hierarchy

| Client | Purpose | Get From |
|--------|---------|----------|
| `CosmosClient` | Account-level operations | Direct instantiation |
| `DatabaseProxy` | Database operations | `client.get_database_client()` |
| `ContainerProxy` | Container/item operations | `database.get_container_client()` |

## Core Workflow

### Setup Database and Container

```python
from azure.cosmos import PartitionKey

# Get or create database
database = client.create_database_if_not_exists(id="mydb")

# Get or create container with partition key
container = database.create_container_if_not_exists(
    id="mycontainer",
    partition_key=PartitionKey(path="/category")
)

# Or get clients for an existing database and container
database = client.get_database_client("mydb")
container = database.get_container_client("mycontainer")
```

### Create Item

```python
item = {
    "id": "item-001",           # Required: unique within partition
    "category": "electronics",  # Partition key value
    "name": "Laptop",
    "price": 999.99,
    "tags": ["computer", "portable"]
}

created = container.create_item(body=item)
print(f"Created: {created['id']}")
```

### Read Item

```python
# Read requires id AND partition key
item = container.read_item(
    item="item-001",
    partition_key="electronics"
)
print(f"Name: {item['name']}")
```

### Update Item (Replace)

```python
item = container.read_item(item="item-001", partition_key="electronics")
item["price"] = 899.99
item["on_sale"] = True

updated = container.replace_item(item=item["id"], body=item)
```

### Upsert Item

```python
# Create if not exists, replace if exists
item = {
    "id": "item-002",
    "category": "electronics",
    "name": "Tablet",
    "price": 499.99
}

result = container.upsert_item(body=item)
```

### Delete Item

```python
container.delete_item(
    item="item-001",
    partition_key="electronics"
)
```

## Queries

### Basic Query

```python
# Query within a partition (efficient)
query = "SELECT * FROM c WHERE c.price < @max_price"
items = container.query_items(
    query=query,
    parameters=[{"name": "@max_price", "value": 500}],
    partition_key="electronics"
)

for item in items:
    print(f"{item['name']}: ${item['price']}")
```

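The `parameters` argument is a list of `{"name": ..., "value": ...}` dictionaries, which gets verbose with several parameters. A small sketch of a helper that builds that list from a plain dictionary follows; `to_query_parameters` is a hypothetical name, not an SDK function.

```python
def to_query_parameters(values):
    """Convert {"max_price": 500} into the list-of-dicts form shown above."""
    return [{"name": f"@{name}", "value": value} for name, value in values.items()]


parameters = to_query_parameters({"max_price": 500})
print(parameters)  # [{'name': '@max_price', 'value': 500}]
```

The result can be passed directly as `parameters=` to `container.query_items`, keeping user-supplied values out of the query string.
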
### Cross-Partition Query

```python
# Cross-partition (more expensive, use sparingly)
query = "SELECT * FROM c WHERE c.price < @max_price"
items = container.query_items(
    query=query,
    parameters=[{"name": "@max_price", "value": 500}],
    enable_cross_partition_query=True
)

for item in items:
    print(item)
```

### Query with Projection

```python
query = "SELECT c.id, c.name, c.price FROM c WHERE c.category = @category"
items = container.query_items(
    query=query,
    parameters=[{"name": "@category", "value": "electronics"}],
    partition_key="electronics"
)
```

### Read All Items

```python
# Read every item in the container (spans all partitions)
items = container.read_all_items()

# Or read every item in a single partition
items = container.query_items(
    query="SELECT * FROM c",
    partition_key="electronics"
)
```

## Partition Keys

**Critical**: Always include the partition key for efficient operations.

```python
from azure.cosmos import PartitionKey

# Single partition key
container = database.create_container_if_not_exists(
    id="orders",
    partition_key=PartitionKey(path="/customer_id")
)

# Hierarchical partition key (preview); a list of paths uses kind="MultiHash"
container = database.create_container_if_not_exists(
    id="events",
    partition_key=PartitionKey(path=["/tenant_id", "/user_id"], kind="MultiHash")
)
```

## Throughput

```python
# Create container with provisioned throughput
container = database.create_container_if_not_exists(
    id="mycontainer",
    partition_key=PartitionKey(path="/pk"),
    offer_throughput=400  # RU/s
)

# Read current throughput (get_throughput supersedes the deprecated read_offer)
offer = container.get_throughput()
print(f"Throughput: {offer.offer_throughput} RU/s")

# Update throughput
container.replace_throughput(throughput=1000)
```

## Async Client

```python
import asyncio

from azure.cosmos.aio import CosmosClient
from azure.identity.aio import DefaultAzureCredential

endpoint = "https://<account>.documents.azure.com:443/"

async def cosmos_operations():
    # Both the credential and the client are closed when the block exits
    async with DefaultAzureCredential() as credential, \
            CosmosClient(endpoint, credential=credential) as client:
        database = client.get_database_client("mydb")
        container = database.get_container_client("mycontainer")

        # Create
        await container.create_item(body={"id": "1", "pk": "test"})

        # Read
        item = await container.read_item(item="1", partition_key="test")

        # Query
        async for item in container.query_items(
            query="SELECT * FROM c",
            partition_key="test"
        ):
            print(item)

asyncio.run(cosmos_operations())
```

## Error Handling

```python
from azure.cosmos.exceptions import CosmosHttpResponseError

try:
    item = container.read_item(item="nonexistent", partition_key="pk")
except CosmosHttpResponseError as e:
    if e.status_code == 404:
        print("Item not found")
    elif e.status_code == 429:
        print(f"Rate limited. Retry after: {e.headers.get('x-ms-retry-after-ms')}ms")
    else:
        raise
```

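Instead of only printing the retry hint, a caller can wait for the suggested interval and try again. The sketch below is one possible approach, not the SDK's own mechanism: `call_with_throttle_retry` is a hypothetical helper, the 1000 ms fallback delay is an arbitrary assumption, and the error is checked by duck typing (`status_code` and `headers` attributes) so the example runs without the SDK installed. Real code would catch `CosmosHttpResponseError` directly. As far as I know, the SDK also retries throttled requests on its own, so a wrapper like this mainly matters once those built-in retries are exhausted.

```python
import time


def call_with_throttle_retry(operation, max_attempts=3, sleep=time.sleep):
    """Run operation(), retrying when it raises an error with status_code 429."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as e:
            # Re-raise anything that is not a throttle, or the final failed attempt.
            if getattr(e, "status_code", None) != 429 or attempt == max_attempts:
                raise
            headers = getattr(e, "headers", None) or {}
            # Assumed fallback of 1000 ms when the header is absent.
            delay_ms = float(headers.get("x-ms-retry-after-ms", 1000))
            sleep(delay_ms / 1000)
```

A call such as `call_with_throttle_retry(lambda: container.read_item(item="item-001", partition_key="electronics"))` would wrap a single point read.
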
## Best Practices

1. **Always specify partition key** for point reads and queries
2. **Use parameterized queries** to prevent injection and improve caching
3. **Avoid cross-partition queries** when possible
4. **Use `upsert_item`** for idempotent writes
5. **Use async client** for high-throughput scenarios
6. **Design partition key** for even data distribution
7. **Use `read_item`** instead of query for single document retrieval

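To make the point about even data distribution concrete, a candidate partition key can be checked against a sample of documents before the container is created. This is a rough sketch using only the standard library; `partition_share` and the sample data are hypothetical and do not call Cosmos DB.

```python
from collections import Counter


def partition_share(items, key):
    """Return each partition key value's share of the sampled items."""
    counts = Counter(item[key] for item in items)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


sample = [
    {"id": "1", "category": "electronics"},
    {"id": "2", "category": "electronics"},
    {"id": "3", "category": "electronics"},
    {"id": "4", "category": "books"},
]
print(partition_share(sample, "category"))  # {'electronics': 0.75, 'books': 0.25}
```

A key whose largest share dominates the sample, as `electronics` does here, suggests a hot partition; a higher-cardinality property may spread the load more evenly.
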
## Reference Files

| File | Contents |
|------|----------|
| references/partitioning.md | Partition key strategies, hierarchical keys, hot partition detection and mitigation |
| references/query-patterns.md | Query optimization, aggregations, pagination, transactions, change feed |
| scripts/setup_cosmos_container.py | CLI tool for creating containers with partitioning, throughput, and indexing |

## When to Use

Use this skill when you need to carry out the workflows or actions described in the overview.