* chore: upgrade maintenance scripts to robust PyYAML parsing - Replaces fragile regex frontmatter parsing with PyYAML/yaml library - Ensures multi-line descriptions and complex characters are handled safely - Normalizes quoting and field ordering across all maintenance scripts - Updates validator to strictly enforce description quality * fix: restore and refine truncated skill descriptions - Recovered 223+ truncated descriptions from git history (6.5.0 regression) - Refined long descriptions into concise, complete sentences (<200 chars) - Added missing descriptions for brainstorming and orchestration skills - Manually fixed imagen skill description - Resolved dangling links in competitor-alternatives skill * chore: sync generated registry files and document fixes - Regenerated skills index with normalized forward-slash paths - Updated README and CATALOG to reflect restored descriptions - Documented restoration and script improvements in CHANGELOG.md * fix: restore missing skill and align metadata for full 955 count - Renamed SKILL.MD to SKILL.md in andruia-skill-smith to ensure indexing - Fixed risk level and missing section in andruia-skill-smith - Synchronized all registry files for final 955 skill count * chore(scripts): add cross-platform runners and hermetic test orchestration * fix(scripts): harden utf-8 output and clone target writeability * fix(skills): add missing date metadata for strict validation * chore(index): sync generated metadata dates * fix(catalog): normalize skill paths to prevent CI drift * chore: sync generated registry files * fix: enforce LF line endings for generated registry files
223 lines
5.8 KiB
Markdown
223 lines
5.8 KiB
Markdown
---
|
|
name: azure-storage-blob-py
|
|
description: Azure Blob Storage SDK for Python. Use for uploading, downloading, listing blobs, managing containers, and blob lifecycle.
|
|
risk: unknown
|
|
source: community
|
|
date_added: '2026-02-27'
|
|
---
|
|
|
|
# Azure Blob Storage SDK for Python
|
|
|
|
Client library for Azure Blob Storage — object storage for unstructured data.
|
|
|
|
## Installation
|
|
|
|
```bash
|
|
pip install azure-storage-blob azure-identity
|
|
```
|
|
|
|
## Environment Variables
|
|
|
|
```bash
|
|
AZURE_STORAGE_ACCOUNT_NAME=<your-storage-account>
|
|
# Or use full URL
|
|
AZURE_STORAGE_ACCOUNT_URL=https://<account>.blob.core.windows.net
|
|
```
|
|
|
|
## Authentication
|
|
|
|
```python
|
|
from azure.identity import DefaultAzureCredential
|
|
from azure.storage.blob import BlobServiceClient
|
|
|
|
credential = DefaultAzureCredential()
|
|
account_url = "https://<account>.blob.core.windows.net"
|
|
|
|
blob_service_client = BlobServiceClient(account_url, credential=credential)
|
|
```
|
|
|
|
## Client Hierarchy
|
|
|
|
| Client | Purpose | Get From |
|
|
|--------|---------|----------|
|
|
| `BlobServiceClient` | Account-level operations | Direct instantiation |
|
|
| `ContainerClient` | Container operations | `blob_service_client.get_container_client()` |
|
|
| `BlobClient` | Single blob operations | `container_client.get_blob_client()` |
|
|
|
|
## Core Workflow
|
|
|
|
### Create Container
|
|
|
|
```python
|
|
container_client = blob_service_client.get_container_client("mycontainer")
|
|
container_client.create_container()
|
|
```
|
|
|
|
### Upload Blob
|
|
|
|
```python
|
|
# From file path
|
|
blob_client = blob_service_client.get_blob_client(
|
|
container="mycontainer",
|
|
blob="sample.txt"
|
|
)
|
|
|
|
with open("./local-file.txt", "rb") as data:
|
|
blob_client.upload_blob(data, overwrite=True)
|
|
|
|
# From bytes/string
|
|
blob_client.upload_blob(b"Hello, World!", overwrite=True)
|
|
|
|
# From stream
|
|
import io
|
|
stream = io.BytesIO(b"Stream content")
|
|
blob_client.upload_blob(stream, overwrite=True)
|
|
```
|
|
|
|
### Download Blob
|
|
|
|
```python
|
|
blob_client = blob_service_client.get_blob_client(
|
|
container="mycontainer",
|
|
blob="sample.txt"
|
|
)
|
|
|
|
# To file
|
|
with open("./downloaded.txt", "wb") as file:
|
|
download_stream = blob_client.download_blob()
|
|
file.write(download_stream.readall())
|
|
|
|
# To memory
|
|
download_stream = blob_client.download_blob()
|
|
content = download_stream.readall() # bytes
|
|
|
|
# Read into existing buffer
|
|
stream = io.BytesIO()
|
|
num_bytes = blob_client.download_blob().readinto(stream)
|
|
```
|
|
|
|
### List Blobs
|
|
|
|
```python
|
|
container_client = blob_service_client.get_container_client("mycontainer")
|
|
|
|
# List all blobs
|
|
for blob in container_client.list_blobs():
|
|
print(f"{blob.name} - {blob.size} bytes")
|
|
|
|
# List with prefix (folder-like)
|
|
for blob in container_client.list_blobs(name_starts_with="logs/"):
|
|
print(blob.name)
|
|
|
|
# Walk blob hierarchy (virtual directories)
|
|
for item in container_client.walk_blobs(delimiter="/"):
|
|
if item.get("prefix"):
|
|
print(f"Directory: {item['prefix']}")
|
|
else:
|
|
print(f"Blob: {item.name}")
|
|
```
|
|
|
|
### Delete Blob
|
|
|
|
```python
|
|
blob_client.delete_blob()
|
|
|
|
# Delete with snapshots
|
|
blob_client.delete_blob(delete_snapshots="include")
|
|
```
|
|
|
|
## Performance Tuning
|
|
|
|
```python
|
|
# Configure chunk sizes for large uploads/downloads
|
|
blob_client = BlobClient(
|
|
account_url=account_url,
|
|
container_name="mycontainer",
|
|
blob_name="large-file.zip",
|
|
credential=credential,
|
|
max_block_size=4 * 1024 * 1024, # 4 MiB blocks
|
|
max_single_put_size=64 * 1024 * 1024 # 64 MiB single upload limit
|
|
)
|
|
|
|
# Parallel upload
|
|
blob_client.upload_blob(data, max_concurrency=4)
|
|
|
|
# Parallel download
|
|
download_stream = blob_client.download_blob(max_concurrency=4)
|
|
```
|
|
|
|
## SAS Tokens
|
|
|
|
```python
|
|
from datetime import datetime, timedelta, timezone
|
|
from azure.storage.blob import generate_blob_sas, BlobSasPermissions
|
|
|
|
sas_token = generate_blob_sas(
|
|
account_name="<account>",
|
|
container_name="mycontainer",
|
|
blob_name="sample.txt",
|
|
account_key="<account-key>", # Or use user delegation key
|
|
permission=BlobSasPermissions(read=True),
|
|
expiry=datetime.now(timezone.utc) + timedelta(hours=1)
|
|
)
|
|
|
|
# Use SAS token
|
|
blob_url = f"https://<account>.blob.core.windows.net/mycontainer/sample.txt?{sas_token}"
|
|
```
|
|
|
|
## Blob Properties and Metadata
|
|
|
|
```python
|
|
# Get properties
|
|
properties = blob_client.get_blob_properties()
|
|
print(f"Size: {properties.size}")
|
|
print(f"Content-Type: {properties.content_settings.content_type}")
|
|
print(f"Last modified: {properties.last_modified}")
|
|
|
|
# Set metadata
|
|
blob_client.set_blob_metadata(metadata={"category": "logs", "year": "2024"})
|
|
|
|
# Set content type
|
|
from azure.storage.blob import ContentSettings
|
|
blob_client.set_http_headers(
|
|
content_settings=ContentSettings(content_type="application/json")
|
|
)
|
|
```
|
|
|
|
## Async Client
|
|
|
|
```python
|
|
from azure.identity.aio import DefaultAzureCredential
|
|
from azure.storage.blob.aio import BlobServiceClient
|
|
|
|
async def upload_async():
|
|
credential = DefaultAzureCredential()
|
|
|
|
async with BlobServiceClient(account_url, credential=credential) as client:
|
|
blob_client = client.get_blob_client("mycontainer", "sample.txt")
|
|
|
|
with open("./file.txt", "rb") as data:
|
|
await blob_client.upload_blob(data, overwrite=True)
|
|
|
|
# Download async
|
|
async def download_async():
|
|
async with BlobServiceClient(account_url, credential=credential) as client:
|
|
blob_client = client.get_blob_client("mycontainer", "sample.txt")
|
|
|
|
stream = await blob_client.download_blob()
|
|
data = await stream.readall()
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
1. **Use DefaultAzureCredential** instead of connection strings
|
|
2. **Use context managers** for async clients
|
|
3. **Set `overwrite=True`** explicitly when re-uploading
|
|
4. **Use `max_concurrency`** for large file transfers
|
|
5. **Prefer `readinto()`** over `readall()` for memory efficiency
|
|
6. **Use `walk_blobs()`** for hierarchical listing
|
|
7. **Set appropriate content types** for web-served blobs
|
|
|
|
## When to Use
|
|
This skill is applicable to execute the workflow or actions described in the overview.
|