## 🚀 Impact

Significantly expands the capabilities of **Antigravity Awesome Skills** by integrating official skill collections from **Microsoft** and **Google Gemini**. This update increases the total skill count to **845+**, making the library even more comprehensive for AI coding assistants.

## ✨ Key Changes

### 1. New Official Skills

- **Microsoft Skills**: Added a large collection of official skills from [microsoft/skills](https://github.com/microsoft/skills).
  - Includes Azure, .NET, Python, TypeScript, and Semantic Kernel skills.
  - Preserves the original directory structure under `skills/official/microsoft/`.
  - Includes plugin skills from the `.github/plugins` directory.
- **Gemini Skills**: Added official Gemini API development skills under `skills/gemini-api-dev/`.

### 2. New Scripts & Tooling

- **`scripts/sync_microsoft_skills.py`**: A robust synchronization script that:
  - Clones the official Microsoft repository.
  - Preserves the original directory hierarchy.
  - Handles symlinks and plugin locations.
  - Generates attribution metadata.
- **`scripts/tests/inspect_microsoft_repo.py`**: Debug tool to inspect the remote repository structure.
- **`scripts/tests/test_comprehensive_coverage.py`**: Verification script to ensure 100% of skills are captured during sync.

### 3. Core Improvements

- **`scripts/generate_index.py`**: Enhanced frontmatter parsing to safely handle unquoted values containing `@` symbols and commas (fixing issues with some Microsoft skill descriptions).
- **`package.json`**: Added `sync:microsoft` and `sync:all-official` scripts for easy maintenance.

### 4. Documentation

- Updated `README.md` to reflect the new skill count (845+) and added Microsoft/Gemini to the provider list.
- Updated `CATALOG.md` and `skills_index.json` with the new skills.

## 🧪 Verification

- Ran `scripts/tests/test_comprehensive_coverage.py` to verify all Microsoft skills are detected.
- Validated the `generate_index.py` fixes by successfully indexing the new skills.
---
name: azure-ai-vision-imageanalysis-py
description: |
  Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks.
  Triggers: "image analysis", "computer vision", "OCR", "object detection", "ImageAnalysisClient", "image caption".
package: azure-ai-vision-imageanalysis
---
# Azure AI Vision Image Analysis SDK for Python

Client library for Azure AI Vision 4.0 image analysis including captions, tags, objects, OCR, and more.
## Installation

```bash
pip install azure-ai-vision-imageanalysis
```
## Environment Variables

```bash
VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
VISION_KEY=<your-api-key>  # If using API key
```
## Authentication

### API Key

```python
import os

from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["VISION_ENDPOINT"]
key = os.environ["VISION_KEY"]

client = ImageAnalysisClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key)
)
```
### Entra ID (Recommended)

```python
import os

from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.identity import DefaultAzureCredential

client = ImageAnalysisClient(
    endpoint=os.environ["VISION_ENDPOINT"],
    credential=DefaultAzureCredential()
)
```
## Analyze Image from URL

```python
from azure.ai.vision.imageanalysis.models import VisualFeatures

image_url = "https://example.com/image.jpg"

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[
        VisualFeatures.CAPTION,
        VisualFeatures.TAGS,
        VisualFeatures.OBJECTS,
        VisualFeatures.READ,
        VisualFeatures.PEOPLE,
        VisualFeatures.SMART_CROPS,
        VisualFeatures.DENSE_CAPTIONS
    ],
    gender_neutral_caption=True,
    language="en"
)
```
## Analyze Image from File

```python
with open("image.jpg", "rb") as f:
    image_data = f.read()

result = client.analyze(
    image_data=image_data,
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)
```
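When a skill handles both remote and local images, the two entry points can be wrapped in one dispatcher. This is a hypothetical helper, not part of the SDK; it assumes a client exposing `analyze_from_url` and `analyze` with the keyword arguments shown above.

```python
# Hypothetical helper (not in the SDK): route an image source to the
# matching client method based on whether it looks like a URL.
def analyze_any(client, source, visual_features):
    if source.startswith(("http://", "https://")):
        # Remote image: the service fetches it from the URL
        return client.analyze_from_url(image_url=source, visual_features=visual_features)
    # Local file: read the bytes and upload them
    with open(source, "rb") as f:
        return client.analyze(image_data=f.read(), visual_features=visual_features)
```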
## Image Caption

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.CAPTION],
    gender_neutral_caption=True
)

if result.caption:
    print(f"Caption: {result.caption.text}")
    print(f"Confidence: {result.caption.confidence:.2f}")
```
## Dense Captions (Multiple Regions)

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.DENSE_CAPTIONS]
)

if result.dense_captions:
    for caption in result.dense_captions.list:
        print(f"Caption: {caption.text}")
        print(f"  Confidence: {caption.confidence:.2f}")
        print(f"  Bounding box: {caption.bounding_box}")
```
## Tags

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.TAGS]
)

if result.tags:
    for tag in result.tags.list:
        print(f"Tag: {tag.name} (confidence: {tag.confidence:.2f})")
```
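Tag results often need thresholding before use. A minimal sketch, assuming each tag exposes `.name` and `.confidence` as in the result shape above:

```python
# Sketch: keep only tag names at or above a confidence threshold.
def confident_tags(tags, threshold=0.8):
    return [t.name for t in tags if t.confidence >= threshold]
```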
## Object Detection

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.OBJECTS]
)

if result.objects:
    for obj in result.objects.list:
        print(f"Object: {obj.tags[0].name}")
        print(f"  Confidence: {obj.tags[0].confidence:.2f}")
        box = obj.bounding_box
        print(f"  Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```
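A common follow-up is tallying detections per label. A sketch, assuming each detected object carries a `.tags` list whose first entry has a `.name`, as in the loop above:

```python
from collections import Counter

# Sketch: count detected objects by their top tag name.
def object_counts(objects):
    return Counter(obj.tags[0].name for obj in objects)
```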
## OCR (Text Extraction)

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.READ]
)

if result.read:
    for block in result.read.blocks:
        for line in block.lines:
            print(f"Line: {line.text}")
            print(f"  Bounding polygon: {line.bounding_polygon}")

            # Word-level details
            for word in line.words:
                print(f"    Word: {word.text} (confidence: {word.confidence:.2f})")
```
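If you only need the recognized text, the block/line nesting above can be flattened into one string. A sketch, assuming the `blocks -> lines -> text` shape shown in the READ result:

```python
# Sketch: join every recognized line of text into a single string.
def extract_text(blocks):
    return "\n".join(line.text for block in blocks for line in block.lines)
```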
## People Detection

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.PEOPLE]
)

if result.people:
    for person in result.people.list:
        print("Person detected:")
        print(f"  Confidence: {person.confidence:.2f}")
        box = person.bounding_box
        print(f"  Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```
## Smart Cropping

```python
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.SMART_CROPS],
    smart_crops_aspect_ratios=[0.9, 1.33, 1.78]  # Portrait, 4:3, 16:9
)

if result.smart_crops:
    for crop in result.smart_crops.list:
        print(f"Aspect ratio: {crop.aspect_ratio}")
        box = crop.bounding_box
        print(f"  Crop region: x={box.x}, y={box.y}, w={box.width}, h={box.height}")
```
## Async Client

```python
import asyncio
import os

from azure.ai.vision.imageanalysis.aio import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.identity.aio import DefaultAzureCredential

async def analyze_image():
    async with ImageAnalysisClient(
        endpoint=os.environ["VISION_ENDPOINT"],
        credential=DefaultAzureCredential()
    ) as client:
        result = await client.analyze_from_url(
            image_url="https://example.com/image.jpg",
            visual_features=[VisualFeatures.CAPTION]
        )
        print(result.caption.text)

asyncio.run(analyze_image())
```
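The async client pays off when many images are analyzed at once. A sketch of fan-out with a concurrency cap; `analyze_url` here is a placeholder for an awaitable that wraps the client call, not an SDK function:

```python
import asyncio

# Sketch: analyze several URLs concurrently, at most `limit` at a time.
async def analyze_many(analyze_url, urls, limit=5):
    sem = asyncio.Semaphore(limit)  # cap in-flight requests

    async def one(url):
        async with sem:
            return await analyze_url(url)

    # Results come back in the same order as `urls`
    return await asyncio.gather(*(one(u) for u in urls))
```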
## Visual Features

| Feature | Description |
|---------|-------------|
| `CAPTION` | Single sentence describing the image |
| `DENSE_CAPTIONS` | Captions for multiple regions |
| `TAGS` | Content tags (objects, scenes, actions) |
| `OBJECTS` | Object detection with bounding boxes |
| `READ` | OCR text extraction |
| `PEOPLE` | People detection with bounding boxes |
| `SMART_CROPS` | Suggested crop regions for thumbnails |
## Error Handling

```python
from azure.core.exceptions import HttpResponseError

try:
    result = client.analyze_from_url(
        image_url=image_url,
        visual_features=[VisualFeatures.CAPTION]
    )
except HttpResponseError as e:
    print(f"Status code: {e.status_code}")
    print(f"Reason: {e.reason}")
    if e.error:  # detailed error info may be absent
        print(f"Message: {e.error.message}")
```
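Transient failures (throttling, temporary service errors) are often worth retrying. A generic sketch, not an SDK feature: `call` is any zero-argument function, and the transient check reads the `status_code` attribute that `HttpResponseError` carries.

```python
import time

# Sketch: retry a call with exponential backoff on transient status codes.
def with_retries(call, retries=3, base_delay=1.0, transient=(429, 500, 503)):
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception as e:
            status = getattr(e, "status_code", None)
            # Give up on the last attempt or on non-transient errors
            if attempt == retries or status not in transient:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Azure SDK clients also have built-in retry policies, so prefer configuring those before layering your own.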
## Image Requirements

- Formats: JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO
- Max size: 20 MB
- Dimensions: 50x50 to 16000x16000 pixels
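These limits can be checked client-side before uploading. A sketch of a pre-flight validator using the documented bounds (the function name is illustrative):

```python
MAX_BYTES = 20 * 1024 * 1024  # 20 MB documented limit

# Sketch: reject images outside the documented size and dimension bounds.
def is_supported(size_bytes, width, height):
    return (
        size_bytes <= MAX_BYTES
        and 50 <= width <= 16000
        and 50 <= height <= 16000
    )
```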
## Best Practices

1. **Select only needed features** to optimize latency and cost
2. **Use the async client** for high-throughput scenarios
3. **Handle `HttpResponseError`** for invalid images or auth issues
4. **Enable `gender_neutral_caption`** for inclusive descriptions
5. **Specify `language`** for localized captions
6. **Use `smart_crops_aspect_ratios`** matching your thumbnail requirements
7. **Cache results** when analyzing the same image multiple times
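The caching practice can be as simple as memoizing per image URL. A sketch with a hypothetical wrapper (`analyze` stands in for whatever function issues the service call):

```python
# Sketch: wrap an analyze function so repeated URLs skip the service call.
def cached_analyzer(analyze):
    cache = {}

    def analyze_cached(url):
        if url not in cache:
            cache[url] = analyze(url)
        return cache[url]

    return analyze_cached
```

For URL-keyed calls like this, `functools.lru_cache` is an equally good fit.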