feat: Add Official Microsoft & Gemini Skills (845+ Total)
## 🚀 Impact

Significantly expands the capabilities of **Antigravity Awesome Skills** by integrating official skill collections from **Microsoft** and **Google Gemini**. This update increases the total skill count to **845+**, broadening the library's coverage for AI coding assistants.

## ✨ Key Changes

### 1. New Official Skills

- **Microsoft Skills**: Added a large collection of official skills from [microsoft/skills](https://github.com/microsoft/skills).
  - Includes Azure, .NET, Python, TypeScript, and Semantic Kernel skills.
  - Preserves the original directory structure under `skills/official/microsoft/`.
  - Includes plugin skills from the `.github/plugins` directory.
- **Gemini Skills**: Added official Gemini API development skills under `skills/gemini-api-dev/`.

### 2. New Scripts & Tooling

- **`scripts/sync_microsoft_skills.py`**: A synchronization script that:
  - Clones the official Microsoft repository.
  - Preserves the original directory hierarchy.
  - Handles symlinks and plugin locations.
  - Generates attribution metadata.
- **`scripts/tests/inspect_microsoft_repo.py`**: Debug tool to inspect the remote repository structure.
- **`scripts/tests/test_comprehensive_coverage.py`**: Verification script to ensure 100% of skills are captured during sync.

### 3. Core Improvements

- **`scripts/generate_index.py`**: Enhanced frontmatter parsing to safely handle unquoted values containing `@` symbols and commas (fixing issues with some Microsoft skill descriptions).
- **`package.json`**: Added `sync:microsoft` and `sync:all-official` scripts for easy maintenance.

### 4. Documentation

- Updated `README.md` to reflect the new skill count (845+) and added Microsoft/Gemini to the provider list.
- Updated `CATALOG.md` and `skills_index.json` with the new skills.

## 🧪 Verification

- Ran `scripts/tests/test_comprehensive_coverage.py` to verify all Microsoft skills are detected.
- Validated the `generate_index.py` fixes by successfully indexing the new skills.
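To illustrate the frontmatter fix, here is a minimal sketch of a tolerant, line-based reader that keeps unquoted values containing `@` or commas as plain strings instead of handing them to a strict YAML parser. This is not the code in `scripts/generate_index.py`; the function name `parse_frontmatter` and the sample skill are hypothetical, and the reader only understands flat `key: value` pairs plus indented continuation lines.

```python
def parse_frontmatter(text: str) -> dict[str, str]:
    """Read flat `key: value` frontmatter without a YAML parser (illustrative only)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}

    meta: dict[str, str] = {}
    current_key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: frontmatter ends here
        if line[:1] in (" ", "\t") and current_key is not None:
            # Indented line: continuation of the previous key (e.g. a `|` block).
            meta[current_key] = (meta[current_key] + " " + line.strip()).strip()
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a key
        current_key = key.strip()
        value = value.strip()
        if value in ("|", ">"):
            value = ""  # block scalar marker: the text follows on indented lines
        elif len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]  # strip one pair of matching quotes
        meta[current_key] = value
    return meta


# Hypothetical frontmatter with an unquoted `@` and commas in the description.
sample = """---
name: example-skill
description: Ask @workspace about jobs, models, and data
---
# Body
"""
meta = parse_frontmatter(sample)
```

Splitting on the first colon only means later colons, commas, and `@` symbols stay in the value untouched, at the cost of not supporting nested YAML structures.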
skills/official/microsoft/python/foundry/ml/SKILL.md
@@ -0,0 +1,271 @@
---
name: azure-ai-ml-py
description: |
  Azure Machine Learning SDK v2 for Python. Use for ML workspaces, jobs, models, datasets, compute, and pipelines.
  Triggers: "azure-ai-ml", "MLClient", "workspace", "model registry", "training jobs", "datasets".
package: azure-ai-ml
---

# Azure Machine Learning SDK v2 for Python

Client library for managing Azure ML resources: workspaces, jobs, models, data, and compute.

## Installation

The authentication examples in this file also import `azure.identity`, so install both packages:

```bash
pip install azure-ai-ml azure-identity
```

## Environment Variables

```bash
AZURE_SUBSCRIPTION_ID=<your-subscription-id>
AZURE_RESOURCE_GROUP=<your-resource-group>
AZURE_ML_WORKSPACE_NAME=<your-workspace-name>
```
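
A missing variable otherwise surfaces as a bare `KeyError` when the client is built. As a minimal sketch, the helper below checks all three variables up front and reports every missing one at once. The function name `load_workspace_settings` and the `REQUIRED_VARS` tuple are hypothetical and not part of the SDK.

```python
import os
from typing import Mapping

# The three variables listed above.
REQUIRED_VARS = (
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_ML_WORKSPACE_NAME",
)


def load_workspace_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the workspace settings, or raise listing every missing variable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Keys mirror the keyword arguments used when constructing MLClient below.
    return {
        "subscription_id": env["AZURE_SUBSCRIPTION_ID"],
        "resource_group_name": env["AZURE_RESOURCE_GROUP"],
        "workspace_name": env["AZURE_ML_WORKSPACE_NAME"],
    }
```

Because the returned keys match the `MLClient` keyword arguments, the result can be unpacked directly, for example `MLClient(credential=DefaultAzureCredential(), **load_workspace_settings())`.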

## Authentication

```python
import os

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Workspace coordinates come from the environment variables listed above
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    workspace_name=os.environ["AZURE_ML_WORKSPACE_NAME"]
)
```

### From Config File

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Uses config.json in current directory or parent
ml_client = MLClient.from_config(
    credential=DefaultAzureCredential()
)
```
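
For reference, the `config.json` read by `from_config` is a small JSON file holding the workspace coordinates. The sketch below writes one, assuming the usual `subscription_id`, `resource_group`, and `workspace_name` keys found in the file downloadable from the Azure portal. The helper names and the placeholder values are hypothetical.

```python
import json
from pathlib import Path


def build_workspace_config(subscription_id: str, resource_group: str, workspace_name: str) -> dict[str, str]:
    """Build the dictionary that is serialized to config.json."""
    return {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }


def write_workspace_config(directory: str, config: dict[str, str]) -> Path:
    """Write config.json into `directory` and return its path."""
    path = Path(directory) / "config.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Placeholder values; substitute your own identifiers.
example_config = build_workspace_config(
    "<your-subscription-id>", "<your-resource-group>", "<your-workspace-name>"
)
```

Calling `write_workspace_config(".", example_config)` places the file where `MLClient.from_config` looks first.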

## Workspace Management

### Create Workspace

```python
from azure.ai.ml.entities import Workspace

ws = Workspace(
    name="my-workspace",
    location="eastus",
    display_name="My Workspace",
    description="ML workspace for experiments",
    tags={"purpose": "demo"}
)

ml_client.workspaces.begin_create(ws).result()
```

### List Workspaces

```python
for ws in ml_client.workspaces.list():
    print(f"{ws.name}: {ws.location}")
```

## Data Assets

### Register Data

```python
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# Register a file
my_data = Data(
    name="my-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/train.csv",
    type=AssetTypes.URI_FILE,
    description="Training data"
)

ml_client.data.create_or_update(my_data)
```

### Register Folder

```python
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# Register a folder
my_data = Data(
    name="my-folder-dataset",
    version="1",
    path="azureml://datastores/workspaceblobstore/paths/data/",
    type=AssetTypes.URI_FOLDER
)

ml_client.data.create_or_update(my_data)
```

## Model Registry

### Register Model

```python
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

model = Model(
    name="my-model",
    version="1",
    path="./model/",
    type=AssetTypes.CUSTOM_MODEL,
    description="My trained model"
)

ml_client.models.create_or_update(model)
```

### List Models

```python
for model in ml_client.models.list(name="my-model"):
    print(f"{model.name} v{model.version}")
```

## Compute

### Create Compute Cluster

```python
from azure.ai.ml.entities import AmlCompute

cluster = AmlCompute(
    name="cpu-cluster",
    type="amlcompute",
    size="Standard_DS3_v2",
    min_instances=0,
    max_instances=4,
    idle_time_before_scale_down=120
)

ml_client.compute.begin_create_or_update(cluster).result()
```

### List Compute

```python
for compute in ml_client.compute.list():
    print(f"{compute.name}: {compute.type}")
```

## Jobs

### Command Job

```python
from azure.ai.ml import command, Input

job = command(
    code="./src",
    command="python train.py --data ${{inputs.data}} --lr ${{inputs.learning_rate}}",
    inputs={
        # Folder asset registered under "Register Folder" above
        "data": Input(type="uri_folder", path="azureml:my-folder-dataset:1"),
        "learning_rate": 0.01
    },
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
    display_name="training-job"
)

returned_job = ml_client.jobs.create_or_update(job)
print(f"Job URL: {returned_job.studio_url}")
```

### Monitor Job

```python
# Blocks and prints the job's logs until it finishes
ml_client.jobs.stream(returned_job.name)
```
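
When streaming logs is not wanted (for example in a script that only needs the final outcome), polling the job status is an alternative. This is a minimal sketch: `wait_for_job` is a hypothetical helper, and the set of terminal status strings is an assumption that should be checked against the statuses your SDK version reports.

```python
import time
from typing import Callable

# Assumption: these are the status strings that mean the job has stopped.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `get_status` until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job still in status {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

With the client from the examples above, this could be called as `wait_for_job(lambda: ml_client.jobs.get(returned_job.name).status)`. Passing the status lookup as a callable keeps the helper independent of the SDK and easy to exercise with a fake.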

## Pipelines

```python
from azure.ai.ml import dsl, Input, load_component

# Assumption: the two components are defined in YAML files at these
# hypothetical paths; adjust them to wherever your component specs live.
prep_component = load_component(source="./components/prep.yml")
train_component = load_component(source="./components/train.yml")

@dsl.pipeline(
    compute="cpu-cluster",
    description="Training pipeline"
)
def training_pipeline(data_input):
    prep_step = prep_component(data=data_input)
    train_step = train_component(
        data=prep_step.outputs.output_data,
        learning_rate=0.01
    )
    return {"model": train_step.outputs.model}

pipeline = training_pipeline(
    # Folder asset registered under "Register Folder" above
    data_input=Input(type="uri_folder", path="azureml:my-folder-dataset:1")
)

pipeline_job = ml_client.jobs.create_or_update(pipeline)
```

## Environments

### Create Custom Environment

```python
from azure.ai.ml.entities import Environment

env = Environment(
    name="my-env",
    version="1",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="./environment.yml"
)

ml_client.environments.create_or_update(env)
```

## Datastores

### List Datastores

```python
for ds in ml_client.datastores.list():
    print(f"{ds.name}: {ds.type}")
```

### Get Default Datastore

```python
default_ds = ml_client.datastores.get_default()
print(f"Default: {default_ds.name}")
```

## MLClient Operations

| Property | Operations |
|----------|------------|
| `workspaces` | begin_create, get, list, begin_delete |
| `jobs` | create_or_update, get, list, stream, begin_cancel |
| `models` | create_or_update, get, list, archive |
| `data` | create_or_update, get, list |
| `compute` | begin_create_or_update, get, list, begin_delete |
| `environments` | create_or_update, get, list |
| `datastores` | create_or_update, get, list, get_default |
| `components` | create_or_update, get, list |

## Best Practices

1. **Use versioning** for data, models, and environments
2. **Configure idle scale-down** to reduce compute costs
3. **Use environments** for reproducible training
4. **Stream job logs** to monitor progress
5. **Register models** after successful training jobs
6. **Use pipelines** for multi-step workflows
7. **Tag resources** for organization and cost tracking
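
Practices 5 and 7 can be combined by registering a model straight from a finished job's outputs and tagging it at the same time. The sketch below is illustrative: both helper names are hypothetical, the `model/` output folder is an assumption about what the training script writes, and the `azureml://jobs/...` path follows the job-output URI format as I understand it from the Azure ML documentation, so verify it against your SDK version.

```python
def job_output_model_uri(job_name: str, relative_path: str = "model/") -> str:
    """Build the URI of a path inside a job's default output artifacts."""
    return f"azureml://jobs/{job_name}/outputs/artifacts/paths/{relative_path.lstrip('/')}"


def register_model_from_job(ml_client, job_name: str, model_name: str, tags=None):
    """Register a job's output folder as a model once the job has completed."""
    # Imported lazily so the URI helper above works without the SDK installed.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Model

    job = ml_client.jobs.get(job_name)
    if job.status != "Completed":
        raise RuntimeError(f"Job {job_name} has status {job.status!r}; not registering")

    model = Model(
        name=model_name,
        path=job_output_model_uri(job_name),
        type=AssetTypes.CUSTOM_MODEL,
        tags=tags or {},  # e.g. {"project": "demo"} for cost tracking
    )
    return ml_client.models.create_or_update(model)
```

A call such as `register_model_from_job(ml_client, returned_job.name, "my-model", tags={"purpose": "demo"})` would follow the command job shown earlier.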