Building an Inference API with YAML
Objective
Deploy a scikit-learn model as a Managed Online Endpoint in Azure Machine Learning (AML) using YAML + Azure ML CLI v2.
You will:
Create or reuse an Azure ML workspace
Train a tiny model locally (diabetes regression), producing model.pkl
Register the model in Azure ML
Create a managed online endpoint
Create a deployment from YAML and send all traffic to it
Invoke the endpoint and validate results
Clean up
This lab is intentionally written step-by-step and assumes you are new to AML endpoints.
Prerequisites
Azure CLI installed
Azure ML CLI extension installed:
az extension add -n ml -y
az extension update -n ml
kubectl is not required
Python 3.9+ locally
RBAC: Contributor on the resource group, plus permissions for AML operations
References:
Online endpoint YAML schema: https://learn.microsoft.com/azure/machine-learning/reference-yaml-endpoint-online?view=azureml-api-2
Managed online deployment YAML schema: https://learn.microsoft.com/azure/machine-learning/reference-yaml-deployment-managed-online?view=azureml-api-2
Azure ML inference server guidance: https://learn.microsoft.com/azure/machine-learning/how-to-inference-server-http?view=azureml-api-2
Folder structure
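The layout listing itself is not reproduced here. As a sketch only, the folder below is one arrangement consistent with the file names this lab mentions; the folder name `aml-yaml-lab`, the `src` folder, and the YAML file names are assumptions, and only `./model/model.pkl` comes from the lab text.

```shell
# Hypothetical lab folder; only ./model/model.pkl is named in the lab text.
LAB_DIR="${LAB_DIR:-aml-yaml-lab}"
mkdir -p "$LAB_DIR/model" "$LAB_DIR/src"

# Files assumed to live here by the end of the lab:
#   endpoint.yml         endpoint-level settings (Step 3)
#   deployment.yml       deployment settings (Step 4)
#   environment.yml      conda environment used for scoring
#   requirements.txt     local training dependencies
#   train.py             local training script (Step 1)
#   src/score.py         scoring script
#   model/model.pkl      trained model
#   sample-request.json  test payload (Step 5)
ls -R "$LAB_DIR"
```

Run the remaining steps from inside that folder so relative paths resolve.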
Step 0: Set defaults (subscription, RG, workspace)
Set defaults so you can omit --resource-group and --workspace-name later:
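A minimal sketch of setting those defaults, wrapped in a helper so the three values are passed in once. The helper name `set_aml_defaults` is hypothetical; the subscription, resource group, and workspace values are yours to supply.

```shell
# Hypothetical helper: selects the subscription and stores CLI defaults.
# $1 = subscription ID or name, $2 = resource group, $3 = workspace name
set_aml_defaults() {
  az account set --subscription "$1"
  az configure --defaults group="$2" workspace="$3"
}
```

Call it as `set_aml_defaults my-subscription my-rg my-workspace`, replacing the placeholder values. If the workspace does not exist yet, `az ml workspace create --name my-workspace --resource-group my-rg` creates one.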
Quick sanity check:
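One way to run the check, assuming the defaults above are already configured; the helper name `check_workspace` is illustrative.

```shell
# Hypothetical helper: prints the stored CLI defaults, then resolves the
# default workspace. A failure here usually means a wrong name or no access.
check_workspace() {
  az configure --list-defaults -o table
  az ml workspace show --query name -o tsv
}
```

Calling `check_workspace` should print your defaults followed by the workspace name.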
Step 1: Train a tiny sample model locally
Install deps:
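A sketch of a small requirements file for local training. The package list and version pins are assumptions; choose versions that have builds for your Python and keep them in step with the scoring environment.

```shell
# Assumed dependencies for the local training script; pins are illustrative.
cat > requirements.txt <<'EOF'
scikit-learn==1.3.2
joblib==1.3.2
numpy==1.26.4
EOF
```

Install them with `python -m pip install -r requirements.txt`, ideally inside a virtual environment.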
Train:
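The training script is not shown in this lab, so the following is an illustrative stand-in rather than the original. It assumes a Ridge regressor on scikit-learn's built-in diabetes dataset, saved with joblib; the file name `train.py` is also an assumption.

```shell
# Write a hypothetical train.py; the model choice (Ridge) is an assumption.
cat > train.py <<'EOF'
"""Train a small diabetes regression model and save it to ./model/model.pkl."""
from pathlib import Path

import joblib
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# The dataset has 10 numeric features per row.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = Ridge(alpha=1.0)
model.fit(X_train, y_train)
print(f"R^2 on held-out data: {r2_score(y_test, model.predict(X_test)):.3f}")

out = Path("model") / "model.pkl"
out.parent.mkdir(parents=True, exist_ok=True)
joblib.dump(model, out)
print(f"Saved {out}")
EOF
```

Run it with `python train.py` from the lab folder.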
Expected:
./model/model.pkl is created
Step 2: Register the model in Azure ML
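A sketch of the registration command, assuming the model is registered from the single file `./model/model.pkl` as a `custom_model`. The helper name and the model name `diabetes-model` are hypothetical.

```shell
# Hypothetical helper around `az ml model create`.
# $1 = model name, $2 = version, $3 = local path to the model file
register_model() {
  az ml model create --name "$1" --version "$2" --path "$3" --type custom_model
}
```

For example: `register_model diabetes-model 1 ./model/model.pkl`.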
Confirm:
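One way to confirm, assuming the same hypothetical model name, is to list the registered versions:

```shell
# Hypothetical helper: lists every registered version of a model.
# $1 = model name
list_model_versions() {
  az ml model list --name "$1" -o table
}
```

`list_model_versions diabetes-model` should show the version you just registered.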
Step 3: Create the online endpoint
The endpoint YAML contains only endpoint-level settings (name, auth, identity). Deployments are created separately.
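A minimal endpoint definition might look like the sketch below. The endpoint name `diabetes-endpoint` is a placeholder (endpoint names must be unique within an Azure region), and `auth_mode: key` is assumed because the hardening section treats key auth as the starting point.

```shell
# Write a hypothetical endpoint.yml with endpoint-level settings only.
cat > endpoint.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: diabetes-endpoint
auth_mode: key
EOF
```

Create the endpoint with `az ml online-endpoint create --file endpoint.yml`.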
Wait until provisioning completes:
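The create command normally blocks until provisioning finishes unless `--no-wait` is passed. If you need to poll explicitly, a loop like this sketch works; the helper name, poll count, and delay are assumptions.

```shell
# Hypothetical helper: polls the endpoint's provisioning_state.
# $1 = endpoint name, $2 = max polls (default 60), $3 = seconds between polls (default 10)
wait_for_endpoint() {
  name="$1"; max="${2:-60}"; delay="${3:-10}"
  i=0
  while [ "$i" -lt "$max" ]; do
    state="$(az ml online-endpoint show --name "$name" --query provisioning_state -o tsv)"
    echo "provisioning_state: $state"
    case "$state" in
      Succeeded) return 0 ;;
      Failed|Canceled) return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for $name" >&2
  return 1
}
```

For example: `wait_for_endpoint diabetes-endpoint`.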
Step 4: Create the deployment and route traffic
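The deployment files are not shown in this lab, so the sketch below is one plausible set, not the original. Assumptions: the deployment is named `blue`, the endpoint and model are the hypothetical `diabetes-endpoint` and `diabetes-model:1`, the model was registered from the single file `model.pkl`, requests look like `{"data": [[...]]}`, and `Standard_DS3_v2` is available in your quota.

```shell
mkdir -p src

# Scoring script: init() runs once at startup, run() once per request.
cat > src/score.py <<'EOF'
import json
import os

import joblib
import numpy as np

model = None


def init():
    """Load the model once when the container starts."""
    global model
    # Assumes the model was registered from the single file model.pkl.
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    model = joblib.load(model_path)


def run(raw_data):
    """Score a request shaped like {"data": [[f1, ..., f10], ...]}."""
    data = np.array(json.loads(raw_data)["data"])
    return model.predict(data).tolist()
EOF

# Conda environment for scoring; version pins are illustrative.
cat > environment.yml <<'EOF'
name: diabetes-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - azureml-inference-server-http
      - scikit-learn==1.3.2
      - joblib==1.3.2
      - numpy==1.26.4
EOF

# Deployment definition; every name below is a placeholder.
cat > deployment.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: diabetes-endpoint
model: azureml:diabetes-model:1
code_configuration:
  code: ./src
  scoring_script: score.py
environment:
  conda_file: ./environment.yml
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
instance_type: Standard_DS3_v2
instance_count: 1
EOF
```

Create the deployment and route all traffic to it with `az ml online-deployment create --file deployment.yml --all-traffic`. Keep the scikit-learn version here identical to the one used for training, since pickled models are not guaranteed to load across versions.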
Check status:
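A sketch of the status check; the helper name is hypothetical and the names you pass in are your own.

```shell
# Hypothetical helper: prints the deployment's provisioning_state.
# $1 = endpoint name, $2 = deployment name
deployment_state() {
  az ml online-deployment show --name "$2" --endpoint-name "$1" \
    --query provisioning_state -o tsv
}
```

For example: `deployment_state diabetes-endpoint blue`. To see the traffic split, `az ml online-endpoint show --name diabetes-endpoint --query traffic` prints the per-deployment percentages.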
If it fails, get logs:
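Logs can be pulled with `get-logs`, as in this sketch; the helper name and the default of 100 lines are assumptions.

```shell
# Hypothetical helper: tails the inference server logs of a deployment.
# $1 = endpoint name, $2 = deployment name, $3 = number of lines (default 100)
deployment_logs() {
  az ml online-deployment get-logs --name "$2" --endpoint-name "$1" \
    --lines "${3:-100}"
}
```

For example: `deployment_logs diabetes-endpoint blue 200`.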
Step 5: Invoke the endpoint
Get the scoring URI:
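The URI is a property of the endpoint, so a query like this sketch returns it; the helper name is hypothetical.

```shell
# Hypothetical helper: prints the endpoint's scoring URI.
# $1 = endpoint name
scoring_uri() {
  az ml online-endpoint show --name "$1" --query scoring_uri -o tsv
}
```

For example: `SCORING_URI="$(scoring_uri diabetes-endpoint)"`.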
Invoke via CLI (recommended for first test):
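A sketch of a request file plus the invoke call. The payload shape `{"data": [[...]]}` is an assumption that must match whatever your scoring script parses; the feature values are illustrative numbers in the scaled range of the diabetes dataset (10 features per row), not real records.

```shell
# Illustrative payload: two rows, ten features each.
cat > sample-request.json <<'EOF'
{
  "data": [
    [0.038, 0.051, 0.062, 0.022, -0.044, -0.035, -0.043, -0.003, 0.020, -0.018],
    [-0.002, -0.045, -0.051, -0.026, -0.008, -0.019, 0.074, -0.039, -0.068, -0.092]
  ]
}
EOF

# Hypothetical helper around `az ml online-endpoint invoke`.
# $1 = endpoint name, $2 = request file (default sample-request.json)
invoke_endpoint() {
  az ml online-endpoint invoke --name "$1" --request-file "${2:-sample-request.json}"
}
```

For example: `invoke_endpoint diabetes-endpoint`.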
Expected output:
JSON list of numeric predictions
Step 6: Optional hardening (quick pointers)
Switch to Entra-based auth (aad_token) for enterprise use cases (instead of key).
Add Private Link for private endpoints and lock down public access for production (not covered in this lab).
Use managed identity for downstream access (Storage, Key Vault).
Cleanup
Delete deployment first (optional):
Delete endpoint:
Optionally delete the resource group:
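The three cleanup steps above can be sketched as the helpers below; the helper names are hypothetical. Deleting the endpoint also removes its deployments, which is why the first step is optional.

```shell
# $1 = endpoint name, $2 = deployment name
delete_deployment() {
  az ml online-deployment delete --name "$2" --endpoint-name "$1" --yes
}

# $1 = endpoint name
delete_endpoint() {
  az ml online-endpoint delete --name "$1" --yes
}

# $1 = resource group name. This removes everything in the group,
# including the workspace, so only use it for a throwaway group.
delete_resource_group() {
  az group delete --name "$1" --yes --no-wait
}
```

If deleting a deployment is rejected because it still receives traffic, setting its share to zero first, for example `az ml online-endpoint update --name diabetes-endpoint --traffic "blue=0"`, is the usual workaround.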
Troubleshooting quick guide
Endpoint created but deployment is stuck
Check logs:
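When a deployment hangs before the scoring script starts, the storage initializer container (which downloads the model and code) is often the more useful log source. A sketch, with a hypothetical helper name:

```shell
# Hypothetical helper: tails the storage-initializer container logs.
# $1 = endpoint name, $2 = deployment name, $3 = number of lines (default 100)
init_logs() {
  az ml online-deployment get-logs --name "$2" --endpoint-name "$1" \
    --container storage-initializer --lines "${3:-100}"
}
```

For example: `init_logs diabetes-endpoint blue`. Omitting `--container` returns the inference server logs instead.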
401 Unauthorized
If using auth_mode: key, fetch keys:
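A sketch of fetching the primary key for a key-authenticated endpoint; the helper name is hypothetical.

```shell
# Hypothetical helper: prints the endpoint's primary key.
# $1 = endpoint name
endpoint_key() {
  az ml online-endpoint get-credentials --name "$1" --query primaryKey -o tsv
}
```

For example: `KEY="$(endpoint_key diabetes-endpoint)"`. When calling the scoring URI directly, send the key as `Authorization: Bearer $KEY` together with `Content-Type: application/json`. Treat the key as a secret and avoid echoing it into shared logs.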
Import errors in scoring
Ensure environment.yml and requirements.txt include your needed packages.
Prefer minimal requirements. Large environments slow down deployment startup.