1. Introduction to data science on Azure
- Understanding data science: Explore core data science concepts, the data science lifecycle, and the role of an Azure Data Scientist.
- Introducing Azure Machine Learning (Azure ML): Discover the capabilities of the Azure ML service and its high-level architecture.
- Setting up the Azure ML workspace: Learn how to create and manage an Azure ML workspace using the Azure portal, Studio, CLI, and Python SDK (v2).
- Managing resources: Create and manage compute targets (instances and clusters) and environments within the Azure ML workspace.
2. Working with data in Azure ML
- Data ingestion and preparation: Identify data sources and formats, choose how to serve data to ML workflows, and connect to data using Azure ML datastores and data assets.
- Data exploration and wrangling: Access and transform data during interactive development in notebooks, optionally using attached Spark pools or serverless Spark compute.
- Data transformation techniques: Apply techniques for cleaning and transforming data to handle problems such as missing or inaccurate values.
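
As one illustration of handling missing data, the sketch below fills gaps in a numeric column with the median of the observed values. It uses only the Python standard library; the function name `impute_median` and the sample `ages` column are hypothetical, and median imputation is just one of several reasonable strategies, not a technique prescribed by this outline.

```python
from statistics import median


def impute_median(values):
    """Return a copy of values with None entries replaced by the median.

    Hypothetical helper for illustration only.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Made-up sample column with two missing entries.
ages = [34, None, 29, 41, None, 38]
cleaned = impute_median(ages)
print(cleaned)
```

The median is used here rather than the mean because it is less sensitive to outliers; in practice the choice depends on the data.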
3. Training and evaluating machine learning models
- Azure Machine Learning designer: Create and run training pipelines using the visual, no-code designer, consuming data assets and incorporating custom code components.
- Automated machine learning (AutoML): Use AutoML for tabular, computer vision, and NLP tasks to explore featurization options and algorithms, which can accelerate model development and deployment.
- Custom model training with notebooks: Develop code on a compute instance, train models using the Python SDK, track training with MLflow, and evaluate models.
- Hyperparameter tuning: Optimize models by tuning hyperparameters with sweep jobs, defining search spaces, sampling methods, and early termination options.
- Model evaluation: Learn how to evaluate model performance using relevant metrics (e.g., accuracy, precision, recall).
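
The evaluation metrics named above can be sketched for a binary classifier as follows. This is a minimal standard-library illustration, not the way Azure ML or MLflow computes them; the function name `classification_metrics` and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for binary labels.

    Hypothetical helper for illustration only.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Precision: of everything predicted positive, how much was right.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of everything actually positive, how much was found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 3 true positives, 2 false positives, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For this sample the accuracy is 5/8, the precision is 3/5, and the recall is 3/4, which shows how the three metrics can diverge on the same predictions.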
4. Deploying, managing, and operationalizing ML solutions
- Model training scripts: Convert notebooks to scripts, run scripts as command jobs, configure job run settings, and use MLflow to log metrics and troubleshoot errors.
- Building pipelines: Create, run, and schedule pipelines in Azure ML using components, passing data between steps to automate ML workflows.
- Model deployment: Configure and deploy models to online endpoints for real-time inferencing and to batch endpoints for batch inferencing, and test the deployed services.
- Responsible machine learning: Apply responsible AI principles throughout the ML lifecycle, addressing ethical considerations such as fairness, privacy, and bias.
5. MLOps (machine learning operations)
- Implementing MLOps practices: Automate model retraining based on new data or changes, define event-based retraining triggers, and integrate with CI/CD pipelines (e.g., Azure DevOps or GitHub).
- Monitoring models: Track model quality and detect data drift or bias in production.
- Building ML workflows: Orchestrate complex machine learning tasks using various Azure services.
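
To make the idea of data drift concrete, the sketch below compares a feature's training-time distribution with its production distribution using the population stability index (PSI). PSI is a swapped-in, commonly used statistic chosen for illustration; it is not presented as the metric Azure ML model monitoring uses. The function name, bin edges, and sample values are all hypothetical.

```python
import math


def population_stability_index(baseline, current, edges, eps=1e-6):
    """Return the PSI between two samples binned by the given edges.

    Hypothetical helper for illustration only. Empty bins are floored
    at eps so the logarithm stays defined.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # The bin index is the number of edges at or below the value.
            counts[sum(1 for e in edges if v >= e)] += 1
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Made-up feature values: the production sample is shifted upward.
training = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
production = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
edges = [4, 7]

psi_same = population_stability_index(training, training, edges)
psi_shifted = population_stability_index(training, production, edges)
print(psi_same, psi_shifted)
```

Identical samples give a PSI of zero, and larger values indicate a bigger shift. A threshold such as 0.2 is a common rule of thumb for flagging drift, but the appropriate cutoff is an assumption that should be validated per feature.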
6. Optimizing language models (LMs) for AI applications
- Exploring language models: Select and deploy LMs from the model catalog, compare them using benchmarks, and test deployed LMs in the playground.
- Optimization approaches: Choose an appropriate strategy for optimizing LMs, such as prompt engineering, retrieval-augmented generation (RAG), or fine-tuning.