Set up AutoML for Time-Series Forecasting with SDK and CLI
This guide shows how to set up automated machine learning (AutoML) in Azure Machine Learning for time-series forecasting using the Azure Machine Learning Python SDK and CLI. It covers preparing your data and orchestrating the workflow, including training, inference, and model evaluation.
Prepare Your Data for Forecasting
To start, your input data for AutoML forecasting must contain valid time series in tabular format. Each variable should have its own column in the data table. AutoML requires at least two columns: a time column representing the time axis and a target column holding the quantity to forecast. Other columns can serve as predictors.
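Before handing a table to AutoML, it can help to confirm that the two required columns are present and usable. The following is a minimal sketch using only the Python standard library, not part of the Azure SDK. The function name check_forecasting_table and the column names "date" and "demand" are hypothetical, and the duplicate-timestamp check assumes the table holds a single series with ISO-formatted time values.

```python
from datetime import datetime


def check_forecasting_table(rows, time_column, target_column):
    """Return a list of problems found in a list of row dicts (empty if none)."""
    if not rows:
        return ["table has no rows"]

    # Both required columns must exist before any row-level checks make sense.
    problems = [
        f"missing required column: {name}"
        for name in (time_column, target_column)
        if name not in rows[0]
    ]
    if problems:
        return problems

    seen = set()
    for i, row in enumerate(rows):
        raw_time = row.get(time_column)
        try:
            stamp = datetime.fromisoformat(raw_time)
        except (TypeError, ValueError):
            problems.append(f"row {i}: unparseable time value {raw_time!r}")
            continue
        # Assumption: one series per table, so each timestamp appears once.
        if stamp in seen:
            problems.append(f"row {i}: duplicate timestamp {raw_time}")
        seen.add(stamp)

        raw_target = row.get(target_column)
        try:
            float(raw_target)
        except (TypeError, ValueError):
            problems.append(f"row {i}: non-numeric target {raw_target!r}")
    return problems


# Hypothetical sample data: the last row repeats a date and has a bad target.
sample = [
    {"date": "2024-01-01", "demand": "120.5"},
    {"date": "2024-01-02", "demand": "131.0"},
    {"date": "2024-01-02", "demand": "n/a"},
]
issues = check_forecasting_table(sample, "date", "demand")
```

Here the third row is reported twice, once for the repeated date and once for the non-numeric target, while the first two rows pass cleanly. Rows loaded with csv.DictReader have the same list-of-dicts shape, so the check can run directly on a CSV file.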
You’ll define your training data as an MLTable object, which specifies a data source and steps for loading the data. This can be done using the MLTable Python SDK or by defining a YAML configuration file.
For example, if your training data is in a local CSV file, you can create an MLTable like this:
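import mltable
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

# Point MLTable at the local CSV file that holds the training series
paths = [
    {'file': './train_data/timeseries_train.csv'}
]
train_table = mltable.from_delimited_files(paths)

# Write the MLTable definition file into the data folder
train_table.save('./train_data')

# Reference the saved MLTable folder as the job's training input
my_training_data_input = Input(
    type=AssetTypes.MLTABLE, path="./train_data"
)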
You can also specify validation data in a similar way, or let AutoML automatically create cross-validation splits from your training data.
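To build intuition for what cross-validation splits look like on time-ordered data, here is a small sketch of rolling-origin splitting, a common scheme for time series in which each validation window comes strictly after its training rows. It illustrates the general idea only and is not necessarily the exact procedure AutoML applies. The function rolling_origin_splits and its parameters are hypothetical names chosen for this example.

```python
def rolling_origin_splits(n_rows, n_folds, horizon, step=None):
    """Return (train_range, validation_range) pairs, oldest fold first.

    Each validation window covers `horizon` consecutive rows, and the
    matching training range holds every row before that window.
    """
    step = horizon if step is None else step
    splits = []
    for fold in range(n_folds):
        # Work backwards from the end of the series, one step per fold.
        val_end = n_rows - fold * step
        val_start = val_end - horizon
        if val_start <= 0:
            raise ValueError("not enough rows for the requested folds")
        splits.append((range(0, val_start), range(val_start, val_end)))
    return list(reversed(splits))


# Ten time-ordered rows, three folds, two-step validation windows.
splits = rolling_origin_splits(n_rows=10, n_folds=3, horizon=2)
for train_idx, val_idx in splits:
    print(list(train_idx), "->", list(val_idx))
```

Because training rows always precede validation rows, no fold lets the model see the future, which is the property that ordinary shuffled k-fold splitting would break.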
Configure the AutoML Forecasting Job
With your data ready, you’ll configure the forecasting job by setting parameters like the primary metric, target column, and cross-validation settings. For example:
from azure.ai.ml import automl
forecasting_job = automl.forecasting(
compute="cpu-compute