
Deep Demand Forecasting with Amazon SageMaker

source link: https://towardsdatascience.com/deep-demand-forecasting-with-amazon-sagemaker-e0226410763a?gi=f56dec657ce7

Use Deep Learning for Demand Forecasting

Introduction

In this article, we explore how to use Deep Learning methods for Demand Forecasting using Amazon SageMaker.

TL;DR: The code for this project is available on GitHub with a single click AWS CloudFormation template to set up the required stack.

What is Demand Forecasting?

Demand forecasting uses historical time-series data to help streamline the supply-demand decision-making process across businesses. Examples include predicting the number of

  • Customer representatives to hire for multiple locations in the next month
  • Product sales across multiple regions in the next quarter
  • Cloud server usage for the next day for a video streaming service
  • Electricity consumption for multiple regions over the next week
  • IoT devices and sensors such as energy consumption

A Brief Overview of Time-Series Forecasting

Any data indexed with time is time-series data. Time-series data are categorized as univariate or multi-variate. For example, the total electricity consumption of a single household over a period of time is a univariate time-series. Here is what a univariate time-series looks like, with some forecasts in green:

[Figure: univariate time-series with forecasts in green, from the GluonTS tutorial]

When multiple univariate time-series are stacked on top of each other, the result is called a multi-variate time-series. For example, the total electricity consumption of 10 different (but correlated) households in a single neighborhood makes up a multi-variate time-series. Such data encapsulates more information, such as correlations within the neighborhood, so we can potentially use this shared information to get better forecasts for each of the households.
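To make this concrete, here is a minimal sketch (with synthetic, illustrative data, not the article's dataset) of ten correlated household series forming a multi-variate time-series in pandas:

```python
# Illustrative sketch: ten correlated household consumption series that
# together form a multi-variate time-series.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
index = pd.date_range("2021-01-01", periods=168, freq="H")  # one week, hourly

# A shared "neighborhood" daily pattern plus per-household noise makes the
# ten univariate series correlated with one another.
daily_pattern = np.sin(2 * np.pi * np.arange(168) / 24)
households = {
    f"household_{i}": daily_pattern + 0.1 * rng.standard_normal(168)
    for i in range(10)
}
df = pd.DataFrame(households, index=index)

print(df.shape)  # (168, 10): 168 hourly observations for 10 households
```

Each column alone is a univariate series; stacking them into one DataFrame gives the multi-variate view that models like LSTNet exploit.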

The status quo approaches for time-series forecasting include classical statistical methods such as ARIMA (Auto-Regressive Integrated Moving Average) and VAR (Vector Auto-Regression).

One of the disadvantages of these classical methods is that they require tedious data preprocessing and feature engineering prior to model training, such as normalization, lag features, multiple time scales, categorical data handling, and dealing with missing values, usually all under a stationarity assumption. Deep Learning (DL) models, however, can automate these steps. Moreover, it is widely known that DL methods have surpassed classical methods in areas such as Computer Vision and Natural Language Processing given enough data, but how about time-series data?
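As an illustration of that manual feature engineering (a hypothetical example, not code from the article), here is the kind of lag, differencing, and normalization work a classical pipeline often needs before any model is fit:

```python
# Illustrative sketch of manual preprocessing for a classical forecaster.
import numpy as np
import pandas as pd

index = pd.date_range("2021-01-01", periods=48, freq="H")
y = pd.Series(np.arange(48, dtype=float), index=index, name="consumption")

features = pd.DataFrame({
    "y": y,
    "y_diff": y.diff(),                  # first difference, toward stationarity
    "y_lag_1": y.shift(1),               # consumption one hour ago
    "y_lag_24": y.shift(24),             # same hour yesterday
    "y_norm": (y - y.mean()) / y.std(),  # normalization
}).dropna()                              # rows lost to lags and missing values

print(features.shape)  # (24, 5): the first 24 rows are dropped by the lags
```

A DL model such as LSTNet can learn many of these transformations implicitly from the raw series.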

Deep Learning for time-series forecasting

The use of Deep Learning methods in time-series forecasting has been a major area of research, in particular for (stationary/non-stationary) multi-variate time-series data. This work is supported by highly optimized, dedicated frameworks such as Apache MXNet, PyTorch, and TensorFlow, which provide fast GPU-enabled training and inference.

Research indicates that DL methods outperform the aforementioned classical methods ARIMA and VAR, in particular when dealing with large volumes of (correlated) multi-variate time-series data that have categorical features and missing values . One reason is that neural network models can predict seasonality for new events since these global models learn patterns jointly over the whole dataset and can extrapolate learned regularities to new series better. One such method is LSTNet from Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks.

How Does LSTNet Work?

LSTNet is one of the state-of-the-art DL methods for forecasting. We have contributed it to GluonTS, which is currently based on the MXNet Gluon API. The advantage of LSTNet is that it incorporates a traditional auto-regressive linear model in parallel with the non-linear neural network part. This makes the non-linear DL model more robust to time series whose scale changes over time.

The following is the LSTNet architecture, which contains a

  1. Convolution as the first layer, followed by
  2. Recurrent and Skip-Recurrent layers
  3. Fully Connected layer combining the non-linear features and the linear features

[Figure: LSTNet architecture, from Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks]
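To see why the parallel linear branch helps, here is a heavily simplified NumPy sketch (all weights are random stand-ins, and the non-linear part is a placeholder for LSTNet's real conv/recurrent/skip-recurrent stack, so this is an illustration of the idea, not the actual model):

```python
# Sketch of LSTNet's key idea: a linear autoregressive branch summed with
# a non-linear branch. Weights here are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
context_length, n_series, ar_window = 12, 3, 4

x = rng.standard_normal((context_length, n_series))  # past observations

# Non-linear branch stand-in: a random projection of the flattened context
# (in real LSTNet: convolution -> recurrent -> skip-recurrent -> dense).
w_nonlinear = rng.standard_normal((context_length * n_series, n_series))
nonlinear_out = np.tanh(x.reshape(-1) @ w_nonlinear)

# Linear AR branch: each series' forecast is a weighted sum of its own last
# `ar_window` values, which keeps the output sensitive to the input scale.
w_ar = rng.standard_normal(ar_window)
ar_out = x[-ar_window:].T @ w_ar  # shape: (n_series,)

forecast = nonlinear_out + ar_out  # LSTNet sums the two branches
print(forecast.shape)  # (3,): one forecast per series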

Data

For this demonstration, we will use multi-variate electricity consumption time-series data¹. A cleaned version of the data is available to download directly via GluonTS. The data contains 321 time-series with 1-hour frequency, where

  • training data starts from 2012–01–01 00:00:00 and ends at 2014–05–26 19:00:00
  • testing data has an additional day of data, from 2014–05–26 19:00:00 until 2014–05–27 19:00:00
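
We can sanity-check the series lengths implied by these dates with pandas (a side calculation, not from the article):

```python
# Counting the hourly timestamps implied by the train/test date ranges above.
import pandas as pd

train_index = pd.date_range("2012-01-01 00:00", "2014-05-26 19:00", freq="H")
extra_test = pd.date_range("2014-05-26 20:00", "2014-05-27 19:00", freq="H")

print(len(train_index))  # 21044 hourly observations per training series
print(len(extra_test))   # 24 additional hourly observations for testing
```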

Here is a snapshot of the normalized training data in a Pandas DataFrame


Components and Architecture Overview

The dataset is stored in an S3 bucket. The project that uses Amazon SageMaker has three main components as follows:

  1. A pre-processing step to normalize the data, designed as a micro-service to handle heavy-duty jobs
  2. Training an LSTNet model and examining metrics such as sMAPE for the predictions
  3. (Optional) Deploying and creating a real-time prediction HTTPS endpoint that connects to Amazon CloudWatch for monitoring

The complete code for this project is available on GitHub. Follow the instructions for a single-click setup.

Here is the visual architecture guide:

[Figure: deep demand forecasting architecture]

Train with Amazon SageMaker MXNet Estimator

Since we are using GluonTS, we need to train our model using an MXNet estimator, providing train.py as our entry point. For example, we train our model for 1 epoch with context_length=12, a training window of 12 hours of past electricity consumption, to predict the next 6 hours with prediction_length=6 as the testing window.

import logging

from sagemaker.mxnet import MXNet

CONTEXT_LENGTH = 12
PREDICTION_LENGTH = 6

hyperparameters = {
    'context_length': CONTEXT_LENGTH,
    'prediction_length': PREDICTION_LENGTH,
    'skip_size': 4,
    'ar_window': 4,
    'channels': 72,
    'scaling': False,
    'output_activation': 'sigmoid',
    'epochs': 1,
}

estimator = MXNet(entry_point='train.py',
                  source_dir='deep_demand_forecast',
                  role=role,
                  train_instance_count=1,
                  train_instance_type='ml.p3.2xlarge',
                  framework_version='1.6.0',
                  py_version='py3',
                  hyperparameters=hyperparameters,
                  output_path=train_output,
                  code_location=code_location,
                  sagemaker_session=session,
                  container_log_level=logging.DEBUG,
                 )

estimator.fit(train_data)

Metrics

One of the most common evaluation metrics in time-series forecasting is the symmetric Mean Absolute Percentage Error (sMAPE). For n forecast values F_t and actual values A_t, it is defined (following Wikipedia) as

sMAPE = (100 / n) * Σ_t |F_t − A_t| / ((|A_t| + |F_t|) / 2)

We can visually compare it with another useful metric known as the Mean Absolute Scaled Error (MASE), where values closer to zero generally indicate better predictions. We use the Altair Python package to interactively examine their relationship.
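For reference, here are small NumPy implementations of the two metrics (standard textbook definitions, not code from the project's repository):

```python
# Reference implementations of sMAPE and MASE using their standard definitions.
import numpy as np

def smape(actual, forecast):
    """Symmetric Mean Absolute Percentage Error, in percent (0 is best)."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    return 100.0 * np.mean(np.abs(forecast - actual) / denom)

def mase(actual, forecast, train, m=1):
    """Mean Absolute Scaled Error vs. a (seasonal-)naive baseline on train."""
    actual, forecast, train = (np.asarray(a, float)
                               for a in (actual, forecast, train))
    naive_mae = np.mean(np.abs(train[m:] - train[:-m]))
    return np.mean(np.abs(forecast - actual)) / naive_mae

train = [1.0, 2.0, 3.0, 4.0, 5.0]
actual = [6.0, 7.0]
forecast = [6.5, 6.5]
print(smape(actual, forecast))  # ≈ 7.70
print(mase(actual, forecast, train))  # 0.5
```

MASE below 1 means the forecast beats the naive baseline on average, which makes it easy to interpret alongside sMAPE.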


Finally, we can interactively visualize the predictions against the train and test data. For example, here is a sample of the first 10 time-series covariates in the train, test, and predicted results.


As you can see, after only 1 epoch the model performs relatively well at capturing the overall trends. One can use the HyperparameterTuner in the Amazon SageMaker Python SDK to achieve state-of-the-art results on this data.

Deploy an Endpoint

Depending on the business objectives (for example, in a power station facility), once we are satisfied with the model's offline performance, we can deploy an endpoint directly from an Amazon SageMaker notebook as follows:

from sagemaker.mxnet import MXNetModel

model = MXNetModel(model_data,
                   role,
                   entry_point='inference.py',
                   source_dir='deep_demand_forecast',
                   py_version='py3',
                   framework_version='1.6.0',
                  )

predictor = model.deploy(instance_type='ml.m4.xlarge', initial_instance_count=1)

After that, we can hit the endpoint with some data request_data and get predictions in a single line of code, with automatic JSON serialization and deserialization:

predictions = predictor.predict(request_data)

Conclusion

Thanks for reading through the article!

In this blog, we demonstrated how Deep Learning models can be used in Demand Forecasting applications, using Amazon SageMaker for preprocessing, training, testing, and deployment.

We would love to hear your opinions. Please use GitHub issues for any questions, comments, or feedback.

Acknowledgements

Last but not least, I would like to thank Vishaal Kapoor, Jonathan Chung and Adriana Simmons for their valuable feedback when writing this article.

