Table of Contents

The first attempt using daily data was poor, so now I gonna use weekly data. But what was the weekday of 1-1-2009?

The target is the first row!

Multivariate Time Series Modeling using LSTM

How to use the TimeseriesGenerator

Keras provides the TimeseriesGenerator that can be used to automatically transform a univariate or multivariate time series dataset into a supervised learning problem.

There are two parts to using the TimeseriesGenerator: defining it and using it to train models.

Defining a TimeseriesGenerator

You can create an instance of the class and specify the input and output aspects of your time series problem and it will provide an instance of a Sequence class that can then be used to iterate across the inputs and outputs of the series.

In most time series prediction problems, the input and output series will be the same series.

For example:

Technically, the class is not a generator in the sense that it is not a Python Generator and you cannot use the next() function on it.

In addition to specifying the input and output aspects of your time series problem, there are some additional parameters that you should configure; for example:

You must define a length argument based on your designed framing of the problem. That is the desired number of lag observations to use as input.

You must also define the batch size as the batch size of your model during training. If the number of samples in your dataset is less than your batch size, you can set the batch size in the generator and in your model to the total number of samples in your generator found via calculating its length; for example:

There are also other arguments such as defining start and end offsets into your data, the sampling rate, stride, and more. You are less likely to use these features, but you can see the full API for more details.

The samples are not shuffled by default. This is useful for some recurrent neural networks like LSTMs that maintain state across samples within a batch.

It can benefit other neural networks, such as CNNs and MLPs, to shuffle the samples when training. Shuffling can be enabled by setting the ‘shuffle‘ argument to True. This will have the effect of shuffling samples returned for each batch.

At the time of writing, the TimeseriesGenerator is limited to one-step outputs. Multi-step time series forecasting is not supported.

Training a Model with a TimeseriesGenerator

Once a TimeseriesGenerator instance has been defined, it can be used to train a neural network model.

A model can be trained using the TimeseriesGenerator as a data generator. This can be achieved by fitting the defined model using the fit_generator() function.

This function takes the generator as an argument. It also takes a steps_per_epoch argument that defines the number of samples to use in each epoch. This can be set to the length of the TimeseriesGenerator instance to use all samples in the generator.

Create the LSTM model

[0.005353941582143307, 0.06212739273905754] for 19 epochs

Reverse the transformation:

the graph of the training loss vs. validation loss over the number of epochs.

This will help the developer of the model make informed decisions about the architectural choices that need to be made.

the graph of training accuracy vs. validation accuracy over the number of epochs.

The Abs percentage Error is <5% for the last 25 weeks!