LSTM TensorFlow Input/Output Dimensions

I’m a little confused by what I’m getting vs. what I’m
expecting. I’m using TensorFlow 2.1 with Python 3.7 in Anaconda
3-2020.07.

Here’s my problem:

I want my output to be the next value in an hour-by-hour time
series.
My input has 99 features.
I have 24,444 data points for training. Some of the data was
corrupted/reserved for validation.

I’m trying to build a 2 layer deep neural network using LSTM
layers:

import tensorflow
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(tensorflow.keras.layers.LSTM(64, return_sequences=True, input_dim=99))
model.add(tensorflow.keras.layers.LSTM(32, return_sequences=True))
model.add(tensorflow.keras.layers.Dense(1))

I plan to train it on windows of 72 sequential hours (3 days)
of data.

So when I give my model training data:

model.fit(X_data, Y_data, …)

I planned on giving X_data with dimensions of size [24444, 72,
99], where the first dimension 24444 describes the data points, the
72 describes the 72 hours of history, and the 99 describes my
training features.

My Y_data has dimensions of size [24444, 72, 1], where the first
dimension 24444 describes my training points, 72 describes the
history, and 1 is my output feature.
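
For reference, here is a minimal sketch of how arrays with these shapes could be built from one long hourly table. The make_windows helper, the random stand-in data, and the choice to shift the target by one hour at every step are assumptions for illustration, not something stated in the post.

```python
import numpy as np

def make_windows(features, target, history=72):
    """Slice a (T, F) feature array into overlapping (history, F) windows.

    Y holds, for every step in a window, the target one hour later,
    so the last entry of each Y window is the true "next value".
    """
    # One spare row is needed so the final window still has a next-hour target.
    n = len(features) - history
    X = np.stack([features[i:i + history] for i in range(n)])
    Y = np.stack([target[i + 1:i + history + 1] for i in range(n)])
    return X, Y[..., np.newaxis]

# Hypothetical stand-in data: 200 hourly rows with 99 features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 99))
target = rng.normal(size=200)

X_data, Y_data = make_windows(features, target)
print(X_data.shape, Y_data.shape)
```

With 200 rows and a 72-hour history this yields 128 windows; the real data set would give a first dimension close to the 24,444 mentioned above.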

My question is, when training is done, and I’m actively using my
model for predictions, what should my production input size be?

prediction = model.predict(production_data)

Should my production size be [1, 72, 99], where 1 is the number
of output points I expect, 72 is my history, and 99 is my feature
size?

When I do this, I get an output size of [72, 1]. That feels…
weird?
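
As a quick shape check, the sketch below rebuilds the same stack (untrained) and pushes one [1, 72, 99] window through it. It uses tf.keras.Input(shape=(72, 99)) instead of input_dim=99 and an all-zeros placeholder input; both are assumptions made to keep the example self-contained. Because both LSTM layers use return_sequences=True and Dense is applied at every time step, the result should come back as (1, 72, 1): one value per hour of the window, with the last one being the next-hour estimate.

```python
import numpy as np
import tensorflow as tf

# Same layer stack as in the post; weights are random because nothing is trained.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(72, 99)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(1),
])

# Placeholder production input: one window of 72 hours with 99 features.
production_data = np.zeros((1, 72, 99), dtype="float32")
prediction = model.predict(production_data, verbose=0)
print(prediction.shape)

# The final time step is the forecast for the hour after the window ends.
next_hour = prediction[0, -1, 0]
```

If only the final value is wanted, setting return_sequences=False on the second LSTM layer would make the model return a single output per window, in which case Y_data would need shape [24444, 1].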

What is the difference between feeding my model input of [72, 1,
99] vs [1, 72, 99]? Does the first case not propagate the internal
state forward?

If I give my model [1, 1, 99] do I need to loop my model
predictions? And how would I do this?
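
The last two questions can be illustrated with a toy recurrent cell in plain NumPy. This is a stand-in for intuition only, not the Keras LSTM: the run function, the weights, and the 4-unit state are all hypothetical. The first axis is treated as independent samples and the second as time, which is also how Keras reads a [batch, time, features] array.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_units, n_steps = 99, 4, 72
W_x = rng.normal(scale=0.1, size=(n_features, n_units))
W_h = rng.normal(scale=0.1, size=(n_units, n_units))

def run(batch, state=None):
    """Toy tanh RNN over a (batch, time, features) array.

    Returns the per-step outputs and the final state. The state starts
    at zero for every sample unless one is passed in.
    """
    h = np.zeros((batch.shape[0], n_units)) if state is None else state
    outputs = []
    for t in range(batch.shape[1]):
        h = np.tanh(batch[:, t] @ W_x + h @ W_h)
        outputs.append(h)
    return np.stack(outputs, axis=1), h

sequence = rng.normal(size=(n_steps, n_features))

# [1, 72, 99]: one sample, state carried across all 72 hours.
full, _ = run(sequence[np.newaxis])

# [72, 1, 99]: 72 unrelated samples of length 1, each starting from a zero state.
independent, _ = run(sequence[:, np.newaxis])

# [1, 1, 99] in a loop: the state has to be fed back in by hand at every step.
state, looped = None, []
for t in range(n_steps):
    out, state = run(sequence[np.newaxis, t:t + 1], state)
    looped.append(out[0, 0])
looped = np.stack(looped)

print(np.allclose(looped, full[0]))             # loop with carried state vs. full pass
print(np.allclose(independent[:, 0], full[0]))  # independent samples vs. full pass
```

In this toy, the loop reproduces the full pass only because the state is passed back in, while the [72, 1, 99] layout agrees with it at the first hour only. In Keras, LSTM layers reset their state between batches by default, so looping over [1, 1, 99] inputs would need stateful=True (with a fixed batch size) to behave like the carried-state loop here.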

submitted by /u/jyliu86
