I’m a little confused by what I’m getting vs. what I’m expecting. I’m using TensorFlow 2.1 with Python 3.7 in Anaconda3-2020.07.

Here’s my problem:

- I want my output to be the next value in an hour-by-hour time series.
- My input has 99 features.
- I have 24,444 data points for training. Some of the data was corrupted/reserved for validation.

I’m trying to build a two-layer deep neural network using LSTM layers:

model = Sequential()
model.add(tensorflow.keras.layers.LSTM(64, return_sequences=True, input_dim=99))
model.add(tensorflow.keras.layers.LSTM(32, return_sequences=True))
model.add(tensorflow.keras.layers.Dense(1))
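For reference, here’s a self-contained version of that model (a sketch; it uses `input_shape=(72, 99)` to make the expected `(timesteps, features)` layout explicit, rather than `input_dim=99`), plus a dummy batch to show the layer-by-layer output shape:

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Same 2-layer LSTM stack as above. return_sequences=True means each
# LSTM emits one output per timestep, so Dense(1) is applied per hour.
model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(72, 99)),
    LSTM(32, return_sequences=True),
    Dense(1),
])

# One dummy sample: batch of 1, 72 hours of history, 99 features.
dummy = np.zeros((1, 72, 99), dtype=np.float32)
print(model(dummy).shape)  # (1, 72, 1)
```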

I plan to give it sets of data with 72 hours (3 days) of sequential training.

So when I give my model training data:

model.fit(X_data, Y_data, …)

I planned on giving X_data with dimensions of size [24444, 72, 99], where the first dimension, 24444, is the number of data points, 72 is the 72 hours of history, and 99 is my number of training features.

My Y_data has dimensions of size [24444, 72, 1], where the first dimension, 24444, is my training points, 72 is the history, and 1 is my output feature.

My question is: when training is done and I’m actively using my model for predictions, what should my production input size be?

prediction = model.predict(production_data)

Should my production input size be [1, 72, 99], where 1 is the number of output points I expect, 72 is my history, and 99 is my feature size?

When I do this, I get an output size of [72, 1]. That feels… weird?
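A quick shape check (a sketch, assuming the two-LSTM model above built with `input_shape=(72, 99)`) suggests Keras keeps a leading batch dimension of 1, so a [72, 1] result usually means the batch axis was indexed or squeezed away somewhere:

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(64, return_sequences=True, input_shape=(72, 99)),
    LSTM(32, return_sequences=True),
    Dense(1),
])

x = np.zeros((1, 72, 99), dtype=np.float32)  # 1 sample, 72 hours, 99 features
pred = model.predict(x)
print(pred.shape)     # (1, 72, 1): one prediction per timestep for the one sample
print(pred[0].shape)  # (72, 1): batch axis indexed away
```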

What is the difference between feeding my model input of [72, 1, 99] vs. [1, 72, 99]? Does the first case not propagate the internal state forward?

If I give my model [1, 1, 99], do I need to loop my model predictions? And how would I do this?
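One possible way to feed [1, 1, 99] one hour at a time while carrying the hidden state forward is Keras’s `stateful=True` mode (a sketch of the looping mechanics only; whether to retrain in this configuration or copy weights over from the 72-step model is a separate question):

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

# stateful=True carries the LSTM hidden/cell state across predict()
# calls, so each [1, 1, 99] step continues the same sequence.
# batch_input_shape pins the batch size to 1, which stateful mode requires.
step_model = Sequential([
    LSTM(64, return_sequences=True, stateful=True,
         batch_input_shape=(1, 1, 99)),
    LSTM(32, return_sequences=True, stateful=True),
    Dense(1),
])

step_model.reset_states()                        # start a fresh sequence
history = np.zeros((72, 99), dtype=np.float32)   # dummy 72-hour window
outputs = []
for hour in range(72):
    step = history[hour].reshape(1, 1, 99)       # one timestep at a time
    outputs.append(step_model.predict(step, verbose=0)[0, 0, 0])
print(len(outputs))  # 72 predictions, one per hour
```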

submitted by /u/jyliu86
