How can I get better forecasting results?

I am a beginner in machine learning and I would like to forecast some pollution data.

I am using a dataset with values for pm2.5, pm10 and pm1 as features, and I am predicting the pm2.5 values. I built an LSTM network, but the predicted values are quite far from the real values.

What I used:

win_length = 2
batch_size = 32
num_features = 3
train_generator = TimeseriesGenerator(x_train, y_train, length=win_length, sampling_rate=1, batch_size=batch_size)
test_generator = TimeseriesGenerator(x_test, y_test, length=win_length, sampling_rate=1, batch_size=batch_size)
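For reference, here is a minimal NumPy sketch of the windowing these settings produce, as far as I understand `TimeseriesGenerator`: with `sampling_rate=1`, each sample is the window `x[i:i+length]` and its target is `y[i+length]`. The helper `make_windows` and the demo arrays are made up for illustration and are not part of the notebook.

```python
import numpy as np

def make_windows(x, y, length):
    """Pair each window x[i:i+length] with the target y[i+length]."""
    x = np.asarray(x)
    y = np.asarray(y)
    # One window per valid start index; the last `length` rows have no target after them.
    windows = np.stack([x[i:i + length] for i in range(len(x) - length)])
    targets = y[length:]
    return windows, targets

# Made-up data: 6 time steps, 3 feature columns (standing in for pm2.5, pm10, pm1).
x_demo = np.arange(18, dtype=float).reshape(6, 3)
y_demo = x_demo[:, 0]  # assumption: the target is the first (pm2.5) column

windows, targets = make_windows(x_demo, y_demo, length=2)
print(windows.shape)     # (4, 2, 3)
print(targets.tolist())  # [6.0, 9.0, 12.0, 15.0]
```

So with `win_length=2`, each prediction is based only on the two readings immediately before it.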


The model is a stacked LSTM:

model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(200, input_shape=(win_length, num_features), return_sequences=True))
model.add(tf.keras.layers.LeakyReLU(alpha=0.5))
model.add(tf.keras.layers.LSTM(128, return_sequences=True))
model.add(tf.keras.layers.LeakyReLU(alpha=0.5))
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.LSTM(64, return_sequences=False))
model.add(tf.keras.layers.Dropout(0.3))
model.add(tf.keras.layers.Dense(1))

This is the data snippet and how the predicted values look compared to the original pm2.5 values:

[image: prediction snippet]
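To put a number on how far the predictions are from the real values, one option is to compute the RMSE and compare it with a naive persistence baseline, which predicts each value as the previous observed one. The sketch below is only an illustration: the arrays are made-up values, not results from the notebook, and `rmse` is a hypothetical helper.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally sized sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up values for illustration only (not taken from the dataset).
actual = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
model_pred = np.array([11.0, 11.0, 12.0, 13.0, 15.0])  # hypothetical model output

# Persistence baseline: the forecast for step t is the observed value at step t-1.
persistence_pred = actual[:-1]

# Score both on the same steps (the first step has no previous value).
model_rmse = rmse(actual[1:], model_pred[1:])
baseline_rmse = rmse(actual[1:], persistence_pred)

print("model RMSE:", model_rmse)
print("persistence RMSE:", baseline_rmse)
```

If the model's error is not clearly below the persistence error, the network is probably not learning much beyond "repeat the last reading".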

How can I increase the accuracy of the forecast? I am also attaching the Jupyter notebook, which contains all the analysis.

submitted by /u/MobileInformal460