You are a real modern-day "sadhu", a saint who wants to share his knowledge with others. Keep doing it; you are doing great work.
16:06 Regarding the shifting part: the first prediction point will be seq_size units after the first point in testX, so you can fix it by extending testX by seq_size steps from the start.
That is right. When you apply the sequencing to the dataset, the first trainX sequence is made up of points that do not exist (there is no t-1 point before the real t=0 of the data), so the plot behaves oddly. Something similar happens when you try to extend the forecast horizon beyond just the next point.
Awesome explanations! (And no need to apologize for the video being too long; anyone who wants to can stop it.) I actually found it the most interesting one, specifically because the experiments at the end were super helpful, since your experience with tweaking parameters is certainly classes above most of ours. Heartfelt gratitude for all these videos and for sharing your knowledge, understanding, and experience! 🙏
Thank you so much for your content! It has been extremely helpful for me. You make ML topics look trivial, and you do it with excellence.
Cheers from the Netherlands ☺
Thanks for all of these; they give me some confidence to try something in the entirely new territory of AI/machine learning. I would like to work with LSTMs, but on remembering past images, such as satellite images for change detection with weather patterns and anomalies, and for vegetation growth after a fire, typhoon, reforestation, fertilizer intervention, etc.
Thanks for this amazing series about Time series forecasting and anomalies detection.
Nice explanation about data feeding! Informative tutorial! Thank you!
Hi dear Sreenivas, thanks again for sharing such awesome material with every video release. I think I found a little mistake in shifting the data for plotting.
In the function to_sequences, change:
for i in range(len(dataset)-seq_size-1):
to:
for i in range(len(dataset)-seq_size):
And in the shift-test-predictions part, change:
testPredictPlot[len(trainPredict)+(seq_size*2)+1:len(dataset)-1, :] = testPredict
to:
testPredictPlot[len(trainPredict)+(seq_size*2):len(dataset), :] = testPredict
With these few adjustments we get better-aligned plots.
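For anyone following along, the corrected to_sequences can be sketched like this (a minimal NumPy version, assuming the dataset is a (n, 1) array as in the video; the toy data here is just 0..9 so the off-by-one is easy to see):

```python
import numpy as np

def to_sequences(dataset, seq_size=1):
    """Slice a (n, 1) series into (samples, seq_size) windows and next-step targets."""
    x, y = [], []
    # range(len(dataset) - seq_size), NOT "- seq_size - 1": the last window,
    # dataset[n-seq_size-1 : n-1], predicts the final point dataset[n-1].
    for i in range(len(dataset) - seq_size):
        x.append(dataset[i:(i + seq_size), 0])
        y.append(dataset[i + seq_size, 0])
    return np.array(x), np.array(y)

# With 10 points and seq_size=5 we now get 5 windows instead of 4,
# and the last target is the last point of the series.
data = np.arange(10).reshape(-1, 1).astype(float)
trainX, trainY = to_sequences(data, seq_size=5)
print(trainX.shape)   # (5, 5)
print(trainY[-1])     # 9.0
```

With the original `- seq_size - 1` loop, the final point of the series is never used as a target, which is exactly the one-step misalignment people see in the plots.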
Can you help me please? I have a problem with this implementation.
Dear sir, may God always, always, and always keep you happy...
Great video man. Both your videos and code really helped.
Video is well explained and code is well written with sufficient comments.
Keep up the good work. More power to you.
Thank you! You may have just landed me a new job!
Good luck.
Thanks for the great videos, with every step properly explained.
Thank you very much for the crystal-clear explanation, as well as for sharing the code, Sreeni.
You are welcome 😊
Great videos, excellently simplified explanations 👌
Thanks for all your efforts to make it possible for every user to understand the basic concepts, with appropriate examples 😊 Great work 👍
My pleasure!
Thanks for all the very good explanations of LSTMs. This video is the first time I have understood the use of sigmoid and tanh in an LSTM.
I've just seen a couple of your videos related to time series; you have explained the concepts very well, especially the code. Another good thing is that the code is in Python, not R (which is used in most of the good time series books). Thanks.
Thank you! Wondering about the Bidirectional model: was the Dense(32) left out intentionally?
#Bidirectional LSTM
# reshape input to be [samples, time steps, features]
#trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
#testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
#
##For some sequence forecasting problems we may need LSTM to learn
## sequence in both forward and backward directions
#from keras.layers import Bidirectional
#model = Sequential()
#model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(None, seq_size)))
##model.add(Dense(32)) ##**??
#model.add(Dense(1))
#model.compile(optimizer='adam', loss='mean_squared_error')
#model.summary()
#print('Train...')
Can you help me please? I have a problem while implementing the LSTM.
Sir, you are so awesome for these detailed explanations.
You are the best teacher.
Great series of videos! Can anyone point to resources on using CNNs and LSTMs for multivariate forecasting, where the networks are trained on multiple features and then all of those same features are predicted?
All of the examples I have seen, even those labeled "multivariate forecasting", forecast only one feature as the output.
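One way to forecast all of the input features at once (a shape-level sketch, not from the video; the 3 features and window of 5 are assumptions, and the series contents are random placeholders) is to make each target the full next row and give the final Dense layer one unit per feature:

```python
import numpy as np

SEQ_SIZE, N_FEATURES = 5, 3
series = np.random.rand(60, N_FEATURES)   # placeholder multivariate series

# Each window is SEQ_SIZE rows of all features; the target is the whole next row.
X = np.array([series[i:i + SEQ_SIZE] for i in range(len(series) - SEQ_SIZE)])
y = np.array([series[i + SEQ_SIZE] for i in range(len(series) - SEQ_SIZE)])

print(X.shape, y.shape)   # (55, 5, 3) (55, 3)
# The model lines need TensorFlow, so they are left as comments:
# model.add(LSTM(64, input_shape=(SEQ_SIZE, N_FEATURES)))
# model.add(Dense(N_FEATURES))   # one output unit per predicted feature
```

The only structural change from single-output forecasting is the target shape and the width of the last Dense layer.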
Thank you so much, very good explanation. I followed all your LSTM videos.
Hi. I think the shape of trainX and testX must be (_, 5, 1). In this particular example, the feature size is 1, not 5. Please check it. Note that (_, 1, 5) also works, but it models a different problem (i.e., it unrolls only 1 time step, which is not desired here). An LSTM with input shape (_, 1, 5) works somewhat like a fully connected feed-forward network, because there is no unrolling (time steps = 1) in the LSTM layer.
I agree with you, John. The shape of trainX and testX should be (_, 5, 1), where 5 is the time step and 1 is the number of features or indicators (in this case only 1). I also realized that the for loop in the function to_sequences should be "for i in range(len(dataset)-seq_size):" without subtracting 1.
I like and enjoy the content anyway. Keep it up, Sreeni.
Agreed.
I think so. Please correct me if I am wrong: seq_size should be equal to the number of time steps?
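To make the shape discussion above concrete, here is a small NumPy sketch (the 89 samples and seq_size of 5 are taken from the video's example; the array contents are random placeholders):

```python
import numpy as np

seq_size = 5
trainX = np.random.rand(89, seq_size)   # 89 windows of 5 past values

# (samples, 1, seq_size): one time step carrying 5 "features".
# There is no unrolling, so the LSTM acts much like a dense layer.
as_features = trainX.reshape(trainX.shape[0], 1, seq_size)

# (samples, seq_size, 1): 5 time steps of 1 feature each.
# The LSTM unrolls over the window, which is the intended setup here.
as_timesteps = trainX.reshape(trainX.shape[0], seq_size, 1)

print(as_features.shape)   # (89, 1, 5)
print(as_timesteps.shape)  # (89, 5, 1)
```

Both reshapes are valid NumPy, which is why the code runs either way; only the second one actually treats the window as a sequence.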
I have some issues with it. Can you help me please? I am in trouble.
Thank you very much. Your videos are very helpful.
I have a question regarding the size of the prediction:
At 22:16 you can see 3 plots on top of each other:
1: Target variable from Dataset
2: Train Prediction
3: Test Prediction
When you look very closely, you can see that the Test Prediction is actually one time step shorter than the plotted target variable from the dataset.
Since we are using the 5 previous days (or whatever seq_size is) to forecast day number 6, shouldn't the Test Prediction be one time step longer than the plotted target variable? In other words, shouldn't the prediction reach one day into the future?
When we get to the end of the series, the model should use the last 5 known data points to predict the 6th, so there should be one prediction in the future. Or am I missing something here?
Thank you for your awesome tutorial, and for answering my question!
Awesome material. Thanks a lot for sharing!
Thank you Sreeni.
Please make videos on multivariate dataset for time series forecasting.
Sure, will try.
Pretty Insightful video, thank you for the content 👍
Thank you for the comprehensive explanation. I have a question, please: what does "units" mean? Units=128, units=64, etc.
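For what it's worth, "units" is the dimensionality of the LSTM's hidden state and output, i.e. the number of memory cells in the layer. Its effect can be checked by hand against model.summary(): an LSTM has four gates, each with an input kernel, a recurrent kernel, and a bias. This little helper is just an illustration of that formula:

```python
def lstm_param_count(units, features):
    # 4 gates, each with: kernel (features x units),
    # recurrent kernel (units x units), and bias (units).
    return 4 * (features * units + units * units + units)

# For LSTM(64) on a single input feature:
print(lstm_param_count(64, 1))   # 16896, matching Keras model.summary()
```

So going from units=64 to units=128 roughly quadruples the layer's parameters, since the recurrent kernel grows quadratically in the unit count.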
Thank you so much for this tutorial. Could you please extend the code to forecast in the future? For example, how to predict three months in the future?
Is it possible to use the model to predict future values?
I mean, we are feeding the model data that is already known and then comparing the model's output to those values.
I want to see what would happen if the model tried to predict values a fixed amount of time after the last data point.
Is that possible?
Good day,
Hope all is well. I hope you can help me.
I understand that you fit the training and testing graphs (green and red) on top of the data (orange), but how do you predict, say, 10 time observations into the future?
I want to determine the remaining useful life of a component, and LSTM looks like the way to go. The only problem is that I do not understand how to predict into the future.
Thank you for the help.
Regards
P.S. You have the best videos on the net.
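A common answer to the "how do I predict N steps into the future" questions above is a rolling (recursive) forecast: predict one step, append that prediction to the window, and repeat. The sketch below uses a stand-in predict_next function so it stays runnable without TensorFlow; in practice you would call model.predict on the last seq_size scaled values and inverse-transform at the end:

```python
import numpy as np

seq_size = 5

def predict_next(window):
    # Stand-in for model.predict(window.reshape(1, seq_size, 1));
    # here it just returns the window mean so the loop is runnable.
    return float(np.mean(window))

def rolling_forecast(series, n_future, seq_size=5):
    """Predict n_future points past the end of series, one step at a time."""
    window = list(series[-seq_size:])   # last known seq_size points
    future = []
    for _ in range(n_future):
        nxt = predict_next(np.array(window))
        future.append(nxt)
        window = window[1:] + [nxt]     # slide the window onto the prediction
    return future

history = [112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0]
print(rolling_forecast(history, n_future=3))   # 3 points beyond the data
```

Note that errors compound: each step is predicted from previous predictions, so a 3-month horizon is much less reliable than the next point.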
Thanks for the video! What is the effect of batch size? Why haven't you used it in the model-fitting command?
Amazing video, Sir! I used the same code to forecast tourist-arrival data, but the RMSE value is too big. Can anyone help? Is LSTM simply not suitable for this dataset?
I have a question about the code "reshape(trainX.shape[0], 1, trainX.shape[1])": the "1" here means your input is only a single time step of trainX.shape[1] feature values. From my understanding, it does not make sense to use an RNN layer at all, since the input is basically not a sequence. Please help me out if my understanding is wrong. Thanks.
I think that scaling on the full dataset is a mistake. Scaling should be performed on training set only.
In this set of time series tutorials you have consistently fit the scaler on the training and test data together. However, the scaler should be fit only on the training data, and the test data should then be transformed with that fitted scaler. Is there a reason you chose to fit it on both datasets together? Is it for simplicity of understanding?
train_size = int(len(dataset) * 0.7)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
scaler = MinMaxScaler(feature_range=(0, 1))
train = scaler.fit_transform(train)
test = scaler.transform(test)
I did fit the scaler to the training data, and then applied that fit to scale both the training and test datasets. You have to scale the test data exactly the same way you scaled the training data.
How can I use neural networks to predict a chunk of data in the future, as in the ARIMA model's predictions? All I ever see with neural networks are these 1-step incremental predictions on existing data. What if I want to predict a whole chunk of future data, to see whether the neural network actually recognized the patterns inherent in the dataset? Otherwise the neural networks are essentially just creating a sort of moving average of the data.
Hi, this is a great tutorial. How do I get values if I want to predict beyond the test dates? Thanks.
Thank you Sreeni.
Is there any video for multi-step time series forecasting?
Hello @yahyaalezzi, have you done the forecasting? I am in need of this part.
How do we get forecast values in LSTM as we did in ARIMA? In the ARIMA graph you can see the train, test, and prediction parts, which ran from 1961 to 1964, but this is not shown for the LSTM.
How do we do that, sir?
It's a perfect explanation. Thanks.
Glad it was helpful!
non-broadcastable output operand with shape (116,1) doesn't match the broadcast shape (116,3)
I am getting this error during the inverse-transform step. Need help, please.
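That error typically means the scaler was fit on 3 columns, but a single predicted column is being inverse-transformed. One common workaround (a sketch, assuming a MinMaxScaler fit on 3 columns with the target in column 0; the data here is a random placeholder) is to pad the predictions back to the scaler's width before calling inverse_transform:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Scaler fit on 3 columns, e.g. [target, feature_1, feature_2].
data = np.random.rand(116, 3) * 100.0
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(data)

# Predictions come back with shape (116, 1); inverse_transform would
# raise the non-broadcastable error on that shape.
pred_scaled = scaled[:, 0:1]

# Pad with zeros to the scaler's width, inverse-transform, keep column 0.
padded = np.zeros((pred_scaled.shape[0], scaled.shape[1]))
padded[:, 0] = pred_scaled[:, 0]
pred = scaler.inverse_transform(padded)[:, 0]

print(pred.shape)                      # (116,)
print(np.allclose(pred, data[:, 0]))   # True: column 0 recovered exactly
```

A cleaner alternative is to fit a separate scaler on the target column alone, so its inverse_transform accepts (n, 1) directly.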
Thank you for this. The videos are helping me a lot. Just want to make sure of something: I think you mixed up the time steps and features in trainX; should it not be (89, 5, 1) instead of (89, 1, 5), since we have 5 time steps?
I reshaped trainX to (89, 5, 1) and used:
model.add(tfkl.LSTM(64, input_shape=(5, 1)))
It worked well; I used different data, though.
Please, if we want to predict future values, how does that work?
One question: looking at how the data has been reshaped, it seems like the input shape is samples × features × time_steps; I don't understand why it is written as samples × time_steps × features in the code comments.
The video is really helpful though!! Thanks.
You can't scale the whole dataset at the beginning. Scale the training set, then apply that fit to the test set.
Please help:
How do I reshape my input if I have 2 variables as input?
I'm trying model.add(LSTM(64, input_dim=(SEQ_SIZE,2))) but it doesn't work.
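In Keras, input_dim is not the right argument for this; the layer expects input_shape=(time_steps, features), and the data itself must be shaped (samples, SEQ_SIZE, 2). A NumPy sketch of building such windows (the sine/cosine series are placeholders, and the model line is left as a comment since it needs TensorFlow):

```python
import numpy as np

SEQ_SIZE = 5
n = 50
a = np.sin(np.linspace(0, 6, n))   # variable 1
b = np.cos(np.linspace(0, 6, n))   # variable 2
series = np.column_stack([a, b])   # shape (n, 2): two features per step

X, y = [], []
for i in range(len(series) - SEQ_SIZE):
    X.append(series[i:i + SEQ_SIZE, :])   # window covering both variables
    y.append(series[i + SEQ_SIZE, 0])     # predict variable 1 one step ahead
X, y = np.array(X), np.array(y)

print(X.shape)   # (45, 5, 2): samples, time steps, features
# model.add(LSTM(64, input_shape=(SEQ_SIZE, 2)))   # not input_dim
```

The key point is that the feature count lives in the last axis of the 3-D input, not in a separate argument.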
Your video is just amazing!
Glad you think so!
Thank you, Sreeni, for this excellent video on LSTM implementation. I have a doubt regarding LSTM capability: is it possible to train one LSTM network (with one or more hidden layers) to learn different time-sequence patterns? If yes, could you please demonstrate with sample code?
Hi Sreeni, thanks a lot for your great channel! I have a question: is it possible to provide an LSTM with multiple time series (multiple factors) for prediction?
Yes, you can use multiple features (variables) to train an LSTM network to predict an outcome. The solution is not easy to type as a reply, but I will see if I can add it to my list of future videos.
@@DigitalSreeni Thank you!
Sir... ❤ Thank you... God bless you
Does seq_size define the time steps in time series forecasting?
I have some issues while implementing it, sir. Can you help me please?
Thanks, it's very useful
Great video, Sreeni. Can you please share the code for predicting the next 100 values?
Hello @ZeeshanZafarzams706, have you done the forecasting? I am in need of this part. Thanks.
Thank you for the amazing tutorial!
You're very welcome!
I cannot understand the point of "input_shape=(None, seq_size)". It should describe the sizes in all dimensions. The first is None because it is the length of an array, which may vary unpredictably. And the dimension of the passengers data array is 1. I can't understand how we push 5 look-back values into the inputs of the 64 LSTM units.
I am facing a similar issue: when plotting
testY[:100] & testPredict[:100]
the predicted plot is shifted ahead by one time step. But it is solved by applying the idea below:
testY[:100] & testPredict[1:101]
@DigitalSreeni, can you please help!
Hello, have you done the forecasting ?
Nice explanation
Thanks
Good day
Have you found the shift problem in the code yet? How can I fix the shift on the graph in the code?
It is not a problem; it is real. Initially I too thought it was a shift issue, based on how systematic it looked.
Sir, any idea about time series forecasting of images?
I’ve done videos on tracking objects in time series images. Please check them out.
@@DigitalSreeni Fine, sir. One more doubt: is it possible to forecast with a temporal series of satellite images in Python or R?
Thank you for the great tutorial
Why shouldn't one use a proportional measure of error? I mean, much as the CoV (coefficient of variation) is much easier to understand than the standard deviation, why not relate the error to the original data? For instance, instead of summing the squared differences (Vobs - Vcalc), why not use the squared differences ((Vobs - Vcalc)/Vobs)? That way one would have a more meaningful measure with which to evaluate the model fit.
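A relative metric along exactly these lines already exists: MAPE (mean absolute percentage error), with the caveat that it blows up when the observed values approach zero. A minimal sketch (the obs/pred arrays are made-up illustration values):

```python
import numpy as np

def mape(observed, predicted):
    """Mean absolute percentage error; undefined where observed == 0."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((observed - predicted) / observed)) * 100.0

obs = np.array([100.0, 200.0, 400.0])
pred = np.array([110.0, 190.0, 420.0])
print(mape(obs, pred))   # mean of 10%, 5%, 5% -> about 6.67
```

The division-by-observed is why RMSE is still the default training loss: MAPE is great for reporting, but it is unstable as an objective on series that touch zero.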
thanks again... to the point.
You are welcome.
this was really helpful thank you!
Thanks!
Thank you very much Sumeet, very kind of you. बहुत बहुत धन्यवाद!
23:20 didn't left
Thank you
You're welcome
Thanks! But how do I compute an accuracy measure based on RMSE? For example, in your case the RMSE is 29.48 for the test score, so what is the accuracy of the model in %? And in your case val_acc is 0.023; does this mean your model has an accuracy of 2.3%? Please help me, I am confused!
Did you find the answer?
@@kalkidanmulatu9433 Not yet.
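On the thread above: "accuracy" is a classification metric, so Keras's val_acc is not meaningful for a regression model; a val_acc of 0.023 does not mean 2.3% accuracy. A percentage-style feel for a regression RMSE usually comes from expressing it relative to the scale of the data, as in this sketch (the obs/pred values are made up for illustration):

```python
import numpy as np

obs = np.array([400.0, 420.0, 450.0, 430.0])
pred = np.array([410.0, 400.0, 455.0, 435.0])

rmse = np.sqrt(np.mean((obs - pred) ** 2))
# Express RMSE relative to the mean observation: e.g., an RMSE of 29.48
# on data averaging ~400 passengers is roughly a 7% typical error.
relative_error = rmse / np.mean(obs) * 100.0
print(rmse, relative_error)
```

There is no single official "accuracy %" for regression; RMSE-over-mean and MAPE are just two common ways to make the error interpretable.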