Hi Greg, awesome video! I noticed you're using the data from the test set to make predictions on the test set. I believe this will be impractical when it comes to real-world deployment of such a model, because the assessment would have been wrong. In the real world, we would want our model to make predictions using data from the past (train set), not from the future (val or test set), as we would not have the future values yet. Can you provide insights on how that would play out when calling the model.predict() function? Thanks Greg, and I look forward to your response.
Thank you for your video, it is very expressive. Once the LSTM model has been trained, how can we predict data for future years (e.g. 2018-2030)? I am looking forward to your answer. Best regards, Clint
Wait, if these slices don't start at 60001 and 65001, does that cause data leakage?
X_train1, y_train1 = X1[:60000], y1[:60000]
X_val1, y_val1 = X1[60000:65000], y1[60000:65000]
X_test1, y_test1 = X1[65000:], y1[65000:]
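On the slicing question: Python slices are half-open, so a slice ending at 60000 stops at index 59999 and the next one can start at 60000 without sharing a row. A quick check, using a made-up array as a stand-in for X1 (the size 70000 is only an assumption for illustration):

```python
import numpy as np

# Hypothetical stand-in for the windowed array X1 from the video.
X1 = np.arange(70000)

train = X1[:60000]        # indices 0 .. 59999
val = X1[60000:65000]     # indices 60000 .. 64999
test = X1[65000:]         # indices 65000 .. 69999

# Each split starts exactly where the previous one ended.
assert train[-1] + 1 == val[0]
assert val[-1] + 1 == test[0]
# No sample is dropped or counted twice.
assert len(train) + len(val) + len(test) == len(X1)
```

Note this only shows that the samples do not overlap. Windows near a boundary are still built from some of the same raw rows, which is a separate question.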
Great video - this is exactly the level of detail I’ve been looking for! Quick question: why do you set the window size to be 1 + number of variables? Is it bad to have a smaller window size than the number of variables? Thank you
Great video. Very informative. You mentioned doing one predicting stock prices. It would be good to see if the human emotional influence of stock prices could be modeled. Seems very difficult. Have you already done one? Thanks
Great, great content. This is my personal introduction to tensorflow, and I really want to thank you because the explanation is crystal clear. I really appreciated the detail and also how you are gradually putting more stuff in modeling, so it's easy to follow along. I have just one question: I can't understand why (at about 45:00) when going through df_to_X_y3(df, window_size=7) you are expecting two output values for the target variable y. Shouldn't we have expected a single output from the 6 variables (now that df['p (mbar)'] was added into our p_temp_df dataframe), with window_size=7 (the size of the sliding window for creating sequences)? Thank you very much!
25:08 I understand why you turn the Seconds column (ever-increasing values) into something periodic, since the weather is also periodic, BUT will this really be beneficial, since the periodic pattern of the temp probably doesn't line up with the periodic pattern of the sin/cos values? Edit: am I right in answering my own question with: "the temp_df[seconds] * 2pi/year input to sin/cos takes care of this concern"?
Thanks Greg... Wonderful explanation. Really helped in understanding the forecasting logic. Just one question. What if, in the last model, we need to forecast Temp and Pressure together for, say, the next 10 observations? What changes do we need to make to handle a multi-step, multivariable forecast problem? Thanks Once Again !!!! ✨✨✨
Hello, ty for your content! :-) I have a question: you used the X2_train for the mean and std at 38:56 is it because you want the predictions later on to have the same basis when preprocessing?
Hey Greg, Good job, great service to the community! What is your experience wrt prediction ability if you combine two outputs (make the neural network predict temp and pressure)? I am thinking that during the learning process the weights will try to minimize the loss of both outputs. Then, for example, some weights may be better just for temperature but not for pressure, so what would the algorithm do? I'd appreciate insights or a reference in this context, as in the past I have tried to predict two outputs from a system, but I trained two different networks. Cheers,
@@GregHogg Thank you for your prompt response. Normalizing would help give both outputs an even influence. Still, I think during the learning process not every weight adjustment would improve the fit of both outputs. Will try to find a paper discussing this topic. Cheers,
Wow this is exactly what I was looking for! Can i ask something, how to predict for the next timestep? let say 24 hour (for hours timeseries) or 30 days? and can you give 95% prediction interval for the prediction.. thank you
Great videos. But I have been waiting for the continuation of future prediction of multivariate LSTM. Will you continue this video to run future predictions?
Thanks for your videos, massive work!! I have a question: what happens if, instead of having just one time series of temperatures, you have many temperature time series for many countries? How do you do the split? Do you put 'country' and 'date' as indexes? And what happens if you don't have the same length for each country?
Hi Greg. Thanks for the useful video. Let me ask you a question. Can we use a Transformer model (Attention Is All You Need) for this dataset? As you know, Transformers receive the whole data sequence at once!
You are defining the preprocess3 function sir, but you are not using it anywhere. Also, can I use this model for stock market predictions, where the data doesn't follow a particular periodic pattern?
Hello Greg, Hope you are doing well. Question: how can we design a machine learning/NN model for time series forecasting when it is a multivariate forecast and, at the same time, some of our independent variables are regressors? Thanks
Hi Greg! I wanna ask about the coding, so I copy and paste your code and run the function "plot_predictions1(model4, X2_test, y2_test)" but why I got the normalized values for X2 but I got the real values for the y2? please tell me if i missed something in your great video! thanks!
and i wanna ask about the preprocessing function too, why we only use the mean and the std from the x train when we use the function for x val and x test? we should be using the mean and std from x val for x val and from x test for x test right? or I missed something again? thanks!
How do we do future prediction, sir? Should we append the predicted value to the window and then send that data to predict the next value? Should we repeat the same until the time frame we want to predict in the future?
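The append-and-repeat idea described here is usually called recursive (or iterated) forecasting. A minimal sketch, assuming a single-feature window and a one-step model; `recursive_forecast`, `predict_one`, and `toy_model` are hypothetical names for illustration, not anything from the video:

```python
import numpy as np

def recursive_forecast(predict_one, window, n_future):
    """Roll a one-step model forward n_future steps.

    predict_one: callable taking a 1-D window and returning the next value
                 (hypothetical stand-in for a trained model).
    window:      the most recent observed values, oldest first.
    """
    window = np.asarray(window, dtype=float).copy()
    forecasts = []
    for _ in range(n_future):
        next_value = float(predict_one(window))
        forecasts.append(next_value)
        # Drop the oldest value and append the prediction as if it were observed.
        window = np.append(window[1:], next_value)
    return np.array(forecasts)

# Toy "model": last value plus the last observed change.
def toy_model(w):
    return w[-1] + (w[-1] - w[-2])

future = recursive_forecast(toy_model, [1.0, 2.0, 3.0], n_future=3)
```

With a Keras model, `predict_one` would have to reshape the window to whatever input shape the model was trained on before calling `model.predict`. Because each prediction is fed back in as input, errors tend to compound the further out you go.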
Thanks for the great explanation Greg. I was wondering why for model4 we scaled only the X_test and X_train data (and did not scale the y_test y_train data) before we fed it into the model?
Hi Greg, thanks a lot! I learn a lot. Can you please help me to understand why did you use Standardization for preprocessing of temp data instead of minmaxscaler ?
I watched till the end of the video. My question is: if I want to train an LSTM on data from different stocks, what kind of preprocessing should I do? Should I standardise each stock's data (subtract the mean and divide by the standard deviation, as done in the video) and then combine them all for training (first stock's data, then the second stock's data underneath)?
Great video Greg! For the preprocessing and postprocessing, I was wondering why did not we do it at the start on temp_df itself before we split it up in [train, val, test]. In that case, it would be much easier to keep track of it, right?
I think that's because this would represent a form of data leakage, i.e. you would be using data from your validation and test sets (mean and std) to train your model.
Very interesting. How can I get at the dataframe that plot_predictions builds? This dataframe will be part of another dashboard in Power BI, and start and end will be dynamic.
Hi, useful and clear video on multivariate time series. Could you please clarify whether the following use case is feasible with time series data? Predicting student performance using past semester data for multiple students in a single model. We need to convert multiple students' time series histories to a supervised format for binary classification. Is this possible? Thanks in advance.
The label column can be dropped by modifying row as below:
def df_to_X_y2(df, window_size=6):
    df_as_np = df.to_numpy()
    X = []
    y = []
    for i in range(len(df_as_np)-window_size):
        row = [r for r in df_as_np[i:i+window_size][:,1:6]]
        X.append(row)
        label = df_as_np[i+window_size][0]
        y.append(label)
    return np.array(X), np.array(y)
Hello Greg, Thanks a lot for your videos and sharing your knowledge. One question if you allow me. Once we have the model trained, how to add new data (same kind of data) and train just the new data without have to train the whole history and predict using the already trained model + new data? does it make sense to do? thanks a lot!!
You have a couple options that make sense to me. 1. You don't, which is probably the easiest. 2. You train on all or some of the old data and the new data. 3. You train on just the new data using a very low learning rate
Thank you. This is very informative and clearly explained. Very nice trick with the periodic sin/cos feature construction. Do you possibly plan to explain more about the different feature construction tricks? I'm using a fit_transform from sklearn for data preprocessing (which is virtually the same as your preprocessing functions, I guess). The question is - couldn't it be easier to apply preprocessing just once to the entire dataset instead of test/train/validation subsets each time for ease of inverse transformation for prediction?
You're very welcome. In the future, this is planned. Fit transform will be fine for temperature and pressure for sure. You could probably preprocess the whole dataset, yes.
No. The validation and test sets are supposed to be unseen data, so when preprocessing you should only have access to the training data. You should always FIT the scaler (min/max, standardization, etc.) on the training data and then only TRANSFORM the validation/test data. Never fit your scaler on the validation or test set, as this leaks the "unseen" validation and test data into training and biases your model.
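A small sketch of the fit-on-train, transform-everything pattern described in this reply, written with plain NumPy so the statistics are explicit. The series, split points, and function names are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical temperature-like series, split chronologically.
series = rng.normal(loc=10.0, scale=5.0, size=1000)
train, val, test = series[:700], series[700:850], series[850:]

# "Fit": the statistics come from the training split only.
train_mean = train.mean()
train_std = train.std()

def standardize(x):
    # "Transform": reuse the training statistics for every split.
    return (x - train_mean) / train_std

def unstandardize(z):
    # Inverse transform, e.g. to bring scaled predictions back to real units.
    return z * train_std + train_mean

train_z, val_z, test_z = standardize(train), standardize(val), standardize(test)
```

Only the training split ends up with exactly zero mean and unit variance; the validation and test splits will be close but not exact, which is expected. Keeping `train_mean` and `train_std` around is also what makes the inverse transform on predictions straightforward.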
@@cassiusvlok6916 You may be correct as i have read this in a few papers as well. Now guys pls i need help. I am working on a multivariate timeseries with LSTM, i have 28 variables as input. Input to LSTM is a 3D as expected, i am however having trouble doing inverse_transform on my predictions.
Also liked the sin/cos feature construction trick, but I have a question. Why convert to a sine wave as opposed to having a value between 0 and 1, where 0 would be the start of the day/year and 1 would be the end?
Hello, I'm working on a dataset with 14 features and 3 outputs. I want to apply df_to_X_y3 to my data. What should I change in this function? Thank you
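One way to see the difference: with a 0-to-1 value, 23:59 and 00:01 land at opposite ends of the scale even though they are two minutes apart, while the sin/cos pair keeps them next to each other. A sketch for the daily cycle (function names are illustrative only):

```python
import numpy as np

SECONDS_PER_DAY = 24 * 60 * 60

def linear_feature(seconds):
    # Fraction of the day elapsed, in [0, 1).
    return (seconds % SECONDS_PER_DAY) / SECONDS_PER_DAY

def cyclic_feature(seconds):
    # Point on the unit circle for this time of day.
    angle = 2 * np.pi * seconds / SECONDS_PER_DAY
    return np.array([np.sin(angle), np.cos(angle)])

just_before_midnight = SECONDS_PER_DAY - 60   # 23:59
just_after_midnight = 60                      # 00:01

# Distance between the two times under each encoding.
linear_gap = abs(linear_feature(just_before_midnight)
                 - linear_feature(just_after_midnight))
cyclic_gap = np.linalg.norm(cyclic_feature(just_before_midnight)
                            - cyclic_feature(just_after_midnight))
```

Here `linear_gap` comes out close to 1 (the maximum possible) while `cyclic_gap` is under 0.01. Using both sin and cos matters too: either one alone maps two different times of day to the same value.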
I offer 1 on 1 tutoring for Data Structures & Algos, and Analytics / ML! Book a free consultation here: calendly.com/greghogg/30min
Hey Greg how you doing? I just wanted to know if you still offer 1 on 1 tutoring for DSA?
I just wanted to let you know that I went through many LSTM tutorials over the Internet and yours is the one that got me understanding it all. Thank you! And let me give you a compliment you probably didn't get yet: the way you talk is very easy to understand for a non-native english speaker like me!
I'm really happy to hear that Felipe! Thanks for such a nice comment :) :)
Greg thanks for sticking with this even to the point of how tired you were getting at the end of it. You are a gifted lecturer.
If I were to add anything it would be more about the need for and the technique of pre-processing the data.
Hahaha thanks so much! Yeah some of these ML tutorials are kinda crazy lol. And yes great point :)
After scouring the whole of YouTube, I finally landed on a great video that helps me get the idea of how to implement RNNs. I am trying to tackle a problem in my side project that involves RNNs and this video is exactly what I needed. Thanks for the lovely explanation, Greg!
Oh I'm so glad to hear that!! I hope your project is going okay!!
@@GregHogg Yes, still to get the data in a sequential format. But man, you helped out a TON!!
Best dl time series tutorial by far.. well done Greg!
Great to hear, thanks a ton!
Yeah hands down excellent content. I was getting lost trying to understand what approach to take and this wrapped everything up very nicely where I can branch off and test other methods. Highly appreciate these videos!
Greg, you really explain so nicely. I finally landed on a solution to the problems in my project. Thank you so much for providing this knowledge for free. God bless you, your family and your YouTube channel. Lots of love from India.
Thank you so much Sanjeev, I appreciate that a ton and am really glad to hear I've been helpful to you 😊
This is cool, but I'm not really that impressed. The agreement looks impressive at first glance in the comparison plot, but what we are plotting is our estimate of the *next* temperature based on a recent historical window of data. At each new time step, the model is given the actual values from the previous steps. In other words, the model is always predicting only one time step into the future. A more helpful evaluation would be to plot the change from the last temperature to the current one, and compare that change between the actual and the estimated values. Another way to see why I'm skeptical is to consider a control experiment using a much simpler estimate of the temperature: a linear extrapolation from the last two temperatures. If we plotted the results the same way, I suspect my linear extrapolation would also look extremely good. Really, the valuable question isn't just whether the estimate at the next time is close to the actual value, but whether the change in temperature from the previous time step to the current one is predicted accurately: does this difference have the same sign in the prediction and the actual values? Does it have a similar magnitude?
I love how you filter out extra details and math jargon that is not necessary at this point in time by saying "It really doesn't matter."
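A rough sketch of the kind of control experiment this comment asks for. It swaps in an even simpler baseline (repeat the last observed value, often called persistence) rather than the two-point linear estimate the comment mentions, and adds a check of whether the predicted change has the right sign. All numbers and function names below are made up for illustration:

```python
import numpy as np

def mae(a, b):
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

def direction_accuracy(actual, predicted):
    """Share of steps where predicted and actual change have the same sign.

    actual[t] and predicted[t] both refer to time t; each change is measured
    against the last observed value actual[t-1].
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    actual_change = actual[1:] - actual[:-1]
    predicted_change = predicted[1:] - actual[:-1]
    return float(np.mean(np.sign(actual_change) == np.sign(predicted_change)))

# Made-up values standing in for real temperatures and model output.
actual = np.array([10.0, 10.5, 11.0, 10.8, 10.2])
model_pred = np.array([10.1, 10.4, 10.9, 11.1, 10.5])

# Persistence baseline: the prediction for time t is the value at t-1.
persistence_pred = np.concatenate(([actual[0]], actual[:-1]))

model_mae = mae(actual[1:], model_pred[1:])
baseline_mae = mae(actual[1:], persistence_pred[1:])
model_direction = direction_accuracy(actual, model_pred)
```

If the model's error is not clearly below the baseline's on real data, the nice-looking overlay plot is mostly showing that consecutive temperatures are similar.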
Thank you so much! This video is by far the best one to show how to fit multivariate LSTM.
You're very welcome, and I'm super glad to hear that!
Thank you Greg for this awesome video. To think that I have watched this video several times and I just got my "aha moment". I am currently working on a project on Forex prediction using LSTM and this will help me a lot. Thank you again.
It's definitely a confusing topic for sure. Well done and great job!! You're super welcome, and good luck on your project!!
Hello @@GregHogg, I trust you are doing great? I am working on multivariate time series forecasting and I am getting the same value for all my predictions. Do you know why this happens and how I can resolve it? Thank you
Please continue the series , amazing explanation
Fantastic video, Greg! I found your video to be highly educational. However, I would like to seek clarification on a particular point. It appears that you're utilizing the present temperature as an input in order to predict the current temperature. Is my understanding accurate? If this is indeed the case, I wonder if such an approach might be considered less sophisticated. Perhaps the model's performance could be enhanced if the input values were temporally shifted and not directly included as features.
Additionally, it seems to me that you might be utilizing the prior prediction outcome as an input to forecast subsequent outcomes. This is just my interpretation, and I'm open to correction if I've misunderstood your methodology.
Just coming back to reference some stuff. great vid!
btw if you're dealing with large datasets your df_to_X_y2 function might be a bit slow. This should speed it up significantly and give the same result.
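"""
import numpy as np

def split_data(data, n_steps):
    # data: 2-D array; last column is the target, the rest are features
    x, y = list(), list()
    for i in range(len(data)):
        end_ix = i + n_steps
        # stop once end_ix goes past the last row index
        if end_ix > len(data)-1:
            break
        # features from the window; target from the window's last row
        seq_x, seq_y = data[i:end_ix, :-1], data[end_ix-1, -1]
        x.append(seq_x)
        y.append(seq_y)
    return np.array(x), np.array(y)
"""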
takes in a numpy array as data. if you're dealing with a dataframe just input df.values, n_steps is the window size. this is more for a classification task as it uses the last column of the input as the target (y=target) and thus wont include the target in the output x array. can tweak as you'd need ofc.
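If the Python loop itself becomes the bottleneck, the same windows can be built without a loop using NumPy's `sliding_window_view` (available in NumPy 1.20 and later). This is only a sketch that mirrors split_data above, including the fact that it stops one window short of the end; `split_data_vectorized` is a hypothetical name:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def split_data_vectorized(data, n_steps):
    """Windowing without a Python loop; mirrors split_data above.

    data: 2-D array whose last column is the target.
    """
    features, target = data[:, :-1], data[:, -1]
    # Shape (n_windows, n_features, n_steps): the window axis comes last.
    windows = sliding_window_view(features, n_steps, axis=0)
    # Reorder to (n_windows, n_steps, n_features).
    x = windows.transpose(0, 2, 1)
    # The label is the target on the last row of each window.
    y = target[n_steps - 1:]
    # split_data stops one window early, so drop the last one to match it.
    return x[:-1].copy(), y[:-1].copy()
```

Usage is the same as before, e.g. `X, y = split_data_vectorized(df.values, 6)`. The `.copy()` calls trade some memory for ordinary writable arrays, since the raw sliding view is read-only.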
Interesting, fascinating detailed explanation. Keep it up!
Thanks so much, I really appreciate that :)
Very simple yet comprehensive. Thank you!
Glad to hear it.
If you run your cell with Shift + Enter, it automatically selects the next cell. If you are at the end, it automatically creates one. Took me months of using Colab before I finally figured that out! Hope this helps. :-)
Haha yeah I think when I made this I also didn't know, thank you :)
Great video Greg! could you please share some ideas on how to adapt the models to forecast several values (hours) ahead, not only just one? thank you.
A bit confused on 2 things.
1) We are trying to predict the temperature for a row of data. In our features, we include that row's temperature too. Isn't that data leakage/cheating since the label is in our features?
2) Let's say that it isn't data leakage, and what we are predicting is the next hour's temperature. Wouldn't our labels then be shifted one row back so the actual is in line with predicted?
When using 1D CNN layers for time series you have to use the parameter padding="causal"; otherwise you will train a model on target t with data from t+1. This is called leakage.
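To make the causal-padding point concrete, here is a tiny hand-rolled 1-D convolution in NumPy (not the Keras layer itself) comparing centered "same" padding with left-only "causal" padding. Whether this matters for a given model depends on how its windows and labels are laid out; if the label lies strictly after the whole input window, every value in the window is already in the past. The function below is a hypothetical illustration:

```python
import numpy as np

def conv1d(series, kernel, padding):
    """Minimal 1-D convolution (cross-correlation) with two padding modes.

    "same" centers the kernel on each step; "causal" pads only on the left,
    so output[t] depends on series[t] and earlier values only.
    """
    k = len(kernel)
    if padding == "causal":
        padded = np.concatenate((np.zeros(k - 1), series))
    elif padding == "same":
        left = (k - 1) // 2
        padded = np.concatenate((np.zeros(left), series, np.zeros(k - 1 - left)))
    else:
        raise ValueError(padding)
    return np.array([padded[t:t + k] @ kernel for t in range(len(series))])

series = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 1.0, 1.0])

# Change only the final value and see which output positions move.
changed = series.copy()
changed[-1] = 100.0

same_moved = conv1d(series, kernel, "same") != conv1d(changed, kernel, "same")
causal_moved = conv1d(series, kernel, "causal") != conv1d(changed, kernel, "causal")
```

With "same" padding the output at position 3 changes when only position 4 was edited, meaning that output saw a future value. With "causal" padding only position 4 itself changes.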
I learn a lot just from this video, thanks mentioning all the details as they are always the confusing parts!
Glad to hear that, you're very welcome!
Great video! It really helped me understand how to use LSTM and CNN models for time series forecasting.
I just have one question: how would you go about forecasting several time steps? Let's say your forecast is at hourly resolution, and you want to forecast the temperature for the next 24 hours (24 values).
Would you:
1) Keep the same model with 1 output and iterate using the predicted values for hour 1 to forecast hour 2 and so on... or would you...
2) Change the model output to be 24 values?
And if you think option 2) is better, how do you construct the feature data? I am guessing the input for the output t1 that is just one hour away shouldn't be the same as for t24?
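For option 2, one common setup is to keep a single input window per sample and make the label a vector holding the next `horizon` values, so t1 and t24 share the same input. A sketch of building such pairs for a single series; `make_multistep_windows` and its parameters are hypothetical names, not from the video:

```python
import numpy as np

def make_multistep_windows(series, window_size, horizon):
    """Build (X, y) where each y row holds the next `horizon` values."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    # Last start index that still leaves room for a full window plus horizon.
    last_start = len(series) - window_size - horizon
    for i in range(last_start + 1):
        X.append(series[i:i + window_size])
        y.append(series[i + window_size:i + window_size + horizon])
    return np.array(X), np.array(y)

# Toy series 0..9 with a 4-step window and a 3-step horizon.
X, y = make_multistep_windows(np.arange(10), window_size=4, horizon=3)
```

Here the first sample is the window [0, 1, 2, 3] with label [4, 5, 6]. The model's final layer would then need `horizon` output units instead of one (24 in the hourly example).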
Thank you so much i was very confused about the multivariate ( multiple features ) to use as input. You cleared my doubts. Thank you
Glad to hear it! :)
@@GregHogg One question: I have a client request. Putting it in terms of your example, my client wants me to use the temp column as one input variable and a scaled copy of the temp column as a second variable, and use both of them to predict temp. I don't think adding a scaled copy of the same data will do any better than just using one of them.
Any suggestions?
This was awesome. I got through both the previous video on LSTM time series and this one using data from Hugging Face. Thanks for the great content
That is awesome video! Thanks for sharing this wonderful material.
Thanks Rafael, I really appreciate that!
Nice work 😊
I love your work!! Let me know if you'd like to get on a call together to chat some time:)
@@GregHogg Definitely! You can hit me up by email or LinkedIn under the name “Ajay Halthor”
(Sorry I thought I typed this out a few days ago. Looks like I hadn’t :) )
Awesome will do! And np haha
Why are we processing the output when using the pressure column too? Why not when we were just using the temperature?
This is the best video I’ve seen on this topic - well done. One question: why didn’t you also standardize your validation data in the example of temp and day/year sin/cos?
It was very helpful. Thank you Greg
Great video Greg!! 1 question: i want to do classification on a numerical data set.. what changes do I make in your code? It is for a pattern recognition kind of project where I have to identify if the same data pattern is repeating itself
thanks in advance
You just have to make the output a sigmoid neuron :)
I am going through a hiring process and a test project related to time series forecasting. Happy that I found your video series on this topic. Thanks!
Great to hear. Best of luck! You're very welcome :)
Hello, Greg! I've watched several videos from you, they're all great and easy to understand!
I am doing a similar project now and there is a question (maybe a silly one): how do you prepare X to really predict future values (let's say to predict temp and pressure from today through the next 7 days) when there is no real temp and pressure data for that period in the X set? I am asking because I think both pressure and temperature are in X now. Let's imagine that the model is already trained and fitted on the train and validation sets and there is no "test set" for today.
In my project, I am trying to predict demand, still very similar :)
I too have same doubt how to predict future
Hello Greg,
I am from Brazil.
Thank you so much for the video; the content is excellent. I have a question I'd like to ask. When I train the model, the loss and mean squared error decrease, but the validation loss and validation mean squared error start to increase. Do you have any idea why this might be happening?
Is there a video of using multiple features for temperature forecasting
Hi Greg. Why do you standardize temperature in this video?
Why didn't you do that in part 1?
Thank you so much
Does this standardization process help increase the accuracy, or is it a rule that we have to follow?
Hi Greg, great video! But I've got a question about your model. Are you predicting temperature based on historical values of temperature alone? Or are you using extra variables to predict the temp, for example pressure? As I understand it, you predict both of them. But do the predictions have any correlation? Is the pressure variable correlated with the model's temperature prediction?
Thank you!
Same question in my mind. Like to know how to do the LSTM with 2 or more features (variables) to get the label (result). Anyone here can help?
marvellous. learnt a lot!
Hi Greg,
nice video and explanation of your code.
One question: Why do you standardize your train, validation and test sets with the same mean and std from the training set?
Thank you! The same preprocessing must be done for every input. It's just something that ML folk usually agree upon.
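A rough sketch of what "the same preprocessing for every input" can look like in practice, assuming windows shaped (samples, window, features) with temperature at feature index 0. The random data and the helper name `standardize` are made up for illustration and are not taken from the video:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical windows: (samples, window_size, n_features), temperature at index 0
X_train = rng.normal(10.0, 8.0, size=(100, 5, 1))
X_val = rng.normal(10.0, 8.0, size=(20, 5, 1))

# The statistics come from the training split only
temp_mean = X_train[:, :, 0].mean()
temp_std = X_train[:, :, 0].std()

def standardize(X, mean, std, feature=0):
    """Return a copy of X with one feature standardized using the given stats."""
    X = X.copy()
    X[:, :, feature] = (X[:, :, feature] - mean) / std
    return X

X_train_s = standardize(X_train, temp_mean, temp_std)
X_val_s = standardize(X_val, temp_mean, temp_std)  # same mean/std, nothing is refit
```

Reusing the training mean and std means a given temperature maps to the same scaled value in every split, and nothing from validation or test leaks into the statistics.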
I can't understand one thing. What we are doing here is predicting rather than forecasting, right? Because let's say we want to forecast the temperature for some date in the future for which we don't have the previous lag values. Do we then have to keep forecasting until we get the values for the target date? Also, how do we incorporate non-datetime features and use them for forecasting?
@Greg Hogg Very interesting video! Super helpful for my current project! Just wanted to understand why we aren't standardizing y_train, y_val and y_test?
Not necessary. The model doesn't know that y_train is one value in the future of X_train. It's just trying to discover the system behavior. As long as the inputs are on the same scale, that's really all the model cares about, and will produce outputs that are on the same scale as the target.
Thank you so much for your this tutorial! God bless you! Regards!
Thank you!!
Thank you first for the great video, very helpful
I have a question :
I developed a temperature control system with a specific set point using Arduino in a small greenhouse and I logged the data to an SD card. Now I want to develop an LSTM model that I can implement as a predictive controller in my greenhouse.
My question is: after training the model and getting accurate predictions, what do I do to convert that model and use it as a controller?
And how do I proceed with the implementation?
Looking forward to hearing from you
Thank you
How can you standardise a data set using the entire data set to calculate the Z value when you are then scrolling through the data, where the future values used to compute the Z value are not yet available? Those values are not available to compute out-of-sample Z values either. How can you then invert out-of-sample results for causal prediction? What would your mu and sigma be for a 1-step-ahead prediction with real data? Would you have to train your model on a sliding Z value using only past data?
Hi Greg, awesome video! I noticed you're using the data from the test set to make predictions on the test set. I believe this will be impractical when it comes to real-world deployment of such a model, because the assessment would have been wrong.
In the real world, we would want our model to make predictions using data from the past (train set), not from the future (val or test set), as we would not have had the future values then.
Can you provide insights on how that would play out when calling the model.predict() function? Thanks Greg, and I look forward to your response.
Invaluable information thanks man!
Thank you and you're super welcome!
Hi, I used your method, but when I look at the plot of actual vs. predictions, it seems the predictions are inverted. Why?
Thank you for your video, it is very expressive. Please, once the LSTM model is trained, how can we predict data for future years (e.g. 2018-2030)?
I am looking forward to your answer
best regards,
Clint
Glad to hear it! You can call it recursively and use its output as new inputs :)
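A minimal sketch of that recursive idea for a single-variable series. The function name `forecast_recursive` and the stand-in `toy_model` are hypothetical; with a trained Keras model you would presumably wrap `model.predict` on a reshaped window instead:

```python
import numpy as np

def forecast_recursive(predict_next, last_window, n_steps):
    """Roll a one-step predictor forward n_steps, feeding predictions back in."""
    window = np.asarray(last_window, dtype=float).copy()
    preds = []
    for _ in range(n_steps):
        y_hat = float(predict_next(window))
        preds.append(y_hat)
        # Drop the oldest value and append the newest prediction
        window = np.append(window[1:], y_hat)
    return np.array(preds)

# Stand-in for a trained model (hypothetical): predicts the mean of the window
def toy_model(window):
    return window.mean()

future = forecast_recursive(toy_model, [1.0, 2.0, 3.0], n_steps=3)
```

Because each step consumes earlier predictions, errors can compound, so long horizons such as 2018-2030 should be read with caution.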
I love u brother. U are a great teacher
I really appreciate that 😍
Shouldn't the mean of all temperatures be something like np.mean(np.concatenate([X2_train[0, :, 0], X2_train[1:, -1, 0]]))?
So that we don't repeat any number in the mean?
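One way to check this on a toy series, assuming windows that slide forward by one step as in the video's df_to_X_y-style functions. The series values and variable names here are made up:

```python
import numpy as np

# Stand-in for the raw temperature column
series = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0])
window_size = 4

# Sliding windows shaped (n_windows, window_size, 1)
X = np.array([series[i:i + window_size] for i in range(len(series) - window_size)])
X = X[:, :, np.newaxis]

# First window in full, then only the newest value of every later window
unique_values = np.concatenate([X[0, :, 0], X[1:, -1, 0]])

mean_with_repeats = X[:, :, 0].mean()
mean_without_repeats = unique_values.mean()
```

Here `unique_values` recovers every value that appears in X exactly once (the final value of the series only ever shows up as a label). On a long series the two means should be very close, since only the values near the edges are counted fewer times.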
I am facing a problem: how do I predict future values with a multivariate LSTM? Please help.
Can you use Stock Prices?
Wait, if these don't start at 60001 and 65001, does it cause data leakage?
X_train1, y_train1 = X1[:60000], y1[:60000]
X_val1, y_val1 = X1[60000:65000], y1[60000:65000]
X_test1, y_test1 = X1[65000:], y1[65000:]
Great video - this is exactly the level of detail I’ve been looking for!
Quick question: why do you set the window size to be 1 + number of variables? Is it bad to have a smaller window size than the number of variables? Thank you
Great to hear!
Did I say I wanted it as 1 + number of variables? Thinking right now, I can't see why these things are related
@GregHogg just so as not to confuse the window size with the number of parameters in the numpy shape
Great video. Very informative. You mentioned doing one predicting stock prices. It would be good to see if the human emotional influence of stock prices could be modeled. Seems very difficult. Have you already done one? Thanks
Great, great content. This is my personal introduction to tensorflow, and I really want to thank you because the explanation is crystal clear. I really appreciated the detail and also how you are gradually putting more stuff in modeling, so it's easy to follow along.
I have just one question: I can't understand why (at about 45:00), when going through df_to_X_y3(df, window_size=7), you are expecting two output values for the target variable y. Shouldn't we have expected a single output from the 6 variables (now that df['p (mbar)'] was added into our p_temp_df dataframe), with window_size=7 (the size of the sliding window for creating sequences)?
Thank you very much!
42:30 - Now I see where the second output comes from: also the pressure is included as the output ;)
Hello, can I split datasets like this? For example, window=5, one row of training data=[[t1], [t2], [t3], [t4],[t5]], target=[t7]. skip T6
25:08 I understand why you turn the Seconds column (ever-increasing values) into something periodic, since the weather is also periodic, BUT will this really be beneficial, since the periodic pattern of the temp probably mismatches the periodic pattern of the sin/cos values?
edit: Am I right in answering my own question with: "the temp_df[seconds] * 2pi/year input in sin/cos takes care of this concern" ?
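A small sketch related to the mismatch concern, under the assumption that the temperature cycle has the same period as the features but a different phase. Since a*sin(x) + b*cos(x) can represent a sinusoid of any phase, a model given both features can in principle line them up with the data. The synthetic temperature below is made up:

```python
import numpy as np

DAY = 24 * 60 * 60
seconds = np.arange(0, 3 * DAY, 3600, dtype=float)  # three days, hourly
angle = 2 * np.pi * seconds / DAY

# Hypothetical temperature whose peak is shifted relative to plain sin/cos
phase_shift = 1.3
temp = 10.0 + 5.0 * np.sin(angle + phase_shift)

# Features: day sin, day cos and a constant term
features = np.column_stack([np.sin(angle), np.cos(angle), np.ones_like(angle)])

# A plain linear fit on these features already recovers the shifted curve
coef, *_ = np.linalg.lstsq(features, temp, rcond=None)
max_error = np.abs(features @ coef - temp).max()
```

Dividing the seconds by the length of a day (or a year) is what sets the period; the sin/cos pair then takes care of the phase.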
Thanks Greg... Wonderful explanation. Really helped in understanding the forecasting logic. Just one question: what if, in the last model, we need to forecast Temp and Pressure together for, say, the next 10 observations? What changes do we need to make to handle a multi-step, multivariable forecasting problem?
Thanks Once Again !!!! ✨✨✨
Great and understandable video. You have a beautiful voice btw
Why thank you
Amazing video! Thank you so much for this awesome guide
Thank you and you're very welcome 😁
@GregHogg Thank you so much for replying to my comment. These guides were extremely helpful. You're awesome! ❤️
Excellent tutorial, thank you very much
No problem!
Hello, ty for your content! :-)
I have a question: you used the X2_train for the mean and std at 38:56 is it because you want the predictions later on to have the same basis when preprocessing?
Do you still do tutoring ? Your calendly link is broken. Thanks.
Hey Greg,
Good job, great service to the community!
What is your experience with prediction ability if you combine two outputs (make the neural network predict temp and pressure)? I am thinking that during the learning process the weights will try to minimize the loss of both outputs. Then, for example, some weights may be better just for temperature but not for pressure. What would the algorithm do?
I would appreciate insights or a reference in this context, as in the past I have tried to predict two outputs from a system, but I trained two different networks.
Cheers,
You need to normalize the output so that it doesn't care. And thanks so much Carlos!
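A sketch of why output scale matters for a shared loss, using made-up temperature and pressure targets. The helper names `scale_targets` and `unscale_targets` are hypothetical, and the "prediction" is just a constant zero to make the per-output errors easy to compare:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical targets: column 0 = temperature (degC), column 1 = pressure (mbar)
y_train = np.column_stack([
    rng.normal(9.0, 8.0, 1000),
    rng.normal(989.0, 8.0, 1000),
])

# Per-output statistics from the training targets
y_mean = y_train.mean(axis=0)
y_std = y_train.std(axis=0)

def scale_targets(y):
    """Standardize each output column with the training statistics."""
    return (y - y_mean) / y_std

def unscale_targets(y_scaled):
    """Map standardized outputs back to the original units."""
    return y_scaled * y_std + y_mean

y_scaled = scale_targets(y_train)
restored = unscale_targets(y_scaled)

# Squared error of an all-zeros prediction, per output
raw_mse = (y_train ** 2).mean(axis=0)      # pressure dominates in raw units
scaled_mse = (y_scaled ** 2).mean(axis=0)  # both outputs contribute equally
```

After scaling, neither output dominates the summed loss purely because of its units. Whether the shared weights then serve both outputs equally well is a separate question that this sketch does not settle.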
@GregHogg Thank you for your prompt response. Normalizing would help even out the influence of both outputs. Still, I think that during the learning process not every weight adjustment would optimize the fit of both outputs. I will try to find a paper discussing this topic.
Cheers,
@betokrt I see no reason why it wouldn't try and equally balance both, but I might be missing something. No problem!
Wow this is exactly what I was looking for!
Can I ask something: how do I predict the next timesteps, let's say 24 hours (for an hourly time series) or 30 days? And can you give a 95% prediction interval for the prediction? Thank you
Great videos. But I have been waiting for the continuation of future prediction of multivariate LSTM. Will you continue this video to run future predictions?
Thanks for your videos, massive work!! I have a question: what happens if, instead of having just one time series of temperatures, you have many temperature time series for many countries? How do you do the split? Do you use 'country' and 'date' as indexes? And what happens if you don't have the same length for each country?
Amazing video! Thanks, it helps a lot. A request: please make a video on the temporal fusion transformer model for time series analysis! Thank you
Hi Greg. Thanks for the useful video. Let me ask you a question. Can we use a Transformer model ("Attention Is All You Need") for this dataset? As you know, Transformers receive the whole data sequence at once!
This has been so helpful and maybe to ask what did you consider to choose the input and output layers
Input has to match your features, which has some flexibility. Output has to be whatever you feel like predicting
Just as an info, ALT+return gives you next blank cell in colab.
Haha thank you!
Brilliant video Greg!
How would you treat months in cyclic terms as days in a month varies over time?
Thank you very much for your good video
It was very, very useful
Great to hear! You're very welcome 🙂
This video is insane!
Does an LSTM work with under 100 monthly data points and 4 variables?
You are defining the preprocess3 function, sir, but you are not using it anywhere. Also, can I use this model for stock market predictions, where the data doesn't follow a particular periodic pattern?
Hello Greg,
Hope you are doing well.
Question: How can we design a machine learning/NN model for time series forecasting in case it's a multivariate forecast and, at the same time, some of our independent variables are regressors?
Thanks
can we also use the other variables that are in the dataset instead of day sin, day cos, etc. ? which one is the better approach
Hi, I am getting NaN in my output variable after using your preprocess code.
Hey great video. Curious why you didn't use StandardScaler before splitting the data into train/test. Thanks.
I believe it will lead to data leakage. The data should be separated before, so that test data have no connection to train data whatsoever :)
Love it!! Thank you!!
You're very welcome :)
thank you for the video!
Thanks for the kind words Ihsan!! You're very welcome!!
Hi Greg! I want to ask about the code. I copied and pasted your code and ran the function "plot_predictions1(model4, X2_test, y2_test)", but why did I get normalized values for X2 and real values for y2? Please tell me if I missed something in your great video! Thanks!
I also want to ask about the preprocessing function: why do we only use the mean and the std from X train when we apply the function to X val and X test? Shouldn't we be using the mean and std from X val for X val and from X test for X test? Or did I miss something again? Thanks!
How do we do future prediction, sir? Should we append the predicted value to the window and then send that data to predict the next value? Should we repeat the same until the time frame we want to predict in the future?
Thanks for the great explanation Greg.
I was wondering why for model4 we scaled only the X_test and X_train data (and did not scale the y_test y_train data) before we fed it into the model?
We only scale the inputs that are fed into the model. The y value is the output we want to predict, so here we do not scale y.
Hi Greg, thanks a lot! I learn a lot. Can you please help me to understand why did you use Standardization for preprocessing of temp data instead of minmaxscaler ?
Great to hear! There's a lot of flexibility and sometimes arbitrary choices on preprocessing, I wouldn't worry too much about it :)
you are a legend 👊
No you are
How can I combine the data of two stocks? I mean, how can I train the model in the correct way so that it predicts well when I switch between stocks?
I watched till the end of the video. My question is: if I want to train an LSTM on different stock market data, what kind of preprocessing should I do? Should I apply standardisation (subtract the mean and divide by the standard deviation, as done in the video) to each stock's data and combine all of them (put the first stock's data, then the second stock's data underneath) when training?
The most simple time series forecasting videos on YT. Why did you use just the training set for mean and std estimation?
Thanks! That's what you do. You want to do the exact same preprocessing to every input
thanks for the great videos man
You're very welcome!
Great video Greg! For the preprocessing and postprocessing, I was wondering why did not we do it at the start on temp_df itself before we split it up in [train, val, test]. In that case, it would be much easier to keep track of it, right?
I think that's because this would represent a form of data leakage, i.e. you would be using data from your validation and test sets (mean and std) to train your model.
Very interesting. How can I view the dataframe from plot_predictions? This dataframe will be part of another dashboard in Power BI, and start and end will be dynamic.
Hi, useful and clear video for multivariate time series. Could you please clarify whether the following use case is feasible using time series data: predicting student performance using past semester data for multiple students in a single model. We need to convert multiple student time series histories to a supervised format for binary classification. Is this possible? Thanks in advance.
You didn't remove the label column from the inputs
Label column can be dropped by modifying row as below:
def df_to_X_y2(df, window_size=6):
    df_as_np = df.to_numpy()
    X = []
    y = []
    for i in range(len(df_as_np)-window_size):
        row = [r for r in df_as_np[i:i+window_size][:,1:6]]
        X.append(row)
        label = df_as_np[i+window_size][0]
        y.append(label)
    return np.array(X), np.array(y)
Hello Greg, Thanks a lot for your videos and sharing your knowledge. One question if you allow me. Once we have the model trained, how to add new data (same kind of data) and train just the new data without have to train the whole history and predict using the already trained model + new data? does it make sense to do? thanks a lot!!
You have a couple options that make sense to me. 1. You don't, which is probably the easiest. 2. You train on all or some of the old data and the new data. 3. You train on just the new data using a very low learning rate
AWESOME VIDEO !!!!!!🤩🤩🤩
Brilliant tutorial. Thanks so much for putting it together. Any plans for implementing this in PyTorch?
Thank you! I recently posted something very similar in pytorch. Text follows this pattern, so I'll probably do something similar with nlp.
Thanks for the awesome video. A function preprocess_output3 would take the mean and std of the y's, wouldn't it?
Thank you. This is very informative and clearly explained.
Very nice trick with the periodic sin/cos feature construction.
Do you possibly plan to explain more about the different feature construction tricks?
I'm using a fit_transform from sklearn for data preprocessing (which is virtually the same as your preprocessing functions, I guess).
The question is - couldn't it be easier to apply preprocessing just once to the entire dataset instead of test/train/validation subsets each time for ease of inverse transformation for prediction?
You're very welcome. In the future, this is planned. Fit transform will be fine for temperature and pressure for sure. You could probably preprocess the whole dataset, yes.
No, the validation and test sets are supposed to be unseen data. Therefore, when preprocessing the data you should only have access to the training data: always FIT the scaler (min/max, standardization, etc.) on the training data and then only use TRANSFORM on the validation/test datasets.
Never fit your scaler on the validation or test set, as this will result in the validation and test set ("unseen") data leaking into your training data and having a bias in your model.
@cassiusvlok6916 You may be correct, as I have read this in a few papers as well. Now guys, please, I need help. I am working on a multivariate time series with an LSTM and I have 28 variables as input. The input to the LSTM is 3D, as expected, but I am having trouble doing inverse_transform on my predictions.
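One possible cause is that the scaler was fit on all 28 columns while the predictions only contain the target columns, so the shapes no longer match. A sketch of undoing the scaling for just those columns, using plain NumPy statistics. The data, `target_cols` and `inverse_targets` are all hypothetical; with scikit-learn's StandardScaler the equivalent per-column values should be available as `mean_` and `scale_`:

```python
import numpy as np

rng = np.random.default_rng(2)
n_features = 28
# Hypothetical 2D training table: (rows, features)
train = rng.normal(50.0, 10.0, size=(500, n_features))

feat_mean = train.mean(axis=0)  # one mean per column
feat_std = train.std(axis=0)    # one std per column

target_cols = [0, 3]            # hypothetical indices of the predicted columns

def inverse_targets(pred_scaled, cols):
    """Undo standardization for predictions that cover only the columns in cols."""
    return pred_scaled * feat_std[cols] + feat_mean[cols]

# Pretend these came from the model: shape (samples, len(target_cols))
pred_scaled = (train[:5][:, target_cols] - feat_mean[target_cols]) / feat_std[target_cols]
pred = inverse_targets(pred_scaled, target_cols)
```

The idea is simply to index the per-column statistics with the same columns the model outputs, rather than passing a narrower array back through a transform that expects all 28.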
Also liked the sin/cos feature construction trick, but I have a question. Why convert to a sine wave as opposed to having a value between 0 and 1, where 0 would be the start of the day/year and 1 would be the end?
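One likely reason, sketched with two timestamps on either side of midnight: a 0-to-1 fraction jumps from about 1 back to 0 at the boundary, while the sin/cos pair stays continuous. The helper name `cyclic_features` is made up for this example:

```python
import numpy as np

DAY = 60 * 60 * 24  # seconds in a day

def cyclic_features(seconds, period):
    """Map a timestamp in seconds onto the unit circle for the given period."""
    angle = 2 * np.pi * np.asarray(seconds, dtype=float) / period
    return np.sin(angle), np.cos(angle)

# 23:00 and 01:00 the next day are only two hours apart
t_late, t_early = 23 * 3600, 25 * 3600
sin_l, cos_l = cyclic_features(t_late, DAY)
sin_e, cos_e = cyclic_features(t_early, DAY)

# Distance between the two points on the circle stays small across midnight
circle_gap = np.hypot(sin_l - sin_e, cos_l - cos_e)

# A raw "fraction of the day" feature makes the same two times look far apart
fraction_gap = abs((t_late % DAY) / DAY - (t_early % DAY) / DAY)
```

With the fraction encoding, the model sees 23:00 and 01:00 as nearly opposite ends of the range, even though they are close in time.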
Hello
I'm working on a dataset with 14 features and 3 outputs. I want to apply df_to_X_y3 to my data. What should I change in this function?
thank you
up