Stock Price Prediction & Forecasting with LSTM Neural Networks in Python

  • Published on 25 Mar 2022
  • Thank you for watching the video! Here is the Colab Notebook: colab.research.google.com/dri...
    I offer 1 on 1 tutoring for Data Structures & Algos, and Analytics / ML! Book a free consultation here: calendly.com/greghogg/30min
    Learn Python, SQL, & Data Science for free at mlnow.ai/ :)
    Subscribe if you enjoyed the video!
    Best Courses for Analytics:
    ---------------------------------------------------------------------------------------------------------
    + IBM Data Science (Python): bit.ly/3Rn00ZA
    + Google Analytics (R): bit.ly/3cPikLQ
    + SQL Basics: bit.ly/3Bd9nFu
    Best Courses for Programming:
    ---------------------------------------------------------------------------------------------------------
    + Data Science in R: bit.ly/3RhvfFp
    + Python for Everybody: bit.ly/3ARQ1Ei
    + Data Structures & Algorithms: bit.ly/3CYR6wR
    Best Courses for Machine Learning:
    ---------------------------------------------------------------------------------------------------------
    + Math Prerequisites: bit.ly/3ASUtTi
    + Machine Learning: bit.ly/3d1QATT
    + Deep Learning: bit.ly/3KPfint
    + ML Ops: bit.ly/3AWRrxE
    Best Courses for Statistics:
    ---------------------------------------------------------------------------------------------------------
    + Introduction to Statistics: bit.ly/3QkEgvM
    + Statistics with Python: bit.ly/3BfwejF
    + Statistics with R: bit.ly/3QkicBJ
    Best Courses for Big Data:
    ---------------------------------------------------------------------------------------------------------
    + Google Cloud Data Engineering: bit.ly/3RjHJw6
    + AWS Data Science: bit.ly/3TKnoBS
    + Big Data Specialization: bit.ly/3ANqSut
    More Courses:
    ---------------------------------------------------------------------------------------------------------
    + Tableau: bit.ly/3q966AN
    + Excel: bit.ly/3RBxind
    + Computer Vision: bit.ly/3esxVS5
    + Natural Language Processing: bit.ly/3edXAgW
    + IBM Dev Ops: bit.ly/3RlVKt2
    + IBM Full Stack Cloud: bit.ly/3x0pOm6
    + Object Oriented Programming (Java): bit.ly/3Bfjn0K
    + TensorFlow Advanced Techniques: bit.ly/3BePQV2
    + TensorFlow Data and Deployment: bit.ly/3BbC5Xb
    + Generative Adversarial Networks / GANs (PyTorch): bit.ly/3RHQiRj

Comments • 264

  • @GregHogg
    @GregHogg  6 months ago +3

    I offer 1 on 1 tutoring for Data Structures & Algos, and Analytics / ML! Book a free consultation here: calendly.com/greghogg/30min

    • @user-mz2fd1dr9g
      @user-mz2fd1dr9g 4 months ago

      Please provide the link for that big window code tutorial. Thanks.

    • @Santhosh_sandi
      @Santhosh_sandi 3 months ago +1

      @user-mz2fd1dr9g Go through his notebook, which is the first link in the description.

    • @elbala777
      @elbala777 3 months ago

      Thank you so much for this video. It taught me LSTM and how to use Python.

    • @RI-zn3ju
      @RI-zn3ju 2 months ago +1

      I have an LSTM model that I think is good, but I want it to run multiple stock tickers at the same time to save time and give me all their individual results. I am having a tough time figuring out how to do this and am constantly getting errors. Can you help?

    • @godspowerbunor2638
      @godspowerbunor2638 5 days ago

      Hello, I am interested. Can I contact you?

  • @OtRatsaphong
    @OtRatsaphong 2 years ago +2

    Great video, Greg. Best one I've seen on how to implement LSTM for stock price prediction. I'm going to see if I can replicate this LSTM. Thank you 👍🙏

    • @GregHogg
      @GregHogg  2 years ago

      Great to hear! Perfect!!

  • @37gippo
    @37gippo 1 year ago +1

    You're a legend! I was looking for ages for an explanation like this one.

    • @GregHogg
      @GregHogg  1 year ago

      Glad I could help!

  • @jtrobotics5421
    @jtrobotics5421 1 year ago +14

    The value at the end is constant because there's a big logic error in your for loop: you always set the window back to x_test[-1] before the prediction, so the LSTM outputs a constant value.
    Nonetheless, if you do it correctly (I just tried it), the prediction will of course diverge from the stock price anyway (it'll keep going in the direction of the first prediction).

    • @benjamintenbuuren9652
      @benjamintenbuuren9652 11 days ago

      Can you show an example of how you changed it? I also noticed that it is incorrect but have no idea how to correct it.

  • @cesarfierro4792
    @cesarfierro4792 9 months ago +2

    Great video, it's really well explained. I was expecting the recursive predictions to work a little better, since I will apply this to a school project more oriented to failure detection, but I understand that's a hard task. It's still a great base to start with. Thanks so much.

  • @Nedwin
    @Nedwin 2 years ago

    Yay, 2nd comment!! My Saturday night is now complete with this video! Thanks Greg.

    • @GregHogg
      @GregHogg  2 years ago +1

      You're very welcome and have a great night!

  • @timseed4489
    @timseed4489 2 years ago +2

    One of the best explanations on the subject. I would do some of the pandas stuff slightly differently, but overall 10/10.

    • @GregHogg
      @GregHogg  2 years ago

      Yeah, I would too looking back. But thanks - great to hear!

  • @luisaurso
    @luisaurso 4 months ago

    Hey Greg, thanks for sharing your knowledge. Amazing video and very well explained. Following you onwards

  • @mrrolandlawrence
    @mrrolandlawrence 1 year ago

    Wow, amazing. For a tiny mortal without a PhD in maths, I actually understood everything you said. Amazing work.

    • @GregHogg
      @GregHogg  1 year ago +1

      Glad to hear it :)

  • @TheRafler
    @TheRafler 2 years ago +41

    I believe your final recurrent forecasting is wrong. last_window should be initialized outside the loop, and then on each iteration you should do last_window = np.roll(last_window, -1) and update last_window[-1], so that you have a continuously moving window. In your case you only update the last value while the rest remain fixed, and that is why your prediction remains flat.
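
    A minimal sketch of the moving-window idea described above, using only the Python standard library. `toy_model` is a hypothetical stand-in for `model.predict`, and a `collections.deque` with `maxlen` plays the role that `np.roll` plus an update of the last element would play on a NumPy array; neither is taken from the notebook.

```python
from collections import deque


def toy_model(window):
    """Hypothetical stand-in for model.predict: last value plus half the last step."""
    return window[-1] + 0.5 * (window[-1] - window[-2])


def recursive_forecast(predict, seed_window, steps):
    # Initialise the window once, outside the loop.
    window = deque(seed_window, maxlen=len(seed_window))
    predictions = []
    for _ in range(steps):
        next_value = predict(list(window))
        predictions.append(next_value)
        # Appending to a full deque drops the oldest value, so the window slides.
        window.append(next_value)
    return predictions


def buggy_forecast(predict, seed_window, steps):
    predictions = []
    for _ in range(steps):
        window = list(seed_window)  # reset on every pass: this is the bug
        next_value = predict(window)
        predictions.append(next_value)
        window[-1] = next_value  # thrown away when the loop restarts
    return predictions


seed = [1.0, 2.0, 4.0]
fixed = recursive_forecast(toy_model, seed, 4)
flat = buggy_forecast(toy_model, seed, 4)
print(fixed)
print(flat)
```

    With the window reset on every pass, all four predictions are identical; with a sliding window they change at each step.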

    • @clintonletsoela1479
      @clintonletsoela1479 1 year ago

      @TheRafler can you please paste the code, I tried using np.roll as you suggested but the prediction is still flat.

    • @shengchuangfeng227
      @shengchuangfeng227 1 year ago +3

      @clintonletsoela1479
      I tried something similar, and it soon converged. But you can see that for the first few loops the predicted values are different. In the video, the predicted values are always the same, meaning the last_window variable is not updated.
      Here is my code:
      from copy import deepcopy
      import numpy as np

      recursive_predictions = []
      recursive_dates = np.concatenate([dates_val, dates_test])
      # Start from the last training window, copied once outside the loop
      last_window = deepcopy(X_train[-1])
      for target_date in recursive_dates:
          # Predict from the three most recent values only
          next_prediction = model.predict(np.array([last_window[-3:]])).flatten()
          recursive_predictions.append(next_prediction)
          # Grow the history with the new prediction
          last_window = np.concatenate((last_window, [next_prediction]))
          # print(last_window)
      print(recursive_predictions)

    • @obedrolland3259
      @obedrolland3259 8 months ago

      @shengchuangfeng227 You can do even better by writing last_window = np.concatenate([last_window[-2:], [next_prediction]]). Then you no longer need to specify last_window[-3:] in your model.predict() call.

    • @obedrolland3259
      @obedrolland3259 8 months ago +2

      We don't need the np.roll() function. He should just change the last_window expression in the loop to:
      last_window = np.concatenate([last_window[-2:], [next_prediction]])

  • @tariqmahmood2734
    @tariqmahmood2734 1 year ago +3

    Hi Greg, it's a nice video to learn from. Can you explain what changes we should make to forecast values after dates_test, meaning for a few days or weeks after the end date (2022-03-23)?

  • @peralser
    @peralser 1 year ago

    Amazing video. You were so clear to explain! Thanks!

    • @GregHogg
      @GregHogg  1 year ago +1

      Great to hear!

  • @arsheyajain7055
    @arsheyajain7055 2 years ago +1

    Was waiting for this one!!

    • @GregHogg
      @GregHogg  2 years ago +1

      Great to hear!

  • @quanxu1
    @quanxu1 1 year ago +6

    Awesome video 👍 The flat line from the recursive prediction is most likely caused by a glitch in the code: "last_window = deepcopy..." should be moved up and outside of the loop. After that's fixed, my guess is that the recursive predictions will converge going forward into an almost flat line.

    • @Algardraug
      @Algardraug 1 year ago

      Yeah, I noticed that too. Glad I'm not the only one! The loop overwrites last_window at the start, so predicting on the same data in every iteration should reasonably result in the same value.
      Also, setting last_window[-1] = next_prediction on every iteration seems wrong. Realistically, you would want to shift the window first; otherwise you'd be stuck with the same first two values the whole time. It should probably be something like this:
      last_window = last_window[1:]
      last_window.append(next_prediction)
      (I'm not very good at Python)

    • @YoungMoneyInvestments
      @YoungMoneyInvestments 9 months ago

      Glad someone else caught this.

    • @idrisseahamadiabdallah7669
      @idrisseahamadiabdallah7669 8 months ago

      @YoungMoneyInvestments, have you tried this? I want to do forecasting, that is, predict beyond the test_dates that we already have.

    • @YoungMoneyInvestments
      @YoungMoneyInvestments 8 months ago

      @idrisseahamadiabdallah7669 I have, and I haven't been able to get anything over 60% on shorter timeframes. I'm using 15-minute candles, with about 80k candles of data. It has been harder than I anticipated. I've tried 3 variations of LSTMs, 3 different random forests, and a perceptron, and I'm not yet at the point where I'm pleased with the results.

  • @senna_william
    @senna_william 2 years ago +1

    The final catch was fantastic, thanks for the lesson!

    • @GregHogg
      @GregHogg  2 years ago +1

      Yeah I feel like adding that in there is very useful to know haha

    • @senna_william
      @senna_william 2 years ago

      At first I thought "I do not believe that he will be able to predict the market only with this information", a good lesson, congratulations!

    • @GregHogg
      @GregHogg  2 years ago +1

      @senna_william Yeah! :)

  • @zeroj7492
    @zeroj7492 1 year ago

    really nice tutorial and thank you so much for making it brother

  • @rileyclubb
    @rileyclubb 2 years ago +4

    I'm trying to adapt your code to hourly and minute time series. I think I need to modify the complicated section that you mention at 6:40. Can you help me find that tutorial to understand this section? I don't see it in the video description above.

  • @gayanc6193
    @gayanc6193 1 year ago

    Great content, it solved one of my major problems. Many thanks❣

    • @GregHogg
      @GregHogg  1 year ago

      Super glad to hear that!

  • @ivancostabernardo7670
    @ivancostabernardo7670 2 years ago

    Thanks for the brilliant work!

    • @GregHogg
      @GregHogg  2 years ago +1

      Thanks so much for the kind words Ivan!! :)

  • @MrLam-lx7td
    @MrLam-lx7td 1 year ago

    Thanks so much! Very good project.

    • @GregHogg
      @GregHogg  1 year ago

      You're very welcome

  • @ryanmilgrim6427
    @ryanmilgrim6427 1 year ago +7

    I know little about machine learning, but I think you could fix the extrapolation problem by making predictions on the stock's log return rather than its price. With a pandas DataFrame it should be a line like np.log(1 + df.pct_change()). The goal is to get the data down to a stationary time series, so modeling returns should get you closer, but the data is not quite symmetric, which is why you also take the log. Of course, your output then becomes a series of log returns, which you have to convert back to prices by compounding from the last known price, with a line like last_price * np.exp(df.cumsum()).
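
    To illustrate the log-return idea, here is a small standard-library sketch (no pandas). The helper names `to_log_returns` and `from_log_returns` and the sample prices are made up for illustration. It assumes the usual definition r_t = ln(p_t / p_{t-1}), which equals ln(1 + pct_change), and rebuilds prices by multiplying the starting price by the exponential of the cumulative sum of returns.

```python
import math


def to_log_returns(prices):
    """Log return between consecutive prices: ln(p_t / p_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def from_log_returns(start_price, log_returns):
    """Rebuild the price path: p_t = p_0 * exp(sum of returns up to t)."""
    prices = [start_price]
    running_total = 0.0
    for r in log_returns:
        running_total += r
        prices.append(start_price * math.exp(running_total))
    return prices


prices = [100.0, 110.0, 99.0, 120.0]
returns = to_log_returns(prices)
rebuilt = from_log_returns(prices[0], returns)
print(returns)
print(rebuilt)
```

    A model trained on `returns` would forecast the next return, and the same inverse step turns a forecast return series back into prices.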

    • @kilocesar
      @kilocesar 5 months ago

      I thought the same

  • @folashadeolaitan6222
    @folashadeolaitan6222 2 years ago +5

    Hi Greg, thank you for another awesome video. I am working on a multivariate LSTM and I have scaled my data (containing both the input and output variables) as in the code below.
    scaler = StandardScaler()
    scaler = scaler.fit(df_train)
    df_train_scaled = scaler.transform(df_train)
    Now, when I do inverse_transform as below in order to compare my predicted values with the actual ones, I get an error.
    prediction_copies = np.repeat(prediction, df_train.shape[1], axis=-1)
    # This repeats the prediction once for each column the scaler was originally fitted on
    y_pred = scaler.inverse_transform(prediction_copies)[:,0]
    # Here I am taking all the rows and just the first column, since the other columns are replications of the first.
    ERROR
    Found array with dim 3. StandardScaler expected
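
    One possible way to avoid calling inverse_transform on repeated copies, sketched with the standard library only: undo the standardization of the target column by hand, using that column's mean and standard deviation. The function names and numbers are hypothetical. A fitted scikit-learn StandardScaler stores the per-column values in its `mean_` and `scale_` attributes, so the same arithmetic there would be `pred * scaler.scale_[0] + scaler.mean_[0]`, assuming the target is column 0 as in the comment above.

```python
import statistics


def fit_column(values):
    """Mean and population standard deviation, the statistics StandardScaler uses."""
    return statistics.fmean(values), statistics.pstdev(values)


def scale(values, mean, std):
    return [(v - mean) / std for v in values]


def unscale(values, mean, std):
    """Invert the standardization for one column only."""
    return [v * std + mean for v in values]


close = [10.0, 12.0, 14.0, 16.0]
mean, std = fit_column(close)
scaled_close = scale(close, mean, std)

# Pretend these came out of the model in scaled units (made-up numbers)
scaled_predictions = [0.5, -0.5]
restored_predictions = unscale(scaled_predictions, mean, std)
print(restored_predictions)
```

    Because only the target column's statistics are used, the predictions can stay a flat list of values and no padding to the original column count is needed.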

    • @gauravnarkhede3622
      @gauravnarkhede3622 1 year ago +1

      I am also working on LSTM

    • @folashadeolaitan6222
      @folashadeolaitan6222 1 year ago +1

      @gauravnarkhede3622 Nice one, Gaurav. How is it going with you?

    • @gauravnarkhede3622
      @gauravnarkhede3622 1 year ago

      @folashadeolaitan6222 Good, I need some company to work with.

    • @jpaulexo
      @jpaulexo 8 months ago

      Hi @folashadeolaitan6222, did you find the error?

    • @jakob4371
      @jakob4371 4 months ago

      It's an extremely basic one, so it should be resolved by now.

  • @carrocesta
    @carrocesta 1 year ago

    Greg, you are the man! greetings from Spain :)

    • @GregHogg
      @GregHogg  1 year ago +1

      Thanks so much! I'd love to visit Spain some day

  • @alexchirapozu3234
    @alexchirapozu3234 2 years ago +9

    Hi Greg, I believe that you made a mistake in the recursive prediction. If you print your windows, you'll see that you predict with the same 3 values every time, instead of changing them at every iteration. That is why your plot is completely flat, which it shouldn't be. Here is my correction (just for the last 14 days), I think it's OK:
    recursive_predictions = []
    recursive_dates = dates_test[-14:]
    last_window = X_test[-14]
    for target_date in recursive_dates:
        print(last_window)
        next_prediction = model.predict(np.array([last_window])).flatten()
        recursive_predictions.append(next_prediction)
        # Drop the oldest value and append the new prediction
        new_window = list(last_window[1:])
        new_window.append(next_prediction)
        new_window = np.array(new_window)
        last_window = new_window
    If you predict with it, the result still doesn't fit the actual values, but at least it's not a straight line hehehe

    • @DevilErnest
      @DevilErnest 1 year ago +3

      from copy import deepcopy
      recursive_predictions = []
      recursive_dates = np.concatenate([dates_val, dates_test])
      last_window = deepcopy(X_train[-1])
      for target_date in recursive_dates:
          next_prediction = model.predict(np.array([last_window])).flatten()
          recursive_predictions.append(next_prediction)
          last_window = np.append(last_window[1:], next_prediction)

    • @arpadikuma
      @arpadikuma 1 year ago +1

      @DevilErnest This one almost worked; the last line needs to be
      last_window = np.concatenate([last_window[1:], [next_prediction]])
      otherwise it errored out on me.
      But yeah, this is how last_window actually gets updated with the last three predictions on each iteration.
      Still, the result is almost a straight line... too bad :]

  • @britox.6216
    @britox.6216 1 year ago

    still hate how you left us on a cliffhanger greg! please make a part two!

  • @yan200go
    @yan200go 11 months ago

    This actually explained things that I didn't understand from other people's explanations.

    • @GregHogg
      @GregHogg  11 months ago

      Glad to hear it!

  • @ivandozoretz3031
    @ivandozoretz3031 1 year ago +4

    Great video! I have just one doubt: if I want to put the model in 'production' and predict prices for the next month, I won't have the previous 3 prices for most of those dates (in that production dataset). So, is it right to include those features in the training, validation and test sets?

    • @otaviocoutinho2855
      @otaviocoutinho2855 1 year ago

      I have the exact same doubt. Shame nobody answered. With this idea we could only predict the next day. What you can do is the same thing he did for validation: predict recursively over a larger range of days. The performance will drop quickly, but it is probably good enough for a few days ahead, which I find interesting despite being limited.

  • @sandhusukhdeep
    @sandhusukhdeep 2 years ago

    that is one excellent video to setup a basic LSTM model! great work and thank you

    • @GregHogg
      @GregHogg  2 years ago

      Awesome! Thank you, and you're very welcome!

  • @1990lietuva
    @1990lietuva 9 months ago +2

    Hi, I might be wrong here, but I think one of your issues with the recursive prediction is that you always replace the last element. That means it tries to predict from the same values with only the last one changed, which might be why it predicts the same thing. I would change that part to append the prediction to the end and pop the first value out.

    • @LawrenceReitan
      @LawrenceReitan 7 months ago

      Which is pretty much the same copy-paste mistake all these wannabe stock predictors make.
      Appreciate the effort in confusing thousands of people though...

  • @JilN660
    @JilN660 8 months ago +1

    Hi thanks a lot for this video! How would you move from forecast to a Signal 1/0/-1 column to be applied to the price variation? I have issues with potential future leakage and the condition to use to create the signal. 😊

  • @user-qy6sj1zv1l
    @user-qy6sj1zv1l 1 year ago +9

    I'm gonna make a "course" about machine learning, but I actually don't know how it works, so I'm gonna copy-paste this function and voilà, working like a charm. Way to go

  • @manuelnovella39
    @manuelnovella39 1 year ago +1

    Man, at 6:40 you refer to a tutorial in the video description that explains the function df_to_windowed_df. What link is that, exactly? Can't find it

  • @liornisimov9367
    @liornisimov9367 1 year ago +20

    Hi, great video!
    I have written a more efficient function to window the data using the pandas "shift" method.
    Feel free to use it!
    import pandas as pd

    def window_data(data, n=3):
        # Build Target-n ... Target-1 columns from lagged closing prices
        windowed_data = pd.DataFrame()
        for i in range(n, 0, -1):
            windowed_data[f'Target-{i}'] = data['Close'].shift(i)
        windowed_data['Target'] = data['Close']
        # Drop the first n rows, which lack a full window
        return windowed_data.dropna()

    • @GregHogg
      @GregHogg  1 year ago +2

      Thanks so much, well done!

    • @Adriannotadrien33
      @Adriannotadrien33 10 months ago +1

      This doesn't have an integer index so the resultant shape of X is wrong

    • @MrDaniloDj
      @MrDaniloDj 6 months ago

      @GregHogg Pin this!

    • @majilarohit4
      @majilarohit4 6 months ago

      @liornisimov9367 You should start a channel. Thanks:)

  • @anatoliyzavdoveev4252
    @anatoliyzavdoveev4252 1 year ago

    Super tutorial 💪 great code💪💪🤝

    • @GregHogg
      @GregHogg  1 year ago

      Thanks so much and great to hear!!

  • @anmolpreet8959
    @anmolpreet8959 2 years ago

    Congrats on Hitting 22k Subs.

    • @GregHogg
      @GregHogg  2 years ago

      Aw thanks Anmol, and nice dab!!

  • @rohanaryan2231
    @rohanaryan2231 6 months ago

    Hey!
    Love the way you deliver your content, sir! I learnt a lot from you. Just a quick question: can we use a direct type conversion to convert the date column (pd.to_datetime or astype(datetime64))? If yes, why didn't you use it instead of creating a function for the conversion? How is that different from doing it directly as I mentioned above?
    Thanks!! Good day!!

  • @konstantintomilin1826
    @konstantintomilin1826 1 year ago

    Would be interesting to see how inclusion of other features like trading volume and maybe some common technical analysis indicators could affect prediction!

    • @fluctura
      @fluctura 1 year ago

      I'll do this and keep you posted

    • @dnas5629
      @dnas5629 1 year ago

      I have backtested many of the technical indicators and found they are broken clocks. Not that they are not useful to help filter data, but for predictions they are not ideal. The oscillators help in sideways markets, but not in trending ones. Detecting trends before they happen is tough. As inputs to predictive models, I do not think they will help. But definitely try. TA-Lib is the best library for that.

  • @heribertoquintanilla743
    @heribertoquintanilla743 1 year ago +1

    Thanks a lot Greg, very interesting. One question: once you have your model, how do you add new data without retraining the whole model, that is, train on just the new data and then predict over the whole dataset? Does that make sense? Thanks a lot!

    • @andrewlichte6734
      @andrewlichte6734 7 months ago

      I think at some point you have to retrain your model. It's like how Greg threw out all the old data because Microsoft went parabolic in more recent years and the model didn't know how to cope with that. It's like budgeting gas money for a trip using gas prices from 5 years ago. You're just not gonna get where you want to go.

  • @RafaelRivetti
    @RafaelRivetti 1 day ago

    Hi, Greg! In the MLP network, data from independent variables from date t are used to predict a future value t+n. In the LSTM network, instead of using only data from time t of the independent variables, it uses data from time t, t-1, t-2, ..., t-n as desired by the programmer, and after that, generates the prediction for a future time t+n? Is this reasoning correct? Thank you very much!

  • @tiagobrito9765
    @tiagobrito9765 2 years ago +4

    Another great video!
    I have a doubt:
    My date column is in the format day/month/year hour:minutes.
    What should I change in the str_to_datetime def function to show hours and minutes?
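
    A hedged sketch for this date format, assuming strings such as "25/03/2022 14:30". `str_to_datetime_with_time` is a hypothetical replacement for the notebook's `str_to_datetime`; it relies on `datetime.strptime` from the standard library.

```python
from datetime import datetime


def str_to_datetime_with_time(text):
    """Parse 'day/month/year hour:minutes' into a datetime object."""
    return datetime.strptime(text, "%d/%m/%Y %H:%M")


parsed = str_to_datetime_with_time("25/03/2022 14:30")
print(parsed)
```

    If the column uses a different separator or includes seconds, only the format string needs to change.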

  • @alrey72
    @alrey72 1 year ago +2

    Nice, very detailed. I just have a few questions:
    1. I suppose Sequential means Time Series. If that's the case, is there still a need to create 3 columns for the last 3 closing prices? My understanding of Time Series is that it automatically looks at the order of values (the output of the last record is the input of the current record).
    2. I believe LSTM is an optimization technique, so that if there are many layers, the vanishing gradient problem will not be encountered. I'm also confused since you mentioned Adam (I'm not familiar with it) as the optimizer, so might it be in conflict with LSTM?
    3. You mentioned LSTM(64) means 64 nodes and Dense(32) means 32 layers... so it's just 2 nodes per layer?
    Thanks.

    • @lorryzou9367
      @lorryzou9367 1 year ago

      the last 3 closing prices are our input features, and the closing price for today is the output/prediction value.

    • @alrey72
      @alrey72 1 year ago

      @lorryzou9367 Yes, that's the intention. However, the neural network used is a time series model, so the last 3 closing prices are redundant. If a feed-forward neural network were used, then inputting the last 3 closing prices would be logical.

  • @crazychutties3766
    @crazychutties3766 2 years ago

    Great Stuff... When I'm redoing for a shorter tenure (for last 1 year), the Training Predictions is coming up as a straight line. Where am I going wrong pls?

  • @ntepelaletsoela812
    @ntepelaletsoela812 1 year ago

    Danko means thank you🤟🙏

  • @rushilsingh
    @rushilsingh 1 month ago

    00:00 Forecasting Microsoft stock using LSTM neural networks.
    03:56 Convert date column to datetime objects and make it the index of the dataframe
    07:44 Convert the given data frame into numpy arrays for input and output
    11:28 Performing univariate forecasting on closing value over time
    15:06 Create and train a sequential model with LSTM and dense layers
    18:35 The LSTM model for predicting data shows poor performance on validation and test sets.
    22:00 Training LSTM model on a subset of data to improve prediction accuracy.
    25:19 The model recursively predicts future values based on the available data.

  • @bloodio4237
    @bloodio4237 1 year ago

    Hi 👋👋
    I am looking to create a model that predicts token bonding curves. To do that, in addition to predicting the token price, I need to predict the supply.
    The majority of YouTube videos talk about the price. I am having trouble finding tutorials that explain how to predict token supply and which features would be best to use.
    Any help, please!!

  • @antoniosaltoborsodicarminati
    @antoniosaltoborsodicarminati 2 years ago

    You just made my project for bitcoin prediction in college, thanks man

    • @GregHogg
      @GregHogg  2 years ago +1

      Sounds great!

  • @luisaurso
    @luisaurso 4 months ago

    Greg, if I understood correctly, I can include more variables (multivariate) to make predictions (e.g. predict ice cream sales using units per day + temperature). For that I would need to include these 2 variables in the dataset and reshape the features (X) accordingly, but I got confused about how to window the data.

  • @user-cz2dl7yg6f
    @user-cz2dl7yg6f 7 months ago

    Where do I find the video on windowing the data frame? The size doesn't fit my dataframe. How do I change the dates according to the size of the window?

  • @eshaangupta8637
    @eshaangupta8637 2 years ago

    Great Video

    • @GregHogg
      @GregHogg  2 years ago

      Thanks Eshaan

  • @moekhaled
    @moekhaled 1 year ago

    Thank you. Where is the windowed_df code explanation? I could not find it in the description.

  • @DevilErnest
    @DevilErnest 1 year ago +1

    Hi Greg, great video and I learnt a lot here. May I ask whether detrending and deseasonalisation (if trend and seasonality exist) are needed in time series modelling? There are resources out there which suggest these should be done and that the series must be stable before feeding it to the model, while others (like yours) don't really care about trends and seasonality and let the model do its work. My guess is there's no right or wrong answer, but what's the reasoning behind these decisions in time series modeling using machine learning algos? Thanks.

    • @GregHogg
      @GregHogg  1 year ago +1

      There's always multiple ways of doing things. In theory you try everything and do what works best

    • @fluctura
      @fluctura 1 year ago

      majjjggic😂

  • @fkeb37e9w0
    @fkeb37e9w0 2 years ago +2

    Hi Greg, I have a doubt: for unseen data (which contains only the date), how can I send that data into our LSTM model and predict the next 5 days, when I have no input data apart from the date? Please help.

    • @pixusru
      @pixusru 2 years ago +1

      Your input for LSTM is not a date, but prices for previous N days before now.

    • @efhndz7990
      @efhndz7990 2 years ago

      I have the same question. What can I do if I want to know the values for the next few days?

  • @cbassett123
    @cbassett123 1 year ago +1

    6:40 "If you need to know what that code is, check out the tutorial"...
    Can you please add the tutorial to the description? Thank you!!

  • @srishtisdesignerstudio8317
    @srishtisdesignerstudio8317 2 months ago

    very helpful

    • @GregHogg
      @GregHogg  2 months ago

      Thank you!

  • @ffinalttrip
    @ffinalttrip 2 years ago

    Amazing work Greg!!👏 Do you know any good community to share and talk about projects like this one? discord or something...🧐

    • @GregHogg
      @GregHogg  2 years ago

      Thank you!! LinkedIn and Facebook have some very large groups :)

  • @runpinghuang9259
    @runpinghuang9259 1 year ago

    Hi Greg, my MAE is getting pretty huge, do you have any idea how to fix it? And wondering why you can achieve the low MAE without standardization, thanks!

  • @arfasobirhanu4801
    @arfasobirhanu4801 1 year ago

    Where can I get the description function for df_to_windowed_df function, sir?

  • @DawidMichna
    @DawidMichna 2 years ago

    Great stuff! Why did you set the date as df.index at around minute 5? Was it necessary, and what's the rationale behind it?

    • @GregHogg
      @GregHogg  2 years ago

      See the other comments haha

  • @user-sy9sq8qi5f
    @user-sy9sq8qi5f 1 year ago

    Thank you

    • @GregHogg
      @GregHogg  11 months ago

      Very welcome!

  • @afseeniqbal
    @afseeniqbal 3 months ago

    Hello, I am new to this, so I was wondering if this exact model would work on, say, VTSAX and not just Microsoft, if I were to just import VTSAX data instead of Microsoft data as a CSV in the beginning?

  • @gacctom
    @gacctom 1 year ago

    awesome video

    • @GregHogg
      @GregHogg  1 year ago

      Thanks so much!!

  • @prashant8762
    @prashant8762 1 year ago

    Hello Greg, great video again as usual. Can you advise whether we can have one LSTM neural network for predicting the stock prices of multiple companies, say 1000 companies?

    • @jarrodmautz159
      @jarrodmautz159 1 year ago

      Yes. Compile each stock's data into a key-value pair of a dictionary: the key holds the stock's name, and the value holds the data frame that contains the dates and prices (plus more indicators, etc.). Then turn each one into sequences, and train your LSTM on each stock in your dictionary.
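
      A rough sketch of the dictionary layout described above, in plain Python. The tickers, prices and the `make_windows` helper are invented for illustration, plain lists stand in for the data frames, and the actual LSTM training step is left out.

```python
def make_windows(prices, n=3):
    """Pair each price with the n prices before it: ([p1, p2, p3], p4), ..."""
    return [(prices[i - n:i], prices[i]) for i in range(n, len(prices))]


# Key: ticker name. Value: that ticker's closing prices (made-up numbers).
price_history = {
    "AAA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "BBB": [10.0, 9.0, 8.0, 7.0],
}

windows_by_ticker = {
    ticker: make_windows(prices) for ticker, prices in price_history.items()
}

for ticker, windows in windows_by_ticker.items():
    # A real pipeline would fit or update the model on `windows` here
    print(ticker, len(windows), windows[0])
```

      Whether to train one shared model across all tickers or one model per ticker is a separate design choice that this sketch does not settle.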

    • @prashant8762
      @prashant8762 1 year ago

      @jarrodmautz159 That's a very impressive explanation. Do you have any sample notebook (if you are free to share) on a similar use case, or any article link? I am really looking for a sample notebook. Thanks for the reply, it helps :-)

  • @onlystudy4645
    @onlystudy4645 1 month ago +1

    At 11:41, why do we have to convert it to a 3-D matrix? Can't we train the LSTM on a 2-D X matrix?
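
    For context: Keras recurrent layers such as LSTM expect input of shape (samples, timesteps, features), so a 2-D table of windows needs a trailing feature axis of size 1 for a univariate series. The plain-Python sketch below shows that reshaping on nested lists; `to_lstm_shape` and `shape_of` are hypothetical helpers, and in the notebook this would normally be a NumPy reshape.

```python
def to_lstm_shape(rows):
    """Add a trailing feature axis: (samples, timesteps) -> (samples, timesteps, 1)."""
    return [[[value] for value in row] for row in rows]


def shape_of(nested):
    """Return the shape of a regular nested list, e.g. (2, 3, 1)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


windows_2d = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
windows_3d = to_lstm_shape(windows_2d)
print(shape_of(windows_2d), "->", shape_of(windows_3d))
```

    With more than one variable per day (for example close and volume), the last axis would hold one entry per variable instead of a single value.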

  • @mahdis-hs6bn
    @mahdis-hs6bn 1 year ago

    Hi
    The reason it couldn't predict correctly was that there was never a rise in the train set like the one in the test set, right?

  • @SyedMoneeb-Ul-hassanHaris
    @SyedMoneeb-Ul-hassanHaris 1 year ago

    My dataset has only Years and the Total. Can someone help me with what the df_to_window code, which comes after the very first matplotlib graph, would look like?
    I have tried changing the code, but it is very complicated.

  • @uveuvenouve7680
    @uveuvenouve7680 1 month ago

    Can someone explain why the prediction is so flat when the training data is big?
    As the video shows, it should be using the 3 previous days' prices to predict the next day's close price.
    Does that mean it always predicts a dramatic drop of a few hundred dollars?
    Is that caused by the LSTM? Would it be better to use the price differences to build this model?

  • @Jipzorowns
    @Jipzorowns 3 months ago

    I don't get the part at 20:25. How is it possible that it gets the beginning of the prediction so right? I just don't see how this is possible. Am I missing something?

  • @JeffPohlmeyer
    @JeffPohlmeyer 2 years ago +4

    I have a concern. It looks like the predicted vs observed values are offset by a day. In other words, when zoomed in the predicted values do seem close to the observed, but they're delayed by a day. From the perspective of actual trading this is very detrimental.

    • @GregHogg
      @GregHogg  2 years ago

      Hmm I'll have to check this out.

    • @chris285as
      @chris285as 3 months ago

      I watched another video on neural network training, and it said that this kind of model is not good for this purpose, as it just predicts the last day's close value plus or minus a little, which is why it looks really close. Their reasoning is that if you take the difference each day, it can't predict it and the error rate is huge. They suggested that it's better to use other techniques for this reason.
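
      One way to check this concern is to compare the model's error against a naive "persistence" forecast that simply repeats the previous close. The sketch below computes that baseline's mean absolute error on made-up prices; the function names are illustrative, and the model's own MAE on the same days would need to be clearly lower for the LSTM to be adding anything.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def persistence_forecast(prices):
    """Forecast each day's close as the previous day's close."""
    return prices[:-1]


closes = [100.0, 102.0, 101.0, 105.0, 107.0]  # made-up closing prices
actual = closes[1:]                 # days 2..5
naive = persistence_forecast(closes)  # previous close for each of those days
baseline_mae = mean_absolute_error(actual, naive)
print(baseline_mae)
```

      A prediction curve that looks like the actual curve shifted by one day will score about the same as this baseline.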

    • @chris285as
      @chris285as 3 months ago

      For me the jury is still out. I would do lots of testing and your own research.
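
A quick way to test the one-day-lag concern raised in this thread is to compare the model against a naive "tomorrow equals today" baseline. If the model's error is not clearly below the baseline's, it has probably learned to echo the last close. A sketch with invented prices; `model_pred` is a hypothetical stand-in, not output from the video's model.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between two 1-D arrays."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Invented closing prices, for illustration only.
close = np.array([100.0, 102.0, 101.0, 105.0, 110.0, 108.0])

y_true = close[1:]        # what actually happened on each day
naive_pred = close[:-1]   # persistence baseline: predict yesterday's close

# Hypothetical model output that mostly copies the previous close.
model_pred = naive_pred + 0.1

naive_mae = mae(y_true, naive_pred)
model_mae = mae(y_true, model_pred)
```

Here the stand-in model beats the baseline by a negligible margin, which is the pattern to watch for on real predictions.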

  • @jeraldgooch6438
    @jeraldgooch6438 2 years ago

    Greg, thanks for this. Two questions:
    1. Will adding in more target values (e.g. go to Target-10) help?
    2. What would shuffling the windowed data set do? At least some of the data in the training set would be after 2016 then.
    Thanks

    • @GregHogg
      @GregHogg  2 years ago

      Changing the window may help. Not sure what you mean by shuffling.

    • @jeraldgooch6438
      @jeraldgooch6438 2 years ago

      @@GregHogg sklearn.utils.shuffle is what I mean, I think. In your windowed dataframe, the 2nd row might end up as the 100th row, the 3rd as the 250th, the 1000th might end up as the second, and so on. Obviously I do not have enough of the vocabulary to truly express myself, and typing on an iPad is a bit clunky!

    • @jeraldgooch6438
      @jeraldgooch6438 2 years ago

      @@GregHogg After shuffling, the only time-related data would be within each sample. The shuffling would destroy any time-related dependency between samples. This might not be desirable with the LSTM layer? Also, what might be the effect of just using % change on a daily basis, rather than the absolute value of the stock? And this is where you get to say "Why don't you go try it out and let me know what you find"?

    • @AIwithHossein
      @AIwithHossein 2 years ago

      I've done it before. It depends on future forecasting and some other parameters. There is no clear relationship between past data and the accuracy of results. Generally, you should test different scenarios based on your project and your data.
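
On the shuffling question in this thread, one common arrangement is to split chronologically first and shuffle only the training rows afterwards, so no later date ends up in training while each window keeps its internal time order. A NumPy sketch under that assumption; the arrays are invented and the integer `dates` stand in for real dates.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 hypothetical windows in date order; integers stand in for dates.
dates = np.arange(10)
X = np.arange(30, dtype=float).reshape(10, 3, 1)
y = np.arange(10, dtype=float)

# Chronological split first: earliest 80% for training, the rest held out.
split = int(len(dates) * 0.8)
X_train, y_train, dates_train = X[:split], y[:split], dates[:split]
X_test, y_test, dates_test = X[split:], y[split:], dates[split:]

# Shuffle only the training rows; each window keeps its own internal order.
perm = rng.permutation(len(X_train))
X_train, y_train, dates_train = X_train[perm], y_train[perm], dates_train[perm]
```

Shuffling the whole windowed dataframe before splitting would instead mix later dates into the training set, which is the effect described in the question.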

  • @ismaelbastos4097
    @ismaelbastos4097 2 years ago

    Very good video!!! I have a question. When we separate the validation data, because the window size is three, the first row of the validation data will carry the last 3 target values from the last three rows of the training data, the second row will carry two target values from the last two rows of the training data, and so forth. Is this considered data leakage? I mean, I have training-set targets as inputs of the validation set. Is this OK?

    • @GregHogg
      @GregHogg  2 years ago +1

      I see what you're saying, and thanks for the kind words. I wouldn't really consider this a leakage. Formally? Maybe. Practically? Not really.

    • @ismaelbastos4097
      @ismaelbastos4097 2 years ago

      @@GregHogg Thanks for the answer. I really appreciate your work here on YouTube and on LinkedIn. So I have to be careful when choosing the size of the window, right? Because if I choose a larger window, the amount of data that will "leak" will grow.

    • @GregHogg
      @GregHogg  2 years ago +1

      @@ismaelbastos4097 you're very welcome! And yeah, I guess I agree with that. I really wouldn't worry too much though, when you're in a company these things will sort themselves out.

  • @G5Locks
    @G5Locks a year ago +1

    Can you implement this in a Streamlit interface?

  • @yavuzhancakr6592
    @yavuzhancakr6592 10 months ago

    Sometimes, on large datasets, my loss becomes NaN during training (loss: nan). I have normalized the data, but it didn't change anything. What should I do?

  • @ricgondo
    @ricgondo a year ago

    Just Thanks!!!!!

    • @GregHogg
      @GregHogg  a year ago +1

      That's super nice of you, thank you so much!

    • @ricgondo
      @ricgondo a year ago

      @@GregHogg Not at all! Thank you, sir! Jupyter was fine, but now I'm trying to figure out why I can't run it in my Ubuntu terminal lol!

    • @ricgondo
      @ricgondo a year ago

      OMG, nicely working on a terminal, but I will stick with Jupyter. First time using this thing... so much better!

  • @poojithachalla8106
    @poojithachalla8106 a year ago

    @Greg Hogg Please provide the video link for the window function.

  • @___OmerAJ___
    @___OmerAJ___ 8 months ago

    The problem is that experienced AI developers always want to predict the future price, whereas real traders and investors just want to take profitable trades. No one needs to know the exact price, and nothing can predict it anyway.

  • @zabrinasmith5087
    @zabrinasmith5087 2 years ago

    I'm trying to replicate this model with another dataset, but the code for 'df_to_windowed_df' that was pasted at 6:40 doesn't seem to be working. I have mirrored the code using the link provided. Any suggestions?

    • @GregHogg
      @GregHogg  2 years ago

      Sorry, not sure.

    • @21morpha
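      @21morpha 2 years ago

      ```
      import pandas as pd

      def targets_df(df, n):
          # Expects the date as the index and the price as the first column.
          df = df.reset_index()
          names = ['target_date']
          for i in range(1, n+1):
              names.append("target"+str(n-i+1))
          names.append('target')
          new_df = []
          for i in range(n, len(df)):
              # Start each row with the target date ...
              new_df.append([df[df.columns[0]][i]])
              # ... then the n previous values followed by the target itself.
              for j in range(0, n+1):
                  new_df[i-n].append(df[df.columns[1]][i-(n-j)])
          return pd.DataFrame(new_df, columns=names)
      ```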

    • @21morpha
      @21morpha 2 years ago

      I made this targets_df function because I didn't want to just copy what was done by others. It works, but it considers all of the dates. I believe it is not hard to adapt it so you can select the window yourself. Maybe you could just slice the close dataframe before applying the function to it. In my case it worked just fine because my dataset has only about 120 rows. I don't know if it would work well on huge datasets with millions of rows, but I believe it wouldn't, because there is a nested for loop in there.
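
The nested loop could likely be avoided with `Series.shift`, which builds each lag column in one vectorized step. A sketch, assuming a DataFrame with a date index and a single `Close` column; the name `windowed_df` and the `target-k` column labels are illustrative choices, not taken from the video's notebook.

```python
import pandas as pd

def windowed_df(df, n, column="Close"):
    """Build n lag columns plus the target using shift (illustrative helper)."""
    out = pd.DataFrame(index=df.index)
    for lag in range(n, 0, -1):
        # shift(lag) moves values down, so each row sees the value `lag` rows earlier.
        out[f"target-{lag}"] = df[column].shift(lag)
    out["target"] = df[column]
    # The first n rows have no full history, so they drop out as NaN rows.
    return out.dropna().rename_axis("target_date").reset_index()

# Invented prices on consecutive days, for illustration only.
prices = pd.DataFrame(
    {"Close": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=pd.date_range("2022-01-03", periods=5, freq="D"),
)
windowed = windowed_df(prices, n=3)
```

To restrict the date range, slice the result on `target_date` afterwards instead of slicing the prices first, so the earliest kept rows still have their full history.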

  • @Wissam-rk7tv
    @Wissam-rk7tv a year ago

    Thank you for the video, it's amazing. What if we want to forecast stock prices for a different company? Can you make a video about this case?

    • @GregHogg
      @GregHogg  a year ago

      Exact same thing

  • @SakshiDwivedi-p8e
    @SakshiDwivedi-p8e 10 days ago

    My question is: how will one model fit all stocks? Every stock behaves differently depending on news and company results, so will it fit them all?

  • @taruchitgoyal3735
    @taruchitgoyal3735 a year ago +2

    Hello Sir,
    Thank you for the tutorial.
    I get an error when I start using the function df_to_windowed_df().
    The error is:
    Error: Window of size 3 is too large for date 1986-03-12 00:00:00
    None
    I need your help to understand this error and how to resolve it.
    Thank you.

    • @hybriddude007
      @hybriddude007 2 months ago

      If you want more previous days, then take at least two years' worth of data or more. That should work.

  • @sameerulhaq4066
    @sameerulhaq4066 2 months ago

    How can we check the RMSE score?
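
RMSE is the square root of the mean squared difference between observed and predicted values. A minimal NumPy sketch; `y_test` and `test_predictions` are invented here, with the predictions shaped (samples, 1) the way a Keras `model.predict` call typically returns them.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error; ravel() avoids accidental (n,) vs (n, 1) broadcasting."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Invented values, for illustration only.
y_test = np.array([100.0, 102.0, 104.0])
test_predictions = np.array([[101.0], [101.0], [106.0]])  # shape (3, 1)

score = rmse(y_test, test_predictions)
```

The result is in the same units as the price, so it can be read as a typical error in dollars.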

  • @someotherstuffs
    @someotherstuffs 10 months ago +1

    Hi, why did you not include the other columns (open, volume, low, high) in the training dataset?

    • @ouanesachouche6785
      @ouanesachouche6785 3 months ago

      because he actually doesn't know what he's doing

  • @robottalks7312
    @robottalks7312 2 years ago +2

    Will there be any advantage if we scale the closing stock price to between 0 and 1?

    • @GregHogg
      @GregHogg  2 years ago

      You could definitely try that :)
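
For anyone trying the 0-1 scaling, a minimal min-max sketch in NumPy, assuming the minimum and maximum are taken from the training period only so that nothing from later dates influences the transform. The prices and the helper names `scale` and `unscale` are invented for illustration.

```python
import numpy as np

# Invented closing prices, for illustration only.
train_close = np.array([10.0, 20.0, 30.0, 40.0])
test_close = np.array([35.0, 50.0])

# Fit the range on the training data only.
lo, hi = train_close.min(), train_close.max()

def scale(x):
    """Map the training range [lo, hi] onto [0, 1]."""
    return (x - lo) / (hi - lo)

def unscale(x):
    """Invert scale() to get prices back."""
    return x * (hi - lo) + lo

train_scaled = scale(train_close)
test_scaled = scale(test_close)
```

Test prices above the training maximum scale to values greater than 1, which is exactly the situation a rising stock creates, so scaling alone does not remove that problem.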

  • @dananjayaidabagusgede5144
    @dananjayaidabagusgede5144 2 years ago +1

    I copied your code from the description. Why do I get the error "window of size 3 is too large"? What does that mean? Thanks, sir, I hope you answer soon 🙏

    • @SafwanAlselwi
      @SafwanAlselwi a year ago

      I got the same error, but I fixed it by downloading the entire max dataset.

  • @ignessrilians
    @ignessrilians 11 months ago +1

    Amazing guide again. Thank you so much for these REALLY helpful guides. I appreciate it a lot 💯 🙏
    I'm actually doing the same kind of task with autoregressive forecasting (where the next inputs are the previous outputs) for an arbitrary time series, but no matter how hard I try to find the best model, my autoregressive forecasts still end up not accurate at all, and after days and hours of searching I'm still lost...
    It seems kind of impossible to do, but anyway, thanks a lot for the guides again 🙏

    • @GregHogg
      @GregHogg  11 months ago

      You're very welcome! And yeah it's pretty tricky

  • @hulunlanteworku9341
    @hulunlanteworku9341 a year ago

    Awesome video, but how do we know the right window size (3 in this video)? Is there a way to determine that? I think it's the most crucial choice when working with LSTMs.

    • @GregHogg
      @GregHogg  a year ago

      You choose

    • @hulunlanteworku9341
      @hulunlanteworku9341 a year ago

      @@GregHogg Based on what? Or by looking at the data visualization for some pattern that repeats?

  • @marcosagustinmel2294
    @marcosagustinmel2294 a year ago

    Hi Greg,
    Thank you very much for the video! It's excellent!
    I have a question about the windowed_df function.
    What is the purpose of the if statement on df_subset? I'm talking about this part:
    while True:
        df_subset = dataframe.loc[:target_date].tail(n+1)

        if len(df_subset) != n+1:
            print(f'Error: Window of size {n} is too large for date {target_date}')
            return
    I get this error when I pass the function a period longer than about 2 years. For example, 2020-12-01 to 2022-12-4 returns the correct array (with the 3 timesteps), but 2020-05-01 to 2022-12-4 gives the error "Window of size 3 is too large for date 2020-05-01 00:00:00".
    How can I solve it? Why is this check there, and what is it trying to prevent?
    Many thanks!
    Regards
    Marcos
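
Going by the snippet quoted above, the check fires when fewer than n+1 rows exist up to and including the target date, which probably means the chosen start date is before, or within n rows of, the first row in the downloaded data. A sketch of that behaviour on invented data; `earliest_valid_date` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

def earliest_valid_date(dataframe, n):
    """First index value that has n earlier rows before it (hypothetical helper)."""
    if len(dataframe) <= n:
        raise ValueError(f"Need more than {n} rows, got {len(dataframe)}")
    return dataframe.index[n]

# Invented prices; the data starts on 2020-12-01.
df = pd.DataFrame(
    {"Close": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.to_datetime(
        ["2020-12-01", "2020-12-02", "2020-12-03", "2020-12-04", "2020-12-07"]
    ),
)
n = 3

# Starting on the 2nd row: only 2 rows are available, so the length check fails.
too_early = pd.Timestamp("2020-12-02")
window = df.loc[:too_early].tail(n + 1)

# Starting at the earliest valid date: exactly n + 1 rows are available.
ok_start = earliest_valid_date(df, n)
ok_window = df.loc[:ok_start].tail(n + 1)
```

Under this reading, the fix is either to download data that begins earlier or to pass a start date no earlier than the value `earliest_valid_date` returns.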

  • @abhirajranjan1518
    @abhirajranjan1518 15 days ago

    Why is the windowed_df function taking so long to run? It isn't stopping...

  • @dnas5629
    @dnas5629 a year ago

    Helpful vid, but I ran into the same problem, which is long-term predictions. I was able to predict the complete OHLC data a year out and chart it on a candlestick chart. What I found was that the future predictions would be a mirror image of the past. So if there was an increasing hump, there would be an inversely decreasing slump. I wonder what would happen if you used a larger window and instead normalized the data to % up/down. That way you could use the full dataset and convert back later. It would also be interesting to first run a clustering on the candles and then feed that data into your LSTM model along with the normalized candles.

  • @aadarshsingh2247
    @aadarshsingh2247 a year ago

    I have a question asked by my professor: "Why are we using only one attribute, the close price, when the stock market can depend on volume too?"

    • @GregHogg
      @GregHogg  a year ago

      You absolutely could
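
Adding volume (or any other column) mainly changes the last axis of the input from 1 feature to several. A NumPy sketch of windowing two features, assuming the target stays the next close; the numbers are invented, and in practice columns on such different scales would usually be scaled first.

```python
import numpy as np

# Hypothetical data: 6 days x 2 features (Close, Volume), values invented.
data = np.array([
    [10.0, 100.0],
    [11.0, 110.0],
    [12.0, 120.0],
    [13.0, 130.0],
    [14.0, 140.0],
    [15.0, 150.0],
])
n = 3  # window size

# Each sample holds n consecutive days of both features: shape (samples, n, 2).
X = np.stack([data[i : i + n] for i in range(len(data) - n)])

# The target is the close on the day right after each window.
y = data[n:, 0]
```

With this shape, the model's input layer would need to expect (3, 2) per sample instead of (3, 1).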

  • @pufferwockey
    @pufferwockey 2 years ago

    Hi there, new to the channel, love it so far. I've been trying to follow these instructions, but my model always predicts the same number. When I plot model.predict(X_whichever) against dates_whichever, and y_whichever against dates_whichever, the y line is what you'd expect but the prediction line is completely horizontal, at what appears to be the mean of y_val. I've been messing with normalization and quadruple-checked shapes, but no change so far. Anyone have any thoughts?

    • @pufferwockey
      @pufferwockey 2 years ago

      I must have bungled something. Threw a batch normalization on the input and that fixed it

    • @britox.6216
      @britox.6216 a year ago

      @@pufferwockey I'm dealing with that issue now too; the prediction line is just horizontal. What exactly did you do to fix this? I am new to coding and would appreciate a response.

  • @skinderspike7564
    @skinderspike7564 5 months ago

    Error: Window of size 3 is too large for date 1986-03-18 00:00:00
    How to Resolve?

  • @kkololp
    @kkololp 6 months ago

    If your data is not linear, you can take its log to make it closer to linear.

  • @rizkamilandgamilenio9806
    @rizkamilandgamilenio9806 a year ago

    Amazing video. Why does your model in this video perform differently from the actual code?

    • @GregHogg
      @GregHogg  a year ago

      Thank you! Probably just randomisation

  • @ShuhaoGao
    @ShuhaoGao 8 months ago +1

    What is the value for test_val?

  • @eightonecapital6554
    @eightonecapital6554 a year ago

    BRAND NEW TO CODING. Is there a way to make the chart interactive to see exact predicted prices?

    • @GregHogg
      @GregHogg  a year ago +1

      Yes, use Plotly instead of Matplotlib.

  • @pranaliredgaonkar8154
    @pranaliredgaonkar8154 2 years ago

    Which mathematical model have you used for prediction?

  • @1234567PokemGaming
    @1234567PokemGaming 5 months ago

    Guys, at the end of the video, the prediction should not be a constant value. It's just a small mistake. You can use my code for reference.
    In the video, the constant trend happens because last_window always stays the same.
    # Predict future
    from copy import deepcopy
    recursive_predictions = []
    recursive_dates = np.concatenate([dates_val, dates_test])
    last_window_new = deepcopy(X_train[-1])
    for target_date in recursive_dates:
        next_prediction = model.predict(np.array([last_window_new])).flatten()
        recursive_predictions.append(next_prediction)
        # Slide the window: drop the oldest value, append the new prediction.
        last_window_new[0] = last_window_new[1]
        last_window_new[1] = last_window_new[2]
        last_window_new[-1] = next_prediction
    print(X_train[-2:]) # Print the last 2 elements of X_train
    print(np.array([last_window_new])) # Print the value of np.array([last_window_new])

  • @fabiobrito8038
    @fabiobrito8038 2 years ago

    ValueError: Input 0 of layer "sequential_39" is incompatible with the layer: expected shape=(None, 3, 1), found shape=(32, 0, 1). I got this error after changing the stock in this prediction, and I always get it. I changed the window size to 0 because my CSV is much smaller.

    • @GregHogg
      @GregHogg  2 years ago +1

      I wouldn't suggest a window of 0.

  • @obedrolland3259
    @obedrolland3259 8 months ago

    There is a significant mistake in your loop that predicts the test and validation closing prices from the last 3 values of the training data. You should change the last_window expression to:
    last_window = np.concatenate([last_window[-2:], np.array(next_prediction)])
    And see what you get.

    • @obedrolland3259
      @obedrolland3259 8 months ago

      Awesome video by the way ! 👌🏽
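
The window update discussed here can be sketched without a trained model: at each step, drop the oldest value and append the newest prediction, keeping the (3, 1) shape. `predict_next` below is a made-up stand-in for the model call, and the starting window is invented; with a Keras model the call would presumably look like `model.predict(window[np.newaxis, ...]).flatten()[0]`.

```python
import numpy as np

def predict_next(window):
    """Stand-in for the model: last value plus the mean step inside the window."""
    return window[-1, 0] + np.mean(np.diff(window[:, 0]))

# Hypothetical last training window, shape (3, 1).
last_window = np.array([[10.0], [11.0], [12.0]])
steps = 4

recursive_predictions = []
window = last_window.copy()
for _ in range(steps):
    next_prediction = predict_next(window)
    recursive_predictions.append(float(next_prediction))
    # Slide the window: drop the oldest row, append the prediction as a new row.
    window = np.concatenate([window[1:], [[next_prediction]]], axis=0)
```

Wrapping the prediction as `[[next_prediction]]` keeps both arrays 2-D, which avoids the shape mismatch that comes from concatenating a (2, 1) array with a 1-D one.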

  • @Paul-mk3jt
    @Paul-mk3jt 2 years ago

    I think you made a mistake at the end with the recursive predictions, you are not updating the last window correctly. Great video nonetheless!

    • @GregHogg
      @GregHogg  2 years ago

      Hmm thank you I'll have to take a look at that!

    • @felipegiacomel9751
      @felipegiacomel9751 a year ago +1

      @Alejandro Pachon it worked for me, thank you! But I had to change the last line to last_window = last_window[1:].reshape(len(last_window)-1,1), or I would get an exception
      Also, although it works, I didn't understand why you make a copy of X_train only to deepcopy it later