Multiple Time Series Forecasting With Scikit-Learn

  • Published on Jul 6, 2021
  • You have a lot of time series data points and want to predict the next step (or steps). What should you do? Train a model for each series? Is there a way to fit one model to all the series together? Which is better?
    I have seen many data scientists approach this problem by creating a separate model for each product. Although this is one possible solution, it's not likely to be the best.
    Here I will demonstrate how to train a single model to forecast multiple time series at the same time. This technique usually creates powerful models that help teams win machine learning competitions and can be used in your project.
    And you don’t need deep learning models to do that!
    Timestamps
    0:00 Intro
    1:28 Melt the data, stack the series
    7:18 Split the data
    10:29 Set up a 1-step target
    13:57 Create 4 fundamental features (feature engineering)
    26:16 Choose an evaluation metric
    31:34 Establish a baseline
    35:18 Train the model
    37:34 Evaluate the model
    39:11 Extend the model to multi-step forecasting
    43:04 Forecast new data
    45:37 Next steps
    Code: github.com/ledmaster/english_...
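The steps listed in the timestamps can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the code from the video: apart from Product_Code and lag_sales_1 (both mentioned in the comments), the column names, the exact set of features, the mean absolute error metric, the split point, and the random forest settings are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic wide table (assumption): one row per product, one column per week.
rng = np.random.default_rng(0)
wide = pd.DataFrame(rng.poisson(10, size=(5, 12)),
                    columns=[f"W{i}" for i in range(12)])
wide.insert(0, "Product_Code", [f"P{i}" for i in range(1, 6)])

# Melt: stack every series into one long table (one row per product-week).
stacked = wide.melt(id_vars="Product_Code", var_name="Week", value_name="Sales")
stacked["Week"] = stacked["Week"].str[1:].astype(int)
stacked = stacked.sort_values(["Product_Code", "Week"]).reset_index(drop=True)

# Build features per product so lags never leak across series boundaries.
by_product = stacked.groupby("Product_Code")["Sales"]
stacked["lag_sales_1"] = by_product.shift(1)        # sales one week earlier
stacked["diff_sales_1"] = by_product.diff(1)        # change since last week
stacked["mean_sales_4"] = by_product.transform(lambda s: s.rolling(4).mean())

# 1-step target: next week's sales for the same product.
stacked["target"] = by_product.shift(-1)

# Drop rows where a feature or the target is undefined.
data = stacked.dropna()
features = ["Sales", "lag_sales_1", "diff_sales_1", "mean_sales_4"]

# Split by time, not at random: earlier weeks train, later weeks validate.
train = data[data["Week"] < 9]
valid = data[data["Week"] >= 9]

# Baseline: next week's sales equal this week's sales.
baseline_mae = mean_absolute_error(valid["target"], valid["Sales"])

# One model fitted on all products at once.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(train[features], train["target"])
model_pred = model.predict(valid[features])
model_mae = mean_absolute_error(valid["target"], model_pred)

print(f"baseline MAE: {baseline_mae:.2f}, model MAE: {model_mae:.2f}")
```

Because the synthetic sales are pure noise, the model is not expected to beat the baseline here; the point is only the shape of the pipeline.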
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // SUPPORT THE CHANNEL 👇❤️
    Sign up for a Coursera course:
    imp.i384100.net/EaDmQe
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    // SOCIAL MEDIA
    LinkedIn: / mariofilho
    Kaggle: kaggle.com/mariofilho
    Twitter: / mariofilhoml
    Blog: forecastegy.com
    Some links above can be from partnerships where I get a commission if you buy a product, without any additional cost to you. Thanks for the support!

Comments • 43

  • @nehan.2199 2 years ago +1

    This is very helpful, thank you! Where can I find the dataset to download?

    • @Forecastegy 2 years ago +1

      Great, here it is: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @LifeKiT-i 8 months ago +3

    I just checked out this amazing video after your feature selection engineering video! I have no idea why this video isn't popular!!! Respect for the effort you spent on this!

  • @necuspam 4 days ago

    A more intriguing question is how to train a model on thousands of time series, each determined by multiple parameters, and then simulate/forecast a single time series from a new set of those parameters.

  • @towhidultonmoy3046 2 years ago

    Keep it up! You have a long way to go brother. Best wishes!

  • @Luckasborges 3 years ago +2

    Learning ML and English together! Here we go! hehe
    Congrats on the new channel, Mario!

  • @vamsikrishnabhadragiri9742 2 years ago +4

    Why didn't you standardize the data? Since sales for different products fall in different ranges, doesn't that affect model performance?

  • @diegosccp09 2 years ago +1

    You are a legend! I'm using this for a master's assessment.

  • @Septumsempra8818 1 year ago

    Are we going to get a video on cross-validation and selecting the right model?
    Your time series videos have been a wealth of knowledge.

  • @alirezajabbari2537 2 years ago +2

    Thank you Mario!
    You saved me in my 4th year project
    ciao

  • @sancarlitos1125 2 years ago

    Excellent explanation! Thanks for sharing it! I was working on a similar forecast, and I was wondering: when the product number changes, let's say from 0 to 1, should the rolling window and the lag be modified? Otherwise we would be using information from the previous product.
    Thank you very much!

  • @ElChe-Ko 1 year ago

    Nice! It would be interesting to see what to do if the time series have different lengths.

  • @kaianchan7768 2 years ago

    Thanks for this tutorial. Will you provide some videos about many features? Thanks!

  • @igorkuivjogifernandes3012 2 years ago

    Hi, Mario. Awesome video, it helped me a lot. One question: what could we do if the train set has uneven periodicity (2 days for one product, 7 days for another, 3 days for another, and so on, or even worse, some products have only 1 or 2 observations), but my test set has even periodicity (every product has a periodicity of 7 days)?

  • @Dragnar21 2 years ago

    First of all, thank you for the video and the extraordinary explanation. I would like to know how you would structure your data if the series are not all the same length.

  • @JoaoVitorBRgomes 2 years ago +1

    You're the man!

  • @pcdowling 10 months ago

    Thank you.

  • @Mohammad-vr9dj 2 years ago

    Thanks for your useful video. If our dataset has two target columns, how should we write the code?

  • @Mohammad-vr9dj 1 year ago

    Thanks for the useful video. Is it possible to model independent spatial sequences simultaneously? I have a dataset which consists of 1000 independent spatial sequences with dimension 2*7 (2 for x and y, and length 7 for the positions at each time). I implemented it with Simple RNN, LSTM and GRU. Can I do it with transformers (attention mechanism)? Could you point me to a practical example?

  • @anwarsaidan3959 1 month ago

    Thank you very much for this amazing video!
    Can we use cross-validation for hyperparameter tuning in the case of RandomForest with time series data?

  • @Orlandobelli 2 years ago

    Good video. Can we model multiple time series with an ARIMA model?

  • @faraza5161 2 years ago +1

    The SimpleImputer will fill the missing values with the mean of the entire column. Shouldn't that be done per product as well?
    Thanks for a wonderful lecture btw :-)

  • @user-fh7gb2yf5z 1 year ago

    Mario, good afternoon. Do you have any tips for using an LSTM for multi-step-ahead predictions in a MISO system?

  • @Learner_123 2 years ago +4

    Thank you for making the topic simple. Since you have combined all the product sales to train and validate your model, how can one use this model to predict sales for a single product only?

    • @zabmaz10 1 year ago

      I have the same question, but I guess one way is to convert the product code into dummy variables and use those as features in the random forest.
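The suggestion in the reply above (product code as dummy variables) could look roughly like the sketch below. The data is made up, and every column name other than Product_Code is hypothetical; it is one possible way to do it, not the approach shown in the video.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Tiny made-up stacked dataset: two products, features already built.
df = pd.DataFrame({
    "Product_Code": ["P1"] * 4 + ["P2"] * 4,
    "Sales": [10, 12, 11, 13, 100, 98, 105, 101],
    "lag_sales_1": [9, 10, 12, 11, 97, 100, 98, 105],
    "target": [12, 11, 13, 12, 98, 105, 101, 99],
})

# One 0/1 column per product, so the model can tell the series apart.
dummies = pd.get_dummies(df["Product_Code"], prefix="product", dtype=int)
X = pd.concat([df[["Sales", "lag_sales_1"]], dummies], axis=1)

# Still a single model trained on all products together.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, df["target"])

# Forecasting a single product: pass only that product's rows to predict.
is_p2 = df["Product_Code"] == "P2"
pred_p2 = model.predict(X[is_p2])
print(pred_p2)
```

With thousands of products, one dummy column per product gets wide; an integer or target encoding of the code is a common alternative, at the cost of imposing an arbitrary order.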

  • @VG-yw2mp 1 year ago +1

    Why don't we use product_code as one of the features while training?

  • @zulhas9 1 year ago

    Hi Mario, thanks for the wonderful presentation. One question: how can you use the "Sales" feature to predict sales? If you use that feature, you have to pass it as an argument when you call the .predict function. In reality, you would not have that information available.

  • @mamyrak1114 2 months ago

    Can I follow the same process if, instead of a week number, I have a date like yyyy-mm-dd? And how should I handle the year?

  • @jackcarter97 6 months ago

    How do I find the season effect features?

  • @efremyohannes2334 2 years ago

    How can I model time series for unevenly distributed data using scikit-learn?

  • @Gabriel-iw3hc 1 year ago

    How do I forecast the future with this method?
    E.g., forecast week 52?
    I think I would also need to forecast other series to build the other features.

  • @StatiR_br 3 years ago +3

    Hi Mario! First of all, congratulations on the video! I have a question: in this context we have several products (Product_Code) and only one fitted model. Given how the dataset is laid out, will (or could) the model use, for example, the last 'lag_sales_1' of one Product_Code to predict the sales of the next Product_Code? The model won't know when it is looking at one Product_Code and when at another. Or am I mixing things up? Thanks in advance!

    • @guilhermeparreira5448 1 year ago

      I agree with you. This way of modeling would only work if all the products had similar average sales (and even then, barely). I think the more correct approach would be to also include the product code as a covariate in the model.

  • @ozan4702 2 years ago

    Why should the difference be a feature? Given sales and lagged sales, the difference is already known.

  • @stonesupermaster 1 year ago +2

    Hello Mario, I have a question: how does the model know that we're trying to predict multiple products at once? I've been trying to train a model to predict the sales of 2000 SKUs, and my main concern now is how to do it efficiently. I watched everything you did but I still have the same problem. Do you know where I can find an example of it? Thank you very much for your video.

    • @AskApt05 1 month ago

      Hi @stonesupermaster, I'm facing the same problem. Have you found a solution? It would be really helpful if you could share it. Thanks!

  • @aacharyadhruvi8301 2 years ago

    Where can I get Sales_Transactions_Dataset_Weekly.csv?

    • @Forecastegy 2 years ago

      Here: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @XiboquinhaMilGrau 1 year ago +1

    I didn't see that one coming, haha

  • @vivianealveslima9358 3 years ago +1

    The code on GitHub is unavailable =S

    • @Forecastegy 3 years ago

      Oops! Fixed, this is the right link: github.com/ledmaster/english_tutorials/tree/main/multiple_time_series