How to stack machine learning models in Python

  • Published on 27 Sep 2024

Comments • 84

  • @DataProfessor
    @DataProfessor  3 years ago +2

    👉 Watch this next: th-cam.com/video/oR670Txwh88/w-d-xo.html (The Art of Learning Data Science - How to learn data science in 2021)
    ----------
    🌟 Download Kite for FREE www.kite.com/get-kite/?
    🌟 Buy me a coffee www.buymeacoffee.com/dataprofessor
    🌟 Subscribe to this YouTube channel th-cam.com/users/dataprofessor
    🌟 Join the Newsletter of Data Professor newsletter.dataprofessor.org

    • @skylerfranklin7089
      @skylerfranklin7089 3 years ago

      @Kingston Dangelo thank you, I signed up and it seems to work :D I really appreciate it !

    • @kingstondangelo4541
      @kingstondangelo4541 3 years ago

      @Skyler Franklin You are welcome xD

    • @Dr_Insha_Altaf
      @Dr_Insha_Altaf 2 years ago

      Sir, can you please provide the pseudocode?

  • @TinaHuang1
    @TinaHuang1 3 years ago +4

    Wow, this is great. I was just thinking about this a couple of days back!

    • @DataProfessor
      @DataProfessor  3 years ago

      Awesome, thanks for tuning in Tina!

  • @paulntalo1425
    @paulntalo1425 3 years ago +2

    Thank you, Professor. Your contribution towards enabling data scientists is unmatched this year. Yours is my favorite channel for full-stack data science.

    • @DataProfessor
      @DataProfessor  3 years ago

      Thanks Paul for the encouragement and support of the channel :)

  • @mogamoga4474
    @mogamoga4474 2 years ago +5

    Sir, why is logistic regression always used for stacking?

    • @shahadsha8692
      @shahadsha8692 7 months ago

      Logistic regression is used for classification tasks.

  • @shubhamdandekar20
    @shubhamdandekar20 2 years ago +1

    Thanks for this video; now I understand what stacking is.

  • @taki7394
    @taki7394 4 months ago +1

    You are great, Prof!

  • @andykim7654
    @andykim7654 5 months ago +2

    I’ve noticed that the outcomes from the random forest model and the stacking model are identical. Any thoughts?

    • @stephentete1211
      @stephentete1211 4 months ago

      Yes, I was wondering too; the same goes for the SVM model. @DataProfessor, could you please clarify this? Thanks!

  • @t.t.cooperphd5389
    @t.t.cooperphd5389 3 years ago +4

    Beautiful!

  • @Kmysiak1
    @Kmysiak1 2 years ago +1

    What's your logic for using a logistic regression classifier as the final_estimator? How come you didn't tune your hyperparameters? Good, clean code and well explained, but it could be better.

  • @muditarora9860
    @muditarora9860 3 years ago +2

    It shows an error at this line:
    stack_model_train_accuracy = accuracy_score(Y_tr, Y_tr_pr)
    The message says it cannot be continuous.

    • @DataProfessor
      @DataProfessor  3 years ago

      Can you try the r2_score function instead (import r2_score first)? I think your Y is a quantitative value, which is why the error says "continuous". For the accuracy_score function to work, your Y has to be categorical (or discrete values).
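
The distinction in this reply can be sketched as follows. The arrays are made-up illustrative values, not data from the video, and the variable names are assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, r2_score

# Hypothetical continuous target and predictions (illustrative values only)
y_true = np.array([2.5, 0.3, 1.7, 3.1])
y_pred = np.array([2.4, 0.5, 1.5, 3.0])

# accuracy_score is a classification metric and rejects continuous targets
try:
    accuracy_score(y_true, y_pred)
    accuracy_failed = False
except ValueError as err:
    accuracy_failed = True
    print("accuracy_score failed:", err)

# r2_score is a regression metric and accepts continuous values
r2 = r2_score(y_true, y_pred)
print("R^2:", round(r2, 3))
```

If the target really is continuous, the base models and the stack would also need to be regressors (for example StackingRegressor) rather than classifiers.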

  • @mamacita5636
    @mamacita5636 2 years ago +1

    Thank you! Quick question: why do you perform a train-test split if you're going to use cross-validation? Wouldn't the cross-validation do the split?

  • @nurmukhammad_30k
    @nurmukhammad_30k 2 years ago +1

    Very nice explanation! Keep going!

  • @ama016
    @ama016 3 years ago +1

    This is SO awesome! Laser eyes ML

  • @SandraBabirye-t7d
    @SandraBabirye-t7d 3 months ago

    How can you obtain feature importances from the stacked model?

  • @gguchristine
    @gguchristine 3 years ago +5

    I was literally thinking about how to do stacking and I saw your video in my subscription box haha thanks for the video!

    • @DataProfessor
      @DataProfessor  3 years ago

      Awesome, glad to hear!

    • @remymakota1813
      @remymakota1813 2 years ago

      @DataProfessor Hello there,
      Is it possible to use Particle Swarm Optimization as part of the stacking models? If so, could you kindly show me how?

  • @bambangSiswo
    @bambangSiswo 1 year ago +1

    Good post

  • @futureceltic00
    @futureceltic00 6 months ago

    So if I were not to use StackingClassifier, this is basically me consolidating all predicted classes of the base models and using them as features for the meta model? If this is the case, then does stacking also detect overfitting from too many features used in the base model if the performance decreased on the final meta model?

  • @mohak9102
    @mohak9102 3 years ago +1

    Please always keep up the great work

  • @hareshk.mangtaniretrita1376
    @hareshk.mangtaniretrita1376 3 years ago +2

    Great video! Had a quick question for you. After training and testing the KNN algorithm, how are the metrics of the performance of the test set higher than that of the training set? Haven't we trained the model using the training set? I would expect the model to be more accurate when making predictions in the training set (which is seen data) as opposed to the test set (unseen data). Regards!

    • @chadgregory9037
      @chadgregory9037 2 years ago +2

      LOL you are right.... something SUS going on here!

    • @bedoe9684
      @bedoe9684 2 years ago +1

      I don't have a theoretical answer, but it does happen that validation scores are greater than training ones. It might be because the test split itself contains more favorable data (that is, data the model has learnt very well).
      Regards

    • @gammetube3976
      @gammetube3976 1 year ago

      Great question! How do we apply feature extraction to this model?

  • @username42
    @username42 3 years ago +2

    What about cross-validation of the model? How do we do that with stacking models like this?

    • @DataProfessor
      @DataProfessor  3 years ago +2

      There's a built-in CV option (the cv parameter) in StackingClassifier and StackingRegressor.

    • @username42
      @username42 3 years ago

      @DataProfessor Cool, so is it doing the CV before stacking the models or afterwards? For instance, in your case, is the CV applied to the logistic regression model or to the previous ones?
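
A hedged sketch of the two places cross-validation can appear. As far as the scikit-learn documentation describes it, the cv argument of StackingClassifier is used internally: out-of-fold predictions from the base estimators become the training data for the final estimator, and the base estimators are then refit on the full training data. A separate, outer cross-validation can score the whole stack. The iris dataset, the estimator choices and the hyperparameters below are assumptions for illustration, not necessarily what the video uses.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # assumed example dataset

estimators = [
    ("knn", KNeighborsClassifier(3)),
    ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
]

# Inner CV (cv=5): builds out-of-fold base-model predictions
# that the logistic regression meta-model is trained on
stack = StackingClassifier(
    estimators=estimators,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Outer CV (cv=3): evaluates the stacked model as a whole
scores = cross_val_score(stack, X, y, cv=3)
print("Fold accuracies:", scores)
```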

  • @gammetube3976
    @gammetube3976 1 year ago

    Thank you. How do we check the CV of the model, save the developed model, and evaluate it?
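
One possible way to save and re-evaluate a fitted stack is with joblib, sketched below. The dataset, the estimators and the file name stack_model.joblib are assumptions chosen for illustration.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # assumed example dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier(3)),
        ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_train, y_train)

# Save the fitted model to a temporary directory (hypothetical file name)
model_path = os.path.join(tempfile.mkdtemp(), "stack_model.joblib")
joblib.dump(stack, model_path)

# Reload it and evaluate on the held-out test set
loaded = joblib.load(model_path)
test_accuracy = accuracy_score(y_test, loaded.predict(X_test))
same_predictions = bool((loaded.predict(X_test) == stack.predict(X_test)).all())
print("Test accuracy of reloaded model:", test_accuracy)
```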

  • @sangnp
    @sangnp 8 months ago

    Sir, can you please answer me: is this a two-layer stack?

  • @transferlearning6983
    @transferlearning6983 3 years ago

    Thanks a lot for this helpful video. I was wondering how we can use loaded (already pre-trained) models as estimators?

  • @aditiarora2128
    @aditiarora2128 1 year ago

    Sir, great explanation, but I am still confused about how the dataset for the meta learner is formed. Different blogs describe different ways of creating the training dataset for the meta learner.
    For example, you first trained every model individually and then trained the stacked model on the same training and test sets! But in many blogs I have seen that the input training dataset for the meta learner should be formed by combining the predicted probabilities (Pred_prob) of every base model with the actual label. Please clarify!
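
The blog-style construction described in this question can be sketched by hand as below. This is an illustrative manual version under assumed data and models, not the code from the video; StackingClassifier is understood to do something similar internally through its cv parameter.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # assumed example dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

base_models = [
    KNeighborsClassifier(3),
    RandomForestClassifier(n_estimators=10, random_state=42),
]

# Meta-features for training: out-of-fold predicted probabilities,
# so no base model predicts on rows it was trained on
meta_train = np.hstack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")
    for m in base_models
])

# The meta learner is trained on those probabilities plus the actual labels
meta_model = LogisticRegression(max_iter=1000).fit(meta_train, y_train)

# For the test set, refit each base model on all training data first
meta_test = np.hstack([
    m.fit(X_train, y_train).predict_proba(X_test) for m in base_models
])
test_accuracy = meta_model.score(meta_test, y_test)
print("Manual stacking test accuracy:", test_accuracy)
```

With two base models and three classes, each row of meta_train holds six probabilities.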

  • @Мага123-о2о
    @Мага123-о2о 3 years ago

    One of the most useful videos in my life! I surely won't be able to find a more convenient explanation of model stacking. Thank you, Professor! But I have a question: is it possible to visualize feature importances after stacking?

    • @chadgregory9037
      @chadgregory9037 2 years ago +1

      I don't think it makes sense to "look after stacking", because each model is like an independent set which relies on its own features... but the whole is just a sum of the parts....
      So I guess technically speaking, if you want a convoluted way of evaluating feature importance, you could probably do some kinda stats based on each prediction, and which submodel was most accurate to that model, and then take like, the top 3 submodels, and compare features across them.
      I could be entirely wrong here, lol, but it's my experience that machine learning stuff is quite intuitive. So just by that I have a tendency to feel like it doesn't make sense to see a feature importance after stacking, since the stack relies on each individual model, and for any particular prediction one submodel might be superior to others.
      It's a very interesting thing to think about though!
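
One model-agnostic option for the question in this thread is permutation importance, which treats the fitted stack as a black box and measures how much the score drops when each input feature is shuffled. The sketch below is one possible approach under assumed data and models, not something shown in the video.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

data = load_iris()  # assumed example dataset
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42, stratify=data.target
)

stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier(3)),
        ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
).fit(X_train, y_train)

# Shuffle each original feature and record the drop in test accuracy
result = permutation_importance(
    stack, X_test, y_test, n_repeats=5, random_state=42
)
for name, mean in zip(data.feature_names, result.importances_mean):
    print(f"{name}: {mean:.3f}")
```

The resulting values could then be plotted as a bar chart. The coefficients in stack.final_estimator_.coef_ describe something different: how much weight the meta-model gives to each base-model output.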

  • @boulaabimeher5891
    @boulaabimeher5891 1 year ago

    But why doesn't the stack give the best result? I think it was the same as the RF results.
    So if it takes the best results, why is it not 1 for all the metrics?
    Thanks a lot

  • @ahsanm.5040
    @ahsanm.5040 2 years ago

    Dear Professor, could you please help me write Python code to stack Prophet and SARIMA univariate regression models for better predictions?

  • @divyakarade4378
    @divyakarade4378 3 years ago +1

    Nice video..Thanks :)

  • @kholoodsh4616
    @kholoodsh4616 2 years ago

    It shows an error:
    ValueError: The estimator Sequential should be a classifier.
    at stack_model.fit(x_train,y_train)
    How can I fix it?

  • @khedirzakaria1916
    @khedirzakaria1916 2 years ago

    Great video.
    Why didn't we use hyperparameter tuning?

  • @chandu-mu2cg
    @chandu-mu2cg 2 years ago

    But how do we know which meta learner to choose?

  • @priyadoesdatascience5141
    @priyadoesdatascience5141 3 years ago +1

    Excellent video! I am going to use it in my model. I have just one question: how can it predict better than a single model? Is it because of the inputs from the different models?

    • @DataProfessor
      @DataProfessor  3 years ago

      It’s an ensemble of several classifiers, think of it like a team of judges helping to decide together. And yes it uses the predictions from individual classifiers to make a final single prediction

    • @priyadoesdatascience5141
      @priyadoesdatascience5141 3 years ago +1

      @DataProfessor Excellent, sir, thank you! If I want to tune the hyperparameters, should I do it individually for each model and then supply the tuned models to the meta algorithm?

    • @DataProfessor
      @DataProfessor  3 years ago +2

      @priyadoesdatascience5141 Yes, exactly. The video shows the use of default parameters for the individual classifiers.
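
Besides tuning each classifier separately as discussed in this thread, scikit-learn also allows a joint search over the whole stack, because nested parameters can be addressed as name__parameter. The sketch below shows that alternative; the dataset, the small grid and the estimator names knn and rf are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # assumed example dataset

stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Keys follow the "<estimator name>__<parameter>" convention
param_grid = {
    "knn__n_neighbors": [3, 5],
    "rf__n_estimators": [10, 20],
    "final_estimator__C": [0.1, 1.0],
}

search = GridSearchCV(stack, param_grid, cv=3)
search.fit(X, y)
best_params = search.best_params_
print("Best parameters:", best_params)
```

A joint search grows quickly with the grid size, so tuning the base models individually first can be the cheaper route.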

  • @Karenshow
    @Karenshow 1 year ago

    Can we use stacking on time series models?

  • @ashwinig8273
    @ashwinig8273 3 years ago

    Hello sir, it was a wonderful video, very informative.
    Sir, can you please suggest the best denoising network? Can we stack different denoising algorithms in the same manner?

  • @khedirzakaria1916
    @khedirzakaria1916 2 years ago

    great video

  • @praveen2112
    @praveen2112 2 years ago

    Sir,
    In stacking, how do we know that the final estimator should be logistic regression?

    • @ebenezeragbozo
      @ebenezeragbozo 2 years ago

      because we are dealing with a set of continuous values (i.e. the results from all the models combined)

    • @praveen2112
      @praveen2112 2 years ago

      @ebenezeragbozo Then it could be any regressor, sir, such as a random forest, linear regressor or decision tree regressor... How can we pick the best one as the final estimator among them?
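
In scikit-learn, StackingClassifier accepts any classifier as final_estimator and falls back to LogisticRegression when none is given. One hedged way to pick among candidates is to compare cross-validated scores, as sketched below. The candidate list, dataset and hyperparameters are assumptions, and a different dataset may favor a different meta-model.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # assumed example dataset

base_estimators = [
    ("knn", KNeighborsClassifier(3)),
    ("svc", SVC(random_state=42)),
]

# Candidate meta-models to compare (illustrative choices)
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=10, random_state=42),
}

results = {}
for name, final in candidates.items():
    stack = StackingClassifier(
        estimators=base_estimators, final_estimator=final, cv=3
    )
    # Mean accuracy over 3 outer folds for this choice of final estimator
    results[name] = cross_val_score(stack, X, y, cv=3).mean()

best_name = max(results, key=results.get)
print(results, "-> best:", best_name)
```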

  • @vision3309
    @vision3309 2 years ago

    Sir, I am getting an error in every fit call, such as knn.fit(X_train, y_train), when I use my own dataset. The error is shown below. Can you please provide a solution for it?
    ValueError Traceback (most recent call last)
    in ()
    4
    5 knn = KNeighborsClassifier(3) # Define classifier
    ----> 6 knn.fit(X_train, y_train) # Train model, x-> features y->classification of flowers
    7
    8 # Make predictions
    2 frames
    /usr/local/lib/python3.7/dist-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
    196 "multilabel-sequences",
    197 ]:
    --> 198 raise ValueError("Unknown label type: %r" % y_type)
    199
    200
    ValueError: Unknown label type: 'continuous'
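
The traceback above typically appears when a classifier is given a target with non-integer float values. A small sketch of how to check the target type and two possible ways out, using made-up arrays (not the commenter's data):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.utils.multiclass import type_of_target

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])

# Hypothetical target with real-valued measurements
y_continuous = np.array([0.1, 0.9, 2.2, 2.8, 4.1, 5.3])
target_type = type_of_target(y_continuous)
print(target_type)  # classifiers reject this kind of target

# Option 1: the target is genuinely numeric, so use a regressor
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_continuous)

# Option 2: the target holds class labels stored as floats, so cast to int
y_labels = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0]).astype(int)
label_type = type_of_target(y_labels)
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_labels)
```

Which option applies depends on what the target column in the dataset actually represents.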

  • @shreyanhce315
    @shreyanhce315 3 years ago

    Hello sir, thanks a lot. I have a doubt, however: if we choose to use our own CSV, which column should be set as our y?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Hi, there are many solutions to this, but the easiest is to set Y by using df.Y or df['Y']
      (given that your Y variable is called "Y")

    • @shreyanhce315
      @shreyanhce315 3 years ago +1

      @DataProfessor Thank you so much, and please make more videos; they are very valuable 🤩🤩🤩🤩🤩
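
The df['Y'] suggestion in this thread can be sketched as below. The small DataFrame stands in for a CSV loaded with pd.read_csv; the column names and values are hypothetical.

```python
import pandas as pd

# Stand-in for df = pd.read_csv("your_file.csv") (hypothetical file and columns)
df = pd.DataFrame({
    "feature_1": [5.1, 4.9, 6.2, 5.9],
    "feature_2": [3.5, 3.0, 2.9, 3.0],
    "Y": ["setosa", "setosa", "virginica", "virginica"],
})

# The column to predict becomes y; every other column becomes X
y = df["Y"]
X = df.drop(columns=["Y"])

print(X.shape, y.shape)
```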

  • @tsunamio7750
    @tsunamio7750 2 years ago +1

    Could you make an example of this with Keras neural networks? This is a very specific issue when you wrap your DNN with KerasClassifier, where you must provide the new model as a function or something... instead of training it.

  • @Julio_Zambrano
    @Julio_Zambrano 3 years ago +3

    Amazing!
    This video is super helpful!
    Thank you, Professor! :D

    • @DataProfessor
      @DataProfessor  3 years ago

      Thanks for watching and glad to hear that it's helpful 😊

    • @Dr_Insha_Altaf
      @Dr_Insha_Altaf 2 years ago

      @DataProfessor Kindly provide the pseudocode...

  • @taseersuleman7343
    @taseersuleman7343 3 years ago +3

    Much awaited video 👍

  • @jaredking929
    @jaredking929 8 months ago

    Wouldn't it be great if you could explain the strengths of each algorithm and show how they improved the models?

  • @connectrRomania
    @connectrRomania 3 years ago +2

    Man your videos are awesome, keep up the good work. Thank you

  • @aimanjatt4
    @aimanjatt4 2 years ago

    I got the error "Y_train is not defined". How can I fix this error? Please tell me.

  • @joaomaia2898
    @joaomaia2898 2 years ago

    Thanks for this.
    The best way to learn data science is by doing data science...
    Do you recommend a method for selecting which models to put in the stack? Only the better-performing models?

  • @SadTeddyBeer
    @SadTeddyBeer 1 year ago

    At the end of the video, when we print the final df with the metric scores in it, we can see that the stack model mainly follows the random forest classifier. Why not follow the decision tree, which has 1 for all its scores? Is it because a score of 1 is likely to be biased, so the stack classifier doesn't take it into account?

  • @yaminadjoudi4357
    @yaminadjoudi4357 3 years ago

    Thank you for this. Is stacking the same concept as modular neural networks (MNNs)?

  • @MegaBoss1980
    @MegaBoss1980 3 years ago

    Hi. Can we do level 2 meta model? Any references? Also can we insert new training data in meta model? Any references if yes?
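
On the first question, the scikit-learn documentation notes that multiple stacking layers can be built by passing another StackingClassifier as final_estimator. A minimal sketch under assumed data and model choices:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # assumed example dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Layer 2: a stack that learns from the outputs of layer 1
second_layer = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("svc", SVC(random_state=42)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Layer 1: base models whose predictions feed the second layer
stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier(3)),
        ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
    ],
    final_estimator=second_layer,
    cv=3,
)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
print("Two-layer stack test accuracy:", test_accuracy)
```

Each extra layer adds training cost, and whether it improves accuracy depends on the data.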

  • @mogamoga4474
    @mogamoga4474 2 years ago

    Sir, when I'm using my own data set, this line "X = data.drop('Activity', axis=1)s" is not working...showing invalid syntax

    • @DataProfessor
      @DataProfessor  2 years ago +1

      Hi, you have an extra "s" at the end of your syntax, please delete it.

    • @mogamoga4474
      @mogamoga4474 2 years ago

      @DataProfessor Thanks sir, it worked.