Ensemble Methods in Scikit Learn

  • Published Oct 5, 2024
  • We explore the real heavy hitters: ensemble methods. We go over the meta-estimators: the voting classifier, AdaBoost, and bagging. Then we dive into the two powerhouses: random forests and gradient boosting.
    Associated Github Commit:
    github.com/kna...
    Associated Scikit Links:
    scikit-learn.or...
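    A quick orientation for following along (not code from the video): all of the estimators mentioned above live in sklearn.ensemble.

    from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                                  GradientBoostingClassifier,
                                  RandomForestClassifier, VotingClassifier)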

Comments • 26

  • @waela3637 · 4 years ago +1

    Thank you for your efforts. I am self-learning machine learning, and of the many videos I've watched, this series has been the most useful so far.

  • @rashmimahadevaiah8321 · 3 years ago +1

    12:30 Bagging: low-bias estimators are combined to get a low-variance estimator.
    Boosting: low-variance estimators are improved upon to get a low-bias estimator.
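    A minimal sketch of that contrast (toy data and settings are illustrative, not from the video): bagging averages deep, low-bias/high-variance trees to reduce variance, while AdaBoost stages shallow, high-bias/low-variance stumps to reduce bias.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)

    # Bagging: combine many deep (low-bias, high-variance) trees.
    bagging = BaggingClassifier(estimator=DecisionTreeClassifier(),  # base_estimator in older scikit-learn
                                n_estimators=100, random_state=0)

    # Boosting: successively improve many shallow (high-bias, low-variance) stumps.
    boosting = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                                  n_estimators=100, random_state=0)

    for name, model in [("bagging", bagging), ("boosting", boosting)]:
        print(name, cross_val_score(model, X, y, cv=5).mean())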

  • @EyeofEthiopia · 3 years ago

    Thanks, nice work - it helped me learn quickly.

  • @zeus1082 · 6 years ago +1

    Nice explanation.

  • @rashmimahadevaiah8321 · 3 years ago

    Please go over these estimators in depth. Others who think the same, please upvote this or comment.

  • @davidrodriguezgimeno4863 · 6 years ago

    Thanks for sharing this video! Good stuff!

  • @aadityadamle2951 · 4 years ago +1

    I have a doubt about the VotingClassifier: when voting is set to "hard", how does the voting classifier deal with a tie? For instance, if I have 4 classifiers, 2 of them output "0" and the other two output "1" - which class will be selected by the voting classifier? Does the rule of ascending order apply in this case as well, considering that we have binary classes (0 and 1)?
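    A minimal sketch of hard voting with an even number of classifiers (the models and data are illustrative, not from the video). As far as I can tell, scikit-learn's hard voting takes the argmax over the vote counts, so a 2-vs-2 tie goes to the class that comes first in ascending label order (class 0 here), consistent with the ascending-order rule mentioned in the question.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=0)

    vote = VotingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("nb", GaussianNB()),
                    ("knn", KNeighborsClassifier()),
                    ("tree", DecisionTreeClassifier(random_state=0))],
        voting="hard",  # majority vote on predicted labels
    )
    vote.fit(X, y)
    print(vote.predict(X[:5]))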

  • @rashmimahadevaiah8321 · 3 years ago

    When do you use oob_score=False?
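    A minimal sketch of the difference (illustrative data): oob_score=True scores each tree on the bootstrap samples it never saw, giving a built-in generalization estimate; leaving it False (the default) skips that extra computation, which is fine when you already have a separate validation or cross-validation scheme.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=1000, random_state=0)

    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
    rf.fit(X, y)
    print(rf.oob_score_)  # out-of-bag accuracy estimate, no separate holdout needed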

  • @kofteci408 · 7 years ago

    Thank you for a clear and quite detailed explanation. However, here is something I would have liked to see: the meta-classifiers actually performing better than the base classifiers on "test" data. For example, does the AdaBoostClassifier always (or at least most of the time) perform better on test data than the DecisionTreeClassifier, which happens to be its default base classifier?
    Thanks again

    • @DataTalks · 7 years ago

      Great question! I'd consider this more of a core data science/theory question. And I'll be doing some videos (data science foundation videos) that will be going over the theories behind your question and showing it empirically. The videos should be out in about half a year!
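      A minimal sketch of that kind of empirical check (toy data, illustrative settings): fit AdaBoost and its default base learner (a depth-1 tree) on the same training split and compare held-out accuracy.

      from sklearn.datasets import make_classification
      from sklearn.ensemble import AdaBoostClassifier
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier

      X, y = make_classification(n_samples=2000, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

      stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)
      boosted = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

      print("decision stump:", stump.score(X_test, y_test))
      print("adaboost      :", boosted.score(X_test, y_test))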

    • @kofteci408 · 7 years ago

      Thanks : )

  • @anon12three4 · 5 years ago

    Great videos. Quick question: when you say estimators, are you referring to n_estimators, or to estimator vs. classifier?

    • @DataTalks · 5 years ago

      The n_estimators. Great clarification! Thanks!

  • @janiobachmann5029 · 6 years ago +1

    Thank you for sharing this video. Anyway, is there a way to pass several models to the AdaBoostClassifier "base_estimator" hyperparameter? I am asking because the essence of the AdaBoostClassifier is to ensemble several models through boosting: training the first model on the whole training dataset and giving higher weights to the misclassified instances, so that the second model can try to classify those instances correctly. Nevertheless, please tell me if there is a way to ensemble several classifiers in the AdaBoostClassifier, or maybe there is something I am missing in the way we structure the AdaBoostClassifier. Again, nice video, especially when it comes to complex algorithms such as these!

    • @DataTalks · 6 years ago

      Thanks Janio!
      So that is a very good question. (I am quite interested in which models you are interested in mixing, and to what end!) Naively the answer seems to be no, but I think it could be done with a little code and elbow grease.
      You would most likely want to inherit from the AdaBoost base class:
      github.com/scikit-learn/scikit-learn/blob/a24c8b464d094d2c468a16ea9f8bf8d42d949f84/sklearn/ensemble/weight_boosting.py#L297
      And then each time a call to _boost was made you could change the self.base_estimator property according to some passed-in list, e.g. [Logistic, DecisionTreeTrunk].
      If you do end up implementing and using it, you can always submit a PR to sklearn too!
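      For reference, what the stock API does support is a single, user-chosen base learner reused for every boosting round (a minimal, illustrative sketch; mixing learners per round would need the subclassing approach described above):

      from sklearn.ensemble import AdaBoostClassifier
      from sklearn.linear_model import LogisticRegression

      # Any single estimator whose fit() accepts sample_weight can be boosted,
      # but the parameter takes one estimator, not a list.
      ada = AdaBoostClassifier(estimator=LogisticRegression(max_iter=1000),  # base_estimator in older scikit-learn
                               n_estimators=50, random_state=0)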

    • @janiobachmann5029 · 6 years ago +1

      I will look into the code. I was just asking because most definitions of AdaBoost include several models that assign different weights to misclassified instances so that the next model can focus on solving those misclassified instances correctly, and thus produce a more accurate model. I just find it a bit odd that sklearn does not have a hyperparameter to include several classifiers at once, since as an ensemble method the AdaBoostClassifier could be expected to accept at least two classifiers. I know that by default the AdaBoostClassifier uses the DecisionTreeClassifier; maybe it's because AdaBoost works best with binary classification. Anyway, thanks for sharing the code, I will explore it today. I will also be looking into your seaborn tutorial to improve my statistical visualization skills. Have a great day!

  • @janiobachmann5029 · 6 years ago

    Hey again chateau, do you by any chance know specifically what the "learning_rate" hyperparameter does in the AdaBoostClassifier class? From what I have read, the learning_rate is basically the final weight given to models that are considered weak learners (a little bit better than random guessing). So by reducing the learning_rate we are reducing the weight of the weak learners in our final outcome (prediction). As for the trade-off with n_estimators, I guess the higher n_estimators is, the lower the learning_rate should be, since more models are trained and thus each weight is reduced, and vice versa: the lower n_estimators is, the higher the weight given to each model. I just want to make sure we have roughly the same interpretation of learning_rate. Thanks again!

    • @DataTalks · 6 years ago

      Yep, so that is one way to think about it. Another way to think about it is the classic SGD way: each learner steps you towards a global optimum, and steps that are too large can sometimes overshoot that optimum. The smaller your learning rate, the more estimators/steps you will need to take to get to that optimum.
      So when computation is not limiting, you can reduce the learning rate and choose the appropriate number of estimators through CV.
      I checked out the code for where this is set, and it's here:
      github.com/scikit-learn/scikit-learn/blob/a24c8b464d094d2c468a16ea9f8bf8d42d949f84/sklearn/ensemble/weight_boosting.py#L572
      Whichever way helps you think about it more!
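      A minimal sketch of tuning that trade-off with CV (toy data, illustrative grid): smaller learning_rate values shrink each boosting step, so they generally pair with larger n_estimators.

      from sklearn.datasets import make_classification
      from sklearn.ensemble import AdaBoostClassifier
      from sklearn.model_selection import GridSearchCV

      X, y = make_classification(n_samples=1000, random_state=0)

      grid = GridSearchCV(
          AdaBoostClassifier(random_state=0),
          param_grid={"learning_rate": [0.05, 0.1, 0.5, 1.0],
                      "n_estimators": [50, 200, 500]},
          cv=5,
      )
      grid.fit(X, y)
      print(grid.best_params_, grid.best_score_)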

  • @aniskaib · 5 years ago

    Great video, love your content! Quick question: how can we use the VotingClassifier with different models created using Keras, rather than the MLPClassifier provided by scikit-learn?

    • @DataTalks · 5 years ago

      Awesome question! Keras has a great wrapper that can let you do this out of the box: keras.io/scikit-learn-api/
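      A rough sketch of that wrapper approach (the network and data shapes are illustrative; this uses the older keras.wrappers API the link refers to - newer code would use the separate scikeras package, whose KerasClassifier is used the same way):

      from keras.models import Sequential
      from keras.layers import Dense
      from keras.wrappers.scikit_learn import KerasClassifier
      from sklearn.ensemble import VotingClassifier
      from sklearn.linear_model import LogisticRegression

      def build_net():
          # Tiny illustrative network for a 20-feature binary problem.
          model = Sequential([Dense(32, activation="relu", input_shape=(20,)),
                              Dense(1, activation="sigmoid")])
          model.compile(optimizer="adam", loss="binary_crossentropy")
          return model

      net = KerasClassifier(build_fn=build_net, epochs=10, batch_size=32, verbose=0)

      vote = VotingClassifier(estimators=[("lr", LogisticRegression(max_iter=1000)),
                                          ("net", net)],
                              voting="soft")
      # vote.fit(X_train, y_train) then works like any other scikit-learn estimator.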

  • @rileyhun3710 · 6 years ago

    When you reached 100% accuracy, how did you know you didn't overfit?

    • @DataTalks · 6 years ago

      Great question! We would need to use a validation set/test set in order to know. In the above I was more just showing off how to use sklearn's ensemble methods than how to do data science with them. Definitely tune into my Data Science Foundations lessons that are coming out in the next few months!
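      A minimal sketch of that check (toy data): hold out a test set and compare its score against the training score, which is the number that can hit 100% while the model is overfit.

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split

      X, y = make_classification(n_samples=1000, random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

      rf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
      print("train accuracy:", rf.score(X_train, y_train))  # often ~1.0 for a forest
      print("test accuracy :", rf.score(X_test, y_test))    # the honest estimate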

  • @prakashdahal2560 · 4 years ago

    What about XGBoost?

    • @DataTalks · 4 years ago

      My recommendation is to use catboost as your boosting tool! I'll go over these in another video too :)
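      A minimal sketch of that suggestion (illustrative settings; CatBoost is a separate package, not part of scikit-learn, but it follows the same fit/predict interface):

      from catboost import CatBoostClassifier

      model = CatBoostClassifier(iterations=500, learning_rate=0.1, verbose=0)
      # model.fit(X_train, y_train); model.predict(X_test) as with sklearn estimators.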

  • @mufakkirhussain2816 · 5 years ago

    Man your voice is really unclear...

    • @DataTalks · 5 years ago

      Thanks for the feedback - I'll work on improving the audio quality!