Decision and Classification Trees, Clearly Explained!!!
- Published on 24 May 2024
- Decision trees are part of the foundation for Machine Learning. Although they are quite simple, they are very flexible and pop up in a very wide variety of situations. This StatQuest covers all the basics and shows you how to create a new tree from scratch, one step at a time.
NOTE: This is an updated and revised version of the Decision Tree StatQuest that I made back in 2018. It is my hope that this new version does a better job answering some of the most frequently asked questions about the old one.
Note, you may also want to learn about...
Regression Trees: • Regression Trees, Clea...
Bias and Variance (and overfitting): • Machine Learning Funda...
Cross Validation: • Machine Learning Funda...
Pruning Trees: • How to Prune Regressio...
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying my book, The StatQuest Illustrated Guide to Machine Learning:
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
YouTube Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
0:18 Basic decision tree concepts
3:16 Building a tree with Gini Impurity
9:15 Numeric and continuous variables
12:35 Adding branches
13:56 Adding leaves
14:32 Defining output values
15:12 Using the tree
15:38 How to prevent overfitting
#StatQuest #decisiontree #ML
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
Awesome work!
The very model of clarity. Thanks :)
Hi Josh, great video! I have one question.
When you are calculating the total Gini impurity based on the weighted average, why is it not (1 - weight)*Gini instead of (weight)*Gini?
Since we want to minimize Gini, wouldn't we want the leaf with the largest sample size to have its contribution to the overall Gini reduced (as in (1 - weight)*Gini) rather than increased (as in (weight)*Gini)?
Thanks!
@@haoyuanliu8034 The more data we have to support something, the more trust we have that that something is correct. Likewise, if I don't have much data to support something, then I should probably have less confidence that that something is correct. And that's what we're doing here. The more observations we have in a leaf, the more data we have to support the predictions made by that leaf. Thus, the weight amplifies the Gini value for the leaf with the most data.
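As a rough illustration of the weighting described in the reply above, here is a minimal Python sketch. The function names `gini` and `weighted_gini` are hypothetical, and the leaf counts are made up for this example (chosen so that the total comes out at the 0.405 value quoted elsewhere on this page); none of it is taken from the video's own code.

```python
def gini(counts):
    """Gini impurity of one leaf, given the number of samples in each class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(leaves):
    """Total Gini impurity of a split: each leaf's impurity is weighted by
    the fraction of all samples that ended up in that leaf."""
    n = sum(sum(counts) for counts in leaves)
    return sum((sum(counts) / n) * gini(counts) for counts in leaves)


# Hypothetical split: (yes, no) counts in the left and right leaves.
leaves = [(1, 3), (2, 1)]
total_impurity = weighted_gini(leaves)
print(round(total_impurity, 3))  # 0.405
```

Weighting by `(1 - weight)` instead would let the leaf with the fewest observations dominate the total, which is the opposite of the reasoning in the reply.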
The complexity of understanding the concepts and explaining them so simply show what a great teacher Josh is.
Thank you! :)
Josh, I just finished watching absolutely all your videos on this channel. Congratulations for them, you are the best!
WOW!!! That's a lot of videos! Thank you very much! :)
how much are you sponsoring? LOL
why did you do that, What do you do?
@@amaarmarco530 I am Data Scientist, and I wanted to have a really good knowledge about statistics
Haven't watched them all yet but probably will. And even though you have received, and will keep receiving, plenty of compliments, it's always worth thanking you again for this amazing job!
Save the environment by planting some decision trees
bam!
Environmentally friendly decisions
@@statquest Double bam 🎉
I am half way through your Machine Learning playlist. It has been so helpful and resourceful, I can't put into words. Thank you Josh.
Thank you very much! :)
I'd never seen a youtube comment section so full of thankful, enlightened and happy people. You must have revolutionized teaching. Thank you Josh, for these excellent videos. You rock!
Thank you! :)
I never watched Andrew NG's OG course.... i just come back to these videos if I have any doubts or if I need to refresh my knowledge. Thanks a lot josh ;)
Bam! :)
If "love at first sight" is real, then this video made me love your way of teaching!
Hooray! Thank you very much! :)
This channel helped me a loooot! It helps me from researching to looking for a job, from recreating myself to exploring the field of statistics and machine learning. You are the best! I can't express my gratefulness in words!
Thank you very much! :)
Yet another fantastic stat-quest, Loved these for my class for Deep learning. Keep up the good work!
Thanks, will do!
I will spend my first salary from the job on buying your merch and supporting your channel. You are just a great prof.
Hooray!!! Thank you very much! :)
Hey bhai, I need help from you to understand these concepts for my assignment. Can you contact me?
Thank you Josh, there is no channel on YouTube (or maybe on the Internet) that explains these topics as niftily as you do.
Thanks!
I am so happy I found this video. Thank you for making it. It is so clear how the decision tree actually works.
Glad it was helpful!
You are the REAL GOAT! The best and most intuitive textbook is ISL and your YouTube video makes this even better. Hats off to your hard work.
Wow, thanks!
Brilliant, thanks Josh - exactly the slow and steady explanation I needed
Thanks!
This is one of best videos on Decision Trees on the internet. Thanks Josh!
Thank you!
I just became a fan of yours... the way you teach complicated things with humour and fun, it's simply amazing...
Thank you so much 😀
BAM!
Oh, great video. Wish my lecturers had the same knowledge of this topic as you... Thanks man!
Glad it was helpful!
Hi Josh, your videos are the best for understanding how machine learning algorithms work in the simplest way!!!
Thank you very much! :)
I'm not used to commenting on YouTube videos, but this one for sure deserves it. Thanks so much for the explanation, and keep up the good work!
Thanks, will do!
Thank you SO much Josh. This has been the most helpful guide on decision trees I have come across. :)
Glad it was helpful!
About two weeks ago I was trying to learn how the best split is found on numerical data. I was using Python for this and was always setting the split space with np.linspace, but the way you showed, averaging adjacent values in a sorted list, is very intuitive. If I had watched this video first, it would have saved me a few days of learning how to manually calculate information gain and the best split to better understand how a DT works. Great video!
Thank you!
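The sorted-and-averaged approach mentioned in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`candidate_thresholds`, `best_threshold`) and the age data are hypothetical, and the split is scored with weighted Gini impurity rather than the information gain the commenter mentions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def candidate_thresholds(values):
    """Midpoints between adjacent distinct values in sorted order."""
    ordered = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


def best_threshold(values, labels):
    """Return (threshold, weighted Gini) for the lowest-impurity split,
    or None if there are fewer than two distinct values."""
    n = len(values)
    best = None
    for t in candidate_thresholds(values):
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        impurity = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or impurity < best[1]:
            best = (t, impurity)
    return best


# Hypothetical data: ages and whether each person loves the movie.
ages = [7, 12, 18, 35, 38, 50, 83]
loves = ["No", "No", "Yes", "Yes", "Yes", "No", "No"]
print(candidate_thresholds(ages))
print(best_threshold(ages, loves))
```

Compared with an evenly spaced grid such as `np.linspace`, the midpoints give exactly one candidate per gap between observed values, so no two candidates produce the same split and no gap is skipped.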
The best teacher by far I've ever seen in my life! Thank you Josh!
Thanks!
The explanation is so simple and rewarding too. Thank you.
bam!
your method of teaching is so simple, yet so amazing
Thank you very much!
Thank you a ton for these Josh, these explanations are super clear. Love the humor too.
Thank you very much! :)
I love this video ( in the same spirit of many other of your machine learning algorithm videos) because after watching it, I actually managed to code a simple classification tree on my own to just solidify the things I learned here, and after watching this video, all the parameters in scikit-learn DecisionTreeClassifier are making sense to me. Most of the ML videos and many of the classes out there only talk about very generalized, high-level ideas of these models. You don’t. You always do such a great job giving clear yet detailed explanation of the nitty gritty of these models. Between the ISLP/ISLR books and your videos I am able to gain basic understanding beyond just making api calls of caret in R or sklearn in python. It really made me feel like I am learning, instead of just typing formulas on the keyboard. Could never thank you enough ❤❤❤
Hooray!!! I'm so glad you enjoy my videos. :)
Thank god, i found your video. You explained it so well, that I literally couldn't control jumping in happiness.
Glad it helped!
You are amazing. I really wish that University professors had the ability and drive to actually teach like this.
Thank you!
With this channel, going to class at school is just a waste of time. A great teacher Josh is.
Thanks!
I'd like to thank you so much for making this stream cast available!
Thanks!
This guy is amazing! I also love how he reminds us of what we were doing, why we were doing it and how we were doing it. Usually, halfway through my lectures I have forgotten where we came from and why we are doing what we're doing.
Can't see the forest for all those trees... B)
Thank you so much! I'm glad you like my videos. :)
All videos are golden. Thank you StatQuest!
Thank you!
amazing video, 18 minutes of your video conveys more useful information than a 3 hours lecture at my uni
Glad it helped!
You might be the best teacher I've ever had :D
Thank you very much! :)
Thanks in a million! Very well explained. This is the nth time that I am watching this again. Great content. Awesome. I couldn't find this explanation--simply put anywhere else. “Great teachers are hard to find”. Grade: A++ 💥
Thank you very much! :)
Josh, your videos help me a lot as a visual learner. Thanks.
Happy to help!
best tutor alive on earth. thanks man. appreciate your hard work for us.
Wow, thanks!
Your videos are absolutely amazing!
Thank you so much for this beautiful content!
Thank you very much! :)
The best teacher i ever had ....i will send a gift on teachers day Mr Josh!
Bam! :)
@@statquest Sir, can you please tell me how I should start ML as a beginner? Is this the place I should start from in your tutorials: th-cam.com/video/Gv9_4yMHFhI/w-d-xo.html
Josh, thanks for these videos and the awesome intros. Your channel really helps me study for my bioinformatics coursework and exams. Much love 💖
Good luck with your exams! :)
Thanks 🙏🏼💖
Bam!
Josh, I keep watching many videos on your channel! You can explain things in a simpler way! And your videos are inspiring. You are the best teacher I have ever had! We need more great channels like yours!
Double Bam!
Could you make a series of videos on different distributions? Topics like the gamma, chi-squared, beta, and Poisson distributions, etc. I could hardly find a YouTube channel that explains them clearly. :(
Triple Bam!
I know you did not “officially” teach statistics and just made videos for fun, but the world needs you to create more great videos lol! Your “To-do” list would be huge! Keep it up! :D
Thank you very much! :)
World's best video on Decision Tree Classifier 💚💚💚💚
Thank you!
Your explanations are the best!! Instead of teaching the mathematical abstraction first, you teach with a small step by step example that removes the abstraction complexity, so then when reading the formal explanation I can understand it much better. That's the best teaching method, keep doing it this way :D
Happy to help!
You are the best teacher for Stats and ML!
Thank you!
Thank you, Josh. Based on the methods you provided, I tried creating a Python function that calculates the Gini impurity for each independent variable. It really helped deepen my knowledge. Thanks again.
bam!
Your explanations are always the best. Thank you
Thank you! :)
Hey, thank you so much. Your video is easy to follow. I can tell that you put effort and heart into it!
Glad it was helpful!
this is BY FAR! one of the best explanations ever!
Thanks!
More important than teaching people statistics and machine learning, you teach people they are capable of understanding things they would have otherwise thought themselves incapable of understanding.
bam! :)
You are such an amazing teacher Josh!
Thank you! :)
this was the most exciting and crystal clear explanation . thanks a lot
Thank you! :)
Thank you so much for this, Josh! I am still fumbling around, trying to figure out the best analysis for my research, and I am new to classification trees and trying to see if they're appropriate for my data. This was a great introduction! I'd LOVE to see an R tutorial for Classification Trees, especially for small sample sizes. And maybe suggestions for what you do afterwards...I'm guessing you have to do more than just classification to build a final model, e.g., some kind of error test or goodness-of-fit?
You can use Cross Validation to get a sense of how well your model will perform: th-cam.com/video/fSytzGwwBVw/w-d-xo.html
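For readers who want to see what the cross validation suggested in the reply above looks like mechanically, here is a minimal pure-Python sketch of k-fold cross validation. Everything in it is an assumption made for illustration: the function names, the toy data, and the placeholder "model" that always predicts the most common training label. It also uses contiguous, unshuffled folds and assumes `k` is no larger than the number of samples; it is not how any particular library implements the idea.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds.
    Assumes 1 <= k <= n so that no fold is empty."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop


def cross_validate(xs, ys, k, fit, predict):
    """Accuracy on each held-out fold for a model given by fit/predict."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        hits = sum(predict(model, xs[i]) == ys[i] for i in test)
        scores.append(hits / len(test))
    return scores


# Placeholder model: always predict the most common training label.
def fit_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Hypothetical toy data.
xs = list(range(9))
ys = ["No", "Yes", "No", "No", "Yes", "No", "No", "Yes", "No"]
scores = cross_validate(xs, ys, 3, fit_majority, predict_majority)
print(scores)
```

Swapping `fit_majority` and `predict_majority` for functions that build and query a classification tree would give a per-fold estimate of how well the tree generalizes; with small samples, the spread between folds is usually as informative as the average.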
00:07 Decision trees and classification trees are explained in this video.
02:26 The classification tree helps in making decisions based on the given data.
04:39 The two main trees, Loves Popcorn and Loves Soda, contain impure leaves with mixtures of people who do and do not love Cool As Ice.
07:02 Calculate the Gini impurity for Loves Popcorn
09:22 The Gini impurity for Loves Popcorn is 0.405, and for Loves Soda is 0.214.
11:45 A decision tree is created based on features like love for soda and age
14:10 Build and use classification trees
16:29 Main ways to deal with this problem
Crafted by Merlin AI.
bam
Josh...I love you man ...you really making the concepts clear n easy for us.. thanks, thanks n big thanks...Lots of Love from my side and INDIA...
Thank you very much! :)
Love your explanation! Thank you so much for your videos!
Thanks!
Another amazing video! Keep up the good work. It helps a lot with studying :)
Thank you! Will do!
You are amazing. Great content, pleasing visuals, and great songs.
Thank you very much! :)
Thank you for explaining it in such an easy-to-understand way.
Thanks!
That was very informative. I can do my homework thanks to this video. Thank you so much.
Glad it was helpful!
Very nicely explained! I can't find better explanation than this!! Double Bam!!
Glad you enjoyed it!
Thank you for your explanation. It is so clear and easy to understand.
Glad it was helpful!
Great content, and thank you... Please consider making your website mobile friendly with a more modern appearance. I think that would help you get many more visits to your website... you are the best👍
Amazing video to learn Decision and Classification Trees from zero to hero!
bam!
It's an innovative way of teaching. Thanks for creating and uploading.
Thank you!
Your 2 vids, on KNN and this one, explain better than the 2 hr lecture and 1 hr lab done by my uni teacher.
thank you
Happy to help!
Simply brilliant! Thank you!
Thanks!
Couldn't resist thanking you a SECOND TIME!!
Double bam!!! :)
who is here not just for statistics but for English pronunciation as well? Clearly explained and clearly pronounced!!!!
Thank you very much! :)
Best channel on youtube, such a treasure!
Wow, thank you!
Perfect explanation. Thanks a lot!
Glad you enjoyed it!
wow...best explanation ever..I'm impressed. Thanks a lot
Glad you liked it!
such a simple and beautiful explanation...BAM!
Thank you!
What an awesome explanation, Thanks a lot
Glad it was helpful!
I don't know who you are, but man, you are the best instructor I have ever seen. I wish my math teacher had met you and taught us the same way you do 😍
Wow, thank you!
Please, I want you to know that you are something like the Jesus of statistics. The clarity of your explanations has no competition at all. Thank you.
Thank you!
Just bought the book, hope it would help you to continue your work 🧑🎓
@@ettoremiglioranza2959 Hooray!!! Thank you so much for supporting StatQuest!!!
You explained it great, thanks !
Thank you!
Hi Josh,
Many many thanks for your invaluable videos, which make complex concepts / models easier to understand!
Not sure if you ended up finding where Gini comes from, but Wikipedia has it as being named after an Italian mathematician:
"Gini impurity, Gini's diversity index, or Gini-Simpson Index in biodiversity research, is named after Italian mathematician Corrado Gini and used by the CART (classification and regression tree) algorithm for classification trees." (source: Wikipedia)
Thanks!
You know your content is fire when even the professor at our university used your videos in his lectures.
bam!
Hey, you are really a cool intellect. Thanks for the cool learning tutorials!
Thanks!
Thank you for this video. God bless you ❤️
Thanks!
Great video as usual. Thank you.
Thank you! :)
Splendid job!!🙌
Thank you!
I would be working for NASA by now if all my teachers/Profs were as good and concise as Mr Josh!
bam!
You're a hero Josh, thankyou 🧠
Thank you!
I love the knowledge and the humor in this channel.
Baaaam 😎
bam! :)
Wow great explanation! Veryy well done!! BAM!!
Glad you liked it!
I recently purchased your book. BTW, I'm from India. BAMM!!! It's nice.
Thank you very much for supporting StatQuest!!! BAM! :)
Starting to regret that I need to go to school 3 days per week, 5 hours per day, for machine learning. Listening to your videos beats every lecture.
bam!
Best explanation I have ever seen!!!
Thank you!
Thanks for explaining many complex concepts in the simplest way... Please upload more data science basics.
Thanks! I'll keep that in mind! :)
Hey Josh, I am about to go into my last exam before I graduate and this is the last video I'm watching for a topic that was covered in a day I missed
I'm sure you won't see this but thank you for all the help you've done
Thanks and good luck on your exam! Let me know how it goes.
CLEARLY EXPLAINED SIR THANK YOU!
Thanks!
Great video! Thanks!
Thanks!
Super tutorial. Thanks a lot!!!
Thank you!
Best explanation ever!
Thanks!
Hi Josh, great video!!!
Thank you very much! :)
How could anyone dislike "Cool As Ice?" Vanilla Ice is the man!
That is the eternal question! :)
Great job! Thanks, now I have an idea of how to explain it to my students in an easy way ;) I need to translate it, but it is good! Thanks!
Glad it was helpful!
excellent explanation!!
Thank you!