Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
Hi, I have a question regarding fitting the model. When we do model.fit, every training run uses a random set of samples. For example, on the iris dataset, I fit my model and then fine-tune with n_estimators = 10, 20, 100, etc. Sometimes it gets a 1.0 score at 20, but if I run it again, it gets 0.98. So how can I fix X_train and y_train so they do not change every time?
And I am really thankful for your lectures; I am learning day by day.
Thank you.
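A minimal sketch of one way to make the results repeatable (assuming the usual iris setup; variable names are illustrative): pass a fixed random_state to both train_test_split and the classifier.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
# A fixed random_state makes the split identical on every run
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# The forest itself is randomized too, so pin its seed as well
model = RandomForestClassifier(n_estimators=20, random_state=42)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # same number on every run now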
Keeping the tutorial part aside (which is great), I really love your sense of humor and it's an amazing way to make the video more engaging. Kudos!!
Also, thank you so much for imparting such great knowledge for free.
Thanks for your kind words and appreciation shankey 😊
github.com/codebasics/py/blob/master/ML/11_random_forest/Exercise/random_forest_exercise.ipynb
Complete machine learning tutorial playlist: th-cam.com/video/gmvvaobm7eQ/w-d-xo.html
The xlabel is the truth and the ylabel is the prediction, but in the video it is the reverse... Am I right? Because we take "confusion_matrix(y_test,y_predicted)".
@@lokeshplssl8795 I have the same question.
@@jiyabyju I figured it out
@@lokeshplssl8795 hope there is no mistake in code..
@@jiyabyju No mistake; he took y_predicted as the model's prediction on X_test.
Let's promote this channel.
I am just a humble Python hobbyist who took a local course, yet I still didn't understand most of what the lecturer said. Thanks to this channel, I've finally found fun with Python. In just two weeks (or so) I'm already at this level? Man...! Can't wait for neural networks, but only from this channel.
Sir, I am damn impressed by you!!!! You are the best ML instructor here on YT!!!!
The way you teach and explain the concepts is completely different. Thanks a lot! Please make more videos.
I cannot quite express how amazing your teaching is. I am doing a master's at one of the finest universities in America, and this is better than the supervised learning class I am taking there. Kudos! Please keep it up. I appreciate that you are making this available for free, although I would be willing to watch your lectures even for a fee.
Thanks for leaving the feedback aditya
Great Video! I'm working on my first project using machine learning and am learning so much from your videos!
Hey Alex, good luck on your project buddy. I am glad these tutorials are helpful to you :)
I can watch this type of videos whole day without take any break. Thank you!!!
I achieved an accuracy of .9736. Earlier, I got an accuracy of .9 when the test size was 0.2, and changing the number of trees wasn't changing the accuracy much. So I tweaked the test size to .25 and tried different numbers of trees. The best I got was .9736 with n_estimators = 60; criterion = 'entropy' gives a better result.
Thank you so much, sir, for the series. This is the best YouTube series on machine learning out there!!
The xlabel is the truth and the ylabel is the prediction, but in the video it is the reverse... Am I right? Because we take "confusion_matrix(y_test,y_predicted)".
@@lokeshplssl8795 I think I know why you are probably confused. This is not a plot chart. You should not assume that because you passed y_test as the first argument you would see it horizontally, the way you do with an xlabel.
Unfortunately, the confusion matrix is printed out unlabeled. True/actual/test values are aligned vertically and predicted ones horizontally.
A couple of videos earlier he used another library to demonstrate the matrix with labels.
If you have any questions regarding the confusion matrix, this is by far the best video: th-cam.com/video/8Oog7TXHvFY/w-d-xo.html
Also, a similar use case has to do with Bayesian statistics. Another great example: th-cam.com/video/-1dYY43DRMA/w-d-xo.html
You don't have to get into it since the software does it for you, but it helps to understand what is going on.
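For anyone who wants to see the orientation explicitly, here is a minimal sketch, assuming y_test and y_predicted from the iris exercise (the 'true:'/'pred:' prefixes are just illustrative labels):

import pandas as pd
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_predicted)
# Rows follow the first argument (truth), columns the second (prediction)
labeled = pd.DataFrame(cm,
                       index=['true: ' + n for n in iris.target_names],
                       columns=['pred: ' + n for n in iris.target_names])
print(labeled)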
This is a great series! Would you be interested in allowing us to repost it on our channel? We'll link to your channel in the description and comment section. Send me an email to discuss further: beau [at] [channelname]
mega.nz/file/LaozDBrI#iDkMIu6v-aL9fMsl-X1DETkOqnMqwptkn54Z51KINyw (like the data in this file). Help, if anyone understands.
Sir, can you tell me how to plot a random forest classification with multiple independent variables? I'm so confused about that.
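Not an official answer, but a minimal sketch of the usual workaround: with more than two independent variables the full boundary can't be drawn, so train on (or project down to) two features and color a grid by the predicted class. Assumes the iris data; all names are illustrative.

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X = iris.data[:, :2]          # keep only two features so we can plot
y = iris.target
model = RandomForestClassifier(n_estimators=30).fit(X, y)

# Color a fine grid by the predicted class to reveal the boundary
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 200),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 200))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()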
yes sure. go ahead. You can post it.
Thank you, sir, for this awesome explanation of RandomForestClassifier. I got a score of 1.0 for every increased value of n_estimators.
Nice work!
For Iris Datasets I got score =1 for n_estimators = 40,50,60
Thank you very much, sir.
Frankly speaking, your videos are neater and clearer than any other videos on YouTube.
Thanks Ramesh for your valuable feedback :)
again, just spectacular graphics and easy to understand explanation. thank you so much.
OK, so I read one comment and set test_size = 0.25 and n_estimators = 60. I reran my test sample cell as well as the model.fit and model.predict cells and got an accuracy of 100%. I am having a god complex right now. Thank you for this amazing series.
You are an excellent teacher! Thank you for sharing your knowledge! Greetings from Peru!
Another Great Video. Thanks for that. I got 1.0 as score with n_estimators=1000. Keep doing these kind of great videos. Thank you.
Anji, it's great you are getting such an excellent score. Good job 👍👏
FYI: if you are using version 0.22 or later, the default value of n_estimators changed from 10 to 100 in 0.22.
I got 93.33 accuracy at n_estimators = 30; after that, the accuracy did not increase with n_estimators. Thank you very much for the simply great explanation.
You sir, are a gem! Thank you for this series!
I managed to get an accuracy of 98%!
Did you use the same jupyter notebook version as he did?
I got 100% accuracy with default estimator and random_state=10. Thanks a lot Sir
Good job Praveen, that’s a pretty good score. Thanks for working on the exercise
n_estimators = 10, criterion = 'entropy' led to a 100% accurate model !! Thanks!
Great job Sagnik :) Thanks for working on exercise
@@codebasics My pleasure ! Amazing tutorials !! Been a great learning experience so far ! Cheers :)
Your teaching is superb, and your knowledge sharing with the data science community is noble.
I tried the exercise with criterion = "entropy" and got a score of 1.
This is a crash course; if you are in a hurry, this is the best series out there on YouTube.
It is good practice to make a for loop over n_estimators and check the score for each value:

import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier

scores = []
n_estimators = range(1, 51)  # example
for i in n_estimators:
    model = RandomForestClassifier(n_estimators=i)
    model.fit(X_train, y_train)
    scores.append(model.score(X_test, y_test))
    print('score: {}, n_estimators: {}'.format(scores[i-1], i))
plt.plot(n_estimators, scores)
plt.xlabel('n_estimators')
plt.ylabel('testing accuracy')

And then you can sort of see what's going on. This practice is very useful in the k-nearest neighbors technique for choosing k.
Thank you! I was looking for something like this. I think in the fourth line the i is missing, as in model = RandomForestClassifier(n_estimators=i).
@@cololalo yep forgot it thanks.
Thank you, I am trying to find something like this since the previous video!
This is the only channel I have subscribed to.
J Es, thanks. I am happy to have you as a subscriber 👍😊
I've done all the exercises up to here. But I was planning not to do the one for this video, until I saw your last picture! I don't want you to be angry, so I am going to do it right now!
Ha ha nice. Javad. Wish you all the best 🤓👍
Thank you very much! This tutorial is really amazing!
Just love your videos. I was struggling with Python; with your videos I was able to get everything in a week's time. I also completed the pandas and NumPy series. I would highly encourage you to start a machine learning course with some real-life projects.
I got 100% accuracy by changing criterion to "entropy"!
The xlabel is the truth and the ylabel is the prediction, but in the video it is the reverse... Am I right? Because we take "confusion_matrix(y_test,y_predicted)".
@@lokeshplssl8795 It doesn't change much; I mean, you are just transposing the confusion matrix. The info still remains the same.
Best explanation of Random Forest!!!!!!
I am happy this was helpful to you.
Hello sir, I have started learning pandas and ML from your channel, and I am amazed at the way you teach.
For Iris Datasets I got score =1 for n_estimators = 30
Great Vivek. I am glad you are working on exercise. Thanks 😊
Man, it's great! Your videos are the best I have ever seen about machine learning. It's very helpful material. I am waiting for you to make tutorials about gradient boosting and neural networks. I think you could explain them easily. Thanks!
I got an accuracy of 0.982579 with n_estimators = 100 (well, 100 is the default value now). Sir, big fan of your teaching 🙂
Good job Abhinav, that’s a pretty good score. Thanks for working on the exercise
@@codebasics Sir, I just wished to get in contact with you for proper guidance.
Thank you for such wonderful videos. I got an accuracy score of 1 on the exercise question.
Nice to watch your videos.. you make us understand things end to end !!
👍😊
It's nice to see you bhaiya again
Sir, I have done the exercise with 100% accuracy.
You are amazing brother. I really loved this. You made it so simple. Thank you so much.
Sir, I got score = 1.0 for n_estimators = 10 and random_state = 10.
Very nice explanation👌👌👌
Great score. Good job 👌👏
I am not afraid of you, but I respect you!
So I am gonna do the exercise right now!
This is so awesome explanation!! Thank you so much!!!
Amazing man, keep it up and share more tutorial like this.
Amazing, I like how you explain simply
I am afraid of you, I did the exercise 😂😂😂😂😂😂😂😂😂😂😂😂
Amazing content, Thank you so much
This is sooo awesome! Amazing work sir💎
Sir, you are great. Thanks for these kinds of videos; please make more 😊😊😊😊
Sir make more videos
Thank you so much for very dynamic and clear content with the ideal depth on the topic details
Glad it was helpful!
Very nice sir.... Expecting more videos 😀
Glad it was helpful!
n_estimators = 1 (also 290 or bigger) even made the accuracy 100%, but as we all know, this type of dataset is prepared for learning purposes, so reaching 100% accuracy is easy.
Thank you sir... I got 100% accuracy with n_estimators = 90.
Good job Kapil, that’s a pretty good score. Thanks for working on the exercise
Hi sir, I did your exercise on the iris data and got an accuracy of 1.0 with n_estimators = 80.
You made that so simple thank you so much
Maybe I am a bit late jumping on the train; even so, I still want to say thank you for everything you have been doing. Your videos are much better for understanding the field than the courses of top-class universities such as MIT. I have to say that you outperform all your competitors in a very simple way. As far as I know, you had some problems with your health, and I hope everything is good now. Wish you good luck; stay healthy, at least for your YouTube community. ^_^
Hey Yea James, thanks for checking on my health. You are right, I was suffering from chronic ulcerative colitis, and last year, 2019, had been pretty rough. But guess what, I cured it using a raw vegan diet, ayurveda, and homeopathy. I have been 100% all right and symptom-free for almost 10 months now, and I am back in full force doing YouTube tutorials :)
@@codebasics Good to hear, Things are working out in a positive way! Be safe and I pray everything works well in the long run.
Jai SriRam
Nice explanation as always. Great work.
Nice videos, Your videos are the best..Keep doing
Jyothish, I am happy this was helpful to you.
With the default 100 n_estimators or with 20, each case gives 1.0 accuracy. Well, after getting on this channel, I can feel the warmth at the tips of my fingers.
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=30)
rf.fit(X_train, Y_train)
Output: RandomForestClassifier(n_estimators=30)
rf.score(X_test, Y_test)
output: 1.0
from sklearn.metrics import confusion_matrix
Y_pred = rf.predict(X_test)   # this step is needed before building the matrix
cm = confusion_matrix(Y_test, Y_pred)
cm
output:
array([[11,  0,  0],
       [ 0,  8,  0],
       [ 0,  0, 11]], dtype=int64)
Please upload frequently; we will wait for you.
Thank you for this tutorial. How do you visualize a random forest and a decision tree?
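A minimal sketch of one way, assuming a forest rf fitted on iris as in the comment above: sklearn's plot_tree can draw any single tree from the forest.

import matplotlib.pyplot as plt
from sklearn.tree import plot_tree

plt.figure(figsize=(12, 6))
plot_tree(rf.estimators_[0],              # the first tree in the forest
          feature_names=iris.feature_names,
          class_names=list(iris.target_names),
          filled=True)
plt.show()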
very Great video!!!!! thanks
glad you liked it Han
100% accuracy on the given exercise. I used n_estimators = 1
That’s the way to go Iradukunda, good job working on that exercise
Love your videos. They're helping me a lot. thanks
Hey Mohamed,
Thanks for nice comment. Stay in touch for more videos.
This video was amazing. Thanks!
You are a god to me for helping me with my PhD.
🙌Woohoo! So glad it hit the mark for you! 😃
great videos! thank you so much
Thank you so much. I need some help on this classifier for my data set. This helped a lot.
Glad it helped!
Excellent. Thank you.
Thank you so much for this tutorial
My accuracy score is 0.9666667 with n_estimators = 40.
That's a great score larry. Good job 👍👏
I got 0.9333 with 90 trees. Thanks!
Good job Коробка, that’s a pretty good score. Thanks for working on the exercise
model = RandomForestClassifier(n_estimators=10, criterion='entropy'): while splitting the data, I used a random_state of 1000. After that, even while tuning the model to different specs, the accuracy I got was 93.33%.
Bro, may I know where you solved this question? I mean, the version of Jupyter Notebook sir was using is not available now, I guess. So where should we solve this problem?
I can't get any better than 93.3333333% on the exercise even with more n_estimators.
YOU NEED MORE AND MORE VIEWS SIR
I got 100% accuracy after tuning the parameters and train test split for the iris dataset
test_size=0.2, n_estimators=20, random_state=2
This video is very helpful.
Glad it was helpful!
Hi Sir,
Can we use any other model (eg: svm) with the random forest approach, that is, by creating an ensemble out of 10 svm models and getting a majority vote?
Thank you for the wonderful video.
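One way to sketch this with scikit-learn is BaggingClassifier, which trains copies of any base estimator on bootstrap samples and takes a majority vote; this is an assumption about what you are after, not something from the video (X_train etc. are from the earlier split):

from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

# 10 SVMs, each fit on a bootstrap sample; predict() takes a vote
# (on scikit-learn older than 1.2 the argument is named base_estimator)
model = BaggingClassifier(estimator=SVC(), n_estimators=10)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))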
1) test_size=0.2, RandomForestClassifier(n_estimators=10, criterion='gini'): model.score = 1
2) test_size=0.2, RandomForestClassifier(n_estimators=10, criterion='entropy'): model.score = 1
3) test_size=0.35, RandomForestClassifier(n_estimators=10, criterion='gini'): model.score = 0.9811
4) test_size=0.35, RandomForestClassifier(n_estimators=10, criterion='entropy'): model.score = 0.9811
train_test_split: test size is 20% and the random state is 32.
1. With the default n_estimators, the test score is 0.96.
2. The best test score is 1.0, with n_estimators = 3.
I got the perfect score of 1 when I set n_estimators to 40, although the selection of train/test data would also have contributed to the accuracy of the model.
Bro, may I know if you used the same version as Dhaval sir?
Hi Sir, we are blessed that we got your videos on youtube. Your videos are unmatchable. I am interested in your upcoming python course. When can I expect starting of the course?
Python course is launching in June, 2022. Not sure about exact date though
The default value of n_estimators changed from 10 to 100 in version 0.22 of sklearn. I got an accuracy of 95.56 with n_estimators = 10, and the same for 100.
Again, a nice video from you.
Sir, I have one general question: what is random_state, and why do we sometimes take 0 and sometimes assign another value to it? What is its significance?
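A minimal sketch of what random_state does (the specific value 0, 10, 42, etc. has no special meaning; any fixed seed gives the same split on every run):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
# Same seed, same shuffle: both splits are identical
a = train_test_split(iris.data, iris.target, random_state=0)
b = train_test_split(iris.data, iris.target, random_state=0)
print((a[0] == b[0]).all())   # True
# Omit random_state and every run shuffles differently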
Thanks a lot, sir, for the videos. I want to know when to use a random forest versus just a single tree.
you're so much fun dude
Very nice tutorial
Glad it was helpful!
great, thank you so much!!!!
I got a 1.0 score on my training as well as my test set by setting the number of trees to 5 and the criterion to 'gini'.
excellent lesson!
Glad it was helpful!
Thanks for another post; it's really helpful. Just a question: given that a random forest takes the majority decision from multiple decision trees, does that imply random forest is better than the decision tree algorithm? How do we decide when to use a decision tree versus a random forest?
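Not a definitive answer, but one practical way to decide is to cross-validate both on your data. A minimal sketch, assuming the iris dataset:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree_acc = cross_val_score(DecisionTreeClassifier(),
                           iris.data, iris.target, cv=5).mean()
forest_acc = cross_val_score(RandomForestClassifier(n_estimators=30),
                             iris.data, iris.target, cv=5).mean()
print(tree_acc, forest_acc)   # pick whichever generalizes better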
Nice work.
Amazing sir
Good Evening sir,
I hope you are doing well.
At n_estimators = 50 I am getting a higher score.
Thank You
Solved the exercise problem.
With model = RandomForestClassifier(n_estimators=10) got an accuracy of 0.96667
and with model = RandomForestClassifier(n_estimators=20) got 1.0
Anuj, good job 👍👏👌
@@codebasics Thanks sir! Its all because of you 😊
"that's me in the picture" 🤣🤣🤣
Your way of teaching is very good. Sir, please make a video on how to feed our own images to it, i.e., how to convert our images into a dataset like MNIST, since the real benefit comes when we can use our own images.
Sure I am going to add image classification tutorial.
@@codebasics thanks sir
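In the meantime, a minimal sketch of one way to turn your own image into an MNIST-like feature row; Pillow is assumed installed, and 'digit.png' is a hypothetical file name:

import numpy as np
from PIL import Image

img = Image.open('digit.png').convert('L')   # grayscale, like MNIST
img = img.resize((28, 28))                   # MNIST digits are 28x28
pixels = np.array(img).reshape(1, -1)        # one flat feature row
print(pixels.shape)                          # (1, 784), ready for predict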
I did the exercise and I got a score of 0.9 with 20 estimators, 0.93 with 50 estimators and 0.9 with 100 estimators
On Iris data exercise,
best_score = 0.95
best_parameters = {'criterion': 'gini', 'n_estimators': 95}
I got a hundred percent score by using n_estimators = 50 and a test size of 0.2!
Excuse me, may I know where you solved this question? Did you solve it using the same version of Jupyter Notebook that Dhaval sir had?
Sir, how are you deciding the xlabel and ylabel in the heatmap?
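The labels are set manually to match how confusion_matrix arranges the counts (rows = truth, columns = predicted). A minimal sketch, assuming cm = confusion_matrix(y_test, y_predicted) as in the exercise:

import matplotlib.pyplot as plt
import seaborn as sn

sn.heatmap(cm, annot=True)
plt.xlabel('Predicted')   # columns of confusion_matrix(y_test, y_predicted)
plt.ylabel('Truth')       # rows hold the true labels
plt.show()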