Correction:
7:28 My wording could be improved. The True Negatives for Troll 2 are the people who were correctly predicted to not like Troll 2.
Support StatQuest by buying my books The StatQuest Illustrated Guide to Machine Learning, The StatQuest Illustrated Guide to Neural Networks and AI, or a Study Guide or Merch!!! statquest.org/statquest-store/
Thanks for this comment, I was wondering about that during the video.
Thank you for all the great videos! I have learned a lot from you. Will we see a video on Partial Least Squares (regression / SEM) for ML? There are some videos on TH-cam but are all very confusing.
@@LockyLawPhD I'll keep it in mind.
And for those who wonder, yes, "Sensitivity" is "Recall".
Hi Josh,
Love your content. Has helped me to learn a lot & grow. You are doing an awesome work. Please continue to do so.
I wanted to support you, but unfortunately your PayPal link seems to be broken. Please update it.
StatQuest should be mentioned in my future Data Scientist Master Diploma. Seriously! Every time I get introduced to a new concept, I come here and.. BAM! it all makes sense.
Awesome! Good luck with your Master Diploma. :)
@@statquest I have learned more from watching your videos than from my teacher at the university.
How's your master's going now, ma'am? I'm starting one too in 2021 :p
Same for me with my Bioinformatics master's! He's awesome
same thing for me. Thank you
Having trained myself on other StatQuest video, I predicted that I would understand this video and I actually understood this video, so that looks like a True Positive right there.
BAM! :)
It isn't English that u r speaking. U r speaking ML
@@littleKingSolomon I read ML as machine language 😂
This channel is a gem. Glad I found it!
Awesome! Thank you very much. :)
Watched the confusion matrix StatQuest, and I forgot to like. So I paused in the middle of this one, went back and liked, then came back. Because u deserve more than just likes. But for now that's all I have to offer. God bless u and thanks
Wow, thanks!
We can't thank you enough. Truly amazing videos. I've never had a teacher who explained lessons this well. I wish you continued success.
Thank you!
Your channel is honestly gold... I used to think stats is a mystery and it didn't make intuitive sense until I met your channel - now everything makes sense and I enjoy using them. Thank you for making these free and accessible, I really appreciate your effort, dedication and intelligence
Thank you very much! :)
I listen to ur Quests at 2x speed and yet they are super uber crystal clear. god speed man
Nice! :)
I actually download it and listen at 4x speed. Still uber crystal clear!
No one teaches like you, thanks for making such a playlist.
Thank you! :)
Your videos are always astoundingly clear and well explained! In fact, you've inspired me to the point of being the first TH-camr I support financially!
WOW! Thank you very much! I really appreciate the support. :)
I only wish that I had found your amazing videos at the start of the semester. Oh well, better late than never, so BAM anyway!!
Thanks! :)
Out of curiosity, how'd the rest of the semester go?
Great explanation!
I would like to tell you that our Prof in the college uses your content and suggests it as a reference to understand ML.
Thanks a lot for your effort!
Really the best TH-cam channel .
Thank you very much!
I'm lucky to stumble on this channel before my final exam! How cool is your teaching...wow!
Good luck on your exam!
I'm really sad that more people don't support videos like these. I really think you deserve more.
Thank you! :)
All the videos on this channel are very well made and extremely good... Thank you so very much... I never knew I could get such incredible videos for free... I will share this channel with everyone tomorrow.
Thank you very much!
Josh Starmer you're awesome and you're really helping the community so much. Many thanks to you sir
Thanks! I'm glad you like some of my videos.
Why didn't I find this channel before?? It's a gem!!
bam!
That's what I call a great explanation! TRIPLE BAM! Thanks, Josh! Greetings from Brazil!
Muito obrigado! (Thank you very much!)
This is my go to channel for any stats related knowledge. Thanks Josh for this exceptional channel👏👏
Wow, thanks!
To be 100% honest, I wasn't a big fan of the intro song hahaha, but I gotta admit, whenever I see a new concept or have a difficult time understanding a topic while studying, I come back to this channel. Great work!! It's sorta addictive now hahahaha
Bam!
Thanks!
WOW!!! THANK YOU VERY MUCH! BAM! :)
You are the doctor who saves my time. Much applause.
Thank you! :)
Wonderful, mate. Appreciate your work. Your explanation with examples is top class!!! Happy that I found this channel. Great work. Thanks a lot for enlightening us on ML concepts.
Awesome, thank you!
@@statquest BAM!
BAM! That's an amazing video. I completely get sensitivity vs specificity now. Thanks!
BAM!
I always thought of Sensitivity and Specificity as if they could only be calculated for a 2x2 table. It was a good point to apply them to a 3x3 one. Thanks!
bam!
Always precise and clear videos on intuition. Thank you Josh :) Enjoying it, Double BAM!!!!
Hooray! :)
Josh, I must confess you’re the best !!
I can’t thank you enough 🙌🙌
BAM! :)
Easily understood, clearly explained. This channel is precious for newbies. Thanks!
Thank you very much! :)
Before StatQuest, my path was like a false positive for machine learning, but after watching this channel my path became a true positive.
Bam! :)
The prelude is absolutely funny :) And the content is amazing, thanks StatQuest 👌
Thanks! :)
You are better than all of my undergrad and grad professors :D
Thanks!
Thanks for the video, super clear. I am just dumb and had some difficulty making sense of True Positive, False Positive, etc. I have summarised it as follows (see the sketch after this list). For an output X:
- True Positive : Actual = X, Predicted = X
- False Negative : Actual = X, Predicted != X
- True Negative : Actual != X, Predicted != X
- False Positive : Actual != X, Predicted = X
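If it helps to see that summary as code, here is a minimal Python sketch of the four counts; the labels and lists are made up for illustration:

```python
# Hypothetical labels: count TP, FN, TN, FP for one class "X"
# from parallel lists of actual and predicted values.
actual    = ["X", "X", "Y", "Y", "X", "Y"]
predicted = ["X", "Y", "Y", "X", "X", "Y"]

tp = sum(a == "X" and p == "X" for a, p in zip(actual, predicted))  # true positives
fn = sum(a == "X" and p != "X" for a, p in zip(actual, predicted))  # false negatives
tn = sum(a != "X" and p != "X" for a, p in zip(actual, predicted))  # true negatives
fp = sum(a != "X" and p == "X" for a, p in zip(actual, predicted))  # false positives

print(tp, fn, tn, fp)  # 2 1 2 1
```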
Thanks, Josh, you explained the concept very clearly.
Glad it was helpful!
2:10 specificity example from a 2x2 confusion matrix
Sensitivity = true positives identified by the model / (true positives + false negatives)
Specificity = true negatives identified by the model / (true negatives + false positives)
4:50 using specificity and sensitivity to compare models
TODO: continue
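A minimal sketch of those two formulas in Python, using made-up counts from a hypothetical 2x2 confusion matrix:

```python
# Made-up counts from a hypothetical 2x2 confusion matrix.
tp, fn, tn, fp = 142, 29, 110, 22

sensitivity = tp / (tp + fn)  # fraction of actual positives correctly identified
specificity = tn / (tn + fp)  # fraction of actual negatives correctly identified

print(f"sensitivity = {sensitivity:.3f}")  # sensitivity = 0.830
print(f"specificity = {specificity:.3f}")  # specificity = 0.833
```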
Thank god that this video exists !
bam! :)
Gave practical application in a meaningful way.
Thanks!
Great explanation! The confusion matrix is really confusing... need to go over this video several times.
bam! :)
You explained it so simply ! Just loved this tutorial. Thanks Josh :)
Glad you liked it!!
Best explanation I've ever come across... thanks a lot!
Glad it was helpful!
Amazing class. Way to go! Salute from Brazil.
Muito obrigado!
thank you very much for your clear and appealing explanation.
Thank you!
Hello, my suggestion for a video missing from your channel: the harmonic mean and the F-score for recall and precision. Or at least, when I searched your channel I haven't found a video on this topic.
Thanks for the amazing work you are doing.
Great suggestion!
I am really thirsty and watching StatQuest! Thank you, sire.
BAM! :)
Troll 2 is the best worst movie ever! Love that you included it as an example.
BAM! :)
Thank you so much for fantastic lessons :). Love from TN
Bam! :)
@statQuest Pls make a video on Precision as well - with love from India
Noted
You made again a great video! Thank you. Maybe you could have mentioned precision and recall (=sensitivity) as both terms are often used, too.
Thanks! I'll put that on the to-do list.
I rarely see specificity as an evaluation metric. It seems precision and sensitivity are more commonly used. However, it's very good to know this information!
I should have a precision video out soon.
@@statquest That'd be great! I look forward to your newer videos!
Love your videos.
Correction at 10:16 - The sensitivity for Cool as Ice is not 17 / (17 + 175)
nice!
Thank you for saving my life!
bam!
Hi Josh. Let's say one model had better Sensitivity, whereas another had better Specificity. Do you foresee any problems or downsides to data integrity in using different models, each for the measure on which it scores highest? For example, in your 2x2 confusion matrix @4:38, using your Logistic Regression model for Specificity measures and the Random Forest model for Sensitivity measures?
Not that I know of off the top of my head.
@@statquest Okay, that's good enough for me! Thanks Josh.
This video is about how to calculate sensitivity and specificity.
Sensitivity = True Positives / (True Positives + False Negatives)
Specificity = True Negatives / (True Negatives + False Positives)
For the case where we have more than two categories, we get these two numbers for each specific category.
bam!
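To make the "one number per category" idea concrete, here is a sketch of one-vs-rest sensitivity and specificity for a 3x3 confusion matrix. The numbers come from the matrix at 5:43 (also transcribed in a comment further down); I'm assuming rows are predicted, columns are actual, and Troll 2 is the first category, which matches the correction at the top:

```python
# One-vs-rest sensitivity and specificity for each class of a
# multi-class confusion matrix (rows = predicted, columns = actual).
labels = ["Troll 2", "Gore Police", "Cool as Ice"]
cm = [[12, 102, 93],
      [112, 23, 77],
      [83, 92, 17]]

n = sum(sum(row) for row in cm)  # 611 people in total
for i, label in enumerate(labels):
    tp = cm[i][i]                              # the diagonal entry
    fn = sum(cm[r][i] for r in range(3)) - tp  # rest of the actual column
    fp = sum(cm[i]) - tp                       # rest of the predicted row
    tn = n - tp - fn - fp                      # everything else
    print(f"{label}: sensitivity = {tp / (tp + fn):.2f}, "
          f"specificity = {tn / (tn + fp):.2f}")
```

For Troll 2 this gives TN = 23 + 77 + 92 + 17 = 209, matching the correction above.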
This channel is awesome and the content is so great.
Thank you so much 😀
Thank you sooooo much 💖
You’re welcome 😊!
Please, keep making these amazing videos!!!
I will! :)
Your videos are awesome, thanks
Glad you like them!
You're awesome Josh. Keep going! I have recommended your channel to all those bitten by sadistics... ahem! statistics!
Awesome! Thank you!
such a great explanation and content. thanks a lot
Thank you! :)
Josh Starmer, You rock!
Thanks!
I just discovered your videos yesterday and I subscribed immediately. Thanks a lot!
Could you please make a video on precision and recall? Actually, what I would really like to see, more than just the calculations, is help understanding when sensitivity vs. specificity is more useful than precision vs. recall and vice versa. I imagine that a purely calculations video on precision and recall would be very similar to this one on sensitivity and specificity, but knowing when each statistic is more useful in real-life applications would be more helpful to me. You touched on that very briefly in this video when you mentioned which is preferred (random forest for higher sensitivity in detecting heart disease versus logistic regression for higher specificity in confirming that people don't have heart disease), but it would be helpful if you spent more time on those--to me, it is that kind of application that really brings statistics to life.
@techoji No, they're not. That's why I'm asking for a separate video.
Now I know when to use which classification model. Lovely, mate, thank you!
bam!
STATQUEST IS AWESOME!!
bam!
I swear this guy releases the coolest music
bam! :)
Quest on! Happy New Year! 👏👍👏
Hooray! Happy New Year! :)
FINALLY - I understand thank you!
Glad it helped!
Woww! It's very much clear. I love it
Thank you! 😃
Great BAM for this video!!!!
Thank you!!! :)
Thank you for the lesson)
Triple bam!!! :)
Sensitivity: true positives / (true positives + false negatives), basically TruePositivePredictions / TotalPositiveCases. Tells us about positives getting correctly identified (as positives).
Specificity: TrueNegativePredictions / TotalNegativeCases, or true negatives / (true negatives + false positives). Tells us about negatives being correctly identified (as negatives).
What if you are hoping to predict something continuous - like the length of an object: Are sensitivity and specificity still used (perhaps by binning and tuning the sizes of bins)? I guess the metric moves from whether the model was right (True Positives and True Negatives) vs. wrong (False pos. and neg.) to how close the model performed to the true value. Would love to hear thoughts on what you would do to assess continuous models instead!
When we predict something continuous, we usually use the sum of squared residuals or the mean squared error.
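For what it's worth, a tiny sketch of those residual-based metrics, with made-up numbers:

```python
# Made-up continuous predictions: compare models by how close they get
# to the true values, rather than by sensitivity/specificity.
actual    = [4.0, 7.5, 3.2, 9.1]
predicted = [3.8, 8.0, 3.0, 8.7]

residuals = [a - p for a, p in zip(actual, predicted)]
ssr = sum(r ** 2 for r in residuals)  # sum of squared residuals
mse = ssr / len(residuals)            # mean squared error

print(f"SSR = {ssr:.3f}, MSE = {mse:.3f}")  # SSR ≈ 0.490, MSE ≈ 0.123
```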
Amazing Content! Keep growing.
Thank you very much! :)
Amazing videos and explanations! Thanks!
Thank you! :)
More power to you! :)
Thank you!
Amazing video. Thanks. What's the relationship between Precision/Recall and Specificity/Sensitivity?
I have spent so much time on this and now I finally understand it. Sensitivity and Recall are the same: they tell us how many of the actual positives were correctly predicted. Specificity tells us how many of the actual negatives were correctly predicted. Precision tells us how many of the values predicted to be positive actually are positive.
I am sorry if I confuse you further.
Thanks a million Shivam.
Thank you for the amazing explanation as usual. I have a small question: what is the difference between this metric and the precision/recall metric, and when should we prefer one over the other? Because my teachers just mentioned the latter metric.
I talk about that in my video on ROC graphs: th-cam.com/video/4jRBRDbJemM/w-d-xo.html
@@statquest Thank you so much
BAMM!! this guy is a legend.
Thanks!
Thanks for the video. Truly love it. Quick question 1 - You mentioned that if correctly identifying negatives is more important, then we should put more importance on Specificity. But how do we decide which class is positive and which is negative? For instance, I am working on a fraud detection problem and only 2% of transactions/examples are rejected. What would be my positive class and negative class in that case? Question 2 - Can we have a video on precision and recall? I find it very confusing as to which Precision and Recall I should be computing when working with imbalanced classification problems. Should I compute Precision and Recall for the Majority Class (accepted transactions) or should I compute these metrics for the Minority Class (rejected transactions)? Any insights on this would be really helpful.
First, you have to decide what you want to detect. If you want to be sensitive to the rejected transactions/examples, then you need a method that is sensitive to the minority class. I talk about this towards the end of my ROC video, so, for more information, see: th-cam.com/video/4jRBRDbJemM/w-d-xo.html
Your vids are awesome! Thank you!
Thank you! :)
thank you so much.
Bam! :)
0:21
4:05
6:04
11:01
well done!
:)
So good!
Thank you!
I am watching this video series for my final research project at the university. Since I cannot cite YouTube as a reference, can you mention a reference that you used in making these videos?
You can reference my book: www.amazon.com/dp/B0BLM4TLPY
Amazing one! How can the sensitivity and specificity metrics be made actionable for more classes or categories (400 classes)? Not sure which one to consider the most important for the bulk of the classes.
See: th-cam.com/video/4jRBRDbJemM/w-d-xo.html
Thanks for the video. Can you make videos on Bayesian statistics?
It's on the "to-do" list, but it won't be for a while.
Hey, can you make a video about Independent Component Analysis, and how this can be used for biological data, such as gene expression data?
Btw, my favorite channel. You rock.
Brilliant, @StatQuest with Josh Starmer.
It is explained so clearly by you.
Thanks!
At @5:43, are the sums of the rows (predicted) and the columns (actual) supposed to be the same? It was confusing for me.
Row sums: 12+102+93 = 207, 112+23+77 = 212, 83+92+17 = 192
Column sums: 12+112+83 = 207, 102+23+92 = 217, 93+77+17 = 187
In practice, you'll see different orientations of the confusion matrices, so it is very important to be flexible and adaptable.
just awesome
Thanks again!
Hi Josh, do you have a video about lift chart? Can't find it in your channel :(
Not yet.
When looking for the best model for our data, how does one decide when to use Cross validation and when to use the sensitivity and specificity from the confusion matrix?
You pretty much always use cross validation to create the data that we then summarize with a confusion matrix.
@@statquest Thanks a lot Josh!
What would be the definition of sensitivity and specificity in a case that isn't about disease? Let's say we need to classify between two classes: class A and class B. Which one is positive and which one is negative?
You decide which one to treat as positive and which to treat as negative.
Hey Josh, awesome videos! They help me understand the statistics and maths side of machine learning :)
I have a question based on your example for the heart disease prediction;
is it not possible to just use both models for predicting those 2 separate things? For example, using the Random Forest for predicting heart disease and Logistic Regression for predicting being free of heart disease? Or does that not make sense?
I'm fairly new to ML and would appreciate an answer :)
regards,
Usually people use several different machine learning methods as an "ensemble" and make the final classification based on votes from each method.
If you have a random forest predict if people have heart disease and logistic regression for predicting free of heart disease, then what do you do when the random forest does not predict that someone has heart disease and the logistic regression predicts that they do?
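A minimal sketch of that voting idea, with hypothetical model names and predictions:

```python
# Majority-vote ensemble: each model predicts a class, the most common wins.
from collections import Counter

def ensemble_predict(predictions):
    """Return the majority vote among the models' class predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical votes from, e.g., a random forest, logistic regression, and kNN.
votes = ["has disease", "no disease", "has disease"]
print(ensemble_predict(votes))  # has disease
```

With an even number of models a tie is possible, which is one reason an odd number of voters is convenient.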
@@statquest Thank you for the quick response :) I see what you mean, that could definitely be a problem. Appreciate the videos and your activeness! cheers mate.
BAM! i love your videos
Thank you very much! :)
False positive means you get a positive result when you are expecting a negative.
Thank you
:)
thank youuu
bam!
At 7:27, shouldn't the true negatives for Troll 2 be 23 + 17, the people who were CORRECTLY predicted to like Gore Police and Cool as Ice more than Troll 2?
I see your point, but I believe that in this case "True Negative", as calculated for Troll 2, is only relative to Troll 2. All 23 + 77 + 92 + 17 people were correctly predicted to not love Troll 2.
Ok thanks
So, if correctly identifying positives is our priority, i.e. reducing the misclassification of positives (reducing type 2 error), then we should compare models using sensitivity.
And if correctly identifying negatives is our priority, i.e. reducing the misclassification of negatives (reducing type 1 error), then we should compare models using specificity.
yep
Hello Josh,
Thanks for directly explaining Sensitivity and Specificity.
I read some terms about Precision and Recall in the Confusion Matrix,
and I am confused about using those concepts when choosing an algorithm for training.
When do we care about Precision/Recall versus Sensitivity/Specificity?
Sorry if my question is out of place in your flow!
Thank you and have a good day!
First see: th-cam.com/video/4jRBRDbJemM/w-d-xo.html
What are the False Positive rate and the False Negative rate in the simplest terms?
I talk about those in my video on ROC: th-cam.com/video/4jRBRDbJemM/w-d-xo.html