multinom is not working. I uploaded the CSV file, but the multinom function generates an error: multinom function require two classes or more. Please help me.
Hello sir, can you make a video on plotting an ROC curve for SVM? I am getting an error in my code. The error I am getting is: format of prediction is invalid. Thank you
Hi Sir, why did you use the multinom function here? Isn't multinom used only if the target variable has more than 2 categories, while in this video we have only 2 categories, yes or no?
When you run 'eval', which contains accuracy values for various cutoffs, you can see that different types of information are stored in various slots. The y.values slot contains the data on accuracy.
if we would have to plot a PR curve for the same data that you used to plot ROC curve, how would we do that? Can you please send me the code at puneetkaursidhu@gmail.com
Hi Sir, one question here. Can a ROCR curve be drawn for multiclass classification, e.g. when we have to classify the given data into 3 different classes? Thanks, Roopa
It is meant for only two categories. If you have three classes such as 1, 2, and 3, and your interest lies more in correctly classifying, say, class "3", then you can still form two classes: "3" and all others.
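A minimal sketch of the "3 versus others" idea from the reply above. The labels and the score are simulated (both are assumptions, not data from the video), and the AUC is computed with the rank formula in base R so no extra package is needed.

```r
# Hypothetical three-class labels and a made-up score for class "3".
set.seed(1)
cls   <- factor(sample(c("1", "2", "3"), 200, replace = TRUE))
score <- ifelse(cls == "3", rnorm(200, mean = 1), rnorm(200, mean = 0))

# Collapse to two classes: "3" versus everything else.
is3 <- as.integer(cls == "3")

# AUC via the rank (Mann-Whitney) formula.
auc <- function(score, label) {
  r  <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc_3_vs_rest <- auc(score, is3)
print(auc_3_vs_rest)
```

Repeating the recoding once per class gives a one-vs-rest AUC for each class.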
How can I check performance for an ordinal logistic regression model? Using this method I get this error: 'Error in prediction(pedi, heart_A$num) : Number of cross-validation runs must be equal for predictions and labels.'
I already checked that video but am not able to plot the ROC curve; I get the error "number of cross-validation runs must be equal for predictions and labels".
Sir... your explanation of ROC and AUC was very simple and easy to understand. It cleared all my doubts. Thanks a lot...
You are most welcome!
Excellent, Sir! Thanks a lot for presenting it so straightforwardly and consistently. For the first time, I could understand and reproduce the whole thing in R, regarding ROC Curve and AUC.
You are most welcome!
superb sir...phenomenal.....u make tough things look simple....proud of you boss
Thanks!
Thank you Sir, very kind of you to send the R code and the data, appreciate it. Your YouTube video here explains the concepts of ROC and AUC clearly, with a simple example. Well done.
Thanks!
Unbelievably helpful video - I've been searching all over internet for this. Thank you.
That's good to know!
A ROC Curve Tutorial for more than two classes with the 1 vs ALL approach
would be a very helpful video :).
Thanks for the suggestion, it's on my list now.
Short, simple and covers everything! Thank you!
Thanks for comments!
Dear Sir,
Your tutorial helps us all to learn about data science. I have learned many things from your tutorial. Now I want to learn how we can make ROC and AUC for multi-class. Could you make another video to teach us about multi-class model performance?
Thank you
Thanks for your comments, I'll add your suggestion to my list.
@@bkrai Sir, I have the same query. My data set has three classes; in that case how do I get the ROC curve and AUC?
Thank you sir... very clear and crisp explanation. In one video I got all the information. From the explanation in the video, I got how to find the cutoff for maximum accuracy, but by doing this only one class gets more weight in my dataset. How do I find a cutoff threshold that gives maximum sensitivity and maximum specificity?
You can get that using the ROC curve. The color on the curve shows the cutoff value as it changes from 0 to 1. You can identify the point on the curve that is closest to the ideal point (the top-left corner) and read off its cutoff.
Thank you sir
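One way to make "closest to the ideal point" concrete is to compute, for each cutoff, the distance from (sensitivity, specificity) to the perfect corner (1, 1) and pick the smallest. This is only a sketch in base R on simulated probabilities; the data, the cutoff grid, and the variable names are assumptions.

```r
# Simulated outcome and predicted probabilities (stand-ins for real model output).
set.seed(2)
y <- rbinom(300, 1, 0.4)
p <- plogis(-0.5 + 1.5 * y + rnorm(300))

# Sensitivity and specificity over a grid of cutoffs.
cutoffs <- seq(0.01, 0.99, by = 0.01)
sens <- sapply(cutoffs, function(k) mean(p[y == 1] > k))
spec <- sapply(cutoffs, function(k) mean(p[y == 0] <= k))

# Distance from each (sensitivity, specificity) pair to the ideal corner (1, 1).
dist_to_corner <- sqrt((1 - sens)^2 + (1 - spec)^2)
best <- which.min(dist_to_corner)
best_cutoff <- cutoffs[best]

print(c(cutoff = best_cutoff, sensitivity = sens[best], specificity = spec[best]))
```

Other criteria exist (for example maximizing sensitivity + specificity), and they can pick a slightly different cutoff.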
Very clear and impressive lecture! Thanks so much!
You're very welcome!
Thanks a lot sir... for such precise explanation of AU ROC curve. Truly appreciated.!
Thanks for comments!
it is very clear you know what you are doing.Thank you for your contribution !
Thanks for your feedback!
Yo that beat in the beginning was fire
Thanks :)
This was an amazingly clear approach. Thank you.
Great to hear your feedback!
Explained very neatly sir, appreciate if you can pls add dataset and code for learning please....
email id?
Thanks for your quick response sir, my emailid is avinashsinghemailid@gmail.com
all set.
We can use the Deducer package in R to directly run the ROC curve
library(Deducer)
mymodel
thanks!
Thank you so much for explaining in much much simpler way!!!!!
Thanks for your comments!
Thank you so much for your explanation, I could run my code and understand better the process.
Thanks for the feedback!
Very helpful video sir, thank you so much. I have a doubt: how do you fix the threshold value at 0.5?
Default threshold is already 0.5, there is no need to do anything for this.
Ok.. Thankyou sir.
Nicely explained. Sir, can you prepare a video on SVM for binary outcomes?
Try this.
th-cam.com/video/pS5gXENd3a4/w-d-xo.html
Sir, how does the ROC curve work when the dependent variable is not binary in nature, i.e. has more than 2 levels for which we have to model the data (note: but not continuous in nature)?
I think I'm the only one who isn't able to learn any computer language except R, and this all happened just because of you sir 🙂
Thanks for comments!
Hello Sir, I'm an avid viewer of your videos, which truly add value to our ML understanding. I have a quick question: once we determine the best cutoff value after evaluating model performance, should we go back and re-run the model performance with the best cutoff value in place of the 0.5 cutoff that we used as a rule of thumb?
Yes, that's correct.
Very nice and Excellent explanation. Could you please make another video to draw multiclass (more than 2 class) Roc curve. (one vs rest roc )?
You can refer to these:
th-cam.com/video/ftjNuPkPQB4/w-d-xo.html
th-cam.com/video/6SMrjEwFiQY/w-d-xo.html
Please do a video on sentiment analysis using R in detail... Deep dive analysis
Thanks for the suggestion, I'll probably do it sometime this month.
Hi,
Firstly, great video; this really helped me understand the ROC curve and implement it with my data in R. I am analysing diagnostic data for a master's degree research project. I wanted to know how to identify the cutoff value from the accuracy versus cutoff curve or the final ROC curve. The scale goes from 0 to 1, but my independent variable data ranges from 100 to 10^7. In short, how do I take the best cutoff value that this analysis outputs and relate/convert it to an exact cutoff value for my independent variable?
Thanks very much.
Cutoff value is for the dependent variable. If the dependent has two categories 'yes' and 'no', then default cutoff is probability = 0.5. ROC provides information on how prediction model performance will change if cutoff value changes from 0 to 1.
Very good video sir it is very helpful
Thanks for comments!
Thank you so much Sir..the video was really helpful in providing practical knowledge of dealing with predictive modelling problems in R..Can you please tell me how to apply weight of evidence/ fine classing in R - is there any ready made syntax?
Thanks for your feedback! I've also added your suggestion to my list.
Hi Professor, I love your videos, they are very interesting. I'm a PhD student, and sometimes I find it difficult to see the link between my own variables (concentrations of elements) and the variables that you work with. If you have well-explained documents on the data processing and analysis, I would be very grateful if you could send them to me. I would also like you to send me the data file.
email id?
very very helpfull !!!
im sending to some brazilians friends
Thanks and hope you find other videos helpful too!
Sir, can you please also attach dataset files/link along with your videos. This would greatly help us in learning by practicing with same data set. Thanks for great videos sir.
I've added a link in the description area below the video.
@@bkrai thanks.
Thank you for this great video! And thank you for the prompt reply. I have questions.
If we are doing machine learning, we need to create the ROC using the predictive model created by the test set, correct?
(In your "Logistic Regression with R" video, you created the predictive model using the test set. We need to validate the accuracy of the model.) Also, if I want to use the which.max function to plot the highest values on the eval plot, what code should I use?
Seeing this today. But roc curves can be done for both train and test data.
Sir, a) for multi-class, how will you come up with false positives and false negatives? b) How do you compute ROC for multi-class?
I'm adding this to my list. Thanks!
thanks so much sir
I'm new to your lectures and find them very interesting and useful. I have one question: how do you get the cutoff of .5 from the classification table at 5:39 of the video? Thanks
I figured it out and see it should be a default value. However, I'm still having an issue with the performance object eval. When I tried to print eval, it gave me this:
A performance instance
'Cutoff' vs. 'Accuracy' (alpha: 'none')
with 392 data points
Please help me get what you had on your screen. Thanks
You will see it better after plot.
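Printing a ROCR performance object only shows a summary; the numbers live in its slots. Below is a minimal sketch, on simulated data rather than the admissions data, of pulling the cutoffs and accuracies out of an object named eval (the name used in this thread) and locating the maximum with which.max. It assumes the ROCR package is installed.

```r
library(ROCR)   # assumes the ROCR package is installed

# Simulated outcome and predicted probabilities (stand-ins for real model output).
set.seed(3)
y <- rbinom(400, 1, 0.3)
p <- plogis(-1 + 2 * y + rnorm(400))

pred <- prediction(p, y)
eval <- performance(pred, "acc")   # accuracy at every cutoff

# Slots hold lists: x.values are the cutoffs, y.values the accuracies.
cut <- eval@x.values[[1]]
acc <- eval@y.values[[1]]

# Index of the highest accuracy, then the matching cutoff.
i <- which.max(acc)
best_acc    <- acc[i]
best_cutoff <- cut[i]
print(c(accuracy = best_acc, cutoff = best_cutoff))

# Accuracy versus cutoff, with the maximum marked by dashed lines.
plot(eval)
abline(h = best_acc, v = best_cutoff, lty = 2)
```

Running str(eval) lists all the slots if you want to see what else is stored.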
absolutely excellent explanation. thank you very much.
Thanks for comments!
I appreciate your help, excellent video 👏🙏
You are welcome!
Thank you for a great tutorial sir. Could you please share the dataset and the code ?
email id?
hamsini0992@gmail.com. Thank you
all set.
Great video and very well explained!
Thanks for comments!
Hi Sir,
Thanks for wonderful video.
Could you also make an AUC video where the dependent variable is continuous?
Thanks for the suggestion! I've added this to my list.
Thanks for the useful information. I would like to ask you if I can use ROC to measure the effectiveness of the prediction model? And can I use ROC in R software?
For model effectiveness you can use AUC. Also yes you can do ROC in R, this video gives you all the steps.
Dr. Bharatendra Rai thanks
for those who are looking for the data: stats.idre.ucla.edu/r/dae/logit-regression/
Thanks for sharing!
Hi Dr. Bharatendra Rai, would you be able to make a tutorial on building a logistic regression model using training and validation sets, with performance checking via ROC curve as you have done here? I know you posted one on linear regression, but I thought a logistic model would be very helpful too. Thank you!
You can get logistic regression from this link:
th-cam.com/video/AVx7Wc1CQ7Y/w-d-xo.html
ROC steps are already in the current lecture video.
Hello. This looks like the cutoff would be the probability of a certain student getting admitted based on the multinomial model. If I am working with a dataset with one independent variable contributing to my multinomial model, and am wanting to obtain a cutoff value from that independent variable (ie what is the cutoff value of SAT score that will best tell me if someone is admitted to college), what would I be changing in the code? Thank you.
I was also looking for the answer to this. Did you manage to find out?
Hi
Bharatendra, why don't you use glm()? I looked it up and it seems like multinom is used when the dependent has more than 2 levels. In your example, the dependent is admit (no, yes). That's why I'm confused about why you chose multinom() instead of glm(). Thank you.
multinom works for 2 or more. So when dependent variable has 2 levels, it should work fine.
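A small check of this on simulated data (the data frame d and the predictors x1 and x2 are made up for illustration): for a two-level outcome, glm() with family = binomial and multinom() fit the same logistic model, so their predicted probabilities should agree up to optimizer tolerance. Note the different type values: "response" for glm and "probs" for multinom.

```r
library(nnet)   # multinom(); nnet is a recommended package, usually installed with R

# Simulated two-level outcome with two hypothetical predictors.
set.seed(4)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
admit <- factor(rbinom(n, 1, plogis(-0.5 + x1 - 0.7 * x2)), labels = c("no", "yes"))
d <- data.frame(admit, x1, x2)

fit_glm   <- glm(admit ~ x1 + x2, data = d, family = binomial)
fit_multi <- multinom(admit ~ x1 + x2, data = d, trace = FALSE)

# Probability of the second level ("yes") from each model.
p_glm   <- predict(fit_glm, d, type = "response")
p_multi <- predict(fit_multi, d, type = "probs")

# Largest disagreement between the two sets of probabilities.
max_diff <- max(abs(p_glm - p_multi))
print(max_diff)
```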
You nailed it teacher..!!
Thanks!
I have applied the same functions to evaluate my GAM model, but I am not able to produce the confusion matrix. The result shows a 2*132 table instead of a 2*2 matrix; moreover, I have 203 'Y' values in the validation data. Why is this happening? Please help me. Thank you.
Nicely explained. Thank you.
Thanks for comments!
How should I handle data that is all categorical? My predictor features are subject columns with grades 1 to 8 for each subject, and
the response variable is a subject for which we have to predict the grade (1-8).
Before applying the model I converted all features and the response variable into factors. Is this the right step, or should I only convert the response variable into a factor and keep the predictors in numerical format?
I'm seeing this today, but for categorical variables you can try random forest:
th-cam.com/video/dJclNIN-TPo/w-d-xo.html
Unfortunately unable to proceed.
An error message appears:
pred
How do I generate the ROC curve and AUC for a multi-class response variable?
Thank You
You can use this method with two class at a time.
Hi, I'm on R studio v. 1.1.423 now and nnet package isn't available and I can't seem to find an equivalent... any ideas what I can use to get the same results? Thanks.
Please upgrade RStudio to at least 1.1.463. nnet works in this version. Good luck.
This is old comment. I guess you must have already updated.
Sir, based on this misclassification problem for admit = 1, how can you change the probability cutoff to another value in the model? Maybe with under 0.45 = 0 and over = 1.
ROC automatically tries probability values from 0 to 1 and then plots it on the curve.
Great video. I am trying to plot ROC for random forest but it is giving me the following error:
Error in prediction(p2, train$Dispute) :
Number of cross-validation runs must be equal for predictions and labels
what are your previous lines before this line where you are getting error?
no error. thank you very much for quick response can you forward me the co
library(randomForest)
TD_model
i've sent my code file.
Note that the response variable here has two levels. If your data has more, then the code shown in the video may not work as is.
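One common cause of "Number of cross-validation runs must be equal for predictions and labels" is passing prediction() a matrix with one probability column per class together with a single label vector; ROCR treats each column as a separate run. The sketch below reproduces that situation on simulated data (the matrix p2 and its column names are assumptions, not the commenter's actual objects) and shows that passing only the positive-class column works. It assumes the ROCR package is installed.

```r
library(ROCR)   # assumes the ROCR package is installed

# Simulated two-level outcome and probability of "Yes".
set.seed(5)
y <- factor(rbinom(200, 1, 0.4), labels = c("No", "Yes"))
p_yes <- plogis(-0.4 + 1.5 * (y == "Yes") + rnorm(200))

# Many predict() methods return one column per class when asked for probabilities.
p2 <- cbind(No = 1 - p_yes, Yes = p_yes)

# Two prediction columns against one label vector: prediction() stops with an error.
msg <- tryCatch({ prediction(p2, y); "no error" },
                error = function(e) conditionMessage(e))
print(msg)

# Passing only the positive-class column lines things up.
pred <- prediction(p2[, "Yes"], y)
auc  <- performance(pred, "auc")@y.values[[1]]
print(auc)
```

If your predictions are class labels (a factor) rather than probabilities, ask the model for probabilities first.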
If I am using only 2 variables, GPA and Admit, then what will the logistic regression model formula be?
You can refer to this:
th-cam.com/video/AVx7Wc1CQ7Y/w-d-xo.html
Very nice explanation
Could u plzz send me dataset and code
For data, see link below video. For code, see the pinned comment.
Hi Dr, when i do stacking of ensemble why do i get the roc curve in triangular shape?
Thank you very much, sir.
You are welcome!
Thank you so much sir, Just want to ask you whether type='response' is same as type='prob' when I am trying to give type='prob' , R is throwing an error like "Error in match.arg(type) :
'arg' should be one of “link”, “response”, “terms” ?
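The error message lists "link", "response", and "terms", which are the options for predict() on a glm model. There, type='response' already returns probability values, so 'prob' is not needed and is not accepted. Other model types use different names for this option (for example, multinom uses type='probs'), and that leads to the error you are getting.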
@@bkrai Thank you Sir
Great explanation!
Thanks!
I have a question about the logistic regression model part. Does the code deal with the whole data? I thought when doing the logistic regression model, you have to divide the data into training set and test set. In the code you've used, does it divide training set and test set automatically?
I used full data as focus was more on ROC. But when developing a prediction model it is always good to partition data into training and testing data sets.
Thank you so much! helped a lot! :)
Further to my earlier comment, also wanted to ask what software you used to create these videos on data analytics. Thanks
I used iMovie.
Really helpful!
Glad it was helpful!
Great explanation. Thank you Sir! )
Welcome!
Hello sir, the video was excellent but I have a small question. The cutoff value changed to 0.45, so do we need to run the model again? If not, then why, and if yes, then what changes need to be made in the code so that the classification can be made to improve the accuracy of the model?
You don't need to run the model again. The model gives predictions in terms of probabilities. When we use a cutoff of 0.45, probabilities that are less than 0.45 are classified into the first group and those above 0.45 are classified into the second group. So for any cutoff value, the probabilities from the prediction model are not going to change; they are only used for classification.
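A minimal sketch of that point in base R: the probabilities are computed once, and only the classification step changes with the cutoff. The data are simulated, and the vector p stands in for whatever predict() returns from your fitted model.

```r
# Simulated outcome and predicted probabilities; p stands in for predict(model, ...).
set.seed(6)
admit <- rbinom(400, 1, 0.3)
p <- plogis(-1 + 2 * admit + rnorm(400))

# Same probabilities, different cutoffs: nothing is refitted.
classify <- function(p, cutoff) ifelse(p > cutoff, 1, 0)
pred_50 <- classify(p, 0.50)
pred_45 <- classify(p, 0.45)

# Confusion matrices: rows are predicted classes, columns are actual classes.
tab_50 <- table(Predicted = pred_50, Actual = admit)
tab_45 <- table(Predicted = pred_45, Actual = admit)
print(tab_50)
print(tab_45)

# Accuracy is the share of cases on the diagonal.
acc_50 <- sum(diag(tab_50)) / sum(tab_50)
acc_45 <- sum(diag(tab_45)) / sum(tab_45)
print(c(cutoff_0.50 = acc_50, cutoff_0.45 = acc_45))
```

To use a different cutoff such as 0.475, change only the second argument of classify().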
Thank you sir for your reply and for guiding me
I want to know one more thing. According to your reply I can assume probability less than 0.45 as yes and above 0.45 as no in terms of the dependent variable. But where, or in which output, will I come to know how variation in the independent variables leads to the dependent variable occurring (yes or no) most of the time?
Thanks
Independent variable will take only one value for each case. If a new student has GRE = 380, GPA = 3.61 and comes from a school ranked 3, then inputting these values in the prediction model will give probability = 0.18. Since p=0.18 is less than cutoff = 0.45, this student will be rejected and will not get admission.
Good morning, now I understand the answer. Thanks for helping me in my comprehension of logistic regression
Hello Sir
I need a little help from you. In the code where we create a subset out of the sample, why is == TRUE or FALSE used? What is the difference between the = and == symbols in logistic regression?
Thank you sir
can you tell me the differences below:
yourmodel
I think yourmodel
Sir, I have run the model using the neuralnet package. Is it necessary to calculate probabilities for the predicted values, or can we directly go with the values obtained for the test set? One more question sir: is there any way to plot ROC curves for two models?
i am getting following error when i use : pred
Difficult to say anything without looking at code.
Same error. Could you solve it?
How could we cross validate these results?
You can refer to this for more detailed coverage:
th-cam.com/video/ftjNuPkPQB4/w-d-xo.html
Hello Sir, thanks for every lecture for students like me. Respected Sir, I am working on groundwater potential mapping. I have completed my analysis using ArcGIS, and now I have to validate my results with an ROC curve. Could you please guide me on how this is possible using R? I will be very thankful to you.
You can follow the steps in this video. If you get stuck anywhere, let me know.
@@bkrai Sir, I have a problem here, please check: how do I make a model with my data?
setwd("C:/Users/Umair/OneDrive/Manuscript Quetta GWP/ANN")
.libPaths("C:/Users/Umair/OneDrive/Manuscript Quetta GWP/library ANN")
install.packages("tidyverse")
library(tidyverse)
Wells
Which part has a problem?
@@bkrai Sir, the problem is making a model for the ROC, and I need to compute the true positives and true negatives.
Hi Sir, I have calculated the cutoff for accuracy for my data (~0.475). I would like to know where exactly I should replace the default 0.5 with this 0.475?
Hello sir, is it possible to calculate the AUC metric for multiple-class prediction?
I'll look into this.
@@bkrai thank you sir...
welcome!
Hi sir, I have some confusion, please help me to resolve it.
Your dependent variable (admit) has two levels, 0 and 1, and you performed multinomial logit?
Should I plot a multiclass ROC rather than a typical ROC curve when my dependent variable has three levels (i.e. 1, 2, and 3)?
Thanks.
Multinomial logit works for 2 or more levels. However, the ROC used here is only for situations where the dependent variable has 2 levels.
@@bkrai Would you please provide a lecture about multiclass ROC? Thanks.
Thanks, I've added it to my list.
Hello Dr. Rai. Would you happen to know what code I can use to compare 2 ROCs and/or AUCs using R? Also, if there is a way to represent 2 ROCs in one graph. Thank you! ~ Tanya
You can calculate AUC for different models and compare them. The higher the AUC, the better.
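A minimal sketch of putting two ROC curves on one graph and comparing their AUCs, written in base R so it runs without extra packages. The two sets of probabilities p1 and p2 are simulated stand-ins for the predictions of two hypothetical models on the same cases.

```r
# Simulated outcome and predictions from two hypothetical models.
set.seed(7)
y  <- rbinom(300, 1, 0.4)
p1 <- plogis(-0.5 + 2.0 * y + rnorm(300))   # model 1 (stronger signal)
p2 <- plogis(-0.5 + 0.7 * y + rnorm(300))   # model 2 (weaker signal)

# False and true positive rates at every distinct cutoff.
roc_points <- function(p, y) {
  cuts <- c(Inf, sort(unique(p), decreasing = TRUE))
  data.frame(fpr = sapply(cuts, function(k) mean(p[y == 0] >= k)),
             tpr = sapply(cuts, function(k) mean(p[y == 1] >= k)))
}

# AUC via the rank (Mann-Whitney) formula.
auc <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

r1 <- roc_points(p1, y)
r2 <- roc_points(p2, y)

# Both curves on one plot, with the chance diagonal for reference.
plot(r1$fpr, r1$tpr, type = "l", col = "blue",
     xlab = "False positive rate", ylab = "True positive rate")
lines(r2$fpr, r2$tpr, col = "red")
abline(0, 1, lty = 2)
legend("bottomright", legend = c("Model 1", "Model 2"),
       col = c("blue", "red"), lty = 1)

auc1 <- auc(p1, y)
auc2 <- auc(p2, y)
print(c(model1 = auc1, model2 = auc2))
```

With ROCR, the same overlay is usually done by plotting the first performance object and then the second with add = TRUE.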
@@bkrai Sir, in that situation how can we get the p-value of the compared statistic?
Sir, how do I identify the optimal cutoff value so that TPR (sensitivity) will be high?
I need to know how medical data sets (for example MIMIC, DCOM, etc.) can be used in RStudio programming and which library I have to use... Please let me know if anyone knows.
You can try these:
th-cam.com/play/PL34t5iLfZddsQ0NzMFszGduj3jE8UFm4O.html
Hello, This is great video but I am slightly confused with the probability explanation. YOu mentioned that if the prob prediction score is less than .5 then chance are less than average but doesn't it depend on %age of events in the data on which model is based? If in the sample data, the event rate is 1 out of 4 then probability is .25, so any scores above .25 in the final output mean model is saying that this has higher chances. Not necessarily it has to be above .5.
In this data there are only two outcomes: a student is either admitted or not admitted. If you only calculate the percentage of students admitted and use that as a probability, it will not be a very useful prediction, because it will remain the same for every student irrespective of their GPA, SAT score, or the college they are coming from. The prediction here, in the form of a probability between 0 and 1, is based on all input variables.
@@bkrai Hello, let me give an example. Let's say in your data set there were 100 observations and 10 "events" of passing. So the probability of an event in your data set is 10/100 = 0.1. Now I build the prediction model based on different predictors like GPA, SAT, gender, etc., and I get the final predicted probability scores. Let's say for some student we get a score of .3. All I want to say is that this student has higher chances than average. The probability score does not necessarily have to be above .5. This depends on the original percentage of events in the data on which the model is built, which in this case is 0.1.
For the original %, let's consider your own data. Let's say 10 out of 100 applicants get accepted, giving a rate of 0.1. Now forget about any type of model and simply classify all 100 students as getting accepted. Here, without any model, you will be correct 10% of the time but incorrect 90% of the time. If you instead classify all 100 as not accepted, you will be correct 90% of the time. Do not mix up this overall rate with individual probability. When you develop a prediction model, it should give overall accuracy better than 90% to be of any value.
@@bkrai Thanks. Am I right to say that a predicted probability score of less than .5 (after building the final model) does not necessarily mean that the event is less likely than average to happen?
That's correct.
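The arithmetic in this exchange can be written out as a short Python sketch, using the 10-out-of-100 example and the score of .3 from the comments above. The variable names are illustrative only.

```python
n_total, n_events = 100, 10

# Overall event rate in the data (the same for every student).
base_rate = n_events / n_total

# Accuracy of two "no model" rules.
acc_all_accepted = n_events / n_total               # everyone labeled accepted
acc_all_rejected = (n_total - n_events) / n_total   # everyone labeled rejected

# One student's predicted probability from a fitted model.
score = 0.3
above_base_rate = score > base_rate     # more likely than the average student
above_default_cutoff = score > 0.5      # still classified 0 under a 0.5 cutoff

print(base_rate, acc_all_accepted, acc_all_rejected)
print(above_base_rate, above_default_cutoff)
```

A score of .3 is three times the base rate, yet a 0.5 cutoff still labels that student as not accepted, which is why the two notions should be kept apart.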
Thanks for the video! Amazing!
Thanks for comments!
Would you be able to send me the dataset used for this please? awesome job done.
My email id is rshekdar@gmail.com. Sorry, I accidentally clicked send before I could finish.
+Ranjit Shekdar all set.
you sir are very fast, much appreciated. Thanks again. I was not expecting it to be sent out so fast.
What does the color from green to red on the ROC curve represent?
It represents cutoff values between 0 and 1.
Dr. Bharatendra Rai thanks
multinom is not working. I uploaded the CSV file, but the multinom function generates an error saying that it requires two or more classes. Please help me.
Dr. Bharatendra Rai sir please
Check your response variable. It should have 2 or more classes.
Hello sir, can you make a video on plotting the ROC curve for SVM? I am getting an error in my code. The error I am getting is "format of prediction is invalid". Thank you.
Thanks alot man. you helped
You are welcome!
Hi Sir, why did you use the multinom function here? Isn't multinom used only if the target variable has more than 2 categories, while in this video we have only 2 categories, yes or no?
multinom works for 2 or more categories, so using it here should be ok.
I need some help please. I am trying to submit my project but cannot get the ROC to work.
How do I compute the AUC when I have NA values in my data frame?
You need to first address missing values. See this link:
th-cam.com/video/An7nPLJ0fsg/w-d-xo.html
Amazing, very useful, thanks.
You are very welcome!
Why did you write 'y.values' in the slot function for AUC?
When you run 'eval', which contains accuracy values for various cutoffs, you can see that different types of information are stored in various slots. The y.values slot contains the data on accuracy.
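To make the idea of "accuracy values for various cutoffs" concrete, here is a small Python sketch with made-up probabilities and labels. The list `cutoffs` plays the role of the x values and `accuracies` the role of the y values discussed above. It is an illustration of the concept under those assumptions, not a reproduction of the ROCR object.

```python
def accuracy_at(cutoff, probs, labels):
    """Accuracy when cases with probability above the cutoff are labeled 1."""
    preds = [1 if p > cutoff else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Hypothetical predicted probabilities and true classes.
probs = [0.22, 0.41, 0.47, 0.53, 0.62, 0.81]
labels = [0, 0, 1, 0, 1, 1]

cutoffs = [i / 20 for i in range(21)]                           # x values
accuracies = [accuracy_at(c, probs, labels) for c in cutoffs]   # y values

# Index of the first cutoff that reaches the maximum accuracy.
best_index = max(range(len(cutoffs)), key=lambda i: accuracies[i])
best_cutoff = cutoffs[best_index]
best_accuracy = accuracies[best_index]
print(best_cutoff, best_accuracy)
```

Several cutoffs can tie for the best accuracy; this sketch simply reports the smallest one.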
Hello, can this be applied on CNN?
If there are two classes, then yes.
How to compare two ROC? kindly explain, sir.
You can refer to this:
th-cam.com/video/ftjNuPkPQB4/w-d-xo.html
@@bkrai Sir, can we adjust for variables in SVM as we usually do in any MLR analysis? If possible, how can we proceed? Please suggest.
Does cutoff mean threshold?
That's correct.
Great Video....
In place of "tpr", will a numeric entry work?
I've not tried, but should work.
Please share the code for ROC Curve & Area Under Curve (AUC) with R - Application Example.
For code, see the pinned comment.
I get a confusion matrix that looks like (201,0; 45,0) with an accuracy of 1... HELP
Make a confusion matrix for the test data. That will provide a more realistic accuracy.
Hello Sir, are precision and recall the same as sensitivity and 1-specificity, respectively?
If we had to plot a PR curve for the same data that you used to plot the ROC curve, how would we do that? Can you please send me the code at puneetkaursidhu@gmail.com
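On the first question, a small Python sketch with made-up confusion-matrix counts may help. By the standard definitions, recall is the same quantity as sensitivity (TPR), while precision is TP / (TP + FP) and is not the same as 1 - specificity (the false positive rate). The `metrics` helper and the counts are illustrative only.

```python
def metrics(tp, fp, tn, fn):
    """Standard rates computed from binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # also called recall or TPR
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),           # equals 1 - specificity
        "precision": tp / (tp + fp),     # share of predicted positives that are real
    }


# Hypothetical counts: 40 actual positives, 160 actual negatives.
m = metrics(tp=30, fp=20, tn=140, fn=10)
for name, value in m.items():
    print(name, round(value, 3))
```

A PR curve plots precision against recall across cutoffs, in the same way an ROC curve plots TPR against FPR.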
Hi Sir,
One question here.
Can an ROC curve be drawn with ROCR in the case of multi-class classification?
Example: we have to classify the given data into 3 different classes.
Thanks,
Roopa
It is meant for only two categories. If you have three classes such as 1, 2, and 3, and your interest lies more in correctly classifying, say, class "3", then you may still form two classes: 3 and others.
ok..got it.Thank you sir!
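The "3 and others" idea can be sketched in Python as a simple relabeling step. The class vector and the `one_vs_rest` helper are made up for illustration. The resulting 0/1 labels, paired with the model's predicted probability for class 3, could then feed a two-class ROC analysis.

```python
def one_vs_rest(labels, target):
    """Recode a multi-class label vector as 1 for the target class, 0 otherwise."""
    return [1 if y == target else 0 for y in labels]


# Hypothetical three-class response.
classes = [1, 2, 3, 3, 1, 2, 3]

binary = one_vs_rest(classes, target=3)
print(binary)
```

Repeating this for each class in turn gives one ROC curve per class.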
thx , very good video.
Thanks for comments!
How do I check performance for an ordinal logistic regression model? Using this method I am getting this error:
'Error in prediction(pedi, heart_A$num) :
Number of cross-validation runs must be equal for predictions and labels.'
This is my email id: kanikaabap@gmail.com. Please help me out, thank you.
For ordinal logistic regression you can use this:
th-cam.com/video/qkivJzjyHoA/w-d-xo.html
I already checked that video but am not able to plot the ROC curve. I am getting the error "number of cross-validation runs must be equal for predictions and labels".
I cannot find any 'y.value' in eval.