ROC Curve & Area Under Curve (AUC) with R - Application Example

  • Published on 30 Sep 2024

Comments • 499

  • @aaradhyaabhijitshetti3933 4 years ago +4

    Sir, your explanation of ROC and AUC was very simple and easy to understand. It cleared all my doubts. Thanks a lot!

    • @bkrai 4 years ago

      You are most welcome!

  • @johnwilliammeyer6592 6 years ago +5

    Unbelievably helpful video - I've been searching all over the internet for this. Thank you.

    • @bkrai 6 years ago

      That's good to know!

  • @oumaimanassiri5555 6 years ago +4

    Hi Professor, I love your videos; they are very interesting. I'm a PhD student, and I sometimes find it difficult to see the link between my own variables (concentrations of elements) and the variables that you work with. I would be very grateful if you could send me well-explained documents on data processing and analysis, along with the data file.

    • @bkrai 6 years ago

      email id?

  • @TH-fe1vs 7 years ago +4

    Thank you, Sir. Very kind of you to send the R code and the data; I appreciate it. Your YouTube video explains the concepts of ROC and AUC clearly, with a simple example. Well done.

    • @bkrai 7 years ago

      Thanks!

  • @kalyanasundaramsp8267 6 years ago +3

    Superb, sir, phenomenal. You make tough things look simple. Proud of you, boss.

    • @bkrai 6 years ago

      Thanks!

  • @danielsaphir8596 4 years ago +5

    Yo that beat in the beginning was fire

    • @bkrai 4 years ago

      Thanks :)

  • @vairachilai3588 5 years ago +1

    True Positive Rate = True Positive / Actual admit = 29/(29+20) = 0.592
    False Positive Rate = False Positive / Actual not admit = 98/(253+98) = 0.279

    • @bkrai 5 years ago

      For the confusion matrix shown in the video, the formula you use is not correct. Your formula is correct only if the confusion matrix has the actual class on the LEFT and the predicted class on the TOP.
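
A minimal sketch of the point in the reply above. It assumes the four counts quoted in the comment come from a matrix with the predicted class on the rows and the actual class on the columns; the object name cm is illustrative only.

```r
# Counts quoted in the comment, arranged under the assumed layout:
# rows = predicted class, columns = actual class (matrix() fills by column).
cm <- matrix(c(253, 20, 98, 29), nrow = 2,
             dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))

tp <- cm["1", "1"]  # predicted admit, actually admitted
fp <- cm["1", "0"]  # predicted admit, actually not admitted
fn <- cm["0", "1"]  # predicted not admit, actually admitted
tn <- cm["0", "0"]  # predicted not admit, actually not admitted

tpr <- tp / (tp + fn)            # true positive rate: divide by the actual-admit column total
fpr <- fp / (fp + tn)            # false positive rate: divide by the actual-not-admit column total
accuracy <- (tp + tn) / sum(cm)

round(c(tpr = tpr, fpr = fpr, accuracy = accuracy), 3)
```

Under this layout the rates are 29/127 and 20/273. The ratio 29/(29+20) from the comment is instead the share of predicted admits that were actually admitted (precision).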

  • @RoomeyRahman 6 years ago +4

    Dear Sir,
    Your tutorials help us all learn about data science, and I have learned many things from them. Now I want to learn how to make ROC and AUC for multi-class problems. Could you make another video to teach us about multi-class model performance?
    Thank you

    • @bkrai 6 years ago

      Thanks for your comments, I'll add your suggestion to my list.

    • @santanumallik6992 3 years ago

      @bkrai Sir, I have the same query: my data set has three classes. In that case, how do I get the ROC curve and AUC?

  • @raj2385 11 months ago +1

    If I am using only two variables, GPA and Admit, what will the logistic regression model formula be?

    • @bkrai 11 months ago

      You can refer to this:
      th-cam.com/video/AVx7Wc1CQ7Y/w-d-xo.html
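
For readers who want the formula spelled out, here is a minimal sketch with a single predictor. The data are simulated, and the names mydata, admit, and gpa are assumptions chosen to match the thread, not the video's data file.

```r
# Simulated stand-in data (hypothetical): one predictor and a 0/1 outcome.
set.seed(1)
mydata <- data.frame(gpa = round(runif(200, 2, 4), 2))
mydata$admit <- rbinom(200, 1, plogis(-6 + 2 * mydata$gpa))

# Binary logistic regression with GPA as the only predictor.
model <- glm(admit ~ gpa, data = mydata, family = binomial)
b <- unname(coef(model))  # b[1] = intercept, b[2] = slope for gpa

# Fitted model: log(p / (1 - p)) = b0 + b1 * gpa, so p = plogis(b0 + b1 * gpa).
p_manual  <- plogis(b[1] + b[2] * 3.5)
p_predict <- unname(predict(model, newdata = data.frame(gpa = 3.5),
                            type = "response"))
```

Both p_manual and p_predict give the predicted admission probability for a GPA of 3.5, which shows that predict() simply evaluates the formula above.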

  • @avi20009 7 years ago +2

    Sir, how does the ROC curve work when the dependent variable is not binary, that is, when it has more than 2 levels to model (note: categorical, not continuous)?

  • @mohsinfayaz8103 3 years ago +1

    How do I generate a ROC curve and AUC for a multi-class response variable?
    Thank you

    • @bkrai 3 years ago +1

      You can use this method with two classes at a time.
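
One way to read "two classes at a time" is one-vs-rest: treat each class in turn as the positive class and all others as negative. The sketch below makes that assumption and uses only base R; the helper auc_binary and the toy scores are hypothetical, not taken from the video.

```r
# Rank-based AUC (equivalent to the Mann-Whitney U statistic); ties get average ranks.
auc_binary <- function(score, is_positive) {
  r <- rank(score)
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical three-class example: one predicted-probability column per class.
actual <- factor(c("a", "a", "b", "b", "c", "c"))
scores <- cbind(a = c(0.7, 0.5, 0.2, 0.3, 0.1, 0.2),
                b = c(0.2, 0.3, 0.6, 0.3, 0.3, 0.2),
                c = c(0.1, 0.2, 0.2, 0.4, 0.6, 0.6))

# One AUC per class: class k versus everything else.
ovr_auc <- sapply(levels(actual),
                  function(k) auc_binary(scores[, k], actual == k))
ovr_auc
```

Each class then gets its own binary ROC curve and AUC, which can be reported separately or averaged.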

  • @kessiezhang9357 5 years ago +1

    Hi Bharatendra, why don't you use glm()? I looked it up, and it seems multinom is used when the dependent variable has more than 2 levels. In your example, the dependent variable is admit (no, yes), so I'm confused about why you chose multinom() instead of glm(). Thank you.

    • @bkrai 5 years ago

      multinom works for 2 or more. So when dependent variable has 2 levels, it should work fine.
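
A rough way to check this on your own data is to fit both models and compare the predicted probabilities. The sketch below uses simulated data (the names mydata, admit, and gpa are assumptions) and relies on the nnet package being installed.

```r
library(nnet)  # provides multinom(); assumed to be installed

# Simulated two-level response (hypothetical data).
set.seed(10)
mydata <- data.frame(gpa = runif(300, 2, 4))
mydata$admit <- factor(rbinom(300, 1, plogis(-6 + 2 * mydata$gpa)),
                       labels = c("no", "yes"))

m_glm   <- glm(admit ~ gpa, data = mydata, family = binomial)
m_multi <- multinom(admit ~ gpa, data = mydata, trace = FALSE)

# Probability of the second level ("yes") from each model.
p_glm   <- unname(predict(m_glm, type = "response"))
p_multi <- unname(predict(m_multi, type = "probs"))

max_diff <- max(abs(p_glm - p_multi))
max_diff
```

With a two-level response both functions fit the same logistic model, so the differences should be small and come only from the different optimisers.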

  • @rafiqulislam1085 1 year ago +1

    Very nice and excellent explanation. Could you please make another video on drawing a multiclass (more than 2 classes) ROC curve, i.e. one-vs-rest ROC?

    • @bkrai 1 year ago

      You can refer to these:
      th-cam.com/video/ftjNuPkPQB4/w-d-xo.html
      th-cam.com/video/6SMrjEwFiQY/w-d-xo.html

  • @kaduflutist 3 years ago +3

    Excellent, Sir! Thanks a lot for presenting it so straightforwardly and consistently. For the first time, I could understand and reproduce the whole thing in R, regarding ROC curve and AUC.

    • @bkrai 3 years ago

      You are most welcome!

  • @duleepaj 6 years ago +4

    Short, simple and covers everything! Thank you!

    • @bkrai 6 years ago

      Thanks for comments!

  • @visheshgour 4 years ago +1

    I think I'm the only one who isn't able to learn any computer language except R, and that is all because of you, sir 🙂

    • @bkrai 4 years ago

      Thanks for comments!

  • @FOR4MUSIC 7 years ago +3

    It is very clear you know what you are doing. Thank you for your contribution!

    • @bkrai 7 years ago

      Thanks for your feedback!

  • @getugamo7051 3 years ago +3

    Very clear and impressive lecture! Thanks so much!

    • @bkrai 3 years ago

      You're very welcome!

  • @muldon2 6 years ago +1

    For those who are looking for the data: stats.idre.ucla.edu/r/dae/logit-regression/

    • @bkrai 6 years ago

      Thanks for sharing!

  • @sengulozdemir418 5 months ago +1

    I appreciate your help, excellent video 👏🙏

    • @bkrai 5 months ago

      You are welcome!

  • @shubhammishra8550 6 years ago +1

    Why did you write 'y.values' in the slot function for AUC?

    • @bkrai 6 years ago

      When you run 'eval', which contains accuracy values for various cutoffs, you can see that different types of information are stored in various slots. The y.values slot contains the data on accuracy.
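
A minimal sketch of how the slots can be inspected, assuming the ROCR package (whose prediction() and performance() functions appear elsewhere in this thread). The probabilities and labels are made up for illustration.

```r
library(ROCR)  # assumes the ROCR package is installed

# Hypothetical predicted probabilities and true 0/1 labels.
prob  <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
label <- c(1,   1,   0,   1,   0,    1,   0,   0)

pred <- prediction(prob, label)

# A performance object keeps its data in named slots.
eval <- performance(pred, "acc")
slotNames(eval)

# For the "auc" measure, the single AUC number is stored in the y.values slot.
auc <- performance(pred, "auc")
auc_value <- unlist(slot(auc, "y.values"))
auc_value
```

So y.values is simply where the performance object stores the measure that was asked for: accuracy per cutoff for "acc", and one number for "auc".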

  • @KR-good 7 years ago +4

    This was an amazingly clear approach. Thank you.

    • @bkrai 7 years ago

      Great to hear your feedback!

  • @sside99 7 years ago +1

    Does cutoff mean threshold?

    • @bkrai 7 years ago

      That's correct.

  • @sheikhseerat7105 3 years ago +1

    Very nice explanation.
    Could you please send me the dataset and code?

    • @bkrai 3 years ago

      For data, see link below video. For code, see the pinned comment.

  • @suryagaur7440 6 years ago +2

    Hi Sir,
    Thanks for the wonderful video.
    Could you also make an AUC video where the dependent variable is continuous?

    • @bkrai 6 years ago

      Thanks for the suggestion! I've added this to my list.

  • @jennykeeping8918 5 years ago +1

    Hi, I'm on RStudio v. 1.1.423 now and the nnet package isn't available, and I can't seem to find an equivalent. Any ideas what I can use to get the same results? Thanks.

    • @kamalpada1270 5 years ago +1

      Please upgrade RStudio to at least 1.1.463. nnet works in this version. Good luck.

    • @bkrai 4 years ago

      This is an old comment. I guess you must have already updated.

  • @kpakpomoevi1603 4 years ago +1

    I'm new to your lectures and find them very interesting and useful. I have one question: how do you get the cutoff of 0.5 from the classification table at 5:39 in the video? Thanks.

    • @kpakpomoevi1603 4 years ago +1

      I figured it out: it should be a default value. However, I'm still having an issue with the performance object eval. When I try to print eval, it gives me this:
      A performance instance
      'Cutoff' vs. 'Accuracy' (alpha: 'none')
      with 392 data points
      Please help me get what you had on your screen. Thanks.

    • @bkrai 4 years ago

      You will see it better after plot.
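
Besides plotting, the values behind the printed summary can be pulled out of the slots. This is a minimal sketch assuming the ROCR package and made-up probabilities and labels; acc_table and best are illustrative names.

```r
library(ROCR)  # assumes the ROCR package is installed

# Hypothetical predicted probabilities and true 0/1 labels.
prob  <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
label <- c(1,   1,   0,   1,   0,    1,   0,   0)

pred <- prediction(prob, label)
eval <- performance(pred, "acc")

# x.values holds the cutoffs and y.values the accuracy at each cutoff.
acc_table <- data.frame(cutoff   = slot(eval, "x.values")[[1]],
                        accuracy = slot(eval, "y.values")[[1]])

# Row with the highest accuracy (the first one if there are ties).
best <- acc_table[which.max(acc_table$accuracy), ]
best

plot(eval)  # the accuracy-versus-cutoff curve
```

The same which.max() idea can be used to mark the best cutoff on the plot, for example with abline(v = best$cutoff).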

  • @ShahbazAhmedENG 3 years ago +1

    multinom is not working. I uploaded the CSV file, but the multinom function generates an error: multinom function requires two classes or more. Please help me.

    • @ShahbazAhmedENG 3 years ago

      Dr. Bharatendra Rai sir please

    • @bkrai 3 years ago

      Check your response variable. It should have 2 or more classes.

  • @devawratvidhate9093 5 years ago +2

    How should I handle data that is entirely categorical? My predictor features are subject columns with grades 1 to 8 for each subject, and the response variable is a subject whose grade (1-8) we have to predict.
    Before applying the model I converted all features and the response variable into factors. Is this the right step, or should I convert only the response variable into a factor and keep the predictors in numerical format?

    • @bkrai 4 years ago

      I'm seeing this today, but for categorical variables you can try random forest:
      th-cam.com/video/dJclNIN-TPo/w-d-xo.html

  • @kalyanasundaramsp8267 6 years ago +2

    Sir, (a) for multi-class, how do you come up with false positives and false negatives? (b) How do you compute ROC for multi-class?

    • @bkrai 6 years ago +1

      I'm adding this to my list. Thanks!

    • @SaranathenArun11E214 6 years ago

      Thanks so much, sir.

  • @wanglaoshuwang3831 6 years ago +1

    I have a question about the logistic regression model part. Does the code deal with the whole data? I thought when doing the logistic regression model, you have to divide the data into a training set and a test set. In the code you've used, does it divide the training set and test set automatically?

    • @bkrai 6 years ago

      I used full data as focus was more on ROC. But when developing a prediction model it is always good to partition data into training and testing data sets.
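
A minimal sketch of the partitioning step mentioned in the reply, using base R only. The data are simulated, and the 80/20 split, the seed, and the variable names are assumptions for illustration rather than the video's code.

```r
set.seed(123)

# Simulated stand-in data (hypothetical).
n <- 400
mydata <- data.frame(gre = rnorm(n, 580, 115), gpa = runif(n, 2.3, 4))
mydata$admit <- rbinom(n, 1, plogis(-5 + 0.003 * mydata$gre + 0.8 * mydata$gpa))

# Random 80/20 partition into training and test sets.
idx   <- sample(seq_len(n), size = 0.8 * n)
train <- mydata[idx, ]
test  <- mydata[-idx, ]

# Fit on the training data only, then predict for the unseen test data.
model  <- glm(admit ~ gre + gpa, data = train, family = binomial)
p_test <- predict(model, newdata = test, type = "response")

# Confusion matrix and accuracy on the test set at the default 0.5 cutoff.
tab <- table(predicted = as.integer(p_test > 0.5), actual = test$admit)
test_accuracy <- mean((p_test > 0.5) == (test$admit == 1))
tab
```

The ROC and AUC steps can then be run on p_test and test$admit instead of on the full data.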

    • @wanglaoshuwang3831 6 years ago

      Thank you so much! helped a lot! :)

  • @tamaraabzhandadze2712 3 years ago +1

    How could we cross-validate these results?

    • @bkrai 3 years ago

      You can refer to this for more detailed coverage:
      th-cam.com/video/ftjNuPkPQB4/w-d-xo.html

  • @olamidegab2390 3 years ago +1

    Hello, can this be applied to a CNN?

    • @bkrai 3 years ago

      If there are two classes, then yes.

  • @lianjek5788 4 years ago +1

    Hi sir, I have some confusion; please help me resolve it.
    Your response variable (admit) has two levels, 0 and 1, and you performed multinomial logit?
    Should I plot a multiclass ROC rather than a typical ROC curve when my response variable has three levels (i.e., 1, 2, and 3)?
    Thanks.

    • @bkrai 4 years ago +1

      Multinomial logit works for 2 or more levels. However, the ROC used here is only for situations where the response variable has 2 levels.

    • @lianjek5788 4 years ago +1

      @bkrai Would you please provide a lecture about multiclass ROC? Thanks.

    • @bkrai 4 years ago +1

      Thanks, I've added it to my list.

  • @freddyflores6608 6 years ago +2

    Thank you so much for your explanation. I could run my code and understand the process better.

    • @bkrai 6 years ago

      Thanks for the feedback!

  • @sanjibphukan8921 2 years ago +1

    How do I compare two ROC curves? Kindly explain, sir.

    • @bkrai 2 years ago +1

      You can refer to this:
      th-cam.com/video/ftjNuPkPQB4/w-d-xo.html

    • @sanjibphukan8921 2 years ago

      @bkrai Sir, can we adjust for variables in SVM as we usually do in any MLR analysis? If possible, how can we proceed? Please suggest.

  • @sovon08 6 years ago +2

    Thank you so much, Sir. The video was really helpful in providing practical knowledge of dealing with predictive modelling problems in R. Can you please tell me how to apply weight of evidence / fine classing in R? Is there any ready-made syntax?

    • @bkrai 6 years ago

      Thanks for your feedback! I've also added your suggestion to my list.

  • @victorhenostroza1871 4 years ago +1

    Sir, based on this misclassification problem for admit = 1, how can you change the probability cutoff to another value in the model? Maybe with under 0.45 = 0 and over = 1.

    • @bkrai 3 years ago +1

      ROC automatically tries probability values from 0 to 1 and then plots it on the curve.

  • @veianthanjayaramu2995 3 years ago +1

    Thank you very much, sir.

    • @bkrai 3 years ago +2

      You are welcome!

  • @omkarthakur2251 4 years ago +1

    Very good video, sir. It is very helpful.

    • @bkrai 4 years ago

      Thanks for comments!

  • @humbertobarino578 6 years ago +2

    Very, very helpful!!!
    I'm sending it to some Brazilian friends.

    • @bkrai 6 years ago

      Thanks and hope you find other videos helpful too!

  • @ahmedbilal1831 2 years ago +1

    Thanks a lot, man. You helped.

    • @bkrai 2 years ago

      You are welcome!

  • @vijaymore1239 7 years ago +2

    Thank you so much for explaining in a much simpler way!

    • @bkrai 7 years ago

      Thanks for your comments!

  • @juarezantonio656 1 year ago

    Unfortunately unable to proceed.
    An error message appears:
    pred

  • @abdulazeez9863 6 years ago

    I have applied the same functions to evaluate my GAM model, but I am not able to produce the confusion matrix. The result is a 2×132 table instead of a 2×2 matrix; moreover, I have 203 'Y' values in the validation data. Why is this happening? Please help me. Thank you.

  • @irondia73 5 years ago +1

    Hi Dr. Bharatendra Rai, would you be able to make a tutorial on building a logistic regression model using training and validation sets, with performance checking via ROC curve as you have done here? I know you posted one on linear regression, but I thought a logistic model would be very helpful too. Thank you!

    • @bkrai 4 years ago

      You can get logistic regression from this link:
      th-cam.com/video/AVx7Wc1CQ7Y/w-d-xo.html
      ROC steps are already in the current lecture video.

  • @dorothymartin2477 1 year ago

    Hi Dr, when I do stacking of an ensemble, why do I get a ROC curve with a triangular shape?

  • @parasrai145 6 years ago +3

    Great video and very well explained!

    • @bkrai 6 years ago

      Thanks for comments!

  • @dba99999 7 years ago +2

    Great Video....

  • @Revboiuk09 5 years ago +1

    Thanks a lot, sir, for such a precise explanation of the AU ROC curve. Truly appreciated!

    • @bkrai 5 years ago

      Thanks for comments!

  • @rishikeshdash12 2 years ago

    Sir, I have run the model using the neuralnet package. Is it necessary to calculate probabilities for the predicted values, or can we directly use the values obtained for the test set? One more question, sir: is there any way to plot ROC curves for two models?

  • @SachinSingh-uh2xh 7 years ago +1

    Please do a detailed video on sentiment analysis using R, a deep-dive analysis.

    • @bkrai 7 years ago

      Thanks for the suggestion, I'll probably do it sometime this month.

  • @petrusk3185 6 years ago +1

    I'm a bit late to the party here, but surely you cannot compare the total number admitted with the accuracy of the model?
    In the dataset 68% were admitted, but that data is 100% accurate. If the model is 70% accurate, you could get a result of 273 +-30%. Or am I missing something here? Are you comparing apples with oranges?

    • @bkrai 6 years ago

      Thanks for comments! You probably missed that "1" represents "admitted" in this data. The share of students admitted is 31.75%, not 68%.

  • @TH-fe1vs 7 years ago +1

    can you tell me the differences below:
    yourmodel

  • @sateeshkumarojha7414 4 years ago +1

    Please share the code for ROC Curve & Area Under Curve (AUC) with R - Application Example.

    • @bkrai 4 years ago

      For code, see the pinned comment.

  • @sanjeevnair9893 5 years ago +1

    Hello Sir, I'm an avid viewer of your videos, which truly add value to our ML understanding. A quick question: once we determine the best cutoff value after evaluating model performance, should we go back and re-run the model performance evaluation with the best cutoff value instead of the 0.5 cutoff that we used as a rule of thumb?

    • @bkrai 5 years ago

      Yes, that's correct.

  • @fabioambrosini5265 5 years ago +1

    What does the color from green to red on the ROC curve represent?

    • @bkrai 5 years ago +1

      It represents cutoff values between 0 and 1.

    • @fabioambrosini5265 5 years ago

      Dr. Bharatendra Rai thanks

  • @avinashsingh357 7 years ago +2

    Explained very neatly, sir. I would appreciate it if you could please add the dataset and code for learning.

    • @bkrai 7 years ago

      email id?

    • @avinashsingh357 7 years ago +1

      Thanks for your quick response, sir. My email id is avinashsinghemailid@gmail.com

    • @bkrai 7 years ago

      all set.

  • @subaganesh552 4 years ago +1

    Hello sir, is it possible to calculate the AUC metric for multi-class prediction?

    • @bkrai 4 years ago

      I'll look into this.

    • @subaganesh552 4 years ago +1

      @bkrai Thank you, sir.

    • @bkrai 4 years ago

      welcome!

  • @sanjibphukan8921 2 years ago +1

    Nicely explained. Sir, could you prepare a video on SVM for binary outcomes?

    • @bkrai 2 years ago +1

      Try this.
      th-cam.com/video/pS5gXENd3a4/w-d-xo.html

  • @Hehxgdudh 6 years ago +1

    I get a confusion matrix that looks like (201,0; 45,0) with an accuracy of 1... HELP

    • @bkrai 6 years ago

      Make a confusion matrix for test data, that will provide more practical accuracy.

  • @jaituteja88 7 years ago +2

    Great explanation. Thank you, Sir!

    • @bkrai 3 years ago

      Welcome!

  • @damodharand1519 4 years ago +1

    I need to know how medical data sets (for example MIMIC, DCOM, etc.) can be used in RStudio programming and which library I have to use. If anyone knows, please let me know.

    • @bkrai 4 years ago

      You can try these:
      th-cam.com/play/PL34t5iLfZddsQ0NzMFszGduj3jE8UFm4O.html

  • @lewsmash 6 years ago +1

    Hi,
    Firstly, great video this really helped me to understand the ROC curve and implement it with my data in R. I am analysing diagnostic data for a masters degree research project. I wanted to know how to identify the cutoff value from the value that we take from the accuracy versus cutoff curve or the final ROC curve. The scale goes from 0-1 but my independent variable data ranges from 100 to 10^7 . In short, how do I take the best cutoff value that this analysis outputs and relate/convert this to my independent variable and an exact cutoff value?
    Thanks very much.

    • @bkrai 6 years ago

      Cutoff value is for the dependent variable. If the dependent has two categories 'yes' and 'no', then default cutoff is probability = 0.5. ROC provides information on how prediction model performance will change if cutoff value changes from 0 to 1.
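
For the special case asked about above, a model with a single predictor, a probability cutoff can be converted back to a value of that predictor by inverting the fitted logistic equation. The sketch below assumes simulated data, with x standing for the predictor on a log10 scale; none of these names come from the video.

```r
set.seed(42)

# Hypothetical single-predictor data; x = log10 of a measurement between 10^2 and 10^7.
x <- runif(300, 2, 7)
y <- rbinom(300, 1, plogis(-9 + 2 * x))

model <- glm(y ~ x, family = binomial)
b <- unname(coef(model))  # b[1] = intercept, b[2] = slope

# Solve b0 + b1 * x = log(p / (1 - p)) for x at the chosen probability cutoff.
p_cut <- 0.5
x_cut <- (qlogis(p_cut) - b[1]) / b[2]

# Check: the predicted probability at x_cut equals the probability cutoff.
p_at_cut <- unname(predict(model, newdata = data.frame(x = x_cut),
                           type = "response"))
c(x_cut = x_cut, original_scale = 10^x_cut, p_at_cut = p_at_cut)
```

With more than one predictor there is no single cutoff per variable, because the same probability can be reached by many combinations of inputs.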

  • @sriharshabsathreya 6 years ago +1

    We can use the Deducer package in R to directly run the ROC curve:
    library(Deducer)
    mymodel

    • @bkrai 6 years ago

      thanks!

  • @uhsay1986 5 years ago +1

    Hi Sir, why did you use the multinom function here? Isn't multinom used only if the target variable has more than 2 categories? In this video we have only 2 categories, yes or no.

    • @bkrai 5 years ago

      multinom works for 2 or more categories, so using it here should be OK.

  • @hamsinisankaran2435 7 years ago +2

    Thank you for a great tutorial, sir. Could you please share the dataset and the code?

    • @bkrai 7 years ago

      email id?

    • @hamsinisankaran2435 7 years ago +1

      hamsini0992@gmail.com. Thank you

    • @bkrai 7 years ago

      all set.

  • @francodjo 7 years ago +1

    Great video. I am trying to plot ROC for random forest, but it is giving me the following error:
    Error in prediction(p2, train$Dispute) :
    Number of cross-validation runs must be equal for predictions and labels

    • @bkrai 7 years ago

      What are the lines before the one where you are getting the error?

    • @francodjo 7 years ago

      no error. thank you very much for quick response can you forward me the co

    • @francodjo 7 years ago

      library(randomForest)
      TD_model

    • @bkrai 7 years ago

      I've sent my code file.

    • @bkrai 7 years ago

      Note that the response variable here has two levels. If your data has more, then the code shown in the video may not work as it is.
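
One possible cause of the error quoted above, offered as an assumption since the commenter's code is cut off: ROCR's prediction() treats each column of a matrix as a separate run, and class-probability output (such as what predict() returns for a randomForest with type = "prob") is a matrix with one column per class. The sketch below reproduces the situation with a hand-made two-column matrix instead of a fitted forest.

```r
library(ROCR)  # assumes the ROCR package is installed

# Hypothetical labels and a two-column probability matrix, one column per class.
label <- c(1, 1, 0, 1, 0, 1, 0, 0)
prob  <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
p2    <- cbind(no = 1 - prob, yes = prob)

# Passing the whole matrix with a single label vector fails.
err <- tryCatch(prediction(p2, label),
                error = function(e) conditionMessage(e))
err

# Passing only the column for the positive class works.
pred <- prediction(p2[, "yes"], label)
auc_value <- unlist(slot(performance(pred, "auc"), "y.values"))
auc_value
```

If this is the cause, selecting the probability column of the class of interest before calling prediction() should resolve it.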

  • @miguelsuarez475 7 years ago +2

    You nailed it, teacher!

    • @bkrai 3 years ago

      Thanks!

  • @fayarvin2003 3 years ago +1

    Really helpful!

    • @bkrai 3 years ago

      Glad it was helpful!

  • @balasrm1 5 years ago +1

    Further to my earlier comment, I also wanted to ask what software you used to create these videos on data analytics. Thanks.

    • @bkrai 5 years ago

      I used iMovie.

  • @snigdhaluthra1946 4 months ago

    Can I get the R code for this video, please?

  • @OrcaChess 6 years ago +1

    A ROC curve tutorial for more than two classes with the 1 vs ALL approach
    would be a very helpful video :).

    • @bkrai 6 years ago

      Thanks for the suggestion, it's on my list now.

  • @prahladbhat9516 4 years ago +1

    How do I compute this AUC if I have NA values in my data frame?

    • @bkrai 4 years ago

      You need to first address missing values. See this link:
      th-cam.com/video/An7nPLJ0fsg/w-d-xo.html
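
The simplest way to address missing values, sketched here with base R, is to drop incomplete rows before computing the AUC. This is only one option (the linked video may cover others), and the data frame below is made up.

```r
# Hypothetical predictions and labels with some missing values.
df <- data.frame(prob  = c(0.9, NA, 0.7, 0.4, 0.2, 0.6),
                 label = c(1,   1,  0,   NA,  0,   1))

# Keep only rows where both the prediction and the label are present.
clean <- df[complete.cases(df), ]

# Rank-based AUC on the complete rows (Mann-Whitney form).
pos <- clean$label == 1
r   <- rank(clean$prob)
auc <- (sum(r[pos]) - sum(pos) * (sum(pos) + 1) / 2) / (sum(pos) * sum(!pos))
auc
```

Dropping rows shrinks the sample, so it is worth checking how many rows are lost before relying on the result.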

  • @bulletkip 5 years ago +1

    Absolutely excellent explanation. Thank you very much.

    • @bkrai 5 years ago

      Thanks for comments!

  • @mashalnabh2747 5 years ago +1

    Hello, this is a great video, but I am slightly confused by the probability explanation. You mentioned that if the predicted probability is less than 0.5 then the chances are less than average, but doesn't that depend on the percentage of events in the data on which the model is based? If the event rate in the sample data is 1 out of 4, the probability is 0.25, so any score above 0.25 in the final output means the model is saying this case has higher chances. It does not necessarily have to be above 0.5.

    • @bkrai 5 years ago

      In this data there are only two outcomes: a student is either admitted or not admitted. If you only calculate the percentage of students admitted and use that as a probability, it will not be a very useful prediction, because it will remain the same for every student irrespective of their GPA, SAT score, or the college they come from. The prediction here, a probability between 0 and 1, is based on all input variables.

    • @mashalnabh2747 5 years ago +1

      @bkrai Hello, let me give an example. Let's say your data set has 100 observations and 10 "events" of passing, so the probability of an event in your data set is 10/100 = 0.1. Now I build the prediction model based on different predictors like GPA, SAT, gender, etc., and I get the final predicted probability scores. Let's say for some student we get a score of 0.3. All I want to say is that this student has higher chances than average. The probability score does not necessarily have to be above 0.5. This depends on the original percentage of events in the data on which the model is built, which in this case is 0.1.

    • @bkrai 5 years ago +1

      For the original %, let's consider your own data. Say 10 out of 100 applicants get accepted, giving a rate of 0.1. Now forget about any type of model and simply classify all 100 students as accepted: you will be correct 10% of the time and incorrect 90% of the time. Conversely, classifying all 100 students as not accepted is correct 90% of the time, so a prediction model should give overall accuracy better than 90% to be of any value. Do not mix up this overall rate with individual probability.

    • @sachin01663 5 years ago +1

      @bkrai Thanks. Am I right to say that a predicted probability score of less than 0.5 (after building the final model) does not necessarily mean that the event is less likely to happen?

    • @bkrai 5 years ago +1

      That's correct.

  • @tapangautam8746 7 years ago +2

    Hello sir, the video was excellent, but I have a small question. The cutoff value changed to 0.45, so do we need to run the model again? If not, why not? If yes, what changes need to be made in the code so that classification can be made to improve the accuracy of the model?

    • @bkrai 7 years ago

      You don't need to run model again. The model gives predictions in terms of probabilities. When we use cutoff of 0.45, probabilities that are less than 0.45 are classified into first group and those above 0.45 are classified into 2nd group. So for any cutoff value, probabilities from prediction model are not going to change, they are only used for classification.
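
A minimal sketch of the reply above: the probabilities stay fixed and only the classification rule changes. The probability values are made up, apart from 0.18, which is the worked example given later in this thread.

```r
# Hypothetical predicted probabilities from a model that has already been fitted.
p <- c(0.18, 0.44, 0.46, 0.52, 0.73)

# Same probabilities, two different cutoffs; the model is not refitted.
class_050 <- ifelse(p > 0.5, 1, 0)
class_045 <- ifelse(p > 0.45, 1, 0)

data.frame(p, class_050, class_045)
```

Only the case with p = 0.46 changes class when the cutoff moves from 0.5 to 0.45. A confusion matrix for the new cutoff can be built by tabulating class_045 against the actual outcomes.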

    • @tapangautam8746 7 years ago

      Thank you, sir, for your reply and for guiding me.
      I want to know one more thing. According to your reply, I can assume probability less than 0.45 as yes and above 0.45 as no in terms of the dependent variable. But where, or in which output, will I come to know how variation in the independent variables makes the dependent variable come out as yes or no most of the time?
      Thanks

    • @bkrai 7 years ago

      Independent variable will take only one value for each case. If a new student has GRE = 380, GPA = 3.61 and comes from a school ranked 3, then inputting these values in the prediction model will give probability = 0.18. Since p=0.18 is less than cutoff = 0.45, this student will be rejected and will not get admission.

    • @tapangautam8746 7 years ago +1

      Good morning, now I understand the answer. Thanks for helping me in my comprehension of logistic regression.

    • @tapangautam8746 7 years ago

      Hello Sir,
      I need a little help from your side. In the code where we create a subset of the sample, why is == TRUE or FALSE used? What is the difference between the = and == symbols in logistic regression?
      Thank you, sir

  • @merumomo 6 years ago +1

    Thank you for this great video! And thank you for prompt reply. I have questions.
    If we are doing machine learning, we need to create ROC using predictive model created by test set, correct?
    (in your "Logistic Regression with R" video, you created predictive model using test set. We need to validate the accuracy of the model). Also, if I want to use which.max func to plot the highest values on the eval plot, what code should I use?

    • @bkrai 3 years ago

      Seeing this today. ROC curves can be done for both train and test data.

  • @BeKindPlox 6 years ago +2

    Great explanation!

    • @bkrai 6 years ago

      Thanks!

  • @abdulhalim9472 1 year ago

    I cannot find any 'y.value' in eval

  • @abhilashiv3599 5 years ago +1

    Thank you so much, sir. I just want to ask whether type='response' is the same as type='prob'. When I try type='prob', R throws an error like "Error in match.arg(type) :
    'arg' should be one of “link”, “response”, “terms”"

    • @bkrai 5 years ago

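      It depends on the model type. The error lists the types that predict() accepts for your model: 'link', 'response', and 'terms'. For such a model, type = 'response' returns probability values, while type = 'prob' is only accepted by some other model types, and that leads to the error you are getting.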
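
The three types named in the error message are the ones predict() accepts for a glm fit, so the sketch below assumes the commenter's model came from glm(). For such a model, type = "response" already returns probabilities. The data are simulated and the variable names are illustrative.

```r
set.seed(7)

# Hypothetical binary-outcome data.
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))
model <- glm(y ~ x, family = binomial)

p   <- predict(model, type = "response")  # probabilities between 0 and 1
eta <- predict(model, type = "link")      # log-odds (linear predictor)

# Asking a glm for type = "prob" reproduces the kind of error quoted above.
err <- tryCatch(predict(model, type = "prob"),
                error = function(e) conditionMessage(e))
err
```

Applying plogis() to the link-scale values gives back the response-scale probabilities, which is a quick way to confirm what each type returns.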

    • @abhilashiv3599 5 years ago

      @bkrai Thank you, Sir

  • @saurwt 6 years ago +2

    Wow, just wow!

    • @bkrai 3 years ago

      Thanks!

  • @tanyachichekian9900 4 years ago +1

    Hello Dr. Rai. Would you happen to know what code I can use to compare 2 ROCs and/or AUCs using R? Also, if there is a way to represent 2 ROCs in one graph. Thank you! ~ Tanya

    • @bkrai 4 years ago

      You can calculate AUC for different models and compare them. The higher the AUC, the better.
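
A minimal sketch of both parts of the question, assuming the ROCR package: compute one AUC per model and overlay the two curves with add = TRUE. The two sets of predicted probabilities are made up and stand in for two fitted models.

```r
library(ROCR)  # assumes the ROCR package is installed

# Hypothetical labels and predicted probabilities from two models.
label  <- c(1, 1, 0, 1, 0, 1, 0, 0)
prob_a <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
prob_b <- c(0.6, 0.7, 0.8, 0.4, 0.5, 0.3, 0.35, 0.1)

pred_a <- prediction(prob_a, label)
pred_b <- prediction(prob_b, label)

# One AUC per model; the larger value indicates better ranking of the cases.
auc_a <- unlist(slot(performance(pred_a, "auc"), "y.values"))
auc_b <- unlist(slot(performance(pred_b, "auc"), "y.values"))

# Both ROC curves in one graph.
plot(performance(pred_a, "tpr", "fpr"), col = "blue")
plot(performance(pred_b, "tpr", "fpr"), col = "red", add = TRUE)
abline(a = 0, b = 1, lty = 2)
legend("bottomright", legend = c("model A", "model B"),
       col = c("blue", "red"), lty = 1)
```

For a formal test of the difference between two AUCs, the separate pROC package offers roc.test(); that is outside what the video covers.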

    • @sanjibphukan8921 2 years ago

      @bkrai Sir, in that situation, how can we get the p-value for the comparison?

  • @abhitest1 5 years ago +1

    Sir, can you please also attach dataset files/links along with your videos? This would greatly help us in learning by practicing with the same data set. Thanks for the great videos, sir.

    • @bkrai 5 years ago

      I've added a link in the description area below the video.

    • @abhitest1 5 years ago +1

      @bkrai Thanks.

  • @fatimabadi3335 5 years ago +1

    Thanks for the useful information. I would like to ask you if I can use ROC to measure the effectiveness of the prediction model? And can I use ROC in R software?

    • @bkrai 5 years ago +1

      For model effectiveness you can use AUC. Also yes you can do ROC in R, this video gives you all the steps.

    • @fatimabadi3335 5 years ago

      Dr. Bharatendra Rai thanks

  • @nithinmamidala 6 years ago +1

    Very helpful video, sir. Thank you so much. I have a doubt: how do you fix the threshold value at 0.5?

    • @bkrai 6 years ago

      Default threshold is already 0.5, there is no need to do anything for this.

    • @nithinmamidala 6 years ago +1

      OK, thank you, sir.

  • @raghavendras5331 6 years ago +1

    Thank you, sir, for a very clear and crisp explanation; I got all the information in one video. From the explanation I understood how to find the cutoff for maximum accuracy, but doing so gives more weight to only one class in my dataset. How do I find a cutoff threshold that gives both maximum sensitivity and maximum specificity?

    • @bkrai 6 years ago +1

      You can get that using the ROC curve. The color on the curve shows the cutoff value, from 0 to 1. You can identify the point on the curve that is closest to the ideal curve.
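
Instead of reading the point off the colored curve, it can be computed. The sketch below assumes the ROCR package and made-up data, and shows two common criteria: the point closest to the top-left corner (0, 1), and the largest value of sensitivity + specificity - 1 (Youden's index). The second is a swapped-in alternative and is not mentioned in the video.

```r
library(ROCR)  # assumes the ROCR package is installed

# Hypothetical predicted probabilities and true 0/1 labels.
prob  <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
label <- c(1,   1,   0,   1,   0,    1,   0,   0)

pred <- prediction(prob, label)
roc  <- performance(pred, "tpr", "fpr")

fpr    <- slot(roc, "x.values")[[1]]
tpr    <- slot(roc, "y.values")[[1]]
cutoff <- slot(roc, "alpha.values")[[1]]  # cutoff belonging to each ROC point

# Criterion 1: smallest distance to the ideal point (fpr = 0, tpr = 1).
dist_to_ideal <- sqrt(fpr^2 + (1 - tpr)^2)
best_closest  <- cutoff[which.min(dist_to_ideal)]

# Criterion 2: Youden's index, tpr - fpr = sensitivity + specificity - 1.
youden      <- tpr - fpr
best_youden <- cutoff[which.max(youden)]

c(closest_to_ideal = best_closest, youden = best_youden)
```

The two criteria can pick different cutoffs, so the choice should reflect how costly false positives are relative to false negatives.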

    • @raghavendras5331 6 years ago +1

      Thank you sir

  • @Chirag0729 4 years ago +1

    Nicely explained. Thank you.

    • @bkrai 4 years ago

      Thanks for comments!

  • @TH-fe1vs 7 years ago

    is the same dataset from here?
    mydata

    • @bkrai 7 years ago

      yes

  • @cameronyi683 7 years ago

    Hello. This looks like the cutoff would be the probability of a certain student getting admitted based on the multinomial model. If I am working with a dataset with one independent variable contributing to my multinomial model, and am wanting to obtain a cutoff value from that independent variable (ie what is the cutoff value of SAT score that will best tell me if someone is admitted to college), what would I be changing in the code? Thank you.

    • @lewsmash 6 years ago

      I was also looking for the answer to this, if you managed to find out?

  • @zhongyanxu9047 3 years ago

    Amazing, very useful. Thanks.

    • @bkrai 3 years ago

      You are very welcome!

  • @mariaandreasantosruiz4491 7 years ago +2

    Thanks!

    • @bkrai 3 years ago

      Welcome!

  • @ranjitshekdar1720 7 years ago +2

    Would you be able to send me the dataset used for this, please? Awesome job.

    • @ranjitshekdar1720 7 years ago

      rshekdar@gmail.com is my email id. Sorry, I accidentally clicked send before I could complete the comment.

    • @bkrai 7 years ago

      +Ranjit Shekdar all set.

    • @ranjitshekdar1720 7 years ago +1

      You, sir, are very fast; much appreciated. Thanks again. I was not expecting it to be sent out so fast.

  • @amitgajkal4821 5 years ago +1

    i am getting following error when i use : pred

    • @bkrai 5 years ago

      Difficult to say anything without looking at code.

    • @juanmauricioarrietalopez2395 4 years ago

      Same error. Could you solve it?

  • @abhibhavsharma8706 4 years ago +1

    In place of "tpr", will a numeric entry work?

    • @bkrai 4 years ago

      I've not tried, but should work.

  • @edneideramalho2363 5 years ago +1

    Thanks for the video! Amazing!

    • @bkrai 5 years ago

      Thanks for comments!

  • @vishnukowndinya 5 years ago

    Hi Sir, I have calculated the cutoff for accuracy for my data (~0.475). I would like to know where exactly I should replace the default 0.5 with this 0.475.

  • @rohithebbar722 6 years ago

    Hello sir, can you make a video on plotting the ROC curve for SVM? I am getting an error in my code: the format of the prediction is invalid. Thank you.

  • @francodjo 7 years ago

    I need some help, please. I'm trying to submit my project but cannot get ROC to work.