SHAP - What Is Your Model Telling You? Interpret CatBoost Regression and Classification Outputs

  • Published on Jun 4, 2024
  • Let's understand our models with SHAP ("SHapley Additive exPlanations") using Python and CatBoost. We'll go over two hands-on examples, a regression and a classification, and analyze the SHAP summary plots. It's that powerful! And fun!
    For source code:
    www.viralml.com/video-content...
    Sign up for my newsletter and more: www.viralml.com
    Connect on Twitter: / amunategui
    My books on Amazon:
    Python Web Work - Prototyping Guide for Makers: Use HTML5 Templates, Serve Dynamic Content, Build Machine Learning Web Apps, Grow Audiences, Conquer the World
    amzn.to/2veZtnB
    MVP Light Stack Field Guide: Take Your Python MVP to the Web as Quickly and Cheaply as Possible:
    amzn.to/2Q4Hiay
    The Little Book of Fundamental Indicators: Hands-On Market Analysis with Python: Find Your Market Bearings with Python, Jupyter Notebooks, and Freely Available Data:
    amzn.to/2DERG3d
    Monetizing Machine Learning: Quickly Turn Python ML Ideas into Web Applications on the Serverless Cloud:
    amzn.to/2PV3GCV
    Grow Your Web Brand, Visibility & Traffic Organically: 5 Years of amunategui.github.Io and the Lessons I Learned from Growing My Online Community from the Ground Up:
    amzn.to/2JDEU91
    Fringe Tactics - Finding Motivation in Unusual Places: Alternative Ways of Coaxing Motivation Using Raw Inspiration, Fear, and In-Your-Face Logic
    amzn.to/2DYWQas
    Create Income Streams with Online Classes: Design Classes That Generate Long-Term Revenue:
    amzn.to/2VToEHK
    Defense Against The Dark Digital Attacks: How to Protect Your Identity and Workflow in 2019:
    amzn.to/2Jw1AYS
  • Science & Technology

Comments • 44

  • @quasenerd5476 3 years ago +20

    8:24 SHAP interpretation begins
    Thank you for the video!

  • @pavloseimskyi2413 1 year ago +6

    Great video, thanks! Just one little note: At 5:08 when you impute the age, this is actually data leakage. To avoid it, you should only use the average age of the training set in both training and validation sets. Cheers!

    • @asdasvcxvwe5114 1 year ago +1

      Thanks for pointing that out. But I have a question: why not use the training-set average in the training set and the validation-set average in the validation set?
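The imputation point raised in this thread can be sketched as follows. This is a minimal illustration, not the video's code: the ages are made-up numbers, and `fit_mean_imputer` and `apply_mean_imputer` are hypothetical helper names.

```python
import numpy as np


def fit_mean_imputer(train_col):
    """Learn the fill value from the training column only (NaNs ignored)."""
    return float(np.nanmean(np.asarray(train_col, dtype=float)))


def apply_mean_imputer(col, fill_value):
    """Replace NaNs with a fill value that was learned on the training set."""
    col = np.asarray(col, dtype=float)
    return np.where(np.isnan(col), fill_value, col)


# Toy ages (assumed values for illustration only)
train_age = np.array([22.0, 38.0, np.nan, 35.0, 54.0])
valid_age = np.array([np.nan, 27.0, 14.0])

fill = fit_mean_imputer(train_age)              # statistics come from training rows only
train_filled = apply_mean_imputer(train_age, fill)
valid_filled = apply_mean_imputer(valid_age, fill)  # same fill value reused here
```

On the follow-up question: the validation set stands in for data the model has not seen, and at prediction time new rows may arrive one at a time, so there may be no validation-set average to compute. Treating the fill value as part of the fitted model keeps the validation score closer to what you would see in real use.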

  • @himanshubhusanrath212 2 years ago +2

    Thank you so much for such a clear explanation of SHAP values and their interpretation.

  • @xbsd 3 years ago +1

    Clear explanations, excellent work!

  • @user-pb5su9zb9g 4 years ago +3

    Thank you man, really interesting material. I will look and read more about it! You are a great teacher.

  • @stephenhobbs948 4 years ago +1

    Thank you! Your videos are very interesting. I look forward to more.

  • @seanbarrett1681 2 years ago

    Love you man, great video, exactly what I was looking for

  • @rajvernekar8605 1 year ago +1

    Thank you for this video. Very helpful!

  • @jhonnyespinozabryson8241 3 years ago +1

    Thanks for sharing Manuel

  • @vincentdonghoonlee4497 4 years ago +1

    Thank you Manuel for sharing your insight!

    • @viralml 4 years ago +1

      Thanks DHLee! (and thanks for pointing out the wrong source code)

  • @prathameshdinkar2966 3 years ago +1

    Thanks! Very nice video.

  • @danielinflam3s 3 years ago +1

    loved it!

  • @bheeshmak.s5125 2 months ago

    Great explanation.

  • @alexisparenty9445 1 year ago

    Manuel, you're the best!

  • @smeagolita1 3 years ago +1

    Great video!

  • @fliederblumen1843 3 years ago

    Thanks for the video. Can SHAP be used for LSTM model interpretation? There seems to be a problem due to the 3D tensor format of the LSTM output.

  • @quasenerd5476 3 years ago +16

    16:50 I think you may have made a mistake. The x-axis does not give the amount by which a feature affects the prediction in the model's output units. There is a nonlinear relationship between the SHAP values for the features of an example and the prediction the model makes for that example.

    • @moisesdiaz9852 2 years ago

      Thanks, I was getting confused.
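The nonlinearity mentioned in this thread can be illustrated with a small sketch. It assumes the classifier's SHAP values are expressed in log-odds (the raw margin), which is typical for tree-based classifiers but should be checked for your own setup. All numbers are hypothetical.

```python
import math


def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values, all in log-odds units
base_value = -0.5                 # expected raw model output
shap_values = [1.2, -0.3, 0.4]    # per-feature contributions for one example

# SHAP values add up in log-odds space...
raw_output = base_value + sum(shap_values)
# ...and the probability is a nonlinear transform of that sum.
probability = sigmoid(raw_output)

# The same +1.0 log-odds push changes the probability by different amounts
# depending on where the prediction already sits.
shift_near_middle = sigmoid(0.0 + 1.0) - sigmoid(0.0)
shift_near_extreme = sigmoid(3.0 + 1.0) - sigmoid(3.0)
```

Under that assumption, a point on the summary plot's x-axis reads as a shift in log-odds, so it cannot be translated into a fixed change in predicted probability.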

  • @juanete69 1 year ago

    Hello.
    If we apply SHAP to a linear regression model... are those Phi_i equivalent to the coefficients of the regression model? Do they also take into account the variance as the p-values do?
    How is the SHAP value for a variable different from the partial R^2?
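For the linear-model question, a commonly cited result is that, when features are treated as independent, the SHAP value of feature i for a linear model is the coefficient times the feature's deviation from its mean: phi_i = beta_i * (x_i - E[x_i]). The sketch below checks the additivity of that formula on synthetic data; the coefficients and data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # synthetic features (assumed independent)
beta = np.array([2.0, -1.0, 0.5])      # hypothetical regression coefficients
intercept = 3.0


def predict(X):
    """Plain linear model: intercept + X @ beta."""
    return intercept + X @ beta


feature_means = X.mean(axis=0)
# For a linear model, the average prediction equals the prediction at the mean.
base_value = predict(feature_means[None, :])[0]

# Linear SHAP under feature independence: one value per sample and feature.
phi = beta * (X - feature_means)

# Additivity: base value plus the row's SHAP values recovers the prediction.
reconstructed = base_value + phi.sum(axis=1)
```

So phi_i is not the coefficient itself: it varies from row to row with x_i, while beta_i is fixed. It is also a per-prediction attribution rather than a variance-explained statistic, and it carries no standard error or p-value.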

  • @albertoaltamirano5462 3 years ago +1

    Thank you very much, Manuel. This topic is very interesting; it had been recommended to me, and I am in the process of learning. Regards.

    • @viralml 3 years ago

      Thanks, Alberto!

  • @michaeljohn8835 3 years ago

    This video was really helpful! I was wondering, how would you interpret the SHAP graph when you have variables that don’t have a low/high value? Would you have to encode your variables a certain way in order to do this?

  • @jonimatix 3 years ago

    Great video, your material deserves more coverage!
    Is there a way to download the ipynb notebook?

  • @opalkabert 4 years ago +2

    @amunatequi the plot you don't like is in fact the local plot, whose purpose is to explain why an individual received a particular prediction. What you explained is the global explanation for the entire model. So in a case like credit decision-making, the local explanation may be important. This is Albert.

    • @viralml 4 years ago +2

      Hey Albert - always good to hear from you - thanks for clarifying this - will be helpful to me and many others!

    • @viralml 4 years ago +1

      And thanks for watching!!

    • @opalkabert 4 years ago +2

      As always, it was a great post. I am sure there will be a follow-up on deep learning and SHAP, such as using it with tf.keras.
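The local-versus-global distinction in this thread can be sketched with a small SHAP matrix. The feature names and values below are hypothetical and do not come from the video; the sketch only shows how one matrix supports both kinds of reading.

```python
import numpy as np

# Hypothetical SHAP matrix: one row per applicant, one column per feature
feature_names = ["income", "age", "debt"]
shap_matrix = np.array([
    [0.40, -0.10, 0.05],
    [-0.30, 0.20, -0.60],
    [0.10, -0.05, 0.70],
])

# Global view: mean absolute SHAP value per feature across all rows
global_importance = np.abs(shap_matrix).mean(axis=0)
global_ranking = [feature_names[i] for i in np.argsort(-global_importance)]

# Local view: contributions for a single applicant, largest magnitude first
row = 1
local_explanation = sorted(
    zip(feature_names, shap_matrix[row]),
    key=lambda pair: -abs(pair[1]),
)
```

The global ranking summarizes the model as a whole, while the local explanation answers why one specific row received its prediction, which is the use case a credit decision would care about.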

  • @viralml 4 years ago +1

    Sorry - had the wrong link to code - fixed now - thanks!

  •  4 years ago +2

    Kudos Manuel! What about using SHAP or LIME for error analysis?

  • @pratikpratik8495 3 years ago

    I can apply the shap library and interpret the chart, but what is the final report out of it? What do management and users expect from it? I can't show this chart to a non-technical person. Is there any report that can be generated to draw conclusions?

  • @KN-tx7sd 2 years ago

    Could the same be done in R? Thanks.

  • @jardelvieira8742 1 year ago

    I have a problem when I try to use force_plot for multiple samples: "NotImplementedError: matplotlib = True is not yet supported for force plots with multiple samples!". Can you help me?

  • @nancyzhang6790 1 year ago +1

    Great talk. Thanks. BTW, I don't know how Shapley was connected with Washington Univ. He got his A.B. from Harvard and Ph.D. from Princeton.

  • @shaythuramelangkovan5800 2 years ago

    Hi sir, why is it grey at 10:24?

  • @hifredyo1773 1 year ago

    Why is the base value / expected value for the classification problem negative when the problem is a binary classification? I thought the expected value was the mean, so how could it be negative if the only possible values for the target variable are 0 or 1?
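One plausible explanation for a negative base value, assuming the explainer reports the classifier's raw output in log-odds rather than probability: a log-odds value is negative whenever the corresponding probability is below 0.5. The positive-class rate below is a hypothetical number, not taken from the video.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


positive_rate = 0.38                       # hypothetical share of positives
base_value_log_odds = logit(positive_rate)  # negative because 0.38 < 0.5

# Mapping the negative base value back gives a valid probability again.
recovered_rate = sigmoid(base_value_log_odds)
```

Strictly, the expected value is the average of the model's raw outputs, which need not equal the logit of the average target, but the sign logic is the same: a negative base value corresponds to a baseline probability under 50%.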

  • @MadhurDevkota 2 years ago +1

    Thanks for the great SHAP workout. Is it only me, or does he look and sound like the Bill Burr of data science? lol

    • @viralml 2 years ago +1

      Haha thanks, I'll take that as a compliment as I'm a big fan!

  • @user-he7jw9uc4d 3 years ago

    Hey Manuel, thank you for a great instructional video. I studied your code, but at the end there was a problem. How do I solve this? The line model_regressor = CatBoostRegressor(**params) raises NameError: name 'params' is not defined. Thanks.

    • @paulguo7440 3 years ago

      params is a dictionary defined before.

    • @user-he7jw9uc4d 3 years ago

      @paulguo7440 Thank you very much, I got it.
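The fix described in this thread comes down to defining the dictionary before unpacking it with `**`. The sketch below uses a hypothetical stand-in class, `ToyRegressor`, so it runs without CatBoost installed; in the video's code the class is `CatBoostRegressor`. The parameter values are assumptions, since the thread does not show the original `params`.

```python
class ToyRegressor:
    """Stand-in for a model class that accepts keyword arguments."""

    def __init__(self, iterations=100, learning_rate=0.1, depth=6):
        self.iterations = iterations
        self.learning_rate = learning_rate
        self.depth = depth


# The dictionary must exist before the line that unpacks it,
# otherwise Python raises NameError: name 'params' is not defined.
params = {
    "iterations": 500,       # assumed value for illustration
    "learning_rate": 0.05,   # assumed value for illustration
    "depth": 4,              # assumed value for illustration
}

# **params expands the dictionary into keyword arguments:
# ToyRegressor(iterations=500, learning_rate=0.05, depth=4)
model_regressor = ToyRegressor(**params)
```

In a notebook, this error usually means the cell that defines `params` was skipped or run after the cell that uses it.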

  • @btcthousand5188 2 years ago

    It seems your code to handle categorical variables is wrong. You can tell from your SHAP plot that neither CHAS nor RAD is considered :) It has something to do with your "# convert to values to string" piece of code.

  • @hoaxuan7074 3 years ago

    Ankit Patel breaking bad YT video. And then tell me ReLU is not a switch. f(x)=x is connect, f(x)=0 is disconnect. A light switch in your house is binary on off, yet connects and disconnects a continuously variable AC voltage signal.
    AI462 neural networks.