Shapley Values for Machine Learning
- Published on 2 Apr 2023
- Shapley values come from game theory. So, what do they have to do with machine learning? We will see how the Shapley value formula is extended to explain how each model feature has contributed to a prediction. We will also see that the Shapley value axioms lead to desirable properties for a feature attribution method. Finally, we will discuss how Shapley values can be approximated using Monte Carlo sampling, KernelSHAP and TreeSHAP.
*NOTE*: You will now get the XAI course for free if you sign up (not the SHAP course)
SHAP course: adataodyssey.com/courses/shap-with-python/
XAI course: adataodyssey.com/courses/xai-with-python/
Newsletter signup: mailchi.mp/40909011987b/signup
Read the companion article (no-paywall link): towardsdatascience.com/from-s...
Medium: / conorosullyds
Twitter: / conorosullyds
Mastodon: sigmoid.social/@conorosully
Website: adataodyssey.com/
These are some of the best data science tutorials I've seen on YouTube. Don't give up, keep making it. I know you'll make it big =)
Thank you so much! I really appreciate the support :)
Thanks for the tutorial. Nice example!
Typo: 2:15 "{1,...,p}{i}" should be "{1,...,p} - {i}".
Ahhh the maths nerds have found me! No, really, thanks. I know notation is important. Being on the ML side of things, I miss these finer details.
nice explanation! :)
Thank you Erica!
Can you share link of Previous video which explains Shapley Formula?
Sure, Avijeet! You can find all the videos in this playlist: SHAP
th-cam.com/play/PLqDyyww9y-1SJgMw92x90qPYpHgahDLIK.html
Nice video.
But I have a question: what you showed in the video is how to "exclude" a categorical column (degree).
What about a continuous column (a numeric column, like age)?
What value would we use?
Thanks! If the continuous variable is in the coalition, then we use the actual value for that instance (i.e. the person's actual age). If the continuous variable is not in the coalition, then we integrate over the values of the variable w.r.t. the probability of the values.
However, in practice, we will not know the probability distribution of a variable. So we will have to randomly sample different values for the variable from our dataset. We do this a bunch of times so we end up approximating the distribution.
I hope that makes sense? There is a lot of statistical theory that underlies this explanation!
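For anyone who wants to see the sampling idea from this reply in code, here is a minimal NumPy sketch of the Monte Carlo Shapley estimate. It is an illustration only, not the SHAP library: the function name shapley_mc, the model f, and the background data are all hypothetical. For each iteration we draw a random background row and a random feature ordering, then measure how the prediction changes when feature j switches from the background value to the instance's actual value.

```python
import numpy as np

def shapley_mc(f, x, X_background, j, n_iter=1000, rng=None):
    """Monte Carlo estimate of the Shapley value of feature j for instance x.

    f            : model prediction function taking a 2D array
    X_background : dataset rows used to sample values for features
                   outside the coalition (approximates their distribution)
    """
    rng = np.random.default_rng(rng)
    n, p = X_background.shape
    contributions = np.empty(n_iter)
    for t in range(n_iter):
        z = X_background[rng.integers(n)]   # random background instance
        perm = rng.permutation(p)           # random feature ordering
        pos = np.where(perm == j)[0][0]
        # x_plus: features up to and including j come from x, rest from z
        x_plus = z.copy()
        x_plus[perm[:pos + 1]] = x[perm[:pos + 1]]
        # x_minus: identical, except feature j comes from z
        x_minus = x_plus.copy()
        x_minus[j] = z[j]
        contributions[t] = f(x_plus[None, :])[0] - f(x_minus[None, :])[0]
    return contributions.mean()
```

For a linear model f(x) = w · x, this estimate converges to w_j * (x_j - mean of feature j), which matches the exact Shapley value, so it is a handy sanity check.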
@@adataodyssey Ahh I see, got it.
But out of curiosity, Can you give me the reference of those statistical theory?
@@mxzeromxzero8912 Unfortunately, I don't have any specific references. I'm using the knowledge from back in my undergrad. If you want to understand, take a look at "stochastic calculus".
At 5:27 you mention the formula for calculating val_x(S). Don't we also need to subtract E_X[f(X)] from that?
You can, but they will cancel out when you subtract val(S) from val(S ∪ {i}).
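To spell out the cancellation, assuming the usual definition where the value of a coalition is the conditional expectation minus the baseline prediction:

```latex
\operatorname{val}_x(S) = E\big[f(X) \mid X_S = x_S\big] - E_X\big[f(X)\big]
```

Taking the marginal contribution of feature i:

```latex
\operatorname{val}_x(S \cup \{i\}) - \operatorname{val}_x(S)
= E\big[f(X) \mid X_{S \cup \{i\}} = x_{S \cup \{i\}}\big]
- E\big[f(X) \mid X_S = x_S\big]
```

The baseline term E_X[f(X)] appears in both values with the same sign, so it drops out of every marginal contribution in the Shapley sum.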
Aaaaah of course, that makes sense! Thanks, you helped me a lot - not only with this comment but the entire video series:)
@@NeverHadMakingsOfAVarsityAthle No problem Matthias! I'm glad I could help :)