Shapley Values Explained | Interpretability for AI models, even LLMs!

  • Published Jun 15, 2024
  • Ever wondered how to interpret your machine learning models? 🤔 We explain a powerful interpretability technique for machine learning models: Shapley values. They can be used to explain any model. 💻 We walk through a simple code example of how they work, and then 📖 explain the theory behind them.
    AssemblyAI (Sponsor) 👉 www.assemblyai.com/research/u...
    AI Coffee Break Merch! 🛍️ aicoffeebreak.creator-spring....
    Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
    Dres. Trost GbR, Siltax, Vignesh Valliappan, Michael, Sunny Dhiana, Andy Ma
    Outline:
    00:00 Interpretability in AI
    01:02 AssemblyAI (Sponsor)
    02:23 Simple example
    03:51 Code example: SHAP
    05:17 Shapley Values explained
    07:59 Shortcomings of Shapley Values
    💻 Demo for SHAP on LLaMA 2 LLM: drive.google.com/drive/folder...
    Keep in mind that you need to have the resources to run LLaMA 2. If not, try out the “gpt2” model in the code. You can find simple examples here: shap.readthedocs.io/en/latest/ (see e.g., “Text examples”)
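    Not from the video, but as a rough idea of what the "gpt2" variant could look like: a minimal, untested sketch following the shap "Text examples" docs (the prompt is made up, and exact API details may differ across shap versions):

    import shap
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load a small causal LM; "gpt2" is the lightweight stand-in suggested above.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.config.is_decoder = True  # mark the model as a text generator for shap

    # shap can wrap a (model, tokenizer) pair to explain generated text
    explainer = shap.Explainer(model, tokenizer)

    # Attribute each generated token back to the input tokens (arbitrary example prompt)
    shap_values = explainer(["I enjoy walking with my cute dog"])

    # Interactive token-level visualization; renders in a notebook
    shap.plots.text(shap_values)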
    📙“Interpretable Machine Learning” by C. Molnar: christophm.github.io/interpre...
    ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
    🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
    Patreon: / aicoffeebreak
    Ko-fi: ko-fi.com/aicoffeebreak
    Join this channel to get access to perks:
    / @aicoffeebreak
    ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
    🔗 Links:
    AICoffeeBreakQuiz: / aicoffeebreak
    Twitter: / aicoffeebreak
    Reddit: / aicoffeebreak
    TH-cam: / aicoffeebreak
    #AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
    Video editing: Nils Trost
  • Science and Technology

Comments • 29

  • @AICoffeeBreak
    @AICoffeeBreak  months ago +5

    Maybe I should have mentioned this in the video: A huge problem in AI interpretability is faithfulness vs. plausibility. Users like *plausible* explanations that look right to them ("aha, this makes sense!"). But sometimes they see things that are counterintuitive, or attributions that make no sense to them. Then, even if the explanations are *faithful* to the model's workings, they will seem alien and weird, and users will dislike such a model or blame the interpretability method.
    Why is feature attribution seldom used in production? Because it can help users game the system. 😅 If you know your credit score is low because you have two cars, you will sell that extra car and increase your score.
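    To make "feature attribution" concrete: a Shapley value averages a feature's marginal contribution v(S ∪ {i}) − v(S) over all coalitions S, weighted by |S|!(n−|S|−1)!/n!. Here is a toy, untested Python sketch with a made-up credit-score function (all feature names and numbers are hypothetical):

    from itertools import combinations
    from math import factorial

    def shapley_values(features, value_fn):
        # Exact Shapley values by enumerating all coalitions.
        # Exponential in len(features), so toy sizes only.
        n = len(features)
        phi = {}
        for i in features:
            rest = [f for f in features if f != i]
            total = 0.0
            for k in range(n):
                for S in combinations(rest, k):
                    weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                    total += weight * (value_fn(frozenset(S) | {i}) - value_fn(frozenset(S)))
            phi[i] = total
        return phi

    # Hypothetical scoring model: income helps, owning two cars hurts.
    def credit_score(present):
        score = 600.0
        if "high_income" in present:
            score += 120.0
        if "two_cars" in present:
            score -= 80.0
        return score

    # The attributions sum to credit_score(all) - credit_score(none), per the efficiency axiom.
    print(shapley_values(["high_income", "two_cars"], credit_score))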

    •  months ago +1

      It's inevitable that some folks will try to exploit any system, no matter how well-designed it is. Recently, we've seen some clever algorithms try to game the benchmarks, but they often fail spectacularly in the real world. It would be great to have a little extra help to detect this kind of fraud. Something like 'humans' or 'reasoning agents' could be a good place to start.

  • @DerPylz
    @DerPylz months ago +12

    It's always great to see "old" ideas getting used for solving new problems. I had heard about Shapley values and was hoping you'd make a video explainer about them. Thanks!

  • @vietvu9714
    @vietvu9714 26 days ago +2

    The explanation was far better than anything I expected :D Very well done!

  • @juanmanuelcirotorres6155
    @juanmanuelcirotorres6155 months ago +4

    A series about interpretability would be awesome.

  • @md.enamulhoq9389
    @md.enamulhoq9389 months ago +4

    Best of luck with your thesis. Stay sound. Love you.

  • @dianai988
    @dianai988 months ago +3

    Interpretability was the rabbit hole that got me into deep learning; I would love to see more content on this topic (and if you need ideas on things to explore, lmk) ♥ (Also, SHAP was one of the earliest interpretability techniques I came across, after meeting the researcher working on it at the University of Washington at a poster session. So great to see how far this work has come since then!)

  • @AGI-Bingo
    @AGI-Bingo months ago +2

    This is really cool. I can imagine that in the future we'll have really good interpretability tools; for example, marking a piece of text in the LLM output and having it highlight the tokens from the context that influenced it the most ❤

  • @manolisnikolakakis7292
    @manolisnikolakakis7292 months ago +3

    Thanks for another great explanation! Good luck with your thesis :)

  • @SU3D3
    @SU3D3 months ago +3

    Excellent! Always providing the goods.

  • @MyrLin8
    @MyrLin8 months ago +2

    Very nice training vid, good job. Useful info, good examples and references.

  • @MachineLearningStreetTalk
    @MachineLearningStreetTalk months ago +6

    🔥🔥🔥

  • @Nif3
    @Nif3 months ago +2

    This is really interesting and your explanation was excellent, but... did that coffee bean really just wink at me?

  • @yannickpezeu3419
    @yannickpezeu3419 months ago +2

    Thanks!

  • @Ben_D.
    @Ben_D. months ago +9

    Came for the AI commentary. Stayed for the god level lipstick.

  • @harumambaru
    @harumambaru months ago +2

    Are those acoustic boards on the walls? Pretty easy to get all the words, as another Eastern European English speaker.

    • @AICoffeeBreak
      @AICoffeeBreak  months ago +1

      Yes, that is acoustic foam. Otherwise I sound like I'm speaking from a bathroom. 🤭

  • @abhishekshakya6072
    @abhishekshakya6072 months ago +1

    Thanks for referencing the mathematical equations from research papers. It really validates the authenticity of your work. I felt the video was a bit rushed; I was probably expecting a longer video with more examples.
    But I understand you might have a time crunch with your thesis. Good luck ✌

  • @sifonios
    @sifonios months ago +1

    Hm. I would have liked to watch this, but the background music is far too loud and very distracting. ... Ah, it does stop after a while. Yes, it is very interesting and useful for me :)

    • @AICoffeeBreak
      @AICoffeeBreak  months ago +1

      I agree, I noticed that too in the final pass. I'll make it better next time.

    • @DerPylz
      @DerPylz months ago +1

      Sorry, that's on me (her editor). Something got messed up in the audio mixing and we didn't notice it before uploading. Luckily, it's only during the introduction, so the main part of the video should be fine 😅

  • @gordonfreeman4357
    @gordonfreeman4357 months ago

    Not gonna lie, I think that this is basically useless on autoregressive models.

    • @AICoffeeBreak
      @AICoffeeBreak  months ago +8

      👀 Don't leave us hanging here, explain your statement. 😅