ERRATUM: At 3:44, where the two equations are shown:
The second one should have the derivative with respect to x_t rather than w_t, so that we increase the loss as much as possible by travelling in the direction of the gradient of the loss with respect to the INPUT, rather than with respect to the weights. Thanks to Hannes Whittingham for pointing this out! 🎯
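For reference, the corrected pair of update rules would read roughly as follows (my notation: α is the SGD learning rate, ε the FGSM step size, and the sign(·) follows the standard FGSM formulation):

w_{t+1} = w_t - \alpha \, \frac{\partial L}{\partial w_t} \qquad \text{(SGD: step against the weight gradient to decrease the loss)}
x_{t+1} = x_t + \epsilon \, \operatorname{sign}\!\left(\frac{\partial L}{\partial x_t}\right) \qquad \text{(FGSM: step along the sign of the input gradient to increase the loss)}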
Thank you for the good video! Just watched this as part of the BlueDot AI safety fundamentals course, and I'm excited to learn more about adversarial examples.
Incredible that such great information is just free on TH-cam! Thanks for the video! Great job!!!
Thanks for your heartwarming message!
This channel is going to get super popular soon.
Wonderful explanation, Ma'am! Thank you so much.
This video truly deserves more views!! Very informative content explained in a simple way. Thank you very much for uploading it, I love it.
Excellent video
Awesome content!! Such a great and concise explanation💕.
Nice work!
Thanks for such incredible videos.
Great explanation.
Thanks. On spring break.
The paper "On Adaptive Attacks to Adversarial Example Defenses" by Tramèr et al. shows that none of the studied defense mechanisms against adversarial examples are robust.
It is not clear to me from the video how FGSM modifies the input to offset the SGD weight update calculated on the loss. The input x is not on the axes of the graph. Why can changing the input interfere with the weight update?
Thanks for the question. What an old video, yes, I could have made it clearer.
The idea is to backpropagate the loss through the weights all the way to the input neurons (input x), and then update the input x in the same way in which SGD updates the weights. I showed it for the weights because we can treat the input x, which is now a variable, as an additional set of weights.
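A minimal PyTorch sketch of that idea (the pretrained model and the ε value are placeholders I chose for illustration, not the exact setup from the video):

import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()  # any differentiable classifier works here

def fgsm(x, y, eps=0.007):
    # treat the input like a set of weights: it gets a gradient of its own
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()  # backpropagate through the (frozen) weights all the way down to the input
    # SGD on weights:  w <- w - lr * dL/dw      (decrease the loss)
    # FGSM on inputs:  x <- x + eps * sign(dL/dx)  (increase the loss)
    return (x + eps * x.grad.sign()).detach()

# usage: adv = fgsm(images, labels); each pixel changes by at most eps,
# so adv usually looks identical to images but can flip the prediction.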
awesome
The initial panda-gibbon example would be an example of a targeted black-box attack, correct?
Correct. :)
@@AICoffeeBreak You actually specified in the video that this is a white-box attack (untargeted or targeted, we need access to the gradients, which is white-box, no?)
@@orellavie6233 Bonus points to you for paying this much attention. 👍 Yes, in the paper they used a white-box algorithm (access to gradients), true. But the same result could be achieved with a black-box algorithm too.
@@AICoffeeBreak Thanks :)! How is it possible to achieve it with a black-box attack? By using a transfer surrogate model like Papernot proposed? Or have I missed something? You need either the gradients of the model, or to query the model until you find the right path?
@@orellavie6233 Brute-forcing is indeed an approach. And yes, the Papernot et al. local substitute model could also be a thing.
Here is a great survey on black-box adversarial attacks: arxiv.org/abs/1912.01667
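For intuition, here is a toy sketch of that substitute-model (transfer) idea — tiny random MLPs stand in for real image classifiers, and the architecture, query budget and ε are all made up for illustration:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in "black box": we may only call it for labels, never touch its gradients.
target = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2)).eval()

def query(x):
    with torch.no_grad():
        return target(x).argmax(dim=1)  # label-only access, like a remote API

# Train a local substitute on (input, queried label) pairs.
substitute = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)
for _ in range(500):
    x = torch.randn(128, 20)
    loss = F.cross_entropy(substitute(x), query(x))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Craft FGSM examples on the substitute and hope they transfer to the black box.
x = torch.randn(64, 20, requires_grad=True)
F.cross_entropy(substitute(x), query(x)).backward()
x_adv = (x + 0.5 * x.grad.sign()).detach()
print("flipped predictions:", (query(x_adv) != query(x.detach())).float().mean().item())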
What about contrastive learning? For example, I think that the image that most matches CLIP's "a panda" would be a realistic image of a panda.
Why don't you try installation tutorials alongside these? That could reach a broader audience for your work.
BTW awesome work 👌
Why is nobody interested in WHY it is possible, instead of just how to apply it?
Hi Pavel! 1:54 explains one very simple way of how to do it. Here I try to break it down even further: We have the model with its specific decision boundary (fixed and given). So instead of changing the parameters of the model, we change the *input* slightly, just enough to cross to the other side of the *decision boundary*. How do we achieve that? By FGSM at 1:54, for example.
This could have been a wonderful diagram to make and explain in the video, in hindsight...
@@AICoffeeBreak
No, that is how to do it, but not why it works at all.
I mean, why does it take so little to cross the decision boundary?
If you and I didn't know about adversarial examples before, and you came up with the idea and told me that you can fool a neural network by a small change in pixel values, I wouldn't believe you.
Why, when we create an adversarial example for some image, for example a "car", and we want it to be classified as an "airplane", do we not see something like "wings" start to appear? Instead, the added values look like noise.
When I first saw it, I thought it was an overfitting problem: the decision boundary has a very complicated shape, and hence almost every input image is placed near the decision boundary.
But that raises some questions:
1) Why do neural nets become more confident in the prediction of the adversarial example than in the original image, if the decision boundary is so complicatedly shaped?
2) Why doesn't random noise change the predicted class, and why do we need specific directions? We would expect random predictions if the boundaries had irregular shapes.
3) Why can we add the same adversarial difference to any other image and still get the same misclassification with the same predicted class? We would also expect random results.
It means that there is something interesting going on. And when I was searching for the answer, I found an interesting video by Ian Goodfellow: th-cam.com/video/CIfsB_EYsVI/w-d-xo.html which I recommend.
He proposed a very interesting idea: that it may be not because of overfitting but because of underfitting, and that neural networks, in spite of the non-linearities in the activation functions, are to some extent piecewise-linear models. And because of the linearity of the model, we can find some direction which goes deep beyond the decision boundary. That would explain the previous questions:
1) It's simply because in linear models, if we go very deep beyond the decision boundary, we have more confidence in the prediction.
2) If the goal is to move far in a certain direction, then it can be explained why a random direction wouldn't give us the desired results.
3) Because of the linearity of the decision boundary, we can cross this boundary from any point, if the adversarial direction vector is long enough.
And it gives us some interesting insights about how neural networks actually work and how difficult the problem of adversarial examples actually is.
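A quick numerical illustration of that linearity argument (plain numpy; the dimensions and ε are chosen arbitrarily by me): for a linear score w·x, nudging every pixel by only ε in the direction sign(w) shifts the score by ε·Σ|w_i|, which grows linearly with the input dimension, so in image-sized spaces a visually imperceptible per-pixel change can push the score far across the boundary.

import numpy as np

rng = np.random.default_rng(0)
eps = 0.01  # imperceptibly small change per pixel
for d in (10, 1000, 150528):  # 150528 = 224*224*3, a typical image size
    w = rng.normal(size=d)    # weights of a linear score w.x
    x = rng.normal(size=d)
    shift = eps * np.abs(w).sum()  # score change from x -> x + eps*sign(w)
    print(f"d={d:7d}  typical |w.x| ~ {np.sqrt(d):7.1f}  adversarial shift = {shift:9.1f}")
# The per-pixel perturbation stays at 0.01, but the score shift scales with d,
# while the typical score only grows like sqrt(d) -- Goodfellow's linearity point.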
Now I understand your question much better, thanks for the lengthy answer! But there you have it: the "why" is not at all trivial to answer. I also recommend the link you suggested to everyone who prefers all the juicy details in an hour and a half instead of a 10-minute taste bite. 😃 Thank you! I'll add it to the video description (th-cam.com/video/CIfsB_EYsVI/w-d-xo.html).
What is the reasoning behind using the sign of the gradients instead of the gradients themselves? It feels like you are just throwing away useful information when you only use the sign.
Hi, and thanks for the question. The sign determines the direction to move in for each pixel, and the step size ε determines how far to move. That way every pixel changes by the same small amount, so the perturbation stays bounded and visually imperceptible, and under that constraint the signed step is what increases the loss the most for a locally linear model. So less information is thrown away than it seems.
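In symbols (a sketch; ε is the per-pixel budget, α a generic step size):

x_{\mathrm{adv}} = x + \epsilon \, \operatorname{sign}\!\big(\nabla_x L(x, y)\big) \qquad \text{(FGSM: every pixel changes by exactly } \epsilon \text{, an } L_\infty\text{-bounded step)}
x_{\mathrm{adv}} = x + \alpha \, \nabla_x L(x, y) \qquad \text{(raw gradient step: keeps the magnitude, used by other iterative attacks)}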