Awesome video! This (together with the last two videos) is one of the best explanations of Transformers I've seen. Thanks and keep it up!
This is gold. I hope it gets the ATTENTION it deserves!
Thanks!! More attention will surely TRANSFORM this channel! 😂
Incredible series, helped me a lot.
I really don't want to leave you just one like. Instead, I want to leave you one hundred likes... Unfortunately, Google limits me to one...
Thanks!! Super appreciated!
Thank you!
Dude, you are a treasure, keep it up!
The gist of this video is at 4:29. Great job, thanks!
At 2:33 you mentioned that self-attention is more biased, but at 2:54 you also mentioned that self-attention reduces inductive bias?? Sorry, but I'm a bit confused.
Self-attention indeed reduces inductive bias and adopts a more general learning framework. At 2:33 I am asking a question: "Is Self-Attention more general or more biased?" And then I continue with "I'll argue that Self-Attention is not only more general than CNNs and RNNs but even more general than MLP layers."
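To make the "more general" point concrete, here is a minimal NumPy sketch (my own illustration, not from the video): the mixing weights A in self-attention are computed from the input itself, so different inputs are mixed in different ways, whereas an MLP applies the same fixed weight matrix W to every input after training.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # The mixing matrix A is recomputed from X for every input,
    # so the "connectivity pattern" between tokens is data-dependent.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # (n, n), input-dependent
    return A @ V

def mlp_layer(X, W):
    # The weight matrix W is fixed after training:
    # every input is transformed by the same mixing pattern.
    return np.maximum(X @ W, 0.0)

rng = np.random.default_rng(0)
n, d = 4, 8                                       # 4 tokens, 8-dim embeddings
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)        # (4, 8)
print(mlp_layer(X, rng.normal(size=(d, d))).shape)  # (4, 8)
```

Because A is recomputed per input, self-attention can realize many different mixing patterns at inference time; that flexibility is the sense in which it carries less built-in (inductive) bias than a fixed-weight layer.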
Should have a hybrid structure ...... 😅
This is true. Much computer vision research has now moved toward adding elements of CNNs/inductive bias back into the Vision Transformer architecture, for example Swin Transformers.
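For intuition on what "adding the bias back" looks like, here is a toy sketch (my own, in NumPy; the real Swin Transformer also shifts windows between layers, adds relative position biases, etc.) of window-restricted attention: tokens only attend within a local window, which reintroduces a CNN-like locality prior.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def windowed_attention(X, Wq, Wk, Wv, window=2):
    # Split the sequence into non-overlapping windows and run
    # self-attention inside each window only: a locality bias,
    # similar in spirit to a convolution's local receptive field.
    n, d = X.shape
    out = np.empty_like(X)
    for s in range(0, n, window):
        Xw = X[s:s + window]
        Q, K, V = Xw @ Wq, Xw @ Wk, Xw @ Wv
        A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # (w, w) per window
        out[s:s + window] = A @ V
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
print(windowed_attention(X, Wq, Wk, Wv, window=4).shape)  # (8, 16)
```

Restricting attention this way trades some of the generality discussed above for the data efficiency that a locality prior buys, which is exactly the hybrid trade-off these architectures are exploring.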
Came here after interest from 3Blue1Brown. It's clear you've got a great explanation style... plus you were earlier ;). Hope your channel following grows to match your outstanding quality.
Welcome! Thanks a lot for the shoutout!
Congrats!! Awesome channel!
Thanks!
wowww! what a great explanation! helps knit so many individual concepts together in one cohesive knowledge base!! thanks a lot for making this video and all the animations!
Thanks!!