Animated AI
  • 11 videos
  • 210,764 views
Multihead Attention's Impossible Efficiency Explained
If the claims in my last video sound too good to be true, check out this video to see how the Multihead Attention layer can act like a linear layer with far less computation and far fewer parameters.
Patreon: www.patreon.com/Animated_AI
Animations: animatedai.github.io/
5,196 views

Videos

What's So Special About Attention? (Neural Networks)
6K views • 7 months ago
Find out why the multihead attention layer is showing up in all kinds of machine learning architectures. What does it do that other layers can't? Patreon: www.patreon.com/Animated_AI Animations: animatedai.github.io/
Pixel Shuffle - Changing Resolution with Style
9K views • 1 year ago
Patreon: www.patreon.com/Animated_AI Animations: animatedai.github.io/#pixel-shuffle
Source of confusion! Neural Nets vs Image Processing Convolution
4.6K views • 1 year ago
Patreon: www.patreon.com/Animated_AI All Convolution Animations are Wrong: th-cam.com/video/w4kNHKcBGzA/w-d-xo.html My Animations: animatedai.github.io/ Intro sound: "Whoosh water x4" by beman87 freesound.org/s/162839/ Bee image: catalyststuff on Freepik www.freepik.com/free-vector/cute-bee-flying-cartoon-vector-icon-illustration-animal-nature-icon-concept-isolated-premium-vector_31641108.htm#q...
Groups, Depthwise, and Depthwise-Separable Convolution (Neural Networks)
37K views • 1 year ago
Patreon: www.patreon.com/Animated_AI Fully animated explanation of the groups option in convolutional neural networks followed by an explanation of depthwise and depthwise-separable convolution in neural networks. Animations: animatedai.github.io/ Intro sound: "Whoosh water x4" by beman87 freesound.org/s/162839/
Stride - Convolution in Neural Networks
8K views • 1 year ago
Patreon: www.patreon.com/Animated_AI A brief introduction to the stride option in neural network convolution followed by some best practices. Intro sound: "Whoosh water x4" by beman87 freesound.org/s/162839/
Convolution Padding - Neural Networks
9K views • 2 years ago
Patreon: www.patreon.com/Animated_AI A brief introduction to the padding option in neural network convolution followed by an explanation of why the default is named "VALID". Intro sound: "Whoosh water x4" by beman87 freesound.org/s/162839/
All Convolution Animations Are Wrong (Neural Networks)
65K views • 2 years ago
Patreon: www.patreon.com/Animated_AI All the neural network 2d convolution animations you've seen are wrong. Check out my animations: animatedai.github.io/
Filter Count - Convolutional Neural Networks
16K views • 2 years ago
Patreon: www.patreon.com/Animated_AI Learn about filter count and the realistic methods of finding the best values My Udemy course on High-resolution GANs: www.udemy.com/course/high-resolution-generative-adversarial-networks/?referralCode=496CFB7F680D78F02798
Kernel Size and Why Everyone Loves 3x3 - Neural Network Convolution
29K views • 2 years ago
Patreon: www.patreon.com/Animated_AI Find out what the Kernel Size option controls and which values you should use in your neural network.
Fundamental Algorithm of Convolution in Neural Networks
22K views • 2 years ago
Patreon: www.patreon.com/Animated_AI See convolution in action like never before!

Comments

  • @leoliu9299
    @leoliu9299 2 days ago

    This is such an amazing video. Thank you.

  • @clutchplayz1180
    @clutchplayz1180 8 days ago

    straight heat i can't even lie good stuff bro

  • @clutchplayz1180
    @clutchplayz1180 9 days ago

    fye

  • @Tyler-i2d
    @Tyler-i2d 15 days ago

    I feel silly for asking, but the different colored blocks (in the middle) correspond to convolutions over different channels of the original matrix right?

  • @involuntaryoccupant
    @involuntaryoccupant 21 days ago

    i finally understood why convolution makes more channels. thank you so so much

  • @adimaqsood3040
    @adimaqsood3040 1 month ago

    So the point he is trying to make is that an "image" is represented as a 3-D object/array: height (pixels), width (pixels), and RGB components. But when we talk about 2-D images we usually mean a "grayscale image", which doesn't require mentioning the RGB part explicitly, although it is still a 3-D array. So for a colored image of 128x128 pixels, the tensor shape would be (128, 128, 3), and for a grayscale image it would be (128, 128, 1).
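
The shapes in the comment above can be checked directly; here is a minimal NumPy sketch (the 128x128 size is just the commenter's example):

```python
import numpy as np

# A 128x128 color image: height, width, and 3 RGB channels.
rgb = np.zeros((128, 128, 3))

# A 128x128 grayscale image: still a 3-D array, just with one channel.
gray = np.zeros((128, 128, 1))

print(rgb.shape)   # (128, 128, 3)
print(gray.shape)  # (128, 128, 1)
```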

  • @wwxyz7570
    @wwxyz7570 1 month ago

    Is feature and filter count the same as PyTorch channel size?

  • @evr0.904
    @evr0.904 1 month ago

    The convolution is the same, you're just using different dimensions and then using the same dimension label for the operation. You're comparing apples to oranges.

  • @mMaximus56789
    @mMaximus56789 1 month ago

    These are the best animations I've seen about neural nets. I hope we can get a video as clear as the one on depthwise-separable convolutions, but for attention.

  • @msergejev
    @msergejev 1 month ago

    An absolute pinnacle of online education materials in the field when it comes to giving a real gut intuition of what the operations look like 🙌 It's a real talent you've got there; thank you, on behalf of the rest of the internet, for using it well

  • @NoMusiciansInMusicAnymore
    @NoMusiciansInMusicAnymore 2 months ago

    This helped so much, you can't understand how thankful I am

  • @Scarlety8109
    @Scarlety8109 2 months ago

    game changer for anyone learning neural networks

  • @sashanksingh6714
    @sashanksingh6714 2 months ago

    Well explained.

  • @sashanksingh6714
    @sashanksingh6714 2 months ago

    Nice video, you should post more .

  • @felipelourenco8054
    @felipelourenco8054 2 months ago

    Thanks for that. It was really confusing before your animation came up!

  • @grjesus9979
    @grjesus9979 2 months ago

    So in the case of a feature-map input, does 2D conv just replicate each 2D filter along the feature dimension and do element-wise multiplication? In the video, are the filters really 2D, just replicated to fill the number of features? Or is each 2D filter in reality a 3D tensor that matches the feature dimension?
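
The answer to the question above is the latter: each filter is a 3-D tensor whose depth matches the input's feature dimension, with distinct weights per channel. A minimal sketch (the sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

in_channels, k = 8, 3
x = rng.random((in_channels, 32, 32))       # input: depth x height x width

# One "2D" filter is really a 3-D tensor of shape (in_channels, k, k);
# it is not a replicated 2-D slice but has its own weights per channel.
one_filter = rng.random((in_channels, k, k))

# One output value: element-wise multiply a patch, then sum over ALL channels.
patch = x[:, 0:k, 0:k]
out_value = np.sum(patch * one_filter)

print(one_filter.shape)  # (8, 3, 3)
```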

  • @hyahyahyajay6029
    @hyahyahyajay6029 2 months ago

    I have been struggling to mentally visualize convolutions, especially going from one dimension to others. I was reading the book Understanding Deep Learning by Simon Prince and I realized that what I thought it looked like was wrong (the 2D-to-2D animations from the beginning). I wish I had stumbled upon yours before having to imagine what was explained in the books XD (good book though)

  • @ananyapamde4514
    @ananyapamde4514 2 months ago

    Such a beautiful video

  • @AmbrozeSE
    @AmbrozeSE 2 months ago

    Thank you. Great job

  • @minecraftermad
    @minecraftermad 3 months ago

    wanna bet e by e would be somehow mathematically optimal?

  • @thrisharamkumar9566
    @thrisharamkumar9566 3 months ago

    Coming here after "All convolution animations are wrong", brilliant work! thank you very much!

  • @thecodegobbler2179
    @thecodegobbler2179 3 months ago

    The other drawings and visuals can't keep up with this! Great content! I love the visualizations!

  • @harrydawitch
    @harrydawitch 3 months ago

    My second favourite channel after 3Blue1Brown

  • @keihoag6467
    @keihoag6467 3 months ago

    Please do a video on backpropagation (since it's another convolution)

  • @feddyxdx272
    @feddyxdx272 3 months ago

    thx

  • @bibimblapblap
    @bibimblapblap 3 months ago

    Why is your input tensor so many dimensions? Shouldn’t the depth be only 3 (1 for each color channel)?

  • @lizcoultersmith
    @lizcoultersmith 3 months ago

    These videos are outstanding! Finally, true visualisations that get it right. I'm sharing these with my ML Masters students. Thank you for your considerable effort putting these together.

  • @captainjj7184
    @captainjj7184 4 months ago

    I like it, really, love it! But... I don't see what's wrong with other illustrations and peculiarly I think yours just iterates what they already clearly illustrate. I was even expecting CNN representations in XYZ visuals. Am I missing some points here? Honest question, would appreciate any enlightenment! (btw, thank you for sharing the world with your own version of splendid animation!) PS: If you're up for the challenge, do Spiking NN, I'll buy you a beer in Bali!

  • @mayapony
    @mayapony 4 months ago

    thx!!

  • @afrolichesmain777
    @afrolichesmain777 4 months ago

    It's funny you mention that the number of kernels is the least exciting part; my thesis was an attempt at finding a systematic way to reduce the number of kernels by correlating them and discarding kernels that "extract roughly the same features". Great video!

  • @alexvillalobos8245
    @alexvillalobos8245 4 months ago

    jiff

  • @nikilragav
    @nikilragav 4 months ago

    The reason the filter at 3:00 being 2D gets glossed over is because most image signal processing is taught in grayscale

  • @nikilragav
    @nikilragav 4 months ago

    2:13 how does it stay at the same size? Padding the edges of the original image?
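
Regarding the question above: yes, keeping the output the same size is done with "same" padding, i.e. zero-padding the border of the input. A quick sketch with a hypothetical 32x32 input and 3x3 kernel:

```python
import numpy as np

x = np.zeros((32, 32))   # hypothetical single-channel input
k = 3
pad = k // 2             # pad by 1 on each side for a 3x3 kernel

# "Same" padding: zero-pad the border so the valid-convolution output
# ends up the same height and width as the original input.
xp = np.pad(x, pad)
out_h = xp.shape[0] - k + 1   # 32 + 2 - 3 + 1 = 32
out_w = xp.shape[1] - k + 1

print(out_h, out_w)  # 32 32
```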

  • @nikilragav
    @nikilragav 4 months ago

    What actually is the 3rd dimension in this context for the giant source cube? Is that multiple colors? A batch of multiple images?

  • @sensitive_machine
    @sensitive_machine 4 months ago

    Lol grayscale is a real thing still! Medical and microscopy imaging

  • @sensitive_machine
    @sensitive_machine 4 months ago

    this is awesome and is inspiring me to learn blender!

  • @hieuluc8888
    @hieuluc8888 4 months ago

    0:16 If filters are stored in a 4-dimensional tensor and one of them represents the number of filters, then what does the depth represent?

    •  21 days ago

      it represents the depth of the input tensor
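
Spelling out the reply above (a sketch with hypothetical sizes, using the PyTorch-style dimension order for convolution weights):

```python
import numpy as np

# Convolution weights as a 4-D tensor:
# (filter_count, input_depth, kernel_height, kernel_width).
filter_count, input_depth, kh, kw = 64, 32, 3, 3
weights = np.zeros((filter_count, input_depth, kh, kw))

# The second dimension is the depth: it must match the number of
# channels in the input tensor, and the layer's output then has
# filter_count channels.
print(weights.shape)  # (64, 32, 3, 3)
```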

  • @FrigoCoder
    @FrigoCoder 5 months ago

    I have some hobbyist signal processing experience of a few decades, and these new methods seem so amateurish compared to what we had in the past. FFT, FHT, DCT, MDCT, FIR filters, IIR filters, FIR design based on frequency response, edge-adapted filters (so no need for smaller outputs), filter banks, biorthogonal filter banks, window functions, wavelets, wavelet transforms, Laplacian pyramids, curvelets, contourlets, non-separable wavelets, multiresolution analysis, compressive sensing, sparse reconstruction, SIFT, SURF, BRISK, FREAK, yadda yadda. Yes, we even had even-length filters, and different filters for analysis than for synthesis.

    • @equationalmc9862
      @equationalmc9862 4 months ago

      There are equivalents in AI model development and inference for those, though. Many of these signal processing techniques have analogs or are directly applicable in AI and machine learning:
      - **FFT, FHT, DCT, and MDCT:** used in feature extraction and preprocessing steps for machine learning models, especially in audio and image processing.
      - **FIR and IIR filters:** used in preprocessing steps to filter and clean data before feeding it into models.
      - **Wavelets and wavelet transforms:** applied for feature extraction and data compression, useful in handling time-series data.
      - **Compressive sensing and sparse reconstruction:** important in developing models that can work with limited data and in reducing the dimensionality of data.
      - **SIFT, SURF, BRISK, and FREAK:** feature detection and description techniques that are foundational in computer vision tasks like object recognition and image matching.
      In AI, techniques like convolutional neural networks (CNNs) often use concepts from signal processing (like filtering and convolutions) to process data in a way that mimics these traditional methods. Signal processing principles help in designing more efficient algorithms and models, improving performance in tasks such as image recognition, speech processing, and time-series analysis.

  • @rafa_br34
    @rafa_br34 5 months ago

    Incredibly helpful, keep up the good work!

  • @commanderlake7997
    @commanderlake7997 5 months ago

    I'm confused because you make it look like an attention layer could be used as a drop-in replacement for a linear layer but GPT-4o says: "No, an attention layer cannot be used as a direct drop-in replacement for a linear layer due to the fundamental differences in their functionalities and operations."?

    • @animatedai
      @animatedai 5 months ago

      That’s correct that an attention layer is not functionally equivalent to a linear layer. This efficiency comes with its own trade-offs. But it’s going to make more sense to talk about those trade-offs a couple more videos down the line in this series, so I didn’t go over them in this video.

    • @commanderlake7997
      @commanderlake7997 5 months ago

      @@animatedai Thanks for clearing that up. Also, I ran some quick tests comparing the performance of a PyTorch MultiheadAttention layer with a Linear layer, and the linear layer is significantly faster on CPU and GPU in every test I can run, so I hope that's something you could clarify in a future video. Looking forward to the next one!
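
One way to make the trade-off in this thread concrete is a rough parameter count (a sketch; `d` and `n` are hypothetical sizes, and bias terms are ignored). A linear layer that lets every sequence position influence every other needs a weight matrix over the whole flattened sequence, while attention's learned projections stay d-by-d regardless of sequence length:

```python
d, n = 512, 100   # hypothetical embedding size and sequence length

# Linear layer over the flattened sequence: (n*d) inputs to (n*d) outputs.
full_linear_params = (n * d) ** 2

# Multihead attention: four d x d projections (query, key, value, output),
# independent of the sequence length n.
attention_params = 4 * d * d

print(full_linear_params)  # 2621440000
print(attention_params)    # 1048576
```

The raw speed comparison in the comment above is a separate question: having fewer parameters does not by itself make one PyTorch layer faster than another at these sizes.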

  • @hieuluc8888
    @hieuluc8888 5 months ago

    Sir, thanks for doing God's work!!! I wonder why this channel has so few viewers; it deserves to be known by more people. Deep learning is much simpler when learned from this guy. Honestly, I truly admire you for taking the time to research and visualize something so complex, making it easy for everyone to understand.

  • @rafa_br34
    @rafa_br34 5 months ago

    This is so unfairly underrated, I have never seen such a good video about CNNs.

  • @jameshopkins3541
    @jameshopkins3541 5 months ago

    Which is correct?????

  • @honourable8816
    @honourable8816 5 months ago

    Stride value was 2 pixel

  • @architech5940
    @architech5940 5 months ago

    You did not introduce convolution in any informative way, nor define any terms for your argument, and you didn't explain the purpose of 3D convolution or why 2D convolution is inaccurate in the first place. There is also no closing argument for what appears to be your proposition for the proper illustration of CNN. This whole video is completely open ended and thus ambiguous.

  • @AdmMusicc
    @AdmMusicc 5 months ago

    Loved the animation thank you!!

  • @martinhladis1941
    @martinhladis1941 5 months ago

    Excellent!!

  • @kage-sl8rz
    @kage-sl8rz 5 months ago

    Cool! Even better: adding names to the objects (like "kernel", etc.) would be helpful to new people

  • @edsparr2798
    @edsparr2798 5 months ago

    I adore your content, genuinely can’t wait for more videos of your visualizations. Feels like I’m building real intuition about what I’m doing watching you :)

  • @trololollolololololl
    @trololollolololololl 5 months ago

    keep it up great videos