Variational Autoencoders - EXPLAINED!

  • Published Sep 16, 2024

Comments • 120

  • @benjaminbong · 4 years ago · +25

    Awesome tutorial! I've been struggling to understand VAEs, and this helped me finally get an idea of how they work!
    Thank you!

  • @Multibjarne · 3 years ago · +20

    I needed someone to spoonfeed me this stuff. Thanks

  • @jinoopark6034 · 5 years ago · +11

    I love your explanation. Please make a more math-oriented video on VAE!

  • @retime77 · 5 years ago · +11

    Thanks for the intuitive explanation. I'm really looking forward to seeing the more detailed exploration of the VAE and its variants noted at the end of the video.

  • @rajpulapakura001 · 11 months ago · +1

    Thanks for the vid, now I finally understand VAEs. I would also highly recommend watching the MIT Deep Generative Modelling video to better understand the technical details of VAEs.

  • @diato2993 · 1 year ago · +1

    the best explanation for beginners, thank you so much!

  • @DarshanSenTheComposer · 4 years ago · +2

    Brilliant explanation! I have watched many videos on this topic, but most of them either throw some weird, unknown mathematical equation at you, which they just assume you'll understand without a proper explanation, and the rest just throw lines of Python code at you, where the functions and parameters have thicc statistical names. You explained this like it is just a piece of cake! Thank you. :D

  • @monil_soni · 11 months ago

    Thanks for this! Helped me understand the need for defining a region for these pools and, consequently, having the KL divergence in optimization. Up until now, I only looked at that regularization term as intentionally introducing information loss, and now it makes sense that we need it to make the generator more usable for "varying" outputs.

  • @emransaleh9535 · 5 years ago · +1

    Keep doing this nice work about deep learning concepts and papers. You will go far with this channel.

  • @mariolinovalencia7776 · 5 years ago · +1

    Best video on vae. Finally I understand

  • @ArchithaKishoreSings · 5 years ago · +2

    Your channel is absolutely incredible. Keep em coming☺️

  • @GoKotlinJava · 4 years ago

    Awesome and simple explanation. I was confused and wondering about the sampling part that VAEs do, because I didn't understand what was meant by sampling a latent vector from a distribution. But you made it so easy to understand. Thanks a lot. Keep up the good work

    • @CodeEmporium · 4 years ago

      Thanks homie. I'm trying to not hide behind the jargon. But it can be hard at times. I'll explain myself when I can

  • @DB-in2mr · 1 year ago

    Wow... you showed a great deal of explanation capacity, man! Kudos to you. Daniele

  • @yujisakabe4900 · 2 years ago

    Thank you so much for the didactic explanation, it really helped me to understand the fundamental concepts before exploring the math behind it.

  • @ParthivShah · 25 days ago

    Thank you very much. Love from India.

  • @xruan6582 · 4 years ago · +1

    Good intuitive explanation. I need more details about how to train a VAE, which is really hard to understand by following Stanford's introduction

    • @CodeEmporium · 4 years ago · +3

      Trying to make this as accessible as possible. It is a hard topic and sometimes I might hide behind that jargon. But I'll try to explain myself when I can

  • @tariqislam9388 · 4 months ago

    Thank you for this fantastic tutorial.

  • @saptakatha · 4 years ago

    Please make a video on maths behind VAE. Your way of explaining things makes it easy to understand the hard concepts!

  • @NaxAlpha · 5 years ago · +1

    Love your channel. Looking forward to more research paper explanations!

  • @weilinfu1343 · 5 years ago · +1

    Great video! Looking forward for the math part!

  • @ssshukla26 · 4 years ago

    A very good and clear explanation. Thanks.

  • @eduardoblas2315 · 5 years ago

    Gold content, simple and entertaining, keep it going.

  • @__goyal__ · 4 years ago

    Glad that I came across this channel!!

  • @MayankKumar-nn7lk · 5 years ago · +2

    Awesome video, please show the mathematics part in the next video

  • @joebastulli · 4 years ago

    Thanks for the explanation, simple and clear!

  • @MartinWanckel · 1 year ago

    Very nicely explained!

  • @joehaddad4945 · 2 years ago

    This video is pure gold. Thank you so much!

  • @ArcticSilverFox1 · 3 years ago

    Very nicely explained! Great job!

  • @user-xt2om1ev9z · 2 years ago

    This is some GREAT explanation here!

  • @asheeshmathur · 9 months ago

    Excellent explanation

    • @CodeEmporium · 9 months ago · +1

      Thanks a ton!

  • @ambeshshekhar4043 · 3 years ago

    +1 to the VAE video with lots of math!

  • @Vikram-wx4hg · 3 years ago

    Beautifully explained!

  • @manuelkarner8746 · 4 years ago · +2

    Hi, that was the best variational autoencoder video I found on the internet, so thanks a lot, it really helped!
    I have 2 questions regarding the continuous region at 10:22.
    1: (if I understood it correctly, this is a no): is the number of dog vectors in the dog pool equal to the number of dog pics in the training set?
    2: if you take the most average dog vector from the dog pool, to make it short let's say
    [70, 10, 0.4], then could the whole pool be described as each of the values having its own range, like [70(+/-10), 10(+/-2), 0.4(+/-0.02)], and as long as all values of a new latent-space vector are in this range, I am in the dog pool and therefore generate an okay-looking dog?
    (little bonus question: so the number of values in the vector and the range of each determines how many different dogs the network is able to create?)
    thank you in advance, I hope my question was understandable

    • @BlockOfRed · 4 years ago · +1

      Hi,
      1: You understood that correctly, so no. As the region is continuous, it contains an infinite number of vectors. On the other hand, you know only as many vectors of that region as you have input images (as you generate one for each image).
      2: Not every dog image leads to a vector within this pool, and not every vector within this pool generates a dog image. This is due to the fact that a) we don't really understand how NNs function internally and b) these "pools" are just an explanation of what's wrong with traditional AEs. That is, they do not have to really exist in the "real world".
      3: As traditional AE decoders are deterministic, yes. If your latent vector can only have one value, you can only generate one image. The "range" shown in the video is a slight simplification of what is really going on. That is, you do not set hard bounds for your latent variables, but you formulate this as minimizing the KL divergence (Kullback-Leibler divergence, i.e. the "distance" between two distributions), so that the latent distribution does not stray too far from the standard distribution.
      I hope my answers were both understandable and correct :)
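[Editor's note] The KL penalty described in this reply has a simple closed form when the encoder outputs a diagonal Gaussian and the target is the standard normal. A minimal sketch in plain Python; the function name and list-based vectors are illustrative assumptions, not from the video:

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL divergence between a diagonal Gaussian N(mu, exp(log_var))
    and the standard normal N(0, I), summed over latent dimensions."""
    return sum(
        0.5 * (math.exp(lv) + m * m - 1.0 - lv)
        for m, lv in zip(mu, log_var)
    )

# A latent distribution already equal to N(0, I) incurs zero penalty;
# drifting away is penalized, which keeps the per-image "pools" overlapping.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # 0.0
print(kl_to_standard_normal([1.0, 0.0], [0.0, 0.0]))  # 0.5
```

In a real VAE loss this term is added to the reconstruction error, which is what "not setting hard bounds" means in practice: the bounds are soft and enforced by the optimizer.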

  • @internationalenglish7413 · 5 years ago

    Great work! Wish you a million subscribers.

  • @tobuslieven · 2 years ago

    6:17 If passing in a random vector outputs garbage, then there are excess degrees of freedom in the vector. The variational autoencoder seems to be limiting the set of input vectors, so when we choose one from the limited set, we're assured it won't output garbage.

    • @vandarkholme442 · 2 years ago

      So is that how the KL loss comes into play? By limiting the region of input hidden vectors?
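[Editor's note] The sampling step this thread is discussing is usually written via the reparameterization trick: draw noise from a standard normal and shift/scale it by the encoder's outputs. A hedged sketch under that assumption (all names hypothetical):

```python
import random

def sample_latent(mu, sigma):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    # Because the KL loss keeps (mu, sigma) close to (0, 1), a z drawn this
    # way stays inside the continuous region the decoder was trained on.
    return [m + s * random.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

z = sample_latent([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])  # one 3-d latent sample
```

At generation time one simply feeds the decoder a vector drawn from N(0, I), which is why the regularized region matters: random draws land where the decoder produces sensible images instead of garbage.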

  • @miracode7327 · 2 years ago

    Reference list is good, subbed

  • @caoshixing7954 · 3 years ago

    +1 to the VAE video with lots of math!
    Thanks, nice video!

  • @justgay · 1 year ago · +1

    On the final slide, how did you find the latent vectors for the VAE that generate images similar to the images generated by the GAN?
    Or were the images on the right the result of encoding and decoding the GAN-generated images with the VAE? Then the VAE seems really bad at its original job

  • @Leibniz_28 · 4 years ago · +1

    🙋🏻‍♂️ another video of variational autoencoders, please

  • @amr6859 · 2 years ago · +7

    Take home message: Variational Autoencoders can generate new data.

  • @cptechno · 2 years ago

    QUESTION CONCERNING VAE! Using a VAE with images, we currently start by compressing an image into the latent space and reconstructing from the latent space.
    QUESTION: What if we start with the photo of an adult human, say a man or woman 25 years old (young adult), and we rebuild an image of the same person but at a younger age, say 14 years old (mid-teen)? Do you see where I'm going with this? Can we create a VAE to make the face younger, from 25 years (young adult) to 14 years (mid-teen)?
    In more general terms, can a VAE be used with a non-identity function?

  • @XecutionStyle · 3 years ago

    I think the reason the latent code is important is because that layer, that middle layer, has far fewer neurons than the input. So anything that's produced from there - has to come from a compressed form of the input.

  • @supnegi · 3 years ago

    That was incredible!

  • @fatemerezaei6898 · 10 months ago

    Amazing!

  • @sunti8893 · 4 years ago

    This is a very useful video! Thank you :)

  • @dt28469 · 3 years ago

    Wow, that dog barking noise tripped my brain out so hard. Because my neighbor's dog always barks, my brain tuned out the sound of the bark until I reasoned he was talking about the sound of dogs barking. Neural networks aren't intelligent enough to behave in these ways.

  • @dreamliu6867 · 2 years ago

    Wonderful explanation. Could you please make a math tutorial on VAE? Thanks

  • @harshkumaragarwal8326 · 3 years ago

    you guys do a great job

  • @MLDawn · 3 years ago

    Really, really good video. Could you tell me something about the Gaussian prior on the bottleneck? 1) Do we learn the parameters of this Gaussian? 2) Is it only 1 Gaussian, or as you said, is it really a mixture of Gaussians (mathematically speaking)? Thanks

  • @SurajBorate-bx6hv · 1 year ago

    Thanks for the awesome explanation. How should one choose between VAEs and diffusion models?

  • @china_tours · 2 years ago

    Great explanation, but please make the slides (ppt) public. Thank you

  • @hihellohowrumfine · 6 months ago

    Can you make a deep math video on variational auto encoders?

  • @avidreader100 · 3 years ago

    Good explanation. Perhaps after creating a blurry image, one can use another application for sharpening the features.

  • @FrankaBrou · 10 months ago

    bro I jumped, I thought there was a dog next to me 00:38

  • @user-ju5uv2lk3e · 1 year ago

    Thanks for this video :)

    • @CodeEmporium · 1 year ago · +1

      You are very welcome. Thank you for the thoughtful words

  • @artinbogdanov7229 · 3 years ago

    Thanks!

  • @niveyoga3242 · 5 years ago

    Tells us there is so much potential & then brings an example where I can build a photo book of my favorite animal! xD

    • @CodeEmporium · 5 years ago

      Animal photo albums are all we need in this world.

  • @Victor-he5hy · 4 years ago

    Very good video. Impressive.

  • @ruksharalam173 · 8 months ago

    So, if GANs produce better-quality images, is there any use for VAEs in the industry?

  • @nikitasinha8181 · 1 year ago

    Thank you so much

  • @maxjt11 · 4 years ago

    Thanks man, great vid

  • @sebastiaanvanbuisman1704 · 4 years ago

    Great vid! I appreciate this a lot

  • @gordonlim2322 · 3 years ago

    At 7:12, you said that generative models need to learn these "pools" or distribution. Which part of the autoencoder is that? Or is it separate from that? To my understanding, the autoencoder alone just learns the weights for the encoder and decoder.

  • @Wabadoum · 5 years ago

    Nice video! I have two questions:
    You show that the pool of the VAE is continuous, but it also shows blanks, e.g. not all space is covered by the numbers. What does sampling from these regions give? Is it still close to a number?
    Second question: does the size of the pool affect the quality of a generated image? Like, does giving more space to the VAE allow it to learn with fewer constraints?
    Thanks!

  • @bharathpreetham310 · 5 years ago · +1

    Can I know which mic you are using for making these videos?

  • @lihuil3115 · 2 years ago

    very good.

  • @pavanms6924 · 3 years ago

    Can you please make a video on probabilistic U-Nets?

  • @baothach9259 · 4 years ago

    Amazing tutorial

  • @threeMetreJim · 5 years ago

    What happens if you know how many vector elements are needed to accurately define what you want to reproduce, and then add a few more that aren't defined by the input image but represent the class of the desired output? Will this force all of the vector elements into their own pool? So you can pick any random vector and add to it the representation of the class, to only pick from that pool. This strategy works for the 'image painting' network by Andrej Karpathy, and it's how I switched between images for a different kind of image tweening. I still wonder exactly what kind of network the 'image painter' actually is. I'm guessing that the same technique should also work for a generative autoencoder. I came up with the idea based on how a person learns something: you get more than one input, i.e. a picture and a description, that goes in (both presented at the input, rather than one at the input and the other at the output), and is then mapped to just the wanted description.

  • @thebrothershow5826 · 3 years ago

    You are amazing

  • @krishnagarg6870 · 4 years ago

    Nice video

  • @hochmuch · 5 years ago

    Thank you, your videos are funny and so useful [translated from Russian: «Спасибо, твои видео веселые и очень полезные»]

    • @СергейКривенко-р6я · 4 years ago

      Dude, 'веселый' is 'fun', while 'funny' is 'смешной'; those are two completely different words.

  • @haralambiepapastathopoulos7876 · 5 years ago

    Could you make a video on adaptive instance normalization (AdaIN)? It would be very useful; nobody on YouTube has done this before

  • @yacinek85 · 4 years ago

    Thanks

  • @vinayreddy8683 · 4 years ago

    Please make a video on Transformer and BERT architectures

    • @CodeEmporium · 4 years ago · +1

      Gonna talk about that in my next video in a few days. Stay tuned :)

    • @vinayreddy8683 · 4 years ago

      @@CodeEmporium thanks for the reply AJ. I was really surprised by the way you changed your accent in such a short span of time; at one point I couldn't believe the fact that you're Tamil.
      Your content is amazing. I don't want to be selfish here, but I'd be happy if you could do more videos on NLP.

  • @programmingrush · 6 months ago

    Nice

  • @shivkrishnajaiswal8394 · 1 year ago

    Interesting

  • @leosmi1 · 5 years ago

    Thnx

  • @thecurious926 · 2 years ago

    Wait, then how is reconstruction done using an autoencoder?

  • @tilu391 · 3 months ago

    If you're just taking a vector from the pool, then isn't it just a mapping of image -> vector -> image?

  • @l.gunasekar832 · 2 years ago

    Good

  • @XecutionStyle · 3 years ago

    $@#$ I thought there was a dog in the house

  • @zarlishattique4167 · 2 years ago · +1

    Where is the coding? It's not explained until you practice it.. 🥺

  • @fazilokuyanus3396 · 5 years ago

    you are great!

  • @juanpabloaguilar4982 · 3 years ago

    I think it is a very big mistake to say that autoencoders cannot be used to generate data. That is very wrong; there are multiple applications which use images as inputs to generate images, for example what the baby of two parents will look like.

  • @HimanshuSingh-ej2tc · 2 years ago

    Make a more mathematically detailed video

  • @tıbhendese · 4 months ago

    Understood nothing about how this model works. Oversimplifications and storytelling make it unmatched with how the real thing works.
    Now I know: an AE reduces the input data into a smaller vector; a VAE can generate blurry images.
    What I don't know: what happens to the input data and the dataset, and what is this pool intuition for?

  • @DocTheDirector · 5 years ago · +4

    Need the mathy version of this video; the explanation of the latent loss is awful

  • @pseudospectral2 · 1 year ago

    I was here

  • @Flinsyflonsy · 4 years ago

    10/10 because doggos.

  • @rockapedra1130 · 2 years ago

    Was going well but ended without explaining ☹️

  • @Lucas7Martins · 4 years ago

    Doggos!!!!!

  • @alexbarnadas · 4 years ago

    My cat makes very different noises x'D

  • @SolathPrime · 2 years ago

    Kieet

  • @thejswaroop5230 · 3 years ago

    Your neural network has a bias for dogs over cats lol

  • @vladvladislav4335 · 5 years ago · +1

    Well, that's actually a totally wrong conceptual explanation of a VAE. Moreover, in the video you didn't name some absolutely crucial points about VAEs that one would expect to hear. Moremoremoreover, there are plenty of statistical and mathematical things that are not obvious at all and need to be explained when speaking about VAEs. So this is indeed an explanation, but quite a bad one.
    I could be more specific if anybody is interested, so let's start some discussion in the comments :D

    • @jg9193 · 4 years ago

      I'm interested. Be more specific.

    • @est9949 · 4 years ago

      Well, please explain more.

  • @bidishadas842 · 4 years ago

    They never go anywhere. They always call and ask what the drop location is and then cancel! [translated from Hindi]

  • @CharlieYoutubing · 5 years ago

    Thanks