Denoising Diffusion Probabilistic Models | DDPM Explained

  • Published May 22, 2024
  • In this video, I dive into diffusion models, specifically denoising diffusion probabilistic models (DDPM). I try to provide a comprehensive guide to the entire math behind them and to training diffusion models (denoising diffusion probabilistic models).
    🔍 Video Highlights:
    1. Overview of Diffusion Models: We first look at the core idea behind diffusion models
    2. DDPM Demystified: We break down the entire math in Denoising Diffusion Probabilistic Models in order to gain a deep understanding of the algorithms driving these innovative models.
    3. Training and Sampling in Diffusion Models: Finally, we walk step by step through how these models are trained and how one can sample images in Denoising Diffusion Probabilistic Models (a minimal code sketch follows these highlights).
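
    As a companion to the highlights above, here is a minimal, hypothetical sketch of the forward noising step and the noise-prediction training objective covered in the video. Names like `noise_image` and `training_step` are illustrative only, not taken from the linked repository.

    ```python
    import torch

    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule from the DDPM paper
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)  # \bar{alpha}_t = alpha_1 * ... * alpha_t

    def noise_image(x0, t):
        """Closed-form forward process: x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps."""
        eps = torch.randn_like(x0)
        ab = alpha_bars[t].view(-1, 1, 1, 1)   # broadcast over (B, C, H, W)
        return ab.sqrt() * x0 + (1 - ab).sqrt() * eps, eps

    def training_step(model, x0):
        """One DDPM training step: regress the model's noise prediction onto eps."""
        t = torch.randint(0, T, (x0.shape[0],))  # uniform random timestep per image
        xt, eps = noise_image(x0, t)
        return torch.nn.functional.mse_loss(model(xt, t), eps)
    ```
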
    Timestamps
    00:00 Introduction
    00:25 Basic Idea of Diffusion Models
    02:23 Why are these called Diffusion Models
    05:24 Transition function in Denoising Diffusion Probabilistic Models - DDPM
    07:28 Distribution at end of forward Diffusion Process
    10:17 Noise Schedule in Diffusion Models
    11:36 Recursion to get from original image to noisy image
    13:40 Reverse Process in Diffusion Models
    14:40 Variational Lower Bound in Denoising Diffusion Probabilistic Models - DDPM
    17:02 Simplifying the Likelihood for Diffusion Models
    19:08 Ground Truth Denoising Distribution
    22:31 Loss as Original Image Prediction
    24:10 Loss as Noise Prediction
    26:26 Training of DDPM - Denoising Diffusion Probabilistic Models
    27:17 Sampling in DDPM - Denoising Diffusion Probabilistic Models
    28:30 Why create this video on Diffusion Models
    29:10 Thank You
    🔔 Subscribe:
    tinyurl.com/exai-channel-link
    Useful Resources
    Paper Link - tinyurl.com/exai-ddpm-paper
    Jeremy Howard - • Practical Deep Learnin...
    Calvin Luo - calvinyluo.com/2022/08/26/dif...
    Joseph Rocca - towardsdatascience.com/unders...
    Outlier - • Diffusion Models | Pap...
    Lilian Weng - lilianweng.github.io/posts/20...
    Ayan Das - ayandas.me/blog-tut/2021/12/0...
    Jonathan Goodman - math.nyu.edu/~goodman/teachin...
    📌 Keywords:
    #DiffusionModels #DDPMExplained #generativeai
    Background Track - Fruits of Life by Jimena Contreras
    Email - explainingai.official@gmail.com

Comments • 72

  • @miladmas3296
    @miladmas3296 a day ago

    Amazing video! Thanks

  • @bayesianmonk
    @bayesianmonk 14 days ago

    I watched your video again, and cannot give you enough compliments on it! Great job!

    • @Explaining-AI
      @Explaining-AI  13 days ago

      @bayesianmonk Thank you so much for taking the time to comment these words of appreciation (that too twice) 🙂

  • @amirzarei4955
    @amirzarei4955 4 months ago +7

    Without a doubt the best video ever made on the subject of DDPM. Even better than the original paper. Thank you very much for that. ❤

    • @Explaining-AI
      @Explaining-AI  4 months ago +2

      I am truly humbled by your generous comment (it brought a big smile to my face :) ).
      Thank you so much for the kind words.

  • @xichen391
    @xichen391 5 months ago +3

    VERY VERY GREAT video! It helps a lot in understanding why things are done the way they are presented in the original paper. Thank you so much!!!

    • @Explaining-AI
      @Explaining-AI  5 months ago

      Thank you! Really glad that it was of some help

  • @vikramsandu6054
    @vikramsandu6054 a month ago

    I don't have enough words to describe this masterpiece. VERY WELL EXPLAINED. Thanks. :)

    • @Explaining-AI
      @Explaining-AI  29 days ago

      Thank you so much for this appreciation :)

  • @shizhouhuang4872
    @shizhouhuang4872 4 months ago +2

    This is the best video I have watched teaching the diffusion model.

  • @learningcurveai
    @learningcurveai a month ago

    Best explanation of the diffusion process, with its connection to the VAE!

    • @Explaining-AI
      @Explaining-AI  a month ago

      Thank you for the kind words!

  • @BrijrajSingh08
    @BrijrajSingh08 6 days ago

    Nice explanation!

  • @efstathiasoufleri6881
    @efstathiasoufleri6881 a month ago

    Great video! It was very helpful for understanding DDPM! Thank you so much! :)

    • @Explaining-AI
      @Explaining-AI  a month ago

      Thank you :) Glad that the video was helpful to you!

  • @gregkondas6457
    @gregkondas6457 5 months ago

    This is a great video! Thanks!

    • @Explaining-AI
      @Explaining-AI  5 months ago

      Thank you! Glad that the video was of some help

  • @jeffreyyoon3618
    @jeffreyyoon3618 3 months ago

    Appreciate your hard work🎉

    • @Explaining-AI
      @Explaining-AI  3 months ago

      Thank you for that :)

  • @DLwithShreyas
    @DLwithShreyas a month ago +1

    Legendary video

  • @sushilkhadka8069
    @sushilkhadka8069 5 months ago +1

    This is a great video; I completely understood it up to "Simplifying the Likelihood for Diffusion Models". I'll need to replay it multiple times, but the video is very helpful.
    Please make more such videos diving into the math. Most YouTubers leave out the math while teaching DL, which is crazy because it's all math.

    • @Explaining-AI
      @Explaining-AI  5 months ago +1

      Thank you for saying that! And yes, the idea is to dive into the math, as doing so also gives me the best shot at ensuring I understand everything.

  • @bhushanatote
    @bhushanatote 2 months ago +1

    Hi, very good attempt at explaining DDPM, and thank you for sharing the information. Kudos! To answer your question at 14:22 (why is the reverse process also a diffusion?): during the reverse process, after the U-Net predicts the noise, we check whether we are at t=0 (the x0, original-image state). If we are, our output is just the mean (which has the same shape as the image); if we are not at t=0, our output is mean + variance (with this variance we are adding noise again, based on x0). Hope this helps!

    • @genericperson8238
      @genericperson8238 a month ago +1

      Are you sure you're answering the question? You're talking about an implementation detail. Could you please elaborate on the mathematical intuition?

  • @mycotina6438
    @mycotina6438 a month ago

    Superb, the math doesn't look all that scary after your explanation! Now I just need pen and paper to let it sink in.

  • @prathameshdinkar2966
    @prathameshdinkar2966 21 days ago

    Very nice! Keep the good work going!!

  • @tusharmadaan5480
    @tusharmadaan5480 4 months ago

    That was a blast, Tushar bhai!

  • @himanshurai6481
    @himanshurai6481 4 months ago

    Amazing tutorial! Thanks for putting this up. Waiting for the Stable Diffusion video. When can we expect that? :)

    • @Explaining-AI
      @Explaining-AI  4 months ago

      Thank you @himanshurai6481 :) It will be the next video uploaded on the channel; I will start working on it tomorrow.

    • @himanshurai6481
      @himanshurai6481 4 months ago

      @@Explaining-AI looking forward to that :)

  • @arpitanand4693
    @arpitanand4693 3 months ago

    This video was absolutely amazing!
    Also, giving yourself a rating of 0.05 after spending 500 hrs on a topic is crazy (not that I would know, because I am about a 0.0005 on this scale)
    Waiting eagerly for the next one!

    • @Explaining-AI
      @Explaining-AI  3 months ago

      Thank you so much! The scale was more to indicate how much I don't know (yet) 😃
      I have already started working on Part 2 of the Stable Diffusion video, so that should be out soon.

  • @AR-on4wm
    @AR-on4wm 5 months ago +2

    Yes, in theory the forward process and the reverse process are the same, given the process is a Wiener process (Brownian motion). Intuitively, if you have a microscopic view of Brownian motion, the forward and reverse processes look similar (i.e., random). th-cam.com/video/XCUlnHP1TNM/w-d-xo.html

    • @Explaining-AI
      @Explaining-AI  5 months ago +1

      Thank you for sharing the video link

  • @NishanthMohankumar
    @NishanthMohankumar 25 days ago

    crazy stuff

  • @AniketKumar-dl1ou
    @AniketKumar-dl1ou 4 months ago

    Hats off, bhai!

  • @bayesianmonk
    @bayesianmonk a month ago

    Amazing video, thanks a lot for all the effort you put into this. Just out of curiosity, what do you use to animate the formulas?

    • @Explaining-AI
      @Explaining-AI  a month ago +1

      Thank you for the kind words! For creating the equations I use editor.codecogs.com and then use Canva for all the animations

    • @bayesianmonk
      @bayesianmonk a month ago

      @@Explaining-AI I thought you were using Manim.

    • @Explaining-AI
      @Explaining-AI  a month ago

      I haven’t given it a try yet. I started with Canva for the first video and found I was able to do everything I wanted (in terms of animations), so I just kept using it.

  • @TirthRadadiya-hp9sq
    @TirthRadadiya-hp9sq 4 months ago

    Your explanation is really easy to understand. I have one request: can you make a video on virtual try-on, covering models like DiOr or TryOnDiffusion that give good results? Both a paper explanation and an implementation would really help. I have been trying to understand them for over a month but still couldn't understand anything.

    • @Explaining-AI
      @Explaining-AI  4 months ago

      Thank you! Yes, I will add it to my list. It might take some time to get to it, but whenever I do, I will cover both the explanation and the implementation.

    • @TirthRadadiya-hp9sq
      @TirthRadadiya-hp9sq 4 months ago

      @@Explaining-AI Thank you Tushar

  • @Adityak1997
    @Adityak1997 2 months ago

    Hey, very helpful video. I'm doing a project for our image processing course on the DiffPIR paper, and this video explains everything in sequence. All the calculations skipped in the paper are explained here with proper intuition, very nicely. Thanks 👍
    Edit: just one question: what about the term E[log(p(x_0|x_1))]? What is the idea behind it, and does the model minimize it?

    • @Explaining-AI
      @Explaining-AI  2 months ago

      Thank you! This term is the reconstruction loss, which is similar to what we have in VAEs. Here it measures, given a slightly noisy image x1 (t=1), how well the model is able to reconstruct the original image x0 from it. In an actual implementation it is minimized together with the summation terms. So during training, instead of uniformly sampling timesteps from t=2 to t=T (to minimize the summation terms), we sample timesteps from t=1 to t=T, and when t=1 the model is learning to denoise x1 (rather, to reconstruct x0 from a noisy x1). The only difference happens during inference, where at t=1 we simply return the predicted denoised mean, rather than returning a sample from N(mean, scheduled variance) as we do for t=2 to t=T. (A small code sketch of this special case is below.)
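
      To make the reply concrete, here is a minimal, hypothetical sketch of one reverse sampling step (names such as `reverse_step` and the schedule tensors are illustrative, not taken from the linked repository). With zero-indexed timesteps, the t=1 case above corresponds to t == 0 here, where only the mean is returned and no noise is added.

      ```python
      import torch

      def reverse_step(model, xt, t, betas, alphas, alpha_bars):
          """One DDPM reverse step x_t -> x_{t-1}; t is a zero-indexed integer."""
          eps_pred = model(xt, torch.tensor([t]))  # U-Net's noise prediction
          # Mean of p_theta(x_{t-1} | x_t), written in terms of the predicted noise.
          mean = (xt - betas[t] / (1 - alpha_bars[t]).sqrt() * eps_pred) / alphas[t].sqrt()
          if t == 0:
              return mean  # final step: return the denoised mean, no noise added
          sigma = betas[t].sqrt()  # one common choice of the scheduled variance
          return mean + sigma * torch.randn_like(xt)
      ```
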

  • @surfingmindwaves
    @surfingmindwaves 2 months ago

    Thank you for this fantastic video on DDPMs, it was super helpful. One thing I'm having trouble understanding is the derivation at 12:29: how can we go from the 3rd line to the 4th line on the right side? I mean this part:
    sqrt(alpha_t - alpha_t * alpha_{t-1}) * epsilon_{t-1} + sqrt(1 - alpha_t) * epsilon_t
    ...to the next line, where these two square roots are combined into:
    sqrt(1 - alpha_t * alpha_{t-1}) * epsilon
    ?

    • @Explaining-AI
      @Explaining-AI  2 months ago +1

      In the third line, just view the epsilon terms as samples from a Gaussian with 0 mean and some variance. So the two epsilon terms in the third line are just adding two Gaussians. Then we use the fact that the sum of two independent Gaussians is again a Gaussian, with mean equal to the sum of the two means (here both are 0) and variance equal to the sum of the two variances. That is why we can rewrite it in the 4th line as a sample from a Gaussian with 0 mean and variance equal to the sum of the individual variances in the third line (the arithmetic is sketched below). Do let me know if this clarifies it.
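
      For readers following along, the variance addition in the reply works out as follows (a short sketch in the commenter's notation, with both epsilon terms independent standard Gaussians):

      ```latex
      \sqrt{\alpha_t - \alpha_t\alpha_{t-1}}\,\epsilon_{t-1} + \sqrt{1-\alpha_t}\,\epsilon_t
      \sim \mathcal{N}\big(0,\ (\alpha_t - \alpha_t\alpha_{t-1}) + (1-\alpha_t)\big)
      = \mathcal{N}\big(0,\ 1 - \alpha_t\alpha_{t-1}\big)
      ```

      so the sum can be written as \sqrt{1 - \alpha_t\alpha_{t-1}}\,\epsilon with \epsilon \sim \mathcal{N}(0, I).
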

    • @surfingmindwaves
      @surfingmindwaves 2 months ago

      @@Explaining-AI yes perfectly! Thank you for the quick response, that makes sense :)

  • @tanishmittal5083
    @tanishmittal5083 4 months ago +1

    The reverse process can't be computed exactly, as the process we are doing is not reversible. This can be derived using nonlinear dynamics.

  • @vinayakkumar4512
    @vinayakkumar4512 a month ago

    I derived the whole equation for the reverse diffusion process, and at 21:26, in the last term of the equation in the last line, I did not get \sqrt{\bar{\alpha}_{t-1}}.
    Could you share the complete derivation? Also, the third-to-last line seems to be incorrect: it should be \bar{\alpha}_{t-1} instead of \bar{\alpha}_{t-1}^2

    • @Explaining-AI
      @Explaining-AI  a month ago

      Hello, yes, the square on \bar{\alpha}_{t-1} is a mistake which gets corrected in the next line. But thank you for pointing that out!
      Regarding the last term in the last line, just wanted to mention that it's \bar{\alpha}_{t-1}, which comes from rewriting \bar{\alpha}_t in the last term of the second-to-last line as \alpha_t \cdot \bar{\alpha}_{t-1} (see the identity sketched below).
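
      For clarity, the rewrite mentioned in the reply is just the definition of the cumulative product (a one-line sketch):

      ```latex
      \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s = \alpha_t \prod_{s=1}^{t-1} \alpha_s = \alpha_t\,\bar{\alpha}_{t-1},
      \qquad\text{so}\qquad
      \sqrt{\bar{\alpha}_t} = \sqrt{\alpha_t}\,\sqrt{\bar{\alpha}_{t-1}}
      ```
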

    • @vinayakkumar4512
      @vinayakkumar4512 a month ago

      @@Explaining-AI Ah yes, ignorant me. Thank you for your time in deriving the equations. I have not found this derivation anywhere else yet :)

  • @Explaining-AI
    @Explaining-AI  5 months ago

    *Github Code* - github.com/explainingai-code/DDPM-Pytorch
    *DDPM Implementation Video* - th-cam.com/video/vu6eKteJWew/w-d-xo.html

  • @easyBob100
    @easyBob100 2 months ago +1

    28:11 The algorithm for sampling, namely step 4, looks a lot different from what you explain. Why is that? To me, it looks like they take the predicted noise from xt, do a little math to it, subtract it from xt, and then add a little noise to get xt-1. You kind of just ran through it like it was nothing, but it doesn't look the same at all.

    • @Explaining-AI
      @Explaining-AI  2 months ago +1

      Hello, do you mean the formulation of mu + sigma*z versus Step 4 of Sampling?
      They are actually the same; it just requires factoring out the 1/sqrt(alpha_t) term and simplifying the second term (sketched below). Have a look at this - imgur.com/a/LJL73z1
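
      A sketch of the equivalence referred to in the reply, using the DDPM paper's notation (with \beta_t = 1 - \alpha_t and z \sim \mathcal{N}(0, I)):

      ```latex
      x_{t-1} = \mu_\theta(x_t, t) + \sigma_t z
      = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar{\alpha}_t}}\,\epsilon_\theta(x_t, t)\right) + \sigma_t z
      ```

      which is exactly step 4 of the sampling algorithm in the paper.
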

    • @easyBob100
      @easyBob100 2 months ago

      @@Explaining-AI Thank you, now I remember. Shift and scale. :)

  • @genericperson8238
    @genericperson8238 a month ago +1

    Great video, but as feedback, I'd suggest breathing and pausing a bit after each bigger step. You're jumping between statements really fast, so you don't give people time to think a little about what you just said.

    • @Explaining-AI
      @Explaining-AI  a month ago

      Thank you so much for this feedback; it makes perfect sense. I will try to improve on this in future videos.

  • @acatisfinetoo3018
    @acatisfinetoo3018 5 months ago

    bruh my brain is exploding from the math😅

    • @Explaining-AI
      @Explaining-AI  5 months ago

      Yes, this one indeed requires a lot of math to understand, which is why I tried to put forth every detail :) Though maybe I could have presented it in a better/simpler manner.

  • @anshumansinha5874
    @anshumansinha5874 3 months ago

    Hi, did you count the 500 hrs as only on diffusion, or does it include previously learned concepts like VAEs, ELBO, KLD, etc.?

    • @Explaining-AI
      @Explaining-AI  3 months ago +1

      Hello,
      That number was just for diffusion, as for 4-5 weeks all I was doing during the day (I don't work as of now) was understanding diffusion, and after that, implementation. And I give myself ample time to understand things at my own speed, so somebody else could understand the same, or much more, in less time :)
      But that number was just a means to express on a scale how much I still don't know, and how the video is just my current understanding of it all. Nothing more than that!

    • @anshumansinha5874
      @anshumansinha5874 3 months ago +1

      @@Explaining-AI Thanks for the reply. I also try to time myself while learning, as I think a definite number (a lower bound) of hours is required to build the concepts of any topic. That's why I was curious whether 500 hours was a calculated number, since Andrej Karpathy in his blog also cites an average figure of 10,000 hours to become a good beginner in machine learning.

    • @gengyanzhao923
      @gengyanzhao923 a month ago

      @@Explaining-AI Super cool!