Denoising Diffusion Probabilistic Models | DDPM Explained
- Published on May 22, 2024
- In this video, I get into diffusion models, specifically denoising diffusion probabilistic models (DDPM). I try to provide a comprehensive guide to the entire math behind them and to training diffusion models (denoising diffusion probabilistic models).
🔍 Video Highlights:
1. Overview of Diffusion Models: We first look at the core idea behind diffusion models
2. DDPM Demystified: We break down the entire math in Denoising Diffusion Probabilistic Models in order to gain a deep understanding of the algorithm driving these innovative models.
3. Training and Sampling in Diffusion Models: Finally, we look step by step at how these models are trained and how one can sample images in Denoising Diffusion Probabilistic Models
Timestamps
00:00 Introduction
00:25 Basic Idea of Diffusion Models
02:23 Why call this Diffusion Models
05:24 Transition function in Denoising Diffusion Probabilistic Models - DDPM
07:28 Distribution at end of forward Diffusion Process
10:17 Noise Schedule in Diffusion Models
11:36 Recursion to get from original image to noisy image
13:40 Reverse Process in Diffusion Models
14:40 Variational Lower Bound in Denoising Diffusion Probabilistic Models - DDPM
17:02 Simplifying the Likelihood for Diffusion Models
19:08 Ground Truth Denoising Distribution
22:31 Loss as Original Image Prediction
24:10 Loss as Noise Prediction
26:26 Training of DDPM - Denoising Diffusion Probabilistic Models
27:17 Sampling in DDPM - Denoising Diffusion Probabilistic Models
28:30 Why create this video on Diffusion Models
29:10 Thank You
🔔 Subscribe :
tinyurl.com/exai-channel-link
Useful Resources
Paper Link - tinyurl.com/exai-ddpm-paper
Jeremy Howard - • Practical Deep Learnin...
Calvin Luo - calvinyluo.com/2022/08/26/dif...
Joseph Rocca - towardsdatascience.com/unders...
Outlier - • Diffusion Models | Pap...
Lilian Weng - lilianweng.github.io/posts/20...
Ayan Das - ayandas.me/blog-tut/2021/12/0...
Jonathan Goodman - math.nyu.edu/~goodman/teachin...
📌 Keywords:
#DiffusionModels #DDPMExplained #generativeai
Background Track - Fruits of Life by Jimena Contreras
Email - explainingai.official@gmail.com
Amazing video! Thanks
I watched your video again, and cannot give you enough compliments on it! Great job!
@bayesianmonk Thank you so much for taking the time to comment these words of appreciation (that too twice) 🙂
without a doubt the best video ever made on the subject of DDPM. Even better than the original paper. Thank you very much for that. ❤
I am truly humbled by your generous comment (it brought a big smile to my face :) ).
Thank you so much for the kind words.
VERY VERY GREAT video! Helps a lot for understanding why things are done in the ways presented in the original paper. Thank you so much!!!
Thank you! Really glad that it was of some help
I don't have enough words to describe this masterpiece. VERY WELL EXPLAINED. Thanks. :)
Thank you so much for this appreciation :)
That is the best video I have watched on diffusion models.
Thank you :)
Best explanation of diffusion process with connection to VAE process!
Thank you for the kind words!
Nice explanation..!
Thank You!
Great Video! It was very helpful to understand DDPM ! Thank you so much ! : )
Thank you :) Glad that the video was helpful to you!
This is a great video! Thanks!
Thank you! Glad that the video was of some help
Appreciate your hard work🎉
Thank you for that :)
Legendary video
This is a great video, I completely understood everything up to "Simplifying the Likelihood for Diffusion Models". I'll need to replay it multiple times, but the video is very helpful.
Please make more such videos diving into the math. Most YouTubers leave out the math while teaching DL, which is crazy because it's all math.
Thank you for saying that! And yes, the idea is to dive into the math, as doing that also gives me the best shot at ensuring I understand everything.
Hi, very good attempt at explaining DDPM, and thank you for sharing the information. Kudos! To answer your question at 14:22 (why is the reverse process also diffusion?): during the reverse process, after the U-Net predicts the noise, we check whether we are at t=0 (x0, the original image state). If so, our output is just the mean (which has the same shape as the image); if not, our output is mean + variance (with this variance we are adding noise again, based on x0). Hope this helps!
Are you sure you're answering the question? You're talking about an implementation detail. Could you please elaborate on the mathematical intuition?
Superb, the math doesn't look all that scary after your explanation! Now I just need pen and paper to let it sink in.
Thank You!
Very nice! Keep the good work going!!
Thank You!
That was so much fun, Tushar bhai!
Thank you 😀
Amazing tutorial! Thanks for putting this up. Waiting for the stable diffusion video. When can we expect that? :)
Thank you @himanshurai6481 :) It will be the next video uploaded on the channel. I will start working on it tomorrow.
@@Explaining-AI looking forward to that :)
This video was absolutely amazing!
Also, giving yourself a rating of 0.05 after spending 500 hrs on a topic is crazy (not that I would know, because I am about a 0.0005 on this scale)
Waiting eagerly for the next one!
Thank you so much! The scale was more to indicate how much I don't know (yet) 😃
I have already started working on Part 2 of the Stable Diffusion video, so that should be out soon.
Yes, in theory the forward process and the reverse process are the same given the process is a Wiener process (Brownian motion). Intuitively, if you have a microscopic view of Brownian motion, the forward and the reverse process look similar (i.e. random). th-cam.com/video/XCUlnHP1TNM/w-d-xo.html
Thank you for sharing the video link
crazy stuff
Hats off, bhai!
Thank you!
Amazing video, thanks a lot for all the effort you put in this. Just out of curiosity what do you use for the animation of the formulas?
Thank you for the kind words! For creating the equations I use editor.codecogs.com and then use Canva for all the animations
@@Explaining-AI I thought you were using Manim
I haven't yet given it a try. I started with Canva for the first video and found I was able to do everything that I wanted (in terms of animations), so I just kept using it.
Your explanation is really easy to understand. I have one request: can you make a video on virtual try-on, on models like DiOr or TryOnDiffusion that give good results? Both a paper explanation and an implementation would really help. I have been trying to understand them for over a month but still couldn't understand anything.
Thank you! Yes, I will add it to my list. It might take some time to get to it, but whenever I do, I will have both an explanation and an implementation.
@@Explaining-AI Thank you Tushar
Hey, very helpful video. I'm making a project for our image processing course on the DiffPIR paper, and this video explains everything in sequence. All the calculations skipped in the paper are explained here with proper intuition, very nicely. Thanks 👍
Edit: Just one question: what about the term E[log(p(x_0|x_1))]? What is the idea behind it, and does the model minimize it?
Thank you! This term is the reconstruction loss, similar to what we have in VAEs. It measures, given a slightly noisy image x1 (t=1), how well the model is able to reconstruct the original image x0 from it. In an actual implementation this is minimized together with the summation terms: during training, instead of uniformly sampling timesteps from t=2 to t=T (to minimize the summation terms), we sample timesteps from t=1 to t=T, and when t=1 the model is learning to denoise x1 (rather, to reconstruct x0 from a noisy x1). The only difference is during inference, where at t=1 we simply return the predicted denoised mean, rather than returning a sample from N(mean, scheduled variance) as we do for t=2 to t=T.
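A minimal sketch of the timestep sampling described in this reply, in plain Python. The linear schedule values and the `predict_noise` stand-in are my assumptions for illustration, not the channel's actual implementation:

```python
import math
import random

T = 1000
# Assumed linear beta schedule from 1e-4 to 0.02 (values are illustrative)
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alphas = [1 - b for b in betas]

# alpha_bar_t = product of alphas up to t (cumulative product)
alpha_bars = []
prod = 1.0
for a in alphas:
    prod *= a
    alpha_bars.append(prod)

def training_step(x0, predict_noise):
    # Sample t uniformly from 1..T (not 2..T), so the t=1 reconstruction
    # term is trained together with the summation terms.
    t = random.randint(1, T)
    ab = alpha_bars[t - 1]
    eps = [random.gauss(0, 1) for _ in x0]
    # Forward process in closed form: x_t = sqrt(ab)*x0 + sqrt(1-ab)*eps
    xt = [math.sqrt(ab) * x + math.sqrt(1 - ab) * e for x, e in zip(x0, eps)]
    pred = predict_noise(xt, t)  # stand-in for the trained network
    # MSE between true and predicted noise
    return sum((p - e) ** 2 for p, e in zip(pred, eps)) / len(x0)
```

During inference, the only change at t=1 would be returning the predicted mean directly instead of sampling from N(mean, scheduled variance), exactly as the reply explains.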
Thank you for this fantastic video on DDPMs, it was super helpful. One thing I'm having trouble understanding is the derivation at 12:29: how can we go from the 3rd line to the 4th line on the right side? I mean this part:
sqrt(alpha_t - alpha_t * alpha_{t-1}) * epsilon_{t-1} + sqrt(1 - alpha_t) * epsilon_t
...to the next line, where these two square roots are combined into:
sqrt(1 - alpha_t * alpha_{t-1}) * epsilon
?
In the third line, just view the epsilon terms as samples from a Gaussian with 0 mean and some variance. The two epsilon terms in the third line are then just a sum of two Gaussians. We use the fact that the sum of two independent Gaussians is a Gaussian whose mean is the sum of the two means (here both 0) and whose variance is the sum of the two variances. That is why we can rewrite it in the 4th line as a sample from a Gaussian with 0 mean and variance equal to the sum of the individual variances in the third line. Do let me know if this clarifies it.
@@Explaining-AI yes perfectly! Thank you for the quick response, that makes sense :)
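The variance bookkeeping behind that reply can be checked numerically. A small sanity check (the schedule values here are arbitrary, chosen purely for illustration):

```python
import math
import random

random.seed(0)
for _ in range(5):
    a_t = random.uniform(0.9, 0.999)    # hypothetical alpha_t
    a_tm1 = random.uniform(0.9, 0.999)  # hypothetical alpha_{t-1}
    var1 = math.sqrt(a_t - a_t * a_tm1) ** 2  # variance of first noise term
    var2 = math.sqrt(1 - a_t) ** 2            # variance of second noise term
    combined = 1 - a_t * a_tm1                # variance of the merged Gaussian
    assert abs((var1 + var2) - combined) < 1e-12
```

Since both epsilon terms are zero-mean, only the variances add, which is exactly why the two square roots merge into sqrt(1 - alpha_t * alpha_{t-1}).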
The reverse process can't be computed, as the process we are doing is not reversible. This can be derived using nonlinear dynamics.
I derived the whole equation for the reverse diffusion process, and at 21:26, in the last term of the equation in the last line, I did not get the \sqrt{\bar{\alpha}_{t-1}} factor.
Could you share the complete derivation? Also, the third-to-last line seems to be incorrect: it should be \bar{\alpha}_{t-1} instead of \bar{\alpha}_{t-1}^2
Hello, yes the square on \bar{\alpha}_{t-1} is a mistake which gets corrected in the next line. But thank you for pointing that out!
Regarding the last term in the last line, I just wanted to mention that it's \bar{\alpha}_{t-1}, which comes from rewriting \bar{\alpha}_t in the last term of the second-to-last line as \alpha_t * \bar{\alpha}_{t-1}.
@@Explaining-AI Ahh yes, ignorant me. Thank you for your time deriving the equations. I have not found this derivation anywhere else yet :)
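For reference, the ground-truth denoising distribution this thread derives is, in standard DDPM notation (with $\beta_t = 1 - \alpha_t$):

```latex
q(x_{t-1} \mid x_t, x_0)
  = \mathcal{N}\!\left( x_{t-1};\ \tilde{\mu}_t(x_t, x_0),\ \tilde{\beta}_t I \right),
\quad
\tilde{\mu}_t
  = \frac{\sqrt{\bar{\alpha}_{t-1}}\, \beta_t}{1 - \bar{\alpha}_t}\, x_0
  + \frac{\sqrt{\alpha_t}\,\left(1 - \bar{\alpha}_{t-1}\right)}{1 - \bar{\alpha}_t}\, x_t,
\quad
\tilde{\beta}_t
  = \frac{1 - \bar{\alpha}_{t-1}}{1 - \bar{\alpha}_t}\, \beta_t .
```

The \sqrt{\bar{\alpha}_{t-1}} factor in question appears in the coefficient of $x_0$ in the posterior mean.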
*Github Code* - github.com/explainingai-code/DDPM-Pytorch
*DDPM Implementation Video* - th-cam.com/video/vu6eKteJWew/w-d-xo.html
28:11 The algorithm for sampling, namely step 4, looks a lot different from what you explain. Why is that? To me, it looks like they take the predicted noise from x_t, do a little math to it, subtract it from x_t, then add a little noise to get x_{t-1}. You kind of just ran through it like it was nothing, but it doesn't look the same at all.
Hello, do you mean the formulation of mu + sigma*z versus step 4 of sampling?
They are actually the same; it just requires factoring out the 1/sqrt(alpha_t) term and simplifying the second term. Have a look at this - imgur.com/a/LJL73z1
@@Explaining-AI Thank you, now I remember. Shift and scale. :)
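The shift-and-scale equivalence from that derivation can also be verified numerically: the posterior mean written in terms of the predicted x0 equals step 4 of the paper's sampling algorithm (minus the added sigma*z noise). The schedule numbers below are arbitrary, purely for illustration:

```python
import math
import random

random.seed(1)
beta_t = 0.01               # hypothetical schedule value
alpha_t = 1 - beta_t
abar_tm1 = 0.9              # hypothetical alpha_bar_{t-1}
abar_t = alpha_t * abar_tm1

x_t = random.gauss(0, 1)
eps = random.gauss(0, 1)    # noise the network would predict

# Predicted x0 recovered from the noise prediction
x0_hat = (x_t - math.sqrt(1 - abar_t) * eps) / math.sqrt(abar_t)

# Posterior mean mu written in terms of x0_hat and x_t
mu = (math.sqrt(abar_tm1) * beta_t / (1 - abar_t)) * x0_hat \
   + (math.sqrt(alpha_t) * (1 - abar_tm1) / (1 - abar_t)) * x_t

# Step 4 of the sampling algorithm (without the sigma*z noise term)
step4 = (x_t - (1 - alpha_t) / math.sqrt(1 - abar_t) * eps) / math.sqrt(alpha_t)

assert abs(mu - step4) < 1e-12
```

Because the two expressions for the mean agree exactly, step 4 is just the posterior mean with the x_t term factored out, plus sigma*z noise for t > 1.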
Great video, but as feedback, I'd suggest breathing and pausing a bit after each bigger step. You're jumping between statements really fast, so you don't give people time to think a little about what you just said.
Thank you so much for this feedback, makes perfect sense. Will try to improve on this in the future videos.
bruh my brain is exploding from the math😅
Yes, this one indeed has a lot of math required to understand it, which is why I tried to put forth every detail :) Though maybe I could have done a better job presenting it in a simpler manner.
Hi, did you count the 500 hrs as only on diffusion, or including previously learned concepts like VAEs, ELBO, KLD etc.?
Hello,
That number was just for diffusion, as for 4-5 weeks all I was doing during the day (I don't work as of now) was understanding diffusion, and then, after that, implementation. And I give myself ample time to understand things at my own speed, so somebody else could understand the same, or rather much more, in less time :)
But that number was just a means to express on a scale how much I still don't know, and how the video is just my current understanding of it all. Nothing more than that!
@@Explaining-AI Thanks for the reply. I also try to time myself while learning, as I think a definite number (a lower bound) is required to build the concepts of any topic. That's why I was curious whether 500 hours was a calculated number, since Andrej Karpathy in his blog also recommends an average figure of 10,000 hours to become a good beginner in machine learning.
@@Explaining-AI Super cool!