AIAIART Lesson #7 - Diffusion Models

  • Published on 7 Feb 2025
  • Time to dive into diffusion models and see what is going on underneath the magic that is making the news at the moment :)
    If you'd prefer a slightly longer, more conversational version, the livestream recording is up here: • AIAIART Lesson 7 Lives... (this also covers a few additional topics thanks to questions from the Twitch chat).
    Lesson notebook: colab.research...
    Github (includes discord invite) : github.com/joh...

Comments • 25

  • @emiliomorales2843 2 years ago +8

    Amazing work dude, great explanation, keep it going, this is the best diffusion model video on YouTube

  • @growthfeeder6041 2 years ago

    Massive thanks. Very well done with the explanation. It really helped me think of new directions to explore. This channel is so underrated!

    • @datasciencecastnet 2 years ago

      I'm so glad to hear it :) Thanks for the kind words!

  • @jacobstuckiiii9683 2 years ago +1

    Thank You for the summary & colab notebook!

  • @enesmahmutkulak 2 years ago +1

    Thanks for sharing this fantastic video and notebook.

  • @Susuwho 2 years ago

    best lecture on diffusion model! thank you!

  • @ionutbigioi1526 2 years ago

    Amazing notebook, and great explanation.

  • @badermuteb4552 2 years ago

    you are the best, man. thanks😍

  • @JBoy340a 2 years ago

    Nice job explaining this.

  • @gauravkaul9136 2 years ago

    This is such an awesome video!

  • @alfcnz 2 years ago +11

    Nice notebook indeed!
    Small suggestion: move line 28 of the training loop just before line 37, since the reason behind zeroing out the gradient is that you don't want to accumulate them when running backward. Currently, these two instructions appear as unrelated in your code, which is bad practice.

    • @datasciencecastnet 2 years ago +3

      This is a bad habit of mine! I introduced the convention at the start of the course where the training loop was only a few lines and it was easy to see the link, but you're totally right that it's nicer to have them together. Changed it in the notebook.
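
The point about zeroing gradients right before the backward pass can be illustrated with a minimal sketch. This is pure Python rather than the notebook's PyTorch code: `Param`, `backward`, and `fit` are hypothetical names, and the accumulating `grad` attribute only loosely mimics how PyTorch adds into `.grad` on each backward call.

```python
class Param:
    """Toy parameter whose .grad accumulates (an illustrative stand-in)."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(param, x, y):
    """Add d/dw of the squared error (w*x - y)**2 into param.grad."""
    param.grad += 2.0 * (param.value * x - y) * x


# Two backward passes without zeroing: the second gradient stacks on the first.
w = Param(1.0)
backward(w, 3.0, 6.0)
first = w.grad
backward(w, 3.0, 6.0)
stale = w.grad  # twice `first`, because nothing was cleared in between

# Zeroing immediately before backward gives a clean gradient again.
w.grad = 0.0
backward(w, 3.0, 6.0)
fresh = w.grad


def fit(data, lr=0.05, epochs=50):
    """Fit y = w*x with the zeroing step placed right next to backward."""
    w = Param(0.0)
    for _ in range(epochs):
        for x, y in data:
            w.grad = 0.0            # clear the old gradient...
            backward(w, x, y)       # ...just before computing the new one
            w.value -= lr * w.grad  # plain gradient-descent update
    return w.value


w_fit = fit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])  # data generated from y = 2x
print(first, stale, fresh, w_fit)
```

Keeping the two lines adjacent makes the dependency visible: the zeroing exists only because the backward step accumulates.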

  • @easyBob100 2 years ago +1

    Thank you for this video. :)

  • @IArrrt 2 years ago +1

    Cool! Thank you so much!

  • @feather6367 2 years ago +1

    Thank you for this code breakdown!! Very useful. Question though: Other videos I've seen (which are more theory focused) minimise the ELBO. Here I only see the reconstruction term. I'm wondering where the KL term is

    • @datasciencecastnet 2 years ago

      So I came across nn.labml.ai/diffusion/ddpm/index.html - specifically their section on the loss function and their 'simplified loss'. Re-reading now I'm not sure I quite follow some of their steps, but the conclusion seems to be that with a few (possibly sketchy) assumptions we can simplify the loss term to just the mean squared error between the predicted noise and the actual noise. There is some extra scaling we *should* technically include but leaving it out actually seems to help by increasing the weight given to higher ts.
      I'll take a look at some actual implementations and see what the people who actually know how this all works are doing and then get back to you :)

    • @Yenrabbit 2 years ago

      I found a great video that goes through the maths in detail: th-cam.com/video/HoKDTa5jHvg/w-d-xo.html
      Shows *HOW* we end up with the relatively simple loss shown here.
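
As a rough illustration of the simplified loss discussed in this thread (mean squared error between predicted and actual noise, with the per-timestep scaling left out), here is a minimal pure-Python sketch. The linear beta schedule, the 1000 steps, and all function names are assumptions chosen for illustration and are not taken from the lesson notebook.

```python
import math
import random

T = 1000  # number of diffusion steps (assumed)

# Linear beta schedule from 1e-4 to 0.02 (an assumed, commonly used choice).
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]

# alpha_bars[t] is the running product of (1 - beta) up to and including step t.
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)


def add_noise(x0, t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    a = math.sqrt(alpha_bars[t])
    b = math.sqrt(1.0 - alpha_bars[t])
    x_t = [a * x + b * n for x, n in zip(x0, noise)]
    return x_t, noise


def simple_loss(predict_noise, x0, rng):
    """One-sample estimate of the unweighted noise-prediction MSE."""
    t = rng.randrange(T)
    x_t, noise = add_noise(x0, t, rng)
    pred = predict_noise(x_t, t)
    return sum((p - n) ** 2 for p, n in zip(pred, noise)) / len(noise)


x0 = [0.5, -0.25, 1.0, 0.0]  # hypothetical flattened "image"


def zero_model(x_t, t):
    """Placeholder model that always predicts zero noise."""
    return [0.0] * len(x_t)


def oracle_model(x_t, t):
    """Cheats by using x0 to recover the exact noise; only a sanity check."""
    a = math.sqrt(alpha_bars[t])
    b = math.sqrt(1.0 - alpha_bars[t])
    return [(xt - a * x) / b for xt, x in zip(x_t, x0)]


zero_loss = simple_loss(zero_model, x0, random.Random(0))
oracle_loss = simple_loss(oracle_model, x0, random.Random(0))
print(zero_loss, oracle_loss)
```

A real training loop would replace `zero_model` with the network and average this quantity over a batch; the full variational bound would additionally multiply each term by a timestep-dependent weight, which is the scaling this sketch omits.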

  • @loyamarta4760 2 years ago

    Uuu uji

  • @chiscoduran9517 2 years ago

    Starting from a model like a Unet, is it hard to implement a model like Palette?

    • @datasciencecastnet 2 years ago +1

      Palette uses a Unet model, specifically I think they follow the implementation used in (Prafulla Dhariwal and Alex Nichol. 2021. Diffusion models beat gans on image synthesis. arXiv preprint arXiv:2105.05233 (2021).) but with the input image as extra conditioning (so the input to the unet is 6 channels rather than 3).
      github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models seems like a good place to start if you want to try a model like this yourself.
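
The "6 channels rather than 3" conditioning mentioned above amounts to stacking the input image and the noisy image along the channel axis before they enter the Unet. A minimal sketch using nested lists in channel-first layout follows; `make_image` and `concat_channels` are hypothetical helpers, not part of Palette's code. With batched PyTorch tensors the same step would typically be a `torch.cat` along the channel dimension.

```python
def make_image(channels, height, width, fill):
    """Build a channel-first image as nested lists: [channel][row][column]."""
    return [[[fill] * width for _ in range(height)] for _ in range(channels)]


def concat_channels(noisy, condition):
    """Stack two channel-first images along the channel axis."""
    same_height = len(noisy[0]) == len(condition[0])
    same_width = len(noisy[0][0]) == len(condition[0][0])
    if not (same_height and same_width):
        raise ValueError("spatial sizes must match")
    return noisy + condition  # list concatenation appends the extra channels


noisy_target = make_image(3, 4, 4, 0.0)  # stands in for the noisy RGB image
source_image = make_image(3, 4, 4, 1.0)  # stands in for the conditioning RGB image

unet_input = concat_channels(noisy_target, source_image)
shape = (len(unet_input), len(unet_input[0]), len(unet_input[0][0]))
print(shape)
```

The only architectural change this implies for the Unet is that its first convolution accepts 6 input channels; the output stays at 3 channels.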

    • @chiscoduran9517 2 years ago +1

      @datasciencecastnet thanks!!!!