Lesson 22: Deep Learning Foundations to Stable Diffusion

  • Published Jan 31, 2025

Comments • 5

  • @michaelmuller136 • 4 months ago

    Great lecture, especially about the different samplers, thank you!

  • @piotr.czapla • 2 years ago +4

    I have some preliminary chapters. Let's see if TH-cam lets me add them so that it is easier to improve on them.
    Chapters
    00:00 - Intro
    00:30 - Cosine Schedule (22_cosine)
    06:05 - Sampling
    09:37 - Summary / Notation
    10:42 - Predicting the noise level of noisy Fashion MNIST images (22_noise-pred)
    12:57 - Why .logit() when predicting alpha bar t
    14:50 - Random baseline
    16:40 - mse_loss why .flatten()
    17:30 - Model & results
    19:03 - Why are we trying to predict the noise level?
    20:10 - Training diffusion without t - first attempt
    22:58 - Why isn't it working?
    27:02 - Debugging (summary)
    29:29 - Bug in ddpm - paper that casts some light on the issue
    38:40 - Karras (Elucidating the Design Space of Diffusion-Based Generative Models)
    49:47 - Picture of target images
    52:48 - Scaling problem - (scalings)
    59:42 - Training and predictions of modified model
    1:03:49 - Sampling
    1:06:05 - Sampling: Problems of composition
    1:07:40 - Sampling: Rationale for rho selection
    1:09:40 - Sampling: Denoising
    1:15:26 - Sampling: Heun's method (FID: 0.972)
    1:19:00 - Sampling: LMS sampler
    1:20:00 - Karras Summary
    1:23:00 - Comparison of different approaches
    1:25:00 - Next lessons

  • @myfolder4561 • 2 months ago

    @6:04 (th-cam.com/video/6Bta1tXRUfM/w-d-xo.html). Just to seek clarification: the 'denoise' function essentially calculates x_0_hat (the unbiased estimate of x_0) at any given timestep, right? It's equivalent to making the best unbiased estimate of x_0 (i.e. the completely denoised image in the original data distribution) in a single denoising step, as opposed to the recursive approach of the multi-step reverse diffusion process, which iterates through a series of x_0_hat estimates, each weight-averaged against the noisy image x_t at every step of the reverse/denoising process. A single denoising step, albeit unbiased, would produce a pretty unsatisfactory outcome that's way off the original data distribution, given the high variance of this approach.
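
    A minimal sketch of the one-step x_0_hat estimate described above, assuming the standard DDPM noise-prediction parameterization (the names `model` and `abar` are illustrative, not taken from the lesson notebook):

    ```python
    import torch

    def denoise_one_step(model, x_t, t, abar):
        # abar[t] is alpha-bar_t, the cumulative product of (1 - beta) up to step t.
        # Since x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, plugging in the
        # model's noise estimate and solving for x_0 gives the one-step x_0_hat.
        eps_hat = model(x_t, t)
        abar_t = abar[t]
        return (x_t - (1 - abar_t).sqrt() * eps_hat) / abar_t.sqrt()
    ```

    Multi-step samplers re-estimate x_0_hat like this at every step and only move x_t a small way toward it, which is why they land closer to the data distribution than a single jump would.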

  • @myfolder4561 • 2 months ago

    @th-cam.com/video/6Bta1tXRUfM/w-d-xo.html
    Hope someone can shed some light and wisdom. Does the modified model incorporating c_skip look like it's consistently unable to denoise images at the noisy end of the spectrum? Given that the modified model's objective for noisier images is to emphasize finding the original image, does that mean it's not really doing what it set out to do?
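
    For context, a sketch of the Karras-style preconditioning scalings the question refers to (sig_data = 0.66 here is an assumption about the data's standard deviation, and the function name is illustrative, not necessarily the lesson's):

    ```python
    def scalings(sig, sig_data=0.66):
        # Karras et al. preconditioning: the denoised output is
        #   c_skip * noised_input + c_out * model_output(c_in * noised_input)
        totvar = sig**2 + sig_data**2
        c_skip = sig_data**2 / totvar
        c_out = sig * sig_data / totvar**0.5
        c_in = 1 / totvar**0.5
        return c_skip, c_out, c_in

    print(scalings(0.1))   # low noise: c_skip ~ 0.98, the output keeps most of the input
    print(scalings(10.0))  # high noise: c_skip ~ 0.004, the output is almost entirely the model's guess
    ```

    So at the noisy end the skip connection contributes almost nothing and the network has to reconstruct the image nearly from scratch, which is the regime the question is asking about.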

  • @satirthapaulshyam7769 • 1 year ago

    17:00 flattened MSE
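
    Presumably this refers to flattening both tensors before mse_loss. A minimal illustration of why that matters (the shapes are made up for the example):

    ```python
    import torch
    import torch.nn.functional as F

    pred = torch.randn(16, 1)  # e.g. one predicted noise level per image
    targ = torch.randn(16)     # targets, one per image

    # Without flattening, (16, 1) vs (16,) broadcasts to a (16, 16) comparison,
    # so every prediction is measured against every target -- a silently wrong loss
    # (PyTorch only emits a warning). Flattening both restores element-wise MSE.
    loss = F.mse_loss(pred.flatten(), targ.flatten())
    ```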