Flow Matching | Explanation + PyTorch Implementation

  • Published on 13 Jan 2025

Comments • 25

  • @baroniket
    @baroniket 2 days ago +7

    The intuitive derivation part is really beautiful. I previously found it hard to grasp the concept with just the paper derivation, but this makes so much sense. To me personally, the intuitive derivation and the final code (especially the simple sampling part) seem even more helpful when I think about applying them with different interpolation methods for different noise.

  • @sushicommander
    @sushicommander 2 days ago +3

    😃 Thank you for making this. I’ve been looking for an explainer like this for ages. Always learn new nuggets of info I missed from papers.

  • @gnorts_mr_alien
    @gnorts_mr_alien 1 day ago +3

    Amazing video. The effort you put into really understanding this and the creativity you bring to making things simple to teach really show.

    • @outliier
      @outliier  1 day ago

      @gnorts_mr_alien thank you :3

  • @msanterre
    @msanterre 1 day ago

    I love these videos. I always struggle a bit to get to fully understand these concepts by reading the papers, but how you explain it really makes it stick.

  • @gustavgille9323
    @gustavgille9323 1 day ago

    Awesome video! Impressive how you can break it down to such a simple idea 👍

  • @wolfeinstien313
    @wolfeinstien313 10 hours ago

    Finally! I've been waiting for this video. Just started watching it, but I know it is going to be great! Thank you :)

  • @danielrose9754
    @danielrose9754 1 day ago

    I love the way you showed how to move from the FM to the CFM objective! In Yaron Lipman’s talk on the topic, he simply mentioned that the gradients coincide, but your derivation helps a lot with the understanding! As a follow-up, I would be very interested in a video on conditioning of FM models with CFG, like it was done in the paper “Scaling Rectified Flow Transformers for High-Resolution Image Synthesis” (Esser et al.). Keep up the great work!
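
    For reference, the two objectives mentioned in this comment can be written out as below. This is a sketch in the notation of the Flow Matching paper (Lipman et al.), which may differ slightly from the notation used in the video; here $v_\theta$ is the learned vector field, $u_t(x)$ the marginal target field, and $u_t(x \mid x_1)$ the conditional target field for a data sample $x_1 \sim q$.

    ```latex
    \begin{align}
    \mathcal{L}_{\mathrm{FM}}(\theta)
      &= \mathbb{E}_{t,\; x \sim p_t(x)}
         \bigl\| v_\theta(x, t) - u_t(x) \bigr\|^2 \\
    \mathcal{L}_{\mathrm{CFM}}(\theta)
      &= \mathbb{E}_{t,\; x_1 \sim q(x_1),\; x \sim p_t(x \mid x_1)}
         \bigl\| v_\theta(x, t) - u_t(x \mid x_1) \bigr\|^2 \\
    \nabla_\theta \mathcal{L}_{\mathrm{FM}}(\theta)
      &= \nabla_\theta \mathcal{L}_{\mathrm{CFM}}(\theta)
    \end{align}
    ```

    The two losses differ only by a term that does not depend on $\theta$, which is why their gradients coincide and the tractable conditional objective can be used for training.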

  • @jojodi
    @jojodi 9 hours ago

    Really great video. I implemented this in PyTorch just based on my memory of your intuitive description and it...just worked? Transformed 2D Gaussian noise to a "ring"-shaped distribution. Pretty awesome :)

    • @outliier
      @outliier  4 hours ago

      @jojodi oh, that is awesome
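
An experiment like the one described in this thread might look roughly like the sketch below. It is not @jojodi's code or the video's implementation; the ring sampler, network size, step counts, and all function names (`sample_ring`, `training_step`, `sample`) are assumptions chosen for illustration. It uses linear interpolation between noise and data, regresses the velocity `x1 - x0`, and samples with plain Euler steps.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

def sample_ring(n, radius=1.0, noise=0.05):
    # Hypothetical target distribution: points on a slightly noisy circle.
    angle = 2 * math.pi * torch.rand(n)
    r = radius + noise * torch.randn(n)
    return torch.stack([r * torch.cos(angle), r * torch.sin(angle)], dim=1)

# Small MLP (an assumed architecture) mapping (x_t, t) to a 2D velocity.
model = nn.Sequential(
    nn.Linear(3, 64), nn.SiLU(),
    nn.Linear(64, 64), nn.SiLU(),
    nn.Linear(64, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(batch_size=256):
    x1 = sample_ring(batch_size)       # data sample
    x0 = torch.randn_like(x1)          # Gaussian noise sample
    t = torch.rand(batch_size, 1)      # time drawn uniformly from [0, 1)
    xt = (1 - t) * x0 + t * x1         # point on the straight path at time t
    target = x1 - x0                   # velocity of that straight path
    pred = model(torch.cat([xt, t], dim=1))
    loss = ((pred - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def sample(n, steps=50):
    # Euler integration of dx/dt = model(x, t) from t = 0 (noise) to t = 1.
    x = torch.randn(n, 2)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((n, 1), i * dt)
        x = x + dt * model(torch.cat([x, t], dim=1))
    return x

# A short run for demonstration; more steps would likely be needed for a clean ring.
losses = [training_step() for _ in range(200)]
samples = sample(512)
```

Plotting `samples` after a longer training run is a quick way to check whether the points settle onto the ring.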

  • @zuchti5699
    @zuchti5699 1 day ago +1

    You always give me such good pointers and intuitions to go on and read my papers. Thanks so much for your work!

    • @outliier
      @outliier  1 day ago

      @zuchti5699 thank you! Could you point me to an exact timestamp where this could be improved?

  • @cpldcpu
    @cpldcpu 14 hours ago

    The authors of this paper arrived at the same approach, but derived it rather by intuition: "Iterative α-(de)Blending: a Minimalist Deterministic Diffusion Model" (arxiv 2305.03486).

  • @spartancoder
    @spartancoder 1 day ago

    Extraordinary video.

  • @cpldcpu
    @cpldcpu 14 hours ago

    Thank you for the nice video! I always found diffusion models to be packaged into way too much math compared to the actual implementation.

  • @felipedilho
    @felipedilho 1 day ago

    You have made some incredible videos. I come from a Mathematical Physics background and I really appreciate your mathematical exposition of these machine learning papers on image generation! Thank you very much!!! If I could suggest another topic, I would like to see a video on Nvidia's new open source model SANA, which claims to be more efficient than the majority of image synthesis generative models. They talk about methods such as DiT with linear attention and a Flow-DPM-Solver, and I don't know whether these are the same as Flow Matching.

  • @apartmenttour4000
    @apartmenttour4000 1 day ago

    Hi, at around 9:26, is q(x1) the probability density of the marginalization variable x1? Thanks

    • @outliier
      @outliier  1 day ago

      Yeah, and technically this could also just be written as p_1(x_1), as far as I understand it.
      But this is how the paper wrote it too, so I kept it that way.
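
      Assuming the expression under discussion is the marginal probability path from the Flow Matching paper (the thread does not quote it, so this is a guess at the formula shown at that timestamp), it reads as follows, with $q(x_1)$ the data density over which $x_1$ is integrated out:

      ```latex
      p_t(x) = \int p_t(x \mid x_1)\, q(x_1)\, \mathrm{d}x_1
      ```

      At $t = 1$ the conditional path concentrates around $x_1$, so $p_1$ closely approximates $q$, which is consistent with the reply that $q(x_1)$ could also be written as $p_1(x_1)$.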

  • @abelhutten4532
    @abelhutten4532 1 day ago

    Nice videos! I'd like to see RL videos, especially model-based RL

    • @abelhutten4532
      @abelhutten4532 5 hours ago

      Many of the methods involve image generation for goal setting or as part of the world model, and there is lots of research involving multimodal models and learning compressed latent spaces. It would fit well with your other videos. Something like DreamerV3 would be cool, just as an example. Showing a computer program learning to mine diamonds in Minecraft is also very spectacular.

  • @RadRebel4
    @RadRebel4 1 day ago

    amazing video bro

  • @EkShunya
    @EkShunya 1 day ago

    another banger video

  • @amortalbeing
    @amortalbeing 1 day ago

    Thanks a lot

  • @SN-uc3vr
    @SN-uc3vr 2 days ago

    You're awesome! I would be curious to know whether you're a PhD, in industry, or self-taught?

    • @outliier
      @outliier  1 day ago +4

      @SN-uc3vr mostly self-taught. I studied a bachelor's in AI in Germany and now I work in industry at Luma AI :3