VQ-GAN: Taming Transformers for High-Resolution Image Synthesis | Paper Explained

  • Published Jan 19, 2025

Comments • 32

  • @TheAIEpiphany
    @TheAIEpiphany  3 years ago +12

    What do you get when you combine DeepMind's VQ-VAE, GANs, perceptual loss, and OpenAI's GPT-2 and CLIP? Well, I dunno, but the results are awesome haha!

  • @jisujeon5799
    @jisujeon5799 2 years ago +2

    YouTube should have recommended this channel to me a year ago. What quality content! Keep it up :D

    • @TheAIEpiphany
      @TheAIEpiphany  2 years ago +1

      Haha, mysterious are the paths of the YT algorithm. 😅

  • @johnpope1473
    @johnpope1473 3 years ago +7

    I like the low-level stuff. I attempt to read these papers, and your grasp and explanations give me confidence that I can decode them too. Almost always they're built on top of other work. I liked how you distilled that history in the StyleGAN session.

    • @TheAIEpiphany
      @TheAIEpiphany  3 years ago +1

      Thanks! It's a fairly complex tradeoff to decide when to stop digging into the nitty-gritty details. 😅 I am still figuring it out.

    • @johnpope1473
      @johnpope1473 3 years ago

      @@TheAIEpiphany I once came across some Python code I cloned from GitHub that could take a PDF and generate multiple-choice quiz questions from any content. Maybe I could help you one day and have you nut out the answers. You remember that sort of thing from physics class, where the teacher makes things clear, eliminating the nonsense and elucidating the correct answer.

  • @ronitrastogi9016
    @ronitrastogi9016 1 year ago

    In-depth explanations are a game changer. Keep doing them. Great work!!

  • @moaidali874
    @moaidali874 3 years ago +4

    The in-depth explanation is pretty useful. Thank you so much.

  • @akashsuryawanshi6267
    @akashsuryawanshi6267 1 year ago

    Keep up the detailed explanations. Those who aren't interested in the low-level stuff can just skip the detailed parts, a win for both. Thank you.

  • @daesoolee1083
    @daesoolee1083 2 years ago +1

    I think you cover both the high-level explanation and details fairly well :) Keep it up, please.

  • @hoomansedghamiz2288
    @hoomansedghamiz2288 3 years ago +3

    Great work and explanation. You have probably noticed, but VQ-VAE is a bit rough to train since the quantization step is not differentiable. In parallel there is Gumbel-Softmax, which is differentiable and therefore easier to train; wav2vec 2.0 uses that. It might be interesting to cover that next :) Cheers
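For readers curious about the Gumbel-Softmax relaxation mentioned in this comment, here is a minimal NumPy sketch of the general technique (an illustration only, not the wav2vec 2.0 implementation):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, seed=None):
    """Soft, differentiable sample from a categorical distribution.

    Adds Gumbel(0, 1) noise to the logits and applies a temperature-scaled
    softmax; as tau -> 0 the output approaches a one-hot sample.
    """
    rng = np.random.default_rng(seed)
    gumbel = -np.log(-np.log(rng.uniform(1e-10, 1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())          # numerically stable softmax
    return y / y.sum()

probs = gumbel_softmax(np.array([2.0, 0.5, -1.0]), tau=0.5, seed=0)
print(probs)  # a valid probability vector, concentrated on one entry
```

Unlike the straight-through trick used in VQ-VAE, gradients flow through this soft sample directly, which is the training convenience the comment alludes to.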

  • @rikki146
    @rikki146 1 year ago

    15:56 I thought it was arbitrary at first but later realized it is just balancing the loss terms, namely L_{rec} and L_{GAN}. If the gradients of L_{GAN} are big, then less weight goes on L_{GAN}, and vice versa.
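The adaptive weight this comment refers to is, per the paper, the ratio of gradient norms of the two losses at the decoder's last layer: λ = ‖∇L_rec‖ / (‖∇L_GAN‖ + δ). A toy sketch of the balancing behavior (plain Python, with the gradient norms passed in as plain numbers rather than computed by autograd):

```python
def adaptive_weight(grad_rec_norm, grad_gan_norm, delta=1e-6):
    """Lambda from the VQGAN objective: ratio of the gradient norms of the
    reconstruction loss and the GAN loss at the decoder's last layer.
    delta is a small constant for numerical stability."""
    return grad_rec_norm / (grad_gan_norm + delta)

# Large GAN gradients -> small lambda -> less weight on L_GAN, and vice versa.
print(adaptive_weight(1.0, 10.0))  # ~0.1
print(adaptive_weight(1.0, 0.1))   # ~10
```

This keeps the two loss terms on comparable footing regardless of their raw gradient magnitudes, which is exactly the balancing act the comment describes.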

  • @alexijohansen
    @alexijohansen 3 years ago

    So great! Love the explanation of the loss functions.

  • @MostafaTIFAhaggag
    @MostafaTIFAhaggag 2 years ago

    This is a masterpiece!

  • @vinciardovangoughci7775
    @vinciardovangoughci7775 3 years ago +1

    Great job! The conditioning part is super useful; the paper is confusing there.

  • @akashraut3581
    @akashraut3581 3 years ago +2

    U are on fire 🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥.
    This video was much needed for me. Thank you so much.

    • @TheAIEpiphany
      @TheAIEpiphany  3 years ago +1

      I am just getting started 😂 awesome!

  • @MuhammadAli-mi5gg
    @MuhammadAli-mi5gg 3 years ago

    Thanks again, a masterpiece like the VQ-VAE one. But it would be great if you also added a code walkthrough like in the VQ-VAE video, perhaps an even more detailed one.
    Thanks a lot again!

  • @jonathanballoch
    @jonathanballoch 2 years ago

    I feel like you lost me on the semantic segmentation → image generation step. You say the semantic token vector from the semantic VQGAN is appended to the front of the CLS token, and then the token vector of... the output VQGAN? Then this 2N+1-length vector is the input, and the output is a length-N vector? How is this possible? Don't transformers necessarily have the same input and output dimensions?
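For what it's worth, a decoder-only transformer does emit one output per input position, but the loss can be computed on a subset of positions: the conditioning tokens are prepended to the image tokens, and only the positions that predict image tokens are supervised. A shape-only sketch of that idea (the sizes here are hypothetical, not from the paper):

```python
import numpy as np

n_cond, n_img, vocab = 4, 6, 16           # hypothetical sizes
n_total = n_cond + n_img                  # [cond tokens | image tokens]

# A decoder-only transformer emits one next-token distribution per position.
logits = np.random.default_rng(0).normal(size=(n_total, vocab))

# Position i predicts token i+1, so only positions n_cond-1 .. n_total-2
# predict image tokens; the loss is computed on those positions alone.
loss_positions = np.arange(n_cond - 1, n_total - 1)
print(len(loss_positions))  # one supervised prediction per image token
```

So the input is longer than the supervised output, yet no architectural change is needed; the extra (conditioning) positions simply contribute context, not loss.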

  • @marcotroster8247
    @marcotroster8247 1 year ago

    It's always interesting to me how a bit of resource constraint can produce very clever, next-gen results, instead of just pumping the model full of weights and throwing crazy amounts of compute at it 😂

  • @fly-code
    @fly-code 3 years ago +1

    thank you sooo much

  • @vinhphanxuan5654
    @vinhphanxuan5654 3 years ago

    How did you do it? Can you share it with me? Thank you.

  • @TF2Shows
    @TF2Shows 6 months ago

    The adversarial loss: I think the explanation is wrong.
    You said the discriminator tries to maximize it, but you have just shown that it tries to minimize it (the term becomes 0 if D(x) is 1 and D(x̂) is 0). So the discriminator tries to minimize it (which makes sense, because it's a loss function), and the generator tries to do the opposite, maximize it, to fool the discriminator.
    So I think you mislabeled the objective: we try to minimize L_GAN (minimize the loss) in order to train the discriminator.
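For reference, the term under discussion is log D(x) + log(1 − D(x̂)). A quick numerical check of how it behaves (plain Python, with discriminator outputs in (0, 1)):

```python
import math

def l_gan(d_real, d_fake):
    """log D(x) + log(1 - D(x_hat)) from the standard GAN objective."""
    return math.log(d_real) + math.log(1.0 - d_fake)

# Both log terms are <= 0, so the expression's maximum value is 0,
# approached as D(x) -> 1 and D(x_hat) -> 0.
print(l_gan(0.999, 0.001))  # close to 0
print(l_gan(0.5, 0.5))      # about -1.386
```

Note that since both logs are non-positive, the value 0 at D(x) = 1, D(x̂) = 0 is the term's maximum, not its minimum; whether one calls it a loss to minimize or an objective to maximize then comes down to the sign convention in front of it.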

  • @kirtipandya4618
    @kirtipandya4618 3 years ago

    Answer: I find the in-depth explanation very, very useful. 🙂 You could also explain the code here. But great work. Thanks. 👍🏻🙂
    Could you please also review the paper "A Disentangling Invertible Interpretation Network for Explaining Latent Representations" from the same authors? It would be great. Thank you. 🙂

  • @xxxx4570
    @xxxx4570 3 years ago

    Thanks for your awesome explanation of this paper. I want to ask a question: how does the transformer's architecture enable autoregressive prediction?
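Briefly, the autoregression comes from causal (lower-triangular) self-attention: position i can only attend to positions ≤ i, so the model learns p(s_i | s_&lt;i) over the codebook indices. A minimal NumPy sketch of the mask (illustrative only, not the paper's code):

```python
import numpy as np

def causal_mask(n):
    """Lower-triangular attention mask: True where attention is allowed."""
    return np.tril(np.ones((n, n), dtype=bool))

mask = causal_mask(4)
print(mask.astype(int))
# Row i has ones only in columns 0..i, so position i never sees the future;
# at sampling time, tokens are then generated one at a time, left to right.
```

Applying this mask inside attention (setting disallowed scores to −∞ before the softmax) is what makes each prediction depend only on previously generated tokens.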

  • @yasmimrodrigues5437
    @yasmimrodrigues5437 3 years ago +1

    Some segments in the video are stamped not adjacent to each other

    • @TheAIEpiphany
      @TheAIEpiphany  3 years ago

      What exactly do you mean by that?
