I didn't expect to finally understand transformers in this generative music course. I had watched lots of other videos about transformers but still found them really confusing. I started this course because I'm interested in generative music, so understanding transformers is just a bonus. I will definitely recommend this series to my classmates. Thank you!
This is such a generous and empowering resource. Massive thanks!
I can probably say that this video is the best on all of YouTube on this topic. I searched a lot and all I found were very superficial courses.
Great job.
Thank you :)
Good video, and good explanations of query, key and value matrices with analogies!
This video just saved my ass as I was having a hard time understanding transformers for my work assignment to train a transformer model for audio classification. Thank you!!
Amazing!
Excellent explanation in a very lucid fashion. It was really helpful!
Mad value in this video. You are such a good expositor.
Thank you!
Amazing video, I would like it a thousand times if I could!
Superbly presented!!
Great work you're doing here, Valerio. Really appreciated!
Thanks!
Thank you, kind Valerio!
59:41 The denominator values in the second column of this matrix seem to be different from the formula. Shouldn't it be 10000^(2*0/3)?
You're right and wrong at the same time. There's a mistake in the video -> dimension_model = 2 instead of 3 (I messed this one up in LaTeX!). There's also a mistake in your formula: "2*0" should be "2*1", as is correctly shown in the video. We're at embedding position 2, that is i = 1, given 0-indexing.
In any case, thank you for pointing this out :)
I believe @user-yf6yf6ki6f has a valid point. The denominator in the second column should be 10000^(2*0/3), and I also noticed a mistake in the third column - it should be 10000^(2*1/3). I think this is how it is implemented in the upcoming video within the _get_angles method.
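To make the indexing concrete, here is a minimal NumPy sketch of the angle term pos / 10000^(2i/d_model) from the paper, using d_model = 3 as in the three-column matrix discussed above. The helper name get_angles only mirrors the _get_angles method mentioned here; the actual implementation in the follow-up video may differ.

import numpy as np

def get_angles(pos, i, d_model):
    # Dimension pairs (0, 1), (2, 3), ... share a frequency, hence 2 * (i // 2).
    angle_rates = 1.0 / np.power(10000, (2 * (i // 2)) / np.float32(d_model))
    return pos * angle_rates

# Positions 0..3, embedding dimensions 0..2, d_model = 3:
angles = get_angles(np.arange(4)[:, np.newaxis], np.arange(3)[np.newaxis, :], 3)
print(angles)  # columns 0 and 1 use 10000^(2*0/3); column 2 uses 10000^(2*1/3)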
Thank you very much! It will significantly help me with my university project!
Valerio, I'm midway through writing my PhD thesis on music generation, and this video is incredibly useful for checking that my explanations make sense; it's also a great source to cite. Thanks for making it! Also, at 1:00:38, why is your dimension model 2 for the cos(pos / 10000^(2i/dimension_model)) examples? Just want to make sure I'm not misunderstanding something :)
Thanks again!
Best explanation I found so far. Keep it up!
Thanks a lot, that's pure gold content!
Thank you!
Thank you so much, Valerio!
Excellent , thanks !
Thanks a lot! You did great work!
Thanks!
Awesome explanation.
I have a doubt: the embedding matrix is such that the first row corresponds to the first word in the sequence, and so on. Since we already have this positional arrangement of each word in the sequence, isn't that enough for the transformer model to understand position-related info for all the words in the input?
The self-attention process is inherently position-agnostic: it doesn't consider the order of words at all. The attention mechanism would work the same way regardless of word order if not for positional encodings. That's why we can't rely on the order of the rows in the input matrix. The model needs an explicit, numerical way to understand word order, and that is the job of the sinusoidal function.
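To see this concretely, here is a tiny NumPy sketch (not code from the course) showing that plain self-attention is permutation-equivariant: shuffling the input rows only shuffles the output rows the same way, so without positional encodings no information about word order reaches the model.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention, no positional encoding.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                        # 5 tokens, d_model = 4
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
perm = [2, 0, 4, 1, 3]                             # shuffle the word order

out = self_attention(X, Wq, Wk, Wv)
out_shuffled = self_attention(X[perm], Wq, Wk, Wv)
print(np.allclose(out_shuffled, out[perm]))        # True: attention alone sees no order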
@@ValerioVelardoTheSoundofAI Like a blind mouse that can sense the gradient in the smell of cheese in its environment.
@@hariduraibaskar9056 I love the metaphor :D Quite appropriate!
Thank you, sir!
Please call me Valerio :)
Thank you, Valerio! :)
Lovely explanation as always. @@ValerioVelardoTheSoundofAI
Is there a part II of the video?
It'll come out tomorrow - stay tuned ;)
@@ValerioVelardoTheSoundofAI Great, thanks for sharing your knowledge.
🤟
The positional encoding matrix is either a 'clever math trick' or a sign that all of this is a kludgy hack and that we're still very far off from actually understanding this crap lol.
Like, we're still messing with brimstone and vitriol, and haven't been able to describe 'sulfur' yet.
You say "easily", but your part 1 video is over 1 hour 😅
I considered various methods to convey this topic:
1. Release a concise 15-minute video, giving viewers a feeling of understanding about transformers, yet only skimming the surface;
2. Publish a denser 30-minute video, heavy on mathematics and light on explanations, assuming a substantial level of pre-knowledge and making the material challenging;
3. Provide an in-depth, 2+ hour explanation filled with details, offering sufficient time to demystify the more intricate concepts in a user-friendly way.
My choice was the third option. Though it is lengthy, I believe the thorough coverage its length allows actually makes the material simpler to comprehend.