Transformers, explained: Understand the model behind ChatGPT
- Published Jun 8, 2024
- 🚀 Learn AI Prompt Engineering: bit.ly/3v8O4Vt
In this technical overview, we dissect the architecture of Generative Pre-trained Transformer (GPT) models, drawing parallels between artificial neural networks and the human brain.
From the foundational GPT-1 to the advanced GPT-4, we explore the evolution of GPT models, focusing on their learning processes, the significance of data in training, and the revolutionary Transformer architecture.
This video is designed for curious, non-technical viewers who want to understand how GPT models work, explained in a way that's easy to follow.
🔗 SOCIAL LINKS:
🌐 Website/Blog: www.futurise.com/
🐦 Twitter/X: / joinfuturise
🔗 LinkedIn: / futurisealumni
📘 Facebook: profile.php?...
📣 Subscribe: www.youtube.com/@leonpetrou?s...
⏰ Timestamps:
0:00 - Intro
0:27 - The Importance of Modeling The Human Brain
1:10 - Basics of Artificial Neural Networks (ANNs)
2:26 - Overview of GPT Models Evolution
3:34 - Training Large Language Models
7:05 - Transformer Architecture
7:45 - Understanding Tokenization
10:19 - Explaining Token Embeddings
17:03 - Deep Dive into Self-Attention Mechanism
18:53 - Multiheaded Self-Attention Explained
19:55 - Predicting the Next Word: The Process
22:33 - De-Tokenization: Converting Token IDs Back to Words
#llm #ml #chatgpt #nvidia #elearning #futurise #promptengineering #futureofwork #leonpetrou #anthropic #claude #claude3 #gemini #openai #transformers #techinsights - Science & Technology
Excellent. I went through multiple videos on the basics of Transformers, and this is the best one I could quickly grasp. Effortlessly explained, well done!!
Thank you Ravindran! I try my best to teach things the same way that I'd like to be taught, which is simple and step-by-step. Let me know what other videos you'd like to see from my channel.
Hi Leon, it would be great if you could make videos on LangChain and its applications, which are trending now. You could also cover topics like vector databases, embeddings, word2vec, and so on. Anything on GenAI is hot in the tech space right now. Thanks.
Same for me. I'm a Python backend dev, and getting to grips with Transformers was tough, but you helped me a lot, thank you!
Excellent !!! Thanks for simplifying it. Loved it !
Appreciate that, thank you!
Best. Finally I understand how GPT works now. Thanks mate, you're the champion.
Thank you so much.. really, really well explained.
1:12 ANN
2:26 GPT-1 ~ GPT-4
3:34 LLM
7:09 Transformer architecture
7:45 Tokenization & Detokenization
8:17 Step 1
10:14 Step 2
10:20 Token embeddings
14:48 Step 3
15:10 Position Embedding
16:58 Step 4
17:17 Self-Attention
18:52 Multi-headed self-attention
19:55 Step 5
20:27 Feed-Forward
22:02 Step 6
22:32 De-tokenization
It was indeed a very informative video. It cleared up a lot of important ideas. Thanks a lot.
Very Good Explanation. Thank You
thx for this great video !
Appreciate it!
THANK YOU! easy explanation..
Appreciate it!
Wow, this is one of the easiest-to-understand videos on how Transformers work. You also explained tokens and embeddings very well, which I was searching for. I'm a complete newbie and I kept hearing about neurons and neural networks. Is a neuron a physical device/hardware, or is it actually an algorithm? And is a neural network not a physical network?
Thank you! Neural networks, and everything explained in this video, are all software (except biological neurons, which are in the human brain); it's all algorithms. It's basically just code. The hardware that the code runs on usually just requires high processing power / RAM. This can be a CPU or GPU.
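(For the curious, here's a minimal sketch of a single artificial "neuron" as plain code; the inputs, weights, and bias below are made-up values for illustration, since real networks learn those during training.)

```python
import math

def neuron(inputs, weights, bias):
    """One artificial 'neuron': a weighted sum of its inputs plus a bias,
    squashed through a sigmoid activation into the range (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# Made-up numbers for illustration -- real networks learn weights and biases.
print(neuron(inputs=[0.5, 0.8], weights=[0.4, -0.2], bias=0.1))  # ~0.535
```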
Great explanation.
Thank you very much!
This is amazing, thanks a lot man! Quick question, how are the self-attention layers produced? Does the model dynamically “decide” which contextual layer to use depending on the prompt, or is the set of layers learnt during training?
My pleasure man, glad you like it. That's a great question. The structure and behavior of these self-attention layers are determined during the model's training phase, not during inference.
Simply put, the model learns which words in a sentence should pay attention to which other words to better understand the sentence's meaning. This learning process is fixed once the model is fully trained; it does not change or decide on a different structure when it's given new prompts to process.
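(To make that concrete, here's a minimal sketch of the scaled dot-product self-attention computation, assuming NumPy. The projection matrices below are random stand-ins for the parameters that get fixed during training, while the attention weights themselves are recomputed for every new prompt.)

```python
import numpy as np

def self_attention(x, W_q, W_k, W_v):
    """Scaled dot-product self-attention over one sequence.
    x: (seq_len, d_model) token embeddings.
    W_q, W_k, W_v: projection matrices -- learned in training, fixed at inference."""
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # how strongly each token attends to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # context-mixed token representations

# Random stand-ins for learned parameters, just to show the shapes.
rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(4, d))  # 4 tokens, each an 8-dim embedding
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, W_q, W_k, W_v).shape)  # (4, 8) -- same shape in, same shape out
```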
Bravo 🇨🇵 It's a shame that this excellent level of explanation is reserved only for those of us who understand English. Lecun and Bengio have a lot to do with it. Fortunately, the nutshell isn't translated by some half-baked GPT!
Merci beaucoup for your thoughtful comment! I'm glad you found the video informative. Your point about language accessibility is very important to us. We're actively exploring options to include subtitles in multiple languages in our future videos to ensure more viewers can benefit from our content.
This video is a much better one ☝️
Appreciate that!
Great session!
Thank you!
not 175 trillion parameters but 1.75 trillion
Thanks for clarifying, my bad.
Sir, are you a researcher or an ML enthusiast?
I'm an ML enthusiast with an engineering background. :)
Ed Stafford?
I see it! haha
1.76 trillion for GPT-4
indeed, thanks for clarifying!
What happened to your hair?
New year new me 😂