Transformers, explained: Understand the model behind ChatGPT

  • Premiered Jun 8, 2024
  • 🚀 Learn AI Prompt Engineering: bit.ly/3v8O4Vt
    In this technical overview, we dissect the architecture of Generative Pre-trained Transformer (GPT) models, drawing parallels between artificial neural networks and the human brain.
    From the foundational GPT-1 to the advanced GPT-4, we explore the evolution of GPT models, focusing on their learning processes, the significance of data in training, and the revolutionary Transformer architecture.
    This video is designed for curious non-technical people looking to understand the complexities of GPT models in a way that's easy to understand.
    🔗 SOCIAL LINKS:
    🌐 Website/Blog: www.futurise.com/
    🐦 Twitter/X: / joinfuturise
    🔗 LinkedIn: / futurisealumni
    📘 Facebook: profile.php?...
    📣 Subscribe: www.youtube.com/@leonpetrou?s...
    ⏰ Timestamps:
    0:00 - Intro
    0:27 - The Importance of Modeling The Human Brain
    1:10 - Basics of Artificial Neural Networks (ANNs)
    2:26 - Overview of GPT Models Evolution
    3:34 - Training Large Language Models
    7:05 - Transformer Architecture
    7:45 - Understanding Tokenization
    10:19 - Explaining Token Embeddings
    17:03 - Deep Dive into Self-Attention Mechanism
    18:53 - Multiheaded Self-Attention Explained
    19:55 - Predicting the Next Word: The Process
    22:33 - De-Tokenization: Converting Token IDs Back to Words
    #llm #ml #chatgpt #nvidia #elearning #futurise #promptengineering #futureofwork #leonpetrou #anthropic #claude #claude3 #gemini #openai #transformers #techinsights
  • Science & Technology

Comments • 37

  • @ravindranshanmugam782
    @ravindranshanmugam782 2 months ago +6

    Excellent. I went through multiple videos on the basics of Transformers, and this is the best one I could quickly grasp. Effortlessly explained, well done!!

    • @LeonPetrou
      @LeonPetrou  2 months ago +2

      Thank you Ravindran! I try my best to teach things the same way that I'd like to be taught, which is simple and step-by-step. Let me know what other videos you'd like to see from my channel.

    • @ravindranshanmugam782
      @ravindranshanmugam782 2 months ago +2

      Hi Leon, it would be great if you could make videos on Langchain and its applications, which are trending now. You could also add topics like vector databases, embeddings, word2vec and so on. Anything on GenAI is hot in the tech space right now. Thanks.

    • @ovidioe.cabeza4750
      @ovidioe.cabeza4750 13 days ago

      Same for me. I'm a Python backend dev and getting transformers was tough, but you helped me a lot, thank you!

  • @vj7668
    @vj7668 7 days ago

    Excellent!!! Thanks for simplifying it. Loved it!

    • @LeonPetrou
      @LeonPetrou  6 days ago

      Appreciate that, thank you!

  • @michaelzap8528
    @michaelzap8528 13 days ago +1

    Best. Finally I understand how GPT works now. Thanks mate, you're the champion.

  • @programminglover2976
    @programminglover2976 6 days ago

    Thank you so much, really really well explained.

  • @wp1300
    @wp1300 22 days ago +1

    1:12 ANN
    2:26 GPT-1 ~ GPT-4
    3:34 LLM
    7:09 Transformer architecture
    7:45 Tokenization & Detokenization
    8:17 Step 1
    10:14 Step 2
    10:20 Token embeddings
    14:48 Step 3
    15:10 Position Embedding
    16:58 Step 4
    17:17 Self-Attention
    18:52 Multi-headed self-attention
    19:55 Step 5
    20:27 Feed-Forward
    22:02 Step 6
    22:32 De-Tokenization

  • @anibeto7
    @anibeto7 2 months ago

    It was indeed a very informative video. It cleared a lot of the important ideas. Thanks a lot.

  • @JohnCohen-ur5hk
    @JohnCohen-ur5hk 11 days ago

    Very good explanation. Thank you!

  • @karannesh7700
    @karannesh7700 12 days ago +1

    Thanks for this great video!

    • @LeonPetrou
      @LeonPetrou  12 days ago +1

      Appreciate it!

  • @MotulzAnto
    @MotulzAnto 19 days ago

    THANK YOU! Easy explanation.

    • @LeonPetrou
      @LeonPetrou  17 days ago +1

      Appreciate it!

  • @Clammer999
    @Clammer999 28 days ago +1

    Wow, this is one of the easiest-to-understand videos on how transformers work. You also explained tokens and embeddings very well, which I was searching for. I'm a complete newbie and I kept hearing about neurons and neural networks. Is a neuron a physical device/hardware, or is it actually an algorithm? And is a neural network not a physical network?

    • @LeonPetrou
      @LeonPetrou  26 days ago

      Thank you! Neural networks, and everything explained in this video, are all software (except biological neurons, which are in the human brain); it's all algorithms, basically just code. The hardware the code runs on usually just requires high processing power/RAM, such as a CPU or GPU.
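      To make the "it's just code" point concrete, here is a minimal sketch of a single artificial neuron in plain Python (the function name and the numeric values are made up for this illustration; real networks stack millions of these units):

```python
import math

def neuron(inputs, weights, bias):
    # A neuron is just arithmetic: a weighted sum of its inputs plus a bias...
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ...passed through a nonlinear activation (here a sigmoid),
    # which squashes the result into the range (0, 1).
    return 1 / (1 + math.exp(-z))

# Two inputs, two "learned" weights, one "learned" bias (toy values)
print(round(neuron([1.0, 0.5], [0.4, -0.2], 0.1), 3))  # 0.599
```

      Training a network simply means adjusting the weights and biases; the code itself never changes.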

  • @sudhanshusaxena8134
    @sudhanshusaxena8134 18 days ago +1

    Great explanation.

    • @LeonPetrou
      @LeonPetrou  17 days ago

      Thank you very much!

  • @Omniassassin7
    @Omniassassin7 2 months ago +1

    This is amazing, thanks a lot man! Quick question, how are the self-attention layers produced? Does the model dynamically “decide” which contextual layer to use depending on the prompt, or is the set of layers learnt during training?

    • @LeonPetrou
      @LeonPetrou  2 months ago +2

      My pleasure man, glad you like it. That's a great question. The structure and behavior of these self-attention layers are determined during the model's training phase, not during inference.
      Simply put, the model learns which words in a sentence should pay attention to which other words to better understand the sentence's meaning. This learning is fixed once the model is fully trained; it does not change or decide on a different structure when it's given new prompts to process.
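      For readers who want to see the distinction in code, here is a hedged numpy sketch of scaled dot-product self-attention (toy sizes; random values stand in for what training would learn). The projection matrices `W_q`, `W_k`, `W_v` are the part frozen after training, while the attention scores are recomputed for every new prompt:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy embedding size
# Stand-ins for the *learned* projection matrices: fixed once training ends
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

def self_attention(X):
    # X: (seq_len, d) token embeddings for one prompt
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(d)        # how much each token attends to each other token
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over each row
    return weights @ V                    # context-mixed representations

X = rng.standard_normal((3, d))  # a 3-token "prompt"
print(self_attention(X).shape)   # (3, 4)
```

      The attention weights differ for every input sentence, but the recipe for computing them (the W matrices) never changes at inference time.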

  • @abooaw4588
    @abooaw4588 2 months ago

    Bravo 🇨🇵 It's a shame that such a good level of explanation is reserved only for those of us who understand English. Lecun and Bengio have a lot to do with that. Fortunately this nutshell isn't translated by some second-rate GPT!

    • @LeonPetrou
      @LeonPetrou  2 months ago +1

      Merci beaucoup for your thoughtful comment! I'm glad you found the video informative. Your point about language accessibility is very important to us. We're actively exploring options to include subtitles in multiple languages in our future videos to ensure more viewers can benefit from our content.

  • @kamal9991999
    @kamal9991999 15 days ago

    This video is a lot better ☝️

    • @LeonPetrou
      @LeonPetrou  15 days ago

      Appreciate that!

  • @Keshi-lz3ef
    @Keshi-lz3ef 2 months ago

    Great session!

    • @LeonPetrou
      @LeonPetrou  2 months ago

      Thank you!

  • @d96002
    @d96002 13 days ago +2

    not 175 trillion parameters but 1.75 trillion

    • @LeonPetrou
      @LeonPetrou  12 days ago

      Thanks for clarifying, my bad.

  • @NavdeepVarshney-ep4ck
    @NavdeepVarshney-ep4ck 25 days ago

    Sir, are you a researcher or an ML enthusiast?

    • @LeonPetrou
      @LeonPetrou  25 days ago

      I'm an ML enthusiast with an engineering background. :)

  • @dragonwood-hc4sw
    @dragonwood-hc4sw 6 days ago

    Ed Stafford?

    • @LeonPetrou
      @LeonPetrou  6 days ago

      I see it! haha

  • @MaduraiKallan
    @MaduraiKallan 11 days ago

    1.76 trillion for GPT-4

    • @LeonPetrou
      @LeonPetrou  11 days ago

      Indeed, thanks for clarifying!

  • @saeidnazemi1312
    @saeidnazemi1312 3 months ago

    What happened to your hair?

    • @LeonPetrou
      @LeonPetrou  3 months ago +1

      New year new me 😂