The inner workings of LLMs explained - VISUALIZE the self-attention mechanism

  • Published 6 Oct 2024
  • HOW do LLMs (Large Language Models) work, and WHY do they work?
    Models like ChatGPT or GPT-4. Can we understand them?
    An easy introduction to:
    1. How does the self-attention mechanism work inside LLMs?
    2. What makes all those LLMs different: their weights, their pre-training datasets, or their architectural design?
    3. What makes an LLM perform better (hardware/software), and how do you tune for the optimal number of layers and attention heads in the LLM architecture?
    Simple explanations of how Large Language Models (LLMs), or decoder-based Transformers in general, work. Plus LangChain and vector stores, with their corresponding vector embeddings, explained. Also suitable for beginners to AI.
    We focus only on the decoder stack of the transformer for LLMs and ignore, for the moment, RLHF (reinforcement learning from human feedback).
    Introducing Claude
    www.anthropic....
    Great new pre-print (all rights with authors):
    "AttentionViz: A Global View of Transformer Attention"
    by Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg
    arxiv.org/abs/...
    Read the documentation:
    catherinesyeh....
    Interactive demo:
    attentionviz.com/
    #ai
    #languagemodel
    #datascience
    #naturallanguageprocessing
    #gpt4
    #chatgpt
    #bard
    #vectors
    #vectorspaces
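The self-attention mechanism the video walks through can be sketched in a few lines of NumPy. This is a minimal single-head illustration with toy dimensions and random projection matrices (not the video's exact code): each token embedding is projected to queries, keys, and values; the attention weights are a row-wise softmax of the scaled query-key similarities; and a causal mask keeps every decoder token from attending to future positions.

```python
# Minimal single-head, causally masked self-attention sketch.
# Toy sizes and random weights, for illustration only.
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model, d_k = 4, 8, 8               # 4 tokens, toy dimensions
X = rng.standard_normal((seq_len, d_model))   # token embeddings

# Learned projection matrices (random stand-ins here)
W_q = rng.standard_normal((d_model, d_k))
W_k = rng.standard_normal((d_model, d_k))
W_v = rng.standard_normal((d_model, d_k))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d_k)               # (seq_len, seq_len) similarities

# Causal mask: a decoder token may attend only to itself and earlier tokens
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[mask] = -np.inf

# Numerically stable row-wise softmax
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

output = weights @ V                          # (seq_len, d_k) attended values
print(weights.round(2))                       # each row sums to 1
```

Stacking several such heads side by side (and running the whole thing once per layer) is what the architectural questions above, about the number of layers and attention heads, are tuning.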

Comments • 16

  • @derejew109
    @derejew109 1 year ago +6

    I am already addicted to your intro "hello community", brilliant and authentic! Thank you, sir.

  • @MadhavanSureshRobos
    @MadhavanSureshRobos 1 year ago +13

    Wow! I can't believe the quality of this information! Best educational channel

  • @mehmetcandemir5035
    @mehmetcandemir5035 7 months ago

    This explanation is something else; it felt like a teacher was right in front of me. Thank you for this incredible work!

  • @bahramboutorabi5971
    @bahramboutorabi5971 1 year ago +1

    Great work. You made a complex concept easy to visualise and understand.

  • @JonathanYankovich
    @JonathanYankovich 1 year ago +1

    Fantastic. Man, I get so much out of these videos. And the delivery is great.

  • @MadhavanSureshRobos
    @MadhavanSureshRobos 1 year ago +1

    Can't wait for the data lakes video. I tried understanding the concept, but I wasn't sure, and I also wasn't able to run their code.

  • @vitaliiivanov9514
    @vitaliiivanov9514 1 year ago

    That's great! I didn't know there was such a tool available.

  • @norlesh
    @norlesh 1 year ago

    Would really like to see how much LLama2-7B could be reduced by optimizing every head of every layer using AttentionViz and transfer training the standard model weights to the reconfigured layer weights.

  • @jayhu6075
    @jayhu6075 1 year ago

    What an understandable explanation of how self-attention works; for me as a beginner this is a difficult topic. In this case you decided to choose ADA, what is the reason behind this?
    Is it possible to make a tutorial with examples such as a counselor's or lawyer's office with their own AI system, optimized for its task? Many thanks.

    • @code4AI
      @code4AI 1 year ago

      OpenAI only sells you token embeddings from a single, second-generation AI model: their ada-002. They specify that Notion also works with this model.
      See also:
      openai.com/blog/new-and-improved-embedding-model
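As a hedged illustration of what you do with such embeddings once the API has returned them: retrieval in vector stores typically ranks texts by cosine similarity between their embedding vectors. The 4-dimensional vectors below are made-up stand-ins (real text-embedding-ada-002 vectors have 1536 dimensions); only the comparison logic is the point.

```python
# Comparing ada-002-style embeddings with cosine similarity.
# The vectors are hypothetical 4-dim stand-ins for illustration;
# real ada-002 embeddings have 1536 dimensions.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

emb_query = [0.1, 0.3, -0.2, 0.9]     # hypothetical embedding of a query
emb_doc_a = [0.1, 0.28, -0.19, 0.88]  # near-duplicate text -> similar vector
emb_doc_b = [-0.7, 0.1, 0.6, -0.1]    # unrelated text -> dissimilar vector

sim_a = cosine_similarity(emb_query, emb_doc_a)
sim_b = cosine_similarity(emb_query, emb_doc_b)
print(sim_a, sim_b)   # sim_a is close to 1.0, sim_b is much lower
```

A vector store does exactly this ranking, just over millions of stored vectors with an approximate nearest-neighbor index instead of a loop.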

  • @learnvik
    @learnvik 11 months ago

    Thanks, but I still don't understand how an LLM creates a response from a prompt. I understood how it picks words based on attention weights, but I still have no clue how the whole sentence gets generated. I think I am not capable of understanding it. Will try some other videos.
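For readers with the same question: the whole sentence is produced one token at a time. The model (attention and all) outputs a probability distribution for the *next* token only; that token is sampled, appended to the context, and the model runs again on the longer sequence until a stop token appears. The sketch below uses a hypothetical toy "model" (a lookup table of next-token probabilities) purely so the loop itself is visible; a real LLM computes those probabilities with its transformer layers.

```python
# Toy autoregressive generation loop. The "model" is a hypothetical
# bigram lookup table, but the loop is the same one an LLM runs:
# predict the next token, sample it, append it, repeat.
import random

random.seed(0)

# Hypothetical next-token distributions (an LLM computes these with
# self-attention over the whole context, not a table)
next_token_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"<eos>": 1.0},
    "ran": {"<eos>": 1.0},
}

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        probs = next_token_probs[tokens[-1]]               # condition on context
        choices, weights = zip(*probs.items())
        nxt = random.choices(choices, weights=weights)[0]  # sample next token
        if nxt == "<eos>":                                 # stop token ends it
            break
        tokens.append(nxt)                                 # feed back in
    return " ".join(tokens)

print(generate("the"))
```

So attention decides *which* earlier words matter at each step, and this outer loop is what turns step-by-step predictions into a full sentence.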

  • @henkhbit5748
    @henkhbit5748 1 year ago

    Thanks, very interesting topic. Is the code for the interactive visualization not on GitHub?

  • @gileneusz
    @gileneusz 1 year ago

    3:18 this is a nice YouTube inception 😆

  • @chivesltd
    @chivesltd 1 year ago

    Any tutorial on how to code our own LLM and fine-tune it for a specialized task?

    • @code4AI
      @code4AI 1 year ago +1

      Yes, more than 30 videos on this channel.