Transformers, explained: Understand the model behind GPT, BERT, and T5

  • Published on 21 May 2024
  • Dale’s Blog → goo.gle/3xOeWoK
    Classify text with BERT → goo.gle/3AUB431
    Over the past five years, Transformers, a neural network architecture, have completely transformed state-of-the-art natural language processing. Want to translate text with machine learning? Curious how an ML model could write a poem or an op ed? Transformers can do it all. In this episode of Making with ML, Dale Markowitz explains what transformers are, how they work, and why they’re so impactful. Watch to learn how you can start using transformers in your app!
    Chapters:
    0:00 - Intro
    0:51 - What are transformers?
    3:18 - How do transformers work?
    7:41 - How are transformers used?
    8:35 - Getting started with transformers
    Watch more episodes of Making with Machine Learning → goo.gle/2YysJRY
    Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech
    #MakingwithMachineLearning #MakingwithML
    product: Cloud - General; fullname: Dale Markowitz; re_ty: Publish;
  • Science & Technology

Comments • 352

  • @Omikoshi78
    @Omikoshi78 1 year ago +73

    The ability to break down a complex topic is such an underrated superpower. Amazing job.

  • @ansumansamal3767
    @ansumansamal3767 2 years ago +218

    Where is optimus prime?

    • @alwaysabiggafish3305
      @alwaysabiggafish3305 1 year ago +13

      He's on the thumbnail...

    • @ankitnmnaik229
      @ankitnmnaik229 11 months ago +8

      He will be in theaters in June 9... Transformers : Rise of breasts..

    • @captainbob6680
      @captainbob6680 11 months ago +1

      😂😂😂😂

    • @yomajo
      @yomajo 11 months ago

      Where are robotaxis?

    • @yeoj_maximo1122
      @yeoj_maximo1122 11 months ago

      We got lied to

  • @rohanchess8332
    @rohanchess8332 10 months ago +47

    How did you condense so many pieces of information in such a short time? This video is on a next level, I loved it!

  • @robchr
    @robchr 2 years ago +222

    Transformers! More than meets the eye.

    • @suomynona7261
      @suomynona7261 1 year ago +3

      😂

    • @Marcoose81
      @Marcoose81 1 year ago +8

      Transformers! Robots in disguise!

    • @DomIstKrieg
      @DomIstKrieg 1 year ago +3

      Autobots wage their battle to fight the evil forces of the Decepticons!!!!!

    • @mieguishen
      @mieguishen 1 year ago +1

      Transformers! No money to buy…

    • @05012215
      @05012215 1 year ago

      Of course

  • @softcoda
    @softcoda 1 hour ago

    This has to be the best explanation so far, and by a very large margin.

  • @tongluo9860
    @tongluo9860 1 year ago +222

    Great explanation of the key concept of position encoding and self attention. Amazing you get the gist covered in less than 10 minutes.

    • @patpearce8221
      @patpearce8221 1 year ago +1

      @Dino Sauro tell me more...

    • @patpearce8221
      @patpearce8221 1 year ago

      @Dino Sauro thanks for the heads up

    • @an-dr6eu
      @an-dr6eu 1 year ago +3

      She has one of the wealthiest companies on earth providing her with resources: first-hand access to engineers, researchers, top-notch communicators and marketing employees.

    • @michaellavelle7354
      @michaellavelle7354 11 months ago +2

      @@an-dr6eu True, but this young lady talks a mile a minute from memory. She knows it cold regardless of the resources at Google.

  • @dylan_curious
    @dylan_curious 1 year ago +16

    This is such an informative video about transformers in machine learning! It's amazing how a type of neural network architecture can do so much, from translating text to generating computer code. I appreciate the clear explanations of the challenges with using recurrent neural networks for language analysis, and how transformers have overcome these limitations through innovations like positional encodings and self-attention. It's also fascinating to hear about BERT, a popular transformer-based model that has become a versatile tool for natural language processing in many different applications. The tips on where to find pretrained transformer models and the popular transformers Python library are super helpful for anyone looking to start using transformers in their own app. Thanks for sharing this video!

  • @dj67084
    @dj67084 1 year ago +9

    This is awesome. This has been one of the best overall breakdowns I've found. Thank you!!

  • @luis96xd
    @luis96xd 1 year ago +5

    Amazing video! Nice explanation and examples 😄👍
    I would like to see more videos like this, and practical ones.

  • @rajqsl5525
    @rajqsl5525 5 months ago +2

    You have the gift of making things simple to understand. Keep up the good work 🙏

  • @erikengheim1106
    @erikengheim1106 2 months ago +1

    Thanks, you did a great job. I spent some time already looking at different videos to capture the high-level idea of what transformers are about, and yours is the clearest explanation. I actually do have an educational background in neural networks but don't go around remembering every detail or the state of the art today, so somebody removing all the unnecessary technical details like you did here is very useful.

  • @maayansharon280
    @maayansharon280 1 year ago +22

    This is a GREAT explanation! please lower the background music next time it could really help. thanks again! awesome video

  • @PaperTools
    @PaperTools 1 year ago +27

    Dale you are so good at explaining this tech, thank you!

  • @trushatalati5596
    @trushatalati5596 2 years ago +7

    This is a really awesome video! Thank you so much for simplifying the concepts.

  • @noureldinosamas2978
    @noureldinosamas2978 1 year ago +166

    Amazing video! 🎉 You explained the difficult concepts of Transformers so clearly and made them easy to understand. Thanks for all your hard work!🙌👍

    • @pumbo_nv
      @pumbo_nv 10 months ago +4

      Are you serious? The concepts were not really explained. Just a summary of what they do but not how they work behind the scenes.

    • @axscs1178
      @axscs1178 4 months ago

      No.

  • @mfatal
    @mfatal 1 year ago +5

    Love the content and thanks for the great video! (one thing that might help is lower the background music a bit, I found myself stopping the video because I thought another app was playing music)

  • @Jewish5783
    @Jewish5783 1 year ago +1

    i really enjoyed the concepts you explained. simple to understand

  • @reddyvarinaresh7924
    @reddyvarinaresh7924 2 years ago +5

    I loved it; a very simple, clear explanation.

  • @bondsmagi
    @bondsmagi 2 years ago +68

    Love how you simplified it. Thank you

    • @luxraider5384
      @luxraider5384 1 year ago

      It's so simplified that you can't understand anything

  • @JayantKochhar
    @JayantKochhar 1 year ago

    Positional Encoding, Attention and Self Attention. That's it! Really well summarized.

  • @MaxKar97
    @MaxKar97 1 month ago

    Nice amount of info imparted in this video. Very clear info on what Transformers are and what made them so great.

  • @shravanacharya4376
    @shravanacharya4376 2 years ago +2

    So easy and clear to understand. Thanks

  • @TallesAiran
    @TallesAiran 1 year ago +6

    I love how you simplify something so complex. Thank you so much, Dale, the explanation was perfect.

    • @decepticon-barricade934
      @decepticon-barricade934 1 year ago

      how did you do that

    • @nahiyanalamgir7056
      @nahiyanalamgir7056 1 year ago

      @@decepticon-barricade934 This one? Just type ":" (colon) followed by "thanksdoc" and end it with another colon. I can add other emojis like 🤟too!

    • @decepticon-barricade934
      @decepticon-barricade934 1 year ago

      @@nahiyanalamgir7056 it needs desktop YouTube i think

    • @nahiyanalamgir7056
      @nahiyanalamgir7056 1 year ago

      @@decepticon-barricade934 Apparently, it does. When will these apps be consistent across devices and platforms?

    • @decepticon-barricade934
      @decepticon-barricade934 1 year ago +1

      @@nahiyanalamgir7056 thanks though

  • @labsanta
    @labsanta 1 year ago +48

    Takeaways:
    A transformer is a type of neural network architecture that is used in natural language processing. Unlike recurrent neural networks (RNNs), which analyze language by processing words one at a time in sequential order, transformers use a combination of positional encodings, attention, and self-attention to efficiently process and analyze large sequences of text.
    Neural networks, Convolutional neural networks (for image analysis), Recurrent neural networks (RNNs), Positional encodings, Attention, Self-attention
    Neural networks: A type of model used for analyzing complicated data, such as images, videos, audio, and text.
    Convolutional neural networks: A type of neural network designed for image analysis.
    Recurrent neural networks (RNNs): A type of neural network used for text analysis that processes words one at a time in sequential order.
    Positional encodings: A method of storing information about word order in the data itself, rather than in the structure of the network.
    Attention: A mechanism used in neural networks to selectively focus on parts of the input.
    Self-attention: A type of attention mechanism that allows the network to focus on different parts of the input simultaneously.
    Neural networks are like a computerized version of a human brain, that uses algorithms to analyze complex data.
    Convolutional neural networks are used for tasks like identifying objects in photos, similar to how a human brain processes vision.
    Recurrent neural networks are used for text analysis, and are like a machine trying to understand the meaning of a sentence in the same order as a human would.
    Positional encodings are like adding a number to each word in a sentence to remember its order, like indexing a book.
    Attention is like a spotlight that focuses on specific parts of the input, like a person paying attention to certain details in a conversation.
    Self-attention is like being able to pay attention to multiple parts of the input at the same time, like listening to multiple conversations at once.
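
The positional-encoding takeaway above can be made concrete with a short sketch. The video describes tagging each word with its position number; the version below instead uses the sine/cosine scheme from the original Transformer paper, which is an assumption about what a reader would want to see rather than something the video shows. The function names `positional_encoding` and `add_positions` are hypothetical.

```python
import math


def positional_encoding(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal position codes."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one wavelength: 10000^(2k / d_model).
            angle = pos / (10000 ** ((i - i % 2) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_positions(embeddings):
    """Add a position code to each word vector so word order lives in the data."""
    codes = positional_encoding(len(embeddings), len(embeddings[0]))
    return [[e + c for e, c in zip(vec, code)]
            for vec, code in zip(embeddings, codes)]


# Two identical word vectors become distinguishable once positions are added.
same_word_twice = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
for vector in add_positions(same_word_twice):
    print([round(x, 3) for x in vector])
```

Because the order information is added to the vectors themselves, the network can process all words in parallel instead of one at a time, which is the contrast with RNNs drawn in the takeaways.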

    • @an-dr6eu
      @an-dr6eu 1 year ago

      Great, you learned how to copy paste

    • @yumyum_99
      @yumyum_99 1 year ago +10

      @@an-dr6eu first step on becoming a programmer

    • @JohnCorrUK
      @JohnCorrUK 1 year ago +3

      @@an-dr6eu your comment comes over somewhat 'catty' 😢

  • @SeanTechStories
    @SeanTechStories 1 year ago +1

    That's a really good high-level explanation!

  • @rembautimes8808
    @rembautimes8808 3 months ago

    This is a very well produced video. Credits to the presenter and those involved in production with the graphics

  • @barbara1943
    @barbara1943 4 months ago

    Very interesting, informative, this added perspective to a hyped-up landscape. I'll admit, I'm new to this, but when I hear "pretrained transformer" I didn't even think about BERT. I appreciate getting the view from 10,000 feet.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago +4

    Wow, this is so well explained.

  • @CarlosRodriguez-mv8qi
    @CarlosRodriguez-mv8qi 1 year ago +4

    Charm, intelligence and clarity! Thanks!

  • @bingochipspass08
    @bingochipspass08 2 years ago

    Very well explained.. This really is a high level view of what Transformers are, but it's probably enough to just get your toes wet in the field!

  • @walterppk1989
    @walterppk1989 2 years ago +21

    Hi Google! First of all, thank you for this wonderful video. I'm working on a multiclass (single-label) supervised learning task that uses BERT for transfer learning. I've got about 10 classes and a couple hundred thousand examples. Any tips on best practices (which BERT variants to use, what order of magnitude of dropout to use, if any)? I know I could do a hyperparameter search but that'd probably cost more time and money than I'm comfortable with (for a prototype), so I'm looking to make the most out of my local Nvidia 3080.

  • @touchwithbabu
    @touchwithbabu 1 year ago

    Fantastic! Thanks for simplifying the concept

  • @todayu
    @todayu 1 year ago +1

    This was a really, really awesome breakdown 👏🏾

  • @Daniel-iy1ed
    @Daniel-iy1ed 1 year ago

    Thank you so much. I really needed this video, other videos were just confusing

  • @junepark1003
    @junepark1003 5 months ago

    This is one of the best vids I've watched on this topic!

  • @akashrawat217
    @akashrawat217 1 year ago

    Such a simple yet revolutionary 💡idea

  • @EranM
    @EranM 1 year ago +4

    I knew little on transformers before this video. I know little on transformers after this video. But I guess in order to know some, we'll need a 2-3 hours video.

  • @hallucinogen22
    @hallucinogen22 3 months ago

    thank you! I'm just starting to learn about gpt and this was quite helpful, though I will have to watch it again :)

  • @JohnCorrUK
    @JohnCorrUK 1 year ago +1

    Excellent presentation and explanation of concepts

  • @rodeoswing
    @rodeoswing 6 months ago +1

    Great video for people who are curious but don’t really want to (or can’t) understand how transformers actually work.

  • @danielchen2616
    @danielchen2616 1 year ago

    Thanks for your hard work. This video is very helpful!!!

  • @DeanRGAnderson
    @DeanRGAnderson 1 year ago +1

    This is an excellent video introduction for transformers.

  • @sorbethyena3828
    @sorbethyena3828 2 years ago +2

    Informative! Thank you

  • @sun-ship
    @sun-ship 2 months ago

    Easiest-to-understand explanation I've heard so far

  • @josedamiansanchez9874
    @josedamiansanchez9874 1 year ago

    Amazing explanation!

  • @harshadfx
    @harshadfx 9 months ago +1

    I have more respect for Google after watching this video. Not only did they provide their engineers with the funding to do research, but they also let other companies like OpenAI use said research. And they are opening up the knowledge to the general public with this video series.

  • @jsu12326
    @jsu12326 2 months ago

    wow, what a great summary! thanks!!!

  • @RobShuttleworth
    @RobShuttleworth 2 years ago +9

    The visuals are very helpful. Thanks.

  • @mohankiranp
    @mohankiranp 7 months ago

    Very well explained. This video is a must-watch for anyone who wants to demystify the latest LLM technology. I wonder if this could be made into a more generic video with a quick high-level intro on neural networks for those who aren't in the field. I bet there are millions out there who want to get a basic understanding of how ChatGPT/Bard/Claude work without an in-depth technical deep dive.

  • @bobdillan5761
    @bobdillan5761 1 year ago +1

    super well done. Thanks for this!

  • @ganbade200
    @ganbade200 2 years ago +6

    You have no idea how much time I potentially have saved just by reading your blog and watching this video to get me up to speed quickly on this. "Liked" this video. Thanks

  • @NicolasHart
    @NicolasHart 4 months ago

    so super helpful for my thesis, thank u

  • @shailendraburman
    @shailendraburman 2 years ago +1

    Simply loved it!

  • @Mariouigi
    @Mariouigi 1 year ago

    crazy how things have changed so much

  • @xiongjiedai8405
    @xiongjiedai8405 1 year ago

    Very good lecture, thanks!

  • @myt97
    @myt97 1 year ago

    Great video. Thank you!

  • @takeizy
    @takeizy 1 year ago

    Very impressive video. Thanks for the way you shared information via this video.
    Reference your video timeline 05:05, how you created such a video, please.

  • @theguythatcoment
    @theguythatcoment 1 year ago +2

    do transformers learn the internal representation one language at a time or all of them at the same time? I remember that Chomsky said that there's no underlying structure to language and that for every rule you try to make you'll always find an edge case that contradicts the rule.

  • @ZeeshanAli-ck3ue
    @ZeeshanAli-ck3ue 1 year ago

    very well explained.👍

  • @ayo4757
    @ayo4757 1 year ago +1

    Soo cool! Great work

  • @robertabitbol6454
    @robertabitbol6454 1 year ago +1

    You have actually given the BEST explanation of Neural Machine Translation that I have read so far, but you are missing a few elements.

    • @robertabitbol6454
      @robertabitbol6454 1 year ago +1

      But your explanations, your analyses and your delivery are excellent. You're definitely a great communicator and teacher.

    • @robertabitbol6454
      @robertabitbol6454 1 year ago

      Actually Google and others have an algo they're not interested in sharing and I pretty much know what it is. I am working with my programmer on the coding of my new app, the revolutionary Universal Sentence builder and the Universal Dictionary and I keep adding and changing stuff to simplify the concept and I push at a later date the programming of my Sentence Analyser app. It is like most of my apps a simple (and brilliant concept) coded with very few lines of code.

    • @robertabitbol6454
      @robertabitbol6454 1 year ago

      You know, Alfred Hitchcock always adapted his scenario to the screen without changing anything, not even a comma, while Francis Ford Coppola (The Godfather) did the opposite: they say that his script was like a newspaper that had new contents every day. Well, I am more like Coppola with my apps. I change stuff all the time and I usually make my programmers go crazy. It's a good sign. :-) Mind you, I don't know if one can do like Hitchcock with an app and come up with a definitive version once and for all. This would be quite an achievement!

    • @robertabitbol6454
      @robertabitbol6454 1 year ago

      In the case of my Universal Sentence builder, the main task was to process the data entered by the user, and we've been at it since July 2022. :-) Either I am dumb or it is a complex task. Actually it is the latter, for I started with French, this language being the most complex in the world. The good news is I am sure I will be imitated, but you can rest assured that my imitators will also have a jolly hard time with French :-)

  • @gammacubed
    @gammacubed 4 months ago

    Amazing video, thank you so much!

  • @JG27Korny
    @JG27Korny 5 months ago

    Very informative video. Thank you!

  • @arpitrawat1203
    @arpitrawat1203 2 years ago +6

    Very well explained. Thank you.

  • @zacharythomas5046
    @zacharythomas5046 1 year ago

    Thanks! This is a great intro video!

  • @WalterReade
    @WalterReade 2 years ago +4

    Nicely done. Very helpful. Thanks!

  • @anshulchaurasia8762
    @anshulchaurasia8762 1 year ago

    Simplest Explanation ever

  • @VaibhavPatil-rx7pc
    @VaibhavPatil-rx7pc 1 year ago

    The most excellent explanation I have ever seen; recommending this link to everyone

  • @janeerin6918
    @janeerin6918 7 months ago +1

    OMG the BEST transformers video EVER!

  • @massimobuonaiuto8753
    @massimobuonaiuto8753 1 year ago

    great video, thanks!

  • @shivangsharma599
    @shivangsharma599 1 year ago

    Super Explanation!!

  • @maxkhan4485
    @maxkhan4485 1 year ago

    Thanks! Great video.

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 2 years ago +2

    Great video.

  • @AleksandarKamburov
    @AleksandarKamburov 1 year ago

    Positional encoding = time, attention = context, self attention = thumbprint (knowledge)... looks like a good start for AGI 😀

  • @RonaldMorrissetteJr
    @RonaldMorrissetteJr 1 year ago +1

    When I saw this title, I was hoping to better understand the mathematical workings of transformers such as matrices and the like. Maybe you could do a follow-up video explaining mathematically how transformers work.
    thank you for your time
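
For readers who, like this commenter, want the matrix view: self-attention is commonly written as softmax(Q K^T / sqrt(d_k)) V, where Q, K and V are the input word vectors multiplied by three learned weight matrices. The following is a minimal pure-Python sketch of that formula under stated assumptions: a single attention head, no masking, and hand-picked weights instead of trained ones. `matmul`, `softmax`, and `self_attention` are illustrative names, not any library's API.

```python
import math


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the rows of X."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d_k = len(K[0])
    weights = []
    for q in Q:
        # Score this word's query against every word's key, then normalise.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights.append(softmax(scores))
    # Each output row is a weighted mix of all value vectors.
    return matmul(weights, V), weights


# Toy input: two 2-dimensional "word" vectors and identity weight matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(X, identity, identity, identity)
print(weights)
```

Row i of `weights` says how strongly word i attends to every word in the sentence, itself included. In a trained model the three weight matrices are learned from data, which is where the heat-map-style attention pictures in the video come from.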

  • @Christakxst
    @Christakxst 1 year ago

    Thanks, that was very interesting

  • @ludologian
    @ludologian 1 year ago

    When I was a kid, I knew the trouble with translation was due to translating words literally, without contextual or sequential awareness. I knew it was important to distinguish between synonyms. I imagined a button that generates the translation output; then you can highlight the words that don't make sense or that you want improved, and regenerate the translation. This type of NLP probably existed before I programmed my first hello world (15+ years ago)!

  • @amimegh
    @amimegh 1 year ago

    NICE SUPERB PRESENTATION

  • @Maisonier
    @Maisonier 1 year ago

    Amazing video, thank you ... can you use transformers to detect patterns in random data that is supposedly unpredictable, like weather or stocks?

    • @Happypast
      @Happypast 1 year ago

      the unpredictability of stuff like weather and stocks has to do with the fundamental underlying nature of those phenomena so I would bet no.

  • @KulbirAhluwalia
    @KulbirAhluwalia 1 year ago +3

    From 5:28, shouldn't it be the following:
    "when the model outputs the word “économique,” it’s attending heavily to both the input words “European” and “Economic.” "?
    For européenne, I see that it is attending only to European. Please let me know if I am missing something here. Thanks for the great video.

  • @wiclcoocoo
    @wiclcoocoo 1 month ago

    a very nice video. thanks

  • @gerardovalencia805
    @gerardovalencia805 2 years ago +2

    Thank you

  • @MichaelToop
    @MichaelToop 1 year ago

    Great video. Thx.

  • @probablygrady
    @probablygrady 11 months ago

    phenomenal video

  • @EduardoOviedoBlanco
    @EduardoOviedoBlanco 1 year ago

    Great content 👍

  • @k-c
    @k-c 1 year ago +1

    This is probably the first time after the 90's I have the same "internet wild west" kinda feeling. The genie is out of the bottle baby.

  • @fenarRH
    @fenarRH 4 days ago +1

    I wish they didn't put music in the background; it makes it harder to follow the conversation.

  • @badrinair
    @badrinair 1 year ago

    Thank you for sharing

  • @JorgetePanete
    @JorgetePanete 1 year ago

    Pretty nice, is there any automatic way of cleaning up data with errors such as a mislabel, or a grammar error?

  • @younessnaim1849
    @younessnaim1849 1 year ago

    Beyond the great content and delivery, I loved your French accent ... ;)

  • @johnbarbuto5387
    @johnbarbuto5387 1 year ago +3

    An excellent video. I wonder if you can comment on "living the life" of a transformers user. For example, in another video by another YouTuber I heard the sentiment that being an AI person in this era means constant - really constant - study. That may not be the lifestyle that everybody wants to adopt. I'm a retired neurologist and vice president of the faculty club at my state university. What interests me these days is how students "should" be educated in this era. And, at the end of the day, one of the critical aspects of that is matching individual human brains - with their individual proclivities - with the endless career opportunities of this era. So, I'm trying to gather perspectives (aka "data") on that topic. Maybe you could make some kind of video about it. Please do!

    • @LimabeanStudios
      @LimabeanStudios 1 year ago

      I think the most important thing is that students are simply encouraged to use these tools. It's pretty hard to get a realistic grasp of the capabilities without really pushing the systems. The idea about needing to do constant research is interesting, and I think it's something that a person CAN do (the rest of my life probably lmao) but I think simply adopting the tools is all that will effectively matter. It's too early to be much more specific sadly. When it comes to younger education then we definitely need to be putting more focus on skills and behaviors instead of knowledge.

  • @IceMetalPunk
    @IceMetalPunk 2 years ago +16

    The invention of transformers seems to have jump-started a revolutionary acceleration in machine learning! Between the models you mentioned here, plus the way transformers are combined with other network architectures in DALL-E 2, OpenAI Jukebox, PaLM, Chinchilla/Flamingo, Gato -- it seems like adding a transformer to any model produces bleeding-edge, state-of-the-art-or-better performance on basically any tasks.
    Barring any major architecture innovations in the future, I wonder if transformers end up being the key we need to reach human levels of broad-range performance after all 🤔

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +2

      @Dino Sauro They're certainly not dead, since they're still being incorporated into the bleeding edge AIs. But technology is always evolving, building upon one idea to create the next. If you're hoping for a "final architecture" that will be the best and never replaced by anything else, you're out of luck.
      While I respect Professor Marcus, his ideas about the requirements for AGI strongly imply that intelligent design is required for true intelligence to emerge, and I think evolution contradicts that view.

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +1

      @Dino Sauro Um... Okay, friend, whatever you say. Have a nice life.

    • @tanweeralam1650
      @tanweeralam1650 1 year ago

      I think you are right...we just saw its use in ChatGPT...and I think ChatGPT is just a glimpse of what future holds and how it will affect the IT, EV and Industrial Automation Industry.
      Am I right? You wanna add something to it?

    • @IceMetalPunk
      @IceMetalPunk 1 year ago +1

      @@tanweeralam1650 I agree. ChatGPT, though, is really just GPT-3 with a larger input layer, and human-guided reinforcement learning on top of it. Which is a step in the right direction for sure, but not as huge a development as a lot of people are touting it to be.
      From what I can tell, there are three issues that need to be solved before transformer-based (or transformer-incorporating) AIs can reach truly human levels of intelligent behavior.
      (1) They need to be bigger. If we think of the model parameter size as analogous to brain synapses, there are about a quadrillion synapses in a human brain, which is orders of magnitude more than the biggest current transformers. For instance, the largest single transformer model is 207 billion parameters, and the largest transformer-incorporating language model is 1.75 trillion parameters. On the other hand, such models don't need to allocate parameters for things like body maintenance, reproduction, etc., so it's not a 1-to-1 correspondence, but I think it's a good estimate for the order of magnitude we need to reach before we get to human levels of sapience. That said, models keep getting bigger, so I have no doubt we'll achieve this within the next decade at most.
      (2) Multimodality is important. A lot of "common sense" understanding that AIs seem to lack can likely be attributed to their lack of variety in types of input they can learn from. If you only learn from text, it's a lot harder to learn what the described concepts actually *mean.* On the other hand, a model that can learn from text, images, video, audio, and other forms of data should be able to learn much more accurate representations of the world. And of course, there's a TON of research into multimodal learning right now, so we'll get there pretty soon, too, I think.
      (3) The third obstacle I think is the hardest: continual learning. (From what I can tell, by the way, "continual learning" is synonymous with "incremental online learning". Let me know if there are any important differences between the two.) An AI without this can learn from a *ton* of data, but once it does, it stops learning and everything it knows is set in stone. In effect, this means every interaction with such an AI "resets" it, and so you might get inconsistent behaviors as slightly different initial conditions of an interaction can lead to very different outputs when previous similar interactions are not incorporated into the model's weights (which, in this context, can be thought of as its "long term memory"). This also means the AIs can't form consistent opinions, since any opinion they might espouse in one conversation is immediately forgotten for the next.
      Continual learning techniques already exist for smaller networks, but they are not at all efficient enough to practically apply to these very large language models of many billions of parameters or more. Which is a shame, because I'd speculate that larger models would be less prone to retroactive interference -- "catastrophic forgetting" -- than smaller ones, if we could efficiently incrementally train them.
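
The scale comparison in point (1) above is simple arithmetic and can be checked directly. The sketch below takes the commenter's figures at face value (about 10^15 synapses, 207 billion and 1.75 trillion parameters); none of them are verified here, and the constant and function names are made up for illustration.

```python
import math

# Figures as quoted in the comment above; not independently verified.
HUMAN_SYNAPSES = 1e15              # "about a quadrillion synapses"
LARGEST_SINGLE_MODEL = 207e9       # "207 billion parameters"
LARGEST_LANGUAGE_MODEL = 1.75e12   # "1.75 trillion parameters"


def scale_gap(synapses, parameters):
    """Return (ratio, orders of magnitude) between the two counts."""
    ratio = synapses / parameters
    return ratio, math.log10(ratio)


for name, params in [("single transformer", LARGEST_SINGLE_MODEL),
                     ("language model", LARGEST_LANGUAGE_MODEL)]:
    ratio, orders = scale_gap(HUMAN_SYNAPSES, params)
    print(f"{name}: {ratio:,.0f}x fewer parameters than synapses "
          f"(about {orders:.1f} orders of magnitude)")
```

On these numbers the gap is roughly 4,800x for the single model and roughly 570x for the larger language model, so "orders of magnitude" holds in both cases, with the caveat from the comment itself that parameters and synapses are not a one-to-one correspondence.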

    • @tanweeralam1650
      @tanweeralam1650 1 year ago

      @@IceMetalPunk I did understand your first 2 points and agree with it...but I want to slightly differ with your 3rd point.
      I dont understand...Why would the AI would stop learning?? Due to its storage space, Processing power exhaustion or for what reason? What you said may be a POSSIBILITY...But its others side also exists...it may just continue learning more n more and make it's system better.
      To have Human like Intelligence...I dont think it will achieve that in next 30-40 yrs...far from those timeline...I can't say. And frankly there is NO NEED to have AIs so Advanced. Upto a certain extent...AIs should develop and Humans MUST BE able to control them. Always.
      And can you say will Programs like ChatGPT ( i mean its advanced form) able to replace search Engine like Google in future?? Also how AI/ML will affect IT industry as a whole and also EV, Industrial Automation industry (e.g.- the industry where companies like Siemens, Honeywell operate)??

  • @maxwellsdaemon7
    @maxwellsdaemon7 2 years ago

    Nice explanation. But at 4:49, did she say that "in the French translation, European comes before economic."?

  • @JosephHenzi
    @JosephHenzi 2 years ago +2

    I'll jump on where others are doing the same - would love advice for someone who understands half the concepts that are alluded to as complex naturally and the innovation feels obvious I'm unsure how to break into the space without some guidance or connection between having exactly that great natural grasp but wildly anxious that language and logic are strengths and math is a mental turn off. For someone needing that type of translation/guide where my approach is language usage & finer cues what is the key terms to get to that understanding? Hate being fascinated and all the tools to play in this space and being unable to start because how I approach topics so welcome any advice.

    • @meepk633
      @meepk633 1 year ago

      Just go to school.

  • @TechNewsReviews
    @TechNewsReviews 8 months ago

    woww, she's good at explaining things

  • @hom01
    @hom01 1 year ago

    this is brilliant

  • @GubeTube19
    @GubeTube19 1 year ago

    10/10. Very helpful

  • @aGj2fiebP3ekso7wQpnd1Lhd
    @aGj2fiebP3ekso7wQpnd1Lhd 1 year ago

    Fantastic video

  • @jasonlough6640
    @jasonlough6640 1 month ago

    So, question: given the goal of understanding meaning within language regardless of language, could a sophisticated enough set of weights derived from a sufficiently large dataset represent essentially the human genome of language?

  • @softcoda
    @softcoda 1 year ago

    Wowww….thanks for clarifying my confusion.