All about AI Accelerators: GPU, TPU, Dataflow, Near-Memory, Optical, Neuromorphic & more (w/ Author)

  • Published on 25 Sep 2024

Comments • 76

  • @YannicKilcher
    @YannicKilcher  2 years ago +14

    OUTLINE:
    0:00 - Intro
    5:10 - What does it mean to make hardware for AI?
    8:20 - Why were GPUs so successful?
    16:25 - What is "dark silicon"?
    20:00 - Beyond GPUs: How can we get even faster AI compute?
    28:00 - A look at today's accelerator landscape
    30:00 - Systolic Arrays and VLIW
    35:30 - Reconfigurable dataflow hardware
    40:50 - The failure of Wave Computing
    42:30 - What is near-memory compute?
    46:50 - Optical and Neuromorphic Computing
    49:50 - Hardware as enabler and limiter
    55:20 - Everything old is new again
    1:00:00 - Where to go to dive deeper?
    Read the full blog series here:
    Part I: medium.com/@adi.fu7/ai-accelerators-part-i-intro-822c2cdb4ca4
    Part II: medium.com/@adi.fu7/ai-accelerators-part-ii-transistors-and-pizza-or-why-do-we-need-accelerators-75738642fdaa
    Part III: medium.com/@adi.fu7/ai-accelerators-part-iii-architectural-foundations-3f1f73d61f1f
    Part IV: medium.com/@adi.fu7/ai-accelerators-part-iv-the-very-rich-landscape-17481be80917
    Part V: medium.com/@adi.fu7/ai-accelerators-part-v-final-thoughts-94eae9dbfafb

    • @Stopinvadingmyhardware
      @Stopinvadingmyhardware 2 years ago

      You’re ignorant

    • @cedricvillani8502
      @cedricvillani8502 2 years ago

      Somewhere on an alternate timeline in an alternate universe, a butterfly is watching Chaos TV. It's watching a comedy special about a weird alien species that does everything it can to destroy itself: first by exploding the many so a few can have better, more interesting lives, with the ability to take back choices because they're not held accountable, nor do they have to deal with the annoying feelings of guilt and empathy, lulz. Then, when things are going really bad, they toss in a few more stones, and the unaware many try their hardest to find a way to replace themselves as fast as possible and call it progress 😂 The first season is over, and the butterfly realizes that it has just wasted away an entire day, starts to vibrate its wings AND THEN EXPLODES.😢😮
      The End…

    • @ramvirkumae975
      @ramvirkumae975 2 years ago

      Ha

  • @Stwinky
    @Stwinky 2 years ago +11

    Banger video, fellas. One time I told my mom via text that I purchased a GPU, and when I called her later she kept trying to pronounce “GPU”, but not as an acronym. Her best attempt was “guppy-ooh”

    • @NoNameAtAll2
      @NoNameAtAll2 2 years ago +1

      g'poo/g'pew?

    • @truehighs7845
      @truehighs7845 7 months ago

      Well nowadays you can have the AI squeal with diesel.

  • @vzxvzvcxasd7109
    @vzxvzvcxasd7109 2 years ago +33

    In truth, my profession has nothing to do with computers, but I learned everything about ML from the sheer number of videos I watched on this channel, to the point that I understand most of the videos that come out now.
    I started with Attention Is All You Need. I like it whenever you draw annotations on flow charts, because it makes it so much easier to follow what a paper is trying to do.
    With your interviews with paper authors, I think it would be more insightful if you explained the paper first, and the interviewee got to see your explainer before being interviewed, almost like the peer-review process. Then they would be able to say whether they agree with your interpretation, or expand on the things they felt had potential.
    This video was really nice; I got to understand the bigger picture of how the system works.

    • @sebastianreyes8025
      @sebastianreyes8025 2 years ago +3

      What's your profession? How did you gain interest in this? Is there a connection between what you do and these topics?

  • @johnshaff
    @johnshaff 1 year ago +1

    Yannic, thanks for this guest. Please continue identifying the core and leading-edge components of technology and finding guests to explain them. Much better than channels that focus on the surface-level things everyone else is talking about

  • @billykotsos4642
    @billykotsos4642 2 years ago +7

    Oh come on!
    I'VE GOT so much stuff on my plate!!
    Oh dear!
    But I will watch it! For sure!

  • @javiergonzalezsanchez6587
    @javiergonzalezsanchez6587 2 years ago +4

    Thank you for putting the time and energy into this interview. It was exactly what I needed.

  • @khmf1
    @khmf1 10 months ago +1

    11:22 I am very glad you guys got the history right, well done. I really appreciate hearing from someone who, like me, lived through those phases of technology. He is right!

  • @asnaeb2
    @asnaeb2 2 years ago +27

    My experience with any ML accelerator other than GPUs is that my code won't run because my model isn't 6 years old and the hardware doesn't support the new functions.

    • @flightrisk7566
      @flightrisk7566 2 years ago +3

      Do you mind if I ask which models? TPUs always seem to work for me no matter what kinds of workloads I've come up with, and I want to use them for a Kaggle competition, but I feel as though, if there are going to be compatibility issues, I should investigate whether they impact my use case and make sure I strategize around that ahead of time.

    • @asnaeb2
      @asnaeb2 2 years ago

      @@flightrisk7566 What does? Detectron2, Clova's OCR model and many many more. Nothing new has ever worked for me.

    • @chanpreetsingh007
      @chanpreetsingh007 2 years ago

      So true.

    • @ravindradas9135
      @ravindradas9135 2 years ago

      Utasvkumar shigh rajàpu

    • @ravindradas9135
      @ravindradas9135 2 years ago

      Utsav Kumar Singh Rajput

  • @nicohambauer
    @nicohambauer 2 years ago +15

    I love your content, but to this day I am still very confused/wondering: why do you have a green screen when you generally don't color-key/remove it?

    • @siegfriedkettlitz6529
      @siegfriedkettlitz6529 2 years ago +4

      Because he does whatever pleases him.

    • @YannicKilcher
      @YannicKilcher  2 years ago +20

      I'm hipster like that 😅

    • @nicohambauer
      @nicohambauer 2 years ago

      @@YannicKilcher 😁😁🙌🏼

    • @b0nce
      @b0nce 2 years ago +5

      Maybe he actually has a blue screen, which is keyed to green ¯\_(ツ)_/¯

  • @TheEbbemonster
    @TheEbbemonster 2 years ago +2

    Great video! I will read the blog for sure; this guy is a good and clear communicator ❤️

  • @BlackHermit
    @BlackHermit 5 months ago +1

    He's such a cool guy, and he works at Speedata!

  • @Coolguydudeness1234
    @Coolguydudeness1234 2 years ago +8

    This is awesome, thanks!

  • @parsabsh
    @parsabsh 1 year ago

    Such a great talk! I think it's an amazing and helpful introduction to AI acceleration for anyone interested in the topic (as it was for me). Thanks for sharing this information!

  • @catalinavram3187
    @catalinavram3187 2 years ago +2

    This is such a great interview!

  • @silberlinie
    @silberlinie 2 years ago +1

    38:20: very good. You can compare this very nicely to what Stephen Wolfram is doing with his whole Mathematica project: he is taking the focus away from the traditional teaching of mathematics through individual computational tasks, toward the functional description of the mathematical problem at hand.

  • @karolkornik
    @karolkornik 2 years ago +1

    Yannic! You are nailing it! I love it 😍

  • @100SmokingElephants
    @100SmokingElephants 2 years ago

    Thank you for taking up this topic.

  • @alan2here
    @alan2here 2 years ago +2

    perceptron, recurrence, and memory cells trained on temporal information is all you need

  • @khmf1
    @khmf1 10 months ago

    11:39 Pentium 4 wasn't out until the 2000s. You are right. I still have one.

  • @spaghettihair
    @spaghettihair 2 years ago +2

    51:57 Check out Graphcore. They've made the bet that graph NNs are the future and are developing hardware to support them.

  • @EmilMikulic
    @EmilMikulic 2 years ago +1

    Reciprocal sqrt is useful for normalizing vectors, because e.g. three multiplies (x,y,z) are much faster than three divides. :)
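
    To make that concrete, here is a minimal sketch in C (the vec3 type and function name are just for illustration): computing the reciprocal square root of the squared length once turns the three divides into three multiplies.

    ```c
    #include <math.h>

    /* Hypothetical 3D vector type, just for illustration. */
    typedef struct { float x, y, z; } vec3;

    /* Normalize v with one reciprocal square root: a single divide (or one
       hardware rsqrt instruction) plus three multiplies, instead of one
       sqrt followed by three divides. */
    vec3 normalize(vec3 v) {
        float inv_len = 1.0f / sqrtf(v.x * v.x + v.y * v.y + v.z * v.z);
        vec3 out = { v.x * inv_len, v.y * inv_len, v.z * inv_len };
        return out;
    }
    ```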

  • @sucim
    @sucim 6 months ago

    Nice, he already talked about Groq 2 years ago!

  • @jasdeepsingh9774
    @jasdeepsingh9774 2 years ago

    Thanks for the video, love the content. I would appreciate more future videos explaining content like this.

  • @fabianaltendorfer11
    @fabianaltendorfer11 1 year ago

    Very interesting! One question: the architectures of AI accelerators like TPUs etc. are still based on GPUs, right?

  •  2 years ago

    Amazing video. Thank you very much

  • @GeoffLadwig
    @GeoffLadwig 1 year ago

    Loved this. Thanks

  • @NoNameAtAll2
    @NoNameAtAll2 2 years ago

    are you playing lichess in the background or smth?

  • @djfl58mdlwqlf
    @djfl58mdlwqlf 2 years ago

    As always, thanks for ur vid

  • @souljaaking94
    @souljaaking94 2 years ago

    Thanks a lot man!

  • @silberlinie
    @silberlinie 2 years ago

    37:00: for example, universities in the Netherlands are leading in photonic computing, especially the University of Ghent.

    • @lucasbeyer2985
      @lucasbeyer2985 2 years ago +1

      Duuuude, Ghent is not in the Netherlands, but in the better version of the Netherlands.

    • @silberlinie
      @silberlinie 2 years ago +2

      @@lucasbeyer2985 I am soo sorry.
      How can I say such a thing?
      We are in Flanders, Belgium, of course.

  • @Boersenwunder-
    @Boersenwunder- 1 year ago

    Which stocks are benefiting (besides Nvidia)?

  • @Clancydaenlightened
    @Clancydaenlightened 1 year ago

    You could make a chip that processes data like a GPU but has a more complex instruction set and opcodes like a CPU, and add pipelining.

  • @alan2here
    @alan2here 2 years ago

    What are the most ghosty instructions? Let's think up uses :)
    Q: So how long does all that take?
    A: 1 clock cycle

  • @TheReferrer72
    @TheReferrer72 2 years ago

    It seems to me that we are in the vacuum-tube stage of machine learning.
    My bet is that we will be using an analog machine, probably light-based, for machine learning.

  • @anotherplatypus
    @anotherplatypus 10 months ago

    Where'd you find this genius? I kinda just wanna hear you bring up a topic, and then nudge him back on course when he absent mindedly starts rambling.... like about anything... if you have any TH-camr friends of your channel, I'd love to hear him to keep going on-and-on if he had a wrangler to keep him on topic....

  • @Jacob011
    @Jacob011 2 years ago

    very very useful content

  • @cmnhl1329
    @cmnhl1329 2 years ago

    Brainchip Akida: “Am I dead to you?”

  • @martinlindsey9693
    @martinlindsey9693 2 years ago +2

    Take your sunglasses off.

  • @alexandrsoldiernetizen162
    @alexandrsoldiernetizen162 2 years ago +2

    49:00 Neuromorphic computing isn't 'similar in theory' to the spiking neural net model; it is in fact based on, and relies on, the spiking neural net model.

    • @judgeomega
      @judgeomega 2 years ago +4

      So it's NOT similar in theory, it's REALLY just similar in theory. Thanks for clearing that up.

  • @varunsai9736
    @varunsai9736 2 years ago

    Nice one

  • @randywelt8210
    @randywelt8210 2 years ago +2

    Man, I miss an explanation from the FPGA point of view: flip-flops for sync, yes/no, etc.

  • @sandeepreddy8567
    @sandeepreddy8567 2 years ago

    Hi Yannic, you could get Lightmatter CEO Nick Harris on your channel. He is a cool and intelligent guy.

  • @hengzhou4566
    @hengzhou4566 11 months ago

    DPU?

    • @truehighs7845
      @truehighs7845 7 months ago

      Data Processing Unit, don't be a swine.

  • @MASTER-qc3ei
    @MASTER-qc3ei 2 years ago +1

    WOW

  • @alexandrsoldiernetizen162
    @alexandrsoldiernetizen162 2 years ago

    Increasing core count means eventually bumping up against Amdahl's Law and data locality.
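
    As a quick sketch of that limit: Amdahl's Law says that if a fraction p of a program parallelizes across N cores, the speedup is 1 / ((1 - p) + p/N), so the serial remainder eventually dominates. A small illustrative C program (the numbers are just an example):

    ```c
    #include <stdio.h>

    /* Amdahl's Law: speedup(N) = 1 / ((1 - p) + p / N), where p is the
       parallelizable fraction. As N grows, the serial fraction (1 - p)
       dominates, capping speedup at 1 / (1 - p). */
    double amdahl_speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    int main(void) {
        /* Even a 95%-parallel program can never exceed a 20x speedup. */
        for (int n = 1; n <= 4096; n *= 8)
            printf("N = %4d  ->  speedup = %5.2f\n", n, amdahl_speedup(0.95, n));
        return 0;
    }
    ```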

  • @tarcielamandac8092
    @tarcielamandac8092 2 years ago

    Hello, good morning from the Philippines

  • @lucyfrye5365
    @lucyfrye5365 2 years ago

    You use the reciprocal square root for normalizing vectors (basically dividing by the Pythagorean length). It is used so much in graphics that the makers of Quake 3 famously used a trick to speed it up a bit. It's outdated now, but it's really not exotic in game development at all.

    • @Wobbothe3rd
      @Wobbothe3rd 1 year ago

      I love Quake 3 with all my heart, but the fast inverse square root algorithm is much, much older than Q3.
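
      For reference, a sketch of the classic trick being discussed, in modern C. memcpy replaces the original's pointer-cast type punning (undefined behavior in today's C); the magic constant is the widely documented one.

      ```c
      #include <stdint.h>
      #include <string.h>

      /* The "fast inverse square root" popularized by Quake III (though,
         as noted above, the technique itself predates the game). */
      float fast_rsqrt(float x) {
          float half = 0.5f * x;
          uint32_t i;
          memcpy(&i, &x, sizeof i);      /* reinterpret the float's bits */
          i = 0x5f3759df - (i >> 1);     /* magic initial guess for 1/sqrt(x) */
          memcpy(&x, &i, sizeof x);
          return x * (1.5f - half * x * x);  /* one Newton-Raphson step */
      }
      ```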

  • @喵星人-f4m
    @喵星人-f4m 2 years ago

    He is wrong about in-memory computing and near-memory computing...

  • @liweipeng9106
    @liweipeng9106 2 years ago

    A self-learner here. But generally speaking, Asian students are more confused by those abstract concepts than their Western friends.

  • @tonysu8860
    @tonysu8860 1 year ago

    No!
    AI accelerator nodes have nothing to do with running an AI application!
    AI accelerator nodes are used when creating the NN algorithm. Creating a NN algorithm from enormous amounts of data requires enormous amounts of computing power. When the AlphaZero approach was used to create the Leela chess-playing AI, it took about a year of running the machine learning 24/7/365 to create an algorithm capable of challenging the existing world-champion program, Stockfish, and actually beating it (Stockfish has since won back the world title and kept it through several matches).
    But again, once the NN algorithm has been created, the program can run very well on relatively weak hardware. As an example, every high-end phone today runs AI programs in at least 2 different ways: one is voice recognition and the other is video enhancement and manipulation. Longtime users of voice recognition will remember that in the past, voice recognition programs had to be trained to recognize your voice: you would read prepared text into the computer, which would later be used to match those same snippets of tones for voice recognition. Today, because of machine learning, enormous amounts of voice data saying different words in different dialects and intonations are distilled into an algorithm which can easily and quickly recognize your voice and others' accurately, without training.
    So no, let's get this straight: enormous amounts of computing power and special machines like AI accelerators are not needed to run AI programs like those described in this video.
    All that heavy work has been done ahead of time on special computing devices, including AI accelerators, so that you can run the AI program on common and even very weak computing devices like off-the-shelf mobile phones or your PC, and do wondrous things.

  • @wafflescripter9051
    @wafflescripter9051 2 years ago

    I hate hardware

  • @silberlinie
    @silberlinie 2 years ago

    An excellent man. I can judge that.
    However, he is out of place with Yannic, because Yannic has little insight into these things. That is bad, because he does not know how and where he can start. And you also see that he is not particularly interested, because his world is several floors above these elementary hardware levels. Yes, OK, he makes some effort, at least...