The World's Largest AI Supercomputer (36 ExaFlops)

  • Published on Jul 20, 2023
  • This is big news. It's easy to put NVIDIA at the center of AI hardware, but Cerebras Systems just made a massive sale, worth more than all the VC funding they've raised to date. Each of Cerebras' Wafer Scale Engine systems costs between $1-2 million, and a tier-2 cloud provider from the Middle East just purchased nine supercomputers with 64 chips each, or 576 in total. All nine of these supercomputers can be networked together to create a behemoth 36 exaFLOPS (FP16) supercomputer just for machine learning. It's somewhat insane.
    -----------------------
    Need POTATO merch? There's a chip for that!
    merch.techtechpotato.com
    more-moore.com : Sign up to the More Than Moore Newsletter
    / techtechpotato : Patreon gets you access to the TTP Discord server!
    Follow Ian on Twitter at / iancutress
    Follow TechTechPotato on Twitter at / techtechpotato
    If you're in the market for something from Amazon, please use the following links. TTP may receive a commission if you purchase anything through these links.
    Amazon USA : geni.us/AmazonUS-TTP
    Amazon UK : geni.us/AmazonUK-TTP
    Amazon CAN : geni.us/AmazonCAN-TTP
    Amazon GER : geni.us/AmazonDE-TTP
    Amazon Other : geni.us/TTPAmazonOther
    Ending music: An Jone - Night Run Away
    -----------------------
    Welcome to the TechTechPotato (c) Dr. Ian Cutress
    Ramblings about things related to Technology from an analyst for More Than Moore
    #cerebras #machinelearning #galaxycondor
    ------------
    More Than Moore, as with other research and analyst firms, provides or has provided paid research, analysis, advising, or consulting to many high-tech companies in the industry, which may include advertising on TTP. The companies that fall under this banner include AMD, Armari, Facebook, IBM, Infineon, Intel, Lattice Semi, Linode, MediaTek, NordPass, ProteanTecs, Qualcomm, SiFive, Tenstorrent.
  • Science & Technology
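  • For scale, a minimal sketch of the arithmetic implied by the description (the per-system and per-wafer figures are derived here, not quoted):

    ```python
    # Back-of-the-envelope check of the Condor Galaxy numbers above.
    systems = 9
    wafers_per_system = 64
    total_fp16_exaflops = 36.0

    total_wafers = systems * wafers_per_system                # 576 wafers
    exaflops_per_system = total_fp16_exaflops / systems       # 4.0 EF per system
    petaflops_per_wafer = exaflops_per_system * 1000 / wafers_per_system  # 62.5 PF

    print(total_wafers, exaflops_per_system, petaflops_per_wafer)
    ```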

Comments • 96

  • @CutoutClips 10 months ago +13

    I enjoyed seeing your AI Hardware Show live at DAC last week.
    Cerebras is definitely making some pretty crazy cool hardware

  • @AndrewMellor-darkphoton 10 months ago +23

    Don't you love it when semiconductor engineers go completely insane? Imagine the board meetings when they talk about how many transistors they can stuff onto a single chip.

    • @TechTechPotato 10 months ago +4

      'board' meetings. wafers as big as boards, please

    • @whyjay9959 10 months ago +1

      @@TechTechPotato Maybe after wafer-scale chiplets.

  • @xssimposter5203 10 months ago +20

    It's really important to note this appears to be FP16 only (as highlighted in the description). When I think throughput, I typically mean FP64 validated with workloads such as LINPACK. That doesn't diminish the solid numbers projected by the team; it just makes them harder to compare to current supercomputer results. Not that it's impossible, we just need to account for the architecture.

    • @TechTechPotato 10 months ago +17

      I was on a panel of experts last week talking about Zettascale and whether it's the right target after Exascale. I was the last to speak, and I said yes, Zettascale is the right target, but how it's measured will have to change given how workloads are evolving. It can't simply be FP64 LINPACK until the end of time.

    • @xssimposter5203 10 months ago +3

      @@TechTechPotato 1000%. In fact, I'd say ML is typically associated with 16-bit FP types anyway. My only concern is companies exploiting throughput numbers without having explicitly stated the datatype (not that I think this is the case here). Being able to compute even 70 PetaFLOPS would put you in the top 10 for FP64. It would be pretty dishonest to report int2 throughput if you could do FP64.
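
      For scale, a hedged sketch of why the datatype matters (the Frontier figure is an assumption, roughly its June 2023 TOP500 HPL result; the ratio is illustrative only):

      ```python
      # Scale check only -- FP16 ML throughput and FP64 LINPACK are different
      # workloads on different datatypes, so they are NOT directly comparable.
      condor_galaxy_fp16 = 36e18    # 36 exaFLOPS FP16 (this video)
      frontier_fp64_rmax = 1.2e18   # ~1.2 exaFLOPS FP64 HPL (assumed, June 2023 TOP500)

      print(f"naive ratio: {condor_galaxy_fp16 / frontier_fp64_rmax:.0f}x")  # ~30x on paper, apples to oranges
      ```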

  • @christopherjackson2157 10 months ago +5

    The key to wafflescale is maple syrup

  • @jcugnoni 10 months ago +6

    Skynet is here ;-) More seriously, Cerebras is no joke: really great idea, very well executed. Looking forward to seeing where they go from here.

    • @OzarksMultimedia 7 months ago

      Strangely I also thought of the MAGI with the three systems there! 😆

  • @tristan7216 10 months ago +8

    I worked for a chip startup long ago, and I remember looking at the export regulations for the chip we were building. I don't know how those have changed over the years, but I wonder if Cerebras is going to get clotheslined by the State Department or the DoD when trying to ship these overseas.

    • @TechTechPotato 10 months ago +6

      The second biggest ally to the US in the Middle East, after Israel, is the Emirates. Also, the arrangement still means Cerebras is in full control of the hardware.

    • @telesniper2 4 months ago

      @@TechTechPotato ITAR is strictly geographic. They don't go for cutesy custody arrangements.

  • @Mireaze 10 months ago +4

    I want one of those wafers, they'd make a great dinner plate

  • @2dozen22s 10 months ago +3

    Really excited to see this sort of adoption! Cerebras has been one of the most interesting AI startups to follow.
    Side thought: do you think they could stack cache/HBM onto the die, using it like a substrate, for even larger models at better power efficiency?

  • @telesniper2 4 months ago

    I wish we could learn more about the architecture. I knew a computer engineering grad student who did something like ghetto wafer-scale integration. She took a bunch of Xilinx FPGAs, GPUs, and DSPs and depackaged them, bonded the dies to a new ceramic substrate very close together, and wired them up using a wirebonder. It was not quite as good as having the circuit all on one wafer, but pretty close, and a HELL of a lot cheaper. Also more versatile! The system has on-board genetic algorithms that evaluate the reconfigurable parts of the processor and implement changes on the fly to optimize everything. I doubt Cerebras can do that!

  • @kkrolik2106 10 months ago +4

    Square CPU, why not a round one? You could get an additional 10-15% more transistors ;)

    • @TechTechPotato 10 months ago

      My 8th most popular video! th-cam.com/video/Rhs_NjaFxeo/w-d-xo.html

    • @kkrolik2106 10 months ago +2

      @@TechTechPotato True for standard silicon, but this one isn't standard, since it doesn't need to be sawn out of the wafer. I wonder how space-efficient it would be if they designed each module as a hexagon to fill the round wafer better.
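
      For a rough sense of the geometry, a minimal sketch (assumes a 300 mm wafer and the largest inscribed square):

      ```python
      import math

      # The largest square inscribed in a circle of radius r has side r*sqrt(2).
      r = 150.0                             # mm, a 300 mm wafer
      circle_area = math.pi * r**2          # ~70,686 mm^2
      square_area = (r * math.sqrt(2))**2   # 45,000 mm^2 (the WSE-2 is quoted at 46,225 mm^2)

      print(f"square covers {square_area / circle_area:.0%} of the wafer")  # ~64%
      ```

      So a fully round die would offer roughly 57% more area in principle, well above 10-15%, though reticle stitching and edge exclusion eat into that; hexagonal modules would presumably land somewhere in between.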

  • @nicolasdujarrier 8 months ago +1

    I wish Cerebras would work with Intel Foundry Services to have the next iteration of the Wafer Scale Engine manufactured on Intel 18A, AND also collaborate with the European research center IMEC to integrate memristive components like VG-SOT-MRAM (which could also be used as SRAM): it may enable a further 10x efficiency improvement…

  • @sloanNYC 10 months ago

    These things are crazy!

  • @marktackman2886 10 months ago

    The merch ideas are really good... but yeah, it's funny how the hardware conversation is different from the workload conversation for this stuff. It's like we need really good examples of customers' workloads to relate to the product beyond just thinking of it as a redundant AI system.

  • @craigrmeyer 2 months ago

    Also: if wafers are round, why aren't these wafer-scale chips round too? Why cut them down to a square? Why not use the entire circle? Surely I'm not the first to think of this. Impossible.

  • @saultube44 1 month ago

    No mention of the 72K+ AMD EPYC cores per Condor? That's not a small amount.

  • @seansingh4421 3 days ago

    They're probably going to need to do something like network caching for this, because I don't see anything but the network being the bottleneck.

  • @teds5047 10 months ago +1

    I had to look up how much an exaFLOP is. I am limited to terabytes. That's how big my hard drive is, so I only knew up to a teraFLOP. Crazy!!!

    • @robertglass481 4 months ago

      Think six orders of magnitude more: tera is 10^12, exa is 10^18.
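
      As a quick sanity check on the prefixes:

      ```python
      # SI prefix ladder for FLOPS: each step up is 1000x.
      prefixes = {"giga": 1e9, "tera": 1e12, "peta": 1e15, "exa": 1e18}
      print(f"{prefixes['exa'] / prefixes['tera']:.0e}")  # 1e+06: a million teraFLOPs per exaFLOP
      ```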

  • @ikjadoon 10 months ago +6

    Excited to see them grow even faster. 1) I wonder if they'll ever have a wafer-scale competitor? 2) Cerebras' immense amount of co-design and expertise with fabs indirectly highlights how far behind Intel's IFS is, especially since Intel has been "trying" to win external customers since the early 2000s. The partner trust deficit of Intel vs TSMC seems wider than ever.

    • @TechTechPotato 10 months ago +5

      1) The nearest they have is Tesla. Maybe we'll see CoW competitors down the line. Speaking on a panel about chiplets vs wafers, it's clear lots of companies prefer the chiplet route for customizability, citing cost or complexity.

  • @kahvac 10 months ago +1

    Curious as to what computing projects G42 will do?

    • @fteoOpty64 10 months ago +1

      Medical AGI is my guess...

  • @catalinedward 10 months ago

    So this is an AI chip? And for the other loads you have those 72,704 AMD EPYC cores? Not sure why the slide shows AMD in there.

  • @sinephase 10 months ago

    I remember I used to think, "why don't they just make a massive die?" Because it's hard, and defects make it really costly. Amazing that they got it reliable enough to make a die this massive work and be relatively cost-effective!

    • @matthewmalaker477 10 months ago +2

      They designed the architecture in a way that allows defective cores to be bypassed. It means defects don't affect yield, but they still happen; they just reduce the final core count.

    • @sinephase 10 months ago

      @@matthewmalaker477 True, but if they were getting significant defects it still wouldn't be viable.

    • @TechTechPotato 10 months ago +1

      @@sinephase @matthewmalaker477 TSMC's defect rate is 0.07/cm², or about 40-50 defects per wafer. When you have 850,000 AI cores, it's nothing. They also run the cores at 1.1 GHz and 0.7 volts, about 30 mW/core, so there's no real yield loss from binning either.
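
      A minimal sketch of that defect math (defect density and core count are the figures quoted above; one-defect-kills-one-core is a deliberate worst-case assumption):

      ```python
      defect_density = 0.07        # defects per cm^2 (TSMC figure quoted above)
      die_area_cm2 = 46225 / 100   # WSE-2 die, 46,225 mm^2 (Cerebras's quoted size)
      cores = 850_000

      expected_defects_die = defect_density * die_area_cm2   # ~32 defects on the die
      print(f"{expected_defects_die / cores:.4%} of cores lost")  # ~0.0038%
      ```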

  • @obsidianjane4413 10 months ago

    I can't wait to short the hell out of this segment when it Y2Ks...

  • @ElGreco365 10 months ago

    Just numbers. What can one do with 36 exaFLOPS?

  • @whyjay9959 10 months ago +2

    What are the holes in the die corners for? Connections?

    • @ikjadoon 10 months ago +1

      That is an interesting question. In VentureBeat's 2020 article, there's a mount; the posts are not in the corner holes but in the immediately adjacent holes. Somewhat relatedly, the Cerebras Hot Chips talk said thermal expansion was a big problem, so Cerebras took a year to design a "custom connector" to mitigate flexing onto the mainboard PCB.

    • @TechTechPotato 10 months ago +6

      Mounting and stability. A thing this size undergoes thermal expansion, so there needs to be some leeway. The custom connector is a funny thing to behold in person, but it works.

    • @whyjay9959 10 months ago +1

      Ah, to distribute the stress so the wafer and board (and heat spreader?) flex in the same way instead of separating? Makes sense, thanks.

    • @whyjay9959 9 months ago +2

      Sorry, I misunderstood. The wafer and board expand at different rates, and it's the custom connector between them that kind of warps to keep the connections intact while they do, apparently.
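
      A minimal sketch of the mismatch such a connector has to absorb (the coefficients and temperature swing are assumed ballpark values, not Cerebras figures):

      ```python
      # Linear thermal expansion: dL = alpha * L * dT
      alpha_si = 2.6e-6    # /K, silicon (typical)
      alpha_pcb = 14e-6    # /K, FR-4 PCB in-plane (typical, varies by laminate)
      length_mm = 215.0    # roughly one edge of the wafer-scale die
      delta_T = 50.0       # K, a plausible idle-to-load swing

      mismatch_um = (alpha_pcb - alpha_si) * length_mm * delta_T * 1000
      print(f"~{mismatch_um:.0f} um of relative movement across one edge")  # ~123 um
      ```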

  • @tommihommi1 10 months ago

    Where can I find this longer interview?

    • @TechTechPotato 10 months ago

      Next week we'll upload it. Dom's having a weekend to himself away from work, as he should

  • @utubekullanicisi 10 months ago +4

    When will they make a 3rd generation of this chip?

    • @TechTechPotato 10 months ago +2

      Sooooo! (TBD, I have no idea, but I'm hoping)

    • @serkardis292 10 months ago +2

      @@TechTechPotato But you do have an idea: there was an engineer from Cerebras who accidentally leaked details about CS3 in late 2022 in the comments of one of your videos (the comment got deleted, but IIRC you replied to it).

    • @utubekullanicisi 8 months ago +1

      @@serkardis292 Hi, what details did he leak?

    • @TechTechPotato 8 months ago +1

      I don't piss around with unverified rumours

  • @plugplagiate1564 10 months ago

    My guess: they will force customers to use the data centers by shutting down privately owned mini computers for security reasons.

  • @pamus6242 10 months ago

    Mubadala holds a good amount of GlobalFoundries shares.

  • @mckengineer5727 10 months ago

    Do they grow tulips in the Emirates too?

  • @Veptis 10 months ago

    My main problem is that there only seems to be a single company ready to sell me a workstation card that lets me run language model inference for 7B and 15B models locally, and that's Nvidia. I tried to go with Intel, but they declined to sell either Gaudi2 or GPU Max as single cards.
    There was an announcement of CoreWeave building a 3Bn system with 20k H100s.

    • @serkardis292 10 months ago

      You can run OK-ish LLM inference on AMD GPUs nowadays.

    • @Veptis 10 months ago

      @@serkardis292 I haven't even looked, but is there anything with 48GB or more?

    • @serkardis292 10 months ago

      @@Veptis Yes, there is the W7900. Software is still lacking, but it can run most modern LLMs at decent speeds. But you can get twice the VRAM and much more compute for the same price if you buy 4x 7900 XTX. Is there any specific use case where you would need that amount of VRAM on a single GPU?

    • @Veptis 10 months ago

      @@serkardis292 I want to run inference on larger models (7B, 16B) at fp32 for a benchmark I am developing. Going distributed with accelerate might be an option, but I am not sure how much of a slowdown that is. One of the datasets I am working on right now is 1.6k data points, and every one of them will be model inference for a few hundred tokens. It takes more than two hours to run this on CPU, and the 24GB RTX 5000 I might get to use via my university won't easily run models in the 15B range.
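
      A weights-only VRAM sketch of that constraint (ignores activations and KV cache, which add more on top):

      ```python
      def weight_gib(params_billion: float, bytes_per_param: int) -> float:
          """GiB needed just to hold the weights."""
          return params_billion * 1e9 * bytes_per_param / 1024**3

      for params in (7, 16):
          for dtype, nbytes in (("fp32", 4), ("fp16", 2)):
              print(f"{params}B @ {dtype}: {weight_gib(params, nbytes):.0f} GiB")
      # 7B @ fp32 -> ~26 GiB (already over a 24 GB card); 16B @ fp32 -> ~60 GiB
      ```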

    • @serkardis292 10 months ago

      @@Veptis I think you should consider multi-GPU setups. 4x 7900 XTX would cost you $4k. There are multi-GPU software problems, but they are being solved.

  • @utubekullanicisi 9 months ago

    Unlisted?

  • @EssZee323 8 months ago

    @TechTechPotato Hi mate, love your vids, just wanted to ask a Q: why doesn't Tesla buy these chips, seeing as Musk stated that Nvidia can't ship the GPUs they need for their Full Self-Driving AI system?

  • @ingemar_von_zweigbergk 10 months ago

    I felt a ravenous hunger

  • @sneekcreeper689 10 months ago +3

    Why don't they make their wafer-sized chip a hexagon instead of a square?

  • @craigrmeyer 2 months ago

    Why isn't NVIDIA making these? What's the deal? What's the catch?

    • @TechTechPotato 2 months ago

      Cerebras has patents, it's tough to design, and it needs a large group of people behind it. Cerebras still needs to show it's viable long term as a company.

  • @gregandark8571 10 months ago

    Does this AI thing propel the development of spintronic & photonic CPUs?

    • @nicolasdujarrier 8 months ago

      I would think that, given the technical requirements and the resulting consequences for power consumption and cost, the AI craze will put more and more pressure on speeding up the development of more energy-efficient technologies.
      So yes, I would expect spintronics, especially non-volatile MRAM (like IMEC's VG-SOT-MRAM concept), and photonics (for communication bandwidth) to also benefit from bigger investments.
      I am not an expert, it is just an opinion…

    • @gregandark8571 8 months ago

      @@nicolasdujarrier
      I'm super passionate about Racetrack Memory 3.0-4.0 and beyond, and my attention is also on ULTRA-RAM.
      P.S. Of course I may be very wrong in my predictions, but I think 2026 and 2027 will be AMAZING times for memory tech.

    • @nicolasdujarrier 8 months ago

      I am aware of the concept of racetrack memory, but I am not sure any research lab has been successful in demonstrating it…
      I have never heard of ULTRA-RAM, though…
      As for the timeframe, yes, I would expect some ramp-up between 2025 and 2030, especially for MRAM, because all the major foundries (TSMC, Samsung, GlobalFoundries, UMC) have started to manufacture it (after decades in R&D). And it could be a good fit for analog AI…

    • @gregandark8571 8 months ago

      @@nicolasdujarrier
      Regarding racetrack memory, I saw a fresh video here on YouTube where the lead research professor from Germany (Max Planck) stated that not only do they already have racetrack memory working, but they are also proceeding toward the latest version, 4.0.
      So at this point I suppose we are not that far from getting real spintronic hardware ruling inside our computers soon.

  • @chessmusictheory4644 9 months ago

    If I win the lottery I want one.

  • @keyboard_g 10 months ago

    Second channel?

    • @TechTechPotato 10 months ago

      There's a TTP clips channel; I might just rename it TTP2.

  • @bungalowjuice7225 7 months ago

    Scary that no one seems to think beyond selling these chips. Do we really want hostile countries to get these?

  • @DoNotFitInACivic 10 months ago

    Skynet, here we come lol

  • @ramasubbukandasamy6449 2 months ago

    What is this one? Infineon technology chips give a different surprise. Your chips are just using normal technology; how is it different, just expanding your memory? Innovate ideas to improve different chips. Tq

  • @fteoOpty64 10 months ago

    Tesla should just buy it!!! Dojo is nice and great, but this is on a different level...

  • @joshDilley 5 months ago

    Buying up the market for a higher resale price??? #yolo

  • @sneekcreeper689 10 months ago

    Why was it unlisted? Also, you may need/want to make the video private.

  • @__aceofspades 10 months ago +2

    I'd bet my career this doesn't pan out. Middle Eastern countries love to spend ludicrous amounts of their oil money on pipe-dream projects, but 9 times out of 10 they are failures. Cerebras isn't ready for prime time, otherwise the hyperscalers would be buying them, but there is currently little interest.

    • @nicolasdujarrier 8 months ago

      I am wondering the same thing: if the Wafer Scale Engine technology is so great, why aren't hyperscalers already purchasing boatloads of them for their AI datacenters?

  • @joemichaels6735 10 months ago

    The tech potato head.

  • @ipdavid1043 10 months ago

    That is why TSMC's strategy of moving to the USA is a bad one. Nvidia and AMD are BS-ing about their AI market prospects... it is just hype.

  • @longnamedude3947 10 months ago +1

    DoD, or some other large government-related customer.

  • @sa8die 10 months ago

    C'mon Biden, let's get on this "infrastructure"? lol

  • @rodneyericjohnson 10 months ago +3

    We are going to die.

  • @byteme6346 4 months ago +1

    Don't believe this hype.

  • @user-me5eb8pk5v 10 months ago

    These computers will become smarter and better than us in every conceivable way; nobody will waste time coughing or sneezing or going to the bathroom. It will go into earthquake material, under a mile of cocoa pebbles. _yes master yes master goood chow good chow_

  • @MrFujinko 10 months ago

    Why does the brain do so much with so little power? When will people stop doubling down on the current compute paradigm? They'd rather pump billions into making small improvements on current tech than try something new.

    • @TechTechPotato 10 months ago +4

      There are billions being placed into analog, quantum, optical, and neuromorphic among others too. You don't simply disregard what is working best today in the hope that something better comes along. Some people work on what we have; some people build potential futures.

    • @DigitalJedi 10 months ago

      As TTP said, there are billions going into alternatives, but you can't just throw out everything we have built so far and go for artificial brains or something like that. Especially not with how much room is left to improve and optimize silicon for different use cases. Custom ASICs can already get really efficient: not human-brain efficient, but much better than general-purpose hardware for a specific task.
      Neuromorphic has seemed promising lately, and quantum devices are still making progress. Analog and photonics are also being explored, so we may see a breakthrough from any number of places, or even a few, in the coming years.

  • @mhh3 10 months ago +4

    Sorry, but I'm not going to watch any AI-related videos anymore; it just annoys me at this point. It doesn't matter that it's going to grow more and more over the years, I will tune it out.

    • @TechTechPotato 10 months ago +2

      No problem, everyone has different paths! I'm still finding it rather exciting, especially in hardware.