That last point is not correct. Per compute, the Cerebras system uses less power, because you need less extra equipment to do the same thing. Did you not research the numbers?
Uh, we don't recall attacking Cerebras, and we certainly didn't say that it used more power. What we did say was that it is possible a bigger chip may have an increased total cost of ownership. We gave 5 reasons why it is a challenge to make bigger chips work. Did you not watch the whole video? Context is important.
Thanks for responding. I did. Go to the section where you talk about TCO, at 16:05. I know you said "might," but it doesn't, because when you have 60x as much compute and a huge amount of on-chip memory, there is less power in total even if there is more power per chip for cooling, etc. Also, Nvidia needs dozens of chips for communication to do the same as one chip. Communication is much cheaper in power usage if it's just baked into the chip.
New techniques being used to try and keep those monster chips cool. All in the name of tearing through more data. We have some research in queue on Vertiv (VRT).
My issue with Cerebras is them being adamant that they get 100% yield, which is of course BS… they will not disclose how much of the wafer is actually bad/defective. As for power, they are not lower power when running at full tilt on a normalized process… yes, there is an architecture advantage for lower power, but in the grand scheme of things it's not significant. Transistors are transistors; they need to switch, hence consume power.
Camtek (NASDAQ:CAMT) said Monday it has received a new order for about $25 million from a tier-1 HBM manufacturer, for the inspection and metrology of High Bandwidth Memory.
👉👉Want more Chip Stock Investor? Our membership delivers exclusive perks! Join our Discord community, get downloadable show notes, custom emojis, and more. Become a true insider - upgrade your experience today!
Join at TH-cam: th-cam.com/channels/3aD-gfmHV_MhMmcwyIu1wA.htmljoin
Join on Ko-Fi: ko-fi.com/chipstockinvestor 🔥🔥🔥
Good information but painful to listen to the two of you talk. Do yourself a favor and listen to your own speech patterns. You both have a terrible habit of talking in... a... very... choppy... pattern. This is aside from the obvious editing done to this video.
REQUEST: an episode detailing the most important NVIDIA "partnerships" and what they mean for the company's future. You guys are awesome.
Oracle in quantum computers
Cerebras' pace of making smaller and smaller transistors on large-scale wafers suggests they have some systematic understanding of how to deal with thermal/quantum jitter. ASML now has 1 nm feature-size capability (ASML's technology is also a technological miracle, and they can see how to go beyond their current miracle). So I expect Cerebras, for one, to beat their CS-3.
Nvidia makes its chips at or near the reticle limit, as does the WSE. Both designs overprovision functional units that can be fused off or routed around and still meet the specification (some estimate about 10-20% of an H100 is disabled silicon).
Nvidia can bin bad chips into lower-tier Blackwell products to offset costs; the WSE doesn't have this option.
The WSE requires a complex cooling system but a lot less networking. Blackwell requires an additional NVLink switch chip per 8 GPUs or so, advanced packaging for the GPU dies/HBM, and advanced Mellanox networking to get a lot of GPUs to communicate. So it isn't so clear who wins on a cost basis.
Cerebras seems to have solved the cooling/mechanical problems, so in theory they can outperform Blackwell on certain models that fit within the chip's memory. However, that is substantially less memory than Blackwell offers.
Exactly. Thank you for the extra detail on the comp. Fun to watch the battle. All good stuff for customers.
Cerebras also has its own networking technology, and it's fast; something to consider when evaluating its smaller memory per chip. I'm curious about the difference in operating cost between Blackwell and the WSE-3. Their business models are also different: Cerebras doesn't sell its chips, but acts as a cloud provider and supercomputer builder/administrator in different modalities.
Huang said last week NVIDIA doesn't make or sell chips tho lmao, then doubled down and said they're a software company.
@Dmillz192 It's true... they're designers; TSMC makes the chips. And in fact the insider knowledge/software part of Nvidia is by far the most valuable!
@v1kt0u5 TSMC makes both GPUs and CPUs. Again, Nvidia is known for its GPUs, which will take at least 6 years to catch up and perform at the level of wafer chips, which TSMC is actually manufacturing for Cerebras. Cerebras was the only TSMC client doing wafer chips until recently, with Tesla's wafer-sized 'Dojo', which was announced a few weeks ago.
Excellent! As always you both are great teachers in this field! Keep up your amazing hard work! ❤
Good video, but I hope a future video can elaborate more on how Cerebras has solved problems (3) and (4) in their product. And for problem (5), power consumption: although a larger chip consumes more power per chip, it consumes less power for the equivalent compute (compared with smaller chips stitched together with interconnects or other methods).
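This power-per-equivalent-compute argument can be put into a toy model. Every wattage below is a made-up placeholder chosen to illustrate the bookkeeping, not a spec for any real product:

```python
# Toy model of power per equivalent compute: one big chip vs. many
# small chips joined by interconnect. All numbers are hypothetical
# placeholders for illustration only.

def system_power(num_chips, chip_power_w, link_power_w):
    """Total power: chips plus the links needed to join them."""
    links = max(num_chips - 1, 0)  # minimal spanning interconnect
    return num_chips * chip_power_w + links * link_power_w

# One wafer-scale part vs. 60 reticle-sized parts for the same compute.
big_chip = system_power(1, 20_000, 0)  # no external links needed
cluster = system_power(60, 700, 150)   # pays an interconnect power tax
print(big_chip, cluster)
```

The point of the sketch is only that the multi-chip system pays an extra interconnect term that the monolithic one avoids; whether that term dominates depends entirely on the real wattages.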
Great video
Very interesting video, especially the five reasons for size limitations at the end.
#1 was new and interesting to me, but it makes sense. This is something most non-experts would probably not figure out by themselves easily.
#2 was relatively obvious. Cerebras has at least somewhat of a solution for this, as you mentioned; they are somehow routing around damaged transistors (not sure how effective their solution is).
#3 also makes sense, but like with #1, most people wouldn't know by how much exactly this would limit the chip size.
#4 also makes sense. Maybe materials science could help here!? Or maybe the optimal available materials are already used. It would seem that Nvidia wouldn't make compromises here given the product price.
#5 I guess the previous points all play into this TCO calculation, and it is probably cheaper to cool separate smaller chips.
It would be interesting to know if the Nvidia CEO thinks the size of the Blackwell chips is already optimal, or if it could make sense to grow chip size further, at least for very large customers who need the most computing power. I asked Gemini why 300 mm is the current standard for wafers. One interesting aspect is that precisely handling 450 mm diameter wafers, for example, would be an immense technological challenge, because the wafers are so fragile.
HIGH QUALITY CONTENT!!! the 5 reasons on chips size limits was excellent... love it!
I found an interesting book: "Chiplet Design and Heterogeneous Integration Packaging" by John H. Lau, 895 pages. The book focuses on the design, materials, process, fabrication, and reliability of chiplet design and heterogeneous integration packaging. Both principles and engineering practice are addressed, with more weight placed on engineering practice. This is achieved through in-depth study of a number of major topics such as chip partitioning, chip splitting, multiple system and heterogeneous integration with TSV interposers, multiple system and heterogeneous integration with TSV-less interposers, chiplet lateral communication, system-in-package, fan-out wafer/panel-level packaging, and various Cu-Cu hybrid bonding. The book can benefit researchers, engineers, and graduate students in fields of electrical engineering, mechanical engineering, materials science, industrial engineering, etc.
Great video. Good basic explanation regarding the 5 main reasons chips cannot easily be made bigger.👍
Another plethora of important info....thanks as always
Excellent presentation. Thanks for sharing….
The problem with Cerebras' approach is that chiplets themselves already have redundancy to help mitigate defective parts. Also, you need to break out to external devices at some point, and having everything monolithic means you have to break out for IO, storage, etc. anyway.
Great videos and content! as always
Nice to discover a video of NotebookLM’s previous model. It was definitely less life like and realistic than today.
Blackwell upgrade incoming, we'll look and sound no different than real humans!
Your channel has come to my eyes at the right time...but I wished I knew this channel before the AI frenzy 2 months ago..
GREAT video! I can't believe you continue to produce such great content. Job well done, and a BIG thanks!!
Great content ! I learned a lot from it ,
We're glad to hear it!
Thanks, crew. I wonder if you might do a video on BrainChip Corp's neuromorphic Akida chip? I'm very curious to understand how the Akida 2000 works, because it has memory embedded in the chip, in 4 memory configurations per 'node' or axon, producing a super-low-power chip. I'm wondering why other companies aren't following this design. And does it have the potential to be scaled up for training?
GREAT PRESENTATION!
Nvidia has the fastest interconnect of all the competitors... Nvidia is also the company that started all of this, really, with deep learning! Plus, Nvidia is more than capable of making a wafer-scale chip if they believed it was a better way! Nvidia also has the best software stack and tools for the job!
No, Cerebras has the fastest interconnects between dies. It's basically like the communication between two Blackwell dies, but instead of 2 you get 80-90 dies. Also, Cerebras' inter-die communication is faster than Blackwell's, since they're not using 2.5D; they're just using a metal layer, it looks like.
Think about all the leading companies that are no longer at the top. Cerebras Systems has already released inference that is faster than Nvidia's! This is who I am watching right now!
@lynnecoles7276 Well, to be fair, NVIDIA hasn't bothered to focus on inference as of yet, but you can do inference a lot of ways. I use a 96-core Genoa with 768 GB of 12-channel DDR5 and two 3090 FE cards. Genoa feeds the GPUs memory and works great, and can run basically any model in GGUF extremely well that fits within the 800 GB environment.
The entire wafer is etched by reticles before it's cut into chips, so I don't see how problem #1, "the reticle," is a problem for using the entire wafer as one chip. As for defects, the architecture enables bypassing sections that have a defect; Groq does this. It's not simply "infrastructure" that limits wafers to 12 inches, but the inability to make the flow of gases and heat across the entire wafer perfectly even. You could slow each step down to help gases and heat spread more evenly, but that reduces production rate. The only truly fundamental physics problem is that you want as much of the chip as possible synchronized with the clock, because parallel computing for generalized algorithms can greatly waste computation. You can't have the entire wafer sync at high clock speeds because, for example, at 1 GHz light can travel only 300 mm per cycle, and it's not a straight path across the chip; capacitances greatly reduce that max speed, and at 1 GHz you really need everything synced within less than 1/4 of the clock cycle (75 mm max distance). Fortunately, video and matrix multiplication are algorithms that can efficiently do parallel ("non-synced") computation. Training can't be parallelized efficiently, but inference can, although NVDA's GPU architecture can't do it nearly as efficiently as is theoretically possible. Groq capitalizes on this, not needing any switches (Jensen was proud of NVDA's new switches being more efficient) or any L2 cache (which at least doubles the energy per compute), which is why Groq gets 10x more tokens per unit of energy than an H100.
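The speed-of-light arithmetic in this comment can be reproduced directly; it is an upper bound, since real on-chip wires are much slower than light:

```python
# How far a signal can possibly travel per clock cycle, ignoring
# on-chip RC delay: at 1 GHz, light covers ~300 mm per cycle, and
# requiring settling within a quarter cycle shrinks that to ~75 mm.

C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def sync_distance_mm(clock_hz, fraction_of_cycle=1.0):
    """Distance light covers in the given fraction of one clock cycle."""
    return C_M_PER_S / clock_hz * fraction_of_cycle * 1000.0  # m -> mm

print(round(sync_distance_mm(1e9)))        # ~300 mm per full cycle
print(round(sync_distance_mm(1e9, 0.25)))  # ~75 mm per quarter cycle
```

Since real interconnect delay is far worse than this ideal, the practical synchronous region on a wafer is smaller still, which is the comment's point about wafer-scale clocking.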
Reticle size constrains the size of the largest unique design that can be patterned on the wafer. Those identical designs are then repeated across the entire wafer. Blackwell takes 2 maximum-reticle-size dies and connects them together in the same package. My prediction is that Blackwell's successor will connect 4 maximum-reticle-size dies in the same package. Nvidia's approach is more flexible than the WSE-3, which has massive cooling, power distribution, and defect management challenges.
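The area arithmetic behind this can be sketched in a few lines. The ~26 mm x 33 mm exposure field is the standard lithography reticle limit and 46,225 mm² is Cerebras' published WSE die area; the 4-die package is the comment's prediction, not a product:

```python
# Silicon-area comparison: reticle-limited dies per package vs. a
# wafer-scale engine. 26 mm x 33 mm is the standard maximum exposure
# field; 46,225 mm^2 is Cerebras' published WSE die size.

RETICLE_MM2 = 26 * 33  # 858 mm^2, largest single-exposure die
WSE_MM2 = 46_225       # Cerebras WSE-3 die area

for dies in (1, 2, 4):  # single die, Blackwell, predicted successor
    package_mm2 = dies * RETICLE_MM2
    print(f"{dies} die(s): {package_mm2} mm^2 "
          f"({WSE_MM2 / package_mm2:.0f}x less silicon than the WSE)")
```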
The advantage of staying on the wafer is that you get extremely low latency and extremely high bandwidth between the reticles. First, Blackwell only allows 2 reticles to talk to each other using the 2.5D interconnect (which likely has a larger pitch than what Cerebras is doing). Second, the moment the data has to leave the Blackwell package, you'll need to use NVLink, and eventually InfiniBand. This is why you see everyone trying to make larger and larger chips: to optimize the communication between compute elements.
Thanks a lot for the video!
Great insight for this non-techie. Still able to get good info that will help with due diligence before investing. Thanks!
this was very good.
Thanks, glad you liked it!
Great work guys. What do you think of Tesla and their developments in robotics and AI? Their stock value is compelling right now
Under $ 140 in a year.
Tesla met with Huang to beg for more GPUs. That tells you how well his Dojo supercomputer is doing.
Bigger chip = higher defect rate. If the chip is designed to deal with failed parts of the die so it can still get to market (pathways through the chip can be disabled, and the chip specs allow for a certain percentage of the chip to fail in production), then it's not terrible. But a wafer-size chip is a nightmare.
Pretty much any wafer that comes off a line has defects; it's only a matter of percentage. The prevailing knowledge is that the smaller you can make a die (chip), the smaller the percentage of failed chips off that one wafer. For instance, if a single wafer is used to make ONE chip AND there is no allowance for failed parts of that chip, then the failure rate is pretty much always going to be 100%, and of course that's not feasible.
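This intuition matches the standard Poisson yield model, which a few lines make concrete. The defect density used here is an assumed, typical-order value, not a foundry figure:

```python
# Poisson yield model: the chance a die is defect-free falls off
# exponentially with its area. The defect density is an assumption
# for illustration, not a real foundry number.
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """P(zero defects on a die) = exp(-D * A)."""
    return math.exp(-defects_per_cm2 * area_cm2)

D = 0.1  # assumed defects per cm^2
print(f"1 cm^2 die:     {poisson_yield(1, D):.1%}")   # ~90% good
print(f"8 cm^2 reticle: {poisson_yield(8, D):.1%}")   # ~45% good
print(f"462 cm^2 wafer: {poisson_yield(462, D):.1e}") # effectively zero
# Hence a wafer-scale design must disable and route around bad regions
# rather than wait for a perfect wafer.
```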
Do you have any info about firms that develop combined semiconductor/photonics (light) solutions?
A little disappointed that you didn't really cover the Cerebras CS-3 chip and compare it to Nvidia's Grace Hopper.
What about patent limits? Are semiconductor companies and EDA companies patenting variations, like pharma companies do?
Just SUPER
Would like to hear your view on PLTR (Palantir)... Or is this channel only for chips?
Could you cover the building out of the newer, bigger data centers? I heard Andreessen talk about them.
Can you guys do a video about Groq and how it will impact Nvidia's monopoly?
How do you see the role of advanced packaging techniques evolving in response to these scaling challenges?
We think advanced packaging companies have a lot to gain to make it all happen.
Is there a native compiler from NumPy to Cerebras? If they are doing the latter, Nvidia is just fine.
www.cerebras.net/blog/whats-new-in-r0.6-of-the-cerebras-sdk
Heat and SRAM
Thank you!🌹🌹🌹
Great information, love your reviews!
Can you review ALAB(Astera Labs Inc)?
This is GPU bridging by stitching 2 dies together, gaining an extremely huge boost in performance!
Nvidia isn't ahead of the pack in terms of packaging; the new Blackwell double chip is exactly the same thing as the Apple M1 Ultra: 2 really big chips connected together using TSMC CoWoS. What makes Nvidia the leader of the pack is their design, and in some cases the software support. For AI that is not a big deal; CUDA is not as relevant, since people aren't writing to those APIs. They are using things like PyTorch, higher-level frameworks that support all of the major vendor APIs these days, so that is not a big competitive advantage.
It would be good to do a deep dive into all the technologies used in the MI300; AMD has been in the vanguard when it comes to packaging. That doesn't mean it gets the win, but it should be a good case study for how all of these advanced packaging technologies work, how they can be used to increase cost effectiveness by reducing the size of the dies that are fabricated (yield), and how they provide a lot of flexibility for product-level differentiation. The MI300A is a good indication of what the future holds.
Did you see our fab equipment video? We are planning some more detail on what CoWoS entails, as these are the processes all these chips and systems utilize.
MI300A = home supercomputer; Cerebras = commercial supercomputer. They aren't even in the same league.
@@UltimateEnd0 MI300A is definitely not for home computers, the El Capitan super computer being installed right now should take the number 1 spot in the TOP500 super computers list when it's fully installed, and it's powered by MI300A.
@9:15 - 4NP for Blackwell is actually 5nm-class technology, not 4nm. That's why people need to understand that these names no longer tell us anything about transistor density. If I misunderstood, someone correct me.
Sorry but we don't make up the names for these manufacturing processes. It is indeed called 4N, regardless of what the transistor sizes actually are, that's the name of it.
@chipstockinvestor Thanks for replying. It sure is interesting to understand more about this manufacturing process, as the naming can be misleading about which tech is superior.
So you guys are telling me those pills I bought on a shady webshop won't make my chip bigger? 😉😃
May result in chip shrinkage
😂
How is N5 one and a half generations behind N4?? That's more like half a generation behind...
The N4 node being utilized isn't the standard one, but a newer "enhanced" N4
How can one invest in Cerebras? They seem to be a private company.
You are correct, Cerebras is private.
There are some websites that allow transactions in secondary markets. You might get lucky and score some shares.
Why do the hosts seem AI generated lol. Or just oddly calm and consistent in cadence
I think the video is chopped up
One more disadvantage: path length, or wire length.
Everybody knows these are all steps toward the optimal fully 3D chip, not just interconnects. That will have the highest transistor count and the shortest path length.
Problem with stacking vertically is power delivery and cooling. For compute you can't really stack much because the heat density will be too high to cool. This is why you only see memory being stacked on top of compute.
@@limbeh3301 Chiplets with interconnects also have this problem.
WELL, THEY FRY EGGS TOO !!!!
Nvidia going to stitch waferscales together 😆
Yeah Nvidia can do the same thing 😀
Monolithic boundaries.. But smaller calculation units are better. ARM already proved that, and now the explosion of diversification will do the rest..
Co-fee?
Ko-fi
And also CAMT
You were doing okay until you said "these are not perfect conductors, they are semiconductors"
Good video otherwise, considering the channel is dedicated more towards people who are interested in stocks rather than the tech itself.
Why isn't liquid nitrogen used to cool these chips? I mean, quantum chips use liquid nitrogen, so why not these big ones?
Power consumption. It takes more energy to cool the chips, in addition to the energy to operate them. A poorly designed cooling system can add a huge expense to a data center's operations.
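The cooling-overhead point above can be sketched with a PUE-style (power usage effectiveness) calculation. All numbers below are illustrative assumptions, not vendor specs: the idea is just that total facility draw scales with the cooling overhead factor, and exotic cooling like cryogenics pushes that factor way up.

```python
# Back-of-envelope sketch (illustrative numbers, not vendor specs):
# total facility power = IT power * PUE, where PUE > 1 captures cooling
# and other overhead. Cryogenic cooling would push PUE far higher than
# the liquid loops data centers actually use.

def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total power drawn, given the chip/IT load and a cooling overhead factor."""
    return it_power_kw * pue

chip_load = 20.0  # kW, roughly a wafer-scale system's draw (assumed)

for label, pue in [("warm liquid loop", 1.2), ("chilled water", 1.5), ("cryogenic", 5.0)]:
    print(f"{label}: {facility_power_kw(chip_load, pue):.0f} kW total")
```

Even with made-up numbers, the shape of the answer is clear: a cooling scheme that multiplies power draw several-fold loses on operating cost regardless of the chip.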
Cerebras is huge, the Oompa Loompas gonna suffer
To paraphrase another commenter reacting to a different review of NVDA's Blackwell, I hope at some point there is some discussion of AVGO's (Broadcom) newly produced ASIC with 12 HBM stacks versus the 8 on Nvidia's Blackwell. While the focus seems constantly directed at NVDA's innovation, the AVGO solution reportedly provides 50% more performance in an accelerator at the same or lower price than NVIDIA's solution.
Doesn't this all lead to the push for quantum
That last point is not correct. Per unit of compute, the Cerebras system uses less power because you need less extra equipment to do the same thing. Did you not research the numbers?
Uh, we don't recall attacking Cerebras and we certainly didn't say that it used more power. What we did say, was that it is possible that a bigger chip may have an increased total cost of ownership. We gave 5 reasons why it is a challenge to make bigger chips work. Did you not watch the whole video? Context is important.
Thanks for responding.
I did. Go to the section where you talk about TCO, 16:05. I know you said "might," but it doesn't, because when you have 60x as much compute and a huge amount of on-chip memory, there is less power in total even if there is more power per chip for cooling, etc. Also, Nvidia needs dozens of chips for communication to do what one chip does. Communication is much cheaper in power if it's just baked into the chip.
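The argument in this thread reduces to a toy model: a fixed job served either by one wafer-scale part with on-wafer fabric, or by many packaged GPUs that also burn power moving data between packages. Every number below is hypothetical; the point is only that the comparison hinges on the per-chip interconnect overhead, which is why reasonable people disagree.

```python
# Toy model (all numbers hypothetical): total power for a fixed job,
# served either by one wafer-scale chip or by many GPUs that also spend
# power on off-chip links (switch chips, NICs, optics).

def cluster_power_kw(n_chips: int, chip_kw: float, interconnect_kw_per_chip: float) -> float:
    # On-wafer fabric contributes ~zero extra packaging/networking power
    # in this sketch; off-chip links add a per-chip tax.
    return n_chips * (chip_kw + interconnect_kw_per_chip)

wafer_scale = cluster_power_kw(n_chips=1, chip_kw=20.0, interconnect_kw_per_chip=0.0)
gpu_cluster = cluster_power_kw(n_chips=32, chip_kw=1.0, interconnect_kw_per_chip=0.3)

print(wafer_scale, gpu_cluster)  # which side wins depends entirely on the inputs
```

With these particular made-up inputs the GPU cluster draws more in total, but flip the chip counts or the interconnect tax and the answer flips too, which is the "might" both sides are arguing over.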
Bigger chips are hotter chips. Why is this not a concern. Is it lower energy architecture and smaller die allowance?
New techniques being used to try and keep those monster chips cool. All in the name of tearing through more data. We have some research in queue on Vertiv (VRT).
My issue with Cerebras is them being adamant they get 100% yield, which is of course BS… they will not disclose how much of the wafer is actually bad/defective. As for power, they are not lower power when running at full tilt on a normalized process… yes, there is an architectural advantage for lower power, but in the grand scheme of things it's not significant. Transistors are transistors; they need to switch, hence consume power.
Except that the Cerebras CS-3 reportedly uses 200x less energy than the fastest supercomputers currently operating in the world.
@@UltimateEnd0 15kW-25kW is not low power… there's no comparison to a supercomputer since it's not apples to apples.
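The yield dispute in this thread can be made concrete with the classic Poisson yield model, P(defect-free die) = exp(-D0 * A), where D0 is defect density and A is die area. The defect density and core counts below are assumptions for illustration: the point is that "100% yield" claims rest on redundancy, i.e. fusing off a small expected number of bad cores rather than every wafer being defect-free.

```python
import math

# Poisson yield model: probability a die of area A (cm^2) is defect-free
# given defect density D0 (defects/cm^2) is exp(-D0 * A).
# Wafer-scale parts tolerate defects by disabling bad cores, so the
# relevant quantity is expected *core* loss, not whole-die yield.
# D0 and the core count below are illustrative assumptions.

def die_yield(defect_density: float, area_cm2: float) -> float:
    return math.exp(-defect_density * area_cm2)

d0 = 0.1  # defects per cm^2 (assumed)

# A large monolithic die near the reticle limit (~8 cm^2):
print(f"monolithic die yield: {die_yield(d0, 8.0):.1%}")

# A tiny redundant core (~0.01 cm^2) on a wafer-scale part:
core_yield = die_yield(d0, 0.01)
print(f"per-core yield: {core_yield:.4%}")
print(f"expected dead cores out of 850,000: {850_000 * (1 - core_yield):.0f}")
```

Under these assumptions the monolithic die yields under half, while the wafer-scale part loses only a fraction of a percent of its cores, consistent with both sides here: the wafer is not defect-free, but nearly every wafer is usable.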
Is privately owned so it's not for the public
Im getting in early 😅
so….. is cerebras a threat to nvidia or not?
No probably not
Yes, they are
When will cerebras go public?
Sounds like they're planning for an IPO in October
These people are AI generated. Show me your hands!
Why.... do.... you..... talk.... like.... robots?
Intel
Just skip to 12:44
this video was such a waste of time, I think I will unsub
Camtek (NASDAQ:CAMT) said Monday it has received a new order for about $25 million from a tier-1 HBM manufacturer, for the inspection and metrology of High Bandwidth Memory.