The Special Memory Powering the AI Revolution

  • Published Sep 28, 2024

Comments • 207

  • @Hey1234Hey
    @Hey1234Hey 10 months ago +63

    I have zero knowledge about all of this, but you always make it so much more understandable. You're really good at teaching and presenting. Take this from a dyslexic like me who finds it difficult to understand new topics right away.

  • @swmike
    @swmike 10 months ago +42

    You're the only one I've heard who says it like "dram". Everybody else calls it "dee-ram", same as SRAM is "ess-ram". HBM is used in data communications buffer applications as well. Thanks, keep up the great work!

    • @poofygoof
      @poofygoof 10 months ago +13

      a "dram" always makes me think of whisky

    • @Gameboygenius
      @Gameboygenius 10 months ago +26

      It's his thing. (Closed, WONTFIX)

    • @xungnham1388
      @xungnham1388 10 months ago +4

      There are some people who first read a word a certain way in their head and just stick with that first reading even though they hear everyone else pronounce it differently. I don't know what goes through their heads to reconcile the difference; maybe they hear it one way and, while they think they pronounce it the same way, it comes out the original way. Maybe they know it's mispronounced and are trying to make fetch a thing. Maybe his first love is whisky. He should make a video on it; I'm sure it could be 15 minutes long and everyone would watch it through.
      Coincidentally though, in the cycling world there is a parts manufacturer called SRAM. It's an acronym of the founders' initials, and its official pronunciation is not "s-ram", not "shram", but "sram", as close to one syllable as you can pull off. I have no idea why a company would name itself or one of its products something that flies in the face of conventional English pronunciation.

    • @Steven_Edwards
      @Steven_Edwards 10 months ago +6

      I've heard both ways over 20 years. It doesn't matter.

    • @anonymous.youtuber
      @anonymous.youtuber 10 months ago +6

      What about the people talking about DRAM memory ? Don’t they know the M in DRAM stands for Memory ? 🤔

  • @qwaqwa1960
    @qwaqwa1960 10 months ago +31

    I've always called DRAM "deeram". Never heard anyone call it "dram".

    • @clapanse
      @clapanse 7 months ago +3

      That's how I've always heard it, including in the computer hardware industry.

    • @THEfogVAULT
      @THEfogVAULT 4 months ago +1

      Same, but I kind of like it now that I've heard it.

  • @craigcarlson4720
    @craigcarlson4720 9 months ago +2

    Little did I know the Titan V that I bought with award money in grad school turned out to be a very sophisticated piece of hardware. HBM FTW!

  • @MostlyPennyCat
    @MostlyPennyCat 10 months ago +3

    1:09 But HBM has been around for years; GPUs in 2016 had it.

  • @Jaxck77
    @Jaxck77 10 months ago +3

    As someone who has worked in big data for years: there's no revolution lol. What there is, is some very loud marketing teams and scam investments.

  • @postmanpat7571
    @postmanpat7571 10 months ago +4

    Reading through Computer Architecture: A Quantitative Approach at the moment and was just reading about HBM; sure enough, you've released a video on it. Thanks for the great explanation :). On another note, have you done a video on HAMR and MAMR HDDs? Would love to see a comparison of the competing technologies, especially since Seagate shipped the first commercial HAMR drives this year.

  • @johndoh5182
    @johndoh5182 10 months ago +3

    HBM memory isn't new. HBM3 memory is new.

  • @andrewervin2679
    @andrewervin2679 10 months ago +7

    I remember them releasing this back in the day saying "IT will change GPUs forEVER". And it obviously has, just not the way AMD envisioned it. But HBM3E hitting mass production next year is CRAZY! Other than AI and collision avoidance/detection systems for cars, I'm not sure what else would have use for this massive bandwidth. Good stuff.

  • @naga2015kk
    @naga2015kk 10 months ago +1

    Watching this, my thoughts wandered to the scam called "unified memory"
    and the word "effective".

  • @albuslee4831
    @albuslee4831 10 months ago

    Beautiful river inserts.

  • @alfanick
    @alfanick 9 months ago

    Good video, but HBM has been here for a while; a bunch of comments give examples but skip the most commercially successful one: "Apple Silicon" [aka the ARM-based CPU/SoC] uses HBM. A whole bunch of people use HBM for cheap without realising it.

  • @benceze
    @benceze 10 months ago

    I wondered why Apple didn't use HBM in its silicon till I realized it probably uses more power than they intend.

  • @supremebeme
    @supremebeme 10 months ago

    those AMD fury cards were wild, so tiny!

  • @mapsofbeing5937
    @mapsofbeing5937 10 months ago

    It's so surprising to hear someone sound like they've been looking into memory and then pronounce DRAM "dram" instead of "D-ram".

  • @vincentvaudelle7772
    @vincentvaudelle7772 10 months ago

    Would be nice to get a video on Atomera's MST technology applicable for DRAM and many others

  • @SilverScarletSpider
    @SilverScarletSpider 10 months ago +1

    I thought “The Special Memory Powering the AI Revolution” was going to be something fascinating like an easter egg hardcoded into every AI’s code saying, “Change the world. My final message.” 🐀

  • @xungnham1388
    @xungnham1388 10 months ago +1

    If HBM was developed by AMD, how does Rambus factor into all of this?

  • @jons8471
    @jons8471 10 months ago

    JEDEC also does the NVMe standard.

  • @erickojuaya
    @erickojuaya 10 months ago

    Great as always but today your microphone was a bit down, really struggled to listen

  • @ipdavid1043
    @ipdavid1043 10 days ago

    missing an E on HBM3E

  • @TwoTreesStudio
    @TwoTreesStudio 10 months ago

    I bet it just does matrix multiplication slightly faster

  • @alexis1156
    @alexis1156 10 months ago +1

    Why is it so hard to make memory that is both really fast and permanent?

    • @submachinegun5737
      @submachinegun5737 10 months ago +7

      If you want your memory to be fast, direct connections between memory and logic gates make for fast access times, but they require power or the memory will be wiped, which is what RAM is. If you want a drive that keeps data after the power turns off, you have to use a solution like a disc, which retains data but takes a long time to access. Engineering is all about tradeoffs, so I'm sure there are fast-access permanent memory designs, but they cost more in space and transistor count or have some other downside, otherwise they'd be used instead.

    • @alexis1156
      @alexis1156 10 months ago +1

      @@submachinegun5737 I see, thanks.

    • @Gameboygenius
      @Gameboygenius 10 months ago +2

      If you want memory to be permanent, as in retaining its state when powered off, you typically need to modify the properties of the material. In the case of flash memory, that means charging the floating gate of a memory cell's transistor, which is done indirectly. In the case of a hard drive, it means changing the magnetic polarity of the grains in the material. These changes take a bit more time. For RAM you "just" need to charge up a simple capacitor (DRAM) or turn on a pair of transistors (SRAM), which means it can be much faster.
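
      For a rough sense of the tradeoff described above, here is a minimal Python sketch comparing order-of-magnitude access times for these cell types. The numbers are ballpark assumptions for illustration, not datasheet values.

        # Illustrative, order-of-magnitude access times for the storage
        # mechanisms discussed above (assumed ballpark figures).
        cells = [
            ("SRAM (cross-coupled transistors)",        1),          # ~1 ns
            ("DRAM (charge on a tiny capacitor)",       60),         # tens of ns
            ("NAND flash (charge on a floating gate)",  50_000),     # tens of us
            ("HDD (magnetic grains, mechanical seek)",  5_000_000),  # ms range
        ]
        base = cells[0][1]
        for name, ns in cells:
            print(f"{name:<42} ~{ns:>10,} ns  ({ns // base:,}x SRAM)")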

  • @sefalibhakat143
    @sefalibhakat143 10 months ago +1

    Please help me, guys. I am an engineering student. Should I go for the semiconductor industry or the software industry? I have the following doubts:
    1) Do people earn more in the software industry than in the semiconductor industry for the same amount of work?
    2) Is the semiconductor industry now at its peak and going to fall in the future?

    • @lbgstzockt8493
      @lbgstzockt8493 10 months ago +2

      1) Yes, but not everywhere. Sure, the Google intern in SF may earn six figures, but I doubt that is going to continue for much longer now that VC cash is no longer free. We have seen big waves of software engineers getting fired, and I don't think this will stop. If you live anywhere else in the world the delta is probably much smaller, but it highly depends on who you work for and how well the economy is doing.
      2) I cannot say. There are always improvements and constant research into how to make faster, cheaper and better chips, so I don't think the industry is at its peak. If you work for the right people you can make a lot of lateral moves in the industry and always work in an interesting and lucrative role. Unfortunately it is also an industry with boom and bust phases, so the future may look bleak, at least in the short term.
      I would look at which industry you personally find more interesting. Just chasing the higher paycheck is a great way to land in a job you don't care about with bad working conditions, too tired and sad to use your 10% higher income. If you just cannot decide, you should look into doing an internship; not only will it look great on your resume, it will make the decision a lot easier.

    • @marcviej.5635
      @marcviej.5635 5 months ago

      Semiconductors all the way. The semiconductor industry just shifts to the next generation; the next generation seems to be AI-specific semiconductors, and after that there will be the next big new thing. It keeps on innovating. To me the software industry seems too saturated, and with the rise of AI many software tasks will be replaced by AI basically writing software itself, but it will all be powered by semiconductors.

  • @thinhxuan5918
    @thinhxuan5918 10 months ago

    If the HBM dies sit much higher than the GPU die, how do you keep both of them cool? I mean, how do you design a nicely fitting cooling surface for both?

    • @poofygoof
      @poofygoof 10 months ago +1

      precision machined heat spreader that takes the differing heights into account, or separate heat spreader(s) maybe?

  • @TheAleksander22
    @TheAleksander22 10 months ago

    Mordor Intelligence 🤣

  • @HousmanRita-b3k
    @HousmanRita-b3k 22 days ago

    Thomas George Davis Scott Miller Frank

  • @symbolsandsystems
    @symbolsandsystems 10 months ago

    is it possible for AI to lie?

  • @Hobbes4ever
    @Hobbes4ever 10 months ago +292

    AMD first used HBM1 in their flagship gaming GPU, the R9 Fury, in 2015. So gamers knew about HBM long before the current AI hype.

    • @unvergebeneid
      @unvergebeneid 10 months ago +45

      Yeah I was a bit confused when he said HBM3 is the first type that's commercially available.

    • @Noname-km3zx
      @Noname-km3zx 10 months ago +10

      @@unvergebeneid Made zero sense 🤷‍♂

    • @electroflame6188
      @electroflame6188 10 months ago +55

      @@unvergebeneid He said that it's the first HBM3 that's commercially available.

    • @imeakdo7
      @imeakdo7 10 months ago +4

      So some gamers knew. Most I've seen have never heard of HBM memory

    • @HidingAllTheWay
      @HidingAllTheWay 10 months ago +13

      Yeah, he literally says that in the video at 9:00.

  • @Thornbeard
    @Thornbeard 10 months ago +4

    Your reporting on Micron is pretty dismissive. They are currently doing HBM3E and working on HBM4 and HBM4E. Add to that the new fabs they are building, and you can see that SK Hynix and Samsung are going to have some serious competition in the HBM market. Micron was slow to adopt HBM, but they are being fast to build new fabs for it. I mean, just look at Micron's market cap; they are about 10 billion larger than SK Hynix.

  • @poi159
    @poi159 10 months ago +105

    I was an early adopter of HBM graphics cards, from the AMD Fury and then the AMD Vega 64 series. They were cutting edge and I wish it had caught on back in the day. I'm glad it's still alive and now in demand, and I can't wait to see what consumer applications will come out of it.

    • @brodriguez11000
      @brodriguez11000 10 months ago +5

      My Vega still performs to this day. The only bad thing is the memory capacity for newer games.

    • @Jaaxfo
      @Jaaxfo 10 months ago +17

      The reason we don't see it in consumer GPUs now is because of how good they are for these mega GPUs for data centers. It didn't so much "die" as we got priced out of getting access to it

    • @AlexSchendel
      @AlexSchendel 10 months ago +16

      There's very little in the future for HBM on the consumer side. The reason it never saw adoption except for a few AMD cards (and even then there hasn't been a new AMD consumer card with HBM since the Radeon VII in 2019) is cost. HBM is way more expensive than GDDR. You're taking larger silicon dies and stacking them on top of each other with much more expensive TSVs, versus just taking single, smaller dies and placing them around the board. Also, the memory controllers on the GPU silicon are more expensive because you need a physically larger package to connect all the pins. As mentioned in the video, HBM gets its "High Bandwidth" by simply brute-forcing an extremely wide but relatively slow data bus. Good for energy efficiency, but abysmal for silicon cost. For datacenter cards that actually benefit from and can afford huge amounts of VRAM *and* bandwidth, it makes sense. Consumer cards get all the bandwidth and capacity they need with cheaper GDDR, and it benefits silicon cost as well by requiring a smaller memory controller with significantly fewer pins.

    • @nutzeeer
      @nutzeeer 10 months ago +1

      I bet apple would use it

    • @AlexSchendel
      @AlexSchendel 10 months ago +12

      @@nutzeeer Apple. The company which charges $200 for 8GB of commodity DRAM which otherwise sells for ~$20? I'm sure they would be happy to sell you HBM if they could get you to pay $10k per 8GB of it...
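
      To put rough numbers on the "wide but slow" point made earlier in this thread: peak bandwidth is just bus width times per-pin data rate. A minimal Python sketch using approximate public ballpark figures (assumed, not from the video):

        # Peak bandwidth (GB/s) = bus width in bits * per-pin rate in Gbit/s / 8.
        def peak_gb_s(bus_bits, pin_gbit_s):
            return bus_bits * pin_gbit_s / 8

        gddr6_card = peak_gb_s(8 * 32, 16.0)   # 8 chips x 32-bit, fast pins
        hbm3_stack = peak_gb_s(1024, 6.4)      # one stack, wide but slower pins

        print(f"GDDR6, 256-bit board: {gddr6_card:.0f} GB/s")       # ~512 GB/s
        print(f"HBM3, single stack:   {hbm3_stack:.0f} GB/s")       # ~819 GB/s
        print(f"HBM3, five stacks:    {5 * hbm3_stack:.0f} GB/s")   # ~4 TB/s class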

  • @josho6854
    @josho6854 10 months ago +6

    DRAM is pronounced "dee-ram". I worked in the DRAM industry for 20 years, and everyone pronounces it this way. I like your content, keep up the good work.

  • @communistpoultry
    @communistpoultry 10 months ago +3

    AMD first used HBM1

  • @nitroxinfinity
    @nitroxinfinity 10 months ago +6

    Isn't GDDR 32 bits per chip? With a 256-bit memory bus a video card usually has 8 chips: 8 x 32 = 256.
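
    That arithmetic checks out, and the same addition explains the HBM bus widths discussed elsewhere in the comments; the interface widths below are typical assumed values (GDDR6 chip = 32 bits, HBM stack = 1024 bits):

      # Total bus width = number of devices * bits per device interface.
      gddr_bus = 8 * 32     # eight GDDR chips around the GPU   -> 256-bit bus
      hbm_bus  = 4 * 1024   # four HBM stacks on the interposer -> 4096-bit bus
      print(gddr_bus, hbm_bus)  # 256 4096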

  • @dangertomarketing
    @dangertomarketing 10 months ago +2

    All this memory was designed by AMD's memory team led by Joe Macri. GDDR, HBM etc. all came from ATi Technologies, i.e. AMD. So please do better research in the future. HBM is almost a decade old. As @Hobbes4ever wrote, AMD used HBM1 eight years ago, and it is widely used in GPUs, FPGAs etc. The first NVIDIA Tesla with HBM was Pascal (P100) from 2016.

  • @qr2847
    @qr2847 11 months ago +49

    I love all of your DRAM content. Your work effectively ends up being a seed crystal for my own research. Looking forward to a Micron and NAND manufacturing series one day.

    • @TreeHugg
      @TreeHugg 10 months ago +6

      How is anyone supposed to comment before this mans

    • @stevebabiak6997
      @stevebabiak6997 10 months ago

      @@TreeHugg - that guy probably pays that $6 per month to get early access.

    • @AnIdiotAboard_
      @AnIdiotAboard_ 10 months ago

      @@TreeHugg Patreon, so they get early access I believe. Or it could have been a Patreon-only video that has now been shared.

    • @GewelReal
      @GewelReal 10 months ago +5

      a mf MONTH ago

    • @erickojuaya
      @erickojuaya 10 months ago +2

      How is it a month ago??

  • @delfinigor
    @delfinigor 10 months ago +3

    The Radeon VII has 16 GB of HBM2 memory.

  • @ToniT800
    @ToniT800 10 months ago +3

    @6:24 you say "compared to GDDR 64 bit", but the slide says x32 bit

  • @poofygoof
    @poofygoof 10 months ago +8

    Intel Knights Landing had stacked memory (MCDRAM), Xeon Max (SPR HBM) has HBM2, and Intel/Altera has Stratix 10 MX FPGAs with HBM2. It will be interesting to see if ML is the killer app that drives wider deployment.

  • @robertoguerra5375
    @robertoguerra5375 10 months ago +6

    Thank you for your new video :)
    You should also look at IBM's new AI chip "NorthPole"… the processing cores are sprinkled between the RAM blocks.

  • @k3salieri
    @k3salieri 10 months ago +7

    Man, the Radeon 7 just came out ahead of its time.

  • @tubaterry
    @tubaterry 10 months ago +32

    I really appreciated the comment about the ecosystem and "working through the newness" - I work in cloud software development and we still see a *lot* of this problem in the field. Every once in a while the difficult part is the technology, but more often the difficult part is getting everyone to play nicely together so we can have nice things, like Kubernetes that works out of the box, or HBM3
    Sometimes I think we get focused on the competition of who came up with a new technology first, or who implemented it best. But realistically, nobody's gonna buy your thing if they can't make it work with their thing. You need to have a very cooperative mindset to work on the cutting edge.

    • @coraltown1
      @coraltown1 10 months ago

      effective, efficient communication .. lack thereof too often a bottleneck

    • @ryandick9649
      @ryandick9649 10 months ago

      Say what you will about Hector Ruiz, but one of his most effective ideas was the Virtual Gorilla strategy to compete with Intel. Those same relationships, and the process of forming partnerships for advances and innovations, were something that Bryan Black used effectively and at scale to achieve this sort of positive outcome.

  • @dummiesgoogr4357
    @dummiesgoogr4357 9 months ago +1

    Nvidia’s H100 uses HBM2. They created H200 recently with HBM3 to compete as AMD’s first foray into AI was always planned with HBM3.

  • @JoshHoppes
    @JoshHoppes 10 months ago +5

    The other place I've seen HBM used is as an alternative to TCAM for routers that need route scale. Juniper Networks has done this with their Express series.
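
    For context on what that replaces: a TCAM answers a longest-prefix-match in one parallel lookup, whereas holding the routing table in ordinary (or high-bandwidth) memory means the lookup is done algorithmically. A minimal Python sketch with made-up routes and next-hop names:

      import ipaddress

      routes = {
          "0.0.0.0/0":    "core-uplink",   # default route
          "10.0.0.0/8":   "dc-fabric",
          "10.42.0.0/16": "rack-42",
      }

      def longest_prefix_match(dst):
          addr = ipaddress.ip_address(dst)
          best_len, best_hop = -1, None
          for prefix, hop in routes.items():
              net = ipaddress.ip_network(prefix)
              if addr in net and net.prefixlen > best_len:
                  best_len, best_hop = net.prefixlen, hop
          return best_hop

      print(longest_prefix_match("10.42.7.9"))  # rack-42
      print(longest_prefix_match("8.8.8.8"))    # core-uplink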

  • @ALZlper
    @ALZlper 10 months ago +3

    I got an employment ad from Zeiss before this 😄

  • @SlowMtbRider
    @SlowMtbRider 10 months ago +7

    Seeing the title, I thought you were about to talk about the new IBM AI chip that mixes/embeds memory close to logic to mimic neurons :)
    Still a very cool video, amazing work!

  • @n45a_
    @n45a_ 10 months ago +7

    I wish I could use my Radeon VII for AI but it doesn't have CUDA cores. Its 16 GB of HBM would come in handy.

    • @joaovitorsilvagohl682
      @joaovitorsilvagohl682 10 months ago +7

      Rocm doesn't work init?

    • @hurricanepootis562
      @hurricanepootis562 10 months ago +1

      What about ROCm?

    • @bluestone-gamingbg3498
      @bluestone-gamingbg3498 10 months ago +2

      @@hurricanepootis562 AMD's equivalent to CUDA cores

    • @n45a_
      @n45a_ 10 months ago

      Rocom is lacking ngl

    • @GewelReal
      @GewelReal 10 months ago +1

      ​@@n45a_rocom deez

  • @soylentgreenb
    @soylentgreenb 10 months ago +1

    An interesting thing about DRAM is that the memory cells have scaled just about fuck all in 20 years. DDRx and GDDRx just use multiplexing: if I take two memory cells and multiplex between them, I get output at twice the frequency. That's how higher memory bus speeds were created; but each memory cell is just as dog-shit slow as it was 20 years ago, so CAS latency doubles every generation. The beauty of HBM is that it just admits this basic fact and makes the bus mega-wide.
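
    A quick worked example of that point: CAS latency in nanoseconds is CL cycles divided by the memory clock (half the MT/s transfer rate for DDR). With typical timings, assumed here for illustration, the cycle count doubles each generation while the time to first data barely moves:

      # CAS latency in ns = CL / (MT/s / 2) * 1000; assumed typical timings.
      parts = [
          ("DDR3-1600, CL10", 1600, 10),
          ("DDR4-3200, CL16", 3200, 16),
          ("DDR5-6400, CL32", 6400, 32),
      ]
      for name, mt_s, cl in parts:
          latency_ns = cl / (mt_s / 2) * 1000
          print(f"{name}: {latency_ns:.1f} ns to first data")
      # DDR3-1600, CL10: 12.5 ns ... DDR5-6400, CL32: 10.0 ns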

  • @moldytexas
    @moldytexas 10 months ago +3

    I've worked on a certain big automotive manufacturer's certain semiconductor project, and the architecture is designed in a way to be modular and grow around multiple Samsung HBM3 dies, in order to facilitate their future autonomous driving tech. I can fairly say that this tech is only on its way to the moon now, as development is in full swing. Server applications aside, automotive is indeed where HBM3 is in dire need.

  • @Pedro-pn8fn
    @Pedro-pn8fn 10 months ago +1

    "Just another day in the volatile memory business" lol

  • @tomholroyd7519
    @tomholroyd7519 10 months ago +1

    9:33 WHY does that building look like that? Are they collecting rainwater for something?

  • @littlejam5984
    @littlejam5984 10 months ago +3

    Always wondered what happened to HBM, because I hadn't heard about the standard since AMD implemented it in their Fury cards back in the day.

  • @falconeagle3655
    @falconeagle3655 10 months ago +2

    AMD has been using HBM for a long time, even in consumer GPUs.

  • @stranger01422
    @stranger01422 10 months ago +1

    Didn't AMD use HBM in one of their gaming GPUs in like 2015 or something? I am confused, or was that another thing?

  • @elforeign
    @elforeign 10 months ago +4

    Thank you for covering this topic! HBM is a really cool technology

  • @lucasfernandesgrotto6279
    @lucasfernandesgrotto6279 10 months ago +2

    Does AMD gain something from having shared the IP, or are the manufacturers the ones who keep all the profits?

    • @kazedcat
      @kazedcat 10 months ago

      They probably have IP licensing revenue but that is peanuts compared to their other businesses. Their gaming console revenue alone would dwarf their IP revenue and that console chip business is the lowest margin segment of their income. AMD has a lot of businesses.

    • @lucasfernandesgrotto6279
      @lucasfernandesgrotto6279 10 months ago

      @@kazedcat It's a little bit crazy to me that the main memory standard used in AI GPUs was conceived by AMD and not Nvidia. How did that even happen 😭??

    • @kazedcat
      @kazedcat 10 months ago

      @@lucasfernandesgrotto6279 AMD likes to bet on technology; sometimes it works, sometimes it doesn't. Bulldozer was a failure, but they were trying something new with that architecture. Chiplets on CPUs work and are now giving them an advantage over Intel. Chiplets on gaming GPUs are kind of not working, but maybe they can make them work later. HBM was one of those bets: the technology works, but the financial side was not working for them at the time.

    • @Gameboygenius
      @Gameboygenius 10 months ago

      @@lucasfernandesgrotto6279 "you get to do the memory, and we'll do compute."

  • @sdstorm
    @sdstorm 10 months ago +3

    Please stop calling it draham. It's dee-ram. We all know it.

  • @mytech6779
    @mytech6779 10 months ago +1

    They could open a huge market for HBM (assuming production can scale) if there were a decent open interface/programming standard for GPU-based (and related FPGA, etc.) accelerator cards intended for non-video tasks.
    CUDA works alright but is completely proprietary, and it's a bit dated considering new hardware possibilities and tasks.
    I know AMD has led a few industry groups to do this, but honestly it seems very half-hearted and historically very poorly supported by real products. Intel hasn't indicated anything beyond basic graphics-targeted GPU products. (They had Xeon Phi for a few years, but that was more like a bunch of simplified x86 E-cores stuffed on a PCI card with 4-way hyperthreading per core.)

  • @TheJagjr4450
    @TheJagjr4450 10 months ago +2

    Thanks for the content... your vids are highly informative and answer a number of questions I have had regarding how high tech is fabbed.
    Given the complexities in the manufacturing, do you believe a more vertically integrated approach is better, i.e. acquiring all the talent in house, either through hiring or buying out suppliers and/or vendors, allowing more complete integration from day one, vs. having to find companies to integrate different parts and test together AFTER issues are found, vs. simulating and testing beforehand?

  • @corkkyle
    @corkkyle 6 months ago

    Micron HBM3 news: "Our HBM is sold out for calendar 2024, and the overwhelming majority of our 2025 supply has already been allocated," said Sanjay Mehrotra.

  • @jlacoss549
    @jlacoss549 10 months ago

    (yawn)
    I’m trying to recall when I saw Irvine Sensors memory stack.
    Maybe 1986?
    TSVs are notoriously capacitive.
    Heat dissipation is an abiding problem…

  • @SirMo
    @SirMo 9 months ago

    The history starts with the JEDEC spec, but totally skips the fact that AMD and Hynix worked for 7 years to bring this technology to market and to JEDEC. Not sure why people always omit this fact. HBM and its first 2.5D product, the R9 Fury, are what sparked AMD's chiplet revolution.

  • @PunmasterSTP
    @PunmasterSTP 4 months ago

    TSV? More like "Terrific information that everyone should see!" 👍

  • @douro20
    @douro20 several months ago

    Someone at Samsung really likes Piet Mondriaan...

  • @theworddoner
    @theworddoner 10 months ago +2

    The new H200 from Nvidia, from my understanding, is essentially the H100 but with more memory bandwidth.
    Changing that alone made these devices a lot faster for AI. They can generate LLM responses at greater tokens per second.
    Nothing new about the chip, just faster memory.
    I'm often very critical of Nvidia for segmenting gamers with laughably small VRAM. These are powerful cards being curtailed by ridiculous VRAM limitations.
    I wish Nvidia made more AI professional cards using the Ampere lineup. Samsung 8nm is more mature and should be a lot cheaper now. Why not make an RTX 3070 with 70 GB of VRAM? It's not ideal for training but great for inferencing. A severely cut-down 3050 (Jetson Orin) with a few accelerators can do 70B LLMs at 4 tokens per second. A 3070-class chip is twice as powerful and could potentially do 8 tokens per second. That's readable speed!
    Edge inference is something we'll need for AI. This is not an impossible task. We can do this easily with current hardware. I just wish someone would cater to this market.

    • @MikeGaruccio
      @MikeGaruccio 10 months ago

      Yeah, memory bandwidth is basically everything for LLM inference. To the point that the H20 (Chinese market card) manages to outperform the H100 on most inference workloads with something like 10-15% of the compute.
      The problem with a 3070 with 70 GB of memory is that that much memory would still make for a pricey card, and that much GDDR6 is power hungry, especially when paired with older 8nm silicon. That would make a card like that a non-starter for the type of hardware typically in use at the edge.
      Edge inference is on their minds at Nvidia, but for now it's on cards like the L4, which runs much cooler and fits in a much smaller footprint than a 3070. That's only 24 GB of memory, but that's still enough for a lot of the models you'd actually want at the edge right now.
      If you're looking for something that works for you personally, locally, the 128 GB variant of the new Apple M3 chip is extremely compelling: decent GPU performance, and because the memory space is unified you can actually load the full model without having it use up 2x memory.

    • @theworddoner
      @theworddoner 10 months ago

      @@MikeGaruccio you’re right that 70gb of vram will have its own limitations but it’s not like it’s a brand new problem.
      Nvidia already does something similar with the Jetson Orin. It’s a severely cut down 3050 with a few accelerators and 64gb of unified memory. It has an arm chip as well. That can do 3-4 tokens per second with 70b llms.
      It is still expensive but there can be further cost savings if they get rid of all the robotics connectors etc. Right now, it’s about half the cost of a used m1 ultra.
      They can probably offer something like this for about $2k. It’s a price point a lot of people running local llms are willing to pay if we can get decent response time with 70b llms.
      I don’t want to play around with multi gpu setups as they’re not reliable and too inefficient. There are power savings as well if we have a dedicated card/device for inferencing.
      Something like this needs to be readily available for edge inferencing.

    • @MikeGaruccio
      @MikeGaruccio 10 months ago

      @@theworddoner the difference here is still the memory. The Orin is 64GB of lpddr5 vs. a hypothetical card with a comparable amount of gddr6. Those are completely different price points and power draws.
      A 3070 with that much memory would be pulling at least 350w, so it doesn’t really work for the edge (basically need passive cooling and
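
      To make the bandwidth-bound point in this thread concrete: batch-1 LLM decoding has to stream roughly all of the weights once per generated token, so tokens/s is capped at about memory bandwidth divided by the size of the weights. A minimal sketch with assumed ballpark figures (70B parameters at ~4 bits per weight, approximate public bandwidth specs):

        # tokens/s ceiling ~= memory bandwidth (GB/s) / weight footprint (GB)
        weights_gb = 70e9 * 0.5 / 1e9   # ~35 GB at 4-bit quantization

        devices = {
            "Jetson AGX Orin (LPDDR5)": 204.8,
            "RTX 3070-class (GDDR6)":   448.0,
            "H100 SXM (HBM3)":          3350.0,
        }
        for name, bw_gb_s in devices.items():
            print(f"{name:26s} ~{bw_gb_s / weights_gb:5.1f} tokens/s ceiling")
        # Orin lands near the 3-4 tokens/s observed above; HBM3 is ~16x that.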

  • @aBigBadWolf
    @aBigBadWolf 10 months ago

    can you not just close your window? The topic is fairly complex and I get distracted by the cars honking in the background ...

  • @blitzerblazinoah6838
    @blitzerblazinoah6838 10 months ago +1

    Since the Atari 2600+ is out today, any chance of a video on Atari's buyout by Warner Communications/Atari VCS's early days/Nolan Bushnell's departure?

  • @rb8049
    @rb8049 10 months ago +1

    In 10 years H100 power will be on your phone and children will be using it to communicate with other animals and convince their parents to give them more time before sleeping.

  • @JonahNeff
    @JonahNeff 6 months ago

    For the love of God, stop saying "dram". It's D RAM.

  • @jeremyelser8957
    @jeremyelser8957 10 months ago

    are you still posting podcasts? I haven't seen the videos showing up in audio format.

  • @liquidmobius
    @liquidmobius 10 months ago

    It's pronounced dee-ram. You say the letter "D" then "ram".

  • @CM-mo7mv
    @CM-mo7mv 10 months ago

    my professor would fail you for implying bitrate is the same as bandwidth. it is not!

  • @parth7501
    @parth7501 4 months ago

    I liked the "volatile memory business" pun.

  • @danytoob
    @danytoob 10 months ago +1

    I can always find the big brain subject matter, explained to a level I can almost begin to ingest right here ... and still leaves me fascinated, ready for more. As always ... Thank you!

  • @Clancydaenlightened
    @Clancydaenlightened 10 months ago +1

    Well, the only thing faster is SRAM, because it's asynchronous and doesn't need to be reminded of what it needs to remember.
    Since it can operate at whatever the bus clock is, it's much faster than DRAM.
    64-bit SRAM is expensive though, but if you've got money, $1000+ per gigabyte x64-bit isn't really expensive, especially for "research and development".
    The problem with SRAM is how you statically store a bit: how can you build a latch or JK/SR flip-flop using fewer than 7 transistors?
    Float a FET gate to set, drain the gate to reset. Once a FET is on, it typically stays on until the gate charge is removed, usually with a pull-down, especially if you apply a bias to it; the gate can be designed specifically for this.
    So possibly you could make SRAM using 2 or 3 transistors, since the set/reset logic drives the gate and the voltage on source and drain is the bit you read on/off in a matrix.
    Or use an SCR or thyristor and a FET to set/reset that, and possibly use two active devices, maybe two FETs and one SCR.

    • @Gameboygenius
      @Gameboygenius 10 months ago

      The advantage of the traditional 6-transistor SRAM cell is power consumption. You can make a 4-transistor cell where you replace two transistors with pull-ups. However, those pull-ups (one or the other, depending on the state of the cell) will permanently drain energy, whereas the 6T cell has zero static current other than leakage through the SiO2. It's also more complicated to manufacture, since you need to fit a high-value resistor above the transistors on the same footprint, or you will effectively have negated the space saved by reducing the number of transistors. And with any fewer transistors than that, you've effectively built a DRAM cell, with similar considerations as any other DRAM.
      Bigger picture: what are you going to use the imagined "R&D" device for? Economy and scalability are everything. There probably aren't many scenarios where giant SRAM makes sense over buying more compute power and/or DRAM and parallelizing the tasks. Certainly not for producing millions of phones or thousands of AI devices for a datacenter. The realization in the discussion about HBM is that sometimes it isn't the silicon that's the limiting factor, but the system design and the interconnects.

    • @vylbird8014
      @vylbird8014 10 months ago

      @@Gameboygenius I think if some genius did invent a better SRAM, it would immediately be applied to making bigger, faster caches on processors.
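
      On the "how does a static cell hold a bit without refresh" question raised at the top of this thread: a static cell is essentially two cross-coupled inverting gates whose feedback holds the state for as long as power is applied. A tiny gate-level sketch of a NOR-based SR latch (illustrative only, not a model of the 6T circuit):

        def nor(a, b):
            return 0 if (a or b) else 1

        def sr_latch(s, r, q, qn, settle=4):
            # Iterate the feedback loop a few times until the outputs settle.
            for _ in range(settle):
                q, qn = nor(r, qn), nor(s, q)
            return q, qn

        q, qn = 0, 1
        q, qn = sr_latch(s=1, r=0, q=q, qn=qn)  # set   -> q = 1
        q, qn = sr_latch(s=0, r=0, q=q, qn=qn)  # hold  -> q stays 1, no refresh
        q, qn = sr_latch(s=0, r=1, q=q, qn=qn)  # reset -> q = 0
        print(q, qn)  # 0 1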

  • @nexusyang4832
    @nexusyang4832 10 months ago +1

    Woah! An early upload! Well I guess it is late in Taiwan. ;)

  • @tomholroyd7519
    @tomholroyd7519 10 months ago +1

    complex three dimensional crystals with both memory and logic

  • @EdgyNumber1
    @EdgyNumber1 10 months ago

    What is AMD using in their graphics cards now?

  • @nekomakhea9440
    @nekomakhea9440 10 months ago +1

    I really hope alternative memory tech like PIM, HBM, or CAMM catches on in the consumer sectors, they're really cool and there's only so far that DDR + DIMM can keep pushing clocks before hitting the power wall. PIM HBM would be really fun to play with

    • @jamesbuckwas6575
      @jamesbuckwas6575 10 months ago +1

      I really appreciate CAMM for pushing the limits of memory on mobile devices to where soldered memory is not worth the lack of upgradeability in almost all use cases.

  • @madson-web
    @madson-web 10 months ago

    VEGA with a more specific purpose

  • @WaszInformatyk
    @WaszInformatyk 10 months ago

    8:31 deep reactive anisotropic ion etching
    it sounds like a gobbledygook from some Sci-Fi series explaining a machine for time travel ;-)

  • @brunolimaj7129
    @brunolimaj7129 10 months ago +1

    I appreciate your videos man, thank you!

  • @igormarcos687
    @igormarcos687 10 months ago

    I thought DDR ended with the fall of the soviet onion

  • @disadadi8958
    @disadadi8958 10 months ago +2

    I've heard of HBM before; AMD used it on their gaming GPUs. A 4096-bit memory bus is all I remember.

    • @kazedcat
      @kazedcat 10 months ago +3

      It is very good technology but expensive. Gamers could not afford them but AI companies could.

    • @disadadi8958
      @disadadi8958 10 months ago +1

      @@kazedcat I mean... gamers kinda could. The Radeon cards just didn't have enough performance otherwise. The 16 GB of HBM was the star of the show, and because of that the card is still pretty usable.

    • @kazedcat
      @kazedcat 10 months ago +1

      @@disadadi8958 If I remember correctly, their HBM GPU was late to market, which is why the performance was not competitive. They then switched back to using GDDR in the next generation because HBM is expensive. HBM then found its niche in the embedded market until the AI boom.

    • @PainterVierax
      @PainterVierax 10 months ago

      @@kazedcat It's more about: AI companies are in demand and willing to pay any astronomical price, so why not prioritize them over the consumer market and other low-margin markets like embedded.

    • @disadadi8958
      @disadadi8958 10 months ago +1

      @@kazedcat I guess it depends on the model. For the Radeon VII certainly; it missed the mark against the RTX 2000 series cards. However, the Vega 56 and 64 (released in 2017) used a 2048-bit memory bus with 8 gigabytes of HBM2. Those were significantly more competitive cards than the 16 GB HBM2-wielding Radeon VII released in 2019.
      The Vega 64 wasn't far off from GTX 1080 performance, albeit released a year later than the Nvidia counterpart, and Nvidia released the 1080 Ti to crush the competition. The 1080 Ti had more VRAM (11 GB) and it even reached the same memory bandwidth as the Vega 64's HBM2 with a smaller memory bus.
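
      That last comparison checks out numerically; with approximate public specs (assumed here), width times per-pin rate lands on nearly the same figure for both cards:

        # Peak bandwidth (GB/s) = bus width in bits * per-pin Gbit/s / 8
        def gb_s(bus_bits, pin_gbit_s):
            return bus_bits * pin_gbit_s / 8

        print(f"Vega 64     (2048-bit HBM2   @ ~1.9 Gb/s/pin): {gb_s(2048, 1.89):.0f} GB/s")  # ~484
        print(f"GTX 1080 Ti ( 352-bit GDDR5X @  11 Gb/s/pin):  {gb_s(352, 11.0):.0f} GB/s")   # ~484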

  • @cromulentcommodore5896
    @cromulentcommodore5896 10 months ago

    So a series of tubes, basically

  • @BrunoTorrente
    @BrunoTorrente 10 months ago

    I would like OMI to become the standard, more agnostic memory systems, so the type of memory you use is irrelevant, including mixing.
    Imagine buying DDR memory sticks, but if they have an onboard GPU you can buy GDDR, or even mix it.
    Besides, developing new revolutionary memories would be common, as long as it "talks" OMI it doesn't matter how the hardware was made.

  • @chiupipi
    @chiupipi 9 months ago

    Just a minor terminology discussion. The TSV in HBM now is also called TDV, thru die via, to distinguish from TSV used in Si interposer for packaging.

  • @amitcarbyne3635
    @amitcarbyne3635 10 months ago

    Investment Companies 10:00

  • @arjundubhashi1
    @arjundubhashi1 10 months ago

    It would be interesting to see a video on AWS’ bespoke GPU tech for ML. They don’t seem to be very public about the infra side of things in that space.

  • @El.Duder-ino
    @El.Duder-ino 10 months ago

    We need stacked SRAM with the same capacity as HBM😎🤘 Memory needs to be as fast as the compute, just take a look at brain neurons.
    Excellent vid as always👍

  • @crispysilicon
    @crispysilicon 10 months ago

    You should read up on NorthPole, IBM's recent thing of interest they showed at Hot Chips. Would make for a good episode.

  • @adamlin120
    @adamlin120 10 months ago

    Amazing video

  • @albuslee4831
    @albuslee4831 10 months ago

    Learned a lot, especially about the new ecosystem developed for HBM manufacturing, which was never mentioned anywhere else. AI turning the sleepy high-end section of the market into the hottest new thing, indeed.

  • @TheExard3k
    @TheExard3k 10 months ago

    Additional die space required is always expensive. But I can see HBM in consumer products in the future... Intel Xeon Max is a big commitment to HBM right now, and I can see other platforms offering it as a premium option. We probably won't use it for power efficiency any time soon.

  • @MostlyPennyCat
    @MostlyPennyCat 10 months ago

    I can't wait for HBM prices to fall low enough for it to be stacked on top of consumer APUs.

  • @skyak4493
    @skyak4493 10 months ago

    Great point about a key technology behind an overhyped technology.
    This reinforces my interest in IBM's research into optical materials for AI, where the memory is right in the logic.

  • @Laundry_Hamper
    @Laundry_Hamper 10 months ago

    I have a few special memories myself. I'm glad our AI overseers will have something to busy their thoughts while they are not generating otter mode catboy BBW hyperrealistic very high quality featured on artstation

  • @TheTrueOSSS
    @TheTrueOSSS 10 months ago

    I always saw HBM as a strong novel competitor to conventional memory. In my experience with HBM2 on a Vega 64, the proliferation of the technology would be good for everyone. I hope the AI boom can bring the necessary market demand to further develop and refine the technology while optimizing its manufacturing processes.

  • @12vscience
    @12vscience 10 months ago

    a

  • @andrewcornelio6179
    @andrewcornelio6179 10 months ago

    Cisco also uses HBM to make faster internet switches in their Silicon One series.