An example where ECC on the module was needed, more than on-chip ECC:
I was working at a company that had spent 1.5 man-years tracking down a software bug in production, when it suddenly resolved itself.
A week later we realized that one of the computers in the high-availability cluster had turned off. I was present when it was opened, and I noticed that the RAM had 8 chips per DIMM.
After we got new RAM, the computer couldn't complete POST because it had bad memory. In fact, any memory installed into channel 3 was bad.
We got a replacement motherboard, and that fixed it.
We had something similar with an early HPE DL385 Gen10
A sign that the app needs better instrumentation. These kinds of bugs happen too often!
@@jgurtz The app wasn't at fault; the hardware was at fault for wasting software dev time by corrupting data.
Such hardware bugs are why I'm skeptical of on-chip RAM ECC. From an architecture perspective, the CPU package should do ECC on entire cache lines, independent of the RAM module design. So when a core needs something from a 64-byte (512-bit) cache line, the memory interface chiplet loads 80 bytes from the DRAM interface and uses the extra 128 bits as ECC to correct up to 12 single-bit errors. Those extra bytes translate to two extra 64-bit memory cycles, and it's up to the ECC format designer to ensure that flaky PCB traces or stuck DRAM data lines are always detected.
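To put rough numbers on that proposal, here is a back-of-the-envelope sketch; the 128 check bits and the 12-bit correction claim are this comment's assumptions, not a JEDEC-defined format:

```python
# Rough arithmetic for the side-band ECC layout described above.
# The figures are the comment's assumptions, not a JEDEC-defined format.
CACHE_LINE_BITS = 512          # one 64-byte cache line
ECC_BITS = 128                 # proposed extra check bits per line
BUS_WIDTH_BITS = 64            # bits moved per memory transfer (beat)

total_bits = CACHE_LINE_BITS + ECC_BITS            # 640 bits = 80 bytes
data_beats = CACHE_LINE_BITS // BUS_WIDTH_BITS     # 8 beats for data alone
total_beats = total_bits // BUS_WIDTH_BITS         # 10 beats with ECC appended
overhead = ECC_BITS / CACHE_LINE_BITS              # 25% extra bits stored

print(f"{total_bits // 8} bytes per line, "
      f"{total_beats - data_beats} extra beats, "
      f"{overhead:.0%} storage overhead")
```

For comparison, a standard DDR5 ECC RDIMM instead widens each sub-channel (32 data bits plus extra ECC bits), so the check bits ride alongside the data rather than costing extra transfers.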
@@johndododoe1411 The on-DIMM ECC is mostly just there to allow memory manufacturers to use lower-quality parts with higher error rates that would otherwise disqualify them from use.
I'm not moving to D5 until D6 comes out, for my own financial health.
You did a great job pacing all the information I needed to keep me up to date with newer memory technologies, thank you.
Glad you found this useful!
There are many more differences between RDIMMs and UDIMMs:
* the number of address lines per sub-channel is 13 for UDIMMs but only 7 for RDIMMs;
thanks to the RCD, RDIMM address lines can run DDR while UDIMM address lines must run SDR,
* ECC on UDIMMs, according to JEDEC, is only 4 bits per sub-channel, while RDIMMs can have 4 or 8,
* UDIMMs are restricted to x8/x16 memory dies, while RDIMMs can use x4 as well;
this allows UDIMM hosts to mask writes using the DM_n signal, whereas RDIMM hosts must send full transfers.
I think that points 1 and 3 are mostly responsible for why UDIMMs and RDIMMs are no longer slot-compatible.
The supply voltage is different, but as far as I know all PMICs can run in the 5-15 V range.
Great video Patrick! This was a topic that was really hot on the forums, with lots of conversation, especially around ECC UDIMMs. Thanks for clearing everything up, and I agree, this is an excellent resource that can be useful to lots of people - cheers!
I'd love a quick summary of how this affects ranks, 4 vs 8 bit ECC (9x4 vs 10x4) and maybe latency impacts and what would benefit from different configs.
+1
It looks like DDR5 RDIMM is inherently Dual Rank.
Hey Patrick, awesome video explaining the differences between DDR4 and DDR5. I loved the various pictures and captions that you'd place as you were explaining the PMIC, RCD, and SPD hub! You have a knack for articulately and concisely explaining device differences and cementing what the myriad of acronyms actually do. Definitely will be coming back for more tech educational content over here. I have a background in chemical engineering and got curious to see the electrical/computer engineering side of semiconductor manufacturing :)
At the beginning of the DDR5 rollout, quite a lot of DIMMs' PMICs failed in use (including mine).
It was my first time seeing faulty RAM, but my friend who works in a big-tech data centre said it occurs every day at his company's scale. The quality of the PMICs should be way better now after 1.5 years, and the PMIC supply should be in better shape too.
Useful content. Thanks! In a few years when I'm going through the next round of server upgrades, I'll understand better what I'm looking at. (Yes, I expect some updates from you before then, too.)
So why exactly did Micron give you 32x 32GB to test?
Great dive into the state of PC memory architecture, love keeping up with this stuff! My DDR5 key takeaways: cut the memory channel width roughly in half, provide 2 channels per DIMM, and add on-DIMM power supply and other reliability features to support even higher clocks. The ratio of memory bandwidth to clock cycle has not kept up with high core counts, explaining why Top500 systems use 24-48 core CPUs. Related points: unbuffered DIMMs now have on-die ECC to support smaller processes and faster speeds (I'll think of it like Reed-Solomon on an HDD); and there's some interesting future potential for RAM over PCIe.
Very useful to get those of us in the DDR4 world up to speed!
Glad it was helpful!
Awesome! Can't wait till CXL hits sensible pricing too! :)
Still need a bit more platform maturity as well
I agree on needing ECC server memory; its speeds and capacities are awesome, but I hate that persistent memory is dying. AMD EPYC would be nice with it.
I would consider the advantage of multiple independent channels to really be a latency advantage: it reduces the chance that a memory access has to be queued behind a previous access, and thus the chance of it being delayed. Back-to-back memory accesses to the same bank are the biggest contributor to higher latency under load in memory systems, and increasing the number of channels is the best way to increase bank parallelism without adding more devices to each channel.
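A toy Monte Carlo sketch of that queuing argument (my own simplified model, assuming uniformly random addresses and counting only back-to-back hits on the same channel):

```python
# Minimal model of channel-level conflicts: with more independent channels,
# a random access stream is less likely to land on a channel that is still
# busy with the immediately preceding access.
import random

def conflict_rate(num_channels: int, accesses: int = 100_000) -> float:
    """Fraction of accesses that target the same channel as the previous access."""
    conflicts, prev = 0, None
    for _ in range(accesses):
        ch = random.randrange(num_channels)
        if ch == prev:
            conflicts += 1
        prev = ch
    return conflicts / accesses

for channels in (2, 4, 8, 16):
    print(f"{channels:2d} channels -> ~{conflict_rate(channels):.1%} back-to-back conflicts")
```

The rate falls roughly as 1/N; a real controller also interleaves across banks and bank groups, so this only illustrates the direction of the effect.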
If a CXL module with 4 DDR5 modules only gives you the same bandwidth as 2 DDR5 channels, why not use DDR4 modules?
That is a major area folks are looking at. Imagine a hyperscaler reusing DDR4.
@@ServeTheHomeVideo Oh - I'm surprised.
I was thinking "smarter folks than me have thought of this and dismissed it - I would like to know why so I can become smarter".
I figured it would be a latency issue, or the bandwidth not being enough to saturate the PCIe 5.0 link or something, because as you said reusing DDR4 is cheaper.
Consumer CXL seems pretty exciting. We know at least AMD is working on it. We might be getting it with Zen 5 and PCIe 6.0, which is really exciting.
This really was a great video talking about all the new benefits that DDR5 brings
Thank you. Glad you liked it.
Could you make a video about power IC chips and how they're being hit with shortages and development problems? I'd love to know more about them.
Great info, now to get a budget for new servers.
DDR5 is forcing people to know the difference between UDIMM and RDIMM, I love it.
Thanks for the overview of the differences. It is good to know. A couple of questions for you if you have the time:
With the support components moving onto the memory module (e.g. the PMIC) and PCIe cards with memory, do you think we are headed towards a high-bandwidth serial protocol between CPU and memory, similar to how storage interconnects (e.g. ATA => SATA, parallel SCSI => SAS) and even expansion buses (e.g. PCI => PCIe) went serial?
How does the memory on PCIe work with NUMA configurations for the system?
CXL Type-3 looks like a NUMA node without cores attached in the topology. On the memory-attachment point, that is somewhat the promise of CXL 3.x, because it will make sense to start using shelves of memory connected via CXL instead of adding more DDR interfaces on a chip.
Speaking of ddr5 servers on a tangent, you should check out the M80q gen3 which uses DDR5 SODIMM in a tinyminimicro class device.
Wow! Just snagged a great deal on one because of this comment. Thank you!
@@ServeTheHomeVideo Can't wait for the review!
Thank you for the DDR5 explanation. Great video!
Glad it was helpful! Sadly, the Super Bowl seems to have stopped views on it. Hopefully people share it.
Thanks for the very helpful content for a self "studied" amateur trying to go from desktop to workstation / server.
Thought I saw smoke rolling out of your ears once! I hope you find some great trade shows to go to!
Great breakdown! OMG, it's a CXL module in the wild! I've waited 5 years to see where that tech was going!
Quite interesting.
Should be fun to see how the server world is changing. I just had 8x 7443P servers arrive at my house today, all in just 4U, which is quite nice.
So does AM5 support ECC UDIMMs? I really don't want to use non-ECC anymore, because most are unstable out of the box; when you are gaming for 12 hours you are nearly guaranteed to crash.
Yes. AMD AM5 is a consumer platform. The next use for that ECC UDIMM shown is for an AM5 server platform
@@ServeTheHomeVideo Good to know. So any unbuffered DDR5 will likely work as long as the slot matches, which in theory it always should.
First 3 seconds of this video had me checking my playback speed.
I am currently deciding on parts for a workstation, and picking the right combination of DIMM slots, number of sticks, frequency, timings, capacity, and price... is difficult.
Really cool. We plan to use DDR5 for the next design of the R86S mini PC box + Alder Lake N300 CPU.
That is soooooo expensive ATM. I don't even blame EPYC motherboards for not having 24 DIMMs. Though they should start releasing them right about now - that was the promise last year.
Showed an 8x DIMM one from ASRock Rack in the video.
@@ServeTheHomeVideo True. I do wonder if there will be 24 DIMM per socket (with 2 sockets) motherboards as promised. I don't doubt they could be made, but I'm thinking of how much space that takes. I have to re-watch your earlier videos on Genoa. After the first one, my PC started acting up, and only a couple of days ago did I finally find the root of the problem.
Came here to say the same thing. I am happy to be building with DDR4 right now due to cost.
@@revcrussell If past experience is anything to go by - and mine goes back to before DDR - the prices will flip. Though in the case of DDR5 it seems to be taking longer than usual (the average was around 18 months after introduction), but only a bit. I wouldn't be surprised if DDR5-6000 EXPO kits were cheaper than DDR4-4000 kits before the end of the year. Heck, you can find some places where they are cheaper already, but by the end of 2023 it should be universal. Especially since DDR4 will probably go almost completely out of mass production, with only enough made to support legacy systems. This will depend on how many Zen 3 and older Intel CPUs are unsold. Additionally, there was almost a year when DDR5 was only an option rather than a requirement - slowing down the change in production.
Gigabyte MZ33-AR0 is an example mobo with 24 DIMMs per socket. Never used Gigabyte server mobos, but at least it means more are likely.
Are there ever even going to be any E-ATX 2-socket boards for EPYC Genoa (Supermicro H13)? I see a few boards that are for specific cases. It doesn't look like an E-ATX board has room for two of these CPUs. Seems they're going backwards.
That is why they need so many cores - you can only get one socket on a board. Just think of the loss of PCIe lanes.
So one RDIMM has 2 channels and the CPUs have 12 - is that now 6 DIMMs per socket, or can you have more?
12 DIMMs per socket in 1DPC, 24 in 2DPC (when that is available). You are right it is confusing now.
CPU channel counts mean 64-bit-wide data channels, as before. The two channels within a DIMM are best referred to as sub-channels.
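For a sense of scale, a quick sketch of how those channel counts translate into theoretical peak bandwidth (my own illustrative arithmetic, assuming a 12-channel DDR5-4800 socket):

```python
# Theoretical peak DRAM bandwidth per socket, counting 64-bit channels.
# Illustrative assumption: 12 channels of DDR5-4800.
channels = 12                 # 64-bit channels (each split into two 32-bit sub-channels)
data_rate_mt_s = 4800         # transfers per second, in millions
bytes_per_transfer = 8        # 64 bits = 8 bytes per transfer per channel

peak_gb_s = channels * data_rate_mt_s * bytes_per_transfer / 1000
print(f"~{peak_gb_s:.0f} GB/s theoretical peak")   # ~461 GB/s
```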
We are finally seeing DDR5 be beneficial on the consumer side too. Raptor Lake takes a major hit on DDR4 that Alder Lake did not.
What about timings?
And how do timings impact server workload scenarios?
The CAS latency increase is mostly offset by higher clock speeds, so the latency in ns is only up ~3%.
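As a back-of-the-envelope check (the bins below are my own illustrative picks; the exact delta depends on which kits you compare), first-word CAS latency in nanoseconds is CL x 2000 / data rate:

```python
# Convert CAS latency from clock cycles to nanoseconds.
# The DIMM clock runs at half the transfer rate, hence the factor of 2000.
def cas_ns(cl: int, mt_s: int) -> float:
    return cl * 2000 / mt_s

ddr4 = cas_ns(cl=22, mt_s=3200)   # common DDR4-3200 JEDEC bin -> 13.75 ns
ddr5 = cas_ns(cl=34, mt_s=4800)   # a DDR5-4800 JEDEC bin      -> ~14.17 ns
print(f"DDR4-3200 CL22: {ddr4:.2f} ns, DDR5-4800 CL34: {ddr5:.2f} ns "
      f"(+{(ddr5 / ddr4 - 1):.0%})")
```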
Would be nice to see which workloads scale with more memory bandwidth, as there is not much info available out there. Thanks, Patrick, for the video.
What a wonderful video! Just what I needed!
Great breakdown, thanks!
How would I build a compute server to maximize RAM (specifically, a single Java address space for AI applications)?
Four TB on a single motherboard would just be the starting point.
This is why I don't see RISC like Arm overtaking CISC in the big servers just yet... it's the same reason why CISC was set for such a long time... the memory doesn't keep up.
Other than a server built specifically for high-bandwidth apps, with low-bandwidth apps banned from running on it, I see no point in choosing a lower-core-count CPU to save bandwidth. This is the same problem as farmland being used inefficiently just to make massive machinery faster and more efficient: it takes many times more land and other resources to feed a person that way. Put low-bandwidth apps on the extra cores to save on the hardware and floor space they would otherwise take up.
Nice video. For the Top500, most HPL (Linpack) machines use GPUs for the heavy lifting, and the bandwidth mostly goes to them. CPU cores are good for other mixed workloads such as HPCG, etc.
If I had $100 every time he says "THIS" I could buy some of THIS set of 32GB x24 DIMMs 😂
Excellent video, learned a lot!
Imagine if I could get $100 every time I say it
As someone who learned about DRAM when each chip might contain only 8 KiB or less, and who studied cache hardware and CPU design later, keeping up with marketing code names such as Death Lake and superclean plus plus mega is a useless game of noise.
Interesting, though, that the old RAMBUS company is coming back as a maker of standard high-end RAM chips instead of a monopoly.
About servers with DDR5 RAM, I have a challenge. Today, what would be the best budget choice for a new homelab server to run Proxmox with TrueNAS, Plex, and 2 or 3 VMs?
And if it could be power-efficient when it is doing nothing, that would be perfect.
OK, I know... that's a lot, but we're close to Christmas 🤪
The smiley is on^^
It flickers a bit but it isn't bad :)
Very informative, thank you.
ROI is going to be a pita to reach before DDR6 is out
At 41 through 44 seconds, it is UNCLEAR whether you said CAN or CAN'T. So please start saying "can not" when appropriate, and please STOP using contractions entirely. This is ESPECIALLY true when Brits, South Africans, and Indians (3 examples should be enough) speak. Contractions make you less understandable and more likely to be skipped over, avoided, and certainly not subscribed to.
actually a great video
Good tutorial. Thanks
Ehm, about that graph... It might have been better to put the release year on the x-axis, because the MT/s-versus-Gbps chart is essentially a chart displaying the relationship between inches on X and centimeters on Y. Not really informative. 😉
Somewhat hard to do that. When was DDR5-4800? When it was delivered for consumers? When Genoa launched? When both Genoa and Intel used it? DDR4-3200 is another good example
DAMN!!!! this is insane !!
Wow. RAMBUS is still around.
Only as a patent troll.
@@revcrussell I don't think so; they are actually designing memory/PCIe controller blocks and selling them to CPU/GPU/*PU designers.
@@TheBackyardChemist If they are, I stand corrected, but I read recently they were just making money on patents.
@@revcrussell I do not remember where I read this, but I seem to remember that out of AMD/Nvidia/IBM, at least one is using a DRAM controller block they have bought from RAMBUS
@@revcrussell I think they're doing both.
Awesome content!
I mean.. 2 of those would be nice
I am a freelancer from India and would love to design thumbnails for you. How can I contact you?
Yeah, information overload. I had to play it at 0.75x just to take it all in.
Incompatibility is such an anti consumer jerk move.
Why do you say "pretty much"? Either it is or it isn't, m8.
They made it so they can charge a larger premium for server memory. Artificial inflation.
Me, who still uses DDR3 FB-DIMMs.
FB-Dimm? What's that?
@@akirafan28 Fully Buffered DIMMs, instead of the regular unbuffered DIMMs.
@@reki353 Thanks! 🙂👍
👍
That looks like a very expensive box
Smiley face prop light is just a little flickery.
Yes. I am not sure why it is more so in this one than in others. That Canon C70 has not had its settings changed.
great vid
Thanks!
TLDR - server RAM has no RGB so it's definitely better lol
This channel has no content that matched "ddr6."
Still a bit of time
Jesus, Rambus, the patent troll, is *still* around sticking their name on things. Haven't thought of them in years.
Actually, they develop a lot of IP that other companies use. I think of patent trolls more as organizations without R&D.
mb
Sponsored by Micron.
yes