The substrate material itself will have extremely low thermal conductivity, below 1 W/m·K. It's essentially fiberglass. The amount of copper incorporated into it will raise that somewhat, but certainly not enough that it could ever dissipate tens or hundreds of watts. Silicon, on the other hand, is a very good thermal conductor, not far off as good as aluminium.
Found this out when my DIY PCB for an LED light resulted in a bunch of overheating LEDs. Turns out, as efficient as they are, they absolutely need the aluminum PCB to conduct heat away.
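To put rough numbers on the conductivity point, here is a minimal sketch using Fourier's law for one-dimensional conduction through a flat slab. The conductivity figures are approximate textbook values, and the slab size and temperature difference are made-up illustration numbers, not measurements of any actual CPU package.

```python
# Rough 1-D conduction estimate with Fourier's law: Q = k * A * dT / d.
# Conductivities are approximate textbook figures (assumptions), not
# measurements of any specific package or board.
APPROX_K_W_PER_M_K = {
    "FR-4 style laminate": 0.3,
    "silicon": 150.0,
    "aluminium": 205.0,
}


def conducted_watts(k_w_per_m_k, area_m2, thickness_m, delta_t_k):
    """Steady-state heat flow (W) through a flat slab of uniform material."""
    return k_w_per_m_k * area_m2 * delta_t_k / thickness_m


if __name__ == "__main__":
    area = 0.010 * 0.010   # hypothetical 10 mm x 10 mm patch
    thickness = 0.001      # hypothetical 1 mm thick
    delta_t = 50.0         # hypothetical 50 K across the slab
    for name, k in APPROX_K_W_PER_M_K.items():
        watts = conducted_watts(k, area, thickness, delta_t)
        print(f"{name}: about {watts:.1f} W")
```

The absolute numbers depend entirely on the assumed geometry; the point is the ratio of several hundred between the laminate and silicon or aluminium.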
There was also apparently an early batch of Ryzen 5900X CPUs that were unstable and generated unrecoverable WHEA errors at stock settings. I was one of the ones who had one and others reported the same thing (the replacement worked fine). I think AMD's QC may be a bit sketchy.
According to a Tom's Hardware source, excessive SoC voltage (caused by the EXPO setting) kills the thermal sensors as well as the thermal protection. With those dead sensors and the general behavior of the Ryzen 7000 series (pushing max. performance within thermal limits), the CPU kills itself as it does not know its thermal limits anymore. As the CPU can no longer see that it has run out of thermal headroom, the boost behavior leads to excessive power draw beyond safe levels.
My concern with that would be: is this something that can develop over time, or would you notice it pretty quickly, seeing high SoC voltage and whatnot?
AMD had confirmed that this is the issue. The funny thing is that most EXPO settings boost the SoC Voltage to 1.3V and the MC_SoC to 1.4V. AMD stated that the SoC Voltage shouldn't go above 1.3V....but the question is *what* shouldn't go above 1.3V!? Just the SoC Voltage or the SoC Voltage and the MC_SoC Voltage? Just to be safe, I've lowered my SoC_Voltage to 1.25V and everything still works fine. But my memory controller is still set to 1.4V.... Edit - I just looked into this again. It is supposedly *just* the SoC Voltage. So, if you had an ASRock motherboard, you've been all good the whole time as EXPO profiles on ASRock boards never put more than 1.3V into the SoC Voltage (at least that's how it has always been with my Taichi Carrara) =D
@@NippleSauce Yeah it's just the SoC voltage. All AM5 mainboard manufacturers will release AGESA updates(with locked SoC voltage at 1.3v max.) in the coming days according to AMD.
@@Squilliam-Fancyson I am still a bit confused here because even ASRock released an updated BIOS this morning which has a description stating that "7000X3D users should probably update to this BIOS". I wonder if the SoC_MC_Voltages could have actually been involved on the ASRock motherboards. Regardless, I'll be updating my BIOS in a few hours and will be able to see if the SoC_MC_Voltage values are decreased or if they're still at 1.4V with EXPO enabled. I guess I'm also super confused because the SoC_Voltage on the ASRock boards has never gone higher than 1.3V - regardless of whether or not an EXPO profile was being used. Perhaps their new BIOS just prevents users from manually increasing the SoC_Voltage _above_ 1.3V if a 7000X3D CPU is detected?
AMD am5 cpus: *Tsundere Simulator* _"It's not like I did those volts & frequency because I like you or anything cooler, baka!"_ Intel lga1700 cpus: *Yandere Simulator* _"Oh boy! Here I go killing high memory speed & tight timings again"_
@@daniomhailemariam7308 all right boomers, I don't like explaining my memes. With AMD Ryzen you have to deal with PBO, and its algorithm will determine the frequencies and voltage based off of the temperature of your cooler. The recent events with Zen 4 happened because of a buggy AGESA and/or BIOS (UEFI) putting in more voltage and current than usual (my game theory). AMD recently released an AGESA fix anyway, so they figured it out, but idk. And Intel Core gen 12-13 (+14 tbh) has a major DDR5 dropout issue when reaching above 7200Mbps (megabits per second), depending on the motherboard, RAM and IMC quality. There's also the issue of the LGA1700 socket, where you have to get a correcting frame for a brand new motherboard before putting in the CPU instead of using the stock ILM (loading mechanism). There's also a risk of degradation depending on the voltages it may randomly apply.
@@quantum5661 glad I got in this shit show, because my junk will shit on yours. Mine is fine. How you just go to “everything this version is shit” off this video, boggles my mind. But hey. Happy gaming my guy. Jesus.
Funny that my R5 1600X never died when clocked to 4.2GHz. It could run 4.3GHz on 1.55 Vcore, but I stuck to 1.5 Vcore at 4.2GHz. Got an R5 5600X on release, could not wait for a BIOS update for my old board, so I got a B550 Aorus Pro V1 and clocked the R5 5600X to 4.85GHz all cores on 1.41 Vcore, 0 issues. People are poor and think they can use low end coolers. I know some people around my area who had issues with Ryzen 3000s and 1000s, and it was because they used the stock cooler.
Ironically enough being able to set a static voltage rather than letting the system do it for you might have been a way to prevent this issue. Because like I mentioned in a previous comment motherboards are supporting cpu's before they were even supposed to. That tells me that the firmware isn't locked down and the bios isn't secured.
5800X3D had everything locked down at launch, and the internet went on the "AMD Evil" bandwagon. I am pretty sure AMD regrets not locking down everything now.
@@SolarianStrike Ultimately just depends really on the reason this CPU died. If it was pure fluke then it will blow over. If it doesn't then AMD might do it again.
@@SolarianStrike Don't advertise a suite if it's not applicable then. AMD ultimately releases the part, not the fans. Or admit that those suites are not fit for purpose on X3D.
Finally an opinion worth listening to. To add to your VRM controller theory: Back when your chip died, two chip deaths happened in ZA as well. One on an ASUS X670E-PRO and another on an ASUS CROSSHAIR X670E EXTREME. I've seen other chips dead on CROSSHAIR boards, and obviously your GENE as well. Der8auer's viewer had a high end AORUS board I think.
As far as we can tell, there are at least two Gigabyte boards (one claimed to be a B650 Aorus elite) and one verified MSI board. And since we now have evidence that the issue isn't limited to 3D VCache 7000 series, nor is it limited to dual CCD chips, this leads me to believe that the issue is either tied to flaw in the silicon, or an issue with the voltage/power control in firmware, whether it's related to boosting or a vCore overcurrent situation.
@@racerex340 I'm with you on that. Transient high on initialization, CPU dies, user waits for an extended POST while the CPU cooks due to it not having initialized and recognized OTP. Yeah pretty much. I don't believe this is high SOC.
@@dainluke I'm thinking it's something weird in AGESA that motherboard manufacturers have been leveraging per AMD design specs but isn't working right? Would explain Mobo manufacturers suddenly ripping all of their prior AM5 BIOS down except the latest, maybe they found something.
Could it be a socket design failure? Intel had the same kind of problem with their early LGA designs. The power will dissipate in the places with the higher resistance. So if there is a problem with the socket, pins will heat up and cause the substrate to bubble, which in turn introduces short circuits in the inner layers and makes the situation even worse, as the contacts have become the highest resistance in the circuit.
Gigabyte all day long 😉 never using an Asus board or any other product from Asus again. First issue was the Asus gaming laptop I got in 2010, then my Asus Maximus VI Extreme, and after that an Asus GTX 1060 3GB. Both had serious temp issues and the board had clock issues on my i7.
Some people speculate this VSOC voltage problem is related to EXPO RAM settings. der8auer received a 7800X where the solder had melted and even moved outside of the IHS; it also shows the same "bubble" on the underside of the package. EXPO was used here as well.
There used to be a guy exposing issues with DDR5, and how it's not better than 4 at low speeds, while high speeds used too much power on Intel. It doesn't surprise me AMD has similar issues. Of course most mainstream tech news refused to admit these issues with DDR5, so shout out to that guy. Everyone on DDR5 has been beta testing it. It won't be good until second generation products can handle high speed expo modules.
The high temps were there from the get-go - 95°C with a standard air cooler for the 7950X series, for example. I underpowered it from 170 watts stock to 105 watts, and my CPU temp with an AIO is now 35°C. I also installed the recent Asus mobo update for my CPU, which LOCKS the SoC @ 1.3v.
When I first saw the images of these, my automatic reaction, from experience back in the 90's, was that the problem is with the motherboard. Motherboards supply the power, and when the substrate goes it is almost exclusively the motherboard taking down the CPU, not the CPU itself. Then I saw more reports come out and I was at first "Still the motherboard, but maybe bad AGESA or bugged UEFI causing thermal runaways? Maybe faulty solder or lead mounts, but they fixed those issues decades ago." Then when other CPUs started to show it, I started to think this is one of those perfect storms of multiple unrelated factors lining up just right to cause this. I don't have the depth of knowledge that AHC does, but I am starting to think it is a case of both the CPU and the motherboard, a folie à deux. I haven't seen substrate failures like this since Thunderbirds were getting killed by high performance boards with auto overclocking going wrong.
Using my 7800X3D a bit over a week now on a X670E prime pro wifi and did use bios version 1406 and 1408 (also gaming on that BIOS). Few days ago I updated to 1409 and had no shutdowns or anything yet (also gaming done). But with EXPO on, my SoC voltage was over 1.3 volt. Probably 1.36V or something. And EXPO was active since day one. Turned off EXPO for now and voltage dropped to 1.0 - 1.1V.
Too much pressure could stress and damage the silicon and/or the substrate. Insufficient pressure could lead to improper contact between the pins and the CPU pads. Due to the high currents provided by the Vcore rail, temps could go up and there could even be micro-sparking.
My 7700X died in what sounds like a similar fashion to your 7950X3D. It would just throw the yellow DRAM light on my Asus board. The B650E-F was also faulty. I really couldn't see any physical damage anywhere though. I really should have checked the voltages, but to be fair it happened back in early February before the news started to pick up.
Low resistance / short circuits create heat in circuits. So does high resistance with high amps. If you look at the damage on the CPU, does it look instant like a short circuit, or slow and gradual like a heated wire? My guess is some CPUs have a poor connection internally somewhere, creating a hotspot.
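A minimal sketch of the "high resistance with high amps" case, assuming the bad joint sits in series with the load so the current through it is fixed by whatever the CPU draws. The current and resistance values are hypothetical and only show how the heat scales with resistance.

```python
def joule_heating_watts(current_a, resistance_ohm):
    """Heat dissipated in a resistive element carrying a given current: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


if __name__ == "__main__":
    current = 2.0          # hypothetical current through one contact, in amps
    good_contact = 0.005   # hypothetical 5 milliohm contact
    poor_contact = 0.100   # hypothetical 100 milliohm contact
    for label, r in (("good", good_contact), ("poor", poor_contact)):
        print(f"{label} contact: {joule_heating_watts(current, r):.3f} W")
```

At the same current the heat scales linearly with resistance, and it is all dumped into a very small volume, which is why a single poor joint can become a hotspot. If the bad contact is one of many parallel paths, the current redistributes and this simple series picture no longer applies.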
As I said on Roman's: A phase transition would be needed to cause a bulge in solid material like that. In other words, some small area melted or boiled.
@@ActuallyHardcoreOverclocking Ooh, popcorn 🍿😋 - another phase change process. Quick rule of thumb: the gas phase of a material has about 1000 times less density than the solid or liquid phases. For a given mass, that's 1000 times more volume in the vapor phase - and that volume's gotta go somewhere.
I noticed a couple of days ago the very high SoC voltages being used when EXPO was enabled with my 7800X3D and G-Skill DDR-6000 CL30 Ram, on an Asus TUF Gaming X670E-plus. This is with the latest (non beta) bios 1409 also released a few days ago. What concerned me was the sheer jump in SoC power - non EXPO default voltage was 1.05v with SoC power at ~9W. This jumped to 21W with expo enabled and 1.35v SoC voltage. That's quite a jump, especially when you consider the increase is over double default (133%). Also that power draw is continuous, and doesn't vary much from idle to load....i.e. any stress it's putting on various SoC IP blocks at that voltage is continuous. I lowered my SoC to a much safer 1.15v, then subsequently to 1.1v which has dropped the SoC package power down to ~10W. I recommend everyone with EXPO enabled do this for now, or just disable EXPO if running any 7000 series AM5 CPU (X3D or not).
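For anyone wanting to check the numbers above, a small sketch: the percentage jump in SoC power, plus what the power would be if it scaled only with voltage squared. The V² scaling is a common first-order rule of thumb for dynamic power at fixed frequency and is an assumption here, not something AMD has published for these rails.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def v_squared_scaled_power(power_before_w, v_before, v_after):
    """First-order estimate: power scales with voltage squared (assumption)."""
    return power_before_w * (v_after / v_before) ** 2


if __name__ == "__main__":
    jump = percent_increase(9.0, 21.0)
    estimate = v_squared_scaled_power(9.0, 1.05, 1.35)
    print(f"Reported jump: {jump:.0f}% over the default SoC power")
    print(f"V^2-only estimate at 1.35 V: {estimate:.1f} W")
```

The V²-only estimate comes out well under the reported ~21W, which would be consistent with EXPO also raising memory and fabric clocks, and with leakage rising with voltage, but that part is speculation.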
With 5600 DOCP for me it seems to set my SOC to 1.24v, which seems high but is at least better than what your board was trying to do, maybe in very rare cases the boards can massively increase SOC voltage even further than yours which causes the CPU destruction?
IDK, my first 5800X3D blew up after a month and I had to return it and get another one. I noticed a few other people seem to have similar issues. I was concerned about getting another 3D chip. Is it the new 7000X3D chips? Or all the 7000s?
Well I've run 3: 7600X, 7950X and now a 7800X3D... So ALL... is a bit broad my guy. (One Honda has an issue and all of them are shit? Let's open our minds a bit eh?)
@@KB-1976 My question is are people having issues only with the 3D chips. Or any 7000 series. Because one guy had no issues. Nobody else is? Let's open our minds here. You are not the only 7000 user on the planet.
@@alinzelnan Ok so it's generally not just an X3D issue. I understand there can always be bad chips of any kind. I was wondering if people were having issues with 3D V-Cache chips in general. After my 5800X3D failed, I noticed on forums that others were having issues with bad 3D chips. I was wondering if the new 7000X3D chips were also having an issue with bad chips.
19:43 - Is there something about a motherboard VRM that prevents it from sending out higher voltage than this calculation (duty cycle * input voltage) if it's faultily doing more duty cycle than needed for the load? Like, if a PSU tried to do 20% duty cycle when there's no load on output, it would end up with the full input voltage on the output, not duty cycle * input.
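The behavior described in this question matches an ideal non-synchronous (diode) buck converter dropping into discontinuous conduction at light load. Below is a minimal sketch of the textbook ideal-converter relations, with hypothetical inductance, switching frequency and load values. As far as I know, motherboard VRMs are synchronous bucks running closed-loop, so the controller adjusts duty cycle to hold the output and the low-side switch can pull the output down; treat this as an illustration of the open-loop premise only.

```python
import math


def buck_conversion_ratio(duty, inductance_h, switching_hz, load_ohm):
    """Ideal diode-rectified buck converter Vout/Vin at a fixed duty cycle.

    Uses the dimensionless parameter K = 2 * L * f / R. The converter is in
    continuous conduction (ratio = duty) when K >= 1 - duty; otherwise it is
    in discontinuous conduction and the ratio rises toward 1 as load vanishes.
    """
    k = 2.0 * inductance_h * switching_hz / load_ohm
    if k >= 1.0 - duty:
        return duty
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * k / duty ** 2))


if __name__ == "__main__":
    vin = 12.0              # input rail, volts
    duty = 0.10             # fixed 10% duty cycle
    inductance = 150e-9     # hypothetical 150 nH phase inductor
    frequency = 500e3       # hypothetical 500 kHz switching frequency
    for load in (0.01, 1.0, 1000.0):   # hypothetical load resistances, ohms
        ratio = buck_conversion_ratio(duty, inductance, frequency, load)
        print(f"load {load:g} ohm -> Vout about {vin * ratio:.2f} V")
```

With a heavy load the output sits at duty cycle times input voltage; with almost no load and the duty cycle held fixed, the ideal diode buck drifts toward the full input voltage, which is the case the question describes.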
If the MB pumps power into a dead and shorted CPU, as in enough power to cause this issue, that sounds like something that simply should not happen by design.
I'm a bit of a noob on this, but can't it be an issue with the load line calibration, like one side pulls more power and the board tries to compensate and gives a spike?
Some of the motherboard vendors like gigabyte have support for the X3D chips on bios versions that came out before the X3D chips even came out. It lets you tweak the v-core voltage even though normally on the newer motherboards bios you can't do that on the X3D chips.
It appears ASUS have the same. 0805 was able to boot and post 7800x3d in our case. Next was 1101 which was the one that 'supported' 7000x3d. And we returned board cuz dead CMOS + other mess with troubleshoot... Waiting for a replacement.
When I updated my gigabyte board to f8a from f7 I noticed much better cpu temps. Oddly enough they removed and then reuploaded f7 but all older bios versions are gone. Since upgrading, y cruncher throws errors pretty much immediately on the second test. My system is still really unstable generally, I had 2 random restarts today. I ran a chkdsk earlier and after it finished, it couldn't even find event viewer, it restarted a few more times and then the computer did some other disk fixing nonsense. No idea if I'm anywhere near stable. So far this 7950X3D on a X670E Aorus Xtreme upgrade has been frustrating considering how much it cost.
I had 1.7v on my 1700x, twice, for at least 30 seconds each, in BIOS, on asus crosshair vi hero x370. It's a bug where if you switch from Auto to voltage Offset, it shows 1.7v and then I restarted, and yeah I had 1.7v in BIOS. CPU still works fine, haven't noticed degradation. I still have it, but now I use 5800x.
Damn, 1.7 Vcore xD I pushed my old R5 1600X to 4.2GHz on 1.5 Vcore for 24/7 and have also had an R5 5600X since release that runs 4.85GHz at 1.41 Vcore 24/7. Manual overclock feels so smooth vs PBO for my R5 5600X.
Those run at 5V or 3.3V and don't include software Vcore monitoring; 2.1V isn't even enough for those. A Pentium 2 will run at 2.1V, except it probably doesn't report that to software either.
@@volodumurkalunyak4651 It's a 5V. You may be correct that there's no sensory circuit, but 2.1V wouldn't kill it. It probably wouldn't be possible to get it running at 2.1V, but it wouldn't kill it, that was the joke. But, thanks for being me by defeating a joke with facts and logic, Pedant. 😘
7:06 You would need to measure it on other alternative pads because the bubble part most likely lost its connection with the substrate. And for the same reason turning it on is most likely impossible, since the bubble will just damage the pins on the MB socket. Whatever power draw you see may just be the MB pins shorting. Something that I would find interesting would be if someone lapped the substrate to see if there are burn marks (carbonization) and how deep they go. Would take a lot of manual work, that is for sure.
Hi, please help me out. I have a Kingston Fury Beast 2x8GB XMP 6000MHz CL40 kit with an AMD B650 Asus TUF Gaming Plus Fatx motherboard and a 7600X. CPU-Z says it's Hynix. The system refuses to run at 6000, insta crash, and at 5600 it's also unstable. I tried your suggested voltages and even beyond. Can it be the motherboard's fault?
Had a thought/hypothesis while listening: if a pin was bent/deformed such that it shorted (i.e. user error, dropped something onto the socket, etc.), could that cause this?
I think there may be 2 different failure modes in play here... the combination of X3D and ASUS motherboard seems to always result in the bulge being right under the CCD with the 3DVCache. The other failures (7700X on ASRock, 7900X on Gigabyte, not sure what the CPU on the MSI board was) had the bulge in a different spot under the I/O die I believe. Those two may be different causes resulting in the same kind of damage, though on the picture from the most recent reddit post it also looks like there is a very slight bulge where the I/O die sits. Also I wonder if the bulge can be caused by mechanical force instead of heat: could the silicon on top blow up in such a way (releasing fumes for example) that pressure builds up? I.e. the V-Cache burns up, releases smoke, but the smoke can't go anywhere because the top of the V-Cache is sealed to the IHS, so the force is directed downwards.
The V-Cache dying would probably shoot material out the sides of the CCD where the V-Cache is bonded to it, rather than push the CCD down enough to make a bulge on the back of the substrate.
Two different causes for something suddenly occurring strikes me as highly unlikely. It'd be like a person getting two rare diseases at the same time.
Some have also speculated that there are embedded capacitors in the substrate with a very tight voltage window that get damaged/become shorts when the voltage goes too high, and apparently the affected motherboards' BIOSes pushed too much voltage to the RAM controller.
GN tested a chip in detail, they saw liquid melting of copper leads, which is 1000 C. VERY hot. The motherboards should never provide the power to do that sort of thing. The runaway catastrophic event _should_ trip a mobo to shut down before it becomes a spectacle. Also, messing around with clocks on my 7900X, I find that the factory power settings are definitely too high. You actually gain performance by undervolting a bit. They're factory-overclocked beyond what is wise, out of the box. And then the mobo makers are taking a lot of liberties & laziness with design, which is the lethal combo. I think the X3D's are just _more_ susceptible, but all of the AM5 run too hot & high voltage for good longevity. AMD put out "recommendations" for voltage & spec details to motherboard makers. It is not strictly enforced, so as to allow for 'ultra performance' creativity by board makers. Their ranges for spec are too loose for a chip which is juiced to 95C as a spec (up to the razor's edge of what the chip can handle). It's the only chip I've ever heard of which gets 'faster' when you manually undervolt it. For reference, I have a 420mm rad, top mounted, cpu at the lowest point in the loop, with thermal grizzly. I'm not observing an inadequate cooling solution. Also, based on temperature graphing from the moment I get to desktop. I suspect that my Asus board is really blasting the chip with juice during boot. And boot times are loooong compared to other chips. So while that may not have cooked a cpu in the past, it's cooking chips that sit in boot for much longer periods of time. I suspect that is a factor.
The burnout I saw was about 6 mm dia on the socket and the IC. Spot temperature was very high. My friend said, the shop replaced the motherboard and 7700 free of charge.
Hi Buildzoid, I love your channel, a lot of useful information, thank you for your hard work. Please tell me, I'm new to the topic of overclocking RAM. I have a memory kit on Samsung B-Die chips; what voltage should I set to overclock the memory so that it is safe for daily use in a work computer? I read that 1.5V for these chips is quite safe as long as the memory has air blowing over it, is that true? I can’t decide what parameters to use for the voltage of RAM, I need the advice of an experienced person. Thanks in advance.
Apparently it's SOC voltages with EXPO - as there are some mobo manufacturers statements as well as BIOSes with vSOC limits being pushed. So it seems it's at least partial cause and how exactly SOC causes (likely) short on vCore maybe buildzoid can explain because I have no clue. But early statements point to SOC as root cause - question is then: HOW?
Does the contact between the IHS and the die break at the temperature where the solder melts? I would have thought it would mostly remain just because the IHS is being held in place so there's not much of a gap. I expect it's a PCB substrate defect causing the short, a short in the die would be highly localised, probably shatter the die due to thermal stress, and not cause the even heating Roman saw on his sample which desoldered the IHS.
As someone who hasn't overclocked a thing in his life and had my DDR5 6000 on EXPO until 2 hours ago, what should I be doing other than limiting my soc to 1.2 V after turning EXPO off?
FYI, FWIW ... I have a B650M Aorus Elite AX and Gigabyte took down the earliest 5 (I think, maybe it's 4) BIOS versions, and released a new version just today (4/26/23). I've heard of lots of BIOS changes across all MB manufacturers in the last few days....
Since this is speculation, what if for some reason VDDG CCD is set too high? Could that maybe cause a short in the CCD that is routed too close to a VCC rail, and in turn cause a short on VCC?
Thermal runaway may not take that much power either, depending on the heatsink used. An AIO watercooler would likely take 2-3x the nominal full load power of a given CPU. A heat pipe based cooler, though, only needs to exceed its dry out power level by a relatively small margin to hit thermal runaway. Depending on the coolers in question, this may only be 200-300W total, or less for smaller coolers.
We now have unofficial statements indicating SoC as the culprit due to damaging thermal sensors. Supposedly even EXPO profiles are setting voltages high enough, but X3D chips are more sensitive to the issue. I'm skeptical but I'm glad my instincts told me to avoid staying above 1.30V. What doesn't make sense to me is the claim that the lack of this temperature data causes the chip to keep pulling more and more power until it melts itself due to the temperature target algorithm. Wouldn't that indicate that this also breaks every other limiter? Even if it did, there's no internal error thrown when sensors go insane outside LN2 mode?? Our chips don't power themselves to the moon when they're kept well below the temp target. I smell shenanigans. EDIT: Was not an 'official' statement.
24:24 Yes I am betting on a 100% voltage/power glitch. But what is causing it will be a big reveal for sure! Cause reaching 160C to 200C and the CPU not shutting down cause of some temp sensors reaching over 100C is an insane glitch tho!
😊Another bumpgate? That's the closest precedent that I'm aware of anyways. This is some sort of manufacturing defect, most likely some sort of contamination in the substrate manufacture. Add heat and you get delamination. Delamination leads to short circuit. I'm not sure why you would want to posit anything more complex than that. Occam's razor. AMD is ramping up production volume, supply issues are common when you do that. It is absolutely not an AGESA bug, that I'm quite sure of.
Interesting... I am not using EXPO, I am using XMP. Does this affect XMP ram profiles? I am using QVL certified memory from ASUS. GSKILL 5600Mhz 28CL 2x32GB Trident Z5 RGB 1.35v, to be exact. I am using an Asus X670E-Plus WIFI with a Ryzen 7 7700x. I am currently using Bios 1409. Previous Bios was, 1223. Max SOC voltage was 1.28v and Vcore 1.27v. Memory is at 1.35v. The only Vmod to my CPU is Curve Optimizer All cores -28.
It would probably run just fine if you manually set soc voltage to 1.12-1.15v. On Ryzen 3000/5000 1.28v is quite excessive, the ddr4 OC manual says such high voltage can cause negative performance scaling. E: I know this is ddr5, not ddr4, but I would figure some relative similarity. Personally I don't like (or trust) AMD's approach of pumping voltage into the cpu.
Even if I overclock my R5 5600X to 4.85GHz at 1.41 Vcore, SoC on auto goes to 1.2V at the highest when the CPU pulls 160W+. I set my SoC to 1.1V and would recommend 1.15V as the highest for overclocking, and below 1.1V SoC for stock CPUs. 1.28V SoC is too high, probably going to kill your CPU over time. Lower it as soon as possible. SoC has the highest chance of damaging a CPU vs Vcore/DDR voltage. I can run my DDR4 at 1.6V without any issues, or even my R5 5600X at 1.55 Vcore, but I do not have the balls to push SoC over 1.2V.
I'd say it's PBO applying 1.5-ish volts, which then kills something within the CPU, making a short circuit, that blasts its surrounding, which causes whole plane to fail and generate heat enough to melt the PCB.
Another thing that might've potentially happened is that a low-power input connected to a high-power rail shorted out internally to the CPU, causing its much thinner traces to burn up inside of the substrate. Something like a sense line, for example. Still just speculation, of course, but I'm looking forward to your inevitable follow-up video!
Pretty sure these substrates are similar to PCBs, like fiberglass or some other heat insulating, non-conductive composite. But if it was a very small short, the temperature could rapidly increase at rapidly decreasing amounts of power relative to how small the short area/defect/whatever is. I mean you can make a wire glowing hot from a 9V, this has 50+ amps available during normal load scenarios, that if that were to pass thru a human hair sized trace or something could be glowing hot like 1000C in probably a second or something.
" that if that were to pass thru a human hair sized trace", think smaller, a LOT smaller. A trace's thickness within the CPU, even a power trace is measured in atoms, therefore the resistance is much higher than say a copper wire the width of a human hair. The smaller the conductor, the higher the resistance and the lower current that the conductor can carry, and the smaller conductor passing the same voltage/amperage will heat faster and hotter. So, in a CPU, the current and voltage needs to be controller very carefully and there is VERY little room for variance or error. Look at a stock 13900K for example, at full all-core load can be consuming 260W at an average of 1.3V, which is 200 amps, now imaging if the voltage suddenly / unexpectedly bumped voltage to 1.5V but kept the same 200 amps and you've increased to 300W. The VRM should prevent this behavior, but if the VRM is asked to increase to 2.1V, it would smoke any modern CPU.
@@racerex340 yeah, but it would have to be a larger thing. A very small trace would just melt/sorta pop and then there's no more connection, since the voltage is so low it wouldn't arc very well. Like if you held a single strand of the extra thin steel wool across a 9V or such, it will melt before the ends get hot enough for you to feel it heating up. No idea about the CPU/substrate layouts, whether it has a collection of traces all nearby or a plane where it has a large sheet like on motherboards/GPUs. But some of those big bubble ones are weird, might be a long duration at lower heat or something. That could be something like flatness/socket mounting issues: if a bunch of the pins have lower pressure/contact, more power could flow through the better connected ones. Depends on how many layers there are and how much actual substrate material is there besides the metal. Like regular perf board: even if you apply like 800C, just a really hot single temp soldering iron, it does basically nothing, it takes a long time to start scorching it, it has a smell and such. But this has also been out for a long time and only now there are CPUs doing this. Are they old ones? New ones? Is there a different stepping/revision of something? Could be a complicated thing.
@@Dinscurge my bet is on voltage / current control that they added to AGESA for 3D Vcache 7000 series that changed something, as so far all of these issues appear to be limited to motherboards running later code, which would explain why we may not have been seeing this issue before, and also explain why motherboard manufacturers have been releasing new firmware while removing access to all prior versions.
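The 13900K arithmetic quoted earlier in this thread checks out with plain P = V × I. A small sketch follows, with one extra case that is purely an assumption for illustration: if the chip behaved like a fixed resistance, the current would rise along with the voltage instead of staying at 200 amps.

```python
def current_amps(power_w, voltage_v):
    """Current drawn at a given power and voltage: I = P / V."""
    return power_w / voltage_v


def power_watts(voltage_v, current_a):
    """Electrical power: P = V * I."""
    return voltage_v * current_a


def power_if_fixed_resistance(power_w, voltage_v, new_voltage_v):
    """Power at a new voltage if the load were a fixed resistance (assumption)."""
    resistance = voltage_v ** 2 / power_w
    return new_voltage_v ** 2 / resistance


if __name__ == "__main__":
    amps = current_amps(260.0, 1.3)
    print(f"260 W at 1.3 V -> {amps:.0f} A")
    print(f"1.5 V at the same current -> {power_watts(1.5, amps):.0f} W")
    resistive = power_if_fixed_resistance(260.0, 1.3, 1.5)
    print(f"1.5 V into a fixed resistance -> {resistive:.0f} W")
```

A real CPU is neither a constant-current nor a purely resistive load, so the true figure for a given voltage bump depends on clocks, leakage and the power limits in play.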
Funny that my R5 1600X never died when clocked to 4.2GHz. It could run 4.3GHz on 1.55 Vcore, but I stuck to 1.5 Vcore at 4.2GHz. Got an R5 5600X on release, could not wait for a BIOS update for my old board, so I got a B550 Aorus Pro V1 and clocked the R5 5600X to 4.85GHz all cores on 1.41 Vcore, 0 issues. Some people are poor and think they can use low end coolers. I know some people around my area who had issues with Ryzen 3000s and 1000s, and it was because they used the stock cooler.
@@Kage0No0Tenshi you're not "poor" if you're buying a 300+ USD CPU on top of a DDR5 kit and motherboard. Shit you can't control happens and your stuff dies (in this case dies in warranty).
Plz. can anyone confirm whether this is happening only with 3D cache cpus or entire ryzen 7000 lineup ? I am planning to build system based on 7600x/7700x, thanks in advance.
The same thing from high voltage (burned pads only) sometimes happens on modern Intel CPUs. I spotted it on a few chips that ran on the motherboard's auto voltage rules. All the CPUs are alive, but have some pads more or less "burned".
There is focus on vddsoc. I think they made the infinity fabric vddg tied to vddsoc, causing IO gates to blowup and a short to ground. Guessing someone forgot a level shifter. 😅 Or maybe they have this in bypass mode when it should be regulated by a LDO.
when I saw the pictures briefly and not looking closely, I kinda thought I'd be hearing they fucked something up with the socket implementation, not enough vcore landings contact area or something like that
In my pocket experience they basically never fail. I did have 1 issue and that was a Gigabyte motherboard dying after 6 years of light to normal usage. No idea what its issue was tho.
My first instinct was that some of the pins on the mobo were misaligned (and shorting Vdd to ground) due to the extreme density and the fact that it's AMD's first LGA socket. I then realized that idea is stupid, AMD isn't manufacturing the boards and it's happening across many board manufacturers
AMD was the oddball still using 80's socket technology on modern CPUs. LGA works just fine. When there's a pin misalignment issue, you SEE it :p thing pops immediately or burns in spectacular manner. If anything, the LGA socket allows the CPU to run for some time with that bulged surface.
The informational progress on this issue cause of this video is not 0. Now a couple people have said that the temperature on the PCB needs to be in the neighborhood of +150C to cause this damage.
I upgraded my 1070 TI to a 6950XT a month ago because Jensen has lost his mind. I was going to go to an AM5 platform but was way too cheap, so I just picked up a 5800X3D and a 32 gig kit of cheap DDR4 to replace my 16 gig kit and 3600x. Apparently besides saving about 700 CAD, I also saved myself a massive headache.
LOL. I just upgraded a month ago from a Ryzen 1700 and I did the same: new B550 board, 5700X and 32GB 3600. I'm very happy with it since I don't game much. Looking for an upgrade for my GTX 1660 Ti, but as you said Jensen is out to lunch, so I'll probably go AMD.
@@omdtdz I went with a 5800X3D because I wanted to keep my B450 board and figured the power constraints should be fine. RDNA2 isn't that bad. The drivers leave a bit to be desired. There's a hardware acceleration bug that's annoying. Before disabling MPO it'd be random black flashes. Once it's disabled they turn to random white flashes. I'd prefer an Nvidia card, but I can't beat the price to performance right now and I'm not giving in to Jensen.
@@simplyscholar21 I went B550 because I had an X370 which is pretty old. Thanks for the heads up about the bug. All in with board, CPU and RAM (board and CPU were on sale) was just around 600 Canadian.
@@omdtdz yeah I got a decent 4000 MHz CL20 kit of DJR for 100. I figure I can downclock it and tighten the timings up, and the CPU itself was 400 on sale. Meanwhile a B650 board with even remotely decent features is like 400 on its own.
@@ActuallyHardcoreOverclocking We could be looking at multiple issues here.....i've seen non X3D chips burn in the area of the IO chiplet, the Asrock example actually burned in an area responsible for various IP blocks powered from VDDCR_SOC.
A poor quality PCB can also be the reason for this issue. I say this because I solder a lot, and in my experience not all PCBs are created equal; some are crappier than others. One good example is Linksys routers: I have tried replacing the SPI flash on many routers and the pads lift with the tiniest amount of heat.
Just watched your video on CPUs failing across different motherboards, and it was super interesting! Loved how you explored potential causes like manufacturing defects, AGESA glitches, and VRM issues. Your in-depth breakdown of voltage rails was on point, and I'm looking forward to seeing if your speculations turn out to be right. Keep up the great work! 👍
I stared at a picture of a CPU for half an hour today.
The substrate material itself will have extremely low thermal conductivity, below 1 W/(m·K). It's essentially fiberglass. The amount of copper incorporated into it will raise that somewhat, but certainly not enough that it could ever dissipate tens or hundreds of watts.
Silicon is a very good thermal conductor on the other hand, not far off as good as aluminium.
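To put rough numbers on that comparison, here is a minimal sketch using Fourier's law for steady conduction through a flat slab, Q = k * A * dT / d. The conductivities are approximate room-temperature textbook values and the slab geometry is entirely hypothetical (not a real CPU package), so treat the output as an order-of-magnitude illustration only.

```python
# Rough 1-D conduction estimate using Fourier's law: Q = k * A * dT / d.
# Conductivities are approximate textbook values (assumptions, not measurements
# of any actual CPU package material).
CONDUCTIVITY_W_PER_M_K = {
    "FR-4-like substrate": 0.3,
    "silicon": 150.0,
    "aluminium": 237.0,
}


def conducted_watts(k, area_m2, delta_t_k, thickness_m):
    """Steady-state heat flow through a flat slab, in watts."""
    return k * area_m2 * delta_t_k / thickness_m


# Hypothetical geometry: a 10 mm x 10 mm patch, 1 mm thick, 50 K across it.
AREA_M2 = 0.010 * 0.010
THICKNESS_M = 0.001
DELTA_T_K = 50.0

for name, k in CONDUCTIVITY_W_PER_M_K.items():
    watts = conducted_watts(k, AREA_M2, DELTA_T_K, THICKNESS_M)
    print(f"{name}: {watts:.1f} W")
```

With these assumed numbers the fiberglass-like slab moves a watt or two while silicon and aluminium move hundreds, which is the gap the comment is describing.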
P.s. (joke) that's why MSI slapping their silicon leaky thermals everywhere! 😂
Found this out when my DIY PCB for a led light resulted in a bunch of overheating LEDs. Turns out as efficient as they are they absolutely need the aluminum PCB to conduct heat away.
Does solder mask increase conductivity?
So whose fault is it? Zen 4 CPUs or AM5 mobos or the user?
@@AdrianMuslim did you listen to the video?
Did not expect to be blessed by zoid-rage today. Awesome dude :D
Zooid-Rage is the Best
Doesn't seem ragey to me..
@@samiraperi467 lul he IS the rage
@@samiraperi467 send some bad b- die timings, that would do it
Mad respect for the non clickbait title.
It isn't out of the question that a manufacturing defect has occurred, the full batch of code 43 Radeon VIIs is precedent
Add the vapour chamber issue with Radeon 7XXX GPU's to that list & it could very well be a manufacturing defect.
There was also apparently an early batch of Ryzen 5900X CPUs that were unstable and generated unrecoverable WHEA errors at stock settings. I was one of the ones who had one and others reported the same thing (the replacement worked fine). I think AMD's QC may be a bit sketchy.
If it was a manufacturing defect it wouldn't take weeks to rear its ugly head.
aaah... Buildzoid is rambling again, bring it on!
According to a Tom's Hardware source, excessive SoC voltage (caused by the EXPO setting) kills the thermal sensors as well as the thermal protection. With those sensors dead, and given the general behavior of the Ryzen 7000 series (pushing max performance within thermal limits), the CPU kills itself because it no longer knows its thermal limits. As the CPU can no longer see how much thermal headroom is left, the boost behavior leads to excessive power draw beyond safe levels.
My concern with that would be is this something that can develop over time, or would you notice it pretty quickly, seeing high SoC voltage and what not
@@Justathought81 judging how the boost algorithm works you would find out very quickly if a thermal sensor is dead ROFL
AMD had confirmed that this is the issue. The funny thing is that most EXPO settings boost the SoC Voltage to 1.3V and the MC_SoC to 1.4V.
AMD stated that the SoC Voltage shouldn't go above 1.3V....but the question is *what* shouldn't go above 1.3V!? Just the SoC Voltage or the SoC Voltage and the MC_SoC Voltage?
Just to be safe, I've lowered my SoC_Voltage to 1.25V and everything still works fine. But my memory controller is still set to 1.4V....
Edit - I just looked into this again. It is supposedly *just* the SoC Voltage. So, if you had an ASRock motherboard, you've been all good the whole time as EXPO profiles on ASRock boards never put more than 1.3V into the SoC Voltage (at least that's how it has always been with my Taichi Carrara) =D
@@NippleSauce Yeah it's just the SoC voltage. All AM5 mainboard manufacturers will release AGESA updates(with locked SoC voltage at 1.3v max.) in the coming days according to AMD.
@@Squilliam-Fancyson I am still a bit confused here because even ASRock released an updated BIOS this morning which has a description stating that "7000X3D users should probably update to this BIOS".
I wonder if the SoC_MC_Voltages could have actually been involved on the ASRock motherboards. Regardless, I'll be updating my BIOS in a few hours and will be able to see if the SoC_MC_Voltage values are decreased or if they're still at 1.4V with EXPO enabled.
I guess I'm also super confused because the SoC_Voltage on the ASRock boards has never gone higher than 1.3V - regardless of whether or not an EXPO profile was being used. Perhaps their new BIOS just prevents users from manually increasing the SoC_Voltage _above_ 1.3V if a 7000X3D CPU is detected?
AMD am5 cpus: *Tsundere Simulator*
_"It's not like I did those volts & frequency because I like you or anything cooler, baka!"_
Intel lga1700 cpus: *Yandere Simulator*
_"Oh boy! Here I go killing high memory speed & tight timings again"_
I hate you 😂
Huh?
@@daniomhailemariam7308 all right boomers, I don't like explaining my memes.
With AMD Ryzen you have to deal with PBO, whose algorithm determines the frequencies and voltage based on the temperature your cooler can hold. My theory is that the recent Zen 4 events come from a buggy AGESA and/or BIOS (UEFI) pushing more voltage and current than usual. AMD recently released an AGESA fix anyway, so they seem to have figured it out, but idk.
And Intel Core 12th/13th gen (and 14th, tbh) has a major DDR5 dropout issue above 7200 Mbps (megabits per second), depending on the motherboard, RAM, and IMC quality. There's also the LGA1700 socket issue, where you have to fit a correcting frame on a brand-new motherboard before putting in the CPU instead of using the stock ILM (loading mechanism). There's also a risk of degradation, depending on the voltages the board may randomly apply.
@@emini6 ok boomer
"This video is getting longer than it should be"
It's not a BZ video if it's short. And in this case short is destructive.
It's not just the X3D. der8auer EN also found damage on the pins of a non-X3D chip (7900X). Buyers beware, owners be more aware.
stuff like this makes me not at all upset that i got in on zen 3 right before the price dropped. glad i skipped this shitshow.
@@quantum5661 glad I got in this shit show, because my junk will shit on yours. Mine is fine. How you just go to “everything this version is shit” off this video, boggles my mind. But hey. Happy gaming my guy. Jesus.
@@quantum5661 also, there is a person bitching that his zen 3 shit the bed right after he bought it. Look up. See how that works? LOL. Jesus.
Ryzen 3000 Chips die too. It’s not new, it’s just an increased speed on the timeline
Funny that my R5 1600X never died when clocked to 4.2 GHz. It could run 4.3 GHz at 1.55 Vcore, but I stuck to 1.5 Vcore at 4.2 GHz.
Got an R5 5600X on release, could not wait for a BIOS update for my old board, so I got a B550 Aorus Pro V1 and clocked the 5600X to 4.85 GHz on all cores at 1.41 Vcore with 0 issues.
People are poor and think they can use low-end coolers.
I know some people in my area who had issues with Ryzen 3000 and 1000 chips, and it was because they used the stock cooler.
G'day Buildzoid,
Thanks for your educated thoughts. It will be interesting to see what Roman & GN find out as they look deeper.
Ironically enough being able to set a static voltage rather than letting the system do it for you might have been a way to prevent this issue. Because like I mentioned in a previous comment motherboards are supporting cpu's before they were even supposed to. That tells me that the firmware isn't locked down and the bios isn't secured.
5800X3D had everything locked down at launch, and the internet went on the "AMD Evil" bandwagon. I am pretty sure AMD regrets not locking down everything now.
@@SolarianStrike Ultimately just depends really on the reason this CPU died. If it was pure fluke then it will blow over. If it doesn't then AMD might do it again.
@@SolarianStrike dont advertise a suite if its not applicable then, AMD ultimately releases the part, not the fans. Or admit that those suites are not fit for purpose on X3D
@@Arwel22597 They did what you said with the 5800X3D and the Internet just like to be mad as usual.
Manufacturers are given a spec to work with long beforehand...how else will they design a board to launch alongside the cpu to begin with?
Finally an opinion worth listening to.
To add to your VRM controller theory:
Back when your chip died, two chip deaths happened in ZA as well. One on an ASUS X670E-PRO and another on an ASUS CROSSHAIR X670E EXTREME. I've seen other chips dead on CROSSHAIR boards, and obviously your GENE as well. Der8auer's viewer had a high end AORUS board I think.
As far as we can tell, there are at least two Gigabyte boards (one claimed to be a B650 Aorus elite) and one verified MSI board.
And since we now have evidence that the issue isn't limited to 3D VCache 7000 series, nor is it limited to dual CCD chips, this leads me to believe that the issue is either tied to flaw in the silicon, or an issue with the voltage/power control in firmware, whether it's related to boosting or a vCore overcurrent situation.
@@racerex340 I'm with you on that. Transient high on initialization, CPU dies, user waits for an extended POST while the CPU cooks due to it not having initialized and recognized OTP. Yeah pretty much. I don't believe this is high SOC.
cope
@@notreya 🤣🤣🤣 Cope with what? I'm just giving BZ some info to look over.
@@dainluke I'm thinking it's something weird in AGESA that motherboard manufacturers have been leveraging per AMD design specs but isn't working right? Would explain Mobo manufacturers suddenly ripping all of their prior AM5 BIOS down except the latest, maybe they found something.
Could it be a socket design failure? Intel had the same kind of problem with their early LGA designs. Power is dissipated in the places with the higher resistance. So if there is a problem with the socket, the pins will heat up and cause the substrate to bubble, which in turn introduces short circuits in the inner layers and makes the situation even worse, as the contacts have become the point of highest resistance in the circuit.
What I was thinking. It is the LGA causing the heat.
Do you still have your 7950 that died?
nvrmind you talk about it at 23:00
Is there any damage on it like derbaur missed?
I've just built a new AMD system after 10 years and this has to happen.
ASUS have some issue too on intel.
ASRock s better.
Too early to tell, might be a NVIDIA connector deal in that it won't amount to much in the end.
Gigabyte all day long 😉 I'll never use an Asus board or any other Asus product again.
First issue was the Asus gaming laptop I got in 2010, then my Asus Maximus VI Extreme, and after that an Asus GTX 1060 3GB. Both had serious temperature issues, and the board had clock issues with my i7.
@@Kage0No0Tenshi Did you ever try Alienware?
It's an expensive, garbage, unfixable laptop.
Same here. This is my first AMD CPU since the FX-6300 and now this happens lmao.
Some people speculate this VSOC voltage problem is related to EXPO RAM settings. der8auer received a 7800X where the solder has melted and has even moved outside of the IHS. It also shows the same "bubble" on the underside of the package. EXPO was used here as well.
It's kinda confirmed by board manufacturers, pushing SOC voltage limiting BIOSes.
There used to be a guy exposing issues with DDR5, and how it's not better than 4 at low speeds, while high speeds used too much power on Intel. It doesn't surprise me AMD has similar issues. Of course most mainstream tech news refused to admit these issues with DDR5, so shout out to that guy. Everyone on DDR5 has been beta testing it. It won't be good until second generation products can handle high speed expo modules.
The high temps were there from the get-go: 95°C with a standard air cooler on the 7950X series, for example. I power-limited it from the stock 170 W to 105 W, and my CPU temp with an AIO is now 35°C. I also installed the recent Asus motherboard update for my CPU, which locks the SoC at 1.3 V.
Yes, but if you are lowering the power limit your clocks are lower. May as well buy a lower spec’d cpu.
When I first saw the images of these, my automatic reaction, from experience back in the '90s, was that the problem is with the motherboard. Motherboards supply the power, and when the substrate goes it is almost exclusively the motherboard taking down the CPU, not the CPU itself.
Then I saw more reports come out and at first I thought: "Still the motherboard, but maybe bad AGESA or bugged UEFI causing thermal runaway? Maybe faulty solder or lead mounts, but they fixed those issues decades ago."
Then when other CPUs started to show it, I started to think this is one of those perfect storms of multiple unrelated factors lining up just right to cause this. I don't have the depth of knowledge that AHC does, but I am starting to think it is a case of both the CPU and the motherboard, a folie à deux.
I haven't seen substrate fails like this since Thunderbirds were getting killed by high performance boards with auto overclocking going wrong.
Using my 7800X3D a bit over a week now on a X670E prime pro wifi and did use bios version 1406 and 1408 (also gaming on that BIOS). Few days ago I updated to 1409 and had no shutdowns or anything yet (also gaming done). But with EXPO on, my SoC voltage was over 1.3 volt. Probably 1.36V or something. And EXPO was active since day one. Turned off EXPO for now and voltage dropped to 1.0 - 1.1V.
If you're going to enable EXPO, I think the common wisdom is to limit SoC voltage to 1.25, because there is some variance.
Too much pressure could stress and damage the silicon and/or the substrate. Insufficient pressure could lead to improper contact between the pins and the CPU pads. Due to the high currents provided by the vcore rail, temps could go up and there could even be micro sparking.
My 7700X died in what sounds like a similar fashion to your 7950X3D. It would just throw the yellow DRAM light on my Asus board. The B650E-F was also faulty. I really couldn't see any physical damage anywhere though. I really should have checked the voltages, but to be fair it happened back in early February before the news started to pick up.
"asus board"...
Low resistance / short circuits creates heat in circuits. So does high resistance with high amps.
If you look at the damage on the cpu does it look instant like a short circuit or slow and gradual like a heated wire. My guess is some cpus have a poor connection internally somewhere creating a hotspot.
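Both heating cases mentioned here follow from Ohm's law, and a short sketch makes the difference visible. The rail ceiling and the resistance and current values below are hypothetical illustration numbers, not measurements from any board or CPU.

```python
# Resistive heating under two different constraints (illustrative values only).

def power_at_fixed_current(current_a, resistance_ohm):
    """P = I^2 * R: heat in a resistance when the current is forced through it."""
    return current_a ** 2 * resistance_ohm


def voltage_needed(current_a, resistance_ohm):
    """V = I * R: voltage required to push that current through the resistance."""
    return current_a * resistance_ohm


def power_at_fixed_voltage(voltage_v, resistance_ohm):
    """P = V^2 / R: heat in a resistance sitting directly across a voltage source."""
    return voltage_v ** 2 / resistance_ohm


RAIL_CEILING_V = 2.0   # hypothetical upper bound on any CPU rail
FAULT_CURRENT_A = 100.0  # hypothetical current

for r in (0.01, 0.1, 1.0):
    print(
        f"R = {r} ohm: "
        f"{power_at_fixed_voltage(RAIL_CEILING_V, r):.0f} W at {RAIL_CEILING_V} V, "
        f"or {voltage_needed(FAULT_CURRENT_A, r):.0f} V needed for {FAULT_CURRENT_A:.0f} A "
        f"({power_at_fixed_current(FAULT_CURRENT_A, r):.0f} W)"
    )
```

The point of the design is to separate the two constraints: with the voltage capped, a lower resistance dissipates more power, while forcing a large current through a high resistance needs a voltage the rail may not be able to supply.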
"high resistance with high amps" - this would require high voltage though, and as mentioned nothing should be getting higher than 2.
As I said on Roman's: A phase transition would be needed to cause a bulge in solid material like that. In other words, some small area melted or boiled.
Like Roman said in his videos
You can popcorn a substrate with a heatgun set to 350C easily.
@@ActuallyHardcoreOverclocking Ooh, popcorn 🍿😋 - another phase change process.
Quick rule of thumb: the gas phase of a material has about 1000 times less density than the solid or liquid phases. For a given mass, that's 1000 times more volume in the vapor phase - and that volume's gotta go somewhere.
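As a sanity check on that rule of thumb, here is a minimal sketch using the ideal gas law, V = nRT/P. It assumes the material changing phase is trapped moisture (water), which is only an assumption for illustration; nothing in the thread confirms what actually vaporized. The temperature and pressure are also hypothetical.

```python
# Ideal-gas estimate of how much volume a liquid takes up once vaporized.
# Assumption: the vaporizing material is water at 1 atm (illustration only).
R_GAS_J_PER_MOL_K = 8.314462618
WATER_MOLAR_MASS_G_PER_MOL = 18.015
LIQUID_WATER_DENSITY_G_PER_CM3 = 1.0  # approximate


def vapor_volume_cm3(mass_g, temp_c, pressure_pa=101_325.0):
    """Volume of water vapor from the ideal gas law, in cubic centimetres."""
    moles = mass_g / WATER_MOLAR_MASS_G_PER_MOL
    volume_m3 = moles * R_GAS_J_PER_MOL_K * (temp_c + 273.15) / pressure_pa
    return volume_m3 * 1e6


def expansion_ratio(temp_c, pressure_pa=101_325.0):
    """Vapor volume divided by liquid volume for the same mass of water."""
    liquid_cm3_per_g = 1.0 / LIQUID_WATER_DENSITY_G_PER_CM3
    return vapor_volume_cm3(1.0, temp_c, pressure_pa) / liquid_cm3_per_g


print(f"Expansion ratio at 150 C: about {expansion_ratio(150.0):.0f}x")
```

Under these assumptions the ratio comes out on the order of a couple of thousand, the same ballpark as the "about 1000 times" rule quoted above.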
@@MrKentaroMotoPI stop, you're trying to come across as smart but all you have done is show your ego and its insistence upon itself.
@@alexmills1329 I have 40 years of experience in the requisite fields. Other than the taste of shoe leather, what do you have?
I noticed a couple of days ago the very high SoC voltages being used when EXPO was enabled with my 7800X3D and G-Skill DDR-6000 CL30 Ram, on an Asus TUF Gaming X670E-plus.
This is with the latest (non beta) bios 1409 also released a few days ago.
What concerned me was the sheer jump in SoC power - non EXPO default voltage was 1.05v with SoC power at ~9W.
This jumped to 21W with EXPO enabled and 1.35v SoC voltage. That's quite a jump, especially when you consider that the result is more than double the default (an increase of about 133%).
Also that power draw is continuous, and doesn't vary much from idle to load....i.e. any stress it's putting on various SoC IP blocks at that voltage is continuous.
I lowered my SoC to a much safer 1.15v, then subsequently to 1.1v which has dropped the SoC package power down to ~10W.
I recommend everyone with EXPO enabled do this for now, or just disable EXPO if running any 7000 series AM5 CPU (X3D or not).
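The figures in this comment can be checked with a little arithmetic. The sketch below computes the percentage jump and compares it with a simple voltage-squared scaling estimate. The V^2 model is an assumption (it describes dynamic switching power at a fixed clock and ignores leakage and any clock changes that EXPO brings), so it is a rough reference point, not an explanation of the measured draw.

```python
# Figures quoted in the comment: 9 W at 1.05 V (default), 21 W at 1.35 V (EXPO).
DEFAULT_POWER_W, DEFAULT_V = 9.0, 1.05
EXPO_POWER_W, EXPO_V = 21.0, 1.35


def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def v_squared_scaled_power(base_power_w, base_v, new_v):
    """Power predicted by the simplified P ~ V^2 model (an assumption)."""
    return base_power_w * (new_v / base_v) ** 2


print(f"Measured increase: {percent_increase(DEFAULT_POWER_W, EXPO_POWER_W):.0f}%")
print(f"V^2 estimate at {EXPO_V} V: "
      f"{v_squared_scaled_power(DEFAULT_POWER_W, DEFAULT_V, EXPO_V):.1f} W")
print(f"V^2 estimate at 1.1 V: "
      f"{v_squared_scaled_power(DEFAULT_POWER_W, DEFAULT_V, 1.1):.1f} W")
```

The reported 21 W is well above what voltage-squared scaling alone would predict from the 9 W baseline, while the reported ~10 W at 1.1 V sits close to the estimate, so something beyond the simple model contributes at the higher setting.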
when youre looking at soc voltage are you looking at cpu vddcr_soc or cpu vcore soc in hwinfo64?
How you check the SoC power draw?
@@DavidDaBoss23 you have to use hwinfo64. Under cpu soc power I'm assuming.
With 5600 DOCP for me it seems to set my SOC to 1.24v, which seems high but is at least better than what your board was trying to do, maybe in very rare cases the boards can massively increase SOC voltage even further than yours which causes the CPU destruction?
@@Silentheaven89 CPU VDDCR_SOC Voltage and CPU SoC Power is what you need to be looking at in hwinfo64.
IDK, my first 5800X3D blew up after a month and I had to return it and get another one. I noticed a few other people seem to have similar issues. I was concerned about getting another 3D chip. Is it the new 7000X3D chips? Or all the 7000s?
Well, I've run 3: a 7600X, a 7950X and now a 7800X3D... so ALL is a bit broad, my guy. (One Honda has an issue and all of them are shit? Let's open our minds a bit, eh?)
@@KB-1976 My question is are people having issues only with the 3D chips. Or any 7000 series. Because one guy had no issues. Nobody else is? Let's open our minds here. You are not the only 7000 user on the planet.
@@jasonmaxwell9762 Der8auer has a faulty non-X3D chip, either a 7700X or 7900X.
@@alinzelnan Ok so it's generally not just an X3D issue. I understand there can always be bad chips of any kind. I was wondering if people were having issues with 3D V-Cache chips in general. After my 5800X3D failed, I noticed on forums others were having issues with bad 3D chips. I was wondering if the new 7000X3D chips were also having an issue with bad chips.
@@jasonmaxwell9762 I guess they're more sensitive because of the V-cache, especially regarding voltages.
19:43 - Is there something about a motherboard VRM that prevents it from sending out higher voltage than this calculation (duty cycle * input voltage) if it's faultily doing more duty cycle than needed for the load? Like, if a PSU tried to do 20% duty cycle when there's no load on output, it would end up with the full input voltage on the output, not duty cycle * input.
If the MB pumps power into a dead and shorted CPU, as in power enough to cause this issue, that sounds like something that simply should not happen by design.
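The duty-cycle question above can be illustrated with the textbook model of an ideal non-synchronous buck converter: in continuous conduction Vout = D * Vin, but at light load the converter enters discontinuous conduction and the output rises toward Vin. The sketch below is only that textbook model with hypothetical component values. Motherboard VRMs are typically synchronous and feedback-regulated, which this model does not capture, so it says nothing definitive about what happened on these boards.

```python
import math


def buck_conversion_ratio(duty, inductance_h, load_ohm, f_sw_hz):
    """Vout/Vin for an ideal non-synchronous buck converter (textbook model).

    K = 2*L*f/R is the dimensionless conduction parameter. When K >= 1 - D the
    converter stays in continuous conduction and the ratio is simply D.
    Otherwise it is in discontinuous conduction and the ratio rises above D.
    """
    k = 2.0 * inductance_h * f_sw_hz / load_ohm
    if k >= 1.0 - duty:
        return duty
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * k / duty ** 2))


# Hypothetical values, chosen only to show the trend.
VIN_V = 12.0
DUTY = 0.2
INDUCTANCE_H = 150e-9
F_SW_HZ = 500e3

for load_ohm in (0.01, 1.0, 100.0, 1e6):
    ratio = buck_conversion_ratio(DUTY, INDUCTANCE_H, load_ohm, F_SW_HZ)
    print(f"load {load_ohm:g} ohm -> Vout about {ratio * VIN_V:.2f} V")
```

In this model a heavy load gives exactly D * Vin, and as the load resistance grows the output climbs toward the full input voltage, which matches the intuition in the question for a converter that cannot sink current.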
If I undervolted the CPU, for example to 1.3 volts in an offset mode, would it prevent all these issues?
I had a 7700X die within 4 months on a Noctua NH-D15 with good thermals. First time I had to RMA a CPU.
Deamn ⁰
I'm a bit of a noob on this, but can't it be an issue with the load line calibration, like one side pulls more power and the board tries to compensate and gives a spike?
I noticed that Asus removed all old BIOS versions and only 1409 is still up on the website.
Now 1409? It was 1202 just yesterday!
Gigabyte did the same thing.
I'm currently on 1409
@@ertai222 I am on the same version as I was on a version that got removed so I got nervous and updated
Some of the motherboard vendors like gigabyte have support for the X3D chips on bios versions that came out before the X3D chips even came out.
It lets you tweak the v-core voltage even though normally on the newer motherboards bios you can't do that on the X3D chips.
It appears ASUS have the same. 0805 was able to boot and post 7800x3d in our case. Next was 1101 which was the one that 'supported' 7000x3d. And we returned board cuz dead CMOS + other mess with troubleshoot... Waiting for a replacement.
"if the VRM goes nuts" what an amazing phrase, so much fun with it
When I updated my gigabyte board to f8a from f7 I noticed much better cpu temps. Oddly enough they removed and then reuploaded f7 but all older bios versions are gone. Since upgrading, y cruncher throws errors pretty much immediately on the second test. My system is still really unstable generally, I had 2 random restarts today. I ran a chkdsk earlier and after it finished, it couldn't even find event viewer, it restarted a few more times and then the computer did some other disk fixing nonsense. No idea if I'm anywhere near stable. So far this 7950X3D on a X670E Aorus Xtreme upgrade has been frustrating considering how much it cost.
Why are you overclocking so much? Or is it unstable at stock configuration?
@Will Proctor No, this is at stock configuration.
I would RMA the snot out of it.
@@Cjorss OOF.
I had 1.7v on my 1700x, twice, for at least 30 seconds each, in BIOS, on asus crosshair vi hero x370. It's a bug where if you switch from Auto to voltage Offset, it shows 1.7v and then I restarted, and yeah I had 1.7v in BIOS. CPU still works fine, haven't noticed degradation. I still have it, but now I use 5800x.
Damn, 1.7 Vcore xD I pushed my old R5 1600X to 4.2 GHz at 1.5 Vcore 24/7, and I've also had an R5 5600X since release that runs 4.85 GHz at 1.41 Vcore 24/7. Manual overclocking feels so smooth vs PBO on my R5 5600X.
"If your reporting software indicated 2.1v your CPU would be dead, no question"
> Breaks out his 486 CPU.... 😆
Those run at 5 V or 3.3 V and don't include software Vcore monitoring, and 2.1 V isn't enough for them either. A Pentium II will run at 2.1 V, except it probably doesn't report that to software either.
@@volodumurkalunyak4651 It's a 5V. You may be correct that there's no sensory circuit, but 2.1V wouldn't kill it. It probably wouldn't be possible to get it running at 2.1V, but it wouldn't kill it, that was the joke.
But, thanks for being me by defeating a joke with facts and logic, Pedant. 😘
7:06 You would need to measure it on other alternative pads because the bubble part most likely lost its connection with the substrate.
And for the same reason turning it on is most likely impossible since the bubble will just damage the pins on MB socket. Whatever power draw you see may just be the MB pins shorting.
Something that I would find interesting would be if someone lapped the substrate to see if there are burn marks (carbonization) and how deep they go. It would take a lot of manual work, that is for sure.
Ahhhhh a BZ ramble. I have a warm fuzzy feeling already.
So to add some info here: CPU VDDCR_SOC without EXPO enabled on 6000 MHz RAM is 1.016 V at idle, vs 1.344 V with EXPO enabled.
The highest mine ever got was 1.247 volts.
@@trparky prob bios version
Hi, please help me out. I have a Kingston Fury Beast 2x8GB XMP 6000 MHz CL40 kit with an AMD B650 Asus TUF Gaming Plus ATX motherboard and a 7600X. CPU-Z says it's Hynix. The system refuses to run at 6000, insta crash, and at 5600 it's also unstable. I tried your suggested voltages and even beyond. Can it be the motherboard's fault?
Had a thought/hypothesis while listening: if a pin was bent/deformed such that it shorted (i.e. user error, dropped something onto the socket, etc.), could that cause this?
I think there may be 2 different failure modes in play here... the combination of X3D and ASUS motherboard seems to always result in the bulge being right under the CCD with the 3DVCache. The other failures (7700X on ASRock, 7900X on Gigabyte, not sure what the CPU on the MSI board was) had the bulge in a different spot under the I/O die I believe. Those two may be different causes resulting in the same kind of damage, though on the picture from the most recent reddit post it also looks like there is a very slight bulge where the I/O die sits.
Also I wonder if the bulge can be caused by mechanical force instead of heat: could the silicon on top blow up in such a way (releasing fumes for example) that pressure builds up? I.e. the V-Cache burns up, releases smoke, but the smoke can't go anywhere because the top of the V-Cache is sealed to the IHS, so the force is directed downwards.
The V-Cache dying would probably shoot material out the sides of the CCD where the V-Cache is bonded to it, rather than push the CCD down enough to make a bulge on the back of the substrate.
Two different causes for something suddenly occurring strikes me as highly unlikely. It'd be like a person getting two rare diseases at the same time.
Some have also speculated that there are embedded capacitors in the substrate that have very tight voltage window that get damaged/become shorts when voltage goes too high and apparently affected motherboards' bios pushed too much voltage to ram controller
Had such a good nap! Thanks man
GN tested a chip in detail and saw melted copper leads, which takes over 1000 C (copper melts at about 1085 C). VERY hot.
The motherboards should never provide the power to do that sort of thing. The runaway catastrophic event _should_ trip a mobo to shut down before it becomes a spectacle.
Also, messing around with clocks on my 7900X, I find that the factory power settings are definitely too high. You actually gain performance by undervolting a bit. They're factory-overclocked beyond what is wise, out of the box. And then the mobo makers are taking a lot of liberties & laziness with design, which is the lethal combo. I think the X3D's are just _more_ susceptible, but all of the AM5 run too hot & high voltage for good longevity.
AMD put out "recommendations" for voltage & spec details to motherboard makers. It is not strictly enforced, so as to allow for 'ultra performance' creativity by board makers.
Their ranges for spec are too loose for a chip which is juiced to 95C as a spec (up to the razor's edge of what the chip can handle). It's the only chip I've ever heard of which gets 'faster' when you manually undervolt it. For reference, I have a 420mm rad, top mounted, cpu at the lowest point in the loop, with thermal grizzly. I'm not observing an inadequate cooling solution.
Also, based on temperature graphing from the moment I get to desktop. I suspect that my Asus board is really blasting the chip with juice during boot. And boot times are loooong compared to other chips. So while that may not have cooked a cpu in the past, it's cooking chips that sit in boot for much longer periods of time. I suspect that is a factor.
The burnout I saw was about 6 mm dia on the socket and the IC. Spot temperature was very high. My friend said, the shop replaced the motherboard and 7700 free of charge.
Hi Buildzoid, I love your channel, a lot of useful information, thank you for your hard work. Please tell me, I'm new to the topic of overclocking RAM. I have a memory kit on Samsung B-Die chips. What voltage should I set to overclock the memory so that it is safe for daily use in a work computer? I read that 1.5V for these chips is quite safe as long as the memory has airflow over it, is that true? I can't decide what voltage to use for the RAM, and I need the advice of an experienced person. Thanks in advance.
Apparently it's SOC voltages with EXPO - as there are some mobo manufacturers statements as well as BIOSes with vSOC limits being pushed. So it seems it's at least partial cause and how exactly SOC causes (likely) short on vCore maybe buildzoid can explain because I have no clue. But early statements point to SOC as root cause - question is then: HOW?
Does the contact between the IHS and the die break at the temperature where the solder melts? I would have thought it would mostly remain just because the IHS is being held in place so there's not much of a gap.
I expect it's a PCB substrate defect causing the short, a short in the die would be highly localised, probably shatter the die due to thermal stress, and not cause the even heating Roman saw on his sample which desoldered the IHS.
On der8auers processor the damage is below the IO chip, so the vdd soc might still be the reason, as vcore doesn't go to the IO die.
As someone who hasn't overclocked a thing in his life and had my DDR5 6000 on EXPO until 2 hours ago, what should I be doing other than limiting my soc to 1.2 V after turning EXPO off?
FYI, FWIW ... I have a B650M Aorus Elite AX and Gigabyte took down the earliest 5 (I think, maybe it's 4) BIOS versions, and released a new version just today (4/26/22). I've heard lots of BIOS changes across all MB manufacturers in the last few days....
Since this is speculation, what if for some reason VDDG CCD is set too high? Could that maybe cause a short in the CCD that is routed too close to a VCC rail, and in turn cause a short on VCC?
Thermal runaway may not take that much power either, depending on the heatsink used.
An AIO watercooler would likely take 2-3x the nominal full load power of a given CPU. A heat pipe based cooler, though, only needs to exceed its dry-out power level by a relatively small margin to hit thermal runaway.
Depending on the coolers in question, this may only be 200-300W total, or less for smaller coolers.
So to be safe, until the BIOS update, do we put the memory back to stock 4800 MHz? MAG B650 Tomahawk here, CPU is an AMD 7700 non-X, G.Skill DDR5-6000.
What a throwback to the old AMD cpus without thermal shutdown lol
We now have unofficial statements indicating SoC as the culprit due to it damaging the thermal sensors. Supposedly even EXPO profiles are setting voltages high enough, but X3D chips are more sensitive to the issue. I'm skeptical, but I'm glad my instincts told me to avoid staying above 1.30V.
What doesn't make sense to me is the claim that the lack of this temperature data causes the chip to keep pulling more and more power until it melts itself due to the temperature target algorithm. Wouldn't that indicate that this also breaks every other limiter? Even if it did, there's no internal error thrown when sensors go insane outside LN2 mode??
Our chips don't power themselves to the moon when they're kept well below the temp target. I smell shenanigans.
EDIT: Was not an 'official' statement.
Link please
how can i peg someones vrm to specified temps. how can i set load indiscriminate of process. is it firmware or is it windows
100 C operating temperatures are perfectly fine, we tested it. I saw straight through that one somehow.
what setting on hwinfo should i keep an eye on - to make sure its normal ?
CPU VDDCR_SOC Voltage and CPU SoC Power
@@SadisticStang ty -lowered mine to 1.0v and its getting about 10w soc power vs the default 18w with normal expo voltage.
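For anyone wondering whether that drop is plausible: here is a rough sanity check, assuming SoC power scales with roughly the square of the voltage at a fixed clock (a common first-order rule of thumb, not an AMD spec). The 1.3V "normal EXPO" reference voltage is an assumption taken from other comments in this thread, not something measured here, and `scaled_power` is just a hypothetical helper name.

```python
# Rough sanity check: dynamic power in CMOS scales roughly with voltage squared
# at a fixed clock. The 1.3 V reference SoC voltage is an ASSUMPTION taken from
# other comments in this thread, not a measured value.

def scaled_power(p_ref_w: float, v_ref: float, v_new: float) -> float:
    """Estimate power at v_new from a reference point, assuming P ~ V^2."""
    return p_ref_w * (v_new / v_ref) ** 2

# 18 W reported at the default EXPO voltage, SoC then lowered to 1.0 V.
estimate_w = scaled_power(p_ref_w=18.0, v_ref=1.3, v_new=1.0)
print(f"Estimated SoC power at 1.0 V: {estimate_w:.1f} W")
```

Under those assumptions the estimate lands a little under 11W, which is in the same ballpark as the roughly 10W reported above. Leakage and load differences are ignored, so treat it as a plausibility check only.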
24:24 Yes I am betting on a 100% voltage/power glitch. But what is causing it will be a big reveal for sure!
Cause reaching 160C to 200C and the CPU not shutting down cause of some temp sensors reaching over 100C is an insane glitch tho!
😊Another bumpgate?
That's the closest precedent that I'm aware of anyways.
This is some sort of manufacturing defect, most likely some sort of contamination in the substrate manufacture. Add heat and you get delamination. Delamination leads to short circuit. I'm not sure why you would want to posit anything more complex than that. Occam's razor.
AMD is ramping up production volume, supply issues are common when you do that.
It is absolutely not an AGESA bug, that I'm quite sure of.
If we set static VCore in BIOS? Could we save it?
Interesting... I am not using EXPO, I am using XMP. Does this affect XMP ram profiles? I am using QVL certified memory from ASUS. GSKILL 5600Mhz 28CL 2x32GB Trident Z5 RGB 1.35v, to be exact. I am using an Asus X670E-Plus WIFI with a Ryzen 7 7700x. I am currently using Bios 1409. Previous Bios was, 1223. Max SOC voltage was 1.28v and Vcore 1.27v. Memory is at 1.35v. The only Vmod to my CPU is Curve Optimizer All cores -28.
It would probably run just fine if you manually set soc voltage to 1.12-1.15v. On Ryzen 3000/5000 1.28v is quite excessive, the ddr4 OC manual says such high voltage can cause negative performance scaling. E: I know this is ddr5, not ddr4, but I would figure some relative similarity.
Personally I don't like (or trust) AMD's approach of pumping voltage into the cpu.
Even when I overclock my R5 5600X to 4.85GHz at 1.41V Vcore, SoC on auto goes to 1.2V at the highest while the CPU pulls 160W+.
I set my SoC to 1.1V and would recommend 1.15V as the highest for overclocking, and below 1.1V SoC for CPUs at stock.
1.28V SoC is too high, probably going to kill your CPU over time. Lower it as soon as possible.
SoC voltage has the highest chance of damaging a CPU compared with Vcore/DDR voltage.
I can run my DDR4 on 1.6V without any issues, or even my R5 5600X on 1.55V Vcore, but I do not have the balls to push SoC over 1.2V
I was just reading about some bios needing regulation update. Is this related?
There is an option on my Asus BIOS to limit the DDR voltage to 1.4V? Otherwise, it allows the voltage to be 2.07V
With the high voltage disabled, the system trains the RAM faster.
My question is: what changed since 7000 came out? Why now? Because it's not just X3D.
How much L2atom voltage to use for overclocking Alder lake?
is vcore capable of a thermal runaway on modern mobos? i mean theres got to be a failsafe for this
I'd say it's PBO applying 1.5-ish volts, which then kills something within the CPU, making a short circuit, that blasts its surrounding, which causes whole plane to fail and generate heat enough to melt the PCB.
Another thing that might've potentially happened is that a low-power input connected to a high-power rail shorted out internally to the CPU, causing its much thinner traces to burn up inside of the substrate. Something like a sense line, for example. Still just speculation, of course, but I'm looking forward to your inevitable follow-up video!
pretty sure these substrates are similar to PCBs, like fiberglass or some other heat insulating non conductive composite. but if it was a very small short, the temperature could rise rapidly, with less and less power needed the smaller the short area/defect is. i mean you can make a wire glowing hot from a 9v, and this has 50+ amps available during normal load scenarios, that if that were to pass thru a human hair sized trace or something it could be glowing hot, like 1000c, in probably a second or somethin
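A back-of-envelope sketch of that claim, assuming a round copper conductor 70 µm across (a typical human hair width, picked for illustration) carrying the 50 A mentioned above, with no heat escaping at all (adiabatic) and room-temperature copper properties. All of those are simplifying assumptions, and `heating_rate_k_per_s` is a made-up helper name.

```python
import math

# Approximate room-temperature textbook values for copper.
RESISTIVITY = 1.68e-8      # ohm * m
DENSITY = 8960.0           # kg / m^3
SPECIFIC_HEAT = 385.0      # J / (kg * K)

def heating_rate_k_per_s(current_a: float, diameter_m: float) -> float:
    """Adiabatic temperature rise rate of a round copper conductor.

    Per unit length: P = I^2 * rho / A and heat capacity = density * c * A,
    so dT/dt = I^2 * rho / (A^2 * density * c). The length cancels out.
    """
    area = math.pi * (diameter_m / 2) ** 2
    return current_a ** 2 * RESISTIVITY / (area ** 2 * DENSITY * SPECIFIC_HEAT)

# ASSUMPTION: 70 um diameter stands in for a "human hair sized" conductor.
rate = heating_rate_k_per_s(current_a=50.0, diameter_m=70e-6)
time_to_1000k_rise = 1000.0 / rate
print(f"{rate:.2e} K/s, about {time_to_1000k_rise * 1e3:.2f} ms for a 1000 K rise")
```

With these numbers the rise time comes out on the order of a millisecond, so "in probably a second" is, if anything, generous. In reality heat leaks into the surrounding material, resistivity climbs with temperature, and the conductor would melt open long before any steady state.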
" that if that were to pass thru a human hair sized trace", think smaller, a LOT smaller. A trace's thickness within the CPU, even a power trace, is measured in atoms, therefore the resistance is much higher than, say, a copper wire the width of a human hair. The smaller the conductor, the higher the resistance and the lower the current that the conductor can carry, and a smaller conductor passing the same current will heat faster and hotter. So, in a CPU, the current and voltage need to be controlled very carefully and there is VERY little room for variance or error. Look at a stock 13900K for example: at full all-core load it can be consuming 260W at an average of 1.3V, which is 200 amps. Now imagine the voltage suddenly / unexpectedly bumped to 1.5V while keeping the same 200 amps, and you've increased to 300W. The VRM should prevent this behavior, but if the VRM is asked to increase to 2.1V, it would smoke any modern CPU.
@@racerex340 yeah, but it would have to be a larger thing. a very small trace would just melt/sorta pop and then there's no more connection, since the voltage is so low it wouldn't arc very well. like if you held a single strand of the extra thin steel wool across a 9v or such, it will melt before the ends get hot enough for you to feel it heating up, probably
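The 13900K arithmetic in that comment is just P = V × I. A minimal sketch using the numbers from the comment; the last step, which treats the chip as a fixed resistance so that current rises with voltage, is my own simplifying assumption for comparison and not how a real CPU behaves.

```python
def current_amps(power_w: float, voltage_v: float) -> float:
    """I = P / V"""
    return power_w / voltage_v

def power_watts(voltage_v: float, amps: float) -> float:
    """P = V * I"""
    return voltage_v * amps

# Figures from the comment: 260 W at an average of 1.3 V.
baseline_current = current_amps(260.0, 1.3)          # about 200 A
bumped_power = power_watts(1.5, baseline_current)    # about 300 W at the same current

# ASSUMPTION for illustration only: model the load as a fixed resistance,
# so current scales with voltage and power goes as V^2 / R.
equivalent_resistance = 1.3 / baseline_current
resistive_power = 1.5 ** 2 / equivalent_resistance

print(f"{baseline_current:.0f} A baseline, {bumped_power:.0f} W at 1.5 V with the same current")
print(f"{resistive_power:.0f} W at 1.5 V if the load behaved like a fixed resistance")
```

The fixed-current case matches the 300W in the comment. The fixed-resistance case comes out higher, which is one reason a small voltage bump can cost more power than the simple multiplication suggests.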
no idea the cpu/substrate layouts if it had a collection of traces all nearby or like a plane where it has a large sheet like on motherboards/gpus
but some of those big bubble ones are weird might be a long duration at lower heat or something, that could be something like flatness/socket mounting issues if bunch of the pins are lower pressure/contact could flow more power thru the better connected ones,
depends how many layers how much actual substrate material is there besides the metal, like regular perf board even if you apply like 800c just really hot single temp soldering iron it does nothin basically, take a long time to start scorching it and stuff, has a smell and such
but it has also been out for a long time and now there's cpus doing this, are they old ones? new ones? is there a different stepping/revision of something? could be a complicated thing
@@Dinscurge my bet is on voltage / current control that they added to AGESA for 3D Vcache 7000 series that changed something, as so far all of these issues appear to be limited to motherboards running later code, which would explain why we may not have been seeing this issue before, and also explain why motherboard manufacturers have been releasing new firmware while removing access to all prior versions.
I think everyone was waiting for this video
Makes me happy to not be an early adopter of new platforms like this.
Yep, got a 5800X last week for $200 at microcenter. It's nice to know the platform is 2.5 years old and reliable.
Funny that my R5 1600X never died when clocked to 4.2GHz, and it could run 4.3GHz on 1.55V Vcore, but I stuck to 1.5V Vcore at 4.2GHz. Got an R5 5600X at release, could not wait for a BIOS update for my old board, so I got a B550 Aorus Pro V1 and clocked my R5 5600X to 4.85GHz all-core on 1.41V Vcore with 0 issues.
Some people are poor and think they can use low-end coolers. I know some people in my area who have had issues with Ryzen 3000 and 1000 chips, and it was because they used the stock cooler
@@Kage0No0Tenshi you're not "poor" if you're buying a 300+ usd cpu on top of a ddr5 kit and motherboard, shit you can't control happens and your stuff dies (in this case dies in warranty)
Just bought a 7900x, hope its only the 3d models having these issues. Videos started showing up the day after I bought the cpu.
Gigabyte had an update called F8h that's not there any more, why?
Plz. can anyone confirm whether this is happening only with 3D cache cpus or entire ryzen 7000 lineup ? I am planning to build system based on 7600x/7700x, thanks in advance.
There are cases of 7800x and up dying as mentioned in the video.
Only if you overclock your cpu... and with new bios... there are zero chances unless you want it on purpose
Derbauer had a non x3d chip exhibit this behavior. It was a 7900x i believe. He posted a video touching on it.
Probably only if you use EXPO or XMP
@@florin604 That's not yet known!
The same thing with high voltage (burned pads only) sometimes happens on modern Intel CPUs. I've spotted it on a few chips that ran on the motherboards' auto voltage rules. All the CPUs are alive, but they have some pads more or less "burned".
Hi🧓
Asus just updated bioses.. Again!
"SoC voltage for Ryzen 7000X3D series limited to a maximum of 1.30V to protect the CPU and motherboard." xD
Not again. It's up since yesterday. It's not even final, just beta. Every board vendor did this.
I dont get it, and I think you need to wait. I assume the chips have a manufacturer defect and it leads to a runaway in the silicon - internally.
There is focus on vddsoc.
I think they made the infinity fabric vddg tied to vddsoc, causing IO gates to blowup and a short to ground.
Guessing someone forgot a level shifter. 😅
Or maybe they have this in bypass mode when it should be regulated by a LDO.
VDDG voltages are generated from VDD_MISC AFAIK.
@@ActuallyHardcoreOverclocking good point.
when I saw the pictures briefly and not looking closely, I kinda thought I'd be hearing they fucked something up with the socket implementation, not enough vcore landings contact area or something like that
IIRC there are a chunk of strange Ryzen 3600s dying. Weird, for many years I tended to think that CPUs, in normal usage, are parts that very rarely fail.
In my experience they basically never fail. I did have 1 issue and that was a Gigabyte motherboard dying after 6 years of light to normal usage. No idea what its issue was tho.
@@tilburg8683 my msi mb died just after 1Y. No overclocking. 'Once in a lifetime' experience, I hope
Waiting for GN to deep dive. Until then, zoid ❤
What is the pin mapping in that region?
My first instinct was that some of the pins on the mobo were misaligned (and shorting Vdd to ground) due to the extreme density and the fact that it's AMD's first LGA socket. I then realized that idea is stupid, AMD isn't manufacturing the boards and it's happening across many board manufacturers
Not sure what the first amd lga socket was, but socket F from 2006 was LGA. All the threadripper and epyc chips are also LGA
Lotes makes the AM5 socket, but I doubt it's the socket's fault.
AMD was the oddball still using 80's socket technology on modern CPUs. LGA works just fine.
When there's a pin misalignment issue, you SEE it :p thing pops immediately or burns in spectacular manner. If anything, the LGA socket allows the CPU to run for some time with that bulged surface.
I'll update my bios to be safe i guess, fingers crossed not affected as not pushed the RAM.
I see the title:
Thoughts - Please be at least an hour
Buildzoid has spoken......This is the way.
Awesome videos. I stabilized 7600c32 2x16gb m-die on my z790dark after watching the 7200c34 m-die video.
The informational progress on this issue cause of this video is not 0.
Now a couple people have said that the temperature on the PCB needs to be in the neighborhood of +150C to cause this damage.
my guess would be a short circuit in the substrate (manufacturing defect) because it's on randomly positioned pins on the substrate but always on vcore pins
I upgraded my 1070 TI to a 6950XT a month ago because Jensen has lost his mind. I was going to go to an AM5 platform but was way too cheap, so I just picked up a 5800X3D and a 32 gig kit of cheap DDR4 to replace my 16 gig kit and 3600x. Apparently besides saving about 700 CAD, I also saved myself a massive headache.
LOL. I just upgraded a month ago from a Ryzen 1700 and I did the same, new B550 board, 5700X and 32GB 3600.
I'm very happy with it since I don't game much. Looking for an upgrade for my GTX 1660Ti, but as you said Jensen is out to lunch, so I'll probably go AMD.
@@omdtdz I went with a 5800X3D because I wanted to keep my B450 board and figured the power constraints should be fine. RDNA2 isn't that bad. The drivers leave a bit to be desired. There's a hardware acceleration bug that's annoying. Before disabling MPO it'd be random black flashes. Once it's disabled they turn to random white flashes. I'd prefer an Nvidia card, but I can't beat the price to performance right now and I'm not giving in to Jensen.
@@simplyscholar21 I went B550 because I had an X370 which is pretty old. Thanks for the heads up about the bug. All in with board, CPU and RAM (board and CPU were on sale) was just around 600 Canadian.
AyyyMD
@@omdtdz yeah I got a decent 4000 MHz CL20 kit of DJR for 100. I figure I can downclock it and tighten the timings up, and the CPU itself was 400 on sale. Meanwhile to get a B650 board with even remotely decent features is like 400 on its own.
The SOC talk is related to an ASUS statement related to them reconsidering auto rules for EXPO and Vsoc in response to these failures
I'm aware of the ASUS statement however I still don't understand how the SOC voltage would cause what looks like a Vcore short.
@@ActuallyHardcoreOverclocking We could be looking at multiple issues here.....i've seen non X3D chips burn in the area of the IO chiplet, the Asrock example actually burned in an area responsible for various IP blocks powered from VDDCR_SOC.
@@ActuallyHardcoreOverclocking some of the dead cpus have bubbled substrate under the IO die.
I saw you posting on twitter about it earlier.
Poor quality PCB can also be the reason for this issue. I say this because I solder a lot and in my experience not all PCBs are created equal, some are crappier than others.
One good example is Linksys routers, I have tried replacing the SPI flash on many routers and the pads lift with the tiniest amount of heat.
Nice how long have you been soldering?
Typical Buildzoid video; Long, with static images from the start to finish, so you can treat it as single source podcast :)
Just watched your video on CPUs failing across different motherboards, and it was super interesting! Loved how you explored potential causes like manufacturing defects, AGESA glitches, and VRM issues. Your in-depth breakdown of voltage rails was on point, and I'm looking forward to seeing if your speculations turn out to be right. Keep up the great work! 👍