We bought 1347 Used Data Center SSDs to See SSD Endurance

  • Published Sep 19, 2024

Comments • 402

  • @79back2basic
    @79back2basic หลายเดือนก่อน +418

    why didn't you buy 1337 drives? missed opportunity...

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +141

      We bought many more than that, but you are right, it was a missed opportunity not to prune 10 more from the data set.

    • @DrRussell
      @DrRussell หลายเดือนก่อน +13

      Clearly I don’t understand the reference, may I ask the significance of 1337, please?

    • @peakz8548
      @peakz8548 หลายเดือนก่อน +48

      @@DrRussell en.wikipedia.org/wiki/Leet

    • @ralanham76
      @ralanham76 หลายเดือนก่อน

      @@DrRussell type 1337 on a calculator and look at it: with 1=L, 3=E, 7=T it spells LEET

    • @gamingballsgaming
      @gamingballsgaming หลายเดือนก่อน

      i was thinking the same thing

  • @cmdr_stretchedguy
    @cmdr_stretchedguy หลายเดือนก่อน +186

    In my 20+ years in IT and server administration, I've always told people to get twice the storage they think they need. For servers, especially ones using SSDs, if they think they need 4TB, always get 8TB. Partially because they suddenly need to create a large file share, but also because for the same workload a larger SSD sees a lower effective DWPD and typically lasts longer. I dealt with one company that had a 5-drive RAID5 of 250GB SSDs but kept the array over 95% full at all times, so they kept losing drives. Once we replaced and reseeded with 5x 1TB and expanded the storage volume, they didn't have an issue for over 3 years after that.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +62

      What is interesting is that this basically shows that doubling the capacity also helps with the write endurance challenge. So the question is: do you get a higher-endurance drive, or just a larger-capacity drive with similar endurance?
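
      [Editor's note: a minimal sketch of the arithmetic behind this point. The capacities, DWPD rating, and 2 TB/day workload below are illustrative assumptions, not figures from the video.]

          # TBW = capacity * DWPD * 365 * warranty_years. Doubling capacity at the
          # same DWPD doubles the write budget, and it halves the DWPD a fixed
          # daily workload actually needs.
          def tbw(capacity_tb, dwpd, warranty_years=5):
              """Rated terabytes written over the warranty period."""
              return capacity_tb * dwpd * 365 * warranty_years

          daily_writes_tb = 2.0  # assumed workload: 2 TB of writes per day

          for capacity_tb, dwpd in [(3.84, 1.0), (7.68, 1.0)]:
              print(f"{capacity_tb} TB @ {dwpd} DWPD: rated {tbw(capacity_tb, dwpd):.0f} TBW; "
                    f"this workload needs only {daily_writes_tb / capacity_tb:.2f} DWPD")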

    • @CoreyPL
      @CoreyPL หลายเดือนก่อน +37

      @@ServeTheHomeVideo It's like with normal disks: if you run it at 95% full all the time, where most data is cold, the wear-leveling algorithm can't function properly and new writes quickly wear out the 5-10% of cells that keep changing. If you up the capacity, then wear leveling can do its job properly.

    • @thehotshot0167
      @thehotshot0167 หลายเดือนก่อน +4

      That is a very helpful tip, I'll keep it in mind for future builds.

    • @userbosco
      @userbosco หลายเดือนก่อน +5

      Exactly. Learned this strategy the hard way years ago....

    • @Meowbay
      @Meowbay หลายเดือนก่อน +2

      @@ServeTheHomeVideo Or, instead of mirroring two SSDs in RAID 1, use them as single drives and just use the second SSD to expand capacity. Which is fine, as long as you're not rewriting that single SSD too often.

  • @CoreyPL
    @CoreyPL หลายเดือนก่อน +103

    One of the servers I deployed 7-8 years ago hosted an MSSQL database (around 300GB) on a 2TB volume consisting of Intel 400GB SSDs (can't remember the model). The database was for an ERP system used by around 80-100 employees. After 6 years of work, when the server and drives were retired, they still had 99% of their life left. They were moved to a non-critical server and are working to this day without a hitch.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +18

      That is pretty good though! Usually a service life is defined as 5 years

    • @CoreyPL
      @CoreyPL หลายเดือนก่อน +12

      @@ServeTheHomeVideo Yeah, I was pushing on management to spend some $$$ on a new server and move current one to non-critical role as well. It's hard to convince non-tech people that even server grade equipment isn't meant to work forever.

    • @MW-cs8zd
      @MW-cs8zd หลายเดือนก่อน +2

      Love the used Intel DC SSDs. Expensive on eBay now though.

    • @MichaelCzajka
      @MichaelCzajka หลายเดือนก่อน +1

      @@ServeTheHomeVideo 5 years is for mechanical drives.
      SSDs seem to last 10 years or more.
      In most cases... with light use you'd expect the drive to continue to be used until it becomes obsolete.
      Even with heavy use it's likely to last a looong time.
      The question for SSDs has always been... "How long will they last?"
      🙂

    • @scalty2008
      @scalty2008 หลายเดือนก่อน +2

      10 years for an HDD is good too. We have 500+ HDDs here in the datacentre; the oldest ones are 4TB drives running since 2013 as backup to disk storage and now seeing out their last days as Exchange storage. Even the first helium 8TB drives have run fine since 2017 (after a firmware update solved a failure bug). Disk failures across all 500+ are fewer than 5 per year.

  • @edwarddejong8025
    @edwarddejong8025 หลายเดือนก่อน +28

    We only use Intel (now Solidigm) drives in all of our server racks. They have performed wonderfully. They have a supercapacitor so they can write out the data if there is a power failure, an essential feature for data center use. We haven't, however, upgraded our NAS units to SSDs because we write a huge amount every day and SSDs would have burned out in 3 years; our mechanical drives have lasted 9 years with only 3 out of 50 failing.

  • @sadnesskant7604
    @sadnesskant7604 หลายเดือนก่อน +134

    So, this is why SSDs on eBay got so expensive lately... Thanks a lot, Patrick 😢

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +43

      Ha! When NAND prices go up ebay prices do too. We have been buying the drives in here for almost a decade.

    • @quademasters249
      @quademasters249 หลายเดือนก่อน +14

      I noticed that too. I bought 7.6 TB for $350. Now I can't find it for less than $500.

    • @Knaeckebrotsaege
      @Knaeckebrotsaege หลายเดือนก่อน +13

      There has been price fixing going on in terms of NAND chips, and Toshiba/KIOXIA already got bonked for it. Check price history for consumer SSDs up till november/december 2023, and then up to today and watch the line go up and up and up for no reason whatsoever... basic 2TB TLC NVMe SSDs were down to 65eur, now the very same models are 115+eur. Heck 1TB TLC NVMe SSDs were at the point of being so cheap (35eur!) that you just threw them at everything, whether it needed one or not. Now with the price ballooned to 60+eur, not anymore. And yes, consumer SSDs aren't the target for viewers of this channel, but the prices for consumer junk exploding inevitably also has an effect on used enterprise stuff

    • @thelaughingmanofficial
      @thelaughingmanofficial หลายเดือนก่อน +1

      Welcome to the concept of Supply and Demand.

    • @WeiserMaster3
      @WeiserMaster3 หลายเดือนก่อน +17

      @@thelaughingmanofficial illegal price fixing*

  • @MrBillrookard
    @MrBillrookard หลายเดือนก่อน +18

    I've got an SSD that I put in my webserver wayyyyy back in 2013. Crucial M4 64GB SSD. I was a bit iffy about it as that was when SSD tech was pretty new, but I picked a good brand so I just YOLO'd it. Currently still in service: 110,000 power-on hours, 128 cycle count, 0 uncorrectable, 0 bad blocks, 0 pending sectors, and one error logged when it powered off during a write (lost power, whoops).
    Still, 12 years of service without a hiccup, and according to the wear leveling it's gone through 4% of its life. At that rate I expect it to last... another 275 years? Cool. I guess my SSD will still be functional when we develop warp drive, if Star Trek shows where we're headed. Lol.
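
    [Editor's note: a quick check of that extrapolation. It assumes the SMART wear indicator grows linearly with future use, which is a simplification, but it lands in the same ballpark as the ~275-year guess above.]

        years_in_service = 12
        wear_used = 0.04   # wear-leveling indicator: 4% of rated life consumed

        total_years = years_in_service / wear_used   # ~300 years at this write rate
        remaining = total_years - years_in_service   # ~288 more years
        print(f"~{total_years:.0f} years total, ~{remaining:.0f} more at the same rate")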

  • @seccentral
    @seccentral หลายเดือนก่อน +19

    Recently I saw a video by Level1Techs saying pretty much the same thing: he hammered a drive rated for hundreds of TBW with over a petabyte and it still ran. Also the same idea that companies very, very rarely need anything bigger than 1 DWPD on a modern drive. Thanks for confirming this. And for new ones it matters: Kioxia 6.4 TB 3 DWPD drives go for 1600, similar 7.6 TB 1 DWPD drives are 1000, and when you're building clusters that adds up fast.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +8

      Yes. And with big drives you should not need 1DWPD

  • @udirt
    @udirt หลายเดือนก่อน +5

    My favorites were the Hitachi HGST drives, not the STEC ones but their own. Every number in the datasheet understated their real performance. Pure quality.

  • @MikeKirkReloaded
    @MikeKirkReloaded หลายเดือนก่อน +19

    It makes all those 1.92/3.84TB used U.2's on Ebay look like an even better deal for homelab use.

    • @balex96
      @balex96 หลายเดือนก่อน +2

      Definitely. Yesterday I bought 6 Toshiba 1.92 TB SSDs for 85 British pounds each.

    • @originalbadboy32
      @originalbadboy32 หลายเดือนก่อน

      @@balex96 you can buy brand new 2TB SSDs for about £90... so why risk used?

    • @Beany2007FTW
      @Beany2007FTW หลายเดือนก่อน +4

      @@originalbadboy32 Because homelab use tends to be a lot more write-intensive than a regular desktop PC by its nature, so getting higher-endurance drives makes a difference.
      Also, if you're working with ex-enterprise hardware (as many homelab users are), you're talking U.2 2.5" hot-swap capable drives for arrays, not M.2 keying for mobo slots or add-in cards.
      You can't get those for £90 new.
      Different use cases require different solutions, simple as that.

    • @originalbadboy32
      @originalbadboy32 หลายเดือนก่อน

      @@Beany2007FTW To a point I agree, but most homelab users are probably not going to be pushing writes all that much.
      Media creation, sure; outside of that, they're probably not pushing writes so much that they need enterprise-level hardware.

    • @Beany2007FTW
      @Beany2007FTW หลายเดือนก่อน +4

      @@originalbadboy32 Might want the battery backed write protection for power outages, though.
      There's more to enterprise drives than just write endurance.

  • @concinnus
    @concinnus หลายเดือนก่อน +14

    In the consumer space, most of the reliability issues have not been hardware-based but firmware, like Samsung's. As for rebuild time and RAID levels, the other issue with hard drives is that mechanical failures tend to happen around the same time for drives from the same manufacturing batch. We used to mix and match drives (still same model/firmware) in re-deployed servers to mitigate this. Probably less of an issue for SSDs.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +8

      You are right that there are other factors. We lost an entire Dell C6100 chassis worth of Kingston DC SSDs because of a power inrush event. At the time Intel had the protection feature and Kingston did not. Now most do.

  • @purrloftruth
    @purrloftruth หลายเดือนก่อน +16

    Not that I know anything about anything, but I think there should be some sort of opt-in, industry-wide database where interested server/DC owners can run a daemon on their servers that submits the SMART stats of all their drives daily, so that people across the industry can see statistics on how certain models perform, potentially get early warning of models with abnormally high failure rates, etc. (a sketch of what such an agent could collect follows this thread).

    • @ThylineTheGay
      @ThylineTheGay หลายเดือนก่อน +2

      like a distributed backblaze drive report

    • @purrloftruth
      @purrloftruth หลายเดือนก่อน +3

      @@ThylineTheGay yeah, but updating in 'real time' (daily or so). whereas they put one out once a year iirc

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +4

      The server vendors can do this at the BMC level and then use the data for predictive failure service
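
      [Editor's note: a minimal sketch of what such a reporting agent could collect, using smartmontools' JSON output. The exact fields vary by smartctl version and by drive type (the keys below are for NVMe devices), and it normally needs root, so treat this as an assumption-laden starting point rather than a finished collector.]

          import json
          import subprocess

          def read_drive_health(device):
              # smartctl -j emits JSON (smartmontools 7.x); read fields defensively
              # because SATA and NVMe devices report different structures.
              out = subprocess.run(["smartctl", "-j", "-a", device],
                                   capture_output=True, text=True)
              data = json.loads(out.stdout)
              nvme = data.get("nvme_smart_health_information_log", {})
              return {
                  "device": device,
                  "model": data.get("model_name"),
                  "power_on_hours": data.get("power_on_time", {}).get("hours"),
                  "percentage_used": nvme.get("percentage_used"),        # NVMe wear indicator
                  "data_units_written": nvme.get("data_units_written"),  # units of 512,000 bytes
              }

          if __name__ == "__main__":
              # A real agent would enumerate devices and submit this to the shared database daily.
              print(read_drive_health("/dev/nvme0"))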

  • @ewenchan1239
    @ewenchan1239 หลายเดือนก่อน +16

    Three things:
    1) SSD usage and by extension, endurance, REALLY depends on what it is that you do.
    One of the guys I went to college with, who is now a Mechanical Design Lead at SpaceX, runs Monte Carlo simulations on his new workstation, which uses E1.S NVMe SSDs -- a SINGLE batch of runs consumed 2% of the drives' total write endurance.
    (When you are using SSDs as scratch disk space for HPC/CFD/FEA/CAE applications, especially FEA applications, it just rains data like no tomorrow. For some of the FEA work that I used to do on vehicle suspension systems and body-on-frame pickup trucks, a single run can easily cycle through about 10 TB of scratch disk data.)
    So, if customers are using the SSDs because they're fast, and they're using it for storage of large, sequential (read: video) files, then I would 100% agree with you.
    But if they are using it for its blazing fast random read/write capabilities (rather than sequential transfers), then the resulting durability and reliability is very different.
    2) I've killed 2 NVMe SSDs (ironic that you mentioned the Intel 750 Series NVMe SSD, because that was the one that I killed. Twice.) and 5 SATA 6 Gbps SSDs (all Intel drives) over the past 8 years because I use the SSDs as swap space for Windows clients (which is also the default, when you install Windows), for systems that had, at minimum, 64 GB of RAM, and a max of 128 GB of RAM.
    The Intel 750 Series 400 GB AIC NVMe SSDs, died, with an average of 2.29 GB writes/day, and yet, because it was used as a swap drive, it still died within the warranty period (in 4 years out of the 5 year warranty).
    On top of that, the manner in which it died was also really interesting, because you would think that when you burn up the write endurance of the NAND flash cells/modules/chips you'd still be able to read the data, but that wasn't true either. In fact, it was reads that indicated the drive had a problem/died -- because it didn't hit the write endurance limits (whether measured as STR, DWPD, or TBW).
    The workload makes a HUGE difference.
    3) It is quite a pity that a 15.36 TB Intel/Solidigm D5-P5316 U.2 NVMe costs a minimum of $1295 USD whereas a WD HC550 16 TB SATA 6 Gbps HDD can be had for as little as $129.99 USD (so almost 1/10th the cost, for a similar capacity).
    Of course, the speed and latency are night and day and aren't comparable at all, but from a cost perspective, I can buy 10 WD HC550 16 TB SATA HDDs for the cost of one Intel D5-P5316 15.36 TB U.2 NVMe SSD.
    So, it'll be a while before I will be able to replace my homelab server with these SSDs, possibly never.

    • @RussellWaldrop
      @RussellWaldrop หลายเดือนก่อน +1

      For someone who needs that crazy quick random R/W, wouldn't it be cheaper to just build a server with a ton of RAM and create some form of ramdisk? And more durable.

    • @Henrik_Holst
      @Henrik_Holst หลายเดือนก่อน +1

      @@RussellWaldrop building a commodity server that takes TBs of RAM is no easy feat. Even on EPYC you max out at 6TB of RAM per system, that RAM alone is easily $90K, and you are only 1/3 of the way to replacing that one 16TB drive the OP talked about.

    • @ewenchan1239
      @ewenchan1239 หลายเดือนก่อน

      @@RussellWaldrop
      "Shouldn't someone who needs that crazy quick random R/W, wouldn't it be cheaper to just build a server with a ton of ram and create some form of a ramdisk? And more durable."
      Depends on the platform, RAM generation, and fault tolerance for data loss in the event of a power outage.
      Intel has their Xeon series, which could, at least for two generations, take DC Persistent Memory (which Patrick and the team at ServeTheHome have covered in multiple previous videos).
      So, to that end, it helps to lower the $/GB overall, but historically speaking, if you wanted say like 1 TB of DDR4-3200 ECC Reg. RAM, it was still quite expensive, on a $/GB basis. (I couldn't find the historical prices on that type of memory now, but suffice it to say that I remember looking into it ca. 2018 when I had my 4-node, dual Xeon E5-2690 (v1) compute cluster, where each node had 128 GB of DDR3-1866 ECC Reg. RAM running at DDR3-1600 speeds, for a total of 512 GB, and if I remember correctly, 1 TB of RAM would have been something on the order of like $11,000 (if one stick of 64 GB DDR4 was $717, per this post that I was able to find, about the historical prices (Source: hardforum.com/threads/go-home-memory-prices-youre-drunk.1938365/)).
      So you figure that's ON TOP of the price of the motherboard, chassis, power supply, NIC(s), CPUs, HSFs (if you're building your own server vs. buying a pre-built server), and the cost of those components varies significantly depending on what you are looking for.
      (i.e. the top-of-the-line Cascade Lake 28-core CPU that supports DC PMEM had an original list price of almost $18,000 a pop (Source: en.wikipedia.org/wiki/Cascade_Lake#Xeon_W-2200_series) for the 'L' SKUs which support more RAM. So you get two of those suckers and you're still only at 28 cores each, for a total of 56 cores/112 threads (whereas AMD EPYC had 64 cores by then, IIRC, but didn't support DC PMEM).)
      My point is that the cost for a lot of RAM often became quite cost prohibitive for companies, so they would just go the SSD route, knowing that it's a wear item like brake pads on your car. (And like brake pads on your car, the faster it goes, the faster it wears out.)
      DC PMEM helped lower the $/GB cost SOME, but again, without it being supported on AMD platforms, and given the cost, and often times, the relative LACK of performance from Intel Xeon processors (compared to AMD EPYC processors), there wasn't a mass adoption of the technology, which is probably why Intel ultimately killed the project. (cf. www.tomshardware.com/news/intel-kills-optane-memory-business-for-good).
      I looked into it because like I said, for my HPC/FEA/CFD/CAE workloads, I was knowingly killing NAND flash SSDs VERY quickly. (Use them as a swap/scratch drive, and you'll see just how fast they can wear out without ever even getting remotely close to the DWPD STR write endurance limits.)
      (Compare and contrast that to the fact that I bought my 4-node micro compute cluster for a grand total of like $4000 USD, so there was no way that the capex for the platform that supported DC PMEM was ever going to fly/take off. It was just too expensive.)
      At one point, I was even playing around with using GlusterFS (version 3.7 back then) distributed file system, where I created 110 GiB ram disks, and then strung them all together as a distributed striped GlusterFS volume, to use as a scratch disk, but the problem that I ran into with that was that even with 100 Gbps Infiniband, it wasn't really read/writing the data significantly faster than just using a local SATA SSD because GlusterFS didn't support RDMA on the GlusterFS volume, despite the fact that I exported the gvol over onto the network as a NFS-over-RDMA export.
      That didn't quite go as well as I thought it could've or would've. (And by Gluster version 5, that capability was deprecated and by version 6, it was removed entirely from the GlusterFS source code.)
      (I've tried a whole bunch of stuff that was within my minimal budget, so never anything as exotic as DC PMEM.)
      There were also proposals to get AMD EPYC nodes, using their 8-core variant of their processors (the cheapest you can go), and then fill it with 4 TB of RAM, but again, RAM was expensive back then.
      I vaguely remember pricing out systems, and it was in the $30k-60k neighbourhood (with 4 TB of RAM, IIRC), vs. you can buy even consumer SATA SSDs for like a few hundred bucks a pop (1 TB drives, and you can string four of them together in RAID 0 (be it hardware or SW RAID), and then exported that as the scratch disk (which is what I did with my four Samsung EVO 850 1 TB SSDs, and then exported that to the IB network as a NFSoRDMA export, and the best that I was able to ever get with it was about 32 Gbps write speed, which, for four SATA 6 Gbps SSDs, meant that I actually was able to, at least temporarily, exceed the SATA interface theoretical limit of a combined total of 24 Gbps. Yay RDMA??? (Never was sure about that, but that's what iotop reported).)
      Good enough. Still burned through the write endurance limit at that rate though.
      For a company with an actual, annual IT budget -- replacing SSDs just became a norm for HPC workloads.
      For me though, with my micro HPC server, running in the basement of my home -- that wasn't really a viable option, so I ended up ditching pretty much all SSDs, and just stuck with HDDs.
      Yes, it's significantly slower, but I don't have annualised sunk cost where I'd knowingly have to replace it, as it wears out. $0 is still better than having to spend a few hundred bucks on replacement SSDs annually.
      (cf. www.ebay.com/itm/186412502922?epid=20061497033&itmmeta=01J56P9FCY6HJ5V1QT28FZ09PP&hash=item2b670d0f8a:g:dsMAAOSwke9mKR61&itmprp=enc%3AAQAJAAAA4HoV3kP08IDx%2BKZ9MfhVJKlh58auJaq6WQcmR34S6zfFgi4VcCPwxAwlTOkDwzQNAuaK9bi%2BmrehAA82MAu78x8Fx8iWc7PGv6TP9Vrypic02FAbBfEWd7UjU5W1G0CuYKYjCxdkETpy3xnK2D0iPrkBwNi5R%2BaphL%2B%2Fd8taZo0RG%2Fed%2F4QoqNmDMyMoTvDIBGifnVEngMykFUtrULKQMlUkbQ6ED%2B0iOYLQxEJDrkmSJauzdBzwMHCbNuvCLM0l08ziMQJVvBo1FBT%2FXXToZITQk%2BdUTBYfOv6cdotQ1678%7Ctkp%3ABk9SR8j2pdapZA)
      An open box Solidigm D5-P5316 15.36TB U.2 NVMe SSD out of China is $1168 USD.
      A WD HC550 16 TB HDD is $129.99 USD.
      I would LOVE to be able to replace my entire main Proxmox storage server with U.2 NVMe SSDs.
      But at roughly 10X the cost, there's no need for it. Nothing I do/use now (with my Proxmox storage server) would benefit from the U.2 NVMe SSD interface.
      I think that the last time that I ran the calculation for the inventory check, I am at something like a grand total of 216 TB raw capacity. It'd cost me almost $16k USD to replace all of my HDDs with U.2 NVMe SSDs.
      The base server that I bought, was only $1150 USD.
      The $/GB equation still isn't there yet.
      It'd be one thing if I was serving hundreds or thousands of clients, but I'm not.
      (Additionally, there is currently a proposal that ZFS might actually be making my system work harder than it otherwise needs to, because if I offloaded the RAID work onto my Avago/Broadcom/LSI MegaRAID SAS 12 Gbps 9361-8i, the SAS HW RAID HBA should be able to do a MUCH better job of handling all of the RAID tasks, which would free my CPU from much of the I/O wait that results from using HDDs, since they're slow to respond to I/O requests.)

    • @Nagasaski
      @Nagasaski หลายเดือนก่อน

      What about Intel Optane? Or the Crucial T700? They are almost server-grade SSDs, but for consumers.

    • @ewenchan1239
      @ewenchan1239 หลายเดือนก่อน

      @@Nagasaski
      "What about intel optane?"
      Depends on capacity and platform.
      On my 7th gen NUC, it recognises it, and it can be used as a cache for the 2.5" Toshiba 5400 rpm HDD, but at the end of the day it is limited by the HDD. (It's just too slow.)
      I haven't tried using Optane on my AMD systems, but I am going to surmise that it won't work on an AMD system.
      "Or Crucial T700?"
      I quickly googled this, and the 1 TB version of this drive only has a write endurance limit of 600 TBW over its entire lifetime.
      Again, it depends, a LOT, on HOW you use the drive.
      If you use it as a swap drive, you can kill the drive LONGGG before it will hit the sequential transfer write endurance limit, which is how the TBW metric might be measured (or it might be like 70% sequential/30% random write pattern).
      However, if you have almost a 10% sequential/90% random write pattern like using the drive as a swap drive, you can exhaust the finite number of write/erase/programme cycles of the NAND flash of the SSD without having hit the write endurance limit.
      Again, on my Intel 750 Series 400 GB NVMe SSD AIC I only averaged something like 2.29 GB of writes/day. But I still managed to kill TWO of these drives in a 7-year period. (A little less than 4 years each.) And that's on my Windows workstation, which had its RAM maxed out at 64 GB.
      The usage pattern makes a HUGE difference, and the write endurance limit doesn't take that into consideration, at least not in terms of the number that's advertised in the product specs/advertising/marketing materials.
      (Intel REFUSED to RMA the second 750 Series that I killed because that was the drive that died after the first drive was RMA'd, from the first time that the drive failed, arguing that it was beyond the initial 5 year warranty from the FIRST purchase. So now, I have a dead 750 Series NVMe SSD, that's just e-Waste now. I can't do anything with it.)
      And that's precisely what dead SSDs are -- eWaste.
      And people have called BS about this, and I told them that by default, Windows installs the pagefile.sys hidden file on the same drive where Windows is installed.
      So, if you are swapping a fair bit, it's burning up write/erase/program cycles on your OS drive.

  • @paulbrooks4395
    @paulbrooks4395 หลายเดือนก่อน +12

    The contrary data point is hybrid flash arrays like Nimble, which do read caching by writing a copy of frequently used data to cache. Our Nimble burned through all of its data center write-focused SSDs at once, requiring 8 replacements. The SMART data showed 99% drive write usage.
    We also use Nutanix which uses SSDs for both read and write tiering. Since we host a lot of customer servers and data churn, we see drives getting burned out at an expected rate.
    To your point, most places don't operate like this, instead being WORM operations and using SSDs for fast access times. But it's still very important for people to know their use case well to avoid over or under buying.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +5

      Exactly. It is also interesting that write focused drives often were not used in that manner.

  • @kennethhomza9026
    @kennethhomza9026 หลายเดือนก่อน +7

    The constant background music is a nuisance

    • @youtubiers
      @youtubiers 21 วันที่ผ่านมา

      Yes agree

  • @LtdJorge
    @LtdJorge หลายเดือนก่อน +9

    Sshhhh, Patrick, don’t tell the enterprise customers they’re overbuying endurance. It lets those trickle down at low prices to us homelabbers 😅

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      Fair

    • @LtdJorge
      @LtdJorge หลายเดือนก่อน

      @@ServeTheHomeVideo hehe

  • @ChrisSmith-tc4df
    @ChrisSmith-tc4df หลายเดือนก่อน +19

    I’d still want a DWPD that’s at least some low multiple of my actual workload writes just so that performance doesn’t suffer so much near EOL when ECC would be working hard to maintain that essentially zero error rate.
    That said, a lower endurance enterprise SSD (~1 DWPD) would probably suffice for the majority of practical workloads and save the costly higher endurance ones for truly write intensive use cases.
    Also, the dying-gasp write assurance capability helps prevent array corruption upon unexpected loss of power, so enterprise-class drives still provide that benefit even at lower DWPD ratings. That's something to consider if you're thinking about using non-enterprise SSDs in RAID arrays.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +8

      Totally, but then the question is: do you still want 1 DWPD at 30.72TB? 61.44TB? 122.88TB? Putting it another way, 8x 122.88TB drives will be just shy of 1PB of raw storage. Writing 1PB of 4K random writes per day is not trivial.
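
      [Editor's note: rough arithmetic behind that last sentence, treating 1 PB as 10^15 bytes and ignoring write amplification, which would make it even harder.]

          pb_per_day = 1e15          # 1 PB of writes per day, in bytes
          io_size = 4096             # 4 KiB random writes
          seconds = 24 * 60 * 60

          iops = pb_per_day / io_size / seconds       # ~2.8 million sustained write IOPS
          gb_per_s = pb_per_day / seconds / 1e9       # ~11.6 GB/s, held for 24 hours
          print(f"~{iops / 1e6:.1f}M sustained 4K write IOPS (~{gb_per_s:.1f} GB/s) all day")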

    • @ChrisSmith-tc4df
      @ChrisSmith-tc4df หลายเดือนก่อน +1

      @@ServeTheHomeVideo A decade+ ago back in the SATA/SAS SSD days, I recall the lowest write endurance enterprise drives that I saw aimed at data warehousing were 0.5 DWPD. So given the even lower write utilization on colossal drive arrays that are likely only partially filled, you’re advocating use cases for perhaps even less than 0.5 DWPD down near a prosumer SSD write endurance?

  • @sotosoul
    @sotosoul หลายเดือนก่อน +59

    Lots of people are concerned about SSD reliability not because of the SSDs themselves but because of the fact that SO MANY devices have them soldered!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +19

      That is true. This is just data center drives

    • @SyrFlora
      @SyrFlora หลายเดือนก่อน +5

      SSD write endurance isn't really improving, to be honest... it's going backwards.
      Newer manufacturing makes each cell more reliable, but the industry has shifted to QLC for consumer storage, which is still worse than SSDs of the TLC or MLC era. For most people it's still not a problem unless you are a really, really heavy write user, or in a bad scenario like always running with less than 10% free space, or with too little RAM so the OS and applications swap like crazy. You're unlikely to have a failure because you wore out the cells.
      For mobile devices most people should be fine. But on PCs, soldered storage is pretty nasty, like what 🍏 did. Especially when the boot firmware data also lives on that SSD rather than a dedicated chip: wear it out and the machine is basically bricked, because you can't even boot from other media. 😂😂

    • @Meowbay
      @Meowbay หลายเดือนก่อน +7

      Well, speaking from personal experience as a hosting engineer, that fear also stems from the large number of SSD failures that leave the drive entirely unreadable after the first failure notice, controller error or not. That's not what you want when you're hoping your data could at least partially be restored, as I usually can (and could) with mechanical drives. Many SSDs go from 100% to completely 0% readable. That's frightening, I assure you. Unless you're into resoldering your own electronics at the chip level, know which parts make it fail, and have your own lab and the time for that, of course. But I don't think many among us would...

    • @kintustis
      @kintustis หลายเดือนก่อน +2

      soldered ssd means manufactured ewaste

    • @mk72v2oq
      @mk72v2oq หลายเดือนก่อน +4

      @@Meowbay as a hosting engineer you should know that relying on the assumption that you will be able to restore data from a failed drive (regardless of its type) is dumb, and that having data redundancy and backups is a crucial part of any data center operation.

  • @jeremyroberts2782
    @jeremyroberts2782 หลายเดือนก่อน +3

    Our 6-year-old Dell drives host a VMware vSAN for a mixed range of servers, including databases in the 1-2TB range, and all the drives still have around 85-90% endurance remaining. Our main line-of-business DB has a read/write ratio of 95% reads / 5% writes.
    The life of SSDs is really measured in decades or more (assuming the electronics don't naturally degrade or the capacitors go pop).
    Most heavily used personal PCs will only write about 7GB of data a day (the odd game install aside), so on a 1TB drive it will take around 150 days to do a full drive write; if the stated life is 1000 drive writes / 3 years, it will take around 390 years to reach that write limit.
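
    [Editor's note: the arithmetic from that last paragraph, spelled out. All the input figures are the commenter's assumptions.]

        daily_writes_gb = 7        # assumed writes for a heavily used personal PC
        capacity_gb = 1000         # 1 TB drive
        rated_drive_writes = 1000  # assumed endurance rating: 1000 full drive writes

        days_per_drive_write = capacity_gb / daily_writes_gb            # ~143 days
        years_to_limit = rated_drive_writes * days_per_drive_write / 365
        print(f"{days_per_drive_write:.0f} days per full drive write, "
              f"~{years_to_limit:.0f} years to reach the rated limit")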

  • @MichaelCzajka
    @MichaelCzajka หลายเดือนก่อน +3

    The takeaway message seems to be that SSDs are ~10x more reliable than mechanical drives:
    Helpful to know that SSDs in servers have almost eliminated HDD failures.
    Helpful to point out that larger SSDs help improve reliability.
    Mechanical HDDs have to be swapped out every ~5 years even if they've had light use.
    That starts to get very expensive and inconvenient.
    SSDs are a much better solution.
    Most users just want a drive that is not going to fail during the life of the computer.
    The lifespan of many computers might be 10 years or more.
    NVMe drives are great because you get speed, a small form factor and a low price all in one package.
    The faster the drive the better in most cases... especially if you like searching your drives for emails or files.
    My key metric remains total data written before failure... although it is useful to know over what time period the data was written.
    I've yet to have an SSD fail.
    Most of my SSDs live on in various upgrades, e.g. laptops.
    That means that old SSDs will continue to be used until they become obsolete.
    It's rare to see meaningful usability data on SSDs. Nicely done.
    🙂

  • @GreenAppelPie
    @GreenAppelPie หลายเดือนก่อน +2

    So far my SSDs/NVMe drives have had zero problems for 7+ years, while my hard drives on the other hand start failing within a few years. I'll never get an SSD again. Great episode BTW, very informative!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      Why never another SSD?

    • @mikemotorbike4283
      @mikemotorbike4283 24 วันที่ผ่านมา

      @@ServeTheHomeVideo I suspect he's being sarcastic

    • @lilietto1
      @lilietto1 21 วันที่ผ่านมา +1

      @@mikemotorbike4283 I suspect he just wrote ssd when he meant hd

  • @marklewus5468
    @marklewus5468 หลายเดือนก่อน +21

    I don’t think you can compare a large SSD with a hard drive. A Solidigm 61TB SSD costs on the order of $120 per terabyte and a 16-22tb IronWolf Pro hard drive is on the order of $20 per terabyte. Apples and oranges.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +12

      So the counter to this is that they literally cannot make 61.44TB drives fast enough, and big orders are already coming in for 122.88TB next year. There is a per-device cost advantage in favor of HDDs, but SSDs offer higher performance, reliability, and endurance. In the DC, swapping to high-capacity SSDs can save huge amounts of space and power, and power is the big limiter right now.

  • @redslate
    @redslate หลายเดือนก่อน +5

    Controversially, years ago, I estimated that most quality commercial SSDs would simply obsolete themselves in terms of capacity long before reaching their half-life, given even "enthusiast" levels of use. Thus far, this has been the case, even with QLC drives.
    Capacities continue to increase, write endurance continues to improve, and costs continue to decrease. It will be interesting to see what levels of performance and endurance PLC delivers.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +1

      That is what happens with us. Capacity becomes more important

  • @reubenmitchell5269
    @reubenmitchell5269 หลายเดือนก่อน +1

    We've had Intel S3500/S3510 SATA SSDs as the boot drives in RAID1 for all our production Dell R730s for coming up on 8 years - never had an issue with any of them. We had 3x P5800X Optanes fail under warranty, but the 750 PCIe cards are still going strong.

  • @imqqmi
    @imqqmi หลายเดือนก่อน +2

    I remember around 2010, as the IT guy at a company, introducing 2x 60GB SSDs in a RAID 1 config for the main database of their accounting software. Reports and software upgrades that used to run for minutes up to half an hour were done in seconds. The software technician was apprehensive about using SSDs for databases, but after seeing those performance numbers he was convinced. The drives ran for around 4 years before being retired and were still working.
    Capacitors and other support electronics seem to be less reliable than the flash chips themselves lol! I upgraded all my HDDs to SSDs last year and never looked back.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      Yes. Also Optane was expensive, but it often moved DB performance bottlenecks elsewhere

  • @JP-zd8hm
    @JP-zd8hm หลายเดือนก่อน +4

    DWPD is relevant in server specification: write amplification needs to be considered, especially for ZFS or dual-parity arrangements, e.g. vSAN. That said, used enterprise drives are a great shout in my experience; 40% left on a device with a 10PB total write life is still very nice, thank you!

  • @djayjp
    @djayjp 17 วันที่ผ่านมา +3

    Keep in mind the survivorship bias in effect here: you typically won't be sold already dead drives....

  • @BloodyIron
    @BloodyIron หลายเดือนก่อน +2

    Welp that just validated what I've been thinking for the last like 10 years lol. Thanks!

  • @udirt
    @udirt หลายเดือนก่อน +3

    You'll see a lot more wear if you focus on drives in HCI setups, due to silly rebalancing etc.
    You also need to factor in the overprovisioning if you look at failure rates. People factored this in and gained reliability.

  • @jaimeduncan6167
    @jaimeduncan6167 หลายเดือนก่อน +2

    Great overview. We need to get people to understand the MTTR metric; even IT professionals (software) sometimes don't get how important it is. In fact, a 20TB HDD is a liability even for RAID 6-equivalent technologies (which tolerate 2 drive failures). In particular, if all your drives were bought at the same time from the same vendor, they are likely to come from the same batch. Clearly the difference in price per byte between a 20TB HDD and a 16TB U.2 SSD is vast, but you can buy something more sophisticated and not worry as much about MTTR.

  • @RWBHere
    @RWBHere หลายเดือนก่อน +1

    Thanks for the heads-up. Now to go and find a few used server SSD's which haven't been edited with a hammer...

  • @cyklondx
    @cyklondx หลายเดือนก่อน +2

    The endurance is there so the disks last and we don't have to replace them in 2-4 years; they can sit there until we decommission the whole box... that's the idea of having a lot of endurance.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +1

      DWPD endurance ratings on DC drives are for 5 years, so 2-4 should not be an issue.

  • @paulstubbs7678
    @paulstubbs7678 หลายเดือนก่อน +4

    My main concern with SSDs comes from earlier endurance tests where a failed drive would become read-only, then totally bricked if you power cycled it. This means if a drive dies, as in goes read-only, you basically cannot clone it to a new one, as that will most likely involve a power cycle/reset - the OS has probably crashed, being unable to update something.

    • @kevinzhu5591
      @kevinzhu5591 28 วันที่ผ่านมา +1

      In that case, you use another computer to retrieve the information by not using the drive as a boot drive.

  • @iiisaac1312
    @iiisaac1312 หลายเดือนก่อน +3

    I'm showing this video to my SanDisk Ultra Fit USB Flash Drive to shame it for being stuck in read only mode.

  • @tomreingold4024
    @tomreingold4024 หลายเดือนก่อน +6

    Fantastic. Very informative. I used to run data centers. I've switched drastically and am now becoming a school teacher. I don't know where I will use the information you just provided, but I enjoyed learning it.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +1

      Glad it was helpful! Maybe it becomes a lesson one day. Thanks for being a teacher!

    • @tomreingold4024
      @tomreingold4024 หลายเดือนก่อน

      @@ServeTheHomeVideo hey you never know. As I prepare to be a special ed teacher for math, English, social studies and science, maybe I'll end up teaching IT.

  • @bacphan7582
    @bacphan7582 หลายเดือนก่อน +1

    I just bought an old 1TB server SSD. It's a Toshiba one that has had over 1PB written, but it's MLC (2 bits per cell), so I put a lot of trust in it.

  • @Mayaaahhhh
    @Mayaaahhhh หลายเดือนก่อน +6

    This is something I've been curious about for a while, glad to see it tested!
    Also so many bots ;_;

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +4

      Yea so many! We have been collecting the data for a long time, we just have not shared it since the 2016 article.

    • @mika2666
      @mika2666 หลายเดือนก่อน

      Bots?

  • @tad2021
    @tad2021 หลายเดือนก่อน +2

    I think outside of the early gens of non-SLC SSDs, I haven't had any wear out. But far more of those drives died from controller failure, as was the style of the time. 100% failure rate on some brands.
    I recently bought around 50 10-12 year old Intel SSDs. Discounting the one that was DOA, the worst drive was down to 93%, the next worst was at 97%, and the rest were at 98-99%. A bunch of them still had data (the seller should not have done that...) and I could tell that many of them had been in use until about a year ago.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      Yea we found many with data still accessible. In 2016 when we did the 2013-2016 population a lot more were accessible and unencrypted

  • @tbas8741
    @tbas8741 หลายเดือนก่อน +1

    My old system (built in 2014, retired in 2024):
    The HDD stats in that heavily used system are:
    - Western Digital hybrid SSHD, 7200rpm (32MB SSD cache on the SATA interface)
    - Power-on hours: 92,000
    But I kept the computer running on average 24/7/365 for over 10 years.

  • @CampRusso
    @CampRusso หลายเดือนก่อน +2

    😮🤔 Great video! I've seen a few videos of all-SSD NAS builds and thought, well, that is bold. Though now, watching this, I'm thinking I want to try it too! I happen to have a collection of enterprise SSDs from decommissioned servers at work. The SMART numbers on these are probably crazy low. This also sounds very appealing from a power/heat perspective. I'm always trying to make the homelab more efficient.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +1

      We are down to two hard drives in our hosting clusters, and hoping to shift away from those in the next refresh

    • @CampRusso
      @CampRusso หลายเดือนก่อน

      @@ServeTheHomeVideo 😯 That's right, you did mention 2 HDDs in the vid. That's awesome. Yeah, it's time! 😁
      The mobo for my TrueNAS SCALE box has six SATA ports. I have some Intel D3-S4600 and Samsung PM863a drives to test with.

  • @axescar
    @axescar หลายเดือนก่อน +1

    Thank you for sharing your experience. What would be interesting is some heavy-load MSSQL/Oracle SSD database storage.

  • @honkhonkler7732
    @honkhonkler7732 16 วันที่ผ่านมา +2

    I've had great reliability from SSDs, I just can't afford the ones that match hard drives for capacity. At work though, we just bought a new VxRail setup that's loaded out with SSDs, and the performance improvement from the extra storage speed is more noticeable than the extra CPU resources and memory.

  • @scsirob
    @scsirob หลายเดือนก่อน +1

    Kinda confirms my statement from a couple of years ago. In the end we'll have just two types of storage.
    1. SSD
    2. Tape

    • @UmVtCg
      @UmVtCg 6 วันที่ผ่านมา

      LOL, you are mistaken; media for storing data will evolve. In the future, huge amounts of data will be stored in glass (Project Silica).

  • @TonyMasters-u2w
    @TonyMasters-u2w หลายเดือนก่อน +3

    The sad truth is:
    back in 2016 SSDs were SLC (or MLC at worst) and they were very reliable,
    but today they are all TLC and more often QLC.
    It's not correct to say people are not utilizing them, because people buy capacity to their needs and the volume and size of data has skyrocketed.
    In fact, storing data and not changing it is even worse, because stale data takes, for example, 80% of your disk (a very typical scenario) and now you have only 20% to play with, meaning more frequent writes (even though the overall amount seems small) and heavy usage of the active 20%.
    So I don't agree with your points; we can't project previous reliability stats onto modern SSDs.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      SLC was already just in more niche drives by 2011-2012

  • @basspig
    @basspig 22 วันที่ผ่านมา +4

    I employed probably 20 conventional hard drives up until 2015 in various editing workstations, and I would see a drive failure about every 17 to 18 months. After 2015 I converted all the machines over to solid state drives, and to this day we have not seen one fail.

  • @patrickdk77
    @patrickdk77 หลายเดือนก่อน +2

    I have several Intel 311s (20GB) I should upgrade (purchased 2010), serving as ZFS SLOG devices: power-on hours = 93,683, DWPD = 1.53. But everything has been optimized to not write unless needed, and moving everything to containers helped with this even more.

  • @BethAlpaca
    @BethAlpaca หลายเดือนก่อน +2

    I will not stop overbuying them. I've got so much space my files are like a paper clip in a hallway. Games are maybe 10%, but they don't move often.

  • @noco-pf3vj
    @noco-pf3vj 18 วันที่ผ่านมา

    I have a 120 GB Kingston HyperX 3K SSD using a SandForce controller. I use it on my 2012 Asus AMD Turion laptop for work, around 10 hours a day. It's still going strong; when I checked in CrystalDiskInfo, the SSD's health was 93%.
    That's a SandForce SSD, so it's amazing it is still alive and well.

  • @pkt1213
    @pkt1213 หลายเดือนก่อน +3

    My home server gets almost 0 drive writes per day. It gets read a lot, but every once in a while photos or movies are added.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +1

      Great example. Photos and movies are sequential workloads as well

  • @nadtz
    @nadtz หลายเดือนก่อน +2

    For my use at home I grabbed some used P4510s; they were all at 99% life left and have been chugging along for a couple of years now. Starting to think about upgrading to some Gen 4 drives, so I've been hunting eBay, but I think I'll wait for prices to drop again since they've gone up recently.
    Your 2016 study, and a lot of people reporting on drives they bought in forums, made me worry a lot less about buying used. There's always the possibility of getting a dud, but I've had good luck so far.

  • @chrisnelson414
    @chrisnelson414 หลายเดือนก่อน +2

    The home NAS community (especially my spouse, the media hoarder) is waiting for the larger-capacity SSDs to drop in price so they can replace their spinny disks.

  • @SkaBob
    @SkaBob หลายเดือนก่อน +1

    The drive wear issue sounds similar to the EV battery wear problem, which is going away now as well. Early EVs like the Leaf only had a 50-70 mile range, so 12,000 miles a year would need about 200 battery cycles, while a newer car with a 320-mile range would only need about 37 cycles for the same miles. I do have a few old SSDs, probably 6-8 years old, that never failed and only got replaced to gain capacity, but they weren't used in a server capacity. The only SSD I remember failing was an old SanDisk ReadyCache SSD from around 2012. It was a small 32GB SSD made to supplement your HDD by caching your most-used files, so it likely had a high write/read/rewrite load and ran near 100% capacity all the time.

  • @henderstech
    @henderstech หลายเดือนก่อน +2

    Wow, I wish I had just one of those large SSDs! I had no idea they made them with so much capacity.

  • @markkoops2611
    @markkoops2611 หลายเดือนก่อน +2

    Run spinrite 6.1 on the drive and watch it revive the disk

  • @Zarathustra-H-
    @Zarathustra-H- หลายเดือนก่อน +3

    You don't think that maybe your data set might be skewed because sellers don't sell drives where they have already consumed all, or close to all, of the drive's write cycles? Because of this, I just don't think your sample is truly random or representative.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +2

      That would have been a bigger concern if we were buying like 5+ year old drives. Normally we are buying 2-ish year old models and so it is much less likely they can get written through at that pace. This is especially true since we are seeing sub 10% duty cycle on the vast majority of drives. Also, remember a good portion of these are not even wiped, as we showed, so if people are not wiping them they are unlikely to be looking at SMART wear data.

    • @Zarathustra-H-
      @Zarathustra-H- หลายเดือนก่อน

      @@ServeTheHomeVideo The fact that they are not wiping them is pretty shocking actually.

  • @lukasbruderlin2723
    @lukasbruderlin2723 หลายเดือนก่อน +1

    It would have been nice if you had given some examples of SSDs that have lower endurance ratings and are therefore less expensive but still reliable.

  • @ralanham76
    @ralanham76 หลายเดือนก่อน +1

    I still have my first SSD, a Toshiba pulled from a failed Apple device. I think some of these drives might outlive me 🤣

  • @acquacow
    @acquacow หลายเดือนก่อน +2

    I just built a whole new nas on 1.6TB Intel S3500s with 60k hours on them all a few months ago =p I'm all about used flash.

  • @heeerrresjonny
    @heeerrresjonny หลายเดือนก่อน +1

    Maybe this is just because I have only ever purchased consumer SSDs, but I have been using SSDs for over a decade and I have never once seen a drive with a DWPD rating listed (in fact, this video is the **first** time I have ever encountered that metric in all these years lol). Endurance has always been rated using TBW. EDIT: also, now that I've looked into it, it seems manufacturers calculate "DWPD" based on the warranty period... but that doesn't make sense to me. It should use MTBF for the time component. This would make all the DWPD numbers WAY smaller, but more "objective".
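
    [Editor's note: the two ratings are interconvertible, and vendors do derive DWPD from TBW over the warranty period; MTBF would not work as the time component because it is a statistical failure-rate figure, not a wear budget. The drive figures below are illustrative, not from any specific product sheet.]

        def dwpd_from_tbw(tbw, capacity_tb, warranty_years):
            # Full-capacity drive writes allowed per day over the warranty period.
            return tbw / (capacity_tb * 365 * warranty_years)

        # Illustrative consumer-class numbers: 1 TB drive, 600 TBW, 5-year warranty.
        print(round(dwpd_from_tbw(tbw=600, capacity_tb=1, warranty_years=5), 2))  # ~0.33 DWPD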

  • @Memesdopb
    @Memesdopb หลายเดือนก่อน +1

    Bought 8x enterprise SSDs 6 years ago; all of them still have 99% TBW remaining since day 1. Last year I bought 70+ used enterprise SSDs to fill 3x NetApp DS2246 shelves (3x 24-bay storage) and guess what? Most of them had 3~4 years of power-on time and 90%+ TBW remaining. Oh, and these are enterprise drives, so they are spec'd for 7~12 PBW (petabytes written) of total endurance.

  • @computersales
    @computersales หลายเดือนก่อน +2

    I prefer buying used DC drives because they always have a ton of reads and writes but are still reporting over 80% health. I'm not as keen on consumer drives. I don't use it as much as I could, but my 1TB P3 is already down to 94% health after a year. Granted, it has a lot of life left, but a DC drive wouldn't even flinch at 14TB of writes.

  • @virtualinfinity6280
    @virtualinfinity6280 หลายเดือนก่อน +1

    I think this analysis contains a critical flaw. SSDs write data in blocks (typically 512K), and writing an entire block is the actual write load on the drive. So if you create a file a few bytes in size, the drive metrics only get updated by the amount of data you transfer to the drive, while with 512K blocks the actual write load on the drive is significantly higher. In essence, it makes a world of difference whether you do 1 DWPD by writing the drive's capacity in tiny files vs. writing one big file the size of the drive's capacity.
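
    [Editor's note: a toy calculation of the effect described above. The 512 KiB figure is the commenter's assumption; real drives program much smaller NAND pages and coalesce small writes in cache, so actual write amplification sits well below this worst case.]

        nand_unit = 512 * 1024   # bytes per internal write unit (commenter's assumption)
        host_write = 4 * 1024    # one small 4 KiB host write

        worst_case_amplification = nand_unit / host_write   # 128x if nothing is coalesced
        print(f"worst-case write amplification: {worst_case_amplification:.0f}x")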

  • @foldionepapyrus3441
    @foldionepapyrus3441 หลายเดือนก่อน +1

    When you are talking about drives, though, since they are so crucial to your desktop/server actually being functional, which for many is essential to their income stream, it's worth picking a spec that will almost certainly outlast your interest rather than running near the edge and getting burned. Transferring your drives to a new system if/when you upgrade, or replacing a failure, is quick and painless for the most part. Plus, even with fast drives, any serious storage array takes a while to rebuild, so avoiding that is always going to be nice.

  • @whyjay9959
    @whyjay9959 หลายเดือนก่อน +3

    There are Micron ION drives with different ratings for different types of writes; I think that's from when QLC was new.
    Interesting; seeing how much write endurance and sustained performance are emphasized in enterprise, I kinda thought companies were routinely working the drives to death.

  • @DarkfireTS
    @DarkfireTS หลายเดือนก่อน +5

    Would you resell a few after the testing is done…? Homelabber hungry for storage here 🙂

    • @jwdory
      @jwdory หลายเดือนก่อน

      Great video. I am also interested in some additional storage.

  • @dataterminal
    @dataterminal หลายเดือนก่อน +14

    I've given up telling people this. Even back when I had a 64GB SSD as my main boot drive, I treated it like a hard disk because at the time, if it died, I was just going to replace it. It didn't die, and I ended up writing far more data to it than to my hard disks, and by the time I upgraded to a bigger drive I was nowhere near the TBW limit the manufacturer stated. For home users at least, you're not going to wear the NAND out, and that has held since the first SATA SSDs, never mind M.2 NVMe drives.

    • @Lollllllz
      @Lollllllz หลายเดือนก่อน +2

      a nice thing is that you'll get a decent amount of warning to move data off a drive that has reached its endurance limit as they usually dont drop like flies when that limit is reached

    • @kevinzhu5591
      @kevinzhu5591 28 วันที่ผ่านมา

      The NAND may be fine, but the controller could have issues as well whether by firmware bug, thermal design or just random shorts on the board.
      Although controller failure rarely happens.

  • @jeffcraymore
    @jeffcraymore หลายเดือนก่อน +3

    A Western Digital Green survived less than a month with the server acting as a Docker host,
    using Docker for distributed computing and spawning multiple instances every day.
    I'm running Blues now and they haven't failed yet, but there are some OS-level issues that point to data corruption.

  • @raylopez99
    @raylopez99 หลายเดือนก่อน +2

    In a different context, this reminds me of Michael Milken of Drexel Burnham fame: he found that "junk bonds" were unfairly shunned when in fact their default rates were much less than people expected (based on data from a finance professor, which was the original inspiration). Consequently he used junk bonds to his advantage and as leverage to takeover companies (which had a lot of corporate fat, back in the day). How Patrick can profit from his observation in this video is less clear however, but I hope he achieves billionaire status in his own way.

  • @Proton_Decay
    @Proton_Decay หลายเดือนก่อน +3

    With per-TB prices coming down again, it would be great to know how SSDs perform long-term in home NAS applications -- much higher temps 24/365, low writes but lots of reads and regular ZFS scrubs. Do they outlast spinning rust? So much quieter, I hope to transition my home NAS at some point in the coming couple of years.

  • @shutenchan
    @shutenchan หลายเดือนก่อน +2

    I actually bought tons of those Intel S3510/S3520 SSDs from my own workplace (I work in a data center); they're very cheap and have high endurance with decent speed (although slower sequential speed).

  • @artemis1825
    @artemis1825 หลายเดือนก่อน +5

    Would love to see a version for used SAS enterprise HDDs and their failure rate

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +3

      Sure but the flip side is we stopped using disks several years ago except in special use cases

    • @artemis1825
      @artemis1825 หลายเดือนก่อน

      @@ServeTheHomeVideo Ah I guess I could always check the surveys from hyperscalers.

    • @masterTigress96
      @masterTigress96 หลายเดือนก่อน

      @@artemis1825 You can check the statistics from BackBlaze. They have been analyzing drives for many, many years as they are a back-up as a service provider, so they definitely need cheap, reliable long-term storage devices.

    • @Brian-L
      @Brian-L หลายเดือนก่อน +1

      Does Backblaze still publish their annual spinning rust analysis?

  • @moeness86
    @moeness86 26 วันที่ผ่านมา +2

    That doesn't address sudden-death failures... drives will fail in any category, but a heads-up is always nice. Any idea how to check an SSD for issues ahead of failure? A follow-up question would be how to do that with a RAID array? Thanks for sharing.

  • @BangBangBang.
    @BangBangBang. หลายเดือนก่อน +2

    the Intel 5xx series SSDs we used to deal with back in the day would be DOA or die within 90 days, otherwise they were usually fine.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +1

      Yea I worked with a company who was seeing over 50% AFR on certain consumer Samsung SATA drives in servers

  • @drd105
    @drd105 หลายเดือนก่อน +1

    storing a lot of videos is a pretty niche use. VMs are in much more mainstream use. It's easier to keep old VMs around than treat configuring systems as a lifestyle choice.

  • @DrivingWithJake
    @DrivingWithJake หลายเดือนก่อน +2

    We've mostly only seen people who really abuse drives run into issues.
    The most-used drives we find are for databases, which use up the most life, other than people trying to use them for mining.
    The smallest NVMe we use is 1TB as a default, but we've had a lot of 15.36TB drives for the past 4-5 years now.

  • @chuckthetekkie
    @chuckthetekkie หลายเดือนก่อน +1

    I think one reason people still use HDDs is the lower upfront cost compared to SSDs. I have two 4TB Intel DC 4510 U.2 SSDs that I bought used last year for about $200 per drive. Each drive has over 4 years 8 months of power-on time and about 6.4PB read. I do have HDDs that I bought over 10 years ago that have way more power-on hours and still work just fine.
    I myself, and other people I know, still worry about SSD reliability, especially when it comes to QLC. I have a friend who REFUSES to even touch QLC SSDs. Usually HDDs start acting up before they fail, whereas SSDs will just stop working altogether with no warning. That has been my experience with both. I have had two M.2 NVMe SSDs fail on me with no warning: one just stopped being recognized and the other actually kernel panics my Mac when plugged in. They were both being used in an NVMe USB enclosure.
    For SSD endurance I only ever looked at the TBW (terabytes written) rating and never cared about DWPD.
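
    As a side note, converting between the two ratings is straightforward if you assume the rating is defined over the warranty period (commonly five years). A minimal sketch, with illustrative figures only:

    # Rough DWPD <-> TBW conversion, assuming the rating applies over the
    # warranty period (commonly 5 years). Example figures are illustrative only.
    def dwpd_to_tbw(dwpd: float, capacity_tb: float, years: float = 5.0) -> float:
        return dwpd * capacity_tb * 365 * years

    def tbw_to_dwpd(tbw: float, capacity_tb: float, years: float = 5.0) -> float:
        return tbw / (capacity_tb * 365 * years)

    print(dwpd_to_tbw(1.0, 3.84))   # a 1 DWPD, 3.84TB drive -> ~7008 TBW
    print(tbw_to_dwpd(1200, 2.0))   # a hypothetical 1200 TBW, 2TB drive -> ~0.33 DWPD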

  • @45KevinR
    @45KevinR หลายเดือนก่อน +1

    It's really interesting, and it suggests to me that for general-population drives the obsession with getting enterprise vs. retail drives is unnecessary, at least for endurance. Write caching, safe cache writes, and perhaps the lifespan of the electronics might all still push a user towards enterprise drives - but you've basically shown endurance isn't a factor.
    However, if the hosting/cloud industry still worries about it, that should mean a steady supply of retired drives that a home user or even STH can get cheaply! 😎👍
    I'd like to hear your thoughts on, say, ZFS write caches or daily backup targets.
    Presumably they would be subjected to much fiercer write traffic (though probably sequential), and endurance might get tested more. Though I guess even a backup target would max out at about one drive write per 24 hours - less if you keep multiple days on the one drive/volume.
    Which only leaves ZFS (or similar) as a concern. Thoughts?
    On rebuild time, I guess that takes us back to the original axiom of RAID: an array of *inexpensive* disks. When RAID was first conceived, HDDs were small and fragile by today's standards and large drives were expensive, so RAID facilitated using smaller, cheaper drives to make a large volume that was fault tolerant. These days people have forgotten the small-and-wide part, and you get, say, 12TB in a mirror. However, that's pricey to replace, and the rebuild will take hours to days and might even stress the new drive. 100% writes for hours and hours isn't the normal use or failure mode, so the rebuild might be the hardest use the drive ever gets! And a small NAS might only support 2-4 drives. Though I guess this also reminds us that a RAID volume isn't a backup, it's a resilient original; a real backup is our safety net until the rebuild is complete.
    (Sorry for the essay. A thought-provoking video.) Hope it keeps my paragraphs 😮

    • @45KevinR
      @45KevinR หลายเดือนก่อน

      To rain on my own parade: looking at Solidigm's own workstation M.2 drives and using the 5-year warranty period, you only get about 0.1 drive writes per day on the 660p and 0.2 on the 670p. So they are a bit on the edge for server use.

  • @FragEightyfive
    @FragEightyfive หลายเดือนก่อน +1

    I would consider myself a power user, and looking at some of my primary SSDs from the mid-2010s, I'm at about 0.12 DWPD based on hours... and the second-oldest/most-used 256GB drive, which still sees near-daily use in a laptop, is still at 83% drive life remaining.
    When I first started using SSDs, I kept track of usage statistics. I stopped doing that after a few years when I realized that, on paper, the NAND will last at least 100 years. Something other than drive writes is going to cause a failure (except maybe bad firmware that writes too much to some cells).
    I have been working with some large data sets on my main desktop more recently (tens to 100+ GB), and even the 2TB and 4TB NVMe drives are at a similar DWPD, and at 95% after 2 and 5 years.

  • @Michael_K_Woods
    @Michael_K_Woods 21 วันที่ผ่านมา +1

    I think the main reason system guys like high drive writes per day is the implied hardiness. They will pay the extra money for a 16 over a 4 if they believe it decreases maintenance and disruption odds.

  • @charlesspringer4709
    @charlesspringer4709 หลายเดือนก่อน +1

    Wow, what a mass of words! Should I get SSDs, and which ones?

  • @kelownatechkid
    @kelownatechkid หลายเดือนก่อน +2

    Optane for write-heavy/DB workloads and literally whatever else for bulk storage haha. Ceph especially benefits from Optane for the DB/WAL.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      If you saw, we bought a lot of Optane and have an entire shelf of it

  • @mystixa
    @mystixa หลายเดือนก่อน +1

    Good analysis, but with an oversight. I've had many SSDs and HDDs fail over the years. The problem is that a lot of the time an SSD will fail quickly and then be unrecoverable en masse with 100% data loss, whereas an HDD often fails progressively, with errors showing up in scans or as bad behaviour. When data from some sectors is lost, almost always some of the rest is salvageable.
    With an appropriate backup strategy this makes it less of a problem, of course, but it does shift the emphasis of how one cares for the data.

  • @Zarathustra-H-
    @Zarathustra-H- หลายเดือนก่อน +3

    Just for shits and giggles I calculated the DWPD on all of the SSDs in my server. The highest was on my two Optanes (which I use as mirrored SLOG drives): a whopping ~0.1 DWPD average over ~3 years. :p
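
    For anyone who wants to repeat that exercise, here is a minimal sketch of the arithmetic, assuming an NVMe drive and smartctl's JSON output (NVMe reports data_units_written in units of 1000 x 512 bytes, and power-on hours stand in for time in service, so treat the result as an estimate):

    # Sketch: estimate real-world DWPD from NVMe SMART counters via smartctl.
    # data_units_written is in 1000 * 512-byte units per the NVMe spec;
    # power_on_hours is used as a rough proxy for time in service.
    import json, subprocess, sys

    def actual_dwpd(dev: str, capacity_tb: float) -> float:
        j = json.loads(subprocess.run(["smartctl", "-j", "-a", dev],
                                      capture_output=True, text=True).stdout)
        log = j["nvme_smart_health_information_log"]
        written_tb = log["data_units_written"] * 512_000 / 1e12   # host TB written
        days = max(log["power_on_hours"] / 24, 1)
        return written_tb / capacity_tb / days

    if __name__ == "__main__":
        # e.g. for a 1.92TB drive: python3 dwpd.py /dev/nvme0 1.92
        print(round(actual_dwpd(sys.argv[1], float(sys.argv[2])), 3))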

  • @mehtotally3101
    @mehtotally3101 หลายเดือนก่อน +8

    Correct me if I am wrong, but the DWPD is only rated for the 3-5 year "lifespan" of the drive. So 1 DWPD for three years on a 1TB drive means approx. 1095 drive writes. If you keep the drive in service for 10 years, that means it could only handle about 0.3 DWPD. So the proper way to evaluate these drives is really total rated drive writes vs. total drive writes performed. Flash drives take essentially no wear from reads or even from being powered on, so their lifespan is really gated by how much of their total write capacity has been used up. I have never understood why the metric was per day: who cares when the writing is done, the question is how much writing has been done.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +4

      Usually five years of 4K random writes. You are correct that PBW is a more useful figure, which is why we did this piece to show why DWPD is not a good metric anymore. Also, the type of writes impacts how much you can write; actual endurance is usually much higher than the rated DWPD implies.

    • @cameramaker
      @cameramaker หลายเดือนก่อน +1

      @@ServeTheHomeVideo The DWPD is more useful than PBW because it is not a function of capacity. The DWPD figure easily splits drives into read-intensive (low DWPD) and write-intensive (high DWPD) kinds. Also, say you have some sort of online service which, e.g., accepts a 1Gb/s continuous feed that you need to save or buffer - that is 86400 Gb/day, which is 10800 GB = 10.8 TB. So all you care about is having either a 10.8TB drive at 1 DWPD or a 3.6TB drive at 3 DWPD to be on the safe side of the 5-year warranty (a quick sizing sketch follows at the end of this thread). With the PBW metric the formulas for such a streaming/ingest use case get much more complicated.

    • @MichaelCzajka
      @MichaelCzajka หลายเดือนก่อน

      My drives usually get upgraded at regular intervals:
      I'm always looking for faster drives i.e. PCIe3 -> PCIe4 -> PCIe5
      Bigger drives are also desirable as you want a bit of overcapacity if possible.
      Overcapacity is less of an issue if the drive is mainly read (storage) rather than written to.
      Total number of writes is the most useful metric as it predicts failure.
      However as drive speed increases the number of potential writes also increases.
      If you have a fast drive you'll find the number of detailed searches you do is likely to increase.
      The amount of data you write to a fast drive is also likely to increase... as some of the more time consuming tasks become less onerous.
      If a drive has an expected lifespan of 10 or more years... that's when you don't have to constantly monitor your drives for failures.
      That's one less thing to worry about on your computer.
      Drive metrics often make the expected lifespan quite hard to work out.
      Early on there were a lot of SSD failures.
      Nice to see that the situation has now reversed.
      There doesn't seem to be any manufacturer with an SSD reliability problem.
      🙂
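
      Following up on cameramaker's ingest example above, a minimal sizing sketch (the rate and DWPD figures are just the ones from that example):

      # Sketch: minimum drive capacity to absorb a sustained ingest rate
      # within a given DWPD rating (reproduces the 1Gb/s example above).
      def min_capacity_tb(ingest_gbit_s: float, dwpd: float) -> float:
          tb_per_day = ingest_gbit_s / 8 * 86_400 / 1000   # Gbit/s -> TB/day
          return tb_per_day / dwpd

      print(min_capacity_tb(1.0, 1.0))   # ~10.8 TB needed at 1 DWPD
      print(min_capacity_tb(1.0, 3.0))   # ~3.6 TB needed at 3 DWPD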

  • @RichardFraser-y9t
    @RichardFraser-y9t หลายเดือนก่อน +1

    Penta-level (or even septa-level) cells might only last 500 rewrites, but they are nothing to worry about.

  • @harshbarj
    @harshbarj หลายเดือนก่อน +2

    I'd move to SSDs, but there is one MASSIVE barrier: cost. Right now my 2-drive array cost me under $150 for 8TB of storage. As of this moment the cheapest 8TB used enterprise SSD I can find is $1099, so my array as an SSD solution would cost me $2200, rather than the ~$150 it cost me.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน +2

      7.68TB DC SSDs can be purchased "new/other" (e.g., leftovers and spares) for $500ish.

  • @ChipsChallenge95
    @ChipsChallenge95 29 วันที่ผ่านมา +1

    I've worked with and worked for many companies (hundreds) over the last couple of decades, and every single one of them destroyed their drives after use or contracted someone to do it and was provided certificates of destruction. Idk how you managed to find so many used drives.

  • @EdvardasSmakovas
    @EdvardasSmakovas หลายเดือนก่อน +1

    Did you analyze data writes only, or NAND writes as well? I think the write amplification factor should be mentioned in this context, since depending on your storage array setup it can result in many times more writes.
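
    For context, write amplification is just the ratio of physical NAND writes to host writes. Where the NAND-writes counter lives is vendor-specific (often a vendor SMART log), so the figures below are placeholders:

    # Write amplification factor (WAF) = NAND (physical) writes / host (logical) writes.
    # The NAND-writes counter is vendor-specific, so both inputs are placeholders
    # you would fill in from your own drive's tooling.
    def waf(nand_bytes_written: float, host_bytes_written: float) -> float:
        return nand_bytes_written / host_bytes_written

    # Hypothetical counters: 180 TB hit the flash for 60 TB of host writes.
    print(waf(180e12, 60e12))   # 3.0 -> the workload/array layout triples the wear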

  • @teardowndan5364
    @teardowndan5364 5 วันที่ผ่านมา

    For home storage, I'd be more concerned with cold-storage data retention than write endurance. And drives with higher endurance usually hold data longer.

  • @glebs.
    @glebs. 20 วันที่ผ่านมา +3

    You focus entirety on DWPD, ignoring other metrics like TBW

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  20 วันที่ผ่านมา +1

      Yes. TBW is more interesting than DWPD these days, which is why we should not use DWPD as heavily anymore.

  • @cjcox
    @cjcox หลายเดือนก่อน +1

    I think for normal (not unusual) cases, the outage scenarios due to NAND wearing out from writes would be cases where, by algorithm or lack of TRIM, you were hitting particular cells with writes more than others. So the TBW sort of thing goes out the window when talking about those types of scenarios. The good news there? Rare, just like the other situations you mentioned. With that said, SSD quality can be an issue: I have RMA'd a new Samsung SATA SSD (a 2TB 870 EVO) that started producing errors in the first year. So there are failure modes apart from NAND (assuming good NAND) lifetime as well, and I think those are the errors that are more likely to occur.

  • @ABaumstumpf
    @ABaumstumpf หลายเดือนก่อน +2

    We had some problems with fast SSD failures... They didn't experience any high write rates and never got even close to being full, but they failed at an alarming rate compared to the normal HDDs we used (in RAID 5). Turns out it was just the use case: most of the writes came from logfiles, and the SSD controllers were not all that great with constant tiny writes to just a few files. Switched to some SAS SSDs and no problems since.
    And OK, our use case was also a bit special, as we mostly needed RAM and CPU performance plus decent networking, and the storage only needed to last.
    In general for any active storage I will go hybrid - SSD for the main stuff, HDD for bulk storage that really does not need high performance. And of course for any cold storage SSDs are just a no-go.

    • @stevesteve8098
      @stevesteve8098 หลายเดือนก่อน

      Logfiles will destroy an SSD; it's what takes down most consumer equipment - embedded Linux writing log files to the device's SSD or NAND flash.
      He talks about having a couple of HDD failures; that's fine, because I have a load of >70,000-hour hard drives with ZERO failures and errors.
      But one absolutely critical point he missed in all this data was the drive temperature; without it, these figures mean sweet FA.

    • @ABaumstumpf
      @ABaumstumpf หลายเดือนก่อน

      @@stevesteve8098 Yeah, logfiles are kinda evil for that.
      And temps - yeah. But I would assume (or at least hope) that they run their hardware in a somewhat controlled environment, all under similar conditions.
      I do remember the days of school computers being tiny machines with just one small fan, stuffed into a wooden cabinet with only some holes at the top - and the hardware getting absolutely cooked for years :D

  • @comp20B
    @comp20B หลายเดือนก่อน +1

    I have been sticking to 5-year-old Dell enterprise hardware.
    Currently my need is just 8TB within TrueNAS. Enterprise SAS SSDs have been a huge leap for my use.

  • @whereserik
    @whereserik 29 วันที่ผ่านมา +2

    I value this. Thank you

  • @SomeRandomPerson
    @SomeRandomPerson หลายเดือนก่อน +1

    Just casually showing off a 60TB SSD there.

  • @KevoHelps
    @KevoHelps หลายเดือนก่อน +1

    I would challenge another human to a “to the death” fight for one of those 61TB SSDs

  • @danw1955
    @danw1955 หลายเดือนก่อน +1

    Wow! Only $8250 for a 61.44TB SSD. I'll take 3, please. That should be enough storage for my little home lab.🤣 That's just bonkers! I have about 16TB available on my NAS and I back up all my running machines to it, and it's STILL only about 1/3 full!!😄

  • @Koop1337
    @Koop1337 หลายเดือนก่อน +3

    So like... Can I get some of those drives now that you're done testing them? :)

  • @Vegemeister1
    @Vegemeister1 หลายเดือนก่อน +1

    Intel drives were known for going read-only and then bricking themselves on the next power reset when lifetime bytes written hit the warranty limit, whether those had been small-block writes or large sequential, and whether or not the drive was still perfectly good. Does Solidigm retain that behavior?

  • @Superkuh2
    @Superkuh2 หลายเดือนก่อน +1

    SSDs aren't actually so much larger now. The vast majority of SSDs used, even by IT geeks, are vastly smaller than HDDs. Even in 2024, 1 or 2 TB is *normal*, and that's insane - that was *normal* for HDDs in 2009. No human person can really afford to buy an SSD that is larger than an HDD; that is only something corporate persons can do.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  หลายเดือนก่อน

      Solidigm told me 3.84TB is common for them, but 7.68TB is rapidly gaining. The 61.44TB drives are lower volume, but they are selling every one they can make.

    • @Superkuh2
      @Superkuh2 หลายเดือนก่อน

      @@ServeTheHomeVideo 7.68TB is finally a respectable size, equal to HDDs of 2013, ~10 years ago. I sure hope we see more of that in the future, and without it being 7 times the price.

  • @PeterBlaise2
    @PeterBlaise2 หลายเดือนก่อน +1

    Can you please test the data access rates and data transfer rates to see if the used drives are really performing according to manufacturer promises?
    Steve Gibson's GRC free ReadSpeed acknowledges "... we often witness a significant slowdown at the front of solid state drives (SSDs), presumably due to excessive use in that region ...".
    And free HDD Scan and free HD Tune can show us graphs of the slow or even unreadable sectors.
    And then SpinRite 6.1 Level 5 or HDD Regenerator will show the qualities of every sector's REWRITABILITY.
    Without that information, it's impossible to know the true value of any of those SSDs, really.
    Let us know when you have the next video with a full analysis of the SSD drive's REAL qualities to READ and WRITE compared to manufacturer performance specifications.
    Thanks.
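
    For anyone wanting to try the kind of region test ReadSpeed performs, here is a minimal sketch: time large sequential reads at a few offsets across the raw device. It assumes Linux, root access, and a device path of your own; O_DIRECT is used so the page cache does not mask slow regions.

    #!/usr/bin/env python3
    # Sketch of a ReadSpeed-style region test: time sequential reads at a few
    # offsets across a raw block device to spot slow regions. Linux-only, run
    # as root; the device path and read sizes are assumptions to adjust.
    import mmap, os, sys, time

    DEV = sys.argv[1] if len(sys.argv) > 1 else "/dev/nvme0n1"
    CHUNK = 4 * 1024 * 1024        # 4 MiB per read (4KiB-aligned for O_DIRECT)
    TOTAL = 256 * 1024 * 1024      # read 256 MiB per test region

    def region_speed(fd: int, offset: int) -> float:
        buf = mmap.mmap(-1, CHUNK)  # anonymous mmap is page-aligned, as O_DIRECT requires
        start = time.perf_counter()
        done = 0
        while done < TOTAL:
            got = os.preadv(fd, [buf], offset + done)
            if got <= 0:            # end of device or read error
                break
            done += got
        elapsed = time.perf_counter() - start
        buf.close()
        return done / elapsed / 1e6  # MB/s

    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
    size = os.lseek(fd, 0, os.SEEK_END)
    for pct in (0, 25, 50, 75, 99):
        off = (size * pct // 100) // 4096 * 4096   # keep offsets 4KiB-aligned
        print(f"{pct:>3}% offset: {region_speed(fd, off):8.1f} MB/s")
    os.close(fd)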