I have six 2TB Samsung 870 EVOs. How dumb would it be to add them as a special vdev for the metadata to my pool of eight 4TB WD Red Plus drives (in raidz3 or mirrors)?
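For rough planning, the capacity side of that question can be sketched in a few lines of Python. This is only back-of-the-envelope arithmetic: it assumes equal-size drives, ignores ZFS overhead such as padding and reserved space, and the function names are made up for illustration. It says nothing about whether the layout is wise, only how much space each option would leave.

```python
def raidz_usable_tb(drives: int, size_tb: float, parity: int) -> float:
    """Approximate usable space of one RAID-Z vdev: data drives times drive size."""
    if drives <= parity:
        raise ValueError("a RAID-Z vdev needs more drives than its parity level")
    return (drives - parity) * size_tb


def mirrors_usable_tb(drives: int, size_tb: float, ways: int = 2) -> float:
    """Approximate usable space of a set of n-way mirror vdevs striped together."""
    if ways < 2 or drives % ways != 0:
        raise ValueError("drive count must be a multiple of the mirror width")
    return (drives // ways) * size_tb


# Hypothetical layouts taken from the comment above: eight 4TB HDDs, six 2TB SSDs.
hdd_raidz3 = raidz_usable_tb(8, 4.0, parity=3)
hdd_mirrors = mirrors_usable_tb(8, 4.0, ways=2)
special_2way = mirrors_usable_tb(6, 2.0, ways=2)
special_3way = mirrors_usable_tb(6, 2.0, ways=3)

print(f"HDD pool, raidz3:           {hdd_raidz3} TB")
print(f"HDD pool, 2-way mirrors:    {hdd_mirrors} TB")
print(f"Special vdev, 3x 2-way:     {special_2way} TB")
print(f"Special vdev, 2x 3-way:     {special_3way} TB")
```

Under these assumptions raidz3 leaves 20 TB against 16 TB for two-way mirrors, and the six SSDs give 6 TB as three two-way mirrors or 4 TB as two three-way mirrors.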
ZFS is supposed to consume as much memory as it is given, essentially, for the caching that it does. It also is designed to give consumed memory back to higher priority processes when necessary.
Does any of this matter to a guy like me with one do-it-all rig? Or is this really only relevant to enterprise folks? Getting back into linux and always curious whether file systems matter to me or not.
It’s a matter of how much your data matters to you. I’ve lost data to bit rot, and to having a backup drive fail at the same time as the primary drive. So I always have at least three backups and they are in three different locations. My primary storage and the local backup are on ZFS. Second backup is online. Third backup is a Windows box in a different city. If they all fail I figure something must be happening such that losing my data is the least of my problems...
i doubt it. the baseline gain of being able to do random access without the latency associated with physically moving a read-head to the location of the data should improve your performance
After several attempts at this, and experiments, I concluded it didn't do me any good. It slowed my write speed to about a third. Plus the documentation and writings are currently very shady and inconsistent. There seems to be a load of risks and pitfalls to implementing this, and not much to gain. It seems to be cheaper and better to just throw more RAM at it. If you need performance, my suggestion is to just create an SSD pool. Even an L2ARC is just a temporary boost to what was recently copied to the pool, and will only have a noticeable effect if you are running programs directly off the pool and not adding much additional data. The metadata vdev might make sense if you run VMs and databases directly from the ZFS pool. If you do, be sure to have a UPS and at least a 3-way mirror for the metadata vdev, and perhaps consider PLP SSDs for the job, or a power loss may be the end of your pool.
So what? I doubt it can get even close to what ZFS has to offer. ZFS is a huge beast and it's understandable if you have reservations about it. I had reservations about it. Until I started using it. To paraphrase a famous quote: Unfortunately, no one can be told what ZFS is. You have to see it for yourself. Sure, it's not perfect. But many of its problems are being worked on. We have file-level clones (reflinks) in the works. Also, the ability to widen a RAID-Z VDEV. These are two of my biggest gripes. And I actually don't buy into the mantra that it's block device management. I manage my devices with LVM and lay ZFS on top of that. Then I can get different redundancy policies for different areas of the drives. Again, it's not perfect. It's just that overall it's the best filesystem there is. By far! The beauty of Linux is that you can use any supported filesystem if you think it better fits a particular scenario. So I look forward to bcachefs for some niche use case. In the meantime, I have learned to stop worrying and love ZFS.
@@nmihaylove The main problem I have with ZFS is that it requires a lot more from the admin and the hardware than BCacheFS or BTRFS, and offers in return some percentage gains in benchmarks and some multi RAID levels (which are nice, but if you *need* that kind of redundancy and have the money to spare, using a distributed storage with smaller and cheaper boxes will give better safety). For a lot of people - it just isn't worth it. I have yet to find a use-case that isn't better served by either a cheaper distributed storage or a cheaper file system.
@@guss77 So for you administering distributed filesystems is easier than administering ZFS?! And the latter doesn't need to be complicated. FreeNAS does an excellent job in hiding the details while at the same time offering many services on top. I really do not understand your stance. You should tell Wendell that he's wasting his time with ZFS.
@@nmihaylove Yes. Ceph setup is super easy - install some stuff, assign block devices and all runtime is handled for you. There is no designing of cache layers and assigning different raid levels, figuring out how to build a drive hierarchy, there is basically no "day to day" management, and if there are issues - you get mail. GlusterFS is somewhat more complicated. Some stacks even have nice web UIs for admins to remotely monitor and manage (see: documentation.suse.com/ses/6/html/ses-all/ceph-dashboard.html )
ZFS was decommissioned out of my entire network of systems back around 2007. ALL SSDs that I had deployed have been decommissioned out of my entire network of systems this year (2020).
Been using a pool with a special vdev for a year now. Still a little bummed that you can't have a separate DDT and metadata vdev but Intel kept changing the code so much that I'm surprised the feature got added at all.
If I have SSDs wouldn't a better use for them be for L2ARC? It can also cache metadata and will cache the metadata that is actually used so it won't waste space. But also will cache data (it can be controlled per dataset what to cache - data/metadata/both). And with the persistent (across reboots) L2ARC, which is coming soon with ZFS 2.0.0 I think, you'll achieve pretty much the same thing. And you won't need redundancy because the L2ARC is checksummed so errors will be caught and the authoritative source of information resides on redundant VDEVs in the pool. It seems you cannot do better than that.
Is it the magic bullet for dedup? I really want it to be, but I'm scared and don't have the spare hardware to try it without affecting my production workflow. Please do a follow-up on a deeper dive on this subject. Love your work, thank you.
This video should have CCs, or at the very least English subs, please. The way I see it, this is the open-source Fusion Drive solution; even a single RPM disk can benefit from it.
As soon as I started building a ZFS machine.. Wendell showed up on YT.. this is a sign from the Gods of IEEE XD
The one thing that I think is the most obviously missing thing in all forums and discussions about ZFS is designing from the top down user perspective.
There is a lot of (really great) info about the lower level technologies and ZFS design principles, but I think we need someone to talk about pool requirements from the user file perspective. I think the best way to do it would be to discuss the best pool design to support the most common file storage scenarios - e.g. just video files, just MP3/audio files, just photos, just documents, just database files - then work out what is best for each of those, then work out how to best combine those into one system. I think I'll start a discussion in the TrueNAS Forum.
I'm going to have to try this on my archive box. I've got twelve 3TB drives in it, and half the IO is used doing meta-data lookups!
Have you had any issue with the odd numbered storage drives? In my experience they're sometimes more flaky because they're rejected 4TB drives for example.
@@AgentLokVokun I haven't, but I also haven't had that much experience with ZFS. The current array is mirrored pairs, but I'll probably reformat it as raidz2 soon to get more space.
@@MarkRose1337 TY and Good luck!
woohooo, a zfs video!
Great video 👍
1:30 -- "Give [ZFS] raw block devices, and it figures it out. And it works great."
Helpful as am shading towards possibly dumping LVM altogether.
Kindest regards, friends.
I just wanna say, the ZFS topic combined with the Johnny 5 in the background next to a Threadripper is the best combination for this night
Thanks for sharing
Coming to theaters near you "ZFS Tables on Intel Optane DIMMs". Starring Intel's Top Research Scientists... ;-)
Actually if intel sees your comment they may have a eureka moment. They certainly need to figure something out to stay in the game.
If I don't remember wrongly, the special vdevs are permanent like how adding a regular vdev is permanent, unlike slog and l2arc that you can add and remove at will. So, redundancy for those is critical.
Even though you can remove slog, it's a critical part of the pool as well, and you will lose some data if there were writes in progress when it failed.
Having seen the forum post, I did a little experimentation. You can remove the special vdev using zpool remove (I just tested with one), and it appears to copy the data to the main pool. But, yes, redundancy is critical because, as Wendell said, loss of the special vdev will lose the pool. But you should still have a backup :0)
@@adrianstephens56 is it not possible to say "store the metadata on the regular hdd's themselves as well. but just have a faster copy on the ssd that you'll use... if the ssd is available"?
I filed a bug with the ZFS team years ago; the response was basically "yep, that's broken". It's an awesome file system, but I moved to LizardFS/MooseFS a few years ago for my "low" performance stuff. Never looked back.
Do you have any experience with Ceph as well? I wonder how they compare for "low" performance situations.
What was the bug that you found?
@@MarkRose1337 I haven't used LizardFS, but from what I understand Ceph is superior to LizardFS in that: (1) it offers multiple APIs: "direct" block device with thin provisioning, POSIX and S3 compatible object store while LizardFS only offers POSIX. (2) LizardFS requires, by design, at least 2 servers with very high minimum hardware but isn't considered reliable or high performing before you have at least 2 metadata servers and 2 chunk servers, while Ceph deploys nicely on a single node and is considered reliable with 2 nodes. (3) LizardFS uses an underlying file system for storage while Ceph directly manages block devices - this gives Ceph an edge in performance and reliability (and also makes setup a bit simpler and more similar to a ZFS based NAS).
@@MarkRose1337 The performance of a ceph cluster is hugely dependent on its installation, the type of replication and the performance of the network and OSDs. Ceph's primary objective is resiliency and data integrity over performance.
@@MarkRose1337 Ceph is amazing and CAN be extremely fast (but needs multiple SSDs to handle redundancy, meta and write cache)! But it has to be done JUST right. It's best if you can get to at least four or five servers working together. I am running LFS on a single server, with my desktop running as a metadata backup plus once a day meta backups to the cloud. One mess-up with Ceph (or ZFS) and POOF, your data is gone, and/or something is just "pissed" and it will take a while to figure out why. LFS/MFS are extremely forgiving, you just need to have very good backups of your metadata (or at LEAST two systems doing meta). You can even "roll back" metadata and you may or MIGHT not lose anything! If you do, it will be just those files, not entire volumes!
Great demo. What are the risks? Do we need to mirror the special device? What happens if the special device dies?
Hey Wendell, love the channel! I put your excellent knowledge to good use: most recently producing spots for the GRAMMYs. One thing that no-one talks about is what happens to current data when you add new vdevs to an existing pool. Will the metadata be written to the new special vdev? Do we need to make a new pool every time some essential new feature is added to ZFS (TrueNAS in our case at Colorways)? I really wish this was addressed in black and white. Any help would be so appreciated. Could you do a video on existing pools? All of the videos about ZFS and TrueNAS seem to be targeted to people making a TrueNAS box for the first time. I have a 300TB TrueNAS setup and I want to reconfigure a bit, but would love some guidance for ZFS users who want to upgrade their existing setups without starting from scratch. Thank you!
I'd love to see you do videos on ceph
Great video! Maybe that's the problem I have whenever I access my ZFS share via SMB. It just takes a hot second to load all the stuff. You can't really switch around like on a local hard drive or something and it's a bit annoying.
Same here! Cannot get Windows to index either
@@patrickwilcox4630 If you ever find a solution for that, lemme know. It's 6 Months now and tbh.. I just live with it now, because idk what to check anymore.
6:31 Btw, TrueNAS 12 will come out with OpenZFS 2.0 (ZFS on Linux). More info here on the features: www.ixsystems.com/blog/truenas-12-0-performance/
0:16 Icon of Saint Wendell who was touched by Almighty ZFS. :)
Hugely faster metadata access, and simultaneous access to file contents also gets a performance boost due to reduced load on the HDDs. Really nice. For my home storage which is mostly large media files, I was thinking I'd configure the ARC to only cache metadata and give it a decent chunk of memory. I may still do that, but this is a really nice feature to see in ZFS and I may use it myself.
If you enable ARC only for metadata then it won't cache actual data. You may think this will free up space for more metadata, which is probably correct. However, the ARC is adaptive (the A from the abbreviation) and you probably won't achieve much by screwing with it. But mostly, disabling the ARC for data will disable file prefetch (which goes into the ARC) and your sequential performance goes down the drain. I have experienced this first hand so be careful.
@@nmihaylove My thought was that I'm not reading the same files repeatedly, so there'd be no benefit in trying to cache the data. However, if sequential reads/writes will perform like crap that's no good. I'll look into this further. Thanks.
It looks like "zfs.arc_meta_min" could be a safer way. I could configure the ARC to favor metadata caching but reserve some space for prefetch data. I'll try different configurations and see what works best for my situation. For now, I still use Linux MD (raid6) for my storage and I've only played with ZFS in VMs.
How does this channel only have 150k subs!?
Because we have videos like zfs special devices and you? How many people in the world could possibly be interested in that?
They need to get tricky with the video titles. IE - Karen EXPLODES when her ZFS VDEV exceeds her expectations!
@@johncnorris 3 years later and you made me laugh - thank you
Thanks for this tip and the forum post, it is much appreciated. I'm a home hobbyist. I moved my nfs-mounted user files onto an nvme pool just because directory listing was slow. Now I've undone that and used those nvmes for a special vdev. Is there any way to move data onto the special vdev (excepting copying the data in the filesystem to a new directory and deleting the old one, which will make my snapshots and backups big)? The documentation on special vdevs is not very clear, so I tested first that I could also remove the special vdev if necessary (e.g. to upgrade the nvmes) without breaking the pool. What I didn't get around to testing was, if there are multiple special vdevs, removing one of them moves its data into the other special vdevs or returns it to the main pool.
Grease ⚡, I like this.
This is fantastic, but where's the automated storage tiering? It should be able to move hot blocks to SSD for caching and, as the cached data cools beyond the storage limits of the NVMe, remove it from the cache back to rust.
But if my pool runs like grease lightning, won't it be empty very soon?
Finally ! Someone fixed the metadata performance
Can you talk about the need for redundancy in regards to the metadata device? What happens when it fails?
Like with all VDEVs in ZFS, if the VDEV fails, the filesystem fails.
@@MarkRose1337 Thanks!
You can use a mirror as a special vdev
The same thing that happens when you lose power on a write back cache that doesn't have "crash to flash" capability, you lose what's in flight.
The metadata volume can be made redundant (RAID) too.
ALL your data is gone if it fails. Use a mirror!
Wait... I'm the first patron here. Wow. Evidently zfs is a little more niche of a topic than I thought...
Maybe the first. But alone you're not ;)
Why don't you do your special Goatse gang sign.
How does this impact the performance of a zvol compared to a dataset? Say I have a VM and its disk is a zvol, should I expect faster access after adding special devices?
I would like to share with you all my computer build. Corsair 1000D case ( kind of big and heavy, LOL ! ) Gigabyte B550 Vision D ( I like it, but it has limitations ). Ryzen 3600XT ( probably overkill for my needs ). Powercolor 5500XT ( I do not yet do any gaming ). I have Fedora Rawhide on the primary NVMe. Fedora 33 on the secondary NVMe. OpenMandriva Alpha on the 1st SSD, OpenMandriva Cooker on the 2nd SSD ( both use znver1 kernel and Cooker is compiled with clang ). And on my 3rd, SSD, I have the testing version of GhostBSD. Of those Rawhide is my favorite. GhostBSD, is barely usable on my new hardware, but I can see its potential for those that like alternative operating systems. I can easily add three more SSD's and mechanical hard-drives, but my motherboard only has four SATA ports. I have some extra SSD's laying around that have Linux distros I tinkered with in 2019. I plan to eventually have them in this rig, but will likely have to keep them unplugged, except when tinkering with them.
There are lots of things that I do not like about the Corsair 1000D. Anyone purchasing this case needs to plan on spending an extra $ 100 for custom cabling. And you want to purchase some 90 degree and 180 degree adapters for several of the cables to shorten the lengths necessary. When you open the hatch-doors on the cabling-side of the case, you can't have the cables tied down, as the SSD cables must have room to flex around. This requires extra thought, and each time you add a component or remove a component you have to disconnect lots of wires.
You are going to need to own a dolly to move this thing, and be extra-cautious. Do NOT even imagine trying to mail it or ship it.
When working on the case, you really need to take the glass doors off to keep from breaking them. They are heavy. Trying to put them back on requires patience, eyeglasses and adequate lighting.
Few motherboards under $ 400 if any have two front USB headers. And few under $ 250 have a front Type "C".
I wish I had gone with an EATX motherboard, but that would have required going Intel, and/or spending a whole lot more money on a Godlike motherboard or a Threadripper.
www.dropbox.com/s/ux2tzd819e0vymw/Photo%20Nov%2002%2C%208%2054%2006%20PM.jpg?dl=0
www.dropbox.com/s/qqlvzom546xwxt2/Davids_OpenMandriva_install.png?dl=0
www.dropbox.com/s/vxnemw155dwx8tp/IMG-1019.jpg?dl=0
Unix/Linux/*nix operating systems used to have the philosophy of do one thing and do it well.
The last few years have seen a blurring of that philosophy imo
Can you test 2x Intel P1600X 118GB partitioned 25%/75% to accommodate mirrored slog and metadata for a pool of 10x MX500 4TB RaidZ2? Thanks
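The sizes in that proposal are easy to work out ahead of any real test. Here is a minimal sketch, assuming decimal units, a mirrored pair (so usable space equals one device), and no allowance for ZFS overhead; the helper names are hypothetical. It only shows the arithmetic and does not say whether the resulting special vdev is big enough for a given workload.

```python
from typing import Tuple


def split_device_gb(capacity_gb: float, slog_fraction: float) -> Tuple[float, float]:
    """Split one device into (slog_gb, special_gb) by the given SLOG fraction."""
    if not 0.0 < slog_fraction < 1.0:
        raise ValueError("slog_fraction must be between 0 and 1")
    slog_gb = capacity_gb * slog_fraction
    return slog_gb, capacity_gb - slog_gb


def special_share_percent(special_gb: float, pool_usable_tb: float) -> float:
    """Special vdev size as a percentage of usable pool space (1 TB = 1000 GB)."""
    return special_gb / (pool_usable_tb * 1000.0) * 100.0


# Two 118GB devices mirrored, each split 25% SLOG / 75% special.
slog_gb, special_gb = split_device_gb(118.0, 0.25)

# Ten 4TB drives in RAID-Z2: roughly eight drives' worth of data space.
pool_usable_tb = (10 - 2) * 4.0

share = special_share_percent(special_gb, pool_usable_tb)
print(f"SLOG partition:    {slog_gb} GB")
print(f"Special partition: {special_gb} GB")
print(f"Special vdev is about {share:.2f}% of {pool_usable_tb} TB usable")
```

With these numbers each device would carry a 29.5 GB SLOG partition and an 88.5 GB special partition, and the mirrored special vdev would come to a bit under 0.3% of the roughly 32 TB of usable pool space.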
So if the special SSD holding the metadata dies, we lose all our data in the pool?
What kind of disk should be used for this? Enterprise-class SSD? PLP protection?
Elbereth. because of course.
3:17 made me laugh more than it should have 😂
Very cool. What will happen if the metadata fills up the vdev? Will everything just seize up, or will it somehow rotate the data stored on the vdev?
Maybe an idea for a video - how about a tutorial on how to copy a ZFS pool "for Linux" to a BSD one, with all the snapshots and everything intact?
Any updated guides on this? It says my 250TB pool consists of only 50 files, so I'm kinda stuck there lol.
So, like an idiot, I set up a mirror of 2 SSDs (consumer-grade) to test the special metadata vdev, and it ran for a month. Fast forward to yesterday, when an SSD crashed and the pool went offline. I am now at a place where I can't access my data and the pool won't come online in TrueNAS. Any assistance in recovering the data/metadata so I can get my content back is welcome. I might be pulling the drives tomorrow and wiping them since I have had no luck, but I was hoping for assistance before I go that route.
Is there any benefit to this in an all flash system, considering that the main pool would likely be much faster than the special device vdev?
Would it gain from being able to read and write in parallel to the main vdev?
Hey there is a ZFS pool on the roof! Go check it out!
Is your recordsize 512k? Why? Is it really suitable for something like homedir? Even given the mysql databases used by apps, like browsers?
I need more information here. I have a system that I think would seriously benefit from this. Please and thank you...
Filesystem NetApp style!
Wouldn't it be better to just set an OpenZFS property to load all metadata into ARC at boot, the way a persistent L2ARC gets rebuilt?
Spinning rust??? I now say this all the time, mostly because of you. Can you imagine how upset I was when I discovered that the spinning bits are glass and stainless steel??? :-D Spinning glass just doesn't have the same ring to it, does it? :-D I have 4TB of TLC SSD as a non-mirrored single vdev in a primary zpool mounted on /data with *everything* in it, and another zpool with two 4TB IronWolf Pro disks in USB enclosures as a single mirror vdev that is mounted read-only on /rust and is synced with snapshots from the SSD vdev. I wish I could have a single pool with a 3-way mirror vdev in it that only did reads from the SSD and did writes to the SSD with delayed/batched writes to the HDDs, but alas, the enterprise doesn't want/need that, so I'll have to make do with the snapshot mechanism and two pools. I'd also love it if the zpool could be told NOT to stripe across vdevs, so that if a whole vdev was lost the zpool would carry on with partial data loss and partial data integrity rather than a total failure. So as my storage grows I'll have 2*N zpools and matching 2*N vdevs, where N = the number of SSDs I currently need to store my data AND can fit inside my NUC9i9. The N SSDs are half of the vdevs/zpools, and the 2*N HDDs outside in an enclosure/enclosures are the other half, in mirror mode with snapshot sync. It seems like jumping through hoops to get ultimate reliability and great performance, but it's the best solution I could come up with for my needs.
run like grease lightning - run is correct - just keep admin control
I have six 2TB Samsung 870 EVOs. How dumb would it be to add a special vdev made of the Samsung SSDs, in raidz3 or mirrors, for the metadata of my pool of eight 4TB WD Red Plus drives?
What happens if you change the small allocation size? Do the file locations get readjusted on each disk?
I installed Ubuntu using ZFS and it is using so much RAM it blows my mind. I have 16 GB of RAM and most of the time 12-14 GB are used.
ZFS is supposed to consume as much memory as it is given, essentially, for the caching that it does. It also is designed to give consumed memory back to higher priority processes when necessary.
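If you want to see how much of that "used" RAM is actually ARC, here is a minimal Python sketch. It assumes OpenZFS on Linux, where the kernel module exposes counters in /proc/spl/kstat/zfs/arcstats with `size` (current ARC size) and `c_max` (the ceiling the ARC may grow to). The helper names `parse_arcstats` and `gib` are made up for this example.

```python
from pathlib import Path

# Assumption: OpenZFS on Linux; other platforms expose ARC stats differently.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Return {counter_name: int} from arcstats text.

    The file starts with two header lines, followed by
    'name type data' rows; rows that do not fit are skipped.
    """
    stats = {}
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def gib(n_bytes):
    """Convert bytes to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__" and ARCSTATS.exists():
    s = parse_arcstats(ARCSTATS.read_text())
    print(f"ARC size: {gib(s['size']):.1f} GiB "
          f"(may grow to {gib(s['c_max']):.1f} GiB)")
```

If the ARC accounts for most of the 12-14 GB, that is the cache doing its job. If you would rather cap it, the `zfs_arc_max` module parameter is the usual knob.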
I tried that script last night to see what files I had and it didn't work.
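For anyone else stuck on the script, here is a rough Python stand-in, not the one from the video. It just walks a directory tree and counts files per power-of-two size bucket, which is the kind of histogram you would look at before picking a small-block cutoff. `size_histogram` and `print_histogram` are names invented for this sketch.

```python
import os
from collections import Counter


def size_histogram(root):
    """Count regular files under root by power-of-two size bucket.

    A key k means: k/2 < file size <= k bytes. Empty files use key 0.
    Symlinks and unreadable files are skipped.
    """
    buckets = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue
                size = os.path.getsize(path)
            except OSError:
                continue
            # Smallest power of two that is >= size.
            bucket = 0 if size == 0 else 1 << (size - 1).bit_length()
            buckets[bucket] += 1
    return buckets


def print_histogram(buckets):
    """Print one line per bucket, smallest first."""
    for bucket in sorted(buckets):
        print(f"<= {bucket:>14} bytes: {buckets[bucket]}")
```

Call it as `print_histogram(size_histogram("/path/to/dataset"))`. Keep in mind this measures apparent file size, not what ZFS actually allocates, so compression and recordsize will shift the real numbers.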
Does any of this matter to a guy like me with one do-it-all rig? Or is this really only relevant to enterprise folks? Getting back into linux and always curious whether file systems matter to me or not.
It’s a matter of how much your data matters to you?
I’ve lost data to bit rot, and to having a backup drive fail at the same time as the primary drive. So I always have at least three back ups and they are in three different locations.
My primary storage and the local backup are on ZFS. Second backup is online. Third backup is a Windows box in a different city. If they all fail I figure something must be happening such that losing my data is the least of my problems...
Any idea how to monitor ZFS pool storage in grafana? Seems walled off.
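One possible route, as a sketch only: read `zpool list -H -p` (tab-separated, exact byte values) and turn it into Prometheus text format, which Grafana can then graph through something like node_exporter's textfile collector. The metric names `zfs_pool_*_bytes` and the function names below are made up for this example, and pool names are assumed not to need label escaping.

```python
import subprocess

# Pool properties requested from zpool, in column order.
FIELDS = ("name", "size", "allocated", "free")


def parse_zpool_list(text):
    """Parse tab-separated `zpool list -H -p` output into a dict per pool."""
    pools = {}
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            continue  # skip blank or unexpected lines
        name, size, allocated, free = cols
        pools[name] = {
            "size": int(size),
            "allocated": int(allocated),
            "free": int(free),
        }
    return pools


def read_pools():
    """Run zpool and parse its output (needs the zpool binary on PATH)."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-p", "-o", ",".join(FIELDS)],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_zpool_list(out)


def to_prometheus(pools):
    """Render pools as Prometheus text-format lines (hypothetical metric names)."""
    lines = []
    for name, values in sorted(pools.items()):
        for key, value in values.items():
            lines.append(f'zfs_pool_{key}_bytes{{pool="{name}"}} {value}')
    return "\n".join(lines) + "\n"
```

Writing `to_prometheus(read_pools())` to a `.prom` file from a cron job or timer would be one way to wire it up.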
Do I need high-performance SSDs for this?
i doubt it. the baseline gain of being able to do random access without the latency associated with physically moving a read-head to the location of the data should improve your performance
You say grease lightning, but my mind went straight to summer nights :)
Uh well-a well-a well-a huh!
Tell me more, tell me more.
Did you get very far?
That's a different song.
I need help setting up a Corsair Commander Pro fan-hub in Fedora. Can someone please help ?
After several attempts at this, and some experiments, I concluded it didn't do me any good. It slowed my write speed to about a third. Plus, the documentation and write-ups are currently very shaky and inconsistent. There seem to be a lot of risks and pitfalls in implementing this, and not much to gain. It seems cheaper and better to just throw more RAM at it. If you need performance, my suggestion is to just create an SSD pool. Even an L2ARC is just a temporary boost for what was recently copied to the pool, and will only have a noticeable effect if you are running programs directly off the pool and not adding much additional data. The metadata vdev might make sense if you run VMs and databases directly from the ZFS pool. If you do, be sure to have a UPS and at least a 3-way mirror for the metadata vdev, and perhaps consider PLP SSDs for the job, or a power loss may be the end of your pool.
Dr. Strangelove :)
When are you going to do ZFS on Windows?! Wendell, we need you man!
This is a linux channel...
One day bcachefs will be ready.
So what? I doubt it can get even close to what ZFS has to offer. ZFS is a huge beast and it's understandable if you have reservations about it. I had reservations about it. Until I started using it. To paraphrase a famous quote: Unfortunately, no one can be told what ZFS is. You have to see it for yourself.
Sure, it's not perfect. But many of its problems are being worked on. We have file-level clones (reflinks) in the works. Also, the ability to widen a RAID-Z VDEV. These are two of my biggest gripes.
And I actually don't buy into the mantra that it's block device management. I manage my devices with LVM and lay ZFS on top of that. Then I can get different redundancy policies for different areas of the drives.
Again, it's not perfect. It's just that overall it's the best filesystem there is. By far! The beauty of Linux is that you can use any supported filesystem if you think it fits a particular scenario better. So I look forward to bcachefs for some niche use case. In the meantime, I have learned to stop worrying and love ZFS.
@@nmihaylove The main problem I have with ZFS is that it requires a lot more from the admin and the hardware than BCacheFS or BTRFS, and offers in return some percentage gains in benchmarks and some multi RAID levels (which are nice, but if you *need* that kind of redundancy and have the money to spare, using a distributed storage with smaller and cheaper boxes will give better safety). For a lot of people - it just isn't worth it.
I have yet to find a use-case that isn't better served by either a cheaper distributed storage or a cheaper file system.
@@guss77 So for you administering distributed filesystems is easier than administering ZFS?!
And the latter doesn't need to be complicated. FreeNAS does an excellent job in hiding the details while at the same time offering many services on top. I really do not understand your stance. You should tell Wendell that he's wasting his time with ZFS.
@@nmihaylove Yes. Ceph setup is super easy - install some stuff, assign block devices and all runtime is handled for you. There is no designing of cache layers and assigning different raid levels, figuring out how to build a drive hierarchy, there is basically no "day to day" management, and if there are issues - you get mail. GlusterFS is somewhat more complicated. Some stacks even have nice web UIs for admins to remotely monitor and manage (see: documentation.suse.com/ses/6/html/ses-all/ceph-dashboard.html )
45 drives collab
Dislike(s) are from NTFS, FAT32, exFAT and Reiser fanboys.
ZFS was decommissioned out of my entire network of systems back around 2007. ALL SSDs that I had deployed have been decommissioned out of my entire network of systems this year (2020).
Been using a pool with a special vdev for a year now. Still a little bummed that you can't have a separate DDT and metadata vdev but Intel kept changing the code so much that I'm surprised the feature got added at all.
If I have SSDs wouldn't a better use for them be for L2ARC? It can also cache metadata and will cache the metadata that is actually used so it won't waste space. But also will cache data (it can be controlled per dataset what to cache - data/metadata/both). And with the persistent (across reboots) L2ARC, which is coming soon with ZFS 2.0.0 I think, you'll achieve pretty much the same thing. And you won't need redundancy because the L2ARC is checksummed so errors will be caught and the authoritative source of information resides on redundant VDEVs in the pool. It seems you cannot do better than that.
Is it the magic bullet for dedup? I really want it to be, but I'm scared and don't have the spare hardware to try it without affecting my production workflow.
Please do a follow-up on a deeper dive on this subject.
Love your work, thank you.
I hope you are not working with Dell or HPE for your storage server. I'll call the cops, Wendell. All jokes aside, I hope it's 45 Drives.
45 Drives was my first thought
This video should have CCs, or at the very least English subs, please. The way I see it, this is the open-source Fusion Drive solution; even a single spinning disk can benefit from it.