TrueNAS Cache Disks with FusionIO Drives

  • Published Feb 6, 2025
  • Like most people who use NAS Servers for video production, I was looking for better performance when it came to adding data to a storage pool. What I found was not what I expected.
    But first... What am I drinking???
    From Adroit Theory, it's the Death of Civilization 01, an 8.0% Hazy IPA. Super tropical nose and forward taste, predominantly pineapple. And a little interesting extra on the nose... watch the review.
    I've got merch! Head on over to craftcomputing... and start drinking like a pro!
    Follow me on Twitter @CraftComputing
    Support me on Patreon or Floatplane and get access to my exclusive Discord server. Chat with myself and the other hosts on Talking Heads all week long.
    / craftcomputing
    www.floatplane...

Comments • 240

  • @totoritko
    @totoritko 3 years ago +152

    Hey Jeff, lovely video. A couple of suggestions on your perf problem: setting "sync=always" is kind of a double-edged sword. This causes ZFS to hold off acknowledging any transaction until it is written to "safe" storage (like the slog). This means the size of the "batches" your clients are sending, usually the filesystem block size, is going to matter a lot for performance. This depends a lot on how you're connecting to your storage. Since you showed a Windows SMB transfer: for bulk transfers to run efficiently, you want to up your network MTU as much as possible, and also see if you can somehow convince your Windows client to send chunks of work that are as large as possible.
    Secondly on when to use a slog in the first place - generally, it makes very little sense for bulk transfers and bulk storage. Its function is primarily for stuff that is highly integrity-critical (like database stores) and stuff that is latency-critical. It can be used when used as a backing store for something like a VMDK from ESXi (VMware generally marks all transfers as "sync" anyway, so the sync=always option is usually not required there either), but that client is quite well optimized for sending many large 64k chunks in parallel.
    Lastly, logs don't need to be 1.2 TB in size. All a log is used to do is accumulate transfers marked as "sync" for the CURRENT transaction in ZFS. A single transaction usually lasts around 5 seconds. So if your log can accumulate more than 5 seconds' worth of transaction data, then it's large enough. Even on a 10G network, don't waste storage by carving out more than about 16-32GB for it. Use the rest for L2ARC, if you wish to accelerate reads as well.
    IMPORTANT caveat for slogs: **if** they have silent data errors on read back and your machine crashes, the log may not be recoverable, so beware! Moreover, slogs are generally used write-only, so it's not like ZFS will detect silently occurring data corruption just by using the slog. Running regular pool scrubs does check data currently on the slog device for integrity, so that *can* help detect data corruption, but for high-integrity enterprise applications, it is recommended to run two mirrored slog devices for this (yes, you can create a mirror of slog devices, just like for main pool devices). In some sense, by using a single slog device and setting sync=always, you are forcing all data, even that marked as "sync" to pass through this single point of failure. Without a slog, at least the sync data would not be acknowledged to the client until it is safely on mirrored spinning storage. However, with a slog, it'll get ACK'ed as soon as it hits the slog, so be aware that adding a single slog device comes with potential downsides to data integrity (which is why for enterprise applications, you'd add two, not just one).
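    For reference, a log vdev does not need to be big; carving a small slice out of each card and using the rest as L2ARC is just a couple of commands. A rough sketch, with placeholder pool, dataset and device names (not the exact setup from the video):
      # 32 GB of log is plenty for a 10G network; keep the rest for L2ARC
      sgdisk -n 1:0:+32G -n 2:0:0 /dev/disk/by-id/CARD_A
      sgdisk -n 1:0:+32G -n 2:0:0 /dev/disk/by-id/CARD_B
      # Mirrored log vdev from the two small partitions
      zpool add tank log mirror /dev/disk/by-id/CARD_A-part1 /dev/disk/by-id/CARD_B-part1
      # Remaining space as L2ARC (cache vdevs are always striped, never mirrored)
      zpool add tank cache /dev/disk/by-id/CARD_A-part2 /dev/disk/by-id/CARD_B-part2
      # And leave sync at its default unless the workload really needs sync=always
      zfs set sync=standard tank/videos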

    • @Prophes0r
      @Prophes0r 3 years ago +2

      Can you edit this to make it...readable? It's just a giant block of letters.

    • @KayJay01
      @KayJay01 3 years ago +26

      @@Prophes0r learn to read then, it's not pretty to look at but it gets the points across just fine. YouTube comments don't have to be structured like an essay to be valid...

    • @Prophes0r
      @Prophes0r 3 years ago +3

      @@KayJay01 Thank you for your opinion...

    • @totoritko
      @totoritko 3 years ago +20

      @@Prophes0r Added blank lines to split up the paragraphs a little. Hope this helps.

    • @Prophes0r
      @Prophes0r 3 years ago +4

      @@totoritko Better, thanks.
      What he really needs to do though is set up a metadata vdev.
      And use special_small_blocks assignments on specific datasets.

  • @highend88
    @highend88 3 years ago +28

    I knew where this was headed when he set sync to "always".

  • @Hacker-at-Large
    @Hacker-at-Large 3 years ago +45

    I’ve not used Proxmox or TrueNAS before, but I’d be tempted to put TrueNAS SCALE on the bare metal and use its virtualization services. I’d actually be interested in seeing that comparison.

    • @DangoNetwork
      @DangoNetwork 3 years ago

      One is a hypervisor, the other one is a SAN/NAS that can do KVM. They are totally different.

    • @morosis82
      @morosis82 3 years ago

      This also doesn't allow you to add it to the Proxmox cluster.

    • @cbremer83
      @cbremer83 3 years ago +1

      If it is anything like TrueNAS Core, it's not great. I have removed all jails and VMs from TrueNAS at this point. The VM function in particular is a bit flakey. Just not ready for prime time yet. I moved all of that stuff to a dual 8-core server I use for Proxmox. Everything is in its own little VM and runs far more stable. Then I have a 10GB link and all the VMs that need larger storage or access to files on my TrueNAS just connect over that. This has been bulletproof for the last year or so that I have had all of this set up. I even pulled Syncthing from my TrueNAS. It is just a storage server now.

    • @Prophes0r
      @Prophes0r 3 years ago +1

      It's also much more restrictive from a configuration/update perspective.
      It's not recommended to fiddle with too much in the shell on SCALE.
      It really is better to just run it in a VM.

    • @wishusknight3009
      @wishusknight3009 2 years ago

      TrueNAS on bare metal is the way to go, but I run separate NAS and VM boxes. The VM capabilities in TrueNAS for me are more or less provisional. They are OK for small, unimportant things, but don't expect it to be anywhere near something like ESXi or XCP-ng. I use only one VM on my TrueNAS box, and that's a Windows guest that is for torrent downloading only. For this it's great, as it flushes to disk extremely fast.

  • @VelcorHF
    @VelcorHF 3 years ago

    As always, I'm Jefe. Very cool that someone sent you some hardware to play with. Love that you dove into this one.

  • @shanemshort
    @shanemshort 3 years ago +33

    ZIL is *not* a write cache. This is a massive (common) misconception.

    • @philiphiggins6870
      @philiphiggins6870 3 years ago +5

      Yes. ZIL is effectively a backup copy of the synchronous writes that are still in RAM.
      Having the ZIL on a dedicated device allows sync writes to be acknowledged quicker, but will never help with non-sync writes.
      And you will never be able to buffer more writes than you have RAM anyway. Setting sync to 'always' will just slow things down more.
      To get an actual non-RAM write cache, you would have to use a block caching layer (like lvm cache), which is definitely not recommended with ZFS.
      Another way to maybe speed up writes would be to use the new special allocation devices to move metadata and small blocks onto the faster storage. It's not a cache as such, but moving a lot of the slow seek activity off the hard drives and just giving them big blocks to work with should allow them to perform better.

  • @chrisparkin4989
    @chrisparkin4989 1 year ago +1

    A Windows client over SMB writes data asynchronously meaning that once data has been loaded into your TrueNAS RAM the client can continue sending more data and it doesn't have to wait for the transaction to be committed onto stable storage which by default makes the process fast. Naturally zfs needs to commit the data to your pool via a transaction but your client doesn't have to wait for this. By setting your dataset to synchronous you have told your Windows client it MUST wait for the write to be committed to stable storage (in this case your SLOG) but this is an extra step as the data still goes to RAM first then SLOG and only then does the client get told go get me some more data. So you essentially slowed your pool down accidentally and got the results that were expected.
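    An easy way to see that on a live box is to watch per-vdev activity while a copy is running; with sync left at the default, the log vdev should sit almost idle. Pool and dataset names below are placeholders:
      # Per-vdev I/O, refreshed every second, while the SMB transfer runs
      zpool iostat -v tank 1
      # Confirm what the dataset is actually set to
      zfs get sync,recordsize,compression tank/share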

  • @NickF1227
    @NickF1227 3 years ago +4

    My previous post seems to have been deleted...
    But this behavior is by design and isn't because of a hardware problem, that may be a separate issue.
    Sync writes will always be slower than asynchronous writes by the very definition of the word in ZFS. The difference is that unless you have a SLOG you will be committing your writes to your pool twice, which dramatically slows things down; that is why SLOGs exist. The ZFS engineers wanted a way to ensure all writes were committed to disks, so sync writes were created. They realized that the only way to accomplish this would be to have a dedicated piece of hardware to hold the data in flight, which is what a SLOG is.
    But adding these checks and balances has an obvious negative impact on performance, and so fast SLOG devices that are low latency are needed. But that doesn't mean it will be faster than asynchronous writes, it just means you can rest assured that your data was successfully written to the other side.
    There's no such thing as a ZFS write cache.

  • @jenesuispasbavard
    @jenesuispasbavard 2 years ago +1

    The bottleneck isn't your single-core performance, it's your sync setting. Sync=always will *never* be faster than sync=disabled. And if your slog device isn't redundant, it can even be less safe than sync=disabled. totoritko's comment is a good summary of why.
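    If you want to measure that gap directly rather than eyeball an SMB copy, fio can run the same write job with and without forced sync; a rough sketch (the mount path is a placeholder):
      # Async writes: acknowledged as soon as they land in the in-memory transaction group
      fio --name=async --directory=/mnt/tank/bench --rw=write --bs=1M --size=8G --ioengine=psync --end_fsync=1
      # Sync writes: every write waits for the ZIL/SLOG before returning
      fio --name=sync --directory=/mnt/tank/bench --rw=write --bs=1M --size=8G --ioengine=psync --sync=1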

  • @christokutrovsky8086
    @christokutrovsky8086 3 years ago +6

    You could partition the SSD device into 4 or 8 partitions - and have each one added as write cache. I believe each volume will have its own thread.
    Note that ZFS doesn't really have "write accelerators" - it only has a "sync write" accelerator - which is intended for databases requiring fast write latency, not for large-scale "absorption" of writes.
    Some NAS vendors have this "write absorption" concept - but not ZFS.
    Also note that if something happens to your ZFS intent log devices - you're going to lose data - unless you mirror them.

  • @Enrythe8th
    @Enrythe8th 3 years ago +3

    My understanding (and it could be wrong now) was that the ZIL was pretty small regardless of the size of the drive. The entire FusionIO drive won't be used.

    • @MichaelSmith-fg8xh
      @MichaelSmith-fg8xh 3 years ago

      Correct

    • @kienanvella
      @kienanvella 3 years ago

      Yep. He'd probably be better off setting it up as a metadata "special" device

    • @mikes78
      @mikes78 3 years ago +1

      @@MichaelSmith-fg8xh From the sound of that then, the drive could be split into either two halves, or even a 3+1 split, and still provide adequate caching for the write transfers and the ZIL. I've thought about adding a cache drive to mine, but as mine is used for cold storage primarily, I've never really worried THAT much. I've worried more about the end PC and using RST caching of the drive, if not moving directly over to pure SSD.

    • @MichaelSmith-fg8xh
      @MichaelSmith-fg8xh 3 years ago

      @@mikes78 Even a crappy cache drive makes things more consistent on my TrueNAS (8 HDDs, 10Gb network, backup-style workload). My memory is a little foggy on the flash cards you can split, but it could be interesting to split out the volumes if they support that, to 1/ have read and write cache out of one card, 2/ expose multiple volumes to ZFS so you could do RAIDZ-1/2 for the write cache.

    • @berndeckenfels
      @berndeckenfels 3 years ago

      Yes, but it does not hurt to overprovision the flash a bit, especially if it is write-only. (Of course 1TB would be a bit overkill, but 240GB dedicated for the SLOG is probably fine. Without mirroring it, though, it's somewhat doubtful - just using sync=never would be cheaper. ;)

  • @WizardTim
    @WizardTim 3 years ago +7

    This confused me A LOT when I first looked into this but you had it correct when you said SLOGs are for data integrity, not to speed up [most] transfers.
    The SLOG as you have configured it doesn't act as an intermediate cache between RAM and HDDs; it is purely a non-volatile backup of RAM in the case of a power failure. During normal operation the ZIL is only written to, and only read from at boot; data is flushed directly from RAM to the HDDs, as this is faster and safer than the SLOG. You can check the IO on your drive and see very little read activity, and the usage doesn't exceed your ARC size.
    A SLOG for your ZIL will only speed up synchronous writes and cannot make them faster than your asynchronous writes. Using sync=”always” just forces everything including normal asynchronous SMB writes to be copied to the SLOG which will slow them down but gives you better data protection during power failure.
    The fastest transfer speeds you can achieve are during that first 5 s burst of transferring a large single file asynchronously. During this you are directly writing to RAM (~450 MB/s in your case). *You likely need to make changes to your network adapter or client machine to improve this.*
    Instead of buying a new expensive SSD with power-loss protection for a SLOG, I spent the time to create new datasets optimised for each specific use case, which included tuning snapshots, compression and file properties along with allocation sizes and encryption, and saw much better performance after that - saturating 10 GbE regularly now.
    If you need a write cache for something like a high speed ingest server (> 1 GB/s), ZFS was not designed with this in mind, you will have to create a separate pool of SSDs or switch to a different file system however write caches (and especially systems with both a write and read cache) are difficult to implement and have large overheads.
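    The dataset tuning mentioned above is all ordinary zfs properties; a rough example for a large-file video share (names and values are illustrative, not a one-size-fits-all recommendation):
      zfs create tank/video
      zfs set recordsize=1M tank/video     # large sequential files
      zfs set compression=lz4 tank/video   # cheap and usually a net win
      zfs set atime=off tank/video         # skip access-time updates on every read
      zfs set xattr=sa tank/video          # Linux: keep xattrs in the dnode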

    • @klyoku
      @klyoku 3 years ago +3

      this is the correct answer

    • @bambinone
      @bambinone 3 years ago +2

      This is the best explanation I've read in the comments so far.

  • @ThePeperich
    @ThePeperich 3 years ago +15

    If you'd like to increase ZFS speed, collaborate with Wendell the Great from Level1Techs :-) - that would be an awesome video (series?) btw.
    Keep up the good work

  • @cinemaipswich4636
    @cinemaipswich4636 2 years ago

    Just add as much RAM as you can. It is faster than any drive. Cache is just overflow from your RAM. The cache disks mentioned are add-ons to old systems that lack lots of ram.

  • @SaroG
    @SaroG 3 years ago +12

    Sigh; seems like everyone gets this wrong (Linus being the biggest offender): the ZFS Intent Log ("ZIL") is not a "write cache". Its purpose is to guarantee writes ("intents") to primary storage by keeping a separate journal. The ZIL needs to be a low-latency high-speed flash drive with a supercapacitor to make sure the journal is successfully flushed to disk. ZFS' "write cache" is always RAM as it's cheap, fast and low latency. Give the system more RAM and ZFS will happily use it.

    • @charlesturner897
      @charlesturner897 3 years ago +1

      Sigh;
      You don't need to emote while writing a comment

  • @capybarahat
    @capybarahat 3 years ago +6

    I had the misfortune of getting two of these iodrive2s without reading the fine print that they are not nvme. I also went this route of switching to truenas scale and installing unofficial drivers. What a pain. Would have gotten some intel nvme drives if I knew better.

  • @praecorloth
    @praecorloth 3 years ago +15

    I'm a bit confused by the beginning of this video. You talk about discussing two different types of caching, but the first type you talk about *is* the second type that you talk about. The SLOG is all about making sure that your data is available after a critical failure. It's not a write cache.
    The ZIL is a section of hard drive on every data device in your pool. Writes come in, if they are synchronous writes, they are written both to memory (in a transaction group, TXG), and the ZIL on the disks at the same time. Then, when it comes time to write the TXG to disk (typically after 5 seconds, or a certain amount of data, whichever comes first), the data is read from memory and written to disk. All data. Nothing is read from the ZIL, unless the system is recovering from a critical failure. Synchronous writes send a confirmation back to the application that the data is happily written and safe. In the case of ZFS, this happens when the data is finished being written to the ZIL, not when it is finished being written to its permanent resting place on the disk.
    Whew. That was a big block of text. Adding a SLOG. How does that change things? Well, the SLOG takes the place of the ZIL. So you've saved some IOPS from hitting your spinning rust. Which is a really good thing, and can increase performance. But not like people typically expect a "write cache" to do. Because **the ZIL and SLOG are not write caches**. So while things can speed up because you've taken some IOPS off of your spinning rust, you're still ultimately limited by the speed of your spinning rust. Your spinning rust just has more IOPS to throw at writing the data now.
    Then why did your system slow down when you added a SLOG?
    Good question. Dollars to donuts says it's because you set sync=always. This kills write performance. There are two types of writes that come in to your system. Synchronous, and asynchronous. Asynchronous writes happen, as far as your application is concerned, as fast as your system can write the data to a TXG in memory. You are not going to make that noticeably faster unless you're moving from some exceptionally slow memory to some exceptionally fast memory.
    Setting sync=always kills those asynchronous writes, and treats all of them as synchronous writes. So now everything is slogging through your SLOG (pun intended). While your SLOG is faster than your spinning disk, it is a goddamned turtle compared to your system memory.
    If you set sync back to default, you should see a fair bump in performance as you get your asynchronous writes back, but it's going to be a far cry from a saturated 10Gb line that you might be expecting.
    If you want even better performance, ditch the RAIDZ-2. Parity RAID is dead. Long live striped mirrors.
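    For comparison, a striped-mirror pool is just several two-way mirror vdevs in one pool; a sketch with placeholder disks:
      # Three mirrored pairs striped together: far more write IOPS than a single
      # RAIDZ2 vdev, at the cost of 50% usable capacity.
      zpool create tank \
        mirror /dev/disk/by-id/DISK1 /dev/disk/by-id/DISK2 \
        mirror /dev/disk/by-id/DISK3 /dev/disk/by-id/DISK4 \
        mirror /dev/disk/by-id/DISK5 /dev/disk/by-id/DISK6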

    • @comp20B
      @comp20B 2 years ago +1

      I've used striped mirrors for years in Windows Server (raid cards). And also in truenas for a few years.
      You have me rethinking my intent to use z2 with SSDs in a new truenas build.
      I could go all mirrors, but is the performance downgrade really that extreme?

  • @williamcleek4922
    @williamcleek4922 10 months ago

    Use ZFS direct IO (custom build, bypasses ARC overhead for writes) and swap out any RAIDZ with something like Xinnor RAID plus ZFS (even core loading, super efficient). Zoom zoom on high bw, low latency devices. Keep the ARC big - as big as the data set as possible, or use Optane for L2ARC if RAM constrained (set L2ARC parameters appropriately - prefetch, headroom). See Xinnor's ZFS tuning page for reference.

  • @kdb424
    @kdb424 3 years ago +3

    Have you looked into special vdevs? I use them on all of my high-performance pools; they massively help out with small files and move metadata off of the spinning disks, which increases IOPS and returns metadata as fast as the SSD.
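    Roughly what that looks like (pool, dataset and device names are placeholders; the special vdev must be redundant, because losing it loses the pool):
      # Mirrored special vdev for metadata (and optionally small blocks)
      zpool add tank special mirror /dev/disk/by-id/SSD_A /dev/disk/by-id/SSD_B
      # Optionally send blocks of 64K or smaller for this dataset to the special vdev
      zfs set special_small_blocks=64K tank/projects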

  • @alexxx4434
    @alexxx4434 3 years ago +5

    Doesn't the author confuse the roles of CACHE and LOG vdev in his explanation?

    • @bambinone
      @bambinone 3 years ago +4

      Unfortunately, yes. He starts by explaining what a SLOG device is and then says "so instead we're going to use an SLOG device," so it's a little confusing.

    • @shammyh
      @shammyh 3 years ago +2

      Yes, yes he does.

    • @alexxx4434
      @alexxx4434 3 years ago

      @@bambinone Yeah, and AFAIK Intent Log isn't supposed to increase write speed, it's for safety in case of power failure.

    • @bambinone
      @bambinone 3 years ago

      @@alexxx4434 It can greatly improve sync write latency. I'm not sure about sync write throughput.

    • @SaroG
      @SaroG 3 years ago +2

      Indeed, like most people who make videos on YouTube regarding ZFS (re: Linus), they misunderstand the purpose of a ZIL as being a "write cache" drive.

  • @voodoovinny7125
    @voodoovinny7125 2 years ago

    It has been a year and I am now wondering if there are any notes that deserve an update with new results on this topic.

  • @DjMrMors
    @DjMrMors 2 years ago

    That was another good video. I was thinking of adding some kind of drives to my TrueNAS SCALE home server to increase speed, but I don't know exactly which drives or how to do it properly. My goal is to enable 10G networking for transfers. Any advice?

  • @Clobercow1
    @Clobercow1 3 years ago +3

    Patch proxmox to turboboost? I think Wendell has a video on it.

    • @shammyh
      @shammyh 3 years ago +1

      I doubt it even needs a patch? Just the correct kernel modules loaded for the correct PM driver and/or the right kernel flags set. I had to do this dance with my Cascade Lake Xeons too. Required the BIOS power management be set correctly there, but it's definitely doable if you play around with settings a bit.

    • @levelnine123
      @levelnine123 3 years ago

      Yes, he made a video about it. Jeff was even mentioned for his super Proxmox videos 😉 Name of the forum post: Gigabyte Server Activity Corner - Proxmox, Docker and Config Notes!

  • @dimitriid
    @dimitriid 3 years ago +1

    Timestamp 7:15 will come in handy so you can quote yourself exactly at the moment you say you weren't going to upgrade your entire platform, for when you inevitably do exactly that in less than "a couple months"

  • @JGoodwin
    @JGoodwin 3 years ago +2

    @Craft Computing, You suspected that you are single threaded CPU bound. Would it be possible to try to verify that you are able to get a no-copy read/write connection to your storage? By that, I mean that when you access from a client, it makes the request to your NFS, the data is written from the storage directly into the network device without first copying it to your system memory.
    Related to your project, I'd be interested to see how you feel this works out vs something like a Truenas Mini XL+

  • @savagedk
    @savagedk 3 years ago +5

    I have to correct you at 4:20. You are speaking about write caches; this is incorrect terminology. It is a SLOG device. The contents of the SLOG are a copy of the ZIL that resides in memory. From memory it is eventually copied to the pool, and the contents are deleted from the ZIL in memory and the SLOG. At no point in time will data be read from the SLOG and copied to the pool unless there is a power outage. A SLOG device such as an RMS-200/300 will speed up sync writes, though, as the writes are acknowledged as written to disk ONLY after they have been written to the SLOG. Without a SLOG the ZIL in memory would have to flush to the pool before the sync gets an acknowledge. The SLOG however is never read and committed to disk unless there is a power outage.
    The only "cache" that ZFS supports beyond ARC is L2ARC which is not even an ARC but a ring-buffer. L2ARC Pulls data from ARC, the ARC does not PUSH it to the L2ARC.
    You can add special devices for small files and metadata though.
    A few misconceptions many make about ZFS :)
    If you like your SLOG device, buy NVRAM for that, such as an RMS-200, which can be found on eBay if lucky. Your SSDs will be eaten up in no time. Seriously, SSDs will be destroyed.
    Also, make sure to use the correct ashift on those SSDs when adding them to the pool. Some Samsung SSDs require ashift=13; most HDDs require 12. If it's 9, ouch! Time to recreate the pool!
    For L2ARC you can use l2arc_mfuonly=1; this stores only MFU objects on the cache drives. Just restart your VMs like 10 times to give them a high MFU score /s :D
    PS: upon resuming video I realized writing all of this was for naught, well most of it anyway! Good whiskey :D
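    The L2ARC knobs mentioned here are ZFS module parameters on Linux; the values below are examples, not recommendations:
      # Feed L2ARC from the MFU list only (the mfuonly setting mentioned above)
      echo 1 > /sys/module/zfs/parameters/l2arc_mfuonly
      # Raise the per-interval L2ARC fill rate (the default is conservative)
      echo 67108864 > /sys/module/zfs/parameters/l2arc_write_max
      # Make the settings persistent across reboots
      echo "options zfs l2arc_mfuonly=1 l2arc_write_max=67108864" >> /etc/modprobe.d/zfs.conf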

  • @marcin_karwinski
    @marcin_karwinski 3 years ago +1

    And here the Qnap "hero" devices are running NAS platform on 4c/8t 2GHz CPUs and most users are content with their speeds ;)

  • @sachinmahar
    @sachinmahar 3 years ago

    Hi Jeff, how are you? I have a TrueNAS storage box; I changed computers and it shows as offline. What is the solution?

  • @6LordMortus9
    @6LordMortus9 3 years ago

    How did you know?? This is the first video I've watched after ordering the pint glass :)

  • @SPXLabs
    @SPXLabs 3 years ago

    That's a good looking set!

  • @TedPhillips
    @TedPhillips 3 years ago +1

    Hot pixel in the 4k face cam shots

  • @satchguitar84
    @satchguitar84 1 year ago

    man that dead pixel got me good

  • @BrunoVera
    @BrunoVera 3 years ago

    thanks for the info! :D

  • @Adam-di3mn
    @Adam-di3mn 3 years ago +2

    watching the beer glass go down throughout the video is something I have never noticed between cuts before lmao

  • @ultraderek
    @ultraderek 3 years ago +1

    I have 10 gbe and 512GB nvme ssds and can’t get up to 10gb transfer rates. I’ll be watching this to see if there are any pointers.

  • @NicolaiSyvertsen
    @NicolaiSyvertsen 3 years ago +1

    You should look into storage tiering if you need to cache sequential IO. Regular caches are designed to speed up random IO and some burst sequential IO. But there aren't a lot of easy to use tiering systems and most software don't support it at all. Wendell had to roll his own when he built a NAS for Gamers Nexus.

  • @lawrenbw
    @lawrenbw 2 years ago

    @Craft Computing: any idea why I'm getting an error when trying to run "make dkms"? I'm getting "line 41: Need an operator", and on Line 42 and 43. Thanks.

    • @TheBoroer2
      @TheBoroer2 2 years ago

      Hey! Try again, there was a typo in the makefile; the fix is in the git repo. But I'm now having another issue... the driver is installed just fine and I can format and mount and do stuff within the shell, but TrueNAS SCALE (release candidate) doesn't display the fio drives in the GUI at all... smh.

  • @jaylanmcmurtry5317
    @jaylanmcmurtry5317 3 years ago

    I have not a clue about like half of the stuff you are talking about. It seems like something I may have to know later, so I'll just watch and pay attention to the best of my ability, and if I ever do have to know this, it could be a reference.

  • @ThineHolyBacon
    @ThineHolyBacon 3 years ago

    I've tested some configs on my 8x4TB pools. SMB puts the compute strain on the TrueNAS box/VM. iSCSI will put it on the VM/machine using the LUN off TrueNAS. Not been able to test NFS yet

  • @OLDMANDOM42.Dominic
    @OLDMANDOM42.Dominic 3 years ago

    Good thing I swallowed my drink, as I would have spit it on the monitor!! HAHA!! He said "Swaggy!" LMAO!

  • @aemonblackfyre4159
    @aemonblackfyre4159 3 years ago +1

    how many drives do you have? I only have 4 8TB drives and i get between 200 and 400MB/s. I'm planning on upgrading to 8 Drives in a Z2 Volume and I hope I get some better performance out of those.

  • @randomgaminginfullhd7347
    @randomgaminginfullhd7347 3 years ago +3

    Great video. Do you have any formal official qualifications such as a degree in IT/CS or certifications? You seem to be knowledgeable.

    • @CraftComputing
      @CraftComputing 3 years ago +10

      "Officially" I'm certified to deploy and manage Windows Server 2008 R2 systems, and Windows Deployment Services for Windows Vista.
      I also spent 13+ years in IT, most of that time in systems administration and management.

    • @randomgaminginfullhd7347
      @randomgaminginfullhd7347 3 years ago

      @@CraftComputing sounds great. do you have a university degree

    • @shammyh
      @shammyh 3 years ago +1

      He's very knowledgeable. But not about ZFS!

  • @919stephenp
    @919stephenp 3 years ago

    makes me wonder if you could get a NetApp flash cache card to work the same way

  • @berndeckenfels
    @berndeckenfels 3 years ago

    Why sync=always, did you check if that is not slowing it down? Especially if the slog is not redundant.

  • @walllin7930
    @walllin7930 3 years ago +1

    Haha, I ran into the same problem. In the end I added more RAM and disabled sync, and read/write speeds are very fast now. Four 4TB drives in RAIDZ; read/write is roughly 400 MB/s.

  • @BrandonLooi-rs5eg
    @BrandonLooi-rs5eg 7 months ago

    is that info correct proxmox not turboboosting??

  • @DovahDoVolom
    @DovahDoVolom 3 years ago

    Oh man... your camera must have dead pixels, because I saw white spots and thought it was my monitor. Had a mini heart attack until I switched tabs.

    • @DovahDoVolom
      @DovahDoVolom 3 years ago

      For reference, I saw the white pixels once around your nose area and another by your left shoulder (right shoulder from the watcher's point of view).

    • @CraftComputing
      @CraftComputing 3 years ago +1

      ....I saw it after editing. And dammit. I just hope it's not too expensive.

  • @piexil
    @piexil 3 years ago +1

    A better use for these drives would be a special vdev for storing metadata and low block size files

  • @stephenp4440
    @stephenp4440 3 years ago

    A few questions that might make good videos: Is the single threaded performance problem caused by the SMB driver? Should we be using NFS or another file system on TrueNAS because it is better multithreaded? Could you implement a TrueNAS write cache with a storage pool of RAID SSD?
    I bought 12 identical SSDs for the purpose of testing a TrueNAS write cache and they're installed. But I've got a low rent replication strategy. In the short term, I just use it as a SSD storage device for editing and then it moves the files to disk storage for backup and eventually deletes them from the SSD.

    • @shammyh
      @shammyh 3 years ago +1

      CPU performance is not the bottleneck for this test. And only some parts of SMB sharing are single threaded. Everything that's actually ZFS related (and mentioned in this video) is actually multi-threaded.

  • @PatchedUpGaming
    @PatchedUpGaming 3 years ago +1

    You said "as always I'm Jeff". Now I really wonder what the comments would be if you introduced yourself as Mike one time as a joke. Maybe on April 1 2022.

  • @wishusknight3009
    @wishusknight3009 2 years ago

    Using Debian core Truenas and not the BSD core? You got my hopes up here. :(

  • @VTOLfreak
    @VTOLfreak 3 years ago

    A SLOG can make your write latency go down - on the condition that you don't have more writes to do than can fit into a single TXG and you have enough throughput to flush the TXG away to the pool vdevs before the next TXG has filled up. So you are correct that it will not do anything for throughput. But it can help tremendously for applications that do small writes and are impacted by write latency. Database log files, for example.
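    Both of those limits are tunable module parameters; the values shown here are just examples:
      # Seconds a TXG may accumulate before it must be synced out (default 5)
      cat /sys/module/zfs/parameters/zfs_txg_timeout
      # Cap on dirty, not-yet-synced data; writes throttle as this fills
      cat /sys/module/zfs/parameters/zfs_dirty_data_max
      echo "options zfs zfs_dirty_data_max=8589934592" >> /etc/modprobe.d/zfs.conf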

  • @dalefrancis2247
    @dalefrancis2247 3 years ago

    I have an EPYC 7551P running Proxmox, and according to nmon it is boosting on 16 of my 32 cores while all 32 cores are at 100% load.
    The only non-vanilla mod I have running is loading the MSR kernel module: "modprobe msr"

  • @mycosys
    @mycosys 3 years ago

    The 'broadcasting from Doom3D studios on Mars' look is interesting

  • @fiferox
    @fiferox 3 years ago

    Really important question for the video: can I have the model and brand of your refrigerator? I really like it.

  • @jbonn5365
    @jbonn5365 3 years ago +9

    Random question... Your file transfer speeds are hitting the CPU so hard and relying so heavily on single-threaded performance - is this just an issue of not being able to use multiple threads, or is this an EPYC issue (by this I just mean TrueNAS's support for EPYC CPUs)? My assumption is it's just the inability to use multiple threads, which I suppose is something the devs could address in future releases. At least I hope. I've been thinking about going with 10Gb for my Proxmox cluster and PBS setup... especially since 10Gb hardware is becoming so homelab friendly on the wallet.

    • @handlealreadytaken
      @handlealreadytaken 3 years ago

      I find that most transfers are single thread bound.

    • @KayJay01
      @KayJay01 3 years ago

      You're always going to be single thread bound in Windows without SMB direct.

    • @shammyh
      @shammyh 3 years ago +3

      No. That Epyc CPU has more than enough kick to drive SMB at 10gbe, but his underlying disk cluster may not. Especially older spinning rust.
      Also, SMB is not exactly single-threaded. The handler for the SMB traffic is threaded-per-share-active, more or less, but the underlying ZFS operations (compression, hashing, etc) are not single threaded.
      There's definitely some mix of poor configuration and/or slow disks going on here. Even a few atom cores are enough to give 10gbe performance, so Epyc Milan cores are definitely fast enough.

    • @shammyh
      @shammyh 3 years ago

      @@KayJay01 No.

  • @Felix-ve9hs
    @Felix-ve9hs 3 years ago

    AFAIK putting the ZIL on a separate SLOG won't speed up anything other than NFS Sync writes?

    • @berndeckenfels
      @berndeckenfels 3 years ago

      Yes, it only speeds up sync writes (not only NFS; ESXi and iSCSI use them too). But with sync=always it will also write-amplify all async writes; that's why it might get even slower than without a SLOG.

  • @AidenPryde3025
    @AidenPryde3025 3 years ago

    This is cool and all, but why aren't you just striping a couple of SATA SSDs? Or slow NVMe? Those will also meet or exceed the 10GBe speeds you want.

    • @bambinone
      @bambinone 3 years ago

      The goal is a stable, reliable fileserver with scalable capacity and bulletproof data integrity, AND fast reads and writes.

    • @AidenPryde3025
      @AidenPryde3025 3 years ago

      @@bambinone Yes, but there are cheap enterprise SSDs that don't require all these extra steps.

    • @bambinone
      @bambinone 3 years ago

      @@AidenPryde3025 Oh, for the SLOG device? Agreed. But it's always fun to tinker with older and/or unique gadgets.

  • @justanotherhuman-d6l
    @justanotherhuman-d6l 3 years ago

    What, if any, performance differences are there between bare-metal TrueNAS and hypervised TrueNAS? In addition, what benefits does running it in a VM have over bare metal?

    • @CraftComputing
      @CraftComputing 3 years ago +1

      Performance loss on PCIe devices is less than 2% in a VM on any modern platform. And less than 3% impact on multithreaded performance for CPUs. Results would be nearly identical on bare metal (which they were when I ran these disks on bare metal).

    • @justanotherhuman-d6l
      @justanotherhuman-d6l 3 years ago

      @@CraftComputing thanks, man!

  • @LindustriesOutdoors
    @LindustriesOutdoors 3 years ago

    So if you set up a cache as well as a log on the same pool in truenas would you get the benefits of both? Fast sustained writes but with the data integrity of the cache drive still in case of a power outage or if the Log drive fails? (with a fast enough CPU to support the write speeds of course)

  • @sybreeder86
    @sybreeder86 3 years ago

    Fortunately I've set my 3.2TB FusionIO up as a regular pool for VMs. I use 2 disks in a ZFS mirror anyway for now.

  • @matthewcodlin1387
    @matthewcodlin1387 3 years ago

    Can I ask what the program you are using at 6:05 is called?

  • @FlaxTheSeedOne
    @FlaxTheSeedOne 3 years ago

    You can gain a little bit of performance if you select the right CPU type in Proxmox for your CPU.
    That way the VM can use things like AES and other hardware features to accelerate certain tasks.
    Might be worth a try at least?
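    On Proxmox that is a one-line change per VM (the VM ID here is a placeholder):
      # Pass the host CPU model through so the guest sees AES-NI, AVX, etc.
      qm set 100 --cpu host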

  • @Rene-kg7pf
    @Rene-kg7pf 1 year ago

    Good video

  • @kwinzman
    @kwinzman 1 year ago

    This is a bad idea. I don't think you understood how ZFS works.
    Maybe this could work if you heavily tuned the ZFS kernel parameters, but you didn't go into that at all.

  • @jrucker2004
    @jrucker2004 3 years ago

    Could you run a similar type of test with adding cache disks to proxmox? I have a pretty fast spinning disk array on my proxmox server, but I've always wondered if a cache disk (or two... or 4) would help with usability on my windows 10 VM.

  • @nevoyu
    @nevoyu 3 years ago

    Interesting that Proxmox doesn't allow the CPU to boost. I wonder if that's a Proxmox-specific issue or if I can replicate that with KVM.

    • @shammyh
      @shammyh 3 years ago +3

      Of course Proxmox allows the CPU to boost. You just need to make sure it's enabled correctly, just as with any distro.
      I'm kinda surprised Jeff hasn't figured that out yet tbh. I know he's more of a Windows sysadmin than a Linux guy, but still, it's not that complicated to make sure power states and turbo are configured correctly...

  • @whatevah666
    @whatevah666 3 years ago

    Interesting. So if possible, see how much CPU you need to get the full 10Gbit (or close to it) with another CPU. Also, try changing the MTU? :)

  • @MichaelSmith-fg8xh
    @MichaelSmith-fg8xh 3 years ago

    Compression is on but your compression ratio is low (it's not doing anything apart from CPU heat)… what happens with compression off? What happens with multiple simultaneous file transfers? What about other protocols (FileZilla transferring lots of parallel files over FTP for a test)? What do you get with an iperf test to TrueNAS?
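    Ruling the network in or out first is cheap; an iperf3 run between the client and the TrueNAS VM takes the pool out of the picture entirely (the hostname is a placeholder):
      # On the TrueNAS side
      iperf3 -s
      # On the client: one stream, then four parallel streams
      iperf3 -c truenas.local -t 30
      iperf3 -c truenas.local -t 30 -P 4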

    • @bambinone
      @bambinone 3 years ago

      lz4 is so lightweight on modern CPUs that turning it off rarely makes any performance difference whatsoever.

  • @tomashk
    @tomashk 3 years ago

    Beginning mentions that it can't be used as cache in Windows. You can try Primocache - maybe it can help in some scenarios as you can define L1 (RAM) and L2 (fast drive) cache.

  • @jimbo-dev
    @jimbo-dev 3 years ago +6

    The cpu is used for data compression, could tuning the level of compression help 🤔 I think there was a way to do that, not sure though

    • @jimbo-dev
      @jimbo-dev 3 years ago

      I did a bit more research and it looks like NFS could support multithreading. I should just probably switch to that

  • @nectarinetangerineorange
    @nectarinetangerineorange 1 year ago

    If you had used compression=zstd your reads/writes would be encoded with all threads available to the system (i.e. however many cores the VM has access to).
    Read/write speeds should increase at a nearly geometric rate relative to the thread count (not exact, but close-ish).
    Way faster than single-core encoding...
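    Switching a dataset over is a single property change; only newly written blocks use the new algorithm (the dataset names are placeholders):
      zfs set compression=zstd tank/share        # default level (zstd-3)
      zfs set compression=zstd-9 tank/archive    # heavier compression for cold data
      zfs get compressratio tank/share           # see what it actually achieves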

  • @malventano
    @malventano 3 years ago

    At your lower clocks, you are likely running up against the max throughput that is achievable with a single-vdev RAIDZ2.

  • @JasonLeaman
    @JasonLeaman 3 years ago

    The Fusion IO drives work in esxi too.

  • @savagedk
    @savagedk 3 years ago +1

    6:20 What happened is, you ran out of ZIL space in memory and you were waiting for the ZIL to flush to rust so that Windows could get an ACK and send more data. You can't put more data in the SLOG than you have ZIL space in memory. It does not matter that you have a 10TB SLOG if you have 128GB of RAM. What good would it do anyway? The SLOG is never read... There IS NO WRITE CACHE IN ZFS!

  • @denvera1g1
    @denvera1g1 3 years ago

    I have a Ryzen 5 PRO 4650G, 128GB of ECC, two 118GB Optane drives for cache, and 18x 2TB Samsung 870 SSDs.
    Why do I have a cache drive? Because I migrated from HDDs to SSDs and didn't realise that having only 2 Optane drives would hurt my performance, or maybe it's the CPU overhead.
    In Windows, these 18 drives hit over 7GB/s (which would more than saturate my 40G QSFP+ port), but in TrueNAS, writes seem limited to 20-25Gbps. I'm not 100% sure whether this is a limitation of the different HBA, overhead involved in caching, or simply the overhead of using fast PCIe storage.

    • @bambinone
      @bambinone 3 years ago

      ZFS has known issues when the underlying disks reach a certain higher level of performance. You would choose ZFS for stability and long-term data integrity in that case, not for speed. How did you arrange the SSDs in your pool? What ashift and recordsize(s)?

    • @denvera1g1
      @denvera1g1 3 years ago

      @@bambinone I'm at work and can't remember the recordsize, but IIRC ashift was 9, or maybe 12. I just remember looking at a few Reddit posts with similar drives.
      In either case I think the record size was 4096, but I'm not sure - maybe 8K.

    • @denvera1g1
      @denvera1g1 3 years ago

      @@bambinone Also, I'm not really interested in speed; I'm looking for reliability, resistance to shock (this is a portable mATX server) and low-latency reads.

    • @bambinone
      @bambinone 3 years ago

      @@denvera1g1 Hopefully you used ashift=13. One of the many problems with Samsung drives is that they present as 512n, not even 512e like many other brands, so ZFS will autodetect ashift=9. I believe Samsung's NAND flash actually uses a 8192-byte page size, so ashift=13 is appropriate. Lower than ashift=13 and you were probably hitting serious write amplification.
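      Checking what a pool actually got, and forcing it at creation time, looks roughly like this (pool and device names are placeholders); ashift is fixed per vdev, so changing it means rebuilding:
        # Inspect the ashift each vdev was created with
        zdb -C tank | grep ashift
        # Force 8K alignment when creating a new pool or adding vdevs
        zpool create -o ashift=13 flash mirror /dev/disk/by-id/SSD_A /dev/disk/by-id/SSD_B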

    • @denvera1g1
      @denvera1g1 3 years ago

      @@bambinone I'll have to double-check when I get home, and if it's not 13, I'll try it and see how it performs.
      Thank you

  • @KadiusFTW
    @KadiusFTW 3 years ago

    Currently budgeting out a Raspberry Pi CM4 blade server with 16 CM4s in 1U, with 4TB of NVMe (16x 256GB SSDs), and all 8GB "Lite" CM4s. Budget is ~3600, including PoE+ networking, a UPS, and a rack.

    • @jbasshead
      @jbasshead 3 years ago

      That sounds like a beast for K3/8s clustering. What're you looking to run on it?

  • @Raymond6494
    @Raymond6494 3 years ago

    good info thanks

  • @jordanmccallum1234
    @jordanmccallum1234 3 years ago

    no no no no, this is not how zfs caching behaves. If you want faster writes, you need more RAM + more spindles, as data is only stored in an slog for the time it takes to make a transaction group out of it, then is immediately sent to the main vdevs.

  • @DigitsUK
    @DigitsUK 3 years ago

    As far as I understand it, your SLOG is only going to soak up about 5s worth of writes before they have to be committed to disk, so all that space isn't going to do you any good. Plus, as others have said, RAID-Z2 is not about performance; I'm seeing 400 MB/s sustained writes on a 6-disk pool made up of 3x 2-way mirrors, with an Optane 900P split to be both the SLOG and L2ARC. You can probably use your Fusion-IO drives to speed up read caching, but you would need to tune some L2ARC parameters to increase the default limits on the max write rate etc. Oh, and the CPU is just a 2.1GHz Xeon with 8 cores; turbo boost does work in Proxmox, and on my EPYC boxes too - try setting the governors to performance on the Proxmox hosts with cpufrequtils, and confirm with turbostat.
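    Checking and setting that on a Proxmox host looks roughly like this (the package name is the Debian one; adjust for your distro):
      apt install linux-cpupower
      cpupower frequency-info                 # current governor and boost state
      cpupower frequency-set -g performance   # switch all cores to the performance governor
      turbostat sleep 10                      # sample actual core clocks for 10 seconds while a workload runs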

  • @Bloodred16ro
    @Bloodred16ro 3 years ago

    Is ZFS really this slow and CPU-heavy? My home server which is a repurposed quad core Haswell PC with a bunch of WD 3TB drives in md RAID6 does 300-500MB/s and the initial burst is 1GB/s. I don't have any SSD caching of course, that's much too fancy for my setup. It's crazy to me that you're seeing much worse numbers on hardware that's vastly superior to what I have.

    • @morosis82
      @morosis82 3 years ago

      I think some work is being done, but zfs primary goal is stability and consistency over raw speed.

    • @shammyh
      @shammyh 3 years ago +3

      No. It's not. A quad core Atom can push 10gbe speeds. This is a poorly configured/optimized setup, and likely real culprit is the underlying vdev layout/performance.

    • @MichaelSmith-fg8xh
      @MichaelSmith-fg8xh 3 years ago

      ZFS isn't necessarily slow and heavy, but you can bring it to its knees with sub-optimal config. I'm not limited by CPU at 10Gb with a quad-core Xeon and 32GB RAM... but if I enable compression or something heavy I can bottleneck it instantly. The reason I would put that much RAM in the system would be to do online video editing, so that reads were coming from RAM (but fast flash could also be performant as a read cache at better $/GB).

  • @cowboyboots9901
    @cowboyboots9901 1 year ago

    I followed this and forgot to click "always" I was getting 12MBps transfers.

  • @mmetzger123456
    @mmetzger123456 3 years ago

    Offloading settings for the NIC?
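    If that means NIC offloads on the server side, ethtool will show and toggle them (the interface name is a placeholder):
      ethtool -k enp1s0                        # list current offload settings
      ethtool -K enp1s0 tso on gso on gro on   # enable the segmentation offloads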

  • @112Haribo
    @112Haribo 3 years ago

    2:25 a 2000mbps drive more than exceeds the speed of a 10gbps network? what??

    • @TheMrAlien
      @TheMrAlien 3 years ago

      I think the drive should actually be 2000MBps, megabytes not megabits, which would make it 16000mbps or 15.625gbps? I think?

  • @mikebroom1866
    @mikebroom1866 3 years ago +1

    Would love a pint glass... $26 is.... a lot

  • @xeress
    @xeress 3 years ago

    Would this work in unraid?

    • @jbasshead
      @jbasshead 3 years ago

      Unraid doesn't use ZFS, so this won't likely be helpful. IIRC they use bcache and LVM for a lot of the heavy lifting, but I've only given the documentation a cursory overview. This is all focused around utilizing ZFS's built in caching/data protection to speed up write performance.

  • @morgonaut
    @morgonaut 3 years ago

    Darling, you have a dead pixel on your camera, right in the middle of your face - check it out - maybe cleaning the sensor in the camera would help.

  • @richardleaneagh4274
    @richardleaneagh4274 3 years ago

    I am lazy, just using StoreMI with my Windows gaming machine, and I have a backup directory on my drive.

  • @laloajuria4678
    @laloajuria4678 3 years ago

    I don't know what this was...... I'm still here.

  • @BramVelthuis
    @BramVelthuis 3 years ago +1

    Was looking forward to this one. Would love a Home Assistant in a home lab tutorial series! Great content as always.

  • @joeyjojojr.shabadoo915
    @joeyjojojr.shabadoo915 3 years ago

    Suddenly I don't feel like the poor kid any longer with the measly performance of my LGA1366 HW SAS2 RAID5 setup and it's paltry 250-260MB/s transfers across a single 4 lane SAS2 connection to a 12 drive backplane consisting of 2 separate SATA arrays. I simply figured that trying to reach anything more than the saturation of 2.5Gbit networking was money and time that I wasn't willing to spend or invest either way. Chalk one up for the peasants.

    • @joeyjojojr.shabadoo915
      @joeyjojojr.shabadoo915 3 years ago

      I should also add that even though I have an LSI Cachecade/FastPath HW key plugged into my 2108 based 4i4e card, I don't have a cache drive installed or even assigned. I do however see burst speeds of up to 500MB/s for the first few GB of an 8-10GB file (settling to 250-260MB/s for the 2nd half) of local transfer on the box from a standalone 'ingest' SSD. Transfer time in Windows says it's correct and not placebo. Not sure if this is Fastpath helping me, but either way, I am satisfied.

    • @bambinone
      @bambinone 3 years ago

      ZFS is more concerned with stability and long-term data integrity, but it would most likely perform about the same for your use case.

    • @joeyjojojr.shabadoo915
      @joeyjojojr.shabadoo915 3 years ago

      @@bambinone I considered a point/click NAS but unfortunately I am too old to learn new tricks and Windows is as far as it goes for me. My use case is a low traffic Plex home server and I don't hold anything that can't be replaced.

  • @K0gashuk0
    @K0gashuk0 1 year ago

    This is the problem with TrueNAS. I keep hearing everyone crying about how good it is, but it can not even do simple tiered data storage. Windows Server has done it for longer than that. I set up hybrid storage between SSD and HDD six years ago and it is meeting these transfer speeds that everyone keeps saying are impossible on Windows Server. No they aren't; you are just completely self-absorbed with yourself and Linux. I went to high school with people like that. Only one of them did anything useful and he is still an idiot. Not saying you per se are these things, but there is a crowd that pushes that Windows is evil, which it is, but at the same time the solutions out there suck. It is 2023; there should be a way for storage data to cool.

  • @beauslim
    @beauslim 3 years ago +2

    ZFS sometimes seems like one of the Dark Arts, and conflicting information out there in guides definitely doesn't help. My understanding is that turning sync on (the default) means that the file write doesn't return until everything is written to rust. Turning it off gives you much faster writes at the cost of potential data loss in the event of power failure. Try testing with sync off, and then decide if any increases are worth the risk.
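    That test is non-destructive and quick to run (the dataset name is a placeholder):
      zfs get sync tank/share            # note the current value
      zfs set sync=disabled tank/share   # re-run the transfer and compare
      zfs inherit sync tank/share        # put it back to the inherited default afterwards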

    • @CraftComputing
      @CraftComputing 3 years ago +1

      I will give that a shot. Thanks!

    • @shammyh
      @shammyh 3 years ago +2

      Correct. Mostly. But definitely he should test with sync off and see what perf he gets.

    • @bambinone
      @bambinone 3 years ago +2

      Correct, if there's no SLOG device present. If there's an SLOG device present, sync writes are committed there and the syscall returns immediately. A good test would have been sync=always with and without a SLOG device, but as others have already written, sync=always is an antipattern.

  • @rodhester2166
    @rodhester2166 3 years ago

    cheers

  • @webserververse5749
    @webserververse5749 3 years ago

    Should just use hardware RAID cards that support cache drive setup. Relying on any OS to do this is a fight you will never win

  • @magoostus
    @magoostus 3 years ago

    IMO even if the caching worked I still wouldn't trust it... I think I'd rather build a massive SSD stripe array and then have that snapshot to an HDD RAID every hour or something.

  • @GrandpasPlace
    @GrandpasPlace 3 years ago

    Well, if TrueNAS is converting to Debian I may just have to give it a try again. I moved off TrueNAS because it didn't support my 40Gb/s network connections.

  • @mranthony1886
    @mranthony1886 3 years ago

    I miss Sun Microsystems Solaris

  • @MrArrakis9
    @MrArrakis9 3 years ago

    use an intel 800p 128gb drive. if optane cant do it nothing can.

  • @GabrielFoote
    @GabrielFoote 3 years ago

    Here to feed the algorithm

  • @frankkesel7252
    @frankkesel7252 3 years ago

    At least test it on a higher-clocked CPU to confirm the theory.