Yes, I recall using ReFS in the past on a single HDD (I think it was a Windows Server 2012 R2 installation configured to behave like a client, with the desktop experience and the Acceleration Level DWORDs to enable 3D acceleration), and then I tried installing GTA V off Steam on it. It went horribly, with Steam simply going crazy, believing the game was corrupt and restarting the download in an infinite loop (the OS was of course on a different disk, formatted as NTFS, but I had created a Steam library on the ReFS HDD).
@Proton ThioJoe explains it pretty well. I suggest using ReFS only if you're going to use its exclusive features, since NTFS generally performs better.
Yep, it truly adds to the credibility of the rest of what he's telling us. To give a very simple idea of "sparse" files: you can see it as a specific implementation of the "block deduplication" feature. A block of zeroes will occur quite often in virtual machine file system images. So a block of zeroes (or whatever specific content they think of to fully use this feature) will occur multiple times, and a lot of data blocks can point to this same block. This makes creating and copying the file (on the same file system) very fast while using a lot less space on the storage. Virtual machine images often already used this concept of sparse files. But I don't know if it was supported as such on NTFS or whether it was an application-specific implementation.
So NTFS already has sparse files, which means files only take as much space as is actually written to them (reading the other areas of the allocated space just returns zeros). But once written, you can't undo that, so if you want to clear a region, you have to actually write zeros all over it. Sparse VDL allows you to essentially make those areas sparse again, in a sense (letting you zero regions of a file without actually writing zeros), making it much faster.
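A tiny sketch of the sparse-file behaviour described above (Python; the file name is made up). Seeking past the end before writing leaves a "hole" that consumes no disk blocks on sparse-capable file systems like NTFS or ext4, and reading from the hole just returns zeros:

```python
import os
import tempfile

# Create a sparse file by seeking past the end before writing.
# On sparse-capable file systems the skipped range uses no disk blocks,
# and reading it back just returns zeros.
path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
with open(path, "wb") as f:
    f.seek(10 * 1024 * 1024)      # leave a 10 MiB hole
    f.write(b"end")               # only these 3 bytes are actually stored

st = os.stat(path)
logical = st.st_size              # 10 MiB + 3: the size the file claims
on_disk = st.st_blocks * 512      # far smaller on a sparse-capable FS

with open(path, "rb") as f:
    hole = f.read(4096)           # reading inside the hole returns zeros
```

On a file system without sparse support the hole is physically zero-filled instead, so the reads behave the same but the space savings disappear.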
@@gblargg Close, but unlike sparse files, that space can't be reused. Sparse files are only truly allocated on disk as they are written to (I can create a 100 TB sparse file on a 1 TB drive and write 200 GB; the size of the file on disk is about 200 GB). With Sparse VDL, however, those allocated regions remain allocated, but you don't need to write zeros to them (you just need to update metadata). Instead, the system will return zeros when the region is read (like with sparse files if I read beyond the 200 GB), because ReFS tracks which regions of the file have valid data (hence VDL). For SSDs, the TRIM information is important, as otherwise the drive would have no idea that certain sectors contain no valid data, and it would copy them around or keep them allocated in different cells (causing write amplification). While you could just write zeroes (and some controllers would consider those sectors reusable), that makes deletion slower (you have to go out and zero everything), it's an implementation detail SSDs didn't have to implement (after all, file systems don't zero out files during delete), and it wouldn't work with transparent disk encryption, which sits under the file system. So TRIM is genuinely useful here (but TRIM for shingled hard drives is just terrible; shingled hard drives shouldn't exist, imo).
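The "make it sparse again without writing zeros" operation has a direct Linux analog, for comparison: fallocate(2) with FALLOC_FL_PUNCH_HOLE deallocates a region by updating metadata only, and the hole then reads back as zeros. A rough sketch (Linux-only, via ctypes; the file name is made up):

```python
import ctypes
import ctypes.util
import os
import tempfile

# Linux analog of "zero a region without writing zeros":
# fallocate(2) with FALLOC_FL_PUNCH_HOLE just updates metadata,
# and the punched range reads back as zeros afterwards.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                           ctypes.c_longlong, ctypes.c_longlong]

path = os.path.join(tempfile.mkdtemp(), "blob.bin")
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
os.write(fd, b"\xff" * 8192)               # 8 KiB of real data

# Punch a hole over the first 4 KiB instead of overwriting it with zeros.
if libc.fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 0, 4096) != 0:
    raise OSError(ctypes.get_errno(), "hole punching not supported here")

first = os.pread(fd, 4096, 0)              # reads back as zeros
second = os.pread(fd, 4096, 4096)          # untouched data
os.close(fd)
```

Whether the freed blocks become TRIM-able is then up to the file system and the device underneath, as the thread above discusses.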
@@ckingpro Ahhh, looking at the Microsoft docs on VDL, you are correct, it's just a way of avoiding having to zero all the data on disk. Sounds like it basically marks sections of the file as zeroed and returns zeros when they're read, to avoid long wait times when creating huge zeroed files. I could find very little about VDL relating to ReFS. One piece from an MS engineer was "More on Maintaining Valid Data Length".
@@brkbtjunkie Aren't sparse bundles exclusive to macOS? Regardless, they are not the same as Sparse VDL. Think of them like the dynamic disk images that Parallels, VMware, or VirtualBox use, which grow as they get used (and can often be shrunk when offline too, just like a sparsebundle). However, while you could just create a sparse file or use the VM disk formats (where the mapping of empty space is part of the format itself for dynamic VMs), Apple chose a small-files-in-a-folder approach. The reason is that sparse files may not be supported on the remote file system behind a network share, and some network file sharing systems would send the whole file, so having one big file like a VM dynamic disk would not work either. That left many small files in a folder. That said, SMB can send only part of a file (I have tested this), and I am not sure if AFS did as well. Unlike Sparse VDL, you can shrink sparse bundles and dynamic disk images to fit the allocated size after you delete and zero out the contents.
Yeah, don't use ReFS unless you want your data held hostage by Microsoft. Unfortunately, there is currently no file system other than FAT32 or exFAT that gives you universal access from all major platforms. NTFS is at least a viable compromise.
Yeah, pretty much you can only format drives as FAT32, exFAT, or NTFS on Windows. On Linux, though, you can format drives with its default file systems, the ones we've mentioned, and pretty much any other file system. From what I've noticed, FAT file systems (mostly FAT32) are mostly used for boot partitions, while NTFS and exFAT are normally used for data drives.
@@pyp2205 The big question is what happens when you nuke your ReFS-based Windows install and want to use a Linux rescue stick to recover data. Good luck, as there is exactly one commercial file system driver for ReFS on Linux, which you can't even license. Oh, and your C: drive can't be exFAT either. Only NTFS or ReFS.
Linux has a pretty good NTFS implementation. But they had to black box reverse engineer it. Change a file on Windows, look at a tool that says what changed on the disk, note it down, rinse and repeat until you figure it out.
I'm still hoping that Microsoft will adopt OpenZFS (of which they are a member, if I'm not mistaken). ZFS does everything that ReFS does. And more. And better.
Indeed, I use ZFS on Windows via a kludgy workaround. I have an Ubuntu VM that runs in the background with Samba and ZFS, with the data showing up in Windows as an SMB network share. It works on my laptop (with all the nice features like resiliency, using two virtual disks on two different drives) and is set to launch on startup, but it is a kludgy solution.
@@brodriguez11000 BTRFS is still such a buggy mess with many data corruption bugs that can cause you to lose data. I was hopeful about it but after more than a decade, I have given up on it. It’s always going to remain a buggy mess
@@ckingpro Something I found out by accident is that Hypervisor virtual machines remember their running state, surviving a shutdown and reboot of the host machine. I was surprised to find my guest OS happily idling in the background one day. It had been weeks since I last used it!
ReFS is primarily intended for enterprise environments. It is especially worthwhile on hypervisors for storing VM disks, as well as for backup repositories. Block cloning allows you to create synthetic full backups, or to merge incremental backups without rewriting the data, which makes it incredibly performant compared to other file systems. ReFS is also used as the basis for Storage Spaces Direct and Azure Stack HCI, where storage arrays are distributed across multiple servers.
And even there, ReFS is... not quite as performant as one might think in comparison to other file systems such as XFS. Synthetic full backups of the same backup job in Veeam being many hours quicker on XFS than on ReFS is a common occurrence, even when XFS is run on a backup repository with inferior hardware.
ReFS sounds very similar to ZFS on Linux, *BSD, etc. Except ZFS does support native compression and encryption, is bootable, and can be used with any distribution which supports it (for free!). I use ZFS encrypted root partition on my laptop and it works great.
I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called "Linux", and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system: the whole system is basically GNU with Linux added, or GNU/Linux. All the so-called "Linux" distributions are really distributions of GNU/Linux.
@@henryglends I did not read most of your long comment. However, I wonder how many people now want to try out Linux after all of your criticisms of me for doing the horrible crime of calling it "Linux". Nice way to turn people off from something. If we fail to abide by your harsh restrictions, then we get severely criticized. I guess I am missing your point...maybe you want to turn people away.
You can use ReFS on a single drive in Enterprise; it's in the format dialog. But if you do that, you put your data at risk: if there is a catastrophic failure and the partition metadata gets corrupted, there is currently no free way to recover your data, and chkdsk won't save you. I lost an entire 2 TB drive during a power outage and eventually had to give up on the (non-critical) data and wipe it.
I would never recommend using it until we can trust it, and we won't be able to trust it until the open-source community reverse engineers it. I would be screaming at people NOT to use it every chance I get. Microsoft should put some effort into adopting tried and true systems that are being developed in the open, rather than trying to play the Apple game and hold people hostage over their data.
This is why you should have backups of your important data. Basically assume that *every* storage device that you use is shit and has a 1% chance of just dying the next time you want to use it.
@@Mobin92 Yup, I have two backup levels of all critical data. This was a drive with VMs used to spawn new systems, too unwieldy to backup with big frequent changes. I was able to restore the most important ones by retro-cloning live systems and cleaning the images.
Another warning for enthusiasts! ReFS has different versions that are not backwards compatible. Sometimes when you upgrade a version of Windows or mount an array on a newer version of Windows, the version of ReFS on your volume will be automatically updated without any warning. You will NOT be able to use this volume with an earlier version of Windows, even if the volume itself was created by it. Do not use ReFS if it is possible that you will be moving the volume between systems.
Pretty sure that has also been the case with NTFS. Not much of an issue these days, since NTFS is mature and Microsoft hasn't really added new things to it for a while.
@@Doso777 It was, and yes, it was a problem, but the last revision was released 20 years ago when XP came out. It has a bigger problem with how ACLs and metadata work, which is why it is less-than-stellar on a removable drive.
"ReFS has different versions" Yeah, not to mention that Storage Spaces itself also has different versions across desktop and server OSes! Once I created a Storage Space and pools inside of it (using the latest pool version available) on a Workstation, and when I put it into a Server 2019 machine, Windows did not even see the Storage Space on the drives! So, beware.
Additionally, old ReFS 1.0 partitions on Server 2012 (R2) will shit themselves if you install this year's security updates, and read as RAW until you uninstall the update.
9:57 On Windows 11 I was able to format a single drive to ReFS, so I don't think there is such a limitation. You can also format external HDDs to ReFS because Windows sees them as non-removable disks, but you really can't format pen drives and micro SD cards to ReFS.
Well, placing it on a single drive defeats the whole notion. People should be warned against ReFS; they should be pointed towards OpenZFS, BTRFS, or Ceph.
I thought perhaps the mention of RAID (Redundant Array of Independent Disks, and their levels/variations) might have helped to explain things in this particular subject area of file systems. Again, just a thought. Love your channel, Joe!
@@amak1131 You can think of it as software RAID done right. ReFS appeared when more and more servers were going soft-RAID at the chipset driver level, and MS was like "wtf are you doing, guys?"
5:05 - If I understand correctly, ReFS is able to _present_ allocated, but never written to, clusters as containing zeroes. So, when you create, say, a 200 GB file that will contain a VM disk volume, it will be allocated, but not overwritten. However, if the VM reads a "virgin" cluster, the FS will return all zeroes, not whichever leftover of previous content was actually there, thus dramatically speeding up creation of huge empty files without compromising security.
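For comparison, a rough Linux analog of that behaviour (my analogy, not ReFS itself): posix_fallocate reserves real disk space instantly, and on ext4/XFS the extents are marked "unwritten", so reads return zeros without anything ever having been written there. A small sketch (file name is made up):

```python
import os
import tempfile

# Reserve 16 MiB for a "VM disk" without writing it. On ext4/XFS the
# blocks are allocated but flagged unwritten, so reads come back as
# zeros rather than whatever was on the platter before.
path = os.path.join(tempfile.mkdtemp(), "vm.img")
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
os.posix_fallocate(fd, 0, 16 * 1024 * 1024)   # near-instant allocation

data = os.pread(fd, 4096, 0)     # never written, still reads as zeros
size = os.fstat(fd).st_size      # full 16 MiB, reserved up front
os.close(fd)
```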
What you are referring to as Copy-On-Write refers to RAM, where memory pages (a page is just a unit of memory; on x86 and x86_64 it is 4 KiB) can be shared between processes until one process tries to write to a shared page, in which case the system copies it for that process instead. File system Copy-On-Write is different. Let me give you a brief overview of how file systems remain consistent (corruption-free).

In the old days, file systems had basically no checks. If the power went out mid-write, the file system had no way of telling what went wrong. So you would run chkdsk or fsck (the Linux/macOS/Unix equivalent), and it would have to check every single thing to look for corruption to fix. This could take days.

Then came journaling. Typically, journaling is only enabled for file system metadata (though some file systems, like Linux's ext3 and ext4, do allow you to enable data journaling too; but since your data then has to be written twice, expect abysmal write performance). This means the file system first writes what it is going to do/change to a journal (not your data, but e.g. that it is going to rename something or allocate space) and then performs it. If the power goes out and comes back, the file system first checks the journal and finishes the operation. No more lengthy checks. However, data consistency is still not ensured.

To ensure that, we have copy-on-write. In copy-on-write, you don't write the data in place, but into free space, before updating all references to point to the new space, then updating references to that reference, and so on up to the superblock (think of it as the main block of the file system; typically there are multiple superblocks, so when all are updated, the operation is done). In this case, if the power goes out, you can be sure either the entire write went through or none of it did, ensuring consistency. All of this is done without writing data twice.
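A toy sketch of that copy-on-write idea (pure illustration, not how ReFS is actually implemented): write the new version into fresh space, then atomically swap the reference, so a crash leaves either the complete old state or the complete new one:

```python
import os
import tempfile

def cow_update(path, new_data):
    """Never overwrite in place: write a new copy, then swap the reference."""
    tmp = path + ".new"
    with open(tmp, "wb") as f:
        f.write(new_data)
        f.flush()
        os.fsync(f.fileno())   # the new copy must be durable before the swap
    os.replace(tmp, path)      # atomic "pointer" update; old blocks now unreferenced

path = os.path.join(tempfile.mkdtemp(), "file.bin")
cow_update(path, b"version 1")
cow_update(path, b"version 2")  # a crash mid-update would leave "version 1" intact

with open(path, "rb") as f:
    current = f.read()
```

The `os.replace` here plays the role of the superblock update: readers see either the old tree or the new one, never a half-written mix.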
Now, when it was first introduced, copy-on-write was meant for enterprises or businesses (or anywhere data integrity matters), with file systems like ZFS, since it increases fragmentation (think about it: a write within a file has to be placed somewhere else where there is free space, rather than in place). But now that we have SSDs, that does not matter much.
Haven't Microsoft realised they can't design a decent filesystem yet? There's many existing solutions that are far better than anything they could come up with. They should have just used one of those.
To add to this video: ReFS gives you the possibility to recover from hardware failures, NOT from software failures (because the data on all 3 disks will be corrupted the same way). It also does not protect you from malicious hackers encrypting your data; all copies will be encrypted (only a backup will help you at that moment). And that is where a snapshot is valuable: take a snapshot and keep it as a backup, but NOT directly connected to your computer, to keep it safe from hackers. (Better yet, also keep it stored in another location; helpful when one of your locations gets destroyed, maybe by fire.) Years ago I told my brother-in-law some of these tips about having a safe backup. It saved him many thousands of dollars, because he could refuse the hackers' offer of the decryption key (it cost him one day of work restoring from his safe backup).
We are finally moving some of our main storage to ReFS at my workplace. Our use case is backup storage and virtual machines and from all of our reading, it's going to save us a lot of time for many large, write-heavy operations!
9:40 Yes, that's a very good way to describe the swap file ('overflow for memory'). In general, trying to explain this kind of stuff to the layman is pretty hard, so good job!
Important: you can't use ReFS across different Windows Server editions. ReFS has versions, so if you plan to attach a disk to another system (like for recovery), it must be the same version.
Great video, you made this really easy to understand. Just to add some information: when Joe explained Copy-On-Write, the main point was missed. COW isn't just about multiple file locations (that point to the same actual location); it's how some of the new "hip" file systems make changes to data: by always writing all data to new blocks on the disk. This makes data more resilient to errors, and also gives you snapshots of the data. This is why it has file-level snapshots: because it's built into how the file system works. Windows has also done file snapshots on NTFS since Windows Vista, but that works by manually making snapshot copies of data, whereas COW file systems (like ReFS) do this natively. Functionally they work the same for you, because Windows makes it so, but internally ReFS will do this faster, because it doesn't have to manually make a new copy; it's part of how the file system normally works. This also makes the file system better at freeing up the space used by snapshot copies, because it naturally overwrites the oldest data when space is needed.
For the features you mention as not being in ReFS: for booting, that's not actually a ReFS limitation, but because most modern UEFI boot systems simply do not have a driver for it. So instead you have to give it a driver on the EFI partition, same as for NTFS on a lot of systems. Now, the Windows installer does that for you with NTFS, but it won't do it for ReFS. This is because MS does not consider ReFS to be ready for this yet. They have published a roadmap for ReFS in three stages, and ReFS is currently in stage 2. MS is not going to install the drivers to the UEFI prior to this, but there are some third-party methods you can use if you decide you really, really, really want to... For file-system-level compression, this is not entirely true. You see, compression on ReFS is tightly integrated with deduplication, exactly because they sort of need to be for optimal usage of either. So you enable compression by enabling deduplication. Encryption, however, is not available, again due to its negative impact on deduplication. As for the page file, it's not that you technically couldn't, but first of all, if you're using ReFS through Storage Spaces, you're not allowed to place the pagefile on a Storage Spaces volume. You can however do ReFS on a single drive (even though you say you can't; you absolutely do not need to use Storage Spaces for it), in which case you could put a pagefile on it. BUT the Windows GUI will not allow you to do so. And this has to do with Copy on Write, which, I might add, you explained incorrectly. Copy on Write means that if you have a file opened, you make a change in it and save that file, then it will write out the full block that the changes were made in to a completely new, separate block, then move the file reference over.
This does have the effect that if two programs open the same file and one writes to it, the other program will still be reading the old data, because it opened the old reference and has not been told to reload the reference, which now points to new data. CoW file systems are incredibly good for storage that rarely changes. They are however incredibly inefficient and slow for data that changes rapidly, such as a pagefile. Hence why Windows won't allow you in the GUI to use it that way. You can force it, but you will have a very VERY bad time, even if you're nowhere near running out of RAM, simply because a pagefile is NOT just "overflow RAM" (that isn't how page or swap files have worked for over 20 years now). Anyway... Next you bring up "not for removable drives". This is again sort of true but also sort of not. First of all, there's nothing stopping you from adding a removable drive to a Storage Spaces pool and having the pool formatted ReFS. Secondly, the reason it's normally not shown is that ReFS does not work well with the quick removal that is the default on modern Windows. If you instead enable caching and optimize for performance on a removable drive, you can then, using PowerShell, force it to format ReFS (the GUI still won't let you). Be warned, though, that this drive will not work in systems that are not configured the same, and you will likely corrupt it simply by plugging it into such a system. As for mirror-accelerated parity: since, as you admit, you didn't understand it, I'll try to explain. Parity calculations are slow, but space-efficient. What it does is write a data block to two of the drives, creating a mirrored set of that block. Now, this is of course inefficient in terms of storage capacity. It will however report it as if it had been written with parity rather than as a mirrored set.
It will then, either when it has some idle time or when it's starting to run out of real space, do the parity calculations and rewrite that block as a parity block. It's very good when you have infrequent writes that you want to complete fast. It's bad if you have constant writing, as it actually has to write the data twice.
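A toy simulation of that write-then-rotate idea (all names made up; real ReFS tiering is far more involved): writes land as cheap mirrored copies, and an idle-time pass later rewrites pairs of blocks as XOR-parity stripes:

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class MirrorAcceleratedParity:
    def __init__(self):
        self.mirror = []   # blocks stored twice: fast writes, 2x space
        self.parity = []   # (block_a, block_b, parity) stripes: slower, less space

    def write(self, block):
        self.mirror.append((block, block))   # cheap mirrored write, no math

    def rotate(self):
        # Idle-time pass: convert mirrored pairs into parity stripes.
        while len(self.mirror) >= 2:
            (a, _), (b, _) = self.mirror.pop(0), self.mirror.pop(0)
            self.parity.append((a, b, xor(a, b)))

tier = MirrorAcceleratedParity()
tier.write(b"\x01\x02")
tier.write(b"\x10\x20")
tier.rotate()                 # both blocks now live as one parity stripe
a, b, p = tier.parity[0]
```

The point of the sketch is the trade-off: each block is written twice overall (once mirrored, once as part of a parity stripe), but the slow parity math happens off the write path.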
If you like to experiment with your computer, don't use ReFS; I tell you this from personal experience. If you go to a higher version of Windows (Insider Preview in my case, even Release Preview, which is the slowest ring of the Insider Program) and then downgrade, you'll no longer be able to use or access it until you format, or go back to the same version or newer than you had.
ReFS is pretty useful for some server applications. For example block cloning helps to save a lot of space and processing time in backup repositories. The space savings can also be huge on things like VDI (Virtual Desktop Infrastructure).
About parity: as said, it does affect performance (at least write-wise). I don't know about ReFS specifically, but it should improve read performance for large files, since the data can be read from at least 3 drives together (if parity is mixed across drives, e.g. block 1 is stored on drives A & B with parity on C, block 2 is stored on drives B & C with parity on A, etc.).
Actually, you CAN run ReFS on a single drive not in a pool; I have done so on both Win10 Enterprise and Hyper-V Server Core 2019. Not certain how I did it or if it was as intended by Microsoft, but I believe I set it up using Windows Admin Center.
You can, but you should NEVER do so until you can trust it, and you can't trust it until it is open source or properly reverse engineered by the open-source community. We should be pointing people towards OpenZFS, BTRFS, or Ceph.
@@Mikesco3 This is the kind of elitism we don't need. Yes, Btrfs and Ceph are by far superior, but they're not an option on Windows, so they're out of the equation. And compared to NTFS, ReFS is better in some use cases.
It absolutely can be used on a single drive. Just use /fs:refs with the format command in command prompt. Also, scrubbing can only be done on pools with redundancy. On non-redundant ReFS, it'll simply fail when reading a corrupted piece of data, which is still preferable to not knowing that you just read invalid data and get corruption or a crash.
So I wonder if this is a salvaging of WinFS from Longhorn/Vista. If you remember, that was supposed to be more of a full journaling system; they ended up shelving it around 2005/2006.
WinFS was supposed to be like a relational database built into a filesystem. That was one of the three major technologies that were promised for Longhorn/Vista, all of which were abandoned before release.
Your titles are so attractive, man, but the video length, for explaining a concept that could have taken half the time, really doesn't let me click. I clicked this one just to convey this comment.
I believe the reason ReFS failed so hard is that it doesn't compete with open source file systems like ZFS or Btrfs. Most enterprise solutions are virtualizing Windows on a Linux-based hypervisor anyway. It's honestly very rare to find a Windows Server instance on bare metal in a datacenter. Not that it's impossible to find for very specific use cases, but rare nonetheless. Btrfs and ZFS do everything ReFS does, but WAY better. Storage Spaces just doesn't compete in performance, flexibility, or management.
5:40 If you make a VM, it reserves a set amount of disk (if you pick full reservation, not "grow as you go"). When you make saves of the VM, it will save all that "reserved" space. If you use compression on that, which is standard to avoid 20 GB backups/images, every bit that is not zeroed causes issues for compression. When creating images of VMs, it is good practice to zero out all the free disk space; it can turn a 3.7 GB image/backup into a 2.1 GB one, just because the compression algorithm doesn't see random strings in the free space. This FS does that to any unused but still reserved space, so any VM images or operations on the whole thing, such as backups/restores/loads, will work with much smaller images, and that speeds up a lot of things.
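You can see the effect described above with any compressor (a quick sketch, nothing ReFS-specific): zeroed "free space" compresses to almost nothing, while leftover junk barely shrinks at all.

```python
import os
import zlib

# Zeroed free space compresses to almost nothing; leftover junk in
# unused space barely shrinks, which is why un-zeroed VM images make
# much bigger backups.
zeros = bytes(1024 * 1024)           # 1 MiB of zeros (a wiped free block)
junk = os.urandom(1024 * 1024)       # 1 MiB of leftover "random" data

zeros_compressed = len(zlib.compress(zeros))   # on the order of a kilobyte
junk_compressed = len(zlib.compress(junk))     # roughly 1 MiB, barely smaller
```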
You don't have a "parity drive"; people often say this to make visualization easier, but the parity bits exist on all the drives, striped across with the actual data bits. Also, the calculation behind parity isn't nearly as complex as you might think; it's just XOR. This is very easy to see in a 3-drive RAID 5 example: take 8 bits (to make it easy to write out) and think of each 4-bit chunk as what gets striped to a single drive. For each bit position you have 2 data bits; XOR those and you get your parity bit. If you lose either of the data bits, you just XOR the remaining one with your parity and you get the missing piece of data back. This gets more complex after 3 drives, but the basic concepts are the same.
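The XOR trick above in a few lines (illustrative values only):

```python
def xor_bytes(a, b):
    # XOR two equal-length byte strings, bit by bit.
    return bytes(x ^ y for x, y in zip(a, b))

drive_a = b"\x0f\x55"                     # data stripe on drive A
drive_b = b"\xf0\xaa"                     # data stripe on drive B
parity = xor_bytes(drive_a, drive_b)      # parity stripe (on a third drive)

# Drive A dies: rebuild its stripe from B and the parity.
rebuilt_a = xor_bytes(drive_b, parity)
```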
Technically true for actual parity (e.g. RAID 5 on any number of drives). But for RAID 6 (double-parity)-type systems, the "parity" is not actually parity, but something more complex. I think they use either Galois fields or some sort of Reed-Solomon code (not very familiar). That said: as far as I understand, the main performance issue with parity RAID is not computational cost, but rather fragmentation due to having to treat every write as its own stripe to compute parity. The "mirror-accelerated parity" feature sounds like it mitigates precisely that problem by computing the parity asynchronously, likely after a larger amount of written data has been accumulated. I believe Bcachefs uses the same technique for its parity RAID support.
@@fat_pigeon You basically just went into slightly more depth on what I already said: "This gets more complex after 3 drives but the basic concepts are the same". Reed-Solomon is commonly used and allows for much more complex striping and parity variability in systems like Ceph, but at the end of the day it is all somewhat based on the same ideas. They just get more complex and build on each other more and more until it gets a little too hard to explain without breaking out math proofs. I have done a little work with Ceph and the Reed-Solomon algorithm, but I wouldn't attempt to break it down any further on something like YouTube (plus I am far from an expert on the minute details).
Most of this stuff was around in ODS-5 on OpenVMS 20+ years ago, although OpenVMS needs updating in terms of storage capacity now (it's planned; they have just been busy the past few years porting the OS itself to x86). There's a lot of very good file systems out there; ZFS would be my pick, although there are technically better ones.
Dude, I remember when I was like 8, I was talking to my dad about the batteries on ethernet cables and didn't listen to him, even though at the time he would have had 30-ish years of experience with computers. Nice to see you're making "real" content now.
Just a small correction: at least on 10 Enterprise you CAN format a single partition as ReFS directly from File Explorer, no need to use the Storage Spaces thing! (I guess the same is true for 11 Enterprise, but I don't have one at hand to test.) Don't know how the Pro for Workstations versions behave, as I never used those!
Fun fact: ReFS was in an earlier version of Windows 10, but MS decided to remove the feature and reserve it for higher editions of Windows. Also, Storage Pools with parity are so, soooo slow that it's not worth it. I set up a TrueNAS VM in Hyper-V on my Windows 10 PC that boots up automatically when my PC starts. I then added my five 4 TB drives directly to it and use ZFS, which is the superior file system. There is a beta out there to bring ZFS to Windows, but it's far from prime-time ready. If MS ever fixes parity write speed (27 MB/s on five 4 TB drives on a Ryzen 1700X), then I might consider switching back, but Storage Pools with parity have been awful since 2012, so I have very low hopes of that ever happening.
I was going to say the same. ReFS in Storage Spaces in any configuration other than 'Mirror' is slow. TrueNAS in a VM is what I'm doing too. Got a zvol presented as an iSCSI LUN to my PC. Works great, but would be very hard for non-nerds to set up.
@@Rn-pp9et Funny that you commented this very thing, as I was thinking about changing it from a mapped network share to iSCSI a few days ago. Some programs can't access a mapped network drive. How complicated is it to set up iSCSI in TrueNAS?
I used to have a Windows "NAS" which used ReFS on Server 2012 back in ~2012-2013. It was an array with 8 x 2 TB disks. I had so many issues with early Storage Spaces and ReFS. I ended up building a new RAID6 array on an LSI MegaRAID, migrated the data, and never looked back. I think of ReFS not so much as a replacement file system, but really as a specialist file system like ZFS. I'm surprised that ReFS is really still in development... I think the biggest benefit, and the reason it was originally built, was for Hyper-V, and as we know, on-premises Hyper-V got EOL'd with Hyper-V Server 2019. I wouldn't be surprised if ReFS got canned as well; I can't imagine there's much use for it when there are much better solutions like NetApp, StoreServ, 3PAR, hell, even ZFS solutions like TrueNAS Enterprise.
My biggest issue with ReFS is that there are no data recovery tools for the file system, unlike FAT32, NTFS, and so on. If the partition simply becomes too full, the data becomes inaccessible and irretrievable. This also holds true if the partition becomes corrupt for whatever reason (it sometimes can't correct for all errors automatically and fails in a non-graceful manner, preventing data recovery). That's a hard pass for me!
It's basically the modern version of spanned volumes on dynamic disks. It's also the reason why Microsoft stopped supporting dynamic disks & spanned volumes when they introduced ReFS.
ReFS shines in enterprise backup systems. The Data Deduplication feature saves a ton of space. I have a few drive arrays ranging from 60 TB to over 100 TB, and NTFS works the best. I have lost data using a ReFS setup via a Storage Pool where the metadata was missing on boot. I only use it now for my backup server. Would not recommend regular users to use this.
The parity system is actually pretty simple, and just exploits a neat property of the XOR operator. If you have three sets of binary data of equal length, A, B, and C, and we set C = A XOR B, then we can recover whichever one fails from the other two:
A = B XOR C
B = A XOR C
C = A XOR B
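For anyone who wants to see that XOR property in action, here's a tiny Python sketch (the two-byte "disks" are made-up example values):

```python
# Sketch of XOR parity recovery: C = A ^ B lets you rebuild
# whichever of the three "disks" is lost from the other two.
a = bytes([0b10110100, 0b01100011])   # data disk A (example values)
b = bytes([0b11010001, 0b00001111])   # data disk B (example values)

# Compute the parity "disk": C = A XOR B, byte by byte.
c = bytes(x ^ y for x, y in zip(a, b))

# Simulate losing disk A and recovering it: A = B XOR C.
recovered_a = bytes(x ^ y for x, y in zip(b, c))
assert recovered_a == a

# The same trick recovers B (or C) from the other two.
recovered_b = bytes(x ^ y for x, y in zip(a, c))
assert recovered_b == b
```

The same idea scales to whole sectors: real RAID-5-style parity just XORs across much longer byte runs.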
Parity is often easily done in hardware with dedicated logical circuitry... taking the load off the CPU. - Ben Eater has made a great video on error detection and parity checking... It is really a simple, ingenious and well established concept.
Look into bit rot; that's why there are projects like OpenZFS, Btrfs or Ceph. Hardware RAID is known for the potential to introduce silent corruption and/or lock people into proprietary solutions that become a problem once the manufacturer doesn't want to support that version of the hardware. CPU time is not as expensive as it was way back.
@@Mikesco3 I was referring to what parity checking is and how it functions. It has been around for ages, long before RAID was even a concept. I remember reading about it as a teen back in the '80s. Parity checking isn't going anywhere or being developed any further. It is not like with compression algorithms, where the latest one is able to compress data even harder than its predecessor. Parity checking is what it is: the more parity bits you add to your circuit, the larger the portions of potentially corrupted data that can be reconstructed/recovered. And yes, it can be emulated in software, as a lot of circuits can today. But it is a way of building an error-correction circuit with bitwise logic gate ICs (or in logic arrays like PLAs, GALs, or FPGAs, etc.) completely without the need to wait for a CPU or even an MPU to finish running any code. The parity data is ready the very instant the transmission has been received. This operates at the "bare metal level", as we old-school computer nerds used to say (even though "bare silicon level" would probably have made more sense)... drivers or software are much, much further up the "food chain"... alongside such phenomena as compatibility issues. Done correctly, the OS doesn't even need to know that it exists. - Sadly YT won't let me post any links... but go find and check out Ben Eater's videos... you will see what I mean.
@@Zhixalom It sounds like you're talking about some kind of hardware-accelerated parity calculation? That sounds useful for things like server farms, but it doesn't solve the issue of end-to-end data integrity checking and recovery. Hard drives already have error correction via ECC data for each sector, these days 4k each. This works - for data integrity on the platter only - so long as the corruption that has occurred on-disk is not greater than what the ECC data can repair. Then we have RAID parity setups, hardware or software based, to deal with more massive damage, all the way up to whole disks dying altogether. The point Michael is trying to make is that all of these approaches, hardware or software, including the product you're speaking of if I've understood it correctly, so long as they're not integrated into the filesystem itself, won't be able to detect or repair damage that happens in-flight or in-memory. Data can be corrupted in memory (unless it's ECC memory, as in servers), by a faulty CPU, or while in transit either from or to storage. Only a checksumming filesystem where integrity checks are performed in-memory after retrieval, such as ZFS, btrfs or ReFS (APFS promised this, but AFAIK they still haven't fully delivered; only metadata is checksummed), can detect such corruption. As a personal example, I had a massive ZFS array running on a Linux box with SATA port multipliers that had a kernel driver bug when running in SATA-300 mode, which caused transferred data to have thousands of errors every few minutes when fully saturated. I had no idea for almost a year, until I mirrored a SMART-faulting drive for replacement and discovered that I was completely incapable of copying even a single megabyte off the drive and getting the same hash sum twice. I then started checking the other drives and found that each and every one of the 15 drives running off these port multipliers was producing error-filled data when read from raw.
I then debugged ZFS and discovered the torrent of failed reads it was experiencing - after having read blocks successfully from the drive with no CRC errors and no controller errors reported - and silently retrying until it got back what it knew was good data. I "fixed" the bug by forcing SATA-150 speeds, and ZFS performance increased massively as a result, as reads were now almost always good instead of almost always bad, and it no longer had to retry until receiving good data. Same for writes, which by default are read back and confirmed in ZFS, then rewritten if bad. Had I had a regular filesystem here, perhaps even running RAID with parity, all my data would have been destroyed. I'd have had parity to ensure that it remained destroyed in exactly the same way going forward, but no software RAID could have prevented the corruption that happened afterwards on the SATA channel, ECC on the disk neither, and neither can hardware RAID know whether the blocks sent to it were already damaged by the time the fs sent the write request.
XOR parity is an extremely simple bitwise operation anyway. Recover any disk by XORing the other two, and the runtime of the XOR will be vastly outweighed by the disk read time anyway.
I was wondering about this file system - thanks for covering it. 😄 By the way, there is something strange with the audio equalization for this video compared to your other videos. Your S's aren't coming out as crisply as before. Either that or someone stuck meat probes in my ears while I was sleeping. [Edit: The problem traced to the fact that every time there's a Windows feature update, it wipes out my equalizer settings 🤬]
ReFS is really only a neat fit for programs that are used for backing up files. The files that land on ReFS should already be compressed, deduplicated, and encrypted by the program that uses that volume. You want an error-correcting, resilient file location. Then set the block size to 256-512 to match the program, so when the program backs up each block there isn't any wasted space at the end of each physical block, saving you gigabytes to hundreds of gigabytes of space. Matching the block size will also give you better I/O speeds.
That works in the simple example, but for serious use you'd want double (or triple) "parity", which is a bit more complex. I think they use either Galois fields or some sort of Reed-Solomon code (not very familiar).
Hi, how did you make a colorized prompt in your PowerShell? I mean the user and path; I know the standard color output. As for ReFS, this is the equivalent of Linux's btrfs, glusterfs, Ceph and similar :-) Best regards.
Interesting. I was able to format a single partition on my secondary SSD as ReFS. It was formatted on Windows 10 LTSC 2021. Probably not that useful on a single disk though. When I eventually reformat that SSD, I'll probably be sticking with NTFS in future. I don't really have a real reason to be using ReFS, so probably best that I don't until I need to.
Let's bottom-line it... It's NOT new, it's NOT a replacement, nor is it even a viable alternative or next-gen NTFS, and unless you have a specific need that takes advantage of its features, there's no reason to use it.
I'm no network engineer, and please correct me if I'm wrong, but when you're talking about parity bits and such, isn't that just RAID? I forget which RAID level is which (I think RAID 5 is the one with parity), but anyhow, maybe they'll roll it into NTFS at some point once they get the bugs and kinks figured out.
I used ReFS for many years (with absolute reliability and excellent performance; it survived power outages, disconnections, etc.). I used ReFS until the bloody January 11th of this year. On that date, Microsoft began to make it impossible to use in external mirrored disk enclosures (specifically the QNAP TR-004). Microsoft made the ReFS versions in Windows Server 2012, Windows Server 2019, Win10, and Win11 incompatible. The data was even inaccessible for those who couldn't wait until, a few weeks later, Microsoft partially patched up that mess. Even then, nothing was the same as before. Microsoft made us happy with ReFS on external drives, and now it has changed its mind and is putting stones in the wheels. Microsoft products have never been stable, nor have they been durable. Only their monopoly is truly lasting and truly eternal.
So it's Btrfs or ZFS for Windows? The problem with RAID 5 etc. was bit rot: if it spotted an error, it would not know which of the drives made the error, so it would just recalculate a new parity.
Sad to see shitdows hasn't switched to ext4 yet. More proprietary junk that isn't half as good. 💩 The joys of being a system administrator having to deal with proprietary junk.
@@AchmadBadra You say that when even the enterprise standard for a stable, 100%-uptime filesystem is Unix-only. ZFS is the de facto standard, and it's for Linux/BSD. Windows spyware is too inferior to support such reliable options.
@@AchmadBadra I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called "Linux", and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system: the whole system is basically GNU with Linux added, or GNU/Linux. All the so-called "Linux" distributions are really distributions of GNU/Linux.
Thanks, Joe. This type of video helps those who have maybe heard of ReFS and that it might be great, or not, to know why or why not we might want to look into it.
I think the limitations are there for a reason. In modern IT it's considered best practice to keep production data and configuration data separate. So the server would boot from an NTFS drive, and the user data, databases and VMs are stored on ReFS. If following best practices, the configuration would be stored as an Ansible playbook (or in another deployment toolchain), because restoring a machine from a snapshot is always a pain in the... Easiest is to just do an automated fresh install and connect the user data afterwards.
Using ZFS combined with sanoid/syncoid for VMs and LXCs allows near-instant rollback compared to more traditional backups. My setup automatically replicates VM zvols every hour, taking up very little space on my pool.
I think I know enough about what VDL does without having heard of it. When a filesystem allocates space for a file, it allocates blocks. When you format a drive you'll see this called "Allocation unit size". These blocks are what get allocated for every file. The default allocation unit is 4096 bytes, so a 20-megabyte file of 20,000,000 bytes needs 4883 allocation units, or blocks (4882 units would only hold 19,996,672 bytes). Because file sizes almost never land exactly on a block boundary, there is nearly always empty space at the end of a file's last block. So the file takes up 4883 blocks on the filesystem, but the actual number of bytes it contains is less than what those blocks can hold. The leftover space is filled with zeros. If I'm wrong, please correct me.
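To put rough numbers on that slack-space idea, a small Python sketch (assuming the common 4096-byte allocation unit; the helper names are made up for illustration):

```python
import math

ALLOCATION_UNIT = 4096  # bytes per cluster, a common default

def units_needed(file_size: int) -> int:
    # A file always occupies a whole number of allocation units.
    return math.ceil(file_size / ALLOCATION_UNIT)

def slack_bytes(file_size: int) -> int:
    # Unused space at the tail of the file's last allocated unit.
    return units_needed(file_size) * ALLOCATION_UNIT - file_size

size = 20_000_000                 # a "20 MB" file
print(units_needed(size))         # 4883 clusters on disk
print(slack_bytes(size))          # 768 bytes of slack in the last cluster
```

A file whose size is an exact multiple of 4096 (e.g. 4882 × 4096 bytes) would have zero slack, but real files almost never line up that neatly.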
VDL is a dream when there are many small files, smaller than the block size. Having the remainder of the block filled with zeros instead of random file fragments makes data recovery and disk-level maintenance a very clean and safe operation. ReFS is mostly intended for enterprises and data centres...
File metadata does not include the location. In all modern filesystems, the location is not a property of the file at all, but a consequence of the directory structures that reference it, the structures that reference those directories, and so on. When a file is repositioned (moved) within a filesystem, its metadata might not change or even be touched at all. On Windows, certain caveats apply because of weird legacy logic around short filenames, depending on the version.
You just can't format drives as ReFS in some versions of Windows. But you can use ReFS, e.g. under Windows 10 Pro, if the drive was formatted with it. And you can use ReFS on single (even removable) drives without any problems. For example, I formatted an external USB backup drive with it.
- 0:28 Vista was supposed to roll out WinFS, which was supposed to be a relational-database filesystem, but it never materialized, which was massively disappointing. 🤦 😕 - 4:35 NTFS supports sparse files. In addition to filesystem support, the program that's writing a file also needs to know about sparse files in order to use them. The filesystem allocates clusters/blocks to a file _as needed_ instead of all at once. One use-case scenario for sparse files is a P2P program. For example, if you want to download a 1GB file, the program can mark it as a sparse file, so when it asks the filesystem to reserve 1GB of space, the filesystem will say okay even if it doesn't actually have 1GB of free space available at that moment. Then the filesystem can allocate the clusters as the program downloads them. So if that 1GB file gets stuck at 50% because there are no sources, it will only take up 500MB of space on disk, which can buy you some time to continue downloading while you try to free up space. Of course, this also means that if the system runs out of available space when the program tries to write to a location that hasn't been allocated yet, it will get an error, but it's still helpful in low-free-space scenarios. - 6:18 Mirrored-parity is clever. As an analogy, imagine that whenever(-ish) you write to one of the two storage disks, for every byte that's written, you do some sort of reversible operation (e.g. XOR) between that byte and the corresponding byte on the other storage disk, then write the result to the parity drive. If one drive croaks (or throws bad sectors), you just perform the reverse operation (or in the case of XOR, just XOR again) on the bytes from the parity drive and the good drive and boom! you've got your data back. As Joe said, this lets you have a 100% redundancy backup while only using half the storage (1 backup for 2 drives).
👍 But as Joe also said, it requires extra work to do the calculation, and what he didn't mention is the extra time for the extra drive reads as well. But then, this is for long-term reliable data storage, not a gaming system. 🤷 - 7:45 Nope, Volume Shadow Copy has been available with NTFS at the file level since Windows Vista (well, _technically_ since XP…) - 10:27 You don't think a lack of transactions is interesting? Transactions are like ACID for databases; they guarantee all-or-nothing changes, and they're part of what makes Linux filesystems more reliable. - Quotas are also not supported, so you can't limit different users on the system to specific amounts of disk space (which is particularly bad for data centers that rent out space 🤦). - No 8.3 filenames‽‽‽ 😲 Awwwww, come on. 😕 - 10:52 They were trying to focus on WinFS because a relational-database filesystem is much more useful, but unfortunately it got stuck in development hell and was ultimately cancelled. 😕
4:20 one question, in this case where it uses a pointer to the original instead of being a true copy, what happens if the original gets deleted? Prior to deletion does it ACTUALLY copy the file to the other location?
ReFS can't be any more stupid about this than any other file system with CoW and snapshot functionality or with support for hard links: the disk space gets released when there are no more references to the data. So if a file is part of multiple snapshots, or there are multiple hard links to it, then all the snapshots or links need to be deleted for the disk space to be released.
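That reference-counting behavior is easy to see with plain hard links. A small POSIX-style Python sketch (the file names are made up for illustration):

```python
import os
import tempfile

# Sketch of reference counting with hard links: the data survives
# until the last name pointing at it is removed.
with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original.txt")
    link = os.path.join(d, "hardlink.txt")

    with open(original, "w") as f:
        f.write("still here")

    os.link(original, link)                    # a second name for the same data
    links_before = os.stat(original).st_nlink  # 2 references now

    os.remove(original)                        # drop one reference...
    with open(link) as f:                      # ...the data is still reachable
        content_after_delete = f.read()

    os.remove(link)                            # last reference gone; space released

print(links_before)            # 2
print(content_after_delete)    # still here
```

Snapshots and block clones work the same way conceptually; the space only comes back once every referrer is gone.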
I am very happy with the free and open source Btrfs and ZFS on Linux and Unix! When it comes to servers, I feel absolutely zero need to go for a proprietary and expensive solution copied by micro$oft while we have such amazing free ones. 9:55 Not sure, but may I guess why? Because then you would need to disable copy-on-write (CoW); otherwise enormous blocks of data would be rewritten on every write. :P To store huge binary files like disk images and swap files, you need to disable it for the directory where the big binary is stored. I know on Btrfs you can disable copy-on-write for a specific directory. On ReFS that - of course - probably cannot even be done. :P
If I remember correctly, when ReFS came out it was really only meant to be used for storage on Hyper-V hosts. It wasn't really meant to replace NTFS for normal workstations or VMs.
So I want to ask something. The Block Cloning feature suggests a pointer to the original file. What if the drive containing the pointer file needs to be taken away - will the file actually be stored on it? And likewise, if the drive containing the original file gets removed, will the other drive keep an actual copy, or just the pointer?
This filesystem isn't meant for removable media at all, but for an array of internal drives that are generally removed only when they fail. In the scenario you describe, if you pull one of the disks, the filesystem would treat that as a drive failure. CoW is separate (it would work on a single-drive volume); when used together with mirroring, *both* drives would have a copy of *both* the original and the modified file.
Correction: You CAN actually format a single drive with ReFS, it does not need to be part of a pool.
Yes, I recall using ReFS in the past on a single HDD (I think it was a Windows Server 2012 R2 installation configured to be like a client, with the desktop experience and the Acceleration Level DWORDs to enable 3D acceleration), and then I tried installing GTA V off Steam on it. It went horribly, with Steam simply going crazy, believing that the game was corrupt and restarting the download in an infinite loop (the OS was of course on a different disk, formatted as NTFS, but I created a Steam library on the ReFS HDD).
Correct, but the feature is only available from the command line in the 'diskpart' and 'format' commands.
@@Rn-pp9et In Windows 11, you can also format with the ReFS filesystem from the UI.
@Proton ThioJoe explains it pretty well. I suggest using ReFS only if you're going to use its exclusive features, since NTFS generally performs better.
ReFS is also bootable with version 3.7 and Windows 11
hey babe wake up a new file system just dropped
10 years ago
@@John-Smith02 wdym?
Edit: Just watched the vid and realized Lmao
So nvm
“hey what it is called”
“reeeee-FS”
“what”
“don’t worry, it dropped in 2012”
“so why is it new”
“idk”
@@joen4287 0:29 :)
@@gallium-gonzollium yeah just watched the vid and realized what he meant XD
Thanks
I appreciate your honesty in telling us that you're not familiar with other features like Sparse VDL 👍🏻
Honesty always adds to credibility, and we appreciate it
That's who Joe is and that's why we love him
I loved his humility too.
Yep, it truly adds to the credibility of the rest he's telling.
To give a very simple idea on "sparse" files.
You can see it as a specific implementation of the "block deduplication" feature.
A block with zeroes will occur quite often on virtual machine file system images. So a block with zeroes (or whatever specific content they think of to fully use this feature) will occur multiple times and a lot of data blocks can point to this same block.
This makes creating and copying the file (on the same filesystem) very fast and using a lot less space on the storage.
Virtual machine images often already used this concept of sparse files. But I don't know if it was supported as such on NTFS or whether it was an application-specific implementation.
@@TD-er Alright Thanks for the information. Highly appreciated 🙂
So NTFS already has sparse files, which means files only take as much space as is actually written to them (reading other areas of the allocated space just returns zeros). But once written, you can't undo that, so if you want to clear a region, you have to actually write zeros all over it. Sparse VDL allows you to essentially make those areas sparse again, in a sense (letting you zero regions of a file without actually writing zeros), making it much faster.
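A sparse file is easy to make yourself. A rough Python sketch (whether the hole actually skips disk blocks depends on the filesystem; on NTFS the file additionally has to be flagged sparse first, e.g. via fsutil, while most POSIX filesystems create the hole automatically):

```python
import os
import tempfile

# Sketch: seek far past the start of a file before writing, so the
# skipped range becomes a "hole". Reading the hole just returns zeros.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.seek(100 * 1024 * 1024)   # jump 100 MB past the start
        f.write(b"tail data")       # only these 9 bytes are really written

    apparent_size = os.path.getsize(path)   # logical size: 100 MB + 9 bytes

    with open(path, "rb") as f:
        hole = f.read(4096)          # read from the unwritten region
    hole_is_zeros = hole == b"\x00" * 4096
finally:
    os.remove(path)
```

The file *reports* a size of just over 100 MB, but on a sparse-capable filesystem only the written tail consumes real blocks.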
So it's basically like TRIM does at a lower level with SSDs, letting it know there's nothing important stored there so it can reuse the space.
@@gblargg Close, but unlike sparse files, that space can't be reused. Sparse files are only truly allocated on disk as they are written to (I can create a 100 TB sparse file on a 1 TB disk and write 200 GB; the size of the file on disk is about 200 GB). With Sparse VDL, however, those allocated regions remain allocated, but you don't need to write zeros to them (you just need to update metadata). Instead, the system will return zeros when reading them (like with sparse files, if I read beyond the 200 GB), because ReFS tracks which regions of the file have valid data (hence VDL, Valid Data Length). For SSDs, the TRIM information is important, as otherwise the drive would have no idea that certain sectors contain no valid data, and it would copy them around or keep them allocated in different cells (causing write amplification). While you could just write zeroes (and some controllers would consider those sectors reusable), it makes deletion slower (as you have to go out and zero everything), it is an implementation detail that SSDs didn't have to implement (after all, file systems don't zero out files during deletes), and it wouldn't work with transparent disk encryption, which sits under the file system. So TRIM is genuinely useful here (but TRIM for shingled hard drives is just terrible; shingled hard drives shouldn't exist imo).
@@ckingpro Ahhh, looking at the Microsoft docs on VDL, you are correct; it's just a way to avoid having to zero all the data on disk. Sounds like it basically marks sections of the file as zeroed and returns zeros when they're read, to avoid long wait times when creating huge zeroed files. I could find very little about VDL relating to ReFS. One post from an MS engineer was "More on Maintaining Valid Data Length".
Like a .sparsebundle in unix?
@@brkbtjunkie Aren't sparse bundles exclusive to macOS? Regardless, they are not the same as Sparse VDL. Think of them as dynamic disk images like those used by Parallels, VMware or VirtualBox, which can grow as they fill up (they can often be shrunk while offline too, just like a sparsebundle). However, while you can just create a sparse file or use the VM disk formats (where the mapping of empty space is part of the format itself for dynamic VMs), Apple chose a many-small-files-in-a-folder approach. The reason is that sparse files may not be supported on the remote file system of a file-sharing network, and some network file-sharing systems would send the whole file, so having one big file like a VM dynamic disk would not work either. That left many small files in a folder. That said, SMB can send only part of a file (I have tested this), and I am not sure whether AFS did as well. Unlike Sparse VDL, you can shrink sparse bundles and dynamic disk images to fit the allocated size after you delete and zero out the contents.
Yeah, don't use ReFS, unless you want your data held hostage by Microsoft.
Unfortunately, there is currently no other file system than FAT32 or exFAT that allows you universal access from all major platforms. NTFS is at least a viable compromise.
A fair point
Yeah, pretty much you can only format drives as FAT32, exFAT, or NTFS on Windows. But on Linux, you can format drives with its default file systems, the ones we've mentioned, and pretty much any other file system.
Pretty much from what I've noticed, FAT file systems (mostly FAT32) are mostly used for boot partitions, and NTFS and exFAT are normally used for data drives.
@@pyp2205 The big question is what happens when you nuke your ReFS-based Windows install and want to use a Linux rescue stick to recover data. Good luck, as there is exactly one commercial file system driver for ReFS on Linux, which you can't even license.
Oh, and your C: drive can't be exFAT either. Only NTFS or ReFS.
Linux has a pretty good NTFS implementation. But they had to black box reverse engineer it. Change a file on Windows, look at a tool that says what changed on the disk, note it down, rinse and repeat until you figure it out.
@@bernardo-x5n Yes, that's why I am saying that NTFS is a viable compromise.
I'm still hoping that Microsoft will adopt OpenZFS (of which they are a member, if I'm not mistaken) ZFS does everything that ReFS does. And more. And better.
Indeed, I use ZFS on Windows via a kludgy workaround. I have an Ubuntu VM that runs in the background with Samba and ZFS, with the data showing up in Windows as an SMB network share. It works on my laptop (with all the nice features like resiliency: two virtual disks on two different drives) and is set to launch on startup, but it is a kludgy solution.
A production grade filesystem in Windows would be cool.
You're locked into a pool size, and it doesn't do as well with a lot of small files (Btrfs is better).
@@brodriguez11000 BTRFS is still such a buggy mess with many data corruption bugs that can cause you to lose data. I was hopeful about it but after more than a decade, I have given up on it. It’s always going to remain a buggy mess
@@ckingpro Something I found out by accident is that Hypervisor virtual machines remember their running state, surviving a shutdown and reboot of the host machine. I was surprised to find my guest OS happily idling in the background one day. It had been weeks since I last used it!
ReFS is primarily intended for enterprise environments. It is especially worthwhile on hypervisors for storing VM disks, as well as for backup repositories. Block cloning allows you to create synthetic full backups, or to merge incremental backups without rewriting the data, which makes it incredibly performant compared to other file systems. ReFS is also used as the basis for Storage Spaces Direct and Azure Stack HCI, where storage arrays are distributed across multiple servers.
Sounds kinda like snapshotting on btrfs
And even there, ReFS is... not quite as performant as one might think in comparison to other file systems such as XFS. Synthetic full backups of the same backup job in Veeam being many hours quicker on XFS than on ReFS is a common occurrence, even when XFS is run on a backup repository with inferior hardware.
XFS does not have parity checks and isn't designed as a resilient file system.
It's just Microsoft trash, Linux is for enterprise environments, don't be a soydevs
Btw I use Arch
ReFS sounds very similar to ZFS on Linux, *BSD, etc. Except ZFS does support native compression and encryption, is bootable, and can be used with any distribution which supports it (for free!). I use ZFS encrypted root partition on my laptop and it works great.
Well as long as you agree to the CDDL
I thought the same
I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called "Linux", and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system: the whole system is basically GNU with Linux added, or GNU/ Linux. All the so-called "Linux" distributions are really distributions of GNU/Linux.
@@henryglends I did not read most of your long comment. However, I wonder how many people now want to try out Linux after all of your criticisms of me for doing the horrible crime of calling it "Linux". Nice way to turn people off from something. If we fail to abide by your harsh restrictions, then we get severely criticized. I guess I am missing your point...maybe you want to turn people away.
@@georgeh6856 it was a copypasta joke
You can use ReFS on a single drive in the enterprise editions; it's in the format dialog. But if you do that, you put your data at risk: if there is a catastrophic failure and the partition metadata gets corrupted, there is currently no free way to recover your data, and chkdsk won't save you. I lost an entire 2TB drive during a power outage and eventually had to give up on the (non-critical) data and wipe it.
On Windows 10 Pro for Workstations you can do that too, in Disk Management, while creating a new volume or formatting an existing one.
I would never recommend using it until we can trust it, and we won't be able to trust it until the open-source community reverse engineers it.
I would be screaming at people NOT to use it on every chance I get.
Microsoft should put some effort into adopting tried and true systems that are being developed in the open. Rather than try to play the apple game and hold people hostage over their data.
This is why you should have backups of your important data. Basically, assume that *every* storage device you use is shit and has a 1% chance of just dying the next time you want to use it.
@@Mobin92 Yup, I have two backup levels of all critical data. This was a drive with VMs used to spawn new systems, too unwieldy to back up with big, frequent changes. I was able to restore the most important ones by retro-cloning live systems and cleaning the images.
It's like a single-disk RAID array; why would anyone want to do that other than as an experiment? And if it's an experiment, the data must not be valuable.
Another warning for enthusiasts!
ReFS has different versions that are not backwards compatible. Sometimes when you upgrade Windows, or mount an array on a newer version of Windows, the ReFS version on your volume will be automatically updated without any warning. You will NOT be able to use this volume with an earlier version of Windows, even if the volume was originally created by it.
Do not use ReFS if there is any chance you will be moving the volume between systems.
Pretty sure that has also been the case with NTFS. Not much of an issue these days, since NTFS is mature and Microsoft hasn't really added new things for a while.
@@Doso777 It was, and yes, it was a problem, but the last revision was released 20 years ago when XP came out. It has a bigger problem with how ACLs and metadata work, which is why it is less-than-stellar for a removable drive.
"ReFS has different versions" Yeah, not to mention that Storage Spaces itself also has different versions across Desktop and Server OSes!
Once I created a Storage Space and pools inside it (using the latest pool version available) on a Workstation, and then put it into a Server 2019 machine, Windows could not even see the Storage Space on the drives! So, beware.
Additionally, old ReFS 1.0 partitions on Server 2012 (R2) will shit themselves if you install this year's security updates and read as RAW until you uninstall the update.
Holy f**k, don't I know it... NOW. And FWIW, you're an added verification, so thanks for that.
9:57 On Windows 11 I was able to format a single drive to ReFS, so I don't think there is such a limitation. You can also format external HDDs to ReFS because Windows sees them as non-removable disks, but you really can't format pendrives and micro SD cards to ReFS.
Maybe my info was outdated, I read several places that it can’t be on one drive, but they may have added that feature.
Well placing it on a single drive defeats the whole notion.
People should be warned against ReFS, they should be pointed towards OpenZFS, BTRFS or Ceph
I thought perhaps the mention of RAID (Redundant Array of Independent Disks, and their levels/variations) might have helped to explain things in this particular subject area of file systems. Again, just a thought. Love your channel, Joe!
Was going to say, it sounds a lot like RAID with some extra stuff tacked on.
@@amak1131 You can think of it as software RAID done right. ReFS appeared when more and more servers were going soft-RAID at the chipset-driver level, and MS was like "wtf are you doing, guys?"
@@amak1131 Yep, it sounds like RAID directly implemented in the file system.
right. this sounds a lot like software level raid, which has been around for decades.
"This video is sponsored by RAID SHADOW LEGENDS"
5:05 - If I understand correctly, ReFS is able to _present_ allocated, but never written to, clusters as containing zeroes. So, when you create, say, a 200 GB file that will contain a VM disk volume, it will be allocated, but not overwritten. However, if the VM reads a "virgin" cluster, the FS will return all zeroes, not whatever leftover previous content was actually there, thus dramatically speeding up creation of huge empty files without compromising security.
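To make the idea above concrete, here's a rough Python sketch of the closely related sparse-file behavior (the file name and sizes are mine, purely illustrative): seeking past the end of a file creates a "hole" that counts toward the file's logical size but reads back as zeros, much like ReFS presenting never-written clusters as zeroes.

```python
import os
import tempfile

# Create a file whose logical size is 1 MiB but which has only
# 4 bytes of real data written at the very end. On most filesystems
# the hole is stored sparsely and costs almost no disk space.
path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
with open(path, "wb") as f:
    f.seek(1024 * 1024 - 4)   # seeking past EOF creates a "hole"
    f.write(b"data")

assert os.path.getsize(path) == 1024 * 1024  # logical size is 1 MiB

# Reading a never-written region returns zeros, analogous to how
# ReFS returns zeros for clusters with no valid data.
with open(path, "rb") as f:
    head = f.read(4096)
assert head == b"\x00" * 4096
print("hole reads back as zeros")
```

Whether the hole actually saves disk space depends on the filesystem, but the read-as-zeros semantics are the same everywhere.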
I accidentally read "Mirror Accelerated Parity" as "Mirror Accelerated Party" at 5:34 and I thought "that exists?!"
Great vid btw!
Yes the mirror accelerated party exists.
It's called disco ball.
What you are referring to as Copy-On-Write refers to RAM (where memory pages (a fancy word for a unit of memory; on x86 and x86_64, it is 4 KiB) can be shared between processes until a process tries to write to a shared page, in which case the system copies it for that process instead). File system Copy-On-Write is different. Let me give you a brief overview of how file systems remain consistent (corruption-free). In the old days, file systems had very few checks. If the power went out mid-write, the file system had no way of knowing what went wrong. So, you would run chkdsk or fsck (the Linux/macOS/Unix equivalent) and it would have to check every single thing to look for any corruption to fix. This could take days. Then came journaling. Typically, journaling is only enabled for filesystem metadata (though some filesystems like Linux's ext3 and ext4 do allow you to enable data journaling; but since your data then has to be written twice, expect abysmal write performance). This means that the file system first writes what it is going to do/change to a journal (not your data, but, say, that it is going to rename or allocate space, etc.) and then performs it. If the power goes out and comes back, the file system first checks the journal and finishes the operation. This means no more lengthy checks. However, data consistency is not ensured. To ensure that, we have copy-on-write. In copy-on-write, you don't write the data in place, but into free space, before updating all references to point to the new space, then updating references to that reference, and so on up to the superblock (think of it as the main block of the file system; typically there are multiple superblocks, so when all are updated, the operation is done). In this case, if the power goes out, you can be sure either the entire write went through or none of it did, ensuring consistency. All of this is done without writing data twice.
Now, when it was first introduced, it was meant for enterprises or businesses (or anywhere data integrity matters), with file systems like ZFS, as it increases fragmentation (think about it: a write within a file has to be placed somewhere else where there is free space, rather than in place). But now that we have SSDs, that matters far less.
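The crash-safety argument in the comment above can be sketched as a toy model (names like `cow_write` are mine, not any real filesystem API): the "disk" is a dict of block-id to bytes, and a single root pointer names the live block. An update writes a NEW block first, then flips the pointer; the old block is never modified in place, so a crash before the flip leaves the old, consistent version intact.

```python
# Toy copy-on-write update: write data to free space, then
# atomically repoint the root. The old block is never touched.
disk = {0: b"old contents"}
root = 0                      # superblock-like pointer to live data

def cow_write(disk, new_data):
    new_id = max(disk) + 1    # step 1: allocate from free space
    disk[new_id] = new_data   # step 2: write the new data there
    return new_id             # step 3: caller repoints the root

new_root = cow_write(disk, b"new contents")
assert disk[root] == b"old contents"      # old version still intact
assert disk[new_root] == b"new contents"  # new version now live
```

If the "power went out" between steps 2 and 3, the root would still point at the old, consistent block — which is the whole point of COW.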
Yeah
Lots of text (I read it all)
That is NOT what copy-on-write means. COW-based file systems are not what you have described.
@@ruwn561 it is though
@@ckingpro: No, it isn't.
Sounds like a specialized FS for use with RAID arrays. Probably wouldn't make sense in general use.
That, and let's be honest: your regular user doesn't know what RAID is, let alone how to set one up. It would just be confusing to most people.
@@darksill Also there are tons of open standard RAID-optimized filesystems, if Microsoft ever felt like working with everyone else for once.
Haven't Microsoft realised they can't design a decent filesystem yet? There's many existing solutions that are far better than anything they could come up with. They should have just used one of those.
Really never heard of this. Thanks so much
To add to this video:
ReFS gives you the possibility to recover from hardware failures, NOT from software failures (because all data on all 3 disks will be corrupted the same way).
Also, it does not protect you from the effects of malicious hackers encrypting your data... all copies will be encrypted (only a backup will help you at that moment).
And that is where a snapshot will be valuable... take that snapshot and keep it as a backup, but NOT directly connected to your computer, to keep it safe from hackers. (Better yet, also keep it stored in another location... helpful when one of your locations gets destroyed, maybe by fire.)
Years ago I told my brother-in-law some of these tips... about having a safe backup.
It has saved him many thousands of dollars, because he could refuse the hackers' offer for the decryption key (it cost him one day of work restoring from his safe backup).
Right; always remember that RAID is for performance or uptime, but *RAID is not a backup*! Also remember to test your backups.
Finally a good TH-camr who doesn’t kill your brain with useless info and click bait… subscribed, shared, and thanks for the informative video!
We are finally moving some of our main storage to ReFS at my workplace. Our use case is backup storage and virtual machines and from all of our reading, it's going to save us a lot of time for many large, write-heavy operations!
Until some Windows update wipes your drives clean, as was the case with ReFS early this year.
Microsoft seems to be backing away from promoting it, even removing support for it from newer versions of some OS products.
9:40 Yes, that's a very good way to describe the swap file ('overflow for memory'). In general, trying to explain this kind of stuff to the layman is pretty hard, so good job!
Important: you can't use ReFS across different Windows Server editions. ReFS has versions, so if you plan to attack a disk to another system (like for recovery), it must be the same version.
Why would you attack it though 😅 Just kidding, I know you mean attach 😉
@@how_to_lol_u That is hilarious 😂🤣 Good one 👌🏻
Great video, you made this really easy to understand.
Just to add some information: when Joe explained Copy-On-Write, the main point was missed. COW isn't just for multiple file locations (that point to the same actual location), but is how some of the new "hip" filesystems make changes to data: by always writing to new blocks on the disk for all data. This makes data more resilient to errors, and also gives the file system built-in snapshots.
This is why it has file-level snapshots: it is built into how the file system works. Windows has also done file snapshots with NTFS since Windows Vista, but that comes from manually making snapshot copies of data, whereas COW file systems (like ReFS) do this natively. Functionally they work the same for you, because Windows makes it so, but internally ReFS will do this faster because it doesn't have to manually make a new copy; it is part of how the file system normally works. This also makes the file system better at freeing up the space from the snapshot copies, because it naturally overwrites the oldest data when space is needed.
For the features not in ReFS that you mention: for booting, that's not actually a ReFS limitation, but because most modern UEFI boot systems simply do not have a driver for it. So instead you have to give it a driver on the EFI partition, same as for NTFS on a lot of systems. Now, the Windows installer does that for you with NTFS, but it won't do that for ReFS. This is because MS does not consider ReFS to be ready for this yet. They have published a roadmap for ReFS which is in three stages, and ReFS is currently in stage 2. MS is not going to install the drivers to the UEFI prior to this, but there are some third-party methods you can use to do it if you decide you really, really, really want to... For file-system-level compression, this is not entirely true. You see, compression on ReFS is tightly integrated with deduplication, exactly because they sort of need to be for optimal usage of either. So you enable compression by enabling deduplication. Encryption, however, is not available, again due to its negative impact on deduplication. As for the page file, well, it's not that you technically couldn't, but first of all, if you're using ReFS through Storage Spaces, you're not allowed to place the pagefile on a Storage Spaces volume. You can however do ReFS on a single drive (even though you say you can't, and you absolutely do not need to use Storage Spaces for it), in which case you could put a pagefile on it. BUT the Windows GUI will not allow you to do so. And this has to do with Copy on Write, which, I might add, you explained incorrectly. Copy on Write means that if you have a file opened, make a change in it, and save that file, it will write out the full block that the changes are made in to a completely new, separate block, then move the file reference over.
This does have the effect that if two programs open the same file and one writes to it, the other program will still be reading the old data, because it opened the old reference and has not been told to reload the reference, which now points to new data. CoW filesystems are incredibly good for storage that rarely changes. They are, however, incredibly inefficient and slow for data that changes rapidly, such as a pagefile. Hence why Windows won't allow you to use it that way in the GUI. You can force it, but you will have a very, VERY bad time from it, even if you're nowhere near running out of RAM, simply because a pagefile is NOT just "overflow RAM"; that isn't how page or swap files have worked for over 20 years now. Anyway... next you bring up "not for removable drives". This is again sort of true, but also sort of not. First of all, there's nothing stopping you from adding a removable drive to a Storage Spaces pool and having the pool formatted ReFS. Secondly, the reason it's normally not shown is because ReFS does not work well with the quick removal that is the standard on modern Windows. If you instead go in and enable caching and optimizing for performance on a removable drive, you can now, using PowerShell, force it to format ReFS (the GUI still won't let you). Be warned, though, that this drive will not work in systems that are not configured the same, and you will likely corrupt it simply by plugging it into such a system.
As for the mirror-accelerated parity: since, as you admit, you didn't understand it, I'll try to explain it. Parity calculations are slow, but more space-efficient. So what it does is write a data block to two drives, creating a mirrored set of that block. Now, this is of course inefficient in terms of storage amount. It will, however, report it as if it were written with parity rather than as a mirrored set. Then, either when it has some idle time or when it's starting to run out of real space, it will do the parity calculations and rewrite that block as a parity block at a later time. It's very good for when you have infrequent writes that you want to complete fast. It's bad if you have constant writing, as it actually has to write data twice.
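The mirror-then-destage idea the comment describes can be sketched in a few lines of Python (all names here are mine, a toy model rather than Microsoft's actual implementation): writes land in a fast mirror tier as two copies with no parity math, and a later destage pass converts pairs of blocks into data-plus-XOR-parity stripes.

```python
def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

mirror_tier = []   # fast tier: each write stored twice, no math
parity_tier = []   # efficient tier: (a, b, a^b) stripes

def write(block):
    # Fast path: just duplicate the write; no parity calculation.
    mirror_tier.append((block, block))

def destage():
    # Later, during idle time: pair up blocks and compute parity,
    # freeing the mirror copies.
    while len(mirror_tier) >= 2:
        (a, _), (b, _) = mirror_tier.pop(), mirror_tier.pop()
        parity_tier.append((a, b, xor(a, b)))

write(b"AAAA")
write(b"BBBB")
destage()
a, b, p = parity_tier[0]
assert xor(a, p) == b   # the "drive" holding b can still be rebuilt
```

The trade-off in the comment falls straight out of the model: the fast path writes everything twice, so constant writing means every block costs double before it is ever destaged.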
Can anyone agree that they would not want to see who is faster at typing against this guy?
i remember when you made the parody how to videos. i thought that shit was so funny. glad to see you're still kickin it with youtube.
If you like to experiment with your computer, don't use ReFS; I tell you this from personal experience. If you go to a higher version of Windows (Insider Preview in my case, even Release Preview, which is the slowest ring of the Insider Program) and you downgrade, you'll no longer be able to use or access it until you format it or go back to the same version or newer than you had.
Thanks for the link about Sparse VDL! Boy, I'd never have found that. So hard to find information about this feature!
ReFS is pretty useful for some server applications. For example block cloning helps to save a lot of space and processing time in backup repositories. The space savings can also be huge on things like VDI (Virtual Desktop Infrastructure).
Block cloning is "deduplication" from a different perspective; what it does, however, is the same thing.
About parity: as said, it does affect performance (at least write-wise). I don't know about ReFS specifically, but it should improve read performance for large files, since the data can be read from at least 3 drives together (if parity is mixed across drives, e.g. block 1 is stored on drives A & B with parity on C, block 2 is stored on drives B & C with parity on A, etc.).
Actually, you CAN run ReFS on a single drive not in a pool, have done so on both win10 enterprise and hyper-v core 2019.
Not certain how I did it and if it was as intended by Microsoft, but I believe I set it up using Windows admin center.
You can, but you should NEVER until you can trust it, and you can't trust it until it is open source or properly reverse-engineered by the open-source community.
We should be pointing people towards OpenZFS, BTRFS or Ceph.
@@Mikesco3
This is the kind of elitism we don't need.
Yes, btrfs and ceph are by far superior, but not an option on Windows, so they're out of the equation. And compared to NTFS, it is better in some use cases.
It absolutely can be used on a single drive. Just use /fs:refs with the format command in command prompt. Also, scrubbing can only be done on pools with redundancy. On non-redundant ReFS, it'll simply fail when reading a corrupted piece of data, which is still preferable to not knowing that you just read invalid data and get corruption or a crash.
So I wonder if this is a salvaging of WinFS from Longhorn/Vista. If you remember, that was supposed to be more of a full journaling system; they ended up shelving it around 2005/2006.
No, WinFS was a metadata database store on top of NTFS. ReFS serves a different purpose.
@@zoomosis Ah I forgot. Thanks for the clarification.
WinFS was supposed to be like a relational database built into a filesystem. That was one of the three major technologies that were promised for Longhorn/Vista, all of which were abandoned before release.
Your titles are so attractive, man, but the video length, for explaining a concept that could have taken half that time, really doesn't let me click. I clicked this one just to convey this comment.
Would be interesting to compare it to file systems oft used on servers, like ext4, btrfs, and zfs.
Are there any newer file systems in Windows 11?
I believe the reason ReFS failed so hard was because it doesn't compete with open-source file systems like ZFS or btrfs. Most enterprise solutions are virtualizing Windows with a Linux-based hypervisor anyway. It's honestly very rare to find a Windows Server instance on bare metal in a datacenter. Not like it's impossible to find for very specific use cases, but rare nonetheless. Btrfs and ZFS do everything ReFS does but WAY better. Storage Spaces just doesn't compete in performance, flexibility and management.
5:40 If you make a VM, it reserves a set amount of disk (if you pick full reservation, not "grow as you go"). When you make saves of the VM, it will save all the "reserved" space. If you use compression on that, which is standard to avoid 20 GB backups/images, every bit that is not zeroed causes issues for compression.
When creating images of VMs, it is good practice to zero out the whole free disk space; it can turn a 3.7 GB image/backup into a 2.1 GB one, just because the compression algorithm doesn't see random strings in the free space. This FS does that to any unused but still reserved space, so any VM images, or operations on the whole thing such as backups/restores/loads, will work with much smaller images, and that speeds up a lot of things.
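The size difference described above is easy to reproduce with any general-purpose compressor. A quick Python illustration (the 1 MiB size is arbitrary, just for demonstration): a run of zeros compresses to almost nothing, while leftover random-looking data barely compresses at all.

```python
import os
import zlib

size = 1024 * 1024
zeros = bytes(size)            # freshly zeroed "free space"
leftovers = os.urandom(size)   # stand-in for random leftover data

# Zeros shrink by roughly a factor of 1000...
assert len(zlib.compress(zeros)) < 2048
# ...while incompressible leftovers stay essentially full size.
assert len(zlib.compress(leftovers)) > size * 0.99
```

This is exactly why zeroing free space (or having the FS report never-written space as zeros) makes VM image backups so much smaller.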
updated video on this would be nice
Btw, does Parity do RAID 5 or RAID 4?
I didn't even know there is a new file system, nice video
You don't have a "parity drive"; people often say this to make visualization easier, but parity bits exist on all the drives, striped across with the actual data bits. Also, the calculation of parity isn't nearly as complex as you might think: it is just XOR. This is very easy to see in a 3-drive RAID 5 example: if you take 8 bits (to make it easy to write out) and think of each 4-bit chunk as what is striped to a single drive, you get 2 data chunks; XOR those and you get your parity chunk. If you lose either data chunk, you just XOR the remaining one with your parity and you get your missing piece of data back. This gets more complex beyond 3 drives, but the basic concepts are the same.
Technically true for actual parity (e.g. RAID 5 on any number of drives). But for RAID 6 (double-parity)-type systems, the "parity" is not actually parity, but something more complex. I think they use either Galois fields or some sort of Reed-Solomon code (not very familiar).
That said: as far as I understand, the main performance issue with parity RAID is not computational cost, but rather fragmentation due to having to treat every write as its own stripe to compute parity. The "mirror-accelerated parity" feature sounds like it mitigates precisely that problem by computing the parity asynchronously, likely after a larger amount of written data has been accumulated. I believe Bcachefs uses the same technique for its parity RAID support.
@@fat_pigeon You basically just went in slightly more depth on what I already said: "This gets more complex after 3 drives but the basic concepts are the same". Reed-solomon is commonly used and allows for much more complex striping and parity variability for systems like ceph, but at the end of the day it is all somewhat based on the same ideas. They just get more complex and build on each other more and more until it gets a little too hard to explain without breaking out math proofs. I have done a little work with ceph and the reed-solomon algorithm but I wouldn't attempt to break it down any further on something like youtube (Plus I am far from an expert on the minute details)
Most of this stuff was around in ODS-5 on OpenVMS 20+ years ago, although OpenVMS needs updating in terms of storage capacity now (it's planned, they have just been busy the past few years porting the OS itself to x86)
There's a lot of very good file systems out there; ZFS would be my pick, although there are technically better ones.
Dude, I remember when I was like 8, I was talking to my dad about the batteries on ethernet cables and didn't listen to him, even though at the time he would have had 30-ish years of experience with computers. Nice to see you're making "real" content now.
@ThioJoe : I think that it's pronounced "Ree-F-S", not "R-E-F-S" (hence the smaller letter "e" instead of capital "E")
Just a small correction: at least on 10 Enterprise you CAN format a single partition as ReFS directly from File Explorer, no need to use the Storage Spaces thing! (I guess the same is true for 11 Enterprise, but I don't have one at hand to test.) Don't know how the Pro for Workstations versions behave, as I never used those!
11 Enterprise can do that too, just checked.
Fun fact: ReFS was in an earlier version of Windows 10, but MS decided to remove the feature and reserve it only for higher editions of Windows.
Also, Storage Pools and Parity are so, soooo slow that it's not worth it. I set up a TrueNAS VM in Hyper-V on my Windows 10 PC that boots up automatically when my PC starts. I then added my 5 4TB drives directly to it and use ZFS, which is the superior file system.
There is a beta out there to bring ZFS to Windows, but it's far from prime-time ready.
If MS ever gets parity writing speed fixed (27 MB/s on 5 4TB drives on a Ryzen 1700X), then I might consider switching back, but Storage Pools and Parity have been awful since 2012, so I have very low hopes of that ever happening.
I was going to say the same. ReFS in Storage Spaces in any configuration other than 'Mirror' is slow. TrueNAS in VM is what I'm doing too. Got a zvol presented using iSCSI LUN to my PC. Works great, but would be very hard for non-nerds to set up.
@@Rn-pp9et Funny that you commented this very thing, as I was thinking about changing it from a mapped network share to iSCSI a few days ago.
Some programs can't access a mapped network drive. How complicated is it to set up iSCSI in TrueNAS?
@@gamingthunder6305 Easy, there's plenty of guides on YT itself. Just a few clicks.
I used to have a Windows "NAS" which used ReFS on Server 2012 back in ~2012-2013. It was an array with 8 x 2TB disks. I had so many issues with early Storage Spaces and ReFS. I ended up building a new RAID6 array on an LSI MegaRAID, migrated the data, and never looked back.
I think of ReFS not so much as a replacement file system, but really as a specialist file system like ZFS. I'm surprised that ReFS is really still in development... I think the biggest benefit, and the reason it was originally built, was for Hyper-V, and, well, as we know, on-premise Hyper-V got EOL'd with Hyper-V 2019. I wouldn't be surprised if ReFS got canned as well; I can't imagine there's much use for it when there are much better solutions like NetApp, StoreServ, 3PAR, hell, even ZFS solutions like TrueNAS Enterprise.
My biggest issue with ReFS is that there are no data recovery tools for the file system, unlike FAT32, NTFS, and so on. If the partition simply becomes too full, the data becomes inaccessible and irretrievable. This also holds true if the partition becomes corrupt for whatever reason (it sometimes can't correct for all errors automatically and fails in a non-graceful manner, preventing data recovery). That's a hard pass for me!
Yeah, that's why it never really took off. :(
It's basically the modern version of spanned volumes on dynamic disks. It's also the reason why Microsoft stopped supporting dynamic disks & spanned volumes when they introduced ReFS.
ReFS shines in enterprise backup systems. The data deduplication feature saves a ton of space.
I have a few drive arrays ranging from 60TB to over 100TB, and NTFS works the best. I have lost data using ReFS set up via a Storage Pool, where the metadata was missing on boot. I only use it now for my backup server. I would not recommend regular users use this.
The parity system is actually pretty simple, and just exploits a neat property of the XOR operator.
If you have three sets of binary data of equal length, A, B, and C, then with the XOR operator, if we set C = A XOR B, then we can recover A, B, or C regardless of which one fails from the other two:
A = B XOR C
B = A XOR C
C = A XOR B
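The three identities above can be checked directly in a few lines of Python, operating on byte strings standing in for equal-sized chunks of drive data:

```python
import os

def xor(x, y):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(p ^ q for p, q in zip(x, y))

A = os.urandom(16)   # "drive 1" data
B = os.urandom(16)   # "drive 2" data
C = xor(A, B)        # "parity drive"

assert xor(B, C) == A   # rebuild drive 1 from the other two
assert xor(A, C) == B   # rebuild drive 2 from the other two
assert xor(A, B) == C   # rebuild the parity itself
```

Any one of the three can fail and be reconstructed from the remaining two, which is exactly the single-failure tolerance of 3-drive RAID 5.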
I'm surprised. Last time I got recommended this channel it was somewhat an "absurd as true" funny type channel. This is actually legit.
Parity is often easily done in hardware with dedicated logical circuitry... taking the load off the CPU.
- Ben Eater has made a great video on error detection and parity checking... It is really a simple, ingenious and well established concept.
Look into bit rot; that's why there are projects like OpenZFS, BTRFS or Ceph.
Hardware RAID is known for its potential to introduce silent corruption and/or lock people into proprietary solutions that become a problem once the manufacturer no longer wants to support that version of the hardware.
CPU time is not as expensive as it was way back.
@@Mikesco3 I was referring to what parity checking is and how it functions. It has been around for ages, long before RAID was even a concept. I remember reading about it as a teen back in the '80s.
Parity checking isn't going anywhere or being developed any further. It is not like compression algorithms, where the latest one is able to compress data even harder than its predecessor. Parity checking is what it is: the more parity bits you add to your circuit, the larger the portions of potentially corrupted data that can be reconstructed/recovered. And yes, it can be emulated in software, as a lot of circuits can today. But it is a way of building an error-correction circuit with bitwise logical gate IC chips (or in logic arrays like PLAs, GALs, or FPGAs, etc.) completely without the need to wait for a CPU or even an MPU to finish running any code. The parity data is ready the very instant the transmission has been received. This operates at the "bare metal level", as we old-school computer nerds used to say (even though "bare silicon level" would probably have made more sense)... drivers or software are much, much further up the "food chain", alongside such phenomena as compatibility issues. Done correctly, the OS doesn't even need to know that it exists.
- Sadly YT won't let me post any links... but go find and check out Ben Eater's videos; you will see what I mean.
@@Zhixalom It sounds like you're talking about some kind of hardware-accelerated parity calculations? That sounds useful for stuff like server farms or the like, but it doesn't solve the issue of end-to-end data integrity checking and recovery. Hard drives actually already have error correction via ECC data for each sector, these days 4K each. This works - for data integrity on the platter only - so long as the corruption that has occurred on-disk is not greater than what the ECC data can repair. Then we have RAID parity setups, hardware- or software-based, to deal with more massive damage, all the way up to whole disks dying altogether. The point Michael is trying to make is that all of these approaches, hardware or software, including the product you're speaking of if I've understood it correctly, so long as they're not integrated into the filesystem itself, won't be able to detect or repair damage that happens in-flight or in-memory. Data can be corrupted in memory (unless it's ECC memory, as in servers), by a faulty CPU, or while in transit either from or to storage. Only a checksumming filesystem where integrity checks are performed in-memory post retrieval, such as ZFS, btrfs or ReFS (APFS promised this, but AFAIK they still haven't fully delivered; only metadata is checksummed), can detect such corruption. As a personal example, I had a massive ZFS array running on a Linux box with SATA PMPs that had a kernel driver bug when running in SATA-300 mode, which caused transferred data to have thousands of errors every few minutes when fully saturated. I had no idea for almost a year, until I mirrored a SMART-faulting drive for replacement and discovered that I was completely incapable of copying even a single megabyte off the drive and getting the same hash sums even twice. I then started checking the other drives and found that each and every one of the 15 drives running off these PMPs was producing error-filled data when read from raw.
I then debugged ZFS and discovered the torrent of failed reads that it was experiencing - after having read blocks successfully from the drive with no CRC errors and no controller errors reported - and silently retrying until it got back what it knew was good data. I "fixed" the bug by forcing SATA-150 speeds, and ZFS performance increased massively as a result, as reads were now almost always good instead of almost always bad, and it no longer had to retry until receiving good data. Same for writes, which by default are read back and confirmed in ZFS, then rewritten if bad. Had I had a regular filesystem here, perhaps even running RAID with parity, all my data would have been destroyed. I'd have parity to ensure that it would remain destroyed in exactly the same way going forward, but no software RAID could have prevented the data corruption that happened afterwards on the SATA channel, ECC on the disk neither, and neither can hardware RAID know if the arbitrary blocks sent to it were actually damaged since the FS sent a write request.
Xor parity is an extremely simple bitwise operation anyway. Recover any disk by xor of the other two and the runtime of the xor will be vastly outweighed by disk read time anyway.
Very interesting, thanks. Why are the widgets on the background monitor flickering, though?
I was wondering about this file system - thanks for covering it. 😄 By the way, there is something strange with the audio equalization for this video compared to your other videos. Your S's aren't coming out as crisply as before. Either that or someone stuck meat probes in my ears while I was sleeping. [Edit: The problem traced to the fact that every time there's a Windows feature update, it wipes out my equalizer settings 🤬]
You can format external/removable USB drives to refs, veeam backup even recommends it for example.
ReFS is really only neat for programs that are used for backing up files. The files that land on ReFS should already be compressed, deduplicated, and encrypted by the program that uses that volume. You want an error-correcting, resilient file location. Then set the block size to 256-512 to match the program, so when the program backs up each block there isn't any wasted space at the end of each physical block, saving you gigabytes to hundreds of gigabytes of space. Also, matching the block size will get you better I/O speeds.
Kudos, man. You kept it very simple and helped me take the first steps. Very helpful! Thanks!
BTRFS and ZFS on Linux have similar features.
and waaaay more secure and dependable than this commercial attempt at ransomware
6:40 I think that it could just XOR the data on the storage drives to get the data for the parity drive and it would fit what you said earlier
That works in the simple example, but for serious use you'd want double (or triple) "parity", which is a bit more complex. I think they use either Galois fields or some sort of Reed-Solomon code (not very familiar).
ReFS now bootable on Windows 11 24H2 latest build
Without unofficial workarounds?? 😮
@@ChrisAzure most stable build 26100.863 and preview 994
Hi,
How did you make a colorized prompt in your powershell? I mean user and path, I know the standard color output.
About ReFS: it is the equivalent of Linux's btrfs, glusterfs, ceph and similar :-)
Best regards.
Interesting. I was able to format a single partition on my secondary SSD as ReFS. It was formatted on Windows 10 LTSC 2021.
Probably not that useful on a single disk though. When I eventually reformat that SSD, I'll probably be sticking with NTFS in future.
I don't really have a real reason to be using ReFS, so probably best that I don't until I need to.
You say you won't talk technical, and you're a tech channel... God give me hope.
Let's bottom-line it... It's NOT new, it's NOT a replacement, nor is it even a viable alternative or a next gen for NTFS, so skip it unless you have a specific need that takes advantage of it.
I'm no network engineer, and please correct me if I'm wrong, but when you're talking about parity bits and such, isn't that just RAID? I forget which RAID level is which (I think RAID 5 is the one with parity), but maybe they'll roll it into NTFS at some point once they get the bugs and kinks figured out.
Is it only me who thinks this so-called new file system is a cheap ripoff of RAID?
exactly
I'm just happy ThioJoe showed up on my recommendation Feed.
WTF the like button got an animation
If it didn't I would have quit clicking awhile ago
i dont see it
Bro just woke up from a coma or has been living under a rock
@@Meowmeown1664 bro wrote the comment a year ago
Thank you for the information, but one question remains unanswered for me: can you use alternate data streams like in NTFS?
Really a new file system?
I have used ReFS for many years (with absolute reliability and excellent performance, surviving power outages, disconnections, etc.). I used ReFS until the bloody January 11th of this year. On that date, Microsoft began to make it impossible to use in external mirrored disk enclosures (specifically QNAP TR004 enclosures). Microsoft made the ReFS versions of Windows Server 2012, Windows Server 2019, Win10, and Win11 incompatible. The data was even inaccessible to those who couldn't wait until, a few weeks later, Microsoft partially patched up that mess. After that, nothing was the same as before. Microsoft made us happy with ReFS on external drives, and now it has changed its mind and puts stones in the wheels. Microsoft products have never been stable, nor have they been durable. Only their monopoly is truly lasting and truly eternal.
So it's more or less Microsoft's attempt at RAID? 😕
Yes, it's a kind of RAID built into the file system.
It is like ZFS
@@bennihtm yes
So it's BTRFS or ZFS for Windows? The problem with RAID 5 etc. was bit rot: if it spotted an error, it wouldn't know which of the drives made the error, so it would just recalculate a new parity.
Filesystem-level checksumming solves that problem (but for some reason it's off by default in ReFS...).
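To illustrate the point above, here's a toy sketch (not ReFS's actual on-disk format) of why per-block checksums solve the "which copy is wrong?" problem: with a checksum stored in metadata, a mirrored filesystem can verify each copy independently and heal the bad one from the good one.

```python
import zlib

# Toy model: a mirrored "filesystem" stores each block on two drives
# plus a CRC32 checksum in metadata. On read, the checksum identifies
# which copy (if any) is corrupt, so we can self-heal from the good one.

def write_block(data: bytes) -> dict:
    return {"copy_a": data, "copy_b": data, "crc": zlib.crc32(data)}

def read_block(block: dict) -> bytes:
    for name in ("copy_a", "copy_b"):
        if zlib.crc32(block[name]) == block["crc"]:
            # Heal the other copy from the verified one.
            other = "copy_b" if name == "copy_a" else "copy_a"
            block[other] = block[name]
            return block[name]
    raise IOError("both copies corrupt")

blk = write_block(b"important data")
blk["copy_a"] = b"bit-rotted junk"   # simulate silent corruption on drive A
assert read_block(blk) == b"important data"
assert blk["copy_a"] == b"important data"  # drive A was repaired
```

Plain RAID 5 can't do this because the parity only says *something* disagrees, not which drive; a checksum tied to the data pins down the culprit.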
Sad to see shitdows hasn't switched to ext4 yet. More proprietary junk that isn't half as good. 💩 The joys of being a system administrator having to deal with proprietary junk.
EXT4 and linux stuff is junk 💩
@@AchmadBadra you say that when even the enterprise standards for a stable, 100%-uptime filesystem are Unix-only. ZFS is the de facto standard and is for Linux/BSD. Windows spyware is too inferior to support such reliable options.
@@AchmadBadra I'd just like to interject for a moment. What you're referring to as Linux, is in fact, GNU/Linux, or as I've recently taken to calling it, GNU plus Linux. Linux is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX. Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called "Linux", and many of its users are not aware that it is basically the GNU system, developed by the GNU Project. There really is a Linux, and these people are using it, but it is just a part of the system they use. Linux is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Linux is normally used in combination with the GNU operating system: the whole system is basically GNU with Linux added, or GNU/Linux. All the so-called "Linux" distributions are really distributions of GNU/Linux.
Thanks, Joe. This type of video helps those who have maybe heard of ReFS and that it might be great, or not, to know why or why not we might want to look into it.
So basically, what Microsoft has done is rename RAID 1, 5, and 10 with their own names and then introduce it as ReFS.
No, they added CoW and a lot of other things, so that is not a fair characterization.
I think the limitations are there for a reason. In modern IT it's considered best practice to keep production data and configuration data separate. So the server would boot from an NTFS drive, and the user data, databases, and VMs would be stored on ReFS.
If you're following best practices, the configuration would be stored as an Ansible playbook (or with another deployment toolchain), because restoring a machine from a snapshot is always a pain in the …
Easiest is to just do an automated fresh install and connect the user data afterwards.
Using ZFS combined with sanoid/syncoid for VMs and LXCs allows near-instant rollback compared to more traditional backups. My setup automatically replicates VM zvols every hour, taking up very little space on my pool.
Yay, yet another file system that won’t be supported by anything but windows!
Gotta love how this channel features lesser-known Windows features.
Nothing beats Btrfs
ZFS
ZFS is better in some cases. I use Btrfs though.
I think I know enough to guess what VDL does without having heard of it. When a filesystem allocates space for a file, it allocates blocks; when you format a drive you'll see this called the "Allocation Unit size". These blocks are what get allocated for every file. The default allocation unit is 4096 bytes, so a 20-megabyte file (20,000,000 bytes) takes up 4,883 allocation units, or blocks. Because file sizes almost never land exactly on a block boundary, there is always empty space at the end of a file's last block. A file of 19,998,976 bytes would also take up 4,883 allocation units. So on the filesystem both files occupy the same number of blocks, even though the actual number of bytes is less than the blocks can hold, and the leftover space is filled with zeros.
If I'm wrong, please correct me.
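The block math above is just ceiling division, which is easy to check (note that 20,000,000 bytes needs 4,883 units of 4,096 bytes, since the last, partly filled unit still counts):

```python
# Allocated size is the file size rounded up to a whole number of
# allocation units (clusters); a partly filled final unit still
# occupies a full unit on disk.
def allocation_units(file_size: int, unit: int = 4096) -> int:
    return -(-file_size // unit)  # ceiling division

assert allocation_units(20_000_000) == 4883   # 20 MB file
assert allocation_units(19_998_976) == 4883   # slightly smaller, same footprint
assert allocation_units(4096) == 1
assert allocation_units(4097) == 2            # one byte over costs a whole unit
```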
VDL is a dream where there are many small files, smaller than the block size. Having the remainder of the block filled with 0s instead of random file fragments makes data recovery and disk-level maintenance a very clean and safe operation.
The ReFS is mostly intended for enterprise and data centres...
File metadata does not include the location. In all modern filesystems, the location is not a property of the file at all, but a consequence of the directory structures that reference it, and of those that reference the directories, and so on. When a file is repositioned (moved) within a filesystem, its metadata might not be changed or even touched at all.
On Windows, certain caveats apply because of weird legacy logic around short filenames, depending on the version.
Aside from the Windows playground stuff: how is ReFS support on Linux?
Thank you for this, I had no idea what ReFS was for before this.
You just can't format drives in some versions of Windows. But you can use ReFS e.g. under Windows 10 Pro if the drive was formatted with it. And you can use ReFS on single (even removable) drives without any problems. For example, I formatted an external USB backup drive with it.
- 0:28 Vista was supposed to roll out WinFS, which was supposed to be a relational-database filesystem, but it never materialized, which was massively disappointing. 🤦 😕
- 4:35 NTFS supports sparse files. In addition to filesystem support, the program that's writing a file also needs to know about sparse files in order to use it. The filesystem allocates clusters/blocks to a file _as needed_ instead of all at once.
One use-case scenario for sparse files is a P2P program. For example, if you want to download a 1GB file, the program can mark it as a sparse file so when it requests the filesystem to reserve 1GB of space, the filesystem will say okay even if it doesn't actually have 1GB of free space available at that moment. Then the filesystem can allocate the clusters as the program downloads them.
So if that 1GB file gets stuck at 50% because there are no sources, then it will only take up 500MB of space on disk and can buy you some time to continue downloading while you try to free up space.
Of course, this also means that if the system runs out of available space when the program tries to write to a location that hasn't been allocated yet, it will get an error, but it's still helpful with low-free-space scenarios.
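The sparse-file behavior described above is easy to observe with a minimal sketch (assuming a filesystem that supports sparse files, such as ext4; on NTFS a file must additionally be flagged sparse via FSCTL_SET_SPARSE before it holds back allocation):

```python
import os
import tempfile

# Seeking far past EOF and writing a single byte yields a file whose
# logical size is large but whose on-disk footprint is tiny, because
# the untouched range is never allocated.
size = 100 * 1024 * 1024  # logical size: 100 MiB

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(size - 1)
    f.write(b"\0")  # only the final block is actually allocated
    path = f.name

st = os.stat(path)
print(f"logical size:   {st.st_size} bytes")
print(f"allocated size: {st.st_blocks * 512} bytes")  # st_blocks counts 512-byte units
os.unlink(path)
```

Reading any of the unallocated range just returns zeros, which is exactly the behavior the P2P scenario relies on.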
- 6:18 Mirrored-parity is clever. As an analogy, imagine whenever(-ish) you write to one of the two storage disks, for every byte that's written, you do some sort of reversible operation (eg XOR) between that byte and the corresponding byte on the other storage disk, then write the result to the parity drive. If one drive croaks (or throws bad sectors), you just perform the reverse operation (or in the case of XOR, just XOR again), on the bytes from the parity drive and the good drive and boom! you've got your data back. As Joe said, this lets you have a 100% redundancy backup while only using half the storage (1 backup for 2 drives). 👍 But as Joe also said, it requires extra work to do the calculation, but what he didn't mention is the extra time for the extra drive-reads as well. But then, this is for long-term reliable data-storage, not a gaming system. 🤷
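The XOR trick in the analogy above can be sketched in a few lines (a toy demonstration, not how any real array lays out its stripes):

```python
# Parity = A ^ B, byte by byte. If either data drive dies, XOR-ing
# the survivor with the parity drive reproduces the lost bytes,
# because (A ^ B) ^ B == A.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

drive_a = b"hello from drive A"
drive_b = b"and from drive B!!"
parity = xor_bytes(drive_a, drive_b)   # written to the parity drive

# Drive A croaks; rebuild it from drive B and the parity drive.
recovered_a = xor_bytes(drive_b, parity)
assert recovered_a == drive_a
```

This is also why each write costs extra reads: updating one drive means re-deriving the parity byte from the other drive's corresponding byte.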
- 7:45 Nope, volume-shadow-copy has been available with NTFS at file-level since Windows Vista (well, _technically_ since XP…)
- 10:27 You don't think a lack of transactions is interesting? Transactions are like ACID for databases: they guarantee all-or-nothing changes; it's part of what makes Linux filesystems more reliable. - Quotas are also not supported, so you can't limit different users to specific amounts of disk space (which is particularly bad for data centers that rent out space 🤦). - No 8.3 filenames‽‽‽ 😲 Awwwww, come on. 😕
- 10:52 They were trying to focus on WinFS because a relational-database filesystem is much more useful, but unfortunately it got stuck in development-hell and was ultimately cancelled. 😕
At 2:32, what is the nice terminal he's using?
How did you enable the fancy drive paths for powershell ? I've seen people do that for the Linux terminal before I just don't know how to do it.
Would there be an advantage in using a file system designed with flash storage in mind? Something like f2fs. Most pcs have some sort of flash storage.
Is it "resilient" because of technology inherent in ReFS itself, or because of drive pooling (mirroring, striping with parity, etc.)?
4:20 one question, in this case where it uses a pointer to the original instead of being a true copy, what happens if the original gets deleted? Prior to deletion does it ACTUALLY copy the file to the other location?
ReFS can't be any dumber about this than any other file system with CoW and snapshot functionality, or with support for hard links: the disk space gets released when there are no more references to the data.
So if a file is part of multiple snapshots, or there are multiple hard links to it, then all the snapshots or links need to be deleted for the disk space to be released.
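The reference-counting behavior described above can be sketched as a toy model (not ReFS's actual data structures): a "file" is a list of block IDs into a shared store, a clone just copies the ID list, a write to a clone allocates a fresh block (CoW), and a block is freed only once nothing references it.

```python
import itertools

store: dict[int, bytes] = {}   # block id -> contents
refs: dict[int, int] = {}      # block id -> reference count
next_id = itertools.count()

def alloc(data: bytes) -> int:
    bid = next(next_id)
    store[bid], refs[bid] = data, 1
    return bid

def clone(blocks: list[int]) -> list[int]:
    for bid in blocks:
        refs[bid] += 1         # instant "copy": just bump refcounts
    return list(blocks)

def release(blocks: list[int]) -> None:
    for bid in blocks:
        refs[bid] -= 1
        if refs[bid] == 0:     # space reclaimed only at zero references
            del store[bid], refs[bid]

def write(blocks: list[int], i: int, data: bytes) -> None:
    old = blocks[i]
    blocks[i] = alloc(data)    # CoW: the new data goes to a fresh block
    release([old])

original = [alloc(b"block 0"), alloc(b"block 1")]
copy = clone(original)         # fast: no data is moved
write(copy, 0, b"modified")    # only the written block diverges
release(original)              # "delete" the original file
assert store[copy[0]] == b"modified"
assert store[copy[1]] == b"block 1"  # the still-shared block survives the delete
```

So deleting the original never strands the clone: the clone's untouched blocks still hold a reference, and only blocks with zero references are actually freed.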
I like the prompt in your powershell console. How do you do that?
isn't everything I heard available on BTRFS?
I am very happy with the free and open-source Btrfs and ZFS from Linux and Unix!
When it comes to servers, I feel absolutely zero need to go for a proprietary and expensive solution copied by micro$oft when we have such amazing free ones.
9:55 Not sure, but may I guess why? Because then you would need to disable copy-on-write (CoW); otherwise enormous blocks of data would be rewritten on every write. :P
To store huge binary files like disk images and swap files, you need to disable it for the directory where the big binary is stored.
I know that on Btrfs you can disable copy-on-write for a specific directory. On ReFS that, of course, probably can't even be done. :P
If I remember correctly, when ReFS came out it was really only meant to be used for storage on Hyper-V hosts. It wasn't really meant to replace NTFS for normal workstations or VMs.
So I want to ask something: the block cloning feature suggests a pointer to the original file. What if the drive containing the pointer file needs to be taken away? Will the file actually be stored on it? Likewise, if the drive containing the original file gets removed, will the other drive keep a real copy or just the pointer?
This filesystem isn't meant for removable media at all, but for an array of internal drives that are generally removed only when they fail. In the scenario you describe, if you pull one of the disks, the filesystem would treat that as a drive failure. CoW is separate (it would work on a single-drive volume); when used together with mirroring, *both* drives would have a copy of *both* the original and the modified file.
I want to know the difference between Basic Disks vs. Dynamic Disks.
Thanks for the tech info Joe! This is pretty helpful! 👍