Overclockers Australia Forums

20th May 2016, 3:07 PM   #1
elvis (Thread Starter)
Old school old fool
Join Date: Jun 2001
Location: Brisbane
Posts: 27,888

Next gen filesystems (ZFS, BtrFS, ReFS, APFS, etc)

Going to try and consolidate some discussion on next-gen filesystems. This comes up in various subforums (other operating systems, business and enterprise, storage), but I think it's a good idea to bring it all in one place.

I'm going to start a quick list of features of the two I'm most familiar with: ZFS and BtrFS. I'd love for someone to throw in something comprehensive on ReFS (I've never used it in anger, and only know what I've read).

First up: what is a next-gen filesystem? Filesystems up until now have relied on the fact that the underlying hardware is fairly reliable. That was pretty much true up until we hit multi-terabyte workloads. With petabyte workloads becoming common in large businesses, it's a concern. But why?

https://en.wikipedia.org/wiki/RAID#URE

Hard disks have a URE (Unrecoverable Read Error) rate of about 1:10^14 bits (~12TB) for consumer disks (IDE, SATA) and about 1:10^15 bits (~125TB) for enterprise disks (SAS, SCSI). This differs between vendors, however these numbers are becoming less of an unlikely chance and more of a guaranteed thing at the data rates we use today. We call this failure "bit rot".
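For anyone wanting to check the maths, a back-of-envelope conversion of those rates into bytes read:

Code:
# 1 URE per 10^14 bits (consumer) and per 10^15 bits (enterprise), in bytes:
echo '10^14 / 8' | bc   # 12500000000000 bytes, i.e. roughly 12.5TB read per expected URE
echo '10^15 / 8' | bc   # 125000000000000 bytes, i.e. roughly 125TB read per expected URE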

For large storage systems in the past, many of us relied on RAID controllers to offer us not only data sets that span beyond the limits of a single drive, but also a way to include some sort of redundancy should one of the physical drives in the set malfunction. We're seeing patterns today where these RAID controllers are missing bit rot, and are assuming that data sets are clean when they are not.

RAID controllers also need to verify entire, large disk systems, and are often unaware of the file systems on top of them. This leads to problems where a single drive loss can take days to rebuild in large arrays. Combine that with the bit rot issue, and our data isn't as guaranteed as we'd like.

Previously it was considered "good practice" to abstract the layers of storage. The RAID system should be agnostic to the volume management, and the volume management should be agnostic to the file system. This allowed high flexibility for people to choose whatever file system they like, and combine it with whatever RAID system they like.

Next-gen filesystems are a realisation that at current data rates, we need disk management systems that are aware of all the layers at once. They need to understand everything from the logical file system that handles data and metadata, right down to where a byte is physically placed on a disk, and if that byte has been reliably written or read.

As a result, next gen filesystems do away with RAID controllers entirely. That firmware layer merely hides the actual success or failure of a write from the filesystem. Likewise with volume management, these filesystems need to "do it all", so that they can be 100% sure that data is written and read correctly, without physical or logical fault at any layer.

The upside is that you can use any disk controller that allows direct access to a raw disk (often called "Initiator Target mode", or "IT mode"). Often you'll see guides for users wanting to flash their SATA and SAS controllers to remove the vendor-provided RAID functionality, and offer this direct IT mode access. The other upside is that even very cheap SATA controllers on motherboards now become useful for quality storage, as no special hardware is needed.

These filesystems keep checksums (mathematical fingerprints) of every single byte of data written, and compare them on every read. This sacrifices a small amount of performance, although with modern CPU and RAM speeds it goes largely unnoticed (I'm typing this from a single core Pentium M 1.5GHz and 30GB IDE hard disk running BtrFS with single data and duplicate metadata, and file system performance is within 2-3% of ext4fs). This means that data is verified constantly on all reads and writes, and if there are problems, it is rebuilt in the background.
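Both ZFS and BtrFS expose counters of what that constant verification has caught, if you want to check up on your disks (a minimal sketch - the pool name and mount point are examples only):

Code:
# BtrFS: per-device counters of IO errors, corruption and generation errors
# found during normal reads/writes or a scrub
btrfs device stats /mnt/data

# ZFS: the CKSUM column counts blocks that failed checksum verification
# (and were repaired from redundancy where possible)
zpool status -v tank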

One caveat to all of this is that your system RAM *must* be reliable. If corruption occurs in memory, then whatever is written to disk can never be guaranteed. As such, ECC RAM is recommended where next-gen file systems are used, to prevent memory bitrot. It isn't mandatory for these file systems to work, however it's recommended, particularly for dedicated storage arrays.

Next-gen file systems also tend to be COW (Copy On Write). This means they never update data in-place like legacy file systems tended to. If data in a block is modified, the file system will make a copy of that data block and modify it before writing it back to a different part of the disk. This results in guaranteed consistency of data even in the event of a crash (the old block is still there and valid from pre-crash), as well as allowing trivial addition of snapshotting.

Snapshots themselves are a handy byproduct of COW. Instead of marking the old location of the data as deleted after a COW operation, the file system can track these changes, and build a virtual "snapshot" of the old data versus the new data. The benefit is that only the changed data space eats up actual disk space, so it appears as if you can have multiple copies of very large sets of data from points in time, but in reality you're only storing changes.
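To give a feel for how cheap COW snapshots are to use in practice, a rough sketch (dataset and subvolume names here are examples only):

Code:
# ZFS: snapshot a dataset, list snapshots, roll back if needed
zfs snapshot tank/photos@before-cleanup
zfs list -t snapshot
zfs rollback tank/photos@before-cleanup

# BtrFS: read-only snapshot of a subvolume
btrfs subvolume snapshot -r /data /data/.snapshots/before-cleanup
btrfs subvolume list /data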

Two popular next-gen filesystems on offer today are ZFS and BtrFS. ZFS was initially written by Sun Microsystems for their Solaris operating system, and ultimately ended up with Oracle when they acquired Sun. BtrFS was, ironically, started by Oracle as a Linux-based competitor to ZFS, meaning that after the acquisition they found themselves with two competing products under one roof.

Both are open source, however ZFS is licensed under the CDDL, which is an incompatible license to Linux and BtrFS's GPL. This made distribution of ZFS on Linux tricky for a while, which is why most ZFS appliances and systems run either Solaris or BSD.

After some time, open source ZFS development moved to a group called OpenZFS, who worked on allowing people to compile their own ZFS easily under multiple operating systems, including Linux and MacOSX, as well as continuing work on the BSD and Solaris versions. Recently Ubuntu announced they would ship OpenZFS natively, which is still under the legal microscope, however it's out there now in the most recent release, 16.04 LTS (along with BtrFS, which is native under most modern Linux kernels).

BtrFS is also open source, licensed under the same GPL that covers the Linux kernel. BtrFS development is active, and contributors come in from a number of big companies around the world. The current lead developer works for Facebook (previously Oracle), and unverified word is that Facebook themselves use BtrFS heavily in their environment.

Some general "pros and cons" of each:

ZFS:

ZFS is rock solid, thanks to a heck of a lot of people using it in commercial environments for some time now. ZFS offers a heap of cool features (there's a quick command-line sketch after this list), such as:

* RAIDZ - an updated take on RAID5, which avoids the "RAID5 write hole" - a problem where a power failure at the moment of a RAID5 parity write can leave data in an inconsistent state, and one that the RAID controller can't detect. Generally this problem is reduced with a UPS, battery backup, or NAND flash on the RAID controller, however even then it isn't guaranteed. RAIDZ fixes that issue, and as long as ECC RAM is in place, guarantees consistency even in a hard crash during a write.

* RAIDZ2 and Z3 - offering extra parity (RAIDZ2 would be similar to RAID6, and RAIDZ3 would offer a third disk worth of redundancy for really large arrays).

* Offers great volume management with many advanced features - some of them great for users of virtualisation and container software who want to clone/create/destroy data sets quickly.

* Ability to make volumes exported as raw block devices, which can be used as iSCSI LUNs, FC LUNs, swap volumes, and other things where raw blocks are needed.

* Built in real-time compression (various algorithms and levels including LZ4, LZJB, GZIP and others depending on the vendor). If your CPU is fast enough, this can actually improve performance in some cases, as less data is written back to the storage. It can also lengthen the life of SSDs!

* Built in encryption

* "Infinite" read-write or read-only snapshots

* Ability to duplicate data even on a single disk

* Online "scrub" (perform a file system check while the system is running and in use)

* Ability to use SSDs as read and write cache in front of spindle disks, as a way to add some performance to slower storage
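To tie a few of those together, a rough command-line sketch (hypothetical pool, dataset and device names - not a recommendation for any particular layout):

Code:
# six-disk RAIDZ2 pool with in-line LZ4 compression
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
zfs set compression=lz4 tank
zfs create tank/media

# SSDs as read cache (L2ARC) and a mirrored write log (SLOG)
zpool add tank cache /dev/sdg
zpool add tank log mirror /dev/sdh /dev/sdi

# online scrub, then check per-device error counters
zpool scrub tank
zpool status -v tank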

Some cons of ZFS:

* ZFS is RAM hungry. Most setups recommend a minimum of 8GB of RAM in the system running ZFS, and the more the better. Where I work, all of our ZFS appliances have 512GB ECC RAM in them to really maximise performance.

* Unless you're running Solaris, ZFS is difficult (sometimes impossible) to boot from. Linux and Mac users will struggle here especially, and on these platforms ZFS is better suited to a storage area separate from your OS.

* As the first major "next-gen" file system, ZFS is now slowing in development. It's assumed that other file systems will eventually surpass it, however that's not really a "con", just the natural evolution of things.

* Can't resize an existing RAID set by adding new disks to it

BtrFS:

BtrFS is much younger than ZFS, and as such hasn't had time to get all the features implemented. Starting with a few cons compared to ZFS:

* RAID5/6 isn't quite ready yet. Code exists and is in testing, however it's not recommended for production use. It also hasn't yet solved the RAID5 write hole like ZFS's RAIDZ has, however that's "coming soon".

* No SSD caching yet

* No built in encryption yet

* No advanced raw disk export (can't hold swap)

* Poor performance for databases (you can disable COW at the file/directory level to get around this if required).

* Not available for BSD, Solaris or MacOSX yet. Native to most Linux distros, and there's currently a beta driver for Windows and ReactOS (bonus points to anyone using ReactOS in the real world).

Pros of BtrFS:

* Great volume management, and for some Linux distros, the ability to auto-snapshot your entire OS every time you update, add or remove packages via your package manager, resulting in easy rollback of a bad update.

* "Infinite" read-write or read-only snapshots

* Online "scrub" (perform a file system check while the system is running and in use)

* Very light on memory usage. The BtrFS based laptop I'm sitting on now runs LXDE and the Midori browser, and the entire system is consuming 430MB of RAM (with 320MB of that going to the 6 browser tabs I have open).

* GRUB2-compatible Bootable/system drive for Linux (even if you want ZFS on Linux, you'll still need to put your boot/root volume somewhere)

* Built in compression, often resulting in faster reads and writes on slower storage, and improving SSD life

* SSD optimisations (mostly to the queue depth settings)

* Easily add disks to an existing RAID set and run a "rebalance" to use all space. You can do this while the RAID set is mounted and online.

* Ability to have different data and metadata RAID levels mixed within a volume

* Production-ready data levels are:
* Single
* Dup (write data or metadata twice, even to a single disk)
* RAID0
* RAID1* (special case for BtrFS, see below)
* RAID10

* BtrFS's RAID1 is special, in that it makes sure that each block of data is written to two separate devices. This is particularly useful if you have a number of mis-matched drives. For example, you can have 4x1TB and 1x2TB disks totalling 6TB of raw space, and BtrFS will ensure that under RAID1 you get a usable 3TB of storage. Other RAID systems would give far less, depending on their definition of RAID1, and typically limit the usable size of every disk to the size of the smallest drive (some would just offer 1TB of data mirrored across all drives, and some would allow 2TB with an uneven number of drives, with 1TB as the maximum usable space on any disk due to that being the smallest drive size). A quick command-line sketch of this mixed-drive setup follows below.
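A rough sketch of that mixed-drive RAID1 setup and a later expansion (device names and mount point are examples only):

Code:
# RAID1 data and metadata across mis-matched drives
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
mount /dev/sdb /mnt/data
btrfs filesystem df /mnt/data      # shows the data/metadata profiles and usage

# later: add another disk and spread existing block groups across all devices
btrfs device add /dev/sdg /mnt/data
btrfs balance start /mnt/data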

So, which of these should you use? That would largely depend on your use case. If you absolutely require RAID5/6 type volumes over many disks, ZFS is going to win that battle here and now (at least, until BtrFS sorts out their RAID5/6 stuff). For smaller home file servers, BtrFS is great. I use a RAID1 setup on 4x1TB and 1x2TB drives, and can happily replace my smaller disks with larger ones individually over time, and not waste space.

If you want to speed up a lot of spindles, ZFS's SSD caching is fantastic. If you're already on an all-SSD array, then that matters less.

BtrFS's low memory usage makes it great for smaller systems. Raspberry Pi users are likely not able to use ZFS at all, but for micro-NAS setups or build-your-own-Time-Machine users, an RPi with Netatalk and BtrFS makes for an excellent "Time Capsule" type device for Apple MacOSX users to automatically back up over WiFi. (Or NFS/SMB for backup of anything else to your micro-NAS).

MacOSX users, HFS+ is simply the second worst file system in existence today:
https://blog.barthe.ph/2014/06/10/hfs-plus-bit-rot/

And for you guys, getting OpenZFS on Mac onto any volume that needs data reliability is a very good option. Particularly since Apple are slowly dropping support for software RAID, and hardware RAID options for Mac users are terrible (as a commercial Mac user, there is not a single decent commercial RAID array for Mac out there today). I see a lot of photographers corrupt data regularly thanks to HFS+'s terribleness, and ZFS-on-Mac means that precious data is just that little bit safer.

As I said right at the top, I'd love for someone to add some details about the Windows equivalent, ReFS, and particularly where it's headed for Windows in Server 2016 and beyond.

Last edited by elvis; 15th June 2016 at 9:58 PM.
20th May 2016, 4:52 PM   #2
peter10001
Member
Join Date: May 2010
Location: Netherlands
Posts: 136

ZFS does not need special RAM; at least for home use it's not needed, though for a company it is better.
ZFS will notice a damaged file.

ZFS or BTRFS: if you look at safety for your files, then ZFS is better; BTRFS is still too young.
20th May 2016, 5:06 PM   #3
elvis (Thread Starter)
Old school old fool
Join Date: Jun 2001
Location: Brisbane
Posts: 27,888

I'm very glad I wrote this thread. Two incorrect statements straight off the bat, based on old information. Let's look at them in detail:

Quote:
Originally Posted by peter10001 View Post
ZFS does not need special RAM; at least for home use it's not needed, though for a company it is better.
ZFS will notice a damaged file.
Incorrect.

When you write data to disk on PC architecture, it travels through RAM. If your RAM is corrupt, your file is corrupt.

For example:
* Request a write of a file with the bit sequence 010101
* RAM is corrupt, sends 010111 to the file system driver in your kernel
* File system gets request for 010111 write, writes it
* File system stores checksum for 010111

ZFS cannot detect this, because according to ZFS, the original request and resulting data matched. You still need ECC RAM to ensure your RAM doesn't corrupt the data before it gets to the filesystem layer. (Same goes for BtrFS).

ECC is to RAM as block checksumming is to ZFS and BtrFS.
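A crude shell illustration of why the filesystem can't save you here (the files just stand in for the in-memory write buffer):

Code:
printf '010101' > intended.txt    # what the application meant to write
printf '010111' > written.txt     # what corrupt RAM actually handed to the filesystem
sha256sum written.txt             # this "valid" checksum is what gets stored and verified forever after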

The caveat of course is that ECC RAM (and motherboards that support it) are expensive. You can still get many benefits from ZFS and BtrFS on non-ECC systems, however to be *really* sure of your data integrity, you do indeed need ECC RAM.

My home NAS runs BtrFS on non-ECC RAM. I acknowledge that this is sub-optimal, but don't have the budget for ECC. Conversely, at work we run multiple ZFS and BtrFS servers in production, and they all run ECC RAM.

Quote:
Originally Posted by peter10001 View Post
ZFS or BTRFS: if you look at safety for your files, then ZFS is better; BTRFS is still too young.
Outdated information.

https://btrfs.wiki.kernel.org/index....ability_status

BtrFS is stable for Single, Dup, RAID0, RAID1 and RAID10 workloads as of Linux kernel 4.1 (based on the last deadlock fix, which is many releases old now). On-disk structures have not changed in many releases, and features from 4.1 onwards are consistent, with all deadlock edge cases sorted.

BtrFS RAID5/6 is still in development, and changing regularly. It's not considered production-ready. With that said, plenty of people *ARE* using it in production (RockStor, for example, support it commercially on their NAS devices). However conservative users should avoid RAID5/6 on BtrFS for a while longer yet.

As above, I run both ZFS and BtrFS in production right now. I consider both stable, and am confident in that statement by virtue of the many petabytes of information we store and move around every month.

Last edited by elvis; 20th May 2016 at 6:04 PM.
20th May 2016, 5:08 PM   #4
fad
Member
Join Date: Jun 2001
Location: City, Canberra, Australia
Posts: 2,056

A lot of the features of ReFS are part of Microsoft Storage Spaces, or are included in SS, so you can have those features and also format to NTFS.
I think the features of Storage Pools and ReFS are not really seen until you get to the sort of size where you have hundreds of disks.

I run an 8x4TB 7.2k + 4x128GB SSD Storage Space with RAID 5. I get the same performance you would expect from a small SAN: 200-300MB/sec at around 100k IOPS.

ReFS:

Cons of ReFS:

ReFS is very young. Storage Spaces is the same.

* RAID5/6 isn't quite ready yet. The code exists and has shipped, however it's not recommended for production use. There isn't a RAID write hole, but the performance is 20-30MB/sec.

* No built in encryption yet

* Poor performance for anything other than mirrors

* Not available for *nix, BSD, Solaris or MacOSX .
* 2012R2 needs shared SAS drives (2016 changes to shared hosts)

Pros of ReFS:

* Great volume management; there are PowerShell commands for scripting operations, plus the standard Microsoft GUI tools.
* Metadata integrity with checksums
* Integrity streams providing optional user data integrity (data checksums are optional)
* "Infinite" read-write or read-only snapshots
* Online "scrub" (perform a file system check while the system is running and in use)
* Shared storage pools across machines for additional failure tolerance and load balancing
* Built in compression
* Built in dedupe.
* SSD optimisations
(There is support for tiered storage, and SSD cache storage. For the cache, a RAID space is created at the same level as the other drives. All data written is stored to these first before being destaged back to the slower drives. Max size is 100GB.)
* From Server 2016, support for storage spanning hosts. With VSAN or ScaleIO like behaviour.

* Production-ready data levels are:
* Single ( Not sure this counts)
* Mirror
* Dual Mirror

* If the system is configured correctly, the disks can be failed over to a working cluster node, working as an active-active cluster.

I will double check the ReFS stuff later. I think this is mostly correct.
__________________
WTB : 1RU Server with rails 1150/1151/1366/2011/2011-3
WTB : 1150 Intel stock heatsink + DVD laptop drives

Last edited by fad; 21st May 2016 at 7:41 AM.
20th May 2016, 7:54 PM   #5
freshmania
Member
Join Date: Sep 2002
Location: Sydney
Posts: 81

Hi elvis,

Thanks for starting this thread.

Assuming a system with non-ECC RAM, the data you write to disk will be corrupted if the data on RAM is corrupted. This is true for both ZFS and Btrfs.

So why didn't you go with ZFS on your home server? Is it because ZFS would be bottlenecked by the lack of RAM?
20th May 2016, 8:21 PM   #6
elvis (Thread Starter)
Old school old fool
Join Date: Jun 2001
Location: Brisbane
Posts: 27,888

Quote:
Originally Posted by freshmania View Post
So why didn't you go with ZFS on your home server? Is it because ZFS would be bottlenecked by the lack of RAM?
A few reasons.

1) My home fileserver is quite old hardware - Athlon X2 250 with 4GB RAM. I definitely wanted to see how BtrFS would handle an older system.

With that said, I'm also upgrading all my Raspberry Pis in the house (three in total) to Raspbian Jessie, and will set them all to BtrFS-on-root. I've got one testing at the moment, and it works quite well (LZO compression really helps on my crappy cheap-brand SD cards, with notably lower wait times / latency for disk IO).

I'm also building a couple of RPi based "Time Capsule" devices for Apple Mac-using friends, although they will stick to ext4-on-root for sanity (BtrFS on root is a bit tricky with Raspbian, and I wouldn't want to support that remotely), and go with BtrFS on USB storage for their Time Machine storage, exported via Netatalk/AFP.
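For anyone wanting to copy that idea, a minimal sketch assuming Netatalk 3.x (Raspbian's packaged Netatalk may be older and use AppleVolumes.default instead; the paths here are examples only):

Code:
cat >> /etc/netatalk/afp.conf <<'EOF'
[Time Machine]
path = /srv/timemachine
time machine = yes
EOF
systemctl restart netatalk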

2) I wanted to use BtrFS in a multi-disk SOHO environment. I run BtrFS on all my laptops at home (6 in total in my house - don't ask me why I have so many, I think they breed when I'm not looking). But they're all single-disk systems. I use BtrFS at work as well on workstations and on a few older servers with RAID controllers (still in RAID mode, not yet flashed to IT mode), but again these don't use multi-disk setups.

3) And the big one for me - I wanted to use something that would handle an odd mix of non-uniform disks. With ZFS, if I mix different sized disks in a single vdev, ZFS will cap the usage of each drive to the size of the smallest disk. BtrFS will do the same in RAID10, however in RAID1 mode it will simply ensure 2 copies of each data and metadata block exist on different physical disks. It almost acts more like a clustered filesystem in that regard than traditional RAID1.

This is cool for a few reasons:

a) I can mix and match drives quite happily - different speeds, different types, different sizes. ZFS really doesn't like this, and Linux's MD-RAID would cap the usable space unless I did some manual partitioning and trickiness (i.e. two partitions on the larger disk as separate devices for individual RAID1 devices, which I'd then have to merge together with either another MD or LVM. Very messy).

b) I can upgrade individual drives one at a time, and not waste space. If I throw out one of my 1TB drives and swap it with a 3TB, I can run a "btrfs balance" to spread my existing data over the drives (roughly sketched below).
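The drive swap in (b) looks roughly like this (the devid and device names are examples only):

Code:
btrfs filesystem show /mnt/data              # find the devid of the old 1TB drive
btrfs replace start 3 /dev/sdf /mnt/data     # rebuild its data onto the new 3TB drive
btrfs replace status /mnt/data
btrfs filesystem resize 3:max /mnt/data      # grow to use the whole new drive
btrfs balance start /mnt/data                # spread existing data across all drives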

If I had to build a larger store that needed RAID5/6 style storage, and was buying brand new hardware, I'd go with ZFS today (at least until BtrFS sorts out their RAID5/6, and then I'd re-evaluate that statement). But for my needs in this instance, BtrFS won this particular battle.
20th May 2016, 11:31 PM   #7
ewok85
Member
Join Date: Jul 2002
Location: Tokyo, Japan
Posts: 8,072

You missed one of the big pros of using ZFS - it scales up fairly effortlessly to huge levels.

I've been running ZFS on my home server since about 2012 - Intel S1200KP mini-ITX server motherboard with 16GB of ECC memory, Intel E3-1265L V2 @ 2.50GHz. Ubuntu is running off an Intel 128GB SSD, and 8x 3TB hard drives (WD and Hitachi) running in raidz2, connected to a 2-port SAS HBA. I think it's this LianLi case: http://www.lian-li.com/en/dt_portfolio/pc-q25/

This provides me with 14.9TB of usable space.

It's full speed on my little gigabit home LAN, which is good enough for me.
__________________
半ばは自己の幸せを、半ばは他人の幸せを (half for one's own happiness, half for the happiness of others)
21st May 2016, 7:30 AM   #8
Diode
Member
Join Date: Jun 2011
Location: Melbourne
Posts: 1,507

Nice write up. Whilst your write up is mostly focused around ZFS and other *nix file systems it has got me thinking about my photos and the integrity of them stored on my NTFS disks. I have seen over time more and more photos appear with bit rot, especially as I have upgraded hard disks and transferred them over. It seems to be an increasing problem as disk density continues to increase.

Anyhow. Going to make some hashes and compare my backups with my primary copies to see if I can recover some of the lost images.

Once cleaned up I'll have to update my copy process to have proper file verification. Perhaps xxcopy using /V2 option which does byte by byte comparison.
21st May 2016, 8:18 AM   #9
elvis (Thread Starter)
Old school old fool
Join Date: Jun 2001
Location: Brisbane
Posts: 27,888

Quote:
Originally Posted by ewok85 View Post
You missed one of the big pro's of using ZFS - it scales up fairly effortlessly to huge levels.
I hinted at a few things in the first post, but you're absolutely right.

At work, we run 2.5PB (i.e.: 2500TB) of ZFS storage online. I mentioned in the first post that next-gen file systems aim to solve issues of URE (Unrecoverable Read Errors) at the 1 bit to 12TB ratio for home users, and 1 bit to 125TB for enterprise users. File systems of these sizes used to be considered enormous, but today are quite standard. Scaling out to petabyte workloads is something filesystems need to be able to do now, and is no longer a problem "for the future".

I also mentioned how RAID on large data sets can take a very long time to rebuild from a failed device. Next-gen filesystems have an advantage in that they have access from the data and metadata layer, through the volume management layer, right down to the drive layer. As a result, when they rebuild data from a hardware failure, they only need to rebuild the missing/incomplete/damaged data. They don't need to rebuild parts of the RAID volume where no data was stored, and they don't need to rebuild unaffected files.

Rebuilding a 1PB RAID array on traditional RAID controllers would probably take weeks. Our storage arrays at work have recovered completely from drive faults in mere hours, even with over 100 8TB drives in them.

Due to the post length, I also skipped over deduplication and remote send/receive (for block-level file system synchronisation between remote systems), which are two features that can be very useful for some folks. Hopefully we can talk about these more soon.
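For anyone curious, send/receive in a nutshell (host, pool and snapshot names are examples only):

Code:
# ZFS: full copy of a snapshot to another machine, then incremental updates
zfs snapshot tank/projects@monday
zfs send tank/projects@monday | ssh backuphost zfs recv backup/projects

zfs snapshot tank/projects@tuesday
zfs send -i tank/projects@monday tank/projects@tuesday | ssh backuphost zfs recv backup/projects

# BtrFS equivalent (the snapshot must be read-only to be sent)
btrfs send /data/.snapshots/monday | ssh backuphost btrfs receive /backup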

Quote:
Originally Posted by Diode View Post
Nice write up. Whilst your write up is mostly focused around ZFS and other *nix file systems it has got me thinking about my photos and the integrity of them stored on my NTFS disks. I have seen over time more and more photos appear with bit rot, especially as I have upgraded hard disks and transferred them over. It seems to be an increasing problem as disk density continues to increase.

Anyhow. Going to make some hashes and compare my backups with my primary copies to see if I can recover some of the lost images.

Once cleaned up I'll have to update my copy process to have proper file verification. Perhaps xxcopy using /V2 option which does byte by byte comparison.
I'm still hoping there'll be more Windows/ReFS contributions to this thread. From what I read on MSDN blogs, it sounds like Server 2016 will have some more upgrades to ReFS, and I'm hoping Microsoft will be clever enough to include it in desktop versions of their OS soon too (if there's not a way to use it now in Windows 10?).

Windows users are otherwise being left out in the cold a little for large storage options. Most folks end up turning to a shared NAS running Linux/BSD/Solaris for their storage with Samba exports, even in single-user Windows environments (there are prebuilt distros like RockStor for BtrFS and FreeNAS for ZFS). OpenZFS has stable MacOSX ports (I'm going to be using ZFS-on-Mac much more this year, as testing has proven quite solid even on single disks), but there's not much out there for Windows desktop users.

Even this year, I've reverted to tools like PAR2 to build checksums and repair information for data on filesystems I don't trust (let me once again share my hate for Apple MacOSX's HFS+). I find it pretty terrible that I have to resort to that sort of thing, as it's manual and time consuming, and something our filesystems should be able to do for us in this day and age.
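For anyone stuck doing the same, PAR2 usage is roughly this (the redundancy percentage and file names are examples):

Code:
par2 create -r10 photos.par2 *.jpg   # build ~10% recovery data alongside the originals
par2 verify photos.par2              # detect bit rot later on
par2 repair photos.par2              # rebuild damaged files from the recovery blocks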

Last edited by elvis; 21st May 2016 at 8:21 AM.
21st May 2016, 8:30 AM   #10
elvis (Thread Starter)
Old school old fool
Join Date: Jun 2001
Location: Brisbane
Posts: 27,888

For the filesystem geeks (no shame here, I'm one of them), my favourite desktop BSD distro, Dragonfly BSD, has its own next-gen filesystem called HAMMER:

https://www.dragonflybsd.org/hammer/

I've not used this one yet, but it's good to see other people trying out development next-gen filesystem concepts. Homogeneous software is not good, and different implementations of the same ideas are "good computer science".

HAMMER has an in-development successor named HAMMER2, which aims to add clustering natively on top of HAMMER, using the send/receive and block checksumming ideas implemented in most next-gen filesystems as a way to track and verify changes in remote volumes.

HAMMER2 could then be used for either scale-out NAS storage (i.e.: add multiple NASes together to make one big "super NAS"), or as remote replication (keep data in multiple sites in hot-sync, with users able to edit data in all locations in a safe and consistent manner in near-real time). Quite cool.
21st May 2016, 8:53 AM   #11
Aetherone
Member
Join Date: Jan 2002
Location: Adelaide, SA
Posts: 8,334

Quote:
Originally Posted by elvis View Post
I'm hoping Microsoft will be clever enough to include it in desktop versions of their OS soon too
FWIW, ReFS appears to function quite well in Windows 8.1. You just can't format ReFS without a Server 2012 box handy.
21st May 2016, 10:20 AM   #12
Diode
Member
Join Date: Jun 2011
Location: Melbourne
Posts: 1,507

There is a registry hack to enable Windows 10 users to format a drive as ReFS. The intent of ReFS is data storage and archive, so you can't boot from it, and at the moment it misses other features of NTFS.
21st May 2016, 10:29 AM   #13
NSanity
Member
Join Date: Mar 2002
Location: Canberra
Posts: 15,608

I didn't think DeDupe was in for ReFS yet?

Sorry elvis re: Storage Spaces and ReFS - most of the info on both is pretty thin on the ground. The best people talking about Storage Spaces are Fujitsu - http://sp.ts.fujitsu.com/dmsp/Public...ance-ww-en.pdf

ReFS is thin - however given that SQL 2016 and Exchange 2016 are now recommending their "best practice" implementation to be placed on the filesystem - expect more to come from it soon.

My experience with Storage Spaces is pretty good (we're about to cut ~25TB to Storage Spaces 2012 R2). It's a bit easier to get IOPS out of it than ZFS.

Looking at Scale Out File Server (SOFS), SMB Direct and SMB Multichannel - if Hyper-V allows us to put VHDs on it, we'll have quite a nice little stack shortly.

re: Parity spaces and performance - the secret is putting a Write Back Cache in front of the pool and ensuring that the pool is on some form of power protection (with the pool told that it has that). It's still not quite as fast as say - a 9271 w/ Cachecade - but you're not giving that much performance away.
21st May 2016, 10:32 AM   #14
Smokin Whale
Member
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,124

Good post. I'd like to see some real world performance figures as well, especially with SSD caching. I'm using ReFS and it's good apart from the performance. I'm going to need a serious bump of performance soon, and I'm tossing up on whether to just do a stripe of 2x 1TB SSDs or put that SSD storage into a caching system of sorts.
21st May 2016, 10:35 AM   #15
NSanity
Member
Join Date: Mar 2002
Location: Canberra
Posts: 15,608

Quote:
Originally Posted by Smokin Whale View Post
Good post. I'd like to see some real world performance figures as well, especially with SSD caching. I'm using ReFS and it's good apart from the performance. I'm going to need a serious bump of performance soon, and I'm tossing up on whether to just do a stripe of 2x 1TB SSDs or put that SSD storage into a caching system of sorts.
tbh - i'm not seeing any real difference between ReFS and NTFS for Exchange Mail / Log Stores.

Just let Storage Spaces tier the storage.