Next gen filesystems (ZFS, BtrFS, ReFS, APFS, etc)

Discussion in 'Storage & Backup' started by elvis, May 20, 2016.

  1. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    18,244
    Location:
    Canberra
    FreeNAS 10 has a hypervisor in it iirc.

Correction - 9.10 has it (which is based on FreeBSD 10, whatever).
     
  2. Smokin Whale

    Smokin Whale Member

    Joined:
    Nov 29, 2006
    Messages:
    5,183
    Location:
    Pacific Ocean off SC
    Ah. Didn't know that was a thing now.

    http://www.freenas.org/blog/freenas-910-released/

    Nice. Whilst I've used FreeNAS with relative ease in the past, BSD isn't really my forte. Bhyve is pretty new though isn't it?
     
  3. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    18,244
    Location:
    Canberra
As far as I could see - yeah (2011). I know people were clamoring for *some* form of virtualisation inside FreeNAS for some time. I presume the biggest reason it's not KVM/oVirt is the BSD vs GPL licensing bullshit.
     
  4. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
    BSD has had jails since forever.

    In fact, that's what Linux's OpenVZ/LXC/Docker were all based on over a decade later.

KVM is specifically built for the Linux kernel. You can run QEMU without KVM, but it's slow. I have no idea what the equivalent is for BSD.
     
  5. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    18,244
    Location:
    Canberra
Jails != virtuals though.

It's much harder to escape a guest under a hypervisor than it is a jail. Just ask Apple.
     
  6. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
    Free/OpenBSD have a pretty good track record.

    Apple give no fucks about security.
     
  7. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    18,244
    Location:
    Canberra
OpenBSD is run by a literal fanatic though ;P Makes Linus look tame by all accounts.

    Apple indeed gives no fucks about security. Or enterprise.

I mean, it does stand to reason though. My understanding is we use jails for app isolation, but not entirely for user isolation. Sidenote: I dislike that, just as with virtualisation (particularly early virtualisation), there seems to be a shitload of people using jails for the wrong reasons, or "just because".
     
  8. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
    A friend of mine works on OpenBSD and travels to Canada once a year to go hiking and climbing with de Raadt.

    What you see on mailing lists and in media isn't a true representation of the guy. And again, his track record speaks for itself. We all rely on stuff he's written to do our jobs every day.
     
  9. Onthax

    Onthax Member

    Joined:
    Nov 5, 2003
    Messages:
    471
Actually, dedup in 2012 R2/2016 is done at the NTFS filesystem layer, not the Storage Spaces layer. It's not inline either, and it has its own set of limitations:

64TB max volume size
Slow space release on delete (scheduled process)
Max supported file size of 1TB

    In the article you mentioned you can see you enable it on the volume.

Doesn't support ReFS at this time either.

Great for backups tho.



     
  10. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
    When it comes to inline/online dedup versus periodic/offline dedup, I much prefer the latter.

Inline dedup takes a lot of RAM to manage, and invariably misses a lot of dedup opportunities if the system has been restarted recently.

    Scheduling a crawl over random parts of the file system for a given time is a feature I'd like to see in new storage solutions. Where I work, we get a pretty good window of downtime from about Sunday evening through to Monday 9:00am. I'd love to be able to schedule some sort of random crawl+dedup for that many hours every week, and then it stops running when production ramps up.

The "duperemove" project that BtrFS recommends takes an offline/scheduled approach to deduplication, and can use an SQLite3 database instead of RAM if you've got a big data store and not much RAM (point it at a spare SSD that isn't in the same storage pool and it's not a problem).
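If anyone wants to try it, the usual invocation looks something like this (the paths here are just examples; check `man duperemove` for your version's flags):

```shell
# Scan /srv/data recursively, dedupe matching extents (-d), and keep the
# block-hash database on a spare SSD instead of in RAM (--hashfile).
# Re-running later reuses the database, so only changed files get re-hashed.
duperemove -dr --hashfile=/mnt/scratch-ssd/dedupe.db /srv/data
```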

Once blocks have been deduped, the benefits stick around: you get more cache efficiency (no need to cache identical blocks twice), the space stays saved, and you're not wasting tonnes of RAM that could be speedy cache as a store for your block hash map.

On dedicated ZFS arrays, if I'm copying mass data to them for the first time (and I don't have to have the storage online quickly, which is rare because nobody plans anything), I'll turn on dedup, copy the data across, then turn off dedup and reboot the unit to let ARC pick up the RAM as cache, while still keeping the benefits of the initial dedup.
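In command form, that one-off workflow is roughly this (pool/dataset names made up):

```shell
# Enable dedup only for the initial bulk copy (the dedup table eats RAM
# while it's on).
zfs set dedup=on tank/archive
rsync -a /staging/ /tank/archive/

# Turn it off again: blocks already written deduped stay deduped, and after
# a reboot the ARC gets the RAM back for caching.
zfs set dedup=off tank/archive
```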

Those benefits obviously erode over time, as subsequent writes and copies aren't deduped. ZFS not offering an offline dedup is one of its downsides.
     
    Last edited: Jun 4, 2016
  11. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
    Apple have announced at this year's WWDC that APFS will replace the horrid HFS+, and bring with it many next-gen filesystem benefits:

    Apple's developer docs:
    https://developer.apple.com/library...ual/APFS_Guide/Introduction/Introduction.html

    Good write up at ArsTechnica:
    http://arstechnica.com/apple/2016/0...ocumentation-for-apfs-apples-new-file-system/

    * Copy on Write (which they call "Crash Protection", which is a bit silly, but whatever)
    * Nanosecond timestamps (mandatory for modern computers, whereas HFS+ can only do 1-second accuracy timestamps)
    * Block level snapshots
    * Filesystem level encryption (currently FileVault on HFS+ uses loopback files to do this, which is clunky).
    * TRIM, IO coalescing and queue depth optimisations for better SSD performance
    * "Copy" operations will use built in reflink/clone operations to be faster and not waste space
    * Container/volume management built in
    * Quotas and thin provisioning (which Apple call "space sharing")
    * 64bit inodes
    * Sparse file support
    * RAID0 (stripe), RAID1 (mirror) and JBOD (span) modes available. No details yet on the specifics (if they can be layered, or if RAID1 is always 2 copies like BtrFS, or up to N disks like other systems).
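One of those bullets is already visible from the command line: Apple's `cp` grew a `-c` flag that asks for a clone via the clonefile(2) syscall APFS implements (available on current macOS releases; check `man cp` on your system):

```shell
# On an APFS volume, -c asks cp to clone rather than copy the data blocks.
# The "copy" is near-instant and shares storage until either file is modified.
cp -c big_video.mov big_video_edit.mov
```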

    The big missing feature for me is block-level checksumming, which hasn't been mentioned anywhere. I'm not sure at this stage if it's not part of the design, or just remains unmentioned, but that needs to be implemented at minimum, in my opinion. Compression is also missing, which is less of a concern, but is very nice to have.
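To illustrate what block-level checksumming buys you, here's the idea reduced to a toy shell sketch (a real checksumming filesystem does this per block, transparently, on every read):

```shell
#!/bin/sh
set -e
dir=$(mktemp -d)

# "Write" some data and record its checksum, as a checksumming
# filesystem would do for every block at write time.
printf 'important data' > "$dir/file"
sha256sum "$dir/file" > "$dir/file.sum"

# A clean read verifies against the stored checksum.
sha256sum -c --quiet "$dir/file.sum" && echo "read ok"

# Simulate silent bitrot: same size, one flipped character. Without a
# stored checksum, nothing would ever notice this.
printf 'imp0rtant data' > "$dir/file"
sha256sum -c --quiet "$dir/file.sum" 2>/dev/null || echo "corruption detected"

rm -rf "$dir"
```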

    The current beta code doesn't yet support case-insensitive operations (not a bad thing if you ask me, but I'm oldschool POSIX), and can't be shared via AFP (again, not a bad thing). That also means it can't be used for Time Machine backups over network. It also can't be installed on the main/boot partition.

This is available in macOS 10.12 Sierra and up only. No backward compatibility for 10.11 El Capitan and older has been announced.

    Many folks are asking why they didn't use ZFS, BtrFS, HAMMER or other open source file systems. Cynically, I think Apple suffer from Not Invented Here Syndrome too frequently. But worth noting that they are targeting this at iOS, tvOS and watchOS as well (which to be fair are just marketing names for the same bits of core software). I'd dare say the goal is to keep memory requirements and IO way down, which may mean sacrificing some of the features ZFS/BtrFS will offer.
     
  12. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
So far, the APFS design decisions are disappointing.

    From Adam Leventhal's (ex Sun developer, DTrace co-author, and all around clever dude) blog: http://dtrace.org/blogs/ahl/2016/06/19/apfs-part5/#apfs-data

It appears Apple are, at this point, specifically not implementing checksums in APFS. IMHO, that disqualifies APFS as a true "next gen filesystem" if it relies on third-party hardware/firmware to implement data integrity checks.
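For contrast, this is the sort of thing ZFS gives you out of the box (pool name made up):

```shell
# Every block is checksummed on write (fletcher4 by default; sha256 optional).
zfs set checksum=sha256 tank

# A scrub re-reads everything and verifies it against the stored checksums;
# on redundant vdevs, bad blocks are repaired from a good copy automatically.
zpool scrub tank
zpool status -v tank   # reports any checksum errors found
```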

    Adam goes on further to note:

    I couldn't agree more. A very disappointing viewpoint from Apple engineers so far. I sincerely hope they come around to understanding exactly why this matters.
     
    Last edited: Jun 21, 2016
  13. Aetherone

    Aetherone Member

    Joined:
    Jan 15, 2002
    Messages:
    8,879
    Location:
    Adelaide, SA
    Apple. If we didn't steal invent it, its crap. If we did, its perfection incarnate.
<fingers_in_ears>LALALALALALALALA</fingers_in_ears>
     
  14. shadowman

    shadowman Member

    Joined:
    Aug 3, 2003
    Messages:
    2,755
    Location:
    Perth
    I love this bit:

Lol, alright then. They're saying Apple devices never produce erroneous data? Bullcrap.
     
  15. theSeekerr

    theSeekerr Member

    Joined:
    Jan 19, 2010
    Messages:
    3,579
    Location:
    Broadview SA
    This perfection of the hardware, of course, is why HFS+ has always behaved completely perfectly and doesn't need replacement....oh wait :Paranoid:
     
  16. [KEi]SoVeReIgN

    [KEi]SoVeReIgN Member

    Joined:
    Feb 20, 2002
    Messages:
    8,583
    Location:
    Sydney
Let's not forget this is third-hand information from "Apple engineers".
     
  17. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
Agreed, although the source is extremely reliable (DTrace author, worked for Sun/Oracle in a senior technical role, and consulted on ZFS).
     
  18. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
  19. KDog

    KDog Member

    Joined:
    Jan 9, 2002
    Messages:
    269
    Location:
    ACT
    I'm just about to run up some ZFS nodes for distributed storage (GlusterFS on top), will be for a couple of offices across a city.

    Thanks to Elvis I sort of know what's going on. lol
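For anyone curious, the rough shape of a setup like that looks something like this (hostnames, pool names and brick paths are all made up):

```shell
# On each node: a raidz2 pool, with a dataset used as the Gluster brick.
zpool create tank raidz2 sda sdb sdc sdd sde sdf
zfs create tank/brick1

# From any one node: a replicated Gluster volume spanning the two offices.
gluster volume create officevol replica 2 \
    node-a:/tank/brick1 node-b:/tank/brick1
gluster volume start officevol
```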
     
  20. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    43,457
    Location:
    Brisbane
Not at home, but we also run GlusterFS on top of BtrFS in production. It's a bit crummy because we're still using LSI RAID cards to provide RAID6, with BtrFS volumes on top of that. Not ideal at all, but we're really just using BtrFS for compression and snapshotting, not its other features.

GlusterFS itself now offers bitrot detection, although that's still at a high level (i.e. on top of the filesystem layer, not at it). It basically keeps checksums of the chunks that float around the cluster, stored in GlusterFS's special extended attributes on disk.
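Enabling and checking it looks like this (volume name made up):

```shell
# Turn on bitrot detection for a volume; a scrubber daemon then periodically
# re-reads the bricks and compares data against the stored checksums.
gluster volume bitrot officevol enable
gluster volume bitrot officevol scrub status
```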

Long ago, Ceph was designed specifically to use BtrFS (using and abusing COW, send/receive, and other features). I've not kept up with Ceph development at all, so I have no idea how thoroughly it uses BtrFS features these days.
     
