Next gen filesystems (ZFS, BtrFS, ReFS, APFS, etc)

Discussion in 'Storage & Backup' started by elvis, May 20, 2016.

  1. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
  2. waltermitty

    waltermitty Member

    Joined:
    Feb 19, 2016
    Messages:
    1,031
    Location:
    BRISBANE
    That video is 300+MB

    Run sudo btrfs fi show /mnt/SK1

    Sorry I missed the sudo part
     
  3. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    Prefix commands with "sudo", or run "sudo su -" to permanently become root.

    Most file system stuff needs root permissions ("df" shouldn't, but most of the "btrfs" commands will).
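
    For example, using the mount point from earlier in this thread (just a sketch; substitute whatever your actual path is):

    Code:
    # one-off command as root
    sudo btrfs filesystem show /mnt/SK1

    # or become root for the whole session
    sudo su -
    btrfs filesystem show /mnt/SK1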

    My turn this round, you beat me. :)
     
  4. Quadbox

    Quadbox Member

    Joined:
    Jun 27, 2001
    Messages:
    6,246
    Location:
    Brisbane
    What's the status of btrfs raid5/6? I haven't kept up with it for 18 months or so, and the information out there seems thoroughly inconsistent.
     
  5. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
    duh, silly me ofc

    ok, so..:

    'sudo btrfs fi show /mnt/SK1' yields:


    Code:
    Label: none  uuid: 71ea78b0-c9e8-4dbe-89cc-22d868882e86
            Total devices 2 FS bytes used 640.00KiB
            devid    1 size 2.73TiB used 2.01GiB path /dev/sdf
            devid    2 size 2.73TiB used 2.01GiB path /dev/sdg
    'df -hT /mnt/SK1' yields:


    Code:
    Filesystem     Type   Size  Used  Avail  Use%  Mounted on
    /dev/sdf       btrfs  2.8T   17M   2.8T    1%  /mnt/SK1
     
  6. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    Fixes came through thick and fast about a year back. Slowed down after that. I don't know if anyone's truly put it through its paces yet, but the wiki says "mostly OK".

    https://btrfs.wiki.kernel.org/index.php/Status#RAID56

    These things tend to need lots of people using them for large workloads over years before anyone gives them the real rubber stamp.

    The RAID5 write hole still exists. So far ZFS is the only FS on the planet to plug that (that I know of). But BtrFS folks are still talking about how to solve it for their code base, which is good.


    This output says it's all working. You've got 2.8TB usable of BtrFS RAID1 on /mnt/SK1. Go write something to it. :)
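
    Something as simple as this will do for a quick smoke test (the test file name here is just a placeholder):

    Code:
    # copy any old file onto the array, then see what BtrFS thinks of it
    sudo cp ~/some-test-file /mnt/SK1/
    sudo btrfs filesystem df /mnt/SK1
    sudo btrfs filesystem show /mnt/SK1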
     
  7. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
    did you watch the video?
    cause that's what the rig is doing right now
    I have no other way of explaining it, sorry :(
     
  8. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    Just watched it now.

    First up, stop using that silly GUI tool. Every time you click a "volume" (raw device), it's running an automatic mounting tool and mounting your volume under /media/$USERNAME. These tools are handy for when you stick in a USB thumbdrive or something, but just get retarded when you start creating volumes and devices that your GUI didn't know about at boot.

    So it's doubling up on the mount command, and causing you confusion.
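
    You can see the doubling-up for yourself from a terminal (assuming the device name from your earlier output):

    Code:
    # list every place this device is currently mounted
    findmnt -S /dev/sdf
    # or, the old-school way
    mount | grep sdf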

    If you INSIST on using a GUI to verify this stuff, I would instead ignore the volumes on the left, and just browse the file system to /mnt/SK1 where you manually mounted your volume. If you get permission errors in the GUI, fix them with a "sudo chown" type command on the command line.

    From there, I'd set a specific mount point for the volume in /etc/fstab like I showed you earlier, reboot the whole machine to test that it mounts correctly. *Hopefully* that will stop the silly GUI automount nonsense from thinking it's a removable disk, and it can butt out and leave your drives alone. If, however, it still shows up as a volume and tempts you to click on it, it's a bug in the silly GUI, not a problem with your disk, file system, or mount point.
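
    For reference, that fstab entry would look roughly like this - a sketch only, using the UUID from your "btrfs fi show" output above; the mount options here are just generic defaults:

    Code:
    # /etc/fstab - mount the BtrFS RAID1 array at /mnt/SK1 by UUID
    UUID=71ea78b0-c9e8-4dbe-89cc-22d868882e86  /mnt/SK1  btrfs  defaults  0  0

    Then "sudo mount -a" (or a reboot) should pick it up.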
     
  9. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
    I see
    yeah, I'm simply using the GUI to navigate to the array to dump files, not to 'verify' that everything is working etc.. however, in doing so, it was acting retarded.. ok.. will try to do the things you mentioned.. little green on editing fstab, but here goes! :p

    thanks heaps!
     
    Last edited: Aug 19, 2019
  10. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    Definitely browse from the root down. Don't click on raw devices on the left, as that triggers the weird automount thing that's confusing you.

    Again, handy for USB sticks in your laptop. But don't get me started on Linux devs who test shit on their laptop and then upload it to enterprise code repositories (fucking Lennart Poettering...)
     
  11. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
    that's rather odd, 'cause in my main rig I have two stand-alone HDDs using btrfs, and a RAID1 ZFS pool, and all of them show on the explorer's left-hand side, which is my primary way of navigating to them - none have given me any dramas since their creation :/
     
  12. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
    ok.. success

    I changed the label of the array
    I then edited fstab
    restarted, all good
    it still does show two volumes, both with the same new label, but clicking on one gets me to the actual array, while the other pops up with an error, something about being unable to mount because the mount point is busy (duh, it's already in successful use), and that's the only (small) hiccup it gives - it doesn't auto-mount volume after volume after volume, which was the more annoying thing, rather than having a redundant volume


    thank you soooooo much
     
    elvis likes this.
  13. HobartTas

    HobartTas Member

    Joined:
    Jun 22, 2006
    Messages:
    839
    MUTMAN likes this.
  14. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    exFAT only does metadata checksums (not data) and has no redundancy/repair options. Only really useful as a file system for small removable storage that's multi-OS capable, where the data always exists on bigger and more reliable storage.

    I wouldn't ever recommend exFAT for any sort of production use, even on a single drive workstation.

    If you're still dual-booting and need a reliable FS to share between OSes, ZFS and BtrFS have options for Mac and Windows respectively.
     
    Last edited: Sep 7, 2019
  15. ae00711

    ae00711 Member

    Joined:
    Apr 9, 2013
    Messages:
    1,435
    ok, here's a (possibly) curly question:

    re: ReFS + Storage Spaces + mirror array + data dedup enabled

    so say you have the above-mentioned array.. you dump a whole heap of files on there, and you know there are quite a few double, triple copies of some shit... data dedup 'gets rid of the excess copies'.. my -limited- understanding is, it works like this - and please excuse my incorrect terminology:

    - there is a 'base layer' of data, which is the actual data stored / takes up the actual space on the array, and it contains no duplicates.

    - this 'base layer' of data is not visible to the user - the user only sees the files and folders he/she dumped on the array, and without the user specifically deleting any dup's, can navigate to the same file, contained in different locations. Example:

    folder1/folder1/file1

    folder2/folder2/file1

    ^ two copies of the exact same file, different locations. The user can browse to either, despite DD having 'deleted' one copy of the file - so one file location acts more like a shortcut, without the user realising it ('invisible shortcut'?)

    Now.. I'm 80% sure that, from my experience, I have copied data over from an exact set-up like the aforementioned, and although the 'base layer data' only takes up, say, 100GB, 150GB of data gets copied over. That is, 50GB of duplicates were found and deleted, and only a single instance of (each) file remained, but the layer visible to me, and the system being aware of what is visible to me, means that if I want folder1/folder1/file1 AND folder2/folder2/file1 copied, it will automatically give me the same file twice.

    Like I said.. pretty darn sure this is how it is acting / working.. I've been moving a fair bit of data lately, and had/have 3 such arrays, and all exhibited such 'behaviour'.

    now.. here's the curly question: how do I access / only copy the de-dup'd data(set), that 'base layer'?
     
  16. HobartTas

    HobartTas Member

    Joined:
    Jun 22, 2006
    Messages:
    839
    Do you really need de-dupe for this? Wouldn't it be better to just run any of the "find duplicate files" programs from the internet and just cull all the extra copies?
     
  17. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    When it comes to CoW (Copy on Write), there is no "original" or "base layer". Duplicate blocks/extents that have been deduplicated use reflinks (similar to hard links, but each reference maintains its independence via CoW when a future update alters it, both at the data and metadata layer).

    These reflinks are of equal value. Similar to a hard link, where you can have two locations in a file system pointing to the same "file" (chunk of data and metadata), and you can delete one and the other still exists.

    There's no primary/secondary in that world. They're just pointers. With reflinks/CoW, "virtual file A" and "virtual file B" point to the same bunch of blocks/extents on disk. Modify "virtual file B", and copy-on-write does exactly that - copies just the bits that get modified, and writes them back to a brand new part of the disk. "Virtual file B" now appears entirely unique from A, however only the tiny part that was modified is separate on disk. All the other bits that were the same remain untouched, and shared between them (again, of equal value).

    It's this exact mechanism that allows snapshots to work - essentially it marks a huge lump of blocks/extents as "read only", and writes every new block/extent from that point onwards. Some file systems like NILFS take this to an extreme - data is never deleted until the disk is at 100% capacity. "Deleted" files are merely marked as deleted, and all changes to files are stored in new blocks, never overwriting old blocks in place, so the data remains. At any time you can ask the file system for an old "snapshot" down to the second, and it will present you that virtual view based on the blocks that were active at that time. Different to BtrFS/ZFS/ReFS, where you have to make that snapshot manually, otherwise the background cleaner removes deleted data.

    Now try to imagine what happens when you "delete a snapshot". You're actually merging the old data in with the new, and cleaning up what's left over. Conversely, "restoring a snapshot" actually removes the deltas. In terms of effort to the file system, deleting a snapshot is typically much more work than restoring one (you'd assume the opposite, if they were just dumb files).
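
    In BtrFS terms that looks something like this (a sketch only, assuming a subvolume at a hypothetical /mnt/SK1/data):

    Code:
    # take a read-only snapshot of a subvolume
    sudo btrfs subvolume snapshot -r /mnt/SK1/data /mnt/SK1/data-snap1

    # "deleting" the snapshot later is what kicks off the merge/cleanup work described above
    sudo btrfs subvolume delete /mnt/SK1/data-snap1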

    This is all a bit tricky, and it does require some fundamental understanding of how data exists physically on disk, right down at the bytes/blocks/extents level. You need to let go of the highly virtualised, highly abstracted concept of "files". Files don't exist. Your disk is nothing but a stream of data. At certain points we put collections of bytes in patterns that identify virtual beginnings and ends to things. Much like a long scroll of paper could have a story written on it, broken into chapters, but "chapters" are meaningless, as they're just collections of letters that start and stop logical flows of information.

    Deduplication happens underneath the file level. You as a user are totally ignorant of whether data is duplicated or not. You see just a logical collection of stuff represented as files. And even then, files are arbitrary - there are "file systems" out there that represent data as objects, like Amazon S3 or IBM System i aka OS/400 aka AS/400. These file systems treat their collections of data more like a huge database, rather than "files and folders". But in either case, all of that is just representing streams of bytes on a disk to humans, who are really terrible at understanding things without simplistic metaphors.

    So, long answer to a short question. In a CoW file system, no, you can't tell what's "original" and what's not in a deduplicated set of data, because it doesn't work that way. All reflinks are of equal value, whether there's one or there are millions.
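
    You can play with this yourself on a BtrFS mount - using the folder1/folder2 example from above (paths are just illustrative), a reflinked "copy" shares all its extents with the source until one side is modified:

    Code:
    # make a reflink copy - near-instant, no extra data written
    cp --reflink=always /mnt/SK1/folder1/file1 /mnt/SK1/folder2/file1

    # both paths are equal pointers to the same extents;
    # delete either one and the other (and the data) remains
    rm /mnt/SK1/folder1/file1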
     
    Last edited: Sep 8, 2019
    KDog and grs1961 like this.
  18. OP
    elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    37,892
    Location:
    Brisbane
    VIP3R and HobartTas like this.
  19. demiurge3141

    demiurge3141 Member

    Joined:
    Aug 19, 2005
    Messages:
    1,560
    Location:
    Melbourne 3073
    Deduplication works at the block level; a file just refers to the collection of blocks that make it up. You can have different files using the same block if they are partially the same. The "base layer" is useless if you don't have the metadata for each file, so I'm not sure what you are asking for.
     
  20. HobartTas

    HobartTas Member

    Joined:
    Jun 22, 2006
    Messages:
    839
    Sounds interesting. Here's an article where someone has put all of these latest file systems to the test: Battle testing data integrity verification with ZFS, Btrfs and mdadm+dm-integrity. While I'm confident about ZFS's ability to keep data safe, I'm less so where it interacts with Linux as the OS, as mostly detailed here on the ZOL forums, and rectifying any significant problems if they do occur basically means you have to have extensive Linux experience to figure out how to fix them, which I unfortunately don't have.
     
