To be expected. TB of space / MB/s write speed = write time. Do the same for read time, add them together for a single all-sector read/write test. I generally don't bother with new drives. Second hand, maybe run a couple of xxhash checksums over the drive and make sure they return the same value, but even then I just get impatient. Next-gen filesystems will tell you really quickly if reads or scrubs hit even a single bad bit (acknowledging you aren't touching every sector on a non-full drive), and you should have good backups where it counts.
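e.g. for a hypothetical 4TB drive sustaining ~150MB/s: 4,000,000 MB / 150 MB/s ≈ 26,700s, or about 7.4 hours per full pass (double it for read+write). The checksum part is just (xxhsum being the CLI from the xxHash package):

Code:
xxhsum /dev/sdX   # read-only hash of every sector; /dev/sdX is a placeholder
xxhsum /dev/sdX   # run it again - the two hashes should match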
Does ZFS not use internal UUIDs? With BtrFS you can point to the drive any way you like; all subsequent operations are done via the "blkid" UUID and sub-UUID information after that, so drive order is totally irrelevant. mdadm is the same. Seems positively archaic to do it any other way. I would have assumed ZoL would be better than that.
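Any member device works as the mount source, e.g. (the UUID is whatever blkid reports - placeholder below):

Code:
blkid /dev/sdb             # shows the filesystem UUID plus a per-device UUID_SUB
mount UUID=<fs-uuid> /mnt  # resolves to the whole array; drive order irrelevant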
Dunno where you got that idea? Of course it uses its own internal ID system. You can load a bunch of ZFS disks into a new system, run 'zpool import -a', and it'll find and import the pool, no need to specify any device paths. Now, what ae000001 is probably experiencing is that ZFS does 'cache' imported pool device paths to speed up boot. You can clear that cache (delete a file) and do a 'zpool import -a' to reimport the pool. And if you create the array with either the WWNs, or preferably the /dev/disk/by-id/ names using the serial numbers of the disks, you get a pool that's not confused by sdX naming changes. (You could also make a custom udev rule to keep the sdX names consistent based on serial # or WWN.) However, as usual, RTFM helps here: zpool import uses /dev/disk for its search path (and picks the first match), so if you want to be specific with your device naming, a '-d /dev/disk/by-uuid' or '-d /dev/disk/by-id' (what I use) may be useful.
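The cache clearing bit is just this (paths are the ZoL defaults, may vary by distro; pool name is an example):

Code:
zpool export zfs-storage            # if the pool is currently imported
rm /etc/zfs/zpool.cache             # drop the cached device paths
zpool import -a -d /dev/disk/by-id  # rescan and import using stable names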
It was a question. I don't use ZoL (only the BSD and Solaris versions). Good to know they do it reasonably sensibly, although the "cached pool" issue still seems a bit annoying. I'd rather be a few seconds slower at boot with guaranteed array discovery than the alternative.
I don't see this as much of an issue. If it can't find all the drives and you want to import the array, you can just enumerate them all individually, something like "zpool import tank (sda, sdb, sdc, sdd)" and so on (check the exact syntax to use), and then it's got no problems opening the pool.
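For reference, the explicit form I'd reach for is a -d search path rather than listing every device (pool name 'tank' is just the example from above):

Code:
zpool import -d /dev/disk/by-id tank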
Found a stack of disks (8) in a box, loaded them up to see what was on them. Noticed they had the tell-tale ZFS partition layout, which always shows as two partitions, 1 and 9.

Code:
lrwxrwxrwx. 1 root root  9 Apr 30 20:59 wwn-0x600062b2004f1b40263d69d9564b574d -> ../../sdt
lrwxrwxrwx. 1 root root 10 Apr 30 20:59 wwn-0x600062b2004f1b40263d69d9564b574d-part1 -> ../../sdt1
lrwxrwxrwx. 1 root root 10 Apr 30 20:59 wwn-0x600062b2004f1b40263d69d9564b574d-part9 -> ../../sdt9
lrwxrwxrwx. 1 root root  9 Apr 30 20:59 wwn-0x600062b2004f1b40263d69e456f99b26 -> ../../sdu
lrwxrwxrwx. 1 root root 10 Apr 30 20:59 wwn-0x600062b2004f1b40263d69e456f99b26-part1 -> ../../sdu1
lrwxrwxrwx. 1 root root 10 Apr 30 20:59 wwn-0x600062b2004f1b40263d69e456f99b26-part9 -> ../../sdu9
lrwxrwxrwx. 1 root root  9 Apr 30 20:59 wwn-0x600062b2004f1b40263d69f958339fb4 -> ../../sdv
lrwxrwxrwx. 1 root root 10 Apr 30 20:59 wwn-0x600062b2004f1b40263d69f958339fb4-part1 -> ../../sdv1
lrwxrwxrwx. 1 root root 10 Apr 30 20:59 wwn-0x600062b2004f1b40263d69f958339fb4-part9 -> ../../sdv9
lrwxrwxrwx. 1 root root  9 Apr 30 20:59 wwn-0x600062b2004f1b40263d6a0458e11812 -> ../../sdw
lrwxrwxrwx. 1 root root 10 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a0458e11812-part1 -> ../../sdw1
lrwxrwxrwx. 1 root root 10 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a0458e11812-part9 -> ../../sdw9
lrwxrwxrwx. 1 root root  9 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a1059988a92 -> ../../sdx
lrwxrwxrwx. 1 root root 10 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a1059988a92-part1 -> ../../sdx1
lrwxrwxrwx. 1 root root 10 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a1059988a92-part9 -> ../../sdx9
lrwxrwxrwx. 1 root root  9 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a1c5a4a686c -> ../../sdy
lrwxrwxrwx. 1 root root 10 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a1c5a4a686c-part1 -> ../../sdy1
lrwxrwxrwx. 1 root root 10 Apr 30 21:00 wwn-0x600062b2004f1b40263d6a1c5a4a686c-part9 -> ../../sdy9
lrwxrwxrwx. 1 root root  9 Apr 30 20:51 wwn-0x600062b2004f1b40263d6a315b8bc9c6 -> ../../sdz
lrwxrwxrwx. 1 root root 10 Apr 30 20:51 wwn-0x600062b2004f1b40263d6a3c5c37ba7e -> ../../sdaa

So ran 'zpool import':

Code:
# zpool import
   pool: zfs-storage
     id: 13421130254309412809
  state: DEGRADED
 status: One or more devices contains corrupted data.
 action: The pool can be imported despite missing or damaged devices.
         The fault tolerance of the pool may be compromised if imported.
    see: http://zfsonlinux.org/msg/ZFS-8000-4J
 config:

        zfs-storage                            DEGRADED
          raidz2-0                             DEGRADED
            scsi-SATA_ST32000542AS_5XW0FMNZ    UNAVAIL
            scsi-SATA_ST2000DM001-1CH_Z240V8Y5 UNAVAIL
            sdx                                ONLINE
            sdu                                ONLINE
            sdt                                ONLINE
            sdy                                ONLINE
            sdw                                ONLINE
            sdv                                ONLINE

Seems two of the 8 drives are not happy. Note the first listing doesn't show any partitions for them. Go to import it:

Code:
# zpool import zfs-storage
cannot import 'zfs-storage': pool was previously in use from another system.
Last accessed by <unknown> (hostid=0) at Thu Jul  3 22:13:53 2014
The pool can be imported, use 'zpool import -f' to import the pool.

Ok, pool not from this server, so force it:

Code:
# zpool import zfs-storage -f
cannot import 'zfs-storage': I/O error
        Recovery is possible, but will result in some data loss.
        Returning the pool to its state as of Thu 03 Jul 2014 22:13:46 AEST
        should correct the problem.  Approximately 7 seconds of data
        must be discarded, irreversibly.  Recovery can be attempted
        by executing 'zpool import -F zfs-storage'.  A scrub of
        the pool is strongly recommended after recovery.

Ok, must force and FIX it, and lose 7 seconds of data from 6 years ago.
Code:
# zpool import -f -F zfs-storage
#

Ok. Scrub it as requested:

Code:
# zpool scrub zfs-storage
# zpool status zfs-storage
  pool: zfs-storage
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub in progress since Thu Apr 30 21:30:18 2020
        686M scanned at 21.4M/s, 491M issued at 15.3M/s, 692M total
        0B repaired, 70.94% done, 0 days 00:00:13 to go
config:

        NAME                      STATE     READ WRITE CKSUM
        zfs-storage               DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            3831973345871989454   UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-SATA_ST32000542AS_5XW0FMNZ-part1
            13160028098390289446  UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-SATA_ST2000DM001-1CH_Z240V8Y5-part1
            sdx                   ONLINE       0     0     0
            sdu                   ONLINE       0     0     0
            sdt                   ONLINE       0     0     0
            sdy                   ONLINE       0     0     0
            sdw                   ONLINE       0     0     0
            sdv                   ONLINE       0     0     0

errors: No known data errors

Code:
# zpool status zfs-storage
  pool: zfs-storage
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 0 days 00:00:50 with 0 errors on Thu Apr 30 21:31:08 2020
config:

        NAME                      STATE     READ WRITE CKSUM
        zfs-storage               DEGRADED     0     0     0
          raidz2-0                DEGRADED     0     0     0
            3831973345871989454   UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-SATA_ST32000542AS_5XW0FMNZ-part1
            13160028098390289446  UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-SATA_ST2000DM001-1CH_Z240V8Y5-part1
            sdx                   ONLINE       0     0     0
            sdu                   ONLINE       0     0     0
            sdt                   ONLINE       0     0     0
            sdy                   ONLINE       0     0     0
            sdw                   ONLINE       0     0     0
            sdv                   ONLINE       0     0     0

errors: No known data errors

Scrub finished quickly. Turns out the pool is empty. No data lost. lol, no data to lose.

Code:
# ll /zfs-storage -a
total 22
drwxr-xr-x.   2 root root    2 May 22  2014 .
dr-xr-xr-x.  22 root root 4096 Apr 30 21:29 ..

Was hoping to get some additional space, but with two dead drives, that's a bust already. Posted anyway to show the process for importing a pool of disks: not once did I need to specify device names (all autodetected), I had to force the import as it was a pool from a foreign system (one of my previous server builds), and force a Fix to import a degraded pool. All nicely documented by the output of the commands, and you're at no stage left wondering what to do, or what your actions will do.
Out of interest, that's a raidz2 volume with 2 disks missing, therefore it will function in a degraded state and no data is yet lost. If you scrub the data, my understanding is you will not be able to repair anything because there is no parity available (correct me if this is wrong). But will a scrub even be able to tell you that data is corrupt without any available parity? Or am I getting the roles of parity and metadata confused?
Yeah, parity and metadata are two different parts. The parity is missing (well, since parity and data are striped together, we have parity AND data missing, but enough of both for no loss). The metadata - in this case block checksums - is enough for a ZFS scrub to tell there's no data lost; if there was, it wouldn't be able to fix it, however. Were I to find some more 2TB spinners, I'd 'zpool replace' the dead two, which kicks off the repair (which ZFS calls a 'resilver') - rebuilding the missing two disks from the remaining data and parity.
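Roughly like this, using the GUIDs from the status output above (the new device paths are hypothetical; 'zpool replace' starts the resilver by itself):

Code:
zpool replace zfs-storage 3831973345871989454  /dev/disk/by-id/wwn-0xNEWDISK1
zpool replace zfs-storage 13160028098390289446 /dev/disk/by-id/wwn-0xNEWDISK2
zpool status zfs-storage   # watch the resilver progress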
You know what would let you rebuild the data without extra disks? BtrFS. That's what. Assuming there was enough free space.
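A sketch of what that looks like on a degraded BtrFS RAID1 with three or more disks (device and mount point are examples):

Code:
mount -o degraded /dev/sdb /mnt    # mount with the dead member absent
btrfs device remove missing /mnt   # re-replicates the lost copies onto remaining free space
                                   # (RAID1 still needs two healthy devices left)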
Partially correct as far as data goes. You're effectively at the equivalent of RAID 0, so if you strike a bad block you've lost the stripe, as you've gone below minimum redundancy; now you've lost whatever file has this issue. Meanwhile everything else on there will be 100% OK, because...

Yes, because in addition to parity, each and every block in the volume is checksummed, whether or not you have parity or mirrors. So even if you're scrubbing your remaining six drives of the eight-drive RAID-Z2 array with zero parity, if all six checksums are valid for each relevant block then the entire stripe is still OK. This is different to hardware RAID 5 down one drive or RAID 6 down two drives, because there you can't tell if the blocks are valid or not. Take for example a block that is on the way to going bad and presently returns corrupted data during reads (unlikely I know, due to internal block ECC, but assume it's happening for the purposes of my argument): ZFS would detect this, because the individual block checksum either matches or it doesn't, whereas hardware RAID wouldn't have a clue.

Metadata is different: when copies is set to one (N, the default), there is one copy of the data, but metadata is then set to N+1. So for a setting of one there are at least two copies of the metadata, and if the zpool consists of more than one disk, the two copies are also stored on different disks.

These reasons are why ZFS and probably also BTRFS are way better filesystems than anything else. Microsoft claims ReFS is up there with those two, but since they haven't released any information about how ReFS works internally, I'm a bit dubious about their claim.
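For what it's worth, copies is an ordinary dataset property you can inspect and raise yourself (pool/dataset name is an example; the setting only affects data written after the change):

Code:
zfs get copies tank/data     # defaults to 1
zfs set copies=2 tank/data   # future writes keep 2 copies of data blocks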
Seeing metadata checksums making it into a few file systems now, which is better than before. Still not the complete data checksumming of ZFS/BtrFS, but better than nothing. File systems include:

* xfs
* ext4
* f2fs
* apfs (covered already in this thread)
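For example, at mkfs time (device names are placeholders; recent tool versions may already enable these by default):

Code:
mkfs.ext4 -O metadata_csum /dev/sdX1   # ext4 metadata checksums
mkfs.xfs -m crc=1 /dev/sdX2            # xfs v5 metadata CRCs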
Work continues on both WinBtrFS and Quibble, an open source bootloader for Windows XP through to 10 (1909 tested and working). It's now at a point where you can convert an existing NTFS installed Windows to BtrFS and boot from it. Still highly experimental, and the author warns "don't use this for anything serious". https://github.com/maharmstone/btrfs https://github.com/maharmstone/quibble
Linux kernel 5.4 grants BtrFS the ability to detect memory bit flips without ECC RAM. And... my home NAS detected one! Time to upgrade the old clunker to ECC. (That's the only non-ECC NAS I have access to, so nice to see the feature in action).
I don't actually know the exact details of it. 5.4 RC1 added a "strict check" to the tree-checker, which apparently can now detect a corruption, but isn't able to fix it. https://kernelnewbies.org/Linux_5.4#File_systems [edit] Well, it was certainly a crystal ball event. As I was typing this message, my NAS blew up, and didn't come back. Swapped it out to another system, and we're off and running again! Still no ECC in this one, so it's time to go shopping.
It's in the 5.4 kernel and the matching BtrFS tools, specifically the scrub component. Both of these are in Ubuntu 20.04 LTS. To clarify, it can only detect certain errors after the fact, not during, and it's not foolproof. It's not a replacement for ECC, but it will better identify the results of bit flips with the new strict checker. I've said for some time that I think ECC will become standard in more devices, including low-end systems. As speeds and core counts increase, I think manufacturers will be forced to do so, just as we're forced to use checksumming file systems on large drives now.
Thread is 4 years old, how exciting.

Some new advice on BtrFS RAID5/6, which has been lagging IMHO, but changes are happening slowly. Still not a great option, and still not recommended for "data you really care about".

My recommendation remains that BtrFS users should stick to RAID1 (quite different to traditional RAID1 - more like a clustered file system where 2 (or more if you wish) copies of every bit of data are made and placed on physically separate volumes), which can scale happily over a large volume of mismatched drives. Particularly for home users on hodge-podge hardware who don't want to purchase multiple expensive drives all at the same time, it's brilliant, and allows you to scale over time as you replace physical disks. I'm on the "same" BtrFS RAID1 setup I've been on for years, despite 3 generations of disk changes thanks to hot/online migrations, which is a bit of this sort of thing: https://en.wikipedia.org/wiki/Ship_of_Theseus

But I digress, on to the late 2020 BtrFS RAID5/6 recommendations: https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/

In summary:

* Don't use RAID5/6 on metadata. Do so only on data. Use RAID1 (preferably the new RAID1C3 or RAID1C4, which keep more than 2 copies) on metadata. Metadata is tiny compared to data. On my baby home NAS in RAID1:

Code:
# btrfs filesystem usage /data
Overall:
    Device size:                  14.55TiB
    Device allocated:             13.24TiB
    Device unallocated:            1.32TiB
    Device missing:                  0.00B
    Used:                         12.72TiB
    Free (estimated):            926.35GiB  (min: 926.35GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB  (used: 0.00B)

Data,RAID1: Size:6.59TiB, Used:6.34TiB (96.26%)
   /dev/sdb    6.59TiB
   /dev/sda    6.59TiB

Metadata,RAID1: Size:34.00GiB, Used:23.33GiB (68.63%)
   /dev/sdb   34.00GiB
   /dev/sda   34.00GiB

System,RAID1: Size:32.00MiB, Used:960.00KiB (2.93%)
   /dev/sdb   32.00MiB
   /dev/sda   32.00MiB

Unallocated:
   /dev/sdb  674.00GiB
   /dev/sda  674.00GiB

Just shy of 8TB usable disk space sitting at 6.4TB used. Metadata takes up around 24GB, or about 0.4% of total used space. Could be more if you have *lots* of small files, but is still not a problem to RAID1C4.

* Scrub often. They don't really say what "often" is, but it sounds like something that should be aimed for at least once a week. However, read on for the caveat...

* Scrub devices one by one. Rather than scrubbing a volume (which scrubs all disks in parallel), sequentially scrub each device. Use the "-B" flag to "not background" (negative options are so stupid), i.e. foreground the task, so each device finishes before you start the next (see the sketch at the end of this post). Note that this will obviously take *much* longer compared to RAID1/10 if you have a lot of disks. A big consideration for anyone hosting lots of drives.

* Don't fix failed disks with "btrfs device remove". You need to use "replace" instead. That also means not adding your spare disks to your array (either add them to the system unused if you want them "hot", or keep them on the shelf if you want them "cold"). By comparison, other RAID modes allow you to remove a disk and balance (assuming you have the space), which is a far easier way to recover from failure. They also note that this process pretty much renders the system unusable during recovery, which sucks.

* During a RAID5/6 recovery, incorrect error messages can sometimes be printed to dmesg. Ugh.

* RAID5 write hole still exists.
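For the metadata conversion and the one-by-one scrubbing, something like this is what's being suggested (a sketch only; the mount point and device names are from my box above, and RAID1C3 needs kernel 5.5 or newer):

Code:
# convert existing metadata to 3 copies (data stays as-is)
btrfs balance start -mconvert=raid1c3 /data

# scrub one device at a time; -B keeps each scrub in the
# foreground so the next doesn't start until it's done
btrfs scrub start -B /dev/sda
btrfs scrub start -B /dev/sdb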
So reading that still says to me: don't use RAID5/6 on BTRFS. Too many caveats to remember - especially at a time when you're trying to recover your array.

On the topic of updates: OpenZFS is almost at version 2 (it's at 2.0 RC3 now). They're skipping v1 for some reason, going instead from ZFS on Linux 0.8.x straight to OpenZFS 2.0. The name change reflects the fact they're supporting more than just Linux now, with BSD support already in and Mac support in the future. Ubuntu now even supports booting from, and root on, ZFS volumes. Pretty good to have major distro support in that regard. The only thing really holding it back is the licence incompatibility, CDDL vs. GPLv2, so it'll likely never be mainlined.
Certainly if you've got big/expensive/important things on there. If you're just running a home NAS for your "acquired" media, then meh.

The big advantage of BtrFS is that you can run it over mismatching devices and better use the space available, as well as grow+rebalance the array a single disk at a time. Something ZFS can't do (and likely won't, due to design considerations).

If you're scrubbing frequently, you can get this sort of info:

Code:
# /bin/btrfs device stats /data
[/dev/sdb].write_io_errs    0
[/dev/sdb].read_io_errs     0
[/dev/sdb].flush_io_errs    0
[/dev/sdb].corruption_errs  0
[/dev/sdb].generation_errs  0
[/dev/sda].write_io_errs    0
[/dev/sda].read_io_errs     0
[/dev/sda].flush_io_errs    0
[/dev/sda].corruption_errs  0
[/dev/sda].generation_errs  0

If you see one of those numbers go up, time to replace a device before it fails. Combine that with the "5 SMART values that matter", and you've got a good indication when a disk is on the way out: https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/

I've run ZFS on Mac before. Is this something different? Or just "more official"? Lord knows macOS needs some decent options. APFS is rubbish, and most consumer NAS devices targeting Mac are utterly shithouse.
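Back on the scrubbing point: to automate the "scrub often and watch the counters" routine, a couple of cron entries like these would do. The schedule and paths are just an example, and if I recall correctly newer btrfs-progs have a --check flag on 'device stats' that returns non-zero when any counter is non-zero:

Code:
# weekly foreground scrub, daily error-counter check
0 3 * * 0  /bin/btrfs scrub start -B /data
0 4 * * *  /bin/btrfs device stats --check /data || logger "btrfs errors on /data"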