Overclockers Australia Forums

Old 21st May 2016, 9:15 PM   #31
Smokin Whale
Member
 
Smokin Whale's Avatar
 
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,170
Default

Quote:
Originally Posted by elvis View Post
One of BtrFS's huge downfalls right now is that it doesn't support swap. That's coming (patches exist, but haven't made it to the mainline kernel yet), however right now you have to partition up your first disk still, with a dedicated swap partition, and BtrFS managing volumes inside the other partition.
Gotcha, makes sense now. I gather if you want swap it kinda works like a standard EXT4 install at the moment, where there is the main EXT4 primary partition which has the swap attached at the end of the disk in an extended partition of sorts with some UUID wizardry in /etc/fstab? I guess that makes sense. Of course, like you mention, Btrfs excels where it exists as the only partition on a disk, and swap is kinda important for me, so I'll probably pass until swap is fully supported.

What are your thoughts on the implementation of OpenZFS on Ubuntu 16.04? I know ZoL was frowned upon when it was first released, but I'm thinking of taking another look at it now.

PS: This thread is great, btw. The storage gods are smiling upon ye, Elvis

Last edited by Smokin Whale; 21st May 2016 at 9:17 PM.
Smokin Whale is offline   Reply With Quote
Old 21st May 2016, 10:16 PM   #32
Annihilator69
Member
 
Annihilator69's Avatar
 
Join Date: Feb 2003
Location: Perth
Posts: 5,986
Default

I've been using ZFS for a while now, and what I've noticed is that over time you get fragmentation and thus slower write speeds.

Write speed is roughly proportional to the amount of free space on the pool.
The more free space, the faster the writes, as the average seek distance to find a 'free' block is lower.

I guess with an all-SSD pool it will become less of an issue.
At the moment I try to keep my arrays under 70% utilised; at around 80%, write performance drops off a cliff face.
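
If you want to keep an eye on this yourself, a quick sketch (the pool name 'tank' is just an example):

Code:
# Default 'zpool list' output includes CAP (how full the pool is) and,
# on pools with the spacemap_histogram feature enabled, FRAG (a rough
# fragmentation estimate)
zpool list tank
# Per-vdev breakdown
zpool list -v tank
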
__________________
Intel i5 2500k @ 4.7Ghz || Gigabyte Z68X-UD5-B3 || 16GB G.Skill Ripjaws 1600mhz CL7 || Nvidia MSI 1080 || Samsung 256GB 850Pro + Samsung 500GB 850 Evo + Toshiba 2TB 7200|| 35" Benq XR3501 Ultrawide
OCAU Extreme Cooling Club HO Member
Koolance 370, Bix3, XSPC R9 290 Full Cover Block, Swiftech MCP-350 Pump
Trades List
Annihilator69 is offline   Reply With Quote
Old 21st May 2016, 10:20 PM   #33
fad
Member
 
fad's Avatar
 
Join Date: Jun 2001
Location: City, Canberra, Australia
Posts: 2,068
Default

Inline dedupe is good for some things. Depends on the load and IO profile.

I have 24x 1TB spindles with an SSD L2ARC, running ZFS and presenting 8Gb FC targets. It provides storage to a VMware cluster. It is very fast and cheap. I had to replace the HBA with a third-party LSI card.
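
For anyone unfamiliar with that sort of layout, a minimal sketch of the cache/log side (pool geometry and device names are made up for illustration, not the actual config):

Code:
# Example pool across the spindles (raidz2 chosen purely for illustration)
zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# Add an SSD as an L2ARC read cache
zpool add tank cache /dev/nvme0n1
# Optional mirrored SLOG to absorb synchronous writes (handy for VMware over NFS/FC)
zpool add tank log mirror /dev/sdf /dev/sdg
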

I also have archive files on Storage Spaces with dedupe on, with some pretty amazing ratios.


I would like to have another look at Ubuntu for ZFS. The last time I used it, with the FUSE implementation, the performance was really bad (20-30MB/sec over 16 spindles).
The feature I would most like to see from any of them is OPAL hardware encryption support.

Quote:
Originally Posted by Annihilator69 View Post
at around 80% write performance drops off a cliff face.

Do you have SSD caches? Or just spinning disks?
__________________

FS: 8x WDC 5Tb RED HDD

Last edited by fad; 21st May 2016 at 10:24 PM.
fad is offline   Reply With Quote
Old 21st May 2016, 10:36 PM   #34
elvis Thread Starter
Old school old fool
 
elvis's Avatar
 
Join Date: Jun 2001
Location: Brisbane
Posts: 29,937
Default

Quote:
Originally Posted by Smokin Whale View Post
Gotcha, makes sense now. I gather if you want swap it kinda works like a standard EXT4 install at the moment, where there is the main EXT4 primary partition which has the swap attached at the end of the disk in an extended partition of sorts with some UUID wizardry in /etc/fstab? I guess that makes sense. Of course, like you mention, Btrfs excels where it exists as the only partition on a disk, and swap is kinda important for me, so I'll probably pass until swap is fully supported.
Under Ubuntu, if you select your root partition to be BtrFS and don't specify anything for /home, it will automatically create a subvolume called "@" for / and "@home" for /home.
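
For illustration, a sketch of what the resulting /etc/fstab typically looks like (UUIDs are placeholders, and the dedicated swap partition is the bit BtrFS can't absorb yet):

Code:
# / and /home are subvolumes of the same BtrFS partition
UUID=<btrfs-partition-uuid>  /      btrfs  defaults,subvol=@      0  1
UUID=<btrfs-partition-uuid>  /home  btrfs  defaults,subvol=@home  0  2
# swap still needs its own partition until swap-on-BtrFS lands
UUID=<swap-partition-uuid>   none   swap   sw                     0  0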

This is the expected minimum volume layout for APT's automatic snapshot mode (make a snapshot on any change that APT makes to the system, whether it's an add, remove or upgrade), which is a pretty cool feature for desktop and server systems alike.

I think BtrFS as your primary file system on Linux is a very good idea, even if you're not using it to its fullest extent (no pun intended).

Quote:
Originally Posted by Smokin Whale View Post
What are your thoughts on the implementation of OpenZFS on Ubuntu 16.04? I know ZoL was frowned upon when it was first released, but I'm thinking of taking another look at it now.
I am not a lawyer. However, IMHO the CDDL and GPL are incompatible, and I think that Canonical bundling binary ZFS in-kernel with Ubuntu is against the terms of both licenses.

RMS and the FSF both agree with me. Canonical disagree. Most people don't understand or don't care.

It should be dealt with via DKMS, just like other GPL-incompatible software. For example, DKMS is the way that Nvidia drivers, VirtualBox, and a bunch of other stuff deal with license incompatibilities. There should be no difference with ZFS, and it isn't that hard for end users to deal with (for most things it's completely transparent, and happens automatically on installation via APT).
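
For completeness, a sketch of the DKMS route on 16.04 (assuming the zfs-dkms package from the universe repository):

Code:
# Build the ZFS kernel module locally via DKMS rather than shipping a
# prebuilt binary module with the kernel
sudo apt install zfs-dkms zfsutils-linux
# Confirm the module built and is available
dkms status
modinfo zfs | head -n 3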

Important to note that I'm not against OpenZFS/ZoL. That's fine, and we have tools out there to deal with making that work easily. I just think that the GPL and CDDL should both be adhered to by everyone, ESPECIALLY members of the open source community.

Last edited by elvis; 21st May 2016 at 10:40 PM.
elvis is offline   Reply With Quote
Old 21st May 2016, 11:28 PM   #35
Smokin Whale
Member
 
Smokin Whale's Avatar
 
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,170
Default

Quote:
Originally Posted by elvis View Post
snip

...Important to note that I'm not against OpenZFS/ZoL. That's fine, and we have tools out there to deal with making that work easily. I just think that the GPL and CDDL should both be adhered to by everyone, ESPECIALLY members of the open source community.
Yeah, so I was more asking in relation to performance, reliability, manageability, etc. However I do agree: it's definitely a grey area, and surprising that Canonical went ahead with it. I really don't know enough about the licensing models, so I figured they just found a loophole and ran with it. I guess that puts me in the "don't care" category, and it won't stop me from using it for now. If there's a genuine problem with the licensing, I'm sure we'll hear about it well before I'm ready to put it into any sort of business environment.
Smokin Whale is offline   Reply With Quote
Old 22nd May 2016, 6:37 AM   #36
Diode
Member
 
Diode's Avatar
 
Join Date: Jun 2011
Location: Melbourne
Posts: 1,680
Default

Quote:
Originally Posted by elvis View Post
Yup, precisely why we need these new file systems.
My find has made me a little more paranoid about how I'm handling my photos. The good news is that I had already taken some half-decent measures. I have 3 copies to compare against, and after comparing the hashes of 2 copies it already seems one drive (my WD Green) was much worse than the other (my WD Black). Fortunately I have not yet come across a file corrupted in both copies. I'll repair the files on the Black, then create a new checksum file and compare the repaired Black against my other offline USB backup as an extra measure. Hopefully between all the copies I can clean it up as best I can.

Moving forward it might be time to embrace cloud storage to keep a golden copy. I've been putting it off since uploading via ADSL will suck, but maybe I'll just make use of work's gigabit internet. Cloud storage is going to have better filesystem integrity checking, for a fraction of the price of anything I can implement at home. Even if the cloud solution doesn't use ZFS in the back end, it's still a step up.

Through all the years of storing and copying these files between drives I really haven't experienced corruption at this scale before, so take heed of the warnings!
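
For anyone wanting to run the same kind of comparison, a minimal sketch using sha256sum (the mount points are made up):

Code:
# Build a checksum manifest from one copy
cd /mnt/wd_black/photos
find . -type f -print0 | xargs -0 sha256sum > ~/photos.sha256
# Verify the other copy against it; --quiet prints only mismatches
cd /mnt/wd_green/photos
sha256sum -c --quiet ~/photos.sha256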


Edit: Considering going down the route of building a FreeNAS box.

Last edited by Diode; 22nd May 2016 at 7:42 AM.
Diode is offline   Reply With Quote
Old 22nd May 2016, 9:51 AM   #37
elvis Thread Starter
Old school old fool
 
elvis's Avatar
 
Join Date: Jun 2001
Location: Brisbane
Posts: 29,937
Default

Quote:
Originally Posted by Smokin Whale View Post
Yeah, so I was more asking in relation to performance, reliability, manageability, etc.
ZoL hasn't implemented all of the features of native ZFS on Solaris yet, but what has been implemented is pretty stable.

Ditto for ZFS on Mac, which I'm going to be pushing into production workloads soon (something I'm really thankful for - let me once again say just how terrible HFS+ is, and I feel so sorry for media/photography users who rely on OS X and HFS+ for their livelihoods).

Quote:
Originally Posted by Diode View Post
Cloud storage is going to have better file system integrity checking
How are you verifying that?
elvis is offline   Reply With Quote
Old 22nd May 2016, 10:31 AM   #38
Smokin Whale
Member
 
Smokin Whale's Avatar
 
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,170
Default

Would I be incorrect in saying that most, if not all, reputable cloud providers use some form of next-gen filesystem with parity checking on their underlying hardware? People's data is kinda their bread and butter; you'd think that if they had to handle petabytes of the stuff, they'd want to cover their ass over something like bit rot. I wouldn't blame someone for making assumptions about it. I did a little googling and I couldn't find much evidence of bit rot corrupting client data at a few providers like Google Drive, Dropbox etc. (it all seems to be caused by bit rot on the local disk syncing to the cloud, which is another reason why a sync is not a true backup).
Smokin Whale is offline   Reply With Quote
Old 22nd May 2016, 10:33 AM   #39
Diode
Member
 
Diode's Avatar
 
Join Date: Jun 2011
Location: Melbourne
Posts: 1,680
Default

Quote:
Originally Posted by elvis View Post
How are you verifying that?
You could say it's a bit of a generalisation, but generally speaking the underlying infrastructure for most cloud services would have more error checking and correction on the storage layer than a standalone HDD. That's not to say it's foolproof.
Diode is offline   Reply With Quote
Old 22nd May 2016, 10:34 AM   #40
elvis Thread Starter
Old school old fool
 
elvis's Avatar
 
Join Date: Jun 2001
Location: Brisbane
Posts: 29,937
Default

Quote:
Originally Posted by Smokin Whale View Post
Would I be incorrect in saying that most, if not all, reputable cloud providers use some form of next-gen filesystem with parity checking on their underlying hardware?
I would certainly hope so, but as above, how do you verify this?

Quote:
Originally Posted by Diode View Post
You could say it's a bit of a generalisation, but generally speaking the underlying infrastructure for most cloud services would have more error checking and correction on the storage layer than a standalone HDD. That's not to say it's foolproof.
Given that most popular vendor-provided storage solutions pre-ZFS (NetApp WAFL, VMFS, etc.) didn't do block-level checksumming, and relied instead on the reliability of standard SAS/FC technology, there are a lot of vendors out there that still don't have access to this level of data protection.

Probably not so much of an issue when we had 1TB drive densities. Now that 8TB is pretty common, it is certainly becoming a bigger problem.

Where I work, we have a lot of legacy NAS units in production still on hardware LSI RAID controllers with battery backup and RAID6. They do automatic weekly scrubs, which can take hours or days depending on the data volumes. The problem is that performance takes a beating, and there's the potential for actively read data to be in a bad state for a week or more before you notice. Likewise, this is all handled by vendor-provided "black box" proprietary firmware, with a wink and a smile from the vendor saying "trust us" (which I don't). That's pretty much how most vendor storage has worked for a long time (even scaling right up to large SANs, which are more or less the same idea at bigger scale with more IO paths).

New generation clustered filesystems are dealing with bitrot in their own ways. Ceph and Gluster both have checksum/scrub features. And as above, I'd *hope* that folks at AWS/Azure/Google levels would have their own checksum/scrub type features, but that data isn't available to us.
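
On the filesystems discussed in this thread, at least, the scrub is something you can drive and watch yourself; a quick sketch (pool name and mount point are examples):

Code:
# ZFS: re-read every block, verify checksums, repair from redundancy where possible
zpool scrub tank
zpool status tank            # shows scrub progress and any repaired/unrecoverable errors
# BtrFS: same idea
btrfs scrub start /mnt/data
btrfs scrub status /mnt/data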

It's all a bit super-paranoid, of course. In this day and age of the Internet, we'd hear about data corruptions happening very quickly. But just yelling "to the cloud" isn't always an answer for data reliability.

Last edited by elvis; 22nd May 2016 at 10:52 AM.
elvis is offline   Reply With Quote
Old 22nd May 2016, 11:00 AM   #41
Smokin Whale
Member
 
Smokin Whale's Avatar
 
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,170
Default

Quote:
Originally Posted by elvis View Post
I would certainly hope so, but as above, how do you verify this?
You can't really. But it's a pretty safe bet with the big guys. I doubt it's anything to worry about.

Backblaze actually documented how they handle their filesystems and their strategies against data corruption here. Pretty interesting. Someone actually asked about ZFS since Backblaze actually use EXT4, and there is some good info there.

FYI: For anyone who doesn't know, Backblaze are well known for using consumer drives rather than enterprise stuff. They're able to do that using the unique parity checking systems they've implemented. Pretty clever, and probably saves them quite a wad of cash.

Last edited by Smokin Whale; 22nd May 2016 at 11:14 AM.
Smokin Whale is offline   Reply With Quote
Old 22nd May 2016, 11:31 AM   #42
Diode
Member
 
Diode's Avatar
 
Join Date: Jun 2011
Location: Melbourne
Posts: 1,680
Default

Quote:
Originally Posted by elvis View Post
It's all a bit super-paranoid, of course. In this day and age of the Internet, we'd hear about data corruptions happening very quickly. But just yelling "to the cloud" isn't always an answer for data reliability.
At the end of the day nothing is foolproof, so it really just goes to show the importance of making sure you have good, proper backups.

In the end, having multiple copies of the data floating about has just saved my skin. It's a problem I would not like to face again, so I'll be looking for ways to improve. I've also been wanting to make sure my photos are safe from localised disasters like fire and theft. So it's all about finding the right balance on where I want to spend my dollars to improve my overall protection.

Sorry if I'm dragging things a bit off topic.
Diode is offline   Reply With Quote
Old 22nd May 2016, 11:45 AM   #43
elvis Thread Starter
Old school old fool
 
elvis's Avatar
 
Join Date: Jun 2001
Location: Brisbane
Posts: 29,937
Question

Quote:
Originally Posted by Smokin Whale View Post
You can't really. But it's a pretty safe bet with the big guys. I doubt it's anything to worry about.

Backblaze actually documented how they handle their filesystems and their strategies against data corruption here. Pretty interesting. Someone actually asked about ZFS since Backblaze actually use EXT4, and there is some good info there.

FYI: For anyone who doesn't know, Backblaze are well known for using consumer drives rather than enterprise stuff. They're able to do that using the unique parity checking systems they've implemented. Pretty clever, and probably saves them quite a wad of cash.
Looks like they use standard erasure coding over bricks. Same method as Gluster's "disperse volume" setup.

They could still use BtrFS underneath for extra protection. I recently converted our 2x 300TB Gluster volumes to BtrFS backed erasure coding, and it works quite well.
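
For anyone curious what that looks like on the Gluster side, a rough sketch (hostnames, brick paths and the 4+2 geometry are made up, not our actual layout):

Code:
# On each node: one BtrFS filesystem per brick, so BtrFS checksums the data at rest
mkfs.btrfs /dev/sdb
mount /dev/sdb /bricks/brick1
# A dispersed (erasure-coded) volume that tolerates losing any 2 of the 6 bricks
gluster volume create dispvol disperse 6 redundancy 2 \
    node{1..6}:/bricks/brick1/data
gluster volume start dispvol
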
elvis is offline   Reply With Quote
Old 22nd May 2016, 11:59 AM   #44
Smokin Whale
Member
 
Smokin Whale's Avatar
 
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,170
Default

Quote:
Originally Posted by elvis View Post
Looks like they use standard erasure coding over bricks. Same method as Gluster's "disperse volume" setup.

They could still use BtrFS underneath for extra protection. I recently converted our 2x 300TB Gluster volumes to BtrFS backed erasure coding, and it works quite well.
That's what I was thinking. I'm guessing these systems were designed and implemented around 2013, which is just when Btrfs was considered stable. The article is a year old, maybe leave a blog comment? Who knows, they might be using it now.
Smokin Whale is offline   Reply With Quote
Old 22nd May 2016, 1:25 PM   #45
Smokin Whale
Member
 
Smokin Whale's Avatar
 
Join Date: Nov 2006
Location: Pacific Ocean off SC
Posts: 5,170
Default

OK, a couple more questions.

1. You mention you have an old AMD box at home. Have you considered running ECC RAM on it? AMD chipsets happily accept ECC RAM.

(On a side note: one of the reasons I'm getting excited for Zen - I hope they continue this trend. Low-cost, decently performing, power-efficient, ECC-compatible hardware platforms would be awesome.)

2. What are some of the considerations for using something like Btrfs over an interface like USB 3.0? Does it retain its benefits? Due to the increase of streaming over home networks, I've been looking at implementing more ultra-cheap, ultra-low-power network storage in homes, using cheap $120 Cherry Trail mini PCs with a couple of cheap 2TB external drives in some sort of mirror configuration. From a bit of reading I'm unsure if it's worth the effort, as the performance figures aren't massively promising.
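
If it helps, the mirror itself is simple enough to set up; a sketch assuming the two USB drives show up as /dev/sdb and /dev/sdc (whether they behave well under sustained load is the real question):

Code:
# One BtrFS filesystem mirroring both data and metadata across the two drives
mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/media    # mounting either device brings up the whole mirror
# Periodic scrub to detect bit rot and repair it from the good copy
btrfs scrub start /mnt/media
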
Smokin Whale is offline   Reply With Quote