
Ok who broke rsync at the ATO?

Discussion in 'Business & Enterprise Computing' started by link1896, Dec 14, 2016.

  1. Daemon

    Daemon Member

    Joined:
    Jun 27, 2001
    Messages:
    5,470
    Location:
    qld.au
    All the same, except hyperconverged is a more efficient use of space :)

    Even complexity isn't an excuse. VMware do hyperconverged systems now, so you have simple (albeit expensive) point-and-click systems. If you need performance, you go Nutanix, so there's no reason there either. There's plenty of other vendors for all of the above too; you'll find traditional SAN vendors scrambling for answers in this area very soon.
     
  2. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,910
    Location:
    Canberra
    Agreed. This is where the market is going. Hyperconverged is the most efficient use of resources - and Microsoft have proved it's no slouch either. - https://blogs.technet.microsoft.com...orage-iops-update-with-storage-spaces-direct/

    I believe that VSAN isn't quite as competitive as what MS is offering - Nutanix doesn't do RDMA yet - so MS probably has the lead there too.
     
  3. scrantic

    scrantic Member

    Joined:
    Apr 8, 2002
    Messages:
    1,738
    Location:
    3350
    Curious to know, are you using Storage Spaces Direct in production yet? It's something that interests me given we're going through a hardware refresh.
     
  4. GreenBeret

    GreenBeret Member

    Joined:
    Dec 31, 2001
    Messages:
    19,370
    Location:
    Melbourne
    Don't do hyperconverged unless you really hate yourself. When shit goes wrong, which layer do you think caused the problem? It's hard enough troubleshooting a storage or compute cluster by itself.

    The load on Ceph OSD processes is always going to be high, and that's gonna take away from your VMs running on the same host. Unless your VMs don't do much (in our major zones, they run at ~90% utilisation per core), compute performance will suffer.
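
    To put rough numbers on that contention argument, here's a back-of-envelope sketch in Python; every figure is made up for illustration, not measured from any real cluster:

        # Back-of-envelope sketch of the CPU contention argument above.
        # All numbers are assumptions for illustration only.
        cores_per_host = 32
        osd_count = 12            # e.g. one Ceph OSD daemon per drive
        cores_per_osd = 0.5       # cores each OSD burns under load (assumed)
        vm_utilisation = 0.90     # per-core VM utilisation figure from the post

        storage_cores = osd_count * cores_per_osd        # 6.0 cores to storage
        vm_cores = cores_per_host - storage_cores        # 26.0 cores left
        demand = cores_per_host * vm_utilisation         # 28.8 cores wanted

        print(f"Storage eats {storage_cores:.0f} cores; VMs want {demand:.1f} "
              f"but only {vm_cores:.0f} remain -> short by {demand - vm_cores:.1f}")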

    Then you have the problem of not being able to scale compute and storage separately. Your compute-to-storage ratio won't necessarily stay the same in the future.
     
  5. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,910
    Location:
    Canberra
    We have some stuff up on 2016 S2D. ~40 VMs. Mix of Exchange, SQL, AD, RDS etc.

    Box is basically our old ZFS box, and we have *loads* better performance out of it.

    SMB3 works. SMB Direct Works. S2D works.
     
  6. Daemon

    Daemon Member

    Joined:
    Jun 27, 2001
    Messages:
    5,470
    Location:
    qld.au
    Sounds like you've chosen a bad solution :) I've run hyperconverged systems for ~4 years now in production and had very few issues (over 100TB of data / hundreds of VMs).

    I bet many sysadmins have trouble delineating between a storage fault and a hypervisor-level fault regardless, especially when many have near-zero reporting.

    We have data logged every 5 seconds for every drive in our system and can see i/o issues nearly immediately. The visibility compared to a SAN is much, much better.
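
    For flavour, a minimal sketch of that kind of 5-second per-drive polling on Linux (our actual logger isn't shown here; the device filtering and the saturation threshold are illustrative assumptions):

        #!/usr/bin/env python3
        # Minimal sketch: poll per-drive I/O time every 5 seconds via
        # /proc/diskstats (Linux) and flag drives that look saturated.
        import time

        POLL_SECONDS = 5
        IO_TIME_FIELD = 12  # field 13 of /proc/diskstats: ms spent doing I/O

        def read_diskstats():
            stats = {}
            with open("/proc/diskstats") as f:
                for line in f:
                    fields = line.split()
                    name = fields[2]
                    if name.startswith(("sd", "nvme")):  # whole drives, roughly
                        stats[name] = int(fields[IO_TIME_FIELD])
            return stats

        prev = read_diskstats()
        while True:
            time.sleep(POLL_SECONDS)
            cur = read_diskstats()
            for dev, io_ms in cur.items():
                busy = (io_ms - prev.get(dev, io_ms)) / (POLL_SECONDS * 1000) * 100
                if busy > 90:  # arbitrary "this drive is saturated" threshold
                    print(f"{dev}: {busy:.0f}% busy over the last {POLL_SECONDS}s")
            prev = cur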

    Ceph is a special case and I wouldn't recommend using it. It's busy trying to be all things to all people under every scenario so it ends up carrying enough baggage to sink the Titanic. We see (on a competing system) around 3-5% CPU overhead to run storage.

    Again, not sure what system you're running but we can add storage only nodes, compute only or both. Generally though we find that with decent planning, our nodes are fairly evenly balanced.
     
  7. wintermute000

    wintermute000 Member

    Joined:
    Jan 23, 2011
    Messages:
    2,284
    So what's your flavour? GlusterFS or Nutanix or VSAN?

    If Ceph is stuffed, why is it the 'standard' OpenStack choice?
     
  8. scrantic

    scrantic Member

    Joined:
    Apr 8, 2002
    Messages:
    1,738
    Location:
    3350
    Nice to know.

    Any pitfalls you've seen or has it been smooth sailing?
     
  9. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,910
    Location:
    Canberra
    Pretty much smooth sailing. I did a bunch of testing between manually building columns vs "let Windows work it out" - IOPS across the board always seemed better when Windows did it.

    So long as you use enterprise hardware (there are a few reference architectures out there from various vendors) and/or have a Storage Spaces engineer validate things, you'll be fine.

    The killer is Datacenter licensing - but even when you roll that in, it's still cheaper than a traditional SAN appliance.
     
  10. TehCamel

    TehCamel Member

    Joined:
    Oct 8, 2006
    Messages:
    4,183
    Location:
    Melbourne
    This would never have happened if Luke212 won his whitebox tender.
     
  11. cvidler

    cvidler Member

    Joined:
    Jun 29, 2001
    Messages:
    13,310
    Location:
    Canberra
    He would've delivered an external USB drive from Hardly Normals, 'cause he misunderstood PB and TB.
     
  12. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    40,983
    Location:
    Brisbane
    Micro-instance workloads for web applications are different to enterprise VM workloads, which are closer to traditional physical servers. Different requirements there.
     
  13. Myne_h

    Myne_h Member

    Joined:
    Feb 27, 2002
    Messages:
    10,493
    Silly question for the storage experts here.

    Facebook has a Blu-ray jukebox thing for near-line WORM storage.
    Wouldn't it be ideal to have something similar as a mirror for massive arrays of generally write-once data like this?
    Keep a separate disc + HDD for the consolidated index and you're done. No?

    That way the reads from the HDDs are fast, the writes to the HDDs are fast, but the slower optical keeps it all intact.
    When shit hits the fan, you just zero the array and read all the discs back in.

    If files have changed, well, they'll be on later discs. So any attack can be narrowed down to the minute, along with which files were amended or corrupted.
     
  14. cvidler

    cvidler Member

    Joined:
    Jun 29, 2001
    Messages:
    13,310
    Location:
    Canberra
    That's great for cat photos and old memes from last year. FB keeps everything forever, but how often is anything from more than a few days ago accessed?

    For the ATO, it's not really workable when auditors and the like need to quickly access data from up to 7 years ago. All that stuff needs to be kept online, and fast. They have all their backups on tape. Despite its archaic image as old tech, tape is still unmatched for archival backups.
     
  15. scrantic

    scrantic Member

    Joined:
    Apr 8, 2002
    Messages:
    1,738
    Location:
    3350
    Everything I've read so far points to me wanting to investigate further.

    DC licensing is easy since we're an NFP; I'll have a look at some reference architectures.
     
  16. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    40,983
    Location:
    Brisbane
    The problem with citing Facebook and Google as sources of "we should do it that way" is that they have radically different setups to most businesses. Their technical to non-technical staff ratios are much better than most, and they have squillions of dollars to throw at R&D for solutions at scale.

    With that said, I've made a career out of tackling old problems in obscure/odd/non-standard ways using tools that are the same building blocks used by the companies I mention above. But, hand on heart, I know exactly how much maintenance and effort that comes with. Even if it does save on dollars, you need clever people around to keep it working and improve it.

    Where I work now, we had a large growth period, moving from small player in a rather shitty market to medium player in a much bigger market, and we had next to no cash to get us through that. I put in a hell of a lot of these "build it yourself" solutions to get us through those tough times. It was fun, and the solutions did a great job of pulling the company up by the bootstraps without sending them broke. But now that we've got some cash, buying "off the shelf" solutions means we're in a much more stable place, and I'm doing far fewer late nights maintaining/patching/fixing/building stuff.

    Like all things, there's no right and wrong answer. Constantly churning out Blu-ray discs of permanent archives could likely work, but you'd be in "build it yourself" territory, with no vendor offering help there. And that means hiring clever folk to build it, document it and keep it running, rather than your "9-5, clock on, churn through tickets, clock off" types that are typical hires in Government, with their strict "no overtime" policies.

    Politically speaking, I don't think it's something the AU Government could handle in their current form. If you look at how the public service in places like Brazil works, they emphasise spending money on local developers maintaining in-house and open source code, rather than buying foreign software. With that mindset, they're not only growing their own skilled labour / technical population, but they're also getting a more customised product for their dollar. Here in Australia, we're pretty averse to doing things differently, and would rather just blindly follow the status quo of technology. There's pros and cons to that, but trying to get our public sector to change overnight from their current mindset to one more focused on development rather than outsourcing/purchasing sounds like it would be a lengthy fight.
     
  17. Myne_h

    Myne_h Member

    Joined:
    Feb 27, 2002
    Messages:
    10,493
    I think you misunderstood. You seem to think I'm advocating an HDD cache with an optical primary. I'm not. I'm advocating an optical mirror of the array. The data stored is too critical to trust to offline backup or any source which can be overwritten.

    So:
    Array gets mirrored to an optical jukebox.
    Changes to files are essentially incrementally backed up to new discs.
    One small HDD + optical disc would be the consolidated index of all discs.

    If for some reason the primary array failed totally, the optical would be read-only and thus unaffected. It would be used to restore the entire array. Theoretically, if it was an attack of some form, you could narrow it down to the exact minute the corruption started, because you cannot overwrite the optical. You can write new discs, but the old ones would be intact.
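
    A toy sketch of that scheme (hypothetical names throughout; disc burning and jukebox robotics are hand-waved away):

        # Toy model of the optical-mirror idea above: an append-only index
        # recording which write-once disc holds each version of each file.
        import time
        from collections import defaultdict

        index = defaultdict(list)  # path -> [(timestamp, disc_id), ...]
        current_disc = 1           # disc currently being burned (assumed)

        def mirror_write(path):
            """Record that the latest version of `path` went onto a disc."""
            index[path].append((time.time(), current_disc))

        def restore(path, as_of=None):
            """Find the disc holding `path` as of a time (default: latest)."""
            versions = index[path]
            if as_of is not None:
                versions = [v for v in versions if v[0] <= as_of]
            return versions[-1] if versions else None

        # Because old discs are immutable, anything written after a
        # compromise sits on newer discs -- which is what lets you narrow
        # an attack down to a point in time.
        mirror_write("/data/returns/2016/example.tax")
        print(restore("/data/returns/2016/example.tax"))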


    Right. So the issue is that HP and co haven't built something like it as a complement to their spinning rust arrays. Cheers.
     
  18. obi

    obi Member

    Joined:
    Oct 16, 2004
    Messages:
    127
    More "enterprisey" versions of such a system exists. If you're interested, read up about hierarchical storage management.

    I've worked in a place that had the same thing, but with two tape libraries as backing.

    Two copies of data on disk, two copies on tape. Two disk arrays, two libraries, two locations. Eventually disk hits a high water mark, data on disk is purged based on age/size/access/<insert parameters here>. Tape copies stay forever. Backups are handled via metadata, so you can pull a file from a tape using the metadata to get info/location.
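
    A rough sketch of that high-water-mark purge behaviour (thresholds and paths are made-up examples; real HSM products apply far more policy nuance):

        # Sketch: when the disk tier passes a high-water mark, purge files
        # from disk, oldest access time first, assuming they all already
        # have tape copies. Stops once back under the low-water mark.
        import os
        import shutil

        CACHE_DIR = "/hsm/disk-tier"  # hypothetical disk-tier mount point
        HIGH_WATER = 0.90             # start purging above 90% full
        LOW_WATER = 0.75              # purge back down to 75%

        def usage_fraction(path):
            total, used, _free = shutil.disk_usage(path)
            return used / total

        def purge_candidates(path):
            """All files under the tier, oldest access time first."""
            files = []
            for root, _dirs, names in os.walk(path):
                for n in names:
                    p = os.path.join(root, n)
                    files.append((os.stat(p).st_atime, p))
            return sorted(files)

        if usage_fraction(CACHE_DIR) > HIGH_WATER:
            for _atime, p in purge_candidates(CACHE_DIR):
                os.remove(p)  # tape copies remain; a metadata stub would stay
                if usage_fraction(CACHE_DIR) <= LOW_WATER:
                    break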

    As long as you can tune the system well, and understand the workloads, it is possible to run with such a system. Tricky to get right, and can be tricky for users to understand that they can't always get a file from the depths of the past immediately. Helps to tell them to get coffee :lol:
     
  19. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,910
    Location:
    Canberra
    In 2012 R2 speak, Fujitsu was king - however that was just Storage Spaces.

    S2D is still *really* green, so a few vendors have S2D nodes that you can buy off the shelf, and from that - and the HCL - you can build things with another vendor (e.g. Supermicro).
     
  20. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,910
    Location:
    Canberra
    IBM TSM, backed by DB2 (and hilariously, TSM is what DB2 grew out of).
     
