Active/Passive HA Backing Store

Discussion in 'Other Operating Systems' started by gdjacobs, Sep 17, 2018.

  1. gdjacobs

    gdjacobs Member

    Joined:
    Apr 3, 2007
    Messages:
    837
    Location:
    MB, Canada
    I'm looking at providing replication for a KVM setup, so I'm reading some literature to get a sense of what the best approach is for data replication. I'd prefer to use something free but off the shelf.

    A solution like Gluster exporting iSCSI LUNs might work, but it appears its quorum methods are not very applicable for n=2. Failover-capable NAS platforms don't appear to be a thing. So far, I'm leaning towards DRBD either on bare metal or on top of MD on the VM hosts themselves. Of course, this is somewhat limited in terms of expansion capability.
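    To illustrate why quorum is awkward at n=2, here is a minimal sketch of generic strict-majority quorum arithmetic. It is not Gluster's actual implementation, and the function names are made up for illustration.

```python
def majority_quorum(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can be lost while a strict majority remains."""
    return total_nodes - majority_quorum(total_nodes)


def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True if the reachable nodes still form a strict majority."""
    return reachable_nodes >= majority_quorum(total_nodes)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, majority_quorum(n), tolerated_failures(n))
```

    With two nodes the majority is both of them, so losing either node loses quorum. A third voter (even a lightweight arbiter or witness) is what lets a cluster survive one failure.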

    What have others used in the past and had success with?
     
    Last edited: Sep 17, 2018
  2. Primüs

    Primüs Member

    Joined:
    Apr 1, 2003
    Messages:
    3,348
    Location:
    CFS
    A lot of what you're discussing starts jumping into the enterprise hardware realm and may be better suited to the enterprise forum. I'm sure there are SAN products that can help with redundancy across supervisor modules along with disk shelves etc.

    In terms of software, all I've really seen used is GlusterFS and Ceph. You manage the hardware underneath, and chuck this software on top to create a single-target but redundant file system. A common setup I've used with GlusterFS is 2x RAID6 servers in pure replication, so it's just a 1:1 mirror.

    I believe elvis uses GlusterFS in his heavily redundant environment so might be able to offer better advice.
     
  3. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    33,804
    Location:
    Brisbane
    Gluster never worked well for me for VM storage. It's a great ad-hoc file replicator, but things go a bit screwy for old Gluster under VM/image-style workloads, especially under high IO.

    I've used DRBD before, and it worked far better for a small VM deployment. If you're hell-bent on system-level redundancy for real HA, that's the way I'd go.

    Even then, that's not a zero-maintenance solution. Good checking/Nagios-style stuff to monitor and alert, especially for flapping services, is recommended. If you absolutely 100% need real HA, then fair enough. But these days I try to talk businesses down off the ledge of wanting HA, and look instead to less immediate but more reliable replication tools that allow a window to boot up a backup system somewhere else on the network if the primary has failed. Things like Btrfs/ZFS snapshots and send/receive can get you 90% of the way towards an HA setup with a fraction of the heartache, depending on business requirements.
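    As a rough sketch of the snapshot plus send/receive approach with ZFS, something like the following could be run from cron. The dataset name, standby hostname and the "repl-" snapshot prefix are all hypothetical, and it assumes passwordless SSH to the standby and an existing earlier snapshot on both sides for incremental sends.

```python
import subprocess
from datetime import datetime, timezone
from typing import List, Optional


def snapshot_name(dataset: str, when: datetime) -> str:
    """Build a timestamped snapshot name (the 'repl-' prefix is an assumption)."""
    return f"{dataset}@repl-{when:%Y%m%d-%H%M}"


def snapshot_cmd(snapshot: str) -> List[str]:
    return ["zfs", "snapshot", snapshot]


def send_cmd(new_snapshot: str, previous: Optional[str] = None) -> List[str]:
    """Full send if there is no previous snapshot, otherwise incremental (-i)."""
    if previous is None:
        return ["zfs", "send", new_snapshot]
    return ["zfs", "send", "-i", previous, new_snapshot]


def receive_cmd(standby_host: str, target_dataset: str) -> List[str]:
    """Receive on the standby over SSH; -F rolls the target back to match."""
    return ["ssh", standby_host, "zfs", "receive", "-F", target_dataset]


def replicate(new_snapshot: str, previous: Optional[str],
              standby_host: str, target_dataset: str) -> None:
    """Take a snapshot and pipe 'zfs send' into 'zfs receive' on the standby."""
    subprocess.run(snapshot_cmd(new_snapshot), check=True)
    sender = subprocess.Popen(send_cmd(new_snapshot, previous),
                              stdout=subprocess.PIPE)
    receiver = subprocess.run(receive_cmd(standby_host, target_dataset),
                              stdin=sender.stdout)
    sender.stdout.close()
    if sender.wait() != 0 or receiver.returncode != 0:
        raise RuntimeError(f"replication of {new_snapshot} failed")


if __name__ == "__main__":
    # Hypothetical names: adjust to the real pool, dataset and standby host.
    now = datetime.now(timezone.utc)
    print(snapshot_cmd(snapshot_name("tank/vms", now)))
```

    The command builders are kept separate from replicate() so the exact commands can be printed and checked before anything touches a pool.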
     
  4. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,195
    Location:
    Canberra
    Didn't Ceph recently get canned by some major players due to performance issues with VM hosting?

    I know Daemon humps virtuozzo's leg 24/7.
     
  5. Daemon

    Daemon Member

    Joined:
    Jun 27, 2001
    Messages:
    5,350
    Location:
    qld.au
    You're simply not going to get what you want for free, and for active/passive you're 10x better off simply having a backup system that works well.

    Here's the thing. Storage is far more complicated than you first think. Add in replication (sync and async) and you've increased the complexity by about 5x.

    All of the systems like Ceph, Gluster and so forth are great (albeit slow) until they're not. When things don't work, you then need to be an expert in how the storage system you're using functions. This is what kills you: there is no easy out.

    If you need HA, you don't need on-prem kit. Leave the HA bits to the big guys and focus on fast recovery where required.
     
  6. OP
    gdjacobs

    gdjacobs Member

    Joined:
    Apr 3, 2007
    Messages:
    837
    Location:
    MB, Canada
    I've already got a monitoring infrastructure up, so that's not a problem. ZFS snapshots might be a good option. Being up to date on an hourly basis would be a good target for my application. Lower I/O performance isn't too much of an issue, either, as our needs are modest.
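    A Nagios-style check for that hourly target could look roughly like this. The 0/1/2 return values follow the usual plugin exit-code convention (OK/WARNING/CRITICAL), while the thresholds and function name are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def check_replica_age(
    last_snapshot: datetime,
    now: datetime,
    warn_after: timedelta = timedelta(minutes=75),  # assumed: hourly + slack
    crit_after: timedelta = timedelta(hours=2),     # assumed: one missed run
) -> Tuple[int, str]:
    """Compare the age of the newest replicated snapshot to the thresholds."""
    age = now - last_snapshot
    minutes = int(age.total_seconds() // 60)
    if age >= crit_after:
        return CRITICAL, f"CRITICAL - replica is {minutes} min old"
    if age >= warn_after:
        return WARNING, f"WARNING - replica is {minutes} min old"
    return OK, f"OK - replica is {minutes} min old"


if __name__ == "__main__":
    # Example with made-up timestamps; a real check would read the newest
    # snapshot time from the standby and exit with the returned code.
    now = datetime(2018, 9, 18, 15, 0, tzinfo=timezone.utc)
    code, message = check_replica_age(now - timedelta(minutes=30), now)
    print(code, message)
```

    The warning threshold sits a little above one hour so that a snapshot taken just before the check does not cause flapping alerts.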

    Elvis, you mentioned "old gluster". I know Gluster has been working on implementing erasure coding and I think they changed their recommended backing store after the version you deployed and documented. Are you thinking of a newer Gluster release in comparison or a different solution?

    I knew HA storage was tricky; I just wasn't sure how the options compared in production and, therefore, which direction I should leap in. I think I've got a better grip on where I need to go now.

    Thanks, guys!
     
    Last edited: Sep 18, 2018
