
ESXi Snapshots

Discussion in 'Business & Enterprise Computing' started by tonner78, Oct 25, 2010.

  1. tonner78

    tonner78 Member

    Joined:
    Sep 16, 2003
    Messages:
    2,192
    Location:
    Inside the Matrix
    Hi All

    Got a bit of a problem at a site. They are running a single server with local storage and ESXi 4.0. There is also a physical vCenter server, which also runs Backup Exec with the VMware agents to do full VM backups. I've had some trouble with this but had been starting to get on top of it. However, the VM backups on one of the servers have been failing for the last couple of weeks, which I am pretty sure is because that VM has a whole bunch of snapshots that Backup Exec seems to leave behind from time to time. So I decided to commit them all, in an attempt to resolve the backup issue and also because leaving lots of snapshots uncommitted like that is bad practice (I would never have left it that way if I'd known they were there).

    Problem is, after several hours of committing, the job 'failed' because it was trying to create another snapshot file. (No idea why; it might be part of the commit process, I haven't looked that closely into the detail.) Anyway, it had run out of space on the datastore and wouldn't finish. To get the server going again I had to abort the job, delete some ISOs off the datastore to make some room (albeit only a few GB), and then start up the VM again.

    Real problem is, the snapshot manager now shows no snapshots on that VM, but if I look at the datastore all of the snapshot VMDK files are still there, indicating the commit didn't actually work. This is a big problem as these extra snapshot files are taking up over 150GB of 'free' space and we are fast running out of room.

    Anyone got any suggestions for me?

    Cheers
     
  2. exodus_68

    exodus_68 Member

    Joined:
    Jun 10, 2003
    Messages:
    354
    Location:
    Perth
    If you enable the service console on the ESXi host you can connect to the datastore using WinSCP (or similar) and take a backup of the relevant VM's directory for a start. How you enable the console depends on the version of ESXi, so I will let you find that on your own.

    Once you have a backup there are a couple things I would try.
    First, just remove the VM from the inventory (in Infrastructure Manager) and re-add it. If it recognises the snapshots it should ask you about them. Failing that (and making sure you have your backup), I would remove the VM from the inventory again, delete the snapshots from the datastore, and then re-add the VM to the inventory.

    I can't stress enough that you need to BACKUP before doing any of those operations!
     
  3. ewok85

    ewok85 Member

    Joined:
    Jul 4, 2002
    Messages:
    8,104
    Location:
    Tokyo, Japan
    Just remember, once you get it going, go to the snapshot manager and use "Delete All". Don't try to delete them one by one or anything else.

    If you have 150GB of snapshots then you have a massive issue - snapshots will expand over time until they gobble up all the space and the VM will stop once there is no space.

    Deleting the snapshot will merge the snapshot file (which expands over time) into the original virtual machine disk (VMDK - which is a fixed size) without any loss of data.

    Luckily this will work when the hard disk is full since the VMDK is pre-allocated :wired:

    The basic rule of snapshots is don't let them run any longer than you need them - make sure they get removed as soon as they are no longer needed.
     
  4. tonner78 (OP)

    tonner78 Member

    Joined:
    Sep 16, 2003
    Messages:
    2,192
    Location:
    Inside the Matrix
    I know snapshots are bad and should not be left - I didn't do it deliberately! As I said, unbeknownst to me, Backup Exec had been leaving them behind occasionally when doing its backup job.

    The problem I am stuck with is that whilst the snapshots are still there, they don't show up in the snapshot manager so I can't attempt to delete them again! I have next to zero space available to play around with, and no other data stores either..

    I'm hoping for a simple fix to make the snapshots show up again in the snapshot manager, although I realise it may be wishful thinking...
     
  5. ewok85

    ewok85 Member

    Joined:
    Jul 4, 2002
    Messages:
    8,104
    Location:
    Tokyo, Japan
    Is the VM working, and is all the current data there? If it is, and it's running off the proper vmdk, you can just delete the snapshots from the CLI...
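
    One way to check which disk file a VM is actually using is to read the fileName entries in its .vmx file. Below is a minimal Python sketch of that check, run against a copy of the .vmx text. The function name and the sample content are hypothetical, and the pattern assumes snapshot descriptors are named with a dash plus a zero-padded counter (like the Server-00005.vmdk mentioned in this thread), so a base disk that happens to end in digits would be flagged too.

```python
import re

# Matches disk entries such as: scsi0:0.fileName = "Server-000005.vmdk"
DISK_LINE = re.compile(
    r'^\s*((?:scsi|ide|sata)\d+:\d+)\.fileName\s*=\s*"([^"]+\.vmdk)"',
    re.IGNORECASE | re.MULTILINE,
)

# Assumption: snapshot descriptors end in "-" plus a 5 or 6 digit counter.
SNAPSHOT_NAME = re.compile(r"-\d{5,6}\.vmdk$", re.IGNORECASE)


def disks_on_snapshots(vmx_text):
    """Return {device: filename} for disks whose backing file looks like a snapshot."""
    return {
        device: name
        for device, name in DISK_LINE.findall(vmx_text)
        if SNAPSHOT_NAME.search(name)
    }


# Hypothetical .vmx fragment for illustration only.
sample_vmx = (
    'scsi0:0.fileName = "Server-000005.vmdk"\n'
    'scsi0:1.fileName = "Data.vmdk"\n'
    'ide1:0.fileName = "install.iso"\n'
)
print(disks_on_snapshots(sample_vmx))
```

    If the result is not empty, the VM is still running off a snapshot, and deleting the snapshot files by hand would throw away whatever has been written since the snapshot was taken.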
     
  6. tonner78 (OP)

    tonner78 Member

    Joined:
    Sep 16, 2003
    Messages:
    2,192
    Location:
    Inside the Matrix
    Yup the VM is working. But the snapshot files are still in the datastore... Just not showing up in the snapshot manager.
     
  7. tonner78 (OP)

    tonner78 Member

    Joined:
    Sep 16, 2003
    Messages:
    2,192
    Location:
    Inside the Matrix
    OK, so this just got somewhat more serious. The VM now cannot stay running as there is no longer enough physical space on the datastore, because it is still running off the snapshots, which of course grow as more data is added or created on the guest. And yes, the VM is pointed at Server-00005.vmdk - the fifth of the snapshots.

    :wired::(
     
  8. Nyarghnia

    Nyarghnia (Taking a Break)

    Joined:
    Aug 5, 2008
    Messages:
    1,274
    Having just been through a true horrorfest with Backing up Virtual machines...

    +++ StorageCraft Shadowprotect +++

    AND

    +++ StorageCraft ImageManager +++

    Run it inside your Guest Operating System (Windows only, I'm afraid).

    Create a template Virtual Machine which is your typical system build for a server...

    Run ShadowProtect, set for continuous incrementals every 15 minutes or whatever.

    Run ImageManager which handles and consolidates your Backup Images.

    Put the ShadowProtect Recovery Disk ISO into a Storage Volume (I'm assuming that you've got a spot for your ISOs).

    On the spot file recovery and server recovery is simply creating a VM, booting off the ISO, then doing a restore... BAM, server recovered... sort out your databases etc and you're back in business.

    None of this Snapshotting bullshit, none of this crap caused by expensive, overpriced agent software.

    Use Symantec to simply take a nightly tape backup (always cut a tape every day) and you're golden; you will also save a HEAP of money. It's great at managing your tapes, handling backup policies and reporting on what did (and more importantly, what did not) get backed up.

    It gets my favorite three word slogan..... "it just works".

    In my last gig, I was able to P2V using this approach and damn, it just worked. Last thing I did was migrate everything to Win2k8R2, which seems to be very good in a virtualised environment.

    -NyarghNia
     
    Last edited: Oct 25, 2010
  9. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,987
    Location:
    Canberra
    vRanger is faster, cheaper and supports File Level Restores on Windows Guests.

    Just because the "traditional" backup guys (oooooh CA, Symantec, Commvault, etc) can't get their shitty, insanely overpriced (per Guest licensing? Get Fucked) agents to work right with VCB/VADP - doesn't mean others haven't.
     
  10. Nyarghnia

    Nyarghnia (Taking a Break)

    Joined:
    Aug 5, 2008
    Messages:
    1,274
    Spent two weeks trying to get vRanger working under vSphere 4.1 on 64-bit hardware; it had the same issues as VDR 1.2 did, and the same issues Veeam did...

    maybe newer versions will work, but i have zero faith in the whole external-to-vm snapshot backup systems.

    -NyarghNia
     
  11. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,987
    Location:
    Canberra
    Who set up your VCB/VADP environment? It sounds fubar to me.
     
  12. tonner78 (OP)

    tonner78 Member

    Joined:
    Sep 16, 2003
    Messages:
    2,192
    Location:
    Inside the Matrix
    Guys, I have a potentially good solution: installing new drives to give me more space to play with. However, I started a new snapshot removal process early this morning after making some room by taking other VMs off the datastore. My question is, if I restart the management agents and cancel the snapshot process, will my VM die in the ass??
     
  13. syx

    syx Member

    Joined:
    Jul 30, 2001
    Messages:
    588
    Location:
    Mount Gambier
    *tiny derail*

    how much roughly do ShadowProtect, Veeam and vRanger cost?
    we have vSphere 4 and 3 physical servers with less than 15 VMs, if that makes a difference.
     
  14. Jimoin

    Jimoin Member

    Joined:
    Jul 26, 2002
    Messages:
    579
    Location:
    Melbourne
    ShadowProtect is about 900 per OS (and therefore per guest in a VM environment), Veeam is about 1200 per host CPU & I dunno about vRanger, prolly the same.

    We're currently moving from physical/ShadowProtect to Veeam.
     
  15. NSanity

    NSanity Member

    Joined:
    Mar 11, 2002
    Messages:
    17,987
    Location:
    Canberra
    AU$690 per CPU (most people will have 2 per host), with 3 years' maintenance, for vRanger Pro.
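
    As a rough worked comparison of the prices quoted in this thread, here is a small Python sketch for a setup like the one asked about (3 hosts, up to 15 guests). The 2 CPUs per host is an assumption, as is treating all three figures as the same currency (only the vRanger price was quoted in AUD), and maintenance terms are ignored. The function name is made up for illustration.

```python
def estimate_costs(hosts, cpus_per_host, guests):
    """Return a rough licence total per product, using the prices quoted in the thread."""
    host_cpus = hosts * cpus_per_host
    return {
        "ShadowProtect (about 900 per guest OS)": 900 * guests,
        "Veeam (about 1200 per host CPU)": 1200 * host_cpus,
        "vRanger Pro (690 per host CPU)": 690 * host_cpus,
    }


# 3 hosts, 2 CPUs each (assumed), 15 guests as an upper bound.
for product, total in estimate_costs(hosts=3, cpus_per_host=2, guests=15).items():
    print(f"{product}: {total}")
```

    With those assumptions the totals come to 13500, 7200 and 4140 respectively, so per-guest licensing is the dearest option once there are more than a handful of VMs per host.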
     
  16. syx

    syx Member

    Joined:
    Jul 30, 2001
    Messages:
    588
    Location:
    Mount Gambier
    thanks guys
     
  17. exodus_68

    exodus_68 Member

    Joined:
    Jun 10, 2003
    Messages:
    354
    Location:
    Perth
    Yeah, if it is running the snapshot process, don't touch it. Let it time out and fail, or finish by itself. Depending on the speed of your storage and the size of the snapshots this can take quite some time (I have seen 3 days). If you try to kill the process now you will most likely corrupt the vmdk(s).
     
  18. ewok85

    ewok85 Member

    Joined:
    Jul 4, 2002
    Messages:
    8,104
    Location:
    Tokyo, Japan
    Took our server with 15k SAS drives in RAID10 about 14hrs to process a 200GB snapshot (same situation as the OP - long forgotten snapshot gobbles up the entire drive and the VM dies - a lovely first day/monday morning surprise)
     
  19. tonner78 (OP)

    tonner78 Member

    Joined:
    Sep 16, 2003
    Messages:
    2,192
    Location:
    Inside the Matrix
    Yeah, so the long and the short of it is that I got it all sorted in the end. Elected not to risk cancelling the snapshot removal, so just sucked it up and kept waiting. It finished at about 10am yesterday morning, after starting at about 1:30am. There was about 300GB of snapshot data that had to be committed. I moved the other two VMs off the datastore to make room to do it, then copied them back once the commit had finished.

    So all's well that ends well :)
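
    For anyone trying to guess how long a commit like this might take, the two data points in this thread give a very rough range. The Python sketch below only does the arithmetic: about 300GB in 8.5 hours here, and about 200GB in 14 hours in the earlier RAID10 example. Real rates depend heavily on the storage and on how busy the VM is, so treat the result as a ballpark, and note that the function names are made up for illustration.

```python
def commit_rate_gb_per_hour(size_gb, hours):
    """Observed commit throughput in GB per hour."""
    return size_gb / hours


def estimate_hours(size_gb, rate_gb_per_hour):
    """Estimated hours to commit size_gb at the given rate."""
    return size_gb / rate_gb_per_hour


# Data points taken from the thread (both sizes are approximate).
rate_this_thread = commit_rate_gb_per_hour(300, 8.5)  # 1:30am to 10am
rate_raid10_example = commit_rate_gb_per_hour(200, 14)

# Hypothetical 150GB of snapshot data at each observed rate.
fast = estimate_hours(150, rate_this_thread)
slow = estimate_hours(150, rate_raid10_example)
print(f"150GB: roughly {fast:.1f} to {slow:.1f} hours")
```

    On those numbers, 150GB of snapshot data would take somewhere between about 4 and 10.5 hours.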
     
  20. exodus_68

    exodus_68 Member

    Joined:
    Jun 10, 2003
    Messages:
    354
    Location:
    Perth
    good to hear it tonner.

    I am trying to remember the name of a product I was once told about which monitors snapshots across the virtual infrastructure and produces reports automatically. It might be pretty useful in your situation, where you aren't creating these snapshots manually and so don't know they exist until you have a problem.

    Watch this space I am sure it will come back to me.
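
    In the meantime, the same idea can be roughed out with a short script. This is only a sketch, not the product mentioned above: it walks a directory tree that mirrors the datastore (for example a mounted or copied VM folder, which is an assumption), picks out files whose names look like snapshot disks, and totals their size. The naming pattern and function names are assumptions for illustration.

```python
import re
from pathlib import Path

# Assumption: snapshot disks are named <vm>-NNNNNN.vmdk or <vm>-NNNNNN-delta.vmdk.
SNAPSHOT_FILE = re.compile(r"-\d{5,6}(?:-delta)?\.vmdk$", re.IGNORECASE)


def find_snapshot_files(root):
    """Return (path, size_bytes) for snapshot-like files under root, largest first."""
    found = [
        (path, path.stat().st_size)
        for path in Path(root).rglob("*.vmdk")
        if path.is_file() and SNAPSHOT_FILE.search(path.name)
    ]
    return sorted(found, key=lambda item: item[1], reverse=True)


def total_bytes(found):
    """Total size in bytes of the files returned by find_snapshot_files."""
    return sum(size for _, size in found)


def format_report(found):
    """Build a plain-text report listing each snapshot file and the overall total."""
    lines = [f"{size / 1024**3:8.2f} GB  {path}" for path, size in found]
    lines.append(f"{total_bytes(found) / 1024**3:8.2f} GB  total in {len(found)} file(s)")
    return "\n".join(lines)
```

    Running something like print(format_report(find_snapshot_files("/path/to/datastore/copy"))) on a schedule, with the path being a placeholder, would at least show leftover snapshot files before they fill the datastore.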
     
