
When backing up more than 1 VM at a time, job fails

Hi all. We just got this product installed (purchased 6 keys for 3 dual-CPU servers). I'm noticing that backup jobs are being marked as failed if I put a bunch of machines under one job. Should I be setting up machines individually? I have about 50 virtual machines (not all need to be backed up, though). So are we talking about a bunch of individual jobs then? Originally I had backup jobs defined per datastore (each datastore holds maybe 15 machines). Maybe that is too many VMs per job?

Though when I look at recovery points, the majority of the machines are backing up. Then the next day maybe different ones back up. The problem is they don't ALL back up, and after spending a couple grand on software we would hope to get this functionality working so that we can back up everything we need and rely on it.

Here is the error:

Task 'vm3-sas backup' failed: 'Failed to create the backup.
Additional info:
--------------------
Error code: 3
Module: 435
LineInfo: 555b5abba0950053
Fields:
Message: Failed to create the backup.
--------------------
Error code: 32786
Module: 114
LineInfo: 28314c961de7d2fd
Fields:
Message: Failed to prepare for backing up.
--------------------
Error code: 353
Module: 149
LineInfo: a71592046cb2c5c6
Fields:
Message: Failed to back up the group.
--------------------
Error code: 2
Module: 218
LineInfo: 600f7166ce397de5
Fields:
Message: Error occurred while running the backup and recovery engine.
--------------------'.

Well, it seems it can do concurrent tasks. However, days later I still get this same error. Some machines back up fine while others do not. At first I thought maybe it was the Linux VMs that don't, so I separated the job after seeing this in the logs:
Failed to detect GRUB loader.
Additional info:
--------------------
Error code: 7
Module: 57
LineInfo: 5186ab5c3563a7c3
Fields:
Message: Failed to detect GRUB loader.
--------------------
Error code: 7
Module: 57
LineInfo: 85de7f4c6a5777a5
Fields:
Message: Failed to process GRUB on disk '\comp_emu(vm://86E16424-D25B-4E0D-AB44-B3544AC9B64A/4221b751-85bb-f9e3-1af1-0cfe86b78556?host=host-19&type=vmwesx)\hd_emu(1)'.
--------------------
Error code: 7
Module: 57
LineInfo: 85de7f4c6a57748f
Fields:
Message: Failed to find file '/grub/stage2'.
--------------------

But that's not it. Every night two other jobs show the same error. Here is one from last night, same error for 1/31/2012 23:59:41 and 1/31/2012 00:30:25

Task 'vm2-sas backup' failed: 'Failed to create the backup.
Additional info:
--------------------
Error code: 3
Module: 435
LineInfo: 555b5abba0950053
Fields:
Message: Failed to create the backup.
--------------------
Error code: 32786
Module: 114
LineInfo: 28314c961de7d2fd
Fields:
Message: Failed to prepare for backing up.
--------------------
Error code: 353
Module: 149
LineInfo: a71592046cb2c5c6
Fields:
Message: Failed to back up the group.
--------------------
Error code: 2
Module: 218
LineInfo: 600f7166ce397de5
Fields:
Message: Error occurred while running the backup and recovery engine.
--------------------'.

Note that the line info, module and error code are always the same! That has to pinpoint something, right?
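For anyone who wants to check this across many nights of logs rather than by eye, here is a rough sketch in Python. It assumes only the text layout shown in the errors pasted above (Error code / Module / LineInfo lines); the function names are made up for illustration and this is not an Acronis tool or API.

```python
import re

# Two frames copied from the error pasted above, used as sample input.
SAMPLE = """\
Error code: 3
Module: 435
LineInfo: 555b5abba0950053
Fields:
Message: Failed to create the backup.
--------------------
Error code: 32786
Module: 114
LineInfo: 28314c961de7d2fd
Fields:
Message: Failed to prepare for backing up.
--------------------
"""

# One frame = three consecutive lines in the layout shown in the posts.
FRAME = re.compile(
    r"Error code:\s*(?P<code>\d+)\s*\n"
    r"Module:\s*(?P<module>\d+)\s*\n"
    r"LineInfo:\s*(?P<lineinfo>[0-9a-fA-F]+)"
)


def signature(log_text):
    """Return the error stack as a tuple of (code, module, lineinfo) triples."""
    return tuple(
        (int(m["code"]), int(m["module"]), m["lineinfo"].lower())
        for m in FRAME.finditer(log_text)
    )


def same_failure(log_a, log_b):
    """True if both logs contain a non-empty, identical error stack."""
    sig_a = signature(log_a)
    return bool(sig_a) and sig_a == signature(log_b)


if __name__ == "__main__":
    for code, module, lineinfo in signature(SAMPLE):
        print(f"code={code} module={module} lineinfo={lineinfo}")
```

Feeding it the vm2-sas and vm3-sas errors should report the same signature, which at least confirms both jobs are hitting the same failure path.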

I'm seeing 19 virtual machines that are not currently up to date. Backups are supposed to run every night, so today (2/2) all of these machines should have a recovery point of 2/1. The latest recovery point among the 19 problem VMs is 1/30, but there are quite a few where the last recovery point is 1/25. Here's the breakdown:

Last recovery point of 1/24 - 1 machine
Last recovery point of 1/25 - 10 machines
Last recovery point of 1/26 - 2 machines
Last recovery point of 1/28 - 4 machines
Last recovery point of 1/29 - 1 machine
Last recovery point of 1/30 - 1 machine

Up to date, as expected:
Last recovery point of 2/1 - 8 machines

Job 1 - vm2-sas backup - 8 machines selected, only 2 are current

Job 2 - vm3-sas backup - 15 machines selected, only 3 are current

Job 3 - Linux based VMs (CBT off) - 3 machines selected, only 2 are current

Job 4 - A single machine - it is current.
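As a quick sanity check on the numbers above, a few lines of Python show that the two lists agree: the per-job counts and the recovery point breakdown both come out to 19 stale machines out of 27 selected. The figures are copied from this post; the variable names are just for illustration.

```python
# Stale machines grouped by last recovery point (from the breakdown above).
stale_by_date = {"1/24": 1, "1/25": 10, "1/26": 2, "1/28": 4, "1/29": 1, "1/30": 1}

# (machines selected, machines current) for each job listed above.
jobs = {
    "vm2-sas backup": (8, 2),
    "vm3-sas backup": (15, 3),
    "Linux based VMs (CBT off)": (3, 2),
    "A single machine": (1, 1),
}

total_selected = sum(selected for selected, _ in jobs.values())
total_current = sum(current for _, current in jobs.values())
total_stale = sum(stale_by_date.values())

# The two lists in the post should describe the same set of machines.
assert total_selected - total_current == total_stale

if __name__ == "__main__":
    for name, (selected, current) in jobs.items():
        print(f"{name}: {current}/{selected} current ({current / selected:.0%})")
    print(f"Total: {total_current}/{total_selected} current, {total_stale} stale")
```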

In an ideal world all of these should show 2/1 as the last recovery point. Yes I did open a case and I still have not heard back. Looks like I'll have to add these machines back into our Symantec BackupExec selection list because this is a production environment. Backups are pretty important!

We are seeing the same problem you are! I have also opened up a support case and am waiting to hear back. Talked to someone today in the support group who was NOT helpful. It really looks like Acronis support is not yet ready to actually deal with the issues at hand. The agents I have dealt with via e-mail did not read my support request correctly and could not answer back. Called in today and waiting to have the issue escalated.

Out of memory problem

There seems to be a problem with memory usage of the AcronisAppliance. The /var/log/messages log of the Appliance contains the following lines:

Feb 15 01:37:50 (none) user.warn kernel: Out of memory: kill process 19291 (sh) score 7509 or a child
Feb 15 01:37:50 (none) user.warn kernel: Killed process 19292 (demon) vsz:958144kB, anon-rss:631652kB, file-rss:0kB
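To see how often this happens, the "Killed process" lines can be pulled out of a copy of the log with a small Python script. This is only a sketch: the pattern is written against the two lines quoted above and may need adjusting for other kernel versions, and the function name is made up.

```python
import re

# The two log lines quoted above, used as sample input.
LINES = [
    "Feb 15 01:37:50 (none) user.warn kernel: Out of memory: kill process 19291 (sh) score 7509 or a child",
    "Feb 15 01:37:50 (none) user.warn kernel: Killed process 19292 (demon) vsz:958144kB, anon-rss:631652kB, file-rss:0kB",
]

# Matches only the "Killed process" line format shown in this post.
KILLED = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"vsz:(?P<vsz>\d+)kB, anon-rss:(?P<rss>\d+)kB"
)


def oom_kills(lines):
    """Return one dict per killed process, with memory sizes converted to MiB."""
    kills = []
    for line in lines:
        m = KILLED.search(line)
        if m:
            kills.append({
                "pid": int(m["pid"]),
                "name": m["name"],
                "vsz_mib": int(m["vsz"]) / 1024,
                "anon_rss_mib": int(m["rss"]) / 1024,
            })
    return kills


if __name__ == "__main__":
    for kill in oom_kills(LINES):
        print(f"{kill['name']} (pid {kill['pid']}): "
              f"{kill['anon_rss_mib']:.0f} MiB resident, {kill['vsz_mib']:.0f} MiB virtual")
```

For the lines above, this shows the killed process holding roughly 617 MiB of resident memory at the time.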

Today Acronis Phone Support forwarded the case to a technician.

I ended up splitting every server backup into its own individual job, each pointing to its own TIB file. A little more time-consuming to set up, but at least it's easier to see which machine is failing the backup now.

I have vmProtect6 backing everything up on one of my datastores and vmProtect7 beta backing up everything that resides on another datastore.
Results from last night's pass?
vmProtect6
10 machines setup to backup
3 machines failed to back up (always with the same errors: failed to perform the requested operation, failed to open the VM, failed to create a quiesced snapshot, create snapshot failed). One of these machines is an HR payroll server, and it's really important that it gets backed up.
Failed - 2 Windows 2008 32-bit servers, 1 Windows 2003 R2 32-bit server.

This past weekend more jobs ran, because for some machines once or twice a week is sufficient (they don't change much). I had 2 out of 9 jobs fail: 1 Windows 2008 R2 64-bit server and 1 Linux box.

vmProtect7 beta
The only thing that failed was our mail server (a 2003 R2 box with Exchange 2007). There are 10 jobs, so 9 went through. One of them has a warning, "Failed to detect GRUB loader", and it's an Ubuntu Linux box. Despite the warning, I think it still backed up, because there's a 3.5 GB TIB file with its name out on my system.

vCenter stuck tasks
The other thing I see is stuck tasks in vCenter. For example, today (2/15/2012) I see a task stuck at 98% in vCenter, initiated by my user name, with a requested start time of 2/11/2012 6:31:05 AM. The task is titled Acronis vmProtect Backup. There's no way to get it out of there unless I restart the vCenter server.

KJSTech - Good info! From your notes it really looks like vmProtect7 may be an improvement! Question... are you using the appliance or an agent installed on a Windows system for the beta?

Similar question for all on this thread... what is everyone using to run the backups? I.e., is it the case that all of us are using the virtual appliance and seeing these errors? By contrast, is there a better experience using a Windows server to run the backup agent? If so, what kind of system is being used successfully?

Please let us all know!

KurtA - how did you get access to the virtual appliance? Did you ssh in? Are the credentials needed published?

MickyW - I followed the procedure outlined in the article http://kb.acronis.com/content/4568

- Open the vSphere Client and click on the Virtual Appliance;
- Click the Console tab;
- Press Ctrl-Alt-F2 to enter the command prompt;

This opens a root shell in the Virtual Appliance. The following command starts the SSH server:

# /bin/sshd

Now you can use your favorite SSH client to connect to the Appliance with the same credentials you use to connect to the web frontend with your browser.

MickyW wrote:

KJSTech - Good info! From your notes it really looks like vmProtect7 may be an improvement! Question... are you using the appliance or an agent installed on a Windows system for the beta?

Similar question for all on this thread... what is everyone using to run the backups? I.e., is it the case that all of us are using the virtual appliance and seeing these errors? By contrast, is there a better experience using a Windows server to run the backup agent? If so, what kind of system is being used successfully?

Please let us all know!

I have both vmProtect6 and the vmProtect7 beta installed as virtual appliances.

I haven't thought about spinning it up as an agent installed on a Windows box. I just don't like the idea of having a full-blown Windows server eating resources and requiring updates and all that stuff. Lean and mean is how I like it!

I actually have auto updates disabled on my Windows server. So far, it's running fine.