Replication to 2nd Vault Fails When Agent Machine is Interrupted
Small office environment with about 18 Win7 workstations. Using Acronis Backup 11.5 for daily backups with offsite replication over VPN. VPN maxes out at around 20Mbps, so a typical full backup can take several hours to replicate offsite. I've re-created a GFS scheme using unique custom plans for each workstation, which allows me to stagger the backups so that each machine has its full backup on a different day of the month. So, each evening from 7pm to about 2am most machines do daily incremental backups, three do their weekly differential, and one has its monthly full backup.
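To put rough numbers on "several hours", here is a minimal back-of-the-envelope sketch. The function name `transfer_hours` and the `efficiency` parameter are illustrative assumptions, not anything from Acronis. It uses decimal gigabytes and ignores protocol overhead unless `efficiency` is lowered.

```python
def transfer_hours(size_gb: float, link_mbps: float = 20.0, efficiency: float = 1.0) -> float:
    """Estimate hours needed to push size_gb (decimal GB) over a link.

    link_mbps is the link speed in megabits per second (20 for the VPN
    described above). efficiency is an assumed fraction of the link that
    is actually usable (1.0 = ideal, no overhead).
    """
    if size_gb < 0 or link_mbps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    megabits = size_gb * 8 * 1000          # GB -> megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical backup sizes, purely for illustration.
    for size in (9, 40, 100):
        print(f"{size} GB: {transfer_hours(size):.1f} h at 20 Mbps")
```

At an ideal 20 Mbps, every 9 GB costs one hour, so a full backup of a few tens of GB easily occupies most of the overnight window.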
The first backup location is a centralized managed vault stored on an in-house NAS, attached to the Acronis Storage Node. Most recent backups are validated after full and differential. Plans are configured to immediately replicate to a second location - another managed vault stored on an offsite NAS, owned by the same Acronis Storage Node. These are set to validate only after full backup, which can take another few hours back over the VPN.
The problem I'm having seems to be a three-way discrepancy between what the documentation says should happen, what common sense dictates, and what Acronis actually does.
According to Documentation:
http://www.acronis.com/en-gb/support/documentation/ABR11.5/index.html#1…
"Copying [...] a backup from any location is initiated by the agent that created the backup, and is performed [...] by the corresponding storage node." Also, lower down, it states, "Copying [...] a backup from one managed vault to another managed vault is performed by the storage node." I understand that task scheduling is managed by the AMS server, but actually triggered and run by each workstation's agent. The local agent performs the compression and pushes the backup to the first location - in this case, a vault on the storage node. The agent then "initiates" the replication to the second location.
According to Common Sense:
The machines should be woken on schedule by the AMS. AMS should tell the workstation's agent what to back up, how, and where. The agent on the workstation should perform the backup and compression, and then return the machine to its previous state. It can and should return the machine to a sleep or powered-down state. All further validation and replication tasks should be initiated and performed by the AMS, and should not require that the workstation keeps running.
What Acronis Actually Does:
Backup to the first location is performed by the local agent. Then the replication to the second location is run by the local agent (NOT THE STORAGE NODE AS DOCUMENTED) over the VPN. Next, the second location's backup is validated over the VPN. Only after that long, multi-hour replication task has completed does validation of the first location's backup start. The source workstation must remain running with a solid network connection for the ENTIRE process. If any glitch or hiccup occurs, the entire job is marked as FAILED.
This has severe consequences. Even though the first vault successfully received the full backup, the failure of the job as a whole means that the following day ANOTHER FULL BACKUP will run for that machine instead of an incremental. And when it comes time to replicate, it will attempt to copy yesterday's full backup AND today's full backup. The problem snowballs when the slower offsite replication runs into the next working day. The user of the machine may restart their workstation, cancelling the entire process once again. Each following evening, the failed backup chain just gets longer and longer and never resolves.
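The snowball effect can be illustrated with a toy model. This is only a sketch of the behaviour as I observed it, not of Acronis internals: it assumes replication is all-or-nothing per night, that a failed job forces a new full backup the next night, and that unreplicated backups stay queued. The function name and all the hour figures are hypothetical.

```python
def simulate_backlog(full_hours, incr_hours, window_hours, nights):
    """Toy model of the replication backlog for one workstation.

    full_hours / incr_hours: assumed time to replicate one full or one
    incremental backup offsite. window_hours: how long the workstation
    stays up before someone restarts it. Returns a list of
    (pending_hours, succeeded) tuples, one per night.
    """
    pending = 0.0
    next_is_full = True          # the first backup of a chain is a full
    history = []
    for _ in range(nights):
        pending += full_hours if next_is_full else incr_hours
        succeeded = pending <= window_hours
        history.append((pending, succeeded))
        if succeeded:
            pending = 0.0        # everything reached the offsite vault
            next_is_full = False
        else:
            next_is_full = True  # failed job -> another full tomorrow
    return history


if __name__ == "__main__":
    # Hypothetical: a full needs 5 h offsite, but the machine only stays up 4 h.
    for night, (hours, ok) in enumerate(simulate_backlog(5, 0.5, 4, 4), start=1):
        print(f"night {night}: {hours:.1f} h queued, {'ok' if ok else 'FAILED'}")
```

In this model, once a single full backup does not fit in the window, the queue grows by one full backup every night and never recovers on its own.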
It seems that if I could schedule replication and validation as completely independent tasks, I might be able to solve the problem and have a fully automated offsite solution. But this option is unavailable in the current settings. How can I replicate backups from a managed vault to an offsite vault without having to rely on the local agents running all day and night? Shouldn't the storage node be doing that?


Thank you for the reply, and for acknowledging the bug. Is there a suggested workaround?
Best I can figure is to have the backup plans replicate to a second UN-managed vault on the local network, and then use some third-party tool to sync them from there to the offsite location. Vault export is not a viable option, since it cannot be scheduled, and would consolidate the backup chains. Either way, I'm looking at double the storage on the NAS to make this work. How are others doing automated offsite backup with this product?

There is no workaround for this bug.
Can't advise on best practices for your case, unfortunately.

In case anyone else has been struggling with this, I have experimented and found a solution that works well for me. The key is not to use Acronis to replicate over slower connections at all. All backup tasks write to storage on the local network. Offsite vault replication is handled by separate synchronization software. This allows the workstations to back up and shut down, staying awake for at most one hour. Details of my new setup are as follows:
I use two vaults on a local NAS: One very large, managed vault with 6 months of retention, and one smaller, unmanaged vault retaining only 6 weeks. Backup tasks use the managed vault as a primary storage location, and then validate and replicate to the unmanaged vault as a secondary location. I then use BTSync to automatically replicate and verify the entire unmanaged vault over the VPN to the offsite NAS.
I tested Synchredible, GoodSync, and several others, but found that BTSync has many advantages: Because it is peer-to-peer, the hash checking does not require re-transmitting an entire .TIB file back to the source, so synchronization completes in about half the time. It indexes files block by block, so it can pause and resume without having to start over again if interrupted. You can set it up to encrypt and transmit over WAN at your maximum upload speed, but I prefer to keep all sensitive data tunneled through the VPN, even if slightly slower. Another advantage is that the BTSync client runs on many platforms, including a plugin for FreeNAS, which is ideal for me, since it means not having to set up a resource-heavy Windows VM at the offsite location.
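To show why block-by-block indexing makes pause/resume and remote verification cheap, here is a minimal sketch of the general idea. It is NOT BTSync's actual implementation: the block size, the choice of SHA-256, and the function names are all assumptions for illustration. Each side hashes its own copy locally, and only the short hash lists need to be compared.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size (4 MiB), illustration only


def block_hashes(stream, block_size=BLOCK_SIZE):
    """Return a list of SHA-256 hex digests, one per fixed-size block.

    stream is any binary file-like object, e.g. open(path, "rb").
    """
    hashes = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        hashes.append(hashlib.sha256(block).hexdigest())
    return hashes


def blocks_to_send(source_hashes, dest_hashes):
    """Indices of blocks the destination is missing or holds incorrectly."""
    return [
        i
        for i, digest in enumerate(source_hashes)
        if i >= len(dest_hashes) or dest_hashes[i] != digest
    ]
```

With an interrupted transfer, the destination's hash list is simply shorter (or differs in the last block), so only the remaining blocks need to be sent rather than the whole .TIB file. Verification works the same way: an empty result from `blocks_to_send` means both copies match, without shipping any file data back over the VPN.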
Limited local storage space may make it hard to justify two local vaults (with mostly the same data), but having that second, unmanaged vault does make de-duplicating the primary vault a more attractive possibility. You could save space, and always have the other to fall back on. I also think that an unmanaged vault would make disaster recovery go a little smoother.