Bug Report - Infrastructure update needed
As a paying user of Online B&R for Windows Server, I rely heavily on the ability to back up as intended: online, to Acronis' servers. After a mostly smooth evaluation period we purchased the year-long subscription and continued backing up as planned. Toward the end of the eval period I had opened a support ticket about the "Cleanup" schedule (retaining and combining incremental backups) failing intermittently. As this was neither serious nor frequent, it wasn't a big deal.
Since purchasing the online license, though, the problem has gotten much worse. Now there's about a 50/50 chance that I make it through a basic incremental online backup without incident. God forbid I try to verify an online archive... I ended up getting the Tier 1 support run-around of sending various logs, ending with a rather ludicrous request for a full TCP dump of our webserver running the software while "recreating the issue" (a TCP dump for 1-2 days until I get an error?), which, for the security and privacy of my clients, I naturally declined. Digging deeply into this myself, I've ruled some things out. The failures occur due to too many network timeouts during the operation. I receive instant notification from our dedicated hosting company should our server experience any downtime or service connectivity issues, and no such issues have been reported. The backups do complete eventually, so it's not an issue with our server's filesystem or anything like that. The application is up to date and the schedule plan has been re-created (requested several times while attempting to move support on to something more useful). From this, I conclude that the connection issue lies with Acronis.
This is supported by some specific numbers too. Shortly before a series of "problem with network connection to the required files" errors, followed by retries and then write-error messages that led to a plan failure, a packet analysis shows that traffic to the Acronis online backup servers, which operate in the 209.239.125.0/24 range, switched from 209.239.125.153 to 209.239.125.146. I don't know the particular setup that Acronis is running, but that, combined with the end result of a write error, leads me to believe that Acronis load-balanced the connection to a different server in the middle of a write operation, which suggests high network or server load at their end. I could be wrong in this conclusion, but based on the evidence and what has been ruled out, it seems likely. This is further supported by the result of moving the backup schedule to a less "typical" server maintenance time. The problem seems to happen less with a 5AM scheduled backup than with a 2AM backup, and I'm guessing 2AM falls in a fairly common range of times at which Acronis customers schedule online backups for their servers. This leads me to believe that Acronis needs to upgrade their infrastructure to provide the service they have advertised and received money for.
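For anyone who wants to check their own capture for the same pattern, here is a rough Python sketch of the check described above. It assumes the remote addresses have already been pulled out of the capture, in order, as plain strings; the function name and the sample list are made up for illustration, and only the two 209.239.125.x addresses come from this report.

```python
import ipaddress

# Assumption: the /24 that contains the two server addresses quoted in the report.
BACKUP_NET = ipaddress.ip_network("209.239.125.0/24")

def find_server_switches(peers):
    """Return (index, old_ip, new_ip) for each change of backup server.

    `peers` is the ordered list of remote IP strings seen in a capture.
    Addresses outside BACKUP_NET are ignored.
    """
    switches = []
    current = None
    for i, ip in enumerate(peers):
        if ipaddress.ip_address(ip) not in BACKUP_NET:
            continue  # unrelated traffic, not a backup server
        if current is not None and ip != current:
            switches.append((i, current, ip))
        current = ip
    return switches

# Hypothetical capture; 192.0.2.10 is a documentation address used as noise.
sample = ["209.239.125.153", "209.239.125.153", "192.0.2.10",
          "209.239.125.146", "209.239.125.146"]
print(find_server_switches(sample))
```

A switch on its own doesn't prove load balancing, but lining up the reported index with the timestamp of the first write error would show whether the two coincide.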
Well there's the long bug report; here's the short one.
Symptom: Online backup activities fail intermittently due to network connectivity issues leading to a write-error
Prereq: Client has no recorded network or machine issues on their end
Bug: Acronis online servers get overloaded and time out or load-balance in the middle of a write operation, resulting in a write error.
Solution: Increase capacity of online infrastructure. Do not load balance in a way that interrupts write operations.
There, I've done part of your job. Please do the rest.


Is there a way to enforce a "graceful failure" of some sort? I have my error handling set to 10 retry attempts 20 minutes apart and my task failure handling set to five task restarts 30 minutes apart (that's 50 total connection attempts over the course of nearly an entire calendar day). When this results in an overall task failure as described above, the task hangs in a "Needs Interaction" state, meaning I have to log in to the server and click an "OK" box before the scheduler will ever fire another backup task. If this happens on, say, a Friday night, the server will not run any backup tasks all weekend until I go back to work Monday. Naturally, this does not provide even a "daily backup" attempt, which further undermines Acronis Online Backup as a daily backup solution. Yes, the task has "silent mode" set, though I can't imagine what it actually does if the task still requires user interaction.
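For what it's worth, the retry budget above works out roughly as follows. This is a back-of-the-envelope sketch and not how the product itself counts: it assumes five task runs of 10 retries each (which gives the 50 attempts quoted), counts only the waiting time between attempts, and ignores however long each attempt takes before it times out. The function name is made up.

```python
def retry_budget(retries_per_run, retry_gap_min, runs, restart_gap_min):
    """Return (total attempts, total minutes spent waiting between them)."""
    attempts = retries_per_run * runs
    # Each retry is preceded by one retry gap; each run after the first
    # is preceded by one restart gap.
    wait_min = runs * retries_per_run * retry_gap_min + (runs - 1) * restart_gap_min
    return attempts, wait_min

attempts, wait_min = retry_budget(retries_per_run=10, retry_gap_min=20,
                                  runs=5, restart_gap_min=30)
print(attempts, "attempts,", round(wait_min / 60, 1), "hours of waiting")
```

Under those assumptions the waiting alone comes to about 18.7 hours, which is where the "nearly an entire calendar day" comes from once the time spent on the attempts themselves is added.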

Additionally, according to the Task Progress window, I'm only uploading the backups at 200-400 KB/s. Considering I have an uplink connection of 100MB/s, that would strike me as indicative of the issue mentioned in the OP.
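To put a number on that, here is a quick sketch of how much of the uplink the observed rate represents. The link is written above as 100MB/s; since links are often quoted in megabits, the sketch works the figure out both ways (decimal units, 1 MB = 1000 KB) rather than assuming one reading. The names are illustrative only.

```python
def utilization(observed_kb_s, link_kb_s):
    """Fraction of the link capacity actually used."""
    return observed_kb_s / link_kb_s

# Two readings of "100MB/s", both converted to KB/s.
links = {
    "100 megabytes/s": 100 * 1000,
    "100 megabits/s": 100 * 1000 / 8,
}

for label, link_kb_s in links.items():
    low = utilization(200, link_kb_s)
    high = utilization(400, link_kb_s)
    print(f"{label}: {low:.1%} to {high:.1%} of the uplink")
```

Either way the upload is using only a few percent of the link at most, so the bottleneck is unlikely to be on this end.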