ABR11 stalls at Analyzing Partition

Version used ABR 11.0.17438

The task stalls at 20%; the last log entry is 'Analyzing Partition', code 66039 (0x101F7).

Trying to STOP the current activity doesn't work, I have to either kill the task or reboot the machine.
Restarting the task has the same result even after a reboot.

The task has worked in the past, taking no more than 3 seconds for the 'analyzing' bit before moving on to the message that an image will now be created.

Is there something I can check here?

Same version, same problem here.
Our system backups of the 2008 R2 servers ran for some days without problems, and now we get this problem without having changed anything in the configuration.

Hello Patrick, Hello Thomas,

Thank you for using Acronis Software.

Usually, when the backup stalls at 20%, it is at the point where the volume snapshot is created. You will find a matching article in our Knowledge Base.

If you cannot stop a running task on the client machine, you can do so with the Windows command 'taskkill'.

  • Go to Start -> Run -> cmd (start it with administrator rights)
  • Type the command: taskkill /f /im mms.exe
  • Repeat this step (press the up arrow on your keyboard to recall the command) for as long as a PID is shown (at most 3 times).
  • Wait for a minute, then restart the Acronis services using Start -> Run -> services.msc
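
The repeat-until-no-PID step above could also be scripted. The following is only a minimal sketch, not an Acronis tool: the function name kill_until_gone, its parameters and the injectable run argument are made up for illustration, and it assumes that a non-zero exit code from taskkill means no matching process was left (it can also mean a failure such as missing administrator rights, so check the output if in doubt).

```python
import subprocess
import time


def kill_until_gone(image_name="mms.exe", max_attempts=3, pause_s=1.0,
                    run=subprocess.run):
    """Repeat `taskkill /f /im <image_name>` until it stops reporting success.

    Returns the number of attempts in which taskkill exited with code 0.
    `run` is injectable so the loop can be tried out without touching
    real processes.
    """
    killed = 0
    for _ in range(max_attempts):
        result = run(["taskkill", "/f", "/im", image_name],
                     capture_output=True, text=True)
        if result.returncode != 0:
            # Assumption: non-zero means nothing was left to kill.
            # It may also indicate an error (e.g. access denied).
            break
        killed += 1
        time.sleep(pause_s)  # give the process a moment to respawn or exit
    return killed
```

On the affected machine this would be called as kill_until_gone() from an elevated prompt; restarting the services afterwards is still a manual step.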

Please let me know if you have additional questions.

Thank you.

What I tried:
+ tried to end the Acronis services and/or processes - same problem
+ tried to restart the server - same problem
+ updated the scheduler service - same problem
+ deleted the backup job, recreated the job - same problem
+ disabled VSS (use always software system provider) completely - same problem
+ changed target depot (one without deduplication) - same problem
+ checked the disk for errors - there were none
+ removed server from AMS, revoked license, uninstalled ABR 11 completely, abr_cleanup afterwards, restart, reinstall, restart - same problem

It has to be a "central" problem in our environment; otherwise it would not hit all our physical servers that use disk snapshots at once.

Hello Thomas,

Thank you for your reply.

In this case I strongly suggest contacting Acronis Support (find more information here) and providing the following:

  • Collect a System Report from the affected machine.
  • Provide the link to this forum thread to the Support Professional.

The collected information helps our Support Professionals to reduce time in analyzing the data and issue. It speeds up the resolution process because we will have the chance to analyze all the available information.

Let me know if you need additional help please.

Thank you.

I found out that the backup runs if I use an unmanaged vault. Very strange.
I added a new storage node with a managed vault. Backups to this vault run too.

I had this problem with older versions of the ASN - I thought (or rather 'hoped') that the most recent ASN would be closer to 'production ready'.

What does your ASN look like? Do you use Dedup? Can you validate the vault? Can you import backups?

Sometimes the ASN just hoses its metadata and then hangs. You can check whether it hangs: if stopping the service from the Windows Services console times out, your ASN is hanging. You can either wait (sometimes it comes back after a few hours) or kill it with the Task Manager.

Sometimes detaching and reattaching the vault fixes this. But before doing that, try to export all important backups, because sometimes the ASN is not able to reattach the vault, and then your backups are pretty much gone.

If all this doesn't work, you could either just delete the vault or contact Acronis support. Both options won't fix your vault in under 3 months - but you may be lucky :)

Hi Peter,

turning VSS off completely seems to have fixed the issue. 

I wonder if you'd still like to receive any kind of information for analysis?

On another note though, two (2) different partitions on the same machine (same physical HDD too) can be backed up without any issues, i.e. without turning VSS off. The machine just never got around to those tasks since it was waiting for the aforementioned to finish first.

I can already rule out that there is an issue with the free space, since one of the two partitions successfully being backed up is about four times the size of the one failing.

Patrick

"What does your ASN look like?"
Hyper-V Windows 2008 R2 SP1, 2 CPU, 8GB RAM
Volumes
C (SYSTEM) = VHD on Physical Host Disk (Raid1) - 30 GB of 48 GB free
D (Catalog, Deduplication) = HyperV Passthrough Disk (MSA2000/8Gbit FC/Raid1) - 538 GB of 558 GB free
E (Vault Data) = iSCSI Volume (Single Host LUN) on Thecus NAS (Raid5) - 1.56 TB of 1.96 TB free
Databases
ASN Catalog - Folder = 5.12 GB
ASN Deduplication Database - db3 = 8.95 GB, db3-wal = 4.88 GB
ASN Vault - Folder = 396 GB
"Do you use Dedup? Can you validate the vault? Can you import backups?"
Yes, Yes and Yes
"If you try to end the service on the windows service console, it will time out: Your ASN is hanging"
That's always the case with our Storage Nodes once tasks have been executed (although I have had this problem with other Acronis services too, e.g. AMS).

Your setup looks healthy.

Your db3-wal file is huge. That shouldn't be a problem itself, but it indicates that the ASN is unable (or unwilling) to commit the logs to the DB. Is this Node very busy?

I stopped having huge WALs in the more recent betas (anything before beta 17437 had problems with the DBs and also could not shut down).

The problem with AMS hanging is a direct result of the ASN hanging. I had that problem and it resulted in long (months-long) debug sessions with Acronis Support. But on our side, this problem went away with the most recent build (17438).

Have you already opened a case, and if so, how long have you been waiting?

The storage node is not very busy, at least when it is not compacting or running backups.
This storage node was newly installed with version 17437. We also had huge problems with vaults in earlier versions, so I recreated our vaults with every version because of trouble with updates or with the data in the vaults. :-(
Anyway, I opened a case and sent the logs to Acronis yesterday. I hope I get a solution soon.
Quick question: do you see many page faults in the StorageServer process? We got 3568278 in 54 hours.

"Soon" is seemingly impossible with Acronis... Our ASN regularly corrupts backup chain metadata (seems to be a bug in the GFS handling of delete marks), which causes the store to fail validation (and that in turn prevents compaction). Fortunately this bug doesn't look like it causes data loss. The case has been open since April.

My StorageServer.exe is now at 152,034,679 page faults after running for about 28 hours. Peak working set is about 8G (VM has 12G RAM) and Dedup DB size is about 16G.
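
To put the two page-fault counts quoted in this thread on the same scale, they can be converted to faults per second. This is just arithmetic on the numbers posted above (3568278 in 54 hours versus 152,034,679 in about 28 hours); the helper name faults_per_second is made up, and the two nodes differ in load and dedup DB size, so the comparison on its own does not prove anything about either setup.

```python
def faults_per_second(page_faults, hours):
    """Average page-fault rate over the given uptime in hours."""
    return page_faults / (hours * 3600)


# Counts and uptimes as posted in the thread.
node_a = faults_per_second(3_568_278, 54)
node_b = faults_per_second(152_034_679, 28)

print(f"node A: {node_a:.1f} faults/s")
print(f"node B: {node_b:.1f} faults/s")
print(f"ratio : {node_b / node_a:.0f}x")
```

That works out to roughly 18 faults per second on the first node against roughly 1,500 on the second.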

+ disabled all jobs for the vault
+ did a full validation on the vault (29 hours runtime)
=> Backups running fine again

It seems I had always aborted the validation task somehow (I didn't see it was running in the GUI).
This time I checked the associated validation log in C:\ProgramData\Acronis\ServiceProcess\ on the Storage Node and waited until its modification date stopped changing.
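
Watching the log's modification date by hand can be replaced by a small poller. This is a minimal sketch under stated assumptions: the function name wait_until_unchanged and the default intervals are made up, the exact log file name has to be looked up in the folder mentioned above, and a quiet log only suggests that the validation has stopped writing, not that it succeeded.

```python
import os
import time


def wait_until_unchanged(path, quiet_s=600.0, poll_s=30.0, timeout_s=None):
    """Block until `path` has not been modified for `quiet_s` seconds.

    The quiet period is counted from when monitoring starts or from the
    last change observed while polling. Returns True once the file has
    been quiet long enough, False if `timeout_s` elapses first.
    """
    start = time.monotonic()
    last_mtime = os.path.getmtime(path)
    last_change = start
    while True:
        now = time.monotonic()
        if now - last_change >= quiet_s:
            return True
        if timeout_s is not None and now - start >= timeout_s:
            return False
        time.sleep(poll_s)
        mtime = os.path.getmtime(path)
        if mtime != last_mtime:
            # The file was written to again: restart the quiet period.
            last_mtime = mtime
            last_change = time.monotonic()
```

With a validation that runs for more than a day, a generous quiet_s (tens of minutes) is probably safer than a short one, since the log may pause between entries.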