
Slower than expected backups and replications via 10GbE to QNAP iSCSI storage NAS


Hi all,

I'd better explain this in detail so I give you the full story...

I'm setting up a new iSCSI shared storage system between three VM hosts and a QNAP NAS via 10GbE links. That bit is fine - I'm okay with that. I've currently got them set up in a 'lab' environment before they go live in a couple of months.

This is how it's set up...

All the 'onboard' NICs are 1GbE...

Onboard 1GbE NIC 0 - Management network
Onboard 1GbE NIC 1 - Backups Network (assigned to the Acronis Appliance)
Onboard 1GbE NIC 2 -
Onboard 1GbE NIC 3 -
Onboard 1GbE NIC 4 - For VM traffic
Onboard 1GbE NIC 5 - For VM traffic
Onboard 1GbE NIC 6 - For VM traffic
Onboard 1GbE NIC 7 - For VM traffic
HBA 10GbE NIC 8 - iSCSI Storage (Direct connection to QNAP NAS via CAT6e Copper)
HBA 10GbE NIC 9 - Spare (for now - but will look at dual path redundancy soon)

The above VMNIC8 (10GbE) is connected directly to one of the four 10GbE ports on the QNAP NAS unit, without a switch. MTU is set to 9000 on both the VMNIC and the QNAP NAS. The connection seems stable enough like that without the need for a switch in the mix. The port is set up as a VMkernel port for iSCSI storage and I've 'bound' it to the IP. That works fine - and lets me see the various iSCSI datastores I've configured etc.

What I've noticed is that when I'm doing replications (via the Acronis Appliance) between the local VM datastore (attached SAS local storage) and the iSCSI storage, the replication speed is terrible. We're talking no faster than about 40-50MB/sec. I can achieve those speeds (and better) whilst replicating across a 1GbE network on my other VM hosts. I was expecting this to absolutely fly, as the link between the VM host and the shared storage is running at 10GbE @ 9000 MTU.

If I migrate a large VM between local host storage and the iSCSI shared storage via vSphere Web Client, it moves the VM data pretty damned rapidly between the two devices. I was hoping the Acronis replication data would also be 'pretty damned rapid' as it's following the same path.

The replication speeds aren't consistent either. They're a bit sporadic: jumping up to 40-50MB/sec, plummeting back down to 5MB/sec briefly, and then back up again. On my other VMs that replicate between local datastores (from host to host), the speed does sometimes start low but then gradually ramps up to around 100MB/sec and sometimes beyond that, which I can't understand given the limitations of the 1GbE connection itself.
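For reference, the raw line rates in question can be converted to MB/sec with a little arithmetic. The following is only a rough sketch: the function name is made up for illustration, it uses decimal megabytes, and it ignores protocol overhead (Ethernet, IP, TCP and iSCSI framing), so real-world ceilings will be somewhat lower.

```python
def link_ceiling_mb_per_s(link_gbit_per_s: float) -> float:
    """Convert a nominal link rate in Gbit/s to MB/s (decimal units).

    Hypothetical helper for illustration; ignores all protocol overhead.
    """
    return link_gbit_per_s * 1000 / 8


one_gbe = link_ceiling_mb_per_s(1)    # nominal ceiling of a 1GbE link
ten_gbe = link_ceiling_mb_per_s(10)   # nominal ceiling of a 10GbE link

observed = 50  # MB/s, upper end of the 40-50MB/sec reported above
fraction_of_ten_gbe = observed / ten_gbe

print(f"1GbE ceiling:  {one_gbe:.0f} MB/s")
print(f"10GbE ceiling: {ten_gbe:.0f} MB/s")
print(f"Observed {observed} MB/s is {fraction_of_ten_gbe:.0%} of the 10GbE line rate")
```

On these numbers, around 100MB/sec is still within what a 1GbE link can nominally carry, while 40-50MB/sec is only a few percent of the 10GbE line rate, which suggests the link itself is unlikely to be the limiting factor.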

I just wondered if anyone else had a similar scenario/problem/fix for this - as I'm pretty gutted with the speeds I'm experiencing across such a fast link. I was hoping to get replications done in no time at all with the 10GbE in the mix.

The Acronis Appliance doesn't also need to be on the same subnet as the iSCSI storage HBA and QNAP NAS, does it? At the moment, the Acronis Appliance has an IP address on the management network and is connected to the same switch as the VM host management VMkernel port. This network only runs at 1GbE. Do I need to be configuring VMNIC9 (currently unused 10GbE HBA) for the Acronis Appliance or something? Will the Acronis Appliance only back up and replicate data as fast as its own assigned network port will run at?

I'm a bit lost with this - as you can no doubt tell !!

Any help is much appreciated - anything at all that you can suggest will be most welcomed!

If you need me to explain anything else about the layout or configuration - please ask...

Thanks all,
Paul


Hi Paul,

The main thing to check in your case is where the Virtual Appliance is deployed. The fastest possible speed is achieved when the appliance is deployed on a host which has access to both the source (local) and target (shared) datastores. In this case the appliance will attach the backed-up VM's virtual disk (from the local datastore) to itself, and it will also attach the target replica's virtual disk (from the shared datastore). The traffic will not go over the network and will be sent entirely through the ESXi host's internal SCSI stack. My guess is that when you get 40-50MB/s you're replicating a VM from a host other than the one where the appliance is deployed (in this case the data is read over the network), and when you get 100MB/s you're replicating a VM from the same host.
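The rule described above can be sketched in a few lines. This is only an illustration of the reasoning in this post, not how the appliance is actually implemented; the class, function, host and datastore names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    """Hypothetical model of a Virtual Appliance and the datastores its host can see."""
    host: str
    reachable_datastores: frozenset


def data_path(appliance: Appliance, source_ds: str, target_ds: str) -> str:
    """Return the expected data path for a replication job.

    If the appliance's host sees both datastores, the disks can be attached
    to the appliance and traffic stays inside the host's SCSI stack.
    Otherwise the source data has to be read over the network.
    """
    if {source_ds, target_ds} <= appliance.reachable_datastores:
        return "host-internal"
    return "network"


# Hypothetical layout: one appliance on host1, which sees its own local
# datastore and the shared iSCSI datastore.
appliance = Appliance("host1", frozenset({"host1-local", "qnap-iscsi"}))

print(data_path(appliance, "host1-local", "qnap-iscsi"))  # host-internal
print(data_path(appliance, "host2-local", "qnap-iscsi"))  # network
```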

Another possibility is that the overhead comes from the VMware VDDK library, which is used to access the virtual disks and write data into them, though it's unclear whether that's the case here.

One more thing to note: it always makes sense to check the actual _time_ it took to perform full VM replication rather than checking the speed values. Time is what actually matters.
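As a rough sketch of what that means in practice, the effective rate can be worked out from the amount of data actually replicated and the elapsed time. The function names and the 200GB / 4000 second figures below are made-up example values, not measurements from this thread.

```python
def effective_mb_per_s(data_gb: float, elapsed_s: float) -> float:
    """Average throughput in MB/s for a job that moved data_gb in elapsed_s seconds."""
    return data_gb * 1000 / elapsed_s


def expected_seconds(data_gb: float, mb_per_s: float) -> float:
    """Time in seconds needed to move data_gb at a steady mb_per_s."""
    return data_gb * 1000 / mb_per_s


# Hypothetical example: a VM with 200GB of actual data replicated in 4000 seconds.
rate = effective_mb_per_s(200, 4000)
print(f"Effective rate: {rate:.0f} MB/s")

# The same VM at a steady 100 MB/s would take half as long.
print(f"At 100 MB/s: {expected_seconds(200, 100):.0f} s")
```

Comparing two runs this way smooths out the momentary peaks and dips shown in the progress display.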

In either case I'd recommend contacting our support team for assistance. Remember to attach the replication log (from 2 attempts where you see different speeds) to your request (View->Show Logs->Save All To File) and state the actual data size in the replicated VM so the speeds can be put in context.

Thank you.
--
Best regards,
Vasily
Acronis Virtualization Program Manager

Hi Vasily,

Thanks ever so much for your reply - I really appreciate it.

What I tend to do is have an Acronis Appliance on each host - and only assign backups and reps for that particular host's VMs.

So, the replication jobs I was running between the host's local storage and the iSCSI storage are running from the Acronis Appliance that's deployed on that particular host. I try not to run jobs on other hosts from different Appliances - as I guessed that might slow things down a bit.

So, the replications themselves aren't interested in how fast the Acronis Appliance network is - they will definitely only use the 10GbE iSCSI link between the host and the storage system? I do understand that that's the only route it can take - but wondered if the data had to go 'via' the Acronis Appliance and that may have been bottlenecking it.

I'm currently recreating my test lab - and will contact support and get some logs together once I've got everything configured and back working again.

Thanks Vasily

Best Regards,
Paul