See for the latest updates the end of this post.
In this post Marc van Eijk points out connectivity issues with VMs and vNICs. At random virtual machine or vNIC would loose connectivity completely. After a simple live migration the virtual machine would resume connectivity.
Marc has already logged a support case at Microsoft and HP and they are investigating this issue. Last week I also discovered this issue, here is my configuration:
Currently we experience network connectivity issues with one of our cluster networks in a Windows Server 2012 R2 Hyper-V cluster environment.
Our environment is as follows:
- Two HP BL460 G7 servers (name of the servers: Host01 and Host02)
- 6x HP NC553i Dualport Flexfabric 10GB Converged Networkadapters (only 2 active)
- Installed with Windows Server 2012 R2 Hyper-V (full edition)
- Configured in a Windows Failover Cluster
The NICs are installed with the following driver
Driver: Emulex, Driver date: 5-6-2013, Driver version: 220.127.116.11
We have configured a switch-independent NIC team with dynamic loadbalancing with 2 NIC team members. Upon this NIC team we have configured a vswitch.
In this vswitch we have created three vNICs of type Management OS:
- Live Migration
- Cluster CSV
Every NIC is configured in a separate VLAN. Only the Live Migration network may be used for Live Migration traffic (Configured in Windows Failover Clustering).
The initial installation and configuration of Hyper-V and the Windows Failover Cluster was OK. Over all the networks, communication between the hosts in the cluster was possible.
The Cluster Validation Wizard runs successfully without any warning or error.
After the installation of the Hyper-V cluster we start creating and installing the virtual machines. No problems at all, till we build a specific VM called VM06. This VM was created on the host Host01.
When the VM resides on this host everything is OK. As soon as we move this virtual machine (via Live Migration) to the host Host02 the cluster network called Live Migration went down and communication on this network between the two Hyper-V hosts is not possible anymore. When we move the virtual machine back to Host01 the cluster network called Live Migration comes back online. Also when we shut down the virtual machine when it resides on node Host02 the cluster network called Live Migration comes back online.
When we change the NIC teaming configuration to a Active/ Standby configuration, as Mark described in his blog, this network issue does not appear.
Microsoft requested us to disable Large Send Offloading: “Get-NetadapterLSO | Disabel-NetadapterLSO” (with NIC teaming in active/ active). However the issue is still there.
Update 11-26-2013 14:45: After disabling RSS and RSC (which does not change te situation) Hans suggest to disabling VMQ. We used PowerShell to disable VMQ on all interfaces: “Get-NetAdapterVmq | Disable-NetAdapterVmq” …. and yes disabling VMQ does the trick. Off course this is not a solution but only a workaround. These findings are logged in the case @ Microsoft and they will investigate this futher.
Update 12-02-2013 10:45: After applying update KB2887595-v2 to both of our Hyper-V nodes the network problems with our Live Migration network are gone. Even with VMQ enabled the network keeps up and running. However this update does fix the problem for our situation but not for the situation that Marc describes. So it seems that we’ve two different issues here.
We, Hans, Mark en me, will continue to investigate this issue and will update you on www.hyper-v.nu!