Effects of offline System Partition on Hyper-V cluster node (updated)
One of my colleagues noticed rather strange behavior on a Windows Server 2008 R2 server that is part of a 7-node Hyper-V cluster built on HP BL460c G6 blade servers. The behavior showed up during host-level backups of Hyper-V virtual machines.
What problems were observed?
- The System Reserved Partition did not have a label and was offline after a reboot
- A CSV could be offline if it was owned by this particular cluster node
These problems could be temporarily solved by assigning a drive letter. Oddly enough, Disk Management and Diskpart did not agree on whether the disk was online: Diskpart showed the problematic partition as offline, while Disk Management showed it as online. It would show offline if the disk had no label:
[Screenshot: partition without a disk label]
With a disk label:
[Screenshot: partition with a disk label]
Other symptoms:
- Host-level (external) backups of Hyper-V child partitions failed (while internal backups of VMs with a backup agent would succeed)
- During this external backup multiple VHDs are created, but apparently nothing is written to them.
Ultimately the Hyper-V VSS writer faults with a non-retryable error. However, if we move the VM to another host, the backup completes successfully.
Multiple attempts to solve this problem were made:
- Evict node from cluster and rejoin cluster
- Remove backup agent (DPM 2010) and add it again
- Remove Server Backup feature and add it again
- Remove DPM 2010 Protection Group and recreate it
- Backup with and without label on reserved partition
So far the only option left was to reinstall the server.
Just by coincidence I found a recent KB article called “System Partition goes offline on Windows Server 2008 and Windows Server 2008 R2 after installing some 3rd Party Disk or Storage Management Software” dated September 29, 2010: http://support.microsoft.com/kb/2419286/
This article names three issues:
Issue 1:
Hyper-V Role Cannot be Installed “Failure configuring Windows features”
Issue 2:
“Failed to prepare storage for testing on node 1 status 87” during Cluster Validation
Issue 3:
“The boot configuration data store could not be opened. The system cannot find the file specified.”
In our case it was issue 2 that troubled us.
The resolution was to bring the System Reserved Partition online:
Diskpart
List volume
Select volume n (n= the 100MB system partition)
Online volume
Exit
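For anyone who has to repeat this on several nodes, the Diskpart steps above could be scripted. What follows is only a minimal sketch in Python, not something we used ourselves: the function names are made up for illustration, the volume number still has to be looked up by hand with "list volume", and it relies on diskpart's /s switch for running a script file from an elevated prompt.

```python
import subprocess
import tempfile
from pathlib import Path


def build_online_script(volume_number: int) -> str:
    """Return diskpart script text that brings one volume online.

    Hypothetical helper: the volume number is an assumption supplied by the
    caller, who should first confirm it with 'list volume' in diskpart.
    """
    if volume_number < 0:
        raise ValueError("volume number must be non-negative")
    lines = [
        f"select volume {volume_number}",
        "online volume",
        "exit",
    ]
    return "\n".join(lines) + "\n"


def bring_volume_online(volume_number: int) -> int:
    """Write the script to a temp file and run 'diskpart /s <file>'.

    Only meaningful on Windows from an elevated prompt; returns diskpart's
    exit code without interpreting it.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "online_volume.txt"
        script.write_text(build_online_script(volume_number), encoding="ascii")
        result = subprocess.run(["diskpart", "/s", str(script)], check=False)
        return result.returncode
```

Calling bring_volume_online(n) with the number of the 100MB volume would run the same three commands as the manual steps. It does nothing to keep the partition online across a reboot.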
Or from Disk Management:
Diskmgmt.msc
Select the 100MB volume and Right-Click it
Change drive letter & path
Assign a drive letter
We had already found this workaround ourselves. The bad news is that the partition goes offline again after a reboot.
We checked the following:
- No 3rd party disk or storage management software is installed
- No anti-virus software is installed on the cluster node
______________________________________________________________
Update: October 6, 2010
After talking to several people in my network, we came up with the following partial (and possibly full) solutions:
- Assign a drive letter and leave it in place across a reboot: the system partition stayed online, but the host-level backup of guest partitions with DPM 2010 still failed (this idea was suggested by several people, among others Kurt Roggen and an engineer at Microsoft)
- Run chkdsk /r on the system drive (part of the solution) and run the System Update Readiness Tool, replacing the corrupt .mum files on the affected host with copies from a fully functional Hyper-V host: http://support.microsoft.com/kb/947821 (this tip was suggested by Annur Sumar)
Unfortunately we were under pressure to get the host running again, so we just reinstalled it. That is solution #3 and, although not very efficient, one that works in almost all situations.
So thanks to everyone who contributed for the great feedback!
This entry was posted by Hans Vredevoort on October 4, 2010 at 10:18, and is filed under Hans Vredevoort, Hyper-v.
about 2 years ago
Pretty sure I saw this same issue during the last cluster I set up. And yes, setting those 100 MB partitions reserved for BitLocker to online did allow the cluster validation test to proceed.
Interesting point that it may be affecting DPM 2010 though, as I've been having some issues with this; maybe something for me to look at.
about 2 years ago
We could keep the system partition online by assigning a drive letter and leaving it on during reboot. The backup problem did not go away. We decided to reinstall and for now both problems are gone.