How to defrag a Hyper-V R2 Cluster Shared Volume

Recently I was asked to describe the correct procedure for defragmenting Cluster Shared Volumes on a Hyper-V R2 cluster. This is not a very complicated task, but if you have never had the opportunity to try it, this blog post gives you the exact steps using PowerShell.

Case

Let’s start with a case description: the System Center Operations Manager Windows Management Pack is reporting “Logical Disk Fragmentation Level is high” for your Hyper-V R2 servers.

A Cluster Shared Volume (CSV) contains the configuration, virtual hard disk and snapshot files of multiple Hyper-V guests. In particular, fragmentation of the large VHD files deserves your attention.

Fragmentation of these files can become a problem because the disk head needs to use an increasing number of seeks, lowering the throughput and thus the perceived performance of the guest as a whole.

On the other hand, NTFS has become more and more efficient in recent OS versions and fragmentation need not always have a severe impact on performance.

Analysis

CSV is a distributed orchestration layer on top of NTFS (implemented as a file system filter driver) and for fragmentation it takes advantage of all the NTFS techniques. The advantage of this design is that all disk management tools which have been written for NTFS continue to work, including a variety of defrag tools.


Before writing a file, NTFS searches for enough free space to write the entire file to disk. On a newly formatted disk the VHDs will be written contiguously while all the blocks are written and zeroed out, which can be fairly time consuming. Of course this mainly applies when a fixed VHD format is used, for which 100% of the space of the virtual hard disk is allocated and reserved on the disk. With dynamic VHDs you are much more susceptible to fragmentation because initially only the header and a few control blocks are written to the physical disk. The VHD file will grow as more data is written to the file that the virtual hard disk represents. Suppose you have 10 VMs on the same CSV and each VM has access to one or more dynamic VHDs. With that many files growing side by side on the same volume, the chances of fragmentation rise considerably.

Recommendation

The classic administrator will probably not think twice and start to defragment the disk as quickly as possible. However when dealing with a fragmented CSV disk, you have to plan in advance. Before you can start the defrag the CSV will have to be placed into maintenance mode.

The best advice I can give you is to start monitoring disk throughput and fragmentation statistics over a longer period of time. In other words get a baseline and check its deviation at set intervals in time.
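To make the idea of a baseline a bit more concrete, here is a minimal PowerShell sketch that compares recent samples of a disk counter with a baseline collected earlier. The function name Test-BaselineDeviation, its parameters and the 25% tolerance are assumptions of mine for illustration, not part of any management pack.

```powershell
# Sketch: compare recent samples of a disk counter against a stored baseline.
# Function name, parameter names and the 25% default tolerance are illustrative assumptions.
function Test-BaselineDeviation {
    param(
        [double[]] $Baseline,               # samples collected over a longer period
        [double[]] $Recent,                 # samples from the interval being checked
        [double]   $TolerancePercent = 25   # allowed deviation before flagging
    )

    $baselineAvg = ($Baseline | Measure-Object -Average).Average
    $recentAvg   = ($Recent   | Measure-Object -Average).Average

    if ($baselineAvg -eq 0) {
        throw "Baseline average is zero; a relative deviation cannot be computed."
    }

    # Relative deviation of the recent average from the baseline, in percent
    $deviation = (($recentAvg - $baselineAvg) / $baselineAvg) * 100

    New-Object PSObject -Property @{
        BaselineAverage  = $baselineAvg
        RecentAverage    = $recentAvg
        DeviationPercent = [math]::Round($deviation, 1)
        ExceedsTolerance = ($deviation -gt $TolerancePercent)
    }
}
```

The samples themselves could come from something like Get-Counter against the logical disk counters (for example the CookedValue of Avg. Disk sec/Transfer samples), stored per CSV so that each volume is compared with its own history.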

How does Operations Manager monitor the health of the disk? According to Cameron Fuller’s article:

“Disk utilization is determined by the Average Disk Seconds Per Transfer monitor. This monitor’s healthy and critical states are defined as follows:

  • Critical state occurs when the average disk seconds per transfer is greater than 50 for 5 minutes (after five samples on a 1-minute schedule)

Fragmentation health is determined by the Logical Disk Fragmentation Level monitor. This monitor’s healthy and warning states are defined as follows:

  • Warning state occurs when the percentage of file fragmentation is greater than 10% (This monitor checks health state once a day at 3.00 a.m. on Saturday by default).

The Logical Disk Fragmentation Level monitor also includes a recovery task called Logical Disk Defragmentation, which is disabled by default. This task can automatically run a defragmentation if the drive exceeds the threshold defined for the monitor.”

The following picture shows a fragmented logical disk in System Center Operations Manager’s Health Explorer.

See also this article by Kevin Holman on file fragmentation monitoring in Operations Manager.

A key takeaway from this article: if you want to defrag fragmented local disks automatically and enable the recovery task, it will run against both the physical servers and the VMs at roughly the same time. The job might stress the SAN with more I/O than it can handle.

I have not been able to find out if Operations Manager is able to deal with CSV disks at all. So before thinking about automation, let’s first look at the exact steps to properly defrag one or more Cluster Shared Volumes in a Hyper-V R2 cluster.

Defrag

Windows Server 2008 R2 has its own defrag command. If you want to try this out manually, open a command prompt with administrator privileges and start with an analysis. The command needs a volume to work on; disk C: is used here.

; Analyzing a disk
> Defrag C: /A

or

; Analyzing a disk with additional info and statistics
> Defrag C: /A /U /V

At the end it will show you if the disk requires a defrag or not.
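If you want to act on the analysis from a script rather than read it by eye, the percentage can be pulled out of the report text. The sketch below assumes the English report contains a line of the form "Total fragmented space = N%"; verify that against the output on your own servers. Get-FragmentationPercent is a hypothetical helper name.

```powershell
# Sketch: extract the fragmentation percentage from defrag analysis output.
# Assumes an English-language report with a line like "Total fragmented space = 12%".
# Get-FragmentationPercent is a hypothetical helper, not a built-in cmdlet.
function Get-FragmentationPercent {
    param(
        [string[]] $ReportLines   # the lines printed by the defrag analysis
    )

    foreach ($line in $ReportLines) {
        if ($line -match 'Total fragmented space\s*=\s*(\d+)\s*%') {
            return [int] $Matches[1]
        }
    }

    # No matching line found (different locale or report layout)
    return $null
}
```

A possible use is to capture the analysis output in a variable, for example $report = defrag C: /A /V, and pass $report to the function; a $null result means the expected line was not found.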


Every CSV is a separate volume without a drive letter, and all CSVs are logically organized underneath a root directory on disk C: called C:\ClusterStorage. However, when you run a defrag for disk C: it does not defrag the individual CSV volumes. In fact, the CSV has to be placed in maintenance mode for the defrag to run properly. Enabling maintenance mode will at the same time put your guests in Saved State, so choose carefully when to start such a job.

After explaining some of the background of Cluster Shared Volumes, I will detail the manual steps to properly defrag a CSV volume.

CSV background

CSV, which is enabled in most Hyper-V R2 clusters, is implemented as an NTFS junction point. This is more or less comparable to a mountpoint to which volumes are mounted. CSV uses the C:\ClusterStorage root directory and each volume is placed underneath:

C:\ClusterStorage\Volume1
C:\ClusterStorage\Volume2
...
C:\ClusterStorage\VolumeN

These CSV volumes are used by multiple VMs in the cluster so by just starting a defrag you would not have exclusive access to many of the files.

Another important fact to know is that in a Hyper-V R2 cluster, VMs are able to read and write to their respective VHDs on that CSV simultaneously without intervention of the coordinator node. This is called Direct I/O. Still, every CSV volume has its own owner (or coordinator node). The owner of the disk is capable of performing metadata operations on that disk.

Because a defrag requires exclusive access to a disk, the CSV has to be placed into maintenance mode first. The same applies to chkdsk.

In this picture you can see two VMs spread across two cluster nodes, but still capable of Direct I/O against the CSV disk with its VHD.

[image: two VMs on two cluster nodes with Direct I/O to the same CSV]

WARNING

Make sure you run these steps in a designated maintenance window, because all guests on the CSV are saved, and therefore temporarily unavailable, when the volume is put into maintenance mode.

Step by step

Run the following commands to defrag a CSV volume:

Open a command prompt with Administrator privileges.

; Start PowerShell
> PowerShell

; Import PowerShell module for Failover Clusters
PS > Import-Module FailoverClusters

; Request list of available CSVs
PS > Get-ClusterSharedVolume


; Request properties of selected CSV volume
PS > Get-ClusterSharedVolume "csv01" | fc *


The VMs and CSVs are still online.


; Enable maintenance mode for the specified CSV, which puts the VMs into Saved State
PS > Suspend-ClusterResource "CSV01" -VolumeName "C:\ClusterStorage\Volume1"


VMs are saved one by one.


; Check status of all CSV’s
PS > Get-ClusterSharedVolume


As soon as the CSV is switched into maintenance a defrag (or chkdsk) command can be given from the coordinator node. If it is not on the current node the CSV can be moved:

; Move CSV to specific cluster node
PS > Move-ClusterSharedVolume "csv01" -Node hv01


Before starting the actual defrag, an analysis can be done first.

; Fragmentation analysis of a CSV
PS > Repair-ClusterSharedVolume C:\ClusterStorage\Volume1 -Defrag -Parameters "/A /U /V"


If the fragmentation is over a certain percentage (for example the 10% warning threshold that Operations Manager uses), the actual defrag can be started.

; Defrag of CSV
PS > Repair-ClusterSharedVolume C:\ClusterStorage\Volume1 -Defrag -Parameters "/H /U /V /X"

Here is a list of all defrag parameters. In the above example /H runs the operation at normal priority (the default is low), /U displays progress, /V prints verbose output with additional statistics and /X performs free space consolidation on the specified volume.

[image: list of defrag parameters]

; Take CSV out of maintenance
PS > Resume-ClusterResource "CSV01" -VolumeName "C:\ClusterStorage\Volume1"


; Check status of CSVs
PS > Get-ClusterSharedVolume


; Restart all guests that were saved as a result of the maintenance
PS > Start-ClusterGroup "[name of guest cluster group]"
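The manual steps can be wrapped in a small function so that the CSV is always taken out of maintenance again, even when the defrag step fails. This is only a sketch under assumptions: Invoke-CsvDefrag and its parameter names are hypothetical, the cluster cmdlets are called exactly as in the steps above, and the FailoverClusters module must already be imported.

```powershell
# Sketch only: wraps the manual steps so the CSV always leaves maintenance mode.
# Invoke-CsvDefrag and its parameter names are hypothetical; the cluster cmdlets
# are the ones from the step-by-step list (Import-Module FailoverClusters first).
function Invoke-CsvDefrag {
    param(
        [string] $CsvName,                          # e.g. "CSV01"
        [string] $VolumePath,                       # e.g. "C:\ClusterStorage\Volume1"
        [string] $Node,                             # coordinator node to run on, e.g. "hv01"
        [string] $DefragParameters = "/H /U /V /X"  # same switches as the manual example
    )

    # Maintenance mode: guests on this CSV go into Saved State
    Suspend-ClusterResource $CsvName -VolumeName $VolumePath
    try {
        # The defrag has to be issued from the coordinator node
        Move-ClusterSharedVolume $CsvName -Node $Node
        Repair-ClusterSharedVolume $VolumePath -Defrag -Parameters $DefragParameters
    }
    finally {
        # Runs whether or not the defrag succeeded
        Resume-ClusterResource $CsvName -VolumeName $VolumePath
    }
}
```

The function deliberately stops there: the saved guests still have to be started afterwards with Start-ClusterGroup, because which groups to restart depends on your environment.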


An interesting subject is how we can automate all this, either from System Center Operations Manager or possibly even System Center Orchestrator. There may even be third-party tools that deal with Hyper-V R2 Cluster Shared Volumes and automate the defrag process.

If you have any experience in this area please leave a comment about how you solved this.

UPDATE: Commenter Gavin made a great suggestion to place the CSV in Redirected Access Mode. In that case the VMs don’t need to go into Saved State. You should not do this during working hours as this significantly impacts the performance of the host and its guests. See my answer in the comments for the PowerShell commands that switch a CSV to Redirected Access Mode.

6 Comments

  1. May 31, 2011    

    Have you considered putting the CSV in redirected mode instead of maintenance? That way the VMs stay online.

  2. adminHans
    May 31, 2011    

    Thanks Gavin,
    I hadn’t tried that one

    The steps should then read:
    Move-ClusterSharedVolume "[Name of CSV]" -Node "[This Cluster Node]"
    Get-ClusterSharedVolume "[Name of CSV]" | Suspend-ClusterResource -RedirectedAccess
    Repair-ClusterSharedVolume C:\ClusterStorage\[VolumeX] -Defrag -Parameters "/A /U /V"
    Resume-ClusterResource "[Name of CSV]" -VolumeName "C:\ClusterStorage\[VolumeX]"

    Regards, Hans

  3. June 2, 2011    

    For folks using Diskeeper, they have a blog post that talks about their recommendations for CSV with their product as well:

    http://www.diskeeper.com/blog/post/2011/03/28/Best-Practices-for-CSV-defrag-in-Hyper-V-(Windows-Server-2008R2).aspx

  4. June 2, 2011    

    Hey Hans,

    Nice post. I provided some feedback on your open questions via my blog. Take care. http://workinghardinit.wordpress.com/2011/06/02/some-feedback-on-how-to-defrag-a-hyper-v-r2-cluster-shared-volume/

  5. Miike
    May 20, 2015    

    Good info there.

    Been doing a search on both the Optimize-Volume command and the defrag command for use on CSVs and a couple of things stand out.

    1) Nobody tells you what to expect. Optimize-Volume did three passes on a 13.8TB volume, all at about the same speed, taking a total of seven days. Defrag did 5 passes with the first one taking about four days and the last four completing in about 10 hours total. This volume was listed as being 52% fragmented.
    2) Yes, you can put a running CSV into redirected mode and no machines will stop working and it really is as simple as that.

    I have no idea if these results are the norm or we just mutated into some kind of twilight zone.

    Now for the followup:
    After running both Optimize-Volume and Defrag to actually do the defrag, the volume is still 52% fragmented even though both runs reported success.

    Makes me wonder what I missed or if this is really the way to do defragmentation. Has anyone else encountered this lack of expected results?
