Category Archives: Virtualisation

Hyper-V VM Stuck In Backing Up… State

In this post I’ll show how to resolve the issue of a VM that’s stuck in the “backing up…” state as shown by Hyper-V Manager, without having to reboot the virtual host.

BackingUp0

 

If a VM is stuck in the backing up… state it’s probably due to an error with the Microsoft Hyper-V VSS Writer.  Open an elevated command prompt and run “vssadmin list writers”.  On a healthy host the output looks like it does below, with no errors listed; if the Microsoft Hyper-V VSS Writer reports an error, that’s the likely cause.

BackingUp1
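If you only want to see the Hyper-V writer rather than the full list, the output can be filtered.  This is a minimal sketch, assuming an elevated PowerShell prompt on the host and English-language vssadmin output in which each writer name is followed by four detail lines (writer ID, instance ID, state and last error):

```powershell
# Show the Hyper-V VSS writer entry plus the four lines that follow it
# (Writer Id, Writer Instance Id, State, Last error).
vssadmin list writers |
    Select-String -Pattern 'Microsoft Hyper-V VSS Writer' -Context 0,4
```

A healthy writer reports a stable state and no error; anything else points at the writer as the cause.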

 

The Microsoft Hyper-V VSS Writer runs within the Hyper-V Virtual Machine Management service, so in order to restart the VSS writer and clear the error, you have to restart the Hyper-V Virtual Machine Management service.  I’ve restarted this service without any issues, but please test this on a test server first.

BackingUp2

You can restart the service from the Services MMC, but if the Hyper-V VSS Writer is in an error state the service may hang on shutdown.  In that case you’ll have to kill the vmms.exe process from Task Manager.

BackingUp3

When you do this, VMs will disappear from Hyper-V Manager, but will reappear when you restart the Hyper-V Virtual Machine Management service.  Following the service restart the VM should no longer be in a backing up… state.
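The same steps can be run from PowerShell instead of the Services MMC and Task Manager.  This is a rough sketch, assuming an elevated session on the host and that the service's short name is vmms, matching the vmms.exe process; as with the GUI method, try it on a test server first.

```powershell
# Normal case: restart the Hyper-V Virtual Machine Management service.
Restart-Service -Name vmms

# If the restart hangs on shutdown, open a second elevated session and
# kill the process, then start the service again:
#   Stop-Process -Name vmms -Force
#   Start-Service -Name vmms

# Confirm the service is running again.
Get-Service -Name vmms
```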

 

SCVMM 2012 R2 – Dynamic Optimization Cannot Be Performed At This Time

When trying to run the Optimize Hosts wizard within SCVMM 2012 R2 I received the error “Dynamic Optimization Cannot Be Performed At This Time” and “Object reference not set to an instance of an object”.

The Application Event Log on the SCVMM server contained a Windows Error Reporting event from the same time.  Opening the event showed a link to the error log.

SCVMMDO0

 

Opening the error log showed that the error was related to a logical network issue on the cluster.  This cluster has a converged network switch to which all virtual machines (VMs) connect.  However, two additional logical networks are mapped to the switch to enable the migration of VMs which were connected to logical networks of a different name on a legacy cluster.

SCVMMDO1

What I found is that some VMs were connected to the “Hyper-V External Access” logical network, rather than the ConvergedNetworkSwitch.  Changing the network mapping of the affected VMs to ConvergedNetworkSwitch enabled me to run the Dynamic Optimization wizard.

SCVMMDO2

Unable To Live Migrate A Virtual Machine – “There currently are no network adapters with network optimization available on host”

Having moved a virtual machine (VM) from a Hyper-V cluster where network optimization was available to a cluster where it isn’t, I was unable to live migrate the VM because “There currently are no network adapters with network optimization available on host”.

MigrateVM1

As network optimizations aren’t available on the host cluster, the tick box to disable virtual switch optimizations isn’t available.

MigrateVM2

On the original Hyper-V cluster where network optimization is available you can see a check box to “Enable virtual switch optimizations”.

MigrateVM2a

Luckily, PowerShell can help. Running the command below from the SCVMM PowerShell console lists the properties of the VM’s network adapter.

$VM = Get-SCVirtualMachine -Name "VMNAME"
Get-SCVirtualNetworkAdapter -VM $VM

VMNetworkOptimizationEnabled is set to true.

MigrateVM3

You can use PowerShell to disable VM Network Optimization.

$VM = Get-SCVirtualMachine -Name "VMNAME"
$Adapter = Get-SCVirtualNetworkAdapter -VM $VM
Set-SCVirtualNetworkAdapter -VirtualNetworkAdapter $Adapter -EnableVMNetworkOptimization $false

You can now live migrate the VM.
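If the VM has more than one network adapter, Get-SCVirtualNetworkAdapter returns several objects, so the adapters need to be handled one at a time.  The following is a sketch of that variation, assuming the SCVMM PowerShell console and the same placeholder VM name; it uses only the cmdlets and property shown above.

```powershell
$VM = Get-SCVirtualMachine -Name "VMNAME"

# A VM can have several network adapters, so handle each one in turn.
foreach ($Adapter in (Get-SCVirtualNetworkAdapter -VM $VM)) {
    if ($Adapter.VMNetworkOptimizationEnabled) {
        Set-SCVirtualNetworkAdapter -VirtualNetworkAdapter $Adapter -EnableVMNetworkOptimization $false
    }
}

# Verify the result: every adapter should now report False.
Get-SCVirtualNetworkAdapter -VM $VM | Select-Object Name, VMNetworkOptimizationEnabled
```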

Improve Virtual Machine Migration Performance

During the migration from a Windows Server 2008 R2 Hyper-V cluster to one based on Windows Server 2012 R2 I have had to migrate a lot of virtual machines.  The migration speed was lower than I expected given the available source and destination disk performance.  On a 1 Gbps network connection, the migration was only using about 25% of the available bandwidth.

In order to improve the performance I disabled TCP Connection Offload, which uses the network adapter’s TCP Offload Engine (TOE), on the source hosts.  This is because TOE can slow down the VM migration process, which takes place over BITS using HTTPS.  The screenshot below shows the setting in the Broadcom software.  Having disabled TOE the migration used anything up to 80% of the available bandwidth, significantly improving the migration times.

VMMigrationSpeed
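I made the change in the Broadcom software, but Windows also has a global switch for TCP Chimney Offload, the operating system feature that hands connections to the adapter's offload engine.  The commands below are a sketch of that alternative, not the method I used; I'm assuming an elevated prompt on the source host, and whether it has the same effect as the driver setting will depend on the adapter.

```powershell
# Show the current global TCP settings, including "Chimney Offload State".
netsh int tcp show global

# Disable TCP Chimney Offload at the operating system level.
netsh int tcp set global chimney=disabled
```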

 

 

Event ID 17140 & 15030 On Windows Server 2012 R2 Hyper-V

Looking through the Event Logs on a Windows Server 2012 R2 Hyper-V host I could see multiple entries for event IDs 17140 & 15030, as below:

17140 – The specified account name is not valid: ‘%USER SID%’.

15030 – ‘%SERVER NAME%’ failed to modify settings. (Virtual machine ID %VM ID%)

These errors can occur if you’re using System Center Virtual Machine Manager (SCVMM).  In the SCVMM console, open Settings -> Security -> User Roles.  Right-click on each user role and choose Properties.  Open the Members tab and look for the user SID you see in event ID 17140.

SCVMM SID

Highlight the SID and click Remove.
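To collect the SIDs to look for without clicking through each event, the events can be queried with PowerShell.  This is a sketch only: the log name Microsoft-Windows-Hyper-V-VMMS-Admin is my assumption about where these events are written, so adjust it to wherever you see them in Event Viewer, and the limit of 200 events is arbitrary.

```powershell
# Pull recent 17140 and 15030 events from the (assumed) Hyper-V VMMS admin log.
$events = Get-WinEvent -FilterHashtable @{
    LogName = 'Microsoft-Windows-Hyper-V-VMMS-Admin'
    Id      = 17140, 15030
} -MaxEvents 200

# Extract the distinct SIDs mentioned in the 17140 messages.
$events |
    Where-Object { $_.Id -eq 17140 } |
    ForEach-Object { if ($_.Message -match 'S-1-[\d-]+') { $Matches[0] } } |
    Sort-Object -Unique
```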

Windows Server 2012 R2 Update – Hyper-V Backup Issue

UPDATE 1

Veeam have confirmed that this issue is related to Veeam Backup & Replication.  They’re working on a fix for the general availability (GA) release of the Windows Server 2012 R2 Update.  Follow this link on the Veeam forum for updates.

UPDATE 2

Veeam have released the hotfix and it’s available through support.

UPDATE 3

Veeam have identified a second issue.  Restoring Windows 2008 and earlier backups made on an updated Hyper-V 2012 R2 host fails.  The Veeam KB article is available here 

Having installed the Windows Server 2012 R2 Update, my Veeam Hyper-V backups failed with errors as shown below.

03/04/2014 20:58:28 :: Processing ‘SERVERNAME’ Error: Client error: The system cannot find the file specified.

Failed to open file [\\?\Volume{36db85be-bb6a-11e3-80ce-000af72f0333}\SERVERNAMEWINDOWS SERVER 2012 R2 STANDARD_DISK_1_117A5A0B-5976-4766-B256-3707A2D8CFA7-AutoRecovery.avhdx] in readonly mode.

Veeam Error

The engineer at Veeam (Anatoly) was excellent and was able to prove that the error wasn’t related to Veeam by running Windows backup commands using PowerShell.  Following the uninstall of KB 2919355, the only one of the six hotfixes comprising the Windows Server 2012 R2 Update listed in Add/Remove Programs, I was able to run backups as before.  If you’re thinking of installing this update ensure you test it well.
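Checking for and removing the update can also be done from the command line.  This is a minimal sketch, assuming an elevated PowerShell session on the Hyper-V host; I removed the update through the GUI, so treat this as an equivalent rather than the exact steps I took.

```powershell
# Check whether the update is installed on this host
# (returns an error if it is not present).
Get-HotFix -Id KB2919355

# Uninstall it; wusa prompts for confirmation and a restart is needed afterwards.
wusa.exe /uninstall /kb:2919355
```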

For reference the configuration is:

  • 4x Hyper-V 2012 R2 hosts
  • Dell EqualLogic SAN 6.0.7
  • HIT Kit 4.7 EPA
  • Veeam Backup & Replication 7.0.0.839

Hyper-V Virtual Machine Connectivity Issue

Following an issue with the storage used by our Hyper-V cluster, one node in our five-node cluster became partially unresponsive.  The virtual machines (VMs) running on the unresponsive node were automatically moved to other cluster nodes and service was resumed within a couple of minutes.  At first everything appeared to be fine, but within a few minutes our monitoring system started to report connectivity issues to the VMs that had failed over.

I RDP’d onto one of the VMs that was having connectivity issues, but found the connection kept dropping out, so I connected to the console through System Center Virtual Machine Manager (SCVMM).  I found I was unable to ping any server on the physical network.  I took a look at the event log on one of the virtual hosts and saw the error below:

Port ‘BF392932-9AE4-453A-8E13-26671BB556D9’ was prevented from using MAC address ‘00-14-22-18-7F-DC’ because it is pinned to port ‘SCVMM-C26227E3-D6AB-4818-B8BF-4CCF923C’.

The error message implied another VM was using the MAC of the VM that was having connectivity issues.  As the VM had a dynamic MAC that was managed by SCVMM, I knew that couldn’t be the case.  I decided to reboot the unresponsive cluster node.  After waiting 30 minutes for the node to shut down, I killed the power via a DRAC.  As soon as I killed the power to the node, the MAC address errors in the event log stopped and all the VMs resumed normal connectivity.  I believe the cluster node that became unresponsive was keeping some kind of lock on the MAC addresses of the VMs that were running on it at the time.  Killing the power to the node freed the locks, enabling connectivity to resume.

Deleting Hyper-V Snapshots

An important point when deleting a Hyper-V snapshot is that the corresponding AVHD file isn’t merged into the main VHD file until you power down the virtual machine.  This means that the disk space used by the snapshot isn’t immediately available when you delete a snapshot.

As far as I’m aware there’s no way to monitor the merge process through System Center Virtual Machine Manager, but you can via the Hyper-V MMC.

If you start a virtual machine during the merge process, the merge process stops and the machine starts instantly.

 

 

 

Unable to “Turn off redirected access for this Cluster shared volume”

Our five-node Hyper-V cluster is connected to a Dell MD3000i, which provides virtual machine storage using Cluster Shared Volumes (CSV).  The MD3000i has dual storage controllers for redundancy, but recently both storage controllers rebooted within a minute of each other.  Looking at the Storage section of Failover Cluster Manager showed that half the CSV volumes had a status of Redirected Access.  Reading this blog http://blogs.technet.com/b/askcore/archive/2010/12/16/troubleshooting-redirected-access-on-a-cluster-shared-volume-csv.aspx showed that the first thing to try was to “Turn off redirected access for this Cluster shared volume”.  Unfortunately this didn’t work.  I looked in Disk Management on each of the five nodes, which looked as below:

Each node should be able to see all the disks, but only the first node, the one on the far left, could see all five disks.  I live migrated the virtual machines off each node and rebooted each node one at a time.  Once completed, Disk Management looked like this:

 

Every node could now see all five disks.  I checked Failover Cluster Manager and all the CSV volumes had returned to Online status automatically.
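The CSV status can also be checked from PowerShell rather than Failover Cluster Manager.  This is a sketch, assuming the FailoverClusters module is available on a cluster node; it lists each volume with its owner, state and whether it is currently in redirected access.

```powershell
Import-Module FailoverClusters

# One line per CSV: owner node, state and the redirected access flag.
foreach ($csv in (Get-ClusterSharedVolume)) {
    foreach ($info in $csv.SharedVolumeInfo) {
        "{0}: owner={1}, state={2}, redirected={3}" -f `
            $csv.Name, $csv.OwnerNode, $csv.State, $info.RedirectedAccess
    }
}
```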

Cluster Shared Volumes Issue

Following a power down of our Hyper-V cluster, about half the virtual machines (VMs) would not start.  Looking in Failover Cluster Manager showed the VMs couldn’t start because the machine files were missing.  I looked at the Storage summary, but the cluster shared volumes were present and had a status of online:

This was odd: Failover Cluster Manager was saying the VM files were missing, but the disks were online.  I expanded the cluster shared volumes (CSV):

If you look at the CSV paths, you’ll see the path for Cluster Disk 1 is C:\ClusterStorage\Volume1 and the path for Cluster Disk 3 is C:\ClusterStorage\.  The reason the machines wouldn’t start was because the Cluster Disk 3 path was incorrect.  It should have been C:\ClusterStorage\Volume2.

I right-clicked on Cluster Disk 3, chose “Move this shared volume to another node” and selected one of the other cluster nodes.  When the move had completed, the path was correct and I was able to start the VMs.
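The path check and the move can also be done with the failover clustering cmdlets.  This is a sketch, assuming the FailoverClusters module on a cluster node; “NODE2” is a placeholder for one of your other nodes, and the disk name is the one from this example.

```powershell
Import-Module FailoverClusters

# List each CSV with its mount path under C:\ClusterStorage.
foreach ($csv in (Get-ClusterSharedVolume)) {
    foreach ($info in $csv.SharedVolumeInfo) {
        "{0} -> {1} (owner: {2})" -f $csv.Name, $info.FriendlyVolumeName, $csv.OwnerNode
    }
}

# Move the affected volume to another node; "NODE2" is a placeholder name.
Move-ClusterSharedVolume -Name "Cluster Disk 3" -Node "NODE2"
```

Running the listing again after the move shows whether the path has been corrected.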