vSphere 5 & HA Heartbeat Datastores

 

I was busy updating my vSphere lab from 4.1 to 5 and ran into a warning on the first ESXi host I updated to ESXi 5.0. It read: “The number of vSphere HA heartbeat datastores for this host is 1, which is less than required: 2”. The message itself is fairly self-explanatory, but it prompted me to find out more, as I immediately knew it must be related to new functionality.

 

The Configuration Issue message

 

Pre-vSphere 5.0, if a host failed, or was just isolated on its Management Network, HA would restart the VMs that were running on that host and bring them up elsewhere. (I have actually seen this happen in our ESX 4.0 environment before!) With vSphere 5.0, HA has been overhauled, and I believe the new Datastore Heartbeat feature is part of making HA more intelligent, so that it can make better decisions when the HA Master loses network contact with a host. It should help significantly with HA-initiated restarts by letting HA more accurately tell the difference between a host that has actually failed and one that has just been isolated or split off from the others.

 

vCenter will automatically choose two datastores to use for the Datastore Heartbeat functionality. You can see which have been selected by clicking on your cluster in the vSphere Client, then choosing “Cluster Status”. Select the “Heartbeat Datastores” tab to see which are being used.

 

Cluster Status - viewing the elected HA Heartbeat Datastores

 

Without going into too much detail, this mechanism works with file locks on the datastores elected for this purpose. HA is able to determine whether a host has failed or is just isolated or split off on the network by looking at whether these files have been updated or not. After my lab upgrade I noticed a new folder on some of my datastores and wondered at first what these new files were doing there! If you take a look at the contents of the datastores listed in your Heartbeat Datastores tab, you should see the files that HA keeps a lock on for this functionality to work.
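To make the idea a bit more concrete, here is a minimal Python sketch of the decision described above. It is only an illustration, not VMware's actual algorithm: the names `HostObservation` and `classify_host` and the 15-second `timeout` are all made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class HostState(Enum):
    ALIVE = "alive"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


@dataclass
class HostObservation:
    # Seconds since a network heartbeat was last received from the host.
    network_heartbeat_age: float
    # Seconds since the host last updated its heartbeat file on a datastore.
    datastore_heartbeat_age: float


def classify_host(obs: HostObservation, timeout: float = 15.0) -> HostState:
    """Classify a host from the two heartbeat channels (hypothetical timeout)."""
    if obs.network_heartbeat_age <= timeout:
        return HostState.ALIVE
    if obs.datastore_heartbeat_age <= timeout:
        # The network is silent but the datastore file is still being updated,
        # so the host is still running rather than dead.
        return HostState.ISOLATED_OR_PARTITIONED
    # Neither channel shows any sign of life.
    return HostState.FAILED
```

For example, `classify_host(HostObservation(60, 3))` returns `HostState.ISOLATED_OR_PARTITIONED`, whereas a host that is silent on both channels comes back as `HostState.FAILED`.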

 

Files created on HA Heartbeat Datastores for the new functionality

 

So, if you notice this configuration issue message, chances are the ESXi 5 host in question simply doesn’t have enough datastores. This is likely to be quite common in lab environments, as traditionally we don’t tend to add many (well, at least I don’t!). In my case this was a test host for trying the update from 4.1 to 5, and I only had one shared datastore added. After adding my other two datastores, from my FreeNAS box and an HP iSCSI VSA, then selecting “Re-configure for HA” on my ESXi host, the message disappeared as expected. I believe there should be some advanced settings you could add to change the number of datastores required for this feature, but I have not looked into these yet. Generally, it is best to stick with VMware defaults (or so I say), as they will have been thought out carefully by the engineers, and changing advanced settings is usually not supported by VMware. However, if you find you are short on datastores to add and want to get rid of the warning in your lab environment, then this shouldn’t be a problem to change.
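As a rough illustration of the check behind this message, here is a small Python sketch. The function `heartbeat_datastore_issue` and its parameters are hypothetical and only model the behaviour described in this post. As far as I know, the matching HA advanced options are `das.heartbeatDsPerHost` (number of heartbeat datastores required, default 2) and `das.ignoreInsufficientHbDatastore` (suppresses the configuration issue), but please verify both against VMware's documentation for your version before relying on them.

```python
from typing import Optional


def heartbeat_datastore_issue(available: int,
                              required: int = 2,
                              ignore_insufficient: bool = False) -> Optional[str]:
    """Return the configuration issue text, or None if nothing needs reporting.

    `required` stands in for das.heartbeatDsPerHost and `ignore_insufficient`
    for das.ignoreInsufficientHbDatastore (assumed mapping, for illustration).
    """
    if ignore_insufficient or available >= required:
        return None
    return (
        "The number of vSphere HA heartbeat datastores for this host is "
        f"{available}, which is less than required: {required}"
    )
```

With one shared datastore, `heartbeat_datastore_issue(1)` reproduces the message from the top of this post. With three datastores, or with `ignore_insufficient=True`, it returns `None`.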