Issues with VM monitoring in HA clusters when using vSphere Client 5.0.0 build 455964: VM monitoring will not restart a failed VM when VMware Tools are down

I wanted to share this ‘news’ with everybody.

While trying to set up VM monitoring in an HA cluster using the VM Monitoring Only option, I ran into some issues: VM monitoring was not working. I thought there is not that much to configure to get this working, right? Open the cluster settings and go to the VM Monitoring section.

Make sure that the VM Monitoring option is changed from Disabled to VM Monitoring Only, and set the default cluster settings to Low, for example. That should do the trick, right? Yep, that does the trick. But what if you want to enable VM monitoring for only a couple of VMs in that cluster? Well, you first have to select all VMs in the Virtual Machine Settings section and disable VM Monitoring. After that, you enable it for the particular VMs that should be protected by selecting a value in the VM Monitoring column (High, Medium, Low, Custom).
OK, that’s pretty easy so far, no issue. Still having fun with the VM monitoring settings.
Let’s say you made a mistake and selected ‘Disabled’ for a VM that should be covered. You would then go back to those settings, select the VM, and switch it from ‘Disabled’ to ‘Low’. Should be good, right?

Nope, it will not be good at all.
If you are using vSphere Client 5.0.0 build 455964, I would say there is a 99% chance that it will not be OK. This client comes with the vSphere 5.0 vCenter Server installation, and it has an issue with saving the cluster configuration for VM monitoring.

I have a VM whose VM monitoring setting is currently Disabled. Let’s query it from PowerCLI.

(get-view -viewtype computeresource -filter @{'name'='my_cluster'}).configuration.dasvmconfig[0].dassettings.vmtoolsmonitoringsettings
Enabled          : True
VmMonitoring     : vmMonitoringDisabled
ClusterSettings  : False
FailureInterval  : 60
MinUpTime        : 240
MaxFailures      : 3
MaxFailureWindow : 86400

OK, all good so far. We are not using the cluster settings, and we have selected Disabled. Now let’s change this to Medium.

Save the config, and let’s check what our cluster looks like when queried from PowerCLI.

(get-view -viewtype computeresource -filter @{'name'='my_cluster'}).configuration.dasvmconfig[0].dassettings.vmtoolsmonitoringsettings
Enabled          : True
VmMonitoring     : vmMonitoringDisabled
ClusterSettings  : False
FailureInterval  : 60
MinUpTime        : 240
MaxFailures      : 3
MaxFailureWindow : 86400

No, I did not make a copy/paste mistake here. vmMonitoringDisabled is still present. In the vSphere Client you will always see that Medium is selected and that VM monitoring is enabled for this VM, but in reality nothing will happen to your VM when it freezes. If for some reason you want to keep using this buggy version of the vSphere Client, you will have to configure those clusters and VM monitoring with your own script, like this one taken from an Onyx dump:

# Look up the cluster ID and the VM's managed object reference value
$clusterid=(get-cluster 'your_cluster').id
$vmref=(get-vm your_vm).extensiondata.moref.value
# Build a cluster reconfiguration spec that carries one per-VM HA override
$spec = New-Object VMware.Vim.ClusterConfigSpecEx
$spec.dasVmConfigSpec = New-Object VMware.Vim.ClusterDasVmConfigSpec[] (1)
$spec.dasVmConfigSpec[0] = New-Object VMware.Vim.ClusterDasVmConfigSpec
# "edit" changes an existing override entry for this VM
$spec.dasVmConfigSpec[0].operation = "edit"
$spec.dasVmConfigSpec[0].info = New-Object VMware.Vim.ClusterDasVmConfigInfo
$spec.dasVmConfigSpec[0].info.key = New-Object VMware.Vim.ManagedObjectReference
$spec.dasVmConfigSpec[0].info.key.type = "VirtualMachine"
$spec.dasVmConfigSpec[0].info.key.value = $vmref
$spec.dasVmConfigSpec[0].info.dasSettings = New-Object VMware.Vim.ClusterDasVmSettings
# Keep the cluster defaults for restart priority and isolation response
$spec.dasVmConfigSpec[0].info.dasSettings.restartPriority = "clusterRestartPriority"
$spec.dasVmConfigSpec[0].info.dasSettings.isolationResponse = "clusterIsolationResponse"
# Enable VM monitoring for this VM with its own values instead of the cluster settings
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings = New-Object VMware.Vim.ClusterVmToolsMonitoringSettings
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.enabled = $true
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.vmMonitoring = "vmMonitoringOnly"
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.clusterSettings = $false
# Sensitivity values, in seconds except for maxFailures
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.failureInterval = 30
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.minUpTime = 120
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailures = 3
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailureWindow = 3600

# Apply the spec; $true applies it as an incremental change to the existing configuration
$clusterview = Get-View -Id $clusterid
$clusterview.ReconfigureComputeResource_Task($spec, $true)

You switch between low, medium, and high using these settings.

This part describes the high setting:
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.failureInterval = 30
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.minUpTime = 120
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailures = 3
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailureWindow = 3600

This part describes the medium setting:
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.failureInterval = 60
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.minUpTime = 240
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailures = 3
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailureWindow = 86400

This part describes the low setting:
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.failureInterval = 120
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.minUpTime = 480
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailures = 3
$spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.maxFailureWindow = 604800
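
The three presets listed above can also be kept in one lookup table so that a script does not repeat four assignments per level. This is only a minimal sketch: the variable $VmMonitoringPresets and the function Set-VmMonitoringPreset are hypothetical names of my own, not part of PowerCLI, and the helper simply copies the values listed above onto whatever settings object it is given.

```powershell
# Sensitivity presets, using the same values as listed above
# (seconds, except maxFailures which is a count).
$VmMonitoringPresets = @{
    high   = @{ failureInterval = 30;  minUpTime = 120; maxFailures = 3; maxFailureWindow = 3600 }
    medium = @{ failureInterval = 60;  minUpTime = 240; maxFailures = 3; maxFailureWindow = 86400 }
    low    = @{ failureInterval = 120; minUpTime = 480; maxFailures = 3; maxFailureWindow = 604800 }
}

# Hypothetical helper: copies one preset onto a monitoring-settings object,
# for example $spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings.
function Set-VmMonitoringPreset {
    param(
        [Parameter(Mandatory = $true)] $Settings,
        [Parameter(Mandatory = $true)] [ValidateSet('high', 'medium', 'low')] [string] $Level
    )
    $preset = $VmMonitoringPresets[$Level]
    foreach ($name in $preset.Keys) {
        # Property names are matched case-insensitively by PowerShell
        $Settings.$name = $preset[$name]
    }
    return $Settings
}
```

With the spec built as in the script above, calling Set-VmMonitoringPreset -Settings $spec.dasVmConfigSpec[0].info.dasSettings.vmToolsMonitoringSettings -Level medium before the reconfigure call would replace the four hard-coded assignments.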

OK, so this is clear. Now, this client does work for VM monitoring under one specific condition: when VM monitoring for the VM is set to Disabled and you change it to use the cluster settings. When you set it like this, proper values are displayed:

(get-view -viewtype computeresource -filter @{'name'='test_cluster'}).configuration.dasvmconfig[0].dassettings.vmtoolsmonitoringsettings
Enabled : True
VmMonitoring : vmAndAppMonitoring
# See this? It should be vmMonitoringOnly. This will work too, but it is not the option shown in the client; I was not using VM and application monitoring.
ClusterSettings : True
FailureInterval : 60
MinUpTime : 240
MaxFailures : 3
MaxFailureWindow : 86400

So if you are using this build of the vSphere Client and set VM monitoring to use the cluster settings, I bet you have it working properly. For those of you who have been playing around with cluster settings and VM monitoring for a while, there is a good chance that you changed the option from Disabled to Low/Medium/High/Custom at some point, and I bet that VM monitoring does not work properly for you.
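
To find out which VMs in a cluster are affected, the per-VM override entries can be filtered instead of being read one index at a time. The sketch below is an assumption-laden helper, not something taken from the Onyx dump: the function name Get-DisabledMonitoringOverride is hypothetical, and it only relies on the Key, DasSettings, VmToolsMonitoringSettings, ClusterSettings, and VmMonitoring properties that appear in the queries shown earlier.

```powershell
# Hypothetical helper: returns the managed object reference values (e.g. "vm-123")
# of VMs whose own override still says vmMonitoringDisabled.
function Get-DisabledMonitoringOverride {
    param([object[]] $DasVmConfig)

    foreach ($entry in $DasVmConfig) {
        $settings = $entry.DasSettings.VmToolsMonitoringSettings
        # Only report VMs that do NOT follow the cluster settings and are stored as disabled
        if ($settings -and -not $settings.ClusterSettings -and
            $settings.VmMonitoring -eq 'vmMonitoringDisabled') {
            $entry.Key.Value
        }
    }
}

# Usage, assuming an active PowerCLI session connected to vCenter:
# $cfg = (get-view -viewtype computeresource -filter @{'name'='my_cluster'}).configuration.dasvmconfig
# Get-DisabledMonitoringOverride -DasVmConfig $cfg
```

Any VM reported here that the client shows as Low, Medium, High, or Custom is a candidate for the scripted fix.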

In order to fix this issue with the client, you have to install vSphere Client 5.0.0 build 755629, which comes with the vCenter Server 5.0 Update 1 installation.
OK, so that’s probably it. One more thing I wanted to add: if you are wondering why a VM is not being rebooted when VMware Tools are down, it is probably because the das.iostatsInterval I/O check is in effect. It means that if your VM has VMware Tools down but is still performing I/O operations, it will not be marked for restart. You can clearly see this in fdm.log. So if you do not want I/O to be checked, add this option to your cluster’s HA advanced options list:
das.iostatsInterval with a value of 0

das.iostatsInterval – “Changes the default I/O stats interval for VM Monitoring sensitivity. The default is 120 (seconds). Can be set to any value greater than, or equal to 0. Setting to 0 disables the check.” Taken from the VMware KB.

From now on, you will see that the I/O check is ignored and the VM is rebooted when VMware Tools are down.
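
The advanced option can also be added from PowerCLI rather than through the client. This is a sketch under assumptions: it needs an active Connect-VIServer session and a PowerCLI release that includes the New-AdvancedSetting and Get-AdvancedSetting cmdlets, and 'your_cluster' is a placeholder name.

```powershell
# Assumes an active Connect-VIServer session; 'your_cluster' is a placeholder.
$cluster = Get-Cluster 'your_cluster'

# Add das.iostatsInterval = 0 to the HA advanced options, which disables the I/O check
New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name 'das.iostatsInterval' -Value 0 -Confirm:$false

# Read the option back to confirm that it was stored
Get-AdvancedSetting -Entity $cluster -Name 'das.iostatsInterval'
```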
