I’ve been testing the experimental Virtual Machine High Availability feature (aka VM Failure Monitoring) for a couple of days now. I must say it does just what VMware claims in the PDF: it resets a VM within the configured time when the heartbeat is lost. One thing that struck me, though, is that there’s hardly any evidence that HA did its job; in other words, no events are logged in VirtualCenter as far as I can see. Well, there was one error hinting that something was wrong: “Remote console on w2k3-001 disconnected”. I checked several log files but could not find any useful entries until I looked at /var/log/vmware/hostd.log. I know the PDF about this feature states “In this experimental version of Virtual Machine Failure Monitoring, no explicit notification is sent to the administrator.”, but I would at least expect some sort of error.

The following lines in /var/log/vmware/hostd.log indicated that VMware initiated the reset of the VM:

* Task Created : haTask-112-vim.VirtualMachine.reset-1098

VM HA was configured with the following parameters:

* das.FailureInterval = 30 (if no heartbeat is received within 30 seconds, initiate a restart)

By the way, I used the following Microsoft “hidden feature” to force a BSOD:

* Name: CrashOnCtrlScroll

Exit Registry Editor, and then restart the computer. While holding down the right Ctrl key, press Scroll Lock twice; Windows will generate a BSOD, and if you have set up VM HA correctly the VM will be reset within the das.FailureInterval time.
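Since VirtualCenter shows no event, a quick way to confirm that HA reset a VM is to scan hostd.log for the task entry quoted above. Here is a minimal sketch in Python. The regular expression is inferred from that single log line, so other hostd.log formats may not match, and the function names `find_reset_tasks` and `scan_log` are my own, not part of any VMware tool. The pattern only matches the `haTask-…-vim.VirtualMachine.reset-…` naming; I am assuming, not asserting, that this is enough to tell an HA reset from other resets.

```python
import re

# Pattern inferred from the one hostd.log line quoted in this post
# (assumption: other builds may format the task ID differently).
RESET_TASK = re.compile(r"haTask-\d+-vim\.VirtualMachine\.reset-\d+")


def find_reset_tasks(lines):
    """Return (line_number, task_id) for each line that mentions a VM reset task."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = RESET_TASK.search(line)
        if match:
            hits.append((number, match.group(0)))
    return hits


def scan_log(path="/var/log/vmware/hostd.log"):
    """Print every reset task found in the given log file."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as log:
        for number, task in find_reset_tasks(log):
            print(f"{path}:{number}: {task}")
```

Calling `scan_log()` on the host (or on a copy of the log) prints one line per reset task found, together with its line number in the file.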
