Hi, I am working on a CentOS 7 based oVirt 3.6 system (ovirt-engine/db on one machine, two
separate oVirt VM hosts) which has been running fine but mostly ignored for 2-3 years.
Recently it was decided to update the OS as it was far behind on security updates, so one
host was put into maintenance mode, yum update'd, and rebooted. When I then tried to take
it out of maintenance mode, it showed up as "non-responsive".
If I look in /var/log/ovirt-engine/engine.log on the engine machine I see for this host
(vmserver2):
"ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
(DefaultQuartzScheduler_Worker-36) [2bc0978d] Command
'GetCapabilitiesVDSCommand(HostName = vmserver2,
VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
hostId='6725086f-42c0-40eb-91f1-0f2411ea9432',
vds='Host[vmserver2,6725086f-42c0-40eb-91f1-0f2411ea9432]'})' execution
failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed"
and thereafter more errors. This keeps repeating in the log.
In the oVirt GUI I see multiple occurrences of log entries for the problem host:
"vmserver2...command failed: Vds timeout occurred"
"vmserver2...command failed: Heartbeat exceeded"
"vmserver2...command failed: internal error: Unknown CPU model
Broadwell-noTSX-IBRS"
Firewall rules look identical to those on the other host, which is working normally but has
not been updated.
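To rule out the network side of the "Connection failed" error, one check that could be run from the engine machine is a plain TCP probe of the host's VDSM port. The following is only a minimal sketch: it assumes VDSM is listening on its default port 54321 and that the name vmserver2 resolves from the engine, and the helper name port_open is made up for illustration.

```python
import socket

# Assumption: VDSM is on its default management port; adjust if it was changed.
VDSM_PORT = 54321


def port_open(host, port=VDSM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling port_open("vmserver2") on the engine machine would return False if the port is blocked or nothing is listening there. A True result only shows that the port accepts connections, not that vdsmd itself is healthy.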
Any thoughts about how to fix or further troubleshoot this?