Hi
I'm having an issue where one of my hosts is frequently marked as not responsive. Its VMs are set to Unknown status but continue to run. The issue is isolated to a single host. My setup is currently a two-data-center configuration with two servers in each data center, and the issue is occurring at my remote site.
The primary storage volumes are set up on dedicated hardware, with the arbiter running on the server that is having issues. This box also hosts another Gluster replica volume; its replica partner is the other dedicated server.
The engine log shows:
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler8) [] Command 'GetCapabilitiesVDSCommand(HostName = vm-host-colo-1, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='e75d4446-9bfc-47cb-8bf8-a2e681720b66', vds='Host[vm-host-colo-1,e75d4446-9bfc-47cb-8bf8-a2e681720b66]'})' execution failed: java.rmi.ConnectException: Connection timeout
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (DefaultQuartzScheduler8) [] Failure to refresh host 'vm-host-colo-1' runtime info: java.rmi.ConnectException: Connection timeout.
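To get a feel for how often this happens, the refresh failures could be tallied per host from engine.log. The following is only a rough Python sketch: the regular expression is an assumption based solely on the two lines quoted above, and count_refresh_failures is a hypothetical helper, not part of oVirt.

```python
import re
from collections import Counter

# Assumption: failures appear in engine.log in the same form as the
# "Failure to refresh host '...' runtime info" line quoted above.
FAILURE_RE = re.compile(r"Failure to refresh host '([^']+)' runtime info")


def count_refresh_failures(lines):
    """Return a Counter of refresh failures keyed by host name."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            # group(1) is the host name between the single quotes
            counts[match.group(1)] += 1
    return counts
```

It could be run over the log with something like count_refresh_failures(open(path)), where path points at wherever engine.log lives on the engine machine.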
I have attached the vdsm.log from the affected server, along with the engine.log.
Thanks