Bryan,
In your engine logs I see:
2017-09-13 04:07:07,599-05 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
(DefaultQuartzScheduler3) [] Command 'GetAllVmStatsVDSCommand(HostName
= vm-host-colo-1, VdsIdVDSCommandParametersBase:{runAsync='true',
hostId='e75d4446-9bfc-47cb-8bf8-a2e681720b66'})' execution failed:
VDSGenericException: VDSNetworkException: Heartbeat exceeded
It would be great to understand what happened on the vdsm side, because the
engine was still trying to connect at 2017-09-13 09:30:46,275-05.
The vdsm logs you provided start at 2017-09-13 09:01:08,895-0500 and end at
2017-09-13 09:53:24,760-0500, so they do not cover the heartbeat failure
at 04:07:07.
Please provide vdsm logs from the time the issue occurred.
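
If it helps when picking which rotated vdsm.log to send, here is a minimal
Python sketch that checks whether a log's first and last timestamps bracket
the time of an engine error. The helper names (parse_ts, covers) are my own
and not part of vdsm or the engine; the timestamp formats are assumed from
the lines quoted in this thread.

```python
import re
from datetime import datetime

# Matches timestamps such as "2017-09-13 04:07:07,599-05" (engine.log)
# and "2017-09-13 09:01:08,895-0500" (vdsm.log). Formats are assumed
# from the log lines quoted in this thread.
TS_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})([+-]\d{2})(\d{2})?"
)


def parse_ts(text):
    """Return the first timestamp found in text as an aware datetime, or None."""
    m = TS_RE.search(text)
    if m is None:
        return None
    base, millis, off_h, off_m = m.groups()
    # Pad a two-digit UTC offset ("-05") to the "-0500" form that %z accepts.
    normalized = f"{base}.{millis}{off_h}{off_m or '00'}"
    return datetime.strptime(normalized, "%Y-%m-%d %H:%M:%S.%f%z")


def covers(first_line, last_line, event_line):
    """True if the event timestamp lies between the log's first and last lines."""
    stamps = [parse_ts(s) for s in (first_line, last_line, event_line)]
    if any(t is None for t in stamps):
        raise ValueError("no timestamp found in one of the lines")
    start, end, event = stamps
    return start <= event <= end


if __name__ == "__main__":
    vdsm_first = "2017-09-13 09:01:08,895-0500"
    vdsm_last = "2017-09-13 09:53:24,760-0500"
    # Heartbeat error from engine.log: outside the provided vdsm log.
    print(covers(vdsm_first, vdsm_last, "2017-09-13 04:07:07,599-05 ERROR"))  # False
    # Later reconnect attempt: inside the provided vdsm log.
    print(covers(vdsm_first, vdsm_last, "2017-09-13 09:30:46,275-05"))  # True
```

A rotated log for which covers() returns True for the 04:07:07 error is the
one that would be useful here.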
Thanks,
Piotr
On Wed, Sep 13, 2017 at 5:09 PM, Bryan Sockel <Bryan.Sockel@altn.com> wrote:
>
> Hi
>
> I am having an issue where a server is frequently set to non-responsive.
> Its VMs are set to unknown status, but still continue to run.
> This issue is isolated to just a single host. My setup is currently a 2
> Data Center configuration with 2 servers in each data center. The issue is
> occurring at my remote site.
>
> The primary storage volumes are set up on dedicated hardware, with the
> arbiter running on the server that is having issues. There is also another
> gluster replica volume hosted on this box; the replica is the other
> dedicated server.
>
> The logs are showing:
>
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler8) [] Command 'GetCapabilitiesVDSCommand(HostName =
> vm-host-colo-1, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> hostId='e75d4446-9bfc-47cb-8bf8-a2e681720b66',
> vds='Host[vm-host-colo-1,e75d4446-9bfc-47cb-8bf8-a2e681720b66]'})' execution
> failed: java.rmi.ConnectException: Connection timeout
>
> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> (DefaultQuartzScheduler8) [] Failure to refresh host 'vm-host-colo-1'
> runtime info: java.rmi.ConnectException: Connection timeout.
>
>
> I have attached the vdsm.log from the server with issues and the engine.log.
>
> Thanks
>
> Bryan Sockel
>
>