Re: [ovirt-users] Communication errors between engine and nodes?

12 Mar 2015

If I'm not mistaken, the heartbeat interval is set to 10 seconds by
default.

The command that is timing out queries the status of the VMs on a host.
Any idea why that would take so long? Does it happen on specific hosts?

On 11/03/15 18:40, Chris Adams wrote:
...
Once upon a time, Chris Adams <cma@cmadams.net> said:
...
2015-03-10 04:42:23,310 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] (DefaultQuartzScheduler_Worker-40) [75b9e6d9] Command ListVDSCommand(HostName = node5, HostId = 8dfd0195-f386-4e16-9379-a5287221d5bd, vds=Host[node5,8dfd0195-f386-4e16-9379-a5287221d5bd]) execution failed.  Exception: VDSNetworkException: VDSGenericException: VDSNetworkException: Heartbeat exeeded
I'm trying to dig into this some on my own (without knowing about
oVirt's internals); can somebody tell me the timeout for the dispatching
of commands to vdsm?  I get different things happening when the engine
thinks a node has "gone away", but they all start with the same
org.ovirt.engine.core.vdsbroker.vdsbroker bit (and have a network
timeout of some type).
I don't see anything in common in any of the logs at the time of the
error, so I'm trying to roll back to when the request was sent (but I
don't know how long it took for the engine to time out before the error
was logged).
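
One way to roll back from the error to the approximate send time is to
subtract the timeout from the timestamp on the log line. The following is
only a rough sketch: the function name search_window is made up for
illustration, and the 10-second default is an assumption based on the
heartbeat interval mentioned in the reply, not a confirmed engine setting.

```python
import re
from datetime import datetime, timedelta

# Leading timestamp of an engine log line, e.g. "2015-03-10 04:42:23,310"
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def search_window(log_line, timeout_seconds=10.0):
    """Return (earliest, latest) times between which the request was likely sent.

    timeout_seconds is an assumed value (hypothetical default of 10 s);
    adjust it to whatever the engine is actually configured to use.
    """
    match = TIMESTAMP_RE.match(log_line)
    if match is None:
        raise ValueError("no timestamp at start of line")
    # %f accepts the three-digit millisecond field after the comma
    logged_at = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return logged_at - timedelta(seconds=timeout_seconds), logged_at


if __name__ == "__main__":
    line = "2015-03-10 04:42:23,310 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ListVDSCommand] ..."
    start, end = search_window(line)
    print("look in the node logs between", start, "and", end)
```

The resulting window can then be compared against the logs on the node for
the same period, assuming the clocks on the engine and the node are in sync.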

Lior Vernia