Johan,

If there is a temporary networking issue and you still do not want the engine to fence the
host, you can increase the heartbeat interval in the engine configuration. This tells the
engine to wait longer before assuming that the host is not responding.
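
To illustrate the idea only (this is not the engine's actual code; the function name and
the numbers below are made up for the example), a longer interval lets a short outage pass
without the host being treated as unresponsive:

```python
def is_host_unresponsive(seconds_since_last_reply: float,
                         heartbeat_interval: float) -> bool:
    """Hypothetical check: the host counts as unresponsive once it has
    been silent for longer than the configured heartbeat interval."""
    return seconds_since_last_reply > heartbeat_interval


# Assumed example: the host stops answering for 20 seconds.
outage_seconds = 20.0

# With a short interval the outage exceeds the limit.
print(is_host_unresponsive(outage_seconds, heartbeat_interval=10.0))

# With a longer interval the same outage is tolerated.
print(is_host_unresponsive(outage_seconds, heartbeat_interval=30.0))
```

The trade-off is that a host that is really down is also detected later.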

Please provide the logs so we can understand why there is a communication issue in the
first place.

Thanks,
Piotr

On Thu, Mar 17, 2016 at 12:52 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Mar 17, 2016 at 10:49 AM, Johan Kooijman <mail@johankooijman.com> wrote:
> Hi all,
>
> Since we upgraded to the latest ovirt node running 7.2, we're seeing that
> nodes become unavailable after a while. It's running fine, with a couple of
> VMs on it, until it becomes non-responsive. At that moment it doesn't even
> respond to ICMP. It'll come back by itself after a while, but oVirt fences
> the machine before that time and restarts VMs elsewhere.
>
> Engine tells me this message:
>
> VDSM host09 command failed: Message timeout which can be caused by
> communication issues
>
> Is anyone else experiencing these issues with ixgbe drivers? I'm running on
> Intel X540-AT2 cards.

We will need engine and vdsm logs to understand this issue.

Can you file a bug and attach full logs?

Nir