
Johan,

If there is a temporary networking issue and you still want the engine not to fence the host, you can increase the heartbeat interval in the engine configuration. This tells the engine to wait longer before assuming that the host is not responding.

Please provide the logs so we can understand why there is a communication issue in the first place.

Thanks,
Piotr

On Thu, Mar 17, 2016 at 12:52 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Mar 17, 2016 at 10:49 AM, Johan Kooijman <mail@johankooijman.com> wrote:
Hi all,
Since we upgraded to the latest oVirt node running 7.2, we're seeing that nodes become unavailable after a while. A node runs fine, with a couple of VMs on it, until it becomes non-responsive. At that moment it doesn't even respond to ICMP. It comes back by itself after a while, but oVirt fences the machine before that time and restarts the VMs elsewhere.
The engine reports this message:
VDSM host09 command failed: Message timeout which can be caused by communication issues
Is anyone else experiencing these issues with ixgbe drivers? I'm running on Intel X540-AT2 cards.
We will need engine and vdsm logs to understand this issue.
Can you file a bug and attach full logs?
Nir