Hello, Pablo.
It looks like nodo1 lost its connection to the storage (sanlock on nodo1 couldn't
renew its leases), and nodo1 was then reset by the watchdog.
Are there any errors in the logs on the other nodes during this period (15:02 - 15:03)?
Are there any errors (around 15:02:13) in the Cisco 9000's log (other than those after
15:03:36, when nodo1 rebooted)?
Do you use a bond on nodo1 for the storage connection?
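If it helps with checking the other nodes, here is a minimal sketch for pulling out only the lines that fall in that window. It assumes each log line contains an HH:MM:SS timestamp; the function name and the sample lines are made-up placeholders, not real log output, so adjust it to the actual log format.

```python
import re
from datetime import time

# Assumption: every relevant log line contains an HH:MM:SS timestamp somewhere.
TS = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")

def lines_in_window(lines, start, end):
    """Yield lines whose first HH:MM:SS timestamp lies in [start, end]."""
    for line in lines:
        m = TS.search(line)
        if m is None:
            continue  # no timestamp on this line, skip it
        h, mi, s = map(int, m.groups())
        if h > 23 or mi > 59 or s > 59:
            continue  # looks like a timestamp but is not a valid time
        if start <= time(h, mi, s) <= end:
            yield line

# Placeholder lines for illustration only (not taken from any real log).
sample = [
    "15:01:59 placeholder line before the window",
    "15:02:13 placeholder line inside the window",
    "15:03:36 placeholder line after the window",
]
selected = list(lines_in_window(sample, time(15, 2, 0), time(15, 3, 0)))
print(selected)
```

To use it on a real file, pass the open file object instead of the sample list, for example `lines_in_window(open(path), time(15, 2), time(15, 3))`, where `path` is whichever log you want to check.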
--
Alexey
-----Original Message-----
From: Pablo Olivera <p.olivera(a)telfy.com>
Sent: Wednesday, February 16, 2022 11:04 AM
To: users(a)ovirt.org
Subject: [ovirt-users] Random reboots
Hi community,
We're dealing with an issue: we occasionally get random reboots on any of our
hosts.
We're using oVirt 4.4.3 in production with about 60 VMs distributed over
5 hosts. We have a virtualized engine and DRBD storage mounted over NFS.
The infrastructure is interconnected by a Cisco 9000 switch.
The last random reboot was yesterday, February 14th, at 03:03 PM (it appears in the log as
15:03 due to our time configuration), on host 'nodo1'.
At the moment of the reboot, we saw a link-down in the switch log on the port
where the host is connected.
I attach the logs of the engine and host 'nodo1' in case you can help us find the
cause of these random reboots.
Thanks in advance.
Pablo.