
Hi Nardus,

You might have hit an issue I have been hunting for some time ([1] and [2]). [1] could not be properly resolved because at the time I was not able to reproduce the issue on a dev setup. I suspect [2] is related.

Would you be able to prepare a thread dump from your engine instance? Additionally, please check for potential libvirt errors/warnings. Can you also paste the output of:

sudo yum list installed | grep vdsm
sudo yum list installed | grep ovirt-engine
sudo yum list installed | grep libvirt

According to previous reports, restarting the engine usually helps to restore connectivity with the hosts, at least for some time.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1845152
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1846338

regards,
Artur

On Thu, Aug 6, 2020 at 8:01 AM Nardus Geldenhuys <nardusg@gmail.com> wrote:
I also see this in the engine:
Aug 6, 2020, 7:37:17 AM VDSM someserver command Get Host Capabilities failed: Message timeout which can be caused by communication issues
On Thu, 6 Aug 2020 at 07:09, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Can you check for errors on the affected host? Most probably you need the vdsm logs.
Best Regards, Strahil Nikolov
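One way to act on the suggestion above is to filter a log for ERROR and WARN entries. The following is a minimal sketch, not an oVirt tool: the function name `scan_log` is hypothetical, and the regular expression assumes each entry starts with a timestamp like the engine line quoted in this thread (`2020-08-05 20:20:34,256+02 ERROR ...`). The vdsm log is assumed to use a similar prefix; adjust the pattern if yours differs.

```python
import re
from collections import Counter

# Assumed line prefix: "YYYY-MM-DD HH:MM:SS,mmm+ZZ LEVEL ...", as in the
# engine log line quoted in this thread. Adjust if your log format differs.
LEVEL_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\S*\s+(ERROR|WARN(?:ING)?)\b"
)


def scan_log(lines):
    """Return (counts, matches) for lines logged at ERROR or WARN level.

    `lines` is any iterable of strings, e.g. an open file object.
    """
    counts = Counter()
    matches = []
    for line in lines:
        m = LEVEL_RE.match(line)
        if m:
            counts[m.group(2)] += 1
            matches.append(line.rstrip("\n"))
    return counts, matches
```

To use it, open the log on the affected host (for vdsm this is usually `/var/log/vdsm/vdsm.log`) and pass the file object to `scan_log`, then print the returned lines around the time the host went unresponsive.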
On 6 August 2020 at 7:40:23 GMT+03:00, Nardus Geldenhuys <nardusg@gmail.com> wrote:
Hi Strahil
Hope you are well. I got the following error when I tried to confirm the reboot:
Error while executing action: Cannot confirm 'Host has been rebooted' Host. Valid Host statuses are "Non operational", "Maintenance" or "Connecting".
And I can't put it into maintenance; the only options are "restart" or "stop".
Regards
Nar
On Thu, 6 Aug 2020 at 06:16, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
After rebooting the node, have you "marked" it as rebooted?
Best Regards, Strahil Nikolov
On 5 August 2020 at 21:29:04 GMT+03:00, Nardus Geldenhuys <nardusg@gmail.com> wrote:
Hi oVirt land
Hope you are well. We have a bit of an issue, actually a big issue. We had some sort of dip. All the VMs are still running, but some of the hosts show "Unassigned" or "NonResponsive". All the hosts were showing UP and were fine before the dip. I increased vdsHeartbeatInSecond to 240, with no luck.
I still get a timeout on the engine lock even though I can connect to that host from the engine, using nc to test port 54321. I also restarted vdsmd and rebooted the host, with no luck.
nc -v someserver 54321
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 172.40.2.172:54321.
2020-08-05 20:20:34,256+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM someserver command Get Host Capabilities failed: Message timeout which can be caused by communication issues
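The nc test above only shows that the TCP port accepts connections; it does not show that the service behind it answers. As a rough next step, the sketch below connects and then attempts a TLS handshake with a timeout. It assumes vdsm is serving TLS on port 54321 (the usual default, but check your setup), and the helper name `probe` is hypothetical. Certificate verification is disabled and no client certificate is sent, so a "failed" result can simply mean the server rejected an unauthenticated client, whereas a "timeout" means nothing answered at the TLS layer at all.

```python
import socket
import ssl


def probe(host, port=54321, timeout=5.0):
    """Check TCP reachability, then try a TLS handshake on the same port.

    Returns {"tcp": bool, "tls": "ok" | "timeout" | "failed" | None,
             "error": str | None}.
    """
    result = {"tcp": False, "tls": None, "error": None}
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        result["error"] = f"tcp: {exc}"
        return result
    result["tcp"] = True

    # Verification is turned off on purpose: this only tests whether the
    # peer responds to a handshake, not whether it should be trusted.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with ctx.wrap_socket(sock, server_hostname=host):
            result["tls"] = "ok"
    except socket.timeout as exc:
        # The peer accepted the connection but never answered the handshake.
        result["tls"] = "timeout"
        result["error"] = f"tls: {exc}"
    except (ssl.SSLError, OSError) as exc:
        # The peer answered but the handshake did not complete.
        result["tls"] = "failed"
        result["error"] = f"tls: {exc}"
    finally:
        sock.close()
    return result
```

Running `probe("someserver")` from the engine machine and seeing `tcp` true with `tls` equal to "timeout" would point at the service on the host rather than at basic network reachability.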
Any troubleshooting ideas would be greatly appreciated.
Regards
Nar
--
Artur Socha
Senior Software Engineer, RHV
Red Hat