
Hello,

TL;DR: the engine stops talking to a host once it has been rebooted.

[oVirt 4.2.3.5-1.el7.centos]

- From the web GUI, I upgrade a host, leaving the reboot checkbox checked
- the upgrade is OK (/var/log/yum.log shows successful updates, and the Ansible host deploy log is also OK)
- the reboot is OK (clean, SSH OK...)
- the host eventually appears as "Install failed"
- engine.log says:
2018-06-19 10:02:24,896+02 ERROR [org.ovirt.engine.core.bll.SshHostRebootCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] SSH reboot command failed on host 'serv-hv-prds06': SSH session timeout host 'root@ serv-hv-prds06' Stdout: Stderr:
2018-06-19 10:02:25,028+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] EVENT_ID: SYSTEM_FAILED_SSH_HOST_RESTART(198), A restart using SSH initiated by the engine to Host serv-hv-prds06 has failed.
2018-06-19 10:02:25,185+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] START, SetVdsStatusVDSCommand(HostName = serv-hv-prds06, SetVdsStatusVDSCommandParameters:{hostId='9c1566a4-8432-4de6-b30d-fd3b8e5fafca', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 833f9bd
2018-06-19 10:02:25,191+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] FINISH, SetVdsStatusVDSCommand, log id: 833f9bd
2018-06-19 10:02:25,191+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.UpgradeHostInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] Engine failed to restart via ssh host 'serv-hv-prds06' ('9c1566a4-8432-4de6-b30d-fd3b8e5fafca') after upgrade
2018-06-19 10:02:25,256+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necarnot@sdis.isere.fr@SDIS38-authz).
2018-06-19 10:02:30,755+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-69) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necarnot@sdis.isere.fr@SDIS38-authz).
- Manually activating the host puts it back on track without issue

SSH communication between the engine and the hosts is otherwise very reliable (VM migrations, maintenance...).

On this oVirt DC, I reproduced this issue twice, on 2 different hosts.

In the engine log above, you can see that I'm using my own account to manage this engine, as I have been doing for years with no issue. I'll try the exact same path with admin@internal to see what could change, but I don't see the link.

What other logs could I give you to debug this?

Regards,

--
Nicolas ECARNOT
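
To narrow engine.log down to a single upgrade attempt, the lines can be filtered on the correlation ID that appears in the second pair of square brackets (6e32b3ac in the excerpt above). The following is only a minimal sketch, assuming every entry follows the layout shown in that excerpt; `Entry`, `parse_entries`, `entries_for` and the regular expression are illustrative helpers, not part of oVirt.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Tuple

# Assumed layout, taken from the excerpt above:
#   <timestamp> <LEVEL> [<logger>] (<thread>) [<correlation id>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


class Entry(NamedTuple):
    ts: str
    level: str
    logger: str
    thread: str
    corr: str
    msg: str


def parse_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per line matching the assumed layout.

    Lines that do not match (stack traces, wrapped text) are skipped.
    """
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            yield Entry(**match.groupdict())


def entries_for(
    lines: Iterable[str],
    corr_id: str,
    levels: Tuple[str, ...] = ("ERROR", "WARN"),
) -> List[Entry]:
    """Return the entries carrying corr_id whose level is in levels."""
    return [
        entry
        for entry in parse_entries(lines)
        if entry.corr == corr_id and entry.level in levels
    ]
```

It could be used by opening the engine log (normally /var/log/ovirt-engine/engine.log on the engine machine) and calling `entries_for(fh, "6e32b3ac")` on the file handle. Note that the two HOST_UPGRADE_FAILED events above carry a different, longer correlation ID, so they need a second call.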

On 19/06/2018 at 10:14, Nicolas Ecarnot wrote:
> In the engine log above, you can see that I'm using my own account to manage this engine, as I have been doing for years with no issue. I'll try the exact same path with admin@internal to see what could change, but I don't see the link.

I just tried on another host, using admin@internal, and the same issue occurred.

> What other logs could I give you to debug this?

--
Nicolas ECARNOT