Problems after upgrade from 4.4.3 to 4.4.4

Hi,

I have problems after upgrading my 2-node cluster from 4.4.3 to 4.4.4.

Initially, I performed the upgrade of the oVirt hosts using the oVirt GUI (I wasn't planning any changes). It appears that the upgrade broke the system.

On host1, the ovirt-engine was configured to run on the oVirt host itself (not self-hosted engine). After the upgrade, the oVirt GUI didn't load in the browser anymore. I tried to fix the issue by migrating to self-hosted engine, which did not work, so I ran engine restore and engine-setup in order to get back to the initial state.

I am now able to log in to the oVirt GUI again, but I am having the following problems:

- host1 is in status "Unassigned", and it has the SPM role. It cannot be set to maintenance mode, nor re-installed from the GUI, but I am able to reboot the host from oVirt.
- All Storage Domains (all NFS) are inactive.
- In /var/log/messages, the following message appears frequently: "vdsm[5935]: ERROR ssl handshake: socket error, address: ::ffff:192.168.100.61"

The cluster is down and no VMs can be run. I don't know how to fix any of these issues. Does anyone have an idea?

I am appending a tar file containing log files to this email: http://gofile.me/5fp92/d7iGEqh3H

Many thanks
Toni

Have you tried to set another host as SPM? Also, you can mark the host as rebooted (I assume you have no power management configured) from Hosts -> 3 dots in the upper right of the UI.

Also check if the sanlock, supervdsmd and vdsmd services are healthy and running.

Best Regards,
Strahil Nikolov

On Fri, Feb 12, 2021 at 1:46, tferic@swissonline.ch wrote:
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/3QM4ATVPNXWJVNT2BCY7IFX63JYROZSD/

Hi,

On host1, all services are running, including "sanlock", "supervdsmd" and "vdsmd".

On host2, the services "ovsdb-server" and "ovirt-imageio" are not running, and as a result, "vdsmd" is also not running. So the cluster is down. In the log of host2, I can find: "ovsdb-server.service: Failed at step EXEC spawning /usr/share/openvswitch/scripts/ovs-ctl: Exec format error".

host1 has status "Unassigned" and host2 has status "NonResponsive", so the SPM role cannot be assigned to another host.

Any ideas how to fix this?

Kind regards
Toni

On 13 Feb 2021, at 18:31, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Have you tried to set another host as SPM? Also, you can mark the host as rebooted (I assume you have no power management configured) from Hosts -> 3 dots in the upper right of the UI.
Also check if the sanlock, supervdsmd and vdsmd services are healthy and running.
Best Regards, Strahil Nikolov
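
The service check suggested in this thread can also be scripted. The following is a minimal Python sketch, not part of the original messages: it assumes systemd hosts where `systemctl is-active` is available, the unit list simply collects the service names mentioned above, and the function names are made up for illustration.

```python
import subprocess
from typing import Callable, Dict, Iterable, List

# Units named in this thread; adjust the list for your own hosts.
SERVICES = ("sanlock", "supervdsmd", "vdsmd", "ovsdb-server", "ovirt-imageio")


def systemctl_is_active(unit: str) -> str:
    """Return the state word printed by `systemctl is-active <unit>`."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # systemctl is not installed on this machine.
        return "systemctl-not-found"
    return result.stdout.strip() or "unknown"


def service_states(
    units: Iterable[str],
    probe: Callable[[str], str] = systemctl_is_active,
) -> Dict[str, str]:
    """Map each unit name to the state reported by the probe function."""
    return {unit: probe(unit) for unit in units}


def not_running(states: Dict[str, str]) -> List[str]:
    """Sorted names of units whose state is anything other than 'active'."""
    return sorted(name for name, state in states.items() if state != "active")
```

Running `print(not_running(service_states(SERVICES)))` on each host would list the units that need attention. The `probe` parameter exists only so the logic can be exercised without a live systemd.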

Have you tried rebooting the engine?

On Sun, Feb 14, 2021 at 4:18, tferic@swissonline.ch wrote:

Yes, several times. Same result. I also installed the latest dnf updates and rebooted. Same result.

On 14.02.21 20:24, Strahil Nikolov wrote:
Have you tried rebooting the engine?

Try to fix host2. All critical services should be running:

- sanlock
- vdsmd
- supervdsmd
- ovirt-ha-broker
- ovirt-ha-agent

Then check if host2 can reach the storage. I guess that if you manage to fix the second host, you can mark the first host as rebooted, or try to change the SPM role to host2.

Best Regards,
Strahil Nikolov

On Mon, Feb 15, 2021 at 18:16, tferic@swissonline.ch wrote:

I managed to get host2 online and take the SPM role. I can run virtual machines now. In the end, the cause was a corrupt CentOS 7 installation (many library files with 0 KB size).

But host1 is still in status "Unassigned". I tried to use the option "Confirm host has been rebooted", but it did not help. All options are greyed out except the "SSH Management" entries "Restart" and "Stop". I cannot set the host to maintenance mode, nor can I delete it. I would like to reinitialize or reinstall host1. How can I do that?

In the Events of oVirt, I can see the following message roughly every minute:

VDSM host1 command Get Host Capabilities failed: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Any ideas how I could proceed?

Best regards
Toni

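
The kind of corruption described above (library files with 0 KB size) can be spotted with a short scan. This is only a sketch under assumptions: the function name is made up, the directory to scan (for example `/usr/lib64`) is a guess about where such files would live, and some empty files are legitimate, so a hit is a hint rather than proof. On RPM-based systems, `rpm -Va` gives a more thorough verification of installed files against the package database.

```python
import os
import stat
from typing import List


def zero_byte_files(root: str) -> List[str]:
    """Return sorted paths of regular files under root whose size is 0 bytes."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so links are never counted.
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or cannot be inspected
            if stat.S_ISREG(info.st_mode) and info.st_size == 0:
                hits.append(path)
    return sorted(hits)
```

For example, `zero_byte_files("/usr/lib64")` would return every empty regular file below that directory.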
On 16 Feb 2021, at 17:35, Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Try to fix host2. All critical services should be running: sanlock, vdsmd, supervdsmd, ovirt-ha-broker, ovirt-ha-agent
Then check if host2 can reach the storage. I guess that if you manage to fix the second host, you can mark the first host as rebooted, or try to change SPM role to host2.
Best Regards, Strahil Nikolov
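
For the "check if host2 can reach the storage" step, a coarse first test is whether the NFS server accepts TCP connections at all. The sketch below assumes the server listens on the standard NFS port 2049; the function name and any host name passed to it are hypothetical. A successful connection only shows that the port is reachable, not that the export can actually be mounted.

```python
import socket

NFS_PORT = 2049  # standard NFS port; an assumption about this setup


def can_connect(host: str, port: int = NFS_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `can_connect("nfs.example.com")` from host2, with the placeholder replaced by the real storage server, would show whether the network path is open before looking at mount options or exports.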

Usually confirming that the host was rebooted should be enough. I guess you have to take a look at the engine's logs.

Best Regards,
Strahil Nikolov

On Mon, Apr 5, 2021 at 15:10, tferic@swissonline.ch wrote:
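
Looking through the engine's logs can be narrowed down with a small filter. This is a sketch, not an oVirt tool: it assumes the engine log is at its default location `/var/log/ovirt-engine/engine.log`, and the search strings (such as "ERROR" and the host name) are guesses about what the relevant lines contain. The function names are made up for illustration.

```python
from typing import Iterable, List

# Default engine log location; adjust if your installation differs.
ENGINE_LOG = "/var/log/ovirt-engine/engine.log"


def matching_lines(lines: Iterable[str], needles: Iterable[str]) -> List[str]:
    """Return lines containing every needle (case-sensitive), newlines stripped."""
    wanted = list(needles)
    return [
        line.rstrip("\n")
        for line in lines
        if all(needle in line for needle in wanted)
    ]


def scan_log(path: str, needles: Iterable[str]) -> List[str]:
    """Apply matching_lines to a log file, tolerating odd bytes in the file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return matching_lines(handle, needles)
```

On the engine machine, `scan_log(ENGINE_LOG, ["ERROR", "host1"])` would return the error lines that mention host1, which is where a certificate problem like the PKIX message would be expected to show up.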
Participants (2):
- Strahil Nikolov
- tferic@swissonline.ch