Lost console access to VMs after updating

We have a couple of OLVM instances, completely separate, each with its own manager (ovirt-engine). One is lab and one is prod. I updated the lab engine from 4.5.4-1.0.31.el8 to 4.5.5-1.22.el8, and as soon as that was done and it rebooted I could no longer open VM consoles.

This was the update process followed:
https://docs.oracle.com/en/virtualization/oracle-linux-virtualization-manage...

In troubleshooting I checked for the SASL db, which appears to be /etc/libvirt/passwd.db:

    # grep sasldb_path /etc/sasl2/libvirt.conf
    sasldb_path: /etc/libvirt/passwd.db

I can verify the vdsm user and realm are in there:

    # sasldblistusers2 -f /etc/libvirt/passwd.db
    vdsm@ovirt: userPassword

We are using noVNC in the browser. When I click on a VM and then Console, it opens a new window that just shows the "connecting" message. I switched to Native VNC client and downloaded the console.vv file, which looks normal inside. When I open it with vnc viewer, remote-viewer, etc., I get a window that connects and authenticates with the password found in console.vv, but then I am presented with a display that says "Guest has not initialized the display (yet)."

Thinking perhaps I needed to update the KVM hosts too, I proceeded with the update document listed above, but even with the hosts updated nothing has changed.

If I tail /var/log/messages when opening a console connection, all I see is this:

    Oct 14 10:33:52 olvm2 kernel: sd 6:0:0:1: Warning! Received an indication that the LUN reached a thin provisioning soft threshold.
    Oct 14 10:34:41 olvm2 saslpasswd2[55701]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
    Oct 14 10:34:41 olvm2 saslpasswd2[55701]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
    Oct 14 10:34:41 olvm2 saslpasswd2[55701]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found
    Oct 14 10:34:41 olvm2 saslpasswd2[55701]: error deleting entry from sasldb: BDB0073 DB_NOTFOUND: No matching key/data pair found

I checked the production OLVM cluster, which has not been updated and has properly functioning VNC consoles, and it produces the exact same errors in the logs when opening a console. So I don't think that error is the cause of the issue, although it is likely something that needs to be fixed.

If I check the VM guest log when opening a console, I get nothing:

    tail -f /var/log/libvirt/qemu/<vm_hostname>.log

Has anyone else run into this issue when updating? Any ideas on what more I can look at or troubleshoot?

Thanks
Malcolm
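For the manual test described above (taking the connection details out of console.vv and pointing a standalone VNC client at them), a small helper can pull individual fields out of the file. This is only a sketch: it assumes the usual virt-viewer layout with a [virt-viewer] section and key=value lines such as host, port, and password, and the function name vv_get is made up for illustration.

```shell
#!/bin/sh
# Print the value of one key from the [virt-viewer] section of a
# console.vv file. Assumes plain "key=value" lines; vv_get is a
# hypothetical helper name, not part of any oVirt tooling.
vv_get() {
    # $1 = path to console.vv, $2 = key to look up (e.g. host, port)
    awk -F= -v key="$2" '
        # Track whether we are inside the [virt-viewer] section.
        /^\[/ { in_section = ($0 == "[virt-viewer]"); next }
        # Print everything after the first "=" so values containing "=" survive.
        in_section && $1 == key { sub(/^[^=]*=/, ""); print; exit }
    ' "$1"
}
```

Usage would look like `vv_get console.vv host` and `vv_get console.vv port`, which gives the host and port to type into the VNC client. As far as I know the password in console.vv is short-lived, so the file should be freshly downloaded before testing.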

Hi Malcolm,

Select the cluster at Compute > Cluster and edit it. Then, in the Console tab on the left, check whether Enable VNC Encryption is set. If it is enabled, you will need to disable it and then, using the web UI, put a host into Maintenance mode and click Installation > Reinstall.

Marcos

-----Original Message-----
From: malcolm.strydom@pacxa.com <malcolm.strydom@pacxa.com>
Sent: Monday, October 14, 2024 11:44 AM
To: users@ovirt.org
Subject: [External] : [ovirt-users] Lost console access to VMs after updating

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://urldefense.com/v3/__https://www.ovirt.org/privacy-policy.html__;!!AC...
oVirt Code of Conduct: https://urldefense.com/v3/__https://www.ovirt.org/community/about/community-...
List Archives: https://urldefense.com/v3/__https://lists.ovirt.org/archives/list/users@ovir...

On Tue, Oct 15, 2024 at 2:12 AM <malcolm.strydom@pacxa.com> wrote:
We have a couple OLVM instances. Completely separate with their own managers (ovirt-engine). One is lab and one is prod. I updated the lab engine from 4.5.4-1.0.31.el8 to 4.5.5-1.22.el8 and as soon as that was done and it rebooted I no longer can open VM consoles.
This was the update process followed:
https://docs.oracle.com/en/virtualization/oracle-linux-virtualization-manage...
Hi Malcolm,

In a lab environment I have, I replicated your steps, and even with only the engine updated I'm still able to connect to the VNC console of my VM (Ubuntu 20.04). I'm using Firefox 130.0.1 on a Fedora 40 system to connect to the VNC console of the guest. This is a nested environment, so the 2 hypervisors are actually VMs and the Self Hosted Engine is an L2 VM inside.

My starting version when selecting Help --> About on the engine:

    Software Version:4.5.4-1.0.31.el8

Kernel version on my engine (and also hosts): 5.15.0-207.156.6.el8uek.x86_64

Steps done:

    dnf update oracle-ovirt-release-45-el8

This took the package from oracle-ovirt-release-45-el8-1.0-24.el8.x86_64 to oracle-ovirt-release-45-el8-1.0-28.el8.x86_64.

Then:

    [root@olvmnesteng ~]# engine-upgrade-check
    VERB: Creating transaction
    VERB: Queue package ovirt-engine-setup for update
    VERB: Building transaction
    VERB: Transaction built
    VERB: Transaction Summary:
    VERB: install : ovirt-engine-setup-plugin-websocket-proxy-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-plugin-cinderlib-4.5.5-1.22.el8.noarch
    VERB: install : python3-ovirt-engine-lib-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-plugin-imageio-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-plugin-ovirt-engine-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-plugin-ovirt-engine-common-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.5.5-1.22.el8.noarch
    VERB: install : ovirt-engine-setup-base-4.5.5-1.22.el8.noarch
    VERB: remove : ovirt-engine-setup-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-base-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-plugin-cinderlib-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-plugin-imageio-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-plugin-ovirt-engine-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-plugin-ovirt-engine-common-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.5.4-1.0.31.el8.noarch
    VERB: remove : ovirt-engine-setup-plugin-websocket-proxy-4.5.4-1.0.31.el8.noarch
    VERB: remove : python3-ovirt-engine-lib-4.5.4-1.0.31.el8.noarch
    VERB: Closing transaction with commit
    VERB: Calling _plugins._unload
    Upgrade available.
    [root@olvmnesteng ~]#

Then dnf update ovirt\*setup\*:

    [root@olvmnesteng ~]# dnf update ovirt\*setup\*
    Last metadata expiration check: 0:00:53 ago on Tue 15 Oct 2024 03:50:12 PM CEST.
    Dependencies resolved.
    =====================================================================================================
     Package                                            Architecture  Version         Repository    Size
    =====================================================================================================
    Upgrading:
     ovirt-engine-dwh-grafana-integration-setup         noarch        4.5.8-1.4.el8   ovirt-4.5     92 k
     ovirt-engine-dwh-setup                             noarch        4.5.8-1.4.el8   ovirt-4.5     98 k
     ovirt-engine-setup                                 noarch        4.5.5-1.22.el8  ovirt-4.5     21 k
     ovirt-engine-setup-base                            noarch        4.5.5-1.22.el8  ovirt-4.5    125 k
     ovirt-engine-setup-plugin-cinderlib                noarch        4.5.5-1.22.el8  ovirt-4.5     42 k
     ovirt-engine-setup-plugin-imageio                  noarch        4.5.5-1.22.el8  ovirt-4.5     29 k
     ovirt-engine-setup-plugin-ovirt-engine             noarch        4.5.5-1.22.el8  ovirt-4.5    194 k
     ovirt-engine-setup-plugin-ovirt-engine-common      noarch        4.5.5-1.22.el8  ovirt-4.5    131 k
     ovirt-engine-setup-plugin-vmconsole-proxy-helper   noarch        4.5.5-1.22.el8  ovirt-4.5     41 k
     ovirt-engine-setup-plugin-websocket-proxy          noarch        4.5.5-1.22.el8  ovirt-4.5     42 k
     python3-ovirt-engine-lib                           noarch        4.5.5-1.22.el8  ovirt-4.5     42 k

    Transaction Summary
    =====================================================================================================
    Upgrade  11 Packages

    Total download size: 856 k
    Is this ok [y/N]:

Then:

    engine-setup
    dnf update
    reboot (exiting from Global Maintenance)

Now:

    Software Version:4.5.5-1.22.el8

Kernel on the engine VM: 5.15.0-300.163.18.1.1.el8uek.x86_64

I'm going to update the hosts as well.

Gianluca
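The engine upgrade sequence listed in the message above can be collected into one script for anyone repeating the test. This is a rough sketch, not an official procedure: the commands are the ones quoted in the message, while the DRY_RUN switch and the function names run_step and engine_upgrade are assumptions added for illustration. Entering and leaving Global Maintenance on a Self Hosted Engine is deliberately left out.

```shell
#!/bin/sh
# Engine upgrade steps as quoted in the thread, in order.
# DRY_RUN defaults to 1 here, which only prints each command.
# run_step and engine_upgrade are hypothetical helper names.
DRY_RUN=${DRY_RUN:-1}

run_step() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"      # show the command instead of running it
    else
        "$@"                      # run the command as given
    fi
}

engine_upgrade() {
    # The && chain stops at the first command that exits non-zero.
    run_step dnf update oracle-ovirt-release-45-el8 &&
    run_step engine-upgrade-check &&
    run_step dnf update 'ovirt*setup*' &&
    run_step engine-setup &&
    run_step dnf update &&
    run_step reboot
}
```

Calling `engine_upgrade` with the default DRY_RUN=1 prints the six commands so they can be reviewed first. Several of them prompt for confirmation, so running them by hand is probably still the safer choice.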

Thank you, Gianluca, for attempting to replicate the issue. Also thank you, Marcos, for your input. I replied to your message earlier but for some reason it is not showing up.

Marcos - I had previously responded saying I validated that VNC encryption is not enabled. I also put each host into maintenance mode as you suggested and did a reinstall, and it did not resolve the issue.

Gianluca - My versions, as you showed, are:

    Software version: 4.5.5-1.22.el8
    Kernel on the engine: 5.15.0-300.163.18.1.el8uek.x86_64

I have tried Chrome and Firefox with the same results, but I don't believe it to be a browser issue, because I can switch to Native VNC instead of noVNC, download the console.vv file, look inside it for the connection info and password, manually make the connection with a VNC viewer app, authenticate with the password, and still get a blank screen saying "Guest has not initialized the display (yet)."

The only thing different on our setup I can think of is that we had to disable OAuth logins and use the older AAA authentication, since we are still migrating production VMs from OVM into OLVM. Part of the migration uses the virt-v2v command, which does not support keycloak authentication, so we were forced to disable it. Our migration process is loosely based on this article and then tweaked to our specifics:
https://blogs.oracle.com/scoter/post/how-to-migrate-oracle-vm-to-oracle-linu...

Here are the exact steps we took to disable SSO:

#####
Disable the external SSO in ovirt engine

Edit /etc/ovirt-engine/engine.conf.d/12-setup-keycloak.conf and change:

    KEYCLOAK_BUNDLED=false
    ENGINE_SSO_ENABLE_EXTERNAL_SSO=false

Disable HTTPD openidc configuration

    mv /etc/httpd/conf.d/internalsso-openidc.conf /etc/httpd/conf.d/internalsso-openidc.conf.disabled

Update oVirt OVN provider

Edit /etc/ovirt-provider-ovn/conf.d/10-setup-ovirt-provider-ovn.conf and comment out this line:

    ovirt-admin-user-name=admin@ovirt@internalsso

Re-run the engine-setup

    engine-setup --otopi-environment="OVESETUP_CONFIG/keycloakEnable=bool:False OVESETUP_CONFIG/keycloakSupported=bool:False" --offline

Restart all related services

    systemctl restart ovirt-engine httpd ovirt-provider-ovn grafana-server

With AAA authentication login is "admin"
#####

I will point out we've done prior upgrades with SSO disabled and it has not been an issue. I only bring it up as we're out of ideas and unsure what has broken console access. As part of troubleshooting I re-ran the engine-setup command as shown above, like this:

    engine-setup --otopi-environment="OVESETUP_CONFIG/keycloakEnable=bool:False OVESETUP_CONFIG/keycloakSupported=bool:False" --offline

I restarted services and there was no change.

One other minor customization we made some time ago was to make noVNC the default console for our VMs. We achieved that with the following commands:

    engine-config -s ClientModeVncDefault=NoVnc
    systemctl restart ovirt-engine

Other than that our install is pretty straightforward. Our production upgrades have been pushed out until we can find a resolution to what broke these consoles in lab on upgrade.

Thanks again
Malcolm

Just to eliminate any doubt that having OAuth disabled was somehow causing issues, I undid the steps described above and re-ran engine-setup like this:

    engine-setup --otopi-environment="OVESETUP_CONFIG/keycloakEnable=bool:True OVESETUP_CONFIG/keycloakSupported=bool:True"

Restarted all services:

    systemctl restart ovirt-engine httpd ovirt-provider-ovn grafana-server

I then logged in with admin@ovirt, but nothing had changed. Same console issue. So I reverted it back to AAA logins as before, which mimics our production environment, and it will stay that way until all OVM clusters and environments are fully migrated over to OLVM.

Thanks
Malcolm

I finally figured this out.

After playing around with this some more, I found that if I manually connected to a KVM host with a VNC client I got a console. Why I was getting blank screens previously I'm unsure, but I suspect the console of the VM was asleep and I possibly didn't push keys on the console to wake it. Either way, this meant my issue was a configuration problem in either the ovirt-engine or the websocket proxy, because I could now get VNC working manually with a standalone client.

Long story short, after tracing things step by step I eventually found connection refused errors in my web browser by pressing F12 and checking the console output. The rest of the error was:

    "because it violates the following Content Security Policy directive: "default-src 'self'". Note that 'connect-src' was not explicitly set, so 'default-src' is used as a fallback."

So I dug around /etc/httpd/conf.d/ and found this file, which was apparently changed by the recent update:

    olvm45-security-fixes.conf

Looking inside that file I found this at the bottom of it:

    # Refer for More info https://content-security-policy.com/
    #This policy allows images, scripts, AJAX, form actions, and CSS from the same origin,
    # and does not allow any other resources to load (eg object, frame, media, etc). It is a good starting point for many sites.
    Header always set Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data:"

I modified that "Header" line to include my engine:

    Header always set Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; connect-src wss://<engine-fqdn>:6100;"

Restarted httpd and, what do you know, noVNC web consoles work again. What a horrible little line to add to the Apache config in an update. Took me long enough to find it.

Hopefully this post helps someone else with this issue searching archives later.

Malcolm
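The browser error quoted above comes down to the CSP fallback rule: when a policy has no connect-src directive, default-src decides where WebSocket connections may go. A small helper can make that visible for any policy string. This is only a sketch under simple assumptions (directive names are compared as written, and no source matching is attempted); the function name csp_effective_sources is made up for illustration.

```shell
#!/bin/sh
# Report which directive of a Content-Security-Policy value governs a
# given fetch type, falling back to default-src when the wanted
# directive is absent. csp_effective_sources is a hypothetical helper.
csp_effective_sources() {
    # $1 = policy value (text inside the quotes), $2 = directive name
    printf '%s' "$1" | awk -v wanted="$2" '
        BEGIN { RS = ";" }                      # one record per directive
        {
            name = $1
            sources = $0
            sub(/^[ \t\n]*[^ \t\n]+[ \t\n]*/, "", sources)  # strip the name
            sub(/[ \t\n]+$/, "", sources)                   # strip trailing space
            if (name == wanted && !found) { found = 1; result = sources }
            else if (name == "default-src") { fallback = sources }
        }
        END {
            if (found) printf "%s: %s\n", wanted, result
            else printf "default-src (fallback): %s\n", fallback
        }'
}
```

To get the policy actually being served, something like `curl -skI https://<engine-fqdn>/ovirt-engine/ | grep -i '^content-security-policy'` should print the header, and the value can then be passed to `csp_effective_sources "<policy>" connect-src`. If the answer is a fallback to default-src with only 'self', the noVNC connection to port 6100 is not covered.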

Thank you, I spent several hours trying to figure this out before finding your comment. Mine is a little different from the answer above: I needed two entries after connect-src ('self' and the wss:// URL of the engine), and I also added a font-src.

    Header always set Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self' data:; connect-src 'self' wss://<engine-fqdn>:6100"

This was for OLVM Version 4.5.5-1.22.el8.
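Since this line is long and easy to mistype, it may help to generate it rather than edit it by hand. The sketch below prints the Header line from this message with the engine name filled in. The function name csp_header_line and the default port of 6100 (taken from the examples in this thread) are assumptions, and the output still has to be pasted into olvm45-security-fixes.conf manually.

```shell
#!/bin/sh
# Print the Apache Header directive from this thread for a given engine
# FQDN. csp_header_line is a hypothetical helper; the port defaults to
# 6100, the value used in the examples above.
csp_header_line() {
    # $1 = engine FQDN, $2 = websocket proxy port (optional)
    printf 'Header always set Content-Security-Policy "%s"\n' \
        "default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self' data:; connect-src 'self' wss://${1}:${2:-6100}"
}
```

For example, `csp_header_line engine.example.com` prints the full line for that (made-up) hostname. After replacing the line in the config file, running `apachectl configtest` before restarting httpd should catch quoting mistakes.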

Charles, this is a regression introduced by 4.5.5-1.22. We’re working to get a fix out asap.

Simon
On Oct 24, 2024, at 7:03 AM, charles.macdonald2013@gmail.com wrote:
Thank you, I spent several hours trying to figure this out before finding your comment. Mine is a little different from the answer above, I needed two entries after connect-src, and I also added a font-src.
Header always set Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self' data:; connect-src 'self' wss://<engine-fqdn>:6100"
This was for OLVM Version 4.5.5-1.22.el8.

participants (5)
- charles.macdonald2013@gmail.com
- Gianluca Cecchi
- malcolm.strydom@pacxa.com
- Marcos Sungaila
- Simon Coter