UEFI Guest can only be started on UEFI host (4.4)
by nroach44@nroach44.id.au
Hi All,
A problem I've just "dealt with" over the past few months is that the two UEFI VMs I have installed (one Windows 10, one RHEL 8) will only start on oVirt Node (4.4.x, still an issue on 4.4.8) hosts that have themselves been installed in UEFI mode.
In both cases the guest will "start" but get stuck on a small, roughly 640x480 black screen with no CPU or disk activity. It looks as if the VM had been started with "Start in paused mode" enabled, but the VM is not paused. This matches the first second or two of a normal startup, before TianoCore takes over.
Occasionally, I'm able to migrate the VM to a BIOS host. When it fails, the following is seen on the /sending/ host:
2021-09-21 20:09:42,915+0800 ERROR (migsrc/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') internal error: qemu unexpectedly closed the monitor: 2021-09-21T12:08:57.355188Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 1 should be encrypted
2021-09-21T12:08:57.393585Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 3 should be encrypted
2021-09-21T12:08:57.393805Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 4 should be encrypted
2021-09-21T12:08:57.393960Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 2 should be encrypted
2021-09-21T12:09:40.799119Z qemu-kvm: warning: TSC frequency mismatch between VM (3099980 kHz) and host (3392282 kHz), and TSC scaling unavailable
2021-09-21T12:09:40.799228Z qemu-kvm: error: failed to set MSR 0x204 to 0x1000000000
qemu-kvm: ../target/i386/kvm/kvm.c:2778: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. (migration:331)
2021-09-21 20:09:42,938+0800 INFO (migsrc/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Switching from State.STARTED to State.FAILED (migration:234)
2021-09-21 20:09:42,938+0800 ERROR (migsrc/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Failed to migrate (migration:503)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 477, in _regular_run
time.time(), machineParams
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 578, in _startUnderlyingMigration
self._perform_with_conv_schedule(duri, muri)
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 667, in _perform_with_conv_schedule
self._perform_migration(duri, muri)
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 596, in _perform_migration
self._migration_flags)
File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 159, in call
return getattr(self._vm._dom, name)(*a, **kw)
File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2126, in migrateToURI3
raise libvirtError('virDomainMigrateToURI3() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2021-09-21T12:08:57.355188Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 1 should be encrypted
2021-09-21T12:08:57.393585Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 3 should be encrypted
2021-09-21T12:08:57.393805Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 4 should be encrypted
2021-09-21T12:08:57.393960Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 2 should be encrypted
2021-09-21T12:09:40.799119Z qemu-kvm: warning: TSC frequency mismatch between VM (3099980 kHz) and host (3392282 kHz), and TSC scaling unavailable
2021-09-21T12:09:40.799228Z qemu-kvm: error: failed to set MSR 0x204 to 0x1000000000
qemu-kvm: ../target/i386/kvm/kvm.c:2778: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
The receiving host simply sees
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') underlying process disconnected (vm:1135)
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Release VM resources (vm:5325)
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Stopping connection (guestagent:438)
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [vdsm.api] START teardownImage(sdUUID='3f46f0f3-1cbb-4154-8af5-dcc3a09c6177', spUUID='924e5fbe-beba-11ea-b679-00163e03ad3e', imgUUID='d91282d3-2552-44d3-aa0f-84f7330be4ce', volUUID=None) from=internal, task_id=51eb32fc-1167-4c4c-bea8-4664c92d15e9 (api:48)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [storage.StorageDomain] Removing image rundir link '/run/vdsm/storage/3f46f0f3-1cbb-4154-8af5-dcc3a09c6177/d91282d3-2552-44d3-aa0f-84f7330be4ce' (fileSD:601)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [vdsm.api] FINISH teardownImage return=None from=internal, task_id=51eb32fc-1167-4c4c-bea8-4664c92d15e9 (api:54)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Stopping connection (guestagent:438)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [vdsm.api] START inappropriateDevices(thiefId='86df93bc-3304-4002-8939-cbefdea4cc60') from=internal, task_id=1e3aafc2-62c7-4fe5-a807-69942709e936 (api:48)
2021-09-21 20:09:42,842+0800 INFO (libvirt/events) [vdsm.api] FINISH inappropriateDevices return=None from=internal, task_id=1e3aafc2-62c7-4fe5-a807-69942709e936 (api:54)
2021-09-21 20:09:42,847+0800 WARN (vm/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Couldn't destroy incoming VM: Domain not found: no domain with matching uuid '86df93bc-3304-4002-8939-cbefdea4cc60' (vm:4073)
2021-09-21 20:09:42,847+0800 INFO (vm/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Changed state to Down: VM destroyed during the startup (code=10) (vm:1921)
2021-09-21 20:09:42,849+0800 INFO (vm/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Stopping connection (guestagent:438)
2021-09-21 20:09:42,856+0800 INFO (jsonrpc/3) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:10.1.2.30,59424, flow_id=47e0a91b, vmId=86df93bc-3304-4002-8939-cbefdea4cc60 (api:48)
2021-09-21 20:09:42,917+0800 INFO (jsonrpc/5) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:10.1.2.7,50798, vmId=86df93bc-3304-4002-8939-cbefdea4cc60 (api:48)
The data center is configured with BIOS as the default.
As an aside, *all* hosts have the following kernel cmdline set (to allow nested virt):
intel_iommu=on kvm-intel.nested=1 kvm.ignore_msrs=1
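For what it's worth, this is how I check on each host that those options actually took effect, and whether the CPU advertises TSC scaling (the flag names in the grep pattern are my best guess and vary by CPU vendor and kernel version):

```shell
# Confirm the cmdline and the live kvm module parameters on this host.
cat /proc/cmdline
cat /sys/module/kvm/parameters/ignore_msrs  2>/dev/null || echo "kvm module not loaded"
cat /sys/module/kvm_intel/parameters/nested 2>/dev/null || echo "kvm_intel not loaded"
# TSC scaling is what would let KVM migrate a guest between hosts whose TSC
# frequencies differ; the migration log above says it is unavailable.
grep -o -E 'tsc_scaling|tsc_scale' /proc/cpuinfo | sort -u
```

Comparing this output between the UEFI-installed and BIOS-installed hosts might show which difference actually matters here.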
Any suggestions?
3 years, 6 months
Power Saving schedule, hosts not shutting down
by Maton, Brett
Hi,
I'm having trouble with the power_saving scheduling policy not shutting down idle hosts.
The policy is more or less default. I added 'HostsInReserve 0' to see if that would help, and 24 hours later I bumped CpuOverCommitDurationMinutes to 15; that didn't make a difference either (not unexpected, as the CPU is only being tickled by two small VMs at the moment).
3x Dell Hosts with iDRAC 8 management cards, power management configured
and functional.
oVirt 4.4.8.6-1.el8
Thanks in advance for any help
Brett
Assigning public IPs to a VM
by admin@foundryserver.com
My provider will give me 20 public IPs and attach them to each of my hosts, so 60 in total. I have two NICs on each host, one public and one private. Is assigning a public IP to a VM as simple as selecting the public network and typing in one of the public IPs? I'm still trying to get my hosted engine running, so I can't try this yet; I just thought I would ask in advance.
Again Thank you for any help.
Brad
Understanding Cluster Networking between Hosts
by admin@foundryserver.com
Hello everyone. I am looking to set up an oVirt cluster and I am struggling with the network side of things. I have watched some videos about oVirt and KVM/libvirt, and I understand bridged and NAT networks. The part I am struggling with is the host network. Here is my setup.
I have two bare metal servers that have one physical NIC with a public IP. There is no private physical network on these boxes, virtual or physical.
I want to host an application on a single VM for each customer; the customer accesses the application via username.domain.com.
So the question is: I need a layer 7 load balancer to route the URL to an IP. This load balancer is HAProxy on a separate bare-metal box (it could be in a VM on one of the hosts), so the load balancer has to be on the same network as all the VMs. I just don't know how to set up the networking between the LB on one box and the VMs on potentially many bare-metal hosts.
It really feels like I am missing something obvious. All the videos I have watched assume the host networking is already in place. Any help or resources I can read or watch would really be helpful.
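In case it helps frame the question, here is a hypothetical HAProxy sketch of what I'm imagining, assuming the LB and all customer VMs sit on one shared oVirt logical network (all names, paths and addresses below are made up):

```haproxy
frontend https_in
    bind *:443 ssl crt /etc/haproxy/certs/
    # customers.map: one "username.domain.com bk_username" line per customer
    use_backend %[req.hdr(host),lower,map(/etc/haproxy/customers.map,bk_default)]

backend bk_customer1
    # the VM's address on the shared oVirt logical network
    server vm1 192.0.2.11:443 check
```

So really my question is only about the lower layer: how the VMs on different hosts end up on that one shared network in the first place.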
Brad
not able to upload disks, iso - paused by the system error -- Version 4.4.6.7-1.el8
by dhanaraj.ramesh@yahoo.com
Hi Team
In one of our cluster infrastructures we are unable to upload images or disks via the GUI. Upon checking /var/log/ovirt-imageio/daemon.log I found it throwing an SSL connection failure; please help us check what we are missing.
We are using a third-party CA-signed SSL certificate for the web GUI.
2021-10-11 22:45:42,812 INFO (Thread-6) [http] OPEN connection=6 client=127.0.0.1
2021-10-11 22:45:42,812 INFO (Thread-6) [tickets] [127.0.0.1] REMOVE ticket=f18cff91-1fc4-43b6-91ea-ca2a11d409a6
2021-10-11 22:45:42,813 INFO (Thread-6) [http] CLOSE connection=6 client=127.0.0.1 [connection 1 ops, 0.000539 s] [dispatch 1 ops, 0.000216 s]
2021-10-11 22:45:43,621 INFO (Thread-4) [images] [::ffff:10.12.23.212] OPTIONS ticket=53ff98f9-f429-4880-abe6-06c6c01473de
2021-10-11 22:45:43,621 INFO (Thread-4) [backends.http] Open backend netloc='renlovkvma01.test.lab:54322' path='/images/53ff98f9-f429-4880-abe6-06c6c01473de' cafile='/etc/pki/ovirt-engine/ca.pem' secure=True
2021-10-11 22:45:43,626 ERROR (Thread-4) [http] Server error
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/__init__.py", line 66, in get
return ticket.get_context(req.connection_id)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/auth.py", line 146, in get_context
return self._connections[con_id]
KeyError: 4
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", line 774, in __call__
self.dispatch(req, resp)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", line 819, in dispatch
return method(req, resp, *match.groups())
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/cors.py", line 84, in wrapper
return func(self, req, resp, *args)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/images.py", line 246, in options
ctx = backends.get(req, ticket, self.config)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/__init__.py", line 85, in get
cafile=ca_file)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 48, in open
return Backend(url, **options)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 76, in __init__
self._connect()
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 117, in _connect
self._con = self._create_tcp_connection()
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 379, in _create_tcp_connection
con.connect()
File "/usr/lib64/python3.6/http/client.py", line 1437, in connect
server_hostname=server_hostname)
File "/usr/lib64/python3.6/ssl.py", line 365, in wrap_socket
_context=self, _session=session)
File "/usr/lib64/python3.6/ssl.py", line 776, in __init__
self.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
self._sslobj.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)
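As a first check (a sketch, assuming the default engine CA location and the host name from the log above), this is how the certificate the imageio daemon presents on its data port can be inspected:

```shell
# Fetch the certificate the host-side imageio daemon presents on port 54322
# and try to verify it against the engine CA, like the proxy does.
HOST=renlovkvma01.test.lab
cert=$(openssl s_client -connect "${HOST}:54322" \
           -CAfile /etc/pki/ovirt-engine/ca.pem </dev/null 2>/dev/null \
       | openssl x509 -noout -subject -issuer 2>/dev/null) || true
echo "${cert:-could not retrieve certificate from ${HOST}:54322}"
```

If the issuer shown is not the engine's internal CA, that would fit the traceback: as I understand it, only the Apache/web-GUI certificate should be swapped for a third-party one, while imageio and vdsm are expected to keep certificates signed by the engine CA (stating that as an assumption to verify, not a certainty).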
Ovirt VM started with old snapshots but with current disk missing
by samuel.xhu@horebdata.cn
Hi, all:
I have an oVirt 4.3 environment of 3 nodes. While I was using the following command (qemuCmd = 'qemu-img convert -p -c -O qcow2 '+backy2Source+' '+RESTOREPATH+disk.id) to merge qemu disks for a backup with backy2, unfortunately a power cut happened. After the power came back, the current disk had disappeared in the web UI, and the recent data was lost when I restarted the virtual machine. I can see the disk (44dd7a6f...) still exists, but oVirt cannot recognize it.
How can I restore the disk identified by 44dd7a6f... so that oVirt recognizes it again?
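A sketch of how the orphaned volume could be inspected first (the path below is a placeholder; I have not filled in the real UUID):

```shell
# Inspect the orphaned volume and its snapshot chain before re-attaching it.
# VOLUME is hypothetical -- substitute the full path of the 44dd7a6f... image
# under the storage domain's images/<img-uuid>/ directory.
VOLUME=/path/to/44dd7a6f-volume
qemu-img info --backing-chain "$VOLUME" 2>/dev/null \
    || echo "volume not readable at $VOLUME"
# Each oVirt volume has a matching .meta file next to it; its PUUID field
# records the parent in the snapshot chain, which can be compared against
# the qemu-img output to work out which layer should be the active disk.
```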
Anyone can help? Thanks!
HA VM and vm leases usage with site failure
by Gianluca Cecchi
Hello,
Suppose a latest 4.4.7 environment installed with an external engine and
two hosts, one in each of two sites.
For storage I have one FC storage domain.
I try to simulate a sort of "site failure scenario" to see what kind of HA
I should expect.
The 2 hosts have power mgmt configured through fence_ipmilan.
I have 2 VMs, one configured as HA with lease on storage (Resume Behavior:
kill) and one not marked as HA.
Initially host1 is SPM and it is the host that runs the two VMs.
Fencing of host1 from host2 initially works ok. I can test also from
command line:
# fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L
operator -S /usr/local/bin/pwd.sh -o status
Status: ON
On host2 I then prevent reaching host1 iDRAC:
firewall-cmd --direct --add-rule ipv4 filter OUTPUT 0 -d 10.10.193.152 -p
udp --dport 623 -j DROP
firewall-cmd --direct --add-rule ipv4 filter OUTPUT 1 -j ACCEPT
so that:
# fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L
operator -S /usr/local/bin/pwd.sh -o status
2021-08-05 15:06:07,254 ERROR: Failed: Unable to obtain correct plug status
or plug is not available
On host1 I generate panic:
# date ; echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger
Thu Aug 5 15:06:24 CEST 2021
host1 correctly completes its crash dump (kdump integration is enabled) and
reboots, but I stop it at the GRUB prompt so that host1 is unreachable from
host2's point of view and its power state cannot be determined either.
At this point I thought that the VM lease functionality would come into
play and host2 would be able to restart the HA VM, since it can see that
the lease is not held by the other host and so it can acquire the lock
itself...
Instead, it goes into the power-fencing attempt loop.
I waited about 25 minutes without any effect, only continuous attempts.
After 2 minutes host2 correctly becomes SPM and the VMs are marked as unknown.
At a certain point after the failures in power fencing host1, I see the
event:
Failed to power fence host host1. Please check the host status and it's
power management settings, and then manually reboot it and click "Confirm
Host Has Been Rebooted"
If I select the host and choose "Confirm Host Has Been Rebooted", the two
VMs are marked as down and the HA one is correctly booted by host2.
But this requires my manual intervention.
Is the behavior above the expected one, or should the use of VM leases have
allowed host2 to bypass the fencing inability and start the HA VM with the
lease? Otherwise I don't understand the reason to have the lease at all...
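A sketch of what I could check next on host2 (assuming sanlock is what actually holds oVirt VM leases on the storage domain's xleases volume, which I believe it is):

```shell
# List the lease resources sanlock currently knows about on this host.
out=$(sanlock client status 2>/dev/null || echo "sanlock not running (or not installed) here")
echo "$out"
# A lease stops being renewed when its holder dies and expires on its own
# after the sanlock timeout; the open question above is whether the engine
# waits for that expiry or insists on successful fencing first.
```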
Thanks,
Gianluca
Engine insists on running on 1 host in cluster when that host is online
by David White
About a month ago, I completely rebuilt my oVirt cluster, as I needed to move all my hardware from 1 data center to another with minimal downtime.
All my hardware is in the new data center (yay for HIPAA compliance and 24/7 access, unlike the old place!)
I originally built the cluster as a single-node hyperconverged. I then added nodes to it, and then, finally, I reconfigured gluster to run in a replica 2 / arbiter 1 configuration.
As of now:
- I have 4 compute hosts
- Gluster is running fine and replicating fine between the two full replicas, along with the arbiter node
- I need to revisit my recent email about gluster replication speed, though
However, I'm worried about two things:
- I think that my hosts, and hosted-engine, are all still configured to use the single mount point from the original single-node hyperconverged ... i.e. if I shutdown / reboot that host, then the storage "goes away"
- How do I reconfigure oVirt to use the 2nd replica as a secondary mount point?
- Currently, the engine is deployed on two of the servers (a and c). But any time server c is online, the hosted engine insists on running on c, and I cannot migrate the engine off of it.
And if the engine is running on host a when I turn host c back on, the engine shuts down on a and comes back up on c.
- How do I "fix" this so that the engine will run on host a, even when host c is turned on?
- How do I deploy the hosted-engine to host b?
- Is it as simple as logging into host b and running "hosted-engine --deploy"?
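A sketch of the checks I'd start with (the command is standard hosted-engine tooling; the UI path is how I understand additional HE hosts are added in 4.4, so treat it as an assumption to confirm):

```shell
# The ha-agent places the engine on the host with the highest score, so
# comparing scores between hosts a and c is the first step (run on a or c).
status=$(hosted-engine --vm-status 2>/dev/null || echo "unavailable: run on a hosted-engine host")
echo "$status"
# To make host b a hosted-engine host, the supported route is the Admin
# Portal: put host b in maintenance, choose Installation -> Reinstall, and
# set "Hosted Engine" to DEPLOY -- not re-running "hosted-engine --deploy".
```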
Disconnected from Console. Cannot connect to websocket proxy server.
by vegard.saeten@noroff.no
Hi.
After renewing our certificate, our users cannot connect to their virtual machines using the VNC Console (browser) option.
They receive this error message:
Disconnected from Console. Cannot connect to websocket proxy server. Please check your websocket proxy certificate or ask your administrator for help. For further information please refer to the console manual.
Press the 'Connect' button to reconnect the console.
This is what is showing in the logs on the engine server:
sudo service ovirt-websocket-proxy status -l
Redirecting to /bin/systemctl status -l ovirt-websocket-proxy.service
● ovirt-websocket-proxy.service - oVirt Engine websockets proxy
Loaded: loaded (/usr/lib/systemd/system/ovirt-websocket-proxy.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2021-10-12 15:32:04 CEST; 1 day 3h ago
Main PID: 49597 (ovirt-websocket)
CGroup: /system.slice/ovirt-websocket-proxy.service
└─49597 /usr/bin/python /usr/share/ovirt-engine/services/ovirt-websocket-proxy/ovirt-websocket-proxy.py --systemd=notify start
Oct 13 18:27:51 enginedomain ovirt-websocket[52044]: 2021-10-13 18:27:51,338+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:27:51 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[52044] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:54 enginedomain ovirt-websocket[53293]: 2021-10-13 18:31:54,516+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:54 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53293] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:59 enginedomain ovirt-websocket[53300]: 2021-10-13 18:31:59,270+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:59 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53300] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:32:00 enginedomain ovirt-websocket[53301]: 2021-10-13 18:32:00,028+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:32:00 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53301] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:34:45 enginedomain ovirt-websocket[53341]: 2021-10-13 18:34:45,099+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:34:45 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53341] INFO msg:887 handler exception: [Errno 13] Permission denied
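A sketch of a first check (the paths below are what I believe are the engine's default websocket-proxy certificate locations; adjust if your renewed certificate lives elsewhere):

```shell
# "Permission denied" from the proxy usually means the service user can no
# longer read the key or certificate after the renewal; check ownership,
# mode, and SELinux context of the files it uses.
perms=$(ls -lZ /etc/pki/ovirt-engine/certs/websocket-proxy.cer \
               /etc/pki/ovirt-engine/keys/websocket-proxy.key.nopass \
               2>/dev/null || echo "default websocket-proxy cert/key not found")
echo "$perms"
```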
Any suggestions on how we can fix this issue?