UEFI Guest can only be started on UEFI host (4.4)
by nroach44@nroach44.id.au
Hi All,
A problem I've just "dealt with" over the past few months is that the two UEFI VMs I have installed (one Windows 10, one RHEL 8) will only start on oVirt Node (4.4.x, still an issue on 4.4.8) hosts that have themselves been installed in UEFI mode.
In both cases the guest will "start" but get stuck on a small, roughly 640x480 black screen with no CPU or disk activity. It looks as if the VM had been started with "Start in paused mode" enabled, but the VM is not paused. This matches the first second or two of a normal startup, before TianoCore takes over.
Occasionally, I'm able to migrate the VM to a BIOS host. When it fails, the following is seen on the /sending/ host:
2021-09-21 20:09:42,915+0800 ERROR (migsrc/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') internal error: qemu unexpectedly closed the monitor: 2021-09-21T12:08:57.355188Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 1 should be encrypted
2021-09-21T12:08:57.393585Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 3 should be encrypted
2021-09-21T12:08:57.393805Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 4 should be encrypted
2021-09-21T12:08:57.393960Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 2 should be encrypted
2021-09-21T12:09:40.799119Z qemu-kvm: warning: TSC frequency mismatch between VM (3099980 kHz) and host (3392282 kHz), and TSC scaling unavailable
2021-09-21T12:09:40.799228Z qemu-kvm: error: failed to set MSR 0x204 to 0x1000000000
qemu-kvm: ../target/i386/kvm/kvm.c:2778: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed. (migration:331)
2021-09-21 20:09:42,938+0800 INFO (migsrc/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Switching from State.STARTED to State.FAILED (migration:234)
2021-09-21 20:09:42,938+0800 ERROR (migsrc/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Failed to migrate (migration:503)
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 477, in _regular_run
time.time(), machineParams
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 578, in _startUnderlyingMigration
self._perform_with_conv_schedule(duri, muri)
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 667, in _perform_with_conv_schedule
self._perform_migration(duri, muri)
File "/usr/lib/python3.6/site-packages/vdsm/virt/migration.py", line 596, in _perform_migration
self._migration_flags)
File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 159, in call
return getattr(self._vm._dom, name)(*a, **kw)
File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2126, in migrateToURI3
raise libvirtError('virDomainMigrateToURI3() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2021-09-21T12:08:57.355188Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 1 should be encrypted
2021-09-21T12:08:57.393585Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 3 should be encrypted
2021-09-21T12:08:57.393805Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 4 should be encrypted
2021-09-21T12:08:57.393960Z qemu-kvm: warning: Spice: reds.c:2305:reds_handle_read_link_done: spice channels 2 should be encrypted
2021-09-21T12:09:40.799119Z qemu-kvm: warning: TSC frequency mismatch between VM (3099980 kHz) and host (3392282 kHz), and TSC scaling unavailable
2021-09-21T12:09:40.799228Z qemu-kvm: error: failed to set MSR 0x204 to 0x1000000000
qemu-kvm: ../target/i386/kvm/kvm.c:2778: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
The receiving host simply sees
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') underlying process disconnected (vm:1135)
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Release VM resources (vm:5325)
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Stopping connection (guestagent:438)
2021-09-21 20:09:42,840+0800 INFO (libvirt/events) [vdsm.api] START teardownImage(sdUUID='3f46f0f3-1cbb-4154-8af5-dcc3a09c6177', spUUID='924e5fbe-beba-11ea-b679-00163e03ad3e', imgUUID='d91282d3-2552-44d3-aa0f-84f7330be4ce', volUUID=None) from=internal, task_id=51eb32fc-1167-4c4c-bea8-4664c92d15e9 (api:48)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [storage.StorageDomain] Removing image rundir link '/run/vdsm/storage/3f46f0f3-1cbb-4154-8af5-dcc3a09c6177/d91282d3-2552-44d3-aa0f-84f7330be4ce' (fileSD:601)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [vdsm.api] FINISH teardownImage return=None from=internal, task_id=51eb32fc-1167-4c4c-bea8-4664c92d15e9 (api:54)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Stopping connection (guestagent:438)
2021-09-21 20:09:42,841+0800 INFO (libvirt/events) [vdsm.api] START inappropriateDevices(thiefId='86df93bc-3304-4002-8939-cbefdea4cc60') from=internal, task_id=1e3aafc2-62c7-4fe5-a807-69942709e936 (api:48)
2021-09-21 20:09:42,842+0800 INFO (libvirt/events) [vdsm.api] FINISH inappropriateDevices return=None from=internal, task_id=1e3aafc2-62c7-4fe5-a807-69942709e936 (api:54)
2021-09-21 20:09:42,847+0800 WARN (vm/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Couldn't destroy incoming VM: Domain not found: no domain with matching uuid '86df93bc-3304-4002-8939-cbefdea4cc60' (vm:4073)
2021-09-21 20:09:42,847+0800 INFO (vm/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Changed state to Down: VM destroyed during the startup (code=10) (vm:1921)
2021-09-21 20:09:42,849+0800 INFO (vm/86df93bc) [virt.vm] (vmId='86df93bc-3304-4002-8939-cbefdea4cc60') Stopping connection (guestagent:438)
2021-09-21 20:09:42,856+0800 INFO (jsonrpc/3) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:10.1.2.30,59424, flow_id=47e0a91b, vmId=86df93bc-3304-4002-8939-cbefdea4cc60 (api:48)
2021-09-21 20:09:42,917+0800 INFO (jsonrpc/5) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:10.1.2.7,50798, vmId=86df93bc-3304-4002-8939-cbefdea4cc60 (api:48)
The data center is configured with BIOS as the default.
As an aside, *all* hosts have the following kernel cmdline set (to allow nested virt):
intel_iommu=on kvm-intel.nested=1 kvm.ignore_msrs=1
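For what it's worth, this is how I check on each host that those options actually took effect, and whether the CPU advertises TSC scaling (the flag names in the grep pattern are my best guess and vary by CPU vendor and kernel version):

```shell
# Confirm the cmdline and the live kvm module parameters on this host.
cat /proc/cmdline
cat /sys/module/kvm/parameters/ignore_msrs  2>/dev/null || echo "kvm module not loaded"
cat /sys/module/kvm_intel/parameters/nested 2>/dev/null || echo "kvm_intel not loaded"
# TSC scaling is what would let KVM migrate a guest between hosts whose TSC
# frequencies differ; the migration log above says it is unavailable.
grep -o -E 'tsc_scaling|tsc_scale' /proc/cpuinfo | sort -u
```

Comparing this output between the UEFI-installed and BIOS-installed hosts might show which difference actually matters here.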
Any suggestions?
3 years, 6 months
Power Saving schedule, hosts not shutting down
by Maton, Brett
Hi,
I'm having trouble with the power_saving scheduling policy not shutting down idle hosts.
The policy is more or less default. I added 'HostsInReserve 0' to see if that would help, and 24 hours later I bumped CpuOverCommitDurationMinutes to 15; that didn't make a difference either (not unexpected, as the CPU is only being tickled by two small VMs at the moment).
3x Dell Hosts with iDRAC 8 management cards, power management configured
and functional.
oVirt 4.4.8.6-1.el8
Thanks in advance for any help
Brett
Assigning public IPs to a VM
by admin@foundryserver.com
My provider will give me 20 public IPs and attach them to each of my hosts, so 60 in total. I have two NICs on each host, one public and one private. Is assigning a public IP to a VM as simple as selecting the public network and typing in one of the public IPs? I'm still trying to get my hosted engine running, so I can't try this yet; I just thought I would ask in advance.
Again Thank you for any help.
Brad
Understanding Cluster Networking between Hosts
by admin@foundryserver.com
Hello everyone. I am looking to set up an oVirt cluster and I am struggling with the network side of things. I have watched some videos about oVirt and KVM/libvirt, and I understand bridged and NAT networks. The part I am struggling with is the host network. Here is my setup.
I have two bare metal servers that have one physical NIC with a public IP. There is no private physical network on these boxes, virtual or physical.
I want to host an application on a single VM for each customer; the customer accesses the application via username.domain.com.
So the question is: I need a layer 7 load balancer to route the URL to an IP. This load balancer is HAProxy on a separate bare-metal box (it could be in a VM on one of the hosts), so the load balancer has to be on the same network as all the VMs. I just don't know how to set up the networking between the LB on one box and the VMs on potentially many bare-metal hosts.
It really feels like I am missing something obvious. All the videos I have watched assume the host networking is already in place. Any help or resources I can read or watch would really be helpful.
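In case it helps frame the question, here is a hypothetical HAProxy sketch of what I'm imagining, assuming the LB and all customer VMs sit on one shared oVirt logical network (all names, paths and addresses below are made up):

```haproxy
frontend https_in
    bind *:443 ssl crt /etc/haproxy/certs/
    # customers.map: one "username.domain.com bk_username" line per customer
    use_backend %[req.hdr(host),lower,map(/etc/haproxy/customers.map,bk_default)]

backend bk_customer1
    # the VM's address on the shared oVirt logical network
    server vm1 192.0.2.11:443 check
```

So really my question is only about the lower layer: how the VMs on different hosts end up on that one shared network in the first place.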
Brad
not able to upload disks, iso - paused by the system error -- Version 4.4.6.7-1.el8
by dhanaraj.ramesh@yahoo.com
Hi Team
In one of our cluster infrastructures we are unable to upload images or disks via the GUI. Upon checking /var/log/ovirt-imageio/daemon.log I found it throwing an SSL connection failure; please help us check what we are missing.
We are using a third-party CA-signed SSL certificate for the web GUI.
2021-10-11 22:45:42,812 INFO (Thread-6) [http] OPEN connection=6 client=127.0.0.1
2021-10-11 22:45:42,812 INFO (Thread-6) [tickets] [127.0.0.1] REMOVE ticket=f18cff91-1fc4-43b6-91ea-ca2a11d409a6
2021-10-11 22:45:42,813 INFO (Thread-6) [http] CLOSE connection=6 client=127.0.0.1 [connection 1 ops, 0.000539 s] [dispatch 1 ops, 0.000216 s]
2021-10-11 22:45:43,621 INFO (Thread-4) [images] [::ffff:10.12.23.212] OPTIONS ticket=53ff98f9-f429-4880-abe6-06c6c01473de
2021-10-11 22:45:43,621 INFO (Thread-4) [backends.http] Open backend netloc='renlovkvma01.test.lab:54322' path='/images/53ff98f9-f429-4880-abe6-06c6c01473de' cafile='/etc/pki/ovirt-engine/ca.pem' secure=True
2021-10-11 22:45:43,626 ERROR (Thread-4) [http] Server error
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/__init__.py", line 66, in get
return ticket.get_context(req.connection_id)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/auth.py", line 146, in get_context
return self._connections[con_id]
KeyError: 4
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", line 774, in __call__
self.dispatch(req, resp)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/http.py", line 819, in dispatch
return method(req, resp, *match.groups())
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/cors.py", line 84, in wrapper
return func(self, req, resp, *args)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/images.py", line 246, in options
ctx = backends.get(req, ticket, self.config)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/__init__.py", line 85, in get
cafile=ca_file)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 48, in open
return Backend(url, **options)
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 76, in __init__
self._connect()
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 117, in _connect
self._con = self._create_tcp_connection()
File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 379, in _create_tcp_connection
con.connect()
File "/usr/lib64/python3.6/http/client.py", line 1437, in connect
server_hostname=server_hostname)
File "/usr/lib64/python3.6/ssl.py", line 365, in wrap_socket
_context=self, _session=session)
File "/usr/lib64/python3.6/ssl.py", line 776, in __init__
self.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 1036, in do_handshake
self._sslobj.do_handshake()
File "/usr/lib64/python3.6/ssl.py", line 648, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)
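As a first check (a sketch, assuming the default engine CA location and the host name from the log above), this is how the certificate the imageio daemon presents on its data port can be inspected:

```shell
# Fetch the certificate the host-side imageio daemon presents on port 54322
# and try to verify it against the engine CA, like the proxy does.
HOST=renlovkvma01.test.lab
cert=$(openssl s_client -connect "${HOST}:54322" \
           -CAfile /etc/pki/ovirt-engine/ca.pem </dev/null 2>/dev/null \
       | openssl x509 -noout -subject -issuer 2>/dev/null) || true
echo "${cert:-could not retrieve certificate from ${HOST}:54322}"
```

If the issuer shown is not the engine's internal CA, that would fit the traceback: as I understand it, only the Apache/web-GUI certificate should be swapped for a third-party one, while imageio and vdsm are expected to keep certificates signed by the engine CA (stating that as an assumption to verify, not a certainty).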
Ovirt VM started with old snapshots but with current disk missing
by samuel.xhu@horebdata.cn
Hi, all:
I have an oVirt 4.3 environment of 3 nodes. While I was using the following command (qemuCmd = 'qemu-img convert -p -c -O qcow2 '+backy2Source+' '+RESTOREPATH+disk.id) to merge qemu disks for a backup with backy2, unfortunately a power cut happened. After the power came back, the current disk had disappeared in the web UI, and the recent data was lost when I restarted the virtual machine. I can see the disk (44dd7a6f...) still exists, but oVirt cannot recognize it.
How can I restore the disk identified by 44dd7a6f... so that oVirt recognizes it again?
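A sketch of how the orphaned volume could be inspected first (the path below is a placeholder; I have not filled in the real UUID):

```shell
# Inspect the orphaned volume and its snapshot chain before re-attaching it.
# VOLUME is hypothetical -- substitute the full path of the 44dd7a6f... image
# under the storage domain's images/<img-uuid>/ directory.
VOLUME=/path/to/44dd7a6f-volume
qemu-img info --backing-chain "$VOLUME" 2>/dev/null \
    || echo "volume not readable at $VOLUME"
# Each oVirt volume has a matching .meta file next to it; its PUUID field
# records the parent in the snapshot chain, which can be compared against
# the qemu-img output to work out which layer should be the active disk.
```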
Anyone can help? Thanks!
HA VM and vm leases usage with site failure
by Gianluca Cecchi
Hello,
Suppose a latest 4.4.7 environment installed with an external engine and
two hosts, one in each of two sites.
For storage I have one FC storage domain.
I try to simulate a sort of "site failure scenario" to see what kind of HA
I should expect.
The 2 hosts have power mgmt configured through fence_ipmilan.
I have 2 VMs, one configured as HA with lease on storage (Resume Behavior:
kill) and one not marked as HA.
Initially host1 is SPM and it is the host that runs the two VMs.
Fencing of host1 from host2 initially works ok. I can test also from
command line:
# fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L
operator -S /usr/local/bin/pwd.sh -o status
Status: ON
On host2 I then prevent reaching host1 iDRAC:
firewall-cmd --direct --add-rule ipv4 filter OUTPUT 0 -d 10.10.193.152 -p
udp --dport 623 -j DROP
firewall-cmd --direct --add-rule ipv4 filter OUTPUT 1 -j ACCEPT
so that:
# fence_ipmilan -a 10.10.193.152 -P -l my_fence_user -A password -L
operator -S /usr/local/bin/pwd.sh -o status
2021-08-05 15:06:07,254 ERROR: Failed: Unable to obtain correct plug status
or plug is not available
On host1 I generate panic:
# date ; echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger
Thu Aug 5 15:06:24 CEST 2021
host1 correctly completes its crash dump (kdump integration is enabled) and
reboots, but I stop it at the GRUB prompt so that host1 is unreachable from
host2's point of view and its power state cannot be determined either.
At this point I thought that the VM lease functionality would come into
play and host2 would be able to restart the HA VM, since it can see that
the lease is not held by the other host and so it can acquire the lock
itself...
Instead, it goes into the power-fencing attempt loop.
I waited about 25 minutes without any effect, only continuous attempts.
After 2 minutes host2 correctly becomes SPM and the VMs are marked as unknown.
At a certain point after the failures in power fencing host1, I see the
event:
Failed to power fence host host1. Please check the host status and it's
power management settings, and then manually reboot it and click "Confirm
Host Has Been Rebooted"
If I select the host and choose "Confirm Host Has Been Rebooted", the two
VMs are marked as down and the HA one is correctly booted by host2.
But this requires my manual intervention.
Is the behavior above the expected one, or should the use of VM leases have
allowed host2 to bypass the fencing inability and start the HA VM with the
lease? Otherwise I don't understand the reason to have the lease at all...
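A sketch of what I could check next on host2 (assuming sanlock is what actually holds oVirt VM leases on the storage domain's xleases volume, which I believe it is):

```shell
# List the lease resources sanlock currently knows about on this host.
out=$(sanlock client status 2>/dev/null || echo "sanlock not running (or not installed) here")
echo "$out"
# A lease stops being renewed when its holder dies and expires on its own
# after the sanlock timeout; the open question above is whether the engine
# waits for that expiry or insists on successful fencing first.
```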
Thanks,
Gianluca
Engine insists on running on 1 host in cluster when that host is online
by David White
About a month ago, I completely rebuilt my oVirt cluster, as I needed to move all my hardware from 1 data center to another with minimal downtime.
All my hardware is in the new data center (yay for HIPAA compliance and 24/7 access, unlike the old place!)
I originally built the cluster as a single-node hyperconverged. I then added nodes to it, and then, finally, I reconfigured gluster to run in a replica 2 / arbiter 1 configuration.
As of now:
- I have 4 compute hosts
- Gluster is running fine and replicating fine between the two full replicas, along with the arbiter node
- I need to revisit my recent email about gluster replication speed, though
However, I'm worried about two things:
- I think that my hosts, and hosted-engine, are all still configured to use the single mount point from the original single-node hyperconverged ... i.e. if I shutdown / reboot that host, then the storage "goes away"
- How do I reconfigure oVirt to use the 2nd replica as a secondary mount point?
- Currently, the engine is deployed on two of the servers (a and c). But any time server c is online, the hosted engine insists on running on c, and I cannot migrate the engine off of it.
And if the engine is running on host a when I turn host c back on, the engine shuts down on a and comes back up on c.
- How do I "fix" this so that the engine will run on host a, even when host c is turned on?
- How do I deploy the hosted-engine to host b?
- Is it as simple as logging into host b and running "hosted-engine --deploy"?
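A sketch of the checks I'd start with (the command is standard hosted-engine tooling; the UI path is how I understand additional HE hosts are added in 4.4, so treat it as an assumption to confirm):

```shell
# The ha-agent places the engine on the host with the highest score, so
# comparing scores between hosts a and c is the first step (run on a or c).
status=$(hosted-engine --vm-status 2>/dev/null || echo "unavailable: run on a hosted-engine host")
echo "$status"
# To make host b a hosted-engine host, the supported route is the Admin
# Portal: put host b in maintenance, choose Installation -> Reinstall, and
# set "Hosted Engine" to DEPLOY -- not re-running "hosted-engine --deploy".
```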
Disconnected from Console. Cannot connect to websocket proxy server.
by vegard.saeten@noroff.no
Hi.
After renewing our certificate, our users cannot connect to their virtual machines using the VNC Console (browser) option.
They receive this error message:
Disconnected from Console. Cannot connect to websocket proxy server. Please check your websocket proxy certificate or ask your administrator for help. For further information please refer to the console manual.
Press the 'Connect' button to reconnect the console.
This is what is showing in the logs on the engine server:
sudo service ovirt-websocket-proxy status -l
Redirecting to /bin/systemctl status -l ovirt-websocket-proxy.service
● ovirt-websocket-proxy.service - oVirt Engine websockets proxy
Loaded: loaded (/usr/lib/systemd/system/ovirt-websocket-proxy.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2021-10-12 15:32:04 CEST; 1 day 3h ago
Main PID: 49597 (ovirt-websocket)
CGroup: /system.slice/ovirt-websocket-proxy.service
└─49597 /usr/bin/python /usr/share/ovirt-engine/services/ovirt-websocket-proxy/ovirt-websocket-proxy.py --systemd=notify start
Oct 13 18:27:51 enginedomain ovirt-websocket[52044]: 2021-10-13 18:27:51,338+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:27:51 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[52044] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:54 enginedomain ovirt-websocket[53293]: 2021-10-13 18:31:54,516+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:54 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53293] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:59 enginedomain ovirt-websocket[53300]: 2021-10-13 18:31:59,270+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:31:59 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53300] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:32:00 enginedomain ovirt-websocket[53301]: 2021-10-13 18:32:00,028+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:32:00 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53301] INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:34:45 enginedomain ovirt-websocket[53341]: 2021-10-13 18:34:45,099+0200 ovirt-websocket-proxy: INFO msg:887 handler exception: [Errno 13] Permission denied
Oct 13 18:34:45 enginedomain ovirt-websocket-proxy.py[49597]: ovirt-websocket-proxy[53341] INFO msg:887 handler exception: [Errno 13] Permission denied
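A sketch of a first check (the paths below are what I believe are the engine's default websocket-proxy certificate locations; adjust if your renewed certificate lives elsewhere):

```shell
# "Permission denied" from the proxy usually means the service user can no
# longer read the key or certificate after the renewal; check ownership,
# mode, and SELinux context of the files it uses.
perms=$(ls -lZ /etc/pki/ovirt-engine/certs/websocket-proxy.cer \
               /etc/pki/ovirt-engine/keys/websocket-proxy.key.nopass \
               2>/dev/null || echo "default websocket-proxy cert/key not found")
echo "$perms"
```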
Any suggestions on how we can fix this issue?