oVirt 4.5 linux guest vm with host device added to it fails to start
by Don Dupuis
Hello
I have a RHEL 8.6 based hypervisor with a Mellanox ConnectX-5 IB card
installed with SR-IOV enabled. The host device I am assigning is
pci_0000_af_00_2. The card is working, as I can talk to other InfiniBand
interfaces on other servers. Below is the output of lspci:
3b:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
3b:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
af:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
af:00.1 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
af:00.2 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
af:00.3 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
af:00.4 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
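The host device name pci_0000_af_00_2 follows the libvirt node-device naming scheme, which encodes the PCI domain, bus, slot, and function separated by underscores. As a minimal sketch (the function name pci_nodedev_name is hypothetical, and the domain is assumed to be 0000 because plain lspci output omits it), the mapping from an lspci address to that name can be written as:

```python
import re

# Matches "af:00.2" or "0000:af:00.2" (domain is optional in plain lspci output).
_PCI_ADDR = re.compile(
    r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def pci_nodedev_name(address: str, default_domain: str = "0000") -> str:
    """Convert an lspci-style PCI address to a libvirt-style node device name.

    The default domain of "0000" is an assumption; use `lspci -D` on the
    host to see the real domain.
    """
    match = _PCI_ADDR.fullmatch(address.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {address!r}")
    domain, bus, slot, function = match.groups()
    return f"pci_{domain or default_domain}_{bus}_{slot}_{function}".lower()


if __name__ == "__main__":
    # The virtual function assigned to the VM in this report.
    print(pci_nodedev_name("af:00.2"))
```

Checking that the name produced for the intended virtual function matches the one attached to the VM is a quick way to rule out a stale or mistyped device assignment.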
The Linux VM is configured with the Q35 chipset and UEFI, 16 CPUs, NUMA
enabled, and CPU pinning enabled. The guest OS is RHEL 7.9. As soon as I
start the VM, I get an immediate error message stating "Cannot run VM. There
is no host that satisfies current scheduling constraints. See below for
details:, The host rvsh002 did not satisfy internal filter HostDevice because
some of the required host devices are unavailable." If I remove the host
device from the VM config, the VM starts and runs fine. This setup was
working fine on RHEL 8.4 and oVirt 4.4.7 using the proper driver for RHEL 8.4.
Here is the engine.log output after I press the Run button:
2022-06-10 11:22:10,506-05 INFO [org.ovirt.engine.core.bll.RunVmCommand] (default task-1) [81144b66-e5f9-474e-a922-e2ce49cdc8ca] Lock Acquired to object 'EngineLock:{exclusiveLocks='[de54b903-7204-4966-95a3-05f64ed17f68=VM]', sharedLocks=''}'
2022-06-10 11:22:10,520-05 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-1) [81144b66-e5f9-474e-a922-e2ce49cdc8ca] START, IsVmDuringInitiatingVDSCommand( IsVmDuringInitiatingVDSCommandParameters:{vmId='de54b903-7204-4966-95a3-05f64ed17f68'}), log id: 6faf22a5
2022-06-10 11:22:10,520-05 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-1) [81144b66-e5f9-474e-a922-e2ce49cdc8ca] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 6faf22a5
2022-06-10 11:22:10,560-05 INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-1) [] Candidate host 'rvsh002' ('f68352c2-6ddc-44ae-a19b-9262e92327f8') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HostDevice' (correlation id: null)
2022-06-10 11:22:10,569-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM ws006 due to a failed validation: [Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host rvsh002 did not satisfy internal filter HostDevice because some of the required host devices are unavailable.] (User: admin@internal-authz).
2022-06-10 11:22:10,569-05 WARN [org.ovirt.engine.core.bll.RunVmCommand] (default task-1) [] Validation of action 'RunVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__RUN,VAR__TYPE__VM,SCHEDULING_ALL_HOSTS_FILTERED_OUT,VAR__FILTERTYPE__INTERNAL,$hostName rvsh002,$filterName HostDevice,VAR__DETAIL__HOST_DEVICE_UNAVAILABLE,SCHEDULING_HOST_FILTERED_REASON_WITH_DETAIL
2022-06-10 11:22:10,570-0
There was nothing in the vdsm.log on the hypervisor related to this issue
that I could see after hitting the run button.
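When an engine.log covers many hosts, it can help to pull out only the SchedulingManager lines that say which host was rejected and by which filter. The following is a rough sketch, not an oVirt tool: the regular expression is derived from the single log line quoted above, and the names FILTERED_OUT and filtered_hosts are hypothetical.

```python
import re

# Pattern derived from the SchedulingManager message quoted in this report.
FILTERED_OUT = re.compile(
    r"Candidate host '(?P<host>[^']+)' \('(?P<host_id>[^']+)'\) was "
    r"filtered out by '(?P<filter_type>[^']+)' filter '(?P<filter>[^']+)'"
)

# Excerpt copied from the engine.log above, including the mail line wrapping.
SAMPLE = """2022-06-10 11:22:10,560-05 INFO
[org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-1)
[] Candidate host 'rvsh002' ('f68352c2-6ddc-44ae-a19b-9262e92327f8') was
filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'HostDevice'
(correlation id: null)"""


def filtered_hosts(log_text: str) -> list[dict[str, str]]:
    """Return one dict per 'filtered out' message found in the log text."""
    # Collapse whitespace first so that wrapped lines still match.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in FILTERED_OUT.finditer(flat)]


if __name__ == "__main__":
    for entry in filtered_hosts(SAMPLE):
        print(f"{entry['host']} rejected by filter {entry['filter']}")
```

In this report the only candidate host, rvsh002, is rejected by the HostDevice filter, so the failure happens in the engine's scheduler before any request reaches the host. That would be consistent with vdsm.log showing nothing.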
Thanks
Don
2 years, 7 months
dnf update fails on ovirt node 4.5.0-2022052513
by pat@patfruth.com
I have freshly installed oVirt Node 4.5 from the ISO downloaded here:
https://resources.ovirt.org/pub/ovirt-4.5/iso/ovirt-node-ng-installer/4.5...
The installation appears to have been successful.
Now, when I attempt to apply the latest updates with 'dnf update', I get an error:
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
Here is the entire output of the dnf update command:
Last metadata expiration check: 0:47:08 ago on Mon 13 Jun 2022 10:55:35 PM MDT.
Dependencies resolved.
=======================================================================================================================================================================
Package                     Architecture  Version        Repository                 Size
=======================================================================================================================================================================
Installing:
ovirt-node-ng-image-update  noarch        4.5.1-0.1.el8  ovirt-45-upstream-testing  1.1 G
    replacing  ovirt-node-ng-image-update-placeholder.noarch 4.5.0.3-1.el8
Transaction Summary
=======================================================================================================================================================================
Install 1 Package
Total download size: 1.1 G
Is this ok [y/N]: y
Downloading Packages:
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[MIRROR] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: Interrupted by header callback: Server reports Content-Length: 1180431893 but expected size is: 1180857942
[FAILED] ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch.rpm: No more mirrors to try - All mirrors were already tried without success
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: Error downloading packages:
ovirt-node-ng-image-update-4.5.1-0.1.el8.noarch: Cannot download, all mirrors were already tried without success
How can I fix this error, and get the latest updates for 4.5 installed?
2 years, 7 months
Initramfs and vmlinuz corrupted, how to recover?
by douglasddr8@gmail.com
I have a Dell server with a BIOS bug that corrupted files during a UEFI boot.
I noticed that the initramfs and vmlinuz files are zero-sized.
How can I retrieve them or generate new ones?
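Before regenerating anything, it may be worth listing exactly which kernel and initramfs files are empty, since only those need to be restored. A minimal sketch, assuming the files live somewhere under /boot and are named with the usual vmlinuz and initramfs prefixes (the function name zero_sized_boot_files is hypothetical):

```python
from pathlib import Path


def zero_sized_boot_files(boot_dir="/boot", prefixes=("vmlinuz", "initramfs")):
    """Return the empty kernel/initramfs files found under boot_dir.

    Searches recursively, because some layouts keep kernels in
    subdirectories of /boot. May need root to read every subdirectory.
    """
    root = Path(boot_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.name.startswith(prefixes)
        and path.stat().st_size == 0
    )


if __name__ == "__main__":
    for path in zero_sized_boot_files():
        print(path)
```

The kernel version embedded in each reported file name identifies which kernel package and which initramfs image are affected.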
2 years, 7 months
Reinstall standalone node without vms loss
by douglasddr8@gmail.com
My server failed and I can't boot via UEFI.
How can I reinstall this (standalone) node without losing my virtual machines? I checked the filesystem and it is completely intact, but I couldn't figure out what caused the UEFI boot to fail.
2 years, 7 months
can't use vmconsole anymore
by Nathanaël Blanchet
Hi,
I used to use the vmconsole proxy, but for a while now I have been getting
this error (currently on 4.4.5):
# ssh -t -p 2222 ovirt-vmconsole@air.v100.abes.fr connect
ovirt-vmconsole@air.v100.abes.fr: Permission denied (publickey).
I found the following in the engine.log:
2021-04-15 17:55:43,094+02 ERROR [org.ovirt.engine.core.services.VMConsoleProxyServlet] (default task-4) [] Error validating ticket: : sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
    at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
    at org.ovirt.engine.core.uutils//org.ovirt.engine.core.uutils.crypto.CertificateChain.buildCertPath(CertificateChain.java:128)
    at org.ovirt.engine.core.uutils//org.ovirt.engine.core.uutils.crypto.ticket.TicketDecoder.decode(TicketDecoder.java:89)
    at deployment.engine.ear.services.war//org.ovirt.engine.core.services.VMConsoleProxyServlet.validateTicket(VMConsoleProxyServlet.java:175)
    at deployment.engine.ear.services.war//org.ovirt.engine.core.services.VMConsoleProxyServlet.doPost(VMConsoleProxyServlet.java:225)
The user key is the correct one: I use the same key with my other engines,
where I can successfully connect to VM consoles.
Thank you for helping
--
Nathanaël Blanchet
Supervision réseau
SIRE
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet(a)abes.fr
2 years, 7 months
Host fails to activate
by David Johnson
Good afternoon all,
Ovirt version: 4.14.4.10.7-1.el8
CentOS version: Linux version 4.18.0-365.el8.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)) #1 SMP Thu Feb 10 16:11:23 UTC 2022
Background:
We had a motherboard fail in our storage device. I was able to migrate the
storage domain to the backup device before it failed completely, and we have
been running on the backup device for several weeks while we purchased
replacement main storage.
Today I shut everything down cleanly, replaced the main storage, and
restarted the cluster. We did disconnect and reconnect the network on all
of the devices as we shuffled equipment in the rack.
One of the hosts in the cluster refuses to come back up. I am able to
connect to the host via PuTTY.
oVirt GUI reporting:
Setting Host ovirt-host-03.maxisinc.net to Non-Operational mode.
Completed: Jun 11, 2022, 4:59:57 PM
Activating Host ovirt-host-03.maxisinc.net
Completed: Jun 11, 2022, 4:59:57 PM
Invoking Activate Host ovirt-host-03.maxisinc.net
Completed: Jun 11, 2022, 4:57:40 PM
Installing Host ovirt-host-03.maxisinc.net
The log from the host is:
5:09 PM
GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
pulseaudio
4:55 PM
bondscan-DGwC1l: option lacp_active: mode dependency failed, not supported in mode balance-alb(6)
kernel
4:55 PM
bondscan-DGwC1l: option arp_all_targets: invalid value (2)
kernel
4:55 PM
bondscan-DGwC1l: option fail_over_mac: invalid value (3)
kernel
4:55 PM
bondscan-DGwC1l: option primary_reselect: invalid value (3)
kernel
4:55 PM
bondscan-DGwC1l: option ad_select: invalid value (3)
kernel
4:55 PM
2 years, 7 months
oVirt SSH rate limit and disable SSH passwd auth
by tasnadi.peter@kifu.gov.hu
Hello,
1.
Is it possible to disable SSH root password authentication in a working oVirt cluster (hosts and ovirt-engine) without causing any problems?
/etc/ssh/sshd_config
PasswordAuthentication no
The SSH public authentication key is set on the host.
2.
I tried to set an SSH rate limit using firewall-cmd, but for some reason it doesn't work: I can still log in more than once.
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" priority="-1" service name=ssh limit value=3/m accept'
Is there a best practice for this?
Thanks
Peter
2 years, 7 months