Re: Future of the oVirt CSI Driver
by Mike Rochefort
On 7/11/23 2:58 AM, Sandro Bonazzola wrote:
> I think that if OKD is dropping the oVirt CSI Driver this also will mean
> nobody will be actively developing it anymore.
I asked about this on the Kubernetes Slack workspace and Vadim mentioned
that in order for the OKD project to continue providing oVirt support,
someone with oVirt knowledge and operator development experience would need to step
up. It would also require forking the installer and a few other
projects, which is something the OKD team spent a lot of effort to not
have to do anymore during the 4.x series.
https://github.com/openshift/installer
Looking at the OpenShift Installer git, so far what's changed is an
admin can no longer specify oVirt as a deployment target. According to
Jira, there will be a second phase of the deprecation where the actual
provisioning pieces (e.g. Terraform) will be removed, but that probably
won't be for 4.14.
https://issues.redhat.com//browse/OCPBUGS-14818
But not having anyone develop the CSI driver means it will likely end up
with bit rot. With OpenShift 4.14 the oVirt platform becomes less
attractive to use, though IPI/UPI installations should be possible. A
different storage backend would probably be recommended, however.
--
Mike Rochefort
1 year, 4 months
oVirt Self-Hosted Deploy looping while searching for an available subnet
by Lucy Silvestre
Hi,
I have been stuck for weeks on this problem during self-hosted deployment. Does anyone have suggestions to resolve it?
Error:
The deployment loops through this part until it fails:
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Get ip route]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Fail if can't find an available subnet]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Set new IPv4 subnet prefix]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Search again with another prefix]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Define 3rd chunk]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Set 3rd chunk]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Get ip route]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Fail if can't find an available subnet]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Set new IPv4 subnet prefix]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Search again with another prefix]
[ INFO ] ok: [localhost]
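Judging only by the task names in the log above ("Get ip route", "Set 3rd chunk", "Search again with another prefix"), the loop appears to be probing candidate subnets against the host routing table. The following is a minimal sketch of that kind of search, not the role's actual code: the function names, the "192.168" prefix, and the /24 size are assumptions made for illustration.

```python
import ipaddress


def parse_route_networks(ip_route_output):
    """Collect destination networks from the text output of `ip route`."""
    networks = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "default":
            continue
        try:
            # strict=False accepts host routes and non-aligned addresses
            networks.append(ipaddress.ip_network(fields[0], strict=False))
        except ValueError:
            # first field is a keyword such as "unreachable", not a network
            continue
    return networks


def find_free_subnet(ip_route_output, first_two_octets="192.168",
                     start_chunk=1, max_chunk=254):
    """Return the first /24 under the prefix that overlaps no existing route.

    Hypothetical helper: iterates the third octet ("3rd chunk") until a
    candidate subnet is free, or returns None when every candidate is taken.
    """
    used = [n for n in parse_route_networks(ip_route_output) if n.version == 4]
    for chunk in range(start_chunk, max_chunk + 1):
        candidate = ipaddress.ip_network(f"{first_two_octets}.{chunk}.0/24")
        if not any(candidate.overlaps(n) for n in used):
            return candidate
    return None
```

Feeding the host's real `ip route` output into a check like this can show whether any free /24 exists under the prefix being searched. A route that covers the whole prefix (for example a /16) would make every candidate overlap, so a search of this kind would never succeed.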
Details:
OS: Oracle Linux 8.8 (minimal installation), updated
oVirt 4.4
ovirt-hosted-engine-setup and ovirt-engine-appliance installed
Network device used on the host machine: eno1 - 192.168.10.x; I also tried with a bond.
I made DHCP reservations for the IPs used during the installation.
I tried from both the terminal and Cockpit, with the same result.
I tried the command: hosted-engine --deploy --4
I tried with both DHCP and a static IP.
I would appreciate any help. Thank you.
1 year, 4 months
ACTION_TYPE_FAILED_DISK_IS_BEING_TRANSFERRED
by eshwayri@gmail.com
System unexpectedly lost power. Now when I try to start one of the VMs I get: "ACTION_TYPE_FAILED_DISK_IS_BEING_TRANSFERRED". This may be due to a failed backup earlier in the week. There are no active tasks against this VM or its disk at this time. The disk shows OK in the storage view. Any ideas?
1 year, 4 months
Suggestion to switch to nightly
by Sandro Bonazzola
Hi,
As you have probably noticed, there have been no regular releases since oVirt 4.5.4
<https://ovirt.org/release/4.5.4/> in December 2022.
Despite the calls to action to the community and to the companies involved
with oVirt, nobody has yet stepped up to take over the leadership of the
oVirt project.
The developers at Red Hat still dedicating time to the project are now
facing the fact that they lack the time to do formal releases, even though
they keep fixing platform regressions like the recent ones caused by the new
ansible changes. That makes a nightly snapshot setup a more stable
environment than oVirt 4.5.4.
For this reason, we would like to suggest that the user community enable
nightly repositories for oVirt by following the procedure at:
https://www.ovirt.org/develop/dev-process/install-nightly-snapshot.html
This will ensure that the latest fixes for the platform regressions will be
promptly available.
Regards,
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING - Red Hat In-Vehicle Operating System
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
1 year, 4 months
GPU Passthrough issues with oVirt 4.5
by Vinícius Ferrão
Hello, is anyone having issues with device passthrough on oVirt 4.5?
I can pass the devices through to a given VM without issue, but inside the VM it fails to recognize all the devices.
In my case I’ve added 4x GPUs to a VM, but only one shows up, and there are the following errors inside the VM:
[ 23.006655] nvidia 0000:0a:00.0: enabling device (0000 -> 0002)
[ 23.008026] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008035] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR2 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008040] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR3 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008045] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR4 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008049] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR5 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.012339] NVRM: The NVIDIA GPU 0000:0a:00.0 (PCI ID: 10de:1db1)
NVRM: installed in this system is not supported by the
NVRM: NVIDIA 535.54.03 driver release.
NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
NVRM: in this release's README, available on the operating system
NVRM: specific graphics driver download page at www.nvidia.com.
[ 23.016175] nvidia: probe of 0000:0a:00.0 failed with error -1
[ 23.016838] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0b:00.0)
[ 23.016842] nvidia: probe of 0000:0b:00.0 failed with error -1
[ 23.017211] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0c:00.0)
[ 23.017215] nvidia: probe of 0000:0c:00.0 failed with error -1
[ 23.017248] NVRM: The NVIDIA probe routine failed for 3 device(s).
[ 23.214409] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 535.54.03 Tue Jun 6 22:20:39 UTC 2023
[ 23.485704] [drm] [nvidia-drm] [GPU ID 0x00000900] Loading driver
[ 23.485708] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:09:00.0 on minor 1
On the host, this shows up in dmesg, but it seems right:
[ 709.572845] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 709.572877] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 709.572883] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[ 710.660813] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 710.660845] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 710.660851] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[ 711.748760] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 711.748791] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 711.748797] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[ 712.836687] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 712.836718] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 712.836725] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
Thanks.
1 year, 4 months
No bootable disk OVA
by cello86@gmail.com
Hi all,
we imported a RHEL 9.2 image from an OVA generated on VMware and tried to create a new VM from it, but we got a "no bootable disk" error. The OVA was imported with the virt-v2v library, and when we created the new VM we noticed that the disk was 2 GB in size, but we resized the disk to 50 GB.
The VM was started with the Q35 chipset with UEFI, and the disk has the bootable flag set.
We're using oVirt 4.5.4-1.
Could you help us sort out this issue?
Thanks,
Marcello
1 year, 4 months
Restoring HE Fails, engine-config cannot connect to database
by Levi Wilbert
I am attempting to restore an HE backup to a fresh host (not previously in the old environment) in order to restore our old environment, but I am running into issues during the deployment.
Basically my goal is to remove and redeploy an existing HE back into its same environment on a new storage domain.
What I've done:
Backed up the HE from the prior environment
Installed oVirt 4.5.10 on a fresh node that was not in the prior environment
Ran the redeployment: hosted-engine --deploy --restore-from-file=<bkpfile> --4
The script pauses the deployment (even though I told it not to). During this pause I update /etc/dnf/dnf.conf with "exclude=ansible-core", since once ansible-core is updated it breaks the deployment script with Python incompatibilities.
But I'm running into the following:
[ ERROR ] fatal: [localhost -> 192.168.222.158]: FAILED! => {"changed": true, "cmd": "set -euo pipefail && engine-config -g DisableFenceAtStartupInSec | cut -d' ' -f2 > /root/DisableFenceAtStartupInSec.txt", "delta": "0:00:01.296169", "end": "2023-07-05 11:29:14.101292", "msg": "non-zero return code", "rc": 1, "start": "2023-07-05 11:29:12.805123", "stderr": "Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false", "stderr_lines": ["Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false"], "stdout": "", "stdout_lines": []}
I see that it fails running the engine-config command on the new hosted engine, but when I SSH to it and try running it, I get:
# engine-config -l
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
Connection to the Database failed. Please check that the hostname and port number are correct and that the Database service is up and running.
I haven't been able to find anything specifically for this area searching through Google. Anyone have any idea where to go with this?
1 year, 4 months
ovirt 4.5.4 deploy self-hosted engine
by Jorge Visentini
Hi.
I'm trying to deploy the engine but I'm having some errors that I couldn't
identify.
I don't know if it's incompatibility with my hardware or some libvirt bug.
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48505)
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Domain id=1
name='HostedEngineLocal' uuid=922a156c-7f4c-4815-a645-54ed07794451 is
tainted: custom-ga-command
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud virtlogd[630980]: Client hit
max requests limit 1. This may result in keep-alive timeouts. Consider
tuning the max_client_requests server parameter
Jul 05 10:06:22 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:26 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48500)
Jul 05 10:06:31 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48495)
Jul 05 10:06:31 ksmmi1r02ovirt36.kosmo.cloud systemd[1]:
systemd-timedated.service: Deactivated successfully.
Jul 05 10:06:36 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48490)
Jul 05 10:06:37 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:41 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48485)
Jul 05 10:06:46 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48480)
Jul 05 10:06:51 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48475)
Jul 05 10:06:52 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:56 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48470)
Jul 05 10:07:01 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48465)
*My config:*
*CPU:* 2 x Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz
*Memory:* 4TB
*Disk:* 120GB RAID 1
*ISO:* ovirt-node-ng-installer-4.5.4-2022120615.el9.iso
*Packages:*
kernel-5.14.0-202.el9.x86_64
libvirt-8.9.0-2.el9.x86_64
centos-release-ovirt45-9.1-3.el9s.noarch
python3-ovirt-engine-sdk4-4.6.0-1.el9.x86_64
ovirt-imageio-common-2.4.7-1.el9.x86_64
ovirt-imageio-client-2.4.7-1.el9.x86_64
ovirt-openvswitch-ovn-2.15-4.el9.noarch
ovirt-openvswitch-ovn-common-2.15-4.el9.noarch
ovirt-imageio-daemon-2.4.7-1.el9.x86_64
ovirt-openvswitch-ovn-host-2.15-4.el9.noarch
python3-ovirt-setup-lib-1.3.3-1.el9.noarch
ovirt-vmconsole-1.0.9-1.el9.noarch
ovirt-vmconsole-host-1.0.9-1.el9.noarch
ovirt-openvswitch-2.15-4.el9.noarch
python3-ovirt-node-ng-nodectl-4.4.2-1.el9.noarch
ovirt-node-ng-nodectl-4.4.2-1.el9.noarch
ovirt-ansible-collection-3.0.0-1.el9.noarch
ovirt-python-openvswitch-2.15-4.el9.noarch
ovirt-openvswitch-ipsec-2.15-4.el9.noarch
ovirt-hosted-engine-ha-2.5.0-1.el9.noarch
ovirt-provider-ovn-driver-1.2.36-1.el9.noarch
ovirt-host-dependencies-4.5.0-3.el9.x86_64
ovirt-hosted-engine-setup-2.7.0-1.el9.noarch
ovirt-host-4.5.0-3.el9.x86_64
ovirt-release-host-node-4.5.4-1.el9.x86_64
ovirt-node-ng-image-update-placeholder-4.5.4-1.el9.noarch
ovirt-engine-appliance-4.5-20221206125848.1.el9.x86_64
For better understanding, the deploy log is attached.
I appreciate any tips that help me.
Thank you!
--
Att,
Jorge Visentini
+55 55 98432-9868
1 year, 4 months