ACTION_TYPE_FAILED_DISK_IS_BEING_TRANSFERRED
by eshwayri@gmail.com
The system unexpectedly lost power. Now when I try to start one of the VMs I get: "ACTION_TYPE_FAILED_DISK_IS_BEING_TRANSFERRED". This may be due to a backup that failed earlier in the week. There are no active tasks against this VM or its disk at this time. The disk shows OK in the storage view. Any ideas?
1 year, 9 months
Suggestion to switch to nightly
by Sandro Bonazzola
Hi,
As you probably noticed, there have been no regular releases since oVirt 4.5.4
<https://ovirt.org/release/4.5.4/> in December 2022.
Despite the calls to action to the community and to the companies involved
with oVirt, there has been no uptake of the leadership of the oVirt project
yet.
The developers at Red Hat still dedicating time to the project are now
facing the fact that they lack the time to do formal releases, even though
they keep fixing platform regressions such as the recent ones caused by the
new Ansible changes. That makes a nightly snapshot setup a more stable
environment than oVirt 4.5.4.
For this reason, we would like to suggest that the user community enable
nightly repositories for oVirt by following the procedure at:
https://www.ovirt.org/develop/dev-process/install-nightly-snapshot.html
This will ensure that the latest fixes for the platform regressions will be
promptly available.
Regards,
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING - Red Hat In-Vehicle Operating System
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.*
1 year, 9 months
GPU Passthrough issues with oVirt 4.5
by Vinícius Ferrão
Hello, is anyone having issues with device passthrough on oVirt 4.5?
I can pass the devices through to a given VM without issue, but the VM fails to recognize all of them.
In my case I’ve added 4x GPUs to a VM, but only one shows up, and the following errors appear inside the VM:
[ 23.006655] nvidia 0000:0a:00.0: enabling device (0000 -> 0002)
[ 23.008026] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008035] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR2 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008040] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR3 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008045] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR4 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.008049] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR5 is 0M @ 0x0 (PCI:0000:0a:00.0)
[ 23.012339] NVRM: The NVIDIA GPU 0000:0a:00.0 (PCI ID: 10de:1db1)
NVRM: installed in this system is not supported by the
NVRM: NVIDIA 535.54.03 driver release.
NVRM: Please see 'Appendix A - Supported NVIDIA GPU Products'
NVRM: in this release's README, available on the operating system
NVRM: specific graphics driver download page at www.nvidia.com.
[ 23.016175] nvidia: probe of 0000:0a:00.0 failed with error -1
[ 23.016838] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0b:00.0)
[ 23.016842] nvidia: probe of 0000:0b:00.0 failed with error -1
[ 23.017211] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0c:00.0)
[ 23.017215] nvidia: probe of 0000:0c:00.0 failed with error -1
[ 23.017248] NVRM: The NVIDIA probe routine failed for 3 device(s).
[ 23.214409] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 535.54.03 Tue Jun 6 22:20:39 UTC 2023
[ 23.485704] [drm] [nvidia-drm] [GPU ID 0x00000900] Loading driver
[ 23.485708] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:09:00.0 on minor 1
On the host, the following shows up in dmesg, but it looks right:
[ 709.572845] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 709.572877] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 709.572883] vfio-pci 0000:1a:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[ 710.660813] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 710.660845] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 710.660851] vfio-pci 0000:1d:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[ 711.748760] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 711.748791] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 711.748797] vfio-pci 0000:1e:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
[ 712.836687] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 712.836718] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 712.836725] vfio-pci 0000:1c:00.0: vfio_ecap_init: hiding ecap 0x23@0xac0
Thanks.
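A minimal sketch, not part of the original report, for summarizing which passed-through devices the guest reports with unassigned BARs. It assumes the guest dmesg output has been saved as text; the function name `unassigned_bars` and the abbreviated `SAMPLE` (taken from the log above) are illustrative only.

```python
import re
from collections import defaultdict

# Matches guest dmesg lines such as: "NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)"
BAR_LINE = re.compile(r"NVRM: BAR(\d+) is (\S+) @ (\S+) \(PCI:([0-9a-fA-F:.]+)\)")


def unassigned_bars(dmesg_text):
    """Map each PCI address to the BAR indexes reported as 0M @ 0x0."""
    result = defaultdict(list)
    for match in BAR_LINE.finditer(dmesg_text):
        bar, size, base, pci = match.groups()
        if size == "0M" and base == "0x0":
            result[pci].append(int(bar))
    return dict(result)


# Abbreviated excerpt of the guest log quoted in this thread.
SAMPLE = """\
NVRM: BAR1 is 0M @ 0x0 (PCI:0000:0a:00.0)
NVRM: BAR2 is 0M @ 0x0 (PCI:0000:0a:00.0)
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0b:00.0)
NVRM: BAR0 is 0M @ 0x0 (PCI:0000:0c:00.0)
"""

if __name__ == "__main__":
    for pci, bars in unassigned_bars(SAMPLE).items():
        print(pci, bars)
```

Grouping by PCI address makes it easier to see that three of the four guest devices (0a, 0b and 0c) have BARs with no address assigned, while 09:00.0 initializes normally.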
1 year, 9 months
No bootable disk OVA
by cello86@gmail.com
Hi all,
We imported a RHEL 9.2 image from an OVA generated on VMware and tried to create a new VM from it, but we got a "no bootable disk" error. The OVA was imported with the virt-v2v library, and when we create a new VM we noticed that the disk is 2 GB in size, but we resized the disk to 50 GB.
The VM was started with the Q35 chipset with UEFI, and the disk has the bootable flag enabled.
We're using oVirt 4.5.4-1.
Could you help us sort out this issue?
Thanks,
Marcello
1 year, 9 months
Restoring HE Fails, engine-config cannot connect to database
by Levi Wilbert
I am attempting to restore an HE backup to a fresh host (not previously in the old environment) in order to restore our old environment, but I am running into issues during the deployment.
Basically my goal is to remove and redeploy an existing HE back into its same environment on a new storage domain.
What I've done:
Backed up the HE from the prior environment
Installed oVirt 4.5.10 on a fresh node that was not in the prior environment
Ran the redeployment: hosted-engine --deploy --restore-from-file=<bkpfile> --4
The script pauses the deployment (even though I told it not to). During this pause I update /etc/dnf/dnf.conf with "exclude=ansible-core", since once ansible-core is updated it breaks the deployment script with Python incompatibilities.
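The dnf.conf change in the last step could be scripted so that it is applied the same way on every attempt. This is only a sketch under assumptions: the helper name `add_exclude` is hypothetical, it works on the file contents as a string, and it assumes the file has a single [main] section, as a stock dnf.conf usually does.

```python
def add_exclude(conf_text, package):
    """Return dnf.conf text with `package` listed in the exclude option."""
    lines = conf_text.splitlines()
    for i, line in enumerate(lines):
        key, sep, value = line.partition("=")
        if sep and key.strip() == "exclude":
            # Existing entries may be separated by spaces or commas.
            if package not in value.replace(",", " ").split():
                lines[i] = f"{line.rstrip()} {package}"
            return "\n".join(lines) + "\n"

    # No exclude option yet: add one right after the [main] header.
    out = []
    added = False
    for line in lines:
        out.append(line)
        if line.strip() == "[main]" and not added:
            out.append(f"exclude={package}")
            added = True
    if not added:
        out = ["[main]", f"exclude={package}"] + out
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(add_exclude("[main]\ngpgcheck=1\n", "ansible-core"))
```

Running the function twice on the same text leaves it unchanged, so it is safe to call each time the deployment pauses.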
But I'm running into the following:
[ ERROR ] fatal: [localhost -> 192.168.222.158]: FAILED! => {"changed": true, "cmd": "set -euo pipefail && engine-config -g DisableFenceAtStartupInSec | cut -d' ' -f2 > /root/DisableFenceAtStartupInSec.txt", "delta": "0:00:01.296169", "end": "2023-07-05 11:29:14.101292", "msg": "non-zero return code", "rc": 1, "start": "2023-07-05 11:29:12.805123", "stderr": "Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false", "stderr_lines": ["Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false"], "stdout": "", "stdout_lines": []}
I see that it fails running the engine-config command on the new hosted engine, but when I SSH to it and try running it, I get:
# engine-config -l
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
Connection to the Database failed. Please check that the hostname and port number are correct and that the Database service is up and running.
I haven't been able to find anything specific to this problem by searching Google. Does anyone have any idea where to go with this?
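For readers trying to pick apart the failed task output above, here is a small sketch that decodes the result dictionary. It assumes the part after "FAILED! => " is valid JSON, which holds for the line quoted here but may not for every Ansible callback; the function name is hypothetical and `SAMPLE` is an abbreviated copy of that line.

```python
import json


def parse_ansible_failure(line):
    """Decode the JSON result that follows 'FAILED! => ' in an Ansible error line."""
    _, found, payload = line.partition("FAILED! => ")
    if not found:
        raise ValueError("not an Ansible failure line")
    return json.loads(payload)


# Abbreviated from the error quoted in this thread (timing fields omitted).
SAMPLE = """\
[ ERROR ] fatal: [localhost -> 192.168.222.158]: FAILED! => \
{"changed": true, \
"cmd": "set -euo pipefail && engine-config -g DisableFenceAtStartupInSec \
| cut -d' ' -f2 > /root/DisableFenceAtStartupInSec.txt", \
"msg": "non-zero return code", "rc": 1, \
"stderr": "Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false", \
"stdout": ""}"""

if __name__ == "__main__":
    result = parse_ansible_failure(SAMPLE)
    for key in ("cmd", "rc", "stdout", "stderr"):
        print(f"{key}: {result[key]!r}")
```

Note that the task redirects the pipeline's standard output into /root/DisableFenceAtStartupInSec.txt, so an empty "stdout" in the result is expected regardless of what engine-config printed.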
1 year, 9 months
ovirt 4.5.4 deploy self-hosted engine
by Jorge Visentini
Hi.
I'm trying to deploy the engine but I'm having some errors that I couldn't
identify.
I don't know if it's incompatibility with my hardware or some libvirt bug.
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48505)
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Domain id=1
name='HostedEngineLocal' uuid=922a156c-7f4c-4815-a645-54ed07794451 is
tainted: custom-ga-command
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud virtlogd[630980]: Client hit
max requests limit 1. This may result in keep-alive timeouts. Consider
tuning the max_client_requests server parameter
Jul 05 10:06:22 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:26 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48500)
Jul 05 10:06:31 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48495)
Jul 05 10:06:31 ksmmi1r02ovirt36.kosmo.cloud systemd[1]:
systemd-timedated.service: Deactivated successfully.
Jul 05 10:06:36 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48490)
Jul 05 10:06:37 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:41 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48485)
Jul 05 10:06:46 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48480)
Jul 05 10:06:51 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48475)
Jul 05 10:06:52 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:56 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48470)
Jul 05 10:07:01 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48465)
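Most of the journal excerpt above is the periodic "still running" heartbeat from the Ansible async wrapper. The following is a rough sketch, with hypothetical function names, that rejoins the mail-wrapped lines, drops the heartbeat, and counts what remains per process. It assumes each record starts with a timestamp like "Jul 05 10:06:21".

```python
import re
from collections import Counter

# A journal record starts with a syslog-style timestamp such as "Jul 05 10:06:21 ".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} ")
# Layout assumed: timestamp, host, process[pid]: message
ENTRY = re.compile(r"^\w{3} \d{2} [\d:]{8} \S+ ([^\s\[]+)\[\d+\]: (.*)$")


def join_wrapped(text):
    """Rejoin records that were split across lines by mail wrapping."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(text):
    """Count (process, message) pairs, ignoring 'still running' heartbeats."""
    counts = Counter()
    for record in join_wrapped(text):
        if "still running" in record:
            continue
        match = ENTRY.match(record)
        if match:
            counts[match.groups()] += 1
    return dict(counts)


# Abbreviated excerpt of the log quoted in this thread.
SAMPLE = """\
Jul 05 10:06:21 ksmmi1r02ovirt36.kosmo.cloud
ansible-async_wrapper.py[690916]: 690917 still running (48505)
Jul 05 10:06:22 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
Jul 05 10:06:37 ksmmi1r02ovirt36.kosmo.cloud libvirtd[701878]: Invalid
value '-1' for 'cpu.max': Invalid argument
"""

if __name__ == "__main__":
    for (process, message), count in summarize(SAMPLE).items():
        print(count, process, message)
```

On the excerpt this leaves only the repeated libvirtd "cpu.max" message, which is the line worth comparing against the attached deploy log.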
*My config:*
*CPU:* 2 x Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz
*Memory:* 4TB
*Disk:* 120GB RAID 1
*ISO:* ovirt-node-ng-installer-4.5.4-2022120615.el9.iso
*Packages:*
kernel-5.14.0-202.el9.x86_64
libvirt-8.9.0-2.el9.x86_64
centos-release-ovirt45-9.1-3.el9s.noarch
python3-ovirt-engine-sdk4-4.6.0-1.el9.x86_64
ovirt-imageio-common-2.4.7-1.el9.x86_64
ovirt-imageio-client-2.4.7-1.el9.x86_64
ovirt-openvswitch-ovn-2.15-4.el9.noarch
ovirt-openvswitch-ovn-common-2.15-4.el9.noarch
ovirt-imageio-daemon-2.4.7-1.el9.x86_64
ovirt-openvswitch-ovn-host-2.15-4.el9.noarch
python3-ovirt-setup-lib-1.3.3-1.el9.noarch
ovirt-vmconsole-1.0.9-1.el9.noarch
ovirt-vmconsole-host-1.0.9-1.el9.noarch
ovirt-openvswitch-2.15-4.el9.noarch
python3-ovirt-node-ng-nodectl-4.4.2-1.el9.noarch
ovirt-node-ng-nodectl-4.4.2-1.el9.noarch
ovirt-ansible-collection-3.0.0-1.el9.noarch
ovirt-python-openvswitch-2.15-4.el9.noarch
ovirt-openvswitch-ipsec-2.15-4.el9.noarch
ovirt-hosted-engine-ha-2.5.0-1.el9.noarch
ovirt-provider-ovn-driver-1.2.36-1.el9.noarch
ovirt-host-dependencies-4.5.0-3.el9.x86_64
ovirt-hosted-engine-setup-2.7.0-1.el9.noarch
ovirt-host-4.5.0-3.el9.x86_64
ovirt-release-host-node-4.5.4-1.el9.x86_64
ovirt-node-ng-image-update-placeholder-4.5.4-1.el9.noarch
ovirt-engine-appliance-4.5-20221206125848.1.el9.x86_64
For better understanding, the deploy log is attached.
I appreciate any tips that help me.
Thank you!
--
Att,
Jorge Visentini
+55 55 98432-9868
1 year, 9 months
ovirt template import using ansible
by destfinal@googlemail.com
Hi,
I use a set of templates, generated in one cluster and re-used in multiple clusters. The clusters do not have direct connections between each other. My dev environment can talk to all the clusters. Currently,
1. I export the templates (as OVAs) to one of the nodes (example: node1.source.cluster), from the ovirt console (https://management.source.cluster)
2. scp the templates to my dev machine (example: scp -r node1.source.cluster:/tmp/ovirt_templates /tmp/ovirt_templates)
3. scp the templates from my dev environment to the target cluster (example: scp -r /tmp/ovirt_templates node1.target.cluster:/tmp)
4. Import the templates using the ovirt console of the target cluster (https://management.target.cluster)
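As a partial illustration only, the two copy hops in steps 2 and 3 could be scripted as below. The host names and paths are the examples from the list above, the function names are hypothetical, and the sketch assumes key-based SSH access from the dev machine to both nodes. It does not cover the export in step 1 or the import in step 4.

```python
import subprocess


def relay_commands(source_host, target_host,
                   remote_dir="/tmp/ovirt_templates",
                   local_dir="/tmp/ovirt_templates",
                   target_parent="/tmp"):
    """Build the two scp commands: source node -> dev machine -> target node."""
    return [
        ["scp", "-r", f"{source_host}:{remote_dir}", local_dir],
        ["scp", "-r", local_dir, f"{target_host}:{target_parent}"],
    ]


def relay(source_host, target_host):
    """Run both copies in order, stopping at the first failure."""
    for command in relay_commands(source_host, target_host):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for command in relay_commands("node1.source.cluster", "node1.target.cluster"):
        print(" ".join(command))
```

Both commands use -r because /tmp/ovirt_templates is a directory.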
This is a highly manual job and I am trying to automate the process using Ansible. I have not been able to work it out from the ovirt_template module documentation (https://docs.ansible.com/ansible/latest/collections/ovirt/ovirt/ovirt_tem...) and could not find any other module for this purpose.
Has anybody done this before who can point me in the right direction? Or, if there is a better process than the one I follow above, please suggest it.
Please let me know if you need more information in this regard.
Thanks
1 year, 9 months