Re: Multiple GPU Passthrough with NVLink (Invalid I/O region)
by Zhengyi Lai
I noticed this document https://docs.nvidia.com/vgpu/16.0/grid-vgpu-release-notes-generic-linux-k... has this to say
In pass through mode, all GPUs connected to each other through NVLink must be assigned to the same VM. If a subset of GPUs connected to each other through NVLink is passed through to a VM, unrecoverable error XID 74 occurs when the VM is booted. This error corrupts the NVLink state on the physical GPUs and, as a result, the NVLink bridge between the GPUs is unusable.
You may need to pass through all GPUs that are connected by NVLink to the same VM.
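To make the rule concrete, here is a minimal Python sketch that checks whether a planned passthrough selection leaves any NVLink peer behind on the host. The input format (a list of GPU pairs) and the names `nvlink_groups`, `missing_peers` and `GPU0`..`GPU3` are assumptions made for illustration; on a real host the pairs would have to be collected by hand, for example from the `NV#` entries shown by `nvidia-smi topo -m`.

```python
from collections import defaultdict


def nvlink_groups(links):
    """Group GPUs into sets connected directly or indirectly by NVLink.

    `links` is an iterable of (gpu_a, gpu_b) pairs (hypothetical input format).
    """
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from `start`.
        group, stack = set(), [start]
        while stack:
            gpu = stack.pop()
            if gpu in group:
                continue
            group.add(gpu)
            stack.extend(adjacency[gpu] - group)
        seen |= group
        groups.append(group)
    return groups


def missing_peers(selected, links):
    """Return GPUs that share an NVLink group with the selection but are not in it."""
    selected = set(selected)
    missing = set()
    for group in nvlink_groups(links):
        if group & selected:
            missing |= group - selected
    return missing


# Hypothetical topology: GPU0-GPU1 are bridged, GPU2-GPU3 are bridged.
links = [("GPU0", "GPU1"), ("GPU2", "GPU3")]
print(missing_peers({"GPU0"}, links))          # GPU1 would be left behind
print(missing_peers({"GPU2", "GPU3"}, links))  # nothing missing
```

If `missing_peers` returns a non-empty set, the selection is a subset of an NVLink group, which is the situation the release notes warn about.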
8 months
problem on dnf update
by g.vasilopoulos@uoc.gr
Hi
I tried dnf update today and got the error below. Is there a solution for this?
[root@*****~]# dnf update
Last metadata expiration check: 3:42:36 ago on Mon 17 Jun 2024 08:47:59 AM EEST.
Error:
Problem 1: package python3-boto3-1.18.58-1.el9s.noarch from @System requires (python3.9dist(botocore) < 1.22 with python3.9dist(botocore) >= 1.21.58), but none of the providers can be installed
- cannot install both python3-botocore-1.31.62-1.el9.noarch from appstream and python3-botocore-1.21.58-1.el9s.noarch from @System
- cannot install both python3-botocore-1.21.58-1.el9s.noarch from ovirt-master-centos-stream-openstack-yoga-testing and python3-botocore-1.31.62-1.el9.noarch from appstream
- cannot install the best update candidate for package python3-botocore-1.21.58-1.el9s.noarch
- cannot install the best update candidate for package python3-boto3-1.18.58-1.el9s.noarch
Problem 2: package python3-pyngus-2.3.0-8.el9s.noarch from @System requires python3.9dist(python-qpid-proton), but none of the providers can be installed
- cannot install both python3-qpid-proton-0.39.0-2.el9s.x86_64 from ovirt-master-centos-stream-opstools-collectd5-testing and python3-qpid-proton-0.35.0-2.el9s.x86_64 from @System
- cannot install both python3-qpid-proton-0.35.0-2.el9s.x86_64 from ovirt-master-centos-stream-openstack-yoga-testing and python3-qpid-proton-0.39.0-2.el9s.x86_64 from ovirt-master-centos-stream-opstools-collectd5-testing
- cannot install the best update candidate for package python3-qpid-proton-0.35.0-2.el9s.x86_64
- cannot install the best update candidate for package python3-pyngus-2.3.0-8.el9s.noarch
Problem 3: package python3-oslo-messaging-12.13.3-1.el9s.noarch from @System requires python3-pyngus, but none of the providers can be installed
- package python3-pyngus-2.3.0-8.el9s.noarch from @System requires python3.9dist(python-qpid-proton), but none of the providers can be installed
- package python3-pyngus-2.3.0-8.el9s.noarch from ovirt-master-centos-stream-openstack-yoga-testing requires python3.9dist(python-qpid-proton), but none of the providers can be installed
- package python3-qpid-proton-0.35.0-2.el9s.x86_64 from @System requires qpid-proton-c(x86-64) = 0.35.0-2.el9s, but none of the providers can be installed
- package python3-qpid-proton-0.35.0-2.el9s.x86_64 from ovirt-master-centos-stream-openstack-yoga-testing requires qpid-proton-c(x86-64) = 0.35.0-2.el9s, but none of the providers can be installed
- cannot install both qpid-proton-c-0.39.0-2.el9s.x86_64 from ovirt-master-centos-stream-opstools-collectd5-testing and qpid-proton-c-0.35.0-2.el9s.x86_64 from @System
- cannot install both qpid-proton-c-0.35.0-2.el9s.x86_64 from ovirt-master-centos-stream-openstack-yoga-testing and qpid-proton-c-0.39.0-2.el9s.x86_64 from ovirt-master-centos-stream-opstools-collectd5-testing
- cannot install the best update candidate for package qpid-proton-c-0.35.0-2.el9s.x86_64
- cannot install the best update candidate for package python3-oslo-messaging-12.13.3-1.el9s.noarch
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
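Each "cannot install both" line in the output above names two packages and the repositories they come from, so it can help to tally which repositories are involved before choosing between the options dnf suggests. A minimal Python sketch, assuming the output has the same wording as pasted here (the regular expression and the function name `conflicting_repos` are illustrative, not part of dnf):

```python
import re
from collections import Counter

# Matches lines such as:
#   cannot install both <pkg_a> from <repo_a> and <pkg_b> from <repo_b>
CONFLICT = re.compile(
    r"cannot install both (?P<pkg_a>\S+) from (?P<repo_a>\S+) "
    r"and (?P<pkg_b>\S+) from (?P<repo_b>\S+)"
)


def conflicting_repos(dnf_output):
    """Count how often each repository appears in a 'cannot install both' line."""
    counts = Counter()
    for match in CONFLICT.finditer(dnf_output):
        counts[match["repo_a"]] += 1
        counts[match["repo_b"]] += 1
    return counts


# Two lines copied from the error report above.
sample = """\
- cannot install both python3-botocore-1.31.62-1.el9.noarch from appstream and python3-botocore-1.21.58-1.el9s.noarch from @System
- cannot install both qpid-proton-c-0.39.0-2.el9s.x86_64 from ovirt-master-centos-stream-opstools-collectd5-testing and qpid-proton-c-0.35.0-2.el9s.x86_64 from @System
"""

for repo, count in conflicting_repos(sample).most_common():
    print(repo, count)
```

Here `@System` stands for the packages that are already installed, so the other names in the tally are the repositories offering the conflicting versions.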
8 months, 1 week
Certificate verification error for qemu while migrating
by Julien Deberles
Hello,
I'm running oVirt 4.4.10 and I get the following error when I launch a VM migration:
Jul 3 12:37:07 ssc-sati-02 journal[958949]: Certificate [session] owner does not match the hostname myhostname
Jul 3 12:37:07 ssc-sati-02 journal[958949]: Certificate check failed Certificate [session] owner does not match the hostname myhostname
Jul 3 12:37:07 ssc-sati-02 journal[958949]: authentication failed: Failed to verify peer's certificate
Jul 3 12:37:07 ssc-sati-02 journal[958949]: operation failed: Failed to connect to remote libvirt URI qemu+tls://myhostname/system: authentication failed: Failed to verify peer's certificate
To avoid this error I set the following parameters in /etc/libvirt/qemu.conf and restarted the vdsmd daemon.
migrate_tls_x509_verify = 0
default_tls_x509_verify = 0
But I still get the same error. Can you help me understand why these parameters are not working as expected?
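The failing step in the log is a hostname check on the peer's certificate for the connection to `qemu+tls://myhostname/system`. The Python sketch below illustrates what such a check amounts to: the name used to reach the peer must appear among the names in its certificate. It is a simplified illustration and not libvirt's actual implementation; the certificate dict is shaped like the return value of Python's `ssl.SSLSocket.getpeercert()`, and `host1.example.org` is a made-up name.

```python
def cert_names(cert):
    """Collect DNS subjectAltName entries and commonName values from a
    dict shaped like the result of ssl.SSLSocket.getpeercert()."""
    names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.append(value)
    return names


def matches(hostname, pattern):
    """Case-insensitive comparison; a leading '*.' covers exactly one label."""
    hostname, pattern = hostname.lower(), pattern.lower()
    if pattern.startswith("*."):
        _, _, rest = hostname.partition(".")
        return bool(rest) and rest == pattern[2:]
    return hostname == pattern


def owner_matches(cert, hostname):
    """True if any name in the certificate matches the hostname we connected to."""
    return any(matches(hostname, name) for name in cert_names(cert))


# Hypothetical certificate issued for the host's full name.
cert = {
    "subject": ((("commonName", "host1.example.org"),),),
    "subjectAltName": (("DNS", "host1.example.org"),),
}

print(owner_matches(cert, "host1.example.org"))  # name is in the certificate
print(owner_matches(cert, "myhostname"))         # short name is not
```

In this sketch the second call fails because the certificate was issued for a different name than the one used in the URI, which is the kind of mismatch the log message describes.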
kind regards,
Julien
8 months, 1 week
Disk blocked
by Louis Barbonnais
Hello,
I apologize, I am new to oVirt and I am lost. I have just installed oVirt on CentOS 9 with local storage. I am trying to add ISO images to my disk, but they are blocked. I cannot delete or unblock them.
Could you please assist me?
8 months, 1 week
Deploying ovirt-engine 4.5.6 on Rocky Linux 9: cross-origin frame error when visiting webadmin
by taleintervenor@sjtu.edu.cn
We have deployed ovirt-engine on Rocky Linux 9.4. "engine-setup" ran all green and reported that it completed successfully.
But when we visit https://ovirtmu.pi.sjtu.edu.cn/ovirt-engine/webadmin, the UI reports the following error:
```
2024-07-03 15:45:58,692+08 ERROR [org.ovirt.engine.ui.frontend.server.gwt.OvirtRemoteLoggingService] (default task-3) [] Uncaught exception: com.google.gwt.event.shared.UmbrellaException: Exception caught: (SecurityError) : Failed to read a named property 'kCb' from 'Window': Blocked a frame with origin "https://ovirtmu.pi.sjtu.edu.cn" from accessing a cross-origin frame.
at java.lang.Throwable.Throwable(Throwable.java:72)
at java.lang.RuntimeException.RuntimeException(RuntimeException.java:32)
at com.google.web.bindery.event.shared.UmbrellaException.UmbrellaException(UmbrellaException.java:64)
at Unknown.new t8(webadmin-0.js)
at com.google.gwt.event.shared.EventBus.$castFireEvent(EventBus.java:65)
at org.ovirt.engine.ui.webadmin.system.MessageReceivedEvent.fire(MessageReceivedEvent.java:21)
at org.ovirt.engine.ui.webadmin.system.PostMessageDispatcher.onMessage(PostMessageDispatcher.java:27)
at Unknown.c(webadmin-0.js)
Caused by: com.google.gwt.core.client.JavaScriptException: (SecurityError) : Failed to read a named property 'kCb' from 'Window': Blocked a frame with origin "https://ovirtmu.pi.sjtu.edu.cn" from accessing a cross-origin frame.
at com.google.gwt.lang.Cast.instanceOfJso(Cast.java:211)
at org.ovirt.engine.ui.webadmin.plugin.jsni.JsArrayHelper.createMixedArray(JsArrayHelper.java:36)
at org.ovirt.engine.ui.webadmin.plugin.PluginEventHandler.lambda$16(PluginEventHandler.java:105)
at org.ovirt.engine.ui.webadmin.system.MessageReceivedEvent.$dispatch(MessageReceivedEvent.java:50)
at org.ovirt.engine.ui.webadmin.system.MessageReceivedEvent.dispatch(MessageReceivedEvent.java:50)
at com.google.gwt.event.shared.GwtEvent.dispatch(GwtEvent.java:76)
at com.google.web.bindery.event.shared.SimpleEventBus.$doFire(SimpleEventBus.java:173)
... 4 more
```
Version of ovirt-engine is ovirt-engine-4.5.6-1.el9.noarch, and the setup options are:
--== CONFIGURATION PREVIEW ==--
Application mode : both
Default SAN wipe after delete : False
Host FQDN : ovirtmu.pi.sjtu.edu.cn
Firewall manager : firewalld
Update Firewall : True
Set up Cinderlib integration : False
Configure local Engine database : True
Set application as default page : True
Configure Apache SSL : True
Keycloak installation : True
Engine database host : localhost
Engine database port : 5432
Engine database secured connection : False
Engine database host name validation : False
Engine database name : engine
Engine database user name : engine
Engine installation : True
PKI organization : pi.sjtu.edu.cn
Set up ovirt-provider-ovn : True
DWH installation : True
DWH database host : localhost
DWH database port : 5432
Configure local DWH database : True
Grafana integration : False
Keycloak database host : localhost
Keycloak database port : 5432
Keycloak database secured connection : False
Keycloak database host name validation : False
Keycloak database name : ovirt_engine_keycloak
Keycloak database user name : ovirt_engine_keycloak
Configure local Keycloak database : True
Configure VMConsole Proxy : True
Configure WebSocket Proxy : True
Can anyone suggest how to locate the cause of the problem?
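For readers unfamiliar with the message in the log: a browser treats two frames as cross-origin when their URLs differ in scheme, host or port. The Python sketch below shows that comparison as a starting point for checking which frame or message source has a different origin from the webadmin page. The function names and the second and third URLs are assumptions for illustration, not URLs taken from this setup.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(url_a, url_b):
    """True if both URLs share scheme, host and port."""
    return origin(url_a) == origin(url_b)


webadmin = "https://ovirtmu.pi.sjtu.edu.cn/ovirt-engine/webadmin"

# Same scheme, host and (default) port: same origin, regardless of path.
print(same_origin(webadmin, "https://ovirtmu.pi.sjtu.edu.cn:443/some/other/path"))

# Hypothetical frame served from another host: cross-origin.
print(same_origin(webadmin, "https://other.example.org/frame"))
```

Note that the path plays no role in the comparison, so two pages on the same engine host and port count as the same origin.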
8 months, 1 week
Install the oVirt engine on CentOS 9 Stream. However, I encountered an error while enabling the Java package tool, pki-deps, and the PostgreSQL
by Sachendra Shukla
Hi All,
I am trying to install the oVirt engine on CentOS 9 Stream. However, I
encountered an error while enabling the Java package tool, pki-deps, and
the PostgreSQL module. Below is the error message I received.
Could anyone please assist with a resolution for this issue?
[image: image.png]
--
Regards,
*Sachendra Shukla*
8 months, 2 weeks
oVirt 4.5.6-1.el9 standalone engine - older VMs on CentOS 6 or RHEL 6 hang on live migration with oVirt Node kernel 5.14.0-388.el9.x86_64
by Sumit Basu
Hi All,
In our new oVirt 4.5.6-1.el9 cluster, older VMs running el6 (CentOS/RHEL) keep hanging after live migration. We see this only with el6 VMs; our el7 and Windows VMs do not have this issue.
All these VMs were imported from an oVirt 4.3.10.4-1.el7 cluster by importing the storage domain into the new cluster. Initially we installed the latest ovirt-node-ng-installer-latest-el9 on all eight of our nodes and hit this live migration hang. We then installed ovirt-node-ng-installer-4.5.2-2022081013.el9 on the nodes and tested again: the issue was resolved and we could migrate the el6 VMs without any problems. After upgrading the nodes, however, the issue has come back.
The nodes without live migration issues run kernel 5.14.0-142.el9.x86_64; the nodes with the latest kernel, 5.14.0-388.el9.x86_64, seem to have this issue.
Has anybody faced this issue? Are any patches needed or available?
Thanks
8 months, 2 weeks