Nvidia A10 vGPU support on oVirt 4.5.2
by Don Dupuis
Hello
I can run an Nvidia GRID T4 on oVirt 4.5.2 with no issue, but I have a new GRID A10 that doesn't seem to work in oVirt 4.5.2. This new card seems to use SR-IOV instead of plain mediated devices: I only get the mdev_supported_types directory structure after I run the /usr/lib/nvidia/sriov-manage command. Has anyone got this card working on oVirt, or do the developers working on oVirt/RHV know about this?
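For reference, this is roughly what I am doing on the host to make the mdev types appear (a sketch only - the VF PCI address is a placeholder from my system):
~~~
# Enable the virtual functions on the A10; unlike the T4, the A10 seems
# to only expose vGPU mdev types on its SR-IOV virtual functions
/usr/lib/nvidia/sriov-manage -e ALL

# Check that a VF now exposes mdev types
# (0000:41:00.4 is just an example VF address)
ls /sys/bus/pci/devices/0000:41:00.4/mdev_supported_types

# Or list every mdev-capable device the host can see
mdevctl types
~~~
What I don't know is whether oVirt 4.5.2 will pick up mdev types that only exist on a VF like this.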
Thanks
Don
2 years, 4 months
oVirt/Ceph iSCSI Issues
by Matthew J Black
Hi All,
I've got some issues with connecting my oVirt Cluster to my Ceph Cluster via iSCSI. There are two issues, and I don't know whether one is causing the other or whether they are two separate, unrelated problems. Let me explain.
The Situation
-------------
- I have a working three node Ceph Cluster (Ceph Quincy on Rocky Linux 8.6)
- The Ceph Cluster has four Storage Pools of between 4 and 8 TB each
- The Ceph Cluster has three iSCSI Gateways
- There is a single iSCSI Target on the Ceph Cluster
- The iSCSI Target has all three iSCSI Gateways attached
- The iSCSI Target has all four Storage Pools attached
- The four Storage Pools have been assigned LUNs 0-3
- I have set up (Discovery) CHAP Authorisation on the iSCSI Target
- I have a working three node self-hosted oVirt Cluster (oVirt v4.5.3 on Rocky Linux 8.6)
- The oVirt Cluster has (in addition to the hosted_storage Storage Domain) three GlusterFS Storage Domains
- I can ping all three Ceph Cluster Nodes to/from all three oVirt Hosts
- The iSCSI Target on the Ceph Cluster has all three oVirt Hosts' Initiators attached
- Each Initiator has all four Ceph Storage Pools attached
- I have set up CHAP Authorisation on the iSCSI Target's Initiators
- The Ceph Cluster Admin Portal reports that all three Initiators are "logged_in"
- I have previously connected Ceph iSCSI LUNs to the oVirt Cluster successfully (as an experiment), but had to remove and re-instate them for the "final" version(?)
- The oVirt Admin Portal (ie HostedEngine) reports that Initiators 1 & 2 (ie oVirt Hosts 1 & 2) are "logged_in" to all three iSCSI Gateways
- The oVirt Admin Portal reports that Initiator 3 (ie oVirt Host 3) is "logged_in" to iSCSI Gateways 1 & 2
- I can "force" Initiator 3 to become "logged_in" to iSCSI Gateway 3, but when I do this it is *not* persistent
- oVirt Hosts 1 & 2 can/have discovered all three iSCSI Gateways
- oVirt Hosts 1 & 2 can/have discovered all four LUNs/Targets on all three iSCSI Gateways
- oVirt Host 3 can only discover 2 of the iSCSI Gateways
- For Target/LUN 0 oVirt Host 3 can only "see" the LUN provided by iSCSI Gateway 1
- For Targets/LUNs 1-3 oVirt Host 3 can only "see" the LUNs provided by iSCSI Gateways 1 & 2
- oVirt Host 3 can *not* "see" any of the Targets/LUNs provided by iSCSI Gateway 3
- When I create a new oVirt Storage Domain for any of the four LUNs:
  - I am presented with a message saying "The following LUNs are already in use..."
  - I am asked to "Approve operation" via a checkbox, which I do
  - As I watch the oVirt Admin Portal I can see the new iSCSI Storage Domain appear in the Storage Domain list, and then after a few minutes it is removed
  - After those few minutes I am presented with this failure message: "Error while executing action New SAN Storage Domain: Network error during communication with the Host."
- I have looked in the engine.log and all I could find that was relevant (as far as I know) was this:
~~~
2022-11-28 19:59:20,506+11 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (default task-1) [77b0c12d] Command 'CreateStorageDomainVDSCommand(HostName = ovirt_node_1.mynet.local, CreateStorageDomainVDSCommandParameters:{hostId='967301de-be9f-472a-8e66-03c24f01fa71', storageDomain='StorageDomainStatic:{name='data', id='2a14e4bd-c273-40a0-9791-6d683d145558'}', args='s0OGKR-80PH-KVPX-Fi1q-M3e4-Jsh7-gv337P'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2022-11-28 19:59:20,507+11 ERROR [org.ovirt.engine.core.bll.storage.domain.AddSANStorageDomainCommand] (default task-1) [77b0c12d] Command 'org.ovirt.engine.core.bll.storage.domain.AddSANStorageDomainCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
~~~
I cannot see/detect any "communication issue" - but then again I'm not 100% sure what I should be looking for
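For what it's worth, this is roughly how I have been comparing oVirt Host 3 with Hosts 1 & 2 and forcing the missing login (the portal IP, target IQN and CHAP credentials below are placeholders):
~~~
# On each oVirt Host: which portals/sessions does open-iscsi currently have?
iscsiadm -m session -P 1

# From oVirt Host 3: can it even discover Gateway 3's portal?
iscsiadm -m discovery -t sendtargets -p 10.0.0.13:3260

# Manually set CHAP on the node record and log in to Gateway 3
iscsiadm -m node -T iqn.2003-01.com.redhat.iscsi-gw:ceph-target -p 10.0.0.13:3260 \
    --op update -n node.session.auth.authmethod -v CHAP
iscsiadm -m node -T iqn.2003-01.com.redhat.iscsi-gw:ceph-target -p 10.0.0.13:3260 \
    --op update -n node.session.auth.username -v myiscsiuser
iscsiadm -m node -T iqn.2003-01.com.redhat.iscsi-gw:ceph-target -p 10.0.0.13:3260 \
    --op update -n node.session.auth.password -v myiscsipassword
iscsiadm -m node -T iqn.2003-01.com.redhat.iscsi-gw:ceph-target -p 10.0.0.13:3260 --login
~~~
That is the login which works temporarily but, as noted above, does not persist.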
I have looked on-line for an answer, and apart from not being able to get past Red Hat's "wall" to see the solutions that they have, all I could find that was relevant was this: https://lists.ovirt.org/archives/list/devel@ovirt.org/thread/AVLORQNOLJHR... . If this *is* relevant then there is not enough context here for me to proceed (ie/eg *where* (which host/vm) should that command be run?).
I also found (for a previous version of oVirt) notes about manually modifying the Postgres DB to resolve a similar issue. While I am more than comfortable doing this (I've been an SQL DBA for well over 20 years), it seems like asking for trouble - at least until I hear back from the oVirt Devs that this is OK to do - and of course, I'd need the relevant commands / locations / authorisations to get into the DB.
Questions
---------
- Are the two issues (oVirt Host 3 not having a full picture of the Ceph iSCSI environment and the oVirt iSCSI Storage Domain creation failure) related?
- Do I need to "refresh" the iSCSI info on the oVirt Hosts, and if so, how do I do this?
- Do I need to "flush" the old LUNs from the oVirt Cluster, and if so, how do I do this?
- Where else should I be looking for info in the logs (& which logs)? (My best guess so far is sketched below)
- Does *anyone* have any other ideas on how to resolve this situation - especially when using the Ceph iSCSI Gateways?
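On the log question above, the only host-side logs I know to check are the VDSM ones on the host that ran the CreateStorageDomain command (ovirt_node_1 in the engine.log excerpt above), along the lines of:
~~~
# On ovirt_node_1 (the host named in the engine.log excerpt above)
grep -i 'createStorageDomain' /var/log/vdsm/vdsm.log

# supervdsm handles the privileged iSCSI/multipath operations
grep -i 'iscsi' /var/log/vdsm/supervdsm.log
~~~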
Thanks in advance
Cheers
Dulux-Oz
2 years, 4 months
Import VM via KVM. Can't see VMs.
by piotret@wp.pl
Hi
I am trying to migrate VMs from oVirt 4.3.10.4-1.el7 to oVirt 4.5.3.2-1.el8.
I am using the KVM provider (via libvirt).
My problem is that I can't see the VMs from the old oVirt when they are shut down.
When they are running I can see them, but I can't import them because: "All chosen VMs are running in the external system and therefore have been filtered. Please see log for details."
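If it is relevant: my understanding (which may well be wrong) is that the KVM provider can only list domains that libvirt itself knows about, and oVirt only defines a libvirt domain while the VM is running, which would explain why the powered-off VMs never appear. Something like this against the old host should show what libvirt actually exposes (the URI/hostname is a placeholder):
~~~
# What does libvirt on the old oVirt 4.3 host actually report?
virsh -r -c qemu:///system list --all

# Or remotely, using the same kind of URI the KVM provider uses
virsh -r -c 'qemu+ssh://root@old-ovirt-host.example.com/system' list --all
~~~
If the shut-down VMs don't show up there, then presumably the provider has nothing to import from.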
Thank you for help.
Regards
2 years, 4 months
How does oVirt handle disks across multiple iSCSI LUNs
by peterd@mdg-it.com
A possibly obvious question I can't find the answer to anywhere: how does oVirt allocate VM disk images when a storage domain has multiple LUNs? Are disk images allocated one per LUN, so that if a LUN runs out of space only the disks on that LUN will be unable to write? Or are they distributed across the LUNs, so that if one LUN fails due to a storage failure the entire storage domain can be affected?
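For what it's worth, my working assumption (which may be wrong) is that a block storage domain is an LVM volume group with the LUNs as physical volumes and each disk image as a logical volume, in which case something like this on one of the hosts should show which LUN(s) actually back a given image:
~~~
# Each block (iSCSI/FC) storage domain appears as a volume group,
# with its LUNs as the physical volumes
vgs
pvs

# Show which underlying devices (ie which LUNs) each disk image's LV sits on
lvs -o lv_name,vg_name,devices
~~~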
Many thanks in advance, Peter
2 years, 4 months
Warning alert: Failed to parse server certificates
by me@alexsmirnov.us
After an update 2 days ago we are no longer able to log into the oVirt console.
The certificate was expiring, so we ran the system update and the ovirt-engine update as well. It went fine and re-issued (upgraded) the certificates. After the work was done, we were no longer able to log into the web UI. The error we get on the screen is 'Failed to parse server certificates'. After an attempt to log in as the local administrator, it reverts back to the landing page with the same error.
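For reference, these are the checks we are intending to run next (a rough sketch only, assuming the default engine PKI paths; engine.example.com stands in for our engine's FQDN):
~~~
# Dates, issuer and subject of the certificate Apache serves for the web UI
openssl x509 -noout -dates -issuer -subject \
    -in /etc/pki/ovirt-engine/certs/apache.cer

# The engine CA certificate
openssl x509 -noout -dates -subject -in /etc/pki/ovirt-engine/ca.pem

# What is actually presented on port 443
openssl s_client -connect engine.example.com:443 -showcerts </dev/null
~~~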
Any help is appreciated at this time.
2 years, 4 months