oVirt/Ceph iSCSI Issues
by Matthew J Black
Hi All,
I've got some issues with connecting my oVirt Cluster to my Ceph Cluster via iSCSI. There are two issues, and I don't know whether one is causing the other or whether they are completely unrelated. Let me explain.
The Situation
-------------
- I have a working three node Ceph Cluster (Ceph Quincy on Rocky Linux 8.6)
- The Ceph Cluster has four Storage Pools of between 4 and 8 TB each
- The Ceph Cluster has three iSCSI Gateways
- There is a single iSCSI Target on the Ceph Cluster
- The iSCSI Target has all three iSCSI Gateways attached
- The iSCSI Target has all four Storage Pools attached
- The four Storage Pools have been assigned LUNs 0-3
- I have set up (Discovery) CHAP Authorisation on the iSCSI Target
- I have a working three node self-hosted oVirt Cluster (oVirt v4.5.3 on Rocky Linux 8.6)
- The oVirt Cluster has (in addition to the hosted_storage Storage Domain) three GlusterFS Storage Domains
- I can ping all three Ceph Cluster Nodes to/from all three oVirt Hosts
- The iSCSI Target on the Ceph Cluster has all three oVirt Hosts Initiators attached
- Each Initiator has all four Ceph Storage Pools attached
- I have set up CHAP Authorisation on the iSCSI Target's Initiators
- The Ceph Cluster Admin Portal reports that all three Initiators are "logged_in"
- I have previously connected Ceph iSCSI LUNs to the oVirt Cluster successfully (as an experiment), but had to remove and re-instate them for the "final" version.
- The oVirt Admin Portal (ie HostedEngine) reports that Initiators 1 & 2 (ie oVirt Hosts 1 & 2) are "logged_in" to all three iSCSI Gateways
- The oVirt Admin Portal reports that Initiator 3 (ie oVirt Host 3) is "logged_in" to iSCSI Gateways 1 & 2
- I can "force" Initiator 3 to become "logged_in" to iSCSI Gateway 3, but when I do this it is *not* persistent
- oVirt Hosts 1 & 2 can/have discovered all three iSCSI Gateways
- oVirt Hosts 1 & 2 can/have discovered all four LUNs/Targets on all three iSCSI Gateways
- oVirt Host 3 can only discover 2 of the iSCSI Gateways
- For Target/LUN 0 oVirt Host 3 can only "see" the LUN provided by iSCSI Gateway 1
- For Targets/LUNs 1-3 oVirt Host 3 can only "see" the LUNs provided by iSCSI Gateways 1 & 2
- oVirt Host 3 can *not* "see" any of the Targets/LUNs provided by iSCSI Gateway 3 (see the verification sketch after this list)
- When I create a new oVirt Storage Domain for any of the four LUNs:
- I am presented with a message saying "The following LUNs are already in use..."
- I am asked to "Approve operation" via a checkbox, which I do
- As I watch the oVirt Admin Portal I can see the new iSCSI Storage Domain appear in the Storage Domain list, and then after a few minutes it is removed
- After those few minutes I am presented with this failure message: "Error while executing action New SAN Storage Domain: Network error during communication with the Host."
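To double-check what each Host can actually reach, a check along these lines should work (the gateway IP is a placeholder for my environment):
~~~
# On oVirt Host 3: ask gateway 3's portal directly what it advertises, and list current sessions
iscsiadm -m discovery -t sendtargets -p <gateway-3-ip>:3260
iscsiadm -m session -P 1

# On one of the Ceph iSCSI gateway nodes: show the configured gateways/targets/LUNs
gwcli ls
~~~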
- I have looked in the engine.log and all I could find that was relevant (as far as I know) was this:
~~~
2022-11-28 19:59:20,506+11 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (default task-1) [77b0c12d] Command 'CreateStorageDomainVDSCommand(HostName = ovirt_node_1.mynet.local, CreateStorageDomainVDSCommandParameters:{hostId='967301de-be9f-472a-8e66-03c24f01fa71', storageDomain='StorageDomainStatic:{name='data', id='2a14e4bd-c273-40a0-9791-6d683d145558'}', args='s0OGKR-80PH-KVPX-Fi1q-M3e4-Jsh7-gv337P'})' execution failed: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues
2022-11-28 19:59:20,507+11 ERROR [org.ovirt.engine.core.bll.storage.domain.AddSANStorageDomainCommand] (default task-1) [77b0c12d] Command 'org.ovirt.engine.core.bll.storage.domain.AddSANStorageDomainCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues (Failed with error VDS_NETWORK_ERROR and code 5022)
~~~
I cannot see/detect any "communication issue" - but then again I'm not 100% sure what I should be looking for.
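For completeness, the host-side half of that operation should also be logged by VDSM on ovirt_node_1, so I intend to check there as well (standard oVirt host log locations, as far as I know):
~~~
# On ovirt_node_1, around the timestamp of the failure above
grep -i createStorageDomain /var/log/vdsm/vdsm.log
journalctl -u vdsmd --since "2022-11-28 19:50"
~~~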
I have looked on-line for an answer, and apart from not being able to get past Red Hat's "wall" to see the solutions that they have, all I could find that was relevant was this: https://lists.ovirt.org/archives/list/devel@ovirt.org/thread/AVLORQNOLJHR... . If this *is* relevant then there is not enough context there for me to proceed (eg on *which* host or VM should that command be run?).
I also found (for a previous version of oVirt) notes about manually modifying the Postgres DB to resolve a similar issue. While I am more than comfortable doing this (I've been an SQL DBA for well over 20 years) it seems like asking for trouble - at least until I hear back from the oVirt Devs that it is OK to do - and of course, I'll need the relevant commands / locations / authorisations to get into the DB.
Questions
---------
- Are the two issues (oVirt Host 3 not having a full picture of the Ceph iSCSI environment and the oVirt iSCSI Storage Domain creation failure) related?
- Do I need to "refresh" the iSCSI info on the oVirt Hosts, and if so, how do I do this?
- Do I need to "flush" the old LUNs from the oVirt Cluster, and if so, how do I do this?
- Where else should I be looking for info in the logs (& which logs)?
- Does *anyone* have any other ideas how to resolve the situation - especially when using the Ceph iSCSI Gateways?
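This is the tentative cleanup sketch I referred to above - I'd only run it on a host with no active iSCSI storage domains, and the target IQN and portal are placeholders:
~~~
# List the iSCSI node records and sessions this host currently knows about
iscsiadm -m node
iscsiadm -m session

# Log out of a stale portal and delete its node record
iscsiadm -m node -T <target-iqn> -p <gateway-ip>:3260 --logout
iscsiadm -m node -T <target-iqn> -p <gateway-ip>:3260 -o delete

# Flush unused multipath maps, then re-discover
multipath -F
iscsiadm -m discovery -t sendtargets -p <gateway-ip>:3260
~~~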
Thanks in advance
Cheers
Dulux-Oz
Import VM via KVM. Can't see VMs.
by piotret@wp.pl
Hi
I am trying to migrate VMs from oVirt 4.3.10.4-1.el7 to oVirt 4.5.3.2-1.el8.
I use the KVM (via libvirt) provider.
My problem is that I can't see VMs from the old oVirt when they are shut down.
When they are running I can see them, but I can't import them because "All chosen VMs are running in the external system and therefore have been filtered. Please see log for details."
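One thing that might be worth checking is whether the powered-off VMs are actually defined in libvirt on the old hosts at all - my understanding is that oVirt only keeps VMs defined in libvirt while they are running, so the KVM provider may have nothing to list once they are shut down. A quick check (the host name is a placeholder):
~~~
# From the new engine (or any machine with virsh): list every domain the old host's libvirt knows about
virsh -c qemu+ssh://root@old-ovirt-host/system list --all
~~~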
Thank you for help.
Regards
How does oVirt handle disks across multiple iSCSI LUNs
by peterd@mdg-it.com
A possibly obvious question I can't find the answer to anywhere: how does oVirt allocate VM disk images when a storage domain has multiple LUNs? Are they allocated one per LUN, so that if, e.g., a LUN runs out of space only the disks on that LUN will be unable to write? Or are they distributed across LUNs, so that if a LUN fails due to a storage failure the entire storage domain can be affected?
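For context, my current understanding is that a block (iSCSI/FC) storage domain is a single LVM volume group in which every LUN is a physical volume and every disk image is a logical volume, which is what prompts the question. That layout can be inspected on a host roughly like this (the VG name is the storage domain UUID):
~~~
# One physical volume per LUN, all in the storage domain's volume group
pvs -o pv_name,vg_name,pv_size,pv_free
# One logical volume per disk image / snapshot in that domain
lvs <storage-domain-uuid>
~~~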
Many thanks in advance, Peter
How to remove cloud-init check on vm using ansible module or rest api
by kishorekumar.goli@gmail.com
Hi All,
I have a requirement to automate template creation using an Ansible module or the REST API.
I need to remove the cloud-init checkbox from the VM before creating the template. Below are the steps I follow from the GUI:
1. Shutdown the vm.
2. click edit vm > initial run> uncheck cloud-init box
3. create template.
I cannot automate the second step as there is no option to remove the cloud-init setting.
Could anyone please tell me if there is a way to uncheck cloud-init?
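For reference, steps 1 and 3 look scriptable against the REST API; this is only a rough sketch (the engine URL, credentials and VM id are placeholders), and it does not touch the Initial Run / cloud-init setting, which is exactly the part I can't find:
~~~
ENGINE="https://engine.example.com/ovirt-engine/api"
AUTH="admin@internal:password"
VM_ID="<vm-id>"

# 1. Shut the VM down
curl -ks -u "$AUTH" -X POST -H "Content-Type: application/xml" \
     -d "<action/>" "$ENGINE/vms/$VM_ID/shutdown"

# 3. Create a template from the stopped VM
curl -ks -u "$AUTH" -X POST -H "Content-Type: application/xml" \
     -d "<template><name>my-template</name><vm id=\"$VM_ID\"/></template>" \
     "$ENGINE/templates"
~~~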
BRs
Kishore
2nd Self-Hosted Engine Node Not Attached To "ovirtmgmt" Network
by Matthew J Black
Hi All,
So, I've got the Self-Hosted Engine up and running on the first Host, and everything *seems* to be working OK.
I'm now attempting to use the Web UI to add a new, ie 2nd, (Self-Hosted Engine) Host to the cluster. Everything *seems* to go OK except that the 2nd Host is left in a Non-Operational state (which I believe is "normal") awaiting the setup of the Host Networks.
It's here that things go wrong: the 2nd Host does not have the ovirtmgmt Network attached (nor any of the others, for that matter). When I drag the ovirtmgmt Network to the 2nd Host's pre-existing (and working) bond interface the Engine works away for a while and then reports "Error while executing action HostSetupNetworks: Unexpected exception".
I have located (but have not yet read) these logs from the 2nd Host:
- agent.log
- broker.log
I have located (but have not yet read) these logs from the Engine:
- engine.log
- ovirt-host-deploy-ansible-20221124150742-ovirt_node_1.mynet.local-c26ca3fc-3c3f-4ee0-9562-fd7fd5066f8b.log
- ovirt-host-deploy-ansible-20221124150742-ovirt_node_2.mynet.local-c26ca3fc-3c3f-4ee0-9562-fd7fd5066f8b.log
Just to make things clear:
- Both the Hosts are physically the same. Same brand/model of M/Board, NICs, HDDs, etc. Same layout, etc.
- The Web UI says that the 2nd Host's Bond is working AOK (as is the 1st Host)
- I can ssh into the 2nd Host fine.
So my questions are:
1) Which other logs should I be looking at (beyond the ones in the sketch below)?
2) Has anyone else struck this issue before (I had a look through the mailing list archives, etc, and couldn't really find anything relevant - but as always, I may be mistaken)?
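These are the host- and engine-side logs I'm planning to start with (standard locations, as far as I know):
~~~
# On the 2nd host: network changes are applied via supervdsm, so both logs matter
tail -f /var/log/vdsm/vdsm.log /var/log/vdsm/supervdsm.log

# On the engine VM, around the time of the "Unexpected exception"
grep -i HostSetupNetworks /var/log/ovirt-engine/engine.log
~~~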
Any and all help greatly appreciated
Cheers
Dulux-Oz
Is it possible to auto-start VMs on single LocalStorage host without the engine?
by ernest.beinrohr@axonpro.sk
I currently use KVM/virsh for my DNS. I would like to move it to oVirt, but I need DNS up for the engine to work, so I need to start the DNS VM before the engine. Is that possible with oVirt? I was thinking I could use the same mechanism as the hosted engine, as that autostarts.
In my current KVM setup I only needed to symlink /etc/libvirt/qemu/dns.xml into /etc/libvirt/qemu/autostart/ to get it to run.
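For comparison, the plain-libvirt way to do the same thing without the symlink is `virsh autostart`, though I gather this doesn't carry over to oVirt directly, because oVirt only defines VMs in libvirt while they are running:
~~~
# Mark the persistent libvirt domain to start automatically with libvirtd
virsh autostart dns
virsh dominfo dns | grep -i autostart   # verify the flag
~~~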
EL8.6 to EL8.7
by eshwayri@gmail.com
I upgraded to oVirt 4.5.3, and did a --nobest upgrade to EL8.7. There are two unresolved issues:
Problem 1: installed package centos-stream-release-8.6-1.el8.noarch obsoletes redhat-release < 9 provided by redhat-release-8.7-0.3.el8.x86_64
- cannot install the best update candidate for package redhat-release-8.6-0.1.el8.x86_64
- problem with installed package centos-stream-release-8.6-1.el8.noarch
I can't find a CentOS Stream 8.7 release package, and the oVirt packages depend on the existing 8.6 one.
And as has been noted by others, ansible:
Problem 2: package ovirt-engine-4.5.3.2-1.el8.noarch conflicts with ansible-core >= 2.13.0 provided by ansible-core-2.13.3-1.el8.x86_64
- cannot install the best update candidate for package ovirt-engine-4.5.3.2-1.el8.noarch
- cannot install the best update candidate for package ansible-core-2.12.2-4.el8_6.x86_64
Will there be a future release that allows this upgrade to go through?
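A possible interim workaround, if it is safe, would be to hold the conflicting packages back and let the rest of the 8.7 update through; something along these lines (the version globs are my guess):
~~~
# Pin ansible-core at the 2.12 level that ovirt-engine 4.5.3 expects
dnf install python3-dnf-plugin-versionlock
dnf versionlock add 'ansible-core-2.12*'

# Or simply exclude the conflicting packages for this upgrade run
dnf upgrade --nobest --exclude=ansible-core --exclude=redhat-release
~~~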
Help moving hosted_storage data/domain
by tannoiser@gmail.com
Hi there, first time here.
I successfully installed Oracle Linux Virtualization Manager, which is basically oVirt 4.6. I have a datacenter, 2 separate clusters, and 3 hosts per cluster (self-hosted engine configuration).
During the installation process I defined the NFS storage for the hosted engine VM, and now I'm in trouble: I can shut down any host except the one that serves the NFS.
I understood later that I need distributed storage to get HA, so I added GlusterFS support. I can see my Gluster volume but I can't add it as a storage domain; it simply doesn't appear in the dropdown menu when I try to add a new storage domain.
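For what it's worth, my understanding is that the cluster needs the Gluster service enabled and the volume needs the virt profile and vdsm ownership before oVirt will offer it as a storage domain; roughly (the volume name is mine, and uid/gid 36 is the vdsm/kvm user):
~~~
# On a Gluster node: apply the oVirt virt profile and vdsm ownership to the volume
gluster volume set myvol group virt
gluster volume set myvol storage.owner-uid 36
gluster volume set myvol storage.owner-gid 36
gluster volume info myvol   # verify the options took effect
~~~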
Apart from that: is there a way to move data from the hosted_storage domain (the NFS one) to another one? I tried to set up another one as the master domain, but no joy. If I try to move the disks from there, it tells me I can't.
I searched the web and found several different answers, ranging from "no, you can't" to "use the engine-backup utility", and I'm quite confused.
Reinstalling isn't an option; I have approx. 60 VMs active.
Any help is appreciated.
(Forgive any mistakes, I'm not a native speaker.)
Maurizio