Empty disks from exports that went wrong didn't help. But that's fixed now, even
if I can't fully validate the OVA exports on VMware and VirtualBox.
The export/import target for the *.ova files is an SSD-hosted xfs file system on a pure
compute Xeon D oVirt node, exported and automounted to the three-node HCI cluster, which is
also all-SSD but runs J5005 Atoms.
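For reference, that export/automount setup looks roughly like this (paths and subnet are
illustrative, not my exact ones):

# on the Xeon D node: /etc/exports entry for the SSD-backed xfs volume
/exports/ova  192.168.0.0/24(rw,no_root_squash)
# on the HCI nodes: an autofs -hosts map in /etc/auto.master, so the share
# shows up on demand under /net/<hostname>/exports/ova
/net  -hosts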
When importing the OVA file, I chose the Xeon-D as the import node and the *local path* on
that host as the source. Cockpit checks the OVA file, detects the machine inside, lets me
select and choose it for import, potentially overriding some parameters, lets me choose the
target storage volume, sets up the job... and then fails, rather silently and with very
little in the way of error reporting; "connection closed" is the best I got.
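If you hit the same thing, the usual places to look for more detail are the standard oVirt
logs (standard locations, your paths may differ):

tail -f /var/log/ovirt-engine/engine.log   # on the engine VM
tail -f /var/log/vdsm/vdsm.log             # on the host selected for the import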
Now, that same process worked just fine on a single-node HCI cluster (also a J5005 Atom),
which had me a bit stunned at first, but it gave a hint as to the cause: parts of the import
job, most likely a qemu-img job, aren't run on the machine you selected in the first
step, and unless the path is global (e.g. an external NFS share), the import fails.
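A quick way to test this theory is to watch where the conversion actually runs while the
import is in progress, for example:

# run on each candidate host during the import; the qemu-img convert
# process only shows up on the host that is really doing the work
ps -ef | grep '[q]emu-img convert'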
If someone from the oVirt team could confirm or disprove this theory, it could be
documented and/or added as a check to keep people from falling into the same trap.
While I was testing this using a global automount path, my cluster failed me (perhaps from
creating and deleting VMs a bit too quickly?) and I had to struggle for a while to get it
to recover.
While those transient failures are truly frightening, oVirt's ability to recover from
these scenarios is quite simply awesome.
I guess it's really mostly miscommunication and not real failures, and oVirt has lots
of logic to rectify that.
Here is the explanation, I think:
root 12319 12313 15 16:59 pts/0 00:00:56 qemu-img convert -O qcow2 /dev/loop0
/rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/3be7c1bb-377c-4d5e-b4f6-1a6574b8a52b/845cdd93-def8-4d84-9a08-f8c991f89fe3
This is the disk image coming in from the OVA source (via the loop device) and being
written to the Gluster volume.
I consistently chose one of the compute cluster nodes, because they have the bigger CPUs,
and that node also happened to have the OVA file locally, so the network wouldn't have to
carry both source and sink traffic...
But unless I use one of the nodes that actually hold bricks in the Gluster volume, I get
this strange silent failure.
The first thing I noticed is that in the import dialog, more details about the machines
inside the OVA are actually visible (disk and network details) before I launch the import,
although Cockpit doesn't seem to care.
I then theorized that the compute nodes might not actually mount the Gluster file system
under /rhev, so the target path would be missing... but they do in fact mount it...
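For anyone retracing this, the check is quick to do on the host in question:

mount | grep glusterSD                    # is the Gluster volume mounted at all?
ls /rhev/data-center/mnt/glusterSD/       # the vmstore path from above should appear here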
I won't go into further details, but take away this lesson:
"If you want to import an OVA file into a Gluster farm, you must do the import on a node
that is part of the Gluster."
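If in doubt about which nodes those are, the brick list tells you (assuming the volume is
called vmstore, as the storage path above suggests):

# run on any Gluster node; the hosts listed as bricks are the safe choices
gluster volume info vmstore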
Ah, yes, the import succeeded and oVirt immediately chose the nicely bigger Xeon-D node to
run the VM...
Now I just need to find out how to twiddle OVA export files from oVirt to make them
digestible for VMware and VirtualBox...
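One possible starting point, sketched with placeholder file names (untested, and the OVF
descriptor plus manifest checksums would still need editing to match): an OVA is just a
tar archive, so the disk can be repacked in a format the other hypervisors prefer.

tar xvf myvm.ova                                              # unpack OVF descriptor + disk image
qemu-img convert -O vmdk -o subformat=streamOptimized disk1 disk1.vmdk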