Why OVA imports failed (the other reason...)

Empty disks from exports that went wrong didn't help. But that's fixed now, even if I can't fully validate the OVA exports on VMware and VirtualBox.

The export/import target for the *.ova files is an SSD-hosted xfs file system on a pure-compute Xeon D oVirt node, exported and automounted to the 3-node HCI cluster, which is also all SSD but on J5005 Atoms. When I import the OVA file, I choose the Xeon D as the import node and the *local path* on that host for the import. Cockpit checks the OVA file, detects the machine inside, lets me select it for import, potentially overriding some parameters, lets me choose the target storage volume, sets up the job... and then fails, rather silently and with very little in terms of error reporting ("connection closed" is the best I got).

Now that same process worked just fine on a single-node HCI cluster (also a J5005 Atom), which had me a bit stunned at first, but it gave a hint as to the cause: part of the import job, most likely a qemu-img job, isn't run via the machine you selected in the first step, and unless the path is global (e.g. external NFS), it fails. If someone from the oVirt team could check and validate or disprove this theory, that could be documented and/or added as a check to avoid people falling into the same trap.

While I was testing this using a global automount path, my cluster failed me (creating and deleting VMs a bit too quickly?) and I had to struggle for a while to get it to recover. While those transient failures are truly frightening, oVirt's ability to recover from these scenarios is quite simply awesome. I guess it's really mostly miscommunication and not real failures, and oVirt has lots of logic to rectify that.
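If anyone wants to test the theory, here is a minimal sketch of how I'd check it (the host names are made up for illustration): while an import job is running, look at every node to see which one is actually executing the conversion:

    # run from a machine with root ssh access to all oVirt nodes;
    # "xeon-d" and "atom1..atom3" are placeholder host names
    for h in xeon-d atom1 atom2 atom3; do
        echo "== $h =="
        ssh root@"$h" 'ps -ef | grep [q]emu-img'   # [q] keeps grep from matching itself
    done

If the qemu-img process shows up on a different node than the one selected in Cockpit, a source path that only exists locally on the selected node would explain the silent failure.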

Here is the explanation, I think:

    root 12319 12313 15 16:59 pts/0 00:00:56 qemu-img convert -O qcow2 /dev/loop0 /rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/3be7c1bb-377c-4d5e-b4f6-1a6574b8a52b/845cdd93-def8-4d84-9a08-f8c991f89fe3

This is where the image enters from the OVA source and gets written to the Gluster volume. I consistently chose one of the compute cluster nodes, because they have the bigger CPUs, and one of them also happened to have the OVA file locally, so the network wouldn't have to carry both source and sink traffic... But unless I use one of the nodes that actually has bricks in the Gluster volume, I get this strange silent failure.

The first thing I noticed is that during the import dialog, more details about the machines inside the OVA are actually visible (disk and network details) before I launch the import, although Cockpit doesn't seem to care. I then theorized that the compute nodes don't actually mount the Gluster file system under /rhev, so the target would be missing... but they do, in fact.

I won't go into further detail, but take away this lesson: "If you want to import an OVA file into a Gluster farm, you must run the import on a node that is part of the Gluster." (A quick check for this follows below.)

Ah, yes, the import succeeded, and oVirt immediately chose the nicely bigger Xeon D node to run the VM... Now I just need to find out how to twiddle the OVA export files from oVirt to make them digestible for VMware and VirtualBox...
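Since a FUSE mount under /rhev clearly isn't enough, the check has to be for bricks, not just for the mount. A minimal sketch ("vmstore" is the volume name taken from the qemu-img target path above; run it on any node in the Gluster trusted pool, since pure compute nodes may not have the gluster CLI):

    # list the bricks of the storage volume and the hosts they live on
    gluster volume info vmstore | grep '^Brick'
    # if the node you plan to run the import on is not among the brick
    # hosts, pick a different node, or the import may fail silently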
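As for making the exports digestible for VMware and VirtualBox: I haven't verified a fix, but since an OVA is just a tar archive (OVF descriptor plus disk images), one avenue would be to unpack it and convert the disk with qemu-img. A sketch with illustrative file names:

    tar -tvf myvm.ova        # list the OVF descriptor and the disk image(s)
    tar -xvf myvm.ova        # unpack them
    DISK=845cdd93-def8-4d84-9a08-f8c991f89fe3   # example disk file name from the listing
    qemu-img info "$DISK"    # check which format oVirt wrote
    # VMware/VirtualBox generally expect streamOptimized VMDK inside an OVA
    qemu-img convert -O vmdk -o subformat=streamOptimized "$DISK" disk.vmdk

The OVF descriptor would presumably need matching edits too; I haven't checked how picky VMware and VirtualBox are about oVirt's OVF dialect.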