Why OVA imports failed (the other reason...)

Empty disks from exports that went wrong didn't help. But that's fixed now, even if I can't fully validate the OVA exports on VMware and VirtualBox.

The export/import target for the *.ova files is an SSD-hosted xfs file system on a pure-compute Xeon D oVirt node, exported and automounted to the 3-node HCI cluster, which is also all SSD but on J5005 Atoms. When I import the OVA file, I choose the Xeon D as the import node and the *local path* on that host for the import. Cockpit checks the OVA file, detects the machine inside, lets me select it for import, potentially overriding some parameters, lets me choose the target storage volume, sets up the job... and then fails, rather silently and with very little in terms of error reporting ("connection closed" is the best I got).

Now that same process worked just fine on a single-node HCI cluster (also a J5005 Atom), which had me a bit stunned at first, but it gave a hint as to the cause: part of the import job, most likely a qemu-img job, isn't run via the machine you selected in the first step, and unless the path is global (e.g. external NFS), it fails. If someone from the oVirt team could check and validate or disprove this theory, that could be documented and/or added as a check to avoid people falling into the same trap.

While I was testing this using a global automount path, my cluster failed me (creating and deleting VMs a bit too quickly?) and I had to struggle for a while to get it to recover. While those transient failures are truly frightening, oVirt's ability to recover from these scenarios is quite simply awesome. I guess it's really mostly miscommunication and not real failures, and oVirt has lots of logic to rectify that.
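If anyone wants to test the theory, here is a minimal sketch of how I'd check it (the host names are made up for illustration): while an import job is running, look at every node to see which one is actually executing the conversion:

    # run from a machine with root ssh access to all oVirt nodes;
    # "xeon-d" and "atom1..atom3" are placeholder host names
    for h in xeon-d atom1 atom2 atom3; do
        echo "== $h =="
        ssh root@"$h" 'ps -ef | grep [q]emu-img'   # [q] keeps grep from matching itself
    done

If the qemu-img process shows up on a different node than the one selected in Cockpit, a source path that only exists locally on the selected node would explain the silent failure.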

Here is the explanation, I think:

    root 12319 12313 15 16:59 pts/0 00:00:56 qemu-img convert -O qcow2 /dev/loop0 /rhev/data-center/mnt/glusterSD/192.168.0.91:_vmstore/9d1b8774-c5dc-46a8-bfa2-6a6db5851195/images/3be7c1bb-377c-4d5e-b4f6-1a6574b8a52b/845cdd93-def8-4d84-9a08-f8c991f89fe3

This is where the image enters from the OVA source and gets written to the Gluster volume. I consistently chose one of the compute cluster nodes, because they have the bigger CPUs, and one of them also happened to have the OVA file locally, so the network wouldn't have to carry both source and sink traffic... But unless I use one of the nodes that actually has bricks in the Gluster volume, I get this strange silent failure.

The first thing I noticed is that during the import dialog, more details about the machines inside the OVA are actually visible (disk and network details) before I launch the import, although Cockpit doesn't seem to care. I then theorized that the compute nodes don't actually mount the Gluster file system under /rhev, so the target would be missing... but they do, in fact.

I won't go into further detail, but take away this lesson: "If you want to import an OVA file into a Gluster farm, you must run the import on a node that is part of the Gluster." (A quick check for this follows below.)

Ah, yes, the import succeeded, and oVirt immediately chose the nicely bigger Xeon D node to run the VM... Now I just need to find out how to twiddle the OVA export files from oVirt to make them digestible for VMware and VirtualBox...
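Since a FUSE mount under /rhev clearly isn't enough, the check has to be for bricks, not just for the mount. A minimal sketch ("vmstore" is the volume name taken from the qemu-img target path above; run it on any node in the Gluster trusted pool, since pure compute nodes may not have the gluster CLI):

    # list the bricks of the storage volume and the hosts they live on
    gluster volume info vmstore | grep '^Brick'
    # if the node you plan to run the import on is not among the brick
    # hosts, pick a different node, or the import may fail silently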
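As for making the exports digestible for VMware and VirtualBox: I haven't verified a fix, but since an OVA is just a tar archive (OVF descriptor plus disk images), one avenue would be to unpack it and convert the disk with qemu-img. A sketch with illustrative file names:

    tar -tvf myvm.ova        # list the OVF descriptor and the disk image(s)
    tar -xvf myvm.ova        # unpack them
    DISK=845cdd93-def8-4d84-9a08-f8c991f89fe3   # example disk file name from the listing
    qemu-img info "$DISK"    # check which format oVirt wrote
    # VMware/VirtualBox generally expect streamOptimized VMDK inside an OVA
    qemu-img convert -O vmdk -o subformat=streamOptimized "$DISK" disk.vmdk

The OVF descriptor would presumably need matching edits too; I haven't checked how picky VMware and VirtualBox are about oVirt's OVF dialect.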