oVirt definitely involves a lot of paddling. I see now that you are right about HCI
(Hyperconverged Infrastructure): how to upgrade it was never specifically documented. I
stepped away from HCI a long time ago, after testing and working with it in oVirt 3.x.
There were just too many dependencies and things that could go wrong. Too much
functionality depending on the same hardware and software. I have also stepped away from
Gluster since then, and in production we are not using self-hosted engines either. A standalone
engine on a different hypervisor cluster gives me much more peace of mind. However, I
cannot see from the original post whether this is hyperconverged, or even whether it is self-hosted.
Maybe I'm just missing it :-)
The procedures for self-hosted and standalone engine are described and I have been able to
upgrade clusters in the past using them. Make sure you follow all the steps one by one in
the correct order. But I agree, if something goes wrong on the way, you are a bit on your
own in dangerous territory paddling up the creek :-P It sounds like the upgrade went fine
though... and migrating VMs from a 4.3 host to a 4.4 host should be a standard procedure
when upgrading the nodes one by one in a live environment. I cannot remember whether we did
the upgrade from 4.3 to 4.4 live, but the upgrade from 4.4 to 4.5 worked for us without taking the VMs
down. This kind of upgrade is best done in a service window though, and you should have
backups of everything before you start. I'm 100% with Thomas on this.
Looking at "12.6. Migrating hosts and virtual machines from oVirt 4.3 to 4.4"
from the upgrade guide, I see lots of caveats. It really depends on whether the oVirt Node
appliance is used or an Enterprise Linux server (as it needs to be upgraded first by the
looks of it), and on whether there are VMs with CPU passthrough.
I have also looked around for bugs in 4.4, as that is the version you are upgrading to...
I find this for example: Bug 1774064. It seems this error occurs in various versions of
4.4, even in cases unrelated to upgrades.
I'm wondering whether you should not just push through with updating all the oVirt nodes,
which then allows you to change the cluster compatibility level to 4.4. This would mean
not migrating VMs live, but shutting them down and possibly starting them on an upgraded
node, until all nodes are upgraded. If you upgrade one more node, you would be able to
check if migration works between upgraded 4.4 nodes, which would then confirm that pushing
through and upgrading all nodes to 4.4 would be the way forward, even if it means that you
have to shut down all the VMs at some point during the migration.
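To make that reasoning a bit more concrete, here is a minimal Python sketch of the plan.
It does not talk to oVirt at all: the Host class, the host names and the helper functions
are made up for illustration, and the rule that the cluster compatibility level can only
be raised once every node runs 4.4 is just the assumption I described above.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for an oVirt node; not part of any oVirt SDK."""
    name: str
    version: tuple  # (major, minor), e.g. (4, 3) or (4, 4)


TARGET = (4, 4)


def hosts_still_to_upgrade(hosts, target=TARGET):
    """Names of the nodes that are still below the target version."""
    return [h.name for h in hosts if h.version < target]


def can_test_migration_between_upgraded(hosts, target=TARGET):
    """A 4.4-to-4.4 live migration test needs at least two upgraded nodes."""
    return sum(1 for h in hosts if h.version >= target) >= 2


def can_raise_cluster_level(hosts, target=TARGET):
    """Assumption from the text: only possible once every node is upgraded."""
    return len(hosts) > 0 and not hosts_still_to_upgrade(hosts, target)


# Made-up example cluster: one node upgraded so far, two still on 4.3.
cluster = [Host("node1", (4, 4)), Host("node2", (4, 3)), Host("node3", (4, 3))]

print(hosts_still_to_upgrade(cluster))
print(can_test_migration_between_upgraded(cluster))
print(can_raise_cluster_level(cluster))
```

With only one node upgraded, the migration test between 4.4 nodes is not possible yet;
after upgrading one more node it is, and only after the last node is done would you
change the cluster compatibility level.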
I hope this helps. Good luck with it.
PS. In a few cases I have resorted to simply reinstalling a node from scratch on the new
version and then adding it to the (upgraded) engine to make it work again. oVirt is
a lot of paddling, but in my experience it works fine when you just let it run and
don't do anything too fancy to it (like upgrading, which has to be done from time to
time).