I came to oVirt thinking that it was like CentOS: There might be bugs, but given the
mainline usage in home and corporate labs with light workloads and nothing special, the
chances of hitting one should be pretty small: I like looking for new frontiers on top of
my OS, not inside.
I ran CentOS/OpenVZ for years in a previous job, mission-critical 24x7 stuff
where minutes of outage meant being grilled for hours in meetings afterwards, and all of
it PCI-DSS certified. I never had an issue with OpenVZ/CentOS; all those minute-long
goofs were human error or Oracle inventing execution plans.
Boy, was I wrong about oVirt! Just setting it up took weeks. Ansible loves eating gigahertz
and I was running on Atoms. I had to learn how to switch from an i7 in mid-installation to
have it finish at all. In the end I had learned tons of new things, but all I wanted was a
cluster that would work as much out of the box as CentOS or OpenVZ.
Something as fundamental as exporting and importing a VM might simply not work and not
even get fixed.
Migrating HCI from CentOS7/oVirt 4.3 to CentOS8/oVirt 4.4 is anything but smooth; a
complete rebuild seems the lesser evil. Now if only exports and imports worked reliably!
Rebooting an HCI node seems to involve an "I am dying!" aria on the network,
where the whole switch becomes unresponsive for 10 minutes and the fault-tolerant cluster
on it is 100% unresponsive (along with all other machines on that switch). I had so much
fun resyncing Gluster file systems and searching through all those log files for signs as
to what was going on!
And the instructions on how to fix Gluster issues seem so wonderfully detailed and vague
that one could spend days trying to fix things, or else rebuild and restore. It doesn't
help that the fate of Gluster very much seems to hang in the air, when the scalable HCI
aspect was the only reason I ever wanted oVirt.
The switch lockups could just be an issue with Realtek adapters, because I never observed
anything like that with Intel NICs or on (recycled old) enterprise hardware.
I guess official support for a 3 node HCI cluster on passive Atoms isn't going to
happen, unless I make it happen 100% myself: It's open source after all!
Just think what 3/6/9 node HCI based on Raspberry Pi would do for the project! The 9 node
HCI should deliver better 10Gbit GlusterFS performance than most QNAP units with a single
10Gbit interface at the same cost, even with 7:2 erasure coding!
I really think the future of oVirt may be at the edge, not in the datacenter core.
In short: oVirt is very much beta software and quite simply a full-time job if you depend
on it working over time.
I can't see that getting any better when one beta gets to run on top of another beta.
At the moment my oVirt experience makes me doubt that RHV on RHEL would work any better,
even if it's cheaper than VMware.
OpenVZ was simply a far better alternative to KVM for most of the things I needed from
virtualization, and it was mainly the hassle of trying to make that work with RHEL which
had me switching to CentOS. CentOS with OpenVZ was the bedrock of that business for 15
years and proved to me that Red Hat was hell-bent on making bad decisions on technological
direction.
I would actually have liked to pay for a license for each of the physical hosts we used,
but it turned out to be much less of a bother to forget about negotiating licensing
conditions for OpenVZ containers and use CentOS instead.
BTW: I am going into a meeting tomorrow where, after two years of pilot usage, we might
just decide to kill our current oVirt farms, because they didn't deliver on "a
free open-source virtualization solution for your entire enterprise".
I'll keep my Atoms running a little longer, mostly because I have nothing else to use
them for. For the first time in months, they show zero Gluster replication errors, perhaps
because, for lack of updates, there have been no node reboots. CentOS 7 is stable, but
oVirt 4.3 is out of support.