Hey,
I was trying to install oVirt with SE on a node that has an Intel Skylake
CPU (an Intel Xeon Gold 6238R, to be precise), which according to Intel
supports TSX.
While the SE was provisioned as a local VM everything worked well; it was
using a different CPU type for local provisioning.
After the local SE VM was migrated to the shared storage (iSCSI) and
configured, it failed to start.
When checking the XML (and vm.conf) that was created and provided to
libvirt, I noticed it uses the "Secure Intel Skylake" CPU type
with +tsx-ctrl as a required flag.
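
For anyone who wants to check the same thing on their own host, here is a
minimal sketch of how the required CPU features could be listed from a
libvirt domain XML. The sample XML is hypothetical and only modelled on
libvirt's <cpu>/<feature policy=... name=...> layout; the model name and the
helper function names are assumptions, not taken from the real HostedEngine
definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on libvirt's domain XML format.
# The real HostedEngine XML contains far more than this.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>HostedEngine</name>
  <cpu mode='custom' match='exact'>
    <model>Skylake-Server</model>
    <feature policy='require' name='tsx-ctrl'/>
    <feature policy='disable' name='hle'/>
  </cpu>
</domain>
"""


def cpu_model(domain_xml):
    """Return the text of <cpu><model>, or None if it is absent."""
    root = ET.fromstring(domain_xml.strip())
    model = root.find("cpu/model")
    return model.text if model is not None else None


def required_cpu_features(domain_xml):
    """Return the names of all <feature> elements with policy='require'."""
    root = ET.fromstring(domain_xml.strip())
    return [
        feature.get("name")
        for feature in root.findall("cpu/feature")
        if feature.get("policy") == "require"
    ]


if __name__ == "__main__":
    print("model:", cpu_model(SAMPLE_DOMAIN_XML))
    print("required:", required_cpu_features(SAMPLE_DOMAIN_XML))
```

Feeding it the XML that was actually handed to libvirt (instead of the
sample) should show whether tsx-ctrl is among the required features.
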
Since this is a fresh install of oVirt with SE, my assumption is that the
newly created Cluster was set to this CPU compatibility.
This specific Intel CPU does not expose any tsx flags. While it does
indeed support TSX, libvirt has no way of knowing it. Stranger still, some
other CPUs from the same range/models do expose the tsx flag.
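
A rough way to see what the host exposes is to look at the "flags" line of
/proc/cpuinfo. The sketch below assumes that "tsx flags" means the hle and
rtm flag names as Linux reports them; that mapping, the function name and
the sample text are assumptions made for illustration only.

```python
def exposed_flags(cpuinfo_text, wanted=("hle", "rtm")):
    """Map each wanted flag name to True/False based on the first
    'flags' line found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return {name: name in present for name in wanted}
    # No 'flags' line at all: report every wanted flag as missing.
    return {name: False for name in wanted}


# Hypothetical sample, not copied from a real Xeon Gold 6238R.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: Hypothetical CPU\n"
    "flags\t\t: fpu vme sse2 avx2 avx512f\n"
)

if __name__ == "__main__":
    print(exposed_flags(SAMPLE_CPUINFO))
```

On a real host, pass the contents of /proc/cpuinfo instead of the sample
text.
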
I have tried to set the kernel cmdline to tsx=yes|auto|off and none of
those helped.
The quick solution was to start the engine manually from an edited XML file
(hosted-engine --vm-shutdown, then start it with libvirt), change both the
Cluster CPU type and the HostedEngine CPU type to "Intel Skylake", and then
start it normally (hosted-engine --vm-start).
Another solution, which I haven't tried, is to get the correct string for
the non-secure Intel Skylake CPU from the oVirt-Engine API and hardcode it
into the cpu_model fact in the SE ansible role (task).
In the end I opted not to use any workaround and went with oVirt 4.3
instead, which went smoothly: it chose "Intel Skylake Server IBRS SSBD MDS
Family" as the Cluster CPU compatibility, and the installation completed
without any errors or issues.
1) What will happen if I decide to upgrade to 4.4? I will first have to
reinstall a node with CentOS 8 and then migrate the HostedEngine there as
well. Will it keep the current cluster CPU type, or will it try to upgrade
it and then fail the upgrade?
2) Are you aware of this situation? I understand this is a new solution,
introduced because you had to update the CPU databases every time. On the
other hand, Intel is not helping here by not being strict about exposing
the tsx flags. Perhaps the best option would be to let the user choose
which CPU type to use for the first Cluster created by the SE ansible role?
(As far as I remember, this was available in previous versions of oVirt.)
Thanks!