[Users] fedora qemu modern CPU recognition issues cause ovirt managed host to go non-operational

Dead Horse deadhorseconsulting at gmail.com
Sun Apr 22 04:20:38 EDT 2012


Seeing an issue wherein oVirt moves a managed host to the Non-Operational state.
This occurs with both the currently released version of oVirt and the latest
development builds.
The oVirt host is loaded with Fedora 16 (FC16) and equipped with the most
current development version of vdsm.
*Editorial note*
Getting the latest vdsm to work on the FC16 host required building and
installing newer versions of the sanlock, libvirt, lvm2 and device-mapper
packages than FC16 provides.
Ultimately, however, none of the newer packages has any bearing on this
failure mode.

The failure mode is as follows.
Upon successfully adding the host and setting the cluster CPU compatibility
level, oVirt will take the host offline with the following message:
--> Host ovirtnode moved to Non-Operational state as host does not meet the
cluster's minimum CPU level. Missing CPU features : model_Nehalem

Under the hood, the actual cause of this failure is that qemu cannot
correctly identify the host CPU feature flags.
This can be observed by running: qemu-system-x86_64 -cpu Nehalem,check
which fails with:
warning: host cpuid 0000_0000 lacks requested flag 'fpu' [0x00000001]
warning: host cpuid 0000_0000 lacks requested flag 'de' [0x00000004]
and on and on...
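
Since the warning list runs long, a small script can condense it. The following is only a sketch: it assumes every warning has exactly the shape of the two lines quoted above, and the function name parse_missing_flags is made up for illustration. Save the qemu output to a file or pipe it in and feed the text to the function.

```python
import re

# Matches lines shaped like the warnings quoted in this post, e.g.
# warning: host cpuid 0000_0000 lacks requested flag 'fpu' [0x00000001]
# Assumption: all warnings follow this exact layout.
WARNING_RE = re.compile(
    r"warning: host cpuid (?P<leaf>\S+) lacks requested flag "
    r"'(?P<flag>[^']+)' \[(?P<mask>0x[0-9a-fA-F]+)\]"
)


def parse_missing_flags(output):
    """Return a list of (leaf, flag, mask) tuples, one per matching warning."""
    found = []
    for line in output.splitlines():
        match = WARNING_RE.search(line)
        if match:
            mask = int(match.group("mask"), 16)
            found.append((match.group("leaf"), match.group("flag"), mask))
    return found


if __name__ == "__main__":
    # The two warnings quoted in the post, used here as sample input.
    sample = (
        "warning: host cpuid 0000_0000 lacks requested flag 'fpu' [0x00000001]\n"
        "warning: host cpuid 0000_0000 lacks requested flag 'de' [0x00000004]\n"
    )
    for leaf, flag, mask in parse_missing_flags(sample):
        print("%s: cpuid leaf %s, mask %#010x" % (flag, leaf, mask))
```

Note that every warning reports cpuid leaf 0000_0000, which fits the reading that qemu is not seeing the host flags at all rather than the CPU lacking them.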

A simple check with "cat /proc/cpuinfo | grep flags" shows:
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16
xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi
flexpriority ept vpid

Thus the CPU is more than capable of everything being asked of it via the
cpudef for "Nehalem" in /etc/qemu/target-x86_64.conf.
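
The comparison can also be done mechanically. A rough sketch follows, assuming the flag names you pass in match the kernel's spelling (they do not always: the kernel reports SSE3 as pni, for instance). The REQUIRED list is a hand-picked illustrative subset, not the actual Nehalem cpudef, and the function names are made up for this example.

```python
def parse_cpuinfo_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(required, cpuinfo_text):
    """Return the required flags that the host does not advertise, sorted."""
    return sorted(set(required) - parse_cpuinfo_flags(cpuinfo_text))


if __name__ == "__main__":
    # Abbreviated copy of the flags line from this post (one line, as the
    # kernel prints it). On a real host, read /proc/cpuinfo instead.
    sample = (
        "flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge "
        "mca cmov pat pse36 clflush mmx fxsr sse sse2 ssse3 cx16 sse4_1 "
        "sse4_2 popcnt lahf_lm\n"
    )
    # Illustrative subset only; NOT the real cpudef from target-x86_64.conf.
    REQUIRED = ["ssse3", "sse4_1", "sse4_2", "popcnt", "lahf_lm"]
    missing = missing_features(REQUIRED, sample)
    print("missing:", ", ".join(missing) or "none")
```

To check a real host, replace the sample string with the contents of /proc/cpuinfo and the REQUIRED list with the flags from the cpudef you care about.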

This is only an issue on Fedora hosts. RHEL/CentOS/SL hosts work fine. This
was recognized as an issue in RHEL and fixed there, but it has not been fixed
in Fedora.
See: http://wiki.qemu.org/Features/CPUModels#Examples and this:
https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=689665 and this
related discussion in qemu-devel:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg101360.html

Thus it appears that the changes were made to the RHEL qemu (e.g. CPU type
rhel6), which relates to the change one needs to make to the oVirt engine
database to have oVirt manage an EL-based host.
Fedora hosts -->
  psql -U postgres engine -c "update vdc_options set option_value='pc-0.14' where option_name='EmulatedMachine' and version='3.0';"
EL hosts -->
  psql -U postgres engine -c "update vdc_options set option_value='rhel6.2.0' where option_name='EmulatedMachine' and version='3.0';"
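
For anyone scripting this, here is a minimal sketch that builds the same psql invocation from the host family. The two option values are taken from the commands above; the dictionary keys ("fedora", "el") and the function name are hypothetical labels introduced only for this example. It prints the command rather than running it.

```python
import shlex

# Option values copied from the two commands in this post. The keys are
# made-up labels for this sketch, not anything oVirt itself defines.
EMULATED_MACHINE = {"fedora": "pc-0.14", "el": "rhel6.2.0"}


def build_psql_argv(host_family, version="3.0"):
    """Return the psql argument list that sets EmulatedMachine for a host family."""
    machine = EMULATED_MACHINE[host_family]
    sql = (
        "update vdc_options set option_value='%s' "
        "where option_name='EmulatedMachine' and version='%s';"
        % (machine, version)
    )
    return ["psql", "-U", "postgres", "engine", "-c", sql]


if __name__ == "__main__":
    # Print a shell-safe version of the command for review before running it.
    print(" ".join(shlex.quote(arg) for arg in build_psql_argv("fedora")))
```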

Thus, at the moment, any host loaded with Fedora and managed by oVirt that
uses a Sandy Bridge, Nehalem or Westmere processor will be dead in the
water.

-DHC