[Users] 6.4 CR: oVirt 3.1 breaks with missing cpu features after update to CentOS 6.4 (6.3 + CR)
Dan Kenigsberg
danken at redhat.com
Thu Mar 7 10:18:33 EST 2013
On Thu, Mar 07, 2013 at 03:59:27PM +0100, Patrick Hurrelmann wrote:
> On 05.03.2013 13:49, Dan Kenigsberg wrote:
> > On Tue, Mar 05, 2013 at 12:32:31PM +0100, Patrick Hurrelmann wrote:
> >> On 05.03.2013 11:14, Dan Kenigsberg wrote:
> >> <snip>
> >>>>>>
> >>>>>> My version of vdsm as stated by Dreyou:
> >>>>>> v 4.10.0-0.46 (.15), built from
> >>>>>> b59c8430b2a511bcea3bc1a954eee4ca1c0f4861 (branch ovirt-3.1)
> >>>>>>
> >>>>>> I can't see that Ia241b09c96fa16441ba9421f61a2f9a417f0d978 was merged to
> >>>>>> the 3.1 branch?
> >>>>>>
> >>>>>> I applied that patch locally and restarted vdsmd but this does not
> >>>>>> change anything. Supported cpu is still as low as Conroe instead of
> >>>>>> Nehalem. Or is there more to do than patching libvirtvm.py?
> >>>>>
> >>>>> What is libvirt's opinion about your cpu compatibility?
> >>>>>
> >>>>> virsh -r cpu-compare <(echo '<cpu match="minimum"><model>Nehalem</model><vendor>Intel</vendor></cpu>')
> >>>>>
> >>>>> If you do not get "Host CPU is a superset of CPU described in bla", then
> >>>>> the problem is within libvirt.
> >>>>>
> >>>>> Dan.
> >>>>
> >>>> Hi Dan,
> >>>>
> >>>> virsh -r cpu-compare <(echo '<cpu
> >>>> match="minimum"><model>Nehalem</model><vendor>Intel</vendor></cpu>')
> >>>> Host CPU is a superset of CPU described in /dev/fd/63
> >>>>
> >>>> So libvirt obviously is fine. Anything else would have surprised
> >>>> me, as virsh capabilities already looked correct.
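Reduced to set arithmetic, the "superset" verdict above means the host provides every feature the requested model needs. A toy illustration (the flag sets below are abbreviated, illustrative subsets, not the real model definitions):

```python
# Features a Nehalem-level guest model would require (illustrative subset).
nehalem_needs = {"ssse3", "sse4_1", "sse4_2", "popcnt"}

# Flags the host CPU advertises (illustrative subset of getVdsCaps cpuFlags).
host_flags = {"ssse3", "sse4_1", "sse4_2", "popcnt", "vmx", "ept"}

# The host is a superset of the model iff it covers all required features.
print(nehalem_needs <= host_flags)
```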
> >>>
> >>> So maybe, just maybe, libvirt has changed its cpu_map, a map that
> >>> ovirt-3.1 had a bug reading.
> >>>
> >>> Would you care to apply http://gerrit.ovirt.org/5035 to see if this is
> >>> it?
> >>>
> >>> Dan.
> >>
> >> Hi Dan,
> >>
> >> Success! Applying that patch made the CPU recognition work again. The
> >> CPU type in the admin portal shows as Nehalem again. Output from getVdsCaps:
> >>
> >> cpuCores = 4
> >> cpuFlags = fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,
> >> mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,
> >> ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,
> >> arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,
> >> aperfmperf,pni,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,
> >> ssse3,cx16,xtpr,pdcm,sse4_1,sse4_2,popcnt,lahf_lm,ida,
> >> dts,tpr_shadow,vnmi,flexpriority,ept,vpid,model_Nehalem,
> >> model_Conroe,model_coreduo,model_core2duo,model_Penryn,
> >> model_n270
> >> cpuModel = Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
> >> cpuSockets = 1
> >> cpuSpeed = 2393.769
> >>
> >>
> >> I compared libvirt's cpu_map.xml on CentOS 6.3 and CentOS 6.4, and
> >> indeed they differ substantially. So this patch should probably be
> >> merged to the 3.1 branch? I will contact Dreyou and request that this
> >> patch also be included in his builds. Otherwise, I guess there will be
> >> quite some fallout once people start picking CentOS 6.4 for oVirt 3.1.
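For context, the supported-model list comes from parsing libvirt's cpu_map.xml (/usr/share/libvirt/cpu_map.xml on EL6). A hypothetical minimal sketch of that kind of lookup, using a trimmed stand-in for the real file (the actual file also carries per-model feature and vendor data):

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for libvirt's cpu_map.xml; only the <arch>/<model>
# layout matters for enumerating model names.
SAMPLE_CPU_MAP = """
<cpus>
  <arch name='x86'>
    <model name='Conroe'/>
    <model name='Penryn'>
      <model name='Conroe'/>
    </model>
    <model name='Nehalem'/>
  </arch>
</cpus>
"""

def list_models(xml_text):
    """Return the x86 model names defined at the top level of a cpu map."""
    root = ET.fromstring(xml_text)
    # Only direct children of <arch>: a nested <model> expresses
    # inheritance from an earlier model, not a new model definition.
    return [m.get('name') for m in root.findall("./arch[@name='x86']/model")]

print(list_models(SAMPLE_CPU_MAP))
```

A parser written against one revision of this file can silently misread a reorganized one, which is consistent with the symptom seen here (models dropping out after the 6.4 update).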
> >>
> >> Thanks again and best regards
> >
> > Thank you for reporting this issue and verifying its fix.
> >
> > I'm not completely sure that we should keep maintaining the ovirt-3.1
> > branch upstream - but a build destined for el6.4 must have it.
> >
> > If you believe we should release a fix version for 3.1, please verify
> > that http://gerrit.ovirt.org/12723 has no ill effects.
> >
> > Dan.
>
> I did some additional tests, and the new CentOS 6.4 host failed to start
> or migrate any VM. It always boils down to:
>
> Thread-43::ERROR::2013-03-07
> 15:02:51,950::task::853::TaskManager.Task::(_setError)
> Task=`52a9f96f-3dfd-4bcf-8d7a-db14e650b4c1`::Unexpected error
> Traceback (most recent call last):
> File "/usr/share/vdsm/storage/task.py", line 861, in _run
> return fn(*args, **kargs)
> File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
> res = f(*args, **kwargs)
> File "/usr/share/vdsm/storage/hsm.py", line 2551, in getVolumeSize
> apparentsize = str(volume.Volume.getVSize(sdUUID, imgUUID, volUUID,
> bs=1))
> File "/usr/share/vdsm/storage/volume.py", line 283, in getVSize
> return mysd.getVolumeClass().getVSize(mysd, imgUUID, volUUID, bs)
> File "/usr/share/vdsm/storage/blockVolume.py", line 101, in getVSize
> return int(int(lvm.getLV(sdobj.sdUUID, volUUID).size) / bs)
> File "/usr/share/vdsm/storage/lvm.py", line 772, in getLV
> lv = _lvminfo.getLv(vgName, lvName)
> File "/usr/share/vdsm/storage/lvm.py", line 567, in getLv
> lvs = self._reloadlvs(vgName)
> File "/usr/share/vdsm/storage/lvm.py", line 419, in _reloadlvs
> self._lvs.pop((vgName, lvName), None)
> File "/usr/lib64/python2.6/contextlib.py", line 34, in __exit__
> self.gen.throw(type, value, traceback)
> File "/usr/share/vdsm/storage/misc.py", line 1219, in acquireContext
> yield self
> File "/usr/share/vdsm/storage/lvm.py", line 404, in _reloadlvs
> lv = makeLV(*fields)
> File "/usr/share/vdsm/storage/lvm.py", line 218, in makeLV
> attrs = _attr2NamedTuple(args[LV._fields.index("attr")],
> LV_ATTR_BITS, "LV_ATTR")
> File "/usr/share/vdsm/storage/lvm.py", line 188, in _attr2NamedTuple
> attrs = Attrs(*values)
> TypeError: __new__() takes exactly 9 arguments (10 given)
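The TypeError above is a named-tuple arity mismatch: a tuple defined for the 8-character lv_attr field of older lvm2 receives a 9-character attr string from the newer lvm2 on CentOS 6.4. A hypothetical reproduction (the field names are invented for illustration; the exact error wording varies by Python version):

```python
from collections import namedtuple

# Named tuple sized for the old 8-character lv_attr field.
OLD_ATTR_BITS = ("voltype", "permission", "alloc", "fixedminor",
                 "state", "devopen", "target", "zero")
Attrs = namedtuple("LV_ATTR", OLD_ATTR_BITS)

print(Attrs(*"-wi-ao--"))    # 8 characters: unpacks cleanly
try:
    Attrs(*"-wi-ao---")      # 9 characters, as newer lvm2 reports
except TypeError as exc:
    print("TypeError:", exc)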
>
> and followed by:
>
> Thread-43::ERROR::2013-03-07
> 15:02:51,987::dispatcher::69::Storage.Dispatcher.Protect::(run)
> __new__() takes exactly 9 arguments (10 given)
> Traceback (most recent call last):
> File "/usr/share/vdsm/storage/dispatcher.py", line 61, in run
> result = ctask.prepare(self.func, *args, **kwargs)
> File "/usr/share/vdsm/storage/task.py", line 1164, in prepare
> raise self.error
> TypeError: __new__() takes exactly 9 arguments (10 given)
> Thread-43::DEBUG::2013-03-07
> 15:02:51,987::vm::580::vm.Vm::(_startUnderlyingVm)
> vmId=`7db86f12-8c57-4d2b-a853-a6fd6f7ee82d`::_ongoingCreations released
> Thread-43::ERROR::2013-03-07
> 15:02:51,987::vm::604::vm.Vm::(_startUnderlyingVm)
> vmId=`7db86f12-8c57-4d2b-a853-a6fd6f7ee82d`::The vm start process failed
> Traceback (most recent call last):
> File "/usr/share/vdsm/vm.py", line 570, in _startUnderlyingVm
> self._run()
> File "/usr/share/vdsm/libvirtvm.py", line 1289, in _run
> devices = self.buildConfDevices()
> File "/usr/share/vdsm/vm.py", line 431, in buildConfDevices
> self._normalizeVdsmImg(drv)
> File "/usr/share/vdsm/vm.py", line 358, in _normalizeVdsmImg
> drv['truesize'] = res['truesize']
> KeyError: 'truesize'
>
> In webadmin the start and migrate operations fail with 'truesize'.
>
> I found BZ#876958, which shows the very same error, so I tried to apply
> patch http://gerrit.ovirt.org/9317. I had to apply it manually (I guess
> the patch would need a rebase for 3.1), but it works.
Thanks for the report. I've made a public backport of it in
http://gerrit.ovirt.org/12836/ and would again ask you to mark it as
verified once you have tested it.
>
> I can now start new virtual machines successfully on a CentOS 6.4 /
> oVirt 3.1 host. Migration of VMs from CentOS 6.3 hosts works, but not
> the other way around; migration from 6.4 to 6.3 fails:
>
> Thread-1296::ERROR::2013-03-07 15:55:24,845::vm::176::vm.Vm::(_recover)
> vmId=`c978cbf8-6b4d-4d6f-9435-480d9fed31c4`::internal error Process
> exited while reading console log output: Supported machines are:
> pc RHEL 6.3.0 PC (alias of rhel6.3.0)
> rhel6.3.0 RHEL 6.3.0 PC (default)
> rhel6.2.0 RHEL 6.2.0 PC
> rhel6.1.0 RHEL 6.1.0 PC
> rhel6.0.0 RHEL 6.0.0 PC
> rhel5.5.0 RHEL 5.5.0 PC
> rhel5.4.4 RHEL 5.4.4 PC
> rhel5.4.0 RHEL 5.4.0 PC
>
> Thread-1296::ERROR::2013-03-07 15:55:24,988::vm::240::vm.Vm::(run)
> vmId=`c978cbf8-6b4d-4d6f-9435-480d9fed31c4`::Failed to migrate
> Traceback (most recent call last):
> File "/usr/share/vdsm/vm.py", line 223, in run
> self._startUnderlyingMigration()
> File "/usr/share/vdsm/libvirtvm.py", line 451, in
> _startUnderlyingMigration
> None, maxBandwidth)
> File "/usr/share/vdsm/libvirtvm.py", line 491, in f
> ret = attr(*args, **kwargs)
> File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py",
> line 82, in wrapper
> ret = f(*args, **kwargs)
> File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178, in
> migrateToURI2
> if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed',
> dom=self)
> libvirtError: internal error Process exited while reading console log
> output: Supported machines are:
> pc RHEL 6.3.0 PC (alias of rhel6.3.0)
> rhel6.3.0 RHEL 6.3.0 PC (default)
> rhel6.2.0 RHEL 6.2.0 PC
> rhel6.1.0 RHEL 6.1.0 PC
> rhel6.0.0 RHEL 6.0.0 PC
> rhel5.5.0 RHEL 5.5.0 PC
> rhel5.4.4 RHEL 5.4.4 PC
> rhel5.4.0 RHEL 5.4.0 PC
>
> But I guess this is expected: migration from a higher host version to a
> lower one is probably not supported, right?
Well, I suppose that qemu would allow the migration if you begin with a
*guest* of machine type rhel6.3.0. Please try it out.
Dan.
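The suggestion above amounts to pinning the guest's machine type to the highest type both hosts' qemu-kvm builds know. As an illustrative sketch (the XML is a minimal fragment of a libvirt domain definition, not a complete one), the relevant attribute lives in the domain XML's os/type element:

```python
import xml.etree.ElementTree as ET

# Minimal illustrative fragment of a libvirt domain definition whose
# machine type only the EL6.4 qemu-kvm understands.
DOMAIN_XML = """
<domain type='kvm'>
  <os>
    <type arch='x86_64' machine='rhel6.4.0'>hvm</type>
  </os>
</domain>
"""

root = ET.fromstring(DOMAIN_XML)
os_type = root.find('./os/type')
os_type.set('machine', 'rhel6.3.0')  # pin to the lowest common machine type
print(os_type.get('machine'))
```

A guest defined this way should be startable on the EL6.4 host while remaining migratable back to EL6.3, since rhel6.3.0 appears in both hosts' supported-machine lists.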