[Users] migrations failing with latest master + vdsm

Michal Skrivanek mskrivan at redhat.com
Tue Oct 15 14:42:11 UTC 2013


On Oct 15, 2013, at 16:36 , Dead Horse <deadhorseconsulting at gmail.com> wrote:

> I have been running EL 6.4 hosts in 3.3 mode for quite some time; I only noticed this breakage with the latest master VDSM 4.13.x. The tagged vdsm versions ovirt-3.3.0 and ovirt-3.3 do work (if not using master, which one of these should be used with 3.3, by the way?)

It is correct.
As a workaround, I suppose you can use engine-config to set AbortMigrationOnError to false for now.
The default should probably change to false until EL 6.5 comes out, I guess.
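
For reference, the AttributeError in the quoted traceback comes from reading VIR_MIGRATE_ABORT_ON_ERROR off a libvirt Python binding that does not define it. Below is a minimal sketch of how that lookup could be guarded. This is not the actual vdsm code: migration_flags and the two stand-in module objects are hypothetical, and the numeric flag values are only illustrative placeholders for the real libvirt constants.

```python
from types import SimpleNamespace


def migration_flags(libvirt_mod, abort_on_error):
    """Build migration flags, skipping ABORT_ON_ERROR if the binding lacks it.

    libvirt_mod is the imported libvirt module (or a stand-in with the same
    attribute names); abort_on_error mirrors the engine's abortOnError request.
    """
    flags = libvirt_mod.VIR_MIGRATE_LIVE | libvirt_mod.VIR_MIGRATE_PEER2PEER
    if abort_on_error:
        # getattr with a default of 0 avoids the AttributeError on older
        # bindings: the flag is simply left out when it is not defined.
        flags |= getattr(libvirt_mod, 'VIR_MIGRATE_ABORT_ON_ERROR', 0)
    return flags


# Hypothetical stand-ins for the libvirt module on an older and a newer host.
# The values are illustrative only.
old_libvirt = SimpleNamespace(VIR_MIGRATE_LIVE=1, VIR_MIGRATE_PEER2PEER=2)
new_libvirt = SimpleNamespace(VIR_MIGRATE_LIVE=1, VIR_MIGRATE_PEER2PEER=2,
                              VIR_MIGRATE_ABORT_ON_ERROR=4096)

print(migration_flags(old_libvirt, True))
print(migration_flags(new_libvirt, True))
```

With a guard like this, an older host would migrate without the abort-on-error behaviour rather than fail outright. Whether that silent fallback is acceptable is a separate question.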

Thanks,
michal

> The running version of libvirt on the hosts is: libvirt-0.10.2-18.0.1.el6_4.14.x86_64
> 
> - DHC
> 
> 
> 
> On Tue, Oct 15, 2013 at 9:05 AM, Michal Skrivanek <mskrivan at redhat.com> wrote:
> 
> On Oct 12, 2013, at 00:30 , Dan Kenigsberg <danken at redhat.com> wrote:
> 
> > On Fri, Oct 11, 2013 at 03:30:35PM -0500, Dead Horse wrote:
> >> VM migrations are failing with latest master engine and vdsm
> >>
> >> logs attached from engine and both hosts
> >>
> >> Hosts are EL 6.4 with latest master VDSM
> >
> > Thanks for your report!
> >
> > Thread-195::ERROR::2013-10-11 15:22:39,508::vm::304::vm.Vm::(run) vmId=`4bad94ad-c338-4ec5-8e5b-9910d58c1854`::Failed to migrate
> > Traceback (most recent call last):
> >  File "/usr/share/vdsm/vm.py", line 291, in run
> >    self._startUnderlyingMigration()
> >  File "/usr/share/vdsm/vm.py", line 369, in _startUnderlyingMigration
> >    self._abortOnError else 0),
> > AttributeError: 'module' object has no attribute 'VIR_MIGRATE_ABORT_ON_ERROR'
> >
> >
> > Peter, Michal, I think that VIR_MIGRATE_ABORT_ON_ERROR is expected only in
> > el6.5, which is still not public
> >    Bug 972675 - Fail migration when VM get paused due to EIO
> >
> > This must be reverted in vdsm or hacked in Engine (do not set abortOnError=True if libvirt < libvirt-0.10.2-20.el6)
> 
> As of http://gerrit.ovirt.org/#/c/19312/ the flag is sent for 3.3 clusters, which is correct.
> I thought you're not supposed to have an EL 6.4 host in a 3.3 cluster. EL 6.5 should work.
> 
> Thanks,
> michal
> 
> >
> >>
> >> Additionally, the hosts seem to lose connection with their storage domains
> >> (new behavior): they go offline and then recover (even though nothing is
> >> physically wrong).
> >
> 
> 