[ovirt-devel] contEIOVMs regression?

Nir Soffer nsoffer at redhat.com
Fri Nov 21 15:03:42 UTC 2014


----- Original Message -----
> From: "Adam Litke" <alitke at redhat.com>
> To: "Nir Soffer" <nsoffer at redhat.com>
> Cc: devel at ovirt.org, "Francesco Romani" <fromani at redhat.com>, "Federico Simoncelli" <fsimonce at redhat.com>, "Dan
> Kenigsberg" <danken at redhat.com>
> Sent: Friday, November 21, 2014 4:46:13 PM
> Subject: Re: contEIOVMs regression?
> 
> On 20/11/14 17:37 -0500, Nir Soffer wrote:
> >----- Original Message -----
> >> From: "Adam Litke" <alitke at redhat.com>
> >> To: devel at ovirt.org
> >> Cc: "Nir Soffer" <nsoffer at redhat.com>, "Francesco Romani"
> >> <fromani at redhat.com>, "Federico Simoncelli"
> >> <fsimonce at redhat.com>, "Dan Kenigsberg" <danken at redhat.com>
> >> Sent: Thursday, November 20, 2014 9:15:33 PM
> >> Subject: contEIOVMs regression?
> >>
> >> Hi list,
> >>
> >> I am taking a look at Bug 1157421 [1] which describes a situation
> >> where VM's that are paused with an -EIO error are not automatically
> >> resumed after the problem with storage has been corrected.  I have
> >> some patches [2] on gerrit that resolve the problem.  Since this
> >> appears to be a regression I am looking at a non-intrusive way to fix
> >> it in the 3.5 branch.  There is some disagreement on the proper way to
> >> fix this so I am hoping we can arrive at a solution through an open
> >> discussion.
> >>
> >> The main issue at hand is with the Event/Callback mechanism we use to
> >> call clientIF.contEIOVMs.  According to my experiments and this online
> >> discussion [3] weakref does not work for instance methods such as
> >> clientIF.contEIOVMs.  Our Event class uses weakref to prevent it from
> >> holding references to registered callback functions.
> >
> >Why making event system more correct is required to tix [1]?
> >
> >>
> >> I see two easy ways to fix the regression:
> >
> >I don't follow, what is the regression?
> 
> Assuming that at some point contEIOVMs actually worked and was able to
> automatically resume VMs, then we have a regression because given the
> weakref problems I am describing herein there is no way that it is
> working now.  The only way we don't have a regression is if this code
> has never worked to begin with.

The current code is master do work - when I fixed this last time, the problem
was that we did not register the callback before starting the monitors, and 
that the monitors did not issue a state change on the first time a monitor
check the domain state.

I verified that contEIOVMs is called and that it does try to continue vms.

If this does not break now (with current code), please open a bug.

> 
> >> 1) Treat clientIF as a singleton class (which it is) and make
> >> contEIOVMs a module-level method which gets the clientIF instance
> >> and calls it's bound contEIOVMs method.  See my patches [2] for the
> >> code behind this idea.
> >
> >This is the wrong direction. There is only one place using that horrible
> >getInstance(), and it also could just create the single instance that we
> >need. We should remove getInstance() instead of using it in new code.
> >
> >>
> >> 2) Allow Event to maintain a strong reference on the bound
> >> clientIF.contEIOVMs method.  This will allow the current code to work
> >> as designed but will change the Event implementation to accomodate
> >> this specific use case.  Since no one else appears to be using this
> >> code, it should have no functional impact.
> >
> >The code is already holding a strong reference now, no change is
> >needed :-)
> 
> I disagree.  From vdsm/storage/misc.py:
> class Event(object):
> ...
>     def register(self, func, oneshot=False):
>         with self._syncRoot:
>             self._registrar[id(func)] = (weakref.ref(func), oneshot)
> ...
>             # ^^^ "He's dead Jim"
> 
> The function is converted into a weak reference.  Since, in this case,
> the function is an instance method, the reference is immediately dead
> on arrival.  I have verified this with debugging statements in my
> environment.

So you suggest that taking a weakref to an instance method returns
a dead reference?

I thought that the problem is instance method keep hard reference to
the instance, so the weakref is useless.

Nir



More information about the Devel mailing list