[ovirt-devel] contEIOVMs regression?

Adam Litke alitke at redhat.com
Fri Nov 21 15:28:07 UTC 2014


On 21/11/14 10:03 -0500, Nir Soffer wrote:
>----- Original Message -----
>> From: "Adam Litke" <alitke at redhat.com>
>> To: "Nir Soffer" <nsoffer at redhat.com>
>> Cc: devel at ovirt.org, "Francesco Romani" <fromani at redhat.com>, "Federico Simoncelli" <fsimonce at redhat.com>, "Dan
>> Kenigsberg" <danken at redhat.com>
>> Sent: Friday, November 21, 2014 4:46:13 PM
>> Subject: Re: contEIOVMs regression?
>>
>> On 20/11/14 17:37 -0500, Nir Soffer wrote:
>> >----- Original Message -----
>> >> From: "Adam Litke" <alitke at redhat.com>
>> >> To: devel at ovirt.org
>> >> Cc: "Nir Soffer" <nsoffer at redhat.com>, "Francesco Romani"
>> >> <fromani at redhat.com>, "Federico Simoncelli"
>> >> <fsimonce at redhat.com>, "Dan Kenigsberg" <danken at redhat.com>
>> >> Sent: Thursday, November 20, 2014 9:15:33 PM
>> >> Subject: contEIOVMs regression?
>> >>
>> >> Hi list,
>> >>
>> >> I am taking a look at Bug 1157421 [1] which describes a situation
>> >> where VM's that are paused with an -EIO error are not automatically
>> >> resumed after the problem with storage has been corrected.  I have
>> >> some patches [2] on gerrit that resolve the problem.  Since this
>> >> appears to be a regression I am looking at a non-intrusive way to fix
>> >> it in the 3.5 branch.  There is some disagreement on the proper way to
>> >> fix this so I am hoping we can arrive at a solution through an open
>> >> discussion.
>> >>
>> >> The main issue at hand is with the Event/Callback mechanism we use to
>> >> call clientIF.contEIOVMs.  According to my experiments and this online
>> >> discussion [3] weakref does not work for instance methods such as
>> >> clientIF.contEIOVMs.  Our Event class uses weakref to prevent it from
>> >> holding references to registered callback functions.
>> >
>> >Why making event system more correct is required to tix [1]?
>> >
>> >>
>> >> I see two easy ways to fix the regression:
>> >
>> >I don't follow, what is the regression?
>>
>> Assuming that at some point contEIOVMs actually worked and was able to
>> automatically resume VMs, then we have a regression because given the
>> weakref problems I am describing herein there is no way that it is
>> working now.  The only way we don't have a regression is if this code
>> has never worked to begin with.
>
>The current code is master do work - when I fixed this last time, the problem
>was that we did not register the callback before starting the monitors, and
>that the monitors did not issue a state change on the first time a monitor
>check the domain state.
>
>I verified that contEIOVMs is called and that it does try to continue vms.

Very curious.  I am working with 3.5.0.  The main difference (other
than branch) is that I am working in an environment with no connected
storage pool.  Though I still can't see how the weakref stuff could be
working in master.

>If this does not break now (with current code), please open a bug.
>
>>
>> >> 1) Treat clientIF as a singleton class (which it is) and make
>> >> contEIOVMs a module-level method which gets the clientIF instance
>> >> and calls it's bound contEIOVMs method.  See my patches [2] for the
>> >> code behind this idea.
>> >
>> >This is the wrong direction. There is only one place using that horrible
>> >getInstance(), and it also could just create the single instance that we
>> >need. We should remove getInstance() instead of using it in new code.
>> >
>> >>
>> >> 2) Allow Event to maintain a strong reference on the bound
>> >> clientIF.contEIOVMs method.  This will allow the current code to work
>> >> as designed but will change the Event implementation to accomodate
>> >> this specific use case.  Since no one else appears to be using this
>> >> code, it should have no functional impact.
>> >
>> >The code is already holding a strong reference now, no change is
>> >needed :-)
>>
>> I disagree.  From vdsm/storage/misc.py:
>> class Event(object):
>> ...
>>     def register(self, func, oneshot=False):
>>         with self._syncRoot:
>>             self._registrar[id(func)] = (weakref.ref(func), oneshot)
>> ...
>>             # ^^^ "He's dead Jim"
>>
>> The function is converted into a weak reference.  Since, in this case,
>> the function is an instance method, the reference is immediately dead
>> on arrival.  I have verified this with debugging statements in my
>> environment.
>
>So you suggest that taking a weakref to an instance method returns
>a dead reference?
>
>I thought that the problem is instance method keep hard reference to
>the instance, so the weakref is useless.

Yeah, try out this test program to see what I mean:


#!/usr/bin/env python

import weakref
from functools import partial

class A(object):
    def __init__(self):
        self.r1 = weakref.ref(self.a)
        self.r2 = partial(A.a, weakref.proxy(self))


    def a(self):
        print "Hello from a"

def main():
    obj = A()
    print obj.r1
    obj.r2()

if __name__ == '__main__':
    main()

-- 
Adam Litke



More information about the Devel mailing list