[ovirt-devel] Where's MOM (on latest master)

Michal Skrivanek michal.skrivanek at redhat.com
Fri Nov 18 11:55:09 UTC 2016


> On 18 Nov 2016, at 12:35, Martin Sivak <msivak at redhat.com> wrote:
> 
>> I don't think it is related to version X or Y. It is a race, so might be
>> related to other factors.
> 
> It never (seriously: NEVER) happened with xml-rpc before 4.0.5.

That is surprising, but we also didn’t have Lago before ;-)

> 
>> likely because json-rpc is initialized after xml-rpc….or indeed whatever
>> else;-)
> 
> But this is not about jsonrpc. The socket itself is shared according
> to what Piotr said.

it is

> 
>> btw you likely still want to have a retry in mom once it
>> starts responding due to delayed vdsm async recovery taking potentially
>> minutes
> 
> We handle this already. The only issue is the connection refused state.

Then why don’t you handle the connection refused state as well? Isn’t that a simple fix?
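
For illustration only, a minimal sketch of the kind of retry I mean, assuming a plain TCP connection. The function name, the defaults and the address are all hypothetical; this is not MOM's actual connection code, and it uses Python 3's ConnectionRefusedError for brevity.

```python
import socket
import time


def connect_with_retry(address, attempts=10, delay=0.5, max_delay=30.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to connect, retrying while the peer refuses the connection.

    Hypothetical helper: the name and the default values are assumptions
    made for this sketch, not taken from MOM or VDSM.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect(address)
        except ConnectionRefusedError:
            # Nobody is listening yet; give up only after the last attempt.
            if attempt == attempts:
                raise
            sleep(delay)
            # Double the wait each time, up to max_delay.
            delay = min(delay * 2, max_delay)
```

A caller would use something like connect_with_retry(("localhost", port)) with whatever port the service listens on, instead of quitting on the first refused connection. The connect and sleep parameters are there only so the loop can be exercised without a real socket.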

> 
> 
> Martin
> 
> 
> On Fri, Nov 18, 2016 at 12:19 PM, Michal Skrivanek
> <michal.skrivanek at redhat.com> wrote:
>> 
>> On 18 Nov 2016, at 12:12, Oved Ourfali <oourfali at redhat.com> wrote:
>> 
>> I don't think it is related to version X or Y. It is a race, so might be
>> related to other factors.
>> 
>> 
>> likely because json-rpc is initialized after xml-rpc….or indeed whatever
>> else;-)
>> 
>> either way it needs to be solved. Either by improving the systemd service
>> file or mom retry (btw you likely still want to have a retry in mom once it
>> starts responding due to delayed vdsm async recovery taking potentially
>> minutes)
>> 
>> 
>> On Nov 18, 2016 12:59 PM, "Martin Sivak" <msivak at redhat.com> wrote:
>>> 
>>>> Are we / can we use systemd socket activation there?
>>> 
>>> That actually requires systemd specific code iirc (to take over the
>>> standing by socket). I am actually wondering why the xml-rpc in 4.0.4
>>> was fine and json-rpc in 4.0.6 is too slow.
>>> 
>>> Martin
>>> 
>>> On Fri, Nov 18, 2016 at 11:53 AM, Anton Marchukov <amarchuk at redhat.com>
>>> wrote:
>>>> Hello All.
>>>> 
>>>> Are we / can we use systemd socket activation there?
>>>> 
>>>> Anton.
>>>> 
>>>> On Fri, Nov 18, 2016 at 11:21 AM, Martin Sivak <msivak at redhat.com>
>>>> wrote:
>>>>> 
>>>>> What about making vdsm ready to answer connections when it returns to
>>>>> systemd instead? I hate workarounds and this always worked fine.
>>>>> 
>>>>> Martin
>>>>> 
>>>>> On Fri, Nov 18, 2016 at 11:13 AM, Oved Ourfali <oourfali at redhat.com>
>>>>> wrote:
>>>>>> Seems like a race regardless of the protocol.
>>>>>> Should you add a retry?
>>>>>> 
>>>>>> 
>>>>>> On Nov 18, 2016 11:52 AM, "Martin Sivak" <msivak at redhat.com> wrote:
>>>>>>> 
>>>>>>> Yes, because VDSM is supposed to be up (there is systemd
>>>>>>> dependency).
>>>>>>> This always worked fine with xml-rpc.
>>>>>>> 
>>>>>>> Martin
>>>>>>> 
>>>>>>> On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer <nsoffer at redhat.com>
>>>>>>> wrote:
>>>>>>>> On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak <msivak at redhat.com>
>>>>>>>> wrote:
>>>>>>>>> This happens because MOM can't connect to VDSM and so it quits.
>>>>>>>> 
>>>>>>>> So MOM tries to connect once, and if the connection fails it quits?
>>>>>>>> 
>>>>>>>>> We discussed it on the mailing list:
>>>>>>>>> 
>>>>>>>>> https://lists.fedoraproject.org/archives/list/vdsm-devel@lists.fedorahosted.org/thread/MZ7UJUWO5KFRDJJDNXX7VIYU5PWSXF62/
>>>>>>>>> http://lists.ovirt.org/pipermail/devel/2016-November/014101.html
>>>>>>>>> 
>>>>>>>>> This issue never happened with XML-RPC.
>>>>>>>>> 
>>>>>>>>> Shira reported it as
>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1393012
>>>>>>>>> 
>>>>>>>>> Martin
>>>>>>>>> 
>>>>>>>>> On Thu, Nov 17, 2016 at 7:42 PM, Yaniv Kaul <ykaul at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>> I've recently seen, including now on Master, the following
>>>>>>>>>> warnings:
>>>>>>>>>> Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Started MOM instance configured for VDSM purposes.
>>>>>>>>>> Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Starting MOM instance configured for VDSM purposes...
>>>>>>>>>> Nov 17 13:33:35 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, Policy could not be set.
>>>>>>>>>> Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>> Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>> Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>> Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>> Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>> Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>> Nov 17 13:35:12 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Any ideas what this is and why?
>>>>>>>>>> 
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Devel mailing list
>>>>>>>>>> Devel at ovirt.org
>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Anton Marchukov
>>>> Senior Software Engineer - RHEV CI - Red Hat
>>>> 
>> 
>> 
>> 



