[ovirt-devel] Where's MOM (on latest master)

Martin Sivak msivak at redhat.com
Fri Nov 18 12:14:06 UTC 2016


> then why don’t you handle the connection state as well? isn’t that a simple fix?

VDSM socket availability during startup is probably the most important
requirement for MOM and the whole service is based around that
assumption. We could handle that differently, but letting the service
crash saves us a lot of code, since this failure should not happen in
the first place. We simply do not need code that decides whether a
failure during startup is permanent or temporary.
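
For illustration only, handling the "connection refused" state could look
roughly like the sketch below. It is not MOM's code: the function name, the
attempt count and the delay are made up for this example, and it assumes
Python 3 (for ConnectionRefusedError). It retries only on a refused
connection and gives up after a bounded number of attempts, so a permanent
failure still ends in a crash.

```python
import socket
import time


def connect_with_retry(address, attempts=5, delay=1.0,
                       connect=socket.create_connection, sleep=time.sleep):
    """Connect to `address`, retrying only while the peer refuses connections.

    Hypothetical helper: `attempts` and `delay` are illustrative defaults.
    Any other error propagates immediately; after the last refused attempt
    the ConnectionRefusedError is re-raised so the caller can still fail.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect(address)
        except ConnectionRefusedError as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error
```

The bounded loop is what replaces the "permanent or temporary" decision: the
failure is treated as temporary until the attempts run out.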

XML-RPC was easy as it was stateless (new request for every call).

JSON-RPC is a bit harder as it keeps the socket internally. Does the
client reconnect by itself btw? What happens when we use it after a
socket error?
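
To make the question concrete: if the client does not reconnect on its own,
the caller would need something like the wrapper sketched below. Everything
here is hypothetical (the class name, the connect callable, the call()
method); it does not describe the actual VDSM JSON-RPC client, whose
behaviour after a socket error is exactly what is being asked. Python 3 is
assumed.

```python
class ReconnectingClient:
    """Hypothetical wrapper that keeps one connection and rebuilds it on error."""

    def __init__(self, connect):
        # `connect` is any callable returning an object with a .call() method.
        self._connect = connect
        self._conn = None

    def call(self, method, *args):
        for attempt in range(2):
            if self._conn is None:
                self._conn = self._connect()
            try:
                return self._conn.call(method, *args)
            except OSError:
                # The cached connection is unusable: drop it, then retry once
                # with a fresh connection before giving up.
                self._conn = None
                if attempt == 1:
                    raise
```

With a stateless transport this wrapper is unnecessary, because every call
opens its own connection anyway.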

Martin

On Fri, Nov 18, 2016 at 12:55 PM, Michal Skrivanek
<michal.skrivanek at redhat.com> wrote:
>
>> On 18 Nov 2016, at 12:35, Martin Sivak <msivak at redhat.com> wrote:
>>
>>> I don't think it is related to version X or Y. It is a race, so might be
>>> related to other factors.
>>
>> It never (seriously: NEVER) happened with xml-rpc before 4.0.5.
>
> that is surprising
> but we also didn’t have lago before;-)
>
>>
>>> likely because json-rpc is initialized after xml-rpc….or indeed whatever
>>> else;-)
>>
>> But this is not about jsonrpc. The socket itself is shared according
>> to what Piotr said.
>
> it is
>
>>
>>> btw you likely still want to have a retry in mom once it
>>> starts responding due to delayed vdsm async recovery taking potentially
>>> minutes
>>
>> We handle this already. The only issue is the connection refused state.
>
> then why don’t you handle the connection state as well? isn’t that a simple fix?
>
>>
>>
>> Martin
>>
>>
>> On Fri, Nov 18, 2016 at 12:19 PM, Michal Skrivanek
>> <michal.skrivanek at redhat.com> wrote:
>>>
>>> On 18 Nov 2016, at 12:12, Oved Ourfali <oourfali at redhat.com> wrote:
>>>
>>> I don't think it is related to version X or Y. It is a race, so might be
>>> related to other factors.
>>>
>>>
>>> likely because json-rpc is initialized after xml-rpc….or indeed whatever
>>> else;-)
>>>
>>> either way it needs to be solved. Either by improving the systemd service
>>> file or mom retry (btw you likely still want to have a retry in mom once it
>>> starts responding due to delayed vdsm async recovery taking potentially
>>> minutes)
>>>
>>>
>>> On Nov 18, 2016 12:59 PM, "Martin Sivak" <msivak at redhat.com> wrote:
>>>>
>>>>> Are we / can we use systemd socket activation there?
>>>>
>>>> That actually requires systemd specific code iirc (to take over the
>>>> standing by socket). I am actually wondering why the xml-rpc in 4.0.4
>>>> was fine and json-rpc in 4.0.6 is too slow.
>>>>
>>>> Martin
>>>>
>>>> On Fri, Nov 18, 2016 at 11:53 AM, Anton Marchukov <amarchuk at redhat.com>
>>>> wrote:
>>>>> Hello All.
>>>>>
>>>>> Are we / can we use systemd socket activation there?
>>>>>
>>>>> Anton.
>>>>>
>>>>> On Fri, Nov 18, 2016 at 11:21 AM, Martin Sivak <msivak at redhat.com>
>>>>> wrote:
>>>>>>
>>>>>> What about making vdsm ready to answer connections when it returns to
>>>>>> systemd instead? I hate workarounds and this always worked fine.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>> On Fri, Nov 18, 2016 at 11:13 AM, Oved Ourfali <oourfali at redhat.com>
>>>>>> wrote:
>>>>>>> Seems like a race regardless of the protocol.
>>>>>>> Should you add a retry?
>>>>>>>
>>>>>>>
>>>>>>> On Nov 18, 2016 11:52 AM, "Martin Sivak" <msivak at redhat.com> wrote:
>>>>>>>>
>>>>>>>> Yes, because VDSM is supposed to be up (there is systemd
>>>>>>>> dependency).
>>>>>>>> This always worked fine with xml-rpc.
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer <nsoffer at redhat.com>
>>>>>>>> wrote:
>>>>>>>>> On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak <msivak at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>> This happens because MOM can't connect to VDSM and so it quits.
>>>>>>>>>
>>>>>>>>> So MOM tries to connect once, and if the connection fails it quits?
>>>>>>>>>
>>>>>>>>>> We discussed it on the mailing list:
>>>>>>>>>>
>>>>>>>>>> https://lists.fedoraproject.org/archives/list/vdsm-devel@lists.fedorahosted.org/thread/MZ7UJUWO5KFRDJJDNXX7VIYU5PWSXF62/
>>>>>>>>>> http://lists.ovirt.org/pipermail/devel/2016-November/014101.html
>>>>>>>>>>
>>>>>>>>>> This issue never happened with XML-RPC.
>>>>>>>>>>
>>>>>>>>>> Shira reported it as
>>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1393012
>>>>>>>>>>
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>> On Thu, Nov 17, 2016 at 7:42 PM, Yaniv Kaul <ykaul at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> I've recently seen, including now on Master, the following warnings:
>>>>>>>>>>> Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Started MOM instance configured for VDSM purposes.
>>>>>>>>>>> Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Starting MOM instance configured for VDSM purposes...
>>>>>>>>>>> Nov 17 13:33:35 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, Policy could not be set.
>>>>>>>>>>> Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>> Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>>> Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>> Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>>> Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>> Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>>> Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>> Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>>> Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>> Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>>> Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>> Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
>>>>>>>>>>> Nov 17 13:35:12 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Any ideas what this is and why?
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Devel mailing list
>>>>>>>>>>> Devel at ovirt.org
>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Anton Marchukov
>>>>> Senior Software Engineer - RHEV CI - Red Hat
>>>>>
>>>
>>>
>>>
>
