Where's MOM (on latest master)

I've recently seen, including now on Master, the following warnings:

Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Started MOM instance configured for VDSM purposes.
Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Starting MOM instance configured for VDSM purposes...
Nov 17 13:33:35 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, Policy could not be set.
Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
Nov 17 13:35:12 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.

Any ideas what this is and why?

This happens because MOM can't connect to VDSM, so it quits. We discussed it on the mailing list:

https://lists.fedoraproject.org/archives/list/vdsm-devel@lists.fedorahosted....
http://lists.ovirt.org/pipermail/devel/2016-November/014101.html

This issue never happened with XML-RPC.

Shira reported it as https://bugzilla.redhat.com/show_bug.cgi?id=1393012

Martin

On Thu, Nov 17, 2016 at 7:42 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
> Any ideas what this is and why?
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak <msivak@redhat.com> wrote:
> This happens because MOM can't connect to VDSM and so it quits.

So MOM tries to connect once, and if the connection fails it quits?

Yes, because VDSM is supposed to be up (there is a systemd dependency). This always worked fine with XML-RPC.

Martin

On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer <nsoffer@redhat.com> wrote:
> So MOM tries to connect once, and if the connection fails it quits?

Seems like a race regardless of the protocol. Should you add a retry?

On Nov 18, 2016 11:52 AM, "Martin Sivak" <msivak@redhat.com> wrote:
> Yes, because VDSM is supposed to be up (there is a systemd dependency). This always worked fine with XML-RPC.

What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine.

Martin

On Fri, Nov 18, 2016 at 11:13 AM, Oved Ourfali <oourfali@redhat.com> wrote:
> Seems like a race regardless of the protocol. Should you add a retry?
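
One way to get the behaviour Martin asks for, where systemd treats the service as started only once it can answer connections, is systemd's readiness notification: a unit declared with `Type=notify` is considered up only after the daemon sends `READY=1` to the socket named in `$NOTIFY_SOCKET`. Nothing in this thread says VDSM is set up this way; the function name `notify_ready` is hypothetical and the sketch below only illustrates the notification step itself.

```python
import os
import socket


def notify_ready(env=os.environ):
    """Send READY=1 to systemd (sd_notify protocol, for Type=notify units).

    Returns True if a notification was sent, or False when the process
    was not started with a NOTIFY_SOCKET in its environment.
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", path)
    return True
```

The daemon would call this only after its listening socket is bound and it can answer requests, so that units ordered after it (MOM in this thread) start against a reachable service.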

Discuss it with the infra guys and I'm sure you'll get the reasons, and will figure out a solution together.

On Nov 18, 2016 12:21 PM, "Martin Sivak" <msivak@redhat.com> wrote:
> What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine.

In my opinion this issue is protocol agnostic. We may want to have retry logic to make sure the client can "talk" to vdsm. It seems that we could have this logic in the client code (disabled by default), and client code could enable it if needed.

On Fri, Nov 18, 2016 at 11:25 AM, Oved Ourfali <oourfali@redhat.com> wrote:
> Discuss it with the infra guys and I'm sure you'll get the reasons, and will figure out a solution together.

On Nov 18, 2016 12:21 PM, "Martin Sivak" <msivak@redhat.com> wrote:
> What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine.

I am not so sure whether it will be so simple to do it. Recovery can take some time and during this time vdsm is not functional. Interesting issue found [1].

[1] https://bugzilla.redhat.com/1396183
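
The opt-in retry described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the VDSM client code: `connect_with_retry` and its parameters are hypothetical, and a plain TCP connect stands in for whatever transport the real client uses. With the default `retries=0` the first failure is raised immediately, which matches "disabled by default".

```python
import socket
import time


def connect_with_retry(address, retries=0, delay=1.0,
                       connect=socket.create_connection):
    """Connect to `address`, retrying after connection failures.

    retries=0 (the default) means no retry at all, so callers have to
    opt in. `connect` is injectable so other transports can be used.
    """
    attempt = 0
    while True:
        try:
            return connect(address)
        except OSError:
            # Out of attempts: re-raise the last connection error.
            if attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)
```

A caller such as MOM could then pass, for example, `retries=5` at startup, while other clients keep the fail-fast default.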

On Fri, Nov 18, 2016 at 11:47 AM, Piotr Kliczewski <pkliczew@redhat.com> wrote:
> I am not so sure whether it will be so simple to do it. Recovery can take some time and during this time vdsm is not functional. Interesting issue found [1].

MOM has no issue with a VDSM that reports the recovering or reinitializing code (99, IIRC). It just needs to get an answer.

Martin
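
The distinction Martin draws, between "VDSM answered, even if only to say it is recovering" and "no answer at all", can be sketched as below. The helper names are hypothetical, the reply is assumed to be a decoded JSON-RPC 2.0 object, and the code 99 is taken from the message above, where it is quoted from memory.

```python
RECOVERING = 99  # quoted from memory in the thread ("99, IIRC"); unverified


def vdsm_answered(response):
    """Return True if `response` shows that VDSM answered at all.

    `response` is a decoded JSON-RPC 2.0 reply (a dict), or None when no
    reply arrived. An error reply still counts as an answer.
    """
    if response is None:
        return False
    return "result" in response or "error" in response


def is_recovering(response):
    """Return True if the reply is an error carrying the recovering code."""
    error = (response or {}).get("error") or {}
    return error.get("code") == RECOVERING
```

Under this reading, a client would keep running when `is_recovering` is true and only treat a missing reply as "VDSM not available".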

Hello All.

Are we using, or can we use, systemd socket activation there?

Anton.

On Fri, Nov 18, 2016 at 11:21 AM, Martin Sivak <msivak@redhat.com> wrote:
> What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine.
--
Anton Marchukov
Senior Software Engineer - RHEV CI - Red Hat

On Fri, Nov 18, 2016 at 11:53 AM, Anton Marchukov <amarchuk@redhat.com> wrote:
> Are we / can we use systemd socket activation there?

That actually requires systemd-specific code, IIRC (to take over the socket that systemd keeps standing by). I am actually wondering why XML-RPC in 4.0.4 was fine and JSON-RPC in 4.0.6 is too slow.

Martin
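
The "systemd-specific code" in question is fairly small. With socket activation, systemd binds the listening socket itself and hands it to the service as an inherited file descriptor, described by the `LISTEN_PID` and `LISTEN_FDS` environment variables, with descriptors numbered from 3. The sketch below shows that hand-over in isolation; the function names are hypothetical and nothing here is taken from VDSM.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def inherited_fds(env=os.environ, pid=None):
    """Return the descriptor numbers passed in by systemd socket activation.

    systemd sets LISTEN_PID to the PID of the activated process and
    LISTEN_FDS to the number of sockets passed. An empty list means the
    process was not socket-activated.
    """
    pid = os.getpid() if pid is None else pid
    try:
        if int(env.get("LISTEN_PID", "")) != pid:
            return []
        count = int(env.get("LISTEN_FDS", ""))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def adopt_listening_socket(fd):
    """Wrap an inherited descriptor; family and type are read from the fd."""
    return socket.socket(fileno=fd)
```

Because systemd holds the socket open before the service starts, clients that connect early are queued rather than refused, which is why the approach is relevant to the startup race discussed here.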
> > _______________________________________________ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Anton Marchukov Senior Software Engineer - RHEV CI - Red Hat
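For readers unfamiliar with the "systemd-specific code" mentioned above, here is a minimal sketch of what taking over a socket from systemd could look like. It is not VDSM code: the function names and the fallback `path` argument are hypothetical, and it assumes Python 3.7+ (so that `socket.socket(fileno=...)` detects the socket family itself). It relies only on the documented socket-activation convention that systemd sets `LISTEN_PID` and `LISTEN_FDS` and passes descriptors starting at number 3.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor in the systemd convention


def inherited_fds(environ=None, pid=None):
    """Return descriptor numbers handed over by systemd, or [] if there are none."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # LISTEN_PID must match our own PID, otherwise the fds are not for us.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        # Variables missing or malformed: not socket-activated.
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def listening_socket(path):
    """Adopt the socket systemd is holding, else bind one ourselves.

    `path` is a hypothetical UNIX socket path used only for the fallback.
    """
    fds = inherited_fds()
    if fds:
        # systemd already listens on this socket, so clients that connected
        # before we started are queued instead of refused.
        return socket.socket(fileno=fds[0])
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(5)
    return sock
```

The point of the design is that the socket exists before the daemon does, so a client started right after the service would wait in the backlog rather than fail to connect.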

Hello Martin.

It does. But from 4.0, as far as I can see, we support only systemd-enabled distros, so it might make sense.

Anton.

On Fri, Nov 18, 2016 at 11:59 AM, Martin Sivak <msivak@redhat.com> wrote:
> That actually requires systemd specific code iirc (to take over the standing by socket).
-- Anton Marchukov Senior Software Engineer - RHEV CI - Red Hat

I don't think it is related to version X or Y. It is a race, so it might be related to other factors.

On Nov 18, 2016 12:59 PM, "Martin Sivak" <msivak@redhat.com> wrote:
> I am actually wondering why the xml-rpc in 4.0.4 was fine and json-rpc in 4.0.6 is too slow.

On 18 Nov 2016, at 12:12, Oved Ourfali <oourfali@redhat.com> wrote:
> I don't think it is related to version X or Y. It is a race, so might be related to other factors.

likely because json-rpc is initialized after xml-rpc… or indeed whatever else ;-)

either way it needs to be solved. Either by improving the systemd service file or mom retry (btw you likely still want to have a retry in mom once it starts responding due to delayed vdsm async recovery taking potentially minutes)
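As an illustration of the "mom retry" option, here is a minimal sketch of a connect loop with exponential backoff. It is not MOM's actual code: `connect_with_retry`, its parameters, and the `connect` callable are hypothetical names chosen for this example, and the only assumption is that a failed connection surfaces as an `OSError` (for instance `ConnectionRefusedError`).

```python
import time


def connect_with_retry(connect, attempts=10, delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are exhausted.

    `connect` is any callable that returns a connection object and raises
    OSError while the server is not yet accepting connections.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == attempts:
                raise  # give up and let the caller (or supervisor) decide
            sleep(delay)
            # Double the wait each time, capped so a long recovery on the
            # server side does not push the delay up without bound.
            delay = min(delay * 2, max_delay)
```

The same loop could also cover the recovery case mentioned above if a "still recovering" reply were turned into a retryable exception inside `connect`; how that reply looks is not specified in this thread.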

On Fri, Nov 18, 2016 at 12:21 PM, Martin Sivak <msivak@redhat.com> wrote:
> What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine.

This is clearly a mom bug. Mom must have a retry mechanism and must not expect vdsm to be ready to accept connections when mom starts.

Vdsm can be nicer and notify systemd when vdsm is ready. I already mentioned it in the other thread here:
http://lists.ovirt.org/pipermail/devel/2016-November/014104.html

But even if vdsm did this, mom still needs a retry mechanism.

Nir
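To make the "notify systemd when vdsm is ready" idea concrete, here is a minimal sketch of the readiness notification protocol in plain Python. It is not taken from VDSM: the function name is hypothetical, and it assumes the service would be declared with `Type=notify` so that systemd sets `NOTIFY_SOCKET` and waits for a `READY=1` datagram before starting dependent units.

```python
import os
import socket


def sd_notify(state, environ=None):
    """Send a state string such as "READY=1" to systemd.

    Returns False when NOTIFY_SOCKET is not set, i.e. when the process is
    not running under a supervisor that expects notifications.
    """
    environ = os.environ if environ is None else environ
    addr = environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket, which is
        # addressed with a leading NUL byte instead.
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), addr)
    return True
```

A server would call `sd_notify("READY=1")` only after its listener is accepting connections, so units ordered after it are not started too early. As noted in the message above, a client-side retry would still be needed for failures that happen later.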
Martin
On Fri, Nov 18, 2016 at 11:13 AM, Oved Ourfali <oourfali@redhat.com> wrote:
Seems like a race regardless of the protocol. Should you add a retry?
On Nov 18, 2016 11:52 AM, "Martin Sivak" <msivak@redhat.com> wrote:
Yes, because VDSM is supposed to be up (there is systemd dependency). This always worked fine with xml-rpc.
Martin
On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer <nsoffer@redhat.com> wrote:
On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak <msivak@redhat.com> wrote:
This happens because MOM can't connect to VDSM and so it quits.
So mom try once to connect and if the connection fails it quits?
We discussed it on the mailinglist
https://lists.fedoraproject.org/archives/list/vdsm-devel@lists.fedorahosted.... http://lists.ovirt.org/pipermail/devel/2016-November/014101.html
This issue never happened with XML-RPC.
Shira reported it as https://bugzilla.redhat.com/show_bug.cgi?id=1393012
Martin
On Thu, Nov 17, 2016 at 7:42 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
I've recently seen, including now on Master, the following warnings: Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Started MOM instance configured for VDSM purposes. Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Starting MOM instance configured for VDSM purposes... Nov 17 13:33:35 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, Policy could not be set. Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available. Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing. Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available. Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing. Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available. Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing. Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available. Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing. Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available. Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing. Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available. Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing. Nov 17 13:35:12 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
Any ideas what this is and why?
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

But even if vdsm did this, mom still needs a retry mechanism.
No, it doesn't. Systemd handles that just fine (though I might increase the retry count in the service file). I rather like the supervised way of handling errors in "let it crash" projects; it simplifies the code tremendously. MOM has almost no state, so a plain restart is the right thing to do. You might be surprised how nice this approach can be (Erlang is built around supervisors and crashing on error, and is used to achieve a couple of nines in telco).
Martin
On Fri, Nov 18, 2016 at 12:14 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Fri, Nov 18, 2016 at 12:21 PM, Martin Sivak <msivak@redhat.com> wrote:
What about making vdsm ready to answer connections when it returns to systemd instead? I hate workarounds and this always worked fine.
This is clearly a mom bug. Mom must have a retry mechanism and must not expect vdsm to be ready to accept connections when mom starts.
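The kind of retry being argued for here could look roughly like the following. This is a minimal sketch, not mom's actual code: `connect_with_retry` and the `connect` callable are hypothetical names, and the attempt count and delays are arbitrary assumptions.

```python
import time


def connect_with_retry(connect, attempts=5, initial_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are exhausted.

    `connect` is any zero-argument callable that raises OSError on failure
    (hypothetical; it stands in for whatever opens the connection to vdsm).
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == attempts:
                # Out of attempts: re-raise and let the caller (or a
                # supervisor such as systemd) decide what to do next.
                raise
            sleep(delay)
            # Exponential backoff, capped at max_delay seconds.
            delay = min(delay * 2, max_delay)
```

With a bounded number of attempts the process still exits if vdsm never comes up, so a supervisor restart remains the last line of defence rather than the only one.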
Vdsm can be nicer and notify systemd when vdsm is ready. I already mentioned it in the other thread here: http://lists.ovirt.org/pipermail/devel/2016-November/014104.html
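For illustration, readiness notification uses systemd's `sd_notify` protocol: the service sends `READY=1` as a datagram to the Unix socket named in the `NOTIFY_SOCKET` environment variable. The sketch below is an assumption about how this could be done with only the Python standard library; `notify_ready` is a hypothetical helper, not something taken from vdsm, and it only has an effect if the unit is declared with `Type=notify`.

```python
import os
import socket


def notify_ready(env=os.environ):
    """Send READY=1 to the service manager; return True if a message was sent."""
    path = env.get("NOTIFY_SOCKET")
    if not path:
        # Not started by systemd with Type=notify; nothing to do.
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", path)
    return True
```

The call would belong after the listening socket is actually accepting connections, so that units ordered after the service start only once a connection attempt can succeed.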
But even if vdsm did this, mom still needs a retry mechanism.
Nir
Martin
On Fri, Nov 18, 2016 at 11:13 AM, Oved Ourfali <oourfali@redhat.com> wrote:
Seems like a race regardless of the protocol. Should you add a retry?
On Nov 18, 2016 11:52 AM, "Martin Sivak" <msivak@redhat.com> wrote:
Yes, because VDSM is supposed to be up (there is a systemd dependency). This always worked fine with XML-RPC.
Martin
On Fri, Nov 18, 2016 at 10:14 AM, Nir Soffer <nsoffer@redhat.com> wrote:
On Fri, Nov 18, 2016 at 10:45 AM, Martin Sivak <msivak@redhat.com> wrote:
This happens because MOM can't connect to VDSM and so it quits.
So mom tries to connect once, and if the connection fails it quits?
We discussed it on the mailing list:
https://lists.fedoraproject.org/archives/list/vdsm-devel@lists.fedorahosted.... http://lists.ovirt.org/pipermail/devel/2016-November/014101.html
This issue never happened with XML-RPC.
Shira reported it as https://bugzilla.redhat.com/show_bug.cgi?id=1393012
Martin
On Thu, Nov 17, 2016 at 7:42 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
> I've recently seen, including now on Master, the following warnings:
> Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Started MOM instance configured for VDSM purposes.
> Nov 17 13:33:25 lago-basic-suite-master-host0 systemd[1]: Starting MOM instance configured for VDSM purposes...
> Nov 17 13:33:35 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, Policy could not be set.
> Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
> Nov 17 13:33:39 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
> Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
> Nov 17 13:33:55 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
> Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
> Nov 17 13:34:10 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
> Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
> Nov 17 13:34:26 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
> Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
> Nov 17 13:34:42 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
> Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
> Nov 17 13:34:57 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available, KSM stats will be missing.
> Nov 17 13:35:12 lago-basic-suite-master-host0 vdsm[2012]: vdsm MOM WARN MOM not available.
>
> Any ideas what this is and why?
participants (7)
- Anton Marchukov
- Martin Sivak
- Michal Skrivanek
- Nir Soffer
- Oved Ourfali
- Piotr Kliczewski
- Yaniv Kaul