On 9 Nov 2016, at 09:57, Francesco Romani <fromani(a)redhat.com> wrote:
----- Original Message -----
> From: "Piotr Kliczewski" <pkliczew(a)redhat.com>
> To: "Martin Sivak" <msivak(a)redhat.com>
> Cc: "Michal Skrivanek" <michal.skrivanek(a)redhat.com>, "Francesco Romani" <fromani(a)redhat.com>,
> "Shira Maximov" <mshira(a)redhat.com>, "devel" <devel(a)ovirt.org>
> Sent: Wednesday, November 9, 2016 9:54:02 AM
> Subject: Re: [ovirt-devel] [vdsm] Connection refused when talking to jsonrpc
>
> On Wed, Nov 9, 2016 at 9:48 AM, Martin Sivak <msivak(a)redhat.com> wrote:
>
>>> Isn’t the most likely cause by far a simple startup delay? We do open
>>> the listener “soon” and respond with code 99, but it’s still not instant
>>> of course
>>
>> That is possible of course and we handle those "errors" just fine. But
>> connection refused never happened with xmlrpc. It might have been
>> luck, but it always worked there :)
>>
>>
> There is no difference in how we open the listening socket (it is used by both
> protocols), and I have seen the engine attempting to connect using both protocols
> before the socket was open. What is the time difference that you see?
Not sure I reproduced it correctly, but it seems to be a race on startup.
I got the same error on my box, and here it happens if mom tries to connect
to the unix socket /var/run/vdsm/mom-vdsm.sock *before* Vdsm creates it.
Once vdsmd successfully starts, a restart of mom-vdsm seems to fix all the issues.
I'm not sure yet if that's all of it, or how to handle it with systemd
dependencies.
Perhaps Nir's suggestion earlier in the thread to notify systemd is a good first step
in the right direction.
Not sure.
We would need to wait for recovery to finish if you want to be really nice to “dumb”
clients, and that would take too long.
I think mom needs a fix in addition anyway.
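
For illustration only, here is a minimal sketch of what such a client-side fix
could look like: retry the connect for a while instead of failing on the first
"connection refused". It assumes Python 3 and a plain Unix stream socket; the
function name connect_with_retry and its parameters are hypothetical and are not
taken from the mom or Vdsm code.

```python
import errno
import socket
import time


def connect_with_retry(path, attempts=10, delay=0.5, max_delay=5.0):
    """Connect to a Unix stream socket, retrying while the server is not up yet.

    Hypothetical helper, not part of mom or Vdsm.
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except OSError as e:
            sock.close()
            # ENOENT: the socket file has not been created yet.
            # ECONNREFUSED: the file exists but nobody is listening on it.
            retryable = e.errno in (errno.ENOENT, errno.ECONNREFUSED)
            if not retryable or attempt == attempts:
                raise
        # Back off exponentially, capped at max_delay seconds.
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
```

A caller would use something like connect_with_retry("/var/run/vdsm/mom-vdsm.sock").
This only covers the window before the socket exists or is being listened on; it
does not wait for recovery to finish.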
HTH,
--
Francesco Romani
Red Hat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani