Re: [ovirt-devel] oVirt 4.2.0 GA status

19 Dec 2017

      On Tue, Dec 19, 2017 at 2:05 PM, Martin Sivak <msivak@redhat.com> wrote:
...
...
...
So I think that we still have to fix it somehow.
Are we really sure that nr_retries=2 and _timeout=20 are really the
magic numbers that work under all conditions?
No, it should be tested on an HE environment, and it depends on your usage.
What happens when only the timeout is specified and the connection
fails after the command is sent? What are the defaults in that case?
So, there are no magic numbers that fit all cases; here's the description
of those parameters:

nr_retries
  - number of reconnection retries
  - if not specified, then the default is 1

_timeout
  - the maximum time to wait for a reply to a command/verb while the client is
connected
  - this does not affect reconnection in any way, meaning the client could
keep reconnecting for, say, 10 minutes (using a high enough nr_retries value)
and yet this timeout might never be reached
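
To make the independence of the two parameters concrete, here is a toy
model in Python. It is only a sketch and not the vdsm jsonrpc client code:
simulate_call, ConnectionLost, ReplyTimeout, and the fixed per-attempt
duration are all hypothetical, and it assumes nr_retries simply caps the
number of reconnection attempts while _timeout only counts once connected.

```python
class ConnectionLost(Exception):
    """Raised when every allowed reconnection attempt has failed."""


class ReplyTimeout(Exception):
    """Raised when no reply arrives within _timeout while connected."""


def simulate_call(failed_attempts, attempt_seconds, reply_delay,
                  _timeout, nr_retries=1):
    """Toy model of one call made while the connection is down.

    failed_attempts: reconnection attempts that fail before one succeeds
    attempt_seconds: assumed duration of each reconnection attempt
    reply_delay:     seconds the server takes to answer once connected
    Returns (seconds spent reconnecting, seconds spent waiting for the reply).
    """
    attempts_needed = failed_attempts + 1
    # nr_retries caps the number of attempts, not the time they take.
    if attempts_needed > nr_retries:
        raise ConnectionLost("gave up after %d attempt(s)" % nr_retries)
    reconnect_seconds = attempts_needed * attempt_seconds

    # _timeout only starts counting once the client is connected.
    if reply_delay > _timeout:
        raise ReplyTimeout("no reply within %s s" % _timeout)
    return reconnect_seconds, reply_delay


# About 10 minutes of reconnecting, yet the 20 s reply timeout is not hit.
print(simulate_call(60, 10.0, 1.0, _timeout=20.0, nr_retries=100))
```

In this model, nr_retries=0 raises ConnectionLost immediately, and a reply
slower than _timeout raises ReplyTimeout no matter how short the
reconnection phase was.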

So here are 2 suggestions:

1. Set nr_retries=0 and the client will behave the same way as in 4.1
(meaning no reconnection is performed)

2. Set nr_retries to a high enough number (for example 100 000) and hope that
this number of retries is enough for a host being deployed using host deploy.
I know that setting this number is tricky for HE, because host deploy can
take very different amounts of time, but there's no exact way to define the
timeout of a single reconnection, as it depends on the failure that occurs
during the attempt to connect.

AFAIK no other functionality was mentioned in [1], so nothing else is
implemented. If there are other requirements for the reconnection
functionality, then let's open a new RFE for that and discuss it.

Thanks

Martin

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1376843
...
Martin
On Tue, Dec 19, 2017 at 1:55 PM, Irit Goihman <igoihman@redhat.com> wrote:
...
On Tue, Dec 19, 2017 at 2:51 PM, Simone Tiraboschi <stirabos@redhat.com>
...
...
On Tue, Dec 19, 2017 at 12:56 PM, Martin Perina <mperina@redhat.com>
wrote:
...
...
As Irit mentioned, the provided reproduction steps are wrong (misuse of
the code), and she posted a correct example showing that the jsonrpc code
works as expected. So Martin/Simone, are you using somewhere in the HE code
the original example that misuses the client?
...
...
According to
https://bugzilla.redhat.com/show_bug.cgi?id=1527155#c9
it works in Irit's example, at least on that host with that load and
timings, when setting nr_retries=2 and _timeout=20
...
While we have _timeout=5 and no custom nr_retries:
https://github.com/oVirt/ovirt-hosted-engine-ha/blob/master/ovirt_hosted_engine_ha/lib/util.py#L417
...
So I think that we still have to fix it somehow.
Are we really sure that nr_retries=2 and _timeout=20 are really the
magic numbers that work under all conditions?
No, it should be tested on an HE environment, and it depends on your usage.
...
...
Thanks
Martin
On Tue, Dec 19, 2017 at 12:53 PM, Oved Ourfali <oourfali@redhat.com>
wrote:
...
...
...
From the latest comment it doesn't seem like a blocker to me.
Martin S. - your thoughts?
On Tue, Dec 19, 2017 at 1:48 PM, Sandro Bonazzola <
sbonazzo@redhat.com> wrote:
...
...
We have a proposed blocker for the release:
1527155 | Infra | vdsm | Bindings-API | igoihman@redhat.com | NEW |
jsonrpc reconnect logic does not work and gets stuck | urgent |
unspecified | ovirt-4.2.0 | 04:30:30
...
Please review and either approve the blocker or postpone to 4.2.1.
Thanks,
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA
TRIED. TESTED. TRUSTED.
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel
--
Martin Perina
Associate Manager, Software Engineering
Red Hat Czech s.r.o.
--
IRIT GOIHMAN
SOFTWARE ENGINEER
EMEA VIRTUALIZATION R&D
Red Hat EMEA
TRIED. TESTED. TRUSTED.