
On Tue, Dec 19, 2017 at 2:05 PM, Martin Sivak <msivak@redhat.com> wrote:
So I think that we still have to fix it somehow. Are we really sure that nr_retries=2 and _timeout=20 are the magic numbers that work under all conditions?
No, it should be tested in an HE environment, and it depends on your usage.
What happens when only the timeout is specified and the connection fails after the command is sent? What are the defaults in that case?
So, there are no magic numbers that fit all cases. Here is the description of those parameters:

nr_retries - number of reconnection retries; if not specified, the default is 1
_timeout - the maximum time to wait for the reply to a command/verb while the client is connected; this does not affect reconnection in any way, meaning the client could keep reconnecting for, say, 10 minutes (given a high enough nr_retries value) and yet this timeout may never be reached

So here are 2 suggestions:

1. Set nr_retries=0 and the client will behave the same way as in 4.1 (meaning no reconnection is performed)
2. Set nr_retries to a high enough number (for example 100 000) and hope that this number of retries is enough for a host being deployed using host deploy.

I know that setting this number is tricky for HE, because host deploy can take very different amounts of time, but there is no exact way to define a timeout for a single reconnection, as it depends on the failure that occurs during the connection attempt.

AFAIK no other functionality was mentioned in [1], so nothing else is implemented. If there are other requirements for the reconnection functionality, then let's open a new RFE for that and discuss it.

Thanks

Martin

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1376843
Martin
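To make the difference between the two parameters concrete, here is a minimal toy model of the semantics described above. It is only a sketch: ToyClient, connect_attempt_secs, failures_before_success and reply_secs are hypothetical names invented for illustration, not the vdsm jsonrpc client API, and the clock is simulated rather than real. It assumes the connection is already lost when the call starts.

```python
class ToyClient:
    """Hypothetical stand-in for a JSON-RPC client; NOT the vdsm API."""

    def __init__(self, connect_attempt_secs, failures_before_success):
        # Simulated cost of one failed connection attempt, in seconds.
        self.connect_attempt_secs = connect_attempt_secs
        # How many attempts will fail before the peer becomes reachable.
        self.failures_left = failures_before_success
        # Simulated clock: total seconds spent so far.
        self.elapsed = 0

    def _reconnect(self, nr_retries):
        # Reconnection is bounded only by the number of attempts.
        for _ in range(nr_retries):
            if self.failures_left > 0:
                self.failures_left -= 1
                self.elapsed += self.connect_attempt_secs
                continue
            return True
        return False

    def call(self, reply_secs, _timeout, nr_retries=1):
        # The connection is assumed to be down when the call starts.
        if not self._reconnect(nr_retries):
            return "connection failed"
        # _timeout only limits the wait for a reply once connected.
        if reply_secs > _timeout:
            self.elapsed += _timeout
            return "timeout"
        self.elapsed += reply_secs
        return "ok"


# 20 failed attempts of 30 s each: ten minutes of reconnecting,
# yet the 20 s reply timeout is never hit.
client = ToyClient(connect_attempt_secs=30, failures_before_success=20)
print(client.call(reply_secs=1, _timeout=20, nr_retries=100000), client.elapsed)

# nr_retries=0: no reconnection is attempted at all.
legacy = ToyClient(connect_attempt_secs=30, failures_before_success=20)
print(legacy.call(reply_secs=1, _timeout=20, nr_retries=0), legacy.elapsed)
```

In this model the total time spent reconnecting is the number of failed attempts multiplied by the duration of each attempt, and the duration of an attempt depends on how it fails. That is why a retry count cannot be translated into an exact overall timeout.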
On Tue, Dec 19, 2017 at 1:55 PM, Irit Goihman <igoihman@redhat.com> wrote:
On Tue, Dec 19, 2017 at 2:51 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:
On Tue, Dec 19, 2017 at 12:56 PM, Martin Perina <mperina@redhat.com> wrote:
As Irit mentioned, the provided reproduction steps are wrong (misuse of the code) and she posted a correct example showing that the jsonrpc code works as expected. So Martin/Simone, are you using the original example that misuses the client somewhere in the HE code?
According to https://bugzilla.redhat.com/show_bug.cgi?id=1527155#c9 it works in Irit's example, at least on that host with that load and timings, when setting nr_retries=2 and _timeout=20.
While we have _timeout=5 and no custom nr_retries: https://github.com/oVirt/ovirt-hosted-engine-ha/blob/master/ovirt_hosted_engine_ha/lib/util.py#L417
So I think that we still have to fix it somehow. Are we really sure that nr_retries=2 and _timeout=20 are the magic numbers that work under all conditions?
No, it should be tested in an HE environment, and it depends on your usage.
Thanks
Martin
On Tue, Dec 19, 2017 at 12:53 PM, Oved Ourfali <oourfali@redhat.com> wrote:
From the latest comment it doesn't seem like a blocker to me. Martin S. - your thoughts?
On Tue, Dec 19, 2017 at 1:48 PM, Sandro Bonazzola <sbonazzo@redhat.com> wrote:
We have a proposed blocker for the release:
1527155 | Infra | vdsm | Bindings-API | igoihman@redhat.com | NEW | jsonrpc reconnect logic does not work and gets stuck | urgent | unspecified | ovirt-4.2.0 | 04:30:30
Please review and either approve the blocker or postpone to 4.2.1. Thanks,
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA
TRIED. TESTED. TRUSTED.
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Martin Perina Associate Manager, Software Engineering Red Hat Czech s.r.o.
--
IRIT GOIHMAN
SOFTWARE ENGINEER
EMEA VIRTUALIZATION R&D
Red Hat EMEA
TRIED. TESTED. TRUSTED.