
On Tue, Apr 24, 2018 at 4:17 PM, Ravi Shankar Nori <rnori@redhat.com> wrote:
On Tue, Apr 24, 2018 at 7:00 AM, Dan Kenigsberg <danken@redhat.com> wrote:
Ravi's patch is in, but a similar problem remains, and the test cannot be put back into its place.
It seems that while Vdsm was down, a couple of getCapsAsync requests queued up. At one point, the host resumed its connection before the requests had been cleared from the queue. After the host is up, the following tests resume, and at a pseudorandom point in time, an old getCapsAsync request times out and kills our connection.
I believe that as long as ANY request is in flight, the monitoring lock should not be released, and the host should not be declared up.
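To make the proposal concrete, here is a minimal sketch in Python. It is not the engine's actual code, and every name in it (InFlightTracker, request_sent, request_done, can_declare_up, wait_until_drained) is hypothetical. It only illustrates the idea: count outstanding requests per host, and treat the host as eligible to be declared up (and the monitoring lock as releasable) only once that count has dropped to zero.

```python
import threading


class InFlightTracker:
    """Hypothetical helper: counts outstanding requests for one host."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0

    def request_sent(self):
        # Call when a request (e.g. a capabilities query) is queued or sent.
        with self._cond:
            self._in_flight += 1

    def request_done(self):
        # Call on reply, error, or timeout alike, so stale requests
        # are accounted for before the host changes state.
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def can_declare_up(self):
        # The host may be declared up only when nothing is outstanding.
        with self._cond:
            return self._in_flight == 0

    def wait_until_drained(self, timeout=None):
        # Block until every outstanding request has completed.
        # Returns False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._in_flight == 0, timeout
            )
```

Under this assumption, the monitoring code would call wait_until_drained() (or check can_declare_up()) before releasing the lock, so a leftover request could no longer time out after the host is already up.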
Hi Dan,
Can I have the link to the job on Jenkins so I can look at the logs?
We disabled a network test that started failing after getCapsAsync was merged. Please own its re-introduction to OST:
https://gerrit.ovirt.org/#/c/90264/
Its most recent failure has been discussed by Alona and Piotr over IRC:
http://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/346/