<div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 24, 2018 at 3:28 PM, Dan Kenigsberg <span dir="ltr"><<a href="mailto:danken@redhat.com" target="_blank">danken@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Apr 24, 2018 at 4:17 PM, Ravi Shankar Nori <<a href="mailto:rnori@redhat.com">rnori@redhat.com</a>> wrote:<br>
><br>
><br>
> On Tue, Apr 24, 2018 at 7:00 AM, Dan Kenigsberg <<a href="mailto:danken@redhat.com">danken@redhat.com</a>> wrote:<br>
>><br>
>> Ravi's patch is in, but a similar problem remains, and the test cannot<br>
>> be put back into its place.<br>
>><br>
>> It seems that while Vdsm was taken down, a couple of getCapsAsync<br>
>> requests queued up. At one point, the host resumed its connection,<br>
>> before the requests had been cleared from the queue. After the host is<br>
>> up, the following tests resume, and at a pseudorandom point in time,<br>
>> an old getCapsAsync request times out and kills our connection.<br>
>><br>
>> I believe that as long as ANY request is in flight, the monitoring<br>
>> lock should not be released, and the host should not be declared as<br>
>> up.<br>
>><br>
>><br>
><br>
><br>
> Hi Dan,<br>
><br>
> Can I have the link to the job on Jenkins so I can look at the logs?<br>
<br>
We disabled a network test that started failing after getCapsAsync was merged.<br>
Please own its re-introduction to OST: <a href="https://gerrit.ovirt.org/#/c/90264/" rel="noreferrer" target="_blank">https://gerrit.ovirt.org/#/c/<wbr>90264/</a><br>
<br>
Its most recent failure<br>
<a href="http://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/346/" rel="noreferrer" target="_blank">http://jenkins.ovirt.org/job/<wbr>ovirt-system-tests_standard-<wbr>check-patch/346/</a><br>
has been discussed by Alona and Piotr over IRC.<br>
</blockquote></div><br><div style="font-family:arial,helvetica,sans-serif" class="gmail_default">So <a href="https://bugzilla.redhat.com/1571768">https://bugzilla.redhat.com/1571768</a> was created to cover the issue discovered during Alona's and Piotr's conversation. After further discussion, however, we found that this issue is not related to the non-blocking thread changes in engine 4.2; the behavior has existed since the beginning of vdsm-jsonrpc-java. Ravi will continue verifying the fix for BZ1571768, along with the other locking changes he has already posted, to see whether they help the network OST succeed.<br><br>However, the fix for BZ1571768 is too dangerous for 4.2.3. Let's fix it on master first and see whether it introduces any regressions. If it does not, we can backport it to 4.2.4.<br></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><font size="1">Martin Perina<br>Associate Manager, Software Engineering<br>Red Hat Czech s.r.o.<br></font></div></div>
</div></div>