[ovirt-devel] [ OST Failure Report ] [ oVirt 4.2 ] [ 2018-04-04 ] [006_migrations.prepare_migration_attachments_ipv6]

Martin Perina mperina at redhat.com
Wed Apr 25 14:57:54 UTC 2018


On Tue, Apr 24, 2018 at 3:28 PM, Dan Kenigsberg <danken at redhat.com> wrote:

> On Tue, Apr 24, 2018 at 4:17 PM, Ravi Shankar Nori <rnori at redhat.com>
> wrote:
> >
> >
> > On Tue, Apr 24, 2018 at 7:00 AM, Dan Kenigsberg <danken at redhat.com>
> wrote:
> >>
> >> Ravi's patch is in, but a similar problem remains, and the test cannot
> >> be put back into its place.
> >>
> >> It seems that while Vdsm was taken down, a couple of getCapsAsync
> >> requests queued up. At one point, the host resumed its connection
> >> before the requests had been cleared from the queue. After the host is
> >> up, the following tests resume, and at a pseudorandom point in time,
> >> an old getCapsAsync request times out and kills our connection.
> >>
> >> I believe that as long as ANY request is in flight, the monitoring
> >> lock should not be released, and the host should not be declared
> >> up.
> >>
> >>
> >
> >
> > Hi Dan,
> >
> > Can I have the link to the job on Jenkins so I can look at the logs?
>
> We disabled a network test that started failing after getCapsAsync was
> merged.
> Please own its re-introduction to OST: https://gerrit.ovirt.org/#/c/90264/
>
> Its most recent failure
> http://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/346/
> has been discussed by Alona and Piotr over IRC.
>

So https://bugzilla.redhat.com/1571768 was created to cover the issue
discovered during Alona's and Piotr's conversation. After further
discussion, however, we found that this issue is not related to the
non-blocking thread changes in engine 4.2; the behavior has existed since
the beginning of vdsm-jsonrpc-java. Ravi will continue verifying the fix for
BZ1571768, along with the other locking changes he has already posted, to
see whether they help the network OST succeed.

The fix for BZ1571768 is too dangerous for 4.2.3, so let's fix it on master
first and see whether it introduces any regressions. If it does not, we can
backport it to 4.2.4.
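
To make Dan's proposal above concrete (do not declare the host up while any
request is still in flight), here is a minimal sketch in Python. It is only
an illustration, not the engine or vdsm-jsonrpc-java code: the class
HostMonitor and all of its method names are hypothetical.

```python
import threading


class HostMonitor:
    """Hypothetical illustration; not the actual engine monitoring class."""

    def __init__(self):
        self._lock = threading.Lock()
        # Ids of requests that were sent but have not yet been answered,
        # failed, or timed out.
        self._in_flight = set()
        self._connected = False

    def send(self, request_id):
        # Record a request (e.g. a getCapsAsync call) as outstanding.
        with self._lock:
            self._in_flight.add(request_id)

    def complete(self, request_id):
        # Called on response, error, or timeout of a request.
        with self._lock:
            self._in_flight.discard(request_id)

    def set_connected(self, connected):
        with self._lock:
            self._connected = connected

    def can_declare_up(self):
        # The host counts as up only when the connection is back AND
        # no stale request is still waiting to time out.
        with self._lock:
            return self._connected and not self._in_flight


monitor = HostMonitor()
monitor.send("caps-1")       # queued while Vdsm was down
monitor.send("caps-2")
monitor.set_connected(True)  # connection resumes, old requests still pending
print(monitor.can_declare_up())
monitor.complete("caps-1")
monitor.complete("caps-2")
print(monitor.can_declare_up())
```

With this kind of guard, a leftover request that times out later cannot
affect a host that has already been marked as up, because the host is not
marked as up until the set of outstanding requests is empty.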



-- 
Martin Perina
Associate Manager, Software Engineering
Red Hat Czech s.r.o.