[ovirt-devel] [ OST Failure Report ] [ oVirt 4.2 ] [ 2018-04-04 ] [006_migrations.prepare_migration_attachments_ipv6]

Dan Kenigsberg danken at redhat.com
Thu Apr 26 06:51:58 UTC 2018


On Wed, Apr 25, 2018 at 7:20 PM, Ravi Shankar Nori <rnori at redhat.com> wrote:
>
>
> On Wed, Apr 25, 2018 at 10:57 AM, Martin Perina <mperina at redhat.com> wrote:
>>
>>
>>
>> On Tue, Apr 24, 2018 at 3:28 PM, Dan Kenigsberg <danken at redhat.com> wrote:
>>>
>>> On Tue, Apr 24, 2018 at 4:17 PM, Ravi Shankar Nori <rnori at redhat.com>
>>> wrote:
>>> >
>>> >
>>> > On Tue, Apr 24, 2018 at 7:00 AM, Dan Kenigsberg <danken at redhat.com>
>>> > wrote:
>>> >>
>>> >> Ravi's patch is in, but a similar problem remains, and the test cannot
>>> >> be put back into its place.
>>> >>
>>> >> It seems that while Vdsm was taken down, a couple of getCapsAsync
>>> >> requests queued up. At one point, the host resumed its connection
>>> >> before the requests had been cleared from the queue. After the host is
>>> >> up, the following tests resume, and at a pseudorandom point in time,
>>> >> an old getCapsAsync request times out and kills our connection.
>>> >>
>>> >> I believe that as long as ANY request is in flight, the monitoring
>>> >> lock should not be released, and the host should not be declared as
>>> >> up.
>>> >>
>>> >>
>>> >
>>> >
>>> > Hi Dan,
>>> >
>>> > Can I have the link to the job on Jenkins so I can look at the logs?
>>>
>>> We disabled a network test that started failing after getCapsAsync was
>>> merged.
>>> Please own its re-introduction to OST:
>>> https://gerrit.ovirt.org/#/c/90264/
>>>
>>> Its most recent failure
>>> http://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/346/
>>> has been discussed by Alona and Piotr over IRC.
>>
>>
>> So https://bugzilla.redhat.com/1571768 was created to cover this issue
>> discovered during Alona's and Piotr's conversation. But after further
>> discussion we found out that this issue is not related to the non-blocking
>> thread changes in engine 4.2; this behavior has existed since the beginning
>> of vdsm-jsonrpc-java. Ravi will continue to verify the fix for BZ1571768,
>> along with the other locking changes he already posted, to see if they help
>> the network OST succeed.
>>
>> But the fix for BZ1571768 is too dangerous for 4.2.3. Let's try to fix
>> that on master and see whether it introduces any regressions. If
>> not, then we can backport to 4.2.4.
>>
>>
>>
>> --
>> Martin Perina
>> Associate Manager, Software Engineering
>> Red Hat Czech s.r.o.
>
>
> Posted a vdsm-jsonrpc-java patch [1] for BZ 1571768 [2] which fixes the OST
> issue with enabling 006_migrations.prepare_migration_attachments_ipv6.
>
> I ran OST with the vdsm-jsonrpc-java patch [1] and the patch to add back
> 006_migrations.prepare_migration_attachments_ipv6 [3], and the jobs
> succeeded thrice [4][5][6].
>
> [1] https://gerrit.ovirt.org/#/c/90646/
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1571768
> [3] https://gerrit.ovirt.org/#/c/90264/
> [4] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/2643/
> [5] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/2644/
> [6] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/2645/

Eyal, Gal: would you please take the test back?
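
For reference, the rule proposed earlier in this thread (do not declare the
host up while ANY request is still in flight) could be sketched roughly as
below. This is only an illustration under stated assumptions, not the
vdsm-jsonrpc-java patch or engine code: the HostMonitor class and all of its
method names are hypothetical, and the real monitoring lock is reduced here
to a plain set of outstanding request ids.

```python
import threading


class HostMonitor:
    """Hypothetical sketch: a host counts as up only when it is connected
    and no previously sent request is still waiting for a reply."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = set()   # ids of requests sent but not yet settled
        self._connected = False

    def send(self, request_id):
        # Record the request before it goes out on the wire.
        with self._lock:
            self._in_flight.add(request_id)

    def complete(self, request_id):
        # Called when a request is answered OR when it times out.
        with self._lock:
            self._in_flight.discard(request_id)

    def on_reconnect(self):
        with self._lock:
            self._connected = True

    def on_disconnect(self):
        with self._lock:
            self._connected = False

    def is_up(self):
        # Stale requests from before the reconnect keep the host "not up".
        with self._lock:
            return self._connected and not self._in_flight


monitor = HostMonitor()
monitor.send("caps-1")        # queued while Vdsm was down
monitor.send("caps-2")
monitor.on_reconnect()        # connection is back, old requests remain
print(monitor.is_up())        # False: two requests are still in flight
monitor.complete("caps-1")
monitor.complete("caps-2")
print(monitor.is_up())        # True: nothing left that could time out later
```

With this shape, an old getCapsAsync request that times out after the
reconnect can no longer surprise a host that was already reported as up,
because the host is not reported as up until that request has settled.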

