[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 22/02/2017 ] [add_secondary_storage_domains]

Nir Soffer nsoffer at redhat.com
Thu Feb 23 15:03:42 UTC 2017


On Thu, Feb 23, 2017 at 4:51 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>
>
> On Thu, Feb 23, 2017 at 4:43 PM Nir Soffer <nsoffer at redhat.com> wrote:
>>
>> On Thu, Feb 23, 2017 at 4:38 PM, Barak Korren <bkorren at redhat.com> wrote:
>> > On 23 February 2017 at 16:35, Nir Soffer <nsoffer at redhat.com> wrote:
>> >> On Thu, Feb 23, 2017 at 9:37 AM, Barak Korren <bkorren at redhat.com>
>> >> wrote:
>> >>> Test failed: [ add_secondary_storage_domains ]
>> >>>
>> >>> Note:
>> >>> - This may or may not be related to
>> >>>   https://bugzilla.redhat.com/show_bug.cgi?id=1421945
>> >>>   The BZ talks about sporadic failures, while this seems to be
>> >>>   happening consistently (for 6 runs so far)
>> >>>
>> >>> Link to suspected patches:
>> >>> - https://gerrit.ovirt.org/70415
>> >>> - https://gerrit.ovirt.org/69157
>> >>
>> >> Why do you suspect these patches?
>> >
>> > Because the test right before them passed.
>> > These are all the changes that caused the failing OST job to run.
>> >
>> >> Did you try to run the tests with the latest patches before these
>> >> patches?
>> >
>> > Yes, the test before them passes.
>>
>> You are correct, these patches are broken:
>>
>> 2017-02-22 16:13:00,745-0500 ERROR (jsonrpc/1)
>> [storage.TaskManager.Task]
>> (Task='4f670db2-70c2-4c21-96ff-114f57de70c0') Unexpected error
>> (task:871)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>> 878, in _run
>>     return fn(*args, **kargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in
>> wrapper
>>     res = f(*args, **kwargs)
>>   File "/usr/share/vdsm/storage/hsm.py", line 989, in connectStoragePool
>>     spUUID, hostID, msdUUID, masterVersion, domainsMap)
>>   File "/usr/share/vdsm/storage/hsm.py", line 1051, in _connectStoragePool
>>     res = pool.connect(hostID, msdUUID, masterVersion)
>>   File "/usr/share/vdsm/storage/sp.py", line 672, in connect
>>     self.__createMailboxMonitor()
>>   File "/usr/share/vdsm/storage/sp.py", line 485, in
>> __createMailboxMonitor
>>     outbox = self._master_volume_path("inbox")
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py",
>> line 77, in wrapper
>>     raise SecureError("Secured object is not in safe state")
>> SecureError: Secured object is not in safe state
>
>
> It's very confusing that this error is sometimes harmless and sometimes
> isn't - how did you identify it as problematic?

It depends on the context.

Here we called __createMailboxMonitor, which is something we call
when creating an instance, and which is marked as @unsecured.

This call now invokes a new helper introduced in 7cf19dafd7cd,
but the helper was not marked as @unsecured. Calling it raises
SecureError (as seen in the traceback above), which fails the
current flow.
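
To illustrate the failure mode, here is a minimal sketch of how a
secured/unsecured mechanism of this kind can behave. This is not the
vdsm code: the class names, method names and the _is_safe() check are
hypothetical, and only the error message is taken from the traceback.

```python
import functools


class SecureError(RuntimeError):
    pass


def unsecured(f):
    # Mark a method as callable even when the object is not in safe state.
    f._unsecured = True
    return f


def _checked(f):
    # Wrap a method so it refuses to run unless the object is safe.
    @functools.wraps(f)
    def wrapper(self, *args, **kwargs):
        if not self._is_safe():
            raise SecureError("Secured object is not in safe state")
        return f(self, *args, **kwargs)
    return wrapper


def secured(cls):
    # Class decorator: every method defined on cls is checked, unless it
    # is a special method or was explicitly marked with @unsecured.
    for name, attr in list(vars(cls).items()):
        if not callable(attr) or getattr(attr, "_unsecured", False):
            continue
        if name.startswith("__") and name.endswith("__"):
            continue
        setattr(cls, name, _checked(attr))
    return cls


@secured
class Pool(object):  # hypothetical stand-in for the storage pool
    def __init__(self):
        self._safe = False  # e.g. this host is not the SPM

    @unsecured
    def _is_safe(self):
        return self._safe

    @unsecured
    def connect(self):
        return self._create_mailbox_monitor()

    @unsecured
    def _create_mailbox_monitor(self):
        # Unsecured caller using a helper that was left secured.
        return self._master_volume_path("inbox")

    def _master_volume_path(self, name):  # missing @unsecured
        return "/master/" + name


@secured
class FixedPool(Pool):
    @unsecured
    def _master_volume_path(self, name):
        return "/master/" + name
```

In this sketch Pool().connect() raises SecureError even though connect()
itself is unsecured, because the helper it reaches is still checked.
FixedPool().connect() succeeds once the helper is marked as well.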

We have another instance of this during the upgrade domain flow - I think
we have the same issue there, but this needs investigation.

Other errors mean that a real secured method was called when the host
is not the SPM. This may be bad client code, or it may be unavoidable,
since there is no race-free way to check that a host is the SPM before
calling a method on the SPM.

> Y.
>
>>
>>
>> I'm sending a fix.
>>
>> Nir
>>
>> >
>> >
>> > --
>> > Barak Korren
>> > bkorren at redhat.com
>> > RHCE, RHCi, RHV-DevOps Team
>> > https://ifireball.wordpress.com/

