On Thu, Feb 23, 2017 at 4:43 PM Nir Soffer <nsoffer(a)redhat.com> wrote:
>
> On Thu, Feb 23, 2017 at 4:38 PM, Barak Korren <bkorren(a)redhat.com> wrote:
> > On 23 February 2017 at 16:35, Nir Soffer <nsoffer(a)redhat.com> wrote:
> >> On Thu, Feb 23, 2017 at 9:37 AM, Barak Korren <bkorren(a)redhat.com>
> >> wrote:
> >>> Test failed: [ add_secondary_storage_domains ]
> >>>
> >>> Note:
> >>> - This may or may not be related to
> >>>   https://bugzilla.redhat.com/show_bug.cgi?id=1421945
> >>> The BZ talks about sporadic failures, while this seems to be
> >>> happening consistently (for 6 runs so far)
> >>>
> >>> Link to suspected patches:
> >>> - https://gerrit.ovirt.org/70415
> >>> - https://gerrit.ovirt.org/69157
> >>
> >> Why do you suspect these patches?
> >
> > Because the test right before them passed.
> > These are all the changes that caused the failing OST job to run.
> >
> >> Did you try to run the tests with the latest patches before these
> >> patches?
> >
> > Yes, the test before them passed.
>
> You are correct, these patches are broken:
>
> 2017-02-22 16:13:00,745-0500 ERROR (jsonrpc/1) [storage.TaskManager.Task] (Task='4f670db2-70c2-4c21-96ff-114f57de70c0') Unexpected error (task:871)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 878, in _run
>     return fn(*args, **kargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 989, in connectStoragePool
>     spUUID, hostID, msdUUID, masterVersion, domainsMap)
>   File "/usr/share/vdsm/storage/hsm.py", line 1051, in _connectStoragePool
>     res = pool.connect(hostID, msdUUID, masterVersion)
>   File "/usr/share/vdsm/storage/sp.py", line 672, in connect
>     self.__createMailboxMonitor()
>   File "/usr/share/vdsm/storage/sp.py", line 485, in __createMailboxMonitor
>     outbox = self._master_volume_path("inbox")
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
>     raise SecureError("Secured object is not in safe state")
> SecureError: Secured object is not in safe state
It's very confusing that this error is sometimes harmless and sometimes
isn't - how did you identify it as problematic?
It depends on the context.
Here we called __createMailboxMonitor, which is something we call
when creating an instance, and is marked as @unsecured.
This call now calls a new helper introduced in 7cf19dafd7cd,
but the helper was not marked as @unsecured. This raises
SecureError, which fails the current flow.
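
To illustrate the failure mode, here is a minimal sketch of the pattern. It is
not the vdsm securable module: the secured class decorator, the _safe flag, and
the BrokenPool/FixedPool classes are hypothetical stand-ins, and only the
@unsecured name, the helper name, and the SecureError message are taken from
the traceback above.

```python
import functools


class SecureError(RuntimeError):
    """Raised when a guarded method runs on an object that is not safe."""


def unsecured(func):
    # Mark a method as callable even when the object is not in safe state.
    func._unsecured = True
    return func


def _guard(func):
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not self._safe:
            raise SecureError("Secured object is not in safe state")
        return func(self, *args, **kwargs)
    return wrapper


def secured(cls):
    # Hypothetical class decorator: guard every method not marked @unsecured.
    for name, attr in list(vars(cls).items()):
        is_dunder = name.startswith("__") and name.endswith("__")
        if not callable(attr) or is_dunder or getattr(attr, "_unsecured", False):
            continue
        setattr(cls, name, _guard(attr))
    return cls


@secured
class BrokenPool(object):
    def __init__(self):
        self._safe = False  # assumption: stands for "this host is not the spm"

    @unsecured
    def connect(self):
        # An unsecured method calling a helper that is still guarded.
        return self._master_volume_path("inbox")

    def _master_volume_path(self, name):  # helper is missing @unsecured
        return "/master/" + name


@secured
class FixedPool(BrokenPool):
    @unsecured
    def _master_volume_path(self, name):  # helper marked like its caller
        return "/master/" + name
```

In this sketch BrokenPool().connect() raises SecureError even though connect
itself is unsecured, because the helper it calls is still guarded. Marking the
helper as @unsecured, as in FixedPool, lets the same call go through.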
We have another instance of this during upgrade domain flow - I think
we have the same issue there, but this needs investigation.
Other errors mean that a real secured method is called when a host
is not the spm. This may be bad client code, or unavoidable, since
there is no race-free way to check that a host is the spm before
calling a method on the spm.
Y.
>
>
> I'm sending a fix.
>
> Nir
>
> >
> >
> > --
> > Barak Korren
> > bkorren(a)redhat.com
> > RHCE, RHCi, RHV-DevOps Team
> >
> > https://ifireball.wordpress.com/