[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 22/02/2017 ] [add_secondary_storage_domains]

Nir Soffer nsoffer at redhat.com
Thu Feb 23 16:11:10 UTC 2017


I see this error there now:

00:02:59.460 [upgrade-from-release_suit_el7] + yum install --nogpgcheck -y --downloaddir=/dev/shm ntp ovirt-engine ovirt-log-collector 'ovirt-engine-extension-aaa-ldap*'
00:02:59.460 [upgrade-from-release_suit_el7]
00:02:59.461 [upgrade-from-release_suit_el7]
00:02:59.461 [upgrade-from-release_suit_el7]  One of the configured repositories failed (Unknown),
00:02:59.461 [upgrade-from-release_suit_el7]  and yum doesn't have enough cached data to continue. At this point the only
00:02:59.461 [upgrade-from-release_suit_el7]  safe thing yum can do is fail. There are a few ways to work "fix" this:
00:02:59.461 [upgrade-from-release_suit_el7]
00:02:59.461 [upgrade-from-release_suit_el7]      1. Contact the upstream for the repository and get them to fix the problem.
00:02:59.461 [upgrade-from-release_suit_el7]
00:02:59.462 [upgrade-from-release_suit_el7]      2. Reconfigure the baseurl/etc. for the repository, to point to a working
00:02:59.462 [upgrade-from-release_suit_el7]         upstream. This is most often useful if you are using a newer
00:02:59.462 [upgrade-from-release_suit_el7]         distribution release than is supported by the repository (and the
00:02:59.462 [upgrade-from-release_suit_el7]         packages for the previous distribution release still work).
00:02:59.462 [upgrade-from-release_suit_el7]
00:02:59.462 [upgrade-from-release_suit_el7]      3. Disable the repository, so yum won't use it by default. Yum will then
00:02:59.462 [upgrade-from-release_suit_el7]         just ignore the repository until you permanently enable it again or use
00:02:59.463 [upgrade-from-release_suit_el7]         --enablerepo for temporary usage:
00:02:59.463 [upgrade-from-release_suit_el7]
00:02:59.463 [upgrade-from-release_suit_el7]             yum-config-manager --disable <repoid>
00:02:59.463 [upgrade-from-release_suit_el7]
00:02:59.463 [upgrade-from-release_suit_el7]      4. Configure the failing repository to be skipped, if it is unavailable.
00:02:59.463 [upgrade-from-release_suit_el7]         Note that yum will try to contact the repo. when it runs most commands,
00:02:59.464 [upgrade-from-release_suit_el7]         so will have to try and fail each time (and thus. yum will be be much
00:02:59.464 [upgrade-from-release_suit_el7]         slower). If it is a very temporary problem though, this is often a nice
00:02:59.464 [upgrade-from-release_suit_el7]         compromise:
00:02:59.464 [upgrade-from-release_suit_el7]
00:02:59.464 [upgrade-from-release_suit_el7]             yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
00:02:59.464 [upgrade-from-release_suit_el7]
00:02:59.464 [upgrade-from-release_suit_el7] Cannot find a valid baseurl for repo: base/7/x86_64


On Thu, Feb 23, 2017 at 6:02 PM, Barak Korren <bkorren at redhat.com> wrote:
> Great!
>
> This OST experimental run is verifying that:
> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5503/
>
> Hope it doesn't fail on something else...
>
> On 23 February 2017 at 17:38, Nir Soffer <nsoffer at redhat.com> wrote:
>> Fixed in
>>
>> commit 726e946257174926ea2591a1c4a3be2dae4297ea
>> Author: Nir Soffer <nsoffer at redhat.com>
>> Date:   Thu Feb 23 16:45:45 2017 +0200
>>
>>     sp: Mark helper method as @unsecured
>>
>>     In commit 7cf19dafd7cd (storage_mailbox: make inbox/outbox mailbox
>>     args), we added a helper that is used before the spm is started, but the
>>     helper was not marked as @unsecured. This caused the call to fail with:
>>
>>         File "/usr/share/vdsm/storage/sp.py", line 485, in
>> __createMailboxMonitor
>>             outbox = self._master_volume_path("inbox")
>>           File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py",
>>         line 77, in wrapper
>>             raise SecureError("Secured object is not in safe state")
>>         SecureError: Secured object is not in safe state
>>
>>     As this helper doesn't change the state of the storage pool, there is no
>>     reason to treat it as a secured method, which is the default for this
>>     class.
>>
>>     Change-Id: Icf92b9474c9000840a5c15e3b91f2ced4d02aca2
>>     Signed-off-by: Nir Soffer <nsoffer at redhat.com>
>>
>> Verified with http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/64/console
>>
>> Thanks for reporting this.
>>
>> Nir
>>
>>
>> On Thu, Feb 23, 2017 at 5:03 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>>> On Thu, Feb 23, 2017 at 4:51 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>
>>>>
>>>> On Thu, Feb 23, 2017 at 4:43 PM Nir Soffer <nsoffer at redhat.com> wrote:
>>>>>
>>>>> On Thu, Feb 23, 2017 at 4:38 PM, Barak Korren <bkorren at redhat.com> wrote:
>>>>> > On 23 February 2017 at 16:35, Nir Soffer <nsoffer at redhat.com> wrote:
>>>>> >> On Thu, Feb 23, 2017 at 9:37 AM, Barak Korren <bkorren at redhat.com>
>>>>> >> wrote:
>>>>> >>> Test failed: [ add_secondary_storage_domains ]
>>>>> >>>
>>>>> >>> Note:
>>>>> >>> - This may or may not be related to
>>>>> >>>   https://bugzilla.redhat.com/show_bug.cgi?id=1421945
>>>>> >>>   The BZ talks about sporadic failures, while this seems to be
>>>>> >>>   happening consistently (for 6 runs so far)
>>>>> >>>
>>>>> >>> Link to suspected patches:
>>>>> >>> - https://gerrit.ovirt.org/70415
>>>>> >>> - https://gerrit.ovirt.org/69157
>>>>> >>
>>>>> >> Why do you suspect these patches?
>>>>> >
>>>>> > Because the test right before them passed.
>>>>> > These are all the changes that caused the failing OST job to run.
>>>>> >
>>>>> >> Did you try to run the tests with the latest patches before these
>>>>> >> patches?
>>>>> >
>>>>> > Yes, the test before them passed.
>>>>>
>>>>> You are correct, these patches are broken:
>>>>>
>>>>> 2017-02-22 16:13:00,745-0500 ERROR (jsonrpc/1)
>>>>> [storage.TaskManager.Task]
>>>>> (Task='4f670db2-70c2-4c21-96ff-114f57de70c0') Unexpected error
>>>>> (task:871)
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>>>>> 878, in _run
>>>>>     return fn(*args, **kargs)
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in
>>>>> wrapper
>>>>>     res = f(*args, **kwargs)
>>>>>   File "/usr/share/vdsm/storage/hsm.py", line 989, in connectStoragePool
>>>>>     spUUID, hostID, msdUUID, masterVersion, domainsMap)
>>>>>   File "/usr/share/vdsm/storage/hsm.py", line 1051, in _connectStoragePool
>>>>>     res = pool.connect(hostID, msdUUID, masterVersion)
>>>>>   File "/usr/share/vdsm/storage/sp.py", line 672, in connect
>>>>>     self.__createMailboxMonitor()
>>>>>   File "/usr/share/vdsm/storage/sp.py", line 485, in
>>>>> __createMailboxMonitor
>>>>>     outbox = self._master_volume_path("inbox")
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py",
>>>>> line 77, in wrapper
>>>>>     raise SecureError("Secured object is not in safe state")
>>>>> SecureError: Secured object is not in safe state
>>>>
>>>>
>>>> It's very confusing that this error is sometimes harmless and sometimes
>>>> isn't - how did you identify it as problematic?
>>>
>>> It depends on the context.
>>>
>>> Here we called __createMailboxMonitor, which is something we call
>>> when creating an instance, and is marked as @unsecured.
>>>
>>> This call now calls a new helper introduced in 7cf19dafd7cd,
>>> but the helper was not marked as @unsecured. This raises
>>> SecureError, which fails the current flow.
>>>
>>> We have another instance of this during upgrade domain flow - I think
>>> we have the same issue there, but this needs investigation.
>>>
>>> Other errors mean that a real secured method is called when a host
>>> is not the spm. This may be bad client code, or unavoidable, since
>>> there is no race-free way to check that a host is the spm before
>>> calling a method on the spm.
>>>
>>>> Y.
>>>>
>>>>>
>>>>>
>>>>> I'm sending a fix.
>>>>>
>>>>> Nir
>>>>>
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Barak Korren
>>>>> > bkorren at redhat.com
>>>>> > RHCE, RHCi, RHV-DevOps Team
>>>>> > https://ifireball.wordpress.com/
>
>
>
> --
> Barak Korren
> bkorren at redhat.com
> RHCE, RHCi, RHV-DevOps Team
> https://ifireball.wordpress.com/
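
The failure mode discussed in this thread can be illustrated with a
minimal sketch. This is not vdsm's securable.py; it only assumes a
class whose methods are secured by default and an @unsecured marker
that opts a method out. The names BrokenPool, FixedPool, extend_volume,
try_spm_call, the _safe flag and the returned path are hypothetical and
exist only for illustration.

```python
import functools
import inspect


class SecureError(RuntimeError):
    """Raised when a secured method runs on an object that is not safe."""


def unsecured(method):
    """Mark a method as callable even when the object is not in safe state."""
    method.__secured__ = False
    return method


def _guard(method):
    """Wrap a method so it fails unless the instance is in safe state."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self._safe:
            raise SecureError("Secured object is not in safe state")
        return method(self, *args, **kwargs)
    return wrapper


def secured(cls):
    """Class decorator: every method is secured unless marked @unsecured."""
    for name, attr in list(vars(cls).items()):
        if not inspect.isfunction(attr):
            continue
        if name.startswith("__") and name.endswith("__"):
            continue  # leave __init__ and other special methods alone
        if getattr(attr, "__secured__", True) is False:
            continue  # explicitly opted out with @unsecured
        setattr(cls, name, _guard(attr))
    return cls


@secured
class BrokenPool:
    def __init__(self):
        self._safe = False  # hypothetical flag: True only while host is SPM

    @unsecured
    def connect(self):
        # Runs before the SPM is started, so it must not need safe state.
        return self._master_volume_path("inbox")

    # Missing @unsecured: the helper is secured by default.
    def _master_volume_path(self, name):
        return "/hypothetical/master/" + name

    def extend_volume(self):
        # A genuinely secured operation that changes pool state.
        return "extended"


@secured
class FixedPool(BrokenPool):
    @unsecured
    def _master_volume_path(self, name):
        # Read-only helper, so there is no reason to secure it.
        return "/hypothetical/master/" + name


def try_spm_call(pool):
    """Call first and handle SecureError, instead of check-then-call."""
    try:
        return pool.extend_volume()
    except SecureError:
        return None
```

In this sketch BrokenPool().connect() raises SecureError even though
connect itself is unsecured, because the helper it calls is still
guarded. FixedPool().connect() succeeds. try_spm_call shows one way a
caller might cope with the other case mentioned above: since a check
for "is this host the SPM" can go stale before the call is made, the
caller attempts the call and treats SecureError as an expected outcome.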

