[ovirt-devel] dynamic ownership changes

Martin Polednik mpolednik at redhat.com
Fri Apr 27 07:23:24 UTC 2018


On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
>I will update with the results of the next tier1 execution on latest 4.2.3

That isn't master but an old branch, though. Could you run it against
*current* VDSM master?

>On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <mpolednik at redhat.com>
>wrote:
>
>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
>>
>>> Hi, I've triggered another execution [1] due to some issues I saw in the
>>> first which are not related to the patch.
>>>
>>> The success rate is 78%, which is low compared to tier1 executions with
>>> code from downstream builds (95-100% success rates) [2].
>>>
>>
>> Could you run the current master (without the dynamic_ownership patch)
>> so that we have a viable comparison?
>>
>>> From what I could see so far, there is an issue with move and copy
>>> operations to and from Gluster domains. For example [3].
>>>
>>> The logs are attached.
>>>
>>>
>>> [1]
>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/
>>>
>>>
>>>
>>> [2]
>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/
>>>
>>>
>>>
>>> [3]
>>> 2018-04-22 13:06:28,316+0300 INFO  (jsonrpc/7) [vdsm.api] FINISH
>>> deleteImage error=Image does not exist in domain:
>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>>> from=::ffff:10.35.161.182,40936, flow_id=disks_syncAction_ba6b2630-5976-4935,
>>> task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task]
>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>>>     return fn(*args, **kargs)
>>>   File "<string>", line 2, in deleteImage
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
>>>     ret = func(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
>>>     raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
>>> ImageDoesNotExistInSD: Image does not exist in domain:
>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>>>
>>> 2018-04-22 13:06:28,317+0300 INFO  (jsonrpc/7) [storage.TaskManager.Task]
>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted:
>>> "Image does not exist in domain:
>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181)
>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH
>>> deleteImage error=Image does not exist in domain:
>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33,
>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' (dispatcher:82)
>>>
>>>
>>>
>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon <ebenahar at redhat.com>
>>> wrote:
>>>
>>>> Triggered a sanity tier1 execution [1] using [2], which covers all the
>>>> requested areas, on iSCSI, NFS and Gluster.
>>>> I'll update with the results.
>>>>
>>>> [1]
>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-storage/1161/
>>>>
>>>> [2]
>>>> https://gerrit.ovirt.org/#/c/89830/
>>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64
>>>>
>>>>
>>>>
>>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <mpolednik at redhat.com>
>>>> wrote:
>>>>
>>>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
>>>>>
>>>>>> Hi Martin,
>>>>>>
>>>>>> I see [1] requires a rebase, can you please take care?
>>>>>>
>>>>>>
>>>>> Should be rebased.
>>>>>
>>>>>> At the moment, our automation is stable only on iSCSI, NFS, Gluster and
>>>>>> FC.
>>>>>> Ceph is not supported, and Cinder will be stabilized soon; AFAIR, it's
>>>>>> not stable enough at the moment.
>>>>>>
>>>>>>
>>>>> That is still pretty good.
>>>>>
>>>>>
>>>>>> [1] https://gerrit.ovirt.org/#/c/89830/
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik <mpolednik at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
>>>>>>>
>>>>>>>> Hi, sorry if I misunderstood, I waited for more input regarding what
>>>>>>>> areas have to be tested here.
>>>>>>>>
>>>>>>> I'd say that you have quite a bit of freedom in this regard. GlusterFS
>>>>>>> should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some suite
>>>>>>> that covers basic operations (start & stop VM, migrate it), snapshots
>>>>>>> and merging them, and whatever else would be important for storage
>>>>>>> sanity.
>>>>>>>
>>>>>>> mpolednik
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik <mpolednik at redhat.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
>>>>>>>>>
>>>>>>>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and
>>>>>>>>>> cinder, will have to check, since usually, we don't execute our
>>>>>>>>>> automation on them.
>>>>>>>>>>
>>>>>>>>> Any update on this? I believe the gluster tests were successful,
>>>>>>>>> OST passes fine and unit tests pass fine, that makes the storage
>>>>>>>>> backends test the last required piece.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <ratamir at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> +Elad
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <danken at redhat.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <nsoffer at redhat.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <eedri at redhat.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>> Please make sure to run as many OST suites on this patch as
>>>>>>>>>>>>>> possible before merging (using 'ci please build').
>>>>>>>>>>>>>>
>>>>>>>>>>>>> But note that OST is not a way to verify the patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Such changes require testing with all storage types we support.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nir
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <mpolednik at redhat.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've created a patch[0] that is finally able to activate libvirt's
>>>>>>>>>>>>>>> dynamic_ownership for VDSM while not negatively affecting
>>>>>>>>>>>>>>> functionality of our storage code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That of course comes with quite a bit of code removal, mostly in
>>>>>>>>>>>>>>> the area of host devices, hwrng and anything that touches devices;
>>>>>>>>>>>>>>> bunch of test changes and one XML generation caveat (storage is
>>>>>>>>>>>>>>> handled by VDSM, therefore disk relabelling needs to be disabled
>>>>>>>>>>>>>>> on the VDSM level).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Because of the scope of the patch, I welcome storage/virt/network
>>>>>>>>>>>>>>> people to review the code and consider the implication this
>>>>>>>>>>>>>>> change has on current/future features.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [0] https://gerrit.ovirt.org/#/c/89830/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>> In particular: dynamic_ownership was set to 0 prehistorically (as
>>>>>>>>>>>> part of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) because
>>>>>>>>>>>> libvirt, running as root, was not able to play properly with
>>>>>>>>>>>> root-squash NFS mounts.
>>>>>>>>>>>>
>>>>>>>>>>>> Have you attempted this use case?
>>>>>>>>>>>>
>>>>>>>>>>>> I join Nir's request to run this with storage QE.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Raz Tamir
>>>>>>>>>>> Manager, RHV QE
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>
>>
>>
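
For reference, the "XML generation caveat" from the original announcement
(disk relabelling disabled on the VDSM level) can be illustrated with a
rough sketch. libvirt's domain XML accepts a per-disk <seclabel> override
inside <source>, and relabel='no' with model='dac' asks libvirt to leave
ownership of that image alone. Whether the patch in [0] emits exactly this
element is an assumption; the helper name build_disk_element, the image
path and the target device are hypothetical and not taken from VDSM.

```python
import xml.etree.ElementTree as ET


def build_disk_element(path, target_dev="vda", relabel=False):
    """Build a libvirt <disk> element (hypothetical helper, not VDSM code).

    When relabel is False, a DAC seclabel override with relabel='no' is
    attached to <source>, so libvirt is asked not to change ownership of
    the image even when dynamic_ownership is enabled.
    """
    disk = ET.Element("disk", type="file", device="disk")
    source = ET.SubElement(disk, "source", file=path)
    if not relabel:
        # Per-device override: the storage layer owns the file permissions.
        ET.SubElement(source, "seclabel", model="dac", relabel="no")
    ET.SubElement(disk, "target", dev=target_dev, bus="virtio")
    return disk


# Illustrative path only; real image paths come from the storage domain.
disk_xml = ET.tostring(
    build_disk_element("/rhev/data-center/example/disk.img"),
    encoding="unicode",
)
print(disk_xml)
```

Host devices, hwrng and similar devices would not carry such an override in
this sketch, which matches the thread's description of letting libvirt
manage ownership for them while storage stays under VDSM's control.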

