Re: [ovirt-devel] dynamic ownership changes
 
Hi Elad,
why did you install vdsm-hook-allocate_net?

adding Dan as I think the hook is not supposed to fail this badly in any case

Thanks,
michal
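For reference, the hook traceback quoted in this thread ends in a KeyError on the 'equivnets' environment key inside _parse_nets(). A minimal sketch of a more tolerant lookup, assuming the hook only needs a whitespace-separated list of network names from its environment. The function name parse_nets and the empty-list fallback are illustrative assumptions, not the actual hook code:

```python
import os

# Key name taken from the KeyError in the quoted traceback.
AVAIL_NETS_KEY = 'equivnets'


def parse_nets(environ=None):
    """Return the candidate networks, or an empty list if the key is unset.

    Hypothetical replacement for the hook's _parse_nets(): a missing
    custom property yields [] instead of raising KeyError.
    """
    environ = os.environ if environ is None else environ
    return environ.get(AVAIL_NETS_KEY, '').split()


if __name__ == '__main__':
    # With the property unset the caller can skip allocation instead of failing.
    print(parse_nets({}))
    print(parse_nets({AVAIL_NETS_KEY: 'net1 net2'}))
```

Whether an unset property should be treated as "nothing to do" or reported as a configuration error is a design choice for the hook maintainers; the sketch only shows that the lookup itself need not raise.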
On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:

Start VM fails on:

2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
    main()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
    allocate_random_network(device_xml)
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
    net = _get_random_network()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
    available_nets = _parse_nets()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
    return [net for net in os.environ[AVAIL_NETS_KEY].split()]
  File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'equivnets'

(hooks:110)
2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
    domxml = hooks.before_vm_start(self._buildDomainXML(),
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
    dom, self.id, self._custom['custom'])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
    dev_custom)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
    params=customProperties)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
    raise exception.HookError(err)
HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)

Hence, the success rate was 28% against 100% running with d/s (d/s). If needed, I'll compare against the latest master, but I think you get the picture with d/s.

vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64
libvirt-3.9.0-14.el7_5.3.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64
kernel 3.10.0-862.el7.x86_64
rhel7.5

Logs attached

On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:

nvm, found gluster 3.12 repo, managed to install vdsm

On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:

No, vdsm requires it:

Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64)
           Requires: glusterfs-fuse >= 3.12
           Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)

Therefore, vdsm package installation is skipped upon force install.

On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:

On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote:

Hi guys,

The vdsm build from the patch requires glusterfs-fuse > 3.12. This is while the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7)

because it is still oVirt, not a downstream build. We can't really do downstream builds with unmerged changes:/

Trying to get this gluster-fuse build, so far no luck.
Is this requirement intentional?

it should work regardless, I guess you can force install it without the dependency

On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:

Hi Elad,
to make it easier to compare, Martin backported the change to 4.2 so it is actually comparable with a run without that patch. Would you please try that out?
It would be best to have 4.2 upstream and this[1] run to really minimize the noise.

Thanks,
michal

[1] http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el7-x86_64/28/

On 27 Apr 2018, at 09:23, Martin Polednik <mpolednik@redhat.com> wrote:

On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:

I will update with the results of the next tier1 execution on latest 4.2.3

That isn't master but old branch though. Could you run it against
*current* VDSM master?

On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <mpolednik@redhat.com> wrote:

On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:

Hi, I've triggered another execution [1] due to some issues I saw in the first which are not related to the patch.

The success rate is 78% which is low comparing to tier1 executions with code from downstream builds (95-100% success rates) [2].

Could you run the current master (without the dynamic_ownership patch)
so that we have viable comparision?

From what I could see so far, there is an issue with move and copy
operations to and from Gluster domains. For example [3].

The logs are attached.

[1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/

[2] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/

[3]
2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' from=::ffff:10.35.161.182,40936, flow_id=disks_syncAction_ba6b2630-5976-4935, task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in deleteImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
    raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
ImageDoesNotExistInSD: Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'

2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted: "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181)
2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' (dispatcher:82)

On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:

Triggered a sanity tier1 execution [1] using [2], which covers all the
> requested areas, on iSCSI, NFS and Gluster.
> I'll update with the results.
>
> [1]
> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-storage/1161/
>
> [2]
> https://gerrit.ovirt.org/#/c/89830/
> vdsm-4.30.0-291.git77aef9a.el7.x86_64
>
> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <mpolednik@redhat.com>
> wrote:
>
> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
>>
>> Hi Martin,
>>>
>>> I see [1] requires a rebase, can you please take care?
>>>
>> Should be rebased.
>>
>> At the moment, our automation is stable only on iSCSI, NFS, Gluster and
>>
>>> FC.
>>> Ceph is not supported and Cinder will be stabilized soon, AFAIR, it's
>>> not
>>> stable enough at the moment.
>>>
>> That is still pretty good.
>>
>> [1] https://gerrit.ovirt.org/#/c/89830/
>>
>>>
>>> Thanks
>>>
>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik <mpolednik@redhat.com>
>>> wrote:
>>>
>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
>>>
>>>>
>>>> Hi, sorry if I misunderstood, I waited for more input regarding what
>>>>
>>>>> areas
>>>>> have to be tested here.
>>>>>
>>>>> I'd say that you have quite a bit of freedom in this regard.
>>>> GlusterFS
>>>> should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some suite
>>>> that covers basic operations (start & stop VM, migrate it), snapshots
>>>> and merging them, and whatever else would be important for storage
>>>> sanity.
>>>>
>>>> mpolednik
>>>>
>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik <mpolednik@redhat.com>
>>>>
>>>> wrote:
>>>>>
>>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
>>>>>
>>>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and
>>>>>> cinder,
>>>>>>
>>>>>> will
>>>>>>> have to check, since usually, we don't execute our automation on
>>>>>>> them.
>>>>>>>
>>>>>>> Any update on this? I believe the gluster tests were successful,
>>>>>>> OST
>>>>>>>
>>>>>> passes fine and unit tests pass fine, that makes the storage
>>>>>> backends
>>>>>> test the last required piece.
>>>>>>
>>>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <ratamir@redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>> +Elad
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <danken@redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <nsoffer@redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <eedri@redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Please make sure to run as much OST suites on this patch as
>>>>>>>>>> possible
>>>>>>>>>>
>>>>>>>>>> before merging ( using 'ci please build' )
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But note that OST is not a way to verify the patch.
>>>>>>>>>>>
>>>>>>>>>> Such changes require testing with all storage types we support.
>>>>>>>>>>
>>>>>>>>>> Nir
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <mpolednik@redhat.com>
>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey,
>>>>>>>>>>>
>>>>>>>>>>> I've created a patch[0] that is finally able to activate
>>>>>>>>>>>> libvirt's
>>>>>>>>>>>> dynamic_ownership for VDSM while not negatively affecting
>>>>>>>>>>>> functionality of our storage code.
>>>>>>>>>>>>
>>>>>>>>>>>> That of course comes with quite a bit of code removal, mostly
>>>>>>>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>> area of host devices, hwrng and anything that touches devices;
>>>>>>>>>>>> bunch
>>>>>>>>>>>> of test changes and one XML generation caveat (storage is
>>>>>>>>>>>> handled
>>>>>>>>>>>> by
>>>>>>>>>>>> VDSM, therefore disk relabelling needs to be disabled on the
>>>>>>>>>>>> VDSM
>>>>>>>>>>>> level).
>>>>>>>>>>>>
>>>>>>>>>>>> Because of the scope of the patch, I welcome
>>>>>>>>>>>> storage/virt/network
>>>>>>>>>>>> people to review the code and consider the implication this
>>>>>>>>>>>> change
>>>>>>>>>>>> has
>>>>>>>>>>>> on current/future features.
>>>>>>>>>>>>
>>>>>>>>>>>> [0] https://gerrit.ovirt.org/#/c/89830/
>>>>>>>>>>>>
>>>>>>>>>>>> In particular: dynamic_ownership was set to 0 prehistorically
>>>>>>>>>>>> (as
>>>>>>>>>>>>
>>>>>>>>>>> part
>>>>>>>>>>>
>>>>>>>>>> of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) because
>>>>>>>>> libvirt,
>>>>>>>>> running as root, was not able to play properly with root-squash
>>>>>>>>> nfs
>>>>>>>>> mounts.
>>>>>>>>>
>>>>>>>>> Have you attempted this use case?
>>>>>>>>>
>>>>>>>>> I join to Nir's request to run this with storage QE.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>
>>>>>>>> Raz Tamir
>>>>>>>> Manager, RHV QE
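The XML generation caveat in the original announcement (storage is handled by VDSM, so disk relabelling has to be disabled) can be illustrated with libvirt's per-source <seclabel> element. This is a minimal sketch under stated assumptions: it assumes the disk XML has a <source> child and that a DAC seclabel with relabel='no' is the mechanism used. The helper name disable_dac_relabel is hypothetical and the code is not taken from the patch under review:

```python
import xml.etree.ElementTree as ET


def disable_dac_relabel(disk_xml):
    """Return disk XML whose <source> carries <seclabel model='dac' relabel='no'/>.

    Hypothetical helper: with dynamic_ownership enabled, such a label asks
    libvirt to leave ownership of this one image alone.
    """
    disk = ET.fromstring(disk_xml)
    source = disk.find('source')
    if source is None:
        # Nothing to label (for example an empty CD-ROM drive).
        return disk_xml
    already_labelled = any(
        label.get('model') == 'dac' for label in source.findall('seclabel')
    )
    if not already_labelled:
        ET.SubElement(source, 'seclabel', {'model': 'dac', 'relabel': 'no'})
    return ET.tostring(disk, encoding='unicode')


if __name__ == '__main__':
    print(disable_dac_relabel("<disk type='block'><source dev='/dev/sdb'/></disk>"))
```

Devices that VDSM does not manage itself (host devices, hwrng) would carry no such label in this sketch, leaving their ownership to libvirt.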
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

<logs.tar.gz>
--Apple-Mail=_82B55436-9E61-4F82-AF39-6B64C656C187 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8 <html><head><meta http-equiv=3D"Content-Type" content=3D"text/html; = charset=3Dutf-8"></head><body style=3D"word-wrap: break-word; = -webkit-nbsp-mode: space; line-break: after-white-space;" class=3D"">Hi = Elad,<div class=3D"">why did you install = vdsm-hook-allocate_net?</div><div class=3D""><br class=3D""></div><div = class=3D"">adding Dan as I think the hook is not supposed to fail this = badly in any case</div><div class=3D""><br class=3D""></div><div = class=3D"">Thanks,</div><div class=3D"">michal<br class=3D""><div><br = class=3D""><blockquote type=3D"cite" class=3D""><div class=3D"">On 5 May = 2018, at 19:22, Elad Ben Aharon <<a href=3D"mailto:ebenahar@redhat.com"= class=3D"">ebenahar@redhat.com</a>> wrote:</div><br = class=3D"Apple-interchange-newline"><div class=3D""><div dir=3D"ltr" = class=3D"">Start VM fails on:<div class=3D""><br class=3D""></div><div = class=3D""><span style=3D"font-family:monospace" class=3D""><span = style=3D"background-color: rgb(255, 255, 255);" class=3D"">2018-05-05 = 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] = (vmId=3D'e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: = 'dev=3D/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/= images/6cdabfe5- </span><br = class=3D"">d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a0192821= 1' -> = u'*dev=3D/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c6= 2/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425- = <br class=3D"">f35a01928211' (storagexml:334) <br class=3D"">2018-05-05 = 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START = getSpmStatus(spUUID=3D'940fe6f3-b0c6-4d0c-a921-198e7819c1cc', = options=3DNone) from=3D::ffff:10.35.161.127,53512, = task_id=3Dc70ace39-dbfe-4f5c-ae49-a1e3a82c <br class=3D"">2758 (api:46) = <br class=3D"">2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) = 
[root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: = rc=3D2 err=3Dvm net allocation hook: [unexpected error]: Traceback (most = recent call last): <br class=3D""> File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line = 105, in <module> <br class=3D""> main() <br = class=3D""> File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, = in main <br = class=3D""> allocate_random_network(device_xml) <br = class=3D""> File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, = in allocate_random_network <br class=3D""> net =3D = _get_random_network() <br class=3D""> File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, = in _get_random_network <br class=3D""> available_nets =3D= _parse_nets() <br class=3D""> File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, = in _parse_nets <br class=3D""> return [net for net in = os.environ[AVAIL_NETS_KEY].split()] <br class=3D""> File = "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__ <br = class=3D""> raise KeyError(key) <br class=3D"">KeyError: = 'equivnets' <br class=3D""> <br class=3D""> <br class=3D"">(hooks:110) <br class=3D"">2018-05-05 17:53:27,915+0300 = <span style=3D"color:rgb(255,255,255);background-color:rgb(0,0,0)" = class=3D"">ERROR</span><span style=3D"background-color: rgb(255, 255, = 255);" class=3D""> (vm/e6ce66ce) [virt.vm] = (vmId=3D'e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process = failed (vm:943) </span><br class=3D"">Traceback (most recent call last): = <br class=3D""> File = "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in = _startUnderlyingVm <br class=3D""> self._run() <br = class=3D""> File = "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run = <br class=3D""> domxml =3D = hooks.before_vm_start(self._buildDomainXML(), <br class=3D""> File = "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in = 
_buildDomainXML <br class=3D""> dom, <a = href=3D"http://self.id/" class=3D"">self.id</a>, self._custom['custom']) = <br class=3D""> File = "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line = 240, in replace_device_xml_with_hooks_xml <br = class=3D""> dev_custom) <br class=3D""> File = "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in = before_device_create <br = class=3D""> params=3DcustomProperties) <br = class=3D""> File = "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in = _runHooksDir <br class=3D""> raise = exception.HookError(err) <br class=3D"">HookError: Hook Error: ('vm net = allocation hook: [unexpected error]: Traceback (most recent call = last):\n File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line = 105, in <module>\n main()\n<br = class=3D""> File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, = in main\n allocate_random_network(device_xml)\n = File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, = i<br class=3D"">n allocate_random_network\n net =3D = _get_random_network()\n File = "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, = in _get_random_network\n available_nets =3D = _parse_nets()\n File "/us<br = class=3D"">r/libexec/vdsm/hooks/before_device_create/10_allocate_net", = line 46, in _parse_nets\n return [net for net in = os.environ[AVAIL_NETS_KEY].split()]\n File = "/usr/lib64/python2.7/UserDict.py", line 23, in __getit<br = class=3D"">em__\n raise KeyError(key)\nKeyError: = \'equivnets\'\n\n\n',)<br class=3D""></span><br class=3D""></div><div = class=3D""><br class=3D""></div><div class=3D""><br class=3D""></div><div = class=3D"">Hence, the success rate was 28% against 100% running with d/s = (d/s). 
If needed, I'll compare against the latest master, but I think = you get the picture with d/s.</div><div class=3D""><br = class=3D""></div><div class=3D""><span style=3D"font-family:monospace" = class=3D""><span style=3D"background-color: rgb(255, 255, 255);" = class=3D"">vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 </span><br = class=3D"">libvirt-3.9.0-14.el7_5.3.x86_64 <br = class=3D"">qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 <br class=3D"">kernel = 3.10.0-862.el7.x86_64</span></div><div class=3D""><span = style=3D"font-family:monospace" class=3D"">rhel7.5<br = class=3D""></span><br class=3D""></div><div class=3D""><br = class=3D""></div><div class=3D"">Logs attached</div></div><div = class=3D"gmail_extra"><br class=3D""><div class=3D"gmail_quote">On Sat, = May 5, 2018 at 1:26 PM, Elad Ben Aharon <span dir=3D"ltr" = class=3D""><<a href=3D"mailto:ebenahar@redhat.com" target=3D"_blank" = class=3D"">ebenahar@redhat.com</a>></span> wrote:<br = class=3D""><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 = .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr" = class=3D"">nvm, found gluster 3.12 repo, managed to install = vdsm</div><div class=3D"HOEnZb"><div class=3D"h5"><div = class=3D"gmail_extra"><br class=3D""><div class=3D"gmail_quote">On Sat, = May 5, 2018 at 1:12 PM, Elad Ben Aharon <span dir=3D"ltr" = class=3D""><<a href=3D"mailto:ebenahar@redhat.com" target=3D"_blank" = class=3D"">ebenahar@redhat.com</a>></span> wrote:<br = class=3D""><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 = .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr" = class=3D"">No, vdsm requires it:<div class=3D""><br = class=3D""></div><div class=3D""><span style=3D"font-family:monospace" = class=3D""><span style=3D"background-color: rgb(255, 255, 255);" = class=3D"">Error: Package: vdsm-4.20.27-3.gitfee7810.el7.<wbr = class=3D"">centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el<wbr = class=3D"">7.centos.x86_64) </span><br = class=3D""> Req= uires: 
glusterfs-fuse >=3D 3.12 <br = class=3D""> Ins= talled: glusterfs-fuse-3.8.4-54.8.el7.<wbr class=3D"">x86_64 = (@rhv-4.2.3)<br class=3D""></span><br class=3D""></div><div = class=3D"">Therefore, vdsm package installation is skipped upon force = install.</div></div><div class=3D"m_8270803836802176999HOEnZb"><div = class=3D"m_8270803836802176999h5"><div class=3D"gmail_extra"><br = class=3D""><div class=3D"gmail_quote">On Sat, May 5, 2018 at 11:42 AM, = Michal Skrivanek <span dir=3D"ltr" class=3D""><<a = href=3D"mailto:michal.skrivanek@redhat.com" target=3D"_blank" = class=3D"">michal.skrivanek@redhat.com</a>></span> wrote:<br = class=3D""><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 = .8ex;border-left:1px #ccc solid;padding-left:1ex"><div = style=3D"word-wrap:break-word;line-break:after-white-space" class=3D""><br= class=3D""><div class=3D""><span class=3D""><br class=3D""><blockquote = type=3D"cite" class=3D""><div class=3D"">On 5 May 2018, at 00:38, Elad = Ben Aharon <<a href=3D"mailto:ebenahar@redhat.com" target=3D"_blank" = class=3D"">ebenahar@redhat.com</a>> wrote:</div><br = class=3D"m_8270803836802176999m_4224343900157515506m_-5974818518343566788A= pple-interchange-newline"><div class=3D""><div dir=3D"ltr" class=3D"">Hi = guys, <div class=3D""><br class=3D""></div><div class=3D"">The = vdsm build from the patch requires glusterfs-fuse > 3.12. This = is while the latest 4.2.3-5 d/s build requires 3.8.4 (<span = style=3D"font-family:monospace" class=3D""><span = style=3D"background-color:rgb(255,255,255)" = class=3D"">3.4.0.59rhs-1.el7)</span><br = class=3D""></span></div></div></div></blockquote><div class=3D""><br = class=3D""></div></span>because it is still oVirt, not a downstream = build. 
We can=E2=80=99t really do downstream builds with unmerged = changes:/</div><div class=3D""><span class=3D""><br class=3D""><blockquote= type=3D"cite" class=3D""><div class=3D""><div dir=3D"ltr" class=3D""><div= class=3D""><font face=3D"monospace" class=3D"">Trying to get this = gluster-fuse build, so far no luck.</font></div><div class=3D""><font = face=3D"monospace" class=3D"">Is this requirement = intentional? </font></div></div></div></blockquote><div = class=3D""><br class=3D""></div></span>it should work regardless, I = guess you can force install it without the dependency</div><div = class=3D""><div = class=3D"m_8270803836802176999m_4224343900157515506h5"><div class=3D""><br= class=3D""><blockquote type=3D"cite" class=3D""><div class=3D""><div = class=3D"gmail_extra"><br class=3D""><div class=3D"gmail_quote">On Fri, = May 4, 2018 at 2:38 PM, Michal Skrivanek <span dir=3D"ltr" = class=3D""><<a href=3D"mailto:michal.skrivanek@redhat.com" = target=3D"_blank" class=3D"">michal.skrivanek@redhat.com</a>></span> = wrote:<br class=3D""><blockquote class=3D"gmail_quote" style=3D"margin:0 = 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div = style=3D"word-wrap:break-word;line-break:after-white-space" class=3D"">Hi = Elad,<div class=3D"">to make it easier to compare, Martin backported the = change to 4.2 so it is actually comparable with a run without that = patch. Would you please try that out? 
</div><div class=3D"">It = would be best to have 4.2 upstream and this[1] run to really minimize = the noise.</div><div class=3D""><br class=3D""></div><div = class=3D"">Thanks,</div><div class=3D"">michal</div><div class=3D""><br = class=3D""></div><div class=3D"">[1] <a = href=3D"http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el= 7-x86_64/28/" target=3D"_blank" class=3D"">http://jenkins.ovirt.org/j<wbr = class=3D"">ob/vdsm_4.2_build-artifacts-on<wbr = class=3D"">-demand-el7-x86_64/28/</a></div><div class=3D""><br = class=3D""><div class=3D""><blockquote type=3D"cite" class=3D""><div = class=3D""><div = class=3D"m_8270803836802176999m_4224343900157515506m_-5974818518343566788h= 5"><div class=3D"">On 27 Apr 2018, at 09:23, Martin Polednik <<a = href=3D"mailto:mpolednik@redhat.com" target=3D"_blank" = class=3D"">mpolednik@redhat.com</a>> wrote:</div><br = class=3D"m_8270803836802176999m_4224343900157515506m_-5974818518343566788m= _-2464431127513935993Apple-interchange-newline"></div></div><div = class=3D""><div class=3D""><div class=3D""><div = class=3D"m_8270803836802176999m_4224343900157515506m_-5974818518343566788h= 5">On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:<br = class=3D""><blockquote type=3D"cite" class=3D"">I will update with the = results of the next tier1 execution on latest 4.2.3<br = class=3D""></blockquote><br class=3D"">That isn't master but old branch = though. 
Could you run it against<br class=3D"">*current* VDSM master?<br = class=3D""><br class=3D""><blockquote type=3D"cite" class=3D"">On Mon, = Apr 23, 2018 at 3:56 PM, Martin Polednik <<a = href=3D"mailto:mpolednik@redhat.com" target=3D"_blank" = class=3D"">mpolednik@redhat.com</a>><br class=3D"">wrote:<br = class=3D""><br class=3D""><blockquote type=3D"cite" class=3D"">On = 23/04/18 01:23 +0300, Elad Ben Aharon wrote:<br class=3D""><br = class=3D""><blockquote type=3D"cite" class=3D"">Hi, I've triggered = another execution [1] due to some issues I saw in the<br class=3D"">first = which are not related to the patch.<br class=3D""><br class=3D"">The = success rate is 78% which is low comparing to tier1 executions with<br = class=3D"">code from downstream builds (95-100% success rates) [2].<br = class=3D""><br class=3D""></blockquote><br class=3D"">Could you run the = current master (without the dynamic_ownership patch)<br class=3D"">so = that we have viable comparision?<br class=3D""><br class=3D"">=46rom = what I could see so far, there is an issue with move and copy<br = class=3D""><blockquote type=3D"cite" class=3D"">operations to and from = Gluster domains. 
For example [3].<br class=3D""><br class=3D"">The logs = are attached.<br class=3D""><br class=3D""><br class=3D"">[1]<br = class=3D"">*<a = href=3D"https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv" = target=3D"_blank" class=3D"">https://rhv-jenkins.rhev-ci-v<wbr = class=3D"">ms.eng.rdu2.redhat.com/job/rhv</a><br = class=3D"">-4.2-ge-runner-tier1-after-upg<wbr = class=3D"">rade/7/testReport/<br class=3D""><<a = href=3D"https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv" = target=3D"_blank" class=3D"">https://rhv-jenkins.rhev-ci-v<wbr = class=3D"">ms.eng.rdu2.redhat.com/job/rhv</a><br = class=3D"">-4.2-ge-runner-tier1-after-upg<wbr = class=3D"">rade/7/testReport/>*<br class=3D""><br class=3D""><br = class=3D""><br class=3D"">[2]<br class=3D""><a = href=3D"https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/" = target=3D"_blank" class=3D"">https://rhv-jenkins.rhev-ci-vm<wbr = class=3D"">s.eng.rdu2.redhat.com/job/</a><br class=3D""><br = class=3D"">rhv-4.2-ge-runner-tier1-after-<wbr class=3D"">upgrade/7/<br = class=3D""><br class=3D""><br class=3D""><br class=3D"">[3]<br = class=3D"">2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) = [vdsm.api] FINISH<br class=3D"">deleteImage error=3DImage does not exist = in domain:<br class=3D"">'image=3Dcabb8846-7a4b-4244-9835<wbr = class=3D"">-5f603e682f33,<br class=3D"">domain=3De5fd29c8-52ba-467e-be09<w= br class=3D"">-ca40ff054dd4'<br class=3D"">from=3D:<br = class=3D"">:ffff:10.35.161.182,40936, flow_id=3Ddisks_syncAction_ba6b2<wbr= class=3D"">630-5976-4935,<br = class=3D"">task_id=3D3d5f2a8a-881c-409e-93e<wbr class=3D"">9-aaa643c10e42 = (api:51)<br class=3D"">2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) = [storage.TaskManager.Task]<br = class=3D"">(Task=3D'3d5f2a8a-881c-409e-93e9<wbr class=3D"">-aaa643c10e42')= Unexpected error (task:875)<br class=3D"">Traceback (most recent call = last):<br class=3D"">File "/usr/lib/python2.7/site-packa<wbr = class=3D"">ges/vdsm/storage/task.py", line 882,<br 
class=3D"">in<br = class=3D"">_run<br class=3D""> return fn(*args, **kargs)<br = class=3D"">File "<string>", line 2, in deleteImage<br = class=3D"">File "/usr/lib/python2.7/site-packa<wbr = class=3D"">ges/vdsm/common/api.py", line 49, in<br class=3D"">method<br = class=3D""> ret =3D func(*args, **kwargs)<br class=3D"">File = "/usr/lib/python2.7/site-packa<wbr class=3D"">ges/vdsm/storage/hsm.py", = line 1503,<br class=3D"">in<br class=3D"">deleteImage<br class=3D""> = raise se.ImageDoesNotExistInSD(imgUU<wbr class=3D"">ID, sdUUID)<br = class=3D"">ImageDoesNotExistInSD: Image does not exist in domain:<br = class=3D"">'image=3Dcabb8846-7a4b-4244-9835<wbr = class=3D"">-5f603e682f33,<br class=3D"">domain=3De5fd29c8-52ba-467e-be09<w= br class=3D"">-ca40ff054dd4'<br class=3D""><br class=3D"">2018-04-22 = 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task]<br = class=3D"">(Task=3D'3d5f2a8a-881c-409e-93e9<wbr class=3D"">-aaa643c10e42')= aborting: Task is aborted:<br class=3D"">"Image does not exist in = domain: 'image=3Dcabb8846-7a4b-4244-9835<wbr class=3D"">-<br = class=3D"">5f603e682f33, domain=3De5fd29c8-52ba-467e-be09<wbr = class=3D"">-ca40ff054dd4'" - code 268<br class=3D"">(task:1181)<br = class=3D"">2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) = [storage.Dispatcher] FINISH<br class=3D"">deleteImage error=3DImage does = not exist in domain:<br class=3D"">'image=3Dcabb8846-7a4b-4244-9835<wbr = class=3D"">-5f603e682f33,<br class=3D"">domain=3De5fd29c8-52ba-467e-be09<b= r class=3D"">-ca40ff054d<br class=3D"">d4' (dispatcher:82)<br = class=3D""><br class=3D""><br class=3D""><br class=3D"">On Thu, Apr 19, = 2018 at 5:34 PM, Elad Ben Aharon <<a = href=3D"mailto:ebenahar@redhat.com" target=3D"_blank" = class=3D"">ebenahar@redhat.com</a>><br class=3D"">wrote:<br = class=3D""><br class=3D"">Triggered a sanity tier1 execution [1] using = [2], which covers all the<br class=3D""><blockquote type=3D"cite" = class=3D"">requested areas, on iSCSI, NFS and Gluster.<br 
class=3D"">I'll = update with the results.<br class=3D""><br class=3D"">[1]<br class=3D""><a= href=3D"https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2" = target=3D"_blank" class=3D"">https://rhv-jenkins.rhev-ci-vm<wbr = class=3D"">s.eng.rdu2.redhat.com/view/4.2</a><br = class=3D"">_dev/job/rhv-4.2-ge-flow-stora<wbr class=3D"">ge/1161/<br = class=3D""><br class=3D"">[2]<br class=3D""><a = href=3D"https://gerrit.ovirt.org/#/c/89830/" target=3D"_blank" = class=3D"">https://gerrit.ovirt.org/#/c/8<wbr class=3D"">9830/</a><br = class=3D"">vdsm-4.30.0-291.git77aef9a.el7<wbr class=3D"">.x86_64<br = class=3D""><br class=3D""><br class=3D""><br class=3D"">On Thu, Apr 19, = 2018 at 3:07 PM, Martin Polednik <<a = href=3D"mailto:mpolednik@redhat.com" target=3D"_blank" = class=3D"">mpolednik@redhat.com</a>><br class=3D"">wrote:<br = class=3D""><br class=3D"">On 19/04/18 14:54 +0300, Elad Ben Aharon = wrote:<br class=3D""><blockquote type=3D"cite" class=3D""><br = class=3D"">Hi Martin,<br class=3D""><blockquote type=3D"cite" = class=3D""><br class=3D"">I see [1] requires a rebase, can you please = take care?<br class=3D""><br class=3D""><br class=3D""></blockquote>Should= be rebased.<br class=3D""><br class=3D"">At the moment, our automation = is stable only on iSCSI, NFS, Gluster and<br class=3D""><br = class=3D""><blockquote type=3D"cite" class=3D"">FC.<br class=3D"">Ceph = is not supported and Cinder will be stabilized soon, AFAIR, it's<br = class=3D"">not<br class=3D"">stable enough at the moment.<br = class=3D""><br class=3D""><br class=3D""></blockquote>That is still = pretty good.<br class=3D""><br class=3D""><br class=3D"">[1] <a = href=3D"https://gerrit.ovirt.org/#/c/89830/" target=3D"_blank" = class=3D"">https://gerrit.ovirt.org/#/c/8<wbr class=3D"">9830/</a><br = class=3D""><br class=3D""><blockquote type=3D"cite" class=3D""><br = class=3D""><br class=3D"">Thanks<br class=3D""><br class=3D"">On Wed, = Apr 18, 2018 at 2:17 PM, Martin Polednik <<a = 
href=3D"mailto:mpolednik@redhat.com" target=3D"_blank" = class=3D"">mpolednik@redhat.com</a><br class=3D"">><br = class=3D"">wrote:<br class=3D""><br class=3D"">On 18/04/18 11:37 +0300, = Elad Ben Aharon wrote:<br class=3D""><br class=3D""><blockquote = type=3D"cite" class=3D""><br class=3D"">Hi, sorry if I misunderstood, I = waited for more input regarding what<br class=3D""><br = class=3D""><blockquote type=3D"cite" class=3D"">areas<br class=3D"">have = to be tested here.<br class=3D""><br class=3D""><br class=3D"">I'd say = that you have quite a bit of freedom in this regard.<br = class=3D""></blockquote>GlusterFS<br class=3D"">should be covered by = Dennis, so iSCSI/NFS/ceph/cinder with some suite<br class=3D"">that = covers basic operations (start & stop VM, migrate it), snapshots<br = class=3D"">and merging them, and whatever else would be important for = storage<br class=3D"">sanity.<br class=3D""><br class=3D"">mpolednik<br = class=3D""><br class=3D""><br class=3D"">On Wed, Apr 18, 2018 at 11:16 = AM, Martin Polednik <<br class=3D""><a = href=3D"mailto:mpolednik@redhat.com" target=3D"_blank" = class=3D"">mpolednik@redhat.com</a><br class=3D"">><br class=3D""><br = class=3D"">wrote:<br class=3D""><blockquote type=3D"cite" class=3D""><br = class=3D"">On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:<br = class=3D""><br class=3D""><br class=3D""><blockquote type=3D"cite" = class=3D"">We can test this on iSCSI, NFS and GlusterFS. As for ceph = and<br class=3D"">cinder,<br class=3D""><br class=3D"">will<br = class=3D""><blockquote type=3D"cite" class=3D"">have to check, since = usually, we don't execute our automation on<br class=3D"">them.<br = class=3D""><br class=3D""><br class=3D"">Any update on this? 
I believe = the gluster tests were successful,<br class=3D"">OST<br class=3D""><br = class=3D""></blockquote>passes fine and unit tests pass fine, that makes = the storage<br class=3D"">backends<br class=3D"">test the last required = piece.<br class=3D""><br class=3D""><br class=3D"">On Wed, Apr 11, 2018 = at 4:38 PM, Raz Tamir <<a href=3D"mailto:ratamir@redhat.com" = target=3D"_blank" class=3D"">ratamir@redhat.com</a>><br = class=3D"">wrote:<br class=3D""><br class=3D""><br class=3D"">+Elad<br = class=3D""><blockquote type=3D"cite" class=3D""><br class=3D""><br = class=3D"">On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <<a = href=3D"mailto:danken@redhat.com" target=3D"_blank" = class=3D"">danken@redhat.com</a><br class=3D""><blockquote type=3D"cite" = class=3D"">><br class=3D"">wrote:<br class=3D""><br class=3D"">On = Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <<a = href=3D"mailto:nsoffer@redhat.com" target=3D"_blank" = class=3D"">nsoffer@redhat.com</a>><br class=3D"">wrote:<br = class=3D""><br class=3D""><br class=3D"">On Wed, Apr 11, 2018 at 12:31 = PM Eyal Edri <<a href=3D"mailto:eedri@redhat.com" target=3D"_blank" = class=3D"">eedri@redhat.com</a>><br class=3D""><blockquote = type=3D"cite" class=3D"">wrote:<br class=3D""><br class=3D""><br = class=3D"">Please make sure to run as much OST suites on this patch = as<br class=3D""><blockquote type=3D"cite" class=3D"">possible<br = class=3D""><br class=3D"">before merging ( using 'ci please build' )<br = class=3D""><br class=3D""><blockquote type=3D"cite" class=3D""><br = class=3D""><br class=3D"">But note that OST is not a way to verify the = patch.<br class=3D""><br class=3D""><br class=3D""></blockquote>Such = changes require testing with all storage types we support.<br = class=3D""><br class=3D"">Nir<br class=3D""><br class=3D"">On Tue, Apr = 10, 2018 at 4:09 PM, Martin Polednik <<br class=3D""><a = href=3D"mailto:mpolednik@redhat.com" target=3D"_blank" = class=3D"">mpolednik@redhat.com</a><br class=3D"">><br 
class=3D""><br = class=3D"">wrote:<br class=3D""><br class=3D""><br class=3D""><blockquote = type=3D"cite" class=3D"">Hey,<br class=3D""><br class=3D""><br = class=3D"">I've created a patch[0] that is finally able to activate<br = class=3D""><blockquote type=3D"cite" class=3D"">libvirt's<br = class=3D"">dynamic_ownership for VDSM while not negatively affecting<br = class=3D"">functionality of our storage code.<br class=3D""><br = class=3D"">That of course comes with quite a bit of code removal, = mostly<br class=3D"">in<br class=3D"">the<br class=3D"">area of host = devices, hwrng and anything that touches devices;<br class=3D"">bunch<br = class=3D"">of test changes and one XML generation caveat (storage is<br = class=3D"">handled<br class=3D"">by<br class=3D"">VDSM, therefore disk = relabelling needs to be disabled on the<br class=3D"">VDSM<br = class=3D"">level).<br class=3D""><br class=3D"">Because of the scope of = the patch, I welcome<br class=3D"">storage/virt/network<br = class=3D"">people to review the code and consider the implication = this<br class=3D"">change<br class=3D"">has<br class=3D"">on = current/future features.<br class=3D""><br class=3D"">[0] <a = href=3D"https://gerrit.ovirt.org/#/c/89830/" target=3D"_blank" = class=3D"">https://gerrit.ovirt.org/#/c/8<wbr class=3D"">9830/</a><br = class=3D""><br class=3D""><br class=3D"">In particular: = dynamic_ownership was set to 0 prehistorically<br class=3D"">(as<br = class=3D""><br class=3D""><br class=3D""></blockquote>part<br = class=3D""><br class=3D""></blockquote><br class=3D"">of <a = href=3D"https://bugzilla.redhat.com/show_bug.cgi?id=3D554961" = target=3D"_blank" class=3D"">https://bugzilla.redhat.com/sh<wbr = class=3D"">ow_bug.cgi?id=3D554961</a> ) because<br = class=3D""></blockquote>libvirt,<br class=3D"">running as root, was not = able to play properly with root-squash<br class=3D"">nfs<br = class=3D"">mounts.<br class=3D""><br class=3D"">Have you attempted this = use case?<br class=3D""><br 
class=3D"">I join to Nir's request to run = this with storage QE.<br class=3D""><br class=3D""><br class=3D""><br = class=3D""><br class=3D"">--<br class=3D""></blockquote><br class=3D""><br= class=3D"">Raz Tamir<br class=3D"">Manager, RHV QE<br class=3D""><br = class=3D""><br class=3D""><br class=3D""><br class=3D""><br = class=3D""></blockquote></blockquote></blockquote></blockquote></blockquot= e></blockquote></blockquote><br class=3D""></blockquote></blockquote><br = class=3D""><br class=3D""></blockquote></blockquote></div></div><span = class=3D"">______________________________<wbr = class=3D"">_________________<br class=3D"">Devel mailing list<br = class=3D""><a href=3D"mailto:Devel@ovirt.org" target=3D"_blank" = class=3D"">Devel@ovirt.org</a><br class=3D""><a = href=3D"http://lists.ovirt.org/mailman/listinfo/devel" target=3D"_blank" = class=3D"">http://lists.ovirt.org/mailman<wbr = class=3D"">/listinfo/devel</a><br class=3D""><br class=3D""><br = class=3D""></span></div></div></blockquote></div><br = class=3D""></div></div></blockquote></div><br class=3D""></div> </div></blockquote></div><br = class=3D""></div></div></div></blockquote></div><br class=3D""></div> </div></div></blockquote></div><br class=3D""></div> </div></div></blockquote></div><br class=3D""></div> <span = id=3D"cid:77382D15-7BFB-4164-A6D0-F8FA5BE5E692@mrkev"><logs.tar.gz><= /span></div></blockquote></div><br class=3D""></div></body></html>= --Apple-Mail=_82B55436-9E61-4F82-AF39-6B64C656C187--
 
            On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, why did you install vdsm-hook-allocate_net?
adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup. But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
Thanks, michal
On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Start VM fails on:
2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
    main()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
    allocate_random_network(device_xml)
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
    net = _get_random_network()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
    available_nets = _parse_nets()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
    return [net for net in os.environ[AVAIL_NETS_KEY].split()]
  File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'equivnets'
(hooks:110)
2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
    domxml = hooks.before_vm_start(self._buildDomainXML(),
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
    dom, self.id, self._custom['custom'])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
    dev_custom)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
    params=customProperties)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
    raise exception.HookError(err)
HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
Hence, the success rate was 28%, against 100% when running with d/s. If needed, I'll compare against the latest master, but I think you get the picture with d/s.
vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 libvirt-3.9.0-14.el7_5.3.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 kernel 3.10.0-862.el7.x86_64 rhel7.5
Logs attached
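For reference, the KeyError in the traceback above is raised because the hook indexes os.environ directly and the 'equivnets' key is not set for this VM. A minimal sketch of a more tolerant lookup, assuming an empty network list is an acceptable "nothing to allocate" signal (the function names here are hypothetical; only AVAIL_NETS_KEY and its value come from the traceback, and this is not the actual hook code or its fix):

```python
import os

# Key name taken from the traceback; everything else is illustrative.
AVAIL_NETS_KEY = 'equivnets'


def parse_nets(environ=None):
    """Return the candidate networks, or an empty list if the key is unset."""
    environ = os.environ if environ is None else environ
    # dict.get avoids the KeyError that os.environ[AVAIL_NETS_KEY] raises
    # when the custom property was never defined for the VM.
    return environ.get(AVAIL_NETS_KEY, '').split()


def should_allocate(environ=None):
    """Hypothetical guard: only allocate when at least one network is listed."""
    return bool(parse_nets(environ))
```

With a guard like this, a host that merely has the hook installed would skip allocation instead of aborting VM startup.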
On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
nvm, found gluster 3.12 repo, managed to install vdsm
On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
No, vdsm requires it:
Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64) Requires: glusterfs-fuse >= 3.12 Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
Therefore, vdsm package installation is skipped upon force install.
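To spell out why the installed package does not satisfy the requirement: version fields are compared segment by segment as integers, so 3.8 is older than 3.12 even though 3.8 > 3.12 as a decimal number. A simplified sketch, assuming purely numeric dotted versions (real RPM comparison also handles epochs, release tags and alphanumeric segments; the helper names are made up for illustration):

```python
def parse_version(text):
    """Split a dotted numeric version such as '3.8.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split('.'))


def satisfies_minimum(installed, required):
    """True if the installed version is at least the required one."""
    # Tuples compare element by element, so (3, 8, 4) < (3, 12).
    return parse_version(installed) >= parse_version(required)
```

Here satisfies_minimum('3.8.4', '3.12') is false, which matches the dependency error above.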
On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi guys,
The vdsm build from the patch requires glusterfs-fuse >= 3.12, whereas the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7)
because it is still oVirt, not a downstream build. We can’t really do downstream builds with unmerged changes :/
Trying to get this gluster-fuse build, so far no luck. Is this requirement intentional?
it should work regardless, I guess you can force install it without the dependency
On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, to make it easier to compare, Martin backported the change to 4.2 so it is actually comparable with a run without that patch. Would you please try that out? It would be best to have 4.2 upstream and this[1] run to really minimize the noise.
Thanks, michal
[1] http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el7-x86_64/2...
On 27 Apr 2018, at 09:23, Martin Polednik <mpolednik@redhat.com> wrote:
On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
I will update with the results of the next tier1 execution on latest 4.2.3
That isn't master but an old branch, though. Could you run it against *current* VDSM master?
On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
Hi, I've triggered another execution [1] due to some issues I saw in the first which are not related to the patch.
The success rate is 78%, which is low compared to tier1 executions with code from downstream builds (95-100% success rates) [2].
Could you run the current master (without the dynamic_ownership patch) so that we have a viable comparison?
From what I could see so far, there is an issue with move and copy operations to and from Gluster domains. For example [3].
The logs are attached.
[1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/
[2] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/
[3]
2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' from=::ffff:10.35.161.182,40936, flow_id=disks_syncAction_ba6b2630-5976-4935, task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in deleteImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
    raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
ImageDoesNotExistInSD: Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted: "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181)
2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' (dispatcher:82)
On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Triggered a sanity tier1 execution [1] using [2], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-storage/1161/
[2] https://gerrit.ovirt.org/#/c/89830/ vdsm-4.30.0-291.git77aef9a.el7.x86_64
On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
Hi Martin,
I see [1] requires a rebase, can you please take care?
Should be rebased.
At the moment, our automation is stable only on iSCSI, NFS, Gluster and FC. Ceph is not supported, and Cinder will be stabilized soon; AFAIR, it's not stable enough at the moment.
That is still pretty good.
[1] https://gerrit.ovirt.org/#/c/89830/
Thanks
On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
Hi, sorry if I misunderstood, I waited for more input regarding what areas have to be tested here.
I'd say that you have quite a bit of freedom in this regard.
GlusterFS should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some suite that covers basic operations (start & stop VM, migrate it), snapshots and merging them, and whatever else would be important for storage sanity.
mpolednik
On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik <mpolednik@redhat.com> wrote:
On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
We can test this on iSCSI, NFS and GlusterFS. As for ceph and cinder, we will have to check, since we don't usually execute our automation on them.
Any update on this? I believe the gluster tests were successful, OST passes fine and unit tests pass fine, which makes the storage backends test the last required piece.
On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <ratamir@redhat.com> wrote:
+Elad
On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <eedri@redhat.com> wrote:
Please make sure to run as many OST suites on this patch as possible before merging (using 'ci please build')
But note that OST is not a way to verify the patch.
Such changes require testing with all storage types we support.
Nir
On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <mpolednik@redhat.com> wrote:
Hey,
I've created a patch[0] that is finally able to activate libvirt's dynamic_ownership for VDSM while not negatively affecting functionality of our storage code.
That of course comes with quite a bit of code removal, mostly in the area of host devices, hwrng and anything that touches devices; bunch of test changes and one XML generation caveat (storage is handled by VDSM, therefore disk relabelling needs to be disabled on the VDSM level).
Because of the scope of the patch, I welcome storage/virt/network people to review the code and consider the implication this change has on current/future features.
[0] https://gerrit.ovirt.org/#/c/89830/
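To illustrate the XML generation caveat mentioned above: libvirt lets a disk opt out of DAC relabelling with a per-source seclabel element. A rough sketch of generating such a disk element, assuming a block device and a virtio target (the function name, device path and target name are placeholders, and this is not the code from the patch):

```python
import xml.etree.ElementTree as ET


def disk_xml_without_relabel(path, target_dev):
    """Build a libvirt <disk> element whose source is excluded from DAC relabelling."""
    disk = ET.Element('disk', type='block', device='disk')
    source = ET.SubElement(disk, 'source', dev=path)
    # relabel='no' tells libvirt to leave ownership of this image alone,
    # since storage ownership is handled by VDSM itself.
    ET.SubElement(source, 'seclabel', model='dac', relabel='no')
    ET.SubElement(disk, 'target', dev=target_dev, bus='virtio')
    return ET.tostring(disk, encoding='unicode')
```

With dynamic_ownership enabled, other devices would still get their ownership adjusted by libvirt, while disks marked this way stay under VDSM's control.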
In particular: dynamic_ownership was set to 0 prehistorically (as part of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) because libvirt, running as root, was not able to play properly with root-squash nfs mounts.
Have you attempted this use case?
I join Nir's request to run this with storage QE.
--
Raz Tamir Manager, RHV QE
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
<logs.tar.gz>
 
            Hi Dan,
In the last execution, the success rate was very low due to a large number of start VM failures caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host. This is the latest status here. Would you like me to re-execute? If so, with or without vdsm-hook-allocate_net installed?
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, why did you install vdsm-hook-allocate_net?
adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
Thanks, michal
On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Start VM fails on:
2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-
d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938- 9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63- 9834f235d1c8/7ef97445-30e6-4435-8425- f35a01928211' (storagexml:334) 2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc',
from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c- ae49-a1e3a82c 2758 (api:46) 2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last): File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net",
105, in <module> main() File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net",
93, in main allocate_random_network(device_xml) File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net",
62, in allocate_random_network net = _get_random_network() File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net",
50, in _get_random_network available_nets = _parse_nets() File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net",
46, in _parse_nets return [net for net in os.environ[AVAIL_NETS_KEY].split()] File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__ raise KeyError(key) KeyError: 'equivnets'
(hooks:110) 2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm self._run() File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run domxml = hooks.before_vm_start(self._buildDomainXML(), File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML dom, self.id, self._custom['custom']) File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml dev_custom) File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create params=customProperties) File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir raise exception.HookError(err) HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net",
93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, i n allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/us r/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getit em__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
Hence, the success rate was 28% against 100% running with d/s (d/s). If needed, I'll compare against the latest master, but I think you get the picture with d/s.
vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 libvirt-3.9.0-14.el7_5.3.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 kernel 3.10.0-862.el7.x86_64 rhel7.5
Logs attached
On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
nvm, found gluster 3.12 repo, managed to install vdsm
On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
No, vdsm requires it:
Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64) Requires: glusterfs-fuse >= 3.12 Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
Therefore, vdsm package installation is skipped upon force install.
On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi guys,
The vdsm build from the patch requires glusterfs-fuse > 3.12. This is while the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7)
because it is still oVirt, not a downstream build. We can’t really do downstream builds with unmerged changes:/
Trying to get this gluster-fuse build, so far no luck. Is this requirement intentional?
it should work regardless, I guess you can force install it without
dependency
On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, to make it easier to compare, Martin backported the change to 4.2 so
it
is actually comparable with a run without that patch. Would you
that out? It would be best to have 4.2 upstream and this[1] run to really minimize the noise.
Thanks, michal
[1] http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on- demand-el7-x86_64/28/
On 27 Apr 2018, at 09:23, Martin Polednik <mpolednik@redhat.com> wrote:
On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
I will update with the results of the next tier1 execution on latest 4.2.3
That isn't master but old branch though. Could you run it against *current* VDSM master?
On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik < mpolednik@redhat.com> wrote:
On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
Hi, I've triggered another execution [1] due to some issues I saw in the first which are not related to the patch.
The success rate is 78% which is low comparing to tier1 executions with code from downstream builds (95-100% success rates) [2].
Could you run the current master (without the dynamic_ownership
so that we have viable comparision?
From what I could see so far, there is an issue with move and copy
operations to and from Gluster domains. For example [3].
The logs are attached.
[1] *https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv -4.2-ge-runner-tier1-after-upgrade/7/testReport/ <https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv -4.2-ge-runner-tier1-after-upgrade/7/testReport/>*
[2] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/
rhv-4.2-ge-runner-tier1-after-upgrade/7/
[3] 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' from=: :ffff:10.35.161.182,40936, flow_id=disks_syncAction_ ba6b2630-5976-4935, task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51) 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run return fn(*args, **kargs) File "<string>", line 2, in deleteImage File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage raise se.ImageDoesNotExistInSD(imgUUID, sdUUID) ImageDoesNotExistInSD: Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted: "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835- 5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181) 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09 -ca40ff054d d4' (dispatcher:82)
On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon < ebenahar@redhat.com> wrote:
Triggered a sanity tier1 execution [1] using [2], which covers all
9a78-bdd13a843c62/images/6cdabfe5- options=None) line line line line line line the please try patch) the
requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2 _dev/job/rhv-4.2-ge-flow-storage/1161/
[2] https://gerrit.ovirt.org/#/c/89830/ vdsm-4.30.0-291.git77aef9a.el7.x86_64
On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <
mpolednik@redhat.com>
wrote:
On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
Hi Martin,
I see [1] requires a rebase, can you please take care?
Should be rebased.
At the moment, our automation is stable only on iSCSI, NFS, Gluster and
FC. Ceph is not supported and Cinder will be stabilized soon, AFAIR, it's not stable enough at the moment.
That is still pretty good.
[1] https://gerrit.ovirt.org/#/c/89830/
Thanks
On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik < mpolednik@redhat.com > wrote:
On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
Hi, sorry if I misunderstood, I waited for more input regarding what
areas have to be tested here.
I'd say that you have quite a bit of freedom in this regard.
GlusterFS should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some suite that covers basic operations (start & stop VM, migrate it), snapshots and merging them, and whatever else would be important for storage sanity.
mpolednik
On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik < mpolednik@redhat.com >
wrote:
On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
We can test this on iSCSI, NFS and GlusterFS. As for ceph and cinder, will have to check, since usually, we don't execute our automation on them.
Any update on this? I believe the gluster tests were successful, OST passes fine and unit tests pass fine, that makes the storage backends test the last required piece.
On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <ratamir@redhat.com> wrote:
+Elad
On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <eedri@redhat.com> wrote:
Please make sure to run as many OST suites on this patch as possible before merging (using 'ci please build')
But note that OST is not a way to verify the patch.
Such changes require testing with all storage types we support.
Nir
On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <mpolednik@redhat.com> wrote:
Hey,
I've created a patch[0] that is finally able to activate libvirt's dynamic_ownership for VDSM while not negatively affecting functionality of our storage code.
That of course comes with quite a bit of code removal, mostly in the area of host devices, hwrng and anything that touches devices; bunch of test changes and one XML generation caveat (storage is handled by VDSM, therefore disk relabelling needs to be disabled on the VDSM level).
Because of the scope of the patch, I welcome storage/virt/network people to review the code and consider the implication this change has on current/future features.
[0] https://gerrit.ovirt.org/#/c/89830/
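For readers unfamiliar with the XML generation caveat mentioned above, here is a minimal sketch of what disabling relabelling for a single disk can look like. It assumes libvirt's per-device `<seclabel>` override inside the disk `<source>` element; the helper name `disable_dac_relabel` and the sample disk XML are hypothetical and are not taken from the patch.

```python
import xml.etree.ElementTree as ET


def disable_dac_relabel(disk_xml):
    """Return disk XML with DAC relabelling switched off for its source.

    Hypothetical helper, not code from the patch: it adds
    <seclabel model='dac' relabel='no'/> under <source>, which asks
    libvirt not to change ownership of that image.
    """
    disk = ET.fromstring(disk_xml)
    source = disk.find('source')
    if source is None:
        # Nothing to label (for example an empty CD-ROM drive).
        return disk_xml
    if source.find("seclabel[@model='dac']") is None:
        ET.SubElement(source, 'seclabel', model='dac', relabel='no')
    return ET.tostring(disk, encoding='unicode')


# Sample input; the device path is made up for illustration.
SAMPLE_DISK = (
    "<disk type='block' device='disk'>"
    "<source dev='/dev/example_vg/example_lv'/>"
    "<target dev='vda' bus='virtio'/>"
    "</disk>"
)

if __name__ == '__main__':
    print(disable_dac_relabel(SAMPLE_DISK))
```

The point of the sketch is only that the override is per disk: devices without it would still be handled by libvirt once dynamic_ownership is enabled.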
In particular: dynamic_ownership was set to 0 prehistorically (as part of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) because libvirt, running as root, was not able to play properly with root-squash nfs mounts.
Have you attempted this use case?
I join Nir's request to run this with storage QE.
--
Raz Tamir Manager, RHV QE
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
<logs.tar.gz>
 
            On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi Dan,
In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, why did you install vdsm-hook-allocate_net?
adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
Thanks, michal
On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Start VM fails on:
2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
    main()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
    allocate_random_network(device_xml)
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
    net = _get_random_network()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
    available_nets = _parse_nets()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
    return [net for net in os.environ[AVAIL_NETS_KEY].split()]
  File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'equivnets'
(hooks:110)
2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
    domxml = hooks.before_vm_start(self._buildDomainXML(),
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
    dom, self.id, self._custom['custom'])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
    dev_custom)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
    params=customProperties)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
    raise exception.HookError(err)
HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
Hence, the success rate was 28% against 100% running with d/s (d/s). If needed, I'll compare against the latest master, but I think you get the picture with d/s.
vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 libvirt-3.9.0-14.el7_5.3.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 kernel 3.10.0-862.el7.x86_64 rhel7.5
Logs attached
On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
nvm, found gluster 3.12 repo, managed to install vdsm
On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
No, vdsm requires it:
Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64)
    Requires: glusterfs-fuse >= 3.12
    Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
Therefore, vdsm package installation is skipped upon force install.
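To see why the installed glusterfs-fuse does not satisfy the requirement, it helps to compare the versions numerically rather than as strings. The sketch below is a simplification: it splits dotted versions into integer tuples and is not RPM's real comparison algorithm, and the function names are hypothetical.

```python
def version_tuple(version):
    """Turn a dotted version such as '3.8.4' into (3, 8, 4)."""
    return tuple(int(part) for part in version.split('.'))


def satisfies_minimum(installed, required):
    """Return True if `installed` is at least `required`."""
    return version_tuple(installed) >= version_tuple(required)


if __name__ == '__main__':
    # Versions taken from the yum error quoted in this message.
    print(satisfies_minimum('3.8.4', '3.12'))
    # A plain string comparison is misleading here, since '8' > '1'.
    print('3.8.4' >= '3.12')
```

Read as numbers, 3.8 is older than 3.12, so the package manager refuses the install until a 3.12 repository is available.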
On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi guys,
The vdsm build from the patch requires glusterfs-fuse > 3.12. This is while the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7)
because it is still oVirt, not a downstream build. We can’t really do downstream builds with unmerged changes:/
Trying to get this gluster-fuse build, so far no luck. Is this requirement intentional?
it should work regardless, I guess you can force install it without the dependency
<logs.tar.gz>
 
            Hi Martin,
Can you please create a cherry-pick patch that is based on 4.2?
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi Dan,
In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, why did you install vdsm-hook-allocate_net?
adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
Thanks, michal
On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Start VM fails on:
2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
    main()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
    allocate_random_network(device_xml)
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
    net = _get_random_network()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
    available_nets = _parse_nets()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
    return [net for net in os.environ[AVAIL_NETS_KEY].split()]
  File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'equivnets'
(hooks:110)
2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
    domxml = hooks.before_vm_start(self._buildDomainXML(),
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
    dom, self.id, self._custom['custom'])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
    dev_custom)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
    params=customProperties)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
    raise exception.HookError(err)
HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
Hence, the success rate was 28% against 100% running with d/s (d/s). If needed, I'll compare against the latest master, but I think you get the picture with d/s.
vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 libvirt-3.9.0-14.el7_5.3.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 kernel 3.10.0-862.el7.x86_64 rhel7.5
Logs attached
On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
nvm, found gluster 3.12 repo, managed to install vdsm
On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
No, vdsm requires it:
Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64)
    Requires: glusterfs-fuse >= 3.12
    Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
Therefore, vdsm package installation is skipped upon force install.
<logs.tar.gz>
 
            On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
Hi Martin,
Can you please create a cherry-pick patch that is based on 4.2?
See https://gerrit.ovirt.org/#/c/90906/. The CI failure is unrelated (storage needs real env).
mpolednik
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi Dan,
In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, why did you install vdsm-hook-allocate_net?
adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
Thanks, michal
On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Start VM fails on:
2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
    main()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
    allocate_random_network(device_xml)
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
    net = _get_random_network()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
    available_nets = _parse_nets()
  File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
    return [net for net in os.environ[AVAIL_NETS_KEY].split()]
  File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'equivnets'
(hooks:110)
2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
    domxml = hooks.before_vm_start(self._buildDomainXML(),
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
    dom, self.id, self._custom['custom'])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
    dev_custom)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
    params=customProperties)
  File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
    raise exception.HookError(err)
HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
Hence, the success rate was 28%, against 100% when running with the downstream (d/s) build. If needed, I'll compare against the latest master, but I think you get the picture with d/s.
vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64
libvirt-3.9.0-14.el7_5.3.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64
kernel 3.10.0-862.el7.x86_64
rhel7.5
Logs attached
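For reference, the KeyError in the traceback above is raised because the hook indexes os.environ directly for the 'equivnets' custom property, which is not set for this VM. A minimal sketch of a lookup that tolerates the missing property, assuming the hook should simply do nothing in that case. AVAIL_NETS_KEY and 'equivnets' are taken from the traceback; the function names parse_nets and should_allocate are hypothetical, and this is not the actual hook code:

```python
import os

# Custom property name as it appears in the traceback.
AVAIL_NETS_KEY = 'equivnets'


def parse_nets(environ=None):
    """Return the candidate networks, or [] when the property is unset.

    Hypothetical helper: uses .get() with a default instead of indexing,
    so a missing key no longer raises KeyError.
    """
    if environ is None:
        environ = os.environ
    return environ.get(AVAIL_NETS_KEY, '').split()


def should_allocate(environ=None):
    """Hypothetical guard: only allocate when at least one network is listed."""
    return bool(parse_nets(environ))
```

With a guard like this, a VM started without the custom property would skip the allocation step instead of failing VM start with rc=2.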
On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
nvm, found gluster 3.12 repo, managed to install vdsm
On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>
> No, vdsm requires it:
>
> Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64)
> Requires: glusterfs-fuse >= 3.12
> Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
>
> Therefore, vdsm package installation is skipped upon force install.
>
> On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
>>
>> On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>>
>> Hi guys,
>>
>> The vdsm build from the patch requires glusterfs-fuse > 3.12. This is while the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7)
>>
>> because it is still oVirt, not a downstream build. We can’t really do downstream builds with unmerged changes:/
>>
>> Trying to get this gluster-fuse build, so far no luck.
>> Is this requirement intentional?
>>
>> it should work regardless, I guess you can force install it without the dependency
>>
>> On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
>>>
>>> Hi Elad,
>>> to make it easier to compare, Martin backported the change to 4.2 so it is actually comparable with a run without that patch. Would you please try that out?
>>> It would be best to have 4.2 upstream and this[1] run to really minimize the noise.
>>>
>>> Thanks,
>>> michal
>>>
>>> [1] http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el7-x86_64/28/
>>>
>>> On 27 Apr 2018, at 09:23, Martin Polednik <mpolednik@redhat.com> wrote:
>>>
>>> On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
>>>
>>> I will update with the results of the next tier1 execution on latest 4.2.3
>>>
>>> That isn't master but old branch though. Could you run it against *current* VDSM master?
>>>
>>> On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <mpolednik@redhat.com> wrote:
>>>
>>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
>>>
>>> Hi, I've triggered another execution [1] due to some issues I saw in the first which are not related to the patch.
>>>
>>> The success rate is 78% which is low comparing to tier1 executions with code from downstream builds (95-100% success rates) [2].
>>>
>>> Could you run the current master (without the dynamic_ownership patch) so that we have viable comparision?
>>>
>>> From what I could see so far, there is an issue with move and copy operations to and from Gluster domains. For example [3].
>>>
>>> The logs are attached.
>>>
>>> [1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/
>>>
>>> [2] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/
>>>
>>> [3]
>>> 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' from=::ffff:10.35.161.182,40936, flow_id=disks_syncAction_ba6b2630-5976-4935, task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>>>     return fn(*args, **kargs)
>>>   File "<string>", line 2, in deleteImage
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
>>>     ret = func(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
>>>     raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
>>> ImageDoesNotExistInSD: Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>>>
>>> 2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted: "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code 268 (task:1181)
>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' (dispatcher:82)
>>>
>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>>>
>>> Triggered a sanity tier1 execution [1] using [2], which covers all the requested areas, on iSCSI, NFS and Gluster.
>>> I'll update with the results.
>>>
>>> [1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2_dev/job/rhv-4.2-ge-flow-storage/1161/
>>>
>>> [2] https://gerrit.ovirt.org/#/c/89830/
>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64
>>>
>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik <mpolednik@redhat.com> wrote:
>>>
>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote:
>>>
>>> Hi Martin,
>>>
>>> I see [1] requires a rebase, can you please take care?
>>>
>>> Should be rebased.
>>>
>>> At the moment, our automation is stable only on iSCSI, NFS, Gluster and FC.
>>> Ceph is not supported and Cinder will be stabilized soon, AFAIR, it's not stable enough at the moment.
>>>
>>> That is still pretty good.
>>>
>>> [1] https://gerrit.ovirt.org/#/c/89830/
>>>
>>> Thanks
>>>
>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik <mpolednik@redhat.com> wrote:
>>>
>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote:
>>>
>>> Hi, sorry if I misunderstood, I waited for more input regarding what areas have to be tested here.
>>>
>>> I'd say that you have quite a bit of freedom in this regard. GlusterFS should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some suite that covers basic operations (start & stop VM, migrate it), snapshots and merging them, and whatever else would be important for storage sanity.
>>>
>>> mpolednik
>>>
>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik <mpolednik@redhat.com> wrote:
>>>
>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote:
>>>
>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and cinder, will have to check, since usually, we don't execute our automation on them.
>>>
>>> Any update on this? I believe the gluster tests were successful, OST passes fine and unit tests pass fine, that makes the storage backends test the last required piece.
>>>
>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <ratamir@redhat.com> wrote:
>>>
>>> +Elad
>>>
>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg <danken@redhat.com> wrote:
>>>
>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer <nsoffer@redhat.com> wrote:
>>>
>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <eedri@redhat.com> wrote:
>>>
>>> Please make sure to run as much OST suites on this patch as possible before merging ( using 'ci please build' )
>>>
>>> But note that OST is not a way to verify the patch.
>>>
>>> Such changes require testing with all storage types we support.
>>>
>>> Nir
>>>
>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik <mpolednik@redhat.com> wrote:
>>>
>>> Hey,
>>>
>>> I've created a patch[0] that is finally able to activate libvirt's dynamic_ownership for VDSM while not negatively affecting functionality of our storage code.
>>>
>>> That of course comes with quite a bit of code removal, mostly in the area of host devices, hwrng and anything that touches devices; bunch of test changes and one XML generation caveat (storage is handled by VDSM, therefore disk relabelling needs to be disabled on the VDSM level).
>>>
>>> Because of the scope of the patch, I welcome storage/virt/network people to review the code and consider the implication this change has on current/future features.
>>>
>>> [0] https://gerrit.ovirt.org/#/c/89830/
>>>
>>> In particular: dynamic_ownership was set to 0 prehistorically (as part of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) because libvirt, running as root, was not able to play properly with root-squash nfs mounts.
>>>
>>> Have you attempted this use case?
>>>
>>> I join to Nir's request to run this with storage QE.
>>>
>>> --
>>>
>>> Raz Tamir
>>> Manager, RHV QE
>>>
>>> _______________________________________________
>>> Devel mailing list
>>> Devel@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/devel
<logs.tar.gz>
 
Triggered a sanity automation execution using [1], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.

[1] https://gerrit.ovirt.org/#/c/90906/
vdsm-4.20.28-6.gitc23aef6.el7.x86_64

On Tue, May 29, 2018 at 4:26 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
Hi Martin,
Can you please create a cherry-pick patch that is based on 4.2?
See https://gerrit.ovirt.org/#/c/90906/. The CI failure is unrelated (storage needs real env).
mpolednik
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi Dan,
In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
Hi Elad, why did you install vdsm-hook-allocate_net?
adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
Thanks, michal
 
Execution is done; 59/65 cases passed. The latest 4.2.4 execution ended with 100%, so the failures were probably caused by the changes in the patch. Failures are mainly in the preview-snapshot cases. Execution info was provided to Martin separately.

On Wed, May 30, 2018 at 5:44 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Triggered a sanity automation execution using [1], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] https://gerrit.ovirt.org/#/c/90906/
vdsm-4.20.28-6.gitc23aef6.el7.x86_64
On Tue, May 29, 2018 at 4:26 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
Hi Martin,
Can you please create a cherry-pick patch that is based on 4.2?
See https://gerrit.ovirt.org/#/c/90906/. The CI failure is unrelated (storage needs real env).
mpolednik
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi Dan,
In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
> Hi Elad,
> why did you install vdsm-hook-allocate_net?
>
> adding Dan as I think the hook is not supposed to fail this badly in any case
yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
>
> Thanks,
> michal
>
> On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>
> Start VM fails on:
>
> 2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
> 2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
> 2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
>     main()
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
>     allocate_random_network(device_xml)
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
>     net = _get_random_network()
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
>     available_nets = _parse_nets()
>   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
>     return [net for net in os.environ[AVAIL_NETS_KEY].split()]
>   File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
>     raise KeyError(key)
> KeyError: 'equivnets'
>
> (hooks:110)
> 2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
>     self._run()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
>     domxml = hooks.before_vm_start(self._buildDomainXML(),
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
>     dom, self.id, self._custom['custom'])
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
>     dev_custom)
>   File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
>     params=customProperties)
>   File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
>     raise exception.HookError(err)
> HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
>
> Hence, the success rate was 28% against 100% running with d/s (d/s). If needed, I'll compare against the latest master, but I think you get the
> picture with d/s. > > vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 > libvirt-3.9.0-14.el7_5.3.x86_64 > qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 > kernel 3.10.0-862.el7.x86_64 > rhel7.5 > > > Logs attached > > On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon < ebenahar@redhat.com> > wrote: >> >> nvm, found gluster 3.12 repo, managed to install vdsm >> >> On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon < ebenahar@redhat.com
>> wrote: >>> >>> No, vdsm requires it: >>> >>> Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 >>> (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64) >>> Requires: glusterfs-fuse >= 3.12 >>> Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3) >>> >>> Therefore, vdsm package installation is skipped upon force install. >>> >>> On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek >>> <michal.skrivanek@redhat.com> wrote: >>>> >>>> >>>> >>>> On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote: >>>> >>>> Hi guys, >>>> >>>> The vdsm build from the patch requires glusterfs-fuse > 3.12. This is >>>> while the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7) >>>> >>>> >>>> because it is still oVirt, not a downstream build. We can’t really do >>>> downstream builds with unmerged changes:/ >>>> >>>> Trying to get this gluster-fuse build, so far no luck. >>>> Is this requirement intentional? >>>> >>>> >>>> it should work regardless, I guess you can force install it without >>>> the >>>> dependency >>>> >>>> >>>> On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek >>>> <michal.skrivanek@redhat.com> wrote: >>>>> >>>>> Hi Elad, >>>>> to make it easier to compare, Martin backported the change to 4.2 so >>>>> it >>>>> is actually comparable with a run without that patch. Would you >>>>> please try >>>>> that out? >>>>> It would be best to have 4.2 upstream and this[1] run to really >>>>> minimize the noise. >>>>> >>>>> Thanks, >>>>> michal >>>>> >>>>> [1] >>>>> >>>>> http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on- demand-el7-x86_64/28/ >>>>> >>>>> On 27 Apr 2018, at 09:23, Martin Polednik < mpolednik@redhat.com> >>>>> wrote: >>>>> >>>>> On 24/04/18 00:37 +0300, Elad Ben Aharon wrote: >>>>> >>>>> I will update with the results of the next tier1 execution on latest >>>>> 4.2.3 >>>>> >>>>> >>>>> That isn't master but old branch though. Could you run it against >>>>> *current* VDSM master? 
>>>>> >>>>> On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik >>>>> <mpolednik@redhat.com> >>>>> wrote: >>>>> >>>>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote: >>>>> >>>>> Hi, I've triggered another execution [1] due to some issues I saw in >>>>> the >>>>> first which are not related to the patch. >>>>> >>>>> The success rate is 78% which is low comparing to tier1 executions >>>>> with >>>>> code from downstream builds (95-100% success rates) [2]. >>>>> >>>>> >>>>> Could you run the current master (without the dynamic_ownership >>>>> patch) >>>>> so that we have viable comparision? >>>>> >>>>> From what I could see so far, there is an issue with move and copy >>>>> >>>>> operations to and from Gluster domains. For example [3]. >>>>> >>>>> The logs are attached. >>>>> >>>>> >>>>> [1] >>>>> *https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv >>>>> -4.2-ge-runner-tier1-after-upgrade/7/testReport/ >>>>> <https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv >>>>> -4.2-ge-runner-tier1-after-upgrade/7/testReport/>* >>>>> >>>>> >>>>> >>>>> [2] >>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/ >>>>> >>>>> rhv-4.2-ge-runner-tier1-after-upgrade/7/ >>>>> >>>>> >>>>> >>>>> [3] >>>>> 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH >>>>> deleteImage error=Image does not exist in domain: >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33, >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' >>>>> from=: >>>>> :ffff:10.35.161.182,40936, >>>>> flow_id=disks_syncAction_ba6b2630-5976-4935, >>>>> task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51) >>>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) >>>>> [storage.TaskManager.Task] >>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error >>>>> (task:875) >>>>> Traceback (most recent call last): >>>>> File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py",
>>>>> line 882, in _run
>>>>>     return fn(*args, **kargs)
>>>>>   File "<string>", line 2, in deleteImage
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
>>>>>     ret = func(*args, **kwargs)
>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
>>>>>     raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
>>>>> ImageDoesNotExistInSD: Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>>>>>
>>>>> 2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted: "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-
>>>>> 5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'" - code >>>>> 268 >>>>> (task:1181) >>>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) [storage.Dispatcher] >>>>> FINISH >>>>> deleteImage error=Image does not exist in domain: >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33, >>>>> domain=e5fd29c8-52ba-467e-be09 >>>>> -ca40ff054d >>>>> d4' (dispatcher:82) >>>>> >>>>> >>>>> >>>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon >>>>> <ebenahar@redhat.com> >>>>> wrote: >>>>> >>>>> Triggered a sanity tier1 execution [1] using [2], which covers all >>>>> the >>>>> >>>>> requested areas, on iSCSI, NFS and Gluster. >>>>> I'll update with the results. >>>>> >>>>> [1] >>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.2 >>>>> _dev/job/rhv-4.2-ge-flow-storage/1161/ >>>>> >>>>> [2] >>>>> https://gerrit.ovirt.org/#/c/89830/ >>>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64 >>>>> >>>>> >>>>> >>>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik >>>>> <mpolednik@redhat.com> >>>>> wrote: >>>>> >>>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote: >>>>> >>>>> >>>>> Hi Martin, >>>>> >>>>> >>>>> I see [1] requires a rebase, can you please take care? >>>>> >>>>> >>>>> Should be rebased. >>>>> >>>>> At the moment, our automation is stable only on iSCSI, NFS, Gluster >>>>> and >>>>> >>>>> FC. >>>>> Ceph is not supported and Cinder will be stabilized soon, AFAIR, >>>>> it's >>>>> not >>>>> stable enough at the moment. >>>>> >>>>> >>>>> That is still pretty good. >>>>> >>>>> >>>>> [1] https://gerrit.ovirt.org/#/c/89830/ >>>>> >>>>> >>>>> >>>>> Thanks >>>>> >>>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik >>>>> <mpolednik@redhat.com >>>>> > >>>>> wrote: >>>>> >>>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote: >>>>> >>>>> >>>>> Hi, sorry if I misunderstood, I waited for more input regarding what >>>>> >>>>> areas >>>>> have to be tested here. >>>>> >>>>> >>>>> I'd say that you have quite a bit of freedom in this regard. 
>>>>> >>>>> GlusterFS >>>>> should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some >>>>> suite >>>>> that covers basic operations (start & stop VM, migrate it), >>>>> snapshots >>>>> and merging them, and whatever else would be important for storage >>>>> sanity. >>>>> >>>>> mpolednik >>>>> >>>>> >>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik < >>>>> mpolednik@redhat.com >>>>> > >>>>> >>>>> wrote: >>>>> >>>>> >>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote: >>>>> >>>>> >>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and >>>>> cinder, >>>>> >>>>> will >>>>> >>>>> have to check, since usually, we don't execute our automation on >>>>> them. >>>>> >>>>> >>>>> Any update on this? I believe the gluster tests were successful, >>>>> OST >>>>> >>>>> passes fine and unit tests pass fine, that makes the storage >>>>> backends >>>>> test the last required piece. >>>>> >>>>> >>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir <ratamir@redhat.com
>>>>> wrote: >>>>> >>>>> >>>>> +Elad >>>>> >>>>> >>>>> >>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg < danken@redhat.com >>>>> >>>>> > >>>>> wrote: >>>>> >>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer < nsoffer@redhat.com> >>>>> wrote: >>>>> >>>>> >>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri <eedri@redhat.com> >>>>> >>>>> wrote: >>>>> >>>>> >>>>> Please make sure to run as much OST suites on this patch as >>>>> >>>>> possible >>>>> >>>>> before merging ( using 'ci please build' ) >>>>> >>>>> >>>>> >>>>> But note that OST is not a way to verify the patch. >>>>> >>>>> >>>>> Such changes require testing with all storage types we support. >>>>> >>>>> Nir >>>>> >>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik < >>>>> mpolednik@redhat.com >>>>> > >>>>> >>>>> wrote: >>>>> >>>>> >>>>> Hey, >>>>> >>>>> >>>>> I've created a patch[0] that is finally able to activate >>>>> >>>>> libvirt's >>>>> dynamic_ownership for VDSM while not negatively affecting >>>>> functionality of our storage code. >>>>> >>>>> That of course comes with quite a bit of code removal, mostly >>>>> in >>>>> the >>>>> area of host devices, hwrng and anything that touches devices; >>>>> bunch >>>>> of test changes and one XML generation caveat (storage is >>>>> handled >>>>> by >>>>> VDSM, therefore disk relabelling needs to be disabled on the >>>>> VDSM >>>>> level). >>>>> >>>>> Because of the scope of the patch, I welcome >>>>> storage/virt/network >>>>> people to review the code and consider the implication this >>>>> change >>>>> has >>>>> on current/future features. >>>>> >>>>> [0] https://gerrit.ovirt.org/#/c/89830/ >>>>> >>>>> >>>>> In particular: dynamic_ownership was set to 0 prehistorically >>>>> (as >>>>> >>>>> >>>>> part >>>>> >>>>> >>>>> of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) because >>>>> >>>>> libvirt, >>>>> running as root, was not able to play properly with root-squash >>>>> nfs >>>>> mounts. >>>>> >>>>> Have you attempted this use case? 
>>>>> >>>>> I join to Nir's request to run this with storage QE. >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> >>>>> >>>>> Raz Tamir >>>>> Manager, RHV QE >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Devel mailing list >>>>> Devel@ovirt.org >>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>> >>>>> >>>>> >>>> >>>> >>> >> > > <logs.tar.gz> > >
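The hook traceback quoted in this thread ends in `os.environ[AVAIL_NETS_KEY]` raising `KeyError: 'equivnets'`, which suggests the hook ran for a VM that did not set that custom property. As a minimal sketch only, and not the actual hook code, a more tolerant lookup could look like the following. The names `AVAIL_NETS_KEY` and the `'equivnets'` key come from the traceback; `parse_nets` and `should_allocate` are hypothetical helpers, and treating a missing property as "nothing to do" is an assumption about the desired behaviour.

```python
import os

# Key name taken from the KeyError in the quoted traceback.
AVAIL_NETS_KEY = 'equivnets'


def parse_nets(environ=None):
    """Return the networks listed in the custom property, or [] if it is unset.

    Hypothetical replacement for the lookup seen in the traceback:
    environ.get() avoids the KeyError that os.environ[AVAIL_NETS_KEY] raises
    when the property is missing.
    """
    if environ is None:
        environ = os.environ
    return environ.get(AVAIL_NETS_KEY, '').split()


def should_allocate(environ=None):
    """Hypothetical guard: only modify the device XML when networks are given."""
    return bool(parse_nets(environ))
```

With a guard like this, a VM without the property would be left untouched instead of failing to start, which matches the expectation voiced in the thread that installing the hook should not block VM startup.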
 
            On 31/05/18 12:47 +0300, Elad Ben Aharon wrote:
Execution is done, 59/65 cases passed. The latest 4.2.4 execution ended with 100%, so the failures were probably caused by the changes in the patch. Failures are mainly on preview snapshots.
Execution info provided to Martin separately.
I'm currently investigating the snapshot breakage, thanks Elad!
On Wed, May 30, 2018 at 5:44 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Triggered a sanity automation execution using [1], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] https://gerrit.ovirt.org/#/c/90906/
vdsm-4.20.28-6.gitc23aef6.el7.x86_64
On Tue, May 29, 2018 at 4:26 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
Hi Martin,
Can you please create a cherry-pick patch that is based on 4.2?
See https://gerrit.ovirt.org/#/c/90906/. The CI failure is unrelated (storage needs real env).
mpolednik
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Hi Dan,
In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
> On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
> > Hi Elad,
> > why did you install vdsm-hook-allocate_net?
> >
> > adding Dan as I think the hook is not supposed to fail this badly in any case
>
> yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
>
> But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
 
            On Thu, May 31, 2018 at 1:05 PM Martin Polednik <mpolednik@redhat.com> wrote:
On 31/05/18 12:47 +0300, Elad Ben Aharon wrote:
Execution is done, 59/65 cases passed. The latest 4.2.4 execution ended with 100%, so the failures were probably caused by the changes in the patch.
Failures are mainly on preview snapshots.
Can we run the same job on the patch before Martin's patch? Maybe the issues are already in master, caused by other patches?
Execution info provided to Martin separately.
I'm currently investigating the snapshot breakage, thanks Elad!
On Wed, May 30, 2018 at 5:44 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Triggered a sanity automation execution using [1], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] https://gerrit.ovirt.org/#/c/90906/ vdsm-4.20.28-6.gitc23aef6.el7.x86_64
On Tue, May 29, 2018 at 4:26 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
Hi Martin,
Can you please create a cherry-pick patch that is based on 4.2?
See https://gerrit.ovirt.org/#/c/90906/. The CI failure is unrelated (storage needs real env).
mpolednik
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
> Hi Dan,
>
> In the last execution, the success rate was very low due to a large number of failures on start VM caused, according to Michal, by the vdsm-hook-allocate_net that was installed on the host.
>
> This is the latest status here, would you like me to re-execute?
yes, of course. but you should rebase Polednik's code on top of *current* ovirt-4.2.3 branch.
> If so, with or W/O vdsm-hook-allocate_net installed?
There was NO reason to have that installed. Please keep it (and any other needless code) out of the test environment.
> > On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
>>
>> On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
>> > Hi Elad,
>> > why did you install vdsm-hook-allocate_net?
>> >
>> > adding Dan as I think the hook is not supposed to fail this badly in any case
>>
>> yep, this looks bad and deserves a little bug report. Installing this little hook should not block vm startup.
>>
>> But more importantly - what is the conclusion of this thread? Do we have a green light from QE to take this in?
>>
>> > Thanks,
>> > michal
>> >
>> > On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>> >
>> > Start VM fails on:
>> >
>> > 2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' -> u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938-9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a01928211' (storagexml:334)
>> > 2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', options=None) from=::ffff:10.35.161.127,53512, task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c2758 (api:46)
>> > 2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 err=vm net allocation hook: [unexpected error]: Traceback (most recent call last):
>> >   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>
>> >     main()
>> >   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main
>> >     allocate_random_network(device_xml)
>> >   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network
>> >     net = _get_random_network()
>> >   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network
>> >     available_nets = _parse_nets()
>> >   File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets
>> >     return [net for net in os.environ[AVAIL_NETS_KEY].split()]
>> >   File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__
>> >     raise KeyError(key)
>> > KeyError: 'equivnets'
>> >
>> > (hooks:110)
>> > 2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process failed (vm:943)
>> > Traceback (most recent call last):
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm
>> >     self._run()
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2861, in _run
>> >     domxml = hooks.before_vm_start(self._buildDomainXML(),
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2254, in _buildDomainXML
>> >     dom, self.id, self._custom['custom'])
>> >   File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_preprocess.py", line 240, in replace_device_xml_with_hooks_xml
>> >     dev_custom)
>> >   File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 134, in before_device_create
>> >     params=customProperties)
>> >   File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", line 120, in _runHooksDir
>> >     raise exception.HookError(err)
>> > HookError: Hook Error: ('vm net allocation hook: [unexpected error]: Traceback (most recent call last):\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 105, in <module>\n main()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 93, in main\n allocate_random_network(device_xml)\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 62, in allocate_random_network\n net = _get_random_network()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 50, in _get_random_network\n available_nets = _parse_nets()\n File "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_net", line 46, in _parse_nets\n return [net for net in os.environ[AVAIL_NETS_KEY].split()]\n File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',)
>> >
>> > Hence, the success rate was 28% against 100% running with d/s (d/s). If needed, I'll compare against the latest master, but I think you get the picture with d/s.
>> >
>> > vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64
>> > libvirt-3.9.0-14.el7_5.3.x86_64
>> > qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64
>> > kernel 3.10.0-862.el7.x86_64
>> > rhel7.5
>> >
>> > Logs attached
>> >
>> > On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>> >>
>> >> nvm, found gluster 3.12 repo, managed to install vdsm
>> >>
>> >> On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>> >>>
>> >>> No, vdsm requires it:
>> >>>
>> >>> Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64)
>> >>> Requires: glusterfs-fuse >= 3.12
>> >>> Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 (@rhv-4.2.3)
>> >>>
>> >>> Therefore, vdsm package installation is skipped upon force install.
>> >>>
>> >>> On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
>> >>>>
>> >>>> On 5 May 2018, at 00:38, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>> >>>>
>> >>>> Hi guys,
>> >>>>
>> >>>> The vdsm build from the patch requires glusterfs-fuse > 3.12. This is while the latest 4.2.3-5 d/s build requires 3.8.4 (3.4.0.59rhs-1.el7)
>> >>>>
>> >>>> because it is still oVirt, not a downstream build. We can’t really do downstream builds with unmerged changes:/
>> >>>>
>> >>>> Trying to get this gluster-fuse build, so far no luck.
>> >>>> Is this requirement intentional?
>> >>>>
>> >>>> it should work regardless, I guess you can force install it without the dependency
>> >>>>
>> >>>> On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek <michal.skrivanek@redhat.com> wrote:
>> >>>>>
>> >>>>> Hi Elad,
>> >>>>> to make it easier to compare, Martin backported the change to 4.2 so it is actually comparable with a run without that patch. Would you please try that out?
>> >>>>> It would be best to have 4.2 upstream and this[1] run to really minimize the noise.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> michal
>> >>>>>
>> >>>>> [1] http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on-demand-el7-x86_64/28/
>> >>>>>
>> >>>>> On 27 Apr 2018, at 09:23, Martin Polednik <mpolednik@redhat.com> wrote:
>> >>>>>
>> >>>>> On 24/04/18 00:37 +0300, Elad Ben Aharon wrote:
>> >>>>>
>> >>>>> I will update with the results of the next tier1 execution on latest 4.2.3
>> >>>>>
>> >>>>> That isn't master but old branch though. Could you run it against *current* VDSM master?
>> >>>>>
>> >>>>> On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik <mpolednik@redhat.com> wrote:
>> >>>>>
>> >>>>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote:
>> >>>>>
>> >>>>> Hi, I've triggered another execution [1] due to some issues I saw in the first which are not related to the patch.
>> >>>>>
>> >>>>> The success rate is 78% which is low comparing to tier1 executions with code from downstream builds (95-100% success rates) [2].
>> >>>>>
>> >>>>> Could you run the current master (without the dynamic_ownership patch) so that we have viable comparision?
>> >>>>>
>> >>>>> From what I could see so far, there is an issue with move and copy operations to and from Gluster domains. For example [3].
>> >>>>>
>> >>>>> The logs are attached.
>> >>>>>
>> >>>>> [1] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/testReport/
>> >>>>>
>> >>>>> [2] https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1-after-upgrade/7/
>> >>>>>
>> >>>>> [3]
>> >>>>> 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] FINISH deleteImage error=Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' from=::ffff:10.35.161.182,40936, flow_id=disks_syncAction_ba6b2630-5976-4935, task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51)
>> >>>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error (task:875)
>> >>>>> Traceback (most recent call last):
>> >>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>> >>>>>     return fn(*args, **kargs)
>> >>>>>   File "<string>", line 2, in deleteImage
>> >>>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
>> >>>>>     ret = func(*args, **kwargs)
>> >>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1503, in deleteImage
>> >>>>>     raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
>> >>>>> ImageDoesNotExistInSD: Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'
>> >>>>>
>> >>>>> 2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task is aborted: "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835-5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'"
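Editorial note on the hook failure quoted above: the `KeyError: 'equivnets'` is raised because `_parse_nets` indexes `os.environ` directly, so a VM without that custom property makes the hook exit with rc=2 and the VM start fails. The following is only a minimal sketch of a more tolerant lookup, not the actual vdsm-hook-allocate_net code; `parse_nets` and `should_allocate` are hypothetical names, and only the key name `equivnets` is taken from the traceback.

```python
import os

# 'equivnets' is the key named in the quoted KeyError; everything else in
# this sketch is a hypothetical illustration, not the real hook code.
AVAIL_NETS_KEY = 'equivnets'


def parse_nets(environ=None):
    """Return the candidate networks, or an empty list if the property is unset."""
    if environ is None:
        environ = os.environ
    # .get() with a default avoids the KeyError seen in the traceback.
    return environ.get(AVAIL_NETS_KEY, '').split()


def should_allocate(environ=None):
    """Allocate a network only when the VM actually carries the property."""
    return bool(parse_nets(environ))
```

With a lookup like this, a hook could leave the device XML untouched and exit successfully when the property is absent, which matches Dan's point that installing the hook should not block VM startup.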
 
Martin, please update, if you think the failures are not related to your patch I'll test with the master as Nir suggested.
Thanks
On Thu, May 31, 2018 at 1:19 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, May 31, 2018 at 1:05 PM Martin Polednik <mpolednik@redhat.com> wrote:
On 31/05/18 12:47 +0300, Elad Ben Aharon wrote:
Execution is done, 59/65 cases passed. Latest 4.2.4 execution ended with 100%, so the failures were probably caused by the changes done in the patch.
Failures are mainly on preview snapshots.
Can we run the same job on the patch before Martin's patch? Maybe the issues are already in master, caused by other patches?
Execution info provided to Martin separately.
I'm currently investigating the snapshot breakage, thanks Elad!
> code > >> >>>>> 268 > >> >>>>> (task:1181) > >> >>>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) > [storage.Dispatcher] > >> >>>>> FINISH > >> >>>>> deleteImage error=Image does not exist in domain: > >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33, > >> >>>>> domain=e5fd29c8-52ba-467e-be09 > >> >>>>> -ca40ff054d > >> >>>>> d4' (dispatcher:82) > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon > >> >>>>> <ebenahar@redhat.com> > >> >>>>> wrote: > >> >>>>> > >> >>>>> Triggered a sanity tier1 execution [1] using [2], which covers > all > >> >>>>> the > >> >>>>> > >> >>>>> requested areas, on iSCSI, NFS and Gluster. > >> >>>>> I'll update with the results. > >> >>>>> > >> >>>>> [1] > >> >>>>> https://rhv-jenkins.rhev-ci-vm s.eng.rdu2.redhat.com/view/4.2 > >> >>>>> _dev/job/rhv-4.2-ge-flow-storage/1161/ > >> >>>>> > >> >>>>> [2] > >> >>>>> https://gerrit.ovirt.org/#/c/89830/ > >> >>>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64 > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik > >> >>>>> <mpolednik@redhat.com> > >> >>>>> wrote: > >> >>>>> > >> >>>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote: > >> >>>>> > >> >>>>> > >> >>>>> Hi Martin, > >> >>>>> > >> >>>>> > >> >>>>> I see [1] requires a rebase, can you please take care? > >> >>>>> > >> >>>>> > >> >>>>> Should be rebased. > >> >>>>> > >> >>>>> At the moment, our automation is stable only on iSCSI, NFS, > Gluster > >> >>>>> and > >> >>>>> > >> >>>>> FC. > >> >>>>> Ceph is not supported and Cinder will be stabilized soon, > AFAIR, > >> >>>>> it's > >> >>>>> not > >> >>>>> stable enough at the moment. > >> >>>>> > >> >>>>> > >> >>>>> That is still pretty good. 
> >> >>>>> > >> >>>>> > >> >>>>> [1] https://gerrit.ovirt.org/#/c/89830/ > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> Thanks > >> >>>>> > >> >>>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik > >> >>>>> <mpolednik@redhat.com > >> >>>>> > > >> >>>>> wrote: > >> >>>>> > >> >>>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote: > >> >>>>> > >> >>>>> > >> >>>>> Hi, sorry if I misunderstood, I waited for more input regarding > what > >> >>>>> > >> >>>>> areas > >> >>>>> have to be tested here. > >> >>>>> > >> >>>>> > >> >>>>> I'd say that you have quite a bit of freedom in this regard. > >> >>>>> > >> >>>>> GlusterFS > >> >>>>> should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some > >> >>>>> suite > >> >>>>> that covers basic operations (start & stop VM, migrate it), > >> >>>>> snapshots > >> >>>>> and merging them, and whatever else would be important for > storage > >> >>>>> sanity. > >> >>>>> > >> >>>>> mpolednik > >> >>>>> > >> >>>>> > >> >>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik < > >> >>>>> mpolednik@redhat.com > >> >>>>> > > >> >>>>> > >> >>>>> wrote: > >> >>>>> > >> >>>>> > >> >>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote: > >> >>>>> > >> >>>>> > >> >>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and > >> >>>>> cinder, > >> >>>>> > >> >>>>> will > >> >>>>> > >> >>>>> have to check, since usually, we don't execute our automation > on > >> >>>>> them. > >> >>>>> > >> >>>>> > >> >>>>> Any update on this? I believe the gluster tests were > successful, > >> >>>>> OST > >> >>>>> > >> >>>>> passes fine and unit tests pass fine, that makes the storage > >> >>>>> backends > >> >>>>> test the last required piece. 
> >> >>>>> > >> >>>>> > >> >>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir < ratamir@redhat.com > > > >> >>>>> wrote: > >> >>>>> > >> >>>>> > >> >>>>> +Elad > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg < > danken@redhat.com > >> >>>>> > >> >>>>> > > >> >>>>> wrote: > >> >>>>> > >> >>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer < > nsoffer@redhat.com> > >> >>>>> wrote: > >> >>>>> > >> >>>>> > >> >>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri < eedri@redhat.com> > >> >>>>> > >> >>>>> wrote: > >> >>>>> > >> >>>>> > >> >>>>> Please make sure to run as much OST suites on this patch as > >> >>>>> > >> >>>>> possible > >> >>>>> > >> >>>>> before merging ( using 'ci please build' ) > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> But note that OST is not a way to verify the patch. > >> >>>>> > >> >>>>> > >> >>>>> Such changes require testing with all storage types we support. > >> >>>>> > >> >>>>> Nir > >> >>>>> > >> >>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik < > >> >>>>> mpolednik@redhat.com > >> >>>>> > > >> >>>>> > >> >>>>> wrote: > >> >>>>> > >> >>>>> > >> >>>>> Hey, > >> >>>>> > >> >>>>> > >> >>>>> I've created a patch[0] that is finally able to activate > >> >>>>> > >> >>>>> libvirt's > >> >>>>> dynamic_ownership for VDSM while not negatively affecting > >> >>>>> functionality of our storage code. > >> >>>>> > >> >>>>> That of course comes with quite a bit of code removal, mostly > >> >>>>> in > >> >>>>> the > >> >>>>> area of host devices, hwrng and anything that touches devices; > >> >>>>> bunch > >> >>>>> of test changes and one XML generation caveat (storage is > >> >>>>> handled > >> >>>>> by > >> >>>>> VDSM, therefore disk relabelling needs to be disabled on the > >> >>>>> VDSM > >> >>>>> level). 
> >> >>>>> > >> >>>>> Because of the scope of the patch, I welcome > >> >>>>> storage/virt/network > >> >>>>> people to review the code and consider the implication this > >> >>>>> change > >> >>>>> has > >> >>>>> on current/future features. > >> >>>>> > >> >>>>> [0] https://gerrit.ovirt.org/#/c/89830/ > >> >>>>> > >> >>>>> > >> >>>>> In particular: dynamic_ownership was set to 0 prehistorically > >> >>>>> (as > >> >>>>> > >> >>>>> > >> >>>>> part > >> >>>>> > >> >>>>> > >> >>>>> of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) > because > >> >>>>> > >> >>>>> libvirt, > >> >>>>> running as root, was not able to play properly with root-squash > >> >>>>> nfs > >> >>>>> mounts. > >> >>>>> > >> >>>>> Have you attempted this use case? > >> >>>>> > >> >>>>> I join to Nir's request to run this with storage QE. > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> -- > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> Raz Tamir > >> >>>>> Manager, RHV QE > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> _______________________________________________ > >> >>>>> Devel mailing list > >> >>>>> Devel@ovirt.org > >> >>>>> http://lists.ovirt.org/mailman/listinfo/devel > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>> > >> >>>> > >> >>> > >> >> > >> > > >> > <logs.tar.gz> > >> > > >> > > > > > > >
On Wed, May 30, 2018 at 5:44 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Triggered a sanity automation execution using [1], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] *https://gerrit.ovirt.org/#/c/90906/ <https://gerrit.ovirt.org/#/c/ 90906/>* vdsm-4.20.28-6.gitc23aef6.el7.x86_64
On Tue, May 29, 2018 at 4:26 PM, Martin Polednik <mpolednik@redhat.com
wrote:
On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
Hi Martin,
Can you please create a cerry pick patch that is based on 4.2?
See https://gerrit.ovirt.org/#/c/90906/. The CI failure isn unrelated (storage needs real env).
mpolednik
Thanks
On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon < ebenahar@redhat.com> > wrote: > > Hi Dan, > > > > In the last execution, the success rate was very low due to a large > number > > of failures on start VM caused, according to Michal, by the > > vdsm-hook-allocate_net that was installed on the host. > > > > This is the latest status here, would you like me to re-execute? > > yes, of course. but you should rebase Polednik's code on top of > *current* ovirt-4.2.3 branch. > > > If so, with > > or W/O vdsm-hook-allocate_net installed? > > There was NO reason to have that installed. Please keep it (and any > other needless code) out of the test environment. > > > > > On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg < danken@redhat.com> > wrote: > >> > >> On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek > >> <michal.skrivanek@redhat.com> wrote: > >> > Hi Elad, > >> > why did you install vdsm-hook-allocate_net? > >> > > >> > adding Dan as I think the hook is not supposed to fail this badly > in > any > >> > case > >> > >> yep, this looks bad and deserves a little bug report. Installing this > >> little hook should not block vm startup. > >> > >> But more importantly - what is the conclusion of this thread? Do we > >> have a green light from QE to take this in? 
> >> > >> > >> > > >> > Thanks, > >> > michal > >> > > >> > On 5 May 2018, at 19:22, Elad Ben Aharon <ebenahar@redhat.com> > wrote: > >> > > >> > Start VM fails on: > >> > > >> > 2018-05-05 17:53:27,399+0300 INFO (vm/e6ce66ce) [virt.vm] > >> > (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') drive 'vda' path: > >> > > >> > 'dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938- > 9a78-bdd13a843c62/images/6cdabfe5- > >> > d1ca-40af-ae63-9834f235d1c8/7ef97445-30e6-4435-8425-f35a0192 8211' > -> > >> > > >> > u'*dev=/rhev/data-center/mnt/blockSD/db5a6696-d907-4938- > 9a78-bdd13a843c62/images/6cdabfe5-d1ca-40af-ae63- > 9834f235d1c8/7ef97445-30e6-4435-8425- > >> > f35a01928211' (storagexml:334) > >> > 2018-05-05 17:53:27,888+0300 INFO (jsonrpc/1) [vdsm.api] START > >> > getSpmStatus(spUUID='940fe6f3-b0c6-4d0c-a921-198e7819c1cc', > >> > options=None) > >> > from=::ffff:10.35.161.127,53512, > >> > task_id=c70ace39-dbfe-4f5c-ae49-a1e3a82c > >> > 2758 (api:46) > >> > 2018-05-05 17:53:27,909+0300 INFO (vm/e6ce66ce) [root] > >> > /usr/libexec/vdsm/hooks/before_device_create/10_allocate_net: rc=2 > >> > err=vm > >> > net allocation hook: [unexpected error]: Traceback (most recent > call > >> > last): > >> > File "/usr/libexec/vdsm/hooks/befor e_device_create/10_allocate_ne > t", > >> > line > >> > 105, in <module> > >> > main() > >> > File "/usr/libexec/vdsm/hooks/befor e_device_create/10_allocate_ne > t", > >> > line > >> > 93, in main > >> > allocate_random_network(device_xml) > >> > File "/usr/libexec/vdsm/hooks/befor e_device_create/10_allocate_ne > t", > >> > line > >> > 62, in allocate_random_network > >> > net = _get_random_network() > >> > File "/usr/libexec/vdsm/hooks/befor e_device_create/10_allocate_ne > t", > >> > line > >> > 50, in _get_random_network > >> > available_nets = _parse_nets() > >> > File "/usr/libexec/vdsm/hooks/befor e_device_create/10_allocate_ne > t", > >> > line > >> > 46, in _parse_nets > >> > return [net for net in os.environ[AVAIL_NETS_KEY].split()] 
> >> > File "/usr/lib64/python2.7/UserDict.py", line 23, in __getitem__ > >> > raise KeyError(key) > >> > KeyError: 'equivnets' > >> > > >> > > >> > (hooks:110) > >> > 2018-05-05 17:53:27,915+0300 ERROR (vm/e6ce66ce) [virt.vm] > >> > (vmId='e6ce66ce-852f-48c5-9997-5d2959432a27') The vm start process > >> > failed > >> > (vm:943) > >> > Traceback (most recent call last): > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line > 872, > in > >> > _startUnderlyingVm > >> > self._run() > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line > 2861, > in > >> > _run > >> > domxml = hooks.before_vm_start(self._buildDomainXML(), > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line > 2254, > in > >> > _buildDomainXML > >> > dom, self.id, self._custom['custom']) > >> > File "/usr/lib/python2.7/site-packages/vdsm/virt/domxml_ > preprocess.py", > >> > line 240, in replace_device_xml_with_hooks_xml > >> > dev_custom) > >> > File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", > line > 134, > >> > in > >> > before_device_create > >> > params=customProperties) > >> > File "/usr/lib/python2.7/site-packages/vdsm/common/hooks.py", > line > 120, > >> > in > >> > _runHooksDir > >> > raise exception.HookError(err) > >> > HookError: Hook Error: ('vm net allocation hook: [unexpected > error]: > >> > Traceback (most recent call last):\n File > >> > "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_ net", > line > >> > 105, in > >> > <module>\n main()\n > >> > File "/usr/libexec/vdsm/hooks/befor e_device_create/10_allocate_ne > t", > >> > line > >> > 93, in main\n allocate_random_network(device_xml)\n File > >> > "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_ net", > line > 62, > >> > i > >> > n allocate_random_network\n net = _get_random_network()\n File > >> > "/usr/libexec/vdsm/hooks/before_device_create/10_allocate_ net", > line > 50, > >> > in > >> > _get_random_network\n available_nets = _parse_nets()\n File > 
"/us > >> > r/libexec/vdsm/hooks/before_device_create/10_allocate_net", line > 46, > in > >> > _parse_nets\n return [net for net in > >> > os.environ[AVAIL_NETS_KEY].split()]\n File > >> > "/usr/lib64/python2.7/UserDict.py", line 23, in __getit > >> > em__\n raise KeyError(key)\nKeyError: \'equivnets\'\n\n\n',) > >> > > >> > > >> > > >> > Hence, the success rate was 28% against 100% running with d/s > (d/s). > If > >> > needed, I'll compare against the latest master, but I think you get > the > >> > picture with d/s. > >> > > >> > vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 > >> > libvirt-3.9.0-14.el7_5.3.x86_64 > >> > qemu-kvm-rhev-2.10.0-21.el7_5.2.x86_64 > >> > kernel 3.10.0-862.el7.x86_64 > >> > rhel7.5 > >> > > >> > > >> > Logs attached > >> > > >> > On Sat, May 5, 2018 at 1:26 PM, Elad Ben Aharon < > ebenahar@redhat.com> > >> > wrote: > >> >> > >> >> nvm, found gluster 3.12 repo, managed to install vdsm > >> >> > >> >> On Sat, May 5, 2018 at 1:12 PM, Elad Ben Aharon < > ebenahar@redhat.com > > > >> >> wrote: > >> >>> > >> >>> No, vdsm requires it: > >> >>> > >> >>> Error: Package: vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64 > >> >>> (/vdsm-4.20.27-3.gitfee7810.el7.centos.x86_64) > >> >>> Requires: glusterfs-fuse >= 3.12 > >> >>> Installed: glusterfs-fuse-3.8.4-54.8.el7.x86_64 > (@rhv-4.2.3) > >> >>> > >> >>> Therefore, vdsm package installation is skipped upon force > install. > >> >>> > >> >>> On Sat, May 5, 2018 at 11:42 AM, Michal Skrivanek > >> >>> <michal.skrivanek@redhat.com> wrote: > >> >>>> > >> >>>> > >> >>>> > >> >>>> On 5 May 2018, at 00:38, Elad Ben Aharon < ebenahar@redhat.com> > wrote: > >> >>>> > >> >>>> Hi guys, > >> >>>> > >> >>>> The vdsm build from the patch requires glusterfs-fuse > 3.12. > This > is > >> >>>> while the latest 4.2.3-5 d/s build requires 3.8.4 > (3.4.0.59rhs-1.el7) > >> >>>> > >> >>>> > >> >>>> because it is still oVirt, not a downstream build. 
We can’t > really > do > >> >>>> downstream builds with unmerged changes:/ > >> >>>> > >> >>>> Trying to get this gluster-fuse build, so far no luck. > >> >>>> Is this requirement intentional? > >> >>>> > >> >>>> > >> >>>> it should work regardless, I guess you can force install it > without > >> >>>> the > >> >>>> dependency > >> >>>> > >> >>>> > >> >>>> On Fri, May 4, 2018 at 2:38 PM, Michal Skrivanek > >> >>>> <michal.skrivanek@redhat.com> wrote: > >> >>>>> > >> >>>>> Hi Elad, > >> >>>>> to make it easier to compare, Martin backported the change to > 4.2 > so > >> >>>>> it > >> >>>>> is actually comparable with a run without that patch. Would you > >> >>>>> please try > >> >>>>> that out? > >> >>>>> It would be best to have 4.2 upstream and this[1] run to really > >> >>>>> minimize the noise. > >> >>>>> > >> >>>>> Thanks, > >> >>>>> michal > >> >>>>> > >> >>>>> [1] > >> >>>>> > >> >>>>> http://jenkins.ovirt.org/job/vdsm_4.2_build-artifacts-on- > demand-el7-x86_64/28/ > >> >>>>> > >> >>>>> On 27 Apr 2018, at 09:23, Martin Polednik < > mpolednik@redhat.com> > >> >>>>> wrote: > >> >>>>> > >> >>>>> On 24/04/18 00:37 +0300, Elad Ben Aharon wrote: > >> >>>>> > >> >>>>> I will update with the results of the next tier1 execution on > latest > >> >>>>> 4.2.3 > >> >>>>> > >> >>>>> > >> >>>>> That isn't master but old branch though. Could you run it > against > >> >>>>> *current* VDSM master? > >> >>>>> > >> >>>>> On Mon, Apr 23, 2018 at 3:56 PM, Martin Polednik > >> >>>>> <mpolednik@redhat.com> > >> >>>>> wrote: > >> >>>>> > >> >>>>> On 23/04/18 01:23 +0300, Elad Ben Aharon wrote: > >> >>>>> > >> >>>>> Hi, I've triggered another execution [1] due to some issues I > saw > in > >> >>>>> the > >> >>>>> first which are not related to the patch. > >> >>>>> > >> >>>>> The success rate is 78% which is low comparing to tier1 > executions > >> >>>>> with > >> >>>>> code from downstream builds (95-100% success rates) [2]. 
> >> >>>>> > >> >>>>> > >> >>>>> Could you run the current master (without the dynamic_ownership > >> >>>>> patch) > >> >>>>> so that we have viable comparision? > >> >>>>> > >> >>>>> From what I could see so far, there is an issue with move and > copy > >> >>>>> > >> >>>>> operations to and from Gluster domains. For example [3]. > >> >>>>> > >> >>>>> The logs are attached. > >> >>>>> > >> >>>>> > >> >>>>> [1] > >> >>>>> *https://rhv-jenkins.rhev-ci-v ms.eng.rdu2.redhat.com/job/rhv > >> >>>>> -4.2-ge-runner-tier1-after-upgrade/7/testReport/ > >> >>>>> <https://rhv-jenkins.rhev-ci-v ms.eng.rdu2.redhat.com/job/rhv > >> >>>>> -4.2-ge-runner-tier1-after-upgrade/7/testReport/>* > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> [2] > >> >>>>> https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/ > >> >>>>> > >> >>>>> rhv-4.2-ge-runner-tier1-after-upgrade/7/ > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> [3] > >> >>>>> 2018-04-22 13:06:28,316+0300 INFO (jsonrpc/7) [vdsm.api] > FINISH > >> >>>>> deleteImage error=Image does not exist in domain: > >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33, > >> >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' > >> >>>>> from=: > >> >>>>> :ffff:10.35.161.182,40936, > >> >>>>> flow_id=disks_syncAction_ba6b2630-5976-4935, > >> >>>>> task_id=3d5f2a8a-881c-409e-93e9-aaa643c10e42 (api:51) > >> >>>>> 2018-04-22 13:06:28,317+0300 ERROR (jsonrpc/7) > >> >>>>> [storage.TaskManager.Task] > >> >>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') Unexpected error > >> >>>>> (task:875) > >> >>>>> Traceback (most recent call last): > >> >>>>> File "/usr/lib/python2.7/site-packa ges/vdsm/storage/task.py", > line > >> >>>>> 882, > >> >>>>> in > >> >>>>> _run > >> >>>>> return fn(*args, **kargs) > >> >>>>> File "<string>", line 2, in deleteImage > >> >>>>> File "/usr/lib/python2.7/site-packa ges/vdsm/common/api.py", > line > 49, > >> >>>>> in > >> >>>>> method > >> >>>>> ret = func(*args, **kwargs) > >> >>>>> File "/usr/lib/python2.7/site-packa 
ges/vdsm/storage/hsm.py", > line > >> >>>>> 1503, > >> >>>>> in > >> >>>>> deleteImage > >> >>>>> raise se.ImageDoesNotExistInSD(imgUUID, sdUUID) > >> >>>>> ImageDoesNotExistInSD: Image does not exist in domain: > >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33, > >> >>>>> domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4' > >> >>>>> > >> >>>>> 2018-04-22 13:06:28,317+0300 INFO (jsonrpc/7) > >> >>>>> [storage.TaskManager.Task] > >> >>>>> (Task='3d5f2a8a-881c-409e-93e9-aaa643c10e42') aborting: Task > is > >> >>>>> aborted: > >> >>>>> "Image does not exist in domain: 'image=cabb8846-7a4b-4244-9835 > - > >> >>>>> 5f603e682f33, domain=e5fd29c8-52ba-467e-be09-ca40ff054dd4'"
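The KeyError in the quoted traceback comes from reading os.environ[AVAIL_NETS_KEY] directly, which raises when the 'equivnets' custom property is not set for the VM. Below is a minimal sketch of a more tolerant lookup, assuming the hook may simply treat a missing property as "no networks to choose from". The function name parse_nets and its environ parameter are hypothetical and are not the actual hook code.

```python
import os

# 'equivnets' is the key named in the KeyError from the traceback.
AVAIL_NETS_KEY = 'equivnets'


def parse_nets(environ=None):
    """Return the candidate networks, or an empty list if the property is unset.

    Hypothetical helper for illustration only; `environ` is injectable so the
    function can be exercised without touching the real process environment.
    """
    if environ is None:
        environ = os.environ
    # .get() with a default avoids the KeyError raised by environ[KEY].
    return environ.get(AVAIL_NETS_KEY, '').split()
```

With something along these lines the caller could skip the allocation when the list is empty instead of failing VM startup, which is the behaviour asked for later in the thread.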
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/WGMUK5T7PDYSIBIUZE3AJIYQAWJOILTC/
 
            On 31/05/18 14:58 +0300, Elad Ben Aharon wrote:
Martin, please update, if you think the failures are not related to your patch I'll test with the master as Nir suggested.
Thanks
I believe the failures are related to the patch, and more specifically to the way libvirt handles seclabel for snapshots. Opened https://bugzilla.redhat.com/show_bug.cgi?id=1584682.
On Thu, May 31, 2018 at 1:19 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, May 31, 2018 at 1:05 PM Martin Polednik <mpolednik@redhat.com> wrote:
On 31/05/18 12:47 +0300, Elad Ben Aharon wrote:
Execution is done, 59/65 cases passed. The latest 4.2.4 execution ended with 100%, so the failures were probably caused by the changes in the patch.
Failures are mainly on preview snapshots.
Can we run the same job on the patch before Martin's patch? Maybe the issues are already in master, caused by other patches?
Execution info provided to Martin separately.
I'm currently investigating the snapshot breakage, thanks Elad!
>> code >> >> >>>>> 268 >> >> >>>>> (task:1181) >> >> >>>>> 2018-04-22 13:06:28,318+0300 ERROR (jsonrpc/7) >> [storage.Dispatcher] >> >> >>>>> FINISH >> >> >>>>> deleteImage error=Image does not exist in domain: >> >> >>>>> 'image=cabb8846-7a4b-4244-9835-5f603e682f33, >> >> >>>>> domain=e5fd29c8-52ba-467e-be09 >> >> >>>>> -ca40ff054d >> >> >>>>> d4' (dispatcher:82) >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> On Thu, Apr 19, 2018 at 5:34 PM, Elad Ben Aharon >> >> >>>>> <ebenahar@redhat.com> >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> Triggered a sanity tier1 execution [1] using [2], which covers >> all >> >> >>>>> the >> >> >>>>> >> >> >>>>> requested areas, on iSCSI, NFS and Gluster. >> >> >>>>> I'll update with the results. >> >> >>>>> >> >> >>>>> [1] >> >> >>>>> https://rhv-jenkins.rhev-ci-vm s.eng.rdu2.redhat.com/view/4.2 >> >> >>>>> _dev/job/rhv-4.2-ge-flow-storage/1161/ >> >> >>>>> >> >> >>>>> [2] >> >> >>>>> https://gerrit.ovirt.org/#/c/89830/ >> >> >>>>> vdsm-4.30.0-291.git77aef9a.el7.x86_64 >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> On Thu, Apr 19, 2018 at 3:07 PM, Martin Polednik >> >> >>>>> <mpolednik@redhat.com> >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> On 19/04/18 14:54 +0300, Elad Ben Aharon wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> Hi Martin, >> >> >>>>> >> >> >>>>> >> >> >>>>> I see [1] requires a rebase, can you please take care? >> >> >>>>> >> >> >>>>> >> >> >>>>> Should be rebased. >> >> >>>>> >> >> >>>>> At the moment, our automation is stable only on iSCSI, NFS, >> Gluster >> >> >>>>> and >> >> >>>>> >> >> >>>>> FC. >> >> >>>>> Ceph is not supported and Cinder will be stabilized soon, >> AFAIR, >> >> >>>>> it's >> >> >>>>> not >> >> >>>>> stable enough at the moment. >> >> >>>>> >> >> >>>>> >> >> >>>>> That is still pretty good. 
>> >> >>>>> >> >> >>>>> >> >> >>>>> [1] https://gerrit.ovirt.org/#/c/89830/ >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> Thanks >> >> >>>>> >> >> >>>>> On Wed, Apr 18, 2018 at 2:17 PM, Martin Polednik >> >> >>>>> <mpolednik@redhat.com >> >> >>>>> > >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> On 18/04/18 11:37 +0300, Elad Ben Aharon wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> Hi, sorry if I misunderstood, I waited for more input regarding >> what >> >> >>>>> >> >> >>>>> areas >> >> >>>>> have to be tested here. >> >> >>>>> >> >> >>>>> >> >> >>>>> I'd say that you have quite a bit of freedom in this regard. >> >> >>>>> >> >> >>>>> GlusterFS >> >> >>>>> should be covered by Dennis, so iSCSI/NFS/ceph/cinder with some >> >> >>>>> suite >> >> >>>>> that covers basic operations (start & stop VM, migrate it), >> >> >>>>> snapshots >> >> >>>>> and merging them, and whatever else would be important for >> storage >> >> >>>>> sanity. >> >> >>>>> >> >> >>>>> mpolednik >> >> >>>>> >> >> >>>>> >> >> >>>>> On Wed, Apr 18, 2018 at 11:16 AM, Martin Polednik < >> >> >>>>> mpolednik@redhat.com >> >> >>>>> > >> >> >>>>> >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> On 11/04/18 16:52 +0300, Elad Ben Aharon wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> We can test this on iSCSI, NFS and GlusterFS. As for ceph and >> >> >>>>> cinder, >> >> >>>>> >> >> >>>>> will >> >> >>>>> >> >> >>>>> have to check, since usually, we don't execute our automation >> on >> >> >>>>> them. >> >> >>>>> >> >> >>>>> >> >> >>>>> Any update on this? I believe the gluster tests were >> successful, >> >> >>>>> OST >> >> >>>>> >> >> >>>>> passes fine and unit tests pass fine, that makes the storage >> >> >>>>> backends >> >> >>>>> test the last required piece. 
>> >> >>>>> >> >> >>>>> >> >> >>>>> On Wed, Apr 11, 2018 at 4:38 PM, Raz Tamir < ratamir@redhat.com >> > >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> +Elad >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> On Wed, Apr 11, 2018 at 4:28 PM, Dan Kenigsberg < >> danken@redhat.com >> >> >>>>> >> >> >>>>> > >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> On Wed, Apr 11, 2018 at 12:34 PM, Nir Soffer < >> nsoffer@redhat.com> >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> On Wed, Apr 11, 2018 at 12:31 PM Eyal Edri < eedri@redhat.com> >> >> >>>>> >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> Please make sure to run as much OST suites on this patch as >> >> >>>>> >> >> >>>>> possible >> >> >>>>> >> >> >>>>> before merging ( using 'ci please build' ) >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> But note that OST is not a way to verify the patch. >> >> >>>>> >> >> >>>>> >> >> >>>>> Such changes require testing with all storage types we support. >> >> >>>>> >> >> >>>>> Nir >> >> >>>>> >> >> >>>>> On Tue, Apr 10, 2018 at 4:09 PM, Martin Polednik < >> >> >>>>> mpolednik@redhat.com >> >> >>>>> > >> >> >>>>> >> >> >>>>> wrote: >> >> >>>>> >> >> >>>>> >> >> >>>>> Hey, >> >> >>>>> >> >> >>>>> >> >> >>>>> I've created a patch[0] that is finally able to activate >> >> >>>>> >> >> >>>>> libvirt's >> >> >>>>> dynamic_ownership for VDSM while not negatively affecting >> >> >>>>> functionality of our storage code. >> >> >>>>> >> >> >>>>> That of course comes with quite a bit of code removal, mostly >> >> >>>>> in >> >> >>>>> the >> >> >>>>> area of host devices, hwrng and anything that touches devices; >> >> >>>>> bunch >> >> >>>>> of test changes and one XML generation caveat (storage is >> >> >>>>> handled >> >> >>>>> by >> >> >>>>> VDSM, therefore disk relabelling needs to be disabled on the >> >> >>>>> VDSM >> >> >>>>> level). 
>> >> >>>>> >> >> >>>>> Because of the scope of the patch, I welcome >> >> >>>>> storage/virt/network >> >> >>>>> people to review the code and consider the implication this >> >> >>>>> change >> >> >>>>> has >> >> >>>>> on current/future features. >> >> >>>>> >> >> >>>>> [0] https://gerrit.ovirt.org/#/c/89830/ >> >> >>>>> >> >> >>>>> >> >> >>>>> In particular: dynamic_ownership was set to 0 prehistorically >> >> >>>>> (as >> >> >>>>> >> >> >>>>> >> >> >>>>> part >> >> >>>>> >> >> >>>>> >> >> >>>>> of https://bugzilla.redhat.com/show_bug.cgi?id=554961 ) >> because >> >> >>>>> >> >> >>>>> libvirt, >> >> >>>>> running as root, was not able to play properly with root-squash >> >> >>>>> nfs >> >> >>>>> mounts. >> >> >>>>> >> >> >>>>> Have you attempted this use case? >> >> >>>>> >> >> >>>>> I join to Nir's request to run this with storage QE. >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> -- >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> Raz Tamir >> >> >>>>> Manager, RHV QE >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> _______________________________________________ >> >> >>>>> Devel mailing list >> >> >>>>> Devel@ovirt.org >> >> >>>>> http://lists.ovirt.org/mailman/listinfo/devel >> >> >>>>> >> >> >>>>> >> >> >>>>> >> >> >>>> >> >> >>>> >> >> >>> >> >> >> >> >> > >> >> > <logs.tar.gz> >> >> > >> >> > >> > >> > >> >>
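The original announcement quoted above notes that, with dynamic_ownership enabled, disk relabelling has to be switched off because VDSM manages storage itself. libvirt's domain XML allows a per-disk seclabel override under the disk's source element. The sketch below shows one way such an element could be added, assuming the DAC model and relabel='no'. The helper name and the sample XML are illustrative, and the exact XML the patch generates is not shown in this thread.

```python
import xml.etree.ElementTree as ET


def disable_dac_relabel(disk_xml):
    """Add <seclabel model='dac' relabel='no'/> under a disk's <source>.

    Hypothetical helper: it only illustrates the per-device override and
    leaves the XML unchanged if there is no <source> or a seclabel exists.
    """
    disk = ET.fromstring(disk_xml)
    source = disk.find('source')
    if source is not None and source.find('seclabel') is None:
        ET.SubElement(source, 'seclabel', {'model': 'dac', 'relabel': 'no'})
    return ET.tostring(disk, encoding='unicode')


# Illustrative input; the path is made up for the example.
SAMPLE_DISK = (
    "<disk type='block' device='disk'>"
    "<source dev='/dev/example/volume'/>"
    "<target dev='vda' bus='virtio'/>"
    "</disk>"
)
```

Calling disable_dac_relabel(SAMPLE_DISK) returns the same disk definition with the seclabel child added, so libvirt would leave ownership of that path alone while still managing other devices dynamically.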
On Wed, May 30, 2018 at 5:44 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
Triggered a sanity automation execution using [1], which covers all the requested areas, on iSCSI, NFS and Gluster. I'll update with the results.
[1] https://gerrit.ovirt.org/#/c/90906/
vdsm-4.20.28-6.gitc23aef6.el7.x86_64
On Tue, May 29, 2018 at 4:26 PM, Martin Polednik <mpolednik@redhat.com> wrote:
On 29/05/18 15:30 +0300, Elad Ben Aharon wrote:
> Hi Martin,
>
> Can you please create a cherry-pick patch that is based on 4.2?

See https://gerrit.ovirt.org/#/c/90906/. The CI failure is unrelated (storage needs real env).
mpolednik
> Thanks
>
> On Tue, May 29, 2018 at 1:34 PM, Dan Kenigsberg <danken@redhat.com> wrote:
>
>> On Tue, May 29, 2018 at 1:21 PM, Elad Ben Aharon <ebenahar@redhat.com> wrote:
>> > Hi Dan,
>> >
>> > In the last execution, the success rate was very low due to a large number
>> > of failures on start VM caused, according to Michal, by the
>> > vdsm-hook-allocate_net that was installed on the host.
>> >
>> > This is the latest status here, would you like me to re-execute?
>>
>> yes, of course. but you should rebase Polednik's code on top of
>> *current* ovirt-4.2.3 branch.
>>
>> > If so, with
>> > or W/O vdsm-hook-allocate_net installed?
>>
>> There was NO reason to have that installed. Please keep it (and any
>> other needless code) out of the test environment.
>>
>> > On Tue, May 29, 2018 at 1:14 PM, Dan Kenigsberg <danken@redhat.com> wrote:
>> >>
>> >> On Mon, May 7, 2018 at 3:53 PM, Michal Skrivanek
>> >> <michal.skrivanek@redhat.com> wrote:
>> >> > Hi Elad,
>> >> > why did you install vdsm-hook-allocate_net?
>> >> >
>> >> > adding Dan as I think the hook is not supposed to fail this badly in any
>> >> > case
>> >>
>> >> yep, this looks bad and deserves a little bug report. Installing this
>> >> little hook should not block vm startup.
>> >>
>> >> But more importantly - what is the conclusion of this thread? Do we
>> >> have a green light from QE to take this in?
participants (5)
- Dan Kenigsberg
- Elad Ben Aharon
- Martin Polednik
- Michal Skrivanek
- Nir Soffer