[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 01/08/2017 ] [add_secondary_storage_domains]

Yaniv Kaul ykaul at redhat.com
Mon Aug 14 13:03:25 UTC 2017


On Mon, Aug 14, 2017 at 3:31 PM, Marc Young <3vilpenguin at gmail.com> wrote:

> Thanks for clarifying, that makes sense in hindsight since it's testing
> things in their entirety. This is more of a rabbit hole for me as I'm really
> just trying to learn Lago by using the OST project (since I'll be
> duplicating a lot of the setup).
>
> stdout logs from a run: https://pastebin.com/KBDaCCYp
>

Can you send the engine log so we can understand why the host installation
failed?
You should have all the logs you need under
/home/myoung/repos/github/ovirt-system-tests/test_logs/basic-suite-4.1/post-002_bootstrap.py/lago_logs
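
If the collected logs are large, a quick filter can help narrow things down
before sending them. The following is only a rough sketch, not part of Lago
or OST: the function name find_error_lines is made up for illustration, and
it assumes the engine log was collected under the name engine.log somewhere
below the lago_logs directory and that the interesting lines contain the
string ERROR.

```python
import os


def find_error_lines(log_root, filename="engine.log", needle="ERROR"):
    """Yield (path, line_number, line) for matching lines under log_root.

    Hypothetical helper: the file name and the search string are
    assumptions, adjust them to whatever the collected logs contain.
    """
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in sorted(filenames):
            if name != filename:
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on odd bytes in a log.
            with open(path, errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, number, line.rstrip("\n")
```

Calling list(find_error_lines("<path to lago_logs>")) returns the matching
lines with their file and line number, which is usually enough to spot the
first failure around the add_hosts step.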

Y.


> On Mon, Aug 14, 2017 at 6:44 AM, Eyal Edri <eedri at redhat.com> wrote:
>
>>
>>
>> On Mon, Aug 14, 2017 at 2:31 PM, Marc Young <3vilpenguin at gmail.com>
>> wrote:
>>
>>> I'll try to get some detailed log files later, but fwiw I'm not running
>>> the hc suites (afaik, I'm still getting inundated with the system
>>> tests/lago). The link I used for Jenkins was just to try to pull the latest
>>> 'passing' hash for ovirt-system-tests
>>>
>>
>> I see. It's a bit more complicated than that, I'm afraid: the hash you see
>> is probably only for the OST code itself, while the suites can fail on any
>> oVirt project (ovirt-engine, vdsm, host-deploy, etc.). So a passing run is
>> not just a single hash, but rather a list of RPMs and their versions,
>> together with the hash of OST (for the tests themselves) and also OS updates.
>>
>>
>>>
>>> The errors have all been from master branch on the basic suite for 4.1
>>> via:
>>>
>>> $ ./run_suite.sh basic-suite-4.1/
>>>
>>>
>> OK, then we need to investigate it and understand why add host fails on
>> 4.1. Please provide logs when possible.
>>
>> BTW, you can also try running the manual job [1], but you'll need to
>> provide it with a custom yum repo URL with your built artifacts to test
>> your code.
>> There is a section on it on the OST readthedocs page.
>>
>> [1] http://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/
>>
>>
>>>
>>> On Mon, Aug 14, 2017 at 12:29 AM, Eyal Edri <eedri at redhat.com> wrote:
>>>
>>>> I see you run the 'hc' suite, which means hyperconverged. This suite
>>>> runs a hosted engine on gluster storage. It's a more complex suite than
>>>> the basic one and more prone to errors. It should still work, but
>>>> if you don't require it specifically, I would recommend running the basic
>>>> suite, which should be easier to debug and also quicker to run.
>>>>
>>>> I'm also adding the hc maintainer so she can check. Can you share the
>>>> link to the Jenkins job you're running? Or the log files if you're
>>>> running locally.
>>>>
>>>> On Aug 14, 2017 06:00, "Marc Young" <3vilpenguin at gmail.com> wrote:
>>>>
>>>>> Actually I spoke too soon, still fails:
>>>>>
>>>>> + lago ovirt runtest /home/myoung/repos/github/ovirt-system-tests/vagrant/test-scenarios/002_bootstrap.py
>>>>> @ Run test: 002_bootstrap.py:
>>>>> nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
>>>>>   # print_api_ver:
>>>>>   # print_api_ver: Success (in 0:00:00)
>>>>>   # add_dc:
>>>>>   # add_dc: Success (in 0:00:43)
>>>>>   # add_cluster:
>>>>>   # add_cluster: Success (in 0:00:03)
>>>>>   # add_hosts:
>>>>> dd
>>>>>     * Collect artifacts:
>>>>>     * Collect artifacts: Success (in 0:01:14)
>>>>>   # add_hosts: Success (in 0:16:36)
>>>>>   # Results located at /home/myoung/repos/github/ovirt-system-tests/deployment-vagrant/default/002_bootstrap.py.junit.xml
>>>>> @ Run test: 002_bootstrap.py: Success (in 0:17:26)
>>>>> Error occured, aborting
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 360, in do_run
>>>>>     self.cli_plugins[args.ovirtverb].do_run(args)
>>>>>   File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
>>>>>     self._do_run(**vars(args))
>>>>>   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 501, in wrapper
>>>>>     return func(*args, **kwargs)
>>>>>   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 512, in wrapper
>>>>>     return func(*args, prefix=prefix, **kwargs)
>>>>>   File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 99, in do_ovirt_runtest
>>>>>     raise RuntimeError('Some tests failed')
>>>>> RuntimeError: Some tests failed
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 13, 2017 at 9:47 PM, Marc Young <3vilpenguin at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Edit: reply-all
>>>>>>
>>>>>> It does, every time.
>>>>>> I got it to pass by using the last passing revision from Jenkins
>>>>>> (hash 98ae6d0b452d098f2703a197deb082a091bba837 ), noted from
>>>>>> http://jenkins.ovirt.org/job/system-tests_hc-basic-suite-master/15/consoleFull
>>>>>>
>>>>>> Not sure it's a true race condition: that build in Jenkins has
>>>>>> failed consistently since #15
>>>>>>
>>>>>> On Sun, Aug 13, 2017 at 2:03 AM, Eyal Edri <eedri at redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Aug 11, 2017 at 9:34 PM, Marc Young <3vilpenguin at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> What's the fix for this for those of us using ovirt-system-tests?
>>>>>>>>
>>>>>>>> I'm trying to adapt some of the code for testing a third party tool,
>>>>>>>> but master is still failing on 002_bootstrap
>>>>>>>>
>>>>>>>
>>>>>>> It fails consistently?
>>>>>>> AFAIK this is a race condition that happens maybe once a week. Can
>>>>>>> you share your logs? Is it also failing on add_secondary_storage?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Aug 1, 2017 at 9:20 AM, Benny Zlotnik <bzlotnik at redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'm not sure it's related since the LSM test runs much later, in
>>>>>>>>> 004
>>>>>>>>>
>>>>>>>>> On Tue, Aug 1, 2017 at 3:33 PM, Eyal Edri <eedri at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Adding Allon & Benny.
>>>>>>>>>>
>>>>>>>>>> Is it possible to verify this is related to the LSM issue we've
>>>>>>>>>> been handling in [1]?
>>>>>>>>>> If this is the case, we agreed to disable the test next time it
>>>>>>>>>> fails, as the current workaround with sleep isn't enough.
>>>>>>>>>>
>>>>>>>>>> Can you confirm this is the case, and so we'll have to disable
>>>>>>>>>> this test until one of the RFEs described in [1] is merged?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] https://gerrit.ovirt.org/#/c/78613/
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 1, 2017 at 2:45 PM, Barak Korren <bkorren at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 1 August 2017 at 14:39, Nir Soffer <nsoffer at redhat.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> > On Tue, Aug 1, 2017 at 2:34 PM Barak Korren <bkorren at redhat.com> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> Test failed: [ 002_bootstrap.add_secondary_storage_domains ]
>>>>>>>>>>> >>
>>>>>>>>>>> >> Link to suspected patches:
>>>>>>>>>>> >> https://gerrit.ovirt.org/#/c/79974
>>>>>>>>>>> >
>>>>>>>>>>> >
>>>>>>>>>>> > This patch adds missing log when resizing an online disk - why do you think
>>>>>>>>>>> > it is related to the failure?
>>>>>>>>>>>
>>>>>>>>>>> Because it is the only patch participating in the test.
>>>>>>>>>>> (The test was equivalent to running the manual job with just
>>>>>>>>>>> this patch)
>>>>>>>>>>>
>>>>>>>>>>> Then again this may also be one of the usual SD testing race
>>>>>>>>>>> conditions.
>>>>>>>>>>> Since the code in the patch seemed to be related to storage I didn't
>>>>>>>>>>> want to just assume that.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Barak Korren
>>>>>>>>>>> RHV DevOps team , RHCE, RHCi
>>>>>>>>>>> Red Hat EMEA
>>>>>>>>>>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Devel mailing list
>>>>>>>>>>> Devel at ovirt.org
>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Eyal edri
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ASSOCIATE MANAGER
>>>>>>>>>>
>>>>>>>>>> RHV DevOps
>>>>>>>>>>
>>>>>>>>>> EMEA VIRTUALIZATION R&D
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Red Hat EMEA <https://www.redhat.com/>
>>>>>>>>>> <https://red.ht/sig> TRIED. TESTED. TRUSTED.
>>>>>>>>>> <https://redhat.com/trusted>
>>>>>>>>>> phone: +972-9-7692018 <+972%209-769-2018>
>>>>>>>>>> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>
>>
>>
>
>
>