[ovirt-devel] System tests failure (wrong test code?)

Arik Hadas ahadas at redhat.com
Sun Mar 19 12:41:01 UTC 2017


On Sun, Mar 19, 2017 at 2:09 PM, Yaniv Kaul <ykaul at redhat.com> wrote:

>
>
> On Sun, Mar 19, 2017 at 12:14 PM, Arik Hadas <ahadas at redhat.com> wrote:
>
>>
>>
>> On Sat, Mar 18, 2017 at 12:57 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>
>>>
>>>
>>> On Sat, Mar 18, 2017 at 12:05 AM, Michal Skrivanek <mskrivan at redhat.com>
>>> wrote:
>>>
>>>>
>>>> On 17 Mar 2017, at 20:49, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> On Thu, Mar 16, 2017 at 8:16 PM, Michal Skrivanek <mskrivan at redhat.com>
>>>> wrote:
>>>>
>>>>>
>>>>> On 16 Mar 2017, at 15:18, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 16, 2017 at 10:46 AM, Nir Soffer <nsoffer at redhat.com> wr
>>>>> ote:
>>>>>
>>>>>> If found this error in system tests - looks like wrong assert - code
>>>>>> should check
>>>>>> if disk is not None before checking state.
>>>>>>
>>>>>> I'm not sure who is the owner of this test, so posting here.
>>>>>>
>>>>>
>>>>> Theoretically, perhaps.
>>>>> Practically, it worked (until yesterday?) and now I'm also seeing this
>>>>> failure - it's not a coincidence.
>>>>>
>>>>> However, looking at my failure[1], I'm seeing other nasty stuff, which
>>>>> may explain the later on failures
>>>>>
>>>>> For example, new NPE I have not seen in the past:
>>>>> 2017-03-16 09:57:57,581-04 INFO  [org.ovirt.engine.core.bll.AddVmTemplateCommand]
>>>>> (DefaultQuartzScheduler1) [5d94233] Ending command
>>>>> 'org.ovirt.engine.core.bll.AddVmTemplateCommand' successfully.
>>>>>
>>>>>
>>>>> Is this IST or which TZ?
>>>>>
>>>>
>>>> Interesting question - I guess that's why we need the offset from UTC
>>>> or use UTC in the logs.
>>>>
>>>>
>>>>> Likely https://gerrit.ovirt.org/#/c/69323 merged 10:33:20 CET today
>>>>>
>>>>
>>>> That's the cause or the fix?
>>>>
>>>>
>>>> oh, sorry, the cause.
>>>> we were not able to find anyone to confirm and resolve it today
>>>> unfortunately
>>>>
>>>
>> A fix is posted: https://gerrit.ovirt.org/#/c/74276/
>> The cause is explained in the commit message.
>>
>
Merged.


>
> Thanks.
>
>
>> In our defense I'll say that somehow the system tests passed when we ran
>> them ([1]) and that the command for adding a template became way too
>> complicated than it is supposed to be with the introduction of import from
>> glance and instance-types.
>>
>
> Do we have intentions to simplify it?
>

Well, we had such intentions but these things are always the first
candidates to postpone. We should.


>
>
>>
>> [1] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/129/
>>
>>
>>>
>>> A secondary bug is that it fails every 10 seconds or so, continuously.
>>> Y.
>>>
>>
>> Right, because it fails in the end-action phase of the command and we
>> have a retry mechanism that attempts to end the command (unless stated
>> otherwise) periodically.
>>
>
> But the end user experience is an event every few seconds (10?) that we
> fail an operation.
> When does it give up?
>

For the most part, chances are extremely low for a command to fail in its
end-action phase. Generally, commands either fail in their validation phase
or execution phase or when their async tasks/jobs fail. In those case, we
don't have/need a mechanism for periodic retry. On the other hand, the
end-action phase is relatively short and simple and is really not expected
to fail (unless there is a bug, like in this case), but if it does fail -
we have to retry and the user will probably experience more acute issues
than a flood of audit messages (e.g., inability to connect to the
database). AFAIK there is no limit to the number of retry attempts in this
case.


> Y.
>
>
>>
>>
>>>
>>>
>>>
>>>>
>>>> Y.
>>>>
>>>>
>>>>>
>>>>> 2017-03-16 09:57:57,591-04 INFO  [org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand]
>>>>> (DefaultQuartzScheduler1) [5d94233] START, SetVmStatusVDSCommand(
>>>>> SetVmStatusVDSCommandParameters:{runAsync='true',
>>>>> vmId='00000000-0000-0000-0000-000000000000', status='Down',
>>>>> exitStatus='Normal'}), log id: 30ee3299
>>>>> 2017-03-16 09:57:57,593-04 DEBUG [org.ovirt.engine.core.dal.dbb
>>>>> roker.PostgresDbEngineDialect$PostgresSimpleJdbcCall]
>>>>> (DefaultQuartzScheduler1) [5d94233] Compiled stored procedure. Call string
>>>>> is [{call getvmdynamicbyvmguid(?)}]
>>>>> 2017-03-16 09:57:57,594-04 DEBUG [org.ovirt.engine.core.dal.dbb
>>>>> roker.PostgresDbEngineDialect$PostgresSimpleJdbcCall]
>>>>> (DefaultQuartzScheduler1) [5d94233] SqlCall for procedure
>>>>> [GetVmDynamicByVmGuid] compiled
>>>>> 2017-03-16 09:57:57,595-04 ERROR [org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand]
>>>>> (DefaultQuartzScheduler1) [5d94233] Command 'SetVmStatusVDSCommand(
>>>>> SetVmStatusVDSCommandParameters:{runAsync='true',
>>>>> vmId='00000000-0000-0000-0000-000000000000', status='Down',
>>>>> exitStatus='Normal'})' execution failed: null
>>>>> 2017-03-16 09:57:57,595-04 DEBUG [org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand]
>>>>> (DefaultQuartzScheduler1) [5d94233] Exception:
>>>>> java.lang.NullPointerException
>>>>> at org.ovirt.engine.core.vdsbroker.SetVmStatusVDSCommand.execut
>>>>> eVDSCommand(SetVmStatusVDSCommand.java:33) [vdsbroker.jar:]
>>>>> at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:73)
>>>>> [vdsbroker.jar:]
>>>>> at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
>>>>> [dal.jar:]
>>>>> at org.ovirt.engine.core.vdsbroker.vdsbroker.DefaultVdsCommandE
>>>>> xecutor.execute(DefaultVdsCommandExecutor.java:14) [vdsbroker.jar:]
>>>>> at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:404)
>>>>> [vdsbroker.jar:]
>>>>> at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsComman
>>>>> d(VDSBrokerFrontendImpl.java:33) [bll.jar:]
>>>>> at org.ovirt.engine.core.bll.VmHandler.unLockVm(VmHandler.java:377)
>>>>> [bll.jar:]
>>>>>
>>>>>
>>>>>
>>>>> Y.
>>>>>
>>>>> [1]  http://jenkins.ovirt.org/job/ovirt-system-tests_master_chec
>>>>> k-patch-el7-x86_64/326/artifact/exported-artifacts/basic_sui
>>>>> te_master__logs/test_logs/basic-suite-master/post-004_basic_
>>>>> sanity.py/lago-basic-suite-master-engine/_var_log/ovirt-engi
>>>>> ne/engine.log
>>>>>
>>>>>
>>>>>> *08:28:05*   # snapshots_merge: *08:28:31* Unhandled exception in <function <lambda> at 0x276a5f0>*08:28:31* Traceback (most recent call last):*08:28:31*   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 217, in assert_equals_within*08:28:31*     res = func()*08:28:31*   File "/home/jenkins/workspace/ovirt-system-tests_manual/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 466, in <lambda>*08:28:31*     api.vms.get(VM0_NAME).disks.get(disk_name).status.state == 'ok'*08:28:31* AttributeError: 'NoneType' object has no attribute 'state'*08:28:31* Error while running thread*08:28:31* Traceback (most recent call last):*08:28:31*   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 57, in _ret_via_queue*08:28:31*     queue.put({'return': func()})*08:28:31*   File "/home/jenkins/workspace/ovirt-system-tests_manual/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 465, in snapshot_live_merge*08:28:31*     lambda:*08:28:31*   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 264, in assert_true_within_long*08:28:31*     assert_equals_within_long(func, True, allowed_exceptions)*08:28:31*   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 251, in assert_equals_within_long*08:28:31*     func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions*08:28:31*   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 217, in assert_equals_within*08:28:31*     res = func()*08:28:31*   File "/home/jenkins/workspace/ovirt-system-tests_manual/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 466, in <lambda>*08:28:31*     api.vms.get(VM0_NAME).disks.get(disk_name).status.state == 'ok'*08:28:31* AttributeError: 'NoneType' object has no attribute 'state'
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Devel mailing list
>>>>>> Devel at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Devel mailing list
>>>>> Devel at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Devel mailing list
>>> Devel at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/devel/attachments/20170319/c9080453/attachment-0001.html>


More information about the Devel mailing list