Fix (cf502720b4cf7bc09694ee0821f3e7f73d6e6e0f) to the following exception just merged:

Caused by: java.lang.NullPointerException
	at java.util.Objects.requireNonNull(Objects.java:203) [rt.jar:1.8.0_151]
	at java.util.Optional.<init>(Optional.java:96) [rt.jar:1.8.0_151]
	at java.util.Optional.of(Optional.java:108) [rt.jar:1.8.0_151]
	at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:193) [rt.jar:1.8.0_151]
	at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:190) [rt.jar:1.8.0_151]
	at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152) [rt.jar:1.8.0_151]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) [rt.jar:1.8.0_151]
	at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464) [rt.jar:1.8.0_151]
	at org.ovirt.engine.core.bll.storage.disk.image.BaseImagesCommand.setQcowCompatByQemuImageInfo(BaseImagesCommand.java:432) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.BaseImagesCommand.endSuccessfully(BaseImagesCommand.java:393) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.internalEndSuccessfully(CommandBase.java:751) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.endActionInTransactionScope(CommandBase.java:709) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1989) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInRequired(TransactionSupport.java:139) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:105) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.endAction(CommandBase.java:574) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.endAction(Backend.java:536) [bll.jar:]

On Thu, Nov 23, 2017 at 4:44 PM, Allon Mureinik <amureini@redhat.com> wrote:

---------- Forwarded message ----------
From: Dafna Ron <dron@redhat.com>
Date: Thu, Nov 23, 2017 at 1:32 PM
Subject: Re: [ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 22-11-2017 ] [ 004_basic_sanity.verify_add_vm_template ]
To: Yaniv Kaul <ykaul@redhat.com>, Nir Soffer <nsoffer@redhat.com>
Cc: aos-scalability <aos-scalability@redhat.com>, devel <devel@ovirt.org>, infra <infra@ovirt.org>


I opened a bug on the error: https://bugzilla.redhat.com/show_bug.cgi?id=1516780

This test seems to be catching something since we are failing on it randomly so I will review the logs today to see if I can find the true cause of failure and update my findings.

On 11/22/2017 07:05 PM, Yaniv Kaul wrote:


On Wed, Nov 22, 2017 at 7:55 PM, Nir Soffer <nsoffer@redhat.com> wrote:


On Wed, Nov 22, 2017 at 7:15 PM Dafna Ron <dron@redhat.com> wrote:

Hi,

we have a failure in 004_basic_sanity.verify_add_vm_template.

The error seems to be a failure from the api request since although I am seeing errors in the logs I am not sure they are the cause.


Link and headline ofto suspected patches:

core: Cleanup BaseImagesCommand code - https://gerrit.ovirt.org/#/c/83812/


Link to Job:

http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3989


Link to all logs:

http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3989/artifact




2017-11-22 10:40:14,334-05 ERROR [org.jboss.as.ejb3.invocation] (EE-ManagedThreadFactory-engineScheduled-Thread-29) WFLYEJB0034: EJB Invocation failed on component Backend for method public abstract org.ovirt.engine.core.common.action.ActionReturnValue org.ovirt.engine.core.bll.interfaces.BackendInternal.endAction(org.ovirt.engine.core.common.action.ActionType,org.ovirt.engine.core.common.action.ActionParametersBase,org.ovirt.engine.core.bll.context.CommandContext): javax.ejb.EJBTransactionRolledbackException
at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleInCallerTx(CMTTxInterceptor.java:160) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:257) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:381) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:244) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)

...
Caused by: java.lang.NullPointerException
	at java.util.Objects.requireNonNull(Objects.java:203) [rt.jar:1.8.0_151]
	at java.util.Optional.<init>(Optional.java:96) [rt.jar:1.8.0_151]
	at java.util.Optional.of(Optional.java:108) [rt.jar:1.8.0_151]
	at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:193) [rt.jar:1.8.0_151]
	at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:190) [rt.jar:1.8.0_151]
	at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152) [rt.jar:1.8.0_151]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) [rt.jar:1.8.0_151]
	at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464) [rt.jar:1.8.0_151]
	at org.ovirt.engine.core.bll.storage.disk.image.BaseImagesCommand.setQcowCompatByQemuImageInfo(BaseImagesCommand.java:432) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.disk.image.BaseImagesCommand.endSuccessfully(BaseImagesCommand.java:393) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.internalEndSuccessfully(CommandBase.java:751) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.endActionInTransactionScope(CommandBase.java:709) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1989) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInRequired(TransactionSupport.java:139) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:105) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.endAction(CommandBase.java:574) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.endAction(Backend.java:536) [bll.jar:]

              

              
Y.

            


(Relevant) error snippet from the log:

<error>

lago log:


 


 File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
    testMethod()
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 129, in wrapped_test
    test()
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 59, in wrapper
    return func(get_test_prefix(), *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 68, in wrapper
    return func(prefix.virt_env.engine_vm().get_api(), *args, **kwargs)
  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 500, in verify_add_vm_template
    lambda: api.vms.get(VM1_NAME).status.state == 'down',
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 271, in assert_true_within_long
    assert_equals_within_long(func, True, allowed_exceptions)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 258, in assert_equals_within_long
    func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 219, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 500, in <lambda>
    lambda: api.vms.get(VM1_NAME).status.state == 'down',
'\'NoneType\' object has no attribute \'status\'\n-------------------- >> begin captured logging << --------------------\novirtlago.testlib: ERROR:     * Unhandled exception in <function <lambda> at 0x4bafb90>\nTraceback (most recent call last):\n  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 219, in assert_equals_within\n    res = func()\n  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 500, in <lambda>\n    lambda: api.vms.get(VM1_NAME).status.state == \'down\',\nAttributeError: \'NoneType\' object has no attribute \'status\'\n--------------------- >> end captured logging << ---------------------'

This is a test error - code like this is always wrong:

    api.vms.get(VM1_NAME).status.state == something

In this case, api.vms.get() returned None. The test code
should handle None and fail with meaningfull error, or maybe
retry the call.

It doesn't matter - if api.vms.get(VM1_NAME) return None, we'll fail the test. So we can add an assert here, but still fail in exactly the same manner. 
Gracefully with 'VM1_NAME was not found' ? Perhaps.
 
 

engine log:

{"jsonrpc": "2.0", "id": "32bfeb29-e70f-48b2-84c7-91cf22ea408c", "result": {"status": "OK", "domain": "2d776fb7-eba8-4ee2-97b6-4350c28ddb39", "voltype": "LEAF", "description": "", "parent": "a1192fcc-4660-41a7-a1a5-a221808f603b", "format": "COW", "generation": 0, "image": "dc033235-ab4b-4ef0-be79-6e8a22bfb0bc", "ctime": "1511365200", "disktype": "DATA", "legality": "LEGAL", "mtime": "0", "apparentsize": "1073741824", "children": [], "pool": "", "capacity": "41126400", "uuid": "6a354b93-86af-41fb-a68b-10897ae3562f", "truesize": "1073741824", "type": "SPARSE", "lease": {"owners": [], "version": null}}}�
2017-11-22 10:40:39,454-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) [] Message received: {"jsonrpc": "2.0", "id": "32bfeb29-e70f-48b2-84c7-91cf22ea408c", "result": {"status": "OK", "domain": "2d776fb7-eba8-4ee2-97b6-4350c28ddb39", "voltype": "LEAF", "description": "", "parent": "a1192fcc-4660-41a7-a1a5-a221808f603b", "format": "COW", "generation": 0, "image": "dc033235-ab4b-4ef0-be79-6e8a22bfb0bc", "ctime": "1511365200", "disktype": "DATA", "legality": "LEGAL", "mtime": "0", "apparentsize": "1073741824", "children": [], "pool": "", "capacity": "41126400", "uuid": "6a354b93-86af-41fb-a68b-10897ae3562f", "truesize": "1073741824", "type": "SPARSE", "lease": {"owners": [], "version": null}}}
2017-11-22 10:40:39,454-05 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] FINISH, GetVolumeInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@6903fdd6, log id: 85b379d
2017-11-22 10:40:39,455-05 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.GetImageInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] FINISH, GetImageInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@6903fdd6, log id: 75d29166
2017-11-22 10:40:39,455-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] method: runVdsCommand, params: [GetImageInfo, GetImageInfoVDSCommandParameters:{storagePoolId='0762073b-b0ab-4ee2-8af9-dafa930a5fd4', ignoreFailoverLimit='false', storageDomainId='2d776fb7-eba8-4ee2-97b6-4350c28ddb39', imageGroupId='dc033235-ab4b-4ef0-be79-6e8a22bfb0bc', imageId='6a354b93-86af-41fb-a68b-10897ae3562f'}], timeElapsed: 165ms
2017-11-22 10:40:39,528-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] method: getVmsWithPlugInfo, params: [dc033235-ab4b-4ef0-be79-6e8a22bfb0bc], timeElapsed: 68ms
2017-11-22 10:40:39,543-05 INFO  [org.ovirt.engine.core.utils.transaction.TransactionSupport] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] transaction rolled back
2017-11-22 10:40:39,543-05 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Trying to release a shared lock for key: '93d271d5-e0af-4795-b800-55465c7b8826TEMPLATE' , but lock does not exist
2017-11-22 10:40:39,543-05 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Trying to release a shared lock for key: 'fb76119e-9283-4c8b-9185-5d79e2481bc7DISK' , but lock does not exist
2017-11-22 10:40:39,543-05 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Trying to release exclusive lock which does not exist, lock key: 'vm1VM_NAME'
2017-11-22 10:40:39,543-05 INFO  [org.ovirt.engine.core.bll.AddVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Lock freed to object 'EngineLock:{exclusiveLocks='[vm1=VM_NAME]', sharedLocks='[93d271d5-e0af-4795-b800-55465c7b8826=TEMPLATE, fb76119e-9283-4c8b-9185-5d79e2481bc7=DISK]'}'
2017-11-22 10:40:39,556-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Failed to complete VM vm1 creation.
2017-11-22 10:40:39,556-05 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Failed invoking callback end method 'onSucceeded' for command '2760a62f-e4ee-4dd2-b13a-1bb548e3e62d' with exception 'null', the callback is marked for end method retries
2017-11-22 10:40:39,935-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-35) [] method: get, params: [d25423f6-34bc-4cde-82fa-c0eedfc85386], timeElapsed: 3ms
2017-11-22 10:40:39,949-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-71) [] method: get, params: [7c9ec3eb-2838-4b06-9408-326b67869bdc], timeElapsed: 2ms
2017-11-22 10:40:40,606-05 DEBUG [org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default task-22) [] Entered SsoRestApiAuthFilter

vdsm log: 
2017-11-22 10:36:11,973-0500 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='5b6ab546-b560-41fc-9b56-de465aa55eee') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getAllTasksInfo
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2216, in getAllTasksInfo
    allTasksInfo = self._pool.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2017-11-22 10:36:11,975-0500 INFO  (jsonrpc/7) [storage.TaskManager.Task] (Task='5b6ab546-b560-41fc-9b56-de465aa55eee') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1181)
2017-11-22 10:36:11,976-0500 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH getAllTasksInfo error=Secured object is not in safe state (dispatcher:86)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
    return m(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
    raise self.error
SecureError: Secured object is not in safe state
2017-11-22 10:36:11,976-0500 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllTasksInfo failed (error 654) in 0.01 seconds (__init__:573)
2017-11-22 10:36:11,979-0500 INFO  (jsonrpc/2) [vdsm.api] START getAllTasksStatuses(spUUID=None, options=None) from=::1,52728, task_id=172b5528-cc76-426a-81d3-43143b49a878 (api:46)
2017-11-22 10:36:11,979-0500 INFO  (jsonrpc/2) [vdsm.api] FINISH getAllTasksStatuses error=Secured object is not in safe state from=::1,52728, task_id=172b5528-cc76-426a-81d3-43143b49a878 (api:50)


Not related to the real error.
2017-11-22 10:36:11,980-0500 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='172b5528-cc76-426a-81d3-43143b49a878') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getAllTasksStatuses
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2176, in getAllTasksStatuses
    allTasksStatus = self._pool.getAllTasksStatuses()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2017-11-22 10:36:11,980-0500 INFO  (jsonrpc/2) [storage.TaskManager.Task] (Task='172b5528-cc76-426a-81d3-43143b49a878') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1181)
2017-11-22 10:36:11,981-0500 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH getAllTasksStatuses error=Secured object is not in safe state (dispatcher:86)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
    return m(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
    raise self.error
SecureError: Secured object is not in safe state

The way to find the relevant errors is to find the entry point of this flow
in engine.log, and then follow this in vdsm.log using flow_id=xxx.

Ideally we would send a uuid to engine for every operation, and be
able to query the operation status using the uuid. This is the way
storage jobs are managed in vdsm side since 4.1.

If we had this every test would generate a uuids and debugging
failing test would be much faster.

Nir



</error>



_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel



_______________________________________________
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra