
On Wed, Nov 22, 2017 at 7:55 PM, Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Nov 22, 2017 at 7:15 PM Dafna Ron <dron@redhat.com> wrote:
Hi,
we have a failure in 004_basic_sanity.verify_add_vm_template.
The error seems to come from the API request itself; although I can see errors in the logs, I am not sure they are the cause.
Link and headline of suspected patches:
core: Cleanup BaseImagesCommand code - https://gerrit.ovirt.org/#/c/83812/

Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3989

Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3989/artifact
Relevant logs start here - http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3989/artifact/...

Specifically, server.log ( http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3989/artifact/... ):

2017-11-22 10:40:14,334-05 ERROR [org.jboss.as.ejb3.invocation] (EE-ManagedThreadFactory-engineScheduled-Thread-29) WFLYEJB0034: EJB Invocation failed on component Backend for method public abstract org.ovirt.engine.core.common.action.ActionReturnValue org.ovirt.engine.core.bll.interfaces.BackendInternal.endAction(org.ovirt.engine.core.common.action.ActionType,org.ovirt.engine.core.common.action.ActionParametersBase,org.ovirt.engine.core.bll.context.CommandContext): javax.ejb.EJBTransactionRolledbackException
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleInCallerTx(CMTTxInterceptor.java:160) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInCallerTx(CMTTxInterceptor.java:257) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:381) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:244) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
    at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
    at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
    ...
Caused by: java.lang.NullPointerException
    at java.util.Objects.requireNonNull(Objects.java:203) [rt.jar:1.8.0_151]
    at java.util.Optional.<init>(Optional.java:96) [rt.jar:1.8.0_151]
    at java.util.Optional.of(Optional.java:108) [rt.jar:1.8.0_151]
    at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:193) [rt.jar:1.8.0_151]
    at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:190) [rt.jar:1.8.0_151]
    at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152) [rt.jar:1.8.0_151]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) [rt.jar:1.8.0_151]
    at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464) [rt.jar:1.8.0_151]
    at org.ovirt.engine.core.bll.storage.disk.image.BaseImagesCommand.setQcowCompatByQemuImageInfo(BaseImagesCommand.java:432) [bll.jar:]
    at org.ovirt.engine.core.bll.storage.disk.image.BaseImagesCommand.endSuccessfully(BaseImagesCommand.java:393) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.internalEndSuccessfully(CommandBase.java:751) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.endActionInTransactionScope(CommandBase.java:709) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1989) [bll.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInRequired(TransactionSupport.java:139) [utils.jar:]
    at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:105) [utils.jar:]
    at org.ovirt.engine.core.bll.CommandBase.endAction(CommandBase.java:574) [bll.jar:]
    at org.ovirt.engine.core.bll.Backend.endAction(Backend.java:536) [bll.jar:]

Y.
(Relevant) error snippet from the log:

<error>

lago log:
  File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
    testMethod()
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 129, in wrapped_test
    test()
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 59, in wrapper
    return func(get_test_prefix(), *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 68, in wrapper
    return func(prefix.virt_env.engine_vm().get_api(), *args, **kwargs)
  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 500, in verify_add_vm_template
    lambda: api.vms.get(VM1_NAME).status.state == 'down',
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 271, in assert_true_within_long
    assert_equals_within_long(func, True, allowed_exceptions)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 258, in assert_equals_within_long
    func, value, LONG_TIMEOUT, allowed_exceptions=allowed_exceptions
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 219, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 500, in <lambda>
    lambda: api.vms.get(VM1_NAME).status.state == 'down',
'NoneType' object has no attribute 'status'
-------------------- >> begin captured logging << --------------------
ovirtlago.testlib: ERROR: * Unhandled exception in <function <lambda> at 0x4bafb90>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 219, in assert_equals_within
    res = func()
  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 500, in <lambda>
    lambda: api.vms.get(VM1_NAME).status.state == 'down',
AttributeError: 'NoneType' object has no attribute 'status'
--------------------- >> end captured logging << ---------------------
This is a test error - code like this is always wrong:
api.vms.get(VM1_NAME).status.state == something
In this case, api.vms.get() returned None. The test code should handle None and fail with a meaningful error, or perhaps retry the call.
It doesn't matter much - if api.vms.get(VM1_NAME) returns None, we'll fail the test either way. We could add an assert here, but we would still fail in exactly the same manner. Failing gracefully with 'VM1_NAME was not found'? Perhaps.
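A minimal sketch of the defensive pattern discussed above; the helper name is illustrative and not part of the actual test suite:

```python
def get_vm_state(api, vm_name):
    """Return the VM's state, failing with a clear message if lookup fails."""
    vm = api.vms.get(vm_name)
    # Fail fast with a meaningful error instead of the AttributeError
    # we get when .status is read off None.
    assert vm is not None, '%s was not found' % vm_name
    return vm.status.state

# The polling lambda in 004_basic_sanity.py would then become:
#     testlib.assert_true_within_long(
#         lambda: get_vm_state(api, VM1_NAME) == 'down')
```

The failure mode is the same, but the assertion message tells the reader immediately that the VM lookup, not the state check, is what went wrong.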
engine log:
2017-11-22 10:40:39,454-05 DEBUG [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) [] Message received: {"jsonrpc": "2.0", "id": "32bfeb29-e70f-48b2-84c7-91cf22ea408c", "result": {"status": "OK", "domain": "2d776fb7-eba8-4ee2-97b6-4350c28ddb39", "voltype": "LEAF", "description": "", "parent": "a1192fcc-4660-41a7-a1a5-a221808f603b", "format": "COW", "generation": 0, "image": "dc033235-ab4b-4ef0-be79-6e8a22bfb0bc", "ctime": "1511365200", "disktype": "DATA", "legality": "LEGAL", "mtime": "0", "apparentsize": "1073741824", "children": [], "pool": "", "capacity": "41126400", "uuid": "6a354b93-86af-41fb-a68b-10897ae3562f", "truesize": "1073741824", "type": "SPARSE", "lease": {"owners": [], "version": null}}}
2017-11-22 10:40:39,454-05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] FINISH, GetVolumeInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@6903fdd6, log id: 85b379d
2017-11-22 10:40:39,455-05 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.GetImageInfoVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] FINISH, GetImageInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@6903fdd6, log id: 75d29166
2017-11-22 10:40:39,455-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] method: runVdsCommand, params: [GetImageInfo, GetImageInfoVDSCommandParameters:{storagePoolId='0762073b-b0ab-4ee2-8af9-dafa930a5fd4', ignoreFailoverLimit='false', storageDomainId='2d776fb7-eba8-4ee2-97b6-4350c28ddb39', imageGroupId='dc033235-ab4b-4ef0-be79-6e8a22bfb0bc', imageId='6a354b93-86af-41fb-a68b-10897ae3562f'}], timeElapsed: 165ms
2017-11-22 10:40:39,528-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [c90b2f35-7a80-476f-9b23-78beecef7c37] method: getVmsWithPlugInfo, params: [dc033235-ab4b-4ef0-be79-6e8a22bfb0bc], timeElapsed: 68ms
2017-11-22 10:40:39,543-05 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] transaction rolled back
2017-11-22 10:40:39,543-05 WARN [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Trying to release a shared lock for key: '93d271d5-e0af-4795-b800-55465c7b8826TEMPLATE' , but lock does not exist
2017-11-22 10:40:39,543-05 WARN [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Trying to release a shared lock for key: 'fb76119e-9283-4c8b-9185-5d79e2481bc7DISK' , but lock does not exist
2017-11-22 10:40:39,543-05 WARN [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Trying to release exclusive lock which does not exist, lock key: 'vm1VM_NAME'
2017-11-22 10:40:39,543-05 INFO [org.ovirt.engine.core.bll.AddVmCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Lock freed to object 'EngineLock:{exclusiveLocks='[vm1=VM_NAME]', sharedLocks='[93d271d5-e0af-4795-b800-55465c7b8826=TEMPLATE, fb76119e-9283-4c8b-9185-5d79e2481bc7=DISK]'}'
2017-11-22 10:40:39,556-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] EVENT_ID: USER_ADD_VM_FINISHED_FAILURE(60), Failed to complete VM vm1 creation.
2017-11-22 10:40:39,556-05 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedThreadFactory-engineScheduled-Thread-34) [] Failed invoking callback end method 'onSucceeded' for command '2760a62f-e4ee-4dd2-b13a-1bb548e3e62d' with exception 'null', the callback is marked for end method retries
2017-11-22 10:40:39,935-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-35) [] method: get, params: [d25423f6-34bc-4cde-82fa-c0eedfc85386], timeElapsed: 3ms
2017-11-22 10:40:39,949-05 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-71) [] method: get, params: [7c9ec3eb-2838-4b06-9408-326b67869bdc], timeElapsed: 2ms
2017-11-22 10:40:40,606-05 DEBUG [org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default task-22) [] Entered SsoRestApiAuthFilter
vdsm log:

2017-11-22 10:36:11,973-0500 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='5b6ab546-b560-41fc-9b56-de465aa55eee') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getAllTasksInfo
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2216, in getAllTasksInfo
    allTasksInfo = self._pool.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2017-11-22 10:36:11,975-0500 INFO (jsonrpc/7) [storage.TaskManager.Task] (Task='5b6ab546-b560-41fc-9b56-de465aa55eee') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1181)
2017-11-22 10:36:11,976-0500 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH getAllTasksInfo error=Secured object is not in safe state (dispatcher:86)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
    return m(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
    raise self.error
SecureError: Secured object is not in safe state
2017-11-22 10:36:11,976-0500 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllTasksInfo failed (error 654) in 0.01 seconds (__init__:573)
2017-11-22 10:36:11,979-0500 INFO (jsonrpc/2) [vdsm.api] START getAllTasksStatuses(spUUID=None, options=None) from=::1,52728, task_id=172b5528-cc76-426a-81d3-43143b49a878 (api:46)
2017-11-22 10:36:11,979-0500 INFO (jsonrpc/2) [vdsm.api] FINISH getAllTasksStatuses error=Secured object is not in safe state from=::1,52728, task_id=172b5528-cc76-426a-81d3-43143b49a878 (api:50)
This is https://bugzilla.redhat.com/1426429
Not related to the real error.
2017-11-22 10:36:11,980-0500 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='172b5528-cc76-426a-81d3-43143b49a878') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in getAllTasksStatuses
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2176, in getAllTasksStatuses
    allTasksStatus = self._pool.getAllTasksStatuses()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2017-11-22 10:36:11,980-0500 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='172b5528-cc76-426a-81d3-43143b49a878') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1181)
2017-11-22 10:36:11,981-0500 ERROR (jsonrpc/2) [storage.Dispatcher] FINISH getAllTasksStatuses error=Secured object is not in safe state (dispatcher:86)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper
    return m(self, *a, **kw)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare
    raise self.error
SecureError: Secured object is not in safe state
The way to find the relevant errors is to locate the entry point of this flow in engine.log, and then follow it in vdsm.log using flow_id=xxx.
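As a toy illustration of that triage step, assuming the flow id appears verbatim (e.g. inside brackets, as in the engine.log excerpt above) in both logs:

```python
def lines_with_flow_id(log_path, flow_id):
    """Collect every log line mentioning the given flow id, so the
    engine.log entry point can be correlated with vdsm.log entries."""
    with open(log_path) as f:
        return [line.rstrip('\n') for line in f if flow_id in line]
```

Running this over engine.log and vdsm.log with the same id gives both halves of the flow side by side.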
Ideally we would send a uuid to the engine for every operation, and be able to query the operation status using that uuid. This is how storage jobs have been managed on the vdsm side since 4.1.
If we had this, every test would generate a uuid, and debugging failing tests would be much faster.
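A rough sketch of what the per-operation uuid could look like on the test side. The correlation_id keyword is an assumption about the API wrapper, shown only to make the idea concrete, not an existing parameter:

```python
import uuid


def run_tagged(api_call, *args, **kwargs):
    """Run an API call tagged with a fresh uuid; the same id can then be
    grepped for in engine.log and vdsm.log to follow the whole flow.

    Assumes the wrapped call accepts a correlation_id keyword, which is
    hypothetical here.
    """
    op_id = str(uuid.uuid4())
    result = api_call(*args, correlation_id=op_id, **kwargs)
    return op_id, result
```

A failing test could then print its op_id, and whoever triages the failure would have a single string to search for across all the logs.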
Nir
</error>
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel