[ OST Failure Report ] [ oVirt Master (ovirt-engine) ] [ 27-03-2018 ] [ 004_basic_sanity.disk_operations ]

Hi, We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure. *Link and headline of suspected patches: core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/ <https://gerrit.ovirt.org/#/c/87671/>Link to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/>(Relevant) error snippet from the log: <error>2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm12018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters:{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.]2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c93-3bf767f6efa4], timeElapsed: 8ms</error>*

Looking into this On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure.
*Link and headline of suspected patches: core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/ <https://gerrit.ovirt.org/#/c/87671/>Link to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/>(Relevant) error snippet from the log: <error>2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm12018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters:{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.]2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c93-3bf767f6efa4], timeElapsed: 8ms</error>*

This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a few months back. I'll apply the same fix for the snapshot_cold_merge. . On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
Looking into this
On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure.
*Link and headline of suspected patches: core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/ <https://gerrit.ovirt.org/#/c/87671/>Link to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/>(Relevant) error snippet from the log: <error>2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm12018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters:{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.]2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c93-3bf767f6efa4], timeElapsed: 8ms</error>*

Thanks Beni. can you please paste the fix that you submit to the mail? Thanks, Dafna On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a few months back. I'll apply the same fix for the snapshot_cold_merge. .
On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
Looking into this
On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure.
*Link and headline of suspected patches: core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/ <https://gerrit.ovirt.org/#/c/87671/>Link to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/>(Relevant) error snippet from the log: <error>2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm12018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters:{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.]2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c93-3bf767f6efa4], timeElapsed: 8ms</error>*

https://gerrit.ovirt.org/#/c/89534/ https://gerrit.ovirt.org/#/c/89537/ On Wed, Mar 28, 2018 at 12:01 PM, Dafna Ron <dron@redhat.com> wrote:
Thanks Beni.
can you please paste the fix that you submit to the mail?
Thanks, Dafna
On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a few months back. I'll apply the same fix for the snapshot_cold_merge. .
On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
Looking into this
On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure.
*Link and headline of suspected patches: core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/ <https://gerrit.ovirt.org/#/c/87671/>Link to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/>(Relevant) error snippet from the log: <error>2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm12018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters:{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.]2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c93-3bf767f6efa4], timeElapsed: 8ms</error>*

On Wed, Mar 28, 2018 at 2:48 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/89534/ https://gerrit.ovirt.org/#/c/89537/
Merged.
On Wed, Mar 28, 2018 at 12:01 PM, Dafna Ron <dron@redhat.com> wrote:
Thanks Beni.
can you please paste the fix that you submit to the mail?
Thanks, Dafna
On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a few months back. I'll apply the same fix for the snapshot_cold_merge. .
On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
Looking into this
On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure.
*Link and headline of suspected patches: core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/ <https://gerrit.ovirt.org/#/c/87671/>Link to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-004_basic_sanity.py/>(Relevant) error snippet from the log: <error>2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Reasons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm12018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters:{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.]2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c93-3bf767f6efa4], timeElapsed: 8ms</error>*
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- *GAL bEN HAIM* RHV DEVOPS

On Wed, Mar 28, 2018 at 4:19 PM, Gal Ben Haim <gbenhaim@redhat.com> wrote:
On Wed, Mar 28, 2018 at 2:48 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/89534/ https://gerrit.ovirt.org/#/c/89537/
Merged.
Seems like it failed again similarly: 08:05:07 vm1_snapshots_service.snapshot_service(snapshot.id).remove() 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 20230, in remove 08:05:07 self._internal_remove(headers, query, wait) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 271, in _internal_remove 08:05:07 return future.wait() if wait else future 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 55, in wait 08:05:07 return self._code(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 268, in callback 08:05:07 self._check_fault(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 132, in _check_fault 08:05:07 self._raise_error(response, body) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 118, in _raise_error 08:05:07 raise error 08:05:07 Error: Fault reason is "Operation Failed". Fault detail is "[Cannot remove Snapshot. Snapshot is currently being created for VM vm1.]". HTTP response code is 409. http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6... Can you please check? Thanks.
On Wed, Mar 28, 2018 at 12:01 PM, Dafna Ron <dron@redhat.com> wrote:
Thanks Beni.
can you please paste the fix that you submit to the mail?
Thanks, Dafna
On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a few months back. I'll apply the same fix for the snapshot_cold_merge. .
On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
Looking into this
On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on 004_basic_sanity.disk_operations. the reported change seems to me to be connected to the failure.
Link and headline of suspected patches:
core: introduce CreateSnapshotForVm - https://gerrit.ovirt.org/#/c/87671/
Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/
Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/...
(Relevant) error snippet from the log:
<error>
2018-03-27 11:30:16,167-04 WARN [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action 'CreateSnapshotForVm' failed for user admin@internal-authz. Re asons: VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName vm1 2018-03-27 11:30:16,176-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, params: [CreateSnapshotForVm, CreateSnapshotForVmParameters :{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], timeElapsed: 57ms 2018-03-27 11:30:16,186-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently being created for VM vm1.] 2018-03-27 11:30:16,820-04 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' (id: '4c4023 b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete 2018-03-27 11:30:16,833-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (EE-ManagedThreadFactory-engineScheduled-Thread-100) [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: [a85992e9-40b8-4c80-9c 93-3bf767f6efa4], timeElapsed: 8ms
</error>
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- GAL bEN HAIM RHV DEVOPS
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Didi

I forgot to add the polling after the second snapshot as well, it was added in this patch: https://gerrit.ovirt.org/#/c/89577/ Sorry for the inconvenience On Thu, Mar 29, 2018 at 1:46 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Wed, Mar 28, 2018 at 4:19 PM, Gal Ben Haim <gbenhaim@redhat.com> wrote:
On Wed, Mar 28, 2018 at 2:48 PM, Benny Zlotnik <bzlotnik@redhat.com>
wrote:
https://gerrit.ovirt.org/#/c/89534/ https://gerrit.ovirt.org/#/c/89537/
Merged.
Seems like it failed again similarly:
08:05:07 vm1_snapshots_service.snapshot_service(snapshot.id).remove() 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 20230, in remove 08:05:07 self._internal_remove(headers, query, wait) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 271, in _internal_remove 08:05:07 return future.wait() if wait else future 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 55, in wait 08:05:07 return self._code(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 268, in callback 08:05:07 self._check_fault(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 132, in _check_fault 08:05:07 self._raise_error(response, body) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 118, in _raise_error 08:05:07 raise error 08:05:07 Error: Fault reason is "Operation Failed". Fault detail is "[Cannot remove Snapshot. Snapshot is currently being created for VM vm1.]". HTTP response code is 409.
http://jenkins.ovirt.org/job/ovirt-system-tests_master_ check-patch-el7-x86_64/4770/consoleFull
Can you please check? Thanks.
On Wed, Mar 28, 2018 at 12:01 PM, Dafna Ron <dron@redhat.com> wrote:
Thanks Beni.
can you please paste the fix that you submit to the mail?
Thanks, Dafna
On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a
few
months back. I'll apply the same fix for the snapshot_cold_merge. .
On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
Looking into this
On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote: > > Hi, > > We have a failure on 004_basic_sanity.disk_operations. the reported > change seems to me to be connected to the failure. > > Link and headline of suspected patches: > > core: introduce CreateSnapshotForVm - > https://gerrit.ovirt.org/#/c/87671/ > > > Link to Job: > > http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ > > Link to all logs: > > > http://jenkins.ovirt.org/job/ovirt-master_change-queue-
tester/6564/artifact/exported-artifacts/basic-suit-master- el7/test_logs/basic-suite-master/post-004_basic_sanity.py/
> > (Relevant) error snippet from the log: > > <error> > > > 2018-03-27 11:30:16,167-04 WARN > [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (default > task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of action > 'CreateSnapshotForVm' failed for user admin@internal-authz. Re > asons: > VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_ SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName > vm1 > 2018-03-27 11:30:16,176-04 DEBUG > [org.ovirt.engine.core.common.di.interceptor. DebugLoggingInterceptor] > (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: runAction, > params: [CreateSnapshotForVm, CreateSnapshotForVmParameters > :{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', > commandType='Unknown', vmId='65e7fea5-8b7d-4281-bd0d- 6a84de207a00'}], > timeElapsed: 57ms > 2018-03-27 11:30:16,186-04 ERROR > [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default > task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is currently > being created for VM vm1.] > 2018-03-27 11:30:16,820-04 INFO > [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] > (EE-ManagedThreadFactory-engineScheduled-Thread-100) > [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command 'CopyImageGroupVolumesData' > (id: '4c4023 > b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: > '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete > 2018-03-27 11:30:16,833-04 DEBUG > [org.ovirt.engine.core.common.di.interceptor. DebugLoggingInterceptor] > (EE-ManagedThreadFactory-engineScheduled-Thread-100) > [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: > [a85992e9-40b8-4c80-9c > 93-3bf767f6efa4], timeElapsed: 8ms > > > </error> > >
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- GAL bEN HAIM RHV DEVOPS
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Didi

On Thu, Mar 29, 2018 at 1:51 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
I forgot to add the polling after the second snapshot as well, it was added in this patch: https://gerrit.ovirt.org/#/c/89577/
Sorry for the inconvenience
NP :-) Now happened again: http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6... detail: Cannot remove Snapshot. Snapshot is currently being created for VM vm0.
On Thu, Mar 29, 2018 at 1:46 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Wed, Mar 28, 2018 at 4:19 PM, Gal Ben Haim <gbenhaim@redhat.com> wrote:
On Wed, Mar 28, 2018 at 2:48 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/89534/ https://gerrit.ovirt.org/#/c/89537/
Merged.
Seems like it failed again similarly:
08:05:07 vm1_snapshots_service.snapshot_service(snapshot.id).remove() 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 20230, in remove 08:05:07 self._internal_remove(headers, query, wait) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 271, in _internal_remove 08:05:07 return future.wait() if wait else future 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 55, in wait 08:05:07 return self._code(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 268, in callback 08:05:07 self._check_fault(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 132, in _check_fault 08:05:07 self._raise_error(response, body) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 118, in _raise_error 08:05:07 raise error 08:05:07 Error: Fault reason is "Operation Failed". Fault detail is "[Cannot remove Snapshot. Snapshot is currently being created for VM vm1.]". HTTP response code is 409.
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
Can you please check? Thanks.
On Wed, Mar 28, 2018 at 12:01 PM, Dafna Ron <dron@redhat.com> wrote:
Thanks Beni.
can you please paste the fix that you submit to the mail?
Thanks, Dafna
On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
This issue looks like it's caused by the DB lock being released before the engine memory lock, there was a similar race with the LSM test a few months back. I'll apply the same fix for the snapshot_cold_merge. .
On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote: > > Looking into this > > On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote: >> >> Hi, >> >> We have a failure on 004_basic_sanity.disk_operations. the reported >> change seems to me to be connected to the failure. >> >> Link and headline of suspected patches: >> >> core: introduce CreateSnapshotForVm - >> https://gerrit.ovirt.org/#/c/87671/ >> >> >> Link to Job: >> >> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ >> >> Link to all logs: >> >> >> >> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... >> >> (Relevant) error snippet from the log: >> >> <error> >> >> >> 2018-03-27 11:30:16,167-04 WARN >> [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] >> (default >> task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of >> action >> 'CreateSnapshotForVm' failed for user admin@internal-authz. Re >> asons: >> >> VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName >> vm1 >> 2018-03-27 11:30:16,176-04 DEBUG >> >> [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] >> (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: >> runAction, >> params: [CreateSnapshotForVm, CreateSnapshotForVmParameters >> :{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', >> commandType='Unknown', >> vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], >> timeElapsed: 57ms >> 2018-03-27 11:30:16,186-04 ERROR >> [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] >> (default >> task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is >> currently >> being created for VM vm1.] >> 2018-03-27 11:30:16,820-04 INFO >> [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] >> (EE-ManagedThreadFactory-engineScheduled-Thread-100) >> [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command >> 'CopyImageGroupVolumesData' >> (id: '4c4023 >> b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: >> '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete >> 2018-03-27 11:30:16,833-04 DEBUG >> >> [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] >> (EE-ManagedThreadFactory-engineScheduled-Thread-100) >> [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: >> [a85992e9-40b8-4c80-9c >> 93-3bf767f6efa4], timeElapsed: 8ms >> >> >> </error> >> >> >
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- GAL bEN HAIM RHV DEVOPS
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Didi
-- Didi

On Mon, Apr 2, 2018 at 9:18 AM, Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Mar 29, 2018 at 1:51 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
I forgot to add the polling after the second snapshot as well, it was added in this patch: https://gerrit.ovirt.org/#/c/89577/
Sorry for the inconvenience
NP :-)
Now happened again:
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
detail: Cannot remove Snapshot. Snapshot is currently being created for VM vm0.
Well, probably because your patch(es) were only for basic-suite, and above failure is in HE. Hopefully will be handled by Gal's '[WIP] he&basic master: Share tests': https://gerrit.ovirt.org/89615
On Thu, Mar 29, 2018 at 1:46 PM, Yedidyah Bar David <didi@redhat.com> wrote:
On Wed, Mar 28, 2018 at 4:19 PM, Gal Ben Haim <gbenhaim@redhat.com> wrote:
On Wed, Mar 28, 2018 at 2:48 PM, Benny Zlotnik <bzlotnik@redhat.com> wrote:
https://gerrit.ovirt.org/#/c/89534/ https://gerrit.ovirt.org/#/c/89537/
Merged.
Seems like it failed again similarly:
08:05:07 vm1_snapshots_service.snapshot_service(snapshot.id).remove() 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/services.py", line 20230, in remove 08:05:07 self._internal_remove(headers, query, wait) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 271, in _internal_remove 08:05:07 return future.wait() if wait else future 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 55, in wait 08:05:07 return self._code(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 268, in callback 08:05:07 self._check_fault(response) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 132, in _check_fault 08:05:07 self._raise_error(response, body) 08:05:07 File "/usr/lib64/python2.7/site-packages/ovirtsdk4/service.py", line 118, in _raise_error 08:05:07 raise error 08:05:07 Error: Fault reason is "Operation Failed". Fault detail is "[Cannot remove Snapshot. Snapshot is currently being created for VM vm1.]". HTTP response code is 409.
http://jenkins.ovirt.org/job/ovirt-system-tests_master_check-patch-el7-x86_6...
Can you please check? Thanks.
On Wed, Mar 28, 2018 at 12:01 PM, Dafna Ron <dron@redhat.com> wrote:
Thanks Beni.
can you please paste the fix that you submit to the mail?
Thanks, Dafna
On Wed, Mar 28, 2018 at 9:13 AM, Benny Zlotnik <bzlotnik@redhat.com> wrote: > > This issue looks like it's caused by the DB lock being released > before > the engine memory lock, there was a similar race with the LSM test a > few > months back. I'll apply the same fix for the snapshot_cold_merge. > . > > On Tue, Mar 27, 2018 at 10:03 PM, Benny Zlotnik <bzlotnik@redhat.com> > wrote: >> >> Looking into this >> >> On Tue, Mar 27, 2018 at 6:45 PM, Dafna Ron <dron@redhat.com> wrote: >>> >>> Hi, >>> >>> We have a failure on 004_basic_sanity.disk_operations. the reported >>> change seems to me to be connected to the failure. >>> >>> Link and headline of suspected patches: >>> >>> core: introduce CreateSnapshotForVm - >>> https://gerrit.ovirt.org/#/c/87671/ >>> >>> >>> Link to Job: >>> >>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/ >>> >>> Link to all logs: >>> >>> >>> >>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6564/artifact/... >>> >>> (Relevant) error snippet from the log: >>> >>> <error> >>> >>> >>> 2018-03-27 11:30:16,167-04 WARN >>> [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] >>> (default >>> task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] Validation of >>> action >>> 'CreateSnapshotForVm' failed for user admin@internal-authz. Re >>> asons: >>> >>> VAR__ACTION__CREATE,VAR__TYPE__SNAPSHOT,ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM,$VmName >>> vm1 >>> 2018-03-27 11:30:16,176-04 DEBUG >>> >>> [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] >>> (default task-11) [9f713a3d-82a9-4db8-8207-bb69a0a2c550] method: >>> runAction, >>> params: [CreateSnapshotForVm, CreateSnapshotForVmParameters >>> :{commandId='d8bfdb0e-5ae4-4fed-8852-a0b29e377708', user='null', >>> commandType='Unknown', >>> vmId='65e7fea5-8b7d-4281-bd0d-6a84de207a00'}], >>> timeElapsed: 57ms >>> 2018-03-27 11:30:16,186-04 ERROR >>> [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] >>> (default >>> task-11) [] Operation Failed: [Cannot create Snapshot. Snapshot is >>> currently >>> being created for VM vm1.] >>> 2018-03-27 11:30:16,820-04 INFO >>> [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] >>> (EE-ManagedThreadFactory-engineScheduled-Thread-100) >>> [16c08ce6-9a73-489f-a1f2-56e21699f14a] Command >>> 'CopyImageGroupVolumesData' >>> (id: '4c4023 >>> b4-21e8-4a98-8ac3-aba51967d160') waiting on child command id: >>> '3d96b933-6e3b-4838-94bc-db3ff0cadcdc' type:'CopyData' to complete >>> 2018-03-27 11:30:16,833-04 DEBUG >>> >>> [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] >>> (EE-ManagedThreadFactory-engineScheduled-Thread-100) >>> [16c08ce6-9a73-489f-a1f2-56e21699f14a] method: get, params: >>> [a85992e9-40b8-4c80-9c >>> 93-3bf767f6efa4], timeElapsed: 8ms >>> >>> >>> </error> >>> >>> >> >
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- GAL bEN HAIM RHV DEVOPS
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Didi
-- Didi
-- Didi
participants (4)
-
Benny Zlotnik
-
Dafna Ron
-
Gal Ben Haim
-
Yedidyah Bar David