One disk in illegal state after deleting snapshot with several disks

Hi,

I'm using oVirt 4.2.5. I know it is not the newest anymore, but our cluster is quite big and I will upgrade to 4.2.8 as soon as possible.

I have a VM with several disks: one with virtio (the boot device) and three other disks with virtio-scsi:

    log              8190c797-0ed8-421f-85c7-cc1f540408f8   1 GiB
    root             5edab51c-9113-466c-bd27-e73d4bfb29c4  10 GiB
    tmp              11d74762-6053-4347-bdf2-4838dc2ea6f0   1 GiB
    web_web-content  bb5b1881-d40f-4ad1-a8c8-8ee594b3fe8a  20 GiB

The snapshots were quite small, because not much is changing there. All disks are on an NFS v3 share running on a NetApp cluster.

Some IDs:

    VM ID:             bc25c5c9-353b-45ba-b0d5-5dbba41e9c5f
    affected disk ID:  6cbd2f85-8335-416f-a208-ef60ecd839a4
    snapshot ID:       c8103ae8-3432-4b69-8b91-790cdc37a2da
    snapshot disk ID:  2564b125-857e-41fa-b187-2832df277ccf
    task ID:           2a60efb5-1a11-49ac-a7f0-406faac219d6
    storage domain ID: 14794a3e-16fc-4dd3-a867-10507acfe293

After triggering the snapshot delete task (2a60efb5-1a11-49ac-a7f0-406faac219d6), the deletion was running for about one hour. I thought it was hanging, so I restarted the engine process on the self-hosted engine... After that, the snapshot was still locked, so I removed the lock:

    # /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t snapshot c8103ae8-3432-4b69-8b91-790cdc37a2da
    ##########################################
    CAUTION, this operation may lead to data corruption and should be used with care. Please contact support prior to running this command
    ##########################################
    Are you sure you want to proceed? [y/n]
    y
    select fn_db_unlock_snapshot('c8103ae8-3432-4b69-8b91-790cdc37a2da');
    INSERT 0 1
    unlock snapshot c8103ae8-3432-4b69-8b91-790cdc37a2da completed successfully.

When I tried to delete the snapshot again, the engine reported that the disk is in status Illegal. The snapshot file for the disk "log" is still there:

    -rw-rw----. 1 vdsm kvm 282M Feb  1 13:33 2564b125-857e-41fa-b187-2832df277ccf
    -rw-rw----. 1 vdsm kvm 1.0M Jan 22 03:22 2564b125-857e-41fa-b187-2832df277ccf.lease
    -rw-r--r--. 1 vdsm kvm  267 Feb  1 14:05 2564b125-857e-41fa-b187-2832df277ccf.meta
    -rw-rw----. 1 vdsm kvm 1.0G Feb  1 11:53 6cbd2f85-8335-416f-a208-ef60ecd839a4
    -rw-rw----. 1 vdsm kvm 1.0M Jan 10 12:07 6cbd2f85-8335-416f-a208-ef60ecd839a4.lease
    -rw-r--r--. 1 vdsm kvm  272 Jan 22 03:22 6cbd2f85-8335-416f-a208-ef60ecd839a4.meta

The snapshots of all other disks were merged successfully. After I saw that the snapshot disk was still in use, I unmounted the disk inside the VM; that is why its modification time no longer changes. The strange thing is that the merge apparently was working for a short time, because the modification time of the underlying base volume changed as well...
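To see whether the merge actually made progress on the storage, I think the chain can be inspected read-only on the NFS domain. Below is a minimal sketch of what I have in mind, not something I have run yet: the mount-point name is a placeholder (it depends on the NFS export), and the UUIDs are the storage domain and image group IDs from above.

    # Minimal sketch, not yet run: inspect the qcow2 chain read-only.
    # <nfs-export> is a placeholder for the real mount-point name.
    cd /rhev/data-center/mnt/<nfs-export>/14794a3e-16fc-4dd3-a867-10507acfe293/images/8190c797-0ed8-421f-85c7-cc1f540408f8

    # The leaf should still name the base volume as its backing file
    # if the merge did not complete:
    qemu-img info --backing-chain 2564b125-857e-41fa-b187-2832df277ccf

    # Consistency check of the COW leaf; the result is only reliable
    # while the image is not being written to:
    qemu-img check 2564b125-857e-41fa-b187-2832df277ccf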
In the database, I have this data about the VM and its snapshots:

    engine=# select snapshot_id,snapshot_type,status,description from snapshots where vm_id='bc25c5c9-353b-45ba-b0d5-5dbba41e9c5f';
                 snapshot_id             | snapshot_type | status | description
    -------------------------------------+---------------+--------+-------------
    f596ba1c-4a6e-4372-9df4-c8e870c55fea | ACTIVE        | OK     | Active VM
    c8103ae8-3432-4b69-8b91-790cdc37a2da | REGULAR       | OK     | cab-3449

    engine=# select image_guid,parentid,imagestatus,vm_snapshot_id,volume_type,volume_format,active from images where image_group_id='8190c797-0ed8-421f-85c7-cc1f540408f8';
                 image_guid              |               parentid               | imagestatus |            vm_snapshot_id            | volume_type | volume_format | active
    -------------------------------------+--------------------------------------+-------------+--------------------------------------+-------------+---------------+--------
    2564b125-857e-41fa-b187-2832df277ccf | 6cbd2f85-8335-416f-a208-ef60ecd839a4 |           1 | f596ba1c-4a6e-4372-9df4-c8e870c55fea |           2 |             4 | t
    6cbd2f85-8335-416f-a208-ef60ecd839a4 | 00000000-0000-0000-0000-000000000000 |           4 | c8103ae8-3432-4b69-8b91-790cdc37a2da |           2 |             5 | f

vdsm, on the other hand, reports the whole chain as LEGAL:

    # vdsm-tool dump-volume-chains 14794a3e-16fc-4dd3-a867-10507acfe293

    image:    8190c797-0ed8-421f-85c7-cc1f540408f8

              - 6cbd2f85-8335-416f-a208-ef60ecd839a4
                status: OK, voltype: INTERNAL, format: RAW, legality: LEGAL, type: SPARSE

              - 2564b125-857e-41fa-b187-2832df277ccf
                status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: SPARSE

@bzlotnik, it would be great if you could help me get the disk back without stopping or starting the VM. I'm really afraid of deleting snapshots now... I will send you the vdsm logs from the host running the VM and from the SPM, plus the engine.log.

Thank you very much!

BR Florian Schmid
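PS: If I read the output above correctly, the engine has the base volume marked ILLEGAL (imagestatus 4, where 1 = OK) while vdsm considers the whole chain LEGAL. The remediation I keep finding for this situation is to reset the flag in the engine database so the merge can be retried. This is only a sketch of what I understand the fix to be; I have not run it and would not without confirmation, given the data-corruption warning that unlock_entity.sh itself prints:

    -- Candidate fix, NOT executed: mark the base volume OK again
    -- (imagestatus: 1 = OK, 4 = ILLEGAL) so the engine can retry the merge.
    engine=# update images
    engine-#    set imagestatus = 1
    engine-#  where image_guid = '6cbd2f85-8335-416f-a208-ef60ecd839a4'
    engine-#    and imagestatus = 4;

If that is the right direction, is it safe to do while the VM is running?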