Hi,

I'm using oVirt 4.2.5. I know it is no longer the newest release, but our cluster is quite big and I will upgrade to 4.2.8 as soon as possible.

I have a VM with several disks, one attached via virtio (the boot device) and three others attached via virtio-scsi:
log                8190c797-0ed8-421f-85c7-cc1f540408f8    1 GiB
root               5edab51c-9113-466c-bd27-e73d4bfb29c4   10 GiB
tmp                11d74762-6053-4347-bdf2-4838dc2ea6f0    1 GiB
web_web-content    bb5b1881-d40f-4ad1-a8c8-8ee594b3fe8a   20 GiB
Snapshots were quite small, because not much is changing there. All disks are on an NFSv3 share served by a NetApp cluster.

Some IDs:
VM ID:                bc25c5c9-353b-45ba-b0d5-5dbba41e9c5f
Affected disk ID:     6cbd2f85-8335-416f-a208-ef60ecd839a4
Snapshot ID:          c8103ae8-3432-4b69-8b91-790cdc37a2da
Snapshot disk ID:     2564b125-857e-41fa-b187-2832df277ccf
Task ID:              2a60efb5-1a11-49ac-a7f0-406faac219d6
Storage domain ID:    14794a3e-16fc-4dd3-a867-10507acfe293



After triggering the snapshot delete task (2a60efb5-1a11-49ac-a7f0-406faac219d6), the deletion ran for about one hour. I thought it was hanging, so I restarted the engine process on the self-hosted engine...
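
In hindsight, I guess I should have first checked on the host running the VM whether the live merge block job was still progressing, with something like the commands below (VM_NAME and the disk target sda are only placeholders, I don't have the exact values at hand):

virsh -r list --all
virsh -r domblklist VM_NAME
virsh -r blockjob VM_NAME sda --info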

After that, the snapshot was still locked, so I removed the lock:
/usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t snapshot c8103ae8-3432-4b69-8b91-790cdc37a2da

##########################################
CAUTION, this operation may lead to data corruption and should be used with care. Please contact support prior to running this command
##########################################

Are you sure you want to proceed? [y/n]
y
select fn_db_unlock_snapshot('c8103ae8-3432-4b69-8b91-790cdc37a2da');
 
INSERT 0 1
unlock snapshot c8103ae8-3432-4b69-8b91-790cdc37a2da completed successfully.



When I tried to delete the snapshot again, the engine gave the error that the disk is in status Illegal.

The snapshot volume for the disk "log" is still there:
-rw-rw----. 1 vdsm kvm 282M Feb  1 13:33 2564b125-857e-41fa-b187-2832df277ccf
-rw-rw----. 1 vdsm kvm 1.0M Jan 22 03:22 2564b125-857e-41fa-b187-2832df277ccf.lease
-rw-r--r--. 1 vdsm kvm  267 Feb  1 14:05 2564b125-857e-41fa-b187-2832df277ccf.meta
-rw-rw----. 1 vdsm kvm 1.0G Feb  1 11:53 6cbd2f85-8335-416f-a208-ef60ecd839a4
-rw-rw----. 1 vdsm kvm 1.0M Jan 10 12:07 6cbd2f85-8335-416f-a208-ef60ecd839a4.lease
-rw-r--r--. 1 vdsm kvm  272 Jan 22 03:22 6cbd2f85-8335-416f-a208-ef60ecd839a4.meta

All other snapshots have been merged successfully. I unmounted the disk inside the VM after I saw that the snapshot volume was still in use; that's why its timestamp is no longer changing.
The strange thing is that the merge appears to have been working for a short time, because the timestamp of the underlying (parent) volume has changed as well...
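
If it helps, I can also check the on-disk chain from the NFS mount with something like this (possibly with -U / --force-share, since the volume is still in use by the running VM); I haven't pasted the output here:

qemu-img info --backing-chain 2564b125-857e-41fa-b187-2832df277ccf

That should show whether the leaf still references 6cbd2f85-8335-416f-a208-ef60ecd839a4 as its backing file.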

In the database, I have this data about the VM and its snapshots:
engine=# select snapshot_id,snapshot_type,status,description from snapshots where vm_id='bc25c5c9-353b-45ba-b0d5-5dbba41e9c5f';
             snapshot_id              | snapshot_type | status | description
--------------------------------------+---------------+--------+-------------
 f596ba1c-4a6e-4372-9df4-c8e870c55fea | ACTIVE        | OK     | Active VM
 c8103ae8-3432-4b69-8b91-790cdc37a2da | REGULAR       | OK     | cab-3449

engine=# select image_guid,parentid,imagestatus,vm_snapshot_id,volume_type,volume_format,active from images where image_group_id='8190c797-0ed8-421f-85c7-cc1f540408f8';
              image_guid              |               parentid               | imagestatus |            vm_snapshot_id            | volume_type | volume_format | active
--------------------------------------+--------------------------------------+-------------+--------------------------------------+-------------+---------------+--------
 2564b125-857e-41fa-b187-2832df277ccf | 6cbd2f85-8335-416f-a208-ef60ecd839a4 |           1 | f596ba1c-4a6e-4372-9df4-c8e870c55fea |           2 |             4 | t
 6cbd2f85-8335-416f-a208-ef60ecd839a4 | 00000000-0000-0000-0000-000000000000 |           4 | c8103ae8-3432-4b69-8b91-790cdc37a2da |           2 |             5 | f
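
If I understand correctly, imagestatus 4 is the ILLEGAL state here, so I guess the fix would in the end involve something like the statement below (and probably more, depending on whether the merge actually completed on storage), but I don't want to touch the database without your confirmation:

engine=# update images set imagestatus = 1 where image_guid = '6cbd2f85-8335-416f-a208-ef60ecd839a4';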

Output of vdsm-tool dump-volume-chains 14794a3e-16fc-4dd3-a867-10507acfe293:

   image:    8190c797-0ed8-421f-85c7-cc1f540408f8

             - 6cbd2f85-8335-416f-a208-ef60ecd839a4

               status: OK, voltype: INTERNAL, format: RAW, legality: LEGAL, type: SPARSE

             - 2564b125-857e-41fa-b187-2832df277ccf

               status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: SPARSE



@bzlotnik,
it would be great if you could help me get the disk back, without stopping or starting the VM. I'm really afraid of deleting snapshots now...
I will send you the vdsm.log from the host running the VM, the vdsm.log from the SPM, and the engine.log.

Thank you very much!

BR Florian Schmid