Error while removing snapshot: Unable to get volume info

Hi all,

I'm trying to remove a snapshot from an HA VM in a setup with glusterfs (2 nodes C8 Stream oVirt 4.4 + 1 arbiter C8). The error that appears in the vdsm log of the host is:

2022-01-10 09:33:03,003+0100 ERROR (jsonrpc/4) [api] FINISH merge error=Merge failed: {'top': '441354e7-c234-4079-b494-53fa99cdce6f', 'base': 'fdf38f20-3416-4d75-a159-2a341b1ed637', 'job': '50206e3a-8018-4ea8-b191-e4bc859ae0c7', 'reason': 'Unable to get volume info for domain 574a3cd1-5617-4742-8de9-4732be4f27e0 volume 441354e7-c234-4079-b494-53fa99cdce6f'} (api:131)

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/livemerge.py", line 285, in merge
    drive.domainID, drive.poolID, drive.imageID, job.top)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5988, in getVolumeInfo
    (domainID, volumeID))
vdsm.virt.errors.StorageUnavailableError: Unable to get volume info for domain 574a3cd1-5617-4742-8de9-4732be4f27e0 volume 441354e7-c234-4079-b494-53fa99cdce6f

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 124, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 776, in merge
    drive, baseVolUUID, topVolUUID, bandwidth, jobUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5833, in merge
    driveSpec, baseVolUUID, topVolUUID, bandwidth, jobUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/livemerge.py", line 288, in merge
    str(e), top=top, base=job.base, job=job_id)

The volume list on the host differs from the engine one:

HOST:

vdsm-tool dump-volume-chains 574a3cd1-5617-4742-8de9-4732be4f27e0 | grep -A10 0b995271-e7f3-41b3-aff7-b5ad7942c10d
image: 0b995271-e7f3-41b3-aff7-b5ad7942c10d
  - fdf38f20-3416-4d75-a159-2a341b1ed637
    status: OK, voltype: INTERNAL, format: COW, legality: LEGAL, type: SPARSE, capacity: 53687091200, truesize: 44255387648
  - 10df3adb-38f4-41d1-be84-b8b5b86e92cc
    status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: SPARSE, capacity: 53687091200, truesize: 7335407616

ls -1 0b995271-e7f3-41b3-aff7-b5ad7942c10d
10df3adb-38f4-41d1-be84-b8b5b86e92cc
10df3adb-38f4-41d1-be84-b8b5b86e92cc.lease
10df3adb-38f4-41d1-be84-b8b5b86e92cc.meta
fdf38f20-3416-4d75-a159-2a341b1ed637
fdf38f20-3416-4d75-a159-2a341b1ed637.lease
fdf38f20-3416-4d75-a159-2a341b1ed637.meta

ENGINE:

engine=# select * from images where image_group_id='0b995271-e7f3-41b3-aff7-b5ad7942c10d';
-[ RECORD 1 ]---------+-------------------------------------
image_guid            | 10df3adb-38f4-41d1-be84-b8b5b86e92cc
creation_date         | 2022-01-07 11:23:43+01
size                  | 53687091200
it_guid               | 00000000-0000-0000-0000-000000000000
parentid              | 441354e7-c234-4079-b494-53fa99cdce6f
imagestatus           | 1
lastmodified          | 2022-01-07 11:23:39.951+01
vm_snapshot_id        | bd2291a4-8018-4874-a400-8d044a95347d
volume_type           | 2
volume_format         | 4
image_group_id        | 0b995271-e7f3-41b3-aff7-b5ad7942c10d
_create_date          | 2022-01-07 11:23:41.448463+01
_update_date          | 2022-01-07 11:24:10.414777+01
active                | t
volume_classification | 0
qcow_compat           | 2
-[ RECORD 2 ]---------+-------------------------------------
image_guid            | 441354e7-c234-4079-b494-53fa99cdce6f
creation_date         | 2021-12-15 07:16:31.647+01
size                  | 53687091200
it_guid               | 00000000-0000-0000-0000-000000000000
parentid              | fdf38f20-3416-4d75-a159-2a341b1ed637
imagestatus           | 1
lastmodified          | 2022-01-07 11:23:41.448+01
vm_snapshot_id        | 2d610958-59e3-4685-b209-139b4266012f
volume_type           | 2
volume_format         | 4
image_group_id        | 0b995271-e7f3-41b3-aff7-b5ad7942c10d
_create_date          | 2021-12-15 07:16:32.37005+01
_update_date          | 2022-01-07 11:23:41.448463+01
active                | f
volume_classification | 1
qcow_compat           | 0
-[ RECORD 3 ]---------+-------------------------------------
image_guid            | fdf38f20-3416-4d75-a159-2a341b1ed637
creation_date         | 2020-08-12 17:16:07+02
size                  | 53687091200
it_guid               | 00000000-0000-0000-0000-000000000000
parentid              | 00000000-0000-0000-0000-000000000000
imagestatus           | 4
lastmodified          | 2021-12-15 07:16:32.369+01
vm_snapshot_id        | 603811ba-3cdd-4388-a971-05e300ced0c3
volume_type           | 2
volume_format         | 4
image_group_id        | 0b995271-e7f3-41b3-aff7-b5ad7942c10d
_create_date          | 2020-08-12 17:16:07.506823+02
_update_date          | 2021-12-15 07:16:32.37005+01
active                | f
volume_classification | 1
qcow_compat           | 2

However, in the engine GUI I see only two snapshot IDs:
1. 10df3adb-38f4-41d1-be84-b8b5b86e92cc (status OK)
2. 441354e7-c234-4079-b494-53fa99cdce6f (disk status Illegal)

So the situation is:
- on the host I see two volumes, both with status OK
- in the engine GUI I see two volumes, one OK and the other with its disk in Illegal status
- in the engine DB I see three volumes, i.e. the combination of the two previous views

I would like to avoid restarting the VM; any advice on fixing this messy situation? I can attach engine.log/vdsm.log.

Thank you for your time and help,
Francesco
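For comparison, the chain as the engine sees it can be reconstructed directly from the images table shown above. This is only a read-only sketch (it assumes nothing beyond the images columns visible in the output above and standard PostgreSQL recursive CTEs), so it is safe to run before changing anything:

-- Walk the chain the engine believes in, starting from the base volume
-- (parentid = all zeros) and following parentid links towards the leaf.
WITH RECURSIVE chain AS (
    SELECT image_guid, parentid, imagestatus, active, vm_snapshot_id, 1 AS depth
      FROM images
     WHERE image_group_id = '0b995271-e7f3-41b3-aff7-b5ad7942c10d'
       AND parentid = '00000000-0000-0000-0000-000000000000'
    UNION ALL
    SELECT i.image_guid, i.parentid, i.imagestatus, i.active, i.vm_snapshot_id, c.depth + 1
      FROM images i
      JOIN chain c ON i.parentid = c.image_guid
     WHERE i.image_group_id = '0b995271-e7f3-41b3-aff7-b5ad7942c10d'
)
SELECT * FROM chain ORDER BY depth;

Against the data above this walk returns three rows (fdf38f20... -> 441354e7... -> 10df3adb...), while vdsm-tool dump-volume-chains reports only two volumes on storage, which is exactly the mismatch described.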

My problem seems to be the same as the one filed here: https://bugzilla.redhat.com/show_bug.cgi?id=1948599

So, if I'm correct, I must edit DB entries to fix the situation. Although I don't like operating directly on the DB, I'll try that and let you know if I resolve it.

In the meantime, if anyone has any tips or suggestions that don't involve editing the DB, I'd much appreciate it.

Regards,
Francesco
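Before editing anything, it can help to see which snapshot each volume row belongs to. A read-only sketch, with the caveat that the snapshots table and the images.vm_snapshot_id = snapshots.snapshot_id join are assumed from the standard engine schema (they are not shown in this thread), so verify the column names on your engine version first:

-- Map each volume of this disk to its snapshot; imagestatus 1 = OK, 4 = ILLEGAL (engine convention).
SELECT i.image_guid,
       i.parentid,
       i.imagestatus,
       i.active,
       s.snapshot_id,
       s.description,
       s.status
  FROM images i
  LEFT JOIN snapshots s ON s.snapshot_id = i.vm_snapshot_id
 WHERE i.image_group_id = '0b995271-e7f3-41b3-aff7-b5ad7942c10d'
 ORDER BY i.creation_date;

With the data above, the row for 441354e7-c234-4079-b494-53fa99cdce6f (vm_snapshot_id 2d610958-59e3-4685-b209-139b4266012f) is the one with no matching volume on the host.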

On Mon, Jan 10, 2022 at 5:22 PM Francesco Lorenzini via Users <users@ovirt.org> wrote:
My problem seems to be the same as the one filed here: https://bugzilla.redhat.com/show_bug.cgi?id=1948599
So, if I'm correct, I must edit DB entries to fix the situation. Although I don't like operating directly on the DB, I'll try that and let you know if I resolve it.
It looks like the volume on the vdsm side was already removed, so when the engine tries to merge, the merge fails. This is an engine bug: it should handle this case and remove the illegal snapshot from the DB. But since it does not, you have to do this manually. Please file an engine bug for this issue.
In the meantime, if anyone has any tips or suggestions that don't involve editing the DB, I'd much appreciate it.
I don't think there is another way.

Nir
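For anyone landing on this thread with the same symptoms, the manual cleanup Nir describes would, for the data shown above, amount to making the engine's chain match the host's (fdf38f20... -> 10df3adb...): repoint the leaf at the real base, drop the stale image row, and remove its orphaned snapshot record. The following is only a sketch built from the values in this thread, not a verified or supported procedure: take a full engine backup (engine-backup) first, make sure no tasks are running on the disk, double-check every UUID, and note that the snapshots table and the dependent tables mentioned in the comments are assumptions based on the standard engine schema.

BEGIN;

-- Repoint the leaf volume at the real base, matching the chain on the host.
UPDATE images
   SET parentid = 'fdf38f20-3416-4d75-a159-2a341b1ed637'
 WHERE image_guid = '10df3adb-38f4-41d1-be84-b8b5b86e92cc';

-- Remove the volume row that no longer exists on storage.
-- (If foreign keys block this, dependent rows such as image_storage_domain_map
--  may need to be removed first.)
DELETE FROM images
 WHERE image_guid = '441354e7-c234-4079-b494-53fa99cdce6f';

-- Remove the now-empty snapshot record, only if no other volume references it.
DELETE FROM snapshots
 WHERE snapshot_id = '2d610958-59e3-4685-b209-139b4266012f'
   AND NOT EXISTS (SELECT 1 FROM images
                    WHERE vm_snapshot_id = '2d610958-59e3-4685-b209-139b4266012f');

-- If the base volume is still flagged ILLEGAL in the engine afterwards
-- (imagestatus = 4 in RECORD 3 above), it may also need to be reset:
-- UPDATE images SET imagestatus = 1 WHERE image_guid = 'fdf38f20-3416-4d75-a159-2a341b1ed637';

-- Review the affected row counts, then COMMIT; otherwise ROLLBACK.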
participants (3)
- Francesco Lorenzini
- francesco@shellrent.com
- Nir Soffer