Digging a bit further, I found this is a known issue: a discrepancy can occur between vdsm and the engine db when removing a snapshot.

It has already been discussed [1] and a bug is filed [2]. In the discussion you can find a workaround, which is manual removal of the snapshot from the db. Before making any changes, back up the engine database by running the 'engine-backup' tool on the engine node. Restoring requires a few more options and can be done like this:

# engine-backup --file=/var/lib/ovirt-engine-backup/ovirt-engine-backup-20210602055605.backup --mode=restore --provision-all-databases
To check whether the discrepancy occurred, query the db and compare the result to what vdsm sees (vdsm is the source of truth). The example below shows a consistent setup from my env with one snapshot; if there is anything extra in the db in your env, it should be removed and the parent id changed accordingly [3].
image_group_id (db) == image (vdsm)
image_guid (db) == logical volume on host (vdsm)
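To make that comparison concrete, here is a minimal Python sketch of the check. The function name and structure are mine for illustration only (not part of any oVirt tooling); the UUIDs are the ones from the example output below.

```python
# Compare engine-db image rows against the volume list vdsm reports for
# the same image group. vdsm is the source of truth, so any image_guid
# present in the db but absent from vdsm is a stale row to clean up.

def find_stale_db_rows(db_rows, vdsm_volumes):
    """db_rows: list of (image_guid, parentid) tuples from the images table.
    vdsm_volumes: set of volume UUIDs from vdsm-tool dump-volume-chains."""
    return [guid for guid, _parent in db_rows if guid not in vdsm_volumes]

# Rows as returned by the psql query in the example below.
db_rows = [
    ("1955f6de-658a-43c3-969b-79db9b4bf14c", "00000000-0000-0000-0000-000000000000"),
    ("d6662661-eb87-4c01-a204-477919e65221", "1955f6de-658a-43c3-969b-79db9b4bf14c"),
]
# Volumes as listed by vdsm-tool dump-volume-chains for the same image.
vdsm_volumes = {
    "1955f6de-658a-43c3-969b-79db9b4bf14c",
    "d6662661-eb87-4c01-a204-477919e65221",
}

print(find_stale_db_rows(db_rows, vdsm_volumes))  # consistent setup -> []
```

An empty result means db and vdsm agree; any UUID it returns is a candidate for the manual removal described above.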
Engine node:
# su - postgres
# psql
postgres=# \c engine
engine=# select image_guid, image_group_id, parentid from images where image_group_id = 'e75318bf-c563-4d66-99e4-63645736a418';
image_guid | image_group_id | parentid
--------------------------------------+--------------------------------------+--------------------------------------
1955f6de-658a-43c3-969b-79db9b4bf14c | e75318bf-c563-4d66-99e4-63645736a418 | 00000000-0000-0000-0000-000000000000
d6662661-eb87-4c01-a204-477919e65221 | e75318bf-c563-4d66-99e4-63645736a418 | 1955f6de-658a-43c3-969b-79db9b4bf14c
Host node:
# vdsm-tool dump-volume-chains <STORAGE_DOMAIN_ID>
Images volume chains (base volume first)
image: e75318bf-c563-4d66-99e4-63645736a418
- 1955f6de-658a-43c3-969b-79db9b4bf14c
status: OK, voltype: INTERNAL, format: RAW, legality: LEGAL, type: PREALLOCATED, capacity: 5368709120, truesize: 5368709120
- d6662661-eb87-4c01-a204-477919e65221
status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: SPARSE, capacity: 5368709120, truesize: 3221225472
...
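To illustrate the cleanup itself (removing the extra row and re-pointing the parent id), here is a sketch that simulates it against an in-memory SQLite copy of the images table. This is illustrative only: the stale UUID is invented for the demo, and the real fix means running the equivalent DELETE/UPDATE via psql against the engine PostgreSQL database, per [3], with values from your own env.

```python
import sqlite3

# Minimal stand-in for the engine's images table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (image_guid TEXT, image_group_id TEXT, parentid TEXT)")

group = "e75318bf-c563-4d66-99e4-63645736a418"
base  = "1955f6de-658a-43c3-969b-79db9b4bf14c"
stale = "aaaaaaaa-0000-0000-0000-000000000000"   # invented UUID: row vdsm no longer knows about
leaf  = "d6662661-eb87-4c01-a204-477919e65221"

conn.executemany("INSERT INTO images VALUES (?, ?, ?)", [
    (base,  group, "00000000-0000-0000-0000-000000000000"),
    (stale, group, base),    # leftover from the failed snapshot removal
    (leaf,  group, stale),   # leaf wrongly chained to the stale row
])

# Step 1: drop the row that has no matching volume in vdsm.
conn.execute("DELETE FROM images WHERE image_guid = ?", (stale,))
# Step 2: re-point the orphaned child at the stale row's parent,
# restoring the base -> leaf chain that vdsm reports.
conn.execute("UPDATE images SET parentid = ? WHERE parentid = ?", (base, stale))

print(conn.execute(
    "SELECT image_guid, parentid FROM images ORDER BY image_guid").fetchall())
```

After the two statements, the table matches the consistent two-volume chain shown above.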
I hope this helps a bit; if you need further assistance, let us know. It's not very convenient to change the db manually like this, but a fix should be on the way :)
[1] https://lists.ovirt.org/archives/list/users@ovirt.org/thread/7ZU7NWHBW3B2NBPQPNRVAAU7CVJ5PEKG/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1948599
[3] https://lists.ovirt.org/archives/list/users@ovirt.org/message/D2HKS2RFMNKGP54JVA3D5MVUYKKQVZII/

On Tue, Jun 1, 2021 at 8:19 PM David Johnson <djohnson@maxistechnology.com> wrote:

Yes, I have the same error on the second try. You can see it happening in the engine log starting at 2021-05-31 07:49.

On Tue, Jun 1, 2021 at 7:48 AM Roman Bednar <rbednar@redhat.com> wrote:

Hi David,

Awesome, thanks for the reply. Looking at the logs, there does not seem to be anything suspicious on the vdsm side, and as you said, the snapshots are really gone when looking from vdsm. I tried to reproduce without much success, but it looks like a problem on the engine side.

Did you get the same error saying that the disks are illegal on the second try? There should be more in the engine log, so try checking it as well to see if this is really on the engine side.

It would be great to have a reproducer for this and file the bug, so we can track this and provide a fix.

-Roman

On Mon, May 31, 2021 at 3:20 PM David Johnson <djohnson@maxistechnology.com> wrote:

Hi Roman,

Thank you for your assistance.

I found another snapshot that needed collapsing, and deleted that. These logs include that execution.

Prior to the execution, the vdsm-dump listed snapshot volumes. Post-execution, the snapshot volumes were absent. That suggests to me that the snapshot was actually removed, but oVirt is confused.

On Mon, May 31, 2021 at 5:54 AM Roman Bednar <rbednar@redhat.com> wrote:

Hello David,

There are quite a few reasons a volume could be marked as illegal, e.g. a failed operation that left the volume in this state. This is done in vdsm, so please provide a vdsm log from the host running the VM so we can check exactly what went wrong.

Also, the state of the storage domain could be helpful to see what volumes are present and marked illegal. You can get that information by running:

# vdsm-client StorageDomain dump sd_id=<SD_ID>

-Roman

On Fri, May 28, 2021 at 6:10 PM David Johnson <djohnson@maxistechnology.com> wrote:

Hi all,

I patched one of my Windows VMs yesterday. I started by snapshotting the VM, then applied the Windows update. Now that the patch has been tested, I want to remove the snapshot. I get this message:

Error while executing action:
win-sql-2019:
- Cannot remove Snapshot. The following attached disks are in ILLEGAL status: win-2019-tmpl_Disk1 - please remove them and try again.
Does anyone have any thoughts on how to recover from this? I really don't want to keep this snapshot hanging around.

Thanks in advance,
David Johnson
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/PJODGOB2MI6EQQHSJSRXFWRZGJXMZH6P/