Q: Failed creating snapshot

Hi,

I have oVirt 4.4 with nodes 4.2.8. An attempt to create a snapshot failed with:

EventID 10802 VDSM node11 command SnapshotVDS failed: Message timeout can be caused by communication issues.
Event ID 2022 Add-disk operation failed to complete.

However, the snapshot is on disk, since the VM now takes 1200 GB instead of 500. Yet the snapshot list in the VM details is empty.

How can I recover the disk space occupied by the unsuccessful snapshot? I looked at /vmraid/nfs/disks/xxxx-xxxx-xxx/images and there are no distinctive files which I can identify as a snapshot.

Thanks in advance.
Andrei

On Fri, Oct 30, 2020 at 10:43 AM Andrei Verovski <andreil1@starlett.lv> wrote:
Hi,
I have oVirt 4.4 with nodes 4.2.8. An attempt to create a snapshot failed with:
EventID 10802 VDSM node11 command SnapshotVDS failed: Message timeout can be caused by communication issues.
Event ID 2022 Add-disk operation failed to complete.
However, the snapshot is on disk, since the VM now takes 1200 GB instead of 500. Yet the snapshot list in the VM details is empty.
How can I recover the disk space occupied by the unsuccessful snapshot? I looked at /vmraid/nfs/disks/xxxx-xxxx-xxx/images
Is /vmraid/nfs/disks/ your storage domain directory on the server?

You can find the disk UUID on the engine side, and search for the disk snapshots the engine knows about:

$ sudo -u postgres psql -d engine

engine=# select image_group_id,image_guid from images where image_group_id = '4b62aa6d-3bdd-4db3-b26f-0484c4124631';
            image_group_id            |              image_guid
--------------------------------------+--------------------------------------
 4b62aa6d-3bdd-4db3-b26f-0484c4124631 | 20391fbc-fa77-4fef-9aea-cf59f27f90b5
 4b62aa6d-3bdd-4db3-b26f-0484c4124631 | 13ad63bf-71e2-4243-8967-ab4324818b31
 4b62aa6d-3bdd-4db3-b26f-0484c4124631 | b5e56953-bf86-4496-a6bd-19132cba9763
(3 rows)

Looking up the same image on storage looks like this:

$ find /rhev/data-center/mnt/nfs1\:_export_3/ -name 4b62aa6d-3bdd-4db3-b26f-0484c4124631
/rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631

$ ls -1 /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631
13ad63bf-71e2-4243-8967-ab4324818b31
13ad63bf-71e2-4243-8967-ab4324818b31.lease
13ad63bf-71e2-4243-8967-ab4324818b31.meta
20391fbc-fa77-4fef-9aea-cf59f27f90b5
20391fbc-fa77-4fef-9aea-cf59f27f90b5.lease
20391fbc-fa77-4fef-9aea-cf59f27f90b5.meta
b5e56953-bf86-4496-a6bd-19132cba9763
b5e56953-bf86-4496-a6bd-19132cba9763.lease
b5e56953-bf86-4496-a6bd-19132cba9763.meta

Note that for every image we have a file without extension (the actual image data) plus .meta and .lease files.

You can find the leaf volume (the volume used by the VM) like this:

$ grep LEAF /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/*.meta
/rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/b5e56953-bf86-4496-a6bd-19132cba9763.meta:VOLTYPE=LEAF

From this you can get the image chain:

$ sudo -u vdsm qemu-img info --backing-chain /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/b5e56953-bf86-4496-a6bd-19132cba9763
image: /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/b5e56953-bf86-4496-a6bd-19132cba9763
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 103 MiB
cluster_size: 65536
backing file: 13ad63bf-71e2-4243-8967-ab4324818b31 (actual path: /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/13ad63bf-71e2-4243-8967-ab4324818b31)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/13ad63bf-71e2-4243-8967-ab4324818b31
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 4.88 MiB
cluster_size: 65536
backing file: 20391fbc-fa77-4fef-9aea-cf59f27f90b5 (actual path: /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/20391fbc-fa77-4fef-9aea-cf59f27f90b5)
backing file format: qcow2
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

image: /rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/4b62aa6d-3bdd-4db3-b26f-0484c4124631/20391fbc-fa77-4fef-9aea-cf59f27f90b5
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 1.65 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Any other image in this directory which the engine does not know about, and which is not part of the backing chain, is safe to delete. For example, if you found unknown files:

3b37d7e3-a6d7-4f76-9145-2732669d2ebd
3b37d7e3-a6d7-4f76-9145-2732669d2ebd.meta
3b37d7e3-a6d7-4f76-9145-2732669d2ebd.lease

you can safely delete them.

This may be more complicated if the snapshot failed after changing the .meta file, and your leaf is the image that the engine does not know about. In that case you will have to fix the metadata.

If you are not sure about this, please share the output of qemu-img info --backing-chain shown above and the contents of your .meta files in the image directory.

Nir
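A minimal sketch of how the comparison above could be scripted, assuming the engine database and the storage mount are reachable from the same host; the disk UUID and mount path below are the example values from this thread, not real ones, and must be replaced with your own:

#!/bin/bash
# Sketch: list volume UUIDs present in the image directory that the
# engine database does not know about. DISK_ID and IMG_DIR are the
# example values from this thread - replace them for your environment.

DISK_ID="4b62aa6d-3bdd-4db3-b26f-0484c4124631"
IMG_DIR="/rhev/data-center/mnt/nfs1:_export_3/f5915245-0ac5-4712-b8b2-dd4d4be7cdc4/images/$DISK_ID"

# Volume UUIDs the engine knows about (unaligned, tuples-only output).
sudo -u postgres psql -d engine -At \
    -c "select image_guid from images where image_group_id = '$DISK_ID';" \
    | sort > /tmp/engine_vols

# Volume UUIDs present on storage (strip the .meta/.lease extensions).
ls "$IMG_DIR" | sed -E 's/\.(meta|lease)$//' | sort -u > /tmp/storage_vols

# Volumes on storage that the engine does not know about.
comm -13 /tmp/engine_vols /tmp/storage_vols

Anything the last command prints is only a candidate orphan (the data file plus its .meta and .lease); as noted above, if the leaf turns out to be a volume the engine does not know about, deleting it would lose data, so verify against the qemu-img info --backing-chain output before removing anything.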