Cannot remove Snapshot. The VM is during a backup operation.

Hello! Running oVirt version 4.5.5-1.el8, I had an issue with the iSCSI server during a backup, and now I have two VMs that can no longer be backed up by Veeam. The oVirt event log shows the following errors:

  Snapshot 'Auto-generated for Backup VM' creation for VM 'dns-a' has been completed.
  VDSM ovirt1-02 command StartNbdServerVDS failed: Bitmap does not exist: "{'reason': 'Bitmap does not exist in /rhev/data-center/mnt/blockSD/b2fa3469-a380-4180-a89a-43d65085d1b9/images/6a4de98a-b544-4df8-beb1-e560fd61c0e6/cdb26b8b-c447-48de-affa-d7f778aebac7', 'bitmap': '12d2fb20-74da-4e63-b240-f1a42210760c'}"
  Transfer was stopped by system. Reason: failed to create a signed image ticket.
  Image Download with disk dns-a_Disk1 was initiated by veeam@internal-authz
  Image Download with disk dns-a_Disk1 was cancelled.

The error on the Veeam backup proxy:

  dns-a: Unable to create image transfer: Reason: 'Operation Failed', Detail: '[]'

When I try to delete the snapshot from the administration interface, I receive the following error in the web interface (and nothing gets logged in the event log):

  Cannot remove Snapshot. The VM is during a backup operation.

How should I go about fixing this issue?

Here is how I manually cancel Veeam backup operations caught in a finalizing state. I've logged multiple tickets with Veeam about this issue, and there is a forum post [1] about it as well.

1. Obtain an OAuth access token (if Keycloak is used):

  curl -k -H "Accept: application/json" 'https://<engine>/ovirt-engine/sso/oauth/token?grant_type=password&username=<username>&password=<password>&scope=ovirt-app-api'

   where <username> and <password> are URL encoded. This is easily done with urllib in Python:
  >>> import urllib.parse
  >>> urllib.parse.quote("admin@ovirt@internalsso")
  'admin%40ovirt%40internalsso'
  >>> urllib.parse.quote("super!secret")
  'super%21secret'
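Putting step 1 together, here is a short Python sketch that URL-encodes the credentials, builds the token request URL, and pulls access_token out of the JSON reply. The engine hostname, credentials, and the response body are placeholders, not captured from a live engine:

```python
import json
import urllib.parse

# Hypothetical values -- substitute your engine FQDN and credentials.
engine = "engine.example.com"
username = "admin@ovirt@internalsso"
password = "super!secret"

# URL-encode the credentials, as step 1 requires.
params = urllib.parse.urlencode({
    "grant_type": "password",
    "username": username,
    "password": password,
    "scope": "ovirt-app-api",
})
url = f"https://{engine}/ovirt-engine/sso/oauth/token?{params}"

# The engine answers with a JSON object; the token is the value of
# "access_token". (Sample body below; a real reply has more fields.)
sample_response = '{"access_token": "eyJhI...FpZpA", "token_type": "bearer"}'
token = json.loads(sample_response)["access_token"]
```

Fetching `url` with curl (or urllib.request) against a real engine returns the JSON that `sample_response` stands in for here.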
   The curl command returns a JSON object. Extract the value of access_token and set it aside; it's good for a while.

  token=eyJhI...FpZpA

   More information can be found here: https://www.ovirt.org/documentation/doc-REST_API_Guide/#authentication

2. Gather the VM IDs of the affected machines from the Engine UI.

  vmid=141d2650-a1c6-430b-b364-cc15f83ec50c

3. List the backups for each VM. This returns an XML object; extract the backup ID from <backup id="...">.

  curl --insecure --header "Authorization: Bearer ${token}" --request GET --header 'Version: 4' https://<engine>/ovirt-engine/api/vms/${vmid}/backups

   https://www.ovirt.org/documentation/doc-REST_API_Guide/#services-vm_backup-m...

4. Log into the engine, or anywhere the oVirt SDK is installed.

5. Configure the client; check and set the following:
   a. Create a file called ovirt.conf in ~/.config.
   b. Download a copy of the engine CA to the home directory.
   c. Specify the engine URL, username, password, and path to the CA:

  [engine]
  engine_url=https://<fqdn-of-engine>
  username=<username>
  password=<password>
  cafile=ca.pem

   The <username> and <password> are not encoded here. Ensure the permissions of the file are set to 0600 or 0400.

6. Use the backup_vm.py script in /usr/share/doc/python3-ovirt-engine-sdk4/examples:

  engine ~]$ /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py -c <engine-id> stop <vm-id> <backup-id>

   where <engine-id> matches an entry in ~/.config/ovirt.conf. In this case I called it "engine".

Once the backup has been cancelled, go ahead and delete any lingering Veeam snapshots. You should be able to create snapshots again as well, but I noticed that Veeam backups will consistently fail if any preexisting snapshots remain: Veeam seems to think the disk image contains 0 bytes and fails immediately. The only way I found to back up a VM using the most recent Veeam plugin is to ensure all snapshots are deleted.
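The XML returned in step 3 can be picked apart with a few lines of Python instead of by eye. The sample document below is a minimal sketch of the response shape (element names per the oVirt REST API conventions; a live reply carries many more elements), and the backup ID it contains is made up:

```python
import xml.etree.ElementTree as ET

# Sample reply from GET /api/vms/${vmid}/backups -- shape assumed from the
# REST API guide; the id and phase values here are illustrative only.
sample = """\
<backups>
  <backup id="a1b2c3d4-0000-1111-2222-333344445555">
    <phase>finalizing</phase>
  </backup>
</backups>"""

root = ET.fromstring(sample)
# Collect (backup-id, phase) pairs; the id attribute is the <backup-id>
# that step 6 needs.
backups = [(b.get("id"), b.findtext("phase")) for b in root.findall("backup")]
```

Backups stuck as described here show up with a phase of "finalizing", which is how I pick out the ones to stop.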
The other thing I recently noticed is that the ovirt-imageio service eventually stops responding to requests, resulting in consistent backup failures. Each backup request remains in a finalizing state and has to be cancelled manually as outlined above.

After the last round of backup failures, I finally noticed 14+ connections from the RHEV Backup Proxy from Veeam (or passed through the engine by Veeam) to each hypervisor, all in the CLOSE_WAIT state. The ovirt-imageio logs show uncaught exceptions and "connection closed by peer" responses; I'm not sure if that is why the connections linger at the application level. However, if there are no backups in progress, go ahead and restart the service:

  systemctl restart ovirt-imageio

It might take a few minutes to restart. After that the service should start responding again, and all CLOSE_WAIT connections will be gone.

Each time I find something different with the most recent RHEV plugin from Veeam, so my advice should be taken with a grain of salt. Hope this helps!

[1] https://forums.veeam.com/red-hat-virtualization-f62/veeam-rhv-12-1-command-r...
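To spot the lingering-connection symptom quickly, the state column of `ss -tan` output can be counted with a tiny Python helper. The sample output below is fabricated for illustration (port 54322 is where ovirt-imageio normally listens); the helper itself is just line counting:

```python
def count_close_wait(ss_output: str, state: str = "CLOSE-WAIT") -> int:
    """Count sockets in the given state in `ss -tan`-style output,
    where the state is the first column of each line."""
    return sum(1 for line in ss_output.splitlines() if line.startswith(state))

# On a hypervisor you would feed it live data, e.g.:
#   out = subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout
# Fabricated sample for illustration:
sample = """\
State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
CLOSE-WAIT 1      0      10.0.0.2:54322      10.0.0.9:41234
ESTAB      0      0      10.0.0.2:22         10.0.0.5:51000
CLOSE-WAIT 1      0      10.0.0.2:54322      10.0.0.9:41236
"""
print(count_close_wait(sample))  # 2
```

If the count stays high while no backups are running, that matches the condition where restarting ovirt-imageio helped me.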
participants (2)
- and@missme.ro
- Jon Sattelberger