Hi Francesco,

Thanks for sharing.

There was a release note specifically for that issue.
In order to use incremental backup in the 4.4.5 engine, the host must be upgraded to 4.4.5 as well. 

In 4.4.4 version, backup scratch disk created on the host local storage, in 4.4.5 scratch disks created on 
the shared storage where the backed-up disk resides.

There is a bug for this to fail the backup operation with more proper error - https://bugzilla.redhat.com/1950467



On Thu, 22 Apr 2021 at 11:14, Francesco via Devel <devel@ovirt.org> wrote:
Hi,

i upgraded one of the hosts where the backup failed because I noticed that in the lasts deployed hosts the process was smooth and without errors. Maybe the update of on of the following packages (or others) fixed the issue:

FROM

[root@host1.server.tld ~]# rpm -qa | grep vdsm
vdsm-common-4.40.26.3-1.el8.noarch
vdsm-http-4.40.26.3-1.el8.noarch
vdsm-hook-vhostmd-4.40.26.3-1.el8.noarch
vdsm-yajsonrpc-4.40.26.3-1.el8.noarch
vdsm-network-4.40.26.3-1.el8.x86_64
vdsm-api-4.40.26.3-1.el8.noarch
vdsm-4.40.26.3-1.el8.x86_64
vdsm-hook-ethtool-options-4.40.26.3-1.el8.noarch
vdsm-client-4.40.26.3-1.el8.noarch
vdsm-hook-vmfex-dev-4.40.26.3-1.el8.noarch
vdsm-hook-openstacknet-4.40.26.3-1.el8.noarch
vdsm-jsonrpc-4.40.26.3-1.el8.noarch
vdsm-hook-fcoe-4.40.26.3-1.el8.noarch
vdsm-python-4.40.26.3-1.el8.noarch

TO

rpm[root@host2.server.tld ~]# rpm -qa | grep vdsm
vdsm-4.40.50.9-1.el8.x86_64
vdsm-jsonrpc-4.40.50.9-1.el8.noarch
vdsm-api-4.40.50.9-1.el8.noarch
vdsm-python-4.40.50.9-1.el8.noarch
vdsm-yajsonrpc-4.40.50.9-1.el8.noarch
vdsm-hook-openstacknet-4.40.50.9-1.el8.noarch
vdsm-client-4.40.50.9-1.el8.noarch
vdsm-hook-vhostmd-4.40.50.9-1.el8.noarch
vdsm-hook-fcoe-4.40.50.9-1.el8.noarch
vdsm-common-4.40.50.9-1.el8.noarch
vdsm-http-4.40.50.9-1.el8.noarch
vdsm-hook-ethtool-options-4.40.50.9-1.el8.noarch
vdsm-network-4.40.50.9-1.el8.x86_64
vdsm-hook-vmfex-dev-4.40.50.9-1.el8.noarch


I shutdown the vms and upgraded the host via engine (without reboot). I hope the packages upgraded helped, not a simple vms reboot.

Is there any way to upgrade the hosts/ vdsm packages without shutdown the vms? Migrating it's not an option sadly.

Regards,
Francesco

Il 22/04/2021 09:37, francesco--- via Devel ha scritto:
Hi all,

I use a custom python script for performing full and incremental backup on few hundreds vms. I customized the example script in https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/backup_vm.py.

Everything was working fine, but starting from a few days ago, some vms starts to fail the backup. I upgraded the engine (self-hosted) from 4.4.4 to 4.4.5 yesterday afternoon and today most of the vms failed the incremental and full backup.

The error is triggered when trying to get the backp_service in the following function:

```
def backup_disk(disk, backups_service, from_checkpoint_id):
    # We need to set an array because backups_service.add need it
    disks = []
    disks.append(disk)
    add_error = False

    try:
        backup = backups_service.add(
                types.Backup(
                    disks=disks,
                    from_checkpoint_id=from_checkpoint_id
                )
            )
    except Exception as e:
        exception = str(e)
        logging.error("error on backups_service.add, set backup object to False. Error: {} ".format(exception))
        add_error = True
        backup = False

    if not add_error:
        backup_service = backups_service.backup_service(backup.id)
        logging.info("Backup phase: {}".format(backup.phase))
        while backup.phase != types.BackupPhase.READY:
            time.sleep(1)
            try:
                backup = backup_service.get()
            except sdk.NotFoundError:
                logging.error("backup_service NotFoundError, set backup object to False")
                backup = False
                break

        if backup and backup.to_checkpoint_id is not None:
            logging.info("Created checkpoint {}".format(backup.to_checkpoint_id))

    return backup
```

Here's the error in the engine logs. 

```
2021-04-21 11:31:21,750+02 INFO  [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Thread pool 'default' is using 0 threads out of 1, 5 threads waiting for tasks.
2021-04-21 11:31:21,750+02 INFO  [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Thread pool 'engine' is using 0 threads out of 500, 50 threads waiting for tasks and 0 tasks in queue.
2021-04-21 11:31:21,750+02 INFO  [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Thread pool 'engineScheduledThreadPool' is using 2 threads out of 1, 98 threads waiting for tasks.
2021-04-21 11:31:21,750+02 INFO  [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Thread pool 'engineThreadMonitoringThreadPool' is using 1 threads out of 1, 0 threads waiting for tasks.
2021-04-21 11:31:21,750+02 INFO  [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1) [] Thread pool 'hostUpdatesChecker' is using 0 threads out of 5, 5 threads waiting for tasks.
2021-04-21 11:31:59,013+02 INFO  [org.ovirt.engine.core.sso.service.AuthenticationService] (default task-19) [] User admin@internal-authz with profile [internal] successfully logged in with scopes: ovirt-app-api ovirt-ext=token-info:authz-search ovirt-ext=token-info:public-authz-search ovirt-ext=token-info:validate ovirt-ext=token:password-access
2021-04-21 11:31:59,124+02 INFO  [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default task-19) [43d5f322] Running command: CreateUserSessionCommand internal: false.
2021-04-21 11:31:59,157+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-19) [43d5f322] EVENT_ID: USER_VDC_LOGIN(30), User admin@internal-authz connecting from '5.19.121.125' using session 'VN/aBJxvgcG1WBY9OQSjUnHL7HaSPhbLSX52inLiYFGG/gubMeZiv9Tu+3D4htqVwrb/kqlvMzBL3WAQXb+3Lg==' logged in.
2021-04-21 11:31:59,534+02 WARN  [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-19) [] Can't find relative path for class "org.ovirt.engine.api.resource.StorageDomainVmDiskAttachmentsResource", will return null
2021-04-21 11:31:59,535+02 WARN  [org.ovirt.engine.api.restapi.util.LinkHelper] (default task-19) [] Can't find relative path for class "org.ovirt.engine.api.resource.StorageDomainVmDiskAttachmentsResource", will return null
2021-04-21 11:32:00,670+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Lock Acquired to object 'EngineLock:{exclusiveLocks='[0ff73c3a-fb49-4148-a3b4-6ce790523a49=DISK]', sharedLocks='[94ec9758-cfc1-41ba-ba71-e339d8df8cee=VM]'}'
2021-04-21 11:32:00,863+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: StartVmBackupCommand internal: false. Entities affected :  ID: 94ec9758-cfc1-41ba-ba71-e339d8df8cee Type: VMAction group BACKUP_DISK with role type ADMIN,  ID: 0ff73c3a-fb49-4148-a3b4-6ce790523a49 Type: DiskAction group BACKUP_DISK with role type ADMIN
2021-04-21 11:32:00,863+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Creating VmBackup entity for VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee'
2021-04-21 11:32:00,956+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Created VmBackup entity 'ee02f949-ccb9-452f-a1bb-8e05d0684b0f'
2021-04-21 11:32:00,956+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Redefine previous VM checkpoints for VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee'
2021-04-21 11:32:01,316+02 INFO  [org.ovirt.engine.core.bll.storage.backup.RedefineVmCheckpointCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: RedefineVmCheckpointCommand internal: true. Entities affected :  ID: 94ec9758-cfc1-41ba-ba71-e339d8df8cee Type: VMAction group BACKUP_DISK with role type ADMIN,  ID: 0ff73c3a-fb49-4148-a3b4-6ce790523a49 Type: DiskAction group BACKUP_DISK with role type ADMIN
2021-04-21 11:32:01,318+02 INFO  [org.ovirt.engine.core.bll.storage.backup.RedefineVmCheckpointCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] No previous VM checkpoints found for VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee', skipping redefining VM checkpoints
2021-04-21 11:32:01,567+02 INFO  [org.ovirt.engine.core.bll.storage.backup.RedefineVmCheckpointCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Lock freed to object 'EngineLock:{exclusiveLocks='[0ff73c3a-fb49-4148-a3b4-6ce790523a49=DISK]', sharedLocks='[94ec9758-cfc1-41ba-ba71-e339d8df8cee=VM]'}'
2021-04-21 11:32:01,567+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Successfully redefined previous VM checkpoints for VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee'
2021-04-21 11:32:01,567+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Creating VmCheckpoint entity for VM '94ec9758-cfc1-41ba-ba71-e339d8df8cee'
2021-04-21 11:32:01,602+02 INFO  [org.ovirt.engine.core.bll.storage.backup.StartVmBackupCommand] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Created VmCheckpoint entity 'b67a0436-42fb-4715-a0a5-6bfd9cb09214'
2021-04-21 11:32:01,736+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-19) [21bb0fd3-06fe-49c8-95ec-94856edabc41] EVENT_ID: VM_BACKUP_STARTED(10,790), Backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f for VM vm1.cloudserver.tld started (User: admin@internal-authz).
2021-04-21 11:32:03,789+02 INFO  [org.ovirt.engine.core.bll.storage.backup.CreateScratchDisksCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: CreateScratchDisksCommand internal: true. Entities affected :  ID: 94ec9758-cfc1-41ba-ba71-e339d8df8cee Type: VMAction group BACKUP_DISK with role type ADMIN
2021-04-21 11:32:03,910+02 INFO  [org.ovirt.engine.core.bll.storage.backup.CreateScratchDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: CreateScratchDiskCommand internal: true. Entities affected :  ID: 4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a Type: StorageAction group CREATE_DISK with role type USER
2021-04-21 11:32:03,910+02 INFO  [org.ovirt.engine.core.bll.storage.backup.CreateScratchDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Creating scratch disk for disk ID 0ff73c3a-fb49-4148-a3b4-6ce790523a49
2021-04-21 11:32:03,977+02 INFO  [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: AddDiskCommand internal: true. Entities affected :  ID: 4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a Type: StorageAction group CREATE_DISK with role type USER
2021-04-21 11:32:04,038+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Running command: AddImageFromScratchCommand internal: true. Entities affected :  ID: 4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a Type: Storage
2021-04-21 11:32:04,109+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] EVENT_ID: FAILED_TO_STORE_ENTIRE_DISK_FIELD_IN_DISK_DESCRIPTION_METADATA(1,026), Failed to store field DiskDescription as a part of VM vm1.cloudserver.tld backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch disk for vm1.cloudserver.tld-0's description metadata due to storage space limitations. The field DiskDescription will be truncated.
2021-04-21 11:32:04,111+02 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] START, CreateVolumeVDSCommand( CreateVolumeVDSCommandParameters:{storagePoolId='0532fe8c-d841-49ae-9615-45a7b865f3e3', ignoreFailoverLimit='false', storageDomainId='4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a', imageGroupId='a419fb9b-486d-4a92-83d0-ef6c845ae5ba', imageSizeInBytes='268435456000', volumeFormat='COW', newImageId='0cbd890e-5bdc-4e8e-9e46-53bf5093cfd0', imageType='Sparse', newImageDescription='{"DiskAlias":"VM vm1.cloudserver.tld backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch disk for vm1.cloudserver.tld-0","DiskDescription":"Backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch di"}', imageInitialSizeInBytes='0', imageId='00000000-0000-0000-0000-000000000000', sourceImageGroupId='00000000-0000-0000-0000-000000000000', shouldAddBitmaps='false'}), log id: 34e22243
2021-04-21 11:32:04,137+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] EVENT_ID: IRS_BROKER_COMMAND_FAILURE(10,803), VDSM command CreateVolumeVDS failed: Invalid parameter: 'DiskType=SCRD'
2021-04-21 11:32:04,137+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Command 'CreateVolumeVDSCommand( CreateVolumeVDSCommandParameters:{storagePoolId='0532fe8c-d841-49ae-9615-45a7b865f3e3', ignoreFailoverLimit='false', storageDomainId='4f6d464e-531a-431a-bfcc-ca9dfd1d0c9a', imageGroupId='a419fb9b-486d-4a92-83d0-ef6c845ae5ba', imageSizeInBytes='268435456000', volumeFormat='COW', newImageId='0cbd890e-5bdc-4e8e-9e46-53bf5093cfd0', imageType='Sparse', newImageDescription='{"DiskAlias":"VM vm1.cloudserver.tld backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch disk for vm1.cloudserver.tld-0","DiskDescription":"Backup ee02f949-ccb9-452f-a1bb-8e05d0684b0f scratch di"}', imageInitialSizeInBytes='0', imageId='00000000-0000-0000-0000-000000000000', sourceImageGroupId='00000000-0000-0000-0000-000000000000', shouldAddBitmaps='false'})' execution failed: I
 RSGenericException: IRSErrorException: Invalid parameter: 'DiskType=SCRD'
2021-04-21 11:32:04,137+02 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.CreateVolumeVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] FINISH, CreateVolumeVDSCommand, return: 00000000-0000-0000-0000-000000000000, log id: 34e22243
2021-04-21 11:32:04,137+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Command 'org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.irsbroker.IrsOperationFailedNoFailoverException: IRSGenericException: IRSErrorException: Invalid parameter: 'DiskType=SCRD' (Failed with error InvalidParameterException and code 1000)
2021-04-21 11:32:04,170+02 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-94) [21bb0fd3-06fe-49c8-95ec-94856edabc41] Removed task 'cb1f10fc-13e6-4ead-a33f-fad342a69ea2' from DataBase
```

Here's the SPM host vdsm logs:

```
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-61>", line 2, in createVolume
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 1510, in createVolume
    add_bitmaps=addBitmaps)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 1002, in validateCreateVolumeParams
    add_bitmaps=add_bitmaps)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 649, in validateCreateVolumeParams
    raise se.InvalidParameterException("DiskType", diskType)
vdsm.storage.exception.InvalidParameterException: Invalid parameter: 'DiskType=SCRD
```

It's a known bug in 4.4.4/4.4.5? Any suggestion?

If needed any other info lemme know, I'll provide you asap.

Thank you for your time
Francesco
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/WYWMHZFHBVKAVASDWLFSTHXXGAG26K5D/

--
--    
Shellrent - Il primo hosting italiano Security
                    First
Francesco Lorenzini
System Administrator & DevOps Engineer
Shellrent Srl
Via dell'Edilizia, 19 - 36100 Vicenza
Tel. 0444321155 | Fax 04441492177
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/XFBTENKZ5Y6T5CEVHTGTOTLRMPYGJNJN/


--
Regards,
Eyal Shenitzky