On Sun, Sep 5, 2021 at 6:00 PM Pavel Bar <pbar(a)redhat.com> wrote:
Hi,
Please try the instructions below and update whether it helped.
Thank you!
Pavel
Thanks for the input.
If I understand correctly, I have to complete the steps described by Nir
and then work at the db level.
Right now what I see in the table is:
engine=# \x
Expanded display is on.
engine=# select * from vm_backups;
-[ RECORD 1 ]------+-------------------------------------
backup_id | 68f83141-9d03-4cb0-84d4-e71fdd8753bb
from_checkpoint_id |
to_checkpoint_id | d31e35b6-bd16-46d2-a053-eabb26d283f5
vm_id | dc386237-1e98-40c8-9d3d-45658163d1e2
phase | Finalizing
_create_date | 2021-09-03 15:31:11.447+02
host_id | cc241ec7-64fc-4c93-8cec-9e0e7005a49d
engine=#
See my doubts below...
On Sun, 5 Sept 2021 at 18:41, Nir Soffer <nsoffer(a)redhat.com> wrote:
> On Sat, Sep 4, 2021 at 1:08 AM Gianluca Cecchi
> <gianluca.cecchi(a)gmail.com> wrote:
> ...
> >>> ovirt_imageio._internal.nbd.ReplyError: Writing to file failed:
> [Error 28] No space left on device
> >> This error is expected if you don't have space to write the data.
> > ok.
>
> I forgot to mention that running backup on engine host is not recommended.
> It is better to run the backup on the hypervisor, speeding up the data
> copy.
>
OK, I will take care of it, thanks.
>> How can I clean the situation?
> >>
> >> 1. Stop the current backup
>
>
>> If stopping the backup failed, stopping the VM will stop the backup.
>
OK, I will try to fix it with the VM running if possible, before going and
stopping it.
> > But if I try the stop command I get the error
> >
> > [g.cecchi@ovmgr1 ~]$ python3
> /usr/share/doc/python3-ovirt-engine-sdk4/examples/backup_vm.py -c ovmgr1
> stop dc386237-1e98-40c8-9d3d-45658163d1e2
> 68f83141-9d03-4cb0-84d4-e71fdd8753bb
> > [ 0.0 ] Finalizing backup '68f83141-9d03-4cb0-84d4-e71fdd8753bb'
> > Traceback (most recent call last):
> ...
> > ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is
> "[Cannot stop VM backup. The VM backup is not in READY phase, backup phase
> is FINALIZING. Please try again when the backup is in READY phase.]". HTTP
> response code is 409.
>
> So your backup was already finalized, and it is stuck in "finalizing"
> phase.
>
> Usually this means the backup on libvirt side was already stopped, but
> engine
> failed to detect this and failed to complete the finalize step
> (ovirt-engine bug).
>
> You need to check whether the backup was stopped on the vdsm side.
>
> - If the vm was stopped, the backup is not running
> - If the vm is running, we can make sure the backup is stopped using
>
> vdsm-client VM stop_backup
> vmID=dc386237-1e98-40c8-9d3d-45658163d1e2
> backup_id=68f83141-9d03-4cb0-84d4-e71fdd8753bb
>
The VM is still running.
The host (I see it in its events with relation to backup errors) is ov200.
BTW: how can I see the mapping between host id and hostname (from the db
and/or api)?
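One possible way to do the lookup from the db side, as a sketch only: it assumes the engine schema keeps host names in a vds_static table with vds_id and vds_name columns (please verify with \d vds_static first). The helper name host_name_for is hypothetical, and the demo runs against an in-memory SQLite stand-in rather than the real engine database; the demo row pairing the host_id above with ov200 is illustrative.

```python
import sqlite3


def host_name_for(conn, host_id, placeholder="%s"):
    """Look up a host name by id through any DB-API connection.

    Assumes a vds_static table with vds_id and vds_name columns.
    """
    cur = conn.cursor()
    cur.execute(
        f"SELECT vds_name FROM vds_static WHERE vds_id = {placeholder}",
        (host_id,),
    )
    row = cur.fetchone()
    return row[0] if row else None


# Stand-in database with one illustrative row (not taken from a real engine db).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE vds_static (vds_id TEXT, vds_name TEXT)")
demo.execute(
    "INSERT INTO vds_static VALUES (?, ?)",
    ("cc241ec7-64fc-4c93-8cec-9e0e7005a49d", "ov200"),
)
name = host_name_for(demo, "cc241ec7-64fc-4c93-8cec-9e0e7005a49d", placeholder="?")
print(name)
```

SQLite uses "?" as its parameter marker, which is why the demo overrides the placeholder; the "%s" default is the style used by PostgreSQL drivers such as psycopg2.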
[root@ov200 ~]# vdsm-client VM stop_backup
vmID=dc386237-1e98-40c8-9d3d-45658163d1e2
backup_id=68f83141-9d03-4cb0-84d4-e71fdd8753bb
{
"code": 0,
"message": "Done"
}
[root@ov200 ~]#
> If this succeeds, the backup is not running on vdsm side.
>
I presume from the output above that the command succeeded, correct?
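For scripted checks, the JSON printed by vdsm-client can be parsed and tested directly. A minimal sketch, assuming that "code": 0 is what signals success (as the "Done" message suggests); the helper name stop_backup_succeeded is hypothetical.

```python
import json


def stop_backup_succeeded(output: str) -> bool:
    """Return True if the vdsm-client JSON reply carries status code 0."""
    try:
        reply = json.loads(output)
    except json.JSONDecodeError:
        # Anything that is not valid JSON is treated as a failure.
        return False
    return isinstance(reply, dict) and reply.get("code") == 0


# Reply copied from the stop_backup call on ov200.
sample = '{\n"code": 0,\n"message": "Done"\n}'
print(stop_backup_succeeded(sample))
```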
> If this fails, you may need to stop the VM to end the backup.
>
> If the backup was stopped, you may need to delete the scratch disks
> used in this backup.
> You can find the scratch disks ids in engine logs, and delete them
> from engine UI.
>
Any hints on finding the scratch disk ids in engine.log?
Here is my engine.log; the timestamp of the backup (as seen in the database
above) is 15:31 on 03 September:
https://drive.google.com/file/d/1Ao1CIA2wlFCqMMKeXbxKXrWZXUrnJN2h/view?us...
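Until the exact log format is confirmed, one rough way to narrow the search is to pull every UUID out of log lines that mention "scratch" and drop the ids already known (backup, VM, host). This is only a sketch: the assumption that the relevant lines contain the word "scratch" has not been checked against a real engine.log, the helper name scratch_disk_candidates is hypothetical, and the sample lines are made up.

```python
import re

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def scratch_disk_candidates(lines, known_ids=()):
    """Collect UUIDs from lines mentioning 'scratch', skipping known ids."""
    skip = {known.lower() for known in known_ids}
    found = []
    for line in lines:
        if "scratch" not in line.lower():
            continue
        for match in UUID_RE.findall(line):
            candidate = match.lower()
            if candidate not in skip and candidate not in found:
                found.append(candidate)
    return found


# Made-up lines for illustration only; real engine.log wording may differ.
sample_lines = [
    "INFO created scratch disk 11111111-2222-3333-4444-555555555555 "
    "for backup 68f83141-9d03-4cb0-84d4-e71fdd8753bb",
    "INFO unrelated line aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
]
candidates = scratch_disk_candidates(
    sample_lines, known_ids=["68f83141-9d03-4cb0-84d4-e71fdd8753bb"]
)
print(candidates)
```

On the engine host the lines would come from something like open("/var/log/ovirt-engine/engine.log"). The result is only a list of candidates, to be checked against the disks shown in the engine UI before deleting anything.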
> Finally, after you cleaned up vdsm side, you can delete the backup
> from engine database, and unlock the disks.
>
> Pavel, can you provide instructions on how to clean up engine db after
> stuck backup?
>
Can you please try manually updating the "phase" of the problematic
backup entry in the "vm_backups" DB table to one of the final phases, which
are either "Succeeded" or "Failed"?
This should allow creating a new backup.
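The update could look roughly like the sketch below. It builds a parameterized UPDATE that only touches the given backup_id and only while the row is still in the "Finalizing" phase. The helper name build_phase_update is hypothetical, the choice of "Failed" as the default is an assumption (either final phase should do), and the dry run uses an in-memory SQLite stand-in, not the engine database.

```python
import sqlite3

FINAL_PHASES = ("Succeeded", "Failed")


def build_phase_update(backup_id, new_phase="Failed", placeholder="%s"):
    """Build a parameterized UPDATE for one stuck row of vm_backups."""
    if new_phase not in FINAL_PHASES:
        raise ValueError(f"phase must be one of {FINAL_PHASES}")
    sql = (
        f"UPDATE vm_backups SET phase = {placeholder} "
        f"WHERE backup_id = {placeholder} AND phase = 'Finalizing'"
    )
    return sql, (new_phase, backup_id)


# Dry run against an in-memory SQLite stand-in for the vm_backups table.
backup_id = "68f83141-9d03-4cb0-84d4-e71fdd8753bb"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vm_backups (backup_id TEXT, phase TEXT)")
conn.execute("INSERT INTO vm_backups VALUES (?, ?)", (backup_id, "Finalizing"))

sql, params = build_phase_update(backup_id, "Failed", placeholder="?")
conn.execute(sql, params)
phase = conn.execute("SELECT phase FROM vm_backups").fetchone()[0]
print(phase)
```

Against the real engine database the same statement would be run from psql or through a PostgreSQL driver (the "%s" default matches psycopg2's parameter style), ideally after taking an engine db backup.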
>
> After vdsm and engine were cleaned, new backup should work normally.
>
OK, so I will wait for Nir's input about removing the scratch disks and then
go ahead with changing the phase column for the backup.
> >> 2. File a bug about this
> > Filed this one, hope it is correct; I chose ovirt-imageio as the
> product and Client as the component:
>
> In general backup bugs should be filed for ovirt-engine. ovirt-imageio
> is rarely the
> cause for a bug. We will move the bug to ovirt-imageio if needed.
>
> > https://bugzilla.redhat.com/show_bug.cgi?id=2001136
>
> Thanks!
>
> Nir
>
ok.
Gianluca