Hello,

It looks like this was the problem indeed.

I have the migration policy set to post copy (thought this was relevant only to VM migration and not disk migration) and had libvirt-4.5.0-23.el7_7.6.x86_64 on the problematic hosts. Restarting the VDSM after the migration indeed resolved the issue.

This issue only appeared during disk move for me.

I have updated all of the hosts since (libvirt-4.5.0-33.el7_8.1.x86_64) and have not noticed the issue since.

Thank you again.

Regards,

On Mon, Jun 1, 2020 at 6:53 PM Benny Zlotnik <bzlotnik@redhat.com> wrote:
Sorry for the late reply, but you may have hit this bug[1], I forgot about it.
The bug happens when you live migrate a VM in post-copy mode, vdsm
stops monitoring the VM's jobs.
The root cause is an issue in libvirt, so it depends on which libvirt
version you have

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1774230

On Fri, May 29, 2020 at 3:54 PM David Sekne <david.sekne@gmail.com> wrote:
>
> Hello,
>
> I tried the live migrate as well and it didn't help (it failed).
>
> The VM disks were in a illegal state so I ended up restoring the VM from backup (It was least complex solution for my case).
>
> Thank you both for the help.
>
> Regards,
>
> On Thu, May 28, 2020 at 5:01 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>>
>> I used  to have a similar issue and when I live migrated  (from 1  host to another)  it  automatically completed.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> На 27 май 2020 г. 17:39:36 GMT+03:00, Benny Zlotnik <bzlotnik@redhat.com> написа:
>> >Sorry, by overloaded I meant in terms of I/O, because this is an
>> >active layer merge, the active layer
>> >(aabf3788-8e47-4f8b-84ad-a7eb311659fa) is merged into the base image
>> >(a78c7505-a949-43f3-b3d0-9d17bdb41af5), before the VM switches to use
>> >it as the active layer. So if there is constantly additional data
>> >written to the current active layer, vdsm may have trouble finishing
>> >the synchronization
>> >
>> >
>> >On Wed, May 27, 2020 at 4:55 PM David Sekne <david.sekne@gmail.com>
>> >wrote:
>> >>
>> >> Hello,
>> >>
>> >> Yes, no problem. XML is attached (I ommited the hostname and IP).
>> >>
>> >> Server is quite big (8 CPU / 32 Gb RAM / 1 Tb disk) yet not
>> >overloaded. We have multiple servers with the same specs with no
>> >issues.
>> >>
>> >> Regards,
>> >>
>> >> On Wed, May 27, 2020 at 2:28 PM Benny Zlotnik <bzlotnik@redhat.com>
>> >wrote:
>> >>>
>> >>> Can you share the VM's xml?
>> >>> Can be obtained with `virsh -r dumpxml <vm_name>`
>> >>> Is the VM overloaded? I suspect it has trouble converging
>> >>>
>> >>> taskcleaner only cleans up the database, I don't think it will help
>> >here
>> >>>
>> >_______________________________________________
>> >Users mailing list -- users@ovirt.org
>> >To unsubscribe send an email to users-leave@ovirt.org
>> >Privacy Statement: https://www.ovirt.org/privacy-policy.html
>> >oVirt Code of Conduct:
>> >https://www.ovirt.org/community/about/community-guidelines/
>> >List Archives:
>> >https://lists.ovirt.org/archives/list/users@ovirt.org/message/HX4QZDIKXH7ETWPDNI3SKZ535WHBXE2V/