[ovirt-users] Hung task finalizing live migration

Maton, Brett matonb at ltresources.co.uk
Sat Sep 10 14:40:25 UTC 2016


Thanks Gervais, I'll give that a go.

On 10 September 2016 at 15:39, Gervais de Montbrun <gervais at demontbrun.com>
wrote:

> Hi Maton,
>
> I have seen tasks in a weird state on my cluster also. I've had a VM get
> "stuck" during a migration where it says "migrating to" in the web GUI, but
> it finished migrating hours ago... If I click "Cancel Migration" the GUI
> tells me that it is not migrating, but I can't do any action on the VM
> because I am then told that the VM can't be acted upon while it is
> migrating. I also tried to kill the task, but there are none listed.
>
> What has worked for me has been to put my hosted-engine in global
> maintenance mode, then ssh into the hosted engine and run the
> "engine-setup" command. I am not saying this is the best course of action,
> but when the engine comes back online the task is cleared.
>
> Cheers,
> Gervais
>
>
>
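The workaround Gervais describes above can be written down as an ordered
plan. This is only a dry-run sketch in Python: it prints the commands and
where each one would run, and executes nothing. The commands themselves
(hosted-engine --set-maintenance and engine-setup) are the standard oVirt
tools; the final step that leaves global maintenance is my assumption, since
the message above only mentions entering it. The STEPS and render_plan names
are made up for illustration.

```python
import shlex

# Ordered (where to run, command) pairs for the workaround.
# The last step is an assumption: the original message does not say how
# global maintenance was left again.
STEPS = [
    ("hosted-engine host", ["hosted-engine", "--set-maintenance", "--mode=global"]),
    ("engine VM", ["engine-setup"]),
    ("hosted-engine host", ["hosted-engine", "--set-maintenance", "--mode=none"]),
]


def render_plan(steps=STEPS):
    """Return one printable line per step; nothing is executed."""
    return [f"[{where}] {shlex.join(command)}" for where, command in steps]


for line in render_plan():
    print(line)
```

Keeping this as a printed plan rather than something that shells out is
deliberate: the steps run on two different machines, and engine-setup is
interactive.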
> On Sep 10, 2016, at 11:06 AM, Maton, Brett <matonb at ltresources.co.uk>
> wrote:
>
> Does anyone know how to fix this broken task?
>
> It has persisted through a reboot of all hosts and the engine. Something
> needs deleting from the database to clear the task and release the locked
> disk.
>
> On 8 September 2016 at 13:25, Maton, Brett <matonb at ltresources.co.uk>
> wrote:
>
>> Thanks for the pointer Mikhail, however I don't get any tasks listed with
>> that command:
>>
>> vdsClient -s 0 getAllTasksStatuses
>>
>> /usr/share/vdsm/vdsClient.py:33: DeprecationWarning: vdscli uses xmlrpc.
>> since ovirt 3.6 xmlrpc is deprecated, please use vdsm.jsonrpcvdscli
>>   from vdsm import utils, vdscli, constants
>>
>> {'status': {'message': 'OK', 'code': 0}, 'allTasksStatus': {}}
>>
>>
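The reply quoted above is printed as a Python dict literal, so it can be
checked mechanically instead of by eye. A minimal sketch, assuming the reply
line has been captured as a string without the DeprecationWarning lines; the
pending_tasks name is made up for illustration and is not part of vdsm.

```python
import ast


def pending_tasks(reply_text):
    """Parse a vdsClient reply (a Python dict literal) and return its task map."""
    reply = ast.literal_eval(reply_text)
    status = reply["status"]
    if status["code"] != 0:
        raise RuntimeError(f"vdsm returned error {status['code']}: {status['message']}")
    return reply["allTasksStatus"]


# The reply line exactly as quoted in this thread.
reply_text = "{'status': {'message': 'OK', 'code': 0}, 'allTasksStatus': {}}"

tasks = pending_tasks(reply_text)
if not tasks:
    print("vdsm lists no tasks on this host")
else:
    for task_id, task_status in tasks.items():
        print(task_id, task_status)
```

An empty allTasksStatus with code 0 means the host itself has nothing left to
cancel, which fits the task being stuck on the engine side rather than in
vdsm.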
>> On 8 September 2016 at 09:51, Краснобаев Михаил <milo1 at ya.ru> wrote:
>>
>>> Hi,
>>>
>>> There is a way to cancel a running task; look here:
>>> http://lists.ovirt.org/pipermail/users/2014-November/028946.html
>>> I was able to stop snapshot deletion this way.
>>>
>>> Best, Mikhail.
>>>
>>> 08.09.2016, 08:14, "Maton, Brett" <matonb at ltresources.co.uk>:
>>>
>>> Any suggestions ?
>>>
>>> The task has been hung for 5 days now, I can't start the machine or
>>> destroy it.
>>>
>>>
>>> On 7 September 2016 at 06:49, Maton, Brett <matonb at ltresources.co.uk>
>>> wrote:
>>>
>>> Sorry just hit reply....
>>>
>>> I'm seeing these errors in the logs which look related to the problem:
>>>
>>>
>>> 2016-09-07 06:46:35,123 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller]
>>> (DefaultQuartzScheduler6) [19c58c0d] Failed invoking callback end method
>>> 'onFailed' for command '07608003-ca05-4e2e-b917-85ce525c011b' with
>>> exception 'null', the callback is marked for end method retries
>>> 2016-09-07 06:46:45,184 ERROR [org.ovirt.engine.core.bll.CommandsFactory]
>>> (DefaultQuartzScheduler7) [19c58c0d] Error in invocating CTOR of command
>>> 'LiveMigrateDisk': null
>>> 2016-09-07 06:46:45,185 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller]
>>> (DefaultQuartzScheduler7) [19c58c0d] Failed invoking callback end method
>>> 'onFailed' for command '07608003-ca05-4e2e-b917-85ce525c011b' with
>>> exception 'null', the callback is marked for end method retries
>>>
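The engine.log entries quoted above all point at the same command ID. A
minimal Python sketch for pulling those IDs out of a log and counting the
retries, assuming each log entry sits on a single line (the excerpts in this
thread were wrapped by the mail client, so they are re-joined in the sample).
The failing_commands name and the regular expression are mine, written
against the lines shown here only.

```python
import re
from collections import Counter

# Matches the CommandCallbacksPoller errors quoted in this thread.
CALLBACK_FAILURE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ERROR "
    r"\[org\.ovirt\.engine\.core\.bll\.tasks\.CommandCallbacksPoller\].*"
    r"Failed invoking callback end method '(?P<method>\w+)' "
    r"for command '(?P<command_id>[0-9a-f-]{36})'"
)


def failing_commands(lines):
    """Count callback failures per command ID."""
    counts = Counter()
    for line in lines:
        match = CALLBACK_FAILURE.search(line)
        if match:
            counts[match.group("command_id")] += 1
    return counts


# The three entries from this thread, un-wrapped to one line each.
sample = [
    "2016-09-07 06:46:35,123 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] "
    "(DefaultQuartzScheduler6) [19c58c0d] Failed invoking callback end method 'onFailed' "
    "for command '07608003-ca05-4e2e-b917-85ce525c011b' with exception 'null', "
    "the callback is marked for end method retries",
    "2016-09-07 06:46:45,184 ERROR [org.ovirt.engine.core.bll.CommandsFactory] "
    "(DefaultQuartzScheduler7) [19c58c0d] Error in invocating CTOR of command "
    "'LiveMigrateDisk': null",
    "2016-09-07 06:46:45,185 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] "
    "(DefaultQuartzScheduler7) [19c58c0d] Failed invoking callback end method 'onFailed' "
    "for command '07608003-ca05-4e2e-b917-85ce525c011b' with exception 'null', "
    "the callback is marked for end method retries",
]

for command_id, count in failing_commands(sample).items():
    print(command_id, count)
```

To run it over a real log, pass an open file object instead of the sample
list.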
>>> On 5 September 2016 at 06:46, Nir Soffer <nsoffer at redhat.com> wrote:
>>>
>>> Hi Maton,
>>>
>>> Please reply to the list, not to me directly.
>>>
>>> Ala, can you look at this? Is this a known issue?
>>>
>>> Thanks,
>>> Nir
>>>
>>> On Mon, Sep 5, 2016 at 8:43 AM, Maton, Brett <matonb at ltresources.co.uk>
>>> wrote:
>>> > Log files as requested
>>> >
>>> > https://ufile.io/4fc35 vdsm log
>>> > https://ufile.io/e9836 engine 03-Sep
>>> > https://ufile.io/15f37 engine 04-Sep
>>> >
>>> > vdsm log stops on the 01-Sep...
>>> >
>>> > Couple of entries from the event log:
>>> >
>>> > Sep 3, 2016 7:31:07 PM    Snapshot 'Auto-generated for Live Storage
>>> > Migration' deletion for VM 'lv01' has been completed.
>>> > Sep 3, 2016 6:46:46 PM    Snapshot 'Auto-generated for Live Storage
>>> > Migration' deletion for VM 'lv01' was initiated by SYSTEM
>>> >
>>> > And the related tasks
>>> >
>>> > Removing Snapshot Auto-generated for Live Storage Migration of VM lv01
>>> > Sep 3, 2016 6:46:44 PM        N/A    29f45ca9
>>> > Validating    Sep 3, 2016 6:46:44 PM    until    Sep 3, 2016 6:46:44 PM
>>> > Executing    Sep 3, 2016 6:46:44 PM    until    Sep 3, 2016 7:31:06 PM
>>> >
>>> > Finalizing    Sep 3, 2016 7:31:06 PM        N/A
>>> >
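The task list above gives start and end times for Validating and Executing,
while Finalizing has a start time and no end. A small Python sketch for
turning those timestamps into durations; the format string is my reading of
the event-log layout shown here and it assumes English month and AM/PM names
(the default C locale). The phase_seconds name is made up for illustration.

```python
from datetime import datetime

# Layout of the timestamps in the event list, e.g. "Sep 3, 2016 6:46:44 PM".
EVENT_FORMAT = "%b %d, %Y %I:%M:%S %p"


def phase_seconds(start, end):
    """Return how long a task phase lasted, in seconds."""
    started = datetime.strptime(start, EVENT_FORMAT)
    ended = datetime.strptime(end, EVENT_FORMAT)
    return (ended - started).total_seconds()


executing = phase_seconds("Sep 3, 2016 6:46:44 PM", "Sep 3, 2016 7:31:06 PM")
print(f"Executing took {executing:.0f} seconds")
```

For the times quoted here that is 44 minutes 22 seconds of actual merge work,
after which the task entered Finalizing and never left it.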
>>> >
>>> >
>>> > On 4 September 2016 at 14:27, Nir Soffer <nsoffer at redhat.com> wrote:
>>> >>
>>> >> On Sun, Sep 4, 2016 at 12:40 PM, Maton, Brett <matonb at ltresources.co.uk>
>>> >> wrote:
>>> >>>
>>> >>> How do I fix / kill a hung vdsm task?
>>> >>>
>>> >>> It seems to have completed the task but is stuck finalising.
>>> >>>
>>> >>> Removing Snapshot Auto-generated for Live Storage Migration
>>> >>> Validating
>>> >>> Executing
>>> >>> (hour glass) Finalizing
>>> >>>
>>> >>> Task has been 'stuck' finalising for over 13 hours
>>> >>
>>> >>
>>> >> Can you share engine and vdsm logs since the time the merge was
>>> >> started?
>>> >>
>>> >> Nir
>>> >
>>> >
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>>
>>> --
>>> Best regards, Краснобаев Михаил.
>>>
>>>
>>>
>>
>>
>
>
>