[ovirt-users] Hung task finalizing live migration

Logan Kuhn logank at wolfram.com
Thu Dec 8 13:46:54 UTC 2016


I had a similar situation: I was trying, and failing, to delete a Cinder disk, and this fixed it. I'm using 4.0.5-5.

Regards, 
Logan 

----- On Sep 10, 2016, at 9:39 AM, Gervais de Montbrun <gervais at demontbrun.com> wrote: 

| Hi Maton,

| I have seen tasks in a weird state on my cluster also. I've had a VM get "stuck"
| during a migration where it says "migrating to" in the web GUI, but it
| finished migrating hours ago. If I click "Cancel Migration" the GUI tells me
| that it is not migrating, but I can't take any action on the VM because I am then
| told that the VM can't be acted upon while it is migrating. I also tried to kill
| the task, but there are none listed.

| What has worked for me has been to put my hosted-engine in global maintenance
| mode, then ssh into the hosted engine and run the "engine-setup" command. I am
| not saying this is the best course of action, but when the engine comes back
| online the task is cleared.
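
For anyone following along, the sequence described above could be sketched roughly as below. This is only an illustration, not a recommended procedure: `ENGINE_HOST`, `build_plan` and `run_plan` are made-up names, the final step (leaving global maintenance) is my assumption and is not mentioned in the message, and the script defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# Hypothetical address of the hosted-engine VM; replace with your own.
ENGINE_HOST = "root@engine.example.com"


def build_plan(engine_host=ENGINE_HOST):
    """Return the ordered commands for the workaround, as argument lists."""
    return [
        # On a hosted-engine host: put the cluster in global maintenance.
        ["hosted-engine", "--set-maintenance", "--mode=global"],
        # On the engine VM: engine-setup is interactive, so request a TTY.
        ["ssh", "-t", engine_host, "engine-setup"],
        # Assumption (not in the original message): leave maintenance afterwards.
        ["hosted-engine", "--set-maintenance", "--mode=none"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_plan(build_plan())  # dry run: prints the three commands only
```

Calling `run_plan(build_plan(), dry_run=False)` from a hosted-engine host would actually execute the steps, so review the printed plan first.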

| Cheers,
| Gervais

|| On Sep 10, 2016, at 11:06 AM, Maton, Brett < matonb at ltresources.co.uk > wrote:

|| Does anyone know how to fix this broken task?

|| It has persisted through a reboot of all hosts and the engine. Something needs
|| deleting from the database to clear the task and release the locked disk.

|| On 8 September 2016 at 13:25, Maton, Brett < matonb at ltresources.co.uk > wrote:

||| Thanks for the pointer Mikhail, however I don't get any tasks listed with that
||| command:

||| vdsClient -s 0 getAllTasksStatuses

||| /usr/share/vdsm/vdsClient.py:33: DeprecationWarning: vdscli uses xmlrpc. since
||| ovirt 3.6 xmlrpc is deprecated, please use vdsm.jsonrpcvdscli
||| from vdsm import utils, vdscli, constants

||| {'status': {'message': 'OK', 'code': 0}, 'allTasksStatus': {}}
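
A quick way to check such a reply from a script is sketched below. It assumes the reply is printed as a single-line Python dict literal, exactly as pasted above; `pending_tasks` is a hypothetical helper, not part of vdsm. An empty `allTasksStatus` only says that vdsm on the queried host has no tasks, which would be consistent with the stuck state living on the engine side.

```python
import ast


def pending_tasks(raw_output):
    """Return the 'allTasksStatus' dict found in vdsClient output text."""
    for line in raw_output.splitlines():
        line = line.strip()
        # Skip the deprecation warning lines; keep only the dict reply.
        if line.startswith("{") and line.endswith("}"):
            reply = ast.literal_eval(line)
            if reply["status"]["code"] != 0:
                raise RuntimeError(reply["status"]["message"])
            return reply["allTasksStatus"]
    raise ValueError("no reply dictionary found in output")


# Output copied from the message above.
sample = """\
/usr/share/vdsm/vdsClient.py:33: DeprecationWarning: vdscli uses xmlrpc. since
ovirt 3.6 xmlrpc is deprecated, please use vdsm.jsonrpcvdscli
from vdsm import utils, vdscli, constants
{'status': {'message': 'OK', 'code': 0}, 'allTasksStatus': {}}
"""

tasks = pending_tasks(sample)
print("vdsm tasks:", tasks or "none")
```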

||| On 8 September 2016 at 09:51, Краснобаев Михаил < milo1 at ya.ru > wrote:

|||| Hi,
|||| There is a way to cancel a running task - look here
|||| http://lists.ovirt.org/pipermail/users/2014-November/028946.html
|||| I was able to stop snapshot deletion this way.
|||| Best, Mikhail.
|||| 08.09.2016, 08:14, "Maton, Brett" < matonb at ltresources.co.uk >:

||||| Any suggestions?

||||| The task has been hung for 5 days now. I can't start the machine or destroy it.

||||| On 7 September 2016 at 06:49, Maton, Brett < matonb at ltresources.co.uk > wrote:

|||||| Sorry, I just hit reply...

|||||| I'm seeing these errors in the logs which look related to the problem:

|||||| 2016-09-07 06:46:35,123 ERROR
|||||| [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller]
|||||| (DefaultQuartzScheduler6) [19c58c0d] Failed invoking callback end method
|||||| 'onFailed' for command '07608003-ca05-4e2e-b917-85ce525c011b' with exception
|||||| 'null', the callback is marked for end method retries
|||||| 2016-09-07 06:46:45,184 ERROR [org.ovirt.engine.core.bll.CommandsFactory]
|||||| (DefaultQuartzScheduler7) [19c58c0d] Error in invocating CTOR of command
|||||| 'LiveMigrateDisk': null
|||||| 2016-09-07 06:46:45,185 ERROR
|||||| [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller]
|||||| (DefaultQuartzScheduler7) [19c58c0d] Failed invoking callback end method
|||||| 'onFailed' for command '07608003-ca05-4e2e-b917-85ce525c011b' with exception
|||||| 'null', the callback is marked for end method retries
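
To see which command keeps being retried, the entries above can be tallied with a small script. This is a rough sketch: the regular expression is written against the wording of the lines quoted here and may not match other engine versions, and `count_failed_callbacks` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern based on the log wording quoted above (an assumption for other versions).
FAILED_CALLBACK = re.compile(
    r"Failed invoking callback end method '(?P<method>\w+)' "
    r"for command '\s*(?P<cmd_id>[0-9a-f -]+?)\s*'"
)


def count_failed_callbacks(log_text):
    """Count failed end-method callbacks per command ID."""
    # Entries may be wrapped across lines (as in mail), so collapse whitespace.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in FAILED_CALLBACK.finditer(flat):
        # Tolerate stray spaces inside the ID, as seen in mangled copies.
        counts[match.group("cmd_id").replace(" ", "")] += 1
    return counts


# Two of the entries quoted above, wrapped as they appear in the mail.
sample = """\
2016-09-07 06:46:35,123 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller]
(DefaultQuartzScheduler6) [19c58c0d] Failed invoking callback end method
'onFailed' for command '07608003-ca05-4e2e-b917-85ce525c011b' with exception
'null', the callback is marked for end method retries
2016-09-07 06:46:45,185 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller]
(DefaultQuartzScheduler7) [19c58c0d] Failed invoking callback end method
'onFailed' for command '07608003-ca05-4e2e-b917-85ce525c011b' with exception
'null', the callback is marked for end method retries
"""

for cmd_id, hits in count_failed_callbacks(sample).items():
    print(cmd_id, hits)
```

A single ID with a steadily growing count suggests the engine is retrying the same command's end method, which matches the "marked for end method retries" message.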

|||||| On 5 September 2016 at 06:46, Nir Soffer < nsoffer at redhat.com > wrote:

||||||| Hi Maton,

||||||| Please reply to the list, not to me directly.

||||||| Ala, can you look at this? Is this a known issue?

||||||| Thanks,
||||||| Nir

||||||| On Mon, Sep 5, 2016 at 8:43 AM, Maton, Brett < matonb at ltresources.co.uk > wrote:
||||||| > Log files as requested

||||||| > https://ufile.io/4fc35 vdsm log
||||||| > https://ufile.io/e9836 engine 03-Sep
||||||| > https://ufile.io/15f37 engine 04-Sep

||||||| > vdsm log stops on the 01-Sep...

||||||| > Couple of entries from the event log:

||||||| > Sep 3, 2016 7:31:07 PM Snapshot 'Auto-generated for Live Storage
||||||| > Migration' deletion for VM 'lv01' has been completed.
||||||| > Sep 3, 2016 6:46:46 PM Snapshot 'Auto-generated for Live Storage
||||||| > Migration' deletion for VM 'lv01' was initiated by SYSTEM

||||||| > And the related tasks

||||||| > Removing Snapshot Auto-generated for Live Storage Migration of VM lv01
||||||| > Sep 3, 2016 6:46:44 PM N/A 29f45ca9
||||||| > Validating Sep 3, 2016 6:46:44 PM until Sep 3, 2016 6:46:44 PM
||||||| > Executing Sep 3, 2016 6:46:44 PM until Sep 3, 2016 7:31:06 PM

||||||| > Finalizing Sep 3, 2016 7:31:06 PM N/A



||||||| > On 4 September 2016 at 14:27, Nir Soffer < nsoffer at redhat.com > wrote:

||||||| >> On Sun, Sep 4, 2016 at 12:40 PM, Maton, Brett < matonb at ltresources.co.uk >
||||||| >> wrote:

||||||| >>> How do I fix / kill a hung vdsm task?

||||||| >>> It seems to have completed the task but is stuck finalising.

||||||| >>> Removing Snapshot Auto-generated for Live Storage Migration
||||||| >>> Validating
||||||| >>> Executing
||||||| >>> (hour glass) Finalizing

||||||| >>> Task has been 'stuck' finalising for over 13 hours


||||||| >> Can you share engine and vdsm logs since the time the merge was started?

||||||| >> Nir




||||| _______________________________________________
||||| Users mailing list
||||| Users at ovirt.org
||||| http://lists.ovirt.org/mailman/listinfo/users
|||| --
|||| Best regards, Краснобаев Михаил.
