[ovirt-users] Gluster: VM disk stuck in transfer; georep gone wonky

Jim Kusznir jim at palousetech.com
Tue Mar 20 07:22:28 UTC 2018


Thank you for the replies.

While waiting, I found one more Google result that said to run
engine-setup.  I did that, and it fixed the issue.  The VM is now running
again.

As to checking the logs, I'm not sure which ones to check...there are so
many in so many different places.
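
For anyone else unsure where to start, here is a minimal sketch of how one
might search the two logs most often pointed to in this situation for a disk
or task ID.  The paths are assumed default locations and should be adjusted
for your installation.  The function names are made up for illustration; they
are not part of any oVirt tool.

```python
import re
import sys
from pathlib import Path

# Assumed default log locations; adjust for your installation.
ENGINE_LOG = Path("/var/log/ovirt-engine/engine.log")  # on the engine machine
VDSM_LOG = Path("/var/log/vdsm/vdsm.log")              # on each host


def lines_mentioning(lines, needle):
    """Return (line_number, text) pairs for lines containing needle (case-insensitive)."""
    pattern = re.compile(re.escape(needle), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


def search_log(path, needle):
    """Search one log file; return an empty list if the file is not on this machine."""
    if not path.is_file():
        return []
    with path.open(errors="replace") as handle:
        return lines_mentioning(handle, needle)


if __name__ == "__main__":
    # Hypothetical usage: pass the task or disk UUID shown in the web UI.
    needle = sys.argv[1] if len(sys.argv) > 1 else "ERROR"
    for log in (ENGINE_LOG, VDSM_LOG):
        for number, text in search_log(log, needle):
            print(f"{log}:{number}: {text}")
```

Run it on the engine machine and on the SPM host with the UUID from the UI as
the only argument; a log that does not exist on that machine is simply skipped.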

I was not able to detach the disk, as "an operation is currently in
process."  No matter what I did to the disk, it was essentially still
locked, even though it no longer said "locked" after I removed the lock
with the unlock script.

So, it appears running engine-setup can really fix a bunch of stuff!  An
important tip to remember...

--Jim

On Mon, Mar 19, 2018 at 11:55 PM, Tony Brian Albers <tba at kb.dk> wrote:

> I read somewhere about clearing out wrong stuff from the UI by manually
> editing the database; maybe you can try searching for something like that.
>
> With regards to the VM, I'd probably just delete it, edit the DB and
> remove all sorts of references to it and then recover it from backup.
>
> Is there nothing about all this in the ovirt logs on the engine and the
> host? It might point you in the right direction.
>
> HTH
>
> /tony
>
>
> On 20/03/18 07:48, Jim Kusznir wrote:
> > Unfortunately, I came under heavy pressure to get this VM back up.  So
> > I did more googling and attempted to recover it myself.  I've gotten
> > closer, but I'm still not quite there.
> >
> > I found this post:
> >
> > http://lists.ovirt.org/pipermail/users/2015-November/035686.html
> >
> > Which gave me the unlock tool, which was successful in unlocking the
> > disk.  Unfortunately, it did not delete the task, nor did ovirt do so on
> > its own after the disk was unlocked.
> >
> > So I found taskcleaner.sh in the same directory and attempted to
> > clean the task out, except it doesn't seem to see the task (neither
> > the show-tasks options nor the delete-all options seemed to work).  I
> > still had the task UUID from the GUI, so I attempted to use that, but
> > all I got back was a "t" on one line and a "0" on the next, so I have no
> > idea what that was supposed to mean.  In any case, the web UI still
> > shows the task, still won't let me start the VM, and appears convinced
> > it's still copying.  I've tried restarting the engine and vdsm on the
> > SPM; neither has helped.  I can't find any evidence of the task on the
> > command line, only in the UI.
> >
> > I'd create a new VM if I could rescue the image, but I'm not sure I can
> > manage to get this image accepted in another VM.
> >
> > How do I recover now?
> >
> > --Jim
> >
> > On Mon, Mar 19, 2018 at 9:38 AM, Jim Kusznir <jim at palousetech.com> wrote:
> >
> >     Hi all:
> >
> >     Sorry for yet another semi-related message to the list.  In my
> >     attempts to troubleshoot and verify some suspicions on the nature of
> >     the performance problems I posted under "Major Performance Issues
> >     with gluster", I attempted to move one of my problem VMs back to
> >     the original storage (SSD-backed).  It appeared to be moving fine,
> >     but last night it froze at 84%.  This morning (8 hrs later), it's
> >     still at 84%.
> >
> >     I need to get that VM back up and running, but I don't know how...It
> >     seems to be stuck in limbo.
> >
> >     The only other thing I explicitly did last night that may have
> >     caused an issue was to finally set up and activate georep to an
> >     offsite backup machine.  That too seems to have gone a bit wonky.  On
> >     the ovirt server side, it shows normal, with all but data-hdd showing
> >     a last sync'ed time of 3am (which matches my bandwidth graphs for the
> >     WAN connections involved).  data-hdd (the new disk-backed storage
> >     with most of my data in it) shows not yet synced, but I'm also not
> >     currently seeing bandwidth usage anymore.
> >
> >     I logged into the georep destination box and found the system load a
> >     bit high, a bunch of gluster and rsync processes running, and both
> >     data and data-hdd using MORE disk space than the original (data-hdd
> >     using 4x more disk space than is on the master node).  Not sure what
> >     to do about this; I paused the replication from the cluster, but
> >     that doesn't seem to have had an effect on the georep destination.
> >
> >     I promise I'll stop trying things until I get guidance from the
> >     list!  Please do help; I need the VM HDD unstuck so I can start it.
> >
> >     Thanks!
> >     --Jim
> >
> >
> >
> >
> > _______________________________________________
> > Users mailing list
> > Users at ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>
>
> --
> Tony Albers
> Systems administrator, IT-development
> Royal Danish Library, Victor Albecks Vej 1, 8000 Aarhus C, Denmark.
> Tel: +45 2566 2383 / +45 8946 2316
>