[ovirt-users] VM get stuck randomly

Kevin Wolf kwolf at redhat.com
Tue Mar 29 13:40:56 UTC 2016


On 27.03.2016 at 22:38, Christophe TREFOIS wrote:
> Hi,
> 
> MS does not like my previous email, so here it is again with a link to Dropbox
> instead of an attachment.
> 
> ——
> Hi Nir,
> 
> Inside the core dump tarball is also the output of the two gdb commands you
> mentioned.
> 
> Understandably, you might not want to download the big files for that, so I
> attached them here separately.

The gdb dump looks pretty much like an idle qemu that just sits there
and waits for events. The vcpu threads seem to be running guest code,
the I/O thread and SPICE thread are in poll() waiting for events to
respond to, and finally the RCU thread is idle as well.

Does the qemu process still respond to monitor commands, so for example
can you still pause and resume the guest?
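
One way to check this from the host is through libvirt's virsh. The
following is only a sketch under assumptions: check_monitor and its
domain-name argument are hypothetical names, and virsh needs a read-write
connection to libvirt, which on an oVirt host may require credentials.

```shell
# Sketch: check whether the qemu monitor of a guest still answers.
# check_monitor and its argument are hypothetical names, not part of
# any tool; virsh must be able to connect read-write to libvirt.
check_monitor() {
    dom="$1"
    # Ask the qemu monitor for the run state (HMP command "info status").
    virsh qemu-monitor-command "$dom" --hmp 'info status' || return 1
    # Pause the guest, then show the state libvirt reports.
    virsh suspend "$dom" || return 1
    virsh domstate "$dom"
    # Resume the guest and show the state again.
    virsh resume "$dom" || return 1
    virsh domstate "$dom"
}
```

Calling it as "check_monitor <domain name>" should return promptly if the
monitor is responsive. If one of the commands hangs or fails, the monitor
itself is probably stuck.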

Kevin

> For the other logs, here you go.
> 
> For gluster I didn't know which logs were relevant, so I sent all of them.
> 
> I got the icinga notification at 17:06 CEST on March 27th (today). So for vdsm,
> I provided logs from 16h-18h.
> The check said that the VM was down for 11 minutes at that time.
> 
> https://dl.dropboxusercontent.com/u/63261/bioservice-1.tar.gz
> 
> Please do let me know if there is anything else I can provide.
> 
> Best regards,
>  
> 
> > On 27 Mar 2016, at 21:24, Nir Soffer <nsoffer at redhat.com> wrote:
> >
> > On Sun, Mar 27, 2016 at 8:39 PM, Christophe TREFOIS
> > <christophe.trefois at uni.lu> wrote:
> >> Hi Nir,
> >>
> >> Here is another one, this time with strace of children and gdb dump.
> >>
> >> Interestingly, this time the qemu process seems stuck at 0%, vs. 100% in the other cases.
> >>
> >> The files for strace are attached.
> >
> > Hopefully Kevin can take a look.
> >
> >
> >> The gdb output and core dump can be found here (too big to attach):
> >>
> >> https://dl.dropboxusercontent.com/u/63261/gdb-core.tar.gz
> >
> > I think it will be more useful to extract a traceback of all threads
> > and send the tiny traceback.
> >
> > gdb --pid <qemu pid> --batch --eval-command='thread apply all bt'
> >
> >> If it helps, most machines get stuck on the host hosting the self-hosted
> >> engine, which runs a local 1-node glusterfs.
> >
> > Getting /var/log/messages, sanlock, vdsm, glusterfs and libvirt logs
> > for this timeframe would also be helpful.
> >
> > Nir
> >
> >>
> >> Thank you for your help,
> >>
> >> —
> >> Christophe
> >>
> >> Dr Christophe Trefois, Dipl.-Ing.
> >> Technical Specialist / Post-Doc
> >>
> >> UNIVERSITÉ DU LUXEMBOURG
> >>
> >> LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
> >> Campus Belval | House of Biomedicine
> >> 6, avenue du Swing
> >> L-4367 Belvaux
> >> T: +352 46 66 44 6124
> >> F: +352 46 66 44 6949
> >> http://www.uni.lu/lcsb
> >>
> >>
> >>
> >> ----
> >> This message is confidential and may contain privileged information.
> >> It is intended for the named recipient only.
> >> If you receive it in error please notify me and permanently delete the
> >> original message and any copies.
> >> ----
> >>
> >>
> >>
> >>> On 25 Mar 2016, at 11:53, Nir Soffer <nsoffer at redhat.com> wrote:
> >>>
> >>> gdb --pid <qemu pid> --batch --eval-command='thread apply all bt'
> >>
> 




