[ovirt-users] VM get stuck randomly

Yaniv Kaul ykaul at redhat.com
Sun Mar 13 15:43:15 UTC 2016


On Sun, Mar 13, 2016 at 1:14 PM, Christophe TREFOIS <
christophe.trefois at uni.lu> wrote:

> Hi Yaniv,
>
> See my answers / questions below under [CT].
>
> From: Yaniv Kaul [mailto:ykaul at redhat.com]
> Sent: Sunday, March 13, 2016 12:08
> To: Christophe TREFOIS <christophe.trefois at uni.lu>
> Cc: users <users at ovirt.org>
> Subject: Re: [ovirt-users] VM get stuck randomly
>
> On Sun, Mar 13, 2016 at 9:46 AM, Christophe TREFOIS <
> christophe.trefois at uni.lu> wrote:
>
> Dear all,
>
> For a couple of weeks now, I have had a problem where a random VM (not
> always the same one) becomes completely unresponsive.
> We notice this because our Icinga server reports the host as down.
>
> Upon inspection, we find that we can neither open a console to the VM
> nor log in.
>
> I assume 3.6's console feature, or is it Spice/VNC?
>
> [CT] This is 3.5, so yes, VNC/Spice. Sometimes we can connect, but then
> we cannot do anything, e.g. type.
>
> In the oVirt engine, the VM shows as “up”. The only odd thing is that RAM
> usage shows 0% and CPU usage shows 100% or 75%, depending on the number
> of cores.
>
> Any chance there's really something bad going on within the VM? Anything
> in its journal or /var/log/messages or ... depending on the OS?
>
> Y.
>
> [CT] It is possible. It seems to affect mostly VMs running Ubuntu 14.04
> with the latest kernels. I read somewhere (I can't find it now) that
> there may be a bug in 3.x kernels related to libvirt / vdsm. But my
> knowledge is too limited to even know where to begin the investigation :)
>
> In the VM's logs, we see normal activity, then nothing, and then, at the
> point where the VM was rebooted, a couple of lines of repeating ^@^@^@
> characters. Nothing else, really.
>
> Initially we thought it was a bug with aufs on Docker, but the machines
> getting stuck now run neither.
>
> From your answer, I deduce that if vdsm, libvirt, or the SPM saw a
> problem with storage / memory / CPU, it would suspend the VM and report
> that to ovirt-engine?
>
> Since this is not happening, you think the problem could lie inside the
> VM rather than in the oVirt environment, correct?
>

Either that, or it is related to libvirt/QEMU.
I suggest, if possible, upgrading the components to newer versions first
(as Nir suggested).
Y.
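
The ^@^@^@ characters mentioned in the quoted text are how many viewers
display NUL bytes (0x00). Assuming that is what they are here, the last
log line written before such a run gives a rough lower bound for when the
guest stopped logging. The following is only a minimal sketch of that
idea: the function name find_nul_runs and the log path in the usage note
are hypothetical, and it does not diagnose the cause of the hang.

```python
import re
from typing import List, Tuple


def find_nul_runs(data: bytes, min_len: int = 3) -> List[Tuple[int, int, str]]:
    """Locate runs of NUL bytes in raw log data.

    Returns one (offset, length, last_line_before) tuple per run of at
    least `min_len` consecutive NUL bytes. `last_line_before` is the last
    line found before the run (empty string if there is none).
    Hypothetical helper for illustration only.
    """
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    runs = []
    for match in pattern.finditer(data):
        # Everything before the run; its final line is the last message
        # that reached the file before the NUL bytes.
        preceding_lines = data[:match.start()].splitlines()
        last_line = (
            preceding_lines[-1].decode("utf-8", errors="replace")
            if preceding_lines
            else ""
        )
        runs.append((match.start(), match.end() - match.start(), last_line))
    return runs
```

Usage, assuming a copy of the guest's log is available at a path such as
/tmp/guest-syslog (an assumed location): read it in binary mode with
open("/tmp/guest-syslog", "rb").read(), pass the bytes to find_nul_runs(),
and compare the timestamp of each last_line_before with the time Icinga
reported the host as down.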


>
> Thank you for your help (especially on a Sunday) :)
>
> The only way to recover is to force a shutdown of the VM by issuing the
> shutdown command twice from the engine.
>
> Could you please help me to start debugging this?
> I can provide any logs, but I’m not sure which ones, because I couldn’t
> see anything with ERROR in the vdsm logs on the host.
>
> The host is running:
>
> OS Version:         RHEL - 7 - 1.1503.el7.centos.2.8
> Kernel Version:     3.10.0 - 229.14.1.el7.x86_64
> KVM Version:        2.1.2 - 23.el7_1.8.1
> LIBVIRT Version:    libvirt-1.2.8-16.el7_1.4
> VDSM Version:       vdsm-4.16.26-0.el7.centos
> SPICE Version:      0.12.4 - 9.el7_1.3
> GlusterFS Version:  glusterfs-3.7.5-1.el7
>
> We use a locally exported Gluster volume as the storage domain (i.e., the
> storage is on the same machine, exposed via Gluster), with no replica.
> We run around 50 VMs on that host.
>
> Thank you for your help in this,
>
>> Christophe