[Users] Gluster VM stuck in "waiting for launch" state
Dan Kenigsberg
danken at redhat.com
Wed Oct 30 17:04:50 UTC 2013
On Wed, Oct 30, 2013 at 02:40:02PM +0100, Alessandro Bianchi wrote:
> On 30/10/2013 13:58, Dan Kenigsberg wrote:
>
> On Wed, Oct 30, 2013 at 11:34:21AM +0100, Alessandro Bianchi wrote:
>
> Hi everyone
>
> I've set up a gluster storage with two replicated bricks
>
> DC is up and I created a VM to test gluster storage
>
> If I start the VM WITHOUT any disk attached (only one virtual DVD) it
> starts fine.
>
> If I attach a thin-provisioned 30 GB disk from the gluster domain, the VM
> gets stuck in the "waiting for launch" state
>
> I see no special activity on the gluster servers (they serve several other
> shares with no troubles at all and even the ISO domain is a NFS on
> locally mounted gluster and works fine)
>
> I've double-checked all the prerequisites and they look fine (F 19 -
> gluster set up as insecure in both glusterd.vol and volume options -
> uid/gid/insecure)
>
> Am I doing something wrong?
>
> I'm even unable to stop the VM from the engine GUI
>
> Any advice?
>
> Which version of ovirt are you using? Hopefully ovirt-3.3.0.1.
> For how long is the VM stuck in its "wait for launch" state?
> What does `virsh -r list` have to say while startup stalls?
> Would you provide more of your vdsm.log and possibly
> libvirtd.log so we can understand what blocks the VM start-up? Please
> use an attachment or pastebin, as your mail agent wreaks havoc on the log
> lines.
>
>
> Thank you for your answer.
>
> Here are the "facts"
>
> In the GUI I see
>
> "waiting for launch 3 h"
>
> virsh -r list
> Id Nome Stato
> ----------------------------------------------------
> 3 CentOS_30 terminato
>
> vdsClient -s 0 list table
> 200dfb05-461e-49d9-95a2-c0a7c7ced669 0 CentOS_30
> WaitForLaunch
>
> Packages:
>
> ovirt-engine-userportal-3.3.0.1-1.fc19.noarch
> ovirt-log-collector-3.3.1-1.fc19.noarch
> ovirt-engine-restapi-3.3.0.1-1.fc19.noarch
> ovirt-engine-setup-3.3.0.1-1.fc19.noarch
> ovirt-engine-backend-3.3.0.1-1.fc19.noarch
> ovirt-host-deploy-java-1.1.1-1.fc19.noarch
> ovirt-release-fedora-8-1.noarch
> ovirt-engine-setup-plugin-allinone-3.3.0.1-1.fc19.noarch
> ovirt-engine-webadmin-portal-3.3.0.1-1.fc19.noarch
> ovirt-engine-sdk-python-3.3.0.7-1.fc19.noarch
> ovirt-iso-uploader-3.3.1-1.fc19.noarch
> ovirt-engine-websocket-proxy-3.3.0.1-1.fc19.noarch
> ovirt-engine-dbscripts-3.3.0.1-1.fc19.noarch
> ovirt-host-deploy-offline-1.1.1-1.fc19.noarch
> ovirt-engine-cli-3.3.0.5-1.fc19.noarch
> ovirt-engine-tools-3.3.0.1-1.fc19.noarch
> ovirt-engine-lib-3.3.0.1-1.fc19.noarch
> ovirt-image-uploader-3.3.1-1.fc19.noarch
> ovirt-engine-3.3.0.1-1.fc19.noarch
> ovirt-host-deploy-1.1.1-1.fc19.noarch
>
> I attach the full vdsm log
>
> Look around 30-10 10:30 to see everything that happens
>
> Despite the "terminated" label in output from virsh I still see the VM
> "waiting for launch" in the GUI, so I suspect the answer to "how long" may
> be "forever"
>
> Since this is a test VM I can run whatever tests you need to track down
> the problem, including destroying and rebuilding it
>
> It would be great to have gluster support stable in ovirt!
>
> Thank you for your efforts

The log has an ominous failed attempt to start the VM, followed by an
immediate vdsm crash. Is it reproducible?

We have plenty of issues lurking here:

1. Why has libvirt failed to create the VM? For this, please find clues
   in the complete non-line-broken CentOS_30.log and libvirtd.log.
2. Why was vdsm killed? Does /var/log/messages have a clue from systemd?
3. We may have a nasty race if Vdsm crashes just before it has
   registered that the VM is down.
4. We used to force Vdsm to run with LC_ALL=C. It seems that the grand
   service rewrite by Zhou (http://gerrit.ovirt.org/15578) has changed
   that. This may have adverse effects, since AFAIR we sometimes parse
   application output, and assume that it's in C. Having a non-English
   log file is problematic on its own for support personnel, who are used
   to grepping for keywords. ybronhei, was it intentional? Can it be
   reverted or at least scrutinized?
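
To illustrate point 4, here is a minimal sketch of running a child process with LC_ALL=C so that its messages come out unlocalized and stay safe to parse or grep. The helper name run_in_c_locale is made up for this example; it is not a Vdsm function, and this is not how Vdsm itself sets its environment.

```python
import os
import subprocess


def run_in_c_locale(argv):
    """Run argv with LC_ALL=C and return (returncode, stdout, stderr).

    Hypothetical helper for illustration only; not part of Vdsm.
    """
    # Copy the parent environment and override only the locale, so the
    # child still sees PATH and friends.
    env = dict(os.environ)
    env["LC_ALL"] = "C"
    proc = subprocess.Popen(
        argv, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out, err = proc.communicate()
    return (
        proc.returncode,
        out.decode("utf-8", "replace"),
        err.decode("utf-8", "replace"),
    )
```

Calling run_in_c_locale(["virsh", "-r", "list"]) on the affected host should then print the column headers and domain state in English instead of Italian, assuming virsh honours LC_ALL like other gettext-based tools.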
Thread-77::ERROR::2013-10-30 08:51:13,147::vm::2062::vm.Vm::(_startUnderlyingVm) vmId=`73e6615b-78c3-42e5-803a-3fc20d64ca32`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2022, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/vm.py", line 2906, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2805, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Unable to read from monitor: Connessione interrotta dal corrispondente
Thread-77::DEBUG::2013-10-30 08:51:13,151::vm::2448::vm.Vm::(setDownStatus) vmId=`73e6615b-78c3-42e5-803a-3fc20d64ca32`::Changed state to Down: Unable to read from monitor: Connessione interrotta dal corrispondente
MainThread::DEBUG::2013-10-30 08:51:13,153::vdsm::45::vds::(sigtermHandler) Received signal 15
MainThread::DEBUG::2013-10-30 08:51:13,153::clientIF::210::vds::(prepareForShutdown) cannot run prepareForShutdown twice
MainThread::DEBUG::2013-10-30 08:51:15,633::vdsm::45::vds::(sigtermHandler) Received signal 15
MainThread::DEBUG::2013-10-30 08:51:15,633::clientIF::210::vds::(prepareForShutdown) cannot run prepareForShutdown twice
MainThread::INFO::2013-10-30 08:51:15,700::vdsm::101::vds::(run) (PID: 7726) I am the actual vdsm 4.12.1-4.fc19 hypervisor.skynet.it (3.11.1-200.fc19.x86_64)
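
For anyone who wants to check whether this sequence (failed start, then SIGTERM) recurs in a long vdsm.log, a rough sketch follows. The "::"-separated field layout is inferred only from the excerpt above, not from Vdsm's logging configuration, and the names VDSM_LINE, parse_vdsm_line and interesting_events are invented for this example.

```python
import re

# Layout assumed from the excerpt above:
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
VDSM_LINE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<func>[^)]*)\) (?P<message>.*)$"
)


def parse_vdsm_line(line):
    """Return a dict of fields, or None for continuation lines (tracebacks)."""
    match = VDSM_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def interesting_events(lines):
    """Collect (timestamp, level, function) for ERROR records and for
    records logged by sigtermHandler."""
    events = []
    for line in lines:
        record = parse_vdsm_line(line)
        if record is None:
            continue
        if record["level"] == "ERROR" or record["func"] == "sigtermHandler":
            events.append(
                (record["timestamp"], record["level"], record["func"])
            )
    return events
```

Feeding it an open log file, as in interesting_events(open("/var/log/vdsm/vdsm.log")), gives a short timeline to compare against libvirtd.log; the path is the usual default and may differ on your host.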