Continuing with the 3.6 nightly builds testing...
While hosted-engine-setup was adding the host to the newly created
cluster, VDSM crashed, probably because the gluster engine storage
disappeared, as in BZ 1201355 [1].
Facts:
- the engine storage (/rhev/data-center/mmt/...) was unmounted
  during this process
- another mount of the same volume was still mounted after the VDSM
  crash (so maybe the problem is not related to gluster)
After doing a "hosted-engine --connect-storage", the volume is mounted
again.
Now, when trying to restart VDSM, I get an "invalid lockspace":
Thread-46::ERROR::2015-03-26 19:24:31,843::vm::1237::vm.Vm::(_startUnderlyingVm) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1185, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 2253, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: Failed to acquire lock: No space left on device
Thread-46::INFO::2015-03-26 19:24:31,844::vm::1709::vm.Vm::(setDownStatus) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::Changed state to Down: Failed to acquire lock: No space left on device (code=1)
Thread-46::DEBUG::2015-03-26 19:24:31,844::vmchannels::214::vds::(unregister) Delete fileno 60 from listener.
VM Channels Listener::DEBUG::2015-03-26 19:24:32,346::vmchannels::121::vds::(_do_del_channels) fileno 60 was removed from listener.
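
A side note on the message itself: "No space left on device" is the standard text for errno ENOSPC, so it does not necessarily mean that a disk is full. My assumption (not verified against the sanlock or libvirt sources) is that this error code is being reused here to report the missing lockspace. A quick Python check of the errno text on Linux:

```python
import errno
import os

# ENOSPC is the errno behind the "No space left on device" text.
# Assumption: the lock failure above surfaces this errno even though
# no filesystem is actually full.
code = errno.ENOSPC
message = os.strerror(code)
print(code, message)
```

If that assumption holds, checking free disk space will not help; the lockspace itself is what is missing.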
In sanlock.log we have:
2015-03-26 19:24:30+0000 7589 [752]: cmd 9 target pid 9559 not found
2015-03-26 19:24:31+0000 7589 [764]: r7 cmd_acquire 2,8,9559 invalid lockspace found -1 failed 935819904 name 7ba46e75-51af-4648-becc-5a469cb8e9c2
(All 3 lease files are present)
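
To see which lockspace names are affected, the sanlock.log entries can be filtered with a small script. This is only a rough sketch: the regular expression is derived from the single line quoted above, the group names (`ids`, `found`, `failed`) are my own labels rather than official sanlock field names, and it assumes the on-disk log has one entry per line (not wrapped as in this mail).

```python
import re

# Pattern built from the one "invalid lockspace" line seen in this mail.
# Group names are hypothetical labels, not taken from sanlock documentation.
INVALID_LOCKSPACE = re.compile(
    r"^(?P<timestamp>\S+ \S+) \d+ \[\d+\]: \S+ cmd_acquire "
    r"(?P<ids>\d+,\d+,\d+) invalid lockspace "
    r"found (?P<found>-?\d+) failed (?P<failed>-?\d+) "
    r"name (?P<name>\S+)$"
)


def find_invalid_lockspaces(lines):
    """Return (timestamp, lockspace name) pairs for matching log lines."""
    hits = []
    for line in lines:
        match = INVALID_LOCKSPACE.match(line.strip())
        if match:
            hits.append((match.group("timestamp"), match.group("name")))
    return hits


# The two sanlock.log lines from this mail, unwrapped.
sample = [
    "2015-03-26 19:24:30+0000 7589 [752]: cmd 9 target pid 9559 not found",
    "2015-03-26 19:24:31+0000 7589 [764]: r7 cmd_acquire 2,8,9559 "
    "invalid lockspace found -1 failed 935819904 "
    "name 7ba46e75-51af-4648-becc-5a469cb8e9c2",
]

for timestamp, name in find_invalid_lockspaces(sample):
    print(timestamp, name)
```

For a real log, the same function can be fed the open file object instead of `sample`, since it only iterates over lines.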
This problem is similar to BZ 1201355 reported by Sandro [1].
About the hosted-engine VM not being resumed after restarting VDSM,
please check [2] and [3] (duplicated).
I confirmed that QEMU is not reopening the file descriptors when
resuming paused VMs, which explains those issues.
Now, how can I fix the "invalid lockspace"?
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1201355
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1172905
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1058300