
Continuing with the 3.6 nightly builds testing...

While hosted-engine-setup was adding the host to the newly created cluster, VDSM crashed, probably because the gluster engine storage disappeared, as in BZ 1201355 [1].

Facts:
- the engine storage (/rhev/data-center/mmt/...) was unmounted during this process
- another mount of the same volume was still mounted after the VDSM crash (so maybe the problem is not related to gluster)

After doing a "hosted-engine --connect-storage", the volume is mounted again.

Now, when trying to restart VDSM, I get an "invalid lockspace":

    Thread-46::ERROR::2015-03-26 19:24:31,843::vm::1237::vm.Vm::(_startUnderlyingVm) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::The vm start process failed
    Traceback (most recent call last):
      File "/usr/share/vdsm/virt/vm.py", line 1185, in _startUnderlyingVm
        self._run()
      File "/usr/share/vdsm/virt/vm.py", line 2253, in _run
        self._connection.createXML(domxml, flags),
      File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
        ret = f(*args, **kwargs)
      File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
        if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
    libvirtError: Failed to acquire lock: No space left on device
    Thread-46::INFO::2015-03-26 19:24:31,844::vm::1709::vm.Vm::(setDownStatus) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::Changed state to Down: Failed to acquire lock: No space left on device (code=1)
    Thread-46::DEBUG::2015-03-26 19:24:31,844::vmchannels::214::vds::(unregister) Delete fileno 60 from listener.
    VM Channels Listener::DEBUG::2015-03-26 19:24:32,346::vmchannels::121::vds::(_do_del_channels) fileno 60 was removed from listener.

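
In case it helps anyone filtering a large vdsm.log for the same failure, here is a minimal Python sketch that splits VDSM log lines like the ones above into fields. The "::"-separated layout (thread, level, timestamp, module, line number, logger, function, message) is only inferred from the lines quoted here, and `parse_vdsm_line` and the field names are hypothetical, not part of VDSM.

```python
import re

# Assumed layout, inferred only from the log lines quoted in this message:
#   thread::LEVEL::timestamp::module::lineno::logger::(function) message
VDSM_LINE = re.compile(
    r"^(?P<thread>.+?)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<function>[^)]+)\) (?P<message>.*)$"
)


def parse_vdsm_line(line):
    """Return a dict of fields, or None for lines that do not match
    (for example traceback continuation lines)."""
    match = VDSM_LINE.match(line.strip())
    return match.groupdict() if match else None


def lock_failures(lines):
    """Yield parsed records whose message mentions a failed lock acquire."""
    for line in lines:
        record = parse_vdsm_line(line)
        if record and "Failed to acquire lock" in record["message"]:
            yield record


if __name__ == "__main__":
    sample = [
        "Thread-46::INFO::2015-03-26 19:24:31,844::vm::1709::vm.Vm::"
        "(setDownStatus) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::"
        "Changed state to Down: Failed to acquire lock: "
        "No space left on device (code=1)",
    ]
    for record in lock_failures(sample):
        print(record["timestamp"], record["function"], record["message"])
```

Lines that do not match the assumed layout are skipped rather than guessed at, so tracebacks are ignored.
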
In sanlock.log we have:

    2015-03-26 19:24:30+0000 7589 [752]: cmd 9 target pid 9559 not found
    2015-03-26 19:24:31+0000 7589 [764]: r7 cmd_acquire 2,8,9559 invalid lockspace found -1 failed 935819904 name 7ba46e75-51af-4648-becc-5a469cb8e9c2

(All 3 lease files are present.)

This problem is similar to BZ 1201355, reported by Sandro [1].

About the hosted-engine VM not being resumed after restarting VDSM, please check [2] and [3] (duplicate). I confirmed that QEMU does not reopen the file descriptors when resuming a paused VM, which explains those issues.

Now, how can I fix the "invalid lockspace"?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1201355
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1172905
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1058300
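
For the sanlock.log excerpt above, a small Python sketch that extracts the lockspace name from "invalid lockspace" failures may be useful when comparing hosts. The regex is built only from the single cmd_acquire line quoted here. The group names are my own. In particular, treating the third number after cmd_acquire as the requesting pid is an assumption (it does match the "target pid 9559" on the preceding line), and `invalid_lockspaces` is a hypothetical helper, not a sanlock API.

```python
import re

# Pattern derived from the one cmd_acquire failure line quoted in this
# message; group names are illustrative, not sanlock terminology.
ACQUIRE_FAIL = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4}) "
    r"(?P<seconds>\d+) \[(?P<thread>\d+)\]: (?P<resource>\S+) "
    r"cmd_acquire \d+,\d+,(?P<pid>\d+) invalid lockspace "
    r"found (?P<found>-?\d+) failed (?P<failed>-?\d+) "
    r"name (?P<lockspace>\S+)"
)


def invalid_lockspaces(lines):
    """Return {lockspace name: [pids]} for every matching failure line.

    The pid is assumed to be the third number after cmd_acquire.
    """
    result = {}
    for line in lines:
        match = ACQUIRE_FAIL.match(line.strip())
        if match:
            pids = result.setdefault(match.group("lockspace"), [])
            pids.append(int(match.group("pid")))
    return result


if __name__ == "__main__":
    sample = [
        "2015-03-26 19:24:30+0000 7589 [752]: cmd 9 target pid 9559 not found",
        "2015-03-26 19:24:31+0000 7589 [764]: r7 cmd_acquire 2,8,9559 "
        "invalid lockspace found -1 failed 935819904 "
        "name 7ba46e75-51af-4648-becc-5a469cb8e9c2",
    ]
    for name, pids in invalid_lockspaces(sample).items():
        print(name, pids)
```

The names it reports can then be checked by hand against the lockspaces listed by "sanlock client status" on the host.
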