
The DC's master storage domain is on an unrecoverable storage on a remote, dead host.
Engine automatically sets another storage as the "Data (Master)".
Seconds later, the unrecoverable storage is marked as "Data (Master)" again.
There is no way to start the Datacenter.

Both storages are Gluster. The old (unrecoverable) one worked fine as a master.

Any hint?

Logs:

Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::592::Storage.TaskManager.Task::(_updateState) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::moving from state init -> state preparing
Thread-32620::INFO::2015-04-28 16:34:02,508::logUtils::48::dispatcher::(wrapper) Run and protect: getAllTasksStatuses(spUUID=None, options=None)
Thread-32620::ERROR::2015-04-28 16:34:02,508::task::863::Storage.TaskManager.Task::(_setError) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 870, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2202, in getAllTasksStatuses
    raise se.SpmStatusError()
SpmStatusError: Not SPM: ()
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::882::Storage.TaskManager.Task::(_run) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Task._run: bf487090-8d62-4b42-bfde-93574a8e1486 () {} failed - stopping task
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::1214::Storage.TaskManager.Task::(stop) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::stopping in state preparing (force False)
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::990::Storage.TaskManager.Task::(_decref) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::ref 1 aborting True
Thread-32620::INFO::2015-04-28 16:34:02,508::task::1168::Storage.TaskManager.Task::(prepare) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::aborting: Task is aborted: 'Not SPM' - code 654
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::1173::Storage.TaskManager.Task::(prepare) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Prepare: aborted: Not SPM
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::990::Storage.TaskManager.Task::(_decref) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::ref 0 aborting True
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::925::Storage.TaskManager.Task::(_doAbort) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::Task._doAbort: force False
Thread-32620::DEBUG::2015-04-28 16:34:02,508::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::592::Storage.TaskManager.Task::(_updateState) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::moving from state preparing -> state aborting
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::547::Storage.TaskManager.Task::(__state_aborting) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::_aborting: recover policy none
Thread-32620::DEBUG::2015-04-28 16:34:02,508::task::592::Storage.TaskManager.Task::(_updateState) Task=`bf487090-8d62-4b42-bfde-93574a8e1486`::moving from state aborting -> state failed
Thread-32620::DEBUG::2015-04-28 16:34:02,508::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-32620::DEBUG::2015-04-28 16:34:02,508::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-32620::ERROR::2015-04-28 16:34:02,509::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Not SPM: ()', 'code': 654}}
Thread-32620::DEBUG::2015-04-28 16:34:02,509::stompReactor::158::yajsonrpc.StompServer::(send) Sending response
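
In case it helps anyone narrow this down, here is a minimal sketch for pulling only the ERROR entries out of a log like the excerpt above. The field layout (thread, level, timestamp, module, line number, logger, function, message, separated by "::") is only inferred from the quoted lines, not taken from any VDSM documentation, and the names LOG_LINE, parse_line and filter_level are made up for illustration.

```python
import re

# Field layout inferred from the quoted log lines (an assumption):
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
LOG_LINE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)"
    r"::(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r"::(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)"
    r"::\((?P<function>\w+)\) ?(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for a log record, or None for any other line
    (for example traceback lines, which do not follow the record layout)."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def filter_level(lines, level="ERROR"):
    """Keep only the parsed records whose level equals `level`."""
    records = []
    for line in lines:
        record = parse_line(line)
        if record is not None and record["level"] == level:
            records.append(record)
    return records
```

Running filter_level over the lines of the log file and printing each record's "function" and "message" fields should leave just the _setError and dispatcher entries from the excerpt, which makes the "Not SPM" failures easier to spot among the DEBUG noise.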