Hi all,
We are developing support for Ceph as a new storage domain type. Creating
and using the domain works well, but we cannot add a new host to the engine
when a Ceph storage domain already exists.
The engine reports "VDSM node1 command ConnectStoragePoolVDS failed: Cannot
find master domain" and sets the node to "Non Operational" status. The vdsm
log reports this error:
INFO (jsonrpc/0) [vdsm.api] FINISH connectStoragePool error=Cannot find master domain: u'spUUID=5bc95ba9-01f7-0307-0342-000000000033, msdUUID=a3d6903d-07d5-4794-9667-9dddc3c84fe6' from=::ffff:192.168.122.22,40166, flow_id=43b39e00, task_id=4128f9d4-3c92-46ed-a230-2505d3a8ddb9 (api:50)
2018-10-24 11:06:22,227+0800 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='4128f9d4-3c92-46ed-a230-2505d3a8ddb9') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in connectStoragePool
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1032, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1094, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 704, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1275, in __rebuild
    self.setMasterDomain(msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1488, in setMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain: u'spUUID=5bc95ba9-01f7-0307-0342-000000000033, msdUUID=a3d6903d-07d5-4794-9667-9dddc3c84fe6'
After tracing this back, I found that there is no local path under
/rhev/data-center/mnt for the storage domain, so the host cannot find the
storage domain when ConnectStorageServer runs. It only happens when adding a
new host to the engine while a Ceph storage domain already exists in oVirt.
So how does a new host synchronize the existing storage domains? And what
can I do to fix this?
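The check behind that finding can be sketched roughly as follows. This is only an illustration: the helper name find_domain_dirs is hypothetical, and it assumes that a file-based domain shows up as a directory named after its UUID, one level below a mount directory under /rhev/data-center/mnt. A custom Ceph domain type may use a different layout.

```python
import glob
import os

# Mount root taken from the report above.
MNT_ROOT = "/rhev/data-center/mnt"


def find_domain_dirs(sd_uuid, mnt_root=MNT_ROOT):
    """Return directories named sd_uuid found one level below mnt_root.

    Hypothetical helper: assumes the layout <mnt_root>/<mount>/<sd_uuid>.
    """
    pattern = os.path.join(mnt_root, "*", sd_uuid)
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


if __name__ == "__main__":
    # msdUUID copied from the error message in the vdsm log.
    msd_uuid = "a3d6903d-07d5-4794-9667-9dddc3c84fe6"
    matches = find_domain_dirs(msd_uuid)
    if matches:
        for path in matches:
            print("found domain directory: %s" % path)
    else:
        print("no directory for %s under %s" % (msd_uuid, MNT_ROOT))
```

On the newly added host this reports that no directory exists, which matches the StoragePoolMasterNotFound error.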
Thanks in advance!
Yours Sincerely,
Tianyuan