How does the host synchronize existing storage domains when added to the engine?

Hi all,

We are developing support for ceph as a new storage domain type. Creating and using the domain went well, but we cannot add a new host to the engine when there is already a ceph-StorageDomain. The engine reports "VDSM node1 command ConnectStoragePoolVDS failed: Cannot find master domain" and sets the node to "Non Operational" status. The vdsm log reports an error like this:

INFO (jsonrpc/0) [vdsm.api] FINISH connectStoragePool error=Cannot find master domain: u'spUUID=5bc95ba9-01f7-0307-0342-000000000033, msdUUID=a3d6903d-07d5-4794-9667-9dddc3c84fe6' from=::ffff:192.168.122.22,40166, flow_id=43b39e00, task_id=4128f9d4-3c92-46ed-a230-2505d3a8ddb9 (api:50)
2018-10-24 11:06:22,227+0800 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='4128f9d4-3c92-46ed-a230-2505d3a8ddb9') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in connectStoragePool
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1032, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1094, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 704, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1275, in __rebuild
    self.setMasterDomain(msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1488, in setMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain: u'spUUID=5bc95ba9-01f7-0307-0342-000000000033, msdUUID=a3d6903d-07d5-4794-9667-9dddc3c84fe6'

After tracing back, I found that there is no local file path under /rhev/data-center/mnt for the storage domain, so the host cannot find the storage domain when ConnectStorageServer runs. This only happens when adding a new host to the engine while there is already a ceph-StorageDomain in oVirt.

So how does the host synchronize the existing storage domains? And what can I do to fix this?

Thanks in advance!

Yours Sincerely,
Tianyuan
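For anyone reproducing the check described above, here is a minimal sketch, not part of vdsm. It pulls the pool and master domain UUIDs out of the error line and looks for a directory named after the master domain one level below each mount directory in /rhev/data-center/mnt. The helper names are hypothetical, and the layout (mount root, then mount directory, then domain UUID) is an assumption based on how file-based domains usually appear on a host; a custom domain type may lay things out differently.

```python
import os
import re

# Assumption: domains appear as <mount_root>/<mount_dir>/<sdUUID>.
MOUNT_ROOT = "/rhev/data-center/mnt"

# Matches the "spUUID=..., msdUUID=..." part of the vdsm error message.
_ERROR_RE = re.compile(
    r"spUUID=(?P<sp>[0-9a-f-]{36}), msdUUID=(?P<msd>[0-9a-f-]{36})")


def parse_master_not_found(line):
    """Return (spUUID, msdUUID) from the error line, or None if absent."""
    match = _ERROR_RE.search(line)
    if match is None:
        return None
    return match.group("sp"), match.group("msd")


def find_domain_dirs(sd_uuid, mount_root=MOUNT_ROOT):
    """Return directories named sd_uuid one level below each mount dir."""
    if not os.path.isdir(mount_root):
        return []
    found = []
    for mount_dir in sorted(os.listdir(mount_root)):
        candidate = os.path.join(mount_root, mount_dir, sd_uuid)
        if os.path.isdir(candidate):
            found.append(candidate)
    return found


if __name__ == "__main__":
    line = ("StoragePoolMasterNotFound: Cannot find master domain: "
            "u'spUUID=5bc95ba9-01f7-0307-0342-000000000033, "
            "msdUUID=a3d6903d-07d5-4794-9667-9dddc3c84fe6'")
    sp_uuid, msd_uuid = parse_master_not_found(line)
    paths = find_domain_dirs(msd_uuid)
    print("master domain %s: %s" % (msd_uuid, paths or "no directory found"))
```

Running it on the new host and on a host that already works should show whether the master domain directory is missing only on the new host.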

It looks like you have a problem setting the ceph storage as SDM. You may need to have a regular storage domain first before adding the ceph one, but it's really hard to tell from this alone. You can look at sp.py#startSpm.

On Wed, 24 Oct 2018, 7:29 Tianyuan Wang, <tywang0113@gmail.com> wrote:
> hi all, We are developing for supporting ceph as a new storage domain type.

What do you mean by "new storage type"? Can you share your changes so we can understand better?

> It went well when creating and using. But we cannot add a new host to the
> engine when there is already a ceph-StorageDomain.

We don't have such a storage domain.

> The engine reports "VDSM node1 command ConnectStoragePoolVDS failed: Cannot
> find master domain" and set the node to "Non Operational" status. The vdsm log reports error like this:
> ...
> raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
> StoragePoolMasterNotFound: Cannot find master domain: u'spUUID=5bc95ba9-01f7-0307-0342-000000000033, msdUUID=a3d6903d-07d5-4794-9667-9dddc3c84fe6'

This probably means the storage is not connected, but...

> After tracing back, I found that there is no local file path under /rhev/data-center/mnt for the storage domain, so it cannot find storage domain when ConnectStorageServer runs.

Why do you need to connect a storage server? ceph disks are used as network disks, so there is nothing to connect.

We are working on Cinderlib based storage, which will allow using ceph or most storage supported by Cinder drivers.

Adding Fred to add more info.

Nir
participants (3)
- Benny Zlotnik
- Nir Soffer
- Tianyuan Wang