Hi,

CQ test 002_bootstrap.verify_add_all_hosts failed for the vdsm project on master.

Looking at the log, vdsm cannot find the master storage domain, and engine then sets the host to the Non-Operational state.

Although on the surface the patch seems to be related, the master storage domain is iSCSI while the patch is related to gluster.

I do not think there is a connection between the patch and the failure, but can you please have a look to make sure?

Link and headline of suspected patches:

https://gerrit.ovirt.org/#/c/69668/ - gluster: Fix error when brick is on a btrfs subvolume

Link to Job:

http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5180/

Link to all logs:

http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5180/artifact/

(Relevant) error snippet from the log:

<error>



vdsm:

2018-02-01 03:13:49,211-0500 INFO  (jsonrpc/4) [vdsm.api] START createStorageDomain(storageType=3, sdUUID=u'077add35-9171-45d5-b6de-79cc5a853c36', domainName=u'iscsi', typeSpecificArg=u'IdW3HG-K1Af-e0d3-u2O3-rGle-8fk5-ACNk6C', domClass=1, domVersion=u'4', options=None) from=::ffff:192.168.201.4,58530, flow_id=22d4ffd8, task_id=2ce6dd52-3d28-4532-abbf-d78d52af6cda (api:46)

2018-02-01 03:14:40,223-0500 INFO  (jsonrpc/7) [vdsm.api] START connectStoragePool(spUUID=u'2570c0c9-f872-4e49-964a-ee533a79c3f2', hostID=1, msdUUID=u'077add35-9171-45d5-b6de-79cc5a853c36', masterVersion=1, domainsMap={u'077add35-9171-45d5-b6de-79cc5a853c36': u'active'}, options=None) from=::ffff:192.168.201.4,36310, flow_id=19e9aa89, task_id=878419a0-c5ce-4e35-aed5-b27d56b2886e (api:46)
2018-02-01 03:14:40,225-0500 INFO  (jsonrpc/7) [storage.StoragePoolMemoryBackend] new storage pool master version 1 and domains map {u'077add35-9171-45d5-b6de-79cc5a853c36': u'Active'} (spbackends:449)
2018-02-01 03:14:40,225-0500 INFO  (jsonrpc/7) [storage.StoragePool] updating pool 2570c0c9-f872-4e49-964a-ee533a79c3f2 backend from type NoneType instance 0x7f45919e3f20 to type StoragePoolMemoryBackend instance 0x45411b0 (sp:157)
2018-02-01 03:14:40,226-0500 INFO  (jsonrpc/7) [storage.StoragePool] Connect host #1 to the storage pool 2570c0c9-f872-4e49-964a-ee533a79c3f2 with master domain: 077add35-9171-45d5-b6de-79cc5a853c36 (ver = 1) (sp:692)
2018-02-01 03:14:40,462-0500 INFO  (jsonrpc/7) [vdsm.api] FINISH connectStoragePool error=Cannot find master domain: u'spUUID=2570c0c9-f872-4e49-964a-ee533a79c3f2, msdUUID=077add35-9171-45d5-b6de-79cc5a853c36' from=::ffff:192.168.201.4,36310, flow_id=19e9aa89, task_id=878419a0-c5ce-4e35-aed5-b27d56b2886e (api:50)
2018-02-01 03:14:40,462-0500 ERROR (jsonrpc/7) [storage.TaskManager.Task] (Task='878419a0-c5ce-4e35-aed5-b27d56b2886e') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in connectStoragePool
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1032, in connectStoragePool
    spUUID, hostID, msdUUID, masterVersion, domainsMap)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1094, in _connectStoragePool
    res = pool.connect(hostID, msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 704, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1275, in __rebuild
    self.setMasterDomain(msdUUID, masterVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1488, in setMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain: u'spUUID=2570c0c9-f872-4e49-964a-ee533a79c3f2, msdUUID=077add35-9171-45d5-b6de-79cc5a853c36'
2018-02-01 03:14:40,466-0500 INFO  (jsonrpc/7) [storage.TaskManager.Task] (Task='878419a0-c5ce-4e35-aed5-b27d56b2886e') aborting: Task is aborted: "Cannot find master domain: u'spUUID=2570c0c9-f872-4e49-964a-ee533a79c3f2, msdUUID=077add35-9171-45d5-b6de-79cc5a853c36'" - code 304 (task:1181)
2018-02-01 03:14:40,467-0500 ERROR (jsonrpc/7) [storage.Dispatcher] FINISH connectStoragePool error=Cannot find master domain: u'spUUID=2570c0c9-f872-4e49-964a-ee533a79c3f2, msdUUID=077add35-9171-45d5-b6de-79cc5a853c36' (dispatcher:82)
2018-02-01 03:14:40,467-0500 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call StoragePool.connect failed (error 304) in 0.25 seconds (__init__:573)

engine:

2018-02-01 03:14:40,603-05 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [ba52086] EVENT_ID: VDS_SET_NONOPERATIONAL_DOMAIN(522), Host lago-basic-suite-master-host-0 cannot access the Storage Domain(s) <UNKNOWN> attached to the Data Center test-dc. Setting Host state to Non-Operational.
2018-02-01 03:14:40,608-05 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [ba52086] EVENT_ID: VDS_ALERT_FENCE_IS_NOT_CONFIGURED(9,000), Failed to verify Power Management configuration for Host lago-basic-suite-master-host-0.
2018-02-01 03:14:40,610-05 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-70) [ba52086] EVENT_ID: CONNECT_STORAGE_POOL_FAILED(995), Failed to connect Host lago-basic-suite-master-host-0 to Storage Pool test-dc



</error>
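
In case it helps with going through the full vdsm.log from the artifacts link, here is a rough Python sketch for pulling out every record written by one worker thread (jsonrpc/7 handled the failing connectStoragePool call above). The record layout in the regex is only inferred from the snippet in this mail, the helper name records_for_thread is made up for illustration, and it assumes each record sits on a single unwrapped line, so treat it as a starting point rather than a proper vdsm log parser.

```python
import re

# Header layout inferred from the snippet above (an assumption, not a
# documented vdsm format): timestamp, level, (thread), [logger], message.
RECORD = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"(?P<level>[A-Z]+)\s+\((?P<thread>[^)]+)\) \[(?P<logger>[^\]]+)\] "
    r"(?P<msg>.*)$"
)


def records_for_thread(lines, thread):
    """Yield (timestamp, level, logger, message) for one worker thread.

    Lines that do not start a record (for example traceback lines)
    are skipped.
    """
    for line in lines:
        m = RECORD.match(line)
        if m and m.group("thread") == thread:
            yield (m.group("ts"), m.group("level"),
                   m.group("logger"), m.group("msg"))


# Two records copied from the vdsm snippet above.
sample = [
    "2018-02-01 03:14:40,226-0500 INFO  (jsonrpc/7) [storage.StoragePool] "
    "Connect host #1 to the storage pool 2570c0c9-f872-4e49-964a-ee533a79c3f2 "
    "with master domain: 077add35-9171-45d5-b6de-79cc5a853c36 (ver = 1) (sp:692)",
    "2018-02-01 03:14:40,467-0500 ERROR (jsonrpc/7) [storage.Dispatcher] "
    "FINISH connectStoragePool error=Cannot find master domain: "
    "u'spUUID=2570c0c9-f872-4e49-964a-ee533a79c3f2, "
    "msdUUID=077add35-9171-45d5-b6de-79cc5a853c36' (dispatcher:82)",
]

# Keep only the ERROR records from that thread.
errors = [r for r in records_for_thread(sample, "jsonrpc/7") if r[1] == "ERROR"]
for ts, level, logger, msg in errors:
    print(ts, logger, msg)
```

To run it against the real file, pass an open file object instead of sample, for example records_for_thread(open("vdsm.log"), "jsonrpc/7").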