Hi community!
 
We have a self-hosted engine test infrastructure (oVirt 4.1) on Gluster storage, replica 3 arbiter 1.
 
The shared storage for the engine VM is addressed by a DNS name: "gluster-facility:/engine".
The DNS name "gluster-facility" resolves to 3 A records:
"node-gluster205 10.77.253.205"
"node-gluster203 10.77.253.203"
"node-gluster201 10.77.253.201" (arbiter)
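To rule out name resolution as a factor, a check like the following could be run on each host. It is only a sketch: the helper names are hypothetical, the expected addresses are copied from the A records above, and it reports what the local resolver returns rather than what the Gluster client actually connects to.

```python
import socket

# Addresses copied from the three A records listed above.
EXPECTED = {"10.77.253.201", "10.77.253.203", "10.77.253.205"}


def resolve_ipv4(name):
    """Return the set of IPv4 addresses the local resolver gives for name."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def missing_addresses(resolved, expected=EXPECTED):
    """Return the expected addresses that are absent from resolved, sorted."""
    return sorted(set(expected) - set(resolved))
```

On a host, `missing_addresses(resolve_ipv4("gluster-facility"))` should come back empty if all three records are visible there.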
 
The fresh self-hosted engine install succeeded with the storage path gluster-facility:/engine.
But the ovirt-ha-broker and ovirt-ha-agent services do not work normally: they cannot manage the engine VM.
Once the engine VM goes down, it cannot be started again.
 
[root@node-gluster204 ~]# hosted-engine --vm-status
 
--== Host 1 status ==--
 
conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : node-msk-gluster202.ipt.fsin.uis
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Local maintenance                  : False
crc32                              : 3ac98fe3
local_conf_timestamp               : 342471
Host timestamp                     : 342470
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=342470 (Mon Dec  4 15:03:57 2017)
        host-id=1
        score=0
        vm_conf_refresh_time=342471 (Mon Dec  4 15:03:57 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=AgentStopped
        stopped=True
 
broker.log says
 
Thread-712::INFO::2018-01-11 11:29:30,629::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
Thread-712::INFO::2018-01-11 11:29:30,630::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
Thread-700::INFO::2018-01-11 11:29:30,634::mem_free::50::mem_free.MemFree::(action) memFree: 30999
Thread-697::ERROR::2018-01-11 11:29:33,270::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'set-storage-domain VdsmBackend hosted-engine.lockspace=7B22696D6167655F75756964223A202239323836376633652D373366312D346264382D393536332D353936616565663564616632222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A202231336235613464302D646432362D343931632D623563302D353632386235366263336135227D sp_uuid=00000000-0000-0000-0000-000000000000 dom_type=glusterfs hosted-engine.metadata=7B22696D6167655F75756964223A202233643034323833362D663935332D346662372D626634322D356435346632623338356666222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A202231313134633364642D303236342D343035392D393664312D623635633034623533396165227D sd_uuid=03da57d7-c5a2-4998-916a-f4f71baa4831'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
    data)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
    .set_storage_domain(client, sd_type, **options)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain
    self._backends[client].connect()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 409, in connect
    raise RuntimeError(response.message)
RuntimeError: Volume does not exist: (u'13b5a4d0-dd26-491c-b5c0-5628b56bc3a5',)
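The hosted-engine.lockspace and hosted-engine.metadata values in the request above look like hex-encoded JSON. A small sketch to decode them follows (the function name is hypothetical, and the hex string is the lockspace value copied from the log line, split only for readability):

```python
import json


def decode_option(hex_value):
    """Decode a hex-encoded JSON option value from the broker request."""
    return json.loads(bytes.fromhex(hex_value).decode("utf-8"))


# hosted-engine.lockspace value from the broker.log line above.
LOCKSPACE_HEX = (
    "7B22696D6167655F75756964223A2022"
    "39323836376633652D373366312D346264382D393536332D353936616565663564616632"
    "222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A2022"
    "31336235613464302D646432362D343931632D623563302D353632386235366263336135"
    "227D"
)

lockspace = decode_option(LOCKSPACE_HEX)
print(lockspace["image_uuid"])   # 92867f3e-73f1-4bd8-9563-596aeef5daf2
print(lockspace["volume_uuid"])  # 13b5a4d0-dd26-491c-b5c0-5628b56bc3a5
```

So the volume reported as missing is the lockspace volume, and its "path" field is null.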
 
vdsm.log says
 
a4d0-dd26-491c-b5c0-5628b56bc3a5', allowIllegal=False) from=::1,44266, task_id=fab6382f-8337-45e4-8eb8-1dbf79fe8c1d (api:46)
2018-01-11 11:31:53,050+0300 INFO  (jsonrpc/6) [vdsm.api] START repoStats(options=None) from=::1,44262, task_id=68a37242-a84d-4372-804e-57e416d4c3de (api:46)
2018-01-11 11:31:53,050+0300 INFO  (jsonrpc/6) [vdsm.api] FINISH repoStats return={} from=::1,44262, task_id=68a37242-a84d-4372-804e-57e416d4c3de (api:52)
2018-01-11 11:31:53,065+0300 INFO  (jsonrpc/4) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'13b5a4d0-dd26-491c-b5c0-5628b56bc3a5',) from=::1,44266, task_id=fab6382f-8337-45e4-8eb8-1dbf79fe8c1d (api:50)
2018-01-11 11:31:53,066+0300 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='fab6382f-8337-45e4-8eb8-1dbf79fe8c1d') Unexpected error (task:872)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 879, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in prepareImage
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3135, in prepareImage
    raise se.VolumeDoesNotExist(leafUUID)
VolumeDoesNotExist: Volume does not exist: (u'13b5a4d0-dd26-491c-b5c0-5628b56bc3a5',)
2018-01-11 11:31:53,066+0300 INFO  (jsonrpc/4) [storage.TaskManager.Task] (Task='fab6382f-8337-45e4-8eb8-1dbf79fe8c1d') aborting: Task is aborted: "Volume does not exist: (u'13b5a4d0-dd26-491c-b5c0-5628b56bc3a5',)" - code 201 (task:1177)
2018-01-11 11:31:53,066+0300 ERROR (jsonrpc/4) [storage.Dispatcher] FINISH prepareImage error=Volume does not exist: (u'13b5a4d0-dd26-491c-b5c0-5628b56bc3a5',) (dispatcher:81)
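Since vdsm reports the volume as missing, it may be worth checking whether the file is visible on the mounted storage domain. The sketch below only builds the path I would expect. The mount root and the `<sd_uuid>/images/<image_uuid>/<volume_uuid>` layout are assumptions about how a Gluster domain is usually mounted, not something taken from these logs. The sd_uuid and volume UUID come from the messages above, and the image UUID is the one hex-encoded in the hosted-engine.lockspace option of the broker request.

```python
import posixpath

# Assumed mount point of gluster-facility:/engine on the host (not from the logs).
MOUNT_ROOT = "/rhev/data-center/mnt/glusterSD/gluster-facility:_engine"

SD_UUID = "03da57d7-c5a2-4998-916a-f4f71baa4831"      # sd_uuid in the broker request
IMAGE_UUID = "92867f3e-73f1-4bd8-9563-596aeef5daf2"   # decoded from hosted-engine.lockspace
VOLUME_UUID = "13b5a4d0-dd26-491c-b5c0-5628b56bc3a5"  # volume reported as missing


def volume_path(mount_root, sd_uuid, image_uuid, volume_uuid):
    """Build the path where the volume file would sit under the assumed layout."""
    return posixpath.join(mount_root, sd_uuid, "images", image_uuid, volume_uuid)


expected_path = volume_path(MOUNT_ROOT, SD_UUID, IMAGE_UUID, VOLUME_UUID)
print(expected_path)
```

Running `ls -l` on the printed path on each host would show whether the file is missing or just not visible through the mount.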
 
/var/log/messages says
 
Jan 11 11:54:03 node-gluster204 journal: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain VdsmBackend hosted-engine.lockspace=7B22696D6167655F75756964223A202239323836376633652D373366312D346264382D393536332D353936616565663564616632222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A202231336235613464302D646432362D343931632D623563302D353632386235366263336135227D sp_uuid=00000000-0000-0000-0000-000000000000 dom_type=glusterfs hosted-engine.metadata=7B22696D6167655F75756964223A202233643034323833362D663935332D346662372D626634322D356435346632623338356666222C202270617468223A206E756C6C2C2022766F6C756D655F75756964223A202231313134633364642D303236342D343035392D393664312D623635633034623533396165227D sd_uuid=03da57d7-c5a2-4998-916a-f4f71baa4831'#012Traceback (most recent call last):#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle#012    data)#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch#012    .set_storage_domain(client, sd_type, **options)#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain#012    self._backends[client].connect()#012  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 409, in connect#012    raise RuntimeError(response.message)#012RuntimeError: Volume does not exist: (u'13b5a4d0-dd26-491c-b5c0-5628b56bc3a5',)