Hello,
  
     I now see that your hosted engine is up and running. Can you let me know how you tried reinstalling the host? Below is the procedure that should be used; I hope you did not miss any step while reinstalling. If you did, can you try reinstalling again and see if that works?

1) Move the host to maintenance
2) Click on 'Reinstall'
3) Provide the password
4) Uncheck 'Automatically configure host firewall'
5) Click on the 'Deploy' tab
6) Set Hosted Engine deployment to 'Deploy'

Once the host installation is done, wait until the active score of the host shows 3400 in the General tab, then check hosted-engine --vm-status.
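As a quick check after the redeploy (a rough sketch; the service names assume a standard hosted-engine setup), you can run the following on the reinstalled host:

    # both HA services should be active again after the redeploy
    systemctl status ovirt-ha-broker ovirt-ha-agent
    # the host should now report its own hostname and a score of 3400
    hosted-engine --vm-status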

Thanks
kasturi

On Mon, Jan 15, 2018 at 4:57 PM, Artem Tambovskiy <artem.tambovskiy@gmail.com> wrote:
Hello,

I have uploaded two archives with all the relevant logs to shared hosting:
files from host 1 (which is currently running all VMs, including the hosted engine) - https://yadi.sk/d/PttRoYV63RTvhK
files from the second host - https://yadi.sk/d/UBducEsV3RTvhc

I have tried to restart both ovirt-ha-agent and ovirt-ha-broker, but it had no effect. I have also tried to shut down the hosted_engine VM, stop the ovirt-ha-agent and ovirt-ha-broker services, disconnect the storage and connect it again - no effect either.
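For reference, the restart and the storage reconnect were done roughly like this (a sketch; the exact invocations may have differed slightly):

    systemctl restart ovirt-ha-broker ovirt-ha-agent
    # and, with the agent and broker stopped:
    hosted-engine --disconnect-storage
    hosted-engine --connect-storage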
I also tried to reinstall the second host from the WebGUI. This led to an interesting situation: hosted-engine --vm-status now shows both hosts with the same hostname.

[root@ovirt1 ~]# hosted-engine --vm-status    

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt1.telia.ru
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : a7758085
local_conf_timestamp               : 259327
Host timestamp                     : 259327
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=259327 (Mon Jan 15 14:06:48 2018)
        host-id=1
        score=3400
        vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt1.telia.ru
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Local maintenance                  : False
crc32                              : c7037c03
local_conf_timestamp               : 7530
Host timestamp                     : 7530
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7530 (Fri Jan 12 16:10:12 2018)
        host-id=2
        score=0
        vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=AgentStopped
        stopped=True

Gluster seems to be working fine; all gluster nodes show a connected state.
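This was checked with something along these lines on each host (all peers report a connected state):

    gluster peer status
    gluster volume status engine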

Any advice on how to resolve this situation is highly appreciated!

Regards,
Artem


On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <knarra@redhat.com> wrote:
Hello Artem,

        Can you check whether the glusterd service is running on host1 and all the peers are in the connected state? If yes, can you restart the ovirt-ha-agent and ovirt-ha-broker services and check if things are working fine?
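For example, something along these lines on host1 (assuming the standard service names):

    systemctl status glusterd
    gluster peer status
    # if glusterd is running and all peers are connected, restart the HA services
    systemctl restart ovirt-ha-broker ovirt-ha-agent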

Thanks
kasturi

On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy <artem.tambovskiy@gmail.com> wrote:
I explored the logs on both hosts.
broker.log shows no errors.

agent.log does not look good:

on host1 (which is running the hosted engine):

MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 411, in start_monitoring
    self._initialize_sanlock()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 749, in _initialize_sanlock
    "Failed to initialize sanlock, the number of errors has"
SanlockInitializationError: Failed to initialize sanlock, the number of errors has exceeded the limit

MainThread::ERROR::2018-01-12 21:51:03,884::agent::206::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::WARNING::2018-01-12 21:51:08,889::agent::209::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '1'
MainThread::INFO::2018-01-12 21:51:08,919::hosted_engine::242::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: ovirt1.telia.ru
MainThread::INFO::2018-01-12 21:51:08,921::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2018-01-12 21:51:11,398::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2018-01-12 21:51:11,399::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
MainThread::INFO::2018-01-12 21:51:13,725::storage_server::239::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2018-01-12 21:51:18,390::storage_server::246::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2018-01-12 21:51:18,423::storage_server::253::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::INFO::2018-01-12 21:51:18,689::hosted_engine::663::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
MainThread::INFO::2018-01-12 21:51:18,690::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
MainThread::INFO::2018-01-12 21:51:21,895::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Refreshing vm.conf
MainThread::INFO::2018-01-12 21:51:21,895::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
MainThread::INFO::2018-01-12 21:51:21,896::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2018-01-12 21:51:21,896::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2018-01-12 21:51:21,897::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf 
MainThread::INFO::2018-01-12 21:51:21,915::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
MainThread::INFO::2018-01-12 21:51:21,918::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
MainThread::INFO::2018-01-12 21:51:21,919::hosted_engine::509::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Initializing ha-broker connection
MainThread::INFO::2018-01-12 21:51:21,919::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor ping, options {'addr': '80.239.162.97'}
MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104457680
MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104458064
MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
MainThread::INFO::2018-01-12 21:51:21,938::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104458448
MainThread::INFO::2018-01-12 21:51:21,939::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248d7523a6', 'address': '0'}
MainThread::INFO::2018-01-12 21:51:21,940::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104457552
MainThread::INFO::2018-01-12 21:51:21,941::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248d7523a6', 'address': '0'}
MainThread::INFO::2018-01-12 21:51:21,942::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor) Success, id 140547104459792
MainThread::INFO::2018-01-12 21:51:26,951::brokerlink::179::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(set_storage_domain) Success, id 140546772847056
MainThread::INFO::2018-01-12 21:51:26,952::hosted_engine::601::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Broker initialized, all submonitors started
MainThread::INFO::2018-01-12 21:51:27,049::hosted_engine::704::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/093faa75-5e33-4559-84fa-1f1f8d48153b/911c7637-b49d-463e-b186-23b404e50769)
MainThread::INFO::2018-01-12 21:53:48,067::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
MainThread::INFO::2018-01-12 21:56:14,088::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
MainThread::INFO::2018-01-12 21:58:40,111::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt
MainThread::INFO::2018-01-12 22:01:06,133::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt


agent.log from the second host:

MainThread::INFO::2018-01-12 22:01:37,241::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2018-01-12 22:01:37,242::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
MainThread::INFO::2018-01-12 22:01:39,540::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
MainThread::INFO::2018-01-12 22:01:41,939::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
MainThread::INFO::2018-01-12 22:01:52,150::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
MainThread::INFO::2018-01-12 22:01:52,150::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2018-01-12 22:01:52,151::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2018-01-12 22:01:52,153::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf 
MainThread::INFO::2018-01-12 22:01:52,174::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
MainThread::INFO::2018-01-12 22:01:52,179::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
MainThread::INFO::2018-01-12 22:01:52,189::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2018-01-12 22:01:54,586::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2018-01-12 22:01:54,587::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
MainThread::INFO::2018-01-12 22:01:56,903::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
MainThread::INFO::2018-01-12 22:01:59,299::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018
MainThread::INFO::2018-01-12 22:01:59,299::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
MainThread::INFO::2018-01-12 22:02:09,659::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
MainThread::INFO::2018-01-12 22:02:09,659::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2018-01-12 22:02:09,660::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2018-01-12 22:02:09,663::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf 
MainThread::INFO::2018-01-12 22:02:09,683::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
MainThread::INFO::2018-01-12 22:02:09,688::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
MainThread::INFO::2018-01-12 22:02:09,698::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2018-01-12 22:02:12,112::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2018-01-12 22:02:12,113::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
MainThread::INFO::2018-01-12 22:02:14,444::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
MainThread::INFO::2018-01-12 22:02:16,859::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018
MainThread::INFO::2018-01-12 22:02:16,859::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
MainThread::INFO::2018-01-12 22:02:27,100::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
MainThread::INFO::2018-01-12 22:02:27,100::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2018-01-12 22:02:27,101::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2018-01-12 22:02:27,103::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf 
MainThread::INFO::2018-01-12 22:02:27,125::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
MainThread::INFO::2018-01-12 22:02:27,129::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
MainThread::INFO::2018-01-12 22:02:27,130::states::667::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down, local host does not have best score
MainThread::INFO::2018-01-12 22:02:27,139::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2018-01-12 22:02:29,584::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2018-01-12 22:02:29,586::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server


Any suggestions on how to resolve this?

regards,
Artem


On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy <artem.tambovskiy@gmail.com> wrote:
While trying to fix one thing, I broke another :(

I fixed the mnt_options for the hosted engine storage domain and installed the latest security patches on my hosts and the hosted engine. All VMs are up and running, but hosted-engine --vm-status reports issues; the full output is below.
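For reference, the mnt_options fix was applied roughly as follows (a sketch; the backup-volfile-servers value here is only an example for this cluster, not necessarily the exact value used):

    hosted-engine --set-shared-config mnt_options \
        backup-volfile-servers=ovirt2.telia.ru:ovirt3.telia.ru --type=he_shared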

[root@ovirt1 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt2
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : 193164b8
local_conf_timestamp               : 8350
Host timestamp                     : 8350
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=8350 (Fri Jan 12 19:03:54 2018)
        host-id=1
        score=0
        vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUnexpectedlyDown
        stopped=False
        timeout=Thu Jan  1 05:24:43 1970


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt1.telia.ru
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Local maintenance                  : False
crc32                              : c7037c03
local_conf_timestamp               : 7530
Host timestamp                     : 7530
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7530 (Fri Jan 12 16:10:12 2018)
        host-id=2
        score=0
        vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=AgentStopped
        stopped=True
[root@ovirt1 ~]# 



From the second host, the situation looks a bit different:


[root@ovirt2 ~]# hosted-engine --vm-status


--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt2
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : 78eabdb6
local_conf_timestamp               : 8403
Host timestamp                     : 8402
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=8402 (Fri Jan 12 19:04:47 2018)
        host-id=1
        score=0
        vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUnexpectedlyDown
        stopped=False
        timeout=Thu Jan  1 05:24:43 1970


--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt1.telia.ru
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 0
stopped                            : True
Local maintenance                  : False
crc32                              : c7037c03
local_conf_timestamp               : 7530
Host timestamp                     : 7530
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=7530 (Fri Jan 12 16:10:12 2018)
        host-id=2
        score=0
        vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
        conf_on_shared_storage=True
        maintenance=False
        state=AgentStopped
        stopped=True


The WebGUI shows that the engine is running on host ovirt1.
Gluster looks fine:
[root@ovirt1 ~]# gluster volume status engine
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick ovirt1.telia.ru:/oVirt/engine         49169     0          Y       3244 
Brick ovirt2.telia.ru:/oVirt/engine         49179     0          Y       20372
Brick ovirt3.telia.ru:/oVirt/engine         49206     0          Y       16609
Self-heal Daemon on localhost               N/A       N/A        Y       117868
Self-heal Daemon on ovirt2.telia.ru         N/A       N/A        Y       20521
Self-heal Daemon on ovirt3                  N/A       N/A        Y       25093
 
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

How can I resolve this issue?


_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


