Hello,

Any further suggestions on how to fix the issue and make the HA setup work?
Could completely removing the second host from the cluster (including removal
of all oVirt configuration files and packages) and adding it again solve the
issue? Or might it completely ruin the cluster?

Regards,
Artem

On 16 Jan 2018 at 17:00, "Artem Tambovskiy" <artem.tambovskiy@gmail.com> wrote:

Hi Martin,
Thanks for the feedback. All hosts and the hosted engine are running the
4.1.8 release. The strange thing: I can see that the host ID is set to 1 on
both hosts in the /etc/ovirt-hosted-engine/hosted-engine.conf file.
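
For reference, the duplicate is visible directly in the config files
(host_id is the key as it appears in hosted-engine.conf):

[root@ovirt1 ~]# grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf
host_id=1
[root@ovirt2 ~]# grep host_id /etc/ovirt-hosted-engine/hosted-engine.conf
host_id=1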
I have no idea how this happened; the only thing I have changed recently is
the mnt_options, in order to add backup-volfile-servers, by using the
hosted-engine --set-shared-config command.

Both agent and broker are running on the second host:

[root@ovirt2 ovirt-hosted-engine-ha]# ps -ef | grep ovirt-ha
vdsm 42331 1 26 14:40 ? 00:31:35 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
vdsm 42332 1 0 14:40 ? 00:00:16 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
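
(For reference, the shared value I changed can be read back with something
like the line below; --get-shared-config and the he_shared type are taken
from the hosted-engine CLI help, so double-check the exact syntax on 4.1:)

[root@ovirt1 ~]# hosted-engine --get-shared-config mnt_options --type=he_shared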
However, I saw some tracebacks during the broker start:

[root@ovirt2 ovirt-hosted-engine-ha]# systemctl status ovirt-ha-broker -l
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-01-16 14:40:15 MSK; 1h 58min ago
 Main PID: 42331 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─42331 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon

Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jan 16 14:40:15 ovirt2.telia.ru systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Jan 16 14:40:16 ovirt2.telia.ru ovirt-ha-broker[42331]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=glusterfs sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162'
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
    data)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
    .set_storage_domain(client, sd_type, **options)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain
    self._backends[client].connect()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 462, in connect
    self._dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path
    " in {1}".format(sd_uuid, parent))
BackendFailureException: path to storage domain 4a7f8717-9bb0-4d80-8016-498fa4b88162 not found in /rhev/data-center/mnt/glusterSD

I have tried to issue hosted-engine --connect-storage on the second host,
followed by an agent & broker restart, but there is no visible improvement.

Regards,
Artem

On Tue, Jan 16, 2018 at 4:18 PM, Martin Sivak <msivak@redhat.com> wrote:

Hi everybody,
there are a couple of things to check here.
- what version of the hosted engine agent is this? The logs look like
they are coming from 4.1
- what version of engine is used?
- check the host ID in /etc/ovirt-hosted-engine/hosted-engine.conf on
both hosts, the numbers must be different
- it looks like the agent or broker on host 2 is not active (or there
would be a report)
- the second host does not see data from the first host (unknown
stale-data), wait for a minute and check again, then check the storage
connection
And then the general troubleshooting:
- put hosted engine in global maintenance mode (and check that it is
visible from the other host using he --vm-status)
- mount storage domain (hosted-engine --connect-storage)
- check sanlock client status to see if proper lockspaces are present
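
Roughly, as commands (a sketch of the sequence above; exact option
spellings are per hosted-engine --help on your release):

  # 1) global maintenance, then verify from the other host as well
  hosted-engine --set-maintenance --mode=global
  hosted-engine --vm-status

  # 2) mount the hosted-engine storage domain
  hosted-engine --connect-storage

  # 3) check that the proper lockspaces are present
  sanlock client status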
Best regards
Martin Sivak
On Tue, Jan 16, 2018 at 1:16 PM, Derek Atkins <derek@ihtfp.com> wrote:
> Why are both hosts reporting as ovirt1?
> Look at the hostname fields to see what I mean.
>
> -derek
> Sent using my mobile device. Please excuse any typos.
>
> On January 16, 2018 7:11:09 AM Artem Tambovskiy <artem.tambovskiy@gmail.com>
> wrote:
>>
>> Hello,
>>
>> Yes, I followed exactly the same procedure while reinstalling the hosts
>> (the only difference is that I have an SSH key configured instead of a
>> password).
>>
>> Just reinstalled the second host one more time; after 20 min the host
>> still hasn't reached the active score of 3400 (Hosted Engine HA: Not Active) and
>> I still don't see the crown icon for this host.
>>
>> hosted-engine --vm-status from ovirt1 host
>>
>> [root@ovirt1 ~]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : True
>> Hostname : ovirt1.telia.ru
>> Host ID : 1
>> Engine status : {"health": "good", "vm": "up",
>> "detail": "up"}
>> Score : 3400
>> stopped : False
>> Local maintenance : False
>> crc32 : 3f94156a
>> local_conf_timestamp : 349144
>> Host timestamp : 349144
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=349144 (Tue Jan 16 15:03:45 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=349144 (Tue Jan 16 15:03:45 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : False
>> Hostname : ovirt1.telia.ru
>> Host ID : 2
>> Engine status : unknown stale-data
>> Score : 0
>> stopped : True
>> Local maintenance : False
>> crc32 : c7037c03
>> local_conf_timestamp : 7530
>> Host timestamp : 7530
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=7530 (Fri Jan 12 16:10:12 2018)
>> host-id=2
>> score=0
>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=AgentStopped
>> stopped=True
>>
>>
>> hosted-engine --vm-status output from ovirt2 host
>>
>> [root@ovirt2 ovirt-hosted-engine-ha]# hosted-engine --vm-status
>>
>>
>> --== Host 1 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : False
>> Hostname : ovirt1.telia.ru
>> Host ID : 1
>> Engine status : unknown stale-data
>> Score : 3400
>> stopped : False
>> Local maintenance : False
>> crc32 : 6d3606f1
>> local_conf_timestamp : 349264
>> Host timestamp : 349264
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=349264 (Tue Jan 16 15:05:45 2018)
>> host-id=1
>> score=3400
>> vm_conf_refresh_time=349264 (Tue Jan 16 15:05:45 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=EngineUp
>> stopped=False
>>
>>
>> --== Host 2 status ==--
>>
>> conf_on_shared_storage : True
>> Status up-to-date : False
>> Hostname : ovirt1.telia.ru
>> Host ID : 2
>> Engine status : unknown stale-data
>> Score : 0
>> stopped : True
>> Local maintenance : False
>> crc32 : c7037c03
>> local_conf_timestamp : 7530
>> Host timestamp : 7530
>> Extra metadata (valid at timestamp):
>> metadata_parse_version=1
>> metadata_feature_version=1
>> timestamp=7530 (Fri Jan 12 16:10:12 2018)
>> host-id=2
>> score=0
>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>> conf_on_shared_storage=True
>> maintenance=False
>> state=AgentStopped
>> stopped=True
>>
>>
>> Also I saw some log messages in webGUI about time drift like
>>
>> "Host ovirt2.telia.ru has time-drift of 5305 seconds while maximum
>> configured value is 300 seconds." That is a bit weird, as I haven't touched any
>> time settings since I installed the cluster.
>> Both hosts have the same time and timezone (MSK), but the hosted engine lives in
>> the UTC timezone. Is it mandatory to have everything in sync and in the same
>> timezone?
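>>
>> For what it's worth, the clocks can be compared on each host and inside the
>> engine VM with (a sketch; chronyc assumes chrony, the CentOS 7 default NTP
>> client):
>>
>> date; timedatectl status
>> chronyc tracking # current offset from the NTP sources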
>>
>> Regards,
>> Artem
>>
>>
>>
>>
>>
>>
>> On Tue, Jan 16, 2018 at 2:20 PM, Kasturi Narra <knarra@redhat.com> wrote:
>>>
>>> Hello,
>>>
>>> I now see that your hosted engine is up and running. Can you let me
>>> know how you tried reinstalling the host? Below is the procedure to
>>> use; I hope you did not miss any step while reinstalling. If you missed
>>> one, can you try reinstalling again and see if that works?
>>>
>>> 1) Move the host to maintenance
>>> 2) click on reinstall
>>> 3) provide the password
>>> 4) uncheck 'automatically configure host firewall'
>>> 5) click on 'Deploy' tab
>>> 6) click Hosted Engine deployment as 'Deploy'
>>>
>>> And once the host installation is done, wait till the active score of the
>>> host shows 3400 in the general tab then check hosted-engine --vm-status.
>>>
>>> Thanks
>>> kasturi
>>>
>>> On Mon, Jan 15, 2018 at 4:57 PM, Artem Tambovskiy
>>> <artem.tambovskiy@gmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I have uploaded 2 archives with all relevant logs to shared hosting:
>>>> files from host 1 (which is currently running all VMs, including
>>>> hosted_engine) - https://yadi.sk/d/PttRoYV63RTvhK
>>>> files from the second host - https://yadi.sk/d/UBducEsV3RTvhc
>>>>
>>>> I have tried to restart both ovirt-ha-agent and ovirt-ha-broker, but it
>>>> had no effect. I have also tried to shut down the hosted_engine VM, stop the
>>>> ovirt-ha-agent and ovirt-ha-broker services, disconnect the storage and connect
>>>> it again - no effect either.
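>>>>
>>>> In command form, that stop/disconnect/reconnect cycle looked roughly like
>>>> this (a sketch; --disconnect-storage as listed in the hosted-engine CLI
>>>> help):
>>>>
>>>> systemctl stop ovirt-ha-agent ovirt-ha-broker
>>>> hosted-engine --disconnect-storage
>>>> hosted-engine --connect-storage
>>>> systemctl start ovirt-ha-broker ovirt-ha-agent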
>>>> I also tried to reinstall the second host from the WebGUI - this led to an
>>>> interesting situation - now hosted-engine --vm-status shows that both
>>>> hosts have the same address.
>>>>
>>>> [root@ovirt1 ~]# hosted-engine --vm-status
>>>>
>>>> --== Host 1 status ==--
>>>>
>>>> conf_on_shared_storage : True
>>>> Status up-to-date : True
>>>> Hostname : ovirt1.telia.ru
>>>> Host ID : 1
>>>> Engine status : {"health": "good", "vm": "up",
>>>> "detail": "up"}
>>>> Score : 3400
>>>> stopped : False
>>>> Local maintenance : False
>>>> crc32 : a7758085
>>>> local_conf_timestamp : 259327
>>>> Host timestamp : 259327
>>>> Extra metadata (valid at timestamp):
>>>> metadata_parse_version=1
>>>> metadata_feature_version=1
>>>> timestamp=259327 (Mon Jan 15 14:06:48 2018)
>>>> host-id=1
>>>> score=3400
>>>> vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)
>>>> conf_on_shared_storage=True
>>>> maintenance=False
>>>> state=EngineUp
>>>> stopped=False
>>>>
>>>>
>>>> --== Host 2 status ==--
>>>>
>>>> conf_on_shared_storage : True
>>>> Status up-to-date : False
>>>> Hostname : ovirt1.telia.ru
>>>> Host ID : 2
>>>> Engine status : unknown stale-data
>>>> Score : 0
>>>> stopped : True
>>>> Local maintenance : False
>>>> crc32 : c7037c03
>>>> local_conf_timestamp : 7530
>>>> Host timestamp : 7530
>>>> Extra metadata (valid at timestamp):
>>>> metadata_parse_version=1
>>>> metadata_feature_version=1
>>>> timestamp=7530 (Fri Jan 12 16:10:12 2018)
>>>> host-id=2
>>>> score=0
>>>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)
>>>> conf_on_shared_storage=True
>>>> maintenance=False
>>>> state=AgentStopped
>>>> stopped=True
>>>>
>>>> Gluster seems to be working fine; all gluster nodes show connected state.
>>>>
>>>> Any advice on how to resolve this situation is highly appreciated!
>>>>
>>>> Regards,
>>>> Artem
>>>>
>>>>
>>>> On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <knarra@redhat.com>
>>>> wrote:
>>>>>
>>>>> Hello Artem,
>>>>>
>>>>> Can you check if the glusterd service is running on host1 and all
>>>>> the peers are in connected state? If yes, can you restart the ovirt-ha-agent
>>>>> and broker services and check if things are working fine?
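>>>>>
>>>>> For example (a sketch of those checks as commands):
>>>>>
>>>>> systemctl status glusterd
>>>>> gluster peer status
>>>>> systemctl restart ovirt-ha-broker ovirt-ha-agent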
>>>>>
>>>>> Thanks
>>>>> kasturi
>>>>>
>>>>> On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy
>>>>> <artem.tambovskiy@gmail.com> wrote:
>>>>>>
>>>>>> I explored the logs on both hosts.
>>>>>> broker.log shows no errors.
>>>>>>
>>>>>> agent.log is not looking good:
>>>>>>
>>>>>> on host1 (which is running the hosted engine):
>>>>>>
>>>>>> MainThread::ERROR::2018-01-12
>>>>>> 21:51:03,883::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>>>> Traceback (most recent call last):
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>>>>> line 191, in _run_agent
>>>>>>     return action(he)
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
>>>>>> line 64, in action_proper
>>>>>>     return he.start_monitoring()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>> line 411, in start_monitoring
>>>>>>     self._initialize_sanlock()
>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>>>>> line 749, in _initialize_sanlock
>>>>>>     "Failed to initialize sanlock, the number of errors has"
>>>>>> SanlockInitializationError: Failed to initialize sanlock, the number
>>>>>> of errors has exceeded the limit
>>>>>>
>>>>>> MainThread::ERROR::2018-01-12
>>>>>> 21:51:03,884::agent::206::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>>>> Trying to restart agent
>>>>>> MainThread::WARNING::2018-01-12
>>>>>> 21:51:08,889::agent::209::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>>>>>> Restarting agent, attempt '1'
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:08,919::hosted_engine::242::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
>>>>>> Found certificate common name: ovirt1.telia.ru
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:08,921::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm)
>>>>>> Initializing VDSM
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:11,398::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Connecting the storage
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:11,399::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server)
>>>>>> Validating storage server
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:13,725::storage_server::239::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>>>> Connecting storage server
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:18,390::storage_server::246::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>>>> Connecting storage server
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:18,423::storage_server::253::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server)
>>>>>> Refreshing the storage domain
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:18,689::hosted_engine::663::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Preparing images
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:18,690::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images)
>>>>>> Preparing images
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,895::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Refreshing vm.conf
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,895::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
>>>>>> Reloading vm.conf from the shared storage domain
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,896::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,896::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> Extracting Engine VM OVF from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,897::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> OVF_STORE volume path:
>>>>>> /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,915::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Found an OVF for HE VM, trying to convert
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,918::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Got vm.conf from OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,919::hosted_engine::509::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
>>>>>> Initializing ha-broker connection
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,919::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Starting monitor ping, options {'addr': '80.239.162.97'}
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,922::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Success, id 140547104457680
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,922::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,936::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Success, id 140547104458064
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,936::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,938::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Success, id 140547104458448
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,939::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248d7523a6', 'address': '0'}
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,940::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Success, id 140547104457552
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,941::brokerlink::130::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248d7523a6', 'address': '0'}
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:21,942::brokerlink::141::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
>>>>>> Success, id 140547104459792
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:26,951::brokerlink::179::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(set_storage_domain)
>>>>>> Success, id 140546772847056
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:26,952::hosted_engine::601::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
>>>>>> Broker initialized, all submonitors started
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:51:27,049::hosted_engine::704::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
>>>>>> Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file:
>>>>>> /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/093faa75-5e33-4559-84fa-1f1f8d48153b/911c7637-b49d-463e-b186-23b404e50769)
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:53:48,067::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:56:14,088::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 21:58:40,111::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:06,133::hosted_engine::745::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock)
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt
>>>>>>
>>>>>>
>>>>>> agent.log from the second host:
>>>>>>
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:37,241::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Connecting the storage
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:37,242::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server)
>>>>>> Validating storage server
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:39,540::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Storage domain reported as valid and reconnect is not forced.
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:41,939::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>> Current state EngineUnexpectedlyDown (score: 0)
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,150::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
>>>>>> Reloading vm.conf from the shared storage domain
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,150::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,151::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> Extracting Engine VM OVF from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,153::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> OVF_STORE volume path:
>>>>>> /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,174::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Found an OVF for HE VM, trying to convert
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,179::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Got vm.conf from OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:52,189::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm)
>>>>>> Initializing VDSM
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:54,586::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Connecting the storage
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:54,587::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server)
>>>>>> Validating storage server
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:56,903::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Storage domain reported as valid and reconnect is not forced.
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:59,299::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>>>>>> Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:01:59,299::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>> Current state EngineUnexpectedlyDown (score: 0)
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,659::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
>>>>>> Reloading vm.conf from the shared storage domain
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,659::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,660::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> Extracting Engine VM OVF from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,663::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> OVF_STORE volume path:
>>>>>> /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5cabd8e1-5f4b-469e-becc-227469e03f5c/8048cbd7-77e2-4805-9af4-d109fa36dfcf
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,683::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Found an OVF for HE VM, trying to convert
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,688::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Got vm.conf from OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:09,698::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm)
>>>>>> Initializing VDSM
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:12,112::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Connecting the storage
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:12,113::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server)
>>>>>> Validating storage server
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:14,444::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images)
>>>>>> Storage domain reported as valid and reconnect is not forced.
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:16,859::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
>>>>>> Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:16,859::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>>>>> Current state EngineUnexpectedlyDown (score: 0)
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:27,100::config::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf)
>>>>>> Reloading vm.conf from the shared storage domain
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:27,100::config::416::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:27,101::ovf_store::132::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> Extracting Engine VM OVF from the OVF_STORE
>>>>>> MainThread::INFO::2018-01-12
>>>>>> 22:02:27,103::ovf_store::134::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>>>>>> OVF_STORE volume path:
>>>>>> /var/run/vdsm/storage/4a7f8717-9bb0-4d80-8016-498fa4b88162/5