<div dir="ltr">Hi Martin,<br><br>Thanks for the feedback. <div><br></div><div>All hosts and the hosted engine are running the 4.1.8 release.</div><div>The strange thing: I can see that the host ID is set to 1 on both hosts in the <span style="font-size:12.8px">/etc/ovirt-hosted-engine/</span><wbr style="font-size:12.8px"><span style="font-size:12.8px">hosted-engine.conf file. </span><br><span style="font-size:12.8px">I have no idea how this happened; the only thing I have changed recently is mnt_options, in order to add backup-volfile-servers <br>by using the hosted-engine --set-shared-config command. </span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">Both the agent and the broker are running on the second host: <br></span><br><div><span style="font-size:12.8px">[root@ovirt2 ovirt-hosted-engine-ha]# ps -ef | grep ovirt-ha-</span></div><div><span style="font-size:12.8px">vdsm 42331 1 26 14:40 ? 00:31:35 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon</span></div><div><span style="font-size:12.8px">vdsm 42332 1 0 14:40 ? 
00:00:16 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon</span></div></div><div><br>but I saw some tracebacks during the broker start <br><br><div>[root@ovirt2 ovirt-hosted-engine-ha]# systemctl status ovirt-ha-broker -l</div><div>● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker</div><div> Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)</div><div> Active: active (running) since Tue 2018-01-16 14:40:15 MSK; 1h 58min ago</div><div> Main PID: 42331 (ovirt-ha-broker)</div><div> CGroup: /system.slice/ovirt-ha-broker.service</div><div> └─42331 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon</div><div><br></div><div>Jan 16 14:40:15 <a href="http://ovirt2.telia.ru">ovirt2.telia.ru</a> systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.</div><div>Jan 16 14:40:15 <a href="http://ovirt2.telia.ru">ovirt2.telia.ru</a> systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...</div><div>Jan 16 14:40:16 <a href="http://ovirt2.telia.ru">ovirt2.telia.ru</a> ovirt-ha-broker[42331]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=glusterfs sd_uuid=4a7f8717-9bb0-4d80-8016-498fa4b88162'</div><div> Traceback (most recent call last):</div><div> File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle</div><div> data)</div><div> File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch</div><div> .set_storage_domain(client, sd_type, **options)</div><div> File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain</div><div> self._backends[client].connect()</div><div> File 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 462, in connect</div><div> self._dom_type)</div><div> File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path</div><div> " in {1}".format(sd_uuid, parent))</div><div> BackendFailureException: path to storage domain 4a7f8717-9bb0-4d80-8016-498fa4b88162 not found in /rhev/data-center/mnt/glusterSD</div></div><div><br></div><div><br></div><div><br></div><div>I have tried to issue hosted-engine --connect-storage on second host followed by agent & broker restart </div><div>But there is no any visible improvements.</div><div><br></div><div>Regards,</div><div>Artem</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jan 16, 2018 at 4:18 PM, Martin Sivak <span dir="ltr"><<a href="mailto:msivak@redhat.com" target="_blank">msivak@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi everybody,<br>
<br>
there are a couple of things to check here.<br>
<br>
- what version of the hosted engine agent is this? The logs look like<br>
they are coming from 4.1<br>
- what version of the engine is used?<br>
- check the host ID in /etc/ovirt-hosted-engine/<wbr>hosted-engine.conf on<br>
both hosts, the numbers must be different<br>
- it looks like the agent or broker on host 2 is not active (or there<br>
would be a report)<br>
- the second host does not see data from the first host (unknown<br>
stale-data), wait for a minute and check again, then check the storage<br>
connection<br>
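A minimal sketch of the host ID check above, assuming each host's hosted-engine.conf is a plain key=value file with a host_id entry. The function names and the sample file contents are hypothetical and only illustrate the comparison:<br>

```python
def parse_conf(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def duplicate_host_ids(confs):
    """Map each host_id used by more than one host to the hosts using it."""
    seen = {}
    for host, text in confs.items():
        host_id = parse_conf(text).get("host_id")
        seen.setdefault(host_id, []).append(host)
    return {hid: sorted(hosts) for hid, hosts in seen.items() if len(hosts) != 1}


# Hypothetical file contents, as if copied from each host by hand
confs = {
    "ovirt1": "host_id=1\n",
    "ovirt2": "host_id=1\n",
}
print(duplicate_host_ids(confs))
```

An empty result means every host has its own ID; any entry in the result names the hosts that collide.<br>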
<br>
And then the general troubleshooting:<br>
<br>
- put the hosted engine in global maintenance mode (and check that it is<br>
visible from the other host using hosted-engine --vm-status)<br>
- mount the storage domain (hosted-engine --connect-storage)<br>
- check sanlock client status to see if the proper lockspaces are present<br>
<br>
Best regards<br>
<span class="HOEnZb"><font color="#888888"><br>
Martin Sivak<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
On Tue, Jan 16, 2018 at 1:16 PM, Derek Atkins <<a href="mailto:derek@ihtfp.com">derek@ihtfp.com</a>> wrote:<br>
> Why are both hosts reporting as ovirt 1?<br>
> Look at the hostname fields to see what I mean.<br>
><br>
> -derek<br>
> Sent using my mobile device. Please excuse any typos.<br>
><br>
> On January 16, 2018 7:11:09 AM Artem Tambovskiy <<a href="mailto:artem.tambovskiy@gmail.com">artem.tambovskiy@gmail.com</a>><br>
> wrote:<br>
>><br>
>> Hello,<br>
>><br>
>> Yes, I followed exactly the same procedure while reinstalling the hosts<br>
>> (the only difference is that I have an SSH key configured instead of the<br>
>> password).<br>
>><br>
>> I just reinstalled the second host one more time; after 20 min the host<br>
>> still hasn't reached the active score of 3400 (Hosted Engine HA: Not Active) and<br>
>> I still don't see the crown icon for this host.<br>
>><br>
>> hosted-engine --vm-status from ovirt1 host<br>
>><br>
>> [root@ovirt1 ~]# hosted-engine --vm-status<br>
>><br>
>><br>
>> --== Host 1 status ==--<br>
>><br>
>> conf_on_shared_storage : True<br>
>> Status up-to-date : True<br>
>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>> Host ID : 1<br>
>> Engine status : {"health": "good", "vm": "up",<br>
>> "detail": "up"}<br>
>> Score : 3400<br>
>> stopped : False<br>
>> Local maintenance : False<br>
>> crc32 : 3f94156a<br>
>> local_conf_timestamp : 349144<br>
>> Host timestamp : 349144<br>
>> Extra metadata (valid at timestamp):<br>
>> metadata_parse_version=1<br>
>> metadata_feature_version=1<br>
>> timestamp=349144 (Tue Jan 16 15:03:45 2018)<br>
>> host-id=1<br>
>> score=3400<br>
>> vm_conf_refresh_time=349144 (Tue Jan 16 15:03:45 2018)<br>
>> conf_on_shared_storage=True<br>
>> maintenance=False<br>
>> state=EngineUp<br>
>> stopped=False<br>
>><br>
>><br>
>> --== Host 2 status ==--<br>
>><br>
>> conf_on_shared_storage : True<br>
>> Status up-to-date : False<br>
>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>> Host ID : 2<br>
>> Engine status : unknown stale-data<br>
>> Score : 0<br>
>> stopped : True<br>
>> Local maintenance : False<br>
>> crc32 : c7037c03<br>
>> local_conf_timestamp : 7530<br>
>> Host timestamp : 7530<br>
>> Extra metadata (valid at timestamp):<br>
>> metadata_parse_version=1<br>
>> metadata_feature_version=1<br>
>> timestamp=7530 (Fri Jan 12 16:10:12 2018)<br>
>> host-id=2<br>
>> score=0<br>
>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)<br>
>> conf_on_shared_storage=True<br>
>> maintenance=False<br>
>> state=AgentStopped<br>
>> stopped=True<br>
>><br>
>><br>
>> hosted-engine --vm-status output from ovirt2 host<br>
>><br>
>> [root@ovirt2 ovirt-hosted-engine-ha]# hosted-engine --vm-status<br>
>><br>
>><br>
>> --== Host 1 status ==--<br>
>><br>
>> conf_on_shared_storage : True<br>
>> Status up-to-date : False<br>
>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>> Host ID : 1<br>
>> Engine status : unknown stale-data<br>
>> Score : 3400<br>
>> stopped : False<br>
>> Local maintenance : False<br>
>> crc32 : 6d3606f1<br>
>> local_conf_timestamp : 349264<br>
>> Host timestamp : 349264<br>
>> Extra metadata (valid at timestamp):<br>
>> metadata_parse_version=1<br>
>> metadata_feature_version=1<br>
>> timestamp=349264 (Tue Jan 16 15:05:45 2018)<br>
>> host-id=1<br>
>> score=3400<br>
>> vm_conf_refresh_time=349264 (Tue Jan 16 15:05:45 2018)<br>
>> conf_on_shared_storage=True<br>
>> maintenance=False<br>
>> state=EngineUp<br>
>> stopped=False<br>
>><br>
>><br>
>> --== Host 2 status ==--<br>
>><br>
>> conf_on_shared_storage : True<br>
>> Status up-to-date : False<br>
>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>> Host ID : 2<br>
>> Engine status : unknown stale-data<br>
>> Score : 0<br>
>> stopped : True<br>
>> Local maintenance : False<br>
>> crc32 : c7037c03<br>
>> local_conf_timestamp : 7530<br>
>> Host timestamp : 7530<br>
>> Extra metadata (valid at timestamp):<br>
>> metadata_parse_version=1<br>
>> metadata_feature_version=1<br>
>> timestamp=7530 (Fri Jan 12 16:10:12 2018)<br>
>> host-id=2<br>
>> score=0<br>
>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)<br>
>> conf_on_shared_storage=True<br>
>> maintenance=False<br>
>> state=AgentStopped<br>
>> stopped=True<br>
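A small sketch of how the duplicate hostname in the status output above could be spotted automatically. It assumes the raw "Key : value" text printed by hosted-engine --vm-status (without mail quoting or HTML); the function name and the shortened sample are illustrative only:<br>

```python
import re
from collections import defaultdict

# Matches only the two fields of interest, e.g. "Host ID   : 2"
FIELD = re.compile(r"^\s*(Hostname|Host ID)\s*:\s*(\S+)\s*$")


def hosts_by_hostname(status_text):
    """Group reported host IDs by the hostname printed just before them."""
    groups = defaultdict(list)
    hostname = None
    for line in status_text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key == "Hostname":
            hostname = value
        elif hostname is not None:
            groups[hostname].append(int(value))
            hostname = None
    return dict(groups)


# Shortened sample in the same layout as the output quoted in this thread
sample = """
--== Host 1 status ==--

Hostname                           : ovirt1.telia.ru
Host ID                            : 1

--== Host 2 status ==--

Hostname                           : ovirt1.telia.ru
Host ID                            : 2
"""

result = hosts_by_hostname(sample)
duplicates = {name: ids for name, ids in result.items() if len(ids) != 1}
print(duplicates)
```

Any hostname that maps to more than one host ID, as here, points at stale or wrongly written metadata for one of the hosts.<br>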
>><br>
>><br>
>> Also I saw some log messages in the web GUI about time drift, like<br>
>><br>
>> "Host <a href="http://ovirt2.telia.ru" rel="noreferrer" target="_blank">ovirt2.telia.ru</a> has time-drift of 5305 seconds while maximum<br>
>> configured value is 300 seconds." That is a bit weird, as I haven't touched any<br>
>> time settings since I installed the cluster.<br>
>> Both hosts have the same time and timezone (MSK), but the hosted engine lives in<br>
>> the UTC timezone. Is it mandatory to have everything in sync and in the same<br>
>> timezone?<br>
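As general background on the timezone question (a sketch, not a description of how the engine itself computes drift): a difference between two clocks is normally measured between absolute instants, so a timezone offset alone does not count as drift. The timestamps below are made up for illustration:<br>

```python
from datetime import datetime, timedelta, timezone

# MSK is UTC+3; the fixed offset is enough for this illustration
MSK = timezone(timedelta(hours=3), "MSK")


def drift_seconds(a, b):
    """Absolute difference between two timezone-aware datetimes, in seconds."""
    return abs((a - b).total_seconds())


# The same instant expressed in two timezones: no drift
host_clock = datetime(2018, 1, 16, 15, 3, 45, tzinfo=MSK)
engine_clock = datetime(2018, 1, 16, 12, 3, 45, tzinfo=timezone.utc)
print(drift_seconds(host_clock, engine_clock))

# A host clock that really is 5305 seconds ahead shows up as drift
skewed = host_clock + timedelta(seconds=5305)
print(drift_seconds(skewed, engine_clock))
```

Under that assumption, a reported drift of 5305 seconds would suggest the clocks themselves disagree rather than the timezones.<br>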
>><br>
>> Regards,<br>
>> Artem<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> On Tue, Jan 16, 2018 at 2:20 PM, Kasturi Narra <<a href="mailto:knarra@redhat.com">knarra@redhat.com</a>> wrote:<br>
>>><br>
>>> Hello,<br>
>>><br>
>>> I now see that your hosted engine is up and running. Can you let me<br>
>>> know how you tried reinstalling the host? Below is the procedure that is<br>
>>> used; I hope you did not miss any step while reinstalling. If you did, can you<br>
>>> try reinstalling again and see if that works?<br>
>>><br>
>>> 1) Move the host to maintenance<br>
>>> 2) click on reinstall<br>
>>> 3) provide the password<br>
>>> 4) uncheck 'automatically configure host firewall'<br>
>>> 5) click on 'Deploy' tab<br>
>>> 6) click Hosted Engine deployment as 'Deploy'<br>
>>><br>
>>> And once the host installation is done, wait till the active score of the<br>
>>> host shows 3400 in the general tab then check hosted-engine --vm-status.<br>
>>><br>
>>> Thanks<br>
>>> kasturi<br>
>>><br>
>>> On Mon, Jan 15, 2018 at 4:57 PM, Artem Tambovskiy<br>
>>> <<a href="mailto:artem.tambovskiy@gmail.com">artem.tambovskiy@gmail.com</a>> wrote:<br>
>>>><br>
>>>> Hello,<br>
>>>><br>
>>>> I have uploaded 2 archives with all relevant logs to shared hosting:<br>
>>>> files from host 1 (which is currently running all VMs, including<br>
>>>> hosted_engine) - <a href="https://yadi.sk/d/PttRoYV63RTvhK" rel="noreferrer" target="_blank">https://yadi.sk/d/<wbr>PttRoYV63RTvhK</a><br>
>>>> files from the second host - <a href="https://yadi.sk/d/UBducEsV3RTvhc" rel="noreferrer" target="_blank">https://yadi.sk/d/<wbr>UBducEsV3RTvhc</a><br>
>>>><br>
>>>> I have tried to restart both ovirt-ha-agent and ovirt-ha-broker, but it<br>
>>>> had no effect. I have also tried to shut down the hosted_engine VM, stop the<br>
>>>> ovirt-ha-agent and ovirt-ha-broker services, disconnect the storage and connect<br>
>>>> it again - no effect as well.<br>
>>>> I also tried to reinstall the second host from the web GUI - this led to an<br>
>>>> interesting situation - now hosted-engine --vm-status shows that both<br>
>>>> hosts have the same address.<br>
>>>><br>
>>>> [root@ovirt1 ~]# hosted-engine --vm-status<br>
>>>><br>
>>>> --== Host 1 status ==--<br>
>>>><br>
>>>> conf_on_shared_storage : True<br>
>>>> Status up-to-date : True<br>
>>>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>>>> Host ID : 1<br>
>>>> Engine status : {"health": "good", "vm": "up",<br>
>>>> "detail": "up"}<br>
>>>> Score : 3400<br>
>>>> stopped : False<br>
>>>> Local maintenance : False<br>
>>>> crc32 : a7758085<br>
>>>> local_conf_timestamp : 259327<br>
>>>> Host timestamp : 259327<br>
>>>> Extra metadata (valid at timestamp):<br>
>>>> metadata_parse_version=1<br>
>>>> metadata_feature_version=1<br>
>>>> timestamp=259327 (Mon Jan 15 14:06:48 2018)<br>
>>>> host-id=1<br>
>>>> score=3400<br>
>>>> vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)<br>
>>>> conf_on_shared_storage=True<br>
>>>> maintenance=False<br>
>>>> state=EngineUp<br>
>>>> stopped=False<br>
>>>><br>
>>>><br>
>>>> --== Host 2 status ==--<br>
>>>><br>
>>>> conf_on_shared_storage : True<br>
>>>> Status up-to-date : False<br>
>>>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>>>> Host ID : 2<br>
>>>> Engine status : unknown stale-data<br>
>>>> Score : 0<br>
>>>> stopped : True<br>
>>>> Local maintenance : False<br>
>>>> crc32 : c7037c03<br>
>>>> local_conf_timestamp : 7530<br>
>>>> Host timestamp : 7530<br>
>>>> Extra metadata (valid at timestamp):<br>
>>>> metadata_parse_version=1<br>
>>>> metadata_feature_version=1<br>
>>>> timestamp=7530 (Fri Jan 12 16:10:12 2018)<br>
>>>> host-id=2<br>
>>>> score=0<br>
>>>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)<br>
>>>> conf_on_shared_storage=True<br>
>>>> maintenance=False<br>
>>>> state=AgentStopped<br>
>>>> stopped=True<br>
>>>><br>
>>>> Gluster seems to be working fine. All gluster nodes are showing connected state.<br>
>>>><br>
>>>> Any advice on how to resolve this situation is highly appreciated!<br>
>>>><br>
>>>> Regards,<br>
>>>> Artem<br>
>>>><br>
>>>><br>
>>>> On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <<a href="mailto:knarra@redhat.com">knarra@redhat.com</a>><br>
>>>> wrote:<br>
>>>>><br>
>>>>> Hello Artem,<br>
>>>>><br>
>>>>> Can you check if the glusterd service is running on host1 and all<br>
>>>>> the peers are in connected state? If yes, can you restart the ovirt-ha-agent<br>
>>>>> and broker services and check if things are working fine?<br>
>>>>><br>
>>>>> Thanks<br>
>>>>> kasturi<br>
>>>>><br>
>>>>> On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy<br>
>>>>> <<a href="mailto:artem.tambovskiy@gmail.com">artem.tambovskiy@gmail.com</a>> wrote:<br>
>>>>>><br>
>>>>>> I explored the logs on both hosts.<br>
>>>>>> broker.log shows no errors.<br>
>>>>>><br>
>>>>>> agent.log does not look good:<br>
>>>>>><br>
>>>>>> on host1 (which is running the hosted engine):<br>
>>>>>><br>
>>>>>> MainThread::ERROR::2018-01-12<br>
>>>>>> 21:51:03,883::agent::205::<wbr>ovirt_hosted_engine_ha.agent.<wbr>agent.Agent::(_run_agent)<br>
>>>>>> Traceback (most recent call last):<br>
>>>>>> File<br>
>>>>>> "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/agent.py",<br>
>>>>>> line 191, in _run_agent<br>
>>>>>> return action(he)<br>
>>>>>> File<br>
>>>>>> "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/agent.py",<br>
>>>>>> line 64, in action_proper<br>
>>>>>> return he.start_monitoring()<br>
>>>>>> File<br>
>>>>>> "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/hosted_engine.py",<br>
>>>>>> line 411, in start_monitoring<br>
>>>>>> self._initialize_sanlock()<br>
>>>>>> File<br>
>>>>>> "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/hosted_engine.py",<br>
>>>>>> line 749, in _initialize_sanlock<br>
>>>>>> "Failed to initialize sanlock, the number of errors has"<br>
>>>>>> SanlockInitializationError: Failed to initialize sanlock, the number<br>
>>>>>> of errors has exceeded the limit<br>
>>>>>><br>
>>>>>> MainThread::ERROR::2018-01-12<br>
>>>>>> 21:51:03,884::agent::206::<wbr>ovirt_hosted_engine_ha.agent.<wbr>agent.Agent::(_run_agent)<br>
>>>>>> Trying to restart agent<br>
>>>>>> MainThread::WARNING::2018-01-<wbr>12<br>
>>>>>> 21:51:08,889::agent::209::<wbr>ovirt_hosted_engine_ha.agent.<wbr>agent.Agent::(_run_agent)<br>
>>>>>> Restarting agent, attempt '1'<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:08,919::hosted_engine::<wbr>242::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_get_hostname)<br>
>>>>>> Found certificate common name: <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:08,921::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm)<br>
>>>>>> Initializing VDSM<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:11,398::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Connecting the storage<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:11,399::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server)<br>
>>>>>> Validating storage server<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:13,725::storage_server::<wbr>239::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(connect_<wbr>storage_server)<br>
>>>>>> Connecting storage server<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:18,390::storage_server::<wbr>246::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(connect_<wbr>storage_server)<br>
>>>>>> Connecting storage server<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:18,423::storage_server::<wbr>253::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(connect_<wbr>storage_server)<br>
>>>>>> Refreshing the storage domain<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:18,689::hosted_engine::<wbr>663::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Preparing images<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:18,690::image::126::<wbr>ovirt_hosted_engine_ha.lib.<wbr>image.Image::(prepare_images)<br>
>>>>>> Preparing images<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,895::hosted_engine::<wbr>666::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Refreshing vm.conf<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,895::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf)<br>
>>>>>> Reloading vm.conf from the shared storage domain<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,896::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,896::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> Extracting Engine VM OVF from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,897::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> OVF_STORE volume path:<br>
>>>>>> /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,915::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Found an OVF for HE VM, trying to convert<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,918::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Got vm.conf from OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,919::hosted_engine::<wbr>509::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>broker)<br>
>>>>>> Initializing ha-broker connection<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,919::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Starting monitor ping, options {'addr': '80.239.162.97'}<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,922::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Success, id 140547104457680<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,922::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name':<br>
>>>>>> 'ovirtmgmt', 'address': '0'}<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,936::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Success, id 140547104458064<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,936::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,938::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Success, id 140547104458448<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,939::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid':<br>
>>>>>> 'b366e466-b0ea-4a09-866b-<wbr>d0248d7523a6', 'address': '0'}<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,940::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Success, id 140547104457552<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,941::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid':<br>
>>>>>> 'b366e466-b0ea-4a09-866b-<wbr>d0248d7523a6', 'address': '0'}<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:21,942::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor)<br>
>>>>>> Success, id 140547104459792<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:26,951::brokerlink::179:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(set_<wbr>storage_domain)<br>
>>>>>> Success, id 140546772847056<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:26,952::hosted_engine::<wbr>601::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>broker)<br>
>>>>>> Broker initialized, all submonitors started<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:51:27,049::hosted_engine::<wbr>704::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock)<br>
>>>>>> Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file:<br>
>>>>>> /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/093faa75-5e33-<wbr>4559-84fa-1f1f8d48153b/<wbr>911c7637-b49d-463e-b186-<wbr>23b404e50769)<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:53:48,067::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock)<br>
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:56:14,088::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock)<br>
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 21:58:40,111::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock)<br>
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:06,133::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock)<br>
>>>>>> Failed to acquire the lock. Waiting '5's before the next attempt<br>
>>>>>><br>
>>>>>><br>
>>>>>> agent.log from the second host:<br>
>>>>>><br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:37,241::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Connecting the storage<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:37,242::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server)<br>
>>>>>> Validating storage server<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:39,540::hosted_engine::<wbr>639::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Storage domain reported as valid and reconnect is not forced.<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:41,939::hosted_engine::<wbr>453::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(start_<wbr>monitoring)<br>
>>>>>> Current state EngineUnexpectedlyDown (score: 0)<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,150::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf)<br>
>>>>>> Reloading vm.conf from the shared storage domain<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,150::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,151::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> Extracting Engine VM OVF from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,153::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> OVF_STORE volume path:<br>
>>>>>> /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,174::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Found an OVF for HE VM, trying to convert<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,179::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Got vm.conf from OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:52,189::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm)<br>
>>>>>> Initializing VDSM<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:54,586::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Connecting the storage<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:54,587::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server)<br>
>>>>>> Validating storage server<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:56,903::hosted_engine::<wbr>639::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Storage domain reported as valid and reconnect is not forced.<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:59,299::states::682::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine::(<wbr>score)<br>
>>>>>> Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:01:59,299::hosted_engine::<wbr>453::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(start_<wbr>monitoring)<br>
>>>>>> Current state EngineUnexpectedlyDown (score: 0)<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,659::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf)<br>
>>>>>> Reloading vm.conf from the shared storage domain<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,659::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,660::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> Extracting Engine VM OVF from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,663::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> OVF_STORE volume path:<br>
>>>>>> /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,683::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Found an OVF for HE VM, trying to convert<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,688::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Got vm.conf from OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:09,698::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm)<br>
>>>>>> Initializing VDSM<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:12,112::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Connecting the storage<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:12,113::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server)<br>
>>>>>> Validating storage server<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:14,444::hosted_engine::<wbr>639::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Storage domain reported as valid and reconnect is not forced.<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:16,859::states::682::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine::(<wbr>score)<br>
>>>>>> Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:16,859::hosted_engine::<wbr>453::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(start_<wbr>monitoring)<br>
>>>>>> Current state EngineUnexpectedlyDown (score: 0)<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,100::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf)<br>
>>>>>> Reloading vm.conf from the shared storage domain<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,100::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Trying to get a fresher copy of vm configuration from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,101::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> Extracting Engine VM OVF from the OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,103::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF)<br>
>>>>>> OVF_STORE volume path:<br>
>>>>>> /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,125::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Found an OVF for HE VM, trying to convert<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,129::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store)<br>
>>>>>> Got vm.conf from OVF_STORE<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,130::states::667::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine::(<wbr>consume)<br>
>>>>>> Engine down, local host does not have best score<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:27,139::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm)<br>
>>>>>> Initializing VDSM<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:29,584::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images)<br>
>>>>>> Connecting the storage<br>
>>>>>> MainThread::INFO::2018-01-12<br>
>>>>>> 22:02:29,586::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server)<br>
>>>>>> Validating storage server<br>
>>>>>><br>
>>>>>><br>
>>>>>> Any suggestions on how to resolve this?<br>
>>>>>><br>
>>>>>> regards,<br>
>>>>>> Artem<br>
>>>>>><br>
>>>>>><br>
>>>>>> On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy<br>
>>>>>> <<a href="mailto:artem.tambovskiy@gmail.com">artem.tambovskiy@gmail.com</a>> wrote:<br>
>>>>>>><br>
>>>>>>> While trying to fix one thing, I broke another :(<br>
>>>>>>><br>
>>>>>>> I fixed mnt_options for the hosted engine storage domain and installed the<br>
>>>>>>> latest security patches on my hosts and the hosted engine. All VMs are up and<br>
>>>>>>> running, but hosted-engine --vm-status reports issues:<br>
>>>>>>><br>
>>>>>>> [root@ovirt1 ~]# hosted-engine --vm-status<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> --== Host 1 status ==--<br>
>>>>>>><br>
>>>>>>> conf_on_shared_storage : True<br>
>>>>>>> Status up-to-date : False<br>
>>>>>>> Hostname : ovirt2<br>
>>>>>>> Host ID : 1<br>
>>>>>>> Engine status : unknown stale-data<br>
>>>>>>> Score : 0<br>
>>>>>>> stopped : False<br>
>>>>>>> Local maintenance : False<br>
>>>>>>> crc32 : 193164b8<br>
>>>>>>> local_conf_timestamp : 8350<br>
>>>>>>> Host timestamp : 8350<br>
>>>>>>> Extra metadata (valid at timestamp):<br>
>>>>>>> metadata_parse_version=1<br>
>>>>>>> metadata_feature_version=1<br>
>>>>>>> timestamp=8350 (Fri Jan 12 19:03:54 2018)<br>
>>>>>>> host-id=1<br>
>>>>>>> score=0<br>
>>>>>>> vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)<br>
>>>>>>> conf_on_shared_storage=True<br>
>>>>>>> maintenance=False<br>
>>>>>>> state=EngineUnexpectedlyDown<br>
>>>>>>> stopped=False<br>
>>>>>>> timeout=Thu Jan 1 05:24:43 1970<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> --== Host 2 status ==--<br>
>>>>>>><br>
>>>>>>> conf_on_shared_storage : True<br>
>>>>>>> Status up-to-date : False<br>
>>>>>>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>>>>>>> Host ID : 2<br>
>>>>>>> Engine status : unknown stale-data<br>
>>>>>>> Score : 0<br>
>>>>>>> stopped : True<br>
>>>>>>> Local maintenance : False<br>
>>>>>>> crc32 : c7037c03<br>
>>>>>>> local_conf_timestamp : 7530<br>
>>>>>>> Host timestamp : 7530<br>
>>>>>>> Extra metadata (valid at timestamp):<br>
>>>>>>> metadata_parse_version=1<br>
>>>>>>> metadata_feature_version=1<br>
>>>>>>> timestamp=7530 (Fri Jan 12 16:10:12 2018)<br>
>>>>>>> host-id=2<br>
>>>>>>> score=0<br>
>>>>>>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)<br>
>>>>>>> conf_on_shared_storage=True<br>
>>>>>>> maintenance=False<br>
>>>>>>> state=AgentStopped<br>
>>>>>>> stopped=True<br>
>>>>>>> [root@ovirt1 ~]#<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> From the second host, the situation looks a bit different:<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> [root@ovirt2 ~]# hosted-engine --vm-status<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> --== Host 1 status ==--<br>
>>>>>>><br>
>>>>>>> conf_on_shared_storage : True<br>
>>>>>>> Status up-to-date : True<br>
>>>>>>> Hostname : ovirt2<br>
>>>>>>> Host ID : 1<br>
>>>>>>> Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}<br>
>>>>>>> Score : 0<br>
>>>>>>> stopped : False<br>
>>>>>>> Local maintenance : False<br>
>>>>>>> crc32 : 78eabdb6<br>
>>>>>>> local_conf_timestamp : 8403<br>
>>>>>>> Host timestamp : 8402<br>
>>>>>>> Extra metadata (valid at timestamp):<br>
>>>>>>> metadata_parse_version=1<br>
>>>>>>> metadata_feature_version=1<br>
>>>>>>> timestamp=8402 (Fri Jan 12 19:04:47 2018)<br>
>>>>>>> host-id=1<br>
>>>>>>> score=0<br>
>>>>>>> vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)<br>
>>>>>>> conf_on_shared_storage=True<br>
>>>>>>> maintenance=False<br>
>>>>>>> state=EngineUnexpectedlyDown<br>
>>>>>>> stopped=False<br>
>>>>>>> timeout=Thu Jan 1 05:24:43 1970<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> --== Host 2 status ==--<br>
>>>>>>><br>
>>>>>>> conf_on_shared_storage : True<br>
>>>>>>> Status up-to-date : False<br>
>>>>>>> Hostname : <a href="http://ovirt1.telia.ru" rel="noreferrer" target="_blank">ovirt1.telia.ru</a><br>
>>>>>>> Host ID : 2<br>
>>>>>>> Engine status : unknown stale-data<br>
>>>>>>> Score : 0<br>
>>>>>>> stopped : True<br>
>>>>>>> Local maintenance : False<br>
>>>>>>> crc32 : c7037c03<br>
>>>>>>> local_conf_timestamp : 7530<br>
>>>>>>> Host timestamp : 7530<br>
>>>>>>> Extra metadata (valid at timestamp):<br>
>>>>>>> metadata_parse_version=1<br>
>>>>>>> metadata_feature_version=1<br>
>>>>>>> timestamp=7530 (Fri Jan 12 16:10:12 2018)<br>
>>>>>>> host-id=2<br>
>>>>>>> score=0<br>
>>>>>>> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)<br>
>>>>>>> conf_on_shared_storage=True<br>
>>>>>>> maintenance=False<br>
>>>>>>> state=AgentStopped<br>
>>>>>>> stopped=True<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> The WebGUI shows that the engine is running on host ovirt1.<br>
>>>>>>> Gluster looks fine:<br>
>>>>>>> [root@ovirt1 ~]# gluster volume status engine<br>
>>>>>>> Status of volume: engine<br>
>>>>>>> Gluster process TCP Port RDMA Port Online Pid<br>
>>>>>>> ------------------------------<wbr>------------------------------<wbr>------------------<br>
>>>>>>> Brick ovirt1.telia.ru:/oVirt/engine 49169 0 Y 3244<br>
>>>>>>> Brick ovirt2.telia.ru:/oVirt/engine 49179 0 Y 20372<br>
>>>>>>> Brick ovirt3.telia.ru:/oVirt/engine 49206 0 Y 16609<br>
>>>>>>> Self-heal Daemon on localhost N/A N/A Y 117868<br>
>>>>>>> Self-heal Daemon on <a href="http://ovirt2.telia.ru" rel="noreferrer" target="_blank">ovirt2.telia.ru</a> N/A N/A Y 20521<br>
>>>>>>> Self-heal Daemon on ovirt3 N/A N/A Y 25093<br>
>>>>>>><br>
>>>>>>> Task Status of Volume engine<br>
>>>>>>><br>
>>>>>>> ------------------------------<wbr>------------------------------<wbr>------------------<br>
>>>>>>> There are no active volume tasks<br>
>>>>>>><br>
>>>>>>> How to resolve this issue?<br>
>>>>>>><br>
>>>>>>><br>
>>>>>>> ______________________________<wbr>_________________<br>
>>>>>>> Users mailing list<br>
>>>>>>> <a href="mailto:Users@ovirt.org">Users@ovirt.org</a><br>
>>>>>>> <a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/<wbr>mailman/listinfo/users</a><br>
>>>>>>><br>
>>>>>><br>
>>>>>><br>
>>>>>><br>
>>>>><br>
>>>><br>
>>><br>
>><br>
>><br>
><br>
><br>
</div></div></blockquote></div><br></div>