<div dir="ltr">Hello,<br><br>Yes, I followed exactly the same procedure while reinstalling the hosts (the only difference is that I have an SSH key configured instead of a password). <div><br></div><div>I just reinstalled the second host one more time. After 20 minutes the host still hasn't reached an active score of 3400 (Hosted Engine HA: Not Active), and I still don't see the crown icon for this host. <br><br>hosted-engine --vm-status output from the ovirt1 host: <br><br><div>[root@ovirt1 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : True</div><div>Hostname : <a href="http://ovirt1.telia.ru">ovirt1.telia.ru</a></div><div>Host ID : 1</div><div>Engine status : {"health": "good", "vm": "up", "detail": "up"}</div><div>Score : 3400</div><div>stopped : False</div><div>Local maintenance : False</div><div>crc32 : 3f94156a</div><div>local_conf_timestamp : 349144</div><div>Host timestamp : 349144</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=349144 (Tue Jan 16 15:03:45 2018)</div><div> host-id=1</div><div> score=3400</div><div> vm_conf_refresh_time=349144 (Tue Jan 16 15:03:45 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUp</div><div> stopped=False</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> 
metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> stopped=True</div><br><br>hosted-engine --vm-status output from ovirt2 host </div><div><br><div>[root@ovirt2 ovirt-hosted-engine-ha]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru">ovirt1.telia.ru</a></div><div>Host ID : 1</div><div>Engine status : unknown stale-data</div><div>Score : 3400</div><div>stopped : False</div><div>Local maintenance : False</div><div>crc32 : 6d3606f1</div><div>local_conf_timestamp : 349264</div><div>Host timestamp : 349264</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=349264 (Tue Jan 16 15:05:45 2018)</div><div> host-id=1</div><div> score=3400</div><div> vm_conf_refresh_time=349264 (Tue Jan 16 15:05:45 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUp</div><div> stopped=False</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> 
host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> stopped=True</div><div><br></div><div><br></div><div>I also saw some log messages in the WebGUI about time drift, such as: <br><br>"Host <a href="http://ovirt2.telia.ru">ovirt2.telia.ru</a> has time-drift of 5305 seconds while maximum configured value is 300 seconds." That is a bit weird, as I haven't touched any time settings since I installed the cluster. <br>Both hosts have the same time and timezone (MSK), but the hosted engine lives in the UTC timezone. Is it mandatory to have everything in sync and in the same timezone?</div><div><br></div>Regards,</div><div>Artem<br><br><br><br><br><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jan 16, 2018 at 2:20 PM, Kasturi Narra <span dir="ltr"><<a href="mailto:knarra@redhat.com" target="_blank">knarra@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello,<div> </div><div> I now see that your hosted engine is up and running. Can you let me know how you tried reinstalling the host? Below is the procedure that is normally used; I hope you did not miss any step while reinstalling. If you did miss a step, can you try reinstalling again and see if that works?</div><div><br></div><div>1) Move the host to maintenance</div><div>2) Click on Reinstall</div><div>3) Provide the password</div><div>4) Uncheck 'automatically configure host firewall'</div><div>5) Click on the 'Deploy' tab</div><div>6) Set Hosted Engine deployment to 'Deploy'</div><div><br></div><div>Once the host installation is done, wait until the active score of the host shows 3400 in the General tab, then check hosted-engine --vm-status. 
</div><div><br></div><div>Thanks</div><span class="gmail-HOEnZb"><font color="#888888"><div>kasturi</div></font></span></div><div class="gmail-HOEnZb"><div class="gmail-h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 15, 2018 at 4:57 PM, Artem Tambovskiy <span dir="ltr"><<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello,<br><br>I have uploaded two archives with all relevant logs to shared hosting:<br>files from host 1 (which is currently running all VMs, including hosted_engine) - <span style="color:rgb(0,0,0);font-family:Arial,sans-serif;font-size:15.0016px"><a href="https://yadi.sk/d/PttRoYV63RTvhK" target="_blank">https://yadi.sk/d/PttRoYV63RT<wbr>vhK</a><br>files from the second host - </span><span style="color:rgb(0,0,0);font-family:Arial,sans-serif;font-size:15.0016px"><a href="https://yadi.sk/d/UBducEsV3RTvhc" target="_blank">https://yadi.sk/d/UBducEsV3R<wbr>Tvhc</a> <br></span><span style="color:rgb(0,0,0);font-family:Arial,sans-serif;font-size:15.0016px"><br>I have tried restarting both </span>ovirt-ha-agent and ovirt-ha-broker, but it had no effect. I also tried shutting down the hosted_engine VM, stopping the ovirt-ha-agent and ovirt-ha-broker services, and disconnecting and reconnecting the storage, also with no effect. <br>I also tried reinstalling the second host from the WebGUI, which led to an interesting situation: hosted-engine --vm-status now shows that both hosts have the same address. 
<div><br></div><div><span><div>[root@ovirt1 ~]# hosted-engine --vm-status </div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div></span><div>Status up-to-date : True</div><div>Hostname : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID : 1</div><div>Engine status : {"health": "good", "vm": "up", "detail": "up"}</div><div>Score : 3400</div><span><div>stopped : False</div><div>Local maintenance : False</div></span><div>crc32 : a7758085</div><div>local_conf_timestamp : 259327</div><div>Host timestamp : 259327</div><span><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div></span><div> timestamp=259327 (Mon Jan 15 14:06:48 2018)</div><div> host-id=1</div><div> score=3400</div><div> vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUp</div><div> stopped=False</div><span><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> stopped=True</div><div><br></div></span><div>Gluster seems working fine. 
All gluster nodes show a connected state.</div><div><br></div><div>Any advice on how to resolve this situation is highly appreciated!<br><br></div><div>Regards,</div><div>Artem</div><div><br></div></div></div><div class="gmail-m_4720594609300657870HOEnZb"><div class="gmail-m_4720594609300657870h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <span dir="ltr"><<a href="mailto:knarra@redhat.com" target="_blank">knarra@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello Artem,<div><br></div><div> Can you check whether the glusterd service is running on host1 and all the peers are in the connected state? If yes, can you restart the ovirt-ha-agent and ovirt-ha-broker services and check if things are working fine?</div><div><br></div><div>Thanks</div><span class="gmail-m_4720594609300657870m_-3199546786741141787HOEnZb"><font color="#888888"><div>kasturi</div></font></span></div><div class="gmail-m_4720594609300657870m_-3199546786741141787HOEnZb"><div class="gmail-m_4720594609300657870m_-3199546786741141787h5"><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy <span dir="ltr"><<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I explored the logs on both hosts. 
<div>broker.log shows no errors.</div><div><br></div><div>agent.log looking not good:<br><br>on host1 (which running hosted engine) :</div><div><br></div><div><div>MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::ovir<wbr>t_hosted_engine_ha.agent.agent<wbr>.Agent::(_run_agent) Traceback (most recent call last):</div><div> File "/usr/lib/python2.7/site-packa<wbr>ges/ovirt_hosted_engine_ha/age<wbr>nt/agent.py", line 191, in _run_agent</div><div> return action(he)</div><div> File "/usr/lib/python2.7/site-packa<wbr>ges/ovirt_hosted_engine_ha/age<wbr>nt/agent.py", line 64, in action_proper</div><div> return he.start_monitoring()</div><div> File "/usr/lib/python2.7/site-packa<wbr>ges/ovirt_hosted_engine_ha/age<wbr>nt/hosted_engine.py", line 411, in start_monitoring</div><div> self._initialize_sanlock()</div><div> File "/usr/lib/python2.7/site-packa<wbr>ges/ovirt_hosted_engine_ha/age<wbr>nt/hosted_engine.py", line 749, in _initialize_sanlock</div><div> "Failed to initialize sanlock, the number of errors has"</div><div>SanlockInitializationError: Failed to initialize sanlock, the number of errors has exceeded the limit</div><div><br></div><div>MainThread::ERROR::2018-01-12 21:51:03,884::agent::206::ovir<wbr>t_hosted_engine_ha.agent.agent<wbr>.Agent::(_run_agent) Trying to restart agent</div><div>MainThread::WARNING::2018-01-1<wbr>2 21:51:08,889::agent::209::ovir<wbr>t_hosted_engine_ha.agent.agent<wbr>.Agent::(_run_agent) Restarting agent, attempt '1'</div><div>MainThread::INFO::2018-01-12 21:51:08,919::hosted_engine::2<wbr>42::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_get_hostname) Found certificate common name: <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>MainThread::INFO::2018-01-12 21:51:08,921::hosted_engine::6<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 
21:51:11,398::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 21:51:11,399::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 21:51:13,725::storage_server::<wbr>239::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(connect_storage_server) Connecting storage server</div><div>MainThread::INFO::2018-01-12 21:51:18,390::storage_server::<wbr>246::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(connect_storage_server) Connecting storage server</div><div>MainThread::INFO::2018-01-12 21:51:18,423::storage_server::<wbr>253::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(connect_storage_server) Refreshing the storage domain</div><div>MainThread::INFO::2018-01-12 21:51:18,689::hosted_engine::6<wbr>63::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Preparing images</div><div>MainThread::INFO::2018-01-12 21:51:18,690::image::126::ovir<wbr>t_hosted_engine_ha.lib.image.I<wbr>mage::(prepare_images) Preparing images</div><div>MainThread::INFO::2018-01-12 21:51:21,895::hosted_engine::6<wbr>66::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Refreshing vm.conf</div><div>MainThread::INFO::2018-01-12 21:51:21,895::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 
21:51:21,896::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,896::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,897::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/5<wbr>cabd8e1-5f4b-469e-becc-227469e<wbr>03f5c/8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 21:51:21,915::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 21:51:21,918::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,919::hosted_engine::5<wbr>09::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_broker) Initializing ha-broker connection</div><div>MainThread::INFO::2018-01-12 21:51:21,919::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor ping, options {'addr': '80.239.162.97'}</div><div>MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104457680</div><div>MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::130:<wbr>:<a 
href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104458064</div><div>MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,938::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104458448</div><div>MainThread::INFO::2018-01-12 21:51:21,939::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248<wbr>d7523a6', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,940::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104457552</div><div>MainThread::INFO::2018-01-12 21:51:21,941::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-d0248<wbr>d7523a6', 'address': 
'0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,942::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104459792</div><div>MainThread::INFO::2018-01-12 21:51:26,951::brokerlink::179:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br" target="_blank">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(set_stor<wbr>age_domain) Success, id 140546772847056</div><div>MainThread::INFO::2018-01-12 21:51:26,952::hosted_engine::6<wbr>01::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_broker) Broker initialized, all submonitors started</div><div>MainThread::INFO::2018-01-12 21:51:27,049::hosted_engine::7<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/0<wbr>93faa75-5e33-4559-84fa-1f1f8d4<wbr>8153b/911c7637-b49d-463e-b186-<wbr>23b404e50769)</div><div>MainThread::INFO::2018-01-12 21:53:48,067::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt</div><div>MainThread::INFO::2018-01-12 21:56:14,088::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt</div><div>MainThread::INFO::2018-01-12 21:58:40,111::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. Waiting '5's before the next attempt</div><div>MainThread::INFO::2018-01-12 22:01:06,133::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. 
Waiting '5's before the next attempt</div></div><div><br><br>agent.log from second host <br><br><div>MainThread::INFO::2018-01-12 22:01:37,241::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:01:37,242::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:01:39,540::hosted_engine::6<wbr>39::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:01:41,939::hosted_engine::4<wbr>53::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:01:52,150::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:01:52,150::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,151::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,153::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/5<wbr>cabd8e1-5f4b-469e-becc-227469e<wbr>03f5c/8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf 
</div><div>MainThread::INFO::2018-01-12 22:01:52,174::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:01:52,179::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,189::hosted_engine::6<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:01:54,586::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:01:54,587::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:01:56,903::hosted_engine::6<wbr>39::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:01:59,299::states::682::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine::(score<wbr>) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018</div><div>MainThread::INFO::2018-01-12 22:01:59,299::hosted_engine::4<wbr>53::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:02:09,659::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 
22:02:09,659::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,660::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,663::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/5<wbr>cabd8e1-5f4b-469e-becc-227469e<wbr>03f5c/8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 22:02:09,683::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:02:09,688::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,698::hosted_engine::6<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:02:12,112::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:02:12,113::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 
22:02:14,444::hosted_engine::6<wbr>39::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:02:16,859::states::682::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine::(score<wbr>) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018</div><div>MainThread::INFO::2018-01-12 22:02:16,859::hosted_engine::4<wbr>53::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:02:27,100::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:02:27,100::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,101::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,103::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/5<wbr>cabd8e1-5f4b-469e-becc-227469e<wbr>03f5c/8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 22:02:27,125::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 
22:02:27,129::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_ov<wbr>f_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,130::states::667::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine::(consu<wbr>me) Engine down, local host does not have best score</div><div>MainThread::INFO::2018-01-12 22:02:27,139::hosted_engine::6<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:02:29,584::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:02:29,586::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li" target="_blank">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server<br><br><br>Any suggestions how to resolve this .</div><div><br>regards,</div></div><div>Artem</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy <span dir="ltr"><<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Trying to fix one thing I broke another :( <br><br>I fixed mnt_options for hosted engine storage domain and installed latest security patches to my hosts and hosted engine. 
All VMs are up and running, but hosted-engine --vm-status reports issues: <br><br><div>[root@ovirt1 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : ovirt2</div><div>Host ID : 1</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : False</div><div>Local maintenance : False</div><div>crc32 : 193164b8</div><div>local_conf_timestamp : 8350</div><div>Host timestamp : 8350</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=8350 (Fri Jan 12 19:03:54 2018)</div><div> host-id=1</div><div> score=0</div><div> vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUnexpectedlyDown</div><div> stopped=False</div><div> timeout=Thu Jan 1 05:24:43 1970</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> stopped=True</div><div>[root@ovirt1 ~]# </div><div><br></div><div><br></div><div><br></div><div>From the 
second host, the situation looks a bit different:<br><br><br><div>[root@ovirt2 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : True</div><div>Hostname : ovirt2</div><div>Host ID : 1</div><div>Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}</div><div>Score : 0</div><div>stopped : False</div><div>Local maintenance : False</div><div>crc32 : 78eabdb6</div><div>local_conf_timestamp : 8403</div><div>Host timestamp : 8402</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=8402 (Fri Jan 12 19:04:47 2018)</div><div> host-id=1</div><div> score=0</div><div> vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUnexpectedlyDown</div><div> stopped=False</div><div> timeout=Thu Jan 1 05:24:43 1970</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> 
stopped=True</div></div><div><br></div><div><br></div><div>The WebGUI shows that the engine is running on host ovirt1. <br>Gluster looks fine: <br><div>[root@ovirt1 ~]# gluster volume status engine</div><div>Status of volume: engine</div><div>Gluster process TCP Port RDMA Port Online Pid</div><div>------------------------------<wbr>------------------------------<wbr>------------------</div><div>Brick ovirt1.telia.ru:/oVirt/engine 49169 0 Y 3244 </div><div>Brick ovirt2.telia.ru:/oVirt/engine 49179 0 Y 20372</div><div>Brick ovirt3.telia.ru:/oVirt/engine 49206 0 Y 16609</div><div>Self-heal Daemon on localhost N/A N/A Y 117868</div><div>Self-heal Daemon on <a href="http://ovirt2.telia.ru" target="_blank">ovirt2.telia.ru</a> N/A N/A Y 20521</div><div>Self-heal Daemon on ovirt3 N/A N/A Y 25093</div><div> </div><div>Task Status of Volume engine</div><div>------------------------------<wbr>------------------------------<wbr>------------------</div><div>There are no active volume tasks<br><br>How can I resolve this issue?</div><br></div></div>
<br>______________________________<wbr>_________________<br>
Users mailing list<br>
<a href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br>
<a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman<wbr>/listinfo/users</a><br>
<br></blockquote></div><br></div></div></div>
<br></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div></div>