<div dir="ltr">Hello,<br><br>I have uploaded two archives with all relevant logs to shared hosting:<br>files from host 1 (which is currently running all VMs, including hosted_engine) - <span style="color:rgb(0,0,0);font-family:Arial,sans-serif;font-size:15.0016px"><a href="https://yadi.sk/d/PttRoYV63RTvhK">https://yadi.sk/d/PttRoYV63RTvhK</a><br>files from the second host - </span><span style="color:rgb(0,0,0);font-family:Arial,sans-serif;font-size:15.0016px"><a href="https://yadi.sk/d/UBducEsV3RTvhc">https://yadi.sk/d/UBducEsV3RTvhc</a> <br></span><span style="color:rgb(0,0,0);font-family:Arial,sans-serif;font-size:15.0016px"><br>I have tried restarting both </span>ovirt-ha-agent and ovirt-ha-broker, but it had no effect. I have also tried shutting down the hosted_engine VM, stopping the ovirt-ha-agent and ovirt-ha-broker services, and disconnecting and reconnecting the storage, with no effect either. <br>I also tried reinstalling the second host from the WebGUI, which led to an interesting situation: hosted-engine --vm-status now shows that both hosts have the same hostname. 
<div><br></div><div><div>[root@ovirt1 ~]# hosted-engine --vm-status    </div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage             : True</div><div>Status up-to-date                  : True</div><div>Hostname                           : <a href="http://ovirt1.telia.ru">ovirt1.telia.ru</a></div><div>Host ID                            : 1</div><div>Engine status                      : {&quot;health&quot;: &quot;good&quot;, &quot;vm&quot;: &quot;up&quot;, &quot;detail&quot;: &quot;up&quot;}</div><div>Score                              : 3400</div><div>stopped                            : False</div><div>Local maintenance                  : False</div><div>crc32                              : a7758085</div><div>local_conf_timestamp               : 259327</div><div>Host timestamp                     : 259327</div><div>Extra metadata (valid at timestamp):</div><div>        metadata_parse_version=1</div><div>        metadata_feature_version=1</div><div>        timestamp=259327 (Mon Jan 15 14:06:48 2018)</div><div>        host-id=1</div><div>        score=3400</div><div>        vm_conf_refresh_time=259327 (Mon Jan 15 14:06:48 2018)</div><div>        conf_on_shared_storage=True</div><div>        maintenance=False</div><div>        state=EngineUp</div><div>        stopped=False</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage             : True</div><div>Status up-to-date                  : False</div><div>Hostname                           : <a href="http://ovirt1.telia.ru">ovirt1.telia.ru</a></div><div>Host ID                            : 2</div><div>Engine status                      : unknown stale-data</div><div>Score                              : 0</div><div>stopped                            : True</div><div>Local maintenance                  : False</div><div>crc32                              : c7037c03</div><div>local_conf_timestamp               : 
7530</div><div>Host timestamp                     : 7530</div><div>Extra metadata (valid at timestamp):</div><div>        metadata_parse_version=1</div><div>        metadata_feature_version=1</div><div>        timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div>        host-id=2</div><div>        score=0</div><div>        vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div>        conf_on_shared_storage=True</div><div>        maintenance=False</div><div>        state=AgentStopped</div><div>        stopped=True</div><div><br></div><div>Gluster seems to be working fine; all Gluster nodes show a connected state.</div><div><br></div><div>Any advice on how to resolve this situation is highly appreciated!<br><br></div><div>Regards,</div><div>Artem</div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 15, 2018 at 11:45 AM, Kasturi Narra <span dir="ltr">&lt;<a href="mailto:knarra@redhat.com" target="_blank">knarra@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello Artem,<div><br></div><div>        Can you check whether the glusterd service is running on host1 and all the peers are in the connected state? If so, can you restart the ovirt-ha-agent and ovirt-ha-broker services and check whether things are working fine?</div><div><br></div><div>Thanks</div><span class="HOEnZb"><font color="#888888"><div>kasturi</div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy <span dir="ltr">&lt;<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I explored the logs on both hosts. 
<div>broker.log shows no errors.</div><div><br></div><div>agent.log does not look good:<br><br>on host1 (which is running the hosted engine):</div><div><br></div><div><div>MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):</div><div>  File &quot;/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py&quot;, line 191, in _run_agent</div><div>    return action(he)</div><div>  File &quot;/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py&quot;, line 64, in action_proper</div><div>    return he.start_monitoring()</div><div>  File &quot;/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py&quot;, line 411, in start_monitoring</div><div>    self._initialize_sanlock()</div><div>  File &quot;/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py&quot;, line 749, in _initialize_sanlock</div><div>    &quot;Failed to initialize sanlock, the number of errors has&quot;</div><div>SanlockInitializationError: Failed to initialize sanlock, the number of errors has exceeded the limit</div><div><br></div><div>MainThread::ERROR::2018-01-12 21:51:03,884::agent::206::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent</div><div>MainThread::WARNING::2018-01-12 21:51:08,889::agent::209::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt &#39;1&#39;</div><div>MainThread::INFO::2018-01-12 21:51:08,919::hosted_engine::242::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>MainThread::INFO::2018-01-12 21:51:08,921::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing 
VDSM</div><div>MainThread::INFO::2018-01-12 21:51:11,398::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 21:51:11,399::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 21:51:13,725::storage_server::<wbr>239::<a href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(connect_storage_server) Connecting storage server</div><div>MainThread::INFO::2018-01-12 21:51:18,390::storage_server::<wbr>246::<a href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(connect_storage_server) Connecting storage server</div><div>MainThread::INFO::2018-01-12 21:51:18,423::storage_server::<wbr>253::<a href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(connect_storage_server) Refreshing the storage domain</div><div>MainThread::INFO::2018-01-12 21:51:18,689::hosted_engine::6<wbr>63::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Preparing images</div><div>MainThread::INFO::2018-01-12 21:51:18,690::image::126::ovir<wbr>t_hosted_engine_ha.lib.image.<wbr>Image::(prepare_images) Preparing images</div><div>MainThread::INFO::2018-01-12 21:51:21,895::hosted_engine::6<wbr>66::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Refreshing vm.conf</div><div>MainThread::INFO::2018-01-12 21:51:21,895::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 
21:51:21,896::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,896::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,897::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/<wbr>5cabd8e1-5f4b-469e-becc-<wbr>227469e03f5c/8048cbd7-77e2-<wbr>4805-9af4-d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 21:51:21,915::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 21:51:21,918::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,919::hosted_engine::5<wbr>09::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_broker) Initializing ha-broker connection</div><div>MainThread::INFO::2018-01-12 21:51:21,919::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor ping, options {&#39;addr&#39;: &#39;80.239.162.97&#39;}</div><div>MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104457680</div><div>MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::130:<wbr>:<a 
href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor mgmt-bridge, options {&#39;use_ssl&#39;: &#39;true&#39;, &#39;bridge_name&#39;: &#39;ovirtmgmt&#39;, &#39;address&#39;: &#39;0&#39;}</div><div>MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104458064</div><div>MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor mem-free, options {&#39;use_ssl&#39;: &#39;true&#39;, &#39;address&#39;: &#39;0&#39;}</div><div>MainThread::INFO::2018-01-12 21:51:21,938::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104458448</div><div>MainThread::INFO::2018-01-12 21:51:21,939::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor cpu-load-no-engine, options {&#39;use_ssl&#39;: &#39;true&#39;, &#39;vm_uuid&#39;: &#39;b366e466-b0ea-4a09-866b-d0248<wbr>d7523a6&#39;, &#39;address&#39;: &#39;0&#39;}</div><div>MainThread::INFO::2018-01-12 21:51:21,940::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104457552</div><div>MainThread::INFO::2018-01-12 21:51:21,941::brokerlink::130:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Starting monitor engine-health, options {&#39;use_ssl&#39;: &#39;true&#39;, &#39;vm_uuid&#39;: 
&#39;b366e466-b0ea-4a09-866b-d0248<wbr>d7523a6&#39;, &#39;address&#39;: &#39;0&#39;}</div><div>MainThread::INFO::2018-01-12 21:51:21,942::brokerlink::141:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(start_mo<wbr>nitor) Success, id 140547104459792</div><div>MainThread::INFO::2018-01-12 21:51:26,951::brokerlink::179:<wbr>:<a href="http://ovirt_hosted_engine_ha.lib.br">ovirt_hosted_engine_ha.lib.br</a><wbr>okerlink.BrokerLink::(set_stor<wbr>age_domain) Success, id 140546772847056</div><div>MainThread::INFO::2018-01-12 21:51:26,952::hosted_engine::6<wbr>01::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_broker) Broker initialized, all submonitors started</div><div>MainThread::INFO::2018-01-12 21:51:27,049::hosted_engine::7<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/<wbr>093faa75-5e33-4559-84fa-<wbr>1f1f8d48153b/911c7637-b49d-<wbr>463e-b186-23b404e50769)</div><div>MainThread::INFO::2018-01-12 21:53:48,067::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. Waiting &#39;5&#39;s before the next attempt</div><div>MainThread::INFO::2018-01-12 21:56:14,088::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. Waiting &#39;5&#39;s before the next attempt</div><div>MainThread::INFO::2018-01-12 21:58:40,111::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. 
Waiting &#39;5&#39;s before the next attempt</div><div>MainThread::INFO::2018-01-12 22:01:06,133::hosted_engine::7<wbr>45::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_sanlock) Failed to acquire the lock. Waiting &#39;5&#39;s before the next attempt</div></div><div><br><br>agent.log from second host <br><br><div>MainThread::INFO::2018-01-12 22:01:37,241::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:01:37,242::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:01:39,540::hosted_engine::6<wbr>39::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:01:41,939::hosted_engine::4<wbr>53::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:01:52,150::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:01:52,150::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,151::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 
22:01:52,153::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/<wbr>5cabd8e1-5f4b-469e-becc-<wbr>227469e03f5c/8048cbd7-77e2-<wbr>4805-9af4-d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 22:01:52,174::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:01:52,179::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,189::hosted_engine::6<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:01:54,586::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:01:54,587::storage_server::<wbr>220::<a href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:01:56,903::hosted_engine::6<wbr>39::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:01:59,299::states::682::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine::(score<wbr>) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018</div><div>MainThread::INFO::2018-01-12 22:01:59,299::hosted_engine::4<wbr>53::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring) Current state 
EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:02:09,659::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:02:09,659::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,660::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,663::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/<wbr>5cabd8e1-5f4b-469e-becc-<wbr>227469e03f5c/8048cbd7-77e2-<wbr>4805-9af4-d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 22:02:09,683::config::435::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:02:09,688::config::440::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,698::hosted_engine::6<wbr>04::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:02:12,112::hosted_engine::6<wbr>30::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:02:12,113::storage_server::<wbr>220::<a 
href="http://ovirt_hosted_engine_ha.li">ovirt_hosted_engine_ha.li</a><wbr>b.storage_server.StorageServer<wbr>::(validate_storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:02:14,444::hosted_engine::6<wbr>39::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:02:16,859::states::682::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine::(score<wbr>) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018</div><div>MainThread::INFO::2018-01-12 22:02:16,859::hosted_engine::4<wbr>53::ovirt_hosted_engine_ha.age<wbr>nt.hosted_engine.HostedEngine:<wbr>:(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:02:27,100::config::493::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:02:27,100::config::416::ovi<wbr>rt_hosted_engine_ha.agent.host<wbr>ed_engine.HostedEngine.config:<wbr>:(_get_vm_conf_content_from_<wbr>ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,101::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,103::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.ovf<wbr>.ovf_store.OVFStore::(getEngin<wbr>eVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/4a7f8717<wbr>-9bb0-4d80-8016-498fa4b88162/<wbr>5cabd8e1-5f4b-469e-becc-<wbr>227469e03f5c/8048cbd7-77e2-<wbr>4805-9af4-d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 
22:02:27,125::config::435::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:02:27,129::config::440::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,130::states::667::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down, local host does not have best score</div><div>MainThread::INFO::2018-01-12 22:02:27,139::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:02:29,584::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:02:29,586::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server<br><br><br>Any suggestions on how to resolve this?</div><div><br>Regards,</div></div><div>Artem</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy <span dir="ltr">&lt;<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">While trying to fix one thing, I broke another :( <br><br>I fixed mnt_options for the hosted engine storage domain and installed the latest security patches on my hosts and the hosted engine. 
All VMs are up and running, but hosted-engine --vm-status reports issues: <br><br><div>[root@ovirt1 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage             : True</div><div>Status up-to-date                  : False</div><div>Hostname                           : ovirt2</div><div>Host ID                            : 1</div><div>Engine status                      : unknown stale-data</div><div>Score                              : 0</div><div>stopped                            : False</div><div>Local maintenance                  : False</div><div>crc32                              : 193164b8</div><div>local_conf_timestamp               : 8350</div><div>Host timestamp                     : 8350</div><div>Extra metadata (valid at timestamp):</div><div>        metadata_parse_version=1</div><div>        metadata_feature_version=1</div><div>        timestamp=8350 (Fri Jan 12 19:03:54 2018)</div><div>        host-id=1</div><div>        score=0</div><div>        vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)</div><div>        conf_on_shared_storage=True</div><div>        maintenance=False</div><div>        state=EngineUnexpectedlyDown</div><div>        stopped=False</div><div>        timeout=Thu Jan  1 05:24:43 1970</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage             : True</div><div>Status up-to-date                  : False</div><div>Hostname                           : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID                            : 2</div><div>Engine status                      : unknown stale-data</div><div>Score                              : 0</div><div>stopped                            : True</div><div>Local maintenance                  : False</div><div>crc32                              : c7037c03</div><div>local_conf_timestamp       
        : 7530</div><div>Host timestamp                     : 7530</div><div>Extra metadata (valid at timestamp):</div><div>        metadata_parse_version=1</div><div>        metadata_feature_version=1</div><div>        timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div>        host-id=2</div><div>        score=0</div><div>        vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div>        conf_on_shared_storage=True</div><div>        maintenance=False</div><div>        state=AgentStopped</div><div>        stopped=True</div><div>[root@ovirt1 ~]# </div><div><br></div><div><br></div><div><br></div><div>From the second host, the situation looks a bit different:<br><br><br><div>[root@ovirt2 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage             : True</div><div>Status up-to-date                  : True</div><div>Hostname                           : ovirt2</div><div>Host ID                            : 1</div><div>Engine status                      : {&quot;reason&quot;: &quot;vm not running on this host&quot;, &quot;health&quot;: &quot;bad&quot;, &quot;vm&quot;: &quot;down&quot;, &quot;detail&quot;: &quot;unknown&quot;}</div><div>Score                              : 0</div><div>stopped                            : False</div><div>Local maintenance                  : False</div><div>crc32                              : 78eabdb6</div><div>local_conf_timestamp               : 8403</div><div>Host timestamp                     : 8402</div><div>Extra metadata (valid at timestamp):</div><div>        metadata_parse_version=1</div><div>        metadata_feature_version=1</div><div>        timestamp=8402 (Fri Jan 12 19:04:47 2018)</div><div>        host-id=1</div><div>        score=0</div><div>        vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)</div><div>        conf_on_shared_storage=True</div><div>        maintenance=False</div><div>        
state=EngineUnexpectedlyDown</div><div>        stopped=False</div><div>        timeout=Thu Jan  1 05:24:43 1970</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage             : True</div><div>Status up-to-date                  : False</div><div>Hostname                           : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID                            : 2</div><div>Engine status                      : unknown stale-data</div><div>Score                              : 0</div><div>stopped                            : True</div><div>Local maintenance                  : False</div><div>crc32                              : c7037c03</div><div>local_conf_timestamp               : 7530</div><div>Host timestamp                     : 7530</div><div>Extra metadata (valid at timestamp):</div><div>        metadata_parse_version=1</div><div>        metadata_feature_version=1</div><div>        timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div>        host-id=2</div><div>        score=0</div><div>        vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div>        conf_on_shared_storage=True</div><div>        maintenance=False</div><div>        state=AgentStopped</div><div>        stopped=True</div></div><div><br></div><div><br></div><div>The WebGUI shows that the engine is running on host ovirt1. 
<br>Gluster looks fine <br><div>[root@ovirt1 ~]# gluster volume status engine</div><div>Status of volume: engine</div><div>Gluster process                             TCP Port  RDMA Port  Online  Pid</div><div>------------------------------<wbr>------------------------------<wbr>------------------</div><div>Brick ovirt1.telia.ru:/oVirt/engine         49169     0          Y       3244 </div><div>Brick ovirt2.telia.ru:/oVirt/engine         49179     0          Y       20372</div><div>Brick ovirt3.telia.ru:/oVirt/engine         49206     0          Y       16609</div><div>Self-heal Daemon on localhost               N/A       N/A        Y       117868</div><div>Self-heal Daemon on <a href="http://ovirt2.telia.ru" target="_blank">ovirt2.telia.ru</a>         N/A       N/A        Y       20521</div><div>Self-heal Daemon on ovirt3                  N/A       N/A        Y       25093</div><div> </div><div>Task Status of Volume engine</div><div>------------------------------<wbr>------------------------------<wbr>------------------</div><div>There are no active volume tasks<br><br>How to resolve this issue?</div><br></div></div>
<br>______________________________<wbr>_________________<br>
Users mailing list<br>
<a href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br>
<a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman<wbr>/listinfo/users</a><br>
<br></blockquote></div><br></div></div></div>
<br></blockquote></div><br></div>
</div></div></blockquote></div><br></div>