<div dir="ltr">Hello Artem,<div><br></div><div>Can you check whether the glusterd service is running on host1 and all the peers are in the connected state? If so, can you restart the ovirt-ha-agent and ovirt-ha-broker services and check whether things are working fine?</div><div><br></div><div>Thanks</div><div>kasturi</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 13, 2018 at 12:33 AM, Artem Tambovskiy <span dir="ltr"><<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I explored the logs on both hosts. <div>broker.log shows no errors.</div><div><br></div><div>agent.log does not look good:<br><br>On host1 (which is running the hosted engine):</div><div><br></div><div><div>MainThread::ERROR::2018-01-12 21:51:03,883::agent::205::<wbr>ovirt_hosted_engine_ha.agent.<wbr>agent.Agent::(_run_agent) Traceback (most recent call last):</div><div> File "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/agent.py", line 191, in _run_agent</div><div> return action(he)</div><div> File "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/agent.py", line 64, in action_proper</div><div> return he.start_monitoring()</div><div> File "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/hosted_engine.py", line 411, in start_monitoring</div><div> self._initialize_sanlock()</div><div> File "/usr/lib/python2.7/site-<wbr>packages/ovirt_hosted_engine_<wbr>ha/agent/hosted_engine.py", line 749, in _initialize_sanlock</div><div> "Failed to initialize sanlock, the number of errors has"</div><div>SanlockInitializationError: Failed to initialize sanlock, the number of errors has exceeded the limit</div><div><br></div><div>MainThread::ERROR::2018-01-12 21:51:03,884::agent::206::<wbr>ovirt_hosted_engine_ha.agent.<wbr>agent.Agent::(_run_agent) Trying to restart 
agent</div><div>MainThread::WARNING::2018-01-<wbr>12 21:51:08,889::agent::209::<wbr>ovirt_hosted_engine_ha.agent.<wbr>agent.Agent::(_run_agent) Restarting agent, attempt '1'</div><div>MainThread::INFO::2018-01-12 21:51:08,919::hosted_engine::<wbr>242::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_get_hostname) Found certificate common name: <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>MainThread::INFO::2018-01-12 21:51:08,921::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 21:51:11,398::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 21:51:11,399::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 21:51:13,725::storage_server::<wbr>239::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(connect_<wbr>storage_server) Connecting storage server</div><div>MainThread::INFO::2018-01-12 21:51:18,390::storage_server::<wbr>246::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(connect_<wbr>storage_server) Connecting storage server</div><div>MainThread::INFO::2018-01-12 21:51:18,423::storage_server::<wbr>253::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(connect_<wbr>storage_server) Refreshing the storage domain</div><div>MainThread::INFO::2018-01-12 21:51:18,689::hosted_engine::<wbr>663::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Preparing images</div><div>MainThread::INFO::2018-01-12 21:51:18,690::image::126::<wbr>ovirt_hosted_engine_ha.lib.<wbr>image.Image::(prepare_images) Preparing 
images</div><div>MainThread::INFO::2018-01-12 21:51:21,895::hosted_engine::<wbr>666::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Refreshing vm.conf</div><div>MainThread::INFO::2018-01-12 21:51:21,895::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 21:51:21,896::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,896::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,897::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 21:51:21,915::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 21:51:21,918::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 21:51:21,919::hosted_engine::<wbr>509::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>broker) Initializing ha-broker connection</div><div>MainThread::INFO::2018-01-12 
21:51:21,919::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Starting monitor ping, options {'addr': '80.239.162.97'}</div><div>MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Success, id 140547104457680</div><div>MainThread::INFO::2018-01-12 21:51:21,922::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name': 'ovirtmgmt', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Success, id 140547104458064</div><div>MainThread::INFO::2018-01-12 21:51:21,936::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,938::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Success, id 140547104458448</div><div>MainThread::INFO::2018-01-12 21:51:21,939::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-<wbr>d0248d7523a6', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 21:51:21,940::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Success, id 140547104457552</div><div>MainThread::INFO::2018-01-12 21:51:21,941::brokerlink::130:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid': 'b366e466-b0ea-4a09-866b-<wbr>d0248d7523a6', 'address': '0'}</div><div>MainThread::INFO::2018-01-12 
21:51:21,942::brokerlink::141:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(start_<wbr>monitor) Success, id 140547104459792</div><div>MainThread::INFO::2018-01-12 21:51:26,951::brokerlink::179:<wbr>:ovirt_hosted_engine_ha.lib.<wbr>brokerlink.BrokerLink::(set_<wbr>storage_domain) Success, id 140546772847056</div><div>MainThread::INFO::2018-01-12 21:51:26,952::hosted_engine::<wbr>601::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>broker) Broker initialized, all submonitors started</div><div>MainThread::INFO::2018-01-12 21:51:27,049::hosted_engine::<wbr>704::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/093faa75-5e33-<wbr>4559-84fa-1f1f8d48153b/<wbr>911c7637-b49d-463e-b186-<wbr>23b404e50769)</div><div>MainThread::INFO::2018-01-12 21:53:48,067::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock) Failed to acquire the lock. Waiting '5's before the next attempt</div><div>MainThread::INFO::2018-01-12 21:56:14,088::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock) Failed to acquire the lock. Waiting '5's before the next attempt</div><div>MainThread::INFO::2018-01-12 21:58:40,111::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock) Failed to acquire the lock. Waiting '5's before the next attempt</div><div>MainThread::INFO::2018-01-12 22:01:06,133::hosted_engine::<wbr>745::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>sanlock) Failed to acquire the lock. 
Waiting '5's before the next attempt</div></div><div><br><br>agent.log from second host <br><br><div>MainThread::INFO::2018-01-12 22:01:37,241::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:01:37,242::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:01:39,540::hosted_engine::<wbr>639::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:01:41,939::hosted_engine::<wbr>453::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(start_<wbr>monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:01:52,150::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:01:52,150::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,151::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,153::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 
22:01:52,174::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:01:52,179::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:01:52,189::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:01:54,586::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:01:54,587::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:01:56,903::hosted_engine::<wbr>639::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Storage domain reported as valid and reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:01:59,299::states::682::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine::(<wbr>score) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:48 2018</div><div>MainThread::INFO::2018-01-12 22:01:59,299::hosted_engine::<wbr>453::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(start_<wbr>monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:02:09,659::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 
22:02:09,659::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,660::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,663::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 22:02:09,683::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:02:09,688::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:09,698::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:02:12,112::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:02:12,113::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server) Validating storage server</div><div>MainThread::INFO::2018-01-12 22:02:14,444::hosted_engine::<wbr>639::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Storage domain reported as valid and 
reconnect is not forced.</div><div>MainThread::INFO::2018-01-12 22:02:16,859::states::682::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine::(<wbr>score) Score is 0 due to unexpected vm shutdown at Fri Jan 12 21:57:47 2018</div><div>MainThread::INFO::2018-01-12 22:02:16,859::hosted_engine::<wbr>453::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(start_<wbr>monitoring) Current state EngineUnexpectedlyDown (score: 0)</div><div>MainThread::INFO::2018-01-12 22:02:27,100::config::493::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain</div><div>MainThread::INFO::2018-01-12 22:02:27,100::config::416::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Trying to get a fresher copy of vm configuration from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,101::ovf_store::132::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE</div><div>MainThread::INFO::2018-01-12 22:02:27,103::ovf_store::134::<wbr>ovirt_hosted_engine_ha.lib.<wbr>ovf.ovf_store.OVFStore::(<wbr>getEngineVMOVF) OVF_STORE volume path: /var/run/vdsm/storage/<wbr>4a7f8717-9bb0-4d80-8016-<wbr>498fa4b88162/5cabd8e1-5f4b-<wbr>469e-becc-227469e03f5c/<wbr>8048cbd7-77e2-4805-9af4-<wbr>d109fa36dfcf </div><div>MainThread::INFO::2018-01-12 22:02:27,125::config::435::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Found an OVF for HE VM, trying to convert</div><div>MainThread::INFO::2018-01-12 22:02:27,129::config::440::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine.<wbr>config::(_get_vm_conf_content_<wbr>from_ovf_store) Got vm.conf from OVF_STORE</div><div>MainThread::INFO::2018-01-12 
22:02:27,130::states::667::<wbr>ovirt_hosted_engine_ha.agent.<wbr>hosted_engine.HostedEngine::(<wbr>consume) Engine down, local host does not have best score</div><div>MainThread::INFO::2018-01-12 22:02:27,139::hosted_engine::<wbr>604::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>vdsm) Initializing VDSM</div><div>MainThread::INFO::2018-01-12 22:02:29,584::hosted_engine::<wbr>630::ovirt_hosted_engine_ha.<wbr>agent.hosted_engine.<wbr>HostedEngine::(_initialize_<wbr>storage_images) Connecting the storage</div><div>MainThread::INFO::2018-01-12 22:02:29,586::storage_server::<wbr>220::ovirt_hosted_engine_ha.<wbr>lib.storage_server.<wbr>StorageServer::(validate_<wbr>storage_server) Validating storage server<br><br><br>Any suggestions on how to resolve this?</div><div><br>Regards,</div></div><div>Artem</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 12, 2018 at 7:08 PM, Artem Tambovskiy <span dir="ltr"><<a href="mailto:artem.tambovskiy@gmail.com" target="_blank">artem.tambovskiy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">While trying to fix one thing, I broke another :( <br><br>I fixed mnt_options for the hosted engine storage domain and installed the latest security patches on my hosts and the hosted engine. 
All VMs are up and running, but hosted-engine --vm-status reports issues: <br><br><div>[root@ovirt1 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : ovirt2</div><div>Host ID : 1</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : False</div><div>Local maintenance : False</div><div>crc32 : 193164b8</div><div>local_conf_timestamp : 8350</div><div>Host timestamp : 8350</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=8350 (Fri Jan 12 19:03:54 2018)</div><div> host-id=1</div><div> score=0</div><div> vm_conf_refresh_time=8350 (Fri Jan 12 19:03:54 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUnexpectedlyDown</div><div> stopped=False</div><div> timeout=Thu Jan 1 05:24:43 1970</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> stopped=True</div><div>[root@ovirt1 ~]# </div><div><br></div><div><br></div><div><br></div><div>From the 
second host, the situation looks a bit different:<br><br><br><div>[root@ovirt2 ~]# hosted-engine --vm-status</div><div><br></div><div><br></div><div>--== Host 1 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : True</div><div>Hostname : ovirt2</div><div>Host ID : 1</div><div>Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}</div><div>Score : 0</div><div>stopped : False</div><div>Local maintenance : False</div><div>crc32 : 78eabdb6</div><div>local_conf_timestamp : 8403</div><div>Host timestamp : 8402</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=8402 (Fri Jan 12 19:04:47 2018)</div><div> host-id=1</div><div> score=0</div><div> vm_conf_refresh_time=8403 (Fri Jan 12 19:04:47 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=EngineUnexpectedlyDown</div><div> stopped=False</div><div> timeout=Thu Jan 1 05:24:43 1970</div><div><br></div><div><br></div><div>--== Host 2 status ==--</div><div><br></div><div>conf_on_shared_storage : True</div><div>Status up-to-date : False</div><div>Hostname : <a href="http://ovirt1.telia.ru" target="_blank">ovirt1.telia.ru</a></div><div>Host ID : 2</div><div>Engine status : unknown stale-data</div><div>Score : 0</div><div>stopped : True</div><div>Local maintenance : False</div><div>crc32 : c7037c03</div><div>local_conf_timestamp : 7530</div><div>Host timestamp : 7530</div><div>Extra metadata (valid at timestamp):</div><div> metadata_parse_version=1</div><div> metadata_feature_version=1</div><div> timestamp=7530 (Fri Jan 12 16:10:12 2018)</div><div> host-id=2</div><div> score=0</div><div> vm_conf_refresh_time=7530 (Fri Jan 12 16:10:12 2018)</div><div> conf_on_shared_storage=True</div><div> maintenance=False</div><div> state=AgentStopped</div><div> 
stopped=True</div></div><div><br></div><div><br></div><div>The WebGUI shows that the engine is running on host ovirt1. <br>Gluster looks fine: <br><div>[root@ovirt1 ~]# gluster volume status engine</div><div>Status of volume: engine</div><div>Gluster process TCP Port RDMA Port Online Pid</div><div>------------------------------<wbr>------------------------------<wbr>------------------</div><div>Brick ovirt1.telia.ru:/oVirt/engine 49169 0 Y 3244 </div><div>Brick ovirt2.telia.ru:/oVirt/engine 49179 0 Y 20372</div><div>Brick ovirt3.telia.ru:/oVirt/engine 49206 0 Y 16609</div><div>Self-heal Daemon on localhost N/A N/A Y 117868</div><div>Self-heal Daemon on <a href="http://ovirt2.telia.ru" target="_blank">ovirt2.telia.ru</a> N/A N/A Y 20521</div><div>Self-heal Daemon on ovirt3 N/A N/A Y 25093</div><div> </div><div>Task Status of Volume engine</div><div>------------------------------<wbr>------------------------------<wbr>------------------</div><div>There are no active volume tasks<br><br>How can I resolve this issue?</div><br></div></div>
<br>______________________________<wbr>_________________<br>
Users mailing list<br>
<a href="mailto:Users@ovirt.org" target="_blank">Users@ovirt.org</a><br>
<a href="http://lists.ovirt.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman<wbr>/listinfo/users</a><br>
<br></blockquote></div><br></div></div></div>
<br></blockquote></div><br></div>
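When comparing `hosted-engine --vm-status` output from several hosts, as in the thread above, it can help to pick out the hosts whose metadata is stale automatically. The following is only a rough sketch, not part of oVirt: the function names `parse_vm_status` and `stale_hosts` are hypothetical, the sample text is trimmed from the output quoted in this thread, and the parser assumes the `--== Host N status ==--` header and `key : value` layout shown there (other oVirt versions may format the output differently).

```python
import re

# Trimmed sample in the layout quoted in this thread (as seen from ovirt2).
SAMPLE = """
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : ovirt2
Host ID                            : 1
Score                              : 0

--== Host 2 status ==--

Status up-to-date                  : False
Hostname                           : ovirt1.telia.ru
Host ID                            : 2
Score                              : 0
"""


def parse_vm_status(text):
    """Split the status text into one dict of fields per host section."""
    hosts = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # A header such as "--== Host 1 status ==--" starts a new section.
        if re.match(r"--== Host (\d+) status ==--", line):
            current = {}
            hosts.append(current)
            continue
        # Only "key : value" lines are kept; the "Extra metadata"
        # key=value lines are skipped in this sketch.
        if current is None or " : " not in line:
            continue
        key, value = line.split(" : ", 1)
        current[key.strip()] = value.strip()
    return hosts


def stale_hosts(hosts):
    """Return hostnames whose 'Status up-to-date' field is not 'True'."""
    return [h.get("Hostname", "?") for h in hosts
            if h.get("Status up-to-date") != "True"]


print(stale_hosts(parse_vm_status(SAMPLE)))
```

In practice the text would come from running `hosted-engine --vm-status` on each host rather than from a pasted sample. A host listed here has not refreshed its metadata on the shared storage, which is consistent with the agent on that host failing to acquire the sanlock lease.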