<div dir="ltr">I just noticed this in the vdsm logs. It looks like the agent is trying to start the hosted engine on both machines?<br><br>&lt;on_poweroff&gt;destroy&lt;/on_poweroff&gt;&lt;on_reboot&gt;destroy&lt;/on_reboot&gt;&lt;on_crash&gt;destroy&lt;/on_crash&gt;&lt;/domain&gt;<br>Thread-7517::ERROR::2017-03-10 01:26:13,053::vm::773::virt.vm::(_startUnderlyingVm) vmId=`2419f9fe-4998-4b7a-9fe9-151571d20379`::The vm start process failed<br>Traceback (most recent call last):<br>  File &quot;/usr/share/vdsm/virt/vm.py&quot;, line 714, in _startUnderlyingVm<br>    self._run()<br>  File &quot;/usr/share/vdsm/virt/vm.py&quot;, line 2026, in _run<br>    self._connection.createXML(domxml, flags),<br>  File &quot;/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py&quot;, line 123, in wrapper<br>    ret = f(*args, **kwargs)<br>  File &quot;/usr/lib/python2.7/site-packages/vdsm/utils.py&quot;, line 917, in wrapper<br>    return func(inst, *args, **kwargs)<br>  File &quot;/usr/lib64/python2.7/site-packages/libvirt.py&quot;, line 3782, in createXML<br>    if ret is None:raise libvirtError(&#39;virDomainCreateXML() failed&#39;, conn=self)<br><br>libvirtError: Failed to acquire lock: Permission denied<br><br>INFO::2017-03-10 01:26:13,054::vm::1330::virt.vm::(setDownStatus) vmId=`2419f9fe-4998-4b7a-9fe9-151571d20379`::Changed state to Down: Failed to acquire lock: Permission denied (code=1)<br>INFO::2017-03-10 01:26:13,054::guestagent::430::virt.vm::(stop) vmId=`2419f9fe-4998-4b7a-9fe9-151571d20379`::Stopping connection<br><br>DEBUG::2017-03-10 01:26:13,054::vmchannels::238::vds::(unregister) Delete fileno 56 from listener.<br>DEBUG::2017-03-10 01:26:13,055::vmchannels::66::vds::(_unregister_fd) Failed to unregister FD from epoll (ENOENT): 56<br>DEBUG::2017-03-10 01:26:13,055::__init__::209::jsonrpc.Notification::(emit) Sending event {&quot;params&quot;: {&quot;2419f9fe-4998-4b7a-9fe9-151571d20379&quot;: {&quot;status&quot;: &quot;Down&quot;, &quot;exitReason&quot;: 1, &quot;exitMessage&quot;: 
&quot;Failed to acquire lock: Permission denied&quot;, &quot;exitCode&quot;: 1}, &quot;notify_time&quot;: 4308740560}, &quot;jsonrpc&quot;: &quot;2.0&quot;, &quot;method&quot;: &quot;|virt|VM_status|2419f9fe-4998-4b7a-9fe9-151571d20379&quot;}<br>VM Channels Listener::DEBUG::2017-03-10 01:26:13,475::vmchannels::142::vds::(_do_del_channels) fileno 56 was removed from listener.<br>DEBUG::2017-03-10 01:26:14,430::check::296::storage.check::(_start_process) START check u&#39;/rhev/data-center/mnt/glusterSD/192.168.3.10:_data/a08822ec-3f5b-4dba-ac2d-5510f0b4b6a2/dom_md/metadata&#39; cmd=[&#39;/usr/bin/taskset&#39;, &#39;--cpu-list&#39;, &#39;0-39&#39;, &#39;/usr/bin/dd&#39;, u&#39;if=/rhev/data-center/mnt/glusterSD/192.168.3.10:_data/a08822ec-3f5b-4dba-ac2d-5510f0b4b6a2/dom_md/metadata&#39;, &#39;of=/dev/null&#39;, &#39;bs=4096&#39;, &#39;count=1&#39;, &#39;iflag=direct&#39;] delay=0.00<br>DEBUG::2017-03-10 01:26:14,481::asyncevent::564::storage.asyncevent::(reap) Process &lt;cpopen.CPopen object at 0x3ba6550&gt; terminated (count=1)<br>DEBUG::2017-03-10 01:26:14,481::check::327::storage.check::(_check_completed) FINISH check u&#39;/rhev/data-center/mnt/glusterSD/192.168.3.10:_data/a08822ec-3f5b-4dba-ac2d-5510f0b4b6a2/dom_md/metadata&#39; rc=0 err=bytearray(b&#39;0+1 records in\n0+1 records out\n300 bytes (300 B) copied, 8.7603e-05 s, 3.4 MB/s\n&#39;) elapsed=0.06<br><br></div><div class="gmail_extra"><br><div class="gmail_quote">On 10 March 2017 at 10:40, Ian Neilsen <span dir="ltr">&lt;<a href="mailto:ian.neilsen@gmail.com" target="_blank">ian.neilsen@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Hi all,<br><br></div>I had a storage issue with my Gluster volumes running under oVirt hosted engine.<br>I can no longer start the hosted engine manager VM with &quot;hosted-engine --vm-start&quot;.<br>I&#39;ve scoured the net for a fix, but can&#39;t seem to find 
anything concrete.<br><br>Running CentOS 7, oVirt 4.0 and Gluster 3.8.9.<br><br>How do I recover the engine manager? I&#39;m at a loss!<br></div> <br></div>Engine status: the score was 0 on all nodes; node 1 now reads 3400, but all others are still 0. <div><div><div><div><br>{&quot;reason&quot;: &quot;bad vm status&quot;, &quot;health&quot;: &quot;bad&quot;, &quot;vm&quot;: &quot;down&quot;, &quot;detail&quot;: &quot;down&quot;}<br></div><div><br></div><div><br>Logs from agent.log<br>==================<br><br>INFO::2017-03-09 19:32:52,600::state_decorators::51::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Global maintenance detected<br>INFO::2017-03-09 19:32:52,603::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM<br>INFO::2017-03-09 19:32:54,820::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage<br>INFO::2017-03-09 19:32:54,821::storage_server::219::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server<br>INFO::2017-03-09 19:32:59,194::storage_server::226::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server<br>INFO::2017-03-09 19:32:59,211::storage_server::233::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain<br>INFO::2017-03-09 19:32:59,328::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images<br>INFO::2017-03-09 19:32:59,328::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images<br>INFO::2017-03-09 
19:33:01,748::hosted_engine::669::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain<br>INFO::2017-03-09 19:33:01,748::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE<br>WARNING::2017-03-09 19:33:04,056::ovf_store::107::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE<br>ERROR::2017-03-09 19:33:04,058::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf<br><br></div><div>ovirt-ha-agent logs<br>================<br><br>ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config ERROR Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf<br></div><div><br></div><div>vdsm<br>======<br><br>vdsm vds.dispatcher ERROR SSL error during reading data: unexpected eof<br></div><div><br></div><div>ovirt-ha-broker<br>============<br><br>ovirt-ha-broker cpu_load_no_engine.EngineHealth ERROR Failed to getVmStats: &#39;pid&#39;<span class="HOEnZb"><font color="#888888"><br></font></span></div><span class="HOEnZb"><font color="#888888"><div><br>-- <br><div class="m_-906845641019494375gmail_signature"><div dir="ltr"><div>Ian Neilsen<br><br>Mobile: <a href="tel:0424%20379%20762" value="+61424379762" target="_blank">0424 379 762</a><br>LinkedIn: <a href="http://au.linkedin.com/in/ianneilsen" target="_blank">http://au.linkedin.com/in/ianneilsen</a><div>Twitter: ineilsen</div></div></div></div>
</div></font></span></div></div></div></div>
</blockquote></div>
</div>