<div dir="ltr">Hi,<div><br></div><div>/var/log/ovirt-hosted-engine-ha/broker.log </div><div><br></div><div>Host1:</div><div><div>Thread-118327::INFO::2014-04-23 12:34:59,360::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established</div>
<div>Thread-118327::INFO::2014-04-23 12:34:59,375::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed</div><div>Thread-118328::INFO::2014-04-23 12:35:14,546::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established</div>
<div>Thread-118328::INFO::2014-04-23 12:35:14,549::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed</div></div><div><br></div><div>Host2:</div><div><div>Thread-4::INFO::2014-04-23 12:36:08,020::mem_free::53::mem_free.MemFree::(action) memFree: 9816</div>
<div>Thread-3::INFO::2014-04-23 12:36:08,240::mgmt_bridge::59::mgmt_bridge.MgmtBridge::(action) Found bridge ovirtmgmt</div><div>Thread-296455::INFO::2014-04-23 12:36:08,678::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established</div>
<div>Thread-296455::INFO::2014-04-23 12:36:08,684::listener::184::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed</div></div><div><br></div><div><br></div><div><br>
</div><div>/var/log/ovirt-hosted-engine-ha/agent.log </div><div><br></div><div>host1:</div><div><br></div><div><div>MainThread::INFO::2014-04-02 17:46:14,856::state_decorators::25::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Unknown local engine vm status no actions taken</div>
<div>MainThread::INFO::2014-04-02 17:46:14,857::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1396453574.86 type=state_transition detail=UnknownLocalVmState-UnknownLocalVmState hostname=&#39;host01.ovirt.lan&#39;</div>
<div>MainThread::INFO::2014-04-02 17:46:14,858::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (UnknownLocalVmState-UnknownLocalVmState) sent? ignored</div>
<div>MainThread::WARNING::2014-04-02 17:46:15,463::hosted_engine::334::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: float() argument must be a string or a number</div>
<div>MainThread::WARNING::2014-04-02 17:46:15,464::hosted_engine::337::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error</div><div>Traceback (most recent call last):</div>
<div>  File &quot;/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py&quot;, line 323, in start_monitoring</div><div>    state.score(self._log))</div><div>  File &quot;/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py&quot;, line 160, in score</div>
<div>    lm, logger, score, score_cfg)</div><div>  File &quot;/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py&quot;, line 61, in _penalize_memory</div><div>    if self._float_or_default(lm[&#39;mem-free&#39;], 0) &lt; vm_mem:</div>
<div>  File &quot;/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/states.py&quot;, line 51, in _float_or_default</div><div>    return float(value)</div><div>TypeError: float() argument must be a string or a number</div>
<div>MainThread::ERROR::2014-04-02 17:46:15,464::hosted_engine::350::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!</div>
<div>MainThread::INFO::2014-04-02 17:46:15,466::agent::116::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down</div></div><div><br>
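<div>A note on the host1 traceback above: float() raises this TypeError when it is handed something that is neither a string nor a number, so the &#39;mem-free&#39; value was presumably missing (for example None). That is only my guess, since the log does not show the actual value. Below is a minimal Python sketch of a more tolerant conversion. The name float_or_default and the sample readings are hypothetical, and this is not the ovirt-hosted-engine-ha code, just an illustration of the failure mode.</div>

```python
def float_or_default(value, default):
    """Hypothetical helper: convert value to float, falling back to default.

    Not the ovirt-hosted-engine-ha implementation; it only illustrates how a
    missing (None) reading can be tolerated instead of raising TypeError.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        # TypeError: value is None or another non-numeric object.
        # ValueError: value is a string that does not parse as a number.
        return float(default)


# Assumed sample readings; None stands in for a missing 'mem-free' value.
readings = {"ok": "9816", "missing": None, "garbled": "n/a"}
converted = {key: float_or_default(val, 0) for key, val in readings.items()}
print(converted)
```

<div>With a fallback like this, a missing reading would be scored as 0 instead of stopping the agent, though whether 0 is the right default for the score is a separate question.</div>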
</div><div><br></div><div>host2:</div><div><br></div><div><div>MainThread::INFO::2014-04-23 12:36:44,800::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)</div>
<div>MainThread::INFO::2014-04-23 12:36:54,844::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1398249414.84 type=state_transition detail=EngineUnexpectedlyDown-EngineUnexpectedlyDown hostname=&#39;host02.ovirt.lan&#39;</div>
<div>MainThread::INFO::2014-04-23 12:36:54,846::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineUnexpectedlyDown-EngineUnexpectedlyDown) sent? ignored</div>
</div><div><br></div><div>/var/log/vdsm/vdsm.log</div><div><br></div><div>host1 :</div><div><br></div><div><div>Thread-116::DEBUG::2014-04-23 12:40:17,060::fileSD::225::Storage.Misc.excCmd::(getReadDelay) &#39;/bin/dd iflag=direct if=/rhev/data-center/mnt/host01.ovirt.lan:_home_iso/cc51143e-8ad7-4b0b-a4d2-9024dffc1188/dom_md/metadata bs=4096 count=1&#39; (cwd None)</div>
<div>Thread-116::DEBUG::2014-04-23 12:40:17,070::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS: &lt;err&gt; = &#39;0+1 records in\n0+1 records out\n343 bytes (343 B) copied, 0.000183642 s, 1.9 MB/s\n&#39;; &lt;rc&gt; = 0</div>
<div>Thread-37::DEBUG::2014-04-23 12:40:17,504::fileSD::225::Storage.Misc.excCmd::(getReadDelay) &#39;/bin/dd iflag=direct if=/rhev/data-center/mnt/host01.ovirt.lan:_home_NFS01/aea040f8-ab9d-435b-9ecf-ddd4272e592f/dom_md/metadata bs=4096 count=1&#39; (cwd None)</div>
<div>Thread-37::DEBUG::2014-04-23 12:40:17,514::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS: &lt;err&gt; = &#39;0+1 records in\n0+1 records out\n472 bytes (472 B) copied, 0.000165064 s, 2.9 MB/s\n&#39;; &lt;rc&gt; = 0</div>
<div>Thread-11736::DEBUG::2014-04-23 12:40:18,170::task::595::TaskManager.Task::(_updateState) Task=`8a3a3e42-6e79-4849-9b1c-cad895722884`::moving from state init -&gt; state preparing</div><div>Thread-11736::INFO::2014-04-23 12:40:18,170::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)</div>
<div>Thread-11736::INFO::2014-04-23 12:40:18,171::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {&#39;aea040f8-ab9d-435b-9ecf-ddd4272e592f&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 3, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.000165064&#39;, &#39;lastCheck&#39;: &#39;0.7&#39;, &#39;valid&#39;: True}, &#39;5ae613a4-44e4-42cb-89fc-7b5d34c1f30f&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 3, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.000174536&#39;, &#39;lastCheck&#39;: &#39;3.0&#39;, &#39;valid&#39;: True}, &#39;cc51143e-8ad7-4b0b-a4d2-9024dffc1188&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 0, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.000183642&#39;, &#39;lastCheck&#39;: &#39;1.1&#39;, &#39;valid&#39;: True}, &#39;ff98d346-4515-4349-8437-fb2f5e9eaadf&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 0, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.00045492&#39;, &#39;lastCheck&#39;: &#39;8.6&#39;, &#39;valid&#39;: True}}</div>
<div>Thread-11736::DEBUG::2014-04-23 12:40:18,171::task::1185::TaskManager.Task::(prepare) Task=`8a3a3e42-6e79-4849-9b1c-cad895722884`::finished: {&#39;aea040f8-ab9d-435b-9ecf-ddd4272e592f&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 3, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.000165064&#39;, &#39;lastCheck&#39;: &#39;0.7&#39;, &#39;valid&#39;: True}, &#39;5ae613a4-44e4-42cb-89fc-7b5d34c1f30f&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 3, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.000174536&#39;, &#39;lastCheck&#39;: &#39;3.0&#39;, &#39;valid&#39;: True}, &#39;cc51143e-8ad7-4b0b-a4d2-9024dffc1188&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 0, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.000183642&#39;, &#39;lastCheck&#39;: &#39;1.1&#39;, &#39;valid&#39;: True}, &#39;ff98d346-4515-4349-8437-fb2f5e9eaadf&#39;: {&#39;code&#39;: 0, &#39;version&#39;: 0, &#39;acquired&#39;: True, &#39;delay&#39;: &#39;0.00045492&#39;, &#39;lastCheck&#39;: &#39;8.6&#39;, &#39;valid&#39;: True}}</div>
<div>Thread-11736::DEBUG::2014-04-23 12:40:18,172::task::595::TaskManager.Task::(_updateState) Task=`8a3a3e42-6e79-4849-9b1c-cad895722884`::moving from state preparing -&gt; state finished</div><div>Thread-11736::DEBUG::2014-04-23 12:40:18,172::resourceManager::940::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}</div>
<div>Thread-11736::DEBUG::2014-04-23 12:40:18,172::resourceManager::977::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}</div><div>Thread-11736::DEBUG::2014-04-23 12:40:18,172::task::990::TaskManager.Task::(_decref) Task=`8a3a3e42-6e79-4849-9b1c-cad895722884`::ref 0 aborting False</div>
<div>Thread-299::DEBUG::2014-04-23 12:40:19,599::fileSD::225::Storage.Misc.excCmd::(getReadDelay) &#39;/bin/dd iflag=direct if=/rhev/data-center/mnt/host01.ovirt.lan:_home_export/ff98d346-4515-4349-8437-fb2f5e9eaadf/dom_md/metadata bs=4096 count=1&#39; (cwd None)</div>
<div>Thread-299::DEBUG::2014-04-23 12:40:19,610::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS: &lt;err&gt; = &#39;0+1 records in\n0+1 records out\n352 bytes (352 B) copied, 0.000525872 s, 669 kB/s\n&#39;; &lt;rc&gt; = 0</div>
</div><div><br></div><div><br></div><div>host2 :</div><div><br></div><div><div>Thread-1688899::DEBUG::2014-04-23 12:41:30,270::task::990::TaskManager.Task::(_decref) Task=`c23aeaf5-aed4-4285-a8c9-2bffadc0240e`::ref 0 aborting False</div>
<div>Thread-159126::DEBUG::2014-04-23 12:41:30,547::fileSD::225::Storage.Misc.excCmd::(getReadDelay) &#39;/bin/dd iflag=direct if=/rhev/data-center/mnt/host01.ovirt.lan:_home_iso/cc51143e-8ad7-4b0b-a4d2-9024dffc1188/dom_md/metadata bs=4096 count=1&#39; (cwd None)</div>
<div>Thread-159126::DEBUG::2014-04-23 12:41:30,569::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS: &lt;err&gt; = &#39;0+1 records in\n0+1 records out\n343 bytes (343 B) copied, 0.000480513 s, 714 kB/s\n&#39;; &lt;rc&gt; = 0</div>
<div>Thread-159125::DEBUG::2014-04-23 12:41:30,740::fileSD::225::Storage.Misc.excCmd::(getReadDelay) &#39;/bin/dd iflag=direct if=/rhev/data-center/mnt/host01.ovirt.lan:_home_DATA/5ae613a4-44e4-42cb-89fc-7b5d34c1f30f/dom_md/metadata bs=4096 count=1&#39; (cwd None)</div>
<div>Thread-159125::DEBUG::2014-04-23 12:41:30,762::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS: &lt;err&gt; = &#39;0+1 records in\n0+1 records out\n545 bytes (545 B) copied, 0.000382036 s, 1.4 MB/s\n&#39;; &lt;rc&gt; = 0</div>
<div>Thread-159128::DEBUG::2014-04-23 12:41:32,226::fileSD::225::Storage.Misc.excCmd::(getReadDelay) &#39;/bin/dd iflag=direct if=/rhev/data-center/mnt/host01.ovirt.lan:_home_export/ff98d346-4515-4349-8437-fb2f5e9eaadf/dom_md/metadata bs=4096 count=1&#39; (cwd None)</div>
<div>Thread-159128::DEBUG::2014-04-23 12:41:32,245::fileSD::225::Storage.Misc.excCmd::(getReadDelay) SUCCESS: &lt;err&gt; = &#39;0+1 records in\n0+1 records out\n352 bytes (352 B) copied, 0.000648972 s, 542 kB/s\n&#39;; &lt;rc&gt; = 0</div>
</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-04-23 0:21 GMT+02:00 Doron Fediuck <span dir="ltr">&lt;<a href="mailto:dfediuck@redhat.com" target="_blank">dfediuck@redhat.com</a>&gt;</span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>
<br>
----- Original Message -----<br>
&gt; From: &quot;Kevin Tibi&quot; &lt;<a href="mailto:kevintibi@hotmail.com">kevintibi@hotmail.com</a>&gt;<br>
&gt; To: &quot;users&quot; &lt;<a href="mailto:users@ovirt.org">users@ovirt.org</a>&gt;<br>
&gt; Sent: Tuesday, April 22, 2014 2:12:50 PM<br>
&gt; Subject: [ovirt-users] Hosted Engine error -243<br>
&gt;<br>
&gt; Hi all,<br>
&gt;<br>
&gt; I have a problem with my hosted engine. Every 10 minutes I get an event in<br>
&gt; the engine:<br>
&gt;<br>
&gt; VM HostedEngine is down. Exit message: internal error Failed to acquire lock:<br>
&gt; error -243<br>
&gt;<br>
&gt; My data is on a local NFS export.<br>
&gt;<br>
&gt; Thanks for your help.<br>
&gt;<br>
&gt; Kevin.<br>
&gt;<br>
<br>
</div></div>Hi Kevin,<br>
Can you please check the /var/log/ovirt-hosted-* log files on your hosts<br>
and let us know if you see anything else there or in your vdsm log file?<br>
</blockquote></div><br></div>