<div dir="ltr">We had a bug related to this issue: <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1343005">https://bugzilla.redhat.com/show_bug.cgi?id=1343005</a>.<div>It should be fixed in recent versions.</div><div>Best Regards</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 14, 2016 at 8:14 PM, Gervais de Montbrun <span dir="ltr">&lt;<a href="mailto:gervais@demontbrun.com" target="_blank">gervais@demontbrun.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hey Folks,<br>
<br>
I upgraded my oVirt cluster from 3.6.7 to 4.0.0 yesterday and am experiencing a bunch of issues.<br>
<br>
1) I can&#39;t update the Compatibility Version to 4.0 because it tells me that all my VMs have to be off to do so, but I have a hosted engine. I found some info online about how you plan to fix this. Do we know if the fix will be in 4.0.1?<br>
<br>
2) More alarming... the ovirt-ha-agent keeps quitting. The agent.log shows:<br>
<br>
MainThread::ERROR::2016-07-13 16:38:57,100::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 16:39:02,104::config::122::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_load) Configuration file &#39;/etc/ovirt-hosted-engine/hosted-engine.conf&#39; not available [[Errno 24] Too many open files: &#39;/etc/ovirt-hosted-engine/hosted-engine.conf&#39;]<br>
MainThread::ERROR::2016-07-13 16:39:02,105::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 16:39:07,110::agent::210::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Too many errors occurred, giving up. Please review the log and consider filing a bug.<br>
MainThread::ERROR::2016-07-13 17:44:03,499::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!<br>
MainThread::ERROR::2016-07-13 17:44:03,515::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;(24, &#39;Sanlock lockspace remove failure&#39;, &#39;Too many open files&#39;)&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:08,520::config::122::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_load) Configuration file &#39;/etc/ovirt-hosted-engine/hosted-engine.conf&#39; not available [[Errno 24] Too many open files: &#39;/etc/ovirt-hosted-engine/hosted-engine.conf&#39;]<br>
MainThread::ERROR::2016-07-13 17:44:08,523::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:13,529::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:18,535::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:23,541::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:28,546::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:33,552::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:38,556::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:43,561::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:48,566::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;[Errno 24] Too many open files&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-13 17:44:53,571::agent::210::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Too many errors occurred, giving up. Please review the log and consider filing a bug.<br>
MainThread::ERROR::2016-07-13 18:47:40,048::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!<br>
MainThread::ERROR::2016-07-14 10:32:29,184::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!<br>
MainThread::ERROR::2016-07-14 11:10:07,223::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection closed<br>
MainThread::ERROR::2016-07-14 11:10:07,224::brokerlink::148::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(get_monitor_status) Exception getting monitor status: Connection closed<br>
MainThread::ERROR::2016-07-14 11:10:07,224::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: &#39;Failed to get monitor status: Connection closed&#39; - trying to restart agent<br>
MainThread::ERROR::2016-07-14 12:10:26,772::hosted_engine::493::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!<br>
<br>
systemctl output:<br>
<br>
[root@cultivar3 ~]# systemctl status ovirt-ha-agent.service ovirt-ha-broker.service vdsmd<br>
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent<br>
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)<br>
   Active: inactive (dead) since Thu 2016-07-14 12:10:29 ADT; 2h 3min ago<br>
  Process: 19426 ExecStart=/usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon (code=exited, status=0/SUCCESS)<br>
 Main PID: 19426 (code=exited, status=0/SUCCESS)<br>
<br>
Jul 14 11:10:07 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Connection closed: Connection closed<br>
Jul 14 11:10:07 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception getting monitor status: Connection closed<br>
Jul 14 11:10:07 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Error: &#39;Failed to get monitor status: Connection closed&#39; - trying to restart agent<br>
Jul 14 11:10:07 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ERROR:ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink:Connection closed: Connection closed<br>
Jul 14 11:10:07 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ERROR:ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink:Exception getting monitor status: Connection closed<br>
Jul 14 11:10:07 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Error: &#39;Failed to get monitor status: Connection closed&#39; - trying to restart agent<br>
Jul 14 12:10:26 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: Exception AttributeError: &quot;&#39;EventFD&#39; object has no attribute &#39;_fd&#39;&quot; in &lt;bound method EventFD.__del__ of &lt;vdsm.infra.eventfd.EventFD object at 0x2b035d0&gt;&gt; ignored<br>
Jul 14 12:10:26 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Shutting down the agent because of 3 failures in a row!<br>
Jul 14 12:10:26 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Shutting down the agent because of 3 failures in a row!<br>
Jul 14 12:10:28 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-agent[19426]: Exception AttributeError: &quot;&#39;EventFD&#39; object has no attribute &#39;_fd&#39;&quot; in &lt;bound method EventFD.__del__ of &lt;vdsm.infra.eventfd.EventFD object at 0x2b03f90&gt;&gt; ignored<br>
<br>
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker<br>
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)<br>
   Active: active (running) since Thu 2016-07-14 11:10:09 ADT; 3h 3min ago<br>
 Main PID: 19907 (ovirt-ha-broker)<br>
   CGroup: /system.slice/ovirt-ha-broker.service<br>
           └─19907 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon<br>
<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: reply: &#39;354 End data with &lt;CR&gt;&lt;LF&gt;.&lt;CR&gt;&lt;LF&gt;\r\n&#39;<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: reply: retcode (354); Msg: End data with &lt;CR&gt;&lt;LF&gt;.&lt;CR&gt;&lt;LF&gt;<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: data: (354, &#39;End data with &lt;CR&gt;&lt;LF&gt;.&lt;CR&gt;&lt;LF&gt;&#39;)<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: send: &#39;From: <a href="mailto:root@cultivar.grove.silverorange.com">root@cultivar.grove.silverorange.com</a>\r\nTo: <a href="mailto:sysadmin@silverorange.com">sysadmin@silverorange.com</a>\r\nSubject: ovirt-hosted-engine state transition EngineUnexpectedlyDown-EngineDown\r\nDate: ...<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: reply: &#39;250 2.0.0 Ok: queued as 1B5F9C0064B90\r\n&#39;<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: reply: retcode (250); Msg: 2.0.0 Ok: queued as 1B5F9C0064B90<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: data: (250, &#39;2.0.0 Ok: queued as 1B5F9C0064B90&#39;)<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: send: &#39;quit\r\n&#39;<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: reply: &#39;221 2.0.0 Bye\r\n&#39;<br>
Jul 14 11:36:01 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> ovirt-ha-broker[19907]: reply: retcode (221); Msg: 2.0.0 Bye<br>
<br>
● vdsmd.service - Virtual Desktop Server Manager<br>
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)<br>
   Active: active (running) since Thu 2016-07-14 09:31:06 ADT; 4h 42min ago<br>
  Process: 2236 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)<br>
 Main PID: 2356 (vdsm)<br>
   CGroup: /system.slice/vdsmd.service<br>
           ├─2356 /usr/bin/python /usr/share/vdsm/vdsm<br>
           ├─2577 /usr/libexec/ioprocess --read-pipe-fd 82 --write-pipe-fd 81 --max-threads 10 --max-queued-requests 10<br>
           ├─3180 /usr/libexec/ioprocess --read-pipe-fd 125 --write-pipe-fd 124 --max-threads 10 --max-queued-requests 10<br>
           ├─3191 /usr/libexec/ioprocess --read-pipe-fd 130 --write-pipe-fd 127 --max-threads 10 --max-queued-requests 10<br>
           └─3198 /usr/libexec/ioprocess --read-pipe-fd 138 --write-pipe-fd 136 --max-threads 10 --max-queued-requests 10<br>
<br>
Jul 14 14:13:04 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;vcpuCount&#39;: &#39;1&#39;, &#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5905&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5904&#39;}], &#39;hash&#39;: &#39;242489...9-e01f21985049&#39;,<br>
Jul 14 14:13:20 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5901&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5900&#39;}], &#39;memUsage&#39;: &#39;27&#39;, &#39;acpiEnable&#39;: u...eRuntimeInfo&#39;: {<br>
Jul 14 14:13:20 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5903&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5902&#39;}], &#39;memUsage&#39;: &#39;19&#39;, &#39;acpiEnable&#39;: u...deRuntimeInfo&#39;:<br>
Jul 14 14:13:20 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;vcpuCount&#39;: &#39;1&#39;, &#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5905&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5904&#39;}], &#39;hash&#39;: &#39;242489...9-e01f21985049&#39;,<br>
Jul 14 14:13:36 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5901&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5900&#39;}], &#39;memUsage&#39;: &#39;27&#39;, &#39;acpiEnable&#39;: u...eRuntimeInfo&#39;: {<br>
Jul 14 14:13:36 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5903&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5902&#39;}], &#39;memUsage&#39;: &#39;19&#39;, &#39;acpiEnable&#39;: u...deRuntimeInfo&#39;:<br>
Jul 14 14:13:36 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;vcpuCount&#39;: &#39;1&#39;, &#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5905&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5904&#39;}], &#39;hash&#39;: &#39;242489...9-e01f21985049&#39;,<br>
Jul 14 14:13:52 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5901&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5900&#39;}], &#39;memUsage&#39;: &#39;27&#39;, &#39;acpiEnable&#39;: u...eRuntimeInfo&#39;: {<br>
Jul 14 14:13:52 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5903&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5902&#39;}], &#39;memUsage&#39;: &#39;19&#39;, &#39;acpiEnable&#39;: u...deRuntimeInfo&#39;:<br>
Jul 14 14:13:52 <a href="http://cultivar3.grove.silverorange.com" rel="noreferrer" target="_blank">cultivar3.grove.silverorange.com</a> vdsm[2356]: vdsm SchemaCache WARNING Provided parameters {&#39;vcpuCount&#39;: &#39;1&#39;, &#39;displayInfo&#39;: [{&#39;tlsPort&#39;: u&#39;5905&#39;, &#39;ipAddress&#39;: &#39;0&#39;, &#39;type&#39;: u&#39;spice&#39;, &#39;port&#39;: u&#39;5904&#39;}], &#39;hash&#39;: &#39;242489...9-e01f21985049&#39;,<br>
Hint: Some lines were ellipsized, use -l to show in full.<br>
<br>
<br>
Cheers,<br>
Gervais<br>
</blockquote></div><br></div>
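For anyone who wants to confirm that the agent is running into its per-process file descriptor limit (errno 24, EMFILE), here is a minimal sketch in Python, assuming a Linux host with /proc mounted. The function names (count_fds, parse_soft_nofile, report) are made up for illustration and are not part of oVirt or VDSM; the script only reads /proc and does not change anything.

```python
import os


def count_fds(fd_dir):
    """Count entries in a /proc/<pid>/fd style directory.

    Each entry is one open file descriptor of that process.
    """
    return len(os.listdir(fd_dir))


def parse_soft_nofile(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the value is not a number.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Fields: Max, open, files, <soft>, <hard>, files
            fields = line.split()
            value = fields[3]
            return int(value) if value.isdigit() else None
    return None


def report(pid):
    """Build a one-line summary of fd usage versus the soft limit for a PID."""
    fd_dir = "/proc/%s/fd" % pid
    with open("/proc/%s/limits" % pid) as f:
        limit = parse_soft_nofile(f.read())
    return "pid %s: %d open fds, soft limit %s" % (pid, count_fds(fd_dir), limit)
```

Run as root, something like print(report(19426)) with the current Main PID of ovirt-ha-agent from systemctl status should show how close the process is to its limit. If the count keeps growing between runs, that would be consistent with a descriptor leak rather than a limit that is simply set too low.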