
This is a new system running CentOS 8.3, with the oVirt 4.4 repo and all updates applied. When I try to install the hosted engine with my engine backup from 4.3.10, the installation fails with a "too many open files" error. My 8.3 hosts already had a system-wide maximum of 1M open files, which is more than any of my CentOS 7/oVirt 4.3 hosts have. I tried increasing it to 2M with no luck, so my suspicion is that the error is on the engine itself?

I tried provisioning a new engine just to test, and I get SSH key errors instead of this one.

Any suggestions?

2021-05-12 23:09:44,731-0700 ERROR ansible failed {
  "ansible_host": "localhost",
  "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
  "ansible_result": {
    "_ansible_no_log": false,
    "exception": "Traceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/ansible/executor/task_executor.py\", line 665, in _execute\n result = self._handler.run(task_vars=variables)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/action/wait_for_connection.py\", line 122, in run\n self._remove_tmp_path(self._connection._shell.tmpdir)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line 417, in _remove_tmp_path\n tmp_rm_res = self._low_level_execute_command(cmd, sudoable=False)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line 1085, in _low_level_execute_command\n rc, stdout, stderr = self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line 1191, in exec_command\n cmd = self._build_command(*args)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line 562, in _build_command\n self.sshpass_pipe = os.pipe()\nOSError: [Errno 24] Too many open files\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/ansible/executor/task_executor.py\", line 147, in run\n res = self._execute()\n File \"/usr/lib/python3.6/site-packages/ansible/executor/task_executor.py\", line 673, in _execute\n self._handler.cleanup()\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line 128, in cleanup\n self._remove_tmp_path(self._connection._shell.tmpdir)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line 417, in _remove_tmp_path\n tmp_rm_res = self._low_level_execute_command(cmd, sudoable=False)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line 1085, in _low_level_execute_command\n rc, stdout, stderr = self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line 1191, in exec_command\n cmd = self._build_command(*args)\n File \"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line 562, in _build_command\n self.sshpass_pipe = os.pipe()\nOSError: [Errno 24] Too many open files\n",
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
  },
  "ansible_task": "Wait for the local VM",
  "ansible_type": "task",
  "status": "FAILED",
  "task_duration": 3605
}
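For what it's worth, a minimal diagnostic sketch for telling the system-wide limit apart from the per-process limit the deploy actually runs under - errno 24 (EMFILE) from os.pipe() normally means the per-process nofile limit was hit, not fs.file-max. The process name matched here ("ansible-playbook") is an assumption:

  # System-wide ceiling vs. the per-process soft limit in the current shell.
  cat /proc/sys/fs/file-max
  ulimit -n

  # Effective limit and current open-descriptor count of the deploy's
  # ansible process (matching on "ansible-playbook" is an assumption).
  pid=$(pgrep -f ansible-playbook | head -n1)
  grep 'Max open files' "/proc/$pid/limits"
  ls "/proc/$pid/fd" | wc -l

  # Re-running the last count while the "Wait for the local VM" task retries
  # should show whether the number keeps growing, i.e. a descriptor leak.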

On Thu, May 13, 2021 at 3:34 PM Sketch <ovirt@rednsx.org> wrote:
> When I try to install the hosted engine with my engine backup from 4.3.10, the installation fails with a too many open files error. [...]

So I suppose it failed after 3605 seconds, or 721 attempts (of 5 seconds each).

Do you see the VM in 'virsh list'? Can you see it running (e.g. 'ps auxww | grep qemu')? Can you try to ssh to it from the host (search the logs for local_vm_ip to find its local/private temporary address)? Perhaps open its console ('virsh console HostedEngineLocal')?

That said, I'd personally also consider this a bug in ansible, unless you made some relevant custom changes - the bug being that it seems to leak open files.

Thanks and best regards,
-- Didi
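Strung together as a rough sketch, the checks suggested above look roughly like this on the deploy host (the log directory is the usual ovirt-hosted-engine-setup location but is an assumption, and the address placeholder has to be filled in from those logs):

  # Is the bootstrap VM defined and running?
  virsh list --all
  ps auxww | grep qemu

  # Find its local/private temporary address in the setup logs
  # (directory is the usual location; adjust if yours differs).
  grep -r local_vm_ip /var/log/ovirt-hosted-engine-setup/

  # Try to reach it directly, or fall back to the serial console.
  ssh root@LOCAL_VM_IP        # replace LOCAL_VM_IP with the address found above
  virsh console HostedEngineLocal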

On 18/05/2021 12:32, Yedidyah Bar David wrote:
> That said, I'd personally also consider it a bug in ansible, unless you made some relevant custom changes - the bug is that it seems to leak open files.

I also hit the very same problem and error.

...
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Wait for the local VM]
[ ERROR ] OSError: [Errno 24] Too many open files
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Sync on engine machine]
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook

I can ssh to the VM via 192.168.1.124 on virbr0, and I can open its console and then log in. My oVirt host is itself a KVM.

ovirt-host-4.4.6-1.el8.x86_64
ovirt-hosted-engine-ha-2.4.7-1.el8.noarch
ovirt-hosted-engine-setup-2.5.0-1.el8.noarch
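The second failure is the host-key-checking/sshpass complaint, and its own error text points at known_hosts. One possible workaround, assuming 192.168.1.124 is still the address of the local VM, would be to record its host key on the deploy host before retrying:

  # Record the local VM's host key so that password-based SSH (sshpass)
  # can proceed with host key checking enabled.
  ssh-keyscan -H 192.168.1.124 >> ~/.ssh/known_hosts

  # Confirm that a host-key-checked connection now works.
  ssh root@192.168.1.124 true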