On 18/05/2021 12:32, Yedidyah Bar David wrote:
On Thu, May 13, 2021 at 3:34 PM Sketch <ovirt(a)rednsx.org> wrote:
> This is a new system running CentOS 8.3, with the oVirt-4.4 repo and all
> updates applied. When I try to install the hosted engine with my engine
> backup from 4.3.10, the installation fails with a "too many open files"
> error. My 8.3 hosts already had 1M system max files, which is more than
> any of my CentOS 7/oVirt 4.3 hosts have. I tried increasing it to 2M with
> no luck, so my suspicion is that the error is on the engine itself?
>
> I tried provisioning a new engine just to test, and I get SSH key errors
> instead of this one.
>
> Any suggestions?
>
> 2021-05-12 23:09:44,731-0700 ERROR ansible failed {
> "ansible_host": "localhost",
> "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
> "ansible_result": {
> "_ansible_no_log": false,
> "exception": "Traceback (most recent call last):\n File
\"/usr/lib/python3.6/site-packages/ansible/executor/task_executor.py\", line
665, in _execute\n result = self._handler.run(task_vars=variables)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/action/wait_for_connection.py\",
line 122, in run\n self._remove_tmp_path(self._connection._shell.tmpdir)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line
417, in _remove_tmp_path\n tmp_rm_res = self._low_level_execute_command(cmd,
sudoable=False)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line
1085, in _low_level_execute_command\n rc, stdout, stderr =
self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line
1191, in exec_command\n cmd = self._build_command(*args)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line
562, in _build_command\n self.sshpass_pipe =
os.pipe()\nOSError: [Errno 24] Too many open files\n\nDuring handling of the above
exception, another exception occurred:\n\nTraceback (most recent call last):\n File
\"/usr/lib/python3.6/site-packages/ansible/executor/task_executor.py\", line
147, in run\n res = self._execute()\n File
\"/usr/lib/python3.6/site-packages/ansible/executor/task_executor.py\", line
673, in _execute\n self._handler.cleanup()\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line
128, in cleanup\n self._remove_tmp_path(self._connection._shell.tmpdir)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line
417, in _remove_tmp_path\n tmp_rm_res = self._low_level_execute_command(cmd,
sudoable=False)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/action/__init__.py\", line
1085, in _low_level_execute_command\n rc, stdout, stderr =
self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line
1191, in exec_command\n cmd = self._build_command(*args)\n File
\"/usr/lib/python3.6/site-packages/ansible/plugins/connection/ssh.py\", line
562, in _build_command\n self.sshpass_pipe = os.pipe()\nOSError: [Errno 24] Too many
open files\n",
> "msg": "Unexpected failure during module execution.",
> "stdout": ""
> },
> "ansible_task": "Wait for the local VM",
> "ansible_type": "task",
> "status": "FAILED",
> "task_duration": 3605
So I suppose it failed after 3605 seconds, or 721 attempts (of 5 seconds each).
Do you see the VM in 'virsh list'?
Can you see the VM running (e.g. 'ps auxww | grep qemu')?
Can you try to ssh to it from the host (search the logs for
local_vm_ip for its local/private temporary address)?
Or perhaps open its console (e.g. 'virsh console HostedEngineLocal')?
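A minimal sketch for pulling that temporary address out of a setup log, in case it helps. It assumes only that the address appears as a dotted IPv4 string on a line mentioning local_vm_ip; the exact log layout is not shown in this thread, and the function name find_local_vm_ips is made up for illustration.

```python
import re
import sys

# Matches dotted-quad candidates; octet ranges are validated separately below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_local_vm_ips(log_text):
    """Return unique IPv4 addresses found on lines that mention local_vm_ip.

    Assumption: the setup log prints the address on the same line as the
    string "local_vm_ip". Adjust the filter if your log looks different.
    """
    found = []
    for line in log_text.splitlines():
        if "local_vm_ip" not in line:
            continue
        for candidate in IPV4_RE.findall(line):
            octets_ok = all(int(part) <= 255 for part in candidate.split("."))
            if octets_ok and candidate not in found:
                found.append(candidate)
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 find_ip.py <path-to-setup-log>
    with open(sys.argv[1], errors="replace") as handle:
        for address in find_local_vm_ips(handle.read()):
            print(address)
```

Any address it prints can then be tried with ssh from the host.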
That said, I'd personally also consider it a bug in ansible, unless
you made some relevant custom changes - the bug is that it seems to
leak open files.
Thanks and best regards,
I also hit the very same problem and error.
...
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Wait for the local VM]
[ ERROR ] OSError: [Errno 24] Too many open files
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Sync on engine machine]
[ ERROR ] fatal: [localhost]: FAILED! => {"msg": "Using a SSH password instead of a key is not possible because Host Key checking is enabled and sshpass does not support this. Please add this host's fingerprint to your known_hosts file to manage this host."}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
I can ssh to the VM at 192.168.1.124 on virbr0, and I can also open
its console and log in.
My oVirt host is itself a KVM guest.
ovirt-host-4.4.6-1.el8.x86_64
ovirt-hosted-engine-ha-2.4.7-1.el8.noarch
ovirt-hosted-engine-setup-2.5.0-1.el8.noarch