Thanks Didi,
Great pointer. I have just performed a fresh deploy and am in the
hosted-engine VM. In /var/log/ovirt-engine/engine.log, I can see the
following 3 lines cycling over and over again:
2021-01-19 05:12:11,395-05 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to rhvh1.example.org/192.168.50.31
2021-01-19 05:12:11,399-05 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-96) [] Unable to RefreshCapabilities: ConnectException: Connection refused
2021-01-19 05:12:11,401-05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesAsyncVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-96) [] Command 'GetCapabilitiesAsyncVDSCommand(HostName = rhvh1.example.org, VdsIdAndVdsVDSCommandParametersBase:{hostId='12057f7e-a4cf-46ec-b563-c1037ba5c62d', vds='Host[rhvh1.example.org,12057f7e-a4cf-46ec-b563-c1037ba5c62d]'})' execution failed: java.net.ConnectException: Connection refused
I can ping 192.168.50.31 and resolve rhvh1.example.org. However, I note
that firewalld on the hypervisor host (192.168.50.31) hasn't had
anything allowed through it yet apart from SSH and Cockpit. Is this a
problem, or a red herring?
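One way to test the firewall theory is to probe the host's ports directly from the engine VM. Below is a minimal sketch, assuming VDSM listens on its default TCP port 54321 (an assumption; the log lines above do not show the port) and that python3 is available on the engine VM. `probe_tcp` is a hypothetical helper, not part of oVirt. As a rule of thumb, "refused" means the host answered but nothing accepted the connection, whereas a firewalld reject usually surfaces as "No route to host" and a silent drop as a timeout.

```python
import errno
import socket

VDSM_PORT = 54321  # assumption: VDSM's default management port


def probe_tcp(host, port, timeout=3.0):
    """Classify a single TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: reachable, but nothing accepted it.
        return "refused"
    except socket.timeout:
        # No reply at all within the timeout, e.g. packets silently dropped.
        return "timeout"
    except OSError as exc:
        # Typically an ICMP reject or a routing problem.
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return "error: %s" % exc


# Example, run from the engine VM:
#   for port in (22, VDSM_PORT):
#       print(port, probe_tcp("rhvh1.example.org", port))
```

If port 22 reports "open" while the VDSM port reports "refused", the firewall is less likely to be the cause than the service simply not listening yet. That is a heuristic, not a definitive diagnosis.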
It seems that the hosted-engine is coming up and being installed and
configured OK. The engine health page looks OK (as validated by
Ansible). It looks like the hosted-engine is waiting for something to
happen on the host itself, but this never completes, and I suspect it
never will given that the engine cannot connect to the host.
Am I on the right track?
Yedidyah Bar David wrote on 19/01/2021 10:06:
On Tue, Jan 19, 2021 at 11:44 AM <james.freeman(a)a24.io> wrote:
> Hi all
>
> I am in the process of migrating a RHV 4.3 setup to RHV 4.4 and struggling with the setup. I am installing on RHEL 8.3, using settings backed up from the RHV 4.3 install (via 'hosted-engine --deploy --restore-from-file=backup.bck').
>
> The install process always fails at the same point for me at the moment, and I can't figure out how to get past it. As far as install progress goes, the local hosted-engine comes up and runs on the node. I have been able to grep for local_vm_ip in the logs, and can SSH into it with the password I set during the setup phase.
>
> However the install playbooks always fail with:
> 2021-01-18 18:38:00,086-0500 ERROR otopi.plugins.gr_he_common.core.misc misc._terminate:167 Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
>
> Earlier in the logs, I note the following:
> 2021-01-18 18:34:51,258-0500 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}
> 2021-01-18 18:37:16,661-0500 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: FAILED! => {"changed": false, "msg": "The system may not be provisioned according to the playbook results: please check the logs for the issue, fix accordingly or re-deploy from scratch.\n"}
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
>     method['method']()
>   File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-ansiblesetup/core/misc.py", line 435, in _closeup
>     raise RuntimeError(_('Failed executing ansible-playbook'))
> RuntimeError: Failed executing ansible-playbook
> 2021-01-18 18:37:18,996-0500 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Closing up': Failed executing ansible-playbook
> 2021-01-18 18:37:32,421-0500 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 fatal: [localhost]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host rhvm.example.org port 22: No route to host", "skip_reason": "Host localhost is unreachable", "unreachable": true}
>
> I find the unreachable message a bit odd, as at this stage all that has happened is that the local hosted-engine has been brought up to be configured, and so it is running on virbr0, not on my actual network. As a result, that DNS name will never be reachable, because the IP it resolves to won't be up. I gave the installation script permission to modify the local /etc/hosts, but this hasn't improved things.
>
> I presume I'm missing something in the install process, or earlier on in the logs, but I've been scanning for errors and possible clues to no avail.
>
> Any and all help greatly appreciated!
Please check/share, on the engine machine under /var/log/ovirt-engine,
or, if inaccessible, on the host, under
/var/log/ovirt-hosted-engine-setup/engine-logs-*:
engine.log
host-deploy/*
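When going through engine.log, a rough way to see which components keep failing is to count ERROR entries per logger. This is a minimal sketch, assuming each entry starts with a timestamp, a level, and a bracketed logger name, as in the excerpts earlier in this thread. `count_by_logger` is a hypothetical helper, not part of oVirt.

```python
import re
from collections import Counter

# Matches the start of an entry such as:
#   2021-01-19 05:12:11,399-05 ERROR [org.ovirt...HostMonitoring] (...) [] ...
ENTRY = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\S*\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]"
)


def count_by_logger(lines, level="ERROR"):
    """Count entries of the given level, keyed by the logger's short class name."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match and match.group("level") == level:
            short_name = match.group("logger").rsplit(".", 1)[-1]
            counts[short_name] += 1
    return counts


# Example:
#   with open("/var/log/ovirt-engine/engine.log") as log:
#       for name, n in count_by_logger(log).most_common(10):
#           print(n, name)
```

Continuation lines such as stack traces do not match the pattern and are skipped, so the counts cover only the first line of each entry.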
Good luck and best regards,