Hi,
I have figured out that the root cause of the deployment failure is a timeout while
the hosted engine was trying to connect to the host via SSH, as shown in engine.log (located in
/var/log/ovirt-hosted-engine-setup/engine-logs-2019-12-31T06:34:38Z/ovirt-engine):
2019-12-31 15:43:06,082+09 ERROR [org.ovirt.engine.core.bll.hostdeploy.AddVdsCommand]
(default task-1) [f48796e7-a4c5-4c09-a70d-956f0c4249b4] Failed to establish session with
host 'alice-ovirt-01.sdfarm.kr': SSH connection timed out connecting to
'root@alice-ovirt-01.sdfarm.kr'
2019-12-31 15:43:06,085+09 WARN [org.ovirt.engine.core.bll.hostdeploy.AddVdsCommand]
(default task-1) [f48796e7-a4c5-4c09-a70d-956f0c4249b4] Validation of action
'AddVds' failed for user admin@internal-authz. Reasons:
VAR__ACTION__ADD,VAR__TYPE__HOST,$server
alice-ovirt-01.sdfarm.kr,VDS_CANNOT_CONNECT_TO_SERVER
2019-12-31 15:43:06,129+09 ERROR
[org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) []
Operation Failed: [Cannot add Host. Connecting to host via SSH has failed, verify that the
host is reachable (IP address, routable address etc.) You may refer to the engine.log file
for further details.]
The FQDN of the hosted engine (alice-ovirt-engine.sdfarm.kr) resolves correctly, as does
that of the host (alice-ovirt-01.sdfarm.kr), and SSH is one of the services allowed by
firewalld. I believe the firewalld rules are configured automatically during the
deployment so that the hosted engine and the host can work together. Root access is also
configured to be allowed at the first stage of the deployment.
I was just wondering how I can verify that the hosted engine can reach the host at this
stage. Once the deployment fails, the deployment script rolls everything back (I
believe it cleans everything up) and the vm-status of hosted-engine is reported as un-deployed.
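To make the question concrete, the kind of check I have in mind is roughly the following. It is only a sketch, assuming Python 3 is available wherever the engine is running; the function name check_ssh_reachable and the 5-second timeout are my own choices, not anything taken from oVirt. It resolves the host name and then attempts a plain TCP connection to port 22, so it tests reachability only, not SSH authentication.

```python
import socket


def check_ssh_reachable(host, port=22, timeout=5.0):
    """Resolve `host` and try a TCP connection to `port`.

    Returns a tuple (ok, detail). This checks name resolution and TCP
    reachability only; it does not attempt an SSH login.
    """
    # Step 1: name resolution, reported separately from the connection attempt.
    try:
        addr = socket.gethostbyname(host)
    except OSError as exc:
        return False, f"name resolution for {host} failed: {exc}"

    # Step 2: TCP connect with a timeout, so a silently dropped packet
    # shows up as a timeout instead of hanging indefinitely.
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True, f"{host} ({addr}) accepts TCP connections on port {port}"
    except OSError as exc:
        return False, f"TCP connect to {addr}:{port} failed: {exc}"
```

Calling check_ssh_reachable("alice-ovirt-01.sdfarm.kr") from the engine side would at least tell me whether the failure is in name resolution, in the TCP connection itself (for example a timeout like the one in the log above), or somewhere later in the SSH session setup.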
Thank you in advance,
Best regards,
Sang-Un