[ovirt-devel] Host installation failure - master

Martin Perina mperina at redhat.com
Mon Oct 23 19:55:46 UTC 2017


On Mon, Oct 23, 2017 at 9:38 PM, Roy Golan <rgolan at redhat.com> wrote:

>
>
> On Mon, 23 Oct 2017 at 21:51 Martin Perina <mperina at redhat.com> wrote:
>
>> On Mon, Oct 23, 2017 at 6:21 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>
>>> I'm failing to install hosts on o-s-t on Master.
>>> What worries me is not that I'm failing (though it is a bit of a
>>> surprise, perhaps something I've done?), but that there are no logs around
>>> it.
>>>
>>
>> Please see my response below, but which logs are you missing?
>>>
>>> /var/log/ovirt-engine/host-deploy is empty and so is
>>> /var/log/ovirt-engine/ansible.
>>>
>>
>> Logs for both parts of host installation (host-deploy and ansible) are
>> in /var/log/ovirt-engine/host-deploy, but they are created only once each
>> part has successfully started.
>>>
>>>
>>> All I'm seeing:
>>> Host lago-basic-suite-master-host-0 installation failed. Unexpected
>>> connection termination.
>>>
>>> Server.log:
>>> 2017-10-23 12:16:33,041-04 WARN  [org.apache.sshd.client.session.ClientSessionImpl]
>>> (sshd-SshClient[346b54f3]-nio2-thread-2) Exception caught:
>>> java.io.IOException: Connection timed out
>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishWrite(UnixAsynchronousSocketChannelImpl.java:582) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(UnixAsynchronousSocketChannelImpl.java:190) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(UnixAsynchronousSocketChannelImpl.java:213) [rt.jar:1.8.0_151]
>>>         at sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:293) [rt.jar:1.8.0_151]
>>>         at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
>>>
>>>
>>> Engine.log:
>>>
>>> 2017-10-23 12:16:33,046-04 DEBUG [org.ovirt.engine.core.uutils.ssh.SSHClient]
>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Executed: 'umask
>>> 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)";
>>> trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr
>>> \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}"
>>> -x &&  "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
>>> DIALOG/customization=bool:True'
>>> 2017-10-23 12:16:33,056-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>> (VdsDeploy) [83a00e1] Error during deploy dialog
>>>
>>
>> So this means that the SSH connection to the host, over which host-deploy
>> should have been started, failed. The reason is shown above in server.log:
>> the SSH connection timed out. This error appears even before host-deploy is
>> executed, which is why no host-deploy log was created.
>>
>>> 2017-10-23 12:16:33,057-04 DEBUG [org.ovirt.engine.core.uutils.ssh.SSHDialog]
>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] execute leave
>>> 2017-10-23 12:16:33,057-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>> lago-basic-suite-master-host-0 install
>>> 2017-10-23 12:16:33,065-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (EE-ManagedThreadFactory-engine-Thread-1)
>>> [83a00e1] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has
>>> occurred during installation of Host lago-basic-suite-master-host-0:
>>> Unexpected connection termination.
>>>
>>
>> Here is an ERROR event in audit_log for the above issue.
>>
> Perhaps reposync was stalling the rpm installation/download and this
> triggered the ssh timeout?
>

As the host-deploy log hasn't been created, I'd say that this is the
initial connection timeout, so the engine couldn't connect to the host at all.
We would need to investigate on the host whether it was a firewall issue,
sshd not running, or something else.
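
As a first check from the engine machine, a small Python sketch like the
one below could help tell those cases apart. It is only an illustration:
probe_ssh is a hypothetical helper and not part of oVirt, and port 22 and
the 5 second timeout are assumptions. It opens a plain TCP connection and
looks for the SSH identification banner, nothing more.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Describe what happens when opening a TCP connection to host:port.

    Hypothetical helper for illustration only; it does not authenticate
    or run anything on the host.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                # An SSH server sends its identification string first.
                banner = sock.recv(256)
            except socket.timeout:
                return "connected, but no SSH banner received"
            if banner.startswith(b"SSH-"):
                text = banner.decode("ascii", "replace").strip()
                return "sshd is answering: " + text
            return "port open, but the peer did not send an SSH banner"
    except socket.timeout:
        # Typical when packets are silently dropped or the host is down.
        return "connection timed out"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on the port.
        return "connection refused"
    except OSError as exc:
        # Name resolution failures, unreachable network, and so on.
        return "connection failed: %s" % exc
```

Calling probe_ssh("lago-basic-suite-master-host-0") from the engine would
then give a rough hint: a timeout points towards firewall or network
problems, while a refused connection suggests sshd is not running.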
​


>
>>>
>>
>>> 2017-10-23 12:16:33,065-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>> lago-basic-suite-master-host-0 install, preferring first exception:
>>> Unexpected connection termination
>>> 2017-10-23 12:16:33,065-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Host installation
>>> failed for host 'c4138375-aa53-4c36-8907-306803ae4282',
>>> 'lago-basic-suite-master-host-0': Unexpected connection termination
>>>
>>>
>>> _______________________________________________
>>> Devel mailing list
>>> Devel at ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>
>