[ovirt-devel] Host installation failure - master

Yaniv Kaul ykaul at redhat.com
Tue Oct 24 12:32:45 UTC 2017


OK, it seems to fail when I'm using jumbo frames everywhere.
It works well with MTU 1500.
Y.
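
For anyone who wants to check the same thing on their own setup, here is a minimal sketch. It assumes the cause is an MTU mismatch somewhere on the path, which this thread does not confirm. It computes the largest ICMP echo payload that fits in one unfragmented IPv4 packet (MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header) and builds a Linux iputils `ping` command with fragmentation prohibited (`-M do`). The helper names are hypothetical, and the host name is simply the one from the logs below.

```python
import shlex

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER + ICMP_HEADER
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} leaves no room for a payload")
    return mtu - overhead


def ping_command(host: str, mtu: int, count: int = 3) -> str:
    """Build an iputils ping command that forbids fragmentation (-M do)."""
    size = icmp_payload_for_mtu(mtu)
    args = ["ping", "-M", "do", "-c", str(count), "-s", str(size), host]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Hypothetical usage: compare a standard and a jumbo MTU.
    for mtu in (1500, 9000):
        print(ping_command("lago-basic-suite-master-host-0", mtu))
```

If the command built for MTU 9000 gets no replies between engine and host while the one for MTU 1500 does, some hop on the path is probably not passing jumbo frames.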

On Mon, Oct 23, 2017 at 10:55 PM, Martin Perina <mperina at redhat.com> wrote:

>
>
> On Mon, Oct 23, 2017 at 9:38 PM, Roy Golan <rgolan at redhat.com> wrote:
>
>>
>>
>> On Mon, 23 Oct 2017 at 21:51 Martin Perina <mperina at redhat.com> wrote:
>>
>>> On Mon, Oct 23, 2017 at 6:21 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>
>>>> I'm failing to install hosts on o-s-t on Master.
>>>> What worries me is not that I'm failing (though it is a bit of a
>>>> surprise, perhaps something I've done?), but that there are no logs around
>>>> it.
>>>>
>>>
>>> Please see my response below, but which logs are you missing?
>>>
>>>>
>>>> /var/log/ovirt-engine/host-deploy is empty and so is
>>>> /var/log/ovirt-engine/ansible.
>>>>
>>>
>>> Logs for both parts of host installation (host-deploy and ansible) are
>>> in /var/log/ovirt-engine/host-deploy, but they are created only once each
>>> part has successfully started.
>>>
>>>>
>>>>
>>>> All I'm seeing:
>>>> Host lago-basic-suite-master-host-0 installation failed. Unexpected
>>>> connection termination.
>>>>
>>>> Server.log:
>>>> 2017-10-23 12:16:33,041-04 WARN  [org.apache.sshd.client.session.ClientSessionImpl]
>>>> (sshd-SshClient[346b54f3]-nio2-thread-2) Exception caught:
>>>> java.io.IOException: Connection timed out
>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>> [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>> [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>> [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishWrite(UnixAsynchronousSocketChannelImpl.java:582)
>>>> [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(UnixAsynchronousSocketChannelImpl.java:190)
>>>> [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(UnixAsynchronousSocketChannelImpl.java:213)
>>>> [rt.jar:1.8.0_151]
>>>>         at sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:293)
>>>> [rt.jar:1.8.0_151]
>>>>         at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
>>>>
>>>>
>>>> Engine.log:
>>>>
>>>> 2017-10-23 12:16:33,046-04 DEBUG [org.ovirt.engine.core.uutils.ssh.SSHClient]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Executed: 'umask
>>>> 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)";
>>>> trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr
>>>> \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}"
>>>> -x &&  "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
>>>> DIALOG/customization=bool:True'
>>>> 2017-10-23 12:16:33,056-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>> (VdsDeploy) [83a00e1] Error during deploy dialog
>>>>
>>>
>>> So this means that the SSH connection to the host, over which host-deploy
>>> should have been started, failed. The reason is shown above in server.log:
>>> the SSH connection timed out. This error occurs even before host-deploy is
>>> executed, which is why no host-deploy log was created.
>>>
>>>
>>>> 2017-10-23 12:16:33,057-04 DEBUG [org.ovirt.engine.core.uutils.ssh.SSHDialog]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] execute leave
>>>> 2017-10-23 12:16:33,057-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>>> lago-basic-suite-master-host-0 install
>>>> 2017-10-23 12:16:33,065-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>> (EE-ManagedThreadFactory-engine-Thread-1)
>>>> [83a00e1] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has
>>>> occurred during installation of Host lago-basic-suite-master-host-0:
>>>> Unexpected connection termination.
>>>>
>>>
>>> Here is an ERROR event in audit_log for the above issue.
>>>
>> Perhaps reposync was stalling the rpm installation/download and this
>> triggered the ssh timeout?
>>
>
> As the host-deploy log hasn't been created, I'd say that this is the
> initial connection timeout, so the engine couldn't connect to the host at
> all. We would need to investigate on the host whether it was a firewall
> issue, sshd not running, or something else.
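
The cases listed above can be told apart roughly with a plain TCP probe of the SSH port. The following is only a sketch with a hypothetical helper name, not something the engine does. A refused connection suggests sshd is not listening, while a timeout suggests packets are being dropped, for example by a firewall. A successful connect does not rule out an MTU problem, because the small handshake packets fit in any MTU.

```python
import socket


def probe_tcp(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Classify a TCP connect attempt as open, refused, timeout, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"      # something accepted the connection
    except socket.timeout:
        return "timeout"       # no answer: packets are likely dropped
    except ConnectionRefusedError:
        return "refused"       # host reachable, nothing listening on the port
    except OSError as exc:
        return f"error: {exc}"  # e.g. name resolution or routing failure


if __name__ == "__main__":
    # Hypothetical usage with the host name from the logs in this thread.
    print(probe_tcp("lago-basic-suite-master-host-0"))
```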
>
>>>
>>>> 2017-10-23 12:16:33,065-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>>> lago-basic-suite-master-host-0 install, preferring first exception:
>>>> Unexpected connection termination
>>>> 2017-10-23 12:16:33,065-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
>>>> (EE-ManagedThreadFactory-engine-Thread-1)
>>>> [83a00e1] Host installation failed for host 'c4138375-aa53-4c36-8907-306803ae4282',
>>>> 'lago-basic-suite-master-host-0': Unexpected connection termination
>>>>
>>>>
>>>> _______________________________________________
>>>> Devel mailing list
>>>> Devel at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>
>>
>>
>