[ovirt-devel] Host installation failure - master

Dan Kenigsberg danken at redhat.com
Tue Oct 24 12:41:41 UTC 2017


By "everywhere", do you mean that ovirtmgmt is defined with mtu 1500, or
the underlying lago networks?
I'm guessing (just guessing) that an MTU mismatch causes packets to be
dropped, leading to the failure to ssh into the host.
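
One way to test that guess, as a rough sketch only: compare the MTU of
every hop between engine and host, and work out the largest ping payload
that should pass unfragmented. The interface names and values in the
example dict are made up for illustration and are not taken from the
actual lago setup; /sys/class/net/<iface>/mtu is the usual Linux sysfs
location.

```python
from pathlib import Path

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def read_mtu(iface, sysfs_root="/sys/class/net"):
    """Return the MTU of a Linux interface as reported by sysfs."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def find_mismatches(mtus):
    """Given {name: mtu}, return the names whose MTU is below the largest one."""
    largest = max(mtus.values())
    return sorted(name for name, mtu in mtus.items() if mtu < largest)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = {"engine-eth0": 9000, "lago-bridge": 1500, "host-ovirtmgmt": 9000}
    print("below path maximum:", find_mismatches(example))
    print("ping payload to test jumbo frames:", max_ping_payload(9000))
```

On a real machine, read_mtu("ovirtmgmt") would return the configured
value. Running "ping -M do -s 8972 <host>" from the engine should then
fail if something on the path is still at 1500.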

On Tue, Oct 24, 2017 at 3:32 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
> OK, it seems to fail when I'm using Jumbo frames everywhere.
> Works well with mtu 1500.
> Y.
>
>
> On Mon, Oct 23, 2017 at 10:55 PM, Martin Perina <mperina at redhat.com> wrote:
>>
>>
>>
>> On Mon, Oct 23, 2017 at 9:38 PM, Roy Golan <rgolan at redhat.com> wrote:
>>>
>>>
>>>
>>> On Mon, 23 Oct 2017 at 21:51 Martin Perina <mperina at redhat.com> wrote:
>>>>
>>>> On Mon, Oct 23, 2017 at 6:21 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>>
>>>>> I'm failing to install hosts on o-s-t on Master.
>>>>> What worries me is not that I'm failing (though it is a bit of a
>>>>> surprise, perhaps something I've done?), but that there are no logs around
>>>>> it.
>>>>
>>>>
>>>> Please see my response below, but which logs are you
>>>> missing?
>>>>>
>>>>>
>>>>> /var/log/ovirt-engine/host-deploy is empty and so is
>>>>> /var/log/ovirt-engine/ansible.
>>>>
>>>>
>>>> Logs for both parts of host installation (host-deploy and ansible) are
>>>> in /var/log/ovirt-engine/host-deploy, but they are created only once
>>>> each part has successfully started.
>>>>>
>>>>>
>>>>>
>>>>> All I'm seeing:
>>>>> Host lago-basic-suite-master-host-0 installation failed. Unexpected
>>>>> connection termination.
>>>>>
>>>>> Server.log:
>>>>> 2017-10-23 12:16:33,041-04 WARN
>>>>> [org.apache.sshd.client.session.ClientSessionImpl]
>>>>> (sshd-SshClient[346b54f3]-nio2-thread-2) Exception caught:
>>>>> java.io.IOException: Connection timed out
>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) [rt.jar:1.8.0_151]
>>>>>         at
>>>>> sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishWrite(UnixAsynchronousSocketChannelImpl.java:582)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at
>>>>> sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(UnixAsynchronousSocketChannelImpl.java:190)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at
>>>>> sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(UnixAsynchronousSocketChannelImpl.java:213)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at
>>>>> sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:293)
>>>>> [rt.jar:1.8.0_151]
>>>>>         at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
>>>>>
>>>>>
>>>>> Engine.log:
>>>>>
>>>>> 2017-10-23 12:16:33,046-04 DEBUG
>>>>> [org.ovirt.engine.core.uutils.ssh.SSHClient]
>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Executed: 'umask 0077;
>>>>> MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)";
>>>>> trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr
>>>>> \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}"
>>>>> -x &&  "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
>>>>> DIALOG/customization=bool:True'
>>>>> 2017-10-23 12:16:33,056-04 ERROR
>>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [83a00e1]
>>>>> Error during deploy dialog
>>>>
>>>>
>>>> So this means that the SSH connection to the host, over which host-deploy
>>>> should have been started, failed. The reason is shown above in server.log:
>>>> the SSH connection timed out. This error appears even before host-deploy is
>>>> executed, which is why no host-deploy log was created.
>>>>
>>>>>
>>>>> 2017-10-23 12:16:33,057-04 DEBUG
>>>>> [org.ovirt.engine.core.uutils.ssh.SSHDialog]
>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] execute leave
>>>>> 2017-10-23 12:16:33,057-04 ERROR
>>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>>>> lago-basic-suite-master-host-0 install
>>>>> 2017-10-23 12:16:33,065-04 ERROR
>>>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] EVENT_ID:
>>>>> VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during
>>>>> installation of Host lago-basic-suite-master-host-0: Unexpected connection
>>>>> termination.
>>>>
>>>>
>>>> This is the ERROR event in audit_log for the above issue.
>>>>
>>> perhaps reposync was stalling the rpm installation/download and this
>>> triggered the ssh timeout?
>>
>>
>> As the host-deploy log hasn't been created, I'd say that this is the
>> initial connection timeout, so the engine couldn't connect to the host at all.
>> We would need to investigate on the host whether it was a firewall issue,
>> sshd not running, or something else.
>>
>>>
>>>
>>>>
>>>>>
>>>>> 2017-10-23 12:16:33,065-04 ERROR
>>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>>>> lago-basic-suite-master-host-0 install, preferring first exception:
>>>>> Unexpected connection termination
>>>>> 2017-10-23 12:16:33,065-04 ERROR
>>>>> [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Host installation failed
>>>>> for host 'c4138375-aa53-4c36-8907-306803ae4282',
>>>>> 'lago-basic-suite-master-host-0': Unexpected connection termination
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Devel mailing list
>>>>> Devel at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>>
>>
>>
>

