By "everywhere", do you mean that ovirtmgmt is defined with mtu 1500? Or
the underlying lago networks?
I'm guessing (just guessing) that a mismatch causes packets to drop,
leading to failure to ssh into host.
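
To make that guess checkable, here is a minimal sketch (not anything lago or
oVirt ships) of how one could look for an MTU mismatch on a Linux box. It
assumes the MTU is readable from /sys/class/net/<iface>/mtu and that the
traffic is plain IPv4 without options; all function names are hypothetical.

```python
from pathlib import Path

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes


def ping_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def read_mtus(sys_net="/sys/class/net"):
    """Map each interface name to its configured MTU (Linux sysfs layout)."""
    mtus = {}
    for iface in Path(sys_net).iterdir():
        mtu_file = iface / "mtu"
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


def mismatched(mtus, expected):
    """Return only the interfaces whose MTU differs from the expected value."""
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}
```

Running mismatched(read_mtus(), 9000) on the engine, the hosts and the
hypervisor should come back empty if jumbo frames really are set everywhere.
The payload helper gives the size to use with something like
"ping -M do -s <payload> <host>" to see whether full-size frames actually
pass end to end.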
On Tue, Oct 24, 2017 at 3:32 PM, Yaniv Kaul <ykaul(a)redhat.com> wrote:
OK, it seems to fail when I'm using Jumbo frames everywhere.
Works well with mtu 1500.
Y.
On Mon, Oct 23, 2017 at 10:55 PM, Martin Perina <mperina(a)redhat.com> wrote:
>
>
>
> On Mon, Oct 23, 2017 at 9:38 PM, Roy Golan <rgolan(a)redhat.com> wrote:
>>
>>
>>
>> On Mon, 23 Oct 2017 at 21:51 Martin Perina <mperina(a)redhat.com> wrote:
>>>
>>> On Mon, Oct 23, 2017 at 6:21 PM, Yaniv Kaul <ykaul(a)redhat.com> wrote:
>>>>
>>>> I'm failing to install hosts on o-s-t on Master.
>>>> What worries me is not that I'm failing (though it is a bit of a
>>>> surprise, perhaps something I've done?), but that there are no logs
>>>> around it.
>>>
>>>
>>> Please see my response below, but which logs are you
>>> missing?
>>>>
>>>>
>>>> /var/log/ovirt-engine/host-deploy is empty and so is
>>>> /var/log/ovirt-engine/ansible.
>>>
>>>
>>> Logs for both parts of host installation (host-deploy and ansible) are
>>> in /var/log/ovirt-engine/host-deploy, but they are created only once each
>>> part has successfully started.
>>>>
>>>>
>>>>
>>>> All I'm seeing:
>>>> Host lago-basic-suite-master-host-0 installation failed. Unexpected
>>>> connection termination.
>>>>
>>>> Server.log:
>>>> 2017-10-23 12:16:33,041-04 WARN
>>>> [org.apache.sshd.client.session.ClientSessionImpl]
>>>> (sshd-SshClient[346b54f3]-nio2-thread-2) Exception caught:
>>>> java.io.IOException: Connection timed out
>>>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>> [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>> [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>> [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.IOUtil.write(IOUtil.java:65) [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.UnixAsynchronousSocketChannelImpl.finishWrite(UnixAsynchronousSocketChannelImpl.java:582)
>>>> [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.UnixAsynchronousSocketChannelImpl.finish(UnixAsynchronousSocketChannelImpl.java:190)
>>>> [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.UnixAsynchronousSocketChannelImpl.onEvent(UnixAsynchronousSocketChannelImpl.java:213)
>>>> [rt.jar:1.8.0_151]
>>>> at sun.nio.ch.EPollPort$EventHandlerTask.run(EPollPort.java:293)
>>>> [rt.jar:1.8.0_151]
>>>> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
>>>>
>>>>
>>>> Engine.log:
>>>>
>>>> 2017-10-23 12:16:33,046-04 DEBUG
>>>> [org.ovirt.engine.core.uutils.ssh.SSHClient]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Executed: 'umask 0077;
>>>> MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap
>>>> "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" >
>>>> /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x &&
>>>> "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
>>>> DIALOG/customization=bool:True'
>>>> 2017-10-23 12:16:33,056-04 ERROR
>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy)
>>>> [83a00e1] Error during deploy dialog
>>>
>>>
>>> So this means that the SSH connection to the host, through which
>>> host-deploy should have been started, failed. The reason is shown above in
>>> server.log: the SSH connection timed out. This error appears even before
>>> host-deploy is executed, which is why no host-deploy log was created.
>>>
>>>>
>>>> 2017-10-23 12:16:33,057-04 DEBUG
>>>> [org.ovirt.engine.core.uutils.ssh.SSHDialog]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] execute leave
>>>> 2017-10-23 12:16:33,057-04 ERROR
>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>>> lago-basic-suite-master-host-0 install
>>>> 2017-10-23 12:16:33,065-04 ERROR
>>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] EVENT_ID:
>>>> VDS_INSTALL_IN_PROGRESS_ERROR(511), An error has occurred during
>>>> installation of Host lago-basic-suite-master-host-0: Unexpected
>>>> connection termination.
>>>
>>>
>>> This is the ERROR event in audit_log for the above issue.
>>>
>> Perhaps reposync was stalling the rpm installation/download and this
>> triggered the ssh timeout?
>
>
> As the host-deploy log hasn't been created, I'd say that this is the
> initial connection timeout, so the engine couldn't connect to the host at all.
> We would need to investigate the host to see whether it was a firewall issue,
> sshd not running, or something else.
>
>>
>>
>>>
>>>>
>>>> 2017-10-23 12:16:33,065-04 ERROR
>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Error during host
>>>> lago-basic-suite-master-host-0 install, preferring first exception:
>>>> Unexpected connection termination
>>>> 2017-10-23 12:16:33,065-04 ERROR
>>>> [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand]
>>>> (EE-ManagedThreadFactory-engine-Thread-1) [83a00e1] Host installation
>>>> failed for host 'c4138375-aa53-4c36-8907-306803ae4282',
>>>> 'lago-basic-suite-master-host-0': Unexpected connection termination
>>>>
>>>>
>>>> _______________________________________________
>>>> Devel mailing list
>>>> Devel(a)ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>
>