On 28 May 2018 at 10:19, Martin Perina <mperina@redhat.com> wrote:

On Mon, May 28, 2018 at 9:00 AM, Piotr Kliczewski <pkliczew@redhat.com> wrote:

Simone,

What do you think about this failure?

Thanks,
Piotr

On Mon, May 28, 2018 at 7:12 AM, Barak Korren <bkorren@redhat.com> wrote:

On 27 May 2018 at 14:59, Piotr Kliczewski <pkliczew@redhat.com> wrote:

Martin,

I only can see:

2018-05-25 13:57:44,255-04 ERROR [org.ovirt.engine.core.uutils.ssh.SSHDialog] (EE-ManagedThreadFactory-engine-Thread-1) [55a7b15b] SSH error running command root@lago-upgrade-from-release-suite-master-host-0:'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True ': TimeLimitExceededException: SSH session timeout host 'root@lago-upgrade-from-release-suite-master-host-0'
2018-05-25 13:57:44,259-04 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (EE-ManagedThreadFactory-engine-Thread-1) [55a7b15b] Timeout during host lago-upgrade-from-release-suite-master-host-0 install: SSH session timeout host 'root@lago-upgrade-from-release-suite-master-host-0'

There are no additional logs. SSH to host timeout. Are we sure that it is an issue caused by Ravi's change?

We have some quite strong circumstantial evidence:
- Issue had affected all engine patches since that patch in a similar fashion.
- Prior engine patch [1] passed successfully [2]
- Other subsequent OST runs without engine patches passed successfully as well [3].

[1]: https://gerrit.ovirt.org/c/91595/2
[2]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7777/
[3]: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7778/

Please note - the issue is affecting a test that is run by an upgrade suite on the post-upgrade system. It has no effect on the basic suite. So it probably has to do with some behaviour that is specific to upgraded systems.

I will try to reproduce later today in dev env, but I agree with Piotr's investigation, engine was not able to connect to the host using SSH and that's why no host-deploy logs were fetched.

Lago fetches the logs from the host too (and it can take them from the VM image directly if the host is not responsive over SSH), can we get at the host-deploy logs that way?

Thanks,
Piotr

On Sun, May 27, 2018 at 11:21 AM, Martin Perina <mperina@redhat.com> wrote:

Adding also Piotr to the thread

On Sun, 27 May 2018, 08:46 Barak Korren, <bkorren@redhat.com> wrote:

This seems to have caused consistent failure in all other engine patches that followed it.

Test failed: [ AddHost (in upgrade-from-release-suite) ]

Not finding a host deploy log in /var/log/ovirt-engine for some reason.

Link to suspected patches:
https://gerrit.ovirt.org/#/c/91445/5 - Disable TLS versions < 1.2 for hosts with cluster level>=4.1
Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7776/
Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7776/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-002_bootstrap.py/
Error snippet from log:
From nosetests log:
<error>
AssertionError: False != True after 1200 seconds
</error>
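
For readers unfamiliar with this kind of message: an error of the form "False != True after 1200 seconds" is what a poll-until-true assertion typically raises when the condition (here, presumably the added host coming up) never becomes true within the timeout. The sketch below is only an illustration of that pattern. The actual OST helper is not quoted in this thread, so the function name, its signature and the 3-second polling interval are assumptions.

```python
import time


def assert_true_within(func, timeout, interval=3.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll func() until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only; the name, signature and
    default interval are assumptions, not taken from the OST code base.
    `clock` and `sleep` are injectable so the loop can be exercised
    without really waiting.
    """
    deadline = clock() + timeout
    while True:
        result = func()
        if result is True:
            return
        if clock() >= deadline:
            # Same shape as the message in the test log above.
            raise AssertionError(
                f"{result} != True after {timeout} seconds"
            )
        sleep(interval)


# Example call (host_is_up is a hypothetical predicate that would ask
# the engine whether the host has reached the "up" state):
#
#     assert_true_within(host_is_up, timeout=1200)
```

With a 1200 second timeout, a host whose deployment never starts (for example because the engine cannot open the SSH session) keeps the predicate at False until the deadline passes, which would match the symptom reported here.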
--Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
--Martin Perina
Associate Manager, Software Engineering
Red Hat Czech s.r.o.