On 28 May 2018 at 12:38, Piotr Kliczewski <piotr.kliczewski@gmail.com> wrote:
On Mon, May 28, 2018 at 10:57 AM, Barak Korren <bkorren@redhat.com> wrote:
> Note: we're now seeing a very similar issue in the 4.2 branch as well that
> seems to have been introduced by the following patch:

Can you point to a specific job so we can take a look at the logs?

Whoops, sorry, here:
http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/2034/
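
In case it helps when going through the engine log of that job, here is a rough sketch for pulling out SSH session timeout errors like the two quoted further down in this thread. The entry layout (timestamp, level, logger in brackets) is only assumed from those two quoted entries, and the function name is made up for illustration, so treat it as a starting point rather than a proper parser.

```python
import re

# Start of a log entry: timestamp, level, [logger].
# Layout assumed from the two ERROR entries quoted in this thread.
ENTRY_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]"
)


def find_ssh_timeouts(lines):
    """Return (timestamp, logger) for each ERROR entry that mentions
    an SSH session timeout. Hypothetical helper, not part of any tool."""
    entries = []
    current = None
    for line in lines:
        match = ENTRY_START.match(line)
        if match:
            # A new entry begins on this line.
            current = {
                "ts": match["ts"],
                "level": match["level"],
                "logger": match["logger"],
                "parts": [line],
            }
            entries.append(current)
        elif current is not None:
            # Continuation of a wrapped or multi-line entry.
            current["parts"].append(line)

    hits = []
    for entry in entries:
        # Collapse whitespace so a phrase split across lines still matches.
        text = " ".join(" ".join(entry["parts"]).split())
        if entry["level"] == "ERROR" and "SSH session timeout" in text:
            hits.append((entry["ts"], entry["logger"]))
    return hits
```

It can be fed an open file object directly, e.g. `find_ssh_timeouts(open(path))`, where `path` points at wherever the engine log from the job artifacts was saved.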
 

>
> https://gerrit.ovirt.org/c/91638/2 - core: Enable only strong ciphers for
> 4.2 hosts
>
> On 28 May 2018 at 10:26, Barak Korren <bkorren@redhat.com> wrote:
>>
>>
>>
>> On 28 May 2018 at 10:19, Martin Perina <mperina@redhat.com> wrote:
>>>
>>>
>>>
>>> On Mon, May 28, 2018 at 9:00 AM, Piotr Kliczewski <pkliczew@redhat.com>
>>> wrote:
>>>>
>>>> Simone,
>>>>
>>>> What do you think about this failure?
>>>>
>>>> Thanks,
>>>> Piotr
>>>>
>>>> On Mon, May 28, 2018 at 7:12 AM, Barak Korren <bkorren@redhat.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 27 May 2018 at 14:59, Piotr Kliczewski <pkliczew@redhat.com> wrote:
>>>>>>
>>>>>> Martin,
>>>>>>
>>>>>> I can only see:
>>>>>>
>>>>>> 2018-05-25 13:57:44,255-04 ERROR
>>>>>> [org.ovirt.engine.core.uutils.ssh.SSHDialog]
>>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [55a7b15b] SSH error running
>>>>>> command root@lago-upgrade-from-release-suite-master-host-0:'umask 0077;
>>>>>> MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap
>>>>>> "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" >
>>>>>> /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x &&
>>>>>> "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine
>>>>>> DIALOG/customization=bool:True': TimeLimitExceededException: SSH session
>>>>>> timeout host 'root@lago-upgrade-from-release-suite-master-host-0'
>>>>>> 2018-05-25 13:57:44,259-04 ERROR
>>>>>> [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase]
>>>>>> (EE-ManagedThreadFactory-engine-Thread-1) [55a7b15b] Timeout during host
>>>>>> lago-upgrade-from-release-suite-master-host-0 install: SSH session timeout
>>>>>> host 'root@lago-upgrade-from-release-suite-master-host-0'
>>>>>>
>>>>>> There are no additional logs. The SSH connection to the host timed out.
>>>>>> Are we sure that this issue is caused by Ravi's change?
>>>>>
>>>>>
>>>>> We have some quite strong circumstantial evidence:
>>>>> - The issue has affected all engine patches since that patch in a similar
>>>>> fashion.
>>>>> - The prior engine patch [1] passed successfully [2].
>>>>> - Other subsequent OST runs without engine patches passed successfully
>>>>> as well [3].
>>>>>
>>>>> [1]: https://gerrit.ovirt.org/c/91595/2
>>>>> [2]:
>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7777/
>>>>> [3]:
>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7778/
>>>>>
>>>>>
>>>>> Please note - the issue is affecting a test that is run by an upgrade
>>>>> suite on the post-upgrade system. It has no effect on the basic suite. So it
>>>>> probably has to do with some behaviour that is specific to upgraded systems.
>>>
>>>
>>> I will try to reproduce later today in a dev env, but I agree with Piotr's
>>> investigation: the engine was not able to connect to the host using SSH, and
>>> that's why no host-deploy logs were fetched.
>>
>>
>> Lago fetches the logs from the host too (and it can take them from the VM
>> image directly if the host is not responsive over SSH). Can we get at the
>> host-deploy logs that way?
>>
>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Piotr
>>>>>>
>>>>>> On Sun, May 27, 2018 at 11:21 AM, Martin Perina <mperina@redhat.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Adding also Piotr to the thread
>>>>>>>
>>>>>>>
>>>>>>> On Sun, 27 May 2018, 08:46 Barak Korren, <bkorren@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Test failed: [ AddHost (in upgrade-from-release-suite) ]
>>>>>>>>
>>>>>>>> Link to suspected patches:
>>>>>>>> https://gerrit.ovirt.org/#/c/91445/5 - Disable TLS versions < 1.2
>>>>>>>> for hosts with cluster level>=4.1
>>>>>>>>
>>>>>>>> Link to Job:
>>>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7776/
>>>>>>>>
>>>>>>>> Link to all logs:
>>>>>>>>
>>>>>>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7776/artifact/exported-artifacts/upgrade-from-release-suit-master-el7/test_logs/upgrade-from-release-suite-master/post-002_bootstrap.py/
>>>>>>>>
>>>>>>>> Error snippet from log:
>>>>>>>>
>>>>>>>> From nosetests log:
>>>>>>>> <error>
>>>>>>>>
>>>>>>>> AssertionError: False != True after 1200 seconds
>>>>>>>>
>>>>>>>> </error>
>>>>>>>>
>>>>>>>> I am not finding a host-deploy log in /var/log/ovirt-engine for some
>>>>>>>> reason.
>>>>>>>> This seems to have caused consistent failures in all other engine
>>>>>>>> patches that followed it.
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Barak Korren
>>>>>>>> RHV DevOps team , RHCE, RHCi
>>>>>>>> Red Hat EMEA
>>>>>>>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Martin Perina
>>> Associate Manager, Software Engineering
>>> Red Hat Czech s.r.o.
>>
>>
>>
>>
>
>
>
>
>
> _______________________________________________
> Devel mailing list -- devel@ovirt.org
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/QIZ5L4FKII7X5FHQ4OXBBR2SLUIK5C74/
>



--
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted