On 3 May 2020, at 13:48, Galit Rosenthal <grosenth@redhat.com> wrote:

I already checked this. It is not direct ssh: it requires the password and, if the host is not listed, approval of the host as well.


On Sun, May 3, 2020 at 2:19 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, May 3, 2020 at 1:47 PM Galit Rosenthal <grosenth@redhat.com> wrote:
Hi Didi,

I managed to reproduce this error locally.
Is there something you would like me to check?

Yes, please! Can you ssh from the engine to the hosts?
 

Regards,
Galit

On Sun, May 3, 2020 at 1:37 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Sun, May 3, 2020 at 12:57 PM oVirt Jenkins <jenkins@ovirt.org> wrote:
>
> A system test invoked by the "ovirt-master" change queue including change
> 108705,9 (ovirt-engine) failed. However, this change seems not to be the root
> cause for this failure. Change 107284,12 (ovirt-engine) that this change
> depends on or is based on, was detected as the cause of the testing failures.
>
> This change had been removed from the testing queue. Artifacts built from this
> change will not be released until either change 107284,12 (ovirt-engine) is
> fixed and this change is updated to refer to or rebased on the fixed version,
> or this change is modified to no longer depend on it.
>
> For further details about the change see:
> https://gerrit.ovirt.org/#/c/108705/9
>
> For further details about the change that seems to be the root cause behind the
> testing failures see:
> https://gerrit.ovirt.org/#/c/107284/12
>
> For failed test results see:
> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/23505/

The engine did manage to run ssh-copy-id to both host-0 and host-1, but
then failed, a few seconds later, while running ansible:

https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/23505/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap_pytest.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log

2020-05-03 05:54:49,779-04 ERROR
[org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor]
(EE-ManagedThreadFactory-engine-Thread-1) [13ea1de1] Error executing
playbook: Failed to add host to inventory: SSH timeout waiting for
response from 'lago-basic-suite-master-host-0'

The new ansible-runner prioritizes IPv6, so if the IPv6 setup is not entirely correct, it is going to fail.
It should be fixed by https://gerrit.ovirt.org/#/c/108725/
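
To check the IPv6 theory by hand, something like the following could be run on the engine. It is only a rough sketch: probe_ssh is a made-up helper name, port 22 and the 5 second timeout are assumptions, and the order getaddrinfo returns is not necessarily the order ansible-runner itself uses. It tries a plain TCP connect to every address the host name resolves to and reports which ones fail.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Try a TCP connect to every address `host` resolves to.

    Returns a list of (family, address, error) tuples in resolver order;
    error is None when the connect succeeded.
    """
    results = []
    # Ask the resolver for all stream-capable addresses (IPv4 and IPv6).
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            error = None
        except OSError as exc:  # includes timeouts and refused connections
            error = str(exc)
        results.append((label, sockaddr[0], error))
    return results
```

Calling probe_ssh("lago-basic-suite-master-host-0") and printing the result should show whether an IPv6 address is returned for the host and whether only that address fails to connect.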


A few other lines alongside this one, as well as similar ones for host-1,
do not give (me) more information.

lago did manage to collect logs from both hosts a few seconds later.
The vdsm logs are empty, and messages does not give me a clue.

What is the timeout on trying to ssh (with ansible)? engine.log shows
only 1-2 seconds from start to timeout. Perhaps we should make it a bit
longer?
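
To put a number on the "1-2 seconds", the gap between two engine.log entries can be computed with a small helper. This is a sketch only: parse_engine_ts and elapsed_seconds are made-up names, and it assumes the timestamp always looks like the one in the error above, with the UTC offset written as whole hours.

```python
from datetime import datetime


def parse_engine_ts(stamp):
    """Parse an engine.log timestamp such as '2020-05-03 05:54:49,779-04'."""
    # The log writes the UTC offset as hours only ('-04'); strptime's %z
    # needs hours and minutes, so pad it to '-0400' (assumed format).
    return datetime.strptime(stamp + "00", "%Y-%m-%d %H:%M:%S,%f%z")


def elapsed_seconds(start, end):
    """Seconds between two engine.log timestamps."""
    return (parse_engine_ts(end) - parse_engine_ts(start)).total_seconds()
```

Passing the timestamp of the line where the playbook run starts and the timestamp of the "SSH timeout" error gives the time ansible actually waited before giving up.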

Best regards,
--
Didi
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/WQ3JDHUPHR42VNWYBE7DGF4LRLOVC4V2/


--

GALIT ROSENTHAL

SOFTWARE ENGINEER

galit@redhat.com    T: 972-9-7692230    



--
Didi


List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/ASQJQW6Z44UCKAOA65PNACMFNVPWXX7R/