On Wed, Apr 1, 2020 at 11:15 AM Marcin Sobczyk <msobczyk(a)redhat.com> wrote:
On 4/1/20 11:06 AM, Marcin Sobczyk wrote:
>
>
> On 4/1/20 9:51 AM, Marcin Sobczyk wrote:
>> Hi,
>>
>> On 4/1/20 8:44 AM, Yedidyah Bar David wrote:
>>> On Wed, Apr 1, 2020 at 6:21 AM <jenkins(a)jenkins.phx.ovirt.org> wrote:
>>>> Project:
>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
>>>>
>>>> Build:
>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1548/
>>> Previous build 1547 passed, after many months of failing, thanks to
>>> Evgeny's work in recent weeks. The build above (1548) failed.
>>> I think the root cause is that the engine tried to connect to vdsm
>>> right after successfully finishing ansible host-deploy, but failed.
>>> vdsm.log has:
>>>
>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/15...
>>>
>>> 2020-03-31 22:58:49,773-0400 ERROR (Reactor thread) [vds.dispatcher]
>>> uncaptured python exception, closing channel
>>> <yajsonrpc.betterAsyncore.Dispatcher connected
>>> ('::ffff:192.168.222.76', 46754, 0, 0) at 0x7f416c150a90>
>>> (<class 'ssl.SSLError'>:[X509] no certificate or crl found (_ssl.c:3771)
>>> [/usr/lib64/python3.6/asyncore.py|readwrite|110]
>>> [/usr/lib64/python3.6/asyncore.py|handle_write_event|442]
>>> [/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py|handle_write|74]
>>> [/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py|_delegate_call|168]
>>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|handle_write|190]
>>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|_handle_io|194]
>>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|_set_up_socket|154])
>>> (betterAsyncore:179)
>>>
>>> Not sure what might have caused this. Can anyone have a look? Thanks.
>> Probably caused by https://gerrit.ovirt.org/108016
>> Looking into this.
>>
> Turns out that the patch is not the cause of the error per se - it simply
> uncovered a different problem - the CA on the hosts is broken:
>
> [root@lago-basic-suite-master-host-0 certs]# openssl x509 -in /etc/pki/vdsm/certs/cacert.pem -text
> unable to load certificate
> 139987452258112:error:0909006C:PEM routines:get_name:no start line:crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE
It looks like the certificate files have spaces instead of newlines.
When I manually replaced the spaces with newlines, openssl was able to
read them.
Martin/Dana, could this be caused by any recent changes in the
ansible-runner integration?
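
For reference, the manual space-to-newline repair described above could be
sketched roughly as follows. This is only an illustration, not what was
actually run on the host: `repair_pem` is a hypothetical helper name, and it
assumes the file contains plain PEM blocks (BEGIN/END markers around base64
data, no encapsulated headers) whose line breaks were turned into spaces.

```python
import re

# One PEM block. The label (e.g. "CERTIFICATE" or "TRUSTED CERTIFICATE")
# may itself contain spaces, so the markers are matched as a whole.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----(.*?)-----END \1-----\s*",
    re.DOTALL,
)


def repair_pem(text):
    """Rebuild PEM blocks whose newlines were replaced by spaces.

    Hypothetical helper: assumes the payload is plain base64.
    """

    def rebuild(match):
        label = match.group(1)
        # Remove all whitespace from the base64 payload ...
        payload = "".join(match.group(2).split())
        # ... and rewrap it at the conventional 64 columns.
        chunks = [payload[i:i + 64] for i in range(0, len(payload), 64)]
        lines = ["-----BEGIN %s-----" % label]
        lines.extend(chunks)
        lines.append("-----END %s-----" % label)
        return "\n".join(lines) + "\n"

    return PEM_BLOCK.sub(rebuild, text)
```

The result could then be written back and checked again with the same
`openssl x509 -in /etc/pki/vdsm/certs/cacert.pem -text` command. This only
works around the symptom; whatever replaces the newlines during deployment
still needs to be found.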
>
>>>
>>>> Build Number: 1548
>>>> Build Status: Failure
>>>> Triggered By: Started by timer
>>>>
>>>> -------------------------------------
>>>> Changes Since Last Success:
>>>> -------------------------------------
>>>> Changes for Build #1548
>>>> [Galit Rosenthal] Fix the repo for suites that weren't moved to no
>>>> reposync
>>>>
>>>>
>>>>
>>>>
>>>> -----------------
>>>> Failed Tests:
>>>> -----------------
>>>> No tests ran.
>>>
>>>
>>
>
--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.