On 4/1/20 2:23 PM, Martin Necas wrote:
It's possible that the issue was introduced in patch [1], but Artur's logs
showed a properly formatted ovirt_ca_cert, so I'm not sure about that.
Artur/Marcin, could you please check the command in
ovirt-engine/share/ovirt-engine/ansible-runner-service-project/artifacts
where you should see the variables with which ansible-playbook is executed?
They should be the same as what you linked, but I still want to make sure
that there isn't some issue there. You can also check the stdout file for
problems.
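As a rough aid for that check, here is a minimal Python sketch that walks a JSON document loaded from the artifacts directory and reports every string value in which the certificate's BEGIN marker is not followed by a line break. The function name flattened_cert_paths is hypothetical, and the assumption that the artifact in question is JSON is mine, not something confirmed in this thread.

```python
import json
from typing import Any, List

BEGIN = "-----BEGIN CERTIFICATE-----"


def flattened_cert_paths(node: Any, path: str = "$") -> List[str]:
    """Return paths of string values whose BEGIN marker lacks a line break.

    Both a real newline and an escaped one (backslash + n, as seen when the
    certificate sits inside a nested JSON string) count as a line break.
    """
    hits: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += flattened_cert_paths(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits += flattened_cert_paths(value, f"{path}[{index}]")
    elif isinstance(node, str):
        start = node.find(BEGIN)
        if start >= 0:
            tail = node[start + len(BEGIN):]
            if not tail.startswith(("\n", "\r", "\\n", "\\r")):
                hits.append(path)
    return hits


# Hypothetical record, shaped only for illustration.
record = json.loads(json.dumps({
    "command": [
        "ansible-playbook",
        "-e",
        "ovirt_ca_cert=-----BEGIN CERTIFICATE----- AAAA -----END CERTIFICATE-----",
    ],
    "env": {"GOOD": "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"},
}))
suspicious = flattened_cert_paths(record)
```

With a real artifact you would replace the illustrative record with json.load() on the file you are inspecting; any path returned points at a value worth looking at by hand.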
I tried changing the host deployment playbook to inject
a debug message:
- name: Add vdsm cacert files
  copy:
    content: "{{ ovirt_ca_cert }}"
    dest: "{{ filedest }}"
    owner: 'root'
    group: 'kvm'
    mode: 0644
  with_items:
    - "{{ ovirt_vdsm_trust_store }}/{{ ovirt_vdsm_ca_file }}"
    - "{{ ovirt_vdsm_trust_store }}/{{ ovirt_vdsm_spice_ca_file }}"
    - "{{ ovirt_libvirt_default_trust_store }}/{{ ovirt_libvirt_default_client_ca_file }}"
  loop_control:
    loop_var: filedest

- name: Show cacert
  debug:
    msg: CA contents 1987 {{ ovirt_ca_cert }}
and the result was:
2020-04-01 06:02:23 EDT - TASK [ovirt-host-deploy-vdsm-certificates : Show cacert] ***********************
2020-04-01 06:02:23 EDT - ok: [lago-basic-suite-master-host-1] => {
"msg": "CA contents 1987 -----BEGIN CERTIFICATE-----
MIIDhDCCAmygAwIBAgICEAAwDQYJKoZIhvcNAQELBQAwMzELMAkGA1UEBhMCVVMxDTALBgNVBAoM
BFRlc3QxFTATBgNVBAMMDGVuZ2luZS4yNDU1NTAeFw0yMDAzMzEwOTU0MjRaFw0zMDAzMzAwOTU0
MjRaMDMxCzAJBgNVBAYTAlVTMQ0wCwYDVQQKDARUZXN0MRUwEwYDVQQDDAxlbmdpbmUuMjQ1NTUw
ggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQC3c20WyiBD98u6Ty6Yjb48fx9wuYUp2MIK
j7E8qlX9QvNgvuudTYugPf040xyi+pcVhbXjqc7PhJoqowzgYxuyBu7W/KZigAp2pWMl12w7J1J/
3Hp2IXD5hM7M6aCQ1jMDLxt1YECZfw+TEFVep1z7oxGZHPRZM8MDvYdBje+oPj41kIL1XNsCOiTy
J8auU5/eaFbZFjP/sCDNuN14MnmhJtlVahRouODt86N1DRf3ubkmV/Bcr/Xp4iLx4ycyFiPU31cu
Gnb2x8pTMPIbgtMYJTqMnRVrzJPV+ALA/PCSOL6LKkM7Jy4ecVFcGcJfvFpmsvF+qd7NuCOfqA7u
l6EnAgMBAAGjgaEwgZ4wHQYDVR0OBBYEFPK3q/RlmHfh5o0KmmTguIALVwFgMFwGA1UdIwRVMFOA
FPK3q/RlmHfh5o0KmmTguIALVwFgoTekNTAzMQswCQYDVQQGEwJVUzENMAsGA1UECgwEVGVzdDEV
MBMGA1UEAwwMZW5naW5lLjI0NTU1ggIQADAPBgNVHRMBAf8EBTADAQH/MA4GA1UdDwEB/wQEAwIB
BjANBgkqhkiG9w0BAQsFAAOCAQEAaHQqbgeG7ReoodKwbmFOFq99YOMrYmLx2llt5s49wz+eZsMN
OIja8Dilyhew+r6aM30cXHm6U8dOZpLQ9Ga0Y1hk4Edu6Vu4x51WXZdVTkxIjhD+DrHsuaM0PZsE
s1tq+ngBaMFxSdXIWNf7DUEf9hymxfLDoOjjVfxxlFtaDsBmu1dup/N8shzUrZ+bTt8i7TGG/JWl
F+Iyq/A1EHXywFwr/ZsEAeRjStFt0IytbYprGi98yt9LRZ4puDooio8PI57crON+Cu9vqHsYU3yc
lj8vLtwcr354LlY+nLO+cnslhirZlhIuLtytDvBXA8bNJ3EdlAInCfr6SnXKC61aqA==
-----END CERTIFICATE----- "
So by the time the playbook runs, the value is already broken.
Artur, OTOH, checked the value of the variable by setting a breakpoint
in the engine code, and it seemed OK there.
Indeed, I think there's a problem in [1].
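For anyone who wants to confirm the symptom on a host, or repair a file by hand, here is a minimal Python sketch of the "spaces instead of newlines" check and the manual fix described further down in this thread. The helper names is_flattened and restore_newlines are hypothetical, and the sketch assumes the file holds a single certificate. It does not say anything about where the newlines get lost.

```python
BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def is_flattened(pem: str) -> bool:
    """True if the BEGIN marker is present but not followed by a line break."""
    start = pem.find(BEGIN)
    if start < 0:
        return False
    next_char = pem[start + len(BEGIN): start + len(BEGIN) + 1]
    return next_char not in ("\n", "\r")


def restore_newlines(pem: str) -> str:
    """Rebuild a single-certificate PEM with one base64 chunk per line."""
    start = pem.find(BEGIN)
    end = pem.find(END)
    if start < 0 or end < 0 or end < start:
        raise ValueError("certificate markers not found")
    body = pem[start + len(BEGIN):end]
    # split() with no argument breaks on any whitespace, so the same code
    # handles spaces, tabs and leftover newlines.
    chunks = body.split()
    return "\n".join([BEGIN, *chunks, END]) + "\n"


# Placeholder payload, not a real certificate.
broken = "-----BEGIN CERTIFICATE----- AAAA BBBB== -----END CERTIFICATE----- "
fixed = restore_newlines(broken)
```

The repaired text can then be written back and checked with the same openssl x509 -in ... -text command used on the host. This is only a workaround for inspection; the real fix has to happen wherever the newlines are being dropped.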
Posting a public pastebin URL [1]. Apologies for using the private one
before.
[1] https://pastebin.com/wrw5ME7j
A.
On Wed, Apr 1, 2020 at 12:31 PM Artur Socha <asocha@redhat.com> wrote:
>
> Adding request content:
> http://pastebin.test.redhat.com/850652
>
> A.
>
> On Wed, Apr 1, 2020 at 12:28 PM Artur Socha <asocha@redhat.com> wrote:
>>
>> I debugged the flow up to the moment the request is sent via the HTTP
>> client to the ansible runner service, and up to that point it was
>> correct. The JSON did contain a correctly formatted ovirt_ca_cert.
>> Artur
>>
>> On Wed, Apr 1, 2020 at 12:26 PM Marcin Sobczyk <msobczyk@redhat.com> wrote:
>>>
>>>
>>>
>>> On 4/1/20 11:54 AM, Martin Perina wrote:
>>>
>>>
>>>
>>> On Wed, Apr 1, 2020 at 11:15 AM Marcin Sobczyk <msobczyk@redhat.com> wrote:
>>>>
>>>>
>>>>
>>>> On 4/1/20 11:06 AM, Marcin Sobczyk wrote:
>>>> >
>>>> >
>>>> > On 4/1/20 9:51 AM, Marcin Sobczyk wrote:
>>>> >> Hi,
>>>> >>
>>>> >> On 4/1/20 8:44 AM, Yedidyah Bar David wrote:
>>>> >>> On Wed, Apr 1, 2020 at 6:21 AM <jenkins@jenkins.phx.ovirt.org> wrote:
>>>> >>>> Project:
>>>> >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
>>>> >>>>
>>>> >>>> Build:
>>>> >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1548/
>>>> >>> Previous build 1547 passed, after many months of failing, thanks to
>>>> >>> Evgeny's work in recent weeks. The one above failed.
>>>> >>> I think the root cause is that the engine tried to connect to vdsm
>>>> >>> right after successfully finishing ansible host-deploy, but failed.
>>>> >>> vdsm.log has:
>>>> >>>
>>>> >>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1548/artifact/exported-artifacts/test_logs/he-basic-suite-master/post-he_deploy/lago-he-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
>>>> >>>
>>>> >>>
>>>> >>> 2020-03-31 22:58:49,773-0400 ERROR (Reactor thread) [vds.dispatcher]
>>>> >>> uncaptured python exception, closing channel
>>>> >>> <yajsonrpc.betterAsyncore.Dispatcher connected
>>>> >>> ('::ffff:192.168.222.76', 46754, 0, 0) at 0x7f416c150a90> (<class
>>>> >>> 'ssl.SSLError'>:[X509] no certificate or crl found (_ssl.c:3771)
>>>> >>> [/usr/lib64/python3.6/asyncore.py|readwrite|110]
>>>> >>> [/usr/lib64/python3.6/asyncore.py|handle_write_event|442]
>>>> >>> [/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py|handle_write|74]
>>>> >>> [/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py|_delegate_call|168]
>>>> >>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|handle_write|190]
>>>> >>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|_handle_io|194]
>>>> >>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|_set_up_socket|154])
>>>> >>> (betterAsyncore:179)
>>>> >>>
>>>> >>> Not sure what might have caused this. Can anyone have a look? Thanks.
>>>> >> Probably caused by https://gerrit.ovirt.org/108016
>>>> >> Looking into this.
>>>> >>
>>>> > Turns out that the patch is not the cause of the error per se - it
>>>> > simply uncovered a different problem - the CA on the hosts is broken:
>>>> >
>>>> >
>>>> > [root@lago-basic-suite-master-host-0 certs]# openssl x509 -in /etc/pki/vdsm/certs/cacert.pem -text
>>>> > unable to load certificate
>>>> > 139987452258112:error:0909006C:PEM routines:get_name:no start line:crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE
>>>> It looks like they have spaces instead of newlines.
>>>> When I manually replaced the spaces with newlines, openssl was able to
>>>> read them.
>>>
>>>
>>> Martin/Dana, couldn't this be caused by one of the recent changes in the
>>> ansible-runner integration?
>>>
>>> This looks like a suspect to me:
>>>
>>> https://gerrit.ovirt.org/#/c/107683/5/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/common/utils/ansible/AnsibleRunnerHTTPClient.java
>>>
>>>>
>>>> >
>>>> >>>
>>>> >>>> Build Number: 1548
>>>> >>>> Build Status: Failure
>>>> >>>> Triggered By: Started by timer
>>>> >>>>
>>>> >>>> -------------------------------------
>>>> >>>> Changes Since Last Success:
>>>> >>>> -------------------------------------
>>>> >>>> Changes for Build #1548
>>>> >>>> [Galit Rosenthal] Fix the repo for suites that weren't moved to no
>>>> >>>> reposync
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> -----------------
>>>> >>>> Failed Tests:
>>>> >>>> -----------------
>>>> >>>> No tests ran.
>>>> >>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>> --
>>> Martin Perina
>>> Manager, Software Engineering
>>> Red Hat Czech s.r.o.
>>>
>>>
>>
>>
>> --
>>
>> Artur Socha
>>
>> Senior Software Engineer, RHV
>>
>> Red Hat
>
>
--
Artur Socha
Senior Software Engineer, RHV
Red Hat