Okay, I found the issue: for some reason the ansible runner wants
'\\n'
instead of '\n'. I have not changed anything related to the newline
symbol in the patch, so I need to investigate further; maybe something
is new in ansible-runner or ansible-runner-service.
The fix in the engine is simple: in [1], update line 131 to
`String.valueOf(e.getValue()).replaceAll("\n", "\\\\\\\\n")`, but I want
to make sure that it won't break anything else.
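To make the escaping concrete, here is a minimal Python sketch (not the engine code) of what that Java expression produces. In Java, the literal "\\\\\\\\n" denotes four backslashes plus n, and replaceAll treats a backslash in the replacement string as an escape character, so each newline ends up as two backslashes followed by n. The function name escape_newlines and the shortened certificate body are made up for illustration.

```python
def escape_newlines(value: str) -> str:
    """Mimic the effect of the Java replaceAll call above.

    Each newline character becomes two backslash characters followed by 'n'.
    This is an illustration only; the helper name is hypothetical.
    """
    # "\\\\n" is a Python literal for two backslashes followed by 'n'.
    return value.replace("\n", "\\\\n")


if __name__ == "__main__":
    # Shortened placeholder body, not a real certificate.
    pem = "-----BEGIN CERTIFICATE-----\nMIIDhDCC\n-----END CERTIFICATE-----\n"
    escaped = escape_newlines(pem)
    print(escaped)
    # No real newline characters are left in the escaped value.
    print("\n" in escaped)
```

Whether the receiving side turns the doubled backslashes back into real newlines is exactly the part that still needs investigating.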
Can you post a patch, even an unverified one, so I can quickly try it
out?
I want to test other changes, but this issue is currently blocking me.
On Wed, Apr 1, 2020 at 14:41, Marcin Sobczyk <msobczyk(a)redhat.com
<mailto:msobczyk@redhat.com>> wrote:
On 4/1/20 2:23 PM, Martin Necas wrote:
> It's possible that the issue was introduced in patch [1], but
> Artur's logs showed a properly formatted ovirt_ca_cert, so I'm not
> sure about it.
> Artur/Marcin, could you please check the command in
> ovirt-engine/share/ovirt-engine/ansible-runner-service-project/artifacts
> You should see there the variables with which ansible-playbook is
> executed.
> They should be the same as what you linked, but I still want to make
> sure that there isn't some issue. You can also check the stdout file
> for problems.
>
I tried changing the host deployment playbook to inject a debug
message:
- name: Add vdsm cacert files
  copy:
    content: "{{ ovirt_ca_cert }}"
    dest: "{{ filedest }}"
    owner: 'root'
    group: 'kvm'
    mode: 0644
  with_items:
    - "{{ ovirt_vdsm_trust_store }}/{{ ovirt_vdsm_ca_file }}"
    - "{{ ovirt_vdsm_trust_store }}/{{ ovirt_vdsm_spice_ca_file }}"
    - "{{ ovirt_libvirt_default_trust_store }}/{{ ovirt_libvirt_default_client_ca_file }}"
  loop_control:
    loop_var: filedest

- name: Show cacert
  debug:
    msg: CA contents 1987 {{ ovirt_ca_cert }}
and the result was:
2020-04-01 06:02:23 EDT - TASK [ovirt-host-deploy-vdsm-certificates : Show cacert] ***********************
2020-04-01 06:02:23 EDT - ok: [lago-basic-suite-master-host-1] => {
"msg": "CA contents 1987 -----BEGIN CERTIFICATE-----
MIIDhDCCAmygAwIBAgICEAAwDQYJKoZIhvcNAQELBQAwMzELMAkGA1UEBhMCVVMxDTALBgNVBAoM
BFRlc3QxFTATBgNVBAMMDGVuZ2luZS4yNDU1NTAeFw0yMDAzMzEwOTU0MjRaFw0zMDAzMzAwOTU0
MjRaMDMxCzAJBgNVBAYTAlVTMQ0wCwYDVQQKDARUZXN0MRUwEwYDVQQDDAxlbmdpbmUuMjQ1NTUw
ggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQC3c20WyiBD98u6Ty6Yjb48fx9wuYUp2MIK
j7E8qlX9QvNgvuudTYugPf040xyi+pcVhbXjqc7PhJoqowzgYxuyBu7W/KZigAp2pWMl12w7J1J/
3Hp2IXD5hM7M6aCQ1jMDLxt1YECZfw+TEFVep1z7oxGZHPRZM8MDvYdBje+oPj41kIL1XNsCOiTy
J8auU5/eaFbZFjP/sCDNuN14MnmhJtlVahRouODt86N1DRf3ubkmV/Bcr/Xp4iLx4ycyFiPU31cu
Gnb2x8pTMPIbgtMYJTqMnRVrzJPV+ALA/PCSOL6LKkM7Jy4ecVFcGcJfvFpmsvF+qd7NuCOfqA7u
l6EnAgMBAAGjgaEwgZ4wHQYDVR0OBBYEFPK3q/RlmHfh5o0KmmTguIALVwFgMFwGA1UdIwRVMFOA
FPK3q/RlmHfh5o0KmmTguIALVwFgoTekNTAzMQswCQYDVQQGEwJVUzENMAsGA1UECgwEVGVzdDEV
MBMGA1UEAwwMZW5naW5lLjI0NTU1ggIQADAPBgNVHRMBAf8EBTADAQH/MA4GA1UdDwEB/wQEAwIB
BjANBgkqhkiG9w0BAQsFAAOCAQEAaHQqbgeG7ReoodKwbmFOFq99YOMrYmLx2llt5s49wz+eZsMN
OIja8Dilyhew+r6aM30cXHm6U8dOZpLQ9Ga0Y1hk4Edu6Vu4x51WXZdVTkxIjhD+DrHsuaM0PZsE
s1tq+ngBaMFxSdXIWNf7DUEf9hymxfLDoOjjVfxxlFtaDsBmu1dup/N8shzUrZ+bTt8i7TGG/JWl
F+Iyq/A1EHXywFwr/ZsEAeRjStFt0IytbYprGi98yt9LRZ4puDooio8PI57crON+Cu9vqHsYU3yc
lj8vLtwcr354LlY+nLO+cnslhirZlhIuLtytDvBXA8bNJ3EdlAInCfr6SnXKC61aqA==
-----END CERTIFICATE----- "
So the variable is already broken (spaces instead of newlines) by the
time the playbook runs.
Artur, OTOH, checked the value of the variable by setting a breakpoint
in the engine code, and it seemed OK there.
So indeed, I think there's a problem in [1].
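The symptom in the debug output above (newlines turned into spaces) can be detected and undone with a few lines of Python. This is only a sketch for checking a broken cacert.pem by hand, assuming a single certificate per string; looks_collapsed and restore_newlines are hypothetical helper names, not part of vdsm or the engine.

```python
BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def looks_collapsed(pem: str) -> bool:
    """Return True if the PEM markers are present but no newline separates them."""
    text = pem.strip()
    return BEGIN in text and END in text and "\n" not in text


def restore_newlines(pem: str) -> str:
    """Rebuild a one-certificate PEM whose newlines were replaced by spaces."""
    text = pem.strip()
    if not (text.startswith(BEGIN) and text.endswith(END)):
        raise ValueError("expected exactly one PEM certificate")
    # Base64 never contains whitespace, so any whitespace in the body
    # must be a former line break.
    body = text[len(BEGIN):-len(END)]
    return "\n".join([BEGIN, *body.split(), END]) + "\n"


if __name__ == "__main__":
    # Shortened placeholder body, not a real certificate.
    broken = "-----BEGIN CERTIFICATE----- MIIDhDCC AmygAwIB -----END CERTIFICATE----- "
    print(looks_collapsed(broken))
    print(restore_newlines(broken), end="")
```

A plain replace of every space would also split the "BEGIN CERTIFICATE" and "END CERTIFICATE" marker lines, which is why the markers are handled separately here.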
> [1] https://gerrit.ovirt.org/#/c/107683/5/backend/manager/modules/bll/src/mai...
>
>
> Martin Necas
>
>
> On Wed, Apr 1, 2020 at 1:22 PM Artur Socha <asocha(a)redhat.com
> <mailto:asocha@redhat.com>> wrote:
>
> Posting a public pastebin url [1]. Apologies for using the
> private one before.
>
> [1] https://pastebin.com/wrw5ME7j
> A.
>
>
> On Wed, Apr 1, 2020 at 12:31 PM Artur Socha
> <asocha(a)redhat.com <mailto:asocha@redhat.com>> wrote:
> >
> > Adding request content:
> >
> > http://pastebin.test.redhat.com/850652
> >
> > A.
> >
> > On Wed, Apr 1, 2020 at 12:28 PM Artur Socha
> <asocha(a)redhat.com <mailto:asocha@redhat.com>> wrote:
> >>
> >> I have debugged the flow up to the moment the request is sent via
> >> the HTTP client to the ansible runner service, and up to that point
> >> it was correct. The JSON did contain a correctly formatted
> >> ovirt_ca_cert.
> >> Artur
> >>
> >> On Wed, Apr 1, 2020 at 12:26 PM Marcin Sobczyk
> <msobczyk(a)redhat.com <mailto:msobczyk@redhat.com>> wrote:
> >>>
> >>>
> >>>
> >>> On 4/1/20 11:54 AM, Martin Perina wrote:
> >>>
> >>>
> >>>
> >>> On Wed, Apr 1, 2020 at 11:15 AM Marcin Sobczyk
> <msobczyk(a)redhat.com <mailto:msobczyk@redhat.com>> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 4/1/20 11:06 AM, Marcin Sobczyk wrote:
> >>>> >
> >>>> >
> >>>> > On 4/1/20 9:51 AM, Marcin Sobczyk wrote:
> >>>> >> Hi,
> >>>> >>
> >>>> >> On 4/1/20 8:44 AM, Yedidyah Bar David wrote:
> >>>> >>> On Wed, Apr 1, 2020 at 6:21 AM
> <jenkins(a)jenkins.phx.ovirt.org
> <mailto:jenkins@jenkins.phx.ovirt.org>> wrote:
> >>>> >>>> Project:
> >>>> >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/
> >>>> >>>>
> >>>> >>>> Build:
> >>>> >>>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/1548/
> >>>> >>> Previous build 1547 passed, after many months of failing,
> >>>> >>> thanks to Evgeny's work in recent weeks. The above one failed.
> >>>> >>> I think the root cause is that the engine tried to connect to
> >>>> >>> vdsm right after successfully finishing ansible host-deploy,
> >>>> >>> but failed. vdsm.log has:
> >>>> >>>
> >>>> >>> https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-master/15...
> >>>> >>>
> >>>> >>> 2020-03-31 22:58:49,773-0400 ERROR (Reactor thread) [vds.dispatcher]
> >>>> >>> uncaptured python exception, closing channel
> >>>> >>> <yajsonrpc.betterAsyncore.Dispatcher connected
> >>>> >>> ('::ffff:192.168.222.76', 46754, 0, 0) at 0x7f416c150a90>
> >>>> >>> (<class 'ssl.SSLError'>:[X509] no certificate or crl found (_ssl.c:3771)
> >>>> >>> [/usr/lib64/python3.6/asyncore.py|readwrite|110]
> >>>> >>> [/usr/lib64/python3.6/asyncore.py|handle_write_event|442]
> >>>> >>> [/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py|handle_write|74]
> >>>> >>> [/usr/lib/python3.6/site-packages/yajsonrpc/betterAsyncore.py|_delegate_call|168]
> >>>> >>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|handle_write|190]
> >>>> >>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|_handle_io|194]
> >>>> >>> [/usr/lib/python3.6/site-packages/vdsm/sslutils.py|_set_up_socket|154])
> >>>> >>> (betterAsyncore:179)
> >>>> >>>
> >>>> >>> Not sure what might have caused this. Can anyone have a look? Thanks.
> >>>> >> Probably caused by https://gerrit.ovirt.org/108016
> >>>> >> Looking into this.
> >>>> >>
> >>>> > It turns out that the patch is not the cause of the error per se;
> >>>> > it simply uncovered a different problem: the CA on the hosts is broken:
> >>>> >
> >>>> > [root@lago-basic-suite-master-host-0 certs]# openssl x509 -in /etc/pki/vdsm/certs/cacert.pem -text
> >>>> > unable to load certificate
> >>>> > 139987452258112:error:0909006C:PEM routines:get_name:no start line:crypto/pem/pem_lib.c:745:Expecting: TRUSTED CERTIFICATE
> >>>> It looks like the certificates have spaces instead of newlines.
> >>>> When I manually replaced the spaces with newlines, openssl was able to
> >>>> read them.
> >>>
> >>>
> >>> Martin/Dana, couldn't this be caused by any recent changes in the
> >>> ansible-runner integration?
> >>>
> >>> This looks like a suspect to me:
> >>>
> >>> https://gerrit.ovirt.org/#/c/107683/5/backend/manager/modules/bll/src/mai...
> >>>
> >>>>
> >>>> >
> >>>> >>>
> >>>> >>>> Build Number: 1548
> >>>> >>>> Build Status: Failure
> >>>> >>>> Triggered By: Started by timer
> >>>> >>>>
> >>>> >>>> -------------------------------------
> >>>> >>>> Changes Since Last Success:
> >>>> >>>> -------------------------------------
> >>>> >>>> Changes for Build #1548
> >>>> >>>> [Galit Rosenthal] Fix the repo for suites that weren't moved to no reposync
> >>>> >>>>
> >>>> >>>>
> >>>> >>>>
> >>>> >>>>
> >>>> >>>> -----------------
> >>>> >>>> Failed Tests:
> >>>> >>>> -----------------
> >>>> >>>> No tests ran.
> >>>> >>>
> >>>> >>>
> >>>> >>
> >>>> >
> >>>>
> >>>
> >>>
> >>> --
> >>> Martin Perina
> >>> Manager, Software Engineering
> >>> Red Hat Czech s.r.o.
> >>>
> >>>
> >>
> >>
> >> --
> >>
> >> Artur Socha
> >>
> >> Senior Software Engineer, RHV
> >>
> >> Red Hat