[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 27-04-2017 ] [add_hosts]

Piotr Kliczewski pkliczew at redhat.com
Sun Apr 30 11:15:22 UTC 2017


Nadav,

Here is the code [1] that performs this check. Here is the vdsm log [2],
where I added a logging statement to see what the commonName value is
(it was 'engine').

Here are the steps performed during the check:
1. We get the client peer name by calling socket.getpeername()[0], which is:

::ffff:192.168.201.3

2. We get the common name from the certificate. It should be the engine's
FQDN, but as the log shows we get 'engine'.
3. When a name is used, we compare the common name with the name looked up
from the IP. I pushed a patch [3] that normalizes the IP (it still requires
my attention).

Based on the logs, it seems that 192.168.201.3 does not resolve to the
name 'engine'.
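
A minimal Python 3 sketch of the three steps above may make the failure
easier to see. It is not the vdsm code from [1]: the helper names
(normalize_peer_address, names_match) and the injectable resolve parameter
are hypothetical, and it assumes the name lookup behaves like
socket.gethostbyaddr().

```python
import ipaddress
import socket


def normalize_peer_address(addr):
    """Turn an IPv4-mapped IPv6 address into plain IPv4 (hypothetical helper)."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not an IP literal at all; leave it untouched.
        return addr
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)


def names_match(common_name, peer_addr, resolve=socket.gethostbyaddr):
    """Compare the certificate common name with the peer (illustrative only).

    The common name matches if it equals the normalized peer address, or if
    it is one of the names the reverse lookup returns for that address.
    """
    addr = normalize_peer_address(peer_addr)
    if common_name == addr:
        return True
    try:
        hostname, aliases, _ = resolve(addr)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
    return common_name in [hostname] + list(aliases)


if __name__ == "__main__":
    # Assumed resolver result: the address maps only to the engine FQDN.
    def fake_resolve(addr):
        return ("lago-basic-suite-master-engine", [], [addr])

    peer = "::ffff:192.168.201.3"
    print(names_match("engine", peer, resolve=fake_resolve))
    print(names_match("lago-basic-suite-master-engine", peer,
                      resolve=fake_resolve))
```

With a resolver that only knows the FQDN, the common name 'engine' fails
the comparison while the FQDN passes, which is the situation described
above.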

Thanks,
Piotr

[1] https://github.com/oVirt/vdsm/blob/master/lib/vdsm/sslutils.py#L234
[2] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/331/artifact/exported-artifacts/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host0/_var_log/vdsm/vdsm.log
[3] https://gerrit.ovirt.org/#/c/76197/

On Sun, Apr 30, 2017 at 12:52 PM, Nadav Goldin <ngoldin at redhat.com> wrote:

> Looking at the failure, I'm not sure what is wrong here on the setup
> side. The FQDN (lago-basic-suite-master-engine) should be resolvable on
> the hosts, at least from what I tested locally. In the engine
> setup.log I see this was the generated certificate (if we're talking
> about the same one here):
>
> 2017-04-30 06:30:41,308-0400 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca
> plugin.executeRaw:813 execute:
> ('/usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh', '--name=engine',
> '--password=**FILTERED**',
> '--subject=/C=US/O=Test/CN=lago-basic-suite-master-engine'),
> executable='None', cwd='None', env=None
> 2017-04-30 06:30:44,542-0400 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca
> plugin.executeRaw:863 execute-result:
> ('/usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh', '--name=engine',
> '--password=**FILTERED**',
> '--subject=/C=US/O=Test/CN=lago-basic-suite-master-engine'), rc=0
> 2017-04-30 06:30:44,543-0400 DEBUG
> otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca
> plugin.execute:921 execute-output:
> ('/usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh', '--name=engine',
> '--password=**FILTERED**',
> '--subject=/C=US/O=Test/CN=lago-basic-suite-master-engine')
>
>
> Do we expect the '--name' parameter to be the same as the hostname? My
> thought was that it should use the engine FQDN, and that should match
> the certificate name.
>
> If that is not the problem, can you make the output in the vdsm logs
> more verbose, so we'll know exactly what name it is looking for?
>
>
> Thanks
>
> Nadav.
>
> On Sun, Apr 30, 2017 at 1:43 PM, Piotr Kliczewski <pkliczew at redhat.com>
> wrote:
> > The job failed.
> >
> > Just to be clear: we need to be able to resolve the engine name on the
> > host side, or use an IP address.
> >
> > Thanks,
> > Piotr
> >
> > On Sun, Apr 30, 2017 at 12:23 PM, Piotr Kliczewski <pkliczew at redhat.com>
> > wrote:
> >>
> >> Here is the link
> >>
> >> http://jenkins.ovirt.org/job/ovirt-system-tests_manual/331/
> >>
> >> On Sun, Apr 30, 2017 at 12:17 PM, Piotr Kliczewski <pkliczew at redhat.com>
> >> wrote:
> >>>
> >>> Sure, will test
> >>>
> >>> On Apr 30, 2017 12:14, "Nadav Goldin" <ngoldin at redhat.com> wrote:
> >>>>
> >>>> It is in progress in [1]. As it requires cross-changes in all suites, it
> >>>> takes a while to test it and cover all changes, though basic-suite-master
> >>>> already passed.
> >>>> Can you test it by running OST manual with your changes and the OST
> >>>> patch (i.e. also put refs/changes/25/76225/7 in GERRIT_REFSPEC)?
> >>>>
> >>>>
> >>>>
> >>>> [1] https://gerrit.ovirt.org/76225
> >>>>
> >>>> On Sun, Apr 30, 2017 at 1:09 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
> >>>> >
> >>>> >
> >>>> > On Sun, Apr 30, 2017 at 1:03 PM, Piotr Kliczewski
> >>>> > <piotr.kliczewski at gmail.com> wrote:
> >>>> >>
> >>>> >> When can we have it fixed? I checked a few minutes ago and the problem
> >>>> >> is still there.
> >>>> >
> >>>> >
> >>>> > https://gerrit.ovirt.org/#/c/76225/ should cover this.
> >>>> >
> >>>> > What I wonder is what caused this in the first place. The SSL change?
> >>>> > Y.
> >>>> >
> >>>> >>
> >>>> >>
> >>>> >> Thanks,
> >>>> >> Piotr
> >>>> >>
> >>>> >> On Sat, Apr 29, 2017 at 11:18 AM, Piotr Kliczewski
> >>>> >> <pkliczew at redhat.com>
> >>>> >> wrote:
> >>>> >> > Nadav,
> >>>> >> >
> >>>> >> > Yes, vdsm is not able to resolve 'engine', which is used in the
> >>>> >> > engine's certificate.
> >>>> >> >
> >>>> >> > Thanks,
> >>>> >> > Piotr
> >>>> >> >
> >>>> >> > On Apr 29, 2017 00:37, "Nadav Goldin" <ngoldin at redhat.com> wrote:
> >>>> >> >
> >>>> >> > Hi Piotr,
> >>>> >> > Can you clarify what you noticed is not resolvable - the 'engine'
> >>>> >> > FQDN from host0?
> >>>> >> >
> >>>> >> > Thanks,
> >>>> >> > Nadav.
> >>>> >> >
> >>>> >> >
> >>>> >> > On Fri, Apr 28, 2017 at 4:15 PM, Piotr Kliczewski
> >>>> >> > <pkliczew at redhat.com>
> >>>> >> > wrote:
> >>>> >> >> I started to investigate the issue [1] and it seems like there is an
> >>>> >> >> issue in the Lago setup we use.
> >>>> >> >>
> >>>> >> >> During the handshake we have a step to verify whether the client
> >>>> >> >> certificate was issued for a specific host (there is no such
> >>>> >> >> functionality in the m2crypto code base). It works fine when using
> >>>> >> >> either IP addresses or FQDNs, but in this particular setup we use a
> >>>> >> >> mix of both.
> >>>> >> >>
> >>>> >> >> After adding logging I see that the engine certificate uses the name
> >>>> >> >> 'engine', which is not resolvable on the host side, so the check fails.
> >>>> >> >> I posted a patch [2] which fixes the IPv4-mapped address issue, but we
> >>>> >> >> still need to fix the setup issue.
> >>>> >> >>
> >>>> >> >> Thanks,
> >>>> >> >> Piotr
> >>>> >> >>
> >>>> >> >> [1] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/326/
> >>>> >> >> [2] https://gerrit.ovirt.org/#/c/76197/
> >>>> >> >>
> >>>> >> >> On Thu, Apr 27, 2017 at 3:39 PM, Piotr Kliczewski
> >>>> >> >> <pkliczew at redhat.com>
> >>>> >> >> wrote:
> >>>> >> >>>
> >>>> >> >>>
> >>>> >> >>>
> >>>> >> >>> On Thu, Apr 27, 2017 at 3:13 PM, Evgheni Dereveanchin
> >>>> >> >>> <ederevea at redhat.com> wrote:
> >>>> >> >>>>
> >>>> >> >>>> Test failed: 002_bootstrap/add_hosts
> >>>> >> >>>>
> >>>> >> >>>> Link to suspected patches:
> >>>> >> >>>>  https://gerrit.ovirt.org/76107 - ssl: change default library
> >>>> >> >>>>
> >>>> >> >>>> Link to job:
> >>>> >> >>>>
> >>>> >> >>>>
> >>>> >> >>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6491/
> >>>> >> >>>>
> >>>> >> >>>> VDSM log:
> >>>> >> >>>>
> >>>> >> >>>>
> >>>> >> >>>>
> >>>> >> >>>>
> >>>> >> >>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6491/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host0/_var_log/vdsm/vdsm.log
> >>>> >> >>>>
> >>>> >> >>>> Error snippet from VDSM log, this repeats on each connection
> >>>> >> >>>> attempt from Engine side:
> >>>> >> >>>>
> >>>> >> >>>> <error>
> >>>> >> >>>>
> >>>> >> >>>> 2017-04-27 06:39:27,768-0400 INFO  (Reactor thread)
> >>>> >> >>>> [ProtocolDetector.AcceptorImpl] Accepted connection from
> >>>> >> >>>> ::ffff:192.168.201.3:49530 (protocoldetector:74)
> >>>> >> >>>> 2017-04-27 06:39:27,898-0400 ERROR (Reactor thread)
> >>>> >> >>>> [vds.dispatcher]
> >>>> >> >>>> uncaptured python exception, closing channel
> >>>> >> >>>> <yajsonrpc.betterAsyncore.Dispatcher connected
> >>>> >> >>>> ('::ffff:192.168.201.3',
> >>>> >> >>>> 49530, 0, 0) at 0x1cc3b00> (<class 'socket.error'>:Address
> >>>> >> >>>> family not
> >>>> >> >>>> supported by protocol
> >>>> >> >>>> [/usr/lib64/python2.7/asyncore.py|readwrite|110]
> >>>> >> >>>> [/usr/lib64/python2.7/asyncore.py|handle_write_event|468]
> >>>> >> >>>> [/usr/lib/python2.7/site-packages/yajsonrpc/betterAsyncore.py|handle_write|70]
> >>>> >> >>>> [/usr/lib/python2.7/site-packages/yajsonrpc/betterAsyncore.py|_delegate_call|149]
> >>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|handle_write|213]
> >>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|_handle_io|223]
> >>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|_verify_host|237]
> >>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|compare_names|249])
> >>>> >> >>>> (betterAsyncore:160)
> >>>> >> >>>>
> >>>> >> >>>> </error>
> >>>> >> >>>
> >>>> >> >>>
> >>>> >> >>> This means that what we have in the certificate does not match the
> >>>> >> >>> source address we get. I suspect that we issue the certificate for
> >>>> >> >>> 192.168.201.3 but we get ::ffff:192.168.201.3.
> >>>> >> >>> The change was verified in an environment where IPv4 is used. I pushed
> >>>> >> >>> a revert [1] for now so we can work on fixing the issue.
> >>>> >> >>>
> >>>> >> >>> [1] https://gerrit.ovirt.org/#/c/76160
> >>>> >> >>>
> >>>> >> >>>>
> >>>> >> >>>> --
> >>>> >> >>>> Regards,
> >>>> >> >>>> Evgheni Dereveanchin
> >>>> >> >>>
> >>>> >> >>>
> >>>> >> >>
> >>>> >> >>
> >>>> >> >> _______________________________________________
> >>>> >> >> Devel mailing list
> >>>> >> >> Devel at ovirt.org
> >>>> >> >> http://lists.ovirt.org/mailman/listinfo/devel
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >
> >>>> >
> >>>> >
> >>
> >>
> >
>