Nadav,
Here is the code [1] which is responsible for this check. Here is the vdsm log
[2], where I added a logging statement to see what the commonName value is
(it was 'engine').
Here are the steps performed during the check:
1. We get the client peer name by calling socket.getpeername()[0], which is:
::ffff:192.168.201.3
2. We get the common name from the certificate. It should be the engine's FQDN,
but as the log shows we get 'engine'.
3. When a name is used, we compare the common name with the name looked up from
the IP. I pushed a patch [3] that normalizes the IP (it still requires my
attention).
Based on the logs, it seems that 192.168.201.3 does not resolve to the name
'engine'.
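To make the three steps concrete, here is a minimal sketch of the comparison. It is not the actual vdsm code: the function names (normalize_peer_address, reverse_lookup, peer_matches_common_name) and the injectable lookup parameter are assumptions made only for illustration, and the real check may differ in details.

```python
import ipaddress
import socket


def normalize_peer_address(addr):
    """Turn an IPv4-mapped IPv6 address into plain IPv4.

    '::ffff:192.168.201.3' becomes '192.168.201.3'; anything that is not
    a valid IP address is returned unchanged.
    """
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return addr
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)


def reverse_lookup(addr):
    """Return the primary host name for addr, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(addr)[0]
    except (socket.herror, socket.gaierror):
        return None


def peer_matches_common_name(peer_addr, common_name, lookup=reverse_lookup):
    """Compare the certificate common name with the connecting peer.

    The match succeeds if the common name equals the normalized IP, or
    equals the name that the IP resolves to (hypothetical policy).
    """
    addr = normalize_peer_address(peer_addr)
    if common_name == addr:
        return True
    resolved = lookup(addr)
    return resolved is not None and resolved.lower() == common_name.lower()
```

With a lookup that cannot resolve 192.168.201.3, peer_matches_common_name('::ffff:192.168.201.3', 'engine') returns False, which corresponds to the failure we see in this setup.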
Thanks,
Piotr
[1]
On Sun, Apr 30, 2017 at 12:52 PM, Nadav Goldin <ngoldin(a)redhat.com> wrote:
Looking at the failure, I'm not sure what is wrong here on the setup
side. The FQDN (lago-basic-suite-master-engine) should be resolvable on
the hosts - at least from what I tested locally. In the engine
setup.log I see this was the generated certificate (if we're talking
about the same one here):
2017-04-30 06:30:41,308-0400 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca
plugin.executeRaw:813 execute:
('/usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh', '--name=engine',
'--password=**FILTERED**',
'--subject=/C=US/O=Test/CN=lago-basic-suite-master-engine'),
executable='None', cwd='None', env=None
2017-04-30 06:30:44,542-0400 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca
plugin.executeRaw:863 execute-result:
('/usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh', '--name=engine',
'--password=**FILTERED**',
'--subject=/C=US/O=Test/CN=lago-basic-suite-master-engine'), rc=0
2017-04-30 06:30:44,543-0400 DEBUG
otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca
plugin.execute:921 execute-output:
('/usr/share/ovirt-engine/bin/pki-enroll-pkcs12.sh', '--name=engine',
'--password=**FILTERED**',
'--subject=/C=US/O=Test/CN=lago-basic-suite-master-engine')
Do we expect the '--name' parameter to be the same as the hostname? My
thought was that it should use the engine FQDN, and that should match
the certificate name.
If that is not the problem, can you make the output in the vdsm logs
more verbose, so we'll know exactly what name it is looking for?
Thanks
Nadav.
On Sun, Apr 30, 2017 at 1:43 PM, Piotr Kliczewski <pkliczew(a)redhat.com>
wrote:
> The job failed.
>
> Just to be clear: we need to be able to resolve the engine name on the host
> side, or use an IP address.
>
> Thanks,
> Piotr
>
> On Sun, Apr 30, 2017 at 12:23 PM, Piotr Kliczewski <pkliczew(a)redhat.com>
> wrote:
>>
>> Here is the link
>>
>> http://jenkins.ovirt.org/job/ovirt-system-tests_manual/331/
>>
>> On Sun, Apr 30, 2017 at 12:17 PM, Piotr Kliczewski <pkliczew(a)redhat.com>
>> wrote:
>>>
>>> Sure, will test
>>>
>>> On Apr 30, 2017 12:14, "Nadav Goldin" <ngoldin(a)redhat.com> wrote:
>>>>
>>>> It is being worked on in [1]. As it requires cross-changes in all suites,
>>>> it takes a while to test it and cover all changes, though
>>>> basic-suite-master already passed.
>>>> Can you test it by running OST manual with your changes and the OST
>>>> patch (i.e. also put in GERRIT_REFSPEC: refs/changes/25/76225/7)?
>>>>
>>>>
>>>>
>>>> [1] https://gerrit.ovirt.org/76225
>>>>
>>>> On Sun, Apr 30, 2017 at 1:09 PM, Yaniv Kaul <ykaul(a)redhat.com> wrote:
>>>> >
>>>> >
>>>> > On Sun, Apr 30, 2017 at 1:03 PM, Piotr Kliczewski
>>>> > <piotr.kliczewski(a)gmail.com> wrote:
>>>> >>
>>>> >> When can we have it fixed? I checked a few minutes ago and the
>>>> >> problem is still there.
>>>> >
>>>> >
>>>> > https://gerrit.ovirt.org/#/c/76225/ should cover this.
>>>> >
>>>> > What I wonder is what caused this in the first place. The SSL change?
>>>> > Y.
>>>> >
>>>> >>
>>>> >>
>>>> >> Thanks,
>>>> >> Piotr
>>>> >>
>>>> >> On Sat, Apr 29, 2017 at 11:18 AM, Piotr Kliczewski
>>>> >> <pkliczew(a)redhat.com>
>>>> >> wrote:
>>>> >> > Nadav,
>>>> >> >
>>>> >> > Yes, vdsm is not able to resolve 'engine', which is used in the
>>>> >> > engine's certificate.
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Piotr
>>>> >> >
>>>> >> > On Apr 29, 2017 00:37, "Nadav Goldin" <ngoldin(a)redhat.com> wrote:
>>>> >> >
>>>> >> > Hi Piotr,
>>>> >> > Can you clarify what you noticed is not resolvable - the 'engine'
>>>> >> > FQDN from host0?
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Nadav.
>>>> >> >
>>>> >> >
>>>> >> > On Fri, Apr 28, 2017 at 4:15 PM, Piotr Kliczewski
>>>> >> > <pkliczew(a)redhat.com>
>>>> >> > wrote:
>>>> >> >> I started to investigate the issue [1] and it seems like there is an
>>>> >> >> issue in the Lago setup we use.
>>>> >> >>
>>>> >> >> During the handshake we have a step to verify whether the client
>>>> >> >> certificate was issued for a specific host (there is no such
>>>> >> >> functionality in the m2crypto code base). It works fine when using
>>>> >> >> either IP addresses or FQDNs, but in this particular setup we use a
>>>> >> >> mix of both.
>>>> >> >>
>>>> >> >> When I added logging I saw that in the engine certificate we use the
>>>> >> >> name 'engine', which is not resolvable on the host side, so the check
>>>> >> >> fails.
>>>> >> >> I posted a patch [2] which fixes the IPv4-mapped addresses issue, but
>>>> >> >> we need to fix the setup issue.
>>>> >> >>
>>>> >> >> Thanks,
>>>> >> >> Piotr
>>>> >> >>
>>>> >> >> [1] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/326/
>>>> >> >> [2] https://gerrit.ovirt.org/#/c/76197/
>>>> >> >>
>>>> >> >> On Thu, Apr 27, 2017 at 3:39 PM, Piotr Kliczewski
>>>> >> >> <pkliczew(a)redhat.com>
>>>> >> >> wrote:
>>>> >> >>>
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> On Thu, Apr 27, 2017 at 3:13 PM, Evgheni Dereveanchin
>>>> >> >>> <ederevea(a)redhat.com> wrote:
>>>> >> >>>>
>>>> >> >>>> Test failed: 002_bootstrap/add_hosts
>>>> >> >>>>
>>>> >> >>>> Link to suspected patches:
>>>> >> >>>> https://gerrit.ovirt.org/76107 - ssl: change default library
>>>> >> >>>>
>>>> >> >>>> Link to job:
>>>> >> >>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6491/
>>>> >> >>>>
>>>> >> >>>> VDSM log:
>>>> >> >>>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6491/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host0/_var_log/vdsm/vdsm.log
>>>> >> >>>>
>>>> >> >>>> Error snippet from VDSM log, this repeats on each connection
>>>> >> >>>> attempt from Engine side:
>>>> >> >>>>
>>>> >> >>>> <error>
>>>> >> >>>>
>>>> >> >>>> 2017-04-27 06:39:27,768-0400 INFO (Reactor thread)
>>>> >> >>>> [ProtocolDetector.AcceptorImpl] Accepted connection from
>>>> >> >>>> ::ffff:192.168.201.3:49530 (protocoldetector:74)
>>>> >> >>>> 2017-04-27 06:39:27,898-0400 ERROR (Reactor thread)
>>>> >> >>>> [vds.dispatcher] uncaptured python exception, closing channel
>>>> >> >>>> <yajsonrpc.betterAsyncore.Dispatcher connected
>>>> >> >>>> ('::ffff:192.168.201.3', 49530, 0, 0) at 0x1cc3b00>
>>>> >> >>>> (<class 'socket.error'>:Address family not supported by protocol
>>>> >> >>>> [/usr/lib64/python2.7/asyncore.py|readwrite|110]
>>>> >> >>>> [/usr/lib64/python2.7/asyncore.py|handle_write_event|468]
>>>> >> >>>> [/usr/lib/python2.7/site-packages/yajsonrpc/betterAsyncore.py|handle_write|70]
>>>> >> >>>> [/usr/lib/python2.7/site-packages/yajsonrpc/betterAsyncore.py|_delegate_call|149]
>>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|handle_write|213]
>>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|_handle_io|223]
>>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|_verify_host|237]
>>>> >> >>>> [/usr/lib/python2.7/site-packages/vdsm/sslutils.py|compare_names|249])
>>>> >> >>>> (betterAsyncore:160)
>>>> >> >>>>
>>>> >> >>>> </error>
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> This means that what we have in the certificate does not match the
>>>> >> >>> source address we get. I suspect that we issue the certificate for
>>>> >> >>> 192.168.201.3 but we get ::ffff:192.168.201.3.
>>>> >> >>> The change was verified in an env where IPv4 is used. I pushed a
>>>> >> >>> revert [1] for now so we can work on fixing the issue.
>>>> >> >>>
>>>> >> >>> [1] https://gerrit.ovirt.org/#/c/76160
>>>> >> >>>
>>>> >> >>>>
>>>> >> >>>> --
>>>> >> >>>> Regards,
>>>> >> >>>> Evgheni Dereveanchin
>>>> >> >>>
>>>> >> >>>
>>>> >> >>
>>>> >> >>
>>>> >> >> _______________________________________________
>>>> >> >> Devel mailing list
>>>> >> >> Devel(a)ovirt.org
>>>> >> >> http://lists.ovirt.org/mailman/listinfo/devel
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >
>>>> >
>>>> >
>>
>>
>