Nadav,
Thank you for working on this, but we have one more issue with name resolution.
I checked the last job you triggered and noticed that VM migration
failed due to a similar issue between the hosts.
Here is an excerpt from the custom logging that you added:
2017-04-30 14:23:35,675-0400 INFO (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher] subject:
((('organizationName', u'Test'),), (('commonName',
u'lago-basic-suite-master-host0'),)), key: organizationName, value:
Test (sslutils:241)
2017-04-30 14:23:35,675-0400 INFO (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher] subject:
((('organizationName', u'Test'),), (('commonName',
u'lago-basic-suite-master-host0'),)), key: commonName, value:
lago-basic-suite-master-host0 (sslutils:241)
2017-04-30 14:23:35,676-0400 INFO (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher] src_addr:
::ffff:192.168.201.2, cn_addr: lago-basic-suite-master-host0
(sslutils:262)
2017-04-30 14:23:35,676-0400 INFO (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher] src_addr_extracted:
192.168.201.2, cn_addr_extracted: lago-basic-suite-master-host0
(sslutils:266)
2017-04-30 14:23:35,677-0400 INFO (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher]
socket.gethostbyadd(src_addr)[0]:
lago-basic-suite-master-host0.lago.local (sslutils:268)
2017-04-30 14:23:35,678-0400 INFO (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher] compare
::ffff:192.168.201.2, lago-basic-suite-master-host0, res: False
(sslutils:244)
2017-04-30 14:23:35,678-0400 ERROR (Reactor thread)
[ProtocolDetector.SSLHandshakeDispatcher] peer certificate does not
match host name (sslutils:226)
It looks like the engine issued a certificate for
'lago-basic-suite-master-host0', but we resolve 192.168.201.2 to
'lago-basic-suite-master-host0.lago.local'.
Can we fix this as well?
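
To make the mismatch concrete, here is a minimal Python 3 sketch of the kind of comparison the log lines above suggest. It is not the actual sslutils code: the function names, the injectable `resolve` parameter, and the `allow_short` flag are all hypothetical and only there for illustration. It assumes the check strips the IPv4-mapped prefix from the source address, reverse-resolves it, and compares the result with the certificate's commonName.

```python
import ipaddress
import socket


def normalize_addr(src_addr):
    """Return the IPv4 form of an IPv4-mapped IPv6 address, else the address as-is."""
    ip = ipaddress.ip_address(src_addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)


def short_name(hostname):
    """Return the first DNS label of a host name, lower-cased."""
    return hostname.split(".", 1)[0].lower()


def cn_matches_peer(cn, src_addr, resolve=socket.gethostbyaddr, allow_short=False):
    """Hypothetical check: does the certificate CN identify the connecting peer?

    resolve     -- reverse-lookup callable, injectable so the check can be tested
    allow_short -- illustrative relaxation: accept an unqualified CN that equals
                   the first label of the reverse-resolved name
    """
    addr = normalize_addr(src_addr)
    if cn == addr:
        return True
    resolved = resolve(addr)[0]
    if cn.lower() == resolved.lower():
        return True
    return allow_short and "." not in cn and cn.lower() == short_name(resolved)
```

With a resolver that returns 'lago-basic-suite-master-host0.lago.local' for 192.168.201.2, the strict comparison rejects the unqualified commonName and accepts the fully qualified one. The `allow_short` variant is only one possible direction; issuing the host certificates with the fully qualified name would avoid loosening the check at all.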
Thanks,
Piotr
On Sun, Apr 30, 2017 at 7:42 PM, Piotr Kliczewski <pkliczew@redhat.com> wrote:
> Wow, great.
>
> Thank you!
>
> On 30 Apr 2017 at 19:40, "Nadav Goldin" <ngoldin@redhat.com> wrote:
>>
>> OK, I think the issue was the unqualified domain name. The certificate
>> was generated (as before for 'engine') without the domain name, i.e.
>> 'lago-basic-suite-master-engine'. On the VDSM side, the IP resolved to
>> the name 'lago-basic-suite-master-engine.lago.local', and the
>> comparison with the unqualified name then failed. I assume this is the
>> expected behaviour, though I am not sure (as you can easily resolve
>> 'lago-basic-suite-master-engine' to
>> 'lago-basic-suite-master-engine.lago.local' on the hosts). It should
>> be fixed in [1]. I just ran OST manual with the same debugging patch
>> applied on top of yours, and at least add_hosts passed.
>>
>>
>> [1] https://gerrit.ovirt.org/#/c/76225/10
>> [2] http://jenkins.ovirt.org/job/ovirt-system-tests_manual/338/console
>>
>> On Sun, Apr 30, 2017 at 7:50 PM, Piotr Kliczewski <pkliczew@redhat.com>
>> wrote:
>> > Sure, I will take a look later today.
>> >
>> > On 30 Apr 2017 at 18:47, "Nadav Goldin" <ngoldin@redhat.com> wrote:
>> >>
>> >> Thanks for the explanation.
>> >>
>> >> I added some more debugging messages on top of your patch. Could you
>> >> please take a look at [1] and tell me what you expect to resolve
>> >> differently for this to work?
>> >>
>> >>
>> >> [1]
>> >>
>> >> http://jenkins.ovirt.org/job/ovirt-system-tests_manual/337/artifact/exported-artifacts/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-host0/_var_log/vdsm/vdsm.log