On Mon, Nov 9, 2015 at 8:27 PM, Jamie Lawrence
<jlawrence(a)squaretrade.com> wrote:
On 8 Nov 2015, at 1:32, Yedidyah Bar David wrote:
> On Sat, Nov 7, 2015 at 3:00 AM, Jamie Lawrence
> <jlawrence(a)squaretrade.com> wrote:
>> I’m attempting to run engine-setup, and get to the DNS reverse lookup of
>> the FQDN. The machine has two (bonded) interfaces, one for storage and one
>> for everything else. The “everything else” network has DNS service, the
>> storage network doesn’t, and this seems to make engine-setup cranky.
>> /etc/hosts is properly set up for the storage network, but that apparently
>> doesn’t count.
>> I tried running with the -offline flag, but that apparently still expects
>> DNS.
(-offline affects only package management, e.g. to prevent an upgrade)
>
>
> IIUC engine-setup never fails on missing DNS resolution, only warns.
Sorry, that was wrong. Let me sketch the flow:
1. We query for the FQDN. Let's call the answer FQDN.
2. We then check what FQDN resolves to using getaddrinfo. Let's call the
   result resolvedAddresses.
3. We also look up FQDN in the DNS (using 'dig'). If it does not resolve, we
   warn.
4. If it does resolve in the DNS, *and* a special variable is set, we reverse
   look up each of resolvedAddresses (dig -x), and if none of the results
   matches FQDN, we fail with the error you see. This special variable is
   set, by default, only if you configured all-in-one. This is the place
   where it failed for you.
5. If it didn't fail there, and another special variable is set (which is
   also set by default only in all-in-one), we check if resolvedAddresses is
   a subset of the addresses of non-loopback local interfaces, and fail if
   not. This is the only place where we actually check local addresses.
getaddrinfo means we effectively use the local resolver as configured by
you in /etc/nsswitch.conf, by default looking first in /etc/hosts and then
in the DNS.
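As a rough illustration only (this is not the actual engine-setup code), the
flow described here could be sketched in Python as follows. The function
names are hypothetical, check_reverse and check_local stand in for the two
"special variables", and dns_forward/dns_reverse are injected callables that
would wrap 'dig' and 'dig -x' in real life:

```python
import socket


def system_resolve(fqdn):
    """Resolve fqdn through the system resolver (nsswitch.conf order)."""
    return {info[4][0] for info in socket.getaddrinfo(fqdn, None)}


def validate_fqdn(fqdn, resolved, dns_forward, dns_reverse,
                  local_addresses=(), check_reverse=False, check_local=False):
    """Hypothetical restatement of the flow; returns a list of warnings.

    resolved        -- set of addresses, e.g. from system_resolve()
    dns_forward(n)  -- set of addresses the DNS returns for name n
    dns_reverse(a)  -- set of names the DNS returns for address a
    """
    warnings = []
    if not dns_forward(fqdn):
        # A missing forward record in the DNS is only a warning.
        warnings.append("%s does not resolve in the DNS" % fqdn)
    elif check_reverse:
        # Fail only if none of the reverse lookups gives back the FQDN.
        names = set()
        for address in sorted(resolved):
            names.update(n.rstrip(".").lower() for n in dns_reverse(address))
        if fqdn.lower() not in names:
            raise RuntimeError(
                "The following addresses: %s did not reverse resolve into %s"
                % (",".join(sorted(resolved)), fqdn)
            )
    if check_local and not set(resolved) <= set(local_addresses):
        # Assumed wording; the real message is not shown in this thread.
        raise RuntimeError("%s resolves to non-local addresses" % fqdn)
    return warnings
```

With resolved = {'172.16.1.13'}, a forward lookup that returns an address and
a reverse lookup that returns nothing, this raises the RuntimeError when
check_reverse is set, and returns no warnings otherwise.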
I may well be missing something or otherwise being bone-headed, but I am
getting [ Error ] messages, which it doesn’t allow me to skip.
>> Details:
>> ovirt-engine.noarch 0:3.6.0.3-1.el7.centos
>> ovirt-engine-setup-plugin-allinone.noarch 0:3.6.0.3-1.el7.centos
>>
>> CentOS Linux release 7.1.1503 (Core)
> Please check/post setup logs. Thanks!
The full log is pushing 350k; unless you want, I’m not going to do that to
the mailing list.
Well, you can use other means for that, such as various
pastebins/file sharing sites/whatever.
The relevant portion (starting a bit before) seems to be:
- - - Snip - - -
2015-11-06 16:52:26 DEBUG otopi.plugins.otopi.dialog.human
dialog.__logString:219 DIALOG:SEND Local storage domain name [local_storage]:
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:500
ENVIRONMENT DUMP - BEGIN
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV
OVESETUP_AIO/storageDomainDir=str:'/mnt/gluster/vm-img-brick-1/gv0'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV
OVESETUP_AIO/storageDomainName=str:'local_storage'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV
OVESETUP_SYSTEM/selinuxContexts=list:'[{'pattern': '/mnt/gluster/vm-img-brick-1/gv0(/.*)?', 'type': 'public_content_rw_t'}]'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV
OVESETUP_SYSTEM/selinuxRestorePaths=list:'['/mnt/gluster/vm-img-brick-1/gv0']'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:514
ENVIRONMENT DUMP - END
2015-11-06 16:52:27 DEBUG otopi.context context._executeMethod:142 Stage
customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.dialog.titles.Plugin._title_e_allinone
2015-11-06 16:52:27 DEBUG otopi.context context._executeMethod:142 Stage
customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.dialog.titles.Plugin._title_s_network
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human
dialog.__logString:219 DIALOG:SEND
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human
dialog.__logString:219 DIALOG:SEND --== NETWORK CONFIGURATION ==--
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human
dialog.__logString:219 DIALOG:SEND
2015-11-06 16:52:27 DEBUG otopi.context context._executeMethod:142 Stage
customization METHOD otopi.plugins.ovirt_engine_common.base.network.hostname.Plugin._customization
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human
human.queryString:156 query OVESETUP_NETWORK_FQDN_this
DIALOG:SEND Host fully qualified DNS name of this server [box-3.squaretrade.com]:
So here you were asked about FQDN, and accepted the default, which was
box-3.squaretrade.com
2015-11-06 16:52:29 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname
hostname._validateFQDNresolvability:195 box-3.squaretrade.com resolves to:
set(['172.16.1.13'])
That's resolvedAddresses, which contains exactly one address,
172.16.1.13.
2015-11-06 16:52:29 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname
plugin.executeRaw:828 execute: ['/bin/dig', 'box-3.squaretrade.com'],
executable='None', cwd='None', env=None
2015-11-06 16:52:29 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname
plugin.executeRaw:878 execute-result: ['/bin/dig', 'box-3.squaretrade.com'],
rc=0
2015-11-06 16:52:29 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:936
execute-output: ['/bin/dig', 'box-3.squaretrade.com'] stdout:
; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.5 <<>> box-3.squaretrade.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62281
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;box-3.squaretrade.com. IN A
;; ANSWER SECTION:
box-3.squaretrade.com. 83752 IN A 10.180.202.13
That's the result of a DNS lookup for FQDN: 10.180.202.13.
As you can see, we might already have a problem here, as this is
different from resolvedAddresses. I admit I did not fully understand what
you are trying to do, but this looks like a contradiction between what you
put in the DNS and what you put in /etc/hosts. As explained above, this
would have caused a failure eventually, if we didn't fail earlier.
;; Query time: 0 msec
;; SERVER: 10.22.10.253#53(10.22.10.253)
;; WHEN: Fri Nov 06 16:52:29 PST 2015
;; MSG SIZE rcvd: 65
2015-11-06 16:52:29 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:941
execute-output: ['/bin/dig', 'box-3.squaretrade.com'] stderr:
2015-11-06 16:52:29 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname
plugin.executeRaw:828 execute: ['/bin/dig', '-x', '172.16.1.13'],
executable='None', cwd='None', env=None
2015-11-06 16:52:44 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname
plugin.executeRaw:878 execute-result: ['/bin/dig', '-x', '172.16.1.13'],
rc=9
2015-11-06 16:52:44 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:936
execute-output: ['/bin/dig', '-x', '172.16.1.13'] stdout:
; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.5 <<>> -x 172.16.1.13
;; global options: +cmd
;; connection timed out; no servers could be reached
Here we try to reverse-lookup 172.16.1.13 in the DNS, and time out.
Not sure why. Perhaps your DNS server is configured to not reply to queries
about private addresses, or something like that. I'd expect an NXDOMAIN
reply here (or some actual answer).
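For reference, the rc=9 in the log above is dig's documented exit status for
"no reply from server". That is different from an NXDOMAIN answer, where dig
exits with 0 and reports the status in the header. A minimal sketch of telling
these cases apart from the exit code and output; classify_dig and the labels
it returns are hypothetical, not anything engine-setup does:

```python
import re


def classify_dig(rc, stdout):
    """Classify the outcome of a dig run from its exit code and output."""
    if rc == 9:
        # dig exit status 9: no reply from server (e.g. a timeout).
        return "no-reply"
    if rc != 0:
        # Usage error, batch file problem or internal error.
        return "dig-error"
    match = re.search(r"status: ([A-Z]+)", stdout)
    if match is None:
        return "unknown"
    status = match.group(1)
    if status == "NOERROR":
        # NOERROR with zero answers: the name exists, but not this record.
        answers = re.search(r"ANSWER: (\d+)", stdout)
        if answers is not None and int(answers.group(1)) == 0:
            return "no-data"
        return "answer"
    return status.lower()  # e.g. "nxdomain", "servfail"
```

Fed the forward lookup from the log (rc=0, status NOERROR, ANSWER: 1) this
gives "answer"; fed the reverse lookup (rc=9) it gives "no-reply".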
2015-11-06 16:52:44 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:941
execute-output: ['/bin/dig', '-x', '172.16.1.13'] stderr:
2015-11-06 16:52:44 DEBUG
otopi.plugins.ovirt_engine_common.base.network.hostname
hostname.test_hostname:323 test_hostname exception
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/hostname.py", line 320, in test_hostname
    self._validateFQDNresolvability(name)
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/hostname.py", line 232, in _validateFQDNresolvability
    fqdn=fqdn
RuntimeError: The following addresses: 172.16.1.13 did not reverse resolve into sfo-mgmt-prod-3.squaretrade.com
2015-11-06 16:52:44 ERROR
otopi.plugins.ovirt_engine_common.base.network.hostname
dialog.queryEnvKey:115 Host name is not valid: The following addresses:
172.16.1.13 did not reverse resolve into box-3.squaretrade.com
And this is the error message you received, which I think makes sense even
without understanding how engine-setup works.
2015-11-06 16:52:44 DEBUG otopi.plugins.otopi.dialog.human
human.queryString:156 query OVESETUP_NETWORK_FQDN_this
2015-11-06 16:52:44 DEBUG otopi.plugins.otopi.dialog.human
dialog.__logString:219 DIALOG:SEND Host fully qualified DNS name of this
server [box-3.squaretrade.com]:
2015-11-06 16:52:47 DEBUG otopi.context context._executeMethod:156 method
exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-common/base/network/hostname.py", line 81, in _customization
    supply_default=True,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/hostname.py", line 349, in getHostname
    'test': test_hostname,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/dialog.py", line 103, in queryEnvKey
    default=default,
  File "/usr/share/otopi/plugins/otopi/dialog/human.py", line 174, in queryString
    value = self._readline(hidden=hidden)
  File "/usr/lib/python2.7/site-packages/otopi/dialog.py", line 261, in _readline
    value = self.__input.readline()
  File "/usr/lib/python2.7/site-packages/otopi/main.py", line 63, in _signal
    raise RuntimeError("SIG%s" % signum)
RuntimeError: SIG2
2015-11-06 16:52:47 ERROR otopi.context context._executeMethod:165 Failed to
execute stage 'Environment customization': SIG2
2015-11-06 16:52:47 DEBUG otopi.context context.dumpEnvironment:500
ENVIRONMENT DUMP - BEGIN
- - - Snip - - -
It looks like it resolves the non-storage network fine, tries to do the same
for the storage network, can’t, and errors out.
No, it only tries the result of a lookup for FQDN.
As I wrote above, I do not fully understand what you are trying to do.
May I suggest that you simply use different names for the different
addresses, everywhere, both in the DNS and in /etc/hosts.
E.g. if your storage interface is 172.16.1.13, give it a different name, say
box-3-storage.squaretrade.com or whatever. Do this either in the DNS, or in
/etc/hosts, or both, but if both, make them agree. And same for the other
address. And when asked about FQDN, input the one you want to access the
engine with.
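One way to check that the two sources agree is to compare them name by name.
The sketch below is only an illustration: parse_hosts and find_conflicts are
hypothetical helpers, and the DNS side is passed in as a plain dict instead of
being queried.

```python
def parse_hosts(text):
    """Parse /etc/hosts-style text into {name: set(addresses)}."""
    table = {}
    for line in text.splitlines():
        # Drop comments, then split into address followed by names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name.lower(), set()).add(address)
    return table


def find_conflicts(hosts, dns):
    """Return names present in both maps whose address sets differ."""
    return {
        name: (hosts[name], dns[name])
        for name in set(hosts) & set(dns)
        if hosts[name] != dns[name]
    }
```

Assuming /etc/hosts maps box-3.squaretrade.com to 172.16.1.13 while the DNS
has 10.180.202.13 for it, find_conflicts reports that name. After renaming the
storage entry to box-3-storage.squaretrade.com, it reports nothing.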
Thanks for looking.
Thanks for the report.
If you feel, reading the above analysis, that engine-setup should behave
differently, by all means go ahead and open a bug. But if you do, please
describe exactly what you want.
And, if what you want is "Please add a flag or whatever that will allow me
to override all this name lookup mess and just make engine-setup do what I
say", please consider that the current behavior actually did find something
which I personally think is unintended, so it helped you catch it now instead
of perhaps spending much more time, in a much less comfortable situation,
when something actually breaks because of it.
Best,
--
Didi