
On Mon, Nov 9, 2015 at 8:27 PM, Jamie Lawrence <jlawrence@squaretrade.com> wrote:
On 8 Nov 2015, at 1:32, Yedidyah Bar David wrote:
On Sat, Nov 7, 2015 at 3:00 AM, Jamie Lawrence <jlawrence@squaretrade.com> wrote:
I’m attempting to run engine-setup, and get to the DNS reverse lookup of the FQDN. The machine has two (bonded) interfaces, one for storage and one for everything else. The “everything else” network has DNS service, the storage network doesn’t, and this seems to make engine-setup cranky. /etc/hosts is properly set up for the storage network, but that apparently doesn’t count. I tried running with the --offline flag, but that apparently still expects DNS.
(--offline affects only package management, e.g. to prevent an upgrade)
IIUC engine-setup never fails on missing DNS resolution, only warns.
Sorry, that was wrong. Let me sketch the flow:

We query for the FQDN; let's call the answer FQDN.

We then check what FQDN resolves to using getaddrinfo; let's call the result resolvedAddresses.

We also look up FQDN in the DNS (using 'dig'). If it does not resolve, we warn.

If it does resolve in the DNS, *and* a special variable is set, we reverse-lookup each of resolvedAddresses (dig -x), and if none of the results matches FQDN, we fail with the error you see. This special variable is set, by default, only if you configured all-in-one. This is the place where it failed for you.

If it didn't fail there, we then continue: if another special variable is set (which is also set by default only in all-in-one), we check if resolvedAddresses is a subset of the addresses of non-loopback local interfaces, and fail if not. This is the only place where we actually check local addresses.

getaddrinfo means we effectively use the local resolver as configured by you in /etc/nsswitch.conf, by default looking first in /etc/hosts and then in the DNS.
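The flow above can be sketched roughly as follows. This is only an illustration written for this mail, not the actual engine-setup code: the function names, the check_reverse and check_local parameters (standing in for the two "special variables") and the use of 'dig +short' are all assumptions. The resolver and dig helpers are passed in as parameters so the logic can be tried without touching the network.

```python
import socket
import subprocess


def resolve_local(fqdn):
    """Addresses from the local resolver (whatever /etc/nsswitch.conf says)."""
    return {info[4][0] for info in socket.getaddrinfo(fqdn, None)}


def dig_short(*args):
    """Run `dig +short <args>`; return answer lines, or [] if dig failed."""
    proc = subprocess.run(
        ["dig", "+short"] + list(args),
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        return []
    return [
        line for line in proc.stdout.splitlines()
        if line and not line.startswith(";")
    ]


def validate_fqdn(fqdn, local_addresses=(), check_reverse=False,
                  check_local=False, resolve=resolve_local, dig=dig_short):
    """Hypothetical re-statement of the flow; returns a list of warnings."""
    warnings = []
    wanted = fqdn.rstrip(".").lower()

    # What the local resolver says (hosts file first, then DNS, by default).
    resolved = resolve(fqdn)

    # What the DNS alone says. A missing record is only a warning.
    if not dig(fqdn):
        warnings.append("%s does not resolve in the DNS" % fqdn)
    elif check_reverse:
        # Reverse-lookup every locally resolved address; fail if none
        # of the returned names matches the FQDN.
        names = set()
        for address in resolved:
            names.update(n.rstrip(".").lower() for n in dig("-x", address))
        if wanted not in names:
            raise RuntimeError(
                "The following addresses: %s did not reverse resolve into %s"
                % (",".join(sorted(resolved)), fqdn)
            )

    # The only place where local interface addresses are checked.
    if check_local and not resolved <= set(local_addresses):
        raise RuntimeError("%s resolves to non-local addresses" % fqdn)

    return warnings


# Stand-ins that replay the values seen in the log in this thread.
def fake_resolve(fqdn):
    return {"172.16.1.13"}


def fake_dig(*args):
    # Forward lookup answers; the reverse lookup gets nothing back.
    return ["10.180.202.13"] if args == ("box-3.squaretrade.com",) else []


try:
    validate_fqdn("box-3.squaretrade.com", check_reverse=True,
                  resolve=fake_resolve, dig=fake_dig)
except RuntimeError as error:
    print(error)
```

With check_reverse left at False the same inputs pass without an error, which matches the statement that the reverse check is enabled by default only for all-in-one.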
I may well be missing something or otherwise being bone-headed, but I am getting [ Error ] messages, which it doesn’t allow me to skip.
Details:
ovirt-engine.noarch 0:3.6.0.3-1.el7.centos
ovirt-engine-setup-plugin-allinone.noarch 0:3.6.0.3-1.el7.centos
CentOS Linux release 7.1.1503 (Core)
Please check/post setup logs. Thanks!
The full log is pushing 350k; unless you want it, I’m not going to do that to the mailing list.
Well, you can use other means for that, such as various pastebins, file sharing sites, or whatever.
The relevant portion (starting a bit before) seems to be:
- - - Snip - - -
2015-11-06 16:52:26 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND Local storage domain name [local_storage]:
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:500 ENVIRONMENT DUMP - BEGIN
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV OVESETUP_AIO/storageDomainDir=str:'/mnt/gluster/vm-img-brick-1/gv0'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV OVESETUP_AIO/storageDomainName=str:'local_storage'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV OVESETUP_SYSTEM/selinuxContexts=list:'[{'pattern': '/mnt/gluster/vm-img-brick-1/gv0(/.*)?', 'type': 'public_content_rw_t'}]'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:510 ENV OVESETUP_SYSTEM/selinuxRestorePaths=list:'['/mnt/gluster/vm-img-brick-1/gv0']'
2015-11-06 16:52:27 DEBUG otopi.context context.dumpEnvironment:514 ENVIRONMENT DUMP - END
2015-11-06 16:52:27 DEBUG otopi.context context._executeMethod:142 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.dialog.titles.Plugin._title_e_allinone
2015-11-06 16:52:27 DEBUG otopi.context context._executeMethod:142 Stage customization METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine_common.dialog.titles.Plugin._title_s_network
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND --== NETWORK CONFIGURATION ==--
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND
2015-11-06 16:52:27 DEBUG otopi.context context._executeMethod:142 Stage customization METHOD otopi.plugins.ovirt_engine_common.base.network.hostname.Plugin._customization
2015-11-06 16:52:27 DEBUG otopi.plugins.otopi.dialog.human human.queryString:156 query OVESETUP_NETWORK_FQDN_this DIALOG:SEND Host fully qualified DNS name of this server [box-3.squaretrade.com]:
So here you were asked for the FQDN and accepted the default, which was box-3.squaretrade.com.
2015-11-06 16:52:29 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname hostname._validateFQDNresolvability:195 box-3.squaretrade.com resolves to: set(['172.16.1.13'])
That's resolvedAddresses, which contains exactly one address, 172.16.1.13.
2015-11-06 16:52:29 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.executeRaw:828 execute: ['/bin/dig', 'box-3.squaretrade.com'], executable='None', cwd='None', env=None
2015-11-06 16:52:29 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.executeRaw:878 execute-result: ['/bin/dig', 'box-3.squaretrade.com'], rc=0
2015-11-06 16:52:29 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:936 execute-output: ['/bin/dig', 'box-3.squaretrade.com'] stdout:

; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.5 <<>> box-3.squaretrade.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62281
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;box-3.squaretrade.com. IN A

;; ANSWER SECTION:
box-3.squaretrade.com. 83752 IN A 10.180.202.13
That's the result of a DNS lookup for FQDN: 10.180.202.13. As you can see, we might already have a problem here, as this differs from the 172.16.1.13 that getaddrinfo returned. I admit I did not fully understand what you are trying to do, but this looks like a contradiction between what you put in the DNS and what you put in /etc/hosts. As explained above, this would have caused a failure eventually, if we hadn't failed earlier.
;; Query time: 0 msec
;; SERVER: 10.22.10.253#53(10.22.10.253)
;; WHEN: Fri Nov 06 16:52:29 PST 2015
;; MSG SIZE rcvd: 65

2015-11-06 16:52:29 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:941 execute-output: ['/bin/dig', 'box-3.squaretrade.com'] stderr:

2015-11-06 16:52:29 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.executeRaw:828 execute: ['/bin/dig', '-x', '172.16.1.13'], executable='None', cwd='None', env=None
2015-11-06 16:52:44 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.executeRaw:878 execute-result: ['/bin/dig', '-x', '172.16.1.13'], rc=9
2015-11-06 16:52:44 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:936 execute-output: ['/bin/dig', '-x', '172.16.1.13'] stdout:

; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.5 <<>> -x 172.16.1.13
;; global options: +cmd
;; connection timed out; no servers could be reached
Here we try to reverse-lookup 172.16.1.13 in the DNS, and time out after 15 seconds (note rc=9). Not sure why. Perhaps your DNS server is configured not to reply to queries about private addresses, or something like that. I'd expect a reply of NXDOMAIN here (or some actual answer).
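To tell these cases apart when reading such logs, something like the following could help. It is a minimal sketch, not part of engine-setup; classify_dig_output and its return labels are made up for illustration. It relies only on what is visible in the output above plus the dig man page, which documents exit status 9 as "no reply from server".

```python
import re


def classify_dig_output(returncode, stdout):
    """Label a full (non +short) dig run.

    Returns one of: 'no-reply', 'nxdomain', 'answer', 'empty', 'other'.
    """
    # Exit status 9 is dig's "no reply from server"; the text check
    # covers logs where only stdout was kept.
    if returncode == 9 or "connection timed out" in stdout:
        return "no-reply"

    status = re.search(r"status: (\w+)", stdout)
    if status is None:
        return "other"
    if status.group(1) == "NXDOMAIN":
        return "nxdomain"
    if status.group(1) == "NOERROR":
        # NOERROR with zero answers means the name exists but has no
        # record of the requested type.
        count = re.search(r"ANSWER: (\d+)", stdout)
        if count is not None and int(count.group(1)) > 0:
            return "answer"
        return "empty"
    return "other"


# The two dig runs from the log in this thread, abbreviated.
forward = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62281\n"
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0\n"
)
reverse = ";; connection timed out; no servers could be reached\n"

print(classify_dig_output(0, forward))
print(classify_dig_output(9, reverse))
```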
2015-11-06 16:52:44 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname plugin.execute:941 execute-output: ['/bin/dig', '-x', '172.16.1.13'] stderr:

2015-11-06 16:52:44 DEBUG otopi.plugins.ovirt_engine_common.base.network.hostname hostname.test_hostname:323 test_hostname exception
Traceback (most recent call last):
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/hostname.py", line 320, in test_hostname
    self._validateFQDNresolvability(name)
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/hostname.py", line 232, in _validateFQDNresolvability
    fqdn=fqdn
RuntimeError: The following addresses: 172.16.1.13 did not reverse resolve into sfo-mgmt-prod-3.squaretrade.com
2015-11-06 16:52:44 ERROR otopi.plugins.ovirt_engine_common.base.network.hostname dialog.queryEnvKey:115 Host name is not valid: The following addresses: 172.16.1.13 did not reverse resolve into box-3.squaretrade.com
And this is the error message you received, which I think makes sense even without understanding how engine-setup works.
2015-11-06 16:52:44 DEBUG otopi.plugins.otopi.dialog.human human.queryString:156 query OVESETUP_NETWORK_FQDN_this
2015-11-06 16:52:44 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:219 DIALOG:SEND Host fully qualified DNS name of this server [box-3.squaretrade.com]:
2015-11-06 16:52:47 DEBUG otopi.context context._executeMethod:156 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 146, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-common/base/network/hostname.py", line 81, in _customization
    supply_default=True,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/hostname.py", line 349, in getHostname
    'test': test_hostname,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/dialog.py", line 103, in queryEnvKey
    default=default,
  File "/usr/share/otopi/plugins/otopi/dialog/human.py", line 174, in queryString
    value = self._readline(hidden=hidden)
  File "/usr/lib/python2.7/site-packages/otopi/dialog.py", line 261, in _readline
    value = self.__input.readline()
  File "/usr/lib/python2.7/site-packages/otopi/main.py", line 63, in _signal
    raise RuntimeError("SIG%s" % signum)
RuntimeError: SIG2
2015-11-06 16:52:47 ERROR otopi.context context._executeMethod:165 Failed to execute stage 'Environment customization': SIG2
2015-11-06 16:52:47 DEBUG otopi.context context.dumpEnvironment:500 ENVIRONMENT DUMP - BEGIN
- - - Snip - - -
It looks like it resolves the non-storage network fine, tries to do the same for the storage network, can’t, and errors out.
No, it only tries the result of a lookup for FQDN. As I wrote above, I do not fully understand what you are trying to do. May I suggest that you simply use, everywhere, both in the DNS and in /etc/hosts, different names for the different addresses. E.g. if your storage interface is 172.16.1.13, give it a different name, say box-3-storage.squaretrade.com or whatever. Do this either in the DNS, or in /etc/hosts, or both, but if both, keep the two identical. Do the same for the other address. And when asked for the FQDN, input the name you want to access the engine with.
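To check whether a hosts file and the DNS agree, a small script along these lines might do. It is only a sketch: parse_hosts and find_conflicts are hypothetical helpers, the DNS side is passed in as a plain dictionary rather than queried, and the "before" hosts line is my guess at what your file contains, since it was not posted.

```python
def parse_hosts(text):
    """Map each name in hosts-file text to the set of addresses listing it."""
    names = {}
    for line in text.splitlines():
        # Drop comments, then split into address followed by names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, aliases = fields[0], fields[1:]
        for name in aliases:
            names.setdefault(name.lower(), set()).add(address)
    return names


def find_conflicts(hosts_names, dns_names):
    """Names known to both sources whose address sets differ."""
    return {
        name: (hosts_names[name], dns_names[name])
        for name in hosts_names
        if name in dns_names and hosts_names[name] != dns_names[name]
    }


# What the DNS returned in the log above.
dns = {"box-3.squaretrade.com": {"10.180.202.13"}}

# ASSUMPTION: a hosts file that reuses the DNS name for the storage address.
before = "172.16.1.13 box-3.squaretrade.com\n"

# The suggested layout: one distinct name per address.
after = (
    "10.180.202.13 box-3.squaretrade.com\n"
    "172.16.1.13 box-3-storage.squaretrade.com\n"
)

print(find_conflicts(parse_hosts(before), dns))
print(find_conflicts(parse_hosts(after), dns))
```

The first call reports box-3.squaretrade.com with the two differing address sets; the second reports nothing, because the storage name is not in the DNS at all and the other name maps to the same address in both places.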
Thanks for looking.
Thanks for the report. If you feel, reading the above analysis, that engine-setup should behave differently, by all means go ahead and open a bug. But if you do, please describe exactly what you want. And, if what you want is "Please add a flag or whatever that will allow me to override all this name lookup mess and just make engine-setup do what I say", please consider that the current behavior actually did find something which I personally think is unintended, so it helped you catch it now instead of perhaps spending much more time, during a much less comfortable situation, when something actually breaks due to this.

Best,
--
Didi