
On Sun, Jun 21, 2020 at 8:04 PM Gilboa Davara <gilboad@gmail.com> wrote:
Hello,
Following the previous email, I think I'm hitting an odd problem, not sure if it's my mistake or an actual bug.
1. Newly deployed 4.4 self-hosted engine on localhost NFS storage on a single node.
2. Installation failed during the final phase with a non-descriptive error message [1].
I agree. Would you like to open a bug about this? It's not always easy to know the root cause for the failure, nor to pass it through the various components until it can reach the end-user.
3. Log attached.
4. Even though the installation seemed to have failed, I managed to connect to the ovirt console, and noticed it failed to connect to the host.
5. SSH'ed into the hosted engine, and noticed it cannot resolve the host hostname.
6. Added the missing /etc/hosts entry, restarted the ovirt-engine service, and all is green.
7. Looking at the deployment log, I'm seeing the following message: "[WARNING] Failed to resolve gilboa-wx-ovirt.localdomain using DNS, it can be resolved only locally", which means ansible was aware that my DNS server doesn't resolve the host hostname, but didn't add the missing /etc/hosts entry and/or error out.
Not sure it must abort. In principle, you could have supplied custom ansible code to be run inside the appliance, to add the items yourself to /etc/hosts, or in theory it can also happen that you configured things so that the host fails DNS resolution but the engine VM does not.
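To illustrate the manual fix described above (adding the missing /etc/hosts entry on the engine VM), here is a minimal Python sketch. It is not what the deployment's ansible code does. The function names `hostnames_in_hosts` and `missing_hosts_lines`, the address 192.0.2.10, and the sample file content are all hypothetical. It only inspects the text of a hosts file and returns the lines that would still need to be appended.

```python
def hostnames_in_hosts(hosts_text):
    """Return the set of hostnames/aliases defined in hosts-file text."""
    names = set()
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if len(fields) >= 2:  # an address followed by one or more names
            names.update(name.lower() for name in fields[1:])
    return names


def missing_hosts_lines(hosts_text, wanted):
    """Return hosts-file lines for the (ip, fqdn) pairs not yet present.

    `wanted` is a list of (ip_address, fqdn) tuples (hypothetical input).
    """
    present = hostnames_in_hosts(hosts_text)
    lines = []
    for ip, fqdn in wanted:
        if fqdn.lower() in present:
            continue
        short = fqdn.split(".", 1)[0]
        # Add the short name as an alias only if it differs from the FQDN.
        parts = [ip, fqdn] if short == fqdn else [ip, fqdn, short]
        lines.append(" ".join(parts))
    return lines


if __name__ == "__main__":
    # Hypothetical content and address, for illustration only.
    sample = "127.0.0.1 localhost localhost.localdomain\n"
    wanted = [("192.0.2.10", "gilboa-wx-ovirt.localdomain")]
    for line in missing_hosts_lines(sample, wanted):
        print(line)
```

The sketch deliberately does not write to /etc/hosts; reviewing the printed lines before appending them keeps the change visible, which matters if the DNS setup is fixed later.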
A. Is it a bug, or is it PEBKAC?
It also asked you:
2020-06-21 10:49:18,562-0400 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Add lines for the appliance itself and for this host to /etc/hosts on the engine VM?
2020-06-21 10:49:18,562-0400 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND Note: ensuring that this host could resolve the engine VM hostname is still up to you
2020-06-21 10:49:18,563-0400 DEBUG otopi.plugins.otopi.dialog.human dialog.__logString:204 DIALOG:SEND (Yes, No)[No]
And you accepted the default 'No'. Perhaps we should change the default to Yes. Of course - Yes is also a risk - a user not noticing it, then later on changing the DNS, and not understanding why it "does not work"...
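If it helps to review which questions the deployment asked, the following is a rough Python sketch that pulls the DIALOG lines out of a setup log. The regular expression is an assumption based only on the shape of the log lines quoted above, and the names `DIALOG_RE` and `dialog_lines` are hypothetical. Only the SEND marker appears in the excerpt, so any other `direction` value is a guess.

```python
import re

# Assumed line shape, taken from the excerpt quoted above:
# "<date> <time>,<ms><tz> <level> <logger> <func>:<line> DIALOG:<DIR> <text>"
DIALOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) "
    r"\S+ \S+ \S+ DIALOG:(?P<direction>\w+)\s+(?P<text>.*)$"
)


def dialog_lines(log_text, direction="SEND"):
    """Return (timestamp, text) tuples for DIALOG lines of one direction."""
    found = []
    for line in log_text.splitlines():
        match = DIALOG_RE.match(line)
        if match and match.group("direction") == direction:
            found.append((match.group("ts"), match.group("text").strip()))
    return found


if __name__ == "__main__":
    sample = (
        "2020-06-21 10:49:18,562-0400 DEBUG otopi.plugins.otopi.dialog.human "
        "dialog.__logString:204 DIALOG:SEND Add lines for the appliance itself "
        "and for this host to /etc/hosts on the engine VM?\n"
        "2020-06-21 10:49:18,563-0400 DEBUG otopi.plugins.otopi.dialog.human "
        "dialog.__logString:204 DIALOG:SEND (Yes, No)[No]\n"
    )
    for timestamp, text in dialog_lines(sample):
        print(timestamp, text)
```

Lines that do not match the assumed shape are silently skipped, so an empty result may simply mean the log format differs from the excerpt.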
B. What are the chances that I have a working ovirt (test) setup?
In theory, you can examine the ansible code, and see what (not very many) next steps it should have done if it didn't fail there, and do that yourself (or decide that they are not important). In practice, I'd personally deploy again cleanly, unless this is for a quick test or something.

Best regards,
--
Didi