On Thu, Feb 15, 2018 at 1:08 AM, Jamie Lawrence <jlawrence@squaretrade.com> wrote:
> On Feb 14, 2018, at 1:27 AM, Simone Tiraboschi <stirabos@redhat.com> wrote:
> On Wed, Feb 14, 2018 at 2:11 AM, Jamie Lawrence <jlawrence@squaretrade.com> wrote:
> Hello,
>
> I'm seeing the hosted engine install fail on an Ansible playbook step. Log below. I tried looking at the file specified for retry, below (/usr/share/ovirt-hosted-engine-setup/ansible/bootstrap_local_vm.retry); it contains the word 'localhost'.
>
> The log below didn't contain anything I could see that was actionable; given that it was an Ansible error, I hunted down the config and enabled logging. On this run the error was different - the installer log was the same, but the error reported by the installer changed.
>
> The first time, the installer said:
>
> [ INFO  ] TASK [Wait for the host to become non operational]
> [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": []}, "attempts": 150, "changed": false}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
> [ INFO  ] Stage: Clean up
>
> 'localhost' here is not an issue by itself: the playbook is executed on the host against the same host over a local connection so localhost is absolutely fine there.
>
> Maybe you hit this one:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540451

That seems likely.

At that point the engine VM is up, but you can reach it only from that host since it's on a NATted network.
I'd suggest connecting to the engine VM from there and checking the host-deploy logs.
 


> It seems NetworkManager-related, but it's still not that clear.
> Stopping NetworkManager and starting network before the deployment seems to help.

Tried this, got the same results.

[snip]
> Anyone see what is wrong here?
>
> This is absolutely fine.
> The new Ansible-based flow (also called node zero) uses an engine running on a local virtual machine to bootstrap the system.
> The bootstrap local VM runs on libvirt's default NATted network with its own DHCP instance; that's why we are using it.
> The locally running engine will create a target virtual machine on the shared storage, and that one will instead be configured as you specified.

Thanks for the context - that's useful, and presumably explains why 192.168 addresses (which we don't use) are appearing in the logs.

Not being entirely sure where to go from here, I guess I'll spend the evening figuring out ansible-ese in order to try to figure out why it is blowing chunks.
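
As a starting point for reading the failure, here is a minimal Python sketch that pulls the JSON payload out of an installer line of the form shown above ("fatal: [host]: FAILED! => {...}"). The function name parse_failed_line is made up for illustration, and the sketch assumes the payload is a single JSON object on one line, as it is in the quoted log.

```python
import json


def parse_failed_line(line):
    """Return (host, payload) for an Ansible 'FAILED! =>' line, else None.

    Hypothetical helper: assumes the JSON payload follows the marker
    on the same line, as in the installer output quoted in this thread.
    """
    marker = "FAILED! => "
    if marker not in line:
        return None
    head, _, tail = line.partition(marker)

    # The host name sits between 'fatal: [' and the next ']'.
    prefix = "fatal: ["
    host = None
    start = head.find(prefix)
    if start != -1:
        end = head.find("]", start + len(prefix))
        if end != -1:
            host = head[start + len(prefix):end]

    payload = json.loads(tail.strip())
    return host, payload


line = ('[ ERROR ] fatal: [localhost]: FAILED! => '
        '{"ansible_facts": {"ovirt_hosts": []}, "attempts": 150, "changed": false}')
host, payload = parse_failed_line(line)
print(host, payload["attempts"], payload["ansible_facts"]["ovirt_hosts"])
```

For the line from the first run, this shows that the task retried 150 times and ovirt_hosts was still an empty list each time, i.e. the query never returned a host.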

Thanks for the note.

-j