Hi all,
I have set up 3 servers in 3 data centers, each with one physical interface and a VLAN
interface on top of it.
The connection between the 3 servers over the VLAN interfaces (using private IP addresses)
works (tested with ICMP ping).
Now I want to turn them into an oVirt cluster, creating the self-hosted engine on the first
server. I have
- made sure the engine fqdn is in dns forward and reverse and in /etc/hosts
- made sure that both interfaces have unique dns entries which can be resolved forward and
reverse
- made sure that both interfaces' fqdns are in /etc/hosts
- made sure only the primary hostname (not fqdn) is in /etc/hostname,
- made sure ipv6 is available on the physical interface,
- made sure ipv6 method is "disabled" on the vlan interface,
- set he_force_ip4: true in
/usr/share/ansible/collections/ansible_collections/ovirt/ovirt/roles/hosted_engine_setup/defaults/main.yml
to make sure no IPv6 lookups interfere.
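A minimal sketch of the kind of forward/reverse consistency check described in the list
above, in Python. It is only an illustration: the function name is made up, it looks at
IPv4 only (in line with he_force_ip4), and it goes through the system resolver, which
normally consults /etc/hosts as well as DNS. It is not what the oVirt role itself runs.

```python
import socket


def forward_reverse_ok(fqdn, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return True if fqdn resolves to an IPv4 address whose reverse
    lookup yields fqdn again (as primary name or as an alias).

    resolve/reverse are injectable so the logic can be tried without a network.
    """
    try:
        addr = resolve(fqdn)                   # forward lookup (IPv4 only)
        name, aliases, _addrs = reverse(addr)  # reverse lookup
    except OSError:                            # gaierror and herror are OSError subclasses
        return False
    return fqdn.lower() in {n.lower() for n in [name, *aliases]}


if __name__ == "__main__":
    import sys

    # Pass the FQDNs to check on the command line (engine, physical, VLAN).
    for host in sys.argv[1:]:
        print(host, "OK" if forward_reverse_ok(host) else "MISMATCH")
```

Running it once per FQDN (engine, physical interface, VLAN interface) on each host shows
whether every name maps to exactly the address that maps back to it.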
Now when I use Cockpit's hosted engine wizard (not hyperconverged), I run into two
opposing problems.
If I set the FQDN in the "Advanced" sub pane to the FQDN of the VLAN interface,
the wizard gets stuck at "preparing VM" with "The resolved address
doesn't resolve on the selected interface\n".
If I set the FQDN in the "Advanced" sub pane to the FQDN of the physical
interface, I get the same result.
If I add the physical interface's FQDN to the VLAN IP address in /etc/hosts, I get
"hostname 'x.y.z' doesn't uniquely match the interface
'enp5s0.4000' selected for the management bridge; it matches also interface with
IP ['physical']. Please make sure that the hostname got from the interface for the
management network resolves only there." So clearly, keeping the names of the two
interfaces separate is mandatory.
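Read together, the two messages are consistent with a check of roughly the following
shape. This is a guess at the logic reconstructed from the error texts, not the actual
code of the hosted_engine_setup role. The interface name enp5s0.4000 is taken from the
error above; the parent name enp5s0, the addresses, and the function names are
placeholders for illustration.

```python
import ipaddress


def interfaces_matching(resolved_addrs, iface_addrs):
    """Return the sorted names of interfaces carrying any of the resolved addresses."""
    resolved = {ipaddress.ip_address(a) for a in resolved_addrs}
    return sorted(
        name
        for name, addrs in iface_addrs.items()
        if resolved & {ipaddress.ip_address(a) for a in addrs}
    )


def check_hostname(resolved_addrs, iface_addrs, selected):
    """Classify a hostname by where its resolved addresses live."""
    matches = interfaces_matching(resolved_addrs, iface_addrs)
    if selected not in matches:
        return "not on selected interface"  # first symptom
    if len(matches) > 1:
        return "not unique"                 # second symptom
    return "ok"


if __name__ == "__main__":
    # Placeholder addresses, not the real ones.
    ifaces = {"enp5s0": ["192.0.2.10"], "enp5s0.4000": ["10.0.0.1"]}
    # Name resolving only to the VLAN address.
    print(check_hostname(["10.0.0.1"], ifaces, "enp5s0.4000"))
    # Name resolving to both addresses (extra /etc/hosts entry).
    print(check_hostname(["10.0.0.1", "192.0.2.10"], ifaces, "enp5s0.4000"))
```

Under this reading, a name passes only if every address it resolves to sits on the
selected interface and on no other one.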
I tried to follow the Ansible workflow step by step to see what it does. It seems the
hostname validation is triggered twice, the second time when the FQDN is filled in on the
"Advanced" sub pane. There it succeeds with both hostnames (physical interface and
VLAN IP), but as far as I can see that does not prevent the "prepare VM" workflow from
doing the same verification and failing. This is where it happens:
2023-03-20 14:31:48,354+0100 DEBUG ansible on_any args TASK: ovirt.ovirt.hosted_engine_setup : Check the resolved address resolves on the selected interface kwargs is_conditional:False
2023-03-20 14:31:48,355+0100 DEBUG ansible on_any args localhost TASK: ovirt.ovirt.hosted_engine_setup : Check the resolved address resolves on the selected interface kwargs
2023-03-20 14:31:48,481+0100 DEBUG var changed: host "localhost" var "ansible_play_hosts" type "<class 'list'>" value: "[]"
2023-03-20 14:31:48,481+0100 DEBUG var changed: host "localhost" var "ansible_play_batch" type "<class 'list'>" value: "[]"
2023-03-20 14:31:48,481+0100 DEBUG var changed: host "localhost" var "play_hosts" type "<class 'list'>" value: "[]"
2023-03-20 14:31:48,481+0100 ERROR ansible failed {
    "ansible_host": "localhost",
    "ansible_playbook": "/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml",
    "ansible_result": {
        "_ansible_no_log": false,
        "changed": false,
        "msg": "The resolved address doesn't resolve on the selected interface\n"
    },
    "ansible_task": "Check the resolved address resolves on the selected interface",
    "ansible_type": "task",
    "status": "FAILED",
    "task_duration": 0
}
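For anyone digging through a long setup log for the same kind of failure, here is a small
Python sketch that pulls the task name and message out of each "ERROR ansible failed {"
block. It assumes the log keeps the shape shown above (a timestamped header line, then a
JSON object closed by an unindented "}"); the function name is made up, and the log path
has to be supplied on the command line.

```python
import json
import re

# Header line such as: 2023-03-20 14:31:48,481+0100 ERROR ansible failed {
FAILED_RE = re.compile(r"^\S+ \S+ ERROR ansible failed \{\s*$")


def failed_tasks(log_text):
    """Yield (task, msg) for every 'ERROR ansible failed { ... }' block."""
    lines = log_text.splitlines()
    i = 0
    while i < len(lines):
        if FAILED_RE.match(lines[i]):
            block = ["{"]
            i += 1
            # Collect lines up to the unindented closing brace of the object.
            while i < len(lines):
                block.append(lines[i])
                if lines[i].rstrip() == "}":
                    break
                i += 1
            try:
                data = json.loads("\n".join(block))
            except json.JSONDecodeError:
                data = None  # truncated or re-wrapped block; skip it
            if data is not None:
                result = data.get("ansible_result", {})
                yield data.get("ansible_task"), result.get("msg")
        i += 1


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as fh:
            for task, msg in failed_tasks(fh.read()):
                print(f"{task}: {msg!r}")
```

Note that a log excerpt which has been re-wrapped by a mail client will not parse as JSON,
so this should be run against the original log file.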
So I am really stuck here and have no idea how to proceed. I could try changing bits in
the playbooks and parameters (like using "hostname -A" instead of "hostname -f" for the
failing test), but that can't really be the idea. I am too new to this to assume I have
run into a bug; I suspect I am overlooking something.
Any hint or help is appreciated.
Cheers,
Dirk