I had this same issue on oVirt Node 4.5.5; however, I did not see the same code in
/usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-provider-ovn-driver/tasks/configure.yml
on the hosted engine.
On my version 4.5.5, I have two blocks: one installs ovs and ensures Open vSwitch is
started, the second block installs the ovirt-provider-ovn-driver and configures OVN (as
well as some other steps).
For the first block, my when statement shows as:
when:
- cluster_switch == "ovs" or (ovn_central is defined)
For the second block, it shows:
when:
- ovn_central is defined
In Ansible, inside a when: statement, multiple lines beginning with "-" are
equivalent to AND conditions. For example:
when:
- this == true
- that == true
This would be equivalent to when: (this == true) and (that == true).
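To make the AND behaviour concrete, here is a minimal Python sketch of how I read the two when: lists above. It is only a model, not Ansible code: the function names are made up for illustration, "is defined" is modelled as "is not None", and the "legacy" switch value is just an example of something other than "ovs".

```python
def when_passes(conditions):
    # Every "-" item under when: must be true, so the list behaves like all().
    return all(conditions)


def first_block_runs(cluster_switch, ovn_central=None):
    # One list item that itself contains an "or".
    # Assumption: "ovn_central is defined" is modelled as "is not None".
    return when_passes([cluster_switch == "ovs" or ovn_central is not None])


def second_block_runs(ovn_central=None):
    # One list item: ovn_central is defined.
    return when_passes([ovn_central is not None])


if __name__ == "__main__":
    # With ovn_central set, both blocks run regardless of the switch type.
    print(first_block_runs("legacy", ovn_central="10.99.8.31"))
    print(second_block_runs(ovn_central="10.99.8.31"))
    # With an OVS cluster but no ovn_central, only the first block runs.
    print(first_block_runs("ovs"))
    print(second_block_runs())
```

This matches what I saw: since the Configuring OVN step was being executed at all, ovn_central must have been defined on my deploy.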
I didn't want to toy with the control logic, but I realized that it was a non-issue.
The error occurs in the Configuring OVN step, which in my configure.yml is near
the end of the second block. The when statements are working fine; otherwise it
wouldn't be executing those steps.
I dug in further, and the issue comes about when the installer attempts to run:
vdsm-tool ovn-config <IP-Central> <FQDN>
I tried this on my own system:
[root@b-drone11 ~]# vdsm-tool ovn-config 10.99.8.31 b-drone11.arcc.uwyo.edu
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py", line 117, in get_network
    return networks[net_name]
KeyError: 'b-drone11.arcc.uwyo.edu'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 195, in main
    return tool_command[cmd]["command"](*args)
  File "/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py", line 63, in ovn_config
    ip_address = get_ip_addr(get_network(network_caps(), net_name))
  File "/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py", line 119, in get_network
    raise NetworkNotFoundError(net_name)
vdsm.tool.ovn_config.NetworkNotFoundError: b-drone11.arcc.uwyo.edu
It's the same error as in the host-deploy logs. If you dig in a bit more, you'll
find that in the ovn_config.py script referred to by the above output, there's a function
get_network() that is throwing the error:
def get_network(net_caps, net_name):
    networks = net_caps['networks']
    try:
        return networks[net_name]
    except KeyError:
        raise NetworkNotFoundError(net_name)
Digging in EVEN further, if you look at where the function is called and how the
"net_name" variable comes in, you'll find that it's only run when a FQDN
is given as an argument to vdsm-tool ovn-config instead of an IP:
if is_ipaddress(args[2]):
    ip_address = args[2]
else:
    net_name = args[2]
    ip_address = get_ip_addr(get_network(network_caps(), net_name))
    if not ip_address:
        raise IpAddressNotFoundError(net_name)
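To see the two code paths side by side without touching a real host, here is a small standalone sketch that mimics the logic quoted above. It does not import vdsm. The is_ipaddress() helper here is my own stand-in built on Python's ipaddress module, resolve_tunnel_ip() is a made-up wrapper name, and the fake_caps dictionary (including the 'ovirtmgmt' key, the 'addr' field, and the 10.99.8.41 address) is hypothetical data for illustration only. Note that get_network() does a plain dictionary lookup on net_caps['networks'], so the argument only succeeds if it exactly matches a key in that dictionary.

```python
import ipaddress


class NetworkNotFoundError(Exception):
    """Stand-in for the exception of the same name in ovn_config.py."""


def is_ipaddress(value):
    # Assumption: the real helper in vdsm may be implemented differently.
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def get_network(net_caps, net_name):
    # Same shape as the function quoted above: a plain dict lookup by name.
    networks = net_caps['networks']
    try:
        return networks[net_name]
    except KeyError:
        raise NetworkNotFoundError(net_name)


def resolve_tunnel_ip(arg, net_caps):
    # Hypothetical wrapper mirroring the if/else branch quoted above.
    if is_ipaddress(arg):
        return arg  # an IP is used as-is, get_network() is never called
    return get_network(net_caps, arg)['addr']  # 'addr' key is an assumption


# Hypothetical capabilities data; the real structure comes from network_caps().
fake_caps = {'networks': {'ovirtmgmt': {'addr': '10.99.8.41'}}}

if __name__ == "__main__":
    print(resolve_tunnel_ip('10.99.8.31', fake_caps))
    try:
        resolve_tunnel_ip('b-drone11.arcc.uwyo.edu', fake_caps)
    except NetworkNotFoundError as err:
        print('NetworkNotFoundError:', err)
```

In this sketch the IP argument is returned untouched, while the FQDN falls into the lookup branch and raises NetworkNotFoundError, which is the same shape as the traceback I got on my host.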
Now, this is as far as I got. As for WHY the get_network() function isn't working, I
haven't looked further into the oVirt code and can't say. But it appears that somehow
this function fails when attempting to resolve FQDNs. Which brings me to the
WORKAROUND!
Since the error lies in translating a FQDN to an IP, if you instead provide an IP address
in the first place, it completely bypasses the buggy get_network() function, and lets you
add a host.
So, when you run the host deploy, if you add the host using its IP address instead of its
FQDN, it goes through fine. I've tested this on my cluster and it worked
beautifully.
The only caveat is you can't add with the FQDN, but for now, our cluster is up and
working.
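If you only have the FQDN handy, a small helper like the one below can look up the IP to enter when adding the host. This is just a convenience sketch using the Python standard library, not part of oVirt or vdsm; the function name is made up, and it assumes the FQDN resolves to an IPv4 address from the machine where you run it.

```python
import ipaddress
import socket


def address_for_host_deploy(host):
    """Return an IP address string to use in place of an FQDN."""
    try:
        # Already an IP literal: hand it back unchanged, no DNS lookup.
        return str(ipaddress.ip_address(host))
    except ValueError:
        pass
    # Otherwise resolve the name (IPv4 only) with the system resolver.
    return socket.gethostbyname(host)


if __name__ == "__main__":
    # An IP is passed through as-is.
    print(address_for_host_deploy("10.99.8.31"))
```

Call it with your host's FQDN and use the address it returns when adding the host in the engine.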