It turned out to be user error. I was modifying configure.yml at the
pause before engine-setup runs, but part of engine-setup involves
auto-updating all packages on the host. One of these packages was
ovirt-engine-tools, which was updated from 4.5.5-1.el8 to 4.5.6-1.el8,
and that update overwrote my changes.
I was able to pre-update these packages before engine-setup ran, then
patch configure.yml with the patch linked in your email. This allowed
the engine-setup to continue on to the storage setup phase.
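
For anyone hitting the same thing, a quick way to notice this kind of silent overwrite is to record a checksum of configure.yml right after patching and compare it once the package update has run. A minimal Python sketch; the helper names are hypothetical and the path in the usage comment is a placeholder, since I am not asserting where the file lives on your engine VM:

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def was_overwritten(path, digest_after_patching):
    """True if the file no longer matches the digest recorded after patching."""
    return file_digest(path) != digest_after_patching


# Usage (placeholder path, adjust to wherever configure.yml actually is):
#   recorded = file_digest("/path/to/configure.yml")   # right after patching
#   ... let the package update run ...
#   was_overwritten("/path/to/configure.yml", recorded)
```

If the second call returns True, the package update replaced the patched file and the patch has to be applied again before continuing.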
Thanks again for the link to your problem, which ended up being the
solution for me as well.
--Mike
On 11/11/24 11:50, Fabrice Bacchella wrote:
That's strange, everything looks so similar, but the patch fails for
you. Two bugs in one?
> On 11 Nov 2024, at 17:56, Michael Thomas <wart(a)caltech.edu> wrote:
>
> Hi Fabrice,
>
> Indeed, it does look very similar, but I think it's not quite the same (but I could
> be wrong).
>
> For starters, the fix of appending the extra clauses to the when: block in
> configure.yml did not help me. In my case, it appears the engine is trying to run the
> following when setting up the first host:
>
> vdsm-tool ovn-config 10.110.115.21 hv1-mgmt.cds.ligo-la.caltech.edu
>
> Note that this IP and hostname both point to the host that is being added.
>
> But the help info for vdsm-tool shows that another argument is needed for the IP of
> the engine itself:
>
> [root@hv1 ~]# vdsm-tool ovn-config
> usage:
> /usr/bin/vdsm-tool [options] ovn-config IP-central [tunneling-IP|tunneling-network] host-fqdn
> Configures the ovn-controller on the host.
>
> Parameters:
> IP-central - the IP of the engine (the host where OVN central is located)
> tunneling-IP - the local IP which is to be used for OVN tunneling
> tunneling-network - the vdsm network meant to be used for OVN tunneling
> host-fqdn - FQDN that will be set as system-id for OvS (optional)
>
>
> So either the engine is omitting this argument, or it's misinterpreting the
> host's IP address as the engine's IP address. If I manually run this vdsm-tool
> command with the extra engine IP address as an argument, it appears to work:
>
> vdsm-tool ovn-config 10.110.115.20 10.110.115.21 hv1-mgmt.cds.ligo-la.caltech.edu
>
> Unfortunately, I'm not fluent enough in Ansible to know where to make the fix that
> adds this extra argument.
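
To make the mix-up concrete, here is a rough Python model of how the two command lines above map onto the parameters in the usage text. The function name and the IP-versus-network check are assumptions made for illustration, not vdsm's actual parsing; the only thing taken from the engine log is that the FQDN ended up being looked up as a network name.

```python
import ipaddress


def label_ovn_config_args(args):
    """Label positional arguments per the vdsm-tool ovn-config usage text.

    Simplified model only: the IP-vs-network test is an assumption,
    not a copy of vdsm's own parsing.
    """
    if len(args) < 2:
        raise ValueError("need IP-central plus a tunneling IP or network")
    labelled = {"IP-central": args[0], "host-fqdn": None}
    try:
        ipaddress.ip_address(args[1])
        labelled["tunneling-IP"] = args[1]
    except ValueError:
        # Not an IP address, so it would be treated as a vdsm network name.
        labelled["tunneling-network"] = args[1]
    if len(args) > 2:
        labelled["host-fqdn"] = args[2]
    return labelled


host_ip = "10.110.115.21"
engine_ip = "10.110.115.20"
fqdn = "hv1-mgmt.cds.ligo-la.caltech.edu"

# The command the engine ran: the host IP sits in the IP-central slot.
failing = label_ovn_config_args([host_ip, fqdn])
# The manual command with the engine IP added in front.
working = label_ovn_config_args([engine_ip, host_ip, fqdn])
print(failing)
print(working)
```

With only two arguments the FQDN lands in the tunneling-network slot, which is consistent with the NetworkNotFoundError for hv1-mgmt.cds.ligo-la.caltech.edu in the engine log.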
>
> The host is running Rocky Linux 9:
> [root@hv1 ~]# rpm -qf /usr/bin/vdsm-tool
> vdsm-python-4.50.5.1-1.el9.noarch
>
> --Mike
>
> On 11/8/24 16:14, Fabrice Bacchella wrote:
>> Your error with vdsm-tool ovn-config looks a lot like mine:
>>
>> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/4P3EHTC7ZYWI...
>>> On 8 Nov 2024, at 22:53, Michael Thomas <wart(a)caltech.edu> wrote:
>>>
>>> tl;dr Where can I find a compatible set of host packages and hosted-engine
>>> image? What is the recommended combination to use for new installs?
>>>
>>> First, some background:
>>>
>>> I've been trying to get a new oVirt 4.5.5 install working on Rocky 9 hosts using
>>> a hosted engine. My first few attempts failed because the engine image
>>> (ovirt-engine-appliance-4.5-20231201120201.1.el9) was still based on CentOS Stream 8.
>>> Using --ansible-extra-vars=he_pause_before_engine_setup=true I was able to redirect the
>>> repos to vault.centos.org. This helped, but the deployment still failed when the engine
>>> tried to access the host:
>>>
>>> 2024-11-06 17:05:53,600-0600 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 {'msg': 'Host is not up, please check logs, perhaps also on the engine machine', '_ansible_no_log': False, 'changed': False}
>>> 2024-11-06 17:05:53,700-0600 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}
>>>
>>> ...and the logs on the engine throw a NetworkNotFoundError while trying to
>>> set up OVN:
>>>
>>> "stdout" : "fatal: [hv1-mgmt.cds.ligo-la.caltech.edu]: FAILED! => {\"changed\": true,
>>> \"cmd\": [\"vdsm-tool\", \"ovn-config\", \"10.110.115.21\", \"hv1-mgmt.cds.ligo-la.caltech.edu\"],
>>> \"delta\": \"0:00:02.413890\", \"end\": \"2024-11-08 14:29:54.215138\",
>>> \"msg\": \"non-zero return code\", \"rc\": 1, \"start\": \"2024-11-08 14:29:51.801248\",
>>> \"stderr\": \"Traceback (most recent call last):\\n File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 117, in get_network\\n return networks[net_name]\\nKeyError: 'hv1-mgmt.cds.ligo-la.caltech.edu'\\n\\nDuring handling of the above exception, another exception occurred:\\n\\nTraceback (most recent call last):\\n File \\\"/usr/bin/vdsm-tool\\\", line 195, in main\\n return tool_command[cmd][\\\"command\\\"](*args)\\n File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 63, in ovn_config\\n ip_address = get_ip_addr(get_network(network_caps(), net_name))\\n File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 119, in get_network\\n raise NetworkNotFoundError(net_name)\\nvdsm.tool.ovn_config.NetworkNotFoundError: hv1-mgmt.cds.ligo-la.caltech.edu\",
>>> \"stderr_lines\": [\"Traceback (most recent call last):\", \" File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 117, in get_network\", \" return networks[net_name]\", \"KeyError: 'hv1-mgmt.cds.ligo-la.caltech.edu'\", \"\", \"During handling of the above exception, another exception occurred:\", \"\", \"Traceback (most recent call last):\", \" File \\\"/usr/bin/vdsm-tool\\\", line 195, in main\", \" return tool_command[cmd][\\\"command\\\"](*args)\", \" File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 63, in ovn_config\", \" ip_address = get_ip_addr(get_network(network_caps(), net_name))\", \" File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 119, in get_network\", \" raise NetworkNotFoundError(net_name)\", \"vdsm.tool.ovn_config.NetworkNotFoundError: hv1-mgmt.cds.ligo-la.caltech.edu\"],
>>> \"stdout\": \"\", \"stdout_lines\": []}",
>>>
>>> OK, so then I thought I should be using a newer engine image. I installed
>>> ovirt-engine-appliance-4.5-20240817071039.1.el9.x86_64.rpm and tried again. But
>>> of course that failed because the host and engine now have incompatible versions:
>>>
>>> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, deployment errors: code 154: Host hv1-mgmt.cds.ligo-la.caltech.edu is compatible with versions (4.2,4.3,4.4,4.5,4.6,4.7) and cannot join Cluster CDS which is set to version 4.8., code 1110: Host hv1-mgmt.cds.ligo-la.caltech.edu's following network(s) are not synchronized with their Logical Network configuration: ovirtmgmt., code 9000: Failed to verify Power Management configuration for Host hv1-mgmt.cds.ligo-la.caltech.edu., fix accordingly and re-deploy."}
>>>
>>> --Mike
>>> _______________________________________________
>>> Users mailing list -- users(a)ovirt.org
>>> To unsubscribe send an email to users-leave(a)ovirt.org
>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/6PSPQCV7NG6...