Hi Guys & Girls,
<begin_rant>
OK, so I am really, *really* starting to get fed up with this. I know this is probably my
fault, but even if it is then the oVirt documentation isn't helping in any way
(being... "less than clear").
Instead of having to rely on the "black box" that is Ansible, what I'd really like is
a simple set of clear-cut, step-by-step instructions, so that we actually *know* what
is going on when attempting a Self-Hosted install.
After all, oVirt's "competition" doesn't make things this difficult...
<end_rant>
Now that I've got that off my chest: I'm trying to do a straightforward
Self-Hosted install. I've followed the instructions in the oVirt doco pretty much to
the letter, and I'm still having problems.
My (pre-install) set-up:
- A freshly installed server (oVirt_Node_1) running Rocky Linux 8.6 with 3 NICs - NIC_1,
NIC_2, & NIC_3.
- There are three VLANs - VLAN_A (172.16.1.0/24), VLAN_B (172.16.2.0/24), & VLAN_C
(172.16.3.0/24).
- NIC_1 & NIC_2 are formed into a bond (bond_1).
- bond_1 is an 802.3ad bond.
- bond_1 has 2 sub-interfaces - bond_1.a & bond_1.b
- Interface bond_1.a is in VLAN_A.
- Interface bond_1.b is in VLAN_B.
- NIC_3 is sitting in VLAN_C.
- VLAN_A is the everyday "working" VLAN where the rest of the servers all sit
(ie DNS Servers, Local Repository Server, etc, etc, etc), and where the oVirt Engine (OVE)
will sit.
- VLAN_B is for data throughput to and from the Ceph iSCSI Gateways in our Ceph Storage
Cluster. This is a dedicated, isolated VLAN with no gateway (ie only the oVirt Hosting
Nodes and the Ceph iSCSI Gateways are on this VLAN).
- VLAN_C is for OOB management traffic. This is a dedicated, isolated VLAN with no
gateway.
Everything is working. Everything can ping properly back and forth within the individual
VLANs and VLAN_A can ping out to the Internet via its gateway (172.16.1.1).
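For reference, the interfaces were built with NetworkManager; the nmcli commands would
look roughly like the following (connection names, VLAN IDs, and host IPs here are
illustrative placeholders, not necessarily the exact values used):

```shell
# Approximate nmcli equivalents of the pre-install network config.
# Bond of NIC_1 + NIC_2 in 802.3ad (LACP) mode:
nmcli con add type bond con-name bond_1 ifname bond_1 \
    bond.options "mode=802.3ad,miimon=100"
nmcli con add type ethernet con-name bond_1-port1 ifname NIC_1 master bond_1
nmcli con add type ethernet con-name bond_1-port2 ifname NIC_2 master bond_1

# VLAN sub-interfaces on the bond (VLAN IDs 10/20 are placeholders):
nmcli con add type vlan con-name bond_1.a ifname bond_1.a dev bond_1 id 10 \
    ipv4.method manual ipv4.addresses 172.16.1.10/24 ipv4.gateway 172.16.1.1
nmcli con add type vlan con-name bond_1.b ifname bond_1.b dev bond_1 id 20 \
    ipv4.method manual ipv4.addresses 172.16.2.10/24

# NIC_3 sits directly in the OOB management VLAN (access port, no gateway):
nmcli con add type ethernet con-name oob ifname NIC_3 \
    ipv4.method manual ipv4.addresses 172.16.3.10/24
```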
Because we don't require iSCSI connectivity for the OVE (it's on a working local
Gluster TSP volume), iSCSI hasn't *yet* been implemented.
I initially tried to do the install using our Local Repository Mirror (after
discovering and mirroring all the required repositories), but I gave up on that: for a
"one-off" install it wasn't worth the time and effort, especially when it *seems* that
the Ansible playbook wants the "original" repositories anyway - but that's another
rant/issue.
So, I'm using all the original repositories as per the oVirt doco, including the
special instructions for Rocky Linux and RHEL-derivatives in general, and using the
defaults for the answers to the deployment script (except where there are no defaults) -
and now I've got the following error:
~~~
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "cmd":
["virsh", "net-start", "default"], "delta":
"0:00:00.031972", "end": "2022-10-04 16:41:38.603454",
"msg": "non-zero return code", "rc": 1, "start":
"2022-10-04 16:41:38.571482", "stderr": "error: Failed to start
network default\nerror: internal error: Network is already in use by interface
bond_1.a", "stderr_lines": ["error: Failed to start network
default", "error: internal error: Network is already in use by interface
bond_1.a"], "stdout": "", "stdout_lines": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed getting local_vm_dir
~~~
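In case it helps anyone diagnose this, I assume the next step is to compare the network
libvirt is trying to start against what is actually on the host - something along these
lines (standard libvirt/iproute2 commands):

```shell
# What does libvirt think the 'default' network looks like?
virsh net-list --all
virsh net-dumpxml default   # note the <bridge name='...'/> and <ip address='...'/> elements

# What is actually configured on the host?
ip -br link show            # is there an existing interface/bridge with the same name?
ip route show               # does any existing route overlap the network's subnet?
```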
The relevant lines from the log file (at least I think these are the relevant lines):
~~~
2022-10-04 16:41:35,712+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Update libvirt
default network configuration, undefine]
2022-10-04 16:41:37,017+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:109 {'changed': False, 'stdout': '',
'stderr': "error: failed to get network 'default'\nerror: Network not
found: no network with matching name 'default'", 'rc': 1,
'cmd': ['virsh', 'net-undefine', 'default'],
'start': '2022-10-04 16:41:35.806251', 'end': '2022-10-04
16:41:36.839780', 'delta': '0:00:01.033529', 'msg':
'non-zero return code', 'invocation': {'module_args':
{'_raw_params': 'virsh net-undefine default', '_uses_shell':
False, 'warn': False, 'stdin_add_newline': True,
'strip_empty_ends': True, 'argv': None, 'chdir': None,
'executable': None, 'creates': None, 'removes': None,
'stdin': None}}, 'stdout_lines': [], 'stderr_lines': ["error:
failed to get network 'default'", "error: Network not found: no network
with matching name 'default'"], '_ansible_no_log': False}
2022-10-04 16:41:37,118+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:109 ignored: [localhost]: FAILED! =>
{"changed": false, "cmd": ["virsh",
"net-undefine", "default"], "delta":
"0:00:01.033529", "end": "2022-10-04 16:41:36.839780",
"msg": "non-zero return code", "rc": 1, "start":
"2022-10-04 16:41:35.806251", "stderr": "error: failed to get
network 'default'\nerror: Network not found: no network with matching name
'default'", "stderr_lines": ["error: failed to get network
'default'", "error: Network not found: no network with matching name
'default'"], "stdout": "", "stdout_lines": []}
2022-10-04 16:41:37,219+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Update libvirt
default network configuration, define]
2022-10-04 16:41:38,421+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:115 ok: [localhost]
2022-10-04 16:41:38,522+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Activate default
libvirt network]
2022-10-04 16:41:38,823+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:109 {'changed': False, 'stdout': '',
'stderr': 'error: Failed to start network default\nerror: internal error:
Network is already in use by interface bond_1.a', 'rc': 1, 'cmd':
['virsh', 'net-start', 'default'], 'start':
'2022-10-04 16:41:38.571482', 'end': '2022-10-04 16:41:38.603454',
'delta': '0:00:00.031972', 'msg': 'non-zero return code',
'invocation': {'module_args': {'_raw_params': 'virsh net-start
default', '_uses_shell': False, 'warn': False,
'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None,
'chdir': None, 'executable': None, 'creates': None,
'removes': None, 'stdin': None}}, 'stdout_lines': [],
'stderr_lines': ['error: Failed to start network default', 'error:
internal error: Network is already in use by interface bond_1.a'],
'_ansible_no_log': False}
2022-10-04 16:41:38,924+1100 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed":
false, "cmd": ["virsh", "net-start", "default"],
"delta": "0:00:00.031972", "end": "2022-10-04
16:41:38.603454", "msg": "non-zero return code", "rc":
1, "start": "2022-10-04 16:41:38.571482", "stderr":
"error: Failed to start network default\nerror: internal error: Network is already in
use by interface bond_1.a", "stderr_lines": ["error: Failed to start
network default", "error: internal error: Network is already in use by interface
bond_1.a"], "stdout": "", "stdout_lines": []}
2022-10-04 16:41:39,125+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils._process_output:109 PLAY RECAP [localhost] : ok: 106 changed: 32
unreachable: 0 skipped: 61 failed: 1
2022-10-04 16:41:39,226+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils.run:226 ansible-playbook rc: 2
2022-10-04 16:41:39,226+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils.run:233 ansible-playbook stdout:
2022-10-04 16:41:39,226+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils
ansible_utils.run:236 ansible-playbook stderr:
2022-10-04 16:41:39,226+1100 DEBUG otopi.plugins.gr_he_ansiblesetup.core.misc
misc._closeup:475 {'otopi_host_net': {'ansible_facts':
{'otopi_host_net': ['ens0p1', 'bond_1.a', 'bond_1.b']},
'_ansible_no_log': False, 'changed': False},
'ansible-playbook_rc': 2}
2022-10-04 16:41:39,226+1100 DEBUG otopi.context context._executeMethod:145 method
exception
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in
_executeMethod
method['method']()
File
"/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-ansiblesetup/core/misc.py",
line 485, in _closeup
raise RuntimeError(_('Failed getting local_vm_dir'))
RuntimeError: Failed getting local_vm_dir
2022-10-04 16:41:39,227+1100 ERROR otopi.context context._executeMethod:154 Failed to
execute stage 'Closing up': Failed getting local_vm_dir
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:765 ENVIRONMENT
DUMP - BEGIN
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:775 ENV
BASE/error=bool:'True'
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:775 ENV
BASE/exceptionInfo=list:'[(<class 'RuntimeError'>,
RuntimeError('Failed getting local_vm_dir',), <traceback object at
0x7f5210013088>)]'
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:779 ENVIRONMENT
DUMP - END
~~~
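For what it's worth, my (possibly wrong) reading of that libvirt error is that
"Network is already in use by interface X" can be raised when the subnet defined for
the network overlaps a route that already exists via that interface - which would mean
the default network's subnet is colliding with something routed over bond_1.a. If
that's the case, would redefining the default network onto an unused subnet before
re-running the deploy be the right move? Something like this (the subnet below is just
an example, not a recommendation):

```shell
# Sketch only: move libvirt's 'default' network onto an unused subnet,
# then re-run the hosted-engine deployment.
virsh net-dumpxml default > /tmp/default-net.xml   # dump the current definition
# ...edit /tmp/default-net.xml: change the <ip address='...'/> and <dhcp><range .../>
#    entries to a subnet not routed anywhere on this host (e.g. 192.168.222.0/24)...
virsh net-undefine default
virsh net-define /tmp/default-net.xml
virsh net-start default
```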
So, would someone please help me get this sorted - I mean, how are we supposed to
do this install if the interface we need to connect to the box in the first place
can't be used because it's "already in use"?
Cheers
Dulux-Oz