Hi Guys & Girls,
<begin_rant>
OK, so I am really, *really* starting to get fed up with this. I know this is probably my fault, but even if it is, the oVirt documentation isn't helping (being... "less than clear").
Instead of having to rely on the "black box" that is Ansible, what I'd really like is a simple set of clear-cut, step-by-step instructions, so that we actually *know* what is going on when attempting a Self-Hosted install. After all, oVirt's "competition" doesn't make things so difficult...
<end_rant>
Now that I've got that off my chest, I'm trying to do a straightforward Self-Hosted Install. I've followed the instructions in the oVirt doco pretty much to the letter, and I'm still having problems.
My (pre-install) set-up:
- A freshly installed server (oVirt_Node_1) running Rocky Linux 8.6 with 3 NICs - NIC_1, NIC_2, & NIC_3.
- There are three VLANs - VLAN_A (172.16.1.0/24), VLAN_B (172.16.2.0/24), & VLAN_C (172.16.3.0/24).
- NIC_1 & NIC_2 are formed into a bond (bond_1).
- bond_1 is an 802.3ad bond.
- bond_1 has 2 sub-interfaces - bond_1.a & bond_1.b
- Interface bond_1.a is in VLAN_A.
- Interface bond_1.b is in VLAN_B.
- NIC_3 is sitting in VLAN_C.
- VLAN_A is the everyday "working" VLAN where the rest of the servers all sit (e.g. DNS Servers, Local Repository Server, etc.), and where the oVirt Engine (OVE) will sit.
- VLAN_B is for data throughput to and from the Ceph iSCSI Gateways in our Ceph Storage Cluster. This is a dedicated isolated VLAN with no gateway (i.e. only the oVirt Hosting Nodes and the Ceph iSCSI Gateways are on this VLAN).
- VLAN_C is for OOB management traffic. This is a dedicated isolated VLAN with no gateway.
Everything is working. Everything can ping properly back and forth within the individual VLANs and VLAN_A can ping out to the Internet via its gateway (172.16.1.1).
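For anyone who prefers to see the addressing at a glance, the layout described above can be summarised in a short Python sketch. It is illustrative only: the dictionary and variable names are just labels for this post (nothing oVirt or libvirt reads), and it only confirms that the three subnets don't overlap and that the gateway sits in VLAN_A.

```python
import ipaddress
from itertools import combinations

# The three VLANs exactly as listed above.
vlans = {
    "VLAN_A": ipaddress.ip_network("172.16.1.0/24"),
    "VLAN_B": ipaddress.ip_network("172.16.2.0/24"),
    "VLAN_C": ipaddress.ip_network("172.16.3.0/24"),
}

# Which host interface sits in which VLAN (bond_1 = NIC_1 + NIC_2, 802.3ad).
interfaces = {"bond_1.a": "VLAN_A", "bond_1.b": "VLAN_B", "NIC_3": "VLAN_C"}

# Only VLAN_A has a gateway.
gateway = ipaddress.ip_address("172.16.1.1")

# Any pair of VLANs whose address ranges overlap (expected: none).
overlaps = [
    (a, b) for a, b in combinations(vlans, 2) if vlans[a].overlaps(vlans[b])
]

# The VLAN(s) that contain the gateway address.
gateway_vlans = [name for name, net in vlans.items() if gateway in net]

for iface, vlan in interfaces.items():
    print(f"{iface:9} -> {vlan} ({vlans[vlan]})")
print("overlapping VLAN pairs:", overlaps)
print("gateway", gateway, "is in:", gateway_vlans)
```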
Because we don't require iSCSI connectivity for the OVE (it's on a working local Gluster TSP volume) the iSCSI hasn't *yet* been implemented.
I first tried the install using our Local Repository Mirror (after discovering and mirroring all the required repositories), but gave up on that: for a "one-off" install it wasn't worth the time and effort it was taking, especially when it "seems" that the Ansible playbook wants the "original" repositories anyway - but that's another rant/issue.
So, I'm using all the original repositories as per the oVirt doco, including the special instructions for Rocky Linux and RHEL-derivatives in general, and using the defaults for the answers to the deployment script (except where there are no defaults) - and now I've got the following error:
~~~
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "cmd": ["virsh", "net-start", "default"], "delta": "0:00:00.031972", "end": "2022-10-04 16:41:38.603454", "msg": "non-zero return code", "rc": 1, "start": "2022-10-04 16:41:38.571482", "stderr": "error: Failed to start network default\nerror: internal error: Network is already in use by interface bond_1.a", "stderr_lines": ["error: Failed to start network default", "error: internal error: Network is already in use by interface bond_1.a"], "stdout": "", "stdout_lines": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed getting local_vm_dir
~~~
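That one-line JSON blob is hard to read, so here is a minimal Python sketch that unpacks it. The payload is copied from the error line above and trimmed to the fields of interest (the timing fields are dropped); the variable names are just for illustration.

```python
import json

# Payload copied from the [ ERROR ] line above, trimmed to the relevant fields.
# Raw string so the JSON "\n" escape reaches json.loads() untouched.
payload = r'''{"changed": false, "cmd": ["virsh", "net-start", "default"], "msg": "non-zero return code", "rc": 1, "stderr": "error: Failed to start network default\nerror: internal error: Network is already in use by interface bond_1.a"}'''

result = json.loads(payload)

# Rebuild the command the playbook ran and show its return code.
command = " ".join(result["cmd"])
print(f"command: {command} (rc={result['rc']})")

# stderr holds two lines joined by a newline; print them separately.
stderr_lines = result["stderr"].splitlines()
for line in stderr_lines:
    print(line)
```

In other words, the failing step is simply `virsh net-start default`, and libvirt is the one refusing to start its "default" network because of bond_1.a.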
The relevant lines from the log file (at least I think these are the relevant lines):
~~~
2022-10-04 16:41:35,712+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Update libvirt default network configuration, undefine]
2022-10-04 16:41:37,017+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 {'changed': False, 'stdout': '', 'stderr': "error: failed to get network 'default'\nerror: Network not found: no network with matching name 'default'", 'rc': 1, 'cmd': ['virsh', 'net-undefine', 'default'], 'start': '2022-10-04 16:41:35.806251', 'end': '2022-10-04 16:41:36.839780', 'delta': '0:00:01.033529', 'msg': 'non-zero return code', 'invocation': {'module_args': {'_raw_params': 'virsh net-undefine default', '_uses_shell': False, 'warn': False, 'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None, 'chdir': None, 'executable': None, 'creates': None, 'removes': None, 'stdin': None}}, 'stdout_lines': [], 'stderr_lines': ["error: failed to get network 'default'", "error: Network not found: no network with matching name 'default'"], '_ansible_no_log': False}
2022-10-04 16:41:37,118+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 ignored: [localhost]: FAILED! => {"changed": false, "cmd": ["virsh", "net-undefine", "default"], "delta": "0:00:01.033529", "end": "2022-10-04 16:41:36.839780", "msg": "non-zero return code", "rc": 1, "start": "2022-10-04 16:41:35.806251", "stderr": "error: failed to get network 'default'\nerror: Network not found: no network with matching name 'default'", "stderr_lines": ["error: failed to get network 'default'", "error: Network not found: no network with matching name 'default'"], "stdout": "", "stdout_lines": []}
2022-10-04 16:41:37,219+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Update libvirt default network configuration, define]
2022-10-04 16:41:38,421+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 ok: [localhost]
2022-10-04 16:41:38,522+1100 INFO otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:115 TASK [ovirt.ovirt.hosted_engine_setup : Activate default libvirt network]
2022-10-04 16:41:38,823+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 {'changed': False, 'stdout': '', 'stderr': 'error: Failed to start network default\nerror: internal error: Network is already in use by interface bond_1.a', 'rc': 1, 'cmd': ['virsh', 'net-start', 'default'], 'start': '2022-10-04 16:41:38.571482', 'end': '2022-10-04 16:41:38.603454', 'delta': '0:00:00.031972', 'msg': 'non-zero return code', 'invocation': {'module_args': {'_raw_params': 'virsh net-start default', '_uses_shell': False, 'warn': False, 'stdin_add_newline': True, 'strip_empty_ends': True, 'argv': None, 'chdir': None, 'executable': None, 'creates': None, 'removes': None, 'stdin': None}}, 'stdout_lines': [], 'stderr_lines': ['error: Failed to start network default', 'error: internal error: Network is already in use by interface bond_1.a'], '_ansible_no_log': False}
2022-10-04 16:41:38,924+1100 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "cmd": ["virsh", "net-start", "default"], "delta": "0:00:00.031972", "end": "2022-10-04 16:41:38.603454", "msg": "non-zero return code", "rc": 1, "start": "2022-10-04 16:41:38.571482", "stderr": "error: Failed to start network default\nerror: internal error: Network is already in use by interface bond_1.a", "stderr_lines": ["error: Failed to start network default", "error: internal error: Network is already in use by interface bond_1.a"], "stdout": "", "stdout_lines": []}
2022-10-04 16:41:39,125+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:109 PLAY RECAP [localhost] : ok: 106 changed: 32 unreachable: 0 skipped: 61 failed: 1
2022-10-04 16:41:39,226+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:226 ansible-playbook rc: 2
2022-10-04 16:41:39,226+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:233 ansible-playbook stdout:
2022-10-04 16:41:39,226+1100 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:236 ansible-playbook stderr:
2022-10-04 16:41:39,226+1100 DEBUG otopi.plugins.gr_he_ansiblesetup.core.misc misc._closeup:475 {'otopi_host_net': {'ansible_facts': {'otopi_host_net': ['ens0p1', 'bond_1.a', 'bond_1.b']}, '_ansible_no_log': False, 'changed': False}, 'ansible-playbook_rc': 2}
2022-10-04 16:41:39,226+1100 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-ansiblesetup/core/misc.py", line 485, in _closeup
    raise RuntimeError(_('Failed getting local_vm_dir'))
RuntimeError: Failed getting local_vm_dir
2022-10-04 16:41:39,227+1100 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Closing up': Failed getting local_vm_dir
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:765 ENVIRONMENT DUMP - BEGIN
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:775 ENV BASE/error=bool:'True'
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:775 ENV BASE/exceptionInfo=list:'[(<class 'RuntimeError'>, RuntimeError('Failed getting local_vm_dir',), <traceback object at 0x7f5210013088>)]'
2022-10-04 16:41:39,228+1100 DEBUG otopi.context context.dumpEnvironment:779 ENVIRONMENT DUMP - END
~~~
So, would someone please help me get this sorted? I mean, how are we supposed to do this install if the interface we need to connect to the box in the first place can't be used because it's "already in use"?
Cheers
Dulux-Oz