oVirt 4.5.1 Hyperconverged Gluster install fails

Hello all,

I am hoping someone can help me with an oVirt installation that has gotten the better of me after weeks of trying. After setting up ssh keys and making sure each host is known to the primary host (sr-svr04), I go through Cockpit and "Configure Gluster storage and oVirt hosted engine", and enter all of the details with <host>.san.lennoxconsulting.com.au for the storage network FQDNs and <host>.core.lennoxconsulting.com.au for the public interfaces. Connectivity on each of the VLANs tests out as basically working (everything is pingable and ssh connections work) and the hosts are generally usable on the network. But the install ultimately dies with the following ansible error:

----------- gluster-deployment.log ---------
:
:
TASK [gluster.infra/roles/backend_setup : Create volume groups] ****************
task path: /etc/ansible/roles/gluster.infra/roles/backend_setup/tasks/vg_create.yml:63
failed: [sr-svr04.san.lennoxconsulting.com.au] (item={'key': 'gluster_vg_sdb', 'value': [{'vgname': 'gluster_vg_sdb', 'pvname': '/dev/sdb'}]}) => {"ansible_loop_var": "item", "changed": true, "cmd": ["vgcreate", "--dataalignment", "1536K", "-s", "1536K", "gluster_vg_sdb", "/dev/sdb"], "delta": "0:00:00.058528", "end": "2022-07-16 16:18:37.018563", "item": {"key": "gluster_vg_sdb", "value": [{"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"}]}, "msg": "non-zero return code", "rc": 5, "start": "2022-07-16 16:18:36.960035", "stderr": " A volume group called gluster_vg_sdb already exists.", "stderr_lines": [" A volume group called gluster_vg_sdb already exists."], "stdout": "", "stdout_lines": []}
failed: [sr-svr05.san.lennoxconsulting.com.au] (item={'key': 'gluster_vg_sdb', 'value': [{'vgname': 'gluster_vg_sdb', 'pvname': '/dev/sdb'}]}) => {"ansible_loop_var": "item", "changed": true, "cmd": ["vgcreate", "--dataalignment", "1536K", "-s", "1536K", "gluster_vg_sdb", "/dev/sdb"], "delta": "0:00:00.057186", "end": "2022-07-16 16:18:37.784063", "item": {"key": "gluster_vg_sdb", "value": [{"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"}]}, "msg": "non-zero return code", "rc": 5, "start": "2022-07-16 16:18:37.726877", "stderr": " A volume group called gluster_vg_sdb already exists.", "stderr_lines": [" A volume group called gluster_vg_sdb already exists."], "stdout": "", "stdout_lines": []}
failed: [sr-svr06.san.lennoxconsulting.com.au] (item={'key': 'gluster_vg_sdb', 'value': [{'vgname': 'gluster_vg_sdb', 'pvname': '/dev/sdb'}]}) => {"ansible_loop_var": "item", "changed": true, "cmd": ["vgcreate", "--dataalignment", "1536K", "-s", "1536K", "gluster_vg_sdb", "/dev/sdb"], "delta": "0:00:00.062212", "end": "2022-07-16 16:18:37.250371", "item": {"key": "gluster_vg_sdb", "value": [{"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"}]}, "msg": "non-zero return code", "rc": 5, "start": "2022-07-16 16:18:37.188159", "stderr": " A volume group called gluster_vg_sdb already exists.", "stderr_lines": [" A volume group called gluster_vg_sdb already exists."], "stdout": "", "stdout_lines": []}

NO MORE HOSTS LEFT *************************************************************

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
sr-svr04.san.lennoxconsulting.com.au : ok=32 changed=13 unreachable=0 failed=1 skipped=27 rescued=0 ignored=1
sr-svr05.san.lennoxconsulting.com.au : ok=31 changed=12 unreachable=0 failed=1 skipped=27 rescued=0 ignored=1
sr-svr06.san.lennoxconsulting.com.au : ok=31 changed=12 unreachable=0 failed=1 skipped=27 rescued=0 ignored=1
----------- gluster-deployment.log ---------

A "gluster v status" gives me no volumes present, and that is where I am stuck! Any ideas of what to try next? I have tried this with oVirt Node 4.5.1 el8 and el9 images as well as 4.5 el8 images, so it has got to be somewhere in my infrastructure configuration, but I am out of ideas.

My hardware configuration is 3 x HP DL360s, each with oVirt Node 4.5.1 el8 installed on a 2 x 146 GB RAID1 array, plus a 6 x 900 GB RAID5 array for Gluster. Network configuration is:

[root@sr-svr04 ~]# ip addr show up
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 1c:98:ec:29:41:68 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 1c:98:ec:29:41:68 brd ff:ff:ff:ff:ff:ff permaddr 1c:98:ec:29:41:69
4: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 1c:98:ec:29:41:6a brd ff:ff:ff:ff:ff:ff
5: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 1c:98:ec:29:41:6b brd ff:ff:ff:ff:ff:ff
6: eno3.3@eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 1c:98:ec:29:41:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.11/24 brd 192.168.3.255 scope global noprefixroute eno3.3
       valid_lft forever preferred_lft forever
    inet6 fe80::ec:d7be:760e:8eda/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 1c:98:ec:29:41:68 brd ff:ff:ff:ff:ff:ff
8: bond0.4@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 1c:98:ec:29:41:68 brd ff:ff:ff:ff:ff:ff
    inet 192.168.4.11/24 brd 192.168.4.255 scope global noprefixroute bond0.4
       valid_lft forever preferred_lft forever
    inet6 fe80::f503:a54:8421:ea8b/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
9: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:5b:fd:ac brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever

The public network is core.lennoxconsulting.com.au, which is 192.168.4.0/24, and the storage network is san.lennoxconsulting.com.au, which is 192.168.3.0/24.

Any help to move forward is appreciated.

- Dave.
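For reference, the failure is the plain LVM error from vgcreate, so the state left over from earlier attempts can be inspected directly on each node before re-running the wizard. A minimal diagnostic sketch, assuming the brick device is /dev/sdb on every node as in the log above:

    # run on each of sr-svr04, sr-svr05 and sr-svr06
    lsblk /dev/sdb          # does sdb already show LVM members?
    pvs | grep sdb          # physical volume left behind by a previous run
    vgs gluster_vg_sdb      # the volume group vgcreate refuses to recreate
    lvs gluster_vg_sdb      # any logical volumes / thin pools still inside it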

Can you cat this file: /etc/ansible/roles/gluster.infra/roles/backend_setup/tasks/vg_create.yml ?

It seems that the VG creation is not idempotent. As a workaround, delete the VG 'gluster_vg_sdb' on all Gluster nodes:

    vgremove gluster_vg_sdb

Best Regards,
Strahil Nikolov
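(Expanding on the workaround above with a minimal cleanup sketch; it assumes /dev/sdb is the only brick device and that the leftover volume group holds nothing worth keeping:

    # run on each Gluster node: sr-svr04, sr-svr05, sr-svr06
    vgs                          # confirm gluster_vg_sdb is the stale VG
    vgremove -y gluster_vg_sdb   # drop the stale volume group
    pvremove /dev/sdb            # optionally drop the PV label as well
    wipefs -a /dev/sdb           # optionally clear remaining LVM/filesystem signatures

After that the deployment wizard should be able to recreate the VG from scratch.)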
Strahil,

Thank you for your response. I have added the file for you below, but with my continued efforts over the past day I have finally managed to get oVirt to install, though not without issues.

Over a period of 2 weeks I have kicked off the install process over 20 times. Occasionally it would return the error posted here, but most of the time the install process would hang on this step. I posted the error because that was the only log I could get; when it hung, no log files or any other footprint were created, it just sat there until I either rebooted the host or cancelled the install.

The problem I have found is that nodectl doesn't return while the installer is in progress, so I assume that when the installer tries to ssh to the localhost it never gets to a shell because nodectl is waiting indefinitely for something. So I removed nodectl-motd.sh and nodectl-run-banner.sh from /etc/profile.d, and now the Gluster install wizard works perfectly.

Next, the Cockpit wizard refused to identify any of my network devices, but the command-line installer was fine, so I now have a self-hosted engine running on one node via:

    hosted-engine --deploy

However, my next issue is that when I try to log in to the Administration Portal with the admin user, it gives me "Invalid username or password". I can log into the Monitoring Portal just fine, but the Administration and VM Portals don't like the admin credentials. So now to work out why authentication isn't working.

- Dave.

----- /etc/ansible/roles/gluster.infra/roles/backend_setup/tasks/vg_create.yml -----
---
# We have to set the dataalignment for physical volumes, and physicalextentsize
# for volume groups. For JBODs we use a constant alignment value of 256K
# however, for RAID we calculate it by multiplying the RAID stripe unit size
# with the number of data disks. Hence in case of RAID stripe_unit_size and data
# disks are mandatory parameters.

- name: Check if valid disktype is provided
  fail:
    msg: "Unknown disktype. Allowed disktypes: JBOD, RAID6, RAID10, RAID5."
  when: gluster_infra_disktype not in [ 'JBOD', 'RAID6', 'RAID10', 'RAID5' ]

# Set data alignment for JBODs, by default it is 256K. This set_fact is not
# needed if we can always assume 256K for JBOD, however we provide this extra
# variable to override it.
- name: Set PV data alignment for JBOD
  set_fact:
    pv_dataalign: "{{ gluster_infra_dalign | default('256K') }}"
  when: gluster_infra_disktype == 'JBOD'

# Set data alignment for RAID
# We need KiB: ensure to keep the trailing `K' in the pv_dataalign calculation.
- name: Set PV data alignment for RAID
  set_fact:
    pv_dataalign: >
      {{ gluster_infra_diskcount|int * gluster_infra_stripe_unit_size|int }}K
  when: >
    gluster_infra_disktype == 'RAID6' or
    gluster_infra_disktype == 'RAID10' or
    gluster_infra_disktype == 'RAID5'

- name: Set VG physical extent size for RAID
  set_fact:
    vg_pesize: >
      {{ gluster_infra_diskcount|int * gluster_infra_stripe_unit_size|int }}K
  when: >
    gluster_infra_disktype == 'RAID6' or
    gluster_infra_disktype == 'RAID10' or
    gluster_infra_disktype == 'RAID5'

- include_tasks: get_vg_groupings.yml
  vars:
    volume_groups: "{{ gluster_infra_volume_groups }}"
  when: gluster_infra_volume_groups is defined and gluster_infra_volume_groups is not none and gluster_infra_volume_groups|length >0

- name: Record for missing devices for phase 2
  set_fact:
    gluster_phase2_has_missing_devices: true
  loop: "{{ vg_device_exists.results }}"
  when: item.stdout_lines is defined and "0" in item.stdout_lines

- name: Print the gateway for each host when defined
  ansible.builtin.debug:
    msg: vg names {{ gluster_volumes_by_groupname }}

# Tasks to create a volume group
# The devices in `pvs' can be a regular device or a VDO device
# Please take note; only the first item per volume group will define the actual configuraton!
#TODO: fix pesize // {{ ((item.value | first).vg_pesize || vg_pesize) | default(4) }}
- name: Create volume groups
  register: gluster_changed_vgs
  command: vgcreate --dataalignment {{ item.value.pv_dataalign | default(pv_dataalign) }} -s {{ vg_pesize | default(4) }} {{ (item.value | first).vgname }} {{ item.value | ovirt.ovirt.json_query('[].pvname') | unique | join(',') }}
  # lvg:
  #   state: present
  #   vg: "{{ (item.value | first).vgname }}"
  #   pvs: "{{ item.value | json_query('[].pvname') | unique | join(',') }}"
  #   pv_options: "--dataalignment {{ item.value.pv_dataalign | default(pv_dataalign) }}"
  #   # pesize is 4m by default for JBODs
  #   pesize: "{{ vg_pesize | default(4) }}"
  loop: "{{gluster_volumes_by_groupname | default({}) | dict2items}}"
  when: gluster_volumes_by_groupname is defined and item.value|length>0

- name: update LVM fact's
  setup:
    filter: 'ansible_lvm'
  when: gluster_changed_vgs.changed
----- /etc/ansible/roles/gluster.infra/roles/backend_setup/tasks/vg_create.yml -----
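A note on the non-idempotency Strahil pointed out: the active task above runs a raw `command: vgcreate ...`, while the commented-out lvg block underneath it is the module-based variant that would tolerate an existing VG. Roughly, the guard the raw command is missing looks like the following shell sketch (purely illustrative, reusing the gluster_vg_sdb / /dev/sdb names from this deployment):

    # create the VG only if it does not already exist, so a re-run is a no-op
    if ! vgs gluster_vg_sdb >/dev/null 2>&1; then
        vgcreate --dataalignment 1536K -s 1536K gluster_vg_sdb /dev/sdb
    fi

Without such a check, any retry after a partially completed run trips over the VG created the first time, which matches the error in the log.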
participants (2)
- david.lennox@frontlinedigital.com.au
- Strahil Nikolov