Hi All!
I'm trying to set up the latest oVirt version from the ovirt-release42-pre.rpm
repository (because of bugs such as
https://bugzilla.redhat.com/show_bug.cgi?id=1637468, for example, which
is not fixed in the stable release42 repository).
I then installed these RPMs (and many dependencies):
ovirt-hosted-engine-setup-2.2.30-1.el7.noarch
ovirt-engine-appliance-4.2-20181026.1.el7.noarch
vdsm-4.20.43-1.el7.x86_64
vdsm-gluster-4.20.43-1.el7.x86_64
vdsm-network-4.20.43-1.el7.x86_64
All of them come from that repository. I used the web UI installer to
create the GlusterFS volumes (the default suggested engine, data and
vmstore) and then installed the hosted engine on the "engine" volume.
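For reference, the installation itself was roughly the following (I
assume the standard oVirt pre-release repo URL; package versions as
listed above):
-----------
yum install https://resources.ovirt.org/pub/yum-repo/ovirt-release42-pre.rpm
yum install ovirt-hosted-engine-setup ovirt-engine-appliance \
    vdsm vdsm-gluster vdsm-network
-----------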
In my case, though, I am also trying to set up an additional "storage
network" (for example, as described here:
https://ovirt.org/develop/release-management/features/gluster/select-netw...
).
The screenshots there are quite old and the UI has changed in 4.2, but
the idea is the same.
I have two interfaces on each host: one Ethernet (enp59s0f0, with an
address from 172.16.10.0/24 and the default gateway) and one
"InfiniBand" (no default gateway, used only between cluster nodes, no
routing, no external access). It is actually an Intel Omni-Path fabric:
-----------
[root@ovirtnode1 log]# hfi1_control -i
Driver Version: 10.8-0
Opa Version: 10.8.0.0.204
0: BoardId: Intel Corporation Omni-Path HFI Silicon 100 Series [integrated]
0,1: Status: 5: LinkUp 4: ACTIVE
-------------
It shows up as an IP-over-InfiniBand (IPoIB) interface:
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc pfifo_fast
state UP group default qlen 256
link/infiniband
80:00:00:02:fe:80:00:00:00:00:00:00:00:11:75:09:01:1a:ee:ea brd
00:ff:ff:ff:ff:12:40:1b:80:01:00:00:00:00:00:00:ff:ff:ff:ff
inet 172.16.100.1/24 brd 172.16.100.255 scope global noprefixroute ib0
valid_lft forever preferred_lft forever
The ifcfg-ib0 file has these properties:
CONNECTED_MODE=yes
MTU=65520
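(Connected mode is needed for an MTU this large; as far as I understand,
datagram mode limits the IPoIB MTU to a few kilobytes. The active mode
can be checked via sysfs:)
-----------
cat /sys/class/net/ib0/mode   # prints "datagram" or "connected"
ip link show ib0              # shows the MTU actually in effect
-----------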
All IPs on this pair of interfaces have records in external DNS: the
Ethernet (management network) addresses have ovirtnode{N} names and the
InfiniBand (storage network) addresses have ovirtstor{N} names.
During the web UI GlusterFS setup I used the ovirtstor host names, and
the trusted pool was created:
5a9a0a5f-12f4-48b1-bfbe-24c172adc65c ovirtstor5.miac Connected
41350da9-c944-41c5-afdc-46ff51ab93f6 ovirtstor6.miac Connected
0f50175e-7e47-4839-99c7-c7ced21f090c localhost Connected
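(For completeness, the same can be checked from the gluster CLI; the
peers there are listed by the ovirtstor names:)
-----------
gluster pool list
gluster peer status
-----------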
Then I logged in to the web administration console and added the two
other hosts by their names:
Name Hostname/IP Cluster Data Center Status SPM
ovirtnode1 ovirtnode1 Default Default Up SPM
ovirtnode5 ovirtnode5 Default Default Up Normal
For this setup I have some questions:
1. Where in the web UI can I configure that I want to use this
"storage network"?
I tried to create a second network (Network -> Networks -> New), but
vdsm overwrites the ifcfg-ib0 file without those properties, as if it
were a plain Ethernet interface:
Generated by VDSM version 4.20.43-1.el7
DEVICE=ib0
ONBOOT=yes
IPADDR=172.16.100.5
NETMASK=255.255.255.0
BOOTPROTO=none
MTU=65520
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no
I entered the MTU by hand in the General -> MTU -> Custom field, but
such an MTU cannot work without the "CONNECTED_MODE=yes" property, and
now under Networks -> "storage" -> Hosts the interface always shows as
"out-of-sync". "Custom properties" are greyed out and not available.
2. If I tick the "VM network" checkbox when creating the network and
then use "Setup Host Networks" to attach it to the ib0 interface, the
whole engine hangs. I think this is because it tries to bridge the
InfiniBand interface with the others, which cannot be done (I see only
"1 task running" that never ends, and no other interface can show any
details).
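As far as I understand, an IPoIB device simply cannot be enslaved to a
Linux bridge, so that operation can never complete. A quick manual test
(the bridge name is made up) shows the kernel rejecting it:
-----------
ip link add br-test type bridge
ip link set ib0 master br-test   # rejected: ib0 is not an Ethernet device
ip link del br-test
-----------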
3. oVirt also tries to start sending LLDP TLVs on the ib0 interface, but
that cannot work:
Nov 6 17:30:01 ovirtnode5 systemd: Starting Link Layer Discovery
Protocol Agent Daemon....
Nov 6 17:30:01 ovirtnode5 kernel: bnx2x:
[bnx2x_dcbnl_set_dcbx:2383(enp59s0f0)]Requested DCBX mode 5 is beyond
advertised capabilities
Nov 6 17:30:02 ovirtnode5 systemd: Started /sbin/ifup ib0.
Nov 6 17:30:02 ovirtnode5 systemd: Starting /sbin/ifup ib0.
Nov 6 17:30:02 ovirtnode5 kernel: IPv6: ADDRCONF(NETDEV_UP): ib0: link
is not ready
Nov 6 17:30:02 ovirtnode5 NetworkManager[1650]: <info>
[1541511002.9642] device (ib0): carrier: link connected
Nov 6 17:30:02 ovirtnode5 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): ib0:
link becomes ready
Nov 6 17:30:02 ovirtnode5 lldpad: setsockopt nearest_bridge: Invalid
argument
Nov 6 17:30:41 ovirtnode5 vdsm[127585]: ERROR Internal server error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request
    res = method(**params)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 193, in _dynamicMethod
    result = fn(*methodArgs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1561, in getLldp
    info=supervdsm.getProxy().get_lldp_info(filter))
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
    **kwargs)
  File "<string>", line 2, in get_lldp_info
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
TlvReportLldpError: (1, 'Agent instance for device not found \n', '', 'ib0')
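(If I understand correctly, lldpad only supports Ethernet-type
interfaces, so no agent instance is ever created for ib0, and querying
it directly should show the same thing. The lldptool call below is just
my guess at a quick check:)
-----------
lldptool get-lldp -i ib0 adminStatus
-----------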
4. The Gluster volumes are in a strange state:
I "import domain" (storage->domains) that webui installer created on
first step, but in "host to use" there is a drop-down list contains only
host names as it added to cluster, i.e. ovirtnode1, ovirtnode{N}.
And there is a red message "For data integrity make sure that the server
is configured with Quorum (both client and server Quorum)" (it
configured by cockpit webui installer, as i see).
But the imported volumes show "1 Up 2 Down" bricks, and on each host
only the "localhost" bricks are shown as "Online". In the logs there is
this message:
Nov 6 10:24:24 ovirtnode5 systemd: Started GlusterFS, a clustered file-system server.
Nov 6 10:24:24 ovirtnode5 glusterd[229325]: [2018-11-06 06:24:24.404149] C [MSGID: 106003] [glusterd-server-quorum.c:354:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume data. Starting local bricks.
Nov 6 10:24:24 ovirtnode5 glusterd[229325]: [2018-11-06 06:24:24.450356] C [MSGID: 106003] [glusterd-server-quorum.c:354:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume engine. Starting local bricks.
Nov 6 10:24:24 ovirtnode5 glusterd[229325]: [2018-11-06 06:24:24.503677] C [MSGID: 106003] [glusterd-server-quorum.c:354:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume vmstore. Starting local bricks.
What does this mean? If quorum was REGAINED, then all bricks should be
started, shouldn't they?
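This is how I would check the brick and quorum state from the CLI
(standard gluster commands, using the engine volume as an example):
-----------
gluster volume status engine
gluster volume heal engine info
gluster volume get engine cluster.server-quorum-type
-----------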
When I try to create a volume (Storage -> Volumes -> New) and press "Add
Bricks", there is a similar "Bricks Host" drop-down box that contains
only the "ovirtnode" names, not the "ovirtstor" InfiniBand interfaces.
If I try to use it, the operation fails with an error like "This host is
not in the trusted pool", which is true - the trusted pool contains the
other interface.
What is the right way to configure this from the web UI?
Are there a "some start guide" for this case with config-steps?
I found
https://www.ovirt.org/documentation/quickstart/quickstart-guide/
but it is also out of date; for example, it does not describe GlusterFS
for storage domains, has old screenshots (from a previous interface
version), etc.
I cannot find any relevant documentation in
https://www.ovirt.org/documentation/admin-guide/
For example, the "Explanation of Settings in the Manage Networks Window"
section in
https://www.ovirt.org/documentation/admin-guide/chap-Logical_Networks/
does not mention the "Gluster network" role at all...
Many links point to non-existent pages. For example,
"For more information on these parameters, see Explanation of bridge
opts Parameters." links to
https://www.ovirt.org/documentation/admin-guide/chap-Logical_Networks/Exp...
which returns
"404 Not found :(
Sorry, but the page you were trying to view does not exist."
(That is just one example; there are MANY 404 links/pages.)
--
Mike