losing ib0 connection after activating host

Hi

I keep losing ib0 connection on hypervisor after adding host to engine. This makes the host not really work since NFS will be mounted over ib0.

I don't really understand why this occurs.

OS:

[root@ovirt-hv2 ~]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)

Here's the network script:

[root@ovirt-hv2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=172.16.0.207
NETMASK=255.255.255.0
ONBOOT=yes
ZONE=public

When I try "ifup"

[root@ovirt-hv2 ~]# ifup ib0
Error: Connection activation failed: No suitable device found for this connection.

The error in syslog:

Aug 22 11:31:50 ovirt-hv2 kernel: IPv4: martian source 172.16.0.87 from 172.16.0.49, on dev ib0
Aug 22 11:31:53 ovirt-hv2 NetworkManager[1070]: <info> [1534951913.7486] audit: op="connection-activate" uuid="2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89" name="System ib0" result="fail" reason="No suitable device found for this connection.

As you can see media state up:

[root@ovirt-hv2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
    link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 50:9a:4c:89:d3:82 brd ff:ff:ff:ff:ff:ff
4: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b4:96:91:13:ea:68 brd ff:ff:ff:ff:ff:ff
5: p1p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b4:96:91:13:ea:6a brd ff:ff:ff:ff:ff:ff
6: idrac: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether 50:9a:4c:89:d3:84 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.2/16 brd 169.254.255.255 scope global idrac
       valid_lft forever preferred_lft forever
7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband a0:00:02:08:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:1d:13:41 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 12:b4:30:22:39:5b brd ff:ff:ff:ff:ff:ff
9: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 3e:32:e6:66:98:49 brd ff:ff:ff:ff:ff:ff
25: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.183/16 brd 10.0.255.255 scope global ovirtmgmt
       valid_lft forever preferred_lft forever
26: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether aa:32:82:1b:01:d9 brd ff:ff:ff:ff:ff:ff
27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 32:ff:5d:b8:c2:b4 brd ff:ff:ff:ff:ff:ff

The card is FDR:

[root@ovirt-hv2 ~]# lspci -v | grep Mellanox
01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
        Subsystem: Mellanox Technologies Device 0051

Latest OFED driver:

[root@ovirt-hv2 ~]# /etc/init.d/openibd status
HCA driver loaded
Configured IPoIB devices: ib0
Currently active IPoIB devices: ib0
Configured Mellanox EN devices:
Currently active Mellanox devices: ib0
The following OFED modules are loaded:
rdma_ucm rdma_cm ib_ipoib mlx4_core mlx4_ib mlx4_en mlx5_core mlx5_ib ib_uverbs ib_umad ib_ucm ib_cm ib_core mlxfw mlx5_fpga_tools

I can add an IP to ib0 using "ip addr" though I need Network Manager to work with ib0.

Thanks,

Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug@med.cornell.edu
O: 212-746-6305
F: 212-746-8690
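Editorial aside, not part of the original message: the ifcfg-ib0 file shown above sets neither TYPE nor NM_CONTROLLED. On EL7, IPoIB profiles are normally written with TYPE=InfiniBand, and NM_CONTROLLED=no tells NetworkManager to leave an interface alone. Whether either key explains the "No suitable device found" error here is an assumption that the thread does not confirm. The sketch below only reports what a given ifcfg file says for those two keys; `ifcfg_nm_hints` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report the two ifcfg keys that influence how NetworkManager
# treats an interface profile. ifcfg_nm_hints is a hypothetical helper,
# not part of NetworkManager, VDSM, or OFED.
ifcfg_nm_hints() {
    file="$1"
    for key in TYPE NM_CONTROLLED; do
        # Take the last assignment of the key and drop optional double quotes.
        val=$(sed -n "s/^${key}=//p" "$file" | tail -n 1 | tr -d '"')
        echo "${key}=${val:-<unset>}"
    done
}
```

Running `ifcfg_nm_hints /etc/sysconfig/network-scripts/ifcfg-ib0` against the file as posted would print `TYPE=<unset>` and `NM_CONTROLLED=<unset>`. It is also worth re-checking the file after the host has been added to the engine, in case it was rewritten.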

Would you please share the vdsm.log and the supervdsm.log from this host?

On Wed, 22 Aug 2018 11:36:09 -0400 Douglas Duckworth <dod2014@med.cornell.edu> wrote:
Hi
I keep losing ib0 connection on hypervisor after adding host to engine. This makes the host not really work since NFS will be mounted over ib0.
[...]
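The reply above asks for vdsm.log and supervdsm.log. A minimal sketch of bundling both into one archive for sharing follows. It assumes the logs live in /var/log/vdsm, the usual default location, which this thread does not state. `collect_vdsm_logs` and the archive name are hypothetical.

```shell
#!/bin/sh
# Sketch: pack the two requested VDSM logs into a single tarball.
# collect_vdsm_logs is a hypothetical helper. The default directory
# /var/log/vdsm is an assumption; pass another path as the first argument.
collect_vdsm_logs() {
    log_dir="${1:-/var/log/vdsm}"
    out="${2:-vdsm-logs-$(uname -n).tar.gz}"
    for f in vdsm.log supervdsm.log; do
        if [ ! -r "$log_dir/$f" ]; then
            echo "missing or unreadable: $log_dir/$f" >&2
            return 1
        fi
    done
    # -C keeps the archive free of the leading directory path.
    tar -czf "$out" -C "$log_dir" vdsm.log supervdsm.log && echo "$out"
}
```

Called with no arguments as root on the hypervisor, it prints the name of the archive it wrote. Review the logs for sensitive data before posting them to a public list.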
participants (2): Dominik Holler, Douglas Duckworth