Hi

I keep losing the ib0 connection on a hypervisor after adding the host to the engine.  This leaves the host effectively unusable, since NFS will be mounted over ib0.

I don't really understand why this occurs.

OS:

[root@ovirt-hv2 ~]# cat /etc/redhat-release 
CentOS Linux release 7.5.1804 (Core)

Here's the network script:

[root@ovirt-hv2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=172.16.0.207
NETMASK=255.255.255.0
ONBOOT=yes
ZONE=public

When I try "ifup", it fails:

[root@ovirt-hv2 ~]# ifup ib0
Error: Connection activation failed: No suitable device found for this connection.

The error in syslog:

Aug 22 11:31:50 ovirt-hv2 kernel: IPv4: martian source 172.16.0.87 from 172.16.0.49, on dev ib0
Aug 22 11:31:53 ovirt-hv2 NetworkManager[1070]: <info>  [1534951913.7486] audit: op="connection-activate" uuid="2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89" name="System ib0" result="fail" reason="No suitable device found for this connection.
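To narrow down the "No suitable device" message, it may help to check how NetworkManager itself classifies ib0, in particular whether it reports the device as "unmanaged". Below is a minimal sketch, not a definitive diagnostic. The function name nm_device_state is hypothetical; it only parses the colon-separated output of "nmcli -t -f DEVICE,TYPE,STATE device" read from stdin.

```shell
# Hypothetical helper: read terse nmcli device output (DEVICE:TYPE:STATE per
# line) on stdin and print "TYPE STATE" for the requested device.
# Returns non-zero if the device is not listed at all.
nm_device_state() {
    awk -F: -v dev="$1" '
        $1 == dev { print $2, $3; found = 1 }
        END       { exit !found }
    '
}
```

On the hypervisor this would be fed from the real tool, for example: nmcli -t -f DEVICE,TYPE,STATE device | nm_device_state ib0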

As you can see, the link state of ib0 is UP:

[root@ovirt-hv2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
    link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 50:9a:4c:89:d3:82 brd ff:ff:ff:ff:ff:ff
4: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b4:96:91:13:ea:68 brd ff:ff:ff:ff:ff:ff
5: p1p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether b4:96:91:13:ea:6a brd ff:ff:ff:ff:ff:ff
6: idrac: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether 50:9a:4c:89:d3:84 brd ff:ff:ff:ff:ff:ff
    inet 169.254.0.2/16 brd 169.254.255.255 scope global idrac
       valid_lft forever preferred_lft forever
7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
    link/infiniband a0:00:02:08:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:1d:13:41 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 12:b4:30:22:39:5b brd ff:ff:ff:ff:ff:ff
9: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 3e:32:e6:66:98:49 brd ff:ff:ff:ff:ff:ff
25: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.183/16 brd 10.0.255.255 scope global ovirtmgmt
       valid_lft forever preferred_lft forever
26: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether aa:32:82:1b:01:d9 brd ff:ff:ff:ff:ff:ff
27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 32:ff:5d:b8:c2:b4 brd ff:ff:ff:ff:ff:ff

The card is FDR:

[root@ovirt-hv2 ~]# lspci -v | grep Mellanox
01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
Subsystem: Mellanox Technologies Device 0051

The latest OFED driver is installed:

[root@ovirt-hv2 ~]# /etc/init.d/openibd status

  HCA driver loaded

Configured IPoIB devices:
ib0

Currently active IPoIB devices:
ib0
Configured Mellanox EN devices:

Currently active Mellanox devices:
ib0

The following OFED modules are loaded:

  rdma_ucm
  rdma_cm
  ib_ipoib
  mlx4_core
  mlx4_ib
  mlx4_en
  mlx5_core
  mlx5_ib
  ib_uverbs
  ib_umad
  ib_ucm
  ib_cm
  ib_core
  mlxfw
  mlx5_fpga_tools

I can add an IP address to ib0 manually using "ip addr", but I need NetworkManager to manage ib0.
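
For reference, the manual workaround can be derived from the same ifcfg file shown earlier. This is a rough sketch only: the helper names mask_to_prefix and ifcfg_to_ip_cmd are made up for illustration, and the script prints the "ip addr add" command rather than running it.

```shell
# Hypothetical helper: convert a dotted-quad netmask to a prefix length.
# It sums the bits per octet and does not check that the mask is contiguous.
mask_to_prefix() {
    local mask=$1 prefix=0 octet
    local IFS=.
    for octet in $mask; do
        case $octet in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0)   ;;
            *)   echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$prefix"
}

# Hypothetical helper: print (not execute) the "ip addr add" command matching
# DEVICE, IPADDR and NETMASK in an ifcfg-style file. The body runs in a
# subshell so the sourced variables do not leak into the calling shell.
ifcfg_to_ip_cmd() (
    # ifcfg files are KEY=VALUE shell syntax; only source files you trust.
    . "$1" || exit 1
    prefix=$(mask_to_prefix "$NETMASK") || exit 1
    echo "ip addr add ${IPADDR}/${prefix} dev ${DEVICE}"
)
```

Running ifcfg_to_ip_cmd /etc/sysconfig/network-scripts/ifcfg-ib0 against the file above should print the command for 172.16.0.207 with a /24 prefix on ib0. An address added this way is not known to NetworkManager and does not survive a reboot, which is why it is only a stopgap.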


Thanks,

Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug@med.cornell.edu
O: 212-746-6305
F: 212-746-8690