Here's a link to the vdsm.log and supervdsm.log files:
https://bit.ly/2wjZ6Vo
Thank you!
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
On Thu, Aug 23, 2018 at 6:51 AM, Dominik Holler <dholler(a)redhat.com> wrote:
Would you please share the vdsm.log and the supervdsm.log from this
host?
On Wed, 22 Aug 2018 11:36:09 -0400
Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:
> Hi
>
> I keep losing the ib0 connection on the hypervisor after adding the
> host to the engine. This leaves the host effectively unusable, since
> NFS is mounted over ib0.
>
> I don't really understand why this occurs.
>
> OS:
>
> [root@ovirt-hv2 ~]# cat /etc/redhat-release
> CentOS Linux release 7.5.1804 (Core)
>
> Here's the network script:
>
> [root@ovirt-hv2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib0
> DEVICE=ib0
> BOOTPROTO=static
> IPADDR=172.16.0.207
> NETMASK=255.255.255.0
> ONBOOT=yes
> ZONE=public
>
> When I try "ifup", it fails:
>
> [root@ovirt-hv2 ~]# ifup ib0
> Error: Connection activation failed: No suitable device found for this connection.
>
> The error in syslog:
>
> Aug 22 11:31:50 ovirt-hv2 kernel: IPv4: martian source 172.16.0.87 from 172.16.0.49, on dev ib0
> Aug 22 11:31:53 ovirt-hv2 NetworkManager[1070]: <info> [1534951913.7486] audit: op="connection-activate" uuid="2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89" name="System ib0" result="fail" reason="No suitable device found for this connection.
>
> As you can see, the media state of ib0 is up:
>
> [root@ovirt-hv2 ~]# ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>        valid_lft forever preferred_lft forever
> 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
>     link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
> 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
>     link/ether 50:9a:4c:89:d3:82 brd ff:ff:ff:ff:ff:ff
> 4: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
>     link/ether b4:96:91:13:ea:68 brd ff:ff:ff:ff:ff:ff
> 5: p1p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
>     link/ether b4:96:91:13:ea:6a brd ff:ff:ff:ff:ff:ff
> 6: idrac: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
>     link/ether 50:9a:4c:89:d3:84 brd ff:ff:ff:ff:ff:ff
>     inet 169.254.0.2/16 brd 169.254.255.255 scope global idrac
>        valid_lft forever preferred_lft forever
> 7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
>     link/infiniband a0:00:02:08:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:1d:13:41 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
> 8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>     link/ether 12:b4:30:22:39:5b brd ff:ff:ff:ff:ff:ff
> 9: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>     link/ether 3e:32:e6:66:98:49 brd ff:ff:ff:ff:ff:ff
> 25: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
>     link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
>     inet 10.0.0.183/16 brd 10.0.255.255 scope global ovirtmgmt
>        valid_lft forever preferred_lft forever
> 26: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
>     link/ether aa:32:82:1b:01:d9 brd ff:ff:ff:ff:ff:ff
> 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
>     link/ether 32:ff:5d:b8:c2:b4 brd ff:ff:ff:ff:ff:ff
>
> The card is FDR:
>
> [root@ovirt-hv2 ~]# lspci -v | grep Mellanox
> 01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
>         Subsystem: Mellanox Technologies Device 0051
>
> The latest OFED driver is installed:
>
> [root@ovirt-hv2 ~]# /etc/init.d/openibd status
>
> HCA driver loaded
>
> Configured IPoIB devices:
> ib0
>
> Currently active IPoIB devices:
> ib0
> Configured Mellanox EN devices:
>
> Currently active Mellanox devices:
> ib0
>
> The following OFED modules are loaded:
>
> rdma_ucm
> rdma_cm
> ib_ipoib
> mlx4_core
> mlx4_ib
> mlx4_en
> mlx5_core
> mlx5_ib
> ib_uverbs
> ib_umad
> ib_ucm
> ib_cm
> ib_core
> mlxfw
> mlx5_fpga_tools
>
> I can add an IP address to ib0 manually using "ip addr", but I need
> NetworkManager to manage ib0.
>
>
> Thanks,
>
> Douglas Duckworth, MSc, LFCS
> HPC System Administrator
> Scientific Computing Unit
> Weill Cornell Medicine
> 1300 York - LC-502
> E: doug(a)med.cornell.edu
> O: 212-746-6305
> F: 212-746-8690
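
For reference, the manual "ip addr" workaround mentioned in the quoted message might look roughly like the sketch below. It takes the IPADDR, NETMASK, and DEVICE values from the ifcfg-ib0 file shown above, converts the netmask to a prefix length, and prints the resulting command instead of running it. The helper name mask_to_prefix is hypothetical and not part of any tool. This only illustrates the temporary, non-persistent workaround; it does not address why NetworkManager reports "No suitable device found".

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted-quad netmask to a CIDR prefix length.
# It sums the bits per octet and does not check that the mask is contiguous.
mask_to_prefix() {
    prefix=0
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the netmask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case "$octet" in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$prefix"
}

# Values copied from the ifcfg-ib0 file in the quoted message.
DEVICE=ib0
IPADDR=172.16.0.207
NETMASK=255.255.255.0

# Print the command rather than executing it; running it requires root,
# and the address would not survive a reboot or a NetworkManager restart.
echo "ip addr add ${IPADDR}/$(mask_to_prefix "$NETMASK") dev ${DEVICE}"
```

With the values above, the script prints "ip addr add 172.16.0.207/24 dev ib0".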