Hi Dominik
Yes, the network script was created by our Ansible role that deploys CentOS
hosts. It pulls the IP from DNS, then templates the script and copies it to
the host.
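For illustration only, the templating step could be sketched roughly like
this in Python. This is not the actual Ansible role: the function names
resolve_ip and render_ifcfg are hypothetical, and the keys and default
netmask are simply copied from the ifcfg-ib0 script quoted further down.

```python
import socket

# Same keys as the ifcfg-ib0 file quoted in this thread; nothing extra is added.
IFCFG_TEMPLATE = """\
DEVICE={device}
BOOTPROTO=static
IPADDR={ipaddr}
NETMASK={netmask}
ONBOOT=yes
ZONE=public
"""


def resolve_ip(hostname):
    """Look up the IPv4 address for hostname via the system resolver (DNS)."""
    return socket.gethostbyname(hostname)


def render_ifcfg(device, ipaddr, netmask="255.255.255.0"):
    """Return the text of an ifcfg-<device> file for a static address."""
    return IFCFG_TEMPLATE.format(device=device, ipaddr=ipaddr, netmask=netmask)


# Example (the DNS name is hypothetical, substitute the real record):
#   print(render_ifcfg("ib0", resolve_ip("ovirt-hv2-ib.example.com")))
```

The real role does the equivalent with an Ansible template and a copy to
/etc/sysconfig/network-scripts/ on the host.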
I will try this oVirt step and see if it works!
Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502
E: doug(a)med.cornell.edu
O: 212-746-6305
F: 212-746-8690
On Thu, Aug 23, 2018 at 11:09 AM, Dominik Holler <dholler(a)redhat.com> wrote:
Is ifcfg-ib0 created before adding the host?
Can ib0 be reconfigured using the engine, e.g. by
"Compute > Hosts > hostx > Network Interfaces > Setup Host
Networks"?
Is this some kind of self-hosted engine?
On Thu, 23 Aug 2018 09:30:59 -0400
Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:
> Here's a link to the files:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__bit.ly_2wjZ6Vo&d=DwICAg&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=2Fzhh_78OGspKQpl_e-CbhH6xUjnRkaqPFUS2wTJ2cw&m=Y25-OOvgu58jlC82-fzBeNIpQ7ZscoHznffUhqE6EBM&s=QQXlC9Tisa60TvimyS3BnFDCaDF7VPD8eCzT-Fke-p0&e=
>
> Thank you!
>
>
> On Thu, Aug 23, 2018 at 6:51 AM, Dominik Holler <dholler(a)redhat.com>
> wrote:
>
> > Would you please share the vdsm.log and the supervdsm.log from this
> > host?
> >
> > On Wed, 22 Aug 2018 11:36:09 -0400
> > Douglas Duckworth <dod2014(a)med.cornell.edu> wrote:
> >
> > > Hi
> > >
> > > > I keep losing the ib0 connection on the hypervisor after adding
> > > > the host to the engine. This makes the host unusable, since NFS
> > > > will be mounted over ib0.
> > >
> > > I don't really understand why this occurs.
> > >
> > > OS:
> > >
> > > [root@ovirt-hv2 ~]# cat /etc/redhat-release
> > > CentOS Linux release 7.5.1804 (Core)
> > >
> > > Here's the network script:
> > >
> > > [root@ovirt-hv2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib0
> > > DEVICE=ib0
> > > BOOTPROTO=static
> > > IPADDR=172.16.0.207
> > > NETMASK=255.255.255.0
> > > ONBOOT=yes
> > > ZONE=public
> > >
> > > > When I try "ifup":
> > >
> > > [root@ovirt-hv2 ~]# ifup ib0
> > > Error: Connection activation failed: No suitable device found for
> > > this connection.
> > >
> > > The error in syslog:
> > >
> > > > Aug 22 11:31:50 ovirt-hv2 kernel: IPv4: martian source 172.16.0.87 from 172.16.0.49, on dev ib0
> > > > Aug 22 11:31:53 ovirt-hv2 NetworkManager[1070]: <info> [1534951913.7486] audit: op="connection-activate" uuid="2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89" name="System ib0" result="fail" reason="No suitable device found for this connection.
> > >
> > > > As you can see, the media state is up:
> > >
> > > [root@ovirt-hv2 ~]# ip a
> > > > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> > > >     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > > >     inet 127.0.0.1/8 scope host lo
> > > >        valid_lft forever preferred_lft forever
> > > > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovirtmgmt state UP group default qlen 1000
> > > >     link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
> > > > 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
> > > >     link/ether 50:9a:4c:89:d3:82 brd ff:ff:ff:ff:ff:ff
> > > > 4: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
> > > >     link/ether b4:96:91:13:ea:68 brd ff:ff:ff:ff:ff:ff
> > > > 5: p1p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
> > > >     link/ether b4:96:91:13:ea:6a brd ff:ff:ff:ff:ff:ff
> > > > 6: idrac: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
> > > >     link/ether 50:9a:4c:89:d3:84 brd ff:ff:ff:ff:ff:ff
> > > >     inet 169.254.0.2/16 brd 169.254.255.255 scope global idrac
> > > >        valid_lft forever preferred_lft forever
> > > > 7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP group default qlen 256
> > > >     link/infiniband a0:00:02:08:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:1d:13:41 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
> > > > 8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> > > >     link/ether 12:b4:30:22:39:5b brd ff:ff:ff:ff:ff:ff
> > > > 9: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> > > >     link/ether 3e:32:e6:66:98:49 brd ff:ff:ff:ff:ff:ff
> > > > 25: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
> > > >     link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
> > > >     inet 10.0.0.183/16 brd 10.0.255.255 scope global ovirtmgmt
> > > >        valid_lft forever preferred_lft forever
> > > > 26: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
> > > >     link/ether aa:32:82:1b:01:d9 brd ff:ff:ff:ff:ff:ff
> > > > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
> > > >     link/ether 32:ff:5d:b8:c2:b4 brd ff:ff:ff:ff:ff:ff
> > >
> > > The card is FDR:
> > >
> > > [root@ovirt-hv2 ~]# lspci -v | grep Mellanox
> > > > 01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
> > > >         Subsystem: Mellanox Technologies Device 0051
> > >
> > > Latest OFED driver:
> > >
> > > [root@ovirt-hv2 ~]# /etc/init.d/openibd status
> > >
> > > HCA driver loaded
> > >
> > > Configured IPoIB devices:
> > > ib0
> > >
> > > Currently active IPoIB devices:
> > > ib0
> > > Configured Mellanox EN devices:
> > >
> > > Currently active Mellanox devices:
> > > ib0
> > >
> > > The following OFED modules are loaded:
> > >
> > > rdma_ucm
> > > rdma_cm
> > > ib_ipoib
> > > mlx4_core
> > > mlx4_ib
> > > mlx4_en
> > > mlx5_core
> > > mlx5_ib
> > > ib_uverbs
> > > ib_umad
> > > ib_ucm
> > > ib_cm
> > > ib_core
> > > mlxfw
> > > mlx5_fpga_tools
> > >
> > > > I can add an IP to ib0 using "ip addr", though I need
> > > > NetworkManager to work with ib0.
> > >
> > >
> > > Thanks,
> > >
> > > Douglas Duckworth, MSc, LFCS
> > > HPC System Administrator
> > > Scientific Computing Unit
> > > Weill Cornell Medicine
> > > 1300 York - LC-502
> > > E: doug(a)med.cornell.edu
> > > O: 212-746-6305
> > > F: 212-746-8690
> >
> >