Hi Dominik

Yes, the network script was created by our Ansible role that deploys CentOS hosts. It pulls the IP from DNS, then templates the script and copies it to the host. I will try this oVirt step and see if it works!

Thanks,
Douglas Duckworth, MSc, LFCS
HPC System Administrator
Scientific Computing Unit
Weill Cornell Medicine
1300 York - LC-502

On Thu, Aug 23, 2018 at 11:09 AM, Dominik Holler <dholler@redhat.com> wrote:

Is ifcfg-ib0 created before adding the host?
Can ib0 be reconfigured using engine, e.g. by
"Compute > Hosts > hostx > Network Interfaces > Setup Host Networks"?
Is this some kind of self-hosted engine?
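One thing worth checking while you are at it: ifcfg files for IPoIB interfaces normally carry a TYPE=InfiniBand line, and without it NetworkManager's ifcfg-rh plugin may fail to match the connection to the ib device, which would produce exactly the "No suitable device found" error below. A minimal sketch of the script with that line added (addresses copied from your earlier mail; the TYPE line is the only change, so treat this as a guess, not a confirmed fix):

```ini
# /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
TYPE=InfiniBand        # without this, NM may not match the IPoIB device
BOOTPROTO=static
IPADDR=172.16.0.207
NETMASK=255.255.255.0
ONBOOT=yes
ZONE=public
```

After editing, "nmcli connection reload" followed by "nmcli device status" should show whether NetworkManager now considers ib0 managed.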
On Thu, 23 Aug 2018 09:30:59 -0400
Douglas Duckworth <dod2014@med.cornell.edu> wrote:
> Here's a link to the files:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__bit.ly_2wjZ6Vo&d=DwICAg&c=lb62iw4YL4RFalcE2hQUQealT9-RXrryqt9KZX2qu2s&r=2Fzhh_78OGspKQpl_e-CbhH6xUjnRkaqPFUS2wTJ2cw&m=Y25-OOvgu58jlC82-fzBeNIpQ7ZscoHznffUhqE6EBM&s=QQXlC9Tisa60TvimyS3BnFDCaDF7VPD8eCzT-Fke-p0&e=
>
> Thank you!
>
> Thanks,
>
> Douglas Duckworth, MSc, LFCS
> HPC System Administrator
> Scientific Computing Unit
> Weill Cornell Medicine
> 1300 York - LC-502
> E: doug@med.cornell.edu
> O: 212-746-6305
> F: 212-746-8690
>
>
> On Thu, Aug 23, 2018 at 6:51 AM, Dominik Holler <dholler@redhat.com>
> wrote:
>
> > Would you please share the vdsm.log and the supervdsm.log from this
> > host?
> >
> > On Wed, 22 Aug 2018 11:36:09 -0400
> > Douglas Duckworth <dod2014@med.cornell.edu> wrote:
> >
> > > Hi
> > >
> > > I keep losing ib0 connection on hypervisor after adding host to
> > > engine. This makes the host not really work since NFS will be
> > > mounted over ib0.
> > >
> > > I don't really understand why this occurs.
> > >
> > > OS:
> > >
> > > [root@ovirt-hv2 ~]# cat /etc/redhat-release
> > > CentOS Linux release 7.5.1804 (Core)
> > >
> > > Here's the network script:
> > >
> > > [root@ovirt-hv2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib0
> > > DEVICE=ib0
> > > BOOTPROTO=static
> > > IPADDR=172.16.0.207
> > > NETMASK=255.255.255.0
> > > ONBOOT=yes
> > > ZONE=public
> > >
> > > When I try "ifup"
> > >
> > > [root@ovirt-hv2 ~]# ifup ib0
> > > Error: Connection activation failed: No suitable device found for
> > > this connection.
> > >
> > > The error in syslog:
> > >
> > > Aug 22 11:31:50 ovirt-hv2 kernel: IPv4: martian source 172.16.0.87
> > > from 172.16.0.49, on dev ib0
> > > Aug 22 11:31:53 ovirt-hv2 NetworkManager[1070]: <info>
> > > [1534951913.7486] audit: op="connection-activate"
> > > uuid="2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89" name="System ib0"
> > > result="fail" reason="No suitable device found for this
> > > connection."
> > >
> > > As you can see, the media state is up:
> > >
> > > [root@ovirt-hv2 ~]# ip a
> > > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state
> > > UNKNOWN group default qlen 1000
> > > link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> > > inet 127.0.0.1/8 scope host lo
> > > valid_lft forever preferred_lft forever
> > > 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master
> > > ovirtmgmt state UP group default qlen 1000
> > > link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
> > > 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq
> > > state DOWN group default qlen 1000
> > > link/ether 50:9a:4c:89:d3:82 brd ff:ff:ff:ff:ff:ff
> > > 4: p1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq
> > > state DOWN group default qlen 1000
> > > link/ether b4:96:91:13:ea:68 brd ff:ff:ff:ff:ff:ff
> > > 5: p1p2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq
> > > state DOWN group default qlen 1000
> > > link/ether b4:96:91:13:ea:6a brd ff:ff:ff:ff:ff:ff
> > > 6: idrac: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
> > > pfifo_fast state UNKNOWN group default qlen 1000
> > > link/ether 50:9a:4c:89:d3:84 brd ff:ff:ff:ff:ff:ff
> > > inet 169.254.0.2/16 brd 169.254.255.255 scope global idrac
> > > valid_lft forever preferred_lft forever
> > > 7: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state
> > > UP group default qlen 256
> > > link/infiniband
> > > a0:00:02:08:fe:80:00:00:00:00:00:00:ec:0d:9a:03:00:1d:13:41 brd
> > > 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
> > > 8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state
> > > DOWN group default qlen 1000
> > > link/ether 12:b4:30:22:39:5b brd ff:ff:ff:ff:ff:ff
> > > 9: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
> > > group default qlen 1000
> > > link/ether 3e:32:e6:66:98:49 brd ff:ff:ff:ff:ff:ff
> > > 25: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
> > > noqueue state UP group default qlen 1000
> > > link/ether 50:9a:4c:89:d3:81 brd ff:ff:ff:ff:ff:ff
> > > inet 10.0.0.183/16 brd 10.0.255.255 scope global ovirtmgmt
> > > valid_lft forever preferred_lft forever
> > > 26: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000
> > > qdisc noqueue master ovs-system state UNKNOWN group default qlen
> > > 1000 link/ether aa:32:82:1b:01:d9 brd ff:ff:ff:ff:ff:ff
> > > 27: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state
> > > DOWN group default qlen 1000
> > > link/ether 32:ff:5d:b8:c2:b4 brd ff:ff:ff:ff:ff:ff
> > >
> > > The card is FDR:
> > >
> > > [root@ovirt-hv2 ~]# lspci -v | grep Mellanox
> > > 01:00.0 Network controller: Mellanox Technologies MT27500 Family
> > > [ConnectX-3]
> > > Subsystem: Mellanox Technologies Device 0051
> > >
> > > Latest OFED driver:
> > >
> > > [root@ovirt-hv2 ~]# /etc/init.d/openibd status
> > >
> > > HCA driver loaded
> > >
> > > Configured IPoIB devices:
> > > ib0
> > >
> > > Currently active IPoIB devices:
> > > ib0
> > > Configured Mellanox EN devices:
> > >
> > > Currently active Mellanox devices:
> > > ib0
> > >
> > > The following OFED modules are loaded:
> > >
> > > rdma_ucm
> > > rdma_cm
> > > ib_ipoib
> > > mlx4_core
> > > mlx4_ib
> > > mlx4_en
> > > mlx5_core
> > > mlx5_ib
> > > ib_uverbs
> > > ib_umad
> > > ib_ucm
> > > ib_cm
> > > ib_core
> > > mlxfw
> > > mlx5_fpga_tools
> > >
> > > I can add an IP to ib0 using "ip addr", though I need
> > > NetworkManager to work with ib0.
> > >
> > >
> > > Thanks,
> > >
> > > Douglas Duckworth, MSc, LFCS
> > > HPC System Administrator
> > > Scientific Computing Unit
> > > Weill Cornell Medicine
> > > 1300 York - LC-502
> > > E: doug@med.cornell.edu
> > > O: 212-746-6305
> > > F: 212-746-8690