[ovirt-users] oVIRT 4 / OVN / Communication issues of instances between nodes.

Devin Acosta devin at pabstatencio.com
Tue Dec 6 04:28:17 UTC 2016


Lance,

Well, I installed the new kernel module and it cleared up a lot of the
errors I was seeing in the log, but I still can't ping instances between
hosts. I'm starting to wonder whether I'm missing something fundamental
here. I don't see anything in ovs-vswitchd.log that shows a tunnel.

The kernel log does show this when the module is reloaded:

[1056295.308707] openvswitch: module verification failed: signature and/or
required key missing - tainting kernel
[1056295.311034] openvswitch: Open vSwitch switching datapath 2.6.90
[1056295.311145] openvswitch: LISP tunneling driver
[1056295.311147] openvswitch: GRE over IPv4 tunneling driver
[1056295.311153] openvswitch: Geneve tunneling driver
[1056295.311164] openvswitch: VxLAN tunneling driver
[1056295.311166] openvswitch: STT tunneling driver

[node2]

[root at ovirt-node2 openvswitch]# cat ovs-vswitchd.log
2016-12-06T04:22:23.192Z|00001|vlog|INFO|opened log file
/var/log/openvswitch/ovs-vswitchd.log
2016-12-06T04:22:23.194Z|00002|ovs_numa|INFO|Discovered 16 CPU cores on
NUMA node 0
2016-12-06T04:22:23.194Z|00003|ovs_numa|INFO|Discovered 16 CPU cores on
NUMA node 1
2016-12-06T04:22:23.194Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes and 32
CPU cores
2016-12-06T04:22:23.194Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
connecting...
2016-12-06T04:22:23.195Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
connected
2016-12-06T04:22:23.197Z|00007|ofproto_dpif|INFO|system at ovs-system:
Datapath supports recirculation
2016-12-06T04:22:23.197Z|00008|ofproto_dpif|INFO|system at ovs-system: MPLS
label stack length probed as 1
2016-12-06T04:22:23.197Z|00009|ofproto_dpif|INFO|system at ovs-system:
Datapath supports truncate action
2016-12-06T04:22:23.197Z|00010|ofproto_dpif|INFO|system at ovs-system:
Datapath supports unique flow ids
2016-12-06T04:22:23.197Z|00011|ofproto_dpif|INFO|system at ovs-system:
Datapath supports ct_state
2016-12-06T04:22:23.197Z|00012|ofproto_dpif|INFO|system at ovs-system:
Datapath supports ct_zone
2016-12-06T04:22:23.197Z|00013|ofproto_dpif|INFO|system at ovs-system:
Datapath supports ct_mark
2016-12-06T04:22:23.197Z|00014|ofproto_dpif|INFO|system at ovs-system:
Datapath supports ct_label
2016-12-06T04:22:23.197Z|00015|ofproto_dpif|INFO|system at ovs-system:
Datapath supports ct_state_nat
2016-12-06T04:22:23.339Z|00001|ofproto_dpif_upcall(handler1)|INFO|received
packet on unassociated datapath port 0
2016-12-06T04:22:23.339Z|00016|bridge|INFO|bridge br-int: added interface
vnet0 on port 5
2016-12-06T04:22:23.339Z|00017|bridge|INFO|bridge br-int: added interface
br-int on port 65534
2016-12-06T04:22:23.339Z|00018|bridge|INFO|bridge br-int: using datapath ID
000016d6e0b66442
2016-12-06T04:22:23.339Z|00019|connmgr|INFO|br-int: added service
controller "punix:/var/run/openvswitch/br-int.mgmt"
2016-12-06T04:22:23.340Z|00020|bridge|INFO|ovs-vswitchd (Open vSwitch)
2.6.90
2016-12-06T04:22:32.437Z|00021|bridge|INFO|bridge br-int: added interface
ovn-c0dc09-0 on port 6
2016-12-06T04:22:32.437Z|00022|bridge|INFO|bridge br-int: added interface
ovn-252778-0 on port 7
2016-12-06T04:22:33.342Z|00023|memory|INFO|281400 kB peak resident set size
after 10.2 seconds
2016-12-06T04:22:33.342Z|00024|memory|INFO|handlers:23 ofconns:2 ports:4
revalidators:9 rules:79
2016-12-06T04:22:42.440Z|00025|connmgr|INFO|br-int<->unix: 76 flow_mods 10
s ago (75 adds, 1 deletes)
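
For reading a log like the one above, here is a minimal sketch that pulls out
the two things relevant to this thread: datapath features reported as
unsupported, and interfaces named ovn-* added to br-int. As far as I
understand, ovn-controller names the tunnel ports it creates toward other
chassis this way, so the ovn-c0dc09-0 and ovn-252778-0 lines are likely the
tunnel evidence being looked for. The function names are made up for
illustration, and the sketch assumes the real log file has one message per
line (the copy above is wrapped by the mail client).

```shell
#!/bin/sh
# Hypothetical helpers; both read an ovs-vswitchd.log from stdin.

# Print each datapath feature the log reports as unsupported (e.g. ct_state).
unsupported_features() {
    sed -n 's/.*Datapath does not support \(.*\)$/\1/p'
}

# Print the names of ovn-* interfaces that were added to br-int.
ovn_ports() {
    sed -n 's/.*bridge br-int: added interface \(ovn-[^ ]*\) on port [0-9]*$/\1/p'
}
```

Usage would be something like
"ovn_ports < /var/log/openvswitch/ovs-vswitchd.log". An empty result from
unsupported_features together with a non-empty result from ovn_ports suggests
the module and the tunnel ports are in place, so the problem is probably
elsewhere.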

[root at ovirt-node2 openvswitch]# cat ovn-controller.log
2016-12-06T04:22:32.398Z|00001|vlog|INFO|opened log file
/var/log/openvswitch/ovn-controller.log
2016-12-06T04:22:32.400Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
connecting...
2016-12-06T04:22:32.400Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
connected
2016-12-06T04:22:32.402Z|00004|reconnect|INFO|tcp:172.20.192.77:6642:
connecting...
2016-12-06T04:22:32.403Z|00005|reconnect|INFO|tcp:172.20.192.77:6642:
connected
2016-12-06T04:22:32.406Z|00006|binding|INFO|Claiming lport
56432d2b-a96d-4ac7-b0e9-3450a006e1d4 for this chassis.
2016-12-06T04:22:32.406Z|00007|binding|INFO|Claiming 00:1a:4a:16:01:64
dynamic
2016-12-06T04:22:32.407Z|00008|ofctrl|INFO|unix:/var/run/openvswitch/br-int.mgmt:
connecting to switch
2016-12-06T04:22:32.407Z|00009|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt:
connecting...
2016-12-06T04:22:32.407Z|00010|pinctrl|INFO|unix:/var/run/openvswitch/br-int.mgmt:
connecting to switch
2016-12-06T04:22:32.407Z|00011|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt:
connecting...
2016-12-06T04:22:32.408Z|00012|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt:
connected
2016-12-06T04:22:32.408Z|00013|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt:
connected
2016-12-06T04:22:32.440Z|00014|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:22:32.441Z|00015|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:22:32.441Z|00016|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:22:37.408Z|00017|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:22:42.408Z|00018|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:22:47.409Z|00019|ofctrl|INFO|Dropped 1 log messages in last 5
seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:22:47.409Z|00020|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:22:57.411Z|00021|ofctrl|INFO|Dropped 3 log messages in last
10 seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:22:57.411Z|00022|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:23:12.413Z|00023|ofctrl|INFO|Dropped 4 log messages in last
10 seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:23:12.413Z|00024|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:23:22.415Z|00025|ofctrl|INFO|Dropped 3 log messages in last
10 seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:23:22.415Z|00026|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:23:37.417Z|00027|ofctrl|INFO|Dropped 5 log messages in last
10 seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:23:37.417Z|00028|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:23:47.419Z|00029|ofctrl|INFO|Dropped 3 log messages in last
10 seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:23:47.419Z|00030|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
2016-12-06T04:23:57.421Z|00031|ofctrl|INFO|Dropped 3 log messages in last
10 seconds (most recently, 5 seconds ago) due to excessive rate
2016-12-06T04:23:57.421Z|00032|ofctrl|INFO|dropping duplicate flow:
table_id=32, priority=150, reg10=0x2/0x2, actions=resubmit(,33)
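
As a small aid for reading the ovn-controller log above, this sketch extracts
the logical port IDs that the chassis claimed, so they can be compared by hand
with what "ovn-sbctl show" reports on the central node. The function name is
hypothetical, and the sketch again assumes one message per line in the real
log file.

```shell
#!/bin/sh
# Hypothetical helper; reads an ovn-controller.log from stdin and prints
# the ID of every logical port claimed by this chassis.
claimed_lports() {
    sed -n 's/.*|binding|INFO|Claiming lport \([^ ]*\) for this chassis\.$/\1/p'
}
```

Usage would be something like
"claimed_lports < /var/log/openvswitch/ovn-controller.log".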

[root at ovirt-node2 openvswitch]# brctl show
bridge name     bridge id               STP enabled     interfaces
;vdsmdummy;     8000.000000000000       no
DEV-NOC         8000.0cc47a1ef306       no              bond0
DEV-VM-NET      8000.0cc47a1ef306       no              bond0.700
ovirtmgmt       8000.0cc47a08b3c2       no              enp7s0f0
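
Note that brctl only lists Linux bridges, so br-int and its tunnel ports do
not appear in the output above. A minimal sketch of the equivalent checks on
the Open vSwitch side follows. The wrapper function name is hypothetical, and
type=geneve assumes the hosts are configured for Geneve encapsulation as
discussed in this thread.

```shell
#!/bin/sh
# Hypothetical wrapper around standard ovs-vsctl queries.
show_ovs_tunnels() {
    if ! command -v ovs-vsctl >/dev/null 2>&1; then
        echo "ovs-vsctl not found in PATH" >&2
        return 1
    fi
    # Bridges known to Open vSwitch (br-int should be listed here).
    ovs-vsctl list-br
    # Ports attached to the OVN integration bridge.
    ovs-vsctl list-ports br-int
    # Tunnel interfaces of type geneve, with their remote_ip options.
    ovs-vsctl --columns=name,type,options find Interface type=geneve
    # OVN settings for this chassis (ovn-remote, ovn-encap-type, ovn-encap-ip).
    ovs-vsctl get Open_vSwitch . external_ids
}
```

Run show_ovs_tunnels as root on each node. The remote_ip values of the tunnel
interfaces should match the ovn-encap-ip values set on the other nodes.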

-- 

Devin Acosta
Red Hat Certified Architect, LinuxStack
devin at linuxguru.co


On Mon, Dec 5, 2016 at 2:34 PM, Lance Richardson <lrichard at redhat.com>
wrote:

> > From: "Devin Acosta" <devin at pabstatencio.com>
> > To: "Lance Richardson" <lrichard at redhat.com>
> > Cc: "Marcin Mirecki" <mmirecki at redhat.com>, "users" <Users at ovirt.org>
> > Sent: Monday, December 5, 2016 4:17:35 PM
> > Subject: Re: [ovirt-users] oVIRT 4 / OVN / Communication issues of
> instances between nodes.
> >
> > Lance,
> >
> > I found some interesting logs, we have (3) oVIRT nodes.
> >
> > We are running:
> > CentOS Linux release 7.2.1511 (Core)
> > Linux hostname 3.10.0-327.36.3.el7.x86_64 #1 SMP Mon Oct 24 16:09:20 UTC
> > 2016 x86_64 x86_64 x86_64 GNU/Linux
> >
>
> <snip>
>
> > 2016-12-05T20:47:56.774Z|00021|ofctrl|INFO|OpenFlow error: OFPT_ERROR
> > (OF1.3) (xid=0x17): OFPBAC_BAD_TYPE
>
> This (generally unintelligible) message usually indicates that the kernel
> openvswitch module doesn't support conntrack.
>
> <snip>
>
> >
> > 2016-12-05T20:35:04.345Z|00001|vlog|INFO|opened log file
> > /var/log/openvswitch/ovs-vswitchd.log
> > 2016-12-05T20:35:04.347Z|00002|ovs_numa|INFO|Discovered 16 CPU cores on
> > NUMA node 0
> > 2016-12-05T20:35:04.347Z|00003|ovs_numa|INFO|Discovered 16 CPU cores on
> > NUMA node 1
> > 2016-12-05T20:35:04.347Z|00004|ovs_numa|INFO|Discovered 2 NUMA nodes
> and 32
> > CPU cores
> > 2016-12-05T20:35:04.348Z|00005|reconnect|INFO|unix:/
> var/run/openvswitch/db.sock:
> > connecting...
> > 2016-12-05T20:35:04.348Z|00006|reconnect|INFO|unix:/
> var/run/openvswitch/db.sock:
> > connected
> > 2016-12-05T20:35:04.350Z|00007|ofproto_dpif|INFO|system at ovs-system:
> > Datapath supports recirculation
> > 2016-12-05T20:35:04.350Z|00008|ofproto_dpif|INFO|system at ovs-system: MPLS
> > label stack length probed as 1
> > 2016-12-05T20:35:04.350Z|00009|ofproto_dpif|INFO|system at ovs-system:
> > Datapath does not support truncate action
> > 2016-12-05T20:35:04.350Z|00010|ofproto_dpif|INFO|system at ovs-system:
> > Datapath supports unique flow ids
> > 2016-12-05T20:35:04.350Z|00011|ofproto_dpif|INFO|system at ovs-system:
> > Datapath does not support ct_state
> > 2016-12-05T20:35:04.350Z|00012|ofproto_dpif|INFO|system at ovs-system:
> > Datapath does not support ct_zone
> > 2016-12-05T20:35:04.350Z|00013|ofproto_dpif|INFO|system at ovs-system:
> > Datapath does not support ct_mark
> > 2016-12-05T20:35:04.350Z|00014|ofproto_dpif|INFO|system at ovs-system:
> > Datapath does not support ct_label
> > 2016-12-05T20:35:04.350Z|00015|ofproto_dpif|INFO|system at ovs-system:
> > Datapath does not support ct_state_nat
>
> OK, "Datapath does not support ct_*" confirms that the kernel openvswitch
> module doesn't support the conntrack features needed by OVN.
>
> Most likely the loaded module is the stock CentOS one. You can build
> the out-of-tree kernel module RPM from the same source tree where you
> built the other OVS/OVN RPMs via:
>
>    make rpm-fedora-kmod
>
> This should leave an RPM named something like:
>
>    openvswitch-kmod-2.6.90-1.el7.centos.x86_64.rpm
>
> Install that and reboot and things should be working better.
>
> Regards,
>
>    Lance
>
>
> >
> > Your help is greatly appreciated!
> >
> > Devin
> >
> > On Mon, Dec 5, 2016 at 12:31 PM, Lance Richardson <lrichard at redhat.com>
> > wrote:
> >
> > > > From: "Devin Acosta" <devin at pabstatencio.com>
> > > > To: "Marcin Mirecki" <mmirecki at redhat.com>
> > > > Cc: "users" <Users at ovirt.org>
> > > > Sent: Monday, December 5, 2016 12:11:46 PM
> > > > Subject: Re: [ovirt-users] oVIRT 4 / OVN / Communication issues of
> > > instances between nodes.
> > > >
> > > > Marcin,
> > > >
> > > > Also I noticed in your original post it mentions:
> > > >
> > > > ip link - the result should include a link called genev_sys_ ...
> > > >
> > > > I noticed that on my hosts I don't see any links with name:
> genev_sys_ ??
> > > > Could this be a problem?
> > > >
> > > > lo:
> > > > enp4s0f0:
> > > > enp4s0f1:
> > > > enp7s0f0:
> > > > enp7s0f1:
> > > > bond0:
> > > > DEV-NOC:
> > > > ovirtmgmt:
> > > > bond0.700 at bond0:
> > > > DEV-VM-NET:
> > > > bond0.705 at bond0:
> > > > ;vdsmdummy;:
> > > > vnet0:
> > > > vnet1:
> > > > vnet2:
> > > > vnet3:
> > > > vnet4:
> > > > ovs-system:
> > > > br-int:
> > > > vnet5:
> > > > vnet6:
> > > >
> > >
> > > Hi Devin,
> > >
> > > What distribution and kernel version are you using?
> > >
> > > One thing you could check is whether the vport_geneve kernel module
> > > is being loaded, e.g. you should see something like:
> > >
> > >     $ lsmod | grep vport
> > >     vport_geneve           12560  1
> > >     openvswitch           246755  5 vport_geneve
> > >
> > > If vport_geneve is not loaded, you could "sudo modprobe vport_geneve"
> > > to make sure it's available and can be loaded.
> > >
> > > The first 100 lines or so of ovs-vswitchd.log might have some useful
> > > information about where things are going wrong.
> > >
> > > It does sound as though there is some issue with geneve tunnels,
> > > which would certainly explain issues with inter-node traffic.
> > >
> > > Regards,
> > >
> > >     Lance
> > >
> >
> >
> >
> > --
> >
> > Devin Acosta
> > Red Hat Certified Architect, LinuxStack
> > 602-354-1220 || devin at linuxguru.co
> >
>