[ovirt-users] R: Re: R: Re: Network instability after upgrade 3.6.0 -> 3.6.1

Dan Kenigsberg danken at redhat.com
Mon Feb 8 13:40:15 UTC 2016


On Thu, Feb 04, 2016 at 10:21:10PM +0100, Stefano Danzi wrote:
> I have only one switch, so both interfaces are connected to the same switch. The switch configuration is correct: I opened a ticket with the switch's tech support and the configuration was validated.
> This configuration worked without problems around the clock for one year! All the problems started after a kernel update, so something changed in the kernel.

Jarod, do you have a clue why AggregatorIDs may be mismatching with
recent el7.2 kernels?
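
For anyone who wants to check their own hosts for the same symptom, here is a
minimal sketch, assuming the /proc/net/bonding/bond0 layout quoted further down
in this message (with the "> > " quote prefixes removed). The names
parse_bond and problems are hypothetical helpers, not part of any oVirt or
kernel tool, and SAMPLE is a trimmed copy of the quoted output.

```python
# Trimmed copy of the bonding status quoted in this thread (assumption:
# only the fields needed for the check are kept).
SAMPLE = """\
Active Aggregator Info:
        Aggregator ID: 2
        Number of ports: 1
        Partner Mac Address: 00:00:00:00:00:00

Slave Interface: enp4s0
MII Status: up
Aggregator ID: 1

Slave Interface: enp5s0
MII Status: up
Aggregator ID: 2
"""


def parse_bond(text):
    """Return (active_aggregator_id, {slave: aggregator_id}, partner_mac)."""
    active_id = None
    partner_mac = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:"):
            agg = int(line.split(":", 1)[1])
            if current is None:
                active_id = agg      # bond-level "Active Aggregator Info"
            else:
                slaves[current] = agg
        elif line.startswith("Partner Mac Address:") and current is None:
            partner_mac = line.split(":", 1)[1].strip()
    return active_id, slaves, partner_mac


def problems(text):
    """List the two symptoms discussed in this thread, if present."""
    _active_id, slaves, partner_mac = parse_bond(text)
    found = []
    if len(set(slaves.values())) > 1:
        found.append("slaves are in different aggregators: %r" % slaves)
    if partner_mac == "00:00:00:00:00:00":
        found.append("partner MAC is all zeros")
    return found


if __name__ == "__main__":
    for message in problems(SAMPLE):
        print(message)
```

On a live host, the text could be read from /proc/net/bonding/bond0 and passed
to problems() instead of SAMPLE.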

> 
> -------- Original message --------
> From: Dan Kenigsberg <danken at redhat.com> 
> Date: 04/02/2016  22:02  (GMT+01:00) 
> To: Stefano Danzi <s.danzi at hawai.it>, ydary at redhat.com 
> Cc: Jon Archer <jon at rosslug.org.uk>, mburman at redhat.com, users at ovirt.org 
> Subject: Re: [ovirt-users] R: Re: Network instability after upgrade 3.6.0 ->
>   3.6.1 
> 
> On Thu, Feb 04, 2016 at 06:26:14PM +0100, Stefano Danzi wrote:
> > 
> > 
> > On 04/02/2016 16:55, Dan Kenigsberg wrote:
> > >On Wed, Jan 06, 2016 at 08:45:16AM +0200, Dan Kenigsberg wrote:
> > >>On Mon, Jan 04, 2016 at 01:54:37PM +0200, Dan Kenigsberg wrote:
> > >>>On Mon, Jan 04, 2016 at 12:31:38PM +0100, Stefano Danzi wrote:
> > >>>>I did some tests:
> > >>>>
> > >>>>kernel-3.10.0-327.3.1.el7.x86_64 -> bond mode 4 doesn't work (if I detach
> > >>>>one network cable the network is stable)
> > >>>>kernel-3.10.0-229.20.1.el7.x86_64 -> bond mode 4 works fine
> > >>>Would you be so kind as to file a kernel bug in bugzilla.redhat.com?
> > >>>Summarize the information from this thread (e.g. your ifcfgs and in what
> > >>>way mode 4 doesn't work).
> > >>>
> > >>>To get the bug solved quickly we'd better find a paying RHEL7 customer
> > >>>to subscribe to it. But I'll try to push from my direction.
> > >>Stefano has been kind enough to open
> > >>
> > >>     Bug 1295423 - Unstable network link using bond mode = 4
> > >>     https://bugzilla.redhat.com/show_bug.cgi?id=1295423
> > >>
> > >>which we fail to reproduce in our own lab. I'd be pleased if anybody who
> > >>experiences it would add their networking config to the bug (if it is
> > >>different). Can you also lay out your switch's hardware and
> > >>configuration?
> > >Stefano, could you share your /proc/net/bonding/* files with us?
> > >I heard about similar reports where the bond slaves had mismatching
> > >aggregator IDs. Could it be your case as well?
> > >
> > 
> > Here:
> > 
> > [root@ovirt01 ~]# cat /proc/net/bonding/bond0
> > Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
> > 
> > Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> > Transmit Hash Policy: layer2 (0)
> > MII Status: up
> > MII Polling Interval (ms): 100
> > Up Delay (ms): 0
> > Down Delay (ms): 0
> > 
> > 802.3ad info
> > LACP rate: slow
> > Min links: 0
> > Aggregator selection policy (ad_select): stable
> > Active Aggregator Info:
> >         Aggregator ID: 2
> >         Number of ports: 1
> >         Actor Key: 9
> >         Partner Key: 1
> >         Partner Mac Address: 00:00:00:00:00:00
> > 
> > Slave Interface: enp4s0
> > MII Status: up
> > Speed: 1000 Mbps
> > Duplex: full
> > Link Failure Count: 2
> > Permanent HW addr: **:**:**:**:**:f1
> > Slave queue ID: 0
> > Aggregator ID: 1
> 
> ---------------^^^
> 
> 
> > Actor Churn State: churned
> > Partner Churn State: churned
> > Actor Churned Count: 4
> > Partner Churned Count: 5
> > details actor lacp pdu:
> >     system priority: 65535
> >     port key: 9
> >     port priority: 255
> >     port number: 1
> >     port state: 69
> > details partner lacp pdu:
> >     system priority: 65535
> >     oper key: 1
> >     port priority: 255
> >     port number: 1
> >     port state: 1
> > 
> > Slave Interface: enp5s0
> > MII Status: up
> > Speed: 1000 Mbps
> > Duplex: full
> > Link Failure Count: 1
> > Permanent HW addr: **:**:**:**:**:f2
> > Slave queue ID: 0
> > Aggregator ID: 2
> 
> ---------------^^^
> 
> 
> It sounds awfully familiar: mismatching aggregator IDs and an all-zero
> partner MAC. Can you double-check that both your NICs are wired to the
> same switch, which is properly configured to use LACP on these two
> ports?
> 
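
The "port state" values in the quoted output (69 for the actor, 1 for the
partner on enp4s0) are bit fields. A small sketch that decodes them, assuming
the standard IEEE 802.3ad state bit assignments; decode_port_state is a
hypothetical helper name used only for illustration.

```python
# LACP port state bits, least significant first (assumption: standard
# IEEE 802.3ad layout).
LACP_STATE_BITS = [
    (0x01, "LACP_Activity"),
    (0x02, "LACP_Timeout"),
    (0x04, "Aggregation"),
    (0x08, "Synchronization"),
    (0x10, "Collecting"),
    (0x20, "Distributing"),
    (0x40, "Defaulted"),
    (0x80, "Expired"),
]


def decode_port_state(value):
    """Return the names of the state bits set in a 'port state' value."""
    return [name for bit, name in LACP_STATE_BITS if value & bit]


if __name__ == "__main__":
    print("actor   69:", decode_port_state(69))
    print("partner  1:", decode_port_state(1))
```

Under that assumption, 69 (0x45) has Defaulted set and Synchronization,
Collecting and Distributing clear. That would mean the actor is falling back to
default partner information, which fits the all-zero partner MAC.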


