[ovirt-users] R: Re: R: Re: Network instability after upgrade 3.6.0 -> 3.6.1
Dan Kenigsberg
danken at redhat.com
Mon Feb 8 13:40:15 UTC 2016
On Thu, Feb 04, 2016 at 10:21:10PM +0100, Stefano Danzi wrote:
> I have only one switch, so both interfaces are connected to the same switch. The switch configuration is correct: I opened a ticket with the switch's tech support and the configuration was validated.
> This configuration worked around the clock without problems for one year! All the problems started after a kernel update, so something must have changed in the kernel.
Jarod, do you have a clue why AggregatorIDs may be mismatching with
recent el7.2 kernels?
>
> -------- Original message --------
> From: Dan Kenigsberg <danken at redhat.com>
> Date: 04/02/2016 22:02 (GMT+01:00)
> To: Stefano Danzi <s.danzi at hawai.it>, ydary at redhat.com
> Cc: Jon Archer <jon at rosslug.org.uk>, mburman at redhat.com, users at ovirt.org
> Subject: Re: [ovirt-users] R: Re: Network instability after upgrade 3.6.0 ->
> 3.6.1
>
> On Thu, Feb 04, 2016 at 06:26:14PM +0100, Stefano Danzi wrote:
> >
> >
> > On 04/02/2016 16:55, Dan Kenigsberg wrote:
> > >On Wed, Jan 06, 2016 at 08:45:16AM +0200, Dan Kenigsberg wrote:
> > >>On Mon, Jan 04, 2016 at 01:54:37PM +0200, Dan Kenigsberg wrote:
> > >>>On Mon, Jan 04, 2016 at 12:31:38PM +0100, Stefano Danzi wrote:
> > >>>>I did some tests:
> > >>>>
> > >>>>kernel-3.10.0-327.3.1.el7.x86_64 -> bond mode 4 doesn't work (if I detach
> > >>>>one network cable the network is stable)
> > >>>>kernel-3.10.0-229.20.1.el7.x86_64 -> bond mode 4 works fine
> > >>>Would you be so kind as to file a kernel bug in bugzilla.redhat.com?
> > >>>Summarize the information from this thread (e.g. your ifcfgs and in what
> > >>>way mode 4 doesn't work).
> > >>>
> > >>>To get the bug solved quickly we'd better find a paying RHEL7 customer
> > >>>to subscribe to it. But I'll try to push from my direction.
> > >>Stefano has been kind enough to open
> > >>
> > >> Bug 1295423 - Unstable network link using bond mode = 4
> > >> https://bugzilla.redhat.com/show_bug.cgi?id=1295423
> > >>
> > >>which we fail to reproduce in our own lab. I'd be pleased if anybody who
> > >>experiences it would add their networking config to the bug (if it is
> > >>different). Can you also lay out your switch's hardware and
> > >>configuration?
> > >Stefano, could you share your /proc/net/bonding/* files with us?
> > >I heard about similar reports where the bond slaves had mismatching
> > >aggregator id. Could it be your case as well?
> > >
> >
> > Here:
> >
> > [root at ovirt01 ~]# cat /proc/net/bonding/bond0
> > Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
> >
> > Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> > Transmit Hash Policy: layer2 (0)
> > MII Status: up
> > MII Polling Interval (ms): 100
> > Up Delay (ms): 0
> > Down Delay (ms): 0
> >
> > 802.3ad info
> > LACP rate: slow
> > Min links: 0
> > Aggregator selection policy (ad_select): stable
> > Active Aggregator Info:
> > Aggregator ID: 2
> > Number of ports: 1
> > Actor Key: 9
> > Partner Key: 1
> > Partner Mac Address: 00:00:00:00:00:00
> >
> > Slave Interface: enp4s0
> > MII Status: up
> > Speed: 1000 Mbps
> > Duplex: full
> > Link Failure Count: 2
> > Permanent HW addr: **:**:**:**:**:f1
> > Slave queue ID: 0
> > Aggregator ID: 1
>
> ---------------^^^
>
>
> > Actor Churn State: churned
> > Partner Churn State: churned
> > Actor Churned Count: 4
> > Partner Churned Count: 5
> > details actor lacp pdu:
> > system priority: 65535
> > port key: 9
> > port priority: 255
> > port number: 1
> > port state: 69
> > details partner lacp pdu:
> > system priority: 65535
> > oper key: 1
> > port priority: 255
> > port number: 1
> > port state: 1
> >
> > Slave Interface: enp5s0
> > MII Status: up
> > Speed: 1000 Mbps
> > Duplex: full
> > Link Failure Count: 1
> > Permanent HW addr: **:**:**:**:**:f2
> > Slave queue ID: 0
> > Aggregator ID: 2
>
> ---------------^^^
>
>
> it sounds awfully familiar - mismatching aggregator IDs, and an all-zero
> partner mac. Can you double-check that both your nics are wired to the
> same switch, which is properly configured to use lacp on these two
> ports?
>
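The two symptoms pointed out above (slaves reporting different Aggregator IDs, and an all-zero partner MAC) can be checked mechanically on other hosts. Below is a minimal Python sketch, assuming the /proc/net/bonding/<bond> layout pasted in this thread. The names parse_bonding, diagnose and SAMPLE are illustrative only and are not part of any kernel, VDSM or oVirt tooling, and SAMPLE is an abbreviated copy of the output quoted above.

```python
ZERO_MAC = "00:00:00:00:00:00"

# Abbreviated from the bond0 output quoted in this thread.
SAMPLE = """\
Active Aggregator Info:
Aggregator ID: 2
Partner Mac Address: 00:00:00:00:00:00

Slave Interface: enp4s0
Aggregator ID: 1

Slave Interface: enp5s0
Aggregator ID: 2
"""


def parse_bonding(text):
    """Return (active_aggregator_id, partner_mac, {slave_name: aggregator_id}).

    Lines before the first "Slave Interface" are treated as bond-wide;
    lines after it are attributed to the most recent slave.
    """
    active_id = None
    partner_mac = None
    slaves = {}
    current_slave = None
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # blank line or a line without "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current_slave = value
        elif key == "Aggregator ID":
            if current_slave is None:
                active_id = int(value)
            else:
                slaves[current_slave] = int(value)
        elif key == "Partner Mac Address" and current_slave is None:
            partner_mac = value
    return active_id, partner_mac, slaves


def diagnose(text):
    """Return a list of human-readable warnings (empty if nothing looks off)."""
    active_id, partner_mac, slaves = parse_bonding(text)
    problems = []
    if len(set(slaves.values())) > 1:
        problems.append("slaves are in different aggregators: %r" % (slaves,))
    if partner_mac == ZERO_MAC:
        problems.append("partner MAC is all zeros")
    return problems


for warning in diagnose(SAMPLE):
    print(warning)
```

On a real host the same check would be run as diagnose(open("/proc/net/bonding/bond0").read()). The sketch only flags the symptoms; it cannot tell whether the cause is the switch configuration or the kernel.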