<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head><body>I have only one switch, so both interfaces are connected to the same switch. The switch configuration is correct: I opened a ticket with the switch's tech support and the configuration was validated.<div><br></div><div>This configuration worked without problems, 24 hours a day, for one year! All the problems started after a kernel update, so something was changed in the kernel.</div><br><br><div style="font-size:100%;text-align:left;color:#000000">-------- Original message --------<br>From: Dan Kenigsberg <danken@redhat.com> <br>Date: 04/02/2016 22:02 (GMT+01:00) <br>To: Stefano Danzi <s.danzi@hawai.it>, ydary@redhat.com <br>Cc: Jon Archer <jon@rosslug.org.uk>, mburman@redhat.com, users@ovirt.org <br>Subject: Re: [ovirt-users] R: Re: Network instability after upgrade 3.6.0 ->
3.6.1 <br><br></div>On Thu, Feb 04, 2016 at 06:26:14PM +0100, Stefano Danzi wrote:<br>> <br>> <br>> On 04/02/2016 16:55, Dan Kenigsberg wrote:<br>> >On Wed, Jan 06, 2016 at 08:45:16AM +0200, Dan Kenigsberg wrote:<br>> >>On Mon, Jan 04, 2016 at 01:54:37PM +0200, Dan Kenigsberg wrote:<br>> >>>On Mon, Jan 04, 2016 at 12:31:38PM +0100, Stefano Danzi wrote:<br>> >>>>I did some tests:<br>> >>>><br>> >>>>kernel-3.10.0-327.3.1.el7.x86_64 -> bond mode 4 doesn't work (if I detach<br>> >>>>one network cable the network is stable)<br>> >>>>kernel-3.10.0-229.20.1.el7.x86_64 -> bond mode 4 works fine<br>> >>>Would you be so kind as to file a kernel bug on bugzilla.redhat.com?<br>> >>>Summarize the information from this thread (e.g. your ifcfgs and in what<br>> >>>way mode 4 doesn't work).<br>> >>><br>> >>>To get the bug solved quickly we'd better find a paying RHEL7 customer<br>> >>>subscribing to it. But I'll try to push from my direction.<br>> >>Stefano has been kind enough to open<br>> >><br>> >> Bug 1295423 - Unstable network link using bond mode = 4<br>> >> https://bugzilla.redhat.com/show_bug.cgi?id=1295423<br>> >><br>> >>which we fail to reproduce in our own lab. I'd be pleased if anybody who<br>> >>experiences it would add their networking config to the bug (if it is<br>> >>different). Can you also lay out your switch's hardware and<br>> >>configuration?<br>> >Stefano, could you share your /proc/net/bonding/* files with us?<br>> >I heard similar reports where the bond slaves had mismatching<br>> >aggregator IDs. 
 Could it be your case as well?<br>> ><br>> <br>> Here:<br>> <br>> [root@ovirt01 ~]# cat /proc/net/bonding/bond0<br>> Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)<br>> <br>> Bonding Mode: IEEE 802.3ad Dynamic link aggregation<br>> Transmit Hash Policy: layer2 (0)<br>> MII Status: up<br>> MII Polling Interval (ms): 100<br>> Up Delay (ms): 0<br>> Down Delay (ms): 0<br>> <br>> 802.3ad info<br>> LACP rate: slow<br>> Min links: 0<br>> Aggregator selection policy (ad_select): stable<br>> Active Aggregator Info:<br>> Aggregator ID: 2<br>> Number of ports: 1<br>> Actor Key: 9<br>> Partner Key: 1<br>> Partner Mac Address: 00:00:00:00:00:00<br>> <br>> Slave Interface: enp4s0<br>> MII Status: up<br>> Speed: 1000 Mbps<br>> Duplex: full<br>> Link Failure Count: 2<br>> Permanent HW addr: **:**:**:**:**:f1<br>> Slave queue ID: 0<br>> Aggregator ID: 1<br><br>---------------^^^<br><br><br>> Actor Churn State: churned<br>> Partner Churn State: churned<br>> Actor Churned Count: 4<br>> Partner Churned Count: 5<br>> details actor lacp pdu:<br>> system priority: 65535<br>> port key: 9<br>> port priority: 255<br>> port number: 1<br>> port state: 69<br>> details partner lacp pdu:<br>> system priority: 65535<br>> oper key: 1<br>> port priority: 255<br>> port number: 1<br>> port state: 1<br>> <br>> Slave Interface: enp5s0<br>> MII Status: up<br>> Speed: 1000 Mbps<br>> Duplex: full<br>> Link Failure Count: 1<br>> Permanent HW addr: **:**:**:**:**:f2<br>> Slave queue ID: 0<br>> Aggregator ID: 2<br><br>---------------^^^<br><br><br>It sounds awfully familiar: mismatching aggregator IDs and an all-zero<br>partner MAC. Can you double-check that both your NICs are wired to the<br>same switch, and that the switch is properly configured to use LACP on these two<br>ports?<br><br></body></html>