On Thu, Feb 04, 2016 at 06:26:14PM +0100, Stefano Danzi wrote:
On 04/02/2016 16:55, Dan Kenigsberg wrote:
>On Wed, Jan 06, 2016 at 08:45:16AM +0200, Dan Kenigsberg wrote:
>>On Mon, Jan 04, 2016 at 01:54:37PM +0200, Dan Kenigsberg wrote:
>>>On Mon, Jan 04, 2016 at 12:31:38PM +0100, Stefano Danzi wrote:
>>>>I did some tests:
>>>>
>>>>kernel-3.10.0-327.3.1.el7.x86_64 -> bond mode 4 doesn't work (if I detach
>>>>one network cable the network is stable)
>>>>kernel-3.10.0-229.20.1.el7.x86_64 -> bond mode 4 works fine
>>>Would you be kind enough to file a kernel bug in bugzilla.redhat.com?
>>>Summarize the information from this thread (e.g. your ifcfgs and in what
>>>way mode 4 doesn't work).
>>>
>>>To get the bug solved quickly we'd better find a paying RHEL7 customer
>>>to subscribe to it. But I'll try to push from my direction.
>>Stefano has been kind enough to open
>>
>> Bug 1295423 - Unstable network link using bond mode = 4
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1295423
>>
>>which we fail to reproduce in our own lab. I'd be pleased if anybody who
>>experiences it would add their networking config to the bug (if it is
>>different). Can you also lay out your switch's hardware and
>>configuration?
>Stefano, could you share your /proc/net/bonding/* files with us?
>I have heard of similar reports where the bond slaves had mismatching
>aggregator IDs. Could that be your case as well?
>
Here:
[root@ovirt01 ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
Aggregator ID: 2
Number of ports: 1
Actor Key: 9
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: enp4s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: **:**:**:**:**:f1
Slave queue ID: 0
Aggregator ID: 1
---------------^^^
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 4
Partner Churned Count: 5
details actor lacp pdu:
system priority: 65535
port key: 9
port priority: 255
port number: 1
port state: 69
details partner lacp pdu:
system priority: 65535
oper key: 1
port priority: 255
port number: 1
port state: 1
Slave Interface: enp5s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: **:**:**:**:**:f2
Slave queue ID: 0
Aggregator ID: 2
---------------^^^
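For reference, the "port state" numbers in the dump above are bitmasks of the LACP actor/partner state field. A minimal sketch for decoding them follows; the bit names are the ones I recall from IEEE 802.3ad/802.1AX, and the helper name decode_port_state is made up for illustration.

```python
# Bit 0 is the least significant bit of the LACP state octet.
# Names follow the 802.3ad/802.1AX state field (assumed, not taken from the thread).
LACP_STATE_BITS = [
    "LACP_Activity",    # bit 0
    "LACP_Timeout",     # bit 1
    "Aggregation",      # bit 2
    "Synchronization",  # bit 3
    "Collecting",       # bit 4
    "Distributing",     # bit 5
    "Defaulted",        # bit 6
    "Expired",          # bit 7
]


def decode_port_state(value):
    """Return the names of the flags set in an LACP port state value."""
    return [name for bit, name in enumerate(LACP_STATE_BITS) if value & (1 << bit)]


# Values for enp4s0 from the dump above.
print("actor   69:", decode_port_state(69))
print("partner  1:", decode_port_state(1))
```

Under that reading, actor state 69 has Defaulted set and lacks Synchronization, Collecting and Distributing, which would mean enp4s0 is running on default partner information rather than on LACPDUs received from the switch.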
It sounds awfully familiar: mismatching aggregator IDs, and an all-zero
partner MAC. Can you double-check that both your NICs are wired to the
same switch, and that the switch is properly configured to use LACP on
these two ports?
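
The two symptoms above can be checked mechanically. Here is a rough sketch, assuming the text layout of /proc/net/bonding/bond0 shown in this thread; the function name check_bond and the shortened SAMPLE are illustrative only, not part of any existing tool.

```python
import re

ZERO_MAC = "00:00:00:00:00:00"


def check_bond(text):
    """Return a list of problems found in a /proc/net/bonding/<bond> dump."""
    partner_mac = None
    slaves = {}      # slave interface name -> its aggregator ID
    current = None   # slave section being parsed; None while in the bond header
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Slave Interface:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"Aggregator ID:\s*(\d+)", line)
        if m and current is not None:
            # Keep only the first aggregator ID seen in each slave section.
            slaves.setdefault(current, int(m.group(1)))
            continue
        m = re.match(r"Partner Mac Address:\s*(\S+)", line)
        if m and current is None:
            partner_mac = m.group(1)
    problems = []
    if len(set(slaves.values())) > 1:
        problems.append("slaves are in different aggregators: %s"
                        % sorted(slaves.items()))
    if partner_mac == ZERO_MAC:
        problems.append("partner MAC is all zeros (no LACP partner seen)")
    return problems


# Shortened copy of the relevant lines from the dump in this thread.
SAMPLE = """\
Active Aggregator Info:
Aggregator ID: 2
Partner Mac Address: 00:00:00:00:00:00
Slave Interface: enp4s0
Aggregator ID: 1
Slave Interface: enp5s0
Aggregator ID: 2
"""

for problem in check_bond(SAMPLE):
    print(problem)
```

On a host, the same function could be fed the contents of /proc/net/bonding/bond0 instead of SAMPLE. An empty result only means these two symptoms are absent, not that the bond is healthy.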