[Users] alx driver causes network troubles on VMs after bonding

Alessandro Bianchi a.bianchi at skynet.it
Thu Jan 9 14:42:38 UTC 2014


Hi all

I'm running several F19 hosts with multiple Gigabit NICs.

I've discovered on at least two different hardware host nodes that the 
alx driver causes network connection problems.

The problem involves this very common onboard Atheros NIC (relevant 
part of lspci -vv only):

03:00.0 Ethernet controller: Qualcomm Atheros AR8161 Gigabit Ethernet (rev 10)
     Subsystem: ASUSTeK Computer Inc. Device 8507
     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
     Latency: 0, Cache Line Size: 64 bytes
     Interrupt: pin A routed to IRQ 16
     Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=256K]
     Region 2: I/O ports at e000 [size=128]
     Capabilities: [40] Power Management version 3
         Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
     Capabilities: [58] Express (v1) Endpoint, MSI 00
         DevCap:    MaxPayload 4096 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
             ExtTag- AttnBtn+ AttnInd+ PwrInd+ RBE+ FLReset-
         DevCtl:    Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
             MaxPayload 128 bytes, MaxReadReq 512 bytes
         DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
         LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 unlimited
             ClockPM+ Surprise- LLActRep- BwNot-
         LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- CommClk+
             ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
         LnkSta:    Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
     Capabilities: [c0] MSI: Enable- Count=1/16 Maskable+ 64bit+
         Address: 00000000fee0f00c  Data: 4142
         Masking: ffff0000  Pending: 00000000
     Capabilities: [d8] MSI-X: Enable- Count=16 Masked-
         Vector table: BAR=0 offset=00002000
         PBA: BAR=0 offset=00003000
     Capabilities: [100 v1] Advanced Error Reporting
         UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
         UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
         UESvrt:    DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
         CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
         CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
         AERCap:    First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
     Capabilities: [180 v1] Device Serial Number ff-d6-66-47-08-60-6e-ff
     Kernel driver in use: alx
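
To find which interfaces on a host are bound to alx without reading 
through lspci output, the driver symlink that sysfs exposes for each 
physical NIC can be inspected. This is only a minimal sketch: the 
function name and the sysfs_root parameter are made up for illustration.

```python
import os


def interfaces_using_driver(driver, sysfs_root="/sys/class/net"):
    """Return the sorted names of network interfaces bound to `driver`.

    Hypothetical helper; `sysfs_root` is a parameter only so the function
    can be pointed at a test directory instead of the real sysfs.
    """
    matches = []
    for iface in sorted(os.listdir(sysfs_root)):
        link = os.path.join(sysfs_root, iface, "device", "driver")
        # Virtual interfaces (bonds, bridges, lo) have no device/driver link.
        if not os.path.islink(link):
            continue
        # The link target ends in the driver name, e.g. .../drivers/alx
        if os.path.basename(os.readlink(link)) == driver:
            matches.append(iface)
    return matches
```

Calling interfaces_using_driver("alx") on a host returns the names of 
the interfaces to look at before adding them to a bond.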

When I bond it into the ovirtmgmnt network (a bond of two Gigabit 
NICs), VMs on the local cluster run OK, but they fail to mount some 
gluster mount points.

I suppose this is related to the alx driver, which does not seem to be 
fully working yet (it still lacks counters).

Bringing down the relevant interface (ifdown p4p1), so that the bond 
is left with only one NIC, fixes the issue.
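
To confirm which NICs are still active in the bond after such a change, 
the status file the bonding driver exposes under /proc/net/bonding/ can 
be parsed. This is a rough sketch only: the function name is invented, 
and the bond name (bond0 below) is an assumption, since the actual name 
is not given here.

```python
def parse_bond_slaves(bonding_text):
    """Map each slave interface to its MII status.

    `bonding_text` is the content of a /proc/net/bonding/<bond> file.
    Hypothetical helper written for illustration.
    """
    slaves = {}
    current = None
    for line in bonding_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
            slaves[current] = None
        elif key == "MII Status" and current is not None:
            # The "MII Status" line before the first slave belongs to the
            # bond itself and is skipped because `current` is still None.
            slaves[current] = value
    return slaves
```

For example, parse_bond_slaves(open("/proc/net/bonding/bond0").read()) 
gives a dict of slave name to "up" or "down", assuming the bond is 
called bond0.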

So be warned: if you have this hardware, check the network mount 
points inside your VMs very carefully.
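
One way to start such a check inside a VM is to list the network 
filesystems the kernel currently reports as mounted and compare them 
with what should be there. The sketch below parses /proc/mounts text. 
The function name and the set of filesystem types are my own choice for 
illustration (gluster FUSE mounts show up as fuse.glusterfs).

```python
# Filesystem types treated as "network mounts" here; an illustrative
# selection, adjust it to whatever the VMs actually use.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "fuse.glusterfs"}


def network_mounts(mounts_text):
    """Return (source, mount_point, fstype) tuples for network filesystems.

    `mounts_text` is the content of /proc/mounts. Hypothetical helper.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        source, mount_point, fstype = fields[:3]
        if fstype in NETWORK_FS_TYPES:
            found.append((source, mount_point, fstype))
    return found
```

A gluster mount point that is expected but missing from 
network_mounts(open("/proc/mounts").read()) is a sign that the mount 
failed. Note that a mount which is listed can still hang on access, so 
this only catches the mounts that never came up.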

If you have already tested this configuration and have it working, 
please let me know.

Best regards

Alessandro Bianchi


