[ovirt-users] centos 7.1 and up & ixgbe
Johan Kooijman
mail at johankooijman.com
Mon Apr 4 08:54:20 UTC 2016
Hi Jurrien,
I don't see anything in the logs on the nodes themselves. The only thing we
see is in the engine log: it loses connectivity to the host.
This is definitely CentOS 7.1/7.2 related. Downgrading the hosts to ovirt-iso
3.5 resolves the issue.
On Fri, Mar 18, 2016 at 9:01 AM, Bloemen, Jurriën <
Jurrien.Bloemen at dmc.amcnetworks.com> wrote:
> Hi Johan,
>
> Could you check if you see the following in your dmesg or messages log file?
>
> [1123306.014288] ------------[ cut here ]------------
> [1123306.014302] WARNING: at net/core/dev.c:2189
> skb_warn_bad_offload+0xcd/0xda()
> [1123306.014306] : caps=(0x0000000200004849, 0x0000000000000000) len=330
> data_len=276 gso_size=276 gso_type=1 ip_summed=1
> [1123306.014308] Modules linked in: vhost_net macvtap macvlan
> ip6table_filter ip6_tables iptable_filter ip_tables ebt_arp ebtable_nat
> ebtables tun scsi_transport_iscsi iTCO_wdt iTCO_vendor_support
> dm_service_time intel_powerclamp coretemp intel_rapl kvm_intel kvm
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cryptd pcspkr sb_edac
> edac_core i2c_i801 lpc_ich mfd_core mei_me mei wmi ioatdma shpchp
> ipmi_devintf ipmi_si ipmi_msghandler acpi_power_meter acpi_pad 8021q garp
> mrp bridge stp llc bonding dm_multipath xfs libcrc32c sd_mod crc_t10dif
> crct10dif_common ast syscopyarea sysfillrect sysimgblt drm_kms_helper ttm
> crc32c_intel igb drm ahci ixgbe i2c_algo_bit libahci libata mdio i2c_core
> ptp megaraid_sas pps_core dca dm_mirror dm_region_hash dm_log dm_mod
> [1123306.014360] CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W
> -------------- 3.10.0-229.1.2.el7.x86_64 #1
> [1123306.014362] Hardware name: Supermicro SYS-2028TP-HC1TR/X10DRT-PT,
> BIOS 1.1 08/03/2015
> [1123306.014364] ffff881fffc439a8 5326fb90ad1041ea ffff881fffc43960
> ffffffff81604afa
> [1123306.014371] ffff881fffc43998 ffffffff8106e34b ffff881fcebb0500
> ffff881fce88c000
> [1123306.014376] 0000000000000001 0000000000000001 ffff881fcebb0500
> ffff881fffc43a00
> [1123306.014381] Call Trace:
> [1123306.014383] <IRQ> [<ffffffff81604afa>] dump_stack+0x19/0x1b
> [1123306.014396] [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0
> [1123306.014399] [<ffffffff8106e3ec>] warn_slowpath_fmt+0x5c/0x80
> [1123306.014405] [<ffffffff812db093>] ? ___ratelimit+0x93/0x100
> [1123306.014409] [<ffffffff816076c3>] skb_warn_bad_offload+0xcd/0xda
> [1123306.014425] [<ffffffff814fdeb9>] __skb_gso_segment+0x79/0xb0
> [1123306.014429] [<ffffffff814fe1c2>] dev_hard_start_xmit+0x1a2/0x580
> [1123306.014438] [<ffffffffa0168790>] ? deliver_clone+0x50/0x50 [bridge]
> [1123306.014443] [<ffffffff8151df1e>] sch_direct_xmit+0xee/0x1c0
> [1123306.014447] [<ffffffff814fe798>] dev_queue_xmit+0x1f8/0x4a0
> [1123306.014453] [<ffffffffa016880b>] br_dev_queue_push_xmit+0x7b/0xc0
> [bridge]
> [1123306.014458] [<ffffffffa0168a22>] br_forward_finish+0x22/0x60 [bridge]
> [1123306.014464] [<ffffffffa0168ae0>] __br_forward+0x80/0xf0 [bridge]
> [1123306.014469] [<ffffffffa0168ebb>] br_forward+0x8b/0xa0 [bridge]
> [1123306.014476] [<ffffffffa0169e65>] br_handle_frame_finish+0x175/0x410
> [bridge]
> [1123306.014481] [<ffffffffa016a275>] br_handle_frame+0x175/0x260 [bridge]
> [1123306.014485] [<ffffffff814fc112>] __netif_receive_skb_core+0x282/0x870
> [1123306.014490] [<ffffffff8101b589>] ? read_tsc+0x9/0x10
> [1123306.014493] [<ffffffff814fc718>] __netif_receive_skb+0x18/0x60
> [1123306.014497] [<ffffffff814fc7a0>] netif_receive_skb+0x40/0xd0
> [1123306.014500] [<ffffffff814fd2b0>] napi_gro_receive+0x80/0xb0
> [1123306.014512] [<ffffffffa00cde2c>] ixgbe_clean_rx_irq+0x7ac/0xb30
> [ixgbe]
> [1123306.014519] [<ffffffffa00cf07b>] ixgbe_poll+0x4bb/0x930 [ixgbe]
> [1123306.014524] [<ffffffff814fcb62>] net_rx_action+0x152/0x240
> [1123306.014528] [<ffffffff81077bf7>] __do_softirq+0xf7/0x290
> [1123306.014533] [<ffffffff8161635c>] call_softirq+0x1c/0x30
> [1123306.014539] [<ffffffff81015de5>] do_softirq+0x55/0x90
> [1123306.014543] [<ffffffff81077f95>] irq_exit+0x115/0x120
> [1123306.014546] [<ffffffff81616ef8>] do_IRQ+0x58/0xf0
> [1123306.014551] [<ffffffff8160c0ed>] common_interrupt+0x6d/0x6d
> [1123306.014553] <EOI> [<ffffffff814aa6d2>] ?
> cpuidle_enter_state+0x52/0xc0
> [1123306.014561] [<ffffffff814aa6c8>] ? cpuidle_enter_state+0x48/0xc0
> [1123306.014565] [<ffffffff814aa805>] cpuidle_idle_call+0xc5/0x200
> [1123306.014569] [<ffffffff8101d21e>] arch_cpu_idle+0xe/0x30
> [1123306.014574] [<ffffffff810c6945>] cpu_startup_entry+0xf5/0x290
> [1123306.014580] [<ffffffff810423ca>] start_secondary+0x1ba/0x230
> [1123306.014582] ---[ end trace 4d5a1bc838e1fcc0 ]---
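One way to check for this warning is to count matching lines in the kernel ring buffer and in the messages log. This is a minimal sketch, not a definitive diagnostic: the function name `count_bad_offload` and the `LOG_FILE` variable are hypothetical, and `/var/log/messages` is assumed to be where the host logs kernel messages.

```shell
#!/bin/sh
# Hypothetical helper: count kernel WARNING lines raised by
# skb_warn_bad_offload in log text read from stdin.
count_bad_offload() {
    grep -c 'WARNING:.*skb_warn_bad_offload'
}

# Assumed log location; override LOG_FILE if the host logs elsewhere.
LOG_FILE="${LOG_FILE:-/var/log/messages}"

# Kernel ring buffer (may need root on some systems).
echo "dmesg: $(dmesg 2>/dev/null | count_bad_offload)"

# Persistent log file, if it is readable.
if [ -r "$LOG_FILE" ]; then
    echo "$LOG_FILE: $(count_bad_offload < "$LOG_FILE")"
fi
```

A non-zero count on a node means the warning was logged there at least once.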
>
> If so, then could you try the following:
>
> ethtool -K <nic name> lro off
>
> Do this for all the 10G Intel NICs and check whether the problem still exists.
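
The step above can be applied to every ixgbe port in one pass. The following is a rough sketch under some assumptions: it finds the NICs through the `device/driver` symlink in sysfs, and the names `SYSFS_NET`, `DRIVER`, `DRY_RUN`, `nics_for_driver` and `disable_lro` are hypothetical helpers for this example, not part of ethtool or oVirt. Note that a setting made with `ethtool -K` does not persist across reboots.

```shell
#!/bin/sh
# Hypothetical knobs for this sketch: where to look for interfaces and
# which kernel driver to match.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"
DRIVER="${DRIVER:-ixgbe}"

# Print the names of all interfaces whose kernel driver is "$1".
# Virtual interfaces (lo, bridges, bonds) have no device/driver link
# and are skipped.
nics_for_driver() {
    for dev in "$SYSFS_NET"/*; do
        [ -L "$dev/device/driver" ] || continue
        drv=$(basename "$(readlink "$dev/device/driver")")
        if [ "$drv" = "$1" ]; then
            basename "$dev"
        fi
    done
    return 0
}

# Turn LRO off on one interface; with DRY_RUN=1 only print the command.
disable_lro() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ethtool -K $1 lro off"
    else
        ethtool -K "$1" lro off
    fi
}

for nic in $(nics_for_driver "$DRIVER"); do
    disable_lro "$nic"
done
```

Running it with `DRY_RUN=1` first shows which commands would be issued without changing any NIC settings.
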
>
> Kind regards,
>
> Jurriën Bloemen
>
> On 17-03-16 09:49, Johan Kooijman wrote:
>
> Hi all,
>
> Since we upgraded to the latest oVirt node running CentOS 7.2, we're seeing
> that nodes become unavailable after a while. A node runs fine, with a couple
> of VMs on it, until it becomes unresponsive. At that moment it doesn't
> even respond to ICMP. It'll come back by itself after a while, but oVirt
> fences the machine before that time and restarts the VMs elsewhere.
>
> Engine tells me this message:
>
> VDSM host09 command failed: Message timeout which can be caused by
> communication issues
>
> Is anyone else experiencing these issues with ixgbe drivers? I'm running
> on Intel X540-AT2 cards.
>
> --
> Met vriendelijke groeten / With kind regards,
> Johan Kooijman
>
>
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
--
Met vriendelijke groeten / With kind regards,
Johan Kooijman