[ovirt-users] Strange network performance on VirtIO VM NIC

Yaniv Kaul ykaul at redhat.com
Tue Mar 21 15:31:53 UTC 2017


On Tue, Mar 21, 2017 at 5:00 PM, FERNANDO FREDIANI <
fernando.frediani at upx.com> wrote:

> Hi Yaniv
> On 21/03/2017 06:19, Yaniv Kaul wrote:
>
>
> Does your host have NUMA support (multiple sockets)? Are all your
> interfaces connected to the same socket? Perhaps one is on the 'other'
> socket (a different PCI bus, etc.)? This can introduce latency.
> In general, you would want to align everything, from the host (the
> drivers' interrupts) all the way to the guest, so that the processing
> happens on the same socket.
>
> I believe it is. Look:
> ~]# dmesg | grep -i numa
> [    0.000000] Enabling automatic NUMA balancing. Configure with
> numa_balancing= or the kernel.numa_balancing sysctl
> [    0.693082] pci_bus 0000:00: on NUMA node 0
> [    0.696457] pci_bus 0000:40: on NUMA node 1
> [    0.700678] pci_bus 0000:3f: on NUMA node 0
> [    0.704844] pci_bus 0000:7f: on NUMA node 1
>

So there are 2 NUMA nodes on the host? And where are the NICs located?
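
A rough way to check (the interface names below are just examples - adjust
to your actual NIC names):

  # NUMA node each physical NIC is attached to (-1 = no affinity reported)
  cat /sys/class/net/eth0/device/numa_node
  cat /sys/class/net/eth1/device/numa_node
  # overall topology: which CPUs and memory belong to which node
  numactl --hardware
  lscpu | grep -i numa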


>
> The thing is, if it were something affecting the underlying network layer
> (drivers for the physical NICs, for example) it would affect all traffic to
> the VM, not just the traffic going in/out via vNIC1, right?
>

Most likely.


>
>
> Layer 2+3 may or may not provide you with good distribution across the
> physical links, depending on the traffic. Layer 3+4 hashing is better, but
> is not entirely compliant with all vendors/equipment.
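
If you want to double-check what the bond is actually doing, something like
this should show the active mode and hash policy (assuming the bond is
called bond0; the ifcfg file below is the RHEL/CentOS style):

  cat /proc/net/bonding/bond0    # shows bonding mode and transmit hash policy
  # the policy itself is a bonding option, e.g. in ifcfg-bond0:
  # BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer2+3"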
>
> Yes, I have tested both and both work well. I have settled on layer2+3
> as it balances the traffic as equally as layer3+4 in my scenario.
> Initially I guessed it could be the bonding, but I ruled that out when
> I tested with another physical interface that has no bonding and the
> problem was exactly the same for the VM in question.
>
> Linux is not always happy with multiple interfaces on the same L2 network.
> I think there are some params that need to be set to make it happy?
>
> Yes, you are right, and knowing that, I have configured PBR (policy-based
> routing) using iproute2, which makes Linux happy in this scenario. Works
> like a charm.
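
(For reference, a minimal sketch of that kind of iproute2 setup - the
addresses, interface and table names below are made up for illustration:)

  # route replies out of the interface the traffic arrived on
  echo "200 vnic2" >> /etc/iproute2/rt_tables
  ip rule add from 192.0.2.20/32 table vnic2
  ip route add default via 192.0.2.1 dev eth1 table vnic2
  # optionally stop the kernel answering ARP on the "wrong" interface
  sysctl -w net.ipv4.conf.all.arp_filter=1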
>

BTW, since those are virtual interfaces, why do you need two on the same
VLAN?


>
>
>
> That can explain it.  Ideally, you need to also streamline the processing
> in the guest. The relevant application should be on the same NUMA node as
> the vCPU processing the virtio-net interrupts.
> In your case, the VM sees a single NUMA node - does that match the
> underlying host architecture as well?
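
One way to compare the two sides (a sketch, assuming the VM is named 'myvm'
in libvirt and you can run virsh on the host):

  # host side: topology and where the guest's vCPUs are actually running
  numactl --hardware
  virsh vcpuinfo myvm
  # guest side: what the VM itself sees
  numactl --hardware    # or: lscpu | grep -i numa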
>
> Not sure. The qemu-kvm command line is automatically generated by
> oVirt. Perhaps some extra option needs to be changed under Advanced
> Parameters in the VM CPU configuration? I was also wondering if enabling
> "IO Threads Enabled" under Resource Allocation could be of any help.
>

IO threads are for IO (= storage; perhaps that's not clear and we need to
clarify it) and are only useful with a large number of disks (and IO, of course).
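
(If you want to see whether any are defined for this VM, a quick check on
the host - 'myvm' is a placeholder for the actual VM name:)

  virsh dumpxml myvm | grep -i iothread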


>
> To finish, I am more inclined to believe the problem is restricted to the
> VM, not to the host (drivers, physical NICs, etc.), given that the packet
> loss happens on vNIC1 and not on vNIC2, which carries no traffic. If it
> were at the host level or in the bonding, it would affect the whole VM
> traffic on both vNICs.
> As a last resort I am considering adding an extra 2 vCPUs to the VM, but I
> guess that will only reduce the problem. Does anyone think that "Threads
> per Core" or IO Threads could be a better choice?
>

Are you using hyper-threading on the host? Otherwise, I'm not sure threads
per core would help.
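
(Quick way to check on the host:)

  lscpu | grep -E 'Thread|Core|Socket'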
Y.



>
> Thanks
> Fernando
>
>
> On 18/03/2017 12:53, Yaniv Kaul wrote:
>
>
>
> On Fri, Mar 17, 2017 at 6:11 PM, FERNANDO FREDIANI <
> fernando.frediani at upx.com> wrote:
>
>> Hello all.
>>
>> I have a peculiar problem here which perhaps others may have had or know
>> about and can advise.
>>
>> I have a Virtual Machine with 2 VirtIO NICs. This VM serves around 1Gbps
>> of traffic, with thousands of clients connecting to it. When I run a
>> packet loss test against the IP pinned to NIC1, it shows between 3% and
>> 10% packet loss. When I run the same test against NIC2, the packet loss is
>> consistently 0%.
>>
>> From what I gather, it may have something to do with a possible lack of
>> multi-queue VirtIO, where NIC1 is handled by a single CPU which might be
>> hitting 100% and causing this packet loss.
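
(A quick way to check that theory from inside the guest - the IRQ and
interface names will differ on your system:)

  # which vCPU is servicing the virtio-net queue interrupts
  grep virtio /proc/interrupts
  # per-CPU utilisation; watch for one CPU pegged in %irq/%soft
  mpstat -P ALL 1    # from the sysstat package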
>>
>> Looking at this reference (https://fedoraproject.org/wiki/Features/MQ_virtio_net)
>> I see one way to test it is to start the VM with 4 queues (for example),
>> but checking the qemu-kvm process I don't see the option present. Is there
>> any way I can force it from the Engine?
>>
>
> I don't see a need for multi-queue for 1Gbps.
> Can you share the host statistics, the network configuration, the qemu-kvm
> command line, etc.?
> What is the difference between NIC1 and NIC2, in the way they are
> connected to the outside world?
>
>
>>
>> This other reference (https://www.linux-kvm.org/page/Multiqueue#Enable_MQ_feature)
>> points in the same direction, about starting the VM with queues=N.
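
(For reference, a sketch of what using it looks like once the vNIC has been
started with queues=N - in oVirt that setting has to come from the engine
side, if I remember correctly via a custom property on the vNIC profile,
rather than by editing the qemu command line by hand. The guest then has to
switch the extra queues on:)

  # inside the guest, after the vNIC was created with multiple queues
  ethtool -l eth0               # show how many queues are available
  ethtool -L eth0 combined 4    # enable 4 queues (example value)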
>>
>> Also trying to increase the TX ring buffer within the guest with ethtool
>> -g eth0 is not possible.
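
(For what it's worth, lowercase -g only reads the ring sizes and uppercase
-G sets them; on many virtio-net driver versions the ring size is simply
fixed, so both just report that the operation is not supported:)

  ethtool -g eth0            # show current RX/TX ring sizes
  ethtool -G eth0 tx 1024    # attempt to grow the TX ring (example value)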
>>
>> Oh, by the way, the load on the VM is significantly high, even though the
>> CPU usage isn't above 50%-60% on average.
>>
>
> Load = the latest 'top' results? Versus CPU usage? It can mean a lot of
> processes waiting for CPU and doing very little - typical for web servers,
> for example. What is occupying the CPU?
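
(A quick way to tell run-queue pressure apart from actual CPU burn:)

  vmstat 1    # the 'r' column is the number of runnable tasks (running or waiting for CPU)
  top         # compare the load average against the %Cpu(s) line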
> Y.
>
>
>>
>> Thanks
>> Fernando
>>
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
>
>
>

