On Tue, Mar 21, 2017 at 5:00 PM, FERNANDO FREDIANI <fernando.frediani(a)upx.com> wrote:
Hi Yaniv
On 21/03/2017 06:19, Yaniv Kaul wrote:
Does your host have NUMA support (multiple sockets)? Are all your
interfaces connected to the same socket? Perhaps one is on the 'other'
socket (a different PCI bus, etc.)? This can introduce latency.
In general, you would want to align everything, from host (interrupts of
the drivers) all the way to the guest to perform the processing on the same
socket.
I believe it is. Look:
~]# dmesg | grep -i numa
[ 0.000000] Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
[ 0.693082] pci_bus 0000:00: on NUMA node 0
[ 0.696457] pci_bus 0000:40: on NUMA node 1
[ 0.700678] pci_bus 0000:3f: on NUMA node 0
[ 0.704844] pci_bus 0000:7f: on NUMA node 1
So there are 2 NUMA nodes on the host? And where are the NICs located?
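One way to answer the "where are the NICs" question is to read the NUMA node the kernel records for each interface in sysfs. The following is only a rough sketch: the function names are made up for illustration, and it assumes the usual `/sys/class/net/<iface>/device/numa_node` layout. Virtual devices (bonds, bridges, VLANs) have no `device` entry and show up as `None`.

```python
from pathlib import Path
from typing import Dict, Optional


def nic_numa_node(iface: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Return the NUMA node recorded for a NIC, or None if it cannot be read.

    A value of -1 means the kernel has no NUMA affinity recorded for the device.
    """
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # No 'device' entry (virtual interface) or unreadable content.
        return None


def numa_map(sysfs_root: str = "/sys") -> Dict[str, Optional[int]]:
    """Map every entry under <sysfs_root>/class/net to its NUMA node."""
    net_dir = Path(sysfs_root) / "class" / "net"
    if not net_dir.is_dir():
        return {}
    return {p.name: nic_numa_node(p.name, sysfs_root) for p in sorted(net_dir.iterdir())}
```

Calling `numa_map()` on the host and comparing the result with the pci_bus lines above should show whether the bond members sit on node 0 or node 1.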
The thing is, if it was something affecting the underlying network layer
(drivers for the physical NICs, for example) it would affect all traffic to
the VM, not just the traffic going in/out via vNIC1, right?
Most likely.
Layer 2+3 may or may not provide you with good distribution across the
physical links, depending on the traffic. Layer 3+4 hashing is better, but
is not entirely compliant with all vendors/equipment.
Yes, I have tested with both and both work well. I have settled on layer2+3,
as it balances the traffic as evenly as layer3+4 does in my scenario.
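To illustrate why the two policies can distribute traffic differently, here is a simplified model of the two hashes. It loosely follows the description in the kernel bonding documentation (XOR the addresses, fold the result, take it modulo the slave count), but it is not the kernel's exact code, and the function names and sample addresses are made up for illustration.

```python
import ipaddress


def _fold(value: int, n_slaves: int) -> int:
    """Fold a 32-bit value and reduce it modulo the number of slaves."""
    value &= 0xFFFFFFFF
    value ^= value >> 16
    value ^= value >> 8
    return value % n_slaves


def _ip_xor(src_ip: str, dst_ip: str) -> int:
    return int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))


def layer2_3_slave(src_mac: str, dst_mac: str, src_ip: str, dst_ip: str, n_slaves: int) -> int:
    """Simplified layer2+3: only MAC and IP addresses feed the hash."""
    mac_bits = int(src_mac.replace(":", ""), 16) ^ int(dst_mac.replace(":", ""), 16)
    return _fold(mac_bits ^ _ip_xor(src_ip, dst_ip), n_slaves)


def layer3_4_slave(src_ip: str, dst_ip: str, src_port: int, dst_port: int, n_slaves: int) -> int:
    """Simplified layer3+4: the TCP/UDP ports also feed the hash."""
    port_bits = (src_port << 16) | dst_port
    return _fold(port_bits ^ _ip_xor(src_ip, dst_ip), n_slaves)
```

In this model every flow between the same two hosts lands on the same slave with layer2+3, while with layer3+4 different source ports can select different slaves. With thousands of distinct clients, as in this case, both policies have plenty of address variety to work with.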
Initially I guessed it could be the bonding, but I ruled that out when
I tested with another physical interface that has no bonding and
the same problem happened for the VM in question.
Linux is not always happy with multiple interfaces on the same L2 network.
I think some parameters need to be set to make it happy?
Yes, you are right. Knowing that, I have configured PBR (policy-based
routing) using iproute2, which keeps Linux happy in this scenario. Works like a charm.
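For readers unfamiliar with this kind of setup, the sketch below generates the sort of iproute2 commands typically used for source-based policy routing with two interfaces on the same subnet. It is not the configuration actually used on this VM: the interface names, addresses (taken from the 192.0.2.0/24 documentation range), gateway and table numbers are all hypothetical, and the helper only handles IPv4.

```python
import ipaddress
from typing import List


def pbr_commands(iface: str, cidr: str, gateway: str, table: int) -> List[str]:
    """Build iproute2 commands that route replies out of the interface owning the source IP.

    iface, cidr, gateway and table are illustrative values supplied by the caller.
    """
    addr = ipaddress.ip_interface(cidr)
    return [
        # Connected subnet, reachable through this interface, in its own table.
        f"ip route add {addr.network} dev {iface} src {addr.ip} table {table}",
        # Default route for that table.
        f"ip route add default via {gateway} dev {iface} table {table}",
        # Traffic sourced from this address consults that table.
        f"ip rule add from {addr.ip}/32 table {table}",
    ]


for line in pbr_commands("eth0", "192.0.2.10/24", "192.0.2.1", 100):
    print(line)
for line in pbr_commands("eth1", "192.0.2.11/24", "192.0.2.1", 101):
    print(line)
```

The printed commands are meant to be reviewed and adapted, not run blindly.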
BTW, since those are virtual interfaces, why do you need two on the same
VLAN?
That can explain it. Ideally, you need to also streamline the processing
in the guest. The relevant application should be on the same NUMA node as
the vCPU processing the virtio-net interrupts.
In your case, the VM sees a single NUMA node - does that match the
underlying host architecture as well?
Not sure. The command line from qemu-kvm is automatically generated by
oVirt. Perhaps some extra option needs to be changed under Advanced Parameters
in the VM CPU configuration? Also, I was wondering if enabling "IO Threads
Enabled" under Resource Allocation could be of any help.
IO threads are for IO (= storage, perhaps it's not clear and we need to
clarify it) and only useful with a large number of disks (and IO, of course).
To finish, I am more inclined to think that the problem is restricted to the
VM, not to the host (drivers, physical NICs, etc.), given that the packet loss
happens on vNIC1 and not on vNIC2, which has no traffic. If it was at the host
level or in the bonding, it would affect the VM's traffic on both vNICs.
As a last resort I am considering adding 2 extra vCPUs to the VMs, but I
guess that will only reduce the problem. Does anyone think that "Threads per
Core" or IO Threads could be a better choice?
Are you using hyper-threading on the host? Otherwise, I'm not sure threads
per core would help.
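Before adding vCPUs, it may be worth checking inside the guest whether a single vCPU is saturated (for example by softirq work for one NIC) while the average stays moderate. A minimal sketch, assuming the standard `/proc/stat` per-CPU field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration:

```python
from typing import Dict, List


def parse_proc_stat(text: str) -> Dict[str, List[int]]:
    """Return the first eight counters of each 'cpuN' line in /proc/stat text."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the aggregate 'cpu' line and non-CPU lines.
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            cpus[parts[0]] = [int(v) for v in parts[1:9]]
    return cpus


def cpu_usage(before: str, after: str) -> Dict[str, Dict[str, float]]:
    """Compute busy% and softirq% per CPU between two /proc/stat snapshots."""
    old, new = parse_proc_stat(before), parse_proc_stat(after)
    usage = {}
    for name, counters in new.items():
        if name not in old:
            continue
        delta = [n - o for n, o in zip(counters, old[name])]
        total = sum(delta)
        if total == 0:
            continue
        idle = delta[3] + delta[4]  # idle + iowait
        usage[name] = {
            "busy": 100 * (total - idle) / total,
            "softirq": 100 * delta[6] / total,
        }
    return usage
```

Taking two snapshots of `/proc/stat` a few seconds apart under load and passing them to `cpu_usage` would show whether one vCPU sits near 100% busy with a high softirq share.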
Y.
Thanks
Fernando
On 18/03/2017 12:53, Yaniv Kaul wrote:
On Fri, Mar 17, 2017 at 6:11 PM, FERNANDO FREDIANI <fernando.frediani(a)upx.com> wrote:
> Hello all.
>
> I have a peculiar problem here which perhaps others may have had or know
> about and can advise.
>
> I have a Virtual Machine with 2 VirtIO NICs. This VM serves around 1Gbps of
> traffic with thousands of clients connecting to it. When I do a packet loss
> test to the IP pinned to NIC1 it varies from 3% to 10% of packet loss. When
> I run the same test on NIC2 the packet loss is consistently 0%.
>
> From what I gather, it may have something to do with a possible lack of
> multi-queue VirtIO, where NIC1 is handled by a single CPU which might be
> hitting 100% and causing this packet loss.
>
> Looking at this reference (https://fedoraproject.org/wiki/Features/MQ_virtio_net)
> I see one way to test it is to start the VM with 4 queues (for example), but
> checking the qemu-kvm process I don't see that option present. Is there any
> way I can force it from the Engine?
>
I don't see a need for multi-queue for 1Gbps.
Can you share the host statistics, the network configuration, the qemu-kvm
command line, etc.?
What is the difference between NIC1 and NIC2, in the way they are
connected to the outside world?
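When sharing the qemu-kvm command line, a quick way to see whether any `-netdev` carries the `queues=N` option mentioned above is to parse the arguments. This is only a sketch: the function name and the sample command line are hypothetical, and it only looks for a literal `queues=` key, so a backend configured some other way would be reported as 1.

```python
import shlex
from typing import Dict, List


def netdev_queues(args: List[str]) -> Dict[str, int]:
    """Map each -netdev id to its queues= value (1 when the option is absent)."""
    result = {}
    for flag, value in zip(args, args[1:]):
        if flag != "-netdev":
            continue
        backend, _, rest = value.partition(",")
        opts = dict(item.split("=", 1) for item in rest.split(",") if "=" in item)
        result[opts.get("id", backend)] = int(opts.get("queues", 1))
    return result


# Hypothetical command line, for illustration only.
sample = (
    "qemu-kvm -netdev tap,fd=33,id=hostnet0,vhost=on "
    "-device virtio-net-pci,netdev=hostnet0 "
    "-netdev tap,id=hostnet1,queues=4"
)
print(netdev_queues(shlex.split(sample)))
```

For a running process, the argument list can be obtained by splitting `/proc/<pid>/cmdline` on NUL bytes instead of using `shlex`.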
>
> This other reference (https://www.linux-kvm.org/page/Multiqueue#Enable_MQ_feature)
> points in the same direction, about starting the VM with queues=N.
>
> Also trying to increase the TX ring buffer within the guest with ethtool
> -g eth0 is not possible.
>
> Oh, by the way, the load on the VM is significantly high even though the CPU
> usage isn't above 50% - 60% on average.
>
Load = latest 'top' results? Vs. CPU usage? Can mean a lot of processes
waiting for CPU and doing very little - typical for web servers, for
example. What is occupying the CPU?
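To put the two numbers side by side: on Linux the load average counts tasks that are runnable or in uninterruptible sleep, so it is usually read relative to the number of CPUs rather than as a percentage. A small sketch that normalizes the contents of `/proc/loadavg` by the vCPU count (the function name is made up for illustration):

```python
from typing import Tuple


def load_per_cpu(loadavg_text: str, n_cpus: int) -> Tuple[float, float, float]:
    """Return the 1, 5 and 15 minute load averages divided by the CPU count.

    loadavg_text is the content of /proc/loadavg, e.g. "8.00 6.00 4.00 3/500 1234".
    """
    if n_cpus <= 0:
        raise ValueError("n_cpus must be positive")
    one, five, fifteen = (float(v) for v in loadavg_text.split()[:3])
    return one / n_cpus, five / n_cpus, fifteen / n_cpus
```

Values well above 1.0 per CPU while utilization stays around 50% - 60% would suggest many tasks queuing or blocked rather than CPUs being fully busy.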
Y.
>
> Thanks
> Fernando