On 24/05/17 12:57 +0200, Michal Skrivanek wrote:
Hi all,
we plan to improve the VM definition for high-performance workloads, which do
not require desktop-class devices and generally favor the highest possible
performance at the expense of flexibility.
We’re thinking of adding a new VM preset to the New VM dialog, in addition to the
current Desktop and Server ones, which would automatically pre-select existing
options in the right way and suggest/warn about suboptimal configuration.
All the preset values can be changed and all the warnings can be ignored. There are
a few things we have already identified as boosting performance and/or minimizing
the complexity of the VM, so we plan for the preset to:
- remove all graphical consoles and set the VM as headless, making it accessible by serial
console.
- disable all USB.
- disable soundcard.
- enable I/O Threads, just one for all disks by default.
- set host CPU passthrough (effectively disabling VM live migration), and add I/O
thread pinning, similar to the existing CPU pinning.
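To make the list above concrete, here is a minimal sketch of the preset as a function over a settings object. The `VmSettings` class and all of its field names are hypothetical, invented for illustration only; they are not the actual engine model. I/O thread pinning is left out because its shape is not decided yet.

```python
from dataclasses import dataclass, replace


@dataclass
class VmSettings:
    """Hypothetical subset of VM options, not the real engine model."""
    graphical_console: bool = True
    serial_console: bool = False
    usb_enabled: bool = True
    soundcard_enabled: bool = True
    io_threads: int = 0
    cpu_passthrough: bool = False
    migratable: bool = True


def apply_high_performance_preset(vm: VmSettings) -> VmSettings:
    """Return a copy with the preset's defaults; the user can still edit them."""
    return replace(
        vm,
        graphical_console=False,  # headless VM
        serial_console=True,      # still reachable via serial console
        usb_enabled=False,
        soundcard_enabled=False,
        io_threads=1,             # one I/O thread for all disks by default
        cpu_passthrough=True,
        migratable=False,         # passthrough effectively disables live migration
    )
```

Returning a modified copy rather than mutating in place keeps the original selection around, which may help if the dialog has to show what the preset changed.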
We plan to check for, and suggest, the following: CPU pinning, host topology ==
guest topology (number of cores per socket and threads per core should match),
matching host and guest NUMA topology, and I/O thread pinning.
A popup when the VM dialog is saved seems suitable.
As for the checks, I'd prefer to see a slightly more fine-grained
topology analysis. We don't need the guest topology to coincide with
the host's topology, but it needs to fit within it. So the checks should
be (ordered by topological significance):
1) #guest_numa_nodes <= #host_numa_nodes
2) #guest_sockets_per_node <= #host_sockets_per_node
3) #guest_cpus_per_socket <= #host_cpus_per_socket (this check has to
account for cores X threads difference)
4) guest_ram_per_node <= host_ram_per_node
These four checks guarantee that each guest NUMA node fits onto a host
NUMA node and that we're not requesting more nodes than we have at our
disposal. Now if these don't pass, each check can come with a suggestion
on how to proceed:
1) lower the number of numa nodes but increase sockets/cores/memory
per node
2) increase number of numa nodes
3) increase number of sockets
4) increase number of numa nodes
Should these checks fail, the VM can't be started in high performance mode.
In short, we've relaxed the requirement that host topology == guest
topology to host topology >= guest topology.
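A rough sketch of the four checks and their suggestions, to show the idea. The `Topology` class and its field names are made up for this example, and treating check 3 as a comparison of logical CPUs (cores times threads) is only one possible reading of "account for cores X threads difference".

```python
from dataclasses import dataclass


@dataclass
class Topology:
    """Hypothetical description of a host or guest layout."""
    numa_nodes: int
    sockets_per_node: int
    cores_per_socket: int
    threads_per_core: int
    ram_per_node_mib: int

    @property
    def cpus_per_socket(self) -> int:
        # Assumption: compare logical CPUs, i.e. cores multiplied by threads.
        return self.cores_per_socket * self.threads_per_core


def check_fit(guest: Topology, host: Topology) -> list[str]:
    """Return one suggestion per failed check, ordered by significance."""
    checks = [
        (guest.numa_nodes <= host.numa_nodes,
         "lower the number of NUMA nodes but increase sockets/cores/memory per node"),
        (guest.sockets_per_node <= host.sockets_per_node,
         "increase the number of NUMA nodes"),
        (guest.cpus_per_socket <= host.cpus_per_socket,
         "increase the number of sockets"),
        (guest.ram_per_node_mib <= host.ram_per_node_mib,
         "increase the number of NUMA nodes"),
    ]
    return [suggestion for passed, suggestion in checks if not passed]
```

An empty result means the guest fits (host topology >= guest topology); anything else would block starting the VM in high performance mode and could feed the popup text.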
If the checks pass, we should recommend pinning the NUMA nodes (with
strict/preferred pinning rather than interleave), NUMA
memory and CPUs to make sure the VM matches the topology. Additionally,
if the VM uses host devices (incl. SR-IOV), the pinning should be as
close as possible to the node where the devices' MMIO resides.
We should also suggest leaving some CPUs within the NUMA node free
and pinning iothreads/emulator to these.
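The "leave some CPUs for iothreads/emulator" suggestion could look roughly like this. The function name and the `reserved` parameter are hypothetical, and the sketch ignores hyperthread siblings, which a real implementation would probably want to keep together.

```python
def split_node_cpus(node_cpus: list[int], reserved: int = 1) -> tuple[list[int], list[int]]:
    """Split one host NUMA node's CPUs into (vcpu_pins, housekeeping_pins).

    `reserved` is a hypothetical knob: how many CPUs of the node are kept
    back for I/O threads and the emulator thread.
    """
    if not 0 < reserved < len(node_cpus):
        raise ValueError("need at least one reserved CPU and one CPU for vCPUs")
    cpus = sorted(node_cpus)
    # The highest-numbered CPUs of the node are kept for housekeeping threads.
    return cpus[:-reserved], cpus[-reserved:]
```

For a node with CPUs 0-3 and the default of one reserved CPU, vCPUs would be pinned to 0-2 and iothreads/emulator to CPU 3, so both stay on the same node as the guest memory.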
BTW what about virtio-scsi vs virtio-blk in this case? High
performance may be the case where virtio-blk is reasonable.
The currently identified tasks and their status can be followed on the Trello
card [1].
Please share your thoughts, questions, any kind of feedback…
Thanks,
michal
[1] https://trello.com/c/MHRDD8ZO