Sampling threads reduction

Hi everyone,

Over the last few days I've been working on a concept to reduce the number of threads in VDSM.

Currently, one of the biggest sources of them is the VM statistics sampling: we have one thread per VM in charge of it, and this obviously scales poorly. I tried for quite some time to solve this at the libvirt level, but according to my findings and feelings, that would require a huge patch.

While studying libvirt, I came up with a simple idea which, according to the initial tests, seems to work quite nicely and which I'd like to discuss.

The concept is to start with a thread pool (http://en.wikipedia.org/wiki/Thread_pool) and to add the few additions needed by VDSM:

1. to detect and take care of 'stuck' tasks
2. to avoid a 'stuck' task depleting the worker pool
3. to avoid leaking threads on a 'stuck' task

I failed to find an existing thread pool which had those additions, so I wrote a new one from scratch:

https://github.com/mojaves/vdsm/tree/master/lib/threadpool

Then I spawned a new subpackage (a la zombiereaper), consolidated the existing thread pool (from the storage subsystem), and added some longer documentation. Please note that the storage code had no changes except the trivial import fixes.

Lastly, I've also added a small compatibility module for concurrent.futures, a very nice Python module which provides a convenient interface to asynchronously execute callables. This module has been included in Python since 3.2 (https://docs.python.org/3.3/library/concurrent.futures.html#module-concurren...) and there is a backport for Python 2.x. This can also allow us (as in the virt group) to consolidate all the long-running async operations using the same interface and code.

Please note that I reimplemented a thread pool mostly to be able to experiment with the concepts listed above, which I failed to find implemented in existing packages.
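To make the three additions listed above more concrete, here is a minimal Python 3 sketch. It is not the code from the repository linked above: the names WatchedPool, submit and check are made up for illustration, and reading point 3 as "a replaced worker exits as soon as its task finally returns" is an assumption.

```python
import queue
import threading
import time


class WatchedPool:
    """Toy pool that replaces workers stuck on a task (hypothetical names)."""

    def __init__(self, size, timeout):
        self._tasks = queue.Queue()
        self._timeout = timeout   # seconds a task may run before it counts as stuck
        self._lock = threading.Lock()
        self._started = {}        # live worker -> start time of its task, None if idle
        for _ in range(size):
            self._spawn()

    def _spawn(self):
        worker = threading.Thread(target=self._run, daemon=True)
        with self._lock:
            self._started[worker] = None
        worker.start()

    def _run(self):
        me = threading.current_thread()
        while True:
            func = self._tasks.get()
            with self._lock:
                self._started[me] = time.monotonic()
            try:
                func()
            except Exception:
                pass  # real code would log the failure
            with self._lock:
                if me not in self._started:
                    return  # replaced while stuck: exit instead of lingering
                self._started[me] = None

    def submit(self, func):
        self._tasks.put(func)

    def check(self):
        """Replace workers busy for longer than the timeout; return how many."""
        now = time.monotonic()
        with self._lock:
            stuck = [w for w, t in self._started.items()
                     if t is not None and now - t > self._timeout]
            for w in stuck:
                del self._started[w]
        for _ in stuck:
            self._spawn()
        return len(stuck)
```

A supervisor would call check() periodically. Python cannot kill a thread, so a stuck worker is only abandoned and replaced, which keeps the number of usable workers constant; the abandoned thread goes away by itself once its task returns.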
I'm fine with seeing them reimplemented elsewhere, since I now believe they collectively provide a viable solution for us.

vdsm/virt/sampling.py has been my testbed, and the patch came out nicely.

This is the bulk of the work:
https://github.com/mojaves/vdsm/commit/2b4c96f9ca3566f0c2f1426beff4400d5311e...

Here are all the changes, most of them small adjustments:
https://github.com/mojaves/vdsm/commits/master/vdsm/virt/sampling.py

And here is how it will look:
https://github.com/mojaves/vdsm/blob/master/vdsm/virt/sampling.py

I'd like the sampling thread mess sorted out in time for 3.6, so please share your thoughts!

Thanks and best regards,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani