[Engine-devel] ThreadPoolUtil with 500 threads and no queue?

Hello all,
What is the reasoning behind the decision to have a pool with a maximum of 500 threads and no job queue (see ThreadPoolUtil.java)? Wouldn't it make more sense to have a much smaller thread pool and a potentially large queue of jobs?
Regards,
Juan Hernandez
--
Business address: C/Jose Bardasano Baos, 9, Edif. Gorbea 3, planta 3ºD, 28016 Madrid, Spain
Registered in the Madrid Mercantile Registry – C.I.F. B82657941 - Red Hat S.L.

----- Original Message -----
From: "Juan Hernandez" <jhernand@redhat.com> To: engine-devel@ovirt.org Sent: Wednesday, December 12, 2012 6:06:30 PM Subject: [Engine-devel] ThreadPoolUtil with 500 threads and no queue?
Hello all,
What is the reasoning behind the decision to have a pool with a maximum of 500 threads and no job queue (see ThreadPoolUtil.java)? Wouldn't it make more sense to have a much smaller thread pool and a potentially large queue of jobs?
Regards, Juan Hernandez
There are three general strategies for queuing:
1) Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
2) Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each other's execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
3) Bounded queues. A bounded queue (for example, an ArrayBlockingQueue) helps prevent resource exhaustion when used with finite maximumPoolSizes, but can be more difficult to tune and control. Queue sizes and maximum pool sizes may be traded off for each other: Using large queues and small pools minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. If tasks frequently block (for example if they are I/O bound), a system may be able to schedule time for more threads than you otherwise allow. Use of small queues generally requires larger pool sizes, which keeps CPUs busier but may encounter unacceptable scheduling overhead, which also decreases throughput.
Why not? We are using 1). Actually, 500 threads should be enough for very big applications.
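For readers following along, the three strategies quoted above correspond to different ThreadPoolExecutor constructor arguments. The following is only a minimal sketch under stated assumptions: the class QueuingStrategies, its method names, and all the sizes are made up for illustration and are not taken from ThreadPoolUtil.java. It counts how many blocking tasks each kind of pool accepts before the default AbortPolicy rejects one.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: names and sizes are assumptions, not the engine's code.
class QueuingStrategies {

    // 1) Direct handoff: the queue holds nothing, so every task needs an
    //    idle thread or a newly created one (up to maxThreads).
    static ThreadPoolExecutor directHandoff(int maxThreads) {
        return new ThreadPoolExecutor(0, maxThreads, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // 2) Unbounded queue: never more than coreThreads threads; extra tasks wait.
    static ThreadPoolExecutor unboundedQueue(int coreThreads) {
        return new ThreadPoolExecutor(coreThreads, coreThreads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // 3) Bounded queue: tasks wait in the queue first; threads beyond
    //    coreThreads are created only once the queue is full.
    static ThreadPoolExecutor boundedQueue(int coreThreads, int maxThreads, int queueSize) {
        return new ThreadPoolExecutor(coreThreads, maxThreads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
    }

    // Submits tasks that block on a latch until the pool rejects one or
    // `limit` tasks have been accepted; returns the number accepted.
    static int acceptedBeforeRejection(ThreadPoolExecutor pool, int limit) {
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < limit; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // The pool is saturated: all threads are busy and the queue is full.
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("direct handoff (max 3): "
                + acceptedBeforeRejection(directHandoff(3), 10));
        System.out.println("unbounded queue (core 2): "
                + acceptedBeforeRejection(unboundedQueue(2), 10));
        System.out.println("bounded queue (core 1, max 2, queue 3): "
                + acceptedBeforeRejection(boundedQueue(1, 2, 3), 10));
    }
}
```

With these toy sizes the direct-handoff pool accepts 3 tasks (one per thread), the unbounded pool accepts all 10, and the bounded pool accepts 5 (two threads plus three queued). A direct-handoff pool capped at 500 threads would behave like the first case, only with a much higher ceiling.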

On 12/12/2012 05:21 PM, Michael Kublin wrote:
Why not? We are using 1). Actually, 500 threads should be enough for very big applications.
I think that 500 is maybe too many, even for a very big application, for the reasons you explain very well in 3). A resource that we can very easily overload is the database: if those 500 threads all happen to need database connections, we have a problem. I think we should use a bounded queue and make both the number of threads and the size of the queue configurable. Does that make sense?
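A bounded queue with a configurable thread count and queue size could look roughly like the sketch below. Everything here is an assumption for illustration: the class ConfigurablePool, the property keys, and the default values 50 and 1000 are invented placeholders, not existing engine options.

```java
import java.util.Properties;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: the keys and defaults below are hypothetical.
class ConfigurablePool {
    static final String THREADS_KEY = "engine.pool.threads";      // hypothetical key
    static final String QUEUE_SIZE_KEY = "engine.pool.queueSize"; // hypothetical key

    // Builds a pool with a fixed number of threads and a bounded queue.
    // Tasks queue up once all threads are busy and are rejected with
    // RejectedExecutionException (AbortPolicy) once the queue is full.
    static ThreadPoolExecutor fromProperties(Properties config) {
        int threads = Integer.parseInt(config.getProperty(THREADS_KEY, "50"));
        int queueSize = Integer.parseInt(config.getProperty(QUEUE_SIZE_KEY, "1000"));
        return new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty(THREADS_KEY, "20");
        config.setProperty(QUEUE_SIZE_KEY, "200");
        ThreadPoolExecutor pool = fromProperties(config);
        System.out.println("threads=" + pool.getMaximumPoolSize()
                + ", queue capacity=" + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

Keeping the thread count at or below the database connection pool size is one way to make sure the pool itself cannot exhaust connections, which is the overload scenario described above.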

----- Original Message -----
From: "Juan Hernandez" <jhernand@redhat.com> To: "Michael Kublin" <mkublin@redhat.com> Cc: engine-devel@ovirt.org Sent: Wednesday, December 12, 2012 6:31:44 PM Subject: Re: [Engine-devel] ThreadPoolUtil with 500 threads and no queue?
I think that 500 is maybe too many, even for a very big application, for the reasons you explain very well in 3). A resource that we can very easily overload is the database: if those 500 threads all happen to need database connections, we have a problem.
I think we should use a bounded queue and have both the number of threads and the size of the queue configurable. Does that make sense?
Actually, I think it is more complicated. Today most of the threads are used to perform XML-RPC calls and to parallelize some operations (for example, connecting all hosts to a storage domain); most user actions do not open a new thread in order to retrieve information from the DB. I think most of the load on the DB today comes from scheduled monitoring jobs, which run every couple of seconds and perform an enormous number of queries. If we want some kind of queue, I think it should be implemented at the business logic level: if we decide that we cannot run an action (a user action or an internal one) because of load, we can reject it. The thread pool is an internal mechanism that cannot solve our design problems; we can tune it, but it is not the main cause of the problem.
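One possible shape for load-based rejection at the business logic level is an admission gate in front of the actions, independent of the thread pool. This is a rough sketch only: ActionGate and tryRun are hypothetical names, and a simple count of concurrent actions stands in for whatever load measure would really be used.

```java
import java.util.concurrent.Semaphore;

// Illustrative only: a hypothetical admission gate, not existing engine code.
class ActionGate {
    private final Semaphore permits;

    ActionGate(int maxConcurrentActions) {
        this.permits = new Semaphore(maxConcurrentActions);
    }

    // Runs the action on the calling thread if a permit is free.
    // Returns false immediately, without running it, when the limit is reached.
    boolean tryRun(Runnable action) {
        if (!permits.tryAcquire()) {
            return false; // rejected because of load
        }
        try {
            action.run();
            return true;
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        ActionGate gate = new ActionGate(1);
        boolean outer = gate.tryRun(() -> {
            // The single permit is held here, so a nested action is rejected.
            boolean inner = gate.tryRun(() -> System.out.println("never printed"));
            System.out.println("inner accepted: " + inner);
        });
        System.out.println("outer accepted: " + outer);
    }
}
```

The caller decides what a rejection means (return an error to the user, retry later, skip one monitoring cycle), which is a decision the thread pool cannot make on its own.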

On 12/12/12 18:06, Juan Hernandez wrote:
Hello all,
What is the reasoning behind the decision to have a pool with a maximum of 500 threads and no job queue (see ThreadPoolUtil.java)? Wouldn't it make more sense to have a much smaller thread pool and a potentially large queue of jobs?
Hi Juan,
I think there is no right or wrong number; as Kublin noted in this thread, there are several approaches to this issue. My 2 cents: a change should be based on a given workload profile. Any given solution will suit one specific workload and hurt another. As long as we are not sure what workload is common among our users, I would make any change configurable and write a recommendation on how to configure it. To write such a document, one should characterize a few typical usages of the system and test which configuration works best for each.
Livnat
participants (3)
- Juan Hernandez
- Livnat Peer
- Michael Kublin