[Engine-devel] ThreadPoolUtil with 500 threads and no queue?
Michael Kublin
mkublin at redhat.com
Wed Dec 12 17:37:26 UTC 2012
----- Original Message -----
> From: "Juan Hernandez" <jhernand at redhat.com>
> To: "Michael Kublin" <mkublin at redhat.com>
> Cc: engine-devel at ovirt.org
> Sent: Wednesday, December 12, 2012 6:31:44 PM
> Subject: Re: [Engine-devel] ThreadPoolUtil with 500 threads and no queue?
>
> On 12/12/2012 05:21 PM, Michael Kublin wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Juan Hernandez" <jhernand at redhat.com>
> >> To: engine-devel at ovirt.org
> >> Sent: Wednesday, December 12, 2012 6:06:30 PM
> >> Subject: [Engine-devel] ThreadPoolUtil with 500 threads and no
> >> queue?
> >>
> >> Hello all,
> >>
> >> What is the reasoning behind the decision to have a pool with a
> >> maximum
> >> of 500 threads and no job queue (see ThreadPoolUtil.java)?
> >> Wouldn't
> >> it
> >> make more sense to have a much smaller thread pool and a
> >> potentially
> >> large queue of jobs?
> >>
> >> Regards,
> >> Juan Hernandez
> >
> > There are three general strategies for queuing:
> >
> > 1) Direct handoffs. A good default choice for a work queue is a
> > SynchronousQueue that hands off tasks to threads without
> > otherwise holding them. Here, an attempt to queue a task will
> > fail if no threads are immediately available to run it, so a new
> > thread will be constructed. This policy avoids lockups when
> > handling sets of requests that might have internal dependencies.
> > Direct handoffs generally require unbounded maximumPoolSizes to
> > avoid rejection of new submitted tasks. This in turn admits the
> > possibility of unbounded thread growth when commands continue to
> > arrive on average faster than they can be processed.
> > 2) Unbounded queues. Using an unbounded queue (for example a
> > LinkedBlockingQueue without a predefined capacity) will cause
> > new tasks to wait in the queue when all corePoolSize threads are
> > busy. Thus, no more than corePoolSize threads will ever be
> > created. (And the value of the maximumPoolSize therefore doesn't
> > have any effect.) This may be appropriate when each task is
> > completely independent of others, so tasks cannot affect each
> > other's execution; for example, in a web page server. While this
> > style of queuing can be useful in smoothing out transient bursts
> > of requests, it admits the possibility of unbounded work queue
> > growth when commands continue to arrive on average faster than
> > they can be processed.
> > 3) Bounded queues. A bounded queue (for example, an
> > ArrayBlockingQueue) helps prevent resource exhaustion when used
> > with finite maximumPoolSizes, but can be more difficult to tune
> > and control. Queue sizes and maximum pool sizes may be traded
> > off for each other: Using large queues and small pools minimizes
> > CPU usage, OS resources, and context-switching overhead, but can
> > lead to artificially low throughput. If tasks frequently block
> > (for example if they are I/O bound), a system may be able to
> > schedule time for more threads than you otherwise allow. Use of
> > small queues generally requires larger pool sizes, which keeps
> > CPUs busier but may encounter unacceptable scheduling overhead,
> > which also decreases throughput.
> >
> > Why not? We are using 1).
> > Actually, 500 threads should be enough even for very big applications.
>
> I think that 500 is probably too many, even for a very big application,
> for the reasons you explain very well in 3). A resource that we can very
> easily overload is the database. If those 500 threads all happen to need
> database connections at the same time, we have a problem.
>
> I think we should use a bounded queue and have both the number of
> threads and the size of the queue configurable. Does that make sense?
>
Actually, I think it is more complicated than that.
Today most of the threads are used to perform XML-RPC calls and to parallelize some operations
(for example, connecting all hosts to a storage domain). Most user actions do not
open a new thread just to retrieve information from the DB.
I think most of the load on the DB today comes from the scheduled monitoring jobs,
which run every couple of seconds and perform an enormous number of queries.
If we want some kind of queue, I think it should be done at the business logic level:
if we decide that we cannot run an action (a user action or an internal one) because of load, we can reject it.
The thread pool is an internal mechanism that cannot solve our design problems. We can tune it, but it is not
the main cause of the problem.
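
To make the two options concrete, here is a rough sketch. It is not the actual ThreadPoolUtil code: the class PoolSketch, the ActionGate helper, the core size of 0 and the 60 second keep-alive are all assumptions made up for illustration. It shows a direct-handoff pool (strategy 1), a bounded-queue pool (strategy 3), and one possible way to reject work at the business logic level with a plain java.util.concurrent.Semaphore.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names exist in the engine code.
class PoolSketch {

    // Strategy 1 (direct handoff): no queue, a new thread per task until
    // maxThreads is reached, then the pool rejects further tasks.
    // Core size 0 and the 60 s keep-alive are assumed values.
    static ThreadPoolExecutor directHandoff(int maxThreads) {
        return new ThreadPoolExecutor(0, maxThreads, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Strategy 3 (bounded queue): a fixed number of threads plus a queue
    // of limited size, both supplied by the caller.
    static ThreadPoolExecutor boundedQueue(int threads, int queueSize) {
        return new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
    }

    // Business-level gate: at most maxConcurrent actions run at once;
    // anything beyond that is refused immediately instead of waiting.
    static final class ActionGate {
        private final Semaphore permits;

        ActionGate(int maxConcurrent) {
            this.permits = new Semaphore(maxConcurrent);
        }

        // Runs the action on the calling thread and returns true, or
        // returns false without running it when the gate is full.
        boolean tryRun(Runnable action) {
            if (!permits.tryAcquire()) {
                return false;
            }
            try {
                action.run();
                return true;
            } finally {
                permits.release();
            }
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = directHandoff(1);
        CountDownLatch release = new CountDownLatch(1);
        // Occupy the only thread the pool is allowed to create.
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            pool.execute(() -> { });
            System.out.println("second task accepted");
        } catch (RejectedExecutionException e) {
            System.out.println("second task rejected by the pool");
        } finally {
            release.countDown();
            pool.shutdown();
        }

        // The nested call arrives while the single permit is still held.
        ActionGate gate = new ActionGate(1);
        boolean ran = gate.tryRun(() ->
                System.out.println("nested action ran: " + gate.tryRun(() -> { })));
        System.out.println("outer action ran: " + ran);
    }
}
```

The point of the gate is that the decision to refuse work is made where we know what the action is (a user action or an internal one), so the caller can get a meaningful answer. With the pool alone, the only signal is a RejectedExecutionException from whichever task happened to arrive last.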