From: "Saggi Mizrahi" <smizrahi(a)redhat.com>
To: "Nir Soffer" <nsoffer(a)redhat.com>, "Francesco Romani"
<fromani(a)redhat.com>
Cc: devel(a)ovirt.org, "Federico Simoncelli" <fsimonce(a)redhat.com>,
"Michal Skrivanek" <mskrivan(a)redhat.com>
Sent: Monday, July 7, 2014 8:03:35 PM
Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck
calls
----- Original Message -----
> From: "Nir Soffer" <nsoffer(a)redhat.com>
> To: "Saggi Mizrahi" <smizrahi(a)redhat.com>
> Cc: "Francesco Romani" <fromani(a)redhat.com>, devel(a)ovirt.org,
"Federico
> Simoncelli" <fsimonce(a)redhat.com>, "Michal
> Skrivanek" <mskrivan(a)redhat.com>
> Sent: Sunday, July 6, 2014 6:21:35 PM
> Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling
> of stuck calls
>
> ----- Original Message -----
> > From: "Saggi Mizrahi" <smizrahi(a)redhat.com>
> > To: "Francesco Romani" <fromani(a)redhat.com>
> > Cc: devel(a)ovirt.org
> > Sent: Sunday, July 6, 2014 2:58:38 PM
> > Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and
> > handling
> > of stuck calls
> >
> > The more I think about it, the more it looks like it's
> > purely a libvirt issue. As long as we can make calls
> > that get stuck in D state, we can't scale under stress.
>
> We can scale, but if all libvirt worker threads are blocked, we won't
> get very far :-)
>
> Does libvirt try to handle such blocked threads?
Nope, and that is why we won't scale: we just get stuck waiting on libvirt.
>
> >
> > In any case, it seems like an interface like:
> >
> > libvirtConnectionPool.request(args, callback)
> >
> > would be a better solution than a thread pool.
> >
> > It would queue up the request and call the callback
> > once it's done.
> >
> > pseudo example:
> >
> > def collectStats():
> >     def callback(resp):
> >         doStuff(resp)
> >         # reschedule the next sample once this one is handled
> >         threading.Timer(4, collectStats).start()
>
> This starts a new thread for each sample, which is even worse than having
> a long-living thread per VM, which Francesco is trying to eliminate.
We could implement a Timer-like class that uses a single preallocated thread.
It would only run short operations and queue things into a threadpool in case
of long operations. That way it's unrelated to the threadpool and can be used
for any use case where we want to queue an event.
This is exactly what I was thinking about.
It would be nice if you could review it:
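Not the patch itself, but for illustration, a minimal sketch of such a
Timer-like class: a single preallocated thread serving a heap of timed
calls. All names are made up; a long operation would queue itself into
the thread pool rather than run on this thread.

import heapq
import itertools
import threading
import time


class Scheduler(object):
    # One preallocated thread that runs short scheduled calls.

    def __init__(self):
        self._calls = []  # heap of (deadline, seq, func)
        self._seq = itertools.count()
        self._cond = threading.Condition()
        self._thread = threading.Thread(target=self._run)
        self._thread.daemon = True
        self._thread.start()

    def schedule(self, delay, func):
        with self._cond:
            heapq.heappush(self._calls,
                           (time.time() + delay, next(self._seq), func))
            self._cond.notify()

    def _run(self):
        while True:
            with self._cond:
                while not self._calls or self._calls[0][0] > time.time():
                    timeout = (self._calls[0][0] - time.time()
                               if self._calls else None)
                    self._cond.wait(timeout)
                func = heapq.heappop(self._calls)[2]
            # Must stay short; long work should be queued into the
            # thread pool instead of blocking this thread.
            func()

With that, the pseudo example above needs only one extra thread in total,
e.g. scheduler.schedule(4, lambda: threadpool.queue(collectStats)).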
>
> We have two issues here:
> 1. Scalability - can be improved regardless of libvirt
> 2. Handling stuck calls - may require solution on the libvirt side
>
> Did you see this?
>
> https://bugzilla.redhat.com/1112684
>
> We should stop this trend of starting zillions of threads and use more
> efficient and maintainable solutions.
Agreed.
>
> >
> >     lcp.listAllDevices(callback)
> >
> > We could have the timer queue into a threadpool, since it normally
> > runs in the main thread, but that is orthogonal to the libvirt
> > connection issue.
> >
> > As for the thread pool itself: as long as it's more than one class,
> > it's too complicated.
> >
> > Things like:
> > * Task Cancelling
> > * Re-queuing (periodic operations)
> >
> > Shouldn't be part of the thread pool.
>
> +1
>
> >
> > Tasks should just be functions. If we need something
> > with state, we could use the __call__() method to make
> > an object look like a function.
>
> +1
>
> > I also don't mind doing
> > if hasattr(task, "run") or callable(task) to handle it
> > using an "interface".
>
> Please don't - there is no reason why related objects cannot
> have a common interface like __call__.
I agree, but some people don't like it, and despite my reputation
I do try to avoid pointless bike-shedding.
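For the record, a tiny sketch of the __call__ approach; threadpool.queue()
and the vm object are assumptions, not existing VDSM API:

class SampleVmStats(object):
    # Stateful task that looks like a plain function to the pool.

    def __init__(self, vm):
        self._vm = vm

    def __call__(self):
        self._vm.sample()  # hypothetical sampling entry point


threadpool.queue(SampleVmStats(vm))  # the pool just calls task()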
>
> >
> > Re-queuing could be done with the built-in threading.Timer,
> > as in:
> >
> > threading.Timer(4.5, threadpool.queue, args=(self,)).start()
> >
> > That way each operation is responsible for handling
> > the details of rescheduling:
> > should we always wait X, or should we wait X - timeItTookToCalculate?
> >
> > You could also do:
> > threading.Timer(2, threadpool.queue, args=(self,)).start()
> > threading.Timer(4, self.handleBeingLate).start()
> >
> > which would handle not getting queued within a certain amount of time.
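To make that concrete, a rough sketch of a self-rescheduling operation
that compensates for the time the sample took; threadpool.queue() is
assumed, and the interval is arbitrary:

import threading
import time

INTERVAL = 4.5


class SamplingOperation(object):

    def __call__(self):
        start = time.time()
        self._sample()  # the actual, possibly slow, libvirt call
        # wait X - timeItTookToCalculate, never less than zero
        delay = max(0.0, INTERVAL - (time.time() - start))
        threading.Timer(delay, threadpool.queue, args=(self,)).start()

    def _sample(self):
        pass  # elided; the real sampling work goes here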
> >
> > Task cancelling can't be done in a generic manner.
> > The most I think we could do is have threadpool.stop()
> > check hasattr(task, "stop") and call it.
>
> Please don't
I agree that forcing "jobs" to stop should be handled outside
the threadpool. Again, I'm just trying to find common ground.
>
> >
> >
> > ----- Original Message -----
> > > From: "Francesco Romani" <fromani(a)redhat.com>
> > > To: devel(a)ovirt.org
> > > Sent: Friday, July 4, 2014 5:48:59 PM
> > > Subject: [ovirt-devel] [VDSM][sampling] thread pool status and handling
> > > of
> > > stuck calls
> > >
> > > Hi,
> > >
> > > Nir has begun reviewing my draft patches for the thread pool and
> > > sampling refactoring (thanks!),
> > > and has already suggested quite a few improvements, which I'd like
> > > to summarize.
> > >
> > > Quick links to the ongoing discussion:
> > >
> > > http://gerrit.ovirt.org/#/c/29191/8/lib/threadpool/worker.py,cm
> > >
> > > http://gerrit.ovirt.org/#/c/29190/4/lib/threadpool/README.rst,cm
> > >
> > > Quick summary of the discussion on gerrit so far:
> > > 1. extract the scheduling logic from the thread pool. Either add a
> > > separate scheduler class, or let the sampling tasks reschedule
> > > themselves after a successful completion.
> > > Either way, the concept of a 'periodic task', and the added
> > > complexity, isn't needed.
> > >
> > > 2. drop all the *queue classes I've added, thus making the package
> > > simpler.
> > > They are no longer needed once we remove the concept of a periodic
> > > task.
> > >
> > > 3. have a per-task timeout, and move the stuck-task detection
> > > elsewhere, like in the worker thread, or maybe better in the
> > > aforementioned scheduler.
> > > If the scheduler finds that any task started in the previous pass
> > > (or even before!) has not yet completed, there is no point in
> > > keeping this task alive, and it should be cancelled.
> > >
> > > 4. the sampling task (or maybe the scheduler) can be smarter and halt
> > > the sampling in the presence of non-responding calls for a given VM,
> > > provided the VM reports its 'health'/responsiveness.
> > >
> > > (Hopefully I haven't forgotten anything big.)
> > >
> > > In the draft currently published, I reluctantly added the *queue
> > > classes, and I agree the periodic task implementation is messy,
> > > so I'll be very happy to drop them.
> > >
> > > However, a core question still holds: what to do in the presence of
> > > a stuck task?
> > >
> > > I think it is worth discussing this topic on a medium friendlier than
> > > gerrit, as it is the single most important decision to make in the
> > > sampling refactoring.
> > >
> > > It all boils down to:
> > > should we just keep stuck threads around and wait? Should we cancel
> > > stuck tasks?
> > >
> > > A. Let's cancel the stuck tasks.
> > > If we move toward a libvirt connection pool, and we give each worker
> > > thread in the sampling pool a separate libvirt connection, hopefully
> > > read-only, then we should be able to cancel a stuck task by killing
> > > the worker's libvirt connection. We'll still need a (probably much
> > > simpler) watchman/supervisor, but no big deal here.
> > > Libvirt allows closing a connection from a different thread.
> > > I haven't actually tried to unstick a blocked thread this way, but I
> > > have no reason to believe it will not work.
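For what it's worth, an untested sketch of #A: a supervisor timer that
closes the worker's own read-only connection when a call overruns a
deadline. Whether close() really unsticks a worker blocked in D state is
exactly the open question above; the URI, the timeout, and the choice of
listAllDomains() are illustrative.

import threading

import libvirt

TIMEOUT = 30.0  # arbitrary per-call deadline


def sample_domains(uri='qemu:///system'):
    conn = libvirt.openReadOnly(uri)  # one connection per worker
    # The Timer fires from another thread, which libvirt permits.
    watchdog = threading.Timer(TIMEOUT, conn.close)
    watchdog.start()
    try:
        return conn.listAllDomains()  # the potentially stuck call
    finally:
        watchdog.cancel()
        try:
            conn.close()
        except libvirt.libvirtError:
            pass  # the watchdog already closed it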
> > >
> > > B. Let's keep blocked threads around.
> > > The code as it is just leaves a blocked libvirt call, and the worker
> > > thread that carried it, frozen.
> > > The stuck worker thread can be replaced, up to a cap of frozen
> > > threads.
> > > In this worst-case scenario we end up with one (blocked!) thread per
> > > VM, as it is today, and with no sampling data.
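The bookkeeping #B needs could be as simple as the sketch below (all
names invented): replace a frozen worker up to the cap, and decrement
when the blocked call finally returns, which is also the signal that the
call is safe to issue again.

import threading

MAX_FROZEN = 100  # worst case: one blocked thread per VM


class WorkerSupervisor(object):

    def __init__(self, pool):
        self._pool = pool  # hypothetical pool exposing spawn_worker()
        self._lock = threading.Lock()
        self._frozen = 0

    def on_worker_stuck(self):
        # Replace the frozen worker, but never beyond the cap.
        with self._lock:
            if self._frozen >= MAX_FROZEN:
                return  # give up: no sampling data, as today
            self._frozen += 1
        self._pool.spawn_worker()

    def on_worker_recovered(self):
        # The blocked call returned: drop the stale result and note
        # that this call can be attempted again.
        with self._lock:
            self._frozen -= 1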
> > >
> > > I believe that #A has some drawbacks which we risk overlooking, and
> > > at the same time #B has some merits.
> > >
> > > Let me explain:
> > > The hardest case is a call blocked in the kernel in D state. Libvirt
> > > has no more power than VDSM to unblock it; and libvirt itself *has* a
> > > pool of resources (threads in this case) which can be depleted by
> > > stuck calls. Actually, retrying a failed task may deplete their pool
> > > even faster[1].
> > >
> > > I'm not happy to just push this problem down the stack, as it looks
> > > to me that we gain very little by doing so. VDSM itself surely stays
> > > cleaner, but the VDS/hypervisor host as a whole improves just a bit:
> > > libvirt scales better, and that gives us some more room.
> > >
> > > On the other hand, by not reissuing dangerous calls, I believe we
> > > make better use of the host resources in general. Actually, keeping
> > > blocked threads around is a side effect of not reattempting blocked
> > > calls. Moreover, keeping the blocked thread around has a significant
> > > benefit: we can discover at the earliest possible moment when it is
> > > safe again to do the blocked call, because the blocked call itself
> > > returns and we can track this event! (and of course drop the
> > > now-stale result). Otherwise, if we drop the connection, we lose this
> > > event and have no option other than trying again and hoping for the
> > > best[2].
> > >
> > > I know the #B approach is not the cleanest, but I think it has slightly
> > > more
> > > appeal, especially
> > > on the libvirt depletion front.
> > >
> > > Thoughts and comments very welcome!
> > >
> > > +++
> > >
> > > [1] They have extensions to the management API to dynamically adjust
> > > their thread pool and/or to cancel tasks, but those are in the
> > > RHEL 7.2 timeframe.
> > > [2] A crazy idea would be to do something like exponential backoff
> > > (http://en.wikipedia.org/wiki/Exponential_backoff),
> > > though I'm not sure it would be beneficial.
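In code, the footnote's idea would look roughly like this (the delays
are made up):

def backoff_delays(first=4.0, cap=300.0):
    # Yield retry delays: 4, 8, 16, ... seconds, capped at 5 minutes.
    delay = first
    while True:
        yield delay
        delay = min(delay * 2, cap)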
> > >
> > > Bests and thanks,
> > >
> > > --
> > > Francesco Romani
> > > RedHat Engineering Virtualization R & D
> > > Phone: 8261328
> > > IRC: fromani
> > > _______________________________________________
> > > Devel mailing list
> > > Devel(a)ovirt.org
> > >
> > > http://lists.ovirt.org/mailman/listinfo/devel
> > >
> > _______________________________________________
> > Devel mailing list
> > Devel(a)ovirt.org
> >
> > http://lists.ovirt.org/mailman/listinfo/devel
> >
>