[ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls

Nir Soffer nsoffer at redhat.com
Tue Jul 15 17:28:51 UTC 2014


----- Original Message -----
> From: "Francesco Romani" <fromani at redhat.com>
> To: devel at ovirt.org
> Cc: "Nir Soffer" <nsoffer at redhat.com>, "Michal Skrivanek" <mskrivan at redhat.com>, "Federico Simoncelli"
> <fsimonce at redhat.com>, "Dan Kenigsberg" <danken at redhat.com>, "Saggi Mizrahi" <smizrahi at redhat.com>
> Sent: Tuesday, July 15, 2014 6:50:45 PM
> Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
> 
> ----- Original Message -----
> > From: "Saggi Mizrahi" <smizrahi at redhat.com>
> > To: "Nir Soffer" <nsoffer at redhat.com>
> > Cc: "Francesco Romani" <fromani at redhat.com>, devel at ovirt.org, "Michal
> > Skrivanek" <mskrivan at redhat.com>, "Federico
> > Simoncelli" <fsimonce at redhat.com>, "Dan Kenigsberg" <danken at redhat.com>
> > Sent: Sunday, July 13, 2014 5:43:28 PM
> > Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling
> > of stuck calls
> 
> [...]
> > > The current patches do not change libvirt connection management;
> > > this is an orthogonal issue. They are only about changing the way
> > > we do sampling.
> > As I've been saying, I think the problem is actually in the
> > libvirt connection management and not the stats operations.
> 
> Well yes, I think having better libvirt connection management is
> another way to reach the goal, provided it can detect a stuck call
> and signal it to the upper layer.
> 
> With that in place, the sampling code is simpler, and there is no
> need for a fancy thread pool. Even so, we may need something like
> this in the internals of the connection handling code.

Can you explain how you would solve the sampling issue with better
connection management?

Since libvirt does not have an async API (yet), it seems that this
would just move the thread pool to the connection layer.
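
To make the point concrete, here is a minimal sketch of what a
connection layer that detects stuck calls might look like. Everything
in it is hypothetical (ConnectionWrapper, StuckCall, and call() are
made-up names, not VDSM or libvirt code), and it assumes the caller
passes the timeout with each call. Note that the wrapper still needs a
pool of worker threads to run the blocking calls:

```python
import concurrent.futures
import threading


class StuckCall(Exception):
    """Raised when a blocking call does not return within its timeout."""


class ConnectionWrapper(object):
    """Hypothetical connection layer; not part of VDSM or libvirt."""

    def __init__(self, workers=4):
        # The thread pool has not gone away, it just lives here now.
        self._pool = concurrent.futures.ThreadPoolExecutor(
            max_workers=workers)

    def call(self, func, timeout, *args):
        """Run blocking func(*args), waiting at most timeout seconds."""
        future = self._pool.submit(func, *args)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            # The worker thread is still blocked inside func. We cannot
            # cancel it; we can only report it to the upper layer.
            raise StuckCall("no reply after %.1f seconds" % timeout)


def demo():
    """Show one call that returns in time and one that gets stuck."""
    wrapper = ConnectionWrapper(workers=2)
    release = threading.Event()

    def blocked():
        # Stand-in for a libvirt call that never returns.
        release.wait()

    fast = wrapper.call(lambda x: x * 2, 1.0, 21)
    try:
        wrapper.call(blocked, 0.1)
        stuck = False
    except StuckCall:
        stuck = True
    finally:
        release.set()  # let the worker finish so the process can exit
    return fast, stuck
```

A stuck call keeps its worker thread occupied until the underlying
call returns, so the pool can still run out of workers. That is the
same problem the sampling thread pool has to deal with.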

> 
> *Maybe* a supervisor-like approach like my very first proposal could
> work, but a very good point made by Nir is how to tell when something is
> 'stuck', since only tasks really know their timeout.
> 
> For sampling it is easy: it is the sampling interval. But how do we
> convey this timeout in a generic manner to the connection layer?
 


