
----- Original Message -----
From: "Francesco Romani" <fromani@redhat.com>
To: devel@ovirt.org
Cc: "Nir Soffer" <nsoffer@redhat.com>, "Michal Skrivanek" <mskrivan@redhat.com>, "Federico Simoncelli" <fsimonce@redhat.com>, "Dan Kenigsberg" <danken@redhat.com>, "Saggi Mizrahi" <smizrahi@redhat.com>
Sent: Tuesday, July 15, 2014 6:50:45 PM
Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
----- Original Message -----
From: "Saggi Mizrahi" <smizrahi@redhat.com>
To: "Nir Soffer" <nsoffer@redhat.com>
Cc: "Francesco Romani" <fromani@redhat.com>, devel@ovirt.org, "Michal Skrivanek" <mskrivan@redhat.com>, "Federico Simoncelli" <fsimonce@redhat.com>, "Dan Kenigsberg" <danken@redhat.com>
Sent: Sunday, July 13, 2014 5:43:28 PM
Subject: Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
[...]
The current patches do not change libvirt connection management; this is an orthogonal issue. They are only about changing the way we do sampling.
As I've been saying, I think the problem is actually in the libvirt connection management and not in the stats operations.
Well yes, I think better libvirt connection management is another way to reach the goal, provided it can detect a stuck call and signal it to the upper layer.
With that in place, the sampling code is simpler and there is no need for a fancy thread pool, even though we may need something like this in the internals of the connection handling code.
Can you explain how you would solve the sampling issue with better connection management? Since libvirt does not have an async API (yet), it seems that this would just move the thread pool to the connection layer.
*Maybe* a supervisor-like approach like my very first proposal could work, but a very good point made by Nir is how to tell when something is 'stuck', since only tasks really know their timeout.
For sampling it is easy: it is the sampling interval. But how do we convey this timeout in a generic manner to the connection layer?
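To make the question concrete, here is a rough sketch of one way a supervisor could learn each task's timeout: the caller passes it along with the call. This is not VDSM code. The `Supervisor` class, its `submit` and `stuck` methods, and the task names are all hypothetical, and the sketch assumes Python 3 and that the deadline is counted from submission time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Supervisor:
    """Hypothetical supervisor: runs blocking calls in a pool and
    reports the ones that have outlived their caller-supplied timeout."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._deadlines = {}  # task name -> deadline (monotonic seconds)

    def submit(self, name, func, timeout):
        # Only the task knows its timeout, so the caller supplies it.
        # For sampling this would be the sampling interval.
        with self._lock:
            self._deadlines[name] = time.monotonic() + timeout

        def wrapper():
            try:
                return func()
            finally:
                # The call returned (or raised): stop tracking it.
                with self._lock:
                    self._deadlines.pop(name, None)

        return self._pool.submit(wrapper)

    def stuck(self):
        # Names of tasks still running past their deadline.
        now = time.monotonic()
        with self._lock:
            return sorted(n for n, d in self._deadlines.items() if now > d)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

A sampling call would then be submitted as something like `sup.submit("vm1-stats", sample_vm1, timeout=sampling_interval)`, where `sample_vm1` and `sampling_interval` are placeholders, and a periodic check of `sup.stuck()` would tell the upper layer which calls to treat as blocked. Note that this only detects a stuck call; the worker thread stays occupied until the underlying call returns.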