[Engine-devel] [vdsm] CPU Overcommit Feature
Simon Grinberg
simon at redhat.com
Thu Dec 20 14:56:14 UTC 2012
----- Original Message -----
> From: "Doron Fediuck" <dfediuck at redhat.com>
> To: "Dan Kenigsberg" <danken at redhat.com>
> Cc: "engine-devel" <engine-devel at ovirt.org>, vdsm-devel at fedorahosted.org
> Sent: Thursday, December 20, 2012 2:14:45 PM
> Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
>
>
>
> ----- Original Message -----
> > From: "Dan Kenigsberg" <danken at redhat.com>
> > To: "Itamar Heim" <iheim at redhat.com>
> > Cc: "engine-devel" <engine-devel at ovirt.org>,
> > vdsm-devel at fedorahosted.org
> > Sent: Thursday, December 20, 2012 11:55:10 AM
> > Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
> >
> > On Thu, Dec 20, 2012 at 10:23:55AM +0200, Itamar Heim wrote:
> > > On 12/20/2012 09:43 AM, Dan Kenigsberg wrote:
> > > >On Wed, Dec 19, 2012 at 09:53:15AM -0500, Doron Fediuck wrote:
> > > >>
> > > >>----- Original Message -----
> > > >>>From: "Dan Kenigsberg" <danken at redhat.com>
> > > >>>To: "Greg Padgett" <gpadgett at redhat.com>
> > > >>>Cc: "engine-devel" <engine-devel at ovirt.org>,
> > > >>>vdsm-devel at fedorahosted.org
> > > >>>Sent: Wednesday, December 19, 2012 3:59:11 PM
> > > >>>Subject: Re: [Engine-devel] CPU Overcommit Feature
> > > >>>
> > > >>>On Mon, Dec 17, 2012 at 09:37:57AM -0500, Greg Padgett wrote:
> > > >>>>Hi,
> > > >>>>
> > > >>>>I've been working on a feature to allow CPU Overcommitment of
> > > >>>>hosts
> > > >>>>in a cluster. This first stage allows the engine to consider
> > > >>>>host
> > > >>>>cpu threads as cores for the purposes of VM resource
> > > >>>>allocation.
> > > >>>>
> > > >>>>This wiki page has further details, your comments are
> > > >>>>welcome!
> > > >>>>http://www.ovirt.org/Features/cpu_overcommit
> > > >>>
> > > >>>I've commented about the vdsm/engine API on
> > > >>>http://gerrit.ovirt.org/#/c/10144/ but it is probably better
> > > >>>to
> > > >>>reiterate it here.
> > > >>>
> > > >>>The suggested API is tightly coupled with an ugly hack we
> > > >>>pushed
> > > >>>to
> > > >>>vdsm
> > > >>>in order not to solve the issue properly on the first strike.
> > > >>>
> > > >>>If we had not have report_host_threads_as_cores, I think we'd
> > > >>>have a
> > > >>>simpler API reporting only cpuThreads and cpuCores; with no
> > > >>>funny
> > > >>>boolean flags.
> > > >>>
> > > >>>Let us strive to that position as much as we can.
> > > >>>
> > > >>>How about asking whoever used report_host_threads_as_cores to
> > > >>>unset
> > > >>>it
> > > >>>once they install Engine 3.2 ? I think that these are very few
> > > >>>people,
> > > >>>that would not mind this very much.
> > > >>>
> > > >>>If this is impossible, I'd add a cpuCores2, always reporting
> > > >>>the
> > > >>>true
> > > >>>number, to be used by new Engines. We may even report it only
> > > >>>on
> > > >>>the
> > > >>>very few cases of report_host_threads_as_cores being set.
> > > >>>
> > > >>>Dan.
> > > >>
> > > >>Hi Dan,
> > > >>Thanks for the review.
> > > >>
> > > >>I agree simply reporting cores and threads would be the right
> > > >>solution.
> > > >>However, when you have hyperthreading turned off you get
> > > >>cores=threads.
> > > >>This is the same situation you have when hyperthreading turned
> > > >>on, and
> > > >>someone used the vdsm configuration of reporting threads as
> > > >>cores.
> > > >>
> > > >>So the engine won't know the real status of the host.
> > > >
> > > >This is not surprising, as report_host_threads_as_cores means in
> > > >blunt
> > > >English "lie to Engine about the number of cores". The newly
> > > >suggested
> > > >flag says "don't believe what I said in cpuCores, since I'm
> > > >lying". Next
> > > >thing we'd have is another flag saying that the former flag was
> > > >a
> > > >lie,
> > > >and cpuCores is actually trustworthy.
> > > >
> > > >Instead of dancing this dance, I suggest to stop lying.
> > > >
> > > >report_host_threads_as_cores was a hack to assist a older Engine
> > > >versions. Engine users that needed it had to set it out-of-band
> > > >on
> > > >their
> > > >hosts. Now if they upgrade their Engine, they can -- as easily
> > > >--
> > > >reset
> > > >that value.
> > > >
> > > >If they forget, nothing devastating happens beyond Engine
> > > >assuming
> > > >that
> > > >hyperthreading is off.
> > > >
> > > >Please consider this suggestion. I find it the simplest for all
> > > >involved
> > > >parties.
> > >
> > > the only problem is the new vdsm doesn't know which engine may be
> > > using it.
> > > if engine would say "getVdsCaps engineVersion=3.2", then vdsm
> > > could
> > > know engine no longer needs lying to and ignore the flag,
> > > re-using
> > > same field.
> >
> > Note that I do not suggest to drop report_host_threads_as_cores
> > now.
> > I am suggesting to keep on lying even to new Engine.
> > If someone thinks that lying is bad, he should reset
> > report_host_threads_as_cores.
> >
> > It seems to me that the suggested API is being coerced by a very
> > limited
> > use case, that is not going to be really harmed by a
> > straight-forward
> > API.
> >
> > Dan.
>
> Dan,
> Did some further checking, and we can go with it;
> So basically now we add cpuThreads. Additionally, if the
> report_host_threads_as_cores
> is turned on, an additional cpuCoresReal will be reported.
No need for that.
There is only one problematic state where VDSM cheats and reports cores == threads.
This does not happen by mistake, the user specifically asked for it.
The above condition above also happens if threads is really off or the processor does not have threads, so it's a state we need to handle in any case.
So just report threads=off, whenever cores == threads and treat it as such. If the user is unhappy then he should just turn off the cheat mode.
> Since this is a 3.2 cluster feature, the engine will make sure to
> update
> the DB with the real numbers, and older clusters will use the
> information
> vdsm reports, while blocking the engine side optimization.
> An update path is already available here:
> http://gerrit.ovirt.org/#/c/10144/4
>
> I hope that for cluster 3.3 we'll be able to drop the vdsm
> configuration.
> A review would be highly appreciated due to the holidays proximity.
>
> Doron
> _______________________________________________
> Engine-devel mailing list
> Engine-devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/engine-devel
>
More information about the Devel
mailing list