[Engine-devel] [vdsm] CPU Overcommit Feature
Doron Fediuck
dfediuck at redhat.com
Thu Dec 20 15:17:50 UTC 2012
----- Original Message -----
> From: "Simon Grinberg" <simon at redhat.com>
> To: "Doron Fediuck" <dfediuck at redhat.com>
> Cc: "engine-devel" <engine-devel at ovirt.org>, vdsm-devel at fedorahosted.org
> Sent: Thursday, December 20, 2012 4:56:14 PM
> Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
>
>
>
> ----- Original Message -----
> > From: "Doron Fediuck" <dfediuck at redhat.com>
> > To: "Dan Kenigsberg" <danken at redhat.com>
> > Cc: "engine-devel" <engine-devel at ovirt.org>,
> > vdsm-devel at fedorahosted.org
> > Sent: Thursday, December 20, 2012 2:14:45 PM
> > Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
> >
> >
> >
> > ----- Original Message -----
> > > From: "Dan Kenigsberg" <danken at redhat.com>
> > > To: "Itamar Heim" <iheim at redhat.com>
> > > Cc: "engine-devel" <engine-devel at ovirt.org>,
> > > vdsm-devel at fedorahosted.org
> > > Sent: Thursday, December 20, 2012 11:55:10 AM
> > > Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
> > >
> > > On Thu, Dec 20, 2012 at 10:23:55AM +0200, Itamar Heim wrote:
> > > > On 12/20/2012 09:43 AM, Dan Kenigsberg wrote:
> > > > >On Wed, Dec 19, 2012 at 09:53:15AM -0500, Doron Fediuck wrote:
> > > > >>
> > > > >>----- Original Message -----
> > > > >>>From: "Dan Kenigsberg" <danken at redhat.com>
> > > > >>>To: "Greg Padgett" <gpadgett at redhat.com>
> > > > >>>Cc: "engine-devel" <engine-devel at ovirt.org>,
> > > > >>>vdsm-devel at fedorahosted.org
> > > > >>>Sent: Wednesday, December 19, 2012 3:59:11 PM
> > > > >>>Subject: Re: [Engine-devel] CPU Overcommit Feature
> > > > >>>
> > > > >>>On Mon, Dec 17, 2012 at 09:37:57AM -0500, Greg Padgett
> > > > >>>wrote:
> > > > >>>>Hi,
> > > > >>>>
> > > > >>>>I've been working on a feature to allow CPU Overcommitment
> > > > >>>>of
> > > > >>>>hosts
> > > > >>>>in a cluster. This first stage allows the engine to
> > > > >>>>consider
> > > > >>>>host
> > > > >>>>cpu threads as cores for the purposes of VM resource
> > > > >>>>allocation.
> > > > >>>>
> > > > >>>>This wiki page has further details, your comments are
> > > > >>>>welcome!
> > > > >>>>http://www.ovirt.org/Features/cpu_overcommit
> > > > >>>
> > > > >>>I've commented about the vdsm/engine API on
> > > > >>>http://gerrit.ovirt.org/#/c/10144/ but it is probably better
> > > > >>>to
> > > > >>>reiterate it here.
> > > > >>>
> > > > >>>The suggested API is tightly coupled with an ugly hack we
> > > > >>>pushed
> > > > >>>to
> > > > >>>vdsm
> > > > >>>in order not to solve the issue properly on the first
> > > > >>>strike.
> > > > >>>
> > > > >>>If we had not have report_host_threads_as_cores, I think
> > > > >>>we'd
> > > > >>>have a
> > > > >>>simpler API reporting only cpuThreads and cpuCores; with no
> > > > >>>funny
> > > > >>>boolean flags.
> > > > >>>
> > > > >>>Let us strive to that position as much as we can.
> > > > >>>
> > > > >>>How about asking whoever used report_host_threads_as_cores
> > > > >>>to
> > > > >>>unset
> > > > >>>it
> > > > >>>once they install Engine 3.2 ? I think that these are very
> > > > >>>few
> > > > >>>people,
> > > > >>>that would not mind this very much.
> > > > >>>
> > > > >>>If this is impossible, I'd add a cpuCores2, always reporting
> > > > >>>the
> > > > >>>true
> > > > >>>number, to be used by new Engines. We may even report it
> > > > >>>only
> > > > >>>on
> > > > >>>the
> > > > >>>very few cases of report_host_threads_as_cores being set.
> > > > >>>
> > > > >>>Dan.
> > > > >>
> > > > >>Hi Dan,
> > > > >>Thanks for the review.
> > > > >>
> > > > >>I agree simply reporting cores and threads would be the right
> > > > >>solution.
> > > > >>However, when you have hyperthreading turned off you get
> > > > >>cores=threads.
> > > > >>This is the same situation you have when hyperthreading
> > > > >>turned
> > > > >>on, and
> > > > >>someone used the vdsm configuration of reporting threads as
> > > > >>cores.
> > > > >>
> > > > >>So the engine won't know the real status of the host.
> > > > >
> > > > >This is not surprising, as report_host_threads_as_cores means
> > > > >in
> > > > >blunt
> > > > >English "lie to Engine about the number of cores". The newly
> > > > >suggested
> > > > >flag says "don't believe what I said in cpuCores, since I'm
> > > > >lying". Next
> > > > >thing we'd have is another flag saying that the former flag
> > > > >was
> > > > >a
> > > > >lie,
> > > > >and cpuCores is actually trustworthy.
> > > > >
> > > > >Instead of dancing this dance, I suggest to stop lying.
> > > > >
> > > > >report_host_threads_as_cores was a hack to assist a older
> > > > >Engine
> > > > >versions. Engine users that needed it had to set it
> > > > >out-of-band
> > > > >on
> > > > >their
> > > > >hosts. Now if they upgrade their Engine, they can -- as easily
> > > > >--
> > > > >reset
> > > > >that value.
> > > > >
> > > > >If they forget, nothing devastating happens beyond Engine
> > > > >assuming
> > > > >that
> > > > >hyperthreading is off.
> > > > >
> > > > >Please consider this suggestion. I find it the simplest for
> > > > >all
> > > > >involved
> > > > >parties.
> > > >
> > > > the only problem is the new vdsm doesn't know which engine may
> > > > be
> > > > using it.
> > > > if engine would say "getVdsCaps engineVersion=3.2", then vdsm
> > > > could
> > > > know engine no longer needs lying to and ignore the flag,
> > > > re-using
> > > > same field.
> > >
> > > Note that I do not suggest to drop report_host_threads_as_cores
> > > now.
> > > I am suggesting to keep on lying even to new Engine.
> > > If someone thinks that lying is bad, he should reset
> > > report_host_threads_as_cores.
> > >
> > > It seems to me that the suggested API is being coerced by a very
> > > limited
> > > use case, that is not going to be really harmed by a
> > > straight-forward
> > > API.
> > >
> > > Dan.
> >
> > Dan,
> > Did some further checking, and we can go with it;
> > So basically now we add cpuThreads. Additionally, if the
> > report_host_threads_as_cores
> > is turned on, an additional cpuCoresReal will be reported.
>
> No need for that.
> There is only one problematic state where VDSM cheats and reports
> cores == threads.
> This does not happen by mistake, the user specifically asked for it.
>
> The above condition above also happens if threads is really off or
> the processor does not have threads, so it's a state we need to
> handle in any case.
>
> So just report threads=off, whenever cores == threads and treat it as
> such. If the user is unhappy then he should just turn off the cheat
> mode.
>
ack.
* <3.2 clusters are not supported.
* >=3.2 clusters will assume HT off unless cores < threads.
Some release note is needed for this, so users will be advised to
turn off vdsm cheating when upgrading to new 3.2 engine, in case they
happen to use it.
>
> > Since this is a 3.2 cluster feature, the engine will make sure to
> > update
> > the DB with the real numbers, and older clusters will use the
> > information
> > vdsm reports, while blocking the engine side optimization.
> > An update path is already available here:
> > http://gerrit.ovirt.org/#/c/10144/4
> >
> > I hope that for cluster 3.3 we'll be able to drop the vdsm
> > configuration.
> > A review would be highly appreciated due to the holidays proximity.
> >
> > Doron
More information about the Engine-devel
mailing list