On Thu, Dec 20, 2012 at 10:17:50AM -0500, Doron Fediuck wrote:
----- Original Message -----
> From: "Simon Grinberg" <simon(a)redhat.com>
> To: "Doron Fediuck" <dfediuck(a)redhat.com>
> Cc: "engine-devel" <engine-devel(a)ovirt.org>,
vdsm-devel(a)fedorahosted.org
> Sent: Thursday, December 20, 2012 4:56:14 PM
> Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
>
>
>
> ----- Original Message -----
> > From: "Doron Fediuck" <dfediuck(a)redhat.com>
> > To: "Dan Kenigsberg" <danken(a)redhat.com>
> > Cc: "engine-devel" <engine-devel(a)ovirt.org>,
> > vdsm-devel(a)fedorahosted.org
> > Sent: Thursday, December 20, 2012 2:14:45 PM
> > Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
> >
> >
> >
> > ----- Original Message -----
> > > From: "Dan Kenigsberg" <danken(a)redhat.com>
> > > To: "Itamar Heim" <iheim(a)redhat.com>
> > > Cc: "engine-devel" <engine-devel(a)ovirt.org>,
> > > vdsm-devel(a)fedorahosted.org
> > > Sent: Thursday, December 20, 2012 11:55:10 AM
> > > Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
> > >
> > > On Thu, Dec 20, 2012 at 10:23:55AM +0200, Itamar Heim wrote:
> > > > On 12/20/2012 09:43 AM, Dan Kenigsberg wrote:
> > > > >On Wed, Dec 19, 2012 at 09:53:15AM -0500, Doron Fediuck wrote:
> > > > >>
> > > > >>----- Original Message -----
> > > > >>>From: "Dan Kenigsberg"
<danken(a)redhat.com>
> > > > >>>To: "Greg Padgett" <gpadgett(a)redhat.com>
> > > > >>>Cc: "engine-devel"
<engine-devel(a)ovirt.org>,
> > > > >>>vdsm-devel(a)fedorahosted.org
> > > > >>>Sent: Wednesday, December 19, 2012 3:59:11 PM
> > > > >>>Subject: Re: [Engine-devel] CPU Overcommit Feature
> > > > >>>
> > > > >>>On Mon, Dec 17, 2012 at 09:37:57AM -0500, Greg Padgett
> > > > >>>wrote:
> > > > >>>>Hi,
> > > > >>>>
> > > > >>>>I've been working on a feature to allow CPU
Overcommitment
> > > > >>>>of
> > > > >>>>hosts
> > > > >>>>in a cluster. This first stage allows the engine to
> > > > >>>>consider
> > > > >>>>host
> > > > >>>>cpu threads as cores for the purposes of VM resource
> > > > >>>>allocation.
> > > > >>>>
> > > > >>>>This wiki page has further details, your comments
are
> > > > >>>>welcome!
> > > > >>>>http://www.ovirt.org/Features/cpu_overcommit
> > > > >>>
> > > > >>>I've commented about the vdsm/engine API on
> > > > >>>http://gerrit.ovirt.org/#/c/10144/ but it is probably
better
> > > > >>>to
> > > > >>>reiterate it here.
> > > > >>>
> > > > >>>The suggested API is tightly coupled with an ugly hack
we
> > > > >>>pushed
> > > > >>>to
> > > > >>>vdsm
> > > > >>>in order not to solve the issue properly on the first
> > > > >>>strike.
> > > > >>>
> > > > >>>If we had not have report_host_threads_as_cores, I think
> > > > >>>we'd
> > > > >>>have a
> > > > >>>simpler API reporting only cpuThreads and cpuCores; with
no
> > > > >>>funny
> > > > >>>boolean flags.
> > > > >>>
> > > > >>>Let us strive to that position as much as we can.
> > > > >>>
> > > > >>>How about asking whoever used
report_host_threads_as_cores
> > > > >>>to
> > > > >>>unset
> > > > >>>it
> > > > >>>once they install Engine 3.2 ? I think that these are
very
> > > > >>>few
> > > > >>>people,
> > > > >>>that would not mind this very much.
> > > > >>>
> > > > >>>If this is impossible, I'd add a cpuCores2, always
reporting
> > > > >>>the
> > > > >>>true
> > > > >>>number, to be used by new Engines. We may even report it
> > > > >>>only
> > > > >>>on
> > > > >>>the
> > > > >>>very few cases of report_host_threads_as_cores being
set.
> > > > >>>
> > > > >>>Dan.
> > > > >>
> > > > >>Hi Dan,
> > > > >>Thanks for the review.
> > > > >>
> > > > >>I agree simply reporting cores and threads would be the
right
> > > > >>solution.
> > > > >>However, when you have hyperthreading turned off you get
> > > > >>cores=threads.
> > > > >>This is the same situation you have when hyperthreading
> > > > >>turned
> > > > >>on, and
> > > > >>someone used the vdsm configuration of reporting threads as
> > > > >>cores.
> > > > >>
> > > > >>So the engine won't know the real status of the host.
> > > > >
> > > > >This is not surprising, as report_host_threads_as_cores means
> > > > >in
> > > > >blunt
> > > > >English "lie to Engine about the number of cores". The
newly
> > > > >suggested
> > > > >flag says "don't believe what I said in cpuCores, since
I'm
> > > > >lying". Next
> > > > >thing we'd have is another flag saying that the former flag
> > > > >was
> > > > >a
> > > > >lie,
> > > > >and cpuCores is actually trustworthy.
> > > > >
> > > > >Instead of dancing this dance, I suggest to stop lying.
> > > > >
> > > > >report_host_threads_as_cores was a hack to assist a older
> > > > >Engine
> > > > >versions. Engine users that needed it had to set it
> > > > >out-of-band
> > > > >on
> > > > >their
> > > > >hosts. Now if they upgrade their Engine, they can -- as easily
> > > > >--
> > > > >reset
> > > > >that value.
> > > > >
> > > > >If they forget, nothing devastating happens beyond Engine
> > > > >assuming
> > > > >that
> > > > >hyperthreading is off.
> > > > >
> > > > >Please consider this suggestion. I find it the simplest for
> > > > >all
> > > > >involved
> > > > >parties.
> > > >
> > > > the only problem is the new vdsm doesn't know which engine may
> > > > be
> > > > using it.
> > > > if engine would say "getVdsCaps engineVersion=3.2", then
vdsm
> > > > could
> > > > know engine no longer needs lying to and ignore the flag,
> > > > re-using
> > > > same field.
> > >
> > > Note that I do not suggest to drop report_host_threads_as_cores
> > > now.
> > > I am suggesting to keep on lying even to new Engine.
> > > If someone thinks that lying is bad, he should reset
> > > report_host_threads_as_cores.
> > >
> > > It seems to me that the suggested API is being coerced by a very
> > > limited
> > > use case, that is not going to be really harmed by a
> > > straight-forward
> > > API.
> > >
> > > Dan.
> >
> > Dan,
> > Did some further checking, and we can go with it;
> > So basically now we add cpuThreads. Additionally, if the
> > report_host_threads_as_cores
> > is turned on, an additional cpuCoresReal will be reported.
>
> No need for that.
> There is only one problematic state where VDSM cheats and reports
> cores == threads.
> This does not happen by mistake, the user specifically asked for it.
>
> The above condition above also happens if threads is really off or
> the processor does not have threads, so it's a state we need to
> handle in any case.
>
> So just report threads=off, whenever cores == threads and treat it as
> such. If the user is unhappy then he should just turn off the cheat
> mode.
>
ack.
* <3.2 clusters are not supported.
* >=3.2 clusters will assume HT off unless cores < threads.
Some release note is needed for this, so users will be advised to
turn off vdsm cheating when upgrading to new 3.2 engine, in case they
happen to use it.
Plain and simple. vdsm side accepted to master branch.
Please backport it to the ovirt-3.2 branch in order to get into the
release.