----- Original Message -----
From: "Dan Kenigsberg" <danken(a)redhat.com>
To: "Itamar Heim" <iheim(a)redhat.com>
Cc: "engine-devel" <engine-devel(a)ovirt.org>, vdsm-devel(a)fedorahosted.org
Sent: Thursday, December 20, 2012 11:55:10 AM
Subject: Re: [Engine-devel] [vdsm] CPU Overcommit Feature
On Thu, Dec 20, 2012 at 10:23:55AM +0200, Itamar Heim wrote:
> On 12/20/2012 09:43 AM, Dan Kenigsberg wrote:
> >On Wed, Dec 19, 2012 at 09:53:15AM -0500, Doron Fediuck wrote:
> >>
> >>----- Original Message -----
> >>>From: "Dan Kenigsberg" <danken(a)redhat.com>
> >>>To: "Greg Padgett" <gpadgett(a)redhat.com>
> >>>Cc: "engine-devel" <engine-devel(a)ovirt.org>,
> >>>vdsm-devel(a)fedorahosted.org
> >>>Sent: Wednesday, December 19, 2012 3:59:11 PM
> >>>Subject: Re: [Engine-devel] CPU Overcommit Feature
> >>>
> >>>On Mon, Dec 17, 2012 at 09:37:57AM -0500, Greg Padgett wrote:
> >>>>Hi,
> >>>>
> >>>>I've been working on a feature to allow CPU Overcommitment of
> >>>>hosts
> >>>>in a cluster. This first stage allows the engine to consider
> >>>>host
> >>>>cpu threads as cores for the purposes of VM resource
> >>>>allocation.
> >>>>
> >>>>This wiki page has further details, your comments are welcome!
> >>>>http://www.ovirt.org/Features/cpu_overcommit
> >>>
> >>>I've commented about the vdsm/engine API on
> >>>http://gerrit.ovirt.org/#/c/10144/ but it is probably better to
> >>>reiterate it here.
> >>>
> >>>The suggested API is tightly coupled with an ugly hack we pushed
> >>>to
> >>>vdsm
> >>>in order not to solve the issue properly on the first strike.
> >>>
> >>>If we had not have report_host_threads_as_cores, I think we'd
> >>>have a
> >>>simpler API reporting only cpuThreads and cpuCores; with no
> >>>funny
> >>>boolean flags.
> >>>
> >>>Let us strive to that position as much as we can.
> >>>
> >>>How about asking whoever used report_host_threads_as_cores to
> >>>unset
> >>>it
> >>>once they install Engine 3.2 ? I think that these are very few
> >>>people,
> >>>that would not mind this very much.
> >>>
> >>>If this is impossible, I'd add a cpuCores2, always reporting the
> >>>true
> >>>number, to be used by new Engines. We may even report it only on
> >>>the
> >>>very few cases of report_host_threads_as_cores being set.
> >>>
> >>>Dan.
> >>
> >>Hi Dan,
> >>Thanks for the review.
> >>
> >>I agree simply reporting cores and threads would be the right
> >>solution.
> >>However, when you have hyperthreading turned off you get
> >>cores=threads.
> >>This is the same situation you have when hyperthreading turned
> >>on, and
> >>someone used the vdsm configuration of reporting threads as
> >>cores.
> >>
> >>So the engine won't know the real status of the host.
> >
> >This is not surprising, as report_host_threads_as_cores means in
> >blunt
> >English "lie to Engine about the number of cores". The newly
> >suggested
> >flag says "don't believe what I said in cpuCores, since I'm
> >lying". Next
> >thing we'd have is another flag saying that the former flag was a
> >lie,
> >and cpuCores is actually trustworthy.
> >
> >Instead of dancing this dance, I suggest to stop lying.
> >
> >report_host_threads_as_cores was a hack to assist a older Engine
> >versions. Engine users that needed it had to set it out-of-band on
> >their
> >hosts. Now if they upgrade their Engine, they can -- as easily --
> >reset
> >that value.
> >
> >If they forget, nothing devastating happens beyond Engine assuming
> >that
> >hyperthreading is off.
> >
> >Please consider this suggestion. I find it the simplest for all
> >involved
> >parties.
>
> the only problem is the new vdsm doesn't know which engine may be
> using it.
> if engine would say "getVdsCaps engineVersion=3.2", then vdsm could
> know engine no longer needs lying to and ignore the flag, re-using
> same field.
Note that I do not suggest to drop report_host_threads_as_cores now.
I am suggesting to keep on lying even to new Engine.
If someone thinks that lying is bad, he should reset
report_host_threads_as_cores.
It seems to me that the suggested API is being coerced by a very
limited
use case, that is not going to be really harmed by a straight-forward
API.
Dan.
Dan,
Did some further checking, and we can go with it;
So basically now we add cpuThreads. Additionally, if the report_host_threads_as_cores
is turned on, an additional cpuCoresReal will be reported.
Since this is a 3.2 cluster feature, the engine will make sure to update
the DB with the real numbers, and older clusters will use the information
vdsm reports, while blocking the engine side optimization.
An update path is already available here:
http://gerrit.ovirt.org/#/c/10144/4
I hope that for cluster 3.3 we'll be able to drop the vdsm configuration.
A review would be highly appreciated due to the holidays proximity.
Doron