----- Original Message -----
From: "Dan Kenigsberg" <danken(a)redhat.com>
To: "Francesco Romani" <fromani(a)redhat.com>
Cc: "vdsm-devel" <vdsm-devel(a)lists.fedorahosted.org>, devel(a)ovirt.org
Sent: Tuesday, April 8, 2014 6:57:15 PM
Subject: Re: cleaning statistics retrieval
On Tue, Apr 08, 2014 at 06:52:50AM -0400, Francesco Romani wrote:
> Hello VDSM developers,
Please use devel@ovirt, and mention "vdsm" at the subject. This thread
in particular involves Engine as well.
Right, I forgot. Sorry about that.
> I'd like to discuss and explain the plans for cleaning up
> in vdsm/virt/vm.py, and how it affects a bug we have:
> Let's start from the bug.
> To make a long story short, there is a (small) race in VDSM, probably
> introduced by commit
> The race can actually be triggered only if the VM is shutting down, so it
> is not that bad.
> Fixing the reported issue in the specific case can be done with a trivial
> if, and that it what I did
> in my initial http://gerrit.ovirt.org/#/c/25803/1/vdsm/vm.py,cm
Could you explain how an AttributeError there moved the VM to Down?
This should actually be this bug of engine
if GetVmStats fails for whatever reason, engine thinks the VM is down.
> And this is the core of the issue.
> My initial idea is to either return success and a complete, well formed
> statistics set, or return an error.
> However current engine seems to not cope properly with this, and we cannot
> break backward compatibility.
Would you be more precise? If getAllVmStats returns an errCode for one
of the Vms, what happens?
For GetAllVmStats, AFAIK, but please correct me if I am wrong, because I'm not really
expert on engine side,
engine simply does not expects anything different from a list
of either a RunningVmStats or an ExitedVmStats.
So not sure (will verify just after this mail) if engine can cope with mixed content,
meaning [ stats, errCode, stats, stats ... ]
For GetVmStats, like Michal said, the engine expects the call to succeed otherwise
it puts the VM into the Unknown state.
> Looks like the only way to go is to always return success and to add a
> field to describe the content of the
> statistics (partial, complete...). THis is admittedly a far cry from the
> ideal solution, but it is dictated
> by the need to preserve the compatibility with current/old engines.
I don't think that I understand your suggestion, but it does not sound
right to send a partial dictionary and to "appologize" about its
paritiality elsewhere. The dictionary may be "partial" for engine-4.0
yet "complete" for engine-3.5. It's not for Vdsm to grade its own
I see your point (that's one of the reasons I'm not happy about this solution).
Please see below for the detauls.
> please note that I'm not really happy about this solution,
but, given the
> constraint, I don't see better alternatives.
Please explain the benefits of describing the partial content, as I do
not see them.
The root issue here is the getStats must always succeed, because the engine doesn't
expect otherwise and thus we guarantee this to cope with old engines;
but inside VDSM, getStats can actually fail in a lot of places
(like in this case getBalloonInfo)
So, in VDSM we can end up with a partial response, and now the question
is: what to do with this partial response? And if there are optional fields
in the response, how can the engine distinguish between
* full response, but with an optional field missing
* partial response (because of an exception inside VDSM),
without some field, incidentally including an optional one
The VDSM 'grading' was an hint from VDSM to help engine to distinguish
between those cases.
Even if we agree this grading idea is bad, the core issue remains open:
what to do if we end up with a partial response?
For example, let's say we handle the getBalloonInfo exception
the stats object to be returned will lack
* the (mandatory, expected) balloon stats
* the (optional) migration progress - ok, bad example because this makes sense only during
but other optional fields may be added later and the issue remains
Again, anyone feel free to correct me if I misunderstood something about engine
(or VDSM <=> engine communication) and to suggest better alternatives :\
Thanks and bests,
RedHat Engineering Virtualization R & D