----- Original Message -----
From: "Dan Kenigsberg" <danken(a)redhat.com>
To: "Francesco Romani" <fromani(a)redhat.com>
Cc: devel(a)ovirt.org, "vdsm-devel" <vdsm-devel(a)lists.fedorahosted.org>
Sent: Thursday, April 10, 2014 12:25:33 PM
Subject: Re: [VDSM] cleaning statistics retrieval
[...]
> > Could you explain how an AttributeError there moved the VM to Down?
>
> This should actually be this bug of engine
> https://bugzilla.redhat.com/show_bug.cgi?id=1072282
> if GetVmStats fails for whatever reason, engine thinks the VM is down.
Could you rename the Vdsm bug to express the in-vdsm problem, and make
clear that it confuses older Engines to think the VM is Down?
Done
[...]
> The VDSM 'grading' was a hint from VDSM to help engine to distinguish
> between those cases.
>
> Even if we agree this grading idea is bad, the core issue remains open:
> what to do if we end up with a partial response?
> For example, let's say we handle the getBalloonInfo exception
> (http://gerrit.ovirt.org/#/c/26539/),
> the stats object to be returned will lack
>
> * the (mandatory, expected) balloon stats
> * the (optional) migration progress - ok, bad example because this makes
> sense only during migrations,
> but other optional fields may be added later and the issue remains
>
> Again, anyone feel free to correct me if I misunderstood something about
> engine
> (or VDSM <=> engine communication) and to suggest better alternatives :\
Currently, we have way too many try-except-Exception clauses in our
code. They swallow everything: from expected libvirt errors to
unexpected syntax errors. We should eliminate them, not add more.
Mandatory stuff must be reported, so if we fail to extract it
we'd better explode and raise an error. Optional stuff is optional, so
we could drop it from the output, in order to still report the mandatory
fields.
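The mandatory/optional policy described above could be sketched roughly
as follows. This is only an illustration: collect_stats and
ExpectedStatsError are hypothetical names, not existing Vdsm code, and
ExpectedStatsError stands in for whatever expected error (e.g. a libvirt
error) a getter may raise.

```python
class ExpectedStatsError(Exception):
    """Hypothetical stand-in for an expected failure, e.g. a libvirt error."""


def collect_stats(mandatory, optional):
    """Build a stats dict from two {name: zero-argument callable} mappings."""
    stats = {}
    for name, getter in mandatory.items():
        # No try/except here: a failure in a mandatory field must
        # reach the caller instead of being papered over.
        stats[name] = getter()
    for name, getter in optional.items():
        try:
            stats[name] = getter()
        except ExpectedStatsError:
            # Only the expected error is handled, and the field is
            # simply omitted. Anything else (AttributeError, ...)
            # still propagates, so real bugs are not swallowed.
            continue
    return stats
```

The point of the narrow except clause is that an unexpected
AttributeError in an optional getter still explodes, while an expected
failure only removes that one key from the result.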
http://gerrit.ovirt.org/#/c/26539/2/vdsm/virt/vm.py currently suggests
that Vdsm lies when it fails to extract the current balloon size, and says
that it's 0. I'd prefer to drop balloon_cur from the return value of
_getBalloonInfo or drop balloonInfo from the reported stats.
My only question is the granularity of the definition: is balloonInfo
atomic, or can it be reported without balloon_cur? Should it? I'd prefer
to have the "atoms" as big as possible, to limit the number of
combinations - if self._dom is None, don't report balloonInfo at all,
and likewise if the libvirt connection timed out.
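Treating balloonInfo as one atom might look something like the sketch
below. The names are assumptions made for illustration: DomainTimeout
stands in for a timed-out libvirt call, current_balloon() is a
hypothetical accessor on the domain object, and balloon_max is just an
example of a second field; none of this is the actual Vdsm code.

```python
class DomainTimeout(Exception):
    """Hypothetical stand-in for a timed-out libvirt call."""


def get_balloon_info(dom, balloon_max):
    """Return the complete balloon info dict, or None if it cannot be built."""
    if dom is None:
        # No domain object: nothing truthful to report.
        return None
    try:
        balloon_cur = dom.current_balloon()  # hypothetical accessor
    except DomainTimeout:
        # Do not fall back to 0; that would be a made-up value.
        return None
    return {"balloon_max": balloon_max, "balloon_cur": balloon_cur}


def add_balloon_info(stats, dom, balloon_max):
    """Add 'balloonInfo' to stats only when every field is available."""
    info = get_balloon_info(dom, balloon_max)
    if info is not None:
        stats["balloonInfo"] = info
    return stats
```

With this shape the receiver only has to handle two cases, balloonInfo
present and complete or balloonInfo absent, rather than every
combination of missing sub-fields.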
OK, I'll amend http://gerrit.ovirt.org/#/c/26539/ accordingly (and of course
taking into account Michal's remarks).
I previously posted a tentative getStats cleanup as this patch series:
http://gerrit.ovirt.org/#/c/26547/ (13 patches, but quite fine-grained). In light
of what we discussed on this thread, I consider them obsolete and I'm going
to abandon them.
Bests,
--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani