----- Original Message -----
From: "Adam Litke" <agl(a)us.ibm.com>
To: "Saggi Mizrahi" <smizrahi(a)redhat.com>
Cc: "Shu Ming" <shuming(a)linux.vnet.ibm.com>, "engine-devel"
<engine-devel(a)ovirt.org>, "VDSM Project Development"
<vdsm-devel(a)lists.fedorahosted.org>
Sent: Monday, December 10, 2012 4:47:46 PM
Subject: Re: [vdsm] RFC: New Storage API
On Mon, Dec 10, 2012 at 03:36:23PM -0500, Saggi Mizrahi wrote:
> > Statements like this make me start to worry about your userData
> > concept. It's a
> > sign of a bad API if the user needs to invent a custom metadata
> > scheme for
> > itself. This reminds me of the abomination that is the 'custom'
> > property in the
> > vm definition today.
> In one sentence: If VDSM doesn't care about it, VDSM doesn't manage
> it.
>
> userData being a "void*" is quite common and I don't understand why
> you would think it's a sign of a bad API.
> Furthermore, giving the user choice about how to represent its
> own metadata and what fields it wants to keep seems reasonable to
> me.
> Especially given the fact that VDSM never reads it.
>
> The reason we are pulling away from the current system of VDSM
> understanding the extra data is that it makes that data tied to
> VDSM's on-disk format.
> VDSM's on-disk format has to be very stable because of clusters with
> multiple VDSM versions.
> Furthermore, since this is actually manager data it has to be tied
> to the manager backward compatibility lifetime as well.
> Having it be opaque to VDSM ties it to only one, simpler, support
> lifetime instead of two.
>
> I guess you are implying that it will make it problematic for
> multiple users to read userData left by another user because the
> formats might not be compatible.
> The solution is that all parties interested in using VDSM storage
> agree on format, and common fields, and supportability, and all
> the other things that choosing a supporting *something* entails.
> This is, however, out of the scope of VDSM. When the time comes I
> think how the userData blob is actually parsed and what fields it
> keeps should be discussed on ovirt-devel or engine-devel.
>
> The crux of the issue is that VDSM manages only what it cares about
> and the user can't modify directly.
> This is done because everything we expose we commit to.
> If you want any information persisted like:
> - Human readable name (in whatever encoding)
> - Is this a template or a snapshot
> - What user owns this image
>
> You can just put it in the userData.
> VDSM is not going to impose what encoding you use.
> It's not going to decide if you represent your users as IDs or
> names or ldap queries or Public Keys.
> It's not going to decide if you have explicit templates or not.
> It's not going to decide if you care what is the logical image
> chain.
> It's not going to decide anything that is out of its scope.
> No format is future proof, no selection of fields will be good for
> any situation.
> I'd much rather it be someone else's problem when any of them need
> to be changed.
> They have so far been VDSM's problem and it has been hell to
> maintain.
In general, I actually agree with most of this. What I want to avoid
is pushing
things that should actually be a part of the API into this userData
blob. We do
want to keep the API as simple as possible to give vdsm flexibility.
If, over
time, we find that users are always using userData to work around
something
missing in the API, this could be a really good sign that the API
needs
extension.
I have actually been contemplating this for quite a while.
If the reply is lost while you create an image, or VDSM is unable to know whether the
operation was committed, the user will have no way of knowing what the new image ID is.
To solve this it is recommended that the manager puts some sort of task related
information in the user data.
If the operation ever finishes in an ambiguous state, the user just reads the userData
from any images it doesn't know or whose state it is unsure about.
This is a flow that every client will have to have.
So why not just add that to the API?
Because I don't want to impose how this "information" gets generated, what
is the content of that data or how unique it has to be.
Since VDSM doesn't use it for anything, I don't feel like I need to figure this
out.
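To make the flow concrete, here is a rough sketch of how a client might do it.
Everything in it is an assumption made for illustration: FakeStorage is an
in-memory stand-in and not the VDSM API, and the JSON layout with a "task" key
is just one possible client-side convention. VDSM itself would only store and
return the bytes.

```python
import json
import uuid


class FakeStorage:
    """Hypothetical in-memory stand-in for a storage backend (not the VDSM API)."""

    def __init__(self):
        self.images = {}  # image ID -> opaque userData bytes

    def create_image(self, user_data, lose_reply=False):
        image_id = str(uuid.uuid4())
        self.images[image_id] = user_data  # the image is committed here
        if lose_reply:
            # Simulate the reply getting lost after the commit.
            raise ConnectionError("reply lost")
        return image_id

    def list_images(self):
        return list(self.images)

    def get_user_data(self, image_id):
        return self.images[image_id]


def create_with_recovery(storage, known_ids, lose_reply=False):
    """Create an image; if the outcome is ambiguous, find it by its task tag."""
    # The client chooses how the tag is generated; the storage never reads it.
    task_tag = str(uuid.uuid4())
    user_data = json.dumps({"task": task_tag}).encode("utf-8")
    try:
        return storage.create_image(user_data, lose_reply=lose_reply)
    except ConnectionError:
        # Ambiguous state: read userData from every image we don't know about.
        for image_id in storage.list_images():
            if image_id in known_ids:
                continue
            try:
                blob = json.loads(storage.get_user_data(image_id).decode("utf-8"))
            except ValueError:
                continue  # written by a client using some other format
            if isinstance(blob, dict) and blob.get("task") == task_tag:
                return image_id
        return None  # the operation was never committed
```

Note that the tag format, its uniqueness and the matching logic all live in the
client, which is the point: none of it has to be decided by VDSM.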
I am all for simplicity, but simplicity is kind of an abstract concept. Having it be a
blob is in some aspects the simplest thing you can do.
Just saying "I have a field, put whatever in it" is simple to convey but
does require more work on the user's side to figure out what to do with it.
All that being said, I do think that the format, fields and how to use them should be
defined so different users can communicate and synchronize.
It's also important that you don't reinvent the wheel for every flow in every
client.
I'm just saying that it's not in the scope of VDSM.
It should be done as a standard that all users of VDSM agree to conform to.
It's the same way that a file-system doesn't check that every *.PDF file is a
proper PDF file. PDF clients agree on how to read/save files and users promise to mark
them with that suffix.
Think of what would happen if the FS did check that kind of thing. PDF features would be
bound to the kernel release cycle, so that the kernel doesn't refuse to save/load a file
just because your PDF client is newer or older than the kernel.
You would have to have code to validate PDF files in the kernel. That means more code
which, in turn, means more bugs.
--
Adam Litke <agl(a)us.ibm.com>
IBM Linux Technology Center