[Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

Mon Dec 3 20:30:21 UTC 2012

On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> 
> 
> ----- Original Message -----
> > From: "Adam Litke" <agl at us.ibm.com>
> > To: "Saggi Mizrahi" <smizrahi at redhat.com>
> > Cc: engine-devel at linode01.ovirt.org, "Dan Kenigsberg" <danken at redhat.com>,
> > "Federico Simoncelli" <fsimonce at redhat.com>, "Ayal Baron" <abaron at redhat.com>,
> > vdsm-devel at lists.fedorahosted.org
> > Sent: Thursday, November 29, 2012 5:22:43 PM
> > Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> > 
> > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > They are not future proof as the paradigm is completely different.
> > > Storage domain IDs are not static any more (and are not guaranteed to be
> > > unique or the same across the cluster).  Image IDs represent the ID of the
> > > projected data and not the actual unique path.  Just as an example, to run
> > > a VM you give a list of domains that might contain the needed images in
> > > the chain and the image ID of the tip.  The paradigm has changed too, and
> > > most calls get a non synchronous number of images and domains.
> > > Furthermore, the APIs themselves are completely different. So future proofing is
> > > not really an issue.
> > 
> > I don't understand this at all.  Perhaps we could all use some education on
> > the planned architectural changes.  If I can pass an
> > arbitrary list of domainIDs that _might_ contain the data, why wouldn't I
> > just pass all of them every time?  In that case, why are they even required
> > since vdsm would have to search anyway?
> It's for optimization mostly, the engine usually has a good idea of where
> stuff is, and having it give hints to VDSM can speed up the search process.
> Also, the engine knows how transient some storage pieces are. If you have a
> domain that is only there for backup or "owned" by another manager sharing the
> host, you don't want your VMs using the disks that are on that storage,
> effectively preventing it from being removed (though we do have plans to have
> qemu switch base snapshots at runtime for just that).

This is not a clean design.  If the search is slow, then maybe we need to
improve caching internally.  Making a client cache a bunch of internal IDs to
pass around sounds like a complete layering violation to me.
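
To illustrate what I mean by caching internally, here is a rough sketch.  It is
not existing vdsm code: ImageLocator and the search_domains callable are names I
made up for the example, and it assumes an image resolves to a single domain.
The point is only that the slow search happens once and the client never has to
carry the hint.

```python
class ImageLocator:
    """Hypothetical server-side cache of image -> domain lookups."""

    def __init__(self, search_domains):
        # search_domains: callable(image_id) -> domain_id or None.
        # Stands in for the slow scan over all attached domains.
        self._search = search_domains
        self._cache = {}

    def locate(self, image_id):
        """Return the domain holding image_id, searching only on a miss."""
        domain = self._cache.get(image_id)
        if domain is None:
            domain = self._search(image_id)
            if domain is not None:
                self._cache[image_id] = domain
        return domain

    def invalidate(self, image_id):
        """Drop a cached entry, e.g. when a domain is detached."""
        self._cache.pop(image_id, None)
```

Invalidation is the hard part, of course, but that is a problem vdsm is in a
better position to solve than every client separately.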

> > 
> > > As to making the current API a bit simpler. As I said, making them opaque
> > > is problematic as currently the engine is responsible for creating the
> > > IDs.
> > 
> > As I mentioned in my last post, engine still can specify the ID's when the
> > object is first created.  From that point forward the ID never changes so it
> > can be baked into the identifier.
> Where will this identifier be persisted?
> > 
> > > Further more, some calls require you to play with these (making a template
> > > instead of a snapshot).  Also, the full chain and topology needs to be
> > > completely visible to the engine.
> > 
> > Please provide a specific example of how you play with the IDs.  I can guess
> > where you are going, but I don't want to divert the thread.
> The relationship between volumes and images is deceptive at the moment.  IMG
> is the chain and volume is a member, IMGUUID is only used for verification
> and to detect when we hit a template going up the chain.  When you do
> operations on images, guarantees about the resulting IDs are assumed.
> When you copy an image, you assume to know all the new IDs as they remain the
> same.  With your method I can't tell what the new "opaque" result is going to
> be.  Preview mode (another abomination being deprecated) relies on the
> disconnect between imgUUID and volUUID.  Live migration currently moves a lot
> of the responsibility to the engine.

No client should need to know about all of these internal details.  I understand
that's the way it is today, and that's one of the main reasons that the API is a
complete pain to use.
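
As a rough sketch of the opaque identifier idea (purely illustrative: the token
format, the function names and the sdUUID/imgUUID/volUUID keys are assumptions
for the example, not a proposed wire format), the server could pack the IDs
given at creation time into one token and unpack it on later calls, while the
client only stores and passes the token:

```python
import base64
import json


def make_handle(kind, **ids):
    """Pack internal IDs into one opaque token (hypothetical format)."""
    # sort_keys makes the token deterministic for the same set of IDs.
    payload = json.dumps({"kind": kind, "ids": ids}, sort_keys=True)
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def parse_handle(handle):
    """Server-side only: recover the object kind and internal IDs."""
    payload = json.loads(base64.urlsafe_b64decode(handle.encode("ascii")))
    return payload["kind"], payload["ids"]
```

For example, make_handle("volume", sdUUID=..., imgUUID=..., volUUID=...) gives a
single string the client can hold on to, and parse_handle() returns the same
kind and IDs on the vdsm side.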

> > 
> > > These things, as you said, are problematic. But this is the way things are
> > > today.
> > 
> > We are changing them.
> Any intermediary step is needlessly problematic for existing clients.  Work is
> already in progress for fixing the API properly, making some calls a bit nicer
> isn't an excuse to start making more compatibility code in the engine.

The engine won't need compatibility code.  This only would impact the jsonrpc
bindings which aren't used by engine yet.  When engine switches over, then yes
it would need to adapt.

> > 
> > > As for task IDs.  Currently task IDs are only used for storage and they
> > > get persisted to disk. This is WRONG and is not the case with the new
> > > storage API.  Because we moved to an asynchronous message based protocol
> > > (json-rpc over TCP/AMQP) there is no need to generate a task ID. It is
> > > built into json-rpc.  json-rpc specifies that the IDs have to be unique
> > > for a client as long as the request is still active.  This is good enough
> > > as internally we can have a verb for a client to query its own running
> > > tasks and a verb to query other host tasks by mangling in the client
> > > before the ID.  Because the protocol is
> > 
> > So this would rely on the client keeping the connection open and as soon as
> > it disconnects it would lose the ability to query tasks from before the
> > connection went down?  I don't know if it's a good idea to conflate message
> > ID's with task ID's.  While the protocol can operate asynchronously, some
> > calls have synchronous semantics and others have asynchronous semantics.  I
> > would expect sync calls to return their data immediately and async calls to
> > return immediately with either: an error code, or an 'operation started'
> > message and associated ID for querying the status of the operation.
> Upon reflection I agree that having the request ID unique per client is
> problematic and we need to make sure they are unique per host at every point
> in time.
> > 
> > > asynchronous, all calls are asynchronous by nature as well.  Tasks will no
> > > longer be persisted or expected to be persisted. It's the caller's
> > > responsibility to query the state and see if the operation succeeded or
> > > failed if the caller or VDSM died in the middle of the call. The current
> > > "cleanTask()" system can't be used when more than one client is using VDSM
> > > and will not be used for anything other than legacy storage.
> > 
> > I agree about not persisting tasks in the future.  Although I think finished
> > tasks should remain in memory for some time so they can be queried by a
> > client who must reconnect.
> I am completely against keeping the task for a nominal amount of time, it just
> makes another flow.  You need to have code that makes up for the case where
> you missed that window anyway, so just have one recovery code path: when VDSM
> loses your task or you lose VDSM, recover immediately.  Also, because task IDs
> can be reused once they expire, assuming that the task you encountered is the
> same task you originally sent is problematic.
> 
> If you expect intermittent connections use the AMQP backend (which will
> support broker-less p2p communication as well)

How will you tell the difference between a completed task and an invalid (no
such task) task?  Do all completed tasks just issue a task completed event?
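
To make the question concrete, here is a minimal sketch of the in-memory
retention I suggested earlier.  Everything in it is hypothetical (TaskRegistry,
the state names, the 300 second default); it is not how vdsm tracks tasks today.
Nothing is persisted, finished tasks stay queryable for a window, and a task
that never existed is indistinguishable from one that has expired:

```python
import time
from enum import Enum


class TaskState(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    UNKNOWN = "unknown"  # never existed, or finished and already expired


class TaskRegistry:
    """Hypothetical in-memory task table; nothing is written to disk."""

    def __init__(self, retention_seconds=300.0, clock=time.monotonic):
        self._retention = retention_seconds
        self._clock = clock
        self._running = set()
        self._finished = {}  # task_id -> (result, finish_time)

    def start(self, task_id):
        self._running.add(task_id)

    def finish(self, task_id, result):
        self._running.discard(task_id)
        self._finished[task_id] = (result, self._clock())

    def _expire(self):
        # Forget finished tasks older than the retention window.
        now = self._clock()
        expired = [t for t, (_, done) in self._finished.items()
                   if now - done > self._retention]
        for t in expired:
            del self._finished[t]

    def query(self, task_id):
        """Return (state, result); result is None unless FINISHED."""
        self._expire()
        if task_id in self._running:
            return TaskState.RUNNING, None
        if task_id in self._finished:
            return TaskState.FINISHED, self._finished[task_id][0]
        return TaskState.UNKNOWN, None
```

Without some retention like this, query() could only ever answer RUNNING or
UNKNOWN, so a client that reconnects has no way to tell "done" from "no such
task" unless it caught a completion event.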

> > 
> > > AFAIK Apart from storage all objects IDs are constructed with a single ID,
> > > name or alias. VMs, storageConnections, network interfaces. So it's not a
> > > real issue.  I agree that in the future we should keep the idiom of pass
> > > configuration once, name it, and keep using the name to reference the
> > > object.
> > 
> > Yes, storage is the major problem here.
> And, as I said, changing the API is problematic for migration of current
> users.
> > 

-- 
Adam Litke <agl at us.ibm.com>
IBM Linux Technology Center