Re: [Engine-devel] Managing async tasks

17 Dec 2012


      ----- Original Message -----
...
From: "Adam Litke" <agl@us.ibm.com>
To: "Saggi Mizrahi" <smizrahi@redhat.com>
Cc: "Dan Kenigsberg" <danken@redhat.com>, "Ayal Baron" <abaron@redhat.com>, "Federico Simoncelli"
<fsimonce@redhat.com>, engine-devel@ovirt.org, vdsm-devel@lists.fedorahosted.org
Sent: Monday, December 17, 2012 2:16:25 PM
Subject: Re: Managing async tasks
On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
...
----- Original Message -----
...
From: "Adam Litke" <agl@us.ibm.com>
To: vdsm-devel@lists.fedorahosted.org
Cc: "Dan Kenigsberg" <danken@redhat.com>, "Ayal Baron" <abaron@redhat.com>, "Saggi Mizrahi" <smizrahi@redhat.com>, "Federico Simoncelli" <fsimonce@redhat.com>, engine-devel@ovirt.org
Sent: Monday, December 17, 2012 12:00:49 PM
Subject: Managing async tasks
On today's vdsm call we had a lively discussion around how asynchronous operations should be handled in the future.  In an effort to include more people in the discussion and to better capture the resulting conversation I would like to continue that discussion here on the mailing list.
A lot of ideas were thrown around about how 'tasks' should be handled in the future.  There are a lot of ways that it can be done.  To determine how we should implement it, it's probably best if we start with a set of requirements.  If we can first agree on these, it should be easy to find a solution that meets them.  I'll take a stab at identifying a first set of POSSIBLE requirements:
- Standardized method for determining the result of an operation
  This is a big one for me because it directly affects the consumability of the API.  If each verb has different semantics for discovering whether it has completed successfully, then the API will be nearly impossible to use easily.
Since there is no way to be sure whether some tasks completed successfully or failed, especially in the murky waters of storage, I say this requirement should be removed, or at least not applied in the context of a task.
I don't agree.  Please feel free to convince me with some examples.  If we cannot provide feedback to a user as to whether their request has been satisfied or not, then we have some bigger problems to solve.
Suppose VDSM sends a write command to a storage server and the connection hangs up before the ACK is returned.
The operation has been committed, but VDSM has no way of knowing that; as far as VDSM is concerned, it got an ETIMEDOUT or EIO.
This is the same problem that the engine has with VDSM.
If VDSM creates an image/VM/network/repo but the connection hangs up before the response can be sent back, then as far as the engine is concerned the operation timed out.
This is an inherent issue with clustering.
This is why I want to move away from tasks being *the* trackable objects.
Tasks should be short. As short as possible.
Run VM should just persist the VM information on the VDSM host and return. The rest of the tracking should be done using the VM ID.
Create image should return once VDSM has persisted the information about the request on the repository and created the metadata files.
Tracking should be done on the repo or the imageId.
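A minimal sketch of the "short task, track by ID" idea, for illustration only. FakeHost, run_vm, finish_boot and get_vm_state are hypothetical names and not the VDSM API, and the caller-chosen VM ID with a repeatable run_vm call is an assumption added here so that a lost response can be handled by simply asking about the VM.

```python
class FakeHost:
    """Hypothetical stand-in for a VDSM host; not the real VDSM API."""

    def __init__(self):
        # vm_id -> persisted record (an in-memory dict stands in for storage)
        self._vms = {}

    def run_vm(self, vm_id, spec):
        # Short task: persist the request and return right away.
        # Assumption: repeating the call with the same ID is harmless,
        # so a caller that never saw the response can safely retry.
        if vm_id not in self._vms:
            self._vms[vm_id] = {"spec": dict(spec), "state": "starting"}
        return vm_id

    def finish_boot(self, vm_id):
        # Stands in for the long-running work that happens after run_vm returns.
        self._vms[vm_id]["state"] = "up"

    def get_vm_state(self, vm_id):
        # Tracking is done on the VM ID, not on a task object.
        record = self._vms.get(vm_id)
        return record["state"] if record is not None else "unknown"


host = FakeHost()
vm_id = "vm-1234"  # hypothetical ID chosen by the caller

host.run_vm(vm_id, {"memory_mb": 1024})
# Pretend the response was lost: the caller retries, then asks about the VM.
host.run_vm(vm_id, {"memory_mb": 1024})
state_after_retry = host.get_vm_state(vm_id)

host.finish_boot(vm_id)
state_when_done = host.get_vm_state(vm_id)
```

The caller never holds a task handle here. After a timeout it cannot tell whether the first call landed, so it asks for the state of the VM it named, which is the same question it would ask in normal operation.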
...
...
Sorry.  That's my list :)  Hopefully others will be willing to add other requirements for consideration.
From my understanding, task recovery (stop, abort, rollback, etc) will not be generally supported and should not be a requirement.
--
Adam Litke <agl@us.ibm.com>
IBM Linux Technology Center