[Engine-devel] Task cancelation feature

Shireesh Anjal sanjal at redhat.com
Tue Dec 4 14:48:50 UTC 2012



----- Original Message -----
> From: "Dan Kenigsberg" <danken at redhat.com>
> To: "Saggi Mizrahi" <smizrahi at redhat.com>
> Cc: "engine-devel" <engine-devel at ovirt.org>
> Sent: Tuesday, December 4, 2012 3:08:13 PM
> Subject: Re: [Engine-devel] Task cancelation feature
> 
> On Mon, Dec 03, 2012 at 12:15:07PM -0500, Saggi Mizrahi wrote:
> > VDSM tasks are changing to something completely different.
> > It's still under discussion but the general direction is that:
> > - TaskIDs will be decided by the caller.
> > - VDSM can start tasks on its own
> > - There will be no distinction between async tasks and sync tasks.
> > Everything is always async.
> > - There will be no cleanTask(); when tasks are done they return
> > their result to the caller and disappear immediately.
> 
> I'm not sure I understand the motivation for the latter change. I
> kinda
> like the unix process semantics, where the return code of a process is
> kept with its id after the process ends, until the process parent
> calls
> wait(2). Otherwise, how can the caller tell why its task has failed?
> 
> For example, I'd like to see vmCreate using async tasks like that.
> vmCreate returns immediately, and a vdsm task is tracking the vm
> creation. If something bad happens, the information about the failure
> can be polled by the Engine that created the vm (or a new Engine
> instance, after an Engine crash).
> 
> Similarly, we may need to make setupNetwork asynchronous, since we
> depend on dhclient, which may take a lot of time to finish.
> 
> Have these future use cases been debated?

There are also a few requirements from the gluster perspective that may need enhancements in vdsm as well as engine.
- Tasks are created and managed by glusterfs and not by vdsm.
- Gluster tasks are not bound to a particular host, but are cluster-wide, and their status can be checked from any of the hosts of the cluster.
- The concept of SPM does not apply to gluster clusters/hosts.
- Apart from starting, aborting and checking status, some of the gluster tasks support additional actions like pause, resume and commit.

Based on some of the telephonic conversations with maintainers of engine and vdsm, we were planning to enhance the existing task management as follows:
- Enhance the getAllTasks verb in vdsm to accept one or more tags for filtering tasks. (http://gerrit.ovirt.org/7579)
- Currently all tasks created in vdsm through requests from engine have the 'spm' tag as they are SPM tasks.
- Introduce a new field, say task_target in engine, which indicates what kind of a task it is. Possible values:
   - SPM (SPM tasks)
   - CLUSTER (cluster-wide tasks e.g. gluster tasks)
   - HOST (tasks specific to a particular host e.g. format a disk on a host)
- Enhance the async task manager in engine, to fetch details of all types of tasks, by sending appropriate tags to the new getAllTasks verb. (At present it fetches only SPM tasks)
- Once all the tasks are fetched, the rest of the processing (updating status) remains the same as before.
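
As a rough illustration of the tag filtering and the task_target field proposed above, here is a minimal Python sketch. It is not the vdsm or engine code: the names TaskTarget and get_all_tasks and the in-memory task dictionary are hypothetical, and the 'cluster' and 'host' tag strings are assumptions (only the 'spm' tag is mentioned above).

```python
from enum import Enum


class TaskTarget(Enum):
    """Hypothetical engine-side task_target values, mapped to assumed vdsm tags."""
    SPM = "spm"          # SPM tasks (the 'spm' tag is the one in use today)
    CLUSTER = "cluster"  # cluster-wide tasks, e.g. gluster tasks (assumed tag)
    HOST = "host"        # tasks specific to one host (assumed tag)


def get_all_tasks(tasks, tags=None):
    """Return the tasks that carry at least one of the given tags.

    With no tags, every task is returned.
    """
    if not tags:
        return dict(tasks)
    wanted = set(tags)
    return {task_id: task for task_id, task in tasks.items()
            if wanted & set(task["tags"])}


# Illustrative data only; the IDs and fields are made up.
TASKS = {
    "t1": {"tags": ["spm"], "status": "running"},
    "t2": {"tags": ["cluster"], "status": "finished"},
    "t3": {"tags": ["host"], "status": "running"},
}

# What the async task manager fetches at present (SPM tasks only) ...
spm_only = get_all_tasks(TASKS, [TaskTarget.SPM.value])
# ... and what it would fetch once it sends all the tags.
everything = get_all_tasks(TASKS, [t.value for t in TaskTarget])
```

Calling it without tags returns every task in this sketch; whether the real verb should behave that way is an open choice.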

I would like to know whether we should stop working on the above approach if the new design is going to arrive soon. If so, we should make sure that the new design is capable of handling the gluster tasks as well.

If the new design is still far in the future:

1) We will start working with the above approach. Any comments/suggestions/concerns are welcome.
2) Instead of focusing on just 'cancellation', we should try to come up with a more generic approach that makes it easy to support more actions like pause, resume, commit, etc. (I haven't yet gone through the feature page, so please pardon me if this is already taken care of.)
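
To make the 'generic approach' a bit more concrete, here is a minimal Python sketch in which each task declares the actions it supports and a single entry point handles all of them. Everything in it (the Task class, perform(), UnsupportedAction, the task IDs) is hypothetical and only illustrates the idea; it is not an existing vdsm or engine interface.

```python
class UnsupportedAction(Exception):
    """Raised when a task is asked to perform an action it does not support."""


class Task:
    """Hypothetical task that advertises which actions it supports."""

    def __init__(self, task_id, supported_actions):
        self.task_id = task_id
        self.supported_actions = frozenset(supported_actions)
        self.history = []  # actions accepted so far, in order

    def perform(self, action):
        """Single entry point for abort, pause, resume, commit, etc."""
        if action not in self.supported_actions:
            raise UnsupportedAction(
                f"{action!r} is not supported by task {self.task_id}")
        self.history.append(action)
        return action


# A gluster task supporting the additional actions, and a task that can
# only be aborted. The IDs are made up.
gluster_task = Task("gluster-task-1", {"abort", "pause", "resume", "commit"})
spm_task = Task("spm-task-1", {"abort"})

gluster_task.perform("pause")
gluster_task.perform("resume")
```

With this shape, 'cancel' is just one more action name, and supporting a new action does not need a new verb per action.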


> 
> Dan.
> > 
> > Also, some stuff you consider tasks will no longer be tasks any
> > more.
> > For instance, copying an image will finish successfully once VDSM
> > registers the operation with the storage subsystem and creates
> > the image handle.
> > After that the status of the copy is bound to the status of the new
> > image and is tracked that way.
> > This means that the thing you track when you do copyImage() is
> > actually the creation of the image handle and the metadata to make
> > it usable.
> > After that is done any host can query the state of the new image by
> > using the image ID and not the task Id which was deprecated.
> > This will be true for all storage operations.
> > 
> > 
> > ----- Original Message -----
> > > From: "Michael Kublin" <mkublin at redhat.com>
> > > To: "engine-devel" <engine-devel at ovirt.org>
> > > Sent: Monday, December 3, 2012 4:19:48 AM
> > > Subject: [Engine-devel] Task cancelation feature
> > > 
> > > Hi, I created a wiki page with the design of the task cancellation
> > > feature.
> > > The URL is: http://www.ovirt.org/Features/TaskManagerCancelTask
> > > I cannot really call it a design: I have no requirements except
> > > the name of the feature,
> > > so my wiki doesn't contain anything except open questions.
> > > Also, I think that it is impossible to build a good feature on
> > > very problematic infrastructure.
> > > I think we should first fix all our infrastructure problems;
> > > after that, adding any task cancellation
> > > feature will be a matter of a couple of hours of work.
> > > 
> > > Regards Michael
> > > _______________________________________________
> > > Engine-devel mailing list
> > > Engine-devel at ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/engine-devel
> > > 
> 


