[Engine-devel] VDSM tasks, the future

Tue Dec 4 20:50:28 UTC 2012

On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
> Because I started hinting about how VDSM tasks are going to look going
> forward, I thought it better to write everything in an email so we can talk
> about it in context.  This is not set in stone and I'm still debating things
> myself, but it's very close to being done.

Don't debate them yourself, debate them here!  Even better, propose your idea in
schema form to show how a command might work exactly.

> - Everything is asynchronous.  The nature of message based communication is
> that you can't have synchronous operations.  This is not really debatable
> because it's just how TCP\AMQP\<messaging> works.

Can you show how a traditionally synchronous command might work?  Let's take
Host.getVmList as an example.
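
To make the question concrete, here is a rough sketch of how I imagine a
client could treat Host.getVmList as just another asynchronous call.  It
assumes plain JSON-RPC 2.0 framing; the PendingCalls helper, the empty params
and the send callback are made up for illustration and are not vdsm code.

```python
import json
import uuid


def make_request(method, params, request_id=None):
    """Build a JSON-RPC 2.0 request.  The caller picks the id."""
    if request_id is None:
        request_id = str(uuid.uuid4())
    return {"jsonrpc": "2.0", "method": method,
            "params": params, "id": request_id}


class PendingCalls:
    """Hypothetical helper: track in-flight requests, match replies by id."""

    def __init__(self):
        self._pending = {}  # request id -> method name

    def send(self, request, transport_send):
        # An id may be reused, but only once the earlier call has finished.
        if request["id"] in self._pending:
            raise ValueError("id %r is still in use" % request["id"])
        self._pending[request["id"]] = request["method"]
        transport_send(json.dumps(request))

    def on_message(self, raw):
        """Handle one reply; return (method, result) or raise on error."""
        msg = json.loads(raw)
        method = self._pending.pop(msg["id"])
        if "error" in msg:
            raise RuntimeError("%s failed: %s"
                               % (method, msg["error"].get("message")))
        return method, msg["result"]
```

The caller would send make_request("Host.getVmList", {}) and carry on; the VM
list arrives whenever on_message() sees a reply carrying the same id.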

> - Task IDs will be decided by the caller.  This is how json-rpc works and also
> makes sense because now the engine can track the task without needing to have a
> stage where we give it the task ID back.  IDs are reusable as long as no one
> else is using them at the time so they can be used for synchronizing
> operations between clients (making sure a command is only executed once on a
> specific host without locking).
> 
> - Tasks are transient.  If VDSM restarts it forgets all the task information.
> There are 2 ways to have persistent tasks:
>   1. The task creates an object that you can continue work on in VDSM.  The
>   new storage does that by the fact that copyImage() returns once the target
>   volume has been created but before the data has been fully copied.  From
>   that moment on the state of the copy can be queried from any host using
>   getImageStatus() and the specific copy operation can be queried with
>   getTaskStatus() on the host performing it.  After VDSM crashes, depending
>   on policy, either VDSM will create a new task to continue the copy or
>   someone else will send a command to continue the operation and that will
>   be a new task.
>   2. VDSM tasks just start other operations that are trackable through
>   something other than the task interface.  For example, with Gluster,
>   gluster.startVolumeRebalance() will return once it has been registered
>   with Gluster.  gluster.getOperationStatuses() will return the state of the
>   operation from any host.  Each call is a task in itself.

I worry about this approach because every command has a different semantic for
checking progress.  For migration, we have to check VM status on the src and
dest hosts.  For image copy we need to use a special status call on the dest
image.  It would be nice if there was a unified method for checking on an
operation.  Maybe that can be completion events.

Client:               vdsm:
-------               -----

Image.copy(...)  -->
                 <--  Operation Started
Wait for event   ...
                 <--  Event: Operation <id> done <code>

For an early error:

Client:               vdsm:
-------               -----

Image.copy(...)  -->
                 <--  Error: <code>
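
On the client side, the two flows above could be handled by something like
the sketch below.  This is only my guess at the shape of it: OperationTracker
and its method names are hypothetical, and the completion code is whatever
vdsm would put in the event.

```python
import threading


class OperationTracker:
    """Hypothetical client helper for 'Operation <id> done <code>' events."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = {}   # operation id -> threading.Event
        self._codes = {}  # operation id -> completion code

    def started(self, op_id):
        """Call when vdsm replies 'Operation Started'."""
        with self._lock:
            self._done[op_id] = threading.Event()

    def on_event(self, op_id, code):
        """Call from the event listener; False if the id is unknown."""
        with self._lock:
            event = self._done.get(op_id)
            if event is None:
                return False
            self._codes[op_id] = code
        event.set()
        return True

    def wait(self, op_id, timeout=None):
        """Block until the completion event arrives; None on timeout."""
        with self._lock:
            event = self._done[op_id]
        if not event.wait(timeout):
            return None
        with self._lock:
            del self._done[op_id]
            return self._codes.pop(op_id)
```

An early error never reaches started(), so the caller handles it right where
Image.copy() was issued and nothing is left to wait on.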

> - No task tags.  They are silly and the caller can mangle whatever in the task
> ID if he really wants to tag tasks.

Yes.  Agreed.

> - No explicit recovery stage.  VDSM will be crash-only, there should be
> efforts to make everything crash-safe.  If that is problematic, in case of
> networking, VDSM will recover on start without having a task for it.

How does this work in practice for something like creating a new image from a
template?

> - No clean Task: Tasks can be started by any number of hosts; this means that
> there is no way to own all tasks.  There could be cases where VDSM starts
> tasks on its own and thus they have no owner at all.  The caller needs to
> continually track the state of VDSM.  We will have broadcast events to
> mitigate polling.

If a disconnected client might have missed a completion event, it will need to
check state.  This means each async operation that changes state must document a
procedure for checking progress of a potentially ongoing operation.  For
Image.copy, that process would be to look up the new image and check its state.
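
Roughly, I picture the client doing something like the following: wait for
the event first, and fall back to the documented state check if it never
shows up.  The two callbacks and the "copying" state name are placeholders,
not real API.

```python
import time


def await_completion(wait_for_event, check_state, event_timeout=5.0,
                     poll_interval=1.0, max_polls=10, busy_state="copying"):
    """Wait for a completion event, then fall back to checking state.

    wait_for_event(timeout) returns a completion code, or None if no event
    arrived in time (for example because the client was disconnected).
    check_state() returns the current state of the object being created.
    """
    code = wait_for_event(event_timeout)
    if code is not None:
        return ("event", code)
    # The event may have been missed, so ask about the object itself.
    for _ in range(max_polls):
        state = check_state()
        if state != busy_state:
            return ("polled", state)
        time.sleep(poll_interval)
    raise TimeoutError("operation still in progress after polling")
```

For Image.copy, check_state would look up the new image and return its state.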

> - No revert.  Impossible to implement safely.

How do the engine folks feel about this?  I am ok with it :)

> - No SPM\HSM tasks.  SPM\SDM is no longer necessary for all domain types
> (only for type).  What used to be SPM tasks, or tasks that persist and can be
> restarted on other hosts, is talked about in previous bullet points.
> 
A nice simplification.

-- 
Adam Litke <agl at us.ibm.com>
IBM Linux Technology Center