[Engine-devel] Asynchronous tasks for live merge

Mon Mar 3 14:28:27 UTC 2014

On Fri, Feb 28, 2014 at 09:30:16AM -0500, Adam Litke wrote:
> Hi all,
> 
> As part of our plan to support live merging of VM disk snapshots it
> seems we will need a new form of asynchronous task in ovirt-engine.  I
> am aware of AsyncTaskManager but it seems to be limited to managing
> SPM tasks.  For live merge, we are going to need something called
> VmTasks since the async command can be run only on the host that
> currently runs the VM.
> 
> The way I see this working from an engine perspective is:
> 1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is
>   found to be up, we activate an alternative live merge flow.
> 2. We submit a LiveMerge VDS Command for each impacted disk.  This is
>   an asynchronous command which we need to monitor for completion.
> 3. A VmJob is inserted into the DB so we'll remember to handle it.
> 4. The VDS Broker monitors the operation via an extension to the
>   already collected VmStatistics data.  Vdsm will report active Block
>   Jobs only.  Once the job stops (in error or success) it will cease
>   to be reported by vdsm and engine will know to proceed.

You describe a reasonable way for Vdsm to report whether an async
operation has finished. However, could we instead use this opportunity
to introduce generic "hsm" tasks?

I suggest something loosely modeled on POSIX fork/wait.

- Engine asks Vdsm to start an API verb asynchronously and supplies a
  uuid. Unlike fork(2), where the system chooses the pid, here the
  caller picks the id. That's required so that Engine can tell whether
  the command reached Vdsm in case of a network error.

- Engine may monitor the task (a-la waitpid(2) with WNOHANG).

- When the task is finished, Engine may collect its result (a-la wait).
  Until that happens, Vdsm must report the task forever; restart or
  upgrade are no excuses. On reboot, though, all tasks are forgotten, so
  Engine may stop monitoring tasks on a fenced host.
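
To make the fork/wait analogy concrete, here is a minimal Python sketch
of the bookkeeping Vdsm would need. Every name in it (TaskRegistry,
start, poll, collect) is hypothetical; none of this is existing Vdsm
API. It keeps tasks in memory only, so it does not cover the
"restart or upgrade are no excuses" requirement, which would need
persistent storage.

```python
import threading
import uuid


class TaskRegistry:
    """Hypothetical in-memory registry of tasks keyed by a caller-supplied id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = {}  # task_id -> {"finished": Event, "result": ..., "error": ...}

    def start(self, task_id, func, *args):
        """Run func(*args) in the background under the caller's id.

        Returns False if the id is already known, so a caller that retries
        after a network error cannot start the same verb twice.
        """
        with self._lock:
            if task_id in self._tasks:
                return False
            entry = {"finished": threading.Event(), "result": None, "error": None}
            self._tasks[task_id] = entry

        def run():
            try:
                entry["result"] = func(*args)
            except Exception as exc:  # keep the failure for the collector
                entry["error"] = exc
            finally:
                entry["finished"].set()

        threading.Thread(target=run, daemon=True).start()
        return True

    def poll(self, task_id):
        """Non-blocking status check, in the spirit of WNOHANG."""
        with self._lock:
            entry = self._tasks.get(task_id)
        if entry is None:
            return "unknown"
        return "done" if entry["finished"].is_set() else "running"

    def collect(self, task_id, timeout=None):
        """Block until the task ends, return its result, then forget it.

        The task stays in the registry until this call succeeds.
        """
        with self._lock:
            entry = self._tasks[task_id]  # KeyError if the id is unknown
        if not entry["finished"].wait(timeout):
            raise TimeoutError("task %s is still running" % task_id)
        with self._lock:
            self._tasks.pop(task_id, None)
        if entry["error"] is not None:
            raise entry["error"]
        return entry["result"]


if __name__ == "__main__":
    registry = TaskRegistry()
    task_id = str(uuid.uuid4())  # chosen by the caller, not by the registry
    registry.start(task_id, sum, [1, 2, 3])
    print(registry.poll(task_id))
    print(registry.collect(task_id, timeout=5))
```

If a start request times out on the wire, the caller can simply poll
the same id: "unknown" means the command never reached the other side,
anything else means it did.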

This may be overkill for your use case, but it would come in handy for
other cases. In particular, setupNetwork returns before it is completely
done, since dhcp address acquisition may take too much time. Engine may
poll getVdsCaps to see when it's done (or timeout), but it would be
nicer to have a generic mechanism that can serve us all.
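
For reference, the polling workaround boils down to a loop like the one
below. This is only a sketch: poll_until and the check callable are
hypothetical placeholders for whatever inspects the getVdsCaps output,
and Python is used just for brevity.

```python
import time


def poll_until(check, timeout, interval=1.0):
    """Call check() until it returns True or `timeout` seconds have passed.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Each caller has to pick its own timeout and interval and decide what
"done" looks like, which is exactly the duplication a generic task
mechanism would remove.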

Note that I'm suggesting a completely new task framework, at least on
the Vdsm side, as the current one (with its broken persistence, arcane
states and never-reliable rollback) is beyond redemption, imho.

> 5. When the job has completed, VDS Broker raises an event up to bll.
>   Maybe this could be done via VmJobDAO on the stored VmJob?
> 6. Bll receives the event and issues a series of VDS commands to
>   complete the operation:
>   a) Verify the new image chain matches our expectations (the snap is
>      no longer present in the chain).
>   b) Delete the snapshot volume
>   c) Remove the VmJob from the DB
> 
> Could you guys review this proposed flow for sanity?  The main
> conceptual gaps I am left with concern #5 and #6.  What is the
> appropriate way for VDSBroker to communicate with BLL?  Is there an
> event mechanism I can explore or should I use the database?  I am
> leaning toward the database because it is persistent and will ensure
> #6 gets completed even if engine is restarted somewhere in the middle.
> For #6, is there an existing polling / event loop in bll that I can
> plug into?
> 
> Thanks in advance for taking the time to think about this flow and for
> providing your insights!