[Engine-devel] Asynchronous tasks for live merge

Mon Mar 3 14:56:56 UTC 2014

On 03/03/14 16:36 +0200, Itamar Heim wrote:
>On 03/03/2014 04:28 PM, Dan Kenigsberg wrote:
>>On Fri, Feb 28, 2014 at 09:30:16AM -0500, Adam Litke wrote:
>>>Hi all,
>>>
>>>As part of our plan to support live merging of VM disk snapshots it
>>>seems we will need a new form of asynchronous task in ovirt-engine.  I
>>>am aware of AsyncTaskManager but it seems to be limited to managing
>>>SPM tasks.  For live merge, we are going to need something called
>>>VmTasks since the async command can be run only on the host that
>>>currently runs the VM.
>>>
>>>The way I see this working from an engine perspective is:
>>>1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is
>>>   found to be up, we activate an alternative live merge flow.
>>>2. We submit a LiveMerge VDS Command for each impacted disk.  This is
>>>   an asynchronous command which we need to monitor for completion.
>>>3. A VmJob is inserted into the DB so we'll remember to handle it.
>>>4. The VDS Broker monitors the operation via an extension to the
>>>   already collected VmStatistics data.  Vdsm will report active Block
>>>   Jobs only.  Once the job stops (in error or success) it will cease
>>>   to be reported by vdsm and engine will know to proceed.
>>
>>You describe a reasonable way for Vdsm to report whether an async
>>operation has finished. However, may we instead use the opportunity to
>>introduce generic "hsm" tasks?
>>
>>I suggest having something loosely modeled on POSIX fork/wait.
>>
>>- Engine asks Vdsm to start an API verb asynchronously and supplies a
>>   uuid. This is unlike fork(2), where the system chooses the pid, but
>>   that's required so that Engine could tell if the command has reached
>>   Vdsm in case of a network error.
>>
>>- Engine may monitor the task (a-la wait(WNOHANG))
>>
>>- When the task is finished, Engine may collect its result (a-la wait).
>>   Until that happens, Vdsm must report the task forever; restart or
>>   upgrade are no excuses. On reboot, though, all tasks are forgotten, so
>>   Engine may stop monitoring tasks on a fenced host.
>>
>>This may be overkill for your use case, but it would come in useful for
>>other cases. In particular, setupNetwork returns before it is completely
>>done, since dhcp address acquisition may take too much time. Engine may
>>poll getVdsCaps to see when it's done (or timeout), but it would be
>>nicer to have a generic mechanism that can serve us all.
>>
>>Note that I'm suggesting a completely new task framework, at least on
>>Vdsm side, as the current one (with its broken persistence, arcane
>>states and never-reliable rollback) is beyond redemption, imho.
>>
>>>5. When the job has completed, VDS Broker raises an event up to bll.
>>>   Maybe this could be done via VmJobDAO on the stored VmJob?
>>>6. Bll receives the event and issues a series of VDS commands to
>>>   complete the operation:
>>>   a) Verify the new image chain matches our expectations (the snap is
>>>      no longer present in the chain).
>>>   b) Delete the snapshot volume
>>>   c) Remove the VmJob from the DB
>>>
>>>Could you guys review this proposed flow for sanity?  The main
>>>conceptual gaps I am left with concern #5 and #6.  What is the
>>>appropriate way for VDSBroker to communicate with BLL?  Is there an
>>>event mechanism I can explore or should I use the database?  I am
>>>leaning toward the database because it is persistent and will ensure
>>>#6 gets completed even if engine is restarted somewhere in the middle.
>>>For #6, is there an existing polling / event loop in bll that I can
>>>plug into?
>>>
>>>Thanks in advance for taking the time to think about this flow and for
>>>providing your insights!
>
>The way I read Adam's proposal, there is no "task" entity on the vdsm
>side to monitor; rather, the engine monitors the state of the object
>the operation is performed on (similar to CreateVM, where the engine
>monitors the state of the VM, rather than the CreateVM request).

Yeah, we use the term "job" in order to avoid assumptions and
implications (i.e. rollback/cancel, persistence) that come with the
word "task".  "Job" essentially means "libvirt Block Job", but I am
trying to allow for extension in the future.  Vdsm would collect block
job information for devices it expects to have active block jobs and
report them all under a single structure in the VM statistics.  There
would be no persistence of information so when a libvirt block job
goes poof, vdsm will stop reporting it.
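
To make the engine side of this concrete, here is a minimal Python
sketch of the check it implies: compare the jobs engine has stored
against the jobs vdsm currently reports, and treat any stored job that
is missing from the report as finished.  The "vmJobs" key, the fields
inside each job, and the find_finished_jobs() helper are assumptions
made up for illustration only, not an existing vdsm or engine API.

```python
def find_finished_jobs(tracked_job_ids, reported_jobs):
    """Return the tracked job ids that vdsm no longer reports.

    tracked_job_ids: ids of the VmJobs engine stored in its DB.
    reported_jobs:   mapping of job id -> job info taken from the VM
                     statistics (hypothetical layout).
    """
    active = set(reported_jobs)
    return sorted(job_id for job_id in tracked_job_ids if job_id not in active)


# Hypothetical data: engine tracks two jobs, vdsm still reports one.
tracked = {"job-1", "job-2"}
vm_stats = {"vmJobs": {"job-2": {"jobType": "block", "cur": 512, "end": 1024}}}

finished = find_finished_jobs(tracked, vm_stats.get("vmJobs", {}))
print(finished)  # ['job-1']
```

A job dropping out of the report says nothing about whether it
succeeded or failed, which is why engine would still have to verify
the image chain before deleting the snapshot volume.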

-- 
Adam Litke