[Engine-devel] [vdsm] VDSM tasks, the future

Saggi Mizrahi smizrahi at redhat.com
Wed Dec 5 14:45:48 UTC 2012


I'm sorry, but your email client messed up the formatting and I can't figure out what your comments are.
Could you please use text-only emails?

----- Original Message -----
> From: "ybronhei" <ybronhei at redhat.com>
> To: "Saggi Mizrahi" <smizrahi at redhat.com>
> Cc: "Adam Litke" <agl at us.ibm.com>, "engine-devel" <engine-devel at ovirt.org>, "VDSM Project Development"
> <vdsm-devel at lists.fedorahosted.org>
> Sent: Wednesday, December 5, 2012 8:37:23 AM
> Subject: Re: [vdsm] VDSM tasks, the future
> 
> 
> On 12/05/2012 12:20 AM, Saggi Mizrahi wrote:
> 
> 
> As the only subsystem to use asynchronous tasks until now is the
> storage subsystem, I suggest going over how
> I propose to tackle task creation, task stop, task remove and task
> recovery.
> Other subsystems can create similar mechanisms depending on their
> needs.
> 
> There is no way of avoiding it: different types of tasks need
> different ways of tracking\recovering from them.
> Network should always auto-recover because it can't get a "please
> fix" command if the network is down.
> Storage, on the other hand, should never start operations on its own
> because it might take up valuable resources from the host.
> Tasks that need to be tracked on a single host, 2 hosts, or the
> entire cluster need to have their own APIs.
> VM configuration never persists across reboots, networking sometimes
> persists and storage always persists.
> This means that recovery procedures (from the manager's point of view)
> need to be vastly different.
> Add policy, resource allocation, and error flows and you see that VDSM
> doesn't have nearly enough information to deal with the tasks.
> 
> ----- Original Message -----
> 
> From: "Adam Litke" <agl at us.ibm.com> To: "Saggi Mizrahi"
> <smizrahi at redhat.com> Cc: "VDSM Project Development"
> <vdsm-devel at lists.fedorahosted.org> , "engine-devel"
> <engine-devel at ovirt.org> , "Ayal
> Baron" <abaron at redhat.com> , "Barak Azulay" <bazulay at redhat.com> ,
> "Shireesh Anjal" <sanjal at redhat.com> Sent: Tuesday, December 4, 2012
> 3:50:28 PM
> Subject: Re: VDSM tasks, the future
> 
> On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
> 
> Because I started hinting about how VDSM tasks are going to look
> going forward
> I thought it's better I'll just write everything in an email so we
> can talk
> about it in context.  This is not set in stone and I'm still
> debating things
> myself but it's very close to being done. Don't debate them yourself,
> debate them here!  Even better, propose
> your idea in
> schema form to show how a command might work exactly. I don't like
> throwing ideas in the air. It can be much easier to understand the
> flow of a task in vdsm and outside vdsm from a small schema, mainly
> for each task's states.
> To define the flow of a task you can separate between types of tasks
> (network, storage, VMs, or others); we should have task states that
> clarify whether the task can be recovered or not, can be canceled or not,
> and so on.
> 
> Canceling\Aborting\Reverting states should be clarified further, and not
> every state can lead to all types of states.
> I tried to figure out how task flow works today in vdsm, and this is what
> I've got - http://wiki.ovirt.org/Vdsm_tasks
> 
> 
> 
> 
> 
> 
> - Everything is asynchronous.  The nature of message based
> communication is
> that you can't have synchronous operations.  This is not really
> debatable
> because it's just how TCP\AMQP\<messaging> works. Can you show how a
> traditionally synchronous command might work?
>  Let's take
> Host.getVmList as an example. The same as it works today, it's all a
> matter of how you wrap the transport layer.
> You will send a json-rpc request and wait for a response with the
> same id.
> 
> As for the bindings, there are a lot of ways we can tackle that.
> Always wait for the response and simulate synchronous behavior.
> Make every method return an object to track the task.
> task = host.getVmList()
> if not task.wait(1):
>     task.cancel()
> else:
>     res = task.result()
> It looks like a traditional timeout. Why not
> split blocking actions and non-blocking actions? A non-blocking
> action will supply a callback function to return to if the task
> fails or succeeds. For example:
> 
> createAsyncTask(host.getVmList, params, callbackGetVmList,
>                 timeout=30)
> 
> Instead of using the dispatcher? Do you want to keep the dispatcher
> concept?
> 
> 
> 
> Have it both ways (it's auto generated anyway) and have
> list = host.getVmList()
> task = host.getVmList_async()
> 
> Have high-level and low-level interfaces.
> host = Host()
> host.connect("tcp://host:3233")
> req = host.sendRequest("123213", "getVmList", [])
> if not req.wait(1):
>    ....
> 
> shost = SynchHost(host)
> shost.getVmList() # Actually wraps a request object
> ahost = AsyncHost(host)
> task = ahost.getVmList() # Actually wraps a request object
> 
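The high-level and low-level split above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Request`, `LoopbackHost`, `AsyncHost` and `SyncHost` are hypothetical names and not actual VDSM classes, the transport is a stand-in that answers immediately, and passing the request ID into `getVmList()` is an assumption made so that the caller picks the ID.

```python
import threading


class Request:
    """Hypothetical handle for one in-flight request (not a VDSM class)."""

    def __init__(self, req_id, method, params):
        self.id = req_id
        self.method = method
        self.params = params
        self.cancelled = False
        self._done = threading.Event()
        self._result = None

    def complete(self, result):
        # Called by the transport when a response with a matching id arrives.
        self._result = result
        self._done.set()

    def wait(self, timeout=None):
        # True if the response arrived before the timeout expired.
        return self._done.wait(timeout)

    def cancel(self):
        self.cancelled = True

    def result(self):
        if not self._done.is_set():
            raise RuntimeError("response not received yet")
        return self._result


class LoopbackHost:
    """Stand-in transport that answers at once; a real one would use the wire."""

    def __init__(self, vms):
        self._vms = vms

    def sendRequest(self, req_id, method, params):
        req = Request(req_id, method, params)
        if method == "getVmList":
            req.complete(list(self._vms))
        return req


class AsyncHost:
    """Every method returns the request object so the caller can track it."""

    def __init__(self, host):
        self._host = host

    def getVmList(self, req_id):
        return self._host.sendRequest(req_id, "getVmList", [])


class SyncHost:
    """Simulates synchronous behavior by waiting on the same request object."""

    def __init__(self, host, timeout=1.0):
        self._async = AsyncHost(host)
        self._timeout = timeout

    def getVmList(self, req_id):
        req = self._async.getVmList(req_id)
        if not req.wait(self._timeout):
            req.cancel()
            raise TimeoutError("getVmList timed out")
        return req.result()
```

With this shape, the synchronous binding is only a thin wrapper over the asynchronous one, so both can be generated from the same schema.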
> 
> 
> - Task IDs will be decided by the caller.  This is how json-rpc
> works and also
> makes sense because now the engine can track the task without
> needing to have a
> stage where we give it the task ID back.  IDs are reusable as long
> as no one
> else is using them at the time so they can be used for
> synchronizing
> operations between clients (making sure a command is only executed
> once on a
> specific host without locking).
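One way to picture caller-chosen, reusable task IDs is a per-host registry that rejects an ID while it is still in use. This is a minimal sketch, not VDSM code: `TaskRegistry` and `TaskIdInUse` are hypothetical names, and the internal mutex only protects the host's own table (clients still need no shared lock between them).

```python
import threading


class TaskIdInUse(Exception):
    """Raised when a caller picks an ID that another task currently holds."""


class TaskRegistry:
    """Hypothetical per-host table of active, caller-chosen task IDs."""

    def __init__(self):
        self._lock = threading.Lock()  # guards the local table only
        self._active = {}

    def start(self, task_id, method):
        with self._lock:
            if task_id in self._active:
                raise TaskIdInUse(task_id)
            self._active[task_id] = method

    def finish(self, task_id):
        # Once a task ends, its ID becomes reusable.
        with self._lock:
            self._active.pop(task_id, None)

    def is_active(self, task_id):
        with self._lock:
            return task_id in self._active
```

Two clients that agree on the same ID for the same command would see the second `start()` fail, which is what makes the command run only once on that host.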
> 
> - Tasks are transient.  If VDSM restarts it forgets all the task
> information.
> There are 2 ways to have persistent tasks: 1. The task creates an
> object that
> you can continue work on in VDSM.  The new storage does that by the
> fact that
> copyImage() returns once the target volume has been created but
> before the data
> has been fully copied.  From that moment on the state of the copy
> can be
> queried from any host using getImageStatus() and the specific copy
> operation
> can be queried with getTaskStatus() on the host performing it.
>  After VDSM
> crashes, depending on policy, either VDSM will create a new task to
> continue
> the copy or someone else will send a command to continue the
> operation and
> that will be a new task.  2. VDSM tasks just start other operations
> that are trackable
> outside the task interface. For example Gluster.
> gluster.startVolumeRebalance() will return once it has been
> registered with
> Gluster.  gluster.getOperationStatuses() will return the state of
> the operation
> from any host.  Each call is a task in itself. I worry about this
> approach because every command has a different
> semantic for
> checking progress.  For migration, we have to check VM status on the
> src and
> dest hosts.  For image copy we need to use a special status call on
> the dest
> image.  It would be nice if there was a unified method for checking
> on an
> operation.  Maybe that can be completion events.
> 
> Client:               vdsm:
> -------               -----
> 
> Image.copy(...)  -->
>                  <--  Operation Started
> Wait for event   ...
>                  <--  Event: Operation <id> done <code>
> 
> For an early error:
> 
> Client:               vdsm:
> -------               -----
> 
> Image.copy(...)  -->
>                  <--  Error: <code>
> The thing is that a lot of things
> need a different way of tracking their progress.
> Storage has completely different semantics from network or VM
> operations. This is the reason why we can use the implementation of
> task as something generic for all processes that we have.
> Of course things need different ways of tracking their progress...
> That's why we need to use task states with meaning, and split
> storageTaskStates and networkTaskStates that inherit from TaskStates
> and add their parts as in the new bootstrap implementation.
> Also we can add hooks for each state as alonbl did in his otopi code
> (not sure if we need that)
> 
> For instance: general states can be starting, started,
> finishing, finished, and each specific implementation adds middle
> states, like waitForResource, processing, recovering and so on.
> For each one you can add levels (pre state, post state) that can add
> more flexibility.
> 
> That way the Task object will be a general way to implement a specific
> process; you will have a NetworkTask and a StorageTask, and the
> infrastructure will be the interface and implementation of the
> generic parts.
> 
> So here is how vdsm can work that way:
> client: vdsm:
> -------- ---------
> image.copy() ---> copyImage::starting (same starting code - keeping
> the id, and move forward to next state)
> copyImage::started (waiting to recovery file that task is started)
> copyImage::part1 (whatever you want to do)
> copyImage::part2 (whatever you want to do)
> copyImage::part3 (whatever you want to do) -- for each process the
> programmers will add their states as they want in a sequence flow
> result <------ copyImage::finishing (send back to client a success
> and clean recovery file)
> copyImage::finished (sign task id as succeeded)
> 
> If an error occurs somewhere in the middle, it is easier to start over
> and remember where we were.
> The problem with that is that we need to modify the current
> implementation for each process, and I'm not sure if we want to get
> there.. but if we do, it won't be so hard.
> We can split the logic of each process to define the logic of each
> state, then arrange the state flow for each process and
> clarify what can be recovered or not, what signals corruption or
> errors, and how the returned result can point to the current process
> status (\state)
> 
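The generic-plus-specific state idea above might look something like this. It is a sketch under assumptions, not the current VDSM implementation: the class layout (`Task`, `StorageTask`, `NetworkTask` with `PRE`, `MIDDLE` and `POST` lists) is hypothetical, and only the state names come from the discussion.

```python
class Task:
    """Hypothetical base task: generic states wrap subclass-specific ones."""

    PRE = ["starting", "started"]
    MIDDLE = []  # each task type fills in its own middle states
    POST = ["finishing", "finished"]

    def __init__(self, task_id):
        self.id = task_id
        self._states = self.PRE + self.MIDDLE + self.POST
        self._index = 0

    @property
    def state(self):
        return self._states[self._index]

    def advance(self):
        # States only move forward, one step at a time.
        if self._index >= len(self._states) - 1:
            raise RuntimeError("task %s already finished" % self.id)
        self._index += 1
        return self.state

    def run(self):
        # Walk the whole flow and report every state that was visited.
        visited = [self.state]
        while self.state != "finished":
            visited.append(self.advance())
        return visited


class StorageTask(Task):
    MIDDLE = ["waitForResource", "processing"]


class NetworkTask(Task):
    MIDDLE = ["processing"]
```

Because the current state is a single index into a fixed sequence, persisting that index would be enough to remember where a task was when an error occurred.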
> 
> 
> 
> 
> 
> 
> - No task tags.  They are silly and the caller can mangle whatever
> in the task
> ID if he really wants to tag tasks. Yes.  Agreed.
> 
> - No explicit recovery stage.  VDSM will be crash-only, there
> should be
> efforts to make everything crash-safe.  If that is problematic, in
> case of
> networking, VDSM will recover on start without having a task for
> it. How does this work in practice for something like creating a new
> image from a
> template?
> 
> - No clean Task: Tasks can be started by any number of hosts; this
> means that
> there is no way to own all tasks.  There could be cases where VDSM
> starts
> tasks on its own and thus they have no owner at all.  The caller
> needs to
> continually track the state of VDSM. We will have broadcast events
> to
> mitigate polling. If a disconnected client might have missed a
> completion event, it
> will need to
> check state.  This means each async operation that changes state must
> document a
> procedure for checking progress of a potentially ongoing operation.
>  For
> Image.copy, that process would be to look up the new image and check
> its state.
> 
> - No revert.  Impossible to implement safely. How do the engine folks
> feel about this?  I am ok with it :) I don't care, unless they find
> a way to change the way logic works they can't have it.
> The whole concept of recovery (as it is defined now) doesn't work in
> an HA cluster.
> 
> 
> 
> - No SPM\HSM tasks.  SPM\SDM is no longer necessary for all domain
> types (only
> for type).  What used to be SPM tasks, or tasks that persist and
> can be
> restarted on other hosts, is covered in the previous bullet points. A
> nice simplification.
> 
> 
> --
> Adam Litke <agl at us.ibm.com> IBM Linux Technology Center
> _______________________________________________
> vdsm-devel mailing list vdsm-devel at lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
> 
> --
> Yaniv Bronhaim.
> RedHat, Israel
> 09-7692289
> 054-7744187


