Few questions on Backup Restore API implmn.

Mon Jun 10 22:38:43 UTC 2013

----- Original Message -----
> 
> Hi Ayal,
> 
> On 06/07/2013 02:10 AM, Ayal Baron wrote:
> > 
> > 
> > ----- Original Message -----
> >> On 06/05/2013 02:14 PM, Deepak C Shetty wrote:
> >>> On 06/05/2013 03:58 PM, Michael Pasternak wrote:
> >>>> Hi Deepak,
> >>>>
> >>>> On 05/30/2013 06:07 PM, Deepak C Shetty wrote:
> >>>>> Hi All,
> >>>>>     Per the previous mail from Ayal.. I am refering to the flow
> >>>>>     below...
> >>>>>
> >>>>> The backup flow should be (all api calls are engine REST API):
> >>>> I disagree with this, by following this pattern you move internal
> >>>> logic (even more than this given implementation specific logic) to
> >>>> the api users instead of abstracting it and creating solid api for
> >>>> backups,
> > 
> > This is not internal logic.
> > I think what is not clear here is that we're *not* modelling a backup flow.
> > We're modelling a flow to allow a user to expose an existing disk to a
> > virt-app / physical appliance, while the VM which owns the disk is still
> > running.
> 
> in this case this exposure is relevant for the backup and is a part of the
> disk backup process.

That is an *example* of a specific usage. You should not program to a specific usage, rather build mechanisms that can be reused.

> 
> > There can be a horde of uses for this, off the top of my head:
> > 1. obviously backup
> > 2. scan disk for viruses externally
> > 3. scan disk for inventory management
> > There will be many more which users come up with.
> 
> so?, i can't see why this is relevant? you can reuse this code in the backend
> later
> when you need it for the new functionalities, - this is basic coding
> practice, and
> we do that across the backend by reusing internal commands.

The point is we're adding these functionalities *right now* without having to duplicate or write usage specific APIs.

> 
> i can give you at least one example where bad design of the backend command
> forces backend clients such as api, UI, etc. to implement same logic several
> times manually without having any ability to perform atomic operation against
> a backend:
> 
> snapshot.restore()
> ==================
> 
> you have to perform two separate steps in api against a backend:
> 
> 1. VdcActionType.TryBackToAllSnapshotsOfVm
> 2. VdcActionType.RestoreAllSnapshots

The fact that there are bad APIs doesn't reflect on our current proposal.
In the above example I don't even understand the names of the API calls, let alone what they do and why the separation was made.  It may very well be that the separation here is wrong, but that does not reflect anything on the API we're talking about here.

> 
> well, we all developers and can do it once so api/UI/etc. users will see this
> as a
> single operation and save the pain of learning restore() logic and running
> two
> separate commands for restore to happen.
> 
> in your case this much worse, case you expect every api user to *learn vendor
> specific* logic for backup/foo() and implement the steps + understand the
> responses
> of each step and act accordingly (and i'm not even mentioning the atomic ops.
> here)

I think there is some misunderstanding here.
The 'users' *are* the backup vendors.
These APIs are intended directly for backup vendors who will integrate once with the API.
Users wanting to backup or restore will go to the vendors application not to oVirt.

> 
> i apologise, but this is a bad thing to do, this is not how public api should
> look like.
> 
> > 
> > Designing the API around the specific usage (policy) is limiting and will
> > lead to a lot more API calls due to the need to add new ones for every new
> > usage someone thinks of.
> 
> that's what you suggested to begin with (and i'm opposed to it) -
> e.g leaving "exposer of existing disk to a virt-app / physical appliance"
> in the api

I don't understand this comment. What are you opposed to and why?

> 
> > 
> > The API should provide users with useful mechanisms.
> > In this case what we need is the ability to get a point in time of a disk
> > available for read (and in some cases write) on one of the hosts (and
> > later on possibly over the net as well, assuming we secure it).
> > I do not always want to take a snapshot for example, I want to be able to
> > use the API to expose an existing snapshot.
> > Once we add live merge, user may *choose* to merge immediately after backup
> > is finished or to keep the snapshot (policy).
> 
> these all are implementation details that can be easily hidden by adding
> extra parameters to parameters holder as i described bellow,
> 
> actually this is exactly where abstraction shines and only proves my point.

Parameters that do what exactly?
Let's describe this more clearly. In a single API this would be (pseudocode):

prepareDiskForBackupOrRestore(int hostID, int diskID, boolean autoAttach, int vmID, boolean createSnapshot, int snapshotID, boolean autoMerge, boolean writeable)

tearDownDisk(...)

hostID         - which host to expose the disk on
diskID         - which disk should be exposed
autoAttach     - should disk be attached to a VM or not
vmID           - if VM should be attached then which VM to attach it to
createSnapshot - should operation first create a snapshot or not
snapshotID     - if createSnapshot is false then which snapshot should be used
autoMerge      - should snapshot be automatically merged when operation ends (may only be 'True' if createSnapshot is true) - this would fail today if the VM we're backing up is running since we do not yet support live merge.
writeable      - should disk be exposed read only or read write (may only be 'True' if autoMerge is False)

Just by looking at the API you can see that it's wrong.  All these variables change the entire flow of the method and should clearly be broken down into separate calls.
Note also that any assumption of atomicity here is false since you cannot guarantee atomicity of all of these operations (e.g. once snapshot creation completes and operation moves to the next phase you cannot 'rollback' as the disks are no longer in the same state).

The alternative is:

getSnapConfig  (int snapshotID) *
mountDisk      (int hostId, int diskId, int snapshotId, boolean writeable)
unmountDisk    (...)

And for the rest of the options:
createSnapshot (already exists today and is optional in the process)
attachDisk     (already exists today and is optional in the process)
detachDisk     (already exists today and is optional in the process)
deleteSnapshot (already exists today and is optional in the process) - note that since this is separate, user can run this at a later point in time when the VM to backup is not running.

With the latter approach, there are 3 new API calls (compared with 2 in a monolithic approach) and we immediately support other usages such as:
scan disk for viruses externally
scan disk for inventory management
etc, without writing a single line of code nor bloating the API with every usage imaginable.

> 
> > 
> > 
> > 
> > 
> >>>>
> >>>> +++++++++ several api calls for single backup ++++++++++++++
> >>>>
> >>>>
> >>>> cons:
> >>>> ====
> >>>>
> >>>> 1. user have to be aware of internal (implementation specific) flow of
> >>>> each provider in order to be able creating backup
> >>>>    1.1 call add(a)->do(a)->del(a)->do(b)->etc.
> >>>> 2. no flexibility for backup provider to change the backup flow (as it
> >>>> would be considered an api break)
> >>>> 3. no atomic resolution for the /backup failures
> >>>>    3.1. failure at do(b) of add(a)->do(a)->del(a)->do(b)->etc. will mean
> >>>>         that user have to del(a) manually and if it's deeper in the flow
> >>>>         every api user will have to implement own rollbacks to the steps
> >>>>         that took
> >>>>         place before the failure.
> >>>> 4. backward compatibility becomes an issue when exposing internals in
> >>>> such
> >>>> a low level.
> >>>> 5. new features expose becomes much complex task cause they may require
> >>>> performing extra
> >>>>    step/s in a middle.
> >>>>
> >>>> pros:
> >>>> ====
> >>>>
> >>>> can't think of any ..., cause it doesn't give you flexibility, but
> >>>> creates
> >>>> unnatural
> >>>> (to public APIs) complexity.
> >>>
> >>> I was able to dig the below from one of the older mails. I had exactly
> >>> this
> >>> Q on why we need separate APIs Vs 1 and this is what Ayal had to say
> >>>
> >>> Deepak asked:
> >>>
> >>> /Wouldn't it be better if there was just 1 REST API for startign backup
> >>> which would take all the necessary inputs (VM uuid, hostid, disk(s)) and
> >>> internally caused the above individual APIs to get called ?
> >>> Why do we have to expose each of the steps in backup process as
> >>> individual REST APIs ?
> >>> /
> >>>
> >>> Ayal said:
> >>>
> >>> /1. Because a single API assumes that I always want to backup everything
> >>> which isn't necessarily true (normally backup policy for system disk and
> >>> data disks are different)
> >>
> >> modelling backup under /api/vms/xxx/disks/yyy/backups solves this.
> >>
> >>> 2. Going forward we will have disk level snapshot in which case the
> >>> backup
> >>> API would have to change.
> >>
> >> same
> >>
> >>> 3. You could always return the vm config per disk when calling "prepare
> >>> backup disk" which would be a bit redundant now but once we have disk
> >>> level snapshots it would be more relevant and this way remove 1 API call
> >>> now.
> >>> Later on when we do have disk level snaps it could be an option to the
> >>> command to take the snap or something I guess.
> >>
> >> it's only proves my point, we can/should hide this under api abstraction
> >> rather than forcing users using different flows now and then.
> >>
> >>> /
> >>> So i kind of agreed to what Ayal said... from the perspective that it
> >>> provides flexibility to the end user using the API, as to how he/she
> >>> wants
> >>> to script/use it.
> >>> What do you think ?  I just wanted to put the above here... so that we
> >>> are
> >>> sure we can considering all aspects before making the decision on 1 Vs
> >>> many APIs.
> >>
> >> i've mentioned this before, - i don't think this is about flexibility, if
> >> you
> >> want to allow
> >> user doing several things differently, - expose several api for same thing
> >> (like method overloads),
> >> but it doesn't mean that we should be exposing implementation internals to
> >> end user.
> >>
> >>>
> >>> 1 API provides simplicity, but getting flexibility using 1 API would mean
> >>> passing addnl params.. for eg> If i want to backup a particular vmdisk
> >>> only, then the xml below would change, rite ?
> >>
> >> no, you misunderstood, see url [1] from my previous eamil, you already
> >> doing
> >> this for specific disk - yyy
> >>
> >> [1] /api/vms/xxx/disks/yyy/backups
> >>
> >>> But looks like either way, its possible to get things done...
> >>> I personally don't have much experience in the area of REST APIs to take
> >>> the call on whether we need 1 or multiple APIs... so I am still
> >>> inconclusive here (sorry about that!)
> >>> All i can say is that with 1 API approach too, we can get flexibility.
> >>>
> >>>
> >>>>
> >>>>
> >>>> +++++++++ single api call for single backup ++++++++++++++
> >>>>
> >>>> cons:
> >>>> ====
> >>>>
> >>>> nothing comes to my mind, are there any?
> >>>>
> >>>> pros:
> >>>> ====
> >>>>
> >>>> 1. user is not aware of any backup internals and can consume different
> >>>> /backup
> >>>>    providers using very same api and supplying parameters relevant for
> >>>>    the
> >>>>    provider
> >>>>    he'd like to use.
> >>>> 2. backup providers can change/extend internal flow as much as they
> >>>> want/need, only
> >>>>    by modifying internal engine driver, user using abstraction over the
> >>>>    backup api
> >>>>    won't feel the difference.
> >>>> 3. backup providers can execute backup as single atomic operation and
> >>>> take
> >>>> care
> >>>>    for rollback cleanups in the transaction.
> >>>> 4. backup providers can easily maintain backward compatibility over
> >>>> abstraction
> >>>> 5. adding new features that require flow extension can be easily hidden
> >>>> under
> >>>>    the abstraction layer.
> >>>>
> >>>> also this is a common practice to expose abstraction layer to public,
> >>>> while
> >>>> vendors only have to implement the driver in order to be supported.
> >>>>
> >>>>
> >>>>> 1. create vm snapshot (the api call exists today)
> >>>>> 2. get VM Config (new API)
> >>>>> 3. prepare backup disk (new api, should probably accept: hostid, disk;
> >>>>> return: paths to device on host as well as map of changed blocks) -
> >>>>> this should be called for every disk that needs to be backed up. Note
> >>>>> that VM snapshot takes a snap of *all* disks of the VM
> >>>>> 4. attach disk to backup vm (the api call exists today.  This assumes
> >>>>> virt app) - also has to be called per disk to back up
> >>>>> 5. At this point backup app would do the backup
> >>>>>
> >>>> this can be easily done under the single api call:
> >>>> =================================================
> >>>>
> >>>> POST /api/vms/xxx/disks/yyy/backups
> >>>>
> >>>> <backup>
> >>>>   <host id=aaa> #3
> >>>>   <backup_vm id=bbb> #1, #2
> >>>>   <delete_snapshot>true|false<delete_snapshot/> #7
> >>>> <backup/>
> >>>>
> >>>> 1. create vm snapshot => you already in context of vm, trigger snapshot
> >>>> on
> >>>> it
> >>>> 2. get VM Config => you already in context of vm, collect relevant meta
> >>>> 3. prepare backup disk => you in context of the disk already, do
> >>>> whatever
> >>>> is needed
> >>>> 4. attach disk to backup vm => you have backup_vm id in the request body
> >>>> 5. do the actual backup
> >>>>
> >>>> does this makes sense?
> >>>>
> >>>>> 5) detach disk (symmetric to 4)
> >>>>> 6. teardown backup disk (symmetric to 3)
> >>>>> 7. delete snap - This can only be called if VM is down today and is
> >>>>> not mandatory in any event.  Once we have live merge, it should be
> >>>>> policy driven (user should be able to choose whether to keep or delete)
> >>>>>
> >>>>> Questions...
> >>>>>
> >>>>> 1) My current implmn of prepareBackupDisk in VDSM (pls see
> >>>>> http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:backup-restore,n,z)
> >>>>> returns drive and cbtinfo as dicts back to engine. The understnading
> >>>>> here
> >>>>> is that the drive returned will be a dict which has all the necessary
> >>>>> info
> >>>>> to represent a NBD drive to libvirt... hence this drive will be passed
> >>>>> as-is in the "attach disk to VM" REST API. Is this correct ?
> >>>>>
> >>>>> Note that i cannot return a disk UUID (like VDSM does for create volume
> >>>>> case).. because in preparebackupdisk case the image is exported usign
> >>>>> qemu-nbd
> >>>>> as a block device over the network... and hence its not a regular
> >>>>> vdsm-type disk, hence no UUID, Agree ?
> >>>>>
> >>>>> 2) In "attach disk to VM" REST API user will pass the drive dict which
> >>>>> he/she got as part of preparebackupdisk api.. as-is
> >>>>> and VDSM will need to add support for accepting NBD disk as a valid
> >>>>> disktype in VM.hotplugDisk() code of VDSM.
> >>>
> >>>>>
> >>>>> The ability to add a network disk is present in libvirtvm.py... as part
> >>>>> of my GlusterSD work, sicne gluster is represented as a NBD to QEMU
> >>>>> but it doesn't work at the API.py level.. its only from prepareImage
> >>>>> onwards... hence the need to modify API.py to accept NBD disk type and
> >>>>> connect
> >>>>> to the libvirtvm.py appropriately
> >>>>>
> >>>>> Is this acceptable.. otherwise there is no good way to pass the NBD
> >>>>> drive's info back to VDSM as part of existing "attach disk to VM" API
> >>>>> Also does the existing "attach disk to VM" API work if a pre-loaded
> >>>>> drive
> >>>>> dict/hashmap is provided by the user. This will have drive->type =
> >>>>> network
> >>>>> and not file/blcok as is currently supported
> >>>
> >>> Mike,
> >>>     Could you provide your views on this pls ? How do we represent the
> >>>     NBD
> >>>     disk and pass it back to VDSM as part of 'hotplugDisk' API
> >>> Currently hotplugDisk doesn't support drive of type = network!
> >>>
> >>>>>
> >>>>> 3) After "attach disk to backup VM" REST API step.. the understnading
> >>>>> is
> >>>>> that some backup vendor specific API will be called to tell the
> >>>>> backup appln....
> >>>>>     -- Block device (attached to the VM) to be used as src for backup
> >>>>>     -- CBT info of this blcok device (which was recd. as part of
> >>>>>     prepareBackupDisk API)
> >>>>> Is this correct ?
> >>>>>
> >>>>> thanx,
> >>>>> deepak
> >>>>>
> >>>>
> >>>
> >>
> >>
> >> --
> >>
> >> Michael Pasternak
> >> RedHat, ENG-Virtualization R&D
> >>
> 
> 
> --
> 
> Michael Pasternak
> RedHat, ENG-Virtualization R&D
>