Re: [Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API

29 Nov 2012

On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
...
They are not future proof as the paradigm is completely different.  Storage
domain IDs are not static any more (and are not guaranteed to be unique or the
same across the cluster.  Image IDs represent the ID of the projected data and
not the actual unique path.  Just as an example, to run a VM you give a list
of domains that might contain the needed images in the chain and the image ID
of the tip.  The paradigm is changed to and most calls get non synchronous
number of images and domains.  Further more, the APIs themselves are
completely different. So future proofing is not really an issue.
I don't understand this at all.  Perhaps we could all use some education on the
architecture of the planned architectural changes.  If I can pass an arbitrary
list of domainIDs that _might_ contain the data, why wouldn't I just pass all
of them every time?  In that case, why are they even required since vdsm would
have to search anyway?
...
As to making the current API a bit simpler. As I said, making them opaque is
problematic as currently the engine is responsible for creating the IDs.
As I mentioned in my last post, engine still can specify the ID's when the
object is first created.  From that point forward the ID never changes so it can
be baked into the identifier.
...
Further more, some calls require you to play with these (making a template
instead of a snapshot).  Also, the full chain and topology needs to be
completely visible to the engine.
Please provide a specific example of how you play with the IDs.  I can guess
where you are going, but I don't want to divert the thread.
...
These things, as you said, are problematic. But this is the way things are
today.
We are changing them.
...
As for task IDs.  Currently task IDs are only used for storage and they get
persisted to disk. This is WRONG and is not the case with the new storage API.
Because we moved to an asynchronous message based protocol (json-rpc over
TCP\AMQP) there is no need to generate a task ID. it is built in to json-rpc.
json-rpc specifies that the IDs have to be unique for a client as long as the
request is still active.  This is good enough as internally we can have a verb
for a client to query it's own running tasks and a verb to query other host
tasks by mangling in the client before the ID.  Because the protocol is
So this would rely on the client keeping the connection open and as soon as it
disconnects it would lose the ability to query tasks from before the connection
went down?  I don't know if it's a good idea to conflate message ID's with task
ID's.  While the protocol can operate asynchronously, some calls have
synchronous semantics and others have asynchronous semantics.  I would expect
sync calls to return their data immediately and async calls to return
immediately with either: an error code, or an 'operation started' message and
associated ID for querying the status of the operation.
...
asynchronous all calls are asynchronous by nature well.  Tasks will no longer
be persisted or expected to be persisted. It's the callers responsibility to
query the state and see if the operation succeeded or failed if the caller or
VDSM died in the middle of the call. The current "cleanTask()" system can't be
used when more then one client is using VDSM and will not be used for anything
other then legacy storage.
I agree about not persisting tasks in the future.  Although I think finished
tasks should remain in memory for some time so they can be queried by a client
who must reconnect.
...
AFAIK Apart from storage all objects IDs are constructed with a single ID,
name or alias. VMs, storageConnections, network interfaces. So it's not a real
issue.  I agree that in the future we should keep the idiom of pass
configuration once, name it, and keep using the name to reference the object.
Yes, storage is the major problem here.
...
----- Original Message -----
...
From: "Adam Litke" <agl@us.ibm.com>
To: "Saggi Mizrahi" <smizrahi@redhat.com>
Cc: engine-devel@linode01.ovirt.org, "Dan Kenigsberg" <danken@redhat.com>, "Federico Simoncelli"
<fsimonce@redhat.com>, "Ayal Baron" <abaron@redhat.com>, vdsm-devel@lists.fedorahosted.org
Sent: Thursday, November 29, 2012 4:18:40 PM
Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
...
This is all only valid for the current storage API the new one
doesn't have
pools or volumes. Only domains and images.  Also, images and
domains are more
loosely coupled and make this method problematic.
I am looking for an incremental way to bridge the differences.  It's
been 2
years and we still don't have the revamped storage API so I am
planning on what
we have being around for awhile :)  I think that defining object
identifiers as
opaque structured types is also future proof.  In the future an
Image-ng object
we can drop 'storagepoolID' from the identifier and, if it makes
sense, remove
the hard association with a storageDomain as well.  The point behind
this
refactoring is to give us the option of coupling multiple UUID's (or
other data)
to form a single, opaque identifier.
...
That being said, if we do choose to make the current storage API
officially
supported I do agree that it looks a bit simpler but for the price
of forcing
the user to construct these objects before sending the request. I
know for a
fact that the engine will just create these objects on the fly
because they
use their own objects to group things logically. This means adding
more work
instead of removing it.  Most clients will do that anyway as they
will use
their own DAL to store these relationships.
Thanks for bringing up some of these points.  All deserve attention
so I will
address each one individually:
The current API does not yet make an official statement of support
for anything.
I want to model the current storage API so that the node level API
can have the
same level of functionality as is currently supported.  I am all for
removing
deprecated functions and redesigning in-place for a reasonable amount
of time
going forward.  In a perfect world, libvdsm-1.0 would release with no
mention of
storage pools at all.
If properly designed, the end-user (including engine) would never be
constructing these objects itself.  Object identifiers are
essentially opaque
structures.  In order to make this possible, we need to make sure
that the API
provides all of the functions needed to lookup objects.  So far these
are:
StoragePool:	Host.getConnectedStoragePools
StorageDomain:	Host.getStorageDomains
Image:		StorageDomain.getImages
Volume:		StorageDomain.getVolumes / Image.getVolumes
VM:		Host.getVMList
Task:		Host.getAllTasks
All of the above would return object identifiers.
The other case is for creation of new resources.  In that case, the
create
method needs to move to the owning object.  The key example is
VM.create which
should move to Host.createVM.  Functions such as Host.createVM could
still
accept a vmUUID (because I assume engine does want the ability to set
this
explicitly).  However, they should be changed to return either a
TaskIdentifier
(if the creation is asynchronous) or a *Identifier (eg. VmIdentifier)
if the
object was created synchronously.
All told, I think the net is less work for clients.  They will no
longer need to
model the object associations and relationships because the API will
take care
of that automatically.
...
----- Original Message -----
...
From: "Adam Litke" <agl@us.ibm.com>
To: vdsm-devel@lists.fedorahosted.org
Cc: engine-devel@linode01.ovirt.org, "Dan Kenigsberg"
<danken@redhat.com>, "Federico Simoncelli"
<fsimonce@redhat.com>, "Saggi Mizrahi" <smizrahi@redhat.com>,
"Ayal Baron" <abaron@redhat.com>
Sent: Thursday, November 29, 2012 12:19:06 PM
Subject: RFD: API: Identifying vdsm objects in the next-gen API
Today in vdsm, every object (StoragePool, StorageDomain, VM,
Volume,
etc) is
identified by a single UUID.  On the surface, it seems like this
is
enough info
to properly identify a resource but in practice it's not.  For
example, when you
look at the API's dealing with Volumes, almost all of them
require an
sdUUID,
spUUID, and imgUUID in order to provide proper context for the
operation.
Needing to provide these extra UUIDs is a burden on the API user
because knowing
which values to pass requires internal knowledge of the API.  For
example, the
spUUID parameter is almost always just the connected storage
pool.
 Since we
know there can currently be only one connected pool, the value is
known.
I would like to move away from needing to understand all of these
relationships
from the end user perspective by encapsulating the extra context
into
new object
identifier types as follows:
StoragePoolIdentifier:
    { 'storagepoolID': 'UUID' }
StorageDomainIdentifier:
    { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID' }
ImageIdentifier:
    { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID',
    'imageID':
    'UUID' }
VolumeIdentifier:
    { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID',
      'imageID': 'UUID', 'volumeID': 'UUID' }
TaskIdentifier:
    { 'taskID': 'UUID' }
VMIdentifier:
    { 'vmID': 'UUID' }
In the new API, anytime a reference to an object is required, one
of
the above
structures must be passed in place of today's single UUID.  In
many
cases, this
will allow us to reduce the number of parameters to the function
since the
needed contextual parameters (spUUID, etc) will be part of the
object's
identifier.  Similarly, any time the API returns an object
reference
it would
return a *Identifier instead of a bare UUID.
These identifier types are basically opaque blobs to the API
users
and are only
ever generated by vdsm itself.  Because of this, we can change
the
internal
structure of the identifier to require new information or (before
freezing the
API) remove fields that no longer make sense.
I would greatly appreciate your comments on this proposal.  If it
seems
reasonable, I will revamp the current schema to make the
necessary
changes and
provide the Bridge patch functions to convert between the current
implementation
and the new schema.
--- sample schema patch ---
commit 48f6b0f0a111dd0b372d211a4e566ce87f375cee
Author: Adam Litke <agl@us.ibm.com>
Date:   Tue Nov 27 14:14:06 2012 -0600
schema: Introduce class identifier types
When calling API methods that belong to a particular class, a
    class instance
    must be indicated by passing a set of identifiers in the
    request.
     The location
    of these parameters within the request is: 'params' ->
    '__obj__'.
     Since this
    set of identifiers must be used together to correctly
    instantiate
    an object, it
    makes sense to define these as proper types within the API.
     Then, functions
    that return an object (or list of objects) can refer to the
    correct type.
Signed-off-by: Adam Litke <agl@us.ibm.com>

diff --git a/vdsm_api/vdsmapi-schema.json
b/vdsm_api/vdsmapi-schema.json
index 0418e6e..7e2e851 100644
--- a/vdsm_api/vdsmapi-schema.json
+++ b/vdsm_api/vdsmapi-schema.json
@@ -937,7 +937,7 @@
 # Since: 4.10.0
 ##
 {'command': {'class': 'Host', 'name':
 'getConnectedStoragePools'},
- 'returns': ['StoragePool']}
+ 'returns': ['StoragePoolIdentifier']}
##
 # @BlockDeviceType:
@@ -1572,7 +1572,7 @@
 {'command': {'class': 'Host', 'name': 'getStorageDomains'},
  'data': {'*storagepoolID': 'UUID', '*domainClass':
  'StorageDomainImageClass',
           '*storageType': 'StorageDomainType', '*remotePath':
           'str'},
- 'returns': ['StorageDomain']}
+ 'returns': ['StorageDomainIdentifier']}
##
 # @Host.getStorageRepoStats:
@@ -2406,7 +2406,7 @@
 ##
 {'command': {'class': 'Host', 'name': 'getVMList'},
  'data': {'*vmList': ['UUID']},
- 'returns': ['VM']}
+ 'returns': ['VMIdentifier']}
##
 # @Host.ping:
@@ -2744,10 +2744,11 @@
  'returns': 'ConnectionRefMap'}
## Category: @ISCSIConnection
 ##################################################
+
 ##
-# @ISCSIConnection:
+# @ISCSIConnectionIdentifier:
 #
-# ISCSIConnection API object.
+# Identifier for an ISCSIConnection object.
 #
 # @host:      A fully-qualified domain name (FQDN) or IP address
 #
@@ -2757,11 +2758,21 @@
 #
 # @password:  #optional The password associated with the given
 username
 #
-# Since: 4.10.0
+# Since: 4.10.1
+##
+{'type': 'ISCSIConnectionIdentifier',
+ 'data': {'host': 'str', 'port': 'int', '*user': 'str',
'*password':
'str'}}
+
+##
+# @ISCSIConnection:
+#
+# ISCSIConnection API object.
+#
+# @ident:  The object identifier
+#
+# Since: 4.10.1
 ##
-{'class': 'ISCSIConnection',
- 'data': {'host': 'str', 'port': 'int', '*user': 'str',
-          '*password': 'str'}}
+{'class': 'ISCSIConnection', 'ident':
'ISCSIConnectionIdentifier'}
##
 # @ISCSIConnection.discoverSendTargets:
@@ -2777,10 +2788,11 @@
  'returns': ['str']}
## Category: @Image
 ############################################################
+
 ##
-# @Image:
+# @ImageIdentifier:
 #
-# Image API object.
+# Identifier for an Image object.
 #
 # @imageID:          The UUID of the Image
 #
@@ -2788,13 +2800,24 @@
 #
 # @storagedomainID:  The UUID of the Storage Domain associated
 with
 the Image
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'Image',
+{'type': 'ImageIdentifier',
  'data': {'imageID': 'UUID', 'storagepoolID': 'UUID',
           'storagedomainID': 'UUID'}}
##
+# @Image:
+#
+# Image API object.
+#
+# @ident:  The object identifier
+#
+# Since: 4.10.1
+##
+{'class': 'Image', 'ident': 'ImageIdentifier'}
+
+##
 # @Image.delete:
 #
 # Delete the Image and all of its Volumes.
@@ -2843,7 +2866,7 @@
 # Since: 4.10.0
 ##
 {'command': {'class': 'Image', 'name': 'getVolumes'},
- 'returns': ['Volume']}
+ 'returns': ['VolumeIdentifier']}
##
 # @Image.mergeSnapshots:
@@ -2905,17 +2928,26 @@
## Category: @LVMVolumeGroup
 ###################################################
 ##
+# @LVMVolumeGroupIdentifier:
+#
+# An identifier for a LVMVolumeGroup object.
+#
+# @lvmvolumegroupID:  The volume group UUID
+#
+# Since: 4.10.1
+##
+{'type': 'LVMVolumeGroupIdentifier', 'data':
{'lvmvolumegroupID':
'UUID'}}
+
+##
 # @LVMVolumeGroup:
 #
 # LVMVolumeGroup API object.
 #
-# @lvmvolumegroupID:  #optional Associate this object with an
existing LVM
-#                     Volume Group
+# @ident:  The object identifier
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'LVMVolumeGroup',
- 'data': {'lvmvolumegroupID': 'UUID'}}
+{'class': 'LVMVolumeGroup', 'ident': 'LVMVolumeGroupIdentifier'}
##
 # @LVMVolumeGroup.create:
@@ -2964,21 +2996,32 @@
## Category: @StorageDomain
 ####################################################
 ##
-# @StorageDomain:
+# @StorageDomainIdentifier:
 #
-# StorageDomain API object.
+# An identifier for a StorageDomain object.
 #
 # @storagedomainID:  Associate this object with a new or
 existing
 Storage Domain
 #
 # @storagepoolID:    #optional The Storage Pool UUID if this
 Storage
 Domain is
 #                    attached
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'StorageDomain',
+{'type': 'StorageDomainIdentifier',
  'data': {'storagedomainID': 'UUID', 'storagepoolID': 'UUID'}}
##
+# @StorageDomain:
+#
+# StorageDomain API object.
+#
+# @ident:  The object identifier
+#
+# Since: 4.10.1
+##
+{'class': 'StorageDomain', 'ident': 'StorageDomainIdentifier'}
+
+##
 # @StorageDomain.activate:
 #
 # Activate an attached but inactive Storage Domain.
@@ -3184,7 +3227,7 @@
 # Since: 4.10.0
 ##
 {'command': {'class': 'StorageDomain', 'name': 'getImages'},
- 'returns': ['Image']}
+ 'returns': ['ImageIdentifier']}
##
 # @StorageDomainRole:
@@ -3295,7 +3338,7 @@
 ##
 {'command': {'class': 'StorageDomain', 'name': 'getVolumes'},
  'data': {'imageID': 'UUID'},
- 'returns': ['Volume']}
+ 'returns': ['VolumeIdentifier']}
##
 # @StorageDomain.setDescription:
@@ -3355,15 +3398,26 @@
## Category: @StoragePool
 ######################################################
 ##
+# @StoragePoolIdentifier:
+#
+# An identifier for a StoragePool object.
+#
+# @storagepoolID:  Associate this object with a new or existing
Storage Pool
+#
+# Since: 4.10.1
+##
+{'type': 'StoragePoolIdentifier', 'data': {'storagepoolID':
'UUID'}}
+
+##
 # @StoragePool:
 #
 # StoragePool API object.
 #
-# @storagepoolID:  Associate this object with a new or existing
Storage Pool
+# @ident:  The object identifier
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'StoragePool', 'data': {'storagepoolID': 'UUID'}}
+{'class': 'StoragePool', 'ident': 'StoragePoolIdentifier'}
##
 # @StoragePool.connect:
@@ -3629,7 +3683,7 @@
 ##
 {'command': {'class': 'StoragePool', 'name':
 'getDomainsContainingImage'},
  'data': {'imageID': 'UUID', '*onlyDataDomains': 'bool'},
- 'returns': ['StorageDomain']}
+ 'returns': ['StorageDomainIdentifier']}
##
 # @StoragePool.getIsoList:
@@ -4058,15 +4112,27 @@
## Category: @Task
 #############################################################
 ##
+# @TaskIdentifier:
+#
+# An identifier for a Task object.
+#
+# @taskID:  The task UUID
+#
+# Since: 4.10.1
+##
+{'type': 'TaskIdentifier', 'data': {'taskID': 'UUID'}}
+
+
+##
 # @Task:
 #
 # Task API object.
 #
-# @taskID:  Associate this object with an existing Task
+# @ident:  The object identifier
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'Task', 'data': {'taskID': 'UUID'}}
+{'class': 'Task', 'ident': 'TaskIdentifier'}
##
 # @Task.clear:
@@ -4123,15 +4189,26 @@
## Category: @VM
 ###############################################################
 ##
+# @VMIdentifier:
+#
+# An identifier for a VM object.
+#
+# @vmID:  The task UUID
+#
+# Since: 4.10.1
+##
+{'type': 'VMIdentifier', 'data': {'vmID': 'UUID'}}
+
+##
 # @VM:
 #
 # VM API object.
 #
-# @vmID:  Associate this object with an existing VM
+# @ident:  The object identifier
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'VM', 'data': {'vmID': 'UUID'}}
+{'class': 'VM', 'ident': 'VMIdentifier'}
##
 # @DriveSpecVolume:
@@ -5161,9 +5238,9 @@
## Category: @Volume
 ###########################################################
 ##
-# @Volume:
+# @VolumeIdentifier:
 #
-# Volume API object.
+# An identifier for a Volume object.
 #
 # @volumeID:         The UUID of the Volume
 #
@@ -5173,13 +5250,24 @@
 #
 # @imageID:          The Image associated with @UUID
 #
-# Since: 4.10.0
+# Since: 4.10.1
 ##
-{'class': 'Volume',
+{'type': 'VolumeIdentifier',
  'data': {'volumeID': 'UUID', 'storagepoolID': 'UUID',
           'storagedomainID': 'UUID', 'imageID': 'UUID'}}
##
+# @Volume:
+#
+# Volume API object.
+#
+# @ident:  The object identifier
+#
+# Since: 4.10.1
+##
+{'class': 'Volume', 'ident': 'VolumeIdentifier'}
+
+##
 # @VolumeRole:
 #
 # An enumeration of Volume Roles.
--
Adam Litke <agl@us.ibm.com>
IBM Linux Technology Center
--
Adam Litke <agl@us.ibm.com>
IBM Linux Technology Center
-- 
Adam Litke <agl@us.ibm.com>
IBM Linux Technology Center