----- Original Message -----
From: "Deepak C Shetty" <deepakcs(a)linux.vnet.ibm.com>
To: "Saggi Mizrahi" <smizrahi(a)redhat.com>
Cc: "Shu Ming" <shuming(a)linux.vnet.ibm.com>, "engine-devel"
<engine-devel(a)ovirt.org>, "VDSM Project Development"
<vdsm-devel(a)lists.fedorahosted.org>, "Deepak C Shetty"
<deepakcs(a)linux.vnet.ibm.com>
Sent: Sunday, December 16, 2012 11:40:01 PM
Subject: Re: [vdsm] RFC: New Storage API
On 12/08/2012 01:23 AM, Saggi Mizrahi wrote:
>
> ----- Original Message -----
>> From: "Deepak C Shetty" <deepakcs(a)linux.vnet.ibm.com>
>> To: "Saggi Mizrahi" <smizrahi(a)redhat.com>
>> Cc: "Shu Ming" <shuming(a)linux.vnet.ibm.com>, "engine-devel"
>> <engine-devel(a)ovirt.org>, "VDSM Project Development"
>> <vdsm-devel(a)lists.fedorahosted.org>, "Deepak C Shetty"
>> <deepakcs(a)linux.vnet.ibm.com>
>> Sent: Friday, December 7, 2012 12:23:15 AM
>> Subject: Re: [vdsm] RFC: New Storage API
>>
>> On 12/06/2012 10:22 PM, Saggi Mizrahi wrote:
>>> ----- Original Message -----
>>>> From: "Shu Ming" <shuming(a)linux.vnet.ibm.com>
>>>> To: "Saggi Mizrahi" <smizrahi(a)redhat.com>
>>>> Cc: "VDSM Project Development"
>>>> <vdsm-devel(a)lists.fedorahosted.org>, "engine-devel"
>>>> <engine-devel(a)ovirt.org>
>>>> Sent: Thursday, December 6, 2012 11:02:02 AM
>>>> Subject: Re: [vdsm] RFC: New Storage API
>>>>
>>>> Saggi,
>>>>
>>>> Thanks for sharing your thoughts; I have some comments below.
>>>>
>>>>
>>>> Saggi Mizrahi:
>>>>> I've been throwing a lot of bits out about the new storage API
>>>>> and I think it's time to talk a bit.
>>>>> I will purposefully try to keep implementation details out of it
>>>>> and concentrate on how the API looks and how you use it.
>>>>>
>>>>> The first major change is in terminology: there is no longer a
>>>>> storage domain but a storage repository.
>>>>> This change is being made because so many things are already
>>>>> called "domain" in the system, and it will make things less
>>>>> confusing for newcomers with a libvirt background.
>>>>>
>>>>> One other change is that repositories no longer have a UUID.
>>>>> The UUID was only used in the pool members manifest and is no
>>>>> longer needed.
>>>>>
>>>>>
>>>>> connectStorageRepository(repoId, repoFormat,
>>>>> connectionParameters={}):
>>>>> repoId - a transient name that will be used to refer to the
>>>>> connected domain; it is not persisted and doesn't have to be
>>>>> the same across the cluster.
>>>>> repoFormat - Similar to what used to be type (e.g. localfs-1.0,
>>>>> nfs-3.4, clvm-1.2).
>>>>> connectionParameters - This is format specific and will be used
>>>>> to tell VDSM how to connect to the repo.
>>>> Where does repoID come from? I think repoID doesn't exist before
>>>> connectStorageRepository() returns. Isn't repoID a return value
>>>> of connectStorageRepository()?
>>> No, repoIDs are no longer part of the domain, they are just a
>>> transient handle.
>>> The user can put whatever it wants there as long as it isn't
>>> already taken by another currently connected domain.
>> So what happens when the user mistakenly gives a repoID that is
>> already in use? There should be something in the return value that
>> specifies the error and/or the reason for the error, so that the
>> user can try again with a new/different repoID?
> As I said, connect fails if the repoId is in use ATM.
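To make the transient-handle semantics concrete, here is a minimal sketch of the behavior described above. It is only an illustration and not actual VDSM code: RepoRegistry and RepoIdInUse are hypothetical names, and raising an exception is just one assumed way of reporting the failure to the caller.

```python
class RepoIdInUse(Exception):
    """Hypothetical error: the requested transient repoId is already taken."""


class RepoRegistry:
    """Hypothetical in-memory table of currently connected repositories."""

    def __init__(self):
        # repoId -> (repoFormat, connectionParameters); never persisted.
        self._connected = {}

    def connect(self, repo_id, repo_format, connection_parameters=None):
        # Connecting fails if the handle is taken by a connected repository.
        if repo_id in self._connected:
            raise RepoIdInUse(repo_id)
        params = dict(connection_parameters or {})
        self._connected[repo_id] = (repo_format, params)

    def disconnect(self, repo_id):
        # Once disconnected, the handle is free to be reused.
        del self._connected[repo_id]
```

In this sketch the caller picks the handle, and the same handle can be reused once the repository holding it has been disconnected.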
>>>>> disconnectStorageRepository(self, repoId)
>>>>>
>>>>>
>>>>> In the new API there are only images, some images are mutable
>>>>> and
>>>>> some are not.
>>>>> mutable images are also called VirtualDisks
>>>>> immutable images are also called Snapshots
>>>>>
>>>>> There are no explicit templates, you can create as many images
>>>>> as
>>>>> you want from any snapshot.
>>>>>
>>>>> There are 4 major image operations:
>>>>>
>>>>>
>>>>> createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
>>>>> userData={}, options={}):
>>>>>
>>>>> targetRepoId - ID of a connected repo where the disk will be
>>>>> created
>>>>> size - The size of the image you wish to create
>>>>> baseSnapshotId - the ID of the snapshot you want to base the
>>>>> new virtual disk on
>>>>> userData - optional data that will be attached to the new VD;
>>>>> could be anything that the user desires.
>>>>> options - options to modify VDSM's default behavior
>> IIUC, I can use options to do storage offloads? For example, I can
>> create a LUN that represents this VD on my storage array based on
>> the 'options' parameter? Is this the intended way to use 'options'?
> No, this has nothing to do with offloads.
> If by "offloads" you mean having other VDSM hosts do the heavy
> lifting, then this is what the autoFix=False option and the fix
> mechanism are for.
> If you are talking about advanced SCSI features (e.g. write same),
> they will be used automatically whenever possible.
> In any case, how we manage LUNs (if they are even used) is an
> implementation detail.
I am a bit more interested in how storage array offloads (by that I
mean offloading VD creation, snapshot, clone etc. to the storage array
when available/possible) can be done from VDSM.
In the past there were talks of using libSM to do that. How does that
strategy play out in this new Storage API scenario? I agree it's an
implementation detail, but how and where that implementation sits and
how it would be triggered is not very clear to me. Looking at the
createVD args, it sounded like 'options' is a trigger point for
deciding whether to use storage offloads or not, but you said
otherwise :) Can you provide your vision of how VDSM can understand
the storage array capabilities and exploit storage array offloads in
this new Storage API context?

Thanks,
deepak

Some features will be used automatically whenever possible (storage
offloading).
Features that favor a specific strategy will be activated when the
matching strategy option (space, performance) is selected.
In cases where only the user can know whether to use a feature or
not, we will have options to turn it on.
In any case, every domain exports a capability list through
GetRepositoryCapabilities(), which returns a list of repository
capabilities.
Some capabilities are VDSM specific, like CLUSTERED or REQUIRES_SRM.
Some are storage capabilities, like NATIVE_SNAPSHOTS,
NATIVE_THIN_PROVISIONING, SPARSE_VOLUMES, etc.
We are also considering an override mechanism where you can disable
features on storage that supports them by setting it in the domain
options. This will be done with NO_XXXXX (e.g. NO_NATIVE_SNAPSHOTS).
This will make the domain neither use the capability nor expose it
through the API. I assume it will only be used for testing or in
cases where the storage array is known to have problems with a
certain feature. Not everything can be disabled; for example, there
is no real way to disable NATIVE_THIN_PROVISIONING or SPARSE_VOLUMES.
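
As a rough sketch of how the NO_XXXXX override could interact with the capability list (purely illustrative, not the actual implementation): the function name effective_capabilities and the NON_DISABLEABLE set are hypothetical, and rejecting an impossible override with an error is only an assumption; ignoring it silently would be just as plausible.

```python
# Hypothetical sketch; none of these names are real VDSM code.

# Capabilities that, per the discussion above, cannot really be turned off.
NON_DISABLEABLE = frozenset({"NATIVE_THIN_PROVISIONING", "SPARSE_VOLUMES"})


def effective_capabilities(native_caps, domain_options):
    """Return the capabilities a repository would use and expose.

    native_caps    - capabilities the storage actually provides
    domain_options - domain options, possibly including NO_XXXXX entries
    """
    caps = set(native_caps)
    for opt in domain_options:
        if not opt.startswith("NO_"):
            continue  # not an override; ignored by this sketch
        cap = opt[len("NO_"):]
        if cap in NON_DISABLEABLE:
            # Assumption: refuse the override rather than ignore it.
            raise ValueError("capability %s cannot be disabled" % cap)
        # A disabled capability is neither used nor reported.
        caps.discard(cap)
    return sorted(caps)
```

With this sketch, a repository that natively offers CLUSTERED, NATIVE_SNAPSHOTS and SPARSE_VOLUMES and is connected with NO_NATIVE_SNAPSHOTS in its options would report only CLUSTERED and SPARSE_VOLUMES.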