Storage Device Management in VDSM and oVirt

Ayal Baron abaron at redhat.com
Thu Apr 19 07:47:43 UTC 2012



----- Original Message -----
> On Wed, Apr 18, 2012 at 09:06:36AM -0400, Ayal Baron wrote:
> > 
> > 
> > ----- Original Message -----
> > > On Tue, Apr 17, 2012 at 03:38:25PM +0530, Shireesh Anjal wrote:
> > > > Hi all,
> > > > 
> > > > As part of adding Gluster support in oVirt, we need to introduce
> > > > some Storage Device management capabilities (on the host). Since
> > > > these are quite generic and not specific to Gluster as such, we
> > > > think it might be useful to add them as a core vdsm and oVirt
> > > > feature. At a high level, this involves the following:
> > > > 
> > > >  - A "Storage Devices" sub-tab on "Host" entity, displaying
> > > > information about all the storage devices*
> > > >  - Listing of different types of storage devices of a host
> > > >     - Regular Disks and Partitions*
> > > >     - LVM*
> > > >     - Software RAID*
> > > >  - Various actions related to device configuration
> > > >     - Partition disks*
> > > >     - Format and mount disks / partitions*
> > > >     - Create, resize and delete LVM Volume Groups (VGs)
> > > >     - Create, resize, delete, format and mount LVM Logical
> > > >       Volumes (LVs)
> > > >     - Create, resize, delete, partition, format and mount
> > > >       Software RAID devices
> > > >  - Edit properties of the devices
> > > >  - The UI can be modeled on the system-config-lvm tool
> > > > 
> > > > The items marked with (*) in the above list are urgently
> > > > required for the Gluster feature, and will be developed first.
> > > > 
> > > > Comments / inputs welcome.
> > > 
> > > This seems like a big undertaking, and I would like to understand
> > > the complete use case of this. Is it intended to create the block
> > > storage devices on top of which a Gluster volume will be created?
> > 
> > Yes, but not only.
> > It could also be used to create the file system on top of which you
> > create a local storage domain (just an example, there are many
> > others, more listed below).
> > 
> > > 
> > > I must say that we had a bad experience with exposing low-level
> > > commands over the Vdsm API: a Vdsm storage domain is a VG with
> > > some metadata on top. We used to have two API calls for creating
> > > a storage domain: one to create the VG and one to add the
> > > metadata and make it an SD. But it is pretty hard to handle all
> > > the error cases remotely. It proved more useful to have one
> > > atomic command for the whole sequence.
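
(For illustration, a rough sketch of the difference in Python-style
pseudocode; the verb names echo the ones discussed in this thread, but
the signatures and the removeVG helper are simplified assumptions, not
the actual Vdsm API:)

    def create_domain_split(host, devices, sd_uuid, name):
        host.createVG(name, devices)                 # step 1: bare VG
        try:
            host.createStorageDomain(sd_uuid, name)  # step 2: add SD metadata
        except Exception:
            # The remote caller now has to decide whether the half-created
            # VG can be safely removed -- hard to get right over the wire.
            host.removeVG(name)
            raise

    def create_domain_atomic(host, devices, sd_uuid, name):
        # One atomic verb: create the VG and lay down the SD metadata in a
        # single local step, so error handling and rollback stay on the host.
        return host.createStorageDomain(sd_uuid, name, devices)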
> > > 
> > > I suspect that this would be the case here, too. I'm not sure if
> > > using Vdsm as an ssh-replacement for transporting lvm/md/fdisk
> > > commands is the best approach.
> > 
> > I agree, we should either provide added value or figure out a way
> > so that we don't need to simply add a verb every time the
> > underlying APIs add something.
> > 
> > > 
> > > It may be better to have a single verb for creating a Gluster
> > > volume out of block storage devices. Something like: "take these
> > > disks, partition them, build a raid, cover it with a VG, carve
> > > out some LVs and make each of them a Gluster volume".
> > > 
> > > Obviously, it is not simple to define a good language to describe
> > > a general architecture of a Gluster volume. But it would have to
> > > be done somewhere - if not in Vdsm then in Engine; and I suspect
> > > it would be better done on the local host, not beyond a fragile
> > > network link.
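
(For illustration, such a self-describing verb might look something like
the sketch below, in Python style; the verb name and every parameter are
hypothetical, not an existing Vdsm or Gluster API:)

    # Hypothetical one-shot verb: the host receives the whole desired
    # layout and performs the disk -> RAID -> VG -> LV -> filesystem
    # sequence locally, where errors can be handled in one place.
    host.createGlusterBrick(
        devices=["/dev/sdb", "/dev/sdc"],     # raw disks to consume
        raid={"level": 10, "chunk": "256K"},  # optional md RAID on top
        vg="vg_bricks",                       # VG covering the RAID device
        lvs=[{"name": "brick1", "size": "100%FREE",
              "fs": "xfs", "mount": "/bricks/brick1"}],
    )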
> > > 
> > > Please note that currently, Vdsm makes a lot of effort not to
> > > touch LVM metadata of existing VGs on regular "HSM" hosts. All
> > > such operations are done on the engine-selected "SPM" host. When
> > > implementing this, we must bear in mind these safeguards and
> > > think whether we want to break them.
> > 
> > I'm not sure I see how this is relevant; we allow creating a VG on
> > any host today, and that isn't going to change...
> 
> We have one painful exception; that alone is no reason to add more.
> Note that currently, Engine uses the would-be-SPM for VG creation. In
> the gluster use case, any host is expected to do this on async
> timing. It might be required, but it's not warm and fuzzy.

Engine does not use the SPM for VG creation; it uses whatever host the user wishes to use (there is a drop-down to choose, and you can pass the host in the API). The default host in the drop-down is the SPM, but it's optional.
In addition, in the next major version we'll have SDM, and then there will be no good default.
This also means that the same host would be allowed to manipulate the storage anyway.

> 
> > 
> > In general, we know that we already need to support using a LUN
> > even if it has partitions on it (with force or something).
> > 
> > We know that we have requirements for more control over the VG that
> > we create, e.g. supporting striping, control over max LV size
> > (limited by PV extent size today), etc.
> > 
> > We also know that users would like a way not only to use a local
> > dir for a storage domain but also to create the directory + fs?
> 
> These three examples are storage domain flavors...

Yes, but these would bloat createStorageDomain, and the more flexibility we need, the more bloat we'll see.

> 
> > 
> > We know that in the Gluster use case we would like the ability to
> > set up Samba over the gluster volume, and probably iSCSI as well.
> 
> Now I do not see the relevance. Configuring gluster and how it
> exposes its volume is something other than preparing block storage
> for gluster bricks.

Not preparing block storage for bricks, but exposing a gluster volume as a LUN to other hosts.
Basically we are moving towards a services view here, and these services will all need the ability to set up the underlying storage before exposing it.

> 
> > 
> > So although I believe that when we create a gluster volume or an
> > oVirt storage domain we indeed shouldn't need a lot of low-level
> > commands, it would appear to me that not allowing for more control
> > when needed is not going to work, and that there are enough use
> > cases which involve neither a gluster volume nor a storage domain
> > to warrant making this generic.
> 
> I'm not against more control; I'm against an uncontrollable API such
> as runThisLvmCommandAsRoot().

I can't argue with this.
I think what we're missing here, though, is something similar to setupNetworks, which would solve the problem: not 100 verbs (createPartition, createFS, createVG, createLV, setupRaid, ...) but rather a single setupStorage (better name suggestions are welcome) which would get the list of objects to use and the final configuration to set up.

This way we'd have a 2-stage process (sketched below):
1. setupStorage (generic)
2. createSD/createGlusterVolume/... (plugin specific)
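
To make this concrete, here is a minimal sketch of what the two stages might look like, assuming a Python-style host API. setupStorage, createSD and createGlusterVolume are the names proposed above; all the parameters and configuration keys are illustrative assumptions, not a spec:

    # Stage 1 (generic): pass the devices to consume and the desired end
    # state; the host works out the partition/md/LVM steps and their
    # rollback.  The shape of the arguments here is an assumption.
    host.setupStorage(
        devices=["/dev/sdb", "/dev/sdc"],
        config={
            "raid": {"level": 10, "chunk": "256K"},
            "vg":   {"name": "vg_bricks"},
            "lvs":  [{"name": "brick1", "size": "100%FREE",
                      "fs": "xfs", "mount": "/bricks/brick1"}],
        },
    )

    # Stage 2 (plugin specific): consume the storage prepared in stage 1.
    host.createGlusterVolume(name="vol1", bricks=["host1:/bricks/brick1"])

The same stage-1 call could just as well feed createSD for a local storage domain, which is what makes it generic.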


