[ovirt-devel] Libvirt secrets management - take 2

Nir Soffer nsoffer at redhat.com
Sat Jun 13 21:14:03 UTC 2015


----- Original Message -----
> From: "Adam Litke" <alitke at redhat.com>
> To: "Nir Soffer" <nsoffer at redhat.com>
> Cc: "devel" <devel at ovirt.org>, "Francesco Romani" <fromani at redhat.com>, "Federico Simoncelli" <fsimonce at redhat.com>,
> "Dan Kenigsberg" <danken at redhat.com>, "Allon Mureinik" <amureini at redhat.com>, "Daniel Erez" <derez at redhat.com>,
> "Michal Skrivanek" <mskrivan at redhat.com>, "Eric Blake" <eblake at redhat.com>
> Sent: Saturday, June 13, 2015 4:52:19 PM
> Subject: Re: Libvirt secrets management - take 2
> 
> On 12/06/15 08:10 -0400, Nir Soffer wrote:
> >Here are more details on the new approach.
> >
> >A Ceph key is required only when starting a vm or hot-plugging a disk.
> >Once the operation is done, libvirt does not need the Ceph key any more.
> 
> >A vm operation requiring a secret will register a Ceph key using a new
> >random UUID, and remove the libvirt secret as soon as the operation has
> >finished or failed.
> >
> >This scheme does not require secret reference counting. If multiple vms
> >need the same Ceph key, we register it multiple times with libvirt,
> >using unique UUIDs.
> >
> >This also avoids possible races when removing a libvirt secret at the
> >same time another vm is trying to add it, or when updating a secret
> >usage id, which is currently racy (you must remove the existing secret
> >and register a new one).
> 
> I really like this new design.  As long as you remove the secrets when
> you're done with them then I am not concerned with having an
> accumulation of untracked secrets.  I am happy to get rid of the
> complexity of secret reference counting.
> 
> >Positive flow:
> >
> >- Engine adds required Ceph keys to vm description
> >- Vdsm registers the keys with libvirt, using a new random UUID
> >- When the vm operation is done (vm started, disk hot-plugged), remove
> >  the temporary secret (e.g. using try-finally)
> >
> >Negative flows:
> >
> >- Vm operation fails - temporary secret is unregistered in the finally
> >  block
> >- Vdsm crashes during the operation - temporary secret is removed when
> >  vdsm starts again.
> >- Libvirt crashes during the operation - secret is removed since we use
> >  ephemeral secrets
> >- Host crashes during the operation - same
> >- Libvirt fails to remove the secret - we cannot handle this :-)
> 
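
To make the flows above more concrete, here is a minimal sketch of the
register/remove cycle and of the cleanup when vdsm starts. It is not the
code in the patches. It assumes a libvirt-python connection object passed
in as conn, and a hypothetical "ovirt/" usage name prefix so we can
recognize our own secrets later; the function names are made up for
illustration.

```python
import base64
import uuid
from contextlib import contextmanager

# Hypothetical prefix used to recognize secrets created by this sketch.
USAGE_PREFIX = "ovirt/"

SECRET_XML = """\
<secret ephemeral='yes' private='yes'>
  <uuid>{uuid}</uuid>
  <usage type='ceph'>
    <name>{name}</name>
  </usage>
</secret>
"""


@contextmanager
def temporary_secret(conn, ceph_key_b64):
    """Register a Ceph key under a new random UUID; always remove it on exit."""
    secret_uuid = str(uuid.uuid4())
    # The usage name must be unique too, so derive it from the random UUID.
    xml = SECRET_XML.format(uuid=secret_uuid, name=USAGE_PREFIX + secret_uuid)
    secret = conn.secretDefineXML(xml)
    try:
        # libvirt takes the raw key bytes, not the base64 text.
        secret.setValue(base64.b64decode(ceph_key_b64))
        yield secret_uuid
    finally:
        # Runs both when the operation succeeded and when it failed.
        secret.undefine()


def clear_stale_secrets(conn):
    """Remove secrets left behind by a crash; meant to run when vdsm starts."""
    for secret in conn.listAllSecrets():
        if (secret.usageID() or "").startswith(USAGE_PREFIX):
            secret.undefine()
```

The caller would wrap the vm start or disk hot-plug in
"with temporary_secret(conn, key) as secret_uuid:" and use secret_uuid in
the auth element of the disk xml. Since every operation gets its own UUID,
there is nothing to reference count.
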
> What about the case where a storage domain becomes unavailable
> (causing the VM to pause with -EIO).  When the domainMonitor
> reestablishes a connection to the domain would the secrets need to be
> renewed?

We don't have domain monitoring for Ceph disks. I don't think we have
tested yet what happens when access to all Ceph monitors is blocked, but I
guess that qemu can renew the connection if needed, since it is holding the
authentication key.

> 
> >
> >Flows
> >=====
> >
> >Start vm
> >--------
> >- Engine adds required secrets to vm description
> >- Vdsm registers temporary secrets with libvirt
> >- When vm is up or if operation failed, Vdsm removes the temporary secret
> >
> >Migrate vm
> >----------
> >- Engine adds required secrets to vm description
> >- Vdsm adds secrets to the vm description sent to the destination
> >- On the destination, Vdsm registers temporary secrets with libvirt
> >- On the destination, when vm is up or if operation failed, Vdsm removes
> >  the temporary secret

One issue with migration - do we have to keep the same auth information as
in the original vm xml, or can we create the vm on the destination using
different auth xml?

In the original xml, each disk may have an auth element with a secret uuid.
This secret must be defined in the destination libvirt so qemu can connect
to the disk on the destination machine.
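
To show what "different auth xml" would mean, here is a rough sketch that
rewrites the secret uuid in the auth element of each disk. The function name
and the old-to-new uuid mapping are hypothetical, and it assumes the auth
element is a direct child of the disk element. Whether libvirt accepts a
destination xml that differs in this way is exactly the open question.

```python
import xml.etree.ElementTree as ET


def remap_secret_uuids(domain_xml, uuid_map):
    """Return domain xml with disk auth secret uuids replaced per uuid_map."""
    root = ET.fromstring(domain_xml)
    for secret in root.findall("./devices/disk/auth/secret"):
        old_uuid = secret.get("uuid")
        # Disks whose secret is not in the map are left unchanged.
        if old_uuid in uuid_map:
            secret.set("uuid", uuid_map[old_uuid])
    return ET.tostring(root, encoding="unicode")
```

If we must keep the original uuids, the destination would instead register
the temporary secrets using the uuids found in the incoming xml.
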

Francesco? Eric?

> >
> >Hot-plug disk
> >-------------
> >- Engine adds secret to disk description
> >- Vdsm registers temporary secret with libvirt
> >- When disk is successfully plugged, or if operation failed, Vdsm removes
> >  the temporary secret
> >
> >I think this is the correct direction, assuming that we can get it to
> >work for migration - I have no idea on that part.
> 
> +1 - This is a much improved design in my opinion.
> 
> >
> >----- Original Message -----
> >> From: "Nir Soffer" <nsoffer at redhat.com>
> >> To: "devel" <devel at ovirt.org>
> >> Cc: "Francesco Romani" <fromani at redhat.com>, "Federico Simoncelli"
> >> <fsimonce at redhat.com>, "Dan Kenigsberg"
> >> <danken at redhat.com>, "Adam Litke" <alitke at redhat.com>, "Allon Mureinik"
> >> <amureini at redhat.com>, "Daniel Erez"
> >> <derez at redhat.com>, "Michal Skrivanek" <mskrivan at redhat.com>, "Eric Blake"
> >> <eblake at redhat.com>
> >> Sent: Friday, June 12, 2015 2:21:46 PM
> >> Subject: Libvirt secrets management - take 2
> >>
> >> Hi all,
> >>
> >> Recently support for Ceph network disks landed in master. It is now
> >> possible to start a vm using a Ceph network disk or hot-plug/unplug
> >> such a disk using Cephx authentication.
> >>
> >> However, to make it work, you must add the relevant Ceph secret to
> >> libvirt manually, in the same way it is done in OpenStack deployment.
> >> Our goal is to manage secrets automatically and use ephemeral (safer)
> >> secrets.
> >>
> >> The next patches in the Ceph topic [1] implement secret management in
> >> the same way we manage storage domains or server connections:
> >>
> >> The concept is - all hosts can use all secrets, so you can migrate a vm
> >> using Ceph disk to any host in the cluster.
> >>
> >> 1. When a host comes up, we register the secrets associated with all
> >>    the current active domains with libvirt
> >>
> >> 2. When activating a domain, we register the secrets associated with the
> >>    new domain with libvirt
> >>
> >> 3. When deactivating a domain, we unregister the secrets associated with
> >>    the domain from libvirt
> >>
> >> 4. When moving a host to maintenance, we clear all secrets
> >>
> >> 5. When vdsm shuts down or starts, clear all secrets to ensure that we
> >>    don't keep stale or unneeded secrets on a host
> >>
> >> This system seems to work, but Federico pointed out a few issues and
> >> suggested a new (simpler?) approach.
> >>
> >> In a future libvirt version, libvirt will support the concept of transient
> >> secrets so you can start a transient vm using a secret without registering
> >> the secret with libvirt before starting the vm. The secret will be
> >> specified in the vm XML (for starting a vm) or disk XML (for hot-plug).
> >> This will make our secret management system and APIs useless.
> >>
> >> Managing state on multiple hosts is hard; we will probably have to deal
> >> with nasty edge cases (e.g. lost messages, network errors), which may
> >> lead to a host with a missing secret, which cannot run some vms. We
> >> probably do this right for storage domains (after 8 years?), and we
> >> should not assume that we are smarter and that secret management will
> >> work on the first try.
> >>
> >> The new approach is to *not* manage state on multiple hosts. Instead,
> >> send the required secrets only to the host that is starting a vm or
> >> hot-plugging a disk that needs a libvirt secret:
> >>
> >> 1. When starting a vm, add the required secrets to the vm description.
> >>    On the host, vdsm will register these secrets with libvirt before
> >>    starting the vm.
> >>
> >> 2. When migrating a vm, add the required secrets to the vm description.
> >>    On the host, vdsm will send these secrets to the destination host,
> >>    and on the destination host, vdsm will register the secrets with
> >>    libvirt before starting the vm.
> >>
> >> 3. When hot-plugging a disk, send the secret if needed in the disk
> >>    description.  On the host, vdsm will register the secrets with libvirt.
> >>
> >> 4. When vdsm shuts down or starts, clear all secrets to ensure that we
> >>    don't keep stale or unneeded secrets on a host
> >>
> >> 5. We never unregister secrets, since they are ephemeral anyway.
> >>
> >> 6. Alternatively, we can implement secrets reference counting so when a vm
> >>    stops or disk is hot-unplugged we decrease the reference count on the
> >>    secrets associated with this vm/disk, and if no other vms need the
> >>    secret, we can unregister the secret from libvirt.
> >>
> >> The new approach is simpler, if we avoid the fancy secret reference
> >> counting. I believe we can get it merged in a couple of weeks with help
> >> from the virt team.
> >>
> >> Please share your thoughts on these alternative solutions.
> >>
> >> Thanks,
> >> Nir
> >>
> >> [1]
> >> https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:ceph
> 
> --
> Adam Litke
> 


