Hi all,
Recently support for Ceph network disks landed in master. It is now
possible to start a vm using a Ceph network disk, or to hot-plug/unplug
such a disk, using Cephx authentication.
However, to make it work, you must add the relevant Ceph secret to
libvirt manually, in the same way it is done in an OpenStack deployment.
Our goal is to manage secrets automatically and use ephemeral (safer)
secrets.
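For reference, registering such a secret by hand through the libvirt
Python bindings looks roughly like the sketch below. It assumes `conn`
is an open libvirt connection; the helper names, the uuid and the usage
name are made up for illustration, and I assume the Ceph key arrives
base64 encoded and is passed to libvirt as raw bytes. Only
secretDefineXML() and setValue() are actual libvirt binding calls.

```python
import base64
import xml.etree.ElementTree as ET


def ceph_secret_xml(uuid, usage_name, ephemeral=True):
    """Build libvirt secret XML for a Ceph key (hypothetical helper)."""
    secret = ET.Element("secret",
                        ephemeral="yes" if ephemeral else "no",
                        private="yes")
    ET.SubElement(secret, "uuid").text = uuid
    usage = ET.SubElement(secret, "usage", type="ceph")
    ET.SubElement(usage, "name").text = usage_name
    return ET.tostring(secret, encoding="unicode")


def register_ceph_secret(conn, uuid, usage_name, key_base64):
    """Define the secret and set its value (hypothetical helper).

    conn is assumed to be a libvirt connection, for example the result
    of libvirt.open("qemu:///system").
    """
    virsecret = conn.secretDefineXML(ceph_secret_xml(uuid, usage_name))
    # Assumption: the key is base64 text, libvirt gets the decoded bytes.
    virsecret.setValue(base64.b64decode(key_base64))
    return virsecret
```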
The next patches in the Ceph topic [1] implement secret management in
the same way we manage storage domains or server connections.
The concept is that all hosts can use all secrets, so you can migrate a
vm using a Ceph disk to any host in the cluster:
1. When a host becomes up, we register the secrets associated with all
the currently active domains with libvirt
2. When activating a domain, we register the secrets associated with the
new domain with libvirt
3. When deactivating a domain, we unregister the secrets associated with
the domain from libvirt
4. When moving a host to maintenance, we clear all secrets
5. When vdsm shuts down or starts, we clear all secrets to ensure that we
don't keep stale or unneeded secrets on a host
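To make the flow above concrete, here is a minimal in-memory sketch of
the state a single host would have to track. This is not the code in
the patches; the class and method names are hypothetical, secrets are
represented only by their uuids, and keeping a secret that another
active domain still uses is my assumption.

```python
class HostSecrets:
    """Hypothetical per-host model of the domain-based flow."""

    def __init__(self):
        self.registered = set()  # secret uuids registered with libvirt
        self.active = {}         # domain id -> set of secret uuids

    def clear(self):
        # Steps 4 and 5: maintenance, vdsm start and vdsm shutdown.
        self.registered.clear()
        self.active.clear()

    def host_up(self, domains):
        # Step 1: register the secrets of all currently active domains.
        self.clear()
        for domain_id, uuids in domains.items():
            self.activate_domain(domain_id, uuids)

    def activate_domain(self, domain_id, uuids):
        # Step 2: register the secrets of the new domain.
        self.active[domain_id] = set(uuids)
        self.registered.update(uuids)

    def deactivate_domain(self, domain_id):
        # Step 3: unregister the domain's secrets, keeping those that
        # another active domain still needs (assumption).
        uuids = self.active.pop(domain_id, set())
        still_needed = set().union(*self.active.values())
        self.registered -= uuids - still_needed
```

Every host has to keep this state in sync with engine, which is where
lost messages and network errors can leave a host with a missing secret.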
This system seems to work, but Federico pointed out a few issues and
suggested a new (simpler?) approach.
A future libvirt version will support the concept of transient secrets,
so you can start a transient vm using a secret without registering the
secret with libvirt before starting the vm. The secret will be
specified in the vm XML (for starting a vm) or disk XML (for hot-plug).
This will make our secret management system and APIs useless.
Managing state on multiple hosts is hard; we will probably have to deal
with nasty edge cases (e.g. lost messages, network errors), which may
lead to a host with a missing secret, which cannot run some vms. We
probably get this right for storage domains now (after 8 years?), and we
should not assume that we are smarter and that secret management will
work on the first try.
The new approach is to *not* manage state on multiple hosts. Instead,
send the required secrets only to the host that is starting a vm or
hot-plugging a disk that needs a libvirt secret:
1. When starting a vm, add the required secrets to the vm description.
On the host, vdsm will register these secrets with libvirt before
starting the vm.
2. When migrating a vm, add the required secrets to the vm description.
On the host, vdsm will send these secrets to the destination host,
and on the destination host, vdsm will register the secrets with libvirt
before starting the vm.
3. When hot-plugging a disk, send the secret if needed in the disk
description. On the host, vdsm will register the secrets with libvirt.
4. When vdsm shuts down or starts, clear all secrets to ensure that we
don't keep stale or unneeded secrets on a host
5. We never unregister secrets, since they are ephemeral anyway.
6. Alternatively, we can implement secret reference counting, so when a
vm stops or a disk is hot-unplugged we decrease the reference count on
the secrets associated with this vm/disk, and if no other vms need the
secret, we can unregister the secret from libvirt.
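The reference counting in item 6 could be sketched like this. It is only
an illustration, not a proposed implementation: the class name is
hypothetical, and `register` / `unregister` are assumed callables that
talk to libvirt for a single secret uuid.

```python
from collections import Counter


class SecretRefs:
    """Hypothetical per-host reference counter for libvirt secrets."""

    def __init__(self, register, unregister):
        self._register = register      # called when a count goes 0 -> 1
        self._unregister = unregister  # called when a count goes 1 -> 0
        self._refs = Counter()

    def acquire(self, uuids):
        # Called when a vm starts or a disk is hot-plugged.
        for uuid in uuids:
            if self._refs[uuid] == 0:
                self._register(uuid)
            self._refs[uuid] += 1

    def release(self, uuids):
        # Called when a vm stops or a disk is hot-unplugged.
        for uuid in uuids:
            if self._refs[uuid] == 0:
                continue  # unknown secret, nothing to release
            self._refs[uuid] -= 1
            if self._refs[uuid] == 0:
                del self._refs[uuid]
                self._unregister(uuid)
```

Note that the counts live only in vdsm memory, so this relies on item 4
(clearing all secrets when vdsm starts) to stay consistent after a
restart.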
The new approach is simpler, if we avoid the fancy secret reference
counting. I believe we can get it merged in a couple of weeks with help
from the virt team.
Please share your thoughts on these alternative solutions.
Thanks,
Nir
[1]
https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic...