On 03/31/2015 07:03 AM, Crístian Viana wrote:
Hi,
I'd like to propose a change to how Kimchi uses the resource field
"ref_cnt".
Currently, "ref_cnt" - which stands for "reference count" - is one of
the fields returned when looking up a storage volume. Its purpose is
to indicate how many times that resource is being used at the moment.
For example, if the resource /storagepools/pool/storagevolumes/vol has
ref_cnt=1, it means that the disk is attached to 1 VM right now and
thus it cannot be attached to another VM. I believe the original idea
of this feature is to prevent the same resource from being attached
more than once at the same time. However, IMO, that might not always
be the desired behavior and there's no way to enforce it completely as
those resources can be used outside of Kimchi, where "ref_cnt" doesn't
exist. For example, if I have one VM which uses the disk "vol", I'm
not able to attach it to another VM via Kimchi; but if I use another
libvirt-based VM manager (e.g. virsh), I am able to attach that disk
to a different VM. This becomes even trickier when we consider other
operations, like snapshots, which can attach/detach disks while
they're being reverted to. Also, suppose I want to inspect one VM's
disk from another VM: I'd need to attach that disk twice, but Kimchi
wouldn't allow it, stating that the disk is already in use.
I propose Kimchi should stop using "ref_cnt" as a blocking method. The
field may still exist for information/warning messages (e.g. "are you
sure you want to attach this disk? it's already being used by another
VM.") but no operation should be blocked because of it, as is the
case now. As inconsistencies with that value may happen and we have no
way to make sure it will always work, we shouldn't annoy the user by
stopping them from doing something that may be perfectly valid.
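A minimal sketch of that non-blocking behaviour, assuming the volume
lookup result is available as a dict with a "ref_cnt" key. The helper
name attach_warning and the message text are hypothetical and are not
taken from the Kimchi code base:

```python
def attach_warning(volume_info):
    """Return a warning string if the volume looks in use, else None.

    Hypothetical helper: it never raises, so the caller can show the
    message and still let the user go ahead with the attach.
    """
    ref_cnt = volume_info.get("ref_cnt", 0)
    if ref_cnt > 0:
        return ("This disk is already being used by %d VM(s). "
                "Are you sure you want to attach it?" % ref_cnt)
    return None
```

The caller would display the returned message, if any, and proceed
only after the user confirms.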
Let me explain the context of this design:
Kimchi always creates a disk internally along with a VM, so we do not
want a disk that we know belongs to a given VM to be attached to other
VMs at the same time, because that can easily cause corruption.
OpenStack Nova has the same kind of logic to prevent Cinder volumes
from being attached to multiple VMs.
I agree that there are some use cases where we need a shared disk:
clustered applications that can deal with concurrent disk access
(concurrent filesystems, databases with a shared tablespace). I think
for these cases we need to label the disks as "shared", so that users
are aware that concurrent access to these disks has to be controlled.
We can refer to oVirt's shared raw disk if we want:
http://www.ovirt.org/Features/SharedRawDisk.
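As a rough sketch of how such a label could combine with ref_cnt,
assuming a hypothetical boolean "shared" field on the volume (Kimchi
has no such field today, and the function name is only illustrative):

```python
def can_attach(volume_info):
    """Allow attaching when the disk is unused or explicitly shared.

    "shared" is a hypothetical flag the user would set on disks whose
    consumers handle concurrent access themselves.
    """
    if volume_info.get("shared", False):
        return True
    return volume_info.get("ref_cnt", 0) == 0
```

With this, a disk with ref_cnt=1 is still refused unless it has been
labelled as shared.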
What I do not agree with is laying the burden of preventing corruption
on the user by chopping away ref_cnt just because we think handling it
bothers us. For the use cases you mentioned:
1. virsh/virt-manager allows attaching twice: right, but it will not
handle the disk corruption caused by concurrent access. Two VMs writing
the disk metadata will surely cause corruption.
2. snapshots: a disk's ref_cnt is either 0 or 1 for now, so we can
scan the XML to decide its ref_cnt after the snapshot revert is done.
3. inspect one VM's disk from another VM: if both VMs are running, the
disk may be corrupted; if one is down and the other is running (e.g.
for emergency recovery), I would suggest detaching it from the panicked
VM first, as we would do with a physical machine.
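For point 2, the XML scan could look roughly like the sketch below. It
assumes the domain XML of each VM is already available as a string (for
example as returned by libvirt's XMLDesc()) and that disks reference
the volume through a <source file=...> or <source dev=...> element. The
function name is hypothetical:

```python
import xml.etree.ElementTree as ET


def count_disk_refs(domain_xmls, disk_path):
    """Count how many domains reference disk_path in their disk list."""
    count = 0
    for xml_text in domain_xmls:
        root = ET.fromstring(xml_text)
        # The root element is <domain>; disks live under <devices>.
        for source in root.findall("./devices/disk/source"):
            if disk_path in (source.get("file"), source.get("dev")):
                count += 1
                break  # count each domain at most once
    return count
```

Running this after a snapshot revert would give the value to store in
ref_cnt for that disk.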
Any feedback will be very welcome.
Best regards,
Crístian.