[Users] sanlock leases on VM disks
Itamar Heim
iheim at redhat.com
Mon Jan 27 11:59:32 UTC 2014
On 01/27/2014 01:06 PM, José Luis Sanz Boixader wrote:
> On 01/17/2014 11:43 PM, Itamar Heim wrote:
>> On 01/10/2014 08:44 PM, José Luis Sanz Boixader wrote:
>>> I have an oVirt testing setup with 3 hosts running for a few weeks:
>>> CentOS 6.4, oVirt 3.3.1, VDSM 4.13.0, iSCSI based storage domain.
>>>
>>> I have just realized that sanlock has no leases on VM disks, so nothing
>>> prevents vdsm/libvirt from starting a VM on two different hosts,
>>> corrupting disk data. I know that something has to go wrong in oVirt
>>> engine for this to happen, but I've manually forced some errors ("Setting
>>> Host state to Non-Operational", "VM xxxx is not responding") for a
>>> "Highly available" VM, and oVirt engine started that VM on another host.
>>> The engine was not aware of it, but the VM was running on two hosts at once.
>>>
>>> I think this is a job for libvirt/sanlock/wdmd, but libvirt is not
>>> receiving "lease" tags for the disks when domains are created. I think
>>> it should.
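>>>
>>> (For reference, a per-disk lease in libvirt domain XML looks roughly
>>> like this; the values below are placeholders for illustration, not
>>> taken from my setup:)
>>>
>>> <lease>
>>>   <!-- placeholder UUIDs, path and offset -->
>>>   <lockspace>STORAGE-DOMAIN-UUID</lockspace>
>>>   <key>VOLUME-UUID</key>
>>>   <target path='/dev/STORAGE-DOMAIN-UUID/leases' offset='12345678'/>
>>> </lease>
>>>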
>>> What's missing from my config? What am I doing wrong?
>>>
>>> Thanks
>>
>> We started introducing sanlock carefully: first on SPM nodes, then in
>> 3.4 on the hosted ovirt-engine node, and we are looking to add it to
>> VMs/disks as well going forward.
>>
>> I don't remember if we have a config option to enable this, but at
>> this point you can make it work at the VM/disk level via a custom
>> hook, and we would love feedback on this.
>>
>> Thanks,
>> Itamar
>>
>
> Looking into the vdsm code, I've found that there's already code for
> sanlock leases on VM disks, but it is disabled by default
> [https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=838802].
> I guess it was disabled because you can't hot attach/detach disks to a
> VM while it is running. But I prefer to enable it, as data protection
> is critical in a SAN environment.
It isn't enabled yet, pending more testing and some corner cases. Help
in this area is appreciated.
Ayal/Federico can share the current gaps/status (assuming I remember
correctly).
>
> To enable it, you need to include this in /etc/vdsm/vdsm.conf on every
> host in your setup:
> ...
> [irs]
> use_volume_leases = true
> ...
> and after that you'll need to restart vdsmd. Also make sure that
> /etc/libvirt/qemu.conf says lock_manager = "sanlock".
> VMs that were already running will not be modified, and thus not
> protected; restart them to get sanlock leases on their disks.
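>
> As a rough sketch of the whole procedure on one host (the heredoc
> assumes vdsm.conf has no [irs] section yet; otherwise add the option
> under the existing one):
>
> # cat >> /etc/vdsm/vdsm.conf << 'EOF'
> [irs]
> use_volume_leases = true
> EOF
> # grep lock_manager /etc/libvirt/qemu.conf
> lock_manager = "sanlock"
> # service vdsmd restart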
>
> To confirm that it is working properly, connect to a host and type:
> # sanlock client status
> and you'll get output like this, listing the VMs running on that host
> and their disk leases:
>
> daemon dc7e06a0-18bb-4f68-9ea6-883dda883ef2.server
> p -1 helper
> p -1 listener
> p 10629 Vicent
> p -1 status
> s
> e9a91ad7-2bd3-4c98-a171-88324bc87a09:2:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/ids:0
> r
> e9a91ad7-2bd3-4c98-a171-88324bc87a09:1c7005ad-3d33-4c8f-9c99-2eef7be865f3:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:123731968:6
> p 10629
> r
> e9a91ad7-2bd3-4c98-a171-88324bc87a09:978b3630-f491-45a3-9826-e9ab6a744e72:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:127926272:6
> p 10629
>
> This shows that domain "Vicent", with process id 10629, holds leases
> on two disks. (In the output, the "s" line is the host's lease on the
> lockspace, each "r" line is a resource lease on a disk volume, and the
> "p" line below it is the pid holding that lease.)
> No other host will be able to run that VM, even in the event of an
> engine error, because it would not be able to acquire the leases on
> those disks.
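>
> To check a single VM, you can filter the status output by its qemu pid
> (10629 in the example above; the exact pattern is just an illustration):
>
> # sanlock client status | grep -B1 '^p 10629$'
>
> which prints each "r" resource line together with the "p" line of the
> process holding that lease.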
>
> Migration of VMs also works; the destination host acquires the disk
> leases.
> All this has been tested with the oVirt 3.3.1 release.
>