On 01/17/2014 11:43 PM, Itamar Heim wrote:
On 01/10/2014 08:44 PM, José Luis Sanz Boixader wrote:
> I have an oVirt testing setup with 3 hosts running for a few weeks:
> CentOS 6.4, oVirt 3.3.1, VDSM 4.13.0, iSCSI based storage domain.
>
> I have just realized that sanlock has no leases on VM disks, so nothing
> prevents vdsm/libvirt from starting a VM on two different hosts,
> corrupting disk data. I know that something has to go wrong on oVirt
> engine to do it, but I've manually forced some errors ("Setting Host
> state to Non-Operational", "VM xxxx is not responding") for a "Highly
> available" VM and oVirt engine started that VM on another host. oVirt
> engine was not aware, but the VM was running on two hosts.
>
> I think this is a job for libvirt/sanlock/wdmd, but libvirt is not
> receiving "lease" tags for disks when creating domains. I think it
> should.
> What's left in my config? What am I doing wrong?
>
> Thanks
We started introducing sanlock carefully: first on SPM nodes, then in 3.4
on the hosted ovirt-engine node, and we are looking to add it to
VMs/disks as well going forward.
I don't remember whether we have a config option to enable this, but at
this point you can make it work at the VM/disk level via a custom hook,
and we would love feedback on this.
Thanks,
Itamar
Looking into the vdsm code, I've found that there's already code for
sanlock on VM disks, but it is disabled by default
[https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=838802].
I guess it was disabled because you can't hot attach/detach disks to a
VM while it is running. But I prefer to enable it, as data protection is
critical in a SAN environment.
To enable it, add the following to /etc/vdsm/vdsm.conf on every host in
your setup:
...
[irs]
use_volume_leases = true
...
After that, restart vdsmd. Also make sure that /etc/libvirt/qemu.conf
sets lock_manager="sanlock".
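In case it helps with checking several hosts, here is a minimal Python sketch that verifies the two settings mentioned above. The function names are my own, and it assumes vdsm.conf is a plain INI file and that qemu.conf uses simple key = "value" lines; it only inspects text you pass in and changes nothing.

```python
import configparser
import re


def volume_leases_enabled(vdsm_conf_text):
    """Return True if [irs] use_volume_leases is set to a true value."""
    # interpolation=None so that a stray '%' in the file is not interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(vdsm_conf_text)
    # A missing section or option counts as "not enabled".
    return parser.getboolean("irs", "use_volume_leases", fallback=False)


def qemu_lock_manager(qemu_conf_text):
    """Return the configured lock_manager value, or None if it is not set."""
    # Lines starting with '#' are comments and do not match this pattern.
    match = re.search(
        r'^\s*lock_manager\s*=\s*"([^"]*)"', qemu_conf_text, re.MULTILINE
    )
    return match.group(1) if match else None


def host_is_configured(vdsm_conf_text, qemu_conf_text):
    """True only when both settings described in this mail are present."""
    return (
        volume_leases_enabled(vdsm_conf_text)
        and qemu_lock_manager(qemu_conf_text) == "sanlock"
    )
```

On a host you would read /etc/vdsm/vdsm.conf and /etc/libvirt/qemu.conf and pass their contents to host_is_configured().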
VMs that were already running will not be modified, and thus not
protected. Restart them to get sanlock leases on those disks.
To confirm that it is properly running, connect to a host and type:
# sanlock client status
and you'll get output like this, listing the VMs running on that host
and their disk leases:
daemon dc7e06a0-18bb-4f68-9ea6-883dda883ef2.server
p -1 helper
p -1 listener
p 10629 Vicent
p -1 status
s e9a91ad7-2bd3-4c98-a171-88324bc87a09:2:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/ids:0
r e9a91ad7-2bd3-4c98-a171-88324bc87a09:1c7005ad-3d33-4c8f-9c99-2eef7be865f3:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:123731968:6 p 10629
r e9a91ad7-2bd3-4c98-a171-88324bc87a09:978b3630-f491-45a3-9826-e9ab6a744e72:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:127926272:6 p 10629
This shows that the domain "Vicent", with process ID 10629, holds leases
on two disks.
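If you want to check this from a script, the output above can be summarised per process. The following is only a rough Python sketch: the function name and the returned structure are my own, and the line format is inferred from the sample in this mail, not from a sanlock specification. It accepts the "s"/"r" records both on one line and wrapped over several lines, as mail clients tend to do.

```python
SAMPLE = """\
daemon dc7e06a0-18bb-4f68-9ea6-883dda883ef2.server
p -1 helper
p 10629 Vicent
p -1 status
s
e9a91ad7-2bd3-4c98-a171-88324bc87a09:2:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/ids:0
r
e9a91ad7-2bd3-4c98-a171-88324bc87a09:1c7005ad-3d33-4c8f-9c99-2eef7be865f3:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:123731968:6
p 10629
r e9a91ad7-2bd3-4c98-a171-88324bc87a09:978b3630-f491-45a3-9826-e9ab6a744e72:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:127926272:6 p 10629
"""


def parse_sanlock_status(text):
    """Return (processes, lockspaces, leases) from 'sanlock client status'.

    processes:  {pid: name} for registered processes with a real pid
    lockspaces: list of lockspace strings ("s" records)
    leases:     {pid: [resource strings]} ("r" records and their owner pid)
    """
    processes, lockspaces, leases = {}, [], {}
    pending = None        # an "s" or "r" marker seen alone on a wrapped line
    last_resource = None  # an "r" record still waiting for its "p PID" owner
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if pending:
            fields = [pending] + fields
            pending = None
        kind = fields[0]
        if kind in ("s", "r") and len(fields) == 1:
            pending = kind
        elif kind == "s":
            lockspaces.append(fields[1])
            last_resource = None
        elif kind == "r":
            last_resource = fields[1]
            if len(fields) >= 4 and fields[2] == "p":
                leases.setdefault(int(fields[3]), []).append(last_resource)
                last_resource = None
        elif kind == "p":
            pid = int(fields[1])
            if last_resource is not None and len(fields) == 2:
                # A bare "p PID" right after a resource names its owner.
                leases.setdefault(pid, []).append(last_resource)
                last_resource = None
            elif pid > 0:
                processes.setdefault(pid, " ".join(fields[2:]))
    return processes, lockspaces, leases


if __name__ == "__main__":
    procs, spaces, held = parse_sanlock_status(SAMPLE)
    for pid, resources in held.items():
        print(procs.get(pid, "?"), pid, len(resources), "disk lease(s)")
```

On a real host you would feed it the text printed by "sanlock client status" instead of SAMPLE.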
No other host will be able to run that VM, even if the engine makes an
error, because it cannot acquire the leases needed to use those disks.
VM migration also works: the destination host acquires the disk leases.
All this has been tested with oVirt 3.3.1 release.