[ovirt-users] sanlock + gluster recovery -- RFE

Federico Simoncelli fsimonce at redhat.com
Wed May 21 12:41:39 UTC 2014


----- Original Message -----
> From: "Ted Miller" <tmiller at hcjb.org>
> To: "users" <users at ovirt.org>
> Sent: Tuesday, May 20, 2014 11:31:42 PM
> Subject: [ovirt-users] sanlock + gluster recovery -- RFE
> 
> As you are aware, there is an ongoing split-brain problem with running
> sanlock on replicated gluster storage. Personally, I believe that this is
> the 5th time that I have been bitten by this sanlock+gluster problem.
> 
> I believe that the following are true (if not, my entire request is probably
> off base).
> 
> 
>     * ovirt uses sanlock in such a way that when the sanlock storage is on a
>     replicated gluster file system, very small storage disruptions can
>     result in a gluster split-brain on the sanlock space

Although this is currently possible, we are working hard to avoid it.
The hardest part is ensuring that the gluster volume is properly
configured.

The suggested configuration for a volume to be used with oVirt is:

Volume Name: (...)
Type: Replicate
Volume ID: (...)
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
(...three bricks...)
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-type: auto

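The two options network.ping-timeout and cluster.quorum-type are really
important: a short ping timeout limits how long clients wait on an
unresponsive brick, and quorum-type auto allows writes only while a
majority of the replica bricks is reachable.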
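
As a rough illustration of checking a volume against the configuration
above, here is a minimal Python sketch. It assumes you paste in the text
printed by "gluster volume info"; the function names and the volume name
"vmstore" are hypothetical, and the script only prints the
"gluster volume set" commands that appear to be missing, it does not run
them.

```python
# Recommended option values, taken from the configuration quoted above.
RECOMMENDED = {
    "network.ping-timeout": "10",
    "cluster.quorum-type": "auto",
}


def parse_volume_info(text):
    """Parse 'key: value' lines of `gluster volume info` output into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so brick paths like host:/path survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def missing_settings(text, recommended=RECOMMENDED):
    """Return the `gluster volume set` commands needed to match RECOMMENDED."""
    info = parse_volume_info(text)
    # Fall back to a placeholder if the volume name is not in the text.
    volume = info.get("Volume Name", "VOLNAME")
    return [
        f"gluster volume set {volume} {option} {value}"
        for option, value in recommended.items()
        if info.get(option) != value
    ]


if __name__ == "__main__":
    # Hypothetical sample output; "vmstore" is an invented volume name.
    sample = """\
Volume Name: vmstore
Type: Replicate
Status: Started
Number of Bricks: 1 x 3 = 3
Options Reconfigured:
network.ping-timeout: 42
"""
    for command in missing_settings(sample):
        print(command)
```

Options that do not appear under "Options Reconfigured" are treated as
not set, so the sketch suggests setting them explicitly.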

You would also need a build where this bug is fixed in order to avoid any
chance of a split-brain:

https://bugzilla.redhat.com/show_bug.cgi?id=1066996

> How did I get into this mess?
> 
> ...
> 
> What I would like to see in ovirt to help me (and others like me). Alternates
> listed in order from most desirable (automatic) to least desirable (set of
> commands to type, with lots of variables to figure out).

The real solution is to avoid the split-brain altogether. At the moment it
seems that, with the suggested configuration and the bug fix, we shouldn't
hit a split-brain.

> 1. automagic recovery
> 
> 2. recovery subcommand
> 
> 3. script
> 
> 4. commands

I think that the commands to resolve a split-brain should be documented.
I just started a page here:

http://www.ovirt.org/Gluster_Storage_Domain_Reference

Could you add your documentation there? Thanks!

-- 
Federico