[ovirt-users] sanlock + gluster recovery -- RFE
Ted Miller
tmiller at hcjb.org
Wed May 21 19:50:34 UTC 2014
On 5/21/2014 11:15 AM, Giuseppe Ragusa wrote:
> Hi,
>
> > ----- Original Message -----
> > > From: "Ted Miller" <tmiller at hcjb.org>
> > > To: "users" <users at ovirt.org>
> > > Sent: Tuesday, May 20, 2014 11:31:42 PM
> > > Subject: [ovirt-users] sanlock + gluster recovery -- RFE
> > >
> > > As you are aware, there is an ongoing split-brain problem with running
> > > sanlock on replicated gluster storage. Personally, I believe that this is
> > > the 5th time that I have been bitten by this sanlock+gluster problem.
> > >
> > > I believe that the following are true (if not, my entire request is
> > > probably off base).
> > >
> > >
> > > * ovirt uses sanlock in such a way that when the sanlock storage is on a
> > > replicated gluster file system, very small storage disruptions can
> > > result in a gluster split-brain on the sanlock space
> >
> > Although this is possible (at the moment) we are working hard to avoid it.
> > The hardest part here is to ensure that the gluster volume is properly
> > configured.
> >
> > The suggested configuration for a volume to be used with ovirt is:
> >
> > Volume Name: (...)
> > Type: Replicate
> > Volume ID: (...)
> > Status: Started
> > Number of Bricks: 1 x 3 = 3
> > Transport-type: tcp
> > Bricks:
> > (...three bricks...)
> > Options Reconfigured:
> > network.ping-timeout: 10
> > cluster.quorum-type: auto
> >
> > The two options ping-timeout and quorum-type are really important.
> >
> > You would also need a build where this bug is fixed in order to avoid any
> > chance of a split-brain:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=1066996
>
> It seems that the aforementioned bug is peculiar to 3-brick setups.
>
> I understand that a 3-brick setup can allow proper quorum formation
> without resorting to the "first-configured-brick-has-more-weight" convention
> used with only 2 bricks and quorum "auto" (which makes one node "special",
> so not properly any-single-fault tolerant).
>
> But, since we are on ovirt-users, is there a similar suggested
> configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power
> management properly configured and tested as working?
> I mean a configuration where "any" host can go south and oVirt (through the
> other one) fences it (forcibly powering it off with confirmation from IPMI
> or similar), then restarts the HA-marked VMs that were running there, all the
> while keeping the underlying GlusterFS-based storage domains responsive and
> readable/writable (maybe apart from a lapse between detecting the other
> node's unresponsiveness and confirmed fencing)?
>
> Furthermore: is such a suggested configuration possible in a
> self-hosted-engine scenario?
>
> Regards,
> Giuseppe
>
> > > How did I get into this mess?
> > >
> > > ...
> > >
> > > What I would like to see in ovirt to help me (and others like me).
> > > Alternates listed in order from most desirable (automatic) to least
> > > desirable (set of commands to type, with lots of variables to figure out).
> >
> > The real solution is to avoid the split-brain altogether. At the moment it
> > seems that using the suggested configurations and the bug fix we shouldn't
> > hit a split-brain.
> >
> > > 1. automagic recovery
> > >
> > > 2. recovery subcommand
> > >
> > > 3. script
> > >
> > > 4. commands
> >
> > I think that the commands to resolve a split-brain should be documented.
> > I just started a page here:
> >
> > http://www.ovirt.org/Gluster_Storage_Domain_Reference

I suggest you add these lines to the Gluster configuration on that page, as
I have seen the missing ownership settings come up multiple times on the
users list:

storage.owner-uid: 36
storage.owner-gid: 36
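
For anyone who wants to apply the options discussed in this thread in one go, here is a minimal sketch that only builds the `gluster volume set` command lines for them; it does not run anything. The helper name `volume_set_commands` and the volume name `myvol` are made up for illustration, and the option values are simply the ones quoted above (36 being the vdsm uid and kvm gid on oVirt hosts). Check them against your own setup before running the printed commands.

```python
# Sketch only: builds the command lines, does not execute them.
# Option names and values are the ones quoted in this thread.
OVIRT_VOLUME_OPTIONS = {
    "network.ping-timeout": "10",
    "cluster.quorum-type": "auto",
    "storage.owner-uid": "36",  # vdsm user on oVirt hosts
    "storage.owner-gid": "36",  # kvm group on oVirt hosts
}


def volume_set_commands(volume, options=None):
    """Return one 'gluster volume set' argument list per option.

    'volume' is the name of an existing gluster volume (hypothetical
    in the example below).
    """
    if options is None:
        options = OVIRT_VOLUME_OPTIONS
    return [
        ["gluster", "volume", "set", volume, key, value]
        for key, value in options.items()
    ]


if __name__ == "__main__":
    # "myvol" is a placeholder volume name.
    for cmd in volume_set_commands("myvol"):
        print(" ".join(cmd))
```

The printed lines can be reviewed and then pasted into a root shell on one of the gluster nodes; `gluster volume info` should afterwards list them under "Options Reconfigured".
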
Ted Miller
Elkhart, IN, USA