On 5/21/2014 11:15 AM, Giuseppe Ragusa wrote:
Hi,
> ----- Original Message -----
> > From: "Ted Miller" <tmiller at hcjb.org>
> > To: "users" <users at ovirt.org>
> > Sent: Tuesday, May 20, 2014 11:31:42 PM
> > Subject: [ovirt-users] sanlock + gluster recovery -- RFE
> >
> > As you are aware, there is an ongoing split-brain problem with running
> > sanlock on replicated gluster storage. Personally, I believe that this is
> > the 5th time that I have been bitten by this sanlock+gluster problem.
> >
> > I believe that the following are true (if not, my entire request is
> > probably off base).
> >
> >
> >   * ovirt uses sanlock in such a way that when the sanlock storage is
> >     on a replicated gluster file system, very small storage disruptions
> >     can result in a gluster split-brain on the sanlock space
>
> Although this is possible (at the moment), we are working hard to avoid it.
> The hardest part here is to ensure that the gluster volume is properly
> configured.
>
> The suggested configuration for a volume to be used with ovirt is:
>
> Volume Name: (...)
> Type: Replicate
> Volume ID: (...)
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> (...three bricks...)
> Options Reconfigured:
> network.ping-timeout: 10
> cluster.quorum-type: auto
>
> The two options ping-timeout and quorum-type are really important.
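[TM: if I understand the gluster CLI correctly, these two options can
be applied to an existing volume with:

    gluster volume set VOLNAME network.ping-timeout 10
    gluster volume set VOLNAME cluster.quorum-type auto

where VOLNAME is a placeholder for your actual volume name; "gluster
volume info VOLNAME" should then list them under "Options Reconfigured"
as in the output above.]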
>
> You would also need a build where this bug is fixed in order to avoid any
> chance of a split-brain:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1066996
It seems that the aforementioned bug is peculiar to 3-brick setups.

I understand that a 3-brick setup can allow proper quorum formation
without resorting to the "first-configured-brick-has-more-weight"
convention used with only 2 bricks and quorum "auto" (which makes one
node "special", so not properly any-single-fault tolerant).

But, since we are on ovirt-users: is there a similar suggested
configuration for a 2-host oVirt+GlusterFS setup with oVirt-side power
management properly configured and tested-working?
I mean a configuration where "any" host can go south and oVirt (through
the other one) fences it (forcibly powering it off with confirmation
from IPMI or similar), then restarts the HA-marked VMs that were running
there, all the while keeping the underlying GlusterFS-based storage
domains responsive and readable/writeable (maybe apart from a lapse
between detected other-node unresponsiveness and confirmed fencing)?

Furthermore: is such a suggested configuration possible in a
self-hosted-engine scenario?

Regards,
Giuseppe
> > How did I get into this mess?
> >
> > ...
> >
> > What I would like to see in ovirt to help me (and others like me).
> > Alternates listed in order from most desirable (automatic) to least
> > desirable (set of commands to type, with lots of variables to figure out).
>
> The real solution is to avoid the split-brain altogether. At the moment
> it seems that, with the suggested configuration and the bug fix, we
> shouldn't hit a split-brain.
>
> > 1. automagic recovery
> >
> > 2. recovery subcommand
> >
> > 3. script
> >
> > 4. commands
>
> I think that the commands to resolve a split-brain should be documented.
> I just started a page here:
>
> http://www.ovirt.org/Gluster_Storage_Domain_Reference

I suggest you add these lines to the Gluster configuration, as I have
seen this come up multiple times on the User list:

storage.owner-uid: 36
storage.owner-gid: 36
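
If I understand the gluster CLI correctly, the same goes for these two:
on an existing volume, something like

    gluster volume set VOLNAME storage.owner-uid 36
    gluster volume set VOLNAME storage.owner-gid 36

should do it (36 being the vdsm uid and kvm gid that oVirt expects to
own its storage domains; VOLNAME is again a placeholder).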
Ted Miller
Elkhart, IN, USA
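
P.S. On documenting the commands to resolve a split-brain: as far as I
understand it, the affected files can at least be listed with

    gluster volume heal VOLNAME info split-brain

and the manual fix that gets passed around on the Gluster lists is to
delete, directly on the brick whose copy you have decided to discard,
both the file itself and its gfid hardlink under that brick's
.glusterfs/ directory, and then run "gluster volume heal VOLNAME".
That is exactly the kind of procedure I would love to see scripted
(options 1-3 above), since getting it wrong by hand can make things
worse.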