Hi,

> ----- Original Message -----
> > From: "Ted Miller" <tmiller at hcjb.org>
> > To: "users" <users at ovirt.org>
> > Sent: Tuesday, May 20, 2014 11:31:42 PM
> > Subject: [ovirt-users] sanlock + gluster recovery -- RFE
> >
> > As you are aware, there is an ongoing split-brain problem with running
> > sanlock on replicated gluster storage. Personally, I believe that this is
> > the 5th time that I have been bitten by this sanlock+gluster problem.
> >
> > I believe that the following are true (if not, my entire request is probably
> > off base).
> >
> >
> >     * ovirt uses sanlock in such a way that when the sanlock storage is on a
> >       replicated gluster file system, very small storage disruptions can
> >       result in a gluster split-brain on the sanlock space
>
> Although this is possible (at the moment) we are working hard to avoid it.
> The hardest part here is to ensure that the gluster volume is properly
> configured.
>
> The suggested configuration for a volume to be used with ovirt is:
>
> Volume Name: (...)
> Type: Replicate
> Volume ID: (...)
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> (...three bricks...)
> Options Reconfigured:
> network.ping-timeout: 10
> cluster.quorum-type: auto
>
> The two options ping-timeout and quorum-type are really important.
>
> You would also need a build where this bug is fixed in order to avoid any
> chance of a split-brain:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1066996
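
For reference, those two options can be applied to an existing volume from the
gluster CLI. A minimal sketch, assuming a replica-3 volume named "myvol" (the
volume name here is only an example):

    # how long clients wait for an unresponsive brick before giving up on it
    # (in seconds; the gluster default is 42)
    gluster volume set myvol network.ping-timeout 10

    # only allow writes while a majority of the replica bricks is reachable,
    # so an isolated brick cannot diverge and cause a split-brain
    gluster volume set myvol cluster.quorum-type auto

    # verify that both options show up under "Options Reconfigured"
    gluster volume info myvol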

It seems that the aforementioned bug is peculiar to 3-brick setups.

I understand that a 3-brick setup allows proper quorum formation without
resorting to the "first-configured-brick-has-more-weight" convention used with
only 2 bricks and quorum "auto" (which makes one node "special", so the setup
is not properly any-single-fault tolerant).

But, since we are on ovirt-users, is there a similar suggested configuration
for a two-host oVirt+GlusterFS setup with oVirt-side power management properly
configured and tested working?
I mean a configuration where "any" host can go south and oVirt (through the
other one) fences it (forcibly powering it off, with confirmation from IPMI or
similar), then restarts the HA-marked VMs that were running there, all the
while keeping the underlying GlusterFS-based storage domains responsive and
readable/writeable (apart, maybe, from a lapse between the detected
unresponsiveness of the other node and its confirmed fencing)?

Furthermore: is such a suggested configuration possible in a self-hosted-engine
scenario?

Regards,
Giuseppe

> > How did I get into this mess?
> >
> > ...
> >
> > What I would like to see in ovirt to help me (and others like me). Alternates
> > listed in order from most desirable (automatic) to least desirable (set of
> > commands to type, with lots of variables to figure out).
>
> The real solution is to avoid the split-brain altogether. At the moment it
> seems that using the suggested configurations and the bug fix we shouldn't
> hit a split-brain.
>
> > 1. automagic recovery
> >
> > 2. recovery subcommand
> >
> > 3. script
> >
> > 4. commands
>
> I think that the commands to resolve a split-brain should be documented.
> I just started a page here:
>
> http://www.ovirt.org/Gluster_Storage_Domain_Reference
>
> Could you add your documentation there? Thanks!
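
As a rough starting point for that page, the commands involved would look
something like the sketch below. This is only an outline, not a validated
procedure: "engine" stands for the affected volume, the brick path is
hypothetical, and choosing which replica to discard needs great care.

    # list the files that the self-heal daemon currently flags as split-brained
    gluster volume heal engine info split-brain

    # on each brick, inspect the AFR changelog xattrs (trusted.afr.*) of the
    # affected file to decide which copy to keep
    getfattr -d -m . -e hex /bricks/engine/dom_md/ids

    # manual step (not shown): on the brick holding the copy to be discarded,
    # remove that file and its hard link under the brick's .glusterfs directory

    # finally, trigger a full self-heal so the surviving copy is replicated back
    gluster volume heal engine full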
>
> --
> Federico