Re: [ovirt-users] Gluster storage question

11 Feb 2017


Thank you for your reply, Doug.

I didn't use localhost because I was preparing to follow the instructions
in a blog post
(http://community.redhat.com/blog/2014/11/up-and-running-with-ovirt-3-5-part-...)
for setting up CTDB, and had already created hostnames for the floating
IP when I decided to ditch that and go with the hosts file hack. I
already had the volumes mounted on those hostnames, but you are
absolutely right: simply using localhost would be the best option.

Thank you for your suggested outline of how to power the cluster up and
down. I hadn't considered that turning on two out-of-date nodes would
clobber the data on the up-to-date node. This is something I will need
to be very careful to avoid. The setup is mostly for lab work, so it is
not really mission critical, but I do run a few VMs (FreeIPA, GitLab and
pfSense) that I'd like to keep up 24/7. I make regular backups (outside
of oVirt) of those just in case.

Thanks, I will do some reading on how Gluster handles quorum and heal
operations, but your procedure sounds like a sensible way to operate this
cluster.

Regards, 

Chris. 

On 2017-02-11 18:08, Doug Ingham wrote:
...
On 11 February 2017 at 13:32, Bartosiak-Jentys, Chris <chris.bartosiak-jentys@certico.co.uk> wrote:
...
Hello list,
Just wanted to get your opinion on my oVirt home lab setup. While this is not a production setup, I would like it to run relatively reliably, so please tell me if the following storage configuration is likely to result in corruption or is just bat s**t insane.
I have a 3-node hosted engine setup; the VM data store and the engine data store are both replica 3 Gluster volumes (one brick on each host).
I do not want to run all 3 hosts 24/7 due to electricity costs, so I only power up the larger hosts (2 Dell R710s) when I need additional resources for VMs.
I read about using CTDB and floating/virtual IPs to allow the storage mount point to transition between available hosts, but after some thought decided to go about this another, simpler, way:
I created a common hostname for each of the storage mount points: gfs-data and gfs-engine.
On each host I edited the /etc/hosts file to have these hostnames resolve to that host's own IP, i.e. on host1 gfs-data & gfs-engine --> host1 IP
on host2 gfs-data & gfs-engine --> host2 IP
etc.
In the oVirt engine each storage domain is mounted as gfs-data:/data and gfs-engine:/engine.
My thinking is that this way, no matter which host is up and acting as SPM, it will be able to mount the storage, as the mount is only dependent on that host being up.
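For illustration only, the per-host /etc/hosts mapping described above can be sketched in Python. The host names match the message, but the IP addresses are made-up documentation addresses and the helper name hosts_entries is hypothetical; this is a sketch of the idea, not the configuration actually used.

```python
def hosts_entries(host_ips, aliases=("gfs-data", "gfs-engine")):
    """Return, per host, the /etc/hosts line that points the shared
    storage aliases at that host's own IP address."""
    return {
        host: f"{ip} {' '.join(aliases)}"
        for host, ip in host_ips.items()
    }


# Hypothetical addresses (RFC 5737 documentation range), not from the thread.
example = hosts_entries({
    "host1": "192.0.2.11",
    "host2": "192.0.2.12",
    "host3": "192.0.2.13",
})

for host, line in example.items():
    # Each host gets a different line for the same two aliases.
    print(f"{host}: {line}")
```

Every host resolves the same two names, but each resolves them to itself, which is why the mount only depends on the local host being up.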
I changed the Gluster server-quorum-ratio option so that the volumes remain up even if quorum is not met. I know this is risky, but it's just a lab setup after all.
So, any thoughts on the /etc/hosts method to ensure the storage mount point is always available? Is data corruption more or less inevitable with this setup? Am I insane ;) ?
Why not just use localhost? And no need for CTDB with a floating IP, oVirt uses libgfapi for Gluster which deals with that all natively.
As for the quorum issue, I would most definitely *not* run with quorum disabled when you're running more than one node. As you say you specifically plan for when the other 2 nodes of the replica 3 set will be active or not, I'd do something along the lines of the following...
Going from 3 nodes to 1 node: 
- Put nodes 2 & 3 in maintenance to offload their virtual load; 
- Once the 2 nodes are free of load, disable quorum on the Gluster volumes; 
- Power down the 2 nodes.
Going from 1 node to 3 nodes: 
- Power on *only* 1 of the pair of nodes (if you power on both & self-heal is enabled, Gluster will "heal" the files on the main node with the older files on the 2 nodes which were powered down); 
- Allow Gluster some time to detect that the files are in split-brain; 
- Tell Gluster to heal the files in split-brain based on modification time; 
- Once the 2 nodes are in sync, re-enable quorum & power on the last node, which will be resynchronised automatically; 
- Take the 2 hosts out of maintenance mode.
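As a rough sketch, the Gluster side of the two procedures above could be written down as command lists. Several things here are assumptions rather than anything stated in the thread: the volume names "data" and "engine" are inferred from the mount paths, "disable quorum" is read as setting both cluster.server-quorum-type and cluster.quorum-type to none, and the split-brain file path is a placeholder. The code only builds the command strings; it does not run them.

```python
def disable_quorum_commands(volumes):
    """Commands to run once nodes 2 & 3 are free of load, before powering them down."""
    cmds = []
    for vol in volumes:
        # Assumption: "disable quorum" means turning off both server and client quorum.
        cmds.append(f"gluster volume set {vol} cluster.server-quorum-type none")
        cmds.append(f"gluster volume set {vol} cluster.quorum-type none")
    return cmds


def resync_commands(split_brain_files):
    """Commands to run after powering on *only one* of the two stale nodes.

    split_brain_files maps a volume name to the files reported by
    'gluster volume heal <vol> info split-brain'.
    """
    cmds = []
    for vol, files in split_brain_files.items():
        cmds.append(f"gluster volume heal {vol} info split-brain")
        for path in files:
            # Keep the copy with the newest modification time.
            cmds.append(f"gluster volume heal {vol} split-brain latest-mtime {path}")
        # Re-enable quorum only once the two nodes are in sync.
        cmds.append(f"gluster volume set {vol} cluster.server-quorum-type server")
        cmds.append(f"gluster volume set {vol} cluster.quorum-type auto")
    return cmds


down = disable_quorum_commands(["data", "engine"])
# "/path/to/file" is a placeholder, not a real path from the thread.
up = resync_commands({"data": ["/path/to/file"]})
print("\n".join(down + up))
```

Keeping the steps as plain strings makes the order easy to review before anything is typed on a live node.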
If you want to power on the 2nd two nodes at the same time, make absolutely sure self-heal is disabled first! If you don't, Gluster will see the 2nd two nodes as in quorum & heal the data on your 1st node with the out-of-date data.
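The warning above comes down to simple majority arithmetic. The following is a toy model only: it is not how Gluster actually tracks pending heals, and the function name majority_copy and the "new"/"old" labels are invented for the example. It just shows why two stale nodes outvote one fresh node in a replica 3 set.

```python
from collections import Counter


def majority_copy(online_nodes, total_nodes=3):
    """Return the data generation held by a strict majority of all nodes,
    or None if no generation has a majority.

    online_nodes maps node name -> label of the data copy it holds.
    """
    if not online_nodes:
        return None
    generation, votes = Counter(online_nodes.values()).most_common(1)[0]
    return generation if votes * 2 > total_nodes else None


# Both stale nodes powered on together: the old copy has the majority.
both_at_once = majority_copy({"node1": "new", "node2": "old", "node3": "old"})

# Only one stale node powered on: one vote each, so nobody has a majority.
one_at_a_time = majority_copy({"node1": "new", "node2": "old"})

# After node2 has been healed from node1, the new copy has the majority.
after_heal = majority_copy({"node1": "new", "node2": "new"})

print(both_at_once, one_at_a_time, after_heal)
```

In this model, bringing the stale nodes back one at a time means the out-of-date copy never holds a majority on its own.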
-- 
Doug
...
-- 

Chris Bartosiak-Jentys
Certico
Tel: 03333 444 884
Mob: 077 0246 8132 
e-mail: chris@certico.co.uk 
www.certico.co.uk [1]

Links:
------
[1] https://www.certico.co.uk