On Tue, Aug 8, 2017 at 12:03 AM, FERNANDO FREDIANI <
fernando.frediani(a)upx.com> wrote:
Thanks for the detailed answer Erekle.
I conclude that it is worth it in any scenario to have an arbiter node, in
order to avoid wasting even more disk space on RAID X plus Gluster
replication on top of it. The cost seems much lower if you consider the
running costs of the whole storage and compare them with the cost of
building the arbiter node. Even a fully redundant arbiter service with 2
nodes would be worth it on a larger deployment.
Note that although you get the same consistency as a replica 3 setup, a
2+arbiter setup gives you the data availability of a replica 2 setup. That
may or may not be OK for your high availability requirements.
Y.
Regards
Fernando
On 07/08/2017 17:07, Erekle Magradze wrote:
Hi Fernando (sorry for misspelling your name, I used a different keyboard),
So let's go with the following scenarios:
1. Let's say you have two servers (replication factor 2), i.e. two bricks
per volume. In this case it is strongly recommended to have an arbiter
node: the metadata store that guarantees you avoid split-brain situations.
The arbiter doesn't even need a disk with lots of space; a tiny SSD is
enough, but it must be hosted on a separate server. The advantage of such a
setup is that you don't need RAID 1 for each brick: the metadata is stored
on the arbiter node and brick replacement is easy (rough command sketches
for all three scenarios follow this list).
2. If you have an odd number of bricks in your volume (say 3, i.e.
replication factor 3) and you created neither an arbiter node nor a quorum
configuration, then the entire load of keeping the volume consistent rests
on all 3 servers: each of them is important, each brick contains key
information, and they need to cross-check each other (that's what people
usually do on their first try of gluster :) ). In this case replacing a
brick is a big pain, and RAID 1 is a good option to have. The disadvantage
is losing the space and not having the JBOD option; the advantage is that
you don't have to have an additional arbiter node.
3. You have an odd number of bricks and a configured arbiter node. In this
case you can easily go with JBOD; however, a good practice would be to have
RAID 1 for the arbiter disks (tiny 128GB SSDs are perfectly sufficient for
volumes tens of TBs in size).
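To make the three scenarios concrete, here are rough command sketches; all
volume names, hostnames and brick paths below are made up. Scenario 1 maps
to a "replica 3 arbiter 1" volume, where the third brick stores only file
names and metadata:

    gluster volume create vmdata replica 3 arbiter 1 \
        server1:/bricks/brick1 server2:/bricks/brick1 \
        arbiter1:/bricks/arbiter

For scenario 2, the quorum configuration alluded to above is set per
volume:

    # server-side quorum: brick processes stop if their node loses quorum
    gluster volume set vmdata cluster.server-quorum-type server
    # client-side quorum: writes require a majority of the replicas
    gluster volume set vmdata cluster.quorum-type auto

For scenario 3 (JBOD bricks), a dead disk is swapped online with one
command, and self-heal then repopulates the new brick:

    gluster volume replace-brick vmdata \
        server2:/bricks/brick1 server2:/bricks/brick1-new commit force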
That's basically it. The rest about reliability and setup scenarios you can
find in the gluster documentation; look especially for the quorum and
arbiter node configurations and options.
Cheers
Erekle
P.S. What I was mentioning regarding good practice is mostly related to the
operation of gluster, not to installation or deployment, i.e. not to the
conceptual understanding of gluster (conceptually it's a JBOD system).
On 08/07/2017 05:41 PM, FERNANDO FREDIANI wrote:
Thanks for the clarification Erekle.
However, I am surprised by this way of operating GlusterFS, as it adds
another layer of complexity to the system (either hardware or software
RAID) underneath the gluster configuration and increases the system's
overall cost.
An important point to consider is: in a RAID configuration you already have
space 'wasted' in order to build redundancy (whether RAID 1, 5, or 6). When
you then run GlusterFS on top of several RAIDs, the data is replicated
again, so the same data ends up consuming more space within each group of
disks and once more across the RAIDs, depending on the Gluster
configuration you have (with RAID 1 bricks and replica 2, the same data is
stored 4 times).
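For example, with made-up numbers: two servers with 12 x 4 TB disks each
hold 96 TB of raw disk. RAID 1 halves that to 24 TB per server, and a
replica 2 volume across the two servers halves it again, leaving
96 / 2 / 2 = 24 TB usable, i.e. every block exists four times.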
Yet another downside of having RAID (especially RAID 5 or 6) is that it
considerably reduces write speed: each group of disks ends up with roughly
the write speed of a single disk, as all the other disks in the group have
to wait for each other on every write.
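The usual back-of-the-envelope figures for small random writes: RAID 5
turns one logical write into 4 disk I/Os (read old data, read parity, write
data, write parity) and RAID 6 into 6, so a 12-disk RAID 6 group delivers
only about 12/6 = 2 disks' worth of random-write IOPS. Large sequential
writes suffer far less, since whole stripes can be written in one pass.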
Therefore, if Gluster already replicates the data, why does brick
replacement create the big pain you mentioned? Since the data is replicated
somewhere else, it can still be retrieved both to serve clients and to
reconstruct the equivalent disk when it is replaced.
Fernando
On 07/08/2017 10:26, Erekle Magradze wrote:
Hi Frenando,
Here is my experience: if you use a particular hard drive as a brick for a
gluster volume and it dies, i.e. it becomes inaccessible, it's a huge
hassle to discard that brick and exchange it for another one, since gluster
still tries to access the broken brick and that causes (at least it caused
for me) a big pain. It is therefore better to have a RAID as the brick,
i.e. RAID 1 (mirroring) for each brick; in that case, if a disk is down you
can easily exchange it and rebuild the RAID without going offline, i.e.
without switching off the volume, doing brick manipulations, and switching
it back on.
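If the replacement disk goes back in under the same mount path, gluster 3.9
and later also offer reset-brick for exactly this case; a rough sketch with
made-up names:

    gluster volume reset-brick vmdata server2:/bricks/brick1 start
    # ...swap the disk, recreate the filesystem and remount it...
    gluster volume reset-brick vmdata server2:/bricks/brick1 \
        server2:/bricks/brick1 commit force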
Cheers
Erekle
On 08/07/2017 03:04 PM, FERNANDO FREDIANI wrote:
For any RAID 5 or 6 configuration I normally follow a simple golden rule
which has given good results so far:
- up to 4 disks: RAID 5
- 5 or more disks: RAID 6
However, I didn't really understand the recommendation to use any RAID with
GlusterFS. I always thought that GlusterFS likes to work in JBOD mode and
control the disks (bricks) directly, so you can create whatever
distribution rule you wish, and if a single disk fails you just replace it
and its data is obviously replicated from another one. The only downside of
using it this way is that replication traffic will flow across all servers,
but that is not that big an issue.
Can anyone elaborate on using RAID + GlusterFS versus JBOD + GlusterFS?
Thanks
Regards
Fernando
On 07/08/2017 03:46, Devin Acosta wrote:
Moacir,
I have recently installed multiple Red Hat Virtualization hosts for several
different companies, and I have dealt in depth with the Red Hat Support
Team about the optimal configuration for setting up GlusterFS efficiently.
I wanted to share with you what I learned.
In general the Red Hat Virtualization team frowns upon using each DISK of
the system as just a JBOD. Sure, there is some protection by having the
data replicated; however, the recommendation is to use RAID 6 (preferred),
RAID 5, or RAID 1 at the very least.
Here is the direct quote from Red Hat when I asked about RAID and Bricks:
*"A typical Gluster configuration would use RAID underneath the bricks.
RAID 6 is most typical as it gives you 2 disk failure protection, but RAID
5 could be used too. Once you have the RAIDed bricks, you'd then apply the
desired replication on top of that. The most popular way of doing this
would be distributed replicated with 2x replication. In general you'll get
better performance with larger bricks. 12 drives is often a sweet spot.
Another option would be to create a separate tier using all SSD’s.” *
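Translated into a command line, such a distributed-replicated 2x layout
over RAID 6 backed bricks would look roughly like this (six hypothetical
nodes, one brick each; consecutive bricks form the mirror pairs):

    gluster volume create vmstore replica 2 \
        node1:/bricks/raid6 node2:/bricks/raid6 \
        node3:/bricks/raid6 node4:/bricks/raid6 \
        node5:/bricks/raid6 node6:/bricks/raid6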
In order to do SSD tiering, from my understanding you would need 1 x NVMe
drive in each server, or 4 x SSDs for the hot tier (it needs to be
distributed-replicated for the hot tier if not using NVMe). So with you
only having 1 SSD drive in each server, I'd suggest looking into the NVMe
option.
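For reference, a hot tier is attached on top of an existing volume; this
sketch follows the 3.x tiering syntax, with made-up SSD brick paths (note
that tiering is not enabled by default, it has to be configured
explicitly):

    gluster volume tier vmstore attach replica 2 \
        node1:/ssd/hot node2:/ssd/hot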
Since you're using only 3 servers, what I'd probably suggest is to do 2
Replicas + Arbiter Node. This setup actually doesn't require the 3rd server
to have big drives at all, as it only stores metadata about the files and
not a full copy.
Please see the attached document that was given to me by Red Hat for more
information on this. Hope this information helps you.
--
Devin Acosta, RHCA, RHVCA
Red Hat Certified Architect
On August 6, 2017 at 7:29:29 PM, Moacir Ferreira (
moacirferreira(a)hotmail.com) wrote:
I am willing to assemble an oVirt "pod" made of 3 servers, each with 2 CPU
sockets of 12 cores, 256GB RAM, 7 x 10K HDDs, and 1 SSD. The idea is to use
GlusterFS to provide HA for the VMs. The 3 servers have a dual 40Gb NIC and
a dual 10Gb NIC, so my intention is to create a loop, like a server
triangle, using the 40Gb NICs for access to the virtualization files (the
VMs' .qcow2) and for moving VMs around the pod (east/west traffic), while
using the 10Gb interfaces for serving the outside world (north/south
traffic).
This said, my first question is: how should I deploy GlusterFS in such an
oVirt scenario? Specifically:
1 - Should I create 3 RAID arrays (e.g. RAID 5), one on each oVirt node,
and then create a GlusterFS volume using them?
2 - Or should I instead create a JBOD array made of all the servers' disks?
3 - What is the best Gluster configuration to provide HA while not
consuming too much disk space?
4 - Does an oVirt hypervisor pod like the one I am planning to build, and
its virtualization environment, benefit from tiering when using an SSD
disk? And if so, will Gluster do it by default or do I have to configure it?
Bottom line: what is the good practice for using GlusterFS in small pods
for enterprises?
Your opinion/feedback will be really appreciated!
Moacir
--
Recogizer Group GmbH
Dr.rer.nat. Erekle Magradze
Lead Big Data Engineering & DevOps
Rheinwerkallee 2, 53227 Bonn
Tel: +49 228 29974555
E-Mail erekle.magradze(a)recogizer.de
Web: www.recogizer.com
Recogizer on LinkedIn: https://www.linkedin.com/company-beta/10039182/
Follow us on Twitter: https://twitter.com/recogizer
-----------------------------------------------------------------
Recogizer Group GmbH
Managing directors: Oliver Habisch, Carsten Kreutze
Commercial register: Amtsgericht Bonn HRB 20724
Registered office: Bonn; VAT ID: DE294195993
This e-mail contains confidential and/or legally protected information.
If you are not the intended recipient, or have received this e-mail in
error, please notify the sender immediately and delete this e-mail.
Unauthorized copying and unauthorized forwarding of this e-mail and the
information contained in it are not permitted.
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users