<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Hi Fernando,</p>

    <p>So let's go with the following scenarios:</p>

    <p>1. Let's say you have two servers (replication factor 2), i.e.
      two bricks per volume. In this case it is strongly recommended to
      add an arbiter node, which stores only metadata and guards
      against split-brain. The arbiter doesn't need a disk with lots of
      space; a tiny SSD is enough, but it must be hosted on a separate
      server. The advantage of this setup is that you don't need RAID 1
      for each brick: the metadata is stored on the arbiter node, and
      brick replacement is easy.</p>

    <p>2. If you have an odd number of bricks (say 3, i.e.
      replication factor 3) in your volume and you have neither created
      an arbiter node nor configured quorum, the entire load of keeping
      the volume consistent rests on all 3 servers. Each of them is
      important and each brick contains key information, so they need
      to cross-check each other (that's what people usually do on their
      first try with Gluster :) ). In this case replacing a brick is a
      big pain, and RAID 1 is a good option to have. The disadvantage
      is losing space and not having the JBOD option; the advantage is
      that you don't need an additional arbiter node.</p>

    <p>3. You have an odd number of bricks and a configured arbiter
      node. In this case you can easily go with JBOD; however, a good
      practice would be to have RAID 1 for the arbiter disks (tiny
      128GB SSDs are perfectly sufficient for volumes tens of TB in
      size).</p>
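    <p>To illustrate why the arbiter matters in scenario 1, here is a
      minimal sketch that assumes a plain majority vote among the nodes
      holding a copy of the metadata. This is a simplification for
      illustration, not Gluster's actual quorum code, and the function
      name <code>has_quorum</code> is made up:</p>
    <pre>
```python
def has_quorum(reachable_voters, total_voters):
    # Strict majority: more than half of all voters must be reachable.
    majority = total_voters // 2 + 1
    return reachable_voters in range(majority, total_voters + 1)


# Two data bricks, no arbiter: after a network split each side
# sees only 1 of 2 voters, so neither side can safely accept writes.
print(has_quorum(1, 2))  # False

# Two data bricks plus an arbiter: the side that still sees
# 2 of 3 voters keeps working, the isolated side does not.
print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False
```
    </pre>
    <p>With only two voters there is no way to break a tie, which is
      where the split-brain risk comes from; the third voter does not
      need to hold the data itself.</p>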

    <p>That's basically it.</p>
    <p>You can find the rest about reliability and setup scenarios
      in the Gluster documentation; look especially for the quorum and
      arbiter node configs and options.</p>

    <p>Cheers</p>

    <p>Erekle</p>

    <p>P.S. What I mentioned as good practice relates mostly to the
      operation of Gluster, not its installation or deployment, i.e.
      not the conceptual understanding of Gluster (conceptually it's a
      JBOD system).<br>

    </p>

    <br>

    <div class="moz-cite-prefix">On 08/07/2017 05:41 PM, FERNANDO

      FREDIANI wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:c7a1c2e1-57c3-9fa5-0710-ebee3f3fa069@upx.com">


      <p>Thanks for the clarification Erekle.</p>

      <p>However, I am surprised by this way of operating GlusterFS,
        as it adds another layer of complexity to the system (either
        hardware or software RAID) underneath the Gluster config and
        increases the system's overall cost.<br>

      </p>

      <p>An important point to consider: in a RAID configuration you
        have already 'wasted' space to build redundancy (RAID 1, 5, or
        6). When you then put GlusterFS on top of several RAIDs, the
        data is replicated again, so the same data consumes extra space
        within each group of disks and again across the RAIDs,
        depending on the Gluster configuration you have (with RAID 1
        bricks the same data is stored 4 times).</p>
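      <p>The arithmetic behind that "4 times" can be sketched as
        follows, assuming replica 2 on top of two-disk RAID 1 mirrors
        (the replica count is an assumption here, and the function
        names are only for illustration):</p>
      <pre>
```python
def physical_copies(gluster_replicas, copies_per_brick):
    # Every Gluster replica lands on a brick, and each brick stores
    # copies_per_brick copies of it (2 for a RAID 1 mirror, 1 for JBOD).
    return gluster_replicas * copies_per_brick


def usable_fraction(gluster_replicas, copies_per_brick):
    # Share of the raw disk space that ends up holding unique data.
    return 1 / physical_copies(gluster_replicas, copies_per_brick)


print(physical_copies(2, 2))  # replica 2 on RAID 1 bricks: 4
print(usable_fraction(2, 2))  # 0.25
print(physical_copies(3, 1))  # replica 3 on JBOD bricks: 3
```
      </pre>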

      <p>Yet another downside of RAID (especially RAID 5 or 6) is that
        it considerably reduces write speed: each group of disks ends
        up with the write speed of a single disk, because all the other
        disks in the group have to wait for each other to write as
        well.<br>

      </p>

      <p>Therefore, if Gluster already replicates data, why does it
        create this big pain you mentioned, when the data is replicated
        somewhere else and can still be retrieved both to serve clients
        and to reconstruct the equivalent disk when it is replaced?</p>

      <p>Fernando<br>

      </p>

      <br>

      <div class="moz-cite-prefix">On 07/08/2017 10:26, Erekle Magradze

        wrote:<br>

      </div>

      <blockquote type="cite"

        cite="mid:aa829d07-fa77-3ed9-2500-e33cc01414b6@recogizer.de">


        <p>Hi Fernando,</p>

        <p>Here is my experience: if you use a particular hard drive
          as a brick for a Gluster volume and it dies, i.e. becomes
          inaccessible, it's a huge hassle to discard that brick and
          swap in another one, since Gluster sometimes still tries to
          access the broken brick, which causes (at least it did for
          me) a big pain. It's therefore better to have a RAID as the
          brick, i.e. RAID 1 (mirroring) for each brick. Then, if a
          disk goes down, you can easily exchange it and rebuild the
          RAID without going offline, i.e. without switching off the
          volume, doing brick manipulations and switching it back
          on.<br>

        </p>

        <p>Cheers</p>

        <p>Erekle<br>

        </p>

        <br>

        <div class="moz-cite-prefix">On 08/07/2017 03:04 PM, FERNANDO

          FREDIANI wrote:<br>

        </div>

        <blockquote type="cite"

          cite="mid:63bac47b-afe6-0258-d3d7-e545a5004c30@upx.com">


          <p>For any RAID 5 or 6 configuration I normally follow a
            simple golden rule that has given good results so far:<br>
            - up to 4 disks: RAID 5<br>
            - 5 or more disks: RAID 6</p>
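          <p>To put rough numbers on that rule, here is a small sketch
            of the capacity each level leaves over, using the standard
            parity overhead (one disk for RAID 5, two for RAID 6). The
            names are illustrative, and the 7-disk case is an
            assumption taken from the server spec in the original
            question:</p>
          <pre>
```python
# Disks' worth of capacity spent on parity per array.
PARITY_DISKS = {"RAID 5": 1, "RAID 6": 2}


def usable_disks(level, disks):
    # Capacity left for data, counted in whole disks.
    return disks - PARITY_DISKS[level]


for level, disks in [("RAID 5", 4), ("RAID 6", 5), ("RAID 6", 7)]:
    print(level, "with", disks, "disks, usable:", usable_disks(level, disks))
```
          </pre>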

          <p>However, I didn't really understand the recommendation
            to use any RAID with GlusterFS. I always thought that
            GlusterFS likes to work in JBOD mode and control the disks
            (bricks) directly, so you can create whatever distribution
            rule you wish, and if a single disk fails you just replace
            it and the data is replicated back from another brick. The
            only downside of using it this way is that the replication
            traffic flows across all servers, but that is not a big
            issue.</p>

          <p>Can anyone elaborate on using RAID + GlusterFS versus
            JBOD + GlusterFS?</p>

          <p>Thanks<br>

            Regards<br>

            Fernando<br>

          </p>

          <br>

          <div class="moz-cite-prefix">On 07/08/2017 03:46, Devin Acosta

            wrote:<br>

          </div>

          <blockquote type="cite"

cite="mid:CANCGKEp4XGs0U+Qs78eEmqCNtvpLY-Azjb5DcGhZ9yiKTBEEfw@mail.gmail.com">

            <style>body{font-family:Helvetica,Arial;font-size:13px}</style>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono"><br>

              </font></div>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono">Moacir,</font></div>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono"><br>

              </font></div>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono">I have recently installed multiple Red Hat
                Virtualization hosts for several different companies
                and have dealt in depth with the Red Hat Support Team
                about the optimal configuration for setting up
                GlusterFS most efficiently, and I wanted to share with
                you what I learned.</font></div>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono"><br>

              </font></div>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono">In general, the Red Hat Virtualization team frowns
                upon using each disk of the system as just a JBOD. Sure,
                there is some protection from having the data
                replicated; however, the recommendation is to use RAID 6
                (preferred) or RAID 5, or RAID 1 at the very least.</font></div>

            <div id="bloop_customfont"

              style="color:rgb(0,0,0);margin:0px"><font face="Input

                Mono"><br>

              </font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono">Here is the direct quote from Red Hat

                when I asked about RAID and Bricks:</font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i><br>

                </i></font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i>"A typical Gluster configuration

                  would use RAID underneath the bricks. RAID 6 is most

                  typical as it gives you 2 disk failure protection, but

                  RAID 5 could be used too. Once you have the RAIDed

                  bricks, you'd then apply the desired replication on

                  top of that. The most popular way of doing this would

                  be distributed replicated with 2x replication. In

                  general you'll get better performance with larger

                  bricks. 12 drives is often a sweet spot. Another

                  option would be to create a separate tier using all

                  SSD’s.” </i></font></div>

            <div id="bloop_customfont" style="margin:0px"><br>

            </div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i>To set up SSD tiering, from my
                  understanding you would need 1 x NVMe drive in each
                  server, or 4 x SSDs for the hot tier (it needs to be
                  distributed-replicated for the hot tier if not using
                  NVMe). So with you only having 1 SSD drive in each
                  server, I’d suggest maybe looking into the NVMe
                  option. </i></font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i><br>

                </i></font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i>Since you're using only 3 servers,

                  what I’d probably suggest is to do (2 Replicas +

                  Arbiter Node), this setup actually doesn’t require the

                  3rd server to have big drives at all as it only stores

                  meta-data about the files and not actually a full

                  copy. </i></font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i><br>

                </i></font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i>Please see the attached document

                  that was given to me by Red Hat to get more

                  information on this. Hope this information helps you.</i></font></div>

            <div id="bloop_customfont" style="margin:0px"><font

                face="Input Mono"><i><br>

                </i></font></div>

            <br>

            <div id="bloop_sign_1502087376725469184" class="bloop_sign"><span

                style="font-family:'helvetica

                Neue',helvetica;font-size:14px">--</span><br

                style="font-family:'helvetica

                Neue',helvetica;font-size:14px">

              <div class="gmail_signature" style="font-family:'helvetica

                Neue',helvetica;font-size:14px">

                <div dir="ltr">

                  <div><br>

                  </div>

                  <div>Devin Acosta, RHCA, RHVCA</div>

                  <div>Red Hat Certified Architect</div>

                </div>

              </div>

            </div>

            <br>

            <p class="airmail_on">On August 6, 2017 at 7:29:29 PM,

              Moacir Ferreira (<a

                href="mailto:moacirferreira@hotmail.com"

                moz-do-not-send="true">moacirferreira@hotmail.com</a>)

              wrote:</p>

            <blockquote type="cite" class="clean_bq"><span>

                <div dir="ltr">

                  <div>


                    <div id="divtagdefaultwrapper"

style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif"

                      dir="ltr">

                      <p><span>I am planning to assemble an oVirt "pod"
                          made of 3 servers, each with 2 CPU sockets of
                          12 cores, 256GB RAM, 7 x 10K HDDs and 1 SSD.
                          The idea is to use GlusterFS to provide HA for
                          the VMs. Each of the 3 servers has a dual 40Gb
                          NIC and a dual 10Gb NIC. My intention is to
                          create a loop, like a server triangle, using
                          the 40Gb NICs for access to the virtualization
                          files (VM .qcow2 images) and for moving VMs
                          around the pod (east/west traffic), while
                          using the 10Gb interfaces to provide services
                          to the outside world (north/south
                          traffic).</span></p>

                      <p><br>

                      </p>

                      <p>That said, my main question is: how should I
                        deploy GlusterFS in such an oVirt scenario?
                        Specifically:</p>

                      <p><br>

                      </p>

                      <p>1 - Should I create 3 RAID arrays (e.g. RAID
                        5), one on each oVirt node, and then create a
                        GlusterFS volume using them?</p>

                      <p>2 - Instead, should I create a JBOD array made
                        of all the servers' disks?</p>

                      <p>3 - What is the best Gluster configuration to

                        provide for HA while not consuming too much disk

                        space?<br>

                      </p>

                      <p>4 - Does an oVirt hypervisor pod like the one I
                        am planning to build, and the virtualization
                        environment, benefit from tiering when using an
                        SSD? And if yes, will Gluster do it by default,
                        or do I have to configure it to do so?</p>

                      <p><br>

                      </p>

                      <p>Bottom line: what is good practice for using
                        GlusterFS in small pods for enterprises?<br>

                      </p>

                      <p><br>

                      </p>

                      <p>Your opinion/feedback will be really
                        appreciated!</p>

                      <p>Moacir<br>

                      </p>

                    </div>

                    _______________________________________________ <br>

                    Users mailing list <br>

                    <a href="mailto:Users@ovirt.org"

                      moz-do-not-send="true">Users@ovirt.org</a> <br>

                    <a

                      href="http://lists.ovirt.org/mailman/listinfo/users"

                      moz-do-not-send="true">http://lists.ovirt.org/mailman/listinfo/users</a>

                    <br>

                  </div>

                </div>

              </span></blockquote>

            <br>


          </blockquote>

          <br>

          <br>


        </blockquote>

        <br>

      </blockquote>

      <br>

    </blockquote>

    <br>

    <pre class="moz-signature" cols="72">-- 

Recogizer Group GmbH

Dr.rer.nat. Erekle Magradze

Lead Big Data Engineering &amp; DevOps

Rheinwerkallee 2, 53227 Bonn

Tel: +49 228 29974555

E-Mail <a class="moz-txt-link-abbreviated" href="mailto:erekle.magradze@recogizer.de">erekle.magradze@recogizer.de</a>

Web: <a class="moz-txt-link-abbreviated" href="http://www.recogizer.com">www.recogizer.com</a>

Recogizer on LinkedIn <a class="moz-txt-link-freetext" href="https://www.linkedin.com/company-beta/10039182/">https://www.linkedin.com/company-beta/10039182/</a>

Follow us on Twitter <a class="moz-txt-link-freetext" href="https://twitter.com/recogizer">https://twitter.com/recogizer</a>

-----------------------------------------------------------------

Recogizer Group GmbH

Managing Directors: Oliver Habisch, Carsten Kreutze

Commercial register: Amtsgericht Bonn, HRB 20724

Registered office: Bonn; VAT ID No.: DE294195993

This e-mail contains confidential and/or legally protected information.

If you are not the intended recipient or have received this e-mail in error,

please notify the sender immediately and delete this e-mail.

Unauthorized copying and unauthorized forwarding of this e-mail and the information it contains is not permitted.</pre>

  </body>

</html>