
On 01/05/2016 09:33 PM, Fil Di Noto wrote:
On Thu, Dec 31, 2015 at 3:44 AM, Donny Davis <donny@cloudspin.me> wrote:
> I would say you would be much better off picking two hosts to do your HE on
> and setting up DRBD for the HE storage. You will have fewer problems with
> your HE.

Is DRBD an oVirt thing? Or are you suggesting a database backup or a replicated SQL database solution that I would cut over to manually?
On Thu, Dec 31, 2015 at 1:41 AM, Sahina Bose <sabose@redhat.com> wrote:
> If you mean creating new gluster volumes - you need to make sure the gluster
> service is enabled on the Default cluster. The cluster that HE creates has
> only the virt service enabled by default. The engine should have been
> installed in "Both" mode, as Roy mentioned.

I went over those settings with limited success. I went through multiple iterations of trying to stand up HE on gluster storage, then gave up on that and tried NFS storage. Ultimately it always came down to sanlock errors (with both gluster and NFS storage). I tried restarting the sanlock service, which would lead to the watchdog rebooting the hosts. When a host came back up, it almost seemed like things had started working: I could begin to see/create gluster volumes (see hosted_storage, or begin to create a data storage domain).
But when I would try to activate the hosted_storage domain, things would start to fall apart again; sanlock, as far as I can tell.
I am currently running the engine on a physical system and things are working fine. I am considering taking a backup and attempting the HE physical-to-VM migration method, time permitting.
Were the optimizations applied on the gluster volume hosting the HE? Assuming that "engine" is the gluster volume:

# gluster volume set engine group virt
# gluster volume set engine storage.owner-uid 36 && gluster volume set engine storage.owner-gid 36

Did you encounter errors when adding the data storage domain or when installing additional HE hosts?

If you are having trouble creating a gluster volume using the oVirt interface:
1. Enable the gluster service on the Default cluster.
2. Put the host into maintenance and activate it, so that oVirt recognizes the host as gluster-enabled.
3. You should now see the gluster volume hosting the HE added to the oVirt UI.

If this does not work, you could try creating the gluster volume (for the data store) from the CLI, and then adding the storage domain from the oVirt UI.
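For reference, the CLI fallback described above might look like this on one of the gluster nodes; the volume name "data", the hostnames, and the brick paths are placeholders, not taken from the thread (only the `group virt` and `storage.owner-*` settings come from the mail):

```shell
# Create a replica-3 volume for the data storage domain (example names).
gluster volume create data replica 3 \
    host1:/gluster/data/brick host2:/gluster/data/brick host3:/gluster/data/brick

# Apply the virt option group and vdsm ownership (uid/gid 36 = vdsm:kvm).
gluster volume set data group virt
gluster volume set data storage.owner-uid 36
gluster volume set data storage.owner-gid 36

gluster volume start data
```

Once the volume is started, the storage domain can be added from the oVirt UI as usual.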
On 12/28/2015 12:43 AM, Roy Golan wrote:
> A 3-way replica is the officially supported replica count for the VM store
> use case. If you wish to work with replica 4, you can update
> supported_replica_count in vdsm.conf.

Thanks for that insight. I think I just experienced the bad aspects of both quorum=auto and quorum=none. I don't like replica 3 because you can only have one brick offline at a time; I think N+2 should be the target for a production environment (so you have capacity for a failure while doing maintenance). Would adding an arbiter affect the quorum status? Is 3x replica plus 1x arbiter considered replica 3 or replica 4?
For the VM store use case, quorum=auto is what you need to set, to ensure that your VM images are not corrupted (due to split-brain files). With quorum=auto, an odd replica count is recommended. With an even replica count, for instance 4, if 2 bricks fail there is a restriction that the primary brick (the first brick specified when creating the volume) must be UP. If the primary brick is down along with any other brick, the volume becomes read-only. Arbiter does affect the quorum status. Arbiter is currently supported only in a replica 3 setup: "replica 3 arbiter 1" means you have 3 bricks providing a 3-way replica, of which 1 is an arbiter brick.
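The quorum rules above can be sketched as a tiny model (my own illustration, not gluster code): with quorum=auto, writes are allowed when strictly more than half the bricks in a replica set are up, and when exactly half are up only if the primary (first-listed) brick is among them.

```python
def writes_allowed(replica_count: int, up_bricks: set) -> bool:
    """Model of gluster's cluster.quorum-type=auto for one replica set.

    up_bricks holds the 0-based indices of bricks currently online;
    brick 0 is the "primary" (the first brick listed at volume-create time).
    """
    up = len(up_bricks)
    if up * 2 > replica_count:    # strictly more than half up: quorum met
        return True
    if up * 2 == replica_count:   # exactly half up: primary must be among them
        return 0 in up_bricks
    return False                  # fewer than half up: volume goes read-only

# replica 3: any single brick may fail
print(writes_allowed(3, {1, 2}))   # True
print(writes_allowed(3, {2}))      # False

# replica 4: with 2 bricks down, writes survive only if the primary is up
print(writes_allowed(4, {0, 1}))   # True  (primary up)
print(writes_allowed(4, {2, 3}))   # False (primary down -> read-only)
```

This shows why an odd count is recommended: replica 3 tolerates any single failure, while replica 4 tolerates a second failure only when the primary happens to survive.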
> No chicken-and-egg here, I think. You want a volume to be used as your
> master data domain, and creating a new volume in a new gluster cluster is
> independent of your datacenter status.
>
> You mentioned your hosts are on the Default cluster - so make sure your
> cluster supports the gluster service (you should have picked gluster as a
> service during engine install).

I chose "both" during engine-setup, although I didn't have the gluster service enabled on the Default cluster at first. Also, the vdsm-gluster rpm was not installed (I sort of feel like 'hosted-engine --deploy' should take care of that). Adding a host from my current physical engine using the "add host" GUI didn't bring it in either.
In our hyperconverged testing, we installed vdsm and vdsm-gluster prior to running hosted-engine --deploy to get around this (HE setup does not support a hyperconverged setup yet). Could you try that?

# yum install vdsm vdsm-gluster ovirt-hosted-engine-setup screen

-- prior to running hosted-engine --deploy on the 3 nodes?

Adding a host to a cluster with the gluster service enabled should bring in vdsm-gluster. Please provide the host-deploy logs if this did not work for you.

HTH,
sahina
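Put together, the pre-deploy workaround described in the mail would run on each of the 3 nodes roughly as follows (the package list is quoted from the mail; the ordering is my reading of it, not a tested procedure):

```shell
# Install vdsm and vdsm-gluster *before* deploying, so the host is already
# gluster-capable when the hosted engine comes up.
yum install vdsm vdsm-gluster ovirt-hosted-engine-setup screen

# Then run the deploy (inside screen, since it is a long interactive session).
screen
hosted-engine --deploy
```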
Thanks for the input!