[ovirt-users] hyperconverged question

Jim Kusznir jim at palousetech.com
Fri Sep 1 18:46:48 UTC 2017


I'm now also confused as to what the point of an arbiter is / what it does
/ why one would use it.

On Fri, Sep 1, 2017 at 11:44 AM, Jim Kusznir <jim at palousetech.com> wrote:

> Thanks for the help!
>
> Here's my gluster volume info for the data export/brick (I have 3: data,
> engine, and iso, but they're all configured the same):
>
> Volume Name: data
> Type: Replicate
> Volume ID: e670c488-ac16-4dd1-8bd3-e43b2e42cc59
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: ovirt1.nwfiber.com:/gluster/brick2/data
> Brick2: ovirt2.nwfiber.com:/gluster/brick2/data
> Brick3: ovirt3.nwfiber.com:/gluster/brick2/data (arbiter)
> Options Reconfigured:
> performance.strict-o-direct: on
> nfs.disable: on
> user.cifs: off
> network.ping-timeout: 30
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> performance.low-prio-threads: 32
> features.shard-block-size: 512MB
> features.shard: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.readdir-ahead: on
> server.allow-insecure: on
> [root@ovirt1 ~]#
>
>
> All 3 of my brick nodes ARE also members of the virtualization cluster
> (including ovirt3).  How can I convert it to a full replica instead of
> just an arbiter?
>
> Thanks!
> --Jim
>
> On Fri, Sep 1, 2017 at 9:09 AM, Charles Kozler <ckozleriii at gmail.com>
> wrote:
>
>> @Kasturi - Looks good now. The cluster showed down for a moment but VMs
>> stayed up in their appropriate places. Thanks!
>>
>> <Anyone on this list, please feel free to correct my response to Jim if
>> it's wrong>
>>
>> @Jim - If you can share your gluster volume info / status I can confirm
>> (to the best of my knowledge). From my understanding, if you set up the
>> volume with something like 'gluster volume set <vol> group virt', this
>> will configure some quorum options as well. Ex: http://i.imgur.com/Mya4N5o.png
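>>
>> A minimal sketch, assuming a volume named "data" (the exact options the
>> virt group applies can vary by gluster version):
>>
>>   # apply the virt option group to the volume
>>   gluster volume set data group virt
>>
>>   # check the quorum-related options it set
>>   gluster volume get data cluster.quorum-type
>>   gluster volume get data cluster.server-quorum-type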
>>
>> While, yes, you are configured for an arbiter node, you're still losing
>> quorum by dropping from 2 -> 1. You would need 4 nodes with 1 being the
>> arbiter to maintain quorum, which is in effect 3 writable nodes and 1
>> arbiter. If one gluster node drops, you still have 2 up. Although in that
>> case, you probably wouldn't need the arbiter at all.
>>
>> If you are configured that way, you can drop the quorum settings and just
>> let the arbiter run, since you're not using the arbiter node in your VM
>> cluster part (I believe), just the storage cluster part. When using quorum,
>> you need > 50% of the cluster up at one time. Since you have 3 nodes with 1
>> arbiter, losing one data node leaves you at 1/2, which == 50%, which ==
>> degraded / hindered gluster.
>>
>> Again, this is to the best of my knowledge based on other quorum-backed
>> software... and this is what I understand from testing with gluster and
>> oVirt thus far.
>>
>> On Fri, Sep 1, 2017 at 11:53 AM, Jim Kusznir <jim at palousetech.com> wrote:
>>
>>> Huh... OK, how do I convert the arbiter to a full replica, then?  I was
>>> misinformed when I created this setup.  I thought the arbiter held
>>> enough metadata that it could validate or repudiate any one replica (kind
>>> of like the parity drive in a RAID-4 array).  I was also under the
>>> impression that one replica + arbiter is enough to keep the array online
>>> and functional.
>>>
>>> --Jim
>>>
>>> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozleriii at gmail.com>
>>> wrote:
>>>
>>>> @Jim - you have only two data bricks and lost quorum. The arbiter only
>>>> stores metadata, no actual files. So yes, you were running in degraded
>>>> mode, so some operations were hindered.
>>>>
>>>> @Sahina - Yes, this actually worked fine for me once I did that.
>>>> However, the issue I am still facing is when I go to create a new gluster
>>>> storage domain (replica 3, hyperconverged), tell it "Host to use", and
>>>> select that host. If I fail that host, all VMs halt. I do not recall this
>>>> in 3.6 or early 4.0. This makes it seem like the domain is "pinning" a
>>>> node to a volume and vice versa, like you could, for instance, on a
>>>> single hyperconverged box export a local disk via NFS and then mount it
>>>> as an oVirt domain. But of course, that has its caveats. To that end, I
>>>> am using gluster replica 3; when configuring it I say "host to use:"
>>>> node1, then in the connection details I give it node1:/data. I fail
>>>> node1, and all VMs halt. Did I miss something?
>>>>
>>>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sabose at redhat.com> wrote:
>>>>
>>>>> To the OP question: when you set up a gluster storage domain, you need
>>>>> to specify backup-volfile-servers=<server2>:<server3>, where server2
>>>>> and server3 also have bricks running. When server1 is down and the volume
>>>>> is mounted again, server2 or server3 is queried to get the gluster
>>>>> volfiles.
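>>>>>
>>>>> A minimal sketch, assuming hosts server1/server2/server3 and a volume
>>>>> named "data" (all names are placeholders). In the storage domain dialog
>>>>> you would enter:
>>>>>
>>>>>   Path:          server1:/data
>>>>>   Mount Options: backup-volfile-servers=server2:server3
>>>>>
>>>>> which ends up as a FUSE mount roughly like:
>>>>>
>>>>>   mount -t glusterfs -o backup-volfile-servers=server2:server3 \
>>>>>       server1:/data /rhev/data-center/mnt/glusterSD/server1:_data
>>>>>
>>>>> so the client can fetch the volfile from server2 or server3 when server1
>>>>> is unreachable at mount time.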
>>>>>
>>>>> @Jim, if this does not work, are you using the 4.1.5 build with libgfapi
>>>>> access? If not, please provide the vdsm and gluster mount logs to analyse.
>>>>>
>>>>> If VMs go to a paused state, this could mean the storage is not
>>>>> available. You can check "gluster volume status <volname>" to see if
>>>>> at least 2 bricks are running.
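>>>>>
>>>>> For example, assuming a volume named "data":
>>>>>
>>>>>   # each brick should report Online: Y; with replica 2 + arbiter, at
>>>>>   # least 2 of the 3 bricks must be up for writes to continue
>>>>>   gluster volume status data
>>>>>
>>>>>   # and check for pending heals once a brick comes back
>>>>>   gluster volume heal data info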
>>>>>
>>>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <johan at kafit.se>
>>>>> wrote:
>>>>>
>>>>>> If gluster drops in quorum so that it has fewer votes than it should,
>>>>>> it will stop file operations until quorum is back to normal. If I
>>>>>> remember it right, you need two writable bricks for quorum to be met,
>>>>>> and the arbiter is only a vote to avoid split brain.
>>>>>>
>>>>>>
>>>>>> Basically, what you have is a RAID 5 solution without a spare. When one
>>>>>> disk dies it will run in degraded mode, and some RAID systems will stop
>>>>>> the array until you have removed the disk or forced it to run anyway.
>>>>>>
>>>>>> You can read up on it here:
>>>>>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>>>
>>>>>> /Johan
>>>>>>
>>>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>>>
>>>>>> Hi all:
>>>>>>
>>>>>> Sorry to hijack the thread, but I was about to start essentially the
>>>>>> same thread.
>>>>>>
>>>>>> I have a 3-node cluster; all three are hosts and gluster nodes
>>>>>> (replica 2 + arbiter).  I DO have the mnt_options=backup-volfile-servers=
>>>>>> set:
>>>>>>
>>>>>> storage=192.168.8.11:/engine
>>>>>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
>>>>>>
>>>>>> I had an issue today where 192.168.8.11 went down.  ALL VMs
>>>>>> immediately paused, including the engine (all VMs were running on
>>>>>> host2:192.168.8.12).  I couldn't get any gluster stuff working until host1
>>>>>> (192.168.8.11) was restored.
>>>>>>
>>>>>> What's wrong / what did I miss?
>>>>>>
>>>>>> (This was set up "manually" through the article on setting up a
>>>>>> self-hosted gluster cluster back when 4.0 was new... I've upgraded it to
>>>>>> 4.1 since.)
>>>>>>
>>>>>> Thanks!
>>>>>> --Jim
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 12:31 PM, Charles Kozler <
>>>>>> ckozleriii at gmail.com> wrote:
>>>>>>
>>>>>> Typo..."Set it up and then failed that **HOST**"
>>>>>>
>>>>>> And upon that host going down, the storage domain went down. I only
>>>>>> have the hosted storage domain and this new one - is this why the DC went
>>>>>> down and no SPM could be elected?
>>>>>>
>>>>>> I don't recall it working this way in early 4.0 or 3.6.
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 3:30 PM, Charles Kozler <ckozleriii at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>> So I've tested this today and I failed a node. Specifically, I set up
>>>>>> a glusterfs domain and selected "host to use: node1". Set it up and then
>>>>>> failed that VM.
>>>>>>
>>>>>> However, this did not work and the datacenter went down. My engine
>>>>>> stayed up; however, it seems configuring a domain pinned to a "host to
>>>>>> use" will obviously cause it to fail.
>>>>>>
>>>>>> This seems counter to the point of glusterfs or any redundant storage.
>>>>>> If a single host has to be tied to its function, this introduces a
>>>>>> single point of failure.
>>>>>>
>>>>>> Am I missing something obvious?
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <knarra at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>> Yes, right.  What you can do is edit the hosted-engine.conf file, where
>>>>>> there is a parameter as shown below [1]; replace h2 and h3 with your
>>>>>> second and third storage servers. Then you will need to restart the
>>>>>> ovirt-ha-agent and ovirt-ha-broker services on all the nodes.
>>>>>>
>>>>>> [1] 'mnt_options=backup-volfile-servers=<h2>:<h3>'
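>>>>>>
>>>>>> A minimal sketch for each node (h2/h3 are placeholders for your second
>>>>>> and third storage servers; the path assumes a standard oVirt
>>>>>> hosted-engine install):
>>>>>>
>>>>>>   # /etc/ovirt-hosted-engine/hosted-engine.conf
>>>>>>   mnt_options=backup-volfile-servers=<h2>:<h3>
>>>>>>
>>>>>>   # restart the HA services so the new mount options take effect
>>>>>>   systemctl restart ovirt-ha-broker ovirt-ha-agent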
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 5:54 PM, Charles Kozler <ckozleriii at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>> Hi Kasturi -
>>>>>>
>>>>>> Thanks for feedback
>>>>>>
>>>>>> > If the cockpit+gdeploy plugin had been used, then it would have
>>>>>> automatically detected the glusterfs replica 3 volume created during
>>>>>> Hosted Engine deployment and this question would not have been asked
>>>>>>
>>>>>> Actually, doing hosted-engine --deploy also auto-detects glusterfs.
>>>>>> I know the glusterfs FUSE client has the ability to fail over between
>>>>>> all nodes in the cluster, but I am still curious, given that I see
>>>>>> node1:/engine in the oVirt config (node1 being what I set it to in
>>>>>> hosted-engine --deploy). So my concern was to find out exactly how the
>>>>>> engine behaves when one node goes away and the FUSE client moves over to
>>>>>> the other node in the gluster cluster.
>>>>>>
>>>>>> But you did somewhat answer my question: the answer seems to be no (by
>>>>>> default), and I will have to use hosted-engine.conf and change the
>>>>>> parameter as you list.
>>>>>>
>>>>>> So I need to do something manual to create HA for the engine on gluster?
>>>>>> Yes?
>>>>>>
>>>>>> Thanks so much!
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <knarra at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>    During Hosted Engine setup, the question about the glusterfs volume
>>>>>> is asked because you set up the volumes yourself. If the cockpit+gdeploy
>>>>>> plugin had been used, it would have automatically detected the glusterfs
>>>>>> replica 3 volume created during Hosted Engine deployment and this
>>>>>> question would not have been asked.
>>>>>>
>>>>>>    During new storage domain creation, when glusterfs is selected there
>>>>>> is a feature called 'use managed gluster volumes'; upon checking this,
>>>>>> all managed glusterfs volumes will be listed and you can choose the
>>>>>> volume of your choice from the dropdown list.
>>>>>>
>>>>>>     There is a conf file called /etc/ovirt-hosted-engine/hosted-engine.conf
>>>>>> where there is a parameter called backup-volfile-servers="h1:h2", and if
>>>>>> one of the gluster nodes goes down the engine uses this parameter to
>>>>>> provide HA / failover.
>>>>>>
>>>>>>  Hope this helps!
>>>>>>
>>>>>> Thanks
>>>>>> kasturi
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozleriii at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>> Hello -
>>>>>>
>>>>>> I have successfully created a hyperconverged hosted engine setup
>>>>>> consisting of 3 nodes - 2 for VMs and the third purely for storage. I
>>>>>> manually configured it all, did not use oVirt Node or anything, and built
>>>>>> the gluster volumes myself.
>>>>>>
>>>>>> However, I noticed that when setting up the hosted engine, and even
>>>>>> when adding a new storage domain with the glusterfs type, it still asks
>>>>>> for hostname:/volumename.
>>>>>>
>>>>>> This leads me to believe that if that one node goes down (ex:
>>>>>> node1:/data), then the oVirt engine won't be able to communicate with
>>>>>> that volume because it's trying to reach it on node1, and will thus go
>>>>>> down.
>>>>>>
>>>>>> I know the glusterfs FUSE client can connect to all nodes to provide
>>>>>> failover/HA, but how does the engine handle this?
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users at ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users