I can confirm that I did set it up manually, and I did specify backupvol;
in the "Manage Domain" storage settings, under Mount Options, I do have
backup-volfile-servers=192.168.8.12:192.168.8.13 (and this was done at
initial install time).
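
For what it's worth, this is roughly how I've been checking that the option
actually reaches the running mount on a host (commands as I understand them;
happy to be corrected):

ps ax | grep '[g]lusterfs'    # each --volfile-server= entry is a failover target
grep glusterfs /proc/mounts   # the storage domain should be mounted via the fuse client
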
The "used managed gluster" checkbox is NOT checked, and if I check it and
save settings, next time I go in it is not checked.
--Jim
On Fri, Sep 1, 2017 at 2:08 PM, Charles Kozler <ckozleriii(a)gmail.com> wrote:
@ Jim - here is my setup, which I will test in a few (brand new cluster)
and report back what I find in my tests:
- 3x servers direct connected via 10Gb
- 2 of those 3 setup in ovirt as hosts
- Hosted engine
- Gluster replica 3 (no arbiter) for all volumes
- 1x engine volume gluster replica 3 manually configured (not using ovirt
managed gluster)
- 1x datatest volume (20gb) replica 3 manually configured (not using ovirt
managed gluster)
- 1x nfstest domain served from some other server in my infrastructure
which, at the time of my original testing, was master domain
I tested this earlier and all VMs stayed online. However, the ovirt cluster
reported the DC/cluster as down even though all VMs stayed up.
As I am now typing this, can you confirm you set up your gluster storage
domain with backupvol? Also, can you confirm you updated hosted-engine.conf
with the backupvol mount option as well?
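
i.e. something along these lines on each host (the conf path is as I have it
on my hosts - adjust if yours lives elsewhere):

grep mnt_options /etc/ovirt-hosted-engine/hosted-engine.conf
# should print something like:
#   mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
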
On Fri, Sep 1, 2017 at 4:22 PM, Jim Kusznir <jim(a)palousetech.com> wrote:
> So, after reading the first document twice and the 2nd link thoroughly
> once, I believe that the arbitrator volume should be sufficient and count
> for replica / split brain. EG, if any one full replica is down, and the
> arbitrator and the other replica is up, then it should have quorum and all
> should be good.
>
> I think my underlying problem has to do more with config than the replica
> state. That said, I did size the drive on my 3rd node planning to have an
> identical copy of all data on it, so I'm still not opposed to making it a
> full replica.
>
> Did I miss something here?
>
> Thanks!
>
> On Fri, Sep 1, 2017 at 11:59 AM, Charles Kozler <ckozleriii(a)gmail.com>
> wrote:
>
>> These can get a little confusing but this explains it best:
>>
>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#replica-2-and-replica-3-volumes
>>
>> Basically, in the first paragraph they are explaining why you can't have
>> HA with quorum for 2 nodes. Here is another overview doc that explains some
>> more:
>>
>>
>> http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/
>>
>> From my understanding, the arbiter is good for resolving split brains.
>> Quorum and arbiter are two different things, though: quorum is a mechanism
>> to help you **avoid** split brain, and the arbiter is there to help gluster
>> resolve split brain by voting and other internal mechanics (as outlined in
>> link 1). How did you create the volume exactly - what command? It looks to
>> me like you created it with 'gluster volume create replica 2 arbiter 1
>> {....}' per your earlier mention of "replica 2 arbiter 1". That being said,
>> if you did that and then set up quorum in the volume configuration, this
>> would cause your gluster to halt since quorum was lost (as you saw until
>> you recovered node 1)
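>>
>> For reference, I believe the create commands would look roughly like this
>> (brick paths are placeholders, and depending on the gluster version the
>> arbiter form may be 'replica 2 arbiter 1' or 'replica 3 arbiter 1'):
>>
>> # arbiter volume - the third brick holds metadata only
>> gluster volume create data replica 3 arbiter 1 \
>>     host1:/gluster/brick/data host2:/gluster/brick/data host3:/gluster/brick/data
>>
>> # full replica 3 - all three bricks hold data
>> gluster volume create data replica 3 \
>>     host1:/gluster/brick/data host2:/gluster/brick/data host3:/gluster/brick/data
>>
>> # and this shows which quorum settings are actually in effect
>> gluster volume get data cluster.quorum-type
>> gluster volume get data cluster.server-quorum-type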
>>
>> As you can see from the docs, there is still a corner case for getting
>> into split brain with replica 3, which, again, is where the arbiter would
>> help gluster resolve it
>>
>> I need to amend my previous statement: I was told that an arbiter volume
>> does not store data, only metadata. I cannot find anything in the docs
>> backing this up; however, it would make sense for it to be true. That being
>> said, in my setup, I would not include my arbiter or my third node in my
>> ovirt VM cluster component. I would keep it completely separate
>>
>>
>> On Fri, Sep 1, 2017 at 2:46 PM, Jim Kusznir <jim(a)palousetech.com> wrote:
>>
>>> I'm now also confused as to what the point of an arbiter is / what it
>>> does / why one would use it.
>>>
>>> On Fri, Sep 1, 2017 at 11:44 AM, Jim Kusznir <jim(a)palousetech.com>
>>> wrote:
>>>
>>>> Thanks for the help!
>>>>
>>>> Here's my gluster volume info for the data export/brick (I have 3:
>>>> data, engine, and iso, but they're all configured the same):
>>>>
>>>> Volume Name: data
>>>> Type: Replicate
>>>> Volume ID: e670c488-ac16-4dd1-8bd3-e43b2e42cc59
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: ovirt1.nwfiber.com:/gluster/brick2/data
>>>> Brick2: ovirt2.nwfiber.com:/gluster/brick2/data
>>>> Brick3: ovirt3.nwfiber.com:/gluster/brick2/data (arbiter)
>>>> Options Reconfigured:
>>>> performance.strict-o-direct: on
>>>> nfs.disable: on
>>>> user.cifs: off
>>>> network.ping-timeout: 30
>>>> cluster.shd-max-threads: 8
>>>> cluster.shd-wait-qlength: 10000
>>>> cluster.locking-scheme: granular
>>>> cluster.data-self-heal-algorithm: full
>>>> performance.low-prio-threads: 32
>>>> features.shard-block-size: 512MB
>>>> features.shard: on
>>>> storage.owner-gid: 36
>>>> storage.owner-uid: 36
>>>> cluster.server-quorum-type: server
>>>> cluster.quorum-type: auto
>>>> network.remote-dio: enable
>>>> cluster.eager-lock: enable
>>>> performance.stat-prefetch: off
>>>> performance.io-cache: off
>>>> performance.read-ahead: off
>>>> performance.quick-read: off
>>>> performance.readdir-ahead: on
>>>> server.allow-insecure: on
>>>> [root@ovirt1 ~]#
>>>>
>>>>
>>>> all 3 of my brick nodes ARE also members of the virtualization cluster
>>>> (including ovirt3). How can I convert it into a full replica instead of
>>>> just an arbiter?
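>>>>
>>>> (From what I've read so far, I think the conversion would be something
>>>> like dropping the arbiter brick and re-adding it as a full data brick,
>>>> but I'd like confirmation before I run anything - this is just a sketch
>>>> using my existing brick path:)
>>>>
>>>> gluster volume remove-brick data replica 2 \
>>>>     ovirt3.nwfiber.com:/gluster/brick2/data force
>>>> # wipe (or replace) the old brick directory, then add it back as a data brick
>>>> gluster volume add-brick data replica 3 \
>>>>     ovirt3.nwfiber.com:/gluster/brick2/data
>>>> gluster volume heal data full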
>>>>
>>>> Thanks!
>>>> --Jim
>>>>
>>>> On Fri, Sep 1, 2017 at 9:09 AM, Charles Kozler <ckozleriii(a)gmail.com>
>>>> wrote:
>>>>
>>>>> @Kasturi - Looks good now. Cluster showed down for a moment but VMs
>>>>> stayed up in their appropriate places. Thanks!
>>>>>
>>>>> <Anyone on this list please feel free to correct my response to Jim
>>>>> if it's wrong>
>>>>>
>>>>> @ Jim - If you can share your gluster volume info / status I can
>>>>> confirm (to the best of my knowledge). From my understanding, if you set
>>>>> up the volume with something like 'gluster volume set <vol> group virt'
>>>>> this will configure some quorum options as well, e.g.:
>>>>> http://i.imgur.com/Mya4N5o.png
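>>>>>
>>>>> (If you want to see what that group actually applied, I believe something
>>>>> like this would show it, 'data' being your volume name:)
>>>>>
>>>>> # quorum-related options currently in effect on the volume
>>>>> gluster volume get data all | grep quorum
>>>>> # the group definition itself lives on the gluster servers (path may vary)
>>>>> cat /var/lib/glusterd/groups/virt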
>>>>>
>>>>> While, yes, you are configured for an arbiter node, you're still losing
>>>>> quorum by dropping from 2 -> 1. You would need 4 nodes with 1 being the
>>>>> arbiter to configure quorum, which is in effect 3 writable nodes and 1
>>>>> arbiter. If one gluster node drops, you still have 2 up. Although in that
>>>>> case, you probably wouldn't need an arbiter at all
>>>>>
>>>>> If you are configured that way, you can drop the quorum settings and
>>>>> just let the arbiter run, since you're not using the arbiter node in your
>>>>> VM cluster part (I believe), just the storage cluster part. When using
>>>>> quorum, you need > 50% of the cluster being up at one time. Since you
>>>>> have 3 nodes with 1 arbiter, you're actually losing 1/2, which == 50%,
>>>>> which == a degraded / hindered gluster
>>>>>
>>>>> Again, this is to the best of my knowledge based on other quorum-backed
>>>>> software... and this is what I understand from testing with gluster and
>>>>> ovirt thus far
>>>>>
>>>>> On Fri, Sep 1, 2017 at 11:53 AM, Jim Kusznir <jim(a)palousetech.com>
>>>>> wrote:
>>>>>
>>>>>> Huh... OK, how do I convert the arbiter to a full replica, then? I
>>>>>> was misinformed when I created this setup. I thought the arbiter held
>>>>>> enough metadata that it could validate or repudiate any one replica
>>>>>> (kinda like the parity drive for a RAID-4 array). I was also under the
>>>>>> impression that one replica + arbiter is enough to keep the array online
>>>>>> and functional.
>>>>>>
>>>>>> --Jim
>>>>>>
>>>>>> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozleriii(a)gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> @ Jim - you have only two data bricks and lost quorum. The arbiter
>>>>>>> only stores metadata, no actual files. So yes, you were running in
>>>>>>> degraded mode, so some operations were hindered.
>>>>>>>
>>>>>>> @ Sahina - Yes, this actually worked fine for me once I did that.
>>>>>>> However, the issue I am still facing is when I go to create a new
>>>>>>> gluster storage domain (replica 3, hyperconverged), I tell it "Host to
>>>>>>> use", and I select that host: if I fail that host, all VMs halt. I do
>>>>>>> not recall this in 3.6 or early 4.0. This to me makes it seem like it is
>>>>>>> "pinning" a node to a volume and vice versa, like you could do, for
>>>>>>> instance, on a singular hyperconverged node by exporting a local disk
>>>>>>> via NFS and then mounting it via an ovirt domain. But of course, that
>>>>>>> has its caveats. To that end, I am using gluster replica 3; when
>>>>>>> configuring it I say "host to use: " node 1, then in the connection
>>>>>>> details I give it node1:/data. I fail node1, all VMs halt. Did I miss
>>>>>>> something?
>>>>>>>
>>>>>>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sabose(a)redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> To the OP question: when you set up a gluster storage domain, you
>>>>>>>> need to specify backup-volfile-servers=<server2>:<server3>, where
>>>>>>>> server2 and server3 also have bricks running. When server1 is down and
>>>>>>>> the volume is mounted again, server2 or server3 is queried to get the
>>>>>>>> gluster volfiles.
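>>>>>>>>
>>>>>>>> (As a sanity check, a manual mount with the same option should behave
>>>>>>>> the same way - server names here are placeholders:)
>>>>>>>>
>>>>>>>> mount -t glusterfs -o backup-volfile-servers=<server2>:<server3> \
>>>>>>>>     <server1>:/data /mnt/test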
>>>>>>>>
>>>>>>>> @Jim, if this does not work, are you using the 4.1.5 build with
>>>>>>>> libgfapi access? If not, please provide the vdsm and gluster mount
>>>>>>>> logs to analyse.
>>>>>>>>
>>>>>>>> If VMs go to a paused state, this could mean the storage is not
>>>>>>>> available. You can check "gluster volume status <volname>" to see if
>>>>>>>> at least 2 bricks are running.
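>>>>>>>>
>>>>>>>> For example, with the 'data' volume from earlier in the thread:
>>>>>>>>
>>>>>>>> gluster volume status data
>>>>>>>> # the "Online" column should show Y for at least two of the three
>>>>>>>> # bricks for writes to continue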
>>>>>>>>
>>>>>>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <johan(a)kafit.se>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> If gluster drops in quorum so that it has fewer votes than it
>>>>>>>>> should, it will stop file operations until quorum is back to normal.
>>>>>>>>> If I remember it right, you need two bricks writable for quorum to be
>>>>>>>>> met, and the arbiter only is a vote to avoid split brain.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Basically what you have is a RAID 5 solution without a spare: when
>>>>>>>>> one disk dies it will run in degraded mode, and some RAID systems
>>>>>>>>> will stop the array until you have removed the disk or forced it to
>>>>>>>>> run anyway.
>>>>>>>>>
>>>>>>>>> You can read up on it here:
>>>>>>>>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>>>>>>
>>>>>>>>> /Johan
>>>>>>>>>
>>>>>>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>>>>>>
>>>>>>>>> Hi all:
>>>>>>>>>
>>>>>>>>> Sorry to hijack the thread, but I was about to start essentially
>>>>>>>>> the same thread.
>>>>>>>>>
>>>>>>>>> I have a 3-node cluster; all three are hosts and gluster nodes
>>>>>>>>> (replica 2 + arbiter). I DO have the
>>>>>>>>> mnt_options=backup-volfile-servers= set:
>>>>>>>>>
>>>>>>>>> storage=192.168.8.11:/engine
>>>>>>>>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
>>>>>>>>>
>>>>>>>>> I had an issue today where 192.168.8.11 went down. ALL VMs
>>>>>>>>> immediately paused, including the engine (all VMs were running on
>>>>>>>>> host2:192.168.8.12). I couldn't get any gluster stuff working until
>>>>>>>>> host1 (192.168.8.11) was restored.
>>>>>>>>>
>>>>>>>>> What's wrong / what did I miss?
>>>>>>>>>
>>>>>>>>> (This was set up "manually" through the article on setting up a
>>>>>>>>> self-hosted gluster cluster back when 4.0 was new... I've upgraded it
>>>>>>>>> to 4.1 since.)
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> --Jim
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 12:31 PM, Charles Kozler <ckozleriii(a)gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Typo... "Set it up and then failed that **HOST**"
>>>>>>>>>
>>>>>>>>> And upon that host going down, the storage domain went down. I only
>>>>>>>>> have the hosted storage domain and this new one - is this why the DC
>>>>>>>>> went down and no SPM could be elected?
>>>>>>>>>
>>>>>>>>> I don't recall this working this way in early 4.0 or 3.6
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 3:30 PM, Charles Kozler <ckozleriii(a)gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> So I've tested this today and I failed a node. Specifically, I set
>>>>>>>>> up a glusterfs domain and selected "host to use: node1". Set it up
>>>>>>>>> and then failed that VM
>>>>>>>>>
>>>>>>>>> However, this did not work and the datacenter went down. My engine
>>>>>>>>> stayed up; however, it seems configuring a domain to pin to a "host
>>>>>>>>> to use" will obviously cause it to fail
>>>>>>>>>
>>>>>>>>> This seems counter-intuitive to the point of glusterfs or any
>>>>>>>>> redundant storage. If a single host has to be tied to its function,
>>>>>>>>> this introduces a single point of failure
>>>>>>>>>
>>>>>>>>> Am I missing something obvious?
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <knarra(a)redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Yes, right. What you can do is edit the hosted-engine.conf file,
>>>>>>>>> where there is a parameter as shown below [1], and replace h2 and h3
>>>>>>>>> with your second and third storage servers. Then you will need to
>>>>>>>>> restart the ovirt-ha-agent and ovirt-ha-broker services on all the
>>>>>>>>> nodes.
>>>>>>>>>
>>>>>>>>> [1] 'mnt_options=backup-volfile-servers=<h2>:<h3>'
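>>>>>>>>>
>>>>>>>>> (For example, after editing the file on each node - assuming systemd
>>>>>>>>> hosts:)
>>>>>>>>>
>>>>>>>>> systemctl restart ovirt-ha-broker ovirt-ha-agent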
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 5:54 PM, Charles Kozler <ckozleriii(a)gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Kasturi -
>>>>>>>>>
>>>>>>>>> Thanks for feedback
>>>>>>>>>
>>>>>>>>> > If the cockpit+gdeploy plugin had been used, then it would have
>>>>>>>>> > automatically detected the glusterfs replica 3 volume created
>>>>>>>>> > during Hosted Engine deployment and this question would not have
>>>>>>>>> > been asked
>>>>>>>>>
>>>>>>>>> Actually, hosted-engine --deploy also auto-detects glusterfs. I know
>>>>>>>>> the glusterfs fuse client has the ability to fail over between all
>>>>>>>>> nodes in the cluster, but I am still curious given the fact that I
>>>>>>>>> see node1:/engine in the ovirt config (node1 being what I set it to
>>>>>>>>> in hosted-engine --deploy). So my concern was to ensure and find out
>>>>>>>>> exactly how the engine works when one node goes away and the fuse
>>>>>>>>> client moves over to the other node in the gluster cluster
>>>>>>>>>
>>>>>>>>> But you did somewhat answer my question: the answer seems to be no
>>>>>>>>> (by default) and I will have to use hosted-engine.conf and change the
>>>>>>>>> parameter as you list
>>>>>>>>>
>>>>>>>>> So I need to do something manual to create HA for the engine on
>>>>>>>>> gluster? Yes?
>>>>>>>>>
>>>>>>>>> Thanks so much!
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <knarra(a)redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> During Hosted Engine setup, the question about the glusterfs volume
>>>>>>>>> is being asked because you have set up the volumes yourself. If the
>>>>>>>>> cockpit+gdeploy plugin had been used, then it would have automatically
>>>>>>>>> detected the glusterfs replica 3 volume created during Hosted Engine
>>>>>>>>> deployment and this question would not have been asked.
>>>>>>>>>
>>>>>>>>> During new storage domain creation, when glusterfs is selected there
>>>>>>>>> is a feature called 'use managed gluster volumes', and upon checking
>>>>>>>>> this, all managed glusterfs volumes will be listed and you can choose
>>>>>>>>> the volume of your choice from the dropdown list.
>>>>>>>>>
>>>>>>>>> There is a conf file called /etc/hosted-engine/hosted-engine.conf
>>>>>>>>> where there is a parameter called backup-volfile-servers="h1:h2",
>>>>>>>>> and if one of the gluster nodes goes down the engine uses this
>>>>>>>>> parameter to provide ha / failover.
>>>>>>>>>
>>>>>>>>> Hope this helps !!
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> kasturi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozleriii(a)gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hello -
>>>>>>>>>
>>>>>>>>> I have successfully created a hyperconverged hosted engine setup
>>>>>>>>> consisting of 3 nodes - 2 for VMs and the third purely for storage. I
>>>>>>>>> manually configured it all, did not use ovirt node or anything. Built
>>>>>>>>> the gluster volumes myself
>>>>>>>>>
>>>>>>>>> However, I noticed that when setting up the hosted engine, and even
>>>>>>>>> when adding a new storage domain of glusterfs type, it still asks for
>>>>>>>>> hostname:/volumename
>>>>>>>>>
>>>>>>>>> This leads me to believe that if that one node goes down (ex:
>>>>>>>>> node1:/data), then the ovirt engine won't be able to communicate with
>>>>>>>>> that volume because it's trying to reach it on node1, and the volume
>>>>>>>>> will thus go down
>>>>>>>>>
>>>>>>>>> I know the glusterfs fuse client can connect to all nodes to provide
>>>>>>>>> failover/ha, but how does the engine handle this?
>>>>>>>>>