[ovirt-users] hyperconverged question

Fri Sep 1 15:57:31 UTC 2017

Speaking of the "use managed gluster", I created this gluster setup under
ovirt 4.0 when that wasn't there.  I've gone into my settings and checked
the box and saved it at least twice, but when I go back into the storage
settings, its not checked again.

The "about" box in the gui reports that I'm using this version: oVirt
Engine Version: 4.1.1.8-1.el7.centos

I thought I was staying up to date, but I'm not sure if I'm doing
everything right on the upgrade...The documentation says to click for
hosted engine upgrade instructions, which takes me to a page not found
error...For several versions now, and I haven't found those instructions,
so I've been "winging it".

--Jim

On Fri, Sep 1, 2017 at 8:53 AM, Jim Kusznir <jim at palousetech.com> wrote:

> Huh...Ok., how do I convert the arbitrar to full replica, then?  I was
> misinformed when I created this setup.  I thought the arbitrator held
> enough metadata that it could validate or refudiate  any one replica (kinda
> like the parity drive for a RAID-4 array).  I was also under the impression
> that one replica  + Arbitrator is enough to keep the array online and
> functional.
>
> --Jim
>
> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozleriii at gmail.com>
> wrote:
>
>> @ Jim - you have only two data volumes and lost quorum. Arbitrator only
>> stores metadata, no actual files. So yes, you were running in degraded mode
>> so some operations were hindered.
>>
>> @ Sahina - Yes, this actually worked fine for me once I did that.
>> However, the issue I am still facing, is when I go to create a new gluster
>> storage domain (replica 3, hyperconverged) and I tell it "Host to use" and
>> I select that host. If I fail that host, all VMs halt. I do not recall this
>> in 3.6 or early 4.0. This to me makes it seem like this is "pinning" a node
>> to a volume and vice versa like you could, for instance, for a singular
>> hyperconverged to ex: export a local disk via NFS and then mount it via
>> ovirt domain. But of course, this has its caveats. To that end, I am using
>> gluster replica 3, when configuring it I say "host to use: " node 1, then
>> in the connection details I give it node1:/data. I fail node1, all VMs
>> halt. Did I miss something?
>>
>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sabose at redhat.com> wrote:
>>
>>> To the OP question, when you set up a gluster storage domain, you need
>>> to specify backup-volfile-servers=<server2>:<server3> where server2 and
>>> server3 also have bricks running. When server1 is down, and the volume is
>>> mounted again - server2 or server3 are queried to get the gluster volfiles.
>>>
>>> @Jim, if this does not work, are you using 4.1.5 build with libgfapi
>>> access? If not, please provide the vdsm and gluster mount logs to analyse
>>>
>>> If VMs go to paused state - this could mean the storage is not
>>> available. You can check "gluster volume status <volname>" to see if
>>> atleast 2 bricks are running.
>>>
>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <johan at kafit.se>
>>> wrote:
>>>
>>>> If gluster drops in quorum so that it has less votes than it should it
>>>> will stop file operations until quorum is back to normal.If i rember it
>>>> right you need two bricks to write for quorum to be met and that the
>>>> arbiter only is a vote to avoid split brain.
>>>>
>>>>
>>>> Basically what you have is a raid5 solution without a spare. And when
>>>> one disk dies it will run in degraded mode. And some raid systems will stop
>>>> the raid until you have removed the disk or forced it to run anyway.
>>>>
>>>> You can read up on it here: https://gluster.readthed
>>>> ocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>
>>>> /Johan
>>>>
>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>
>>>> Hi all:
>>>>
>>>> Sorry to hijack the thread, but I was about to start essentially the
>>>> same thread.
>>>>
>>>> I have a 3 node cluster, all three are hosts and gluster nodes (replica
>>>> 2 + arbitrar).  I DO have the mnt_options=backup-volfile-servers= set:
>>>>
>>>> storage=192.168.8.11:/engine
>>>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
>>>>
>>>> I had an issue today where 192.168.8.11 went down.  ALL VMs immediately
>>>> paused, including the engine (all VMs were running on host2:192.168.8.12).
>>>> I couldn't get any gluster stuff working until host1 (192.168.8.11) was
>>>> restored.
>>>>
>>>> What's wrong / what did I miss?
>>>>
>>>> (this was set up "manually" through the article on setting up
>>>> self-hosted gluster cluster back when 4.0 was new..I've upgraded it to 4.1
>>>> since).
>>>>
>>>> Thanks!
>>>> --Jim
>>>>
>>>>
>>>> On Thu, Aug 31, 2017 at 12:31 PM, Charles Kozler <ckozleriii at gmail.com>
>>>> wrote:
>>>>
>>>> Typo..."Set it up and then failed that **HOST**"
>>>>
>>>> And upon that host going down, the storage domain went down. I only
>>>> have hosted storage domain and this new one - is this why the DC went down
>>>> and no SPM could be elected?
>>>>
>>>> I dont recall this working this way in early 4.0 or 3.6
>>>>
>>>> On Thu, Aug 31, 2017 at 3:30 PM, Charles Kozler <ckozleriii at gmail.com>
>>>> wrote:
>>>>
>>>> So I've tested this today and I failed a node. Specifically, I setup a
>>>> glusterfs domain and selected "host to use: node1". Set it up and then
>>>> failed that VM
>>>>
>>>> However, this did not work and the datacenter went down. My engine
>>>> stayed up, however, it seems configuring a domain to pin to a host to use
>>>> will obviously cause it to fail
>>>>
>>>> This seems counter-intuitive to the point of glusterfs or any redundant
>>>> storage. If a single host has to be tied to its function, this introduces a
>>>> single point of failure
>>>>
>>>> Am I missing something obvious?
>>>>
>>>> On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <knarra at redhat.com>
>>>> wrote:
>>>>
>>>> yes, right.  What you can do is edit the hosted-engine.conf file and
>>>> there is a parameter as shown below [1] and replace h2 and h3 with your
>>>> second and third storage servers. Then you will need to restart
>>>> ovirt-ha-agent and ovirt-ha-broker services in all the nodes .
>>>>
>>>> [1] 'mnt_options=backup-volfile-servers=<h2>:<h3>'
>>>>
>>>> On Thu, Aug 31, 2017 at 5:54 PM, Charles Kozler <ckozleriii at gmail.com>
>>>> wrote:
>>>>
>>>> Hi Kasturi -
>>>>
>>>> Thanks for feedback
>>>>
>>>> > If cockpit+gdeploy plugin would be have been used then that would
>>>> have automatically detected glusterfs replica 3 volume created during
>>>> Hosted Engine deployment and this question would not have been asked
>>>>
>>>> Actually, doing hosted-engine --deploy it too also auto detects
>>>> glusterfs.  I know glusterfs fuse client has the ability to failover
>>>> between all nodes in cluster, but I am still curious given the fact that I
>>>> see in ovirt config node1:/engine (being node1 I set it to in hosted-engine
>>>> --deploy). So my concern was to ensure and find out exactly how engine
>>>> works when one node goes away and the fuse client moves over to the other
>>>> node in the gluster cluster
>>>>
>>>> But you did somewhat answer my question, the answer seems to be no (as
>>>> default) and I will have to use hosted-engine.conf and change the parameter
>>>> as you list
>>>>
>>>> So I need to do something manual to create HA for engine on gluster?
>>>> Yes?
>>>>
>>>> Thanks so much!
>>>>
>>>> On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <knarra at redhat.com>
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>>    During Hosted Engine setup question about glusterfs volume is being
>>>> asked because you have setup the volumes yourself. If cockpit+gdeploy
>>>> plugin would be have been used then that would have automatically detected
>>>> glusterfs replica 3 volume created during Hosted Engine deployment and this
>>>> question would not have been asked.
>>>>
>>>>    During new storage domain creation when glusterfs is selected there
>>>> is a feature called 'use managed gluster volumes' and upon checking this
>>>> all glusterfs volumes managed will be listed and you could choose the
>>>> volume of your choice from the dropdown list.
>>>>
>>>>     There is a conf file called /etc/hosted-engine/hosted-engine.conf
>>>> where there is a parameter called backup-volfile-servers="h1:h2" and if one
>>>> of the gluster node goes down engine uses this parameter to provide ha /
>>>> failover.
>>>>
>>>>  Hope this helps !!
>>>>
>>>> Thanks
>>>> kasturi
>>>>
>>>>
>>>>
>>>> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozleriii at gmail.com>
>>>> wrote:
>>>>
>>>> Hello -
>>>>
>>>> I have successfully created a hyperconverged hosted engine setup
>>>> consisting of 3 nodes - 2 for VM's and the third purely for storage. I
>>>> manually configured it all, did not use ovirt node or anything. Built the
>>>> gluster volumes myself
>>>>
>>>> However, I noticed that when setting up the hosted engine and even when
>>>> adding a new storage domain with glusterfs type, it still asks for
>>>> hostname:/volumename
>>>>
>>>> This leads me to believe that if that one node goes down (ex:
>>>> node1:/data), then ovirt engine wont be able to communicate with that
>>>> volume because its trying to reach it on node 1 and thus, go down
>>>>
>>>> I know glusterfs fuse client can connect to all nodes to provide
>>>> failover/ha but how does the engine handle this?
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing listUsers at ovirt.orghttp://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> Users at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/users/attachments/20170901/a63ad333/attachment.html>