[ovirt-users] hyperconverged question

Charles Kozler ckozleriii at gmail.com
Wed Sep 13 00:42:42 UTC 2017


The same applies to my engine storage domain. Shouldn't we see the mount
options in the mount -l output? It appears fault tolerance worked (sort of -
see more below) during my test.

[root at appovirtp01 ~]# grep -i mnt_options
/etc/ovirt-hosted-engine/hosted-engine.conf
mnt_options=backup-volfile-servers=n2:n3

[root at appovirtp02 ~]# grep -i mnt_options
/etc/ovirt-hosted-engine/hosted-engine.conf
mnt_options=backup-volfile-servers=n2:n3

[root at appovirtp03 ~]# grep -i mnt_options
/etc/ovirt-hosted-engine/hosted-engine.conf
mnt_options=backup-volfile-servers=n2:n3

Meanwhile, the option is not visible in the mount -l output:

[root at appovirtp01 ~]# mount -l | grep -i n1:/engine
n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

[root at appovirtp02 ~]# mount -l | grep -i n1:/engine
n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

[root at appovirtp03 ~]# mount -l | grep -i n1:/engine
n1:/engine on /rhev/data-center/mnt/glusterSD/n1:_engine type
fuse.glusterfs
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
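
It looks like the gluster-specific options never show up in mount -l output
(only the generic FUSE options do); if backup-volfile-servers was passed
through, it should instead show up as extra --volfile-server arguments on the
glusterfs client process. A quick check, assuming the mount helper translates
the option that way:

ps ax | grep -i volfile-server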

Since everything is "pointed" at node 1 for engine storage, I decided to hard
shut down node 1 while the hosted engine VM was running on node 3.

The result was that after ~30 seconds the engine crashed, likely because of
the gluster 42-second timeout. The hosted engine VM came back up (with node
1 still down) after about 5-7 minutes.
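
For reference, 42 seconds is gluster's default network.ping-timeout, which
would line up with the crash. A sketch of how to check the current value,
assuming the engine volume is simply named "engine" as above:

gluster volume get engine network.ping-timeout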

Is it expected for the VM to go down? I thought the gluster FUSE client
mounted all bricks in the volume
(http://lists.gluster.org/pipermail/gluster-users/2015-May/021989.html), so I
would have expected this to be more seamless.




On Tue, Sep 12, 2017 at 7:04 PM, Charles Kozler <ckozleriii at gmail.com>
wrote:

> Hey All -
>
> So I haven't tested this yet, but what I do know is that I did set the
> backup-volfile-servers option when I added the data gluster volume; however,
> the mount options in mount -l do not show it as being used
>
> n1:/data on /rhev/data-center/mnt/glusterSD/n1:_data type fuse.glusterfs
> (rw,relatime,user_id=0,group_id=0,default_permissions,
> allow_other,max_read=131072)
>
> I will delete it and re-add it, but I think this might be part of the
> problem. Perhaps Jim and I have the same issue because oVirt is actually
> not passing the additional mount options from the web UI to the backend to
> mount with said parameters?
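>
> One way to check, assuming the default vdsm log location, would be to grep
> vdsm's log for the mount options it was handed, e.g.:
>
> grep -i backup-volfile /var/log/vdsm/vdsm.log
>
> If nothing turns up there, the option never made it past the UI.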
>
> Thoughts?
>
> On Mon, Sep 4, 2017 at 10:51 AM, FERNANDO FREDIANI <
> fernando.frediani at upx.com> wrote:
>
>> I had the very same impression. It doesn't look like it works, then.
>> So for a fully redundant setup where you can lose a complete host, you must
>> have at least 3 nodes?
>>
>> Fernando
>>
>> On 01/09/2017 12:53, Jim Kusznir wrote:
>>
>> Huh... OK, how do I convert the arbiter to a full replica, then?  I was
>> misinformed when I created this setup.  I thought the arbiter held
>> enough metadata that it could validate or repudiate any one replica (kind
>> of like the parity drive for a RAID-4 array).  I was also under the
>> impression that one replica + arbiter is enough to keep the array online
>> and functional.
>>
>> --Jim
>>
>> On Fri, Sep 1, 2017 at 5:22 AM, Charles Kozler <ckozleriii at gmail.com>
>> wrote:
>>
>>> @Jim - you have only two data bricks and lost quorum. The arbiter only
>>> stores metadata, no actual file data. So yes, you were running in degraded
>>> mode, so some operations were hindered.
>>>
>>> @Sahina - Yes, this actually worked fine for me once I did that.
>>> However, the issue I am still facing is when I go to create a new gluster
>>> storage domain (replica 3, hyperconverged) and I tell it "Host to use" and
>>> select that host. If I fail that host, all VMs halt. I do not recall this
>>> in 3.6 or early 4.0. This makes it seem like it is "pinning" a node to a
>>> volume and vice versa, similar to how, for a single-node hyperconverged
>>> setup, you could export a local disk via NFS and then mount it as an oVirt
>>> domain. But of course, that has its caveats. To that end, I am using
>>> gluster replica 3; when configuring it I say "host to use:" node 1, then
>>> in the connection details I give it node1:/data. I fail node1 and all VMs
>>> halt. Did I miss something?
>>>
>>> On Fri, Sep 1, 2017 at 2:13 AM, Sahina Bose <sabose at redhat.com> wrote:
>>>
>>>> To the OP question, when you set up a gluster storage domain, you need
>>>> to specify backup-volfile-servers=<server2>:<server3> where server2
>>>> and server3 also have bricks running. When server1 is down and the volume
>>>> is mounted again, server2 or server3 is queried to get the gluster
>>>> volfiles.
>>>>
>>>> @Jim, if this does not work, are you using the 4.1.5 build with libgfapi
>>>> access? If not, please provide the vdsm and gluster mount logs to analyse.
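>>>>
>>>> Assuming default locations, that would be /var/log/vdsm/vdsm.log plus the
>>>> FUSE mount log under /var/log/glusterfs/, which is named after the mount
>>>> point, e.g.:
>>>>
>>>> ls /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log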
>>>>
>>>> If VMs go into a paused state, this could mean the storage is not
>>>> available. You can check "gluster volume status <volname>" to see if
>>>> at least 2 bricks are running.
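>>>>
>>>> For the volumes mentioned in this thread that would be, for example:
>>>>
>>>> gluster volume status engine
>>>> gluster volume status data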
>>>>
>>>> On Fri, Sep 1, 2017 at 11:31 AM, Johan Bernhardsson <johan at kafit.se>
>>>> wrote:
>>>>
>>>>> If gluster drops below quorum, so that it has fewer votes than it
>>>>> should, it will stop file operations until quorum is back to normal. If
>>>>> I remember it right, you need two bricks writable for quorum to be met,
>>>>> and the arbiter is only a vote to avoid split brain.
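>>>>>
>>>>> You can see how quorum is configured on a volume with, for example
>>>>> (using the data volume name from this thread):
>>>>>
>>>>> gluster volume get data cluster.quorum-type
>>>>> gluster volume get data cluster.server-quorum-type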
>>>>>
>>>>>
>>>>> Basically, what you have is a RAID 5 solution without a spare. When one
>>>>> disk dies it will run in degraded mode, and some RAID systems will stop
>>>>> the array until you have removed the disk or forced it to run anyway.
>>>>>
>>>>> You can read up on it here:
>>>>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>>
>>>>> /Johan
>>>>>
>>>>> On Thu, 2017-08-31 at 22:33 -0700, Jim Kusznir wrote:
>>>>>
>>>>> Hi all:
>>>>>
>>>>> Sorry to hijack the thread, but I was about to start essentially the
>>>>> same thread.
>>>>>
>>>>> I have a 3-node cluster; all three are hosts and gluster nodes
>>>>> (replica 2 + arbiter).  I DO have mnt_options=backup-volfile-servers=
>>>>> set:
>>>>>
>>>>> storage=192.168.8.11:/engine
>>>>> mnt_options=backup-volfile-servers=192.168.8.12:192.168.8.13
>>>>>
>>>>> I had an issue today where 192.168.8.11 went down.  ALL VMs
>>>>> immediately paused, including the engine (all VMs were running on
>>>>> host2:192.168.8.12).  I couldn't get any gluster stuff working until host1
>>>>> (192.168.8.11) was restored.
>>>>>
>>>>> What's wrong / what did I miss?
>>>>>
>>>>> (This was set up "manually" through the article on setting up a
>>>>> self-hosted gluster cluster back when 4.0 was new... I've upgraded it to
>>>>> 4.1 since.)
>>>>>
>>>>> Thanks!
>>>>> --Jim
>>>>>
>>>>>
>>>>> On Thu, Aug 31, 2017 at 12:31 PM, Charles Kozler <ckozleriii at gmail.com
>>>>> > wrote:
>>>>>
>>>>> Typo..."Set it up and then failed that **HOST**"
>>>>>
>>>>> And upon that host going down, the storage domain went down. I only
>>>>> have the hosted-engine storage domain and this new one - is this why the
>>>>> DC went down and no SPM could be elected?
>>>>>
>>>>> I don't recall it working this way in early 4.0 or 3.6
>>>>>
>>>>> On Thu, Aug 31, 2017 at 3:30 PM, Charles Kozler <ckozleriii at gmail.com>
>>>>> wrote:
>>>>>
>>>>> So I've tested this today and I failed a node. Specifically, I set up a
>>>>> glusterfs domain and selected "host to use: node1". Set it up and then
>>>>> failed that VM
>>>>>
>>>>> However, this did not work and the datacenter went down. My engine
>>>>> stayed up; however, it seems configuring a domain pinned to a single
>>>>> "host to use" will obviously cause it to fail
>>>>>
>>>>> This seems counter-intuitive to the point of glusterfs or any
>>>>> redundant storage. If a single host has to be tied to its function, this
>>>>> introduces a single point of failure
>>>>>
>>>>> Am I missing something obvious?
>>>>>
>>>>> On Thu, Aug 31, 2017 at 9:43 AM, Kasturi Narra <knarra at redhat.com>
>>>>> wrote:
>>>>>
>>>>> Yes, right. What you can do is edit the hosted-engine.conf file; there
>>>>> is a parameter as shown below [1]. Replace h2 and h3 with your second
>>>>> and third storage servers. Then you will need to restart the
>>>>> ovirt-ha-agent and ovirt-ha-broker services on all the nodes.
>>>>>
>>>>> [1] 'mnt_options=backup-volfile-servers=<h2>:<h3>'
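>>>>>
>>>>> After editing, the services can be restarted on each node with, for
>>>>> example (assuming systemd-based hosts):
>>>>>
>>>>> systemctl restart ovirt-ha-agent ovirt-ha-broker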
>>>>>
>>>>> On Thu, Aug 31, 2017 at 5:54 PM, Charles Kozler <ckozleriii at gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi Kasturi -
>>>>>
>>>>> Thanks for feedback
>>>>>
>>>>> > If the cockpit+gdeploy plugin had been used, then it would have
>>>>> automatically detected the glusterfs replica 3 volume created during
>>>>> Hosted Engine deployment and this question would not have been asked
>>>>>
>>>>> Actually, hosted-engine --deploy also auto-detects glusterfs. I know
>>>>> the glusterfs fuse client has the ability to fail over between all nodes
>>>>> in the cluster, but I am still curious given that I see node1:/engine in
>>>>> the ovirt config (node1 being what I set it to in hosted-engine
>>>>> --deploy). So my concern was to find out exactly how the engine behaves
>>>>> when one node goes away and the fuse client moves over to the other node
>>>>> in the gluster cluster
>>>>>
>>>>> But you did somewhat answer my question: the answer seems to be no (by
>>>>> default) and I will have to edit hosted-engine.conf and change the
>>>>> parameter as you list
>>>>>
>>>>> So I need to do something manual to create HA for engine on gluster?
>>>>> Yes?
>>>>>
>>>>> Thanks so much!
>>>>>
>>>>> On Thu, Aug 31, 2017 at 3:03 AM, Kasturi Narra <knarra at redhat.com>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>    During Hosted Engine setup, the question about the glusterfs volume
>>>>> is asked because you set up the volumes yourself. If the cockpit+gdeploy
>>>>> plugin had been used, then it would have automatically detected the
>>>>> glusterfs replica 3 volume created during Hosted Engine deployment and
>>>>> this question would not have been asked.
>>>>>
>>>>>    During new storage domain creation, when glusterfs is selected,
>>>>> there is a feature called 'use managed gluster volumes'; upon checking
>>>>> this, all managed glusterfs volumes will be listed and you can choose
>>>>> the volume of your choice from the dropdown list.
>>>>>
>>>>>     There is a conf file called
>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf with a parameter
>>>>> mnt_options=backup-volfile-servers="h1:h2"; if one of the gluster nodes
>>>>> goes down, the engine uses this parameter to provide HA / failover.
>>>>>
>>>>>  Hope this helps !!
>>>>>
>>>>> Thanks
>>>>> kasturi
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Aug 30, 2017 at 8:09 PM, Charles Kozler <ckozleriii at gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello -
>>>>>
>>>>> I have successfully created a hyperconverged hosted engine setup
>>>>> consisting of 3 nodes - 2 for VMs and the third purely for storage. I
>>>>> manually configured it all, did not use oVirt Node or anything, and
>>>>> built the gluster volumes myself.
>>>>>
>>>>> However, I noticed that when setting up the hosted engine and even
>>>>> when adding a new storage domain with glusterfs type, it still asks for
>>>>> hostname:/volumename
>>>>>
>>>>> This leads me to believe that if that one node goes down (e.g.
>>>>> node1:/data), then the ovirt engine won't be able to communicate with
>>>>> that volume because it's trying to reach it on node 1, and will thus go
>>>>> down
>>>>>
>>>>> I know the glusterfs fuse client can connect to all nodes to provide
>>>>> failover/HA, but how does the engine handle this?
>>>>>