[Users] Stopping glusterfsd service shut down data center

Amedeo Salvati amedeo at oscert.net
Mon Jan 6 15:00:04 UTC 2014


On 06/01/2014 10:44, Vijay Bellur wrote:
> Adding gluster-users.
>
> On 01/06/2014 12:25 AM, Amedeo Salvati wrote:
>> Hi all,
>>
>> I'm testing ovirt+glusterfs with only two nodes for everything (engine,
>> glusterfs, hypervisors), on CentOS 6.5 hosts, following the guides at:
>>
>> http://community.redhat.com/blog/2013/09/up-and-running-with-ovirt-3-3/
>> http://www.gluster.org/2013/09/ovirt-3-3-glusterized/
>>
>> but with a few changes, like setting the glusterfs parameter
>> cluster.server-quorum-ratio to 50% (to prevent glusterfs from going down
>> if one node goes down) and adding "option base-port 50152" to
>> /etc/glusterfs/glusterd.vol (to avoid a port conflict with libvirt).
>>
>> So, with the above parameters I was able to stop/reboot the node that is
>> not used to directly mount glusterfs (e.g. lovhm002), but when I
>> stop/reboot the node that is used to mount glusterfs (e.g. lovhm001),
>> the whole data center goes down, especially when I stop the glusterfsd
>> service (not the glusterd service!!!). The glusterfs volume stays alive
>> and reachable on the surviving node lovhm002, but ovirt/libvirt marks
>> the DC/storage in error.
>>
>> Do you have any ideas on how to configure the DC/Cluster in ovirt so
>> that it stays up when the node used to mount glusterfs goes down?
>
>
> This seems to be due to client quorum in glusterfs. It can be observed 
> that client quorum is on, since option cluster.quorum-type has been set 
> to the value "auto".
>
> Client quorum gets enabled as part of the "Optimize for Virt" action in 
> oVirt, or by applying "volume set <volname> group virt" in the gluster 
> CLI. It is enabled by default to provide additional protection against 
> split-brains. For a gluster volume with replica count > 2, client 
> quorum returns an error if writes/updates fail on more than 50% of the 
> bricks. However, when the replica count happens to be 2, updates fail 
> if the first server/glusterfsd is not online.
>
> If the chances of a network partition and a split-brain are not 
> significant in your setup, you can turn off client quorum by setting 
> option cluster.quorum-type to the value "none".
yea!
Setting quorum-type to none has solved the issue... now I'm able to have 
a two-node ovirt/gluster DC that survives the failure of one node, except 
for the engine portal, which at the moment resides on only one node 
(sigh :-(  I'm waiting for the self-hosted engine).
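
For anyone who wants to reproduce this, the changes boil down to something 
like the following (the volume name "vol01" is just a placeholder for your 
own data volume; adjust it to your setup):

  # turn off client quorum, as Vijay suggested (volume name is a placeholder)
  gluster volume set vol01 cluster.quorum-type none

  # the tweak from my first mail: cluster-wide server quorum ratio at 50%
  gluster volume set all cluster.server-quorum-ratio 50%

  # and in /etc/glusterfs/glusterd.vol, to avoid the libvirt port conflict:
  #   option base-port 50152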

thanks1k
a

>
> Regards,
> Vijay
>


-- 
Amedeo Salvati
RHC{DS,E,VA} - LPIC-3 - UCP - NCLA 11
email: amedeo at oscert.net
email: amedeo at linux.com
http://plugcomputing.it/redhatcert.php
http://plugcomputing.it/lpicert.php



