[ovirt-users] ovirt with glusterfs - big test - unwanted results
Yaniv Kaul
ykaul at redhat.com
Thu Mar 31 12:30:54 UTC 2016
Hi Pavel,
Thanks for the report. Can you begin with a more accurate description of
your environment?
Begin with host, oVirt and Gluster versions. Then continue with the exact
setup (what are 'A', 'B', 'C' - domains? Volumes? What is the mapping
between domains and volumes?).
Are there any logs you can share with us?
I'm sure with more information, we'd be happy to look at the issue.
Y.
On Thu, Mar 31, 2016 at 3:09 PM, paf1 at email.cz <paf1 at email.cz> wrote:
> Hello,
> we tried the following test - with unwanted results
>
> input:
> 5 node gluster
> A = replica 3 with arbiter 1 ( node1+node2+arbiter on node 5 )
> B = replica 3 with arbiter 1 ( node3+node4+arbiter on node 5 )
> C = distributed replica 3 arbiter 1 ( node1+node2, node3+node4, each
> arbiter on node 5)
> node 5 has only arbiter replica ( 4x )
>
> TEST:
> 1) directly reboot one node - OK ( it does not matter which one: data node
> or arbiter node )
> 2) directly reboot two nodes - OK ( if nodes are not from the same
> replica )
> 3) directly reboot three nodes - yes, this is the main problem and the
> source of our questions ....
> - rebooted all three nodes from replica "B" ( not very likely, but
> who knows ... )
> - all VMs with data on this replica were paused ( no data access ) - OK
> - all VMs running on replica "B" nodes were lost ( started manually later;
> their data is on other replicas ) - acceptable
> BUT
> - !!! all oVirt domains went down !! - master domain is on replica "A"
> which lost only one member of three !!!
> so we did not expect all domains to go down, especially the
> master with 2 live members.
>
> Results:
> - the whole cluster was unreachable until all domains were up, which
> depends on all nodes being up !!!
> - all paused VMs resumed - OK
> - the rest of the VMs rebooted and are running - OK
>
> Questions:
> 1) why are all domains down if the master domain ( on replica "A" ) has two
> running members ( 2 of 3 ) ??
> 2) how to recover from that collapse without waiting for all nodes to come
> up ? ( in the worst case, e.g. if a node has a HW error ) ??
> 3) which oVirt cluster policy can prevent that situation ?? ( if
> any )
>
> regs.
> Pavel
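
For reference, the brick layout described in the quoted test can be sketched as a small model. This is only an illustration, not something from the thread: the names (VOLUMES, replica_set_has_quorum, volume_status) are hypothetical, and it assumes that a replica set stays available while a strict majority of its three bricks (data or arbiter) is up. It looks at brick-level majority only and says nothing about how oVirt itself decides that a storage domain is down.

```python
# Hypothetical model of the layout in the quoted test (not a Gluster API).
# Each volume is a list of replica sets; each replica set is the set of
# nodes holding its bricks (two data bricks plus the arbiter on node5).
VOLUMES = {
    "A": [{"node1", "node2", "node5"}],
    "B": [{"node3", "node4", "node5"}],
    "C": [{"node1", "node2", "node5"}, {"node3", "node4", "node5"}],
}


def replica_set_has_quorum(bricks, down):
    """Assumption: quorum means a strict majority of bricks is still up."""
    up = len(bricks - down)
    return up * 2 > len(bricks)


def volume_status(down):
    """Return 'available', 'partial' or 'unavailable' for every volume."""
    status = {}
    for name, replica_sets in VOLUMES.items():
        ok = [replica_set_has_quorum(rs, down) for rs in replica_sets]
        if all(ok):
            status[name] = "available"
        elif any(ok):
            status[name] = "partial"
        else:
            status[name] = "unavailable"
    return status


if __name__ == "__main__":
    # The three scenarios from the test: one node, two nodes from
    # different replicas, and all three nodes of replica "B".
    for down in ({"node5"}, {"node1", "node3"}, {"node3", "node4", "node5"}):
        print(sorted(down), volume_status(down))
```

Under that assumption, rebooting node3, node4 and node5 leaves "A" with two of three bricks, so at the brick level the master domain's volume would still be expected to have quorum, which is what the report argues.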