On Sat, Dec 21, 2013 at 11:56 PM, Grégoire Leroy <gregoire.leroy@retenodus.net> wrote:
Hello,

> > > If you disable quorum then you won't have the issue of "read only"
> > > when you lose a host, but you won't have protection from split brain
> > > (if your two hosts lose network connectivity). VMs will keep writing
> > > to the hosts; as you have the gluster server and client on the same
> > > host, this is inevitable.

> > Ok, I get the problem caused by disabling the quorum. So, what if, while
> > I have two hosts, the lack of HA is not so dramatic, but it will be
> > necessary when I have more hosts (3 or 4)? Here is the scenario I would
> > like to have:
> Quorum generally requires 3 hosts, I believe the default configuration when
> you press "Optimize for virt store" will require a minimum of 2 bricks
> connected before writing is allowed.

Ok, if I understand correctly, the quorum thing is very specific to gluster
(bricks) and not to ovirt (hosts). So maybe what I need is just another
gluster server, with very little space, on a dummy VM (not hosted by an ovirt
host but outside of my cluster) to add as a brick. It wouldn't be used at
all, just to check connectivity.

Then, if a host loses connectivity, it can reach neither the real gluster
server nor the "dummy" one, and so it doesn't run VMs. The other host, which
is able to reach the dummy one, becomes the SPM (the dummy wouldn't have a
vdsm server, so it couldn't become SPM itself) and runs the VMs.

Maybe this way I could have HA with two hosts, right? Is there a reason it
shouldn't work?
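
In gluster terms, I imagine the setup would be roughly this (a rough,
untested sketch; the dummy hostname and the volume name DATA are just
examples):

# from one of the real hosts, add the dummy machine to the trusted pool
gluster peer probe dummy.example.com
# make peer membership count, i.e. enforce server-side quorum on the volume
gluster volume set DATA cluster.server-quorum-type server
# check that all three peers are connected
gluster peer status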
 
I guess this would work, as quorum is based on how many peers are in the
cluster. Actually quite a good idea, and I'd love to hear from you on how it
goes. I'd be interested to see how gluster will work with this, though; I
assume it has to be a part of the volume. If you're doing distribute-replicate,
I think this "dummy" VM will need to hold the full replicated data?
cluster.server-quorum-ratio - this is a percentage > 50. If the volume is not
set with any ratio, the equation for quorum is:

    active_peer_count > 50% of all peers in the cluster

When a percentage (P) is specified, the equation becomes:

    active_peer_count >= P% of all the befriended peers in the cluster
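
For your case that would be something like this (untested; assuming your
volume is called DATA, as below):

# enable server-side quorum on the volume
gluster volume set DATA cluster.server-quorum-type server
# the ratio is a cluster-wide option, hence "all"
gluster volume set all cluster.server-quorum-ratio 51%

With three peers in the pool (two real hosts plus the dummy), the surviving
host and the dummy make 2/3 of the peers, which is above 51%, so that side
keeps writing; the isolated host sees only 1/3 and drops below quorum.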

> > 1) I have two hosts: HOSTA and HOSTB. They have glusterfs bricks
> > configured as Distributed Replicated and data is replicated.
> > => For now, I'm totally OK with the fact that if a node fails, then the
> > VMs on this host are stopped and unreachable. However, I would like that
> > if a node fails, the DC keeps running so that VMs on the other host are
> > not stopped, and a human intervention makes it possible to start the VMs
> > on the other host. Would that be possible without disabling the quorum?
>
> For the 2 host scenario, disabling quorum will allow you to do this.

Unfortunately, not in all cases. If the network interface the glusterfs
servers use to reach each other goes down, I get the following behaviour:

1) HOSTB, on which the VMs run, detects that HOSTA's brick is unreachable, so
it keeps running them. Fine.
2) HOSTA detects that HOSTB's brick is unreachable, so it starts to run the
VMs => split brain. If the network interfaces used not for management of the
cluster but for the VMs are OK, I'm going to have a split network.
3) Conclusion: the failure of HOSTA has an impact on the VMs of HOSTB.

Does this scenario seem correct to you, or have I missed something? Maybe
power management could avoid this issue.
Yes, you'll need power management, which they call "fencing". It ensures that
a host which has dropped from the cluster is sent for a reboot, so any VMs
running on it are shut off immediately and are ready to be brought up on
another ovirt host.
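
If your hosts have IPMI, you can sanity-check the fence device from a shell
before wiring it into ovirt (the address and credentials here are
placeholders):

fence_ipmilan -a 192.0.2.10 -l admin -p secret -o status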

> > 2) In a few months, I'll add two other hosts to the glusterfs volume.
> > Their bricks will be replicated.
> > => At that time, I would like to be able to evolve my architecture
> > (without shutting down my VMs and exporting/importing them on a new
> > cluster) so that if a node fails, the VMs on that host start to run on
> > the other host of the same replica (without manual intervention).
>
> Later on you just enable quorum; it's only a setting on the gluster volume:
> gluster volume set DATA cluster.quorum-type auto
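
When the two new hosts arrive, growing a distribute-replicate volume is an
add-brick plus a rebalance, roughly like this (hostnames and brick paths are
examples):

# adds a second replica pair as a new distribute leg
gluster volume add-brick DATA HOST3:/export/data HOST4:/export/data
# spread existing files across the new bricks
gluster volume rebalance DATA start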

Thank you,
Regards,
Grégoire Leroy