I finally got around to running some tests on our environment. Are you
seeing the same behaviour, where when one host drops the VM ends up in a
paused state and can't be migrated?
In your case, you should be able to obtain full HA; quorum is just a
protection against split-brain. If you set the migration policy to migrate
all VMs, then in theory the VMs from the crashed node should migrate to
the other node once it sees the node is offline.
I was wondering if this may be because the VM reads directly from the
gluster storage server and there doesn't seem to be any failover. Would an
NFS solution with keepalived across the two servers fix this issue, since
the connection would be bound to a floating IP address rather than a single
gluster node? I'm not entirely familiar with how the libgfapi protocol
works. Could anyone else chime in?
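For reference, the keepalived idea would look roughly like the sketch below: a VRRP instance on each NFS server, with the standby taking over the virtual IP if the master fails. This is only a hedged illustration; the interface name, virtual_router_id, and IP address are placeholders, not values from our environment:

```
# /etc/keepalived/keepalived.conf on the primary NFS server
# (the backup uses state BACKUP and a lower priority, e.g. 90)
vrrp_instance NFS_VIP {
    state MASTER
    interface eth0            # assumption: adjust to your NIC
    virtual_router_id 51      # must match on both servers
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.1.100/24      # floating IP the NFS clients mount from
    }
}
```

The clients would then mount the NFS export via 192.168.1.100, so a server failure only causes a brief stall while VRRP fails the address over, instead of the paused-VM state we see with a direct gluster connection.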
On Fri, Jan 3, 2014 at 2:05 AM, <gregoire.leroy(a)retenodus.net> wrote:
I finally settled on the following configuration:
- Migration policy is "don't migrate"
- cluster.server-quorum-type is none
- cluster.quorum-type is none
When a host is down, a manual migration lets me use the other one.
Later, I'll add another host so that I get real HA.
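For anyone following along, the two quorum settings above are ordinary gluster volume options, so the configuration can be applied like this (VOLNAME is a placeholder for your actual volume name):

```
# Disable server-side quorum enforcement (brick processes stay up
# even when the cluster loses quorum)
gluster volume set VOLNAME cluster.server-quorum-type none

# Disable client-side quorum (writes are allowed even when only
# one replica is reachable)
gluster volume set VOLNAME cluster.quorum-type none
```

Note that with both quorum types disabled on a two-node replica, there is no split-brain protection at all, which is presumably why a third host is planned.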