On 10/19/2013 10:44 AM, Gianluca Cecchi wrote:
On Wed, Oct 16, 2013 at 1:01 PM, Itamar Heim wrote:
> Hope to meet many next week in LinuxCon Europe/KVM Forum/oVirt conference in
Unfortunately not me ;-(
> - LINBIT published "High-Availability oVirt-Cluster with
>  note: link requires registration:
> the pdf is:
About 3 years ago I successfully tested plain Qemu/KVM with two nodes
and drbd dual-primary with live migration on Fedora, so I know it
works very well.
I think the LINBIT guide gives many suggestions; one of them in
particular is just another way to get a highly available oVirt Engine,
for example, which is a hot topic lately.
Careful attention must be paid to the fencing mechanism though, as is
always the case when talking about drbd, pacemaker, oVirt.
I don't know it very well, but the drbd proxy feature (closed source
and not free) could then also be an option for DR in small
I'm going to experiment with the solution from the paper (possibly using
repo for cluster and so using corosync instead of
For comparison, in small environments where two nodes are enough, in my
opinion glusterfs is far from robust at the moment: with 3.4.1 on Fedora
19 I cannot put a host in maintenance and get back a workable node
after a normal reboot.
Are you referring to the absence of self-healing after maintenance?
Please see below as to why this should not happen. If there's a
reproducible test case, it would be good to track this bug at:
Didn't find any "safe" way for a node in
nodes config to leave the cluster and rejoin it without going crazy
manually resolving the brick differences generated in the meantime.
And this was with only one VM running.
Possibly I have to dig more into this, but I didn't find great resources
for it. Any tips/links are welcome.
The following commands might help:
1. gluster volume heal <volname> - perform a fast index based self-heal
2. gluster volume heal <volname> full - perform a "full" self-heal if 1.
does not work.
3. gluster volume heal <volname> info - provide information about
files/directories that still need healing.
Neither 1. nor 2. should be required in the normal course of operation,
as the proactive self-healing feature in glusterfs takes care of healing
after a node comes online.
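For cases where a manual heal does seem necessary, the steps above can be
sketched as a small shell helper. This is only a rough sketch: heal_volume
and the GLUSTER override variable are hypothetical names, not part of
glusterfs; only the three "gluster volume heal" invocations listed above
are real commands.

```shell
#!/bin/sh
# heal_volume VOLNAME [full]
# Hypothetical helper: trigger a self-heal, then print the heal info.
# GLUSTER may be set to override the binary (an assumption made here so
# the function can be dry-run, e.g. GLUSTER=echo).
heal_volume() {
    vol=$1
    mode=${2:-index}              # "index" (default) or "full"
    gluster=${GLUSTER:-gluster}

    if [ "$mode" = "full" ]; then
        # Step 2: full self-heal, for when the index-based one is not enough
        "$gluster" volume heal "$vol" full || return 1
    else
        # Step 1: fast, index-based self-heal
        "$gluster" volume heal "$vol" || return 1
    fi

    # Step 3: show the heal information for the volume
    "$gluster" volume heal "$vol" info
}
```

For example, "heal_volume myvol" runs the index-based heal and then prints
the heal info, while "heal_volume myvol full" does the same with a full heal.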
I would like oVirt to be more aware of this and to have some
capability to resolve by itself the misalignments generated on the
gluster backend during maintenance of a node.
At the moment it seems to me it only shows that volumes are ok in the
sense of started, but the bricks could be very different...
For example another tab with details about heal info, something like
the output of the commands
gluster volume heal $VOLUME info
gluster volume heal $VOLUME info split-brain
Yes, we are looking to build this for monitoring replicated gluster volumes.
so that if one finds 0 entries, he/she is calm and at the same time
doesn't risk being erroneously calm in the other scenario...
just my opinion.
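As a rough illustration of that "0 entries" check: assuming the heal info
output reports one "Number of entries: N" line per brick (treat the exact
format as an assumption, it may differ between glusterfs versions), the
per-brick counts could be summed like this. count_heal_entries is a
hypothetical helper name, not a gluster command.

```shell
#!/bin/sh
# count_heal_entries
# Reads "gluster volume heal <volname> info" output on stdin and prints
# the sum of all "Number of entries: N" lines (0 if there are none).
# The line format is an assumption about the CLI output.
count_heal_entries() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}
```

It would be used as "gluster volume heal $VOLUME info | count_heal_entries";
a result other than 0 means some brick still lists entries. The same filter
could be fed the output of the split-brain variant.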