On 10/19/2013 10:44 AM, Gianluca Cecchi wrote:
On Wed, Oct 16, 2013 at 1:01 PM, Itamar Heim wrote:
> Hope to meet many next week in LinuxCon Europe/KVM Forum/oVirt conference in
> Edinburgh
>
Unfortunately not me ;-(
> - LINBIT published "High-Availability oVirt-Cluster with
> iSCSI-Storage"[9]
>
> [9] note: link requires registration:
>
http://www.linbit.com/en/company/news/333-high-available-virtualization-a...
> the pdf is:
>
http://www.linbit.com/en/downloads/tech-guides?download=68:high-availabil...
About 3 years ago I successfully tested plain Qemu/KVM with two nodes
and drbd dual-primary with live migration on Fedora, so I know it
works very well.
I think the LINBIT guide offers many useful suggestions; in particular, it
describes yet another way to get a highly available oVirt Engine, which
has been a hot topic lately.
Deep attention must be paid to the fencing mechanism though, as is always
the case when talking about drbd, pacemaker, and oVirt.
I don't know it very well, but the drbd proxy feature (closed source
and not free) could then also be an option for disaster recovery in
small environments.
I'm going to experiment with the solution from the paper (possibly using
the clusterlabs.org repo for the cluster stack, i.e. corosync instead of
heartbeat).
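For reference, a dual-primary setup with fencing roughly looks like the
fragment below. This is a minimal sketch assuming DRBD 8.4 syntax; the
node names, disks, addresses, and handler paths are placeholders/typical
defaults, not taken from the LINBIT guide:

```
# Hypothetical dual-primary DRBD resource (8.4-style syntax).
resource r0 {
  protocol C;                      # synchronous replication, required for dual-primary
  net {
    allow-two-primaries yes;       # both nodes may be Primary (needed for live migration)
    fencing resource-and-stonith;  # freeze I/O and fence the peer on connection loss
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.1.1:7788;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.1.2:7788;
    meta-disk internal;
  }
}
```

Without working fencing, allow-two-primaries is a fast route to
split-brain, which is exactly why the guide (and pacemaker setups in
general) insist on it.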
By comparison, for small environments where two nodes are enough,
glusterfs is in my opinion far from robust at the moment: with 3.4.1 on
Fedora 19 I cannot put a host into maintenance and get back a workable
node after a normal reboot.
Are you referring to the absence of self-healing after maintenance?
Please see below as to why this should not happen. If there's a
reproducible test case, it would be good to track this bug at:
https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS
Didn't find any "safe" way for a node in a distributed-replicated
two-node config to leave the cluster and rejoin it without going crazy
manually resolving the brick differences generated in the meantime. And
this was with only one VM running...
Possibly I have to dig more into this, but I didn't find great resources
for it. Any tips/links are welcome.
The following commands might help:
1. gluster volume heal <volname> - perform a fast, index-based self-heal
2. gluster volume heal <volname> full - perform a "full" self-heal if 1.
does not work
3. gluster volume heal <volname> info - list files/directories that were
healed recently
Neither 1. nor 2. should be required in the normal course of operation,
as the proactive self-healing feature in glusterfs takes care of healing
after a node comes back online.
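Before putting a host into maintenance, the heal backlog can also be
checked mechanically. A minimal sketch, assuming the per-brick "Number of
entries" lines that glusterfs 3.4 prints (the exact wording may differ
across versions):

```shell
#!/bin/sh
# Sum the per-brick "Number of entries" counters from the output of
# `gluster volume heal <volname> info`. A total of zero means no
# pending self-heal, i.e. the node is safe to take down.
# (Output format assumed from glusterfs 3.4 - verify on your version.)
count_pending_heals() {
    awk -F': *' '/^Number of entries/ { total += $2 } END { print total + 0 }'
}

# Typical use (on a gluster node):
#   gluster volume heal myvol info | count_pending_heals
```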
I would like oVirt to be more aware of this, and to have some capability
to resolve by itself the misalignments generated on the gluster backend
during maintenance of a node.
At the moment it seems to me it only shows that volumes are OK in the
sense of being started, but the bricks could actually be very different...
For example another tab with details about heal info; something like
the output of the command
gluster volume heal $VOLUME info
and/or
gluster volume heal $VOLUME info split-brain
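Pending such a tab, the same check can be scripted: a sketch that treats
a volume as "calm" only when the heal-info output reports zero entries
(again assuming the "Number of entries" wording of glusterfs 3.4):

```shell
#!/bin/sh
# Succeeds (exit 0) only if the piped-in heal-info output reports no
# pending entries; any nonzero "Number of entries" line fails it.
# (Output wording assumed from glusterfs 3.4.)
all_clear() {
    ! grep -Eq '^Number of entries: *[1-9]'
}

# Typical use, combining both checks suggested above:
#   { gluster volume heal "$VOLUME" info;
#     gluster volume heal "$VOLUME" info split-brain; } | all_clear \
#       && echo "calm" || echo "needs attention"
```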
Yes, we are looking to build this for monitoring replicated gluster volumes.
- Vijay
so that if one finds 0 entries, he/she can rest easy, and at the same
time doesn't risk being erroneously calm in the other scenario...
Just my opinion.
Gianluca
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users