[Users] oVirt October Updates

Vijay Bellur vbellur at redhat.com
Mon Oct 21 09:33:22 UTC 2013


On 10/19/2013 10:44 AM, Gianluca Cecchi wrote:
> On Wed, Oct 16, 2013 at 1:01 PM, Itamar Heim wrote:
>> Hope to meet many of you next week at LinuxCon Europe/KVM Forum/oVirt
>> conference in Edinburgh
>>
> Unfortunately not me ;-(
>
>
>> - LINBIT published "High-Availability oVirt-Cluster with
>>    iSCSI-Storage"[9]
>
>>
>> [9] note: link requires registration:
>> http://www.linbit.com/en/company/news/333-high-available-virtualization-at-a-most-reasonable-price
>> the pdf is:
>> http://www.linbit.com/en/downloads/tech-guides?download=68:high-availability-ovirt-cluster-with-iscsi-storage
>
> About 3 years ago I successfully tested plain Qemu/KVM with two nodes
> and drbd dual-primary with live migration on Fedora, so I know it
> works very well.
> I think the LINBIT guide gives many suggestions; one of them in
> particular is just another way to have a highly available oVirt
> Engine, which has been a hot topic lately.
> Close attention has to be paid to the fencing mechanism though, as is
> always the case when talking about drbd, pacemaker and oVirt.
> I don't know it very well, but the drbd proxy feature (closed source
> and not free) could then also be an option for DR in small
> environments.
>
> I'm going to experiment with the solution from the paper (possibly
> using the clusterlabs.org repo for the cluster stack, and so using
> corosync instead of heartbeat).
> By comparison, in small environments where two nodes are OK, in my
> opinion glusterfs is far from robust at the moment: with 3.4.1 on
> Fedora 19 I cannot put a host into maintenance and get a workable
> node back after a normal reboot.

Are you referring to the absence of self-healing after maintenance? 
Please see below as to why this should not happen. If there's a 
reproducible test case, it would be good to track this bug at:

https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

> I didn't find any "safe" way for a node in a two-node
> distributed-replicated config to leave the cluster and rejoin it
> without going crazy manually resolving the brick differences generated
> in the meantime. And this was with only one VM running .....
> Possibly I have to dig into this more, but I didn't find great
> resources for it. Any tips/links are welcome.

The following commands might help:

1. gluster volume heal <volname> - performs a fast, index-based self-heal.

2. gluster volume heal <volname> full - performs a "full" self-heal if 1.
does not work.

3. gluster volume heal <volname> info - provides information about
files/directories that were healed recently.

Neither 1. nor 2. should be required in the normal course of operation,
as the proactive self-healing feature in glusterfs takes care of healing
after a node comes back online.
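
For example, after a node returns from maintenance, a quick check could
look like this ("myvol" below is just a placeholder volume name):

   # see whether anything is still pending heal on any brick
   gluster volume heal myvol info

   # if entries are listed, trigger an index-based self-heal
   gluster volume heal myvol

   # only if entries still remain, force a full self-heal
   gluster volume heal myvol full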


> I would like oVirt to be more aware of this and to have some
> capability to resolve by itself the misalignments generated on the
> gluster backend during maintenance of a node.
> At the moment it seems to me it only shows that volumes are OK in the
> sense of being started, but the bricks could be very different...
> For example, another tab with details about heal info; something like
> the output of the command
>
> gluster volume heal $VOLUME info
>
> and/or
>
> gluster volume heal $VOLUME info split-brain

Yes, we are looking to build this for monitoring replicated gluster volumes.
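
In the meantime, a small wrapper around those two commands can serve as
a poor man's check from the shell. This is only a rough sketch - it
assumes heal info prints "Number of entries: N" lines per brick (adjust
the awk pattern if your version prints something different), and VOLUME
is just a placeholder:

   #!/bin/sh
   VOLUME=myvol

   # count entries still pending heal across all bricks
   pending=$(gluster volume heal $VOLUME info | \
             awk '/^Number of entries:/ {sum += $NF} END {print sum+0}')

   # count entries reported as split-brain
   split=$(gluster volume heal $VOLUME info split-brain | \
           awk '/^Number of entries:/ {sum += $NF} END {print sum+0}')

   if [ "$pending" -eq 0 ] && [ "$split" -eq 0 ]; then
       echo "$VOLUME: clean - no pending heals, no split-brain"
   else
       echo "$VOLUME: $pending entries pending heal, $split in split-brain"
   fi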

- Vijay
>
> so that if one finds 0 entries, he/she can be calm, and at the same
> time doesn't risk being erroneously calm in the other scenario...
>
> just my opinion.
>
> Gianluca
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>



