Some thoughts on enhancing High Availability in oVirt

Perry Myers pmyers at redhat.com
Thu Feb 16 03:01:16 UTC 2012


>>> HA is a simple use case of policy.
>>
>> *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust.
> 
> I think that the main concept of what Perry suggested (leaving the
> implementation details aside :)) is to add HA of services.

That's it in a nutshell :)

> I like this idea and I would like to extend it a little bit.
> How about services that are spread over more than a single VM?
> I would like to be able to define a service, specify which VM(s)
> provide this service, and add an HA flag on the service.

That is in line with what I was proposing.  There are two ways you can
do service HA...

* Take a set of OSes (guests) and a set of services.  Each service can
run on any of the guests.  Therefore services can be failed over from
one live guest to another.  This is effectively how the Pacemaker HA
stack works on both bare metal and virtual clusters.

* Take a set of OSes (guests) and on each guest place a specific set of
services.  Services can be restarted if they fail on a specific guest,
but if a guest/host fails, rather than failing over the service to
another live running guest, instead the entire guest responsible for
that service is restarted.  The recovery time is slightly longer in this
case because recovery involves restarting a VM instead of just starting
a service on another running VM.  But the positive here is that the
configuration and policies are not as complex, and since VMs typically
can start fairly quickly, the failover time is still adequate for most users.

Both models work.  Pacemaker HA uses the first model; Pacemaker Cloud
uses the second, but over time it could be adapted to include the first.
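
To make the difference between the two models concrete, here is a rough
sketch of the recovery decision in each.  This is only an illustration:
the Guest class, the function names, and the "least loaded survivor"
placement rule are all assumptions of mine, not anything Pacemaker HA or
Pacemaker Cloud actually exposes.

```python
from dataclasses import dataclass, field


@dataclass
class Guest:
    """Hypothetical guest record; not a real oVirt or Pacemaker type."""
    name: str
    alive: bool = True
    services: set = field(default_factory=set)


def recover_failover(guests, failed):
    """Model 1: move the failed guest's services to surviving guests."""
    survivors = [g for g in guests if g.alive and g is not failed]
    if not survivors:
        raise RuntimeError("no live guest available for failover")
    actions = []
    for svc in sorted(failed.services):
        # Assumed placement rule: pick the survivor running the fewest services.
        target = min(survivors, key=lambda g: len(g.services))
        target.services.add(svc)
        actions.append(("start_service", svc, target.name))
    failed.services.clear()
    return actions


def recover_restart_guest(failed):
    """Model 2: restart the whole guest; its services come back with it."""
    failed.alive = True
    return [("restart_guest", failed.name)]
```

Model 1 needs placement logic (and in practice fencing and constraints),
which is where the extra configuration complexity comes from.  Model 2 is
a single action per failure, at the cost of waiting for the guest to boot.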

> Then I would like to manage policies around it - I define a service
> with 3 VMs providing this service and I want to have at least 2 VMs
> running it at any given time. (Now the VMs are not highly available; only
> the service is.)

Yep.  This is in line with use case #1 above.
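
A policy like "at least 2 of 3" could be checked along these lines.  This
is a minimal sketch under my own assumptions: the function name and the
idea of passing in the set of VMs currently running the service are
hypothetical, and a real policy engine would also have to handle fencing
and VMs that are down entirely.

```python
def service_actions(providers, running, min_running):
    """Return the provider VMs on which to start the service so that at
    least min_running instances are up.  All names are illustrative."""
    up = [vm for vm in providers if vm in running]
    shortfall = min_running - len(up)
    if shortfall <= 0:
        return []  # policy already satisfied, nothing to do
    # Candidates are providers not currently running the service.
    candidates = [vm for vm in providers if vm not in running]
    return candidates[:shortfall]
```

With providers vm1, vm2, vm3 and only vm1 running the service, a minimum
of 2 yields a single start action on vm2; once two instances are up the
function returns nothing, so the VMs themselves never need an HA flag.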

Perry


