Some thoughts on enhancing High Availability in oVirt
Steven Dake
sdake at redhat.com
Thu Feb 16 14:23:21 UTC 2012
On 02/16/2012 06:33 AM, Livnat Peer wrote:
> On 16/02/12 05:01, Perry Myers wrote:
>>>>> HA is a simple use case of policy.
>>>>
>>>> *Today* HA is simply 'if VM is down restart it' but what Perry was suggesting was to improve this to something more robust.
>>>
>>> I think that the main concept of what Perry suggested (leaving the
>>> implementation details aside :)) is to add HA of services.
>>
>> That's it in a nutshell :)
>>
>>> I like this idea and I would like to extend it a little bit.
>>> How about services that are spread across more than a single VM?
>>> I would like to be able to define a service, specify which VM/s
>>> provide this service, and add an HA flag on the service.
>>
>> That is in line with what I was proposing. There are two ways you can
>> do service HA...
>>
>> * Take a set of OSes (guests) and a set of services. Each service can
>> run on any of the guests. Therefore services can be failed over from
>> one live guest to another. This is effectively how the Pacemaker HA
>> stack works on both bare metal and virtual clusters.
>>
>> * Take a set of OSes (guests) and on each guest place a specific set of
>> services. Services can be restarted if they fail on a specific guest,
>> but if a guest/host fails, rather than failing over the service to
>> another live running guest, instead the entire guest responsible for
>> that service is restarted. The recovery time is slightly longer in this
>> case because recovery involves restarting a VM instead of just starting
>> a service on another running VM. But the positive here is that the
>> configuration and policies are not as complex, and since VMs typically
>> can start fairly quickly the failover time is still adequate for most users.
>>
>
> Can a service be spread across more than one VM?
> For example, if I have an enterprise application that requires an application
> server (AS) and a database (DB), the AS and DB cannot live in the same
> guest because of different access restrictions (based on a real use case).
> The service availability depends on both guests being active, and
> an optimization is to run both of them on the same host.
>
>
Absolutely.
In this case the cloud application is the combination of the two
separate VM components (the database VM and the AS VM). A CAPE (cloud
application policy engine) maintains the HA state of both VMs, including
correcting for resource (DB, AS) or VM failures, and enforcing ordering
constraints even during recovery (the AS would start after the DB in
this model).
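
To make the ordering constraint concrete, here is a minimal sketch in
Python of how a recovery plan could be derived for the two-VM example.
It is not the Pacemaker Cloud implementation: the names Vm,
recovery_plan, "db-vm" and "as-vm" and the action labels are all
hypothetical, chosen for illustration only. The sketch assumes a failed
VM is restarted along with all of its resources, while a failed
resource on a running VM is restarted in place.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Vm:
    """Hypothetical view of one VM monitored by a policy engine."""
    name: str
    resources: list[str]
    running: bool = True
    failed_resources: set[str] = field(default_factory=set)


def recovery_plan(vms, depends_on):
    """Return recovery actions with dependencies handled first.

    vms:        dict mapping VM name -> Vm
    depends_on: dict mapping VM name -> set of VM names it must start after
    """
    sorter = TopologicalSorter()
    for name in vms:
        sorter.add(name, *depends_on.get(name, ()))

    plan = []
    # static_order() yields each VM only after the VMs it depends on.
    for name in sorter.static_order():
        vm = vms[name]
        if not vm.running:
            # VM failure: restart the guest, then start everything on it.
            plan.append(("restart_vm", name))
            plan.extend(("start_resource", f"{name}/{r}") for r in vm.resources)
        else:
            # Resource failure: restart only what failed on a live guest.
            plan.extend(
                ("restart_resource", f"{name}/{r}")
                for r in vm.resources
                if r in vm.failed_resources
            )
    return plan


if __name__ == "__main__":
    vms = {
        "as-vm": Vm("as-vm", ["as"], failed_resources={"as"}),
        "db-vm": Vm("db-vm", ["db"], running=False),
    }
    # The AS must start after the DB.
    for action in recovery_plan(vms, {"as-vm": {"db-vm"}}):
        print(action)
```

Even though the AS VM is listed first, the DB VM's recovery actions come
out ahead of the AS restart because of the declared dependency.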
Our current scaling target is 10 VMs per CAPE with 36 resources per VM.
These numbers are arbitrary - we could likely go an order of magnitude
beyond. Note that this is *per CAPE* - the number of CAPE processes is
limited by the memory and scheduling capabilities of the system.
Regards
-steve
>
>> Both models work. Pacemaker HA uses the first model, Pacemaker Cloud
>> uses the second, but over time could be adapted to include the 1st.
>>
>>> Then I would like to manage policies around it - I define a service
>>> with 3 VMs providing this service and I want to have at least 2 VMs
>>> running it at any given time. (Now the VMs are not highly available;
>>> only the service is.)
>>
>> Yep. This is in line with use case #1 above.
>>
>> Perry
>
> _______________________________________________
> Arch mailing list
> Arch at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/arch