Some thoughts on enhancing High Availability in oVirt

Steven Dake sdake at redhat.com
Tue Feb 21 17:53:00 UTC 2012


On 02/21/2012 09:27 AM, Steven Dake wrote:
> On 02/21/2012 06:09 AM, Livnat Peer wrote:
>> On 21/02/12 03:34, Steven Dake wrote:
>>> On 02/19/2012 01:55 PM, Livnat Peer wrote:
>>>> On 19/02/12 17:42, Perry Myers wrote:
>>>>>>> Absolutely.
>>>>>>>
>>>>>>> In this case the Cloud Application is the combination of the two
>>>>>>> separate VM components (database VM and AS VM).  A CAPE (cloud
>>>>>>> application policy engine) maintains the HA state of both VMs including
>>>>>>> correcting for resource (db,as) or vm failures, and ensuring ordering
>>>>>>> constraints even during recovery (the AS would start after the DB in
>>>>>>> this model).
>>>>>>>
>>>>>>
>>>>>> ok, what would a flow look like to the user (oVirt user)?
>>>>>>
>>>>>> - Adding new service in OE
>>>>>> - Specifying for the service which VMs provide it (?)
>>>>>
>>>>> That could work, or you could do:
>>>>>
>>>>> 1. Adding a new VM (or set of VMs) in OE
>>>>> 2. Adding one or more services to associate with those VMs
>>>>>
>>>>> Just depends on what the easier user experience is.  From the
>>>>> perspective of pcmk-cloud, we get the same data in the end, which is a
>>>>> config file that specifies the resources we care about (both VMs and
>>>>> services on those VMs)
>>>>>
>>>>>> - Specify how the service can be monitored (? how does CAPE know what
>>>>>> to look for as the service heartbeat?)
>>>>>
>>>>> For each service you would specify whether or not to use:
>>>>> * an OCF resource agent (see resource-agents package in Fedora and
>>>>>   other distros)
>>>>> * A systemd unit or sysV init script
>>>>> * Some other custom script (which would need to be either in OCF RA or
>>>>>   init script style)
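As a rough illustration of the three choices listed above, a per-service
entry might be modelled like this.  This is only a sketch: MonitorKind,
ServiceSpec, describe() and the field names are hypothetical and are not
taken from pcmk-cloud's actual config format, and the VM and agent names
are just example values.

```python
from dataclasses import dataclass
from enum import Enum


class MonitorKind(Enum):
    """How a service is started, stopped and monitored (hypothetical names)."""
    OCF_RESOURCE_AGENT = "ocf"
    INIT_SCRIPT = "init"      # systemd unit or sysV init script
    CUSTOM_SCRIPT = "custom"  # must follow OCF RA or init script conventions


@dataclass(frozen=True)
class ServiceSpec:
    name: str          # service name as shown to the user
    vm: str            # VM the service runs on
    kind: MonitorKind  # which of the three mechanisms to use
    agent: str         # agent, unit or script name


def describe(spec: ServiceSpec) -> str:
    """Render one service entry as a single human-readable line."""
    return f"{spec.name} on {spec.vm} via {spec.kind.value}:{spec.agent}"


# Example values only; the names are assumptions for illustration.
db = ServiceSpec("database", "db-vm", MonitorKind.OCF_RESOURCE_AGENT, "pgsql")
web = ServiceSpec("appserver", "as-vm", MonitorKind.INIT_SCRIPT, "httpd")
```

Either UI flow discussed above (service first or VM first) would end up
producing the same kind of record.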
>>>>>
>>>>>> - Marking the service as HA
>>>>>>
>>>>>> What's next?
>>>>>> Where can the user define the policy about this service
>>>>>
>>>>> There would need to be UI in OE that exposed an interface for adding
>>>>> policy information.  Because the Pacemaker policy engine is very
>>>>> flexible, it would make sense to only define very specific knobs in the
>>>>> UI, otherwise it could get very confusing for the users.  For more
>>>>> complex policies, it might be better to provide a way to manually edit
>>>>> the policy file and upload it rather than trying to model everything in
>>>>> the UI.
>>>>>
>>>>>> (i.e. 'should be
>>>>>> available only on Tuesdays' or 'should be available only between
>>>>>> 0800-1700 CET' etc)?
>>>>>
>>>>> For this example, what do you mean by 'should be available'?  In general
>>>>> with HA, the idea is to 'keep the service running as much as possible'.
>>>>>
>>>>
>>>> You are right, I mixed two use cases.
>>>> Let's focus on HA for start.
>>>>
>>>> Let's say CAPE finds that a VM/service is down; does it initiate runVM
>>>> via the OE API?  Who chooses on which host to start the VM, and who is
>>>> responsible for doing setup work in case it is required by the VM?  For
>>>> example, if a VM is using a direct LUN then we might need to connect the
>>>> host to that LUN before starting the VM on the target host.
>>>>
>>>> If CAPE uses OE to start the VM, the setup will be taken care of by OE
>>>> as part of starting the VM.
>>>>
>>>>
>>>
>>> Currently CAPE uses deltacloud APIs to start/stop instances.
>>>
>>> The choosing of which host to start the vm is an act of scheduling
>>> which, in our model, is in the domain of the IAAS platform.  I expect
>>> the typical start operation would look like:
>>> 1. cape determines which VMs to start
>>> 2. cape sends instance start operations to deltacloudd
>>> 3. deltacloudd sends instance start operations to OE API
>>> 4. OE starts the vms
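The four steps above can be sketched as a simple delegation chain.  This
is an in-memory stand-in only, with no network calls: the class and method
names (Cape, Deltacloudd, OvirtEngine, recover, start_instance, start_vm)
are made up for illustration and are not the real deltacloud or OE APIs.

```python
from typing import List, Set


class OvirtEngine:
    """Stand-in for OE; it alone decides placement (scheduling)."""

    def __init__(self) -> None:
        self.started: List[str] = []

    def start_vm(self, vm: str) -> None:
        # Step 4: OE starts the VM on a host of its own choosing.
        self.started.append(vm)


class Deltacloudd:
    """Stand-in for deltacloudd; forwards instance starts to OE."""

    def __init__(self, engine: OvirtEngine) -> None:
        self.engine = engine

    def start_instance(self, instance: str) -> None:
        # Step 3: pass the start operation on to the OE API.
        self.engine.start_vm(instance)


class Cape:
    """Stand-in for a CAPE that knows the VMs of one cloud application."""

    def __init__(self, cloud: Deltacloudd, vms: List[str]) -> None:
        self.cloud = cloud
        self.vms = vms  # listed in required start order

    def recover(self, running: Set[str]) -> List[str]:
        # Step 1: determine which VMs need to be started.
        to_start = [vm for vm in self.vms if vm not in running]
        # Step 2: send instance start operations to deltacloudd.
        for vm in to_start:
            self.cloud.start_instance(vm)
        return to_start


engine = OvirtEngine()
cape = Cape(Deltacloudd(engine), ["db-vm", "as-vm"])
restarted = cape.recover(running={"db-vm"})
```

Note that nothing in Cape picks a host; that stays with the IAAS platform.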
>>>
>>> The model we have been operating under is that setup work of the actual
>>> virtual machine image is done prior to launching.
>>>
>>
>> A few more questions:
>>
>> - If the user initiates a stop of an HA VM, does OE have to coordinate
>> that with CAPE? Terminate CAPE as well?
>>
> 
> There is another process called a CPE (cloud policy engine) which
> provides a REST API for start/stop of instances.  This process starts
> and stops the CAPE processes as necessary.
> 

correction: replace instances above with "cloud applications"

>> - How does CAPE make the decision that it is 'safe' to restart the
>> resource?
> 
> When monitoring fails in some way we terminate the node via deltacloud.
> 
>> For example, currently if OE loses the VM heartbeat but we have the
>> host heartbeat we know that it is safe to restart the VM. If we lose
>> the host heartbeat (which implies we lose the VM heartbeat as well)
>> we do not start the VM until we fence the host (or the user manually
>> confirms that he rebooted the host).
>>
> 
> This particular use case could be handled with a bit of extra code on
> our end.  Use case seems reasonable.
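To make the heartbeat rule quoted above concrete, here is a minimal sketch
of the decision as described for OE today.  It is not pacemaker-cloud
code; the function and parameter names are made up for illustration.

```python
def safe_to_restart(vm_heartbeat: bool, host_heartbeat: bool,
                    host_fenced: bool = False,
                    reboot_confirmed: bool = False) -> bool:
    """Return True if the VM may be restarted without risking two copies."""
    if vm_heartbeat:
        return False  # VM is still alive, nothing to restart
    if host_heartbeat:
        return True   # host is reachable, so only the VM went down
    # Host heartbeat is lost too: the VM may still be running there, so
    # wait until the host is fenced or the user confirms a manual reboot.
    return host_fenced or reboot_confirmed
```

The last branch is the part that would need the extra code: restart is
held back until fencing or a manual confirmation has happened.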
> 
>>
>> - Currently OE monitors the VMs to collect statistics (CPU, memory,
>> network usage etc.).  If OE uses CAPE to provide HA for VMs (or
>> services), that won't 'save' OE the need to monitor the VMs for
>> statistics.  So if the purpose of this integration is to help with OE
>> scalability, don't we need to take care of the monitoring of the VM
>> statistics as well?
>>
> 
> We support multiple transport mechanisms, each in a separate cape binary.
> Please have a look at
> 
> http://www.pacemaker-cloud.org/downloads/cape-ovirt.pdf
> 
> This shows how ovirt support could be added by pacemaker cloud devs.
> Essentially ovirt.o would communicate with current ovirt monitoring
> infrastructure via whatever method makes the most sense.  The operations
> that trans_ssh.o, or matahari.o or ovirt.o need are vm healthcheck,
> resource start, stop, monitor.
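The four operations a transport needs can be pictured as a small
interface.  The real transports (trans_ssh.o, matahari.o) are object files
in the cape build; the Python below is only a sketch of the shape of that
interface, and every name in it (Transport, InMemoryTransport,
ensure_running, the method names) is an assumption made for illustration.

```python
from typing import Dict, Iterable, Protocol, Set


class Transport(Protocol):
    """The operations a cape transport would have to provide."""

    def vm_healthcheck(self, vm: str) -> bool: ...
    def resource_start(self, vm: str, resource: str) -> None: ...
    def resource_stop(self, vm: str, resource: str) -> None: ...
    def resource_monitor(self, vm: str, resource: str) -> bool: ...


class InMemoryTransport:
    """Toy transport that tracks running resources per healthy VM."""

    def __init__(self, healthy_vms: Iterable[str]) -> None:
        self.running: Dict[str, Set[str]] = {vm: set() for vm in healthy_vms}

    def vm_healthcheck(self, vm: str) -> bool:
        return vm in self.running

    def resource_start(self, vm: str, resource: str) -> None:
        self.running[vm].add(resource)

    def resource_stop(self, vm: str, resource: str) -> None:
        self.running[vm].discard(resource)

    def resource_monitor(self, vm: str, resource: str) -> bool:
        return resource in self.running.get(vm, set())


def ensure_running(t: Transport, vm: str, resource: str) -> bool:
    """Start the resource if the VM is healthy and it is not yet running."""
    if not t.vm_healthcheck(vm):
        return False  # VM itself is unhealthy; resource ops are pointless
    if not t.resource_monitor(vm, resource):
        t.resource_start(vm, resource)
    return t.resource_monitor(vm, resource)


transport = InMemoryTransport(["as-vm"])
```

An ovirt transport would implement the same four calls on top of whatever
monitoring channel oVirt already has.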
> 
> Regards
> -steve
> 
>> Livnat
>>
>>> Physical resource mapping (such as LUNs or block storage) is again the
>>> domain of the IAAS platform.
>>>
>>> Note we have had some informal requests to also handle scheduling, but
>>> would need topology information about the physical resources available
>>> in order to make those decisions.  Currently there is no "standardized"
>>> way to determine the topology.  We don't tackle this problem (currently)
>>> in our implementation.  The project is only focused on HA.
>>>
>>> Regards
>>> -steve
>>>
>>>>
>>>>> The above example seems less like an HA concern and more of a general
>>>>> resource scheduling concern.  I think using the Pacemaker Rules engine
>>>>> with pcmk-cloud, this should be possible as well, but I'll let
>>>>> Andrew/Steve comment further on that.
>>>>>
>>>>> Perry
>>>>
>>>
>>
> 