Some thoughts on enhancing High Availability in oVirt
Yaniv Kaul
ykaul at redhat.com
Tue Feb 14 16:40:20 UTC 2012
On 02/14/2012 06:31 PM, Adam Litke wrote:
> On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote:
>> warning: tl;dr
>>
>> Right now, HA in oVirt is limited to VM level granularity. Each VM
>> provides a heartbeat through vdsm back to the oVirt Engine. If that
>> heartbeat is lost, the VM is terminated and (if the user has configured
>> it) the VM is relaunched. If the host running that VM has lost its
>> heartbeat, the host is fenced (via a remote power operation) and all HA
>> VMs are restarted on an alternate host.
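The heartbeat-driven remediation Perry describes can be sketched as follows. This is a minimal illustration of the engine-side decision, not oVirt's actual code; the timeout value, the Vm class, and the fence/restart callbacks are all hypothetical:

```python
HEARTBEAT_TIMEOUT = 30  # seconds; an assumed threshold, not oVirt's real value


class Vm:
    def __init__(self, name, ha=False):
        self.name = name
        self.ha = ha  # the user-configured "restart me if my host dies" flag


def remediate(last_heartbeat, now, vms, fence_host, restart_vm):
    """If the host heartbeat is stale, fence the host (remote power
    operation) and restart only the HA-flagged VMs on an alternate host.
    Returns the names of the VMs that were restarted."""
    if now - last_heartbeat <= HEARTBEAT_TIMEOUT:
        return []  # heartbeat still fresh: nothing to do
    fence_host()
    restarted = [vm.name for vm in vms if vm.ha]
    for name in restarted:
        restart_vm(name)
    return restarted
```

The key point the sketch captures is the ordering: the host must be fenced before any HA VM is restarted elsewhere, otherwise two copies of the same VM could write to shared storage.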
>>
> Has anyone considered how live snapshots and live block copy will intersect with
> HA to provide a better end-user experience? For example, will we be able to handle
> a storage connection failure without power-cycling VMs by migrating storage to a
> failover storage domain and/or live-migrating the VM to a host with functioning
> storage connections?
I think migrating a VM that is paused due to EIO is something KVM is
afraid to do - there may still be in-flight data (already in the host)
en route to the storage.
I'm also not entirely sure how you would migrate the storage when it has failed.
Y.
>
>> Also, the policies for controlling if/when a VM should be restarted are
>> somewhat limited and hardcoded.
>>
>> So there are two things that we can improve here:
>>
>> 1. Provide introspection into VMs so that we can monitor the health of
>> individual services and not just the VM
>>
>> 2. Provide a more configurable way of expressing policy for when a VM
>> (and its services) should trigger remediation by the HA subsystem
>>
>> We can tackle these two things in isolation, or we can try to combine
>> and solve them at the same time.
>>
>> Some possible paths (not the only ones) might be:
>>
> I also want to mention Memory Overcommitment Manager. It hasn't been included
> in vdsm yet, but the patches will be hitting gerrit within the next couple of
> days. MOM will contribute a single-host policy which is useful for making
> decisions about the condition of a host and applying remediation policies:
> ballooning, ksm, cgroups, vm ejection (migrating to another host). It is
> lightweight and will integrate seamlessly with vdsm from an oVirt-engine
> perspective.
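A single-host policy of the kind MOM contributes could look roughly like this. The thresholds and action names below are invented for illustration; MOM's real policies are configurable and more nuanced:

```python
def host_policy(free_mem_pct, ksm_running, vms_under_pressure):
    """Hypothetical single-host remediation policy in the spirit of MOM:
    escalate from cheap actions (ballooning) to expensive ones (ejecting
    a VM to another host) as host memory condition worsens."""
    actions = []
    if free_mem_pct < 20:
        actions.append("balloon")        # shrink guest balloons first
    if free_mem_pct < 10 and not ksm_running:
        actions.append("start_ksm")      # enable page deduplication
    if free_mem_pct < 5 and vms_under_pressure:
        actions.append("eject_vm")       # migrate a VM off this host
    return actions
```

The appeal of this shape is that the host can act locally and quickly, while the engine only needs to get involved for the final, cluster-level action (the migration target for an ejected VM).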
>
>> * Leverage Pacemaker Cloud (http://pacemaker-cloud.org/)
>>
>> Pacemaker Cloud works by providing a generic (read: virt mgmt system
>> agnostic) way of managing HA for virtual machines and their services.
>> At a high level the concept is that you define 1 or more virtual
>> machines to be in an application group, and pcmk-cloud spawns a process
>> to monitor that application group using either Matahari/QMF or direct
>> SSH access.
>>
>> pcmk-cloud is not meant to be a user-facing component, so integration
>> work would need to be done here to have oVirt consume the pcmk-cloud
>> REST API for specifying what the application groups (sets of VMs) are
>> and exposing that through the oVirt web UI.
>>
>> pcmk-cloud at a high level has the following functions:
>> + monitoring of services through Matahari/QMF/SSH
>> + monitoring of VMs through Matahari/QMF/SSH/Deltacloud
>> + control of services through Matahari/QMF/SSH
>> + control of VMs through Deltacloud or the native provider (in this
>> case oVirt Engine REST API)
>> + policy engine/model (per application group) to make decisions about
>> when to control services/VMs based on the monitoring input
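The per-application-group policy engine in the list above can be sketched as a function from monitoring input to control actions. The state and action names here are illustrative, not pcmk-cloud's actual vocabulary:

```python
def group_decision(states):
    """Per-application-group policy sketch: given each VM's last service
    health report (True = healthy, False = service failed, None = no
    report at all, i.e. VM-level failure), decide a control action.
    In pcmk-cloud terms the reports would arrive via Matahari/QMF/SSH
    and VM restarts would go through Deltacloud or the engine REST API."""
    actions = {}
    for vm, service_ok in states.items():
        if service_ok is None:
            actions[vm] = "restart_vm"       # VM unreachable: recover the VM
        elif not service_ok:
            actions[vm] = "restart_service"  # VM alive: fix just the service
        else:
            actions[vm] = "none"
    return actions
```

This is the granularity gain over plain VM-level HA: a failed service inside a healthy VM gets a service restart, not a power cycle.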
>>
>> Integration decisions:
>> + pcmk-cloud to use existing transports for monitoring/control
>> (QMF/SSH) or do we leverage a new transport via
>> vdsm/ovirt-guest-agent?
>> + pcmk-cloud could act as the core policy engine to determine VM
>> placement in the oVirt datacenter/clusters or it could be used
>> solely for the monitoring/remediation aspect
>>
>>
>> * Leverage guest monitoring agents w/ ovirt-guest-agent
>>
>> This would be taking the Services Agent from Matahari (which is just a C
>> library) and utilizing it from the ovirt-guest-agent. So oga would
>> set up recurring monitoring of services using this lib and use its
>> existing communication path with vdsm -> oVirt Engine to report back
>> service events. In turn, oVirt Engine would need to interpret these
>> events and then issue service control actions back to oga.
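The guest-side half of this could be sketched as a polling loop that reports only state changes up the existing oga -> vdsm -> engine channel. Everything here is hypothetical: the real Matahari Services library is C, and the check/report callbacks stand in for its API and for oga's transport:

```python
def make_service_poller(check, services, report):
    """Returns a tick() function that polls each named service via
    check(name) -> bool and calls report(name, ok) only when a service's
    state changes, keeping upstream traffic to actual events."""
    last = {}

    def tick():
        events = []
        for svc in services:
            ok = check(svc)
            if last.get(svc) != ok:       # state changed (or first poll)
                last[svc] = ok
                events.append((svc, "up" if ok else "down"))
                report(svc, ok)
        return events

    return tick
```

Reporting deltas rather than full state on every poll matters here, since every event crosses two hops (guest to vdsm, vdsm to engine) before anyone can act on it.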
>>
>> Conceptually this is very similar to using pcmk-cloud in the case where
>> pcmk-cloud utilizes information obtained via oga/vdsm and relayed through
>> oVirt Engine instead of communicating directly with guests via QMF/SSH. In
>> fact, taking this route would probably end up duplicating some effort
>> because effectively you'd need the pcmk-cloud concept of the Cloud
>> Application Policy Engine (formerly called DPE/Deployable Policy Engine)
>> built directly into oVirt Engine anyhow.
>>
>> So part of looking at this is determining how much reuse/integration of
>> existing components makes sense vs. just re-implementing similar concepts.
>>
>> I've cc'd folks from the HA community/pcmk-cloud and hopefully we can
>> have a bit of a discussion to determine the best path forward here.
>>
>> Perry
>> _______________________________________________
>> Arch mailing list
>> Arch at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/arch
>>