Some thoughts on enhancing High Availability in oVirt

Tue Feb 14 16:31:11 UTC 2012

On Thu, Feb 09, 2012 at 11:45:09AM -0500, Perry Myers wrote:
> warning: tl;dr
> 
> Right now, HA in oVirt is limited to VM-level granularity.  Each VM
> provides a heartbeat through vdsm back to the oVirt Engine.  If that
> heartbeat is lost, the VM is terminated and (if the user has configured
> it) the VM is relaunched.  If the host running that VM has lost its
> heartbeat, the host is fenced (via a remote power operation) and all HA
> VMs are restarted on an alternate host.
> 
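
As a rough sketch of the behaviour described above (all names here are
hypothetical, and this is not the oVirt Engine's actual code), the current
logic amounts to something like:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VM:
    name: str
    host: str
    heartbeat_ok: bool   # is the VM's heartbeat still arriving via vdsm?
    ha_enabled: bool     # has the user marked this VM as highly available?


def plan_remediation(vms: List[VM], host_heartbeat: Dict[str, bool]) -> List[str]:
    """Return an ordered list of actions (illustrative strings only)."""
    actions = []
    # Hosts that lost their heartbeat are fenced first.
    dead_hosts = sorted(h for h, ok in host_heartbeat.items() if not ok)
    for host in dead_hosts:
        actions.append(f"fence {host}")
    for vm in vms:
        if vm.host in dead_hosts:
            # Only HA VMs are restarted elsewhere after a host is fenced.
            if vm.ha_enabled:
                actions.append(f"restart {vm.name} on alternate host")
        elif not vm.heartbeat_ok:
            # A VM that lost its own heartbeat is terminated, then
            # relaunched only if the user configured it that way.
            actions.append(f"terminate {vm.name}")
            if vm.ha_enabled:
                actions.append(f"relaunch {vm.name}")
    return actions
```

Note that the only inputs are two kinds of heartbeat; nothing about the
services inside the guest is visible at this level.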

Has anyone considered how live snapshots and live block copy will intersect with
HA to provide a better end-user experience?  For example, will we be able to
handle a storage connection failure without power-cycling VMs by migrating
storage to a failover storage domain and/or live-migrating the VM to a host with
functioning storage connections?

> Also, the policies for controlling if/when a VM should be restarted are
> somewhat limited and hardcoded.
> 
> So there are two things that we can improve here:
> 
> 1. Provide introspection into VMs so that we can monitor the health of
>    individual services and not just the VM
> 
> 2. Provide a more configurable way of expressing policy for when a VM
>    (and its services) should trigger remediation by the HA subsystem
> 
> We can tackle these two things in isolation, or we can try to combine
> and solve them at the same time.
> 
> Some possible paths (not the only ones) might be:
> 

I also want to mention the Memory Overcommitment Manager (MOM).  It hasn't been
included in vdsm yet, but the patches will be hitting gerrit within the next
couple of days.  MOM will contribute a single-host policy which is useful for
making decisions about the condition of a host and applying remediation
policies: ballooning, KSM, cgroups, and VM ejection (migrating to another host).
It is lightweight and will integrate seamlessly with vdsm from an oVirt-engine
perspective.
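
To illustrate what a single-host policy of this kind could look like, here is a
minimal sketch.  It does not use MOM's real policy format; the thresholds, the
escalation order and the function name are all assumptions made up for the
example:

```python
# Hypothetical escalation ladder: (minimum free host memory in percent, action).
# Both the thresholds and the ordering are assumptions, not MOM defaults.
ESCALATION = [
    (20.0, "none"),
    (15.0, "ksm"),
    (10.0, "balloon"),
    (5.0, "cgroups"),
]


def choose_remediation(mem_free_pct: float) -> str:
    """Pick the mildest action that matches the current memory pressure."""
    for threshold, action in ESCALATION:
        if mem_free_pct >= threshold:
            return action
    # Below every threshold: eject a VM, i.e. migrate it to another host.
    return "eject_vm"
```

The point is only that the decision is made locally on the host from host-level
statistics, with VM ejection as the last resort.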

> * Leverage Pacemaker Cloud (http://pacemaker-cloud.org/)
> 
> Pacemaker Cloud works by providing a generic (read: virt mgmt system
> agnostic) way of managing HA for virtual machines and their services.
> At a high level the concept is that you define 1 or more virtual
> machines to be in an application group, and pcmk-cloud spawns a process
> to monitor that application group using either Matahari/QMF or direct
> SSH access.
> 
> pcmk-cloud is not meant to be a user facing component, so integration
> work would need to be done here to have oVirt consume the pcmk-cloud
> REST API for specifying what the application groups (sets of VMs) are
> and exposing that through the oVirt web UI.
> 
> pcmk-cloud at a high level has the following functions:
>   + monitoring of services through Matahari/QMF/SSH
>   + monitoring of VMs through Matahari/QMF/SSH/Deltacloud
>   + control of services through Matahari/QMF/SSH
>   + control of VMs through Deltacloud or the native provider (in this
>     case oVirt Engine REST API)
>   + policy engine/model (per application group) to make decisions about
>     when to control services/VMs based on the monitoring input
> 
> Integration decisions:
>   + pcmk-cloud to use existing transports for monitoring/control
>     (QMF/SSH) or do we leverage a new transport via vdsm/ovirt-guest-
>     agent?
>   + pcmk-cloud could act as the core policy engine to determine VM
>     placement in the oVirt datacenter/clusters or it could be used
>     solely for the monitoring/remediation aspect
> 
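
A per-application-group policy engine of the kind listed above could be
sketched roughly as follows.  This is not pcmk-cloud's API or policy model; the
class, the escalation rule (restart the service a few times, then restart the
VM) and the action strings are assumptions for illustration only:

```python
from collections import defaultdict
from typing import Dict, Tuple


class GroupPolicy:
    """Hypothetical policy for one application group (a set of VMs)."""

    def __init__(self, max_service_restarts: int = 2):
        self.max_service_restarts = max_service_restarts
        # Consecutive failure count per (vm, service) pair.
        self._failures: Dict[Tuple[str, str], int] = defaultdict(int)

    def on_event(self, vm: str, service: str, healthy: bool) -> str:
        """Turn one monitoring event into a control decision."""
        key = (vm, service)
        if healthy:
            self._failures.pop(key, None)  # recovery resets the counter
            return "none"
        self._failures[key] += 1
        if self._failures[key] <= self.max_service_restarts:
            return f"restart service {service} on {vm}"
        # Service restarts did not help: escalate to the VM itself.
        self._failures.pop(key, None)
        return f"restart vm {vm}"
```

Whether the monitoring events arrive over QMF/SSH or via vdsm/ovirt-guest-agent
does not change this part; only the transport feeding on_event() differs.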
> 
> * Leverage guest monitoring agents w/ ovirt-guest-agent
> 
> This would be taking the Services Agent from Matahari (which is just a C
> library) and utilizing it from the ovirt-guest-agent.  So oga would
> set up recurring monitoring of services using this lib and use its
> existing communication path with vdsm->oVirt Engine to report back
> service events.  In turn, oVirt Engine would need to interpret these
> events and then issue service control actions back to oga.
> 
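
The recurring monitoring described above might look something like the sketch
below on the guest side.  The real Services Agent is a C library; here a plain
Python callable stands in for it, and the function name and event format are
hypothetical:

```python
from typing import Callable, Dict, Iterable, List


def poll_services(
    services: Iterable[str],
    check: Callable[[str], bool],
    last_state: Dict[str, bool],
) -> List[dict]:
    """Run one monitoring pass and return events for services whose state changed.

    `check` is a placeholder for whatever call the guest agent would make
    into the services library; `last_state` is updated in place.
    """
    events = []
    for name in services:
        running = check(name)
        if last_state.get(name) != running:
            # Report only transitions (including the first observation).
            events.append({"service": name, "running": running})
            last_state[name] = running
    return events
```

The agent would call this on a timer and forward any returned events over its
existing channel to vdsm; interpreting them and deciding on control actions
would still be up to oVirt Engine.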
> Conceptually this is very similar to using pcmk-cloud in the case where
> pcmk-cloud utilizes information obtained through oga/vdsm through oVirt
> Engine instead of communicating directly to Guests via QMF/SSH.  In
> fact, taking this route would probably end up duplicating some effort
> because effectively you'd need the pcmk-cloud concept of the Cloud
> Application Policy Engine (formerly called DPE/Deployable Policy Engine)
> built directly into oVirt Engine anyhow.
> 
> So part of looking at this is determining how much reuse/integration of
> existing components makes sense vs. just re-implementing similar concepts.
> 
> I've cc'd folks from the HA community/pcmk-cloud and hopefully we can
> have a bit of a discussion to determine the best path forward here.
> 
> Perry
> _______________________________________________
> Arch mailing list
> Arch at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/arch
> 

-- 
Adam Litke <agl at us.ibm.com>
IBM Linux Technology Center