Some thoughts on enhancing High Availability in oVirt

Perry Myers pmyers at redhat.com
Thu Feb 9 16:45:09 UTC 2012


warning: tl;dr

Right now, HA in oVirt is limited to VM level granularity.  Each VM
provides a heartbeat through vdsm back to the oVirt Engine.  If that
heartbeat is lost, the VM is terminated and (if the user has configured
it) the VM is relaunched.  If the host running that VM has lost its
heartbeat, the host is fenced (via a remote power operation) and all HA
VMs are restarted on an alternate host.
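
To make the current behaviour concrete, here is a rough Python sketch of
the decision logic described above. It is only an illustration: the
names (Vm, Host, remediate, the action strings) are made up for this
mail and are not taken from the oVirt Engine code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vm:
    name: str
    heartbeat_ok: bool = True
    highly_available: bool = False  # hypothetical flag: user configured the VM as HA


@dataclass
class Host:
    name: str
    heartbeat_ok: bool = True
    vms: List[Vm] = field(default_factory=list)


def remediate(host: Host) -> List[str]:
    """Return the actions the engine would take for one host (illustrative only)."""
    actions = []
    if not host.heartbeat_ok:
        # Host heartbeat lost: fence the host, then restart its HA VMs elsewhere.
        actions.append(f"fence {host.name}")
        for vm in host.vms:
            if vm.highly_available:
                actions.append(f"restart {vm.name} on alternate host")
        return actions
    for vm in host.vms:
        if not vm.heartbeat_ok:
            # VM heartbeat lost: terminate it, relaunch only if configured as HA.
            actions.append(f"terminate {vm.name}")
            if vm.highly_available:
                actions.append(f"relaunch {vm.name}")
    return actions
```

Note that nothing in this logic looks inside the VM, which is the gap
item 1 below is about.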

Also, the policies for controlling if/when a VM should be restarted are
somewhat limited and hardcoded.

So there are two things that we can improve here:

1. Provide introspection into VMs so that we can monitor the health of
   individual services and not just the VM

2. Provide a more configurable way of expressing policy for when a VM
   (and its services) should trigger remediation by the HA subsystem

We can tackle these two things in isolation, or we can try to combine
and solve them at the same time.

Some possible paths (not the only ones) might be:


* Leverage Pacemaker Cloud (http://pacemaker-cloud.org/)

Pacemaker Cloud works by providing a generic (read: virt mgmt system
agnostic) way of managing HA for virtual machines and their services.
At a high level the concept is that you define one or more virtual
machines to be in an application group, and pcmk-cloud spawns a process
to monitor that application group using either Matahari/QMF or direct
SSH access.

pcmk-cloud is not meant to be a user facing component, so integration
work would need to be done here to have oVirt consume the pcmk-cloud
REST API for specifying what the application groups (sets of VMs) are
and exposing that through the oVirt web UI.

pcmk-cloud at a high level has the following functions:
  + monitoring of services through Matahari/QMF/SSH
  + monitoring of VMs through Matahari/QMF/SSH/Deltacloud
  + control of services through Matahari/QMF/SSH
  + control of VMs through Deltacloud or the native provider (in this
    case oVirt Engine REST API)
  + policy engine/model (per application group) to make decisions about
    when to control services/VMs based on the monitoring input
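
As a rough idea of what a per application group policy could look like,
here is a small Python sketch. It assumes one possible policy (restart a
failed service a few times, then escalate to restarting the whole VM).
The class names and the max_service_restarts knob are hypothetical and
do not reflect pcmk-cloud's actual model or API.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Policy:
    # Hypothetical knob: service restarts allowed before escalating to the VM.
    max_service_restarts: int = 2


class ApplicationGroup:
    """A set of VMs sharing one policy (illustrative, not the pcmk-cloud model)."""

    def __init__(self, name: str, policy: Policy):
        self.name = name
        self.policy = policy
        self._failures: Dict[Tuple[str, str], int] = {}

    def on_service_failure(self, vm: str, service: str) -> str:
        """Decide what to do when monitoring reports a failed service."""
        key = (vm, service)
        count = self._failures.get(key, 0) + 1
        if count > self.policy.max_service_restarts:
            # Escalate: restart the VM and forget its service failure counts.
            self._failures = {
                k: v for k, v in self._failures.items() if k[0] != vm
            }
            return f"restart vm {vm}"
        self._failures[key] = count
        return f"restart service {service} on {vm}"
```

With max_service_restarts=2, the first two failures of a service lead to
a service restart and the third one escalates to a VM restart.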

Integration decisions:
  + should pcmk-cloud use its existing transports for monitoring/control
    (QMF/SSH), or do we leverage a new transport via
    vdsm/ovirt-guest-agent?
  + pcmk-cloud could act as the core policy engine to determine VM
    placement in the oVirt datacenter/clusters or it could be used
    solely for the monitoring/remediation aspect


* Leverage guest monitoring agents w/ ovirt-guest-agent

This would mean taking the Services Agent from Matahari (which is just a C
library) and using it from the ovirt-guest-agent (oga).  So oga would
set up recurring monitoring of services using this lib and use its
existing communication path with vdsm->oVirt Engine to report back
service events.  In turn, oVirt Engine would need to interpret these
events and then issue service control actions back to oga.
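
The monitoring half of this could be sketched roughly as below. This is
plain Python with made-up names: poll_services, the check callable and
the event dictionaries are assumptions for illustration, not the
Matahari Services Agent API or the existing oga message format.

```python
from typing import Callable, Dict, Iterable, List


def poll_services(
    services: Iterable[str],
    check: Callable[[str], bool],
    last_state: Dict[str, bool],
) -> List[dict]:
    """Run one monitoring pass; return events for services whose state changed.

    `check` stands in for whatever the services library offers to query a
    service's status. `last_state` is updated in place between passes.
    """
    events = []
    for name in services:
        running = check(name)
        # The first pass reports every service, since no state is known yet.
        if last_state.get(name) != running:
            events.append({"service": name, "running": running})
        last_state[name] = running
    return events
```

oga would call something like this from a recurring timer and forward
the returned events over its existing channel to vdsm, so only state
changes travel up to oVirt Engine rather than every poll result.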

Conceptually this is very similar to using pcmk-cloud in the case where
pcmk-cloud utilizes information obtained through oga/vdsm through oVirt
Engine instead of communicating directly to Guests via QMF/SSH.  In
fact, taking this route would probably end up duplicating some effort
because effectively you'd need the pcmk-cloud concept of the Cloud
Application Policy Engine (formerly called DPE/Deployable Policy Engine)
built directly into oVirt Engine anyhow.

So part of looking at this is determining how much reuse/integration of
existing components makes sense vs. just re-implementing similar concepts.

I've cc'd folks from the HA community/pcmk-cloud and hopefully we can
have a bit of a discussion to determine the best path forward here.

Perry


