Some thoughts on enhancing High Availability in oVirt

Wed Feb 15 00:02:02 UTC 2012

Aval Baron wrote:

> I'm not sure I agree.
> This entire thread assumes that the way to do this is to have the engine continuously monitor all services on all (HA) guests and, according to
> varying policies, reschedule VMs (services within VMs?). I don't think this is scalable (and wrt drools/pacemaker, assuming what Andrew says is
> correct, drools doesn't even remotely come close to supporting even relatively small scales).

> Engine should decide on policy, the hosts should enforce it.
> What this would translate to is a more distributed way of monitoring and moving around of VMs/services.
> E.g. for each service, engine would run the VM on host A and let host B know that it is the failover node for
> this service.  Node B would be monitoring the heartbeats for the services it is in charge of and take over when
> needed. In case host B crashes, engine would choose a different host to be the failover node (note that there
> can be more than 2 nodes with a predefined order of priority).

As long as you expect the VM to enforce reliability on the raw storage devices, you are going to have problems restarting HA VMs.
If you instead make the storage operations themselves HA, then all you need is a response cache.

A restarted VM replays the operation, and either the cached response is retransmitted or the operation is benignly re-applied.
Unless you define the operations so that they can be benignly re-applied (i.e. make them idempotent), or add a response cache, you will always be able
to come up with some sequence of failures that breaks recovery. There is no cost-effective way to guarantee that you snapshot the
VM only when there is no in-flight storage activity.
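
To make the response-cache idea concrete, here is a minimal sketch in Python. Every name in it (StorageTarget, submit, the (vm_id, op_id) key) is made up for illustration and is not an oVirt or VDSM API. It also assumes the restarted VM reuses the same operation ID when it replays, and it keeps the cache in memory, where a real one would have to be as durable as the storage itself.

```python
from dataclasses import dataclass, field


@dataclass
class StorageTarget:
    """Hypothetical storage endpoint that remembers one response per operation ID."""
    blocks: dict = field(default_factory=dict)
    responses: dict = field(default_factory=dict)  # (vm_id, op_id) -> response
    applied: int = 0  # number of operations that actually touched the blocks

    def submit(self, vm_id, op_id, op, block, data=None):
        key = (vm_id, op_id)
        if key in self.responses:
            # Replay from a restarted VM: retransmit the cached response
            # instead of executing the operation a second time.
            return self.responses[key]
        if op == "append":
            # Deliberately not idempotent: applying it twice corrupts the data.
            self.blocks[block] = self.blocks.get(block, b"") + data
            result = ("ok", None)
        elif op == "read":
            result = ("ok", self.blocks.get(block))
        else:
            raise ValueError(f"unknown operation: {op}")
        self.applied += 1
        self.responses[key] = result
        return result


target = StorageTarget()
first = target.submit("vm-1", 7, "append", "log", b"entry")
# The VM crashes before it sees the response, restarts, and replays operation 7.
replayed = target.submit("vm-1", 7, "append", "log", b"entry")
```

The append is applied once even though it was submitted twice, so the restarted VM cannot tell the difference between the original reply and the retransmitted one. An operation that is already idempotent (say, writing a whole block at a fixed offset) would not need the cache at all.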