Some thoughts on enhancing High Availability in oVirt

Wed Feb 15 17:08:35 UTC 2012

Perry Myers wrote:

>> As long as you expect the VM to enforce reliability on the raw storage 
>> devices then you are going to have problems with restarting HA VMs. If 
>> you switch your thinking to making the storage operations HA, then all 
>> you need is a response cache.
>> 
>> A restarted VM replays the operation, and the cached response is 
>> retransmitted (or the operation is benignly re-applied). Without 
>> defining the operations so that they can be benignly re-applied or 
>> adding a response cache you will always be able to come up with some 
>> order of failure that won't work. There is no cost-effective way to 
>> guarantee that you snapshot the VM only when there is no in-flight 
>> storage activity.

> How is this any different than a bare metal host crashing while writes are
> in flight either to a local disk or FC disk?  When something crashes (be it
> physical or virtual) you're always going to lose some data that was in flight
> but not committed to disk (network has same issue).  It's up to individual
> applications to be resilient to this.

Don't think of a storage write as being a write to a device. It is a request to
a service made in the context of a session. The session protocol includes the
necessary logic to complete the transaction even when a TCP connection is
broken. Examples include multi-connection iSCSI and NFSv4, both of which can
be used to back a virtual disk.

When a VM is migrated, you break the connections that were made by it or on
its behalf. The pre-existing session logic will retry the in-progress
operations until they succeed.
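
To make the retry-plus-response-cache idea concrete, here is a rough Python
sketch. It is not how iSCSI or NFSv4 actually implement this; the class names,
the (session id, sequence number) cache key, and the drop_first_response flag
are all made up for illustration. The point is only that a replayed request
gets the cached response back instead of being applied a second time.

```python
class StorageService:
    """Hypothetical block-storage service with a per-session response cache."""

    def __init__(self):
        self.blocks = {}     # logical block address -> data
        self.responses = {}  # (session_id, seq) -> response already sent
        self.applied = 0     # number of writes actually applied

    def write(self, session_id, seq, lba, data):
        key = (session_id, seq)
        if key in self.responses:
            # Replay of a request that already ran: retransmit the cached
            # response without touching the blocks again.
            return self.responses[key]
        self.blocks[lba] = data
        self.applied += 1
        response = ("ok", seq)
        self.responses[key] = response
        return response


class Session:
    """Hypothetical client-side session that outlives any one connection."""

    def __init__(self, service, session_id):
        self.service = service
        self.session_id = session_id
        self.next_seq = 0

    def write(self, lba, data, drop_first_response=False):
        # The sequence number is assigned once and reused on every retry.
        seq = self.next_seq
        self.next_seq += 1
        while True:
            response = self.service.write(self.session_id, seq, lba, data)
            if drop_first_response:
                # Simulate a connection that broke after the service applied
                # the write but before the response reached the client.
                drop_first_response = False
                continue
            return response


service = StorageService()
session = Session(service, "vm-1")
result = session.write(7, b"abc", drop_first_response=True)
```

In this sketch the write is sent twice but applied once, which is the property
a restarted or migrated VM needs from the storage side.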

The key is thinking of block storage as a service, rather than as a device.