Some thoughts on enhancing High Availability in oVirt

Thu Feb 16 02:55:01 UTC 2012

On 02/15/2012 12:08 PM, Caitlin Bestler wrote:
> Perry Myers wrote:
> 
>>> As long as you expect the VM to enforce reliability on the raw storage 
>>> devices then you are going to have problems with restarting HA VMs. If 
>>> you switch your thinking to making the storage operations HA, then all 
>>> you need is a response cache.
>>>
>>> A restarted VM replays the operation, and the cached response is 
>>> retransmitted (or the operation is benignly re-applied). Without 
>>> defining the operations so that they can be benignly re-applied or 
>>> adding a response cache you will always be able to come up with some 
>>> order of failure that won't work. There is no cost-effective way to 
>>> guarantee that you snapshot the VM only when there is no in-flight 
>>> storage activity.
> 
>> How is this any different than a bare metal host crashing while writes are
>> in flight either to a local disk or FC disk?  When something crashes (be it
>> physical or virtual) you're always going to lose some data that was in flight
>> but not committed to disk (network has same issue).  It's up to individual
>> applications to be resilient to this.
> 
> Don't think of a storage write as being a write to a device. It is a request to
> a service made in the context of a session. The session protocol includes the
> necessary logic to complete the transaction even when a TCP connection is
> broken. Examples of this include multi-connection iSCSI and NFSv4, both of
> which can be used to back a virtual disk.
>  
> When a VM is migrated you break the connections that were made by it or on
> its behalf. The pre-existing session logic will make in-progress operations
> retry until they are successful.
> 
> The key is thinking of block storage as a service, rather than as a device.

Ok, that is clearer.  I can see how this would relate to providing
better data integrity in the face of hardware/software faults (at the
expense of performance), but it doesn't replace the need for
monitoring/remediation of failed hosts/VMs/services.  So this is
something that would be used in conjunction with a traditional HA
solution, not in place of it.
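
To make the response-cache idea quoted above concrete, here is a minimal
sketch in Python. It is only an illustration under stated assumptions, not
anything that exists in oVirt, iSCSI or NFSv4: the names StorageService,
FlakyLink, write_with_retry, session_id and request_id are all hypothetical.
It assumes every request carries a session-scoped request id, so a replayed
write gets the cached response back instead of being applied a second time.

```python
class StorageService:
    """Hypothetical block-storage service that keeps a response cache."""

    def __init__(self):
        self.blocks = {}     # block number -> data
        self.responses = {}  # (session_id, request_id) -> cached response
        self.applied = 0     # how many writes were actually applied

    def write(self, session_id, request_id, block, data):
        key = (session_id, request_id)
        if key in self.responses:
            # Replay of a completed request: retransmit the cached response.
            return self.responses[key]
        self.blocks[block] = data
        self.applied += 1
        response = ("ok", block, len(data))
        self.responses[key] = response
        return response


class FlakyLink:
    """Hypothetical transport that loses the first `failures` responses.

    The service still applies each request; only the reply is lost, which
    is the case a restarted or migrated VM cannot tell apart from a write
    that never happened.
    """

    def __init__(self, service, failures):
        self.service = service
        self.failures = failures

    def write(self, session_id, request_id, block, data):
        response = self.service.write(session_id, request_id, block, data)
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("connection broken before the response arrived")
        return response


def write_with_retry(link, session_id, request_id, block, data, attempts=5):
    """Replay the same request id until a response arrives."""
    for _ in range(attempts):
        try:
            return link.write(session_id, request_id, block, data)
        except ConnectionError:
            continue  # same session and request id, so the replay is benign
    raise ConnectionError("no response after %d attempts" % attempts)


service = StorageService()
link = FlakyLink(service, failures=2)
result = write_with_retry(link, "vm-1", 1, block=7, data=b"abcd")
```

In this sketch the write is applied exactly once even though the client
sends it three times. Note that it only covers the data-integrity side;
something still has to notice that the VM or host died and restart it.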

Perry