Some thoughts on enhancing High Availability in oVirt
Perry Myers
pmyers at redhat.com
Thu Feb 16 02:55:01 UTC 2012
On 02/15/2012 12:08 PM, Caitlin Bestler wrote:
> Perry Myers wrote:
>
>>> As long as you expect the VM to enforce reliability on the raw storage
>>> devices then you are going to have problems with restarting HA VMs. If
>>> you switch your thinking to making the storage operations HA, then all
>>> you need is a response cache.
>>>
>>> A restarted VM replays the operation, and the cached response is
>>> retransmitted (or the operation is benignly re-applied). Without
>>> defining the operations so that they can be benignly re-applied or
>>> adding a response cache you will always be able to come up with some
>>> order of failure that won't work. There is no cost-effective way to
>>> guarantee that you snapshot the VM only when there is no in-flight
>>> storage activity.
>
>> How is this any different than a bare metal host crashing while writes are
>> in flight either to a local disk or FC disk? When something crashes (be it
>> physical or virtual) you're always going to lose some data that was in flight
>> but not committed to disk (network has same issue). It's up to individual
>> applications to be resilient to this.
>
> Don't think of a storage write as being a write to a device. It is a request to
> a service made in the context of a session. The session protocol includes the
> necessary logic to complete the transaction even when a TCP connection is
> broken. Examples of this include multi-connection iSCSI and NFSv4. Both of
> which can be used to back a virtual disk.
>
> When a VM is migrated you break the connections that were made by it or on its
> behalf. The pre-existing session logic will make in-progress operations retry
> until they are successful.
>
> The key is thinking of block storage as a service, rather than as a device.
Ok, that is clearer. I can see how this would relate to providing
better data integrity in the face of hardware/software faults (at the
expense of performance), but it doesn't replace the need for
monitoring/remediation of failed hosts/VMs/services. So this is
something that would be used in conjunction with a traditional HA
solution, not in place of one.
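
For what it's worth, here is how I read the response-cache idea, as a rough
Python sketch. Everything in it (the BlockService class, the write() method,
the session and request ids) is made up for illustration and is not part of
oVirt, iSCSI, or NFSv4. It only shows a replayed request getting the cached
response back instead of being applied a second time.

```python
# Hypothetical sketch: BlockService, write(), session_id and request_id are
# illustrative names only, not an API of oVirt, iSCSI, or NFSv4.

class BlockService:
    """Toy block-storage service that deduplicates replayed requests."""

    def __init__(self):
        self.blocks = {}      # block number -> data
        self.responses = {}   # (session_id, request_id) -> cached response
        self.applied = 0      # number of writes actually applied

    def write(self, session_id, request_id, block, data):
        key = (session_id, request_id)
        if key in self.responses:
            # Replay of a request that already completed: retransmit the
            # cached response instead of applying the write again.
            return self.responses[key]
        self.blocks[block] = data
        self.applied += 1
        response = {"status": "ok", "block": block, "seq": self.applied}
        self.responses[key] = response
        return response


service = BlockService()
first = service.write("vm-42", 1, block=7, data=b"journal entry")
# Assume the VM dies before it sees the response, is restarted elsewhere,
# and replays request 1 within the same session.
replayed = service.write("vm-42", 1, block=7, data=b"journal entry")
```

Note that something still has to notice the VM died and restart it, which is
the monitoring/remediation part above.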
Perry