Some thoughts on enhancing High Availability in oVirt

Thu Feb 16 02:52:19 UTC 2012

On 02/15/2012 01:55 AM, Itamar Heim wrote:
> On 02/15/2012 03:41 AM, Perry Myers wrote:
>>> As long as you expect the VM to enforce reliability on the raw
>>> storage devices then you are going to have problems with restarting
>>> HA VMs. If you switch your thinking to making the storage operations
>>> HA, then all you need is a response cache.
>>>
>>> A restarted VM replays the operation, and the cached response is
>>> retransmitted (or the operation is benignly re-applied). Without
>>> defining the operations so that they can be benignly re-applied or
>>> adding a response cache you will always be able to come up with some
>>> order of failure that won't work. There is no cost-effective way to
>>> guarantee that you snapshot the VM only when there is no in-flight
>>> storage activity.
>>
>> How is this any different than a bare metal host crashing while writes
>> are in flight either to a local disk or FC disk?  When something crashes
>> (be it physical or virtual) you're always going to lose some data that
>> was in flight but not committed to disk (network has same issue).  It's
>> up to individual applications to be resilient to this.
>>
>> I think this issue is somewhat orthogonal to simply providing reduced
>> MTTR by restarting failed services or VMs.
> 
> don't you fence the other node first to make sure it won't write after
> you started another one?

Yes.

> here we are talking about moving the VM, without fencing the host.

OK. I don't see how that's possible. If you don't fence the other
host (by cutting off its I/O via sanlock or SCSI reservations, or by
power fencing it), then you always run the risk of both VMs accessing
shared storage at the same time from two different hosts, leading to
data corruption.
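
To make the ordering concrete, here is a rough toy sketch in Python. It
is not oVirt, sanlock or any real fencing API: Host, SharedDisk, fence()
and restart_vm() are all made-up names for illustration, and "fencing"
is just a flag that the toy disk checks before accepting a write. It
only shows why the old host has to be cut off before the VM is started
somewhere else.

```python
class Host:
    """Hypothetical host; 'fenced' means its storage I/O has been cut off."""
    def __init__(self, name):
        self.name = name
        self.fenced = False


class SharedDisk:
    """Toy stand-in for shared storage; records which hosts managed to write."""
    def __init__(self):
        self.writers = []

    def write(self, host, data):
        # A fenced host is rejected, much as a lost lease or reservation would do.
        if host.fenced:
            raise PermissionError(f"{host.name} is fenced; I/O rejected")
        self.writers.append(host.name)


def fence(host):
    """Assumed fencing primitive (lease revocation, reservation, power off)."""
    host.fenced = True


def restart_vm(disk, old_host, new_host, fence_first=True):
    """Restart the VM on new_host; return True if both hosts wrote to the disk."""
    if fence_first:
        fence(old_host)

    # The VM comes up on the new host and starts writing.
    disk.write(new_host, "boot")

    # The old host was only unreachable, not dead, and flushes stale data.
    try:
        disk.write(old_host, "stale flush")
    except PermissionError:
        pass  # fencing did its job

    return len(set(disk.writers)) > 1
```

With fence_first=True only the new host ever writes. With
fence_first=False both hosts write to the same disk, which is exactly
the corruption case described above.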