[Users] two node ovirt cluster with HA

Dafna Ron dron at redhat.com
Mon Jan 27 12:05:13 UTC 2014


I am adding Tareq for the Power Management implementation.

Dafna


On 01/27/2014 11:48 AM, Karli Sjöberg wrote:
> On Mon, 2014-01-27 at 11:11 +0000, Dafna Ron wrote:
>> Powering off the host will never trigger VM migration.
>> As far as the engine is concerned, it has just lost its connection to the
>> host, and it has no way of telling whether the host is down or a router is down.
> Can't it at least check with power management if the Host status is down
> first?
>
> I mean, if the network is down there will be no response from either PM
> or Host. But if PM is up and can tell you that the Host is down, sounds
> rather clear cut to me...
>
> Seems to me the VMs would be restarted sooner if the flow was altered
> to first check with PM whether it's a network or Host issue, and if it is a
> Host issue, immediately restart the VMs on another Host, instead of waiting for
> a potentially problematic Host to boot up eventually.
>
> /K
>
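The check proposed above can be sketched as a small decision function. This is only an illustration of the suggested flow, not how the engine behaves: the function name, its parameters and the returned strings are all hypothetical, and the case where PM answers but reports the Host as powered on is not covered by the proposal, so it is left as "undetermined".

```python
def classify_outage(host_answers, pm_answers, pm_says_host_off=False):
    """Hypothetical sketch of the proposed 'ask power management first' flow."""
    if host_answers:
        return "no outage"
    if not pm_answers:
        # Neither the Host nor its PM device responds: likely a network issue.
        return "network issue"
    if pm_says_host_off:
        # PM is reachable and reports the Host as powered off.
        return "host issue: restart VMs elsewhere"
    # PM is reachable but the Host is still powered: not covered by the proposal.
    return "undetermined"


print(classify_outage(host_answers=False, pm_answers=True, pm_says_host_off=True))
```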
>> Since VMs can continue running on the host even if the engine has no access
>> to it, starting the VMs on the second host can cause split brain and
>> data corruption.
>>
>> The engine learns what is going on by sending health check
>> queries to vdsm.
>> Power management will try to reboot a host when the health checks to
>> vdsm go unanswered.
>> So... if the engine gets no reply and has no way of rebooting the host, the
>> host status is changed to Non-Responsive and the VMs are marked
>> unknown, because the engine has no way of knowing what is happening with
>> them.
>> Since rebooting the host kills the VMs running on it, this will
>> never cause any VM migration. However, with the High-Availability VM
>> feature, some of the VMs can be restarted on the
>> second host after the host reboot (and only if Power Management
>> was confirmed as successful).
>>
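The behaviour described above can be summarised in a minimal sketch. It is not engine code: the function name, its parameters and the "Rebooted" and "down" labels are hypothetical, chosen only to mirror the explanation (no reply plus no confirmed fencing means Non-Responsive host and unknown VMs; a confirmed reboot allows HA VMs to be restarted elsewhere).

```python
def handle_health_check(vdsm_answered, pm_configured, fence_confirmed):
    """Return (host_status, vm_state, may_restart_ha_vms). Hypothetical sketch."""
    if vdsm_answered:
        return ("Up", "known", False)
    if not pm_configured or not fence_confirmed:
        # The engine cannot tell whether the VMs are still running on the host,
        # so restarting them elsewhere would risk split brain.
        return ("Non-Responsive", "unknown", False)
    # The reboot was confirmed, so the VMs on that host are certainly gone.
    return ("Rebooted", "down", True)


# Host powered off, no power management configured:
print(handle_health_check(vdsm_answered=False, pm_configured=False, fence_confirmed=False))
```

In this sketch nothing is ever migrated; the only recovery path is a restart of HA VMs after a confirmed reboot.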
>> VM migration is only triggered when:
>> 1. The cluster configuration states that the VM should be migrated in case
>> of failure.
>> 2. The engine has access to the host, so the failure is on the storage side
>> and not the host side.
>> 3. The VMs are not actively writing (although there might be a new RFE
>> for it).
>>
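The three conditions in the list above must all hold at once, which can be written as a single predicate. This is a sketch under that reading; the function and parameter names are hypothetical and do not correspond to any engine setting.

```python
def migration_triggered(cluster_migrates_on_failure,
                        engine_reaches_host,
                        vms_actively_writing):
    """True only when all three listed conditions hold. Hypothetical sketch."""
    return (cluster_migrates_on_failure
            and engine_reaches_host
            and not vms_actively_writing)


# A powered-off host fails condition 2, so no migration is triggered:
print(migration_triggered(True, engine_reaches_host=False, vms_actively_writing=False))
```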
>> Hope this clears things up.
>>
>> Dafna
>>
>>
>>
>> On 01/27/2014 10:11 AM, Andrew Lau wrote:
>>> Hi,
>>>
>>> Have you got power management enabled?
>>>
>>> That's the fencing feature required for the engine to ensure that the
>>> host is actually offline. It won't resume any other VMs to prevent
>>> potential VM corruption (eg. VM running on multiple hosts).
>>>
>>> Andrew.
>>>
>>> On Jan 27, 2014 5:12 PM, "Jaison peter" <urotrip2 at gmail.com> wrote:
>>>
>>>      Hi all,
>>>
>>>      I was setting up a two-node oVirt cluster with the oVirt engine on a
>>>      separate node. I completed the configuration and tested VM live
>>>      migrations without any issues. Then, to check cluster HA, I
>>>      powered down one host and expected the VMs running on that host to be
>>>      migrated to the other one. But nothing happened: the engine detected the
>>>      host as unreachable and marked it as non-operational, and the VMs running
>>>      on that host went to 'unknown state'. Is it not possible to set up
>>>      a fully HA oVirt cluster with two nodes? Or is it a problem with my
>>>      configuration? Please advise.
>>>
>>>      Thanks & Regards
>>>
>>>      Alex
>>>
>>>      _______________________________________________
>>>      Users mailing list
>>>      Users at ovirt.org
>>>      http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>>
>>
>> -- 
>> Dafna Ron
>
>


-- 
Dafna Ron