[Users] HA

Sander Grendelman sander at grendelman.com
Fri Apr 4 13:14:58 UTC 2014


Do you have power management configured?
Was the "failed" host fenced/rebooted?
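
To make the question concrete, here is a rough Python sketch of the behaviour described in the quoted thread below. It is a toy model under my own assumptions, not engine code: the class, field, and function names are all hypothetical, and the real engine looks at far more state than this.

```python
from dataclasses import dataclass
from enum import Enum


class VmState(Enum):
    UP = "up"
    PAUSED_IO_ERROR = "paused_io_error"


@dataclass
class Host:
    # Hypothetical field names for this sketch, not oVirt API fields.
    power_management_configured: bool
    fenced: bool = False                       # rebooted through a fence agent
    manually_confirmed_rebooted: bool = False  # admin confirmed the reboot by hand


def source_host_known_down(host: Host) -> bool:
    """True once it is safe to assume nothing is still running on the host."""
    if host.power_management_configured and host.fenced:
        return True
    return host.manually_confirmed_rebooted


def next_action(vm_state: VmState, ha_enabled: bool, host: Host) -> str:
    # An HA VM may only be started elsewhere once the source host is known
    # to be down; otherwise two copies could run at once (split-brain).
    if ha_enabled and source_host_known_down(host):
        return "restart on another host"
    # A VM paused on an I/O error is not migrated (data corruption risk);
    # it can resume once the storage connection is back.
    if vm_state is VmState.PAUSED_IO_ERROR:
        return "stay paused until storage returns"
    return "no action"
```

With no power management and no manual confirmation, the paused HA VM just stays paused, which matches what you saw in your test.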


On Fri, Apr 4, 2014 at 2:21 PM, Koen Vanoppen <vanoppen.koen at gmail.com> wrote:

> So... is a fully automatic migration of the VMs to another hypervisor
> possible when the storage connection fails?
> How can we make this happen? For the moment, when we tested this
> situation, the VMs stayed in paused state.
> (Test situation:
>
>    - Unplug the 2 fibre cables from the hypervisor
>    - VMs go into paused state
>    - VMs stay in paused state until the failure is resolved
>
> )
>
>
> They only returned when we restored the fibre connection to the
> hypervisor...
>
> Kind Regards,
>
> Koen
>
>
>
> 2014-04-04 13:52 GMT+02:00 Koen Vanoppen <vanoppen.koen at gmail.com>:
>
>> 2014-04-03 16:53 GMT+02:00 Koen Vanoppen <vanoppen.koen at gmail.com>:
>>
>> ---------- Forwarded message ----------
>>> From: "Doron Fediuck" <dfediuck at redhat.com>
>>> Date: Apr 3, 2014 4:51 PM
>>> Subject: Re: [Users] HA
>>> To: "Koen Vanoppen" <vanoppen.koen at gmail.com>
>>> Cc: "Omer Frenkel" <ofrenkel at redhat.com>, <users at ovirt.org>,
>>> "Federico Simoncelli" <fsimonce at redhat.com>,
>>> "Allon Mureinik" <amureini at redhat.com>
>>>
>>>
>>>
>>> ----- Original Message -----
>>> > From: "Koen Vanoppen" <vanoppen.koen at gmail.com>
>>> > To: "Omer Frenkel" <ofrenkel at redhat.com>, users at ovirt.org
>>> > Sent: Wednesday, April 2, 2014 4:17:36 PM
>>> > Subject: Re: [Users] HA
>>> >
>>> > Yes, indeed. I meant not-operational. Sorry.
>>> > So, if I understand this correctly: if we ever end up in a situation
>>> > where we lose both storage connections on our hypervisor, we will have
>>> > to manually restore the connections first?
>>> >
>>> > And thanks for the tip for speeding things up :-).
>>> >
>>> > Kind regards,
>>> >
>>> > Koen
>>> >
>>> >
>>> > 2014-04-02 15:14 GMT+02:00 Omer Frenkel < ofrenkel at redhat.com > :
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > ----- Original Message -----
>>> > > From: "Koen Vanoppen" < vanoppen.koen at gmail.com >
>>> > > To: users at ovirt.org
>>> > > Sent: Wednesday, April 2, 2014 4:07:19 PM
>>> > > Subject: [Users] HA
>>> > >
>>> > > Dear All,
>>> > >
>>> > > During our acceptance testing, we discovered something. (Document will
>>> > > follow).
>>> > > When we disable one fiber path, no problem: multipath finds its way and
>>> > > no pings are lost.
>>> > > BUT when we disable both fiber paths (so one of the storage domains is
>>> > > gone on this host, but still available on the other host), the VMs go
>>> > > into paused mode... The engine chooses a new SPM (can we speed this
>>> > > up?), puts the host in non-responsive (can we speed this up? more
>>> > > important) and the VMs stay in paused mode... I would expect that they
>>> > > would be migrated (yes, HA is
>>> >
>>> > I guess you mean the host moves to not-operational (in contrast to
>>> > non-responsive)?
>>> > If so, the engine will not migrate VMs that are paused due to an I/O
>>> > error, because of the risk of data corruption.
>>> >
>>> > To speed this up you can look at the storage domain monitoring timeout:
>>> > engine-config --get StorageDomainFailureTimeoutInMinutes
>>> >
>>> >
>>> > > enabled) to the other host and rebooted there... Any solution? We are
>>> > > still using oVirt 3.3.1, but we are planning an upgrade to 3.4 after
>>> > > the Easter holiday.
>>> > >
>>> > > Kind Regards,
>>> > >
>>> > > Koen
>>> > >
>>>
>>> Hi Koen,
>>> Resuming from paused due to I/O issues is supported (adding relevant
>>> folks).
>>> Regardless, if you did not define power management, you must manually
>>> confirm that the source host was rebooted in order for migration to
>>> proceed. Otherwise we risk a split-brain scenario.
>>>
>>> Doron
>>>
>>
>>
>
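
On the storage domain monitoring timeout Omer mentions in the quoted thread: a minimal Python sketch of reading and changing it with engine-config, in case it helps. Assumptions on my side: the key is spelled StorageDomainFailureTimeoutInMinutes (check with `engine-config --list`), the helper names are made up for this example, and the value of 2 minutes in the usage note is only an illustration, not a recommendation. The helpers default to a dry run that only prints the command.

```python
import subprocess
from typing import List, Optional

# Assumed key spelling; confirm on your engine with `engine-config --list`.
KEY = "StorageDomainFailureTimeoutInMinutes"


def engine_config_get(key: str = KEY) -> List[str]:
    """Build the command line that reads the current value."""
    return ["engine-config", "--get", key]


def engine_config_set(value_minutes: int, key: str = KEY) -> List[str]:
    """Build the command line that changes the value."""
    if value_minutes < 1:
        raise ValueError("timeout must be at least 1 minute")
    return ["engine-config", "--set", f"{key}={value_minutes}"]


def run(argv: List[str], dry_run: bool = True) -> Optional[str]:
    """Print the command in dry-run mode, otherwise execute it on the engine host."""
    if dry_run:
        line = " ".join(argv)
        print(line)
        return line
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `run(engine_config_set(2))` prints the command that would set the timeout to 2 minutes. As far as I know the ovirt-engine service has to be restarted before a changed value takes effect.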