[Users] HA

Doron Fediuck dfediuck at redhat.com
Thu Apr 3 14:51:47 UTC 2014



----- Original Message -----
> From: "Koen Vanoppen" <vanoppen.koen at gmail.com>
> To: "Omer Frenkel" <ofrenkel at redhat.com>, users at ovirt.org
> Sent: Wednesday, April 2, 2014 4:17:36 PM
> Subject: Re: [Users] HA
> 
> Yes, indeed. I meant not-operational. Sorry.
> So, if I understand this correctly: if we ever end up in a situation where we
> lose both storage connections on our hypervisor, we will have to manually
> restore the connections first?
> 
> And thanks for the tip for speeding things up :-).
> 
> Kind regards,
> 
> Koen
> 
> 
> 2014-04-02 15:14 GMT+02:00 Omer Frenkel < ofrenkel at redhat.com > :
> 
> 
> 
> 
> 
> ----- Original Message -----
> > From: "Koen Vanoppen" < vanoppen.koen at gmail.com >
> > To: users at ovirt.org
> > Sent: Wednesday, April 2, 2014 4:07:19 PM
> > Subject: [Users] HA
> > 
> > Dear All,
> > 
> > During our acceptance testing, we discovered something. (Document will
> > follow).
> > When we disable one fiber path, there is no problem: multipath finds its
> > way and no pings are lost.
> > BUT when we disable both fiber paths (so one of the storage domains is
> > gone on this host, but still available on the other host), VMs go into paused
> > mode... It chooses a new SPM (can we speed this up?), puts the host in
> > non-responsive (can we speed this up? More important) and the VMs stay in
> > paused mode... I would expect that they would be migrated (yes, HA is
> 
> I guess you mean the host moves to not-operational (in contrast to
> non-responsive)?
> If so, the engine will not migrate VMs that are paused due to an I/O error,
> because of the risk of data corruption.
> 
> To speed this up, you can look at the storage domain monitoring timeout:
> engine-config --get StorageDomainFalureTimeoutInMinutes
> 
> 
> > enabled) to the other host and rebooted there... Any solution? We are still
> > using oVirt 3.3.1, but we are planning an upgrade to 3.4 after the Easter
> > holiday.
> > 
> > Kind Regards,
> > 
> > Koen
> > 

Hi Koen,
Resuming VMs that were paused due to I/O issues is supported (adding relevant folks).
Regardless, if you did not define power management, you should manually confirm
that the source host was rebooted in order for migration to proceed. Otherwise
we risk a split-brain scenario.

Doron
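
For reference, the storage domain monitoring timeout mentioned earlier in the thread could be inspected and changed roughly as sketched below. This is only a sketch under assumptions: the value 2 is purely illustrative and not a recommendation, the `run` helper and the `APPLY` variable are hypothetical names added so the script only prints the commands by default, and the restart command assumes the engine runs as the `ovirt-engine` service. The key name is spelled exactly as it appears in the thread.

```shell
#!/bin/sh
# Key name copied verbatim from the thread (including its spelling).
KEY="StorageDomainFalureTimeoutInMinutes"
# Assumption: 2 minutes is only an example value, not a recommendation.
NEW_VALUE="${NEW_VALUE:-2}"

# Hypothetical helper: print each command unless APPLY=1 is set,
# so nothing is changed by accident.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Show the current timeout.
run engine-config --get "$KEY"
# Set a new timeout.
run engine-config --set "$KEY=$NEW_VALUE"
# Assumption: the engine has to be restarted to pick up the new value.
run service ovirt-engine restart
```

Running the script as-is only lists the three commands; running it with APPLY=1 on the engine host would execute them.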


