[ovirt-users] Fenced hosts VM's never migrate

Martin Perina mperina at redhat.com
Wed Feb 11 16:46:48 UTC 2015


Hi,

I looked at the logs. Host vmh-02 wasn't restarted because the PM
(power management) restart failed through both of the other hosts
(vmh-01 and vmh-03) with this error:

  Test Failed, [Powering off machine @ IPMI:10.9.1.11...Failed

So we couldn't restart the HA VMs on other hosts, because we could
not be sure that host vmh-02 was really down.

I also noticed that even getting the PM status of host vmh-02 is
problematic. The fence agent returned this message:

  Power Management test failed for Host vmh-02.Done

but it also reported the operation as successful. This looks very
suspicious!

Could you please execute the following command from vmh-01 or vmh-03
to test the PM agent on vmh-02

  fence_ipmilan -a <IP> -l <USER> -p <PASSWORD> -o status -v -P

where <IP>, <USER> and <PASSWORD> contain values valid for vmh-02?
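
If it helps, here is a minimal shell sketch that wraps the status check
above and prints the agent's exit code next to its output, so that a
failure message paired with a successful return is easy to spot. The
function name check_pm_status and the FENCE_CMD override variable are
hypothetical, introduced only for illustration; the fence_ipmilan
options are the same ones used in the command above.

```shell
#!/bin/sh
# Sketch only: check_pm_status and FENCE_CMD are illustrative names,
# not part of oVirt or the fence-agents package.
# FENCE_CMD allows a stand-in command; it defaults to fence_ipmilan.
check_pm_status() {
    ip="$1"
    user="$2"
    password="$3"
    # Capture stdout and stderr together so the agent's message is kept.
    output=$("${FENCE_CMD:-fence_ipmilan}" -a "$ip" -l "$user" -p "$password" -o status -v -P 2>&1)
    rc=$?
    printf '%s\n' "$output"
    echo "fence agent exit code: $rc"
    return "$rc"
}
```

Run it from vmh-01 or vmh-03 as check_pm_status followed by the IP,
user and password for vmh-02, and compare the printed message with the
exit code.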


Could you please also send us vdsm.log from vmh-01 and vmh-03 so we
can investigate the details of the fence agent execution failures?

Thanks a lot

Martin Perina


----- Original Message -----
> From: "Siddharth Patil" <siddharth at patil.co.uk>
> To: users at ovirt.org
> Sent: Wednesday, February 11, 2015 5:17:28 PM
> Subject: Re: [ovirt-users] Fenced hosts VM's never migrate
> 
> On 11/02/15, 4:34 PM, Omer Frenkel wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Tim Macy" <macytd at gmail.com>
> >> To: users at ovirt.org
> >> Sent: Tuesday, February 10, 2015 6:55:31 PM
> >> Subject: [ovirt-users] Fenced hosts VM's never migrate
> >>
> >> I have a 3-host cluster set up with HA and fencing enabled, and it
> >> appears to be working properly. Power management stop, start, and
> >> restart all work, as does host shutdown/restart following a simulated
> >> crash. When the network is pulled, a proxy is chosen and it powers off
> >> the downed host, and then restarts it. Since the network is still
> >> down, it repeats the following in events:
> >> "Host kvm01 is not responding. It will stay in Connecting state for a
> >> grace period of 162 seconds and after that an attempt to fence the
> >> host will be issued."
> >>
> >> The real problem here is that the VMs on the failed host never
> >> migrate to a new host and remain down until the network is reconnected.
> >>
> >
> > Once the host is powered off by the proxy, HA VMs will be started (not
> > migrated) on another host, if there are resources for them.
> > If you have HA VMs that are not started although another host is
> > available for them, it might be a bug. Can you please attach engine.log
> > from the time of the failure?
> >
> >> We have tested this with back-end storage on gluster and NFS with the
> >> same result. This is on oVirt Engine Version: 3.5.1.1-1.el6. Hosts are
> >> on CentOS 7 and the Engine is standalone on CentOS 6.6.
> >>
> >>
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >>
> 
> I had the exact same problem during testing yesterday. The HA VMs
> never restarted on the other available hosts. The only difference is
> that we're using an iSCSI storage backend.
> 
> oVirt Engine Version: 3.5.1.1-1.el6 (hosted engine)
> Host: CentOS 6.6
> 
> Engine logs are attached.
> 
> Thanks,
> Siddharth
> 


