
Hi,

I looked at the logs. Host vmh-02 wasn't restarted because the PM restart failed through both of the other hosts (vmh-01 and vmh-03) with this error:

  Test Failed, [Powering off machine @ IPMI:10.9.1.11...Failed

Because we could not be sure that vmh-02 was really down, we couldn't restart the HA VMs on another host.

I also noticed that even getting the PM status of vmh-02 is problematic. The fence agent returned this message:

  Power Management test failed for Host vmh-02.Done

yet it also reported the operation as successful. This looks very suspicious!

Could you please execute the following command from vmh-01 or vmh-03 to test the PM agent on vmh-02?

  fence_ipmilan -a <IP> -l <USER> -p <PASSWORD> -o status -v -P

where <IP>, <USER> and <PASSWORD> are the values valid for vmh-02.

Could you please also send us vdsm.log from vmh-01 and vmh-03 so we can investigate the details of the fence agent execution failures?

Thanks a lot

Martin Perina

----- Original Message -----
From: "Siddharth Patil" <siddharth@patil.co.uk>
To: users@ovirt.org
Sent: Wednesday, February 11, 2015 5:17:28 PM
Subject: Re: [ovirt-users] Fenced hosts VM's never migrate
On 11/02/15, 4:34 PM, Omer Frenkel wrote:
----- Original Message -----
From: "Tim Macy" <macytd@gmail.com>
To: users@ovirt.org
Sent: Tuesday, February 10, 2015 6:55:31 PM
Subject: [ovirt-users] Fenced hosts VM's never migrate
I have a 3-host cluster set up with HA and fencing enabled, and it appears to be working properly. Power management stop, start, and restart all work, as does host shutdown/restart following a simulated crash. When the network is pulled, a proxy is chosen; it powers off the downed host and then restarts it. Since the network is still down, the following is repeated in the events: "Host kvm01 is not responding. It will stay in Connecting state for a grace period of 162 seconds and after that an attempt to fence the host will be issued."
The real problem here is that the VMs on the failed host never migrate to a new host and remain down until the network is reconnected.
Once the host is powered off by the proxy, HA VMs will be started (not migrated) on another host, if there are resources for them. If you have HA VMs that are not started although another host is available, it might be a bug. Can you please attach engine.log from the time of the failure?
We have tested this with back-end storage on gluster and NFS with the same result. This is on oVirt Engine Version: 3.5.1.1-1.el6. Hosts are on CentOS 7 and the Engine is standalone on CentOS 6.6.
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
I had the exact same problem during testing yesterday. The HA VMs never restarted on the other available hosts. The only difference is that we're using an iSCSI storage backend.
oVirt Engine Version: 3.5.1.1-1.el6 (hosted engine) Host: CentOS 6.6
Engine logs are attached.
Thanks, Siddharth