On 11/02/15, 6:46 PM, Martin Perina wrote:
Hi,
I looked at the logs, and the reason host vmh-02 wasn't
restarted is that the PM restart failed from both other hosts
(vmh-01 and vmh-03) with the error:
Test Failed, [Powering off machine @ IPMI:10.9.1.11...Failed
So we couldn't restart the HA VMs on other hosts, because we were
not sure that host vmh-02 was really down.
I also noticed that even getting the PM status of host vmh-02 is
problematic; the fence agent returned this message:
Power Management test failed for Host vmh-02.Done
but it also reported the operation as successful. This looks very
suspicious!
Could this be because we turned off power to vmh-02 completely? We are
testing to make sure that the HA VMs will be restarted on another host
even if the host suffers complete hardware failure.
Could you please execute the following command from vmh-01 or vmh-03
to test the PM agent on vmh-02:
fence_ipmilan -a <IP> -l <USER> -p <PASSWORD> -o status -v -P
where <IP>, <USER> and <PASSWORD> are the values valid for vmh-02?
Here's the result (from both):
Getting status of IPMI:10.9.1.11...Spawning: '/usr/bin/ipmitool -I
lanplus -H '10.9.1.11' -U 'ADMIN' -P '[set]' -v chassis power
status'...
Chassis power = On
Done
Of course, the server is now up and running so this is expected.
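For repeating this check from vmh-01 and vmh-03 (for example while vmh-02 is
powered off), the status call above could be wrapped roughly as follows. This
is only a sketch: the helper names (parse_chassis_power, is_suspicious,
fence_status) are made up for illustration, the parser assumes the
"Chassis power = On" line format shown in the output above, and treating exit
code 0 as "agent reported success" is an assumption.

```python
import re
import subprocess


def parse_chassis_power(output):
    """Return 'on' or 'off' from the agent's verbose output, or None if absent.

    Assumes the "Chassis power = On" line format seen in the output above.
    """
    match = re.search(r"Chassis power\s*=\s*(On|Off)", output, re.IGNORECASE)
    return match.group(1).lower() if match else None


def is_suspicious(returncode, state):
    """Flag a result that claims success (exit code 0) but has no power state."""
    return returncode == 0 and state is None


def fence_status(ip, user, password, agent="fence_ipmilan"):
    """Run the same status command as quoted above and parse its output."""
    result = subprocess.run(
        [agent, "-a", ip, "-l", user, "-p", password, "-o", "status", "-v", "-P"],
        capture_output=True,
        text=True,
        timeout=60,
    )
    # The agent may write to either stream, so both are inspected.
    state = parse_chassis_power(result.stdout + result.stderr)
    return result.returncode, state
```

Calling fence_status("<IP>", "<USER>", "<PASSWORD>") and passing the result to
is_suspicious() would highlight the case described earlier, where the agent
prints a failure message but still reports a successful operation.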
Could you please also send us vdsm.log from machines vmh-01 and
vmh-03 so we can investigate the details of the fence agent
execution failures?
See attached.
Regards,
Siddharth