----- Original Message -----
From: "Siddharth Patil" <siddharth(a)patil.co.uk>
To: users(a)ovirt.org
Sent: Wednesday, February 11, 2015 6:25:02 PM
Subject: Re: [ovirt-users] Fenced hosts VM's never migrate
On 11/02/15, 6:46 PM, Martin Perina wrote:
> Hi,
>
> I looked at the logs and the reason why host vmh-02 wasn't
> restarted is that PM restart failed using both other hosts
> (vmh-01 and vmh-03) with error:
>
> Test Failed, [Powering off machine @ IPMI:10.9.1.11...Failed
>
> So we couldn't restart HA VMs on other hosts, because we were
> not sure that host vmh-02 was really down.
>
> I also noticed that even getting PM status of host vmh-02 is
> problematic, fence agent returned this message:
>
> Power Management test failed for Host vmh-02.Done
>
> but it also reported the operation as successful. This looks very
> suspicious!
Could this be because we turned off power to vmh-02 completely? We are
testing to make sure that the HA VMs will be restarted on another host
even if the host suffers complete hardware failure.
The IPMI interface should work even if the server is turned off. But of
course it needs power itself, so if you lose power to it, it cannot
respond. For this case you would need another (secondary) fencing agent
(for example APC) which controls the power for the server and its IPMI
interface.
>
> Could you please execute the following command from vmh-01 or vmh-03
> to test the PM agent on vmh-02
>
> fence_ipmilan -a <IP> -l <USER> -p <PASSWORD> -o status -v -P
>
> where <IP>, <USER> and <PASSWORD> contain values valid for
> vmh-02?
Here's the result (from both):
Getting status of IPMI:10.9.1.11...Spawning: '/usr/bin/ipmitool -I
lanplus -H '10.9.1.11' -U 'ADMIN' -P '[set]' -v chassis power
status'...
Chassis power = On
Done
Of course, the server is now up and running so this is expected.
Yes, this is the correct result. You should get the result
Chassis power = Off
when the server is turned off.
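
As a rough illustration, the "Chassis power = On" / "Chassis power = Off"
line in the verbose output above could be extracted with a small Python
helper. This is only a sketch: the function name parse_chassis_power is
hypothetical, and it assumes the output contains the line in the form
shown in this thread.

```python
import re
from typing import Optional

# Matches the "Chassis power = On" / "Chassis power = Off" line
# as it appears in the output quoted in this thread (assumed format).
_POWER_RE = re.compile(r"Chassis power\s*=\s*(On|Off)", re.IGNORECASE)


def parse_chassis_power(output: str) -> Optional[str]:
    """Return "on" or "off", or None if no power state line is found."""
    match = _POWER_RE.search(output)
    if match is None:
        # No recognizable state: the caller should treat this as unknown,
        # not as "off".
        return None
    return match.group(1).lower()
```

Returning None for unrecognized output keeps "unknown" distinct from
"off", which matters when the result decides whether a host is really
down.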
>
>
> Could you please send us also vdsm.log from machines vmh-01 and
> vmh-03 so we could investigate details of fence agents execution
> failures?
See attached.
I looked at the vdsm logs and I wasn't able to find any additional
error details. But it looks like a bug: the fence agent status command
returned error code 1, yet the engine reported this as success.
So could you please file a new bug with the above logs attached, the
steps to reproduce, and the versions of the following packages:
Host where engine is running:
ovirt-engine
Host vmh-01:
vdsm
fence-agents
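
For illustration only, the behaviour discussed above (a status command
that exits with code 1 must not be reported as success, and HA VMs are
not restarted until some proxy host confirms the fence) can be sketched
in Python. The names AgentResult, status_check_succeeded and
safe_to_restart_ha_vms are hypothetical and are not taken from
ovirt-engine or vdsm; treating only exit code 0 as success is an
assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AgentResult:
    """Outcome of one fence agent run on a proxy host (hypothetical type)."""
    returncode: int
    message: str


def status_check_succeeded(result: AgentResult) -> bool:
    # Assumption: only exit code 0 counts as success; the message text
    # is ignored, so a "Done" suffix cannot mask a failure.
    return result.returncode == 0


def safe_to_restart_ha_vms(proxy_results: Iterable[AgentResult]) -> bool:
    # HA VMs may be restarted elsewhere only if at least one proxy host
    # fenced the target successfully.
    return any(status_check_succeeded(r) for r in proxy_results)
```

With both proxy hosts failing, as happened for vmh-02, this check stays
false and the HA VMs are left alone.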
Thanks a lot
Martin Perina
Regards,
Siddharth