[ovirt-users] [Centos7.1x64] [Ovirt 3.5.2] Test fence : Power management test failed for Host hosted_engine1 Done

wodel youchi wodel.youchi at gmail.com
Thu Jun 11 10:56:46 UTC 2015


Hi Martin,

Could you please tell me the version of the iLO firmware you are using?

I upgraded mine from 1.40 to 2.10, but nothing changed. I also upgraded
the Smart Array P420i card from 5.10 to 6.34, without luck so far.

I checked all the parameters again, but I can't find the error.
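
One way to compare the console test with what the proxy host does: the VDSM log quoted below shows /usr/sbin/fence_ipmilan being started without arguments and given its options on standard input as key=value lines, with the option values still wrapped in double quotes. The sketch below rebuilds that input so that both the quoted form and an unquoted form can be piped to the agent by hand. The helper names and the unquoted variant are assumptions for illustration, not VDSM code, and whether the quotes matter here is unverified.

```python
import sys


def split_options(options):
    """Split the engine's comma-separated option string into key=value items."""
    return [item.strip() for item in options.split(",") if item.strip()]


def strip_quotes(item):
    """Turn key="value" into key=value; leave any other item untouched."""
    key, sep, value = item.partition("=")
    if sep and len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    return key + sep + value


def build_fence_stdin(ipaddr, login, passwd, action, options, unquote=False):
    """Build the stdin text for the agent, in the key order seen in the VDSM log."""
    lines = [
        "agent=fence_ipmilan",
        "ipaddr=" + ipaddr,
        "login=" + login,
        "action=" + action,
        "passwd=" + passwd,
    ]
    for item in split_options(options):
        lines.append(strip_quotes(item) if unquote else item)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "CHANGEME" is a placeholder, not a real credential.
    sys.stdout.write(build_fence_stdin(
        "192.168.2.2", "Administrator", "CHANGEME", "status",
        'power_wait="60",lanplus="1"', unquote=True))
```

Saved under a hypothetical name such as fence_input.py, the output can be piped into /usr/sbin/fence_ipmilan on the proxy host, once with unquote=True and once with unquote=False, to see whether either form reproduces the "Unable to obtain correct plug status" error.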

I applied all the updates for CentOS and oVirt.

I have another problem: when rebooting any hypervisor, the hypervisor hangs.
The problem is with hpwdt (the HP watchdog driver):
"hpwdt unexpected close not stopping watchdog"
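
For context on that message: in the Linux watchdog API, a driver that supports "magic close" logs a warning of this kind when /dev/watchdog is closed without the character V having been written first, and the timer then keeps running. A minimal sketch of a clean release follows, assuming the driver honours magic close and was not loaded with nowayout. The function name is hypothetical, and the path is a parameter so the sketch can be tried on an ordinary file. On an oVirt host the device is typically already held by a daemon, so this is an illustration only.

```python
import os

MAGIC_CLOSE = b"V"  # "magic close" character from the kernel watchdog API


def release_watchdog(path="/dev/watchdog"):
    """Open the device, announce an intentional close, then close it.

    Caution: opening a real watchdog device arms the timer.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, MAGIC_CLOSE)  # tell the driver the close is deliberate
    finally:
        os.close(fd)
```

If whatever process holds the device is killed during shutdown before it can do the equivalent of this write, the timer stays armed, which would be consistent with the message seen at reboot.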

I added "intremap=no_x2apic_optout" to the kernel parameters, but it didn't
change anything.
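
Before drawing conclusions from that, it may be worth confirming that the running kernel actually booted with the parameter. A small sketch, assuming the usual space-separated layout of /proc/cmdline (values containing quoted spaces are not handled); the function name is hypothetical.

```python
def kernel_param_value(cmdline, name):
    """Return the value of name=value, '' for a bare flag, or None if absent."""
    for token in cmdline.split():
        key, _sep, value = token.partition("=")
        if key == name:
            return value
    return None
```

On the hypervisor, kernel_param_value(open("/proc/cmdline").read(), "intremap") should return "no_x2apic_optout" if the setting took effect; None would mean the boot loader configuration was not applied.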

I am thinking of testing with the latest available kernel to see if it's a
kernel problem.

I am also going to reinstall the platform with CentOS 6 to see whether there
is any difference.




2015-06-10 12:00 GMT+01:00 wodel youchi <wodel.youchi at gmail.com>:

> Hi,
>
> engine log is already in debug mode
>
> here it is:
> 2015-06-10 11:48:23,653 INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ajp--127.0.0.1-8702-12) Correlation ID: null, Call Stack: null, Custom
> Event ID: -1, Message: Host hosted_engine_2 from cluster Default was chosen
> as a proxy to execute Status command on Host hosted_engine_1.
> 2015-06-10 11:48:23,653 INFO  [org.ovirt.engine.core.bll.FenceExecutor]
> (ajp--127.0.0.1-8702-12) Using Host hosted_engine_2 from cluster Default as
> proxy to execute Status command on Host
> 2015-06-10 11:48:23,673 INFO  [org.ovirt.engine.core.bll.FenceExecutor]
> (ajp--127.0.0.1-8702-12) Executing <Status> Power Management command, Proxy
> Host:hosted_engine_2, Agent:ipmilan, Target Host:, Management
> IP:192.168.2.2, User:Administrator, Options: power_wait="60",lanplus="1",
> Fencing policy:null
> 2015-06-10 11:48:23,703 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand]
> (ajp--127.0.0.1-8702-12) START, FenceVdsVDSCommand(HostName =
> hosted_engine_2, HostId = 0192d1ac-b905-4660-b149-4bef578985dd, targetVdsId
> = cf2d1260-7bb3-451a-9cd7-80e6a0ede52a, action = Status, ip = 192.168.2.2,
> port = , type = ipmilan, user = Administrator, password = ******, options =
> ' power_wait="60",lanplus="1"', policy = 'null'), log id: 2bda01bd
> 2015-06-10 11:48:23,892 WARN
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ajp--127.0.0.1-8702-12) Correlation ID: null, Call Stack: null, Custom
> Event ID: -1, Message: Power Management test failed for Host
> hosted_engine_1.Done
> 2015-06-10 11:48:23,892 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand]
> (ajp--127.0.0.1-8702-12) FINISH, FenceVdsVDSCommand, return:
> Test Succeeded, unknown, log id: 2bda01bd
> 2015-06-10 11:48:23,897 WARN
> [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-12) Fencing
> operation failed with proxy host 0192d1ac-b905-4660-b149-4bef578985dd,
> trying another proxy...
> 2015-06-10 11:48:24,039 ERROR [org.ovirt.engine.core.bll.FenceExecutor]
> (ajp--127.0.0.1-8702-12) Failed to run Power Management command on Host ,
> no running proxy Host was found.
> 2015-06-10 11:48:24,039 WARN  [org.ovirt.engine.core.bll.FenceExecutor]
> (ajp--127.0.0.1-8702-12) Failed to find other proxy to re-run failed fence
> operation, retrying with the same proxy...
> 2015-06-10 11:48:24,143 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ajp--127.0.0.1-8702-12) Correlation ID: null, Call Stack: null, Custom
> Event ID: -1, Message: Host hosted_engine_2 from cluster Default was chosen
> as a proxy to execute Status command on Host hosted_engine_1.
> 2015-06-10 11:48:24,143 INFO  [org.ovirt.engine.core.bll.FenceExecutor]
> (ajp--127.0.0.1-8702-12) Using Host hosted_engine_2 from cluster Default as
> proxy to execute Status command on Host
>
> 2015-06-10 11:48:24,148 INFO  [org.ovirt.engine.core.bll.FenceExecutor]
> (ajp--127.0.0.1-8702-12) Executing <Status> Power Management command, Proxy
> Host:hosted_engine_2, Agent:ipmilan, Target Host:, Management
> IP:192.168.2.2, User:Administrator, Options: power_wait="60",lanplus="1",
> Fencing policy:null
> 2015-06-10 11:48:24,165 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand]
> (ajp--127.0.0.1-8702-12) START, FenceVdsVDSCommand(HostName =
> hosted_engine_2, HostId = 0192d1ac-b905-4660-b149-4bef578985dd, targetVdsId
> = cf2d1260-7bb3-451a-9cd7-80e6a0ede52a, action = Status, ip = 192.168.2.2,
> port = , type = ipmilan, user = Administrator, password = ******, options =
> ' power_wait="60",lanplus="1"', policy = 'null'), log id: 7e7f2726
> 2015-06-10 11:48:24,360 WARN
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ajp--127.0.0.1-8702-12) Correlation ID: null, Call Stack: null, Custom
> Event ID: -1, Message: Power Management test failed for Host
> hosted_engine_1.Done
> 2015-06-10 11:48:24,360 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand]
> (ajp--127.0.0.1-8702-12) FINISH, FenceVdsVDSCommand, return: Test
> Succeeded, unknown, log id: 7e7f2726
>
>
> VDSM log from hosted_engine_2
>
> JsonRpcServer::DEBUG::2015-06-10
> 11:48:23,640::__init__::506::jsonrpc.JsonRpcServer::(serve_requests)
> Waiting for request
> Thread-2201::DEBUG::2015-06-10 11:48:23,642::API::1209::vds::(fenceNode)
> fenceNode(addr=192.168.2.2,port=,agent=ipmilan,user=Administrator,passwd=XXXX,action=status,secure=False,options=
> power_wait="60"
> lanplus="1",policy=None)
> Thread-2201::DEBUG::2015-06-10 11:48:23,642::utils::739::root::(execCmd)
> /usr/sbin/fence_ipmilan (cwd None)
> Thread-2201::DEBUG::2015-06-10 11:48:23,709::utils::759::root::(execCmd) FAILED:
> <err> = 'Failed: Unable to obtain correct plug status or plug is not
> available\n\n\n'; <rc> = 1
> Thread-2201::DEBUG::2015-06-10 11:48:23,710::API::1164::vds::(fence) rc 1
> inp agent=fence_ipmilan
> ipaddr=192.168.2.2
> login=Administrator
> action=status
> passwd=XXXX
>  power_wait="60"
> lanplus="1" out [] err ['Failed: Unable to obtain correct plug status or
> plug is not available', '', '']
> Thread-2201::DEBUG::2015-06-10 11:48:23,710::API::1235::vds::(fenceNode)
> rc 1 in agent=fence_ipmilan
> ipaddr=192.168.2.2
> login=Administrator
> action=status
> passwd=XXXX
>  power_wait="60"
> lanplus="1" out [] err ['Failed: Unable to obtain correct plug status or
> plug is not available', '', '']
> Thread-2201::DEBUG::2015-06-10
> 11:48:23,710::stompReactor::163::yajsonrpc.StompServer::(send) Sending
> response
> JsonRpc (StompReactor)::DEBUG::2015-06-10
> 11:48:23,712::stompReactor::98::Broker.StompAdapter::(handle_frame)
> Handling message <StompFrame command='SEND'>
> JsonRpcServer::DEBUG::2015-06-10
> 11:48:23,713::__init__::506::jsonrpc.JsonRpcServer::(serve_requests)
> Waiting for request
> Thread-2202::DEBUG::2015-06-10 11:48:23,715::API::1209::vds::(fenceNode)
> fenceNode(addr=192.168.2.2,port=,agent=ipmilan,user=Administrator,passwd=XXXX,action=status,secure=False,options=
> power_wait="60"
> lanplus="1",policy=None)
> Thread-2202::DEBUG::2015-06-10 11:48:23,715::utils::739::root::(execCmd)
> /usr/sbin/fence_ipmilan (cwd None)
> Thread-2202::DEBUG::2015-06-10 11:48:23,781::utils::759::root::(execCmd)
> FAILED: <err> = 'Failed: Unable to obtain correct plug status or plug is
> not available\n\n\n'; <rc> = 1
> Thread-2202::DEBUG::2015-06-10 11:48:23,781::API::1164::vds::(fence) rc 1
> inp agent=fence_ipmilan
> ipaddr=192.168.2.2
> login=Administrator
> action=status
> passwd=XXXX
>  power_wait="60"
> lanplus="1" out [] err ['Failed: Unable to obtain correct plug status or
> plug is not available', '', '']
>
>
>
> I triple-checked: I used the correct IPs, login and password, and the test
> from the console works.
>
> 2015-06-10 10:31 GMT+01:00 Martin Perina <mperina at redhat.com>:
>
>> Hi,
>>
>> I just installed engine 3.5.2 on Centos 7.1, added 2 Centos 7.1 hosts (both
>> with ipmilan fence devices) and everything worked fine. I also tried adding
>> the options
>>
>>   lanplus="1", power_wait="60"
>>
>> and even with them, getting the power status of the hosts worked fine.
>>
>> So could you please check again settings of your hosts in webadmin?
>>
>>  hosted_engine1
>>    PM address: IP address of ILO4 interface of the host hosted_engine1
>>
>>
>>  hosted_engine2
>>    PM address: IP address of ILO4 interface of the host hosted_engine2
>>
>> If the IP addresses are entered correctly, please enable DEBUG logging for
>> the engine, execute the test of the PM settings for one host and attach the
>> engine logs and the VDSM logs from both hosts.
>>
>> Thanks
>>
>> Martin Perina
>>
>>
>> ----- Original Message -----
>> > From: "wodel youchi" <wodel.youchi at gmail.com>
>> > To: "users" <Users at ovirt.org>
>> > Sent: Tuesday, June 9, 2015 2:41:02 PM
>> > Subject: [ovirt-users] [Centos7.1x64] [Ovirt 3.5.2] Test fence : Power
>> > management test failed for Host hosted_engine1 Done
>> >
>> > Hi,
>> >
>> > I have a weird problem with fencing
>> >
>> > I have a cluster of two HP DL380p G8 (ILO4)
>> >
>> > Centos7.1x64 and oVirt 3.5.2 ALL UPDATED
>> >
>> > I configured fencing first with ilo4 then ipmilan
>> >
>> > When testing fence from the engine I get : Succeeded, Unknown
>> >
>> > And in alerts tab I get : Power management test failed for Host
>> > hosted_engine1 Done (the same for host2)
>> >
>> > I tested with fence_ilo4 and fence_ipmilan and they report the result
>> > correctly
>> >
>> > # fence_ipmilan -P -a 192.168.2.2 -o status -l Administrator -p ertyuiop -v
>> > Executing: /usr/bin/ipmitool -I lanplus -H 192.168.2.2 -U Administrator -P
>> > ertyuiop -p 623 -L ADMINISTRATOR chassis power status
>> >
>> > 0 Chassis Power is on
>> >
>> >
>> > Status: ON
>> >
>> >
>> > # fence_ilo4 -l Administrator -p ertyuiop -a 192.168.2.2 -o status -v
>> > Executing: /usr/bin/ipmitool -I lanplus -H 192.168.2.2 -U Administrator -P
>> > ertyuiop -p 623 -L ADMINISTRATOR chassis power status
>> >
>> > 0 Chassis Power is on
>> >
>> >
>> > Status: ON
>> >
>> > ----------------------------------
>> > These are the options passed to fence_ipmilan (I tested with the options
>> > and without them)
>> >
>> > lanplus="1", power_wait="60"
>> >
>> >
>> > This is the engine log:
>> >
>> > 2015-06-09 13:35:29,287 INFO [org.ovirt.engine.core.bll.FenceExecutor]
>> > (ajp--127.0.0.1-8702-7) Using Host hosted_engine_2 from cluster Default as
>> > proxy to execute Status command on Host
>> > 2015-06-09 13:35:29,289 INFO [org.ovirt.engine.core.bll.FenceExecutor]
>> > (ajp--127.0.0.1-8702-7) Executing <Status> Power Management command, Proxy
>> > Host:hosted_engine_2, Agent:ipmilan, Target Host:, Management
>> > IP:192.168.2.2, User:Administrator, Options: power_wait="60",lanplus="1",
>> > Fencing policy:null
>> > 2015-06-09 13:35:29,306 INFO
>> > [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand]
>> > (ajp--127.0.0.1-8702-7) START, FenceVdsVDSCommand(
>> > HostName = hosted_engine_2,
>> > HostId = 0192d1ac-b905-4660-b149-4bef578985dd,
>> > targetVdsId = cf2d1260-7bb3-451a-9cd7-80e6a0ede52a,
>> > action = Status,
>> > ip = 192.168.2.2,
>> > port = ,
>> > type = ipmilan,
>> > user = Administrator,
>> > password = ******,
>> > options = ' power_wait="60",lanplus="1"',
>> > policy = 'null'), log id: 24ce6206
>> > 2015-06-09 13:35:29,516 WARN
>> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> > (ajp--127.0.0.1-8702-7) Correlation ID: null, Call Stack: null, Custom Event
>> > ID: -1, Message: Power Management test failed for Host hosted_engine_1.Done
>> > 2015-06-09 13:35:29,516 INFO
>> > [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand]
>> > (ajp--127.0.0.1-8702-7) FINISH, FenceVdsVDSCommand, return: Test Succeeded,
>> > unknown , log id: 24ce6206
>> >
>> >
>> > and here the vdsm log from the proxy
>> >
>> > JsonRpcServer::DEBUG::2015-06-09
>> > 13:37:52,461::__init__::506::jsonrpc.JsonRpcServer::(serve_requests) Waiting
>> > for request
>> > Thread-131907::DEBUG::2015-06-09 13:37:52,463::API::1209::vds::(fenceNode)
>> > fenceNode(addr=192.168.2.2,port=,agent=ipmilan,user=Administrator,passwd=XXXX,action=status,secure=False,options=
>> > power_wait="60"
>> > lanplus="1",policy=None)
>> > Thread-131907::DEBUG::2015-06-09 13:37:52,463::utils::739::root::(execCmd)
>> > /usr/sbin/fence_ipmilan (cwd None)
>> > Thread-131907::DEBUG::2015-06-09 13:37:52,533::utils::759::root::(execCmd)
>> > FAILED: <err> = 'Failed: Unable to obtain correct plug status or plug is not
>> > available\n\n\n'; <rc> = 1
>> > Thread-131907::DEBUG::2015-06-09 13:37:52,533::API::1164::vds::(fence) rc 1
>> > inp agent=fence_ipmilan
>> > ipaddr=192.168.2.2
>> > login=Administrator
>> > action=status
>> > passwd=XXXX
>> > power_wait="60"
>> > lanplus="1" out [] err ['Failed: Unable to obtain correct plug status or plug
>> > is not available', '', '']
>> > Thread-131907::DEBUG::2015-06-09 13:37:52,533::API::1235::vds::(fenceNode) rc
>> > 1 in agent=fence_ipmilan
>> > ipaddr=192.168.2.2
>> > login=Administrator
>> > action=status
>> > passwd=XXXX
>> > power_wait="60"
>> > lanplus="1" out [] err [' Failed: Unable to obtain correct plug status or
>> > plug is not available ', '', '']
>> > Thread-131907::DEBUG::2015-06-09
>> > 13:37:52,534::stompReactor::163::yajsonrpc.StompServer::(send) Sending
>> > response
>> > Detector thread::DEBUG::2015-06-09
>> > 13:37:53,670::protocoldetector::187::vds.MultiProtocolAcceptor::(_add_connection)
>> > Adding connection from 127.0.0.1:55761
>> >
>> >
>> > VDSM rpms
>> > # rpm -qa | grep vdsm
>> > vdsm-cli-4.16.14-0.el7.noarch
>> > vdsm-python-zombiereaper-4.16.14-0.el7.noarch
>> > vdsm-xmlrpc-4.16.14-0.el7.noarch
>> > vdsm-yajsonrpc-4.16.14-0.el7.noarch
>> > vdsm-4.16.14-0.el7.x86_64
>> > vdsm-python-4.16.14-0.el7.noarch
>> > vdsm-jsonrpc-4.16.14-0.el7.noarch
>> >
>> > any idea?
>> >
>> > Thanks in advance.
>> >
>> >
>>
>
>