Hi All,
I have an issue with a new install of oVirt 3.5 in a 4-node cluster (our 3.4 cluster is working fine).
When I test fencing (or cause a kernel panic that triggers a fence), the fencing fails. On investigation, it appears that the fencing options are not being passed to the fencing script (fence_ipmilan in this case):
Fence options as entered in the GUI: lanplus, ipport=623, power_wait=4, privlvl=operator
from vdsm.log on the fence proxy node:
Thread-818296::DEBUG::2015-04-21 12:39:39,136::API::1209::vds::(fenceNode) fenceNode(addr=x.x.x.x,port=,agent=ipmilan,user=stonith,passwd=XXXX,action=status,secure=False,options= power_wait=4
Thread-818296::DEBUG::2015-04-21 12:39:39,137::utils::739::root::(execCmd) /usr/sbin/fence_ipmilan (cwd None)
Thread-818296::DEBUG::2015-04-21 12:39:39,295::utils::759::root::(execCmd) FAILED: <err> = 'Failed: Unable to obtain correct plug status or plug is not available\n\n\n'; <rc> = 1
Thread-818296::DEBUG::2015-04-21 12:39:39,296::API::1164::vds::(fence) rc 1 inp agent=fence_ipmilan
Thread-818296::DEBUG::2015-04-21 12:39:39,296::API::1235::vds::(fenceNode) rc 1 in agent=fence_ipmilan
Thread-818296::DEBUG::2015-04-21 12:39:39,297::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
from engine.log on the engine:
2015-04-21 12:39:38,843 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-4) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host mpc-ovirt-node03 from cluster Default was chosen as a proxy to execute Status command on Host mpc-ovirt-node04.
2015-04-21 12:39:38,845 INFO [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-4) Using Host mpc-ovirt-node03 from cluster Default as proxy to execute Status command on Host
2015-04-21 12:39:38,885 INFO [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-4) Executing <Status> Power Management command, Proxy Host:mpc-ovirt-node03, Agent:ipmilan, Target Host:, Management IP:x.x.x.x, User:stonith, Options: power_wait=4, ipport=623, privlvl=operator,lanplus, Fencing policy:null
2015-04-21 12:39:38,921 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (ajp--127.0.0.1-8702-4) START, FenceVdsVDSCommand(HostName = mpc-ovirt-node03, HostId = 5613a489-589d-4e89-ab01-3642795eedb8, targetVdsId = dbfa4e85-3e97-4324-b222-bf40a491db08, action = Status, ip = x.x.x.x, port = , type = ipmilan, user = stonith, password = ******, options = ' power_wait=4, ipport=623, privlvl=operator,lanplus', policy = 'null'), log id: 774f328
2015-04-21 12:39:39,338 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-4) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Power Management test failed for Host mpc-ovirt-node04.Done
2015-04-21 12:39:39,339 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (ajp--127.0.0.1-8702-4) FINISH, FenceVdsVDSCommand, return: Test Succeeded, unknown, log id: 774f328
2015-04-21 12:39:39,340 WARN [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-4) Fencing operation failed with proxy host 5613a489-589d-4e89-ab01-3642795eedb8, trying another proxy...
2015-04-21 12:39:39,594 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-4) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host mpc-ovirt-node01 from cluster Default was chosen as a proxy to execute Status command on Host mpc-ovirt-node04.
2015-04-21 12:39:39,595 INFO [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-4) Using Host mpc-ovirt-node01 from cluster Default as proxy to execute Status command on Host
2015-04-21 12:39:39,598 INFO [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-4) Executing <Status> Power Management command, Proxy Host:mpc-ovirt-node01, Agent:ipmilan, Target Host:, Management IP:x.x.x.x, User:stonith, Options: power_wait=4, ipport=623, privlvl=operator,lanplus, Fencing policy:null
2015-04-21 12:39:39,634 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (ajp--127.0.0.1-8702-4) START, FenceVdsVDSCommand(HostName = mpc-ovirt-node01, HostId = c3e8be6e-ac54-4861-b774-17ba5cc66dc6, targetVdsId = dbfa4e85-3e97-4324-b222-bf40a491db08, action = Status, ip = x.x.x.x, port = , type = ipmilan, user = stonith, password = ******, options = ' power_wait=4, ipport=623, privlvl=operator,lanplus', policy = 'null'), log id: 6369eb1
2015-04-21 12:39:40,056 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-4) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Power Management test failed for Host mpc-ovirt-node04.Done
2015-04-21 12:39:40,057 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (ajp--127.0.0.1-8702-4) FINISH, FenceVdsVDSCommand, return: Test Succeeded, unknown, log id: 6369eb1
For verification, I temporarily replaced /usr/sbin/fence_ipmilan with a shell script that dumps the environment plus any CLI arguments passed to it into a log file:
-------------------------- Tue Apr 21 12:39:39 EDT 2015 ----------------------------
ENV DUMP:
LC_ALL=C
USER=vdsm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
PWD=/
LANG=en_CA.UTF-8
LIBVIRT_LOG_FILTERS=
SHLVL=1
HOME=/var/lib/vdsm
LOGNAME=vdsm
LIBVIRT_LOG_OUTPUTS=
_=/usr/bin/env
------------------------------
CLI DUMP:
<this is where the cli args should be listed>
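A minimal sketch of a wrapper of this kind is below. It is not the exact script used for the dump above; the function name `dump_invocation`, the `FENCE_DUMP_LOG` variable and the default log path are all hypothetical. It also records standard input, on the assumption that vdsm may be handing the options to the agent there: fence agents from the fence-agents package can read `name=value` options from stdin when no command-line arguments are given, and the "inp agent=fence_ipmilan" line in vdsm.log hints at that.

```shell
#!/bin/sh
# Hypothetical debugging stand-in for a fence agent (sketch only).
# Appends the environment, the CLI arguments and anything received on
# standard input to a log file.

dump_invocation() {
    # Assumed log location; any file writable by the vdsm user would do.
    fence_dump_log="${FENCE_DUMP_LOG:-/tmp/fence_ipmilan_dump.log}"
    {
        echo "-------------------------- $(date) ----------------------------"
        echo "ENV DUMP:"
        env
        echo "------------------------------"
        echo "CLI DUMP:"
        # One argument per line, so embedded spaces stay visible.
        for arg in "$@"; do
            printf '%s\n' "$arg"
        done
        echo "------------------------------"
        echo "STDIN DUMP:"
        # Read stdin only when it is not a terminal, so an interactive
        # run does not block waiting for input.
        if [ ! -t 0 ]; then
            cat
        fi
    } >> "$fence_dump_log"
}
```

Used as the replacement script, it would end with a call to `dump_invocation "$@"`. If the options show up under STDIN DUMP rather than CLI DUMP, an empty CLI section would not by itself mean they were dropped.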
Version info:
libvirt version: 1.2.8, package: 16.el7_1.2 (CentOS BuildSystem <http://bugs.centos.org>, 2015-03-26-23:17:42, worker1.bsys.centos.org)
fence_ipmilan: 4.0.11 (built Mon Apr 13 13:22:18 UTC 2015)
vdsm.x86_64: 4.16.10-8.gitc937927.el7
ovirt-engine.noarch: 3.5.1.1-1.el6
Engine OS: CentOS 6.6
Host OS: CentOS 7.1.1503
I've found some old posts from 2012 that describe the same problem. Has anyone else run into this?
Any thoughts or suggestions would be appreciated.
Cheers,
Mike