Fencing failed, fence agent ipmilan used instead of ilo4

Hi,

I'm running an oVirt hosted-engine environment on 3 hosts. To test the VMs' HA functionality, I shut down the link on host02, where one of my HA VMs was running, using this command:

2016-05-10 09:59:19 ICT [root@host02 ~]# ip link set bond0 down

A few seconds later, an attempt to fence host02 was issued, and this entry appeared in the web UI events tab: "May 10, 2016 10:00:34 ... Executing power management status on Host hosted_engine_2 using Proxy Host hosted_engine_1 and Fence Agent ipmilan:172.16.3.5.". The IP "172.16.3.5" was correct, but the Fence Agent "ipmilan" was not.

Even though a failure message "May 10, 2016 10:00:36 ... Execution of power management status on Host hosted_engine_2 using Proxy Host hosted_engine_1 and Fence Agent ipmilan:172.16.3.5 failed." also appeared in the events tab, host02 was successfully powered off. The last message in the web GUI events tab is "May 10, 2016 10:00:40 AM ... Host hosted_engine_2 is rebooting.", but the host wasn't actually rebooted - I had to boot it manually using the iLO web UI.

How can I fix this issue so that the VMs' HA works? Thank you.
Here are my power management settings:

hosted_engine_1 -> ilo4 : 172.16.3.4
hosted_engine_2 -> ilo4 : 172.16.3.5
hosted_engine_3 -> ilo4 : 172.16.3.6

Here are the log files: https://app.box.com/s/fs5let8955rjbcuxuy0p42ixj4dzou6m

[root@engine ~]# rpm -qa | grep ovirt
ovirt-engine-wildfly-8.2.1-1.el7.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-engine-cli-3.6.2.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-engine-backend-3.6.5.3-1.el7.centos.noarch
ovirt-iso-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.6.5.3-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-release36-007-1.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-image-uploader-3.6.0-1.el7.centos.noarch
ovirt-engine-extension-aaa-jdbc-1.0.6-1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch
ovirt-host-deploy-java-1.4.1-1.el7.centos.noarch
ovirt-engine-setup-base-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-backup-3.6.5.3-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.0-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-3.6.5.3-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-tools-3.6.5.3-1.el7.centos.noarch
ovirt-engine-restapi-3.6.5.3-1.el7.centos.noarch
ovirt-engine-3.6.5.3-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.11-1.el7.noarch
ovirt-engine-wildfly-overlay-8.0.5-1.el7.noarch
ovirt-engine-lib-3.6.5.3-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.6.5.3-1.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-3.6.5.3-1.el7.centos.noarch
ovirt-engine-userportal-3.6.5.3-1.el7.centos.noarch
ovirt-engine-dbscripts-3.6.5.3-1.el7.centos.noarch

[root@host03 ~]# rpm -qa | grep ovirt
ovirt-vmconsole-1.0.0-1.el7.centos.noarch
ovirt-host-deploy-1.4.1-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
ovirt-hosted-engine-ha-1.3.5.3-1.1.el7.noarch
libgovirt-0.3.3-1.el7_2.1.x86_64
ovirt-hosted-engine-setup-1.3.5.0-1.1.el7.noarch
ovirt-setup-lib-1.0.1-1.el7.centos.noarch

[root@host03 ~]# rpm -qa | grep vdsm
vdsm-cli-4.17.26-1.el7.noarch
vdsm-4.17.26-1.el7.noarch
vdsm-infra-4.17.26-1.el7.noarch
vdsm-xmlrpc-4.17.26-1.el7.noarch
vdsm-yajsonrpc-4.17.26-1.el7.noarch
vdsm-hook-vmfex-dev-4.17.26-1.el7.noarch
vdsm-python-4.17.26-1.el7.noarch
vdsm-jsonrpc-4.17.26-1.el7.noarch

-- Wee
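One way to narrow down a fencing problem like this is to run the status check manually from the proxy host, the same check the engine delegates to it. The sketch below builds a fence-agents command line from an oVirt-style options string; the agent name and IP come from the settings above, but the username is a placeholder and the option-to-flag translation is illustrative, not the engine's actual code.

```python
# Sketch: build the fence-agents CLI invocation for a manual status check.
# The username "admin" is a placeholder; run the resulting command on the
# proxy host (fence-agents package required) with -p <password> appended.
def build_fence_cmd(agent, ip, user, options):
    """Translate an oVirt-style options string into fence-agents CLI flags."""
    cmd = [agent, "-a", ip, "-l", user, "-o", "status"]
    # oVirt stores extra options as comma-separated key=value pairs,
    # e.g. "lanplus=1,power_wait=30"; map them to long CLI flags.
    for opt in filter(None, options.split(",")):
        key, _, value = opt.partition("=")
        flag = "--" + key.replace("_", "-")
        cmd.append(flag if value in ("", "1") else flag + "=" + value)
    return cmd

cmd = build_fence_cmd("fence_ipmilan", "172.16.3.5", "admin",
                      "lanplus=1,power_wait=30")
print(" ".join(cmd))
# → fence_ipmilan -a 172.16.3.5 -l admin -o status --lanplus --power-wait=30
```

Running the same check with fence_ilo4 versus fence_ipmilan from a shell would show whether the agent choice or the options are at fault.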

Found a workaround: I changed the fence agent type from "ilo4" to "ipmilan" and added "lanplus=1,power_wait=30" (without quotes) to the options. Now the host can be fenced successfully, and all HA VMs on that host are restarted on other hosts.

I did a small experiment with the power_wait parameter; here are the results:

- power_wait=60 : HA VMs restarted and were pingable ~2:45 minutes after the connection was lost
- power_wait=30 : HA VMs restarted and were pingable ~2:15 minutes after the connection was lost

On 10/5/2559 12:52, Wee Sritippho wrote:
Hi,
I'm running an oVirt hosted-engine environment on 3 hosts. To test the VMs' HA functionality, I shut down the link on host02, where one of my HA VMs was running, using this command:
2016-05-10 09:59:19 ICT [root@host02 ~]# ip link set bond0 down
A few seconds later, an attempt to fence host02 was issued, and this entry appeared in the web UI events tab: "May 10, 2016 10:00:34 ... Executing power management status on Host hosted_engine_2 using Proxy Host hosted_engine_1 and Fence Agent ipmilan:172.16.3.5.". The IP "172.16.3.5" was correct, but the Fence Agent "ipmilan" was not.
Even though a failure message "May 10, 2016 10:00:36 ... Execution of power management status on Host hosted_engine_2 using Proxy Host hosted_engine_1 and Fence Agent ipmilan:172.16.3.5 failed." appeared in the events tab, host02 was successfully powered off.
The last message in the web GUI events tab is "May 10, 2016 10:00:40 AM ... Host hosted_engine_2 is rebooting.", but the host wasn't actually rebooted - I had to boot it manually using the iLO web UI.
How can I fix this issue so that the VMs' HA works?
Thank you.
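The timing difference in the experiment above lines up with what power_wait controls in the fence agents: how long the agent pauses after issuing a power command before checking the result. Below is a deliberately simplified model of a reboot action, not the real fence-agents code; the callbacks stand in for the actual IPMI calls.

```python
import time

# Simplified model of a fence agent's "reboot" action, to illustrate the
# power_wait option: the agent pauses power_wait seconds after each power
# command before verifying the device state. This is NOT the real
# fence-agents implementation, only an illustration of the knob.
def reboot(power_off, power_on, get_status, power_wait=4):
    power_off()
    time.sleep(power_wait)          # give the BMC time to actually cut power
    if get_status() != "off":
        raise RuntimeError("power off did not complete within power_wait")
    power_on()
    time.sleep(power_wait)          # and time to come back up
    return get_status()
```

With power_wait too short, the status check can run before iLO has finished cutting power, so the operation is reported as failed even though the host does go down - which matches the behaviour in the original report.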

Hi,

Currently, ilo3 and ilo4 are mapped implicitly to ipmilan with default parameters of 'ilo4:lanplus=1,power_wait=4'. So I think that in your case overriding only the power_wait parameter should work as well (it seems that the default is too short for your host).

On Tue, May 10, 2016 at 1:01 PM, Wee Sritippho <wee.s@forest.go.th> wrote:
Found a workaround.
Changed the fence agent type from "ilo4" to "ipmilan" then added "lanplus=1,power_wait=30" (without quotes) to options.
Now the host can be fenced successfully and all HA VMs on that host are restarted on other hosts.
Did a small experiment with the power_wait parameter; here are the results:
- power_wait=60 : HA VMs restarted and were pingable ~2:45 minutes after the connection was lost
- power_wait=30 : HA VMs restarted and were pingable ~2:15 minutes after the connection was lost
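Given that implicit mapping, the effective options are the agent's defaults with any user-supplied options layered on top, which is why overriding only power_wait suffices. A small sketch of that merge behaviour (illustrative only; the engine's real implementation differs):

```python
# Illustrative merge of default fence-agent options with user overrides,
# following the behaviour described above: ilo4 maps to ipmilan with
# defaults 'lanplus=1,power_wait=4', and user options win on conflict.
def parse_options(s):
    """Parse a comma-separated key=value options string into a dict."""
    return dict(opt.split("=", 1) for opt in s.split(",") if opt)

def effective_options(defaults, overrides):
    merged = parse_options(defaults)
    merged.update(parse_options(overrides))   # overrides replace defaults
    return ",".join(f"{k}={v}" for k, v in merged.items())

print(effective_options("lanplus=1,power_wait=4", "power_wait=30"))
# → lanplus=1,power_wait=30
```

So setting only "power_wait=30" in the options field should yield the same effective configuration as the full "lanplus=1,power_wait=30" workaround.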
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
participants (2): Eli Mesika, Wee Sritippho