On Fri, Feb 22, 2019 at 11:39 AM Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
Hi Martin,

On 21/02/2019 at 13:04, Martin Perina wrote:
> Hi Nicolas,
>
> see my reply inline

See mine below.

>
> On Mon, Feb 18, 2019 at 9:51 AM Nicolas Ecarnot <nicolas@ecarnot.net> wrote:
>
>     Hello,
>
>     As fence_idrac has never worked for us, and as fence_ipmilan has worked
>     nicely for years, we are using fence_ipmilan with the lanplus=1 option
>     and we're happy with it.
>
>     We upgraded to 4.3.0.4 and we can no longer fence our hosts:
>
>     2019-02-18 09:42:08,678+01 ERROR
>     [org.ovirt.engine.core.bll.pm.FenceProxyLocator] (default task-11)
>     [2f78ed99-6703-4d92-b7cb-948c2d24b623] Can not run fence action on host
>     'xxxxx', no suitable proxy host was found.
>
>
> This is not related to the fence_ipmilan issue below. To execute a
> fencing operation, the engine needs at least one other host in Up
> status, which it uses as a proxy host to perform the operation. So
> do you have at least one host in Up status in the same
> cluster/datacenter as the host you want to fence?

Yes.

> If so, then please enable debug information to find out why we cannot
> find any host acting as fence proxy:
>
> 1. Please download the log-control.sh script from
> https://github.com/oVirt/ovirt-engine/tree/master/contrib#log-control-sh
> and save it on the engine machine
> 2. Please execute the following on the engine machine:
>        log-control.sh org.ovirt.engine.core.bll.pm DEBUG
> 3. Go to the problematic host, click Edit, go to the Power Management tab,
> click on the existing fence agent and click the Test button
> 4. Take a look at engine.log; it should now contain the reason why we
> were not able to find a fence proxy
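
For step 4, the relevant lines can be pulled out of engine.log with a small
filter like the one below. This is only a sketch: the function name
fence_pm_lines is made up for illustration, and
/var/log/ovirt-engine/engine.log is assumed to be the engine log location
(pass another path if the installation differs).

```shell
#!/bin/sh
# Hypothetical helper (not part of oVirt): print only the log lines emitted
# by loggers under org.ovirt.engine.core.bll.pm, i.e. the package that
# log-control.sh switched to DEBUG in step 2.
# $1 = path to engine.log (assumed default: /var/log/ovirt-engine/engine.log)
fence_pm_lines() {
    log_file="${1:-/var/log/ovirt-engine/engine.log}"
    # -F matches the logger prefix literally, so the dots and the bracket
    # are not interpreted as regular-expression syntax.
    grep -F '[org.ovirt.engine.core.bll.pm.' "$log_file"
}
```

Right after clicking Test, something like "fence_pm_lines | tail -n 50" on
the engine machine should show only the power-management entries.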

I followed the instructions above, but I feel this is not the best debug
path. I learned nothing new.
The fence proxy is not missing. It is known and found, and it is trying
to do its job, as written below :

>
>
>     and on the SPM :
>
>     fence_ipmilan: Failed: Unable to obtain correct plug status or plug is
>     not available
>
>
> Could you please provide debug output of below command?
>
> ipmitool -vv -I lanplus -H <IPMI_INTERFACE_IP> -p 623 -U <IPMI_USERNAME>
> -P <IPMI_PASSWORD> -L ADMINISTRATOR chassis power status

See below a debug session.
I'm comparing two hosts, and only one of them answers fence status queries.

I must add that before the upgrade to 4.3, both hosts were responding
correctly.

fence_ipmilan --username=stonith --password='xxx' --lanplus
--ip=c-serv-hv-prds01.sdis.isere.fr --action=status -v
2019-02-22 11:34:01,537 INFO: Executing: /usr/bin/ipmitool -I lanplus -H
c-serv-hv-prds01.sdis.isere.fr -p 623 -U stonith -P [set] -L
ADMINISTRATOR chassis power status

2019-02-22 11:34:01,654 DEBUG: 0 Chassis Power is on


Status: ON
root@hv04:/etc# fence_ipmilan --username=stonith --password='xxx'
--lanplus --ip=c-hv05.prd.sdis38.fr --action=status -v
2019-02-22 11:34:15,335 INFO: Executing: /usr/bin/ipmitool -I lanplus -H
c-hv05.prd.sdis38.fr -p 623 -U stonith -P [set] -L ADMINISTRATOR chassis
power status

2019-02-22 11:34:35,338 ERROR: Connection timed out

Unfortunately, fence_ipmilan cannot display more debugging details, so as mentioned earlier, could you please run ipmitool directly?

ipmitool -vv -I lanplus -H c-hv05.prd.sdis38.fr -p 623 -U stonith -P <PASSWORD> -L ADMINISTRATOR chassis power status

The above should display more details ...
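
If it helps, the direct call can be wrapped so that the password stays out of
the shell history and the process list. This is a sketch under assumptions:
ipmi_status and DRY_RUN are names invented here, and it relies on ipmitool's
-E option, which reads the password from the IPMI_PASSWORD environment
variable instead of taking it with -P.

```shell
#!/bin/sh
# Hypothetical wrapper around the ipmitool call that fence_ipmilan logs.
# $1 = BMC host name or IP, $2 = IPMI user.
# The password is NOT passed on the command line: -E tells ipmitool to take
# it from the IPMI_PASSWORD environment variable.
ipmi_status() {
    # Rebuild the positional parameters as the full command line.
    set -- ipmitool -vv -I lanplus -H "$1" -p 623 -U "$2" -E \
        -L ADMINISTRATOR chassis power status
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command, so it can be reviewed before running it.
        echo "$*"
    else
        "$@"
    fi
}
```

With IPMI_PASSWORD exported, "ipmi_status c-hv05.prd.sdis38.fr stonith 2>&1 |
tee ipmi-hv05.log" would keep the verbose output for posting to the list.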



root@hv04:/etc# nmap c-serv-hv-prds01.sdis.isere.fr

Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-serv-hv-prds01.sdis.isere.fr (192.168.53.2)
Host is up (0.010s latency).
rDNS record for 192.168.53.2: c-5g3yxx1.sdis.isere.fr
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc

Nmap done: 1 IP address (1 host up) scanned in 0.45 seconds
root@hv04:/etc# nmap c-hv05.prd.sdis38.fr

Starting Nmap 6.40 ( http://nmap.org ) at 2019-02-22 11:34 CET
Nmap scan report for c-hv05.prd.sdis38.fr (192.168.50.194)
Host is up (0.00060s latency).
rDNS record for 192.168.50.194: C-550W2S2.sdis.isere.fr
Not shown: 996 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
80/tcp   open  http
443/tcp  open  https
5900/tcp open  vnc
MAC Address: CC:C5:E5:57:26:E0 (Unknown)

Nmap done: 1 IP address (1 host up) scanned in 0.20 seconds
root@hv04:/etc# ping -c 1 c-serv-hv-prds01.sdis.isere.fr
PING c-5g3yxx1.sdis.isere.fr (192.168.53.2) 56(84) bytes of data.
64 bytes from c-5g3yxx1.sdis.isere.fr (192.168.53.2): icmp_seq=1 ttl=61
time=2.37 ms

--- c-5g3yxx1.sdis.isere.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.371/2.371/2.371/0.000 ms
root@hv04:/etc# ping -c 1 c-hv05.prd.sdis38.fr
PING c-550w2s2.prd.sdis38.fr (192.168.50.194) 56(84) bytes of data.
64 bytes from C-550W2S2.sdis.isere.fr (192.168.50.194): icmp_seq=1
ttl=64 time=0.189 ms

--- c-550w2s2.prd.sdis38.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.189/0.189/0.189/0.000 ms
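
A side note on the scans above: nmap run this way probes TCP ports only,
while "-I lanplus" talks RMCP+ over UDP port 623. The identical TCP port
lists therefore do not show whether IPMI itself is reachable on either host.
A UDP probe could look like the sketch below; ipmi_udp_probe and DRY_RUN are
hypothetical names, and nmap's -sU scan needs root.

```shell
#!/bin/sh
# Hypothetical helper: probe the UDP port used by IPMI over LAN.
# $1 = BMC host name or IP.
ipmi_udp_probe() {
    # -sU selects a UDP scan, -p 623 restricts it to the IPMI port.
    set -- nmap -sU -p 623 "$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command instead of running the scan.
        echo "$*"
    else
        "$@"
    fi
}
```

Running "ipmi_udp_probe c-hv05.prd.sdis38.fr" and comparing it with the
working host may help: a "closed" state would mean nothing is listening on
UDP 623 there, whereas "open|filtered" is inconclusive.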


>
> The above is the command that fence_ipmilan executes internally, and -vv
> adds debugging output which can reveal the issue with the plug status
>
> Regards,
> Martin
>
>
>     I found the suggested workaround here :
>
>     https://access.redhat.com/solutions/3349841
>
>     but no combination of
>     - lanplus={0,1}
>     - -z
>     - ssl={0,1}
>
>     led to a solution.
>
>     The package version is the same as what's described in the KB :
>     fence-agents-rhevm-4.2.1-11.el7_6.7.x86_64
>
>     What should I test now?
>
>     Thank you.
>
>     --
>     Nicolas ECARNOT
>     List Archives:
>     https://lists.ovirt.org/archives/list/users@ovirt.org/message/SEUAZ6JB6CIYY2GOBNJN2XSWOSH6DHDJ/
>
>
>
> --
> Martin Perina
> Associate Manager, Software Engineering
> Red Hat Czech s.r.o.


--
Nicolas ECARNOT


--
Martin Perina
Associate Manager, Software Engineering
Red Hat Czech s.r.o.