On Tue, Mar 19, 2019 at 1:33 PM Juhani Rautiainen
<juhani.rautiainen(a)gmail.com> wrote:
On Tue, Mar 19, 2019 at 12:46 PM Juhani Rautiainen
It seems that either our firewall is not responding to pings or
something else is wrong; this shows up in broker.log.
The curious thing is that the reboot happens even when the ping comes
back within a couple of seconds. Does the ping have a timeout, or are
the pings fired in quick succession?
I don't know much Python, but I think there is a problem in
broker/ping.py. I noticed that these ping failures happen about every
fifteen minutes:
[root@ovirt01 ~]# grep Failed /var/log/ovirt-hosted-engine-ha/broker.log
Thread-1::WARNING::2019-03-19 14:04:44,898::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1, (4 out of 5)
Thread-1::WARNING::2019-03-19 14:19:38,891::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1, (4 out of 5)
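To check the "every fifteen minutes" impression, the gap between consecutive warnings can be computed from the timestamps. This is only a minimal sketch: the function names and the regular expression are my own, and it assumes the log line layout shown in the grep output above.

```python
import re
from datetime import datetime

# Matches the date and time of a ping warning. \s+ tolerates a line break
# between date and time, as happens when the log is pasted into an email.
TS = re.compile(
    r"WARNING::(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2}),(\d{3})::ping::"
)

# The two warnings quoted above.
SAMPLE_LOG = """\
Thread-1::WARNING::2019-03-19 14:04:44,898::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1, (4 out of 5)
Thread-1::WARNING::2019-03-19 14:19:38,891::ping::63::ping.Ping::(action) Failed to ping 10.168.8.1, (4 out of 5)
"""


def failure_times(log_text):
    """Return the timestamps of all ping warnings found in log_text."""
    return [
        datetime.strptime(f"{date} {clock}.{ms}", "%Y-%m-%d %H:%M:%S.%f")
        for date, clock, ms in TS.findall(log_text)
    ]


def gaps_seconds(times):
    """Return the seconds elapsed between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps_seconds(failure_times(SAMPLE_LOG)):
        print(f"{gap / 60:.1f} minutes between failures")
```

For the two warnings above, the gap comes out just under fifteen minutes.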
I monitored the firewall and the network traffic on the host, and ping
works, but ping.py somehow thinks that it did not get replies. I can't
see anything obvious in the code. This is the tcpdump output from the
time frame of that last failure:
14:19:22.598518 IP ovirt01.virt.local > gateway: ICMP echo request, id 19055, seq 1, length 64
14:19:22.598705 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19055, seq 1, length 64
14:19:23.126800 IP ovirt01.virt.local > gateway: ICMP echo request, id 19056, seq 1, length 64
14:19:23.126978 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19056, seq 1, length 64
14:19:23.653544 IP ovirt01.virt.local > gateway: ICMP echo request, id 19057, seq 1, length 64
14:19:23.653731 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19057, seq 1, length 64
14:19:24.180846 IP ovirt01.virt.local > gateway: ICMP echo request, id 19058, seq 1, length 64
14:19:24.181042 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19058, seq 1, length 64
14:19:24.708083 IP ovirt01.virt.local > gateway: ICMP echo request, id 19065, seq 1, length 64
14:19:24.708274 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19065, seq 1, length 64
14:19:32.743986 IP ovirt01.virt.local > gateway: ICMP echo request, id 19141, seq 1, length 64
14:19:35.160398 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19141, seq 1, length 64
14:19:35.271171 IP ovirt01.virt.local > gateway: ICMP echo request, id 19152, seq 1, length 64
14:19:35.365315 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19152, seq 1, length 64
14:19:35.892716 IP ovirt01.virt.local > gateway: ICMP echo request, id 19154, seq 1, length 64
14:19:36.002087 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19154, seq 1, length 64
14:19:36.529263 IP ovirt01.virt.local > gateway: ICMP echo request, id 19156, seq 1, length 64
14:19:38.359281 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19156, seq 1, length 64
14:19:38.887231 IP ovirt01.virt.local > gateway: ICMP echo request, id 19201, seq 1, length 64
14:19:38.889774 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19201, seq 1, length 64
14:19:42.923684 IP ovirt01.virt.local > gateway: ICMP echo request, id 19234, seq 1, length 64
14:19:42.923951 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19234, seq 1, length 64
14:19:43.450788 IP ovirt01.virt.local > gateway: ICMP echo request, id 19235, seq 1, length 64
14:19:43.450968 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19235, seq 1, length 64
14:19:43.977791 IP ovirt01.virt.local > gateway: ICMP echo request, id 19237, seq 1, length 64
14:19:43.977965 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19237, seq 1, length 64
14:19:44.504541 IP ovirt01.virt.local > gateway: ICMP echo request, id 19238, seq 1, length 64
14:19:44.504715 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19238, seq 1, length 64
14:19:45.031570 IP ovirt01.virt.local > gateway: ICMP echo request, id 19244, seq 1, length 64
14:19:45.031752 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19244, seq 1, length 64
No failed pings to be seen. So how does ping.py decide that 4 out of 5 failed?
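Every request in the capture gets a reply, but some replies are slow, so it may help to pair each echo request with its reply and look at the round-trip time. Below is a minimal sketch of that check. The function names are my own, the sample is four request/reply pairs taken from the capture above, and the 1.0 second threshold is purely a hypothetical value for illustration: I do not know what timeout, if any, ping.py actually applies.

```python
import re

# One tcpdump ICMP echo line. \s+ tolerates line breaks introduced by
# email wrapping between "request," / "reply," and "id", and after "id".
PKT = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{6}) IP \S+ > \S+: "
    r"ICMP echo (request|reply),\s+id\s+(\d+),\s+seq\s+(\d+)"
)

# Four request/reply pairs copied from the capture above.
SAMPLE = """\
14:19:24.708083 IP ovirt01.virt.local > gateway: ICMP echo request, id 19065, seq 1, length 64
14:19:24.708274 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19065, seq 1, length 64
14:19:32.743986 IP ovirt01.virt.local > gateway: ICMP echo request, id 19141, seq 1, length 64
14:19:35.160398 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19141, seq 1, length 64
14:19:35.271171 IP ovirt01.virt.local > gateway: ICMP echo request, id 19152, seq 1, length 64
14:19:35.365315 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19152, seq 1, length 64
14:19:36.529263 IP ovirt01.virt.local > gateway: ICMP echo request, id 19156, seq 1, length 64
14:19:38.359281 IP gateway > ovirt01.virt.local: ICMP echo reply, id 19156, seq 1, length 64
"""


def round_trip_times(capture):
    """Map (icmp_id, seq) to the round-trip time in seconds."""
    sent = {}
    rtts = {}
    for hh, mm, ss, us, kind, icmp_id, seq in PKT.findall(capture):
        # Work in integer microseconds to avoid float rounding on subtraction.
        t_us = ((int(hh) * 60 + int(mm)) * 60 + int(ss)) * 1_000_000 + int(us)
        key = (int(icmp_id), int(seq))
        if kind == "request":
            sent[key] = t_us
        elif key in sent:
            rtts[key] = (t_us - sent[key]) / 1_000_000
    return rtts


def slow_replies(rtts, timeout):
    """Return only the entries whose round-trip time exceeds timeout seconds."""
    return {key: rtt for key, rtt in rtts.items() if rtt > timeout}


if __name__ == "__main__":
    HYPOTHETICAL_TIMEOUT = 1.0  # assumption for illustration, not from ping.py
    for (icmp_id, seq), rtt in round_trip_times(SAMPLE).items():
        flag = "SLOW" if rtt > HYPOTHETICAL_TIMEOUT else "ok"
        print(f"id {icmp_id} seq {seq}: {rtt:.6f} s {flag}")
```

On this sample the replies for id 19141 and id 19156 arrive about 2.4 and 1.8 seconds after their requests, while the others return in well under 0.1 seconds. If the monitor gives up on a ping before such a late reply arrives, it would count that ping as failed even though tcpdump shows a reply, but that is only a guess until the timeout in ping.py is confirmed.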
Thanks,
Juhani