Changing the engine HA ping address?

I have an up-to-date hosted-engine 3.5.1 setup (CentOS 7 for the nodes, CentOS 6 for the engine), and the engine keeps jumping between the two nodes running the hosted-engine HA (sometimes after just 10-20 minutes, sometimes after a day or two). I figured out that the ping to the gateway is sometimes failing. The gateway IP is a layer-3 switch, and I think it sometimes just does not respond to ICMP echo requests in a timely fashion (traffic is routing just fine though).

How is the HA ping implemented? How many requests does it send (and how many responses are required for the result to be considered "good")?

If I can't tweak the sensitivity of the ping, I'd like to ping a different IP (on an HA load balancer setup). The oVirt HA config refers to it as "gateway" though; is it really used as a gateway in any case, or is that just the recommended IP? Can I just edit /etc/ovirt-hosted-engine/hosted-engine.conf on the two nodes and restart the ovirt-ha-broker service?

-- Chris Adams <cma@cmadams.net>

Once upon a time, Chris Adams <cma@cmadams.net> said:
The gateway IP is a layer-3 switch, and I think it sometimes just does not respond to ICMP echo requests in a timely fashion (traffic is routing just fine though). How is the HA ping implemented? How many requests does it send (and how many responses are required for the result to be considered "good")?
I see in ovirt_hosted_engine_he/broker/submonitors/ping.py that only one packet is sent. That's probably not a great way to do things; a number of routers, firewalls, etc. put ICMP echo requests addressed to the device (as opposed to through the device) at the very lowest priority, and drop them under any load.

A better way would be to send multiple requests, with only one answer required. "ping -c 1 -i 0.2 -w <timeout> -W <timeout> <IP>" should do that.

-- Chris Adams <cma@cmadams.net>
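
For illustration, here is a rough Python sketch of how a check built around that ping command might look. This is not the code in ping.py; the function names, the default timeout, and the injectable "run" parameter are all made up for the example. It assumes the iputils version of ping, where combining -c with -w keeps sending requests until either the requested number of replies has arrived or the deadline passes.

```python
import subprocess


def build_ping_command(address, timeout_s=2, interval_s=0.2):
    """Build the ping invocation suggested above (hypothetical helper)."""
    return [
        "ping",
        "-c", "1",              # stop as soon as one reply has arrived
        "-i", str(interval_s),  # send another request every interval_s seconds
        "-w", str(timeout_s),   # overall deadline, in seconds
        "-W", str(timeout_s),   # how long to wait for a reply, in seconds
        address,
    ]


def gateway_is_alive(address, timeout_s=2, run=subprocess.call):
    """Return True if ping exits with status 0, i.e. a reply was received.

    The run argument exists only so the check can be exercised without
    sending real packets; by default it is subprocess.call.
    """
    status = run(
        build_ping_command(address, timeout_s),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return status == 0
```

Calling gateway_is_alive("192.0.2.1") (a documentation address used here as a placeholder) would run the real ping binary. With the defaults above, several requests go out within the two-second deadline, so a single dropped echo request no longer counts as a failure.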