[ovirt-users] Debugging why hosted engine flips between EngineUp and EngineBadHealth

Luca 'remix_tj' Lorenzetto lorenzetto.luca at gmail.com
Fri Dec 1 09:50:52 UTC 2017


Hi all,

For the past few days my hosted-engine environments (one RHEV 4.0.7, one
oVirt 4.1.7) have kept sending mails about transitions between EngineUp
and EngineBadHealth.

This is pretty annoying and I have not been able to find the root cause.

The only issue I have seen on the hosts is this error about sending mails,
which shows up at random intervals:

Thread-1::ERROR::2017-12-01 03:05:05,084::notifications::39::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email) [Errno -2] Name or service not known
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 26, in send_email
    timeout=float(cfg["smtp-timeout"]))
  File "/usr/lib64/python2.7/smtplib.py", line 255, in __init__
    (code, msg) = self.connect(host, port)
  File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
    self.sock = self._get_socket(host, port, self.timeout)
  File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
    return socket.create_connection((host, port), timeout)
  File "/usr/lib64/python2.7/socket.py", line 553, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
gaierror: [Errno -2] Name or service not known
Thread-6::WARNING::2017-12-01 03:05:05,427::engine_health::130::engine_health.CpuLoadNoEngine::(action) bad health status: Hosted Engine is not up!

There are no errors in the engine logs, and all the API queries made by
ovirt-hosted-engine-ha return HTTP code 200.

I suspect the switch between EngineUp and EngineBadHealth could be
caused by DNS resolution issues, but no message in the logs shows this
clearly, so our netadmins have nothing concrete to trace.
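
To give the netadmins timestamps to correlate with their captures, a small
probe along these lines could be left running on one of the hosts. This is
only a sketch: resolve_once and probe are hypothetical helper names, not
part of ovirt-hosted-engine-ha, and the default port 25 is an assumption.
It calls getaddrinfo the same way the traceback above does.

```python
import socket
import time
from datetime import datetime, timezone


def resolve_once(host, port=25):
    """Resolve host with the same getaddrinfo call seen in the traceback.

    Returns (ok, detail): detail is a sorted list of addresses on success,
    or the gaierror text on failure.
    """
    try:
        infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return True, sorted({info[4][0] for info in infos})


def probe(host, attempts=10, interval=1.0, port=25):
    """Resolve host repeatedly, printing one timestamped line per attempt.

    Returns the number of failed attempts.
    """
    failures = 0
    for i in range(attempts):
        start = time.monotonic()
        ok, detail = resolve_once(host, port)
        elapsed_ms = (time.monotonic() - start) * 1000.0
        stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
        status = "OK" if ok else "FAIL"
        print(f"{stamp} {status} {host} {elapsed_ms:.1f}ms {detail}")
        if not ok:
            failures += 1
        if i < attempts - 1:
            time.sleep(interval)
    return failures
```

Calling probe() with the SMTP server name or the engine FQDN configured on
the host, and a long attempts value, would show whether resolution failures
line up with the EngineBadHealth mails.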

Is there a way to increase the verbosity of broker.log and agent.log?

Luca

-- 
"It is absurd to employ men of excellent intelligence to perform
calculations that could be entrusted to anyone if machines were used"
Gottfried Wilhelm von Leibnitz, Philosopher and Mathematician (1646-1716)

"The Internet is the largest library in the world.
But the problem is that the books are all scattered on the floor"
John Allen Paulos, Mathematician (1945-living)

Luca 'remix_tj' Lorenzetto, http://www.remixtj.net , <lorenzetto.luca at gmail.com>

