Hi,
I am afraid we do not have logs that go that deep into the stack. DNS
resolution issues will definitely affect both the notification system (if
not using localhost SMTP) and the engine status checks (because we use the
FQDN).
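Since the existing logs do not record resolver timing, one way to get that data is to measure it on the host directly. The following is a minimal sketch, not part of oVirt: `time_resolution` is a hypothetical helper that times a single lookup through the system resolver, and the port number is an arbitrary assumption.

```python
import socket
import time


def time_resolution(hostname, port=443):
    """Resolve hostname once through the system resolver.

    Returns (elapsed_seconds, addresses); addresses is None if the lookup failed.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string for both IPv4 and IPv6.
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addresses = None
    return time.monotonic() - start, addresses


if __name__ == "__main__":
    # "localhost" is only a placeholder; on the host one would pass the engine FQDN.
    elapsed, addresses = time_resolution("localhost")
    print(f"resolved in {elapsed:.3f}s -> {addresses}")
```

Run periodically with the engine FQDN, a lookup that takes several seconds or returns no addresses would line up with the failed liveliness checks.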
Best regards
Martin
On Wed, Dec 13, 2017 at 3:15 PM, Luca 'remix_tj' Lorenzetto <
lorenzetto.luca(a)gmail.com> wrote:
Hello,
Today I started troubleshooting the DNS requests in more depth, and an
EngineUp -> EngineBadHealth event happened exactly while I was watching
tcpdump.
Looking at the DNS requests I see this:
[...]
14:30:35.909201 IP kvmhost01.intranet.company.it.55654 > dns.company.it.53: 34102+ A? engine01.intranet.company.it. (54)
14:30:35.909215 IP kvmhost01.intranet.company.it.55654 > dns.company.it.53: 6242+ AAAA? engine01.intranet.company.it. (54)
14:30:40.914285 IP kvmhost01.intranet.company.it.55654 > dns.company.it.53: 34102+ A? engine01.intranet.company.it. (54)
14:30:40.914316 IP kvmhost01.intranet.company.it.55654 > dns.company.it.53: 6242+ AAAA? engine01.intranet.company.it. (54)
14:30:45.918306 IP kvmhost01.intranet.company.it.54885 > dns.company.it.53: 60263+ A? engine01.intranet.company.it.intranet.company.it. (74)
14:30:45.918329 IP kvmhost01.intranet.company.it.54885 > dns.company.it.53: 18681+ AAAA? engine01.intranet.company.it.intranet.company.it. (74)
14:30:50.920376 IP kvmhost01.intranet.company.it.54885 > dns.company.it.53: 60263+ A? engine01.intranet.company.it.intranet.company.it. (74)
14:30:50.920411 IP kvmhost01.intranet.company.it.54885 > dns.company.it.53: 18681+ AAAA? engine01.intranet.company.it.intranet.company.it. (74)
14:30:56.044242 IP kvmhost01.intranet.company.it.58319 > dns.company.it.53: 28413+ A? engine01.intranet.company.it. (54)
14:30:56.044267 IP kvmhost01.intranet.company.it.58319 > dns.company.it.53: 29680+ AAAA? engine01.intranet.company.it. (54)
14:31:01.049761 IP kvmhost01.intranet.company.it.58319 > dns.company.it.53: 28413+ A? engine01.intranet.company.it. (54)
14:31:01.049777 IP kvmhost01.intranet.company.it.58319 > dns.company.it.53: 29680+ AAAA? engine01.intranet.company.it. (54)
14:31:06.052635 IP kvmhost01.intranet.company.it.58093 > dns.company.it.53: 24807+ A? engine01.intranet.company.it.intranet.company.it. (74)
14:31:06.052649 IP kvmhost01.intranet.company.it.58093 > dns.company.it.53: 53745+ AAAA? engine01.intranet.company.it.intranet.company.it. (74)
14:31:11.057724 IP kvmhost01.intranet.company.it.58093 > dns.company.it.53: 24807+ A? engine01.intranet.company.it.intranet.company.it. (74)
14:31:11.057745 IP kvmhost01.intranet.company.it.58093 > dns.company.it.53: 53745+ AAAA? engine01.intranet.company.it.intranet.company.it. (74)
14:31:16.175204 IP kvmhost01.intranet.company.it.44950 > dns.company.it.53: 63680+ A? engine01.intranet.company.it. (54)
14:31:16.175225 IP kvmhost01.intranet.company.it.44950 > dns.company.it.53: 15726+ AAAA? engine01.intranet.company.it. (54)
14:31:19.670746 IP kvmhost01.intranet.company.it.54689 > dns.company.it.53: 40999+ A? kvmsvilca01.intranet.company.it. (49)
14:31:21.180295 IP kvmhost01.intranet.company.it.44950 > dns.company.it.53: 63680+ A? engine01.intranet.company.it. (54)
14:31:21.180337 IP kvmhost01.intranet.company.it.44950 > dns.company.it.53: 15726+ AAAA? engine01.intranet.company.it. (54)
14:31:23.771959 IP kvmhost01.intranet.company.it.53741 > dns.company.it.53: 1707+ A? internalmx.intranet.company.it. (48)
[...]
The last DNS request succeeds and returns the MX address, and immediately
afterwards I get the email reporting the status change.
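In the capture, each query for engine01 is repeated with the same query ID and source port about five seconds later, and no answers appear in between. That spacing is consistent with the glibc resolver defaults (timeout of 5 seconds, 2 attempts), though the resolver settings on this host are an assumption here. A minimal sketch for listing such retransmissions automatically follows; `QUERY_RE` and `find_retransmissions` are hypothetical names, and the sketch expects each tcpdump record on a single line.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches one tcpdump DNS query record written on a single line.
QUERY_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP \S+\.(?P<sport>\d+) > \S+: "
    r"(?P<qid>\d+)\+ (?P<qtype>A|AAAA)\? (?P<qname>\S+) \(\d+\)$"
)


def find_retransmissions(lines):
    """Map (source port, query id, type, name) to timestamps, for queries sent more than once."""
    seen = defaultdict(list)
    for line in lines:
        m = QUERY_RE.match(line.strip())
        if m is None:
            continue  # not a DNS query record
        ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
        seen[(int(m["sport"]), int(m["qid"]), m["qtype"], m["qname"])].append(ts)
    return {key: stamps for key, stamps in seen.items() if len(stamps) > 1}


# Three records taken from the capture above.
SAMPLE = """\
14:30:35.909201 IP kvmhost01.intranet.company.it.55654 > dns.company.it.53: 34102+ A? engine01.intranet.company.it. (54)
14:30:40.914285 IP kvmhost01.intranet.company.it.55654 > dns.company.it.53: 34102+ A? engine01.intranet.company.it. (54)
14:31:19.670746 IP kvmhost01.intranet.company.it.54689 > dns.company.it.53: 40999+ A? kvmsvilca01.intranet.company.it. (49)
""".splitlines()

for (sport, qid, qtype, qname), stamps in find_retransmissions(SAMPLE).items():
    gaps = [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]
    print(f"{qtype} {qname} id={qid} port={sport} sent {len(stamps)}x, gaps {gaps}s")
```

Capturing with a filter that also includes the responses from port 53 would show whether the server answers late or not at all.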
This is clearly a name resolution issue, but that is not apparent from the
broker.log file. The only message about it that I get is:
Thread-16::DEBUG::2017-12-13 14:31:23,657::monitor::126::ovirt_hosted_engine_ha.broker.monitor.Monitor::(get_value) Submonitor engine-health id 139653412040592 current value: {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
Thread-16::DEBUG::2017-12-13 14:31:23,657::listener::170::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Response: success {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
But around those messages I see no sign of errors on DNS queries or
similar. Do I need to check other log files?
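To line up the bad health samples with the timestamps in the tcpdump capture, the engine-health entries can be pulled out of broker.log with a short script. This is a minimal sketch that assumes every entry has the same layout as the one quoted above and sits on a single line; `HEALTH_RE` and `bad_health_events` are hypothetical names.

```python
import json
import re

# Timestamp and trailing JSON payload of a broker.log "engine-health" entry.
HEALTH_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*"
    r"Submonitor engine-health .*current value: (?P<payload>\{.*\})\s*$"
)


def bad_health_events(lines):
    """Yield (timestamp, reason) for every engine-health sample reporting health "bad"."""
    for line in lines:
        m = HEALTH_RE.search(line)
        if m is None:
            continue  # some other log entry
        status = json.loads(m["payload"])
        if status.get("health") == "bad":
            yield m["ts"], status.get("reason", "")


# The entry quoted above, written on one line.
SAMPLE_LINE = (
    "Thread-16::DEBUG::2017-12-13 14:31:23,657::monitor::126::"
    "ovirt_hosted_engine_ha.broker.monitor.Monitor::(get_value) "
    "Submonitor engine-health id 139653412040592 current value: "
    '{"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}'
)

for ts, reason in bad_health_events([SAMPLE_LINE]):
    print(ts, reason)
```

On a real file the same generator can be fed an open file object, since it only iterates over lines.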
Luca
On Mon, Dec 11, 2017 at 3:34 PM, Luca 'remix_tj' Lorenzetto <
lorenzetto.luca(a)gmail.com> wrote:
> Hi Martin, Hi all,
>
> *Some minutes* have passed and I have the piece of log I'm looking at.
>
>
> broker.log-upbadup
> <https://drive.google.com/file/d/1wlWZPuhgtJRBWt4xUZC-Jis8vLWM1jYD/view?us...>
--
"It is absurd to employ men of excellent intelligence to perform
calculations that could be entrusted to anyone if machines were used"
Gottfried Wilhelm von Leibnitz, Philosopher and Mathematician (1646-1716)
"The Internet is the largest library in the world.
But the problem is that the books are all scattered on the floor"
John Allen Paulos, Mathematician (1945-living)
Luca 'remix_tj' Lorenzetto,
http://www.remixtj.net , <lorenzetto.luca(a)gmail.com>