2440 is pretty low - did you check what lowers it, other than the
single failed network test below?
As far as I can see in agent.log, the only thing that lowers the score is
"network status".
There are a lot of lines like this:
Penalizing score by 319 due to network status
Penalizing score by 640 due to network status
Penalizing score by 1280 due to network status
Penalizing score by 960 due to network status
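To see how often this happens and how large the penalties get, the lines can be tallied with a few lines of Python. This is only a rough sketch: it matches on the message text quoted above and ignores whatever prefix the log line carries, and the function name summarize_penalties is made up for this example, not something from -ha.

```python
import re
from collections import defaultdict

# Matches the message part of lines such as
# "Penalizing score by 640 due to network status"; any prefix is ignored.
PENALTY_RE = re.compile(r"Penalizing score by (\d+) due to (.+?)\s*$")


def summarize_penalties(lines):
    """Return {reason: (count, smallest, largest)} for all penalty lines."""
    seen = defaultdict(list)
    for line in lines:
        match = PENALTY_RE.search(line)
        if match:
            seen[match.group(2)].append(int(match.group(1)))
    return {reason: (len(v), min(v), max(v)) for reason, v in seen.items()}


# The four lines quoted in this thread, used as sample input.
SAMPLE = [
    "Penalizing score by 319 due to network status",
    "Penalizing score by 640 due to network status",
    "Penalizing score by 1280 due to network status",
    "Penalizing score by 960 due to network status",
]

for reason, (count, low, high) in summarize_penalties(SAMPLE).items():
    print(f"{reason}: {count} lines, penalty between {low} and {high}")
```

To run it on a real log, pass an open file object for agent.log instead of SAMPLE.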
What exactly do you test?
On another VM on the same host that runs the hosted engine, I tried this:
1. Continuous ping command to 8.8.8.8 (0 lost packets)
2. dig command every second -> dig +tries=1 +time=5 +tcp (no errors, query time between
2 and 15ms)
Before fixing the above bug, we added to ovirt-system-tests loops of
'dig', and did see drops - not many, but enough, apparently, and
often.
The 'dig' test is not very configurable, from -ha's POV - but you do
have control over it from elsewhere - resolv.conf, your name server,
etc. Also, note that it runs 'dig' without passing a query, and the
default query is for '.' - the root - perhaps your name server has
some problem with this?
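A loop along these lines can be sketched in Python. Treat it as an illustration under assumptions, not as what -ha or ovirt-system-tests actually run: the flags are the ones quoted earlier in this thread, no query name is passed so dig asks for '.', and a failure is counted only when dig exits non-zero (a reply carrying an error code such as SERVFAIL still exits 0). The names run_dig and count_failures are made up for the example.

```python
import subprocess
import time

# Flags as quoted in this thread; no query name, so dig queries '.'.
DIG_CMD = ["dig", "+tries=1", "+time=5", "+tcp"]


def run_dig():
    """Run dig once and return True when it exits with status 0."""
    result = subprocess.run(
        DIG_CMD, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def count_failures(attempts, probe=run_dig, delay=1.0):
    """Call probe() `attempts` times, `delay` seconds apart.

    Returns the number of calls that reported failure.
    """
    failures = 0
    for i in range(attempts):
        if not probe():
            failures += 1
        if i < attempts - 1:
            time.sleep(delay)
    return failures
```

For example, count_failures(600) would probe once a second for roughly ten minutes. Running it on the hosted-engine host itself, rather than inside a guest, should be closer to what the broker sees.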
Given that the dig command from the other VM goes through
without problems, I think the name server should be OK.
You can configure the agent/broker to log at DEBUG level, to see some
more details.
You can also change the network monitoring method, and/or configure
options for methods that do have them - e.g. the 'tcp' method with
'tcp_t_address' and 'tcp_t_port'. See e.g.:
https://www.ovirt.org/documentation/administration_guide/index.html#Admin...
https://www.ovirt.org/develop/release-management/features/sla/hosted-engi...
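Before switching methods, it may help to check by hand that the target you have in mind is reachable over TCP from the host. The sketch below only shows what a TCP-style check amounts to; it is not copied from -ha, the function name tcp_reachable is made up, and the address and port are whatever you would put in 'tcp_t_address' and 'tcp_t_port'.

```python
import socket


def tcp_reachable(address, port, timeout=5.0):
    """Return True if a TCP connection to (address, port) opens in time."""
    try:
        # create_connection resolves the address and connects; the
        # connection is closed again as soon as the with-block exits.
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution errors.
        return False
```

Calling it in a loop with a one-second sleep gives a rough idea of whether the TCP target drops as often as the dig test does.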
I will try this, thank you.