Hi List,
I'm running a 3-node setup for a client, and the HostedEngine is
constantly moving itself around: whichever node it's on ends up
penalizing its score so low that it forces a migration to another node.
Looking at /var/log/ovirt-hosted-engine-ha/agent.log shows a decent
amount of:
MainThread::INFO::2020-05-21 15:47:54,742::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Penalizing score by 319 due to network status
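To put numbers on "a decent amount", here is a minimal sketch that tallies these penalty lines per hour. It assumes each entry sits on a single line in the real agent.log (mail clients may wrap it) and that the line format matches the excerpt above; the function names parse_penalties and summarise are made up for illustration and are not part of ovirt-hosted-engine-ha.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed format, based only on the excerpt above:
# ...::2020-05-21 15:47:54,742::...::(score) Penalizing score by 319 due to network status
PENALTY_RE = re.compile(
    r"::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::.*"
    r"Penalizing score by (?P<points>\d+) due to (?P<reason>.+?)\s*$"
)


def parse_penalties(lines):
    """Return a list of (timestamp, points, reason) for every penalty line."""
    events = []
    for line in lines:
        match = PENALTY_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, int(match.group("points")), match.group("reason")))
    return events


def summarise(events):
    """Count penalties per (hour, reason) to show when they cluster."""
    return Counter(
        (ts.strftime("%Y-%m-%d %H:00"), reason) for ts, _points, reason in events
    )
```

Passing an open handle on /var/log/ovirt-hosted-engine-ha/agent.log to parse_penalties and printing the sorted items of summarise(events) gives an hourly count that can be lined up against the ping monitoring history.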
What I want to know is how to get more debug output out of this, so I
can see which network status it's concerned about and go about
stabilising it.
The system is heavily monitored with ping checks; it never drops link
and never drops ICMP. None of its VMs falter when accessing the shared
NFS space for disk storage, so I'm not sure what the concern is. Over
time the node will literally penalise itself down to ~2000, and then the
HA agent will want to swap nodes. That's not necessarily a bad thing, but
it generates a heap of status emails multiple times a day, which is just
garbage, and it sometimes makes the HE unavailable when I'm in the middle
of an admin task.
Any help is appreciated.
Thanks,
Joe