oVirt engine frequently rebooting/changing host

Hi,

I'm currently evaluating oVirt and have three hosts installed within nested KVM. They share a Gluster environment that was configured using the oVirt Node wizards. It seems to work quite well, but after some hours I get many status update mails from the oVirt engine reporting a transition to either EngineStop or EngineForceStop. Sometimes the host on which the engine runs is switched. After some of those reboots there is silence for some hours before it starts over. Can you tell me where I should look to fix that problem?

Regards
Bernhard Dick

On Mon, May 7, 2018 at 11:50 AM, Bernhard Dick <bernhard@bdick.de> wrote:
> [...]
> Can you tell me where I should look to fix that problem?

You can check, on all hosts, /var/log/ovirt-hosted-engine-ha/* .

Good luck,
-- Didi
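A minimal sketch of that log check, to be run on each host. It only counts lines that mention the two state names from the notification mails; it assumes nothing about the log line format beyond a plain substring match, and the helper names `count_states` and `scan_log_dir` are made up for this illustration.

```python
from collections import Counter
from pathlib import Path

# State names taken from the notification mails described in this thread.
STATES = ("EngineStop", "EngineForceStop")


def count_states(lines, states=STATES):
    """Count how many lines mention each state name (plain substring match)."""
    counts = Counter()
    for line in lines:
        for state in states:
            if state in line:
                counts[state] += 1
    return counts


def scan_log_dir(log_dir="/var/log/ovirt-hosted-engine-ha"):
    """Return {file name: Counter} for every regular file in log_dir."""
    result = {}
    for path in sorted(Path(log_dir).glob("*")):
        if path.is_file():
            # errors="replace" keeps the scan going on undecodable bytes.
            with path.open(errors="replace") as handle:
                result[path.name] = count_states(handle)
    return result


if __name__ == "__main__":
    for name, counts in scan_log_dir().items():
        print(name, dict(counts))
```

Files with high counts are a reasonable place to start reading the surrounding log lines by hand.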

Hi,

On 07.05.2018 at 11:23, Yedidyah Bar David wrote:
> [...]
> You can check, on all hosts, /var/log/ovirt-hosted-engine-ha/* .
> Good luck,

Thanks, that helped. Our gateway does not always respond to ping requests, so I changed the penalty score accordingly. It has now been running stably for almost one week.

Regards
Bernhard

On Wed, May 16, 2018 at 5:38 PM, Bernhard Dick <bernhard@bdick.de> wrote:
> [...]
> Thanks, that helped. Our gateway does not always respond to ping requests, so I changed the penalty score accordingly.

How? In the code? I am not sure there isn't some other logic that relies on this score, such as the logic that decides to migrate the engine VM away from its host if the host fails this specific test.

> It has now been running stably for almost one week.

Thanks for the report!

Best regards,
-- Didi

Hi,

On 17.05.2018 at 07:30, Yedidyah Bar David wrote:
> [...]
> How? In the code?

I changed the value for "gateway-score-penalty" in /etc/ovirt-hosted-engine-ha/agent.conf .

Regards
Bernhard
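For readers who want to inspect that setting, here is a rough sketch of reading it from the file's text. The thread does not show the file's layout, so the `key=value` format, the `[score]` section header, and all numbers below are assumptions for illustration only. The `effective_score` helper likewise only assumes that a penalty is subtracted from a host's score when its check fails, which is what the name of the setting suggests.

```python
def read_settings(text):
    """Parse 'key=value' lines; skip blanks, '#' comments and '[section]' headers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def effective_score(base_score, penalties):
    """Subtract all active penalties from the base score, never going below zero."""
    return max(0, base_score - sum(penalties))


# Hypothetical file content and numbers, not taken from a real installation.
example = """
[score]
gateway-score-penalty=100
"""

settings = read_settings(example)
penalty = int(settings["gateway-score-penalty"])
print(effective_score(1000, [penalty]))  # prints 900
```

Under that assumption, a smaller penalty means a flaky gateway ping lowers the host's score less. Whether other logic also depends on the value is the open question raised earlier in the thread.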

On Thu, May 17, 2018 at 1:34 PM, Bernhard Dick <bernhard@bdick.de> wrote:
> [...]
> I changed the value for "gateway-score-penalty" in /etc/ovirt-hosted-engine-ha/agent.conf .

I see. Adding Martin for review.

-- Didi
participants (2)
- Bernhard Dick
- Yedidyah Bar David