[ovirt-users] Engine HA-Issues

Kasturi Narra knarra at redhat.com
Mon Jul 17 06:10:10 UTC 2017


Hi,

Can you please check the following? One of these could be the reason why
the HE VM restarts every minute.

Check the error reported in the engine health state. If it relates to the
liveliness check, the cause is most likely a problem connecting to the engine.
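
To make the health state easier to read, here is a minimal sketch (not part
of the hosted-engine tooling) that decodes the JSON shown on the
"Engine status" line and turns it into a hint. The function names and the
hint texts are hypothetical, and the "good" health value is an assumption
about what a healthy host reports.

```python
import json


def parse_engine_status(raw):
    """Decode the JSON shown on the 'Engine status' line of the status output."""
    return json.loads(raw)


def diagnose(status):
    """Map an engine status dict to a short hint (hint texts are illustrative)."""
    # Assumption: a healthy engine is reported with "health": "good".
    if status.get("health") == "good":
        return "engine healthy"
    if status.get("vm") == "up" and "liveliness" in status.get("reason", ""):
        # The VM is running, but the host gets no good answer from the engine.
        return "vm up, liveliness check failed: check FQDN and health URL"
    if status.get("vm") == "down":
        return "vm not running on this host"
    return "unrecognised state"


# Value copied from the Host 2 block in the quoted message.
host2 = parse_engine_status(
    '{"reason": "failed liveliness check", "health": "bad", '
    '"vm": "up", "detail": "up"}'
)
print(diagnose(host2))
```

Only the host reporting "vm": "up" together with a failed liveliness check
is interesting here; the other hosts simply do not run the VM.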

- Check that the engine FQDN is reachable from all hosts.

- Run curl -v http://<engine-fqdn>/ovirt-engine/services/health. Does this
return OK?

- Access the HE console and check that ovirt-engine is running.

- Check /var/log/ovirt-engine/server.log and
/var/log/ovirt-engine/engine.log for errors while starting ovirt-engine.
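
If you want to run the first two checks from every host in one go, a rough
sketch using only the Python standard library could look like this. The
function names are hypothetical, "engine.example.com" is a placeholder for
your engine FQDN, and the script only reports the HTTP status and body
without interpreting them.

```python
import socket
import urllib.error
import urllib.request


def health_url(engine_fqdn):
    """Build the health URL used in the curl check above."""
    return f"http://{engine_fqdn}/ovirt-engine/services/health"


def fqdn_resolves(engine_fqdn):
    """Return True if this host can resolve the engine FQDN."""
    try:
        socket.getaddrinfo(engine_fqdn, 80)
        return True
    except socket.gaierror:
        return False


def engine_health(engine_fqdn, timeout=10):
    """Return (http_status, body), or (None, error text) if the request fails."""
    try:
        with urllib.request.urlopen(health_url(engine_fqdn), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace").strip()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code, str(exc.reason)
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout and similar.
        return None, str(exc)


# Example use on a host (placeholder FQDN, replace with your own):
#   print(fqdn_resolves("engine.example.com"))
#   print(engine_health("engine.example.com"))
```

A result of (None, ...) on one host but not on the others would point to a
name resolution or network problem on that host rather than to the engine.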


Thanks

kasturi


On Fri, Jul 14, 2017 at 10:28 PM, Sven Achtelik <Sven.Achtelik at eps.aero>
wrote:

> Hi All,
>
>
>
> after running solidly for several months, my ovirt-engine started rebooting
> on several hosts. I've looked at hosted-engine --vm-status and it shows
> that the engine is up on one host but not reachable. At the same time I can
> access the GUI and everything is working fine. After some time the engine
> shuts down and all hosts try to start the engine until one of them wins,
> at least that is how it looks. Any clues on where to look to find the
> issue with the liveliness check?
>
>
>
> ------------------------------------------------------------
>
> --== Host 1 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : ovirt-node01
> Host ID                            : 1
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 3eb33843
> local_conf_timestamp               : 17128
> Host timestamp                     : 17113
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=17113 (Fri Jul 14 11:50:23 2017)
>         host-id=1
>         score=3400
>         vm_conf_refresh_time=17128 (Fri Jul 14 11:50:38 2017)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=EngineDown
>         stopped=False
>
> --== Host 2 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : ovirt-node02.mgmt.lan
> Host ID                            : 2
> Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 2a8c86cc
> local_conf_timestamp               : 523182
> Host timestamp                     : 523167
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=523167 (Fri Jul 14 11:50:25 2017)
>         host-id=2
>         score=3400
>         vm_conf_refresh_time=523182 (Fri Jul 14 11:50:40 2017)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=EngineStarting
>         stopped=False
>
> --== Host 3 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : ovirt-node03.mgmt.lan
> Host ID                            : 3
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : f8490d79
> local_conf_timestamp               : 527698
> Host timestamp                     : 527683
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=527683 (Fri Jul 14 11:50:33 2017)
>         host-id=3
>         score=3400
>         vm_conf_refresh_time=527698 (Fri Jul 14 11:50:47 2017)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=EngineDown
>         stopped=False
>
> ------------------------------------------------------------
>
> Thank you,
>
> Sven
>