Engine/host communication maybe broken

Hi all,

I'm facing a situation that I don't know how to resolve, and I don't know how it happened. The hosted engine HA part is working fine and the VM starts. It's accessible via the GUI, and that's it. It shows some hosts online and some offline. It looks like it's no longer communicating with the nodes to get the latest status or to start tasks.

Is there a cache or something that can be cleared on the engine VM to force a connection to the nodes? Or is there a way to analyze whether communication is really broken? Any ideas on where to start?

Thank you,
Sven

You might want to look at the status of, and the logs for, the various oVirt services on the hosts. Open a host's console at http://host-ip-or-name:9090, go to the Services tab, and filter the services for "ovirt". Please let us know which services are running and which are stopped, and whether any service logs show errors or warnings.
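For anyone who prefers a shell on the host to the Services tab, the same check can be sketched in Python. This is only a rough sketch, not an oVirt tool: it assumes `systemctl` is available on the host, and the helper names `parse_units`, `not_running` and `list_ovirt_units` are made up for illustration. The third sample row (an inactive service) is hypothetical and is there only to show the filter.

```python
import subprocess

# Sample in the `systemctl list-units --plain --no-legend` column layout.
# The "inactive dead" row is hypothetical, added only to exercise the filter.
SAMPLE = """\
ovirt-ha-agent.service loaded active running oVirt Hosted Engine High Availability Monitoring Agent
ovirt-ha-broker.service loaded active running oVirt Hosted Engine High Availability Communications Broker
ovirt-imageio-daemon.service loaded inactive dead oVirt ImageIO Daemon
"""


def parse_units(text):
    """Split each line into UNIT, LOAD, ACTIVE, SUB and DESCRIPTION fields."""
    units = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        units.append({
            "unit": parts[0],
            "load": parts[1],
            "active": parts[2],
            "sub": parts[3],
            "description": parts[4] if len(parts) > 4 else "",
        })
    return units


def not_running(units):
    """Return the names of units whose SUB state is not 'running'."""
    return [u["unit"] for u in units if u["sub"] != "running"]


def list_ovirt_units(pattern="ovirt*"):
    """Ask systemd for all service units matching the pattern (run on the host)."""
    result = subprocess.run(
        ["systemctl", "list-units", pattern, "--type=service",
         "--all", "--plain", "--no-legend"],
        capture_output=True, text=True, check=True,
    )
    return parse_units(result.stdout)


# Safe demo on the sample text; call list_ovirt_units() on a real host instead.
print(not_running(parse_units(SAMPLE)))
```

Note that a filter on "ovirt" does not match vdsmd.service, the VDSM agent on each host that the engine talks to, so it may be worth checking that unit separately, for example with `list_ovirt_units("vdsmd*")`.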

On the engine VM it looks like this:

ovirt-engine-dwhd.service           loaded active running oVirt Engine Data Warehouse
ovirt-engine.service                loaded active running oVirt Engine
ovirt-fence-kdump-listener.service  loaded active running oVirt Engine fence_kdump listener
ovirt-guest-agent.service           loaded active running oVirt Guest Agent
ovirt-imageio-proxy.service         loaded active running oVirt ImageIO Proxy
ovirt-vmconsole-proxy-sshd.service  loaded active running oVirt VM Console SSH server daemon
ovirt-websocket-proxy.service       loaded active running oVirt Engine websockets proxy

On all three hosts:

ovirt-ha-agent.service              loaded active running oVirt Hosted Engine High Availability Monitoring Agent
ovirt-ha-broker.service             loaded active running oVirt Hosted Engine High Availability Communications Broker
ovirt-imageio-daemon.service        loaded active running oVirt ImageIO Daemon
ovirt-vmconsole-host-sshd.service   loaded active running oVirt VM Console SSH server daemon

The thing is that inside the engine GUI I can't do anything, because all domains and almost everything else are shown as down. Selecting a master domain to start it and bring the data center up doesn't do anything; it just sits there and says it's working on it. Watching the engine log with tail -f doesn't show any activity going to the hosts. I have /etc/hosts entries for all nodes, so it's not a DNS issue. Also, there are still Gluster volumes and VMs running on those nodes, so I can't just restart everything.

-----Original Message-----
From: Randall Wood [mailto:rwood@forcepoint.com]
Sent: Monday, 6 April 2020 16:01
To: users@ovirt.org
Subject: [ovirt-users] Re: Engine/host communication maybe broken
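One way to test whether the engine VM can reach the nodes at all, without restarting anything, is a plain TCP connection test run from the engine VM. The sketch below assumes the hosts' VDSM agents listen on port 54321 (the usual default, but worth verifying on your setup), and the host names in the usage note are placeholders. A successful connection only shows that the port is reachable; it does not prove that VDSM answers requests or that the TLS certificates are accepted.

```python
import socket

# Assumption: VDSM on the hosts listens on its usual default port.
VDSM_PORT = 54321


def can_connect(host, port=VDSM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_hosts(hosts, port=VDSM_PORT, timeout=3.0):
    """Map each host name to the result of the TCP check."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Usage, with placeholder names taken from your own /etc/hosts: `check_hosts(["node1.example.com", "node2.example.com", "node3.example.com"])`. Hosts that come back `False` are the ones where a firewall, routing or service problem would be the first thing to look at.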

It sounds like you are self-hosted. You need to establish communication from the engine host. Check engine.log and the libvirt qemu VM logs.

Eric Evans
Digital Data Services LLC.
304.660.9080

From: Sven Achtelik <Sven.Achtelik@eps.aero>
Sent: Monday, April 6, 2020 7:47 AM
To: users <users@ovirt.org>
Subject: [ovirt-users] Engine/host communication maybe broken
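Since tail -f only shows new activity, it may help to pull the recent ERROR and WARN entries out of engine.log in one pass. The following is a minimal sketch under a few assumptions: the log is at /var/log/ovirt-engine/engine.log on the engine VM, each entry carries its level as a separate word, and the sample lines are invented purely to illustrate the filter (they are not real oVirt log output). The function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: default engine log location on the engine VM.
ENGINE_LOG = "/var/log/ovirt-engine/engine.log"

# Matches the level as a whole word, so "WARN" does not match "WARNING".
LEVEL_RE = re.compile(r"\b(ERROR|WARN)\b")

# Invented sample lines, only to show what the filter keeps and drops.
SAMPLE_LINES = [
    "2020-04-06 16:01:00,000+02 INFO  [example.Component] (thread-1) polling host\n",
    "2020-04-06 16:01:01,000+02 WARN  [example.Component] (thread-1) host not responding\n",
    "2020-04-06 16:01:02,000+02 ERROR [example.Component] (thread-2) connection timeout\n",
]


def problem_lines(lines):
    """Keep only the lines that carry an ERROR or WARN level."""
    return [line.rstrip("\n") for line in lines if LEVEL_RE.search(line)]


def count_levels(lines):
    """Count how many ERROR and WARN lines appear."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def scan(path=ENGINE_LOG, last=2000):
    """Return the ERROR/WARN lines among the last `last` lines of the log."""
    with open(path, errors="replace") as handle:
        tail = handle.readlines()[-last:]
    return problem_lines(tail)


# Safe demo on the invented sample; call scan() on the engine VM instead.
for line in problem_lines(SAMPLE_LINES):
    print(line)
```

The same filter can be pointed at other logs by passing a different `path` to `scan`; on the hosts, libvirt normally keeps per-VM qemu logs under /var/log/libvirt/qemu/.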
Participants (3):
- eevans@digitaldatatechs.com
- Randall Wood
- Sven Achtelik