Here are some of the logs I see in the oVirt Event Manager on the host that is
compute-only:
[Screenshot from 2022-01-22 16-10-13.png]
Sent with ProtonMail Secure Email.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, January 22nd, 2022 at 4:01 PM, David White <dmwhite823(a)protonmail.com>
wrote:
I have a Hyperconverged cluster with 4 hosts.
Gluster is replicated across 2 hosts, and a 3rd host is an arbiter
node.
The 4th host is compute only.
I updated the compute-only node, as well as the arbiter node, early
this morning. I didn't touch either of the actual storage nodes. That said, I forgot to
upgrade the engine.
oVirt Manager thinks that all but 1 of the hosts in the cluster are
unhealthy. However, all 4 hosts are online. oVirt Manager (Engine) also keeps deactivating
at least 1, if not 2 of the 3 (total) bricks behind each volume.
Even though the Engine thinks that only 1 host is healthy, VMs are
clearly running on some of the other hosts. However, in troubleshooting, some of the
customer VMs were turned off, and oVirt is refusing to start those VMs, because it only
recognizes that 1 of the hosts is healthy -- and that host's resources are maxed out.
This afternoon, I went ahead and upgraded (and rebooted) the Engine
VM, so it is now up-to-date. Unfortunately, that didn't resolve the issue. So I took
one of the "unhealthy" hosts which didn't have any VMs on it (which was the
host that is our compute-only server hosting no gluster data), and I used oVirt to
"reinstall" the oVirt software. That didn't resolve the issue for that
host.
How can I troubleshoot this? I need:
- To figure out why oVirt keeps trying to deactivate volumes
  - From the command line, `gluster peer status` shows all nodes
connected, and all volumes appear to be healthy
- More importantly, I need to get these VMs that are currently down back online. Is
there a way to somehow force oVirt to launch the VMs on the "unhealthy" nodes?
What logs should I be looking at? Any help would be greatly appreciated.