Hello everyone!
Due to a recent major power outage in my area, I now have an unresponsive self-hosted
engine host in an environment of 3 self-hosted engine hosts. One VM is stuck on it, along
with (I guess) some metadata left over from when the hosted engine was running there
(before the power went down).
I'm running oVirt Node 4.3.10 with 3 nodes and GlusterFS (no arbiter). It provides
services to our clients (DNS, web sites, wikis, ticketing, etc.), and I cannot shut
them down.
The oVirt engine is up and running, and I can manage all the other VMs that run on the
other hosts through the web GUI.
The unresponsive host replies only to ICMP requests; in every other sense it's dead:
no SSH, no gluster bricks, no console, nothing.
I tried to place the faulty host in maintenance, using the option to stop glusterd, but
the engine won't let the host go into maintenance mode because it thinks the host still
has running VMs on it. This happens even if I choose the "Ignore gluster quorum and
self-heal validations" option.
I spent last week creating a backup environment where I copied the VMs, to have somewhere
to run them in case something goes terribly wrong with the systems or the gluster in the
production system.
I'm thinking of enabling global maintenance mode, then shutting down the engine
itself with *hosted-engine --vm-shutdown*, and rebooting the affected host.
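For reference, this is roughly the sequence I have in mind, written as a dry-run sketch
only. It assumes it would be run as root on one of the healthy hosted-engine hosts. The
run helper, the plan function and the DRY_RUN variable are my own illustration, not part
of oVirt; by default the script only prints the commands and changes nothing. I have not
run any of this yet.

```shell
#!/bin/sh
# Dry-run sketch of the plan above; nothing is executed unless DRY_RUN=0.
# Assumption: run as root on a healthy hosted-engine host.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan() {
    # Stop the HA agents from restarting or migrating the engine VM.
    run hosted-engine --set-maintenance --mode=global
    # Check what the HA agents report before touching the engine VM.
    run hosted-engine --vm-status
    # Cleanly shut down the engine VM.
    run hosted-engine --vm-shutdown
}

plan
```

The reboot of the dead host is not in the sketch, since I can't reach it over SSH and
would have to power-cycle it some other way.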
Should I remove the host from the cluster and then re-add it or should I do something
else?
Thanks in advance for any help!