Inconsistent datacenter [urgent]

I have a Gluster engine domain on 4.2 (replica 3 with 1 arbiter) and the arbiter went down. I've been trying to replace the arbiter with no success. In the meantime, the two remaining hosts are non-responsive, each showing one VM (one of them is actually the engine). The host without the engine can't be set to maintenance because it shows one VM and is in a non-responsive state, yet the VM list shows none. I tried to shut the VM down manually with virsh, but no domain is listed. I can't power off the host because it's part of the Gluster volume. Is there a way to manually remove the ghost VM so I can set the host to maintenance and try to fix it? Regards,

I'm reluctant to give any advice in a situation as fragile as that, but I believe VDSM doesn't want you to mess with virsh in oVirt: you're supposed to make things happen with hosted-engine and potentially vdsm-tool. Try to see if you can get the management VM down with hosted-engine --vm-shutdown and enable maintenance with hosted-engine --set-maintenance --mode=global (as well as --mode=local on each surviving node). If that doesn't work, then perhaps you'd want to stop the vdsm broker/agent daemons while you do the Gluster repairs.
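
For reference, the sequence suggested above could be sketched roughly as follows. This is a sketch under stated assumptions, not a tested recovery procedure. By default it only prints the commands (DRY_RUN=1). The function names and the --stop-ha switch are made up for illustration. It assumes the "broker/agent daemons" are the ovirt-ha-broker and ovirt-ha-agent services, and it sets global maintenance before shutting the engine VM down, on the assumption that the HA agents would otherwise try to restart it.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands by default; set DRY_RUN=0 to actually run them.
set -u

DRY_RUN="${DRY_RUN:-1}"

# run: echo the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Run on ONE host of the hosted-engine cluster.
global_steps() {
    # Assumption: global maintenance goes first so the HA agents leave the engine VM alone.
    run hosted-engine --set-maintenance --mode=global
    run hosted-engine --vm-shutdown
}

# Repeat on EACH surviving host.
local_steps() {
    run hosted-engine --set-maintenance --mode=local
}

# Fallback, only if the steps above do not help.
# Assumption: the daemons meant are the ovirt-ha-agent and ovirt-ha-broker services.
stop_ha_daemons() {
    run systemctl stop ovirt-ha-agent ovirt-ha-broker
}

global_steps
local_steps
# Hypothetical switch: pass --stop-ha to include the fallback step.
if [ "${1:-}" = "--stop-ha" ]; then
    stop_ha_daemons
fi
```

Reading the printed commands first and then running them by hand, one at a time, is probably safer than setting DRY_RUN=0 on a cluster in this state.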

Thanks. Sorry for the late reply, but I had to travel. I've tried your suggestion and now I'm not able to start the engine VM. It's trying to access a storage domain, and I can't tell which one it is, since vdsm only shows the ID and I have no way to look up the label without the engine. I think I might need to hack into the engine database, or reset it and hack on the backup, so I can change some values by hand. Any other ideas?
participants (3)
- jplorier@gmail.com
- Juan Pablo Lorier
- thomas@hoberg.net