
On Tue, Aug 10, 2021 at 9:20 PM Gilboa Davara <gilboad@gmail.com> wrote:
Hello,
Many thanks again for taking the time to try and help me recover this machine (even though it would have been far easier to simply redeploy it...)
Sadly enough, it seems that --clean-metadata requires an active agent. E.g.:
$ hosted-engine --clean-metadata
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.
Did you try to search the net/list archives?
Yes. All of them seem to repeat the same clean-metadata command (which fails).
I suppose we need better documentation. Sorry. Perhaps open a bug/issue about that.
Can I manually delete the metadata state files?
Yes, see e.g.:
https://lists.ovirt.org/pipermail/users/2016-April/072676.html
As an alternative to the 'find' command there, you can also find the IDs with:
$ grep metadata /etc/ovirt-hosted-engine/hosted-engine.conf
Best regards, -- Didi
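For readers following along, the grep suggested above can be wrapped in a small helper that prints each matching key next to its value. This is only a sketch: the function name list_metadata_ids is made up for illustration, and it assumes the file uses plain key=value lines (the thread does not show the actual key names).

```shell
# Hypothetical helper, not part of oVirt: print "key value" for every
# line of the config file whose key starts with "metadata".
# $1: path to the config file (defaults to the path quoted in the thread)
list_metadata_ids() {
    conf="${1:-/etc/ovirt-hosted-engine/hosted-engine.conf}"
    # Assumes simple key=value lines; everything after the first '=' is the value.
    grep '^metadata' "$conf" | while IFS='=' read -r key value; do
        printf '%s %s\n' "$key" "$value"
    done
}
```

Calling list_metadata_ids with no argument reads the default path; pass another path to try it against a copy of the file first.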
Yippie! Success (At least it seems that way...)
Following https://lists.ovirt.org/pipermail/users/2016-April/072676.html, I stopped the broker and agent services, archived the existing hosted-engine metadata files, and created an empty 1GB metadata file using dd (dd if=/dev/zero of=/run/vdsm/storage/<uuid>/<uuid> bs=1M count=1024), making doubly sure that permissions (0660 / 0644), owner (vdsm:kvm) and SELinux labels (restorecon, just in case) stayed the same. I let everything settle down, then restarted the services... and everything is up again :)
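The steps described above can be sketched roughly as follows. This is not a tested recovery procedure, only a restatement of the post in script form. Assumptions: the broker unit is called ovirt-ha-broker (the thread only names ovirt-ha-agent), the function name, the run helper and the DRY_RUN switch are made up for illustration, and the <uuid>/<uuid> path must be filled in by hand. By default the commands are only printed, not executed.

```shell
# Hypothetical wrapper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: metadata volume path, i.e. /run/vdsm/storage/<uuid>/<uuid> from the post
# $2: mode the old file had (the post mentions 0660 / 0644)
# $3: where to archive the old metadata file
reset_he_metadata() {
    md="$1"
    mode="$2"
    backup="$3"
    # Stop the HA services first (broker unit name is an assumption).
    run systemctl stop ovirt-ha-agent ovirt-ha-broker
    # Archive the existing metadata before overwriting it.
    run cp "$md" "$backup"
    # Create an empty 1GB metadata file, as in the post.
    run dd if=/dev/zero of="$md" bs=1M count=1024
    # Keep owner, permissions and SELinux label as they were.
    run chown vdsm:kvm "$md"
    run chmod "$mode" "$md"
    run restorecon "$md"
    # Bring the services back up.
    run systemctl start ovirt-ha-broker ovirt-ha-agent
}
```

Running reset_he_metadata with the three arguments prints the command list for review; only setting DRY_RUN=0 would actually execute it.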
I plan to let the engine run overnight with zero VMs (making sure all backups are fully up-to-date). Once done, I'll return to normal (until I replace this setup with a normal multi-node setup).
Many thanks again!
Glad to hear that, welcome, thanks for the report!
More tests you might want to do before starting your real VMs:
- Set and later clear global maintenance from each host, and see that this propagates to the others (both 'hosted-engine --vm-status' and agent.log)
- Migrate the engine VM between the hosts and see that this propagates
- Shut down the engine VM without global maintenance and see that it's started automatically.
But I do not think all of this is mandatory, if 'hosted-engine --vm-status' looks ok on all hosts. I'd still be careful with other things that might have been corrupted, though - obviously can't tell you what/where...
Best regards, -- Didi
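The first of the tests suggested above (set and clear global maintenance) might look like this on a single host. It is a sketch, not an official procedure: the function name, the run helper and the DRY_RUN switch are made up for illustration, and the agent.log location under /var/log/ovirt-hosted-engine-ha/ is an assumption, since the thread only says "agent.log". Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Hypothetical wrapper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on one host, then repeat the --vm-status and log checks on the others
# to see whether the maintenance state has propagated.
check_global_maintenance() {
    run hosted-engine --set-maintenance --mode=global
    run hosted-engine --vm-status
    # Assumed log location; adjust if agent.log lives elsewhere.
    run tail -n 20 /var/log/ovirt-hosted-engine-ha/agent.log
    run hosted-engine --set-maintenance --mode=none
    run hosted-engine --vm-status
}
```

The migration and automatic-restart tests are easier to do by hand, so they are left out of the sketch.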