It is useful information, but it doesn't help in this case.


Everyone also seems to have missed the point: this is not about recovering a broken hosted engine.

The problem I ran into is installing a NEW self-hosted engine on physical hosts that had already been configured in the old cluster
(because of storage and network changes, but the reason doesn't really matter).

What happened:

oVirt UI:
   Shut down all VMs
   Put the storage domains into maintenance mode and detach them
   Remove all physical hosts from the cluster except the one running the HE
   Switch the HE to global maintenance

On last physical host:
    hosted-engine --vm-poweroff

On all physical hosts:
    ovirt-hosted-engine-cleanup
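
For reference, the host-side part of the sequence above could be scripted roughly like this. It is only a sketch: the run helper and the DRY_RUN variable are my own illustration and not part of oVirt, and by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Sketch of the host-side teardown described above.
# DRY_RUN and run() are illustrative helpers, not oVirt tooling.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# On the last physical host (the one still running the hosted engine VM).
teardown_last_host() {
    run hosted-engine --vm-poweroff
}

# On every physical host, once the engine VM is powered off.
cleanup_host() {
    run ovirt-hosted-engine-cleanup
}

teardown_last_host
cleanup_host
```

Setting DRY_RUN=0 would run the real commands, so that only makes sense on a host you really intend to wipe the hosted-engine configuration from.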


- Physical network changes to all hosts

Power on "host 1" (expectation: deploy a new SHE on a new storage domain)

# hosted-engine --deploy
...
...
[ INFO  ] TASK [ovirt.hosted_engine_setup : Fail with error description]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host has been set in non_operational status, deployment errors:   code 9000: Failed to verify Power Management configuration for Host physhost01.example.com,   fix accordingly and re-deploy."}
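
To pull just the error code and message out of a line like the one above, something along these lines should work. It is a sketch only: extract_deploy_error is a name I made up, it assumes the message itself contains no comma, and the log directory in the comment is where I would expect the setup logs to be, so check it on your own host.

```shell
# Extract the "code NNNN: message" part from a hosted-engine deployment
# error line read on stdin. Hypothetical helper, not part of oVirt.
# Assumes the message itself contains no comma.
extract_deploy_error() {
    sed -n 's/.*deployment errors:[[:space:]]*\(code [0-9]*:[^,]*\),.*/\1/p'
}

# Possible usage (log location is an assumption, verify it locally):
#   grep -h 'deployment errors' /var/log/ovirt-hosted-engine-setup/*.log \
#       | extract_deploy_error
```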

At this point, after removing the host from the old cluster, I didn't expect legacy configuration to be left lurking in places unknown.
In this case it's a power management configuration that wasn't cleaned up that's stopping the very first SHE from being deployed, although I'm sure there is a raft of 'unclean' configuration left behind when a host is removed from a cluster.


The only way I could get SHE to redeploy onto the same hardware was to do a full clean OS install and basically start from scratch.

On Thu, 2 Apr 2020 at 07:27, Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Apr 2, 2020 at 7:41 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
>
> On April 1, 2020 5:28:35 PM GMT+03:00, eevans@digitaldatatechs.com wrote:
> >https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2/html/self-hosted_engine_guide/troubleshooting
> >
> >
> >
> >It should tell you the steps to take to troubleshoot your deployment.
> >
> >
> >
> >Eric Evans
> >
> >Digital Data Services LLC.
> >
> >304.660.9080
> >
> >
> >
> >
> >
> >From: Maton, Brett <matonb@ltresources.co.uk>
> >Sent: Tuesday, March 31, 2020 11:52 PM
> >To: eevans@digitaldatatechs.com
> >Cc: Ovirt Users <users@ovirt.org>
> >Subject: [ovirt-users] Re: Failing to redeploy self hosted engine
> >
> >
> >
> >So, how would I go about disabling global maintenance when hosted
> >engine isn't running?
> >
> >
> >
> >I tried editing /var/lib/ovirt-hosted-engine-ha/ha.conf and setting
> >both values to False but that didn't help.
> >
> >
> >
> >local_maintenance=False
> >local_maintenance_manual=False
> >
> >
> >
> >
> >
> >
> >
> >On Tue, 31 Mar 2020 at 23:01, Maton, Brett <matonb@ltresources.co.uk> wrote:
> >
> >Oooh probably...
> >
> >
> >
> >I'll give that a try in the morning, cheers for the tip!
> >
> >
> >
> >On Tue, 31 Mar 2020, 21:23, <eevans@digitaldatatechs.com> wrote:
> >
> >Did you put the ovirt host into global maintenance mode? That may be
> >the issue.
> >
> >
> >
> >Eric Evans
> >
> >Digital Data Services LLC.
> >
> >304.660.9080
> >
> >
> >
> >
> >
> >From: Maton, Brett <matonb@ltresources.co.uk>
> >Sent: Tuesday, March 31, 2020 2:35 PM
> >To: Ovirt Users <users@ovirt.org>
> >Subject: [ovirt-users] Failing to redeploy self hosted engine
> >
> >
> >
> >I keep running into this error when I try to (re)deploy self-hosted
> >engine.
> >
> >
> >
> >
> >
> ># ovirt-hosted-engine-cleanup
> >
> ># hosted-engine --deploy
> >
> >...
> >
> >...
> >
> >[ INFO  ] TASK [ovirt.hosted_engine_setup : Fail with error
> >description]
> >
> >[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The
> >host has been set in non_operational status, deployment errors:   code
> >9000: Failed to verify Power Management configuration for Host
> >physhost01.example.com,   fix accordingly and re-deploy."}
> >
> >
> >
> >I shut down all of the VMs and detached the storage before cleaning up
> >and trying to re-deploy the hosted engine. This is the first time I've
> >run into this particular problem.
> >
> >
> >
> >Any help appreciated
> >
> >
> >
> >Brett
> >
> >
>
> I'm not very pleased with this article.

One more place to check is:

https://www.ovirt.org/images/Hosted-Engine-4.3-deep-dive.pdf

I found it a few months ago, and added it to the site under:

https://www.ovirt.org/community/get-involved/resources/slide-decks.html

But that one sadly isn't easy to find either. Google does know about
it, but only if you know what to search for. We should probably add a
few more internal links to make sure all content is easily accessible.

> Last time I needed to fix my HostedEngine, I couldn't find anything useful.
> In retrospect, if I had known how to attach rescue media and boot from it, it would have taken no more than 10 minutes to fix, way faster than restoring from backup.

There is an option '--vm-conf' you can use with '--vm-start', to pass
your own libvirt-style conf. There you can add a virtual CD with an
image or whatever.
See slide 58 in the above presentation.
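
Roughly, the invocation would look like the sketch below. The conf path is a made-up placeholder, start_engine_with_conf is just an illustrative wrapper, and the exact option syntax is from memory, so please check 'hosted-engine --help' on your version. The function only prints the command rather than running it.

```shell
# Hypothetical path to a hand-edited copy of the engine VM configuration
# (for example one with a virtual CD added, as described above).
CUSTOM_VM_CONF="${CUSTOM_VM_CONF:-/root/my-vm.conf}"

# Illustrative wrapper: prints the command that would start the engine VM
# with a custom configuration. Drop the 'echo' to actually run it.
start_engine_with_conf() {
    echo hosted-engine --vm-start --vm-conf="$1"
}

start_engine_with_conf "$CUSTOM_VM_CONF"
```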

Best regards,
--
Didi