Hosted engine restore went very wrong

7 May 2019

I really feel like an idiot. I tried to move our hosted engine from our Default datacenter to our Ceph Datacenter.
I ran into problems, which were correctly addressed.
see: https://lists.ovirt.org/archives/list/users@ovirt.org/thread/ZFCLFWRN6XR6KMH...

It was indeed a race condition. I was able to bring the engine back to our Default Cluster. Then I tried to
do the move to our Ceph Datacenter again.

I got the error "The target Data Center does not contain the Virtual Disk" twice yesterday. Because it was late,
I decided to try again the next morning.

I made a new backup of the engine, copied it over to the new node of the Ceph Datacenter, and started
hosted-engine --deploy. But I FORGOT TO SHUT DOWN the other engine! Oh man.

The deploy script errored out with:

[ ERROR ] fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}

[ ERROR ] fatal: [localhost -> engine.infra.solutions.work]: FAILED! => {"changed": false, "msg": "There was a failure deploying the engine on the local engine VM. The system may not be provisioned accord

Then I realised something was different this time.
I shut down and undefined the local engine. The node is now in a degraded state. Is it possible to start the deployment again on a degraded node?
I started the old engine again, but I'm not able to reach the login page.

Any idea what to do next?

Andreas Elvers

