I tried to ping the engine and ran 'hosted-engine --vm-status', which
said it wasn't running, so I ran 'hosted-engine --vm-start'. It
reported that it was 'destroying storage' and starting the engine, but
the engine did not start: I could not see any evidence from
'hosted-engine --vm-status' or the logs that it had started. By this point I
was in a panic to get VMs running. So I had to fire up the old bare
metal engine. This has been a very disappointing experience. I still
have no idea why the IDs in 'host_id' differed from the spm ID, and
why, when I put the cluster into global maintenance and shut down all
the hosts, the Hosted Engine did not come up, nor any of the VMs. I
don't feel confident in this any more. If I try deploying the
Hosted Engine again I am not sure if it will result in the same
non-functional cluster. It gave no error on deployment, but clearly
something was wrong.
I have two questions:
1. Why did the VMs (apart from the Hosted Engine VM) not start when
the hosts were powered up? Is it because the hosts were powered down
that the VMs stay in a down state when the hosts power up again?
2. Now that I have connected the bare metal engine back to the
cluster, is there a way back, or do I have to start from scratch
again? I imagine there is no way of getting the Hosted Engine running
again. If not, what do I need to 'clean' all the hosts of the remnants
of the failed deployment? I can of course reinitialise the LUN that
the Hosted Engine was on - anything else?
Thanks
On Fri, Jun 30, 2017 at 4:30 PM, Denis Chaplygin <dchaplyg(a)redhat.com> wrote:
Hello!
On Fri, Jun 30, 2017 at 4:19 PM, cmc <iucounu(a)gmail.com> wrote:
>
> Help! I put the cluster into global maintenance, then powered all of
> the nodes off and on again. I have taken it out of global
> maintenance. No VM has started,
> including the hosted engine. This is very bad. I am going to look
> through logs to see why nothing has started. Help greatly appreciated.
Global maintenance mode turns off high availability for the hosted engine
VM. You should either cancel global maintenance or start the VM manually with
hosted-engine --vm-start
Global maintenance was added to allow manual maintenance of the engine VM,
so in that mode the state of the engine VM and the engine itself is not
managed, and you are free to stop the engine or the VM or both, do whatever
you like, and the hosted engine tools will not interfere. Obviously, when the engine VM just dies while the
cluster is in global maintenance (or all nodes reboot, as in your case)
there is no one to restart it :)
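
For reference, the two options described above might look roughly like this when run from a shell on one of the hosts. This is only a sketch, not a tested recovery procedure: 'hosted-engine --vm-status' and 'hosted-engine --vm-start' are the commands already mentioned in this thread, and 'hosted-engine --set-maintenance --mode=none' is the usual way to leave global maintenance. The HE_CMD variable and the engine_recover function are names made up for illustration (HE_CMD exists only so the sequence can be dry-run with 'echo').

```shell
# Hypothetical override hook: set HE_CMD=echo to print the commands
# instead of running them. Defaults to the real hosted-engine tool.
HE_CMD="${HE_CMD:-hosted-engine}"

# engine_recover [ha|manual]   (illustrative helper, not part of oVirt)
#   ha     - leave global maintenance so the HA agents manage the engine VM again
#   manual - stay in global maintenance and start the engine VM by hand
engine_recover() {
    mode="${1:-ha}"

    # Reject unknown modes before touching anything.
    case "$mode" in
        ha|manual) ;;
        *) echo "usage: engine_recover [ha|manual]" >&2; return 2 ;;
    esac

    # Show what the hosts currently report about the engine VM.
    "$HE_CMD" --vm-status

    if [ "$mode" = "ha" ]; then
        # Cancel global maintenance; restarting the VM is then left to the HA agents.
        "$HE_CMD" --set-maintenance --mode=none
    else
        # Global maintenance stays on, so nothing else will start the VM for us.
        "$HE_CMD" --vm-start
    fi
}
```

Running 'HE_CMD=echo; engine_recover manual' prints the commands without executing them, which is a cheap way to check the sequence before trying it on a real host.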