Hi,
I created a new host on which to deploy a hosted engine, and then restored
a backup taken from the bare metal engine, following the procedure in:
http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_M...
Everything worked fine up until step 15 ('Continue setup'), at which point
the script said the engine was not responding. I tried the reboot option
(option 3), but it still would not connect, so I could not complete the
final step, in which the host is added to the existing cluster (which
contained two other hosts) using the internal CA. I was able to connect
to the engine via VNC and SSH without any problems, and from there I could
see that the ovirt-engine service was up.
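For what it's worth, my check on the engine VM was roughly the following
(the health page path is from memory, so treat this as a sketch):

    # confirm the engine service is running
    systemctl status ovirt-engine
    # the engine also has a health check page, I believe
    curl -k https://localhost/ovirt-engine/services/health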
To enable LDAP authentication I had to install the aaa-ldap extension
package separately, but once that was done I was able to log in, and the
UI showed the old cluster just as it had been on the bare metal engine.
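In case the detail matters, the extra LDAP step was along these lines
(package and tool names as I recall them):

    # install the LDAP extension and its interactive setup tool
    yum install ovirt-engine-extension-aaa-ldap-setup
    # answer the prompts for the LDAP server, base DN, etc.
    ovirt-engine-extension-aaa-ldap-setup
    # restart the engine so it picks up the new auth profile
    systemctl restart ovirt-engine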
I then added the host on which I had deployed the hosted engine; it
installed various packages, I configured the network, and everything
looked fine, except that I could not see a VM named 'HostedEngine' in the
list of VMs.
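I did not think to capture it at the time, but I believe the HA side can
be checked on the host with something like:

    # state of the hosted engine VM as the HA agent sees it
    hosted-engine --vm-status
    # the two services that are supposed to keep it alive
    systemctl status ovirt-ha-agent ovirt-ha-broker

I can gather that output if it would be useful.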
I suspect, however, that this was not a properly working setup: the NFS
storage I used to set up the hosted engine became unavailable, and I think
this killed the hosted engine, which caused the host it was running on to
reboot. The hosted engine has not come back since, so I'm guessing it
either isn't properly set up for HA, or it still depends on the NFS
storage, or I missed something in the setup. I've restarted the bare metal
engine for now, as I needed it running.
My questions are:
1. My understanding is that the NFS storage is only used initially to
create the hosted engine disk image, that it is temporary, and that the
hosted engine is later migrated to the storage used by the rest of the
cluster (which in my case is directly attached to the hosts via Fibre
Channel). I suspect this migration did not happen. Also, the bare metal
engine had some local ISO storage (on a hard disk local to it) that will
not be replicated to the hosted engine VM; will this cause a problem for
the deployment? If needed, I can create new ISO storage later.
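If it helps with diagnosing that, I believe the storage domain the hosted
engine was actually configured to use is recorded on the host, something
like this (file path and key names from memory):

    # which storage the hosted engine setup recorded
    grep -E '^(storage|domainType)' /etc/ovirt-hosted-engine/hosted-engine.conf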
2. What is the recommended way to recover from this situation? Should I
just run 'hosted-engine --deploy' again and try to find out what is going
wrong at step 15?
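If re-running it is the right approach, my plan would be roughly:

    # re-run the deployment
    hosted-engine --deploy
    # if it fails at step 15 again, collect the setup logs
    ls -lt /var/log/ovirt-hosted-engine-setup/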
Alternatively, I can probably retrieve the disk image that was on the NFS
storage and mount it to find out what went wrong during the initial
deployment, or run the deployment again and capture the log when it fails
at step 15.
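For inspecting the old image, I was thinking of something along these
lines (guestmount is from libguestfs; the NFS export and image paths below
are placeholders, not my real ones):

    # mount the NFS export that held the hosted engine storage domain
    mount -t nfs nfs-server:/export /mnt/he-nfs
    # mount the engine VM's disk image read-only via libguestfs
    guestmount -a /mnt/he-nfs/<sd-uuid>/images/<img-uuid>/<vol-uuid> -i --ro /mnt/he-vm
    # look at the engine logs from inside the VM image
    less /mnt/he-vm/var/log/ovirt-engine/engine.log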
The oVirt version is 4.1.2.2.
Thanks for any help,
Cam