Bill,
I think this is the same problem. I figured out the VNC stuff earlier, got a
console to the engine, and got its status up, but I don't know if that's
enough. I feel like not all of the steps were completed. Once you figured
out what was going on, did you have to redeploy the engine?
On Mon, Jul 30, 2018 at 9:49 PM, William Dossett <william.dossett(a)gmail.com>
wrote:
That happened to me twice… the second time I figured it out and it was
networking.
I am not familiar with hosted-engine --console…
The person that helped me said to do the following:
Run on your first host
hosted-engine --add-console-password
to set a temporary VNC password and then connect to it over VNC with
something like
remote-viewer vnc://<host>:<port>
Which got me in and allowed me to fix the networking once I saw what was
wrong… can you get to the console like that?
Regards
Bill
*From:* Jayme [mailto:jaymef@gmail.com]
*Sent:* Monday, July 30, 2018 3:38 PM
*To:* users <users(a)ovirt.org>
*Subject:* [ovirt-users] Re: Hosted Engine deploy failed on new HCI build
I haven't had much luck with this yet. I completely wiped the three hosts
and did the entire install over again from the ground up, only this time I
used DHCP instead of a static IP for the hosted engine deployment. It ended
up failing again at the exact same step as before: waiting for the VM to
come back up, which it never does.
I still feel like it could be network related in some way, I'm just not sure
how. Any ideas?
On Mon, Jul 30, 2018, 2:25 PM Jayme, <jaymef(a)gmail.com> wrote:
Latest version of oVirt Node 4.2 installed on three hosts. I successfully
completed the Cockpit gdeploy process to deploy HCI. All of that went
well with no errors. I then proceeded to the hosted engine deployment step,
which eventually failed (log attached).
This is the current status:
--== Host 1 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : MASKED
Host ID : 1
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 3fa48e03
local_conf_timestamp : 8468
Host timestamp : 8468
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=8468 (Mon Jul 30 14:21:09 2018)
host-id=1
score=3400
vm_conf_refresh_time=8468 (Mon Jul 30 14:21:09 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineStarting
stopped=False
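For what it's worth, here is a minimal Python sketch of how I read the "Engine status" value above. It only parses the JSON copied from that line; the function names and the wording of the summaries are my own and are not part of hosted-engine. My reading is that "vm": "up" together with "health": "bad" means the VM itself is running but the engine inside it is not answering the liveliness check.

```python
import json


def parse_engine_status(raw: str) -> dict:
    """Parse the JSON value shown on the 'Engine status' line."""
    return json.loads(raw)


def summarize(status: dict) -> str:
    """Return a one-line reading of the status (wording is illustrative only)."""
    vm = status.get("vm")
    health = status.get("health")
    if vm == "up" and health == "good":
        return "engine VM is up and the engine answers the liveliness check"
    if vm == "up":
        # VM is running, but the engine inside it is not reported healthy.
        reason = status.get("reason", "unknown reason")
        return "engine VM is up but the engine is not answering: " + reason
    return "engine VM is not up"


# Value copied from the status output above.
raw = '{"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}'
print(summarize(parse_engine_status(raw)))
```

If that reading is right, the problem would be inside the VM (engine service or its networking) rather than with the VM failing to start.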
If I do hosted-engine --console I get:
The engine VM is running on this host
Connected to domain HostedEngine
Escape character is ^]
error: internal error: cannot find character device <null>
Does anyone know why it may have failed or what I could do to recover from
this? I'm thinking it could have failed due to some problem with the
network config. If I could get a console into the engine VM I might be able
to fix it, but the serial error above is preventing me from reaching the VM
console to diagnose further.
Log of deploy attached:
Thanks!