Update: when the node is rebooted, it fails with "timed out waiting for
device dev-gluster_vg_vdb-gluster_lv_data.device". The node also comes up
with no networking online, which is probably the cause of the gluster
failure.
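One thing worth checking, assuming the brick filesystem is mounted via /etc/fstab: if the mount (or anything ordered after it) races with network/device bring-up at boot, systemd can give up on the device before it appears. A sketch of a mitigated fstab entry follows; the device path matches the unit name in the error above, but the mount point, filesystem type, and timeout value are illustrative and must be adapted to the actual layout:

```
# /etc/fstab -- illustrative entry, not taken from the reporting system.
# _netdev defers the mount until networking is up; x-systemd.device-timeout
# lengthens how long systemd waits for the device before failing.
/dev/gluster_vg_vdb/gluster_lv_data  /gluster_bricks/data  xfs  defaults,_netdev,x-systemd.device-timeout=120s  0 0
```

If the device itself never appears, the problem is more likely LVM activation than the mount, and `journalctl -b` around the timeout message is the place to look.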
On Wed, Feb 6, 2019 at 2:04 PM feral <blistovmhz(a)gmail.com> wrote:
I have no idea what's wrong at this point. Very vanilla install of 3
nodes. The Hyperconverged wizard completes fine. The engine deployment
runs for hours and eventually fails with:
[ INFO ] TASK [oVirt.hosted-engine-setup : Check engine VM health]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.340985", "end": "2019-02-06 11:44:48.836431", "rc": 0, "start": "2019-02-06 11:44:48.495446", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=12994 (Wed Feb 6 11:44:44 2019)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=12995 (Wed Feb 6 11:44:44 2019)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStop\\nstopped=False\\n\", \"hostname\": \"ovirt-431.localdomain\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"5474927a\", \"local_conf_timestamp\": 12995, \"host-ts\": 12994}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=12994 (Wed Feb 6 11:44:44 2019)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=12995 (Wed Feb 6 11:44:44 2019)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStop\\nstopped=False\\n\", \"hostname\": \"ovirt-431.localdomain\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"5474927a\", \"local_conf_timestamp\": 12995, \"host-ts\": 12994}, \"global_maintenance\": false}"]}
[ INFO ] TASK [oVirt.hosted-engine-setup : Check VM status at virt level]
[ INFO ] changed: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : debug]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail if engine VM is not running]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Get target engine VM IP address]
[ INFO ] changed: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Get VDSM's target engine VM stats]
[ INFO ] changed: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Convert stats to JSON format]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Get target engine VM IP address from VDSM stats]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : debug]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail if Engine IP is different from engine's he_fqdn resolved IP]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail is for any other reason the engine didn't started]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The engine failed to start inside the engine VM; please check engine.log."}
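For what it's worth, the "stdout" payload in the failed task above is itself the JSON emitted by `hosted-engine --vm-status --json`. A minimal sketch of pulling the health fields out of it, using an abridged copy of the values from the log above (only the fields the sketch inspects are kept):

```python
import json

# Abridged "stdout" payload from the failed task above; the full output also
# carries the "extra" metadata, score, and timestamp fields.
vm_status = json.loads(
    '{"1": {"hostname": "ovirt-431.localdomain", "host-id": 1, '
    '"engine-status": {"reason": "failed liveliness check", "health": "bad", '
    '"vm": "up", "detail": "Up"}, "score": 3400, "stopped": false, '
    '"maintenance": false}, "global_maintenance": false}'
)

for host_id, host in vm_status.items():
    if host_id == "global_maintenance":
        continue  # top-level flag, not a host entry
    engine = host["engine-status"]
    # "vm": "up" together with "health": "bad" means the engine VM is running
    # but the engine service inside it is not answering the liveliness check.
    print(host["hostname"], engine["vm"], engine["health"], engine["reason"])
```

Reading it this way makes the failure mode clear: the deployment is not failing because the VM is down, but because the engine service inside the running VM never becomes healthy.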
---------------------------------------------------
I can't check engine.log because I can't connect to the VM once this
failure occurs. I can ssh in before the VM is moved to gluster storage,
but as soon as that move starts, the VM never comes back online.
--
_____
Fact:
1. Ninjas are mammals.
2. Ninjas fight ALL the time.
3. The purpose of the ninja is to flip out and kill people.