Update: when the node is rebooted, it fails with "timed out waiting for device dev-gluster_vg_vdb-gluster_lv_data.device".
The node also has no networking online, which is probably the cause of the gluster failure.
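Roughly what I'm checking on the node after boot (assuming a stock systemd/LVM/NetworkManager install, and that the unit name above maps to /dev/gluster_vg_vdb/gluster_lv_data):

  # state of the gluster LV device unit that timed out
  systemctl status dev-gluster_vg_vdb-gluster_lv_data.device
  journalctl -b | grep -i gluster_lv_data

  # is the volume group / logical volume actually visible?
  vgs
  lvs gluster_vg_vdb

  # did the network ever come online?
  nmcli device status
  systemctl status network-online.target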

On Wed, Feb 6, 2019 at 2:04 PM feral <blistovmhz@gmail.com> wrote:
I have no idea what's wrong at this point. Very vanilla install of 3 nodes. Run the Hyperconverged wizard, which completes fine. Run the engine deployment; it takes hours and eventually fails with:

[ INFO ] TASK [oVirt.hosted-engine-setup : Check engine VM health]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.340985", "end": "2019-02-06 11:44:48.836431", "rc": 0, "start": "2019-02-06 11:44:48.495446", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=12994 (Wed Feb 6 11:44:44 2019)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=12995 (Wed Feb 6 11:44:44 2019)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStop\\nstopped=False\\n\", \"hostname\": \"ovirt-431.localdomain\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"5474927a\", \"local_conf_timestamp\": 12995, \"host-ts\": 12994}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=12994 (Wed Feb 6 11:44:44 2019)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=12995 (Wed Feb 6 11:44:44 2019)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStop\\nstopped=False\\n\", \"hostname\": \"ovirt-431.localdomain\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"5474927a\", \"local_conf_timestamp\": 12995, \"host-ts\": 12994}, \"global_maintenance\": false}"]}
[ INFO ] TASK [oVirt.hosted-engine-setup : Check VM status at virt level]
[ INFO ] changed: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : debug]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail if engine VM is not running]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Get target engine VM IP address]
[ INFO ] changed: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Get VDSM's target engine VM stats]
[ INFO ] changed: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Convert stats to JSON format]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Get target engine VM IP address from VDSM stats]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : debug]
[ INFO ] ok: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail if Engine IP is different from engine's he_fqdn resolved IP]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [oVirt.hosted-engine-setup : Fail is for any other reason the engine didn't started]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The engine failed to start inside the engine VM; please check engine.log."}

---------------------------------------------------

I can't check engine.log because I can't connect to the VM once this failure occurs. I can SSH in before the VM is moved to gluster storage, but as soon as that move starts, the VM never comes back online.
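In case it helps, what I'm planning to try next for log access (these paths/commands are my assumptions for this kind of hosted-engine deploy, not verified here):

  # on the host: hosted-engine HA agent and broker logs
  less /var/log/ovirt-hosted-engine-ha/agent.log
  less /var/log/ovirt-hosted-engine-ha/broker.log

  # serial console into the engine VM, to read /var/log/ovirt-engine/engine.log from inside it
  hosted-engine --console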


--
_____
Fact:
1. Ninjas are mammals.
2. Ninjas fight ALL the time.
3. The purpose of the ninja is to flip out and kill people.

