I am having a problem with the hosted-engine deployment. After a weekend spent getting this far, I am stuck and cannot figure out how to fix it.

I am starting with 1 host and will have 4 when this is finished. Storage is GlusterFS, hyperconverged, but I am managing that myself outside of oVirt. It's currently a single-node GlusterFS volume, which I will expand across the other nodes as well. I get all the way through the initial hosted-engine deployment (via the Cockpit interface) pre-storage, then most of the way through the storage portion. It fails when starting the HostedEngine VM in its final state, after copying the VM disk to shared storage.

This is where it gets weird.

[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn ovirt.deleted.domain resolves to 192.168.x.x. If you are using DHCP, check your DHCP reservation configuration"}
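As I read that message, the deployment compares the address it detected for the engine VM with the address that he_fqdn resolves to, and in my case the detected address is blank. Here is a minimal sketch of that comparison, purely for illustration. `compare_engine_ip` is a hypothetical helper of my own, not part of oVirt or the deployment playbook, and I am only assuming this is roughly what the check does.

```shell
# Hypothetical helper (not part of oVirt): mimic the comparison the error
# message describes, as far as I can tell from its wording.
compare_engine_ip() {
    vm_ip="$1"        # address detected for the engine VM (blank in my case)
    resolved_ip="$2"  # address the he_fqdn resolves to
    if [ -z "$vm_ip" ]; then
        echo "no-ip"      # the VM never reported an address at all
    elif [ "$vm_ip" = "$resolved_ip" ]; then
        echo "match"
    else
        echo "mismatch"   # the case the DHCP reservation hint is aimed at
    fi
}
```

For example, `compare_engine_ip "" "$(getent hosts ovirt.deleted.domain | awk '{ print $1 }')"` prints `no-ip`, which is the situation I appear to be in: the name resolves fine, but the VM has no address to compare against.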

I've masked out the domain and IP for obvious reasons. However, I don't think this deployment error is the real cause of the failure; it's just the step the deployment had reached when it failed. The HostedEngine VM is starting, but not actually booting. I was able to set the VNC password with `hosted-engine --add-console-password` and view the local console display, but it just shows "The guest has not initialized the display (yet)".

I also did:

# hosted-engine --console
The engine VM is running on this host
Escape character is ^]

But it goes no further and accepts no input. The VM does not respond on the network either. I suspect it never reaches the initial BIOS screen, let alone boots. What would cause that?

Here is the GlusterFS volume, for reference:

# gluster volume info storage
Volume Name: storage
Type: Distribute
Volume ID: e9544310-8890-43e3-b49c-6e8c7472dbbb
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: node1:/var/glusterfs/storage/1
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.ping-timeout: 5
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 1024
cluster.locking-scheme: full
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
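In case it helps anyone compare settings, here is a rough sketch for pulling a single option out of that output. `volume_option` is a hypothetical helper name of mine, not a gluster subcommand; it only parses text shaped like the listing above and reads from standard input unless a file is given.

```shell
# volume_option KEY [FILE]
# Hypothetical helper: print the value of KEY from `gluster volume info`
# style output ("key: value" lines). Reads stdin when FILE is omitted.
volume_option() {
    key="$1"
    src="${2:--}"   # "-" tells awk to read standard input
    awk -v key="$key" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)        # tolerate indented output
        }
        index(line, key ": ") == 1 {        # line starts with "KEY: "
            print substr(line, length(key) + 3)
            exit
        }
    ' "$src"
}
```

I would run it as `gluster volume info storage | volume_option storage.owner-uid`, which for the listing above should give 36.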

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Xeon(R) CPU E3-1280 V2 @ 3.60GHz
stepping : 9
microcode : 0x21
cpu MHz : 4000.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips : 7199.86
clflush size : 64
cache_alignment: 64
address sizes : 36 bits physical, 48 bits virtual
power management:
[ plus 7 more ]
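Since hardware virtualization support is an obvious thing to rule out, here is a small sketch for checking a single flag in that output. `has_cpu_flag` is a hypothetical helper of mine, not an existing tool; it reads /proc/cpuinfo by default and only looks at the first "flags" line.

```shell
# has_cpu_flag FLAG [FILE]
# Hypothetical helper: succeed (exit 0) if FLAG is listed on the first
# "flags" line of FILE, which defaults to /proc/cpuinfo.
has_cpu_flag() {
    flag="$1"
    file="${2:-/proc/cpuinfo}"
    awk -F: -v want="$flag" '
        $1 ~ /^flags[ \t]*$/ {
            n = split($2, f, " ")            # split the flag list on whitespace
            for (i = 1; i <= n; i++)
                if (f[i] == want) found = 1  # exact match only, no substrings
            exit                             # first CPU is enough; jump to END
        }
        END { exit !found }
    ' "$file"
}
```

Usage would be something like `has_cpu_flag vmx && echo "vmx present"`. Going by the flags line above, both vmx and ept are listed on this host.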

Thanks for any insight that can be provided.