Dear oVirt hackers,
(CZ: if it helps, we can also communicate in Czech)
we are dealing with a hosted engine deployment issue on fresh AMD EPYC servers,
and we are ready to donate hardware to the oVirt community once we get past this issue ( :-) )
0/ base infra:
- 3 identical physical servers (produced in 2021-4Q)
- fresh, clean and recent version of CentOS Stream 8 installed (@^minimal-environment)
- servers are interconnected through a Cisco switch, can all reach each other over the network,
and all have working internet access (NAT)
1/ storage:
- all 3 servers/nodes host a clean GlusterFS (v9.5) setup, and the volume
"vol-images01" is ready for VM images
- the oVirt hosted engine deployment procedure:
- readily accepts the GlusterFS storage domain mentioned above
- mounts it during "hosted-engine --deploy" with no issue
- all permissions are set correctly on all GlusterFS nodes ("chown vdsm.kvm
vol-images01")
- no issue with the storage domain at all
2/ oVirt - hosted engine deployment:
- all 3 servers successfully deployed the recent oVirt version with the standard procedure
(on top of a minimal install of CentOS Stream 8):
dnf -y install ovirt-host
virt-host-validate: PASS ALL
- on the first server we continue with:
dnf -y install ovirt-engine-appliance
hosted-engine --deploy (pure command line, so no Cockpit is used)
DEPLOYMENT ISSUE:
- during the "hosted-engine --deploy" procedure the hosted engine becomes temporarily
accessible at: https://server01:6900/ovirt-engine/
- with a request to manually set up the "ovirtmgmt" virtual NIC
- Hosts > server01 > Network Interfaces > [SETUP HOST NETWORKS]
"ovirtmgmt" dropped onto eno1 - [OK]
- then everything passes fine and host "server01" becomes Active
- back on the command line, we continue the deployment ("Pause execution until
/tmp/ansible.jksf4_n2_he_setup_lock is removed")
by removing the lock file
- the deployment then passes all steps until "[ INFO ] TASK
[ovirt.ovirt.hosted_engine_setup : Check engine VM health]"
ISSUE DETAILS: the new VM is not accessible in the final stage, when it should be
reachable at its final IP:
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Fail if Engine IP is different from
engine's he_fqdn resolved IP]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
"Engine VM IP address is while the engine's he_fqdn ovirt-engine.mgmt.pss.local
resolves to 10.210.1.101. If you are using DHCP, check your DHCP reservation
configuration"}
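
For reference, the comparison behind this error can be approximated by hand. The following is only a sketch under assumptions, not the actual Ansible task: the function name check_engine_ip is hypothetical, the VM address has to be supplied by whoever runs it, and the name is resolved with getent.

```shell
#!/bin/sh
# Sketch only: mimics the idea of the failing check, not the real Ansible task.
# check_engine_ip is a hypothetical helper name.
#   $1 = IP address reported for the engine VM (may be empty)
#   $2 = engine FQDN (he_fqdn)
check_engine_ip() {
    vm_ip="$1"
    fqdn="$2"

    # An empty VM address means the VM never reported an IP at all.
    if [ -z "$vm_ip" ]; then
        echo "NO_VM_IP: engine VM reported no address"
        return 3
    fi

    # Resolve the FQDN through the system resolver (hosts file, DNS).
    resolved=$(getent hosts "$fqdn" | awk 'NR==1 {print $1}')
    if [ -z "$resolved" ]; then
        echo "UNRESOLVED: $fqdn"
        return 2
    fi

    if [ "$vm_ip" = "$resolved" ]; then
        echo "MATCH: $fqdn -> $resolved"
        return 0
    fi
    echo "MISMATCH: VM has $vm_ip, $fqdn resolves to $resolved"
    return 1
}
```

The error message above appears to show no address at all for the engine VM, which would correspond to calling check_engine_ip "" ovirt-engine.mgmt.pss.local and landing in the first branch, i.e. the VM itself has no IP rather than DNS returning a wrong one.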
- the problem occurs whether we use a "Static" IP (provided while answering the
deployment questions) or the "DHCP" way (with properly set DHCP and DNS servers responding
with the correct IP for both)
WE ARE STUCK THERE
WE TRIED:
- connecting to the terminal/VNC of the running VM "HostedEngine" to figure
out the internal network issue, with no success
Any suggestion how to "connect" into the newly deployed, UP and RUNNING HostedEngine
VM, to figure out and possibly manually fix the internal network issue?
Thank you all for your help
Charles Stellen
PS: we are experienced with oVirt deployment (since version 4.0) and have 10+ years of
experience with GNU/Linux KVM-based virtualisation,
so whatever suggestions or details you request, WE ARE READY to provide them;
online debugging or direct access to the servers is not a problem
PPS: once we get past this deployment, and after the decommissioning procedure, we are ready to
provide the older HW to the oVirt community