hosted engine deployment (v4.4.10) - TASK Check engine VM health - fatal FAILED


While I have not answered your question directly, I would strongly advise you to just use oVirt Node. I went through similar build issues all the time. Ansible (well, whoever wrote the playbook) can be finicky sometimes, and I found that when I deployed oVirt Node I was done in under an hour with absolutely zero issues. The +1 is knowing that if I upgrade oVirt on oVirt Node, I will (hopefully) have a much smaller chance of breaking oVirt on an upgrade.

On Tue, Feb 8, 2022 at 9:04 AM Charles Stellen <charles@inux.cz> wrote:
Dear Ovirt Hackers,
sorry: accidentally sent to devel@ovirt.org earlier
we are dealing with a hosted engine deployment issue on fresh AMD EPYC servers,
and we are ready to donate hardware to the oVirt community after we get past this issue ( :-) )
0/ base infra:
- 3 identical physical servers (produced 2021-Q4)
- fresh, clean, recent CentOS 8 Stream installed (@^minimal-environment)
- servers interconnected with a Cisco switch, visible to each other on the network, all with internet access (NAT)
1/ storage:
- all 3 servers/nodes host a clean GlusterFS (v9.5) setup, and volume "vol-images01" is ready for VM images
- the oVirt hosted engine deployment procedure easily accepts the mentioned GlusterFS storage domain and mounts it during "hosted-engine --deploy" with no issue
- all permissions are set correctly on all GlusterFS nodes ("chown vdsm.kvm vol-images01")
- no issue with the storage domain at all
2/ ovirt - hosted engine deployment:
- all 3 servers successfully deployed a recent oVirt version with the standard procedure (on top of a minimal CentOS 8 Stream install):
dnf -y install ovirt-host
virt-host-validate: PASS ALL
- on the first server we continue with:
dnf -y install ovirt-engine-appliance
hosted-engine --deploy
(pure command line - so no Cockpit is used)
DEPLOYMENT ISSUE:
- during the "hosted-engine --deploy" procedure, the hosted engine becomes temporarily accessible at https://server01:6900/ovirt-engine/, with a request to manually set up the "ovirtmgmt" virtual NIC
- Hosts > server01 > Network Interfaces > [SETUP HOST NETWORKS]: "ovirtmgmt" dropped onto eno1 - [OK]
- then everything passes fine and host "server01" becomes Active
- back on the command line, we continue the deployment ("Pause execution until /tmp/ansible.jksf4_n2_he_setup_lock is removed") by removing the lock file
- the deployment then passes all steps _until_ "[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Check engine VM health]"
ISSUE DETAILS: the new VM is not accessible in the final stage, when it should be reachable at its final IP:
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Fail if Engine IP is different from engine's he_fqdn resolved IP]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn ovirt-engine.mgmt.pss.local resolves to 10.210.1.101. If you are using DHCP, check your DHCP reservation configuration"}
- the problem is that we get stuck there whether we go the "Static" IP way (the address provided during the answer procedure) or the "DHCP" way (with a properly configured DHCP and DNS server responding with the correct IP for both) - WE ARE STUCK THERE
WE TRIED: connecting to the terminal/VNC of the running "HostedEngine" VM to figure out the internal network issue - no success
Any suggestion on how to "connect" to the newly deployed, up and running HostedEngine VM, to investigate and eventually manually fix the internal network issue?
Thank you all for your help.
Charles Stellen
PS: we are experienced with oVirt deployment (since version 4.0) and with GNU/Linux KVM-based virtualisation for 10+ years, so any suggestions or requests for details are welcome - WE ARE READY to provide online debugging, and direct access to the servers is not a problem
PPS: after we get past this deployment - and after our decommissioning procedure - we are ready to donate the older HW to the oVirt community
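[Editor's note] The question above - how to get a console on a just-deployed HostedEngine VM - can be sketched roughly as follows, run on the deployment host itself. This is a sketch, not the thread's own answer; the read-only virsh queries avoid needing vdsm's SASL credentials:

```shell
# Serial console of the local HostedEngine VM (Ctrl+] detaches)
hosted-engine --console

# Or set a temporary VNC password, then point a VNC client at the host
hosted-engine --add-console-password

# Read-only libvirt queries work without vdsm's SASL credentials
virsh -r -c qemu:///system list                     # confirm the HostedEngine domain is running
virsh -r -c qemu:///system domifaddr HostedEngine   # IP the guest got from the local DHCP, if any
```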
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/LKOLWUCOFAHCXS...

On Tue, Feb 8, 2022 at 4:05 PM Charles Stellen <charles@inux.cz> wrote:
Can you please try with qemu 6.0.0? See the other threads here about the broken 6.1. Sorry for that.
Good luck and best regards,
-- Didi

Hello,
It's a known issue. (Yesterday it took me 4 cups of coffee and ~4-5 hours of lost sleep to remember this fact...) The latest qemu update (6.1) is broken and fails during --deploy. Make sure you run 'dnf downgrade qemu*' a couple of times on the first host, until you get qemu 6.0. Once done, try deploying again.
- Gilboa
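[Editor's note] A minimal sketch of the downgrade Gilboa describes; exact package names depend on the repo setup:

```shell
# Check which qemu build is currently installed
rpm -q qemu-kvm

# Step back through available versions; repeat the downgrade if the
# first pass only reaches an intermediate 6.1 build
dnf -y downgrade 'qemu*'

# Verify a 6.0.x version before retrying hosted-engine --deploy
rpm -q qemu-kvm
```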

Or just add an exclude in /etc/dnf/dnf.conf

On Tue, Feb 8, 2022 at 18:32, Gilboa Davara <gilboad@gmail.com> wrote:

On Wed, Feb 9, 2022 at 1:05 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Or just add an exclude in /etc/dnf/dnf.conf
I personally added an exclusion to /etc/yum.repos.d/CentOS-Stream-AppStream.repo: exclude=qemu*
It allows the ovirt-4.4* repos to push a new qemu release, without letting CentOS Stream break things...
- Gilboa
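[Editor's note] For reference, the repo-scoped exclusion would look roughly like this - a sketch only; the section name and existing lines come from a stock CentOS Stream 8 repo file and may differ on your system:

```
# /etc/yum.repos.d/CentOS-Stream-AppStream.repo
[appstream]
name=CentOS Stream $releasever - AppStream
# ... existing mirrorlist/gpgkey lines unchanged ...
# Skip qemu packages from this repo only; oVirt's own repos can still provide them
exclude=qemu*
```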

On Wed, Feb 9, 2022 at 12:47 PM Gilboa Davara <gilboad@gmail.com> wrote:
But new libvirt versions may require a newer qemu version, and oVirt itself may require a new libvirt version. These kinds of excludes are fragile and need constant maintenance.
Nir

On Wed, Feb 9, 2022 at 3:35 PM Nir Soffer <nsoffer@redhat.com> wrote:
The previous poster proposed a global qemu exclusion. I propose a partial qemu exclusion (on CentOS Stream only), with the assumption that the oVirt-required qemu will be pushed directly via the oVirt repo. In both cases, this is a temporary measure needed to avoid using the broken qemu pushed by Stream. In both cases the libvirt update from AppStream will get blocked - assuming it requires the broken qemu release.
Do you advise we simply pass --exclude=qemu* every time we run dnf? I would imagine it's far more dangerous and will block the libvirt update just as well. ... Unless I'm missing something?
- Gilboa
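[Editor's note] For comparison, the per-invocation form being discussed would look roughly like this - a sketch; unlike the repo-file exclude, it has to be remembered on every run:

```shell
# Per-invocation exclude: qemu packages are skipped for this run only
dnf update --exclude='qemu*'
```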

On Wed, Feb 9, 2022 at 5:06 PM Gilboa Davara <gilboad@gmail.com> wrote:
I don't have a better solution, I just wanted to warn about these excludes. Nir

On Wed, Feb 9, 2022, 21:33 Nir Soffer <nsoffer@redhat.com> wrote:
Ok, understood, thanks. Gilboa


Hello,
I somehow missed your reply (and was AFK nearly two weeks). How can I test qemu 6.2? Is it available in some repo?
- Gilboa

On Thu, Feb 10, 2022 at 4:31 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
I meant blacklisting only the broken qemu packages. Something like: exclude=qemu*6.1.0-4.module_el8.6.0+983+a7505f3f.x86_64
or, even more explicitly, full package names with version & arch.
New packages would not match the filter.
By the way, did anyone check qemu 6.2.0 ?
Best Regards, Strahil Nikolov
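[Editor's note] Strahil's version-pinned variant as a dnf.conf fragment - a sketch; the NEVRA string is the one quoted above, and because the glob names one specific build, newer qemu versions fall outside the filter and install normally:

```
# /etc/dnf/dnf.conf
[main]
# Exclude only the known-broken qemu build; future versions still install
exclude=qemu*6.1.0-4.module_el8.6.0+983+a7505f3f.x86_64
```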


On Mon, Feb 21, 2022 at 12:07 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
You can blacklist packages in dnf with specific version, and thus you don't need to blacklist from repo.
Best Regards, Strahil Nikolov
Hello,
Understood. Per your qemu 6.2 question, how can I test it? Is it packaged in some testing repo?
- Gilboa

Nope, it's in AppStream (CentOS Stream), but I never tested it.
Best Regards,
Strahil Nikolov
participants (6)
- Charles Kozler
- Charles Stellen
- Gilboa Davara
- Nir Soffer
- Strahil Nikolov
- Yedidyah Bar David