hosted engine deployment (v4.4.10) - TASK Check engine VM health - fatal FAILED


While I have not answered your question directly, I would strongly advise you to just use oVirt Node. I went through similar build issues all the time. Ansible (well, whoever wrote the playbook) can be finicky sometimes, and I found that when I deployed oVirt Node I was done in under an hour with absolutely zero issues. The +1 is knowing that if I upgrade oVirt on oVirt Node, I will (hopefully) have a much smaller chance of breaking oVirt on an upgrade.

On Tue, Feb 8, 2022 at 9:04 AM Charles Stellen <charles@inux.cz> wrote:
Dear Ovirt Hackers,
sorry: accidentally sent to devel@ovirt.org earlier
we are dealing with a hosted engine deployment issue on fresh AMD EPYC servers,
and we are ready to donate hardware to the oVirt community after we get past this issue ( :-) )
0/ base infra:
- 3 identical physical servers (produced 2021-Q4)
- fresh, clean, recent CentOS 8 Stream installed (@^minimal-environment)
- servers interconnected with a Cisco switch, visible to each other on the network, all with internet access (NAT)
1/ storage:
- all 3 servers/nodes host a clean GlusterFS (v9.5) setup, and volume "vol-images01" is ready for VM images
- the oVirt hosted engine deployment procedure easily accepts the mentioned GlusterFS storage domain and mounts it during "hosted-engine --deploy" with no issue
- all permissions are set correctly on all GlusterFS nodes ("chown vdsm.kvm vol-images01")
- no issue with the storage domain at all
2/ ovirt - hosted engine deployment:
- all 3 servers successfully deployed a recent oVirt version with the standard procedure (on top of a minimal CentOS 8 Stream install):
dnf -y install ovirt-host
virt-host-validate: PASS ALL
- on the first server we continue with:
dnf -y install ovirt-engine-appliance
hosted-engine --deploy
(pure command line - so no Cockpit is used)
DEPLOYMENT ISSUE:
- during the "hosted-engine --deploy" procedure, the hosted engine becomes temporarily accessible at https://server01:6900/ovirt-engine/, with a request to manually set up the "ovirtmgmt" virtual NIC
- Hosts > server01 > Network Interfaces > [SETUP HOST NETWORKS]: "ovirtmgmt" dropped onto eno1 - [OK]
- then everything passes fine and host "server01" becomes Active
- back on the command line, we continue the deployment ("Pause execution until /tmp/ansible.jksf4_n2_he_setup_lock is removed") by removing the lock file
- the deployment then passes all steps _until_ "[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Check engine VM health]"
ISSUE DETAILS: the new VM is not accessible in the final stage, when it should be reachable at its final IP:
[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Fail if Engine IP is different from engine's he_fqdn resolved IP]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Engine VM IP address is while the engine's he_fqdn ovirt-engine.mgmt.pss.local resolves to 10.210.1.101. If you are using DHCP, check your DHCP reservation configuration"}
- the problem is that we get stuck there whether we go the "Static" IP way (the address provided during the answer procedure) or the "DHCP" way (with a properly configured DHCP and DNS server responding with the correct IP for both) - WE ARE STUCK THERE
WE TRIED: connecting to the terminal/VNC of the running "HostedEngine" VM to figure out the internal network issue - no success
Any suggestion on how to "connect" to the newly deployed, up and running HostedEngine VM, to investigate and eventually manually fix the internal network issue?
Thank you all for your help.
Charles Stellen
PS: we are experienced with oVirt deployment (since version 4.0) and with GNU/Linux KVM-based virtualisation for 10+ years, so any suggestions or requests for details are welcome - WE ARE READY to provide online debugging, and direct access to the servers is not a problem
PPS: after we get past this deployment - and after our decommissioning procedure - we are ready to donate the older HW to the oVirt community
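[Editor's note] The question above - how to get a console on a just-deployed HostedEngine VM - can be sketched roughly as follows, run on the deployment host itself. This is a sketch, not the thread's own answer; the read-only virsh queries avoid needing vdsm's SASL credentials:

```shell
# Serial console of the local HostedEngine VM (Ctrl+] detaches)
hosted-engine --console

# Or set a temporary VNC password, then point a VNC client at the host
hosted-engine --add-console-password

# Read-only libvirt queries work without vdsm's SASL credentials
virsh -r -c qemu:///system list                     # confirm the HostedEngine domain is running
virsh -r -c qemu:///system domifaddr HostedEngine   # IP the guest got from the local DHCP, if any
```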
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-leave@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/LKOLWUCOFAHCXS...

On Tue, Feb 8, 2022 at 4:05 PM Charles Stellen <charles@inux.cz> wrote:
Can you please try with qemu 6.0.0? See the other threads here about the broken 6.1. Sorry for that.
Good luck and best regards,
-- Didi

Hello,
It's a known issue. (Yesterday it took me 4 cups of coffee and ~4-5 hours of lost sleep to remember this fact...) The latest qemu update (6.1) is broken and fails during --deploy. Make sure you run 'dnf downgrade qemu*' a couple of times on the first host, until you get qemu 6.0. Once done, try deploying again.
- Gilboa
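[Editor's note] A minimal sketch of the downgrade Gilboa describes; exact package names depend on the repo setup:

```shell
# Check which qemu build is currently installed
rpm -q qemu-kvm

# Step back through available versions; repeat the downgrade if the
# first pass only reaches an intermediate 6.1 build
dnf -y downgrade 'qemu*'

# Verify a 6.0.x version before retrying hosted-engine --deploy
rpm -q qemu-kvm
```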

Or just add an exclude in /etc/dnf/dnf.conf

On Tue, Feb 8, 2022 at 18:32, Gilboa Davara <gilboad@gmail.com> wrote:

On Wed, Feb 9, 2022 at 1:05 AM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
Or just add an exclude in /etc/dnf/dnf.conf
I personally added an exclusion to /etc/yum.repos.d/CentOS-Stream-AppStream.repo: exclude=qemu*
It allows the ovirt-4.4* repos to push a new qemu release, without letting CentOS Stream break things...
- Gilboa
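[Editor's note] For reference, the repo-scoped exclusion would look roughly like this - a sketch only; the section name and existing lines come from a stock CentOS Stream 8 repo file and may differ on your system:

```
# /etc/yum.repos.d/CentOS-Stream-AppStream.repo
[appstream]
name=CentOS Stream $releasever - AppStream
# ... existing mirrorlist/gpgkey lines unchanged ...
# Skip qemu packages from this repo only; oVirt's own repos can still provide them
exclude=qemu*
```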

On Wed, Feb 9, 2022 at 12:47 PM Gilboa Davara <gilboad@gmail.com> wrote:
But new libvirt versions may require a newer qemu version, and oVirt itself may require a new libvirt version. These kinds of excludes are fragile and need constant maintenance.
Nir

On Wed, Feb 9, 2022 at 3:35 PM Nir Soffer <nsoffer@redhat.com> wrote:
The previous poster proposed a global qemu exclusion. I propose a partial qemu exclusion (on CentOS Stream only), with the assumption that the oVirt-required qemu will be pushed directly via the oVirt repo. In both cases, this is a temporary measure needed to avoid using the broken qemu pushed by Stream. In both cases the libvirt update from AppStream will get blocked - assuming it requires the broken qemu release.
Do you advise we simply pass --exclude=qemu* every time we run dnf? I would imagine it's far more dangerous and will block the libvirt update just as well. ... Unless I'm missing something?
- Gilboa
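[Editor's note] For comparison, the per-invocation form being discussed would look roughly like this - a sketch; unlike the repo-file exclude, it has to be remembered on every run:

```shell
# Per-invocation exclude: qemu packages are skipped for this run only
dnf update --exclude='qemu*'
```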

On Wed, Feb 9, 2022 at 5:06 PM Gilboa Davara <gilboad@gmail.com> wrote:
I don't have a better solution, I just wanted to warn about these excludes. Nir

On Wed, Feb 9, 2022, 21:33 Nir Soffer <nsoffer@redhat.com> wrote:
Ok, understood, thanks. Gilboa


Hello,
I somehow missed your reply (and was AFK nearly two weeks). How can I test qemu 6.2? Is it available in some repo?
- Gilboa

On Thu, Feb 10, 2022 at 4:31 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
I meant blacklisting only the broken qemu packages. Something like: exclude=qemu*6.1.0-4.module_el8.6.0+983+a7505f3f.x86_64
or, even more explicitly, full package names with version & arch.
New packages would not match the filter.
By the way, did anyone check qemu 6.2.0 ?
Best Regards, Strahil Nikolov
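[Editor's note] Strahil's version-pinned variant as a dnf.conf fragment - a sketch; the NEVRA string is the one quoted above, and because the glob names one specific build, newer qemu versions fall outside the filter and install normally:

```
# /etc/dnf/dnf.conf
[main]
# Exclude only the known-broken qemu build; future versions still install
exclude=qemu*6.1.0-4.module_el8.6.0+983+a7505f3f.x86_64
```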


On Mon, Feb 21, 2022 at 12:07 PM Strahil Nikolov <hunter86_bg@yahoo.com> wrote:
You can blacklist packages in dnf with specific version, and thus you don't need to blacklist from repo.
Best Regards, Strahil Nikolov
Hello,
Understood. Per your qemu 6.2 question, how can I test it? Is it packaged in some testing repo?
- Gilboa

Nope, it's in AppStream (CentOS Stream), but I never tested it.
Best Regards,
Strahil Nikolov
participants (6)
- Charles Kozler
- Charles Stellen
- Gilboa Davara
- Nir Soffer
- Strahil Nikolov
- Yedidyah Bar David