Please, Please Help - New oVirt Install/Deployment Failing - "Host is not up..."
by Matthew J Black
Hi Everyone,
Could someone please help me - I've been trying to do an install of oVirt for *weeks* (including false starts and self-inflicted wounds/errors) and it is still not working.
My setup:
- oVirt v4.5.3
- A brand new fresh vanilla install of RockyLinux 8.6 - all working AOK
- 2*NICs in a bond (802.3ad) with a couple of sub-Interfaces/VLANs - all working AOK
- All relevant IPv4 Address in DNS with Reverse Lookups - all working AOK
- All relevant IPv4 Address in "/etc/hosts" file - all working AOK
- IPv6 (using "method=auto" in the interface config file) enabled on the relevant sub-Interface/VLAN - I'm not using IPv6 on the network, only IPv4, but I'm trying to cover all the bases.
- All relevant Ports (as per the oVirt documentation) set up on the firewall
- ie firewall-cmd --add-service={{ libvirt-tls | ovirt-imageio | ovirt-vmconsole | vdsm }}
- All the relevant Repositories installed (ie RockyLinux BaseOS, AppStream, & PowerTools, and the EPEL, plus the ones from the oVirt documentation)
I have followed the oVirt documentation (including the special RHEL-instructions and RockyLinux-instructions) to the letter - no deviations, no special settings, exactly as they are written.
All the dnf installs, etc, went off without a hitch, including the "dnf install centos-release-ovirt45", "dnf install ovirt-engine-appliance", and "dnf install ovirt-hosted-engine-setup" - no errors anywhere.
Here are the results of a "dnf repolist":
- appstream Rocky Linux 8 - AppStream
- baseos Rocky Linux 8 - BaseOS
- centos-ceph-pacific CentOS-8-stream - Ceph Pacific
- centos-gluster10 CentOS-8-stream - Gluster 10
- centos-nfv-openvswitch CentOS-8 - NFV OpenvSwitch
- centos-opstools CentOS-OpsTools - collectd
- centos-ovirt45 CentOS Stream 8 - oVirt 4.5
- cs8-extras CentOS Stream 8 - Extras
- cs8-extras-common CentOS Stream 8 - Extras common packages
- epel Extra Packages for Enterprise Linux 8 - x86_64
- epel-modular Extra Packages for Enterprise Linux Modular 8 - x86_64
- ovirt-45-centos-stream-openstack-yoga CentOS Stream 8 - oVirt 4.5 - OpenStack Yoga Repository
- ovirt-45-upstream oVirt upstream for CentOS Stream 8 - oVirt 4.5
- powertools Rocky Linux 8 - PowerTools
So I kicked-off the oVirt deployment with: "hosted-engine --deploy --4 --ansible-extra-vars=he_offline_deployment=true".
I used "--ansible-extra-vars=he_offline_deployment=true" because without that flag I was getting "DNF timeout" issues (see my previous post `Local (Deployment) VM Can't Reach "centos-ceph-pacific" Repo`).
I accepted the defaults for all of the questions the script asked, or entered the deployment-relevant answers where appropriate. In doing this I double-checked every answer before hitting <Enter>. Everything progressed smoothly until the deployment reached the "Wait for the host to be up" task... which then hung for more than 30 minutes before failing.
From the ovirt-hosted-engine-setup... log file:
- 2022-10-20 17:54:26,285+1100 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}
I checked the following log files and found all of the relevant ERROR lines, then checked several tens of preceding and succeeding lines trying to determine what was going wrong, but I could not determine anything.
- ovirt-hosted-engine-setup...
- ovirt-hosted-engine-setup-ansible-bootstrap_local_vm...
- ovirt-hosted-engine-setup-ansible-final_clean... - not really relevant, I believe
I can include the log files (or the relevant parts of the log files) if people want - but they are very large: several hundred kilobytes each.
I also googled "oVirt Host is not up" and found several entries, but after reading them all the most relevant seems to be a thread from this mailing list: `Install of RHV 4.4 failing - "Host is not up, please check logs, perhaps also on the engine machine"` - but this seems to be talking about an upgrade and I didn't glean anything useful from it - I could, of course, be wrong about that.
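The search described above (ERROR lines plus a window of surrounding lines) can be sketched as a small shell helper. This is only an illustration: the function name errors_with_context is hypothetical, and the log directory in the usage comment is assumed to be the usual default location, since the post elides the actual file names.

```shell
#!/usr/bin/env bash
# Print every line containing " ERROR " together with N lines of context
# before and after it. grep -n prefixes matches with "LINE:" and context
# lines with "LINE-".
# $1: log file, $2: lines of context (default 30)
errors_with_context() {
    local log="$1" ctx="${2:-30}"
    grep -n -B "$ctx" -A "$ctx" ' ERROR ' "$log"
}

# Usage (assumed default log directory; file names are elided in the post):
# for f in /var/log/ovirt-hosted-engine-setup/*.log; do
#     errors_with_context "$f" 20
# done
```

Posting only these windows is usually more practical than attaching whole log files of several hundred kilobytes.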
So my questions are:
- Where else should I be looking (ie other log files, etc, and possibly where to find them)?
- Does anyone have any idea why this isn't working?
- Does anyone have a work-around (including a completely manual process to get things working - I don't mind working in the CLI with virsh, etc)?
- What am I doing wrong?
Please, I'm really stumped with this, and I really do need help.
Cheers
Dulux-Oz
11 months, 1 week
how to renew expired ovirt node vdsm cert manually ?
by dhanaraj.ramesh@yahoo.com
Below are the steps to renew the expired vdsm cert of an oVirt node.
# To check whether the cert has expired
# openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -noout -dates
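To turn the date check into a yes/no answer, the "notAfter=" line that openssl prints with -enddate can be compared against the current time. The sketch below is an assumption-laden illustration: cert_is_valid is a hypothetical helper name, and it relies on GNU date (the -d option) to parse the timestamp.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) if the certificate is still valid, fail (exit 1) if expired.
# $1: a "notAfter=..." line as printed by `openssl x509 -noout -enddate`
# $2: reference time as a Unix epoch (optional, defaults to now)
cert_is_valid() {
    local not_after="${1#notAfter=}" now="${2:-$(date +%s)}" expiry
    # GNU date is assumed here; exit 2 if the timestamp cannot be parsed.
    expiry=$(date -u -d "$not_after" +%s) || return 2
    [ "$expiry" -gt "$now" ]
}

# Usage on a host (certificate path taken from the check above):
# line=$(openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -noout -enddate)
# cert_is_valid "$line" && echo "still valid" || echo "expired"
```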
1. Backup vdsm folder
# cd /etc/pki
# mv vdsm vdsm.orig
# mkdir vdsm ; chown vdsm:kvm vdsm
# cd vdsm
# mkdir libvirt-vnc certs keys libvirt-spice libvirt-migrate
# chown vdsm:kvm libvirt-vnc certs keys libvirt-spice libvirt-migrate
2. Regenerate cert & keys
# vdsm-tool configure --module certificates
3. Copy the cert to destination location
chmod 440 /etc/pki/vdsm/keys/vdsmkey.pem
chown root /etc/pki/vdsm/certs/*pem
chmod 644 /etc/pki/vdsm/certs/*pem
cp /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/libvirt-spice/ca-cert.pem
cp /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/libvirt-spice/server-key.pem
cp /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/libvirt-spice/server-cert.pem
cp /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/libvirt-vnc/ca-cert.pem
cp /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/libvirt-vnc/server-key.pem
cp /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/libvirt-vnc/server-cert.pem
cp -p /etc/pki/vdsm/certs/cacert.pem /etc/pki/vdsm/libvirt-migrate/ca-cert.pem
cp -p /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/vdsm/libvirt-migrate/server-key.pem
cp -p /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/vdsm/libvirt-migrate/server-cert.pem
chown root:qemu /etc/pki/vdsm/libvirt-migrate/server-key.pem
cp -p /etc/pki/vdsm.orig/keys/libvirt_password /etc/pki/vdsm/keys/
mv /etc/pki/libvirt/clientcert.pem /etc/pki/libvirt/clientcert.pem.orig
mv /etc/pki/libvirt/private/clientkey.pem /etc/pki/libvirt/private/clientkey.pem.orig
mv /etc/pki/CA/cacert.pem /etc/pki/CA/cacert.pem.orig
cp -p /etc/pki/vdsm/certs/vdsmcert.pem /etc/pki/libvirt/clientcert.pem
cp -p /etc/pki/vdsm/keys/vdsmkey.pem /etc/pki/libvirt/private/clientkey.pem
cp -p /etc/pki/vdsm/certs/cacert.pem /etc/pki/CA/cacert.pem
4. Cross-check the backup folder /etc/pki/vdsm.orig against /etc/pki/vdsm
# refer to /etc/pki/vdsm.orig/*/ and set the correct owner & group permission in /etc/pki/vdsm/*/
5. Restart services # Make sure both services are up
systemctl restart vdsmd libvirtd
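The nine copy commands for the libvirt-spice, libvirt-vnc and libvirt-migrate directories follow one pattern, so they can be sketched as a loop. This is an illustrative rewrite, not part of the original procedure: stage_libvirt_certs is a hypothetical name, the base directory is a parameter so the loop can be tried in a scratch directory first, and it uses cp -p for all three targets (the steps above use plain cp for spice and vnc).

```shell
#!/usr/bin/env bash
# Copy the regenerated CA cert, key and host cert into the three libvirt-*
# directories under the given base directory.
# $1: base directory (defaults to /etc/pki/vdsm)
stage_libvirt_certs() {
    local base="${1:-/etc/pki/vdsm}" d
    for d in libvirt-spice libvirt-vnc libvirt-migrate; do
        cp -p "$base/certs/cacert.pem"   "$base/$d/ca-cert.pem"     || return 1
        cp -p "$base/keys/vdsmkey.pem"   "$base/$d/server-key.pem"  || return 1
        cp -p "$base/certs/vdsmcert.pem" "$base/$d/server-cert.pem" || return 1
    done
}

# Usage on the node (run as root):
# stage_libvirt_certs /etc/pki/vdsm
```

The ownership and permission steps from the procedure (for example the chown root:qemu on the libvirt-migrate key) still have to be applied afterwards; the loop only covers the copies.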
11 months, 2 weeks
Unable to install oVirt on RHEL7.5
by SS00514758@techmahindra.com
Hi All,
I am unable to install oVirt on RHEL 7.5. I am following the guide at the link below:
https://www.ovirt.org/documentation/install-guide/chap-Installing_oVirt.html
However, it is not working for me: a couple of dependencies are not getting installed, and because of this I am not able to run ovirt-engine. Below are the dependency packages that fail to install:
Error: Package: collectd-write_http-5.8.0-6.1.el7.x86_64 (@ovirt-4.2-centos-opstools)
Requires: collectd(x86-64) = 5.8.0-6.1.el7
Removing: collectd-5.8.0-6.1.el7.x86_64 (@ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-6.1.el7
Updated By: collectd-5.8.1-1.el7.x86_64 (epel)
collectd(x86-64) = 5.8.1-1.el7
Available: collectd-5.7.2-1.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.7.2-1.el7
Available: collectd-5.7.2-3.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.7.2-3.el7
Available: collectd-5.8.0-2.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-2.el7
Available: collectd-5.8.0-3.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-3.el7
Available: collectd-5.8.0-5.el7.x86_64 (ovirt-4.2-centos-opstools)
collectd(x86-64) = 5.8.0-5.el7
Help me to install this.
Looking forward to resolving this issue.
Regards
Sumit Sahay
11 months, 3 weeks
Grafana - Origin Not Allowed
by Maton, Brett
oVirt 4.5.0.8-1.el8
I tried to connect to grafana via the monitoring portal link from the dash
and all panels are failing to display any data with varying error messages,
but all include 'Origin Not Allowed'
I navigated to Data Sources and ran a test on the PostgreSQL connection
(localhost) which threw the same Origin Not Allowed error message.
Any suggestions?
1 year
Re: Failed to synchronize networks of Provider ovirt-provider-ovn
by Mail SET Inc. Group
Yes, I used the same manual to change the WebUI SSL certificate.
ovirt-ca-file= points to the same SSL file that the WebUI uses.
Yes, I restarted ovirt-provider-ovn, I restarted the engine, I restarted everything I could restart. Nothing...
> On 12 Sep 2018, at 16:11, Dominik Holler <dholler(a)redhat.com> wrote:
>
> On Wed, 12 Sep 2018 14:23:54 +0300
> "Mail SET Inc. Group" <mail(a)set-pro.net> wrote:
>
>> Ok!
>
> Not exactly, please use users(a)ovirt.org for such questions.
> Others should benefit from these questions, too.
> Please write the next mail to users(a)ovirt.org and keep me in CC.
>
>> What i did:
>>
>> 1) install oVirt «from box» (4.2.5.2-1.el7);
>> 2) generate own ssl for my engine using my FreeIPA CA, Install it and
>
> What does "Install it" mean? You can use the doc from the following link
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.2/...
>
> Ensure that ovirt-ca-file= in
> /etc/ovirt-provider-ovn/conf.d/10-setup-ovirt-provider-ovn.conf
> points to the correct file and ovirt-provider-ovn is restarted.
>
>> get this issue;
>>
>>
>> [root@engine ~]# tail -n 50 /var/log/ovirt-provider-ovn.log
>> 2018-09-12 14:10:23,828 root [SSL: CERTIFICATE_VERIFY_FAILED]
>> certificate verify failed (_ssl.c:579) Traceback (most recent call
>> last): File "/usr/share/ovirt-provider-ovn/handlers/base_handler.py",
>> line 133, in _handle_request method, path_parts, content
>> File "/usr/share/ovirt-provider-ovn/handlers/selecting_handler.py",
>> line 175, in handle_request return
>> self.call_response_handler(handler, content, parameters) File
>> "/usr/share/ovirt-provider-ovn/handlers/keystone.py", line 33, in
>> call_response_handler return response_handler(content, parameters)
>> File "/usr/share/ovirt-provider-ovn/handlers/keystone_responses.py",
>> line 62, in post_tokens user_password=user_password) File
>> "/usr/share/ovirt-provider-ovn/auth/plugin_facade.py", line 26, in
>> create_token return auth.core.plugin.create_token(user_at_domain,
>> user_password) File
>> "/usr/share/ovirt-provider-ovn/auth/plugins/ovirt/plugin.py", line
>> 48, in create_token timeout=self._timeout()) File
>> "/usr/share/ovirt-provider-ovn/auth/plugins/ovirt/sso.py", line 75,
>> in create_token username, password, engine_url, ca_file, timeout)
>> File "/usr/share/ovirt-provider-ovn/auth/plugins/ovirt/sso.py", line
>> 91, in _get_sso_token timeout=timeout File
>> "/usr/share/ovirt-provider-ovn/auth/plugins/ovirt/sso.py", line 54,
>> in wrapper response = func(*args, **kwargs) File
>> "/usr/share/ovirt-provider-ovn/auth/plugins/ovirt/sso.py", line 47,
>> in wrapper raise BadGateway(e) BadGateway: [SSL:
>> CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)
>>
>>
>> [root@engine ~]# tail -n 20 /var/log/ovirt-engine/engine.log
>> 2018-09-12 14:10:23,773+03 INFO
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-47) [316db685] Lock
>> Acquired to object
>> 'EngineLock:{exclusiveLocks='[14e4fb72-9764-4757-b37d-4d487995571a=PROVIDER]',
>> sharedLocks=''}' 2018-09-12 14:10:23,778+03 INFO
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-47) [316db685]
>> Running command: SyncNetworkProviderCommand internal: true.
>> 2018-09-12 14:10:23,836+03 ERROR
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-47) [316db685]
>> Command
>> 'org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand'
>> failed: EngineException: (Failed with error Bad Gateway and code
>> 5050) 2018-09-12 14:10:23,837+03 INFO
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-47) [316db685] Lock
>> freed to object
>> 'EngineLock:{exclusiveLocks='[14e4fb72-9764-4757-b37d-4d487995571a=PROVIDER]',
>> sharedLocks=''}' 2018-09-12 14:14:12,477+03 INFO
>> [org.ovirt.engine.core.sso.utils.AuthenticationUtils] (default
>> task-6) [] User admin@internal successfully logged in with scopes:
>> ovirt-app-admin ovirt-app-api ovirt-app-portal
>> ovirt-ext=auth:sequence-priority=~ ovirt-ext=revoke:revoke-all
>> ovirt-ext=token-info:authz-search
>> ovirt-ext=token-info:public-authz-search
>> ovirt-ext=token-info:validate ovirt-ext=token:password-access
>> 2018-09-12 14:14:12,587+03 INFO
>> [org.ovirt.engine.core.bll.aaa.CreateUserSessionCommand] (default
>> task-6) [1bf1b763] Running command: CreateUserSessionCommand
>> internal: false. 2018-09-12 14:14:12,628+03 INFO
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (default task-6) [1bf1b763] EVENT_ID: USER_VDC_LOGIN(30), User
>> admin@internal-authz connecting from '10.0.3.61' using session
>> 's8jAm7BUJGlicthm6yZBA3CUM8QpRdtwFaK3M/IppfhB3fHFB9gmNf0cAlbl1xIhcJ2WX+ww7e71Ri+MxJSsIg=='
>> logged in. 2018-09-12 14:14:30,972+03 INFO
>> [org.ovirt.engine.core.bll.provider.ImportProviderCertificateCommand]
>> (default task-6) [ee3cc8a7-4485-4fdf-a0c2-e9d67b5cfcd3] Running
>> command: ImportProviderCertificateCommand internal: false. Entities
>> affected : ID: aaa00000-0000-0000-0000-123456789aaa Type:
>> SystemAction group CREATE_STORAGE_POOL with role type ADMIN
>> 2018-09-12 14:14:30,982+03 INFO
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (default task-6) [ee3cc8a7-4485-4fdf-a0c2-e9d67b5cfcd3] EVENT_ID:
>> PROVIDER_CERTIFICATE_IMPORTED(213), Certificate for provider
>> ovirt-provider-ovn was imported. (User: admin@internal-authz)
>> 2018-09-12 14:14:31,006+03 INFO
>> [org.ovirt.engine.core.bll.provider.TestProviderConnectivityCommand]
>> (default task-6) [a48d94ab-b0b2-42a2-a667-0525b4c652ea] Running
>> command: TestProviderConnectivityCommand internal: false. Entities
>> affected : ID: aaa00000-0000-0000-0000-123456789aaa Type:
>> SystemAction group CREATE_STORAGE_POOL with role type ADMIN
>> 2018-09-12 14:14:31,058+03 ERROR
>> [org.ovirt.engine.core.bll.provider.TestProviderConnectivityCommand]
>> (default task-6) [a48d94ab-b0b2-42a2-a667-0525b4c652ea] Command
>> 'org.ovirt.engine.core.bll.provider.TestProviderConnectivityCommand'
>> failed: EngineException: (Failed with error Bad Gateway and code
>> 5050) 2018-09-12 14:15:10,954+03 INFO
>> [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService]
>> (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread
>> pool 'default' is using 0 threads out of 1, 5 threads waiting for
>> tasks. 2018-09-12 14:15:10,954+03 INFO
>> [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService]
>> (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread
>> pool 'engine' is using 0 threads out of 500, 16 threads waiting for
>> tasks and 0 tasks in queue. 2018-09-12 14:15:10,954+03 INFO
>> [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService]
>> (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread
>> pool 'engineScheduled' is using 0 threads out of 100, 100 threads
>> waiting for tasks. 2018-09-12 14:15:10,954+03 INFO
>> [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService]
>> (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread
>> pool 'engineThreadMonitoring' is using 1 threads out of 1, 0 threads
>> waiting for tasks. 2018-09-12 14:15:10,954+03 INFO
>> [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService]
>> (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread
>> pool 'hostUpdatesChecker' is using 0 threads out of 5, 2 threads
>> waiting for tasks. 2018-09-12 14:15:23,843+03 INFO
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-61) [2455041f] Lock
>> Acquired to object
>> 'EngineLock:{exclusiveLocks='[14e4fb72-9764-4757-b37d-4d487995571a=PROVIDER]',
>> sharedLocks=''}' 2018-09-12 14:15:23,849+03 INFO
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-61) [2455041f]
>> Running command: SyncNetworkProviderCommand internal: true.
>> 2018-09-12 14:15:23,900+03 ERROR
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-61) [2455041f]
>> Command
>> 'org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand'
>> failed: EngineException: (Failed with error Bad Gateway and code
>> 5050) 2018-09-12 14:15:23,901+03 INFO
>> [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-61) [2455041f] Lock
>> freed to object
>> 'EngineLock:{exclusiveLocks='[14e4fb72-9764-4757-b37d-4d487995571a=PROVIDER]',
>> sharedLocks=''}'
>>
>>
>> [root@engine ~]#
>> cat /etc/ovirt-provider-ovn/conf.d/10-setup-ovirt-provider-ovn.conf #
>> This file is automatically generated by engine-setup. Please do not
>> edit manually [OVN REMOTE] ovn-remote=ssl:127.0.0.1:6641
>> [SSL]
>> https-enabled=true
>> ssl-cacert-file=/etc/pki/ovirt-engine/ca.pem
>> ssl-cert-file=/etc/pki/ovirt-engine/certs/ovirt-provider-ovn.cer
>> ssl-key-file=/etc/pki/ovirt-engine/keys/ovirt-provider-ovn.key.nopass
>> [OVIRT]
>> ovirt-sso-client-secret=Ms7Gw9qNT6IkXu7oA54tDmxaZDIukABV
>> ovirt-host=https://engine.set.local:443
>> ovirt-sso-client-id=ovirt-provider-ovn
>> ovirt-ca-file=/etc/pki/ovirt-engine/apache-ca.pem
>> [PROVIDER]
>> provider-host=engine.set.local
>>
>>
>>> On 12 Sep 2018, at 13:59, Dominik Holler <dholler(a)redhat.com>
>>> wrote:
>>>
>>> On Wed, 12 Sep 2018 13:04:53 +0300
>>> "Mail SET Inc. Group" <mail(a)set-pro.net> wrote:
>>>
>>>> Hello Dominik!
>>>> I have the same issue with the OVN provider and SSL:
>>>> https://www.mail-archive.com/users@ovirt.org/msg47020.html
>>>> But changing the certificate does not help to resolve it. Maybe you
>>>> can help me with this?
>>>
>>> Sure. Can you please share the relevant lines of
>>> ovirt-provider-ovn.log and engine.log, and the information if you
>>> are using the certificates generated by engine-setup with
>>> users(a)ovirt.org ? Thanks,
>>> Dominik
>>>
>>
>
>
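The advice in this thread is to make sure ovirt-ca-file= in 10-setup-ovirt-provider-ovn.conf points to a CA that actually validates the engine's HTTPS certificate. A rough way to check that by hand is sketched below. The helper name conf_value is hypothetical, and the commented openssl s_client call is only one possible check, not something the thread itself prescribes.

```shell
#!/usr/bin/env bash
# Print the value of a key=value line from an ini-style config file.
# $1: config file, $2: key name (section headers are ignored)
conf_value() {
    awk -F= -v key="$2" '$1 == key { sub(/^[^=]*=/, ""); print; exit }' "$1"
}

# Usage on the engine (config path from the thread above):
# conf=/etc/ovirt-provider-ovn/conf.d/10-setup-ovirt-provider-ovn.conf
# ca=$(conf_value "$conf" ovirt-ca-file)
# host=$(conf_value "$conf" ovirt-host)          # e.g. https://engine.set.local:443
# openssl s_client -connect "${host#https://}" -CAfile "$ca" </dev/null 2>/dev/null \
#     | grep 'Verify return code'
```

A verify return code other than 0 would suggest that the configured CA file does not match the certificate Apache is serving, which would be consistent with the CERTIFICATE_VERIFY_FAILED error in the log.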
1 year, 1 month
engine-setup on 4.3.2 -> 4.3.3 fails during Engine schema refresh
by Edward Berger
I was trying to upgrade a hyperconverged oVirt hosted engine and failed in
the engine-setup command with these errors and warnings.
...
[ INFO ] Creating/refreshing Engine database schema
[ ERROR ] schema.sh: FATAL: Cannot execute sql command:
--file=/usr/share/ovirt-engine/dbscripts/upgrade/04_03_0830_add_foreign_key_to_image_transfers.sql
[ ERROR ] Failed to execute stage 'Misc configuration': Engine schema
refresh failed
...
[ INFO ] Yum Verify: 16/16: ovirt-engine-tools.noarch 0:4.3.3.5-1.el7 - e
[WARNING] Rollback of DWH database postponed to Stage "Clean up"
[ INFO ] Rolling back database schema
...
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Execution of setup failed
Attaching engine-setup logfile.
1 year, 5 months
Unable to change the admin password on oVirt 4.5.2.5
by Ayansh Rocks
Hi All,
Any idea how to change the password of the admin user on oVirt 4.5.2.5?
The command below is not working -
[root@ovirt]# ovirt-aaa-jdbc-tool user password-reset admin
Picked up JAVA_TOOL_OPTIONS: -Dcom.redhat.fips=false
Password:
Reenter password:
updating user admin...
user updated successfully
[root@delhi-test-ovirtm-02 ~]#
The above reports success, but the password is not changed.
Thanks
1 year, 7 months
4.4.9 -> 4.4.10 Cannot start or migrate any VM (hotpluggable cpus requested exceeds the maximum cpus supported by KVM)
by Jillian Morgan
After upgrading the engine from 4.4.9 to 4.4.10, and then upgrading one
host, any attempt to migrate a VM to that host or start a VM on that host
results in the following error:
Number of hotpluggable cpus requested (16) exceeds the maximum cpus
supported by KVM (8)
While the version of qemu is the same across hosts, (
qemu-kvm-6.0.0-33.el8s.x86_64), I traced the difference to the upgraded
kernel on the new host. I have always run elrepo's kernel-ml on these hosts
to support bcache which RHEL's kernel doesn't support. The working hosts
still run kernel-ml-5.15.12. The upgraded host ran kernel-ml-5.17.0.
In case anyone else runs kernel-ml, have you run into this issue?
Does anyone know why KVM's KVM_CAP_MAX_VCPUS value is lowered on the new
kernel?
Does anyone know how to query the KVM capabilities from userspace without
writing a program leveraging kvm_ioctl()'s?
Related to this, it seems that ovirt and/or libvirtd always runs qemu-kvm
with an -smp argument of "maxcpus=16". This causes qemu's built-in check to
fail on the new kernel which is supporting max_vcpus of 8.
Why does ovirt always request maxcpus=16?
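To see which maxcpus value a given guest was actually started with, the -smp option can be pulled out of the running qemu command line. The sketch below is illustrative only: smp_maxcpus is a hypothetical helper, and the sample command line in the comment is made up for the example rather than copied from a real host.

```shell
#!/usr/bin/env bash
# Read a qemu command line on stdin and print the maxcpus value of its
# -smp option (prints nothing if no maxcpus is present).
smp_maxcpus() {
    sed -n 's/.*-smp [^ ]*maxcpus=\([0-9][0-9]*\).*/\1/p'
}

# Example with a made-up command line:
# echo 'qemu-kvm -name guest -smp 2,maxcpus=16,sockets=16,cores=1,threads=1' | smp_maxcpus
#
# Usage against a running guest (PID is a placeholder):
# tr '\0' ' ' < /proc/PID/cmdline | smp_maxcpus
```

Comparing this value between a working host and the upgraded host shows whether the request itself changed or only the limit reported by the kernel.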
And yes, before you say it, I know you're going to say that running
kernel-ml isn't supported.
--
Jillian Morgan (she/her) 🏳️‍⚧️
Systems & Networking Specialist
Primordial Software Group & I.T. Consultancy
https://www.primordial.ca
1 year, 8 months