oVirt: Migration recommendation to move from CentOS Stream 8 to 9
by dadamysus@proton.me
I have inherited the admin duty of maintaining a small oVirt cluster with 2 nodes running oVirt 4.5.4 on CentOS Stream 8 (installed from the official Node ISOs) on physical machines, including a hosted engine. Sadly, the existing documentation is a bit on the "light side" and the official oVirt documentation can be overwhelming at times.
As CentOS Stream 8 is no longer supported, we would like to migrate the nodes to either CentOS Stream 9 using the NG node ISOs or to Rocky Linux 9, but I guess using the official ISOs is the smoothest path. Still, I would like to know what the general steps are to migrate without setting up an entirely fresh cluster.
Is it possible to migrate by setting up a CentOS Stream 9 node in the existing cluster and deploying a new engine from there?
Is it possible to migrate VMs without downtime in this process?
Ideally, does anyone have some kind of step-by-step guide to get from CentOS Stream 8 to 9 using the oVirt Node ISOs?
Or is it possible to upgrade the existing nodes to CentOS Stream 9 and deploy a new hosted engine from there without creating an entirely new node?
And yeah, I have already searched this list for answers (https://lists.ovirt.org/archives/list/users@ovirt.org/thread/BE5BS4MJSJ4F...) but that thread lacks some of the mentioned details.
3 months, 2 weeks
Engine losing VM contact and forcing reboot
by Kalil de A. Carvalho
Hello all.
We are having a problem with some Ubuntu 22.04 VMs: the oVirt engine
loses contact with the VM and reboots it. What is strange is that it
happens only to one group of machines.
We receive this message when one of the VMs is rebooted:
VM SRV-01 was configured with 102092MiB of memory while the recommended
value range is 64MiB - 65536MiB
In our cluster we have enough memory per host, but we also found this
other message:
failed to set up stack guard page: Cannot allocate memory
We have already set the memory size and the guaranteed memory to the same value.
We looked through the logs, but nothing gave us an idea of what was
going on.
The only thing we can think of is that the VMs are using UEFI, but we do not
know whether that can cause this situation.
Has any of you already run into this situation?
Kalil de A. Carvalho
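As a side note on the first message, the arithmetic behind the warning can be checked in a few lines. This is only a sketch: the two numbers are copied from the message quoted in the post, and the variable names are made up for illustration.

```python
# Values copied from the engine warning quoted in the post (both in MiB).
configured_mib = 102092
recommended_max_mib = 65536

# How far the VM is above the upper end of the recommended range.
excess_mib = configured_mib - recommended_max_mib

print(f"configured:  {configured_mib / 1024:.1f} GiB")
print(f"recommended: {recommended_max_mib / 1024:.1f} GiB max")
print(f"excess:      {excess_mib} MiB")
```

So the VM is configured with roughly 99.7 GiB against a recommended maximum of 64 GiB. Whether this is related to the reboots is not established here.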
3 months, 2 weeks
SPM and Task error ...
by Enrico
Hi all,
my oVirt cluster has 3 hypervisors running CentOS 7.5.1804 vdsm is,
ovirt engine is, the storage systems are HP MSA P2000 and
2050 (fibre channel).
I need to stop one of the hypervisors for maintenance, but this system is
the Storage Pool Manager (SPM).
For this reason I decided to manually activate SPM on one of the other
nodes, but this operation is not successful.
In the oVirt engine log (engine.log) the error is this:
2019-07-25 12:39:16,744+02 INFO
[org.ovirt.engine.core.bll.storage.pool.ForceSelectSPMCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] Running command:
ForceSelectSPMCommand internal: false. Entities affected : ID:
81c9bd3c-ae0a-467f-bf7f-63ab30cd8d9e Type: VDSAction group
2019-07-25 12:39:16,745+02 INFO
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START,
ignoreFailoverLimit='false'}), log id: 37bf4639
2019-07-25 12:39:16,747+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START,
ignoreStopFailed='false'}), log id: 2522686f
2019-07-25 12:39:16,749+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START,
SpmStopVDSCommand(HostName = infn-vm05.management,
storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7'}), log id: 1810fd8b
2019-07-25 12:39:16,758+02 *ERROR*
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] SpmStopVDSCommand::Not
stopping SPM on vds 'infn-vm05.management', pool id
'18d57688-6ed4-43b8-bd7c-0665b55950b7' as there are uncleared tasks
'Task 'fdcf4d1b-82fe-49a6-b233-323ebe568f8e', status 'running''
2019-07-25 12:39:16,758+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH,
SpmStopVDSCommand, log id: 1810fd8b
2019-07-25 12:39:16,758+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH,
ResetIrsVDSCommand, log id: 2522686f
2019-07-25 12:39:16,758+02 INFO
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH,
SpmStopOnIrsVDSCommand, log id: 37bf4639
2019-07-25 12:39:16,760+02 *ERROR*
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] EVENT_ID:
USER_FORCE_SELECTED_SPM_STOP_FAILED(4,096), Failed to force select
infn-vm07.management as the SPM due to a failure to stop the current SPM.
while in vdsm.log on the hypervisor (SPM):
2019-07-25 12:39:18,660+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,660+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,750+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,750+02 *ERROR*
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
BaseAsyncTask::logEndTaskFailure: Task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' (Parent Command 'Unknown',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
with failure:
2019-07-25 12:39:18,750+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: Attempting to clear task
2019-07-25 12:39:18,751+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 34ae2b2f
2019-07-25 12:39:18,752+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
HSMClearTaskVDSCommand(HostName = infn-vm05.management,
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: d3a78ad
2019-07-25 12:39:18,757+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
HSMClearTaskVDSCommand, log id: d3a78ad
2019-07-25 12:39:18,757+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
SPMClearTaskVDSCommand, log id: 34ae2b2f
2019-07-25 12:39:18,757+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: At time of attempt to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' the response code was
'TaskStateError' and message was 'Operation is not allowed in this task
state: ("can't clean in state running",)'. Task will not be cleaned
2019-07-25 12:39:18,757+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
BaseAsyncTask::onTaskEndSuccess: Task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' (Parent Command 'Unknown',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
2019-07-25 12:39:18,757+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: Attempting to clear task
2019-07-25 12:39:18,758+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 42de0c2b
2019-07-25 12:39:18,759+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
HSMClearTaskVDSCommand(HostName = infn-vm05.management,
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 4895c79c
2019-07-25 12:39:18,764+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
HSMClearTaskVDSCommand, log id: 4895c79c
2019-07-25 12:39:18,764+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
SPMClearTaskVDSCommand, log id: 42de0c2b
2019-07-25 12:39:18,764+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: At time of attempt to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' the response code was
'TaskStateError' and message was 'Operation is not allowed in this task
state: ("can't clean in state running",)'. Task will not be cleaned
2019-07-25 12:39:18,764+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,764+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Cleaning zombie
tasks: Clearing async task 'Unknown' that started at 'Fri May 03
14:48:50 CEST 2019'
2019-07-25 12:39:18,764+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: Attempting to clear task
2019-07-25 12:39:18,765+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: da77af2
2019-07-25 12:39:18,766+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
HSMClearTaskVDSCommand(HostName = infn-vm05.management,
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 530694fb
2019-07-25 12:39:18,771+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
HSMClearTaskVDSCommand, log id: 530694fb
2019-07-25 12:39:18,771+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
SPMClearTaskVDSCommand, log id: da77af2
2019-07-25 12:39:18,771+02 INFO
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: At time of attempt to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' the response code was
'TaskStateError' and message was 'Operation is not allowed in this task
state: ("can't clean in state running",)'. Task will not be cleaned
There seems to be some relation between this error and a task that has remained
hanging. From the SPM server:
# vdsm-client Task getInfo taskID=fdcf4d1b-82fe-49a6-b233-323ebe568f8e
"verb": "prepareMerge",
"id": "fdcf4d1b-82fe-49a6-b233-323ebe568f8e"
# vdsm-client Task getStatus taskID=fdcf4d1b-82fe-49a6-b233-323ebe568f8e
"message": "running job 1 of 1",
"code": 0,
"taskID": "fdcf4d1b-82fe-49a6-b233-323ebe568f8e",
"taskResult": "",
"taskState": "running"
How can I solve this problem?
Thanks a lot for your help!
Best Regards
Enrico Becchetti Servizio di Calcolo e Reti
Istituto Nazionale di Fisica Nucleare - Sezione di Perugia
Via Pascoli,c/o Dipartimento di Fisica 06123 Perugia (ITALY)
Phone:+39 075 5852777 Mail: Enrico.Becchetti<at>pg.infn.it
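For readers hitting the same symptom, here is a minimal sketch (not an official oVirt tool) that pulls the blocking task IDs out of engine.log text, so they can be fed to the same `vdsm-client Task getStatus` call used in the post. The regular expression is an assumption based only on the single "uncleared tasks" line quoted in this post, and the function name `uncleared_tasks` is hypothetical.

```python
import re

# Matches the fragment "Task '<uuid>', status '<state>'" from the
# SpmStopVDSCommand error in the post. Assumption: derived from that one
# excerpt, not from a documented engine.log format.
TASK_RE = re.compile(
    r"Task\s+'(?P<task_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})',"
    r"\s+status\s+'(?P<status>\w+)'"
)


def uncleared_tasks(log_text):
    """Return {task_id: status} for every task named in the log text."""
    # Entries are hard-wrapped in the mail, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return {m.group("task_id"): m.group("status") for m in TASK_RE.finditer(flat)}


# Sample copied from the engine.log excerpt in the post.
sample = """
2019-07-25 12:39:16,758+02 *ERROR*
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] SpmStopVDSCommand::Not
stopping SPM on vds 'infn-vm05.management', pool id
'18d57688-6ed4-43b8-bd7c-0665b55950b7' as there are uncleared tasks
'Task 'fdcf4d1b-82fe-49a6-b233-323ebe568f8e', status 'running''
"""

tasks = uncleared_tasks(sample)
for task_id, status in tasks.items():
    # Only prints the command to run on the SPM host; nothing is executed.
    print(f"vdsm-client Task getStatus taskID={task_id}  # engine saw: {status}")
```

The sketch only saves copying UUIDs by hand; it does not stop or clear the task, and how to do that safely is the open question of this thread.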
3 months, 2 weeks
dmi identify
by Fabrice Bacchella
I’m in the process of switching from oVirt 4.3 to 4.5.
VMs running on old hosts are identified as:
$ sudo dmidecode -t 1
# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.
Handle 0x0100, DMI type 1, 27 bytes
System Information
Manufacturer: oVirt
Product Name: oVirt Node
Version: 7-9.2009.1.el7.centos
Serial Number: 31333138-3839-5a43-3238-3438304b4646
UUID: 2a9280d3-ba1c-438e-bd49-186cd470a188
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Red Hat Enterprise Linux
But when running on new hosts, they are identified as:
$ sudo dmidecode -t 1
# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.
Handle 0x0100, DMI type 1, 27 bytes
System Information
Manufacturer: oVirt
Product Name: RHEL
Version: 9.4-1.7.el9
Serial Number: 30373237-3132-5a43-3235-343233333937
UUID: 92159a86-f002-404e-aca1-5c40290d48bb
Wake-up Type: Power Switch
SKU Number: Not Specified
Family: Red Hat Enterprise Linux
Is that done on purpose? I think that’s quite annoying, as I have a lot of tools that check for the value 'oVirt Node'. Is that configurable?
Product Name being RHEL is not very useful information, as it’s not expected to run on anything else.
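One possible workaround on the tooling side, sketched under the assumption that the Manufacturer field stays "oVirt" (it does in both outputs quoted in this post, but nothing here guarantees it for other versions): match on Manufacturer instead of Product Name. The helper names `parse_system_info` and `is_ovirt_guest` are hypothetical, and the sample strings are shortened copies of the two outputs.

```python
def parse_system_info(dmidecode_output):
    """Parse 'Key: Value' lines from `dmidecode -t 1` output into a dict."""
    info = {}
    for line in dmidecode_output.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip banner lines without a colon and keys with an empty value.
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def is_ovirt_guest(info):
    # Assumption: Manufacturer is "oVirt" on old and new hosts alike,
    # as in both outputs quoted in the post; only Product Name changed.
    return info.get("Manufacturer") == "oVirt"


OLD_HOST = """\
Handle 0x0100, DMI type 1, 27 bytes
System Information
    Manufacturer: oVirt
    Product Name: oVirt Node
    Version: 7-9.2009.1.el7.centos
"""

NEW_HOST = """\
Handle 0x0100, DMI type 1, 27 bytes
System Information
    Manufacturer: oVirt
    Product Name: RHEL
    Version: 9.4-1.7.el9
"""

for name, text in (("old", OLD_HOST), ("new", NEW_HOST)):
    info = parse_system_info(text)
    print(name, info["Product Name"], is_ovirt_guest(info))
```

A check on Manufacturer would match VMs on old and new hosts alike. Whether Product Name itself is configurable is a separate question that this sketch does not answer.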
3 months, 3 weeks
Host has status "NonResponse"
by p0ntorez1g@gmail.com
Hi all, one of my hosts got a “NonResponsive” status while a virtual machine was being migrated to it. In fact, there is no VM on the host. I have reinstalled the host completely, but I can't remove it from the engine. Can you tell me if it is possible to remove the record of this host from the engine and then reconnect the clean host?
Thank you very much
3 months, 3 weeks