SPM and Task error ...
by Enrico
Hi all,
my oVirt cluster has three hypervisors running CentOS 7.5.1804 with vdsm
4.20.39.1-1.el7; the oVirt engine is 4.2.4.5-1.el7, and the storage
systems are HP MSA P2000 and 2050 (Fibre Channel).
I need to stop one of the hypervisors for maintenance, but this system is
the Storage Pool Manager (SPM).
For this reason I decided to manually move the SPM role to one of the
other nodes, but this operation is not successful.
In the engine log (engine.log) the error is this:
2019-07-25 12:39:16,744+02 INFO
[org.ovirt.engine.core.bll.storage.pool.ForceSelectSPMCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] Running command:
ForceSelectSPMCommand internal: false. Entities affected : ID:
81c9bd3c-ae0a-467f-bf7f-63ab30cd8d9e Type: VDSAction group
MANIPULATE_HOST with role type ADMIN
2019-07-25 12:39:16,745+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand]
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START,
SpmStopOnIrsVDSCommand(
SpmStopOnIrsVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7',
ignoreFailoverLimit='false'}), log id: 37bf4639
2019-07-25 12:39:16,747+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START,
ResetIrsVDSCommand(
ResetIrsVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7',
ignoreFailoverLimit='false',
vdsId='751f3e99-b95e-4c31-bc38-77f5661a0bdc',
ignoreStopFailed='false'}), log id: 2522686f
2019-07-25 12:39:16,749+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START,
SpmStopVDSCommand(HostName = infn-vm05.management,
SpmStopVDSCommandParameters:{hostId='751f3e99-b95e-4c31-bc38-77f5661a0bdc',
storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7'}), log id: 1810fd8b
2019-07-25 12:39:16,758+02 *ERROR*
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] SpmStopVDSCommand::Not
stopping SPM on vds 'infn-vm05.management', pool id
'18d57688-6ed4-43b8-bd7c-0665b55950b7' as there are uncleared tasks
'Task 'fdcf4d1b-82fe-49a6-b233-323ebe568f8e', status 'running''
2019-07-25 12:39:16,758+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH,
SpmStopVDSCommand, log id: 1810fd8b
2019-07-25 12:39:16,758+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH,
ResetIrsVDSCommand, log id: 2522686f
2019-07-25 12:39:16,758+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand]
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH,
SpmStopOnIrsVDSCommand, log id: 37bf4639
2019-07-25 12:39:16,760+02 *ERROR*
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] EVENT_ID:
USER_FORCE_SELECTED_SPM_STOP_FAILED(4,096), Failed to force select
infn-vm07.management as the SPM due to a failure to stop the current SPM.
The engine.log then continues with repeated attempts to poll and clear the stuck task:
2019-07-25 12:39:18,660+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,660+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,750+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,750+02 *ERROR*
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
BaseAsyncTask::logEndTaskFailure: Task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' (Parent Command 'Unknown',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
with failure:
2019-07-25 12:39:18,750+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: Attempting to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e'
2019-07-25 12:39:18,751+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
SPMClearTaskVDSCommand(
SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7',
ignoreFailoverLimit='false',
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 34ae2b2f
2019-07-25 12:39:18,752+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
HSMClearTaskVDSCommand(HostName = infn-vm05.management,
HSMTaskGuidBaseVDSCommandParameters:{hostId='751f3e99-b95e-4c31-bc38-77f5661a0bdc',
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: d3a78ad
2019-07-25 12:39:18,757+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
HSMClearTaskVDSCommand, log id: d3a78ad
2019-07-25 12:39:18,757+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
SPMClearTaskVDSCommand, log id: 34ae2b2f
2019-07-25 12:39:18,757+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: At time of attempt to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' the response code was
'TaskStateError' and message was 'Operation is not allowed in this task
state: ("can't clean in state running",)'. Task will not be cleaned
2019-07-25 12:39:18,757+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
BaseAsyncTask::onTaskEndSuccess: Task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' (Parent Command 'Unknown',
Parameters Type
'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended
successfully.
2019-07-25 12:39:18,757+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: Attempting to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e'
2019-07-25 12:39:18,758+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
SPMClearTaskVDSCommand(
SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7',
ignoreFailoverLimit='false',
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 42de0c2b
2019-07-25 12:39:18,759+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
HSMClearTaskVDSCommand(HostName = infn-vm05.management,
HSMTaskGuidBaseVDSCommandParameters:{hostId='751f3e99-b95e-4c31-bc38-77f5661a0bdc',
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 4895c79c
2019-07-25 12:39:18,764+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
HSMClearTaskVDSCommand, log id: 4895c79c
2019-07-25 12:39:18,764+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
SPMClearTaskVDSCommand, log id: 42de0c2b
2019-07-25 12:39:18,764+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: At time of attempt to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' the response code was
'TaskStateError' and message was 'Operation is not allowed in this task
state: ("can't clean in state running",)'. Task will not be cleaned
2019-07-25 12:39:18,764+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Task id
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' has passed pre-polling period
time and should be polled. Pre-polling period is 60000 millis.
2019-07-25 12:39:18,764+02 INFO
[org.ovirt.engine.core.bll.tasks.AsyncTaskManager]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] Cleaning zombie
tasks: Clearing async task 'Unknown' that started at 'Fri May 03
14:48:50 CEST 2019'
2019-07-25 12:39:18,764+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: Attempting to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e'
2019-07-25 12:39:18,765+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
SPMClearTaskVDSCommand(
SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7',
ignoreFailoverLimit='false',
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: da77af2
2019-07-25 12:39:18,766+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] START,
HSMClearTaskVDSCommand(HostName = infn-vm05.management,
HSMTaskGuidBaseVDSCommandParameters:{hostId='751f3e99-b95e-4c31-bc38-77f5661a0bdc',
taskId='fdcf4d1b-82fe-49a6-b233-323ebe568f8e'}), log id: 530694fb
2019-07-25 12:39:18,771+02 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
HSMClearTaskVDSCommand, log id: 530694fb
2019-07-25 12:39:18,771+02 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.SPMClearTaskVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) [] FINISH,
SPMClearTaskVDSCommand, log id: da77af2
2019-07-25 12:39:18,771+02 INFO
[org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
(EE-ManagedThreadFactory-engineScheduled-Thread-67) []
SPMAsyncTask::ClearAsyncTask: At time of attempt to clear task
'fdcf4d1b-82fe-49a6-b233-323ebe568f8e' the response code was
'TaskStateError' and message was 'Operation is not allowed in this task
state: ("can't clean in state running",)'. Task will not be cleaned
There is some relation between this error and a task that has remained
stuck. From the SPM server:
# vdsm-client Task getInfo taskID=fdcf4d1b-82fe-49a6-b233-323ebe568f8e
{
    "verb": "prepareMerge",
    "id": "fdcf4d1b-82fe-49a6-b233-323ebe568f8e"
}
# vdsm-client Task getStatus taskID=fdcf4d1b-82fe-49a6-b233-323ebe568f8e
{
    "message": "running job 1 of 1",
    "code": 0,
    "taskID": "fdcf4d1b-82fe-49a6-b233-323ebe568f8e",
    "taskResult": "",
    "taskState": "running"
}
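Not an answer from the thread, but the way a stuck task like this is usually cleared is to stop it and then clear it on the SPM host with vdsm-client, if your vdsm version exposes those verbs. A hedged sketch (verify first that the task really is orphaned, since stopping a live merge can leave a snapshot chain to clean up):

```shell
# On the SPM host. Assumes the task is truly orphaned, as the
# engine's "Cleaning zombie tasks" message suggests.
TASK=fdcf4d1b-82fe-49a6-b233-323ebe568f8e

# Ask vdsm to stop the running task, then re-check its state.
vdsm-client Task stop taskID=$TASK
vdsm-client Task getStatus taskID=$TASK

# Once taskState is no longer "running", the task can be cleared,
# and the engine's periodic cleanup should stop failing.
vdsm-client Task clear taskID=$TASK
```

If the task refuses to stop, restarting vdsmd on the SPM host (with the pool otherwise quiet) is the more drastic fallback.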
How can I solve this problem?
Thanks a lot for your help!
Best Regards
Enrico
--
_______________________________________________________________________
Enrico Becchetti Servizio di Calcolo e Reti
Istituto Nazionale di Fisica Nucleare - Sezione di Perugia
Via Pascoli,c/o Dipartimento di Fisica 06123 Perugia (ITALY)
Phone:+39 075 5852777 Mail: Enrico.Becchetti<at>pg.infn.it
_______________________________________________________________________
1 week, 3 days
oVirt networks
by Enrico Becchetti
Dear all,
I need your help to understand how to configure the network of a new
oVirt cluster.
My new system will have a 4.3 engine that runs in a virtual machine, and
some Dell R7525 AMD EPYC hypervisors, each holding two 4-port PCI network
cards.
These servers will run the oVirt Node image, again in version 4.3.
As for the network, there are two HPE Aruba 2540G switches, non-stackable,
with 24 1 Gb/s ports and two 10 Gb/s uplinks to the star center.
This is a simplified scheme:
My goal is to make the most of the servers' 8 Ethernet interfaces, to have
both reliability and the maximum possible throughput.
This cluster will have two virtual networks, one for oVirt management and
one for the traffic of the individual virtual machines.
With that said, here is my idea: I would like to have two 4 Gb/s
aggregated links, one for ovirtmgmt and the other for vmnet.
With the oVirt web interface I can create an active-passive "Mode 1"
bond, but this won't allow me to go beyond 1 Gb/s. Alternatively I could
create a "Mode 4" 802.3ad bond, but unfortunately the switches are not
stacked, so this solution does not apply either.
This is an example with active passive configuration:
Can you tell me if oVirt can create nested bonds? Or do you have other
solutions?
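For reference (not from the thread): Linux bonding does not allow enslaving one bond inside another, so nested bonds are not an option, and as far as I know oVirt restricts VM networks to bond modes 1-4, which also rules out the switch-independent balance-tlb/alb modes. With non-stackable switches the usual compromise is one LACP bond per switch, trading switch redundancy for throughput. A sketch of the equivalent host-side configuration with nmcli, assuming hypothetical interface names eno1/eno2 (in practice oVirt would build this from the Setup Host Networks dialog):

```shell
# Mode 4 (802.3ad) bond with both member ports cabled to the SAME
# switch, since LACP cannot span two non-stacked switches.
nmcli con add type bond con-name bond-vmnet ifname bond1 \
    bond.options "mode=802.3ad,miimon=100,lacp_rate=fast"

# Enslave the two 1 Gb/s ports (interface names are an assumption).
nmcli con add type ethernet con-name bond1-port1 ifname eno1 master bond1
nmcli con add type ethernet con-name bond1-port2 ifname eno2 master bond1

nmcli con up bond-vmnet
```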
Thanks a lot !
Best Regards
Enrico
--
_______________________________________________________________________
Enrico Becchetti Servizio di Calcolo e Reti
Istituto Nazionale di Fisica Nucleare - Sezione di Perugia
Via Pascoli,c/o Dipartimento di Fisica 06123 Perugia (ITALY)
Phone:+39 075 5852777 Skype:enrico_becchetti
Mail: Enrico.Becchetti<at>pg.infn.it
Pagina web personale: https://www.pg.infn.it/home/enrico-becchetti/
______________________________________________________________________
3 months, 1 week
boot from cdrom & error code 0005
by edp@maddalena.it
Hi.
I have created a new storage domain (a data domain, storage type NFS) to use for uploading ISO images.
I then uploaded a new ISO and attached it to a new VM.
But when I try to boot the VM I get this error:
booting from dvd/cd...
boot failed: could not read from cdrom (code 0005)
no bootable device
The ISO file was uploaded successfully to the data storage domain, and the VM lets me attach the ISO in the boot settings.
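Not part of the original post, but a common cause of code 0005 is an ISO that arrived truncated on the storage domain (an interrupted imageio upload) or NFS permissions that block the vdsm user. A quick check on the NFS server, with a hypothetical export path:

```shell
# Hypothetical NFS export path; adjust to your data domain.
EXPORT=/exports/data

# Disk images live under <storage-domain-uuid>/images/<disk-uuid>/.
# The file size should match the original ISO, and ownership must be
# vdsm:kvm (uid/gid 36:36) so the hosts can read it.
find "$EXPORT" -type f -size +1M -exec ls -ln {} \;

# Then compare checksums with the source ISO:
#   sha256sum original.iso
#   sha256sum "$EXPORT"/<domain-uuid>/images/<disk-uuid>/<image>
```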
Can you help me?
Thank you
3 months, 2 weeks
VM Migration Failed
by KSNull Zero
Running oVirt 4.4.5
VM cannot migrate between hosts.
vdsm.log contains the following error:
libvirt.libvirtError: operation failed: Failed to connect to remote libvirt URI qemu+tls://ovhost01.local/system: authentication failed: Failed to verify peer's certificate
Certificates on the hosts were renewed some time ago. How can this issue be fixed?
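A few checks that usually pinpoint this kind of failure (not from the post itself; the paths are the usual oVirt host PKI locations): confirm each host's vdsm/libvirt certificates are still valid and chain to the same engine CA, and exercise the libvirt TLS port directly:

```shell
# Run on each host involved in the migration.

# Expiry dates of the vdsm and libvirt server certificates.
openssl x509 -in /etc/pki/vdsm/certs/vdsmcert.pem -noout -enddate -subject
openssl x509 -in /etc/pki/libvirt/servercert.pem -noout -enddate -subject

# Do the certs still chain to the CA the peer hosts trust?
openssl verify -CAfile /etc/pki/vdsm/certs/cacert.pem \
    /etc/pki/libvirt/servercert.pem

# Talk to the libvirt TLS port (16514) the way migration does.
openssl s_client -connect ovhost01.local:16514 \
    -CAfile /etc/pki/vdsm/certs/cacert.pem </dev/null
```

If a host still carries an old certificate, re-enrolling it (put it in Maintenance, then use Installation > Enroll Certificate in the UI) regenerates the host-side PKI.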
Thank you.
4 months, 3 weeks
How to re-enroll (or renew) host certificates for a single-host hosted-engine deployment?
by Derek Atkins
Hi,
I've got a single-host hosted-engine deployment that I originally
installed with 4.0 and have upgraded over the years to 4.3.10. I and some
of my users have upgraded remote-viewer and now I get an error when I try
to view the console of my VMs:
(remote-viewer:8252): Spice-WARNING **: 11:30:41.806:
../subprojects/spice-common/common/ssl_verify.c:477:openssl_verify: Error
in server certificate verification: CA signature digest algorithm too weak
(num=68:depth0:/O=<My Org Name>/CN=<Host's Name>)
I am 99.99% sure this is because the old certs use SHA1.
I reran engine-setup on the engine and it asked me if I wanted to renew
the PKI, and I answered yes. This replaced many[1] of the certificates in
/etc/pki/ovirt-engine/certs on the engine, but it did not update the
Host's certificate.
All the documentation I've seen says that to refresh this certificate I
need to put the host into maintenance mode and then re-enroll. However I
cannot do that, because this is a single-host system so I cannot put the
host in local mode -- there is no place to migrate the VMs (let alone the
Engine VM).
So.... Is there a command-line way to re-enroll manually and update the
host certs? Or some other way to get all the leftover certs renewed?
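To confirm the SHA1 theory before touching anything, the signature algorithm of each certificate can be inspected directly; a diagnostic sketch using the standard oVirt PKI directories (exact filenames may differ on your setup):

```shell
# Which signature algorithm does each certificate actually use?
# SHA1 entries ("sha1WithRSAEncryption") are the ones remote-viewer
# now rejects.
for c in /etc/pki/ovirt-engine/certs/*.cer \
         /etc/pki/ovirt-vmconsole/*.pem; do
    printf '%s: ' "$c"
    openssl x509 -in "$c" -noout -text 2>/dev/null \
        | grep -m1 'Signature Algorithm'
done
```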
Thanks,
-derek
[1] Not only did it not update the Host's cert, it did not update any of
the vmconsole-proxy certs, nor the certs in /etc/pki/ovirt-vmconsole/, and
obviously nothing in /etc/pki/ on the host itself.
--
Derek Atkins 617-623-3745
derek(a)ihtfp.com www.ihtfp.com
Computer and Internet Security Consultant
5 months, 1 week
Changing disk QoS causes segfault with IO-Threads enabled (oVirt 4.3.0.4-1.el7)
by jloh@squiz.net
We recently upgraded to 4.3.0 and have found that changing disk QoS settings on VMs while IO-Threads is enabled causes them to segfault and the VM to reboot. We've been able to replicate this across several VMs. VMs with IO-Threads disabled do not segfault when changing the QoS.
Mar 1 11:49:06 srvXX kernel: IO iothread1[30468]: segfault at fffffffffffffff8 ip 0000557649f2bd24 sp 00007f80de832f60 error 5 in qemu-kvm[5576498dd000+a03000]
Mar 1 11:49:06 srvXX abrt-hook-ccpp: invalid number 'iothread1'
Mar 1 11:49:11 srvXX libvirtd: 2019-03-01 00:49:11.116+0000: 13365: error : qemuMonitorIORead:609 : Unable to read from monitor: Connection reset by peer
Happy to supply more logs if they'll help, but I'm just wondering whether anyone else has experienced this or knows of a current fix other than turning IO-Threads off.
Cheers.
7 months, 2 weeks
Deploy oVirt Engine fail behind proxy
by Matteo Bonardi
Hi,
I am trying to deploy the oVirt engine following the self-hosted engine installation procedure in the documentation.
The deployment servers are behind a proxy, and I have set it in the environment and in yum.conf before running the deploy.
The deploy fails because the oVirt engine VM cannot resolve the AppStream repository URL:
[ INFO ] TASK [ovirt.engine-setup : Install oVirt Engine package]
[ ERROR ] fatal: [localhost -> ovirt-manager.mydomain]: FAILED! => {"changed": false, "msg": "Failed to download metadata for repo 'AppStream': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=... [Could not resolve host: mirrorlist.centos.org]", "rc": 1, "results": []}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
[ INFO ] Stage: Clean up
[ INFO ] Cleaning temporary resources
[ INFO ] TASK [ovirt.hosted_engine_setup : Execute just a specific set of steps]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Force facts gathering]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Fetch logs from the engine VM]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Set destination directory path]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Create destination directory]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Find the local appliance image]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Set local_vm_disk_path]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Give the vm time to flush dirty buffers]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Copy engine logs]
[ INFO ] TASK [ovirt.hosted_engine_setup : include_tasks]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Remove local vm dir]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Remove temporary entry in /etc/hosts for the local VM]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Clean local storage pools]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Destroy local storage-pool {{ he_local_vm_dir | basename }}]
[ INFO ] TASK [ovirt.hosted_engine_setup : Undefine local storage-pool {{ he_local_vm_dir | basename }}]
[ INFO ] TASK [ovirt.hosted_engine_setup : Destroy local storage-pool {{ local_vm_disk_path.split('/')[5] }}]
[ INFO ] TASK [ovirt.hosted_engine_setup : Undefine local storage-pool {{ local_vm_disk_path.split('/')[5] }}]
[ INFO ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20201109165237.conf'
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Hosted Engine deployment failed: please check the logs for the issue, fix accordingly or re-deploy from scratch.
Log file is located at /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20201109164244-b3e8sd.log
How can I set the proxy for the engine VM?
Ovirt version:
[root@myhost ~]# rpm -qa | grep ovirt-engine-appliance
ovirt-engine-appliance-4.4-20200916125954.1.el8.x86_64
[root@myhost ~]# rpm -qa | grep ovirt-hosted-engine-setup
ovirt-hosted-engine-setup-2.4.6-1.el8.noarch
OS version:
[root@myhost ~]# cat /etc/centos-release
CentOS Linux release 8.2.2004 (Core)
[root@myhost ~]# uname -a
Linux myhost.mydomain 4.18.0-193.28.1.el8_2.x86_64 #1 SMP Thu Oct 22 00:20:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Thanks for the help.
Regards,
Matteo
8 months, 2 weeks
The oVirt Counter
by Sandro Bonazzola
Hi, for those who remember the Linux Counter project, if you'd like other
to know you're using oVirt and know some details about your deployment,
here's a way to count you in:
https://ovirt.org/community/ovirt-counter.html
Enjoy!
--
Sandro Bonazzola
MANAGER, SOFTWARE ENGINEERING, EMEA R&D PERFORMANCE & SCALE
Red Hat EMEA <https://www.redhat.com/>
sbonazzo(a)redhat.com
<https://www.redhat.com/>
*Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.*
9 months, 2 weeks
Cannot restart ovirt after massive failure.
by Gilboa Davara
Hello all,
During the night, one of my (smaller) setups, a single-node self-hosted
engine (localhost NFS), crashed due to what looks like a massive disk
failure (software RAID6 with 10 drives + a spare).
After a reboot, I let the RAID resync with a fresh drive and went on to
start oVirt.
However, no such luck.
Two issues:
1. ovirt-ha-broker fails due to broken hosted engine state (log attached).
2. ovirt-ha-agent fails due to network test (tcp) even though both
remote-host and DNS servers are active. (log attached).
Two questions:
1. Can I somehow force the agent to disable the network liveness test?
2. Can I somehow force the broker to rebuild / fix the hosted engine state?
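For what it's worth, hosted-engine ships two recovery commands aimed at exactly this kind of broken shared state. A hedged sketch, to be attempted only once the underlying storage is known-good and with the HA services stopped:

```shell
# Stop the HA services first.
systemctl stop ovirt-ha-agent ovirt-ha-broker

# Rebuild the sanlock lockspace on the hosted-engine storage domain.
hosted-engine --reinitialize-lockspace --force

# If the shared metadata itself is corrupt, wipe it so the agent
# can rewrite it from scratch (single-host setup, so this is safe
# to run on the only host):
hosted-engine --clean-metadata --force-cleanup

# Then restart the services and watch the state converge.
systemctl start ovirt-ha-agent ovirt-ha-broker
hosted-engine --vm-status
```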
- Gilboa
10 months, 2 weeks
Please, Please Help - New oVirt Install/Deployment Failing - "Host is not up..."
by Matthew J Black
Hi Everyone,
Could someone please help me - I've been trying to do an install of oVirt for *weeks* (including false starts and self-inflicted wounds/errors) and it is still not working.
My setup:
- oVirt v4.5.3
- A brand new fresh vanilla install of RockyLinux 8.6 - all working AOK
- 2*NICs in a bond (802.3ad) with a couple of sub-Interfaces/VLANs - all working AOK
- All relevant IPv4 Address in DNS with Reverse Lookups - all working AOK
- All relevant IPv4 Address in "/etc/hosts" file - all working AOK
- IPv6 (using "method=auto" in the interface config file) enabled on the relevant sub-Interface/VLAN - I'm not using IPv6 on the network, only IPv4, but I'm trying to cover all the bases.
- All relevant Ports (as per the oVirt documentation) set up on the firewall
- i.e. firewall-cmd --permanent --add-service={libvirt-tls,ovirt-imageio,ovirt-vmconsole,vdsm} && firewall-cmd --reload
- All the relevant Repositories installed (ie RockyLinux BaseOS, AppStream, & PowerTools, and the EPEL, plus the ones from the oVirt documentation)
I have followed the oVirt documentation (including the special RHEL-instructions and RockyLinux-instructions) to the letter - no deviations, no special settings, exactly as they are written.
All the dnf installs, etc, went off without a hitch, including the "dnf install centos-release-ovirt45", "dnf install ovirt-engine-appliance", and "dnf install ovirt-hosted-engine-setup" - no errors anywhere.
Here is the results of a "dnf repolist":
- appstream Rocky Linux 8 - AppStream
- baseos Rocky Linux 8 - BaseOS
- centos-ceph-pacific CentOS-8-stream - Ceph Pacific
- centos-gluster10 CentOS-8-stream - Gluster 10
- centos-nfv-openvswitch CentOS-8 - NFV OpenvSwitch
- centos-opstools CentOS-OpsTools - collectd
- centos-ovirt45 CentOS Stream 8 - oVirt 4.5
- cs8-extras CentOS Stream 8 - Extras
- cs8-extras-common CentOS Stream 8 - Extras common packages
- epel Extra Packages for Enterprise Linux 8 - x86_64
- epel-modular Extra Packages for Enterprise Linux Modular 8 - x86_64
- ovirt-45-centos-stream-openstack-yoga CentOS Stream 8 - oVirt 4.5 - OpenStack Yoga Repository
- ovirt-45-upstream oVirt upstream for CentOS Stream 8 - oVirt 4.5
- powertools Rocky Linux 8 - PowerTools
So I kicked-off the oVirt deployment with: "hosted-engine --deploy --4 --ansible-extra-vars=he_offline_deployment=true".
I used "--ansible-extra-vars=he_offline_deployment=true" because without that flag I was getting "DNF timout" issues (see my previous post `Local (Deployment) VM Can't Reach "centos-ceph-pacific" Repo`).
I accepted the defaults for all the questions the script asked, or entered the deployment-relevant answers where appropriate. In doing this I double-checked every answer before hitting <Enter>. Everything progressed smoothly until the deployment reached the "Wait for the host to be up" task... which then hung for more than 30 minutes before failing.
From the ovirt-hosted-engine-setup... log file:
- 2022-10-20 17:54:26,285+1100 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:113 fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}
I checked the following log files and found all of the relevant ERROR lines, then checked several 10s of proceeding and succeeding lines trying to determine what was going wrong, but I could not determine anything.
- ovirt-hosted-engine-setup...
- ovirt-hosted-engine-setup-ansible-bootstrap_local_vm...
- ovirt-hosted-engine-setup-ansible-final_clean... - not really relevant, I believe
I can include the log files (or the relevant parts of them) if people want - but they are very large: several hundred kilobytes each.
I also googled "oVirt Host is not up" and found several entries, but after reading them all the most relevant seems to be a thread from this mailing list: `Install of RHV 4.4 failing - "Host is not up, please check logs, perhaps also on the engine machine"` - but this seems to be talking about an upgrade and I didn't glean anything useful from it - I could, of course, be wrong about that.
So my questions are:
- Where else should I be looking (ie other log files, etc, and possible where to find them)?
- Does anyone have any idea why this isn't working?
- Does anyone have a work-around (including a completely manual process to get things working - I don't mind working in the CLI with virsh, etc)?
- What am I doing wrong?
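On the first question (where else to look), the places that usually explain a "Host is not up" during deployment can be sketched as commands; the log paths are the standard ones, though the engine-logs directory name varies with the deployment timestamp:

```shell
# On the host being deployed: why did vdsm not come up cleanly?
journalctl -u vdsmd -u supervdsmd --since "1 hour ago"
less /var/log/vdsm/vdsm.log
less /var/log/vdsm/supervdsm.log

# Setup logs written on the host during the deploy:
ls /var/log/ovirt-hosted-engine-setup/

# The setup also pulls the engine VM's own logs back to the host;
# engine.log there records why the engine considered the host not up:
ls /var/log/ovirt-hosted-engine-setup/engine-logs-*/
```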
Please, I'm really stumped with this, and I really do need help.
Cheers
Dulux-Oz
10 months, 2 weeks