Cannot remove Snapshot. Related operation is currently in progress. Please try again later.
by Martin Marusinec
Hello,
my storage got disconnected during an image live migration, resulting in a stuck migration. The task was cleared by restarting the vdsmd service, but that left a stuck disk image. The VM was powered down in the meantime, and currently I cannot do anything with it; I always get a message like "Cannot.... Related operation is currently in progress." I already ran the unlock_entity.sh script; it cleaned something up, but did not help. I cannot delete the VM, the disk, or the snapshots. I even deleted the wrong snapshot from the disk. All I need is to get rid of the VM and its disk, as I can restore it from backup. Could somebody please point me in the right direction? The engine log keeps filling up with messages indicating there is an orphaned task, but I am not sure what to delete from the database (a read-only sketch of what I am planning to query is after the log excerpt below):
2022-04-13 22:26:41,822+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command CopyData id: '62b1976b-729a-41ae-972c-a42de750ad1d': couldn't get the status of job '6ef3cb28-14cf-486a-ad71-96484c1a596e' on host 'kvm2.finamis.com' (id: 'f880cfbe-5827-4853-a4fc-11199c439f79'), assuming it's still running
2022-04-13 22:26:41,846+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command 'LiveMigrateDisk' (id: '3cd698fd-08a2-419e-bdd8-6b3516bb3127') waiting on child command id: '2e9ad954-8f4e-41ed-9c75-015afeed337d' type:'CopyImageGroupVolumesData' to complete
2022-04-13 22:26:41,847+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command FenceVolumeJob id: '831828a3-2e0f-4989-965b-91761fa9601f': job '9dc88836-db04-49ae-b01e-18a56e24c91a' wasn't executed on any host, considering the job status as failed
2022-04-13 22:26:41,847+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command FenceVolumeJob id: '831828a3-2e0f-4989-965b-91761fa9601f': execution was completed, the command status is 'FAILED'
2022-04-13 22:26:42,850+02 ERROR [org.ovirt.engine.core.bll.job.JobRepositoryImpl] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-6) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Failed to save step '499b0207-61ba-4998-8cb8-2cb93b89422b', 'FINALIZING': CallableStatementCallback; SQL [{call insertstep(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
Detail: Key (job_id)=(ed10f993-cfd2-4a8b-806f-3fbd823101ef) is not present in table "job".
Where: SQL statement "INSERT INTO step (
step_id,
parent_step_id,
job_id,
step_type,
description,
step_number,
status,
progress,
start_time,
end_time,
correlation_id,
external_id,
external_system_type,
is_external
)
VALUES (
v_step_id,
v_parent_step_id,
v_job_id,
v_step_type,
v_description,
v_step_number,
v_status,
v_progress,
v_start_time,
v_end_time,
v_correlation_id,
v_external_id,
v_external_system_type,
v_is_external
)"
PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
Detail: Key (job_id)=(ed10f993-cfd2-4a8b-806f-3fbd823101ef) is not present in table "job".
Where: SQL statement "INSERT INTO step (
step_id,
parent_step_id,
job_id,
step_type,
description,
step_number,
status,
progress,
start_time,
end_time,
correlation_id,
external_id,
external_system_type,
is_external
)
VALUES (
v_step_id,
v_parent_step_id,
v_job_id,
v_step_type,
v_description,
v_step_number,
v_status,
v_progress,
v_start_time,
v_end_time,
v_correlation_id,
v_external_id,
v_external_system_type,
v_is_external
)"
PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement
2022-04-13 22:26:42,851+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-6) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand' with failure.
2022-04-13 22:26:51,855+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command 'CopyImageGroupVolumesData' (id: '2e9ad954-8f4e-41ed-9c75-015afeed337d') waiting on child command id: '62b1976b-729a-41ae-972c-a42de750ad1d' type:'CopyData' to complete
2022-04-13 22:26:51,858+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] START, GetHostJobsVDSCommand(HostName = kvm2.finamis.com, GetHostJobsVDSCommandParameters:{hostId='f880cfbe-5827-4853-a4fc-11199c439f79', type='storage', jobIds='[6ef3cb28-14cf-486a-ad71-96484c1a596e]'}), log id: 480d931c
2022-04-13 22:26:51,862+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] FINISH, GetHostJobsVDSCommand, return: {}, log id: 480d931c
2022-04-13 22:26:51,862+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command CopyData id: '62b1976b-729a-41ae-972c-a42de750ad1d': attempting to determine the job status by polling the entity.
2022-04-13 22:26:51,862+02 ERROR [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command CopyData id: '62b1976b-729a-41ae-972c-a42de750ad1d': failed to poll the command entity
2022-04-13 22:26:51,862+02 INFO [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command CopyData id: '62b1976b-729a-41ae-972c-a42de750ad1d': attempting to fence job ed10f993-cfd2-4a8b-806f-3fbd823101ef
2022-04-13 22:26:51,868+02 INFO [org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Running command: FenceVolumeJobCommand internal: true.
2022-04-13 22:26:51,868+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command 'org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand' failed: null
2022-04-13 22:26:51,868+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Exception: java.lang.NullPointerException
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand.executeCommand(FenceVolumeJobCommand.java:31)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1174)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1332)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2008)
at org.ovirt.engine.core.utils//org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:140)
at org.ovirt.engine.core.utils//org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:79)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1392)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:424)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runAction(Backend.java:450)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:432)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:638)
at jdk.internal.reflect.GeneratedMethodAccessor273.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
at org.jboss.as.weld.common@23.0.2.Final//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:79)
at org.jboss.as.weld.common@23.0.2.Final//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:89)
at org.jboss.as.weld.common@23.0.2.Final//org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:102)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:40)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:232)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:446)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:164)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
at org.jboss.weld.core@3.1.6.Final//org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81)
at org.jboss.as.weld.common@23.0.2.Final//org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ejb3@23.0.2.Final//org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
at org.wildfly.security.elytron-private@1.15.3.Final//org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:633)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
at org.jboss.invocation@1.6.0.Final//org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:198)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:191)
at org.jboss.as.ee@23.0.2.Final//org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:81)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view3.runInternalAction(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor272.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.jboss.weld.core@3.1.6.Final//org.jboss.weld.util.reflection.Reflections.invokeAndUnwrap(Reflections.java:410)
at org.jboss.weld.core@3.1.6.Final//org.jboss.weld.module.ejb.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:134)
at org.jboss.weld.core@3.1.6.Final//org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56)
at org.jboss.weld.core@3.1.6.Final//org.jboss.weld.module.ejb.InjectionPointPropagatingEnterpriseTargetBeanInstance.invoke(InjectionPointPropagatingEnterpriseTargetBeanInstance.java:68)
at org.jboss.weld.core@3.1.6.Final//org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:106)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal2049259618$Proxy$_$$_Weld$EnterpriseProxy$.runInternalAction(Unknown Source)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalAction(CommandBase.java:2386)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2411)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.CommandBase.runInternalActionWithTasksContext(CommandBase.java:2406)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand.attemptToFenceJob(CopyDataCommand.java:144)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.HostJobCallback.handleUndeterminedJobStatus(HostJobCallback.java:196)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.HostJobCallback.childCommandsExecutionEnded(HostJobCallback.java:84)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:181)
at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:360)
at org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:511)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
at org.glassfish.javax.enterprise.concurrent//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:227)
2022-04-13 22:26:51,872+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command CopyData id: '62b1976b-729a-41ae-972c-a42de750ad1d': couldn't get the status of job '6ef3cb28-14cf-486a-ad71-96484c1a596e' on host 'kvm2.finamis.com' (id: 'f880cfbe-5827-4853-a4fc-11199c439f79'), assuming it's still running
2022-04-13 22:26:51,899+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command 'LiveMigrateDisk' (id: '3cd698fd-08a2-419e-bdd8-6b3516bb3127') waiting on child command id: '2e9ad954-8f4e-41ed-9c75-015afeed337d' type:'CopyImageGroupVolumesData' to complete
2022-04-13 22:26:51,900+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command FenceVolumeJob id: '3c8b0471-c405-4b7f-ac54-43732407d3cb': job '3c38e07a-44f3-46b7-97f9-56b6218e69b4' wasn't executed on any host, considering the job status as failed
2022-04-13 22:26:51,900+02 INFO [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-76) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Command FenceVolumeJob id: '3c8b0471-c405-4b7f-ac54-43732407d3cb': execution was completed, the command status is 'FAILED'
2022-04-13 22:26:52,903+02 ERROR [org.ovirt.engine.core.bll.job.JobRepositoryImpl] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-14) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Failed to save step 'ad184ef6-e708-41f3-acff-962d39f01c30', 'FINALIZING': CallableStatementCallback; SQL [{call insertstep(?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)}ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
Detail: Key (job_id)=(ed10f993-cfd2-4a8b-806f-3fbd823101ef) is not present in table "job".
Where: SQL statement "INSERT INTO step (
step_id,
parent_step_id,
job_id,
step_type,
description,
step_number,
status,
progress,
start_time,
end_time,
correlation_id,
external_id,
external_system_type,
is_external
)
VALUES (
v_step_id,
v_parent_step_id,
v_job_id,
v_step_type,
v_description,
v_step_number,
v_status,
v_progress,
v_start_time,
v_end_time,
v_correlation_id,
v_external_id,
v_external_system_type,
v_is_external
)"
PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement; nested exception is org.postgresql.util.PSQLException: ERROR: insert or update on table "step" violates foreign key constraint "fk_step_job"
Detail: Key (job_id)=(ed10f993-cfd2-4a8b-806f-3fbd823101ef) is not present in table "job".
Where: SQL statement "INSERT INTO step (
step_id,
parent_step_id,
job_id,
step_type,
description,
step_number,
status,
progress,
start_time,
end_time,
correlation_id,
external_id,
external_system_type,
is_external
)
VALUES (
v_step_id,
v_parent_step_id,
v_job_id,
v_step_type,
v_description,
v_step_number,
v_status,
v_progress,
v_start_time,
v_end_time,
v_correlation_id,
v_external_id,
v_external_system_type,
v_is_external
)"
PL/pgSQL function insertstep(uuid,uuid,uuid,character varying,text,integer,character varying,smallint,timestamp with time zone,timestamp with time zone,character varying,uuid,character varying,boolean) line 3 at SQL statement
2022-04-13 22:26:52,904+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-14) [e546bb30-7d18-4348-9c28-9dbbf4ab1893] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.FenceVolumeJobCommand' with failure.
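For reference, this is the read-only look I am planning to take at the engine database before deleting anything. The job id and the job/step table names come from the fk_step_job error above; the command_entities table and the exact column names are from memory, so they may differ between versions, and I would take an engine-backup first:
# back up first: engine-backup --mode=backup --file=engine.bck --log=backup.log
sudo -u postgres psql engine <<'SQL'
-- async commands the engine is still polling (LiveMigrateDisk, CopyData, ...)
SELECT command_id, command_type, status, created_at
  FROM command_entities
 ORDER BY created_at;
-- the job referenced by the fk_step_job violation in the log
SELECT job_id, action_type, status, start_time
  FROM job
 WHERE job_id = 'ed10f993-cfd2-4a8b-806f-3fbd823101ef';
SQL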
certificate expires: PKIX path validation failed
by Nathanaël Blanchet
Hi,
Some of my hosts went into a non-responsive state since their certificate had expired:
VDSM palomo command Get Host Capabilities failed: PKIX path validation
failed: java.security.cert.CertPathValidatorException: validity check failed
On palomo:
openssl x509 -noout -enddate -in /etc/pki/vdsm/certs/vdsmcert.pem
notAfter=Apr 6 11:09:05 2022 GMT
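To see quickly which hosts are affected, I run the same check over ssh (the host names other than palomo are placeholders here; -checkend 0 makes openssl exit non-zero once the certificate has expired):
for h in palomo host2 host3; do
    echo -n "$h: "
    ssh root@"$h" "openssl x509 -noout -enddate -checkend 0 \
        -in /etc/pki/vdsm/certs/vdsmcert.pem" || echo EXPIRED
done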
The recommended path to update certificates is to put the hosts into maintenance and enroll certificates.
But I can no longer live-migrate VMs since the certificate has expired:
2022-04-13 10:34:12,022+0200 ERROR (migsrc/bf0f7628) [virt.vm]
(vmId='bf0f7628-d70b-47a4-8569-5430e178f429') [SSL:
CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)
(migration:331)
So is there a way to disable TLS for migration so that I can move these VMs off and put the host into maintenance?
Without migration we would have to stop production VMs, which is what we absolutely want to avoid!
Any help much appreciated.
--
Nathanaël Blanchet
Supervision réseau
SIRE
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet(a)abes.fr
vdsm hook after node upgrade
by Nathanaël Blanchet
Hi,
I've upgraded my hosts from 4.4.9 to 4.4.10 and none of my vdsm hooks
are present anymore... I believed this additional custom data would be
persistent across updates...
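In case it helps to compare, this is how I check what is still present on each host after the update (the hook directory is the standard vdsm one; whether hand-copied files outside the persisted paths survive an oVirt Node image update is exactly what I am unsure about):
# hooks shipped as packages
rpm -qa 'vdsm-hook-*'
# hook scripts actually on disk, per hook point
ls -R /usr/libexec/vdsm/hooks/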
--
Nathanaël Blanchet
Supervision réseau
SIRE
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5
Tél. 33 (0)4 67 54 84 55
Fax 33 (0)4 67 54 84 14
blanchet(a)abes.fr
ovirt-dr generate
by Colin Coe
Hi all
I'm trying to run ovirt-dr generate but it's failing:
/usr/share/ansible/collections/ansible_collections/redhat/rhv/roles/disaster_recovery/files/ovirt-dr generate
Log file: '/tmp/ovirt-dr-1649673243333.log'
[Generate Mapping File] Connection to setup has failed. Please check your
credentials:
URL: https://server.fqdn/ovirt-engine/api
user: admin@internal
CA file: ./ca.pem
[Generate Mapping File] Failed to generate var file.
When I examine the log file:
2022-04-11 18:34:03,332 INFO Start generate variable mapping file for oVirt
ansible disaster recovery
2022-04-11 18:34:03,333 INFO Site address:
https://server.fqdn/ovirt-engine/api
username: admin@internal
password: *******
ca file location: ./ca.pem
output file location: ./disaster_recovery_vars.yml
ansible play location: ./dr_play.yml
2022-04-11 18:34:03,343 ERROR Connection to setup has failed. Please check
your credentials:
URL: https://server.fqdn/ovirt-engine/api
user: admin@internal
CA file: ./ca.pem
2022-04-11 18:34:03,343 ERROR Error: Error while sending HTTP request: (60,
'SSL certificate problem: unable to get local issuer certificate')
2022-04-11 18:34:03,343 ERROR Failed to generate var file.
My suspicion is that the script doesn't like third-party certs.
Has anyone got this working with third-party certs? If so, what did you
need to do? (The manual check I'm doing in the meantime is below.)
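The manual test is simply to build a ca.pem containing the full third-party chain and point curl at the same API URL the script uses (intermediate.pem and rootca.pem are placeholders for my CA's chain files):
cat intermediate.pem rootca.pem > ca.pem
curl --cacert ./ca.pem -u 'admin@internal' https://server.fqdn/ovirt-engine/api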
Thanks
Couldn't connect to VDSM within 60 seconds
by pasquale.borrelli@synlab.it
Hi,
we have 3 hosts and a self-hosted engine VM (oVirt version 4.4).
After rebooting all the hosts, we are unable to start the hosted-engine VM. In particular, the output of 'hosted-engine --vm-start' is as follows:
"The hosted engine configuration has not been retrieved from shared storage yet,
for more details please check sanlock status."
We checked the status of sanlock, ovirt-ha-agent and ovirt-ha-broker, but all are "active (running)".
Checking the log files, they all share the common error "RuntimeError: Couldn't connect to VDSM within 60 seconds", returned in a loop.
In the vdsm.log we found this error:
"
2022-04-05 16:20:11,786+0200 INFO (periodic/0) [vdsm.api] START repoStats(domains=()) from=internal, task_id=8541000f-e7fd-4b59-8ae6-522c87538688 (api:48)
2022-04-05 16:20:11,786+0200 INFO (periodic/0) [vdsm.api] FINISH repoStats return={} from=internal, task_id=8541000f-e7fd-4b59-8ae6-522c87538688 (api:54)
2022-04-05 16:20:11,789+0200 WARN (periodic/0) [root] Failed to retrieve Hosted Engine HA info, is Hosted Engine setup finished? (api:168)
"
We tried googling for this issue but, unfortunately, without success.
Can someone help us solve this critical issue?
Bests,
Pasquale
Ovirt 4.4.10 lab unstable on ESXi
by Mohamed Roushdy
Hello,
I’m used to having a nested-ESXi on another ESXi, but we are testing Ovirt these days, and we are facing a strange behavior with the engine VM. I have a hyper-converged Ovirt cluster with a hosted-engine, and whenever I reboot/shutdown one of the nodes for testing (a node that is not hosting the engine VM of course) the engine goes crazy and too slow, and starts losing network connecting to the remaining (healthy) nodes in the cluster, and I can hardly SSH into it. Ping also has too many drops while another node is down. Once that faulty node is up again the engine works very fine, but this is really limiting my ability to make further Ovirt HA tests before moving to production. I though maybe the promiscuous mode on the physical host causes this, but never faced this with nester-virtualization with VMware, and turning-off promiscuous mode isn’t an option either… what do you think please?
Mohamed Roushdy
Team Member – Systems Administrator
M: +31 61 55 94 300
VM failed to start when host's network is down
by lizhijian@fujitsu.com
Posting again after subscribing to the mailing list.
Hi guys
I have an all-in-one oVirt environment in which the node has both
vdsm and ovirt-engine installed.
I have set up the oVirt environment and it works well.
For various reasons, I have to use this oVirt setup with the node's network down (I unplugged the network cable).
In that case, I noticed that I cannot start a VM anymore.
I wonder if there is a configuration switch that lets oVirt work with the node's network down?
If not, is there an easy way to make it work?
When I try to start the VM via the oVirt API, it responds with:
```bash
[root@74d2ab9cb0 ~]# sh start.sh
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<action>
<async>false</async>
<fault>
<detail>[Cannot run VM. Unknown Data Center status.]</detail>
<reason>Operation Failed</reason>
</fault>
<status>failed</status>
</action>
[root@74d2ab9cb0 ~]# sh start.sh
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<action>
<async>false</async>
<fault>
<detail>[Cannot run VM. Unknown Data Center status.]</detail>
<reason>Operation Failed</reason>
</fault>
<status>failed</status>
</action>
[root@74d2ab9cb0 ~]#
```
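For context, start.sh is just a plain REST call against the VM's start action, roughly the following (the VM id and credentials are placeholders here):
```bash
curl -k -u 'admin@internal:PASSWORD' \
     -H 'Content-Type: application/xml' -H 'Accept: application/xml' \
     -X POST -d '<action/>' \
     'https://localhost/ovirt-engine/api/vms/VM_ID/start'
```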
The vdsm and ovirt-engine logs are attached.
Thanks
Zhijian
Wait for the engine to come up on the target vm
by Vladimir Belov
I'm trying to deploy oVirt as a self-hosted engine, but at the last step I get an engine startup error.
[ INFO ] TASK [Wait for the engine to come up on the target VM]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.181846", "end": "2022-03-28 15:41:28.853150", "rc": 0, "start": "2022-03-28 15:41:28.671304", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5537 (Mon Mar 28 15:41:20 2022)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5537 (Mon Mar 28 15:41:20 2022)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"v2.test.ru\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"4d2eeaea\", \"local_conf_timestamp\": 5537, \"host-ts\": 5537}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"conf_on_shared_storage\": true, \"live-data\": true, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=5537 (Mon Mar 28 15:41:20 2022)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=5537 (Mon Mar 28 15:41:20 2022)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStarting\\nstopped=False\\n\", \"hostname\": \"v2.test.ru\", \"host-id\": 1, \"engine-status\": {\"reason\": \"failed liveliness check\", \"health\": \"bad\", \"vm\": \"up\", \"detail\": \"Up\"}, \"score\": 3400, \"stopped\": false, \"maintenance\": false, \"crc32\": \"4d2eeaea\", \"local_conf_timestamp\": 5537, \"host-ts\": 5537}, \"global_maintenance\": false}"]}
After the installation completes, the engine status is as follows:
Engine status: {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
After reading vdsm.log, I found that the QEMU guest agent is not connected for some reason:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5400, in qemuGuestAgentShutdown
self._dom.shutdownFlags(libvirt.VIR_DOMAIN_SHUTDOWN_GUEST_AGENT)
File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2517, in shutdownFlags
if ret == -1: raise libvirtError ('virDomainShutdownFlags() failed', dom=self)
libvirtError: Guest agent is not responding: QEMU guest agent is not connected
During the installation phase, qemu-guest-agent on the guest VM is running.
Setting a temporary password (hosted-engine --add-console-password --password) and connecting via VNC also failed.
Using "hosted-engine --console" also failed to connect
The engine VM is running on this host
Connected to HostedEngine domain
Escaping character: ^]
error: internal error: character device <null> not found
The network settings are configured with static addressing, without DHCP.
It seems to me that this is because the engine gets an IP address that does not match its entry in /etc/hosts, but I do not know how to fix it. Any help is welcome; I will provide any logs needed. Thanks
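P.S. These are the checks I am running from the host in the meantime (engine.test.ru stands in for my engine FQDN; the health URL is, as far as I understand, the page the liveliness check polls):
getent hosts engine.test.ru        # what the host actually resolves
ping -c3 engine.test.ru
curl http://engine.test.ru/ovirt-engine/services/health
hosted-engine --check-liveliness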