Q: How to fix ghost "locked" status of VM

Hi,
Creating a snapshot of one of the VMs failed, and I killed the zombie task with:
su postgres
psql -d engine -U postgres
select * from job order by start_time desc;
select DeleteJob('UUID_FROZEN_TASK_ID');
However, the VM remains in a locked state (a lock icon to the lower left of the red "DOWN" arrow in the status column of the web interface).
I ran:
/usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t all
then rebooted the engine VM, still no luck. I can't do anything with that VM.
Please advise how to fix. Thanks in advance.

In reply to Andrei Verovski <andreil1@starlett.lv> (Thu, Aug 4, 2022 at 4:58 PM):
Can you share the logs after restarting ovirt-engine?
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/P2TVMLHC53JWCL...

In reply to Benny Zlotnik <bzlotnik@redhat.com> (4 Aug 2022, 19:06):
Hi Benny,
I have sent the log to your mailbox; it's too big to post here on the mailing list.
It looks like the ghost task is still running. Does anything else need to be removed from the Postgres DB?
BTW, the frozen dead snapshot is in an invalid state. Is there any way to get rid of it? I think it actually exists, but because of the invalid state it's not possible to do anything with it.
The log file still shows the zombie task:
2022-08-04 20:57:30,145+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-24) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
2022-08-04 20:57:40,176+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-85) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
2022-08-04 20:57:50,252+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-66) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
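
The "waiting on child command" lines carry three identifiers that matter later in the thread: the correlation id in square brackets, the parent command id, and the child command id. A minimal sketch for pulling them out of an engine log follows. The function name list_waiting_commands is made up for illustration, and the pattern assumes log lines shaped exactly like the excerpts in this message, one entry per line.

```shell
# Print "correlation_id parent_id child_id child_type" once for each distinct
# parent/child pair that is reported as still waiting.
# Assumption: one log entry per line, in the format quoted in this thread.
list_waiting_commands() {
    sed -n "s/.*\[\([0-9a-f-]\{36\}\)\] Command '[^']*' (id: '\([0-9a-f-]\{36\}\)') waiting on child command id: '\([0-9a-f-]\{36\}\)' type:'\([^']*\)' to complete.*/\1 \2 \3 \4/p" "$@" | sort -u
}
```

Called as list_waiting_commands /path/to/engine.log, it collapses the repeated ten-second polling messages into one line per stuck pair.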

In reply to Andrei Verovski <andreil1@starlett.lv> (Fri, Aug 5, 2022 at 8:37 AM):
Based on your logs, the lock you are seeing is a memory lock, and unlock_entity.sh can't really help with those. Also, the job table is used mainly for presentation, so removing an entry from it will not help.
Do you have the logs from when this snapshot operation started? You can use the correlation id (28353fa0-5e36-4fe8-8609-e74cd1da6d36) to search for them. Also, do you have the vdsm logs (from the SPM at the time and from the host running the VM)? The same correlation id can be used there as well.
The table used to coordinate this is command_entities, so in theory removing the entries with this correlation id can help, but I'd like to see what led to this first.
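
The log search by correlation id suggested above can be sketched as follows. The two paths are the usual default locations of the engine and vdsm logs and are an assumption here; the function name grep_correlation_id is hypothetical. Rotated, compressed logs would need a different tool than plain grep.

```shell
# Assumed default log locations; override them if your installation differs.
ENGINE_LOG=${ENGINE_LOG:-/var/log/ovirt-engine/engine.log}
VDSM_LOG=${VDSM_LOG:-/var/log/vdsm/vdsm.log}

# Print every line that mentions the given correlation id,
# prefixed with the file name and line number.
grep_correlation_id() {
    id=$1
    shift
    grep -H -n -F -e "$id" -- "$@"
}

# On the engine:          grep_correlation_id 28353fa0-5e36-4fe8-8609-e74cd1da6d36 "$ENGINE_LOG"
# On the SPM and VM host: grep_correlation_id 28353fa0-5e36-4fe8-8609-e74cd1da6d36 "$VDSM_LOG"
```

The -F flag treats the id as a fixed string rather than a regular expression, so the hyphens and digits are matched literally.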

In reply to Benny Zlotnik (8/5/22 10:34):
Hi Benny,
we have had the same problem for the last 2 days. Two VMs are affected (fortunately not more), and we can't start any snapshot for these VMs.
It looks like the snapshot hadn't started yet, so unlocking the objects (snapshot and disk) shouldn't damage the disk. I tried it on both VMs, then restarted them, and the VMs run again. I made a clone of one of them, and the clone works properly.
Jirka

In reply to Jirka Simon <jirka@vesim.cz> (5 Aug 2022, 11:42):
Hi,
I have e-mailed the vdsm log to Benny. Unfortunately, yesterday I deleted engine.log before restarting the engine, to get a shorter log.
Jirka, can you e-mail engine.log and the vdsm logs (from the SPM at the time and from the host running the VM) to Benny?

In reply to Benny Zlotnik <bzlotnik@redhat.com> (5 Aug 2022, 11:34):
Hi,
OK, how do I properly remove this lock? Right now the VM is locked and cannot be managed in any way. I suppose it is done with some SQL commands in Postgres.
Thanks.

In reply to Andrei Verovski <andreil1@starlett.lv> (Mon, Aug 8, 2022 at 9:33 AM):
You can look up the relevant commands by command_id in the command_entities table. In your case they would be 2f8b32d8-fd3c-46c9-90e9-4863d63c0530 and ed816f9d-e25c-4b58-8c8f-fd0393abda2f. There might be more, as the log is trimmed, so I suggest looking them up with:
select * from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
Then delete the relevant entries and restart ovirt-engine (as they might still be present in the cache).
But before doing that, is the command still running? Async commands like create snapshot are failed automatically after 50 hours, and I believe that time has already passed.
Also, this manual operation is very intrusive and might have unexpected consequences, so make sure you have backups.
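
The lookup and delete steps described above can be sketched as a dry run that only prints SQL for review. The function name command_entities_sql is hypothetical. Matching the parent row by command_id in addition to root_command_id is an assumption based on the two ids named in this message, not something the thread confirms.

```shell
# Print the SQL to inspect, and then remove, the command_entities rows of one
# root command. Nothing is executed; review the output before feeding it to psql.
command_entities_sql() {
    root_id=$1
    # Accept only characters that can occur in a UUID, so quoting stays safe.
    case $root_id in
        ''|*[!0-9a-fA-F-]*) echo "not a UUID: $root_id" >&2; return 1 ;;
    esac
    cat <<EOF
select command_id, root_command_id from command_entities where command_id = '$root_id' or root_command_id = '$root_id';
delete from command_entities where command_id = '$root_id' or root_command_id = '$root_id';
EOF
}
```

A cautious order would be: take a backup (for example with engine-backup --mode=backup --file=... --log=...), run only the select line, run the delete line if the rows are the expected ones, then restart the service with systemctl restart ovirt-engine.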

In reply to Benny Zlotnik <bzlotnik@redhat.com> (8 Aug 2022, 13:07):
Hi Benny,
select * from command_entities where root_command_id = 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f';
-> 0 rows
select * from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
-> A huge page of output that I have to page through with "more". Is it possible to reduce it to something meaningful, e.g. just the number of rows?
I can restore this VM from a backup copy, but since it is unmanageable, I can't even remove it.

In reply to Andrei Verovski <andreil1@starlett.lv> (Mon, Aug 8, 2022 at 4:19 PM):
You can do:
select command_id, root_command_id from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
Hi, Benny,
select * from command_entities where root_command_id = 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f’; -> 0 rows
select * from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530’; -> Huge page of smth, have to page out with “more”, is it possible to truncate it to meaningful value, e.g. just number of rows?
I can restore this VM from backup copy, bit since it is unmanageable, I can’t even remove it.
On 8 Aug 2022, at 13:07, Benny Zlotnik <bzlotnik@redhat.com> wrote:
you can look up the relevant command by command_id in the command_entities table, in your case it would be 2f8b32d8-fd3c-46c9-90e9-4863d63c0530 and ed816f9d-e25c-4b58-8c8f-fd0393abda2f, there might be more as the log is trimmed, so I suggest to look it up with select * from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
Then delete the relevant entries and restart ovirt-engine (as they might still be present in the cache)
But before doing that, is the command still running? Async commands like create snapshot are failed automatically after 50 hours and I believe it has already passed Also, this manual operation is very intrusive and might have unexpected consequences so make sure you have backups.
On Mon, Aug 8, 2022 at 9:33 AM Andrei Verovski <andreil1@starlett.lv> wrote:
HI,
OK, how to properly remove this lock? Right now VM is locked and is unmanageable at all in any way. I suppose its with some SQL commands in Postgres.
Thanks.
On 5 Aug 2022, at 11:34, Benny Zlotnik <bzlotnik@redhat.com> wrote:
So based on your logs the lock you are seeing is a memory lock, unlock_entity.sh can't really help with these. Also, the job table is used mainly for presentation so removing an entry from will not help.
Do you have the logs from when this snapshot operation started, you can use the correlation id (28353fa0-5e36-4fe8-8609-e74cd1da6d36) to search? Also, do you have the vdsm logs (SPM at the time and the host running the VM), same correlation id can be used for this as well
The table that's used to coordinate this is command_entities, so in theory removing the entries with this correlation id can help, but I'd like to see what led to this first
On Fri, Aug 5, 2022 at 8:37 AM Andrei Verovski <andreil1@starlett.lv> wrote:
Hi, Benny,
I have sent log on your mailbox, its too big to post here on mailing list.
Looks like ghost task is still running, anything else need to be removed from Postgres DB?
BTW, frozen dead snapshot is in invalid state, is there any way to get rid of it? I think it's actually exists, but due to invalid state its not possible to do anything with it.
——
Log file still shows zombie task:
2022-08-04 20:57:30,145+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-24) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
2022-08-04 20:57:40,176+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-85) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
2022-08-04 20:57:50,252+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-66) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
On 4 Aug 2022, at 19:06, Benny Zlotnik <bzlotnik@redhat.com> wrote:
can you share the logs after restarting ovirt-engine?

Hi, Benny,
3 records found.
engine=# select command_id,root_command_id from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
              command_id              |           root_command_id
--------------------------------------+--------------------------------------
 2f8b32d8-fd3c-46c9-90e9-4863d63c0530 | 2f8b32d8-fd3c-46c9-90e9-4863d63c0530
 3c09e1ce-1e49-4d03-82dd-2844fb9dc39f | 2f8b32d8-fd3c-46c9-90e9-4863d63c0530
 ed816f9d-e25c-4b58-8c8f-fd0393abda2f | 2f8b32d8-fd3c-46c9-90e9-4863d63c0530
Now this?
select DeleteJob('2f8b32d8-fd3c-46c9-90e9-4863d63c0530');
On 8 Aug 2022, at 16:27, Benny Zlotnik <bzlotnik@redhat.com> wrote:
you can do: select command_id,root_command_id from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
On Mon, Aug 8, 2022 at 4:19 PM Andrei Verovski <andreil1@starlett.lv> wrote:
Hi, Benny,
select * from command_entities where root_command_id = 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f'; -> 0 rows
select * from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530'; -> A huge page of output that I have to page through with “more”. Is it possible to truncate it to something meaningful, e.g. just the number of rows?
I can restore this VM from a backup copy, but since it is unmanageable, I can’t even remove it.
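One way to get just the number of rows, or only the two id columns, is to select less than "*". The shell sketch below only prints the SQL text and runs nothing against the database. The helper names are hypothetical, and the argument is assumed to be a plain UUID copied from the engine log.

```shell
# Hypothetical helpers that only print SQL text; nothing is executed here.
# $1 is the root command id taken from the engine log.
count_children_sql() {
  printf "select count(*) from command_entities where root_command_id = '%s';\n" "$1"
}

list_children_sql() {
  printf "select command_id, root_command_id from command_entities where root_command_id = '%s';\n" "$1"
}
```

The output can be piped into psql, for example count_children_sql 2f8b32d8-fd3c-46c9-90e9-4863d63c0530 | psql -d engine -U postgres -At, where -A and -t turn off alignment and headers so that only the number is printed. Inside an interactive psql session, \x (expanded display) is another way to make wide rows readable.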

No, DeleteJob deletes from the job table. You can use DeleteCommandEntity(uuid) [1] instead.
[1] https://github.com/oVirt/ovirt-engine/blob/fbd5851b9de889fb88df6f10310ea9051...
On Mon, Aug 8, 2022 at 4:58 PM Andrei Verovski <andreil1@starlett.lv> wrote:
Hi, Benny,
3 records found.
engine=# select command_id,root_command_id from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';
              command_id              |           root_command_id
--------------------------------------+--------------------------------------
 2f8b32d8-fd3c-46c9-90e9-4863d63c0530 | 2f8b32d8-fd3c-46c9-90e9-4863d63c0530
 3c09e1ce-1e49-4d03-82dd-2844fb9dc39f | 2f8b32d8-fd3c-46c9-90e9-4863d63c0530
 ed816f9d-e25c-4b58-8c8f-fd0393abda2f | 2f8b32d8-fd3c-46c9-90e9-4863d63c0530
Now this?
select DeleteJob('2f8b32d8-fd3c-46c9-90e9-4863d63c0530');
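Applied to the three rows found above, the DeleteCommandEntity suggestion could look like the shell sketch below. It assumes that DeleteCommandEntity takes a single command_id and is called through select, in the same way DeleteJob was called earlier in the thread. The helper names are hypothetical, and the helpers only print SQL for review without executing anything.

```shell
# Hypothetical helper: succeeds only for a canonically formatted UUID.
is_uuid() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Hypothetical helper: print one DeleteCommandEntity call per command id.
# It only emits SQL text; review it before feeding it to psql.
delete_command_entities_sql() {
  local id
  for id in "$@"; do
    is_uuid "$id" || { echo "not a UUID: $id" >&2; return 1; }
    printf "select DeleteCommandEntity('%s');\n" "$id"
  done
}
```

After taking a backup, the printed statements could be piped into psql -d engine -U postgres and followed by systemctl restart ovirt-engine, since the entries might still be present in the cache. The thread does not say whether the order of deletion matters, so that remains an open point.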

Hi,
I re-ran engine-setup and it cleared the zombie status of this VM.
On 8/8/22 16:27, Benny Zlotnik wrote:
you can do: select command_id,root_command_id from command_entities where root_command_id = '2f8b32d8-fd3c-46c9-90e9-4863d63c0530';

Hi Andrei,
have you checked the locked entries here?
/usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t all -qc
It helped me to unlock snapshots and disks and then restart the VM. But yes, there is a chance of damaging the VM.
Jirka
On 8/8/22 08:33, Andrei Verovski wrote:
Hi,
OK, how do I properly remove this lock? Right now the VM is locked and completely unmanageable. I suppose it is done with some SQL commands in Postgres.
Thanks.

Yes, I did, but it's not working in my case. This is the explanation from Benny:
So, based on your logs, the lock you are seeing is a memory lock, and unlock_entity.sh can't really help with these. Also, the job table is used mainly for presentation, so removing an entry from it will not help.
On 8 Aug 2022, at 15:30, Jirka Simon <jirka@vesim.cz> wrote:
Hi Andrei
have you checked the locked entries here?
/usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t all -qc
It helped me to unlock snapshots and disks and then restart the VM. But yes, there is a chance of damaging the VM.
Jirka
participants (3)
- Andrei Verovski
- Benny Zlotnik
- Jirka Simon