Re: Q: How to fix ghost "locked" status of VM

On 8 Aug 2022, at 09:52, Jirka Simon <jirka@vesim.cz> wrote:
Hi Benny, Andrei.
Do you have any news? I deleted the frozen task and the locked snapshot and disk on Friday, and everything worked fine over the weekend (there were 3 snapshots and backups from Friday to now).
Deleting the zombie task from PostgreSQL didn't help.
Thank you
Jirka
On 8/5/22 11:53, Benny Zlotnik wrote:
Jirka, I suppose your issue is different from Andrei's, no?
In your log I see the command has failed:
2022-08-05 10:26:57,741+02 INFO [org.ovirt.engine.core.bll.VirtJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-92) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Command CreateLiveSnapshotForVm id: '2f94c337-487e-40f7-b8f8-df69225d5c79': execution was completed, the command status is 'FAILED'
2022-08-05 10:26:58,742+02 ERROR [org.ovirt.engine.core.bll.snapshots.CreateLiveSnapshotForVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-35) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateLiveSnapshotForVmCommand' with failure.
2022-08-05 10:26:59,796+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-31) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Command 'CreateSnapshotForVm' id: '93f98c9c-2936-4615-b12e-c295b90eace0' child commands '[2f94c337-487e-40f7-b8f8-df69225d5c79, 1b3abc1e-f0c1-40f7-b52e-f7df52c52601]' executions were completed, status 'FAILED'
2022-08-05 10:27:00,826+02 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-4) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand' with failure.
2022-08-05 10:27:00,829+02 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-4) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotDiskCommand' with failure.
Generally, a live snapshot has a default 30-minute timeout to complete, and it looks like it was reached in your case. I don't see a command that's still running.
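[Editor's sketch] To get a quick list of which commands ended with failure for one correlation id, a small script along these lines may help. It is only a sketch: the regular expression is an assumption based on the engine.log lines quoted in this message, the function name failed_commands is made up for illustration, and the sample lines are copied from the excerpt above.

```python
import re

# Assumed line shape, taken from the excerpt quoted in this thread:
#   <timestamp> ERROR [<class>] (<thread>) [<correlation id>] Ending command '<class>' with failure.
FAILED_RE = re.compile(
    r"\[(?P<corr>[0-9a-f-]{36})\]\s+Ending command '(?P<cmd>[\w.]+)' with failure"
)


def failed_commands(lines, correlation_id):
    """Return short class names of commands that ended with failure (hypothetical helper)."""
    found = []
    for line in lines:
        match = FAILED_RE.search(line)
        if match and match.group("corr") == correlation_id:
            # Keep only the class name, dropping the Java package prefix.
            found.append(match.group("cmd").rsplit(".", 1)[-1])
    return found


CORR = "8ff04bca-b5a2-4ba3-8925-1037c5a2850e"
PKG = "org.ovirt.engine.core.bll.snapshots"
POOL = "EE-ManagedScheduledExecutorService-engineScheduledThreadPool"

# Sample lines reproduced from the engine.log excerpt in this message.
sample = [
    f"2022-08-05 10:26:58,742+02 ERROR [{PKG}.CreateLiveSnapshotForVmCommand] "
    f"({POOL}-Thread-35) [{CORR}] Ending command "
    f"'{PKG}.CreateLiveSnapshotForVmCommand' with failure.",
    f"2022-08-05 10:27:00,826+02 ERROR [{PKG}.CreateSnapshotForVmCommand] "
    f"({POOL}-Thread-4) [{CORR}] Ending command "
    f"'{PKG}.CreateSnapshotForVmCommand' with failure.",
    f"2022-08-05 10:27:00,829+02 ERROR [{PKG}.CreateSnapshotDiskCommand] "
    f"({POOL}-Thread-4) [{CORR}] Ending command "
    f"'{PKG}.CreateSnapshotDiskCommand' with failure.",
]

print(failed_commands(sample, CORR))
```

If the list covers the whole command chain, as it does here, no command for that correlation id is left waiting.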
On Fri, Aug 5, 2022 at 12:36 PM Jirka Simon <jirka@vesim.cz> wrote:
Hi Benny , Andrei,
here are my logs
affected job was " waiting for job '6e73cda8-85d0-471c-b090-4c89719a97f4' on host 'ovirt3.corp.sldev.cz' "
thank you
Jirka
On 8/5/22 11:00, Andrei Verovski wrote:
Hi,
I have e-mailed the vdsm log to Benny. Unfortunately, yesterday I deleted engine.log before restarting the engine to get a shorter log.
Jirka, can you e-mail engine.log and vdsm logs (SPM at the time and the host running the VM) to Benny?
On 5 Aug 2022, at 11:42, Jirka Simon <jirka@vesim.cz> wrote:
Hi Benny, we have had the same problem for the last 2 days.
There are two affected VMs (fortunately not more) and we can't start any snapshot for these VMs.
It looks like the snapshot hadn't started yet, so unlocking the objects (snapshot and disk) shouldn't damage the disk. I tried it on both VMs, then restarted them, and the VMs run again. I made a clone of one of them, and the clone works properly.
Jirka
On 8/5/22 10:34, Benny Zlotnik wrote:
So based on your logs, the lock you are seeing is a memory lock; unlock_entity.sh can't really help with these. Also, the job table is used mainly for presentation, so removing an entry from it will not help.
Do you have the logs from when this snapshot operation started? You can use the correlation id (28353fa0-5e36-4fe8-8609-e74cd1da6d36) to search. Also, do you have the vdsm logs (SPM at the time and the host running the VM)? The same correlation id can be used for these as well.
The table that's used to coordinate this is command_entities, so in theory removing the entries with this correlation id can help, but I'd like to see what led to this first.
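[Editor's sketch] The correlation id search described above can be done with grep, or with a few lines of Python when several log files need to be scanned at once. The sketch below is an illustration only: the helper name grep_correlation_id is made up, and the demo log content is a shortened stand-in, not real engine output. On a real system you would pass the paths of the engine and vdsm logs instead of the temporary file.

```python
import tempfile
from pathlib import Path


def grep_correlation_id(paths, correlation_id):
    """Yield (file name, line number, line) for each line mentioning the id (hypothetical helper)."""
    for path in map(Path, paths):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if correlation_id in line:
                    yield path.name, number, line.rstrip("\n")


CORRELATION_ID = "28353fa0-5e36-4fe8-8609-e74cd1da6d36"

# Demo on a temporary file; the three lines are made-up stand-ins for log entries.
with tempfile.TemporaryDirectory() as tmp:
    demo_log = Path(tmp) / "engine.log"
    demo_log.write_text(
        f"2022-08-04 20:57:30,145+03 INFO [{CORRELATION_ID}] first matching line\n"
        "2022-08-04 20:57:31,000+03 INFO [00000000-0000-0000-0000-000000000000] unrelated line\n"
        f"2022-08-04 20:57:40,176+03 INFO [{CORRELATION_ID}] second matching line\n"
    )
    hits = list(grep_correlation_id([demo_log], CORRELATION_ID))

for name, number, line in hits:
    print(f"{name}:{number}: {line}")
```

A plain substring match is used on purpose, so the same function works on any log that carries the id, whatever the surrounding line format is.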
On Fri, Aug 5, 2022 at 8:37 AM Andrei Verovski <andreil1@starlett.lv> wrote:
Hi, Benny,
I have sent the log to your mailbox; it's too big to post here on the mailing list.
It looks like the ghost task is still running. Does anything else need to be removed from the Postgres DB?
BTW, the frozen dead snapshot is in an invalid state. Is there any way to get rid of it? I think it actually exists, but because of the invalid state it's not possible to do anything with it.
——
Log file still shows zombie task:
2022-08-04 20:57:30,145+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-24) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
2022-08-04 20:57:40,176+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-85) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
2022-08-04 20:57:50,252+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-66) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
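[Editor's sketch] When a log is full of repeated "waiting on child command" entries like the ones above, it can help to collapse them into one row per parent/child pair with a repeat count. The following is a rough sketch, assuming the message format shown in this excerpt; the helper name waiting_pairs is made up, and the sample lines keep only the timestamp and message part of the entries quoted above.

```python
import re
from collections import Counter

# Assumed message shape, based on the entries quoted in this message.
WAITING_RE = re.compile(
    r"Command '(?P<parent>\w+)' \(id: '(?P<parent_id>[0-9a-f-]{36})'\) "
    r"waiting on child command id: '(?P<child_id>[0-9a-f-]{36})' "
    r"type:'(?P<child>\w+)' to complete"
)


def waiting_pairs(lines):
    """Count how often each (parent, parent id, child, child id) wait is logged (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = WAITING_RE.search(line)
        if match:
            key = (match["parent"], match["parent_id"], match["child"], match["child_id"])
            counts[key] += 1
    return counts


MESSAGE = (
    "Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') "
    "waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' "
    "type:'CreateLiveSnapshotForVm' to complete"
)

# Timestamps are the three from the excerpt; the class and thread fields are omitted.
sample = [
    f"2022-08-04 20:57:30,145+03 INFO {MESSAGE}",
    f"2022-08-04 20:57:40,176+03 INFO {MESSAGE}",
    f"2022-08-04 20:57:50,252+03 INFO {MESSAGE}",
]

for (parent, parent_id, child, child_id), count in waiting_pairs(sample).items():
    print(f"{parent} {parent_id} waits on {child} {child_id} ({count} entries)")
```

A pair whose count keeps growing across engine restarts points at the child command id worth searching for in the earlier logs.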
On 4 Aug 2022, at 19:06, Benny Zlotnik <bzlotnik@redhat.com> wrote:
can you share the logs after restarting ovirt-engine?
On Thu, Aug 4, 2022 at 4:58 PM Andrei Verovski <andreil1@starlett.lv> wrote:
Hi,
Creating a snapshot of one of the VMs failed, and the zombie task was killed with:
su postgres
psql -d engine -U postgres
select * from job order by start_time desc;
select DeleteJob('UUID_FROZEN_TASK_ID');
However, the VM remains in a locked state (with a lock sign at the lower left of the red "DOWN" arrow in the status column of the web interface).
I ran: /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t all
then rebooted the engine VM, still no luck. I can't do anything with that VM.
Please advise how to fix. Thanks in advance.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/P2TVMLHC53JWCL...