On 8 Aug 2022, at 09:52, Jirka Simon <jirka(a)vesim.cz> wrote:
Hi Benny, Andrei,
do you have any news? I deleted the frozen task and the locked snapshot and disk on Friday, and
everything has worked fine over the weekend (there have been 3 snapshots and backups from Friday
until now).
Thank you
Jirka
On 8/5/22 11:53, Benny Zlotnik wrote:
> Jirka, I suppose your issue is different from Andrei's, no?
>
> In your log I see the command has failed:
> 2022-08-05 10:26:57,741+02 INFO [org.ovirt.engine.core.bll.VirtJobCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-92) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Command CreateLiveSnapshotForVm id: '2f94c337-487e-40f7-b8f8-df69225d5c79': execution was completed, the command status is 'FAILED'
> 2022-08-05 10:26:58,742+02 ERROR [org.ovirt.engine.core.bll.snapshots.CreateLiveSnapshotForVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-35) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateLiveSnapshotForVmCommand' with failure.
> 2022-08-05 10:26:59,796+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-31) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Command 'CreateSnapshotForVm' id: '93f98c9c-2936-4615-b12e-c295b90eace0' child commands '[2f94c337-487e-40f7-b8f8-df69225d5c79, 1b3abc1e-f0c1-40f7-b52e-f7df52c52601]' executions were completed, status 'FAILED'
> 2022-08-05 10:27:00,826+02 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-4) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand' with failure.
> 2022-08-05 10:27:00,829+02 ERROR [org.ovirt.engine.core.bll.snapshots.CreateSnapshotDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-4) [8ff04bca-b5a2-4ba3-8925-1037c5a2850e] Ending command 'org.ovirt.engine.core.bll.snapshots.CreateSnapshotDiskCommand' with failure.
>
> Generally, a live snapshot has a default 30-minute timeout to complete,
> and it looks like it was reached in your case. I don't see a command
> that's still running.
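
One rough way to check the timeout theory against engine.log is to compare the time a snapshot command started with the time it was reported as failed. The sketch below parses timestamps in the format shown in the excerpt and tests whether the gap reaches 30 minutes. The start timestamp in the example is hypothetical (the excerpt only shows when the command ended), the helper names are made up for illustration, and the 30-minute value is simply the default quoted above; none of this is an oVirt API.

```python
from datetime import datetime, timedelta

# Assumption: 30 minutes is the default quoted in the thread;
# an installation may be configured differently.
LIVE_SNAPSHOT_TIMEOUT = timedelta(minutes=30)


def parse_engine_timestamp(text):
    """Parse an engine.log timestamp such as '2022-08-05 10:26:57,741+02'."""
    # The log prints the UTC offset as hours only ('+02'); strptime's %z
    # also needs minutes, so pad it to '+0200' before parsing.
    return datetime.strptime(text + "00", "%Y-%m-%d %H:%M:%S,%f%z")


def hit_timeout(started, ended, limit=LIVE_SNAPSHOT_TIMEOUT):
    """Return True if the time between two log timestamps reaches the limit."""
    elapsed = parse_engine_timestamp(ended) - parse_engine_timestamp(started)
    return elapsed >= limit


# Hypothetical start time; the end time is taken from the excerpt above.
print(hit_timeout("2022-08-05 09:56:50,000+02", "2022-08-05 10:26:57,741+02"))
```

In practice the start time would come from the first engine.log record that carries the same correlation id.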
>
> On Fri, Aug 5, 2022 at 12:36 PM Jirka Simon <jirka(a)vesim.cz> wrote:
>> Hi Benny, Andrei,
>>
>>
>> here are my logs.
>>
>>
>> The affected job was "waiting for job '6e73cda8-85d0-471c-b090-4c89719a97f4' on host 'ovirt3.corp.sldev.cz'".
>>
>> thank you
>>
>>
>> Jirka
>>
>> On 8/5/22 11:00, Andrei Verovski wrote:
>>
>> Hi,
>>
>> I have e-mailed the vdsm log to Benny.
>> Unfortunately, yesterday I deleted engine.log before restarting the engine to get a shorter log.
>>
>> Jirka, can you e-mail engine.log and vdsm logs (SPM at the time and the host
>> running the VM) to Benny?
>>
>>
>> On 5 Aug 2022, at 11:42, Jirka Simon <jirka(a)vesim.cz> wrote:
>>
>> Hi Benny, we have had the same problem for the last 2 days.
>>
>> There are two affected VMs (fortunately not more) and we can't start any snapshot for these VMs.
>>
>> It looks like the snapshot hadn't started yet, so unlocking the objects (snapshot and disk)
>> shouldn't damage the disk (I tried it on both VMs). Then I restarted them and the VMs run again.
>> I made a clone of one of them and the clone works properly.
>>
>>
>> Jirka
>>
>> On 8/5/22 10:34, Benny Zlotnik wrote:
>>
>> So based on your logs, the lock you are seeing is a memory lock;
>> unlock_entity.sh can't really help with these.
>> Also, the job table is used mainly for presentation, so removing an
>> entry from it will not help.
>>
>> Do you have the logs from when this snapshot operation started? You
>> can use the correlation id (28353fa0-5e36-4fe8-8609-e74cd1da6d36) to
>> search. Also, do you have the vdsm logs (SPM at the time and the host
>> running the VM)? The same correlation id can be used for these as well.
>>
>> The table that's used to coordinate this is command_entities, so in
>> theory removing the entries with this correlation id can help, but I'd
>> like to see what led to this first
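
The correlation-id search suggested above can be sketched as a plain substring filter over the log files. This is a minimal illustration, not an oVirt tool: the function names are invented here, and the log path in the usage note is an assumption about a default installation.

```python
def lines_with_correlation_id(lines, correlation_id):
    """Yield the log lines that mention the given correlation id."""
    for line in lines:
        # Plain substring match, so it works regardless of how each
        # log formats the id (in brackets, as key=value, and so on).
        if correlation_id in line:
            yield line.rstrip("\n")


def search_log(path, correlation_id):
    """Collect the matching lines from one log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(lines_with_correlation_id(handle, correlation_id))


# Small in-memory demo with shortened records modelled on this thread.
sample = [
    "2022-08-04 20:57:30,145+03 INFO [28353fa0-5e36-4fe8-8609-e74cd1da6d36] waiting\n",
    "2022-08-04 20:57:31,000+03 INFO [some-other-id] unrelated record\n",
]
for hit in lines_with_correlation_id(sample, "28353fa0-5e36-4fe8-8609-e74cd1da6d36"):
    print(hit)
```

On a real system the call would look something like `search_log("/var/log/ovirt-engine/engine.log", "28353fa0-5e36-4fe8-8609-e74cd1da6d36")`, repeated for the vdsm logs on the SPM and on the host running the VM; adjust the paths to your installation.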
>>
>> On Fri, Aug 5, 2022 at 8:37 AM Andrei Verovski <andreil1(a)starlett.lv> wrote:
>>
>> Hi, Benny,
>>
>> I have sent the log to your mailbox; it's too big to post here on the mailing list.
>>
>> It looks like the ghost task is still running. Does anything else need to be removed from
>> the Postgres DB?
>>
>> BTW, the frozen dead snapshot is in an invalid state. Is there any way to get rid of it?
>> I think it actually exists, but due to the invalid state it's not possible to do anything
>> with it.
>>
>>
>> ——
>>
>> Log file still shows zombie task:
>>
>> 2022-08-04 20:57:30,145+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-24) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
>> 2022-08-04 20:57:40,176+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-85) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
>> 2022-08-04 20:57:50,252+03 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-66) [28353fa0-5e36-4fe8-8609-e74cd1da6d36] Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' type:'CreateLiveSnapshotForVm' to complete
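
To see at a glance which child command the engine keeps polling, records like the ones above can be tallied per child command id. The sketch below is only an illustration: the regular expression is modelled on the three records quoted in this message (each assumed to be a single line in the real engine.log), and the names `WAITING` and `count_waiting` are invented here.

```python
import re
from collections import Counter

# Pattern modelled on the 'waiting on child command' records quoted above.
WAITING = re.compile(
    r"Command '(?P<parent>\w+)' \(id: '(?P<parent_id>[0-9a-f-]+)'\) "
    r"waiting on child command id: '(?P<child_id>[0-9a-f-]+)' "
    r"type:'(?P<child>\w+)'"
)


def count_waiting(lines):
    """Count how often each (child type, child id) pair is reported as pending."""
    counts = Counter()
    for line in lines:
        match = WAITING.search(line)
        if match:
            counts[(match["child"], match["child_id"])] += 1
    return counts


# Demo: the same record three times, as in the excerpt (timestamps omitted).
record = (
    "Command 'CreateSnapshotForVm' (id: '2f8b32d8-fd3c-46c9-90e9-4863d63c0530') "
    "waiting on child command id: 'ed816f9d-e25c-4b58-8c8f-fd0393abda2f' "
    "type:'CreateLiveSnapshotForVm' to complete"
)
for (child, child_id), seen in count_waiting([record] * 3).items():
    print(child, child_id, seen)
```

A count that keeps growing for the same child id long after the snapshot was requested is what a stuck task looks like in the log.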
>>
>>
>> On 4 Aug 2022, at 19:06, Benny Zlotnik <bzlotnik(a)redhat.com> wrote:
>>
>> can you share the logs after restarting ovirt-engine?
>>
>> On Thu, Aug 4, 2022 at 4:58 PM Andrei Verovski <andreil1(a)starlett.lv> wrote:
>>
>> Hi,
>>
>>
>> Creating a snapshot of one of the VMs failed, and the zombie task was killed with:
>>
>> su postgres
>> psql -d engine -U postgres
>> select * from job order by start_time desc;
>>
>> select DeleteJob('UUID_FROZEN_TASK_ID');
>>
>>
>> However, the VM remains in a locked state (with a lock sign to the lower left of the red “DOWN”
>> arrow in the status column of the web interface).
>>
>> I ran:
>> /usr/share/ovirt-engine/setup/dbutils/unlock_entity.sh -t all
>>
>> Then I rebooted the engine VM, still no luck. I can’t do anything with that VM.
>>
>> Please advise how to fix this.
>> Thanks in advance.
>> _______________________________________________
>> Users mailing list -- users(a)ovirt.org
>> To unsubscribe send an email to users-leave(a)ovirt.org
>> Privacy Statement:
https://www.ovirt.org/privacy-policy.html
>> oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
https://lists.ovirt.org/archives/list/users@ovirt.org/message/P2TVMLHC53J...