Adding Artur ...

On Mon, Oct 19, 2020 at 9:06 AM Liran Rotenberg <lrotenbe@redhat.com> wrote:
Hi all,
While investigating a bug in which running the same command more than once results in serial execution, I found a problem with our callbacks.

The problem shows up when a synchronous operation (such as running the ansible runner service) takes a while to execute.
When multiple commands are running and one of them performs a long synchronous operation within a child command, every command on the engine that uses callbacks hangs until that operation finishes.
The bug gives a good example: the export OVA command runs ansible synchronously, and the pack_ova script takes time to finish.
With short synchronous operations this is not really noticeable, but they are also executed serially instead of in parallel.
If you run the export to OVA command and then other commands with callbacks, those commands will hang once the export command's callback has started and reached performNextOperation.

Some technical details:
We have one thread in CommandCallbacksPoller [1] that runs, collects the current commands on the engine and processes them.
In the above scenario (let's say with 2 commands), the first one goes into callback.doPolling in invokeCallbackMethodsImpl [2].
From there it goes to ChildCommandsCallbackBase::doPolling and eventually to childCommandsExecutingEnded [3].
While there are more actions to perform, we call performNextOperation [4], which calls executeNextOperation (in the bug case, [5]).
When the next operation is long and synchronous, it blocks the CommandCallbacksPoller thread; only when it finishes is the thread released and the callbacks continue working.
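
To make the blocking concrete, here is a minimal standalone sketch of the scenario, not the engine code. All names in it (PollerBlockingDemo, Callback, fastCallbackDelayMillis, the 300 ms sleep) are made up for illustration; it only assumes, as described above, that one thread polls all callbacks in turn. It measures how long a second, fast callback waits when the first callback's next operation runs inline on the poller thread, compared with handing that operation to a separate thread pool, which is just one possible direction and not a proposed patch.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model only: none of these names exist in ovirt-engine.
public class PollerBlockingDemo {

    // Assumed duration of the long synchronous operation (stand-in for ansible).
    static final long SLOW_OPERATION_MILLIS = 300;

    /** Hypothetical stand-in for a command callback. */
    interface Callback {
        void doPolling() throws InterruptedException;
    }

    /**
     * Runs one polling cycle over two callbacks on a single poller thread and
     * returns how many milliseconds after the cycle started the fast callback ran.
     */
    static long fastCallbackDelayMillis(boolean offloadSlowOperation) throws Exception {
        ExecutorService poller = Executors.newSingleThreadExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        AtomicLong fastRanAtNanos = new AtomicLong();

        // Models a "next operation" that is long and synchronous.
        Runnable slowOperation = () -> {
            try {
                Thread.sleep(SLOW_OPERATION_MILLIS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // First callback: either blocks the poller thread or hands the work off.
        Callback slow = () -> {
            if (offloadSlowOperation) {
                workers.execute(slowOperation);
            } else {
                slowOperation.run();
            }
        };
        // Second callback: returns immediately, only records when it was polled.
        Callback fast = () -> fastRanAtNanos.set(System.nanoTime());

        long start = System.nanoTime();
        // The single poller thread processes the callbacks one after another.
        poller.submit(() -> {
            for (Callback callback : List.of(slow, fast)) {
                callback.doPolling();
            }
            return null;
        }).get();

        poller.shutdown();
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
        return TimeUnit.NANOSECONDS.toMillis(fastRanAtNanos.get() - start);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inline (poller blocked): fast callback ran after "
                + fastCallbackDelayMillis(false) + " ms");
        System.out.println("offloaded to worker pool: fast callback ran after "
                + fastCallbackDelayMillis(true) + " ms");
    }
}
```

In the inline case the fast callback cannot run until the sleep has finished, which mirrors the hang seen with the export OVA command. In the offloaded case it runs almost immediately, at the cost of having to track completion of the handed-off work separately.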

Any idea how to solve this issue?

Regards,
Liran

[1] - https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/tasks/CommandCallbacksPoller.java#L52-L55
[2] - https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/tasks/CommandCallbacksPoller.java#L175
[3] - https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/ChildCommandsCallbackBase.java#L80
[4] - https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/SerialChildCommandsExecutionCallback.java#L32
[5] - https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/exportimport/ExportVmToOvaCommand.java#L199-L231



--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.