I have overseen that many other commands handle this problem by using a
waiting lock, so the related commands in this issue should do the same.
I created
Adding a new external network fails during auto-sync is running
to track this.
From my point of view the other related two bugs I created are not
required anymore, because other commands seems to use a waiting lock,
too.
Should we close 1557419 and 1557424?
On Mon, 19 Mar 2018 09:40:32 +0100
Dominik Holler <dholler(a)redhat.com> wrote:
On Sat, 17 Mar 2018 16:16:49 +0200
Dan Kenigsberg <danken(a)redhat.com> wrote:
> Thanks for filing those, but let us not keep them secret:
>
> Bug 1557419 - Recommend that a call should be tried again
> sounds most reasonable. Providing more information to the client
> makes sense
>
> Bug 1557424 - Automatically retry call failed because he failed to
> acquire a lock
> is more tricky. when would REST perform the retry?
If the command returns a machine-readable hint that the conflict is
temporary.
> how often? We must make sure we never cause a livelock.
>
I agree that it is a good idea to have an upper limit for the count of
retries to prevent a livelock.
> Can we alternatively take a normal blocking lock when we create an
> external network (instead of a trylock)?
>
This might work, but currently no command is doing this.
(The lock is acquired in CommandBase.acquireLockInternal() which could
be overridden for all commands using an external network provider.)
But I would prefer a solution which solve the issue for other
commands, too.
> On Fri, Mar 16, 2018 at 4:52 PM, Dominik Holler <dholler(a)redhat.com>
> wrote:
> > I have created
> >
https://bugzilla.redhat.com/show_bug.cgi?id=1557419 and
> >
https://bugzilla.redhat.com/show_bug.cgi?id=1557424
> > to discuss how failing to acquire locks in engine should be
> > handled in engine's REST-API, in ovirt-sdk or in application (OST
> > in this case).
> >
> >
> > On Fri, 16 Mar 2018 09:32:47 +0000
> > Dafna Ron <dron(a)redhat.com> wrote:
> >
> >> Thank you for the fast reply and help.
> >>
> >> On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler
> >> <dholler(a)redhat.com> wrote:
> >>
> >> > On Thu, 15 Mar 2018 16:24:10 +0000
> >> > Dafna Ron <dron(a)redhat.com> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > We have a failure on master for test
> >> > > 098_ovirt_provider_ovn.use_ovn_provider in project
> >> > > cockpit-ovirt. This seems to be a race because object is
> >> > > locked. also, the actual failure is logged as WARN and not
> >> > > ERROR.
> >> > >
> >> > > I don't think the patch is actually related to the failure
> >> > > but I think the test should be fixed.
> >> > > can you please review to make sure we do not have an actual
> >> > > regression and let me know if we need to open a bz to fix the
> >> > > test?
> >> > >
> >> > >
> >> > > *Link and headline of suspected patches: *
> >> > > *https://gerrit.ovirt.org/#/c/89020/2
> >> > > <
https://gerrit.ovirt.org/#/c/89020/2> - *
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > *wizard: Enable scroll on start page for low-res screensLink
> >> > > to
> >> > >
Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374
> >> > >
<
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link
> >> > > to all
> >> > >
logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-
> >> > tester/6374/artifacts
> >> > > <
http://jenkins.ovirt.org/job/ovirt-master_change-queue-
> >> > tester/6374/artifacts>(Relevant)
> >> > > error snippet from the log: <error>2018-03-15
10:05:00,160-04
> >> > > DEBUG
> >> > > [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet]
> >> > > (default task-10) [] Sending json response2018-03-15
> >> > > 10:05:00,160-04 DEBUG
> >> > > [org.ovirt.engine.core.sso.utils.TokenCleanupUtility]
> >> > > (default task-10) [] Not cleaning up expired tokens2018-03-15
> >> > > 10:05:00,169-04 INFO
> >> > >
[org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
> >> > > (EE-ManagedThreadFactory-engineScheduled-Thread-90)
> >> > > [789edb23] Lock Acquired to object
> >> > > 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85-
> >> > 70c700df5483=PROVIDER]',
> >> > > sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO
> >> > >
[org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand]
> >> > > (EE-ManagedThreadFactory-engineScheduled-Thread-90)
> >> > > [789edb23] Running command: SyncNetworkProviderCommand
> >> > > internal: true.2018-03-15 10:05:00,228-04 DEBUG
> >> > > [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$
> >> > PostgresSimpleJdbcCall]
> >> > > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828]
> >> > > Compiled stored procedure. Call string is [{call
> >> > > getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04
> >> > > DEBUG
> >> > > [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$
> >> > PostgresSimpleJdbcCall]
> >> > > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828]
> >> > > SqlCall for procedure [GetDcIdByExternalNetworkId]
> >> > > compiled2018-03-15 10:05:00,229-04 DEBUG
> >> > >
[org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor]
> >> > > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828]
> >> > > method: runQuery, params: [GetAllExternalNetworksOnProvider,
> >> > > IdQueryParameters:{refresh='false',
filtered='false'}],
> >> > > timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO
> >> > > [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand]
> >> > > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828]
> >> > > Failed to Acquire Lock to object
> >> > > 'EngineLock:{exclusiveLocks='[network_1=NETWORK,
> >> > > c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]',
> >> > > sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN
> >> > > [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand]
> >> > > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828]
> >> > > Validation of action 'AddNetwork' failed for user
> >> > > admin@internal-authz. Reasons:
> >> > >
VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$
> >> > providerId
> >> > > c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15
> >> > > 10:05:00,240-04 DEBUG
> >> > >
[org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor]
> >> > > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828]
> >> > > method: runAction, params: [AddNetwork,
> >> > > AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1-
> >> > 49af-ad72-f907df8befcd',
> >> > > user='null', commandType='Unknown'}],
timeElapsed:
> >> > > 10ms2018-03-15 10:05:00,250-04 ERROR
> >> > > [org.ovirt.engine.api.restapi.resource.AbstractBackendResource]
> >> > > (default task-13) [] Operation Failed: [Cannot add Network.
> >> > > Related operation on provider with the id
> >> > > c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in
> >> > > progress. Please try again later.]2018-03-15 10:05:00,254-04
> >> > > DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter]
> >> > > (default task-14) [] Incoming locale 'en-US'. Filter
> >> > > determined locale to be 'en-US'2018-03-15
10:05:00,254-04
> >> > > DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet]
> >> > > (default task-14) [] Entered OAuthTokenServlet Query String:
> >> > > null, Parameters : password = ***, grant_type = password,
> >> > > scope = ovirt-app-api ovirt-ext=token-info:validate,
> >> > > username = admin@internal, </error>*
> >> >
> >> > I will care about this.
> >> > The problem is that SyncNetworkProviderCommand is running in
> >> > the background and locking the provider, which blocks the lock
> >> > for the tested AddNetworkCommand.
> >> > The related changes are
> >> > core: Add locking for Add and RemoveNetworkCommand
> >> >
https://gerrit.ovirt.org/#/c/85480/
> >> > and
> >> > core: Add SyncNetworkProviderCommand
> >> >
https://gerrit.ovirt.org/#/c/85134/
> >> >
> >> >
> >> >
> >
> > _______________________________________________
> > Devel mailing list
> > Devel(a)ovirt.org
> >
http://lists.ovirt.org/mailman/listinfo/devel
> >
> >