[ OST Failure Report ] [ oVirt Master (cockpit-ovirt) ] [ 15-03-2018 ] [ 098_ovirt_provider_ovn.use_ovn_provider ]

Hi, We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR. I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test? *Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - * *wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1-49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*

On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR.
I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test?
*Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - *
*wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1-49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/

Thank you for the fast reply and help. On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR.
I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test?
*Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - *
*wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85- 70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$ providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1- 49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/

I have created https://bugzilla.redhat.com/show_bug.cgi?id=1557419 and https://bugzilla.redhat.com/show_bug.cgi?id=1557424 to discuss how failing to acquire locks in engine should be handled in engine's REST-API, in ovirt-sdk or in application (OST in this case). On Fri, 16 Mar 2018 09:32:47 +0000 Dafna Ron <dron@redhat.com> wrote:
Thank you for the fast reply and help.
On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR.
I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test?
*Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - *
*wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85- 70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$ providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1- 49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/

Thanks for filing those, but let us not keep them secret: Bug 1557419 - Recommend that a call should be tried again sounds most reasonable. Providing more information to the client makes sense Bug 1557424 - Automatically retry call failed because he failed to acquire a lock is more tricky. when would REST perform the retry? how often? We must make sure we never cause a livelock. Can we alternatively take a normal blocking lock when we create an external network (instead of a trylock)? On Fri, Mar 16, 2018 at 4:52 PM, Dominik Holler <dholler@redhat.com> wrote:
I have created https://bugzilla.redhat.com/show_bug.cgi?id=1557419 and https://bugzilla.redhat.com/show_bug.cgi?id=1557424 to discuss how failing to acquire locks in engine should be handled in engine's REST-API, in ovirt-sdk or in application (OST in this case).
On Fri, 16 Mar 2018 09:32:47 +0000 Dafna Ron <dron@redhat.com> wrote:
Thank you for the fast reply and help.
On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR.
I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test?
*Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - *
*wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85- 70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$ providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1- 49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

On Sat, 17 Mar 2018 16:16:49 +0200 Dan Kenigsberg <danken@redhat.com> wrote:
Thanks for filing those, but let us not keep them secret:
Bug 1557419 - Recommend that a call should be tried again sounds most reasonable. Providing more information to the client makes sense
Bug 1557424 - Automatically retry call failed because he failed to acquire a lock is more tricky. when would REST perform the retry?
If the command returns a machine-readable hint that the conflict is temporary.
how often? We must make sure we never cause a livelock.
I agree that it is a good idea to have an upper limit for the count of retries to prevent a livelock.
Can we alternatively take a normal blocking lock when we create an external network (instead of a trylock)?
This might work, but currently no command is doing this. (The lock is acquired in CommandBase.acquireLockInternal() which could be overridden for all commands using an external network provider.) But I would prefer a solution which solve the issue for other commands, too.
On Fri, Mar 16, 2018 at 4:52 PM, Dominik Holler <dholler@redhat.com> wrote:
I have created https://bugzilla.redhat.com/show_bug.cgi?id=1557419 and https://bugzilla.redhat.com/show_bug.cgi?id=1557424 to discuss how failing to acquire locks in engine should be handled in engine's REST-API, in ovirt-sdk or in application (OST in this case).
On Fri, 16 Mar 2018 09:32:47 +0000 Dafna Ron <dron@redhat.com> wrote:
Thank you for the fast reply and help.
On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR.
I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test?
*Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - *
*wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85- 70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$ providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1- 49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

I have overseen that many other commands handle this problem by using a waiting lock, so the related commands in this issue should do the same. I created Adding a new external network fails during auto-sync is running https://bugzilla.redhat.com/show_bug.cgi?id=1558054 to track this. From my point of view the other related two bugs I created are not required anymore, because other commands seems to use a waiting lock, too. Should we close 1557419 and 1557424? On Mon, 19 Mar 2018 09:40:32 +0100 Dominik Holler <dholler@redhat.com> wrote:
On Sat, 17 Mar 2018 16:16:49 +0200 Dan Kenigsberg <danken@redhat.com> wrote:
Thanks for filing those, but let us not keep them secret:
Bug 1557419 - Recommend that a call should be tried again sounds most reasonable. Providing more information to the client makes sense
Bug 1557424 - Automatically retry call failed because he failed to acquire a lock is more tricky. when would REST perform the retry?
If the command returns a machine-readable hint that the conflict is temporary.
how often? We must make sure we never cause a livelock.
I agree that it is a good idea to have an upper limit for the count of retries to prevent a livelock.
Can we alternatively take a normal blocking lock when we create an external network (instead of a trylock)?
This might work, but currently no command is doing this. (The lock is acquired in CommandBase.acquireLockInternal() which could be overridden for all commands using an external network provider.) But I would prefer a solution which solve the issue for other commands, too.
On Fri, Mar 16, 2018 at 4:52 PM, Dominik Holler <dholler@redhat.com> wrote:
I have created https://bugzilla.redhat.com/show_bug.cgi?id=1557419 and https://bugzilla.redhat.com/show_bug.cgi?id=1557424 to discuss how failing to acquire locks in engine should be handled in engine's REST-API, in ovirt-sdk or in application (OST in this case).
On Fri, 16 Mar 2018 09:32:47 +0000 Dafna Ron <dron@redhat.com> wrote:
Thank you for the fast reply and help.
On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
Hi,
We have a failure on master for test 098_ovirt_provider_ovn.use_ovn_provider in project cockpit-ovirt. This seems to be a race because object is locked. also, the actual failure is logged as WARN and not ERROR.
I don't think the patch is actually related to the failure but I think the test should be fixed. can you please review to make sure we do not have an actual regression and let me know if we need to open a bz to fix the test?
*Link and headline of suspected patches: * *https://gerrit.ovirt.org/#/c/89020/2 <https://gerrit.ovirt.org/#/c/89020/2> - *
*wizard: Enable scroll on start page for low-res screensLink to Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link to all logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts <http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts>(Relevant) error snippet from the log: <error>2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] (default task-10) [] Sending json response2018-03-15 10:05:00,160-04 DEBUG [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] (default task-10) [] Not cleaning up expired tokens2018-03-15 10:05:00,169-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Lock Acquired to object 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85- 70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [789edb23] Running command: SyncNetworkProviderCommand internal: true.2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Compiled stored procedure. Call string is [{call getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 DEBUG [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] SqlCall for procedure [GetDcIdByExternalNetworkId] compiled2018-03-15 10:05:00,229-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runQuery, params: [GetAllExternalNetworksOnProvider, IdQueryParameters:{refresh='false', filtered='false'}], timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[network_1=NETWORK, c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] Validation of action 'AddNetwork' failed for user admin@internal-authz. Reasons: VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$ providerId c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 10:05:00,240-04 DEBUG [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] method: runAction, params: [AddNetwork, AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1- 49af-ad72-f907df8befcd', user='null', commandType='Unknown'}], timeElapsed: 10ms2018-03-15 10:05:00,250-04 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-13) [] Operation Failed: [Cannot add Network. Related operation on provider with the id c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in progress. Please try again later.]2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] (default task-14) [] Incoming locale 'en-US'. Filter determined locale to be 'en-US'2018-03-15 10:05:00,254-04 DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] (default task-14) [] Entered OAuthTokenServlet Query String: null, Parameters : password = ***, grant_type = password, scope = ovirt-app-api ovirt-ext=token-info:validate, username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

I think you should keep Bug 1557419 - Recommend that a call should be tried again but it's for Ondra to decide. On Mon, Mar 19, 2018 at 4:38 PM, Dominik Holler <dholler@redhat.com> wrote:
I have overseen that many other commands handle this problem by using a waiting lock, so the related commands in this issue should do the same. I created Adding a new external network fails during auto-sync is running https://bugzilla.redhat.com/show_bug.cgi?id=1558054 to track this.
From my point of view the other related two bugs I created are not required anymore, because other commands seems to use a waiting lock, too. Should we close 1557419 and 1557424?
On Mon, 19 Mar 2018 09:40:32 +0100 Dominik Holler <dholler@redhat.com> wrote:
On Sat, 17 Mar 2018 16:16:49 +0200 Dan Kenigsberg <danken@redhat.com> wrote:
Thanks for filing those, but let us not keep them secret:
Bug 1557419 - Recommend that a call should be tried again sounds most reasonable. Providing more information to the client makes sense
Bug 1557424 - Automatically retry call failed because he failed to acquire a lock is more tricky. when would REST perform the retry?
If the command returns a machine-readable hint that the conflict is temporary.
how often? We must make sure we never cause a livelock.
I agree that it is a good idea to have an upper limit for the count of retries to prevent a livelock.
Can we alternatively take a normal blocking lock when we create an external network (instead of a trylock)?
This might work, but currently no command is doing this. (The lock is acquired in CommandBase.acquireLockInternal() which could be overridden for all commands using an external network provider.) But I would prefer a solution which solve the issue for other commands, too.
On Fri, Mar 16, 2018 at 4:52 PM, Dominik Holler <dholler@redhat.com> wrote:
I have created https://bugzilla.redhat.com/show_bug.cgi?id=1557419 and https://bugzilla.redhat.com/show_bug.cgi?id=1557424 to discuss how failing to acquire locks in engine should be handled in engine's REST-API, in ovirt-sdk or in application (OST in this case).
On Fri, 16 Mar 2018 09:32:47 +0000 Dafna Ron <dron@redhat.com> wrote:
Thank you for the fast reply and help.
On Thu, Mar 15, 2018 at 8:21 PM, Dominik Holler <dholler@redhat.com> wrote:
On Thu, 15 Mar 2018 16:24:10 +0000 Dafna Ron <dron@redhat.com> wrote:
> Hi, > > We have a failure on master for test > 098_ovirt_provider_ovn.use_ovn_provider in project > cockpit-ovirt. This seems to be a race because object is > locked. also, the actual failure is logged as WARN and not > ERROR. > > I don't think the patch is actually related to the failure > but I think the test should be fixed. > can you please review to make sure we do not have an actual > regression and let me know if we need to open a bz to fix the > test? > > > *Link and headline of suspected patches: * > *https://gerrit.ovirt.org/#/c/89020/2 > <https://gerrit.ovirt.org/#/c/89020/2> - * > > > > > > > > > > > > > > > > > > > > > > > > > > > *wizard: Enable scroll on start page for low-res screensLink > to > Job:http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374 > <http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/6374>Link > to all > logs:http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts > <http://jenkins.ovirt.org/job/ovirt-master_change-queue- tester/6374/artifacts>(Relevant) > error snippet from the log: <error>2018-03-15 10:05:00,160-04 > DEBUG > [org.ovirt.engine.core.sso.servlets.OAuthTokenInfoServlet] > (default task-10) [] Sending json response2018-03-15 > 10:05:00,160-04 DEBUG > [org.ovirt.engine.core.sso.utils.TokenCleanupUtility] > (default task-10) [] Not cleaning up expired tokens2018-03-15 > 10:05:00,169-04 INFO > [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] > (EE-ManagedThreadFactory-engineScheduled-Thread-90) > [789edb23] Lock Acquired to object > 'EngineLock:{exclusiveLocks='[c38a67ec-0b48-4e6f-be85- 70c700df5483=PROVIDER]', > sharedLocks=''}'2018-03-15 10:05:00,184-04 INFO > [org.ovirt.engine.core.bll.provider.network.SyncNetworkProviderCommand] > (EE-ManagedThreadFactory-engineScheduled-Thread-90) > [789edb23] Running command: SyncNetworkProviderCommand > internal: true.2018-03-15 10:05:00,228-04 DEBUG > [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] > Compiled stored procedure. Call string is [{call > getdcidbyexternalnetworkid(?)}]2018-03-15 10:05:00,228-04 > DEBUG > [org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$ PostgresSimpleJdbcCall] > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] > SqlCall for procedure [GetDcIdByExternalNetworkId] > compiled2018-03-15 10:05:00,229-04 DEBUG > [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] > method: runQuery, params: [GetAllExternalNetworksOnProvider, > IdQueryParameters:{refresh='false', filtered='false'}], > timeElapsed: 353ms2018-03-15 10:05:00,239-04 INFO > [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] > Failed to Acquire Lock to object > 'EngineLock:{exclusiveLocks='[network_1=NETWORK, > c38a67ec-0b48-4e6f-be85-70c700df5483=PROVIDER]', > sharedLocks=''}'2018-03-15 10:05:00,239-04 WARN > [org.ovirt.engine.core.bll.network.dc.AddNetworkCommand] > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] > Validation of action 'AddNetwork' failed for user > admin@internal-authz. Reasons: > VAR__TYPE__NETWORK,VAR__ACTION__ADD,ACTION_TYPE_FAILED_PROVIDER_LOCKED,$ providerId > c38a67ec-0b48-4e6f-be85-70c700df54832018-03-15 > 10:05:00,240-04 DEBUG > [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor] > (default task-13) [e1328379-17b7-49f8-beb2-cf8331784828] > method: runAction, params: [AddNetwork, > AddNetworkStoragePoolParameters:{commandId='61b365ec-27c1- 49af-ad72-f907df8befcd', > user='null', commandType='Unknown'}], timeElapsed: > 10ms2018-03-15 10:05:00,250-04 ERROR > [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] > (default task-13) [] Operation Failed: [Cannot add Network. > Related operation on provider with the id > c38a67ec-0b48-4e6f-be85-70c700df5483 is currently in > progress. Please try again later.]2018-03-15 10:05:00,254-04 > DEBUG [org.ovirt.engine.core.utils.servlet.LocaleFilter] > (default task-14) [] Incoming locale 'en-US'. Filter > determined locale to be 'en-US'2018-03-15 10:05:00,254-04 > DEBUG [org.ovirt.engine.core.sso.servlets.OAuthTokenServlet] > (default task-14) [] Entered OAuthTokenServlet Query String: > null, Parameters : password = ***, grant_type = password, > scope = ovirt-app-api ovirt-ext=token-info:validate, > username = admin@internal, </error>*
I will care about this. The problem is that SyncNetworkProviderCommand is running in the background and locking the provider, which blocks the lock for the tested AddNetworkCommand. The related changes are core: Add locking for Add and RemoveNetworkCommand https://gerrit.ovirt.org/#/c/85480/ and core: Add SyncNetworkProviderCommand https://gerrit.ovirt.org/#/c/85134/
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
participants (3)
-
Dafna Ron
-
Dan Kenigsberg
-
Dominik Holler