
Heartbeat issue caused channel being closed and set to null which caused a surprise for a reactor. It seems that we may want to schedule closure of channel in this situation instead of calling it explicitly. What was the cause of the heartbeat issues? On Thu, Oct 27, 2016 at 4:12 AM, Allon Mureinik <amureini@redhat.com> wrote:
Seems like we cleared up the engine issues related to the recent injection changes.
I am now seeing stop errors, e.g.:
{"jsonrpc": "2.0", "id": "ea0c564f-bc17-4fc2-8f1b-67c4d28257c6", "result": {"cpuStatistics": {"1": {"cpuUser": "3.07", "nodeIndex": 0, "cpuSys": "3.00", "cpuIdle": "93.93"}, "0": {"cpuUser": "1.67", "nodeIndex": 0, "cpuSys": "2.07", "cpuIdle": "96.26"}}, "numaNodeMemFree": {"0": {"memPercent": 83, "memFree": "359"}}, "memShared": 0, "thpState": "always", "ksmMergeAcrossNodes": true, "vmCount": 0, "memUsed": "20", "storageDomains": {"b2bb3220-1eb3-426a-90c2-5e236aefbe1a": {"code": 0, "actual": true, "version": 0, "acquired": true, "delay": "0.000840117", "lastCheck": "7.1", "valid": true}, "3130195a-73f9-4490-b554-98a9205cead6": {"code": 0, "actual": true, "version": 4, "acquired": true, "delay": "0.00150771", "lastCheck": "7.5", "valid": true}, "1a9e202b-83b7-4bdc-9b0c-e76b83676068": {"code": 0, "actual": true, "version": 4, "acquired": true, "delay": "0.000590956", 2016-10-26 21:51:09,878 DEBUG [org.ovirt.engine.core.utils.timer.FixedDelayJobListener] (DefaultQuartzScheduler7) [6d206bd1] Rescheduling DEFAULT.org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods#-9223372036854775783 as there is no unfired trigger. 2016-10-26 21:51:28,705 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [383dd6a0] Heartbeat exceeded. Closing channel 2016-10-26 21:51:28,708 DEBUG [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [383dd6a0] Internal server error: null: java.lang.NullPointerException at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient.write(SSLClient.java:102) [vdsm-jsonrpc-java-client.jar:] at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.processOutgoing(ReactorClient.java:245) [vdsm-jsonrpc-java-client.jar:] at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.process(ReactorClient.java:208) [vdsm-jsonrpc-java-client.jar:] at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient.process(SSLClient.java:125) [vdsm-jsonrpc-java-client.jar:] at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.processChannels(Reactor.java:89) [vdsm-jsonrpc-java-client.jar:] at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.run(Reactor.java:65) [vdsm-jsonrpc-java-client.jar:]
Piotr - any idea?
On Wed, Oct 26, 2016 at 9:34 PM, Nadav Goldin <ngoldin@redhat.com> wrote:
Its running now: http://jenkins.ovirt.org/job/ovirt-engine_master_build-artif acts-fc24-x86_64/1037/, when it finishes it will trigger the deploy job which will trigger the experimental job. The build_artifacts job is triggered approximately every 2 hours(depends on the load). Inside the build-artifacts job you can see the last commit(https://gerrit.ovirt.org/65776, which is already after your commit)
On Wed, Oct 26, 2016 at 9:26 PM, Allon Mureinik <amureini@redhat.com> wrote:
@Infra - the last experimental job I saw was from ~17:30 local Israel time. Any idea why another one isn't being triggered (or am I just being daft)?
On Wed, Oct 26, 2016 at 6:27 PM, Allon Mureinik <amureini@redhat.com> wrote:
Yipes. [1] should fix that, waiting for the CI to run to merge.
[1] https://gerrit.ovirt.org/#/c/65768/
On Wed, Oct 26, 2016 at 3:42 PM, Nadav Goldin <ngoldin@redhat.com>
wrote:
Unfortunately it is still failing, see[1], the repository used was built from commit [2]. If you want to check the logs same links apply(just replace build number 2759->2782)
[1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ma
ster/2782
[2] https://gerrit.ovirt.org/#/c/65740/
On Wed, Oct 26, 2016 at 11:33 AM, Tal Nisan <tnisan@redhat.com> wrote:
Should be fixed now on master as those patches were just merged:
https://gerrit.ovirt.org/#/c/65738/ - "Move InjectorRule to VdsBroker": aligns the InjectorRule's package with the Injector so it can essentially be used wherever the Injector is used [+12, -4] https://gerrit.ovirt.org/#/c/65739 - "InjectorRule: override injector anyway": Fixes a bug in InjectorRule where the Injector is only mocked if you call InjectorRule.bind [+6, -1] https://gerrit.ovirt.org/#/c/65740 - "core: InjectorRule for injecting members": Fixes a bug in InjectorRule to allow using Injector.injectMembers when using it [+18, -5] https://gerrit.ovirt.org/#/c/65725 - "core: Fix AuditLogging": The actual fix. Basically, goes over all the places that create an AuditLoggableBase that needs injecting and take care of it [+155, -160]
On Wed, Oct 26, 2016 at 10:04 AM, Nadav Goldin <ngoldin@redhat.com> wrote: > > Hi, > We have new failure on OST from patches merged to master yesterday, > the failure started after the merge of [1], but as there were quite a > few patches merged quickly I can't make sure it is the one causing > it(OST aren't ran per-patch). > The test that fails is [2] when attempting to start the VM. > > The error from the API side: > > RequestError: > status: 500 > reason: Internal Server Error > detail: javax.ejb.EJBException: java.lang.NullPointerException > at > > org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx( CMTTxInterceptor.java:213) > at > > org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInte rceptor.java:265) > at > > org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxIntercep tor.java:374) > at > > org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTT xInterceptor.java:243) > at > > org.jboss.invocation.InterceptorContext.proceed(InterceptorC ontext.java:340) > .... > > In the engine logs there are a few 'java.lang.NullPointerExceptio n' > errors: > > 2016-10-25 11:53:52,845 INFO > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLo gableBase] > (org.ovirt.thread.pool-6-thread-2) [5e6a88be] Failed to get vds > 'd60db21f-95f0-487b-9f17-44861e2610a7', error: null > 2016-10-25 11:53:52,864 DEBUG > [org.ovirt.engine.core.utils.timer.FixedDelayJobListener] > (DefaultQuartzScheduler5) [] Rescheduling > > > DEFAULT.org.ovirt.engine.core.bll.tasks.AsyncTaskManager.tim erElapsed#-9223372036854775787 > as there is no unfired trigger. > ... > 2016-10-25 11:53:52,845 DEBUG > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLo gableBase] > (org.ovirt.thread.pool-6-thread-2) [5e6a88be] Exception: > java.lang.NullPointerException > at > > org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLog ableBase.getVdsStatic(AuditLogableBase.java:633) > [dal.jar:] > at > > org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLog ableBase.getVdsName(AuditLogableBase.java:504) > [dal.jar:] > ... > 2016-10-25 11:53:52,837 INFO > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLo gableBase] > (org.ovirt.thread.pool-6-thread-2) [5e6a88be] Failed to get vds > 'd60db21f-95f0-487b-9f17-44861e2610a7', error: null > 2016-10-25 11:53:52,837 DEBUG > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLo gableBase] > (org.ovirt.thread.pool-6-thread-2) [5e6a88be] Exception: > java.lang.NullPointerException > at > > org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLog ableBase.getVdsStatic(AuditLogableBase.java:633) > [dal.jar:] > at > > org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLog ableBase.getVdsName(AuditLogableBase.java:504) > [dal.jar:] > ... > > The full engine logs can be found here[3] and the entire test suite > logs here[4]. > > Can anyone have a look? > > Thanks, > Nadav. > > > [1] https://gerrit.ovirt.org/#/c/65198/ > [2] > > https://github.com/oVirt/ovirt-system-tests/blob/master/ basic_suite_master/test-scenarios/004_basic_sanity.py#L322 > [3] > > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ma ster/2759/artifact/exported-artifacts/basic_suite_master.sh- fc24/exported-artifacts/test_logs/basic_suite_master/post- 004_basic_sanity.py/*zip*/post-004_basic_sanity.py.zip > [4] > > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ma ster/2759/artifact/exported-artifacts/basic_suite_master.sh- fc24/exported-artifacts/ > _______________________________________________ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel