[ovirt-devel] test-repo_ovirt_experimental_master fails on jsonrpc errors

Yaniv Kaul ykaul at redhat.com
Thu Jan 12 15:18:46 UTC 2017


On Thu, Jan 12, 2017 at 4:58 PM, Piotr Kliczewski <
piotr.kliczewski at gmail.com> wrote:

> On Thu, Jan 12, 2017 at 8:52 AM, Roy Golan <rgolan at redhat.com> wrote:
> >
> >
> > On 11 January 2017 at 21:50, Piotr Kliczewski <
> piotr.kliczewski at gmail.com>
> > wrote:
> >>
> >> On Wed, Jan 11, 2017 at 9:47 AM, Daniel Belenky <dbelenky at redhat.com>
> >> wrote:
> >> > Hi all,
> >> >
> >> > The following job: test-repo_ovirt_experimental_master fails to pass
> the
> >> > basic_suite.
> >> > The job was triggered by this merge: https://gerrit.ovirt.org/#/c/
> 69936/
> >> > to
> >> > vdsm project.
> >> >
> >> > The error I suspect cause this issue:
> >> >
> >> > 2017-01-11 03:32:26,061-05 DEBUG
> >> > [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker]
> (ResponseWorker)
> >> > []
> >> > Message received:
> >> >
> >> > {"jsonrpc":"2.0","error":{"code":"192.168.201.2:990178830
> ","message":"Vds
> >> > timeout occured"},"id":null}
> >> > 2017-01-11 03:32:26,067-05 ERROR
> >> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.
> AuditLogDirector]
> >> > (DefaultQuartzScheduler7) [57bc898] Correlation ID: null, Call Stack:
> >> > null,
> >> > Custom Event ID: -1, Message: VDSM command failed: Message timeout
> which
> >> > can
> >> > be caused by communication issues
> >> > 2017-01-11 03:32:26,069-05 ERROR
> >> > [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> >> > (DefaultQuartzScheduler7) [57bc898] ERROR,
> GetStoragePoolInfoVDSCommand(
> >> > GetStoragePoolInfoVDSCommandParameters:{runAsync='true',
> >> > storagePoolId='f92af272-934f-4327-9db0-afe353e6f61c',
> >> > ignoreFailoverLimit='true'}), exception: VDSGenericException:
> >> > VDSNetworkException: Message timeout which can be caused by
> >> > communication
> >> > issues, log id: 2f12b94a
> >> > 2017-01-11 03:32:26,069-05 ERROR
> >> > [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> >> > (DefaultQuartzScheduler7) [57bc898] Exception:
> >> > org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> >> > VDSGenericException: VDSNetworkException: Message timeout which can be
> >> > caused by communication issues
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.
> proceedProxyReturnValue(BrokerCommandBase.java:188)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.
> GetStoragePoolInfoVDSCommand.executeIrsBrokerCommand(
> GetStoragePoolInfoVDSCommand.java:32)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.lambda$
> executeVDSCommand$0(IrsBrokerCommand.java:95)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy.
> runInControlledConcurrency(IrsProxy.java:262)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.
> executeVDSCommand(IrsBrokerCommand.java:92)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.VDSCommandBase.
> executeCommand(VDSCommandBase.java:73)
> >> > [vdsbroker.jar:]
> >> > at
> >> > org.ovirt.engine.core.dal.VdcCommandBase.execute(
> VdcCommandBase.java:33)
> >> > [dal.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.vdsbroker.DefaultVdsCommandExecutor.
> execute(DefaultVdsCommandExecutor.java:14)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.ResourceManager.
> runVdsCommand(ResourceManager.java:408)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy.
> proceedStoragePoolStats(IrsProxy.java:348)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy.
> lambda$updatingTimerElapsed$0(IrsProxy.java:246)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy.
> runInControlledConcurrency(IrsProxy.java:262)
> >> > [vdsbroker.jar:]
> >> > at
> >> >
> >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy.
> updatingTimerElapsed(IrsProxy.java:227)
> >> > [vdsbroker.jar:]
> >> > at sun.reflect.GeneratedMethodAccessor191.invoke(Unknown Source)
> >> > [:1.8.0_111]
> >> > at
> >> >
> >> > sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> >> > [rt.jar:1.8.0_111]
> >> > at java.lang.reflect.Method.invoke(Method.java:498)
> [rt.jar:1.8.0_111]
> >> > at
> >> >
> >> > org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(
> JobWrapper.java:77)
> >> > [scheduler.jar:]
> >> > at
> >> > org.ovirt.engine.core.utils.timer.JobWrapper.execute(
> JobWrapper.java:51)
> >> > [scheduler.jar:]
> >> > at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
> [quartz.jar:]
> >> > at
> >> > java.util.concurrent.Executors$RunnableAdapter.
> call(Executors.java:511)
> >> > [rt.jar:1.8.0_111]
> >> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >> > [rt.jar:1.8.0_111]
> >> > at
> >> >
> >> > java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> >> > [rt.jar:1.8.0_111]
> >> > at
> >> >
> >> > java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> >> > [rt.jar:1.8.0_111]
> >> > at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_111]
> >> >
> >>
> >> This exception occurs when a response do not arrive within specified
> >> period of time.
> >>
> >> Here we can have potential reason for it:
> >>
> >> 2017-01-11 03:23:17,944 ERROR (jsonrpc/5) [storage.TaskManager.Task]
> >> (Task='db7a84ba-d89c-4ac7-ab7a-b9f409ea7365') Unexpected error
> >> (task:870)
> >> Traceback (most recent call last):
> >>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
> >>     return fn(*args, **kargs)
> >>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in
> >> wrapper
> >>     res = f(*args, **kwargs)
> >>   File "/usr/share/vdsm/storage/hsm.py", line 3054, in getVolumeInfo
> >>     volUUID=volUUID).getInfo()
> >>   File "/usr/share/vdsm/storage/sd.py", line 748, in produceVolume
> >>     volUUID)
> >>   File "/usr/share/vdsm/storage/blockVolume.py", line 415, in __init__
> >>     manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
> >>   File "/usr/share/vdsm/storage/blockVolume.py", line 69, in __init__
> >>     volUUID)
> >>   File "/usr/share/vdsm/storage/volume.py", line 84, in __init__
> >>     self.validate()
> >>   File "/usr/share/vdsm/storage/blockVolume.py", line 159, in validate
> >>     raise se.VolumeDoesNotExist(self.volUUID)
> >> VolumeDoesNotExist: Volume does not exist:
> >> (u'5a464296-ebea-4d1f-a299-cec45e82f9f3',)
> >
> >
> >
> > If that is the case then we should reply with an error on this api call
> and
> > not ignore and let the time out trigger. Do you think this what happen?
>
> Reply to what? Let me say how I see it.
>
> We send to vdsm a request which takes a bit of time to process. During
> processing
> there was an issue that we had to reset the connection which reset
> tracking and buffers.
>

What issue?
I'm concerned about those unexplained disconnections. MOM from VDSM and
Engine from VDSM.
Y.



> Once response arrive we have no information about it.
>
> When the connection is reset we let the command code know that the
> issue occurred
> so it is up to the code to handle the issue - retry or fail.
>
> >>
> >>
> >> >
> >> > Attached is a zip file with all artifacts from Jenkins.
> >> >
> >> > The error I've mentioned above is found in:
> >> >
> >> >
> >> > exported-artifacts/test_logs/basic-suite-master/post-004_
> basic_sanity.py/lago-basic-suite-master-engine/_var_log_
> ovirt-engine/engine.log
> >> >
> >> > Can some one advise?
> >> >
> >> > Thanks,
> >> > --
> >> > Daniel Belenky
> >> > RHV DevOps
> >> > Red Hat Israel
> >> >
> >> > _______________________________________________
> >> > Devel mailing list
> >> > Devel at ovirt.org
> >> > http://lists.ovirt.org/mailman/listinfo/devel
> >> _______________________________________________
> >> Devel mailing list
> >> Devel at ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/devel
> >
> >
> _______________________________________________
> Devel mailing list
> Devel at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ovirt.org/pipermail/devel/attachments/20170112/10ab7dc6/attachment-0001.html>


More information about the Devel mailing list