[ovirt-devel] [ OST Failure Report ] [ oVirt master ] [ 14/05/2017 ] [ test-repo_ovirt_experimental_master ]

Piotr Kliczewski piotr.kliczewski at gmail.com
Thu May 18 14:42:41 UTC 2017


Yevgeny,

There is message id which matches request and response but in this
case I made an assumption about setupNetworks being exclusive
operation.
The following setupNetwoks was the last one which was last executed.

Thanks,
Piotr

On Thu, May 18, 2017 at 1:28 PM, Yevgeny Zaspitsky <yzaspits at redhat.com> wrote:
> Piotr,
>
> How do you identify the connection between the 2 message, why do you think
> that the timeout is related to the first message you've mentioned?
>
> Regards,
> Yevgeny
>
> On Mon, May 15, 2017 at 12:40 PM, Piotr Kliczewski
> <piotr.kliczewski at gmail.com> wrote:
>>
>> I agree with Nir that the above issue is not related to the job failure.
>>
>> I see that there is a timeout for:
>>
>> 2017-05-14 04:10:04,215-04 DEBUG
>> [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompCommonClient]
>> (org.ovirt.thread.pool-7-thread-12) [] Message sent: SEND
>> destination:jms.topic.vdsm_requests
>> content-length:377
>> reply-to:jms.topic.vdsm_responses
>>
>> <JsonRpcRequest id: "846f4741-5c7d-4396-8ed7-ce995d426728", method:
>> Host.setupNetworks, params: {networks={Migration_Net={vlan=200,
>> ipv6autoconf=false, nic=eth0, bridged=false, dhcpv6=false, mtu=1500,
>> ipv6addr=2001:0db8:85a3:0000:0000:574c:14ea:0a02/64, switch=legacy}},
>> bondings={}, options={connectivityTimeout=120,
>> connectivityCheck=true}}>
>>
>>
>> and the timeout occurred at:
>>
>> 2017-05-14 04:10:07,535-04 DEBUG
>> [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker]
>> (ResponseWorker) [] Message received:
>>
>> {"jsonrpc":"2.0","error":{"code":"lago-basic-suite-master-host1:286640862","message":"Vds
>> timeout occured"},"id":null}
>>
>> I see in supervdsm log:
>>
>> MainProcess|jsonrpc/1::INFO::2017-05-14
>> 04:10:05,566::netconfpersistence::52::root::(setNetwork) Adding
>> network Migration_Net({u'ipv6autoconf': False, 'nameservers': [],
>> u'nic': u'eth0', u'vlan': 200, u'mtu': 1500, u'switch': u'legacy',
>> u'dhcpv6': False, u'bridged': False, u'ipv6addr':
>> u'2001:0db8:85a3:0000:0000:574c:14ea:0a02/64', 'defaultRoute': False,
>> 'bootproto': 'none'})
>> ...
>> MainProcess|jsonrpc/1::WARNING::2017-05-14
>> 04:10:05,580::ifcfg::247::root::(_addSourceRoute) Invalid input for
>> source routing: name=eth0.200, addr=None, netmask=None, gateway=None
>> netlink/events::DEBUG::2017-05-14
>> 04:10:05,581::concurrent::184::root::(run) START thread
>> <Thread(netlink/events, started daemon 139829765465856)> (func=<bound
>> method Monitor._scan of <vdsm.network.netlink.monitor.Monitor object
>> at 0x7f2c8c01ebd0>>, args=(), kwargs={})
>> MainProcess|jsonrpc/1::WARNING::2017-05-14
>> 04:10:05,581::ifcfg::894::root::(_ignore_missing_device) eth0.200
>> didn't exist (yet) and so IPv6 wasn't enabled.
>>
>> My question is why only 3 seconds to timeout setupNetworks.
>>
>>
>>
>> I found one more exception not related to the above issue which we should
>> fix:
>>
>> 2017-05-14 04:08:08,924-04 DEBUG
>> [org.ovirt.engine.core.utils.timer.SchedulerUtilQuartzImpl]
>> (DefaultQuartzScheduler10) [58fd90c1] Exception:
>> java.lang.reflect.InvocationTargetException
>> at sun.reflect.GeneratedMethodAccessor178.invoke(Unknown Source)
>> [:1.8.0_131]
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> [rt.jar:1.8.0_131]
>> at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_131]
>> at
>> org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:83)
>> [scheduler.jar:]
>> at
>> org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:55)
>> [scheduler.jar:]
>> at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> [rt.jar:1.8.0_131]
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> [rt.jar:1.8.0_131]
>> at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
>> Caused by: java.lang.NullPointerException
>> at
>> org.ovirt.engine.core.vdsbroker.monitoring.VmStatsRefresher.lambda$processDevices$0(VmStatsRefresher.java:44)
>> [vdsbroker.jar:]
>> at
>> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>> [rt.jar:1.8.0_131]
>> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
>> [rt.jar:1.8.0_131]
>> at
>> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>> [rt.jar:1.8.0_131]
>> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>> [rt.jar:1.8.0_131]
>> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>> [rt.jar:1.8.0_131]
>> at
>> org.ovirt.engine.core.vdsbroker.monitoring.VmStatsRefresher.processDevices(VmStatsRefresher.java:46)
>> [vdsbroker.jar:]
>> at
>> org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher.poll(PollVmStatsRefresher.java:44)
>> [vdsbroker.jar:]
>> ... 11 more
>>
>> Thanks,
>> Piotr
>>
>>
>> On Sun, May 14, 2017 at 6:12 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>> > On Sun, May 14, 2017 at 5:14 PM Shlomo Ben David <sbendavi at redhat.com>
>> > wrote:
>> >>
>> >> Link to suspected patches: N/A
>> >> Link to Job:
>> >> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6665/
>> >> Link to all logs:
>> >>
>> >> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6665/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-006_migrations.py/
>> >>
>> >> Error snippet from the log:
>> >
>> >
>> >> Traceback (most recent call last):
>> >>
>> >>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>> >> 878,
>> >> in _run
>> >>     return fn(*args, **kargs)
>> >>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in
>> >> wrapper
>> >>     res = f(*args, **kwargs)
>> >>   File "/usr/share/vdsm/storage/hsm.py", line 2132, in
>> >> getAllTasksStatuses
>> >>     allTasksStatus = self._pool.getAllTasksStatuses()
>> >>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py",
>> >> line
>> >> 77, in wrapper
>> >>     raise SecureError("Secured object is not in safe state")
>> >> SecureError: Secured object is not in safe state
>> >
>> >
>> > This means that getAllTasksStatuses (SPM verb) was called on a host
>> > which is not the SPM. This is an error in the caller, not in vdsm.
>> >
>> > We already have a bug abut this:
>> > https://bugzilla.redhat.com/1426429
>> >
>> > However this build logs do not contain enough info. These
>> > patches adds more info that can help to identify the caller:
>> > https://gerrit.ovirt.org/76771
>> >
>> > This error is not related to the real error in this build.
>> >
>> > Nir
>> >
>> > _______________________________________________
>> > Devel mailing list
>> > Devel at ovirt.org
>> > http://lists.ovirt.org/mailman/listinfo/devel
>> _______________________________________________
>> Devel mailing list
>> Devel at ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/devel
>
>


More information about the Devel mailing list