[ovirt-devel] [OST Failure Report] [oVirt master] [09.02.2017] [test-repo_ovirt_experimental_master]

Piotr Kliczewski pkliczew at redhat.com
Tue Feb 21 15:53:28 UTC 2017


Dominik,

Here is the exception that caused "Not able to update response":

2017-02-09 06:30:41,828-05 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-7-thread-13) [1abe9b01] EVENT_ID:
VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack:
null, Custom Event ID: -1, Message: VDSM lago-basic-suite-master-host1
command HostSetupNetworksVDS failed: 880 > 879
2017-02-09 06:30:41,829-05 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand]
(org.ovirt.thread.pool-7-thread-13) [1abe9b01] Command
'HostSetupNetworksVDSCommand(HostName = lago-basic-suite-master-host1,
HostSetupNetworksVdsCommandParameters:{runAsync='true',
hostId='42fa76b6-7bf9-463e-80fa-fcca9b0d3e81',
vds='Host[lago-basic-suite-master-host1,42fa76b6-7bf9-463e-80fa-fcca9b0d3e81]',
rollbackOnFailure='true', connectivityTimeout='120',
networks='[HostNetwork:{defaultRoute='false', bonding='false',
networkName='Labeled_Network', nicName='eth0', vlan='600', mtu='0',
vmNetwork='false', stp='false', properties='null',
ipv4BootProtocol='NONE', ipv4Address='null', ipv4Netmask='null',
ipv4Gateway='null', ipv6BootProtocol='NONE', ipv6Address='null',
ipv6Prefix='null', ipv6Gateway='null'}]', removedNetworks='[]',
bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY'})' execution
failed: VDSGenericException: VDSNetworkException: 880 > 879
2017-02-09 06:30:41,829-05 DEBUG
[org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand]
(org.ovirt.thread.pool-7-thread-13) [1abe9b01] Exception:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
VDSGenericException: VDSNetworkException: 880 > 879
	at org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase.proceedProxyReturnValue(BrokerCommandBase.java:188) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.vdsbroker.FutureVDSCommand.get(FutureVDSCommand.java:70) [vdsbroker.jar:]
	at org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand.executeCommand(HostSetupNetworksCommand.java:290) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1251) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1391) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2055) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1451) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:397) [bll.jar:]

This could be connected to the specific version of Jackson that we use.
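
The log excerpt above is hard-wrapped by the mail client, so a single
engine.log entry spans several lines. As a minimal sketch for re-checking
it against the original file, the Python below regroups wrapped lines into
entries and filters them by the flow tag (here 1abe9b01). The function
names are hypothetical. It assumes that every entry starts with a timestamp
in the format shown above and that the wraps fall on spaces.

```python
import re

# Assumption: each engine.log entry begins with a timestamp such as
# "2017-02-09 06:30:41,828-05"; any other line continues the previous entry.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2}")


def regroup_entries(lines):
    """Join wrapped lines back into one string per log entry."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            # Continuation line (wrapped text or a stack frame).
            entries[-1] += " " + line
    return entries


def entries_for_flow(lines, flow_id, level=None):
    """Return regrouped entries tagged with [flow_id], optionally by level."""
    tag = "[" + flow_id + "]"
    selected = [entry for entry in regroup_entries(lines) if tag in entry]
    if level is not None:
        # The level is the third whitespace-separated field: date, time, level.
        selected = [e for e in selected if e.split()[2] == level]
    return selected
```

Calling entries_for_flow(open("engine.log"), "1abe9b01", level="ERROR")
should then return the ERROR entries of this flow as single strings, stack
frames included.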


On Tue, Feb 21, 2017 at 4:42 PM, Dominik Holler <dholler at redhat.com> wrote:

> A deep analysis of the log files gives details about the
> unexpected behavior, but unfortunately I cannot yet name the fault
> causing it.
>
> To find this fault, the help of someone familiar with
> org.ovirt.vdsm.jsonrpc.client.JsonRpcClient is needed.
>
> In the failing test "assign_labeled_network" a (labeled) network is
> assigned to the cluster. For this reason the network has to be added to
> the hosts. After that, the test checks whether the engine acknowledges
> that the hosts are in the labeled network. This execution of the test
> failed because the engine's acknowledgment was still missing after
> 180 seconds [3].
>
> There are two hosts lago-basic-suite-master-host0 and
> lago-basic-suite-master-host1 in the scenario.
> lago-basic-suite-master-host1 fails and
> lago-basic-suite-master-host0 succeeds, so only
> lago-basic-suite-master-host1 is analyzed below.
>
> Please find here the most relevant steps causing this error:
> 1. The engine sends Host.setupNetworks to the hosts in
>    lines 40279-40295 of [1] with
>    "id":"02298344-165f-47e4-9ea4-7c17a55d37f8".
> 2. The host executes the Host.setupNetworks RPC call successfully in
>    line 1286 in [2].
> 3. The engine receives the acknowledgment of the successful execution
>    in lines 40716 and 40717 of [1].
> 4. The error occurs in line 40718:
>    '[org.ovirt.vdsm.jsonrpc.client.JsonRpcClient] (ResponseWorker) []
>    Not able to update response for
>    "02298344-165f-47e4-9ea4-7c17a55d37f8"'. This means the engine
>    cannot process the acknowledgment of the successful execution.
> 5. The command HostSetupNetworksVDS is aborted.
>    So Host.getCapabilities is skipped and the engine database is not
>    updated with the new network configuration of the host.
> 6. Since the test script relies on the information in the database
>    about the host network configuration, it does not see that
>    Host.setupNetworks was executed successfully and stops with the
>    error "False != True after 180 seconds" [3].
>
> So the fault happens in or before step 4 and is related to the
> JSON-RPC communication.
>
> Pinpointing the exact location of the fault remains an open action item.
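
The steps above follow one JSON-RPC id through engine.log [1] and
vdsm.log [2]. A minimal sketch of that lookup in Python is given below. The
function names are hypothetical and nothing here is part of oVirt or of
JsonRpcClient; it only lists the lines of a log file that mention a given
string, with 1-based line numbers so they can be compared with the numbers
quoted in the steps.

```python
def find_mentions(lines, needle):
    """Return (1-based line number, line text) for every line containing needle."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


def trace_request(path, request_id):
    """Scan the log file at `path` for a JSON-RPC request id."""
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_mentions(handle, request_id)
```

For example, trace_request("engine.log",
"02298344-165f-47e4-9ea4-7c17a55d37f8") can be run on a local copy of [1].
If the analysis above holds, the hits should include the request in lines
40279-40295 and the "Not able to update response" entry in line 40718.
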
>
>
>
> [1]
>   http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log
>
> [2]
>   http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/lago-basic-suite-master-host1/_var_log/vdsm/vdsm.log
>
> [3]
>   http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/testReport/junit/(root)/005_network_by_label/assign_labeled_network/
>
>
>
> On Thu, 9 Feb 2017 14:52:52 +0200
> Shlomo Ben David <sbendavi at redhat.com> wrote:
>
> > Hi,
> >
> >
> > *Test failed:* [test-repo_ovirt_experimental_master]
> >
> > *Link to suspected patches:* n/a
> >
> > *Link to Job:*
> > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217
> >
> > *Link to all logs:*
> > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/5217/artifact/exported-artifacts/basic-suit-master-el7/test_logs/basic-suite-master/post-005_network_by_label.py/
> >
> > *Error snippet from the log: *
> >
> > <error>
> >
> > ifup/VLAN100_Network::ERROR::2017-02-09 06:21:15,236::concurrent::189::root::(run) FINISH thread <Thread(ifup/VLAN100_Network, started daemon 140189099861760)> failed
> > Traceback (most recent call last):
> >   File "/usr/lib/python2.7/site-packages/vdsm/concurrent.py", line 185, in run
> >     ret = func(*args, **kwargs)
> >   File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 949, in _exec_ifup
> >     _exec_ifup_by_name(iface.name, cgroup)
> >   File "/usr/lib/python2.7/site-packages/vdsm/network/configurators/ifcfg.py", line 935, in _exec_ifup_by_name
> >     raise ConfigNetworkError(ERR_FAILED_IFUP, out[-1] if out else '')
> > ConfigNetworkError: (29, 'Determining IPv6 information for VLAN100_Network... failed.')
> >
> > </error>
> >
> > Best Regards,
> >
> > Shlomi Ben-David | Software Engineer | Red Hat ISRAEL
> > RHCSA | RHCVA | RHCE
> > IRC: shlomibendavid (on #rhev-integ, #rhev-dev, #rhev-ci)
> >
> > OPEN SOURCE - 1 4 011 && 011 4 1
>
>