On Fri, Jul 6, 2018 at 7:05 PM Edward Haas <ehaas@redhat.com> wrote:


On 6 Jul 2018, at 18:41, Nir Soffer <nsoffer@redhat.com> wrote:



On Fri, 6 Jul 2018, 18:25 Edward Haas, <ehaas@redhat.com> wrote:


On 6 Jul 2018, at 14:35, Nir Soffer <nsoffer@redhat.com> wrote:

On Fri, Jul 6, 2018 at 1:12 PM Edward Haas <ehaas@redhat.com> wrote:
I do not know if it is relevant or not, but the tests that travis runs for master are taken from the 4.2 branch.
OVS tests are now running using pytest.

What do you mean by "taken from 4.2 branch"?

I mean that the branch checked out is 4.2 and not master. It even says so on the console output.

Can you share the url of that build?

I just clicked the icon on the vdsm repo: https://travis-ci.org/oVirt/vdsm

This is indeed a 4.2 build. Every commit pushed to GitHub is tested in Travis.
We would also like to fix the 4.2 builds, but first we need to fix the master builds.

You can see here that the master builds fail:
https://travis-ci.org/oVirt/vdsm/builds

Since we added gdb and python-debuginfo:
https://travis-ci.org/oVirt/vdsm/builds/400644077

- CentOS build fails (network-py27)
  https://travis-ci.org/oVirt/vdsm/jobs/400644079

- Fedora 28 build passes
  https://travis-ci.org/oVirt/vdsm/jobs/400644081

- Fedora rawhide build fails because we cannot rebuild the image;
  python-libblockdev is missing in rawhide.
  https://travis-ci.org/oVirt/vdsm/jobs/400644083
  See https://lists.ovirt.org/archives/list/devel@ovirt.org/thread/CDNETITY5RYOCQBIQQF2NUF6RAHGJRPW/


 This is the first failing test:


__________________ TestOvsApiBase.test_execute_a_transaction ___________________
self = <network.integration.ovs.ovs_driver_test.TestOvsApiBase testMethod=test_execute_a_transaction>

    def test_execute_a_transaction(self):
        ovsdb = create()
        cmd_add_br = ovsdb.add_br(TEST_BRIDGE)
        cmd_list_bridge_info = ovsdb.list_bridge_info()
        t = ovsdb.transaction()
        t.add(cmd_add_br)
        t.add(cmd_list_bridge_info)
>       t.commit()

network/integration/ovs/ovs_driver_test.py:51:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <vdsm.network.ovs.driver.vsctl.Transaction object at 0x7f72a7989b50>

    def commit(self):
        if not self.commands:
            return
        timeout_option = []
        if self.timeout:
            timeout_option = ['--timeout=' + str(self.timeout)]
        args = []
        for command in self.commands:
            args += ['--'] + command.cmd
        exec_line = [_ovs_vsctl_cmd()] + timeout_option + OUTPUT_FORMAT + args
        logging.debug('Executing commands: %s' % ' '.join(exec_line))
        rc, out, err = netcmd.exec_sync(exec_line)
        if rc != 0:
            err = err.splitlines()
            if OvsDBConnectionError.is_ovs_db_conn_error(err):
                raise OvsDBConnectionError('\n'.join(err))
            else:
                raise ConfigNetworkError(
                    ne.ERR_BAD_PARAMS,
>                   'Executing commands failed: %s' % '\n'.join(err))
E               ConfigNetworkError: (21, 'Executing commands failed: 2018-07-05T22:24:25Z|00002|fatal_signal|WARN|terminating with signal 14 (Alarm clock)')

../lib/vdsm/network/ovs/driver/vsctl.py:78: ConfigNetworkError
------------------------------ Captured log call -------------------------------
vsctl.py 68 DEBUG Executing commands: /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- add-br vdsmbr_test -- list Bridge
cmdutils.py 151 DEBUG /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- add-br vdsmbr_test -- list Bridge (cwd None)
cmdutils.py 159 DEBUG FAILED: <err> = '2018-07-05T22:24:25Z|00002|fatal_signal|WARN|terminating with signal 14 (Alarm clock)\n'; <rc> = -14
---------------------------- Captured log teardown -----------------------------
vsctl.py 68 DEBUG Executing commands: /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge
cmdutils.py 151 DEBUG /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge (cwd None)
cmdutils.py 159 DEBUG SUCCESS: <err> = ''; <rc> = 0

It failed with a timeout: ovs-vsctl was run with --timeout=5 and was terminated by signal 14 (Alarm clock).
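
The `<rc> = -14` in the captured log matches how Python's subprocess module reports a child killed by a signal: the return code is the negated signal number, and 14 is SIGALRM on Linux. A minimal sketch, assuming a POSIX host and Python 3; the child below sends SIGALRM to itself and only stands in for ovs-vsctl, it is not the real command:

```python
import signal
import subprocess
import sys

# Child that terminates itself with SIGALRM. This is an illustrative
# stand-in for a command killed by its own timeout alarm.
CHILD = "import os, signal; os.kill(os.getpid(), signal.SIGALRM)"


def run_child():
    """Run the child and return its exit status as subprocess reports it."""
    proc = subprocess.run([sys.executable, "-c", CHILD])
    return proc.returncode


rc = run_child()
# A negative return code means "terminated by signal -rc".
print(rc == -signal.SIGALRM)
```

So a negative rc in these logs points at a signal, not at an error reported by the command itself.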

The second test fails because the bridge already exists:


____________ TestOvsApiWithSingleRealBridge.test_create_remove_bond ____________
self = <network.integration.ovs.ovs_driver_test.TestOvsApiWithSingleRealBridge testMethod=test_create_remove_bond>

    def setUp(self):
        self.ovsdb = create()
>       self.ovsdb.add_br(TEST_BRIDGE).execute()

network/integration/ovs/ovs_driver_test.py:69:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../lib/vdsm/network/ovs/driver/vsctl.py:99: in execute
    t.add(self)
../lib/vdsm/network/ovs/driver/__init__.py:46: in __exit__
    self.result = self.commit()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <vdsm.network.ovs.driver.vsctl.Transaction object at 0x7f72a7989210>

    def commit(self):
        if not self.commands:
            return
        timeout_option = []
        if self.timeout:
            timeout_option = ['--timeout=' + str(self.timeout)]
        args = []
        for command in self.commands:
            args += ['--'] + command.cmd
        exec_line = [_ovs_vsctl_cmd()] + timeout_option + OUTPUT_FORMAT + args
        logging.debug('Executing commands: %s' % ' '.join(exec_line))
        rc, out, err = netcmd.exec_sync(exec_line)
        if rc != 0:
            err = err.splitlines()
            if OvsDBConnectionError.is_ovs_db_conn_error(err):
                raise OvsDBConnectionError('\n'.join(err))
            else:
                raise ConfigNetworkError(
                    ne.ERR_BAD_PARAMS,
>                   'Executing commands failed: %s' % '\n'.join(err))
E               ConfigNetworkError: (21, 'Executing commands failed: ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge named vdsmbr_test already exists')

../lib/vdsm/network/ovs/driver/vsctl.py:78: ConfigNetworkError
------------------------------ Captured log call -------------------------------
vsctl.py 68 DEBUG Executing commands: /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- add-br vdsmbr_test
cmdutils.py 151 DEBUG /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- add-br vdsmbr_test (cwd None)
cmdutils.py 159 DEBUG FAILED: <err> = 'ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge named vdsmbr_test already exists\n'; <rc> = 1
---------------------------- Captured log teardown -----------------------------
vsctl.py 68 DEBUG Executing commands: /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge
cmdutils.py 151 DEBUG /usr/bin/ovs-vsctl --timeout=5 --oneline --format=json -- list Bridge (cwd None)
cmdutils.py 159 DEBUG SUCCESS: <err> = ''; <rc> = 0
 I don't know anything about these tests, but this failure looks like:

1. The first test hits a timeout.
2. The first test's cleanup does not run, because the cleanup code is not correct.
3. The second test fails because the first test did not clean up.

This looks like a real issue in the code.



We run "make check" both in Travis (.travis.yml) and in oVirt CI (automation/check-patch.sh).
 


On Fri, Jul 6, 2018 at 12:51 AM, Nir Soffer <nsoffer@redhat.com> wrote:


On Thu, Jul 5, 2018 at 10:55 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Jul 5, 2018 at 5:53 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg <danken@redhat.com> wrote:
On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer <nsoffer@redhat.com> wrote:
> On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg <danken@redhat.com> wrote:
>>
>> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer <nsoffer@redhat.com> wrote:
>> > Dan, the travis build still fails when renaming the coverage file, even after
>> > your last patch.
>> >
>> >
>> >
>> > ...........................SS.SS.................................................................................................................................................................SS..................................................S.S................................S................................SS.....SS............................................S...............SSS...S.....S.............................................S................................................................SSS............SSSS..SSSSSSSSS.SS..................................................................................................................................................................
>> > ----------------------------------------------------------------------
>> > Ran 1267 tests in 99.239s
>> > OK (SKIP=63)
>> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2
>> > make[1]: *** [check] Error 1
>> > make[1]: Leaving directory `/vdsm/tests'
>> > ERROR: InvocationError: '/usr/bin/make -C tests check'
>> >
>> > https://travis-ci.org/oVirt/vdsm/jobs/399932012
>> >
>> > Do you have any idea what is wrong there?
>> >
>> > Why don't we have any error message from the failed command?
>>
>> No idea, nothing pops to mind.
>> We can revert to the sillier [ -f .coverage ] condition instead of
>> understanding (yeah, this feels dirty)
>
>
> Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this
> failure.
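
One plausible reading of the silent `Error 1` above, offered as an assumption and not confirmed in this thread: when NOSE_WITH_COVERAGE is empty, the `[ -n ... ]` test itself returns 1, the `&&` list short-circuits, and make sees a failing recipe line although no command printed anything. A minimal sketch that runs the same recipe line through `sh`, assuming a POSIX shell is on PATH:

```python
import os
import subprocess

# The recipe line from the failing build output.
RECIPE = '[ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2'


def run_recipe():
    """Run the recipe with the variable unset; return (status, stderr)."""
    env = dict(os.environ)
    env.pop("NOSE_WITH_COVERAGE", None)
    proc = subprocess.run(
        ["sh", "-c", RECIPE],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


rc, err = run_recipe()
# Non-zero status with empty stderr: mv never ran, the test itself "failed".
print(rc, repr(err))
```

A shell `if ...; then ...; fi` with a false condition and no else branch exits with status 0, which is one common way to avoid this; whether the patch referenced above took that route is not stated here.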
>
> Now we have failures for the pywatch_test, and some network
> tests. Can someone from network look at this?
> https://travis-ci.org/nirs/vdsm/builds/400204807

https://travis-ci.org/nirs/vdsm/jobs/400204808 shows

              ConfigNetworkError: (21, 'Executing commands failed:
ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge
named vdsmbr_test already exists')

which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea
why it shows here?

Maybe one failed test leaves a dirty host for the next test?

Network tests now fail only on CentOS.
 
 
py-watch seems to be failing due to missing gdb on the travis image

cmdutils.py                151 DEBUG    ./py-watch 0.1 sleep 10 (cwd None)
cmdutils.py                159 DEBUG    FAILED: <err> = 'Traceback
(most recent call last):\n  File "./py-watch", line 60, in <module>\n
  dump_trace(watched_proc)\n  File "./py-watch", line 32, in
dump_trace\n    \'thread apply all py-bt\'])\n  File
"/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in
call\n    p = Popen(*popenargs, **kwargs)\n  File
"/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in
__init__\n    restore_signals, start_new_session)\n  File
"/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in
_execute_child\n    raise child_exception_type(errno_num,
err_msg)\nOSError: [Errno 2] No such file or directory: \'gdb\'\n';
<rc> = 1

Cool, easy fix.
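
Installing gdb in the image is the fix discussed here. As a complementary sketch only, a test that depends on gdb could check for it up front and skip with a clear message instead of failing with OSError. This assumes Python 3 (`shutil.which`); `has_command` and `PyWatchTest` are hypothetical names, not the real pywatch tests:

```python
import shutil
import unittest


def has_command(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


# Hypothetical test class; the real pywatch tests may look different.
@unittest.skipUnless(has_command("gdb"), "gdb is not installed")
class PyWatchTest(unittest.TestCase):
    def test_gdb_available(self):
        self.assertTrue(has_command("gdb"))


print(has_command("gdb"))
```

Skipping hides the missing dependency from the build result, so it suits developer machines better than CI images, where the package should simply be installed.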


Fedora 28 build is green with this change:
 

___________________________________ summary ____________________________________
tests: commands succeeded
storage-py27: commands succeeded
storage-py36: commands succeeded
lib-py27: commands succeeded
lib-py36: commands succeeded
network-py27: commands succeeded
network-py36: commands succeeded
virt-py27: commands succeeded
virt-py36: commands succeeded
congratulations :)

 
 
Nir, could you remind me what "ERROR: InterpreterNotFound:
python3.6" is and how we can avoid it? It keeps distracting us while
debugging test failures.

We can avoid it in travis using an env matrix.

Currently we run "make check", which runs all the tox envs
(e.g. storage-py27,storage-py36) regardless of the build type. This is good
for manual usage, when you don't know which Python version is available
on a developer machine. For example, if I have Python 3.7 installed, I may
want to test with it.

We can change this to test only the *-py27 envs on CentOS, and both the
*-py27 and *-py36 envs on Fedora.

We can do the same in oVirt CI, but it will be harder: we don't have a declarative
way to configure this.
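
The per-distribution selection described above can be sketched as a small helper that keeps only the tox envs whose interpreter is installed. This is an illustration under assumptions, not what the builds use: the component names are taken from the tox summaries in this thread, while `select_envs`, `installed_interpreters` and the interpreter mapping are hypothetical, and `shutil.which` assumes Python 3:

```python
import shutil

# Component names as they appear in the tox summaries in this thread.
COMPONENTS = ["storage", "lib", "network", "virt"]

# Assumed mapping from tox env suffix to the interpreter it needs.
INTERPRETERS = {"py27": "python2.7", "py36": "python3.6"}


def select_envs(available):
    """Return the tox envs whose interpreter is in `available`."""
    return [
        "%s-%s" % (component, suffix)
        for component in COMPONENTS
        for suffix, interpreter in sorted(INTERPRETERS.items())
        if interpreter in available
    ]


def installed_interpreters():
    """Return the interpreters from INTERPRETERS found on PATH."""
    return {name for name in INTERPRETERS.values() if shutil.which(name)}


# A host with only Python 2.7, such as the CentOS build described here.
print(",".join(select_envs({"python2.7"})))
```

The comma-separated result has the form accepted by `tox -e`, so a host without python3.6 would never ask tox for a *-py36 env.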

Fixed all builds using --enable-python3:

Here is an example from a CentOS build, with no false errors.

___________________________________ summary ____________________________________
tests: commands succeeded
storage-py27: commands succeeded
lib-py27: commands succeeded
ERROR: network-py27: commands failed
virt-py27: commands succeeded
make: *** [tests] Error 1
make: *** Waiting for unfinished jobs....
___________________________________ summary ____________________________________
pylint: commands succeeded
congratulations :)

 

Nir