On Sat, Jul 7, 2018 at 9:11 AM Edward Haas <ehaas(a)redhat.com> wrote:
On Sat, Jul 7, 2018 at 9:02 AM, Edward Haas <ehaas(a)redhat.com> wrote:
>
>
> On Fri, Jul 6, 2018 at 9:16 PM, Nir Soffer <nsoffer(a)redhat.com> wrote:
>
>> On Fri, Jul 6, 2018 at 7:05 PM Edward Haas <ehaas(a)redhat.com> wrote:
>>
>>>
>>>
>>> On 6 Jul 2018, at 18:41, Nir Soffer <nsoffer(a)redhat.com> wrote:
>>>
>>>
>>>
>>> On Fri, 6 Jul 2018, 18:25 Edward Haas, <ehaas(a)redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On 6 Jul 2018, at 14:35, Nir Soffer <nsoffer(a)redhat.com> wrote:
>>>>
>>>> On Fri, Jul 6, 2018 at 1:12 PM Edward Haas <ehaas(a)redhat.com> wrote:
>>>>
>>>>> I do not know if it is relevant or not, but the tests that travis
>>>>> runs for master are taken from the 4.2 branch.
>>>>> OVS tests are now running using pytest.
>>>>>
>>>>
>>>> What do you mean by "taken from 4.2 branch"?
>>>>
>>>>
>>>> I mean that the branch checked out is 4.2 and not master. It even says
>>>> so on the console output.
>>>>
>>>
>>> Can you share the url of that build?
>>>
>>>
>>> I just clicked the icon on the vdsm repo:
>>> https://travis-ci.org/oVirt/vdsm
>>>
>>
>> This is indeed a 4.2 build. Any commit in github is tested in travis.
>> We would also like to fix the 4.2 builds, but first we need to fix the
>> master builds.
>>
>> You can see here that the master builds fail:
>> https://travis-ci.org/oVirt/vdsm/builds
>>
>> Since we added gdb and python-debuginfo:
>> https://travis-ci.org/oVirt/vdsm/builds/400644077
>>
>> - centos build fails (network-py27)
>>   https://travis-ci.org/oVirt/vdsm/jobs/400644079
>>
>> - fedora 28 build passes
>>   https://travis-ci.org/oVirt/vdsm/jobs/400644081
>>
>> - fedora rawhide build fails because we cannot rebuild the image,
>>   python-libblokdev is missing in rawhide.
>>   https://travis-ci.org/oVirt/vdsm/jobs/400644083
>>   See
>>   https://lists.ovirt.org/archives/list/devel@ovirt.org/thread/CDNETITY5RYO...
>>
>>
>> I don't know anything about these tests, but this failure looks like:
>>
>> 1. first test has a timeout
>> 2. first test cleanup did not run because the cleanup code is not correct
>> 3. second test fails because the first test did not clean up
>>
>> This looks like a real issue in the code.
>>
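To illustrate the three steps above, here is a minimal sketch of cleanup that still runs when the test body times out. This is not vdsm code: `FakeOvs`, `temporary_bridge` and `run_two_tests` are hypothetical names, and the bridge table is an in-memory stand-in for ovs-vsctl.

```python
import contextlib


class FakeOvs:
    """Hypothetical in-memory stand-in for the OVS bridge table."""

    def __init__(self):
        self.bridges = set()

    def add_bridge(self, name):
        if name in self.bridges:
            raise RuntimeError(
                "cannot create a bridge named %s because a bridge "
                "named %s already exists" % (name, name))
        self.bridges.add(name)

    def del_bridge(self, name):
        self.bridges.discard(name)


@contextlib.contextmanager
def temporary_bridge(ovs, name):
    ovs.add_bridge(name)
    try:
        yield name
    finally:
        # Runs even if the test body raises, for example on a timeout.
        ovs.del_bridge(name)


def run_two_tests():
    ovs = FakeOvs()
    # First "test": times out while the bridge exists.
    try:
        with temporary_bridge(ovs, "vdsmbr_test"):
            raise TimeoutError("first test timed out")
    except TimeoutError:
        pass
    # Second "test": can create the same bridge because cleanup ran.
    with temporary_bridge(ovs, "vdsmbr_test"):
        created = "vdsmbr_test" in ovs.bridges
    return created, sorted(ovs.bridges)
```

If the `finally` block is dropped from `temporary_bridge`, the second `add_bridge` call raises the "already exists" error, which matches the failure pattern described above.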
>
> This is the same problem we had on oVirt CI: there are linux bridges on
> the node.
> I have posted a patch to fail earlier and show the real problem:
> https://gerrit.ovirt.org/#/c/92867/
> The travis-ci run for it is here:
> https://travis-ci.org/EdDev/vdsm/jobs/401143906
> This is the problem:
> cmdutils.py 151 DEBUG /usr/share/openvswitch/scripts/ovs-ctl
> --system-id=random start (cwd None)
> cmdutils.py 159 DEBUG FAILED: <err> = 'rmmod: ERROR: Module bridge is in
> use by: br_netfilter\n'; <rc> = 1
>
Who is using rmmod?
> Any idea who is creating the "br_netfilter" bridge? I guess this is
> travis-ci related.
>
Why do we care about br_netfilter? Do we require a system without any
bridge?
Actually, this may be Docker or some other package that is installed/set up
on it.
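For debugging, note that the rmmod message names br_netfilter as a kernel module that holds the `bridge` module, not as a bridge device. A minimal sketch for listing the holders of a module by parsing /proc/modules text follows. The helper name `module_users` and the sample data are hypothetical; the field layout (name, size, refcount, used-by list, state, address) is the standard /proc/modules format.

```python
def module_users(proc_modules_text, module):
    """Return the modules listed as users of `module`.

    Returns None if the module is not loaded and [] if nothing uses it.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == module:
            if fields[3] == "-":
                return []
            # The used-by field is comma separated with a trailing comma.
            return [name for name in fields[3].split(",") if name]
    return None


# Hypothetical sample in /proc/modules format, not taken from the travis host.
SAMPLE = """\
br_netfilter 24576 0 - Live 0x0000000000000000
bridge 151336 1 br_netfilter, Live 0x0000000000000000
stp 16384 1 bridge, Live 0x0000000000000000
"""
```

On a real host the text would come from `open("/proc/modules").read()`; if `module_users(text, "bridge")` is not empty, removing the `bridge` module fails with the "is in use by" error shown in the log.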
How can I run the docker with the tests locally to debug this?
Run this in vdsm root directory (copied from .travis.yml):

    export DOCKER_IMAGE=ovirtorg/vdsm-test-centos
    docker pull $DOCKER_IMAGE
    docker run \
        --env TRAVIS_CI=1 \
        --privileged \
        --rm \
        -it \
        -v `pwd`:/vdsm:Z \
        $DOCKER_IMAGE \
        bash -c "cd /vdsm && ./autogen.sh --system && make && make --jobs=2 check"

Since this is a privileged container, you probably want to run this inside a
VM.
>>>> We run "make check" both in travis (.travis.yml) and ovirt ci
>>>> (automation/check-patch.sh)
>>>>
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 6, 2018 at 12:51 AM, Nir Soffer <nsoffer(a)redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 5, 2018 at 10:55 PM Nir Soffer <nsoffer(a)redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Thu, Jul 5, 2018 at 5:53 PM Nir Soffer <nsoffer(a)redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Thu, Jul 5, 2018 at 5:43 PM Dan Kenigsberg <danken(a)redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Jul 5, 2018 at 2:52 AM, Nir Soffer <nsoffer(a)redhat.com>
>>>>>>>>> wrote:
>>>>>>>>> > On Wed, Jul 4, 2018 at 1:00 PM Dan Kenigsberg <danken(a)redhat.com> wrote:
>>>>>>>>> >>
>>>>>>>>> >> On Wed, Jul 4, 2018 at 12:48 PM, Nir Soffer <nsoffer(a)redhat.com> wrote:
>>>>>>>>> >> > Dan, the travis build still fails when renaming the coverage file
>>>>>>>>> >> > even after your last patch.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>>
...........................SS.SS.................................................................................................................................................................SS..................................................S.S................................S................................SS.....SS............................................S...............SSS...S.....S.............................................S................................................................SSS............SSSS..SSSSSSSSS.SS..................................................................................................................................................................
>>>>>>>>> >> >
>>>>>>>>> >> > ----------------------------------------------------------------------
>>>>>>>>> >> > Ran 1267 tests in 99.239s
>>>>>>>>> >> > OK (SKIP=63)
>>>>>>>>> >> > [ -n "$NOSE_WITH_COVERAGE" ] && mv .coverage .coverage-nose-py2
>>>>>>>>> >> > make[1]: *** [check] Error 1
>>>>>>>>> >> > make[1]: Leaving directory `/vdsm/tests'
>>>>>>>>> >> > ERROR: InvocationError: '/usr/bin/make -C tests check'
>>>>>>>>> >> >
>>>>>>>>> >> > https://travis-ci.org/oVirt/vdsm/jobs/399932012
>>>>>>>>> >> >
>>>>>>>>> >> > Do you have any idea what is wrong there?
>>>>>>>>> >> >
>>>>>>>>> >> > Why don't we have any error message from the failed command?
>>>>>>>>> >>
>>>>>>>>> >> No idea, nothing pops to mind.
>>>>>>>>> >> We can revert to the sillier [ -f .coverage ] condition instead of
>>>>>>>>> >> understanding (yeah, this feels dirty)
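One possible explanation for the missing error message, not confirmed anywhere in this thread: in a shell, `[ -n "$VAR" ] && cmd` exits with status 1 and prints nothing when the variable is empty, and make treats any non-zero status of a recipe line as an error. The sketch below demonstrates only that shell behavior; `run_recipe` is a hypothetical helper and `echo renamed` stands in for the `mv` command.

```python
import subprocess


def run_recipe(line, coverage_value):
    """Run one recipe-like line with `sh -c`; return (rc, stdout, stderr)."""
    env = {"PATH": "/usr/bin:/bin", "NOSE_WITH_COVERAGE": coverage_value}
    proc = subprocess.run(
        ["sh", "-c", line], env=env, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


# The form used in the failing recipe (echo replaces mv for the demo).
AND_FORM = '[ -n "$NOSE_WITH_COVERAGE" ] && echo renamed'

# An if statement with a false condition and no else branch exits with 0.
IF_FORM = 'if [ -n "$NOSE_WITH_COVERAGE" ]; then echo renamed; fi'
```

With an empty value, `run_recipe(AND_FORM, "")` returns a non-zero status and empty output, while `run_recipe(IF_FORM, "")` returns status 0, so the `if` form would not fail the recipe when the rename is skipped.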
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > Thanks, your patch (https://gerrit.ovirt.org/#/c/92813/) fixed this
>>>>>>>>> > failure.
>>>>>>>>> >
>>>>>>>>> > Now we have failures for the pywatch_test, and some network
>>>>>>>>> > tests. Can someone from network look at this?
>>>>>>>>> > https://travis-ci.org/nirs/vdsm/builds/400204807
>>>>>>>>>
>>>>>>>>> https://travis-ci.org/nirs/vdsm/jobs/400204808 shows
>>>>>>>>>
>>>>>>>>> ConfigNetworkError: (21, 'Executing commands failed:
>>>>>>>>> ovs-vsctl: cannot create a bridge named vdsmbr_test because a bridge
>>>>>>>>> named vdsmbr_test already exists')
>>>>>>>>>
>>>>>>>>> which I thought was limited to dirty ovirt-ci jenkins slaves. Any idea
>>>>>>>>> why it shows here?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Maybe one failed test leaves a dirty host for the next test?
>>>>>>>>
>>>>>>>
>>>>>> Network tests now fail only on CentOS.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> py-watch seems to be failing due to missing gdb on the travis
>>>>>>>>> image
>>>>>>>>
>>>>>>>>
>>>>>>>>> cmdutils.py 151 DEBUG ./py-watch 0.1 sleep 10 (cwd None)
>>>>>>>>> cmdutils.py 159 DEBUG FAILED: <err> = 'Traceback (most recent call last):\n
>>>>>>>>> File "./py-watch", line 60, in <module>\n
>>>>>>>>> dump_trace(watched_proc)\n
>>>>>>>>> File "./py-watch", line 32, in dump_trace\n
>>>>>>>>> \'thread apply all py-bt\'])\n
>>>>>>>>> File "/usr/lib64/python2.7/site-packages/subprocess32.py", line 575, in call\n
>>>>>>>>> p = Popen(*popenargs, **kwargs)\n
>>>>>>>>> File "/usr/lib64/python2.7/site-packages/subprocess32.py", line 822, in __init__\n
>>>>>>>>> restore_signals, start_new_session)\n
>>>>>>>>> File "/usr/lib64/python2.7/site-packages/subprocess32.py", line 1567, in _execute_child\n
>>>>>>>>> raise child_exception_type(errno_num, err_msg)\n
>>>>>>>>> OSError: [Errno 2] No such file or directory: \'gdb\'\n'; <rc> = 1
>>>>>>>>>
>>>>>>>>
>>>>>>>> Cool, easy fix.
>>>>>>>>
>>>>>>>
>>>>>>> Fixed by https://gerrit.ovirt.org/#/c/92846/
>>>>>>>
>>>>>>
>>>>>> Fedora 28 build is green with this change:
>>>>>> https://travis-ci.org/nirs/vdsm/jobs/400549561
>>>>>>
>>>>>>
>>>>>>
>>>>>> ___________________________________ summary ____________________________________
>>>>>> tests: commands succeeded
>>>>>> storage-py27: commands succeeded
>>>>>> storage-py36: commands succeeded
>>>>>> lib-py27: commands succeeded
>>>>>> lib-py36: commands succeeded
>>>>>> network-py27: commands succeeded
>>>>>> network-py36: commands succeeded
>>>>>> virt-py27: commands succeeded
>>>>>> virt-py36: commands succeeded
>>>>>> congratulations :)
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Nir, could you remind me what is "ERROR: InterpreterNotFound:
>>>>>>>>> python3.6" and how can we avoid it? it keeps distracting during
>>>>>>>>> debugging test failures.
>>>>>>>>>
>>>>>>>>
>>>>>>>> We can avoid it in travis using an env matrix.
>>>>>>>>
>>>>>>>> Currently we run "make check", which runs all the tox envs
>>>>>>>> (e.g. storage-py27,storage-py36) regardless of the build type.
>>>>>>>> This is good for manual usage when you don't know which python
>>>>>>>> version is available on a developer machine. For example, if I have
>>>>>>>> python 3.7 installed, maybe I would like to test with it.
>>>>>>>>
>>>>>>>> We can change this so we will test only the *-py27 envs on centos,
>>>>>>>> and both *-py27 and *-py36 on Fedora.
>>>>>>>>
>>>>>>>> We can do the same in ovirt CI but it will be harder, since we don't
>>>>>>>> have a declarative way to configure this.
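As a rough illustration of running only the envs that match the interpreters present, here is a minimal sketch. It is not the patch referenced in this thread (that one uses --enable-python3): `COMPONENTS`, `INTERPRETERS`, `available_suffixes` and `tox_envs` are hypothetical names, and the component list is taken from the env names shown in the summaries in this thread.

```python
import shutil

# Components as they appear in the tox env names in this thread.
COMPONENTS = ["storage", "lib", "network", "virt"]

# Assumed mapping from env suffix to interpreter executable.
INTERPRETERS = {"py27": "python2.7", "py36": "python3.6"}


def available_suffixes(which=shutil.which):
    """Return the env suffixes whose interpreter is found on PATH."""
    return [suffix for suffix, exe in INTERPRETERS.items() if which(exe)]


def tox_envs(suffixes, components=COMPONENTS):
    """Build a comma separated env list suitable for `tox -e`."""
    return ",".join(
        "%s-%s" % (component, suffix)
        for component in components
        for suffix in suffixes)
```

On a host with only python2.7 this yields the *-py27 envs, so tox is never asked for a python3.6 env and the InterpreterNotFound noise should not appear.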
>>>>>>>>
>>>>>>>
>>>>>>> Fixed all builds using --enable-python3:
>>>>>>> https://gerrit.ovirt.org/#/c/92847/
>>>>>>>
>>>>>>
>>>>>> Here is an example from CentOS build - no false errors.
>>>>>>
>>>>>> ___________________________________ summary ____________________________________
>>>>>> tests: commands succeeded
>>>>>> storage-py27: commands succeeded
>>>>>> lib-py27: commands succeeded
>>>>>> ERROR: network-py27: commands failed
>>>>>> virt-py27: commands succeeded
>>>>>> make: *** [tests] Error 1
>>>>>> make: *** Waiting for unfinished jobs....
>>>>>> ___________________________________ summary ____________________________________
>>>>>> pylint: commands succeeded
>>>>>> congratulations :)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Nir
>>>>>>>
>>>>>>
>>>>>
>