Odd errors on vdsm_unit_tests

Yaniv Bronheim ybronhei at redhat.com
Sun Apr 14 09:14:52 UTC 2013


Right, 
If supervdsmServer.py fails to start (because of a wrong import, for example), testIsSuperUp (the first unit test in supervdsmTests) does not return and you end up with "killed".
It happens because of logic we added in http://gerrit.ovirt.org/11932: after 3 failed attempts to communicate with supervdsmServer, we kill the running process. Normally that is the vdsm process; in this case it is the testrunner process.
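
To illustrate why the testrunner is the one that dies, here is a minimal sketch of retry-then-kill logic. It is not the actual vdsm code: the names panic, call_with_retries and give_up are hypothetical, and only the "3 attempts, then kill the running process" behaviour is taken from the description above.

```python
import os
import signal


def panic():
    # Hypothetical stand-in for the panic call: it kills whichever
    # process runs this code (vdsm normally, the testrunner in tests).
    os.kill(os.getpid(), signal.SIGKILL)


def call_with_retries(connect, retries=3, give_up=panic):
    """Try connect() up to `retries` times; call give_up() if all fail."""
    for _ in range(retries):
        try:
            return connect()
        except OSError:
            # Connection problems are retried until attempts run out.
            continue
    return give_up()
```

Because the kill targets os.getpid(), the caller is what gets killed, whoever the caller happens to be.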

First, http://gerrit.ovirt.org/#/c/13804/ fixes the issue that causes supervdsmServer to fail.
Second, http://gerrit.ovirt.org/#/c/13878/ checks whether supervdsmServer starts as expected; if not, the test fails immediately.
Third, I'll submit a change to the logic of communicating with supervdsmServer. I prefer to avoid the panic call; at least I can monkeypatch it during the tests.

Thanks,
Yaniv.


----- Original Message -----
> From: "Dan Kenigsberg" <danken at redhat.com>
> To: "David Caro" <dcaroest at redhat.com>, "Yaniv Bronheim" <ybronhei at redhat.com>
> Cc: "Giuseppe Vallarelli" <gvallare at redhat.com>, infra at ovirt.org
> Sent: Sunday, April 14, 2013 11:28:54 AM
> Subject: Re: Odd errors on vdsm_unit_tests
> 
> On Thu, Apr 11, 2013 at 11:47:52PM +0200, David Caro wrote:
> > On Thu 11 Apr 2013 11:36:27 PM CEST, David Caro wrote:
> > > On Thu 11 Apr 2013 11:22:00 PM CEST, Dan Kenigsberg wrote:
> > >> Jenkins used to send email notifications about unit tests failures to
> > >> vdsm-patches at lists.fedorahosted.org at least until March 20.
> > >>
> > >> Come to think of it, the emails ceased to have content on March 5.
> > >>
> > >> Now they are not sent at all.
> > >>
> > >
> > > I think I'm not on that list, I'll sign up and look at it, see if I can
> > > figure out something.
> > >
> > >> Beyond this, we have a very odd non-deterministic failure
> > >> http://jenkins.ovirt.org/job/vdsm_unit_tests_gerrit/1976/console
> > >> (excerpt below).
> > >>
> > >> Could the test be run with an unusual locale?
> > >> Does anybody have another idea?
> > >>
> > >> Could someone (Giuseppe? David?) log into this fedora18-vm01 slave and
> > >> try to
> > >> reproduce the issue from the command line?
> > >>
> > >> Regards,
> > >> Dan.
> > >>
> > >> ======================================================================
> > >> ERROR: test_deviceCustomProperties (hooksTests.TestHooks)
> > >> ----------------------------------------------------------------------
> > >> Traceback (most recent call last):
> > >>   File "/jenkins-workspaces/vdsm_unit_tests_gerrit/tests/hooksTests.py",
> > >>   line 125, in test_deviceCustomProperties
> > >>     params={'customProperty': ' rocks!'})
> > >>   File "/jenkins-workspaces/vdsm_unit_tests_gerrit/vdsm/hooks.py", line
> > >>   72, in _runHooksDir
> > >>     scriptenv[k] = unicode(v).encode('utf-8')
> > >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 7:
> > >> ordinal not in range(128)
> > >>
> > >> ======================================================================
> > >> ERROR: test_runHooksDir (hooksTests.TestHooks)
> > >> ----------------------------------------------------------------------
> > >> Traceback (most recent call last):
> > >>   File "/jenkins-workspaces/vdsm_unit_tests_gerrit/tests/hooksTests.py",
> > >>   line 72, in test_runHooksDir
> > >>     res = hooks._runHooksDir(DOMXML, dirName)
> > >>   File "/jenkins-workspaces/vdsm_unit_tests_gerrit/vdsm/hooks.py", line
> > >>   72, in _runHooksDir
> > >>     scriptenv[k] = unicode(v).encode('utf-8')
> > >> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 7:
> > >> ordinal not in range(128)
> > >
> > > About that error: yes, Antoni Segura (apuimedo) and I troubleshot it.
> > > The problem is that one of the developers has a non-ASCII letter in
> > > their name, which is passed to the author tag on Gerrit, which loads it
> > > into the GERRIT_*_AUTHOR env variable. Not sure if Antoni is working on
> > > a patch or has notified someone, though.
> > >
> > > --
> > > David Caro
> > >
> > > Red Hat Czech s.r.o.
> > > Continuous Integration Engineer - EMEA ENG Virtualization R&D
> > >
> > > Tel.: +420 532 294 605
> > > Email: dcaro at redhat.com
> > > Web: www.cz.redhat.com
> > > Red Hat Czech s.r.o., Purkyňova 99/71, 612 45, Brno, Czech Republic
> > > RHT Global #: 82-62605
> > > _______________________________________________
> > > Infra mailing list
> > > Infra at ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/infra
> > 
> > Forgot to mention, the error that bugs me is this one:
> > 
> > http://jenkins.ovirt.org/job/vdsm_unit_tests_gerrit/1962/console
> > 
> > We had several of those today and it all seemed to stop when I set the
> > core dump limit to 'unlimited' (ulimit -c). I saw that before with one
> > patch but subsequent patches worked, so I did not investigate further.
> > As far as I know, the problem is that the test testIsSuperUp somehow
> > ends up killing the jenkins slave process (if you run it manually you
> > get a message that says 'killed' in the console). That test seems to
> > mess with threads and processes and maybe it just identifies the slave
> > process as the one to stop (just a wild guess :S).
> 
> Yeah, I've noticed this bug in testIsSuperUp, and Yaniv Bronheim is
> aware of it.
> 
> Dan.
>


