
----- Original Message -----
From: "David Caro" <dcaroest@redhat.com> To: "Michal Skrivanek" <michal.skrivanek@redhat.com> Cc: devel@ovirt.org Sent: Friday, June 6, 2014 9:53:23 AM Subject: Re: [ovirt-devel] local vdsm build fails
On Fri 06 Jun 2014 09:23:33 AM CEST, Michal Skrivanek wrote:
On Jun 6, 2014, at 09:19, Piotr Kliczewski <piotr.kliczewski@gmail.com> wrote:
All,
I pulled the latest vdsm from master and noticed that the build is failing.
Here is the patch that causes the failure:
http://gerrit.ovirt.org/#/c/28226
and looking at the jenkins comments I can see that jenkins was failing for the same reason:
http://jenkins.ovirt.org/job/vdsm_master_storage_functional_tests_localfs_ge...
btw, at least yesterday there were again so many false errors, with jenkins not being able to run the tests properly, that it's unusable... was that the reason the result was ignored? (though the error is clearly relevant to that patch)
Can you point out which jobs were false positives? Also, can you specify for each one how to tell from the logs that it's a test failure? As specific as possible, please. We can then filter the logs for those failures and set a different message, so you'll know from the gerrit comments whether it was a real issue or an infra failure.
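
Something along these lines is what I have in mind for the filtering - a rough sketch only: the marker strings, the classify_console_log helper and the classification labels are placeholders I'm making up here, the real patterns would come from the examples you point out:

  import re
  import sys

  # Hypothetical markers for lines that mean the slave/infra broke, not the tests.
  INFRA_MARKERS = [
      re.compile(r"Segmentation fault"),
      re.compile(r"No space left on device"),
      re.compile(r"Slave went offline during the build"),
  ]

  # Hypothetical marker for a genuine test failure as reported by nose.
  TEST_FAILURE_MARKER = re.compile(r"^(FAIL|ERROR): ", re.MULTILINE)


  def classify_console_log(text):
      """Return 'infra', 'test-failure' or 'unknown' for one console log."""
      if any(marker.search(text) for marker in INFRA_MARKERS):
          return "infra"
      if TEST_FAILURE_MARKER.search(text):
          return "test-failure"
      return "unknown"


  if __name__ == "__main__":
      # usage: python classify_log.py console.log
      with open(sys.argv[1]) as log:
          print(classify_console_log(log.read()))

Once we know the real signatures we can hook something like that into the jobs and have the gerrit comment say "infra failure, please retrigger" instead of a plain -1.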
Quite a lot of the noise is from python segfaulting. I can reproduce the segfault locally but I'm having a hard time pinpointing the issue. Reported the bug upstream: https://github.com/nose-devs/nose/issues/817

Let me summarize what I (we) know about python segfaulting:

* the segfault should be reproducible on any box running nose >= 1.3.0, just using

  $ cd vdsm
  $ ./configure && make
  $ NOSE_WITH_XUNIT=1 make check

  or at least I can reproduce the issue on all the boxes I tried locally (vanilla F20, F19)

* if we run each test unit separately, we do NOT observe the failure.

  This triggers the segfault:

  $ cd tests
  $ ./run_tests_local.sh ./*.py

  This does not:

  $ cd tests
  $ for TEST in `ls ./*.py`; do ./run_tests_local.sh $TEST; done

* the stack traces I observed are huge, more than 750 levels deep. This suggests the stack was exhausted, which in turn was probably triggered by some kind of recursion gone wild. Note that the offending stack trace is on just one thread; all the others are quiet.

* I tried to reproduce the issue with a simpler use case, with no luck so far.

At the moment I don't have a better suggestion than to bite the bullet and dig into the huge stack trace looking for repetitive patterns or some sort of hint.

A core dump is available for post-mortem analysis, from my laptop, which is an F20 with a few updates (mostly from virt-preview and a few other places - the full list, if relevant, is provided as pkgs.txt.gz in the folder below), here: [link will be provided off-list]

core.20626.1000.gz is the fresh new core, core.20626.1000.md5 is its checksum.

--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani
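
P.S. if it helps with the pattern hunting: one way to go at it is to collapse the backtrace into (function, source file) pairs and count which ones dominate. A quick sketch, assuming the trace has been dumped to a text file with gdb's bt; the frame-parsing regex is a guess and will need adjusting to the real output format:

  import re
  import sys
  from collections import Counter

  # Very rough parser for gdb "bt" lines such as:
  #   #123 0x00007f... in PyEval_EvalFrameEx (f=..., throwflag=0) at Python/ceval.c:2906
  # The exact layout differs between gdb versions, and frames without debug info
  # ("from /lib64/...") are simply skipped, so treat this as a starting point only.
  FRAME_RE = re.compile(r"^#\d+\s+.*?\bin\s+(\w+)\s*\(.*\bat\s+(\S+):\d+", re.MULTILINE)


  def top_repeated_frames(backtrace_text, how_many=10):
      """Count which (function, source file) frames dominate a huge backtrace."""
      frames = FRAME_RE.findall(backtrace_text)
      return Counter(frames).most_common(how_many)


  if __name__ == "__main__":
      # usage: python frame_hist.py backtrace.txt
      with open(sys.argv[1]) as bt:
          for (func, source), count in top_repeated_frames(bt.read()):
              print("%6d  %s (%s)" % (count, func, source))

If one frame pair shows up hundreds of times out of those ~750 levels, that's probably the recursion we are after.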