Intermittent Jenkins crashes

Francesco Romani fromani at redhat.com
Wed Apr 23 15:51:23 UTC 2014


Sorry, forgot to add.

By "the main, if not only, lead" I mean that the usual places hold nothing:
* /var/crash is empty
* abrt-cli list yields nothing relevant
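
One possible reason for an empty /var/crash, not verified on the slave, is
that the job runs with core dumps disabled. A minimal Python sketch to check
this from inside the test environment, assuming a Linux box (the function
name core_dump_settings is made up for illustration):

```python
import resource


def core_dump_settings(pattern_path="/proc/sys/kernel/core_pattern"):
    """Report the core size limits of this process and the kernel core pattern."""
    # Soft and hard limits for core file size; a soft limit of 0 means
    # the kernel will not write a core file for this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        with open(pattern_path) as f:
            pattern = f.read().strip()
    except (IOError, OSError):
        # Not Linux, or /proc is not readable from here.
        pattern = None
    return {"soft": soft, "hard": hard, "core_pattern": pattern}


settings = core_dump_settings()
print("RLIMIT_CORE soft=%s hard=%s" % (settings["soft"], settings["hard"]))
print("core_pattern=%s" % settings["core_pattern"])
```

The limits are inherited from the parent process, so the numbers only mean
something when this runs in the same environment as the failing tests.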

Bests,

----- Original Message -----
> From: "Francesco Romani" <fromani at redhat.com>
> To: infra at ovirt.org
> Sent: Wednesday, April 23, 2014 5:48:46 PM
> Subject: Intermittent Jenkins crashes
> 
> Hi infra
> 
> Recently, tests have started to fail at random because the Python
> interpreter crashes.
> 
> For example (many other runs look like this):
> http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8370/console
> 
> LibvirtModuleConfigureTests
>     testLibvirtConfigureToSSLFalse
>     ../tests/run_tests_local.sh: line 10: 31835
>     Segmentation fault      PYTHONDONTWRITEBYTECODE=1 LC_ALL=C
>     PYTHONPATH="../lib:../vdsm:../client:../vdsm_api:$PYTHONPATH"
>     "$PYTHON_EXE" ../tests/testrunner.py --local-modules $@
> 
> Quite often, re-running the same tests through the Jenkins manual trigger,
> or uploading a new version of the affected patch, seems to make the crash
> go away.
> 
> I have ssh access to the affected box, so I investigated further.
> 
> The main, if not only, lead those crashes leave behind is a laconic
> 
> [8855948.327687] python[10418]: segfault at 1 ip 00000036f2c88637 sp
> 00007fffda3c3a60 error 4 in libpython2.7.so.1.0[36f2c00000+178000]
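
The "error 4" field in the line above is the x86 page-fault error code, which
is a bit mask. A minimal Python sketch of decoding it, assuming the standard
x86 bit layout (the names PF_BITS and decode_pf_error are made up for
illustration):

```python
# (mask, meaning when the bit is set, meaning when it is clear)
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return human-readable flags for an x86 page-fault error code."""
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


print(decode_pf_error(4))
# → ['page not present', 'read access', 'user mode']
```

The kernel prints this field in hex. "segfault at 1" is the faulting address,
so error 4 here reads as a user-mode read from the unmapped address 0x1.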
> 
> The error code sometimes varies; the addresses do not.
> So I followed
> http://enki-tech.blogspot.it/2012/08/debugging-c-part-3-dmesg.html
> 
> and found the following:
> 
> [root at jenkins-slave-vm02 ~]# ./getcrash.sh '[9141800.034517] python[11612]:
> segfault at 1 ip 00000036f2c88637 sp 00007fffe1127c50 error 4 in
> libpython2.7.so.1.0[36f2c00000+178000]'
> Segmentation fault in libpython2.7.so.1.0 at: 0x88637.
> [root at jenkins-slave-vm02 ~]# gdb /usr/lib64/libpython2.7.so.1.0
> GNU gdb (GDB) Fedora 7.6.50.20130731-19.fc20
> [...]
> Reading symbols from /usr/lib64/libpython2.7.so.1.0...Reading symbols from
> /usr/lib/debug/usr/lib64/libpython2.7.so.1.0.debug...done.
> done.
> (gdb) disass 0x88637
> No function contains specified address.
> (gdb)
> 
> (getcrash.sh is a copy of the script presented in the page linked above)
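
For readers without that script at hand, the computation amounts to
subtracting the mapping base from the instruction pointer. A minimal Python
sketch of that step, written from the dmesg line format shown above and not
taken from getcrash.sh itself (SEGFAULT_RE and crash_offset are hypothetical
names):

```python
import re

# Matches the kernel "segfault at ..." line; all numbers are hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def crash_offset(line):
    """Return (library, offset of the faulting instruction inside it)."""
    # Collapse line breaks introduced by mail or terminal wrapping.
    flat = " ".join(line.split())
    match = SEGFAULT_RE.search(flat)
    if match is None:
        raise ValueError("not a segfault line: %r" % flat)
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    return match.group("lib"), ip - base


lib, offset = crash_offset(
    "[9141800.034517] python[11612]: segfault at 1 ip 00000036f2c88637 "
    "sp 00007fffe1127c50 error 4 in libpython2.7.so.1.0[36f2c00000+178000]"
)
print("Segmentation fault in %s at: %s" % (lib, hex(offset)))
```

If gdb finds no function at the offset, one thing that may be worth trying
(an untested guess) is the absolute ip value, in case the library is
prelinked to that base address.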
> 
> The only explanation I can come up with from all of the above is a faulty
> RAM bank, but this is little more than a wild guess.
> 
> Any suggestions on how to go further?
> 
> Thanks,
> 
> --
> Francesco Romani
> RedHat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani
> 

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani


