
On Wed 23 Apr 2014 05:51:23 PM CEST, Francesco Romani wrote:
Sorry, forgot to add.
By "main, if not only lead" I mean:
* /var/crash is empty
* abrt-cli list yields nothing relevant
Bests,
----- Original Message -----
From: "Francesco Romani" <fromani@redhat.com>
To: infra@ovirt.org
Sent: Wednesday, April 23, 2014 5:48:46 PM
Subject: Intermittent Jenkins crashes
Hi infra
Recently, tests started to fail seemingly at random because the Python interpreter crashes.
For example (many other runs look like this): http://jenkins.ovirt.org/job/vdsm_master_unit_tests_gerrit/8370/console
LibvirtModuleConfigureTests
    testLibvirtConfigureToSSLFalse ../tests/run_tests_local.sh: line 10: 31835 Segmentation fault PYTHONDONTWRITEBYTECODE=1 LC_ALL=C PYTHONPATH="../lib:../vdsm:../client:../vdsm_api:$PYTHONPATH" "$PYTHON_EXE" ../tests/testrunner.py --local-modules $@
Quite often, re-running the same tests via the Jenkins manual trigger, or uploading a new version of the affected patch, seems to somehow fix the crash.
I have ssh access to the affected box, so I investigated further.
The main, if not only, lead these crashes leave behind is a laconic
[8855948.327687] python[10418]: segfault at 1 ip 00000036f2c88637 sp 00007fffda3c3a60 error 4 in libpython2.7.so.1.0[36f2c00000+178000]
The error code sometimes varies; the addresses do not. So I followed http://enki-tech.blogspot.it/2012/08/debugging-c-part-3-dmesg.html
and found the following:
[root@jenkins-slave-vm02 ~]# ./getcrash.sh '[9141800.034517] python[11612]: segfault at 1 ip 00000036f2c88637 sp 00007fffe1127c50 error 4 in libpython2.7.so.1.0[36f2c00000+178000]'
Segmentation fault in libpython2.7.so.1.0 at: 0x88637.
[root@jenkins-slave-vm02 ~]# gdb /usr/lib64/libpython2.7.so.1.0
GNU gdb (GDB) Fedora 7.6.50.20130731-19.fc20
[...]
Reading symbols from /usr/lib64/libpython2.7.so.1.0...Reading symbols from /usr/lib/debug/usr/lib64/libpython2.7.so.1.0.debug...done.
done.
(gdb) disass 0x88637
No function contains specified address.
(gdb)
(getcrash.sh is a copy of the script presented in the page linked above)
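For reference, the arithmetic behind the 0x88637 figure can be sketched in a few lines of Python. This is not the getcrash.sh script itself. `parse_segfault` is a hypothetical helper written for illustration, and it assumes the kernel log format shown above (hexadecimal fields) and the usual x86 page-fault error-code bits (bit 0: protection violation, bit 1: write access, bit 2: user mode). The offset is simply the instruction pointer minus the mapping base printed in brackets.

```python
import re

# Matches kernel log lines of the form quoted above, e.g.
# "python[10418]: segfault at 1 ip 00000036f2c88637 sp 00007fffda3c3a60
#  error 4 in libpython2.7.so.1.0[36f2c00000+178000]"
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r" in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Hypothetical helper: decode one dmesg segfault line into a dict."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line: %r" % line)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)
    return {
        "object": m.group("obj"),
        "fault_addr": int(m.group("addr"), 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        # Sanity check: the ip should lie inside the reported mapping.
        "in_mapping": base <= ip < base + size,
        # Assumed x86 page-fault error-code bits.
        "protection": bool(err & 1),  # False means the page was not present
        "write": bool(err & 2),       # False means the access was a read
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    info = parse_segfault(
        "[8855948.327687] python[10418]: segfault at 1 ip 00000036f2c88637 "
        "sp 00007fffda3c3a60 error 4 in libpython2.7.so.1.0[36f2c00000+178000]"
    )
    print(hex(info["offset"]), info)
```

Under those assumptions, "error 4" reads as a user-mode read of a page that is not mapped, and "segfault at 1" is the address that was accessed.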
The only sense I can make of all of the above is a faulty RAM bank, but this is little more than a wild guess.
Any suggestions on how to go further?
Thanks,
--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani
_______________________________________________
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra
Let's try to see whether the problem affects only one slave, one Python version, one distribution, or fails everywhere. If it only affects one slave, we might just reprovision it (the one you pointed out is a VM). If it's related to a package version, we can try to upgrade it, downgrade it, or (in the best case) fix it.

If it's anything else, it will be more complicated to fix, and we will have to look deeper (try to reproduce manually, add traces; maybe, as you say, it's a RAM issue, but since this is a VM, we would expect it to fail on the host as well).

I started running it only on f19 slaves to see if it happens there; I'll check the f20 slaves afterwards.

--
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Email: dcaro@redhat.com
Web: www.redhat.com
RHT Global #: 82-62605