
On Thu, May 26, 2016 at 11:08 PM, Nir Soffer <nsoffer@redhat.com> wrote:
Hi all,
We had 2 issues causing vdsm check-patch and check-merge jobs to get stuck.
I fixed the one that caused most trouble: https://gerrit.ovirt.org/57993
The other issue may be related to ioprocess. I fixed a related issue: https://gerrit.ovirt.org/57473
But I have seen stuck jobs after this change, so the issue may not be fixed yet.
If you see a stuck vdsm job (a job that runs for more than 15 minutes), please get me a backtrace:
1. locate the testrunner.py process pid:
$ ps aux | grep testrunner.py | grep -v grep
nsoffer 26297 82.6 0.9 389592 111144 pts/3 R+ 22:52 0:02 /usr/bin/python ../tests/testrunner.py ...
2. save a backtrace:
gdb attach 26297 --batch -ex "thread apply all py-bt" > py-bt.out
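The two steps above could be combined into a small shell helper. This is only a sketch: the function names, the optional pattern and output-file arguments, and the GDB override variable are illustrative assumptions, not part of vdsm. It uses pgrep in place of ps | grep, and gdb -p to attach by pid.

```shell
# Sketch only: helper names and the GDB variable are hypothetical.

# Step 1: print the pid of the oldest process whose full command line
# matches the pattern (default: testrunner.py). Fails if nothing matches.
find_testrunner_pid() {
    pgrep -o -f "${1:-testrunner\.py}"
}

# Step 2: attach gdb to the given pid and save a Python backtrace of all
# threads to the given file (default: py-bt.out).
# GDB can be set to point at a different gdb binary.
save_backtrace() {
    pid=$1
    out=${2:-py-bt.out}
    "${GDB:-gdb}" -p "$pid" --batch -ex "thread apply all py-bt" > "$out"
}
```

With these defined, something like `pid=$(find_testrunner_pid) && save_backtrace "$pid"` would write py-bt.out in the current directory.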
This requires the python-debuginfo package, typically installed using:
dnf debuginfo-install python

I sent this patch, detecting stuck vdsm tests, printing a backtrace, and killing the stuck process: https://gerrit.ovirt.org/58212

It works, but we don't get a backtrace, since python-debuginfo is not installed although I require it - probably we need to add the fedora-debug repository to check-patch.repos. I tried to use the urls from /etc/yum.repos.d/fedora.repo, but none of them work. I will need help from infra to get it working.

Nir