[ovirt-devel] [VDSM] stuck tests in ci

Nir Soffer nsoffer at redhat.com
Sun May 15 16:45:01 UTC 2016


Hi all,

I found another stuck build today:
http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console

11:27:18 ------------------------------------------------------------------------------------------------------------------------------------------------
11:27:18 TOTAL    40513  21121    48%
11:27:18 ----------------------------------------------------------------------
11:27:18 Ran 2169 tests in 145.934s
11:27:18
11:27:18 OK (SKIP=88)
11:27:18 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x7fd7c9f2d3d0>> ignored
[...]
11:27:18 Exception AttributeError: "'NoneType' object has no attribute 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess object at 0x7fd7c9f15550>> ignored
11:27:18 Exception in thread ioprocess communication (6533) (most likely raised during interpreter shutdown):
11:27:18 Traceback (most recent call last):
11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in __bootstrap_inner
11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
11:27:18   File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in _communicate
11:27:18 <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'close'

This smells like a non-daemon thread started by some code,
blocking the test process from exiting.

I suspect ioprocess is starting such a thread; I'm looking into it.

Meanwhile, please:
- verify that all threads in actual code and in the tests are daemon threads
- convert your threads to use vdsm.concurrent.thread instead of
threading.Thread (daemon by default)
- watch your builds and abort stuck builds
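
To make the first two points concrete, here is a minimal sketch of how to
spot leftover non-daemon threads at the end of a test run. The helper names
non_daemon_threads and start_daemon are hypothetical, not part of vdsm, and
the sketch uses plain threading.Thread rather than vdsm.concurrent.thread,
whose exact signature is not shown in this message.

```python
import threading


def non_daemon_threads():
    """Return live non-daemon threads other than the main thread.

    Any thread in this list keeps the interpreter alive after the
    main thread finishes, which is how a test process gets stuck.
    """
    main = threading.main_thread()
    return [t for t in threading.enumerate()
            if t is not main and not t.daemon]


def start_daemon(func, *args):
    """Start func(*args) in a daemon thread and return the thread.

    Hypothetical helper: it only illustrates setting the daemon flag
    before start(), so the thread cannot block interpreter exit.
    """
    t = threading.Thread(target=func, args=args)
    t.daemon = True
    t.start()
    return t
```

Calling non_daemon_threads() from a test teardown and failing when the list
is not empty would probably point at the offending code much earlier than a
stuck build does.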

David, we need a timeout in the CI that aborts the job after a
per-project limit, maybe defined in the project yaml.
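
As a rough illustration of the idea, and not of how the CI is actually
implemented, a wrapper could run the check command with a time limit and
report when it had to kill it. The function name run_with_timeout is
hypothetical, and the timeout value would come from wherever the project
configuration ends up living.

```python
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd (a list of arguments) with a time limit in seconds.

    Returns the exit code, or None if the command was killed because
    it ran longer than the limit.
    """
    try:
        # On timeout, subprocess.run() kills the child, waits for it,
        # and then raises TimeoutExpired.
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return None
```

Note that this only kills the direct child; a real job runner would also
have to clean up any processes the child started.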

Cheers,
Nir


