[ovirt-devel] [VDSM] stuck tests in ci

Piotr Kliczewski piotr.kliczewski at gmail.com
Tue May 31 17:16:23 UTC 2016


All,

I just noticed one more build [1] which got stuck with:

15:46:40 Traceback (most recent call last):
15:46:40   File "/usr/lib64/python2.7/threading.py", line 804, in
__bootstrap_inner
15:46:40   File "/usr/lib64/python2.7/threading.py", line 757, in run
15:46:40   File
"/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 181, in
_communicate
15:46:40 <type 'exceptions.AttributeError'>: 'NoneType' object has no
attribute 'close'

Thanks,
Piotr

[1] http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/2380/console

On Sat, May 21, 2016 at 8:28 PM, Nir Soffer <nsoffer at redhat.com> wrote:
> The issue is a non-daemon thread blocking the Python process during
> shutdown of the tests.
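
A minimal sketch of that failure mode, assuming nothing about vdsm or
ioprocess internals: the child script and the run_child helper below are
purely illustrative and use only the standard library. A Python process
whose main thread has finished still waits for every non-daemon thread,
while a daemon thread is simply abandoned at exit:

```python
import subprocess
import sys
import textwrap

# Illustrative child program: the main thread returns immediately, the
# worker thread needs one more second to finish.
CHILD = textwrap.dedent('''
    import sys
    import threading
    import time

    def worker():
        time.sleep(1.0)
        print("worker finished")

    t = threading.Thread(target=worker)
    t.daemon = (sys.argv[1] == "daemon")
    t.start()
    print("main finished")
''')


def run_child(mode):
    """Run the child with a daemon or non-daemon worker; return its output lines."""
    out = subprocess.check_output([sys.executable, "-c", CHILD, mode])
    return out.decode().splitlines()


if __name__ == "__main__":
    # Daemon worker: the process exits without waiting for it.
    print(run_child("daemon"))
    # Non-daemon worker: interpreter shutdown blocks until it returns.
    print(run_child("non-daemon"))
```

If the worker never returned, the non-daemon case would block forever at
shutdown, which matches what the stuck builds look like after "OK".
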
>
> The current ioprocess does not create such a thread, but we still see this
> issue today:
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1841/console
>
> If the builds are using the latest ioprocess build (0.16.0-1), built after
> Sun May 15 21:29:24 2016 +0300, this is probably not related to ioprocess.
>
> To understand this issue we need to get a stack trace from the stuck Python
> process.
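
One way to get such a trace from inside the process, sketched under
assumptions: format_all_stacks and start_watchdog are hypothetical helper
names, not part of vdsm or ioprocess, and the watchdog would have to be
started by the test runner before the run hangs. It uses a daemon timer
thread rather than a signal handler, because a main thread blocked while
joining other threads may not get to run Python signal handlers:

```python
import sys
import threading
import traceback


def format_all_stacks():
    """Return the current Python stack of every live thread as one string."""
    names = dict((t.ident, t.name) for t in threading.enumerate())
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (ident %s):\n" % (names.get(ident, "?"), ident))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def start_watchdog(timeout, out=sys.stderr):
    """Dump all thread stacks to `out` if the process is still alive
    `timeout` seconds from now. Returns the timer so it can be cancelled."""
    def dump():
        out.write(format_all_stacks())
        out.flush()

    timer = threading.Timer(timeout, dump)
    timer.daemon = True  # the watchdog itself must never block shutdown
    timer.start()
    return timer
```

The timeout would be set somewhat above the normal duration of the test
run, so a healthy run exits before the dump fires. For a process that is
already stuck, a debugger attached from outside is the alternative.
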
>
> See relevant log below.
>
>
> Nir
>
> ----
>
> 11:49:30 --------------------------------------------------------------------------------------------------------------------------------------------
> 11:49:30 TOTAL    40672  21072    48%
> 11:49:30 ----------------------------------------------------------------------
> 11:49:30 Ran 2182 tests in 147.661s
> 11:49:30
> 11:49:30 OK (SKIP=94)
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'udev_unref'" in <bound method Context.__del__ of <pyudev.core.Context
> object at 0x7312610>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5435350>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5420850>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x7269910>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5420f90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5419c10>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x610e610>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x6dc4d50>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x610f390>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x54195d0>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5432510>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x6110250>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5414750>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5414150>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x6a33390>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x543d590>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x6ba8390>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5432d50>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5435a90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x503f350>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5420250>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5442e90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x503f950>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x610e7d0>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x5414d90>> ignored
> 11:49:30 Exception AttributeError: "'NoneType' object has no attribute
> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
> object at 0x543d450>> ignored
> 11:49:30 Exception in thread ioprocess communication (8008) (most
> likely raised during interpreter shutdown):
> 11:49:30 Traceback (most recent call last):
> 11:49:30   File "/usr/lib64/python2.7/threading.py", line 811, in
> __bootstrap_inner
> 11:49:30   File "/usr/lib64/python2.7/threading.py", line 764, in run
> 11:49:30   File
> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
> _communicate
> 11:49:30 <type 'exceptions.AttributeError'>: 'NoneType' object has no
> attribute 'close'
> 17:42:17 Build timed out (after 360 minutes). Marking the build as failed.
> 17:42:17 Build was aborted
>
> On Fri, May 20, 2016 at 10:30 AM, Piotr Kliczewski
> <piotr.kliczewski at gmail.com> wrote:
>> Eyal,
>>
>> This was an ioprocess issue that occurred after the fix was provided. I
>> haven't seen it since build #1389.
>>
>> Thanks,
>> Piotr
>>
>> On Thu, May 19, 2016 at 3:00 PM, Eyal Edri <eedri at redhat.com> wrote:
>>> Was that resolved?
>>> Was it an infra issue, or a problem with the tests?
>>>
>>> On Mon, May 16, 2016 at 3:27 PM, Piotr Kliczewski
>>> <piotr.kliczewski at gmail.com> wrote:
>>>>
>>>> and one more:
>>>>
>>>>
>>>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1389/console
>>>>
>>>> On Mon, May 16, 2016 at 1:46 PM, Piotr Kliczewski
>>>> <piotr.kliczewski at gmail.com> wrote:
>>>> > One more occurrence of the issue [1]
>>>> >
>>>> >
>>>> > [1]
>>>> > http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/1359/console
>>>> >
>>>> > On Sun, May 15, 2016 at 8:37 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>>>> >> The ioprocess issue is fixed in https://gerrit.ovirt.org/57473
>>>> >>
>>>> >> It will be merged soon and available via ovirt-release-master.
>>>> >>
>>>> >> Nir
>>>> >>
>>>> >> On Sun, May 15, 2016 at 7:45 PM, Nir Soffer <nsoffer at redhat.com> wrote:
>>>> >>> Hi all,
>>>> >>>
>>>> >>> I found another stuck build today:
>>>> >>>
>>>> >>> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc23-x86_64/1151/console
>>>> >>>
>>>> >>> 11:27:18
>>>> >>> ------------------------------------------------------------------------------------------------------------------------------------------------
>>>> >>> 11:27:18 TOTAL    40513  21121    48%
>>>> >>> 11:27:18
>>>> >>> ----------------------------------------------------------------------
>>>> >>> 11:27:18 Ran 2169 tests in 145.934s
>>>> >>> 11:27:18
>>>> >>> 11:27:18 OK (SKIP=88)
>>>> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>>>> >>> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
>>>> >>> object at 0x7fd7c9f2d3d0>> ignored
>>>> >>> [...]
>>>> >>> 11:27:18 Exception AttributeError: "'NoneType' object has no attribute
>>>> >>> 'write'" in <bound method IOProcess.__del__ of <ioprocess.IOProcess
>>>> >>> object at 0x7fd7c9f15550>> ignored
>>>> >>> 11:27:18 Exception in thread ioprocess communication (6533) (most
>>>> >>> likely raised during interpreter shutdown):
>>>> >>> 11:27:18 Traceback (most recent call last):
>>>> >>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 804, in
>>>> >>> __bootstrap_inner
>>>> >>> 11:27:18   File "/usr/lib64/python2.7/threading.py", line 757, in run
>>>> >>> 11:27:18   File
>>>> >>> "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 180, in
>>>> >>> _communicate
>>>> >>> 11:27:18 <type 'exceptions.AttributeError'>: 'NoneType' object has no
>>>> >>> attribute 'close'
>>>> >>>
>>>> >>> This smells like a non-daemon thread started by some code,
>>>> >>> blocking the test process.
>>>> >>>
>>>> >>> I suspect ioprocess is starting such a thread; I am looking into it.
>>>> >>>
>>>> >>> Meanwhile, please:
>>>> >>> - verify that all threads in actual code and in the tests are daemon
>>>> >>> threads
>>>> >>> - convert your threads to use vdsm.concurrent.thread instead of
>>>> >>> threading.Thread (daemon by default)
>>>> >>> - watch your builds and abort stuck builds
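
The first two items could be checked with something like the sketch below.
Both function names are hypothetical. start_daemon_thread is only a
stand-in that shows the "daemon by default" idea with plain
threading.Thread; it does not reproduce vdsm.concurrent.thread, whose
signature is not assumed here:

```python
import threading


def start_daemon_thread(func, args=(), name=None):
    """Hypothetical helper: start func in a thread that never blocks exit."""
    t = threading.Thread(target=func, args=args, name=name)
    t.daemon = True
    t.start()
    return t


def non_daemon_threads():
    """Return live non-daemon threads other than the calling thread.

    Meant to be called from the main thread, which is itself non-daemon
    and is therefore excluded from the result.
    """
    current = threading.current_thread()
    return [t for t in threading.enumerate()
            if t is not current and not t.daemon]
```

Calling non_daemon_threads() at the end of a test run, for example in a
teardown hook, and failing when the list is not empty would point at the
test that leaked the thread instead of leaving the build to hang.
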
>>>> >>>
>>>> >>> David, we need a timeout in the CI that aborts the job after a
>>>> >>> per-project timeout, maybe defined in the project yaml.
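
This is not the Jenkins or project yaml configuration asked for above,
only a sketch of the same idea applied one level down: a wrapper that runs
the test command and kills it after a limit. run_with_timeout is a
hypothetical name, and the wait(timeout=...) call assumes Python 3.3 or
later:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd; return its exit code, or None if it had to be killed
    because it was still running after `timeout` seconds."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the killed child
        return None


if __name__ == "__main__":
    # Illustrative use: a child that sleeps far longer than the limit.
    code = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(30)"], 0.5)
    print("timed out" if code is None else "exit code %d" % code)
```

A job wrapped this way fails within minutes of the test run getting stuck
rather than holding a CI slave until the global build timeout.
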
>>>> >>>
>>>> >>> Cheers,
>>>> >>> Nir
>>>> >> _______________________________________________
>>>> >> Devel mailing list
>>>> >> Devel at ovirt.org
>>>> >> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>>
>>>
>>> --
>>> Eyal Edri
>>> Associate Manager
>>> RHEV DevOps
>>> EMEA ENG Virtualization R&D
>>> Red Hat Israel
>>>
>>> phone: +972-9-7692018
>>> irc: eedri (on #tlv #rhev-dev #rhev-integ)


