Test network.ovs_driver_test.TestOvsApiBase seems to fail due to a timeout

Hi,

I'm trying to verify an ovirt-4.2 patch on vdsm, but the CI keeps failing due to a timeout on the network.ovs_driver_test.TestOvsApiBase test [1].
That happens for both jobs 166 and 167 (I've tried to retrigger). Links below.
Can you please assist?
Thanks!

http://jenkins.ovirt.org/job/vdsm_4.2_check-patch-el7-x86_64/166/consoleFull
http://jenkins.ovirt.org/job/vdsm_4.2_check-patch-el7-x86_64/167/console

[1]
08:22:55 network.ovs_driver_test.TestOvsApiBase
08:22:55 test_execute_a_single_command OK
08:26:56 test_execute_a_transaction ___________________________________ summary ____________________________________
08:26:56 pylint: commands succeeded
08:26:56 congratulations :)
08:30:53
08:30:53 ========================================================================
08:30:53 = Watched process timed out =
08:30:53 ========================================================================
08:30:53
08:30:53 [Thread debugging using libthread_db enabled]
08:30:53 Using host libthread_db library "/lib64/libthread_db.so.1".
08:30:54 0x00007f6a10ca6a3d in poll () at ../sysdeps/unix/syscall-template.S:81
08:30:54 81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)

Regards,
Shani Leviim

Not sure it's an infra issue, but adding Evgheni to help if needed.

On Mon, Mar 26, 2018 at 3:48 PM, Shani Leviim <sleviim@redhat.com> wrote:
Hi,
I'm trying to verify an ovirt-4.2 patch on vdsm, but the CI keeps failing due to a timeout on the network.ovs_driver_test.TestOvsApiBase test.
That happens for both jobs 166 and 167 (I've tried to retrigger). Can you please assist? Thanks!
Regards,
Shani Leviim
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
--
Eyal Edri
MANAGER
RHV DevOps
EMEA VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/> <https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)

Adding Gal.
Gal, can this be related to the timeout you added?

Thanks,
Dafna

On Mon, Mar 26, 2018 at 2:24 PM, Eyal Edri <eedri@redhat.com> wrote:
Not sure it's an infra issue, but adding Evgheni to help if needed.

We still have no guess regarding what made this pop up. I can confirm that this happens quite often, on different slaves. E.g. http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/22648/consol... where the command
'/usr/bin/ovs-vsctl', '--oneline', '--format=json', '--', 'add-br', 'vdsmbr_test', '--', 'list', 'Bridge'
gets stuck.

On Mon, Mar 26, 2018 at 7:20 PM, Dafna Ron <dron@redhat.com> wrote:
Adding Gal.
Gal, can this be related to the timeout you added?
thanks, Dafna

What I find very strange is that there are multiple threads that execute the same line (or perhaps I do not interpret the gdb output correctly).
Can I get access to such a slave to debug this?

On Wed, Mar 28, 2018 at 10:12 AM, Dan Kenigsberg <danken@redhat.com> wrote:
We still have no guess regarding what made this pop up. I can confirm that this happens quite often, on different slaves. E.g. http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/22648/console where the command
'/usr/bin/ovs-vsctl', '--oneline', '--format=json', '--', 'add-br', 'vdsmbr_test', '--', 'list', 'Bridge'
gets stuck.
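
For illustration only, here is a minimal Python sketch of how a caller could bound the command quoted above so that a hang surfaces as a quick, explicit failure instead of a job-level watchdog timeout. The helper name `run_with_timeout` and the timeout values are assumptions made for this example and are not taken from vdsm; the argument list is the one quoted in this thread. The demo uses a sleeping Python child as a stand-in, so it does not need Open vSwitch installed.

```python
import subprocess
import sys

# Argument list exactly as quoted in this thread.
OVS_CMD = ['/usr/bin/ovs-vsctl', '--oneline', '--format=json', '--',
           'add-br', 'vdsmbr_test', '--', 'list', 'Bridge']


def run_with_timeout(cmd, timeout_s):
    """Hypothetical helper: run cmd, giving up after timeout_s seconds.

    Returns (returncode, stdout, stderr), or None if the command timed out.
    subprocess.run() kills the child and reaps it before raising
    TimeoutExpired, so no stuck process is left behind.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Stand-in for a hung command: a child process that sleeps for 30 s.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hung, 1))
    # In a test environment with Open vSwitch available, the same helper
    # could wrap the real command: run_with_timeout(OVS_CMD, 10)
```

The ovs-vsctl manual also documents a `--timeout` option that limits how long the tool waits for the database; whether either approach fits the vdsm test suite is not established in this thread.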

I've gotten more reports about this timeout still happening on VDSM check-patch.
As the test itself is run in mock, the whole environment is deleted after the job run, so there isn't an affected environment to log in to and check. The node itself doesn't even have openvswitch installed on it.
Can I provide some more info to get this moving?

On Wed, Mar 28, 2018 at 9:50 AM, Edward Haas <ehaas@redhat.com> wrote:
What I find very strange is that there are multiple threads that execute the same line (or perhaps I do not interpret the gdb output correctly).
Can I get access to such a slave to debug this?
--
Regards,
Evgheni Dereveanchin
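
Since the mock environment is deleted after each run, any extra diagnostics would have to be captured during the run itself. As a hedged sketch, assuming the test runner is on Python 3 (where `faulthandler` is in the standard library), the process could arm a watchdog that writes every thread's Python traceback to a file if it is still running after a deadline. The helper names below are hypothetical, and this only shows where the Python side is waiting; it would not explain why the ovs-vsctl child itself is blocked.

```python
import faulthandler
import sys


def arm_hang_dump(timeout_s, out=None):
    """Hypothetical helper: dump all thread tracebacks after timeout_s seconds.

    If the process is still running when the deadline passes, faulthandler
    writes the Python traceback of every thread to `out` (sys.stderr when
    None). `out` must be a real file with a file descriptor. The process is
    not terminated (exit=False), so an outer watchdog still applies.
    """
    target = out if out is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)


def disarm_hang_dump():
    """Cancel a pending dump, e.g. once the test has finished in time."""
    faulthandler.cancel_dump_traceback_later()
```

A runner could call `arm_hang_dump()` before a suspect test and `disarm_hang_dump()` after it, pointing `out` at a file inside the job's archived artifacts directory so that the dump survives the cleanup.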
participants (6)
- Dafna Ron
- Dan Kenigsberg
- Edward Haas
- Evgheni Dereveanchin
- Eyal Edri
- Shani Leviim