VDSM TestCommunicate.test_send_receive() failed

Hi, I see this failure in the VDSM check-patch [1] - 11:25:34 =================================== FAILURES =================================== 11:25:34 ______________________ TestCommunicate.test_send_receive _______________________ 11:25:34 11:25:34 self = <storage.mailbox_test.TestCommunicate instance at 0x7fbf6498e680> 11:25:34 mboxfiles = MboxFiles(inbox='/var/tmp/vdsm/test_send_receive0/inbox', outbox='/var/tmp/vdsm/test_send_receive0/outbox') 11:25:34 11:25:34 def test_send_receive(self, mboxfiles): 11:25:34 msg_processed = threading.Event() 11:25:34 expired = False 11:25:34 received_messages = [] 11:25:34 11:25:34 def spm_callback(msg_id, data): 11:25:34 received_messages.append((msg_id, data)) 11:25:34 msg_processed.set() 11:25:34 11:25:34 with make_hsm_mailbox(mboxfiles, 7) as hsm_mb: 11:25:34 with make_spm_mailbox(mboxfiles) as spm_mm: 11:25:34 spm_mm.registerMessageType(b"xtnd", spm_callback) 11:25:34 VOL_DATA = dict( 11:25:34 poolID=SPUUID, 11:25:34 domainID='8adbc85e-e554-4ae0-b318-8a5465fe5fe1', 11:25:34 volumeID='d772f1c6-3ebb-43c3-a42e-73fcd8255a5f') 11:25:34 REQUESTED_SIZE = 100 11:25:34 11:25:34 hsm_mb.sendExtendMsg(VOL_DATA, REQUESTED_SIZE) 11:25:34 11:25:34 if not msg_processed.wait(10 * MONITOR_INTERVAL): 11:25:34 expired = True 11:25:34 11:25:34 > assert not expired, 'message was not processed on time' 11:25:34 E AssertionError: message was not processed on time 11:25:34 E assert not True 11:25:34 11:25:34 storage/mailbox_test.py:180: AssertionError Is this issue known to anyone? [1] - https://jenkins.ovirt.org/job/vdsm_standard-check-patch/14093/consoleFull Thanks -- Regards, Eyal Shenitzky

py3-mailbox changes are not in branch yet, so not related. From the failing test log i could see signs of relatively slow storage responses (couple 4KB r/w in 1.65, 1.77 seconds, while others take less than a second), so the test timeout could result from slow response times. 2019-11-11 09:23:17,699 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 1.65805 s, 24.7 kB/s\n'; <rc> = 0 (commands:202) 2019-11-11 09:23:17,702 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: Mailbox 7 validated, checking mail (mailbox:644) 2019-11-11 09:23:17,703 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: processing request: '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000' (mailbox:681) 2019-11-11 09:23:17,715 DEBUG (mailbox-spm/0) [storage.ThreadPool] Number of running tasks: 1 (threadPool:60) 2019-11-11 09:23:17,715 INFO (mailbox-spm/0) [storage.ThreadPool.WorkerThread] START task 575a47fe-8b8b-465e-8bd7-80b31f09af69 (cmd=<function runTask at 0x7fbf67324b18>, args=(<function spm_callback at 0x7fbf647a7140>, 449, '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000')) (threadPool:208) 2019-11-11 09:23:17,717 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') moving from state init -> state preparing (task:610) 2019-11-11 09:23:17,718 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') finished: None (task:1214) 2019-11-11 09:23:17,721 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') moving from state preparing -> state finished (task:610) 2019-11-11 09:23:17,721 DEBUG (mailbox-spm/0) [storage.ResourceManager.Owner] Owner.releaseAll requests {} resources {} (resourceManager:913) 2019-11-11 09:23:17,722 DEBUG (mailbox-spm/0) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:950) 2019-11-11 09:23:17,722 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') ref 0 aborting False (task:1012) 2019-11-11 09:23:17,722 INFO (mailbox-spm/0) [storage.ThreadPool.WorkerThread] FINISH task 575a47fe-8b8b-465e-8bd7-80b31f09af69 (threadPool:210) 2019-11-11 09:23:17,723 DEBUG (mailbox-spm/0) [storage.ThreadPool] Number of running tasks: 0 (threadPool:60) 2019-11-11 09:23:17,907 DEBUG (mailbox-spm/4) [root] FINISH thread <Thread(mailbox-spm/4, started daemon 140459389220608)> (concurrent:196) 2019-11-11 09:23:17,918 DEBUG (mailbox-spm/3) [root] FINISH thread <Thread(mailbox-spm/3, started daemon 140459741517568)> (concurrent:196) 2019-11-11 09:23:17,923 DEBUG (mailbox-hsm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 1.77794 s, 2.3 kB/s\n'; <rc> = 0 (commands:202) On Mon, Nov 11, 2019 at 12:41 PM Eyal Shenitzky <eshenitz@redhat.com> wrote:
Hi,
I see this failure in the VDSM check-patch [1] -
11:25:34 =================================== FAILURES =================================== 11:25:34 ______________________ TestCommunicate.test_send_receive _______________________ 11:25:34 11:25:34 self = <storage.mailbox_test.TestCommunicate instance at 0x7fbf6498e680> 11:25:34 mboxfiles = MboxFiles(inbox='/var/tmp/vdsm/test_send_receive0/inbox', outbox='/var/tmp/vdsm/test_send_receive0/outbox') 11:25:34 11:25:34 def test_send_receive(self, mboxfiles): 11:25:34 msg_processed = threading.Event() 11:25:34 expired = False 11:25:34 received_messages = [] 11:25:34 11:25:34 def spm_callback(msg_id, data): 11:25:34 received_messages.append((msg_id, data)) 11:25:34 msg_processed.set() 11:25:34 11:25:34 with make_hsm_mailbox(mboxfiles, 7) as hsm_mb: 11:25:34 with make_spm_mailbox(mboxfiles) as spm_mm: 11:25:34 spm_mm.registerMessageType(b"xtnd", spm_callback) 11:25:34 VOL_DATA = dict( 11:25:34 poolID=SPUUID, 11:25:34 domainID='8adbc85e-e554-4ae0-b318-8a5465fe5fe1', 11:25:34 volumeID='d772f1c6-3ebb-43c3-a42e-73fcd8255a5f') 11:25:34 REQUESTED_SIZE = 100 11:25:34 11:25:34 hsm_mb.sendExtendMsg(VOL_DATA, REQUESTED_SIZE) 11:25:34 11:25:34 if not msg_processed.wait(10 * MONITOR_INTERVAL): 11:25:34 expired = True 11:25:34 11:25:34 > assert not expired, 'message was not processed on time' 11:25:34 E AssertionError: message was not processed on time 11:25:34 E assert not True 11:25:34 11:25:34 storage/mailbox_test.py:180: AssertionError
Is this issue known to anyone?
[1] - https://jenkins.ovirt.org/job/vdsm_standard-check-patch/14093/consoleFull
Thanks
-- Regards, Eyal Shenitzky _______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/7L2UDMMPGV2CRS...

On Mon, Nov 11, 2019 at 2:18 PM Amit Bawer <abawer@redhat.com> wrote:
py3-mailbox changes are not in branch yet, so not related.
From the failing test log i could see signs of relatively slow storage responses (couple 4KB r/w in 1.65, 1.77 seconds, while others take less than a second), so the test timeout could result from slow response times.
2019-11-11 09:23:17,699 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 1.65805 s, 24.7 kB/s\n'; <rc> = 0 (commands:202) 2019-11-11 09:23:17,702 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: Mailbox 7 validated, checking mail (mailbox:644) 2019-11-11 09:23:17,703 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: processing request: '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000' (mailbox:681) 2019-11-11 09:23:17,715 DEBUG (mailbox-spm/0) [storage.ThreadPool] Number of running tasks: 1 (threadPool:60) 2019-11-11 09:23:17,715 INFO (mailbox-spm/0) [storage.ThreadPool.WorkerThread] START task 575a47fe-8b8b-465e-8bd7-80b31f09af69 (cmd=<function runTask at 0x7fbf67324b18>, args=(<function spm_callback at 0x7fbf647a7140>, 449, '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000')) (threadPool:208) 2019-11-11 09:23:17,717 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') moving from state init -> state preparing (task:610) 2019-11-11 09:23:17,718 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') finished: None (task:1214) 2019-11-11 09:23:17,721 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') moving from state preparing -> state finished (task:610) 2019-11-11 09:23:17,721 DEBUG (mailbox-spm/0) [storage.ResourceManager.Owner] Owner.releaseAll requests {} resources {} (resourceManager:913) 2019-11-11 09:23:17,722 DEBUG (mailbox-spm/0) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:950) 2019-11-11 09:23:17,722 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') ref 0 aborting False (task:1012) 2019-11-11 09:23:17,722 INFO (mailbox-spm/0) [storage.ThreadPool.WorkerThread] FINISH task 575a47fe-8b8b-465e-8bd7-80b31f09af69 (threadPool:210) 2019-11-11 09:23:17,723 DEBUG (mailbox-spm/0) [storage.ThreadPool] Number of running tasks: 0 (threadPool:60) 2019-11-11 09:23:17,907 DEBUG (mailbox-spm/4) [root] FINISH thread <Thread(mailbox-spm/4, started daemon 140459389220608)> (concurrent:196) 2019-11-11 09:23:17,918 DEBUG (mailbox-spm/3) [root] FINISH thread <Thread(mailbox-spm/3, started daemon 140459741517568)> (concurrent:196) 2019-11-11 09:23:17,923 DEBUG (mailbox-hsm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 1.77794 s, 2.3 kB/s\n'; <rc> = 0 (commands:202)
It looks like CI storage is very slow, and we need larger timeouts in these tests. I think we use 6 seconds now, lets bump it to 10 seconds?
On Mon, Nov 11, 2019 at 12:41 PM Eyal Shenitzky <eshenitz@redhat.com> wrote:
Hi,
I see this failure in the VDSM check-patch [1] -
11:25:34 =================================== FAILURES =================================== 11:25:34 ______________________ TestCommunicate.test_send_receive _______________________ 11:25:34 11:25:34 self = <storage.mailbox_test.TestCommunicate instance at 0x7fbf6498e680> 11:25:34 mboxfiles = MboxFiles(inbox='/var/tmp/vdsm/test_send_receive0/inbox', outbox='/var/tmp/vdsm/test_send_receive0/outbox') 11:25:34 11:25:34 def test_send_receive(self, mboxfiles): 11:25:34 msg_processed = threading.Event() 11:25:34 expired = False 11:25:34 received_messages = [] 11:25:34 11:25:34 def spm_callback(msg_id, data): 11:25:34 received_messages.append((msg_id, data)) 11:25:34 msg_processed.set() 11:25:34 11:25:34 with make_hsm_mailbox(mboxfiles, 7) as hsm_mb: 11:25:34 with make_spm_mailbox(mboxfiles) as spm_mm: 11:25:34 spm_mm.registerMessageType(b"xtnd", spm_callback) 11:25:34 VOL_DATA = dict( 11:25:34 poolID=SPUUID, 11:25:34 domainID='8adbc85e-e554-4ae0-b318-8a5465fe5fe1', 11:25:34 volumeID='d772f1c6-3ebb-43c3-a42e-73fcd8255a5f') 11:25:34 REQUESTED_SIZE = 100 11:25:34 11:25:34 hsm_mb.sendExtendMsg(VOL_DATA, REQUESTED_SIZE) 11:25:34 11:25:34 if not msg_processed.wait(10 * MONITOR_INTERVAL): 11:25:34 expired = True 11:25:34 11:25:34 > assert not expired, 'message was not processed on time' 11:25:34 E AssertionError: message was not processed on time 11:25:34 E assert not True 11:25:34 11:25:34 storage/mailbox_test.py:180: AssertionError
Is this issue known to anyone?
[1] - https://jenkins.ovirt.org/job/vdsm_standard-check-patch/14093/consoleFull
Thanks
-- Regards, Eyal Shenitzky _______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/7L2UDMMPGV2CRS...
_______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/PCPSVLS3D473JF...

On Mon, Nov 11, 2019 at 3:25 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Mon, Nov 11, 2019 at 2:18 PM Amit Bawer <abawer@redhat.com> wrote:
py3-mailbox changes are not in branch yet, so not related.
From the failing test log i could see signs of relatively slow storage
responses (couple 4KB r/w in 1.65, 1.77 seconds, while others take less than a second), so the test timeout could result from slow response times.
2019-11-11 09:23:17,699 DEBUG (mailbox-spm) [storage.Misc.excCmd]
SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 1.65805 s, 24.7 kB/s\n'; <rc> = 0 (commands:202)
2019-11-11 09:23:17,702 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: Mailbox 7 validated, checking mail (mailbox:644) 2019-11-11 09:23:17,703 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: processing request: '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000' (mailbox:681) 2019-11-11 09:23:17,715 DEBUG (mailbox-spm/0) [storage.ThreadPool] Number of running tasks: 1 (threadPool:60) 2019-11-11 09:23:17,715 INFO (mailbox-spm/0) [storage.ThreadPool.WorkerThread] START task 575a47fe-8b8b-465e-8bd7-80b31f09af69 (cmd=<function runTask at 0x7fbf67324b18>, args=(<function spm_callback at 0x7fbf647a7140>, 449, '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000')) (threadPool:208) 2019-11-11 09:23:17,717 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') moving from state init -> state preparing (task:610) 2019-11-11 09:23:17,718 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') finished: None (task:1214) 2019-11-11 09:23:17,721 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') moving from state preparing -> state finished (task:610) 2019-11-11 09:23:17,721 DEBUG (mailbox-spm/0) [storage.ResourceManager.Owner] Owner.releaseAll requests {} resources {} (resourceManager:913) 2019-11-11 09:23:17,722 DEBUG (mailbox-spm/0) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:950) 2019-11-11 09:23:17,722 DEBUG (mailbox-spm/0) [storage.TaskManager.Task] (Task='25518496-f685-40b9-b04b-8b50d1f26095') ref 0 aborting False (task:1012) 2019-11-11 09:23:17,722 INFO (mailbox-spm/0) [storage.ThreadPool.WorkerThread] FINISH task 575a47fe-8b8b-465e-8bd7-80b31f09af69 (threadPool:210) 2019-11-11 09:23:17,723 DEBUG (mailbox-spm/0) [storage.ThreadPool] Number of running tasks: 0 (threadPool:60) 2019-11-11 09:23:17,907 DEBUG (mailbox-spm/4) [root] FINISH thread <Thread(mailbox-spm/4, started daemon 140459389220608)> (concurrent:196) 2019-11-11 09:23:17,918 DEBUG (mailbox-spm/3) [root] FINISH thread <Thread(mailbox-spm/3, started daemon 140459741517568)> (concurrent:196) 2019-11-11 09:23:17,923 DEBUG (mailbox-hsm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 1.77794 s, 2.3 kB/s\n'; <rc> = 0 (commands:202)
It looks like CI storage is very slow, and we need larger timeouts in these tests. I think we use 6 seconds now, lets bump it to 10 seconds?
see patch: https://gerrit.ovirt.org/#/c/104556/
On Mon, Nov 11, 2019 at 12:41 PM Eyal Shenitzky <eshenitz@redhat.com>
wrote:
Hi,
I see this failure in the VDSM check-patch [1] -
11:25:34 =================================== FAILURES
===================================
11:25:34 ______________________ TestCommunicate.test_send_receive
11:25:34 11:25:34 self = <storage.mailbox_test.TestCommunicate instance at 0x7fbf6498e680> 11:25:34 mboxfiles = MboxFiles(inbox='/var/tmp/vdsm/test_send_receive0/inbox', outbox='/var/tmp/vdsm/test_send_receive0/outbox') 11:25:34 11:25:34 def test_send_receive(self, mboxfiles): 11:25:34 msg_processed = threading.Event() 11:25:34 expired = False 11:25:34 received_messages = [] 11:25:34 11:25:34 def spm_callback(msg_id, data): 11:25:34 received_messages.append((msg_id, data)) 11:25:34 msg_processed.set() 11:25:34 11:25:34 with make_hsm_mailbox(mboxfiles, 7) as hsm_mb: 11:25:34 with make_spm_mailbox(mboxfiles) as spm_mm: 11:25:34 spm_mm.registerMessageType(b"xtnd", spm_callback) 11:25:34 VOL_DATA = dict( 11:25:34 poolID=SPUUID, 11:25:34 domainID='8adbc85e-e554-4ae0-b318-8a5465fe5fe1', 11:25:34 volumeID='d772f1c6-3ebb-43c3-a42e-73fcd8255a5f') 11:25:34 REQUESTED_SIZE = 100 11:25:34 11:25:34 hsm_mb.sendExtendMsg(VOL_DATA, REQUESTED_SIZE) 11:25:34 11:25:34 if not msg_processed.wait(10 * MONITOR_INTERVAL): 11:25:34 expired = True 11:25:34 11:25:34 > assert not expired, 'message was not processed on time' 11:25:34 E AssertionError: message was not processed on time 11:25:34 E assert not True 11:25:34 11:25:34 storage/mailbox_test.py:180: AssertionError
Is this issue known to anyone?
[1] - https://jenkins.ovirt.org/job/vdsm_standard-check-patch/14093/consoleFull
Thanks
-- Regards, Eyal Shenitzky _______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/7L2UDMMPGV2CRS...
_______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/PCPSVLS3D473JF...
participants (3)
-
Amit Bawer
-
Eyal Shenitzky
-
Nir Soffer