
Hi all, I have seen the new storage mailbox test failing with a timeout twice. If you see this in your builds, please complain here. The cause for the timeout seems to be: 09:56:02 2017-01-29 09:53:59,830 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: mailbox 7 checksum failed, not clearing mailbox, clearing newMail. (storage_mailbox:630) We see this error in vdsm logs sometimes, needs investigation. 09:56:02 ====================================================================== 09:56:02 FAIL: test_send_receive (storage_mailbox_test.TestMailbox) 09:56:02 ---------------------------------------------------------------------- 09:56:02 Traceback (most recent call last): 09:56:02 File "/home/jenkins/workspace/vdsm_master_check-patch-fc25-x86_64/vdsm/tests/storage_mailbox_test.py", line 132, in test_send_receive 09:56:02 self.assertFalse(expired, 'message was not processed on time') 09:56:02 AssertionError: message was not processed on time 09:56:02 -------------------- >> begin captured logging << -------------------- 09:56:02 2017-01-29 09:53:55,337 DEBUG (MainThread) [storage.ThreadPool] Enter - name: mailbox-hsm, numThreads: 5, waitTimeout: 0.15, maxTasks: 500 (threadPool:36) 09:56:02 2017-01-29 09:53:55,337 DEBUG (mailbox-hsm/0) [root] START thread <Thread(mailbox-hsm/0, started daemon 140445079619328)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc0960a0d0>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:55,338 DEBUG (mailbox-hsm/1) [root] START thread <Thread(mailbox-hsm/1, started daemon 140445631756032)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc096180d0>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:55,339 DEBUG (mailbox-hsm/2) [root] START thread <Thread(mailbox-hsm/2, started daemon 140444826597120)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc09618b10>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:55,339 DEBUG (mailbox-hsm/3) [root] START thread <Thread(mailbox-hsm/3, started daemon 140444784633600)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc09618c50>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:55,340 DEBUG (MainThread) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 /usr/bin/dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/outbox iflag=direct,fullblock bs=512 count=8 skip=56 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:55,340 DEBUG (mailbox-hsm/4) [root] START thread <Thread(mailbox-hsm/4, started daemon 140445071226624)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc09618590>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:55,646 DEBUG (MainThread) [storage.Misc.excCmd] SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 0.296994 s, 13.8 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:55,646 INFO (MainThread) [storage.MailBox.HsmMailMonitor] HSM_MailMonitor sending mail to SPM - ['/usr/bin/dd', 'of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox', 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=512', 'seek=56'] (storage_mailbox:394) 09:56:02 2017-01-29 09:53:55,647 DEBUG (MainThread) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 /usr/bin/dd of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=fullblock oflag=direct conv=notrunc bs=512 seek=56 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:58,798 DEBUG (MainThread) [storage.Misc.excCmd] SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 3.14004 s, 1.3 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:58,807 DEBUG (mailbox-hsm) [storage.MailBox.HsmMailMonitor] START thread <Thread(mailbox-hsm, started daemon 140445049673472)> (func=<bound method HSM_MailMonitor.run of <storage.storage_mailbox.HSM_MailMonitor object at 0x7fbc0960a9d0>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:58,808 DEBUG (MainThread) [storage.Mailbox.HSM] HSM_MailboxMonitor created for pool 5d928855-b09b-47a7-b920-bd2d2eb5808c (storage_mailbox:209) 09:56:02 2017-01-29 09:53:58,809 DEBUG (MainThread) [storage.ThreadPool] Enter - name: mailbox-spm, numThreads: 5, waitTimeout: 0.15, maxTasks: 500 (threadPool:36) 09:56:02 2017-01-29 09:53:58,818 DEBUG (mailbox-spm/0) [root] START thread <Thread(mailbox-spm/0, started daemon 140445041280768)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc09608890>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:58,820 DEBUG (mailbox-spm/1) [root] START thread <Thread(mailbox-spm/1, started daemon 140444818204416)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc09608490>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:58,820 DEBUG (mailbox-spm/2) [root] START thread <Thread(mailbox-spm/2, started daemon 140444809811712)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc09608110>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:58,821 DEBUG (mailbox-spm/3) [root] START thread <Thread(mailbox-spm/3, started daemon 140444801419008)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc0961bb90>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:58,822 DEBUG (mailbox-spm/4) [root] START thread <Thread(mailbox-spm/4, started daemon 140444793026304)> (func=<bound method WorkerThread.run of <vdsm.storage.threadPool.WorkerThread object at 0x7fbc0961bc10>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:58,822 DEBUG (MainThread) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor - clearing outgoing mail, command is: ['dd', 'of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/outbox', 'oflag=direct', 'iflag=fullblock', 'conv=notrunc', 'count=1'] (storage_mailbox:583) 09:56:02 2017-01-29 09:53:58,823 DEBUG (MainThread) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/outbox oflag=direct iflag=fullblock conv=notrunc count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:59,561 DEBUG (MainThread) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.721626 s, 56.8 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:59,562 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] START thread <Thread(mailbox-spm, started daemon 140444776240896)> (func=<bound method SPM_MailMonitor.run of <storage.storage_mailbox.SPM_MailMonitor instance at 0x7fbc0c610b90>>, args=(), kwargs={}) (concurrent:183) 09:56:02 2017-01-29 09:53:59,563 DEBUG (MainThread) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor created for pool 5d928855-b09b-47a7-b920-bd2d2eb5808c (storage_mailbox:593) 09:56:02 2017-01-29 09:53:59,563 DEBUG (MainThread) [storage.SPM.Messages.Extend] new extend msg created: domain: 8adbc85e-e554-4ae0-b318-8a5465fe5fe1, volume: d772f1c6-3ebb-43c3-a42e-73fcd8255a5f (storage_mailbox:125) 09:56:02 2017-01-29 09:53:59,563 DEBUG (mailbox-hsm) [storage.MailBox.HsmMailMonitor] HSM_MailMonitor - start: 64, end: 128, len: 4096, message(1/63): '1xtnd\xe1_\xfeeT\x8a\x18\xb3\xe0JT\xe5^\xc8\xdb\x8a_Z%\xd8\xfcs.\xa4\xc3C\xbb>\xc6\xf1r\xd7000000000000006400000000000' (storage_mailbox:435) 09:56:02 2017-01-29 09:53:59,564 DEBUG (mailbox-hsm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 /usr/bin/dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/outbox iflag=direct,fullblock bs=512 count=8 skip=56 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:59,572 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:59,589 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.0080494 s, 5.1 MB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:59,600 DEBUG (mailbox-hsm) [storage.Misc.excCmd] SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 0.023761 s, 172 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:59,601 INFO (mailbox-hsm) [storage.MailBox.HsmMailMonitor] HSM_MailMonitor sending mail to SPM - ['/usr/bin/dd', 'of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox', 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=512', 'seek=56'] (storage_mailbox:394) 09:56:02 2017-01-29 09:53:59,601 DEBUG (mailbox-hsm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 /usr/bin/dd of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=fullblock oflag=direct conv=notrunc bs=512 seek=56 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:59,691 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:59,828 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.128307 s, 319 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:59,830 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: mailbox 7 checksum failed, not clearing mailbox, clearing newMail. (storage_mailbox:630) 09:56:02 2017-01-29 09:53:59,931 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:53:59,976 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.0354083 s, 1.2 MB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:53:59,977 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: mailbox 7 checksum failed, not clearing mailbox, clearing newMail. (storage_mailbox:630) 09:56:02 2017-01-29 09:54:00,078 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:54:00,220 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.133017 s, 308 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:54:00,222 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: mailbox 7 checksum failed, not clearing mailbox, clearing newMail. (storage_mailbox:630) 09:56:02 2017-01-29 09:54:00,323 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:54:00,364 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.0325451 s, 1.3 MB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:54:00,366 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: mailbox 7 checksum failed, not clearing mailbox, clearing newMail. (storage_mailbox:630) 09:56:02 2017-01-29 09:54:00,466 DEBUG (mailbox-spm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 dd if=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=40960 (cwd None) (commands:69) 09:56:02 2017-01-29 09:54:00,610 DEBUG (mailbox-spm) [storage.Misc.excCmd] SUCCESS: <err> = '1+0 records in\n1+0 records out\n40960 bytes (41 kB, 40 KiB) copied, 0.129057 s, 317 kB/s\n'; <rc> = 0 (commands:93) 09:56:02 2017-01-29 09:54:00,612 ERROR (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor: mailbox 7 checksum failed, not clearing mailbox, clearing newMail. (storage_mailbox:630) 09:56:02 2017-01-29 09:54:00,778 DEBUG (mailbox-spm/1) [root] FINISH thread <Thread(mailbox-spm/1, started daemon 140444818204416)> (concurrent:186) 09:56:02 2017-01-29 09:54:00,778 DEBUG (mailbox-spm/2) [root] FINISH thread <Thread(mailbox-spm/2, started daemon 140444809811712)> (concurrent:186) 09:56:02 2017-01-29 09:54:00,777 DEBUG (mailbox-spm/0) [root] FINISH thread <Thread(mailbox-spm/0, started daemon 140445041280768)> (concurrent:186) 09:56:02 2017-01-29 09:54:00,783 DEBUG (mailbox-spm/4) [root] FINISH thread <Thread(mailbox-spm/4, started daemon 140444793026304)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,783 DEBUG (mailbox-spm/3) [root] FINISH thread <Thread(mailbox-spm/3, started daemon 140444801419008)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,785 INFO (mailbox-spm) [storage.MailBox.SpmMailMonitor] SPM_MailMonitor - Incoming mail monitoring thread stopped (storage_mailbox:805) 09:56:15 2017-01-29 09:54:00,785 DEBUG (mailbox-spm) [storage.MailBox.SpmMailMonitor] FINISH thread <Thread(mailbox-spm, started daemon 140444776240896)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,901 DEBUG (mailbox-hsm) [storage.Misc.excCmd] SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 1.28855 s, 3.2 kB/s\n'; <rc> = 0 (commands:93) 09:56:15 2017-01-29 09:54:00,901 INFO (mailbox-hsm) [storage.MailBox.HsmMailMonitor] HSM_MailboxMonitor - Incoming mail monitoring thread stopped, clearing outgoing mail (storage_mailbox:518) 09:56:15 2017-01-29 09:54:00,901 INFO (mailbox-hsm) [storage.MailBox.HsmMailMonitor] HSM_MailMonitor sending mail to SPM - ['/usr/bin/dd', 'of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox', 'iflag=fullblock', 'oflag=direct', 'conv=notrunc', 'bs=512', 'seek=56'] (storage_mailbox:394) 09:56:15 2017-01-29 09:54:00,902 DEBUG (mailbox-hsm) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-1 /usr/bin/dd of=/var/tmp/tmp71q4uJ/5d928855-b09b-47a7-b920-bd2d2eb5808c/mastersd/dom_md/inbox iflag=fullblock oflag=direct conv=notrunc bs=512 seek=56 (cwd None) (commands:69) 09:56:15 2017-01-29 09:54:00,912 DEBUG (mailbox-hsm/4) [root] FINISH thread <Thread(mailbox-hsm/4, started daemon 140445071226624)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,911 DEBUG (mailbox-hsm/2) [root] FINISH thread <Thread(mailbox-hsm/2, started daemon 140444826597120)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,910 DEBUG (mailbox-hsm/1) [root] FINISH thread <Thread(mailbox-hsm/1, started daemon 140445631756032)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,910 DEBUG (mailbox-hsm/3) [root] FINISH thread <Thread(mailbox-hsm/3, started daemon 140444784633600)> (concurrent:186) 09:56:15 2017-01-29 09:54:00,910 DEBUG (mailbox-hsm/0) [root] FINISH thread <Thread(mailbox-hsm/0, started daemon 140445079619328)> (concurrent:186) 09:56:15 2017-01-29 09:54:02,394 DEBUG (mailbox-hsm) [storage.Misc.excCmd] SUCCESS: <err> = '8+0 records in\n8+0 records out\n4096 bytes (4.1 kB, 4.0 KiB) copied, 1.47883 s, 2.8 kB/s\n'; <rc> = 0 (commands:93) 09:56:15 2017-01-29 09:54:02,394 DEBUG (mailbox-hsm) [storage.MailBox.HsmMailMonitor] FINISH thread <Thread(mailbox-hsm, started daemon 140445049673472)> (concurrent:186)