Hi,
I'm doing some manual storage tests on a RHEL 8.1 host (kernel version
4.18.0-107.el8.x86_64) and ran into the following issue: when I try to move an
image to a block SD while the RHEL 8.1 host is the SPM, qemu-img gets stuck.
There are blk_cloned_rq_check_limits errors on the RHEL host and
max_write_same_len errors on the iSCSI target (see below). The iSCSI target
runs CentOS 7.6.
Before running the qemu-img command, everything works fine (e.g. cat of a file
from the iSCSI target); after running qemu-img, all I/O on the iSCSI target is
very slow.
When I run the qemu-img command manually, it is very slow (a 1 GB image takes
several minutes) but it finishes; when it is run from vdsm, it seems to hang
forever.
In similar cases I found, the problem could be fixed by adjusting various
values in /sys/block/#DEVICE/queue/, but in this case, AFAICT, all values are
correct (the same as on a CentOS 7.7 host where everything works). In other
cases it was identified as a kernel bug and the suggestion was to upgrade to a
newer kernel.
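To compare the queue limits of the two hosts line by line, a small helper along
these lines may be useful. It is only a sketch: the function name
dump_queue_limits and the selection of attributes are assumptions, and dm-8 is
simply the multipath device named in the kernel log further down.

```shell
# Print selected block-queue limits as "name=value" lines so that the
# output from two hosts can be compared with diff.
# $1 is a queue directory, e.g. /sys/block/dm-8/queue (adjust the device
# name for your system; dm-8 is taken from the kernel log in this mail).
dump_queue_limits() {
    local queue_dir=$1 attr
    for attr in max_sectors_kb max_hw_sectors_kb max_segments \
                write_same_max_bytes write_zeroes_max_bytes \
                discard_max_bytes; do
        if [ -r "$queue_dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$queue_dir/$attr")"
        else
            # Attribute not exposed by this kernel or device.
            printf '%s=<missing>\n' "$attr"
        fi
    done
}
```

Running "dump_queue_limits /sys/block/dm-8/queue" on each host, and again for
the underlying sdX path devices, and then diffing the saved outputs should show
whether any limit differs between the multipath device and its paths.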
Do you know of any workaround to make it work? Or does it sound to you like a
kernel bug that should be reported?
Thanks
Vojta
Example of qemu-img command:
[root@localhost ~]# /usr/bin/qemu-img convert -p -t none -T none -f raw \
    /rhev/data-center/mnt/blockSD/ceba76d5-9d5d-4b04-9226-dffc7e77d4be/images/9aafce3d-53a6-41d7-b0d5-0e1db8aa59fc/7b7c7800-4837-4c3f-967c-c942b9c54e1b \
    -O raw -W \
    /rhev/data-center/mnt/blockSD/a17d7695-50fc-412e-abc7-008ebb6cdf23/images/9aafce3d-53a6-41d7-b0d5-0e1db8aa59fc/7b7c7800-4837-4c3f-967c-c942b9c54e1b
(100.00/100%)
Errors on RHEL 8.1 host:
[ 356.658973] print_req_error: critical target error, dev dm-8, sector 31203328 flags 9
[ 356.658982] print_req_error: critical target error, dev dm-8, sector 31268863 flags 9
[ 356.658984] print_req_error: critical target error, dev dm-8, sector 31334398 flags 9
[ 356.658986] print_req_error: critical target error, dev dm-8, sector 31399933 flags 9
[ 356.658987] print_req_error: critical target error, dev dm-8, sector 31465468 flags 9
[ 356.658989] print_req_error: critical target error, dev dm-8, sector 31531003 flags 9
[ 356.658999] print_req_error: critical target error, dev dm-8, sector 31596538 flags 9
[ 356.659001] print_req_error: critical target error, dev dm-8, sector 31662073 flags 9
[ 356.659008] blk_cloned_rq_check_limits: over max size limit.
[ 356.659044] device-mapper: multipath: Failing path 8:80.
[ 356.669966] print_req_error: critical target error, dev dm-8, sector 31727608 flags 9
[ 356.669987] print_req_error: critical target error, dev dm-8, sector 31793143 flags 9
[ 358.923614] device-mapper: multipath: Reinstating path 8:80.
[ 358.933888] sd 6:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA
[ 358.933980] blk_cloned_rq_check_limits: over max size limit.
[ 358.934008] device-mapper: multipath: Failing path 8:80.
[ 364.927647] device-mapper: multipath: Reinstating path 8:80.
[ 364.956799] sd 6:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA
[ 364.956890] blk_cloned_rq_check_limits: over max size limit.
[...]
[ 475.001255] blk_cloned_rq_check_limits: over max size limit.
[ 475.001271] device-mapper: multipath: Failing path 8:80.
[ 479.995729] device-mapper: multipath: Reinstating path 8:80.
[ 480.006078] sd 6:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA
[ 480.006114] blk_cloned_rq_check_limits: over max size limit.
[ 480.006134] device-mapper: multipath: Failing path 8:80.
[ 484.997257] device-mapper: multipath: Reinstating path 8:80.
[ 485.006868] sd 6:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA
[ 485.006908] blk_cloned_rq_check_limits: over max size limit.
[ 485.006960] device-mapper: multipath: Failing path 8:80.
[ 489.998641] device-mapper: multipath: Reinstating path 8:80.
[ 490.010288] sd 6:0:0:4: alua: port group 00 state A non-preferred supports TOlUSNA
[ 490.010430] blk_cloned_rq_check_limits: over max size limit.
[ 490.010455] device-mapper: multipath: Failing path 8:80.
[ 492.922671] INFO: task qemu-img:9509 blocked for more than 120 seconds.
[ 492.922675] Tainted: G --------- -t - 4.18.0-107.el8.x86_64 #1
[ 492.922676] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 492.922677] qemu-img D 0 9509 7416 0x00000080
[ 492.922679] Call Trace:
[ 492.922703] ? __schedule+0x253/0x830
[ 492.922704] schedule+0x28/0x70
[ 492.922705] schedule_timeout+0x26d/0x390
[ 492.922711] ? blk_flush_plug_list+0xd7/0x100
[ 492.922713] io_schedule_timeout+0x19/0x40
[ 492.922714] wait_for_completion_io+0x11f/0x190
[ 492.922717] ? wake_up_q+0x70/0x70
[ 492.922719] submit_bio_wait+0x5b/0x80
[ 492.922728] blkdev_issue_zeroout+0x142/0x220
[ 492.922737] blkdev_fallocate+0x11e/0x190
[ 492.922745] vfs_fallocate+0x13f/0x280
[ 492.922753] ksys_fallocate+0x3c/0x80
[ 492.922754] __x64_sys_fallocate+0x1a/0x20
[ 492.922756] do_syscall_64+0x5b/0x1b0
[ 492.922764] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 492.922767] RIP: 0033:0x7f9e2f8f1e85
[ 492.922771] Code: Bad RIP value.
[ 492.922771] RSP: 002b:00007f9e2d79ba20 EFLAGS: 00000293 ORIG_RAX: 000000000000011d
[ 492.922772] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f9e2f8f1e85
[ 492.922773] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 000000000000000a
[ 492.922773] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
[ 492.922774] R10: 0000000040000000 R11: 0000000000000293 R12: 0000000000000000
[ 492.922774] R13: 0000000040000000 R14: 000055b21fe81390 R15: 00007f9e2d79bbc0
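The log above repeats the same cycle (over-limit error, path failed, path
reinstated) roughly every five seconds. A rough way to count these events in a
longer log is sketched below; the function name summarize_mpath_log is made up
for illustration, and the patterns are copied from the messages shown above.

```shell
# Count the recurring events in a kernel log read from stdin.
# The three patterns match the message texts quoted in this mail.
summarize_mpath_log() {
    awk '
        /blk_cloned_rq_check_limits: over max size limit/ { over++ }
        /multipath: Failing path/                         { failed++ }
        /multipath: Reinstating path/                     { reinstated++ }
        END {
            printf "over_limit=%d failing=%d reinstating=%d\n",
                   over, failed, reinstated
        }'
}
```

It could be fed with "dmesg | summarize_mpath_log" on the affected host.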
Errors on iSCSI target:
[ 703.199374] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.200836] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.202089] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.203289] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.204479] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.205819] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.206878] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.208020] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.209084] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.210357] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.211439] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.212584] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.213756] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
[ 703.214813] WRITE_SAME sectors: 65535 exceeds max_write_same_len: 4096
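For reference, the failing sectors reported on the RHEL host advance in steps
of 65535, which matches the WRITE_SAME size the target rejects. The mismatch
can be put in numbers with the sketch below. The helper names are invented for
illustration, and the 512-byte sector size is an assumption, since the logs do
not state the logical block size of the LUN.

```shell
# Convert a sector count to bytes. The 512-byte default is an assumption;
# pass the real logical block size as the second argument if it differs.
sectors_to_bytes() {
    local sectors=$1 sector_size=${2:-512}
    echo $(( sectors * sector_size ))
}

# Report whether a WRITE_SAME request of $1 sectors fits a limit of $2 sectors.
check_write_same() {
    local request=$1 limit=$2
    if [ "$request" -le "$limit" ]; then
        echo "ok: $request sectors <= limit $limit"
    else
        echo "too large: $request sectors > limit $limit ($(sectors_to_bytes "$request") > $(sectors_to_bytes "$limit") bytes)"
    fi
}
```

With the values from the target log, "check_write_same 65535 4096" reports the
request as too large: about 32 MiB requested against a 2 MiB limit under the
512-byte assumption.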