On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi <gianluca.cecchi(a)gmail.com>
wrote:
Hello,
I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total
of about 200 GB to copy.
The target I choose is on a different domain than the source one.
They are both FC storage domains, with the source on SSD disks and the
target on SAS disks.
The disks are preallocated.
Now I have 3 processes of this kind:
/usr/bin/qemu-img convert -p -t none -T none -f raw \
    /rhev/data-center/59b7af54-0155-01c2-0248-000000000195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450 \
    -O raw \
    /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca
but despite the capabilities of the hardware, the copy seems to be using
very few system resources.
We run qemu-img convert (and other storage related commands) with:
nice -n 19 ionice -c 3 qemu-img ...
ionice should not have any effect unless you use the CFQ I/O scheduler.
The intent is to limit the effect on virtual machines.
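To check whether ionice can matter on a given host, one can look at the active I/O scheduler of each block device. This is a minimal sketch, assuming a Linux host that exposes /sys/block/<dev>/queue/scheduler with the active scheduler shown in square brackets; the helper name active_scheduler is only illustrative.

```shell
# Print the active scheduler from a sysfs scheduler line,
# e.g. "noop [deadline] cfq" -> "deadline".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# List the active scheduler of every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done
```

If the devices backing the storage domains do not report cfq, the ionice part of the command is expected to be a no-op, as noted above.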
I see this with both iotop and vmstat.
vmstat 3 gives:
----io---- -system-- ------cpu-----
  bi   bo   in    cs us sy id wa st
2527  698 3771 29394  1  0 89 10  0
us 94% also seems very high - maybe this hypervisor is overloaded with
other workloads?
wa 89% seems very high
iotop -d 5 -k -o -P gives:
Total DISK READ : 472.73 K/s | Total DISK WRITE : 17.05 K/s
Actual DISK READ: 1113.23 K/s | Actual DISK WRITE: 55.86 K/s
PID   PRIO USER    DISK READ> DISK WRITE SWAPIN      IO COMMAND
2124  be/4 sanlock 401.39 K/s   0.20 K/s 0.00 %  0.00 % sanlock daemon
2146  be/4 vdsm     50.96 K/s   0.00 K/s 0.00 %  0.00 % python /usr/share/o~a-broker --no-daemon
30379 be/0 root      7.06 K/s   0.00 K/s 0.00 % 98.09 % lvm vgck --config ~50-a7fe-f83675a2d676
30380 be/0 root      4.70 K/s   0.00 K/s 0.00 % 98.09 % lvm lvchange --conf~59-b931-4eb61e43b56b
30381 be/0 root      4.70 K/s   0.00 K/s 0.00 % 98.09 % lvm lvchange --conf~83675a2d676/metadata
30631 be/0 root      3.92 K/s   0.00 K/s 0.00 % 98.09 % lvm vgs --config d~f6-9466-553849aba5e9
2052  be/3 root      0.00 K/s   2.35 K/s 0.00 %  0.00 % [jbd2/dm-34-8]
6458  be/4 qemu      0.00 K/s   4.70 K/s 0.00 %  0.00 % qemu-kvm -name gues~x7 -msg timestamp=on
2064  be/3 root      0.00 K/s   0.00 K/s 0.00 %  0.00 % [jbd2/dm-32-8]
2147  be/4 root      0.00 K/s   4.70 K/s 0.00 %  0.00 % rsyslogd -n
9145  idle vdsm      0.00 K/s   0.59 K/s 0.00 % 24.52 % qemu-img convert -p~23-989a-4811f80956ca
13313 be/4 root      0.00 K/s   0.00 K/s 0.00 %  0.00 % [kworker/u112:3]
9399  idle vdsm      0.00 K/s   0.59 K/s 0.00 % 24.52 % qemu-img convert -p~51-9c8c-8d9aaa7e8f58
0.59 K/s seems extremely low, I don't expect such value.
1310  ?dif root      0.00 K/s   0.00 K/s 0.00 %  0.00 % multipathd
3996  be/4 vdsm      0.00 K/s   0.78 K/s 0.00 %  0.00 % python /usr/sbin/mo~c /etc/vdsm/mom.conf
6391  be/4 root      0.00 K/s   0.00 K/s 0.00 %  0.00 % [kworker/u112:0]
2059  be/3 root      0.00 K/s   3.14 K/s 0.00 %  0.00 % [jbd2/dm-33-8]
Is it expected? Any way to speed up the process?
I would try to perform the same copy from the shell, without ionice
and nice, and see if this improves the times.
Can you do a test with a small image (e.g. 10 GiB), running qemu-img with strace?
strace -f -o qemu-img.strace qemu-img convert \
-p \
-t none \
-T none \
-f raw \
/dev/fad05d79-254d-4f40-8201-360757128ede/<lv-name> \
-O raw \
/dev/6911716c-aa99-4750-a7fe-f83675a2d676/<lv-name>
and share the trace?
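Once a trace file exists, a quick per-syscall count can hint at whether the copy is issuing many small reads and writes. This is a rough sketch, assuming the usual line layout of "strace -f -o FILE" (a PID, then "name(args) = result"); the function name syscall_counts is hypothetical.

```shell
# Count syscalls by name in a trace written by "strace -f -o FILE".
# Lines that are not plain "PID name(...)" records (signals, exit
# markers, "<... resumed>" continuations) are skipped.
syscall_counts() {
    awk '$2 ~ /^[a-z_0-9]+\(/ {
             name = $2
             sub(/\(.*/, "", name)
             count[name]++
         }
         END { for (n in count) printf "%d %s\n", count[n], n }' "$1" |
        sort -rn
}

# Usage: syscall_counts qemu-img.strace | head
```

This only counts calls; request sizes and timings still have to be read from the trace itself.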
Also version info (kernel, qemu) would be useful.
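The requested version info can be collected on the hypervisor with something like the following. It is a small sketch that assumes qemu-img is on the PATH of the user running it.

```shell
# Kernel release of the hypervisor.
uname -r

# qemu-img version, if the tool is on PATH.
if command -v qemu-img >/dev/null 2>&1; then
    qemu-img --version
fi
```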
Adding Kevin.
Nir