Performance of cloning

Hello,
I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total of about 200Gb to copy. The target I chose is on a different domain than the source one. They are both FC storage domains, with the source on SSD disks and the target on SAS disks. The disks are preallocated.

Now I have 3 processes of this kind:

/usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/59b7af54-0155-01c2-0248-000000000195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450 -O raw /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca

but despite the hardware's capabilities it seems the copy is using very low system resources. I see this both using iotop and vmstat.

vmstat 3 gives:

----io---- -system-- ------cpu-----
  bi   bo   in    cs us sy id wa st
2527  698 3771 29394  1  0 89 10  0

iotop -d 5 -k -o -P gives:

Total DISK READ :   472.73 K/s | Total DISK WRITE :  17.05 K/s
Actual DISK READ:  1113.23 K/s | Actual DISK WRITE:  55.86 K/s
  PID  PRIO  USER     DISK READ>  DISK WRITE  SWAPIN      IO    COMMAND
 2124  be/4  sanlock  401.39 K/s    0.20 K/s  0.00 %   0.00 %  sanlock daemon
 2146  be/4  vdsm      50.96 K/s    0.00 K/s  0.00 %   0.00 %  python /usr/share/o~a-broker --no-daemon
30379  be/0  root       7.06 K/s    0.00 K/s  0.00 %  98.09 %  lvm vgck --config ~50-a7fe-f83675a2d676
30380  be/0  root       4.70 K/s    0.00 K/s  0.00 %  98.09 %  lvm lvchange --conf~59-b931-4eb61e43b56b
30381  be/0  root       4.70 K/s    0.00 K/s  0.00 %  98.09 %  lvm lvchange --conf~83675a2d676/metadata
30631  be/0  root       3.92 K/s    0.00 K/s  0.00 %  98.09 %  lvm vgs --config d~f6-9466-553849aba5e9
 2052  be/3  root       0.00 K/s    2.35 K/s  0.00 %   0.00 %  [jbd2/dm-34-8]
 6458  be/4  qemu       0.00 K/s    4.70 K/s  0.00 %   0.00 %  qemu-kvm -name gues~x7 -msg timestamp=on
 2064  be/3  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  [jbd2/dm-32-8]
 2147  be/4  root       0.00 K/s    4.70 K/s  0.00 %   0.00 %  rsyslogd -n
 9145  idle  vdsm       0.00 K/s    0.59 K/s  0.00 %  24.52 %  qemu-img convert -p~23-989a-4811f80956ca
13313  be/4  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  [kworker/u112:3]
 9399  idle  vdsm       0.00 K/s    0.59 K/s  0.00 %  24.52 %  qemu-img convert -p~51-9c8c-8d9aaa7e8f58
 1310  ?dif  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  multipathd
 3996  be/4  vdsm       0.00 K/s    0.78 K/s  0.00 %   0.00 %  python /usr/sbin/mo~c /etc/vdsm/mom.conf
 6391  be/4  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  [kworker/u112:0]
 2059  be/3  root       0.00 K/s    3.14 K/s  0.00 %   0.00 %  [jbd2/dm-33-8]

Is it expected? Any way to speed up the process?

Thanks,
Gianluca

On Thu, Sep 28, 2017 at 11:02 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total of about 200Gb to copy. The target I chose is on a different domain than the source one. They are both FC storage domains, with the source on SSD disks and the target on SAS disks.
The disks are preallocated
Now I have 3 processes of this kind:

/usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/59b7af54-0155-01c2-0248-000000000195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450 -O raw /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca
but despite the hardware's capabilities it seems the copy is using very low system resources.
[snip]
Is it expected? Any way to speed up the process?
Thanks, Gianluca
The cloning process elapsed time was 101'. The 3 disks are 85Gb, 20Gb and 80Gb, so in the end an average of about 30MB/s.

At the moment I have only one host in this environment, with the self-hosted engine VM running on it; I'm planning to add another host shortly, so I have not yet configured power management for fencing on it.

During the cloning I saw these kinds of events:

Sep 28, 2017 10:31:30 AM VM vmclone1 creation was initiated by admin@internal-authz.
Sep 28, 2017 11:16:38 AM VDSM command SetVolumeDescriptionVDS failed: Message timeout which can be caused by communication issues
Sep 28, 2017 11:19:43 AM VDSM command SetVolumeDescriptionVDS failed: Message timeout which can be caused by communication issues
Sep 28, 2017 11:19:43 AM Failed to update OVF disks 1504a878-4fe2-40df-a88f-6f073be0bd7b, 4ddac3ed-2bb9-485c-bf57-1750ac1fd761, OVF data isn't updated on those OVF stores (Data Center DC1, Storage Domain SDTEST).

At 11:24 I then started a pre-existing VM named benchvm and ran a CPU / I/O benchmark (HammerDB with 13 concurrent users; the VM is configured with 12 vcpus (1:6:2) and 64Gb of RAM; it is not the one I'm cloning) that ran from 11:40 to 12:02.

Sep 28, 2017 11:24:29 AM VM benchvm started on Host host1
Sep 28, 2017 11:45:18 AM Host host1 is not responding. Host cannot be fenced automatically because power management for the host is disabled.
Sep 28, 2017 11:45:28 AM Failed to update OVF disks 1504a878-4fe2-40df-a88f-6f073be0bd7b, 4ddac3ed-2bb9-485c-bf57-1750ac1fd761, OVF data isn't updated on those OVF stores (Data Center DC1, Storage Domain SDTEST).
Sep 28, 2017 11:45:39 AM Status of host host1 was set to Up.
Sep 28, 2017 12:12:31 PM VM vmclone1 creation has been completed.

Any hint on the failures detected, both when only the cloning process was in place and when a bench was running inside a VM?

Thanks,
Gianluca
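[Editor's note: the quoted 30MB/s average is easy to double-check from the figures above; a quick shell sanity calculation (integer arithmetic):]

```shell
# 85 + 20 + 80 GiB copied in 101 minutes
total_gb=$((85 + 20 + 80))
seconds=$((101 * 60))
echo "$(( total_gb * 1024 / seconds )) MB/s"   # prints: 31 MB/s
```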

On Thu, Sep 28, 2017 at 1:39 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
On Thu, Sep 28, 2017 at 11:02 AM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total of about 200Gb to copy. The target I chose is on a different domain than the source one. They are both FC storage domains, with the source on SSD disks and the target on SAS disks.
The disks are preallocated
Now I have 3 processes of this kind:

/usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/59b7af54-0155-01c2-0248-000000000195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450 -O raw /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca
but despite the hardware's capabilities it seems the copy is using very low system resources.
[snip]
Is it expected? Any way to speed up the process?
Thanks, Gianluca
The cloning process elapsed time was 101'. The 3 disks are 85Gb, 20Gb and 80Gb, so in the end an average of about 30MB/s.
At the moment I have only one host in this environment, with the self-hosted engine VM running on it; I'm planning to add another host shortly, so I have not yet configured power management for fencing on it. During the cloning I saw these kinds of events:
Sep 28, 2017 10:31:30 AM VM vmclone1 creation was initiated by admin@internal-authz.
Sep 28, 2017 11:16:38 AM VDSM command SetVolumeDescriptionVDS failed: Message timeout which can be caused by communication issues
Sep 28, 2017 11:19:43 AM VDSM command SetVolumeDescriptionVDS failed: Message timeout which can be caused by communication issues
This looks like a timeout on the vdsm side; you can find more info about these errors in the vdsm log.
Sep 28, 2017 11:19:43 AM Failed to update OVF disks 1504a878-4fe2-40df-a88f-6f073be0bd7b, 4ddac3ed-2bb9-485c-bf57-1750ac1fd761, OVF data isn't updated on those OVF stores (Data Center DC1, Storage Domain SDTEST).
Same
At 11:24 I then started a pre-existing VM named benchvm and ran a CPU / I/O benchmark (HammerDB with 13 concurrent users; the VM is configured with 12 vcpus (1:6:2) and 64Gb of RAM; it is not the one I'm cloning) that ran from 11:40 to 12:02.

Sep 28, 2017 11:24:29 AM VM benchvm started on Host host1
Sep 28, 2017 11:45:18 AM Host host1 is not responding. Host cannot be fenced automatically because power management for the host is disabled.
Same
Sep 28, 2017 11:45:28 AM Failed to update OVF disks 1504a878-4fe2-40df-a88f-6f073be0bd7b, 4ddac3ed-2bb9-485c-bf57-1750ac1fd761, OVF data isn't updated on those OVF stores (Data Center DC1, Storage Domain SDTEST).
Same
Sep 28, 2017 11:45:39 AM Status of host host1 was set to Up.
Sep 28, 2017 12:12:31 PM VM vmclone1 creation has been completed.
Any hint on the failures detected, both when only the cloning process was in place and when a bench was running inside a VM?
I would check the host logs (vdsm and messages).
Thanks,
Gianluca

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users

On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total of about 200Gb to copy. The target I chose is on a different domain than the source one. They are both FC storage domains, with the source on SSD disks and the target on SAS disks.
The disks are preallocated
Now I have 3 processes of this kind:

/usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/59b7af54-0155-01c2-0248-000000000195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450 -O raw /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca
but despite the hardware's capabilities it seems the copy is using very low system resources.
We run qemu-img convert (and other storage related commands) with:

nice -n 19 ionice -c 3 qemu-img ...

ionice should not have any effect unless you use the CFQ I/O scheduler. The intent is to limit the effect on virtual machines.
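[Editor's note: whether ionice -c 3 matters at all can be confirmed from the per-device scheduler file, which lists the available schedulers with the active one in brackets. A small sketch of extracting the active name; the sample line is illustrative — on a real host you would read /sys/block/sd*/queue/scheduler instead:]

```shell
# sample contents of /sys/block/sdX/queue/scheduler; the bracketed
# entry is the scheduler actually in use
line='noop [deadline] cfq'
active=$(printf '%s\n' "$line" | sed 's/.*\[\(.*\)\].*/\1/')
echo "$active"   # prints: deadline
```

With anything other than cfq active, the ionice class is effectively ignored.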
I see this both using iotop and vmstat
vmstat 3 gives:

----io---- -system-- ------cpu-----
  bi   bo   in    cs us sy id wa st
2527  698 3771 29394  1  0 89 10  0
us 94% also seems very high - maybe this hypervisor is overloaded with other workloads? wa 89% seems very high
iotop -d 5 -k -o -P gives:
Total DISK READ :   472.73 K/s | Total DISK WRITE :  17.05 K/s
Actual DISK READ:  1113.23 K/s | Actual DISK WRITE:  55.86 K/s
  PID  PRIO  USER     DISK READ>  DISK WRITE  SWAPIN      IO    COMMAND
 2124  be/4  sanlock  401.39 K/s    0.20 K/s  0.00 %   0.00 %  sanlock daemon
 2146  be/4  vdsm      50.96 K/s    0.00 K/s  0.00 %   0.00 %  python /usr/share/o~a-broker --no-daemon
30379  be/0  root       7.06 K/s    0.00 K/s  0.00 %  98.09 %  lvm vgck --config ~50-a7fe-f83675a2d676
30380  be/0  root       4.70 K/s    0.00 K/s  0.00 %  98.09 %  lvm lvchange --conf~59-b931-4eb61e43b56b
30381  be/0  root       4.70 K/s    0.00 K/s  0.00 %  98.09 %  lvm lvchange --conf~83675a2d676/metadata
30631  be/0  root       3.92 K/s    0.00 K/s  0.00 %  98.09 %  lvm vgs --config d~f6-9466-553849aba5e9
 2052  be/3  root       0.00 K/s    2.35 K/s  0.00 %   0.00 %  [jbd2/dm-34-8]
 6458  be/4  qemu       0.00 K/s    4.70 K/s  0.00 %   0.00 %  qemu-kvm -name gues~x7 -msg timestamp=on
 2064  be/3  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  [jbd2/dm-32-8]
 2147  be/4  root       0.00 K/s    4.70 K/s  0.00 %   0.00 %  rsyslogd -n
 9145  idle  vdsm       0.00 K/s    0.59 K/s  0.00 %  24.52 %  qemu-img convert -p~23-989a-4811f80956ca
13313  be/4  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  [kworker/u112:3]
 9399  idle  vdsm       0.00 K/s    0.59 K/s  0.00 %  24.52 %  qemu-img convert -p~51-9c8c-8d9aaa7e8f58
0.59 K/s seems extremely low; I would not expect such a value.
 1310  ?dif  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  multipathd
 3996  be/4  vdsm       0.00 K/s    0.78 K/s  0.00 %   0.00 %  python /usr/sbin/mo~c /etc/vdsm/mom.conf
 6391  be/4  root       0.00 K/s    0.00 K/s  0.00 %   0.00 %  [kworker/u112:0]
 2059  be/3  root       0.00 K/s    3.14 K/s  0.00 %   0.00 %  [jbd2/dm-33-8]
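[Editor's note: the per-process rates being discussed can be pulled out of an iotop -P batch line mechanically; a small awk sketch over one of the qemu-img lines quoted above, assuming the field layout shown in this output:]

```shell
# extract PID, read rate and write rate from one iotop -P line
echo '9145 idle vdsm 0.00 K/s 0.59 K/s 0.00 % 24.52 % qemu-img convert' |
awk '{ print "pid=" $1, "read=" $4 $5, "write=" $6 $7 }'
# prints: pid=9145 read=0.00K/s write=0.59K/s
```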
Is it expected? Any way to speed up the process?
I would try to perform the same copy from the shell, without ionice and nice, and see if this improves the times.

Can you do a test with a small image (e.g. 10g), running qemu-img with strace:

strace -f -o qemu-img.strace qemu-img convert \
    -p \
    -t none \
    -T none \
    -f raw \
    /dev/fad05d79-254d-4f40-8201-360757128ede/<lv-name> \
    -O raw \
    /dev/6911716c-aa99-4750-a7fe-f83675a2d676/<lv-name>

and share the trace? Also version info (kernel, qemu) would be useful.

Adding Kevin.

Nir

On 28.09.2017 at 12:44, Nir Soffer wrote:
On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total of about 200Gb to copy. The target I chose is on a different domain than the source one. They are both FC storage domains, with the source on SSD disks and the target on SAS disks.
The disks are preallocated
Now I have 3 processes of this kind:

/usr/bin/qemu-img convert -p -t none -T none -f raw /rhev/data-center/59b7af54-0155-01c2-0248-000000000195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450 -O raw /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca
but despite the hardware's capabilities it seems the copy is using very low system resources.
We run qemu-img convert (and other storage related commands) with:
nice -n 19 ionice -c 3 qemu-img ...
ionice should not have any effect unless you use the CFQ I/O scheduler.
The intent is to limit the effect of virtual machines.
I see this both using iotop and vmstat
vmstat 3 gives:

----io---- -system-- ------cpu-----
  bi   bo   in    cs us sy id wa st
2527  698 3771 29394  1  0 89 10  0
us 94% also seems very high - maybe this hypervisor is overloaded with other workloads? wa 89% seems very high
The alignment in the table is a bit off, but us is 1%. The 94 you saw is part of cs=29394.

A high percentage for wait is generally a good sign, because it means that the system is busy with actual I/O work. Obviously, this I/O work is rather slow, but at least qemu-img is making requests to the kernel instead of doing other work, otherwise user time would be much higher.

Kevin
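[Editor's note: Kevin's reading can be confirmed by pairing each vmstat header with its value instead of trusting the visual alignment; a quick awk sketch over the row quoted above:]

```shell
# label every vmstat column with its header; this shows us=1 (not 94)
# and wa=10 for the sampled row
printf '%s\n' 'bi bo in cs us sy id wa st' \
              '2527 698 3771 29394 1 0 89 10 0' |
awk 'NR==1 { for (i = 1; i <= NF; i++) h[i] = $i }
     NR==2 { for (i = 1; i <= NF; i++) printf "%s=%s ", h[i], $i; print "" }'
```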

On Thu, Sep 28, 2017 at 2:34 PM, Kevin Wolf <kwolf@redhat.com> wrote:
Am 28.09.2017 um 12:44 hat Nir Soffer geschrieben:
On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
Hello, I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total of about 200Gb to copy. The target I chose is on a different domain than the source one. They are both FC storage domains, with the source on SSD disks and the target on SAS disks.
[snip]
but despite the hardware's capabilities it seems the copy is using very low system resources.
We run qemu-img convert (and other storage related commands) with:
nice -n 19 ionice -c 3 qemu-img ...
ionice should not have any effect unless you use the CFQ I/O scheduler.
The intent is to limit the effect of virtual machines.
Ah, ok. The hypervisor is oVirt Node based on CentOS 7, so the default scheduler should be deadline if not customized in Node. And in fact in /sys/block/sd*/queue/scheduler I see only [deadline] selected, and for the dm-* block devices too, where the scheduler is not none, it is deadline.
I see this both using iotop and vmstat
vmstat 3 gives:

----io---- -system-- ------cpu-----
  bi   bo   in    cs us sy id wa st
2527  698 3771 29394  1  0 89 10  0
us 94% also seems very high - maybe this hypervisor is overloaded with other workloads? wa 89% seems very high
The alignment in the table is a bit off, but us is 1%. The 94 you saw is part of cs=29394.

A high percentage for wait is generally a good sign, because it means that the system is busy with actual I/O work. Obviously, this I/O work is rather slow, but at least qemu-img is making requests to the kernel instead of doing other work, otherwise user time would be much higher.
Kevin
Yes, probably a misalignment of the output I truncated. Actually a sampling of about 300 lines (one every 3 seconds) shows these numbers of line occurrences and the related percentages.

User time:
195  0%
 95  1%

So user time is indeed quite low.

Wait time:
105  7%
 58  8%
 33  9%
 21  6%
 17 10%
 16  5%
16%
 12 11%
  9 12%
  7 14%
  6 13%
  2 15%
  1  4%
  1 16%
  1  0%

so wait time averages around 7%.

At the end the overall copying performance has been around 30MB/s, which probably is to be expected due to how the qemu-img process is run.

What about the events I reported instead?

Gianluca
Participants (3):
- Gianluca Cecchi
- Kevin Wolf
- Nir Soffer