[ovirt-users] Performance of cloning
Gianluca Cecchi
gianluca.cecchi at gmail.com
Thu Sep 28 12:39:03 UTC 2017
On Thu, Sep 28, 2017 at 2:34 PM, Kevin Wolf <kwolf at redhat.com> wrote:
> Am 28.09.2017 um 12:44 hat Nir Soffer geschrieben:
> > On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:
> >
> > > Hello,
> > > I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks for a total
> > > of about 200 GB to copy.
> > > The target I choose is on a different domain than the source one.
> > > They are both FC storage domains, with the source on SSD disks and the
> > > target on SAS disks.
>
> [snip]
>
> > >
> > > but despite capabilities it seems it is copying using very low system
> > > resources.
> > >
> >
> > We run qemu-img convert (and other storage related commands) with:
> >
> > nice -n 19 ionice -c 3 qemu-img ...
> >
> > ionice should not have any effect unless you use the CFQ I/O scheduler.
> >
> > The intent is to limit the effect on virtual machines.
> >
>
Ah, ok.
The hypervisor is oVirt Node based on CentOS 7, so the default scheduler
should be deadline unless it has been customized on the node.
And in fact in /sys/block/sd*/queue/scheduler I see only [deadline], and
the dm-* block devices, where the value is not none, show deadline
too.
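
For anyone who wants to check this across all block devices at once, here is a minimal sketch in Python. It assumes the usual sysfs layout (/sys/block/<device>/queue/scheduler) and the kernel convention of marking the active scheduler with square brackets; the function names are only illustrative.

```python
import glob
import os


def active_scheduler(text):
    """Return the scheduler marked with [brackets], or the bare value (e.g. 'none')."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # No bracketed entry: some devices (e.g. dm-*) just report a single word.
    return text.strip()


def schedulers(pattern="/sys/block/*/queue/scheduler"):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for path in glob.glob(pattern):
        # .../<device>/queue/scheduler -> <device>
        device = os.path.basename(os.path.dirname(os.path.dirname(path)))
        with open(path) as handle:
            result[device] = active_scheduler(handle.read())
    return result


if __name__ == "__main__":
    for device, name in sorted(schedulers().items()):
        print(device, name)
```

If every device reports deadline or none, the ionice -c 3 part of the command line should make no difference, as Nir says, since it only matters under CFQ.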
> >
> > > I see this both using iotop and vmstat
> > >
> > > vmstat 3 gives:
> > > ----io---- -system-- ------cpu-----
> > > bi bo in cs us sy id wa st
> > > 2527 698 3771 29394 1 0 89 10 0
> > >
> >
> > us 94% also seems very high - maybe this hypervisor is overloaded with
> > other workloads?
> > wa 89% seems very high
>
> The alignment in the table is a bit off, but us is 1%. The 94 you saw is
> part of cs=29394. A high percentage for wait is generally a good sign
> because that means that the system is busy with actual I/O work.
> Obviously, this I/O work is rather slow, but at least qemu-img is making
> requests to the kernel instead of doing other work, otherwise user would
> be much higher.
>
> Kevin
>
Yes, probably a misalignment in the output I truncated. Actually, sampling
about 300 lines (one every 3 seconds) gives the following number of lines
for each percentage value:
user time:
195 0%
95 1%
So user time is indeed quite low
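
To avoid misreading the columns again, the vmstat row can be split on whitespace and matched to the header names instead of relying on visual alignment. This is only a sketch: HEADER is the truncated header I pasted above, not the full vmstat header, and parse_row and tally are illustrative names.

```python
from collections import Counter

# Truncated header as pasted in this thread (the real vmstat output has more columns).
HEADER = "bi bo in cs us sy id wa st"


def parse_row(line, header=HEADER):
    """Map one vmstat data line to {column name: integer value}."""
    names = header.split()
    values = line.split()
    if len(values) != len(names):
        raise ValueError("expected %d columns, got %d" % (len(names), len(values)))
    return dict(zip(names, map(int, values)))


def tally(rows, column):
    """Count how many sampled rows show each value of the given column."""
    return Counter(row[column] for row in rows)


row = parse_row("2527 698 3771 29394 1 0 89 10 0")
print(row["cs"], row["us"], row["id"], row["wa"])  # prints: 29394 1 89 10
```

Read this way, the sample line has cs=29394, us=1, id=89 and wa=10, so the 89 belongs to idle, not to wait.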
wait time:
105 7%
58 8%
33 9%
21 6%
17 10%
16 5%
16%
12 11%
9 12%
7 14%
6 13%
2 15%
1 4%
1 16%
1 0%
with an average wait time of about 7%.
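
A small sketch of how such a summary can be computed from the counts above. The one row whose count did not survive in the listing is left out, so the result is only approximate; WAIT_COUNTS and the helper names are illustrative.

```python
# wait percentage -> number of sampled lines, taken from the listing above.
# The row with a missing count is omitted on purpose.
WAIT_COUNTS = {
    7: 105, 8: 58, 9: 33, 6: 21, 10: 17, 5: 16, 11: 12,
    12: 9, 14: 7, 13: 6, 15: 2, 4: 1, 16: 1, 0: 1,
}


def weighted_mean(counts):
    """Mean of the values, each weighted by how often it was observed."""
    total = sum(counts.values())
    return sum(value * n for value, n in counts.items()) / total


def most_frequent(counts):
    """Value with the highest number of occurrences."""
    return max(counts, key=counts.get)


print("samples:", sum(WAIT_COUNTS.values()))
print("most frequent wait: %d%%" % most_frequent(WAIT_COUNTS))
print("weighted mean wait: %.1f%%" % weighted_mean(WAIT_COUNTS))
```

On the complete rows the most frequent value is 7%, while the weighted mean comes out slightly higher, close to 8%.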
In the end the overall copy performance was around 30 MB/s, which is
probably to be expected due to how the qemu-img process is run.
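
As a rough cross-check of what that rate means for the whole clone, here is a back-of-the-envelope sketch. It assumes the "about 200 GB" figure uses 1 GB = 1024 MB and that the rate stays constant, neither of which I measured precisely.

```python
def copy_seconds(size_gb, rate_mb_per_s, mb_per_gb=1024):
    """Estimated copy duration in seconds at a constant transfer rate."""
    return size_gb * mb_per_gb / rate_mb_per_s


def as_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    seconds = int(round(seconds))
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return "%d:%02d:%02d" % (hours, minutes, secs)


print(as_hms(copy_seconds(200, 30)))  # prints: 1:53:47
```

So at this rate the three disks take a bit under two hours in total.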
What about the events I reported instead?
Gianluca