[ovirt-users] Performance of cloning

Gianluca Cecchi gianluca.cecchi at gmail.com
Thu Sep 28 12:39:03 UTC 2017


On Thu, Sep 28, 2017 at 2:34 PM, Kevin Wolf <kwolf at redhat.com> wrote:

> Am 28.09.2017 um 12:44 hat Nir Soffer geschrieben:
> > On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi <gianluca.cecchi at gmail.com>
> > wrote:
> >
> > > Hello,
> > > I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total
> > > of about 200 GB to copy.
> > > The target I choose is on a different domain than the source one.
> > > They are both FC storage domains, with the source on SSD disks and the
> > > target on SAS disks.
>


> [snip]
>


> > >
> > > but despite capabilities it seems it is copying using very low system
> > > resources.
> > >
> >
> > We run qemu-img convert (and other storage related commands) with:
> >
> >     nice -n 19 ionice -c 3 qemu-img ...
> >
> > ionice should not have any effect unless you use the CFQ I/O scheduler.
> >
> > The intent is to limit the effect of virtual machines.
> >
>

Ah, ok.
The hypervisor is oVirt Node based on CentOS 7, so the default scheduler
should be deadline unless it has been customized in the node.
In fact, /sys/block/sd*/queue/scheduler shows only [deadline] as the
active scheduler, and the dm-* block devices whose scheduler is not none
use deadline too.
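
For reference, a check like the one above could be scripted roughly as
follows. This is only a sketch: the function name show_schedulers and its
optional sysfs-root argument are made up for illustration, and it assumes the
usual sysfs convention of marking the active scheduler with square brackets.

```shell
# Print "<device> <active scheduler>" for every block device under the
# given sysfs root (default: /sys/block). The active scheduler is the
# entry shown in square brackets; files without brackets are reported
# as "none".
show_schedulers() {
    sysroot="${1:-/sys/block}"
    for sched_file in "$sysroot"/*/queue/scheduler; do
        [ -r "$sched_file" ] || continue
        dev="${sched_file#"$sysroot"/}"   # strip the sysfs root prefix
        dev="${dev%%/*}"                  # keep only the device name
        active="$(sed -n 's/.*\[\(.*\)\].*/\1/p' "$sched_file")"
        printf '%s %s\n' "$dev" "${active:-none}"
    done
}
```

Calling show_schedulers with no argument reads /sys/block on the running
host.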



> >
> > > I see this both using iotop and vmstat
> > >
> > > vmstat 3 gives:
> > > ----io---- -system-- ------cpu-----
> > > bi    bo   in   cs us sy id wa st
> > > 2527   698 3771 29394  1  0 89 10  0
> > >
> >
> > us 94% also seems very high - maybe this hypervisor is overloaded with
> > other workloads?
> > wa 89% seems very high
>
> The alignment in the table is a bit off, but us is 1%. The 94 you saw is
> part of cs=29394. A high percentage for wait is generally a good sign
> because that means that the system is busy with actual I/O work.
> Obviously, this I/O work is rather slow, but at least qemu-img is making
> requests to the kernel instead of doing other work, otherwise user would
> be much higher.
>
> Kevin
>


Yes, probably a misalignment in the output I truncated. Actually, a sample of
about 300 lines (one every 3 seconds) shows the following number of
occurrences for each percentage:

user time
    195 0%
     95 1%

So user time is indeed quite low

wait time:
    105 7%
     58 8%
     33 9%
     21 6%
     17 10%
     16 5%
     16%
     12 11%
      9 12%
      7 14%
      6 13%
      2 15%
      1 4%
      1 16%
      1 0%

so wait time is about 7% on average
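
A tally like the ones above can be produced by splitting each vmstat line on
whitespace, which also avoids misreading columns when the printed table is
misaligned. The following is a sketch, not necessarily the exact commands
used here: vmstat_histogram is a hypothetical helper name, and the column
numbers assume the default vmstat layout, where us is field 13 and wa is
field 16.

```shell
# Read vmstat output on stdin and count how often each value occurs in
# the given column. Header lines are skipped because their first field
# is not a number. Output is "count value%" sorted by descending count.
vmstat_histogram() {
    col="$1"
    awk -v c="$col" '$1 ~ /^[0-9]+$/ { print $c "%" }' \
        | sort | uniq -c | sort -rn
}
```

For example, `vmstat 3 300 | vmstat_histogram 16` would tally the wa column
over 300 samples. Note that the first line vmstat prints reports averages
since boot rather than a 3-second sample.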

In the end the overall copy performance was around 30MB/s, which
is probably to be expected due to how the qemu-img process is run.
What about the events I reported instead?

Gianluca