[ovirt-devel] [VDSM] scalable sampling benchmark

Francesco Romani fromani at redhat.com
Fri Aug 1 13:23:25 UTC 2014


----- Original Message -----
> From: "Nir Soffer" <nsoffer at redhat.com>
> To: "Francesco Romani" <fromani at redhat.com>
> Cc: devel at ovirt.org, "Michal Skrivanek" <mskrivan at redhat.com>
> Sent: Friday, August 1, 2014 3:10:08 PM
> Subject: Re: [ovirt-devel] [VDSM] scalable sampling benchmark

> > find attached graphs for CPU and memory profiling.
> > Some stats on RHEL 6.5:
> > 
> > master cpu usage:
> >                         samples% below
> >             average%    10%     25%     50%     75%
> > libvirt     74.101      0.083   0.083   3.172   52.922
> > vdsm        44.211      3.506   33.556  70.618  84.641
> > total       30.504      0.000   9.599   99.750  100.000
> > 
> > scalable_sampling cpu usage:
> > 
> >                         samples% below
> >             average%    10%     25%     50%     75%
> > libvirt     58.835      0.000   0.591   28.270  86.160
> 
> I wonder if we are using libvirt correctly - maybe our thread pool is too
> small, keeping tasks in our queue, instead of letting libvirt process
> them concurrently?
> 
> Can you check the size of the libvirt thread pool, and increase our sampling
> pool to the same value?

Will do.
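
Roughly what I have in mind - a minimal sketch, assuming the relevant knobs
are max_workers in /etc/libvirt/libvirtd.conf on the libvirt side and the
executor size on our side (SAMPLING_WORKERS below is just a placeholder for
whatever the scalable_sampling branch actually uses):

import re

LIBVIRTD_CONF = "/etc/libvirt/libvirtd.conf"

def libvirt_max_workers(path=LIBVIRTD_CONF, default=20):
    # max_workers is usually commented out; fall back to what I believe
    # is libvirtd's compiled-in default (20) when it is unset.
    pattern = re.compile(r"^\s*max_workers\s*=\s*(\d+)")
    with open(path) as conf:
        for line in conf:
            match = pattern.match(line)
            if match:
                return int(match.group(1))
    return default

SAMPLING_WORKERS = 10  # placeholder: size of our sampling thread pool

print("libvirt workers: %d, sampling workers: %d"
      % (libvirt_max_workers(), SAMPLING_WORKERS))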

> > vdsm        65.146      0.084   10.549  49.030  71.055
> 
> This looks 47% worse than the current code. We need a profile to understand
> why.
> 
> Are you sure that we are doing the same number of samplers in both cases?
> Did you compare the logs?

I haven't yet done specific checks beyond reading the code, but this behaviour
indeed needs a proper explanation, so I'll bump the priority of this task.
Otherwise running more benchmarks is close to useless.

Will also figure out a way to cross-check the logs.
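
Something along these lines, assuming each sampling operation leaves a
recognizable marker in vdsm.log - the SAMPLE_RE pattern below is made up,
the real one depends on what the two branches actually log:

import re
import sys
from collections import defaultdict

# Made-up pattern: adjust to whatever each sampling operation really
# logs (vm id plus some "sampling" marker).
SAMPLE_RE = re.compile(r"vmId=(?P<vm>[0-9a-f-]+).*sampl")

def count_samples(path):
    counts = defaultdict(int)
    with open(path) as log:
        for line in log:
            match = SAMPLE_RE.search(line)
            if match:
                counts[match.group("vm")] += 1
    return counts

if __name__ == "__main__":
    # usage: python count_samples.py vdsm-master.log vdsm-scalable.log
    for path in sys.argv[1:]:
        counts = count_samples(path)
        print("%s: %d samples across %d VMs"
              % (path, sum(counts.values()), len(counts)))

If both runs cover the same wall-clock window and the totals match, we can
rule out a different sampling workload as the cause of the cpu difference.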
 
> Maybe we should create a simpler standalone benchmark simulating what vdsm
> does; hopefully yappi will not crash using it, and if it does, the benchmark
> can be used by the yappi developers to fix this.

I agree this could be useful for us as well, to improve our own code.
If we go this way, let's just sync up.
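
If it helps getting started, this is the kind of skeleton I had in mind: a
small thread pool pulling fake sampling tasks from a queue, profiled with
yappi. Pool size, task count and the fake workload are all made up, the real
thing should mimic our actual sampling calls:

import time
import threading
import Queue as queue  # Python 2 (RHEL); on Python 3 use "import queue"

import yappi

WORKERS = 10   # made-up pool size
TASKS = 1000   # made-up number of sampling operations

def fake_sample():
    # stand-in for a libvirt call: mostly waiting, a bit of cpu work
    time.sleep(0.01)
    sum(x * x for x in range(1000))

def worker(tasks):
    while True:
        task = tasks.get()
        if task is None:
            break
        task()
        tasks.task_done()

def run():
    tasks = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks,))
               for _ in range(WORKERS)]
    for thread in threads:
        thread.start()
    for _ in range(TASKS):
        tasks.put(fake_sample)
    tasks.join()
    for _ in threads:
        tasks.put(None)   # sentinel to shut down each worker
    for thread in threads:
        thread.join()

if __name__ == "__main__":
    # yappi.set_clock_type("wall")  # uncomment to account for wait time too
    yappi.start()
    run()
    yappi.stop()
    yappi.get_func_stats().print_all()
    yappi.get_thread_stats().print_all()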

> > total       29.390      0.000   24.473  99.325  100.000
> > 
> > memory usage (RSS, megabytes), in numbers:
> > 
> >                     average     minimum     maximum
> > master              262         254         264
> > scalable_sampling   143         133         147
> 
> So this seems to save a lot of memory, but costs more cpu. I wonder if this
> is better - systems have huge amounts of memory, but far more limited cpus.
> 
> Before 3.5, we were using gigabytes of memory for the remote file handlers
> on NFS (replaced by ioprocess), and I never heard anyone complaining about
> it, but we had a lot of issues with excessive cpu usage.

IMHO it is better to optimize for CPU usage. The memory reduction makes sense
considering we dropped ~100 threads.
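
Quick back-of-the-envelope check on that: average RSS goes from 262 MB down
to 143 MB, i.e. ~119 MB saved, which is roughly 1.2 MB per dropped thread -
plausible for per-thread stack plus interpreter bookkeeping, assuming the
dropped threads are indeed the main contributor.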

Bests,

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani


