
----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com> To: "Francesco Romani" <fromani@redhat.com> Cc: "engine-devel@ovirt.org" <devel@ovirt.org> Sent: Thursday, November 13, 2014 9:35:30 PM Subject: Re: [ovirt-devel] [VDSM] [JSONRPC] early, coarse grained benchmarks [...]
I'm profiling four major flows at the moment, and improving tools along the way.
Here's my series; it's still being worked on and not yet ready for review (hence still in draft):
http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:pro...
Here we have: 1. one patch to start/stop profiling at runtime using vdsClient
Note that stopping yappi may segfault, as yappi iterates over a Python internal linked list without locking.
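For reference, a minimal sketch of what runtime start/stop profiling with yappi could look like (the function names, lock and output path are hypothetical, not necessarily what the patch does):

    import threading
    import yappi

    _lock = threading.Lock()

    def start_profiling():
        with _lock:
            if yappi.is_running():
                raise RuntimeError("profiling already started")
            yappi.set_clock_type("cpu")
            yappi.start(builtins=False, profile_threads=True)

    def stop_profiling(path="/run/vdsm/vdsmd.prof"):
        with _lock:
            if not yappi.is_running():
                raise RuntimeError("profiling not started")
            # This is where the crash mentioned above can happen: yappi
            # walks the interpreter's thread list without locking.
            yappi.stop()
            yappi.get_func_stats().save(path, type="pstat")
            yappi.clear_stats()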
Not nice. But I guess that could be good enough if used only in controlled development environments.
If we add such option, it should work only from local connection.
How can we enforce that? Just check the peer address?
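For example, a naive peer-address check could look like this (a sketch only; whether this is enough, versus e.g. a unix socket, is exactly the open question):

    import socket

    LOOPBACK = ("127.0.0.1", "::1", "::ffff:127.0.0.1")

    def is_local_connection(sock):
        """Return True if the client socket comes from the local host."""
        if sock.family == socket.AF_UNIX:
            return True
        return sock.getpeername()[0] in LOOPBACK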
http://gerrit.ovirt.org/#/c/35024/
http://gerrit.ovirt.org/#/c/34989/
http://gerrit.ovirt.org/#/c/35139 (DRAFT)
http://gerrit.ovirt.org/#/c/35140 (DRAFT)
Let's see the effect of these in the #3 flows. Scenario: RHEL 6.5 on minidell running VDSM master + patches, with 150 VMs each one 1
What are the cpu usage and load during the test? In previous tests you had a nice graph with this data.
Getting them back is very easy; I'll do it if that can help understand VDSM's behaviour and performance.
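For example (a sketch, assuming psutil is available on the test host; the pid, interval and output format are arbitrary), the data for the graph could be sampled like this:

    import os
    import time
    import psutil

    def sample_vdsm(pid, duration=300, interval=5, out="vdsm-cpu.csv"):
        """Record vdsm CPU usage and 1-minute load average every `interval` seconds."""
        proc = psutil.Process(pid)
        with open(out, "w") as f:
            f.write("time,cpu_percent,load1\n")
            end = time.time() + duration
            while time.time() < end:
                cpu = proc.cpu_percent(interval=interval)  # blocks for `interval`
                load1 = os.getloadavg()[0]
                f.write("%.0f,%.1f,%.2f\n" % (time.time(), cpu, load1))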
This is not a good test - you just overload vdsm. Instead, run this verb at a more typical rate (like the engine does) and limit the test by running time, not by the number of calls.
Running like this will show that the patched version needs less cpu and completes more calls during the test duration (e.g. 5 minutes).
Yes, this sounds more typical and probably useful.
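Something along these lines could drive the verb at a fixed rate for a fixed duration (a sketch; the 15-second interval is an assumption meant to mimic the engine's polling rate, and `call` stands in for whatever vdsClient invocation is being measured):

    import time

    def run_at_rate(call, interval=15.0, duration=300.0):
        """Invoke `call` every `interval` seconds for `duration` seconds.

        Returns (completed calls, total time spent in calls), so runs can
        be compared by throughput and latency instead of by call count.
        """
        completed = 0
        busy = 0.0
        deadline = time.time() + duration
        while time.time() < deadline:
            start = time.time()
            call()
            busy += time.time() - start
            completed += 1
            # Sleep away the remainder of the interval, if any.
            remaining = interval - (time.time() - start)
            if remaining > 0:
                time.sleep(remaining)
        return completed, busy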
patched VDSM:
What is "patched"? which patches are applied?
These:
http://gerrit.ovirt.org/#/c/35024/
http://gerrit.ovirt.org/#/c/34989/
http://gerrit.ovirt.org/#/c/35139 (DRAFT)
http://gerrit.ovirt.org/#/c/35140 (DRAFT)
This looks strange:
               ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
vanilla vdsm:   75750   19.871    0.000   75.819    0.001  vm.py:2271(Vm._getRunningVmStats)
patched vdsm:  234244  183.784    0.001  570.998    0.002  vm.py:2274(Vm._getRunningVmStats)
The second profile is doing about 3 times more Vm._getRunningVmStats calls, which takes 10 times more cpu time(?!).
We don't have any info on the load - maybe vdsm is overloaded in both runs, which can explain why the vanilla version does only 1/3 of the work of the patched one; but the patched version is probably overloaded as well, which can explain its much higher cpu time.
I suggest checking the cpu usage, and *not* making profiles where vdsm uses more than 100% cpu.
Let's profile at a much lower load to understand where the time is spent.
Agreed; another reason to try the scenario you suggested.
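A simple guard before starting a profile run could enforce that (a sketch; the threshold and sampling window are arbitrary, and psutil is assumed to be available):

    import psutil

    def assert_not_overloaded(pid, threshold=100.0, window=10):
        """Refuse to profile a vdsm that is already using > threshold % CPU."""
        cpu = psutil.Process(pid).cpu_percent(interval=window)
        if cpu > threshold:
            raise RuntimeError("vdsm at %.0f%% CPU, profile at lower load" % cpu)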
Open points:
* profile the remaining flows again and report like this (in progress for getAllVmStats)
* add a wiki page
* find out why there is such a high minimum cost for the list verb
* profile JSONRPC (needs patches to vdsClient)
Comparing jsonrpc would be very interesting - but we don't have a jsonrpc client yet, so you'd better use the engine for this instead of vdsClient.
Will try both: with the new jsonrpc client you started and with the engine. I started with the client because full-stack and/or engine profiling is already in progress by others (Yuri/Piotr).

Thanks,

--
Francesco Romani
RedHat Engineering Virtualization R & D
Phone: 8261328
IRC: fromani