Re: [ovirt-users] ovirt-ha-agent cpu usage

26 Apr 2017

On Wed, Apr 26, 2017 at 1:28 PM, Simone Tiraboschi <stirabos@redhat.com> wrote:
...
On Wed, Apr 26, 2017 at 12:52 PM, Nir Soffer <nsoffer@redhat.com> wrote:
...
On Wed, Apr 26, 2017 at 11:36 AM Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
...
On Tue, Apr 25, 2017 at 11:28 PM, Nir Soffer <nsoffer@redhat.com> wrote:
...
Hi Gianluca,
You can run this on the host:
$ python -c "import yaml; print 'CLoader:', hasattr(yaml, 'CLoader')"
CLoader: True
If you get "CLoader: False", you have a packaging issue; CLoader
is available on all supported platforms.
Nir
...
Thanks,
Gianluca
It seems ok.
[root@ovirt01 ovirt-hosted-engine-ha]# python -c "import yaml; print 'CLoader:', hasattr(yaml, 'CLoader')"
CLoader: True
[root@ovirt01 ovirt-hosted-engine-ha]#
Anyway, here is a sample of the spikes it continues to have, from
15% to 55%, many times:
https://drive.google.com/file/d/0BwoPbcrMv8mvMy1xVUE3YzI2YVE/view?usp=sharing
There are two issues in this video:
- Memory leak: ovirt-ha-agent is using 1 GB of memory. It is very unlikely
that it needs so much memory.
- Unusual CPU usage, but not the kind of usage related to YAML parsing.
I would open two bugs for this. We saw the first issue a few months ago
and did nothing about it, so the memory leak was not fixed.
To understand the unusual CPU usage, we need to integrate yappi into
ovirt-ha-agent and do some profiling to see where CPU time is spent.
Simone, can you do something based on these patches?
https://gerrit.ovirt.org/#/q/topic:generic-profiler
I hope to get these patches merged soon.
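For illustration only, here is a minimal sketch of the kind of profiling described above. It uses the standard library's cProfile in place of yappi, purely so the example has no external dependency; it is not the generic-profiler patches. The functions parse_schema and connect are hypothetical stand-ins, not vdsm code, and the sketch is written for Python 3.

```python
import cProfile
import io
import pstats


def parse_schema(size=20000):
    # Hypothetical stand-in for an expensive schema parse.
    return {str(i): i * i for i in range(size)}


def connect():
    # Hypothetical stand-in for a connect() that re-parses the schema
    # on every call.
    return parse_schema()


def profile_calls(func, repeat=5):
    """Run func `repeat` times under cProfile and return the report text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(repeat):
        func()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_calls(connect))
```

In the report, a function whose cumulative time is close to the total (here parse_schema under connect) is where the CPU time is going.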
Absolutely at this point.
On 4.1.1, 96% of the CPU time of ovirt-ha-agent is still spent in
connect() in /usr/lib/python2.7/site-packages/vdsm/jsonrpcvdscli.py,
and 95.98% is in Schema.__init__
in /usr/lib/python2.7/site-packages/vdsm/api/vdsmapi.py

So it is still the parsing of the API YAML schema.
On master we already merged a patch to reuse an existing connection if
available, and this should mitigate or resolve the issue:
https://gerrit.ovirt.org/73757/
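
As a rough sketch of the idea of reusing an existing connection (this is not the code from the patch above), the expensive setup can be cached and rebuilt only when no usable client exists. SchemaClient and get_client are hypothetical names invented for this example, and the dictionary built in the constructor merely stands in for parsing the schema.

```python
import threading


class SchemaClient(object):
    """Hypothetical client with an expensive constructor."""

    init_count = 0  # how many times the expensive setup has run

    def __init__(self):
        SchemaClient.init_count += 1
        # Stand-in for parsing a large YAML API schema.
        self.schema = {str(i): i for i in range(10000)}
        self.closed = False

    def close(self):
        self.closed = True


_lock = threading.Lock()
_client = None


def get_client():
    """Return the cached client, creating a new one only if needed."""
    global _client
    with _lock:
        if _client is None or _client.closed:
            _client = SchemaClient()
        return _client
```

With this shape, repeated calls to get_client() pay the setup cost once, and again only after the cached client has been closed.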

It's still not that clear why we are facing this kind of performance
regression.
...
Nir
...
...
The host is an Intel NUC6i5 with 32 GB of RAM. The engine, an F25 guest,
and a C7 desktop VM are running, doing almost nothing.
Gianluca