Re: [Engine-devel] DB Performance Monitoring

26 Feb 2013

      ----- Original Message -----
...
From: "Michael Kublin" <mkublin@redhat.com>
To: engine-devel@ovirt.org
Sent: Tuesday, February 26, 2013 10:23:01 AM
Subject: Re: [Engine-devel] DB Performance Monitoring
----- Original Message -----
...
From: "Eli Mesika" <emesika@redhat.com>
To: "Libor Spevak" <lspevak@redhat.com>
Cc: engine-devel@ovirt.org
Sent: Friday, February 22, 2013 12:46:07 AM
Subject: Re: [Engine-devel] DB Performance Monitoring
----- Original Message -----
...
From: "Libor Spevak" <lspevak@redhat.com>
To: "Eli Mesika" <emesika@redhat.com>
Cc: engine-devel@ovirt.org, "Yair Zaslavsky"
<yzaslavs@redhat.com>,
"Barak Azulay" <bazulay@redhat.com>
Sent: Thursday, February 21, 2013 6:50:58 PM
Subject: Re: [Engine-devel] DB Performance Monitoring
On 21.2.2013 17:11, Eli Mesika wrote:
----- Original Message -----
From: "Libor Spevak" <lspevak@redhat.com> To:
engine-devel@ovirt.org
Sent: Tuesday, February 19, 2013 5:44:10 PM
Subject: [Engine-devel] DB Performance Monitoring
Hi,
I just wanted to share a partial result from testing of engine db
load during operation (SQL queries frequencies). Maybe, we can
decide later, if some SQL queries result CACHING can boost
throughput for larger deployments.
The caching technology can be from a simple HashMap lookup to
deployments of something like http://ehcache.org/ (memory
database).
I prepared two simple scenarios:
- oVirt engine, 2 hosts, 1 VM, running 15 minutes (Power on + Up
state)
- oVirt engine, 2 hosts, 10 VMs from one pool, running 15 minutes
(Power on + Up state)
Appending 2 spreadsheets with data about the most used SQL
queries
(generated by PostgreSQL standard pg_statements_stat module).
e.g.
- 2nd row shows number of granted connections from the db pool
(e.g.
can be used to set optimal connection pool size) (check of
connection health: select 1)
- most of the queries are wrapped by a PLSQL function so we see a
wrapper:
select * from getvdsgroupbyvdsgroupid($1, $2, $3)
and near to it the 'real' query:
SELECT vds_groups_view.*
FROM vds_groups_view
WHERE vds_group_id = v_vds_group_id
AND (NOT v_is_filtered OR EXISTS (SELECT 1
FROM user_vds_groups_permissions_view
WHERE user_id = v_user_id AND entity_id = v_vds_group_id))
Just as an example, I selected in yellow color some queries,
which
probably do not change often, but are very frequent. Thanks Libor
for
taking te time to do that.
The problematic queries are those that involves all kinds of
*permissions* checks with the complicated & expensive permission
handling views.
I believe that using Snapshot Materialized Views as defined in
http://www.ovirt.org/OVirt-DB-Issues/MaterializedViews will solve
the major part of those problems.
We are currently verifying this approach and my recommendation is
to
have all *permissions* views as Snapshot Materialized Views that
are
updated via a cron job. I think there could be performace
improvement with M-views, of course. There is some overhead just
with their maintainance, e.g. refreshing, updating existing
dependent objects (table name renames, ...),
Refresh of *permission* views was tested with 2 views 160,000
records
each and was about 2 sec
...
not sure about NATIVE
implementation level status in PostgreSQL, according to e.g.
Oracle
db capabilities (MV-logs + M-views).
http://wiki.postgresql.org/wiki/Materialized_Views
AFAIK Postgres is not going to implement that in the near future
...
Still, if the Engine gathers statistics nearly in real time, my
question would be, if there is a need of round-trip to the
database
and back at all for frequent queries.
Depends what is the cost of this round-trip , if it is very cheep
as
I showed when I tested M-Views I think that DB query is still a
good
and natural choice
...
Other tools like 'pg_top' can provide runtime statistics of db
processes (cpu, mem, locks, ... views).
By enabling debug level logging of PostgreSQL we can check real
values to the queries.
Of course, it would be useful to run such tests with many hosts
and
VMs to predict scaling issues.
More info about tools configuration:
http://www.ovirt.org/Engine_database_performance_monitoring
Regards,
Libor
Due the following discussion I checked uses of "select * from
getvdsgroupbyvdsgroupid($1, $2, $3)"
query and find out that we have three call for such query during
monitoring and we need only one,
so the following patch should reduce a number of such queries during
monitoring process
http://gerrit.ovirt.org/#/c/12433/
ACK, thanks for that.
...
The next step is, by my opinion, is start to use cache for not often
changed object, VdsGroup is classical example of
such object.
+1 - and another issue I wanted to take a look into - Please notice we have entities representing views.
I have a feeling we abuse VM (for example) too much - we use AuditLoggableBase to "cache" vm - but still - I bet that in some flows
we don't need the entire VM -> we might need only static (or dynamic part) of it.
We should take a look into this as well . Getting vm dynamic from db instead of vm means less joins.
Same applies for vds and storage_domains

Yair
...
_______________________________________________
Engine-devel mailing list
Engine-devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-devel