[Engine-devel] Opimizing Postgres Stored Procedures
Laszlo Hornyak
lhornyak at redhat.com
Sun Sep 1 17:55:44 UTC 2013
----- Original Message -----
> From: "Eli Mesika" <emesika at redhat.com>
> To: "Laszlo Hornyak" <lhornyak at redhat.com>
> Cc: "engine-devel" <engine-devel at ovirt.org>
> Sent: Sunday, September 1, 2013 5:06:42 PM
> Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
>
>
>
> ----- Original Message -----
> > From: "Laszlo Hornyak" <lhornyak at redhat.com>
> > To: "Eli Mesika" <emesika at redhat.com>
> > Cc: "engine-devel" <engine-devel at ovirt.org>
> > Sent: Sunday, September 1, 2013 5:13:39 PM
> > Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
> >
> >
> >
> > ----- Original Message -----
> > > From: "Eli Mesika" <emesika at redhat.com>
> > > To: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > Sent: Sunday, September 1, 2013 3:29:06 PM
> > > Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
> > >
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > > To: "Eli Mesika" <emesika at redhat.com>
> > > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > > Sent: Sunday, September 1, 2013 2:47:02 PM
> > > > Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Eli Mesika" <emesika at redhat.com>
> > > > > To: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > > > Sent: Sunday, September 1, 2013 10:35:43 AM
> > > > > Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
> > > > >
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > > > > To: "Eli Mesika" <emesika at redhat.com>
> > > > > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > > > > Sent: Friday, August 30, 2013 7:17:32 PM
> > > > > > Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
> > > > > >
> > > > > > Hi Eli,
> > > > > >
> > > > > > I wrote a quick benchmark to see if there is any difference when
> > > > > > using
> > > > > > STABLE
> > > > > > modifier on functions running queries the way the engine does it
> > > > > > (calling
> > > > > > it
> > > > > > from JDBC, one function in a single statement)
> > > > > >
> > > > > > with a stable function:
> > > > > > create function getKakukk(_id int) returns VARCHAR STABLE as
> > > > > > 'select
> > > > > > val
> > > > > > from
> > > > > > kakukk where id = $1' language sql;
> > > > > > and one not marked as stable
> > > > > > create function getKakukk_(_id int) returns VARCHAR as 'select val
> > > > > > from
> > > > > > kakukk where id = $1' language sql;
> > > > > > the table is this simple:
> > > > > > create table kakukk(id int primary key, val varchar);
> > > > > > and the only content is:
> > > > > > insert into kakukk (id, val) values (1, 'bla bla bla');
> > > > > >
> > > > > > Now the benchmark code:
> > > > > >
> > > > > > package com.foobar;
> > > > > >
> > > > > > import java.sql.Connection;
> > > > > > import java.sql.DriverManager;
> > > > > > import java.sql.PreparedStatement;
> > > > > > import java.sql.ResultSet;
> > > > > > import java.sql.SQLException;
> > > > > >
> > > > > > import org.junit.After;
> > > > > > import org.junit.Before;
> > > > > > import org.junit.Test;
> > > > > >
> > > > > > public class SpeedTest {
> > > > > >
> > > > > > Connection connection;
> > > > > >
> > > > > > @Before
> > > > > > public void connect() throws SQLException {
> > > > > > connection =
> > > > > > DriverManager.getConnection("jdbc:postgresql://localhost/stabletest",
> > > > > > "engine", "engine");
> > > > > > }
> > > > > >
> > > > > > @After
> > > > > > public void disconnect() throws SQLException {
> > > > > > connection.close();
> > > > > > }
> > > > > >
> > > > > > private long measure(Runnable runnable, int times) {
> > > > > > final long start = System.currentTimeMillis();
> > > > > > for (int i = 0; i < times; i++) {
> > > > > > runnable.run();
> > > > > > }
> > > > > > final long end = System.currentTimeMillis();
> > > > > > return end - start;
> > > > > > }
> > > > > >
> > > > > > public static class Select implements Runnable {
> > > > > >
> > > > > > public Select(PreparedStatement preparedStatement) {
> > > > > > super();
> > > > > > this.preparedStatement = preparedStatement;
> > > > > > }
> > > > > >
> > > > > > final PreparedStatement preparedStatement;
> > > > > >
> > > > > > public void run() {
> > > > > > try (
> > > > > > ResultSet resultSet =
> > > > > > preparedStatement.executeQuery();)
> > > > > > {
> > > > > > while (resultSet.next()) {
> > > > > > // nothing, just next
> > > > > > }
> > > > > > } catch (SQLException e) {
> > > > > > // TODO Auto-generated catch block
> > > > > > e.printStackTrace();
> > > > > > }
> > > > > > }
> > > > > > }
> > > > > >
> > > > > > @Test
> > > > > > public void performTest() throws SQLException {
> > > > > > for (int i = 0; i < 10; i++) {
> > > > > > try (
> > > > > > PreparedStatement stable =
> > > > > > connection.prepareStatement("select
> > > > > > getKakukk(1)");
> > > > > > PreparedStatement notStable =
> > > > > > connection.prepareStatement("select
> > > > > > getKakukk_(1)");)
> > > > > > {
> > > > > > System.out.println("STABLE: " + measure(new
> > > > > > Select(stable),
> > > > > > 100000));
> > > > > > System.out.println("not STABLE: " + measure(new
> > > > > > Select(notStable), 100000));
> > > > > > System.out.println("---");
> > > > > >
> > > > > > }
> > > > > > }
> > > > > > }
> > > > > > }
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > The results are very similar, seemingly no difference at all.
> > > > > > Therefore, it seems we do not need those STABLE markers for
> > > > > > performance
> > > > > > reasons.
> > > > >
> > > > > Please refer to
> > > > > http://www.postgresql.org/docs/8.3/static/xfunc-volatility.html
> > > > > It says :
> > > > >
> > > > > "For best optimization results, you should label your functions with
> > > > > the
> > > > > strictest volatility category that is valid for them."
> > > >
> > > > I am sure the postgres guys did not mean "without even thinking about
> > > > it"
> > > > :)
> > > > The way the enginge is using these functions, it will not improve
> > > > performance, which is not a problem. The problem in my opinion is that
> > > > more
> > > > requirements increase the time needed for review procedure, which is
> > > > already
> > > > quite lengthy. This new requirement does not seem to have any benefit.
> > > > If
> > > > we
> > > > want to improve the development process then we should not have
> > > > requirements
> > > > that do not come with benefits.
> > >
> > > I totally disagree, I think the opposite , if this will not be a clear
> > > rule
> > > of thumb , it will be used incorrectly.
> > > Thinking on each SP if you will benefit or not in terms of performance
> > > if
> > > you use those keywords will increase the review time when any reviewer
> > > will
> > > need to think if he needs that or not.
> > > I don't think that adding a word to the SP will affect development or
> > > review
> > > time , it should be like you are writing the word FUNCTION for a SP
> > > definition and will become a habit.
> > > I am against complicated rules like "add those keywords whenever you
> > > like"
> > > ,
> >
> > No, it is even more simple than that: do not add those keywords, it would
> > only make someone believe it will improve database performance, while it
> > clearly won't :)
>
> 1) You just had tested a simple select statement which does not represent
> application complex queries and therefor you can not assume that your
> results are relevant to oVirt queries
I am talking about the functions that are running a single query and they are typically invoked from JDBC and only one of this functions in a single statement.
The above test simulates this scenario.
> 2) You did not test repetitive calls with the same arguments to the functions
> which will use PG cache to return the result when STABLE and IMMUTABLE are
> used
Sure, this test is doing repetitive calls just like engine, only a little more.
The point is that STABLE is invoked with the same args only once in the same statement. Not transaction, the pg documentation also puts it clear:
"all rows within a single statement"
Therefore when these functions are invoked, from two subsequent statements, there is no performance gain whatsoever. But of course, why would you call "GetDisksVmGuid" twice in the same statement?
I do not think this is a cache solution in PostgreSQL, it allows the PostgreSQL query planner to optimize the function usage in your queries in case the query need to be evaluated several times. But it's scope is only one statement. Therefore it will not be any helpful in a typical "GetSomethingBySomething" function.
No big deal if this does not improve performance (and it seems it doesn't), the problem is that we MUST use it regardless.
We could comment there: -- this is here to make the reviewer happy, it does not actually improve performance :)
> 3) You can google around on "postgres STABLE IMMUTABLE performance" and see
> some examples when STABLE and IMMUTABLE does affect performance
I did already, I am not quite new to postgresql functions, however, I am always happy to learn something new. This is why I started with asking you if you can show any banchmarks/documentation on why does it improve performance. It did not seem to be logical, and the actual results show no improvement.
> 4) Please use those keywords in your DB patches
In case I will have to write one more DB patch, I will, but I hope by then we resolve this misunderstanding.
>
> Thanks
>
>
> >
> > > if the documentations says a clear instruction to use those keywords in
> > > certain circumstances, that's what I would exactly do and recommend to
> > > others.
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > Best regards,
> > > > Laszlo
> > > >
> > > > >
> > > > > So , using STABLE , IMMUTABLE , STRICT is mandatory from now on.
> > > > >
> > > > > Thanks
> > > > > Eli
> > > > >
> > > > >
> > > > > >
> > > > > > Thank you,
> > > > > > Laszlo
> > > > > >
> > > > > > ----- Original Message -----
> > > > > > > From: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > > > > > To: "Eli Mesika" <emesika at redhat.com>
> > > > > > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > > > > > Sent: Wednesday, August 28, 2013 1:02:18 PM
> > > > > > > Subject: Re: [Engine-devel] Opimizing Postgres Stored Procedures
> > > > > > >
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > > From: "Eli Mesika" <emesika at redhat.com>
> > > > > > > > To: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > > > > > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > > > > > > Sent: Wednesday, August 28, 2013 11:45:14 AM
> > > > > > > > Subject: Re: [Engine-devel] Opimizing Postgres Stored
> > > > > > > > Procedures
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > ----- Original Message -----
> > > > > > > > > From: "Laszlo Hornyak" <lhornyak at redhat.com>
> > > > > > > > > To: "Eli Mesika" <emesika at redhat.com>
> > > > > > > > > Cc: "engine-devel" <engine-devel at ovirt.org>
> > > > > > > > > Sent: Tuesday, August 27, 2013 11:40:27 AM
> > > > > > > > > Subject: Re: [Engine-devel] Opimizing Postgres Stored
> > > > > > > > > Procedures
> > > > > > > > >
> > > > > > > > > Hi Eli,
> > > > > > > > >
> > > > > > > > > Most of the functions that we have in the DB are doing very
> > > > > > > > > simple
> > > > > > > > > jobs
> > > > > > > > > like
> > > > > > > > > run a query, insert/update and I see that now you have all
> > > > > > > > > QUERY
> > > > > > > > > functions
> > > > > > > > > as STABLE.
> > > > > > > > > My questions:
> > > > > > > > > Is this required for new functions from now on?
> > > > > > > > Yes and a email asking that was posted to engine_devel
> > > > > > > >
> > > > > > > > > Is this done in order to improve performance?
> > > > > > > > Yes
> > > > > > >
> > > > > > > Do you have any documents/benchmarks on how and why does this
> > > > > > > improve
> > > > > > > performance?
> > > > > > > STABLE functions should improve performance if they return the
> > > > > > > same
> > > > > > > result
> > > > > > > for the same parameters in the same statement.
> > > > > > > E.g. if you have a stable function like "select foo(x) from y"
> > > > > > > then
> > > > > > > the
> > > > > > > function can be invoked only once to evaluate each distinct value
> > > > > > > of
> > > > > > > y.x
> > > > > > > -
> > > > > > > this is kind of useful
> > > > > > > Functions running queries for the ovirt engine are typically
> > > > > > > invoked
> > > > > > > from
> > > > > > > client side, therefore they are only ivoked once from the
> > > > > > > parameters
> > > > > > > list
> > > > > > > and therefore will be only executed once for that single
> > > > > > > statement.
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Thank you,
> > > > > > > > > Laszlo
> > > > > > > > >
> > > > > > > > > ----- Original Message -----
> > > > > > > > > > From: "Eli Mesika" <emesika at redhat.com>
> > > > > > > > > > To: "engine-devel" <engine-devel at ovirt.org>
> > > > > > > > > > Sent: Monday, August 26, 2013 11:22:20 AM
> > > > > > > > > > Subject: [Engine-devel] Opimizing Postgres Stored
> > > > > > > > > > Procedures
> > > > > > > > > >
> > > > > > > > > > Hi
> > > > > > > > > >
> > > > > > > > > > I had merged the following patch
> > > > > > > > > > http://gerrit.ovirt.org/#/c/17962/
> > > > > > > > > >
> > > > > > > > > > This patch introduce usage of the IMMUTABLE, STABLE and
> > > > > > > > > > STRICT
> > > > > > > > > > keywords
> > > > > > > > > > in
> > > > > > > > > > order to boost performance of the Postgres SPs.
> > > > > > > > > >
> > > > > > > > > > Please make sure that your current/and future DB scripts
> > > > > > > > > > applied
> > > > > > > > > > that.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Volatility
> > > > > > > > > > ----------
> > > > > > > > > > * A function should be marked as IMMUTABLE if it doesn't
> > > > > > > > > > change
> > > > > > > > > > the
> > > > > > > > > > database,
> > > > > > > > > > and if it doesn't perform any lookups (even for database
> > > > > > > > > > configuration
> > > > > > > > > > values) during its operation.
> > > > > > > > > > * A function should be marked STABLE if it doesn't change
> > > > > > > > > > the
> > > > > > > > > > database,
> > > > > > > > > > but
> > > > > > > > > > might perform lookups (IMMUTABLE is preferable if function
> > > > > > > > > > meets
> > > > > > > > > > the
> > > > > > > > > > requirements).
> > > > > > > > > > * A function doesn't need to be marked VOLATILE, because
> > > > > > > > > > that's
> > > > > > > > > > the
> > > > > > > > > > default.
> > > > > > > > > >
> > > > > > > > > > STRICTNESS
> > > > > > > > > > ----------
> > > > > > > > > > A function should be marked STRICT if it should return NULL
> > > > > > > > > > when
> > > > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > passed
> > > > > > > > > > a NULL argument, and then the function won't even be called
> > > > > > > > > > if
> > > > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > indeed
> > > > > > > > > > passed a NULL argument.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I am available for any questions.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Eli
> > > > > > > > > > _______________________________________________
> > > > > > > > > > Engine-devel mailing list
> > > > > > > > > > Engine-devel at ovirt.org
> > > > > > > > > > http://lists.ovirt.org/mailman/listinfo/engine-devel
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Engine-devel mailing list
> > > > > > > Engine-devel at ovirt.org
> > > > > > > http://lists.ovirt.org/mailman/listinfo/engine-devel
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
More information about the Engine-devel
mailing list