Writing SQL queries in DAO code

older

[OST failure report] vdsmfake...

[ Announcment ] STD CI now...

Yevgeny Zaspitsky

Sunday, 26 March 2017 Sun, 26 Mar '17

9:12 a.m.

...

From my experience with the performance improvement task it looks like

people avoid adding new queries for a specific need of a flow, instead they use the existing general ones (e.g. dao.getAllForX()) and do the actual join in the bll code. IMHO the proposed approach would simplify adding new specific queries and by that would prevent a decent part of performance issues in the future. I do not propose changing all existing SP's to inline queries in a once, but to allow adding new queries this way. Also we might consider converting relatively simple SP's to inline SQL statements later in a graduate way. [1] - https://gerrit.ovirt.org/#/c/74456 [2] - https://gerrit.ovirt.org/#/c/74458 Regards, Yevgeny

Attachments:

attachment.html (text/html — 1.9 KB)

Show replies by date

Moti Asayag

Sunday, 26 March Sun, 26 Mar

5:56 p.m.

On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote:

...

There are additional pros and cons for the suggestion to consider: Pros: 1. No need to run engine-setup after changing DB related code (in case of SQL inside Java). Cons: 1. DAO files might become very long. 2. If you decide to return the business entity associated with the DAO as a returned object, you won't know as a caller which fields to expect to be populated, which lead to 3: 3. An inflation of business entities to represent partial populated business entity or inflation of mappers inflation (this will be required for SP as well). 4. SQL code inside of Java: 1. Beside of the fact that a multi-line concatenated string that cannot be easily copied and run with psql, it means that we should compile the code in order to test the change (vs building with DEV_REBUILD=0 which only package the SQL file). 2. No syntax highlighting when performing code review. i.e. I don't think reviewing a patch such as https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql would be more clear inside a java file. 3. The user permissions management is implemented on DB level. That means that SQL will be more complex (more concatenated java lines). 5. Stored procedure are cached by project's code. See SimpleJdbcCallsHandler.getCall(), while the NamedParameterJdbcTemplate are cached by spring's code which seems less optimal (sync all calls using synchronized vs using ConcurrentHashMap as in SP cache). 6. With the NamedParametersJdbcTemplate there is no use of the DbEngineDialect. What's the impact of it ? So besides 5 and 6, the rest is a matter of style. I'd like to hear more opinions from other members. Regards, Moti So I'd like to propose moving towards this approach in general in all cases

...

when nothing beyond a simple SQL is needed (no stored procedure programming language is needed). From my experience with the performance improvement task it looks like people avoid adding new queries for a specific need of a flow, instead they use the existing general ones (e.g. dao.getAllForX()) and do the actual join in the bll code. IMHO the proposed approach would simplify adding new specific queries and by that would prevent a decent part of performance issues in the future. I do not propose changing all existing SP's to inline queries in a once, but to allow adding new queries this way. Also we might consider converting relatively simple SP's to inline SQL statements later in a graduate way. [1] - https://gerrit.ovirt.org/#/c/74456 [2] - https://gerrit.ovirt.org/#/c/74458 Regards, Yevgeny _______________________________________________ Devel mailing list Devel(a)ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

-- Regards, Moti

Eli Mesika

Monday, 27 March Mon, 27 Mar

3:09 a.m.

On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> wrote:

...

On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote: > Hi All, > > Recently I had a task of performance improvement in one of our network > related flows and had some hard time following our DAL code and one of the > outcomes of the task was defining a couple of new quite simple, but neat > queries. > When I came to coding those new queries I remembered how hard was > following the existing DAL code, I decided to make my own new methods > clearer. So I created [1] and [2] patches. > > Everything is quite standard there beside the fact that they do not use > any stored procedure, but use SQL directly, IMHO by that they save time > that I spent in trying to follow what a DAO method does. Looking into the > method code you get the understanding of what this method is all about: > > - no looking for a stored procedure name that is buried in the DAO > class hierarchy > - no looking for the SP definition > > There are additional pros and cons for the suggestion to consider: Pros: 1. No need to run engine-setup after changing DB related code (in case of SQL inside Java). Cons: 1. DAO files might become very long. 2. If you decide to return the business entity associated with the DAO as a returned object, you won't know as a caller which fields to expect to be populated, which lead to 3: 3. An inflation of business entities to represent partial populated business entity or inflation of mappers inflation (this will be required for SP as well). 4. SQL code inside of Java: 1. Beside of the fact that a multi-line concatenated string that cannot be easily copied and run with psql, it means that we should compile the code in order to test the change (vs building with DEV_REBUILD=0 which only package the SQL file). 2. No syntax highlighting when performing code review. i.e. I don't think reviewing a patch such as https://gerrit.ovirt.org/#/c/ 66729/10/packaging/dbscripts/network_sp.sql <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> would be more clear inside a java file. 3. The user permissions management is implemented on DB level. That means that SQL will be more complex (more concatenated java lines). 5. Stored procedure are cached by project's code. See SimpleJdbcCallsHandler.getCall(), while the NamedParameterJdbcTemplate are cached by spring's code which seems less optimal (sync all calls using synchronized vs using ConcurrentHashMap as in SP cache). 6. With the NamedParametersJdbcTemplate there is no use of the DbEngineDialect. What's the impact of it ? So besides 5 and 6, the rest is a matter of style. I'd like to hear more opinions from other members.

I agree with all you wrote Moti I don't think that looking in SQL in Java code is clear than looking in a SP code Additionally 1) You will pass more data on the wire instead of calling a SP with parameters 2) Your data that is passed on the wire is exposed to attacks since you will have to implement DB security in the engine level (for example hidden columns) 3) Changes in the SQL code done in patches may be more complicated to track 4) SQL Injection 5) I think that SP performs better than SQL inside Java I see no real reason to replace the SPs with SQL code , SP is just a container for SQL code

...

Regards, Moti So I'd like to propose moving towards this approach in general in all > cases when nothing beyond a simple SQL is needed (no stored procedure > programming language is needed). > From my experience with the performance improvement task it looks like > people avoid adding new queries for a specific need of a flow, instead they > use the existing general ones (e.g. dao.getAllForX()) and do the actual > join in the bll code. > IMHO the proposed approach would simplify adding new specific queries and > by that would prevent a decent part of performance issues in the future. > > I do not propose changing all existing SP's to inline queries in a once, > but to allow adding new queries this way. Also we might consider converting > relatively simple SP's to inline SQL statements later in a graduate way. > > [1] - https://gerrit.ovirt.org/#/c/74456 > [2] - https://gerrit.ovirt.org/#/c/74458 > > Regards, > Yevgeny > > _______________________________________________ > Devel mailing list > Devel(a)ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel > -- Regards, Moti _______________________________________________ Devel mailing list Devel(a)ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

Yevgeny Zaspitsky

4:26 a.m.

...

I don't think that looking in SQL in Java code is clear than looking in

a SP code Looking in SQL isn't the problem I'm trying to solve, but finding that.

...

1) You will pass more data on the wire instead of calling a SP with

parameters Passing the SQL string shouldn't be a problem unless that is very long one (several K) and then we probably do something wrong.

...

2) Your data that is passed on the wire is exposed to attacks since you

will have to implement DB security in the engine level (for example hidden columns). IIUC currently querying tables/views with hidden columns is implemented in a SP that consist of at least 2 SQL's, so those aren't good candidates for my proposal and will stay as is. BTW how other projects resolve the security problem? AFAIK usually hidden columns are hidden by defining a proper view that do not include those.

...

3) Changes in the SQL code done in patches may be more complicated to

track Most of SQL code changes involve changes in a DAO too, so this shouldn't be an issue.

...

4) SQL Injection

Again, how other projects resolve the security problem? Internet is full of articles/blogs of why stored procedures should be avoided ("avoid stored procedures" Google query returned ~6.6M results), so I guess there are some other approaches for the security issue if that exists.

...

5) I think that SP performs better than SQL inside Java

Do we have a measurement of how much better? One of my intentions for this proposal is that people evidently avoid creating neat SQL queries and re-use the existing ones, which has much bigger performance impact. I guess that the biggest limit here is how complicated that procedure is in the engine code. On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> wrote: > > > On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> wrote: > >> >> >> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> >> wrote: >> >>> Hi All, >>> >>> Recently I had a task of performance improvement in one of our network >>> related flows and had some hard time following our DAL code and one of the >>> outcomes of the task was defining a couple of new quite simple, but neat >>> queries. >>> When I came to coding those new queries I remembered how hard was >>> following the existing DAL code, I decided to make my own new methods >>> clearer. So I created [1] and [2] patches. >>> >>> Everything is quite standard there beside the fact that they do not use >>> any stored procedure, but use SQL directly, IMHO by that they save time >>> that I spent in trying to follow what a DAO method does. Looking into the >>> method code you get the understanding of what this method is all about: >>> >>> - no looking for a stored procedure name that is buried in the DAO >>> class hierarchy >>> - no looking for the SP definition >>> >>> >> There are additional pros and cons for the suggestion to consider: >> >> Pros: >> >> 1. No need to run engine-setup after changing DB related code (in >> case of SQL inside Java). >> >> Cons: >> >> 1. DAO files might become very long. >> 2. If you decide to return the business entity associated with the >> DAO as a returned object, you won't know as a caller which fields to expect >> to be populated, which lead to 3: >> 3. An inflation of business entities to represent partial populated >> business entity or inflation of mappers inflation (this will be required >> for SP as well). >> 4. SQL code inside of Java: >> 1. Beside of the fact that a multi-line concatenated string that >> cannot be easily copied and run with psql, it means that we should compile >> the code in order to test the change (vs building with DEV_REBUILD=0 which >> only package the SQL file). >> 2. No syntax highlighting when performing code review. i.e. I >> don't think reviewing a patch such as >> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >> twork_sp.sql >> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >> would be more clear inside a java file. >> 3. The user permissions management is implemented on DB level. >> That means that SQL will be more complex (more concatenated java lines). >> 5. Stored procedure are cached by project's code. See >> SimpleJdbcCallsHandler.getCall(), while the >> NamedParameterJdbcTemplate are cached by spring's code which seems less >> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >> SP cache). >> 6. With the NamedParametersJdbcTemplate there is no use of the >> DbEngineDialect. What's the impact of it ? >> >> So besides 5 and 6, the rest is a matter of style. I'd like to hear more >> opinions from other members. >> > > I agree with all you wrote Moti > I don't think that looking in SQL in Java code is clear than looking in a > SP code > Additionally

...

1) You will pass more data on the wire instead of calling a SP with

> parameters

...

2) Your data that is passed on the wire is exposed to attacks since you

> will have to implement DB security in the engine level (for example hidden > columns)

...

3) Changes in the SQL code done in patches may be more complicated to

> track

...

4) SQL Injection

...

5) I think that SP performs better than SQL inside Java

> > I see no real reason to replace the SPs with SQL code , SP is just a > container for SQL code > > > >> >> Regards, >> Moti >> >> So I'd like to propose moving towards this approach in general in all >>> cases when nothing beyond a simple SQL is needed (no stored procedure >>> programming language is needed). >>> From my experience with the performance improvement task it looks like >>> people avoid adding new queries for a specific need of a flow, instead they >>> use the existing general ones (e.g. dao.getAllForX()) and do the actual >>> join in the bll code. >>> IMHO the proposed approach would simplify adding new specific queries >>> and by that would prevent a decent part of performance issues in the future. >>> >>> I do not propose changing all existing SP's to inline queries in a once, >>> but to allow adding new queries this way. Also we might consider converting >>> relatively simple SP's to inline SQL statements later in a graduate way. >>> >>> [1] - https://gerrit.ovirt.org/#/c/74456 >>> [2] - https://gerrit.ovirt.org/#/c/74458 >>> >>> Regards, >>> Yevgeny >>> >>> _______________________________________________ >>> Devel mailing list >>> Devel(a)ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/devel >>> >> >> >> >> -- >> Regards, >> Moti >> >> _______________________________________________ >> Devel mailing list >> Devel(a)ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel >> > >

Eli Mesika

4:45 a.m.

On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote:

...

> I don't think that looking in SQL in Java code is clear than looking in a SP code Looking in SQL isn't the problem I'm trying to solve, but finding that. > 1) You will pass more data on the wire instead of calling a SP with parameters Passing the SQL string shouldn't be a problem unless that is very long one (several K) and then we probably do something wrong. > 2) Your data that is passed on the wire is exposed to attacks since you will have to implement DB security in the engine level (for example hidden columns). IIUC currently querying tables/views with hidden columns is implemented in a SP that consist of at least 2 SQL's, so those aren't good candidates for my proposal and will stay as is. BTW how other projects resolve the security problem? AFAIK usually hidden columns are hidden by defining a proper view that do not include those.

That's a bad solution as long as your data is passed unmasked on the wire Database security should be done in the database level and you will not be able to do that without using SPs

...

> 3) Changes in the SQL code done in patches may be more complicated to track Most of SQL code changes involve changes in a DAO too, so this shouldn't be an issue.

It is !!! , changes in DAO are easy to track since it is a Java code , you are suggesting to write the SQL itself

...

> 4) SQL Injection Again, how other projects resolve the security problem? Internet is full of articles/blogs of why stored procedures should be avoided ("avoid stored procedures" Google query returned ~6.6M results), so I guess there are some other approaches for the security issue if that exists.

"use stored procedures" Google query returned About 4,840,000 results

...

> 5) I think that SP performs better than SQL inside Java Do we have a measurement of how much better? One of my intentions for this proposal is that people evidently avoid creating neat SQL queries and re-use the existing ones, which has much bigger performance impact. I guess that the biggest limit here is how complicated that procedure is in the engine code.

That's why we have code review process and we should nack such patches , so you think that if people are not familiar with Java8 syntax for example we should move this code to be performed by a "easier" mechanism ? If people are not doing the right thing , we have gerrit for that, we can comment , nack , whatever to make the code good

...

On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> wrote: > > > On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> wrote: > >> >> >> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> >> wrote: >> >>> Hi All, >>> >>> Recently I had a task of performance improvement in one of our network >>> related flows and had some hard time following our DAL code and one of the >>> outcomes of the task was defining a couple of new quite simple, but neat >>> queries. >>> When I came to coding those new queries I remembered how hard was >>> following the existing DAL code, I decided to make my own new methods >>> clearer. So I created [1] and [2] patches. >>> >>> Everything is quite standard there beside the fact that they do not use >>> any stored procedure, but use SQL directly, IMHO by that they save time >>> that I spent in trying to follow what a DAO method does. Looking into the >>> method code you get the understanding of what this method is all about: >>> >>> - no looking for a stored procedure name that is buried in the DAO >>> class hierarchy >>> - no looking for the SP definition >>> >>> >> There are additional pros and cons for the suggestion to consider: >> >> Pros: >> >> 1. No need to run engine-setup after changing DB related code (in >> case of SQL inside Java). >> >> Cons: >> >> 1. DAO files might become very long. >> 2. If you decide to return the business entity associated with the >> DAO as a returned object, you won't know as a caller which fields to expect >> to be populated, which lead to 3: >> 3. An inflation of business entities to represent partial populated >> business entity or inflation of mappers inflation (this will be required >> for SP as well). >> 4. SQL code inside of Java: >> 1. Beside of the fact that a multi-line concatenated string that >> cannot be easily copied and run with psql, it means that we should compile >> the code in order to test the change (vs building with DEV_REBUILD=0 which >> only package the SQL file). >> 2. No syntax highlighting when performing code review. i.e. I >> don't think reviewing a patch such as >> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >> twork_sp.sql >> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >> would be more clear inside a java file. >> 3. The user permissions management is implemented on DB level. >> That means that SQL will be more complex (more concatenated java lines). >> 5. Stored procedure are cached by project's code. See >> SimpleJdbcCallsHandler.getCall(), while the >> NamedParameterJdbcTemplate are cached by spring's code which seems less >> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >> SP cache). >> 6. With the NamedParametersJdbcTemplate there is no use of the >> DbEngineDialect. What's the impact of it ? >> >> So besides 5 and 6, the rest is a matter of style. I'd like to hear more >> opinions from other members. >> > > I agree with all you wrote Moti > I don't think that looking in SQL in Java code is clear than looking in > a SP code > Additionally > 1) You will pass more data on the wire instead of calling a SP with > parameters > 2) Your data that is passed on the wire is exposed to attacks since you > will have to implement DB security in the engine level (for example hidden > columns) > 3) Changes in the SQL code done in patches may be more complicated to > track > 4) SQL Injection > 5) I think that SP performs better than SQL inside Java > > I see no real reason to replace the SPs with SQL code , SP is just a > container for SQL code > > > >> >> Regards, >> Moti >> >> So I'd like to propose moving towards this approach in general in all >>> cases when nothing beyond a simple SQL is needed (no stored procedure >>> programming language is needed). >>> From my experience with the performance improvement task it looks like >>> people avoid adding new queries for a specific need of a flow, instead they >>> use the existing general ones (e.g. dao.getAllForX()) and do the actual >>> join in the bll code. >>> IMHO the proposed approach would simplify adding new specific queries >>> and by that would prevent a decent part of performance issues in the future. >>> >>> I do not propose changing all existing SP's to inline queries in a >>> once, but to allow adding new queries this way. Also we might consider >>> converting relatively simple SP's to inline SQL statements later in a >>> graduate way. >>> >>> [1] - https://gerrit.ovirt.org/#/c/74456 >>> [2] - https://gerrit.ovirt.org/#/c/74458 >>> >>> Regards, >>> Yevgeny >>> >>> _______________________________________________ >>> Devel mailing list >>> Devel(a)ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/devel >>> >> >> >> >> -- >> Regards, >> Moti >> >> _______________________________________________ >> Devel mailing list >> Devel(a)ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel >> > >

Yevgeny Zaspitsky

9:30 a.m.

On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <emesika(a)redhat.com> wrote:

...

On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote: > > I don't think that looking in SQL in Java code is clear than looking > in a SP code > > Looking in SQL isn't the problem I'm trying to solve, but finding that. > > > 1) You will pass more data on the wire instead of calling a SP with > parameters > > Passing the SQL string shouldn't be a problem unless that is very long > one (several K) and then we probably do something wrong. > > > 2) Your data that is passed on the wire is exposed to attacks since you > will have to implement DB security in the engine level (for example hidden > columns). > > IIUC currently querying tables/views with hidden columns is implemented > in a SP that consist of at least 2 SQL's, so those aren't good candidates > for my proposal and will stay as is. > BTW how other projects resolve the security problem? AFAIK usually hidden > columns are hidden by defining a proper view that do not include those. > That's a bad solution as long as your data is passed unmasked on the wire

In some cases we do send a sensitive info in the current solution that uses SP's. Should we use (maybe already) a kind of secured connection to DB? Database security should be done in the database level and you will not be

...

able to do that without using SPs

I'm not a DB security expert, but from my experience from my previous jobs none of them used "SP's only" approach, in fact they didn't use SP's at all.

...

> > > 3) Changes in the SQL code done in patches may be more complicated to > track > > Most of SQL code changes involve changes in a DAO too, so this shouldn't > be an issue. > It is !!! , changes in DAO are easy to track since it is a Java code , you are suggesting to write the SQL itself

If you need to update a Java code together with SQL, then how much difference is in seeing the change in a single file or in two separate ones? Frankly speaking I do not get what do you mean by "track". If you're about following what a DAO method does, then my approach simplifies the job.

...

> > > 4) SQL Injection > > Again, how other projects resolve the security problem? Internet is full > of articles/blogs of why stored procedures should be avoided ("avoid > > stored procedures" Google query returned ~6.6M results), so I guess there > are some other approaches for the security issue if that exists. > "use stored procedures" Google query returned About 4,840,000 results How many of those add "do not" to the search term? Anyhow avoid gives more

results...

...

> > > 5) I think that SP performs better than SQL inside Java > > Do we have a measurement of how much better? > One of my intentions for this proposal is that people evidently avoid > creating neat SQL queries and re-use the existing ones, which has much > bigger performance impact. I guess that the biggest limit here is how > complicated that procedure is in the engine code. > That's why we have code review process and we should nack such patches , so you think that if people are not familiar with Java8 syntax for example we should move this code to be performed by a "easier" mechanism ? If people are not doing the right thing , we have gerrit for that, we can comment , nack , whatever to make the code good

In a perfect word you're right, proper reviews should've stop that. But in the hard reality of oVirt code base it doesn't happen somehow... My proposal bases on the current situation of oVirt code that was created by using all Gerrit features.

...

> > > On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> wrote: > >> >> >> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> wrote: >> >>> >>> >>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com >>> > wrote: >>> >>>> Hi All, >>>> >>>> Recently I had a task of performance improvement in one of our network >>>> related flows and had some hard time following our DAL code and one of the >>>> outcomes of the task was defining a couple of new quite simple, but neat >>>> queries. >>>> When I came to coding those new queries I remembered how hard was >>>> following the existing DAL code, I decided to make my own new methods >>>> clearer. So I created [1] and [2] patches. >>>> >>>> Everything is quite standard there beside the fact that they do not >>>> use any stored procedure, but use SQL directly, IMHO by that they save time >>>> that I spent in trying to follow what a DAO method does. Looking into the >>>> method code you get the understanding of what this method is all about: >>>> >>>> - no looking for a stored procedure name that is buried in the DAO >>>> class hierarchy >>>> - no looking for the SP definition >>>> >>>> >>> There are additional pros and cons for the suggestion to consider: >>> >>> Pros: >>> >>> 1. No need to run engine-setup after changing DB related code (in >>> case of SQL inside Java). >>> >>> Cons: >>> >>> 1. DAO files might become very long. >>> 2. If you decide to return the business entity associated with the >>> DAO as a returned object, you won't know as a caller which fields to expect >>> to be populated, which lead to 3: >>> 3. An inflation of business entities to represent partial populated >>> business entity or inflation of mappers inflation (this will be required >>> for SP as well). >>> 4. SQL code inside of Java: >>> 1. Beside of the fact that a multi-line concatenated string that >>> cannot be easily copied and run with psql, it means that we should compile >>> the code in order to test the change (vs building with DEV_REBUILD=0 which >>> only package the SQL file). >>> 2. No syntax highlighting when performing code review. i.e. I >>> don't think reviewing a patch such as >>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >>> twork_sp.sql >>> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >>> would be more clear inside a java file. >>> 3. The user permissions management is implemented on DB level. >>> That means that SQL will be more complex (more concatenated java lines). >>> 5. Stored procedure are cached by project's code. See >>> SimpleJdbcCallsHandler.getCall(), while the >>> NamedParameterJdbcTemplate are cached by spring's code which seems less >>> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >>> SP cache). >>> 6. With the NamedParametersJdbcTemplate there is no use of the >>> DbEngineDialect. What's the impact of it ? >>> >>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>> more opinions from other members. >>> >> >> I agree with all you wrote Moti >> I don't think that looking in SQL in Java code is clear than looking in >> a SP code >> Additionally >> 1) You will pass more data on the wire instead of calling a SP with >> parameters >> 2) Your data that is passed on the wire is exposed to attacks since you >> will have to implement DB security in the engine level (for example hidden >> columns) >> 3) Changes in the SQL code done in patches may be more complicated to >> track >> 4) SQL Injection >> 5) I think that SP performs better than SQL inside Java >> >> I see no real reason to replace the SPs with SQL code , SP is just a >> container for SQL code >> >> >> >>> >>> Regards, >>> Moti >>> >>> So I'd like to propose moving towards this approach in general in all >>>> cases when nothing beyond a simple SQL is needed (no stored procedure >>>> programming language is needed). >>>> From my experience with the performance improvement task it looks like >>>> people avoid adding new queries for a specific need of a flow, instead they >>>> use the existing general ones (e.g. dao.getAllForX()) and do the actual >>>> join in the bll code. >>>> IMHO the proposed approach would simplify adding new specific queries >>>> and by that would prevent a decent part of performance issues in the future. >>>> >>>> I do not propose changing all existing SP's to inline queries in a >>>> once, but to allow adding new queries this way. Also we might consider >>>> converting relatively simple SP's to inline SQL statements later in a >>>> graduate way. >>>> >>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>> >>>> Regards, >>>> Yevgeny >>>> >>>> _______________________________________________ >>>> Devel mailing list >>>> Devel(a)ovirt.org >>>> http://lists.ovirt.org/mailman/listinfo/devel >>>> >>> >>> >>> >>> -- >>> Regards, >>> Moti >>> >>> _______________________________________________ >>> Devel mailing list >>> Devel(a)ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/devel >>> >> >> >

Yevgeny Zaspitsky

10:38 a.m.

On Mon, Mar 27, 2017 at 5:30 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote:

...

On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <emesika(a)redhat.com> wrote: > > > On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> > wrote: > >> > I don't think that looking in SQL in Java code is clear than looking >> in a SP code >> >> Looking in SQL isn't the problem I'm trying to solve, but finding that. >> >> > 1) You will pass more data on the wire instead of calling a SP with >> parameters >> >> Passing the SQL string shouldn't be a problem unless that is very long >> one (several K) and then we probably do something wrong. >> >> > 2) Your data that is passed on the wire is exposed to attacks since >> you will have to implement DB security in the engine level (for example >> hidden columns). >> >> IIUC currently querying tables/views with hidden columns is implemented >> in a SP that consist of at least 2 SQL's, so those aren't good candidates >> for my proposal and will stay as is. >> BTW how other projects resolve the security problem? AFAIK usually >> hidden columns are hidden by defining a proper view that do not include >> those. >> > > That's a bad solution as long as your data is passed unmasked on the > wire > In some cases we do send a sensitive info in the current solution that uses SP's. Should we use (maybe already) a kind of secured connection to DB? Database security should be done in the database level and you will not be > able to do that without using SPs > I'm not a DB security expert, but from my experience from my previous jobs none of them used "SP's only" approach, in fact they didn't use SP's at all. > > > >> >> > 3) Changes in the SQL code done in patches may be more complicated to >> track >> >> Most of SQL code changes involve changes in a DAO too, so this shouldn't >> be an issue. >> > It is !!! , changes in DAO are easy to track since it is a Java code , > you are suggesting to write the SQL itself > If you need to update a Java code together with SQL, then how much difference is in seeing the change in a single file or in two separate ones? Frankly speaking I do not get what do you mean by "track". If you're about following what a DAO method does, then my approach simplifies the job. > > >> >> > 4) SQL Injection >> >> Again, how other projects resolve the security problem? Internet is full >> of articles/blogs of why stored procedures should be avoided ("avoid >> >> stored procedures" Google query returned ~6.6M results), so I guess >> there are some other approaches for the security issue if that exists. >> > > "use stored procedures" Google query returned About 4,840,000 results > > How many of those add "do not" to the search term? Anyhow avoid gives more results...

Again, I'm not a DB security expert, but IIUC using prepared statements with parameters (what my patches do) doesn't allow a SQL injection. On the other side the existing SearchDao.getAllWithQuery implementors are exposed to SQL injection attempts. Am I right?

...

> >> >> > 5) I think that SP performs better than SQL inside Java >> >> Do we have a measurement of how much better? >> One of my intentions for this proposal is that people evidently avoid >> creating neat SQL queries and re-use the existing ones, which has much >> bigger performance impact. I guess that the biggest limit here is how >> complicated that procedure is in the engine code. >> > > That's why we have code review process and we should nack such patches > , so you think that if people are not familiar with Java8 syntax for > example we should move this code to be performed by a "easier" mechanism ? > If people are not doing the right thing , we have gerrit for that, we can > comment , nack , whatever to make the code good > In a perfect word you're right, proper reviews should've stop that. But in the hard reality of oVirt code base it doesn't happen somehow... My proposal bases on the current situation of oVirt code that was created by using all Gerrit features. > > > >> >> >> On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> wrote: >> >>> >>> >>> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> >>> wrote: >>> >>>> >>>> >>>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky < >>>> yzaspits(a)redhat.com> wrote: >>>> >>>>> Hi All, >>>>> >>>>> Recently I had a task of performance improvement in one of our >>>>> network related flows and had some hard time following our DAL code and one >>>>> of the outcomes of the task was defining a couple of new quite simple, but >>>>> neat queries. >>>>> When I came to coding those new queries I remembered how hard was >>>>> following the existing DAL code, I decided to make my own new methods >>>>> clearer. So I created [1] and [2] patches. >>>>> >>>>> Everything is quite standard there beside the fact that they do not >>>>> use any stored procedure, but use SQL directly, IMHO by that they save time >>>>> that I spent in trying to follow what a DAO method does. Looking into the >>>>> method code you get the understanding of what this method is all about: >>>>> >>>>> - no looking for a stored procedure name that is buried in the >>>>> DAO class hierarchy >>>>> - no looking for the SP definition >>>>> >>>>> >>>> There are additional pros and cons for the suggestion to consider: >>>> >>>> Pros: >>>> >>>> 1. No need to run engine-setup after changing DB related code (in >>>> case of SQL inside Java). >>>> >>>> Cons: >>>> >>>> 1. DAO files might become very long. >>>> 2. If you decide to return the business entity associated with the >>>> DAO as a returned object, you won't know as a caller which fields to expect >>>> to be populated, which lead to 3: >>>> 3. An inflation of business entities to represent partial >>>> populated business entity or inflation of mappers inflation (this will be >>>> required for SP as well). >>>> 4. SQL code inside of Java: >>>> 1. Beside of the fact that a multi-line concatenated string that >>>> cannot be easily copied and run with psql, it means that we should compile >>>> the code in order to test the change (vs building with DEV_REBUILD=0 which >>>> only package the SQL file). >>>> 2. No syntax highlighting when performing code review. i.e. I >>>> don't think reviewing a patch such as >>>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >>>> twork_sp.sql >>>> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >>>> would be more clear inside a java file. >>>> 3. The user permissions management is implemented on DB level. >>>> That means that SQL will be more complex (more concatenated java lines). >>>> 5. Stored procedure are cached by project's code. See >>>> SimpleJdbcCallsHandler.getCall(), while the >>>> NamedParameterJdbcTemplate are cached by spring's code which seems less >>>> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >>>> SP cache). >>>> 6. With the NamedParametersJdbcTemplate there is no use of the >>>> DbEngineDialect. What's the impact of it ? >>>> >>>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>>> more opinions from other members. >>>> >>> >>> I agree with all you wrote Moti >>> I don't think that looking in SQL in Java code is clear than looking >>> in a SP code >>> Additionally >>> 1) You will pass more data on the wire instead of calling a SP with >>> parameters >>> 2) Your data that is passed on the wire is exposed to attacks since you >>> will have to implement DB security in the engine level (for example hidden >>> columns) >>> 3) Changes in the SQL code done in patches may be more complicated to >>> track >>> 4) SQL Injection >>> 5) I think that SP performs better than SQL inside Java >>> >>> I see no real reason to replace the SPs with SQL code , SP is just a >>> container for SQL code >>> >>> >>> >>>> >>>> Regards, >>>> Moti >>>> >>>> So I'd like to propose moving towards this approach in general in all >>>>> cases when nothing beyond a simple SQL is needed (no stored procedure >>>>> programming language is needed). >>>>> From my experience with the performance improvement task it looks >>>>> like people avoid adding new queries for a specific need of a flow, instead >>>>> they use the existing general ones (e.g. dao.getAllForX()) and do the >>>>> actual join in the bll code. >>>>> IMHO the proposed approach would simplify adding new specific queries >>>>> and by that would prevent a decent part of performance issues in the future. >>>>> >>>>> I do not propose changing all existing SP's to inline queries in a >>>>> once, but to allow adding new queries this way. Also we might consider >>>>> converting relatively simple SP's to inline SQL statements later in a >>>>> graduate way. >>>>> >>>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>>> >>>>> Regards, >>>>> Yevgeny >>>>> >>>>> _______________________________________________ >>>>> Devel mailing list >>>>> Devel(a)ovirt.org >>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>> >>>> >>>> >>>> >>>> -- >>>> Regards, >>>> Moti >>>> >>>> _______________________________________________ >>>> Devel mailing list >>>> Devel(a)ovirt.org >>>> http://lists.ovirt.org/mailman/listinfo/devel >>>> >>> >>> >> >

Martin Mucha

Wednesday, 29 March Wed, 29 Mar

8:23 a.m.

Hi, I didn't want to step into this, but I was asked by dan to express my opinion about this. I'm not pushing for change (since I believe it will be blocked), I'm just trying to help us focus. We should have some questions answered prior to saying what we want to do... Current state of accessing db is that there's fired lots of unnecessary db calls, just because there isn't easy way to do them correctly, someone is lazy, whatever, that isn't important. The blame was laid on invalid code review. Now let me remember maintainer Kolesnik, who during CR *insisted* on creating N+1 problem[1] when accessing db, to save ±10 lines of code in dal layer. So Eli is right, our code suck because of improper code revision process, but sometimes it is maintainers who wants invalid code to be pushed to simplify db access code. Therefore we cannot rely on review process, this is real issue we have. I'm fine with it, this is how we do things. But we also have (sometimes (a lot)) degraded performance because of it. We should have answered following questions: 1) is performance even important at all? Should we optimize db access? I'm assuming "yes" should be the answer. If it's not the answer, we can skip next questions. 2) can someone responsible for whole project provide timings and count of queries? This should easily bring crosshair on commands requiring attention. 2.1) customer is issuing synchronous command. What is maximum number of queries this command (plus all transitive internal commands) should fire? What's the number for asynchronous command? 2.2) same for time; what is the maximum time for synchronous and asynchronous command? 3) lets say that we found command which requires optimization (and that should not be hard). If we go back to mentioned N+1 problem, it's rather easy to avoid it for one association(yet we did not do it even there), but it's not that easy for multiple joins [1]. But it's superbly easy to achieve it using ORM tool, even with still using only named queries, even native ones defined outside of java. Supposing we want to optimize found command, what would dal maintainers consider as a acceptable optimization? Is there even possibility to not use stored procedures? I believe saying simple "no", if that's the case, is valid response, which can save time. If there is theoretical possibility, what criteria must be met so that we can consider replacing most trivial stored procedures with some tool in our project? There are lots of tools/frameworks, based on your criteria we might find some, which would work best... ————— My generic opinion would be to stick with sp and allowing to use something less heavy for simplest queries/stuff. It could beJPA using ORM and entities. Or named queries (still using entities) or nativequeries (which does not use entities, but plain sql) and both ends up creating prepared statements on server startup. Or we could use some lighter orm. But I'd definitely avoid writing any new homebrewed approach, this isn't a sane option. [1] example: Lets discuss query association A—>B—>C, 5xA, each A has 5B etc. In our typical dao this will mean 1+5+5*5=31 queries. Same with properly annotated ORM sums up to 1 or 3 queries. M. On Mon, Mar 27, 2017 at 5:38 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote:

...

On Mon, Mar 27, 2017 at 5:30 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote: > > > On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <emesika(a)redhat.com> wrote: > >> >> >> On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com >> > wrote: >> >>> > I don't think that looking in SQL in Java code is clear than looking >>> in a SP code >>> >>> Looking in SQL isn't the problem I'm trying to solve, but finding that. >>> >>> > 1) You will pass more data on the wire instead of calling a SP with >>> parameters >>> >>> Passing the SQL string shouldn't be a problem unless that is very long >>> one (several K) and then we probably do something wrong. >>> >>> > 2) Your data that is passed on the wire is exposed to attacks since >>> you will have to implement DB security in the engine level (for example >>> hidden columns). >>> >>> IIUC currently querying tables/views with hidden columns is implemented >>> in a SP that consist of at least 2 SQL's, so those aren't good candidates >>> for my proposal and will stay as is. >>> BTW how other projects resolve the security problem? AFAIK usually >>> hidden columns are hidden by defining a proper view that do not include >>> those. >>> >> >> That's a bad solution as long as your data is passed unmasked on the >> wire >> > > In some cases we do send a sensitive info in the current solution that > uses SP's. > Should we use (maybe already) a kind of secured connection to DB? > > Database security should be done in the database level and you will not >> be able to do that without using SPs >> > > I'm not a DB security expert, but from my experience from my previous > jobs none of them used "SP's only" approach, in fact they didn't use SP's > at all. > >> >> >> >>> >>> > 3) Changes in the SQL code done in patches may be more complicated to >>> track >>> >>> Most of SQL code changes involve changes in a DAO too, so this >>> shouldn't be an issue. >>> >> It is !!! , changes in DAO are easy to track since it is a Java code , >> you are suggesting to write the SQL itself >> > > If you need to update a Java code together with SQL, then how much > difference is in seeing the change in a single file or in two separate ones? > Frankly speaking I do not get what do you mean by "track". If you're > about following what a DAO method does, then my approach simplifies the job. > >> >> >>> >>> > 4) SQL Injection >>> >>> Again, how other projects resolve the security problem? Internet is >>> full of articles/blogs of why stored procedures should be avoided ("avoid >>> >>> stored procedures" Google query returned ~6.6M results), so I guess >>> there are some other approaches for the security issue if that exists. >>> >> >> "use stored procedures" Google query returned About 4,840,000 results >> >> How many of those add "do not" to the search term? Anyhow avoid gives > more results... > Again, I'm not a DB security expert, but IIUC using prepared statements with parameters (what my patches do) doesn't allow a SQL injection. On the other side the existing SearchDao.getAllWithQuery implementors are exposed to SQL injection attempts. Am I right? >> >>> >>> > 5) I think that SP performs better than SQL inside Java >>> >>> Do we have a measurement of how much better? >>> One of my intentions for this proposal is that people evidently avoid >>> creating neat SQL queries and re-use the existing ones, which has much >>> bigger performance impact. I guess that the biggest limit here is how >>> complicated that procedure is in the engine code. >>> >> >> That's why we have code review process and we should nack such patches >> , so you think that if people are not familiar with Java8 syntax for >> example we should move this code to be performed by a "easier" mechanism ? >> If people are not doing the right thing , we have gerrit for that, we >> can comment , nack , whatever to make the code good >> > > In a perfect word you're right, proper reviews should've stop that. But > in the hard reality of oVirt code base it doesn't happen somehow... My > proposal bases on the current situation of oVirt code that was created by > using all Gerrit features. > >> >> >> >>> >>> >>> On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> >>> wrote: >>> >>>> >>>> >>>> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky < >>>>> yzaspits(a)redhat.com> wrote: >>>>> >>>>>> Hi All, >>>>>> >>>>>> Recently I had a task of performance improvement in one of our >>>>>> network related flows and had some hard time following our DAL code and one >>>>>> of the outcomes of the task was defining a couple of new quite simple, but >>>>>> neat queries. >>>>>> When I came to coding those new queries I remembered how hard was >>>>>> following the existing DAL code, I decided to make my own new methods >>>>>> clearer. So I created [1] and [2] patches. >>>>>> >>>>>> Everything is quite standard there beside the fact that they do not >>>>>> use any stored procedure, but use SQL directly, IMHO by that they save time >>>>>> that I spent in trying to follow what a DAO method does. Looking into the >>>>>> method code you get the understanding of what this method is all about: >>>>>> >>>>>> - no looking for a stored procedure name that is buried in the >>>>>> DAO class hierarchy >>>>>> - no looking for the SP definition >>>>>> >>>>>> >>>>> There are additional pros and cons for the suggestion to consider: >>>>> >>>>> Pros: >>>>> >>>>> 1. No need to run engine-setup after changing DB related code (in >>>>> case of SQL inside Java). >>>>> >>>>> Cons: >>>>> >>>>> 1. DAO files might become very long. >>>>> 2. If you decide to return the business entity associated with >>>>> the DAO as a returned object, you won't know as a caller which fields to >>>>> expect to be populated, which lead to 3: >>>>> 3. An inflation of business entities to represent partial >>>>> populated business entity or inflation of mappers inflation (this will be >>>>> required for SP as well). >>>>> 4. SQL code inside of Java: >>>>> 1. Beside of the fact that a multi-line concatenated string that >>>>> cannot be easily copied and run with psql, it means that we should compile >>>>> the code in order to test the change (vs building with DEV_REBUILD=0 which >>>>> only package the SQL file). >>>>> 2. No syntax highlighting when performing code review. i.e. I >>>>> don't think reviewing a patch such as >>>>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >>>>> twork_sp.sql >>>>> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >>>>> would be more clear inside a java file. >>>>> 3. The user permissions management is implemented on DB level. >>>>> That means that SQL will be more complex (more concatenated java lines). >>>>> 5. Stored procedure are cached by project's code. See >>>>> SimpleJdbcCallsHandler.getCall(), while the >>>>> NamedParameterJdbcTemplate are cached by spring's code which seems less >>>>> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >>>>> SP cache). >>>>> 6. With the NamedParametersJdbcTemplate there is no use of the >>>>> DbEngineDialect. What's the impact of it ? >>>>> >>>>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>>>> more opinions from other members. >>>>> >>>> >>>> I agree with all you wrote Moti >>>> I don't think that looking in SQL in Java code is clear than looking >>>> in a SP code >>>> Additionally >>>> 1) You will pass more data on the wire instead of calling a SP with >>>> parameters >>>> 2) Your data that is passed on the wire is exposed to attacks since >>>> you will have to implement DB security in the engine level (for example >>>> hidden columns) >>>> 3) Changes in the SQL code done in patches may be more complicated to >>>> track >>>> 4) SQL Injection >>>> 5) I think that SP performs better than SQL inside Java >>>> >>>> I see no real reason to replace the SPs with SQL code , SP is just a >>>> container for SQL code >>>> >>>> >>>> >>>>> >>>>> Regards, >>>>> Moti >>>>> >>>>> So I'd like to propose moving towards this approach in general in all >>>>>> cases when nothing beyond a simple SQL is needed (no stored procedure >>>>>> programming language is needed). >>>>>> From my experience with the performance improvement task it looks >>>>>> like people avoid adding new queries for a specific need of a flow, instead >>>>>> they use the existing general ones (e.g. dao.getAllForX()) and do the >>>>>> actual join in the bll code. >>>>>> IMHO the proposed approach would simplify adding new specific >>>>>> queries and by that would prevent a decent part of performance issues in >>>>>> the future. >>>>>> >>>>>> I do not propose changing all existing SP's to inline queries in a >>>>>> once, but to allow adding new queries this way. Also we might consider >>>>>> converting relatively simple SP's to inline SQL statements later in a >>>>>> graduate way. >>>>>> >>>>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>>>> >>>>>> Regards, >>>>>> Yevgeny >>>>>> >>>>>> _______________________________________________ >>>>>> Devel mailing list >>>>>> Devel(a)ovirt.org >>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Regards, >>>>> Moti >>>>> >>>>> _______________________________________________ >>>>> Devel mailing list >>>>> Devel(a)ovirt.org >>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>> >>>> >>>> >>> >> > _______________________________________________ Devel mailing list Devel(a)ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

Eli Mesika

Thursday, 30 March Thu, 30 Mar

2:31 a.m.

On Wed, Mar 29, 2017 at 4:23 PM, Martin Mucha <mmucha(a)redhat.com> wrote:

...

In that case , if we will not use SP , we still will have to secure the data (for example , hidden columns for some users like in the VDS view)

...

[1] example: Lets discuss query association A—>B—>C, 5xA, each A has 5B etc. In our typical dao this will mean 1+5+5*5=31 queries. Same with properly annotated ORM sums up to 1 or 3 queries. M. On Mon, Mar 27, 2017 at 5:38 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote: > > > On Mon, Mar 27, 2017 at 5:30 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> > wrote: > >> >> >> On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <emesika(a)redhat.com> wrote: >> >>> >>> >>> On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky < >>> yzaspits(a)redhat.com> wrote: >>> >>>> > I don't think that looking in SQL in Java code is clear than >>>> looking in a SP code >>>> >>>> Looking in SQL isn't the problem I'm trying to solve, but finding that. >>>> >>>> > 1) You will pass more data on the wire instead of calling a SP with >>>> parameters >>>> >>>> Passing the SQL string shouldn't be a problem unless that is very long >>>> one (several K) and then we probably do something wrong. >>>> >>>> > 2) Your data that is passed on the wire is exposed to attacks since >>>> you will have to implement DB security in the engine level (for example >>>> hidden columns). >>>> >>>> IIUC currently querying tables/views with hidden columns is >>>> implemented in a SP that consist of at least 2 SQL's, so those aren't good >>>> candidates for my proposal and will stay as is. >>>> BTW how other projects resolve the security problem? AFAIK usually >>>> hidden columns are hidden by defining a proper view that do not include >>>> those. >>>> >>> >>> That's a bad solution as long as your data is passed unmasked on the >>> wire >>> >> >> In some cases we do send a sensitive info in the current solution that >> uses SP's. >> Should we use (maybe already) a kind of secured connection to DB? >> >> Database security should be done in the database level and you will not >>> be able to do that without using SPs >>> >> >> I'm not a DB security expert, but from my experience from my previous >> jobs none of them used "SP's only" approach, in fact they didn't use SP's >> at all. >> >>> >>> >>> >>>> >>>> > 3) Changes in the SQL code done in patches may be more complicated >>>> to track >>>> >>>> Most of SQL code changes involve changes in a DAO too, so this >>>> shouldn't be an issue. >>>> >>> It is !!! , changes in DAO are easy to track since it is a Java code , >>> you are suggesting to write the SQL itself >>> >> >> If you need to update a Java code together with SQL, then how much >> difference is in seeing the change in a single file or in two separate ones? >> Frankly speaking I do not get what do you mean by "track". If you're >> about following what a DAO method does, then my approach simplifies the job. >> >>> >>> >>>> >>>> > 4) SQL Injection >>>> >>>> Again, how other projects resolve the security problem? Internet is >>>> full of articles/blogs of why stored procedures should be avoided ("avoid >>>> >>>> stored procedures" Google query returned ~6.6M results), so I guess >>>> there are some other approaches for the security issue if that exists. >>>> >>> >>> "use stored procedures" Google query returned About 4,840,000 results >>> >>> How many of those add "do not" to the search term? Anyhow avoid gives >> more results... >> > > Again, I'm not a DB security expert, but IIUC using prepared statements > with parameters (what my patches do) doesn't allow a SQL injection. On the > other side the existing SearchDao.getAllWithQuery implementors are exposed > to SQL injection attempts. Am I right? > > >>> >>>> >>>> > 5) I think that SP performs better than SQL inside Java >>>> >>>> Do we have a measurement of how much better? >>>> One of my intentions for this proposal is that people evidently avoid >>>> creating neat SQL queries and re-use the existing ones, which has much >>>> bigger performance impact. I guess that the biggest limit here is how >>>> complicated that procedure is in the engine code. >>>> >>> >>> That's why we have code review process and we should nack such >>> patches , so you think that if people are not familiar with Java8 syntax >>> for example we should move this code to be performed by a "easier" >>> mechanism ? >>> If people are not doing the right thing , we have gerrit for that, we >>> can comment , nack , whatever to make the code good >>> >> >> In a perfect word you're right, proper reviews should've stop that. But >> in the hard reality of oVirt code base it doesn't happen somehow... My >> proposal bases on the current situation of oVirt code that was created by >> using all Gerrit features. >> >>> >>> >>> >>>> >>>> >>>> On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> >>>> wrote: >>>> >>>>> >>>>> >>>>> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky < >>>>>> yzaspits(a)redhat.com> wrote: >>>>>> >>>>>>> Hi All, >>>>>>> >>>>>>> Recently I had a task of performance improvement in one of our >>>>>>> network related flows and had some hard time following our DAL code and one >>>>>>> of the outcomes of the task was defining a couple of new quite simple, but >>>>>>> neat queries. >>>>>>> When I came to coding those new queries I remembered how hard was >>>>>>> following the existing DAL code, I decided to make my own new methods >>>>>>> clearer. So I created [1] and [2] patches. >>>>>>> >>>>>>> Everything is quite standard there beside the fact that they do not >>>>>>> use any stored procedure, but use SQL directly, IMHO by that they save time >>>>>>> that I spent in trying to follow what a DAO method does. Looking into the >>>>>>> method code you get the understanding of what this method is all about: >>>>>>> >>>>>>> - no looking for a stored procedure name that is buried in the >>>>>>> DAO class hierarchy >>>>>>> - no looking for the SP definition >>>>>>> >>>>>>> >>>>>> There are additional pros and cons for the suggestion to consider: >>>>>> >>>>>> Pros: >>>>>> >>>>>> 1. No need to run engine-setup after changing DB related code >>>>>> (in case of SQL inside Java). >>>>>> >>>>>> Cons: >>>>>> >>>>>> 1. DAO files might become very long. >>>>>> 2. If you decide to return the business entity associated with >>>>>> the DAO as a returned object, you won't know as a caller which fields to >>>>>> expect to be populated, which lead to 3: >>>>>> 3. An inflation of business entities to represent partial >>>>>> populated business entity or inflation of mappers inflation (this will be >>>>>> required for SP as well). >>>>>> 4. SQL code inside of Java: >>>>>> 1. Beside of the fact that a multi-line concatenated string that >>>>>> cannot be easily copied and run with psql, it means that we should compile >>>>>> the code in order to test the change (vs building with DEV_REBUILD=0 which >>>>>> only package the SQL file). >>>>>> 2. No syntax highlighting when performing code review. i.e. I >>>>>> don't think reviewing a patch such as >>>>>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >>>>>> twork_sp.sql >>>>>> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >>>>>> would be more clear inside a java file. >>>>>> 3. The user permissions management is implemented on DB >>>>>> level. That means that SQL will be more complex (more concatenated java >>>>>> lines). >>>>>> 5. Stored procedure are cached by project's code. See >>>>>> SimpleJdbcCallsHandler.getCall(), while the >>>>>> NamedParameterJdbcTemplate are cached by spring's code which seems less >>>>>> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >>>>>> SP cache). >>>>>> 6. With the NamedParametersJdbcTemplate there is no use of the >>>>>> DbEngineDialect. What's the impact of it ? >>>>>> >>>>>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>>>>> more opinions from other members. >>>>>> >>>>> >>>>> I agree with all you wrote Moti >>>>> I don't think that looking in SQL in Java code is clear than looking >>>>> in a SP code >>>>> Additionally >>>>> 1) You will pass more data on the wire instead of calling a SP with >>>>> parameters >>>>> 2) Your data that is passed on the wire is exposed to attacks since >>>>> you will have to implement DB security in the engine level (for example >>>>> hidden columns) >>>>> 3) Changes in the SQL code done in patches may be more complicated to >>>>> track >>>>> 4) SQL Injection >>>>> 5) I think that SP performs better than SQL inside Java >>>>> >>>>> I see no real reason to replace the SPs with SQL code , SP is just a >>>>> container for SQL code >>>>> >>>>> >>>>> >>>>>> >>>>>> Regards, >>>>>> Moti >>>>>> >>>>>> So I'd like to propose moving towards this approach in general in >>>>>>> all cases when nothing beyond a simple SQL is needed (no stored procedure >>>>>>> programming language is needed). >>>>>>> From my experience with the performance improvement task it looks >>>>>>> like people avoid adding new queries for a specific need of a flow, instead >>>>>>> they use the existing general ones (e.g. dao.getAllForX()) and do the >>>>>>> actual join in the bll code. >>>>>>> IMHO the proposed approach would simplify adding new specific >>>>>>> queries and by that would prevent a decent part of performance issues in >>>>>>> the future. >>>>>>> >>>>>>> I do not propose changing all existing SP's to inline queries in a >>>>>>> once, but to allow adding new queries this way. Also we might consider >>>>>>> converting relatively simple SP's to inline SQL statements later in a >>>>>>> graduate way. >>>>>>> >>>>>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>>>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>>>>> >>>>>>> Regards, >>>>>>> Yevgeny >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Devel mailing list >>>>>>> Devel(a)ovirt.org >>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Regards, >>>>>> Moti >>>>>> >>>>>> _______________________________________________ >>>>>> Devel mailing list >>>>>> Devel(a)ovirt.org >>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>> >>>>> >>>>> >>>> >>> >> > > _______________________________________________ > Devel mailing list > Devel(a)ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel >

Martin Mucha

3 a.m.

On Thu, Mar 30, 2017 at 9:31 AM, Eli Mesika <emesika(a)redhat.com> wrote:

...

On Wed, Mar 29, 2017 at 4:23 PM, Martin Mucha <mmucha(a)redhat.com> wrote: > Hi, > > I didn't want to step into this, but I was asked by dan to express my > opinion about this. I'm not pushing for change (since I believe it will be > blocked), I'm just trying to help us focus. We should have some questions > answered prior to saying what we want to do... > > Current state of accessing db is that there's fired lots of unnecessary > db calls, just because there isn't easy way to do them correctly, someone > is lazy, whatever, that isn't important. The blame was laid on invalid code > review. Now let me remember maintainer Kolesnik, who during CR *insisted* > on creating N+1 problem[1] when accessing db, to save ±10 lines of code in > dal layer. So Eli is right, our code suck because of improper code revision > process, but sometimes it is maintainers who wants invalid code to be > pushed to simplify db access code. Therefore we cannot rely on review > process, this is real issue we have. I'm fine with it, this is how we do > things. But we also have (sometimes (a lot)) degraded performance because > of it. > > We should have answered following questions: > > 1) is performance even important at all? Should we optimize db access? > I'm assuming "yes" should be the answer. If it's not the answer, we can > skip next questions. > > 2) can someone responsible for whole project provide timings and count of > queries? This should easily bring crosshair on commands requiring attention. > 2.1) customer is issuing synchronous command. What is maximum number of > queries this command (plus all transitive internal commands) should fire? > What's the number for asynchronous command? > 2.2) same for time; what is the maximum time for synchronous and > asynchronous command? > > 3) lets say that we found command which requires optimization (and that > should not be hard). If we go back to mentioned N+1 problem, it's rather > easy to avoid it for one association(yet we did not do it even there), but > it's not that easy for multiple joins [1]. But it's superbly easy to > achieve it using ORM tool, even with still using only named queries, even > native ones defined outside of java. Supposing we want to optimize found > command, what would dal maintainers consider as a acceptable optimization? > Is there even possibility to not use stored procedures? I believe saying > simple "no", if that's the case, is valid response, which can save time. If > there is theoretical possibility, what criteria must be met so that we can > consider replacing most trivial stored procedures with some tool in our > project? There are lots of tools/frameworks, based on your criteria we > might find some, which would work best... > > ————— > > My generic opinion would be to stick with sp and allowing to use > something less heavy for simplest queries/stuff. It could beJPA using ORM > and entities. Or named queries (still using entities) or nativequeries > (which does not use entities, but plain sql) and both ends up creating > prepared statements on server startup. Or we could use some lighter orm. > But I'd definitely avoid writing any new homebrewed approach, this isn't a > sane option. > In that case , if we will not use SP , we still will have to secure the data (for example , hidden columns for some users like in the VDS view)

*Sorry, but I asked 3 questions we should have answers for to decide further steps. We have to start with what we want, and as I said, until then, all _generic opinions_ (including mine ones) are irrelevant. Can you please answer them? IIUC yevgeny is trying to find a way how to improve our project, but that's hard to accoplish if we do not have defined, what do we expect from our project. The thing, that developers and maintainers are avoiding writing new specific queries is real issue to tackle, and one of reasons they do so is lots of code need to be written. Therefore yevgeny I believe is trying to find a way how to overcome this. We can focus our efforts in this area, or at least say boldly „we do not want any changes”, both is fine I believe.* *You mentioned need to secure data. Is this sole reason which bounds us to SP? Can you give me link to specific example, so that I can understand better why we cannot use simple queries without SP? Is this limitation really relevant to majority of simple queries yevgeny is interrested in?** (since nobody is talking about dropping SP altogether). IIRC (and I do) I worked with db system mimicking windows permissions (hierarchical rules, allow/deny rules) and no SP was needed to do (native) sql query. Lets say I want to see all VDS for hostname then. We have this SP for this:* *CREATE OR REPLACE FUNCTION GetVdsByHostName (v_host_name VARCHAR(255))RETURNS SETOF vds STABLE AS $PROCEDURE$BEGIN BEGIN RETURN QUERY SELECT DISTINCT vds.* FROM vds WHERE host_name = v_host_name; END; RETURN;END;$PROCEDURE$LANGUAGE plpgsql;Now I log into psql and issue only this select contained in this SP. Why it won't work correctly?*

...

> > [1] example: Lets discuss query association A—>B—>C, 5xA, each A has 5B > etc. In our typical dao this will mean 1+5+5*5=31 queries. Same with > properly annotated ORM sums up to 1 or 3 queries. > > > M. > > On Mon, Mar 27, 2017 at 5:38 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> > wrote: > >> >> >> On Mon, Mar 27, 2017 at 5:30 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> >> wrote: >> >>> >>> >>> On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <emesika(a)redhat.com> >>> wrote: >>> >>>> >>>> >>>> On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky < >>>> yzaspits(a)redhat.com> wrote: >>>> >>>>> > I don't think that looking in SQL in Java code is clear than >>>>> looking in a SP code >>>>> >>>>> Looking in SQL isn't the problem I'm trying to solve, but finding >>>>> that. >>>>> >>>>> > 1) You will pass more data on the wire instead of calling a SP with >>>>> parameters >>>>> >>>>> Passing the SQL string shouldn't be a problem unless that is very >>>>> long one (several K) and then we probably do something wrong. >>>>> >>>>> > 2) Your data that is passed on the wire is exposed to attacks since >>>>> you will have to implement DB security in the engine level (for example >>>>> hidden columns). >>>>> >>>>> IIUC currently querying tables/views with hidden columns is >>>>> implemented in a SP that consist of at least 2 SQL's, so those aren't good >>>>> candidates for my proposal and will stay as is. >>>>> BTW how other projects resolve the security problem? AFAIK usually >>>>> hidden columns are hidden by defining a proper view that do not include >>>>> those. >>>>> >>>> >>>> That's a bad solution as long as your data is passed unmasked on the >>>> wire >>>> >>> >>> In some cases we do send a sensitive info in the current solution that >>> uses SP's. >>> Should we use (maybe already) a kind of secured connection to DB? >>> >>> Database security should be done in the database level and you will not >>>> be able to do that without using SPs >>>> >>> >>> I'm not a DB security expert, but from my experience from my previous >>> jobs none of them used "SP's only" approach, in fact they didn't use SP's >>> at all. >>> >>>> >>>> >>>> >>>>> >>>>> > 3) Changes in the SQL code done in patches may be more complicated >>>>> to track >>>>> >>>>> Most of SQL code changes involve changes in a DAO too, so this >>>>> shouldn't be an issue. >>>>> >>>> It is !!! , changes in DAO are easy to track since it is a Java code >>>> , you are suggesting to write the SQL itself >>>> >>> >>> If you need to update a Java code together with SQL, then how much >>> difference is in seeing the change in a single file or in two separate ones? >>> Frankly speaking I do not get what do you mean by "track". If you're >>> about following what a DAO method does, then my approach simplifies the job. >>> >>>> >>>> >>>>> >>>>> > 4) SQL Injection >>>>> >>>>> Again, how other projects resolve the security problem? Internet is >>>>> full of articles/blogs of why stored procedures should be avoided ("avoid >>>>> >>>>> stored procedures" Google query returned ~6.6M results), so I guess >>>>> there are some other approaches for the security issue if that exists. >>>>> >>>> >>>> "use stored procedures" Google query returned About 4,840,000 results >>>> >>>> >>>> How many of those add "do not" to the search term? Anyhow avoid gives >>> more results... >>> >> >> Again, I'm not a DB security expert, but IIUC using prepared statements >> with parameters (what my patches do) doesn't allow a SQL injection. On the >> other side the existing SearchDao.getAllWithQuery implementors are exposed >> to SQL injection attempts. Am I right? >> >> >>>> >>>>> >>>>> > 5) I think that SP performs better than SQL inside Java >>>>> >>>>> Do we have a measurement of how much better? >>>>> One of my intentions for this proposal is that people evidently avoid >>>>> creating neat SQL queries and re-use the existing ones, which has much >>>>> bigger performance impact. I guess that the biggest limit here is how >>>>> complicated that procedure is in the engine code. >>>>> >>>> >>>> That's why we have code review process and we should nack such >>>> patches , so you think that if people are not familiar with Java8 syntax >>>> for example we should move this code to be performed by a "easier" >>>> mechanism ? >>>> If people are not doing the right thing , we have gerrit for that, we >>>> can comment , nack , whatever to make the code good >>>> >>> >>> In a perfect word you're right, proper reviews should've stop that. But >>> in the hard reality of oVirt code base it doesn't happen somehow... My >>> proposal bases on the current situation of oVirt code that was created by >>> using all Gerrit features. >>> >>>> >>>> >>>> >>>>> >>>>> >>>>> On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky < >>>>>>> yzaspits(a)redhat.com> wrote: >>>>>>> >>>>>>>> Hi All, >>>>>>>> >>>>>>>> Recently I had a task of performance improvement in one of our >>>>>>>> network related flows and had some hard time following our DAL code and one >>>>>>>> of the outcomes of the task was defining a couple of new quite simple, but >>>>>>>> neat queries. >>>>>>>> When I came to coding those new queries I remembered how hard was >>>>>>>> following the existing DAL code, I decided to make my own new methods >>>>>>>> clearer. So I created [1] and [2] patches. >>>>>>>> >>>>>>>> Everything is quite standard there beside the fact that they do >>>>>>>> not use any stored procedure, but use SQL directly, IMHO by that they save >>>>>>>> time that I spent in trying to follow what a DAO method does. Looking into >>>>>>>> the method code you get the understanding of what this method is all about: >>>>>>>> >>>>>>>> - no looking for a stored procedure name that is buried in the >>>>>>>> DAO class hierarchy >>>>>>>> - no looking for the SP definition >>>>>>>> >>>>>>>> >>>>>>> There are additional pros and cons for the suggestion to consider: >>>>>>> >>>>>>> Pros: >>>>>>> >>>>>>> 1. No need to run engine-setup after changing DB related code >>>>>>> (in case of SQL inside Java). >>>>>>> >>>>>>> Cons: >>>>>>> >>>>>>> 1. DAO files might become very long. >>>>>>> 2. If you decide to return the business entity associated with >>>>>>> the DAO as a returned object, you won't know as a caller which fields to >>>>>>> expect to be populated, which lead to 3: >>>>>>> 3. An inflation of business entities to represent partial >>>>>>> populated business entity or inflation of mappers inflation (this will be >>>>>>> required for SP as well). >>>>>>> 4. SQL code inside of Java: >>>>>>> 1. Beside of the fact that a multi-line concatenated string >>>>>>> that cannot be easily copied and run with psql, it means that we should >>>>>>> compile the code in order to test the change (vs building with >>>>>>> DEV_REBUILD=0 which only package the SQL file). >>>>>>> 2. No syntax highlighting when performing code review. i.e. >>>>>>> I don't think reviewing a patch such as >>>>>>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/ne >>>>>>> twork_sp.sql >>>>>>> <https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql> >>>>>>> would be more clear inside a java file. >>>>>>> 3. The user permissions management is implemented on DB >>>>>>> level. That means that SQL will be more complex (more concatenated java >>>>>>> lines). >>>>>>> 5. Stored procedure are cached by project's code. See >>>>>>> SimpleJdbcCallsHandler.getCall(), while the >>>>>>> NamedParameterJdbcTemplate are cached by spring's code which seems less >>>>>>> optimal (sync all calls using synchronized vs using ConcurrentHashMap as in >>>>>>> SP cache). >>>>>>> 6. With the NamedParametersJdbcTemplate there is no use of the >>>>>>> DbEngineDialect. What's the impact of it ? >>>>>>> >>>>>>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>>>>>> more opinions from other members. >>>>>>> >>>>>> >>>>>> I agree with all you wrote Moti >>>>>> I don't think that looking in SQL in Java code is clear than >>>>>> looking in a SP code >>>>>> Additionally >>>>>> 1) You will pass more data on the wire instead of calling a SP with >>>>>> parameters >>>>>> 2) Your data that is passed on the wire is exposed to attacks since >>>>>> you will have to implement DB security in the engine level (for example >>>>>> hidden columns) >>>>>> 3) Changes in the SQL code done in patches may be more complicated >>>>>> to track >>>>>> 4) SQL Injection >>>>>> 5) I think that SP performs better than SQL inside Java >>>>>> >>>>>> I see no real reason to replace the SPs with SQL code , SP is just a >>>>>> container for SQL code >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> Moti >>>>>>> >>>>>>> So I'd like to propose moving towards this approach in general in >>>>>>>> all cases when nothing beyond a simple SQL is needed (no stored procedure >>>>>>>> programming language is needed). >>>>>>>> From my experience with the performance improvement task it looks >>>>>>>> like people avoid adding new queries for a specific need of a flow, instead >>>>>>>> they use the existing general ones (e.g. dao.getAllForX()) and do the >>>>>>>> actual join in the bll code. >>>>>>>> IMHO the proposed approach would simplify adding new specific >>>>>>>> queries and by that would prevent a decent part of performance issues in >>>>>>>> the future. >>>>>>>> >>>>>>>> I do not propose changing all existing SP's to inline queries in a >>>>>>>> once, but to allow adding new queries this way. Also we might consider >>>>>>>> converting relatively simple SP's to inline SQL statements later in a >>>>>>>> graduate way. >>>>>>>> >>>>>>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>>>>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>>>>>> >>>>>>>> Regards, >>>>>>>> Yevgeny >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Devel mailing list >>>>>>>> Devel(a)ovirt.org >>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Regards, >>>>>>> Moti >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Devel mailing list >>>>>>> Devel(a)ovirt.org >>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >> _______________________________________________ >> Devel mailing list >> Devel(a)ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel >> > >

Martin Sivak

3:17 a.m.

...

In that case , if we will not use SP , we still will have to secure the data (for example , hidden columns for some users like in the VDS view)

But isn't this the current problem? - The whole permission model is so complicated that nobody wants to write new queries with it. (custom solution) - Permissions are checked both in db and in the engine differently (our custom solution) - We do not have clear distinction between internal queries and queries with user facing data. - The SQL code style we expect is not properly documented and has NO support in IDEs we use so it is all manual hit or miss when trying to get a patch merged. (custom sql style) - We hand "optimize" every query without measuring first (again all is custom with no caching) And the most important issues I think are: - We have no developer documentation for tables or queries that would explain what the structure is and what should be visible or hidden when certain user is looking at it. (We do not have any entity documentation either, so I often have to dig into dao, vdsbroker and vdsm just to find out the used units) - You insist on seeing every single patch and any db update means you have to constantly update the filename of the script, that makes merging hard and really slow (again custom solution, although I accept neither flyway nor liquidbase are too much better) We do not have to use ORM for everything (that would be a mistake too I think), but think about this: Hibernate/JPA or jOOQ or Spring Data probably have had more people working at them and using them, than what we will ever be able to provide for the whole engine. So why not reuse at least part of what the community can give us. New developers will get productive sooner too... When you sum all those up, no wonder people are trying to avoid touching the db layer. We have so many custom steps in the process that maybe we should start looking in what the industry is using. Having everything custom will take us nowhere (that is not just about DAO). Martin On Thu, Mar 30, 2017 at 9:31 AM, Eli Mesika <emesika(a)redhat.com> wrote: > > > On Wed, Mar 29, 2017 at 4:23 PM, Martin Mucha <mmucha(a)redhat.com> wrote: >> >> Hi, >> >> I didn't want to step into this, but I was asked by dan to express my >> opinion about this. I'm not pushing for change (since I believe it will be >> blocked), I'm just trying to help us focus. We should have some questions >> answered prior to saying what we want to do... >> >> Current state of accessing db is that there's fired lots of unnecessary db >> calls, just because there isn't easy way to do them correctly, someone is >> lazy, whatever, that isn't important. The blame was laid on invalid code >> review. Now let me remember maintainer Kolesnik, who during CR insisted on >> creating N+1 problem[1] when accessing db, to save ±10 lines of code in dal >> layer. So Eli is right, our code suck because of improper code revision >> process, but sometimes it is maintainers who wants invalid code to be pushed >> to simplify db access code. Therefore we cannot rely on review process, this >> is real issue we have. I'm fine with it, this is how we do things. But we >> also have (sometimes (a lot)) degraded performance because of it. >> >> We should have answered following questions: >> >> 1) is performance even important at all? Should we optimize db access? I'm >> assuming "yes" should be the answer. If it's not the answer, we can skip >> next questions. >> >> 2) can someone responsible for whole project provide timings and count of >> queries? This should easily bring crosshair on commands requiring attention. >> 2.1) customer is issuing synchronous command. What is maximum number of >> queries this command (plus all transitive internal commands) should fire? >> What's the number for asynchronous command? >> 2.2) same for time; what is the maximum time for synchronous and >> asynchronous command? >> >> 3) lets say that we found command which requires optimization (and that >> should not be hard). If we go back to mentioned N+1 problem, it's rather >> easy to avoid it for one association(yet we did not do it even there), but >> it's not that easy for multiple joins [1]. But it's superbly easy to achieve >> it using ORM tool, even with still using only named queries, even native >> ones defined outside of java. Supposing we want to optimize found command, >> what would dal maintainers consider as a acceptable optimization? Is there >> even possibility to not use stored procedures? I believe saying simple "no", >> if that's the case, is valid response, which can save time. If there is >> theoretical possibility, what criteria must be met so that we can consider >> replacing most trivial stored procedures with some tool in our project? >> There are lots of tools/frameworks, based on your criteria we might find >> some, which would work best... >> >> ————— >> >> My generic opinion would be to stick with sp and allowing to use something >> less heavy for simplest queries/stuff. It could beJPA using ORM and >> entities. Or named queries (still using entities) or nativequeries (which >> does not use entities, but plain sql) and both ends up creating prepared >> statements on server startup. Or we could use some lighter orm. But I'd >> definitely avoid writing any new homebrewed approach, this isn't a sane >> option. > >

...

In that case , if we will not use SP , we still will have to secure the data (for example , hidden columns for some users like in the VDS view)

> >> >> >> [1] example: Lets discuss query association A—>B—>C, 5xA, each A has 5B >> etc. In our typical dao this will mean 1+5+5*5=31 queries. Same with >> properly annotated ORM sums up to 1 or 3 queries. >> >> >> M. >> >> On Mon, Mar 27, 2017 at 5:38 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> >> wrote: >>> >>> >>> >>> On Mon, Mar 27, 2017 at 5:30 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> >>> wrote: >>>> >>>> >>>> >>>> On Mon, Mar 27, 2017 at 12:45 PM, Eli Mesika <emesika(a)redhat.com> wrote: >>>>> >>>>> >>>>> >>>>> On Mon, Mar 27, 2017 at 12:26 PM, Yevgeny Zaspitsky >>>>> <yzaspits(a)redhat.com> wrote: >>>>>> >>>>>> > I don't think that looking in SQL in Java code is clear than looking >>>>>> > in a SP code >>>>>> >>>>>> Looking in SQL isn't the problem I'm trying to solve, but finding >>>>>> that. >>>>>> >>>>>> > 1) You will pass more data on the wire instead of calling a SP with >>>>>> > parameters >>>>>> >>>>>> Passing the SQL string shouldn't be a problem unless that is very long >>>>>> one (several K) and then we probably do something wrong. >>>>>> >>>>>> > 2) Your data that is passed on the wire is exposed to attacks since >>>>>> > you will have to implement DB security in the engine level (for example >>>>>> > hidden columns). >>>>>> >>>>>> IIUC currently querying tables/views with hidden columns is >>>>>> implemented in a SP that consist of at least 2 SQL's, so those aren't good >>>>>> candidates for my proposal and will stay as is. >>>>>> BTW how other projects resolve the security problem? AFAIK usually >>>>>> hidden columns are hidden by defining a proper view that do not include >>>>>> those. >>>>> >>>>> >>>>> That's a bad solution as long as your data is passed unmasked on the >>>>> wire >>>> >>>> >>>> In some cases we do send a sensitive info in the current solution that >>>> uses SP's. >>>> Should we use (maybe already) a kind of secured connection to DB? >>>> >>>>> Database security should be done in the database level and you will not >>>>> be able to do that without using SPs >>>> >>>> >>>> I'm not a DB security expert, but from my experience from my previous >>>> jobs none of them used "SP's only" approach, in fact they didn't use SP's at >>>> all. >>>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>> > 3) Changes in the SQL code done in patches may be more complicated >>>>>> > to track >>>>>> >>>>>> Most of SQL code changes involve changes in a DAO too, so this >>>>>> shouldn't be an issue. >>>>> >>>>> It is !!! , changes in DAO are easy to track since it is a Java code , >>>>> you are suggesting to write the SQL itself >>>> >>>> >>>> If you need to update a Java code together with SQL, then how much >>>> difference is in seeing the change in a single file or in two separate ones? >>>> Frankly speaking I do not get what do you mean by "track". If you're >>>> about following what a DAO method does, then my approach simplifies the job. >>>>> >>>>> >>>>>> >>>>>> >>>>>> > 4) SQL Injection >>>>>> >>>>>> Again, how other projects resolve the security problem? Internet is >>>>>> full of articles/blogs of why stored procedures should be avoided ("avoid >>>>>> stored procedures" Google query returned ~6.6M results), so I guess >>>>>> there are some other approaches for the security issue if that exists. >>>>> >>>>> >>>>> "use stored procedures" Google query returned About 4,840,000 results >>>>> >>>> How many of those add "do not" to the search term? Anyhow avoid gives >>>> more results... >>> >>> >>> Again, I'm not a DB security expert, but IIUC using prepared statements >>> with parameters (what my patches do) doesn't allow a SQL injection. On the >>> other side the existing SearchDao.getAllWithQuery implementors are exposed >>> to SQL injection attempts. Am I right? >>> >>>>> >>>>>> >>>>>> >>>>>> > 5) I think that SP performs better than SQL inside Java >>>>>> >>>>>> Do we have a measurement of how much better? >>>>>> One of my intentions for this proposal is that people evidently avoid >>>>>> creating neat SQL queries and re-use the existing ones, which has much >>>>>> bigger performance impact. I guess that the biggest limit here is how >>>>>> complicated that procedure is in the engine code. >>>>> >>>>> >>>>> That's why we have code review process and we should nack such patches >>>>> , so you think that if people are not familiar with Java8 syntax for example >>>>> we should move this code to be performed by a "easier" mechanism ? >>>>> If people are not doing the right thing , we have gerrit for that, we >>>>> can comment , nack , whatever to make the code good >>>> >>>> >>>> In a perfect word you're right, proper reviews should've stop that. But >>>> in the hard reality of oVirt code base it doesn't happen somehow... My >>>> proposal bases on the current situation of oVirt code that was created by >>>> using all Gerrit features. >>>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Mon, Mar 27, 2017 at 11:09 AM, Eli Mesika <emesika(a)redhat.com> >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Sun, Mar 26, 2017 at 5:12 PM, Yevgeny Zaspitsky >>>>>>>> <yzaspits(a)redhat.com> wrote: >>>>>>>>> >>>>>>>>> Hi All, >>>>>>>>> >>>>>>>>> Recently I had a task of performance improvement in one of our >>>>>>>>> network related flows and had some hard time following our DAL code and one >>>>>>>>> of the outcomes of the task was defining a couple of new quite simple, but >>>>>>>>> neat queries. >>>>>>>>> When I came to coding those new queries I remembered how hard was >>>>>>>>> following the existing DAL code, I decided to make my own new methods >>>>>>>>> clearer. So I created [1] and [2] patches. >>>>>>>>> >>>>>>>>> Everything is quite standard there beside the fact that they do not >>>>>>>>> use any stored procedure, but use SQL directly, IMHO by that they save time >>>>>>>>> that I spent in trying to follow what a DAO method does. Looking into the >>>>>>>>> method code you get the understanding of what this method is all about: >>>>>>>>> >>>>>>>>> no looking for a stored procedure name that is buried in the DAO >>>>>>>>> class hierarchy >>>>>>>>> no looking for the SP definition >>>>>>>> >>>>>>>> >>>>>>>> There are additional pros and cons for the suggestion to consider: >>>>>>>> >>>>>>>> Pros: >>>>>>>> >>>>>>>> No need to run engine-setup after changing DB related code (in case >>>>>>>> of SQL inside Java). >>>>>>>> >>>>>>>> Cons: >>>>>>>> >>>>>>>> DAO files might become very long. >>>>>>>> If you decide to return the business entity associated with the DAO >>>>>>>> as a returned object, you won't know as a caller which fields to expect to >>>>>>>> be populated, which lead to 3: >>>>>>>> An inflation of business entities to represent partial populated >>>>>>>> business entity or inflation of mappers inflation (this will be required for >>>>>>>> SP as well). >>>>>>>> SQL code inside of Java: >>>>>>>> >>>>>>>> Beside of the fact that a multi-line concatenated string that cannot >>>>>>>> be easily copied and run with psql, it means that we should compile the code >>>>>>>> in order to test the change (vs building with DEV_REBUILD=0 which only >>>>>>>> package the SQL file). >>>>>>>> No syntax highlighting when performing code review. i.e. I don't >>>>>>>> think reviewing a patch such as >>>>>>>> https://gerrit.ovirt.org/#/c/66729/10/packaging/dbscripts/network_sp.sql >>>>>>>> would be more clear inside a java file. >>>>>>>> The user permissions management is implemented on DB level. That >>>>>>>> means that SQL will be more complex (more concatenated java lines). >>>>>>>> >>>>>>>> Stored procedure are cached by project's code. See >>>>>>>> SimpleJdbcCallsHandler.getCall(), while the NamedParameterJdbcTemplate are >>>>>>>> cached by spring's code which seems less optimal (sync all calls using >>>>>>>> synchronized vs using ConcurrentHashMap as in SP cache). >>>>>>>> With the NamedParametersJdbcTemplate there is no use of the >>>>>>>> DbEngineDialect. What's the impact of it ? >>>>>>>> >>>>>>>> So besides 5 and 6, the rest is a matter of style. I'd like to hear >>>>>>>> more opinions from other members. >>>>>>> >>>>>>> >>>>>>> I agree with all you wrote Moti >>>>>>> I don't think that looking in SQL in Java code is clear than looking >>>>>>> in a SP code >>>>>>> Additionally >>>>>>> 1) You will pass more data on the wire instead of calling a SP with >>>>>>> parameters >>>>>>> 2) Your data that is passed on the wire is exposed to attacks since >>>>>>> you will have to implement DB security in the engine level (for example >>>>>>> hidden columns) >>>>>>> 3) Changes in the SQL code done in patches may be more complicated to >>>>>>> track >>>>>>> 4) SQL Injection >>>>>>> 5) I think that SP performs better than SQL inside Java >>>>>>> >>>>>>> I see no real reason to replace the SPs with SQL code , SP is just a >>>>>>> container for SQL code >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Regards, >>>>>>>> Moti >>>>>>>> >>>>>>>>> So I'd like to propose moving towards this approach in general in >>>>>>>>> all cases when nothing beyond a simple SQL is needed (no stored procedure >>>>>>>>> programming language is needed). >>>>>>>>> From my experience with the performance improvement task it looks >>>>>>>>> like people avoid adding new queries for a specific need of a flow, instead >>>>>>>>> they use the existing general ones (e.g. dao.getAllForX()) and do the actual >>>>>>>>> join in the bll code. >>>>>>>>> IMHO the proposed approach would simplify adding new specific >>>>>>>>> queries and by that would prevent a decent part of performance issues in the >>>>>>>>> future. >>>>>>>>> >>>>>>>>> I do not propose changing all existing SP's to inline queries in a >>>>>>>>> once, but to allow adding new queries this way. Also we might consider >>>>>>>>> converting relatively simple SP's to inline SQL statements later in a >>>>>>>>> graduate way. >>>>>>>>> >>>>>>>>> [1] - https://gerrit.ovirt.org/#/c/74456 >>>>>>>>> [2] - https://gerrit.ovirt.org/#/c/74458 >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Yevgeny >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> Devel mailing list >>>>>>>>> Devel(a)ovirt.org >>>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Regards, >>>>>>>> Moti >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Devel mailing list >>>>>>>> Devel(a)ovirt.org >>>>>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >>> _______________________________________________ >>> Devel mailing list >>> Devel(a)ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/devel >> >> > > > _______________________________________________ > Devel mailing list > Devel(a)ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel

Yevgeny Zaspitsky

Monday, 27 March Mon, 27 Mar

4:01 a.m.

...

1. DAO files might become very long.

They would not be much longer than the current alternative, the .sql files.

...

2. An inflation of business entities to represent partial populated

business entity or inflation of mappers inflation (this will be required for SP as well). This isn't a problem of the DAO layer but bad design of business objects that do not verify their own state. That could happen in any layer and happens in existing DAO that uses stored procedures already.

...

3. An inflation of business entities to represent partial populated

business entity or inflation of mappers inflation (this will be required for SP as well). Look at my patches: one of them returns list of ids and another one returns whole Cluster object and uses the existing RowMapper that populates the whole Cluster object, so in both cases no much effort is needed in order to keep the current state in terms of BE completness. IMHO having to create a new class for a specific query would be the limit that will stop people from creating queries that return something that isn't an existing type. Anyhow I do not see how the proposed is changes the current state, can't we do the same with a stored procedure?

...

4.1 Beside of the fact that a multi-line concatenated string that cannot

be easily copied and run with psql, it means that we should compile the code in order to test the change (vs building with DEV_REBUILD=0 which only package the SQL file). Running a unit-test in your IDE after updating the in-code SQL is even faster than the current flow.

...

4.2 No syntax highlighting when performing code review. i.e. I don't

think reviewing a patch such as https://gerrit.ovirt.org/#/c/ 66729/10/packaging/dbscripts/network_sp.sql would be more clear inside a java file. If you review the DAO Java code in your IDE (I'd recommend that for any Java change) you get the hiighlighting too. Beside that a code is written/reviewed only once but is read many times and here my proposal is much easier.

...

4.3 The user permissions management is implemented on DB level. That

means that SQL will be more complex (more concatenated java lines). We can keep those long queries in SPs if you like. But I'm in favor to enable the shorter one to be in-lined.

...

5. Stored procedure are cached by project's code. See

SimpleJdbcCallsHandler.getCall(), while the NamedParameterJdbcTemplate are cached by spring's code which seems less optimal (sync all calls using synchronized vs using ConcurrentHashMap as in SP cache). I'm not sure that this caching code is needed. Has this proven impact/need? Are we pioneers in JDBC area? How other projects do that?

...

6. With the NamedParametersJdbcTemplate there is no use of the

DbEngineDialect. What's the impact of it ? In my patches I do use it through getCustomMapSqlParameterSource. On Mon, Mar 27, 2017 at 1:56 AM, Moti Asayag <masayag(a)redhat.com> wrote:

...

Martin Sivak

3:30 a.m.

I agree with Moti, but I think we might have an alternative. Does our DAO layer support named queries (aka PreparedStatements loaded from a properties file)? Those are generally plain SQL queries with arguments that are compiled in advance and still allow direct access to the JDBC internals without the heavy translation layer. Martin On Sun, Mar 26, 2017 at 4:12 PM, Yevgeny Zaspitsky <yzaspits(a)redhat.com> wrote:

...

Hi All, Recently I had a task of performance improvement in one of our network related flows and had some hard time following our DAL code and one of the outcomes of the task was defining a couple of new quite simple, but neat queries. When I came to coding those new queries I remembered how hard was following the existing DAL code, I decided to make my own new methods clearer. So I created [1] and [2] patches. Everything is quite standard there beside the fact that they do not use any stored procedure, but use SQL directly, IMHO by that they save time that I spent in trying to follow what a DAO method does. Looking into the method code you get the understanding of what this method is all about: no looking for a stored procedure name that is buried in the DAO class hierarchy no looking for the SP definition So I'd like to propose moving towards this approach in general in all cases when nothing beyond a simple SQL is needed (no stored procedure programming language is needed). From my experience with the performance improvement task it looks like people avoid adding new queries for a specific need of a flow, instead they use the existing general ones (e.g. dao.getAllForX()) and do the actual join in the bll code. IMHO the proposed approach would simplify adding new specific queries and by that would prevent a decent part of performance issues in the future. I do not propose changing all existing SP's to inline queries in a once, but to allow adding new queries this way. Also we might consider converting relatively simple SP's to inline SQL statements later in a graduate way. [1] - https://gerrit.ovirt.org/#/c/74456 [2] - https://gerrit.ovirt.org/#/c/74458 Regards, Yevgeny _______________________________________________ Devel mailing list Devel(a)ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

3016

days inactive

3020

days old

devel@ovirt.org

Manage subscription

12 comments

5 participants

tags (0)

participants (5)

Eli Mesika
Martin Mucha
Martin Sivak
Moti Asayag
Yevgeny Zaspitsky

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Writing SQL queries in DAO code