
Hi All,

for quite some time I have been more or less involved in the development of various UIs for oVirt based entirely on oVirt's REST API, ranging from the quite mature moVirt [1] through some Cockpit extensions to a young and experimental user portal replacement [2].

One issue we hit over and over again is the missing data aggregation. In the 3.x era we used the detail=something API in moVirt to get the disks and NICs of the VM, something like:

GET /ovirt-engine/api/vms
Accept: application/json; detail=disks

This allowed us to store this data in a local database, leading to a great user experience. Since this feature has been removed in the 4.x API [3], we needed to resort to a different solution: when the user selects the VM detail, start loading the disks and NICs and hope the user will not be fast enough to see the delay. The UX is slightly worse but kind of acceptable.

We hit this issue harder in the new user portal [2], because we already have the VM cached and show the whole VM on one screen. So, if you pick it, you will get its details immediately. But since we don't have all the details, we need to do an additional call (two actually) to load this data, and they start to appear later. So something which would be very fast and smooth starts to feel sluggish.

Recently, we hit this issue again, which forced us to sacrifice the UX even more: it is the "console in use" feature of the user portal. The use case is this:
- if the console is already taken by some user, there are complications if another user tries to take it as well (I will avoid the details about the settings and permissions involved, but long story short, the user will probably not be allowed to connect to it. The "probably" is the key here, since we cannot make any intelligent decision in advance; we can only warn the user that the console is taken).
- in the current GWT user portal, if the VM's console is taken, it is shown on the VM's "box" that "console is taken". This was a highly requested feature.
- to get this information using the current REST API, we need to go to the /vms/<vmid>/sessions subcollection. To get this for all VMs, we would be doing N queries per poll, which we cannot afford. So the current PR [4] will probably end up checking it only on the attempt to connect to the console and warning the user. Maybe it will also be shown in the VM details. But the UX will suffer significantly when the user looks for a VM which has a free console (e.g. trying them one by one until some opens, or looking at the details one by one to see if the warning appears, with a delay).

I understand that embedding the details of the VM into the response comes with a cost, namely:
- performance hit
- complexity of the API code
- the "cleanness" of REST suffers

But I think we should seriously consider providing some option for data aggregation.

I know this has been discussed many times with no result, but I think it is time to bring this topic up again. I'll try to summarize the (failed) attempts so far:
- the detail=<something> parameter with ad-hoc embedding of data. This was there and was removed in 4.0 [3].
- the DoctorREST project, i.e. a proxy above the current API. The idea was to create a service which would be independent of the engine itself, would locally poll the engine's REST API, store all data in a local (mongo)DB and provide a rich API with aggregations, projections and push notifications. This polling of everything to get the data to DoctorREST proved to be pretty costly, so a more invasive approach of pushing data from the engine to the doctor has also been discussed [5]. Neither of these two approaches has been accepted (too complicated, too invasive).
- writing some custom ad-hoc servlet serving only the purpose of one frontend. This is actually there for the dashboard, but it is not a generic solution for the other frontends and we really should not develop custom "APIs" for every frontend.
- there were some other proposals discussed (some third-party solutions etc.), but I think none of them made it even to a PoC.

So now I would try again and try small to get at least some benefit. I see two paths we could try:
1: embed something which burns us immediately, e.g. the /sessions into VMs. I really liked the ;detail=sessions approach, could we bring it back?
2: add some tiny service which would just accept a list of queries, execute them locally (but using real HTTP requests) and return the results in one bulk. A naive implementation, just to give a sense of what I mean, would be a shell script which gets a list of strings like "https://localhost/ovirt-engine/api/vms/123/sessions", iterates over them, does a curl request for each, mangles the results into one string and returns it (credits for this idea to msivak). Easy to implement, with the possibility to also add projections later to save some bandwidth. But the API would still be hammered by a bunch of queries; only the network round trip would be saved.
3: any other simple approaches?

I honestly prefer the first approach. It is not beautiful, it is not RESTful, but it is easy to implement, very pragmatic and useful. What do you think?

Thank you and sorry for the long mail :)
Tomas

[1]: https://github.com/oVirt/moVirt
[2]: https://github.com/oVirt/ovirt-web-ui
[3]: https://gerrit.ovirt.org/#/c/61260
[4]: https://github.com/oVirt/ovirt-web-ui/pull/106/
[5]: https://gerrit.ovirt.org/#/c/45233/
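A minimal sketch of the naive bulk service from option 2, written in Python rather than shell, just to make the idea concrete. The function names, the JSON envelope keyed by URL and the ok/body/error fields are assumptions made for illustration, not an existing oVirt API; authentication and projections are left out.

```python
import json
import urllib.request
from typing import Callable, Dict, List


def http_get(url: str) -> str:
    """Fetch one URL with a real HTTP request and return the body as text."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def fetch_bulk(urls: List[str], get: Callable[[str], str] = http_get) -> str:
    """Run every query in order and return one JSON document keyed by URL.

    The envelope ({"ok": ..., "body": ...}) is a hypothetical format.
    """
    results: Dict[str, object] = {}
    for url in urls:
        try:
            results[url] = {"ok": True, "body": json.loads(get(url))}
        except Exception as error:  # one failed query should not sink the batch
            results[url] = {"ok": False, "error": str(error)}
    return json.dumps(results)
```

The `get` parameter exists only so the loop can be exercised without a running engine. As noted in the mail, the engine still executes every query; only the client's round trips are saved.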

for quite some time I have been more or less involved in development of various UIs for oVirt based entirely on the oVirt's REST API ranging from the quite mature moVirt [1] through some cockpit extensions to a young and experimental user portal replacement [2].
oVirt optimizer has the same issue..
2: add some tiny service which would just accept a list of queries, execute them locally (but using real HTTP requests) and return in one bulk. A naive implementation just to give a sense of what I mean of this would be a shell script getting list of strings like "https://localhost/ovirt-engine/api/vms/123/sessions" iterate over them and do a curl request for each, mangle the results into one string and return (credits for this idea to msivak). Easy to implement, possibility to add also projections later to save some bandwidth. But the API would anyway be hammered by bunch of queries, only the network roundtrip would be saved.
The biggest cost for (especially mobile) clients is the cost of establishing a new SSL connection. SSL is also pretty expensive on the server side.

So running the aggregation service on the ovirt-engine machine (behind Apache) means the client will do a single SSL request with a list of N URLs, and the local "reverse-proxy" will perform a single authentication and N plain HTTP requests (or even better - AJP). It won't remove any time from the actual command run time, but it will reduce protocol overhead.

I think this is the simplest first step that requires almost no change to existing infrastructure.

--
Martin Sivak
SLA / oVirt

On Fri, Mar 24, 2017 at 10:20 AM, Tomas Jelinek <tjelinek@redhat.com> wrote:
[snip]
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
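The protocol-overhead argument in this reply can be put into rough numbers with a small model. This is only a sketch under stated assumptions: the unbatched client is assumed to open a new SSL connection for every request (the worst case; connection reuse would lower it), and the millisecond values in the usage note are illustrative placeholders, not measurements.

```python
def protocol_overhead_ms(requests: int, handshake_ms: float,
                         round_trip_ms: float, batched: bool) -> float:
    """Estimate client-side protocol overhead, excluding command run time.

    Unbatched: one handshake and one round trip per request (assumed).
    Batched: one handshake and one round trip for the whole list.
    """
    connections = 1 if batched else requests
    round_trips = 1 if batched else requests
    return connections * handshake_ms + round_trips * round_trip_ms
```

With 50 requests, a 100 ms handshake and a 40 ms round trip (all made-up values), the model gives 7000 ms unbatched and 140 ms batched. The time the engine spends executing the N queries is the same in both cases and is deliberately left out.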

I feel like every REST API I've ever worked with has had the aggregation + projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.

There must be some industry solution to this "I want to do SQL over REST" problem.

On Fri, Mar 24, 2017 at 5:54 AM, Martin Sivak <msivak@redhat.com> wrote:
[snip]
--
Greg Sheremeta, MBA
Red Hat, Inc.
Sr. Software Engineer
gshereme@redhat.com

I feel like every REST API I've ever worked with has had the aggregation + projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.
The current industry solution I know about is called an API gateway. Most of the big players have an internal API with lots of low-level stuff and then a couple of external API gateways tailored to what the client needs.

http://microservices.io/patterns/apigateway.html (check the backend for frontend section)

This trend is also visible when you think about services that offer API gateway management and billing, like https://aws.amazon.com/api-gateway/ or our very own https://www.3scale.net/

Martin

On Fri, Mar 24, 2017 at 3:47 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
[snip]

On Fri, Mar 24, 2017 at 3:58 PM, Martin Sivak <msivak@redhat.com> wrote:
I feel like every REST API I've ever worked with has had the aggregation + projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.
The current industry solution I know about is called API gateway.. most of the big players have internal API with lots of low level stuff and then couple of external API gateways tailored to what the client needs.
http://microservices.io/patterns/apigateway.html (check the backend for frontend section)
This trend is also visible when you think about services that offer API gateway management and billing like https://aws.amazon.com/api-gateway/ or our very own https://www.3scale.net/
right, but the API gateway solves 2 problems:
1: if you have a microservice architecture, it is hard for the frontend to talk to 20 different moving services, so the gateway hides this complexity behind it. This is not the problem we have.
2: you can have more API gateways (i.e. more APIs), each tailored to one frontend. I don't think we need this - the current API serves us pretty well in every FE I'm involved in. The only thing I miss is the data aggregation.

So I would go back to the original topic of this thread - do some small change which has a chance to be merged into the project and helps us where it hurts.
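To show what such a small change would buy the frontend, here is a minimal sketch of the "console in use" check done on a single VM listing with sessions embedded, instead of N calls to /vms/<vmid>/sessions. The embedded JSON shape (a "sessions" object inside each VM, a "console_user" flag on each session, booleans possibly rendered as strings) is an assumption for illustration; the real 4.x API does not return this today.

```python
from typing import Any, Dict, Set


def is_true(value: Any) -> bool:
    """Accept both JSON booleans and the string form "true"."""
    return str(value).lower() == "true"


def vms_with_console_taken(vm_list: Dict[str, Any]) -> Set[str]:
    """Return the ids of VMs that have at least one console session.

    Expects a hypothetical listing of the form
    {"vm": [{"id": ..., "sessions": {"session": [{"console_user": ...}]}}]}.
    """
    taken: Set[str] = set()
    for vm in vm_list.get("vm", []):
        sessions = vm.get("sessions", {}).get("session", [])
        if any(is_true(s.get("console_user")) for s in sessions):
            taken.add(vm["id"])
    return taken
```

With this shape, one poll of the VM list is enough to draw the "console is taken" label on every VM box; VMs without embedded sessions are simply treated as free.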
Martin
I feel like every REST API I've ever worked with has had the aggregation
projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.
There must be some industry solution to this "I want to do SQL over REST" problem.
On Fri, Mar 24, 2017 at 5:54 AM, Martin Sivak <msivak@redhat.com> wrote:
for quite some time I have been more or less involved in development
of
various UIs for oVirt based entirely on the oVirt's REST API ranging from the quite mature moVirt [1] through some cockpit extensions to a young and experimental user portal replacement [2].
oVirt optimizer has the same issue..
2: add some tiny service which would just accept a list of queries, execute them locally (but using real HTTP requests) and return in one bulk. A naive implementation just to give a sense of what I mean of this would be a shell script getting list of strings like "https://localhost/ovirt-engine/api/vms/123/sessions" iterate over
and do a curl request for each, mangle the results into one string and return (credits for this idea to msivak). Easy to implement, possibility to add also projections later to save some bandwidth. But the API would anyway be hammered by bunch of queries, only the network roundtrip would be saved.
The biggest cost for (especially mobile) clients is the cost of establishing new SSL connection. SSL is also pretty expensive on the server side.
So running the aggregation service on the ovirt-engine machine (behind Apache) means the client will do a single SSL request with list of N urls and the local "reverse-proxy" will perform single authentication and N plain HTTP requests (or even better - AJP). It won't remove any time from the actual command run time, but it will reduce protocol overhead.
I think this is the simplest first step that requires almost no change to existing infrastructure.
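To make the batching idea concrete, here is a minimal sketch in Python (not an existing oVirt service). It assumes the service receives a list of API paths, runs each one locally and returns a single JSON document keyed by path. The names run_batch and fake_fetch and the canned payloads are invented for illustration; a real version would replace fake_fetch with a local HTTP or AJP call to the engine.

```python
import json
from typing import Callable, Dict, List


def run_batch(paths: List[str], fetch: Callable[[str], str]) -> str:
    """Fetch every path and return one JSON document keyed by path."""
    results: Dict[str, object] = {}
    for path in paths:
        try:
            results[path] = {"ok": True, "body": json.loads(fetch(path))}
        except Exception as exc:
            # Report the failure for this path only and keep going.
            results[path] = {"ok": False, "error": str(exc)}
    return json.dumps(results)


def fake_fetch(path: str) -> str:
    # Hypothetical stand-in for the local call to the engine.
    # The payloads are made up and do not mirror the real API schema.
    canned = {
        "/ovirt-engine/api/vms/123/sessions": '{"count": 1}',
        "/ovirt-engine/api/vms/456/sessions": '{"count": 0}',
    }
    return canned[path]


if __name__ == "__main__":
    print(run_batch(list(("/ovirt-engine/api/vms/123/sessions",
                          "/ovirt-engine/api/vms/456/sessions")), fake_fetch))
```

A failing path is reported in its own entry, so one bad query does not sink the whole batch. As noted above, the engine still executes N queries; only the client-side round trips are collapsed into one.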
-- Martin Sivak SLA / oVirt
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Greg Sheremeta, MBA Red Hat, Inc. Sr. Software Engineer gshereme@redhat.com

2: you can have more API gateways (e.g. more APIs) tailored for every frontend. I don't think we need this - the current API serves us pretty well in every FE I'm involved in. The only thing which I miss is the data aggregation.
So it does not serve us well. Aggregation of data is one of the usual points of using the gateway. Yes, microservices are affected by this indeed, but so are we, because implementing the aggregation directly in the current engine API layer is hard.
So I would go back to the original topic of this thread - do some small change which has a chance to be merged to the project and helps us where it hurts.
Can a simple HTTP/2 to HTTP/AJP gateway be the simplest solution? Our Apache might even have a module for it already. That way you can multiplex all the REST calls using a single TCP connection (and a single SSL negotiation).
A custom SSO-enabled service like that might be even better, as it would be able to skip the authentication layers too, and that would lower the engine load. But I am not sure it is possible with the current codebase.
Martin
On Fri, Mar 24, 2017 at 4:22 PM, Tomas Jelinek <tjelinek@redhat.com> wrote:
On Fri, Mar 24, 2017 at 3:58 PM, Martin Sivak <msivak@redhat.com> wrote:
I feel like every REST API I've ever worked with has had the aggregation + projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.
The current industry solution I know about is called an API gateway. Most of the big players have an internal API with lots of low-level stuff and then a couple of external API gateways tailored to what the client needs.
http://microservices.io/patterns/apigateway.html (check the backend for frontend section)
This trend is also visible when you think about services that offer API gateway management and billing like https://aws.amazon.com/api-gateway/ or our very own https://www.3scale.net/
right, but the api gateway solves 2 problems:
1: if you have a microservice architecture it is hard for frontend to talk to 20 different moving services. So the gateway hides this complexity behind it. This is not the problem we have.
2: you can have more API gateways (e.g. more APIs) tailored for every frontend. I don't think we need this - the current API serves us pretty well in every FE I'm involved in. The only thing which I miss is the data aggregation.
So I would go back to the original topic of this thread - do some small change which has a chance to be merged to the project and helps us where it hurts.
Martin

On Fri, Mar 24, 2017 at 6:43 PM, Martin Sivak <msivak@redhat.com> wrote:
2: you can have more API gateways (e.g. more APIs) tailored for every frontend. I don't think we need this - the current API serves us pretty well in every FE I'm involved in. The only thing which I miss is the data aggregation.
So it does not serve us well. Aggregation of data is one of the usual points of using the gateway. Yes, microservices are affected by this indeed, but so are we, because implementing the aggregation directly in the current engine API layer is hard.
So I would go back to the original topic of this thread - do some small change which has a chance to be merged to the project and helps us where it hurts.
I'm wondering if very specific additional REST API calls can suffice. For example, a 'Get VM + disks + NIC' API call seems reasonable to add for the various clients who commonly need it.
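A rough sketch of what such a combined call could return, composed here on the client side in Python. The function name get_vm_with_details, the fetch callable and the sub-collection names "disks" and "nics" follow the wording of this thread and are assumptions, not a confirmed engine API.

```python
from typing import Any, Callable, Dict, Sequence


def get_vm_with_details(
    vm_id: str,
    fetch: Callable[[str], Any],
    details: Sequence[str] = ("disks", "nics"),
) -> Dict[str, Any]:
    """Return the VM resource with the requested sub-collections embedded."""
    vm = dict(fetch(f"/vms/{vm_id}"))  # copy so the fetched object is not mutated
    for name in details:
        vm[name] = fetch(f"/vms/{vm_id}/{name}")
    return vm


# Hypothetical in-memory stand-in for the REST API, used only for the demo.
FAKE_API = {
    "/vms/123": {"id": "123", "name": "demo-vm"},
    "/vms/123/disks": [{"id": "d1"}],
    "/vms/123/nics": [{"id": "n1"}, {"id": "n2"}],
}

vm = get_vm_with_details("123", FAKE_API.__getitem__)
print(vm["name"], len(vm["disks"]), len(vm["nics"]))
```

Done on the server, the same composition would cost the client one request instead of three; done on the client, as here, it only shows the shape of the result.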
Can a simple HTTP/2 to HTTP/AJP gateway be the simplest solution? Our Apache might even have a module for it already.
The Apache version we currently use has only an experimental module for it. Undertow is supposed to have better support. I wonder when/if we can drop Apache...
Y.
That way you can multiplex all the REST calls using a single tcp connection (and a single SSL negotiation).
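The saving can be estimated with simple arithmetic. The sketch below (Python) assumes requests run one after another and that every new connection pays one full SSL handshake; the function name and all the millisecond figures are made-up illustration values, not measurements.

```python
def poll_cost_ms(n_requests: int, connections: int,
                 handshake_ms: float, request_ms: float) -> float:
    """Total time for one poll: handshakes plus the requests themselves."""
    return connections * handshake_ms + n_requests * request_ms


N = 50  # hypothetical number of VMs polled for /sessions

# One connection (and one handshake) per request.
separate = poll_cost_ms(N, connections=N, handshake_ms=100.0, request_ms=20.0)

# All requests multiplexed over a single connection.
multiplexed = poll_cost_ms(N, connections=1, handshake_ms=100.0, request_ms=20.0)

print(separate, multiplexed)
```

The time spent on the requests themselves is the same in both cases; only the per-connection overhead shrinks.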
A custom SSO enabled service like that might be even better as it would be able to skip the authentication layers too and that would lower the engine load. But I am not sure it is possible with the current codebase.
Martin

The Apache version we currently use has only an experimental module for it. Undertow is supposed to have better support. I wonder when/if we can drop Apache...
The last info I have about that from mperina is that we need Apache for Kerberos support atm.
Martin
On Fri, Mar 24, 2017 at 5:30 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
On Fri, Mar 24, 2017 at 6:43 PM, Martin Sivak <msivak@redhat.com> wrote:
2: you can have more api gateways (e.g. more apis) tailored for every frontend. I don't think we need this - the current API serves us pretty well in every FE Im involved in. The only thing which I miss is the data aggregation.
So it does not serve us well. Aggregation of data is one the usual points of using the gateway. Yes microservices are affected by this indeed, but so are we because implementing the aggregation directly in the current engine API layer is hard.
So I would go back to the original topic of this thread - do some small change which has a chance to be merged to the project and helps us where it hurts.
I'm wondering if very specific additional REST API calls can suffice. For example, a 'Get VM + disks + NIC' API call seems reasonable to add for the various clients who commonly need it.
Can a simple HTTP/2 to HTTP/AJP gateway be the simplest solution? Our Apache might even have a module for it already.
Current Apache used has only experimental module for it. Undertow is supposed to have a better support. I wonder when/if we can drop Apache... Y.
That way you can multiplex all the REST calls using a single tcp connection (and a single SSL negotiation).
A custom SSO enabled service like that might be even better as it would be able to skip the authentication layers too and that would lower the engine load. But I am not sure it is possible with the current codebase.
Martin
On Fri, Mar 24, 2017 at 4:22 PM, Tomas Jelinek <tjelinek@redhat.com> wrote:
On Fri, Mar 24, 2017 at 3:58 PM, Martin Sivak <msivak@redhat.com> wrote:
I feel like every REST API I've ever worked with has had the aggregation + projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.
The current industry solution I know about is called API gateway.. most of the big players have internal API with lots of low level stuff and then couple of external API gateways tailored to what the client needs.
http://microservices.io/patterns/apigateway.html (check the backend for frontend section)
This trend is also visible when you think about services that offer API gateway management and billing like https://aws.amazon.com/api-gateway/ or our very own https://www.3scale.net/
right, but the api gateway solves 2 problems:
1: if you have a microservice architecture it is hard for frontend to talk to 20 different moving services. So the gateway hides this complexity behind it. This is not the problem we have.
2: you can have more api gateways (e.g. more apis) tailored for every frontend. I don't think we need this - the current API serves us pretty well in every FE Im involved in. The only thing which I miss is the data aggregation.
So I would go back to the original topic of this thread - do some small change which has a chance to be merged to the project and helps us where it hurts.
Martin
On Fri, Mar 24, 2017 at 3:47 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
I feel like every REST API I've ever worked with has had the aggregation + projection problem. It's like we're trying to use REST as a replacement for SQL -- but the logic that executes the "SQL" lives in a browser now, and it used to live on a server close to the DB. And REST isn't expressive for selecting data like SQL is.
There must be some industry solution to this "I want to do SQL over REST" problem.
On Fri, Mar 24, 2017 at 5:54 AM, Martin Sivak <msivak@redhat.com> wrote:
> for quite some time I have been more or less involved in > development > of > various UIs for oVirt based entirely on the oVirt's REST API > ranging > from > the quite mature moVirt [1] through some cockpit extensions to a > young > and > experimental user portal replacement [2].
oVirt optimizer has the same issue..
> 2: add some tiny service which would just accept a list of > queries, > execute > them locally (but using real HTTP requests) and return in one > bulk. A > naive > implementation just to give a sense of what I mean of this would > be a > shell > script getting list of strings like > "https://localhost/ovirt-engine/api/vms/123/sessions" iterate over > them > and > do a curl request for each, mangle the results into one string and > return > (credits for this idea to msivak). Easy to implement, possibility > to > add > also projections later to save some bandwidth. But the API would > anyway > be > hammered by bunch of queries, only the network roundtrip would be > saved.
The biggest cost for (especially mobile) clients is the cost of establishing a new SSL connection. SSL is also pretty expensive on the server side.
So running the aggregation service on the ovirt-engine machine (behind Apache) means the client will do a single SSL request with a list of N URLs, and the local "reverse-proxy" will perform a single authentication and N plain HTTP requests (or even better - AJP). It won't remove any time from the actual command run time, but it will reduce protocol overhead.
I think this is the simplest first step that requires almost no change to existing infrastructure.
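As a rough illustration of the aggregation service described above, here is a minimal Python sketch. It is not an existing oVirt component: the names `aggregate` and `http_get_json` and the shape of the bulk result are assumptions made for the example, and authentication is left out entirely. The point is only that the client sends one list of URLs and gets one response back, while the N requests happen locally.

```python
import json
from typing import Callable, Dict, Iterable
from urllib.request import Request, urlopen


def http_get_json(url: str) -> object:
    """Fetch one URL locally and decode its JSON body (hypothetical helper)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


def aggregate(urls: Iterable[str],
              fetch: Callable[[str], object] = http_get_json) -> Dict[str, dict]:
    """Run every query locally and return all results in one bulk answer.

    A failing query is reported next to the successful ones instead of
    aborting the whole batch.
    """
    bulk = {}
    for url in urls:
        try:
            bulk[url] = {"status": "ok", "body": fetch(url)}
        except Exception as error:  # keep going, report the failure per URL
            bulk[url] = {"status": "error", "reason": str(error)}
    return bulk
```

The `fetch` argument is injectable so the loop can be exercised without a running engine. As noted in the thread, the API is still hit once per URL; only the client-side round trips and connection setup are saved.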
-- Martin Sivak SLA / oVirt
-- Greg Sheremeta, MBA Red Hat, Inc. Sr. Software Engineer gshereme@redhat.com
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

On Fri, Mar 24, 2017 at 8:57 PM, Martin Sivak <msivak@redhat.com> wrote:
Current Apache used has only experimental module for it. Undertow is supposed to have a better support. I wonder when/if we can drop Apache...
The last info I have about that from mperina is that we need Apache for kerberos support atm.
I don't think we need it - I remember reading that Undertow does support it as well. The only issue is that there are probably 10 people in the world who know how to configure Undertow for Kerberos, while many do for Apache. And since we leave it for the user to configure... Y.
Martin
On Fri, Mar 24, 2017 at 5:30 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
On Fri, Mar 24, 2017 at 6:43 PM, Martin Sivak <msivak@redhat.com> wrote:
2: you can have more API gateways (i.e. more APIs) tailored for every frontend. I don't think we need this - the current API serves us pretty well in every FE I'm involved in. The only thing I miss is the data aggregation.
So it does not serve us well. Aggregation of data is one of the usual points of using the gateway. Yes, microservices are affected by this indeed, but so are we, because implementing the aggregation directly in the current engine API layer is hard.
So I would go back to the original topic of this thread - do some small change which has a chance to be merged to the project and helps us where it hurts.
I'm wondering if very specific additional REST API calls can suffice. For example, a 'Get VM + disks + NIC' API call seems reasonable to add for the various clients who commonly need it.
Can a simple HTTP/2 to HTTP/AJP gateway be the simplest solution? Our Apache might even have a module for it already.
Current Apache used has only experimental module for it. Undertow is supposed to have a better support. I wonder when/if we can drop Apache... Y.
That way you can multiplex all the REST calls using a single tcp connection (and a single SSL negotiation).
A custom SSO enabled service like that might be even better as it would be able to skip the authentication layers too and that would lower the engine load. But I am not sure it is possible with the current codebase.
Martin

Top posting, sorry. There are a few things I'd like to clarify regarding this subject:

1. Data aggregation, as requested now by Tomas, and by other people in the past.

We used to have that 'detail' parameter to aggregate certain very specific types of data, in particular VM disks and NICs. We removed it in version 4 of the API because the implementation was extremely inefficient from the engine point of view. An innocent request like this:

    GET /ovirt-engine/api/vms?detail=+disks,+nics

would generate, with the implementation we used to have, 1 query for the VMs and then as many queries for disks and NICs as there are VMs in the system. In our scale test environments, for example, with approx 4000 VMs and 10000 disks, that would take more than 20 hours to execute.

In addition, we didn't have in the past any mechanism to make this available in a generic way, because there was no knowledge in the API of what 'details' are. In version 4 of the API we introduced a formal (kind of) specification of the API (a.k.a. the model), and it includes knowledge about what 'links' are. For example, the specification of the VM type contains this:

    @Link DiskAttachment[] diskAttachments();
    @Link Nic[] nics();

With this information we are now in a position where we can implement this in a generic way. We intend to implement this using a mechanism similar to the existing 'detail' parameter:

    GET /ovirt-engine/api/vms/123?follow=disk_attachments,nics

The naive implementation of this is to let the API call itself. For example, when the user requests to follow the 'disk_attachments' detail, the API can just call itself to get that:

    GET /ovirt-engine/api/vms/123/disk_attachments

However, we can't use that naive approach; if we do, we end up with the 1+C*N query problem described before (one query for the N VMs, plus one query per VM for each of the C followed links). We need to use specific implementations for certain frequent use cases, like VMs+disks+nics, and that needs work in the API and in the backend.
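To make the 1+C*N problem concrete, here is a small in-memory Python sketch. It is not the engine's implementation: `CountingBackend`, `follow_naive` and `follow_batched` are hypothetical names, and the "queries" are just counted method calls. It compares the naive approach (the API calling itself once per VM and per followed link) with a batched one that issues one query per followed link and joins the results in memory.

```python
from collections import defaultdict


class CountingBackend:
    """In-memory stand-in for the engine backend that counts queries."""

    def __init__(self, vms, links):
        self.vms = vms        # e.g. [{"id": 1}, {"id": 2}]
        self.links = links    # e.g. {"disks": [{"vm": 1, ...}], "nics": [...]}
        self.queries = 0

    def list_vms(self):
        self.queries += 1
        return list(self.vms)

    def list_for_vm(self, link, vm_id):
        self.queries += 1
        return [item for item in self.links[link] if item["vm"] == vm_id]

    def list_all(self, link):
        self.queries += 1
        return list(self.links[link])


def follow_naive(backend, follow):
    """1 query for the VMs plus one per VM and followed link: 1 + C*N."""
    vms = []
    for vm in backend.list_vms():
        details = {link: backend.list_for_vm(link, vm["id"]) for link in follow}
        vms.append({**vm, **details})
    return vms


def follow_batched(backend, follow):
    """1 query for the VMs plus one per followed link: 1 + C."""
    grouped = {}
    for link in follow:
        by_vm = defaultdict(list)
        for item in backend.list_all(link):
            by_vm[item["vm"]].append(item)
        grouped[link] = by_vm
    return [
        {**vm, **{link: grouped[link][vm["id"]] for link in follow}}
        for vm in backend.list_vms()
    ]
```

Both functions return the same data; only `backend.queries` differs. With the approx 4000 VMs mentioned above and two followed links, the naive variant would issue 8001 queries and the batched one 3.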
Tomas, if you want to help move this forward, please open an RFE and make sure it gets attention.

2. Reuse of TLS sessions.

The expensive part of creating a TLS session is the generation of the shared session key. That can be avoided if both the server and the client are careful to reuse the session, using the session cache mechanism built into TLS itself. The web servers that we use (Apache and Undertow) implement this mechanism, and so do most of our clients. Make sure that your client uses it as well. In Java this is achieved by re-using the SSLContext; we already do that for the engine to VDSM communication, for example. In JavaScript the browser already takes care of this.

3. Parallelism and latency.

A typical problem that we have is that we send many requests to the server. For example, to retrieve user sessions for a set of VMs we tend to send many requests like this:

    GET /ovirt-engine/api/vms/1/sessions
    GET /ovirt-engine/api/vms/2/sessions
    GET /ovirt-engine/api/vms/3/sessions
    ...

And we do that in a synchronous way: send one, wait for the result, send another one, wait for the result, etc. This means that we don't take advantage of the parallelism of the server and that we add the network round trip time to each request. So if we have N requests, we have to wait at least N*RTT.

The web servers that we use support multiple connections, and the protocol that we use, HTTP, supports pipe-lining. This means that you can send multiple requests in parallel, and that you can send multiple requests without waiting for the response.

To give you an idea of the improvement that can be achieved, we recently added asynchronous request support to the Ruby SDK, with multiple connections and pipe-lining. In our scale testing environment that reduced the time to collect a complete inventory from approx 30 min to approx 2 min. Here you have an example:

    https://github.com/oVirt/ovirt-engine-sdk-ruby/blob/master/sdk/examples/asyn...
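The N*RTT effect can be shown with a short Python sketch. This is not the Ruby SDK code referenced above, and it uses a thread pool (several requests in flight at once) rather than HTTP pipe-lining. `fetch_serial`, `fetch_parallel` and `fake_fetch` are hypothetical names; `fake_fetch` just sleeps for one simulated round trip instead of talking to a real engine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_serial(urls, fetch):
    """Send one request, wait for the result, send the next: about N * RTT."""
    return [fetch(url) for url in urls]


def fetch_parallel(urls, fetch, max_workers=8):
    """Keep several requests in flight at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def fake_fetch(url, rtt=0.05):
    """Stand-in for a real HTTP GET: sleeps for one simulated round trip."""
    time.sleep(rtt)
    return {"href": url, "sessions": []}
```

Calling both functions with the same list of `/ovirt-engine/api/vms/<id>/sessions` paths returns identical results, but the serial variant waits for every round trip in turn while the parallel one overlaps them. In a real client, `fake_fetch` would be replaced by a function that performs the GET over a reused connection.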
So make sure that you take advantage of that in your clients. Sadly pipe-lining is disabled by default in most browsers, so this isn't helpful for JavaScript applications.

4. HTTP/2 support.

The application server that we use, WildFly, supports HTTP/2, including ALPN, out of the box, since version 10.1. We need a mechanism to enable it:

    core: Add support for enabling HTTP/2
    https://gerrit.ovirt.org/74621

And then we need to get Apache out of the way, at least for API traffic. I think that is something we can do in the context of the engine "podification" effort.

However, note that HTTP/2 won't have that big an impact on performance for applications that continue to use a synchronous/serial style of interaction with the API.

On 03/24/2017 11:16 PM, Yaniv Kaul wrote:
On Fri, Mar 24, 2017 at 8:57 PM, Martin Sivak <msivak@redhat.com> wrote:
> Current Apache used has only experimental module for it. Undertow is supposed to have a better support. I wonder when/if we can drop Apache...
The last info I have about that from mperina is that we need Apache for kerberos support atm.
I don't think we need it - I remember reading that Undertow does support it as well. The only issue is that there are probably 10 people in the world who know how to configure Undertow for Kerberos, while many do for Apache. And since we leave it for the user to configure... Y.
Martin
>> >> >> > - writing some custom ad-hoc servlet serving only a purpose of one >> >> >> > frontend >> >> >> > - this is actually there for the dashboard, but it is not a >> >> >> > generic >> >> >> > solution >> >> >> > for the other frontends and we really should not develop custom >> >> >> > "APIs" >> >> >> > for >> >> >> > every frontend >> >> >> > - there were some other proposals discussed (some 3th party >> >> >> > solutions >> >> >> > etc) >> >> >> > but I think none of them made it even to a PoC >> >> >> > >> >> >> > So, now I would try again and try small to get at least some >> >> >> > benefit. >> >> >> > I >> >> >> > see >> >> >> > 2 paths we could try: >> >> >> > 1: embed something which burns us immediatly, e.g. the /sessions >> >> >> > into >> >> >> > VMs. I >> >> >> > really liked the ;detail=sessions approach, could we move it back? >> >> >> > 2: add some tiny service which would just accept a list of >> >> >> > queries, >> >> >> > execute >> >> >> > them locally (but using real HTTP requests) and return in one >> >> >> > bulk. A >> >> >> > naive >> >> >> > implementation just to give a sense of what I mean of this would >> >> >> > be a >> >> >> > shell >> >> >> > script getting list of strings like >> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions>" iterate over >> >> >> > them >> >> >> > and >> >> >> > do a curl request for each, mangle the results into one string and >> >> >> > return >> >> >> > (credits for this idea to msivak). Easy to implement, possibility >> >> >> > to >> >> >> > add >> >> >> > also projections later to save some bandwidth. But the API would >> >> >> > anyway >> >> >> > be >> >> >> > hammered by bunch of queries, only the network roundtrip would be >> >> >> > saved. >> >> >> > 3: any other simple approaches? >> >> >> > >> >> >> > I honestly prefer the first approach. 
It is not beautiful, it is >> >> >> > not >> >> >> > REST-ful, but it is easy to implement, very pragmatic and useful. >> >> >> > What do you think? >> >> >> > >> >> >> > Thank you and sorry for the long mail :) >> >> >> > Tomas >> >> >> > >> >> >> > [1]: https://github.com/oVirt/moVirt <https://github.com/oVirt/moVirt> >> >> >> > [2]: https://github.com/oVirt/ovirt-web-ui <https://github.com/oVirt/ovirt-web-ui> >> >> >> > [3]: https://gerrit.ovirt.org/#/c/61260 <https://gerrit.ovirt.org/#/c/61260> >> >> >> > [4]: https://github.com/oVirt/ovirt-web-ui/pull/106/ <https://github.com/oVirt/ovirt-web-ui/pull/106/> >> >> >> > [5]: https://gerrit.ovirt.org/#/c/45233/ <https://gerrit.ovirt.org/#/c/45233/> >> >> >> > >> >> >> > >> >> >> > _______________________________________________ >> >> >> > Devel mailing list >> >> >> > Devel@ovirt.org <mailto:Devel@ovirt.org> >> >> >> > http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> >> >> >> _______________________________________________ >> >> >> Devel mailing list >> >> >> Devel@ovirt.org <mailto:Devel@ovirt.org> >> >> >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> >> >> > >> >> > >> >> > >> >> > >> >> > -- >> >> > Greg Sheremeta, MBA >> >> > Red Hat, Inc. >> >> > Sr. Software Engineer >> >> > gshereme@redhat.com <mailto:gshereme@redhat.com> >> > >> > >> _______________________________________________ >> Devel mailing list >> Devel@ovirt.org <mailto:Devel@ovirt.org> >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > >

On Mon, Mar 27, 2017 at 11:21 AM, Juan Hernández <jhernand@redhat.com> wrote:
Top posting, sorry.
There are a few things I'd like to clarify, regarding this subject:
1. Data aggregation, as requested now by Tomas, and by other people in the past.
We used to have that 'detail' parameter, to aggregate certain very specific types of data, in particular to aggregate VM disks and NICs. We removed that in version 4 of the API because the implementation was extremely inefficient, from the engine point of view. An innocent request like this:
GET /ovirt-engine/api/vms?detail=+disks,+nics
Would generate, with the implementation we used to have, 1 query for the VMs and then as many queries for disks and NICs as VMs in the system. In our scale test environments, for example, with approx 4000 VMs and 10000 disks, that would take more than 20 hours to execute.
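To put a number on that, here is a back-of-the-envelope sketch in Python. It assumes exactly one extra query per VM for each requested detail; the function name is made up for illustration and is not part of the engine.

```python
def naive_query_count(vm_count, details_per_vm):
    """Queries issued when every detail of every VM is fetched separately."""
    # One query for the VM list, then one query per VM for each detail.
    return 1 + details_per_vm * vm_count


# Roughly the scale test environment mentioned above: 4000 VMs, disks + NICs.
print(naive_query_count(4000, 2))  # 8001
```

The count grows linearly with the number of VMs, which is why the request looks innocent on a small setup and falls over on a large one.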
In addition, we didn't have in the past any mechanism to make this available in a generic way, because the API had no knowledge of what 'details' are.
In version 4 of the API we introduced a formal (kind of) specification of the API (a.k.a. the model), and it includes knowledge about what 'links' are. For example, the specification of the VM type contains this:
@Link DiskAttachment[] diskAttachments();
@Link Nic[] nics();
With this information we are now in a position where we can implement this in a generic way.
We intend to implement this using a mechanism similar to the existing 'detail' parameter:
GET /ovirt-engine/api/vms/123?follow=disk_attachments,nics
The naive implementation of this is to let the API call itself. For example, when the user requests to follow the 'disk_attachments' detail the API can just call itself to get that:
GET /ovirt-engine/api/vms/123/disk_attachments
However, we can't use that naive approach: if we do, we end up with the 1+C*N query problem described before. We need specific implementations for certain frequent use cases, like VMs+disks+nics, and that needs work in the API and in the backend.
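The difference between the naive approach and a specific implementation can be sketched roughly as follows. This is only an illustration in Python with made-up in-memory data; `CountingBackend`, `follow_naive` and `follow_batched` are hypothetical names, and the real engine would run database queries instead of list scans.

```python
from collections import defaultdict

# Hypothetical in-memory "tables"; the real engine would query its database.
VMS = [{"id": "1"}, {"id": "2"}, {"id": "3"}]
NICS = [
    {"id": "n1", "vm_id": "1"},
    {"id": "n2", "vm_id": "1"},
    {"id": "n3", "vm_id": "3"},
]


class CountingBackend:
    """Stand-in for the backend that counts how many queries it serves."""

    def __init__(self):
        self.queries = 0

    def all_vms(self):
        self.queries += 1
        return [dict(vm) for vm in VMS]

    def nics_of_vm(self, vm_id):
        self.queries += 1
        return [nic for nic in NICS if nic["vm_id"] == vm_id]

    def nics_of_vms(self, vm_ids):
        self.queries += 1
        wanted = set(vm_ids)
        return [nic for nic in NICS if nic["vm_id"] in wanted]


def follow_naive(backend):
    """One query for the VMs, then one more query per VM."""
    vms = backend.all_vms()
    for vm in vms:
        vm["nics"] = backend.nics_of_vm(vm["id"])
    return vms


def follow_batched(backend):
    """One query for the VMs, one for all their NICs, joined in memory."""
    vms = backend.all_vms()
    by_vm = defaultdict(list)
    for nic in backend.nics_of_vms([vm["id"] for vm in vms]):
        by_vm[nic["vm_id"]].append(nic)
    for vm in vms:
        vm["nics"] = by_vm[vm["id"]]
    return vms
```

Both functions return the same data; with the three VMs above the naive one issues 4 queries and the batched one 2, and the batched count stays at 2 no matter how many VMs there are.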
Tomas, if you want to help move this forward, please open an RFE and make sure it gets attention.
This sounds pretty good! I will open one, but since we are already talking here, I'll use the opportunity to clarify the topic a bit more and then open the BZ.
What I can imagine is that GetAllVmsQuery will also accept, in its params, the list of details it should provide. Then GetAllVmsQuery will implement an efficient way of retrieving this info as well.
So, from the API perspective, it will be about taking the ?follow=<something> part and passing it to the backend query params.
What do you think?
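A minimal sketch of the API side of that idea, in Python for brevity. Everything here is an assumption for illustration: the set of supported links, the function names, and the shape of the parameter object are not taken from the engine code.

```python
from urllib.parse import parse_qs, urlsplit

# Links the backend is assumed to know how to fetch efficiently.
SUPPORTED_LINKS = {"disk_attachments", "nics", "sessions"}


def parse_follow(url):
    """Return the list of link names from the 'follow' query parameter."""
    query = parse_qs(urlsplit(url).query)
    links = []
    for value in query.get("follow", []):
        for name in value.split(","):
            name = name.strip()
            if name and name not in links:
                links.append(name)
    unknown = [name for name in links if name not in SUPPORTED_LINKS]
    if unknown:
        raise ValueError("cannot follow: " + ", ".join(unknown))
    return links


def build_query_params(url):
    """Hypothetical parameters handed to a backend query such as GetAllVmsQuery."""
    return {"follow": parse_follow(url)}
```

The API layer only validates and forwards the names; deciding how to load the details efficiently stays in the backend query.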
2. Reuse of TLS sessions.
The expensive part of creating a TLS session is the generation of the shared session key. That can be avoided if both the server and the client are careful and reuse the session, using the session cache mechanism built into TLS itself. The web servers that we use (Apache and Undertow) implement this mechanism, and so do most of our clients. Make sure that your client uses it as well. In Java this is achieved by reusing the SSLContext. We already do that for the engine to VDSM communication, for example. In JavaScript the browser already takes care of this.
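A related and even simpler way to avoid repeated handshakes is to keep one connection open and send several requests over it. Note that this is HTTP keep-alive, not the TLS session cache described above. The sketch below uses Python's standard `http.client`; the helper names are made up, and it assumes the server keeps the connection open between requests.

```python
import http.client
import ssl


def open_engine_connection(host):
    """Create one HTTPS connection object; nothing is sent until the first request."""
    context = ssl.create_default_context()
    return http.client.HTTPSConnection(host, context=context)


def fetch_all(conn, paths, headers=None):
    """Issue several GET requests over a single connection."""
    bodies = []
    for path in paths:
        conn.request("GET", path, headers=headers or {})
        response = conn.getresponse()
        # The body must be read completely before the connection is reused.
        bodies.append(response.read())
    return bodies
```

As long as the server does not close the connection, the TLS handshake happens once, on the first request, and the remaining requests only pay for the HTTP exchange itself.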
3. Parallelism and latency.
A typical problem that we have is that we send many requests to the server. For example, to retrieve user sessions for a set of VMs we tend to send many requests like this:
GET /ovirt-engine/api/vms/1/sessions
GET /ovirt-engine/api/vms/2/sessions
GET /ovirt-engine/api/vms/3/sessions
...
And we do that in a synchronous way: send one, wait for the result, send another one, wait for the result, etc. This means that we don't take advantage of the parallelism of the server and that we add to each request the network round trip time. So if we have N requests, we have to wait at least N*RTT.
The web servers that we use support multiple connections, and the protocol that we use, HTTP, supports pipe-lining. This means that you can send multiple requests in parallel, and that you can send multiple requests without waiting for the response. To give you an idea of the improvement that can be achieved, we recently added asynchronous request support to the Ruby SDK, with multiple connections and pipe-lining. In our scale testing environment that reduced the time to collect a complete inventory from approx 30 min to approx 2 min. Here you have an example:
https://github.com/oVirt/ovirt-engine-sdk-ruby/blob/master/sdk/examples/asynchronous_inventory.rb
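For clients that are not written in Ruby, the same idea can be approximated with a pool of worker threads. The sketch below is Python and covers only the "multiple connections in parallel" half, not pipe-lining; `fetch` is a placeholder for whatever function performs one HTTP GET in your client, and the helper names are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def sessions_paths(vm_ids):
    """Build the per-VM sessions URLs used in the example above."""
    return [f"/ovirt-engine/api/vms/{vm_id}/sessions" for vm_id in vm_ids]


def fetch_concurrently(fetch, paths, max_workers=8):
    """Call fetch(path) for every path using a pool of worker threads.

    Results are returned in the same order as the paths.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, paths))
```

With N requests and W workers the client waits for roughly N/W round trips instead of N, assuming the server can handle W requests at the same time.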
So make sure that you take advantage of that in your clients. Sadly pipe-lining is disabled by default in most browsers, so this isn't helpful for JavaScript applications.
But we can see what we can do about this in moVirt: https://github.com/oVirt/moVirt/issues/260
4. HTTP/2 support.
The application server that we use, WildFly, supports HTTP/2, including ALPN, out of the box, since version 10.1. We need a mechanism to enable it:
core: Add support for enabling HTTP/2
https://gerrit.ovirt.org/74621
And then we need to get Apache out of the way, for API traffic, at least. I think that is something we can do in the context of the engine "podification" effort.
However, note that HTTP/2 won't have that big an impact on performance for applications that continue to use a synchronous/serial style of interaction with the API.
On 03/24/2017 11:16 PM, Yaniv Kaul wrote:
On Fri, Mar 24, 2017 at 8:57 PM, Martin Sivak <msivak@redhat.com> wrote:
> Current Apache used has only experimental module for it.
> Undertow is supposed to have a better support. I wonder when/if we can drop
> Apache...
The last info I have about that from mperina is that we need Apache for kerberos support atm.
I don't think we need it - I remember reading that Undertow does support it as well. The only issue is that there are probably 10 people in the world who know how to configure Undertow for Kerberos, while many do for Apache. And since we leave it for the user to configure... Y.
Martin

On 03/27/2017 01:03 PM, Tomas Jelinek wrote:
This sounds pretty good! I will open one, but since we are already talking here, I'll use the opportunity to clarify the topic a bit more and then open the BZ.
What I can imagine is that GetAllVmsQuery will also accept, in its params, the list of details it should provide. Then GetAllVmsQuery will implement an efficient way of retrieving this info as well.
So, from the API perspective, it will be about taking the ?follow=<something> part and passing it to the backend query params.
What do you think?
Exactly, that is the point! The API by itself can't optimize database queries; all it can do is call the backend. It is the backend that has the opportunity to send optimized queries to the database. For other, less common things we can use the naive approach and implement the aggregation in the API itself. But for common use cases, like VM+disks+nics, we need to do it in an efficient way.
2. Reuse of TLS sessions.
The part of creating TLS sessions that is expensive is the generation of the shared session key. That can be avoided if both the server and the client are careful and reuse the session, using the session cache mechanism built-in into TLS itself. The web servers that we use (Apache and Undertow) do implement this mechanism, and so do most of our clients. Make sure that your client uses it as well. In Java this is achieved re-using the SSLContext. We already do that for the engine to VDSM communciation for example. In JavaScript the browser already takes care of this.
3. Parallelism and latency.
A typical problem that we have is that we send many requests to the server. For example, to retrieve user sessions for a set of VMs we tend to send many requests like this:
GET /ovirt-engine/api/vms/1/sessions
GET /ovirt-engine/api/vms/2/sessions
GET /ovirt-engine/api/vms/3/sessions
...
And we do that in a synchronous way: send one, wait for the result, send another one, wait for the result, etc. This means that we don't take advantage of the parallelism of the server and that we add to each request the network round trip time. So if we have N requests, we have to wait at least N*RTT.
The web servers that we use support multiple connections, and the protocol that we use, HTTP, supports pipelining. This means that you can send multiple requests in parallel, and that you can send multiple requests without waiting for the response. To give you an idea of the improvement that can be achieved, we recently added asynchronous request support to the Ruby SDK, with multiple connections and pipelining. In our scale testing environment that reduced the time to collect a complete inventory from approx 30 min to approx 2 min. Here is an example:
https://github.com/oVirt/ovirt-engine-sdk-ruby/blob/master/sdk/examples/asynchronous_inventory.rb
So make sure that you take advantage of that in your clients. Sadly pipelining is disabled by default in most browsers, so this isn't helpful for JavaScript applications.
But we can see what we can do about this in moVirt: https://github.com/oVirt/moVirt/issues/260
Sure, I think there are plenty of asynchronous HTTP clients for Android, worth trying one of them. If you are brave enough you can even consider using the same library used in the Ruby SDK: libcurl. A bit of JNI here and there, and you are done.
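The N*RTT cost of the serial style, and what sending the requests in parallel buys, can be sketched as follows. This is only an illustration: fetch_sessions is a hypothetical stand-in that sleeps for one simulated round trip instead of calling the API, and the sketch uses a thread pool (that is, multiple connections) rather than HTTP pipelining.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_sessions(vm_id, rtt=0.05):
    """Hypothetical stand-in for GET /ovirt-engine/api/vms/<vm_id>/sessions."""
    time.sleep(rtt)  # simulate one network round trip
    return {"vm": vm_id, "sessions": []}


def fetch_serial(vm_ids):
    # Send one, wait for the result, send another one: at least N * RTT.
    return [fetch_sessions(vm_id) for vm_id in vm_ids]


def fetch_parallel(vm_ids, max_workers=8):
    # Requests overlap, so the total is roughly RTT * ceil(N / max_workers).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns the results in the same order as the input.
        return list(pool.map(fetch_sessions, vm_ids))


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Calling timed(fetch_serial, ids) and timed(fetch_parallel, ids) on the same list of VM ids returns identical results, but the parallel run finishes in a fraction of the serial time as long as the pool has enough workers.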
4. HTTP/2 support.
The application server that we use, WildFly, supports HTTP/2, including ALPN, out of the box, since version 10.1. We need a mechanism to enable it:
core: Add support for enabling HTTP/2 https://gerrit.ovirt.org/74621
And then we need to get Apache out of the way, for API traffic, at least. I think that is something we can do in the context of the engine "podification" effort.
However, note that HTTP/2 won't have that big an impact on performance for applications that continue to use a synchronous/serial style of interaction with the API.

On Mon, Mar 27, 2017 at 1:32 PM, Juan Hernández <jhernand@redhat.com> wrote:
On 03/27/2017 01:03 PM, Tomas Jelinek wrote:
On Mon, Mar 27, 2017 at 11:21 AM, Juan Hernández <jhernand@redhat.com> wrote:
Top posting, sorry.
There are a few things I'd like to clarify, regarding this subject:
1. Data aggregation, as requested now by Tomas, and by other people in the past.
We used to have that 'detail' parameter, to aggregate certain very specific types of data, in particular to aggregate VM disks and NICs. We removed that in version 4 of the API because the implementation was extremely inefficient, from the engine point of view. An innocent request like this:
GET /ovirt-engine/api/vms?detail=+disks,+nics
Would generate, with the implementation we used to have, 1 query for the VMs and then as many queries for disks and NICs as VMs in the system. In our scale test environments, for example, with approx 4000 VMs and 10000 disks, that would take more than 20 hours to execute.
In addition, we didn't have in the past any mechanism to make this available in a generic way, because there was no knowledge in the API of what are 'details'.
In version 4 of the API we introduced a formal (kind of) specification of the API (a.k.a. the model), and it includes knowledge about what are 'links'. For example, the specification of the VM type contains this:
@Link DiskAttachment[] diskAttachments();
@Link Nic[] nics();
With this information we are now in a position where we can implement this in a generic way.
We intend to implement this using a mechanism similar to the existing 'detail' parameter:
GET /ovirt-engine/api/vms/123?follow=disk_attachments,nics
The naive implementation of this is to let the API call itself. For example, when the user requests to follow the 'disk_attachments' detail the API can just call itself to get that:
GET /ovirt-engine/api/vms/123/disk_attachments
However, we can't use that naive approach: if we do, we end up with the 1+C*N query problem described before. We need to use specific implementations for certain frequent use cases, like VMs+disks+nics, and that needs work in the API and in the backend.
Tomas, if you want to help move this forward, please open an RFE and make sure it gets attention.
ok, opened: https://bugzilla.redhat.com/show_bug.cgi?id=1436206 Will try to get it done soon.
of
> this would > >> >> >> > be a > >> >> >> > shell > >> >> >> > script getting list of strings like > >> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions> > <https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions>>" iterate over > >> >> >> > them > >> >> >> > and > >> >> >> > do a curl request for each, mangle the results into
one
> string and > >> >> >> > return > >> >> >> > (credits for this idea to msivak). Easy to
implement,
> possibility > >> >> >> > to > >> >> >> > add > >> >> >> > also projections later to save some bandwidth. But
the
> API would > >> >> >> > anyway > >> >> >> > be > >> >> >> > hammered by bunch of queries, only the network roundtrip > would be > >> >> >> > saved. > >> >> >> > >> >> >> The biggest cost for (especially mobile) clients is the cost of > >> >> >> establishing new SSL connection. SSL is also pretty > expensive on the > >> >> >> server side. > >> >> >> > >> >> >> So running the aggregation service on the ovirt-engine machine > >> >> >> (behind > >> >> >> Apache) means the client will do a single SSL request with > list of N > >> >> >> urls and the local "reverse-proxy" will perform single > >> >> >> authentication > >> >> >> and N plain HTTP requests (or even better - AJP). It won't > remove > >> >> >> any > >> >> >> time from the actual command run time, but it will
reduce
> protocol > >> >> >> overhead. > >> >> >> > >> >> >> I think this is the simplest first step that requires almost no > >> >> >> change > >> >> >> to existing infrastructure. > >> >> >> > >> >> >> -- > >> >> >> Martin Sivak > >> >> >> SLA / oVirt > >> >> >> > >> >> >> On Fri, Mar 24, 2017 at 10:20 AM, Tomas Jelinek > >> >> >> <tjelinek@redhat.com <mailto:tjelinek@redhat.com> <mailto:tjelinek@redhat.com <mailto:tjelinek@redhat.com>>> > >> >> >> wrote: > >> >> >> > Hi All, > >> >> >> > > >> >> >> > for quite some time I have been more or less involved in > >> >> >> > development > >> >> >> > of > >> >> >> > various UIs for oVirt based entirely on the oVirt's REST API > >> >> >> > ranging > >> >> >> > from > >> >> >> > the quite mature moVirt [1] through some cockpit > extensions to a > >> >> >> > young > >> >> >> > and > >> >> >> > experimental user portal replacement [2]. > >> >> >> > > >> >> >> > One issue we hit over and over again is the missing data > >> >> >> > aggregation. > >> >> >> > In > >> >> >> > the > >> >> >> > 3.x era we used to use in moVirt the
detail=something
> >> >> >> > api to get the disks and nics of the VM, something like: > >> >> >> > > >> >> >> > GET /ovirt-engine/api/vms > >> >> >> > Accept: application/json; detail=disks > >> >> >> > > >> >> >> > This allowed us to store this data in local database > leading to > >> >> >> > great > >> >> >> > user > >> >> >> > experience. Since this feature has been removed in
4.x
> API [3] > >> >> >> > we needed to retire to a different solution. When the VM > detail is > >> >> >> > selected > >> >> >> > by the user, start loading the disks and nics and
hope
> the user > >> >> >> > will not be fast enough to see the delay. The UX is > slightly worse > >> >> >> > bug > >> >> >> > kinda > >> >> >> > acceptable. > >> >> >> > > >> >> >> > We hit this issue harder in the new user portal [2], > because we > >> >> >> > already > >> >> >> > have > >> >> >> > the VM cached and show the whole VM in one screen. So, if > you pick > >> >> >> > it, > >> >> >> > you > >> >> >> > will get it's details immediately. > >> >> >> > But, since you don't have all the details, we need to do an > >> >> >> > additional > >> >> >> > call > >> >> >> > (two actually) to load this data and they start to appear > later. > >> >> >> > So, something which would be very fast and smooth starts > to feel > >> >> >> > sluggish. > >> >> >> > > >> >> >> > Recently, we hit this issue again which forced us to > sacrifice the > >> >> >> > UX > >> >> >> > even > >> >> >> > more - it is the "console in use" feature of user portal. > >> >> >> > The use case is this: > >> >> >> > - if the console is already taken by some user, there are > >> >> >> > complications > >> >> >> > if > >> >> >> > other current user tryes to take it as well (will
avoid
> details > >> >> >> > about > >> >> >> > settings and permissins involved, but long story
short,
> the user > >> >> >> > will > >> >> >> > probably not be allowed to connect to it. The "probably" > is the > >> >> >> > key > >> >> >> > here > >> >> >> > since we can not do any intelligent decision in advance, > we can > >> >> >> > only > >> >> >> > warn > >> >> >> > the user that the console is taken). > >> >> >> > - in the current GWT user portal, if the VM's console is > taken, it > >> >> >> > is > >> >> >> > shown > >> >> >> > on the VM's "box" that "console is taken". This was a highly > >> >> >> > requested > >> >> >> > feature > >> >> >> > - to get this information using the current REST API, we > need to > >> >> >> > go > >> >> >> > to > >> >> >> > the > >> >> >> > /vms/<vmid>/sessions subcollection. To get this for
all
> VMs, it > >> >> >> > would > >> >> >> > be > >> >> >> > doing N queries per poll which we can not afford > >> >> >> > - so the current PR [4] will probably end up to only > check it on > >> >> >> > the > >> >> >> > attempt > >> >> >> > to connect to the console warning the user. Maybe it will > be also > >> >> >> > shown > >> >> >> > in > >> >> >> > Vm details. But the UX in case the user will look
for a
> VM which > >> >> >> > has > >> >> >> > free > >> >> >> > console will suffer significantly (e.g. try one by
one
> until some > >> >> >> > opens > >> >> >> > or > >> >> >> > look at details one by one to see if the warning appears > (with a > >> >> >> > delay)) > >> >> >> > > >> >> >> > I understand that embedding the details of the VM to the > response > >> >> >> > comes > >> >> >> > with > >> >> >> > a cost, namely: > >> >> >> > - performance hit > >> >> >> > - complexity of the API code > >> >> >> > - the "cleanness" of REST suffers > >> >> >> > > >> >> >> > But I think we should seriously consider to provide some > option to > >> >> >> > data > >> >> >> > aggregation. > >> >> >> > > >> >> >> > I know this has been discussed many times with no result, > but I > >> >> >> > think > >> >> >> > it > >> >> >> > is > >> >> >> > time to bring this topic up again. I'll try to summarize the > >> >> >> > (failed) > >> >> >> > attempts tried so far: > >> >> >> > - the detail=<something> parameter with ad-hoc embedding > of data. > >> >> >> > This > >> >> >> > has > >> >> >> > been there and removed in 4.0 [3] > >> >> >> > - the DoctorREST project - e.g. a proxy above the current > api. The > >> >> >> > idea > >> >> >> > was > >> >> >> > to create a service which will be independent of the engine > >> >> >> > itself, > >> >> >> > will > >> >> >> > locally poll the engine's REST, store all data in
local
> (mongo)DB > >> >> >> > and > >> >> >> > provide a rich api with aggregations and projections and push > >> >> >> > notifications. > >> >> >> > This polling of everything to get the data to DoctorREST > proved to > >> >> >> > be > >> >> >> > pretty > >> >> >> > costy, so also a more invasive approach of pushing data from > >> >> >> > engine > >> >> >> > to > >> >> >> > doctor has been discused [5]. None of this two approaches > have > >> >> >> > been > >> >> >> > accepted > >> >> >> > (too complicated, too invasive). > >> >> >> > - writing some custom ad-hoc servlet serving only a > purpose of one > >> >> >> > frontend > >> >> >> > - this is actually there for the dashboard, but it is not a > >> >> >> > generic > >> >> >> > solution > >> >> >> > for the other frontends and we really should not develop > custom > >> >> >> > "APIs" > >> >> >> > for > >> >> >> > every frontend > >> >> >> > - there were some other proposals discussed (some 3th party > >> >> >> > solutions > >> >> >> > etc) > >> >> >> > but I think none of them made it even to a PoC > >> >> >> > > >> >> >> > So, now I would try again and try small to get at least some > >> >> >> > benefit. > >> >> >> > I > >> >> >> > see > >> >> >> > 2 paths we could try: > >> >> >> > 1: embed something which burns us immediatly, e.g.
the
> /sessions > >> >> >> > into > >> >> >> > VMs. I > >> >> >> > really liked the ;detail=sessions approach, could we move > it back? > >> >> >> > 2: add some tiny service which would just accept a list of > >> >> >> > queries, > >> >> >> > execute > >> >> >> > them locally (but using real HTTP requests) and return in one > >> >> >> > bulk. A > >> >> >> > naive > >> >> >> > implementation just to give a sense of what I mean
of
> this would > >> >> >> > be a > >> >> >> > shell > >> >> >> > script getting list of strings like > >> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions> > <https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions>>" iterate over > >> >> >> > them > >> >> >> > and > >> >> >> > do a curl request for each, mangle the results into
one
> string and > >> >> >> > return > >> >> >> > (credits for this idea to msivak). Easy to
implement,
> possibility > >> >> >> > to > >> >> >> > add > >> >> >> > also projections later to save some bandwidth. But
the
> API would > >> >> >> > anyway > >> >> >> > be > >> >> >> > hammered by bunch of queries, only the network roundtrip > would be > >> >> >> > saved. > >> >> >> > 3: any other simple approaches? > >> >> >> > > >> >> >> > I honestly prefer the first approach. It is not > beautiful, it is > >> >> >> > not > >> >> >> > REST-ful, but it is easy to implement, very pragmatic and > useful. > >> >> >> > What do you think? > >> >> >> > > >> >> >> > Thank you and sorry for the long mail :) > >> >> >> > Tomas > >> >> >> > > >> >> >> > [1]: https://github.com/oVirt/moVirt <https://github.com/oVirt/moVirt> > <https://github.com/oVirt/moVirt <https://github.com/oVirt/moVirt>> > >> >> >> > [2]: https://github.com/oVirt/ovirt-web-ui <https://github.com/oVirt/ovirt-web-ui> > <https://github.com/oVirt/ovirt-web-ui <https://github.com/oVirt/ovirt-web-ui>> > >> >> >> > [3]: https://gerrit.ovirt.org/#/c/61260 <https://gerrit.ovirt.org/#/c/61260> > <https://gerrit.ovirt.org/#/c/61260 <https://gerrit.ovirt.org/#/c/61260>> > >> >> >> > [4]: https://github.com/oVirt/ovirt-web-ui/pull/106/ <https://github.com/oVirt/ovirt-web-ui/pull/106/> > <https://github.com/oVirt/ovirt-web-ui/pull/106/ <https://github.com/oVirt/ovirt-web-ui/pull/106/>> > >> >> >> > [5]: https://gerrit.ovirt.org/#/c/45233/ <https://gerrit.ovirt.org/#/c/45233/> > <https://gerrit.ovirt.org/#/c/45233/ <https://gerrit.ovirt.org/#/c/45233/>> > >> >> >> > > >> >> >> > > >> >> >> > _______________________________________________ > >> >> >> > Devel mailing list > >> >> >> > Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org <mailto:Devel@ovirt.org>> > >> >> >> > http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > >> >> >> _______________________________________________ > >> >> >> Devel mailing list > >> >> >> Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org 
<mailto:Devel@ovirt.org>> > >> >> >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > -- > >> >> > Greg Sheremeta, MBA > >> >> > Red Hat, Inc. > >> >> > Sr. Software Engineer > >> >> > gshereme@redhat.com <mailto:gshereme@redhat.com> <mailto:gshereme@redhat.com <mailto:gshereme@redhat.com>> > >> > > >> > > >> _______________________________________________ > >> Devel mailing list > >> Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org <mailto:Devel@ovirt.org>> > >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > > > > > >

On Mon, Mar 27, 2017 at 8:59 AM, Tomas Jelinek <tjelinek@redhat.com> wrote:
On Mon, Mar 27, 2017 at 1:32 PM, Juan Hernández <jhernand@redhat.com> wrote:
On 03/27/2017 01:03 PM, Tomas Jelinek wrote:
On Mon, Mar 27, 2017 at 11:21 AM, Juan Hernández <jhernand@redhat.com <mailto:jhernand@redhat.com>> wrote:
Top posting, sorry.
There are a few things I'd like to clarify, regarding this subject:
1. Data aggregation, as requested now by Tomas, and by other people in the past.
We used to have that 'detail' parameter, to aggregate certain very specific types of data, in particular to aggregate VM disks and NICs. We removed that in version 4 of the API because the implementation was extremely inefficient, from the engine point of view. An innocent request like this:
GET /ovirt-engine/api/vms?detail=+disks,+nics
Would generate, with the implementation we used to have, 1 query for the VMs and then as many queries for disks and NICs as VMs in the system. In our scale test environments, for example, with approx 4000 VMs and 10000 disks, that would take more than 20 hours to execute.
In addition, we didn't have in the past any mechanism to make this available in a generic way, because there was no knowledge in the API of what 'details' are.
In version 4 of the API we introduced a formal (kind of) specification of the API (a.k.a. the model), and it includes knowledge about what 'links' are. For example, the specification of the VM type contains this:
    @Link DiskAttachment[] diskAttachments();
    @Link Nic[] nics();
With this information we are now in a position where we can implement this in a generic way.
We intend to implement this using a mechanism similar to the existing 'detail' parameter:
GET /ovirt-engine/api/vms/123?follow=disk_attachments,nics
The naive implementation of this is to let the API call itself. For example, when the user requests to follow the 'disk_attachments' detail the API can just call itself to get that:
GET /ovirt-engine/api/vms/123/disk_attachments
However, we can't use that naive approach: if we do, we end up with the 1+C*N query problem described before (1 query for the N VMs, plus one query per VM for each of the C followed links). We need to use specific implementations for certain frequent use cases, like VMs+disks+nics, and that needs work in the API and in the backend.
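To make the 1+C*N cost concrete, here is a minimal sketch that counts queries for the naive approach and for a bulk approach. Everything in it is an assumption made for illustration: `FakeDb`, `naive_follow` and `bulk_follow` are hypothetical names, and the in-memory lists stand in for the engine database. It is not how the engine or the API is implemented.

```python
from collections import defaultdict


class FakeDb:
    """Hypothetical in-memory stand-in for the database; it only counts queries."""

    def __init__(self, vms, disks, nics):
        self.vms = vms      # list of VM ids
        self.disks = disks  # list of (vm_id, disk_name) pairs
        self.nics = nics    # list of (vm_id, nic_name) pairs
        self.queries = 0

    def all_vms(self):
        self.queries += 1
        return list(self.vms)

    def children_of(self, table, vm_id):
        # One query per VM and per followed link: the naive pattern.
        self.queries += 1
        return [name for owner, name in getattr(self, table) if owner == vm_id]

    def all_children(self, table):
        # One query per followed link, grouped by owning VM afterwards.
        self.queries += 1
        grouped = defaultdict(list)
        for owner, name in getattr(self, table):
            grouped[owner].append(name)
        return grouped


def naive_follow(db, links):
    """1 query for the VMs, then C queries for each of the N VMs."""
    return {
        vm: {link: db.children_of(link, vm) for link in links}
        for vm in db.all_vms()
    }


def bulk_follow(db, links):
    """1 query for the VMs, then 1 bulk query per followed link."""
    vms = db.all_vms()
    grouped = {link: db.all_children(link) for link in links}
    return {vm: {link: grouped[link].get(vm, []) for link in links} for vm in vms}
```

With two followed links the naive variant issues 1 + 2*N queries, so roughly 8001 for the 4000 VMs mentioned above, while the bulk variant always issues 3.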
Tomas, if you want to help move this forward, please open an RFE and make sure it gets attention.
OK, opened: https://bugzilla.redhat.com/show_bug.cgi?id=1436206
Will try to get it done soon.
Forgive me if this is radical, but has anyone thought of / discussed using a NoSQL alternative to our very normalized SQL db as a way to avoid the problem of aggregating details? Using mongodb as an example, embed some of the smaller objects, and there's no cost of aggregation there. IIRC, Doctor REST uses mongo under the hood.
https://docs.mongodb.com/manual/tutorial/model-embedded-one-to-many-relation...
Greg
This sounds pretty good! I will open one, but since we are already talking here I'll use the opportunity to clarify the topic more, and then I'll open the BZ.
What I can imagine is that the GetAllVmsQuery will also accept in its params the list of details it should provide. Then the GetAllVmsQuery will implement the efficient way of retrieving this info as well.
So, from the API perspective, it will be about taking the ?follow=<something> part and passing it to the backend query params.
What do you think?
Exactly, that is the point! The API by itself can't optimize database queries, all it can do is call the backend. It is the backend that has the opportunity and possibility to send optimized queries to the database.
For other less common things we can use the naive approach, and implement the aggregation in the API itself. But for common use cases, like VM+disks+nics, we need to do it in an efficient way.
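The split between "optimized in the backend" and "naive in the API" could look roughly like the sketch below. It only parses the `follow` parameter and sorts the requested links into two groups. `OPTIMIZED_LINKS` and `split_follow` are hypothetical names introduced here, and which links would actually get an optimized backend query is an assumption, not something decided in this thread.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical: links for which the backend query fetches the data in bulk.
OPTIMIZED_LINKS = {"disk_attachments", "nics"}


def split_follow(url):
    """Return (optimized, naive) lists of link names requested via ?follow=..."""
    query = parse_qs(urlsplit(url).query)
    requested = []
    for value in query.get("follow", []):
        # The parameter carries a comma-separated list of link names.
        requested.extend(name.strip() for name in value.split(",") if name.strip())
    optimized = [name for name in requested if name in OPTIMIZED_LINKS]
    naive = [name for name in requested if name not in OPTIMIZED_LINKS]
    return optimized, naive
```

The first list would be handed to the backend query as parameters; the second would be resolved by the API calling itself once per object, which is acceptable only for the less common cases.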
2. Reuse of TLS sessions.
The part of creating TLS sessions that is expensive is the generation of the shared session key. That can be avoided if both the server and the client are careful and reuse the session, using the session cache mechanism built into TLS itself. The web servers that we use (Apache and Undertow) do implement this mechanism, and so do most of our clients. Make sure that your client uses it as well. In Java this is achieved by reusing the SSLContext. We already do that for the engine to VDSM communication, for example. In JavaScript the browser already takes care of this.
3. Parallelism and latency.
A typical problem that we have is that we send many requests to the server. For example, to retrieve user sessions for a set of VMs we tend to send many requests like this:
    GET /ovirt-engine/api/vms/1/sessions
    GET /ovirt-engine/api/vms/2/sessions
    GET /ovirt-engine/api/vms/3/sessions
    ...
And we do that in a synchronous way: send one, wait for the result, send another one, wait for the result, etc. This means that we don't take advantage of the parallelism of the server and that we add to each request the network round trip time. So if we have N requests, we have to wait at least N*RTT.
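As a back-of-the-envelope illustration of the N*RTT bound, the sketch below compares the minimum network wait of the serial style with the wait when several requests are kept in flight. The function names and the example numbers are assumptions made here; the model ignores server processing time and bandwidth, so it is only a lower bound.

```python
import math


def serial_wait(n_requests, rtt_ms):
    """Lower bound on network wait when each request waits for the previous one."""
    return n_requests * rtt_ms


def parallel_wait(n_requests, rtt_ms, in_flight):
    """Lower bound when up to `in_flight` requests share each round trip."""
    return math.ceil(n_requests / in_flight) * rtt_ms
```

For example, 100 session requests over a 50 ms link cost at least 5000 ms serially, but at least 500 ms with 10 requests in flight.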
The web servers that we use support multiple connections, and the protocol that we use, HTTP, supports pipe-lining. This means that you can send multiple requests in parallel, and that you can send multiple requests without waiting for the response. To give you an idea of the improvement that can be achieved, we recently added asynchronous request support to the Ruby SDK, with multiple connections and pipe-lining. In our scale testing environment that reduced the time to collect a complete inventory from approx 30 min to approx 2 min. Here you have an example:
    sdk/examples/asynchronous_inventory.rb
So make sure that you take advantage of that in your clients. Sadly pipe-lining is disabled by default in most browsers, so this isn't helpful for JavaScript applications.
But we can try what we can do in moVirt about this: https://github.com/oVirt/moVirt/issues/260
Sure, I think there are plenty of asynchronous HTTP clients for Android, worth trying one of them. If you are brave enough you can even consider using the same library used in the Ruby SDK: libcurl. A bit of JNI here and there, and you are done.
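For clients that cannot pipeline, a similar effect can be approximated with several concurrent connections. The following is a minimal sketch of that idea using a thread pool. `fetch_sessions` and its `fetch` argument are hypothetical: `fetch` stands for whatever function the client already uses to GET a path and parse the body, and it is injected so the sketch stays independent of any particular HTTP library. This is not the Ruby SDK's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_sessions(vm_ids, fetch, max_workers=8):
    """Request the sessions sub-collection of every VM concurrently.

    `fetch` is any callable that takes a path and returns the parsed body.
    """
    paths = [f"/ovirt-engine/api/vms/{vm_id}/sessions" for vm_id in vm_ids]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps the results in the same order as the input paths.
        bodies = list(pool.map(fetch, paths))
    return dict(zip(vm_ids, bodies))
```

The server still has to answer N requests, so this removes waiting on the client side only; it does not replace the aggregation discussed in point 1.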
4. HTTP/2 support.
The application server that we use, WildFly, supports HTTP/2, including ALPN, out of the box, since version 10.1. We need a mechanism to enable it:
core: Add support for enabling HTTP/2 https://gerrit.ovirt.org/74621
And then we need to get Apache out of the way, for API traffic, at least. I think that is something we can do in the context of the engine "podification" effort.
However, note that HTTP/2 won't have that big an impact on performance for applications that continue to use a synchronous/serial style of interaction with the API.
On 03/24/2017 11:16 PM, Yaniv Kaul wrote:
> On Fri, Mar 24, 2017 at 8:57 PM, Martin Sivak <msivak@redhat.com> wrote:
> > > Current Apache used has only experimental module for it.
> > > Undertow is supposed to have a better support. I wonder when/if we can drop
> > > Apache...
> >
> > The last info I have about that from mperina is that we need Apache
> > for kerberos support atm.
>
> I don't think we need it - I remember reading that Undertow does support
> it as well.
> The only issue is that there are probably 10 people in the world who
> know how to configure Undertow for Kerberos, while many do for Apache.
> And since we leave it for the user to configure...
> Y.
> >
> > Martin
> until some > >> >> >> > opens > >> >> >> > or > >> >> >> > look at details one by one to see if the warning appears > (with a > >> >> >> > delay)) > >> >> >> > > >> >> >> > I understand that embedding the details of the VM to the > response > >> >> >> > comes > >> >> >> > with > >> >> >> > a cost, namely: > >> >> >> > - performance hit > >> >> >> > - complexity of the API code > >> >> >> > - the "cleanness" of REST suffers > >> >> >> > > >> >> >> > But I think we should seriously consider to provide some > option to > >> >> >> > data > >> >> >> > aggregation. > >> >> >> > > >> >> >> > I know this has been discussed many times with no result, > but I > >> >> >> > think > >> >> >> > it > >> >> >> > is > >> >> >> > time to bring this topic up again. I'll try to summarize the > >> >> >> > (failed) > >> >> >> > attempts tried so far: > >> >> >> > - the detail=<something> parameter with ad-hoc embedding > of data. > >> >> >> > This > >> >> >> > has > >> >> >> > been there and removed in 4.0 [3] > >> >> >> > - the DoctorREST project - e.g. a proxy above the current > api. The > >> >> >> > idea > >> >> >> > was > >> >> >> > to create a service which will be independent of the engine > >> >> >> > itself, > >> >> >> > will > >> >> >> > locally poll the engine's REST, store all data in
local
> (mongo)DB > >> >> >> > and > >> >> >> > provide a rich api with aggregations and projections and push > >> >> >> > notifications. > >> >> >> > This polling of everything to get the data to DoctorREST > proved to > >> >> >> > be > >> >> >> > pretty > >> >> >> > costy, so also a more invasive approach of pushing data from > >> >> >> > engine > >> >> >> > to > >> >> >> > doctor has been discused [5]. None of this two approaches > have > >> >> >> > been > >> >> >> > accepted > >> >> >> > (too complicated, too invasive). > >> >> >> > - writing some custom ad-hoc servlet serving only a > purpose of one > >> >> >> > frontend > >> >> >> > - this is actually there for the dashboard, but it is not a > >> >> >> > generic > >> >> >> > solution > >> >> >> > for the other frontends and we really should not develop > custom > >> >> >> > "APIs" > >> >> >> > for > >> >> >> > every frontend > >> >> >> > - there were some other proposals discussed (some 3th party > >> >> >> > solutions > >> >> >> > etc) > >> >> >> > but I think none of them made it even to a PoC > >> >> >> > > >> >> >> > So, now I would try again and try small to get at least some > >> >> >> > benefit. > >> >> >> > I > >> >> >> > see > >> >> >> > 2 paths we could try: > >> >> >> > 1: embed something which burns us immediatly, e.g.
the
> /sessions > >> >> >> > into > >> >> >> > VMs. I > >> >> >> > really liked the ;detail=sessions approach, could we move > it back? > >> >> >> > 2: add some tiny service which would just accept a list of > >> >> >> > queries, > >> >> >> > execute > >> >> >> > them locally (but using real HTTP requests) and return in one > >> >> >> > bulk. A > >> >> >> > naive > >> >> >> > implementation just to give a sense of what I mean
of
> this would > >> >> >> > be a > >> >> >> > shell > >> >> >> > script getting list of strings like > >> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions> > <https://localhost/ovirt-engine/api/vms/123/sessions <https://localhost/ovirt-engine/api/vms/123/sessions>>" iterate
over
> >> >> >> > them > >> >> >> > and > >> >> >> > do a curl request for each, mangle the results
into one
> string and > >> >> >> > return > >> >> >> > (credits for this idea to msivak). Easy to
implement,
> possibility > >> >> >> > to > >> >> >> > add > >> >> >> > also projections later to save some bandwidth. But
the
> API would > >> >> >> > anyway > >> >> >> > be > >> >> >> > hammered by bunch of queries, only the network roundtrip > would be > >> >> >> > saved. > >> >> >> > 3: any other simple approaches? > >> >> >> > > >> >> >> > I honestly prefer the first approach. It is not > beautiful, it is > >> >> >> > not > >> >> >> > REST-ful, but it is easy to implement, very pragmatic and > useful. > >> >> >> > What do you think? > >> >> >> > > >> >> >> > Thank you and sorry for the long mail :) > >> >> >> > Tomas > >> >> >> > > >> >> >> > [1]: https://github.com/oVirt/moVirt <https://github.com/oVirt/moVirt> > <https://github.com/oVirt/moVirt <https://github.com/oVirt/moVirt>> > >> >> >> > [2]: https://github.com/oVirt/ovirt-web-ui <https://github.com/oVirt/ovirt-web-ui> > <https://github.com/oVirt/ovirt-web-ui <https://github.com/oVirt/ovirt-web-ui>> > >> >> >> > [3]: https://gerrit.ovirt.org/#/c/61260 <https://gerrit.ovirt.org/#/c/61260> > <https://gerrit.ovirt.org/#/c/61260 <https://gerrit.ovirt.org/#/c/61260>> > >> >> >> > [4]: https://github.com/oVirt/ovirt-web-ui/pull/106/ <https://github.com/oVirt/ovirt-web-ui/pull/106/> > <https://github.com/oVirt/ovirt-web-ui/pull/106/ <https://github.com/oVirt/ovirt-web-ui/pull/106/>> > >> >> >> > [5]: https://gerrit.ovirt.org/#/c/45233/ <https://gerrit.ovirt.org/#/c/45233/> > <https://gerrit.ovirt.org/#/c/45233/ <https://gerrit.ovirt.org/#/c/45233/>> > >> >> >> > > >> >> >> > > >> >> >> > _______________________________________________ > >> >> >> > Devel mailing list > >> >> >> > Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org <mailto:Devel@ovirt.org>> > >> >> >> > http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > >> >> >> _______________________________________________ > >> >> >> Devel mailing list > >> >> >> Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org 
<mailto:Devel@ovirt.org>> > >> >> >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > -- > >> >> > Greg Sheremeta, MBA > >> >> > Red Hat, Inc. > >> >> > Sr. Software Engineer > >> >> > gshereme@redhat.com <mailto:gshereme@redhat.com> <mailto:gshereme@redhat.com <mailto:gshereme@redhat.com>> > >> > > >> > > >> _______________________________________________ > >> Devel mailing list > >> Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org <mailto:Devel@ovirt.org>> > >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > > > > > >
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Greg Sheremeta, MBA Red Hat, Inc. Sr. Software Engineer gshereme@redhat.com

Hi!
> Forgive me if this is radical, but has anyone thought of / discussed using a NoSQL alternative to our very normalized SQL db as a way to avoid the problem of aggregating details? Using mongodb as an example, embed some of the smaller objects, and there's no cost of aggregation there. IIRC, Doctor REST uses mongo under the hood.
> https://docs.mongodb.com/manual/tutorial/model-embedded-one-to-many-relation...

What about data integrity, which is critical to us?

Shmuel

On Mon, Mar 27, 2017 at 4:49 PM, Greg Sheremeta <gshereme@redhat.com> wrote:
On Mon, Mar 27, 2017 at 8:59 AM, Tomas Jelinek <tjelinek@redhat.com> wrote:
On Mon, Mar 27, 2017 at 1:32 PM, Juan Hernández <jhernand@redhat.com> wrote:
On 03/27/2017 01:03 PM, Tomas Jelinek wrote:
On Mon, Mar 27, 2017 at 11:21 AM, Juan Hernández <jhernand@redhat.com> wrote:
Top posting, sorry.
There are a few things I'd like to clarify, regarding this subject:
1. Data aggregation, as requested now by Tomas, and by other people in the past.
We used to have that 'detail' parameter, to aggregate certain very specific types of data, in particular to aggregate VM disks and NICs. We removed that in version 4 of the API because the implementation was extremely inefficient, from the engine point of view. An innocent request like this:
GET /ovirt-engine/api/vms?detail=+disks,+nics
would generate, with the implementation we used to have, 1 query for the VMs and then as many queries for disks and NICs as VMs in the system. In our scale test environments, for example, with approx 4000 VMs and 10000 disks, that would take more than 20 hours to execute.
In addition, we didn't have in the past any mechanism to make this available in a generic way, because there was no knowledge in the API of what 'details' are.
In version 4 of the API we introduced a formal (kind of) specification of the API (a.k.a. the model), and it includes knowledge about what 'links' are. For example, the specification of the VM type contains this:
@Link DiskAttachment[] diskAttachments();
@Link Nic[] nics();
With this information we are now in a position where we can implement this in a generic way.
We intend to implement this using a mechanism similar to the existing 'detail' parameter:
GET /ovirt-engine/api/vms/123?follow=disk_attachments,nics
The naive implementation of this is to let the API call itself. For example, when the user requests to follow the 'disk_attachments' detail the API can just call itself to get that:
GET /ovirt-engine/api/vms/123/disk_attachments
However, we can't use that naive approach: if we do, we end up with the 1+C*N query problem described before (1 query for the N VMs, plus one query per VM for each of the C followed details). We need to use specific implementations for certain frequent use cases, like VMs+disks+nics, and that needs work in the API and in the backend.
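To make the 1+C*N cost concrete, here is a minimal Python sketch, not engine code. `FakeBackend` is a made-up in-memory stand-in that only counts how many queries it receives; `follow_naive` mimics the API calling itself once per VM and per detail, while `follow_batched` assumes the backend can return one detail for all VMs in a single query. All class and function names are hypothetical.

```python
from collections import defaultdict


class FakeBackend:
    """Hypothetical in-memory stand-in for the backend; it only counts queries."""

    def __init__(self, vms, disks, nics):
        self.vms = vms
        self.details = {"disk_attachments": disks, "nics": nics}
        self.queries = 0

    def list_vms(self):
        self.queries += 1
        return list(self.vms)

    def list_detail_for_vm(self, detail, vm_id):
        # One query per VM and per detail: this is what the naive approach pays.
        self.queries += 1
        return [d for d in self.details[detail] if d["vm_id"] == vm_id]

    def list_detail_for_vms(self, detail, vm_ids):
        # One query per detail, covering all requested VMs at once.
        self.queries += 1
        wanted = set(vm_ids)
        grouped = defaultdict(list)
        for d in self.details[detail]:
            if d["vm_id"] in wanted:
                grouped[d["vm_id"]].append(d)
        return grouped


def follow_naive(backend, follow):
    """1 + C*N queries: the API 'calls itself' for every VM and every detail."""
    result = []
    for vm_id in backend.list_vms():
        vm = {"id": vm_id}
        for detail in follow:
            vm[detail] = backend.list_detail_for_vm(detail, vm_id)
        result.append(vm)
    return result


def follow_batched(backend, follow):
    """1 + C queries: each followed detail is fetched once for all VMs."""
    vm_ids = backend.list_vms()
    per_detail = {d: backend.list_detail_for_vms(d, vm_ids) for d in follow}
    return [
        {"id": vm_id, **{d: per_detail[d].get(vm_id, []) for d in follow}}
        for vm_id in vm_ids
    ]
```

With 4000 VMs and two followed details, the naive variant would issue 1 + 2*4000 = 8001 queries, while the batched one would issue 3. How the real backend would implement the batched query is outside the scope of this sketch.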
Tomas, if you want to help moving this forward, please open an RFE and make sure it gets attention.
ok, opened: https://bugzilla.redhat.com/show_bug.cgi?id=1436206 Will try to get it done soon.
Forgive me if this is radical, but has anyone thought of / discussed using a NoSQL alternative to our very normalized SQL db as a way to avoid the problem of aggregating details? Using mongodb as an example, embed some of the smaller objects, and there's no cost of aggregation there. IIRC, Doctor REST uses mongo under the hood.
https://docs.mongodb.com/manual/tutorial/model-embedded-one-to-many-relationships-between-documents/
Greg, I really appreciate your enthusiasm for NoSQL technologies, but we have to distinguish between the functional requirements for a cache of schemaless, frontend-optimized projections (as in the Dr. Rest case) and the main application data store, which supports transactional business logic. I don't believe the current architecture could survive the weak eventual-consistency guarantees of a DB that is not fully transactional (the way Postgres is) :-)
Best regards,
Martin
Greg
This sounds pretty good! I will open one, but since we are already talking here I'll use the opportunity to clarify the topic more, and then I'll open the BZ.
What I can imagine is that the GetAllVmsQuery will also accept in its params the list of details it should provide. Then the GetAllVmsQuery will implement the efficient way of retrieving this info as well.
So, from the API perspective, it will be about taking the ?follow=<something> part and passing it to the backend query params.
What do you think?
Exactly, that is the point! The API by itself can't optimize database queries, all it can do is call the backend. It is the backend that has the opportunity and possibility to send optimized queries to the database.
For other less common things we can use the naive approach, and implement the aggregation in the API itself. But for common use cases, like VM+disks+nics, we need to do it in an efficient way.
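The split described above (common details handled by the backend query, everything else aggregated naively in the API) could be sketched roughly like this in Python. The `OPTIMIZED_DETAILS` set and the `plan_follow` function are assumptions made up for illustration; the only thing taken from the thread is the `follow` query parameter with comma-separated link names.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical: details for which the backend query has an efficient implementation.
OPTIMIZED_DETAILS = {"disk_attachments", "nics"}


def plan_follow(url):
    """Split the requested 'follow' details into backend-side and API-side work."""
    query = parse_qs(urlsplit(url).query)
    requested = [
        name
        for value in query.get("follow", [])
        for name in value.split(",")
        if name
    ]
    return {
        # Passed on as parameters of the backend query (e.g. GetAllVmsQuery).
        "backend_details": [d for d in requested if d in OPTIMIZED_DETAILS],
        # Resolved the naive way, by the API calling itself per object.
        "api_side": [d for d in requested if d not in OPTIMIZED_DETAILS],
    }
```

For example, `plan_follow("/ovirt-engine/api/vms?follow=disk_attachments,nics,sessions")` would hand `disk_attachments` and `nics` to the backend and leave `sessions` to the naive path.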
2. Reuse of TLS sessions.
The part of creating TLS sessions that is expensive is the generation of the shared session key. That can be avoided if both the server and the client are careful and reuse the session, using the session cache mechanism built into TLS itself. The web servers that we use (Apache and Undertow) do implement this mechanism, and so do most of our clients. Make sure that your client uses it as well. In Java this is achieved by reusing the SSLContext. We already do that for the engine to VDSM communication, for example. In JavaScript the browser already takes care of this.
3. Parallelism and latency.
A typical problem that we have is that we send many requests to the server. For example, to retrieve user sessions for a set of VMs we tend to send many requests like this:
GET /ovirt-engine/api/vms/1/sessions
GET /ovirt-engine/api/vms/2/sessions
GET /ovirt-engine/api/vms/3/sessions
...
And we do that in a synchronous way: send one, wait for the result, send another one, wait for the result, etc. This means that we don't take advantage of the parallelism of the server and that we add to each request the network round trip time. So if we have N requests, we have to wait at least N*RTT.
The web servers that we use support multiple connections, and the protocol that we use, HTTP, supports pipe-lining. This means that you can send multiple requests in parallel, and that you can send multiple requests without waiting for the response. To give you an idea of the improvement that can be achieved, we recently added asynchronous request support to the Ruby SDK, with multiple connections and pipe-lining. In our scale testing environment that reduced the time to collect a complete inventory from approx 30 min to approx 2 min. Here you have an example:
sdk/examples/asynchronous_inventory.rb
So make sure that you take advantage of that in your clients. Sadly pipe-lining is disabled by default in most browsers, so this isn't helpful for JavaScript applications.
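The N*RTT effect can be illustrated with a small, self-contained Python sketch. It does not talk to a real engine: `fake_get` is a made-up function that just sleeps for an assumed round trip time, and the parallel variant uses a thread pool to stand in for multiple connections (it does not model HTTP pipe-lining).

```python
import time
from concurrent.futures import ThreadPoolExecutor

RTT = 0.05  # assumed round trip time in seconds, purely illustrative


def fake_get(path):
    """Hypothetical stand-in for one HTTP GET: costs one round trip."""
    time.sleep(RTT)
    return {"path": path, "sessions": []}


def fetch_serial(paths):
    """Send one request, wait for the result, send the next: about N * RTT."""
    return [fake_get(p) for p in paths]


def fetch_parallel(paths, connections=8):
    """Keep several requests in flight at once: about ceil(N / connections) * RTT."""
    with ThreadPoolExecutor(max_workers=connections) as pool:
        return list(pool.map(fake_get, paths))


def timed(fn, paths):
    """Run fn(paths) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(paths)
    return result, time.perf_counter() - start
```

Calling `timed(fetch_serial, paths)` and `timed(fetch_parallel, paths)` on the same list of `/vms/<id>/sessions` paths returns identical results, but the serial run has to wait for every round trip in turn.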
But we can try what we can do in moVirt about this: https://github.com/oVirt/moVirt/issues/260
Sure, I think there are plenty of asynchronous HTTP clients for Android, worth trying one of them. If you are brave enough you can even consider using the same library used in the Ruby SDK: libcurl. A bit of JNI here and there, and you are done.
4. HTTP/2 support.
The application server that we use, WildFly, supports HTTP/2, including ALPN, out of the box, since version 10.1. We need a mechanism to enable it:
core: Add support for enabling HTTP/2 https://gerrit.ovirt.org/74621
And then we need to get Apache out of the way, for API traffic, at least. I think that is something we can do in the context of the engine "podification" effort.
However, note that HTTP/2 won't have that big an impact on performance for applications that continue to use a synchronous/serial style of interaction with the API.
<mailto:Devel@ovirt.org>> > >> >> >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > -- > >> >> > Greg Sheremeta, MBA > >> >> > Red Hat, Inc. > >> >> > Sr. Software Engineer > >> >> > gshereme@redhat.com <mailto:gshereme@redhat.com> <mailto:gshereme@redhat.com <mailto:gshereme@redhat.com>> > >> > > >> > > >> _______________________________________________ > >> Devel mailing list > >> Devel@ovirt.org <mailto:Devel@ovirt.org> <mailto:Devel@ovirt.org <mailto:Devel@ovirt.org>> > >> http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel> > <http://lists.ovirt.org/mailman/listinfo/devel <http://lists.ovirt.org/mailman/listinfo/devel>> > > > > > >
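To make option 2 from the quoted mail a bit more concrete, here is a rough sketch of such a "bulk" helper. It is only an illustration under several assumptions: it is written in Python instead of the shell script plus curl mentioned in the mail, the names `http_get` and `bulk_get` are made up for this example, authentication against the engine is left out entirely, and the "one string" is simply a JSON object keyed by the requested URL.

```python
import json
import urllib.request
from typing import Callable, Dict, List


def http_get(url: str) -> str:
    """Fetch one URL with a real HTTP request and return the body as text.

    Hypothetical helper: no authentication or TLS setup is done here.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def bulk_get(urls: List[str], fetch: Callable[[str], str] = http_get) -> str:
    """Run every query in order and return all results as one JSON string.

    A failing query is reported under its own URL and does not abort the rest.
    """
    results: Dict[str, dict] = {}
    for url in urls:
        try:
            results[url] = {"ok": True, "body": fetch(url)}
        except Exception as error:  # sketch only: record the failure, keep going
            results[url] = {"ok": False, "error": str(error)}
    return json.dumps(results)
```

A caller would pass something like `["https://localhost/ovirt-engine/api/vms/123/sessions", ...]` and parse the returned string once. As the mail already points out, the engine still has to answer every single query; only the client-to-server round trips are saved.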
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel

--
Greg Sheremeta, MBA
Red Hat, Inc.
Sr. Software Engineer
gshereme@redhat.com
participants (7)
- Greg Sheremeta
- Juan Hernández
- Martin Betak
- Martin Sivak
- Shmuel Melamud
- Tomas Jelinek
- Yaniv Kaul