[ovirt-devel] REST API data aggregation

Fri Mar 24 17:57:04 UTC 2017

> The Apache we currently use has only an experimental module for it.
> Undertow is supposed to have better support. I wonder when/if we can drop
> Apache...

The last info I have about that from mperina is that we need Apache
for Kerberos support atm.

Martin

On Fri, Mar 24, 2017 at 5:30 PM, Yaniv Kaul <ykaul at redhat.com> wrote:
>
>
> On Fri, Mar 24, 2017 at 6:43 PM, Martin Sivak <msivak at redhat.com> wrote:
>>
>> > 2: you can have more API gateways (e.g. more APIs) tailored for every
>> > frontend. I don't think we need this: the current API serves us pretty
>> > well in every FE I'm involved in. The only thing I miss is the data
>> > aggregation.
>>
>> So it does not serve us well. Aggregation of data is one of the usual
>> points of using the gateway.
>> Yes microservices are affected by this indeed, but so are we because
>> implementing the aggregation directly in the current engine API layer
>> is hard.
>>
>> > So I would go back to the original topic of this thread: do some small
>> > change which has a chance to be merged into the project and helps us
>> > where it hurts.
>
>
> I'm wondering if very specific additional REST API calls can suffice.
> For example, a 'Get VM + disks + NIC' API call seems reasonable to add for
> the various clients who commonly need it.
>
>>
>> Can a simple HTTP/2 to HTTP/AJP gateway be the simplest solution? Our
>> Apache might even have a module for it already.
>
>
> The Apache we currently use has only an experimental module for it.
> Undertow is supposed to have better support. I wonder when/if we can drop
> Apache...
> Y.
>
>>
>> That way you can multiplex all the REST calls using a single TCP
>> connection (and a single SSL negotiation).
>>
>> A custom SSO-enabled service like that might be even better, as it
>> would be able to skip the authentication layers too, and that would
>> lower the engine load. But I am not sure it is possible with the
>> current codebase.
>>
>> Martin
>>
>> On Fri, Mar 24, 2017 at 4:22 PM, Tomas Jelinek <tjelinek at redhat.com>
>> wrote:
>> >
>> >
>> > On Fri, Mar 24, 2017 at 3:58 PM, Martin Sivak <msivak at redhat.com> wrote:
>> >>
>> >> > I feel like every REST API I've ever worked with has had the
>> >> > aggregation + projection problem. It's like we're trying to use REST
>> >> > as a replacement for SQL -- but the logic that executes the "SQL"
>> >> > lives in a browser now, and it used to live on a server close to the
>> >> > DB. And REST isn't expressive for selecting data like SQL is.
>> >>
>> >> The current industry solution I know about is called an API gateway.
>> >> Most of the big players have an internal API with lots of low-level
>> >> stuff and then a couple of external API gateways tailored to what the
>> >> client needs.
>> >>
>> >> http://microservices.io/patterns/apigateway.html (check the backend
>> >> for frontend section)
>> >>
>> >> This trend is also visible when you think about services that offer
>> >> API gateway management and billing like
>> >> https://aws.amazon.com/api-gateway/ or our very own
>> >> https://www.3scale.net/
>> >
>> >
>> > right, but the api gateway solves 2 problems:
>> >
>> > 1: if you have a microservice architecture, it is hard for the frontend
>> > to talk to 20 different moving services. So the gateway hides this
>> > complexity behind it. This is not the problem we have.
>> >
>> > 2: you can have more API gateways (e.g. more APIs) tailored for every
>> > frontend. I don't think we need this: the current API serves us pretty
>> > well in every FE I'm involved in. The only thing I miss is the data
>> > aggregation.
>> >
>> > So I would go back to the original topic of this thread: do some small
>> > change which has a chance to be merged into the project and helps us
>> > where it hurts.
>> >
>> >>
>> >>
>> >>
>> >> Martin
>> >>
>> >> On Fri, Mar 24, 2017 at 3:47 PM, Greg Sheremeta <gshereme at redhat.com>
>> >> wrote:
>> >> > I feel like every REST API I've ever worked with has had the
>> >> > aggregation + projection problem. It's like we're trying to use REST
>> >> > as a replacement for SQL -- but the logic that executes the "SQL"
>> >> > lives in a browser now, and it used to live on a server close to the
>> >> > DB. And REST isn't expressive for selecting data like SQL is.
>> >> >
>> >> > There must be some industry solution to this "I want to do SQL over
>> >> > REST" problem.
>> >> >
>> >> > On Fri, Mar 24, 2017 at 5:54 AM, Martin Sivak <msivak at redhat.com>
>> >> > wrote:
>> >> >>
>> >> >> > for quite some time I have been more or less involved in the
>> >> >> > development of various UIs for oVirt based entirely on oVirt's REST
>> >> >> > API, ranging from the quite mature moVirt [1] through some Cockpit
>> >> >> > extensions to a young and experimental user portal replacement [2].
>> >> >>
>> >> >> oVirt optimizer has the same issue..
>> >> >>
>> >> >> > 2: add some tiny service which would just accept a list of queries,
>> >> >> > execute them locally (but using real HTTP requests) and return the
>> >> >> > results in one bulk. A naive implementation, just to give a sense of
>> >> >> > what I mean, would be a shell script getting a list of strings like
>> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions", iterating over
>> >> >> > them, doing a curl request for each, mangling the results into one
>> >> >> > string and returning it (credits for this idea to msivak). Easy to
>> >> >> > implement, with the possibility to also add projections later to
>> >> >> > save some bandwidth. But the API would still be hammered by a bunch
>> >> >> > of queries; only the network roundtrips would be saved.
>> >> >>
>> >> >> The biggest cost for (especially mobile) clients is the cost of
>> >> >> establishing a new SSL connection. SSL is also pretty expensive on the
>> >> >> server side.
>> >> >>
>> >> >> So running the aggregation service on the ovirt-engine machine
>> >> >> (behind Apache) means the client will do a single SSL request with a
>> >> >> list of N URLs, and the local "reverse proxy" will perform a single
>> >> >> authentication and N plain HTTP requests (or even better, AJP). It
>> >> >> won't remove any time from the actual command run time, but it will
>> >> >> reduce protocol overhead.
>> >> >>
>> >> >> I think this is the simplest first step that requires almost no
>> >> >> change
>> >> >> to existing infrastructure.
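
The aggregation service described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the function names (fetch_json, aggregate, fake_fetch) and the shape of the returned bulk are hypothetical, authentication and AJP are left out entirely, and a stand-in fetcher replaces the real engine so the sketch runs without one.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, headers=None):
    """Perform one real HTTP GET and decode the JSON body."""
    request = Request(url, headers=dict(headers or {}, Accept="application/json"))
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def aggregate(urls, fetch=fetch_json):
    """Run every query in order and return all results in one bulk.

    A failed query is reported next to its URL instead of aborting the batch.
    """
    results = []
    for url in urls:
        try:
            results.append({"url": url, "body": fetch(url)})
        except Exception as error:  # keep the remaining queries going
            results.append({"url": url, "error": str(error)})
    return results


# Stand-in fetcher (hypothetical), so the sketch runs without an engine.
def fake_fetch(url):
    if url.endswith("/sessions"):
        return {"session": []}
    raise ValueError("unknown resource")


bulk = aggregate(
    [
        "https://localhost/ovirt-engine/api/vms/123/sessions",
        "https://localhost/ovirt-engine/api/vms/123/unknown",
    ],
    fetch=fake_fetch,
)
print(json.dumps(bulk, indent=2))
```

The client still pays for N engine queries, as noted in the thread; the sketch only shows how they could be folded into one request and one response.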
>> >> >>
>> >> >> --
>> >> >> Martin Sivak
>> >> >> SLA / oVirt
>> >> >>
>> >> >> On Fri, Mar 24, 2017 at 10:20 AM, Tomas Jelinek
>> >> >> <tjelinek at redhat.com>
>> >> >> wrote:
>> >> >> > Hi All,
>> >> >> >
>> >> >> > for quite some time I have been more or less involved in the
>> >> >> > development of various UIs for oVirt based entirely on oVirt's REST
>> >> >> > API, ranging from the quite mature moVirt [1] through some Cockpit
>> >> >> > extensions to a young and experimental user portal replacement [2].
>> >> >> >
>> >> >> > One issue we hit over and over again is the missing data aggregation.
>> >> >> > In the 3.x era, moVirt used the detail=something API to get the disks
>> >> >> > and NICs of the VM, something like:
>> >> >> >
>> >> >> > GET /ovirt-engine/api/vms
>> >> >> > Accept: application/json; detail=disks
>> >> >> >
>> >> >> > This allowed us to store this data in a local database, leading to a
>> >> >> > great user experience. Since this feature was removed in the 4.x
>> >> >> > API [3], we had to resort to a different solution: when the user
>> >> >> > selects the VM detail, we start loading the disks and NICs and hope
>> >> >> > the user will not be fast enough to see the delay. The UX is slightly
>> >> >> > worse but kind of acceptable.
>> >> >> >
>> >> >> > We hit this issue harder in the new user portal [2], because we
>> >> >> > already have the VM cached and show the whole VM on one screen. So, if
>> >> >> > you pick it, you get its details immediately.
>> >> >> > But since we don't have all the details, we need an additional call
>> >> >> > (two, actually) to load this data, and it starts to appear later.
>> >> >> > So something which would be very fast and smooth starts to feel
>> >> >> > sluggish.
>> >> >> >
>> >> >> > Recently, we hit this issue again, which forced us to sacrifice the UX
>> >> >> > even more: the "console in use" feature of the user portal.
>> >> >> > The use case is this:
>> >> >> > - if the console is already taken by some user, there are
>> >> >> > complications if another user tries to take it as well (I will skip
>> >> >> > the details about the settings and permissions involved, but long
>> >> >> > story short, the user will probably not be allowed to connect to it.
>> >> >> > The "probably" is the key here: since we cannot make any intelligent
>> >> >> > decision in advance, we can only warn the user that the console is
>> >> >> > taken).
>> >> >> > - in the current GWT user portal, if the VM's console is taken, the
>> >> >> > VM's "box" shows that the "console is taken". This was a highly
>> >> >> > requested feature.
>> >> >> > - to get this information using the current REST API, we need to go to
>> >> >> > the /vms/<vmid>/sessions subcollection. To get this for all VMs, we
>> >> >> > would be doing N queries per poll, which we cannot afford.
>> >> >> > - so the current PR [4] will probably end up checking it only on the
>> >> >> > attempt to connect to the console, warning the user. Maybe it will
>> >> >> > also be shown in the VM details. But the UX will suffer significantly
>> >> >> > when the user looks for a VM which has a free console (e.g. trying
>> >> >> > one by one until some opens, or looking at the details one by one to
>> >> >> > see if the warning appears (with a delay)).
>> >> >> >
>> >> >> > I understand that embedding the details of the VM in the response
>> >> >> > comes with a cost, namely:
>> >> >> > - performance hit
>> >> >> > - complexity of the API code
>> >> >> > - the "cleanness" of REST suffers
>> >> >> >
>> >> >> > But I think we should seriously consider providing some option for
>> >> >> > data aggregation.
>> >> >> >
>> >> >> > I know this has been discussed many times with no result, but I think
>> >> >> > it is time to bring this topic up again. I'll try to summarize the
>> >> >> > (failed) attempts so far:
>> >> >> > - the detail=<something> parameter with ad-hoc embedding of data.
>> >> >> > This was there and was removed in 4.0 [3]
>> >> >> > - the DoctorREST project, e.g. a proxy above the current API. The idea
>> >> >> > was to create a service which would be independent of the engine
>> >> >> > itself, would locally poll the engine's REST API, store all data in a
>> >> >> > local (Mongo) DB and provide a rich API with aggregations,
>> >> >> > projections and push notifications. This polling of everything to
>> >> >> > get the data into DoctorREST proved to be pretty costly, so a more
>> >> >> > invasive approach of pushing data from the engine to the doctor has
>> >> >> > also been discussed [5]. Neither of these two approaches has been
>> >> >> > accepted (too complicated, too invasive).
>> >> >> > - writing some custom ad-hoc servlet serving only the purpose of one
>> >> >> > frontend. This is actually there for the dashboard, but it is not a
>> >> >> > generic solution for the other frontends, and we really should not
>> >> >> > develop custom "APIs" for every frontend
>> >> >> > - there were some other proposals discussed (some 3rd-party solutions
>> >> >> > etc.), but I think none of them even made it to a PoC
>> >> >> >
>> >> >> > So now I would try again, and try small, to get at least some benefit.
>> >> >> > I see 2 paths we could try:
>> >> >> > 1: embed something which burns us immediately, e.g. the /sessions
>> >> >> > into VMs. I really liked the ;detail=sessions approach, could we
>> >> >> > bring it back?
>> >> >> > 2: add some tiny service which would just accept a list of queries,
>> >> >> > execute them locally (but using real HTTP requests) and return the
>> >> >> > results in one bulk. A naive implementation, just to give a sense of
>> >> >> > what I mean, would be a shell script getting a list of strings like
>> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions", iterating over
>> >> >> > them, doing a curl request for each, mangling the results into one
>> >> >> > string and returning it (credits for this idea to msivak). Easy to
>> >> >> > implement, with the possibility to also add projections later to
>> >> >> > save some bandwidth. But the API would still be hammered by a bunch
>> >> >> > of queries; only the network roundtrips would be saved.
>> >> >> > 3: any other simple approaches?
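
The "projections" mentioned in option 2 could look roughly like the sketch below: after a query result is decoded, only the fields the client asked for are kept before the bulk is sent back. The helper name project and the sample VM fields are assumptions for illustration and are not taken from the oVirt API schema.

```python
def project(resource, fields):
    """Keep only the requested top-level fields of a decoded JSON object."""
    return {name: resource[name] for name in fields if name in resource}


# Hypothetical VM representation; the field names are illustrative only.
vm = {"id": "123", "name": "demo-vm", "status": "up", "memory": 1073741824}
print(project(vm, ["id", "status"]))  # {'id': '123', 'status': 'up'}
```

Fields that are requested but missing are silently skipped here; a real service might prefer to report them.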
>> >> >> >
>> >> >> > I honestly prefer the first approach. It is not beautiful, it is
>> >> >> > not
>> >> >> > REST-ful, but it is easy to implement, very pragmatic and useful.
>> >> >> > What do you think?
>> >> >> >
>> >> >> > Thank you and sorry for the long mail :)
>> >> >> > Tomas
>> >> >> >
>> >> >> > [1]: https://github.com/oVirt/moVirt
>> >> >> > [2]: https://github.com/oVirt/ovirt-web-ui
>> >> >> > [3]: https://gerrit.ovirt.org/#/c/61260
>> >> >> > [4]: https://github.com/oVirt/ovirt-web-ui/pull/106/
>> >> >> > [5]: https://gerrit.ovirt.org/#/c/45233/
>> >> >> >
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > Devel mailing list
>> >> >> > Devel at ovirt.org
>> >> >> > http://lists.ovirt.org/mailman/listinfo/devel
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Greg Sheremeta, MBA
>> >> > Red Hat, Inc.
>> >> > Sr. Software Engineer
>> >> > gshereme at redhat.com
>> >
>> >
>
>