[ovirt-devel] REST API data aggregation

Yaniv Kaul ykaul at redhat.com
Fri Mar 24 16:30:28 UTC 2017


On Fri, Mar 24, 2017 at 6:43 PM, Martin Sivak <msivak at redhat.com> wrote:

> > 2: you can have more api gateways (e.g. more apis) tailored for every
> > frontend. I don't think we need this - the current API serves us pretty
> > well in every FE I'm involved in. The only thing which I miss is the data
> > aggregation.
>
> So it does not serve us well. Aggregation of data is one of the usual
> reasons for using a gateway.
> Yes microservices are affected by this indeed, but so are we because
> implementing the aggregation directly in the current engine API layer
> is hard.
>
> > So I would go back to the original topic of this thread - do some small
> > change which has a chance to be merged to the project and helps us where
> > it hurts.
>

I'm wondering if very specific additional REST API calls can suffice.
For example, a 'Get VM + disks + NIC' API call seems reasonable to add for
the various clients who commonly need it.
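
To make the 'Get VM + disks + NIC' idea concrete, here is a rough sketch in
Python of what such a call would have to assemble. It is only an
illustration: the function name, the injected fetch helper and the
diskattachments/nics sub-collection paths are assumptions made for the
sketch, not an existing engine API.

```python
from typing import Any, Callable, Dict

API_ROOT = "/ovirt-engine/api"

# Sub-collections to fold into the VM document. These names are assumptions
# for illustration; adjust them to the API version actually in use.
SUBCOLLECTIONS = ("diskattachments", "nics")


def get_vm_with_details(
    vm_id: str, fetch: Callable[[str], Dict[str, Any]]
) -> Dict[str, Any]:
    """Return one document holding the VM plus the listed sub-collections.

    `fetch` takes an API path and returns the parsed JSON body. In a real
    client it would wrap an authenticated HTTP GET (hypothetical helper).
    """
    vm = dict(fetch(f"{API_ROOT}/vms/{vm_id}"))  # copy, do not mutate input
    for name in SUBCOLLECTIONS:
        vm[name] = fetch(f"{API_ROOT}/vms/{vm_id}/{name}")
    return vm
```

Done on the server side, this is one round trip for the client instead of
three; done on the client side, as here, it only hides the extra calls.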


> Can a simple HTTP/2 to HTTP/AJP gateway be the simplest solution? Our
> Apache might even have a module for it already.
>

The Apache version we currently use has only an experimental module for it.
Undertow is supposed to have better support. I wonder when/if we can drop
Apache...
Y.


> That way you can multiplex all the REST calls using a single TCP
> connection (and a single SSL negotiation).
>
> A custom SSO-enabled service like that might be even better, as it
> would be able to skip the authentication layers too, and that would
> lower the engine load. But I am not sure it is possible with the
> current codebase.
>
> Martin
>
> On Fri, Mar 24, 2017 at 4:22 PM, Tomas Jelinek <tjelinek at redhat.com>
> wrote:
> >
> >
> > On Fri, Mar 24, 2017 at 3:58 PM, Martin Sivak <msivak at redhat.com> wrote:
> >>
> >> > I feel like every REST API I've ever worked with has had the
> >> > aggregation + projection problem. It's like we're trying to use REST as
> >> > a replacement for SQL -- but the logic that executes the "SQL" lives in
> >> > a browser now, and it used to live on a server close to the DB. And REST
> >> > isn't expressive for selecting data like SQL is.
> >>
> >> The current industry solution I know about is called an API gateway.
> >> Most of the big players have an internal API with lots of low-level
> >> stuff and then a couple of external API gateways tailored to what the
> >> client needs.
> >>
> >> http://microservices.io/patterns/apigateway.html (check the backend
> >> for frontend section)
> >>
> >> This trend is also visible when you think about services that offer
> >> API gateway management and billing like
> >> https://aws.amazon.com/api-gateway/ or our very own
> >> https://www.3scale.net/
> >
> >
> > right, but the api gateway solves 2 problems:
> >
> > 1: if you have a microservice architecture it is hard for the frontend to
> > talk to 20 different moving services. So the gateway hides this complexity
> > behind it. This is not the problem we have.
> >
> > 2: you can have more api gateways (e.g. more apis) tailored for every
> > frontend. I don't think we need this - the current API serves us pretty
> > well in every FE I'm involved in. The only thing which I miss is the data
> > aggregation.
> >
> > So I would go back to the original topic of this thread - do some small
> > change which has a chance to be merged to the project and helps us where
> > it hurts.
> >
> >>
> >>
> >>
> >> Martin
> >>
> >> On Fri, Mar 24, 2017 at 3:47 PM, Greg Sheremeta <gshereme at redhat.com>
> >> wrote:
> >> > I feel like every REST API I've ever worked with has had the
> >> > aggregation + projection problem. It's like we're trying to use REST as
> >> > a replacement for SQL -- but the logic that executes the "SQL" lives in
> >> > a browser now, and it used to live on a server close to the DB. And REST
> >> > isn't expressive for selecting data like SQL is.
> >> >
> >> > There must be some industry solution to this "I want to do SQL over
> >> > REST" problem.
> >> >
> >> > On Fri, Mar 24, 2017 at 5:54 AM, Martin Sivak <msivak at redhat.com>
> >> > wrote:
> >> >>
> >> >> > for quite some time I have been more or less involved in
> >> >> > development of various UIs for oVirt based entirely on oVirt's REST
> >> >> > API, ranging from the quite mature moVirt [1] through some cockpit
> >> >> > extensions to a young and experimental user portal replacement [2].
> >> >>
> >> >> The oVirt optimizer has the same issue.
> >> >>
> >> >> > 2: add some tiny service which would just accept a list of queries,
> >> >> > execute them locally (but using real HTTP requests) and return the
> >> >> > results in one bulk. A naive implementation, just to give a sense of
> >> >> > what I mean, would be a shell script getting a list of strings like
> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions", iterating
> >> >> > over them, doing a curl request for each, mangling the results into
> >> >> > one string and returning it (credits for this idea to msivak). Easy
> >> >> > to implement, with the possibility to also add projections later to
> >> >> > save some bandwidth. But the API would anyway be hammered by a bunch
> >> >> > of queries, only the network roundtrip would be saved.
> >> >>
> >> >> The biggest cost for (especially mobile) clients is the cost of
> >> >> establishing a new SSL connection. SSL is also pretty expensive on the
> >> >> server side.
> >> >>
> >> >> So running the aggregation service on the ovirt-engine machine (behind
> >> >> Apache) means the client will do a single SSL request with a list of N
> >> >> URLs, and the local "reverse-proxy" will perform a single
> >> >> authentication and N plain HTTP requests (or even better - AJP). It
> >> >> won't remove any time from the actual command run time, but it will
> >> >> reduce protocol overhead.
> >> >>
> >> >> I think this is the simplest first step that requires almost no change
> >> >> to existing infrastructure.
> >> >>
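
The aggregation service described above could be sketched roughly like this
in Python. This is a minimal illustration under assumptions, not a proposed
implementation: the names run_batch and make_local_fetch, the per-path
ok/body/error envelope and the way credentials are passed are all
hypothetical.

```python
import json
from typing import Callable, Dict, Iterable
from urllib.request import Request, urlopen


def make_local_fetch(base_url: str, auth_header: str) -> Callable[[str], str]:
    """Build a fetcher doing plain local GETs against the engine.

    `base_url` and `auth_header` are placeholders supplied by the caller.
    """
    def fetch(path: str) -> str:
        req = Request(
            base_url + path,
            headers={"Accept": "application/json",
                     "Authorization": auth_header},
        )
        with urlopen(req) as resp:
            return resp.read().decode("utf-8")
    return fetch


def run_batch(paths: Iterable[str], fetch: Callable[[str], str]) -> str:
    """Run every query and return all results as one JSON object keyed by path."""
    results: Dict[str, object] = {}
    for path in paths:
        try:
            results[path] = {"ok": True, "body": json.loads(fetch(path))}
        except Exception as exc:  # one failed query must not sink the batch
            results[path] = {"ok": False, "error": str(exc)}
    return json.dumps(results)
```

The client sends the list of paths in one request and gets one response
back; the engine still executes N queries, so only the per-request protocol
overhead is saved.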
> >> >> --
> >> >> Martin Sivak
> >> >> SLA / oVirt
> >> >>
> >> >> On Fri, Mar 24, 2017 at 10:20 AM, Tomas Jelinek <tjelinek at redhat.com>
> >> >> wrote:
> >> >> > Hi All,
> >> >> >
> >> >> > for quite some time I have been more or less involved in
> >> >> > development of various UIs for oVirt based entirely on oVirt's REST
> >> >> > API, ranging from the quite mature moVirt [1] through some cockpit
> >> >> > extensions to a young and experimental user portal replacement [2].
> >> >> >
> >> >> > One issue we hit over and over again is the missing data
> >> >> > aggregation. In the 3.x era we used the detail=something API in
> >> >> > moVirt to get the disks and nics of the VM, something like:
> >> >> >
> >> >> > GET /ovirt-engine/api/vms
> >> >> > Accept: application/json; detail=disks
> >> >> >
> >> >> > This allowed us to store this data in a local database, leading to a
> >> >> > great user experience. Since this feature has been removed in the
> >> >> > 4.x API [3], we needed to resort to a different solution: when the
> >> >> > VM detail is selected by the user, start loading the disks and nics
> >> >> > and hope the user will not be fast enough to see the delay. The UX
> >> >> > is slightly worse but kinda acceptable.
> >> >> >
> >> >> > We hit this issue harder in the new user portal [2], because we
> >> >> > already have the VM cached and show the whole VM in one screen. So,
> >> >> > if you pick it, you will get its details immediately.
> >> >> > But, since you don't have all the details, we need to do an
> >> >> > additional call (two actually) to load this data and they start to
> >> >> > appear later.
> >> >> > So, something which would be very fast and smooth starts to feel
> >> >> > sluggish.
> >> >> >
> >> >> > Recently, we hit this issue again, which forced us to sacrifice the
> >> >> > UX even more - it is the "console in use" feature of the user portal.
> >> >> > The use case is this:
> >> >> > - if the console is already taken by some user, there are
> >> >> > complications if another user tries to take it as well (I will avoid
> >> >> > details about the settings and permissions involved, but long story
> >> >> > short, the user will probably not be allowed to connect to it. The
> >> >> > "probably" is the key here: since we cannot make any intelligent
> >> >> > decision in advance, we can only warn the user that the console is
> >> >> > taken).
> >> >> > - in the current GWT user portal, if the VM's console is taken, it
> >> >> > is shown on the VM's "box" that "console is taken". This was a
> >> >> > highly requested feature
> >> >> > - to get this information using the current REST API, we need to go
> >> >> > to the /vms/<vmid>/sessions subcollection. To get this for all VMs,
> >> >> > we would be doing N queries per poll, which we cannot afford
> >> >> > - so the current PR [4] will probably end up only checking it on the
> >> >> > attempt to connect to the console and warning the user. Maybe it
> >> >> > will also be shown in VM details. But the UX will suffer
> >> >> > significantly when the user is looking for a VM which has a free
> >> >> > console (e.g. try one by one until some opens, or look at details
> >> >> > one by one to see if the warning appears (with a delay))
> >> >> >
> >> >> > I understand that embedding the details of the VM into the response
> >> >> > comes with a cost, namely:
> >> >> > - performance hit
> >> >> > - complexity of the API code
> >> >> > - the "cleanness" of REST suffers
> >> >> >
> >> >> > But I think we should seriously consider providing some option for
> >> >> > data aggregation.
> >> >> >
> >> >> > I know this has been discussed many times with no result, but I
> >> >> > think it is time to bring this topic up again. I'll try to summarize
> >> >> > the (failed) attempts tried so far:
> >> >> > - the detail=<something> parameter with ad-hoc embedding of data.
> >> >> > This was there and was removed in 4.0 [3]
> >> >> > - the DoctorREST project - i.e. a proxy above the current api. The
> >> >> > idea was to create a service which would be independent of the
> >> >> > engine itself, would locally poll the engine's REST API, store all
> >> >> > data in a local (mongo)DB and provide a rich api with aggregations,
> >> >> > projections and push notifications. This polling of everything to
> >> >> > get the data to DoctorREST proved to be pretty costly, so a more
> >> >> > invasive approach of pushing data from the engine to doctor has also
> >> >> > been discussed [5]. Neither of these two approaches has been
> >> >> > accepted (too complicated, too invasive).
> >> >> > - writing some custom ad-hoc servlet serving the purpose of only one
> >> >> > frontend - this is actually there for the dashboard, but it is not a
> >> >> > generic solution for the other frontends and we really should not
> >> >> > develop custom "APIs" for every frontend
> >> >> > - there were some other proposals discussed (some 3rd party
> >> >> > solutions etc) but I think none of them made it even to a PoC
> >> >> >
> >> >> > So, now I would try again and try small to get at least some
> >> >> > benefit. I see 2 paths we could try:
> >> >> > 1: embed something which burns us immediately, e.g. the /sessions
> >> >> > into VMs. I really liked the ;detail=sessions approach, could we
> >> >> > bring it back?
> >> >> > 2: add some tiny service which would just accept a list of queries,
> >> >> > execute them locally (but using real HTTP requests) and return the
> >> >> > results in one bulk. A naive implementation, just to give a sense of
> >> >> > what I mean, would be a shell script getting a list of strings like
> >> >> > "https://localhost/ovirt-engine/api/vms/123/sessions", iterating
> >> >> > over them, doing a curl request for each, mangling the results into
> >> >> > one string and returning it (credits for this idea to msivak). Easy
> >> >> > to implement, with the possibility to also add projections later to
> >> >> > save some bandwidth. But the API would anyway be hammered by a bunch
> >> >> > of queries, only the network roundtrip would be saved.
> >> >> > 3: any other simple approaches?
> >> >> >
> >> >> > I honestly prefer the first approach. It is not beautiful, it is not
> >> >> > REST-ful, but it is easy to implement, very pragmatic and useful.
> >> >> > What do you think?
> >> >> >
> >> >> > Thank you and sorry for the long mail :)
> >> >> > Tomas
> >> >> >
> >> >> > [1]: https://github.com/oVirt/moVirt
> >> >> > [2]: https://github.com/oVirt/ovirt-web-ui
> >> >> > [3]: https://gerrit.ovirt.org/#/c/61260
> >> >> > [4]: https://github.com/oVirt/ovirt-web-ui/pull/106/
> >> >> > [5]: https://gerrit.ovirt.org/#/c/45233/
> >> >> >
> >> >> >
> >> >> > _______________________________________________
> >> >> > Devel mailing list
> >> >> > Devel at ovirt.org
> >> >> > http://lists.ovirt.org/mailman/listinfo/devel
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Greg Sheremeta, MBA
> >> > Red Hat, Inc.
> >> > Sr. Software Engineer
> >> > gshereme at redhat.com
> >
> >
>