On Sun, Jul 9, 2017 at 4:52 PM, Roy Golan <rgolan@redhat.com> wrote:

On Thu, Jul 6, 2017 at 8:10 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Wed, Jul 5, 2017 at 5:57 PM Roy Golan <rgolan@redhat.com> wrote:
Hi all,

I would like to get feedback on $subject and see if I'm missing something. The impact is simply lower resource consumption, which lets us support an even greater number of hosts [1] and VMs in the system.

If you think more relaxed statistics collection will affect a core flow, let me know. So far I haven't spotted anything critical.

The overhead of a cycle per host is something like this: 2 roundtrips per host per cycle (VM stats + host stats), and tons of memory allocation for char[] -> JSON -> maps of maps -> VM/Vds statistics -> maps -> serializing to DB.
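To put rough numbers on the roundtrip part, here is a minimal back-of-the-envelope sketch. The host count and both intervals are made-up illustrative values, not the actual engine defaults, and the function name is hypothetical.

```python
def roundtrips_per_minute(hosts, interval_seconds, calls_per_cycle=2):
    """Requests the engine sends per minute if every host is polled once per interval.

    calls_per_cycle=2 matches the "VM stats + host stats" pair described above.
    """
    cycles_per_minute = 60 / interval_seconds
    return hosts * calls_per_cycle * cycles_per_minute


# Illustrative values only (assumptions, not real defaults).
current = roundtrips_per_minute(hosts=500, interval_seconds=15)
relaxed = roundtrips_per_minute(hosts=500, interval_seconds=30)
print(current, relaxed)
```

Doubling the interval halves the roundtrips. The allocation and DB serialization costs per cycle are not modeled here.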

To minimize the effect of this change, we can keep a call to the 'list' verb so that we at least detect VM existence at the same rate as today.
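A minimal sketch of that two-rate idea, assuming a simple tick counter per host. Only 'list' comes from the proposal; "vm_stats" and "host_stats" are placeholder labels for the two stats calls, and `verbs_for_tick` is a hypothetical helper, not engine code.

```python
def verbs_for_tick(tick, stats_every):
    """Return the calls to make on a given monitoring tick.

    'list' runs on every tick, so VM existence is detected at today's rate.
    The two stats calls run only on every `stats_every`-th tick.
    """
    verbs = ["list"]
    if tick % stats_every == 0:
        # Placeholder labels for the VM and host statistics calls.
        verbs += ["vm_stats", "host_stats"]
    return verbs


for tick in range(4):
    print(tick, verbs_for_tick(tick, stats_every=2))
```

With `stats_every=2` the stats calls run on every second tick, while 'list' still runs every time.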

Pros
- Engine has more resources to support more hosts/VMs/other activities of the engine
- Vdsm will have more resources as well (need to tweak vdsm to collect at the same
frequency)
- Fewer DB writes and reads, approx. half of what the system does over its lifespan (because this is mainly what it does all the time)

Cons
- DWH/Dashboard will have fewer entries. I'm not sure what the graphical effect is, given our hourly resolution (cmiiw here)

Why do we have a monitoring interval at all? Why not move the stats to events?

Events are not suited for everything, and the current vdsm can only guarantee 'at most once' semantics. We cannot rely on that, and that is why we poll.

I agree about the guarantees we have, but why are they not enough for stats? Are the stats so critical?

Vdsm should collect stats and send updates to engine; engine can fall back to
polling only if vdsm did not send any update in the last couple of minutes or so.
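A rough sketch of that push-with-fallback idea, assuming the engine keeps a timestamp of the last update received from each host. `PushTracker` and its methods are hypothetical names for illustration, not existing engine or vdsm code.

```python
class PushTracker:
    """Track the last pushed stats update per host and pick hosts needing a poll."""

    def __init__(self, max_silence_seconds):
        self.max_silence = max_silence_seconds
        self.last_update = {}  # host name -> time of last pushed update

    def record_update(self, host, now):
        self.last_update[host] = now

    def hosts_to_poll(self, known_hosts, now):
        # A host that never pushed anything is treated as silent forever.
        return [
            host
            for host in known_hosts
            if now - self.last_update.get(host, float("-inf")) > self.max_silence
        ]


tracker = PushTracker(max_silence_seconds=120)
tracker.record_update("host-a", now=100)
tracker.record_update("host-b", now=10)
print(tracker.hosts_to_poll(["host-a", "host-b", "host-c"], now=200))
```

Here only the hosts that stayed silent longer than the threshold get polled; since events are 'at most once', the fallback poll is what covers lost updates.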

Again, you'd have to work harder to guarantee it. One of my main motivations here is to put in as little effort as we can to gain more resources.

Same for stats collected by collectd: we want to stream updates to engine from
the host, not poll every host for the stats.

Again, not always. Consider this, for example: if we want to support big setups, in the end they are going to stream a huge amount of information to the engine, and we would have a problem handling this pressure. Polling allows us to choose who to poll and when (backpressure, if that helps).
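One way to picture "choose who to poll and when" is a per-cycle budget: poll only the hosts whose stats are stalest and leave the rest for later cycles. This is just a sketch under that assumption; `pick_hosts_to_poll` and the budget parameter are hypothetical, not how the engine does it today.

```python
import heapq


def pick_hosts_to_poll(last_polled, budget):
    """Return up to `budget` hosts with the oldest last-poll timestamps.

    last_polled maps host name -> time of the last successful poll.
    """
    return heapq.nsmallest(budget, last_polled, key=last_polled.get)


last_polled = {"host-a": 50, "host-b": 10, "host-c": 30}
print(pick_hosts_to_poll(last_polled, budget=2))
```

The budget caps the load per cycle regardless of how many hosts exist, which is the backpressure knob on the engine side.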

You have described the issue that we have with polling. In large envs we have a constant time between the polls, and a single cycle takes longer. Backpressure was implemented only on the engine side; the vdsm side is still missing.

Nir

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1430876
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel