Re: [ovirt-devel] Changing the statistics monitoring interval to 30s

6 Jul 2017

      --

SHIRLY RADCO

BI SOFTWARE ENGINEER,

Red Hat Israel <https://www.redhat.com/>

sradco@redhat.com
 <https://red.ht/sig>
 <https://redhat.com/summit>

On Thu, Jul 6, 2017 at 12:39 PM, Oved Ourfali <oourfali@redhat.com> wrote:
...
On Thu, Jul 6, 2017 at 12:19 PM, Roy Golan <rgolan@redhat.com> wrote:
...
On Thu, Jul 6, 2017 at 12:18 PM Roy Golan <rgolan@redhat.com> wrote:
...
Action items:
- Demonstrate the effect of the reduction of stats collection on the
system - WIP
- Code changes:
  - config item change: NumberVmRefreshesBeforeSave from 5 to 10
  - make the 'poll' VMs job fire at NumberVmRefreshesBeforeSave / 2
(or just make the code support an explicit time interval)
  - VDSM should get a config set with the sampling interval - to support
back-compat
- Changes to DWH sampling and ManageIQ?
I think ManageIQ can cope with either 60-second or 20-second intervals
(after a change we made when we moved to 20 seconds).
Do put in an action item to check that with us if we decide to do so.
Indeed, 20 or 60 seconds. Their implementation is very strict and coupled
with VMware statistics, which are 20 seconds.
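
For reference, the interval arithmetic behind the code-change items above can be sketched as follows. This is only an illustration, not engine code: the 3-second base refresh period (REFRESH_PERIOD_S) is an assumption that is not stated in this thread, and both function names are hypothetical.

```python
# Assumption: the engine's base VM refresh period, in seconds. It is not
# stated in this thread; it is chosen so that 10 refreshes give the 30s
# mentioned in the subject.
REFRESH_PERIOD_S = 3


def stats_interval(number_vm_refreshes_before_save, refresh_period_s=REFRESH_PERIOD_S):
    """Seconds between two statistics saves."""
    return number_vm_refreshes_before_save * refresh_period_s


def poll_interval(number_vm_refreshes_before_save, refresh_period_s=REFRESH_PERIOD_S):
    """Seconds between two 'poll' VMs runs when fired every N / 2 refreshes."""
    return (number_vm_refreshes_before_save // 2) * refresh_period_s


print("stats interval, N=5: ", stats_interval(5))
print("stats interval, N=10:", stats_interval(10))
print("poll interval,  N=10:", poll_interval(10))
```

Under that assumed refresh period, raising NumberVmRefreshesBeforeSave to 10 doubles the statistics interval, while firing the poll job at N / 2 keeps VM detection at the rate statistics are saved today.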
...
...
...
On Thu, Jul 6, 2017 at 11:00 AM Yaniv Kaul <ykaul@redhat.com> wrote:
...
On Thu, Jul 6, 2017 at 10:04 AM, Oved Ourfali <oourfali@redhat.com>
wrote:
...
On Thu, Jul 6, 2017 at 9:38 AM, Arik Hadas <ahadas@redhat.com> wrote:
...
On Wed, Jul 5, 2017 at 9:36 PM, Shirly Radco <sradco@redhat.com>
wrote:
>
>
> On Wed, Jul 5, 2017 at 6:35 PM, Arik Hadas <ahadas@redhat.com>
> wrote:
>
>>
>>
>> On Wed, Jul 5, 2017 at 5:57 PM, Roy Golan <rgolan@redhat.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I would like to get feedback on $subject and see if I'm missing
>>> something. The impact of this is simply less resource consumption, and with
>>> that we can support an even greater number of hosts [1] and VMs in the system.
>>>
>>
>>> If you think more relaxed statistics collection will affect a core
>>> flow, let me know - so far I haven't spotted anything critical.
>>>
>>
>>> The overhead of a cycle per host is something like this: 2 roundtrips
>>> per host in a cycle (vm + host stats), and tons of memory allocation for
>>> char[] -> json -> maps of maps -> VM/Vds statistics -> Maps -> serializing
>>> to DB.
>>>
>>> To minimize the effect of this change we can leave a call to the
>>> 'list' verb to at least detect VM existence at the same rate as today.
>>>
>>
>> +1
>>
>>
>>>
>>> Pros
>>> - Engine has more resources to support more hosts/vms/other
>>> activities of the engine
>>> - Vdsm will have more resources as well (need to tweak vdsm to
>>> collect at the same
>>> frequency)
>>> - fewer DB writes and reads, approx half of what the system does
>>> over its lifespan (because this is mainly what it does all the time)
>>>
>>> Cons
>>> - DWH/Dashboard will have fewer entries; I'm not sure what the
>>> graphical effect is given our hourly resolution (cmiiw here)
>>>
>>
>> What's the frequency of the queries done by DWH/Dashboard? Do they
>> count on the _update_date column of the queried data?
>>
>
> Current frequency is 20 seconds.
> The configurations are queried based on the _update_date, but
>  statistics are queried every interval.
>
> The effect will be less accuracy in the hourly calculations.
>
Ack. So if the proposed change is done, it would probably make sense
to increase the interval of those queries to be higher than 30 sec, or at
least take into consideration the _update_date of vm_statistics as well.
Note that it will cause issues with CloudForms to change those queries
to 30 seconds, so I guess we'll still query it every 20 seconds (although
the data won't change in some of those queries).
I thought it was configurable in ManageIQ how often to query, but in
any case even if we query every 20 seconds, we'll get updated VM stats,
which is fine, and not as updated hosts stats, which is fine as well, from
my perspective.
Y.
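
To illustrate the point about queries that return unchanged data, here is a rough sketch. It assumes statistics are refreshed exactly every 30 seconds and queried exactly every 20 seconds, both starting at t=0; the function name is hypothetical, and real timing will drift.

```python
def stale_queries(query_interval_s, refresh_interval_s, duration_s):
    """Return (stale, total): queries that saw no new statistics since the
    previous query, and the total number of queries in the window."""
    stale = 0
    total = 0
    last_seen_refresh = None
    for t in range(0, duration_s, query_interval_s):
        # Index of the most recent refresh at or before time t.
        latest_refresh = t // refresh_interval_s
        if latest_refresh == last_seen_refresh:
            stale += 1
        last_seen_refresh = latest_refresh
        total += 1
    return stale, total


stale, total = stale_queries(20, 30, 3600)
print(f"{stale} of {total} queries in one hour returned unchanged statistics")
```

Under these idealized assumptions, one query in three sees the same statistics as the query before it, and none go stale if the query interval is raised to match the 30-second refresh.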
...
...
>
>> I'm asking because if they query the database every minute and say
>> "the time now is 10:30 and the queried data is ..." then there should not
>> be fewer entries.
>>
>>
>>>
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1430876
>>>
>>
>>>
>>> _______________________________________________
>>> Devel mailing list
>>> Devel@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>
>>
>>
>
>
_______________________________________________
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel