Yaniv Dary
Technical Product Manager
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109

Tel : +972 (9) 7692306
        8272306
Email: ydary@redhat.com
IRC : ydary

On Wed, Feb 22, 2017 at 5:57 PM, Francesco Romani <fromani@redhat.com> wrote:
On 02/21/2017 11:55 PM, Yaniv Dary wrote:


On Feb 21, 2017 13:06, "Francesco Romani" <fromani@redhat.com> wrote:
Hello everyone,


In recent weeks I've been submitting PRs to collectd upstream to
bring the virt plugin up to date with Vdsm and oVirt needs.

Previously, the collectd virt plugin reported only a subset of the
metrics oVirt uses.

In current collectd master, the collectd virt plugin provides all the
data Vdsm (and thus Engine) needs. This means that it is now possible
for Vdsm or Engine to query collectd, instead of Vdsm/libvirt, and get
the same data.


There are only two caveats:

1. It remains to be seen which collectd release will ship all these
enhancements.

2. collectd *intentionally* reports metrics as rates, not as absolute
values as Vdsm does. This may be an issue in the presence of restarts or
data loss in the link between collectd and the metrics store.

How does this work? 
If we want to show memory usage over time for example, we need to have the usage, not the rate. 
How would this be reported?

I was imprecise, my fault.

Let me retry:
collectd intentionally reports quite a lot of the metrics we care about as rates, not as absolute values.
Memory is actually fine.

  a0/virt/disk_octets-hdc -> rate
  a0/virt/disk_octets-vda
  a0/virt/disk_ops-hdc -> rate
  a0/virt/disk_ops-vda
  a0/virt/disk_time-hdc -> rate
  a0/virt/disk_time-vda
  a0/virt/if_dropped-vnet0 -> rate
  a0/virt/if_errors-vnet0 -> rate
  a0/virt/if_octets-vnet0 -> rate
  a0/virt/if_packets-vnet0 -> rate
  a0/virt/memory-actual_balloon -> absolute
  a0/virt/memory-rss -> absolute
  a0/virt/memory-total -> absolute
  a0/virt/ps_cputime -> rate
  a0/virt/total_requests-flush-hdc -> rate
  a0/virt/total_requests-flush-vda
  a0/virt/total_time_in_ms-flush-hdc -> rate
  a0/virt/total_time_in_ms-flush-vda
  a0/virt/virt_cpu_total -> rate
  a0/virt/virt_vcpu-0 -> rate
  a0/virt/virt_vcpu-1

collectd "just" reports the change since the last sampling. I'm not sure what the best way to handle that is; I sent a mail to the collectd list some time ago, with no answer so far.
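
To make the rate-versus-absolute concern concrete, here is a minimal sketch in Python; it is not collectd code. It assumes each sample is a hypothetical (timestamp, rate) pair, where the rate is the per-second change since the previous sample, and it rebuilds a running total by multiplying each rate by the elapsed interval. The function name and the sample layout are illustrative assumptions.

```python
from typing import Iterable, List, Tuple

# Hypothetical sample layout: (timestamp in seconds, rate per second).
Sample = Tuple[float, float]


def rebuild_total(samples: Iterable[Sample],
                  start_total: float = 0.0) -> List[Tuple[float, float]]:
    """Integrate per-second rates back into a running total.

    The first sample only supplies a starting timestamp: its rate covers
    an earlier interval whose length is unknown here, so it is skipped.
    """
    totals = []
    total = start_total
    prev_ts = None
    for ts, rate in samples:
        if prev_ts is not None:
            # The rate is assumed constant over the whole elapsed interval.
            total += rate * (ts - prev_ts)
        totals.append((ts, total))
        prev_ts = ts
    return totals


# All samples arrive: 100*10 + 300*10 + 100*10 = 5000.
complete = [(0, 0), (10, 100), (20, 300), (30, 100)]
# The sample at t=20 is lost: 100*10 + 100*20 = 3000.
with_gap = [(0, 0), (10, 100), (30, 100)]

print(rebuild_total(complete)[-1])
print(rebuild_total(with_gap)[-1])
```

With every sample present the rebuilt total is 5000; with the t=20 sample missing it is 3000, because the burst in that interval is never seen. An absolute counter would not have this problem, since the next successful read carries the full value.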

Can you CC me on that thread?
I don't know how ES would work with rates at all.
I want to be able to show CPU usage over time, and I need to know whether it's 80% or 10%.
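
For the CPU case, a rate can still be turned into a percentage as long as its unit is known. A minimal sketch in Python, assuming the reported rate is nanoseconds of CPU time consumed per second of wall-clock time (an assumption that should be checked against the collectd type definition for the metric) and that the number of vCPUs is known. The function name is hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def cpu_percent(cpu_ns_per_second: float, vcpus: int = 1) -> float:
    """Convert a CPU-time rate into a utilisation percentage.

    Assumes the rate is nanoseconds of CPU time per wall-clock second,
    summed over all vCPUs; dividing by the vCPU count normalises to 0-100.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    return 100.0 * cpu_ns_per_second / (NS_PER_SECOND * vcpus)


print(cpu_percent(800_000_000))           # one vCPU busy 0.8 s per second
print(cpu_percent(200_000_000, vcpus=2))  # two vCPUs, 0.2 s of CPU per second
```

Under those assumptions the first call yields 80% and the second 10%, so a per-second rate is enough to plot utilisation over time without needing the absolute counter.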
 




-- 
Francesco Romani
Red Hat Engineering Virtualization R & D
IRC: fromani