Experimental Jenkins monitoring

Nadav Goldin ngoldin at redhat.com
Thu Apr 14 22:24:48 UTC 2016


Hi,
I've created an experimental Jenkins dashboard on our Grafana instance:
http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring
(if you don't have an account, you can sign up with GitHub/Google)

Currently it collects the following metrics:
1) How many jobs in the build queue are waiting, per slave label.

For instance: if the build queue holds 4 builds of a job that is restricted
to 'el7' and 2 builds of another job that is also restricted to 'el7', the
first graph will show 6 for 'el7'.
'No label' sums the jobs that are waiting but are not restricted to any label.

2) How many slaves are idle, per label.
Note that every slave label also appears as a job label, but not vice versa,
because jobs may use label expressions such as (fc21 || fc22). Right now it
treats such expressions as simple strings.

3) Total number of online/offline/idle slaves
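
A minimal sketch of how the three metrics above could be counted. This is not
the dashboard's actual code: the input dictionaries and their keys ('label',
'labels', 'online', 'idle') are hypothetical shapes chosen for illustration,
not what python-jenkins returns. It also assumes that a slave counts as idle
only when it is online, and it treats a job's label as a plain string.

```python
from collections import Counter

NO_LABEL = "No label"  # bucket for queued builds that are not restricted


def waiting_per_label(queue_items):
    """Count queued builds per label string (metric 1)."""
    counts = Counter()
    for item in queue_items:
        # 'label' is a hypothetical key; empty or missing means unrestricted.
        counts[item.get("label") or NO_LABEL] += 1
    return counts


def idle_per_label(slaves):
    """Count idle, online slaves once for each label they carry (metric 2)."""
    counts = Counter()
    for slave in slaves:
        if slave["online"] and slave["idle"]:
            counts.update(slave["labels"])
    return counts


def slave_totals(slaves):
    """Total number of online/offline/idle slaves (metric 3)."""
    online = sum(1 for s in slaves if s["online"])
    idle = sum(1 for s in slaves if s["online"] and s["idle"])
    return {"online": online, "offline": len(slaves) - online, "idle": idle}


# The example from the text: 4 + 2 builds restricted to 'el7', 1 unrestricted.
queue = [{"label": "el7"}] * 4 + [{"label": "el7"}] * 2 + [{"label": None}]
slaves = [
    {"labels": ["el7", "vm"], "online": True, "idle": True},
    {"labels": ["el7"], "online": True, "idle": False},
    {"labels": ["fc23"], "online": False, "idle": True},
]
waiting = waiting_per_label(queue)
idle = idle_per_label(slaves)
totals = slave_totals(slaves)
```

Because labels are compared as plain strings here, a job restricted to
(fc21 || fc22) lands in its own bucket rather than in 'fc21' or 'fc22'.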

Besides the normal monitoring, it can help us:
1) Minimize the difference between idle slaves per label and jobs waiting in
the build queue per label. Such a difference might be caused by unnecessary
restrictions on the label, or maybe by the 'Throttle Concurrent Builds' plugin.
2) Decide how many VMs, and with which OS, to install on the new hosts.
3) In the future, once we have the 'slave pools' implemented, implement
auto-scaling based on thresholds or some other function.
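
One possible way to look at the difference mentioned in point 1, sketched
under assumptions: the function name label_gaps and the report fields are
hypothetical, and the inputs are plain {label: count} mappings. A label where
builds are waiting while slaves with that label sit idle is the case worth
investigating.

```python
def label_gaps(waiting, idle):
    """Compare waiting builds and idle slaves for every label seen in either mapping."""
    report = {}
    for label in sorted(set(waiting) | set(idle)):
        w = waiting.get(label, 0)
        i = idle.get(label, 0)
        report[label] = {
            "waiting": w,
            "idle": i,
            "difference": w - i,
            # Builds are queued although matching slaves are idle.
            "both_nonzero": w > 0 and i > 0,
        }
    return report


gaps = label_gaps({"el7": 6, "fc23": 0}, {"el7": 2, "fc22": 3})
suspicious = [label for label, row in gaps.items() if row["both_nonzero"]]
```

A threshold on 'difference' could later serve as the trigger for the
auto-scaling idea in point 3, though that part is not sketched here.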


It is 'experimental' because it still needs to be tested for stability (it is
based on python-jenkins and graphite-send), and because more metrics can be
added (maybe average running time per job? builds per hour?). I will be happy
to hear suggestions.
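
For anyone who wants to add a metric, here is a rough sketch of pushing
counters to Graphite. It does not use graphite-send; it writes Graphite's
plaintext protocol ("path value timestamp", one metric per line, port 2003 by
default) directly over a socket. The metric prefix, the function names and the
host in the usage comment are assumptions made for illustration.

```python
import socket
import time


def format_metrics(prefix, values, timestamp=None):
    """Render {name: value} as Graphite plaintext lines under the given prefix."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    lines = []
    for name, value in sorted(values.items()):
        # Dots separate path components and spaces separate fields in the
        # protocol, so both are replaced inside a label name.
        safe = name.replace(" ", "_").replace(".", "_")
        lines.append("%s.%s %s %d\n" % (prefix, safe, value, ts))
    return "".join(lines)


def send_metrics(host, payload, port=2003, timeout=5.0):
    """Send an already formatted payload to a Graphite (carbon) listener."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload.encode("utf-8"))


payload = format_metrics(
    "jenkins.queue",  # hypothetical metric prefix
    {"el7": 6, "No label": 1},
    timestamp=1460672688,
)
# send_metrics("graphite.example.org", payload)  # hypothetical host
```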

Later I plan to pack it all into independent fabric tasks (e.g. fab
do.jenkins.slaves.show).


Nadav