restricting check patch/merged parallel jobs

Nadav Goldin ngoldin at redhat.com
Wed May 11 12:00:17 UTC 2016


I enhanced this[1] graph to compare slave utilization vs. the build queue.
Note that slave utilization is measured in percentages while the number of
builds in the queue is an absolute count. Basically, when the red lines are
high (large queue size) and the green ones (slave utilization) are low, we
could probably have had more builds running. In the past few days we reached
a nice utilization of ~90%, and following that the queue size decreased
pretty quickly; on the other hand, there were times of only 16% utilization
with a large queue of ~70.
Last I checked, the least significant problem is the OS, as most standard-CI
jobs are agnostic to EL/FC; usually it was the jobs limit, or sudden peaks in
the number of patches sent. I haven't added a 'reason each job is waiting'
metric yet, so it's just a feeling.
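
Regarding the jobs limit - just to make the discussion below concrete, a
per-job parallel cap in our jenkins-job-builder templates would look roughly
like the sketch below, assuming the limit is enforced through the Throttle
Concurrent Builds plugin (the template name and values are made up, the real
yamls may differ):

- job-template:
    # hypothetical template name, for illustration only
    name: '{project}_check-patch-{distro}-x86_64'
    properties:
      # requires the Throttle Concurrent Builds plugin
      - throttle:
          enabled: true
          option: project    # cap this job on its own, not via a shared category
          max-total: 6       # raise the overall parallel cap from 3 to 6
          max-per-node: 1    # still never run two builds of this job on one slave

Keeping max-per-node at 1 should also keep the 'two runs landing on the same
slave' problem from coming back even with a higher total.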

Maybe the Priority Sorter Plugin[2], which comes bundled with Jenkins, could
address the problem of jobs waiting 'unfairly' for a long time in the queue,
though it will require defining the priorities in the yamls.
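
If we try it, the change in the yamls could be as small as something like
this (a sketch only - the exact property name and whether higher numbers mean
higher priority depend on the Priority Sorter plugin version and on what
jenkins-job-builder exposes for it):

- job-template:
    # hypothetical template name, for illustration only
    name: '{project}_check-merged-{distro}-x86_64'
    properties:
      # requires the Priority Sorter plugin
      - priority-sorter:
          priority: 150    # example value; with the 1.x plugin a higher number means higher priority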



[1]
http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring?panelId=16&fullscreen&from=1462654800000&to=1462966158602&var-average_interval=12h&var-filtered_labels=All&var-filtered_jobs_labels=All
[2] https://wiki.jenkins-ci.org/display/JENKINS/Priority+Sorter+Plugin

On Wed, May 11, 2016 at 1:43 PM, Sandro Bonazzola <sbonazzo at redhat.com>
wrote:

>
>
> On Wed, May 11, 2016 at 12:34 PM, Eyal Edri <eedri at redhat.com> wrote:
>
>> From what I saw, it was mostly ovirt-engine and vdsm jobs pending in the
>> queue while other slaves were idle.
>> We have over 40 slaves and we're about to add more, so I don't think that
>> will be an issue, and IMO 3 per job is not enough, especially if you get
>> idle slaves.
>>
>>
> +1 on raising then.
>
>
>
>> We are thinking about a more dynamic approach of on-demand VM allocation,
>> so in the long run we'll have more control over it.
>> For now I'm monitoring the queue size and the slaves on a regular basis
>> [1], so if anything gets blocked for too long we'll act and adjust
>> accordingly.
>>
>>
>> [1] http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring
>>
>> On Wed, May 11, 2016 at 1:10 PM, Sandro Bonazzola <sbonazzo at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Tue, May 10, 2016 at 1:01 PM, Eyal Edri <eedri at redhat.com> wrote:
>>>
>>>> Shlomi,
>>>> Can you submit a patch to increase the limit to 6 (I think all jobs are
>>>> using the same yaml template)? We'll continue to monitor the queue and
>>>> see if there is an improvement in the utilization of the slaves.
>>>>
>>>
>>> The issue was that long-lasting jobs caused the queue to grow too much.
>>> Example: a patch set rebased on master and merged will trigger
>>> check-merged jobs, upgrade jobs, ...; running 6 instances of each of them
>>> will cause all other projects to be queued for a long time.
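
One way to raise the limit without letting a single rebase flood the queue
could be a shared throttle category across the heavy job types. A rough
sketch, assuming the Throttle Concurrent Builds plugin, with the 'heavy-jobs'
category name made up here:

- job-template:
    # hypothetical template name, for illustration only
    name: '{project}_upgrade-from-{version}_merged'
    properties:
      - throttle:
          enabled: true
          option: category
          categories:
            - heavy-jobs    # shared cap across check-merged/upgrade jobs

The actual cap for 'heavy-jobs', e.g. 6, would be set once in the Jenkins
global configuration, so the '6 instances of everything' concern applies to
the group as a whole rather than per job type.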
>>>
>>>
>>>
>>>>
>>>> E.
>>>>
>>>> On Tue, May 10, 2016 at 1:58 PM, David Caro <dcaro at redhat.com> wrote:
>>>>
>>>>> On 05/10 13:54, Eyal Edri wrote:
>>>>> > Is there any reason we're limiting the number of check-patch &
>>>>> > check-merged jobs to run only 3 in parallel?
>>>>> >
>>>>>
>>>>> We had some mess in the past where enabling parallel runs did not
>>>>> really prevent two runs from using the same slave at the same time;
>>>>> I guess we never re-enabled them.
>>>>>
>>>>> > Each job runs in mock and on its own VM; is anything preventing us
>>>>> > from removing this limitation so we won't have idle slaves while
>>>>> > other jobs are in the queue?
>>>>> >
>>>>> > We can at least raise it to a higher level if we don't want one
>>>>> > specific job to take over all slaves and starve other jobs, but I
>>>>> > think ovirt-engine jobs are probably the biggest consumer of CI, so
>>>>> > the threshold should be updated.
>>>>>
>>>>> +1
>>>>>
>>>>> >
>>>>> > --
>>>>> > Eyal Edri
>>>>> > Associate Manager
>>>>> > RHEV DevOps
>>>>> > EMEA ENG Virtualization R&D
>>>>> > Red Hat Israel
>>>>> >
>>>>> > phone: +972-9-7692018
>>>>> > irc: eedri (on #tlv #rhev-dev #rhev-integ)
>>>>>
>>>>> > _______________________________________________
>>>>> > Infra mailing list
>>>>> > Infra at ovirt.org
>>>>> > http://lists.ovirt.org/mailman/listinfo/infra
>>>>>
>>>>>
>>>>> --
>>>>> David Caro
>>>>>
>>>>> Red Hat S.L.
>>>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>>>>>
>>>>> Tel.: +420 532 294 605
>>>>> Email: dcaro at redhat.com
>>>>> IRC: dcaro|dcaroest@{freenode|oftc|redhat}
>>>>> Web: www.redhat.com
>>>>> RHT Global #: 82-62605
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Eyal Edri
>>>> Associate Manager
>>>> RHEV DevOps
>>>> EMEA ENG Virtualization R&D
>>>> Red Hat Israel
>>>>
>>>> phone: +972-9-7692018
>>>> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>>>>
>>>> _______________________________________________
>>>> Infra mailing list
>>>> Infra at ovirt.org
>>>> http://lists.ovirt.org/mailman/listinfo/infra
>>>>
>>>>
>>>
>>>
>>> --
>>> Sandro Bonazzola
>>> Better technology. Faster innovation. Powered by community collaboration.
>>> See how it works at redhat.com
>>>
>>
>>
>>
>> --
>> Eyal Edri
>> Associate Manager
>> RHEV DevOps
>> EMEA ENG Virtualization R&D
>> Red Hat Israel
>>
>> phone: +972-9-7692018
>> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>>
>
>
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com
>
> _______________________________________________
> Infra mailing list
> Infra at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra
>
>

