<div dir="ltr"><div><div><div>I enhanced this graph [1] to compare slave utilization with the build queue. Note that<br></div>slave utilization is measured as a percentage, while the number of builds in the queue<br></div>is absolute. Basically, when the red lines are high (large queue size) and the green ones (slave<br></div><div>utilization) are low, we could have had more builds running. Over the past few days<br></div><div>we reached a nice utilization of ~90%, and after that the queue size decreased pretty<br></div><div>quickly. On the other hand, there were times with only 16% utilization and a large queue of ~70 builds.<br></div><div>Last I checked, the least significant problem was the OS, since most standard-ci jobs<br></div><div>are agnostic to EL/FC. Usually it was the per-job limit or sudden peaks in patches sent,<br></div><div>but I haven&#39;t added a &#39;reason each job is waiting&#39; metric yet, so it&#39;s just a feeling.<br><br></div><div>Maybe the Priority Sorter Plugin [2], which comes bundled with Jenkins,<br>could address the problem of jobs waiting &#39;unfairly&#39; long in the queue,<br>though it would require defining the priorities in the YAMLs.<br></div><div><br></div><div><br></div><div><br>[1] <a href="http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring?panelId=16&amp;fullscreen&amp;from=1462654800000&amp;to=1462966158602&amp;var-average_interval=12h&amp;var-filtered_labels=All&amp;var-filtered_jobs_labels=All">http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring?panelId=16&amp;fullscreen&amp;from=1462654800000&amp;to=1462966158602&amp;var-average_interval=12h&amp;var-filtered_labels=All&amp;var-filtered_jobs_labels=All</a><br>[2] <a href="https://wiki.jenkins-ci.org/display/JENKINS/Priority+Sorter+Plugin">https://wiki.jenkins-ci.org/display/JENKINS/Priority+Sorter+Plugin</a></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 11, 2016 at 1:43 PM, Sandro Bonazzola <span dir="ltr">&lt;<a 
href="mailto:sbonazzo@redhat.com" target="_blank">sbonazzo@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Wed, May 11, 2016 at 12:34 PM, Eyal Edri <span dir="ltr">&lt;<a href="mailto:eedri@redhat.com" target="_blank">eedri@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">From what I saw, it was mostly ovirt-engine and vdsm jobs pending on the queue while other slaves are idle.<div>we have over 40 slaves and we&#39;re about to add more, so I don&#39;t think that will be an issue and IMO 3 per job is not enough, especially if you get idle slaves.</div><div><br></div></div></blockquote><div><br></div></span><div>+1 on raising then.</div><div><div class="h5"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div></div><div>We are thinking on a more dynamic approach of dynamic vm allocation on demand, so in the long run we&#39;ll have more control over it,</div><div>for now i&#39;m monitoring the queue size and slaves on a regular basis [1], so if anything will get blocked too much time we&#39;ll act and adjust accordingly.</div><div><br></div><div><br></div><div>[1] <a href="http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring" target="_blank">http://graphite.phx.ovirt.org/dashboard/db/jenkins-monitoring</a></div></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 11, 2016 at 1:10 PM, Sandro Bonazzola <span dir="ltr">&lt;<a href="mailto:sbonazzo@redhat.com" target="_blank">sbonazzo@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div 
class="gmail_extra"><br><div class="gmail_quote"><span>On Tue, May 10, 2016 at 1:01 PM, Eyal Edri <span dir="ltr">&lt;<a href="mailto:eedri@redhat.com" target="_blank">eedri@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Shlomi,<div>Can you submit a patch to increase the limit to 6 (I think all jobs use the same YAML template)? We&#39;ll continue to monitor the queue and see whether slave utilization improves.</div></div></blockquote><div><br></div></span><div>The issue was that long-lasting jobs caused the queue to grow too much.</div><div>Example: a patch set rebased on master and merged will trigger check-merged jobs, upgrade jobs, ...; running 6 instances of each of them will cause all other projects to be queued for a long time.</div><div><div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><span><font color="#888888"><div><br></div><div>E.</div></font></span></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 10, 2016 at 1:58 PM, David Caro <span dir="ltr">&lt;<a href="mailto:dcaro@redhat.com" target="_blank">dcaro@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On 05/10 13:54, Eyal Edri wrote:<br>

&gt; Is there any reason we&#39;re limiting the number of check patch &amp; check merged<br>

&gt; jobs to run only 3 in parallel?<br>

&gt;<br>

<br>

</span>We had some mess in the past where enabling parallel runs did not really prevent<br>

two runs from using the same slave at the same time; I guess we never re-enabled them.<br>

<span><br>

&gt; Each job runs in mock and on its own VM; is anything preventing us from<br>

&gt; removing this limitation so we won&#39;t have idle slaves while other jobs are<br>

&gt; in the queue?<br>

&gt;<br>

&gt; We can at least raise it to a higher level if we don&#39;t want one specific job<br>

&gt; to take over all slaves and starve other jobs, but I think ovirt-engine<br>

&gt; jobs are probably the biggest consumer of CI, so the threshold should be<br>

&gt; updated.<br>

<br>

</span>+1<br>

<span><br>

&gt;<br>

&gt; --<br>

&gt; Eyal Edri<br>

&gt; Associate Manager<br>

&gt; RHEV DevOps<br>

&gt; EMEA ENG Virtualization R&amp;D<br>

&gt; Red Hat Israel<br>

&gt;<br>

&gt; phone: <a href="tel:%2B972-9-7692018" value="+97297692018" target="_blank">+972-9-7692018</a><br>

&gt; irc: eedri (on #tlv #rhev-dev #rhev-integ)<br>

<br>

</span>&gt; _______________________________________________<br>

&gt; Infra mailing list<br>

&gt; <a href="mailto:Infra@ovirt.org" target="_blank">Infra@ovirt.org</a><br>

&gt; <a href="http://lists.ovirt.org/mailman/listinfo/infra" rel="noreferrer" target="_blank">http://lists.ovirt.org/mailman/listinfo/infra</a><br>

<span><font color="#888888"><br>

<br>

--<br>

David Caro<br>

<br>

Red Hat S.L.<br>

Continuous Integration Engineer - EMEA ENG Virtualization R&amp;D<br>

<br>

Tel.: <a href="tel:%2B420%20532%20294%20605" value="+420532294605" target="_blank">+420 532 294 605</a><br>

Email: <a href="mailto:dcaro@redhat.com" target="_blank">dcaro@redhat.com</a><br>

IRC: dcaro|dcaroest@{freenode|oftc|redhat}<br>

Web: <a href="http://www.redhat.com" rel="noreferrer" target="_blank">www.redhat.com</a><br>

RHT Global #: 82-62605<br>

</font></span></blockquote></div><br>

</div>

</div></div><br>

<br></blockquote></div></div></div><span><font color="#888888"><br><br clear="all"><div><br></div>-- <br><div><div dir="ltr"><div><div dir="ltr">Sandro Bonazzola<br>Better technology. Faster innovation. Powered by community collaboration.<br>See how it works at <a href="http://redhat.com" target="_blank">redhat.com</a><br></div></div></div></div>

</font></span></div></div>

</blockquote></div><br>

</div>

</div></div></blockquote></div></div></div><div><div class="h5"><br>

</div></div></div></div>


<br></blockquote></div><br></div>