
Hi All,

Lately I barely get any valuable input from the Jenkins CI builds on my patches. Throughout the last week most of the builds finished with different Jenkins failures. The reasons were:

* git failure
* lack of permission to mkdir
* failure to retrieve artifacts from the artifactory
* unexpected shutdown

Such a high rate of failures makes the value of the builds very low and causes me to spend my time on understanding whether it's my fault or not.

I'd be very thankful and happier if Jenkins reliability was improved.

Regards,
Yevgeny

Hi,

We'd be more than happy to help and fix those issues. Can you please provide links and info on specific failures so we can debug them?

Also, you're welcome to open a ticket in our ticketing system [1] to track a specific item. Keep in mind that the infra team is limited in resources, so not all tickets might be solved quickly, especially if a major outage (like the one we had this week) is in progress.

[1] https://fedorahosted.org/ovirt/newticket

/e

----- Original Message -----
From: "Yevgeny Zaspitsky" <yzaspits@redhat.com> To: infra@ovirt.org Sent: Thursday, February 5, 2015 3:59:34 PM Subject: Jenkins failures
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

Also take into account that on Monday/Tuesday we had a major outage on Jenkins, and all the slaves behaved unreliably, if they were working at all.

-- 
David Caro

Red Hat S.L.
Continuous Integration Engineer - EMEA ENG Virtualization R&D

Tel.: +420 532 294 605
Email: dcaro@redhat.com
Web: www.redhat.com
RHT Global #: 82-62605

Here is an example of a git failure on a Jenkins node:
http://jenkins.ovirt.org/job/ovirt-engine_master_find-bugs_gerrit/26316/cons...

On 05/02/15 16:07, David Caro wrote:
Also take into account that monday/tuesday we had a major outage on jenkins and all the slaves behaved unreliably if working at all.

It seems that this slave isn't responsive to ssh, so it might have some sort of infra issue.

I think we should consider adding some sort of verification process for slaves, something that will run nightly, or before each job if it's fast enough.

We could think about checking:
- ping
- ssh
- git clone..

David, what do you think? It might reduce a lot of false-positive failures.

e.

----- Original Message -----
From: "Yevgeny Zaspitsky" <yzaspits@redhat.com> To: "David Caro" <dcaroest@redhat.com>, "Eyal Edri" <eedri@redhat.com> Cc: infra@ovirt.org Sent: Tuesday, February 10, 2015 6:15:40 PM Subject: Re: Jenkins failures
Here is an example for a git failure on a Jenkins node: http://jenkins.ovirt.org/job/ovirt-engine_master_find-bugs_gerrit/26316/cons...

Adding something like that requires tampering with Jenkins internals (or wrapping a job inside a job, or similar), so it's not easy to do per job. Jenkins itself should be doing connectivity checks periodically and taking slaves out of the pool if they are unreachable, unresponsive (if the ping is not fast enough), or if the disk/swap is filling up.

The slave log shows that it was connected after the job ran. I'll try to figure out what happened with it before that.

-- 
David Caro
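
The verification idea discussed above (ping, then ssh, then a git check) could be sketched as a standalone script run nightly from cron or from a separate job. This is only an illustration under stated assumptions, not the infra team's tooling: the host name and repository URL are placeholders, the `ping` flags assume Linux iputils, and `git ls-remote` stands in for a full `git clone` to keep the check fast. The function names are hypothetical.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

# Order matters: each check only makes sense if the previous one passed.
CHECK_ORDER = ("ping", "ssh", "git")


def build_checks(host: str, repo_url: str) -> Dict[str, List[str]]:
    """Return the command line for each check (host and repo_url are placeholders)."""
    ssh = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host]
    return {
        # One ICMP echo request, wait up to 5 seconds (Linux iputils flags).
        "ping": ["ping", "-c", "1", "-W", "5", host],
        # Non-interactive login that runs a no-op command on the slave.
        "ssh": ssh + ["true"],
        # Ask the slave to contact the git server; lighter than a full clone.
        "git": ssh + ["git", "ls-remote", "--heads", repo_url],
    }


def run_command(cmd: Sequence[str], timeout: int = 60) -> bool:
    """Run cmd quietly; True only if it exits with status 0 within the timeout."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def verify_slave(
    host: str,
    repo_url: str,
    runner: Callable[[Sequence[str]], bool] = run_command,
) -> Dict[str, bool]:
    """Run the checks in order and stop at the first failure."""
    checks = build_checks(host, repo_url)
    results: Dict[str, bool] = {}
    for name in CHECK_ORDER:
        results[name] = runner(checks[name])
        if not results[name]:
            break
    return results


def is_healthy(results: Dict[str, bool]) -> bool:
    """A slave is healthy only if every check ran and passed."""
    return all(results.get(name, False) for name in CHECK_ORDER)
```

A caller would run something like `verify_slave("slave.example.org", "https://example.org/repo.git")` and take the slave offline when `is_healthy` returns False. Running this outside the build jobs sidesteps the per-job wrapping concern raised above, at the cost of only catching problems between runs.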

It's the same slave: el6-vm04.phx.ovirt.org

I took it offline for now, so if you retrigger your job it should work.

e.

----- Original Message -----
From: "Yevgeny Zaspitsky" <yzaspits@redhat.com> To: "David Caro" <dcaroest@redhat.com>, "Eyal Edri" <eedri@redhat.com> Cc: infra@ovirt.org Sent: Tuesday, February 10, 2015 6:15:40 PM Subject: Re: Jenkins failures
Here is an example for a git failure on a Jenkins node: http://jenkins.ovirt.org/job/ovirt-engine_master_find-bugs_gerrit/26316/cons...
participants (3)
- David Caro
- Eyal Edri
- Yevgeny Zaspitsky