[ OST Failure Report ] [ oVirt Master ] [ 12-10-2017 ] [003_00_metrics_bootstrap.configure_metrics ]

Hi,
We had a failure to configure metrics in ovirt-system-tests which caused metrics_bootstrap to fail.
The patch that was reported as the cause is below.
Link to suspected patches: https://gerrit.ovirt.org/#/c/82686/
Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3104/
Link to all logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3104/artifact/
(Relevant) error snippet from the log:
<error>
  File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
    testMethod()
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 129, in wrapped_test
    test()
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 59, in wrapper
    return func(get_test_prefix(), *args, **kwargs)
  File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/003_00_metrics_bootstrap.py", line 53, in configure_metrics
    ' Exit code is %s' % result.code
  File "/usr/lib/python2.7/site-packages/nose/tools/trivial.py", line 29, in eq_
    raise AssertionError(msg or "%r != %r" % (a, b))
'Configuring ovirt machines for metrics failed. Exit code is 2\n--
</error>
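The last two frames of the traceback can be read as a plain equality check on the command's exit code. The following is a minimal sketch, not the OST code: `eq_` mirrors the line quoted from nose/tools/trivial.py, the message text is taken from the traceback, and `FakeResult` and `check_configure_metrics` are hypothetical stand-ins for the real command result and test body.

```python
class FakeResult(object):
    """Hypothetical stand-in for the result object of the remote command."""

    def __init__(self, code):
        self.code = code


def eq_(a, b, msg=None):
    # Same check as the line shown from nose/tools/trivial.py.
    if not a == b:
        raise AssertionError(msg or "%r != %r" % (a, b))


def check_configure_metrics(result):
    # Any non-zero exit code fails the test with the message seen in the log.
    eq_(
        result.code, 0,
        'Configuring ovirt machines for metrics failed.'
        ' Exit code is %s' % result.code
    )
```

Read this way, the assertion only reports that the configuration command exited with code 2; the underlying cause has to be looked up in the job artifacts.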

Repo issues (again?) See log[1].
Y.
[1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3104/artifact/...
On Thu, Oct 12, 2017 at 12:26 PM, Dafna Ron <dron@redhat.com> wrote:
> Hi,
> We had a failure to configure metrics in ovirt-system-tests which caused metrics_bootstrap to fail.
> The patch that was reported as the cause is below.
> Link to suspected patches: https://gerrit.ovirt.org/#/c/82686/
> [...]

On 12.10.2017 12:27, Yaniv Kaul wrote:
> Repo issues (again?) See log[1]. Y.
> [1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3104/artifact/...
I agree, seems to be unrelated to my patch.
--
Kind Regards
Viktor Mihajlovski

Thank you.
I was not sure if this is repo related since I could not see a specific package that it failed on.
Sandro believes this might have been a repo outage so I am opening a ticket to try and investigate this issue and find the root cause.
https://ovirt-jira.atlassian.net/browse/OVIRT-1693
Thanks,
Dafna
On 10/12/2017 11:37 AM, Viktor Mihajlovski wrote:
> I agree, seems to be unrelated to my patch.

On Thu, Oct 12, 2017 at 2:36 PM, Dafna Ron <dron@redhat.com> wrote:
> Thank you. I was not sure if this is repo related since I could not see a specific package that it failed on. Sandro believes this might have been a repo outage so I am opening a ticket to try and investigate this issue and find the root cause.
We have far too many repo outages. I believe it could be partially solved by properly and consistently keeping the reposync up-to-date. It's far from bullet-proof, and it is annoying work, but we need to just do it once every other week or so, to ensure we can perform offline installation (I don't believe it's a complete repo outage, but a partial one, which is why I think it'll help).
Y.
> Thanks, Dafna
> On 10/12/2017 11:37 AM, Viktor Mihajlovski wrote:
>> On 12.10.2017 12:27, Yaniv Kaul wrote:
>>> Repo issues (again?) See log[1]. Y.
>>> [1] http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/3104/artifact/exported-artifacts/basic-suit-master-el7/003_00_metrics_bootstrap.py.junit.xml
>> I agree, seems to be unrelated to my patch.

On Thu, Oct 12, 2017 at 8:51 PM, Yaniv Kaul <ykaul@redhat.com> wrote:
> We have far too many repo outages. I believe it could be partially solved by properly and consistently keeping the reposync up-to-date. It's far from bullet-proof, and it is annoying work, but we need to just do it once every other week or so, to ensure we can perform offline installation (I don't believe it's a complete repo outage, but a partial one, which is why I think it'll help).
Spoke too soon:
yum.Errors.NoMoreMirrorsRepoError: failure: repodata/repomd.xml from centos-ovirt-4.2-el7: [Errno 256] No more mirrors to try.
http://cbs.centos.org/repos/virt7-ovirt-42-testing/x86_64/os/repodata/repomd...: [Errno 14] HTTP Error 404 - Not Found
Perhaps we are not using mirror links properly.
Y.
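One way to picture the "not using mirror links properly" suspicion is a probe that tries several base URLs in order and picks the first one that actually serves repodata/repomd.xml. This is only a sketch under assumptions: the function names are made up for illustration, it is not how yum or OST resolve repositories, and any mirror URL passed in would be a placeholder for a real internal mirror.

```python
import urllib.error
import urllib.request

REPOMD = "repodata/repomd.xml"


def repomd_url(base_url):
    """Build the repomd.xml URL for a repository base URL."""
    return base_url.rstrip("/") + "/" + REPOMD


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error such as the 404 in the log.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout and similar.
        return None


def first_usable_repo(base_urls, probe=http_status):
    """Return the first base URL whose repomd.xml answers 200, else None."""
    for base in base_urls:
        if probe(repomd_url(base)) == 200:
            return base
    return None
```

The `probe` argument exists so the selection logic can be exercised without network access, for example by passing the `get` method of a dict that maps URLs to canned status codes.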

Adding Eyal, Barak and Sandro and removing Viktor.
Personally, I do not mind taking on this task, but I think that I would need help in creating such a task.
On 10/12/2017 06:51 PM, Yaniv Kaul wrote:
> We have far too many repo outages. I believe it could be partially solved by properly and consistently keeping the reposync up-to-date. It's far from bullet-proof, and it is annoying work, but we need to just do it once every other week or so, to ensure we can perform offline installation (I don't believe it's a complete repo outage, but a partial one, which is why I think it'll help).
> Y.

2017-10-13 11:29 GMT+02:00 Dafna Ron <dron@redhat.com>:
> Adding Eyal, Barak and Sandro and removing Viktor.
> Personally, I do not mind taking on this task, but I think that I would need help in creating such a task.
I think we are already mirroring everything and refreshing the mirror every few hours. The issue looks to be that we are not using the mirrors in some jobs.
--
SANDRO BONAZZOLA
ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>

On 13 October 2017 at 14:31, Sandro Bonazzola <sbonazzo@redhat.com> wrote:
> I think we are already mirroring everything and refreshing the mirror every few hours. The issue looks to be that we are not using the mirrors in some jobs.
I did not look deeply into this particular issue, but just to get everyone on the same page:
* We have a mirror system that is synced every 8 hours and has no known issues ATM.
* All repo issues you're seeing in OST are due to one of the following two reasons:
  1. We are white-listing packages into the OST environment, and the list needs to be maintained as package dependencies change.
  2. The OST VMs are not blocked from using the upstream CentOS repos/mirrors, and the upstream repos are not updated in an atomic fashion.
We have ongoing work [1] to fix issue #2 above; it takes time because it requires meticulous work to get all the required things into the whitelist.
BTW, when you see issues in OST that are due to upstream CentOS repos not being updated atomically, they usually correlate with a similar failure in the mirror sync job.
[1]: https://ovirt-jira.atlassian.net/browse/OVIRT-1280
--
Barak Korren
RHV DevOps team, RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
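Reason 1 above (a whitelist that drifts as package dependencies change) can be pictured as a set difference between what the tests transitively require and what is whitelisted. The sketch below is an assumption-laden illustration, not the OST tooling: the function names are hypothetical, and the dependency map is a plain dict that in practice would have to be derived from repository metadata.

```python
def required_packages(roots, depends_on):
    """Return the root packages plus everything they pull in transitively.

    depends_on maps a package name to the names it directly requires.
    """
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            # Already visited; this also protects against dependency cycles.
            continue
        seen.add(pkg)
        stack.extend(depends_on.get(pkg, ()))
    return seen


def missing_from_whitelist(roots, depends_on, whitelist):
    """Return the sorted list of required packages that are not whitelisted."""
    return sorted(required_packages(roots, depends_on) - set(whitelist))
```

A non-empty result would mean an install inside the environment has to fall back to upstream repositories, which is where reason 2 comes into play.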

On Sun, Oct 15, 2017 at 10:15 AM, Barak Korren <bkorren@redhat.com> wrote:
> We have ongoing work [1] to fix issue #2 above, it takes time because it requires meticulous work to get all the required things into the whitelist.
Since I send patches here and there to do this meticulous work, I know it's not such a big deal. Yes, it's annoying and I have not yet come up with an automated way to do it (I'm sure there is one!), but it takes a few hours and we can do it once every 2 weeks or so. It also has the nice benefit of reducing run time, sometimes dramatically.
Y.
> [1]: https://ovirt-jira.atlassian.net/browse/OVIRT-1280
participants (5)
- Barak Korren
- Dafna Ron
- Sandro Bonazzola
- Viktor Mihajlovski
- Yaniv Kaul