September 2018 - Infra - oVirt List Archives

[CQ]: 94345, 3 (ovirt-provider-ovn) failed "ovirt-master" system tests
by oVirt Jenkins 20 Sep '18

Change 94345,3 (ovirt-provider-ovn) is probably the reason behind recent system test failures in the "ovirt-master" change queue and needs to be fixed.

This change has been removed from the testing queue. Artifacts built from this change will not be released until it is fixed.

For further details about the change see:
https://gerrit.ovirt.org/#/c/94345/3

For failed test results see:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/10365/

[JIRA] (OVIRT-2503) Testing automatic Jira ticket logging for CQ monitoring
by Dafna Ron (oVirt JIRA) 20 Sep '18

[ https://ovirt-jira.atlassian.net/browse/OVIRT-2503?page=com.atlassian.jira.… ]

Dafna Ron commented on OVIRT-2503:
----------------------------------

[~bkorren(a)redhat.com] Yes. But to remind you, when we discussed monitoring you stated that we need to reply to every alert that comes in so that we know nothing has been missed, and I have also been looking into the best way to do that (since I do not think that it is viable in email).

I was discussing this issue with leads of some of the JBoss development teams, who have been working agile for a while, and they recommended that we stop doing monitoring in email and just configure a Jira project for the monitoring, which would allow us to easily monitor every alert and assign them forward when needed. Jira is very flexible and we can configure it to do a lot of things that we cannot do now.

I was also concerned about having too many open tickets, but if we have a way to close/dismiss tickets quickly (which is possible in Jira), then I see it as two birds with one stone, since we would be able to:
1. make sure all alerts have been viewed by someone in the team.
2. have an easier way to move forward in collaboration with Dev on monitoring and perhaps partially automate it.

> Testing automatic Jira ticket logging for CQ monitoring
> -------------------------------------------------------
>
> Key: OVIRT-2503
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2503
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: Dafna Ron
> Assignee: infra
>
> As part of monitoring improvement and shifting partial responsibility for CQ monitoring to the developers, I would like to do some experiments on connecting CQ alerts to create Jira tickets automatically.
> As we have the OST monitoring project, we can start by doing some tests on that project so that we do not overflow our Jira with alerts.
> For now I would like to start with Jiras opening with the following specifications:
> 1. Subject: [ CQ ] [$patch number] [ oVirt $VER ($project) ] [ TEST NAME ]
> 2. at the beginning, email sent to: dron(a)redhat.com
> 3. jira description [cq message]
> 4. all Jiras must have label=ost_failure and infra-owner as default.
> I would like to change the project type to allow:
> 1. easy closing of Jiras (one/two clicks if we can)
> 2. view of Jiras like service tickets (rather than bugs)
> There is a plugin called Zapier that makes it easy to create a Jira from an email and also allows adding some rules to the Jira, which may make this easier for us.
> Can you also install it and link it to the OST Jira? I have an email account that we can use for that.
> cq.ovirt(a)gmail.com
> https://zapier.com/apps/jira/integrations
> [~bkorren(a)redhat.com]
> Once I do some tests on my own on this, I wanted to try and collaborate with one of the projects (maybe networking or one of Sandro's teams) where CQ failures would automatically open a ticket to their team, and they can handle the monitoring and escalate issues to us if needed.
> Any advice on configurations we should be thinking of for that?

--
This message was sent by Atlassian Jira (v1001.0.0-SNAPSHOT#100092)
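The ticket specification quoted in the message above (subject layout, CQ message as description, `ost_failure` label) could be mapped onto issue fields roughly as follows. This is only a sketch: the `CqAlert` class, the `build_issue_fields` function and the `OST` project key are hypothetical names introduced here, and the returned dictionary is merely shaped like the `fields` object of a Jira REST issue-creation request. The "infra-owner as default" item is not modeled, since the ticket does not say which field it refers to.

```python
from dataclasses import dataclass


@dataclass
class CqAlert:
    """Hypothetical container for the parts of one CQ failure email."""
    patch: str      # change and patch set, e.g. "94345,3"
    version: str    # oVirt version of the change queue, e.g. "master"
    project: str    # project name, e.g. "ovirt-provider-ovn"
    test_name: str  # name of the failed test
    message: str    # full text of the CQ email


def build_issue_fields(alert, project_key="OST"):
    """Build ticket fields following the subject/description/label spec.

    `project_key` is an assumed placeholder, not a known Jira project key.
    """
    summary = "[ CQ ] [{}] [ oVirt {} ({}) ] [ {} ]".format(
        alert.patch, alert.version, alert.project, alert.test_name
    )
    return {
        "project": {"key": project_key},
        "summary": summary,
        "description": alert.message,   # item 3: the CQ message itself
        "labels": ["ost_failure"],      # item 4: default label
        "issuetype": {"name": "Bug"},
    }
```

Keeping the mapping in one small function would make it easy to change the subject layout later without touching whatever service receives the alert emails.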

[JIRA] (OVIRT-2503) Testing automatic Jira ticket logging for CQ monitoring
by Barak Korren (oVirt JIRA) 20 Sep '18

[ https://ovirt-jira.atlassian.net/browse/OVIRT-2503?page=com.atlassian.jira.… ]

Barak Korren commented on OVIRT-2503:
-------------------------------------

[~dron] I really don't think this makes any sense, since the relationship between a Jira ticket and CQ alerts ought to be one-to-many, whereas this would result in a one-to-one relationship...

> Testing automatic Jira ticket logging for CQ monitoring
> -------------------------------------------------------
>
> Key: OVIRT-2503
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2503
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: Dafna Ron
> Assignee: infra
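To illustrate the one-to-many relationship mentioned in the comment above, the sketch below collects several alerts under a single ticket key instead of opening one ticket per alert. The grouping key used here (the change number) and the function name `group_alerts` are assumptions made purely for illustration; the thread does not say how alerts should be correlated.

```python
from collections import defaultdict


def group_alerts(alerts):
    """Group CQ alerts so that each group corresponds to one ticket.

    `alerts` is an iterable of (change, build_ref) pairs. Using the change
    as the grouping key is an assumption, not something the thread specifies.
    """
    tickets = defaultdict(list)
    for change, build_ref in alerts:
        tickets[change].append(build_ref)
    return dict(tickets)
```

With such a grouping, a second alert for the same key would be attached to the existing ticket (for example as a comment) rather than creating a new one.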

[oVirt Jenkins] ovirt-appliance_master_build-artifacts-el7-x86_64 - Build # 923 - Failure!
by jenkins＠jenkins.phx.ovirt.org 20 Sep '18

Project: http://jenkins.ovirt.org/job/ovirt-appliance_master_build-artifacts-el7-x86…
Build: http://jenkins.ovirt.org/job/ovirt-appliance_master_build-artifacts-el7-x86…
Build Number: 923
Build Status: Failure
Triggered By: Started by timer

-------------------------------------
Changes Since Last Success:
-------------------------------------
Changes for Build #923
[Yuval Turgeman] automation: remove build-artifacts* from stdci-v2

-----------------
Failed Tests:
-----------------
No tests ran.

[JIRA] (OVIRT-2498) Failing KubeVirt CI
by Barak Korren (oVirt JIRA) 20 Sep '18

[ https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jira.… ]

Barak Korren updated OVIRT-2498:
--------------------------------
Epic Link: OVIRT-2339

> Failing KubeVirt CI
> -------------------
>
> Key: OVIRT-2498
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2498
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Components: CI client projects
> Reporter: Petr Kotas
> Assignee: infra
>
> Hi,
>
> I am working on fixing the issues in the KubeVirt e2e test suites. This task is directly related to unstable CI, due to unknown errors.
> The progress is reported in the CNV trello:
> https://trello.com/c/HNXcMEQu/161-epic-improve-ci
>
> I am creating this issue since KubeVirt experiences random timeouts on random tests most of the time when the test suites run.
> From the outside, the issue shows up as timeouts in different parts of the tests.
> Sometimes the tests fail in the setup phase, again due to a random timeout.
> The example in the link below timed out on a network connection to localhost.
>
> [check-patch.k8s-1.11.0-dev.el7.x86_64] requests.exceptions.ReadTimeout: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
>
> An example of a failing test suite is here:
> https://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/1916/cons…
>
> The list of errors related to the failing CI can be found in my notes:
> https://docs.google.com/document/d/1_ll1DOMHgCRHn_Df9i4uvtRFyMK-bDCHEeGfJFT…
>
> I am not sure whether KubeVirt already shared the resource requirements, so I provide a short summary:
>
> *Resources for KubeVirt e2e tests:*
>
> - at least 12GB of RAM - we start 3 nodes (3 docker images), each requiring 4GB of RAM
> - exposed /dev/kvm to enable native virtualization
> - cached images, since these are used to build the test cluster:
>   - kubevirtci/os-3.10.0-crio:latest
>   - kubevirtci/os-3.10.0-multus:latest
>   - kubevirtci/os-3.10.0:latest
>   - kubevirtci/k8s-1.10.4:latest
>   - kubevirtci/k8s-multus-1.11.1:latest
>   - kubevirtci/k8s-1.11.0:latest
>
> How can we overcome this? Can we work together to build suitable requirements for running the tests so that they pass each time?
>
> Kind regards,
> Petr Kotas

--
This message was sent by Atlassian Jira (v1001.0.0-SNAPSHOT#100092)
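The first two resource requirements listed in the ticket (at least 12GB of RAM and an exposed /dev/kvm) lend themselves to a simple pre-flight check on a CI node. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the memory figure is read from `/proc/meminfo`-style text, and the threshold is a naive 12 x 1024 x 1024 kB, which a machine sold as "12GB" may fall slightly short of because the kernel reserves some memory. The cached-images requirement is not checked here.

```python
import os

# 12GB expressed in kB: 3 nodes x 4GB each, as stated in the ticket.
REQUIRED_RAM_KB = 12 * 1024 * 1024


def read_mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def check_host(meminfo_text, kvm_path="/dev/kvm"):
    """Return a list of unmet requirements (empty if none were detected)."""
    problems = []
    if read_mem_total_kb(meminfo_text) < REQUIRED_RAM_KB:
        problems.append("less than 12GB of RAM")
    if not os.path.exists(kvm_path):
        problems.append("{} is not exposed".format(kvm_path))
    return problems
```

On a Linux node, the text passed in would typically be the content of `/proc/meminfo`; an empty result means neither of the two checked requirements was found to be missing.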

[JIRA] (OVIRT-2498) Failing KubeVirt CI
by Barak Korren (oVirt JIRA) 20 Sep '18

[ https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jira.… ]

Barak Korren updated OVIRT-2498:
--------------------------------
Component/s: CI client projects

> Failing KubeVirt CI
> -------------------
>
> Key: OVIRT-2498
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2498
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Components: CI client projects
> Reporter: Petr Kotas
> Assignee: infra

[JIRA] (OVIRT-2498) Failing KubeVirt CI
by Barak Korren (oVirt JIRA) 20 Sep '18

[ https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jira.… ]

Barak Korren updated OVIRT-2498:
--------------------------------
Issue Type: Bug (was: By-EMAIL)

> Failing KubeVirt CI
> -------------------
>
> Key: OVIRT-2498
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2498
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: Petr Kotas
> Assignee: infra

oVirt infra daily report - unstable production jobs - 588
by jenkins＠jenkins.phx.ovirt.org 20 Sep '18

Good morning!

Attached is the HTML page with the Jenkins status report. You can also see it here:

- http://jenkins.ovirt.org/job/system_jenkins-report/588//artifact/exported-a…

Cheers,
Jenkins

[oVirt Jenkins] ovirt-system-tests_hc-basic-suite-master - Build # 644 - Failure!
by jenkins＠jenkins.phx.ovirt.org 20 Sep '18

Project: http://jenkins.ovirt.org/job/ovirt-system-tests_hc-basic-suite-master/
Build: http://jenkins.ovirt.org/job/ovirt-system-tests_hc-basic-suite-master/644/
Build Number: 644
Build Status: Failure
Triggered By: Started by timer

-------------------------------------
Changes Since Last Success:
-------------------------------------
Changes for Build #644
[Milan Zamazal] ovf_import test
[Eyal Edri] adding kubevirt/containerized-data-importer to stdci v2

-----------------
Failed Tests:
-----------------
1 tests failed.
FAILED: 002_bootstrap.wait_engine

Error Message:
None != True after 600 seconds

Stack Trace:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/unittest/case.py", line 369, in run
    testMethod()
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 142, in wrapped_test
    test()
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 60, in wrapper
    return func(get_test_prefix(), *args, **kwargs)
  File "/home/jenkins/workspace/ovirt-system-tests_hc-basic-suite-master/ovirt-system-tests/hc-basic-suite-master/test-scenarios/002_bootstrap.py", line 110, in wait_engine
    testlib.assert_true_within(_engine_is_up, timeout=10 * 60)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 278, in assert_true_within
    assert_equals_within(func, True, timeout, allowed_exceptions)
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 252, in assert_equals_within
    '%s != %s after %s seconds' % (res, value, timeout)
AssertionError: None != True after 600 seconds
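The error message "None != True after 600 seconds" is what a polling assertion would produce when the checked function keeps returning `None` until the timeout expires. The sketch below shows the general technique only; it is not the `ovirtlago.testlib` implementation. It reuses the name `assert_true_within` and the message format visible in the traceback, while the `interval`, `clock` and `sleep` parameters are assumptions added here, and the `allowed_exceptions` argument seen in the traceback is not modeled.

```python
import time


def assert_true_within(func, timeout, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll func() until it returns True or `timeout` seconds have passed.

    Illustrative sketch only; `interval`, `clock` and `sleep` are
    hypothetical parameters, not part of the library seen in the traceback.
    """
    deadline = clock() + timeout
    while True:
        res = func()
        if res is True:
            return
        if clock() >= deadline:
            break
        sleep(interval)
    # Same message layout as in the stack trace above.
    raise AssertionError("%s != %s after %s seconds" % (res, True, timeout))
```

Read this way, the failure suggests that the engine check never returned `True` during the ten-minute window; the message alone does not say why.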
