[JIRA] (OVIRT-2498) Failing KubeVirt CI
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jir... ]
Eyal Edri updated OVIRT-2498:
-----------------------------
Resolution: Fixed
Status: Done (was: To Do)
> Failing KubeVirt CI
> -------------------
>
> Key: OVIRT-2498
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2498
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Components: CI client projects
> Reporter: Petr Kotas
> Assignee: infra
>
> Hi,
> I am working on fixing the issues in the KubeVirt e2e test suites. This
> task is directly related to the unstable CI, which fails with unknown errors.
> The progress is reported in the CNV Trello:
> https://trello.com/c/HNXcMEQu/161-epic-improve-ci
> I am creating this issue because KubeVirt experiences random timeouts in
> random tests most of the time when the test suites run.
> From the outside, the issue shows up as timeouts in different parts of the tests.
> Sometimes the tests fail in the setup phase, again due to a random timeout.
> The example in the link below timed out on a connection to
> localhost.
> [check-patch.k8s-1.11.0-dev.el7.x86_64]
> requests.exceptions.ReadTimeout:
> UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.
> (read timeout=60)
> An example of a failing test suite is here:
> https://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/1916/co...
> The list of errors related to the failing CI can be found in my notes:
> https://docs.google.com/document/d/1_ll1DOMHgCRHn_Df9i4uvtRFyMK-bDCHEeGfJ...
> I am not sure whether KubeVirt has already shared the resource requirements, so
> here is a short summary:
> *Resources for KubeVirt e2e tests:*
> - at least 12GB of RAM - we start 3 nodes (3 docker images), each requiring
>   4GB of RAM
> - exposed /dev/kvm to enable native virtualization
> - cached images, since these are used to build the test cluster:
>   - kubevirtci/os-3.10.0-crio:latest
>   - kubevirtci/os-3.10.0-multus:latest
>   - kubevirtci/os-3.10.0:latest
>   - kubevirtci/k8s-1.10.4:latest
>   - kubevirtci/k8s-multus-1.11.1:latest
>   - kubevirtci/k8s-1.11.0:latest
> How can we overcome this? Can we work together to define a suitable set of
> requirements for running the tests so that they pass each time?
> Kind regards,
> Petr Kotas
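
The resource summary above could be turned into a pre-flight check that a CI worker runs before starting the e2e suite. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the memory figure is read from /proc/meminfo-style text, and the list of locally cached images is passed in by the caller rather than queried from the container runtime. Only the numbers (3 nodes, 4GB each), the /dev/kvm path, and the image names come from the issue.

```python
import os

# Values taken from the issue text: 3 nodes, each requiring 4GB of RAM.
NODE_COUNT = 3
RAM_PER_NODE_GB = 4

# Images the issue lists as needed to build the test cluster.
REQUIRED_IMAGES = [
    "kubevirtci/os-3.10.0-crio:latest",
    "kubevirtci/os-3.10.0-multus:latest",
    "kubevirtci/os-3.10.0:latest",
    "kubevirtci/k8s-1.10.4:latest",
    "kubevirtci/k8s-multus-1.11.1:latest",
    "kubevirtci/k8s-1.11.0:latest",
]


def required_ram_gb(nodes=NODE_COUNT, per_node_gb=RAM_PER_NODE_GB):
    """Total RAM the test cluster needs, in GB."""
    return nodes * per_node_gb


def parse_mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def missing_images(cached_images):
    """Return the required images that are not in the cached set."""
    cached = set(cached_images)
    return [img for img in REQUIRED_IMAGES if img not in cached]


def preflight(meminfo_text, cached_images, kvm_path="/dev/kvm"):
    """Return a list of problems; an empty list means the worker looks suitable."""
    problems = []
    total_gb = parse_mem_total_kb(meminfo_text) / (1024 * 1024)
    if total_gb < required_ram_gb():
        problems.append(
            "only %.1f GB RAM, need at least %d GB" % (total_gb, required_ram_gb())
        )
    if not os.path.exists(kvm_path):
        problems.append("%s is not exposed" % kvm_path)
    for img in missing_images(cached_images):
        problems.append("image not cached: %s" % img)
    return problems
```

On a Linux worker this might be called as `preflight(open("/proc/meminfo").read(), cached)`, where `cached` is whatever image list the worker's container runtime reports. Note that the kernel usually reports slightly less than the installed RAM, so a real check may want some tolerance around the 12GB threshold.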
--
This message was sent by Atlassian Jira
(v1001.0.0-SNAPSHOT#100095)
5 years, 11 months
[JIRA] (OVIRT-2498) Failing KubeVirt CI
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2498?page=com.atlassian.jir... ]
Eyal Edri commented on OVIRT-2498:
----------------------------------
[~pkotas] we have since migrated the KubeVirt CI to run inside OpenShift. If you're still interested in the load / stats, reach out to [~ederevea] or [~gbenhaim(a)redhat.com], who are actively working on profiling and monitoring the infra and tests.
5 years, 11 months
[JIRA] (OVIRT-2506) Accessing logs is sometimes extremely slow even
without blueocean
by Eyal Edri (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-2506?page=com.atlassian.jir... ]
Eyal Edri updated OVIRT-2506:
-----------------------------
Resolution: Cannot Reproduce
Status: Done (was: To Do)
We introduced various improvements to the Jenkins master to handle this.
We also found a memory leak in one of the blueocean plugins, and we are tracking a fix for it on GitHub.
If you can reproduce the slowness, please reopen this issue when it happens so that we can debug it on a live system.
> Accessing logs is sometimes extremely slow even without blueocean
> -----------------------------------------------------------------
>
> Key: OVIRT-2506
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2506
> Project: oVirt - virtualization made easy
> Issue Type: By-EMAIL
> Components: Jenkins Master
> Reporter: Roman Mohr
> Assignee: infra
>
> Hi,
> Since the switch from blueocean to the text-based build summary in
> kubevirt/kubevirt, accessing the logs generally works much better. Still,
> especially in the morning hours, it can take a long time to access the
> logs.
> Could it be that the build summary and the logs are still served by
> Jenkins? If so, would it help to copy out/cache/serve the build
> logs via a dedicated webserver?
> Best Regards,
> Roman
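
The idea of copying build logs out of Jenkins and serving them from a dedicated webserver could look roughly like the sketch below. This is an illustration only, not something the oVirt infra team is known to have deployed: the directory layout, the `console.log` file name, and both function names are assumptions. It uses only the Python standard library.

```python
import functools
import http.server
import os
import shutil


def mirror_log(src_path, static_root, job, build):
    """Copy one build log into static_root/<job>/<build>/console.log.

    Returns the path of the copy. The layout is hypothetical.
    """
    dest_dir = os.path.join(static_root, job, str(build))
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, "console.log")
    shutil.copy2(src_path, dest_path)
    return dest_path


def serve_logs(static_root, port=8000):
    """Serve the mirrored logs as static files (blocks until interrupted)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=static_root
    )
    with http.server.ThreadingHTTPServer(("", port), handler) as httpd:
        httpd.serve_forever()
```

A post-build step would call `mirror_log(...)` once per finished build, and `serve_logs(...)` (or any static file server pointed at the same directory) would then answer log requests without involving the Jenkins master.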
5 years, 11 months