
On Sun, Nov 15, 2020 at 4:13 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Sun, Nov 15, 2020 at 12:28 PM Yedidyah Bar David <didi@redhat.com> wrote:
On Thu, Nov 12, 2020 at 9:24 PM Eitan Raviv <eraviv@redhat.com> wrote:
On Thu, Nov 12, 2020 at 5:46 PM Nir Soffer <nsoffer@redhat.com> wrote:
On Thu, Nov 12, 2020 at 4:01 PM Nir Soffer <nsoffer@redhat.com> wrote:
I had many failures in recent OST patches, so I posted this change: https://gerrit.ovirt.org/c/112174/
This patch does not change anything meaningful, but it modifies the lago VM configuration so that it triggers 8 jobs. Two network test suites failed: https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-system-tests_stan...
Can someone look at the network suite failures?
The network suite has indeed been failing randomly recently. More often than not this was due to timeouts while waiting for connections to the hosts or timeouts while waiting for hosts to reach desired statuses, and in the above I also see what looks like a sock error on port 22. Not only do different tests fail at random, but the next nightly usually passes. This leads me to believe that the cause of the failures is outside the scope of the test code.
I noticed something similar as well - see thread:
[oVirt Jenkins] ovirt-system-tests_basic-suite-master_nightly - Build # 561 - Failure!
This is not only the network suites; a lot of other suites fail randomly.
Regarding the network suites: can it be related to the old kernel when running the tests in mock on an el7 host? Do we need to require an el8 host?
Do we see the same failures when running the network suites locally?
If these suites are not stable, we should not include them in the CI for OST patches, or we should mark them as expected failures so they do not fail the build.
I triggered another build since I see a lot of random failures in other suites.
On the next build - different errors:
https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-system-tests_stan...
- basic_suite_4.3.el7.x86_64 - failed
- basic_suite_master.el7.x86_64 - failed
- network_suite_4.3.el7.x86_64 - failed
- network_suite_master.el7.x86_64 - failed
Third build failed:
https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-system-tests_stan...
Failing suites:
- basic_suite_master.el7.x86_64
- network_suite_master.el7.x86_64
Fourth build failed:
https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-system-tests_stan...
- basic_suite_master.el7.x86_64
- upgrade-from-release_suite_4.3.el7.x86_64
Looks like all failures happen with el7. Why are we running master (el8 based) on el7 hosts?
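A minimal sketch for tallying the failures listed for the second, third, and fourth builds above, by distro and by suite. The job names are copied from this thread; the `suite.distro.arch` naming scheme and the helper names (`split_job`, `failures_by_distro`, `failures_by_suite`) are assumptions made for illustration, not part of OST or Jenkins.

```python
from collections import Counter

# Failed jobs as listed in this thread for builds 2-4.
FAILED = {
    "build 2": [
        "basic_suite_4.3.el7.x86_64",
        "basic_suite_master.el7.x86_64",
        "network_suite_4.3.el7.x86_64",
        "network_suite_master.el7.x86_64",
    ],
    "build 3": [
        "basic_suite_master.el7.x86_64",
        "network_suite_master.el7.x86_64",
    ],
    "build 4": [
        "basic_suite_master.el7.x86_64",
        "upgrade-from-release_suite_4.3.el7.x86_64",
    ],
}


def split_job(job):
    """Split a job name into (suite, distro, arch).

    Assumes the name ends with '.<distro>.<arch>'; splitting from the
    right keeps dots inside the suite name (e.g. '4.3') intact.
    """
    suite, distro, arch = job.rsplit(".", 2)
    return suite, distro, arch


def failures_by_distro(failed):
    """Count failed jobs per distro across all builds."""
    counts = Counter()
    for jobs in failed.values():
        for job in jobs:
            counts[split_job(job)[1]] += 1
    return counts


def failures_by_suite(failed):
    """Count failed jobs per suite across all builds."""
    counts = Counter()
    for jobs in failed.values():
        for job in jobs:
            counts[split_job(job)[0]] += 1
    return counts


if __name__ == "__main__":
    print(dict(failures_by_distro(FAILED)))
    print(failures_by_suite(FAILED).most_common())
```

All 8 failures listed here are el7 jobs. Note that the tally counts failures only; it cannot tell "only el7 jobs fail" from "only el7 jobs were run", so it needs the list of passing jobs to be conclusive.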
The basic master suite never failed when I ran it locally, even in a nested environment. But maybe I did not try enough; I did 10 runs.
With the current state OST CI is not useful to anyone. Builds take hours and fail randomly. This wastes our limited resources for other projects and makes contribution to this project very hard.
+1
_______________________________________________ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-leave@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/JBA4FBJN2N6MMH...
-- Didi