CI slaves extremely slow - overloaded slaves?

Barak Korren bkorren at redhat.com
Thu Dec 8 07:11:11 UTC 2016


This needs more discussion and tracking, so I am moving it to Jira. Please put
any further discussion there:

https://ovirt-jira.atlassian.net/browse/OVIRT-919

On 7 December 2016 at 21:33, Nir Soffer <nsoffer at redhat.com> wrote:
> Hi all,
>
> In the last few weeks we have been seeing more and more test failures due to
> timeouts in the CI.
>
> For example:
>
> 17:19:49 ======================================================================
> 17:19:49 FAIL: test_scale (storage_filesd_test.GetAllVolumesTests)
> 17:19:49 ----------------------------------------------------------------------
> 17:19:49 Traceback (most recent call last):
> 17:19:49   File "/home/jenkins/workspace/vdsm_master_check-patch-fc24-x86_64/vdsm/tests/storage_filesd_test.py", line 165, in test_scale
> 17:19:49     self.assertTrue(elapsed < 1.0, "Elapsed time: %f seconds" % elapsed)
> 17:19:49 AssertionError: Elapsed time: 1.105877 seconds
> 17:19:49 -------------------- >> begin captured stdout << ---------------------
> 17:19:49 1.105877 seconds
>
> This test runs in 0.048 seconds on my laptop:
>
> $ ./run_tests_local.sh storage_filesd_test.py -s
> nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
> storage_filesd_test.GetAllVolumesTests
>     test_no_templates                                           OK
>     test_no_volumes                                             OK
>     test_scale                                                  0.047932 seconds
> OK
>     test_with_template                                          OK
>
> ----------------------------------------------------------------------
> Ran 4 tests in 0.189s
>
> It seems that we are overloading the CI slaves. We should not use nested KVM
> for the CI; such VMs are much slower than regular VMs, and we probably run
> too many VMs per CPU.
>
> We can disable such tests in the CI, but we do want to know when there is
> a regression in this code. Before it was fixed, the same test took 9 seconds
> on my laptop. We need fast machines in the CI for this.
>
> Nir
> _______________________________________________
> Infra mailing list
> Infra at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra
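
For context, the failing check in the quoted traceback is a wall-clock
assertion, which is why it is sensitive to load on the slave. Below is a
minimal sketch of that pattern, not the actual vdsm test: scan_volumes() and
the ScaleTest class are hypothetical stand-ins, and time.monotonic() assumes
Python 3. Only the `elapsed < 1.0` assertion and its message are taken from
the traceback.

```python
import time
import unittest


def scan_volumes(count):
    # Hypothetical stand-in for the code under test; the real test
    # exercises vdsm storage code that is not shown in this thread.
    return [i for i in range(count)]


class ScaleTest(unittest.TestCase):

    def test_scale(self):
        # Measure wall-clock time around the operation.
        start = time.monotonic()
        result = scan_volumes(10000)
        elapsed = time.monotonic() - start
        print("%f seconds" % elapsed)

        self.assertEqual(len(result), 10000)
        # Same shape as the assertion in the traceback: a fixed 1.0 second
        # budget, so a slow or overloaded machine fails the test even when
        # the code itself has not regressed.
        self.assertTrue(elapsed < 1.0, "Elapsed time: %f seconds" % elapsed)
```

Saved as a module, this can be run with `python -m unittest <module>`. The
fixed budget is what lets the test catch a real regression (9 seconds before
the fix), and also what makes it fail on a slave that is merely slow.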



-- 
Barak Korren
bkorren at redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/


