CI slaves extremely slow - overloaded slaves?
Nir Soffer
nsoffer at redhat.com
Wed Dec 7 19:33:08 UTC 2016
Hi all,
In recent weeks we have been seeing more and more test failures caused by timeouts in the CI.
For example:
17:19:49 ======================================================================
17:19:49 FAIL: test_scale (storage_filesd_test.GetAllVolumesTests)
17:19:49 ----------------------------------------------------------------------
17:19:49 Traceback (most recent call last):
17:19:49 File "/home/jenkins/workspace/vdsm_master_check-patch-fc24-x86_64/vdsm/tests/storage_filesd_test.py", line 165, in test_scale
17:19:49 self.assertTrue(elapsed < 1.0, "Elapsed time: %f seconds" % elapsed)
17:19:49 AssertionError: Elapsed time: 1.105877 seconds
17:19:49 -------------------- >> begin captured stdout << ---------------------
17:19:49 1.105877 seconds
This test runs in 0.048 seconds on my laptop:
$ ./run_tests_local.sh storage_filesd_test.py -s
nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
storage_filesd_test.GetAllVolumesTests
test_no_templates OK
test_no_volumes OK
test_scale 0.047932 seconds
OK
test_with_template OK
----------------------------------------------------------------------
Ran 4 tests in 0.189s
It seems that we are overloading the CI slaves. We should not use nested KVM
for the CI; such VMs are much slower than regular VMs, and we probably run
too many VMs per CPU.
We can disable such tests in the CI, but we do want to know when there is
a regression in this code. Before it was fixed, the same test took 9 seconds
on my laptop. We need fast machines in the CI for this.
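For reference, the failing check follows the usual pattern of timing an operation and asserting an upper bound on the elapsed wall-clock time. Here is a minimal sketch of that pattern; scan_volumes and ScaleTest are hypothetical stand-ins for illustration, not the actual vdsm code:

```python
import time
import unittest


def scan_volumes(count):
    # Hypothetical stand-in for the real workload; not the vdsm code.
    return [("volume-%d" % i, i) for i in range(count)]


class ScaleTest(unittest.TestCase):

    def test_scale(self):
        # Measure wall-clock time around the operation under test.
        start = time.monotonic()
        volumes = scan_volumes(1000)
        elapsed = time.monotonic() - start
        print("%f seconds" % elapsed)
        self.assertEqual(len(volumes), 1000)
        # Same style of bound as the failing test: under 1 second.
        self.assertTrue(elapsed < 1.0, "Elapsed time: %f seconds" % elapsed)
```

Because the bound is on wall-clock time, the result depends on how loaded the machine is, so the same test can pass on a laptop and fail on a busy slave.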
Nir