Hi,
_TL;DR_ Let's cut the running time of '008_basic_ui_sanity.py' by more
than 3 minutes by sacrificing Firefox and Chrome screenshots in favor of
Chromium.
During the OST hackathon in Brno this year, I saw an opportunity to
optimize the basic UI sanity tests from the basic suite.
The way we currently run them is by setting up a Selenium grid using 3
docker containers, with a dedicated network... that's insanity! (pun
intended).
Let's take a look at the running time of the '008_basic_ui_sanity.py' scenario
(
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-te...):
01:31:50 @ Run test: 008_basic_ui_sanity.py:
01:31:50 nose.config: INFO: Ignoring files matching ['^\\.', '^_',
'^setup\\.py$']
01:31:50 # init:
01:31:50 # init: Success (in 0:00:00)
01:31:50 # start_grid:
01:34:05 # start_grid: Success (in 0:02:15)
01:34:05 # initialize_chrome:
01:34:18 # initialize_chrome: Success (in 0:00:13)
01:34:18 # login:
01:34:27 # login: Success (in 0:00:08)
01:34:27 # left_nav:
01:34:45 # left_nav: Success (in 0:00:18)
01:34:45 # close_driver:
01:34:46 # close_driver: Success (in 0:00:00)
01:34:46 # initialize_firefox:
01:35:02 # initialize_firefox: Success (in 0:00:16)
01:35:02 # login:
01:35:11 # login: Success (in 0:00:08)
01:35:11 # left_nav:
01:35:29 # left_nav: Success (in 0:00:18)
01:35:29 # cleanup:
01:35:36 # cleanup: Success (in 0:00:06)
01:35:36 # Results located at
/dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
01:35:36 @ Run test: 008_basic_ui_sanity.py: Success (in 0:03:45)
Starting the Selenium grid takes 2:15 out of 3:45 of total running time!
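For anyone who wants to double-check the per-step numbers, here is a minimal sketch that pulls the durations out of log lines in the format shown above. The helper name 'step_durations' and the shortened sample log are mine, for illustration only; they are not part of OST.

```python
import re
from datetime import timedelta

# Matches finished steps such as "# start_grid: Success (in 0:02:15)".
# The "@ Run test: ..." summary line has no "# " prefix, so it is skipped.
STEP_RE = re.compile(r"# (\w+): Success \(in (\d+):(\d{2}):(\d{2})\)")


def step_durations(log_text):
    """Return a list of (step name, duration) for every finished step."""
    steps = []
    for name, hours, minutes, seconds in STEP_RE.findall(log_text):
        duration = timedelta(
            hours=int(hours), minutes=int(minutes), seconds=int(seconds)
        )
        steps.append((name, duration))
    return steps


# A few lines copied from the log above (hypothetical shortened sample).
SAMPLE_LOG = """\
01:34:05 # start_grid: Success (in 0:02:15)
01:34:18 # initialize_chrome: Success (in 0:00:13)
01:35:36 @ Run test: 008_basic_ui_sanity.py: Success (in 0:03:45)
"""

if __name__ == "__main__":
    total = timedelta(minutes=3, seconds=45)  # from the "Run test" line
    for name, duration in step_durations(SAMPLE_LOG):
        share = duration / total  # timedelta / timedelta gives a float
        print(f"{name}: {duration} ({share:.0%} of the total)")
```

A list of tuples is used rather than a dict because step names such as 'login' and 'left_nav' appear twice in the full log.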
I've investigated a lot of approaches and came up with something like this:
* install 'chromium-headless' package on engine VM
* download 'chromedriver' and 'selenium hub' jar and deploy them in
'/var/opt/' on engine's VM
* run 'selenium.jar' on engine VM from '008_basic_ui_sanity.py' by
using Lago's ssh
* connect to the Selenium instance running on the engine in
'008_basic_ui_sanity.py'
* make screenshots
This series of patches implements the changes:
https://gerrit.ovirt.org/#/q/topic:selenium-on-engine+(status:open+OR+sta....
This is the new running time (
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4195/):
20:13:26 @ Run test: 008_basic_ui_sanity.py:
20:13:26 nose.config: INFO: Ignoring files matching ['^\\.', '^_',
'^setup\\.py$']
20:13:26 # init:
20:13:26 # init: Success (in 0:00:00)
20:13:26 # make_screenshots:
20:13:27 * Retrying (Retry(total=2, connect=None, read=None,
redirect=None, status=None)) after connection broken by
'NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7fdb6004f8d0>: Failed to establish a new connection: [Errno 111]
Connection refused',)': /wd/hub
20:13:27 * Retrying (Retry(total=1, connect=None, read=None,
redirect=None, status=None)) after connection broken by
'NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7fdb6004fa10>: Failed to establish a new connection: [Errno 111]
Connection refused',)': /wd/hub
20:13:27 * Retrying (Retry(total=0, connect=None, read=None,
redirect=None, status=None)) after connection broken by
'NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7fdb6004fb50>: Failed to establish a new connection: [Errno 111]
Connection refused',)': /wd/hub
20:13:28 * Redirecting
http://192.168.201.4:4444/wd/hub ->
http://192.168.201.4:4444/wd/hub/static/resource/hub.html
20:14:02 # make_screenshots: Success (in 0:00:35)
20:14:02 # Results located at
/dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
20:14:02 @ Run test: 008_basic_ui_sanity.py: Success (in 0:00:35)
(The 'NewConnectionError' retries are the client waiting for the
Selenium hub to be up and running; I can silence these later.)
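One possible way to silence them is to wait for the hub's TCP port before creating the webdriver. This is only a sketch under my own assumptions: 'wait_for_port' is a hypothetical helper, not something in the patches, and it only checks that the port accepts connections, not that the hub is fully initialized.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

With the address from the log above, the test would call something like 'wait_for_port("192.168.201.4", 4444)' and connect to the hub only once it returns True.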
And the screenshots are here:
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-te...
_The pros:_
* we cut the running time by more than 3 minutes
_The cons:_
* we don't get Firefox or Chrome screenshots - we get Chromium
screenshots (although AFAIK, QE has many more Selenium tests that
cover both Firefox and Chrome)
* we pollute the engine VM with the 'chromium-headless' package and its
deps (in total: 'chromium-headless', 'chromium-common', 'flac-libs' and
'minizip'), although we can remove these after the tests
_Some design choices explained:_
Q: Why engine VM?
A: Because the engine VM already has 'X11' libs. We could install
'chromium-headless' (and even other browsers) on our Jenkins executors,
but that would mess them up a lot.
Q: Why Chromium?
A: Because it has a separate 'headless' package.
Q: Why not use the 'chromedriver' RPM instead of the
https://chromedriver.storage.googleapis.com Chromedriver builds?
A: Because the RPM version pulls a lot of extra dependencies even on the
engine VM ('gtk3', 'cairo' etc.). Builds from the URL are the official
Google Chromedriver builds, they contain a single binary, and they work
for us.
_What still needs to be polished with the patches:_
* Currently the 'setup_engine_selenium.sh' script downloads
'selenium.jar' and 'chromedriver.zip' on every run (even with these
downloads we get much faster set-up times) - we should bake these into
the engine VM image template.
* The 'selenium_hub_running' function in 'selenium_on_engine.py' is
hackish - the ability to run an ssh command with a context manager
(and auto-terminate it on exit) should be part of Lago. Can be
refactored.
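To illustrate the shape I have in mind for that context manager, here is a rough sketch using a plain local subprocess. It does not use Lago's ssh API, and 'running_command' is a name I made up. A real version would go through Lago, and whether the remote process also stops when the local ssh client is terminated depends on how ssh is invoked.

```python
import contextlib
import subprocess


@contextlib.contextmanager
def running_command(argv, grace=5.0):
    """Start `argv` in the background and make sure it is stopped on exit."""
    proc = subprocess.Popen(argv)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # still running when the block ends
            proc.terminate()
            try:
                proc.wait(timeout=grace)
            except subprocess.TimeoutExpired:
                # The process ignored SIGTERM, so force it to stop.
                proc.kill()
                proc.wait()
```

The test body would then sit inside a 'with running_command([...]):' block, and the hub would be stopped even if making the screenshots raised an exception.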
Questions, comments, reviews are welcome.
Regards, Marcin