Hi,

On 3/4/19 7:07 PM, Greg Sheremeta wrote:
Hi,

Thanks for trying to improve the tests!

I'm reluctant to give up Firefox sanity tests on every commit, though. In fact, I wanted to add Edge and Safari, because those are also supported browsers. Just today a Firefox-only issue was reported, so these tests are valuable.

Was the Firefox-only issue detected by basic suite or some other tests?


Did you consider either leaving a grid up permanently or perhaps using a third party like saucelabs?
I did consider simply having our own grid for the OST.
There's even a thread somewhere on ovirt-devel, where someone found OST trying to connect to one of my VMs in Tel Aviv, where my own grid was running :D
I couldn't make a public demo though - OST executors couldn't see my VM in tlv.

This approach has 2 big flaws:

The way I see it, basic suite's UI sanity tests are exactly what they're called - sanity tests.
We do trivial checks like "can we log in to the webadmin site", "can we go to 'virtual machines' sub-page".
I'm not in favor of dropping these completely - I think they make sense, but I also think we can live with a trimmed-down version that saves a lot of time.
As I said - AFAIK QE have their own Selenium grid, where they run more complex tests on the UI.

Regards, Marcin



Best wishes, 
Greg 

On Mon, Mar 4, 2019, 11:39 AM Marcin Sobczyk <msobczyk@redhat.com> wrote:

Hi,

TL;DR: Let's cut the running time of '008_basic_ui_sanity.py' by more than 3 minutes by sacrificing Firefox and Chrome screenshots in favor of Chromium.

During the OST hackathon in Brno this year, I saw an opportunity to optimize basic UI sanity tests from basic suite.
We currently run them by setting up a Selenium grid using 3 docker containers with a dedicated network... that's insanity! (pun intended).
Let's take a look at the running time of the '008_basic_ui_sanity.py' scenario (https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4197/):

01:31:50 @ Run test: 008_basic_ui_sanity.py:
01:31:50 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
01:31:50   # init:
01:31:50   # init: Success (in 0:00:00)
01:31:50   # start_grid:
01:34:05   # start_grid: Success (in 0:02:15)
01:34:05   # initialize_chrome:
01:34:18   # initialize_chrome: Success (in 0:00:13)
01:34:18   # login:
01:34:27   # login: Success (in 0:00:08)
01:34:27   # left_nav:
01:34:45   # left_nav: Success (in 0:00:18)
01:34:45   # close_driver:
01:34:46   # close_driver: Success (in 0:00:00)
01:34:46   # initialize_firefox:
01:35:02   # initialize_firefox: Success (in 0:00:16)
01:35:02   # login:
01:35:11   # login: Success (in 0:00:08)
01:35:11   # left_nav:
01:35:29   # left_nav: Success (in 0:00:18)
01:35:29   # cleanup:
01:35:36   # cleanup: Success (in 0:00:06)
01:35:36   # Results located at /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
01:35:36 @ Run test: 008_basic_ui_sanity.py: Success (in 0:03:45)

Starting the Selenium grid takes 2:15 out of 3:45 of total running time!
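
To make that ratio concrete, here is a minimal sketch that pulls the per-step durations out of a log excerpt like the one above and computes the share spent in 'start_grid' against the 0:03:45 total reported in the last log line. The regular expression and the function names are made up for illustration; they are not part of OST.

```python
import re
from datetime import timedelta

# Matches lines such as "01:34:05   # start_grid: Success (in 0:02:15)".
# (Hypothetical helper, written only for this illustration.)
STEP_RE = re.compile(r"# (\w+): Success \(in (\d+):(\d+):(\d+)\)")


def step_durations(log_text):
    """Return a list of (step_name, timedelta) pairs in log order."""
    steps = []
    for name, h, m, s in STEP_RE.findall(log_text):
        steps.append(
            (name, timedelta(hours=int(h), minutes=int(m), seconds=int(s)))
        )
    return steps


def share_of_total(steps, step_name, total):
    """Fraction of 'total' spent in all steps called 'step_name'."""
    spent = sum((d for n, d in steps if n == step_name), timedelta())
    return spent / total


# A few lines copied from the log above.
LOG = """\
01:31:50   # init: Success (in 0:00:00)
01:34:05   # start_grid: Success (in 0:02:15)
01:34:18   # initialize_chrome: Success (in 0:00:13)
01:34:27   # login: Success (in 0:00:08)
01:34:45   # left_nav: Success (in 0:00:18)
"""

TOTAL = timedelta(minutes=3, seconds=45)  # from the final "@ Run test" line
steps = step_durations(LOG)
grid_share = share_of_total(steps, "start_grid", TOTAL)
print("start_grid share of total: {:.0%}".format(grid_share))
```

With the numbers from the log, 'start_grid' accounts for 135 of 225 seconds, i.e. 60% of the scenario.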

I've investigated a lot of approaches and came up with something like this:

  • install 'chromium-headless' package on engine VM
  • download 'chromedriver' and 'selenium hub' jar and deploy them in '/var/opt/' on engine's VM
  • run 'selenium.jar' on engine VM from '008_basic_ui_sanity.py' by using Lago's ssh
  • connect to the Selenium instance running on the engine in '008_basic_ui_sanity.py'
  • make screenshots
This series of patches represents the changes: https://gerrit.ovirt.org/#/q/topic:selenium-on-engine+(status:open+OR+status:merged).
This is the new running time (https://jenkins.ovirt.org/view/oVirt system tests/job/ovirt-system-tests_manual/4195/):

20:13:26 @ Run test: 008_basic_ui_sanity.py:
20:13:26 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
20:13:26   # init:
20:13:26   # init: Success (in 0:00:00)
20:13:26   # make_screenshots:
20:13:27     * Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdb6004f8d0>: Failed to establish a new connection: [Errno 111] Connection refused',)': /wd/hub
20:13:27     * Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdb6004fa10>: Failed to establish a new connection: [Errno 111] Connection refused',)': /wd/hub
20:13:27     * Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdb6004fb50>: Failed to establish a new connection: [Errno 111] Connection refused',)': /wd/hub
20:13:28     * Redirecting http://192.168.201.4:4444/wd/hub -> http://192.168.201.4:4444/wd/hub/static/resource/hub.html
20:14:02   # make_screenshots: Success (in 0:00:35)
20:14:02   # Results located at /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
20:14:02 @ Run test: 008_basic_ui_sanity.py: Success (in 0:00:35)

(The 'NewConnectionError' retries come from waiting for the Selenium hub to be up and running; I can silence these later.)
And the screenshots are here: https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4195/artifact/exported-artifacts/screenshots/
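
One possible way to avoid the retry noise would be to poll the hub's TCP port before creating the driver. The sketch below uses only the standard library; 'wait_for_port' and its default timeouts are assumptions for illustration and are not taken from the patches. Note that an open port only shows that something is listening, not that the hub is fully ready to serve sessions.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connect to host:port succeeds.

    Returns False if no connection could be made within 'timeout' seconds.
    (Hypothetical helper; name and defaults are illustrative.)
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example call, using the hub address seen in the log above:
# wait_for_port("192.168.201.4", 4444)
```

A caller would create the remote driver only after this returns True, so the 'Connection refused' retries never reach the test output.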

The pros:
  • we cut the running time by more than 3 minutes
The cons:
  • we don't get Firefox or Chrome screenshots - we get Chromium screenshots (although AFAIK, QE has many more Selenium tests, which cover both Firefox and Chrome)
  • we pollute the engine VM with the 'chromium-headless' package and its deps (in total: 'chromium-headless', 'chromium-common', 'flac-libs' and 'minizip'), although we can remove these after the tests

Some design choices explained:

Q: Why engine VM?

A: Because the engine VM already has 'X11' libs. We could install 'chromium-headless' (and even other browsers) on our Jenkins executors, but that would mess them up a lot.

Q: Why Chromium?

A: Because it has a separate 'headless' package.

Q: Why not use the 'chromedriver' RPM instead of the https://chromedriver.storage.googleapis.com Chromedriver builds?

A: Because the RPM version pulls in a lot of extra dependencies even on the engine VM ('gtk3', 'cairo' etc.). Builds from the URL are the official Google Chromedriver builds, they contain a single binary, and they work for us.

What still needs to be polished with the patches:

  • Currently the 'setup_engine_selenium.sh' script downloads 'selenium.jar' and 'chromedriver.zip' on every run (even with these downloads we get much faster set-up times) - we should bake these into the engine VM image template.
  • The 'selenium_hub_running' function in 'selenium_on_engine.py' is hackish - the ability to run an ssh command under a context manager (and auto-terminate it on exit) should be part of Lago. Can be refactored.
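
To illustrate the context-manager idea from the last point: the sketch below starts a long-running command and terminates it when the 'with' block exits. It uses a local 'subprocess.Popen' as a stand-in, since how this would map onto Lago's ssh layer is left open here; the name 'running' and the 'stop_timeout' parameter are hypothetical.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def running(cmd, stop_timeout=10):
    """Run 'cmd' in the background and stop it when the block exits.

    Local stand-in for "run an ssh command with a context manager";
    not Lago's API.
    """
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # still running, so stop it
            proc.terminate()
            try:
                proc.wait(timeout=stop_timeout)
            except subprocess.TimeoutExpired:
                # Did not react to SIGTERM in time: force it.
                proc.kill()
                proc.wait()


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    with running(sleeper) as proc:
        print("running inside block:", proc.poll() is None)
    print("stopped after block:", proc.poll() is not None)
```

With something like this available for remote commands, the test would wrap the Selenium hub start-up in a 'with' block and no longer need to clean up the process by hand.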

Questions, comments, reviews are welcome.

Regards, Marcin



_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/RLB2KSNJS4YKVMCDUUHOZJWBQDGJCXGZ/