[Ovirt] [CQ weekly status] [10-08-2019]
by Dusan Fodor
Hi,
This mail is to provide the current status of CQ and allow people to review
status before and after the weekend.
Please refer to the colour map below for further information on the meaning of
the colours.
*CQ-4.2*: GREEN (#3)
No failures during this week.
*CQ-4.3*: RED (#1)
Last failure was on 10-08 for ovirt-ansible-image-template in
verify_glance_import.
This issue is ongoing; investigation will continue under the thread [OST
Failure Report] [oVirt master&4.3] [09-08-2019] [verify_glance_import]
*CQ-Master:* RED (#1)
Last failure was on 10-08 for ovirt-ansible-image-template in
verify_glance_import.
This issue is ongoing; investigation will continue under the thread [OST
Failure Report] [oVirt master&4.3] [09-08-2019] [verify_glance_import]
Current running jobs for 4.2 [1], 4.3 [2] and master [3] can be found
here:
[1] http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.2_change-queue-tester/
[2] https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.3_change-queue-tester/
[3] http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/
Have a nice week!
Dusan
-------------------------------------------------------------------------------------------------------------------
COLOUR MAP
Green = job has been passing successfully
** green for more than 3 days may suggest we need a review of our test
coverage
1. 1-3 days GREEN (#1)
2. 4-7 days GREEN (#2)
3. Over 7 days GREEN (#3)
Yellow = intermittent failures for different projects but no lasting or
current regressions
** intermittent failures indicate a healthy project, as we expect a number of
failures during the week
** I will not report any of the solved failures or regressions.
1. Solved job failures YELLOW (#1)
2. Solved regressions YELLOW (#2)
Red = job has been failing
** Active Failures. The colour will change based on the amount of time the
project/s has been broken. Only active regressions would be reported.
1. 1-3 days RED (#1)
2. 4-7 days RED (#2)
3. Over 7 days RED (#3)
Ovirt Manager - FATAL: the database system is in recovery mode
by Sameer Sardar
PACKAGE_NAME="ovirt-engine"
PACKAGE_VERSION="3.6.2.6"
PACKAGE_DISPLAY_VERSION="3.6.2.6-1.el6"
OPERATING SYSTEM="CentOS 6.7"
It has been running error-free for over 3 years now. Over the past few weeks, the oVirt Manager application has been frequently losing connection with its own database, which resides on the same server. It throws errors saying: “Caused by org.postgresql.util.PSQLException: FATAL: the database system is in recovery mode”. The only way we have found to recover from this is to hard-reboot the server, after which it works fine for a couple of days before throwing these errors again.
Here are some logs:
Caused by: org.postgresql.util.PSQLException: FATAL: the database system is in recovery mode
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:293)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:32)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:322)
OST's basic suite UI sanity tests optimization
by Marcin Sobczyk
Hi,
_TL;DR_ Let's cut the running time of '008_basic_ui_sanity.py' by more
than 3 minutes by sacrificing Firefox and Chrome screenshots in favor of
Chromium.
During the OST hackathon in Brno this year, I saw an opportunity to
optimize basic UI sanity tests from basic suite.
The way we currently run them is by setting up a Selenium grid using 3
docker containers, with a dedicated network... that's insanity! (pun
intended).
Let's take a look at the running time of the '008_basic_ui_sanity.py' scenario
(https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-te...):
01:31:50 @ Run test: 008_basic_ui_sanity.py:
01:31:50 nose.config: INFO: Ignoring files matching ['^\\.', '^_',
'^setup\\.py$']
01:31:50 # init:
01:31:50 # init: Success (in 0:00:00)
01:31:50 # start_grid:
01:34:05 # start_grid: Success (in 0:02:15)
01:34:05 # initialize_chrome:
01:34:18 # initialize_chrome: Success (in 0:00:13)
01:34:18 # login:
01:34:27 # login: Success (in 0:00:08)
01:34:27 # left_nav:
01:34:45 # left_nav: Success (in 0:00:18)
01:34:45 # close_driver:
01:34:46 # close_driver: Success (in 0:00:00)
01:34:46 # initialize_firefox:
01:35:02 # initialize_firefox: Success (in 0:00:16)
01:35:02 # login:
01:35:11 # login: Success (in 0:00:08)
01:35:11 # left_nav:
01:35:29 # left_nav: Success (in 0:00:18)
01:35:29 # cleanup:
01:35:36 # cleanup: Success (in 0:00:06)
01:35:36 # Results located at
/dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
01:35:36 @ Run test: 008_basic_ui_sanity.py: Success (in 0:03:45)
Starting the Selenium grid takes 2:15 out of the 3:45 total running time!
I've investigated a lot of approaches and came up with something like this (a small connection sketch follows the list):
* install 'chromium-headless' package on engine VM
* download 'chromedriver' and 'selenium hub' jar and deploy them in
'/var/opt/' on engine's VM
* run 'selenium.jar' on engine VM from '008_basic_ui_sanity.py' by
using Lago's ssh
* connect to the Selenium instance running on the engine in
'008_basic_ui_sanity.py'
* make screenshots
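For illustration only, here is a minimal sketch of what the test-side connection could look like once the hub runs on the engine VM. This is not the code from the patches; the hub address, browser options and webadmin URL are assumptions.

# A minimal sketch, not the actual OST code: connect to a Selenium hub running
# on the engine VM and drive headless Chromium through it. The hub address and
# the webadmin URL below are hypothetical.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

ENGINE_HUB_URL = 'http://192.168.201.4:4444/wd/hub'  # hypothetical engine address

options = Options()
options.add_argument('--headless')    # chromium-headless has no display to use
options.add_argument('--no-sandbox')  # often needed when Chromium runs as root

driver = webdriver.Remote(command_executor=ENGINE_HUB_URL, options=options)
try:
    driver.get('https://engine-fqdn/ovirt-engine/webadmin')  # hypothetical URL
    driver.save_screenshot('webadmin.png')
finally:
    driver.quit()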
This series of patches represents the changes:
https://gerrit.ovirt.org/#/q/topic:selenium-on-engine+(status:open+OR+sta....
This is the new running time (https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4195/):
20:13:26 @ Run test: 008_basic_ui_sanity.py:
20:13:26 nose.config: INFO: Ignoring files matching ['^\\.', '^_',
'^setup\\.py$']
20:13:26 # init:
20:13:26 # init: Success (in 0:00:00)
20:13:26 # make_screenshots:
20:13:27 * Retrying (Retry(total=2, connect=None, read=None,
redirect=None, status=None)) after connection broken by
'NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7fdb6004f8d0>: Failed to establish a new connection: [Errno 111]
Connection refused',)': /wd/hub
20:13:27 * Retrying (Retry(total=1, connect=None, read=None,
redirect=None, status=None)) after connection broken by
'NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7fdb6004fa10>: Failed to establish a new connection: [Errno 111]
Connection refused',)': /wd/hub
20:13:27 * Retrying (Retry(total=0, connect=None, read=None,
redirect=None, status=None)) after connection broken by
'NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7fdb6004fb50>: Failed to establish a new connection: [Errno 111]
Connection refused',)': /wd/hub
20:13:28 * Redirecting http://192.168.201.4:4444/wd/hub ->
http://192.168.201.4:4444/wd/hub/static/resource/hub.html
20:14:02 # make_screenshots: Success (in 0:00:35)
20:14:02 # Results located at
/dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
20:14:02 @ Run test: 008_basic_ui_sanity.py: Success (in 0:00:35)
(The 'NewConnectionError' retries are just waiting for the Selenium hub to be
up and running; I can silence these later - a possible approach is sketched below.)
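The sketch below is an assumption on my side, not part of the patches: poll the hub over plain HTTP until it answers before creating the Remote driver.

# A possible way to silence the retries: wait for the hub to answer over HTTP
# before creating the Remote driver. The URL and timeout values are assumptions.
import time
import requests

def wait_for_selenium_hub(hub_url, timeout=60, interval=1):
    """Return once the hub responds; raise if it does not come up in time."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            requests.get(hub_url + '/status', timeout=5)
            return
        except requests.ConnectionError:
            time.sleep(interval)
    raise RuntimeError('Selenium hub at %s did not come up in time' % hub_url)

# Hypothetical usage:
# wait_for_selenium_hub('http://192.168.201.4:4444/wd/hub')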
And the screenshots are here:
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-te...
_The pros:_
* we cut the running time by more than 3 minutes
_The cons:_
* we don't get Firefox or Chrome screenshots - we get Chromium
screenshots (although AFAIK, QE has much more Selenium tests which
cover both Firefox and Chrome)
* we pollute the engine VM with the 'chromium-headless' package and deps
(in total: 'chromium-headless', 'chromium-common', 'flac-libs' and
'minizip'), although we can remove these after the tests
_Some design choices explained:_
Q: Why engine VM?
A: Because the engine VM already has 'X11' libs. We could install
'chromium-headless' (and even other browsers) on our Jenkins executors,
but that would mess them up a lot.
Q: Why Chromium?
A: Because it has a separate 'headless' package.
Q: Why not use the 'chromedriver' RPM instead of the Chromedriver builds from
https://chromedriver.storage.googleapis.com?
A: Because the RPM version pulls in a lot of extra dependencies even on the
engine VM ('gtk3', 'cairo' etc.). The builds from that URL are the official
Google Chromedriver builds, they contain a single binary, and they work
for us.
_What still needs to be polished with the patches:_
* Currently the 'setup_engine_selenium.sh' script downloads
'selenium.jar' and 'chromedriver.zip' every time (even with these
downloads we get much faster set-up times) - we should bake these into
the engine VM image template.
* The 'selenium_hub_running' function in 'selenium_on_engine.py' is
hackish - the ability to run an ssh command within a context manager
(and auto-terminate it when the block exits) should be part of Lago;
see the sketch below. This can be refactored.
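Purely as an illustration of that pattern (this is not Lago's API, and the names are made up), such a context manager could look like this:

# Not Lago's API - just a generic sketch of the "run a remote command for the
# duration of a with-block and terminate it on exit" pattern described above.
import subprocess
from contextlib import contextmanager

@contextmanager
def remote_process(host, command):
    """Start `command` on `host` over ssh and terminate it when the block exits."""
    proc = subprocess.Popen(['ssh', host, command])
    try:
        yield proc
    finally:
        proc.terminate()
        proc.wait()

# Hypothetical usage: keep the Selenium hub alive only while screenshots are taken.
# with remote_process('engine-vm', 'java -jar /var/opt/selenium.jar'):
#     make_screenshots()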
Questions, comments, reviews are welcome.
Regards, Marcin
Failure during OST run
by Andrej Cernek
Hi,
I am failing to run OST:
* [Thread-2] Deploy VM lago-network-suite-master-engine: ERROR (in
0:02:13)
# Deploy environment: ERROR (in 0:02:14)
@ Deploy oVirt environment: ERROR (in 0:02:14)
The error seems to be caused by an unavailable package:
Error: Package:
ovirt-engine-metrics-1.3.4-0.0.master.20190804083458.git68b317a.el7.noarch
(alocalsync)
Requires: ansible >= 2.8.3
Available: ansible-2.4.2.0-2.el7.noarch (extras)
ansible = 2.4.2.0-2.el7
Installing: ansible-2.8.2-1.el7.noarch (alocalsync)
ansible = 2.8.2-1.el7
Did anyone encounter similar issues?
Thanks,
Regards,
Andrej
[Ovirt] [CQ weekly status] [02-08-2019]
by Dusan Fodor
Hi,
This mail is to provide the current status of CQ and allow people to review
status before and after the weekend.
Please refer to the colour map below for further information on the meaning of
the colours.
*CQ-4.2*: RED (#1)
Last failure was on 01-08 for ovirt-ansible-hosted-engine-setup, caused by a
missing dependency; a patch is pending to fix this.
*CQ-4.3*: RED (#1)
Last failure was on 02-08 for vdsm, caused by a missing dependency; a patch is
pending to fix this.
*CQ-Master:* RED (#1)
Last failure was on 02-08 for ovirt-engine due to a failure in build-artifacts,
which was caused by a gerrit issue that was reported to Evgheni.
Current running jobs for 4.2 [1], 4.3 [2] and master [3] can be found
here:
[1]
http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.2_change-...
[2]
https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.3_change...
[3]
http://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_chan...
Have a nice week!
Dusan
-------------------------------------------------------------------------------------------------------------------
COLOUR MAP
Green = job has been passing successfully
** green for more than 3 days may suggest we need a review of our test
coverage
1. 1-3 days GREEN (#1)
2. 4-7 days GREEN (#2)
3. Over 7 days GREEN (#3)
Yellow = intermittent failures for different projects but no lasting or
current regressions
** intermittent failures indicate a healthy project, as we expect a number of
failures during the week
** I will not report any of the solved failures or regressions.
1. Solved job failures YELLOW (#1)
2. Solved regressions YELLOW (#2)
Red = job has been failing
** Active Failures. The colour will change based on the amount of time the
project/s has been broken. Only active regressions would be reported.
1. 1-3 days RED (#1)
2. 4-7 days RED (#2)
3. Over 7 days RED (#3)
Network tests failing randomly again: AssertionError: '192.168.98.1/29' unexpectedly found in ['192.168.99.1/29', '192.168.98.1/29', 'fe80::3c92:48ff:fecd:8366/64']
by Nir Soffer
This used to happen randomly in the past, and has started to happen again.
I can ignore this failure and merge, but this may fail the change queue.
Build:
https://jenkins.ovirt.org/job/vdsm_standard-check-patch/9520//artifact/ch...
======================================================================
FAIL: test_add_delete_ipv4 (network.ip_address_test.IPAddressTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/jenkins/workspace/vdsm_standard-check-patch/vdsm/tests/network/ip_address_test.py",
line 222, in test_add_delete_ipv4
self._test_add_delete(IPV4_A_WITH_PREFIXLEN, IPV4_B_WITH_PREFIXLEN)
File "/home/jenkins/workspace/vdsm_standard-check-patch/vdsm/tests/network/ip_address_test.py",
line 247, in _test_add_delete
self._assert_has_no_address(nic, ip_b)
File "/home/jenkins/workspace/vdsm_standard-check-patch/vdsm/tests/network/ip_address_test.py",
line 344, in _assert_has_no_address
self._assert_address_not_in(address_with_prefixlen, addresses)
File "/home/jenkins/workspace/vdsm_standard-check-patch/vdsm/tests/network/ip_address_test.py",
line 352, in _assert_address_not_in
self.assertNotIn(address_with_prefixlen, addresses_list)
AssertionError: '192.168.98.1/29' unexpectedly found in
['192.168.99.1/29', '192.168.98.1/29', 'fe80::3c92:48ff:fecd:8366/64']
-------------------- >> begin captured logging << --------------------
2019-08-05 12:27:49,718 DEBUG (MainThread) [root] /sbin/ip link add
name dummy_GmE1I type dummy (cwd None) (cmdutils:130)
2019-08-05 12:27:49,731 DEBUG (MainThread) [root] SUCCESS: <err> = '';
<rc> = 0 (cmdutils:138)
2019-08-05 12:27:49,733 DEBUG (netlink/events) [root] START thread
<Thread(netlink/events, started daemon 140511381804800)> (func=<bound
method Monitor._scan of <vdsm.network.netlink.monitor.Monitor object
at 0x7fcb7bbfd0d0>>, args=(), kwargs={}) (concurrent:193)
2019-08-05 12:27:49,734 DEBUG (MainThread) [root] /sbin/ip link set
dev dummy_GmE1I up (cwd None) (cmdutils:130)
2019-08-05 12:27:49,746 DEBUG (MainThread) [root] SUCCESS: <err> = '';
<rc> = 0 (cmdutils:138)
2019-08-05 12:27:49,749 DEBUG (netlink/events) [root] FINISH thread
<Thread(netlink/events, started daemon 140511381804800)>
(concurrent:196)
2019-08-05 12:27:49,755 DEBUG (MainThread) [root] /sbin/ip -4 addr add
dev dummy_GmE1I 192.168.99.1/29 (cwd None) (cmdutils:130)
2019-08-05 12:27:49,763 DEBUG (MainThread) [root] SUCCESS: <err> = '';
<rc> = 0 (cmdutils:138)
2019-08-05 12:27:49,767 DEBUG (MainThread) [root] /sbin/ip -4 addr add
dev dummy_GmE1I 192.168.98.1/29 (cwd None) (cmdutils:130)
2019-08-05 12:27:49,778 DEBUG (MainThread) [root] SUCCESS: <err> = '';
<rc> = 0 (cmdutils:138)
2019-08-05 12:27:49,785 DEBUG (MainThread) [root] /sbin/ip -4 addr del
dev dummy_GmE1I 192.168.98.1/29 (cwd None) (cmdutils:130)
2019-08-05 12:27:49,796 DEBUG (MainThread) [root] SUCCESS: <err> = '';
<rc> = 0 (cmdutils:138)
2019-08-05 12:27:49,803 DEBUG (MainThread) [root] /sbin/ip link del
dev dummy_GmE1I (cwd None) (cmdutils:130)
2019-08-05 12:27:49,824 DEBUG (MainThread) [root] SUCCESS: <err> = '';
<rc> = 0 (cmdutils:138)
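For what it's worth, the captured log shows '/sbin/ip -4 addr del ... 192.168.98.1/29' returning success right before the assertion, so the address may simply still be visible for a brief moment when the test reads the address list. Purely as a sketch (not the actual vdsm test code; 'get_addresses' is a placeholder for however the test reads the device's addresses), a short retry around the check would tolerate such a race:

# A sketch only - not the vdsm test code. get_addresses(nic) is a placeholder
# for whatever the test uses to read the device's addresses; the idea is just
# to retry the check briefly instead of asserting on a single read.
import time

def wait_for_address_removal(get_addresses, nic, address, timeout=2.0, interval=0.1):
    """Poll until `address` disappears from `nic`; raise after `timeout` seconds."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if address not in get_addresses(nic):
            return
        time.sleep(interval)
    raise AssertionError(
        '%s still present on %s after %s seconds' % (address, nic, timeout))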