Hello All.
We checked this with did, and the postgres connection error is not an actual error (although it prints a stack trace. Could we please not print stack traces for anything that we handle in code? It is really confusing when you need to find the root cause).
The test is checking for dwhd to be up using systemd:
testlib.assert_true_within_short(
    lambda: engine.service('ovirt-engine-dwhd').alive()
)
that runs:
/usr/bin/systemctl status --lines=0 ovirt-engine-dwhd
lago.ssh: DEBUG: Command 90e98548 on lago-basic-suite-master-engine returned with 3
lago.ssh: DEBUG: Command 90e98548 on lago-basic-suite-master-engine output:
● ovirt-engine-dwhd.service - oVirt Engine Data Warehouse
Loaded: loaded (/usr/lib/systemd/system/ovirt-engine-dwhd.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Tue 2017-01-17 07:33:23 EST; 3min 4s ago
Main PID: 22448 (code=exited, status=1/FAILURE)
CGroup: /system.slice/ovirt-engine-dwhd.service
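To make the status output above easier to interpret, here is a minimal sketch that pulls the state and sub-state out of the "Active:" line. The names parse_active and is_alive are hypothetical helpers for illustration only; this is not how lago implements alive(), and it assumes the output layout shown in the log above.

```python
import re

# Sample copied from the systemctl output quoted in this mail.
SAMPLE = """\
● ovirt-engine-dwhd.service - oVirt Engine Data Warehouse
Loaded: loaded (/usr/lib/systemd/system/ovirt-engine-dwhd.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Tue 2017-01-17 07:33:23 EST; 3min 4s ago
Main PID: 22448 (code=exited, status=1/FAILURE)
"""

# Matches e.g. "Active: activating (auto-restart) ..." and captures
# the state ("activating") and the sub-state ("auto-restart").
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)\s+\(([^)]*)\)", re.MULTILINE)


def parse_active(status_text):
    """Return (state, sub_state) from status text, or None if not found."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_alive(status_text):
    """Hypothetical check: treat only the 'active' state as alive."""
    parsed = parse_active(status_text)
    return parsed is not None and parsed[0] == "active"
```

With the output above, the state is "activating" with sub-state "auto-restart", so a check like this would report the service as not alive. That matches the test failing: systemd keeps restarting a process that exits with status 1.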
dwhd log [1] has the following error:
Error: Could not find or load main class ovirt_engine_dwh.historyetl_4_2.HistoryETL
so this looks to be the actual problem. The latest job that failed with this is [2]. This also affects 4.1, e.g. [3].
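For triaging similar failures, a small sketch that scans a dwhd log for this class-loading error might help. The function name missing_main_classes is hypothetical, and the pattern only assumes the message format quoted above.

```python
import re

# Matches the JVM message quoted above and captures the class name.
MAIN_CLASS_RE = re.compile(
    r"^Error: Could not find or load main class (\S+)", re.MULTILINE
)


def missing_main_classes(log_text):
    """Return the class names the JVM reported it could not load."""
    return MAIN_CLASS_RE.findall(log_text)
```

Running it over the contents of ovirt-engine-dwhd.log would list each main class that could not be loaded, which points at the root cause without reading through the handled stack traces.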
[1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/4791/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-001_initialize_engine.py/lago-basic-suite-master-engine/_var_log/ovirt-engine-dwh/ovirt-engine-dwhd.log