Hi,
We do watch those, and this one was reported by Dafna, though for some reason the devel
list was not included (usually we do include it). We strive to follow up on these daily,
but sometimes we lag behind.
It would be good to notify the owner of the patch that the system identifies (by
bisection) as a possible cause, but initially it was not set up like that. Sometimes the
reporting is misleading (e.g. changes in the external repos that we use are not visible
to bisection, and infra problems are something we fix ourselves). Still, I am OK with
trying to CC the patch owner, especially since we are working on gating as a long-term
solution, and IMO this is a step in the right direction.
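For illustration only, here is a minimal sketch of how the To/CC list could be derived from a Gerrit change. It assumes the JSON layout returned by Gerrit's REST endpoint GET /changes/{id}/detail (an "owner" account plus a "reviewers" map keyed by state); the function name is hypothetical, and this is not how the change queue currently builds its reports.

```python
import json

# Gerrit prefixes its JSON responses with this line to defeat XSSI; strip it first.
XSSI_PREFIX = ")]}'"


def recipients_from_change_detail(raw_body):
    """Return (to, cc) e-mail lists for a Gerrit change-detail response body.

    Assumption: the body follows the ChangeInfo layout of
    GET /changes/{id}/detail, i.e. an "owner" account and a "reviewers"
    map whose "REVIEWER" key holds a list of accounts.
    """
    text = raw_body.lstrip()
    if text.startswith(XSSI_PREFIX):
        text = text[len(XSSI_PREFIX):]
    change = json.loads(text)

    owner = change.get("owner", {}).get("email")
    to = [owner] if owner else []

    cc = []
    for account in change.get("reviewers", {}).get("REVIEWER", []):
        email = account.get("email")
        # Skip accounts without a visible e-mail, the owner, and duplicates.
        if email and email != owner and email not in cc:
            cc.append(email)
    return to, cc
```

Fed a response body for the change that bisection blamed, this would return the author as the To address and the reviewers as CC, which could then be added to the existing [CQ] failure mail.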
Anton.
On 17 Dec 2019, at 09:43, Yedidyah Bar David <didi(a)redhat.com> wrote:
On Tue, Dec 17, 2019 at 10:11 AM Yedidyah Bar David <didi(a)redhat.com> wrote:
>
> Hi all,
>
> $subject. [1] has
> ovirt-engine-4.4.0-0.0.master.20191204120550.git04d5d05.el7.noarch .
>
> Tried to look around, and I have a few notes/questions:
>
> 1. Last successful run of [2] is 3 days old, but apparently it wasn't
> published. Any idea why?
>
> 2. Failed runs of [2] are reported to infra, with emails such as:
>
> [CQ]: 105472, 5 (ovirt-engine) failed "ovirt-master" system tests, but
> isn't the failure root cause
>
> Is anyone monitoring these?
>
> Is this the only alerting that CI generates on such failures?
>
> If first is No and second is Yes, then we need someone/something to
> start monitoring. This was discussed a lot, but I do not see any
> change. Ideally, such alerts should be To'ed or Cc'ed to the author
> and reviewers of the patch that CI found to be guilty (which might be
> wrong, that's not the point). Do we plan to have something like this?
> Any idea when it will be ready?
>
> 3. I looked at a few recent failures of [2], specifically [3][4]. Both
> seem to have been killed after a timeout, while running
> 'engine-config'. For [3] that's clear, see [5]:
>
> 2019-12-16 17:11:44,766::log_utils.py::__exit__::611::lago.ssh::DEBUG::end
> task:fb6611dc-55bb-4251-aeda-2578b2ec83a2:Get ssh client for
> lago-basic-suite-master-engine:
> 2019-12-16 17:11:44,931::ssh.py::ssh::58::lago.ssh::DEBUG::Running
> 22e2b6b6 on lago-basic-suite-master-engine: engine-config --set
> VdsmUseNmstate=true
> 2019-12-16 19:55:21,965::cmd.py::exit_handler::921::cli::DEBUG::signal
> 15 was caught
>
> Can't find stdout/stderr of engine-config, so it's hard to tell if it
> outputted anything helpful to understand why it was stuck.
>
> It's hard to tell that about [4], because it has very few artifacts
> collected, no idea why, notably no lago.log, but [6] does show:
>
> # initialize_engine: Success (in 0:04:00)
> # engine_config:
> * Collect artifacts:
> - [Thread-34] lago-basic-suite-master-engine: ERROR (in 0:00:04)
> * Collect artifacts: ERROR (in 0:00:04)
> # engine_config: ERROR (in 2:42:57)
> /bin/bash: line 31: 5225 Killed
> ${_STDCI_TIMEOUT_CMD} "3h" "$script_path" < /dev/null
>
> If I run 'engine-config --set VdsmUseNmstate=true' on my
> 20191204120550.git04d5d05 engine, it returns quickly.
>
> Tried also adding a repo pointing at last successful run of [7], which
> is currently [8], and it prompts me to input a version, probably as a
> result of [9]. Ales/Martin, can you please have a look? Thanks.
Something like this might be enough, please take over:
https://gerrit.ovirt.org/105784
But the main point of my mail was the first two points.
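As a side note on point 3, the hang can be located mechanically by looking for the longest silence between consecutive timestamped lines in lago.log. The following is a rough sketch, not part of any existing tooling: the function name is made up, and the sample lines are the three log lines quoted above with their wrapped pieces joined back together.

```python
import re
from datetime import datetime

# Matches the leading timestamp of a lago.log line, e.g. "2019-12-16 17:11:44,931::".
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::")

# The three lines quoted in point 3, un-wrapped.
SAMPLE = [
    "2019-12-16 17:11:44,766::log_utils.py::__exit__::611::lago.ssh::DEBUG::end "
    "task:fb6611dc-55bb-4251-aeda-2578b2ec83a2:Get ssh client for "
    "lago-basic-suite-master-engine:",
    "2019-12-16 17:11:44,931::ssh.py::ssh::58::lago.ssh::DEBUG::Running "
    "22e2b6b6 on lago-basic-suite-master-engine: engine-config --set "
    "VdsmUseNmstate=true",
    "2019-12-16 19:55:21,965::cmd.py::exit_handler::921::cli::DEBUG::signal "
    "15 was caught",
]


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence in the log.

    Returns None if fewer than two timestamped lines are found.
    """
    previous = None  # (timestamp, line) of the last timestamped line seen
    best = None
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # continuation of a wrapped or multi-line entry
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if previous is not None:
            gap = stamp - previous[0]
            if best is None or gap > best[0]:
                best = (gap, previous[1], line)
        previous = (stamp, line)
    return best


if __name__ == "__main__":
    gap, before, after = largest_gap(SAMPLE)
    print(gap)
    print(before)
```

On the sample, the longest gap is a bit under 2 h 44 min and the line before it is the engine-config invocation, which matches the 2:42:57 that [6] reports for engine_config in the other run.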
>
> [1] https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/
> [2] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/
> [3] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17768/
> [4] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17761/
> [5] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17768/arti...
> [6] https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/17761/arti...
> [7] https://jenkins.ovirt.org/job/ovirt-engine_standard-on-merge/
> [8] https://jenkins.ovirt.org/job/ovirt-engine_standard-on-merge/384/
> [9] https://gerrit.ovirt.org/105440
> --
> Didi
--
Didi
_______________________________________________
Infra mailing list -- infra(a)ovirt.org