On Fri, 22 Mar 2019 at 12:42, Dan Kenigsberg <danken@redhat.com> wrote:

On Fri, 22 Mar 2019, 12:21 Sandro Bonazzola, <sbonazzo@redhat.com> wrote:

On Fri, 22 Mar 2019 at 11:14, Dan Kenigsberg <danken@redhat.com> wrote:

On Fri, 22 Mar 2019, 12:00 Sandro Bonazzola, <sbonazzo@redhat.com> wrote:

On Fri, 22 Mar 2019 at 10:52, Dan Kenigsberg <danken@redhat.com> wrote:
Yes, I'm repeating myself.
SKIPPING TESTS IS BAD

I agree. And having the suite fail on a broken test, skipping all the tests that follow, is even worse.
This is why I would prefer that the rest of the product be tested while someone takes ownership of the broken test and fixes it.

This is a good reason to rewrite OST with pytest, which continues on failure.

Patches are welcome :-)

This is not an empty gesture. The network suite came into being because of this issue (and others)

And a good reason to ping mperina on IRC to debug this. And a good reason not to merge new code.

It doesn't convince me that we should ignore the failure without due debugging.

Debugging is indeed needed, but not on a production system where it blocks the rest of the CI. The maintainer of the test can debug it in their own test environment.

The product of this system is bugs. We found one. If you skip it, we all risk it being forgotten. Skipping should be rare, and happen only after the owner is found and admits that he is too busy/lazy to fix it now, and files a bug to fix it later.
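
One way to keep a skipped test from being forgotten is to make the skip carry its reason and a bug reference, so every run reports it. The sketch below assumes Python's standard unittest module as a stand-in for the suite's runner; `BUG_REF` is a hypothetical placeholder, not a real bug number, and the test bodies are illustrative only.

```python
import io
import unittest

# Hypothetical placeholder; replace with the real tracker entry.
BUG_REF = "BUG-PLACEHOLDER"


class HostTests(unittest.TestCase):
    @unittest.skip(f"flaky, owner notified, tracked in {BUG_REF}")
    def test_get_host_hooks(self):
        # Not executed while the skip decorator is in place.
        self.fail("not executed while skipped")

    def test_other(self):
        # The rest of the suite keeps running.
        self.assertEqual(1 + 1, 2)


def run_and_report():
    """Run the tests quietly and return (test name, skip reason) pairs."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HostTests)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return [(test.id().rsplit(".", 1)[-1], reason)
            for test, reason in result.skipped]


skipped = run_and_report()
print(skipped)
```

The skip reason shows up in each run's results, which gives the owner and the bug a visible trail until the test is re-enabled.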

We didn't find a bug in the product we are testing; we found a bug in the test, and it still needs to be identified.

According to Dafna: "we are randomly failing on get_host_hooks test for at least 3 weeks. it's not a specific branch or project and there are no commonalities that I can see,"

If it were a bug in the product, I would have totally agreed with you: it couldn't be ignored. I'm not saying we should ignore this one either.

Since the bug is in the test itself, I would rather take an unreliable test out for further investigation in a development environment, and ensure the rest of the tests are executed in the production environment, finding bugs in the product if there are any.

We have a test suite in order to fix bugs, not in order to kill itself.

Host hooks are Infra. Infra is mperina, rnori and msobczik.
Please consult with them before you shut our collective eyes.

Please point them to a failing job, and record the failing traceback.

On Fri, 22 Mar 2019, 11:27 Dafna Ron, <dron@redhat.com> wrote:
patch submitted: https://gerrit.ovirt.org/#/c/98773/

Thanks,
Dafna

On Fri, Mar 22, 2019 at 9:04 AM Sandro Bonazzola <sbonazzo@redhat.com> wrote:

On Fri, 22 Mar 2019 at 09:34, Dafna Ron <dron@redhat.com> wrote:
Hi,

We have been randomly failing on the get_host_hooks test for at least 3 weeks.
It's not a specific branch or project, and there are no commonalities that I can see, aside from not being able to communicate with the host.

This week it started happening at least once a day (this morning, 2 out of 3 failures were due to that test).

This test was added by Yaniv Kaul over a year ago, and he is no longer working on oVirt. I think someone else should take ownership of this test and fix it.
Please let me know if you intend to investigate and either fix the failure or fix the test; if not, I will add a skip to the test.

Please add a skip to the test; if someone steps in to maintain this test, it will be re-enabled.

Thanks,
Dafna

--
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA
sbonazzo@redhat.com
