On Fri, Apr 5, 2019 at 2:00 PM Dafna Ron <dron@redhat.com> wrote:
Hi,

This mail provides the current status of CQ so that people can review it before and after the weekend.
Please refer to the colour map below for further information on the meaning of the colours.

CQ-4.2 GREEN (#1)

The last CQ job failure on 4.2 was on 05-04-2019, on project ovirt-provider-ovn, due to a code regression caused by [1].
The issue has been found and a patch created [2], but due to the hotplug_cpu failure it took a long time to verify the fix before merge.
[1] https://gerrit.ovirt.org/#/c/98723/ - Stateless dhcpv6 does not support fixed IPs
[2] https://gerrit.ovirt.org/#/c/99193/ - Fix bug when fixed IPs are not provisioned

CQ-4.3 RED (#3)

We have been having constant failures on tests:
hotplug_cpu
add_master_storage_domain
Failures on these two are very frequent and I need to re-trigger multiple failed tests daily.
There is a mail thread on this, and several people have joined in to help debug the issue, but I think it needs more help from some of the other teams, as it has been going on for at least the last two weeks.

I've been looking at this for the past couple of days, and there are at least three distinct failure cases, all network related (a sketch of the checks I've been running is below the list):

* No connection available (ping, ssh, other) until ARP is refreshed (traceroute, arping)
* ping works but SSH refuses connections
* No network connectivity at all, even after trying the previous two
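
For reference, here is a minimal Python sketch of the checks used to classify which of the three cases a guest hits. The guest address and interface name are placeholders, and it simply shells out to ping/arping and attempts a TCP connect to port 22; nothing lago- or OST-specific is assumed.

#!/usr/bin/env python3
"""Classify a guest's network failure mode (sketch; address/interface are placeholders)."""
import socket
import subprocess

GUEST_IP = "192.168.200.4"   # placeholder: lago guest address
IFACE = "eth0"               # placeholder: host-side interface used for arping
SSH_PORT = 22

def run_ok(cmd):
    # Run a command quietly, return True if it exited 0
    return subprocess.run(cmd, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode == 0

def ping_ok():
    return run_ok(["ping", "-c", "3", "-W", "2", GUEST_IP])

def arping_ok():
    # arping typically needs root; sends three ARP requests on IFACE
    return run_ok(["arping", "-c", "3", "-I", IFACE, GUEST_IP])

def ssh_port_ok():
    try:
        with socket.create_connection((GUEST_IP, SSH_PORT), timeout=5):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    if ping_ok():
        print("ICMP ok;", "SSH port open" if ssh_port_ok() else "SSH refused/filtered")
    elif arping_ok() and ping_ok():
        print("guest reachable only after ARP refresh")
    else:
        print("no connectivity at all (ping, ARP, SSH)")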

Can you provide me with a jump host somewhere in the TLV lab? A VM is fine, as long as it has X11. The connection is too slow for me to X-forward my way to the Lago engine, ovirt-vmconsole refuses connections, and even an enabled serial console does not always give me a login shell to the guest.

This really looks like it's network related, but I'm not sure whether it's a change in lago, ovirt-system-tests, or the network tests themselves. All had network-related changes (related to ipv6) around the time this started failing.
 
This is also disrupting the fixing of actual regressions, as testing keeps failing on these two unrelated issues.


CQ-Master RED (#3)

Master and 4.3 have the same issues.


Happy week!
Dafna


-------------------------------------------------------------------------------------------------------------------
COLOUR MAP

Green = job has been passing successfully

** Green for more than 3 days may suggest we need to review our test coverage


  1. 1-3 days       GREEN (#1)

  2. 4-7 days       GREEN (#2)

  3. Over 7 days    GREEN (#3)


Yellow = intermittent failures for different projects but no lasting or current regressions

** Intermittent failures are a sign of a healthy project, as we expect a number of failures during the week

** I will not report any of the solved failures or regressions.


  1. Solved job failures        YELLOW (#1)

  2. Solved regressions      YELLOW (#2)


Red = job has been failing

** Active failures. The colour will change based on how long the project(s) have been broken. Only active regressions will be reported.


  1. 1-3 days      RED (#1)

  2. 4-7 days      RED (#2)

  3. Over 7 days    RED (#3)



--

Ryan Barry

Associate Manager - RHV Virt/SLA

rbarry@redhat.com    M: +16518159306     IM: rbarry