On Fri, Nov 30, 2018 at 9:25 PM Ryan Barry <rbarry(a)redhat.com> wrote:
On Fri, Nov 30, 2018 at 2:18 PM Dan Kenigsberg <danken(a)redhat.com> wrote:
>
>
> On Fri, 30 Nov 2018, 19:33 Dafna Ron <dron(a)redhat.com> wrote:
>
>> Hi,
>>
>> This mail is to provide the current status of CQ and allow people to
>> review status before and after the weekend.
>> Please refer to the colour map below for further information on the
>> meaning of the colours.
>>
>> *CQ-4.2*: RED (#1)
>>
>> I checked the last date on which ovirt-engine and vdsm passed and moved
>> packages to tested, as they are the bigger projects: it was 27-11-2018.
>>
>> We have been having sporadic failures for most of the projects on test
>> check_snapshot_with_memory.
>> We have deduced that this is caused by a code regression in storage
>> based on the following things:
>> 1. Evgheni and Gal helped debug this issue to rule out lago and infra
>> issues as the cause of failure and both determined the issue is a code
>> regression - most likely in storage.
>> 2. The failure only happens on the 4.2 branch.
>> 3. The failure itself is that a VM cannot run due to low disk space in
>> the storage domain, and we cannot see any failures which would leave any
>> leftovers in the storage domain.
>>
>> Dan and Ryan are actively
>>
>
> Actually, my involvement was a misguided attempt to solve another 4.2
> failure that I thought I had seen.
>
> involved
>
>> in trying to find the regression but the consensus is that this is a
>> storage-related regression and *we are having a problem getting the
>> storage team to join us in debugging the issue.*
>>
>> I prepared a patch to skip the test in case we cannot get cooperation
>> from storage team and resolve this regression in the next few days:
>>
https://gerrit.ovirt.org/#/c/95889/
>>
>
> Why do you consider this? Are we considering a release of 4.2 without
> live snapshot?
>
No, we aren't.
> Please do not merge it without an ack from Tal and Ryan.
>
Until we can bisect it, have you considered simply making a larger iSCSI
volume so OST stops failing there? I know it's an additional burden on
Infra's resources, and it's hopefully something we can revert later, but
it's likely to make OST pass for now so we can identify if/where other
failures are before we discover that even disabling this test (which I'm
against) doesn't make OST pass and we've lost a good bisection point.
I think this was tried already, but it probably won't solve the issue; see
a suggested patch by Dan:
https://gerrit.ovirt.org/#/c/95712/
>
>
>> *CQ-Master:* YELLOW (#1)
>>
>> We have failures which CQ is still bisecting and until it's done we
>> cannot point to any specific failing projects.
>>
>>
>> Happy week!
>> Dafna
>>
>>
>>
>>
-------------------------------------------------------------------------------------------------------------------
>> COLOUR MAP
>>
>> Green = job has been passing successfully
>>
>> ** green for more than 3 days may suggest we need a review of our test
>> coverage
>>
>>
>> 1. 1-3 days GREEN (#1)
>> 2. 4-7 days GREEN (#2)
>> 3. Over 7 days GREEN (#3)
>>
>>
>> Yellow = intermittent failures for different projects but no lasting or
>> current regressions
>>
>> ** intermittent would be a healthy project as we expect a number of
>> failures during the week
>>
>> ** I will not report any of the solved failures or regressions.
>>
>>
>> 1. Solved job failures YELLOW (#1)
>> 2. Solved regressions YELLOW (#2)
>>
>>
>> Red = job has been failing
>>
>> ** Active Failures. The colour will change based on the amount of time
>> the project/s has been broken. Only active regressions would be reported.
>>
>>
>> 1. 1-3 days RED (#1)
>> 2. 4-7 days RED (#2)
>> 3. Over 7 days RED (#3)
>>
>>
>>
--
Ryan Barry
Associate Manager - RHV Virt/SLA
rbarry(a)redhat.com M: +16518159306 IM: rbarry
<https://red.ht/sig>
--
Eyal edri
MANAGER
RHV/CNV DevOps
EMEA VIRTUALIZATION R&D
Red Hat EMEA <https://www.redhat.com/>
<https://red.ht/sig> TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>
phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)