From: "Sandro Bonazzola" <sbonazzo(a)redhat.com>
To: "Eyal Edri" <eedri(a)redhat.com>
Cc: infra(a)ovirt.org, devel(a)ovirt.org
Sent: Thursday, June 4, 2015 9:50:56 AM
Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 04/06/2015 08:46, Eyal Edri ha scritto:
>
>
> ----- Original Message -----
>> From: "Sandro Bonazzola" <sbonazzo(a)redhat.com>
>> To: "Eyal Edri" <eedri(a)redhat.com>, "Max Kovgan"
<mkovgan(a)redhat.com>
>> Cc: devel(a)ovirt.org, infra(a)ovirt.org
>> Sent: Thursday, June 4, 2015 9:11:10 AM
>> Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
>>
>> Il 03/06/2015 21:46, Eyal Edri ha scritto:
>>>
>>>
>>> ----- Original Message -----
>>>> From: "Max Kovgan" <mkovgan(a)redhat.com>
>>>> To: devel(a)ovirt.org
>>>> Cc: infra(a)ovirt.org
>>>> Sent: Wednesday, June 3, 2015 8:22:54 PM
>>>> Subject: [ovirt-devel] gerrit+ci improvement proposal
>>>>
>>>> Hi everyone!
>>>> We really want to have reliable and snappy CI: to allow short cycles
and
>>>> encourage developers to write tests.
>>>>
>>>> # Problem
>>>>
>>>> Many patches are neither ready for review nor for CI upon submission,
>>>> which
>>>> is OK.
>>>> But running all the jobs on those patches with limited resources
results
>>>> in:
>>>> overloaded resources, slow response time, unhappy developers.
>>>>
>>>> # Proposed Solution
>>>>
>>>> To run less jobs we know we don’t need to, thus making more resources
>>>> for
>>>> the
>>>> jobs we need to run.
>>>> We have been experimenting to make our CI stabler and quicker to
respond
>>>> by
>>>> using gerrit flags. This has improved in both directions very well
>>>> internally.
>>>> Now it seems a good time to let all the oVirt projects to use this.
>>>> This solution indirectly promotes reviews and quick tests - “to fail
>>>> early”,
>>>> yet full blown static code analysis and long tests to run “when ready”.
>>>>
>>>> # How it works
>>>>
>>>> 2 new gerrit independent flags are added to gerrit.
>>>>
>>>> ## CI flag
>>>>
>>>> Will express patch CI status. Values:
>>>> * +1 CI passed
>>>> * 0 CI did not run yet
>>>> * -1 CI failed
>>>> Permissions for setting: project maintainers (for special cases) should
>>>> be
>>>> able to set/override (except Jenkins).
>>>>
>>>> ## Workflow flag
>>>>
>>>> Will express patch “workflow” state. Values:
>>>> * 0 Work In Progress
>>>> * +1 Ready For Review
>>>> * +2 Ready For Merge
>>>> Permissions for setting: Owner can set +1, Project Maintainers can set
>>>> +2
>>>>
>>>> ## Review + CI Integration:
>>>>
>>>> Merging [“Submit” button to appear] will require: Review+1, CI+1,
>>>> Workflow+2
>>>> Patch lifecycle now is:
>>>> ---------------------------------------------------------------
>>>> patch state |owner |reviewer |maintainer |CI tests |pass
>>>> ---------------------------------------------------------------
>>>> added/updated |- |- |- |quick |CI+1
>>>> review |Workflow+1|Review+1 |- |heavy |CI+1
>>>> merge ready |- |- |Workflow+2 |gating |CI+1
>>>> merge |- |- |merge |merge |CI+1
>>>>
>>>> Changes from current workflow:
>>>> Owner only adds reviewers, now owner needs to set "Workflow+1"
for the
>>>> patch
>>>> to be reviewed, and heavily auto-tested.
>>>> Maintainer now needs to set "Workflow+2" and wait for
"Submit" button to
>>>> appear after CI has completed running gating tests.
>>>>
>>>>
>>>> Next step will be to automate merge the change after Workflow+2 has
been
>>>> set
>>>> by the Maintainer and gating tests passed.
>>>>
>>>>
>>>> ## Why now?
>>>>
>>>> It is elimination of waste. The sooner - the better.
>>>> The solution has been used for a while and it works.
>>>> Resolving the problem without gerrit involved will lead to adding
>>>> unreliable
>>>> code into jobs, and will still be prone to problems:
>>>> Just recently, 3d ago we’ve tried detecting what to run from jenkins
>>>> relying only on gerrit comments so that upon Verified+1, we’d run the
>>>> job.
>>>> We could not use “Review+1”, because it makes no sense at all, so we
>>>> left
>>>> the job to set Verified+1.
>>>> Meaning - re-trigger itself immediately more than 1 times.
>>>>
>>>> Jenkins and its visitors very unhappy, and we had to stop those jobs,
>>>> clean
>>>> up the queue, and spam developers.
>>>>
>>>> ## OK OK OK. Now what?
>>>>
>>>> Now we want your comments and opinions before pushing this further:
>>>> Please participate in this thread, so we can start trying it out.
>>>> Ask, Suggest better ideas, all this is welcome.
>>>>
>>>>
>>>> Best Regards!
>>>>
>>>>
>>>> N.B.
>>>> Of course, this is not written in stone, in case we find a better
>>>> approach
>>>> on
>>>> solving those issues, we will change to it.
>>>> And we will keep improving so don't be afraid that it will be
enforced:
>>>> if
>>>> this does not work out we will discard it.
>>>>
>>>> P.S.
>>>> Kudos to dcaro, most of the work was done by him, and most of this text
>>>> too.
>>>>
>>>
>>> +1 from me, releasing CI from running non critical and un-essential jobs
>>> will not only reduce load from ci,
>>> and shorted response time for developers, it will allow us to add much
>>> more
>>> powerful tests such as functional & system
>>> tests that actually add hosts and run VMs, improving our ability to find
>>> regression much more effectively.
>>>
>>> Another benefit to consider is saving reviewers time. I.e not only
>>> jenkins
>>> benefits from Worklow+1, but also human reviewers.
>>> Instead of looking at a patch that is too early to be reviewed, the
>>> author
>>> can set the Workflow+1 when the code is ready to review
>>> (even if he didn't verified it yet), thus saving time to other
reviewers
>>> -
>>> for example people can add an email rule
>>> to alert them only when they are added to patches that have Workflow+1,
>>> and
>>> not before.
>>
>> For human reviewers I suggest to keep using drafts until the patch is
>> finished.
>
> keep using? how many developers do you know are working with drafts until
> their patch is ready?
We have ~350 drafts in gerrit so someone is using them :-)
https://gerrit.ovirt.org/#/q/is:draft
good to know :)
so i guess even with that amount its still hard on jenkins....
> i agree if everyone would use drafts load on jenkins was already much
> lower,
> unfortunately its not the case.
>
>> Once it's finished and humans reviewed the logic of the patch, Workflow+1
>> should be triggered allowing automation to check the correctness of the
>> patch.
>> IMHO there's no reason for wasting CI time on patches that will be correct
>> from an automation point of view but nacked by reviewers. Especially if
>> the
>> patches are part of a big patchset.
>>
>>
>>>
>>> And one final note, for Workflow+2 -> this is a preparation for a gating
>>> system, like Zuul used by openstack, that in the future
>>> we might use as automatic merger pending passing a verification step.
>>> this
>>> will prevent errors that happen sometimes
>>> post merge due to conflicts or other issues, and will be another level of
>>> validation before final merge.
>>> But as max said, its all part of the plan and we'll test it of course
>>> before implementing to see its value.
>>>
>>>
>>>>
>>>>
>>>>
>>>> Max Kovgan
>>>>
>>>> Senior Software Engineer
>>>> Red Hat - EMEA ENG Virtualization R&D
>>>> Tel.: +972 9769 2060
>>>> Email: mkovgan [at] redhat [dot] com
>>>> Web:
http://www.redhat.com
>>>> RHT Global #: 82-72060
>>>>
>>>> _______________________________________________
>>>> Devel mailing list
>>>> Devel(a)ovirt.org
>>>>
http://lists.ovirt.org/mailman/listinfo/devel
>>> _______________________________________________
>>> Devel mailing list
>>> Devel(a)ovirt.org
>>>
http://lists.ovirt.org/mailman/listinfo/devel
>>>
>>
>>
>> --
>> Sandro Bonazzola
>> Better technology. Faster innovation. Powered by community collaboration.
>> See how it works at
redhat.com
>> _______________________________________________
>> Infra mailing list
>> Infra(a)ovirt.org
>>
http://lists.ovirt.org/mailman/listinfo/infra
>>
>>
>>
--
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at
redhat.com
_______________________________________________
Infra mailing list
Infra(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra