[ovirt-devel] Re: Patch Gating summary + FAQ

22 Sep 2019

      On Fri, 20 Sep 2019 at 08:54, Marcin Sobczyk <msobczyk@redhat.com> wrote:
...
On 9/19/19 4:08 PM, Ehud Yonasi wrote:
Hey everyone,
Following the presentation we did last week [1]
<https://bluejeans.com/s/zAjyX/>, I wanted to summarize the new patch
gating workflow that will be pushed to oVirt soon and will impact all the
developers.
Summary:
The purpose of the new workflow is to verify patches earlier ("shift left")
before they are merged, and to give developers much faster feedback if
their patch fails OST.
1. Feedback from OST will now be posted directly to Gerrit instead of
   requiring human intervention from the infra team to notify developers.
   We expect developers to check why their patch is not passing the gate
   (OST), debug it, find the root cause and fix it before merging the patch.
2. Any concerns regarding the stability of OST should be communicated and
   addressed ASAP.
   Today, if OST fails in post-merge gating, packages are not pushed to
   tested and QE doesn’t get to test them. With pre-merge gating, patches
   won’t be merged if OST fails, so any fragile or flaky tests should be
   examined by their maintainers and fixed, skipped or removed.
3. FYI, we are not removing the Merge button at this point, so
   maintainers will still be able to merge patches that they are 100% sure
   do not break the build or fail OST tests.
   Please note that merging patches that break OST will cause it to start
   failing for all other patches. We urge you to avoid bypassing the gate if
   at all possible.
In the following section, I will explain more about Patch Gating, how to
onboard your project to it, etc.
FAQ on oVirt’s Gating System and how to onboard your project on it:
Q. What is Patch Gating?
A. It is a process that is triggered pre-merge on patches and runs OST as
the gating system tests, unlike today, where OST runs post-merge, after
patches have already been merged into their projects. This means developers
get early feedback on whether their patches pass OST.
Q. What causes the gating process to start?
A. Once a patch is verified, has passed CI and has a Code-Review +2 label,
the gating process will be started. You will receive a message in the patch.
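To make the trigger condition above concrete, here is a minimal sketch in Python. The label names and minimum scores are assumptions made for illustration only; this mail does not spell out the exact Gerrit label configuration, and this is not the gate's actual implementation.

```python
# Hypothetical label names and minimum scores (assumptions, not taken
# from the real oVirt Gerrit configuration).
REQUIRED_LABELS = {
    "Verified": 1,
    "Continuous-Integration": 1,
    "Code-Review": 2,
}


def is_ready_for_gate(labels):
    """Return True when every required label meets its minimum score.

    `labels` maps a label name to the score currently set on the patch;
    a label that is missing counts as 0.
    """
    return all(
        labels.get(name, 0) >= minimum
        for name, minimum in REQUIRED_LABELS.items()
    )


# A patch with Code-Review +1 only would not enter the gate yet.
pending = {"Verified": 1, "Continuous-Integration": 1, "Code-Review": 1}
approved = {"Verified": 1, "Continuous-Integration": 1, "Code-Review": 2}
print(is_ready_for_gate(pending), is_ready_for_gate(approved))
```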
Q. How does it report results to my patches?
A. A comment containing the URL of the failed job will be posted in your patch.
Q. How will my patch get merged?
A. If the patch has passed the gating (OST), Zuul (the new CI system for
patch gating) will merge the patch automatically.
Q. How do I onboard my project?
A.
1. Open a JIRA ticket or send mail to infra-support@ovirt.org
2. Create a file named 'zuul.yaml' under your project root OR
   `zuul.d/zuul.yaml` and fill it with the following content:

   - project:
       templates:
         - ost-gated-project
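If you want to check locally whether a project already carries this file, something like the following Python sketch could do it. The helper names are made up for this example, and the template check is a plain text search rather than a real YAML parse, so treat it as a rough sanity check only.

```python
from pathlib import Path

# The two locations mentioned above, relative to the project root.
CANDIDATES = ("zuul.yaml", "zuul.d/zuul.yaml")

# The onboarding content, with the YAML nesting written out.
ONBOARDING_SNIPPET = """\
- project:
    templates:
      - ost-gated-project
"""


def find_zuul_config(project_root):
    """Return the first existing Zuul config path, or None if there is none."""
    root = Path(project_root)
    for relative in CANDIDATES:
        candidate = root / relative
        if candidate.is_file():
            return candidate
    return None


def has_gating_template(config_path):
    """Rough check: does the file mention the ost-gated-project template?"""
    return "ost-gated-project" in Path(config_path).read_text()
```

Calling `find_zuul_config(".")` from the project root returns the path of the config file if one exists, which can then be passed to `has_gating_template`.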
Q. My projects run on STDCI V1, is that ok?
A. No, the patch gating logic runs on STDCI V2 only, meaning that you
will have to move your project to V2.
If you need help with the transition to V2, you can open a JIRA ticket [2]
<https://ovirt-jira.atlassian.net> or mail infra-support@ovirt.org,
and visit the docs [3]
<https://ovirt-infra-docs.readthedocs.io/en/latest/>.
Q. What if I want to merge the patch regardless of OST results?
A. If you are a maintainer of the project, you can still merge it. We are
not removing the merge button option.
But merging a patch that fails OST can break your project, so merging on
failure is not advised.
Q. What if my patch is failing because it depends on a patch in a different
project?
A. Patch Gating (Zuul) has a mechanism for cross-project dependencies! All
you need to do is add the URL of the patch you depend on to the commit
message:
Depends-On: https://gerrit.ovirt.org/patch_number
And they will be tested together.
Note: you can have multiple dependencies.
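As an illustration of the commit message convention above, here is a small Python sketch that pulls the dependency URLs out of a commit message. The regular expression is my own approximation and is not claimed to be the exact pattern Zuul uses; the patch numbers in the example are made up.

```python
import re

# One "Depends-On: <url>" footer per line; matched case-insensitively.
DEPENDS_ON = re.compile(
    r"^Depends-On:[ \t]*(\S+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def find_dependencies(commit_message):
    """Return the URLs named in Depends-On footers, in order of appearance."""
    return DEPENDS_ON.findall(commit_message)


# Example commit message with two dependencies (hypothetical patch numbers).
message = """\
core: use the new helper

Depends-On: https://gerrit.ovirt.org/111
Depends-On: https://gerrit.ovirt.org/222
"""
print(find_dependencies(message))
```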
Q. How do I debug OST?
A. There are various ways of looking in the logs and output for the
errors:
1. Blue Ocean view,
   <https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-system-tests_gate/>
   where you can see the jobs that were run inside the gate and find the
   suites which failed.
2. ci_build_summary view, an internal tool to view the threads and
   redirect to the specific logs/artifacts.
3. Test results analyzer, available if the tests were run. You can view
   the failed tests and their output. OST maintainers and your team
   leads should be able to assist.
4. For further learning on how to debug OST please visit the OST FAQ
   <https://drive.google.com/open?id=1Sohq7bdgZS341gs5-lvB9lyS0GXawqaIvuUziOWwoGQ>.
Q. Will the current infrastructure be able to support all the patches?
A. The CI team has done tremendous work on making better use of the
infrastructure. The OST gating will run inside OpenShift pods rather than
on bare-metal hosts as before, which right now lets us run approximately
50 pods in parallel, and we will review adding more if the need arises.
Q. When I have multiple patches, in which order will they be tested by the
gating system?
A. The patches will be tested in the order in which they will be merged.
The gating system knows how to simulate the post-merge state of the patches.
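One way to picture "simulating post-merge" is that each patch in the queue is tested on top of the patches ahead of it. The Python sketch below illustrates only that idea, under the assumption that the gate behaves this way; the function name and the patch names are invented for the example.

```python
def gate_test_plan(queue):
    """Map each patch to the list of patches it is tested together with.

    `queue` is the list of patches in the order they are expected to merge.
    Each patch is tested as if all patches ahead of it were already merged.
    """
    return {patch: queue[: index + 1] for index, patch in enumerate(queue)}


# Three patches waiting for the gate, in expected merge order.
for patch, tested_with in gate_test_plan(["A", "B", "C"]).items():
    print(patch, "is tested on top of", tested_with[:-1])
```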
Q. What do I do if I think OST failed because of an infra issue and not my
patch?
A. You can contact the CI team by sending mail to infra-support@ovirt.org
explaining your concerns and including the patch URL.
Q. Will check-merged scripts be used by the gating system?
A. No, they will still be used in the current workflow, by the OST
post-merge gating system called Change-Queue.
Q. Can I add my own tests to the gating system?
A. The gating system is running OST tests, so if it’s a test that should
be included in the OST, then yes.
Q. What will happen to the old change-queue system now that we have gating?
A. At this time, the change-queue system will stay and gate post-merge
jobs until all oVirt projects have onboarded to patch gating.
We might consider using the change-queue for further coverage of tests in
the future.
Q. How can I re-trigger a failed patch to the gate again?
A. There are 2 options to re-trigger:
- If you need to fix your patch, just upload a new patchset and
  set the Code-Review, Verified and CI labels again.
- If you want to re-trigger the same patchset again, just write a comment
  in Gerrit:
  ‘ci emulate-gate please’
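For those who prefer scripting over the web UI, the comment could in principle be posted through Gerrit's REST "set review" endpoint. The Python sketch below only builds the URL and JSON body and sends nothing; it assumes the REST API is reachable on gerrit.ovirt.org with your HTTP credentials, which this mail does not confirm, and the change number in the example is made up.

```python
import json

GERRIT_URL = "https://gerrit.ovirt.org"
RETRIGGER_COMMENT = "ci emulate-gate please"


def build_retrigger_request(change_id):
    """Return (url, json_body) for posting the re-trigger comment.

    Uses the authenticated ("/a/") form of Gerrit's set-review endpoint on
    the current revision. Nothing is sent; the caller decides how to POST
    it (for example with curl or urllib) and how to authenticate.
    """
    url = f"{GERRIT_URL}/a/changes/{change_id}/revisions/current/review"
    body = json.dumps({"message": RETRIGGER_COMMENT})
    return url, body


# Hypothetical change number, for illustration only.
url, body = build_retrigger_request("12345")
print(url)
print(body)
```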
Q. I usually write a series of related patches that should be merged
together, can the Gating system test all of them in a single test?
A. No, each patch in the series will be tested separately, so there will be
as many parallel tests as there are patches in the series. This is why
we’ve increased our capacity to run OST for this case.
This (and maybe the running time of OST suites) is the only thing that I
don't like about the gating - we do post series of small patches that are:
easy to review, atomic enough, so they can be merged one at a time, but
OTOH make sense only when merged in whole.
Can this functionality be achieved by including "Depends-on" that point to
other patches?
No. For technical reasons, testing and merging patches in batches cannot be
done ATM.

If patches are already based on one another, later patches will not be
tested without earlier patches. Basing patches on one another is
essentially like setting an implicit "Depends-On".

Given the way Gerrit and CI work - you can never rely on having a series of
patches be tested or merged together, you can only rely on patches that
depend on earlier patches to be tested and merged in the right order.

In other words, if you have a series of patches, checking out the series
"in the middle" should always work and pass tests, otherwise it will not
pass the gate.
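To spell that rule out: every prefix of the series has to pass on its own. Here is a minimal Python sketch of that check; the `passes` callback stands in for an OST run and, like the patch names, is purely hypothetical.

```python
def first_failing_prefix(series, passes):
    """Return the shortest prefix of the series that fails, or None.

    `series` is the list of patches in dependency order and `passes` is a
    callable that takes the list of applied patches and returns True when
    the tests pass with exactly those patches applied.
    """
    for end in range(1, len(series) + 1):
        prefix = series[:end]
        if not passes(prefix):
            return prefix
    return None


def toy_tests(applied):
    """Toy stand-in for a test run: 'use-helper' needs 'add-helper' applied."""
    return "use-helper" not in applied or "add-helper" in applied


# The wrongly ordered series fails "in the middle"; the right order passes.
print(first_failing_prefix(["use-helper", "add-helper"], toy_tests))
print(first_failing_prefix(["add-helper", "use-helper"], toy_tests))
```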

Given that you mention the patches are atomic and can be merged one at a
time, there shouldn't really be a problem. If you're worried from a
performance standpoint, please note that we are moving from a point where
we were essentially running one test at a time to being able to run close
to 80 tests at a time. From an overall patch throughput POV the improvement
is going to be huge, as well as from a single patch integration latency
POV.

While the process was hidden from you so far because it was happening
post-merge, the reality is that we've been blocking failing patches from
going into oVirt for a long time now. With the current system, a patch can
end up being delayed from overall integration into oVirt for a few days as
the testing system was running bisection cycles to find and remove faulty
patches. Now if a patch is working, and does not depend on any faulty
patches, it is expected to take at most two OST runs to get merged, at
which point it is already integrated into oVirt. Additionally the process
will be visible for you in Gerrit as opposed to hidden away.
...
The architectural design document [4]
<https://drive.google.com/open?id=1qV_iNJL6jHARlti7zpnRZRfQM_Q9Z-w8ONvcKEmaAgA>
should give you an understanding of the patch gating process and the
new services we will be using.
[1]: https://bluejeans.com/s/zAjyX/
[2]: https://ovirt-jira.atlassian.net
[3]: https://ovirt-infra-docs.readthedocs.io/en/latest/
[4]: https://drive.google.com/open?id=1qV_iNJL6jHARlti7zpnRZRfQM_Q9Z-w8ONvcKEmaAg...
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/KKEHCJHRS645H4...
-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted