On Fri, 20 Sep 2019 at 08:54, Marcin Sobczyk <msobczyk@redhat.com> wrote:


On 9/19/19 4:08 PM, Ehud Yonasi wrote:

Hey everyone,

Following the presentation we did last week [1], I wanted to summarize the new patch gating workflow that will be pushed to oVirt soon and will impact all the developers.


Summary:


The purpose of the new workflow is to verify patches earlier ("shift left"), before they are merged, and to give developers much faster feedback when their patch fails OST.


  1. Feedback from OST will now be posted directly to Gerrit, instead of requiring human intervention from the infra team to notify developers.


We expect developers to check why their patch is not passing the gate (OST), debug it, find the root cause and fix it before merging the patch.


  2. Any concerns regarding the stability of OST should be communicated and addressed ASAP.

Today, if post-merge OST gating fails, packages are not pushed to "tested" and QE doesn't get to test them. With pre-merge gating, patches won't be merged if OST fails, so any fragile or flaky tests should be examined by their maintainers and fixed, skipped or removed.


  3. FYI, we are not removing the Merge button at this point, so maintainers will still be able to merge patches that they are fully confident will not break the build or fail OST tests.


Please note that merging a patch that breaks OST will cause OST to start failing for all other patches, so we urge you to avoid bypassing the gate if at all possible.


In the following section, I will explain more about Patch Gating, how to onboard your project to it, etc.



FAQ on oVirt’s Gating System and how to onboard your project on it:


Q. What is Patch Gating?

A. Patch Gating is triggered pre-merge and runs OST as the gate's system tests, unlike today, where OST runs post-merge, after the patches have already been merged into their projects. This means developers get early feedback on whether their patches pass OST.


Q. What causes the gating process to start?

A. Once a patch is verified, has passed CI and has a Code-Review +2 label, the gating process starts. You will receive a message on the patch.
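
To make the trigger condition concrete, here is a minimal sketch of it as a predicate. The label names and the +1/+2 thresholds below are assumptions made for illustration (only "Code-Review +2" is stated above), and ready_for_gate is a hypothetical helper, not part of Zuul or Gerrit.

```python
from typing import Mapping

# Assumed label names and minimum votes; adjust to the labels your
# Gerrit project actually defines.
REQUIRED_LABELS = {
    "Verified": 1,
    "Continuous-Integration": 1,
    "Code-Review": 2,
}


def ready_for_gate(labels: Mapping[str, int]) -> bool:
    """Return True when every required label has reached its minimum vote."""
    return all(
        labels.get(name, 0) >= minimum
        for name, minimum in REQUIRED_LABELS.items()
    )


if __name__ == "__main__":
    votes = {"Verified": 1, "Continuous-Integration": 1, "Code-Review": 1}
    print(ready_for_gate(votes))  # Code-Review is only +1, so not ready yet
```

A patch that is missing any one of the labels, or has a lower vote on it, would simply not enter the gate until the label is set.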


Q. How does it report results to my patches?

A. A comment will be posted on your patch with the URL of the failed job.



Q. How will my patch get merged?

A. If the patch has passed the gating (OST), Zuul (the new CI system for patch gating) will merge the patch automatically.



Q. How do I onboard my project?

A.  

  1. Open a JIRA ticket or send mail to infra-support@ovirt.org

  2. Create a file named 'zuul.yaml' in your project root, or 'zuul.d/zuul.yaml', with the following content:


- project:
    templates:
      - ost-gated-project



Q. My projects run on STDCI V1, is that ok?

A. No, the patch gating logic runs on STDCI V2 only, meaning that you will have to move your project to V2.

If you need help with the transition to V2, you can open a JIRA ticket [2] or send mail to infra-support@ovirt.org, and visit the docs [3].


Q. What if I want to merge the patch regardless of OST results?

A. If you are a maintainer of the project, you can still merge it. We are not removing the Merge button.

However, merging a patch that fails OST can break your project, so merging on failure is inadvisable.


Q. What if my patch is failing because it depends on a patch in a different project?

A. Patch Gating (Zuul) has a mechanism for cross-project dependencies. All you need to do is add the URL of the patch you depend on to the commit message:


Depends-On: https://gerrit.ovirt.org/patch_number


And they will be tested together.


Note: you can have multiple dependencies. 
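
As an illustration of how such footers can be read from a commit message, here is a small sketch. The regular expression and the find_dependencies helper are my own assumptions and are not taken from Zuul's source; the patch numbers in the example are made up.

```python
import re
from typing import List

# One "Depends-On: <url>" footer per line; matching is case-insensitive
# here as a convenience (an assumption, not documented gate behaviour).
DEPENDS_ON = re.compile(
    r"^Depends-On:[ \t]*(\S+)[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def find_dependencies(commit_message: str) -> List[str]:
    """Return the URLs of all Depends-On footers, in order of appearance."""
    return DEPENDS_ON.findall(commit_message)


if __name__ == "__main__":
    message = (
        "core: fix host activation\n"
        "\n"
        "Needs the matching change on the agent side.\n"
        "\n"
        "Depends-On: https://gerrit.ovirt.org/100001\n"
        "Depends-On: https://gerrit.ovirt.org/100002\n"
        "Change-Id: I0123456789abcdef\n"
    )
    for url in find_dependencies(message):
        print(url)
```

Each footer sits on its own line of the commit message, which is how multiple dependencies are expressed.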


Q. How do I debug OST?

A. There are various ways of looking in the logs and output for the errors: 

  1. Blue Ocean view: you can see the jobs that were run inside the gate and find the suites which failed.

  2. ci_build_summary view: an internal tool to view the threads and redirect to the specific logs/artifacts.

  3. Test results analyzer: available if the tests were run. You can view the failed tests and their output. OST maintainers and your team leads should be able to assist.

  4. For further learning on how to debug OST please visit the OST FAQ.


Q. Will the current infrastructure be able to support all the patches?

A. The CI team has done tremendous work to make better use of the infrastructure.

OST gating will run inside OpenShift pods, rather than on bare-metal hosts as before. This currently lets us run approximately 50 pods in parallel, and we will review adding more if the need arises.


Q. When I have multiple patches, in which order will they be tested by the gating system?

A. The patches will be tested in the order in which they will be merged. The gating system knows how to simulate the post-merge state of the patches.


Q. What do I do if I think OST failed because of an infra issue and not my patch?

A. You can contact the CI team by sending mail to infra-support@ovirt.org, explaining your concerns and including the patch URL.


Q. Will check-merged scripts be used by the gating system?

A. No, they will be used in the current workflow, by the post-merge OST gating system called Change-Queue.


Q. Can I add my own tests to the gating system?

A. The gating system is running OST tests, so if it’s a test that should be included in the OST, then yes.


Q. What will happen to the old change-queue system now that we have gating?

A. At this time, the change-queue system will stay and gate post-merge jobs until all oVirt projects have onboarded to patch gating.

We might consider using the change-queue for further coverage of tests in the future.


Q. How can I re-trigger a failed patch to the gate again?

A. There are 2 options to retrigger:

  • If you need to fix your patch, just upload a new patchset and set the Code-Review, Verified and CI labels again.

  • If you want to re-trigger the same patchset again just write a comment in Gerrit: 

‘ci emulate-gate please’




Q. I usually write a series of related patches that should be merged together. Can the gating system test all of them in a single test?

A. No, each patch in the series will be tested separately, in parallel, so the number of test runs equals the number of patches in the series. This is why we've increased our capacity to run OST.


This (and maybe the running time of OST suites) is the only thing that I don't like about the gating. We do post series of small patches that are easy to review and atomic enough that they can be merged one at a time, but OTOH make sense only when merged as a whole.
Can this functionality be achieved by including "Depends-On" lines that point to other patches?

No. For technical reasons, testing and merging patches in batches cannot be done ATM.

If patches are already based on one another, later patches will not be tested without the earlier ones; basing patches on one another is essentially like setting an implicit "Depends-On".

Given the way Gerrit and CI work, you can never rely on a series of patches being tested or merged together. You can only rely on patches that depend on earlier patches being tested and merged in the right order.

In other words, if you have a series of patches, checking out the series "in the middle" should always work and pass tests, otherwise it will not pass the gate.
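
The rule in the previous paragraph can be sketched as follows: every patch in a series is gated together with the patches it is based on, but never with the ones that come after it. This is only an illustrative model; states_under_test, first_blocked and the passes callback are hypothetical names, not part of the gating system.

```python
from typing import Callable, List, Optional, Sequence


def states_under_test(series: Sequence[str]) -> List[List[str]]:
    """For each patch, list what is checked out when that patch is gated:
    the patch itself plus every earlier patch in the series."""
    return [list(series[: i + 1]) for i in range(len(series))]


def first_blocked(
    series: Sequence[str],
    passes: Callable[[List[str]], bool],
) -> Optional[str]:
    """Return the first patch whose checkout fails the tests, or None."""
    for state in states_under_test(series):
        if not passes(state):
            return state[-1]
    return None


if __name__ == "__main__":
    # Toy rule: patch "b" only works once patch "c" is also applied.
    def toy_tests(state: List[str]) -> bool:
        return "b" not in state or "c" in state

    print(states_under_test(["a", "b", "c"]))
    print(first_blocked(["a", "b", "c"], toy_tests))
```

In the toy run, patch "b" is reported as blocked because it is tested on top of "a" only, without "c".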

Given that you mention the patches are atomic and can be merged one at a time, there shouldn't really be a problem. If you're worried from a performance standpoint, please note that we are moving from a point where we were essentially running one test at a time to being able to run close to 80 tests at a time. From an overall patch throughput POV the improvement is going to be huge, as well as from a single patch integration latency POV.

While the process was hidden from you so far because it was happening post-merge, the reality is that we've been blocking failing patches from going into oVirt for a long time now. With the current system, a patch can end up being delayed from overall integration into oVirt for a few days as the testing system was running bisection cycles to find and remove faulty patches. Now if a patch is working, and does not depend on any faulty patches, it is expected to take at most two OST runs to get merged, at which point it is already integrated into oVirt. Additionally the process will be visible for you in Gerrit as opposed to hidden away.  
 



The architectural design document [4] explains the patch gating process and the new services we will be using.




[1]: https://bluejeans.com/s/zAjyX/

[2]: https://ovirt-jira.atlassian.net

[3]: https://ovirt-infra-docs.readthedocs.io/en/latest/

[4]: https://drive.google.com/open?id=1qV_iNJL6jHARlti7zpnRZRfQM_Q9Z-w8ONvcKEmaAgA




_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/KKEHCJHRS645H4LNDPB4PKRBUBZEYLOY/


--
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted