gerrit+ci improvement proposal

Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests. # Problem Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers. # Proposed Solution To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”. # How it works 2 new gerrit independent flags are added to gerrit. ## CI flag Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins). ## Workflow flag Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2 ## Review + CI Integration: Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1 Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests. Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed. ## Why now? It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times. Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers. ## OK OK OK. Now what? Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome. Best Regards! N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it. P.S. Kudos to dcaro, most of the work was done by him, and most of this text too. Max Kovgan Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060

On Wednesday, June 03, 2015 01:22:54 PM Max Kovgan wrote:
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 looks like a good improvement to me.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively. Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before. And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

+1 to only triggering CI when we're "ready." Via a flag or via gerrit drafts -- either is fine with me. ----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Max Kovgan" <mkovgan@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Wednesday, June 3, 2015 3:46:06 PM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished. Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com

----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

Il 04/06/2015 08:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready?
We have ~350 drafts in gerrit so someone is using them :-) https://gerrit.ovirt.org/#/q/is:draft
i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com

----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:50:56 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 04/06/2015 08:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready?
We have ~350 drafts in gerrit so someone is using them :-) https://gerrit.ovirt.org/#/q/is:draft
good to know :) so i guess even with that amount its still hard on jenkins....
i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

=20 =20 ----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:50:56 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal =20 Il 04/06/2015 08:46, Eyal Edri ha scritto:
=20 =20 ----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycl=
es and
encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submissi= on, which is OK. But running all the jobs on those patches with limited resources r= esults in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don=E2=80=99t need to, thus making mor= e resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to r= espond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - =E2=80= =9Cto fail early=E2=80=9D, yet full blown static code analysis and long tests to run =E2=80= =9Cwhen ready=E2=80=9D.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) s= hould be able to set/override (except Jenkins).
## Workflow flag
Will express patch =E2=80=9Cworkflow=E2=80=9D state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can= set +2
## Review + CI Integration:
Merging [=E2=80=9CSubmit=E2=80=9D button to appear] will require: = Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for=
--K8nIJk4ghYZn606h Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 06/04, Eyal Edri wrote: the
patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" but= ton to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 ha= s been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we=E2=80=99ve tried detecting what to run = =66rom jenkins relying only on gerrit comments so that upon Verified+1, we=E2= =80=99d run the job. We could not use =E2=80=9CReview+1=E2=80=9D, because it makes no= sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times. =20 Jenkins and its visitors very unhappy, and we had to stop those = jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enfo= rced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this= text too.
+1 from me, releasing CI from running non critical and un-essential= jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add m= uch more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to= find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other revi= ewers - for example people can add an email rule to alert them only when they are added to patches that have Workflo= w+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished. =20 keep using? how many developers do you know are working with drafts u= ntil their patch is ready? =20 We have ~350 drafts in gerrit so someone is using them :-) https://gerrit.ovirt.org/#/q/is:draft =20 good to know :) so i guess even with that amount its still hard on jenkins....=20
=20
=20
i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case. =20
Once it's finished and humans reviewed the logic of the patch, Workf= low+1 should be triggered allowing automation to check the correctness of =
patch. IMHO there's no reason for wasting CI time on patches that will be c= orrect from an automation point of view but nacked by reviewers. Especially= if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a g=
ating
system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification ste=
The main issue with drafts is that once it's published, you can't go back. = And for most patches I've seen (followed just a dozen) there are quite more non-ready patches after publishing than in draft stage. the p.
this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another le= vel of validation before final merge. But as max said, its all part of the plan and we'll test it of cour= se before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collabora= tion. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
=20 =20 -- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboratio= n. See how it works at redhat.com
Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra =20 =20 =20
Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
--=20 David Caro Red Hat S.L. Continuous Integration Engineer - EMEA ENG Virtualization R&D Tel.: +420 532 294 605 Email: dcaro@redhat.com Web: www.redhat.com RHT Global #: 82-62605 --K8nIJk4ghYZn606h Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVb/3PAAoJEEBxx+HSYmnDnxIH/jHHtyMnvplDSSWZSoNC5r8v x7l7LUzc/21Ai1q7DRW9fM1jYtZM+cJQcsrb8YJC5RF6GMV7zL/AF6yXAhbLFQUg vgRC4QgeEk7usZ9Tss70JJKgr1044SAXqxwpcTFrDkBKNmqQpk1eQ5oMHLRe0I9u MxVPlLY+w4m208BAiMRHpITs0B/339a4f4mEqxUl8MMuYRxh5qwe1PQs0mFGt4c8 de7NJUi0AluTMq4soDJUPUQDmknLabUwYL2Dg9EG+x2I66wZI9Rxi44PjUmuXmLf Ig0uHapD6z4DS6HZGAUt+RAKXW4jTgig1vElYgtMUK237FRuDbPvqOzE0fCLfHU= =uymq -----END PGP SIGNATURE----- --K8nIJk4ghYZn606h--

----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes. In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers. 1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI. 4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged. That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel

----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:03:02 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes.
In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers.
1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI.
This is pointless, we will have to ask someone (or add another user) to +1 the patch to enable the CI :-)
4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged.
That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
I think Oved proposal is simpler and more useful. However, we need a way to run *any* ci jobs even on a draft. If we cannot afford this automatically with current system, lets add a way to trigger this manually from gerrit. So this becomes simply: 1. drafts do not run the ci automatically, but the owner can run the ci manually 2. published patches run the light ci jobs automatically, owner can run heavy jobs manually 3. run all ci jobs before merge, maintain can ignore ci results This works with the system David suggested in the past, where each project implement scripts for checking patches and checking patches before merge. Nir

--ieNMXl1Fr3cevapt Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 06/04, Nir Soffer wrote: >=20 >=20 > ----- Original Message ----- > > From: "Oved Ourfali" <ovedo@redhat.com> > > To: "Eyal Edri" <eedri@redhat.com> > > Cc: devel@ovirt.org, infra@ovirt.org > > Sent: Thursday, June 4, 2015 10:03:02 AM > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > >=20 > >=20 > >=20 > > ----- Original Message ----- > > > From: "Eyal Edri" <eedri@redhat.com> > > > To: "Sandro Bonazzola" <sbonazzo@redhat.com> > > > Cc: infra@ovirt.org, devel@ovirt.org > > > Sent: Thursday, June 4, 2015 9:46:40 AM > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > >=20 > > >=20 > > >=20 > > > ----- Original Message ----- > > > > From: "Sandro Bonazzola" <sbonazzo@redhat.com> > > > > To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.co= m> > > > > Cc: devel@ovirt.org, infra@ovirt.org > > > > Sent: Thursday, June 4, 2015 9:11:10 AM > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > >=20 > > > > Il 03/06/2015 21:46, Eyal Edri ha scritto: > > > > >=20 > > > > >=20 > > > > > ----- Original Message ----- > > > > >> From: "Max Kovgan" <mkovgan@redhat.com> > > > > >> To: devel@ovirt.org > > > > >> Cc: infra@ovirt.org > > > > >> Sent: Wednesday, June 3, 2015 8:22:54 PM > > > > >> Subject: [ovirt-devel] gerrit+ci improvement proposal > > > > >> > > > > >> Hi everyone! > > > > >> We really want to have reliable and snappy CI: to allow short cy= cles > > > > >> and > > > > >> encourage developers to write tests. > > > > >> > > > > >> # Problem > > > > >> > > > > >> Many patches are neither ready for review nor for CI upon submis= sion, > > > > >> which > > > > >> is OK. > > > > >> But running all the jobs on those patches with limited resources > > > > >> results > > > > >> in: > > > > >> overloaded resources, slow response time, unhappy developers. > > > > >> > > > > >> # Proposed Solution > > > > >> > > > > >> To run less jobs we know we don=E2=80=99t need to, thus making m= ore resources > > > > >> for > > > > >> the > > > > >> jobs we need to run. > > > > >> We have been experimenting to make our CI stabler and quicker to > > > > >> respond > > > > >> by > > > > >> using gerrit flags. This has improved in both directions very we= ll > > > > >> internally. > > > > >> Now it seems a good time to let all the oVirt projects to use th= is. > > > > >> This solution indirectly promotes reviews and quick tests - =E2= =80=9Cto fail > > > > >> early=E2=80=9D, > > > > >> yet full blown static code analysis and long tests to run =E2=80= =9Cwhen > > > > >> ready=E2=80=9D. > > > > >> > > > > >> # How it works > > > > >> > > > > >> 2 new gerrit independent flags are added to gerrit. > > > > >> > > > > >> ## CI flag > > > > >> > > > > >> Will express patch CI status. Values: > > > > >> * +1 CI passed > > > > >> * 0 CI did not run yet > > > > >> * -1 CI failed > > > > >> Permissions for setting: project maintainers (for special cases) > > > > >> should > > > > >> be > > > > >> able to set/override (except Jenkins). > > > > >> > > > > >> ## Workflow flag > > > > >> > > > > >> Will express patch =E2=80=9Cworkflow=E2=80=9D state. Values: > > > > >> * 0 Work In Progress > > > > >> * +1 Ready For Review > > > > >> * +2 Ready For Merge > > > > >> Permissions for setting: Owner can set +1, Project Maintainers c= an set > > > > >> +2 > > > > >> > > > > >> ## Review + CI Integration: > > > > >> > > > > >> Merging [=E2=80=9CSubmit=E2=80=9D button to appear] will require= : Review+1, CI+1, > > > > >> Workflow+2 > > > > >> Patch lifecycle now is: > > > > >> --------------------------------------------------------------- > > > > >> patch state |owner |reviewer |maintainer |CI tests |pass > > > > >> --------------------------------------------------------------- > > > > >> added/updated |- |- |- |quick |CI+1 > > > > >> review |Workflow+1|Review+1 |- |heavy |CI+1 > > > > >> merge ready |- |- |Workflow+2 |gating |CI+1 > > > > >> merge |- |- |merge |merge |CI+1 > > > > >> > > > > >> Changes from current workflow: > > > > >> Owner only adds reviewers, now owner needs to set "Workflow+1" f= or the > > > > >> patch > > > > >> to be reviewed, and heavily auto-tested. > > > > >> Maintainer now needs to set "Workflow+2" and wait for "Submit" b= utton > > > > >> to > > > > >> appear after CI has completed running gating tests. > > > > >> > > > > >> > > > > >> Next step will be to automate merge the change after Workflow+2 = has > > > > >> been > > > > >> set > > > > >> by the Maintainer and gating tests passed. > > > > >> > > > > >> > > > > >> ## Why now? > > > > >> > > > > >> It is elimination of waste. The sooner - the better. > > > > >> The solution has been used for a while and it works. > > > > >> Resolving the problem without gerrit involved will lead to adding > > > > >> unreliable > > > > >> code into jobs, and will still be prone to problems: > > > > >> Just recently, 3d ago we=E2=80=99ve tried detecting what to ru= n from jenkins > > > > >> relying only on gerrit comments so that upon Verified+1, we=E2= =80=99d run > > > > >> the > > > > >> job. > > > > >> We could not use =E2=80=9CReview+1=E2=80=9D, because it makes = no sense at all, so we > > > > >> left > > > > >> the job to set Verified+1. > > > > >> Meaning - re-trigger itself immediately more than 1 times. > > > > >> =20 > > > > >> Jenkins and its visitors very unhappy, and we had to stop those > > > > >> jobs, > > > > >> clean > > > > >> up the queue, and spam developers. > > > > >> > > > > >> ## OK OK OK. Now what? > > > > >> > > > > >> Now we want your comments and opinions before pushing this furth= er: > > > > >> Please participate in this thread, so we can start trying it out. > > > > >> Ask, Suggest better ideas, all this is welcome. > > > > >> > > > > >> > > > > >> Best Regards! > > > > >> > > > > >> > > > > >> N.B. > > > > >> Of course, this is not written in stone, in case we find a better > > > > >> approach > > > > >> on > > > > >> solving those issues, we will change to it. > > > > >> And we will keep improving so don't be afraid that it will be > > > > >> enforced: > > > > >> if > > > > >> this does not work out we will discard it. > > > > >> > > > > >> P.S. > > > > >> Kudos to dcaro, most of the work was done by him, and most of th= is > > > > >> text > > > > >> too. > > > > >> > > > > >=20 > > > > > +1 from me, releasing CI from running non critical and un-essenti= al > > > > > jobs > > > > > will not only reduce load from ci, > > > > > and shorted response time for developers, it will allow us to add= much > > > > > more > > > > > powerful tests such as functional & system > > > > > tests that actually add hosts and run VMs, improving our ability = to > > > > > find > > > > > regression much more effectively. > > > > >=20 > > > > > Another benefit to consider is saving reviewers time. I.e not only > > > > > jenkins > > > > > benefits from Worklow+1, but also human reviewers. > > > > > Instead of looking at a patch that is too early to be reviewed, t= he > > > > > author > > > > > can set the Workflow+1 when the code is ready to review > > > > > (even if he didn't verified it yet), thus saving time to other > > > > > reviewers > > > > > - > > > > > for example people can add an email rule > > > > > to alert them only when they are added to patches that have Workf= low+1, > > > > > and > > > > > not before. > > > >=20 > > > > For human reviewers I suggest to keep using drafts until the patch = is > > > > finished. > > >=20 > > > keep using? how many developers do you know are working with drafts u= ntil > > > their patch is ready? > > > i agree if everyone would use drafts load on jenkins was already much > > > lower, > > > unfortunately its not the case. > > >=20 > >=20 > > IMO we don't need the "workflow" flag. > > I'm okay with CI not running on "drafts". And yes... we do use them. > > We can try and educate people to use them more where needed. > > Drafts should be widely used in first-phase development, and less on > > bug-fixes. > >=20 > > In addition, I think the patch owners shouldn't add reviewers, unless t= hey > > need their input in the stage of the development. > > Once they want input, they should add reviewers. > >=20 > > 1. So, if the patch is draft then no CI runs on it. > > 2. Once it turns into non-draft, you can run "light-CI" on it. > > 3. Once the patch has at least one +1 from a (human) reviewer, then it = should > > run the "heavy" CI. >=20 > This is pointless, we will have to ask someone (or add another user) to += 1 the > patch to enable the CI :-) >=20 > > 4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintaine= r), > > then it can be merged. > >=20 > > That's the process we have today, with slight change on when to run the= CI > > and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on= +1 > > patches). >=20 > I think Oved proposal is simpler and more useful. >=20 > However, we need a way to run *any* ci jobs even on a draft. If we cannot > afford this automatically with current system, lets add a way to trigger = this > manually from gerrit. >=20 > So this becomes simply: >=20 > 1. drafts do not run the ci automatically, but the owner can run the ci m= anually There are two ways I can think of such a trigger, through flags (that are easier to manage) or comments. The flags approach is the one that we exposed and tested on a few projects. > 2. published patches run the light ci jobs automatically, owner can run h= eavy jobs manually A big issue here is that there's currently no separation between light and heavy jobs, not even a list of what should be considered as such. So for starters any job is considered to be heavy. > 3. run all ci jobs before merge, maintain can ignore ci results >=20 > This works with the system David suggested in the past, where each projec= t=20 > implement scripts for checking patches and checking patches before merge. Both work with that system that was never implemented on any major project = so far due to reluctance to add ci scripts inside the code base (where I stron= gly think they belong, as the code base evolves, the scripts must change with i= t) So this is not yet implemented everywhere. >=20 > Nir > _______________________________________________ > Infra mailing list > Infra@ovirt.org > http://lists.ovirt.org/mailman/listinfo/infra --=20 David Caro Red Hat S.L. Continuous Integration Engineer - EMEA ENG Virtualization R&D Tel.: +420 532 294 605 Email: dcaro@redhat.com Web: www.redhat.com RHT Global #: 82-62605 --ieNMXl1Fr3cevapt Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVcA2CAAoJEEBxx+HSYmnDe/UH/06AN7Idm85qhypzl3pWv7Y6 7aCaY28VSFfnLprieMGYQXLVRDpDULiKCfja3nGVSUFIZYtYn0rEH3WCLa0+tmUB aS2FAFAvbXxZNfwXcMMJULf1X8fp74FCZU49BCWpnv+SAjWXPbytE+HPrpWI8nR+ PqoHut52Ba5NSDfY/SJT2nmBZrQdbSY5cJBwz3BoovkI3gjfKES5/rd1AAlS8Wip xEUOzcGU3HDD9ucjsp/AHMSHn0tP7U6rkWQSFZPdezjCtDU9LNBIF/TC+eTNuHfo hNmH884ZdLnGEsrrIMMOWD/cJWG5bMmvLfX6VOpXIBxQoSjztOg/I/d/SjLRsnQ= =MvB8 -----END PGP SIGNATURE----- --ieNMXl1Fr3cevapt--

----- Original Message -----
From: "David Caro" <dcaroest@redhat.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "Oved Ourfali" <ovedo@redhat.com>, "Eyal Edri" <eedri@redhat.com>, infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 11:34:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
On 06/04, Nir Soffer wrote:
----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:03:02 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message ----- > From: "Max Kovgan" <mkovgan@redhat.com> > To: devel@ovirt.org > Cc: infra@ovirt.org > Sent: Wednesday, June 3, 2015 8:22:54 PM > Subject: [ovirt-devel] gerrit+ci improvement proposal > > Hi everyone! > We really want to have reliable and snappy CI: to allow short > cycles > and > encourage developers to write tests. > > # Problem > > Many patches are neither ready for review nor for CI upon > submission, > which > is OK. > But running all the jobs on those patches with limited resources > results > in: > overloaded resources, slow response time, unhappy developers. > > # Proposed Solution > > To run less jobs we know we don’t need to, thus making more > resources > for > the > jobs we need to run. > We have been experimenting to make our CI stabler and quicker to > respond > by > using gerrit flags. This has improved in both directions very well > internally. > Now it seems a good time to let all the oVirt projects to use > this. > This solution indirectly promotes reviews and quick tests - “to > fail > early”, > yet full blown static code analysis and long tests to run “when > ready”. > > # How it works > > 2 new gerrit independent flags are added to gerrit. > > ## CI flag > > Will express patch CI status. Values: > * +1 CI passed > * 0 CI did not run yet > * -1 CI failed > Permissions for setting: project maintainers (for special cases) > should > be > able to set/override (except Jenkins). > > ## Workflow flag > > Will express patch “workflow” state. Values: > * 0 Work In Progress > * +1 Ready For Review > * +2 Ready For Merge > Permissions for setting: Owner can set +1, Project Maintainers can > set > +2 > > ## Review + CI Integration: > > Merging [“Submit” button to appear] will require: Review+1, CI+1, > Workflow+2 > Patch lifecycle now is: > --------------------------------------------------------------- > patch state |owner |reviewer |maintainer |CI tests |pass > --------------------------------------------------------------- > added/updated |- |- |- |quick |CI+1 > review |Workflow+1|Review+1 |- |heavy |CI+1 > merge ready |- |- |Workflow+2 |gating |CI+1 > merge |- |- |merge |merge |CI+1 > > Changes from current workflow: > Owner only adds reviewers, now owner needs to set "Workflow+1" for > the > patch > to be reviewed, and heavily auto-tested. > Maintainer now needs to set "Workflow+2" and wait for "Submit" > button > to > appear after CI has completed running gating tests. > > > Next step will be to automate merge the change after Workflow+2 > has > been > set > by the Maintainer and gating tests passed. > > > ## Why now? > > It is elimination of waste. The sooner - the better. > The solution has been used for a while and it works. > Resolving the problem without gerrit involved will lead to adding > unreliable > code into jobs, and will still be prone to problems: > Just recently, 3d ago we’ve tried detecting what to run from > jenkins > relying only on gerrit comments so that upon Verified+1, we’d > run > the > job. > We could not use “Review+1”, because it makes no sense at all, > so we > left > the job to set Verified+1. > Meaning - re-trigger itself immediately more than 1 times. > > Jenkins and its visitors very unhappy, and we had to stop those > jobs, > clean > up the queue, and spam developers. > > ## OK OK OK. Now what? > > Now we want your comments and opinions before pushing this > further: > Please participate in this thread, so we can start trying it out. > Ask, Suggest better ideas, all this is welcome. > > > Best Regards! > > > N.B. > Of course, this is not written in stone, in case we find a better > approach > on > solving those issues, we will change to it. > And we will keep improving so don't be afraid that it will be > enforced: > if > this does not work out we will discard it. > > P.S. > Kudos to dcaro, most of the work was done by him, and most of this > text > too. >
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes.
In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers.
1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI.
This is pointless, we will have to ask someone (or add another user) to +1 the patch to enable the CI :-)
4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged.
That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
I think Oved proposal is simpler and more useful.
However, we need a way to run *any* ci jobs even on a draft. If we cannot afford this automatically with current system, lets add a way to trigger this manually from gerrit.
So this becomes simply:
1. drafts do not run the ci automatically, but the owner can run the ci manually
There are two ways I can think of such a trigger, through flags (that are easier to manage) or comments. The flags approach is the one that we exposed and tested on a few projects.
2. published patches run the light ci jobs automatically, owner can run heavy jobs manually
A big issue here is that there's currently no separation between light and heavy jobs, not even a list of what should be considered as such. So for starters any job is considered to be heavy.
3. run all ci jobs before merge, maintain can ignore ci results
This works with the system David suggested in the past, where each project implement scripts for checking patches and checking patches before merge.
Both work with that system that was never implemented on any major project so far due to reluctance to add ci scripts inside the code base (where I strongly think they belong, as the code base evolves, the scripts must change with it) So this is not yet implemented everywhere.
Did you send patches for vdsm for this? In vdsm, we can use the @slowtest to mark tests that must run only before merge. Currently the slow tests are always disabled in the ci, it would be nice to enable them before merge anyway. Another solution, add orthogonal decorator (@ci("configname")) to include/exclude test in some ci configuration. The ci can run the tests with NOSE_CI=full or similar flag to select tests. Nir

--UBnjLfzoMQYIXCvq Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 06/04, Nir Soffer wrote: >=20 >=20 > ----- Original Message ----- > > From: "David Caro" <dcaroest@redhat.com> > > To: "Nir Soffer" <nsoffer@redhat.com> > > Cc: "Oved Ourfali" <ovedo@redhat.com>, "Eyal Edri" <eedri@redhat.com>, = infra@ovirt.org, devel@ovirt.org > > Sent: Thursday, June 4, 2015 11:34:10 AM > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > >=20 > > On 06/04, Nir Soffer wrote: > > >=20 > > >=20 > > > ----- Original Message ----- > > > > From: "Oved Ourfali" <ovedo@redhat.com> > > > > To: "Eyal Edri" <eedri@redhat.com> > > > > Cc: devel@ovirt.org, infra@ovirt.org > > > > Sent: Thursday, June 4, 2015 10:03:02 AM > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > >=20 > > > >=20 > > > >=20 > > > > ----- Original Message ----- > > > > > From: "Eyal Edri" <eedri@redhat.com> > > > > > To: "Sandro Bonazzola" <sbonazzo@redhat.com> > > > > > Cc: infra@ovirt.org, devel@ovirt.org > > > > > Sent: Thursday, June 4, 2015 9:46:40 AM > > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > > >=20 > > > > >=20 > > > > >=20 > > > > > ----- Original Message ----- > > > > > > From: "Sandro Bonazzola" <sbonazzo@redhat.com> > > > > > > To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redha= t.com> > > > > > > Cc: devel@ovirt.org, infra@ovirt.org > > > > > > Sent: Thursday, June 4, 2015 9:11:10 AM > > > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > > > >=20 > > > > > > Il 03/06/2015 21:46, Eyal Edri ha scritto: > > > > > > >=20 > > > > > > >=20 > > > > > > > ----- Original Message ----- > > > > > > >> From: "Max Kovgan" <mkovgan@redhat.com> > > > > > > >> To: devel@ovirt.org > > > > > > >> Cc: infra@ovirt.org > > > > > > >> Sent: Wednesday, June 3, 2015 8:22:54 PM > > > > > > >> Subject: [ovirt-devel] gerrit+ci improvement proposal > > > > > > >> > > > > > > >> Hi everyone! > > > > > > >> We really want to have reliable and snappy CI: to allow short > > > > > > >> cycles > > > > > > >> and > > > > > > >> encourage developers to write tests. > > > > > > >> > > > > > > >> # Problem > > > > > > >> > > > > > > >> Many patches are neither ready for review nor for CI upon > > > > > > >> submission, > > > > > > >> which > > > > > > >> is OK. > > > > > > >> But running all the jobs on those patches with limited resou= rces > > > > > > >> results > > > > > > >> in: > > > > > > >> overloaded resources, slow response time, unhappy developers. > > > > > > >> > > > > > > >> # Proposed Solution > > > > > > >> > > > > > > >> To run less jobs we know we don=E2=80=99t need to, thus maki= ng more > > > > > > >> resources > > > > > > >> for > > > > > > >> the > > > > > > >> jobs we need to run. > > > > > > >> We have been experimenting to make our CI stabler and quicke= r to > > > > > > >> respond > > > > > > >> by > > > > > > >> using gerrit flags. This has improved in both directions ver= y well > > > > > > >> internally. > > > > > > >> Now it seems a good time to let all the oVirt projects to use > > > > > > >> this. > > > > > > >> This solution indirectly promotes reviews and quick tests - = =E2=80=9Cto > > > > > > >> fail > > > > > > >> early=E2=80=9D, > > > > > > >> yet full blown static code analysis and long tests to run = =E2=80=9Cwhen > > > > > > >> ready=E2=80=9D. > > > > > > >> > > > > > > >> # How it works > > > > > > >> > > > > > > >> 2 new gerrit independent flags are added to gerrit. > > > > > > >> > > > > > > >> ## CI flag > > > > > > >> > > > > > > >> Will express patch CI status. Values: > > > > > > >> * +1 CI passed > > > > > > >> * 0 CI did not run yet > > > > > > >> * -1 CI failed > > > > > > >> Permissions for setting: project maintainers (for special ca= ses) > > > > > > >> should > > > > > > >> be > > > > > > >> able to set/override (except Jenkins). > > > > > > >> > > > > > > >> ## Workflow flag > > > > > > >> > > > > > > >> Will express patch =E2=80=9Cworkflow=E2=80=9D state. Values: > > > > > > >> * 0 Work In Progress > > > > > > >> * +1 Ready For Review > > > > > > >> * +2 Ready For Merge > > > > > > >> Permissions for setting: Owner can set +1, Project Maintaine= rs can > > > > > > >> set > > > > > > >> +2 > > > > > > >> > > > > > > >> ## Review + CI Integration: > > > > > > >> > > > > > > >> Merging [=E2=80=9CSubmit=E2=80=9D button to appear] will req= uire: Review+1, CI+1, > > > > > > >> Workflow+2 > > > > > > >> Patch lifecycle now is: > > > > > > >> ------------------------------------------------------------= --- > > > > > > >> patch state |owner |reviewer |maintainer |CI tests |pa= ss > > > > > > >> ------------------------------------------------------------= --- > > > > > > >> added/updated |- |- |- |quick |CI= +1 > > > > > > >> review |Workflow+1|Review+1 |- |heavy |CI+1 > > > > > > >> merge ready |- |- |Workflow+2 |gating |CI= +1 > > > > > > >> merge |- |- |merge |merge |CI= +1 > > > > > > >> > > > > > > >> Changes from current workflow: > > > > > > >> Owner only adds reviewers, now owner needs to set "Workflow+= 1" for > > > > > > >> the > > > > > > >> patch > > > > > > >> to be reviewed, and heavily auto-tested. > > > > > > >> Maintainer now needs to set "Workflow+2" and wait for "Submi= t" > > > > > > >> button > > > > > > >> to > > > > > > >> appear after CI has completed running gating tests. > > > > > > >> > > > > > > >> > > > > > > >> Next step will be to automate merge the change after Workflo= w+2 > > > > > > >> has > > > > > > >> been > > > > > > >> set > > > > > > >> by the Maintainer and gating tests passed. > > > > > > >> > > > > > > >> > > > > > > >> ## Why now? > > > > > > >> > > > > > > >> It is elimination of waste. The sooner - the better. > > > > > > >> The solution has been used for a while and it works. > > > > > > >> Resolving the problem without gerrit involved will lead to a= dding > > > > > > >> unreliable > > > > > > >> code into jobs, and will still be prone to problems: > > > > > > >> Just recently, 3d ago we=E2=80=99ve tried detecting what t= o run from > > > > > > >> jenkins > > > > > > >> relying only on gerrit comments so that upon Verified+1, w= e=E2=80=99d > > > > > > >> run > > > > > > >> the > > > > > > >> job. > > > > > > >> We could not use =E2=80=9CReview+1=E2=80=9D, because it ma= kes no sense at all, > > > > > > >> so we > > > > > > >> left > > > > > > >> the job to set Verified+1. > > > > > > >> Meaning - re-trigger itself immediately more than 1 times. > > > > > > >> =20 > > > > > > >> Jenkins and its visitors very unhappy, and we had to stop = those > > > > > > >> jobs, > > > > > > >> clean > > > > > > >> up the queue, and spam developers. > > > > > > >> > > > > > > >> ## OK OK OK. Now what? > > > > > > >> > > > > > > >> Now we want your comments and opinions before pushing this > > > > > > >> further: > > > > > > >> Please participate in this thread, so we can start trying it= out. > > > > > > >> Ask, Suggest better ideas, all this is welcome. > > > > > > >> > > > > > > >> > > > > > > >> Best Regards! > > > > > > >> > > > > > > >> > > > > > > >> N.B. > > > > > > >> Of course, this is not written in stone, in case we find a b= etter > > > > > > >> approach > > > > > > >> on > > > > > > >> solving those issues, we will change to it. > > > > > > >> And we will keep improving so don't be afraid that it will be > > > > > > >> enforced: > > > > > > >> if > > > > > > >> this does not work out we will discard it. > > > > > > >> > > > > > > >> P.S. > > > > > > >> Kudos to dcaro, most of the work was done by him, and most o= f this > > > > > > >> text > > > > > > >> too. > > > > > > >> > > > > > > >=20 > > > > > > > +1 from me, releasing CI from running non critical and un-ess= ential > > > > > > > jobs > > > > > > > will not only reduce load from ci, > > > > > > > and shorted response time for developers, it will allow us to= add > > > > > > > much > > > > > > > more > > > > > > > powerful tests such as functional & system > > > > > > > tests that actually add hosts and run VMs, improving our abil= ity to > > > > > > > find > > > > > > > regression much more effectively. > > > > > > >=20 > > > > > > > Another benefit to consider is saving reviewers time. I.e not= only > > > > > > > jenkins > > > > > > > benefits from Worklow+1, but also human reviewers. > > > > > > > Instead of looking at a patch that is too early to be reviewe= d, the > > > > > > > author > > > > > > > can set the Workflow+1 when the code is ready to review > > > > > > > (even if he didn't verified it yet), thus saving time to other > > > > > > > reviewers > > > > > > > - > > > > > > > for example people can add an email rule > > > > > > > to alert them only when they are added to patches that have > > > > > > > Workflow+1, > > > > > > > and > > > > > > > not before. > > > > > >=20 > > > > > > For human reviewers I suggest to keep using drafts until the pa= tch is > > > > > > finished. > > > > >=20 > > > > > keep using? how many developers do you know are working with draf= ts > > > > > until > > > > > their patch is ready? > > > > > i agree if everyone would use drafts load on jenkins was already = much > > > > > lower, > > > > > unfortunately its not the case. > > > > >=20 > > > >=20 > > > > IMO we don't need the "workflow" flag. > > > > I'm okay with CI not running on "drafts". And yes... we do use them. > > > > We can try and educate people to use them more where needed. > > > > Drafts should be widely used in first-phase development, and less on > > > > bug-fixes. > > > >=20 > > > > In addition, I think the patch owners shouldn't add reviewers, unle= ss > > > > they > > > > need their input in the stage of the development. > > > > Once they want input, they should add reviewers. > > > >=20 > > > > 1. So, if the patch is draft then no CI runs on it. > > > > 2. Once it turns into non-draft, you can run "light-CI" on it. > > > > 3. Once the patch has at least one +1 from a (human) reviewer, then= it > > > > should > > > > run the "heavy" CI. > > >=20 > > > This is pointless, we will have to ask someone (or add another user) = to +1 > > > the > > > patch to enable the CI :-) > > >=20 > > > > 4. Once the patch has +1 from heavy CI, and +2 from reviewer > > > > (maintainer), > > > > then it can be merged. > > > >=20 > > > > That's the process we have today, with slight change on when to run= the > > > > CI > > > > and what CI to run (no CI on drafts, light CI on non-draft, heavy C= I on > > > > +1 > > > > patches). > > >=20 > > > I think Oved proposal is simpler and more useful. > > >=20 > > > However, we need a way to run *any* ci jobs even on a draft. If we ca= nnot > > > afford this automatically with current system, lets add a way to trig= ger > > > this > > > manually from gerrit. > > >=20 > > > So this becomes simply: > > >=20 > > > 1. drafts do not run the ci automatically, but the owner can run the = ci > > > manually > >=20 > > There are two ways I can think of such a trigger, through flags (that a= re > > easier to manage) or comments. The flags approach is the one that we ex= posed > > and tested on a few projects. > >=20 > > > 2. published patches run the light ci jobs automatically, owner can r= un > > > heavy jobs manually > >=20 > > A big issue here is that there's currently no separation between light = and > > heavy jobs, not even a list of what should be considered as such. So for > > starters any job is considered to be heavy. > >=20 > > > 3. run all ci jobs before merge, maintain can ignore ci results > > >=20 > > > This works with the system David suggested in the past, where each pr= oject > > > implement scripts for checking patches and checking patches before me= rge. > >=20 > > Both work with that system that was never implemented on any major proj= ect so > > far due to reluctance to add ci scripts inside the code base (where I > > strongly > > think they belong, as the code base evolves, the scripts must change wi= th it) > > So this is not yet implemented everywhere. >=20 > Did you send patches for vdsm for this? Done now, please review and merge, and add there any tests you want. https://gerrit.ovirt.org/41928 >=20 > In vdsm, we can use the @slowtest to mark tests that must run only before= merge. > Currently the slow tests are always disabled in the ci, it would be nice = to enable > them before merge anyway. >=20 > Another solution, add orthogonal decorator (@ci("configname")) to include= /exclude > test in some ci configuration. >=20 > The ci can run the tests with NOSE_CI=3Dfull or similar flag to select te= sts. >=20 That's not relevant in any way to the ci env, however you decide to run any tests and whatever tools you decide to use is up to you. > Nir --=20 David Caro Red Hat S.L. Continuous Integration Engineer - EMEA ENG Virtualization R&D Tel.: +420 532 294 605 Email: dcaro@redhat.com Web: www.redhat.com RHT Global #: 82-62605 --UBnjLfzoMQYIXCvq Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVcBX4AAoJEEBxx+HSYmnDWbMH/RmgI4PNl67xhPXFnm69lo7p qZtWzfhzcOi43HR0tePRpTZvFIl06UiuWnQOPpcfzmApfpAFAngtQcFZgdukG3pK ekSVJQgWBjw/wEszJ6qqrzQUKIXig3V9jVjYwYedoFcIKBiQCKeRTJz7mytyH3RB Qs6EAS1S54VOiLiJi0jnj9DpnBXhrIBpRFTV6QwBYesqNtPpuzp4SX4JNaWlfqKf ECKBMASrod+w8uZeEkce+qce+S/I6FsU0o0goM7JhmqhwJ+Lk8KUWm4P+yv1fibb TGyYkH48c1vieQZda1+S4ZLDjwoQrcdKKUW4bS+LAVWFHA7eXByJUQhQytj9HSU= =7E0W -----END PGP SIGNATURE----- --UBnjLfzoMQYIXCvq--

----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:03:02 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes.
In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers.
1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI. 4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged.
That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
+1 This is he right approach to go (I am also using drafts and if other don't, we can change that....) Also, regarding the claim that publishing a draft is a one-way process, I don't think that this is problematic, you should publish a draft after it is stable and you addressed all comments and run all tests locally
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

----- Original Message -----
From: "Eli Mesika" <emesika@redhat.com> To: "Oved Ourfali" <ovedo@redhat.com> Cc: "Eyal Edri" <eedri@redhat.com>, infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 3:49:05 PM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:03:02 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
Next step will be to automate merge the change after Workflow+2 has been set by the Maintainer and gating tests passed.
## Why now?
It is elimination of waste. The sooner - the better. The solution has been used for a while and it works. Resolving the problem without gerrit involved will lead to adding unreliable code into jobs, and will still be prone to problems: Just recently, 3d ago we’ve tried detecting what to run from jenkins relying only on gerrit comments so that upon Verified+1, we’d run the job. We could not use “Review+1”, because it makes no sense at all, so we left the job to set Verified+1. Meaning - re-trigger itself immediately more than 1 times.
Jenkins and its visitors very unhappy, and we had to stop those jobs, clean up the queue, and spam developers.
## OK OK OK. Now what?
Now we want your comments and opinions before pushing this further: Please participate in this thread, so we can start trying it out. Ask, Suggest better ideas, all this is welcome.
Best Regards!
N.B. Of course, this is not written in stone, in case we find a better approach on solving those issues, we will change to it. And we will keep improving so don't be afraid that it will be enforced: if this does not work out we will discard it.
P.S. Kudos to dcaro, most of the work was done by him, and most of this text too.
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes.
In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers.
1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI. 4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged.
That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
+1
This is he right approach to go (I am also using drafts and if other don't, we can change that....) Also, regarding the claim that publishing a draft is a one-way process, I don't think that this is problematic, you should publish a draft after it is stable and you addressed all comments and run all tests locally
this might be true, but the problem is: 1. we can't enforce people to use drafts (technically), so until they do, we'll still have a resource problem 2. until we do, even "light ci" jobs running per patch will overload the ci without need, this is why relying on another flag will help - if adding workflow is a problem, we can use the CR+1 as first attempt to improve the flow, and consider in the future to use workflow if it will be needed. (maybe we can even set it automatically somehow)
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
Max Kovgan
Senior Software Engineer Red Hat - EMEA ENG Virtualization R&D Tel.: +972 9769 2060 Email: mkovgan [at] redhat [dot] com Web: http://www.redhat.com RHT Global #: 82-72060
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Eli Mesika" <emesika@redhat.com> Cc: "Oved Ourfali" <ovedo@redhat.com>, devel@ovirt.org, infra@ovirt.org Sent: Sunday, June 7, 2015 9:52:15 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eli Mesika" <emesika@redhat.com> To: "Oved Ourfali" <ovedo@redhat.com> Cc: "Eyal Edri" <eedri@redhat.com>, infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 3:49:05 PM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:03:02 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto:
----- Original Message ----- > From: "Max Kovgan" <mkovgan@redhat.com> > To: devel@ovirt.org > Cc: infra@ovirt.org > Sent: Wednesday, June 3, 2015 8:22:54 PM > Subject: [ovirt-devel] gerrit+ci improvement proposal > > Hi everyone! > We really want to have reliable and snappy CI: to allow short > cycles > and > encourage developers to write tests. > > # Problem > > Many patches are neither ready for review nor for CI upon > submission, > which > is OK. > But running all the jobs on those patches with limited resources > results > in: > overloaded resources, slow response time, unhappy developers. > > # Proposed Solution > > To run less jobs we know we don’t need to, thus making more > resources > for > the > jobs we need to run. > We have been experimenting to make our CI stabler and quicker to > respond > by > using gerrit flags. This has improved in both directions very well > internally. > Now it seems a good time to let all the oVirt projects to use > this. > This solution indirectly promotes reviews and quick tests - “to > fail > early”, > yet full blown static code analysis and long tests to run “when > ready”. > > # How it works > > 2 new gerrit independent flags are added to gerrit. > > ## CI flag > > Will express patch CI status. Values: > * +1 CI passed > * 0 CI did not run yet > * -1 CI failed > Permissions for setting: project maintainers (for special cases) > should > be > able to set/override (except Jenkins). > > ## Workflow flag > > Will express patch “workflow” state. Values: > * 0 Work In Progress > * +1 Ready For Review > * +2 Ready For Merge > Permissions for setting: Owner can set +1, Project Maintainers can > set > +2 > > ## Review + CI Integration: > > Merging [“Submit” button to appear] will require: Review+1, CI+1, > Workflow+2 > Patch lifecycle now is: > --------------------------------------------------------------- > patch state |owner |reviewer |maintainer |CI tests |pass > --------------------------------------------------------------- > added/updated |- |- |- |quick |CI+1 > review |Workflow+1|Review+1 |- |heavy |CI+1 > merge ready |- |- |Workflow+2 |gating |CI+1 > merge |- |- |merge |merge |CI+1 > > Changes from current workflow: > Owner only adds reviewers, now owner needs to set "Workflow+1" for > the > patch > to be reviewed, and heavily auto-tested. > Maintainer now needs to set "Workflow+2" and wait for "Submit" > button > to > appear after CI has completed running gating tests. > > > Next step will be to automate merge the change after Workflow+2 > has > been > set > by the Maintainer and gating tests passed. > > > ## Why now? > > It is elimination of waste. The sooner - the better. > The solution has been used for a while and it works. > Resolving the problem without gerrit involved will lead to adding > unreliable > code into jobs, and will still be prone to problems: > Just recently, 3d ago we’ve tried detecting what to run from > jenkins > relying only on gerrit comments so that upon Verified+1, we’d > run > the > job. > We could not use “Review+1”, because it makes no sense at all, > so > we > left > the job to set Verified+1. > Meaning - re-trigger itself immediately more than 1 times. > > Jenkins and its visitors very unhappy, and we had to stop those > jobs, > clean > up the queue, and spam developers. > > ## OK OK OK. Now what? > > Now we want your comments and opinions before pushing this > further: > Please participate in this thread, so we can start trying it out. > Ask, Suggest better ideas, all this is welcome. > > > Best Regards! > > > N.B. > Of course, this is not written in stone, in case we find a better > approach > on > solving those issues, we will change to it. > And we will keep improving so don't be afraid that it will be > enforced: > if > this does not work out we will discard it. > > P.S. > Kudos to dcaro, most of the work was done by him, and most of this > text > too. >
+1 from me, releasing CI from running non critical and un-essential jobs will not only reduce load from ci, and shorted response time for developers, it will allow us to add much more powerful tests such as functional & system tests that actually add hosts and run VMs, improving our ability to find regression much more effectively.
Another benefit to consider is saving reviewers time. I.e not only jenkins benefits from Worklow+1, but also human reviewers. Instead of looking at a patch that is too early to be reviewed, the author can set the Workflow+1 when the code is ready to review (even if he didn't verified it yet), thus saving time to other reviewers - for example people can add an email rule to alert them only when they are added to patches that have Workflow+1, and not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes.
In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers.
1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI. 4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged.
That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
+1
This is he right approach to go (I am also using drafts and if other don't, we can change that....) Also, regarding the claim that publishing a draft is a one-way process, I don't think that this is problematic, you should publish a draft after it is stable and you addressed all comments and run all tests locally
this might be true, but the problem is: 1. we can't enforce people to use drafts (technically), so until they do, we'll still have a resource problem
We can educate, and I don't see an issue with that.
2. until we do, even "light ci" jobs running per patch will overload the ci without need, this is why relying on another flag will help - if adding workflow is a problem, we can use the CR+1 as first attempt to improve the flow, and consider in the future to use workflow if it will be needed. (maybe we can even set it automatically somehow)
Perhaps marking as "verified" can be this flag. If the patch is verified by the author, then you run light CI on it. If it was also CR+1, run the heavy CI. That way you both don't need a new flag, and you don't waste resources on non-manually-verified bugs.
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
And one final note, for Workflow+2 -> this is a preparation for a gating system, like Zuul used by openstack, that in the future we might use as automatic merger pending passing a verification step. this will prevent errors that happen sometimes post merge due to conflicts or other issues, and will be another level of validation before final merge. But as max said, its all part of the plan and we'll test it of course before implementing to see its value.
> > > > Max Kovgan > > Senior Software Engineer > Red Hat - EMEA ENG Virtualization R&D > Tel.: +972 9769 2060 > Email: mkovgan [at] redhat [dot] com > Web: http://www.redhat.com > RHT Global #: 82-72060 > > _______________________________________________ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel _______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Sunday, June 7, 2015 9:55:56 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Eli Mesika" <emesika@redhat.com> Cc: "Oved Ourfali" <ovedo@redhat.com>, devel@ovirt.org, infra@ovirt.org Sent: Sunday, June 7, 2015 9:52:15 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eli Mesika" <emesika@redhat.com> To: "Oved Ourfali" <ovedo@redhat.com> Cc: "Eyal Edri" <eedri@redhat.com>, infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 3:49:05 PM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Oved Ourfali" <ovedo@redhat.com> To: "Eyal Edri" <eedri@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:03:02 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Eyal Edri" <eedri@redhat.com> To: "Sandro Bonazzola" <sbonazzo@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 9:46:40 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Sandro Bonazzola" <sbonazzo@redhat.com> To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 9:11:10 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
Il 03/06/2015 21:46, Eyal Edri ha scritto: > > > ----- Original Message ----- >> From: "Max Kovgan" <mkovgan@redhat.com> >> To: devel@ovirt.org >> Cc: infra@ovirt.org >> Sent: Wednesday, June 3, 2015 8:22:54 PM >> Subject: [ovirt-devel] gerrit+ci improvement proposal >> >> Hi everyone! >> We really want to have reliable and snappy CI: to allow short >> cycles >> and >> encourage developers to write tests. >> >> # Problem >> >> Many patches are neither ready for review nor for CI upon >> submission, >> which >> is OK. >> But running all the jobs on those patches with limited resources >> results >> in: >> overloaded resources, slow response time, unhappy developers. >> >> # Proposed Solution >> >> To run less jobs we know we don’t need to, thus making more >> resources >> for >> the >> jobs we need to run. >> We have been experimenting to make our CI stabler and quicker to >> respond >> by >> using gerrit flags. This has improved in both directions very >> well >> internally. >> Now it seems a good time to let all the oVirt projects to use >> this. >> This solution indirectly promotes reviews and quick tests - “to >> fail >> early”, >> yet full blown static code analysis and long tests to run “when >> ready”. >> >> # How it works >> >> 2 new gerrit independent flags are added to gerrit. >> >> ## CI flag >> >> Will express patch CI status. Values: >> * +1 CI passed >> * 0 CI did not run yet >> * -1 CI failed >> Permissions for setting: project maintainers (for special cases) >> should >> be >> able to set/override (except Jenkins). >> >> ## Workflow flag >> >> Will express patch “workflow” state. Values: >> * 0 Work In Progress >> * +1 Ready For Review >> * +2 Ready For Merge >> Permissions for setting: Owner can set +1, Project Maintainers >> can >> set >> +2 >> >> ## Review + CI Integration: >> >> Merging [“Submit” button to appear] will require: Review+1, >> CI+1, >> Workflow+2 >> Patch lifecycle now is: >> --------------------------------------------------------------- >> patch state |owner |reviewer |maintainer |CI tests |pass >> --------------------------------------------------------------- >> added/updated |- |- |- |quick |CI+1 >> review |Workflow+1|Review+1 |- |heavy |CI+1 >> merge ready |- |- |Workflow+2 |gating |CI+1 >> merge |- |- |merge |merge |CI+1 >> >> Changes from current workflow: >> Owner only adds reviewers, now owner needs to set "Workflow+1" >> for >> the >> patch >> to be reviewed, and heavily auto-tested. >> Maintainer now needs to set "Workflow+2" and wait for "Submit" >> button >> to >> appear after CI has completed running gating tests. >> >> >> Next step will be to automate merge the change after Workflow+2 >> has >> been >> set >> by the Maintainer and gating tests passed. >> >> >> ## Why now? >> >> It is elimination of waste. The sooner - the better. >> The solution has been used for a while and it works. >> Resolving the problem without gerrit involved will lead to >> adding >> unreliable >> code into jobs, and will still be prone to problems: >> Just recently, 3d ago we’ve tried detecting what to run from >> jenkins >> relying only on gerrit comments so that upon Verified+1, we’d >> run >> the >> job. >> We could not use “Review+1”, because it makes no sense at all, >> so >> we >> left >> the job to set Verified+1. >> Meaning - re-trigger itself immediately more than 1 times. >> >> Jenkins and its visitors very unhappy, and we had to stop >> those >> jobs, >> clean >> up the queue, and spam developers. >> >> ## OK OK OK. Now what? >> >> Now we want your comments and opinions before pushing this >> further: >> Please participate in this thread, so we can start trying it >> out. >> Ask, Suggest better ideas, all this is welcome. >> >> >> Best Regards! >> >> >> N.B. >> Of course, this is not written in stone, in case we find a >> better >> approach >> on >> solving those issues, we will change to it. >> And we will keep improving so don't be afraid that it will be >> enforced: >> if >> this does not work out we will discard it. >> >> P.S. >> Kudos to dcaro, most of the work was done by him, and most of >> this >> text >> too. >> > > +1 from me, releasing CI from running non critical and > un-essential > jobs > will not only reduce load from ci, > and shorted response time for developers, it will allow us to add > much > more > powerful tests such as functional & system > tests that actually add hosts and run VMs, improving our ability > to > find > regression much more effectively. > > Another benefit to consider is saving reviewers time. I.e not > only > jenkins > benefits from Worklow+1, but also human reviewers. > Instead of looking at a patch that is too early to be reviewed, > the > author > can set the Workflow+1 when the code is ready to review > (even if he didn't verified it yet), thus saving time to other > reviewers > - > for example people can add an email rule > to alert them only when they are added to patches that have > Workflow+1, > and > not before.
For human reviewers I suggest to keep using drafts until the patch is finished.
keep using? how many developers do you know are working with drafts until their patch is ready? i agree if everyone would use drafts load on jenkins was already much lower, unfortunately its not the case.
IMO we don't need the "workflow" flag. I'm okay with CI not running on "drafts". And yes... we do use them. We can try and educate people to use them more where needed. Drafts should be widely used in first-phase development, and less on bug-fixes.
In addition, I think the patch owners shouldn't add reviewers, unless they need their input in the stage of the development. Once they want input, they should add reviewers.
1. So, if the patch is draft then no CI runs on it. 2. Once it turns into non-draft, you can run "light-CI" on it. 3. Once the patch has at least one +1 from a (human) reviewer, then it should run the "heavy" CI. 4. Once the patch has +1 from heavy CI, and +2 from reviewer (maintainer), then it can be merged.
That's the process we have today, with slight change on when to run the CI and what CI to run (no CI on drafts, light CI on non-draft, heavy CI on +1 patches).
+1
This is he right approach to go (I am also using drafts and if other don't, we can change that....) Also, regarding the claim that publishing a draft is a one-way process, I don't think that this is problematic, you should publish a draft after it is stable and you addressed all comments and run all tests locally
this might be true, but the problem is: 1. we can't enforce people to use drafts (technically), so until they do, we'll still have a resource problem
We can educate, and I don't see an issue with that.
2. until we do, even "light ci" jobs running per patch will overload the ci without need, this is why relying on another flag will help - if adding workflow is a problem, we can use the CR+1 as first attempt to improve the flow, and consider in the future to use workflow if it will be needed. (maybe we can even set it automatically somehow)
Perhaps marking as "verified" can be this flag. If the patch is verified by the author, then you run light CI on it. If it was also CR+1, run the heavy CI.
question is how soon does an author ticks verify on his patch? does he wait for code review before? for e.g. - i heard from some developers they wait for CI to give them +1 until they even add reviewers, so this might be the chicken and egg problem.
That way you both don't need a new flag, and you don't waste resources on non-manually-verified bugs.
Once it's finished and humans reviewed the logic of the patch, Workflow+1 should be triggered allowing automation to check the correctness of the patch. IMHO there's no reason for wasting CI time on patches that will be correct from an automation point of view but nacked by reviewers. Especially if the patches are part of a big patchset.
> > And one final note, for Workflow+2 -> this is a preparation for a > gating > system, like Zuul used by openstack, that in the future > we might use as automatic merger pending passing a verification > step. > this > will prevent errors that happen sometimes > post merge due to conflicts or other issues, and will be another > level > of > validation before final merge. > But as max said, its all part of the plan and we'll test it of > course > before implementing to see its value. > > >> >> >> >> Max Kovgan >> >> Senior Software Engineer >> Red Hat - EMEA ENG Virtualization R&D >> Tel.: +972 9769 2060 >> Email: mkovgan [at] redhat [dot] com >> Web: http://www.redhat.com >> RHT Global #: 82-72060 >> >> _______________________________________________ >> Devel mailing list >> Devel@ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel > _______________________________________________ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel >
-- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. See how it works at redhat.com _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
_______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

--wtjvnLv0o8UUzur2 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 06/07, Oved Ourfali wrote: >=20 >=20 > ----- Original Message ----- > > From: "Eyal Edri" <eedri@redhat.com> > > To: "Eli Mesika" <emesika@redhat.com> > > Cc: "Oved Ourfali" <ovedo@redhat.com>, devel@ovirt.org, infra@ovirt.org > > Sent: Sunday, June 7, 2015 9:52:15 AM > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > >=20 > >=20 > >=20 > > ----- Original Message ----- > > > From: "Eli Mesika" <emesika@redhat.com> > > > To: "Oved Ourfali" <ovedo@redhat.com> > > > Cc: "Eyal Edri" <eedri@redhat.com>, infra@ovirt.org, devel@ovirt.org > > > Sent: Thursday, June 4, 2015 3:49:05 PM > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > >=20 > > >=20 > > >=20 > > > ----- Original Message ----- > > > > From: "Oved Ourfali" <ovedo@redhat.com> > > > > To: "Eyal Edri" <eedri@redhat.com> > > > > Cc: devel@ovirt.org, infra@ovirt.org > > > > Sent: Thursday, June 4, 2015 10:03:02 AM > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > >=20 > > > >=20 > > > >=20 > > > > ----- Original Message ----- > > > > > From: "Eyal Edri" <eedri@redhat.com> > > > > > To: "Sandro Bonazzola" <sbonazzo@redhat.com> > > > > > Cc: infra@ovirt.org, devel@ovirt.org > > > > > Sent: Thursday, June 4, 2015 9:46:40 AM > > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > > >=20 > > > > >=20 > > > > >=20 > > > > > ----- Original Message ----- > > > > > > From: "Sandro Bonazzola" <sbonazzo@redhat.com> > > > > > > To: "Eyal Edri" <eedri@redhat.com>, "Max Kovgan" <mkovgan@redha= t.com> > > > > > > Cc: devel@ovirt.org, infra@ovirt.org > > > > > > Sent: Thursday, June 4, 2015 9:11:10 AM > > > > > > Subject: Re: [ovirt-devel] gerrit+ci improvement proposal > > > > > >=20 > > > > > > Il 03/06/2015 21:46, Eyal Edri ha scritto: > > > > > > >=20 > > > > > > >=20 > > > > > > > ----- Original Message ----- > > > > > > >> From: "Max Kovgan" <mkovgan@redhat.com> > > > > > > >> To: devel@ovirt.org > > > > > > >> Cc: infra@ovirt.org > > > > > > >> Sent: Wednesday, June 3, 2015 8:22:54 PM > > > > > > >> Subject: [ovirt-devel] gerrit+ci improvement proposal > > > > > > >> > > > > > > >> Hi everyone! > > > > > > >> We really want to have reliable and snappy CI: to allow short > > > > > > >> cycles > > > > > > >> and > > > > > > >> encourage developers to write tests. > > > > > > >> > > > > > > >> # Problem > > > > > > >> > > > > > > >> Many patches are neither ready for review nor for CI upon > > > > > > >> submission, > > > > > > >> which > > > > > > >> is OK. > > > > > > >> But running all the jobs on those patches with limited resou= rces > > > > > > >> results > > > > > > >> in: > > > > > > >> overloaded resources, slow response time, unhappy developers. > > > > > > >> > > > > > > >> # Proposed Solution > > > > > > >> > > > > > > >> To run less jobs we know we don=E2=80=99t need to, thus maki= ng more > > > > > > >> resources > > > > > > >> for > > > > > > >> the > > > > > > >> jobs we need to run. > > > > > > >> We have been experimenting to make our CI stabler and quicke= r to > > > > > > >> respond > > > > > > >> by > > > > > > >> using gerrit flags. This has improved in both directions ver= y well > > > > > > >> internally. > > > > > > >> Now it seems a good time to let all the oVirt projects to use > > > > > > >> this. > > > > > > >> This solution indirectly promotes reviews and quick tests - = =E2=80=9Cto > > > > > > >> fail > > > > > > >> early=E2=80=9D, > > > > > > >> yet full blown static code analysis and long tests to run = =E2=80=9Cwhen > > > > > > >> ready=E2=80=9D. > > > > > > >> > > > > > > >> # How it works > > > > > > >> > > > > > > >> 2 new gerrit independent flags are added to gerrit. > > > > > > >> > > > > > > >> ## CI flag > > > > > > >> > > > > > > >> Will express patch CI status. Values: > > > > > > >> * +1 CI passed > > > > > > >> * 0 CI did not run yet > > > > > > >> * -1 CI failed > > > > > > >> Permissions for setting: project maintainers (for special ca= ses) > > > > > > >> should > > > > > > >> be > > > > > > >> able to set/override (except Jenkins). > > > > > > >> > > > > > > >> ## Workflow flag > > > > > > >> > > > > > > >> Will express patch =E2=80=9Cworkflow=E2=80=9D state. Values: > > > > > > >> * 0 Work In Progress > > > > > > >> * +1 Ready For Review > > > > > > >> * +2 Ready For Merge > > > > > > >> Permissions for setting: Owner can set +1, Project Maintaine= rs can > > > > > > >> set > > > > > > >> +2 > > > > > > >> > > > > > > >> ## Review + CI Integration: > > > > > > >> > > > > > > >> Merging [=E2=80=9CSubmit=E2=80=9D button to appear] will req= uire: Review+1, CI+1, > > > > > > >> Workflow+2 > > > > > > >> Patch lifecycle now is: > > > > > > >> ------------------------------------------------------------= --- > > > > > > >> patch state |owner |reviewer |maintainer |CI tests |pa= ss > > > > > > >> ------------------------------------------------------------= --- > > > > > > >> added/updated |- |- |- |quick |CI= +1 > > > > > > >> review |Workflow+1|Review+1 |- |heavy |CI+1 > > > > > > >> merge ready |- |- |Workflow+2 |gating |CI= +1 > > > > > > >> merge |- |- |merge |merge |CI= +1 > > > > > > >> > > > > > > >> Changes from current workflow: > > > > > > >> Owner only adds reviewers, now owner needs to set "Workflow+= 1" for > > > > > > >> the > > > > > > >> patch > > > > > > >> to be reviewed, and heavily auto-tested. > > > > > > >> Maintainer now needs to set "Workflow+2" and wait for "Submi= t" > > > > > > >> button > > > > > > >> to > > > > > > >> appear after CI has completed running gating tests. > > > > > > >> > > > > > > >> > > > > > > >> Next step will be to automate merge the change after Workflo= w+2 > > > > > > >> has > > > > > > >> been > > > > > > >> set > > > > > > >> by the Maintainer and gating tests passed. > > > > > > >> > > > > > > >> > > > > > > >> ## Why now? > > > > > > >> > > > > > > >> It is elimination of waste. The sooner - the better. > > > > > > >> The solution has been used for a while and it works. > > > > > > >> Resolving the problem without gerrit involved will lead to a= dding > > > > > > >> unreliable > > > > > > >> code into jobs, and will still be prone to problems: > > > > > > >> Just recently, 3d ago we=E2=80=99ve tried detecting what t= o run from > > > > > > >> jenkins > > > > > > >> relying only on gerrit comments so that upon Verified+1, w= e=E2=80=99d > > > > > > >> run > > > > > > >> the > > > > > > >> job. > > > > > > >> We could not use =E2=80=9CReview+1=E2=80=9D, because it ma= kes no sense at all, > > > > > > >> so > > > > > > >> we > > > > > > >> left > > > > > > >> the job to set Verified+1. > > > > > > >> Meaning - re-trigger itself immediately more than 1 times. > > > > > > >> =20 > > > > > > >> Jenkins and its visitors very unhappy, and we had to stop = those > > > > > > >> jobs, > > > > > > >> clean > > > > > > >> up the queue, and spam developers. > > > > > > >> > > > > > > >> ## OK OK OK. Now what? > > > > > > >> > > > > > > >> Now we want your comments and opinions before pushing this > > > > > > >> further: > > > > > > >> Please participate in this thread, so we can start trying it= out. > > > > > > >> Ask, Suggest better ideas, all this is welcome. > > > > > > >> > > > > > > >> > > > > > > >> Best Regards! > > > > > > >> > > > > > > >> > > > > > > >> N.B. > > > > > > >> Of course, this is not written in stone, in case we find a b= etter > > > > > > >> approach > > > > > > >> on > > > > > > >> solving those issues, we will change to it. > > > > > > >> And we will keep improving so don't be afraid that it will be > > > > > > >> enforced: > > > > > > >> if > > > > > > >> this does not work out we will discard it. > > > > > > >> > > > > > > >> P.S. > > > > > > >> Kudos to dcaro, most of the work was done by him, and most o= f this > > > > > > >> text > > > > > > >> too. > > > > > > >> > > > > > > >=20 > > > > > > > +1 from me, releasing CI from running non critical and un-ess= ential > > > > > > > jobs > > > > > > > will not only reduce load from ci, > > > > > > > and shorted response time for developers, it will allow us to= add > > > > > > > much > > > > > > > more > > > > > > > powerful tests such as functional & system > > > > > > > tests that actually add hosts and run VMs, improving our abil= ity to > > > > > > > find > > > > > > > regression much more effectively. > > > > > > >=20 > > > > > > > Another benefit to consider is saving reviewers time. I.e not= only > > > > > > > jenkins > > > > > > > benefits from Worklow+1, but also human reviewers. > > > > > > > Instead of looking at a patch that is too early to be reviewe= d, the > > > > > > > author > > > > > > > can set the Workflow+1 when the code is ready to review > > > > > > > (even if he didn't verified it yet), thus saving time to other > > > > > > > reviewers > > > > > > > - > > > > > > > for example people can add an email rule > > > > > > > to alert them only when they are added to patches that have > > > > > > > Workflow+1, > > > > > > > and > > > > > > > not before. > > > > > >=20 > > > > > > For human reviewers I suggest to keep using drafts until the pa= tch is > > > > > > finished. > > > > >=20 > > > > > keep using? how many developers do you know are working with draf= ts > > > > > until > > > > > their patch is ready? > > > > > i agree if everyone would use drafts load on jenkins was already = much > > > > > lower, > > > > > unfortunately its not the case. > > > > >=20 > > > >=20 > > > > IMO we don't need the "workflow" flag. > > > > I'm okay with CI not running on "drafts". And yes... we do use them. > > > > We can try and educate people to use them more where needed. > > > > Drafts should be widely used in first-phase development, and less on > > > > bug-fixes. > > > >=20 > > > > In addition, I think the patch owners shouldn't add reviewers, unle= ss > > > > they > > > > need their input in the stage of the development. > > > > Once they want input, they should add reviewers. > > > >=20 > > > > 1. So, if the patch is draft then no CI runs on it. > > > > 2. Once it turns into non-draft, you can run "light-CI" on it. > > > > 3. Once the patch has at least one +1 from a (human) reviewer, then= it > > > > should > > > > run the "heavy" CI. > > > > 4. Once the patch has +1 from heavy CI, and +2 from reviewer > > > > (maintainer), > > > > then it can be merged. > > > >=20 > > > > That's the process we have today, with slight change on when to run= the > > > > CI > > > > and what CI to run (no CI on drafts, light CI on non-draft, heavy C= I on > > > > +1 > > > > patches). > > >=20 > > > +1 > > >=20 > > > This is he right approach to go (I am also using drafts and if other = don't, > > > we can change that....) > > > Also, regarding the claim that publishing a draft is a one-way proces= s, I > > > don't think that this is problematic, you should publish a draft afte= r it > > > is > > > stable and you addressed all comments and run all tests locally > > >=20 > >=20 > > this might be true, but the problem is: > > 1. we can't enforce people to use drafts (technically), so until they = do, > > we'll still have a resource problem >=20 >=20 > We can educate, and I don't see an issue with that. >=20 > > 2. until we do, even "light ci" jobs running per patch will overload t= he ci > > without need, this is why relying on another > > flag will help - if adding workflow is a problem, we can use the CR= +1 as > > first attempt to improve the flow, > > and consider in the future to use workflow if it will be needed. (m= aybe > > we can even set it automatically somehow) > >=20 >=20 > Perhaps marking as "verified" can be this flag. > If the patch is verified by the author, then you run light CI on it. > If it was also CR+1, run the heavy CI. There's no easy way to do that on jenkins so far (Can only filter one flag). >=20 > That way you both don't need a new flag, and you don't waste resources on= non-manually-verified bugs. >=20 > > > >=20 > > > >=20 > > > >=20 > > > > > > Once it's finished and humans reviewed the logic of the patch, > > > > > > Workflow+1 > > > > > > should be triggered allowing automation to check the correctnes= s of > > > > > > the > > > > > > patch. > > > > > > IMHO there's no reason for wasting CI time on patches that will= be > > > > > > correct > > > > > > from an automation point of view but nacked by reviewers. Espec= ially > > > > > > if > > > > > > the > > > > > > patches are part of a big patchset. > > > > > >=20 > > > > > >=20 > > > > > > >=20 > > > > > > > And one final note, for Workflow+2 -> this is a preparation f= or a > > > > > > > gating > > > > > > > system, like Zuul used by openstack, that in the future > > > > > > > we might use as automatic merger pending passing a verificati= on > > > > > > > step. > > > > > > > this > > > > > > > will prevent errors that happen sometimes > > > > > > > post merge due to conflicts or other issues, and will be anot= her > > > > > > > level > > > > > > > of > > > > > > > validation before final merge. > > > > > > > But as max said, its all part of the plan and we'll test it of > > > > > > > course > > > > > > > before implementing to see its value. > > > > > > >=20 > > > > > > >=20 > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Max Kovgan > > > > > > >> > > > > > > >> Senior Software Engineer > > > > > > >> Red Hat - EMEA ENG Virtualization R&D > > > > > > >> Tel.: +972 9769 2060 > > > > > > >> Email: mkovgan [at] redhat [dot] com > > > > > > >> Web: http://www.redhat.com > > > > > > >> RHT Global #: 82-72060 > > > > > > >> > > > > > > >> _______________________________________________ > > > > > > >> Devel mailing list > > > > > > >> Devel@ovirt.org > > > > > > >> http://lists.ovirt.org/mailman/listinfo/devel > > > > > > > _______________________________________________ > > > > > > > Devel mailing list > > > > > > > Devel@ovirt.org > > > > > > > http://lists.ovirt.org/mailman/listinfo/devel > > > > > > >=20 > > > > > >=20 > > > > > >=20 > > > > > > -- > > > > > > Sandro Bonazzola > > > > > > Better technology. Faster innovation. Powered by community > > > > > > collaboration. > > > > > > See how it works at redhat.com > > > > > > _______________________________________________ > > > > > > Infra mailing list > > > > > > Infra@ovirt.org > > > > > > http://lists.ovirt.org/mailman/listinfo/infra > > > > > >=20 > > > > > >=20 > > > > > > > > > > > _______________________________________________ > > > > > Devel mailing list > > > > > Devel@ovirt.org > > > > > http://lists.ovirt.org/mailman/listinfo/devel > > > > >=20 > > > > >=20 > > > > >=20 > > > > _______________________________________________ > > > > Infra mailing list > > > > Infra@ovirt.org > > > > http://lists.ovirt.org/mailman/listinfo/infra > > > >=20 > > > _______________________________________________ > > > Infra mailing list > > > Infra@ovirt.org > > > http://lists.ovirt.org/mailman/listinfo/infra > > >=20 > > >=20 > > > > > _______________________________________________ > > Infra mailing list > > Infra@ovirt.org > > http://lists.ovirt.org/mailman/listinfo/infra > >=20 > _______________________________________________ > Infra mailing list > Infra@ovirt.org > http://lists.ovirt.org/mailman/listinfo/infra --=20 David Caro Red Hat S.L. Continuous Integration Engineer - EMEA ENG Virtualization R&D Tel.: +420 532 294 605 Email: dcaro@redhat.com Web: www.redhat.com RHT Global #: 82-62605 --wtjvnLv0o8UUzur2 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVdAHDAAoJEEBxx+HSYmnD5N8H/3QNKSGdxYzIJNlHtxkhX2FU GBWSubmO6pI9uambHXAlm+kzmo4M+0haWueFhSl9z0o9zjX3ubTEjuoJtc6KWjTh wFowNZKv+TjOpnPIKEg/QInSWawJR1h+QYDaEvGp6RmeoGdUR2+RUcschu3y2bhV xDBsqKgNK0jkF/sVpCDuoEq+3gvwIdmrLsz7wz89JNhG0+Z+yMvtwX7dqxAMaz3z X48mnBPODX2LAzxS/+hzu4TwT6wDjMB/hAkJi5d7+lL/pgz5CqAKC8mjPujy0NHX bOH8Prkqslr88KNFX6KuLb2i5b76hv0jyaR+wp/NDoOUeerm8IvLa6IINZ84BeY= =Wr49 -----END PGP SIGNATURE----- --wtjvnLv0o8UUzur2--

----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed
-1 You must have CI error state. Most of the errors I have seen, the CI fail to run, and the failure is not related to the tested patch.
Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
-1 Cannot require CI+1, the CI is not reliable enough yet. Even if the CI will be reliable, a failed test which is not related to the submitted patch should not block unrelated changes.
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
-1 This means more work for developers. Instead, make workflow default to +1. If the want to disable the CI, she can set it to 0. Developers need more power and less process overhead. Nir

--CblX+4bnyfN0pR09 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 06/04, Nir Soffer wrote:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal =20 Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests. =20 # Problem =20 Many patches are neither ready for review nor for CI upon submission, w= hich is OK. But running all the jobs on those patches with limited resources result= s in: overloaded resources, slow response time, unhappy developers. =20 # Proposed Solution =20 To run less jobs we know we don=E2=80=99t need to, thus making more res= ources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respon= d by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - =E2=80=9Cto= fail early=E2=80=9D, yet full blown static code analysis and long tests to run =E2=80=9Cwhen= ready=E2=80=9D. =20 # How it works =20 2 new gerrit independent flags are added to gerrit. =20 ## CI flag =20 Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed =20 -1 =20 You must have CI error state. Most of the errors I have seen, the CI fail= to run, and the failure is not related to the tested patch.
You'll have to prove this, as the last time we discussed it less than 10% of the failures from the previous 2 weeks were that case. Also that's why maintainers can set +1 here and -1 is not a blocker.
Permissions for setting: project maintainers (for special cases) should= be able to set/override (except Jenkins). =20 ## Workflow flag =20 Will express patch =E2=80=9Cworkflow=E2=80=9D state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set = +2 =20 ## Review + CI Integration: =20 Merging [=E2=80=9CSubmit=E2=80=9D button to appear] will require: Revie= w+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1 =20 -1 =20 Cannot require CI+1, the CI is not reliable enough yet. =20 Even if the CI will be reliable, a failed test which is not related to th= e submitted
=20 patch should not block unrelated changes.
=20
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the =
This situation makes no sense, if it's reliable theres no error due to non changed stuff (unless the issue that triggers the error is part of your pat= ch history, because it's merged or you are based on a broken patch, in this ca= se it should fail and block) patch
to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests. =20 -1 =20 This means more work for developers. =20 Instead, make workflow default to +1. If the want to disable the CI, she = can set it to 0. =20 Developers need more power and less process overhead.
We have to make some stats, but from the small sample of patches that we checked, most than half of them were not ready to pass any ci, it's a bit m= ore work, if you keep using drafts, if you don't it's the same. It's not ment t= o be used alongside drafts but instead of it. So the only case where it increases the overhead is if you did not use draf= ts, in which case you should start using them, so no real downside there.
=20 Nir _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
--=20 David Caro Red Hat S.L. Continuous Integration Engineer - EMEA ENG Virtualization R&D Tel.: +420 532 294 605 Email: dcaro@redhat.com Web: www.redhat.com RHT Global #: 82-62605 --CblX+4bnyfN0pR09 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVb/+wAAoJEEBxx+HSYmnDudoH/2jmZobGgBTcFvwhuhOnC02B kblJcGOSUsjKlidbKmvJs/kp3fm3M+0CjmXpDq1lqKbdPECTTo+ZYx1uICE9YNPv Xin4xJAgNhUaSRnQTcRTdy4m9iBKFTD+62fPerq+i9dWidpi2h0jhCIOEU9JONHj cES3sQwNY6BZB3pMXHDXsWbyPkuVSpzrUQrQHbY/+a9w8f0q+1VP4l1f+wm9pRbg uIxIHGj5IqRKR94fsY7HNbJ1b67DLAKlEfvifH0cJEBuD/uV2pgZn8+ovyQEBsRJ PCc4HeiWRQE3siZfuXNeB7k0XgrEMCckQuzyoZAaA9+InRNVd7Fyy4wBn4ZyozA= =M9LY -----END PGP SIGNATURE----- --CblX+4bnyfN0pR09--

----- Original Message -----
From: "David Caro" <dcaroest@redhat.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "Max Kovgan" <mkovgan@redhat.com>, infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 10:35:12 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
On 06/04, Nir Soffer wrote:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed
-1
You must have CI error state. Most of the errors I have seen, the CI fail to run, and the failure is not related to the tested patch.
You'll have to prove this, as the last time we discussed it less than 10% of the failures from the previous 2 weeks were that case.
10% is 10% too much :-)
Also that's why maintainers can set +1 here and -1 is not a blocker.
Ok, than its fine.
Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
-1
Cannot require CI+1, the CI is not reliable enough yet.
Even if the CI will be reliable, a failed test which is not related to the submitted patch should not block unrelated changes.
This situation makes no sense, if it's reliable theres no error due to non changed stuff (unless the issue that triggers the error is part of your patch history, because it's merged or you are based on a broken patch, in this case it should fail and block)
Here is an example: http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/3314/ Failed due to flaky networking tests, the failure is not related to the patch, which changes multipath configuration for some device. Flaky tests should be fixed, but blocking development because of them is not productive. The purpose of the CI is to help us move faster, not slow us down. The pep8 tests is doing this right - it checks only the code added in the last patch. In the current situation, the only way to move forward is to fix the flaky tests or mark them as @brokentest.
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
-1
This means more work for developers.
Instead, make workflow default to +1. If the want to disable the CI, she can set it to 0.
Developers need more power and less process overhead.
We have to make some stats, but from the small sample of patches that we checked, most than half of them were not ready to pass any ci,
Of course they were not ready, it is expected and good, and save developer time. The best way to check if a patch is ready for the CI is to let the CI run it :-)
it's a bit more work, if you keep using drafts, if you don't it's the same. It's not ment to be used alongside drafts but instead of it.
So the only case where it increases the overhead is if you did not use drafts, in which case you should start using them, so no real downside there.
How draft will help here? the CI run the tests on draft currently, and it is very convenient. It would be even more convenient if I can disable the CI for a draft, because it is not complete enough to pass the tests. Nir

----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com> To: "David Caro" <dcaroest@redhat.com> Cc: devel@ovirt.org, infra@ovirt.org Sent: Thursday, June 4, 2015 10:57:30 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "David Caro" <dcaroest@redhat.com> To: "Nir Soffer" <nsoffer@redhat.com> Cc: "Max Kovgan" <mkovgan@redhat.com>, infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 10:35:12 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
On 06/04, Nir Soffer wrote:
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed
-1
You must have CI error state. Most of the errors I have seen, the CI fail to run, and the failure is not related to the tested patch.
You'll have to prove this, as the last time we discussed it less than 10% of the failures from the previous 2 weeks were that case.
10% is 10% too much :-)
Also that's why maintainers can set +1 here and -1 is not a blocker.
Ok, than its fine.
Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
-1
Cannot require CI+1, the CI is not reliable enough yet.
Even if the CI will be reliable, a failed test which is not related to the submitted patch should not block unrelated changes.
This situation makes no sense, if it's reliable theres no error due to non changed stuff (unless the issue that triggers the error is part of your patch history, because it's merged or you are based on a broken patch, in this case it should fail and block)
Here is an example:
http://jenkins.ovirt.org/job/vdsm_master_unit-tests_created/3314/
Failed due to flaky networking tests, the failure is not related to the patch, which changes multipath configuration for some device.
Flaky tests should be fixed, but blocking development because of them is not productive.
The purpose of the CI is to help us move faster, not slow us down.
The pep8 tests is doing this right - it checks only the code added in the last patch.
In the current situation, the only way to move forward is to fix the flaky tests or mark them as @brokentest.
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
-1
This means more work for developers.
Instead, make workflow default to +1. If the want to disable the CI, she can set it to 0.
Developers need more power and less process overhead.
We have to make some stats, but from the small sample of patches that we checked, most than half of them were not ready to pass any ci,
Of course they were not ready, it is expected and good, and save developer time.
The best way to check if a patch is ready for the CI is to let the CI run it :-)
it's a bit more work, if you keep using drafts, if you don't it's the same. It's not ment to be used alongside drafts but instead of it.
So the only case where it increases the overhead is if you did not use drafts, in which case you should start using them, so no real downside there.
How draft will help here? the CI run the tests on draft currently, and it is very convenient.
it might be convenient, but unfourtunately we don't have enough resources in terms of slaves/servers to to on drafts, which cause jobs in jenkins to run much longer than they should.... i agree in a perfect world without a resource problem, we would run all tests on all patches.
It would be even more convenient if I can disable the CI for a draft, because it is not complete enough to pass the tests.
Nir _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra

----- Original Message -----
From: "Nir Soffer" <nsoffer@redhat.com> To: "Max Kovgan" <mkovgan@redhat.com> Cc: infra@ovirt.org, devel@ovirt.org Sent: Thursday, June 4, 2015 10:26:22 AM Subject: Re: [ovirt-devel] gerrit+ci improvement proposal
----- Original Message -----
From: "Max Kovgan" <mkovgan@redhat.com> To: devel@ovirt.org Cc: infra@ovirt.org Sent: Wednesday, June 3, 2015 8:22:54 PM Subject: [ovirt-devel] gerrit+ci improvement proposal
Hi everyone! We really want to have reliable and snappy CI: to allow short cycles and encourage developers to write tests.
# Problem
Many patches are neither ready for review nor for CI upon submission, which is OK. But running all the jobs on those patches with limited resources results in: overloaded resources, slow response time, unhappy developers.
# Proposed Solution
To run less jobs we know we don’t need to, thus making more resources for the jobs we need to run. We have been experimenting to make our CI stabler and quicker to respond by using gerrit flags. This has improved in both directions very well internally. Now it seems a good time to let all the oVirt projects to use this. This solution indirectly promotes reviews and quick tests - “to fail early”, yet full blown static code analysis and long tests to run “when ready”.
# How it works
2 new gerrit independent flags are added to gerrit.
## CI flag
Will express patch CI status. Values: * +1 CI passed * 0 CI did not run yet * -1 CI failed
-1
You must have CI error state. Most of the errors I have seen, the CI fail to run, and the failure is not related to the tested patch.
its not possible to populate the exact error into the gerrit comment, that's why there is a link to the job. the fact that you've seen false positive errors is unfourtunate, but it doesn't represent the majority of patches, and each false positive error should be communicated to infra team so we can resolve it, it has nothing to do with this suggestion, on the contrary - once jenkins will run much less un-needed runs on patches, the probability for false positives will reduce dramatically.
Permissions for setting: project maintainers (for special cases) should be able to set/override (except Jenkins).
## Workflow flag
Will express patch “workflow” state. Values: * 0 Work In Progress * +1 Ready For Review * +2 Ready For Merge Permissions for setting: Owner can set +1, Project Maintainers can set +2
## Review + CI Integration:
Merging [“Submit” button to appear] will require: Review+1, CI+1, Workflow+2 Patch lifecycle now is: --------------------------------------------------------------- patch state |owner |reviewer |maintainer |CI tests |pass --------------------------------------------------------------- added/updated |- |- |- |quick |CI+1 review |Workflow+1|Review+1 |- |heavy |CI+1 merge ready |- |- |Workflow+2 |gating |CI+1 merge |- |- |merge |merge |CI+1
-1
Cannot require CI+1, the CI is not reliable enough yet.
Even if the CI will be reliable, a failed test which is not related to the submitted patch should not block unrelated changes.
you're repeating the same calim from above, same answer.
Changes from current workflow: Owner only adds reviewers, now owner needs to set "Workflow+1" for the patch to be reviewed, and heavily auto-tested. Maintainer now needs to set "Workflow+2" and wait for "Submit" button to appear after CI has completed running gating tests.
-1
This means more work for developers.
Instead, make workflow default to +1. If the want to disable the CI, she can set it to 0.
this will serve nothing, and jobs will keep running on each patchset as now, increasing the time you need to wait for each job to run and adding risk for failures. the only alternative to this is to start using Drafts for in progress patches, but only 1/4 of the patches are drafts atm, so we still are running more jobs on ci than we should, thus preventing us from running more complex and useful jobs.
Developers need more power and less process overhead.
Nir _______________________________________________ Infra mailing list Infra@ovirt.org http://lists.ovirt.org/mailman/listinfo/infra
participants (9)
-
Alexander Wels
-
David Caro
-
Eli Mesika
-
Eyal Edri
-
Greg Sheremeta
-
Max Kovgan
-
Nir Soffer
-
Oved Ourfali
-
Sandro Bonazzola