[Users] [GSOC][Gerrit] add potential reviewers - questions

Alon Bar-Lev alonbl at redhat.com
Wed Mar 12 19:42:00 UTC 2014



----- Original Message -----
> From: "Itamar Heim" <iheim at redhat.com>
> To: "Maor Lipchuk" <mlipchuk at redhat.com>, "Alon Bar-Lev" <alonbl at redhat.com>
> Cc: "Eli Mesika" <emesika at redhat.com>, users at ovirt.org, "Tomasz Kołek" <tomasz-kolek at o2.pl>, "infra"
> <infra at ovirt.org>
> Sent: Wednesday, March 12, 2014 9:29:28 PM
> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
> 
> On 03/12/2014 07:29 PM, Maor Lipchuk wrote:
> > On 03/12/2014 05:57 PM, Alon Bar-Lev wrote:
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Itamar Heim" <iheim at redhat.com>
> >>> To: "Eli Mesika" <emesika at redhat.com>, users at ovirt.org
> >>> Cc: "Maor Lipchuk" <mlipchuk at redhat.com>, "Tomasz Kołek"
> >>> <tomasz-kolek at o2.pl>, "infra" <infra at ovirt.org>
> >>> Sent: Wednesday, March 12, 2014 5:52:25 PM
> >>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
> >>>
> >>> On 03/12/2014 04:57 PM, Eli Mesika wrote:
> >>>>
> >>>>
> >>>> ----- Original Message -----
> >>>>> From: "David Caro" <dcaroest at redhat.com>
> >>>>> To: "Itamar Heim" <iheim at redhat.com>
> >>>>> Cc: "Maor Lipchuk" <mlipchuk at redhat.com>, users at ovirt.org, "Tomasz
> >>>>> Kołek"
> >>>>> <tomasz-kolek at o2.pl>, "infra"
> >>>>> <infra at ovirt.org>
> >>>>> Sent: Wednesday, March 12, 2014 11:01:21 AM
> >>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
> >>>>>
> >>>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote:
> >>>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote:
> >>>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote:
> >>>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ----- Original Message -----
> >>>>>>>>>> From: "Itamar Heim" <iheim at redhat.com>
> >>>>>>>>>> To: "Eyal Edri" <eedri at redhat.com>, "Tomasz Kołek"
> >>>>>>>>>> <tomasz-kolek at o2.pl>, users at ovirt.org, "infra" <infra at ovirt.org>
> >>>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM
> >>>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers -
> >>>>>>>>>> questions
> >>>>>>>>>>
> >>>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van Wijngaarden wrote:
> >>>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400, Eyal Edri wrote:
> >>>>>>>>>>>>> Tomasz Kołek wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've got a few questions about project description.
> >>>>>>>>>>>>> Please tell me if my problem's understanding is good or not.
> >>>>>>>>>>>>> We need to add a few flags/methods to git review module. This
> >>>>>>>>>>>>> flags
> >>>>>>>>>>>>> should
> >>>>>>>>>>>>> allow to add potential reviewers in gerrit.
> >>>>>>>>>>>>> So:
> >>>>>>>>>>>>> Let's assume that we've got special flags for this operations.
> >>>>>>>>>>>>> What's
> >>>>>>>>>>>>> next?
> >>>>>>>>>>>>> 1. In gerrit system we need to add special place for potential
> >>>>>>>>>>>>> reviewers?
> >>>>>>>>>>>>> 2. Potential reviewers should agree that they want to review?
> >>>>>>>>>>>>> 3. We can have more than one accepted reviewer?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm not sure i understood exactly what you mean by 'potential
> >>>>>>>>>>>> reviewers'.  do want gerrit (hook?) to automatically add
> >>>>>>>>>>>> reviewers
> >>>>>>>>>>>> to
> >>>>>>>>>>>> a patch according to the code sent?  so in fact you'll have a
> >>>>>>>>>>>> place
> >>>>>>>>>>>> somewhere for mapping code & specific developers?
> >>>>>>>>>>>
> >>>>>>>>>>> I really like this idea. Gerrit currently requires new users to
> >>>>>>>>>>> know
> >>>>>>>>>>> who
> >>>>>>>>>>> to add as reviewers, IMHO impeding new contributors.
> >>>>>>>>>>>
> >>>>>>>>>>> One relative simple solution would be to look at who recently
> >>>>>>>>>>> touched
> >>>>>>>>>>> the files that are being modified and add them as reviewers. This
> >>>>>>>>>>> can be
> >>>>>>>>>>> done by looking at the git log for a file. Some pseudo python
> >>>>>>>>>>> code
> >>>>>>>>>>> solution:
> >>>>>>>>>>>
> >>>>>>>>>>> reviewers = set()
> >>>>>>>>>>>
> >>>>>>>>>>> for modified_file in commit.files:
> >>>>>>>>>>>          reviewers += set(commit.author for commit in
> >>>>>>>>>>> git.log(modified_file))
> >>>>>>>>>>>
> >>>>>>>>>>> return reviewers
> >>>>>>>>>>>
> >>>>>>>>>>> This gives a system that those who touche a file, become the
> >>>>>>>>>>> maintainer
> >>>>>>>>>>> for that file. A more complex solution could improve on that and
> >>>>>>>>>>> limit
> >>>>>>>>>>> the reviewers added per patch. One can think of limiting to only
> >>>>>>>>>>> contributions in the last X months, weigh contributions so common
> >>>>>>>>>>> committers are prefered. It could also combine several methods.
> >>>>>>>>>>>
> >>>>>>>>>>> For example to limit to the 5 authors who touched the most files:
> >>>>>>>>>>>
> >>>>>>>>>>> reviewers = collections.Counter()  # New in python 2.7
> >>>>>>>>>>>
> >>>>>>>>>>> for modified_file in commit.files:
> >>>>>>>>>>>          reviewers += collections.Counter(commit.author for
> >>>>>>>>>>>          commit in
> >>>>>>>>>>>          git.log(modified_file))
> >>>>>>>>>>>
> >>>>>>>>>>> return [author for author, count in reviewers.most_common(5)]
> >>>>>>>>>>>
> >>>>>>>>>>> Since Counter also accepts a dictionary, one could also weigh the
> >>>>>>>>>>> touched lines per file. Downside there is big
> >>>>>>>>>>> whitespace/formatting
> >>>>>>>>>>> patches can skew the line count.
> >>>>>>>>>>>
> >>>>>>>>>>> In short, I think an entire thesis could be written on the
> >>>>>>>>>>> optimal
> >>>>>>>>>>> way
> >>>>>>>>>>> to determine reviewers but a simple algorithm could do to show
> >>>>>>>>>>> the
> >>>>>>>>>>> method works.
> >>>>>>>>>>>
> >>>>>>>>>>> Does this help?
> >>>>
> >>>> Maybe it will be worth to use the information we have in Bugzilla here:
> >>>>
> >>>> We can browse the BZ that were closed/verified in the last XXX days
> >>>> Per BZ , we know which patches are involved, who reviewed the patches,
> >>>> which files were changed, when files were changed and the rank of the
> >>>> change (number of lines changed)
> >>>> I believe that from this information we can compose a simple ranking
> >>>> algorithm that its output will be a list of N potential reviewers for
> >>>> the
> >>>> patch.
> >>>> Since we can aggregate the above information for all files related to
> >>>> the
> >>>> patch we want to add reviewers, we can have this set for the whole
> >>>> patch.
> >>>> This information should be processed and stored each N days and gerrit
> >>>> will
> >>>> be able to use it.
> >>>
> >>> why are we trying to invent hueristics, instead of declaring clear
> >>> ownership?
> >>
> >> Reminder: my source metadata plan that requires cooperation.
> >>
> >> Each source and component should have an explicit ownership up into bug
> >> database.
> >>
> >> I won't repeat it now, it is available at archives.
> >>
> > I think we are discussing two different issues here:
> > The first one considers the GSOC project, of adding reviewers
> > automatically to a patch, this can be done in many different heuristics
> > depending what the submitter prefers (use blame, maintainers who acked
> > the patches, bugs, list of owners to a file and so on).
> 
> that's the point. we should be using heuristics, rather ownership.

I want heuristics to solve bugs as well.
Maybe for each issue I will use heuristic and wait for X days for someone else to fix?
I would like to see how you build a building using heuristic reviews.

> this could still be done via a GSOC project for implementing the gerrit
> logic to do so.
> 
> >
> > The second issue is adding and maintain a list of owners by
> > files/folders in oVirt.
> > this is a specific use case to be used by the "add potential reviewers"
> > project.
> >
> >>>
> >>>>
> >>>>
> >>>>>>>>>>> _______________________________________________
> >>>>>>>>>>> Users mailing list
> >>>>>>>>>>> Users at ovirt.org
> >>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> I think if we do this, we want to make sure we cover per file who
> >>>>>>>>>> is
> >>>>>>>>>> required to +2 it before we consider it acked.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> won't it require maintaining static lists of people per
> >>>>>>>>> file/path/project?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> yes, but considering our project layout, i don't see an alternative.
> >>>>>>>> (some of the layout could be improved to be path based, rather than
> >>>>>>>> file
> >>>>>>>> based)
> >>>>>>> I think it could be done automatically by analysing the file and see
> >>>>>>> who
> >>>>>>> mostly changed it recently, since the "owner" of the file might be
> >>>>>>> dynamic, who ever changed most of it few days ago might be more
> >>>>>>> familiar
> >>>>>>> with it today
> >>>>>>>
> >>>>>>> IMO the algorithm of adding the reviewers should be flexible.
> >>>>>>> For example, using a folder which will contain files, where each file
> >>>>>>> implement an algorithm to add the reviewers.
> >>>>>>>
> >>>>>>> for instance we can have two files:
> >>>>>>> 1. Add a reviewers by blame - the contributor which changed recently
> >>>>>>> the
> >>>>>>> code lines
> >>>>>>> 2. Add a reviewers by file - the contributor who changed most of the
> >>>>>>> file recently.
> >>>>>>>
> >>>>>>> Each file will implement the functional operation and will output the
> >>>>>>> reviewers emails.
> >>>>>>>
> >>>>>>> The user can then add a new algorithm or change it to be more
> >>>>>>> specific
> >>>>>>> to its project.
> >>>>>>> for example the user can add also the maintainers which acked the
> >>>>>>> patch
> >>>>>>> that was blamed.
> >>>>>>>> _______________________________________________
> >>>>>>>> Users mailing list
> >>>>>>>> Users at ovirt.org
> >>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
> >>>>>>>
> >>>>>>
> >>>>>> this shouldn't be automatic. we need to clearly define ownership. we
> >>>>>> can't
> >>>>>> do
> >>>>>> this per repo for the engine/vdsm. we can do this per repo for the
> >>>>>> other
> >>>>>> repo's probably (though solving the folder/file approach would cover
> >>>>>> the
> >>>>>> simpler repos as a private case).
> >>>>>>
> >>>>>> yes, it will require some work, maybe some moving around of files to
> >>>>>> make
> >>>>>> this
> >>>>>> easier by folders (topics) which should be relevant anyway.
> >>>>>
> >>>>> I think it would easier to maintain if we just have one file at the
> >>>>> root, instead of having the ownership information distributed
> >>>>> throughout the files/directories. That way you'll know where to look at
> >>>>> to check/modify the ownership as opposed to having to walk all the
> >>>>> files and upper directories.
> >>>>>
> >>>>> Also, adding all that ownership logic to gerrit might not be easy, as
> >>>>> it's not meant to go checking the source of the repositories looking
> >>>>> for configuration. We might want to take a look (again) to zuul, the
> >>>>> tool that openstack uses as gateway to trigger jenkins jobs and merge
> >>>>> patches.
> >>>>>
> >>>>>> _______________________________________________
> >>>>>> Infra mailing list
> >>>>>> Infra at ovirt.org
> >>>>>> http://lists.ovirt.org/mailman/listinfo/infra
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> David Caro
> >>>>>
> >>>>> Red Hat S.L.
> >>>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
> >>>>>
> >>>>> Email: dcaro at redhat.com
> >>>>> Web: www.redhat.com
> >>>>> RHT Global #: 82-62605
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Infra mailing list
> >>>>> Infra at ovirt.org
> >>>>> http://lists.ovirt.org/mailman/listinfo/infra
> >>>>>
> >>>
> >>> _______________________________________________
> >>> Infra mailing list
> >>> Infra at ovirt.org
> >>> http://lists.ovirt.org/mailman/listinfo/infra
> >>>
> >
> 
> 



More information about the Users mailing list