[Users] [GSOC][Gerrit] add potential reviewers - questions
Alon Bar-Lev
alonbl at redhat.com
Wed Mar 12 17:32:47 UTC 2014
----- Original Message -----
> From: "Maor Lipchuk" <mlipchuk at redhat.com>
> To: "Alon Bar-Lev" <alonbl at redhat.com>, "Itamar Heim" <iheim at redhat.com>
> Cc: "Eli Mesika" <emesika at redhat.com>, users at ovirt.org, "Tomasz Kołek" <tomasz-kolek at o2.pl>, "infra"
> <infra at ovirt.org>
> Sent: Wednesday, March 12, 2014 7:29:40 PM
> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>
> On 03/12/2014 05:57 PM, Alon Bar-Lev wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Itamar Heim" <iheim at redhat.com>
> >> To: "Eli Mesika" <emesika at redhat.com>, users at ovirt.org
> >> Cc: "Maor Lipchuk" <mlipchuk at redhat.com>, "Tomasz Kołek"
> >> <tomasz-kolek at o2.pl>, "infra" <infra at ovirt.org>
> >> Sent: Wednesday, March 12, 2014 5:52:25 PM
> >> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
> >>
> >> On 03/12/2014 04:57 PM, Eli Mesika wrote:
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> From: "David Caro" <dcaroest at redhat.com>
> >>>> To: "Itamar Heim" <iheim at redhat.com>
> >>>> Cc: "Maor Lipchuk" <mlipchuk at redhat.com>, users at ovirt.org, "Tomasz
> >>>> Kołek"
> >>>> <tomasz-kolek at o2.pl>, "infra"
> >>>> <infra at ovirt.org>
> >>>> Sent: Wednesday, March 12, 2014 11:01:21 AM
> >>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
> >>>>
> >>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote:
> >>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote:
> >>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote:
> >>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ----- Original Message -----
> >>>>>>>>> From: "Itamar Heim" <iheim at redhat.com>
> >>>>>>>>> To: "Eyal Edri" <eedri at redhat.com>, "Tomasz Kołek"
> >>>>>>>>> <tomasz-kolek at o2.pl>, users at ovirt.org, "infra" <infra at ovirt.org>
> >>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM
> >>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers -
> >>>>>>>>> questions
> >>>>>>>>>
> >>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van Wijngaarden wrote:
> >>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400, Eyal Edri wrote:
> >>>>>>>>>>>> Tomasz Kołek wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've got a few questions about project description.
> >>>>>>>>>>>> Please tell me if my problem's understanding is good or not.
> >>>>>>>>>>>> We need to add a few flags/methods to git review module. This
> >>>>>>>>>>>> flags
> >>>>>>>>>>>> should
> >>>>>>>>>>>> allow to add potential reviewers in gerrit.
> >>>>>>>>>>>> So:
> >>>>>>>>>>>> Let's assume that we've got special flags for this operations.
> >>>>>>>>>>>> What's
> >>>>>>>>>>>> next?
> >>>>>>>>>>>> 1. In gerrit system we need to add special place for potential
> >>>>>>>>>>>> reviewers?
> >>>>>>>>>>>> 2. Potential reviewers should agree that they want to review?
> >>>>>>>>>>>> 3. We can have more than one accepted reviewer?
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not sure i understood exactly what you mean by 'potential
> >>>>>>>>>>> reviewers'. do want gerrit (hook?) to automatically add
> >>>>>>>>>>> reviewers
> >>>>>>>>>>> to
> >>>>>>>>>>> a patch according to the code sent? so in fact you'll have a
> >>>>>>>>>>> place
> >>>>>>>>>>> somewhere for mapping code & specific developers?
> >>>>>>>>>>
> >>>>>>>>>> I really like this idea. Gerrit currently requires new users to
> >>>>>>>>>> know
> >>>>>>>>>> who
> >>>>>>>>>> to add as reviewers, IMHO impeding new contributors.
> >>>>>>>>>>
> >>>>>>>>>> One relative simple solution would be to look at who recently
> >>>>>>>>>> touched
> >>>>>>>>>> the files that are being modified and add them as reviewers. This
> >>>>>>>>>> can be
> >>>>>>>>>> done by looking at the git log for a file. Some pseudo python code
> >>>>>>>>>> solution:
> >>>>>>>>>>
> >>>>>>>>>> reviewers = set()
> >>>>>>>>>>
> >>>>>>>>>> for modified_file in commit.files:
> >>>>>>>>>> reviewers += set(commit.author for commit in
> >>>>>>>>>> git.log(modified_file))
> >>>>>>>>>>
> >>>>>>>>>> return reviewers
> >>>>>>>>>>
> >>>>>>>>>> This gives a system that those who touche a file, become the
> >>>>>>>>>> maintainer
> >>>>>>>>>> for that file. A more complex solution could improve on that and
> >>>>>>>>>> limit
> >>>>>>>>>> the reviewers added per patch. One can think of limiting to only
> >>>>>>>>>> contributions in the last X months, weigh contributions so common
> >>>>>>>>>> committers are prefered. It could also combine several methods.
> >>>>>>>>>>
> >>>>>>>>>> For example to limit to the 5 authors who touched the most files:
> >>>>>>>>>>
> >>>>>>>>>> reviewers = collections.Counter() # New in python 2.7
> >>>>>>>>>>
> >>>>>>>>>> for modified_file in commit.files:
> >>>>>>>>>> reviewers += collections.Counter(commit.author for commit
> >>>>>>>>>> in
> >>>>>>>>>> git.log(modified_file))
> >>>>>>>>>>
> >>>>>>>>>> return [author for author, count in reviewers.most_common(5)]
> >>>>>>>>>>
> >>>>>>>>>> Since Counter also accepts a dictionary, one could also weigh the
> >>>>>>>>>> touched lines per file. Downside there is big
> >>>>>>>>>> whitespace/formatting
> >>>>>>>>>> patches can skew the line count.
> >>>>>>>>>>
> >>>>>>>>>> In short, I think an entire thesis could be written on the optimal
> >>>>>>>>>> way
> >>>>>>>>>> to determine reviewers but a simple algorithm could do to show the
> >>>>>>>>>> method works.
> >>>>>>>>>>
> >>>>>>>>>> Does this help?
> >>>
> >>> Maybe it will be worth to use the information we have in Bugzilla here:
> >>>
> >>> We can browse the BZ that were closed/verified in the last XXX days
> >>> Per BZ , we know which patches are involved, who reviewed the patches,
> >>> which files were changed, when files were changed and the rank of the
> >>> change (number of lines changed)
> >>> I believe that from this information we can compose a simple ranking
> >>> algorithm that its output will be a list of N potential reviewers for the
> >>> patch.
> >>> Since we can aggregate the above information for all files related to the
> >>> patch we want to add reviewers, we can have this set for the whole patch.
> >>> This information should be processed and stored each N days and gerrit
> >>> will
> >>> be able to use it.
> >>
> >> why are we trying to invent hueristics, instead of declaring clear
> >> ownership?
> >
> > Reminder: my source metadata plan that requires cooperation.
> >
> > Each source and component should have an explicit ownership up into bug
> > database.
> >
> > I won't repeat it now, it is available at archives.
> >
> I think we are discussing two different issues here:
> The first one considers the GSOC project, of adding reviewers
> automatically to a patch, this can be done in many different heuristics
> depending what the submitter prefers (use blame, maintainers who acked
> the patches, bugs, list of owners to a file and so on).
>
> The second issue is adding and maintain a list of owners by
> files/folders in oVirt.
> this is a specific use case to be used by the "add potential reviewers"
> project.
There is no such think as "reviewer" there is component maintainer.
Random reviewers are irrelevant.
Each GSOC participate should have a mentor, this mentor should be the maintainer of the relevant component.
The mentor is responsible for the process, while the participate is providing the technical work.
>
> >>
> >>>
> >>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Users mailing list
> >>>>>>>>>> Users at ovirt.org
> >>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I think if we do this, we want to make sure we cover per file who
> >>>>>>>>> is
> >>>>>>>>> required to +2 it before we consider it acked.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> won't it require maintaining static lists of people per
> >>>>>>>> file/path/project?
> >>>>>>>>
> >>>>>>>
> >>>>>>> yes, but considering our project layout, i don't see an alternative.
> >>>>>>> (some of the layout could be improved to be path based, rather than
> >>>>>>> file
> >>>>>>> based)
> >>>>>> I think it could be done automatically by analysing the file and see
> >>>>>> who
> >>>>>> mostly changed it recently, since the "owner" of the file might be
> >>>>>> dynamic, who ever changed most of it few days ago might be more
> >>>>>> familiar
> >>>>>> with it today
> >>>>>>
> >>>>>> IMO the algorithm of adding the reviewers should be flexible.
> >>>>>> For example, using a folder which will contain files, where each file
> >>>>>> implement an algorithm to add the reviewers.
> >>>>>>
> >>>>>> for instance we can have two files:
> >>>>>> 1. Add a reviewers by blame - the contributor which changed recently
> >>>>>> the
> >>>>>> code lines
> >>>>>> 2. Add a reviewers by file - the contributor who changed most of the
> >>>>>> file recently.
> >>>>>>
> >>>>>> Each file will implement the functional operation and will output the
> >>>>>> reviewers emails.
> >>>>>>
> >>>>>> The user can then add a new algorithm or change it to be more specific
> >>>>>> to its project.
> >>>>>> for example the user can add also the maintainers which acked the
> >>>>>> patch
> >>>>>> that was blamed.
> >>>>>>> _______________________________________________
> >>>>>>> Users mailing list
> >>>>>>> Users at ovirt.org
> >>>>>>> http://lists.ovirt.org/mailman/listinfo/users
> >>>>>>
> >>>>>
> >>>>> this shouldn't be automatic. we need to clearly define ownership. we
> >>>>> can't
> >>>>> do
> >>>>> this per repo for the engine/vdsm. we can do this per repo for the
> >>>>> other
> >>>>> repo's probably (though solving the folder/file approach would cover
> >>>>> the
> >>>>> simpler repos as a private case).
> >>>>>
> >>>>> yes, it will require some work, maybe some moving around of files to
> >>>>> make
> >>>>> this
> >>>>> easier by folders (topics) which should be relevant anyway.
> >>>>
> >>>> I think it would easier to maintain if we just have one file at the
> >>>> root, instead of having the ownership information distributed
> >>>> throughout the files/directories. That way you'll know where to look at
> >>>> to check/modify the ownership as opposed to having to walk all the
> >>>> files and upper directories.
> >>>>
> >>>> Also, adding all that ownership logic to gerrit might not be easy, as
> >>>> it's not meant to go checking the source of the repositories looking
> >>>> for configuration. We might want to take a look (again) to zuul, the
> >>>> tool that openstack uses as gateway to trigger jenkins jobs and merge
> >>>> patches.
> >>>>
> >>>>> _______________________________________________
> >>>>> Infra mailing list
> >>>>> Infra at ovirt.org
> >>>>> http://lists.ovirt.org/mailman/listinfo/infra
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> David Caro
> >>>>
> >>>> Red Hat S.L.
> >>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
> >>>>
> >>>> Email: dcaro at redhat.com
> >>>> Web: www.redhat.com
> >>>> RHT Global #: 82-62605
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Infra mailing list
> >>>> Infra at ovirt.org
> >>>> http://lists.ovirt.org/mailman/listinfo/infra
> >>>>
> >>
> >> _______________________________________________
> >> Infra mailing list
> >> Infra at ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/infra
> >>
>
>
More information about the Users
mailing list