[Users] [GSOC][Gerrit] add potential reviewers - questions

Itamar Heim iheim at redhat.com
Wed Mar 12 19:46:00 UTC 2014


On 03/12/2014 09:42 PM, Alon Bar-Lev wrote:
>
>
> ----- Original Message -----
>> From: "Itamar Heim" <iheim at redhat.com>
>> To: "Maor Lipchuk" <mlipchuk at redhat.com>, "Alon Bar-Lev" <alonbl at redhat.com>
>> Cc: "Eli Mesika" <emesika at redhat.com>, users at ovirt.org, "Tomasz Kołek" <tomasz-kolek at o2.pl>, "infra"
>> <infra at ovirt.org>
>> Sent: Wednesday, March 12, 2014 9:29:28 PM
>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>
>> On 03/12/2014 07:29 PM, Maor Lipchuk wrote:
>>> On 03/12/2014 05:57 PM, Alon Bar-Lev wrote:
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: "Itamar Heim" <iheim at redhat.com>
>>>>> To: "Eli Mesika" <emesika at redhat.com>, users at ovirt.org
>>>>> Cc: "Maor Lipchuk" <mlipchuk at redhat.com>, "Tomasz Kołek"
>>>>> <tomasz-kolek at o2.pl>, "infra" <infra at ovirt.org>
>>>>> Sent: Wednesday, March 12, 2014 5:52:25 PM
>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>>>>
>>>>> On 03/12/2014 04:57 PM, Eli Mesika wrote:
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "David Caro" <dcaroest at redhat.com>
>>>>>>> To: "Itamar Heim" <iheim at redhat.com>
>>>>>>> Cc: "Maor Lipchuk" <mlipchuk at redhat.com>, users at ovirt.org, "Tomasz
>>>>>>> Kołek"
>>>>>>> <tomasz-kolek at o2.pl>, "infra"
>>>>>>> <infra at ovirt.org>
>>>>>>> Sent: Wednesday, March 12, 2014 11:01:21 AM
>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>>>>>>
>>>>>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote:
>>>>>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote:
>>>>>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote:
>>>>>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: "Itamar Heim" <iheim at redhat.com>
>>>>>>>>>>>> To: "Eyal Edri" <eedri at redhat.com>, "Tomasz Kołek"
>>>>>>>>>>>> <tomasz-kolek at o2.pl>, users at ovirt.org, "infra" <infra at ovirt.org>
>>>>>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM
>>>>>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers -
>>>>>>>>>>>> questions
>>>>>>>>>>>>
>>>>>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van Wijngaarden wrote:
>>>>>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400, Eyal Edri wrote:
>>>>>>>>>>>>>>> Tomasz Kołek wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've got a few questions about project description.
>>>>>>>>>>>>>>> Please tell me if my problem's understanding is good or not.
>>>>>>>>>>>>>>> We need to add a few flags/methods to git review module. This
>>>>>>>>>>>>>>> flags
>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>> allow to add potential reviewers in gerrit.
>>>>>>>>>>>>>>> So:
>>>>>>>>>>>>>>> Let's assume that we've got special flags for this operations.
>>>>>>>>>>>>>>> What's
>>>>>>>>>>>>>>> next?
>>>>>>>>>>>>>>> 1. In gerrit system we need to add special place for potential
>>>>>>>>>>>>>>> reviewers?
>>>>>>>>>>>>>>> 2. Potential reviewers should agree that they want to review?
>>>>>>>>>>>>>>> 3. We can have more than one accepted reviewer?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not sure i understood exactly what you mean by 'potential
>>>>>>>>>>>>>> reviewers'.  do want gerrit (hook?) to automatically add
>>>>>>>>>>>>>> reviewers
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>> a patch according to the code sent?  so in fact you'll have a
>>>>>>>>>>>>>> place
>>>>>>>>>>>>>> somewhere for mapping code & specific developers?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I really like this idea. Gerrit currently requires new users to
>>>>>>>>>>>>> know
>>>>>>>>>>>>> who
>>>>>>>>>>>>> to add as reviewers, IMHO impeding new contributors.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One relative simple solution would be to look at who recently
>>>>>>>>>>>>> touched
>>>>>>>>>>>>> the files that are being modified and add them as reviewers. This
>>>>>>>>>>>>> can be
>>>>>>>>>>>>> done by looking at the git log for a file. Some pseudo python
>>>>>>>>>>>>> code
>>>>>>>>>>>>> solution:
>>>>>>>>>>>>>
>>>>>>>>>>>>> reviewers = set()
>>>>>>>>>>>>>
>>>>>>>>>>>>> for modified_file in commit.files:
>>>>>>>>>>>>>           reviewers += set(commit.author for commit in
>>>>>>>>>>>>> git.log(modified_file))
>>>>>>>>>>>>>
>>>>>>>>>>>>> return reviewers
>>>>>>>>>>>>>
>>>>>>>>>>>>> This gives a system that those who touche a file, become the
>>>>>>>>>>>>> maintainer
>>>>>>>>>>>>> for that file. A more complex solution could improve on that and
>>>>>>>>>>>>> limit
>>>>>>>>>>>>> the reviewers added per patch. One can think of limiting to only
>>>>>>>>>>>>> contributions in the last X months, weigh contributions so common
>>>>>>>>>>>>> committers are prefered. It could also combine several methods.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For example to limit to the 5 authors who touched the most files:
>>>>>>>>>>>>>
>>>>>>>>>>>>> reviewers = collections.Counter()  # New in python 2.7
>>>>>>>>>>>>>
>>>>>>>>>>>>> for modified_file in commit.files:
>>>>>>>>>>>>>           reviewers += collections.Counter(commit.author for
>>>>>>>>>>>>>           commit in
>>>>>>>>>>>>>           git.log(modified_file))
>>>>>>>>>>>>>
>>>>>>>>>>>>> return [author for author, count in reviewers.most_common(5)]
>>>>>>>>>>>>>
>>>>>>>>>>>>> Since Counter also accepts a dictionary, one could also weigh the
>>>>>>>>>>>>> touched lines per file. Downside there is big
>>>>>>>>>>>>> whitespace/formatting
>>>>>>>>>>>>> patches can skew the line count.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In short, I think an entire thesis could be written on the
>>>>>>>>>>>>> optimal
>>>>>>>>>>>>> way
>>>>>>>>>>>>> to determine reviewers but a simple algorithm could do to show
>>>>>>>>>>>>> the
>>>>>>>>>>>>> method works.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does this help?
>>>>>>
>>>>>> Maybe it will be worth to use the information we have in Bugzilla here:
>>>>>>
>>>>>> We can browse the BZ that were closed/verified in the last XXX days
>>>>>> Per BZ , we know which patches are involved, who reviewed the patches,
>>>>>> which files were changed, when files were changed and the rank of the
>>>>>> change (number of lines changed)
>>>>>> I believe that from this information we can compose a simple ranking
>>>>>> algorithm that its output will be a list of N potential reviewers for
>>>>>> the
>>>>>> patch.
>>>>>> Since we can aggregate the above information for all files related to
>>>>>> the
>>>>>> patch we want to add reviewers, we can have this set for the whole
>>>>>> patch.
>>>>>> This information should be processed and stored each N days and gerrit
>>>>>> will
>>>>>> be able to use it.
>>>>>
>>>>> why are we trying to invent hueristics, instead of declaring clear
>>>>> ownership?
>>>>
>>>> Reminder: my source metadata plan that requires cooperation.
>>>>
>>>> Each source and component should have an explicit ownership up into bug
>>>> database.
>>>>
>>>> I won't repeat it now, it is available at archives.
>>>>
>>> I think we are discussing two different issues here:
>>> The first one considers the GSOC project, of adding reviewers
>>> automatically to a patch, this can be done in many different heuristics
>>> depending what the submitter prefers (use blame, maintainers who acked
>>> the patches, bugs, list of owners to a file and so on).
>>
>> that's the point. we should be using heuristics, rather ownership.

obvioulsy i meant here: we should *not* be using heuristics, rather 
ownership.


>
> I want heuristics to solve bugs as well.
> Maybe for each issue I will use heuristic and wait for X days for someone else to fix?
> I would like to see how you build a building using heuristic reviews.
>
>> this could still be done via a GSOC project for implementing the gerrit
>> logic to do so.
>>
>>>
>>> The second issue is adding and maintain a list of owners by
>>> files/folders in oVirt.
>>> this is a specific use case to be used by the "add potential reviewers"
>>> project.
>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Users mailing list
>>>>>>>>>>>>> Users at ovirt.org
>>>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I think if we do this, we want to make sure we cover per file who
>>>>>>>>>>>> is
>>>>>>>>>>>> required to +2 it before we consider it acked.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> won't it require maintaining static lists of people per
>>>>>>>>>>> file/path/project?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> yes, but considering our project layout, i don't see an alternative.
>>>>>>>>>> (some of the layout could be improved to be path based, rather than
>>>>>>>>>> file
>>>>>>>>>> based)
>>>>>>>>> I think it could be done automatically by analysing the file and see
>>>>>>>>> who
>>>>>>>>> mostly changed it recently, since the "owner" of the file might be
>>>>>>>>> dynamic, who ever changed most of it few days ago might be more
>>>>>>>>> familiar
>>>>>>>>> with it today
>>>>>>>>>
>>>>>>>>> IMO the algorithm of adding the reviewers should be flexible.
>>>>>>>>> For example, using a folder which will contain files, where each file
>>>>>>>>> implement an algorithm to add the reviewers.
>>>>>>>>>
>>>>>>>>> for instance we can have two files:
>>>>>>>>> 1. Add a reviewers by blame - the contributor which changed recently
>>>>>>>>> the
>>>>>>>>> code lines
>>>>>>>>> 2. Add a reviewers by file - the contributor who changed most of the
>>>>>>>>> file recently.
>>>>>>>>>
>>>>>>>>> Each file will implement the functional operation and will output the
>>>>>>>>> reviewers emails.
>>>>>>>>>
>>>>>>>>> The user can then add a new algorithm or change it to be more
>>>>>>>>> specific
>>>>>>>>> to its project.
>>>>>>>>> for example the user can add also the maintainers which acked the
>>>>>>>>> patch
>>>>>>>>> that was blamed.
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list
>>>>>>>>>> Users at ovirt.org
>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>
>>>>>>>>
>>>>>>>> this shouldn't be automatic. we need to clearly define ownership. we
>>>>>>>> can't
>>>>>>>> do
>>>>>>>> this per repo for the engine/vdsm. we can do this per repo for the
>>>>>>>> other
>>>>>>>> repo's probably (though solving the folder/file approach would cover
>>>>>>>> the
>>>>>>>> simpler repos as a private case).
>>>>>>>>
>>>>>>>> yes, it will require some work, maybe some moving around of files to
>>>>>>>> make
>>>>>>>> this
>>>>>>>> easier by folders (topics) which should be relevant anyway.
>>>>>>>
>>>>>>> I think it would easier to maintain if we just have one file at the
>>>>>>> root, instead of having the ownership information distributed
>>>>>>> throughout the files/directories. That way you'll know where to look at
>>>>>>> to check/modify the ownership as opposed to having to walk all the
>>>>>>> files and upper directories.
>>>>>>>
>>>>>>> Also, adding all that ownership logic to gerrit might not be easy, as
>>>>>>> it's not meant to go checking the source of the repositories looking
>>>>>>> for configuration. We might want to take a look (again) to zuul, the
>>>>>>> tool that openstack uses as gateway to trigger jenkins jobs and merge
>>>>>>> patches.
>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Infra mailing list
>>>>>>>> Infra at ovirt.org
>>>>>>>> http://lists.ovirt.org/mailman/listinfo/infra
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> David Caro
>>>>>>>
>>>>>>> Red Hat S.L.
>>>>>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>>>>>>>
>>>>>>> Email: dcaro at redhat.com
>>>>>>> Web: www.redhat.com
>>>>>>> RHT Global #: 82-62605
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Infra mailing list
>>>>>>> Infra at ovirt.org
>>>>>>> http://lists.ovirt.org/mailman/listinfo/infra
>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Infra mailing list
>>>>> Infra at ovirt.org
>>>>> http://lists.ovirt.org/mailman/listinfo/infra
>>>>>
>>>
>>
>>




More information about the Users mailing list