On 03/12/2014 05:57 PM, Alon Bar-Lev wrote:
----- Original Message -----
> From: "Itamar Heim" <iheim(a)redhat.com>
> To: "Eli Mesika" <emesika(a)redhat.com>, users(a)ovirt.org
> Cc: "Maor Lipchuk" <mlipchuk(a)redhat.com>, "Tomasz Kołek"
<tomasz-kolek(a)o2.pl>, "infra" <infra(a)ovirt.org>
> Sent: Wednesday, March 12, 2014 5:52:25 PM
> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>
> On 03/12/2014 04:57 PM, Eli Mesika wrote:
>>
>>
>> ----- Original Message -----
>>> From: "David Caro" <dcaroest(a)redhat.com>
>>> To: "Itamar Heim" <iheim(a)redhat.com>
>>> Cc: "Maor Lipchuk" <mlipchuk(a)redhat.com>, users(a)ovirt.org,
"Tomasz Kołek"
>>> <tomasz-kolek(a)o2.pl>, "infra"
>>> <infra(a)ovirt.org>
>>> Sent: Wednesday, March 12, 2014 11:01:21 AM
>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>>
>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote:
>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote:
>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote:
>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote:
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>>> From: "Itamar Heim" <iheim(a)redhat.com>
>>>>>>>> To: "Eyal Edri" <eedri(a)redhat.com>,
"Tomasz Kołek"
>>>>>>>> <tomasz-kolek(a)o2.pl>, users(a)ovirt.org,
"infra" <infra(a)ovirt.org>
>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM
>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential
reviewers -
>>>>>>>> questions
>>>>>>>>
>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van Wijngaarden
wrote:
>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400, Eyal Edri
wrote:
>>>>>>>>>>> Tomasz Kołek wrote:
>>>>>>>>>>>
>>>>>>>>>>> I've got a few questions about project
description.
>>>>>>>>>>> Please tell me if my problem's
understanding is good or not.
>>>>>>>>>>> We need to add a few flags/methods to git
review module. This
>>>>>>>>>>> flags
>>>>>>>>>>> should
>>>>>>>>>>> allow to add potential reviewers in gerrit.
>>>>>>>>>>> So:
>>>>>>>>>>> Let's assume that we've got special
flags for this operations.
>>>>>>>>>>> What's
>>>>>>>>>>> next?
>>>>>>>>>>> 1. In gerrit system we need to add special
place for potential
>>>>>>>>>>> reviewers?
>>>>>>>>>>> 2. Potential reviewers should agree that they
want to review?
>>>>>>>>>>> 3. We can have more than one accepted
reviewer?
>>>>>>>>>>
>>>>>>>>>> I'm not sure i understood exactly what you
mean by 'potential
>>>>>>>>>> reviewers'. do want gerrit (hook?) to
automatically add reviewers
>>>>>>>>>> to
>>>>>>>>>> a patch according to the code sent? so in fact
you'll have a place
>>>>>>>>>> somewhere for mapping code & specific
developers?
>>>>>>>>>
>>>>>>>>> I really like this idea. Gerrit currently requires
new users to know
>>>>>>>>> who
>>>>>>>>> to add as reviewers, IMHO impeding new contributors.
>>>>>>>>>
>>>>>>>>> One relative simple solution would be to look at who
recently
>>>>>>>>> touched
>>>>>>>>> the files that are being modified and add them as
reviewers. This
>>>>>>>>> can be
>>>>>>>>> done by looking at the git log for a file. Some
pseudo python code
>>>>>>>>> solution:
>>>>>>>>>
>>>>>>>>> reviewers = set()
>>>>>>>>>
>>>>>>>>> for modified_file in commit.files:
>>>>>>>>> reviewers += set(commit.author for commit in
>>>>>>>>> git.log(modified_file))
>>>>>>>>>
>>>>>>>>> return reviewers
>>>>>>>>>
>>>>>>>>> This gives a system that those who touche a file,
become the
>>>>>>>>> maintainer
>>>>>>>>> for that file. A more complex solution could improve
on that and
>>>>>>>>> limit
>>>>>>>>> the reviewers added per patch. One can think of
limiting to only
>>>>>>>>> contributions in the last X months, weigh
contributions so common
>>>>>>>>> committers are prefered. It could also combine
several methods.
>>>>>>>>>
>>>>>>>>> For example to limit to the 5 authors who touched the
most files:
>>>>>>>>>
>>>>>>>>> reviewers = collections.Counter() # New in python
2.7
>>>>>>>>>
>>>>>>>>> for modified_file in commit.files:
>>>>>>>>> reviewers +=
collections.Counter(commit.author for commit in
>>>>>>>>> git.log(modified_file))
>>>>>>>>>
>>>>>>>>> return [author for author, count in
reviewers.most_common(5)]
>>>>>>>>>
>>>>>>>>> Since Counter also accepts a dictionary, one could
also weigh the
>>>>>>>>> touched lines per file. Downside there is big
whitespace/formatting
>>>>>>>>> patches can skew the line count.
>>>>>>>>>
>>>>>>>>> In short, I think an entire thesis could be written
on the optimal
>>>>>>>>> way
>>>>>>>>> to determine reviewers but a simple algorithm could
do to show the
>>>>>>>>> method works.
>>>>>>>>>
>>>>>>>>> Does this help?
>>
>> Maybe it will be worth to use the information we have in Bugzilla here:
>>
>> We can browse the BZ that were closed/verified in the last XXX days
>> Per BZ , we know which patches are involved, who reviewed the patches,
>> which files were changed, when files were changed and the rank of the
>> change (number of lines changed)
>> I believe that from this information we can compose a simple ranking
>> algorithm that its output will be a list of N potential reviewers for the
>> patch.
>> Since we can aggregate the above information for all files related to the
>> patch we want to add reviewers, we can have this set for the whole patch.
>> This information should be processed and stored each N days and gerrit will
>> be able to use it.
>
> why are we trying to invent hueristics, instead of declaring clear
> ownership?
Reminder: my source metadata plan that requires cooperation.
Each source and component should have an explicit ownership up into bug database.
I won't repeat it now, it is available at archives.
I think we are discussing two different issues here:
The first one considers the GSOC project, of adding reviewers
automatically to a patch, this can be done in many different heuristics
depending what the submitter prefers (use blame, maintainers who acked
the patches, bugs, list of owners to a file and so on).
The second issue is adding and maintain a list of owners by
files/folders in oVirt.
this is a specific use case to be used by the "add potential reviewers"
project.
>
>>
>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Users mailing list
>>>>>>>>> Users(a)ovirt.org
>>>>>>>>>
http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think if we do this, we want to make sure we cover per
file who is
>>>>>>>> required to +2 it before we consider it acked.
>>>>>>>>
>>>>>>>
>>>>>>> won't it require maintaining static lists of people per
>>>>>>> file/path/project?
>>>>>>>
>>>>>>
>>>>>> yes, but considering our project layout, i don't see an
alternative.
>>>>>> (some of the layout could be improved to be path based, rather
than
>>>>>> file
>>>>>> based)
>>>>> I think it could be done automatically by analysing the file and see
who
>>>>> mostly changed it recently, since the "owner" of the file
might be
>>>>> dynamic, who ever changed most of it few days ago might be more
familiar
>>>>> with it today
>>>>>
>>>>> IMO the algorithm of adding the reviewers should be flexible.
>>>>> For example, using a folder which will contain files, where each
file
>>>>> implement an algorithm to add the reviewers.
>>>>>
>>>>> for instance we can have two files:
>>>>> 1. Add a reviewers by blame - the contributor which changed recently
the
>>>>> code lines
>>>>> 2. Add a reviewers by file - the contributor who changed most of the
>>>>> file recently.
>>>>>
>>>>> Each file will implement the functional operation and will output
the
>>>>> reviewers emails.
>>>>>
>>>>> The user can then add a new algorithm or change it to be more
specific
>>>>> to its project.
>>>>> for example the user can add also the maintainers which acked the
patch
>>>>> that was blamed.
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users(a)ovirt.org
>>>>>>
http://lists.ovirt.org/mailman/listinfo/users
>>>>>
>>>>
>>>> this shouldn't be automatic. we need to clearly define ownership. we
>>>> can't
>>>> do
>>>> this per repo for the engine/vdsm. we can do this per repo for the other
>>>> repo's probably (though solving the folder/file approach would cover
the
>>>> simpler repos as a private case).
>>>>
>>>> yes, it will require some work, maybe some moving around of files to
make
>>>> this
>>>> easier by folders (topics) which should be relevant anyway.
>>>
>>> I think it would easier to maintain if we just have one file at the
>>> root, instead of having the ownership information distributed
>>> throughout the files/directories. That way you'll know where to look at
>>> to check/modify the ownership as opposed to having to walk all the
>>> files and upper directories.
>>>
>>> Also, adding all that ownership logic to gerrit might not be easy, as
>>> it's not meant to go checking the source of the repositories looking
>>> for configuration. We might want to take a look (again) to zuul, the
>>> tool that openstack uses as gateway to trigger jenkins jobs and merge
>>> patches.
>>>
>>>> _______________________________________________
>>>> Infra mailing list
>>>> Infra(a)ovirt.org
>>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>
>>>
>>>
>>> --
>>> David Caro
>>>
>>> Red Hat S.L.
>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>>>
>>> Email: dcaro(a)redhat.com
>>> Web:
www.redhat.com
>>> RHT Global #: 82-62605
>>>
>>>
>>> _______________________________________________
>>> Infra mailing list
>>> Infra(a)ovirt.org
>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>
>
> _______________________________________________
> Infra mailing list
> Infra(a)ovirt.org
>
http://lists.ovirt.org/mailman/listinfo/infra
>