----- Original Message -----
From: "Maor Lipchuk" <mlipchuk(a)redhat.com>
To: "Alon Bar-Lev" <alonbl(a)redhat.com>, "Itamar Heim"
<iheim(a)redhat.com>
Cc: "Eli Mesika" <emesika(a)redhat.com>, users(a)ovirt.org, "Tomasz
Kołek" <tomasz-kolek(a)o2.pl>, "infra"
<infra(a)ovirt.org>
Sent: Wednesday, March 12, 2014 7:29:40 PM
Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
On 03/12/2014 05:57 PM, Alon Bar-Lev wrote:
>
>
> ----- Original Message -----
>> From: "Itamar Heim" <iheim(a)redhat.com>
>> To: "Eli Mesika" <emesika(a)redhat.com>, users(a)ovirt.org
>> Cc: "Maor Lipchuk" <mlipchuk(a)redhat.com>, "Tomasz
Kołek"
>> <tomasz-kolek(a)o2.pl>, "infra" <infra(a)ovirt.org>
>> Sent: Wednesday, March 12, 2014 5:52:25 PM
>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>
>> On 03/12/2014 04:57 PM, Eli Mesika wrote:
>>>
>>>
>>> ----- Original Message -----
>>>> From: "David Caro" <dcaroest(a)redhat.com>
>>>> To: "Itamar Heim" <iheim(a)redhat.com>
>>>> Cc: "Maor Lipchuk" <mlipchuk(a)redhat.com>,
users(a)ovirt.org, "Tomasz
>>>> Kołek"
>>>> <tomasz-kolek(a)o2.pl>, "infra"
>>>> <infra(a)ovirt.org>
>>>> Sent: Wednesday, March 12, 2014 11:01:21 AM
>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>>>
>>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote:
>>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote:
>>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote:
>>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Itamar Heim"
<iheim(a)redhat.com>
>>>>>>>>> To: "Eyal Edri" <eedri(a)redhat.com>,
"Tomasz Kołek"
>>>>>>>>> <tomasz-kolek(a)o2.pl>, users(a)ovirt.org,
"infra" <infra(a)ovirt.org>
>>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM
>>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential
reviewers -
>>>>>>>>> questions
>>>>>>>>>
>>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van Wijngaarden
wrote:
>>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400, Eyal
Edri wrote:
>>>>>>>>>>>> Tomasz Kołek wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I've got a few questions about
project description.
>>>>>>>>>>>> Please tell me if my problem's
understanding is good or not.
>>>>>>>>>>>> We need to add a few flags/methods to
git review module. This
>>>>>>>>>>>> flags
>>>>>>>>>>>> should
>>>>>>>>>>>> allow to add potential reviewers in
gerrit.
>>>>>>>>>>>> So:
>>>>>>>>>>>> Let's assume that we've got
special flags for this operations.
>>>>>>>>>>>> What's
>>>>>>>>>>>> next?
>>>>>>>>>>>> 1. In gerrit system we need to add
special place for potential
>>>>>>>>>>>> reviewers?
>>>>>>>>>>>> 2. Potential reviewers should agree that
they want to review?
>>>>>>>>>>>> 3. We can have more than one accepted
reviewer?
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure i understood exactly what
you mean by 'potential
>>>>>>>>>>> reviewers'. do want gerrit (hook?) to
automatically add
>>>>>>>>>>> reviewers
>>>>>>>>>>> to
>>>>>>>>>>> a patch according to the code sent? so in
fact you'll have a
>>>>>>>>>>> place
>>>>>>>>>>> somewhere for mapping code & specific
developers?
>>>>>>>>>>
>>>>>>>>>> I really like this idea. Gerrit currently
requires new users to
>>>>>>>>>> know
>>>>>>>>>> who
>>>>>>>>>> to add as reviewers, IMHO impeding new
contributors.
>>>>>>>>>>
>>>>>>>>>> One relative simple solution would be to look at
who recently
>>>>>>>>>> touched
>>>>>>>>>> the files that are being modified and add them
as reviewers. This
>>>>>>>>>> can be
>>>>>>>>>> done by looking at the git log for a file. Some
pseudo python code
>>>>>>>>>> solution:
>>>>>>>>>>
>>>>>>>>>> reviewers = set()
>>>>>>>>>>
>>>>>>>>>> for modified_file in commit.files:
>>>>>>>>>> reviewers += set(commit.author for
commit in
>>>>>>>>>> git.log(modified_file))
>>>>>>>>>>
>>>>>>>>>> return reviewers
>>>>>>>>>>
>>>>>>>>>> This gives a system that those who touche a
file, become the
>>>>>>>>>> maintainer
>>>>>>>>>> for that file. A more complex solution could
improve on that and
>>>>>>>>>> limit
>>>>>>>>>> the reviewers added per patch. One can think of
limiting to only
>>>>>>>>>> contributions in the last X months, weigh
contributions so common
>>>>>>>>>> committers are prefered. It could also combine
several methods.
>>>>>>>>>>
>>>>>>>>>> For example to limit to the 5 authors who
touched the most files:
>>>>>>>>>>
>>>>>>>>>> reviewers = collections.Counter() # New in
python 2.7
>>>>>>>>>>
>>>>>>>>>> for modified_file in commit.files:
>>>>>>>>>> reviewers +=
collections.Counter(commit.author for commit
>>>>>>>>>> in
>>>>>>>>>> git.log(modified_file))
>>>>>>>>>>
>>>>>>>>>> return [author for author, count in
reviewers.most_common(5)]
>>>>>>>>>>
>>>>>>>>>> Since Counter also accepts a dictionary, one
could also weigh the
>>>>>>>>>> touched lines per file. Downside there is big
>>>>>>>>>> whitespace/formatting
>>>>>>>>>> patches can skew the line count.
>>>>>>>>>>
>>>>>>>>>> In short, I think an entire thesis could be
written on the optimal
>>>>>>>>>> way
>>>>>>>>>> to determine reviewers but a simple algorithm
could do to show the
>>>>>>>>>> method works.
>>>>>>>>>>
>>>>>>>>>> Does this help?
>>>
>>> Maybe it will be worth to use the information we have in Bugzilla here:
>>>
>>> We can browse the BZ that were closed/verified in the last XXX days
>>> Per BZ , we know which patches are involved, who reviewed the patches,
>>> which files were changed, when files were changed and the rank of the
>>> change (number of lines changed)
>>> I believe that from this information we can compose a simple ranking
>>> algorithm that its output will be a list of N potential reviewers for the
>>> patch.
>>> Since we can aggregate the above information for all files related to the
>>> patch we want to add reviewers, we can have this set for the whole patch.
>>> This information should be processed and stored each N days and gerrit
>>> will
>>> be able to use it.
>>
>> why are we trying to invent hueristics, instead of declaring clear
>> ownership?
>
> Reminder: my source metadata plan that requires cooperation.
>
> Each source and component should have an explicit ownership up into bug
> database.
>
> I won't repeat it now, it is available at archives.
>
I think we are discussing two different issues here:
The first one considers the GSOC project, of adding reviewers
automatically to a patch, this can be done in many different heuristics
depending what the submitter prefers (use blame, maintainers who acked
the patches, bugs, list of owners to a file and so on).
The second issue is adding and maintain a list of owners by
files/folders in oVirt.
this is a specific use case to be used by the "add potential reviewers"
project.
There is no such think as "reviewer" there is component maintainer.
Random reviewers are irrelevant.
Each GSOC participate should have a mentor, this mentor should be the maintainer of the
relevant component.
The mentor is responsible for the process, while the participate is providing the
technical work.
>>
>>>
>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list
>>>>>>>>>> Users(a)ovirt.org
>>>>>>>>>>
http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I think if we do this, we want to make sure we cover
per file who
>>>>>>>>> is
>>>>>>>>> required to +2 it before we consider it acked.
>>>>>>>>>
>>>>>>>>
>>>>>>>> won't it require maintaining static lists of people
per
>>>>>>>> file/path/project?
>>>>>>>>
>>>>>>>
>>>>>>> yes, but considering our project layout, i don't see an
alternative.
>>>>>>> (some of the layout could be improved to be path based,
rather than
>>>>>>> file
>>>>>>> based)
>>>>>> I think it could be done automatically by analysing the file and
see
>>>>>> who
>>>>>> mostly changed it recently, since the "owner" of the
file might be
>>>>>> dynamic, who ever changed most of it few days ago might be more
>>>>>> familiar
>>>>>> with it today
>>>>>>
>>>>>> IMO the algorithm of adding the reviewers should be flexible.
>>>>>> For example, using a folder which will contain files, where each
file
>>>>>> implement an algorithm to add the reviewers.
>>>>>>
>>>>>> for instance we can have two files:
>>>>>> 1. Add a reviewers by blame - the contributor which changed
recently
>>>>>> the
>>>>>> code lines
>>>>>> 2. Add a reviewers by file - the contributor who changed most of
the
>>>>>> file recently.
>>>>>>
>>>>>> Each file will implement the functional operation and will
output the
>>>>>> reviewers emails.
>>>>>>
>>>>>> The user can then add a new algorithm or change it to be more
specific
>>>>>> to its project.
>>>>>> for example the user can add also the maintainers which acked
the
>>>>>> patch
>>>>>> that was blamed.
>>>>>>> _______________________________________________
>>>>>>> Users mailing list
>>>>>>> Users(a)ovirt.org
>>>>>>>
http://lists.ovirt.org/mailman/listinfo/users
>>>>>>
>>>>>
>>>>> this shouldn't be automatic. we need to clearly define
ownership. we
>>>>> can't
>>>>> do
>>>>> this per repo for the engine/vdsm. we can do this per repo for the
>>>>> other
>>>>> repo's probably (though solving the folder/file approach would
cover
>>>>> the
>>>>> simpler repos as a private case).
>>>>>
>>>>> yes, it will require some work, maybe some moving around of files
to
>>>>> make
>>>>> this
>>>>> easier by folders (topics) which should be relevant anyway.
>>>>
>>>> I think it would easier to maintain if we just have one file at the
>>>> root, instead of having the ownership information distributed
>>>> throughout the files/directories. That way you'll know where to look
at
>>>> to check/modify the ownership as opposed to having to walk all the
>>>> files and upper directories.
>>>>
>>>> Also, adding all that ownership logic to gerrit might not be easy, as
>>>> it's not meant to go checking the source of the repositories
looking
>>>> for configuration. We might want to take a look (again) to zuul, the
>>>> tool that openstack uses as gateway to trigger jenkins jobs and merge
>>>> patches.
>>>>
>>>>> _______________________________________________
>>>>> Infra mailing list
>>>>> Infra(a)ovirt.org
>>>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>>
>>>>
>>>>
>>>> --
>>>> David Caro
>>>>
>>>> Red Hat S.L.
>>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>>>>
>>>> Email: dcaro(a)redhat.com
>>>> Web:
www.redhat.com
>>>> RHT Global #: 82-62605
>>>>
>>>>
>>>> _______________________________________________
>>>> Infra mailing list
>>>> Infra(a)ovirt.org
>>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>>
>>
>> _______________________________________________
>> Infra mailing list
>> Infra(a)ovirt.org
>>
http://lists.ovirt.org/mailman/listinfo/infra
>>