----- Original Message -----
From: "Itamar Heim" <iheim(a)redhat.com>
To: "Maor Lipchuk" <mlipchuk(a)redhat.com>, "Alon Bar-Lev"
<alonbl(a)redhat.com>
Cc: "Eli Mesika" <emesika(a)redhat.com>, users(a)ovirt.org, "Tomasz
Kołek" <tomasz-kolek(a)o2.pl>, "infra"
<infra(a)ovirt.org>
Sent: Wednesday, March 12, 2014 9:29:28 PM
Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
On 03/12/2014 07:29 PM, Maor Lipchuk wrote:
> On 03/12/2014 05:57 PM, Alon Bar-Lev wrote:
>>
>>
>> ----- Original Message -----
>>> From: "Itamar Heim" <iheim(a)redhat.com>
>>> To: "Eli Mesika" <emesika(a)redhat.com>, users(a)ovirt.org
>>> Cc: "Maor Lipchuk" <mlipchuk(a)redhat.com>, "Tomasz
Kołek"
>>> <tomasz-kolek(a)o2.pl>, "infra" <infra(a)ovirt.org>
>>> Sent: Wednesday, March 12, 2014 5:52:25 PM
>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers - questions
>>>
>>> On 03/12/2014 04:57 PM, Eli Mesika wrote:
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: "David Caro" <dcaroest(a)redhat.com>
>>>>> To: "Itamar Heim" <iheim(a)redhat.com>
>>>>> Cc: "Maor Lipchuk" <mlipchuk(a)redhat.com>,
users(a)ovirt.org, "Tomasz
>>>>> Kołek"
>>>>> <tomasz-kolek(a)o2.pl>, "infra"
>>>>> <infra(a)ovirt.org>
>>>>> Sent: Wednesday, March 12, 2014 11:01:21 AM
>>>>> Subject: Re: [Users] [GSOC][Gerrit] add potential reviewers -
questions
>>>>>
>>>>> On Wed 12 Mar 2014 08:16:02 AM CET, Itamar Heim wrote:
>>>>>> On 03/11/2014 10:08 PM, Maor Lipchuk wrote:
>>>>>>> On 03/11/2014 05:20 PM, Itamar Heim wrote:
>>>>>>>> On 03/11/2014 05:14 PM, Eyal Edri wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "Itamar Heim"
<iheim(a)redhat.com>
>>>>>>>>>> To: "Eyal Edri"
<eedri(a)redhat.com>, "Tomasz Kołek"
>>>>>>>>>> <tomasz-kolek(a)o2.pl>, users(a)ovirt.org,
"infra" <infra(a)ovirt.org>
>>>>>>>>>> Sent: Tuesday, March 11, 2014 5:10:54 PM
>>>>>>>>>> Subject: Re: [Users] [GSOC][Gerrit] add
potential reviewers -
>>>>>>>>>> questions
>>>>>>>>>>
>>>>>>>>>> On 03/11/2014 05:06 PM, Ewoud Kohl van
Wijngaarden wrote:
>>>>>>>>>>> On Tue, Mar 11, 2014 at 10:37:22AM -0400,
Eyal Edri wrote:
>>>>>>>>>>>>> Tomasz Kołek wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've got a few questions about
project description.
>>>>>>>>>>>>> Please tell me if my problem's
understanding is good or not.
>>>>>>>>>>>>> We need to add a few flags/methods
to git review module. This
>>>>>>>>>>>>> flags
>>>>>>>>>>>>> should
>>>>>>>>>>>>> allow to add potential reviewers in
gerrit.
>>>>>>>>>>>>> So:
>>>>>>>>>>>>> Let's assume that we've got
special flags for this operations.
>>>>>>>>>>>>> What's
>>>>>>>>>>>>> next?
>>>>>>>>>>>>> 1. In gerrit system we need to add
special place for potential
>>>>>>>>>>>>> reviewers?
>>>>>>>>>>>>> 2. Potential reviewers should agree
that they want to review?
>>>>>>>>>>>>> 3. We can have more than one
accepted reviewer?
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not sure i understood exactly
what you mean by 'potential
>>>>>>>>>>>> reviewers'. do want gerrit (hook?)
to automatically add
>>>>>>>>>>>> reviewers
>>>>>>>>>>>> to
>>>>>>>>>>>> a patch according to the code sent? so
in fact you'll have a
>>>>>>>>>>>> place
>>>>>>>>>>>> somewhere for mapping code &
specific developers?
>>>>>>>>>>>
>>>>>>>>>>> I really like this idea. Gerrit currently
requires new users to
>>>>>>>>>>> know
>>>>>>>>>>> who
>>>>>>>>>>> to add as reviewers, IMHO impeding new
contributors.
>>>>>>>>>>>
>>>>>>>>>>> One relative simple solution would be to
look at who recently
>>>>>>>>>>> touched
>>>>>>>>>>> the files that are being modified and add
them as reviewers. This
>>>>>>>>>>> can be
>>>>>>>>>>> done by looking at the git log for a file.
Some pseudo python
>>>>>>>>>>> code
>>>>>>>>>>> solution:
>>>>>>>>>>>
>>>>>>>>>>> reviewers = set()
>>>>>>>>>>>
>>>>>>>>>>> for modified_file in commit.files:
>>>>>>>>>>> reviewers += set(commit.author for
commit in
>>>>>>>>>>> git.log(modified_file))
>>>>>>>>>>>
>>>>>>>>>>> return reviewers
>>>>>>>>>>>
>>>>>>>>>>> This gives a system that those who touche a
file, become the
>>>>>>>>>>> maintainer
>>>>>>>>>>> for that file. A more complex solution could
improve on that and
>>>>>>>>>>> limit
>>>>>>>>>>> the reviewers added per patch. One can think
of limiting to only
>>>>>>>>>>> contributions in the last X months, weigh
contributions so common
>>>>>>>>>>> committers are prefered. It could also
combine several methods.
>>>>>>>>>>>
>>>>>>>>>>> For example to limit to the 5 authors who
touched the most files:
>>>>>>>>>>>
>>>>>>>>>>> reviewers = collections.Counter() # New in
python 2.7
>>>>>>>>>>>
>>>>>>>>>>> for modified_file in commit.files:
>>>>>>>>>>> reviewers +=
collections.Counter(commit.author for
>>>>>>>>>>> commit in
>>>>>>>>>>> git.log(modified_file))
>>>>>>>>>>>
>>>>>>>>>>> return [author for author, count in
reviewers.most_common(5)]
>>>>>>>>>>>
>>>>>>>>>>> Since Counter also accepts a dictionary, one
could also weigh the
>>>>>>>>>>> touched lines per file. Downside there is
big
>>>>>>>>>>> whitespace/formatting
>>>>>>>>>>> patches can skew the line count.
>>>>>>>>>>>
>>>>>>>>>>> In short, I think an entire thesis could be
written on the
>>>>>>>>>>> optimal
>>>>>>>>>>> way
>>>>>>>>>>> to determine reviewers but a simple
algorithm could do to show
>>>>>>>>>>> the
>>>>>>>>>>> method works.
>>>>>>>>>>>
>>>>>>>>>>> Does this help?
>>>>
>>>> Maybe it will be worth to use the information we have in Bugzilla here:
>>>>
>>>> We can browse the BZ that were closed/verified in the last XXX days
>>>> Per BZ , we know which patches are involved, who reviewed the patches,
>>>> which files were changed, when files were changed and the rank of the
>>>> change (number of lines changed)
>>>> I believe that from this information we can compose a simple ranking
>>>> algorithm that its output will be a list of N potential reviewers for
>>>> the
>>>> patch.
>>>> Since we can aggregate the above information for all files related to
>>>> the
>>>> patch we want to add reviewers, we can have this set for the whole
>>>> patch.
>>>> This information should be processed and stored each N days and gerrit
>>>> will
>>>> be able to use it.
>>>
>>> why are we trying to invent hueristics, instead of declaring clear
>>> ownership?
>>
>> Reminder: my source metadata plan that requires cooperation.
>>
>> Each source and component should have an explicit ownership up into bug
>> database.
>>
>> I won't repeat it now, it is available at archives.
>>
> I think we are discussing two different issues here:
> The first one considers the GSOC project, of adding reviewers
> automatically to a patch, this can be done in many different heuristics
> depending what the submitter prefers (use blame, maintainers who acked
> the patches, bugs, list of owners to a file and so on).
that's the point. we should be using heuristics, rather ownership.
I want heuristics to solve bugs as well.
Maybe for each issue I will use heuristic and wait for X days for someone else to fix?
I would like to see how you build a building using heuristic reviews.
this could still be done via a GSOC project for implementing the
gerrit
logic to do so.
>
> The second issue is adding and maintain a list of owners by
> files/folders in oVirt.
> this is a specific use case to be used by the "add potential reviewers"
> project.
>
>>>
>>>>
>>>>
>>>>>>>>>>>
_______________________________________________
>>>>>>>>>>> Users mailing list
>>>>>>>>>>> Users(a)ovirt.org
>>>>>>>>>>>
http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think if we do this, we want to make sure we
cover per file who
>>>>>>>>>> is
>>>>>>>>>> required to +2 it before we consider it acked.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> won't it require maintaining static lists of
people per
>>>>>>>>> file/path/project?
>>>>>>>>>
>>>>>>>>
>>>>>>>> yes, but considering our project layout, i don't see
an alternative.
>>>>>>>> (some of the layout could be improved to be path based,
rather than
>>>>>>>> file
>>>>>>>> based)
>>>>>>> I think it could be done automatically by analysing the file
and see
>>>>>>> who
>>>>>>> mostly changed it recently, since the "owner" of
the file might be
>>>>>>> dynamic, who ever changed most of it few days ago might be
more
>>>>>>> familiar
>>>>>>> with it today
>>>>>>>
>>>>>>> IMO the algorithm of adding the reviewers should be
flexible.
>>>>>>> For example, using a folder which will contain files, where
each file
>>>>>>> implement an algorithm to add the reviewers.
>>>>>>>
>>>>>>> for instance we can have two files:
>>>>>>> 1. Add a reviewers by blame - the contributor which changed
recently
>>>>>>> the
>>>>>>> code lines
>>>>>>> 2. Add a reviewers by file - the contributor who changed
most of the
>>>>>>> file recently.
>>>>>>>
>>>>>>> Each file will implement the functional operation and will
output the
>>>>>>> reviewers emails.
>>>>>>>
>>>>>>> The user can then add a new algorithm or change it to be
more
>>>>>>> specific
>>>>>>> to its project.
>>>>>>> for example the user can add also the maintainers which
acked the
>>>>>>> patch
>>>>>>> that was blamed.
>>>>>>>> _______________________________________________
>>>>>>>> Users mailing list
>>>>>>>> Users(a)ovirt.org
>>>>>>>>
http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>
>>>>>>
>>>>>> this shouldn't be automatic. we need to clearly define
ownership. we
>>>>>> can't
>>>>>> do
>>>>>> this per repo for the engine/vdsm. we can do this per repo for
the
>>>>>> other
>>>>>> repo's probably (though solving the folder/file approach
would cover
>>>>>> the
>>>>>> simpler repos as a private case).
>>>>>>
>>>>>> yes, it will require some work, maybe some moving around of
files to
>>>>>> make
>>>>>> this
>>>>>> easier by folders (topics) which should be relevant anyway.
>>>>>
>>>>> I think it would easier to maintain if we just have one file at the
>>>>> root, instead of having the ownership information distributed
>>>>> throughout the files/directories. That way you'll know where to
look at
>>>>> to check/modify the ownership as opposed to having to walk all the
>>>>> files and upper directories.
>>>>>
>>>>> Also, adding all that ownership logic to gerrit might not be easy,
as
>>>>> it's not meant to go checking the source of the repositories
looking
>>>>> for configuration. We might want to take a look (again) to zuul,
the
>>>>> tool that openstack uses as gateway to trigger jenkins jobs and
merge
>>>>> patches.
>>>>>
>>>>>> _______________________________________________
>>>>>> Infra mailing list
>>>>>> Infra(a)ovirt.org
>>>>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> David Caro
>>>>>
>>>>> Red Hat S.L.
>>>>> Continuous Integration Engineer - EMEA ENG Virtualization R&D
>>>>>
>>>>> Email: dcaro(a)redhat.com
>>>>> Web:
www.redhat.com
>>>>> RHT Global #: 82-62605
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Infra mailing list
>>>>> Infra(a)ovirt.org
>>>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>>>
>>>
>>> _______________________________________________
>>> Infra mailing list
>>> Infra(a)ovirt.org
>>>
http://lists.ovirt.org/mailman/listinfo/infra
>>>
>