Make "arc cover" more powerful
Open, NormalPublic

Description

When I do arc diff for the first time and the editor comes up, it would be cool if the Reviewers field were prepopulated with suggested reviewers or if there were some inline commented-out reviewers. (Or if I could upload the diff without Reviewers, or even just see suggested reviewers in the web UI). I'm imagining that Owners and blame information could suggest Reviewers that are at least good as the ones I select for unfamiliar code, since in practice I tend to just use Owners and blame in selecting reviewers.

Pretty wishlisty but would help make it more lightweight to do small diffs. (Also, blame is absurdly slow in our environment for bad reasons.)

Details

Differential Revisions
Restricted Differential Revision
aran assigned this task to vrana.Jan 29 2013, 6:44 AM
aran added projects: Differential, Arcanist.
aran added subscribers: aran, epriestley.
bolasblack triaged this task as High priority.Jan 29 2013, 8:44 AM
epriestley lowered the priority of this task from High to Normal.Jan 29 2013, 11:41 AM

T1125: Suggest field defaults in "arc diff" is related. T1259: Improve "arc cover" is also related. And T2450: Build a blame cache is related.

I'm generally hesitant about this for both technical and product reasons. On the technical side:

  • Parity between the CLI and web workflows is hard to achieve, especially in the general case. See T1125.
  • Computing anything blame-based preemptively will be tremendously slow, and we can't make it fast enough to be noise without a custom server for commit graphs. I have a strategy for caching blame in MySQL, but it reduces the blame operation to a log operation, which is still significantly slow. I believe we can't make the log operation fast without an additional software dependency. I don't want to put slow operations in the "arc diff" pipeline just in case you don't know who you want to review code, since it slows down everyone.

On the product side:

  • I don't think we can do a very good job of it. Blame just isn't that useful a lot of the time, and, at least outside of Facebook, owners data rarely exists. I'm not sure how the situation is nowadays.
  • One specific issue is that a blame-based mechanism is likely to put active developers on tons of reviews they have no interest in, which basically punishes developers for being prolific.
  • I think having reviewers pull reviews they care about (via Herald) is a fundamentally better model than pushing reviews to them based on heuristics.

Generally, I'd rather improve "arc cover", with the theory being that you'd use it to find reviewers if you were in an unfamiliar section of the codebase. This addresses most of the technical and product arguments -- it won't slow anything down, the author still has to copy reviewers into the commit message so they're at least consciously choosing to put mcslee and ola on every review, and we can give authors more information about why the reviewers were chosen to help them make this decision.

Specific ways that "arc cover" could be improved are:

  • In addition to blame author information, we can show blame reviewer information.
  • We can show owners information.
  • Possibly, we can show which users have Herald rules which will trigger, although this implies some complicated product questions.
vrana added a comment.Jan 29 2013, 5:52 PM

This is the algorithm I use for choosing reviewer in unfamiliar code:

  1. If the code has on owner, use him.
  2. If the author of the line I am changing is experienced enough and was active recently in this part of code then use him.
  3. Otherwise use the reviewer if he is still active here.
  4. Except it was a codemod - in this case, blame previous revision.
  5. If both author and reviewer of the line don't look like good candidates, look at the authors and reviewers of last commits of the file.
  6. If there's no recent activity, use directory.
  7. If it is still not good enough then use @ptarjan.

This is also basically what I teach in bootcamp (except the last step). Implementing just a subset of this would pick suboptimal reviewer quite often.

vrana added a subscriber: ptarjan.Jan 29 2013, 5:52 PM

Some of those are reasonable things to add to arc cover -- basically, just making it easier to go through that list. It's hard for us to automatically figure out if a commit is a good commit (i.e., not a codemod), but we can definitely show enough information to make 1, 2 (add last time touched), 3, 5, and 6 easier, e.g.

$ arc cover
epriestley
  Owner (Differential, 32 lines)
  Owner (Maniphest, 16 lines)
  Last authored 6 days ago
  ab19381fe Add some feature to Differential
  96 lines
vrana
  Owner (Diffusion, 8 lines)
  Last reviewed 2 days ago
  8ffaf1d0 Improve Diffusion yada yada
  29 lines
aran
  Authored nearby code 16 days ago
  8f19f101 Refactor something
btrahan
  Reviewed nearby code 32 days ago
  af19831 Replace all tabs with spaces

Run with --verbose for more details.
vrana added a comment.Jan 29 2013, 6:10 PM

Yeah, I agree with providing more information, I'm hesitant about pre-filling the reviewers field by this information.

I like it. You could try it out after you write your commit message with a prompt arc cover suggests to add @vrana and @epreistly as reviewers because blah blah blah. Add them? [Y/n].

We should put 7. in arc cover :) Then I'll set my OOO response to !accept (after we re-implement it of course).

For codemod detection a simple heuristic of num_lines > 5000 || (num_files > 100 && num_files / num_lines > 0.25) might work.

(I'm just going to merge T1259 here since this is much more detailed and we seem to have settled on the same result.)

aran added a comment.Jan 30 2013, 2:58 AM

arc diff --reviewers arc cover --oneline ? :)

aran added a comment.Jan 30 2013, 2:58 AM

There were supposed to be backticks.

vrana removed vrana as the assignee of this task.Feb 6 2013, 11:18 PM
vrana added a subscriber: vrana.
epriestley renamed this task from Automatically populate reviewers field or provide reviewers suggestions to Make "arc cover" more powerful.Feb 19 2013, 3:45 PM
epriestley edited this Maniphest Task.Apr 9 2013, 3:22 PM

One other piece of information which might be useful to include is "load" -- number of assigned and recent reviews.

epriestley added a subscriber: Unknown Object (MLST).Sep 3 2013, 11:41 PM

Via FB.

epriestley changed the visibility from "All Users" to "Public (No Login Required)".Jun 23 2015, 11:06 AM
epriestley moved this task from Backlog to v2 on the Owners board.
eadler added a project: Restricted Project.Mar 20 2016, 10:42 PM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Mar 21 2016, 3:20 PM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jul 4 2016, 9:18 PM