Page MenuHomePhabricator

Allow users to amend revisions they don't own, without commandeering them
Closed, ResolvedPublic

Description

WMF is interested in allowing users to use arc diff to update a revision they don't own. Currently, this fails with an error:

You don't own revision "Dxxx: Embiggen Security", You can only update revisions you own. You can "Commandeer" this revision from the web interface if you want to become the owner.

This operation is also soft-prevented in the web UI: users aren't given an option to update revisions they don't own. It isn't technically prevented, but can't be accomplished without, e.g., editing the DOM nodes in the page.

See also this discussion downstream: https://phabricator.wikimedia.org/T121751


D15468 proposes an approach which adds an option to arc (diff.amend-without-commandeer) to disable this prompt and allow the process to move forward.

From a technical point of view, I don't want to add a client-side option for two reasons:

  • I generally want to avoid adding options except as a last resort where we have no alternatives (see T8227).
  • I want the CLI and web UI to have the same behavior, and this isn't possible if a client-side flag controls some of the CLI behavior.

A similar behavior which avoids these issues and would be more upstreamable from a product perspective is to change the hard error in the client to a prompt ("Are you sure you want to update a revision you don't own?"), and likewise allow but confirm the behavior in the web UI. Then the workflows would be the same, this would become possible, and we wouldn't need options.

However, I think this behavior is a very bad one, and I'm quite hesitant to upstream it. I'd prefer to go the other way: put technical checks in place to prevent this, so clients can not do this. I'll discuss this below.

Event Timeline

To provide some background and context, we added Commandeer in D2257 to deal with use cases like this:

  • Original author left the company, and some of their stuff didn't get fully sorted out.
  • Original author went into labor or had some other emergency and won't be available for a while, but a change needs to move forward before they're expected to return.
  • Differential needed a yellow action to complete the rainbow of action colors available at the time (see T1178).

In practice, we also use Commandeer occasionally when there's an actual change of ownership on a change: for example, maybe I submit a low-quality design diff to @chad primarily to discuss a general idea (I don't expect it to land, it's just easier to discuss in code than in ASCII art or screenshots), and once we're satisfied that the layout or interaction is broadly reasonable he might commandeer it to provide a real design.

We've previously resisted a few requests to make this workflow more freeform (like T3704), although I think these were generally coming from an administrative perspective and more an issue of "admins aren't all-powerful" than specific to the commandeer workflow.

I generally feel good about commandeer today from a product perspective. The weight (i.e., you have to take an explicit action to commandeer) and tone (i.e., "commandeer" implies a forceful takeover, and you wouldn't want to idly take this action on others' revisions) feel like they're in the right spot to me.


A more complex version of this workflow was proposed in D14104 (adding an arc revise). In this workflow, reviewers suggest replacement code using inline comments, then the author applies the changes with a new arc revise. The discussion there may be relevant here.

Here's my best effort to summarize downstream use cases for this by reading through https://phabricator.wikimedia.org/T121751 -- let me know if I'm misunderstanding, missing, or misrepresenting anything:

  • Correcting small errors made by the author (e.g., a typo in a comment?)
  • Finishing changes for authors who may not be responsive quickly (i.e., new contributors?).
  • Merging or rebasing for authors who may have difficulty performing these operations (particularly newcomers/interns?).

I think these use cases -- at least, the last two -- are primarily focused around "mentoring" (i.e., the reviewer is experienced and the author is inexperienced), not peer-to-peer "review". That is, it doesn't sound like full time engineers at WMF are frequently updating or amending each others' changes (maybe I'm misunderstanding and this is more common than I think?), but that someone is submitting a patch which is in the ballpark in theory but not quite there in practice, and someone who knows what they're doing is essentially taking responsibility for moving it forward. This is effectively a "Commandeer", but it's important to preserve the original author so the contributor/intern feels good about contributing (perhaps a "Commandeer Empathetically").

I think this workflow is somewhat common in open source: patches from new contributors frequently need adjustments, and are fire-and-forget as often as not, so you can't really rely on the author returning to finish the change.

Historically, we've used an "arc patch + amend locally + arc land" workflow to deal with these successfully, which is discussed downstream. One limitation is that this prevents the workflow from continuing from the web UI, which is increasingly important as we tie in more build and deployment logic.

Broadly, the Phabricator upstream has mostly moved from "arc patch + amend locally + arc land" to "almost never accept patches from contributors", which has worked well for us, but this is obviously an unusual stance for an open source project to take and presumably toxic conceptually and actively goal-defeating in the WMF downstream.

On the "rebase as a courtesy" workflow (i..e, helping someone who is very inexperienced rebase a change), I think you can technically do this today:

  • Rebase.
  • Run arc diff --only to create a diff only.
  • Say "I rebased things for you here: ..., use the web UI to update this revision with that change if things look good".

This may not actually work or may not stick around if it does work.

As a mentor, I think this is maybe really a discourtesy: teaching your intern how to rebase is probably one of the most valuable skills you could teach them, and doing it for them won't help them learn how to do it. Teaching them how to do it is unbelievably frustrating though, especially if they're remote and you have to do it over IRC, so I can understand mentors wanting to do this on behalf of their mentees, especially on their first few patches.

I'm also a little unsure about how well this will work in the Phabricator model -- how do interns currently fetch changes after someone rebases for them? I'd expect this to cause nearly as much trouble as rebasing does on its own. Or is this always a last step before commit?

Also from a mentor/empathy viewpoint, I think a strong point of Differential is that it's something of an equalizer for newcomers: at least by default, a new hire and a senior engineer have the same capabilities and powers, and senior engineers are subject to the same process that new engineers are. I think it's heartening to be able to look at your mentor's changes and see that they still make plenty of dumb mistakes, or be able to catch something in their code. Obviously, there's a lot of social context on top of this technical equality, but I think review is a strong mentorship/onboarding/leadership tool and a big part of that is putting everyone on equal footing. If you set a quality bar as a tech lead, the tool (by default) holds you to it, which makes it much stronger.

I also think ownership and accountability are hugely important, particularly for new hires, and these collaboration flows tend to undermine them.

Anyway, I think there are kind of four cases here, based on the reviewer/author relationship?

New Contributors: ("Commander Empathetically") New contributors often send fire-and-forget patches, and reviewer-update is fully justified and desirable. Reviewer essentially takes responsibility for the change: if it breaks in production, they'll be the one fixing it. This workflow is basically a "Commandeer", but doesn't want to use "Commandeer" because it destroys the social/emotional value of contributing to no benefit for anyone.

Interns/GsoC/Outreach: ("Mentoring") Interns may be very inexperienced and have difficulty with complex operations like rebasing. Teaching them this skill is very frustrating and time consuming for both mentor and mentee, and may not be a good use of time, especially in a remote, part-time, limited-duration program. (But, are these updates happening mid-change, or only at the end? Is this just a flavor of "new contributors"?)

New Hires: ("Leadership") I think this workflow is generally toxic for new hires (who presumably have good engineering fundamentals and will be contributing to the project for a long time), and that the value of cultivating an environment of equality and ownership/accountability dramatically outweighs saving a few minutes doing something for a new engineer. (There's maybe some grey area with new hires in technical but non-engineering roles (e.g., network operations, technical account management, design) who may have solid technical skills but not have much experience with engineering operations like git rebase.)

Experienced Engineers / Peers: ("Review") I don't think this workflow is particularly useful or common for experienced engineers, nor does it seem to be common downstream from what I can tell.

Broadly, I am concerned about building features that make the "Open Source Contributor / Intern" case a little better but gut the principles of equality, ownership and accountability from the "New Hire" case and make them feel less like they're coming in on equal footing. I think the "New Hire" case is a far more important case in general and certainly more important for almost all Phabricator users, although maybe relatively less common or important for WMF than interns/contributors.

An example of a concerning "New Hire" case is a scenario like this:

It's your first day at Big Company, Inc. as a software engineer.
You submit your first a change to fix a bug.
Someone way more senior than you rewrites it, updates over it, then tells you to commit it.

I think this is soul-crushing and incredibly toxic, and want the software to prevent this from being a path-of-least-resistance workflow as much as we reasonably can. The easiest thing should generally be to ask the author to take responsibility/ownership, fix issues, etc., -- the same thing you'd do with a more senior engineer -- so new hires are on equal footing from day one.

Here's a general outline of a workflow which might satisfy everything without creating the "stomp all over new hires" tension that I want to avoid:

  • You can update other users' revisions, but the updates are advisory, and you can not actually apply them.
  • They appear in the UI as "epriestley suggested an update to this revision. [ Show Changes ] [ Accept Update ] [ Dismiss ]" or similar, with whatever explanatory text you provided when you suggested the update.
  • The author must actively accept updates to promote them so they become the active diff.

That would mean:

  • For drive-by contributors, "commandeer secretly" is still the workflow.
  • For mentoring, you do the rebase and suggest the update. The mentee can review it and accept it. This gives them more control and autonomy but still lets you do steps for them without spending an entire day teaching them how to rebase over IRC.
  • For new hires, this workflow has way more friction than telling them to fix their stuff, which is good, and doesn't wrest nearly as much control from them even if it is used.
  • For peer review, this probably won't impact anything.

Partially unresolved is drive-by contributors where you want to submit through a build/CI/deploy pipeline instead of committing directly. I'm not sure if this use case really exists or not, and this could probably be rolled into arc land later if it does.

This is moderately complicated technically, but not unreasonably so. The biggest issue is that a revision's active diff is not currently explicit, and is instead the attached diff with the highest ID. This likely needs to change eventually to clear other adjacent issues like T1081.

Maybe notable is that these use cases seem to be completely different from the use cases discussed in D14104 (where "lightweight diffs are probably a plurality, if not majority"). There is also some possible overlap with T6008.

In those use cases, peers are frequently fixing typos, rewriting loops, etc., for each other?

There is a narrow band of use cases here where semi-technical users are correcting non-executable source-controlled information like documentation; these use cases make sense to me but are largely a barrier-to-entry issue ("our technical writers could be more effective if they could use a browser for minor changes instead of learning git").

I have to imagine some of this is an engineering organization issue (i.e., lack of coding standards, lack of lint support, weird culture of accountability?) but I'm not really sure. For peers, I can't really imagine a case where rewriting someone's loop is better than suggesting they consider a different structure or pointing them at a standards doc, or lint fixing it automatically before I even see it.

This "advisory update" workflow is completely useless for fixing typos, but I generally think that's a feature. I'm not sure how common the peer-to-peer "fix a typo" use case is in the downstream or what the scope of this class of changes is.

FWIW - full time engineers at the WMF do routinely amend each others changes in gerrit, without hesitation.

I don't personally feel that it's a good habit but it's the norm at WMF and to quote @dereckson "it inherits from the collaborative wiki culture."

An example of a concerning "New Hire" case is a scenario like this:

It's your first day at Big Company, Inc. as a software engineer.
You submit your first a change to fix a bug.
Someone way more senior than you rewrites it, updates over it, then tells you to commit it.

I think this is soul-crushing and incredibly toxic, and want the software to prevent this from being a path-of-least-resistance workflow as much as we reasonably can. The easiest thing should generally be to ask the author to take responsibility/ownership, fix issues, etc., -- the same thing you'd do with a more senior engineer -- so new hires are on equal footing from day one.

Here's my soul-crushing new hire story from working at the WMF:

It took 6 months before I got my first patch accepted in gerrit. This is not an exageration.

At my previous job I deployed to production within the first week.

It took 6 months before I got my first patch accepted in gerrit.

Ouch, that's rough.

I vaguely remember these charts from downstream, too (which seem to show a 30+ day median turnaround for review, at least as of about two years ago):

https://phabricator.wikimedia.org/F13952

I don't know if those numbers are still in the ballpark or if things have changed, but maybe we're also talking about fairly different processes here. The upstream is generally expecting that the median change takes no more than a few days end-to-end, and review rounds (e.g., "change -> request update" or "change -> accept") are often same-day or better.

Pushing more responsibility on the author might possibly help with turnaround time -- as a reviewer, it's easier to be prompt if you don't have to do as much work. I don't really know and don't want to presume too much about what forces you're facing, but we've generally had good experiences with a strong ownership/accountability model and I think it's definitely way better for the author to have to fix their own typos/loops but get a 1-day turnaround than to have someone else fix their stuff with a 1-week (or 6 month!) turnaround. Of course, this may not really have much bearing here, and has very limited applicability for contributors vs full-time employees.

GitHub actually encourages reviewers to "Add commits to PRs"; It's a little hard to un-teach once it gets hold.

GitHub actually encourages reviewers to "Add commits to PRs"; It's a little hard to un-teach once it gets hold.

one of the many reasons that the github pull request model is broken as designed :\

I like the idea of proposing an update but not automatically applying it until the author clicks 'approve.'

If the only major technical impediment is storing the 'current' diff for a revision, then that seems fairly straightforward to implement. The simple patch to arcanist might satisfy WMF users though and I don't mind maintaining that as a downstream patch (though users will have to grab our patched version of arc instead of using arc update I think that's reasonable)

There have only been a handful of occasions where something like this would have been convenient for my team - in the "drive-by" scenario. One developer happens across a bug and fixes it locally in order to continue their testing/development, meanwhile the bug is actually part of a larger problem which another developer is investigating. The options in this scenario are:

  1. Wait for "drive-by" developer's changes to be landed
  2. Have the "drive-by" developer export a diff of the relevant file(s) and email to the other

Having this feature would be a convenience but it happens so infrequently that I only recall thinking about asking for the feature once or twice. Occasionally instead of a bugfix it might be a small-yet-convenient refactor in question.

Here are a few use cases for experienced peers rebasing each other's commits:

  • I am reviewing a patch which is OK but I don't like some detail like naming or code organization. It is somewhat subjective and a nontrivial effort to change, so I don't want to force the patch author to do extra work because of my personal preferences, but I am sufficiently annoyed by it that I am willing to do the extra work myself, so I ask them if it's OK for me to change that, and do it. The real work in the patch is still theirs, and so is the responsibility, so 'commandeer' sounds inappropriate.
  • a patch needs some trivial change (easy rebase, comment typo fix etc), and the author is not immediately available (different timezone or maybe a volunteer who only has time on the weekends). Asking them to fix will cause unnecessary roundtrip delay and require the author to pay again the cost of resuming the context, setting up the developer environment etc. Also during the roundtrip time more code changes might be merged, requiring a less trivial rebase which might necessitate code review again. So it can be a lot of wasted time compared to just me doing the rebase/typo fix.
  • multiple developers working together on a complex patch which touches different parts of the system. This is typical enough in open-source software that the Co-Authored-By commit footer got invented, and would probably happen more at the WMF if support for it in Gerrit would not be so horrible.

Suggesting diffs is a great way to handle the first use case, a bit awkward but OK for the third, not useful at all for the second.

It has been common in http://www.mediawiki.org/wiki/Parsoid code development for reviewers to sometimes take over a patch, fix issues, or propose a different way of doing things. That sometimes seems faster and more efficient in terms of getting over tricky areas of code and working through different approaches / issues. It works out well for us because none of us mind that approach.

https://gerrit.wikimedia.org/r/#/c/238957/ is an example where Tim Starling, C.Scott Ananian, and I all submitted amended patches .. and in the end, I acknowledged it with a Co-authored by: line in the bottom.

There are number of other instances where we have done this in the Parsoid code base ... to ensure quick forward progress rather than long drawn out review cycles ... especially when someone is on vacation or is not around for other reasons.

GitHub actually encourages reviewers to "Add commits to PRs"; It's a little hard to un-teach once it gets hold.

one of the many reasons that the github pull request model is broken as designed :\

GitHub's pull request model is very popular and seems to reflect the intent of Git itself. Why do you suggest it's broken?

as the author of D14104, the advisory update proposal here would make me happy. Its basically taking the class of comments that are suggesting specific fixes or additions and making it more lightweight. One minor ui problem to solve would be supporting pure inserts or deletes.

So these advisory updates would be applied with arc amend? I might have written exactly that approach except for a desire to keep my prototype version untangled from core logic.

20after4 claimed this task.

This is resolved to my satisfaction by D15468: Allow amending revisions without commandeering first. @epriestley: should this remain open to pursue something more elaborate in the future?

Thanks to everyone for the feedback. I will check out D14104, that looks interesting.

I think this is fine to close for now -- the discussion probably remains relevant going forward, but I can point back here the next time a driving use case arises. I think we're unlikely to pursue any of this "suggest a patch" vein of changes on our natural roadmap for a long time.